  • #16
    When converting SAM to BAM, I use 'view' rather than 'import':

    @SQ header lines present in SAM file:
    samtools view -bS alignment.sam > alignment.bam

    @SQ header lines absent from SAM file:
    samtools view -bt reference.fasta.fai alignment.sam > alignment.bam

    Not sure if that will help at all but might be worth a try
    Good luck
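
A minimal sketch of choosing between the two commands above, assuming the SAM file is plain text and readable line by line. The helper names `has_sq_header` and `view_args` are made up for this illustration and are not part of samtools; the argument lists only mirror the two commands in the post, with the output still to be redirected to a .bam file.

```python
def has_sq_header(sam_lines):
    """Return True if the SAM header contains at least one @SQ line."""
    for line in sam_lines:
        # Header lines start with '@'; the first other line ends the header.
        if not line.startswith("@"):
            break
        if line.startswith("@SQ"):
            return True
    return False


def view_args(sam_path, fai_path, sq_present):
    """Build the 'samtools view' argument list matching the post above.

    Hypothetical helper: stdout still has to be redirected to the BAM file.
    """
    if sq_present:
        return ["samtools", "view", "-bS", sam_path]
    # No @SQ lines: supply the reference index with -t, as in the post.
    return ["samtools", "view", "-bt", fai_path, sam_path]
```

For example, `view_args("alignment.sam", "reference.fasta.fai", has_sq_header(open("alignment.sam")))` would give the argument list to pass to `subprocess.run` with `stdout` pointed at `alignment.bam`.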



    • #17
      Thanks for the suggestion, Tally. I just reran everything using the view command instead of import, but still no luck.



      • #18
        I think there are two problems here:

        1) Display of NNNNs instead of sequence
        This seems to be related in part to the actual terminal window. I thought it was weird that the NNNs don't appear until exactly after I start scrolling across the terminal. If I resize the terminal before running the 'tview' command, the position where the NNNs begin also changes. It may not be important, as according to mpileup output the NNNs are only occurring in between aligned regions.

        2) Incomplete alignment
        My fault!!! Helps when you use the correct reference sequence...
        Last edited by HeidiJTP; 01-24-2012, 10:44 AM.



        • #19
          Originally posted by HeidiJTP
          I think there are two problems here:

          1) Display of NNNNs instead of sequence
          This seems to be related in part to the actual terminal window. I thought it was weird that the NNNs don't appear until exactly after I start scrolling across the terminal. If I resize the terminal before running the 'tview' command, the position where the NNNs begin also changes. It may not be important, as according to mpileup output the NNNs are only occurring in between aligned regions.
          I am also facing this problem. Does anybody know what the workaround is?



          • #20
            Originally posted by sudeep
            I am also facing this problem. Does anybody know what the workaround is?
            To summarize, I think there are three main problems that trip up users:
            1. Forgetting to specify the reference on the command line (e.g. "samtools tview foo.bam" => "samtools tview foo.bam foo.fa").
            2. The FASTA file uses different sequence names than the BAM file. This is painful to fix: you'll have to either rewrite all the sequence names (e.g. the ">chr1" lines in foo.fa) to match the BAM file's sequence names, or rewrite the sequence references in the SAM/BAM file. The former is probably easier, but the "right" way is to use the same FASTA file when building the alignment to begin with.
            3. Corrupt FASTA files? I can't confirm this, but I suspect samtools might choke on FASTA files with DOS/Windows CR/LF line breaks (these often show up as ^M in Unix terminals). This would explain HeidiJTP and naluru's 80 character problem (as 80 characters per line is common). You can convert DOS/Windows text files to Unix line endings with the dos2unix command (e.g. dos2unix foo.fa).
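
Problems 2 and 3 above can be checked before opening tview. What follows is only a rough sketch under some assumptions: the BAM header has already been dumped to text (for instance with `samtools view -H foo.bam`), and the FASTA file is small enough to read into memory. All function names here are hypothetical helpers, not samtools functionality.

```python
def fasta_names(fasta_lines):
    """Sequence names from FASTA lines: the first word after '>'."""
    names = []
    for line in fasta_lines:
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                names.append(words[0])
    return names


def sam_reference_names(header_lines):
    """Reference names from the SN: fields of @SQ header lines."""
    names = []
    for line in header_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\r\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def missing_from_fasta(sam_names, fa_names):
    """Names used by the alignment that the FASTA file does not define."""
    known = set(fa_names)
    return [name for name in sam_names if name not in known]


def has_crlf(data):
    """True if the raw bytes contain DOS/Windows CR/LF line breaks."""
    return b"\r\n" in data
```

A non-empty result from `missing_from_fasta` points at problem 2, and `has_crlf(open("foo.fa", "rb").read())` returning True points at problem 3.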


            Also, it may not be what you're looking for if you care about the reference outside of mapped areas, but as an alternative, Samscope infers and displays the reference from BAM data alone (MD + CIGAR tags) without relying on FASTA reference files.
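
To illustrate the general idea of inferring the reference from the read itself, here is a hedged sketch that rebuilds the reference bases under a single read from its SEQ, CIGAR, and MD values. This is not Samscope's code; the function name `reference_from_read` is invented for the example, and it assumes the MD tag is present and well formed. Regions skipped with the CIGAR N operation are not described by MD, so they cannot be recovered this way.

```python
import re


def reference_from_read(seq, cigar, md):
    """Rebuild the reference bases covered by one read (illustrative only)."""
    # Step 1: keep only read bases that sit in aligned (M, =, X) columns.
    aligned = []
    pos = 0
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        n = int(length)
        if op in "M=X":
            aligned.append(seq[pos:pos + n])
            pos += n
        elif op in "IS":
            # Insertions and soft clips consume read bases only.
            pos += n
        # D, N, H, P consume no read bases.
    aligned = "".join(aligned)

    # Step 2: walk the MD string to swap mismatches and add deleted bases.
    ref = []
    i = 0
    for match, deletion, mismatch in re.findall(
            r"(\d+)|\^([A-Za-z]+)|([A-Za-z])", md):
        if match:
            n = int(match)
            ref.append(aligned[i:i + n])  # read agrees with the reference
            i += n
        elif deletion:
            ref.append(deletion.upper())  # bases present only in the reference
        else:
            ref.append(mismatch.upper())  # reference base at a mismatch
            i += 1
    return "".join(ref)
```

For a read `ACGTACGT` with CIGAR `8M` and MD `3A4`, the fourth reference base is taken from the MD string rather than from the read.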



            • #21
              I solved this issue by renaming the .fai (FASTA index) file: I had FileName.fa.fai and renamed it to FileName.fai. I think the program expects it like that.
