  • samtools tview reference sequence

    Does anyone else know how to get samtools tview to display the reference sequence at the top? I only see a string of N where I think the reference sequence should be.

  • #2
    Originally posted by rodney View Post
    Does anyone else know how to get samtools tview to display the reference sequence at the top? I only see a string of N where I think the reference sequence should be.
    Add the reference FASTA file after the BAM file on the command line.



    • #3
      Hi Rodney and Nilshomer,
      I have a similar problem. I do add my reference FASTA file after the BAM file, but all I see are Ns. I don't even see the reads that have aligned.
      Any suggestions?
      Thanks guys.



      • #4
        Originally posted by Mansequencer View Post
        Hi Rodney and Nilshomer,
        I have a similar problem. I do add my reference FASTA file after the BAM file, but all I see are Ns. I don't even see the reads that have aligned.
        Any suggestions?
        Thanks guys.
        Please post exactly what you typed on the command line.



        • #5
          Here is the command:
          samtools tview file.sort.bam reference.folded.fas



          • #6
            samtools tview issue

            I have several issues with "samtools tview":
            g (go to position): it lets me type a number, but the view does not move
            b: it does not toggle

            Has anyone had a similar problem, or found a solution?



            • #7
              Originally posted by nntao View Post
              I have several issues with "samtools tview":
              g (go to position): it lets me type a number, but the view does not move
              b: it does not toggle

              Has anyone had a similar problem, or found a solution?
              Enter the chromosome name and position (even if there is only one chromosome). For example "chr1:1000" (no quotes).
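
The same name:position form can also be given when tview starts. A minimal sketch, assuming a samtools release whose tview accepts -p chr:pos (check the tview usage message for your version). tview_at is a hypothetical helper name, not part of samtools:

```shell
# tview_at is a hypothetical wrapper, not part of samtools.
# Usage: tview_at chr1:1000 file.sort.bam reference.folded.fas
tview_at() {
    # Require NAME:POSITION, the form the goto prompt expects.
    if ! printf '%s\n' "$1" | grep -Eq '^[^:]+:[0-9]+$'; then
        echo "expected NAME:POSITION (for example chr1:1000), got: $1" >&2
        return 2
    fi
    # Assumption: this samtools build supports "tview -p".
    samtools tview -p "$1" "$2" "$3"
}
```

A bare number such as 1000 is rejected before samtools is called, which mirrors the behaviour described above where typing only a number goes nowhere.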



              • #8
                samtools tview

                Hello,

                I am having a similar problem to the one mentioned earlier in this thread. When I run samtools tview, the reference is mostly a string of Ns: only the first 80 bases are shown and the rest are all Ns. I tried the g option with different chromosome numbers, but I still see only Ns.

                Any help will be highly appreciated.

                Here is the command I gave:

                samtools tview MH_0001alignedreadssorted.bam danRer6.fa



                • #9
                  If your reference sequence is correct and indexed, and your BAM file is too, you should see the reference sequence wherever reads aligned. I noticed that if there are no reads over a long stretch of bases, tview does not show the reference there; it shows only Ns. When I checked with another BAM file that had reads at that location, everything was OK. Before that, I thought the FASTA file was corrupted, which could also be a reason for seeing only Ns.
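
The "indexed" part of this can be checked before opening tview. A minimal sketch, assuming the indexes sit next to the files under the names samtools writes by default (reference.fa.fai from samtools faidx, file.bam.bai from samtools index). check_indexes is a hypothetical helper name:

```shell
# check_indexes is a hypothetical helper, not part of samtools.
# Usage: check_indexes file.sort.bam reference.fa
check_indexes() {
    bam=$1
    ref=$2
    status=0
    # samtools faidx writes the FASTA index as <reference>.fai
    if [ ! -f "$ref.fai" ]; then
        echo "missing $ref.fai; try: samtools faidx $ref"
        status=1
    fi
    # samtools index writes <file>.bam.bai; some tools write <file>.bai
    if [ ! -f "$bam.bai" ] && [ ! -f "${bam%.bam}.bai" ]; then
        echo "missing BAM index for $bam; try: samtools index $bam"
        status=1
    fi
    return "$status"
}
```

To see whether any reads cover the window on screen, samtools view -c file.sort.bam chr1:1000-2000 prints the number of alignments overlapping that region (the region here is only an example).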



                  • #10
                    Retracted message
                    Last edited by tonge; 04-19-2011, 08:47 AM. Reason: I found the correct answer, my mistake



                    • #11
                      Did you use 'danRer6.fa' in your alignment step? I think the issue here is that the reference used in the alignment step is not the same file as 'danRer6.fa'.


                      Originally posted by naluru View Post
                      Hello,

                      I am having a similar problem to the one mentioned earlier in this thread. When I run samtools tview, the reference is mostly a string of Ns: only the first 80 bases are shown and the rest are all Ns. I tried the g option with different chromosome numbers, but I still see only Ns.

                      Any help will be highly appreciated.

                      Here is the command I gave:

                      samtools tview MH_0001alignedreadssorted.bam danRer6.fa



                      • #12
                        It's probably not a problem with tview; you can get the same issue in mpileup. Double-check that the sequence names in your reference file are exactly the same as in your .sam file, and that there aren't any odd characters that might be confusing things.
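
One way to do that double-check is to list both sets of names and print only the ones that differ. A minimal sketch, assuming the header has been saved as text first (for example with samtools view -H file.sort.bam > header.sam). The three function names are hypothetical:

```shell
# Print the sequence names from a FASTA file (the text up to the first
# whitespace on each ">" line), sorted.
fasta_names() {
    grep '^>' "$1" | sed -e 's/^>//' -e 's/[[:space:]].*$//' | sort
}

# Print the @SQ sequence names (SN: fields) from SAM header text on stdin, sorted.
sam_header_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }' | sort
}

# Print names found in only one of the two sources.
# Column 1: only in the FASTA. Column 2 (tab-indented): only in the header.
# Usage: compare_names reference.fa header.sam
compare_names() {
    fa_tmp=$(mktemp) || return 1
    sam_tmp=$(mktemp) || return 1
    fasta_names "$1" > "$fa_tmp"
    sam_header_names < "$2" > "$sam_tmp"
    # comm -3 hides lines common to both files; anything left is a mismatch.
    comm -3 "$fa_tmp" "$sam_tmp"
    rm -f "$fa_tmp" "$sam_tmp"
}
```

No output means the two name lists match exactly.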



                        • #13
                          For me, it was a colon ":" in my chromosome name for the reference sequence. When I deleted the colon, the actual sequence showed up (before that it was only N's).



                          • #14
                            Thank you VeBeKay!! That solved my problem perfectly :-) However, I had to delete it in the reference AND rerun the alignment and subsequent steps, which was time-consuming but 100% effective.
                            I am surprised this was an issue, as the FASTA reference was created using the samtools faidx region command, which uses a colon to define the region. I tried a faidx region extract with no colon, and this does not give the correct output. Does anyone know how to use faidx to extract regions to a multi-FASTA which does not have colons in the output headers? It seems silly to require them for faidx yet malfunction over them in tview.
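
One possible workaround is to leave the faidx call as it is and rewrite the header lines afterwards. A minimal sketch, not a samtools feature: sanitize_fasta_headers is a hypothetical helper, and the underscore replacement is an arbitrary choice. As noted above, reads must then be aligned against the renamed file so that the names in the BAM match.

```shell
# Replace every ":" with "_" on FASTA header lines only.
# Sequence lines are passed through unchanged.
# Usage: sanitize_fasta_headers regions.fa > regions.clean.fa
sanitize_fasta_headers() {
    sed '/^>/ s/:/_/g' "$1"
}

# List header lines that still contain a colon, with line numbers.
# Exits non-zero when none are found.
headers_with_colon() {
    grep -n '^>.*:' "$1"
}
```

For example, a header written by a region extract such as >chr1:100-200 becomes >chr1_100-200.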



                            • #15
                              I am also having this problem, where the first 80 bases are present, then mostly Ns except in a few positions where aligned reads appear. I am just running a test set to try to learn the process, so I took a reference sequence and fragmented it into 500 bp segments with 50 bp overlaps, then aligned the fragments (short_reads.fas) back to the reference.fas. Thus, the full reference should have fragments aligned to it. Here are the commands I used:

                              $ bwa index reference.fas
                              $ bwa aln reference.fas short_reads.fas >short_reads.sai
                              $ bwa samse reference.fas short_reads.sai short_reads.fas >short_reads.sam
                              $ samtools faidx reference.fas
                              $ samtools import reference.fas.fai short_reads.sam short_reads.bam
                              $ samtools sort short_reads.bam short_reads.srt
                              $ samtools index short_reads.srt.bam
                              $ samtools tview short_reads.srt.bam reference.fas

                              I made sure that there were no colons in any sequence titles, but I am not sure I have used all the commands correctly. I would really appreciate any help!
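
Since an earlier reply reports that tview shows Ns wherever no reads are aligned, one thing worth checking in a test like this is how many of the fragments actually mapped. A minimal sketch that reads the SAM file directly and tests the unmapped bit (0x4) of the FLAG column; count_mapped is a hypothetical helper name:

```shell
# Count SAM records whose FLAG does not have the "unmapped" bit (0x4) set.
# Header lines (starting with "@") are skipped.
# Usage: count_mapped short_reads.sam
count_mapped() {
    awk -F'\t' '!/^@/ {
        total++
        # int(flag / 4) % 2 is 1 when bit 0x4 is set, i.e. the read is unmapped.
        if (int($2 / 4) % 2 == 0) mapped++
    }
    END { printf "%d mapped of %d\n", mapped, total }' "$1"
}
```

If only a small fraction of the fragments mapped, the Ns would point to the alignment step and not to tview.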

