Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • subeet
    Junior Member
    • Dec 2009
    • 1

    How to get uniquely mapped reads from Tophat

    Hi,

    I was wondering if anyone has used the "--max-multihits" option in Tophat. I specified "--max-multihits 1" in my command line with the hope of getting uniquely mapped reads; however, when I checked the run.log file, I noticed that the program still uses the default "--max-multihits 40". It seems that specifying the value of --max-multihits doesn't make a change in the analysis.

    Has anyone experienced similar problems before?

    Thanks,

    Subeet
  • mgogol
    Senior Member
    • Mar 2008
    • 197

    #2
    I've used -g 1 successfully, which is the same as --max-multihits according to the manual.

    Comment

    • wisense
      Member
      • Sep 2011
      • 30

      #3
      If you want to get uniquely mapped reads from Tophat output bam file,you can use the command like this:
      Code:
      samtools view accepted_hits.bam | awk '$5==255{print $0}' > uniq_mapped_reads.sam
      In the bam file ,when the 5 colum is 255,it means that this read is an uniquely mapped read,which also contain a string"NH:i:1" in the last field.You can see the statement of bam/sam format file for detail.
      Also,you can use another command as follow,which will do the same as the first one:
      Code:
      samtools view accepted_hits.bam | grep -w "NH:i:1" > uniquely_mapped_reads.sam

      Comment

      • simonandrews
        Simon Andrews
        • May 2009
        • 870

        #4
        We use -g 1 as part of our standard tophat pipeline. Seems to work OK and definitely doesn't produce the same results as running with the default options.

        Comment

        • JQL
          Member
          • Apr 2011
          • 83

          #5
          Originally posted by wisense View Post
          If you want to get uniquely mapped reads from Tophat output bam file,you can use the command like this:
          Code:
          samtools view accepted_hits.bam | awk '$5==255{print $0}' > uniq_mapped_reads.sam
          In the bam file ,when the 5 colum is 255,it means that this read is an uniquely mapped read,which also contain a string"NH:i:1" in the last field.You can see the statement of bam/sam format file for detail.
          Also,you can use another command as follow,which will do the same as the first one:
          Code:
          samtools view accepted_hits.bam | grep -w "NH:i:1" > uniquely_mapped_reads.sam
          Hi,

          When I read the SAM file format, it states:
          "MAPQ: ...A value 255 indicates that the mapping quality is not available."

          Does that mean uniquely mapped?

          thanks.

          Comment

          • JQL
            Member
            • Apr 2011
            • 83

            #6
            Originally posted by simonandrews View Post
            We use -g 1 as part of our standard tophat pipeline. Seems to work OK and definitely doesn't produce the same results as running with the default options.
            For RNA-seq study, is it too stringent to use -g 1? I recall I read somewhere, that one will miss gene famlily memebers.

            what number is more optimal? the default is 20, i believe.

            Comment

            • westerman
              Rick Westerman
              • Jun 2008
              • 1104

              #7
              Originally posted by JQL View Post
              Hi,

              When I read the SAM file format, it states:
              "MAPQ: ...A value 255 indicates that the mapping quality is not available."

              Does that mean uniquely mapped?

              thanks.
              For Tophat it does.

              I am not sure who made the following observation on a Galaxy mailing list but it is what I refer to:

              Tophat/bowtie don't report mapping quality
              values that are as meaningful as BWA, but there is some information in the
              mapping quality values tophat reports. Tophat yields 4 distinct values for
              its mapping quality values (you can do a "unique" count on the mapping
              quality field of any SAM file from tophat to verify this):



              255 = unique mapping

              3 = maps to 2 locations in the target

              2 = maps to 3 locations

              1 = maps to 4-9 locations

              0 = maps to 10 or more locations.

              Comment

              • JQL
                Member
                • Apr 2011
                • 83

                #8
                Originally posted by westerman View Post
                For Tophat it does.

                I am not sure who made the following observation on a Galaxy mailing list but it is what I refer to:
                thanks. that info is helpful. It is makes senes to me now. Because it takes two mappings to calculate quality. So uniquely mapped reads do not have mapping quality to report, hence, 255.

                Comment

                • Triple_W
                  Member
                  • Jan 2012
                  • 12

                  #9
                  Originally posted by westerman View Post
                  For Tophat it does.

                  I am not sure who made the following observation on a Galaxy mailing list but it is what I refer to:
                  The information is helpful. Does tophat in Galaxy adopt such protocal?

                  However, I found .bam file from tophat.v2 used a different version:
                  255: unmapped reads
                  50: unique mapped reads
                  3: two mapped locations
                  1: 3 or 4 mapped locations
                  0: >4 mapped locations

                  Comment

                  • wisense
                    Member
                    • Sep 2011
                    • 30

                    #10
                    Originally posted by Triple_W View Post
                    The information is helpful. Does tophat in Galaxy adopt such protocal?

                    However, I found .bam file from tophat.v2 used a different version:
                    255: unmapped reads
                    50: unique mapped reads
                    3: two mapped locations
                    1: 3 or 4 mapped locations
                    0: >4 mapped locations
                    You're right.I also find this change in tophat v2.0.4. In my opinion,maybe the best way to find the uniquely mapped reads is using the "NH:i:1" tag in the last field of every alignment.

                    Comment

                    • westerman
                      Rick Westerman
                      • Jun 2008
                      • 1104

                      #11
                      With version v2.0.6 I do not find this exact behavior. But perhaps it is because my reads get separated into 'accepted_hits.bam' and 'unmapped.bam'. The latter does, indeed, have all reads at mapq 255 but the former does not. The former has the mapq of 50, 3, 1 or 0. So when I look at the "interesting" reads -- i.e., those that map in accepted_hits.bam -- then just doing a samtools filtering by mapq is good enough.

                      Comment

                      Latest Articles

                      Collapse

                      ad_right_rmr

                      Collapse

                      News

                      Collapse

                      Topics Statistics Last Post
                      Started by SEQadmin2, 06-09-2026, 11:58 AM
                      0 responses
                      14 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 06-05-2026, 10:09 AM
                      0 responses
                      26 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 06-04-2026, 08:59 AM
                      0 responses
                      36 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 06-02-2026, 12:03 PM
                      0 responses
                      60 views
                      0 reactions
                      Last Post SEQadmin2  
                      Working...