  • Some "wrong" XS:A in TopHat output for strand-specific paired-end RNA-Seq data

    Hi, all,

    When dealing with strand-specific paired-end RNA-Seq data, sequenced from the 5' end and mapped with --library-type fr-secondstrand, I got some strange flags and XS:A labels, listed below.

    H2208:7439:9018 pPr1 chr1 4887061 50 100M = 4887050 -111 XS:A:+ NH:i:1
    H2208:7439:9018 pPR2 chr1 4887050 50 100M = 4887061 111 XS:A:+ NH:i:1

    H16340:40651 pPR2 chr1 6284001 50 100M = 6284208 307 XS:A:- NH:i:1
    H16340:40651 pPr1 chr1 6284208 50 100M = 6284001 -307 XS:A:- NH:i:1

    According to my library type, I would infer that all the reads listed here come from the negative strand.

    Why, in the first pair, are these reads assumed to be from the positive strand?

    Among paired reads with flag pPr1 (or 83), about 1 pair in every 400 looks like the first one.
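The flag strings in the records above (pPr1, pPR2) are the letter form that older samtools releases print with `samtools view -X`. The sketch below shows how those letters map onto the numeric SAM FLAG, which is why pPr1 and 83 are the same thing. The names `FLAG_BITS` and `flag_string_to_int` are made up for illustration and are not part of samtools.

```python
# Letters used by older samtools releases for `samtools view -X`,
# paired with the SAM FLAG bit each one stands for.
FLAG_BITS = {
    "p": 0x1,    # read is paired
    "P": 0x2,    # read is mapped in a proper pair
    "u": 0x4,    # read is unmapped
    "U": 0x8,    # mate is unmapped
    "r": 0x10,   # read maps to the reverse strand
    "R": 0x20,   # mate maps to the reverse strand
    "1": 0x40,   # first read in the pair
    "2": 0x80,   # second read in the pair
    "s": 0x100,  # secondary alignment
    "f": 0x200,  # failed quality checks
    "d": 0x400,  # PCR or optical duplicate
}


def flag_string_to_int(flag_string):
    """Convert a letter-style flag such as 'pPr1' to the numeric SAM FLAG."""
    value = 0
    for letter in flag_string:
        value |= FLAG_BITS[letter]
    return value


print(flag_string_to_int("pPr1"))  # 83
print(flag_string_to_int("pPR2"))  # 163
```

So pPr1 reads as "paired, proper pair, this read on the reverse strand, first in pair", and pPR2 as "paired, proper pair, mate on the reverse strand, second in pair".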

    I wonder whether TopHat has other considerations, whether my assumption is wrong, or whether this is a bug in TopHat (v2.0.4).

    Thank you!

    Tong Chen

  • #2
    You are confusing the strand orientation of the read relative to the transcript, which is what the --library-type parameter controls, with the orientation of the transcript relative to the genome, which is what the XS:A tag reports. In your example, the XS:A:+ on the first read pair reports that the transcript which gave rise to this fragment is transcribed from the forward (+) strand of the genome. Conversely, the transcript associated with the second pair is transcribed from the reverse (-) strand. This is essentially the main reason for using strand-specific RNA-Seq protocols: to determine without ambiguity* which strand a transcript lies on.

    (*Ignoring that you can never entirely eliminate ambiguity because nothing is 100%.)


    • #3
      Thanks kmcarr. I understand your point.

      However, I think the first pair of reads should also come from a transcript on the negative strand, for the following three reasons.

      1. "H2208:7439:9018 pPr1" means the first read of the pair mapped to the reverse strand.

      2. The library type "fr-secondstrand" tells you that the first read of a pair always maps to the strand that generated it.

      3. So here, the XS:A for "H2208:7439:9018 pPr1" should be '-'.

      Is this right?
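The three-step argument above can be written down as a small function. This is only a sketch of the poster's reasoning, assuming the usual reading of the two stranded library types (read 1 is sense to the transcript for fr-secondstrand and antisense for fr-firststrand). It does not model how TopHat itself assigns XS:A, and `expected_transcript_strand` is a hypothetical name.

```python
def expected_transcript_strand(flag, library_type="fr-secondstrand"):
    """Return '+' or '-' for the transcript strand implied by one read.

    Assumption: in fr-secondstrand, read 1 has the same orientation as the
    transcript and read 2 the opposite; fr-firststrand is the mirror image.
    """
    if library_type not in ("fr-secondstrand", "fr-firststrand"):
        raise ValueError("unsupported library type: " + library_type)

    is_reverse = bool(flag & 0x10)  # read aligned to the reverse strand
    is_read2 = bool(flag & 0x80)    # read is the second of the pair

    # Which read of the pair is sense to the transcript depends on the library.
    if library_type == "fr-secondstrand":
        read_is_sense = not is_read2
    else:
        read_is_sense = is_read2

    read_strand = "-" if is_reverse else "+"
    if read_is_sense:
        return read_strand
    return "+" if read_strand == "-" else "-"


# pPr1 is FLAG 83 and pPR2 is FLAG 163.
print(expected_transcript_strand(83))   # -
print(expected_transcript_strand(163))  # -
```

Under this assumption both reads of either pair in the original post imply '-', since the two pairs carry the same flags. With "fr-firststrand" the same flags would imply '+'.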


      • #4
        Originally posted by ct586
        Thanks kmcarr. I understand your point.

        However, I think the first pair of reads should also come from a transcript on the negative strand, for the following three reasons.

        1. "H2208:7439:9018 pPr1" means the first read of the pair mapped to the reverse strand.

        2. The library type "fr-secondstrand" tells you that the first read of a pair always maps to the strand that generated it.

        What method was used to prepare your strand-specific library? If it was the newer TruSeq stranded kits, or any kit which uses dUTP marking of the second cDNA strand, you should use "--library-type fr-firststrand" for the TopHat alignment, not "fr-secondstrand".

        3. So here, the XS:A for "H2208:7439:9018 pPr1" should be '-'.

        Is this right?

        First make sure you know what method was used for the strand-specific library preparation and that you use the correct "--library-type" parameter for TopHat (READ the TopHat manual). If the alignment has not been done properly, any discussion of strandedness is going to be completely confused and ultimately fruitless.


        • #5
          I am quite sure that my "--library-type" is "fr-secondstrand". That is why I made the inference above.

          I think others have noticed this kind of issue too, as shown in this post: [http://onetipperday.blogspot.com/201...f-tophat.html]

          Besides, some already-published data also contain this kind of inconsistent tag, for example the test data used by RSeQC. The link to the test data: http://dldcc-web.brc.bcm.edu/lilab/l...Human_hg19.bam. [Attention: a large BAM file]
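One way to measure how often such inconsistent tags occur is to scan plain SAM text (for example the output of `samtools view`, which prints numeric flags) and compare each XS:A tag with the strand implied by the read's flag. The sketch below assumes the fr-secondstrand reading used in this thread (read 1 sense to the transcript). The function names are hypothetical, and the example records reuse the coordinates from the first post with SEQ and QUAL replaced by `*`, because the thread does not show them.

```python
def expected_xs_secondstrand(flag):
    """Strand implied by the flag, assuming read 1 is sense to the transcript."""
    is_reverse = bool(flag & 0x10)
    is_read2 = bool(flag & 0x80)
    # Read 1 reverse or read 2 forward both point to a '-' strand transcript.
    return "-" if is_reverse != is_read2 else "+"


def count_xs_mismatches(sam_lines):
    """Return (records_with_xs, records_where_xs_differs_from_expectation)."""
    total = 0
    mismatched = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        flag = int(fields[1])
        if not flag & 0x1 or flag & 0x4:
            continue  # keep only paired, mapped reads
        # Optional tags start at column 12; look for XS:A:<strand>.
        xs = next((f[5:] for f in fields[11:] if f.startswith("XS:A:")), None)
        if xs is None:
            continue
        total += 1
        if xs != expected_xs_secondstrand(flag):
            mismatched += 1
    return total, mismatched


# The two pairs from the first post, with numeric flags (83 = pPr1, 163 = pPR2).
records = [
    "\t".join(["H2208:7439:9018", "83", "chr1", "4887061", "50", "100M",
               "=", "4887050", "-111", "*", "*", "XS:A:+", "NH:i:1"]),
    "\t".join(["H2208:7439:9018", "163", "chr1", "4887050", "50", "100M",
               "=", "4887061", "111", "*", "*", "XS:A:+", "NH:i:1"]),
    "\t".join(["H16340:40651", "163", "chr1", "6284001", "50", "100M",
               "=", "6284208", "307", "*", "*", "XS:A:-", "NH:i:1"]),
    "\t".join(["H16340:40651", "83", "chr1", "6284208", "50", "100M",
               "=", "6284001", "-307", "*", "*", "XS:A:-", "NH:i:1"]),
]

print(count_xs_mismatches(records))  # (4, 2)
```

To run it on real data, the lines could be piped in from `samtools view` and passed as an iterable, for instance `count_xs_mismatches(sys.stdin)`.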
