  • crh
    Member
    • Dec 2009
    • 46

    Over-representation of reads mapping to UTRs

    Hi,

    We have mapped Illumina reads using a combination of SOAP and Bowtie. Reads (35 nt) were mapped with SOAP, and the unmapped set was iteratively remapped after trimming from the 5' and 3' ends down to 21 nt. Bowtie was used to select the 'best mapping' reads from the set that mapped to multiple positions.
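The iterative trimming described above can be sketched roughly as follows. This is only an illustration in Python, not the pipeline actually used: `try_map` is a hypothetical stand-in for an aligner call, `toy_map` and `REFERENCE` are made-up test fixtures, and the trimming order (one base from the 3' end, then one from the 5' end, alternating) is an assumption, since the post does not say how the two ends were trimmed.

```python
from typing import Callable, Optional, Tuple

MIN_LEN = 21  # shortest read length tried, as stated in the post


def iterative_trim_map(
    read: str,
    try_map: Callable[[str], Optional[int]],
    min_len: int = MIN_LEN,
) -> Optional[Tuple[str, int]]:
    """Try to map a read, trimming one base per round until it maps.

    `try_map` is a hypothetical aligner wrapper: it returns a mapping
    position, or None if the sequence does not map. Trimming alternates
    between the 3' end and the 5' end (an assumed order).
    """
    seq = read
    trim_three_prime = True
    while len(seq) >= min_len:
        pos = try_map(seq)
        if pos is not None:
            return seq, pos
        # Drop one base from the chosen end, then switch ends.
        seq = seq[:-1] if trim_three_prime else seq[1:]
        trim_three_prime = not trim_three_prime
    return None  # never mapped, even at the minimum length


# Toy aligner: exact substring search against a tiny made-up reference.
REFERENCE = "ACGTACGTTAGCCGATAGGCTTAACGGATCCGATTACAGGCATCGATCGG"


def toy_map(seq: str) -> Optional[int]:
    idx = REFERENCE.find(seq)
    return idx if idx >= 0 else None
```

Reads rescued this way are shorter than the originals, so they are more likely to map to multiple positions, which is presumably why a separate 'best mapping' selection step follows.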

    Looking at the mapping for many genes, it appears we have an over-representation of mappings to the UTRs:


    There are also gaps in the mappings to some exons, which may be due to alternative splicing.

    Comments? I'd like to be certain the mapping is OK before looking for differential expression (DE).

    Charles
  • pmiguel
    Senior Member
    • Aug 2008
    • 2328

    #2
    Just the 3' UTRs? Your picture shows a strong bias towards reads mapping to the 3' end of that gene. This is to be expected with many rRNA depletion/cDNA synthesis methods. That is, methods that either purify mRNA by hybridization to the polyA tail or prime first-strand cDNA synthesis with an oligo-dT primer will bias the resulting library toward 3'-mapping reads.

    No mystery here: the polyA tail is on the 3' end of a transcript, so if you pull out template or prime synthesis on the basis of that tail, you get cDNA that is biased toward the 3' end. The more highly degraded the RNA, the higher the bias.
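One way to look at this kind of 3' bias across many genes is to express each read's start as a fraction of its transcript's length and count reads per bin. The sketch below assumes a hypothetical input of `(read_start, transcript_length)` pairs in transcript coordinates (0 = 5' end); the function names and the toy data are illustrative, not part of any tool mentioned in this thread.

```python
from typing import Iterable, List, Tuple


def positional_profile(
    reads: Iterable[Tuple[int, int]], n_bins: int = 10
) -> List[int]:
    """Count reads per relative-position bin along transcripts.

    Each item is (read_start, transcript_length), with read_start given in
    transcript coordinates (0 = 5' end). Bin 0 is the 5' end and bin
    n_bins - 1 is the 3' end.
    """
    counts = [0] * n_bins
    for start, length in reads:
        if length <= 0 or not 0 <= start < length:
            continue  # skip malformed records
        counts[start * n_bins // length] += 1
    return counts


def three_prime_fraction(profile: List[int]) -> float:
    """Fraction of reads falling in the 3'-most half of the bins."""
    total = sum(profile)
    if total == 0:
        return 0.0
    half = len(profile) // 2
    return sum(profile[half:]) / total


if __name__ == "__main__":
    # Made-up reads on two transcripts of 1000 and 2000 nt.
    toy_reads = [(50, 1000), (400, 1000), (850, 1000),
                 (900, 1000), (990, 1000), (1900, 2000)]
    profile = positional_profile(toy_reads)
    print(profile, three_prime_fraction(profile))
```

With unbiased coverage the profile should be roughly flat; a profile that rises steadily toward the last bins is the pattern expected from polyA selection or oligo-dT priming of partly degraded RNA.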

    What method of library construction was used? Was any initial QC done to determine how intact the initial RNA sample was?

    --
    Phillip
    Last edited by pmiguel; 08-04-2011, 08:37 AM.


    • eslondon
      Member
      • Jul 2009
      • 21

      #3
      Bear in mind that this is also often seen as an effect of the method used to fragment the RNA, which is well documented in the literature. This is similar to the previous comment (i.e. a library problem) but concerns a different step in the library preparation. If you fragment the RNA (e.g. by hydrolysis into pieces of 200-300 nucleotides) before reverse transcription, as is usually done, you are more likely to achieve uniform coverage of the gene.

      In any case, as with all NGS experiments, get to know the experimental protocol used, since it has a big effect on what you see in the end.

      With the data you have you are still likely to obtain good estimates of gene expression, but you will not be able to use them for more sophisticated analyses, e.g. alternative splicing.
      --------------------------------------
      Elia Stupka
      Co-Director and Head of Unit
      Center for Translational Genomics and Bioinformatics
      San Raffaele Scientific Institute
      Via Olgettina 58
      20132 Milano
      Italy
      ---------------------------------------


      • steven
        Senior Member
        • Aug 2009
        • 269

        #4
        As Phillip said, the lower the RNA stability, the higher the bias towards 3' UTRs, especially with oligo-dT-selected mRNAs.
        Do you have an idea of the global trend, such as the overall proportion of reads in 3' UTRs? In regular RNA-seq data with standard Illumina protocols I frequently find that about 20% of the reads overlap a 3' UTR.
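The overall proportion asked about here could be computed along the following lines. This is a minimal sketch under stated assumptions: reads and 3' UTR annotations are given as half-open `(start, end)` intervals on a single chromosome and strand, and the function names are hypothetical. A real analysis would repeat this per chromosome and strand.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def merge_intervals(intervals: Sequence[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def fraction_overlapping(
    reads: Sequence[Interval], utrs: Sequence[Interval]
) -> float:
    """Fraction of reads that overlap any UTR interval by at least one base."""
    merged = merge_intervals(utrs)
    starts = [s for s, _ in merged]
    hits = 0
    for r_start, r_end in reads:
        # Last merged interval that starts before the read ends.
        i = bisect_right(starts, r_end - 1) - 1
        if i >= 0 and merged[i][1] > r_start:
            hits += 1
    return hits / len(reads) if reads else 0.0
```

Merging the UTR intervals first means a read spanning two overlapping annotations is counted once, so the result is a fraction of reads rather than of read-annotation pairs.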


        • crh
          Member
          • Dec 2009
          • 46

          #5
          Thanks All,

          I was not involved in the library prep, but have asked for details and will post them when I hear back.

          I'm going to characterize this trend for all genes now, and will post that as well.

          Thanks!

          Charles


          • crh
            Member
            • Dec 2009
            • 46

            #6
            Hi All,

            I've checked the number of reads mapping to the 5' UTR, CDS and 3' UTR of each gene, and it does appear there was likely degradation of the polyA prior to fragmentation:

            > summary(cds$utr5)
               Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
               1.00    1.00    3.00   12.02    6.00 2689.00

            > summary(cds$cds)
               Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
                1.0    11.0    24.0   113.7    54.0 22640.0

            > summary(cds$utr3)
               Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
                1.0    16.0    42.0   177.8   111.0 17710.0
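For readers working outside R, the same six-number summary can be reproduced with the Python standard library. This is a sketch, assuming the per-gene counts are available as a plain list; the function name is made up for illustration. `method="inclusive"` is used because it corresponds to R's default quantile definition (type 7).

```python
from statistics import mean, quantiles
from typing import Dict, Sequence


def six_number_summary(values: Sequence[float]) -> Dict[str, float]:
    """Min, quartiles, mean and max, in the layout of R's summary().

    Needs at least two values, because statistics.quantiles does.
    """
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "Min.": min(values),
        "1st Qu.": q1,
        "Median": med,
        "Mean": mean(values),
        "3rd Qu.": q3,
        "Max.": max(values),
    }
```

In all three summaries above the mean is larger than the third quartile, so a small number of very highly covered genes dominates the means; the medians (3, 24 and 42) are the more robust figures to compare.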


            I think we can still carry out a DE analysis, since counts per gene are being compared across treatments?
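The reasoning behind this question is that a positional bias affecting every sample in the same way should largely cancel when the same gene is compared between treatments. A rough sketch of that comparison is shown below, assuming hypothetical per-gene region counts stored as nested dictionaries. It sums the regions to one count per gene, scales by library size (counts per million), and takes a log2 ratio with a pseudocount. It does no statistical testing, so it illustrates the idea only.

```python
import math
from typing import Dict

# gene -> {"utr5": n, "cds": n, "utr3": n}; a hypothetical layout
RegionCounts = Dict[str, Dict[str, int]]


def gene_totals(regions: RegionCounts) -> Dict[str, int]:
    """Sum 5' UTR, CDS and 3' UTR counts into one count per gene."""
    return {gene: sum(parts.values()) for gene, parts in regions.items()}


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale counts by library size so samples of different depth compare."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def log2_fold_change(
    control: Dict[str, int], treated: Dict[str, int], pseudo: float = 1.0
) -> Dict[str, float]:
    """log2(treated / control) on CPM values for genes seen in both samples."""
    cpm_c = counts_per_million(control)
    cpm_t = counts_per_million(treated)
    return {
        gene: math.log2((cpm_t[gene] + pseudo) / (cpm_c[gene] + pseudo))
        for gene in cpm_c.keys() & cpm_t.keys()
    }
```

The cancellation only holds if the degree of degradation is similar across samples; if one treatment group has markedly more degraded RNA, the 3' bias differs between groups and can show up as apparent expression changes.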

            Charles


            • steven
              Senior Member
              • Aug 2009
              • 269

              #7
              Yes, more than 50 reads when considering just the CDS sounds quite good to me, although I don't have any precise standards in mind.
