  • Reads within introns from polyA+ samples

    Hi all,

    Long time reader, first time poster.

    I've been processing some RNA-Seq data where the RNA was isolated using Clontech's SMART kit, which selects polyA+ mRNA. The libraries were sequenced single-end, and the reads were mapped using TopHat.

    I find that many genes have significant numbers of reads in the introns, which I thought would be suppressed by polyA+ selection. I've attached an example, which is a screen grab from SeqMonk. An example of the "problematic" area is marked in green.

    It would be good to know what others think, and whether these reads should be used as a measure of expression, or whether it would be wiser to use only exonic reads.

    I should also point out these were run on a HiSeq and multiplexed on the same lane.

    Many thanks,

    Sham

  • #2
    This question is oft asked.

    Short answer: it happens, and it is no big deal. Just deal with it.
    Some non-polyA RNA gets through "the filter".

    Either method works for counting. Pick one: 1) count only reads that hit exons, OR 2) count every read that falls within or overlaps the transcribed region, i.e. reads that (end at or after the transcription start) AND (start at or before the transcription end).
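The two counting options above can be sketched roughly as follows. This is a minimal illustration and not the method of any particular counting tool: reads and exons are assumed to be half-open `(start, end)` intervals on one chromosome and strand, spliced reads are treated as a single span, and the names `count_exonic` and `count_gene_body` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), an assumption of this sketch


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def count_exonic(reads: List[Interval], exons: List[Interval]) -> int:
    """Option 1: count reads that overlap at least one annotated exon."""
    return sum(1 for r in reads if any(overlaps(r, e) for e in exons))


def count_gene_body(reads: List[Interval], tx_start: int, tx_end: int) -> int:
    """Option 2: count reads that overlap the transcribed region,
    whether they land in exons or introns."""
    return sum(1 for r in reads if overlaps(r, (tx_start, tx_end)))


# Toy data (invented for illustration only).
exons = [(100, 200), (500, 600)]
reads = [(90, 140), (150, 190), (300, 350), (580, 630), (700, 750)]

print(count_exonic(reads, exons))         # reads touching an exon
print(count_gene_body(reads, 100, 600))   # reads touching the gene body
```

With this toy data the exon-only count is 3 and the gene-body count is 4; the difference is the single intronic read at `(300, 350)`. Whichever option is chosen, it should be applied identically to every sample being compared.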


    • #3
      Hi shambam,

      your image looks fine to me in terms of 5' or 3' bias. Have you checked this thoroughly? Are your reads homogeneously distributed along the genes? I ask because we see a huge 3' bias using Evrogen Mint-2, which could be considered analogous.

      thanks
      Carlos
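One rough way to put a number on the 5'/3' bias question is to compare mean coverage at the two ends of a transcript. The sketch below assumes a per-base coverage list over the spliced transcript, ordered 5' to 3'; the function name `end_bias_ratio` and the 20% end fraction are arbitrary choices for illustration, not an established standard.

```python
from typing import Sequence


def end_bias_ratio(coverage: Sequence[float], fraction: float = 0.2) -> float:
    """Ratio of mean coverage in the 3' end to mean coverage in the 5' end.

    `coverage` is per-base depth along the transcript, ordered 5' -> 3'.
    Values near 1 suggest even coverage; values well above 1 suggest 3' bias.
    """
    n = len(coverage)
    if n == 0:
        raise ValueError("coverage must not be empty")
    k = max(1, int(n * fraction))          # number of bases taken from each end
    five_prime = sum(coverage[:k]) / k
    three_prime = sum(coverage[-k:]) / k
    if five_prime == 0:
        # No 5' signal at all: infinite bias if there is any 3' signal.
        return float("inf") if three_prime > 0 else 1.0
    return three_prime / five_prime


# Toy profiles (invented): flat coverage versus coverage rising toward the 3' end.
flat = [10.0] * 100
ramp = [float(i) for i in range(1, 101)]

print(end_bias_ratio(flat))
print(end_bias_ratio(ramp))
```

A single transcript is noisy, so in practice a ratio like this would be summarised over many expressed genes before drawing a conclusion.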



      • #4
        Thanks Carlos. From what I can see, I don't believe there is a bias (see attachment: good coverage across exons from end to end), but I have had datasets where a bias has happened. These samples were from low cell numbers, so the amplification prior to sequencing was probably the root cause of this.

        Regarding my original post, I was just talking to the guy who did the extraction. He said it was a total RNA TRIzol prep followed by column clean-up. This was then amplified using the Clontech SMART kit, which primes off the polyA tail. I understand that nascent RNA will be in there, but I'm failing to understand why I see bursts of reads within introns rather than low-level, intron-wide coverage.

        My concern is really whether these should be counted as the measure of expression or whether, as Richard said, I should just count within exons.
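The distinction between "bursts" and low-level intron-wide coverage can be made a little more concrete with two simple summaries of an intron's per-base coverage. This is only a sketch under stated assumptions: the coverage list, the 50 bp window, and the function names `breadth` and `peak_share` are all hypothetical choices for illustration.

```python
from typing import Sequence


def breadth(coverage: Sequence[int]) -> float:
    """Fraction of intron bases covered by at least one read."""
    if not coverage:
        return 0.0
    return sum(1 for c in coverage if c > 0) / len(coverage)


def peak_share(coverage: Sequence[int], window: int = 50) -> float:
    """Fraction of total intronic coverage that sits in the densest window."""
    total = sum(coverage)
    if total == 0:
        return 0.0
    window = min(window, len(coverage))
    current = sum(coverage[:window])       # coverage in the first window
    best = current
    for i in range(window, len(coverage)):
        # Slide the window one base to the right.
        current += coverage[i] - coverage[i - window]
        best = max(best, current)
    return best / total


# Toy introns (invented): thin uniform coverage versus one tight cluster.
uniform = [1] * 1000
bursty = [0] * 400 + [30] * 40 + [0] * 560

print(breadth(uniform), peak_share(uniform))
print(breadth(bursty), peak_share(bursty))
```

In this toy case the uniform intron has breadth 1.0 with a peak share of 0.05, while the bursty intron has breadth 0.04 with a peak share of 1.0. Low breadth combined with a high peak share is the "burst" pattern described above; what produces it biologically is a separate question that these numbers do not answer.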


        • #5
          I would say the intronic signal you marked and the two to its left are spaced the same as the expected exons, but shifted to the left. Am I right?



          • #6
            I see what you mean, but when I zoom in, the distances between exons do not exactly match where I see reads. Just to be on the safe side, I double-checked the genome and annotation versions, and they all match. It looks the same on UCSC too.

            Thanks for pointing this out.
