  • Alternative splice or RNA-seq generated error?

    I am about as much of a newbie as a newbie can get, so to be honest I'm a little reluctant to post. However, looking at the raw Illumina reads I have found a result that I cannot readily explain this early in my RNA-seq workflow, and with my limited knowledge at this point I thought I would ask for an opinion on the results below.

    I am particularly interested in a hypothetical unannotated paralog or alternative splice that may not align with my genome, so I am spending quite a bit of time perusing the raw Illumina reads to take a close look at some of the more conserved regions of my research proteins, looking for paralogs, etc. (and to get a 'feel' for the raw reads and how they behave on manual queries). After I finish with this preliminary analysis I have a few hundred hours of learning ahead before I can comment with any confidence on sequence assembly matters. I enjoy computers, but I am far from being a Linux wizard.

    Below is a result I found generated from high quality reads (>Q30) which suggests an alternative splice. In the code box below there are three lines:

    Line 1: partial exon 2 of one of my research proteins
    Line 2: raw Illumina reads linked by grep query, all have >Q30, and all cross the putative splice site
    Line 3: partial exon 7 of the same protein in Line 1

    The <....> bracket indicates the beginning of an intronic sequence at the end of exon 2.

    Code:
                                  ********** ***::****:*
    e2                         ...RGHTGLFAGG<ASTYQVGLELC...>
         ...GHALLFRTSVMAKVEIQAVSTCRGHTGLFAGG<ASTFHVGLEAC...>
    e7   ...GHALLYRTTVMAKLEIQAVSTCR...      <--- intron --->
            *****:**:****:*********
    The above result appears to be an alternative splice. However, I was wondering if it may be an error generated by RNA-seq preparation of exp material, i.e., two pieces of DNA randomly cut and joined. There were about 10 copies of the middle region above all yielding high quality reads and all crossing an apparent splice site.
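The match marks in the code box above can be checked with a short script. This is only a sketch: it compares equal-length, ungapped segments copied by hand from the post, and it marks identical residues only. The `:` marks for similar residues would need a substitution matrix, which is not assumed here. The function names are illustrative and do not come from any RNA-seq tool.

```python
def identity_marks(a: str, b: str) -> str:
    """Return '*' where two aligned, equal-length segments match, ' ' elsewhere."""
    if len(a) != len(b):
        raise ValueError("segments must have the same length (ungapped alignment)")
    return "".join("*" if x == y else " " for x, y in zip(a, b))


def identity_count(a: str, b: str) -> int:
    """Number of identical positions between two aligned segments."""
    return identity_marks(a, b).count("*")


# Segments copied from the post. The read is split at the putative junction;
# the 'R' that ends the exon 7 match also starts the exon 2 match.
read_left = "GHALLFRTSVMAKVEIQAVSTCR"
exon7 = "GHALLYRTTVMAKLEIQAVSTCR"
read_right = "RGHTGLFAGGASTFHVGLEAC"
exon2 = "RGHTGLFAGGASTYQVGLELC"  # includes the residues after the '<' bracket

if __name__ == "__main__":
    for name, ref, seg in (("e7", exon7, read_left), ("e2", exon2, read_right)):
        print(name, identity_count(ref, seg), "of", len(ref), "identical")
        print("  " + ref)
        print("  " + identity_marks(ref, seg))
        print("  " + seg)
```

On these segments the read is identical to exon 7 at 20 of 23 positions and to the exon 2 region at 18 of 21 positions, which agrees with the `*` marks in the post.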

    Q: What is the likelihood that the above is real and not a machine artifact?
    Last edited by Louis_Lemire; 07-17-2011, 08:36 AM. Reason: grammer

  • #2
    I may have found an explanation for this strange exon7-exon2 splice. Li et al. (2008), in a paper on differential gene expression, discuss a statistical approach to estimating the degree of alternative splicing. [1] Li points out that splicing of exons in reverse order is 'impossible' [2,3] and uses these rare events as a measure of false discoveries among alternative splice calls. Li found that this type of splicing event amounts to roughly 1% of mapped junctions.

    Presumably, during the workup of the cDNA for Illumina sequencing, there is a low probability that the DNA forms a hairpin, turns back on itself and recombines, which would be a rare event. That this happened with my research protein was coincidental.
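The idea of using reverse-order junctions as a false discovery measure can be sketched as follows. This is an illustration of the reasoning only, not the procedure from Li et al.: the junction representation (a pair of exon numbers within one gene) and the example list are made up for demonstration.

```python
from typing import Iterable, Tuple

# (upstream exon number, downstream exon number) as they appear in a read.
# This representation is an assumption made for the sketch.
Junction = Tuple[int, int]


def is_reverse_order(junction: Junction) -> bool:
    """True when the read joins a later exon to an earlier one, e.g. exon 7 to exon 2."""
    upstream, downstream = junction
    return upstream > downstream


def reverse_fraction(junctions: Iterable[Junction]) -> float:
    """Fraction of junctions in reverse exon order, a rough proxy for the artifact rate."""
    junctions = list(junctions)
    if not junctions:
        raise ValueError("no junctions given")
    return sum(is_reverse_order(j) for j in junctions) / len(junctions)


if __name__ == "__main__":
    # Hypothetical junction calls, not real data.
    example = [(1, 2), (2, 3), (2, 4), (7, 2)]
    print(reverse_fraction(example))
```

With real junction calls, a reverse-order fraction near the roughly 1% quoted above would be consistent with a background artifact rate, while a much higher fraction for one gene would deserve a closer look.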

    [1] H. Li et al. (2008) Determination of tag density required for digital transcriptome analysis: Application to an androgen-sensitive prostate cancer model. PNAS 105(52):20179-20184.
    [2] D. L. Black et al. (2003) Mechanisms of alternative pre-messenger RNA splicing. Annu Rev Biochem 72:291-336.
    [3] J. M. Johnson et al. (2003) Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science 302:2141-2144.
