Alternative splice or RNA-seq generated error?

Louis_Lemire

Junior Member

Join Date: Jun 2011

Posts: 5
#1


07-17-2011, 07:31 AM

I am about as new to this as one can get, so to be honest I'm a little reluctant to post. However, while looking at the raw Illumina reads I found a result that I cannot readily explain this early in my RNA-seq workflow, and, given my limited knowledge at this point, I thought I would ask for an opinion on the results below.

I am particularly interested in a hypothetical unannotated paralog or alternative splice form that may not align with my genome, so I am spending quite a bit of time perusing the raw Illumina reads, taking a close look at some of the more conserved regions of my research proteins and looking for paralogs, etc. (as well as getting a 'feel' for the raw reads and how they behave on manual queries). After I finish this preliminary analysis I have a few hundred hours of learning ahead of me before I can comment with any confidence on sequence assembly matters. I enjoy computers, but I am far from a Linux wizard.

Below is a result from high-quality reads (>Q30) that suggests an alternative splice. The code box below contains three sequence lines:

Line 1: partial exon 2 of one of my research proteins
Line 2: raw Illumina reads retrieved by a grep query; all have >Q30 and all cross the putative splice site
Line 3: partial exon 7 of the same protein in Line 1

The <....> bracket indicates the beginning of an intronic sequence at the end of exon 2.

Code:

                            ********** ***::****:*
e2                       ...RGHTGLFAGG<ASTYQVGLELC...>
   ...GHALLFRTSVMAKVEIQAVSTCRGHTGLFAGG<ASTFHVGLEAC...>
e7 ...GHALLYRTTVMAKLEIQAVSTCR...      <--- intron --->
      *****:**:****:*********

The above result appears to be an alternative splice. However, I was wondering whether it may be an artifact of the RNA-seq preparation of the experimental material, i.e., two pieces of DNA randomly cut and joined. There were about 10 copies of the middle read above, all of high quality and all crossing the apparent splice site.
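For anyone wanting to repeat the kind of query described for Line 2, here is a minimal Python sketch of it. It is only an illustration under stated assumptions: the reads are in standard 4-line FASTQ records, qualities are Phred+33 encoded (older Illumina pipelines used Phred+64, so `phred_offset` may need changing), and the flanking sequences are given as nucleotides rather than the translated residues shown in the code box. The function names and parameters are hypothetical, not part of any existing tool.

```python
from typing import Iterator, List, Sequence, Tuple

# Translation table for complementing nucleotide sequences.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_fastq(lines: Sequence[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from 4-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        read_id = lines[i].strip()[1:]  # drop the leading '@'
        yield read_id, lines[i + 1].strip().upper(), lines[i + 3].strip()


def junction_reads(
    lines: Sequence[str],
    left_flank: str,
    right_flank: str,
    min_q: int = 30,
    phred_offset: int = 33,  # assumption: Phred+33 encoding
) -> List[Tuple[str, str]]:
    """Return reads that span left_flank+right_flank with every base above min_q."""
    junction = (left_flank + right_flank).upper()
    # A read may come from either strand, so search for both orientations.
    targets = {junction, reverse_complement(junction)}
    hits = []
    for read_id, seq, qual in parse_fastq(lines):
        if not qual:
            continue  # skip malformed records with no quality string
        lowest = min(ord(c) - phred_offset for c in qual)
        if lowest > min_q and any(t in seq for t in targets):
            hits.append((read_id, seq))
    return hits
```

With a file of reads (the name `reads.fastq` is made up here), `junction_reads(open("reads.fastq").read().splitlines(), left, right)` would return the reads crossing the junction; counting how many distinct sequences come back is a quick first look at whether the roughly 10 copies are independent.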

Q: What is the likelihood that the above is real and not a machine artifact?

Last edited by Louis_Lemire; 07-17-2011, 08:36 AM. Reason: grammer
Louis_Lemire

#2

07-22-2011, 02:38 AM

I may have found an explanation for this strange exon7-exon2 splice. In a paper on differential gene expression, Li et al. (2008) discuss a statistical approach to estimating the degree of alternative splicing [1]. Li points out that the splicing of exons in reverse order is 'impossible' [2,3] and uses these rare events as a measure of false discoveries among alternative splices. Li found that these types of splicing events amount to roughly 1% of mapped junctions.
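To make the idea concrete, here is a rough Python sketch of the ratio as I understand it from the description above: count the junctions whose exons appear in reverse order and divide by all mapped junctions. This is not Li et al.'s actual procedure; the function name and the representation of a junction as a pair of exon numbers are assumptions for illustration only.

```python
from typing import Iterable, Tuple


def reverse_order_fraction(junctions: Iterable[Tuple[int, int]]) -> float:
    """Fraction of junctions whose exons are not in increasing order.

    Each junction is (first_exon, second_exon): the exon numbers of the
    two pieces in the order they appear 5'->3' in the read. A pair such
    as (7, 2) is a reverse-order junction.
    """
    total = 0
    reverse = 0
    for first_exon, second_exon in junctions:
        total += 1
        # Assumption: a junction joining an exon to itself or to an
        # earlier exon is counted as a reverse-order (implausible) event.
        if first_exon >= second_exon:
            reverse += 1
    if total == 0:
        raise ValueError("no junctions supplied")
    return reverse / total
```

For example, a list of 99 forward junctions such as `(1, 2)` plus one `(7, 2)` junction gives a fraction of 1/100, in line with the roughly 1% figure quoted above.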

Presumably, during the workup of the cDNA for Illumina sequencing there is a low probability that the DNA forms a hairpin, turning back on itself, and purportedly recombines; this would be a rare event. That it happened with my research protein was coincidental.

[1] H. Li et al. (2008) Determination of tag density required for digital transcriptome analysis: Application to an androgen-sensitive prostate cancer model. PNAS 105(52):20179-20184.
[2] D. L. Black et al. (2003) Mechanisms of alternative pre-messenger RNA splicing. Annu Rev Biochem 72:291-336.
[3] J. M. Johnson et al. (2003) Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science 302:2141-2144.