SEQanswers

Old 08-04-2011, 08:35 AM   #1
crh
Member
 
Location: tx

Join Date: Dec 2009
Posts: 46
Over-representation of reads mapping to UTRs

Hi,

We have mapped Illumina reads using a combination of SOAP and Bowtie. Reads (35 nt) were mapped with SOAP, and the non-mapped set was iteratively remapped after trimming from the 5' and 3' ends down to 21 nt. Bowtie was used to select the 'best mapping' reads from the set that mapped to multiple positions.

Looking at the mapping for many genes, it appears we have an over-representation of mappings to the UTRs:
http://www.cs.stedwards.edu/binf/phosphate/ptb12.tiff

There are also gaps in the mappings to some exons, which may be due to alternative splicing.

Comments? I'd like to be certain the mapping is OK before looking for differential expression (DE).

Charles
Old 08-04-2011, 09:34 AM   #2
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317

Just the 3' UTRs? Your picture shows strong bias towards mapping reads to the 3' end of that gene. This is to be expected using many rRNA depletion/cDNA synthesis methods. That is, methods that either purify mRNA using a hybridization to a polyA tail or prime first strand cDNA synthesis using an oligo dT primer will bias the resulting library toward 3' mapping reads.

No mystery here, the polyA tail is on the 3' end of a transcript, so if you pull out template or prime synthesis on the basis of that tail you get cDNA that is biased 3'. The more highly degraded the RNA, the higher the bias.
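One way to check for this kind of 3' bias across a whole data set is to convert each read start into a position along its transcript and count reads per bin from the 5' end to the 3' end. Below is a minimal sketch in Python, assuming you already have read starts in transcript coordinates (0 at the 5' end) and the transcript length. The function names and the toy numbers are invented for illustration and do not come from the data in this thread.

```python
def coverage_by_bin(read_starts, transcript_length, n_bins=10):
    """Count reads in n_bins equal-width bins along a transcript (5' -> 3').

    read_starts are 0-based positions in transcript coordinates,
    already oriented so that 0 is the 5' end.
    """
    counts = [0] * n_bins
    for pos in read_starts:
        if not 0 <= pos < transcript_length:
            raise ValueError(f"position {pos} is outside the transcript")
        # Integer arithmetic avoids floating point edge cases at bin borders.
        counts[pos * n_bins // transcript_length] += 1
    return counts


def three_prime_fraction(counts):
    """Fraction of reads that fall in the 3'-most half of the bins."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(counts[len(counts) // 2:]) / total


# Toy example (invented numbers): a 1000 nt transcript with most reads near the 3' end.
toy_starts = [50, 420, 610, 700, 780, 850, 900, 930, 960, 990]
toy_counts = coverage_by_bin(toy_starts, 1000)
toy_fraction = three_prime_fraction(toy_counts)
```

With even coverage the fraction should sit near 0.5; values well above that across many transcripts would be consistent with the 3' bias described above.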

What method of library construction was used? Was any initial QC done to determine how intact the initial RNA sample was?

--
Phillip

Last edited by pmiguel; 08-04-2011 at 09:37 AM.
Old 08-05-2011, 12:16 AM   #3
eslondon
Member
 
Location: London, UK

Join Date: Jul 2009
Posts: 21

Bear in mind that this pattern is also often an effect of the method used to fragment the RNA, which is well documented in the literature. This is a similar comment to the previous poster's (i.e. a library problem), but it concerns a different step in the library preparation. If you fragment the RNA before preparing the cDNA (e.g. hydrolysis of the RNA into 200-300 nucleotide fragments prior to reverse transcription), as is usually done, you are more likely to achieve uniform coverage of the gene.

In any case, as with all NGS experiments, get to know the experimental protocol used, because it has a big effect on what you see in the end.

With the data you have you are still likely to obtain good estimates of gene expression, but you will not be able to use your data to perform more sophisticated approaches, e.g. alt. splicing, etc.
Old 08-05-2011, 02:42 AM   #4
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269

As Phillip said, the lower the RNA stability, the higher the bias towards 3' UTRs, especially with oligo-dT-selected mRNAs.
Do you have an idea of the global trend, such as the overall proportion of reads in 3' UTRs? In regular RNA-seq data from standard Illumina protocols, I frequently find that about 20% of the reads overlap a 3' UTR.
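The overall proportion can be estimated by counting how many reads overlap any annotated 3' UTR interval. The following is only a sketch in Python, assuming reads and UTR annotations are available as (chromosome, start, end) tuples with half-open coordinates. The function names and coordinates are hypothetical, and the quadratic scan is meant to show the logic rather than to handle a real data set, where a dedicated interval tool would be the practical choice.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def fraction_in_regions(reads, regions):
    """Fraction of reads that overlap at least one region.

    reads and regions are lists of (chrom, start, end) tuples.
    Every read is compared with every region, so this is O(reads * regions).
    """
    if not reads:
        return 0.0
    hits = 0
    for chrom, start, end in reads:
        if any(chrom == r_chrom and overlaps(start, end, r_start, r_end)
               for r_chrom, r_start, r_end in regions):
            hits += 1
    return hits / len(reads)


# Invented coordinates, for illustration only: one 3' UTR and five 35 nt reads.
toy_utr3 = [("chr1", 900, 1200)]
toy_reads = [
    ("chr1", 100, 135),    # upstream of the UTR
    ("chr1", 880, 915),    # straddles the UTR start
    ("chr1", 1000, 1035),  # inside the UTR
    ("chr1", 1200, 1235),  # starts exactly where the UTR ends (no overlap)
    ("chr2", 950, 985),    # same coordinates, different chromosome
]
toy_fraction_utr3 = fraction_in_regions(toy_reads, toy_utr3)
```

Note that a read is counted once even if it overlaps several UTR intervals, so the result is a fraction of reads, not of overlaps.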
Old 08-05-2011, 08:32 AM   #5
crh
Member
 
Location: tx

Join Date: Dec 2009
Posts: 46

Thanks All,

I was not involved in the library prep, but have asked for details and will post them when I hear back.

I'm going to characterize this trend for all genes now, and I'll post that as well.

thanks!

Charles
Old 08-09-2011, 03:15 PM   #6
crh
Member
 
Location: tx

Join Date: Dec 2009
Posts: 46

Hi All,

I've checked the number of reads mapping to the 5' UTR, CDS and 3' UTR, and it does appear there was likely degradation of the polyA prior to fragmentation:

> summary(cds$utr5)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00    1.00    3.00   12.02    6.00 2689.00

> summary(cds$cds)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    1.0    11.0    24.0   113.7    54.0 22640.0

> summary(cds$utr3)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    1.0    16.0    42.0   177.8   111.0 17710.0


I think we can still do the DE analysis, since counts per gene are being compared across treatments?
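The reasoning can be sketched numerically: if a gene's counts are scaled by the same factor in every library, that factor cancels when the gene is compared with itself across treatments. This is a toy illustration in Python, not a DE method. It assumes the bias is identical in both libraries, which would not hold if one sample were more degraded than the other, and all names and numbers are invented.

```python
import math


def log2_ratio(treated_count, control_count):
    """log2 fold change of one gene between two treatments."""
    return math.log2(treated_count / control_count)


# Hypothetical "true" counts for one gene: a 4-fold increase on treatment.
true_control, true_treated = 100, 400
unbiased = log2_ratio(true_treated, true_control)

# Same bias in both libraries: only a quarter of the reads survive in each.
# The factor appears in numerator and denominator, so it cancels.
equal_bias = log2_ratio(true_treated * 0.25, true_control * 0.25)

# Different bias per library (e.g. unequal degradation): the estimate shifts.
unequal_bias = log2_ratio(true_treated * 0.25, true_control * 0.5)
```

In the equal-bias case the fold change is recovered unchanged, while in the unequal case it drops from 2 to 1 on the log2 scale, which is why it is worth confirming that all libraries were prepared and degraded similarly.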

Charles
Old 08-11-2011, 07:38 AM   #7
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269

Yes, more than 50 reads just considering the CDS sounds quite good to me, although I don't have any precise standards in mind.