SEQanswers

Old 05-07-2011, 02:54 AM   #1
yeyeming
Junior Member
 
Location: beijing

Join Date: Nov 2010
Posts: 7
What is the effect of a large difference in library size in RNA-seq?

We sequenced RNA from two samples from the same tissue but different groups using SOLiD, and obtained 28 million and 95 million reads. Is this normal, and what is the effect of such a large difference?
Old 05-09-2011, 03:24 AM   #2
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,315

Did you do the sequencing yourself? If so could you tell us what the enrichment % was for each library? Were these barcoded and put into the same region, or in different regions?

It is possible, for many reasons, to see large differences in the numbers of sequences from similar samples.

--
Phillip
Old 05-10-2011, 04:35 AM   #3
yeyeming
Junior Member
 
Location: beijing

Join Date: Nov 2010
Posts: 7

Quote:
Originally Posted by pmiguel View Post
Did you do the sequencing yourself? If so could you tell us what the enrichment % was for each library? Were these barcoded and put into the same region, or in different regions?

It is possible, for many reasons, to see large differences in the numbers of sequences from similar samples.

--
Phillip
Thank you for your reply. Unfortunately, we did not do the sequencing ourselves; all I received are the csfasta and qual files for each library. So I do not know the enrichment %, nor what is meant by "Were these barcoded and put into the same region, or in different regions".
Can I derive some metrics from the csfasta and qual files to decide whether I can go on with my analysis? Thanks.

Ye
Old 05-10-2011, 04:48 AM   #4
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,315

Hi Ye,
My advice would be to ask the facility that did the sequencing for you why the numbers are so different for the samples.
Have you mapped the sequences against your reference genome? If so what were the mapping percentages and what organism was it?
You might also want to plot a histogram of quality scores for both data sets. That would give you an indication of whether there are vast differences in their sequence quality.

--
Phillip
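
The quality-score histogram suggested above can be sketched in a few lines of Python. This is only a minimal sketch, assuming the usual SOLiD .qual layout (a `>` read-name line followed by a line of space-separated integer quality values, with optional `#` header lines); the function names and the tiny example input are made up for illustration.

```python
from collections import Counter

def quality_histogram(qual_lines):
    """Count how often each quality value occurs in .qual-style lines."""
    counts = Counter()
    for line in qual_lines:
        line = line.strip()
        # Skip blank lines, '#' header comments and '>' read-name lines.
        if not line or line[0] in "#>":
            continue
        counts.update(int(q) for q in line.split())
    return counts

def print_histogram(counts, width=50):
    """Print a simple text bar chart, one row per quality value."""
    if not counts:
        return
    peak = max(counts.values())
    for q in sorted(counts):
        bar = "#" * max(1, round(width * counts[q] / peak))
        print(f"{q:>3} {counts[q]:>12} {bar}")

# Made-up input for illustration; a real run would pass an open file handle.
example = [
    "# hypothetical header",
    ">1_1_1_F3",
    "30 28 25 -1 12",
    ">1_1_2_F3",
    "30 30 28 5 12",
]
hist = quality_histogram(example)
print_histogram(hist)
```

Running it once per library, for example with `with open("sampleA.qual") as fh: print_histogram(quality_histogram(fh))` (the file name is hypothetical), gives two charts that can be compared side by side.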
Old 05-19-2011, 10:47 AM   #5
dzavallo
Member
 
Location: argentina

Join Date: Apr 2011
Posts: 16

Hi Ye, I have the same problem with my 3 treatments: 10, 6.4 and 5.7 million reads each. And when I mapped against the reference genome, the mapping percentages were very different too.
How did you normalize your counts?
Old 05-19-2011, 10:28 PM   #6
ishmael
Member
 
Location: NY, US

Join Date: Jul 2008
Posts: 17

I agree with Phillip; many factors may affect the final sequencing result.
Were the datasets sequenced in the same batch?
What are the FastQC results?
You could check the top 20 expressed sequences of each dataset; they may give you some clues.
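
One way to read the last suggestion is to list the most frequent read sequences in each csfasta file, which can reveal a library dominated by a few sequences. A minimal sketch, assuming each read is a `>` name line followed by one color-space sequence line; the function name and example reads are hypothetical.

```python
from collections import Counter

def top_sequences(csfasta_lines, n=20):
    """Return the n most frequent read sequences with their counts."""
    counts = Counter()
    for line in csfasta_lines:
        line = line.strip()
        # Skip blank lines, '#' header comments and '>' read-name lines.
        if not line or line[0] in "#>":
            continue
        counts[line] += 1
    return counts.most_common(n)

# Made-up color-space reads for illustration only.
example = [
    ">1_1_1_F3", "T0123",
    ">1_1_2_F3", "T0123",
    ">1_1_3_F3", "T3210",
]
top = top_sequences(example, n=2)
print(top)
```

Keeping every distinct sequence in memory may be too much for tens of millions of reads, so running this on a subset of each file (say the first few million reads) is probably a safer starting point.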
Old 05-19-2011, 10:37 PM   #7
yeyeming
Junior Member
 
Location: beijing

Join Date: Nov 2010
Posts: 7

Which tool did you use for mapping? I used TopHat for the analysis; the accepted hits for my two libraries account for 42% and 57% of the reads. I then used cuffdiff (part of Cufflinks) with FPKM for the differential expression analysis.
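
For readers asking how FPKM handles different library sizes: the textbook definition divides the fragment count of a gene by its length in kilobases and by the number of mapped fragments in millions. The sketch below uses that plain formula with made-up numbers; it is not what cuffdiff does internally, which involves further modelling.

```python
def fpkm(fragments, length_bp, total_mapped):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped)

# Hypothetical gene of 2000 bp in two libraries of very different depth.
# Library B has 4x the mapped fragments and 4x the count for this gene.
fpkm_a = fpkm(fragments=500, length_bp=2000, total_mapped=10_000_000)
fpkm_b = fpkm(fragments=2000, length_bp=2000, total_mapped=40_000_000)
print(fpkm_a, fpkm_b)
```

Both values come out equal, which is the point: a gene whose count scales with sequencing depth gets the same FPKM in both libraries. Scaling by total mapped fragments alone does not correct for differences in library composition, so very different mapping percentages are still worth investigating.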
Old 06-04-2012, 08:59 AM   #8
CC_seqanswers
Member
 
Location: ILLINOIS

Join Date: Jan 2011
Posts: 30

It's normal.

The two groups might just have had a different number of libraries running on a flowcell, or library normalization was off before emPCR. The number of reads your sample gets depends on the proportion of your library in the whole sequencing pool. Ask whoever handles the agreement between your group and the sequencing service provider how many raw reads you are supposed to get based on what you paid for. You might just be lucky to have got 95M reads when you were supposed to get only 28M. Or maybe it's the other way around.


Quote:
Originally Posted by yeyeming View Post
We sequenced RNA from two samples from the same tissue but different groups using SOLiD, and obtained 28 million and 95 million reads. Is this normal, and what is the effect of such a large difference?
Old 03-15-2013, 05:33 AM   #9
Marianna85
Member
 
Location: Italy

Join Date: Mar 2012
Posts: 32

Quote:
Originally Posted by dzavallo View Post
Hi Ye, I have the same problem with my 3 treatments: 10, 6.4 and 5.7 million reads each. And when I mapped against the reference genome, the mapping percentages were very different too.
How did you normalize your counts?
Hi dzavallo,
I'm dealing with a similar problem. What did you do in the end? How did you perform the normalization?