This happened just recently. We had 3 rice samples (1 control and 2 treated) for RNA-seq. Because of budget constraints and on the advice of the local sequencing provider, we decided on single-end sequencing (10 million reads, 50 bp per read). When we got the sequencing results, I found that the file sizes differed widely, from 1.5 GB to 3 GB. This was because the provider had split our samples into 2 different sequencing batches: one batch was sequenced at 50 bp and the other at 150 bp.
This makes my adviser and me worry about whether we can compare differential expression across these data. The provider said, "Yes, you can. Don't worry." Is that true?
My other question is about the differential expression comparison itself.
First, I trimmed the raw reads to the same length (50 bp per read) and used TopHat and Cufflinks to calculate RPKMs. This gave around 600 genes up-regulated by at least 2-fold. Then I thought, since we had 150 bp reads, why not use the full read length for the calculation as well? This time I got around 750 genes up-regulated by at least 2-fold. When I compared the 2 results, only around 220 genes appeared in both calculations. (The raw reads of 50 bp length were omitted from this comparison.)
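To make the two steps above concrete, here is a minimal Python sketch of trimming reads to a fixed length and of measuring how much two gene lists overlap. It is only an illustration, not the tool actually used for the trimming: the function names, the 4-line FASTQ layout, and the gene IDs in the test data are assumptions.

```python
from typing import Iterable, Iterator, List, Set, Tuple

# A FASTQ record as (header, sequence, separator, quality); assumes the
# common 4-lines-per-record layout with no wrapped sequence lines.
FastqRecord = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records from an iterable of text lines."""
    buf: List[str] = []
    for line in lines:
        buf.append(line.rstrip("\n"))
        if len(buf) == 4:
            yield (buf[0], buf[1], buf[2], buf[3])
            buf = []


def trim_record(rec: FastqRecord, length: int = 50) -> FastqRecord:
    """Keep only the first `length` bases and their quality scores."""
    header, seq, sep, qual = rec
    return (header, seq[:length], sep, qual[:length])


def overlap_summary(a: Set[str], b: Set[str]) -> Tuple[int, float]:
    """Return (number of shared genes, Jaccard index) for two gene sets."""
    shared = a & b
    union = a | b
    jaccard = len(shared) / len(union) if union else 0.0
    return len(shared), jaccard
```

With the approximate counts above (600 and 750 genes, about 220 shared), the union is about 1130 genes, so the Jaccard index would be roughly 0.19.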
The DE results have left us confused. Which results should we trust: those from the assembly with 50 bp reads or those with 150 bp reads? If we use qPCR to validate them, what should we expect?
Can anyone give me some advice?
Many thanks,
Chung-Wen