SEQanswers

09-27-2011, 11:19 AM   #1
giorgifm
Member
 
Location: Columbia University Medical Center

Join Date: Aug 2011
Posts: 35
Differential transcript expression between different varieties of the same species

Dear all,

We have sequenced the transcriptomes of different varieties of the same species (which lacks complete genome information) using different read types, and obtained contigs with a respectable N50 for each variety.
Now we want to use the reads to perform a differential transcript expression analysis, and I can think of two possible strategies.

1) Map each variety's reads onto that variety's own contigs, then group the contigs across varieties via some homology procedure, and perform the analysis by comparing the counts within these groups.
Issue 1: the same transcript is sometimes fragmented to different degrees across the assemblies (e.g. three fragments here, the full sequence there), making a direct 1-to-1 comparison inappropriate.
Issue 2: robust orthology assignment methods (OrthoMCL, InParanoid, etc.) are tuned for, and work principally on, proteins (as far as I know).
Issue 3: all the differential expression tools that I know (e.g. edgeR) assume that the contig targeted by the read counts has the same length in every sample.

2) Alternatively, do a single joint assembly using the reads from all varieties, and then map each variety's reads separately onto these contigs, thereby solving all the previous issues. However, it sounds dirty, and the joint assembly is very fragmented compared to the variety-specific ones.
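
As an illustration of Issues 1 and 3 under strategy 1, here is a minimal Python sketch of one possible workaround: pool the fragments of each homology group within a variety and express the group as reads per kilobase, so that a transcript split into three fragments and the same transcript assembled in full become comparable. The names `reads_per_kb` and `group_rates` and the input layout are assumptions made for this example, not part of any tool mentioned in the thread, and the sketch does not correct for library size.

```python
from typing import Dict, List, Tuple

# One contig fragment: (mapped read count, contig length in bp).
Fragment = Tuple[int, int]


def reads_per_kb(fragments: List[Fragment]) -> float:
    """Pool all fragments of one homology group and return reads per kilobase."""
    total_reads = sum(count for count, _ in fragments)
    total_length = sum(length for _, length in fragments)
    if total_length <= 0:
        raise ValueError("homology group has no sequence length")
    return total_reads / (total_length / 1000.0)


def group_rates(
    groups: Dict[str, Dict[str, List[Fragment]]]
) -> Dict[str, Dict[str, float]]:
    """Map group id -> variety -> length-normalized rate.

    `groups` is a hypothetical layout: group id -> variety -> fragments.
    """
    return {
        group_id: {
            variety: reads_per_kb(fragments)
            for variety, fragments in per_variety.items()
        }
        for group_id, per_variety in groups.items()
    }


if __name__ == "__main__":
    example = {
        "group1": {
            # Variety A assembled the transcript as three fragments.
            "varA": [(30, 1000), (20, 1000), (10, 1000)],
            # Variety B assembled the same transcript as one full contig.
            "varB": [(120, 3000)],
        }
    }
    print(group_rates(example))
```

Count-based tools such as edgeR and DESeq expect raw counts, so length-normalized values like these are better suited to a quick sanity check than to direct use as input.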

How would you tackle a case like this? Would you favour one approach or the other? Possibly I'm missing some major strategy (and perhaps I'm duplicating another post on the issue), but forgive me, I'm a fresher.

Thank you!

Federico

10-01-2011, 01:36 PM   #2
peer.b
Junior Member
 
Location: Germany

Join Date: Oct 2011
Posts: 5

Dear Federico,

In my experience, I have always adopted the second solution. Maybe you can increase the completeness of the reference transcriptome used for mapping by adding some public ESTs, if they are available for your species.

10-05-2011, 07:43 AM   #3
tonybolger
Senior Member
 
Location: berlin

Join Date: Feb 2010
Posts: 156

It's much easier to work with a single transcriptome for DE comparisons.

In theory, a combined assembly sounds like an easy way to get that. Unfortunately, with typical de Bruijn graph assemblers it's not easy to prevent differences between the varieties from fragmenting the assembly.

Perhaps it might be possible to merge the assemblies somehow, combining very similar transcripts into one.

10-05-2011, 01:41 PM   #4
greigite
Senior Member
 
Location: Cambridge, MA

Join Date: Mar 2009
Posts: 141

There seem to be two strategies for a pooled assembly:
1) pool all transcriptome reads from all varieties, then assemble
2) pool only consensus contig sequences from your variety-specific assemblies and assemble those
Not sure which option you used already, but if it was (1), you might see better results with strategy (2).
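
A minimal Python sketch of the preparation step for strategy (2), assuming one FASTA file of contigs per variety: the contigs are pooled into a single file and each ID is prefixed with its variety so that its origin is not lost before reassembly or merging. The function names `read_fasta` and `pool_contigs`, the `variety|contig` naming scheme, and the file names in the usage note are all hypothetical choices for this example.

```python
from pathlib import Path
from typing import Dict, Iterator, Tuple


def read_fasta(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (contig_id, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                # Keep only the first word of the header as the contig ID.
                fields = line[1:].split()
                header, chunks = (fields[0] if fields else ""), []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def pool_contigs(assemblies: Dict[str, Path], out_path: Path) -> int:
    """Write all contigs to one FASTA file, tagging each ID with its variety.

    Returns the number of contigs written.
    """
    written = 0
    with open(out_path, "w") as out:
        for variety, fasta in assemblies.items():
            for contig_id, seq in read_fasta(fasta):
                out.write(f">{variety}|{contig_id}\n{seq}\n")
                written += 1
    return written
```

For example, `pool_contigs({"varA": Path("varA_contigs.fa"), "varB": Path("varB_contigs.fa")}, Path("pooled_contigs.fa"))` would produce a single input file for the second assembly or merging round.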

10-06-2011, 07:03 AM   #5
giorgifm
Member
 
Location: Columbia University Medical Center

Join Date: Aug 2011
Posts: 35

Dear all, thanks for your replies!

I adopted (my) second solution. In this way I got precious information on e.g. variety-specific genes. Since differential expression analysis works well even in many-vs-zero cases (I'm using DESeq), I encountered no apparent problems. However, I still haven't ruled out the problem arising from reads that align to nearly identical contigs: my pipeline discards them because they have two identical hits in different contigs. In this respect, the approach suggested by Tony and by greigite (option 2, i.e. merging the contigs after variety-specific assembly) would be optimal.
But in that case, I would do an all-vs-all alignment of the contigs, group them into high-similarity clusters, multiple-align each cluster, merge the aligned sequences with something like consambig, and then use the results as merged contigs. However, this way the variety-specific information would be partially lost; SNPs, for example, would be completely ignored.

So, everything considered, even a merged consensus contig solution does not seem to be the best one...
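
The grouping step mentioned in this post (all-vs-all alignment, then high-similarity clusters) can be sketched in Python as single-linkage clustering with a union-find structure. This is only an illustration under stated assumptions: the input is a list of `(contig_a, contig_b, percent_identity)` tuples already parsed from some aligner's output, and the function name `cluster_contigs` and the 95% default threshold are hypothetical choices, not values from the thread. Multiple alignment and consensus building would still have to follow.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# One pairwise alignment hit: (contig_a, contig_b, percent identity).
Hit = Tuple[str, str, float]


def cluster_contigs(hits: Iterable[Hit], min_identity: float = 95.0) -> List[List[str]]:
    """Group contigs connected by hits at or above `min_identity`.

    Single-linkage: A and C end up together if A~B and B~C both pass,
    even when A and C were never aligned to each other directly.
    """
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, identity in hits:
        # Register both contigs even if the hit is below the threshold,
        # so that they still appear as singleton clusters.
        root_a, root_b = find(a), find(b)
        if identity >= min_identity and root_a != root_b:
            parent[root_a] = root_b

    clusters: Dict[str, Set[str]] = defaultdict(set)
    for contig in list(parent):
        clusters[find(contig)].add(contig)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    example_hits = [
        ("varA|c1", "varB|c7", 98.0),
        ("varB|c7", "varC|c3", 97.0),
        ("varA|c2", "varB|c9", 80.0),
    ]
    for members in cluster_contigs(example_hits):
        print(members)
```

Single linkage is the simplest choice and can chain loosely related contigs together, so a stricter threshold or an alignment-coverage filter may be needed in practice.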

03-04-2014, 06:56 AM   #6
Birdman
Member
 
Location: Montreal

Join Date: Jan 2014
Posts: 21

Hi, you may have moved on to something completely different after nearly 3 years, but I'm really curious about how this story ended... I'm facing the same kind of problem and I would like to know what your best strategy was in the end, and whether it worked well.

03-04-2014, 07:02 AM   #7
giorgifm
Member
 
Location: Columbia University Medical Center

Join Date: Aug 2011
Posts: 35

Quote:
Originally Posted by Birdman
Hi, you may have moved on to something completely different after nearly 3 years, but I'm really curious about how this story ended... I'm facing the same kind of problem and I would like to know what your best strategy was in the end, and whether it worked well.

Ah, we ended up using the second approach and then validated a selection of the differential expression results. Classic. 80% of our findings had a significant match with RT-PCR results.

03-04-2014, 07:08 AM   #8
Birdman
Member
 
Location: Montreal

Join Date: Jan 2014
Posts: 21

An 80% match with RT-PCR is great! So in summary, you merged your assemblies together and then aligned your samples separately to this super-assembly? How did you merge the similar contigs within this super-assembly? CD-HIT-EST or something else?

12-12-2016, 05:21 AM   #9
ina-maria
Junior Member
 
Location: Germany, Berlin

Join Date: Jul 2012
Posts: 2

Dear giorgifm,
Did you already publish a paper describing your method? I would like to have a closer look at how you solved this issue. Kind regards
Tags
rnaseq