SEQanswers

04-29-2012, 12:53 AM   #1
cerebralrust
Junior Member
 
Location: Sweden

Join Date: Jan 2012
Posts: 8
Differential gene expression analysis without reference

I have to find the differential gene expression between two genotypes of a plant species sequenced on a 454 pyrosequencer.

There is no reference genome, and the closest species, Glycine max, aligns poorly with the reads.

How does one go about DE analysis in this case?

Should I combine the reads of both genotypes, assemble them, and use that as a reference genome? Then map the reads of each genotype to this reference and continue the analysis?

Thank you.
05-02-2012, 08:51 PM   #2
phoss
Member
 
Location: Beltsville, MD

Join Date: Aug 2011
Posts: 12

Hi cerebralrust,

I was curious if you've tried M. truncatula? This is a close relative of G. max.
Have you had any luck with genomes off of Phytozome?

Last edited by phoss; 05-03-2012 at 06:11 PM.
05-03-2012, 12:06 AM   #3
sdriscoll
I like code
 
Location: San Diego, CA, USA

Join Date: Sep 2009
Posts: 401

You might try one of the "no genome" assemblers like Trinity or ABySS to build a "gene" library from your data. I think those put together a consensus set of sequences assembled from your reads. Then you could build a bowtie reference from those FASTA sequences and align your reads to it with bowtie. Finally, you can count the reads aligned to each sequence and compare samples using something like DESeq.

You'll need some major computer power to run Trinity, from what I hear. Assembling sequences from reads is much more resource-consuming than the bowtie alignment stage.
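
As a rough illustration of the "count reads aligned to each one" step, here is a minimal Python sketch that tallies mapped reads per assembled contig from SAM-format alignment lines. It assumes the aligner was asked for SAM output; the function name and the toy records are made up for illustration and are not part of any tool mentioned in this thread.

```python
from collections import Counter


def count_reads_per_contig(sam_lines):
    """Count primary, mapped alignments per reference name in SAM text."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        contig = fields[2]
        # FLAG bits: 0x4 = unmapped, 0x100 = secondary, 0x800 = supplementary
        if flag & 0x4 or flag & 0x100 or flag & 0x800 or contig == "*":
            continue
        counts[contig] += 1
    return counts


# Toy records (hypothetical read and contig names), one call per sample.
example_sam = [
    "@SQ\tSN:contig1\tLN:500",
    "r1\t0\tcontig1\t1\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tcontig1\t10\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t0\tcontig2\t5\t255\t4M\t*\t0\t0\tACGT\tIIII",
]
print(dict(count_reads_per_contig(example_sam)))
```

Running this once per genotype gives one count per contig per sample, which is the kind of table a count-based tool such as DESeq takes as input.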
05-03-2012, 02:21 PM   #4
jujubix
Member
 
Location: Vancouver

Join Date: May 2011
Posts: 14

De novo assembly, as sdriscoll mentioned, is the typical solution when no decent reference genome exists.

Given that you're dealing with gene expression, I assume you have transcriptome reads, in which case you could look into Trans-ABySS, which is the transcriptome-specific version of ABySS. It is a single software pipeline that aims to assemble reads into transcripts and quantify transcript abundance, all without a reference genome. In theory you would end up with two sets of transcripts and expression levels, after which standard DE analysis could be conducted. Although finding corresponding transcripts between the two sets could be tricky...

Software link is here and paper is here
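
One common way to approach the "corresponding transcripts" problem is reciprocal best hits: BLAST set A against set B and B against A, then keep only pairs that are each other's top hit. The sketch below is a minimal Python version of that idea, assuming both searches were saved in BLAST's 12-column tabular format (query ID in column 1, subject ID in column 2, bit score in column 12). The function names and transcript IDs are hypothetical, and this is not what Trans-ABySS itself does.

```python
def best_hits(blast_lines):
    """Map each query ID to its highest-bit-score subject ID."""
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b_lines, b_vs_a_lines):
    """Return sorted (a, b) pairs that are each other's best hit."""
    a_to_b = best_hits(a_vs_b_lines)
    b_to_a = best_hits(b_vs_a_lines)
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a)


def row(query, subject, bitscore):
    """Build one toy tabular line; only the IDs and bit score matter here."""
    return "\t".join([query, subject, "95.0", "100", "5", "0",
                      "1", "100", "1", "100", "1e-40", str(bitscore)])


a_vs_b = [row("A1", "B1", 180), row("A1", "B2", 120),
          row("A2", "B2", 150), row("A3", "B1", 100)]
b_vs_a = [row("B1", "A1", 180), row("B1", "A3", 100),
          row("B2", "A2", 150), row("B2", "A1", 120)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Transcripts without a reciprocal partner (A3 in the toy data) are simply left unmatched, which is part of why matching two separate assemblies can be tricky.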

Last edited by jujubix; 05-03-2012 at 02:26 PM.
05-03-2012, 02:37 PM   #5
sdriscoll
I like code
 
Location: San Diego, CA, USA

Join Date: Sep 2009
Posts: 401

indeed. do you think that one would have to engage in a massive pairwise BLAST session between assemblies in order to match them up?

Maybe, for that reason, it would be easiest to pool all reads into a massive FASTQ and run them through ABySS at once to get a master list of transcripts and then perform quantification through other means.
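
A minimal sketch of the pooling idea in Python, assuming plain four-line FASTQ records (no wrapped sequence lines). Each read name is prefixed with its sample label so a read can still be traced back to its genotype after pooling. The function name and the sample labels are hypothetical.

```python
import io


def pool_fastq(samples, out_handle):
    """Write all reads from several FASTQ handles to one handle.

    samples maps a sample label to an open text handle.
    Returns the number of records written.
    """
    written = 0
    for label, handle in samples.items():
        while True:
            header = handle.readline()
            if not header:
                break  # end of this sample's file
            if not header.strip():
                continue  # tolerate stray blank lines
            if not header.startswith("@"):
                raise ValueError(f"unexpected FASTQ header: {header!r}")
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # the '+' separator line
            quality = handle.readline().rstrip("\n")
            name = header[1:].rstrip("\n")
            out_handle.write(f"@{label}:{name}\n{sequence}\n+\n{quality}\n")
            written += 1
    return written


# In-memory stand-ins for two per-genotype files.
geno_a = io.StringIO("@r1\nACGT\n+\nIIII\n")
geno_b = io.StringIO("@r1\nTTTT\n+\nIIII\n")
pooled = io.StringIO()
print(pool_fastq({"genoA": geno_a, "genoB": geno_b}, pooled))
print(pooled.getvalue(), end="")
```

With real data the handles would come from `open(...)`; the pooled file goes to the assembler, while the per-sample files are kept for the later quantification step.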
05-03-2012, 02:40 PM   #6
jujubix
Member
 
Location: Vancouver

Join Date: May 2011
Posts: 14

Yeah, at this point building a common reference via assembly is looking mighty tempting. This is, of course, assuming cerebralrust has the major computer power to run everything.
05-03-2012, 02:42 PM   #7
sdriscoll
I like code
 
Location: San Diego, CA, USA

Join Date: Sep 2009
Posts: 401

yeah. i'd be a little nervous to try it myself. but that's why i have more than one computer.
05-04-2012, 03:57 AM   #8
cerebralrust
Junior Member
 
Location: Sweden

Join Date: Jan 2012
Posts: 8

Hello members. Thank you for your valuable advice.

I've run assemblies on my data using Trinity, Newbler, MIRA and Velvet on my HP laptop, which has 4 GB of RAM and an i3 processor. That was about 800k reads, with both genotypes pooled together.
No, I have not tried M. truncatula, phoss. I will, thanks!
I've pooled all the reads and assembled them using MIRA + CAP3. Trinity, although a really good assembler, is quite bad for plant genomes (poor annotation, poor N50, etc.).
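
For readers comparing assemblies the same way, N50 is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal Python sketch, assuming the contig lengths have already been extracted from the assembler's FASTA output (the function name is illustrative):

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0


# Five toy contigs totalling 1500 bases: 500 + 400 = 900 passes the halfway mark.
print(n50([100, 200, 300, 400, 500]))  # 400
```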
Yes, it is a transcriptome. Now I suppose I will map the reads back to this 'reference', quantify, and continue with the analyses.
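
A minimal sketch of what a first look at the quantified counts could involve, in Python: scale each genotype's per-contig counts to counts per million, then take a log2 ratio with a pseudocount. The function names, the pseudocount of 1 and the toy counts are assumptions for illustration. This is only an exploratory summary and not a substitute for a count-based test such as DESeq, which was suggested earlier in the thread.

```python
import math


def counts_per_million(counts):
    """Scale raw per-contig read counts to counts per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        return {contig: 0.0 for contig in counts}
    return {contig: n * 1e6 / total for contig, n in counts.items()}


def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """log2(CPM in B / CPM in A) per contig; the pseudocount avoids log(0)."""
    cpm_a = counts_per_million(counts_a)
    cpm_b = counts_per_million(counts_b)
    contigs = sorted(set(counts_a) | set(counts_b))
    return {
        contig: math.log2(
            (cpm_b.get(contig, 0.0) + pseudocount)
            / (cpm_a.get(contig, 0.0) + pseudocount)
        )
        for contig in contigs
    }


# Toy counts for two genotypes (hypothetical contig names).
genotype_a = {"contig1": 500000, "contig2": 500000}
genotype_b = {"contig1": 250000, "contig2": 750000}
for contig, lfc in log2_fold_changes(genotype_a, genotype_b).items():
    print(contig, round(lfc, 2))
```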

Thanks for the ABySS suggestion and paper, jujubix & sdriscoll. I will try it out.

Tags
de analysis, no reference, plant

All times are GMT -8.

