SEQanswers

Old 06-30-2015, 09:17 PM   #1
Blaze9
Junior Member
 
Location: NJ

Join Date: Feb 2013
Posts: 8
Need verification of my Differential Expression pipeline

This is all going to be in pseudocode/explanations, but can someone verify my pipeline, or let me know if there's a better way of doing any of the steps?

The project has 10 samples of an organism without a reference genome: 5 male (1 control, plus 2 experimental conditions with a replicate each) and the same for female. We are assembling all 10 samples de novo (Trinity, Oases, Bridger, etc.).

After we assemble the samples, we want to create a reference to use for differential expression. We will merge the 10 assemblies, run CD-HIT-EST to remove redundancy, and then annotate the resulting FASTA. We plan to run blastx and save the output as XML, import the XML into Blast2GO, remove all the non-annotated transcripts, and export the annotated FASTA. We will use this FASTA as our reference for mapping.
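To make the "remove all the non-annotated transcripts" step concrete, here is a minimal Python sketch of filtering a FASTA down to a set of annotated IDs. It assumes the annotated IDs have already been exported (for example from Blast2GO) into a plain set of strings, and that the ID is the first word of each FASTA header. `read_fasta` and `keep_annotated` are hypothetical helper names, not part of any tool mentioned here.

```python
from typing import Iterable, Iterator, Set, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_annotated(
    records: Iterable[Tuple[str, str]], annotated_ids: Set[str]
) -> Iterator[Tuple[str, str]]:
    """Keep records whose ID (assumed: first word of the header) is annotated."""
    for header, seq in records:
        words = header.split()
        seq_id = words[0] if words else ""
        if seq_id in annotated_ids:
            yield header, seq
```

In practice the lines would come from an open file handle on the CD-HIT-EST output, and the kept records would be written back out as the mapping reference.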

We take the above annotated FASTA and map the raw reads back to it using bowtie2 or BWA, generating SAM files. Then we use samtools to produce sorted BAMs.
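A rough sketch of the mapping step as command lines, built in Python so they can be inspected before anything is run. The file names and the helper names `index_command` and `sample_commands` are placeholders. It assumes paired-end reads, bowtie2 as the aligner, and a samtools 1.x release that accepts `sort -o`.

```python
from typing import List


def index_command(reference_fasta: str, index_base: str) -> List[str]:
    """Command to build the bowtie2 index once for the shared reference."""
    return ["bowtie2-build", reference_fasta, index_base]


def sample_commands(
    index_base: str, reads_1: str, reads_2: str, sample: str
) -> List[List[str]]:
    """Commands to align one paired-end sample and sort the result."""
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    return [
        # Align the raw reads against the reference index, writing SAM.
        ["bowtie2", "-x", index_base, "-1", reads_1, "-2", reads_2, "-S", sam],
        # Coordinate-sort the alignments into a BAM file.
        ["samtools", "sort", "-o", bam, sam],
    ]
```

Each list could then be passed to `subprocess.run(cmd, check=True)`, once for the index and once per sample.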

Convert our annotated reference to GFF3, and use htseq-count to get per-transcript read counts. Then run DESeq to get our DE genes.
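Since the reference here is a set of transcripts rather than a genome, one way the GFF3 conversion might look is a single feature per transcript spanning its full length. The sketch below makes that assumption. The `source` value, the `exon` feature type, the `+` strand, and the `ID` attribute are all illustrative choices, and htseq-count's `--type` and `--idattr` options would need to be set to match whatever is used.

```python
from typing import Iterable, List, Tuple


def transcripts_to_gff3(
    lengths: Iterable[Tuple[str, int]],
    source: str = "denovo",
    feature: str = "exon",
) -> List[str]:
    """Build GFF3 lines with one full-length feature per transcript."""
    lines = ["##gff-version 3"]
    for seq_id, length in lengths:
        columns = [
            seq_id,        # seqid: the transcript itself
            source,        # source: free-text label (assumed)
            feature,       # type: feature type to count on (assumed)
            "1",           # start (GFF3 is 1-based, inclusive)
            str(length),   # end
            ".",           # score
            "+",           # strand, relative to the transcript (assumed)
            ".",           # phase
            f"ID={seq_id}",
        ]
        lines.append("\t".join(columns))
    return lines
```

The `(seq_id, length)` pairs would come from the annotated FASTA, for example by taking the length of each sequence as it is read.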

Does this sound like a good plan?

We're currently at the "reference transcript" stage, and we will be submitting the reference to our local blast cluster in the next few days. I just want to verify that what I'm thinking is correct, or if there's something else I should be doing.

Thank you!
Old 06-30-2015, 11:01 PM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Are you assembling the genome or transcriptome?
Old 06-30-2015, 11:12 PM   #3
Blaze9
Junior Member
 
Location: NJ

Join Date: Feb 2013
Posts: 8

It's Illumina RNA-seq data, so it'll be a transcriptome.

My organism has no reference genome at all, so we're doing de novo assembly via Bridger.

Last edited by Blaze9; 06-30-2015 at 11:20 PM.
Old 07-01-2015, 08:38 AM   #4
fanli
Senior Member
 
Location: California

Join Date: Jul 2014
Posts: 198

I'd evaluate how many non-annotated transcripts get removed after Blast2GO.
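As a hedged sketch of what that check could look like, the function below compares the full set of transcript IDs against the annotated subset and reports how many are dropped. `annotation_summary` is a hypothetical helper name, and both ID collections are assumed to have been extracted already (for example from the FASTA headers before and after filtering).

```python
from typing import Dict, Iterable, Union


def annotation_summary(
    all_ids: Iterable[str], annotated_ids: Iterable[str]
) -> Dict[str, Union[int, float]]:
    """Summarise how many transcripts survive annotation filtering."""
    total = set(all_ids)
    # Only count annotated IDs that are actually present in the assembly.
    kept = total & set(annotated_ids)
    removed = total - kept
    fraction = len(removed) / len(total) if total else 0.0
    return {
        "total": len(total),
        "kept": len(kept),
        "removed": len(removed),
        "fraction_removed": fraction,
    }
```

If a large fraction is removed, a fair share of reads may have no annotated transcript to map to, which is worth knowing before the counting step.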
Old 07-01-2015, 10:33 AM   #5
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

I suggest merging all of the reads prior to assembling, so you just get one assembly. That should give a better and less-redundant assembly compared to assembling 10x and deduplicating the results.
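A minimal sketch of the merging step, assuming the reads are in per-sample FASTQ files that can simply be concatenated. `concatenate_files` is a hypothetical helper name and the paths are placeholders. For paired-end data, the R1 files and the R2 files would each be concatenated in the same sample order so that pairs stay in sync.

```python
import shutil
from typing import Iterable


def concatenate_files(inputs: Iterable[str], output: str) -> None:
    """Append each input file to the output file, in the order given."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                # Stream in chunks so large read files are not loaded into memory.
                shutil.copyfileobj(src, out)
```

The combined file (or pair of files) would then be given to the assembler in place of the individual samples.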
Old 07-01-2015, 10:54 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,087

Would it be better to assemble male/female data separately?
Old 07-01-2015, 11:02 AM   #7
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

I wouldn't... as long as they're all the same organism, it's easiest to assemble them all together both in terms of recovering low-expression genes and avoiding redundancy, which really is hard to remove without loss of real information.

The ideal method of assembly (combining first or not combining first) may vary, though, depending on the ploidy and SNP rate. High-SNP-rate haploids, for example, might be better assembled individually.