SEQanswers

Old 05-30-2011, 10:55 AM   #1
Kotoro
Member
 
Location: Farmington CT

Join Date: May 2011
Posts: 31
Wish to do differential analysis, complete newbie.

When it comes to finding/using established tools I am completely untrained.

We are starting with Illumina paired-end reads (we don't necessarily care whether the analysis is done with tools that support paired ends per se; the sequencing just happened to be done this way). The paired ends have already been separated into two files by the sequencer pipeline.

In an earlier post I asked how one would trim for quality, and I found a few tools to do that.

Now I'm supposed to perform an alignment and use the alignment information to get expression profiles. (Eventually we would like to use these expression profiles to compare different samples.)

It seems like Cufflinks should be able to do this sort of thing (I'm assuming it would work in partnership with Bowtie or TopHat). Are there any other pipelines, and how would I best find the sequence of commands that prepares the data and produces the mapping files I need to run these analyses?
Old 05-30-2011, 11:11 AM   #2
DZhang
Senior Member
 
Location: East Coast, US

Join Date: Jun 2010
Posts: 177

Hi Kotoro,

The online manuals for TopHat and Cufflinks should be sufficient for you to carry out the analysis pipeline.

Douglas
www.contigexpress.com
Old 05-31-2011, 08:17 AM   #3
Kotoro
Member
 
Location: Farmington CT

Join Date: May 2011
Posts: 31

What produces the GTF file that Cufflinks is looking for?
Old 06-09-2011, 09:49 AM   #4
vineeth_s
Junior Member
 
Location: Germany

Join Date: Jun 2011
Posts: 9

Hi Kotoro,

For the GTF file, one potential solution is to use the Table Browser on the UCSC Genome Browser website.

If you're working with human or mouse samples, you can choose the genes and gene prediction group in the "group" selection, then RefSeq Genes in the "track" selection and refGene as the "table".

Then, in the output format field, you'll be able to choose GTF, which should get you going.
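
As a quick sanity check on a downloaded file, the sketch below parses one GTF line and confirms it has nine tab-separated columns plus the `gene_id` and `transcript_id` attributes that the GTF format calls for. The sample record, its coordinates, and the function name `parse_gtf_line` are made up for illustration; none of this is part of Cufflinks or the Table Browser.

```python
import re

# Matches attribute pairs such as: gene_id "NM_EXAMPLE";
_ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"\s*;')

def parse_gtf_line(line):
    """Split one GTF record into its fixed columns plus an attribute dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    seqname, source, feature, start, end, score, strand, frame, attrs = fields
    return {
        "seqname": seqname,
        "feature": feature,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": dict(_ATTR_RE.findall(attrs)),
    }

# Hypothetical record; IDs and coordinates are placeholders, not real annotation.
example = ('chr1\trefGene\texon\t1000\t2000\t.\t+\t.\t'
           'gene_id "NM_EXAMPLE"; transcript_id "NM_EXAMPLE";')
record = parse_gtf_line(example)
missing = [k for k in ("gene_id", "transcript_id") if k not in record["attributes"]]
print(record["feature"], record["start"], record["end"], "missing:", missing)
```

Looping this over every non-comment line of the file would flag malformed records before they reach the downstream tools.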

For expression analysis, the best workflow would be to use TopHat followed by Cufflinks. You cannot use Bowtie -> Cufflinks directly, because Bowtie does not align reads across splice junctions.
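
A rough sketch of the TopHat -> Cufflinks order, written as Python that only builds the two command lines and prints them (it does not run anything). The index prefix, read files, GTF name, and output directories are all placeholders, and the options used here (`-o` for the output directory, `-G` for a reference annotation) should be checked against the manuals for whatever versions are installed.

```python
import shlex

def tophat_cmd(index_prefix, reads_1, reads_2, out_dir):
    """TopHat call for paired-end reads: index prefix, then the two mate files."""
    return ["tophat", "-o", out_dir, index_prefix, reads_1, reads_2]

def cufflinks_cmd(gtf, bam, out_dir):
    """Cufflinks call that quantifies against an existing annotation (-G)."""
    return ["cufflinks", "-o", out_dir, "-G", gtf, bam]

# All names below are hypothetical placeholders.
th_out = "frozen_tophat"
steps = [
    tophat_cmd("hg19", "frozen_1.fastq", "frozen_2.fastq", th_out),
    # TopHat writes its alignments to accepted_hits.bam inside its output directory.
    cufflinks_cmd("refGene.gtf", th_out + "/accepted_hits.bam", "frozen_cufflinks"),
]
for cmd in steps:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually execute the steps, each list could be handed to `subprocess.run(cmd, check=True)` once the tools are on the PATH.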

Vineeth
Old 07-07-2011, 10:09 AM   #5
Kotoro
Member
 
Location: Farmington CT

Join Date: May 2011
Posts: 31

Our current goal is to compare frozen and paraffin-embedded tissue samples in terms of their usefulness for sequencing and expression analysis. The hypothesis is that the two are similar enough that the preservation method does not significantly bias the resulting sequence data in a damaging or misleading manner.

As I am still a student early in my training, I am unfamiliar with the tools and methods required to test this hypothesis. Is the TopHat/Cufflinks combination even appropriate for this, or am I moving in the wrong direction?

--edit:

Forgot to mention it, but we are working with human normal/cancer tissues.

Last edited by Kotoro; 07-07-2011 at 10:13 AM.
Old 07-07-2011, 10:22 AM   #6
vineeth_s
Junior Member
 
Location: Germany

Join Date: Jun 2011
Posts: 9

Yes, TopHat/Cufflinks will serve you fine for this purpose.

Though if you want to be really sure (and I am guessing you do, otherwise your lab would not have gone in for NGS just to confirm this), I would use the more conservative DESeq.

What you can then do is map to the genome with Bowtie and have HTSeq do the counting for you; you can then give these counts to DESeq for differential expression analysis.
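
To give a feel for the Python side of that route, here is a minimal sketch that merges per-sample count files into one table of the kind a count-based tool would take as input. It assumes the two-column, tab-separated `feature<TAB>count` layout that htseq-count writes, with summary rows such as `no_feature` at the end. The sample names, gene names, and counts are invented, and `read_counts` and `merge_counts` are hypothetical helpers, not HTSeq functions.

```python
import csv
import io

# Summary rows htseq-count appends after the gene counts; newer versions
# prefix them with "__", older ones do not.
SPECIAL = {"no_feature", "ambiguous", "too_low_aQual",
           "not_aligned", "alignment_not_unique"}

def read_counts(handle):
    """Read 'feature<TAB>count' lines, skipping the summary rows."""
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 2:
            continue  # ignore blank or malformed lines
        name, value = row
        if name.lstrip("_") in SPECIAL:
            continue
        counts[name] = int(value)
    return counts

def merge_counts(per_sample):
    """Combine {sample: {gene: count}} into rows of [gene, count, count, ...]."""
    samples = sorted(per_sample)
    genes = sorted(set().union(*[set(c) for c in per_sample.values()]))
    table = [["gene"] + samples]
    for gene in genes:
        table.append([gene] + [per_sample[s].get(gene, 0) for s in samples])
    return table

# Made-up counts standing in for two htseq-count output files.
frozen = io.StringIO("GENE_A\t120\nGENE_B\t0\n__no_feature\t15\n")
ffpe = io.StringIO("GENE_A\t95\nGENE_B\t3\n__no_feature\t40\n")
table = merge_counts({"frozen": read_counts(frozen), "ffpe": read_counts(ffpe)})
for row in table:
    print("\t".join(str(x) for x in row))
```

With real data the `io.StringIO` objects would be replaced by opened files, and the printed table could be saved and read into R.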

You will need some basic knowledge of Python and R to do this, so if you are not familiar with these two, it is better to stick to TopHat/Cufflinks.
Old 07-07-2011, 10:38 AM   #7
Kotoro
Member
 
Location: Farmington CT

Join Date: May 2011
Posts: 31

Python and R are languages I've been meaning to learn but haven't had the chance to yet.

(I know C/C++, Java, Perl and batch languages (like Windows batch files and Bash scripting), so it's not outside my capability to understand new languages. I am just not intimately familiar with Python and R yet.)
Old 07-07-2011, 10:41 AM   #8
Kotoro
Member
 
Location: Farmington CT

Join Date: May 2011
Posts: 31

The manuals for the Tuxedo suite are fairly specific about usage, and they do tell you the output format, but they're not particularly informative about how to interpret the output.

What exactly does the output tell me other than which differences exist (as in cuffcompare and cuffdiff)? I'm not sure what to do with these results to get a meaningful comparison and declare whether the samples are similar enough or too different. How can I relate the output to the overall hypothesis?
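
One possible starting point for the question above, offered only as a sketch and not as an answer from the thread: take per-gene expression values for a frozen and a paraffin-embedded sample of the same tissue and measure how well they agree. The example computes a Pearson correlation on log2(FPKM + 1) values. The FPKM numbers are invented, `log2_fpkm` and `pearson` are hypothetical helper names, and what counts as "similar enough" is a judgment this code does not make.

```python
import math

def log2_fpkm(values):
    """Log-transform FPKM values; the +1 keeps zero-expression genes finite."""
    return [math.log2(v + 1.0) for v in values]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Made-up FPKM values for the same five genes in two samples.
frozen_fpkm = [0.0, 3.0, 15.0, 63.0, 255.0]
ffpe_fpkm = [0.0, 1.0, 7.0, 31.0, 127.0]
r = pearson(log2_fpkm(frozen_fpkm), log2_fpkm(ffpe_fpkm))
print("Pearson r on log2(FPKM + 1): %.3f" % r)
```

A correlation alone can hide systematic shifts, so it would make sense to read it next to the list of genes the differential test reports between the two preservation methods.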
All times are GMT -8.