SEQanswers

#21, 12-06-2010, 01:59 PM
RockChalkJayhawk (Senior Member, Rochester, MN)
Update on Cufflinks v0.9.3

Quote (originally posted by adarob):
By "newest version" do you mean 0.9.2? There was a small bug in that release which in some cases caused some of the alignments to be ignored. If it is not too much trouble, could you rerun with 0.9.3 (released yesterday)?

This figure was generated using the same protocol as above.
Using v0.9.3 really improved things. Now I have an FDR of 2.13% (427/20014), much better than the 30% I had previously. There is still a bit more variation than with the counting methods, but at least I can work with this.

Thanks for the updates!
Attached image: CuffDiff.v0.9.3.jpeg (8.0 KB)

#22, 02-07-2011, 08:31 AM
lahoman (Member, Houston)
How did you count the reads from the assembly file?

Hi, xinchen,

Once I get the SAM file from TopHat, how should we count the reads for each gene? I am new to RNA-Seq analysis, so my question might be naive.

Thank you so much,

Lahoman

#23, 02-08-2011, 01:23 AM
nkwuji (Member, Dublin)

Hi, Lahoman,

Here is the Cufflinks manual:
http://cufflinks.cbcb.umd.edu/manual.html

It is best to read through it. I thought there might be some shortcuts or easy ways to use it, but later on I found it really necessary to read all the instructions carefully and to make sure you understand the parameters you should use.
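
For the read-counting question above, one rough way to get per-gene counts from a SAM file is sketched below. This is only an illustration under stated assumptions: the gene intervals are made up, the function name `count_reads_per_gene` is hypothetical, and a read is assigned to a gene purely by its leftmost aligned position, which ignores strand, spliced blocks, and overlapping genes. A dedicated counting tool such as htseq-count handles those cases properly.

```python
from collections import Counter

def count_reads_per_gene(sam_lines, genes):
    """Count aligned reads per gene by alignment start position.

    genes: dict mapping a gene name to (chrom, start, end), 1-based inclusive.
    These intervals are hypothetical; real ones would come from an annotation.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):      # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, chrom, pos = int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4:                # flag bit 0x4 means the read is unmapped
            continue
        for name, (g_chrom, g_start, g_end) in genes.items():
            if chrom == g_chrom and g_start <= pos <= g_end:
                counts[name] += 1
    return counts

# Toy input: made-up gene intervals and minimal SAM records.
genes = {"geneA": ("chr1", 14000, 16000), "geneB": ("chr1", 20000, 21000)}
sam = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t14790\t255\t40M140N35M\t*\t0\t0\t*\t*",
    "r2\t16\tchr1\t20500\t255\t75M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
print(dict(count_reads_per_gene(sam, genes)))
```

In practice you would pass an open file handle for accepted_hits.sam instead of the toy list.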

Cheers,
Jun

#24, 02-08-2011, 11:47 AM
lahoman (Member, Houston)
Cufflinks results filtering?

Hi, RockChalkJayhawk,

After using TopHat to map human RNA-Seq reads to the genome and then Cufflinks for the transcript analysis, I checked the transcripts.expr file. There are 263,506 transcripts. That's a lot. Do I need to filter the results based on FPKM? What cutoff should I use, 1.0 or 0.5? I have no idea.

Thanks,

Lahoman

#25, 02-08-2011, 12:24 PM
RockChalkJayhawk (Senior Member, Rochester, MN)

Unfortunately, there is no right answer. The choice of FPKM cutoff is subjective, so just use good judgement.
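
Since the cutoff is a judgment call, it can help to see how many transcripts survive at each candidate threshold before committing to one. A minimal sketch follows, assuming the expression file is tab-delimited with a header row that includes an `FPKM` column. That layout, the `trans_id` column name, and the function name `filter_by_fpkm` are assumptions, so check the header of your own file first.

```python
import csv
import io

def filter_by_fpkm(handle, min_fpkm=1.0, column="FPKM"):
    """Return rows whose FPKM is at least min_fpkm.

    Assumes a tab-delimited table with a header row and an FPKM column;
    check the header of your own transcripts.expr before relying on this.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if float(row[column]) >= min_fpkm]

# Toy table with made-up transcript IDs and values.
toy = io.StringIO(
    "trans_id\tFPKM\n"
    "CUFF.1.1\t0.2\n"
    "CUFF.2.1\t0.7\n"
    "CUFF.3.1\t5.4\n"
)

# Compare how many transcripts each candidate cutoff keeps.
for cutoff in (0.5, 1.0):
    toy.seek(0)
    kept = filter_by_fpkm(toy, min_fpkm=cutoff)
    print(cutoff, len(kept), [row["trans_id"] for row in kept])
```

Comparing the counts at 0.5 and 1.0 on the real file shows how sensitive the transcript list is to the choice.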

#26, 02-09-2011, 04:58 AM
chrisbala (Member, North Carolina)

I think this issue of excessive Cufflinks-predicted transcripts is one many people are wrestling with. I have not figured out a satisfactory way to deal with it other than using ONLY existing Ensembl gene models (boo!). My interpretation of the problem is that pre-mRNA reads (and perhaps other forms of "background") give Cufflinks trouble (understandably). I don't know whether using paired-end data (which I did not do) and hundreds of millions of reads (as the Cufflinks paper did), as opposed to tens of millions of reads, would help. If others have thoughts I would love to hear them.

#27, 02-10-2011, 04:15 AM
lpachter (Member, Berkeley, CA)

The reason for excessive transcripts is that in genes with low coverage, Cufflinks will likely predict multiple disjoint transcripts. That is in fact the only thing it can do, because it is not a gene finder and has no reason to glue them together (it's also true that "background" causes problems and makes assembly difficult).
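
As a toy illustration of the low-coverage effect (this is not Cufflinks' assembly algorithm, just interval merging on made-up read positions, with a hypothetical function name `covered_blocks`): when reads overlap end to end they merge into one contiguous block, but a gap in coverage leaves separate blocks with nothing in the data to connect them.

```python
def covered_blocks(reads):
    """Merge overlapping or adjacent read intervals into contiguous blocks.

    reads: list of (start, end) pairs, 1-based inclusive, on one gene.
    """
    blocks = []
    for start, end in sorted(reads):
        if blocks and start <= blocks[-1][1] + 1:
            blocks[-1][1] = max(blocks[-1][1], end)   # extend the current block
        else:
            blocks.append([start, end])               # gap: start a new block
    return [tuple(block) for block in blocks]

# Made-up read positions over a 1 kb region of one gene.
high_coverage = [(1, 300), (200, 500), (450, 800), (700, 1000)]
low_coverage = [(1, 300), (700, 1000)]

print(covered_blocks(high_coverage))  # one block covering the whole region
print(covered_blocks(low_coverage))   # two disjoint blocks
```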

To address these issues, and also the fact that many users want to discover novel isoforms with respect to a known annotation (e.g. ENSEMBL), we have developed a new reference-annotation-guided assembly mode that takes your annotation as input and assembles with respect to it. It results in much cleaner assemblies by merging partial transcript fragments into the known annotations, yet is still able to reveal novel isoforms. It sounds like it addresses the issue (and desired feature) you are reporting.

It will be released in the next week or two as soon as we finish some more testing, so please check back with us before the end of the month.

#28, 02-10-2011, 06:46 AM
lahoman (Member, Houston)
The mapping position of an RNA-Seq read spanning a junction

Hi,
I checked the accepted_hits.sam file generated by TopHat. For one read of a pair, the mapping result is:

USI-EAS376_0001:3:112:7215:15873#0 129 chr1 14790 255 40M140N35M = 155253852 0 CCGGGCCCCTCACCAGCCCCAGGTCCTTTCCCAGAGATGCCCTTGCGCCTCATGACCAGCTTGTTGAAGAGATCC bbbbbabbbbbbbbbbbbbbbababbbbbbbb^bbbbcbbbb`cR`\aaY`\a`bbbcbb_\_^baa[`Wa]X_a NM:i:0 XS:A:- NS:i:0


The CIGAR string is 40M140N35M, so the read spans two exons. Position 14790 on chr1 is where the alignment to the first exon starts. How can we find the position of the second exon?

Thanks a lot,

Lahoman

#29, 02-10-2011, 06:58 AM
lahoman (Member, Houston)

Hi,

I just realized that the second exon is on the same chromosome. The distance between the first and second exons is 140 bp.
One more question: if an RNA-Seq read is spliced and its parts align to different chromosomes, TopHat won't report it. Am I right? In other words, TopHat can't be used for gene fusion analysis.
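
In SAM, POS is the 1-based leftmost mapped position, and an N operation in the CIGAR string marks a skipped reference region such as an intron. A minimal sketch (the function name `aligned_blocks` is made up for illustration) that walks the CIGAR string to recover both exon blocks for the record above:

```python
import re

def aligned_blocks(pos, cigar):
    """Return (start, end) reference blocks, 1-based inclusive, for one read.

    M, =, X and D consume the reference inside a block; N (skipped region,
    e.g. an intron) closes the current block and opens the next one.
    I, S, H and P do not consume the reference, so they are ignored here.
    """
    blocks = []
    ref = pos            # next reference position to be consumed
    block_start = pos
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op in "M=XD":
            ref += length
        elif op == "N":
            if ref > block_start:
                blocks.append((block_start, ref - 1))
            ref += length
            block_start = ref
    if ref > block_start:
        blocks.append((block_start, ref - 1))
    return blocks

print(aligned_blocks(14790, "40M140N35M"))
```

For the read above this gives 14790-14829 for the first block and 14970-15004 for the second, with the 140 bp skipped region in between.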

Thanks,

Lahoman