SEQanswers

Old 06-23-2010, 10:14 AM   #21
gpertea
Member
 
Location: Maryland, US

Join Date: Jan 2010
Posts: 21
Default

Indeed, the manual hasn't yet been updated to reflect the fact that two extra fields were added there, so the format is now like this:
Code:
qJ:<gene_id>|<transcript_id>|<FMI>|<FPKM>|<conf_lo>|<conf_hi>|<cov>|<len>
..where the added fields are:
  • <cov>: the estimated average depth of read coverage across the transfrag
  • <len>: the length of the transfrag
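For anyone reading this field in a script, here is a minimal sketch of splitting it into its eight parts. It assumes only the field layout given above; the helper name `parse_qfield` and the sample values are made up for illustration and are not taken from real Cuffcompare output.

```shell
# Split one "qJ:..." field into its "|"-separated parts.
# parse_qfield is a hypothetical helper, not part of Cufflinks.
parse_qfield() {
  printf '%s\n' "$1" | awk '{
    sub(/^[^:]*:/, "")        # drop the leading "qJ:" sample prefix
    n = split($0, f, "|")     # gene_id, transcript_id, FMI, FPKM, conf_lo, conf_hi, cov, len
    if (n != 8) { print "unexpected number of fields: " n; exit 1 }
    printf "transcript=%s FPKM=%s cov=%s len=%s\n", f[2], f[4], f[7], f[8]
  }'
}

# Made-up example values:
parse_qfield 'q1:GENE.1|TRANS.1.1|100|12.5|10.0|15.0|8.3|1520'
```

The field count check is there because older output, written before the two fields were added, would presumably have only six parts.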
Old 06-23-2010, 10:40 AM   #22
elisa*_*
Junior Member
 
Location: CT

Join Date: Aug 2008
Posts: 8
Default

gpertea, thank you for your prompt reply!
Old 06-29-2010, 11:07 AM   #23
GeneSeeker
Junior Member
 
Location: Fl

Join Date: Jun 2010
Posts: 1
Smile visualization of Cuffcompare class codes

Hi,
Some of the class code descriptions are a little difficult to interpret.

I have made a visualization of the transfrags that fall into each class, based on my interpretation, and attached it to this post.

Am I interpreting the descriptions correctly? Thanks in advance.
Attached Images
File Type: jpg Cuffcompare_output.jpg (90.6 KB, 193 views)
Old 06-29-2010, 04:08 PM   #24
thinkRNA
Member
 
Location: Carlsbad,CA

Join Date: Jan 2010
Posts: 94
Default

Quote:
Originally Posted by gpertea
Indeed, the manual hasn't yet been updated to reflect the fact that two extra fields were added there, so the format is now like this:
Code:
qJ:<gene_id>|<transcript_id>|<FMI>|<FPKM>|<conf_lo>|<conf_hi>|<cov>|<len>
..where the added fields are:
  • <cov>: the estimated average depth of read coverage across the transfrag
  • <len>: the length of the transfrag
Hi gpertea, how is coverage calculated in this file? Can you please tell me the formula used? For the "Estimated average depth of coverage across the transcript", how do you determine which transcript is meant, or is this local to the transfrag? I am very confused. Also, is a transfrag the same as the "fragments" in FPKM? How can I determine how many million reads were mapped for each experiment, and how are multireads handled in this case?

Finally, for single-end reads, what makes a fragment? I understand the definition of fragments in paired-end reads.

One last thing: sometimes Cufflinks will show disconnected parts of a transcript even though there are reads across the entire gene. Why is this so? Could it be because coverage is too low in other parts of the transcript?

Thanks so much. I really hope that you will reply to my queries, because my data is not making sense to me.
Old 10-04-2011, 12:03 PM   #25
brdido
Member
 
Location: Sao Paulo, Brazil

Join Date: Apr 2011
Posts: 17
Default (=) Is "perfect match" or "Complete match of intron chain"

Hi guys,

We're trying hard to understand the definition of "intron chain" in the class code description for "=".

As stated in your tables, does "Complete match of intron chain" mean "perfect match of a transcript"?

Thanks in advance (should I study more English? :P)
Old 10-04-2011, 12:07 PM   #26
wenhuang
Member
 
Location: Raleigh, NC

Join Date: Feb 2010
Posts: 30
Default

I think that "intron chain" essentially means ignoring the 5' end of the first exon and the 3' end of the last exon. In other words, all the introns are recovered, which does not necessarily mean that both transcript ends are recovered.

Quote:
Originally Posted by brdido
Hi guys,

We're trying hard to understand the definition of "intron chain" in the class code description for "=".

As stated in your tables, does "Complete match of intron chain" mean "perfect match of a transcript"?

Thanks in advance (should I study more English? :P)
Old 10-04-2011, 12:14 PM   #27
gpertea
Member
 
Location: Maryland, US

Join Date: Jan 2010
Posts: 21
Default

wenhuang is correct. The intron coordinates must all match, which means that all the internal exons also match, and only the start coordinate of the first exon and the end coordinate of the last exon are allowed to differ from those of the reference transcript.
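The rule described here can be sketched in a few lines of shell. This is only an illustration of the idea, not Cuffcompare's actual implementation: it assumes 1-based inclusive exon coordinates, exons listed in ascending order as "start-end" pairs separated by commas, and both transcripts on the same strand and chromosome. The helper names are hypothetical.

```shell
# intron_chain EXONS: print the introns implied by a sorted exon list,
# e.g. "100-200,300-400" gives "201-299;".
intron_chain() {
  printf '%s\n' "$1" | awk -F',' '{
    chain = ""
    for (i = 1; i < NF; i++) {
      split($i, a, "-")          # current exon: a[1]=start, a[2]=end
      split($(i + 1), b, "-")    # next exon:    b[1]=start, b[2]=end
      chain = chain "" (a[2] + 1) "-" (b[1] - 1) ";"
    }
    print chain
  }'
}

# same_intron_chain A B: succeed when both exon lists imply identical introns.
# Single-exon transcripts have an empty chain and are not handled meaningfully.
same_intron_chain() {
  [ "$(intron_chain "$1")" = "$(intron_chain "$2")" ]
}

# The first and last exon boundaries differ, but the introns are the same:
same_intron_chain "100-200,300-400,500-600" "150-200,300-400,500-650" && echo "intron chains match"
```

Because only the gaps between consecutive exons are compared, the start of the first exon and the end of the last exon never enter the comparison.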
Old 10-04-2011, 12:18 PM   #28
brdido
Member
 
Location: Sao Paulo, Brazil

Join Date: Apr 2011
Posts: 17
Default

OK! I think I got it!
Thanks!
Old 01-20-2012, 07:44 AM   #29
jtrivino
Junior Member
 
Location: Valencia

Join Date: Feb 2011
Posts: 4
Default

Hi all

In a transcriptomics analysis with Cufflinks and Cuffcompare, I want to filter out noise, that is, eliminate the "suspicious" transfrags. I think the class code could be a good parameter for this selection. Which class codes are best for filtering transfrags?


Thanks!!!
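Whichever codes turn out to be appropriate, the filtering step itself could look like the sketch below. It assumes a tab-separated .tmap file with a header row and the class code in column 3 (the same column read with `cut -f3` elsewhere in this thread). The helper name `filter_tmap` and the example code list are hypothetical and are not a recommendation of which classes to keep.

```shell
# filter_tmap FILE CODES: keep the header line plus every row whose class
# code (column 3) appears in the comma-separated CODES list.
filter_tmap() {
  awk -F'\t' -v keep="$2" '
    BEGIN { n = split(keep, k, ","); for (i = 1; i <= n; i++) want[k[i]] = 1 }
    NR == 1 || ($3 in want)
  ' "$1"
}

# Example with an arbitrary choice of codes:
# filter_tmap cuffcompare_out.transcripts.gtf.tmap '=,j' > filtered.tmap
```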
Old 09-19-2012, 07:14 PM   #30
upendra_35
Senior Member
 
Location: USA

Join Date: Apr 2010
Posts: 102
Default cuffcompare

Hi gpertea,

I have a couple of questions regarding cufflinks/cuffcompare:

1. I found strange results when I compared the Cuffcompare output for Cufflinks run with annotation against the Cuffcompare output for Cufflinks run without annotation. Here are the results:

#with annotation:
[upendra_35@vm142-17 Denovo_stuff]$ cut -f3 cufflinks_out/cuffcompare_out.transcripts.gtf.tmap |sort|uniq -c
41003 =
7 c
1 class_code

# Without annotation:
[upendra_35@vm142-17 Denovo_stuff]$ cut -f3 cufflinks_out_no_annot/cuffcompare_out_no_annot.transcripts.gtf.tmap |sort|uniq -c
11935 =
6397 c
1 class_code
5014 e
562 i
16519 j
7226 o
1844 p
51 s
8169 u
624 x

Why is it that we get essentially only one class when Cufflinks is run with annotation? I have already checked the annotation file and the transcripts.gtf file, and the chromosome names match. I believe the results from Cufflinks without annotation might be the true ones. Right?

2. The result above is based on 3 lanes of Illumina data. Do you think we can increase the percentages of the interesting classes (o and u) if we include more data?

Thanks in advance......
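Since the second question is about percentages, a variant of the `cut | sort | uniq -c` pipeline above that skips the header row and prints each class as a share of the total may be handy. This is a sketch that assumes the same tab-separated .tmap layout with the class code in column 3; `class_code_summary` is a hypothetical helper name.

```shell
# class_code_summary FILE: print "code<TAB>count<TAB>percent" per class code.
class_code_summary() {
  awk -F'\t' '
    NR > 1 { n[$3]++; total++ }     # skip the header row, tally column 3
    END {
      for (c in n) printf "%s\t%d\t%.1f%%\n", c, n[c], 100 * n[c] / total
    }
  ' "$1" | sort
}

# Example, using a file name from this post:
# class_code_summary cufflinks_out/cuffcompare_out.transcripts.gtf.tmap
```

Skipping the header also avoids the stray "1 class_code" line that appears in the counts above.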
Old 02-26-2013, 12:37 PM   #31
myRNA_Seq
Member
 
Location: Canda

Join Date: Feb 2013
Posts: 18
Default cufflinks warning

I downloaded the latest version of the Cufflinks binaries and added them to my PATH as follows:
export PATH=$PATH:~/cufflinks-2.0.2.Linux_x86_64/cufflinks
export PATH=$PATH:~/cufflinks-2.0.2.Linux_x86_64/cuffdiff
export PATH=$PATH:~/cufflinks-2.0.2.Linux_x86_64/cuffcompare

When I tried to run cuffcompare, it gave me a warning:

Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.0.2 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).

and no *.tmap file was generated.

Is there a problem with my PATH setting?

Thanks a lot

Last edited by myRNA_Seq; 02-26-2013 at 12:51 PM.
Old 02-26-2013, 01:19 PM   #32
brdido
Member
 
Location: Sao Paulo, Brazil

Join Date: Apr 2011
Posts: 17
Default

myRNA_Seq, I would check whether there is another copy of cufflinks in one of the other directories on your PATH,

e.g. /usr/bin or /usr/local/bin
Old 02-27-2013, 06:36 AM   #33
myRNA_Seq
Member
 
Location: Canda

Join Date: Feb 2013
Posts: 18
Default

Thank you Brdido. I have just checked. There isn't a copy of cufflinks in other PATHs.
Old 02-27-2013, 08:03 AM   #34
brdido
Member
 
Location: Sao Paulo, Brazil

Join Date: Apr 2011
Posts: 17
Default

Have you tried to run cuffcompare using its full path?

~/cufflinks-2.0.2.Linux_x86_64/cuffcompare

If the message doesn't appear when you run it as in the example above, it means that cuffcompare is healthy! I would suggest searching the whole computer for other copies of cuffcompare (using find).

And remember that a PATH set in your shell startup file is only updated when you start a new terminal.

Another thing to try is to see whether an older cuffcompare is in your PATH:

echo $PATH

Any clues from here?

Good luck! But this is more of a Linux question than a Cufflinks question; maybe Linux gurus can help you better than I can...
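The checks suggested above can be sketched as follows. One general point about the shell: PATH entries are directories, so if cufflinks, cuffdiff and cuffcompare inside the downloaded folder are the executables themselves, it is the folder that belongs in PATH rather than the three file paths. The directory name below is the one used in this thread, and `list_on_path` is a hypothetical helper, not part of Cufflinks.

```shell
# CUFF_DIR is the install directory mentioned in this thread; adjust to your own.
CUFF_DIR="$HOME/cufflinks-2.0.2.Linux_x86_64"

# Add the directory itself (not the individual executables), and put it
# first so that it takes precedence over any older copy.
export PATH="$CUFF_DIR:$PATH"

# list_on_path NAME: print every executable called NAME found on PATH,
# in the order the shell would look them up.
list_on_path() {
  old_ifs=$IFS
  IFS=:
  for dir in $PATH; do
    if [ -x "$dir/$1" ] && [ ! -d "$dir/$1" ]; then
      printf '%s\n' "$dir/$1"
    fi
  done
  IFS=$old_ifs
}

list_on_path cuffcompare
# Show the copy the shell will actually run:
command -v cuffcompare || echo "cuffcompare not found on PATH"
```

If `list_on_path` prints more than one line, the first one is the copy that runs when cuffcompare is called without a full path.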
Old 02-27-2013, 08:26 AM   #35
myRNA_Seq
Member
 
Location: Canda

Join Date: Feb 2013
Posts: 18
Default

Thank you, brdido. I have tried the full path. It now runs without the warning, and I got all the output files except the *.refmap and *.tmap files. Do you know why?

I followed the instructions from the paper "Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks" (http://www.nature.com/nprot/journal/....2012.016.html) and used the exact files from the examples.

$ find . -name transcripts.gtf > gtf_out_list.txt
$ ~/cufflinks-2.0.2.Linux_x86_64/cuffcompare - i gtf_out_list.txt -r gene.gtf
Old 02-27-2013, 09:31 AM   #36
brdido
Member
 
Location: Sao Paulo, Brazil

Join Date: Apr 2011
Posts: 17
Default

Have you checked whether the chromosome names are identical in transcripts.gtf and gene.gtf?

The names are case sensitive, and in some databases the ID is just the number of the chromosome. For instance, chromosome 1:

hg19 : chr1
b37: 1
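One way to run this check is to list the sequence names that appear in one file but not in the other. The sketch below assumes standard tab-separated GTF files with the sequence name in column 1 and comment lines starting with "#"; the helper names are hypothetical.

```shell
# seqnames FILE: unique sequence names (column 1) of a GTF, skipping comments.
seqnames() {
  grep -v '^#' "$1" | cut -f1 | sort -u
}

# only_in_first A B: names present in A but absent from B.
only_in_first() {
  a=$(mktemp)
  b=$(mktemp)
  seqnames "$1" > "$a"
  seqnames "$2" > "$b"
  comm -23 "$a" "$b"      # lines unique to the first sorted list
  rm -f "$a" "$b"
}

# Example, using the file names from this thread:
# only_in_first transcripts.gtf gene.gtf
# only_in_first gene.gtf transcripts.gtf
```

Empty output in both directions means the two files use exactly the same set of names; a name such as chr1 on one side and 1 on the other points to the mismatch described above.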
Old 02-27-2013, 11:08 AM   #37
myRNA_Seq
Member
 
Location: Canda

Join Date: Feb 2013
Posts: 18
Default

No. Not all the chromosome names are identical between transcripts.gtf and gene.gtf.
Old 02-28-2013, 11:11 AM   #38
brdido
Member
 
Location: Sao Paulo, Brazil

Join Date: Apr 2011
Posts: 17
Default

So, have you corrected it?

Did it work?
Old 02-28-2013, 11:17 AM   #39
myRNA_Seq
Member
 
Location: Canda

Join Date: Feb 2013
Posts: 18
Default

No, it still doesn't work.

I used the full path to run cuffcompare:
$ ~/cufflinks-2.0.2.Linux_x86_64/cuffcompare - i gtf_out_list.txt -r gene.gtf

There are no error or warning messages, but I still cannot get the *.refmap and *.tmap output files.
Old 03-01-2013, 01:05 PM   #40
myRNA_Seq
Member
 
Location: Canda

Join Date: Feb 2013
Posts: 18
Smile

Thank you very much, brdido. It works for me now. I re-ran part of the protocol to get the searchable index for each of the map files. The problem could have been that some of the programs were in the wrong path and an old version was being run from there.