SEQanswers

Old 05-23-2010, 08:19 PM   #1
Robin
Member
 
Location: US

Join Date: Nov 2009
Posts: 10
cuffcompare warning message

This is the first time I have used your Cufflinks software, and I don't understand some of the warning messages from the cuffcompare command line. I am using the latest version, cufflinks-0.8.2.Linux_x86_64.
I downloaded the reference annotation GTF files (human Ensembl and RefSeq) from the UCSC Table Browser.
1) UCSC human ensembl GTF file:
chr1 hg19_ensGene CDS 67126196 67126207 0.000000 + 0 gene_id "ENST00000237247"; transcript_id "ENST00000237247";
chr1 hg19_ensGene exon 67126196 67126207 0.000000 + . gene_id "ENST00000237247"; transcript_id "ENST00000237247";
chr1 hg19_ensGene CDS 67133213 67133224 0.000000 + 0 gene_id "ENST00000237247"; transcript_id "ENST00000237247";
chr1 hg19_ensGene exon 67133213 67133224 0.000000 + . gene_id "ENST00000237247"; transcript_id "ENST00000237247";
chr1 hg19_ensGene CDS 67136678 67136702 0.000000 + 0 gene_id "ENST00000237247"; transcript_id "ENST00000237247";
chr1 hg19_ensGene exon 67136678 67136702 0.000000 + . gene_id "ENST00000237247"; transcript_id "ENST00000237247";
chr1 hg19_ensGene CDS 67137627 67137678 0.000000 + 2 gene_id "ENST00000237247"; transcript_id "ENST00000237247";

2) cuffcompare command line:
/usca/clscratch/geru1/cufflinks-0.8.2.Linux_x86_64/cuffcompare -r /usca/home/geru1/gtf/refgene.gtf -o s_1_and_s_2.txt -R -s /usca/clscratch/geru1/bowtie-0.12.5/indexes/ ./testme/transcripts.gtf ./testme_s2/transcripts.gtf

3) Warning messages from cuffcompare:

GFF Warning: discarded overlapping feature segment (3019321-3021003) for GFF ID ENST00000416194
GFF Warning: discarded overlapping feature segment (2990575-2990576) for GFF ID ENST00000439917
GFF Warning: discarded overlapping feature segment (2904529-2904530) for GFF ID ENST00000431516
GFF Warning: discarded overlapping feature segment (2933284-2934966) for GFF ID ENST00000383431
GFF Warning: discarded overlapping feature segment (2953771-2953772) for GFF ID ENST00000436814
GFF Warning: discarded overlapping feature segment (2982531-2984213) for GFF ID ENST00000457089
GFF Warning: discarded overlapping feature segment (2941694-2941695) for GFF ID ENST00000423612
GFF Warning: discarded overlapping feature segment (2970446-2972128) for GFF ID ENST00000437010
Warning: transcript ENST00000370343 discarded (structural errors found, length=88047).
Warning: transcript ENST00000401006 discarded (structural errors found, length=22054).
Warning: transcript ENST00000465119 discarded (structural errors found, length=35491).
Warning: transcript ENST00000448632 discarded (structural errors found, length=26138).
Warning: transcript ENST00000444385 discarded (structural errors found, length=41396).
Warning: transcript ENST00000447431 discarded (structural errors found, length=30178).
Warning: transcript ENST00000372433 discarded (structural errors found, length=2407).
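
Editor's note: with output like the above, it can help to tally which IDs are affected before inspecting the GTF. The following is a minimal sketch, assuming the warnings have been captured to a list of lines; the regular expressions are written against the message text quoted in this thread only, and `summarize_warnings` is a hypothetical helper, not part of Cufflinks.

```python
import re
from collections import Counter

# Patterns written against the warning text quoted in this thread (an assumption:
# other Cufflinks versions may word these messages differently).
SEGMENT_RE = re.compile(
    r"GFF Warning: discarded overlapping feature segment "
    r"\((\d+)-(\d+)\) for GFF ID (\S+)"
)
TRANSCRIPT_RE = re.compile(
    r"Warning: transcript (\S+) discarded "
    r"\(structural errors found, length=(\d+)\)"
)


def summarize_warnings(lines):
    """Return (Counter of segment warnings per ID, {discarded ID: length})."""
    segments = Counter()
    discarded = {}
    for line in lines:
        m = SEGMENT_RE.search(line)
        if m:
            segments[m.group(3)] += 1
            continue
        m = TRANSCRIPT_RE.search(line)
        if m:
            discarded[m.group(1)] = int(m.group(2))
    return segments, discarded


if __name__ == "__main__":
    sample = [
        "GFF Warning: discarded overlapping feature segment (3019321-3021003) for GFF ID ENST00000416194",
        "Warning: transcript ENST00000370343 discarded (structural errors found, length=88047).",
    ]
    segs, disc = summarize_warnings(sample)
    print(dict(segs))
    print(disc)
```

The resulting ID lists can then be used to pull the matching records out of the reference GTF for a closer look.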

Thank you in advance!

Robin
Old 06-13-2010, 07:28 PM   #2
zorph
Member
 
Location: FL

Join Date: May 2010
Posts: 40

bump
Old 07-22-2010, 05:40 AM   #3
mfischer
Junior Member
 
Location: Austria

Join Date: Mar 2010
Posts: 9

Hi everybody,

I ran into the same warnings when running cuffcompare (v0.8.4) with the refFlat or refGene GTF files downloaded from the UCSC Table Browser as the reference parameter. When using Ensembl's GTF reference file (which the Cufflinks manual refers to), everything works fine.

Here are the first few warnings:
Quote:
GFF Warning: discarded overlapping feature segment (43916982-43916984) for GFF ID HYI
GFF Warning: discarded overlapping feature segment (43916824-43916982) for GFF ID HYI
Warning: transcript HYI discarded (structural errors found, length=2680).
And the refFlat entries which seem to cause them: (I don't show all of HYI's exons and CDS)
Quote:
chr1 hg19_refFlat stop_codon 43916981 43916983 0.000000 - . gene_id "HYI"; transcript_id "HYI";
chr1 hg19_refFlat CDS 43916984 43916982 0.000000 - 2 gene_id "HYI"; transcript_id "HYI";
chr1 hg19_refFlat exon 43916824 43916982 0.000000 - . gene_id "HYI"; transcript_id "HYI";
chr1 hg19_refFlat CDS 43919266 43919464 0.000000 - 0 gene_id "HYI"; transcript_id "HYI";
chr1 hg19_refFlat start_codon 43919462 43919464 0.000000 - . gene_id "HYI"; transcript_id "HYI";
chr1 hg19_refFlat exon 43919266 43919660 0.000000 - . gene_id "HYI"; transcript_id "HYI";
I noticed that the stop codon extends beyond the last exon (which ends at 43916982), and this seems to cause the first warning. Am I using the wrong GTF reference?
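
Editor's note: in the refFlat records quoted above, one CDS line has its start (43916984) greater than its end (43916982), and the stop_codon (43916981-43916983) runs one base past the exon ending at 43916982. A rough sketch of how such records could be flagged is shown below. It assumes tab-delimited GTF lines with a `transcript_id` attribute; `find_structural_problems` is a hypothetical helper, and whether these are exactly the conditions cuffcompare checks is not confirmed by the thread. Note that a stop codon legitimately split across two exons would also be flagged by this simple containment test.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def find_structural_problems(gtf_lines):
    """Return a list of (transcript_id, message) for suspicious GTF records."""
    exons = defaultdict(list)   # transcript_id -> [(start, end), ...]
    coding = defaultdict(list)  # transcript_id -> [(feature, start, end), ...]
    problems = []
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        m = TRANSCRIPT_ID_RE.search(fields[8])
        if not m:
            continue
        tid, feature = m.group(1), fields[2]
        start, end = int(fields[3]), int(fields[4])
        if start > end:
            # GTF coordinates are expected to satisfy start <= end.
            problems.append((tid, f"{feature} {start}-{end}: start > end"))
            continue
        if feature == "exon":
            exons[tid].append((start, end))
        elif feature in ("CDS", "start_codon", "stop_codon"):
            coding[tid].append((feature, start, end))
    # Second pass: every coding feature should sit inside some exon.
    for tid, feats in coding.items():
        for feature, start, end in feats:
            if not any(s <= start and end <= e for s, e in exons[tid]):
                problems.append(
                    (tid, f"{feature} {start}-{end}: not inside any exon")
                )
    return problems


if __name__ == "__main__":
    attrs = 'gene_id "HYI"; transcript_id "HYI";'
    rows = [
        ("stop_codon", 43916981, 43916983),
        ("CDS", 43916984, 43916982),
        ("exon", 43916824, 43916982),
        ("CDS", 43919266, 43919464),
        ("exon", 43919266, 43919660),
    ]
    lines = [
        f"chr1\thg19_refFlat\t{feat}\t{s}\t{e}\t0.000000\t-\t.\t{attrs}"
        for feat, s, e in rows
    ]
    for tid, msg in find_structural_problems(lines):
        print(tid, msg)
```

Because UCSC refFlat output uses the gene name as `transcript_id`, several distinct transcripts can share one ID, so results grouped by that ID should be read with care.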

Are there any recommendations which reference gtf files should be used with cufflinks?

Thanks in advance
Old 10-21-2010, 11:18 PM   #4
zun
Member
 
Location: Japan

Join Date: Oct 2010
Posts: 26
me too

Hi everyone,

I get the same warning, shown below:

GFF Warning: discarded overlapping feature segment (1610953-1611069) for GFF ID Os06t0130100-02
Warning: transcript Os06t0130100-02 discarded (structural errors found, length=6310).

I checked my reference GTF file and found that the gene (ID: Os06t0130100) has alternative splicing.

But there are many other genes that have alternative splicing and produce no warnings.

What should I do?

I gave up on that gene.
Old 11-01-2010, 04:28 PM   #5
Bacilo
Junior Member
 
Location: Madrid

Join Date: May 2010
Posts: 5
Same problem

I have the same problem using Cufflinks and using the -G option in TopHat (1.1.2, which accepts a GTF annotation file). Has anyone found a solution or an explanation for this warning message?

If the problem is alternative splicing, perhaps the program is discarding the duplicated exon, which is present in several mRNAs, and counts this exon only once when building the junctions database.
Old 12-13-2010, 08:06 AM   #6
adumitri
Member
 
Location: Cambridge, MA

Join Date: Jan 2010
Posts: 27

Hi,

It does not seem that anyone has solved the problem mentioned in this thread, but I am hoping that someone can help me with a similar problem. I am using the latest version of Cufflinks (v0.9.3) and I am getting a lot of warnings that look like this:

Quote:
GFF warning: merging adjacent/overlapping segments of ENST00000323801 on chr1 (245133554-245133622, 245133624-245133839)
GFF warning: merging adjacent/overlapping segments of ENST00000400934 on chr1 (247206093-247206248, 247206251-247206433)
GFF warning: merging adjacent/overlapping segments of ENST00000400934 on chr1 (247206093-247206433, 247206436-247206753)
The .gtf file used is the one downloaded from the UCSC browser.
Does anyone have a clue what the problem might be?
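
Editor's note: in the three warnings quoted above, the second segment starts only two or three positions after the first one ends. A small sketch for measuring that gap from the log is given below; the pattern is written against the message text quoted in this thread, `segment_gap` is a hypothetical helper, and the idea that Cufflinks treats such very short gaps as adjacent is an inference from the log, not something confirmed here.

```python
import re

# Written against the message format quoted in this thread (an assumption).
MERGE_RE = re.compile(
    r"GFF warning: merging adjacent/overlapping segments of (\S+) on (\S+) "
    r"\((\d+)-(\d+), (\d+)-(\d+)\)"
)


def segment_gap(line):
    """Return (transcript_id, chrom, gap) for a merge warning, else None.

    gap is the number of bases strictly between the two segments;
    0 means the segments touch, a negative value means they overlap.
    """
    m = MERGE_RE.search(line)
    if m is None:
        return None
    first_end = int(m.group(4))
    second_start = int(m.group(5))
    return m.group(1), m.group(2), second_start - first_end - 1


if __name__ == "__main__":
    log = [
        "GFF warning: merging adjacent/overlapping segments of ENST00000323801 on chr1 (245133554-245133622, 245133624-245133839)",
        "GFF warning: merging adjacent/overlapping segments of ENST00000400934 on chr1 (247206093-247206248, 247206251-247206433)",
    ]
    for entry in log:
        print(segment_gap(entry))
```

Sorting the results by gap size gives a quick picture of how many of the merged segments are separated by only a base or two.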

Even more, further during the Cufflinks run, I get these errors:

Quote:
> Processed 32736 loci. [*************************] 100%
[14:57:01] Re-estimating abundances with bias correction.
> Processing Locus chr20:18118498-18169031 [************ ] 51%
Error: sqrt(det(cov)) == 0, 0.000000 after rounding.
> Processing Locus chr3:12919020-12926710 [************** ] 56%
Error: sqrt(det(cov)) == 0, 0.000000 after rounding.
> Processing Locus chr3:49977439-50226508 [************** ] 57%
Error: sqrt(det(cov)) == 0, 0.000000 after rounding.
> Processing Locus chr7:99686576-99689823 [******************* ] 79%
Error: sqrt(det(cov)) == 0, 0.000000 after rounding.
> Processed 32736 loci. [*************************] 100%

Any help would be appreciated,
Alexandra
Old 12-13-2010, 09:45 AM   #7
jb2
Member
 
Location: Boston, MA

Join Date: Jun 2010
Posts: 25

I am also getting similar error messages to adumitri. The sqrt(det(cov)) issue was also mentioned in this thread: http://seqanswers.com/forums/showthread.php?t=6178
Old 04-19-2011, 04:29 PM   #8
josiah42
Junior Member
 
Location: Colorado

Join Date: Apr 2011
Posts: 1

Since no one else has responded with a solution, I thought this might help:
Here is the error I was getting:
Quote:
GFF warning: merging adjacent/overlapping segments of ENSMUST00000170708 on chr19 (9090282-9092073, 9092076-9092111)
GFF warning: merging adjacent/overlapping segments of ENSMUST00000170708 on chr19 (9090282-9092111, 9092116-9093685)
GFF warning: merging adjacent/overlapping segments of ENSMUST00000073056 on chr19 (9290834-9291148, 9291150-9291487)
GFF warning: merging adjacent/overlapping segments of ENSMUST00000088040 on chr19 (9357869-9358351, 9358354-9358368)
GFF warning: merging adjacent/overlapping segments of ENSMUST00000088040 on chr19 (9357869-9358368, 9358371-9358391)
GFF warning: merging adjacent/overlapping segments of ENSMUST00000088040 on chr19 (9357869-9358391, 9358394-9358654)
I was running cufflinks using the -G option with a gtf file that I downloaded from the UCSC Genome Browser "Tables" page.

The issue is that I was using Ensembl gene names and they didn't match my data. Switching to using RefSeq gene names fixed the problem. For me, it was as simple as changing the "Track" dropdown box. I hope that helps someone in the future.
Old 10-26-2011, 10:14 AM   #9
jhb1980
Junior Member
 
Location: Switzerland

Join Date: Dec 2010
Posts: 7

Hi all,

I'm unsure whether this issue has been resolved yet, but I recently encountered the same GFF warning messages using Ensembl's v64 (mm9) *.gtf when running Cufflinks v1.1.0, e.g.:

Code:
GFF warning: merging adjacent/overlapping segments of ENSMUST00000098967 on chr2 (181331877-181332007, 181332010-181332048)
Looking at the gene tracking output files, Cufflinks seems to have merged well over 1,000 reference gene loci. I went through a few of them in the UCSC browser, and it appears that these merges occur when a reference transcript is annotated as extending into a downstream gene on the same strand. In the attached example, Cufflinks merged Lypla1 and Tcea1 into a single gene locus because ENMUST**0155020 supposedly extends into Tcea1. It's hard to tell whether this is genuine alternative splicing or just an annotation artifact.

None of the merged reference genes are of apparent interest to me, so I'll live with it for the time being. Other than manually removing the individual transcripts that cause the merge from the reference *.gtf, I am not sure whether there is any way to suppress these merges in Cufflinks. If there is, please let me know!
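
Editor's note: the manual workaround mentioned above (removing individual transcripts from the reference *.gtf) could be scripted roughly as follows. This is only a sketch: `drop_transcripts` and the ID `TX_TO_DROP` are hypothetical, and it assumes each record carries a `transcript_id "..."` attribute as in the GTF excerpts quoted earlier in the thread.

```python
import re

TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def drop_transcripts(gtf_lines, unwanted_ids):
    """Yield the GTF lines whose transcript_id is not in unwanted_ids.

    Lines without a transcript_id attribute (e.g. comments) pass through.
    """
    unwanted = set(unwanted_ids)
    for line in gtf_lines:
        m = TRANSCRIPT_ID_RE.search(line)
        if m and m.group(1) in unwanted:
            continue
        yield line


if __name__ == "__main__":
    # In-memory demo; TX_TO_DROP is a placeholder, not a real Ensembl ID.
    records = [
        'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "TX_KEEP";\n',
        'chr1\tsrc\texon\t150\t900\t.\t+\t.\tgene_id "g1"; transcript_id "TX_TO_DROP";\n',
    ]
    for kept in drop_transcripts(records, ["TX_TO_DROP"]):
        print(kept, end="")
```

For a real file, the same generator can be fed an open file handle and its output written to a new GTF, leaving the original annotation untouched.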
Attached Images
File Type: png Example.png (127.2 KB, 26 views)
Tags
cuffcompare, warning
