  • cuffcompare warning message

    This is the first time I have used your Cufflinks software, and I don't understand some of the warning messages from the cuffcompare command line. I am using the latest version, cufflinks-0.8.2.Linux_x86_64.
    I downloaded the reference annotation GTF files (human Ensembl and RefSeq) from the UCSC Table Browser.
    1) UCSC human ensembl GTF file:
    chr1 hg19_ensGene CDS 67126196 67126207 0.000000 + 0 gene_id "ENST00000237247"; transcript_id "ENST00000237247";
    chr1 hg19_ensGene exon 67126196 67126207 0.000000 + . gene_id "ENST00000237247"; transcript_id "ENST00000237247";
    chr1 hg19_ensGene CDS 67133213 67133224 0.000000 + 0 gene_id "ENST00000237247"; transcript_id "ENST00000237247";
    chr1 hg19_ensGene exon 67133213 67133224 0.000000 + . gene_id "ENST00000237247"; transcript_id "ENST00000237247";
    chr1 hg19_ensGene CDS 67136678 67136702 0.000000 + 0 gene_id "ENST00000237247"; transcript_id "ENST00000237247";
    chr1 hg19_ensGene exon 67136678 67136702 0.000000 + . gene_id "ENST00000237247"; transcript_id "ENST00000237247";
    chr1 hg19_ensGene CDS 67137627 67137678 0.000000 + 2 gene_id "ENST00000237247"; transcript_id "ENST00000237247";

    2) cuffcompare command line:
    /usca/clscratch/geru1/cufflinks-0.8.2.Linux_x86_64/cuffcompare -r /usca/home/geru1/gtf/refgene.gtf -o s_1_and_s_2.txt -R -s /usca/clscratch/geru1/bowtie-0.12.5/indexes/ ./testme/transcripts.gtf ./testme_s2/transcripts.gtf

    3) Warning messages from cuffcompare:

    GFF Warning: discarded overlapping feature segment (3019321-3021003) for GFF ID ENST00000416194
    GFF Warning: discarded overlapping feature segment (2990575-2990576) for GFF ID ENST00000439917
    GFF Warning: discarded overlapping feature segment (2904529-2904530) for GFF ID ENST00000431516
    GFF Warning: discarded overlapping feature segment (2933284-2934966) for GFF ID ENST00000383431
    GFF Warning: discarded overlapping feature segment (2953771-2953772) for GFF ID ENST00000436814
    GFF Warning: discarded overlapping feature segment (2982531-2984213) for GFF ID ENST00000457089
    GFF Warning: discarded overlapping feature segment (2941694-2941695) for GFF ID ENST00000423612
    GFF Warning: discarded overlapping feature segment (2970446-2972128) for GFF ID ENST00000437010
    Warning: transcript ENST00000370343 discarded (structural errors found, length=88047).
    Warning: transcript ENST00000401006 discarded (structural errors found, length=22054).
    Warning: transcript ENST00000465119 discarded (structural errors found, length=35491).
    Warning: transcript ENST00000448632 discarded (structural errors found, length=26138).
    Warning: transcript ENST00000444385 discarded (structural errors found, length=41396).
    Warning: transcript ENST00000447431 discarded (structural errors found, length=30178).
    Warning: transcript ENST00000372433 discarded (structural errors found, length=2407).
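
When there are many of these messages, it can help to tally them before looking at individual transcripts in the GTF. The following is only a minimal sketch: it assumes the messages were saved to a text file or list of lines, and the function name `summarize_warnings` is made up for illustration. The patterns are written against the two message shapes quoted above and may need adjusting for other versions.

```python
import re
from collections import Counter

# Patterns written against the two warning shapes quoted in this thread.
SEGMENT_RE = re.compile(
    r"GFF Warning: discarded overlapping feature segment "
    r"\((\d+)-(\d+)\) for GFF ID (\S+)"
)
TRANSCRIPT_RE = re.compile(
    r"Warning: transcript (\S+) discarded "
    r"\(structural errors found, length=(\d+)\)"
)


def summarize_warnings(lines):
    """Return (discarded segments per ID, {discarded transcript: length})."""
    segments = Counter()
    transcripts = {}
    for line in lines:
        m = SEGMENT_RE.search(line)
        if m:
            segments[m.group(3)] += 1
            continue
        m = TRANSCRIPT_RE.search(line)
        if m:
            transcripts[m.group(1)] = int(m.group(2))
    return segments, transcripts


# Small in-memory example using two of the lines quoted above.
sample = [
    "GFF Warning: discarded overlapping feature segment "
    "(3019321-3021003) for GFF ID ENST00000416194",
    "Warning: transcript ENST00000370343 discarded "
    "(structural errors found, length=88047).",
]
segments, transcripts = summarize_warnings(sample)
print(sorted(segments), sorted(transcripts))
```

The resulting IDs can then be looked up in the reference GTF to see which feature lines triggered the message.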

    Thank you in advance!

    Robin

  • #2
    bump


    • #3
      Hi everybody,

      I ran into the same warnings when running cuffcompare (v0.8.4) with the refFlat or refGene GTF files downloaded from the UCSC Table Browser as the reference annotation. When using Ensembl's GTF reference file (which the Cufflinks manual refers to), everything works fine.

      Here are the first few warnings:
      GFF Warning: discarded overlapping feature segment (43916982-43916984) for GFF ID HYI
      GFF Warning: discarded overlapping feature segment (43916824-43916982) for GFF ID HYI
      Warning: transcript HYI discarded (structural errors found, length=2680).
      And here are the refFlat entries that seem to cause them (I don't show all of HYI's exons and CDS):
      chr1 hg19_refFlat stop_codon 43916981 43916983 0.000000 - . gene_id "HYI"; transcript_id "HYI";
      chr1 hg19_refFlat CDS 43916984 43916982 0.000000 - 2 gene_id "HYI"; transcript_id "HYI";
      chr1 hg19_refFlat exon 43916824 43916982 0.000000 - . gene_id "HYI"; transcript_id "HYI";
      chr1 hg19_refFlat CDS 43919266 43919464 0.000000 - 0 gene_id "HYI"; transcript_id "HYI";
      chr1 hg19_refFlat start_codon 43919462 43919464 0.000000 - . gene_id "HYI"; transcript_id "HYI";
      chr1 hg19_refFlat exon 43919266 43919660 0.000000 - . gene_id "HYI"; transcript_id "HYI";
      I noticed that the stop codon extends past the end of the last exon (which ends at 43916982), and this causes the first warning. Am I using the wrong GTF reference?

      Are there any recommendations which reference gtf files should be used with cufflinks?
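
The two oddities in the HYI entries above (a CDS whose start is greater than its end, and a stop codon reaching past its exon) can be looked for mechanically. This is a rough sketch, not how cuffcompare itself validates transcripts; the function names are invented for illustration. It splits on any whitespace so it works on both real tab-delimited GTF and the space-separated lines as pasted here. Note that a codon legitimately split across two exons would also be flagged, so a hit is a hint rather than proof of an error.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def parse_feature(line):
    """Return (transcript_id, feature, start, end), or None for non-feature lines."""
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.split(None, 8)  # first 8 columns, then the attribute string
    if len(cols) < 9:
        return None
    m = TRANSCRIPT_ID_RE.search(cols[8])
    if m is None:
        return None
    return m.group(1), cols[2], int(cols[3]), int(cols[4])


def find_coordinate_problems(lines):
    """Flag features with start > end, and non-exon features outside every exon."""
    exons = defaultdict(list)
    others = []
    problems = []
    for line in lines:
        rec = parse_feature(line)
        if rec is None:
            continue
        tid, feature, start, end = rec
        if start > end:
            problems.append((tid, feature, start, end, "start > end"))
        elif feature == "exon":
            exons[tid].append((start, end))
        else:
            others.append(rec)
    # Second pass: CDS and codon features should sit inside one exon of the
    # same transcript.
    for tid, feature, start, end in others:
        if not any(s <= start and end <= e for s, e in exons[tid]):
            problems.append((tid, feature, start, end, "not inside any exon"))
    return problems
```

Run over the six HYI lines quoted above, this flags the CDS line 43916984-43916982 and the stop_codon line 43916981-43916983.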

      Thanks in advance


      • #4
        me too

        Hi everyone,

        I get the same warning, shown below:

        GFF Warning: discarded overlapping feature segment (1610953-1611069) for GFF ID Os06t0130100-02
        Warning: transcript Os06t0130100-02 discarded (structural errors found, length=6310).

        I checked my reference GTF file and found that the gene (ID: Os06t0130100) has alternative splicing.

        But there are many other genes that have alternative splicing and produce no warnings.

        What should I do?

        I gave up on that gene.


        • #5
          Same problem

          I have the same problem using Cufflinks and using the -G option in TopHat (1.1.2, which accepts a GTF annotation file). Has anyone found a solution or an explanation for this warning message?

          If the problem is alternative splicing, perhaps the program discards the duplicated exon, which is present in several mRNAs, and counts it only once when building the junction database.


          • #6
            Hi,

            It does not seem that anyone has solved the problem mentioned in this thread, but I am hoping that someone can help me with a similar problem. I am using the latest version of Cufflinks (v0.9.3) and I am getting a lot of warnings that look like this:

            GFF warning: merging adjacent/overlapping segments of ENST00000323801 on chr1 (245133554-245133622, 245133624-245133839)
            GFF warning: merging adjacent/overlapping segments of ENST00000400934 on chr1 (247206093-247206248, 247206251-247206433)
            GFF warning: merging adjacent/overlapping segments of ENST00000400934 on chr1 (247206093-247206433, 247206436-247206753)
            The .gtf file used is the one downloaded from the UCSC browser.
            Does anyone have a clue what the problem might be?
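
One thing visible in the warnings above is that each pair of merged segments is separated by only one or two bases (for example 245133622 and 245133624). A short sketch like the one below can list every such exon pair in a GTF file. The `max_gap` default of 4 is an assumption chosen for illustration, not a documented Cufflinks threshold, and `find_small_gaps` is a made-up name.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def find_small_gaps(lines, max_gap=4):
    """List exon pairs of one transcript separated by at most max_gap bases.

    max_gap is a hypothetical cutoff; a gap of 0 means adjacent exons and a
    negative gap means the exons overlap.
    """
    exons = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split(None, 8)  # tolerate tabs or spaces between columns
        if len(cols) < 9 or cols[2] != "exon":
            continue
        m = TRANSCRIPT_ID_RE.search(cols[8])
        if m:
            exons[m.group(1)].append((int(cols[3]), int(cols[4])))

    hits = []
    for tid, spans in exons.items():
        spans.sort()
        for (s1, e1), (s2, e2) in zip(spans, spans[1:]):
            gap = s2 - e1 - 1  # number of bases strictly between the exons
            if gap <= max_gap:
                hits.append((tid, (s1, e1), (s2, e2), gap))
    return hits
```

If the pairs it reports match the transcript IDs in the warnings, the message is most likely about the annotation's tiny exon gaps rather than about the reads.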

            In addition, later in the Cufflinks run I get these errors:

            > Processed 32736 loci. [*************************] 100%
            [14:57:01] Re-estimating abundances with bias correction.
            > Processing Locus chr20:18118498-18169031 [************ ] 51%
            Error: sqrt(det(cov)) == 0, 0.000000 after rounding.
            > Processing Locus chr3:12919020-12926710 [************** ] 56%
            Error: sqrt(det(cov)) == 0, 0.000000 after rounding.
            > Processing Locus chr3:49977439-50226508 [************** ] 57%
            Error: sqrt(det(cov)) == 0, 0.000000 after rounding.
            > Processing Locus chr7:99686576-99689823 [******************* ] 79%
            Error: sqrt(det(cov)) == 0, 0.000000 after rounding.
            > Processed 32736 loci. [*************************] 100%

            Any help would be appreciated,
            Alexandra


            • #7
              I am also getting error messages similar to adumitri's. The sqrt(det(cov)) issue was also mentioned in this thread: http://seqanswers.com/forums/showthread.php?t=6178


              • #8
                Since no one else has responded with a solution, I thought this might help:
                Here is the error I was getting:
                GFF warning: merging adjacent/overlapping segments of ENSMUST00000170708 on chr19 (9090282-9092073, 9092076-9092111)
                GFF warning: merging adjacent/overlapping segments of ENSMUST00000170708 on chr19 (9090282-9092111, 9092116-9093685)
                GFF warning: merging adjacent/overlapping segments of ENSMUST00000073056 on chr19 (9290834-9291148, 9291150-9291487)
                GFF warning: merging adjacent/overlapping segments of ENSMUST00000088040 on chr19 (9357869-9358351, 9358354-9358368)
                GFF warning: merging adjacent/overlapping segments of ENSMUST00000088040 on chr19 (9357869-9358368, 9358371-9358391)
                GFF warning: merging adjacent/overlapping segments of ENSMUST00000088040 on chr19 (9357869-9358391, 9358394-9358654)
                I was running cufflinks using the -G option with a gtf file that I downloaded from the UCSC Genome Browser "Tables" page.

                The issue is that I was using Ensembl gene names and they didn't match my data. Switching to using RefSeq gene names fixed the problem. For me, it was as simple as changing the "Track" dropdown box. I hope that helps someone in the future.


                • #9
                  Hi all,

                  I'm unsure whether this issue has been resolved yet, but I recently encountered the same GFF warning messages using Ensembl's v64 (mm9) *.gtf when running Cufflinks v1.1.0, e.g.:

                  Code:
                  GFF warning: merging adjacent/overlapping segments of ENSMUST00000098967 on chr2 (181331877-181332007, 181332010-181332048)
                  Looking at the gene tracking output files, Cufflinks seems to have merged well over 1,000 reference gene loci. I went through a few of them on the UCSC browser, and it would appear that these merges occur when a reference transcript is annotated to extend into a downstream gene on the same strand. In the attached example, Cufflinks merged Lypla1 and Tcea1 into a single gene locus due to ENMUST**0155020 supposedly extending into Tcea1. I guess it's hard to tell if this is genuine alternative splicing or just an annotation artifact.

                  Looking at the merged reference genes, none of them is of apparent interest to me, so I guess I'll live with it for the time being. Other than manually removing the individual transcripts causing the merge from the reference *.gtf, I am not sure whether there is any way to suppress these merges in Cufflinks. If there is, please let me know!
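
For the manual workaround of removing individual transcripts from the reference *.gtf, a small filter along these lines might be enough. It is only a sketch: `drop_transcripts` is a made-up helper, the IDs in the example are placeholders, and it simply skips every line whose `transcript_id` attribute is in the unwanted set while passing all other lines through unchanged.

```python
import re

TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def drop_transcripts(lines, unwanted_ids):
    """Yield every line except features belonging to an unwanted transcript."""
    unwanted = set(unwanted_ids)
    for line in lines:
        m = TRANSCRIPT_ID_RE.search(line)
        if m and m.group(1) in unwanted:
            continue  # skip exon, CDS, codon lines of this transcript
        yield line


# In-memory example with placeholder IDs (TX_KEEP and TX_DROP are hypothetical).
example = [
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "TX_KEEP";\n',
    'chr1\tsrc\texon\t150\t900\t.\t+\t.\tgene_id "G1"; transcript_id "TX_DROP";\n',
]
kept = list(drop_transcripts(example, ["TX_DROP"]))
print(len(kept))
```

Because the function takes and yields plain lines, it can be fed an open file handle and its output written to a new file with `writelines`, leaving the original annotation untouched.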
                  Attached Files
