  • Cuffmerge Error

    I am getting the following error when trying to merge several files produced by cufflinks using output from tophat. I'm using the same reference in both cases and do not know why it would be giving me this error.


    Error (GFaSeqGet): end coordinate (121191482) cannot be larger than sequence length 121191424
    Error (GFaSeqGet): end coordinate (121191482) cannot be larger than sequence length 121191424
    Error (GFaSeqGet): subsequence cannot be larger than 16338
    Error getting subseq for CUFF.24532.1 (1..16348)!
    [FAILED]
    Error: could not execute cuffcompare
    Traceback (most recent call last):
    File "/shared/local/cufflinks/cuffmerge", line 573, in ?
    sys.exit(main())
    File "/shared/local/cufflinks/cuffmerge", line 556, in main
    compare_meta_asm_against_ref(params.ref_gtf, params.fasta, output_dir+"/transcripts.gtf")
    File "/shared/local/cufflinks/cuffmerge", line 406, in compare_meta_asm_against_ref
    tmap = compare_to_reference(gtf_input_file, ref_gtf, fasta_file)
    File "/shared/local/cufflinks/cuffmerge", line 342, in compare_to_reference
    exit(1)
    TypeError: 'str' object is not callable

    If anyone knows why this is happening or how to circumvent it, that would be great.
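A side note on the last line of the traceback: the `in ?` frames suggest a Python 2.4-era interpreter, where the builtin name `exit` was bound to a help string rather than a callable. In that case `exit(1)` itself raises a TypeError, which hides the real failure (the cuffcompare coordinate error above). The sketch below only imitates that situation in current Python; `legacy_exit` is a hypothetical stand-in, not code taken from cuffmerge.

```python
import sys

# Assumption: imitate an old interpreter in which the builtin `exit`
# was a plain help string instead of a callable object.
legacy_exit = "Use Ctrl-D (i.e. EOF) to exit."


def call_legacy_exit():
    """Call the string as if it were a function; return the error text."""
    try:
        legacy_exit(1)  # same shape as the failing `exit(1)` call
    except TypeError as err:
        return str(err)
    return None


def call_sys_exit():
    """sys.exit raises SystemExit, which carries the exit status."""
    try:
        sys.exit(1)
    except SystemExit as err:
        return err.code


if __name__ == "__main__":
    print(call_legacy_exit())
    print(call_sys_exit())
```

Either way, the TypeError is only a symptom; the coordinate errors printed before the traceback are what stopped cuffcompare.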

  • #2
    Hi ercfrtz,
    Were you able to figure out what was causing your cuffcompare error message? I am getting the same message. I am running this for about 15 samples, but I get the error for one of them.

    Please let me know if you were able to figure out what was the problem.

    Thanks!



    • #3
      Sorry for bumping the thread, but I get the same error as well. Has anyone found the cause of this error yet?



      • #4
        Strange: I'm getting a similar error that I have never seen before.

        I'm running Bowtie2 > samtools view | sort > samtools merge > cufflinks



        Code:
        matthew@macmanes:/media/hd/working/tuco/social.cuff$ cufflinks -p8 -m320 -u -o /media/hd/working/tuco/social.cuff -L social \
        > -b /media/hd/working/tuco/tuco29dec11.fa --upper-quartile-norm --max-mle-iterations 20000 \
        > /media/hd/working/tuco/b2.bams/all/social.bam
        You are using Cufflinks v1.3.0, which is the most recent release.
        [07:43:18] Inspecting reads and determining fragment length distribution.
        > Processed 154768 loci.                       [*************************] 100%
        > Map Properties:
        >	Upper Quartile: 241.00
        >	Number of Multi-Reads: 0 (with 0 total hits)
        >	Fragment Length Distribution: Truncated Gaussian (user-specified)
        >	              Default Mean: 320
        >	           Default Std Dev: 80
        [08:10:53] Assembling transcripts and initializing abundances for multi-read correction.
        > Processed 154768 loci.                       [*************************] 100%
        [08:48:16] Loading reference annotation and sequence.
        Error (GFaSeqGet): subsequence cannot be larger than 384
        Error getting subseq for social.2.1 (1..385)!



        • #5
          For me, at least, removing the -b <in>.fasta option 'solves' the problem. However, I'd really like to use the -b option.

          This is the same FASTA file that was used in mapping, i.e. for building the Bowtie index.



          • #6
            I am getting the same error with or without the -b option in cufflinks.
            I mapped the reads with hg19.fa (UCSC) using samtools.
            Then I removed duplicates using Picard, and sorted and indexed the data again using samtools.
            Finally I ran cufflinks. That first part works only without the -b option; then I tried cuffmerge, and it fails with:
            Error (GFaSeqGet): subsequence cannot be larger than 16571
            Error getting subseq for CUFF.42374.1 (2..16614)!

            Any help is appreciated....



            • #7
              I got the same cuffmerge error too.

              I mapped reads to the genome with tophat 2.0.6, then assembled transcripts with cufflinks 2.0.2. All of those steps were successful.

              However, when I tried to merge the transcript.gtf files from all my samples with cuffmerge 2.0.2, it failed with these error messages:

              Error (GFaSeqGet): subsequence cannot be larger than 100
              Error getting subseq for CUFF.63509.1 (1..103)!
              [FAILED]
              Error: could not execute cuffcompare

              Strangely, the CUFF.63509.1 transcript is located on chromosome 8, which is far longer than 100 bp (148491826 bp).


              8 Cufflinks transcript 58753100 58756101 1000 - . gene_id "CUFF.63509"; transcript_id "CUFF.63509.1"; FPKM "0.3200324464"; frac "0.180108"; conf_lo "0.246484"; conf_hi "0.393581"; cov "5.392457";
              8 Cufflinks exon 58753100 58756101 1000 - . gene_id "CUFF.63509"; transcript_id "CUFF.63509.1"; exon_number "1"; FPKM "0.3200324464"; frac "0.180108"; conf_lo "0.246484"; conf_hi "0.393581"; cov "5.392457";


              chromosome 8 info:

              >8 dna:chromosome chromosome:Sscrofa10.2:8:1:148491826:1 REF

              Did anyone find a solution to this problem? Any help is appreciated. Thanks.
              Last edited by johnwu; 02-28-2013, 03:49 PM.



              • #8
                Hello

                Just to add weight to this: I got the same cuffmerge error too. I mapped my reads back to my reference as usual, but now I get this error.

                Has anyone found a solution yet?

                Darren



                • #9
                  I am guessing no one has found a solution? I also have the same problem...



                  • #10
                    I found this post on biostar if it helps anyone. I think the problem, for me at least, might be that I aligned my RNA-seq libraries to a different FASTA file than the one I am passing to cufflinks.
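One way to check for this kind of mismatch is to compare the sequence names and lengths recorded in the alignment header with those in the FASTA file given to cufflinks. The following is a minimal sketch, assuming the header lines come from something like `samtools view -H`; the function names are hypothetical, and the toy data in the demo is made up for illustration.

```python
def sam_header_lengths(header_lines):
    """Collect {sequence name: length} from the @SQ lines of a SAM header."""
    lengths = {}
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        tags = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
        lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def fasta_lengths(fasta_lines):
    """Collect {sequence name: length} by summing the sequence lines."""
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # name is the first word of the header
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def reference_mismatches(sam_lengths, fa_lengths):
    """List (name, header length, FASTA length or None) for each disagreement."""
    problems = []
    for name, length in sorted(sam_lengths.items()):
        if name not in fa_lengths:
            problems.append((name, length, None))
        elif fa_lengths[name] != length:
            problems.append((name, length, fa_lengths[name]))
    return problems


if __name__ == "__main__":
    # Toy data for illustration only.
    header = ["@HD\tVN:1.0\tSO:coordinate", "@SQ\tSN:chr1\tLN:8", "@SQ\tSN:chr2\tLN:4"]
    fasta = [">chr1 toy sequence", "ACGT", "ACG", ">chr3", "AAAA"]
    for problem in reference_mismatches(sam_header_lengths(header), fasta_lengths(fasta)):
        print(problem)
```

An empty result means every sequence in the alignment header is present in the FASTA file with the same length; it does not prove the sequences themselves are identical.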



                    • #11
                      I found that, for some reason, cufflinks would assemble some frags/transcripts/contigs that extend past the end of the chromosome.

                      After removing/modifying those records from transcript.gtf generated by cufflinks, cuffmerge could proceed without any problem.

                      Here's an example from my project:

                      chromosome/scaffold/contig name : GL893313.2
                      chromosome/scaffold/contig length : 161573
                      exon coordinate: 161578 ( > chromosome length )

                      GL893313.2 Cufflinks exon 161457 161578 1000 + . gene_id "CUFF.77262"; transcript_id "CUFF.77262.1"; exon_number "3"; FPKM "1.1077759277"; frac "1.000000"; conf_lo "1.008891"; conf_hi "1.206661"; cov "18.665712";


                      CORRECTED:

                      GL893313.2 Cufflinks exon 161457 161573 1000 + . gene_id "CUFF.77262"; transcript_id "CUFF.77262.1"; exon_number "3"; FPKM "1.1077759277"; frac "1.000000"; conf_lo "1.008891"; conf_hi "1.206661"; cov "18.665712";

                      In my case, it seems that cufflinks only generated these over-long frags/contigs when assembling on genome sequence contigs (not chromosomes).
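The manual correction above can be automated. This is a minimal sketch, assuming a tab-separated GTF and sequence lengths read from a `.fai` index (as written by `samtools faidx`); `read_fai_lengths` and `clip_gtf_line` are hypothetical helper names, and dropping records that start past the end of the sequence is one possible policy, not something cufflinks or cuffmerge prescribes.

```python
def read_fai_lengths(fai_lines):
    """Map sequence name -> length from the first two columns of a .fai index."""
    lengths = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths


def clip_gtf_line(line, seq_lengths):
    """Clip a GTF record's end coordinate to the sequence length.

    Returns the line unchanged if it is a comment, is blank, is on an unknown
    sequence, or is already in range. Returns None if the record starts past
    the end of the sequence, so the caller can drop it.
    """
    if line.startswith("#") or not line.strip():
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9 or fields[0] not in seq_lengths:
        return line
    length = seq_lengths[fields[0]]
    start, end = int(fields[3]), int(fields[4])
    if start > length:
        return None  # nothing left to keep
    if end > length:
        fields[4] = str(length)
        return "\t".join(fields) + newline
    return line


if __name__ == "__main__":
    lengths = read_fai_lengths(["GL893313.2\t161573\t12\t60\t61\n"])
    record = ("GL893313.2\tCufflinks\texon\t161457\t161578\t1000\t+\t.\t"
              'gene_id "CUFF.77262"; transcript_id "CUFF.77262.1";\n')
    print(clip_gtf_line(record, lengths), end="")
```

Because the function is applied to every line, the enclosing transcript record is clipped along with its last exon, which keeps the two consistent.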



                      • #12
                        Hello,

                        So I have been troubleshooting my problem with Geo Pertea, and we found that it arose because CLC (which I mapped my reads with) only soft clips reads when they map past the end of the reference contig.

                        Take for example this (partial) SAM record:

                        502_1735_1931_F3 16 scaffold_10212 558 0 36S39M [etc.]

                        CLC aligned only 39 bases of this read to the end of this short contig (596 bases); the remaining 36 nt of the read hang beyond the contig boundary and are thus reported as soft clipped (which makes sense). Unfortunately, it looks like Cufflinks didn't exclude the soft-clipped part from further consideration when determining the boundaries of the transfrag. The Tuxedo pipeline (specifically TopHat) does not normally deal with soft-clipped alignments, so I guess that's why we didn't get to test and make Cufflinks work properly with such alignments.
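To make the arithmetic concrete, here is a small sketch of how the end coordinate of that record works out with and without the soft-clipped bases. It illustrates the SAM convention that soft clips do not consume reference positions; it is not a reconstruction of what Cufflinks does internally, and both function names are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_end(pos, cigar):
    """1-based inclusive end on the reference; soft clips are not counted."""
    span = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos + span - 1


def naive_end(pos, cigar):
    """End coordinate if soft-clipped bases are wrongly treated as aligned."""
    span = sum(int(n) for n, op in CIGAR_OP.findall(cigar)
               if op in REF_CONSUMING or op == "S")
    return pos + span - 1


if __name__ == "__main__":
    contig_length = 596
    pos, cigar = 558, "36S39M"
    print(reference_end(pos, cigar), reference_end(pos, cigar) <= contig_length)
    print(naive_end(pos, cigar), naive_end(pos, cigar) <= contig_length)
```

With the 39 matched bases alone the alignment ends exactly at base 596, the last base of the contig; adding the 36 clipped bases to the span would give 632, past the contig end.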



                        • #13
                          Courtesy of Alex Dobin, this might be useful to those dealing with this problem.
