SEQanswers

11-01-2011, 10:50 AM   #1
ercfrtz
Member
 
Location: Iowa

Join Date: Aug 2010
Posts: 23
Cuffmerge Error

I am getting the following error when trying to merge several files produced by cufflinks using output from tophat. I'm using the same reference in both cases and do not know why it would be giving me this error.


Error (GFaSeqGet): end coordinate (121191482) cannot be larger than sequence length 121191424
Error (GFaSeqGet): end coordinate (121191482) cannot be larger than sequence length 121191424
Error (GFaSeqGet): subsequence cannot be larger than 16338
Error getting subseq for CUFF.24532.1 (1..16348)!
[FAILED]
Error: could not execute cuffcompare
Traceback (most recent call last):
File "/shared/local/cufflinks/cuffmerge", line 573, in ?
sys.exit(main())
File "/shared/local/cufflinks/cuffmerge", line 556, in main
compare_meta_asm_against_ref(params.ref_gtf, params.fasta, output_dir+"/transcripts.gtf")
File "/shared/local/cufflinks/cuffmerge", line 406, in compare_meta_asm_against_ref
tmap = compare_to_reference(gtf_input_file, ref_gtf, fasta_file)
File "/shared/local/cufflinks/cuffmerge", line 342, in compare_to_reference
exit(1)
TypeError: 'str' object is not callable

If anyone knows why this is happening or how to circumvent it, that would be great.
12-27-2011, 09:11 AM   #2
arodrigu1
Junior Member
 
Location: Chicago, IL

Join Date: Nov 2009
Posts: 1

Hi ercfrtz,
Were you able to figure out what was causing your cuffcompare error message? I am getting the same message. I am running this on about 15 samples, but I get the error for one of them.

Please let me know if you were able to figure out what the problem was.

Thanks!
01-12-2012, 06:12 AM   #3
lukas1848
Member
 
Location: Germany

Join Date: Jun 2011
Posts: 54

Sorry for bumping the thread, but I get the same error as well. Has anyone found the cause of this error yet?
01-13-2012, 08:12 AM   #4
peromhc
Senior Member
 
Location: Durham, NH

Join Date: Sep 2009
Posts: 108

Strange: I'm getting a similar error that I have never seen before.

I'm running Bowtie2 > samtools view | sort > samtools merge > cufflinks



Code:
matthew@macmanes:/media/hd/working/tuco/social.cuff$ cufflinks -p8 -m320 -u -o /media/hd/working/tuco/social.cuff -L social \
> -b /media/hd/working/tuco/tuco29dec11.fa --upper-quartile-norm --max-mle-iterations 20000 \
> /media/hd/working/tuco/b2.bams/all/social.bam
You are using Cufflinks v1.3.0, which is the most recent release.
[07:43:18] Inspecting reads and determining fragment length distribution.
> Processed 154768 loci.                       [*************************] 100%
> Map Properties:
>	Upper Quartile: 241.00
>	Number of Multi-Reads: 0 (with 0 total hits)
>	Fragment Length Distribution: Truncated Gaussian (user-specified)
>	              Default Mean: 320
>	           Default Std Dev: 80
[08:10:53] Assembling transcripts and initializing abundances for multi-read correction.
> Processed 154768 loci.                       [*************************] 100%
[08:48:16] Loading reference annotation and sequence.
Error (GFaSeqGet): subsequence cannot be larger than 384
Error getting subseq for social.2.1 (1..385)!
01-13-2012, 01:09 PM   #5
peromhc
Senior Member
 
Location: Durham, NH

Join Date: Sep 2009
Posts: 108

For me, at least, removing the -b <in>.fasta option 'solves' the problem. I'd really like to use the -b option, however.

This is the same fasta file that was used in mapping, i.e. for building the bowtie index.
01-30-2013, 10:11 AM   #6
rpauly
Member
 
Location: Atlanta

Join Date: Apr 2011
Posts: 32

I am getting the same error with or without the -b option in cufflinks.
I mapped the reads against hg19.fa (UCSC) using samtools.
Then I removed duplicates using picard, and sorted and indexed the data again using samtools.
Finally I ran cufflinks. The first part works only without the -b option; then I tried cuffmerge and it fails with:
Error (GFaSeqGet): subsequence cannot be larger than 16571
Error getting subseq for CUFF.42374.1 (2..16614)!

Any help is appreciated....
02-27-2013, 02:26 PM   #7
johnwu
Junior Member
 
Location: Taiwan

Join Date: Jun 2011
Posts: 5

I got the same cuffmerge error too.

I mapped reads to the genome with tophat 2.0.6, then assembled transcripts with cufflinks 2.0.2. Both steps were successful.

However, when I tried to merge the transcripts.gtf files from all my samples with cuffmerge 2.0.2, it failed with these error messages:

Error (GFaSeqGet): subsequence cannot be larger than 100
Error getting subseq for CUFF.63509.1 (1..103)!
[FAILED]
Error: could not execute cuffcompare

Strangely, the CUFF.63509.1 transcript is located on chromosome 8, which is far longer than 100 bp (148491826 bp).


8 Cufflinks transcript 58753100 58756101 1000 - . gene_id "CUFF.63509"; transcript_id "CUFF.63509.1"; FPKM "0.3200324464"; frac "0.180108"; conf_lo "0.246484"; conf_hi "0.393581"; cov "5.392457";
8 Cufflinks exon 58753100 58756101 1000 - . gene_id "CUFF.63509"; transcript_id "CUFF.63509.1"; exon_number "1"; FPKM "0.3200324464"; frac "0.180108"; conf_lo "0.246484"; conf_hi "0.393581"; cov "5.392457";


chromosome 8 info:

>8 dna:chromosome chromosome:Sscrofa10.2:8:1:148491826:1 REF

Does anyone have a solution to this problem? Any help is appreciated. Thanks.

Last edited by johnwu; 02-28-2013 at 02:49 PM.
03-11-2013, 12:09 PM   #8
DJParker
Junior Member
 
Location: UK

Join Date: Jan 2012
Posts: 7

Hello

Just to add weight to this: I got the same cuffmerge error too. I mapped my reads back to my reference as usual, but now I get this error.

Has anyone found a solution yet?

Darren
03-13-2013, 08:59 AM   #9
fongchun
Member
 
Location: Vancouver, BC

Join Date: May 2011
Posts: 55

I am guessing no one has found a solution? I also have the same problem...
03-13-2013, 09:26 AM   #10
fongchun
Member
 
Location: Vancouver, BC

Join Date: May 2011
Posts: 55

I found this post on biostar, in case it helps anyone. I think the problem, for me at least, might be that I aligned my RNA-seq libraries to a different fasta file than the one I am passing into cufflinks.
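One way to test for a reference mismatch like this is to compare the sequence lengths recorded in the BAM header with those in the FASTA index. Below is a minimal sketch in Python, assuming the header was saved with `samtools view -H aln.bam > header.sam` and the FASTA was indexed with `samtools faidx ref.fa`. The file names and function names are placeholders for illustration, not part of any Tuxedo tool.

```python
import sys


def sq_lengths(header_lines):
    """Parse the @SQ lines of a SAM header into {sequence name: length}."""
    out = {}
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        fields = line.rstrip("\n").split("\t")[1:]
        tags = dict(f.split(":", 1) for f in fields if ":" in f)
        if "SN" in tags and "LN" in tags:
            out[tags["SN"]] = int(tags["LN"])
    return out


def fai_lengths(fai_lines):
    """Parse a samtools .fai index (name, length, ...) into {name: length}."""
    out = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            out[fields[0]] = int(fields[1])
    return out


def compare(bam, fasta):
    """Return one message per sequence that is missing or differs in length."""
    problems = []
    for name in sorted(bam):
        if name not in fasta:
            problems.append("%s: in BAM header but not in FASTA index" % name)
        elif bam[name] != fasta[name]:
            problems.append("%s: BAM header says %d, FASTA index says %d"
                            % (name, bam[name], fasta[name]))
    return problems


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python check_ref.py header.sam ref.fa.fai
    with open(sys.argv[1]) as h, open(sys.argv[2]) as f:
        for msg in compare(sq_lengths(h), fai_lengths(f)):
            print(msg)
```

If this prints nothing, the two references agree on names and lengths, and the cause is more likely to be in the alignments or the assembled coordinates themselves.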
03-13-2013, 03:36 PM   #11
johnwu
Junior Member
 
Location: Taiwan

Join Date: Jun 2011
Posts: 5

I found that for some reason cufflinks would assemble some frags/transcripts/contigs that are longer than chromosome length.

After removing/modifying those records from transcript.gtf generated by cufflinks, cuffmerge could proceed without any problem.

Here's an example from my project:

chromosome/scaffold/contig name : GL893313.2
chromosome/scaffold/contig length : 161573
exon coordinate: 161578 ( > chromosome length )

GL893313.2 Cufflinks exon 161457 161578 1000 + . gene_id "CUFF.77262"; transcript_id "CUFF.77262.1"; exon_number "3"; FPKM "1.1077759277"; frac "1.000000"; conf_lo "1.008891"; conf_hi "1.206661"; cov "18.665712";


CORRECTED:

GL893313.2 Cufflinks exon 161457 161573 1000 + . gene_id "CUFF.77262"; transcript_id "CUFF.77262.1"; exon_number "3"; FPKM "1.1077759277"; frac "1.000000"; conf_lo "1.008891"; conf_hi "1.206661"; cov "18.665712";

In my case, it seems that cufflinks only generated these over-long frags/contigs when assembling on genome sequence contigs (not chromosomes).
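The manual correction above can be automated. Here is a minimal sketch in Python, assuming a samtools `.fai` index of the reference (columns: name, length, ...) and a tab-separated GTF as written by cufflinks (the records quoted above lost their tabs in the forum formatting). The function and script names are made up for illustration. Records whose end exceeds the sequence length are trimmed to that length, and records that start beyond the end are dropped.

```python
import sys


def read_fai_lengths(fai_lines):
    """Map sequence name -> length from the lines of a samtools .fai index."""
    lengths = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths


def clamp_gtf_line(line, lengths):
    """Return (new_line, changed).

    new_line is None when the whole record lies beyond the sequence end.
    Comment lines, blank lines and unknown sequences pass through unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line, False
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9 or fields[0] not in lengths:
        return line, False
    start, end = int(fields[3]), int(fields[4])
    limit = lengths[fields[0]]
    if end <= limit:
        return line, False
    if start > limit:
        return None, True  # nothing left to keep
    fields[4] = str(limit)
    return "\t".join(fields) + "\n", True


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage:
    #   python clamp_gtf.py ref.fa.fai transcripts.gtf > transcripts.clamped.gtf
    with open(sys.argv[1]) as fai:
        seq_lengths = read_fai_lengths(fai)
    with open(sys.argv[2]) as gtf:
        for record in gtf:
            fixed, changed = clamp_gtf_line(record, seq_lengths)
            if changed:
                sys.stderr.write("adjusted: " + record)
            if fixed is not None:
                sys.stdout.write(fixed)
```

Adjusted records are echoed to stderr so they can be reviewed before the clamped file is passed to cuffmerge. Note that this only hides the symptom; the underlying alignments still run off the end of the contig.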
03-15-2013, 03:30 AM   #12
DJParker
Junior Member
 
Location: UK

Join Date: Jan 2012
Posts: 7

Hello,

So I have been troubleshooting my problem with Geo Pertea, and basically we found that the problem arises because CLC (which I mapped my reads with) merely soft-clips reads when they map past the end of the reference contig.

Take for example this (partial) SAM record:

502_1735_1931_F3 16 scaffold_10212 558 0 36S39M [etc.]

CLC aligned only 39 bases of this read to the end of this short contig (596 bases); the remaining 36 nt of the read hang beyond the contig boundary and are thus reported as soft clipped (which makes sense). Unfortunately, it looks like Cufflinks didn't exclude the soft-clipped part from further consideration when determining the boundaries of the transfrag. The Tuxedo pipeline (specifically TopHat) does not normally deal with soft-clipped alignments, so I guess that's why we didn't get to test and make Cufflinks work properly with such alignments.
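To see the arithmetic for the record above, here is a small Python sketch. It computes the reference end of an alignment the way the SAM format defines it (soft clips do not consume reference bases) and, for comparison, the end that would result if soft-clipped bases were counted as aligned. The second function is only a guess at how the overshoot could arise, not a description of Cufflinks internals, and all function names are made up for illustration.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


def parse_cigar(cigar):
    """Split a CIGAR string such as '36S39M' into [(36, 'S'), (39, 'M')]."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def ref_end(pos, cigar):
    """Last reference position covered (1-based, inclusive)."""
    span = sum(n for n, op in parse_cigar(cigar) if op in REF_CONSUMING)
    return pos + span - 1


def end_if_clips_counted(pos, cigar):
    """Assumed faulty end: soft-clipped bases treated as if they were aligned."""
    span = sum(n for n, op in parse_cigar(cigar)
               if op in REF_CONSUMING or op == "S")
    return pos + span - 1


def overshoots(pos, cigar, contig_length):
    """True if counting soft clips would push the alignment past the contig."""
    return end_if_clips_counted(pos, cigar) > contig_length


# Values from the SAM record above: POS 558, CIGAR 36S39M, contig of 596 bases.
print(ref_end(558, "36S39M"))
print(end_if_clips_counted(558, "36S39M"))
print(overshoots(558, "36S39M", 596))
```

With the values from the record, the properly computed end is 558 + 39 - 1 = 596, exactly the contig length, while counting the 36 clipped bases gives 558 + 75 - 1 = 632. Filtering or trimming alignments flagged by `overshoots` before running Cufflinks would be one possible workaround under this assumption.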
05-19-2014, 09:02 PM   #13
seqing.help
Junior Member
 
Location: Davis, CA

Join Date: Aug 2012
Posts: 1

Courtesy of Alex Dobin, this might be useful to those dealing with this problem.

https://groups.google.com/forum/#!to...ar/Ta1Z2u4bPfc