I have some SAM files from TopHat and I want to assemble them into transcripts. Should I merge all the SAM files before running Cufflinks, or run Cuffmerge after running Cufflinks on each SAM file separately?
-
Header problems... cuffmerge
Hi all,
I'm getting a header error when using Cuffmerge, and I was wondering if anyone has been dealing with this recently (I'm using cufflinks v1.1.0/Sept 2011). I'm getting the following error:
[15:38:37] Inspecting reads and determining fragment length distribution.
Error: this SAM file doesn't appear to be correctly sorted!
current hit is at Contig_100_consensus_sequence:0, last one was at contig09241:34
Cufflinks requires that if your file has SQ records in
the SAM header that they appear in the same order as the chromosomes names
in the alignments.
If there are no SQ records in the header, or if the header is missing,
the alignments must be sorted lexicographically by chromosome
name and by position.
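The ordering rule quoted in the error message can be checked directly on a SAM file. The following is only a minimal sketch, not what Cufflinks itself does: the function name `first_unsorted_record` is hypothetical, and it assumes plain tab-separated SAM text with the reference name in column 3 and the position in column 4.

```python
def first_unsorted_record(sam_lines):
    """Return (line_number, reason) for the first out-of-order alignment, or None.

    If the header has @SQ records, alignments must follow the @SQ order.
    Otherwise they must be sorted by reference name (code-point order)
    and then by position.
    """
    sq_rank = {}      # reference name -> rank taken from @SQ header order
    last_key = None
    for lineno, line in enumerate(sam_lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            if line.startswith("@SQ"):
                for field in line.split("\t")[1:]:
                    if field.startswith("SN:"):
                        sq_rank.setdefault(field[3:], len(sq_rank))
            continue
        cols = line.split("\t")
        rname, pos = cols[2], int(cols[3])
        if rname == "*":
            continue  # unmapped read, no reference to compare
        if sq_rank:
            if rname not in sq_rank:
                return lineno, "reference %s is missing from the @SQ header" % rname
            key = (sq_rank[rname], pos)
        else:
            key = (rname, pos)
        if last_key is not None and key < last_key:
            return lineno, "%s:%d comes after a later-sorting record" % (rname, pos)
        last_key = key
    return None
```

For a file on disk this could be called as `first_unsorted_record(open("accepted_hits.sam"))`, where the file name is just a placeholder.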
Is this still the same problem in the script? Or could this error message come about if there is anything wrong with the data file? I've tried the fixes described here and it still runs with the same error.
I'm generating my .sam files with bowtie2 and running them through cufflinks with no problems, and this is occurring for both 454 and Illumina generated data. Oh, and I don't have a reference gtf file.
Thanks!
-Alice
-
Hi, aliceb
Yes, I have run into this issue too. What I did was sort the merged bam file (or sam file, I don't remember) generated by cuffmerge and then run cufflinks. But it is not a good idea, since we may have to sort the file every time we run cuffmerge. So I wonder if there is a better way?
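The sorting workaround described here can be sketched in a few lines. This is an illustration under assumptions, not the exact command used above: `sort_sam_lines` is a hypothetical helper, it works on plain SAM text, and it only implements the header-less rule from the error message (reference name in code-point order, then position). If the header carries @SQ records, the alignments would need to follow the header order instead.

```python
def sort_sam_lines(sam_lines):
    """Return header lines unchanged, followed by alignments sorted by
    (reference name, position). Names compare by code point, which for
    ASCII names matches a bytewise (C locale) sort."""
    lines = [line.rstrip("\n") for line in sam_lines]
    header = [line for line in lines if line.startswith("@")]
    records = [line for line in lines if line and not line.startswith("@")]

    def sort_key(line):
        cols = line.split("\t")
        return cols[2], int(cols[3])  # RNAME, POS

    return header + sorted(records, key=sort_key)
```

For BAM files, `samtools sort` is the usual tool for coordinate sorting; the sketch above is only meant to make the ordering rule concrete.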
-
Hmmm, it seems like some people are having trouble getting a cuffmerge file to run in later analyses, while others are having trouble getting cuffmerge to execute at all.
I'm getting this header problem while running cuffmerge, so there is no output file to work with. Is anyone else getting stuck here?
Cheers,
Alice
-
Cuffmerge problems...
I'm stuck here too...
My reference file is a .gff not a .gtf. Is that a problem?
I ran the tophat pipeline until here without errors...but in cuffmerge I got this:
cuffmerge -g Triha.gff -s Triha.fa -p 12 assemblies.txt
[Tue Oct 2 09:32:48 2012] Beginning transcriptome assembly merge
-------------------------------------------
[Tue Oct 2 09:32:48 2012] Preparing output location ./merged_asm/
[Tue Oct 2 09:32:49 2012] Converting GTF files to SAM
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:49] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:49] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:49] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:50] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:50] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:51] Loading reference annotation.
[Tue Oct 2 09:32:51 2012] Quantitating transcripts
cufflinks: /lib64/libz.so.1: no version information available (required by cufflinks)
You are using Cufflinks v2.0.2, which is the most recent release.
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g Triha.gff -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 12 ./merged_asm/tmp/mergeSam_fileO6g28c
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_fileO6g28c doesn't appear to be a valid BAM file, trying SAM...
[09:32:52] Loading reference annotation.
[FAILED]
Error: could not execute cufflinks
Traceback (most recent call last):
File "/usr/local/bin/cuffmerge", line 576, in ?
sys.exit(main())
File "/usr/local/bin/cuffmerge", line 558, in main
cufflinks(output_dir, merged_sam_filename, params.min_isoform_frac, params.ref_gtf)
File "/usr/local/bin/cuffmerge", line 198, in cufflinks
exit(1)
TypeError: 'str' object is not callable
Any suggestion??
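A note on the last line of the traceback: the `TypeError` looks like a side effect rather than the root cause, which is the earlier "could not execute cufflinks". The `in ?` frames suggest an older Python 2 interpreter (this is an assumption from the traceback format), where the built-in `exit` was a plain string, so calling `exit(1)` fails. The sketch below emulates that situation with a hypothetical `legacy_exit` string and contrasts it with `sys.exit`, which is a real function.

```python
import sys

# Assumption: emulate an old interpreter in which the built-in `exit`
# is a plain string instead of a callable object.
legacy_exit = "Use Ctrl-D (i.e. EOF) to exit."


def call_legacy_exit():
    """Calling a string raises TypeError, as in the traceback above."""
    try:
        legacy_exit(1)
    except TypeError as err:
        return str(err)


def call_sys_exit():
    """sys.exit raises SystemExit carrying the requested exit status."""
    try:
        sys.exit(1)
    except SystemExit as err:
        return err.code
```

Either way the script stops, so fixing the `exit` call would only clean up the message; the cufflinks failure itself would still need to be diagnosed.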
-
Cuffmerge returns duplicated @SQ <nul><nul> / blankspaces in mergesam
Using cufflinks on transcript.gtf files from over 90 samples I received exactly the same error as described in the initial post.
I found that in the temporary directory for cuffmerge there is a mergesam file, and in my case there was a problem with one of the @SQ. I had a duplicated @SQ header for one of my chromosomes...
displayed as:
@SQ SN:C07 LN: 38762999
...
@SQ SN:C07 LN: 50454407
in nedit:
@SQ SN:<nul><nul>...<nul><nul>C07 LN: 38762999
...
@SQ SN:C07 LN: 50454407
I solved this by recursively removing one of my 90 transcript.gtf files from the input. Once it was solved by removing my fifth sample, and now cuffmerge insists that the first sample is removed.
I cannot find any errors in the cufflinks gtf files or in the Bowtie references which were used as input, and to me this problem shows no consistency except that it always occurs with C07... Maybe this is some kind of bug in cuffmerge?
cuffmerge -s ../../../out/bwtIndex/bwtRef2.fa -g ../../../out/bwtIndex/bwtRef2.gff -o . -p 4 ../../../out/cuffmerge/2/assembly-manifest.txt 1> ../log/cuffmerge_2cuffmerge.log 2> ../err/cuffmerge_2cuffmerge.err
cufflinks v1.3.0
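A quick way to look for this kind of header damage in the temporary mergesam file is to scan its @SQ lines for NUL bytes and repeated sequence names. This is a minimal sketch, assuming tab-separated header fields; `inspect_sq_headers` is a hypothetical helper, not part of cuffmerge.

```python
def inspect_sq_headers(sam_lines):
    """Return a list of (line_number, message) for suspicious @SQ lines."""
    seen = {}       # cleaned sequence name -> (line number, LN value)
    problems = []
    for lineno, line in enumerate(sam_lines, start=1):
        if not line.startswith("@"):
            break   # header is over once alignments begin
        if not line.startswith("@SQ"):
            continue
        if "\x00" in line:
            problems.append((lineno, "NUL bytes in @SQ line"))
        fields = {}
        for field in line.rstrip("\n").split("\t")[1:]:
            if ":" in field:
                tag, value = field.split(":", 1)
                fields[tag] = value.replace("\x00", "").strip()
        name = fields.get("SN", "")
        length = fields.get("LN", "")
        if name in seen:
            first_line, first_length = seen[name]
            problems.append((lineno, "duplicate SN %s (line %d has LN %s, this line has LN %s)"
                             % (name, first_line, first_length, length)))
        else:
            seen[name] = (lineno, length)
    return problems
```

On the header shown above this would flag both the NUL padding and the second C07 entry with its different length.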
-
I have the exact same problem as the OP.
I'm using the GTF downloaded from the ensembl FTP to merge with 6 transcripts.gtf files produced by cufflinks.
I tried both solutions:
1) LC_ALL=C;EXPORT LC_ALL
2) changing the code in the cuffmerge script
Neither worked; I'm getting the exact same error. Any other ideas out there?
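For context on why the `LC_ALL=C` workaround is suggested at all: the two contig names from the error message in the first post sort differently depending on whether case is taken into account. The sketch below uses a case-insensitive sort as a rough stand-in for a locale-aware sort (that equivalence is an assumption, real locale collation is more involved); the function names are hypothetical.

```python
NAMES = ["contig09241", "Contig_100_consensus_sequence"]


def bytewise_order(names):
    """Code-point order, which for ASCII matches a C-locale sort:
    uppercase 'C' sorts before lowercase 'c'."""
    return sorted(names)


def case_insensitive_order(names):
    """Rough stand-in for a locale-aware sort that ignores case."""
    return sorted(names, key=str.lower)
```

Here `bytewise_order(NAMES)` puts `Contig_100_consensus_sequence` first, while `case_insensitive_order(NAMES)` puts `contig09241` first, which is the order the error message complains about. Note also that in a POSIX shell the builtin is spelled lowercase, as in `LC_ALL=C; export LC_ALL`.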