View Single Post
Old 04-23-2012, 09:45 AM   #1
Kcornelius
Member
 
Location: Los Angeles

Join Date: Apr 2012
Posts: 14
Default Cuffmerge error: loading reference annotation failed?

Hi,

I am trying to run cuffmerge for quite some time now but so far I always get the same message:

Code:
[Thu Apr 19 18:41:47 2012] Beginning transcriptome assembly merge
-------------------------------------------

[Thu Apr 19 18:41:47 2012] Preparing output location ./merged_asm/
[Thu Apr 19 18:41:55 2012] Converting GTF files to SAM
[18:41:55] Loading reference annotation.
[18:42:01] Loading reference annotation.
[Thu Apr 19 18:42:13 2012] Quantitating transcripts
You are using Cufflinks v1.3.0, which is the most recent release.
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g Homo_sapiens.GRCh37.62.filter1.sorted.gtf -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 12 ./merged_asm/tmp/mergeSam_file54ILK4 
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_file54ILK4 doesn't appear to be a valid BAM file, trying SAM...
[18:42:13] Loading reference annotation.
        [FAILED]
Error: could not execute cufflinks
So, reading in the forum here I applied two strategies to try to solve the error:

I applied the change in the python script for the header_for_chrom_info function as mentioned by damiankao in another threat:
http://seqanswers.com/forums/showthread.php?t=11694

Furthermore, I sorted the reference GTF, so that they have all the same order as the merged SAM file.

I followed so far the tuxedo protocol given by Cole Trapnell et al., my version is:

cufflinks-1.3.0.

I guess that I am still missing something but I wonder if there is a script now that works for reference gtf files ( In my case Homo_sapiens.GRCh37.62.gtf where I changed the Chromosome IDs to chr1, chr 2, ... chrM, chrX, chrY.

Here is the header of the mergeSam file:

Code:
@HD     VN:1.0  SO:coordinate
@SQ     SN:chr1 LN:     249231242
@SQ     SN:chr10        LN:     135516024
@SQ     SN:chr11        LN:     134945793
@SQ     SN:chr12        LN:     133815135
@SQ     SN:chr13        LN:     115099423
@SQ     SN:chr14        LN:     107288019
@SQ     SN:chr15        LN:     102519298
@SQ     SN:chr16        LN:     90237391
@SQ     SN:chr17        LN:     81188573
@SQ     SN:chr18        LN:     78005429
@SQ     SN:chr19        LN:     59111168
@SQ     SN:chr2 LN:     243102304
@SQ     SN:chr20        LN:     62959443
@SQ     SN:chr21        LN:     48111157
@SQ     SN:chr22        LN:     51239737
@SQ     SN:chr3 LN:     197955247
@SQ     SN:chr4 LN:     191013476
@SQ     SN:chr5 LN:     180899431
@SQ     SN:chr6 LN:     171055065
@SQ     SN:chr7 LN:     159026067
@SQ     SN:chr8 LN:     146281416
@SQ     SN:chr9 LN:     141150148
@SQ     SN:chrM LN:     16023
@SQ     SN:chrX LN:     155257848
@SQ     SN:chrY LN:     59002251
@PG     ID:cuffmerge    VN:1.0.0
So, I wonder if it is possible, given a bowtie index with chr1, chr2. etc. entries instead of redoing the bowtie index with the ensemble identifiers. Would be great if someone would know the missing link

Marc
Kcornelius is offline   Reply With Quote