SEQanswers

05-01-2013, 05:07 AM   #1
nr23
Member
Location: Ireland
Join Date: Oct 2012
Posts: 42

Cufflinks sorting Error

Hi,

I've used Cufflinks a load of times with sorted .bam alignments from both TopHat and Stampy. Now, even though my .bam files are sorted (samtools sort), I get this error when running Cufflinks:


Error: this SAM file doesn't appear to be correctly sorted!
Cufflinks requires that if your file has SQ records in
the SAM header that they appear in the same order as the chromosomes names
in the alignments.
If there are no SQ records in the header, or if the header is missing,
the alignments must be sorted lexicographically by chromsome
name and by position.


I tried using .sam and sorting with
$ sort -k 3,3 -k 4,4n hits.sam > hits.sam.sorted
but the sort takes hours and doesn't seem to finish, or even produce a file.
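
For readers following along, here is a minimal Python sketch of what that two-key sort is meant to achieve: order by reference name (column 3), then numerically by position (column 4). Plain `sort` treats the `@` header lines like any other line, so the sketch sets them aside first and, if `@SQ` records are present, follows their order as the error message asks. The function name `sort_sam_lines` is hypothetical, tab-separated SAM text is assumed, and everything is held in memory, so this only illustrates the ordering on a small subset; it is not a fix for the multi-hour runtime.

```python
def sort_sam_lines(lines):
    """Return SAM lines with the header first and alignments ordered by
    reference name (column 3), then numeric position (column 4).

    Hypothetical helper for illustration; assumes tab-separated SAM text.
    """
    header = [ln for ln in lines if ln.startswith("@")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("@")]

    # Collect reference names in the order the @SQ records declare them.
    sq_names = []
    for ln in header:
        if ln.startswith("@SQ"):
            for field in ln.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    sq_names.append(field[3:])
    rank = {name: i for i, name in enumerate(sq_names)}

    def key(ln):
        cols = ln.rstrip("\n").split("\t")
        rname, pos = cols[2], int(cols[3])
        if rank:
            # Follow header order; names missing from the header go last.
            return (rank.get(rname, len(rank)), rname, pos)
        # No @SQ records: plain lexicographic order by name, then position.
        return (0, rname, pos)

    return header + sorted(records, key=key)
```

Usage on a small test file might look like `open("subset.sorted.sam", "w").writelines(sort_sam_lines(open("subset.sam").readlines()))`, where both file names are placeholders.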

Here are the first few lines of my .sam file:

@HD VN:1.0 SO:unsorted
@PG ID:dvtgm PN:stampy VN:1.0.21_(r1683) CL:-t 12 -g [...] -h [...] -o [...]
[...]
@CO TM:Tue, 30 Apr 2013 10:15:08 IST WD:[...] HN:[...]
@SQ SN:gi|AB014461.11|_X.l_mRNA_for_Zic-related-2,_complete_cds LN:2719 AS:Uni_blash_heads SP:X.laevis
@SQ SN:gi|L09728.12|XELDLL4_X.l_putative_transcription_factor_DLL4_mRNA,_complete_cds LN:1087 AS:Uni_blash_heads SP:X.laevis
@SQ SN:gi|L09730.13|XELDLL2_X.l_putative_transcription_factor_DLL2_mRNA,_3'_end LN:1485 AS:Uni_blash_heads SP:X.laevis
@SQ SN:gi|L11263.14|XEL1A11RTP_X.l_71.0_kDa_protein_(retrotransposon_1a11_related)_mRNA,_complete_cds LN:4410 AS:Uni_blash_heads SP:X.laevis
@SQ SN:gi|AB021705.15|_X.l_mRNA_for_XMAP4,_complete_cds LN:3920 AS:Uni_blash_heads SP:X.laevis
@SQ SN:gi|AB022087.16|_X.l_mRNA_for_cytochrome_P450,_complete_cds,_clone_MC1 LN:2618 AS:Uni_blash_heads SP:X.laevis
@SQ SN:gi|AB022088.17|_X.l_mRNA_for_cytochrome_P450,_complete_cds,_MC2 LN:2905 AS:Uni_blash_heads SP:X.laevis
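
Note that the `@HD` line above declares `SO:unsorted`, and the `@SQ` records fix a reference order. A small Python sketch for pulling both out of a header is given below. The name `summarize_sam_header` is hypothetical, and the code assumes the fields are tab-separated in the actual file (the forum display has flattened them to spaces). Whether Cufflinks consults the `SO` tag at all is not something this thread establishes.

```python
def summarize_sam_header(lines):
    """Return (sort_order, sq_names) read from the leading '@' header lines.

    sort_order is the SO value from @HD (or None if absent); sq_names lists
    the SN values of the @SQ records in the order they appear.
    Hypothetical helper; assumes tab-separated SAM header fields.
    """
    sort_order = None
    sq_names = []
    for line in lines:
        if not line.startswith("@"):
            break  # header ends at the first alignment line
        fields = line.rstrip("\n").split("\t")
        # Each header field after the record type is TAG:VALUE.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if fields[0] == "@HD":
            sort_order = tags.get("SO")
        elif fields[0] == "@SQ" and "SN" in tags:
            sq_names.append(tags["SN"])
    return sort_order, sq_names
```

Calling it as `summarize_sam_header(open("hits.sam"))` reads only as far as the first alignment line, so it stays cheap even on a large file.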



Really baffled by this - it's never been a problem in the past...

Many thanks,

N
05-01-2013, 06:04 AM   #2
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 7,079

This may not be directly related, but have you switched to the newest Cufflinks (v.2.1.1) recently?

There appears to be some odd problem with the newest version (I am getting seg faults). Going back to the last release (v.2.0.2) works.
05-01-2013, 06:09 AM   #3
nr23
Member
Location: Ireland
Join Date: Oct 2012
Posts: 42

Hi,

Yes, I did today. However, I'm aligning to a Unigene reference with 30k-odd sequences, so my headers are too large for v.1. I'll give v.2.0.2 a go.

Thanks,

N
05-07-2013, 02:32 AM   #4
nr23
Member
Location: Ireland
Join Date: Oct 2012
Posts: 42

Hi,

I re-ran the manual sort on my .sam files:

$ sort -k 3,3 -k 4,4n hits.sam > hits.sam.sorted

and Cufflinks now runs OK. It takes ~4 hours per condition (~80M 90 bp paired-end reads), which seems ridiculously long.

However, when I pass the Cufflinks output to cuffmerge ('assemblies.txt' contains the locations of the 'transcripts.gtf' files to compare):

cuffmerge assemblies.txt

I get the same error:

[10:25:19] Inspecting reads and determining fragment length distribution.

Error: this SAM file doesn't appear to be correctly sorted!
current hit is at gi|11066191|gb|AF196575.15187|_Xenopus_laevis_class_IV_POU-homeodomain_protein_(Brn3a)_mRNA,_partial_cds:7, last one was at gi|110645744|gb|BC118825.127439|_Xenopus_tropicalis_mitochondrial_ribosomal_protein_L22,_mRNA_(cDNA_clone_MGC:146493_IMAGE:7793940),_complete_cds:493
Cufflinks requires that if your file has SQ records in
the SAM header that they appear in the same order as the chromosomes names
in the alignments.
If there are no SQ records in the header, or if the header is missing,
the alignments must be sorted lexicographically by chromsome
name and by position.

[FAILED]
Error: could not execute cufflinks
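
The rule quoted in that error message can be checked independently of Cufflinks. The Python sketch below walks a SAM stream and reports the first pair of alignments that breaks the rule as the message states it: `@SQ` header order when `@SQ` records exist, otherwise lexicographic order by reference name, with position as the tie-breaker. The name `first_order_violation` is hypothetical, tab-separated SAM text is assumed, and Cufflinks' internal check may differ in details, so treat this as a diagnostic aid only.

```python
def first_order_violation(lines):
    """Return ((name, pos), (prev_name, prev_pos)) for the first alignment
    that is out of order, or None if no violation is found.

    Hypothetical helper; assumes tab-separated SAM text with all header
    lines appearing before the first alignment line.
    """
    sq_rank = {}
    prev = None
    for line in lines:
        if line.startswith("@"):
            if line.startswith("@SQ"):
                for field in line.rstrip("\n").split("\t")[1:]:
                    if field.startswith("SN:"):
                        sq_rank.setdefault(field[3:], len(sq_rank))
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 4 or cols[2] == "*":
            continue  # skip malformed lines and unmapped reads
        name, pos = cols[2], int(cols[3])
        # With @SQ records compare by header rank, otherwise by the name.
        ref_key = sq_rank.get(name, len(sq_rank)) if sq_rank else name
        cur = (ref_key, pos)
        if prev is not None and cur < prev[0]:
            return (name, pos), prev[1]
        prev = (cur, (name, pos))
    return None
```

Running it as `first_order_violation(open("hits.sam.sorted"))` on the file that triggers the error should name the same kind of "current hit / last one" pair that the log reports, which makes it easier to see whether header order or name order is being violated.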


Why is cuffmerge running cufflinks?

Any thoughts?