SEQanswers

Old 12-13-2014, 09:18 AM   #1
drdna
Member
 
Location: Kentucky

Join Date: May 2012
Posts: 76
Samtools sorting incorrectly

Don't know what's going on. Samtools has forgotten how to sort. I used to have no problem but it is now ordering mouse chromosomal hits in the following order:

chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr1
chr1_GL456210_random
chr1_GL456211_random
chr1_GL456212_random
chr1_GL456213_random
chr1_GL456221_random
chr2
chr3
chr4
chr4_GL456216_random
chr4_GL456350_random
chr4_JH584292_random
chr4_JH584293_random
chr4_JH584294_random
chr4_JH584295_random
chr5
chr5_GL456354_random
chr5_JH584296_random
chr5_JH584297_random
chr5_JH584298_random
chr5_JH584299_random
chr6
chr7
chr7_GL456219_random
chr8
chr9
chrM
chrUn_GL456239
chrUn_GL456359
chrUn_GL456360
chrUn_GL456366
chrUn_GL456367
chrUn_GL456368
chrUn_GL456370
chrUn_GL456372
chrUn_GL456378
chrUn_GL456379
chrUn_GL456381
chrUn_GL456382
chrUn_GL456383
chrUn_GL456385
chrUn_GL456387
chrUn_GL456389
chrUn_GL456390
chrUn_GL456392
chrUn_GL456393
chrUn_GL456394
chrUn_GL456396
chrUn_JH584304
chrX
chrX_GL456233_random
chrY
chrY_JH584300_random
chrY_JH584301_random
chrY_JH584302_random
chrY_JH584303_random


Like I said, this is a new issue. All previous processing was chr1, chr2, chr3, etc. This is causing problems when I try to merge the new sorted bam with old (correctly) sorted bam files.

I'm suspecting this might be due to a newer samtools installation. How would I find out what version is installed?

Last edited by drdna; 12-13-2014 at 09:35 AM.
Old 12-13-2014, 10:03 AM   #2
blancha
Senior Member
 
Location: Montreal

Join Date: May 2013
Posts: 367

Code:
samtools --version

The chromosome sort order in the file produced by samtools sort should be based "on the order in which the @SQ lines appear in the header of the unsorted BAM file."

I would check the header of the unsorted BAM file before putting the blame on "samtools sort".

Code:
samtools view -H unsorted.bam

You can reorder the chromosomes with Picard tools' ReorderSam. You'll need a reference FASTA file with the chromosomes in the desired order, e.g. karyotypic.
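
To make the header check above a bit more concrete, here is a minimal sketch that prints only the reference names, in the order their @SQ lines appear. The helper name sq_order and the tiny sample header are made up for illustration; the function uses only awk, so it runs without samtools.

```shell
# Print reference names in the order their @SQ lines appear in a SAM header.
# Reads header text on stdin. sq_order is a hypothetical helper name.
sq_order() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# Made-up three-sequence header, only so the sketch can run on its own.
# Prints chr10, chr1, chr2 (one per line), i.e. the header order as-is.
printf '@HD\tVN:1.0\tSO:unsorted\n@SQ\tSN:chr10\tLN:1000\n@SQ\tSN:chr1\tLN:2000\n@SQ\tSN:chr2\tLN:1500\n' | sq_order
```

On a real file this would be fed from the header command above, e.g. samtools view -H unsorted.bam | sq_order. If that output is already chr10, chr11, ..., then the sorted BAM is simply following the header.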
Old 12-13-2014, 11:07 AM   #3
drdna
Member
 
Location: Kentucky

Join Date: May 2012
Posts: 76

It's definitely a samtools error: File.sam was created using bowtie2. The .sam header is present with all entries in correct order.

File.sam was processed using: samtools view -bS File.sam > File.bam.
The header was stripped out of the resulting File.bam. This was never a problem in the past. None of my earlier .bam files had headers prior to sorting. I'm going to give it a try using samtools view -bSh to see if I can retain the header in the .bam file.
Old 12-13-2014, 11:09 AM   #4
drdna
Member
 
Location: Kentucky

Join Date: May 2012
Posts: 76

Quote:
Originally Posted by blancha
Code:
samtools --version

Are you sure that samtools --version is the correct command? I had already tried this and it gave me a "[main] unrecognized command '--version'" error.
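
For what it's worth, that "unrecognized command" message is what the older 0.1.x samtools builds print, since they have no --version option. Those builds show their version in the usage banner that appears when samtools is run with no arguments. A rough sketch for pulling the version out of that banner follows; extract_version is a hypothetical helper name, and the banner text in the demo line is sample input rather than output from this user's machine.

```shell
# Pull the version string out of samtools usage text read on stdin.
# extract_version is a hypothetical helper name.
extract_version() {
    awk '$1 == "Version:" { print $2; exit }'
}

# Sample banner text in the style of a 0.1.x build, so the sketch runs
# without samtools installed. Prints 0.1.19-44428cd
printf 'Program: samtools (Tools for alignments in the SAM format)\nVersion: 0.1.19-44428cd\n' | extract_version
```

With a real installation the call would be samtools 2>&1 | extract_version; the 2>&1 is there because the old builds write the usage banner to stderr.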
Old 12-13-2014, 12:52 PM   #5
drdna
Member
 
Location: Kentucky

Join Date: May 2012
Posts: 76

It works if you force inclusion of the header when converting sam to bam:

Code:
samtools view -bSH File.sam > File.bam

Thanks for the heads up about the header, blancha.
Old 12-13-2014, 01:37 PM   #6
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693

Option -H has no effect with -b, and a mapped BAM file always has @SQ lines. There is no way to strip them off. Please attach a SAM example if you believe samtools is wrong.

Last edited by lh3; 12-13-2014 at 01:40 PM.
Old 12-13-2014, 01:55 PM   #7
drdna
Member
 
Location: Kentucky

Join Date: May 2012
Posts: 76

Quote:
Originally Posted by lh3
Option -H has no effect with -b, and a mapped BAM file always has @SQ lines. There is no way to strip them off. Please attach a SAM example if you believe samtools is wrong.

Oops, I meant:

Code:
samtools view -bSh File.sam > File.bam
Old 12-16-2014, 05:01 AM   #8
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Samtools sort doesn't reorder the header. If the header is in a weird order, then that's because the reference fasta file is in that order (bowtie2 will output header lines in the same order as it encounters them). If you want to reorder the header too, then there's a Picard tools command for that.
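
One way to check this is to list the sequence names in the reference FASTA in file order and compare them with the @SQ order in the BAM header. A minimal sketch, assuming a plain-text FASTA on stdin; fasta_order is a hypothetical helper name and the three-record FASTA is made up so the example runs on its own.

```shell
# Print FASTA sequence names in the order the records appear in the file.
# Reads FASTA text on stdin. fasta_order is a hypothetical helper name.
fasta_order() {
    awk '/^>/ { print substr($1, 2) }'
}

# Made-up FASTA with chr10 ahead of chr1, mimicking the order in question.
# Prints chr10, chr1, chr2 (one per line).
printf '>chr10 mouse\nACGT\n>chr1\nGGCC\n>chr2\nTTAA\n' | fasta_order
```

Run against the FASTA the bowtie2 index was built from (fasta_order < genome.fa, where genome.fa is a placeholder name), the list should match the @SQ order shown by samtools view -H File.bam. If the FASTA itself starts chr10, chr11, ..., that is where the order comes from.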