samtools sorting outfile is not as large as input file
I sorted a BAM file of paired-end genomic data with samtools sort. The input BAM is about 85 GB. I sorted by read name rather than by chromosome coordinate, and the output file is about 79 GB. Where did the other 6 GB from the input file go? Has anyone seen this kind of discrepancy before?
Thanks. |
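That's the magic of sorting a .bam; it comes out smaller because it compresses better. A BAM is stored compressed, so its size depends on the order of the records as well as how many there are.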
If you do flagstat on the .bam before and after sorting, you'll see that they have the same number of reads. |
Definitely make sure you have the right number of reads and that the sort did not terminate prematurely.
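One quick way to look for a sort that stopped early is to check for the BGZF end-of-file marker, the 28-byte empty block that samtools writes at the end of a complete BAM. Below is a minimal Python sketch of that check. The function name `has_bgzf_eof` is made up for illustration. A missing marker suggests truncation; a present marker does not prove the rest of the file is intact. As far as I know, `samtools quickcheck` performs a similar test.

```python
import os

# 28-byte empty BGZF block that marks the end of a complete BAM file.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43"
    " 02 00 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker.

    `has_bgzf_eof` is a hypothetical helper name, not part of samtools.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the last 28 bytes; the file itself may be tens of GB.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

Calling `has_bgzf_eof("sorted.bam")` (file name assumed) only reads the tail of the file, so it is cheap even on a 79 GB BAM.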
|
Thanks a lot. How would I know if the sorted file is complete?
|
Run samtools flagstat, or use samtools view -c to count the reads, on both the input and the sorted file and compare the numbers.
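If you want to script that comparison, here is a rough Python sketch that wraps `samtools view -c`. It assumes samtools is on your PATH. The helper names `count_reads` and `same_read_count` are made up for illustration. Note that `view -c` counts alignment records, so secondary and supplementary alignments are included in the total.

```python
import subprocess


def count_reads(bam_path, samtools="samtools"):
    """Return the record count reported by `samtools view -c`.

    Hypothetical helper; assumes a samtools executable is available.
    """
    result = subprocess.run(
        [samtools, "view", "-c", bam_path],
        check=True,           # raise if samtools exits with an error
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


def same_read_count(before, after, counter=count_reads):
    """Return True if both BAM files report the same number of records."""
    return counter(before) == counter(after)
```

For example, `same_read_count("input.bam", "sorted.bam")` (file names assumed) should return True after a sort that finished cleanly. On files this large each count takes a while, since samtools has to decompress the whole file.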
|