Hi,
I am subsetting a VCF by positions stored in a tab-delimited file using bcftools, and I noticed the program is very slow. Here is the command:
bcftools view -R ./chr1.passing.markers.txt chr1.vcf.gz -Oz -o ./chr1.reduced.vcf.gz
where chr1.passing.markers.txt is a headerless, tab-delimited file of chromosome and position for multiple positions (68K in total). The original VCF has 524K positions.
The bcftools command is not using NFS (it reads from and writes to local disk, and the executable runs from the analysis directory), and there are no competing jobs. It is taking a really long time: it is still running after 120 minutes.
I wrote an equivalent Perl script that completes this in 10 minutes, even though it works on flat files and so should, if anything, be slower than bcftools with its binary file format.
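For reference, the flat-file approach amounts to roughly the following. This is a minimal Python sketch rather than the Perl script itself; the function names `load_positions` and `filter_vcf` are made up for illustration, and it assumes the positions file fits in memory and that the VCF is gzip/bgzip compressed.

```python
import gzip


def load_positions(path):
    """Read headerless, tab-delimited chromosome/position pairs into a set."""
    wanted = set()
    with open(path) as fh:
        for line in fh:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:
                # Keep the position as a string so it compares directly
                # against the POS column of the VCF.
                wanted.add((fields[0], fields[1]))
    return wanted


def filter_vcf(vcf_in, positions_path, vcf_out):
    """Stream the VCF once and keep only records at the wanted positions.

    Returns the number of records written. The output is plain gzip,
    not BGZF, so it cannot be indexed with tabix as-is.
    """
    wanted = load_positions(positions_path)
    kept = 0
    with gzip.open(vcf_in, "rt") as src, gzip.open(vcf_out, "wt") as dst:
        for line in src:
            if line.startswith("#"):
                # Header lines are copied through unchanged.
                dst.write(line)
                continue
            chrom, pos, _ = line.split("\t", 2)
            if (chrom, pos) in wanted:
                dst.write(line)
                kept += 1
    return kept
```

It would be called as `filter_vcf("chr1.vcf.gz", "chr1.passing.markers.txt", "chr1.reduced.vcf.gz")`. It makes a single sequential pass over the VCF with a set lookup per record and never touches an index.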
The bcftools version is up to date.
Does anyone have an idea how I can speed this up or what might be wrong?
Thanks,
Craig