SEQanswers

XeroxHero69 05-28-2019 10:35 AM

How to regenerate an index file for a gzipped vcf file?
 
I have a large vcf file with genomic data (330 GB) and an index file that accompanies it. I ran the command:

bcftools view -f PASS --threads 8 -r chr9:55252802-55252810 -o output.vcf.gz -O z 722g.990.SNP.INDEL.chrAll.vcf.gz

I get:

[W::hts_idx_load2] The index file is older than the data file: 722g.990.SNP.INDEL.chrAll.vcf.gz.tbi
[W::hts_idx_load2] The index file is older than the data file: 722g.990.SNP.INDEL.chrAll.vcf.gz.tbi


I was told that I need to regenerate my index file using Tabix. I have tried the following command that seemed to work for someone else having a similar problem:

tabix -p vcf 722g.990.SNP.INDEL.chrAll.vcf.gz

but it returned:

tbx_index_build failed: 722g.990.SNP.INDEL.chrAll.vcf.gz



I am not quite sure what command I should be using to regenerate my index file. Any help is greatly appreciated!

r.rosati 05-28-2019 05:32 PM

This specifically ("The index file is older than the data file") is a warning and not an error. If you trust the index, you can in all honesty ignore this spammy message.
Anyways, see if this thread helps:
https://www.biostars.org/p/138514/
(If you want to follow that thread's suggestion, just remember that you have quite a big file to uncompress and re-compress.)
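A rough sketch of what "uncompress and re-compress" could look like. It assumes the tabix failure comes from the file being plain gzip rather than BGZF (bgzip) output, which is one common cause of tbx_index_build errors but is not confirmed for this file. The is_bgzf helper and the output file name are made up for illustration, and the check assumes the BGZF "BC" subfield sits at byte offset 12, which is where bgzip writes it. If htslib's htsfile utility is installed, it reports the compression type as well.

```shell
#!/bin/sh
# Sketch: tell a BGZF (bgzip) file apart from a plain gzip file by its header.
# Assumption: the 'BC' extra subfield is the first one, at byte offset 12.

is_bgzf() {
    # Bytes 0-3 of a BGZF block: gzip magic 1f 8b, deflate 08, FEXTRA flag 04.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    # Bytes 12-13: subfield identifier 'B' 'C' (hex 42 43).
    subfield=$(dd if="$1" bs=1 skip=12 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b0804" ] && [ "$subfield" = "4243" ]
}

# Hypothetical usage on the file from this thread. Left commented out because
# the re-compression step reads and rewrites the whole 330 GB file.
# vcf=722g.990.SNP.INDEL.chrAll.vcf.gz
# if ! is_bgzf "$vcf"; then
#     gzip -dc "$vcf" | bgzip -c > recompressed.vcf.gz   # hypothetical output name
#     tabix -p vcf recompressed.vcf.gz
# fi
```

If is_bgzf succeeds on the original file, re-compressing is unlikely to help and the cause of the tabix error lies elsewhere.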

questor2010 05-29-2019 08:14 PM

You should be able to use bcftools itself to index the file. Just use "bcftools index 722g.990.SNP.INDEL.chrAll.vcf.gz"
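A small sketch of that suggestion, with one caveat: bcftools index writes a .csi index by default, so the -t flag is needed to get a .tbi, and -f is needed to overwrite the index that already exists. The reindex_vcf wrapper name is made up for illustration.

```shell
#!/bin/sh
# Sketch: rebuild a tabix-style (.tbi) index with bcftools.
# -t writes .tbi instead of the default .csi; -f overwrites an existing index.

reindex_vcf() {
    if ! command -v bcftools >/dev/null 2>&1; then
        echo "bcftools not found on PATH" >&2
        return 127
    fi
    bcftools index -t -f "$1"
}

# Usage for the file in this thread (indexing has to read the whole file):
# reindex_vcf 722g.990.SNP.INDEL.chrAll.vcf.gz
```

Without -t, the command would create 722g.990.SNP.INDEL.chrAll.vcf.gz.csi and leave the old .tbi file untouched.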

r.rosati 05-30-2019 06:12 PM

Yes, but... if the index itself is fine, he could just as well run
touch 722g.990.SNP.INDEL.chrAll.vcf.gz.tbi
and the message won't bother him anymore.
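To spell out why touch is enough: the warning is based on the two files' modification times, and touch only bumps the timestamp of the index without rebuilding it. A minimal sketch, where index_is_stale is a made-up helper name and the -nt test is a widely supported shell extension rather than strict POSIX:

```shell
#!/bin/sh
# Sketch: check whether the data file is newer than its index, which is the
# condition the "index file is older than the data file" warning reports.

index_is_stale() {
    # $1 = data file, $2 = index file; true when the data file is newer.
    [ "$1" -nt "$2" ]
}

# Usage for the file in this thread (only safe if the index content is
# known to match the data file):
# vcf=722g.990.SNP.INDEL.chrAll.vcf.gz
# if index_is_stale "$vcf" "$vcf.tbi"; then
#     touch "$vcf.tbi"
# fi
```

If the data file was actually modified after the index was built, touch only hides the warning and region queries may return wrong records.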

XeroxHero69 05-31-2019 09:57 AM

Quote:

Originally Posted by r.rosati (Post 226542)

Thank you so much, this seems to be the route I need to take!

