SEQanswers

11-01-2016, 03:55 AM   #1
visse226
Junior Member
 
Location: The Netherlands

Join Date: Nov 2016
Posts: 9
VCFtools whole genome data zipped

Hi all!
I'm new here, so a short introduction: I investigate inbreeding in an endangered species, for which I have whole-genome data from several individuals, with the variants stored in VCF files.

I want to keep the files compressed (.vcf.gz) because of memory usage etc.
Let's say I want to keep only the SNPs by filtering with '--remove-indels', reading the input with --gzvcf. How do I make sure the output is also compressed (.vcf.gz) rather than plain .vcf? And can I avoid anything being unzipped in the meantime? I get the impression that VCFtools unzips everything along the way. If I used a pipeline ending in gzip -c > output_file.vcf.gz, that would not do what I want, right? It compresses the output again, but I do not want it to be uncompressed in the first place.

Help?!

Thanks
11-01-2016, 04:56 AM   #2
wdecoster
Member
 
Location: Antwerp, Belgium

Join Date: Oct 2015
Posts: 96

You can write the output to stdout and pipe that output directly to gzip to compress the generated result.
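
To illustrate the idea, here is a minimal sketch of such a pipe that uses only gzip and awk. The awk filter is a stand-in for VCFtools, and the file contents and names are made up for the example. Each stage works on the stream as it arrives, so in this sketch no uncompressed copy is written to disk and the whole file is never held in memory at once.

```shell
#!/bin/sh
# Sketch: stream a compressed file through a filter and recompress it.
# awk stands in for the real filtering tool; the data below is invented.
set -eu

workdir=$(mktemp -d)

# Build a tiny gzipped VCF with two SNPs and one indel.
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t100\t.\tA\tG\t.\t.\t.\n1\t200\t.\tAT\tA\t.\t.\t.\n1\t300\t.\tC\tT\t.\t.\t.\n' \
  | gzip -c > "$workdir/in.vcf.gz"

# gzip -dc decompresses to stdout, awk keeps header lines and records whose
# REF and ALT are single bases, gzip -c recompresses the stream, and ">"
# redirects the compressed result into a file.
gzip -dc "$workdir/in.vcf.gz" \
  | awk -F '\t' '/^#/ || (length($4) == 1 && length($5) == 1)' \
  | gzip -c > "$workdir/out.vcf.gz"

# Count the records that survived the filter.
kept=$(gzip -dc "$workdir/out.vcf.gz" | grep -vc '^#')
echo "records kept: $kept"
```

The pipe (|) connects one program's stdout to the next program's stdin, while the redirect (>) only decides where the last program's output ends up. The temporary directory is left in place so you can inspect the files.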
11-01-2016, 05:11 AM   #3
visse226
Junior Member
 
Location: The Netherlands

Join Date: Nov 2016
Posts: 9

Hi, thanks for the quick response!

So that means I would be doing it correctly with the pipe? Or with the redirection operator?
I am new to all this, so I don't know how the two differ: does one store the whole output in memory first, while the other works line by line and saves memory?

Thanks!
11-03-2016, 06:09 AM   #4
dsenalik
Carrot Scientist
 
Location: Madison WI USA

Join Date: Nov 2009
Posts: 41

You use both: a pipe to avoid creating an intermediate file, and then a redirect to write the compressed result to a file, e.g.
Code:
vcftools --gzvcf in.vcf.gz --somemagicparameters --stdout | bgzip > out.vcf.gz
and afterwards create an index if you need it
Code:
tabix -p vcf out.vcf.gz
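
For the --remove-indels case from the first post, the commands could look roughly like the sketch below. It assumes vcftools, bgzip and tabix are installed and on your PATH; the function name filter_snps and the file names are placeholders, not anything VCFtools provides. VCFtools only prints the filtered records when --recode is given, and --recode-INFO-all keeps the INFO fields in the output.

```shell
#!/bin/sh
# Sketch: keep only non-indel sites from a compressed VCF and write a
# bgzip-compressed, tabix-indexed result. filter_snps is a made-up name.
# Usage: filter_snps in.vcf.gz out.vcf.gz
filter_snps() {
    # --gzvcf reads the compressed input; --stdout sends the recoded VCF
    # to the pipe instead of creating an out.recode.vcf file on disk.
    vcftools --gzvcf "$1" --remove-indels --recode --recode-INFO-all --stdout \
        | bgzip -c > "$2"

    # Index the bgzip-compressed output.
    tabix -p vcf "$2"
}
```

It would be called as filter_snps in.vcf.gz snps.vcf.gz. bgzip is used rather than plain gzip because tabix can only index bgzip-compressed files.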
11-04-2016, 01:13 AM   #5
visse226
Junior Member
 
Location: The Netherlands

Join Date: Nov 2016
Posts: 9

Thank you so much! Very helpful.

Tags
compressed, filtering, snps, vcftools, zipped
