SEQanswers

Old 12-26-2013, 04:45 PM   #1
arcolombo698
Senior Member
 
Location: Los Angeles

Join Date: Nov 2013
Posts: 142
VCF calling for individual chromosomes, domain reduction

Hello.

I want to check whether my method for creating VCF files of variant calls is correct.
Right now I keep the fastq as a single file and align the whole fastq file to each chromosome individually. This produces one VCF file per chromosome, which I later merge together.

However, this can cause errors when reads map poorly to a given chromosome: because that chromosome is the only reference available, the aligner forces an alignment anyway. The VCF then reports scored calls based on those poor alignments (i.e. a side effect of reducing the domain of the fastq to a single chromosome at a time).


Is it better not to restrict the alignment to a single chromosome at a time, and instead map to the whole genome at once?
Old 12-26-2013, 05:21 PM   #2
Jeremy
Senior Member
 
Location: Pathum Thani, Thailand

Join Date: Nov 2009
Posts: 190

Yes, it is much better to align to the entire genome in a single step. If you want to reduce the memory load, split the fastq into a few smaller files and align them separately instead of splitting up the genome like that.
Old 12-26-2013, 06:35 PM   #3
arcolombo698
Senior Member
 
Location: Los Angeles

Join Date: Nov 2013
Posts: 142

I am not sure how to do this.

How do you split the fastq files? How do you align to the whole genome?

When I align the fastq using TopHat, I align both read files at the same time.
Old 12-26-2013, 06:41 PM   #4
Jeremy
Senior Member
 
Location: Pathum Thani, Thailand

Join Date: Nov 2009
Posts: 190

A fastq is just a text file with one read every 4 lines; if you have paired-end data, it is then two text files. The reads should be in the same order in both files by default, so just split both files into smaller files at the same points, making sure that the number of lines in each piece is a multiple of 4. Actually, splitting probably isn't necessary anyway.
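If you do want to split, here is a rough Python sketch of the idea, not a tested production tool. The function name `split_fastq`, the `.partNNN.fastq` naming, and the file names in the usage comment are all made up for illustration. It assumes uncompressed fastq with exactly 4 lines per read.

```python
import itertools


def split_fastq(path, reads_per_chunk, out_prefix):
    """Split one FASTQ file into pieces of at most reads_per_chunk reads.

    Every piece has a line count that is a multiple of 4, so no read
    is ever cut in half. Returns the list of files written.
    """
    lines_per_chunk = 4 * reads_per_chunk
    out_paths = []
    with open(path) as handle:
        for index in itertools.count():
            # Take the next block of lines; an empty block means end of file.
            chunk = list(itertools.islice(handle, lines_per_chunk))
            if not chunk:
                break
            if len(chunk) % 4 != 0:
                raise ValueError(f"{path}: line count is not a multiple of 4")
            out_path = f"{out_prefix}.part{index:03d}.fastq"
            with open(out_path, "w") as out:
                out.writelines(chunk)
            out_paths.append(out_path)
    return out_paths


# Hypothetical usage for paired-end data: use the SAME reads_per_chunk
# for both files so that mates stay in matching pieces.
#   split_fastq("reads_1.fastq", 1_000_000, "reads_1")
#   split_fastq("reads_2.fastq", 1_000_000, "reads_2")
```

Piece N of the first file then pairs with piece N of the second file, provided the two original files list the reads in the same order.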
Old 12-26-2013, 07:01 PM   #5
arcolombo698
Senior Member
 
Location: Los Angeles

Join Date: Nov 2013
Posts: 142

Thank you.
I am familiar with the alignment practice, and am interested in the variant calls.
How do I create the VCF files without doing the variant calling for a single chromosome at a time?
Old 12-26-2013, 07:19 PM   #6
Jeremy
Senior Member
 
Location: Pathum Thani, Thailand

Join Date: Nov 2009
Posts: 190

Not much information to go on there. But the first step is to map against the whole genome, not single chromosomes. So if you are using TopHat, make a Bowtie reference of the whole genome and map to that. Then use the resulting BAM to call variants with whatever program you use; it will call variants for the whole genome.
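As a rough sketch of that order of steps, the Python below only builds the command lines as strings and runs nothing. The choice of tools is an assumption on my part: TopHat2 with a Bowtie2 index, and samtools/bcftools 0.1.x-style calling (`mpileup -uf` piped into `bcftools view -vcg`). Newer releases moved the calling step to `bcftools call`, so check the manual for your installed version. The function name and all file names are placeholders.

```python
import shlex


def whole_genome_commands(genome_fasta, index_base, reads_1, reads_2,
                          out_dir="tophat_out", vcf="variants.vcf"):
    """Return shell command strings for a whole-genome align-and-call run.

    Nothing is executed here; the strings are only meant to show the order
    of steps. Tool names and flags are assumptions (TopHat2 + Bowtie2,
    samtools/bcftools 0.1.x).
    """
    # TopHat writes its alignments to <out_dir>/accepted_hits.bam.
    bam = f"{out_dir}/accepted_hits.bam"
    return [
        # 1. Index the WHOLE genome once, not one chromosome at a time.
        shlex.join(["bowtie2-build", genome_fasta, index_base]),
        # 2. Align both paired-end files against that single index.
        shlex.join(["tophat", "-o", out_dir, index_base, reads_1, reads_2]),
        # 3. Index the resulting BAM.
        shlex.join(["samtools", "index", bam]),
        # 4. Pile up and call variants for every chromosome in one pass.
        shlex.join(["samtools", "mpileup", "-uf", genome_fasta, bam])
        + " | "
        + shlex.join(["bcftools", "view", "-vcg", "-"])
        + " > "
        + shlex.quote(vcf),
    ]


for command in whole_genome_commands("hg19.fa", "hg19",
                                     "reads_1.fastq", "reads_2.fastq"):
    print(command)
```

The point is that there is a single index and a single BAM, so the caller produces one VCF covering all chromosomes and no merging step is needed.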
Old 12-26-2013, 07:51 PM   #7
arcolombo698
Senior Member
 
Location: Los Angeles

Join Date: Nov 2013
Posts: 142
VCF

Yes, this is the protocol I follow.

I do have an HG19 human reference genome that I align the fastq files to.
Afterwards I use samtools to index the accepted_hits.bam file and use the mpileup -g command to call the variants.

What parameters do you use to call the variants? Do you use mpileup -g?