SEQanswers

05-12-2011, 07:02 PM   #1
dongshenglulv
Member
 
Location: Shanghai

Join Date: May 2011
Posts: 15
quality control from fastq to vcf

Hi all,

This is my workflow for calling SNPs from 50 FASTQ files:
1> Use bowtie to align the FASTQ files and produce SAM. Reference genome: hg19.fa; bowtie parameters: -k 2 -v 2. This gives 50 SAM files.
2> Convert SAM to BAM. Command line: samtools view -bS s01.sam > s01.bam. This gives 50 BAM files.
3> Sort the BAM files. Command line: samtools sort s01.bam > s01.sort.bam
4> Convert the 50 sorted BAM files into one VCF file.
Command lines:
samtools mpileup -P ILLUMINA -ugf hg19.fa *.sort.bam | bcftools view -bcvg - > var.raw.bcf
bcftools view var.raw.bcf | vcfutils.pl varFilter -D 2000 > var.flt.vcf

However, the quality of the resulting VCF file does not seem good. At which step can I do quality control? Thanks very much.
11-27-2013, 11:10 AM   #2
arcolombo698
Senior Member
 
Location: Los Angeles

Join Date: Nov 2013
Posts: 142
Initial FastQC

Hello.

Did you use Trimmomatic to do QC on the original FASTQ files first? To my understanding, the best results come from running QC on the FASTQ files before alignment.
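
As a rough illustration of what a first look at the raw reads can involve, here is a minimal sketch in plain awk. It is not a substitute for FastQC or Trimmomatic. The function name `fastq_summary` is made up for this example, and it assumes 4-line FASTQ records with Phred+33 quality encoding.

```shell
# Hypothetical helper: summarise a FASTQ stream read from stdin.
# Assumes 4 lines per record and Phred+33 quality encoding.
fastq_summary() {
  awk '
    # Build a character-to-ASCII-code lookup table for printable characters.
    BEGIN { for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i }
    # Line 2 of each record is the sequence.
    NR % 4 == 2 { reads++; bases += length($0) }
    # Line 4 of each record is the quality string.
    NR % 4 == 0 {
      n = length($0)
      for (j = 1; j <= n; j++) qsum += ord[substr($0, j, 1)] - 33
    }
    END {
      if (reads > 0 && bases > 0)
        printf "reads=%d mean_len=%.1f mean_q=%.1f\n", reads, bases / reads, qsum / bases
    }
  '
}
```

Usage would be something like `fastq_summary < s01.fastq` (file name assumed). A low mean quality or an unexpected read length is a hint to trim or filter before running the aligner.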
11-27-2013, 11:50 AM   #3
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

Your file is probably fine. That's just how mpileup works; it reports pretty much EVERY discrepancy, so the ratio of garbage to real discrepancies is very high. So filter your VCF based on depth of coverage, quality score, etc. You can eyeball individual SNPs in IGV if you want.
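
A minimal sketch of that kind of filter, written in plain awk so it does not depend on any particular samtools or bcftools version. The function name `filter_vcf` and the thresholds in the usage line are illustrative assumptions, not recommended values. It assumes an uncompressed, tab-delimited VCF with QUAL in column 6 and a `DP=` entry in the INFO column (column 8).

```shell
# Hypothetical helper: keep records with QUAL >= MIN_QUAL and INFO DP >= MIN_DP.
# Usage: filter_vcf MIN_QUAL MIN_DP < in.vcf > out.vcf
filter_vcf() {
  awk -F'\t' -v min_qual="$1" -v min_dp="$2" '
    # Pass header lines through unchanged.
    /^#/ { print; next }
    {
      # Pull the DP value out of the semicolon-separated INFO column.
      dp = -1
      n = split($8, info, ";")
      for (i = 1; i <= n; i++)
        if (info[i] ~ /^DP=/) dp = substr(info[i], 4) + 0
      # A missing QUAL (".") coerces to 0 and is therefore dropped
      # whenever MIN_QUAL is above 0.
      if ($6 + 0 >= min_qual + 0 && dp >= min_dp + 0) print
    }
  '
}
```

For example, `filter_vcf 20 10 < var.flt.vcf > var.strict.vcf` would keep sites with QUAL of at least 20 and total depth of at least 10 (the output file name is made up). Sensible cutoffs depend on coverage and the number of samples, so check a handful of retained and rejected sites in IGV before settling on them.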
11-05-2014, 02:08 PM   #4
zillur
Senior Member
 
Location: Puerto Rico

Join Date: Sep 2014
Posts: 106

Hi,

I am new to this area and am using bowtie2. Here is what I have done so far:
1. Aligned the example paired-end reads that ship with bowtie2
bowtie2 -x lambda_virus -1 ../reads/reads_1.fq -2 ../reads/reads_2.fq -S eg2.sam

2. Converted the SAM file to a BAM file
samtools view -bS eg2.sam > eg2.bam

3. Converted the BAM file to a sorted BAM file
samtools sort eg2.bam eg2.sorted

4. Generated variant calls in VCF format
samtools mpileup -uf lambda_virus.fa eg2.sorted.bam | bcftools view -O u - > eg2.raw.bcf

To view the result:
bcftools view eg2.raw.bcf
This produced a huge file listing SNPs like:
gi|9626243|ref|NC_001416.1| 48496 . G <X> 0 . DP=2;I16=0,2,0,0,65,2125,0,0,45,1773,0,0,28,634,0,0;QS=1,0;MQ0F=0 PL 0,6,37
gi|9626243|ref|NC_001416.1| 48497 . G <X> 0 . DP=2;I16=0,1,0,0,27,729,0,0,3,9,0,0,25,625,0,0;QS=1,0;MQ0F=0 PL 0,3,4
gi|9626243|ref|NC_001416.1| 48498 . T <X> 0 . DP=2;I16=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;QS=0,0;MQ0F=0 PL 0,0,0
gi|9626243|ref|NC_001416.1| 48499 . T <X> 0 . DP=2;I16=0,1,0,0,33,1089,0,0,42,1764,0,0,0,0,0,0;QS=1,0;MQ0F=0 PL 0,3,33
gi|9626243|ref|NC_001416.1| 48500 . A <X> 0 . DP=1;I16=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;QS=0,0;MQ0F=0 PL 0,0,0
gi|9626243|ref|NC_001416.1| 48501 . C <X> 0 . DP=1;I16=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;QS=0,0;MQ0F=0 PL 0,0,0
gi|9626243|ref|NC_001416.1| 48502 . G <X> 0 . DP=1;I16=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;QS=0,0;MQ0F=0 PL 0,0,0

There are many records like this. Now I need to assess the quality of the output. Can anyone suggest how I can measure the quality of my data?

Best Regards
Zillur
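
A quick check that may help with output like the rows quoted in this post: every one of them has `<X>` in the ALT column and a QUAL of 0. As far as I understand the mpileup output, `<X>` is a placeholder for an unobserved non-reference allele, so such rows look like per-site likelihood records and not called variants. The sketch below counts how many records carry only that placeholder and how many carry a real alternate allele. The function name `count_alt_records` is made up, and it assumes an uncompressed, tab-delimited VCF on stdin.

```shell
# Hypothetical helper: count placeholder-only records vs. records with a real ALT.
# Usage: bcftools view eg2.raw.bcf | count_alt_records
count_alt_records() {
  awk -F'\t' '
    # Skip header lines.
    /^#/ { next }
    # ALT is only the unobserved-allele placeholder (or missing): not a variant.
    $5 == "<X>" || $5 == "<*>" || $5 == "." { placeholder++; next }
    # Anything else lists at least one concrete alternate allele.
    { variant++ }
    END { printf "placeholder=%d variant=%d\n", placeholder, variant }
  '
}
```

If nearly everything lands in `placeholder`, the pipeline has probably produced pileup likelihoods without a calling step. With bcftools 1.x, calling is normally a separate `bcftools call` command (for example with `-mv`) placed after mpileup; please check this against the manual for the installed version.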
Tags
bowtie, quality control
