08-11-2010, 09:44 AM   #1
HiSeq 2000 paired-end capture data analysis problem: too many variants!

We are trying to analyze HiSeq 2000 paired-end whole-exome capture sequencing data. The quality of the data is very good: the average depth of coverage is around 120x, and the FASTQ files look perfect. We used BWA for paired-end alignment, Picard to remove duplicates, and SAMtools for variant calling. The problem is that we get too many variants from this data: around 140,000 SNVs and indels after filtering (mapping quality >= 45, read depth >= 10, and the standard varFilter in SAMtools). Is anybody else on this board doing a similar analysis? How many SNVs and indels did you get? Can BWA and SAMtools be used on HiSeq data, or is there other software we should try? Is there anything else we should know about HiSeq data?
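In case it helps anyone compare numbers, here is a minimal Python sketch of the depth and mapping-quality thresholds mentioned above (mapping quality >= 45, read depth >= 10), with SNVs and indels counted separately. It is only an illustration under assumptions: it assumes the calls are in VCF text form with `DP` and `MQ` keys in the INFO column, the function names (`parse_info`, `passes_filters`, `count_variants`) are hypothetical, and it does not reproduce what SAMtools varFilter does internally.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'DP=35;MQ=52;INDEL' into a dict."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Flag-style entries (no '=') are stored as True.
        fields[key] = value if sep else True
    return fields


def passes_filters(info, min_mapq=45, min_depth=10):
    """Apply the thresholds from the post; DP and MQ keys are an assumption."""
    try:
        depth = int(info.get("DP", 0))
        mapq = float(info.get("MQ", 0))
    except (TypeError, ValueError):
        return False
    return depth >= min_depth and mapq >= min_mapq


def count_variants(vcf_lines, min_mapq=45, min_depth=10):
    """Count SNVs and indels that survive the filters."""
    counts = {"snv": 0, "indel": 0}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        ref, alt, info = cols[3], cols[4], parse_info(cols[7])
        if not passes_filters(info, min_mapq, min_depth):
            continue
        # A record is an SNV when REF and every ALT allele are one base long.
        is_snv = len(ref) == 1 and all(len(a) == 1 for a in alt.split(","))
        counts["snv" if is_snv else "indel"] += 1
    return counts
```

Calling `count_variants(open("calls.vcf"))` (the file name is a placeholder) gives the two counts separately, which makes it easier to see whether SNVs or indels dominate the 140,000 total.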

lazyworm