SEQanswers

Old 08-11-2010, 10:44 AM   #1
lazyworm
Junior Member
 
Location: houston

Join Date: May 2010
Posts: 2
HiSeq 2000 paired-end capture data analysis problem: too many variants!

Hi,
We are trying to analyze HiSeq 2000 paired-end whole-exome capture sequencing data. The data quality is very good: the average depth of coverage is around 120x, and the FASTQ files look perfect. We used BWA for paired-end alignment, Picard to remove duplicates, and SAMtools for variant calling. The problem is that we get too many variants from this data: around 140,000 SNVs and indels after filtering (mapping quality >= 45, read depth >= 10, and the standard varFilter in SAMtools). Is anybody else on this board doing similar data analysis? How many SNVs and indels did you get? Can BWA and SAMtools be used on HiSeq data, or is there other software we should try? Is there anything else we should know about HiSeq data?

Thanks
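
For anyone wanting to reproduce the two hard thresholds mentioned in the post (mapping quality >= 45, read depth >= 10), here is a minimal Python sketch. It assumes the calls have already been parsed into simple records; the `Call` record, its field names, and the example values are hypothetical and are not the output format of any particular tool. It does not reproduce the SAMtools varFilter step, only the two numeric cutoffs, and it tallies SNVs and indels separately so the two classes can be compared.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Call(NamedTuple):
    """Hypothetical, already-parsed variant call (not a real tool's format)."""
    chrom: str
    pos: int      # 1-based position
    ref: str
    alt: str
    depth: int    # number of reads covering the site
    mapq: float   # mapping quality reported for the site


def passes(call: Call, min_mapq: float = 45, min_depth: int = 10) -> bool:
    """Apply the two thresholds quoted in the post."""
    return call.mapq >= min_mapq and call.depth >= min_depth


def kind(call: Call) -> str:
    """Single-base substitution is an SNV; any length change is an indel."""
    return "SNV" if len(call.ref) == 1 and len(call.alt) == 1 else "indel"


def filter_and_count(calls: Iterable[Call],
                     min_mapq: float = 45,
                     min_depth: int = 10) -> Tuple[List[Call], Counter]:
    """Return the calls that pass, plus a per-type tally of those calls."""
    kept = [c for c in calls if passes(c, min_mapq, min_depth)]
    return kept, Counter(kind(c) for c in kept)


# Made-up example records, for illustration only.
calls = [
    Call("chr1", 1000, "A", "G", 120, 60),   # passes, SNV
    Call("chr1", 2000, "C", "T", 8, 60),     # fails depth
    Call("chr2", 3000, "G", "GA", 35, 50),   # passes, indel
    Call("chr2", 4000, "T", "C", 40, 30),    # fails mapping quality
]
kept, counts = filter_and_count(calls)
print(len(kept), dict(counts))
```

Splitting the tally by type is a design choice for debugging: if one class dominates the excess, that narrows down where to look.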
Old 08-11-2010, 11:03 AM   #2
Lee Sam
Member
 
Location: Ann Arbor, MI

Join Date: Oct 2008
Posts: 57

Have you checked your calls against dbSNP? Have you filtered your candidates down to just those within known exons after alignment? (Oftentimes PE reads "splash over" into intronic regions, where variation is likely more liberally tolerated.)

EDIT: I'm actually really curious to hear how many reads you got, what the read length was, and how many lanes you ran the sample on. We just got our HiSeq2k installed last week and we're running our first samples through it. Details would be fantastic.

Last edited by Lee Sam; 08-11-2010 at 01:29 PM.
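
The two checks suggested in this reply (restrict candidates to known exons, then compare against dbSNP) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: exon coordinates are taken to be 1-based inclusive `(chrom, start, end)` tuples, dbSNP is represented as a plain set of `(chrom, pos, ref, alt)` tuples already loaded into memory, and every function name and example value here is hypothetical. A real run would more likely use an existing interval tool on the capture target file.

```python
from bisect import bisect_right
from collections import defaultdict


def build_exon_index(exons):
    """Merge overlapping exons per chromosome and keep them sorted.

    exons: iterable of (chrom, start, end), assumed 1-based inclusive.
    Returns {chrom: (starts, merged_intervals)}.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in exons:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = merged[-1]
            if start <= last_end + 1:          # overlapping or adjacent
                merged[-1] = (last_start, max(last_end, end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], merged)
    return index


def in_exon(index, chrom, pos):
    """True if pos falls inside any merged exon on chrom."""
    if chrom not in index:
        return False
    starts, merged = index[chrom]
    i = bisect_right(starts, pos) - 1          # last exon starting at or before pos
    return i >= 0 and pos <= merged[i][1]


def classify(variants, index, known):
    """Count off-target calls, and exonic calls that are known vs. novel."""
    summary = {"off_target": 0, "exonic_known": 0, "exonic_novel": 0}
    for variant in variants:
        chrom, pos = variant[0], variant[1]
        if not in_exon(index, chrom, pos):
            summary["off_target"] += 1
        elif variant in known:
            summary["exonic_known"] += 1
        else:
            summary["exonic_novel"] += 1
    return summary


# Made-up coordinates, for illustration only.
exons = [("chr1", 100, 200), ("chr1", 150, 250), ("chr2", 10, 20)]
known = {("chr1", 120, "A", "G")}
variants = [
    ("chr1", 120, "A", "G"),   # exonic, in the known set
    ("chr1", 260, "C", "T"),   # just past the exon: "splash over"
    ("chr2", 15, "G", "A"),    # exonic, not in the known set
    ("chr3", 5, "A", "T"),     # chromosome with no targets
]
index = build_exon_index(exons)
summary = classify(variants, index, known)
print(summary)
```

If most of the calls land in `off_target`, the excess is probably coming from reads outside the capture regions; if many exonic calls are absent from dbSNP, that may point toward false positives rather than real novel variation.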
All times are GMT -8.