Old 03-18-2012, 11:11 PM   #1
Location: INDIA

Join Date: Jun 2010
Posts: 46
Ignore CCS reads - a correct assumption?

Hi all,

As we know, PacBio provides CLR as well as CCS reads for any dataset, the latter for error correction of the long reads. For variant detection, is it okay to ignore the CCS reads completely and simply use the raw reads from the .bas.h5 file to find variants against a reference genome?

Any help with this would be great.
Old 03-26-2012, 06:48 PM   #2
Junior Member
Location: Menlo Park, CA

Join Date: Mar 2012
Posts: 1

The bas.h5 file contains both single-pass CLR and multi-pass CCS reads. The PacBio pbh5 R API ( or the utility ( can be used to access and extract both read types from the bas.h5 file. Their usefulness in variant detection depends on several other factors, including frequency of variation, insert/amplicon length, CLR vs. CCS depth of coverage, SNP phasing requirements, and method of SNP detection. Assuming you are examining haploid/diploid SNP calls, here are two scenarios, one arguing for each read type:

1. 5 kb shear of genomic DNA. In this case, the insert sizes are generally too long for CCS read generation, and variant detection with aligned CLR reads using GATK with base QV recalibration is recommended.
2. 500 bp amplicons. The shorter insert sizes allow for sufficient CCS yield, and variant detection with aligned CCS reads using GATK with or without base QV recalibration is recommended.

I would suggest you examine the read lengths and yield of your CLR vs. CCS reads and weigh them against your variant detection needs to choose the best read type for your application. It is also possible to call variants with both read types in parallel and compare the number and concordance of the SNP calls.
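As a rough illustration of that comparison, here is a minimal Python sketch. It is not part of any PacBio or GATK tooling: the function names `read_length_summary` and `snp_concordance` are hypothetical, and it assumes you have already extracted the read lengths for each read type and reduced each call set to (chromosome, position, alternate allele) tuples, for example by parsing the VCF output yourself.

```python
from typing import Dict, Iterable, Tuple

# A SNP call reduced to (chromosome, 1-based position, alternate allele).
# This representation is an assumption made for the sketch.
Snp = Tuple[str, int, str]


def read_length_summary(lengths: Iterable[int]) -> Dict[str, float]:
    """Summarize yield for one read type: read count, total bases, mean length, N50."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    count = len(ordered)

    # N50: length of the read at which the running sum of the longest
    # reads first reaches half of the total bases.
    n50 = 0
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    mean = total / count if count else 0.0
    return {"reads": count, "bases": total, "mean": mean, "n50": n50}


def snp_concordance(clr_calls: Iterable[Snp], ccs_calls: Iterable[Snp]) -> Dict[str, float]:
    """Count shared and read-type-specific SNP calls, plus Jaccard concordance."""
    clr = set(clr_calls)
    ccs = set(ccs_calls)
    shared = clr & ccs
    union = clr | ccs
    return {
        "clr_only": len(clr - ccs),
        "ccs_only": len(ccs - clr),
        "shared": len(shared),
        # Two empty call sets are treated as fully concordant.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


if __name__ == "__main__":
    # Made-up toy values, only to show how the functions are called.
    clr_lengths = [2500, 1800, 3100, 900]
    ccs_lengths = [480, 510, 495]
    print("CLR:", read_length_summary(clr_lengths))
    print("CCS:", read_length_summary(ccs_lengths))

    clr_snps = [("chr1", 100, "A"), ("chr1", 200, "T")]
    ccs_snps = [("chr1", 100, "A"), ("chr2", 50, "G")]
    print("Concordance:", snp_concordance(clr_snps, ccs_snps))
```

Matching calls on exact position and allele is the simplest possible definition of concordance. Depending on your caller you may also want to compare genotypes or apply a quality filter before comparing.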
Lawrence Lee
Staff Scientist, Bioinformatics
Pacific Biosciences
Old 03-27-2012, 10:36 PM   #3
Location: INDIA

Join Date: Jun 2010
Posts: 46

Thank you, llee. I will try calling SNPs from both read types and compare the results.