
**GenoMax** · 03-19-2016, 12:22 PM

I have not run a CCS analysis on the command line, but the files you obtained are similar to what I have got from SMRTportal. Not every ZMW is productive, so having 55% of ZMWs fail is not unexpected.

@rhall (Dr. Hall) from PacBio participates here and may have a detailed explanation for the results next week. Have you run ConsensusTools.sh -h to see if there is inline help? There is some documentation here: https://github.com/PacificBioscience...-Documentation

**xhuister** · 03-20-2016, 02:56 AM

Thank you GenoMax! I'm glad to know that my result is all right. I've read the document on GitHub, but found no explanation of the result report. I'll try to find more.

**rhall** · 03-21-2016, 10:33 AM

The results look reasonable. A lot of the command line tools are not well documented, and as the CCS algorithm has fundamentally changed for the new software release, the documentation is unlikely to be improved for this version. All percentages are of the total ZMWs (~150,000). Given how ZMWs are loaded, the best that can be expected is ~40% (Poisson statistics). Item by item:

Successful - Quiver consensus found 55133 33.72 %

Number of consensus sequences, i.e. ZMWs with more than one full pass of the insert.

Successful - But only 1 region, no true consensus 11694 7.15 %

Single-pass sequences, reported because of the '--minFullPasses 0' parameter. Normally you would want multiple passes of the insert for a CCS dataset.

Failed - Exception thrown 0 0.00 %

General catch for ZMWs that throw an error during the calculation

Failed - ZMW was not productive 90173 55.16 %

ZMWs that are not loaded with a sequencing template. 55% is reasonable for a well-loaded sample.

Failed - Outside of SNR ranges 3923 2.40 %

There is a per-ZMW SNR filter; ZMWs that do not have high SNR are not used to generate consensus sequence.

Failed - No insert regions found 5 0.00 %

Two adapter sequences joined together without an insert sequence

Failed - Not enough full passes 0 0.00 %

You set this as 0 so nothing is filtered

Failed - Insert length too small 0 0.00 %

minimum length parameter

Failed - Post POA requirements not met 0 0.00 %

I'm not exactly sure, unless this % is high I wouldn't worry about it

Failed - CCS Read below predicted accuracy 473 0.29 %

predicted accuracy parameter

Failed - CCS Read was palindrome 2081 1.27 %

Reads are palindromic, i.e. the forward and reverse strands are sequenced without an adapter being read. This is likely due to sample prep. 1.27% is expected; if it is much higher, sample prep should be looked at.

Failed - CCS Read below SNR threshold 0 0.00 %

Applies only if an SNR threshold is given as a parameter

Failed - CCS Read too short or long 0 0.00 %

Read length parameter
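The arithmetic behind the report can be checked with a few lines of Python. This is only a sketch, not part of any PacBio tool: the counts are copied from the report above, the function names are made up for illustration, and the loading model is the simplest Poisson assumption (the fraction of ZMWs holding exactly one template is λ·e^(−λ) for a mean of λ templates per ZMW). Under that assumption the single-loading fraction peaks at λ = 1 with 1/e ≈ 36.8%, which is in line with the "~40%" ceiling mentioned above. Real loading may deviate from this idealised model.

```python
import math

# Counts copied from the result report quoted in this thread.
REPORT = {
    "Successful - Quiver consensus found": 55133,
    "Successful - But only 1 region, no true consensus": 11694,
    "Failed - Exception thrown": 0,
    "Failed - ZMW was not productive": 90173,
    "Failed - Outside of SNR ranges": 3923,
    "Failed - No insert regions found": 5,
    "Failed - Not enough full passes": 0,
    "Failed - Insert length too small": 0,
    "Failed - Post POA requirements not met": 0,
    "Failed - CCS Read below predicted accuracy": 473,
    "Failed - CCS Read was palindrome": 2081,
    "Failed - CCS Read below SNR threshold": 0,
    "Failed - CCS Read too short or long": 0,
}


def percentages(counts):
    """Return (total, {category: percent of total}) for a dict of counts."""
    total = sum(counts.values())
    return total, {name: 100.0 * n / total for name, n in counts.items()}


def poisson_single_loading(mean_templates_per_zmw):
    """P(exactly one template in a ZMW) under a simple Poisson loading model.

    This model is an assumption made for illustration only.
    """
    lam = mean_templates_per_zmw
    return lam * math.exp(-lam)


total, pct = percentages(REPORT)
print(f"total ZMWs in report: {total}")
for name, value in pct.items():
    print(f"{value:6.2f} %  {name}")

# The single-loading fraction is largest when the mean is one template per ZMW.
print(f"max single-loading fraction: {poisson_single_loading(1.0):.3f}")
```

The counts listed in the report sum to 163,482, and dividing each count by that total reproduces the percentages shown in the report.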

**xhuister** · 03-21-2016, 05:59 PM

Thank you very much, Dr. Hall.

You said there is a new release of the CCS algorithm. When I looked around Seqanswers and the PacificBiosciences GitHub, I found a new program, pbccs, that works with BAM files (https://github.com/PacificBiosciences/pbccs). Is this pbccs the new release you mentioned?

I need to get full-length transcripts and do NGS correction after obtaining CCS reads. According to the tutorial on GitHub, the RS_IsoSeq pipeline can be easily applied for downstream analysis (I haven't tried it yet). Since the new release is available and I haven't started processing my data yet, I think it may be good for me to use the new software. But are there other tools I would need for downstream analysis if I switch to pbccs (the new release of the CCS algorithm)?

**rhall** · 03-22-2016, 07:48 AM

As a transcript analysis pipeline, RS_IsoSeq is an end-to-end solution; you do not need to run CCS independently. The pbccs on GitHub is the new algorithm, but I wouldn't worry about it for transcripts. RS_IsoSeq is sufficient.

**GenoMax** · 03-22-2016, 08:08 AM

@rhall: Is RS_IsoSeq a SMRTportal only workflow? It appears that @xhuister is working on the command line and may not have SMRTportal installed.

**rhall** · 03-22-2016, 08:19 AM

It is possible to run RS_IsoSeq (or any SMRTportal workflow) via the command line, but if you don't have SMRTportal installed it is difficult to generate a valid parameter file. The recommended method for running a transcript analysis is to run it in steps: https://github.com/PacificBioscience...ds#commandline. I still wouldn't worry about using the new CCS algorithm unless you already have access to SMRT Link 3.0.

**xhuister** · 03-23-2016, 01:19 AM

Thank you very much, GenoMax and rhall. I'll continue to use RS_IsoSeq as you suggested. I've installed SMRT Analysis and tried running it on a computer cluster via LSF. So far I've finished the "Getting full length reads" step via pbtranscript.py. I hope the subsequent analysis will go well.


Is this a valid [CircularConsensus] Result Report?
