**rhall** · 04-30-2015, 10:26 AM

To Quiver-correct such a large assembly, the processing will have to be split up. Do you have access to a full SMRT Analysis install? The BAM_Resequncing_Beta.1 protocol deals with this if SMRT Analysis has been set up on a cluster.

Otherwise you will have to split up the computation manually. https://github.com/PacificBioscience...-using-pbalign

Once you have the merged cmp.h5 from pbalign, you will probably want to split it up again by reference contig to create multiple small jobs for Quiver.
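As a rough illustration of the per-contig split, here is a minimal Python sketch that only plans the batches: it reads contig names and lengths from the assembly FASTA and groups contigs so that each batch stays under a base-count cap. The function names and the `max_bases` cap are hypothetical, and the sketch does not touch the cmp.h5 itself; how each batch is then extracted and passed to Quiver depends on your install.

```python
from typing import Dict, Iterable, List


def contig_lengths(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Return {contig_name: length} from the lines of a FASTA file."""
    lengths: Dict[str, int] = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def batch_contigs(lengths: Dict[str, int], max_bases: int) -> List[List[str]]:
    """Group contigs, in input order, into batches of at most max_bases.

    A contig longer than max_bases still gets a batch of its own.
    max_bases is a hypothetical tuning knob, not a Quiver parameter.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_bases = 0
    for name, length in lengths.items():
        if current and current_bases + length > max_bases:
            batches.append(current)
            current, current_bases = [], 0
        current.append(name)
        current_bases += length
    if current:
        batches.append(current)
    return batches
```

Each resulting batch is a list of contig names that could form one small Quiver job.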

**colindaven** · 05-02-2015, 03:34 AM

Dear rhall,

thanks for your assistance! I am doing it the manual way which isn't too tricky.

The split command is helpful here and might be useful in the PacBio wiki:
split -d -l 3 --additional-suffix=.fofn ../baxQuiverInput.fofn baxQuiver_
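For anyone without GNU `split` (the `--additional-suffix` flag is a GNU coreutils extension), a minimal Python sketch of the same idea follows. It writes three entries per output file, presumably one SMRT cell each, since an RS II cell yields three bax.h5 files. The function name `split_fofn` and its defaults are illustrative, not part of any PacBio tool.

```python
from pathlib import Path
from typing import List


def split_fofn(fofn_path, out_dir, prefix="baxQuiver_", lines_per_file=3) -> List[Path]:
    """Split a file-of-file-names into chunks of lines_per_file entries.

    Output names mimic `split -d --additional-suffix=.fofn`:
    baxQuiver_00.fofn, baxQuiver_01.fofn, ...
    Unlike `split`, blank lines in the input are skipped.
    """
    entries = [
        line.strip()
        for line in Path(fofn_path).read_text().splitlines()
        if line.strip()
    ]
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written: List[Path] = []
    for start in range(0, len(entries), lines_per_file):
        chunk = entries[start:start + lines_per_file]
        target = out / f"{prefix}{start // lines_per_file:02d}.fofn"
        target.write_text("\n".join(chunk) + "\n")
        written.append(target)
    return written
```

Calling `split_fofn("../baxQuiverInput.fofn", ".")` would write the chunked .fofn files into the current directory, one per group of three input lines.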

Are there any ballpark figures for how much memory pbalign requires per SMRT cell?
I'm trying 30 SMRT cells on the 512GB server initially.

Regards,
Colin

**GenoMax** · 05-02-2015, 03:45 AM

@colin: Unrelated query.

That looks like a nice big data set that is pure PacBio. Are all 30 cells similar in terms of yield etc. (same library being sequenced)? Are you able to tell us anything about average stats (read lengths, size of data, etc.)? How many chromosomes, and what ploidy, for this genome?

**colindaven** · 05-02-2015, 05:07 AM

I haven't looked at yield per SMRT cell. The same library is being sequenced. Read lengths are critical, and are ~10.5 kbp median before filtering and ~8.5 kbp median after filtering to subreads.

**rhall** · 05-04-2015, 07:47 AM

> Are there any ballpark figures for how much memory pbalign requires per SMRT cell?
> I'm trying 30 SMRT cells on the 512GB server initially.

Per-cell memory depends predominantly on the reference size (human can be aligned on a 16 GB machine). The memory scaling with the number of SMRT cells is a known bug and will likely be fixed soon. When pbalign is used in SMRT pipe workflows this isn't encountered, as jobs are split up by SMRT cell to allow parallel execution across cluster nodes.

PacBio quiver memory usage