**choijae3** · 01-07-2014, 02:45 PM

nvm so I've found useful links to solve my problem from here and here

However, does my approach make sense? Is decreasing the library size a valid way to see whether more pooling would be beneficial?

sorry if I'm derailing the post...

**dpryan** · 01-07-2014, 03:00 PM

FYI, I assume you mean "multiplexing" rather than "pooling". While there is pooling in both cases, the former is probably a more exact description of what you're doing (I assume you're looking for sequence differences between strains or something like that, so being able to separate reads by strain would be useful).

Regarding your strategy, it's often termed "saturation analysis" or "making a saturation/rarefaction curve" or various permutations thereof. It's a very good thing to do and I've seen a few papers (mostly RNAseq) specifically doing that to estimate maximal statistical power. 70x is overkill for a lot of common things, so I wouldn't be surprised if you can get away with throwing more samples on there.
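For anyone wanting to try the downsampling step for such a curve, here is a minimal sketch rather than a definitive tool. It assumes paired-end data in two FASTQ files with strictly 4-line records and mates listed in the same order in both files. The names `read_fastq` and `subsample_pairs` are made up for illustration.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield one FASTQ record at a time as a tuple of its 4 lines.

    Assumes unwrapped, 4-line records (an assumption, not checked beyond length).
    """
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_pairs(r1_in, r2_in, r1_out, r2_out, fraction, seed=0):
    """Keep each read pair with probability `fraction`.

    One random draw decides the fate of both mates, so the two output
    files stay in sync. Returns the number of pairs kept.
    """
    rng = random.Random(seed)
    kept = 0
    for rec1, rec2 in zip(read_fastq(r1_in), read_fastq(r2_in)):
        if rng.random() < fraction:
            r1_out.writelines(rec1)
            r2_out.writelines(rec2)
            kept += 1
    return kept
```

The handles can come from `open(path)` or, for compressed files, `gzip.open(path, "rt")`. Running this at a few fractions (say 0.25, 0.5, 0.75) and repeating the downstream analysis on each subset gives the points of the saturation curve.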

**choijae3** · 01-07-2014, 03:22 PM

Yes, I should have said multiplexing instead of pooling. I'm conducting a population-genomics type of project and trying to sequence as many populations as possible without sacrificing coverage too much.

Thanks dpryan!

**barkasn** · 01-08-2014, 08:14 AM

Hi choijae3,

70X coverage is very high and overkill for most applications, so in general you are better off sequencing more samples than sequencing the same thing over and over again.
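As a rough back-of-the-envelope sketch of that trade-off: if total output stays the same and reads are split evenly between samples (both assumptions), the number of samples you can multiplex scales inversely with the target coverage. The 4 samples and the 20X target below are illustrative values only, not numbers from this thread, and `max_samples` is a made-up helper name.

```python
import math


def max_samples(current_coverage, current_samples, target_coverage):
    """Samples that fit in the same total output at `target_coverage`.

    Assumes output is fixed and reads divide evenly among samples.
    """
    total = current_coverage * current_samples
    return math.floor(total / target_coverage)


# Illustrative only: 4 samples at 70X each, re-planned for 20X each.
print(max_samples(70, 4, 20))  # 14
```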

With respect to the duplication rate, I would recommend you do not trust FastQC. FastQC estimates the duplication rate by looking at the first and second reads independently. Given your high coverage, it is very likely that you will get first reads starting at the exact same spot. I would recommend you use Picard MarkDuplicates.jar to estimate the duplication rate after alignment, as this takes into account both the first and second reads of each pair.
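A toy illustration of why the two estimates differ. This is not FastQC's or Picard's actual algorithm, just a sketch that treats each pair as a hypothetical `(read1_start, read2_start)` tuple and counts repeats either by read 1 alone or by the whole pair.

```python
def duplicate_fraction(keys):
    """Fraction of items whose key was already seen earlier in the input."""
    keys = list(keys)
    if not keys:
        return 0.0
    return 1 - len(set(keys)) / len(keys)


# Made-up alignment starts: three pairs share a read-1 start,
# but only two of them also share the read-2 start.
pairs = [(100, 350), (100, 420), (100, 350), (200, 480)]

by_read1 = duplicate_fraction(start1 for start1, _ in pairs)
by_pair = duplicate_fraction(pairs)

print(by_read1)  # 0.5
print(by_pair)   # 0.25
```

Looking at read 1 alone calls half of these reads duplicates, while requiring both mates to match halves that estimate. At high coverage, coincidental read-1 collisions become common, so the gap grows.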

**choijae3** · 01-08-2014, 08:27 AM

Hi barkasn

Thanks for the reply! I've been following the best practices from the Broad Institute and have done the mark-duplicates step. I hadn't paid much attention to its output (I really should have), and I found it more helpful than the FastQC estimate. Thanks again for the advice!


How to randomly remove portions of the raw reads from the FASTQ file
