**Brian Bushnell** · 12-18-2016, 06:48 PM

It looks to me like low library complexity due to overamplification. It's hard to say, though. Note that there is a peak at ~16x. In order to see it, you have to rotate the image to the left by 45 degrees... the fact that it is such a weak peak indicates a very wide spread of coverage, indicative of overamplification, or an extremely high error rate. Or severe contamination, which can also be a problem with low-input amplified libraries. Do the reads generally BLAST to related insects?
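
One way to read the 45-degree rotation, assuming the histogram is drawn on log-log axes (an assumption, since the plot itself is not reproduced here): rotating the image to the left turns a line of slope -1 into a flat line, which amounts to looking at depth × count rather than count alone. A minimal sketch of that idea follows. The helper name, the `min_depth` cutoff, and the numbers are all made up for illustration and are not taken from the actual data.

```python
def weighted_peaks(hist, min_depth=3):
    """Return depths that are local maxima of depth * count.

    hist: list of (depth, count) pairs sorted by depth.
    min_depth: hypothetical cutoff to skip the low-depth error k-mers.
    """
    weighted = [(d, d * c) for d, c in hist if d >= min_depth]
    peaks = []
    for i in range(1, len(weighted) - 1):
        prev_w = weighted[i - 1][1]
        cur_w = weighted[i][1]
        next_w = weighted[i + 1][1]
        if cur_w > prev_w and cur_w >= next_w:
            peaks.append(weighted[i][0])
    return peaks


# Toy data only: the raw counts fall at every step, so there is no peak
# in the unweighted histogram, yet depth * count has a bump at 16.
toy = [(1, 100000), (2, 40000), (4, 15000), (8, 7000),
       (16, 4000), (32, 1500), (64, 500)]
print(weighted_peaks(toy))
```

On a real histogram with one row per depth, noise would produce many small local maxima, so some smoothing would probably be needed before this is useful.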

**TomHarrop** · 12-18-2016, 08:48 PM

Hi Brian, thanks for the reply. Contamination sounds quite possible, I just BLASTed a random subset of the reads and got human, macaque, trees, zebrafish etc. as well as the occasional hit on other insects. Uh oh.

Our server is going offline tonight but I'll do a more systematic investigation tomorrow and post the results.

**TomHarrop** · 12-21-2016, 03:13 PM

I blastn-ed 1000 R1 and 1000 R2 reads from this library against the 'nt' database. For R1, I got 544 hits with an evalue < 1. 493 of them had usable taxon identifiers. From that I got 89 plant hits (18%), 82 mammalian (17%), 65 insects (13%), 63 nematodes and 57 fish (and some other stuff). R2 numbers were similar.
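
As a quick check on the R1 shares, the percentages can be recomputed from the counts above. This is just a sketch; the "other" bucket is simply the remainder of the 493 usable hits and is a label added here, not a BLAST category.

```python
usable = 493  # R1 hits with usable taxon identifiers
counts = {
    "plants": 89,
    "mammals": 82,
    "insects": 65,
    "nematodes": 63,
    "fish": 57,
}
# Everything not in the five named groups ("some other stuff")
counts["other"] = usable - sum(counts.values())

# Share of usable hits per group, rounded to whole percent
shares = {group: round(100 * n / usable) for group, n in counts.items()}
for group, pct in shares.items():
    print(f"{group}: {counts[group]} hits ({pct}%)")
```

No single group accounts for more than about a fifth of the usable hits.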

I don't know if evalue is the best way to look at BLAST results for NGS reads (i.e. short queries), but either way it looks like contamination to me.

Thanks for the hint.

**WhatsOEver** · 12-22-2016, 01:00 AM

I'd suggest doing the contamination analysis in a more systematic way, using biobloomtools with a few of the top-hit plant, mammal, insect, ... genomes you got from BLAST. It will probably take some time, but I'd be rather surprised if you really have so many different contaminants, as long as the person doing your library preps isn't also a dedicated gardener or fisherman.

**Brian Bushnell** · 12-22-2016, 02:43 AM

Yes, to be honest, this does sound strange. Normally, contamination comes from 1 or 2 sources... a grab-bag of taxa is very unusual. Are you getting 100% identity to anything, or just weak hits?

**WhatsOEver** · 12-22-2016, 05:51 AM

Originally posted by Brian Bushnell:

Yes, to be honest, this does sound strange. Normally, contamination comes from 1 or 2 sources... a grab-bag of taxa is very unusual. Are you getting 100% identity to anything, or just weak hits?

And in addition: Do you get your complete read seq aligned or are your hits rather the tiny 20-40bp local alignment crap Blast may output if there is nothing more suitable?

EDIT: Just saw that you used Blastn. I'd suggest megablast here.

**TomHarrop** · 12-28-2016, 12:12 AM

Thanks for the replies. Sorry about the slow response, I missed the email notification over the holidays.

You're correct, the hits are mostly less than 60 bp, not the full read. I did try megablast but I don't get any hits (well, 11 out of 1000 reads had hits, about half to insects).
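
For anyone wanting to separate short local hits from near-full-length ones, here is a rough sketch. It assumes the BLAST run was written as tabular output with the columns `qseqid sseqid pident length qlen evalue` (that column choice is an assumption, not necessarily what was used in this thread). The thresholds, the function name, and the three toy rows are hypothetical.

```python
import csv
import io

# Assumed column order of the tabular BLAST output
FIELDS = ["qseqid", "sseqid", "pident", "length", "qlen", "evalue"]


def full_length_hits(handle, min_cov=0.9, min_ident=95.0):
    """Keep hits whose alignment spans most of the read at high identity.

    min_cov and min_ident are illustrative cutoffs, not recommendations.
    Coverage is approximated as alignment length / query length, so
    alignments containing gaps can slightly overstate it.
    """
    kept = []
    for row in csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t"):
        coverage = int(row["length"]) / int(row["qlen"])
        if coverage >= min_cov and float(row["pident"]) >= min_ident:
            kept.append((row["qseqid"], row["sseqid"], round(coverage, 2)))
    return kept


# Toy rows only (150 bp reads are an assumption for the example)
toy = io.StringIO(
    "read1\tsubjA\t100.0\t38\t150\t0.002\n"   # short local hit
    "read2\tsubjB\t98.7\t148\t150\t1e-70\n"   # near-full-length, high identity
    "read3\tsubjC\t88.0\t140\t150\t1e-30\n"   # long but low identity
)
print(full_length_hits(toy))
```

With these cutoffs only the second toy row survives; a sub-60 bp alignment on a longer read would be dropped regardless of its e-value.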

No peak in BBNorm kmer-frequency histogram