
**simonandrews** · 06-20-2011, 11:06 PM

Have you done any QC on your data to see if there are obvious biases or quality problems?

Have you trimmed adapters off your reads? At 100bp you might be getting a reasonable portion of your library reading through into adapter, and this will mess up your ability to map your reads.
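As a rough way to check for adapter read-through before reaching for a proper trimmer, something like the following could be run over a sample of reads. It is only a sketch: the `ADAPTER` value is an assumed prefix of a common Illumina adapter and should be replaced with whatever the library prep actually used, the function names and `min_overlap` default are made up for illustration, and only exact matches are detected (real trimming tools also tolerate mismatches).

```python
# Assumed adapter prefix; replace with the adapter used in your library prep.
ADAPTER = "AGATCGGAAGAGC"


def adapter_start(read, adapter=ADAPTER, min_overlap=5):
    """Return the index where the adapter begins in the read, or -1 if absent.

    Exact matching only: first look for the whole adapter anywhere in the
    read, then for a partial adapter (at least min_overlap bases) hanging
    off the 3' end.
    """
    idx = read.find(adapter)
    if idx != -1:
        return idx
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return len(read) - k
    return -1


def fraction_with_adapter(reads, adapter=ADAPTER, min_overlap=5):
    """Fraction of reads in which any adapter sequence was detected."""
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(1 for r in reads if adapter_start(r, adapter, min_overlap) != -1)
    return hits / len(reads)
```

If a noticeable fraction of reads comes back with a hit, the reads can be cut at the returned index (`read[:adapter_start(read)]`) or, more sensibly, passed through a dedicated adapter trimmer before mapping.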

**aligenie** · 06-21-2011, 01:13 PM

Originally posted by simonandrews:

Have you done any QC on your data to see if there are obvious biases or quality problems?

Have you trimmed adapters off your reads? At 100bp you might be getting a reasonable portion of your library reading through into adapter, and this will mess up your ability to map your reads.

I've looked with FastQC, and my quality scores do begin to drop off toward the middle of the read. Trimming by quality score in BWA helps, but a lot of reads still don't map. My guess is that I have a library prep issue?

**simonandrews** · 06-21-2011, 11:55 PM

If your reads are of decent quality but still fail to map, that's going to be due to one of:

Your library is contaminated with DNA from a different source (E. coli etc.)
Your library is partially contaminated with adapters or some part of your vector
Your sequences come from repetitive sequence which doesn't allow them to map uniquely

You say you're getting 60% of your reads mapping, so the library isn't a complete disaster, so it's just a case of figuring out where the rest went.

If you have a contamination from another DNA source you could try to screen for it. We routinely put all of our libraries through a screen to see if they contain what they should.

If you have partial contamination with adapter or improperly removed barcodes then you should see this in your FastQC reports. Such biases would show up either in the per-base sequence content plot or the Kmer plots. Any non-insert sequence still in your library would mess up your mapping efficiency.
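To make the per-base sequence content idea concrete, here is a minimal sketch of the kind of tally such a plot is built from. It is not FastQC's actual implementation; the function names and the `threshold` cut-off are invented for illustration. A position where one base dominates across the library hints at a fixed, non-insert sequence such as a barcode or adapter.

```python
from collections import Counter


def per_base_content(reads):
    """Return one dict per read position with the fraction of A/C/G/T/N."""
    counts = []
    for read in reads:
        for i, base in enumerate(read.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1
    content = []
    for c in counts:
        total = sum(c.values())
        content.append({b: c[b] / total for b in "ACGTN"})
    return content


def biased_positions(content, threshold=0.5):
    """Positions where a single base exceeds the (arbitrary) threshold."""
    return [i for i, frac in enumerate(content) if max(frac.values()) > threshold]
```

For a random insert, each position should sit somewhere near the genome's overall base composition, so a run of flagged positions at the start or end of the reads is worth a closer look.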

If your sequences aren't mapping uniquely, but could map well in many places, then you should be able to alter your mapping parameters to see this. I don't use BWA personally but I'm sure there will be an option to return a hit even if a sequence could have mapped in many places with high identity. This won't necessarily help your downstream analysis, but it will at least let you know why your sequences wouldn't map.
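Once the mapper does report repeat hits, a quick tally of the SAM output shows how the reads split between unique, repetitive and unmapped. The sketch below assumes the aligner writes BWA-style `XT:A:U` (unique) and `XT:A:R` (repeat) optional tags, which is what I understand `bwa aln` with `samse`/`sampe` to produce; check the manual for your version, since other aligners and modes use different tags. The function names are illustrative only.

```python
from collections import Counter


def classify_sam_line(line):
    """Classify one SAM alignment line as unmapped, unique, repeat or other."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:  # SAM flag bit 0x4: segment unmapped
        return "unmapped"
    tags = fields[11:]  # optional fields follow the 11 mandatory columns
    if "XT:A:U" in tags:  # assumed BWA-style tag for a unique hit
        return "unique"
    if "XT:A:R" in tags:  # assumed BWA-style tag for a repeat hit
        return "repeat"
    return "other"


def summarize_sam(lines):
    """Count alignment classes over an iterable of SAM lines, skipping headers."""
    summary = Counter()
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue
        summary[classify_sam_line(line)] += 1
    return summary
```

Passing an open SAM file handle to `summarize_sam` gives the counts directly; a large "repeat" share would point at repetitive sequence rather than contamination.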

If all else fails, what we've done before is to remove from our library all of the sequences which we were able to map successfully and then do an assembly of whatever is left (we used Velvet). This has worked well for us on a couple of occasions to identify sources of contamination which we'd been unable to identify in any other way.
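The "remove everything that mapped" step can be sketched as below, assuming the mapping results are in SAM format. This is just an illustration with made-up function names, not the script we actually used: it keeps records with the unmapped flag set and turns them back into FASTQ records ready for an assembler.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse-complement a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def unmapped_to_fastq(sam_lines):
    """Yield FASTQ records for every unmapped read in an iterable of SAM lines."""
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        if not flag & 0x4:  # mapped reads are dropped
            continue
        if flag & 0x10:  # stored reverse-complemented; restore read orientation
            seq, qual = revcomp(seq), qual[::-1]
        yield f"@{name}\n{seq}\n+\n{qual}\n"
```

Writing the yielded records to a file gives the leftover library; contigs assembled from it can then be searched against public databases to see where the unmapped material came from.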


GAII low number of mapped reads
