**Brian Bushnell** · 11-01-2014, 08:55 AM

First, do you know what kind of library prep was used? If it was Nextera, that would explain the biased sequence near the beginning, and also why some adapters are not being removed, since you're trimming for TruSeq sequences. But if it was in fact TruSeq, then I'm not really sure what is causing the biased composition near the beginning.

Unfortunately, because of the way FastQC compresses the base positions after base 9, it's impossible to get a good idea of what's going on at the end of the read from those graphs. But note that typical adapter trimming will not remove adapter fragments shorter than X bp at the very end of a read, because they are too short to match the adapter sequence confidently (X is usually a parameter). However, BBDuk can still remove those very short adapter sequences from PE reads by overlapping the two reads of a pair to determine the insert size, so you might give that a try; just use the "tbo" flag.
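To illustrate the overlap idea, here is a minimal Python sketch. It is not BBDuk's actual implementation, only the general principle: when the insert is shorter than the read length, the first `size` bases of read 1 are the reverse complement of the first `size` bases of read 2, and everything past that point is adapter. The function names, `min_overlap`, and `max_mismatch_frac` are all hypothetical, and uppercase A/C/G/T/N input is assumed.

```python
def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def infer_insert_size(read1, read2, min_overlap=8, max_mismatch_frac=0.1):
    """Return the inferred insert size if it is shorter than the reads, else None.

    Candidate sizes are tried from longest to shortest; a candidate is accepted
    when read1[:size] matches the reverse complement of read2[:size] within the
    allowed number of mismatches.
    """
    shortest = min(len(read1), len(read2))
    for size in range(shortest - 1, min_overlap - 1, -1):
        candidate = read1[:size]
        mate = reverse_complement(read2[:size])
        mismatches = sum(a != b for a, b in zip(candidate, mate))
        if mismatches <= int(max_mismatch_frac * size):
            return size
    return None


def trim_pair_by_overlap(read1, read2, **kwargs):
    """Trim both reads to the inferred insert size; leave them alone otherwise."""
    size = infer_insert_size(read1, read2, **kwargs)
    if size is None:
        return read1, read2
    return read1[:size], read2[:size]


if __name__ == "__main__":
    insert = "ACGTTGCAAGGCTTAACCGGTATC"
    # Only 4 bp of (made-up) adapter at each 3' end: too short to match by sequence.
    r1 = insert + "AGAT"
    r2 = reverse_complement(insert) + "CTGT"
    print(trim_pair_by_overlap(r1, r2))
```

The point of the sketch is that the adapter is never matched directly, so even a 1-4 bp adapter remnant can be removed as long as the two reads overlap convincingly.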

**bastianwur** · 11-12-2014, 01:07 AM

Just trim off the ends.
It's probably less of a headache than trying to figure out the problem.

For the high GC at the end: in general, the longer reads seem to have a higher chance of ending in GC rather than AT.
So if your reads are of unequal length, you'll just get an increase in GC content at the end, because the AT-rich ends are more likely to have been removed.
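One way to check this on your own data is to tabulate GC fraction per position together with the number of reads that still cover that position. The Python sketch below is only an illustration (the function name is made up, and it takes plain sequence strings rather than parsing FASTQ); it shows that after trimming, the last positions are computed from fewer and fewer reads.

```python
from collections import Counter


def gc_fraction_by_position(reads):
    """Return a list of (gc_fraction, n_reads) tuples, one per read position.

    n_reads is the number of reads long enough to cover that position, which
    shrinks toward the 3' end when reads have been trimmed to unequal lengths.
    """
    gc = Counter()
    total = Counter()
    for read in reads:
        for i, base in enumerate(read.upper()):
            total[i] += 1
            if base in "GC":
                gc[i] += 1
    if not total:
        return []
    return [(gc[i] / total[i], total[i]) for i in range(max(total) + 1)]


if __name__ == "__main__":
    reads = ["ACGT", "GG", "ACGTGC"]
    for pos, (frac, n) in enumerate(gc_fraction_by_position(reads), start=1):
        print(f"position {pos}: GC={frac:.2f} from {n} read(s)")
```

If the read count drops sharply over the same positions where GC rises, the shift is more likely a side effect of which reads survived trimming than a property of the library.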

**avo** · 11-12-2014, 03:56 AM

Originally posted by ysnapus

I ran trimmomatic with
PE -phred33 ILLUMINACLIP:TruSeq2-PE.fa:2:20:7:2 LEADING:13 TRAILING:13 SLIDINGWINDOW:4:15 MINLEN:36

I agree with Brian. Are you sure it is a TruSeq2 library? We often see this kind of sequence content plot for Nextera libraries. In that case you should just use the NexteraPE-PE.fa adapter file.

**kmcarr** · 11-12-2014, 08:25 AM

Originally posted by avo

I agree with Brian. Are you sure it is a TruSeq2 library? We often see this kind of sequence content plot for Nextera libraries. In that case you should just use the NexteraPE-PE.fa adapter file.

It definitely looks like a TruSeq (or other mechanically fragmented) library to me. Nextera (tagmentation-fragmented) libraries have a very distinct and more exaggerated base composition bias at the 5' end. TruSeq or other libraries in which the input DNA is fragmented in a Covaris still show a slight bias in their 5' base composition, because base composition influences sensitivity to fragmentation.


MiSeq gDNA reads still fail "Kmer content" and "per base seq content" after trimming