
**byou678** · 08-23-2011, 10:23 AM

You are right, simonandrews. The size distributions of the two libraries after adaptor trimming are significantly different. The "Basic Statistics" output also shows the difference, as below. Attached is the "Sequence Length Distribution" plot for 5_1.

| Measure | Value |
| --- | --- |
| Filename | 5.1_adaptortrim.fastq |
| File type | Conventional base calls |
| Encoding | Illumina 1.5 |
| Total Sequences | 33183607 |
| Sequence length | 8-76 |
| %GC | 44 |

| Measure | Value |
| --- | --- |
| Filename | 5.2_adaptortrim.fastq |
| File type | Conventional base calls |
| Encoding | Illumina 1.5 |
| Total Sequences | 33183607 |
| Sequence length | 0-76 |
| %GC | 42 |
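For anyone who wants numbers rather than the plot, here is a minimal Python sketch that tallies read lengths from a FASTQ stream. It assumes plain (uncompressed) FASTQ with exactly four lines per record; the function names `read_lengths` and `summarize` are made up for this illustration and are not part of FastQC or cutadapt.

```python
from collections import Counter
import io


def read_lengths(handle):
    """Count read lengths in a FASTQ stream (assumes 4 lines per record)."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each record is the sequence
            counts[len(line.rstrip("\r\n"))] += 1
    return counts


def summarize(counts):
    """Return total reads plus min, max and mean read length."""
    total = sum(counts.values())
    mean = sum(length * n for length, n in counts.items()) / total if total else 0.0
    return {
        "total": total,
        "min": min(counts, default=0),
        "max": max(counts, default=0),
        "mean": mean,
    }


# Tiny in-memory example; the third record is a read trimmed to length 0.
demo = "@r1\nACGT\n+\nIIII\n@r2\nAC\n+\nII\n@r3\n\n+\n\n"
demo_counts = read_lengths(io.StringIO(demo))
demo_summary = summarize(demo_counts)
print(demo_summary)
```

To compare the two libraries, one could pass an open file handle for each trimmed file (for example `open("5.1_adaptortrim.fastq")`) to `read_lengths` and compare the resulting summaries or the full counters.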

Originally posted by simonandrews

So the problem is that you have a sequence in your library which isn't one of the adapters you passed to cutadapt. I can't immediately see where it's come from, but since cutadapt didn't know about it, it didn't remove it, and your trimmed library is still biased. I'd suspect that if you looked at the size distribution of your two libraries after trimming, you'd see that one has been trimmed significantly more than the other.

You need to figure out as much of this mystery sequence as you can (either by finding the sequence in one of your primers or by looking at some of your sequences and seeing where the common sequence at the end stops). You can then pass this as an extra sequence to cutadapt which can remove it from your library.
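One rough way to look for a common sequence at the ends of reads is to count the most frequent 3' suffixes. The sketch below is only an illustration under a few assumptions: plain four-line FASTQ, a fixed suffix length `k`, and a hypothetical helper name `common_suffixes`. A suffix that is far more frequent than the rest is a candidate to inspect further, not proof of contamination.

```python
from collections import Counter
import io


def common_suffixes(handle, k=12, top=5):
    """Return the `top` most frequent length-k 3' suffixes in a FASTQ stream."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # sequence line of each 4-line record
            seq = line.rstrip("\r\n")
            if len(seq) >= k:  # skip reads shorter than the suffix length
                counts[seq[-k:]] += 1
    return counts.most_common(top)


# Tiny in-memory example: two of the three reads share the suffix GGGCCC.
demo = (
    "@r1\nAAAATTTGGGCCC\n+\nIIIIIIIIIIIII\n"
    "@r2\nCCCCTTTGGGCCC\n+\nIIIIIIIIIIIII\n"
    "@r3\nACGTACGTACGTA\n+\nIIIIIIIIIIIII\n"
)
top_suffixes = common_suffixes(io.StringIO(demo), k=6, top=1)
print(top_suffixes)
```

Running this on the untrimmed and trimmed files with a few values of `k` can help show where the shared sequence begins; the candidate can then be given to cutadapt as an additional adapter sequence.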

Attached Files

sequence_length_distribution.png (24.3 KB, 60 views)

**byou678** · 08-23-2011, 10:36 AM

Thanks a lot, simonandrews. We can see that 5_2 was trimmed more than 5_1. Does that mean 5_1 has the mystery (or contaminating) sequence, which wasn't removed during adapter trimming, and 5_2 doesn't have it? If so, for 5_1 we need to identify the sequence and add it to the cutadapt script to remove it. In addition, how could I explain the cause and the solution in simple words to my boss? I look forward to any response.

Attached is the "Sequence Length Distribution" plot for 5_2.

Attached Files

sequence_length_distribution.png (23.8 KB, 41 views)

**GenoMax** · 08-23-2011, 11:29 AM

See the discussion here: http://bioinfo-core.org/index.php/9t...8_October_2010 (4th figure specifically).

This paper may be useful: http://nar.oxfordjournals.org/content/38/12/e131.full

Are your alignments with bwa looking ok?

Originally posted by byou678

Yes, align using BWA. Could you explain your second question in detail? Thanks for your reply.

**byou678** · 08-23-2011, 01:53 PM

Thanks for the info you offered. The last step of the bwa alignment isn't running smoothly: it has taken an unexpectedly long time and is still running now.

Could you take another look at the posts above? Any more ideas would be greatly appreciated!
