SEQanswers

Old 09-06-2012, 09:53 AM   #1
cwzkevin
Member
 
Location: earth

Join Date: Mar 2012
Posts: 13
Question Should I trim these MiSeq data for de novo assembly?

Hi there,

I have three 151 bp paired-end genomic MiSeq datasets for de novo assembly (Velvet). Below are their FastQC quality plots (a_1, a_2, b_1, b_2, c_1, c_2).
1. Should I trim them, or do you think they are fine as they are?
2. If I should trim them, do I trim /1 to 150 bases (removing the last base) or to 149 bases (removing the last two bases)? And what about /2?
Thanks in advance!

a_1 and a_2
a_1.png

a_2.png

Last edited by cwzkevin; 09-06-2012 at 11:07 AM. Reason: image shows now
Old 09-06-2012, 10:09 AM   #2
cwzkevin
Member
 
Location: earth

Join Date: Mar 2012
Posts: 13

b_1 and b_2
b_1.png

b_2.png


Last edited by cwzkevin; 09-06-2012 at 11:08 AM.
Old 09-06-2012, 10:10 AM   #3
cwzkevin
Member
 
Location: earth

Join Date: Mar 2012
Posts: 13

c_1 and c_2
c_1.png

c_2.png

Last edited by cwzkevin; 09-06-2012 at 11:08 AM.
Old 09-06-2012, 10:51 AM   #4
Wallysb01
Senior Member
 
Location: San Francisco, CA

Join Date: Feb 2011
Posts: 286

Trimming can be both good and bad. It is probably a good idea to trim off really low-quality bases (i.e. quality score < 10). If nothing else, it will make things computationally easier. Generally, the trade-off between more sequence and higher-quality sequence evens out in terms of assembly quality; keeping more sequence just means you'll need more RAM and more CPU time to get the job done.
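
To make the "trim off bases below Q10" idea concrete, here is a minimal sketch of 3'-end quality trimming for a single read. It is only an illustration, not what any particular trimming tool does: the function name `trim_low_quality_tail` is made up for this example, and the Phred+33 quality encoding (`offset=33`) is an assumption you should check against your own FASTQ files.

```python
def trim_low_quality_tail(seq, qual, min_q=10, offset=33):
    """Drop trailing bases whose Phred score is below min_q.

    seq    -- read sequence, e.g. "ACGT..."
    qual   -- FASTQ quality string of the same length
    offset -- ASCII offset of the quality encoding (assumed Phred+33 here)
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = len(seq)
    # Walk back from the 3' end while the base quality is below the cutoff.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Example: the last two bases have quality '#' (Q2 under Phred+33).
trimmed_seq, trimmed_qual = trim_low_quality_tail("ACGTACGT", "IIIIII##")
print(trimmed_seq, trimmed_qual)
```

For paired-end data, the /1 and /2 reads would each be trimmed on their own, but the two files need to stay in the same read order afterwards so the assembler can still pair them up.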

However, you should give some thought to how you do your assembly. Assemblies with longer k-mers require higher-quality data, because the chance that a k-mer contains a sequencing error (and so shows up as a spurious unique k-mer) increases with k. So you could try lower values of k with more lenient quality trimming, and higher values of k with stricter quality trimming. But be mindful of how much sequence you're losing.
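
A rough back-of-the-envelope way to see why larger k needs cleaner data: if every base had the same independent error rate e (a simplifying assumption, since real error rates rise toward the 3' end of the read), the fraction of k-mers with no error at all would be (1 - e)^k. The sketch below just evaluates that expression; the function name and the 1% error rate are illustrative choices, not measured values.

```python
def error_free_kmer_fraction(per_base_error, k):
    """Expected fraction of k-mers containing no sequencing error,
    assuming independent, uniform per-base errors (a simplification)."""
    return (1.0 - per_base_error) ** k


# Compare a few k values at an assumed 1% per-base error rate.
for k in (21, 31, 61, 99):
    print(k, round(error_free_kmer_fraction(0.01, k), 3))
```

The fraction shrinks as k grows, which is the reason stricter trimming pairs naturally with larger k.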

One thing you can do is inspect your k-mer coverage distribution for different combinations of k and quality cutoff. You can do this fairly quickly by running Jellyfish and plotting the histogram file it outputs. The basic point is that you want a nice, big coverage peak out around 20x or greater.
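
If you would rather pull the peak out of the histogram than eyeball a plot, something like the sketch below could work. It assumes the histogram is plain text with two whitespace-separated columns per line (k-mer multiplicity, then number of distinct k-mers with that multiplicity), which is how I understand the output of Jellyfish's `histo` subcommand; check your own file first. The function names and the toy numbers are invented for the example.

```python
def parse_histogram(lines):
    """Parse 'multiplicity count' lines into a list of (int, int) tuples."""
    hist = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        hist.append((int(parts[0]), int(parts[1])))
    return hist


def coverage_peak(hist):
    """Return the multiplicity of the main coverage peak, or None.

    Low-multiplicity k-mers are mostly sequencing errors, so the counts
    first fall steeply. We find the valley where they start rising again
    and take the highest count from there onwards.
    """
    for i in range(1, len(hist)):
        if hist[i][1] > hist[i - 1][1]:
            valley = i - 1
            break
    else:
        return None  # counts never rise again: no distinct peak
    return max(hist[valley:], key=lambda mc: mc[1])[0]


# Toy histogram, made up for illustration only.
toy = ["1 1000", "2 300", "3 100", "4 150", "5 400", "6 900", "7 500"]
print(coverage_peak(parse_histogram(toy)))
```

Running this for each combination of k and quality cutoff gives a quick number to compare against the "20x or greater" rule of thumb. A result of `None` means the distribution never recovers from the error tail, which suggests k is too large or the coverage too low for that setting.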
Tags
fastqc, quality, trim
