SEQanswers

Old 10-05-2016, 07:31 PM   #1
SDPA_Pet
Senior Member
 
Location: US

Join Date: Apr 2013
Posts: 222
Trimming Illumina data

Hello, does anyone know what length cutoff to use when trimming? I have Illumina data from 2x150 bp and 2x250 bp sequencing that I'd like to trim. I set the Q score cutoff to 30. What about length? What cutoff do you normally use?

For example, for 2x150 bp data, should I discard all reads shorter than 75 bp or 100 bp? What about 2x250 bp? I just want to know the cutoff people usually use. A Phred score >30 is normally used, but I am not sure about the length.

Thanks
Old 10-06-2016, 12:25 AM   #2
Persistent LABS
Member
 
Location: Pune, India

Join Date: Apr 2016
Posts: 20

Hi SDPA_Pet,
The length cutoff depends on how much you gain when aligning the reads to the genome [if your experiment is not a de novo assembly]. Shorter reads increase the chance of alignments at multiple loci, which might not help you.
You can refer to this publication: An Extensive Evaluation of Read Trimming Effects on Illumina NGS Data Analysis [http://journals.plos.org/plosone/art....pone.0085024]. The authors used 70% of the original read length as the length cutoff.
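
As a rough illustration of that 70% rule, here is a minimal Python sketch. The function name `min_length_cutoff` and its `percent` parameter are made up for this example; they do not come from the paper or from any trimming tool.

```python
def min_length_cutoff(read_length, percent=70):
    """Return the shortest read length to keep, as a percentage of the original length.

    Hypothetical helper: uses integer arithmetic and rounds up,
    so the result is always a whole number of bases.
    """
    return (read_length * percent + 99) // 100


for original in (150, 250):
    print(original, "->", min_length_cutoff(original))
# 150 -> 105
# 250 -> 175
```

Under this rule, 2x150 bp reads would be kept at 105 bp or longer and 2x250 bp reads at 175 bp or longer.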
__________________
Persistent LABS
Old 10-06-2016, 04:23 AM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,992

Q30 is an unnecessarily high cutoff if all you are doing is aligning to a reference. You could omit quality trimming altogether in that case.

If you are going to do de novo assembly, then you may want to trim at Q20-25.

If you see a large amount of data being trimmed, there may be an issue with your data that you should investigate further.
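
To make the Q20 idea concrete, here is a minimal Python sketch of 3' end quality trimming. It assumes Phred+33 quality strings (the encoding used by Illumina 1.8+), and `trim_3prime` is a hypothetical helper written for this post. It simply drops trailing bases below the threshold, which is simpler than what dedicated trimming tools do.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q.

    seq    -- read sequence
    qual   -- quality string of the same length (assumed Phred+33)
    offset -- ASCII offset of the quality encoding (assumption: 33)
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# 'I' encodes Q40 and '#' encodes Q2 in Phred+33
print(trim_3prime("ACGTACGT", "IIIII###"))
# ('ACGTA', 'IIIII')
```

Counting how many reads end up much shorter after a step like this is one way to spot the "large amount of data getting trimmed" situation.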
Old 10-06-2016, 06:38 AM   #4
SDPA_Pet
Senior Member
 
Location: US

Join Date: Apr 2013
Posts: 222

Hi guys,

Yes, I am doing de novo metagenome assembly. Should I still use a length cutoff of 70%? For example, when trimming 150 bp sequencing data, I would discard all sequences shorter than 105 bp.
Thanks.
Old 10-06-2016, 09:01 AM   #5
Persistent LABS
Member
 
Location: Pune, India

Join Date: Apr 2016
Posts: 20

The longer the reads, the better the assembly will be. Of course, you will lose some reads with a high length cutoff. The important point is how many you lose. For example, if you lose only 5% of reads, I think you are good to go ahead. Researchers have even used 20-30 nt sequences to create draft genome assemblies [http://bib.oxfordjournals.org/content/11/5/457.full].
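
A quick way to check how much you would lose is sketched below in Python. The `fraction_lost` helper and the example read lengths are invented for illustration; in practice the lengths would come from your own trimmed reads.

```python
def fraction_lost(read_lengths, min_length):
    """Return the fraction of reads shorter than min_length (0.0 for no reads)."""
    if not read_lengths:
        return 0.0
    dropped = sum(1 for n in read_lengths if n < min_length)
    return dropped / len(read_lengths)


# Made-up post-trimming lengths for ten 150 bp reads
lengths = [150, 150, 148, 131, 150, 97, 150, 142, 150, 150]
print(f"{fraction_lost(lengths, 105):.0%} of reads lost")
# 10% of reads lost
```

Running this for a few candidate cutoffs shows where the loss starts to climb.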
__________________
Persistent LABS