SEQanswers

Old 09-20-2011, 08:27 PM   #1
samanta
Senior Member
 
Location: Seattle

Join Date: Feb 2010
Posts: 109
Too many short reads and too little RAM?

Someone asked me whether it makes sense to remove duplicate reads to shrink the library enough to fit within the available RAM. I think it is a bad strategy, as explained here:

http://www.homolog.us/blogs/2011/09/...n-k-mer-world/
__________________
http://homolog.us
Old 09-20-2011, 11:21 PM   #2
zhidkov.ilia
Member
 
Location: Israel

Join Date: Dec 2010
Posts: 25

I think duplicated reads are removed to avoid biases introduced during, for example, library preparation, not to reduce the amount of data for de novo assembly.

Ilia
Old 09-20-2011, 11:35 PM   #3
samanta
Senior Member
 
Location: Seattle

Join Date: Feb 2010
Posts: 109

That's a good point. Some filtering is necessary to handle pileups of reads caused by such biases. I do that for alignment and SNP discovery, but I think twice about it for de novo assembly. If the underlying genome is not known, it is hard to tell whether the duplicated reads come from errors or from real sequence.
__________________
http://homolog.us
Old 09-21-2011, 12:28 AM   #4
zhidkov.ilia
Member
 
Location: Israel

Join Date: Dec 2010
Posts: 25

When you assemble reads into contigs, you want at least several reads to support the assembly. If you have identical reads, you might obtain false contigs.

Ilia
Old 09-21-2011, 02:38 PM   #5
samanta
Senior Member
 
Location: Seattle

Join Date: Feb 2010
Posts: 109

It does not work that way for a k-mer based assembler. Would you please explain your rationale? Why would one get false contigs?
__________________
http://homolog.us
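
A toy illustration of the k-mer point above, as a minimal sketch and not the method of any particular assembler: in a k-mer based representation, an exact duplicate read adds no new k-mers, it only raises the counts of k-mers that are already present. The reads, the value of k, and the function name `kmer_counts` are all made up for this example.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

# Hypothetical toy data: the third read is an exact duplicate of the first.
reads = ["ACGTACGA", "CGTACGAT", "ACGTACGA"]
deduped = list(dict.fromkeys(reads))  # drop exact duplicates, keep first-seen order

with_dups = kmer_counts(reads, k=4)
without_dups = kmer_counts(deduped, k=4)

# The set of distinct k-mers is the same either way; only multiplicities differ.
same_kmer_set = set(with_dups) == set(without_dups)
print(same_kmer_set, len(with_dups), len(without_dups))
print(with_dups["ACGT"], without_dups["ACGT"])
```

If an assembler's memory use is driven mainly by the number of distinct k-mers, which is an assumption about the tool in question, then removing exact duplicates would change the counts but not the size of that set.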
Old 09-22-2011, 05:48 AM   #6
zhidkov.ilia
Member
 
Location: Israel

Join Date: Dec 2010
Posts: 25

Let me rephrase my last comment:
If duplicated reads do not contribute to the downstream de novo assembly pipeline, it would be a good idea to remove them.

Ilia
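
For readers who want to see how much data such a step would actually remove, here is a rough sketch of exact-duplicate filtering on read sequences. It is an illustration under simplifying assumptions, not a recommended pipeline: it treats reads as plain strings, ignores quality scores, read pairing and reverse complements, and the name `dedupe_reads` is hypothetical.

```python
from collections import Counter

def dedupe_reads(reads):
    """Return (unique reads in first-seen order, copies seen per sequence)."""
    copies = Counter(reads)
    unique = list(copies)  # Counter keeps insertion order on Python 3.7+
    return unique, copies

# Hypothetical toy data: one sequence appears three times.
reads = ["ACGTACGA", "CGTACGAT", "ACGTACGA", "ACGTACGA"]
unique, copies = dedupe_reads(reads)
removed = len(reads) - len(unique)

print(unique)
print(copies.most_common(1))
print(removed)
```

Keeping the per-sequence copy counts alongside the unique reads makes it possible to check later whether the removed duplicates carried coverage information that a downstream step would have used.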