SEQanswers

Old 07-02-2013, 11:43 AM   #1
gene_x
Senior Member
 
Location: MO

Join Date: May 2010
Posts: 108
read length impact on alignment

I recently ran into a problem with my ChIP-seq data: it has a low alignment rate to the mm9 genome. It turned out that a large number of the reads contain adapter sequences. We haven't been applying adapter trimming in our analysis, since we get a decent alignment rate most of the time. But in this case I do need to trim the adapters. I have tried cutadapt and it seems to work well.

My question is how to control the length of the reads after adapter trimming. Does it make a big difference if I don't control anything (i.e. keep everything that is left after trimming adapters)?

Thanks.
Old 07-02-2013, 11:50 AM   #2
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Shouldn't make much of a difference since you are mapping to the known mm9 genome. I'd probably toss everything below 20 bases but that is a personal preference -- I'd rather not think about them mapping repetitively.
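
For anyone wanting to apply a cutoff like this: cutadapt itself has a `-m` / `--minimum-length` option that discards reads shorter than the given length after trimming, so in practice no extra step should be needed. Purely as an illustration of what such a filter does, here is a minimal Python sketch. The function names `parse_fastq` and `filter_by_length` are made up for this example, it assumes plain four-line FASTQ records, and the 20-base default is just the personal preference mentioned above, not a derived threshold.

```python
from typing import Iterable, Iterator, Tuple

# (header, sequence, separator, qualities)
FastqRecord = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records.

    Assumes unwrapped, well-formed FASTQ; a trailing incomplete
    record is silently dropped.
    """
    stripped = (line.rstrip("\n") for line in lines)
    # Passing the same generator four times makes zip consume the
    # input in consecutive groups of four lines.
    for header, seq, sep, qual in zip(*[stripped] * 4):
        yield header, seq, sep, qual


def filter_by_length(
    records: Iterable[FastqRecord], min_length: int = 20
) -> Iterator[FastqRecord]:
    """Keep only records whose sequence has at least min_length bases."""
    for record in records:
        if len(record[1]) >= min_length:
            yield record


# Toy input: one 24-base read and one 8-base read (e.g. mostly adapter).
example = [
    "@read1", "ACGTACGTACGTACGTACGTACGT", "+", "IIIIIIIIIIIIIIIIIIIIIIII",
    "@read2", "ACGTACGT", "+", "IIIIIIII",
]
kept = list(filter_by_length(parse_fastq(example), min_length=20))
for header, seq, _, _ in kept:
    print(header, len(seq))
```

With a real file you would pass an open file handle to `parse_fastq` instead of the toy list; lowering or raising `min_length` lets you see how many reads a given cutoff would discard.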
Old 07-02-2013, 01:43 PM   #3
gene_x
Senior Member
 
Location: MO

Join Date: May 2010
Posts: 108

Quote:
Originally Posted by westerman View Post
Shouldn't make much of a difference since you are mapping to the known mm9 genome. I'd probably toss everything below 20 bases but that is a personal preference -- I'd rather not think about them mapping repetitively.
Yeah, that's what I thought. Is 20 bases based on some estimation?
Old 07-04-2013, 04:57 AM   #4
hanshart
Member
 
Location: Germany

Join Date: Nov 2011
Posts: 27

Quote:
Originally Posted by gene_x View Post
Yeah, that's what I thought. Is 20 bases based on some estimation?
20 bases sounds OK for mouse, but even then I would stick to "uniquely" mappable reads: for a genome as large as mouse, many 20 bp reads can still be mapped to several valid positions. Have a look at this post http://blog.kokocinski.net/index.php...ability?blog=2 and the papers and online tools mentioned there for estimating a sufficient read-length threshold. Of course, you can allow multi-reads so that you depend less on the repeat structure of mm9, but then your mapping is not as reliable as it is without multi-reads (IMHO).
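
To make the uniqueness point concrete, here is a toy Python sketch that measures what fraction of positions in a sequence start a k-mer that occurs only once. It is only an illustration under strong assumptions: the function name `unique_kmer_fraction` and the toy sequence are invented for this example, it looks at the forward strand only, and it allows no mismatches. Real mappability estimates for mm9 come from dedicated tools like the ones discussed in the linked post, not from something this simple.

```python
from collections import Counter


def unique_kmer_fraction(reference: str, k: int) -> float:
    """Fraction of start positions whose k-mer occurs exactly once.

    Toy model of mappability: exact matches, forward strand only.
    """
    n_positions = len(reference) - k + 1
    if k <= 0 or n_positions <= 0:
        raise ValueError("k must be between 1 and len(reference)")
    kmers = [reference[i:i + k] for i in range(n_positions)]
    counts = Counter(kmers)
    unique_positions = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique_positions / n_positions


# Hypothetical toy "genome" containing a repeated block.
toy_reference = "ACGTTGCA" * 3 + "GGATCCTAGCTAGGCTA"
for k in (4, 8, 12, 16):
    print(k, round(unique_kmer_fraction(toy_reference, k), 3))
```

Reads that fall entirely inside the repeated block stay ambiguous no matter how the rest of the sequence looks, which is the same reason short reads from repetitive parts of a large genome end up as multi-reads.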
Tags
adapter trimming
