SEQanswers

07-26-2012, 12:42 AM   #1
SEQond

Mask x number of bases WITHIN sequence prior to alignment

Message moved to correct section
http://seqanswers.com/forums/showthread.php?t=22014

Hi all,

As you can see from the picture, this is the QC for all R2 reads of my paired-end samples. Due to a technical error during sequencing, I have ended up with 30+ R2 reads with serious errors in the middle of the sequence.

Do you know of any way to mask (or allow mismatches at) a specific number of bases (2-3) at a specific position WITHIN the read prior to alignment? Biostrings is an option, but I would prefer not to use it for reasons of speed.

Can the approach you propose be applied selectively to only one of the two reads in paired-end samples?

It would be ideal if this could be applied directly in Bowtie, like the left/right trimming that already exists as a built-in option.




Last edited by SEQond; 07-27-2012 at 05:45 AM. Reason: moved to correct section
07-26-2012, 08:45 AM   #2
HESmith

ELAND should be able to mask these bases during alignment, since the initial error-free segment is longer than the seed. Include USE_BASES Y41n3Y*n in the config file to mask bases 42-44 (plus the terminal base, which is the default). The one caveat is if the error results from fluidics/chemistry issues, then the phasing after the error may be incorrect.
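
To make the mask string concrete, here is a minimal Python sketch of how Y41n3Y*n reads for a 100-base read, as described above. The function name expand_mask and its parsing rules are assumptions for illustration only, not ELAND's actual parser: it treats Y as "use", n as "ignore", a number as a repeat count, and * as "fill the remaining length".

```python
import re


def expand_mask(mask, read_length):
    """Expand a USE_BASES-style string into one 'Y'/'n' flag per base.

    Hypothetical helper for illustration; not part of ELAND.
    """
    # Each token is a symbol (Y or n) followed by an optional count or '*'.
    tokens = re.findall(r"([Yn])(\d+|\*)?", mask)
    # Bases consumed by tokens with an explicit (or implicit single) count.
    fixed = sum(int(count) if count else 1
                for _, count in tokens if count != "*")
    stars = sum(1 for _, count in tokens if count == "*")
    fill = read_length - fixed
    if stars > 1 or fill < 0 or (stars == 0 and fill != 0):
        raise ValueError("mask does not fit the read length")
    flags = []
    for symbol, count in tokens:
        if count == "*":
            repeat = fill          # '*' absorbs whatever length is left
        elif count:
            repeat = int(count)
        else:
            repeat = 1             # bare symbol means a single base
        flags.extend(symbol * repeat)
    return flags


flags = expand_mask("Y41n3Y*n", 100)
# 1-based positions that would be ignored during alignment
masked = [i + 1 for i, flag in enumerate(flags) if flag == "n"]
print(masked)
```

For a 100-base read this lists bases 42-44 plus the terminal base, matching the description above.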

If this doesn't work or you prefer an alternative aligner, you could convert the reads into pseudo-paired end data. Use bases 1-41 as read one, then use bases 45-100 as read two. Filter the aligned data on expected criteria (i.e., both reads map to same chromosome and orientation, position of read 2 = read 1 + 44 [or + 41-44 if phasing is off]).
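
The pseudo-paired approach could be sketched in Python roughly as follows, assuming 100-base reads, plain sequence/quality strings, and alignment hits reduced to chromosome, 1-based leftmost position, and strand. The names split_read and is_consistent_pair are hypothetical. The offset rule is applied exactly as stated above, which corresponds to forward-strand hits; reverse-strand hits would need the mirrored rule and are not handled here.

```python
def split_read(seq, qual, left_end=41, right_start=45):
    """Split one read into two segments, dropping the bases in between.

    With the defaults, bases 1-41 become segment one and bases 45-end
    become segment two (1-based, inclusive), so bases 42-44 are discarded.
    """
    first = (seq[:left_end], qual[:left_end])
    second = (seq[right_start - 1:], qual[right_start - 1:])
    return first, second


def is_consistent_pair(chrom1, pos1, strand1, chrom2, pos2, strand2,
                       offsets=range(41, 45)):
    """Check the filter criteria from the post for one pair of hits.

    Both segments must map to the same chromosome and orientation, and
    segment two must start 44 bases after segment one (41-44 is allowed
    in case phasing is off). Forward-strand rule only (assumption).
    """
    return (chrom1 == chrom2
            and strand1 == strand2
            and (pos2 - pos1) in offsets)


# Toy read: 41 good bases, 3 bad bases, 56 good bases.
seq = "A" * 41 + "NNN" + "C" * 56
qual = "I" * 100
(seq1, qual1), (seq2, qual2) = split_read(seq, qual)
print(len(seq1), len(seq2))
print(is_consistent_pair("chr1", 1000, "+", "chr1", 1044, "+"))
```

The two segments would be written to separate FASTQ files with matching read names and aligned as if they were mates; the filter is then applied to the aligner's output.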
07-27-2012, 03:56 AM   #3
SEQond

Quote:
Originally Posted by HESmith
ELAND should be able to mask these bases during alignment, since the initial error-free segment is longer than the seed. Include USE_BASES Y41n3Y*n in the config file to mask bases 42-44 (plus the terminal base, which is the default). The one caveat is if the error results from fluidics/chemistry issues, then the phasing after the error may be incorrect.

If this doesn't work or you prefer an alternative aligner, you could convert the reads into pseudo-paired end data. Use bases 1-41 as read one, then use bases 45-100 as read two. Filter the aligned data on expected criteria (i.e., both reads map to same chromosome and orientation, position of read 2 = read 1 + 44 [or + 41-44 if phasing is off]).

Can ELAND align paired-end sequences in this way while selectively masking bases in only one of the two reads?

To be honest, I would prefer a BWT-based aligner (Bowtie, BWA, or SOAP2).
Thanks for your answer.
08-02-2012, 05:16 AM   #4
SEQond

Possibly Bowtie 2 is the answer to this issue.

also look here or here

Last edited by SEQond; 08-13-2012 at 08:18 AM.