04-10-2013, 05:57 AM   #1
Junior Member
Location: Germany

Join Date: Apr 2013
Posts: 1
Novoalign repeat strategy for SNP analysis

Novoalign offers the following strategies to deal with multireads (reads mapping to multiple locations):

None: No alignments will be reported. The read will be reported with status R and a count of the number of alignments. No alignment locations will be reported.

Random: A single alignment location is randomly chosen from amongst the alignment results. The choice is made using posterior alignment probabilities.

All: All alignment locations are reported. Note that this means all alignments with a score within 5 points of the best alignment, unless you use the -R 99 option to extend the range.

Exhaustive: This option bypasses the iterative alignment process and the normal repeat alignment detection. It finds all alignments with a score no worse than the threshold (-t 99 option) and reports all the locations.

0.99: Sets a posterior probability threshold. Any alignment with a posterior probability P(Ai | R, G) greater than this value will be reported. Example: -r 0.01 will report all alignments with a probability greater than 0.01.
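To make the posterior threshold concrete, here is a minimal Python sketch. It is not Novoalign's implementation: it assumes alignment scores are Phred-scaled (likelihood proportional to 10^(-score/10)) and that all candidate locations have equal priors. The function names and the example scores are made up for illustration.

```python
def posteriors_from_scores(scores):
    """Turn alignment scores into normalised posterior probabilities.

    Assumption (not taken from the Novoalign docs): each score is
    Phred-scaled, so the likelihood is proportional to 10 ** (-score / 10),
    and every candidate location has the same prior.
    """
    weights = [10 ** (-score / 10) for score in scores]
    total = sum(weights)
    return [weight / total for weight in weights]


def report_above_threshold(scores, threshold):
    """Return (index, posterior) pairs whose posterior exceeds the threshold."""
    posteriors = posteriors_from_scores(scores)
    return [(i, p) for i, p in enumerate(posteriors) if p > threshold]


# Hypothetical read with three candidate locations (scores are invented).
example_scores = [10, 15, 40]
for index, posterior in report_above_threshold(example_scores, 0.01):
    print(f"location {index}: posterior {posterior:.3f}")
```

Under these assumptions a threshold of 0.01 keeps the two close-scoring locations and drops the third, which is far less probable.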

Which of the options should be used for SNP detection?

If "None" is used then SNPs in some repetitive regions will be completely omitted.

Using the "All" option avoids this, but it can introduce false SNPs (if reads from slightly different repetitive regions are mapped to the same location).

The -R option restricts which alignments are treated as "identical", based on the difference in alignment score. "This score difference is set by the 'R99' option and defaults to 5 which corresponds to the best alignment being approximately 3 times more probable than the next best alignment. For example, two alignments with probabilities 0.7 (score 1) and 0.3 (score = 5) would be considered as multiple alignments to the read. Two alignments with probabilities 0.8 (Score 0) and 0.2 ( score 7) would be treated as a unique alignment to the location with the higher probability."
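The arithmetic behind the quoted passage can be sketched in Python. This assumes the scores are Phred-scaled, which is consistent with a 5-point difference being "approximately 3 times more probable" (10^0.5 is about 3.16). The function names are hypothetical, and whether a difference of exactly 5 counts as a repeat is an assumption here.

```python
def probability_ratio(score_difference):
    """How many times more probable the best alignment is than the next best,
    assuming Phred-scaled scores (an assumption, not a documented formula)."""
    return 10 ** (score_difference / 10)


def is_multiread(scores, max_difference=5):
    """True if the second-best score is within max_difference of the best.

    Lower scores are better. Treating a difference exactly equal to
    max_difference as a repeat is an assumption of this sketch.
    """
    ordered = sorted(scores)
    if len(ordered) < 2:
        return False
    return ordered[1] - ordered[0] <= max_difference


print(round(probability_ratio(5), 2))   # ratio implied by a 5-point difference
print(is_multiread([1, 5]))             # scores from the first quoted example
print(is_multiread([0, 7]))             # scores from the second quoted example
print(is_multiread([1, 5], max_difference=1))  # same read with a limit of 1
```

With the default limit of 5 the first quoted pair is a multiread and the second is unique. Lowering the limit to 1 in this sketch makes the first pair count as unique as well, which is the effect the "-R 1" question is about.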

So is using "All" with "-R 1" the best setting for SNP detection?

Last edited by hvm; 04-15-2013 at 03:38 AM.
04-12-2013, 02:54 PM   #2
NGS specialist
Location: Malaysia

Join Date: Apr 2008
Posts: 249


We typically want to filter out ambiguously mapped reads when calling variants. Novoalign assigns a mapping quality of zero in all of the cases above. Most of our users keep the default reporting strategy and filter out mapQ 0 reads before calling SNPs.
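As a rough illustration of the filtering step, the Python sketch below drops mapQ 0 records from SAM text. It relies only on the SAM format itself (MAPQ is the fifth tab-separated column, header lines start with "@"). The function name and the example records are invented.

```python
def filter_mapq_zero(sam_lines, min_mapq=1):
    """Yield SAM header lines and alignment records with MAPQ >= min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            # Skip records that lack the 11 mandatory SAM columns.
            continue
        if int(fields[4]) >= min_mapq:  # column 5 (index 4) is MAPQ
            yield line


# Invented example: one header line, one unique read, one ambiguous read.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tchr1\t200\t0\t4M\t*\t0\t0\tACGT\tIIII",
]
for kept in filter_mapq_zero(example):
    print(kept.split("\t")[0])
```

On BAM files the same filter is usually applied with samtools, for example `samtools view -b -q 1 in.bam > filtered.bam` (the file names are placeholders).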
zee