Thread | Thread Starter | Forum | Replies | Last Post |
Input files for Roche GSMapper | cow_girl | Bioinformatics | 4 | 11-11-2014 01:04 PM |
GSMapper trimming | Peitx | Bioinformatics | 6 | 10-10-2011 12:23 PM |
Roche gsMapper output exon contigs rather than full-length sequence? | sulicon | Bioinformatics | 0 | 02-28-2011 05:51 PM |
gsMapper contigs | haonmada | 454 Pyrosequencing | 1 | 01-22-2010 12:25 PM |
gsMapper issues | mjleaks | 454 Pyrosequencing | 1 | 05-12-2009 07:13 AM |
#1 |
Member
Location: London Join Date: Sep 2008
Posts: 58
Hello
Has anyone here ever changed the parameters used by gsMapper when mapping their read data to a reference genome? If so, can anyone elaborate on what "minimum overlap length" and "alignment identity score" mean?
Cheers
Layla
#2 |
Member
Location: Branford, Connecticut Join Date: Jan 2009
Posts: 32
I have not modified the default settings when running gsMapper.
The gsMapper algorithm is similar to other assembly software (e.g. phrap) in that it uses the concept of "overlap" between reads to obtain contigs. The difference is that 454 gsMapper works entirely in raw flow space, so I believe the scores and lengths are in flow space as well. For example, the minimum overlap length has a default value of 40 according to the manual. I believe 40 means 40 flows, not 40 bases; 40 flows is roughly 16 to 20 bp. You can play with the value, but I doubt you will see any real difference in the result.
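To get a feel for how flows relate to bases, here is a minimal sketch. It assumes the cyclic TACG flow order used in 454 pyrosequencing and that each flow incorporates a whole homopolymer run when the flowed nucleotide matches the next template base. The function name `bases_read` is made up for illustration; this says nothing about what gsMapper does internally or which unit its parameters use.

```python
import random


def bases_read(seq, n_flows, flow_order="TACG"):
    """Return how many bases of `seq` are read after `n_flows` nucleotide flows.

    Assumption: flows cycle through `flow_order`, and a matching flow
    incorporates the entire homopolymer run at the current position.
    """
    seq = seq.upper()
    i = 0  # index of the next unread base
    for f in range(n_flows):
        base = flow_order[f % len(flow_order)]
        # A matching flow consumes the whole run of identical bases.
        while i < len(seq) and seq[i] == base:
            i += 1
    return i


if __name__ == "__main__":
    # Illustrative only: random sequences, not real 454 reads.
    rng = random.Random(0)
    for _ in range(5):
        s = "".join(rng.choice("ACGT") for _ in range(200))
        print(bases_read(s, 40), "bases read in 40 flows")
```

Running it on a few random sequences gives a rough idea of how many bases 40 flows cover; real reads will vary with base composition and homopolymer content.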
#3 |
wiki wiki
Location: Cambridge, England Join Date: Jul 2008
Posts: 266
I don't think this is true. I think it's 40 bases, not 40 flows. IIRC (not that it's in the manual), flow space is only used when calling the consensus *after* the reads have been mapped (in sequence space).
I could be wrong. It's a shame it's not easy to find these things out. Also, I think these settings should have a big effect on the result. 'Seed size' is a trade-off between sensitivity and running time. The bigger the seed size, the shorter the running time, but the more 'nearly perfect' hits you will miss. The smaller the seed size, the higher the sensitivity, but at some point the specificity drops dramatically, so many false matches need to be inspected at later stages of the mapping.
Last edited by dan; 09-14-2009 at 12:44 AM. Reason: Responding to the second point too.
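The seed-size trade-off can be illustrated with a toy exact k-mer seeding sketch. This is a generic illustration, not gsMapper's actual seeding code; the reference string, the helper names `kmers` and `shared_seeds`, and the chosen seed sizes are all made up for the example.

```python
def kmers(seq, k):
    """Return the set of all distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_seeds(read, reference, k):
    """Count distinct k-mers of the read that occur exactly in the reference."""
    return len(kmers(read, k) & kmers(reference, k))


# Hypothetical 40-base reference and a 30-base read taken from it.
REFERENCE = "ACGTTGCAAGCTTAGGCATCGATCCGTAAGCTTGACCTGA"
READ = REFERENCE[5:35]
# Same read with a single mismatch in the middle (position 15: G -> T).
MUTATED = READ[:15] + "T" + READ[16:]

if __name__ == "__main__":
    for k in (8, 12, 16):
        print("seed size", k,
              "exact read hits:", shared_seeds(READ, REFERENCE, k),
              "mismatch read hits:", shared_seeds(MUTATED, REFERENCE, k))
```

With a large enough seed, every seed of the mutated read spans the mismatch and the read finds no seed hit at all, which is the 'nearly perfect hits you will miss' case. Shorter seeds still hit, at the price of more candidate positions to check on a real genome.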
#4 |
Member
Location: Nijmegen, Netherlands Join Date: Jun 2009
Posts: 22
We used some different values for "minimum length" and "minimum identity": -ml 90% -mi 96% to get more reliable variation detection in areas with lower coverage.
#5 |
Member
Location: Netherlands Join Date: Sep 2009
Posts: 18
Maybe this is silly, but I simply ran a BLAT analysis of the reads against a reference genome (which is really fast). That let me choose any cut-off I like, for alignment length as well as sensitivity (% homology). But this probably also depends on the specific requirements.
My 2 cents.
Alex
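As a rough sketch of that kind of post-filtering, the snippet below reads BLAT's tab-separated PSL output and keeps hits above a chosen coverage and identity. The function name `filter_psl` and the default thresholds are illustrative assumptions, and the identity here is a simple (matches + repMatches) / aligned-bases ratio, not BLAT's own percent-identity score.

```python
def filter_psl(lines, min_cov=0.90, min_ident=0.96):
    """Keep PSL hits whose query coverage and identity pass the cut-offs.

    Returns a list of (query_name, target_name, coverage, identity) tuples.
    The thresholds are illustrative; pick values that suit your data.
    """
    kept = []
    for line in lines:
        f = line.rstrip("\n").split("\t")
        # Skip PSL header lines and anything without the 21 standard columns.
        if len(f) < 21 or not f[0].isdigit():
            continue
        matches, mismatches, rep_matches = int(f[0]), int(f[1]), int(f[2])
        q_name, q_size = f[9], int(f[10])
        q_start, q_end = int(f[11]), int(f[12])
        t_name = f[13]
        aligned = matches + mismatches + rep_matches
        if aligned == 0 or q_size == 0:
            continue
        identity = (matches + rep_matches) / aligned
        coverage = (q_end - q_start) / q_size  # fraction of the read aligned
        if coverage >= min_cov and identity >= min_ident:
            kept.append((q_name, t_name, coverage, identity))
    return kept
```

It could be used along the lines of `filter_psl(open("reads.psl"))`, where `reads.psl` is a placeholder file name for your own BLAT output.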
#6 | |
Senior Member
Location: USA Join Date: Jan 2008
Posts: 482
Quote:
#7 |
Member
Location: Netherlands Join Date: Sep 2009
Posts: 18
I have to admit