SEQanswers

Old 04-23-2012, 10:59 AM   #1
oiiio
Senior Member
 
Location: USA

Join Date: Jan 2011
Posts: 105
Is this true?

I was going to put this in the 'literature watch' section, but decided to place it here because it's more of a question about alignment tools.

In the latest online release (April 20) of Genome Research there is "lobSTR: A short tandem repeat profiler for personal genomes". Some of the results in this paper are quite interesting...

I attached a screenshot of one of the tables, which compares lobSTR to other popular read aligners (100 bp Illumina reads).

In the column titled 'indel tolerance(bp)', only BLAT is capable of going past 7 bp indels? Is this true? And I'm assuming that the bowtie in the comparison was not bowtie2.
Attached Images
File Type: png Screen Shot 2012-04-23 at 1.41.02 PM.png (34.0 KB, 60 views)
Old 04-23-2012, 12:02 PM   #2
oiiio
Senior Member
 
Location: USA

Join Date: Jan 2011
Posts: 105
Here is a link to the full paper:

http://genome.cshlp.org/content/earl...7-1fc846d22e95
Old 04-23-2012, 01:25 PM   #3
xied75
Senior Member
 
Location: Oxford

Join Date: Feb 2012
Posts: 129

The numbers are really interesting. I can't speak for the other tools, but with BWA I can align 15.8 million human paired-end 90 bp reads in 576 seconds of real time using 30 threads (-t 30; total CPU time 12,000 seconds). The paper's time is a bit slow; does it include BWA SAMSE/SAMPE as well?

In BWA ALN, for reads between 93 and 124 bp, the default maxdiff is 5; that's the gap number you saw in the table.
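
For anyone who wants to compare these timings against the table, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (15.8 million reads, 576 s wall time, 12,000 s CPU time, 30 threads); the variable names are made up for illustration, and "thread utilization" is just CPU time divided by (wall time x threads), not something BWA reports.

```python
# Figures quoted in the post above; variable names are illustrative only.
reads = 15.8e6        # human paired-end 90 bp reads
wall_seconds = 576    # real (wall-clock) time with -t 30
cpu_seconds = 12000   # total CPU time summed over all threads
threads = 30

# Throughput as seen by the user (wall clock) and per CPU-second,
# the latter being the fairer number to set against a single-thread benchmark.
reads_per_wall_second = reads / wall_seconds
reads_per_cpu_second = reads / cpu_seconds

# Fraction of the 30 threads that were busy on average.
thread_utilization = cpu_seconds / (wall_seconds * threads)

print(f"reads per wall-clock second: {reads_per_wall_second:,.0f}")
print(f"reads per CPU second:        {reads_per_cpu_second:,.0f}")
print(f"average thread utilization:  {thread_utilization:.1%}")
```

The per-CPU-second figure is the one to hold against a single-threaded run time, since the wall-clock number benefits from all 30 threads.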

Best,

dong
Old 04-23-2012, 01:50 PM   #4
zee
NGS specialist
 
Location: Malaysia

Join Date: Apr 2008
Posts: 249

A fairly old version of novoalign was used in the comparison. In that older version the gap extension penalty was 15, much higher than BWA's or Bowtie's 5. In our latest versions we have set it to 6, which is more comparable.
Novoalign will definitely pick up indels greater than 7 bp. I have generated indels with novoalign-dedup-Dindel that go as high as 40 bp.
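
To see why the extension penalty matters for indel tolerance, here is a minimal Python sketch of an affine gap penalty. The extension values 15 and 6 are the ones mentioned above; the gap-open value of 40 is only an assumption for illustration, and the formula (open + extend x length) is one common convention, not necessarily the exact scoring any particular aligner uses.

```python
def affine_gap_penalty(length, gap_open, gap_extend):
    """Penalty for a single gap of `length` bases under a simple affine model."""
    return gap_open + gap_extend * length

ASSUMED_GAP_OPEN = 40  # hypothetical value, chosen only for illustration

for indel_bp in (7, 40):
    old = affine_gap_penalty(indel_bp, ASSUMED_GAP_OPEN, gap_extend=15)
    new = affine_gap_penalty(indel_bp, ASSUMED_GAP_OPEN, gap_extend=6)
    print(f"{indel_bp:>2} bp indel: penalty {old} with extension 15, {new} with extension 6")
```

Under these assumptions the penalty for a long indel grows much faster with the older setting, so an alignment containing it is more likely to fall below whatever score threshold is in use.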

Also, on speed, it looks like they compared the single-threaded version of novoalign to their parallel version, and likewise for BWA. I have not read the whole paper, but I would think they should compare apples to apples wherever possible.

Regarding the attached table, I don't know what "noninformative reads" actually refers to, but I think the authors are showing that their tool is best because it finds 0 noninformative reads. On the flip side, lobSTR does not report the highest number of "informative" reads.
Old 04-23-2012, 01:52 PM   #5
dietmar13
Senior Member
 
Location: Vienna

Join Date: Mar 2010
Posts: 107
interesting, but

They should use a real competitor for their speed test, not the lame ducks:

http://gingeraslab.cshl.edu/STAR/

I've tested RUM, STAR, and TopHat on an RNA-seq data set, followed by DE analyses, and found no major differences between these three aligners in the resulting DE gene lists. The only real difference was mapping speed: STAR was by far the fastest...
Old 04-23-2012, 02:03 PM   #6
adaptivegenome
Super Moderator
 
Location: US

Join Date: Nov 2009
Posts: 437

It is odd that speed is a major point in the paper, given that it presents a new method for genotyping repeats. My question is whether it produces more accurate alignments (which an ROC plot would reveal) and, more importantly, whether it produces more accurate genotypes. It was not clear to me that they tested either in the manuscript.
Old 04-23-2012, 10:52 PM   #7
dvanic
Member
 
Location: Sydney, Australia

Join Date: Jan 2012
Posts: 61

Quote:
i've tested RUM, STAR, and Tophat with a RNAseq data-set, followed by DE analyses, and found no major differences between these three aligners concerning DE gene lists, except mapping speed: STAR was by far the fastest...
How were they with alternative isoform detection?
And what do you mean by "no major differences"?
Old 04-29-2012, 06:41 PM   #8
erlichya
Junior Member
 
Location: Cambridge

Join Date: Apr 2012
Posts: 1
Confusion

Hi Guys,

I think there is some confusion here. The table was generated using the default parameters of the different aligners. The run times of all tools (*including lobSTR*) were determined using a single thread. We also state this in the main text.

So, Zee, to answer your question, we did compare 'apples to apples'. Next time, please make an effort to read the manuscript that you are criticizing.

I will be happy to answer any other questions.

Yaniv
Old 04-30-2012, 01:40 AM   #9
arvid
Senior Member
 
Location: Berlin

Join Date: Jul 2011
Posts: 156

Quote:
Originally Posted by erlichya
Hi Guys,

I think there is some confusion here. The table was generated using the default parameters of different aligners. The run times of all tools (*including lobSTR*) was determined using a single thread. We also said that in the main text.

So, Zee, for your question, we did compare 'apples to apples'. Next time, please try to make an effort to read the manuscript that you are criticizing.

Will be happy to answer any other question.

Yaniv
Hmm, I don't quite agree that comparing the speed, sensitivity and accuracy of aligners using their default settings makes much sense when those default settings differ. Typically the default parameters are optimized for slightly different tasks. This is a typical 'apples with oranges' situation, IMHO. It would make sense to set similar sensitivity settings on all aligners before comparing anything...

Running in a single thread makes sense for a strict algorithm comparison, but it doesn't reflect real usage. Does lobSTR scale well with parallelization?