  • #16
    SOAP3 was developed by one of the best groups working on pattern matching, compressed indexes, etc. It has been published:



    Nonetheless, it has not really addressed the question of accuracy, as it does not talk about paired reads or gapped alignment. I have not tried it myself; I could not find a suitable machine, so I do not know.

    BGI has very good reasons to take on GPU, but I am not sure if GPU computing is ready for general users.



    • #17
      Originally posted by genericforms View Post
      Accuracy is more important than speed. But again, after all these messages, none of us are any closer to knowing the accuracy or speed of this code.
      Accuracy is important, but there are times and places where performance is more important; otherwise everyone would probably be using Novoalign, based on the various ROCs I've seen. Unfortunately, for larger projects where one has to trade off additional data against additional compute resources, very accurate but much slower (for us at least) is not a viable option on hundreds of samples, though it is certainly workable for a few highly important ones.

      Originally posted by ymc View Post
      There is also one GPU variant caller called GSNP.

      Even if this program's ROC is not as high, I think there will still be cases where its speed-up is useful. After all, BWA itself is not running Smith-Waterman either.
      Interesting, I was not aware of that program; it is good to see the beginnings of a full GPU pipeline.

      With regard to performance (ignoring accuracy for the moment), a 7-10x improvement over BWA is nice; however, we can already get a > 5x speed improvement over BWA using standard hardware and a commercial aligner (Real Time Genomics) while maintaining very similar levels of accuracy.

      In that case, an additional 2x speedup that requires expensive specialized hardware (many current cluster nodes won't be able to take the GPUs, so we would need node + card) is less appealing.

      If one already has the hardware, is starting from scratch, or has the budget to replace existing hardware, then, assuming the accuracy is decent, it becomes much more interesting. Or if one has suitable desktop machines and is dealing with few samples, it could be very appealing.



      • #18
        Originally posted by lh3 View Post

        BGI has very good reasons to take on GPU
        And that reason is...?



        • #19
          Originally posted by Geneus View Post
          And that reason is...?
          this?



          • #20
            Novoalign is pretty quick, but it costs money and is not open source, so it is understandable that not everyone wants to use it.

            Anyway, I think most of us agree in general on this debate; we just don't have enough information about speed and accuracy to decide whether it is worth using SOAP3. Hopefully this will change soon.


            Originally posted by aeonsim View Post
            Accuracy is important, but there are times and places where performance is more important. [...]



            • #21
              Originally posted by aeonsim View Post
              Accuracy is important, but there are times and places where performance is more important. [...]
              The new Tesla K20, coming out later this year, should offer about 4.5 TFLOPS for FP32 calculations, which is 3x the current generation. The computation gap between GPU and CPU is widening. I think it is time to at least try out some of the GPU software to prepare for the future.



              • #22
                Originally posted by ymc View Post
                Thanks. I also thought BGI was converting, or had converted, their cluster in SZ to GPUs... at least, that was what I was told.



                • #23
                  @genericforms: Novoalign is free for academic use, but even so, the big sequencing centers seem to have a tradition of using open-source software. I do not know why, but personally I also feel a little more comfortable going with an open-source solution if possible. BTW, I just realized that soap3-dp is soap3 + GPU dynamic programming; it is not soap3.

                  @aeonsim: Do you have the evaluation results of the rtg mapper? I cannot find them on the web. They claim rtg to be 7X faster than BWA while being more sensitive. Firstly, that is sensitivity, not accuracy. Secondly, according to their NA12878 evaluation, rtg actually maps fewer reads than BWA. Thirdly, in the same whitepaper, rtg achieves 20 Gbp per CPU day (= 8471386 * 101 * 24 * 1e-9), while BWA's mapping speed is roughly 7-10 Gbp per CPU day, depending on hardware; that is not 7X faster. Lastly, rtg seems to use more RAM.
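
                  A few lines of Python reproduce that arithmetic (the read count and read length are the whitepaper figures quoted above; the 7-10 Gbp BWA range is from the same paragraph):

                  Code:
                  # Back-of-the-envelope check of the throughput figures quoted above.
                  reads_per_cpu_hour = 8471386   # reads per CPU hour, as implied by the formula above
                  read_length = 101              # read length in bp
                  gbp_per_cpu_day = reads_per_cpu_hour * read_length * 24 * 1e-9
                  print("rtg: %.1f Gbp per CPU day" % gbp_per_cpu_day)   # ~20.5

                  # BWA does roughly 7-10 Gbp per CPU day, so the implied speed-up is:
                  print("speed-up over BWA: %.1fx to %.1fx" % (gbp_per_cpu_day / 10, gbp_per_cpu_day / 7))   # ~2-3x, not 7x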

                  I am not saying rtg is bad - I had some brief communications with one of its developers, who is really capable - but the descriptions on their website read more like marketing, which exaggerates results on data sets that favor their own pipeline (I understand this happens all the time and is necessary).

                  My 2010 review best describes my general opinion on alignment speed: for Illumina short reads, the pre-alignment steps (image analysis and base calling) are actually slower, and the post-alignment steps (duplicate marking, variant calling, etc.) are not much faster than alignment, especially in the case of 1000g-like analyses. Substantially improving alignment speed may not greatly reduce the end-to-end analysis time.



                  • #24
                    Heng,

                    I agree; most people (including me) prefer open source. When talking about Novoalign, I was referring to the paid version, which is multithreaded. The free version runs on a single CPU, so the paid version would be faster (unless you use the free version and simply split the FASTQ file into bits).
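
                    For what it's worth, the splitting trick is easy to script. A minimal sketch (the round-robin scheme, chunk count, and file names are arbitrary illustrations; paired-end files would need to be split identically so mates stay in the same chunk):

                    Code:
                    import itertools, sys

                    def split_fastq(path, n_chunks, prefix="chunk"):
                        """Round-robin a FASTQ into n_chunks files, one 4-line record at a time."""
                        outs = [open("%s_%d.fastq" % (prefix, i), "w") for i in range(n_chunks)]
                        with open(path) as fq:
                            # Each FASTQ record is exactly 4 lines; islice returns [] at EOF.
                            for i, record in enumerate(iter(lambda: list(itertools.islice(fq, 4)), [])):
                                outs[i % n_chunks].writelines(record)
                        for out in outs:
                            out.close()

                    if __name__ == "__main__":
                        split_fastq(sys.argv[1], int(sys.argv[2]))

                    Each chunk can then be fed to its own single-CPU aligner process and the resulting alignments merged afterwards.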

                    I also agree there are other bottlenecks in processing BAM files. I mentioned it in another thread, but we are trying to put together a multithreaded collection of tools (CPU-parallelized, and in the future also GPU-parallelized) for working with HTS data. I do believe that GPUs can be useful in this space.

                    It's completely open source and based on bamtools and samtools. I suppose you are super busy these days, but if you had time to take a look I would love your opinion and suggestions. I am funded to keep developing it for a while, and I would like to make the most of the opportunity and build something truly useful. Anyway, sorry for sort of hijacking this thread.

                    The code is here: adaptivegenome/openge (an accelerated framework for manipulating and interpreting high-throughput sequencing data).



                    Originally posted by lh3 View Post
                    @genericforms: Novoalign is free for academic use, but even so, the big sequencing centers seem to have a tradition of using open-source software. [...]



                    • #25
                      Originally posted by lh3 View Post
                      @aeonsim: Do you have the evaluation results of the rtg mapper? I cannot find them on the web. They claim rtg to be 7X faster than BWA while being more sensitive. [...]
                      We have done a fair bit of independent evaluation using your wgsim app and other metrics. We saw results similar to what you posted for BWA and Novoalign, with rtg being similar to BWA overall but noticeably faster (note this was with a fairly recent version, 2.4 I think). The speed and accuracy were enough to result in the acquisition of a licence, and a bit more (I would need to check how specific I can be with regard to comparisons).
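
                      For anyone wanting to run the same kind of check: wgsim encodes each simulated read's true origin in the read name (chrom_start_end, followed by error counts and an ID), so a short script can sweep a mapping-quality cutoff over a SAM file and count wrongly placed reads. A rough sketch only, not our actual evaluation code; the 50 bp tolerance is an arbitrary choice:

                      Code:
                      import sys

                      TOL = 50  # bp tolerance for calling an alignment "correct" (arbitrary)

                      def wgsim_truth(qname):
                          # wgsim names look like: chr1_12345_12678_2:0:0_1:0:0_a1b2c3
                          chrom, start, end = qname.rsplit("_", 5)[:3]
                          return chrom, int(start), int(end)

                      def evaluate(sam_path, mapq_cutoff):
                          total = mapped = wrong = 0
                          with open(sam_path) as sam:
                              for line in sam:
                                  if line.startswith("@"):
                                      continue                    # header line
                                  f = line.rstrip("\n").split("\t")
                                  qname, flag, rname = f[0], int(f[1]), f[2]
                                  pos, mapq = int(f[3]), int(f[4])
                                  if flag & 0x100:
                                      continue                    # secondary alignment
                                  total += 1
                                  if flag & 0x4 or mapq < mapq_cutoff:
                                      continue                    # unmapped or below cutoff
                                  mapped += 1
                                  chrom, start, end = wgsim_truth(qname)
                                  # read1 should land near "start", read2 near "end"; accept either
                                  if rname != chrom or (abs(pos - start) > TOL and abs(pos - end) > TOL):
                                      wrong += 1
                          return total, mapped, wrong

                      if __name__ == "__main__":
                          for q in (0, 10, 20, 30):
                              t, m, w = evaluate(sys.argv[1], q)
                              print("mapQ>=%d: %d/%d mapped, %d wrong (%.2f%%)" % (q, m, t, w, 100.0 * w / max(m, 1)))

                      Plotting the wrong rate against the mapped fraction across cutoffs gives the ROC-style curves discussed above.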

                      One thing to note with regard to RTG is that they are developing it rapidly, and there have been noticeable improvements in a number of areas (memory, IO, performance, variant accuracy, etc.) in the last 6 months. It's well worth grabbing one of the free licences they now offer and having a go with it if one has time.

                      Our cluster design was also focused on higher-than-average memory per node, for other HPC loads we run aside from NGS, which means it may be well suited to getting maximum benefit out of RTG.

                      Originally posted by lh3 View Post
                      My 2010 review best describes my general opinion on alignment speed: [...] Substantially improving alignment speed may not greatly reduce the end-to-end analysis time.
                      This is exactly what we've seen when evaluating different pipelines (BWA/GATK/samtools/FreeBayes, Novoalign/GATK, RTG, etc.). The majority of the time was eaten up by the post-alignment stages of the full published 1000g GATK pipeline (the pre-alignment steps were handled by Illumina for us), and a lot of that time was due to IO. Some steps seemed to be IO-limited (at least for us), and it was frankly annoying watching the BAM files be read, modified, and written out to disk numerous times (duplicate marking, indel realignment, recalibration, variant calling, etc.). While the 1000g GATK pipeline is nice, it seems like a number of steps should or could have been merged into one to reduce the IO, use multithreading, and noticeably speed up the full pipeline.

                      That is one of the advantages we see with the full rtg pipeline (their variant calling is fast as well) and other pipelines that pipe the data through multiple processing stages without writing to disk, e.g. FreeBayes with its bam merge | indel realignment | samtools BAQ | variant calling.
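
                      As a concrete illustration of the pipe-everything idea, stages can be chained through OS pipes so that no intermediate BAM ever touches disk. This is only a sketch: the tools and flags below follow recent bwa/samtools conventions and merely stand in for whatever stages a real pipeline (rtg, FreeBayes, etc.) would chain:

                      Code:
                      import subprocess

                      # align -> fixmate -> sort -> mark duplicates, streamed through pipes.
                      align = subprocess.Popen(
                          ["bwa", "mem", "ref.fa", "reads_1.fq", "reads_2.fq"],
                          stdout=subprocess.PIPE)
                      fixmate = subprocess.Popen(
                          ["samtools", "fixmate", "-m", "-", "-"],
                          stdin=align.stdout, stdout=subprocess.PIPE)
                      sort = subprocess.Popen(
                          ["samtools", "sort", "-"],              # writes BAM to stdout
                          stdin=fixmate.stdout, stdout=subprocess.PIPE)
                      markdup = subprocess.Popen(
                          ["samtools", "markdup", "-", "dedup.bam"],
                          stdin=sort.stdout)
                      # Close our copies of the pipe ends so SIGPIPE propagates on early exit.
                      for p in (align, fixmate, sort):
                          p.stdout.close()
                      markdup.wait()

                      The same chain is of course a one-liner in the shell; the point is simply that every stage that rewrites the BAM on disk adds a full read-modify-write cycle unless it is piped.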

                      So I'd certainly agree that, at the end of the day, depending on your exact pipeline, additional speedups in the alignment stage may not provide as significant an improvement in overall FASTQ -> variants -> filtered-variants time as one might think. Still, it's nice to have if you can get it, and that's no reason not to try to boost performance at every stage you can. So with that in mind, soap3-dp is worth keeping an eye on.
                      Last edited by aeonsim; 06-09-2012, 05:26 PM.



                      • #26
                        So RTG is free to use? I assume this is temporary? Do you have an idea about the pricing going forward? Just curious, as it seems lots of companies have cropped up in this space...

                        Originally posted by aeonsim View Post
                        We have done a fair bit of independent evaluation using your wgsim app and other metrics. [...]



                        • #27
                          Originally posted by genericforms View Post
                          So RTG is free to use? I assume this is temporary? Do you have an idea about the pricing going forward? [...]
                          It is commercial, though there was, at least until recently, a free version with a few limitations. I can't find the link on their website any more, so you'd have to contact them for details; it may well still be available on request. Price-wise I don't know; those who handled that side were happy that it was decent value.

                          At the end of it all, they have a fast (it reduced our estimated processing time from months to weeks), fully fledged, and cleanly implemented pipeline, with some very interesting tools around pedigree and population structure that I've not seen any equivalents to elsewhere (and I've been looking). After a decent amount of evaluation we licensed it and are very happy with it.

                          Anyway, that's probably enough about it in this thread; if you have any other questions feel free to message me.
                          Last edited by aeonsim; 06-09-2012, 07:52 PM.



                          • #28
                            I'm late to this thread, but here are my 2 cents.

                            1. The BWT was published in 1994, but the FM-index was not published until 2004. This is important, as BWT-based aligners should really be called FM-index-based aligners (see the backward-search sketch at the end of this post).

                            2. Accuracy and sensitivity are key; % of reads mapped and speed are less important, especially for clinical use. The two analyses I prefer are ROC curves from simulated data, and variant-calling ROC curves on pseudo-gold-standard datasets (e.g. NA12878).

                            3. I have been looking into GPUs a lot lately. My preliminary assessment is that there are two main parts to mapping: seeding and extension/SW. Seeding is slow due to the random memory access of the FM-index (or another suitable index); the GPU does not help here. The SW is quite slow (for some software), but SIMD/vectorized C code is very fast, and one GPU (Tesla) is worth about 8-12 cores, without even counting the overhead of a hybrid solution. Then it all comes down to Amdahl's law as to whether using a GPU for SW will make things significantly faster at good value (see the calculation at the end of this post). I completely disagree that coding for the GPU is similar in complexity; I've seen in a number of projects that this is not true. Furthermore, whole algorithms need to be redesigned to work on the GPU (e.g. BWA's search needs to run in DFS instead of BFS).

                            4. I can't believe we are still having this indel/no-indel debate.
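
                            To make point 1 concrete: the BWT on its own is just a reversible permutation of the text; it is the FM-index machinery on top of it (the C array plus rank/occ counts) that turns it into an index supporting backward search. A toy pure-Python sketch with naive suffix sorting, illustrative only:

                            Code:
                            def bwt(s):
                                s += "$"                                         # unique smallest terminator
                                sa = sorted(range(len(s)), key=lambda i: s[i:])  # naive suffix array
                                return "".join(s[i - 1] for i in sa), sa

                            def fm_index(text):
                                L, sa = bwt(text)
                                chars = sorted(set(L))
                                C, total = {}, 0
                                for c in chars:                  # C[c] = count of characters smaller than c
                                    C[c] = total
                                    total += L.count(c)
                                occ = {c: [0] * (len(L) + 1) for c in chars}
                                for i, ch in enumerate(L):       # occ[c][i] = occurrences of c in L[:i]
                                    for c in chars:
                                        occ[c][i + 1] = occ[c][i] + (ch == c)
                                return L, sa, C, occ

                            def backward_search(pattern, C, occ, n):
                                lo, hi = 0, n                    # current suffix-array interval [lo, hi)
                                for c in reversed(pattern):      # extend the match right-to-left
                                    if c not in C:
                                        return 0, 0
                                    lo, hi = C[c] + occ[c][lo], C[c] + occ[c][hi]
                                    if lo >= hi:
                                        return 0, 0
                                return lo, hi

                            L, sa, C, occ = fm_index("GATTACAGATTACA")
                            lo, hi = backward_search("ATTA", C, occ, len(L))
                            print("%d hits at positions %s" % (hi - lo, sorted(sa[i] for i in range(lo, hi))))

                            Each pattern character costs two rank (occ) lookups, which is exactly the random memory access that point 3 blames for slow seeding.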
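
                            And to put rough numbers on the Amdahl's-law argument in point 3 (the SW time fractions and speed-ups below are illustrative assumptions, not measurements):

                            Code:
                            def amdahl(p, s):
                                """Overall speedup when a fraction p of the runtime is accelerated s-fold."""
                                return 1.0 / ((1.0 - p) + p / s)

                            for p in (0.2, 0.4, 0.6):                # fraction of mapping time spent in SW
                                for s in (8.0, 12.0, float("inf")):  # GPU-vs-one-core speedup for the SW part
                                    print("SW fraction %.0f%%, SW speedup %gx -> overall %.2fx" % (100 * p, s, amdahl(p, s)))

                            Even if SW were 60% of the runtime and became infinitely fast, the end-to-end speedup would still be capped at 2.5x.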



                            • #29
                              Item 4... Lol



                              • #30
                                Just wondering: do the current Illumina machines use GPUs for image processing and base calling?
