
  • #16
    I am sorry I did not provide the results immediately. All comparisons were done under default parameter settings.

    I have compared Bioscope and BFAST. The mappings seem rather consistent. The fraction of mappable reads is very similar, and Bioscope slightly outperforms BFAST on my test data set. The final RPKM values appear highly correlated. I have not computed the correlation, but I would guess it is >0.95.

    BFAST runs slower than Bioscope. But I think the good thing about BFAST is that if you design more masks for it, it will map more reads.

    I have also used Bowtie. It's fast (amazingly so), but it maps far fewer reads: BFAST and Bioscope can map 60% of the reads, while Bowtie maps 30%.
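
For anyone who wants to replace the ">0.95" guess with a number, here is a minimal Python sketch of the RPKM calculation and a Pearson correlation between two aligners. The gene names, lengths, and read counts are made-up placeholders, not results from this thread; real counts would come from your own per-gene counting step.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: gene -> (length in bp, reads from aligner A, reads from aligner B)
counts = {
    "geneA": (2000, 400, 380),
    "geneB": (1500, 90, 110),
    "geneC": (3000, 1200, 1150),
    "geneD": (800, 15, 20),
}

total_a = sum(c[1] for c in counts.values())
total_b = sum(c[2] for c in counts.values())

rpkm_a = [rpkm(c[1], c[0], total_a) for c in counts.values()]
rpkm_b = [rpkm(c[2], c[0], total_b) for c in counts.values()]

print("Pearson r between aligners:", pearson(rpkm_a, rpkm_b))
```

Highly expressed genes tend to dominate a correlation on raw RPKM, so comparing log-transformed values (for example log2(RPKM + 1)) is a common alternative.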



    Originally posted by jlli View Post
    We just got the Bioscope, it would be great if you can share the results of bfast vs. bioscope.



    • #17
      You also have to take the mapping length into consideration. Bioscope will clip the reads heavily to get a mapping done and thereby get a higher number of mapped reads. One of the reasons they do this is their way of handling indels, which are handled by a second (optional?) step of the pipeline that tries to map these reads at full length.

      But I agree with Niels: what really matters is that you can reliably identify variation from the reference genome. Overly ambiguous mappings will make this a lot harder, and that is where playing with parameters comes in.



      • #18
        Bioscope ungapped alignment

        Originally posted by Brugger View Post
        You also have to take the mapping length into consideration. Bioscope will clip the reads heavily to get a mapping done and thereby get a higher number of mapped reads. One of the reasons they do this is their way of handling indels, which are handled by a second (optional?) step of the pipeline that tries to map these reads at full length.

        But I agree with Niels: what really matters is that you can reliably identify variation from the reference genome. Overly ambiguous mappings will make this a lot harder, and that is where playing with parameters comes in.
        Thanks for the info. Is there any official source mentioning these not-so-nice features of Bioscope? Without looking into the BAM file produced by Bioscope you'd never suspect it, but indeed, I see a lot of hard clipping. I also learned that Bioscope does ungapped alignment: there are no indels in my samtools pileup. I should mention that I work on RNA-Seq, so I used the WT pipeline. SNP calling is not part of it, so I used samtools pileup for convenience. (ABI's diBayes SNP caller is next on my todo list, along with running BFAST.) Since selective constraints mean you expect fewer indels in exons than in noncoding regions, neglecting indels has no deleterious consequences if you only want to quantify transcript abundance. Investigating allelic expression is a different story: here I really need reliable SNPs and therefore indels. Does that "second step of the pipeline" you're referring to do something like local realignment?
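
To check how much clipping and how many indels an aligner actually produced, one option is to summarize the CIGAR strings from the BAM file. The sketch below is plain Python and assumes only the standard SAM CIGAR operators; the function names are illustrative and the example strings are invented, not taken from Bioscope output.

```python
import re

# One CIGAR element: a length followed by a SAM operator character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_summary(cigar):
    """Return total bases per CIGAR operator, e.g. {'M': 35, 'H': 15}."""
    totals = {}
    for length, op in CIGAR_OP.findall(cigar):
        totals[op] = totals.get(op, 0) + int(length)
    return totals


def clipped_fraction(cigar):
    """Fraction of the original read removed by hard (H) or soft (S) clipping."""
    totals = cigar_summary(cigar)
    clipped = totals.get("H", 0) + totals.get("S", 0)
    # Operators that consume read bases, plus hard clips, give the original length.
    read_length = clipped + sum(totals.get(op, 0) for op in "MI=X")
    return clipped / read_length if read_length else 0.0


def has_indel(cigar):
    """True if the alignment contains an insertion or a deletion."""
    totals = cigar_summary(cigar)
    return totals.get("I", 0) > 0 or totals.get("D", 0) > 0


# Invented examples: a heavily clipped ungapped read and a full-length gapped read.
for cigar in ["35M15H", "20M2D28M"]:
    print(cigar, clipped_fraction(cigar), has_indel(cigar))
```

The CIGAR string is column 6 of `samtools view` output, so the strings can be fed to these functions from there.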



        • #19
          There is an output file, "alignmentReport.txt" that has stats and a table of what % of reads were mapped at what read length. That will tell you how clipped your reads needed to be for them to be mapped.

          --
          Phillip



          • #20
            Originally posted by epigen View Post
            Thanks for the info. Is there any official source mentioning these not-so-nice features of Bioscope? Without looking into the BAM file produced by Bioscope you'd never suspect it, but indeed, I see a lot of hard clipping. I also learned that Bioscope does ungapped alignment: there are no indels in my samtools pileup. I should mention that I work on RNA-Seq, so I used the WT pipeline. SNP calling is not part of it, so I used samtools pileup for convenience. (ABI's diBayes SNP caller is next on my todo list, along with running BFAST.) Since selective constraints mean you expect fewer indels in exons than in noncoding regions, neglecting indels has no deleterious consequences if you only want to quantify transcript abundance. Investigating allelic expression is a different story: here I really need reliable SNPs and therefore indels. Does that "second step of the pipeline" you're referring to do something like local realignment?

            The second step of the pipeline he's referring to, I think, is related to fragment/mate pair mapping where 'evidences' gathered during the mapping process are used to try more localized gapped alignments to determine genomic indels. Essentially, Bioscope decouples the methods of aligning and indel finding. Currently, with Bioscope, I don't know of a way to find indels on RNA-seq data.

            Also, I've tried using diBayes (Bioscope 1.2.1) on some RNA-seq data and it's failing by timing out. I think doing this analysis with diBayes is on their roadmap; it's not supported at the moment, and my initial crack at it failed.
            Last edited by thaley; 07-14-2010, 09:53 AM.



            • #21
              Originally posted by thaley View Post
              Also, I've tried using diBayes (Bioscope 1.2.1) on some RNA-seq data and it's failing by timing out. I think doing this analysis with diBayes is on their roadmap; it's not supported at the moment, and my initial crack at it failed.
              Rick got that to work on some RNA-seq data here at Purdue. Although the "failing by timing out" thing is a not uncommon result when running Bioscope.

              Bioscope 1.2.1 does remove the "no adjacent SNPs called" behavior that cropped up in 1.2. Although I think, for reasons I have yet to comprehend, calling adjacent SNPs is still turned off by default.

              Well, I'm oversimplifying a bit. Ostensibly the "no adjacent SNPs called" bug in 1.2 was the result of the default setting of:

              het.skip.high.coverage

              "# Parameter specifies not to call SNPs when the coverage of
              position is too high comparing to the median of the coverage
              distribution of all positions."

              That is, there was some conflation of SNP calling with heterozygote calling. With 1.2, though, diBayes lost its ability to call any adjacent SNPs, even if they were homozygous at the SNP positions. 1.2.1 fixed that, although it might be necessary to set

              het.skip.high.coverage=1

              to see adjacent heterozygous SNPs.

              --
              Phillip



              • #22
                SNPs in RNA-Seq

                Has anyone tried to run diBayes on BAM files generated by other aligners or, conversely, used samtools pileup on BioScope output? I've done the latter and want to do a comparison, if I get diBayes to run.



                • #23
                  I have the same question as originally posted, but it's been about a year (and I think at least NovoalignCS has had some significant updates) - has anyone done any comparison between aligners recently?

                  I'm particularly interested in NovoalignCS vs BFAST (our collaborators are using BFAST, but we were advised by another group to use NovoalignCS on the same data... all of us are very new to working with NGS data), but any recent comparisons between any of these programs would be helpful.

                  Thanks!



                  • #24
                    Recent comparison of SOLiD aligners

                    Next generation sequencing has lower sequence coverage and poorer
                    SNP-detection capability in the regulatory regions

                    Weixin Wang, Zhi Wei, Tak-Wah Lam & Junwen Wang

                    Nature SCIENTIFIC REPORTS | 1 : 55 | DOI: 10.1038/srep00055

                    I actually use the Bioscope aligner and bowtie; the latter is fast and reliable but needs some tricks to parse the output correctly. I have also made extensive use of older versions of SHRiMP for miRNA work in color space.

                    HTH

                    Alessandro



                    • #25
                      Originally posted by yasashiku View Post
                      I have the same question as originally posted, but it's been about a year (and I think at least NovoalignCS has had some significant updates) - has anyone done any comparison between aligners recently?

                      I'm particularly interested in NovoalignCS vs BFAST (our collaborators are using BFAST, but we were advised by another group to use NovoalignCS on the same data... all of us are very new to working with NGS data), but any recent comparisons between any of these programs would be helpful.

                      Thanks!
                      Is it really that long ago already? I haven't been around the forum for some months now due to a very high workload. We're struggling with 50+35 bp PE tumor + matched control samples. I didn't test NovoalignCS any further because it's commercial. You may find it interesting that BFAST does align more reads than Bioscope, but there are many artefactual alignments, especially ones with far too many gaps, and this results in a lot of false-positive SNPs. Local realignment might help but is too much effort for the amount of data we have. Not to mention the BFAST runtime: 8 days for one slide, with the reads split into 10-12 chunks of 50 million per end. That's not really worth it.
                      Concerning SNP calling, we did a comparison to the SNP array data for one sample. The Bioscope alignments performed best when called with mpileup using the -AB options; that actually works better than diBayes! BWA (with the patch for PE data) has high specificity but aligns less than half of the reads, so its sensitivity is very low. We're now waiting for our first 5500 PE run to see how LifeScope performs.
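
For readers who want to run the same kind of check, here is a rough Python sketch of scoring a set of SNP calls against array genotypes. It assumes the array data have been reduced to a mapping from site to "variant or not" and the calls to a set of sites; the function name, data layout, and example sites are all hypothetical, and this is not the comparison procedure actually used above.

```python
def evaluate_calls(array_genotypes, called_sites):
    """Score SNP calls against array genotypes, at array sites only.

    array_genotypes: dict mapping (chrom, pos) -> True if the array reports
                     a non-reference genotype at that site, else False.
    called_sites:    set of (chrom, pos) where the SNP caller reported a SNP.
    """
    tp = fp = tn = fn = 0
    for site, is_variant in array_genotypes.items():
        called = site in called_sites
        if is_variant and called:
            tp += 1
        elif is_variant:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "sensitivity": sensitivity, "specificity": specificity}


# Invented example: four array sites, three calls (one outside the array).
array_genotypes = {
    ("chr1", 100): True,
    ("chr1", 200): True,
    ("chr1", 300): False,
    ("chr2", 50): False,
}
called_sites = {("chr1", 100), ("chr1", 300), ("chr3", 5)}

print(evaluate_calls(array_genotypes, called_sites))
```

Calls at positions the array does not cover are ignored here, since the array cannot confirm or refute them. An aligner that maps few reads leaves many array sites without a call, which shows up as false negatives and therefore low sensitivity.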



                      • #26
                        Thanks for the response - your comparisons are helping us a lot. We're working on a pretty small dataset at the moment, but it should get heftier down the road - I'll try novoalignCS and see how it does versus our collaborators' results. Keep us posted on how LifeScope does.



                        • #27
                          A lot of people have had a lot of success with novoalignCS; I would recommend it as a great alternative to the tools you have tried.

