  • Bismark v0.6.beta1: Now supporting gapped Bisulfite-Seq alignments

    We would like to announce that Bismark has received a major overhaul. While the default alignment behaviour of Bismark (using Bowtie 1) has not changed very much (see below), Bismark now also supports gapped alignments using Bowtie 2. In all tests we have performed so far (single- or paired-end, directional or non-directional, with various simulated methylation levels or real-life datasets), the Bismark results for Bowtie 1 and Bowtie 2 are very concordant.

    However, as Bowtie 2 is still in beta and subject to change, the current release of Bismark must also be considered a beta version (0.6.beta1).
    Here is an overview of the most prominent changes:

    Running Bismark with Bowtie 1 (default)

    - Default output changed to SAM format

    - The ‘old’ output format is still available via the option ‘--vanilla’

    - Alignment processes were slightly modified to run in --norc/--nofw mode where appropriate, which may result in slightly increased mapping efficiency

    - The former option ‘--directional’ is now the new default mode (‘--non_directional’ will report alignments to all four strands)

    - The default paired-end maximum insert size ('-X') was increased to 500bp (up from 250bp)


    Running Bismark with Bowtie 2 (optional)

    - Alignments are performed in end-to-end mode (similar to Bowtie 1), but do allow gapped alignments with insertions and/or deletions

    - Output format is SAM

    - Since Bowtie 2 requires different indexes for alignments, the Bismark genome preparation now also supports Bowtie 2 bisulfite indexing of a reference genome


    I would like to stress that we don't think of Bowtie 2 as simply a replacement for Bowtie 1 in Bismark alignments. Rather, as is also stated on its project page, Bowtie 2 is supposed to work more efficiently for longer reads and allows gapped alignments. For shorter and/or indel-free reads Bowtie 1 may well be faster and more accurate, which is why Bowtie 1 will remain the default alignment mode for Bismark. Indeed, in some of the tests I have run so far, Bowtie 1 seemed to have a speed advantage.


    While Bismark seems to work fine in all alignment modes, its methylation_extractor currently works only on the old Bowtie 1 ('--vanilla') output and not yet on SAM output files (I am going to work on this in the next couple of days/weeks). This is another reason for calling the current Bismark version 0.6.beta1.

    Compared to Bowtie 1, Bowtie 2 has many ‘new’ parameters, of which the following are currently adjustable:

    -M <int> (reporting the best out of N valid alignments)
    -N <int> (multi-seed mismatches)
    -L <int> (seed length)
    -D <int> (maximum number of seed extension fail tries)
    -R <int> (reseeding of repetitive alignments)
    --score-min <func> (setting minimum alignment score for valid alignments)
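
The adjustable options listed above can be collected into an argument list before being passed on to an alignment run. A minimal Python sketch, assuming a hypothetical helper `bowtie2_tuning_args`; only the flag names come from the list above, and the example values are placeholders rather than recommended settings:

```python
def bowtie2_tuning_args(M=None, N=None, L=None, D=None, R=None, score_min=None):
    """Return command-line arguments for the options that were set.

    Hypothetical helper: only the flag names are taken from the post;
    values are passed through without validation.
    """
    args = []
    # Integer-valued options, emitted in the order they are listed in the post.
    for flag, value in (("-M", M), ("-N", N), ("-L", L), ("-D", D), ("-R", R)):
        if value is not None:
            args += [flag, str(int(value))]
    # --score-min takes a function specification, passed through as a string.
    if score_min is not None:
        args += ["--score-min", str(score_min)]
    return args


# Placeholder values, for illustration only.
example = bowtie2_tuning_args(N=1, L=20, score_min="L,0,-0.2")
print(example)
```

Options left unset are simply omitted, so the aligner's own defaults apply to them.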

    We are still in the process of determining the most sensible set of parameters to generate unique 'best' alignments in a reasonable time (increasing some of the parameters above might make Bismark run dog slow...). I would very much appreciate any comments or input in this regard (and of course also bug reports...).

    All files are available from the Bismark project page.

    Thanks,
    Felix

  • #2
    We have just added a parallelization option for Bowtie 2 alignments (-p NTHREADS). This option became feasible because the latest Bowtie 2 release (Version 2.0.0-beta5 - December 15, 2011) added the option --reorder, which reports alignments in the same order as the reads were read in, even if multiple threads are used for alignment.

    This option should help to speed up Bismark alignments; however, as a word of caution, it also requires considerably more system resources. E.g. specifying -p 3 will use 4*3 = 12 threads/cores for alignments as well as 1 thread for Bismark itself, and use > 15GB of memory for a human genome.
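
The thread count in the example above follows from several Bowtie 2 instances running at once, each of which gets `-p` threads, plus one thread for Bismark. A quick Python sketch of that arithmetic, assuming four instances as in the `-p 3` example (the function and parameter names are made up for illustration):

```python
def total_threads(p, bowtie2_instances=4, bismark_threads=1):
    """Threads in use when every Bowtie 2 instance runs with -p <p>."""
    return bowtie2_instances * p + bismark_threads


alignment_threads = 4 * 3          # 12 threads/cores for the alignments
all_threads = total_threads(3)     # plus 1 thread for Bismark itself
print(alignment_threads, all_threads)
```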

    The use of Bowtie 2 for Bismark alignments is still experimental and I would appreciate any input or feedback!

    Bismark v0.6.beta2 is available from the Bismark project page.



    • #3
      Originally posted by fkrueger
      While Bismark seems to work fine in all alignments modes, its methylation_extractor works currently only on the old Bowtie 1 (‘--vanilla’) output and not yet on SAM output files (I am going to work on this in the next couple of days/weeks). This is another reason for calling the current Bismark version 0.6.beta1.
      I aligned my BS reads using v0.6.beta1 and generated an output SAM file. I am now trying to run the methylation extractor on that file and I am getting an error stating:

      The methylation extractor and Bismark itself need to be of the same version!

      Versions used: methylation extractor: ' v0.6.beta1 '
      Bismark: ' @HD VN:1.0 SO:unsorted '

      I am wondering whether what you wrote in the post quoted above is relevant to my issue. Also, if I upgrade to the most recent version of Bismark, will I have a problem because the alignment was done with another version?



      • #4
        If you used Bismark to generate SAM output you need to run a more recent version of the methylation_extractor, which now uses SAM as its default input format (as of version 0.6.3).
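
The error in #3 is consistent with the older extractor treating the first line of its input as a version string and finding the SAM header line `@HD VN:1.0 SO:unsorted` there instead. That is an inference from the error message, not a description of the extractor's code. A minimal Python sketch for checking up front whether a file's first line is a SAM header line (the helper name is hypothetical, and the non-SAM example line is an assumed illustration of what the old format might start with; the header record types are those defined by the SAM format):

```python
# Header record types defined by the SAM format.
SAM_HEADER_TAGS = ("@HD", "@SQ", "@RG", "@PG", "@CO")


def first_line_is_sam_header(line):
    """True if the line starts with one of the SAM header record types."""
    return line.startswith(SAM_HEADER_TAGS)


sam_line = "@HD\tVN:1.0\tSO:unsorted"
other_line = "Bismark version: v0.6.beta1"  # assumed, for illustration only

print(first_line_is_sam_header(sam_line))
print(first_line_is_sam_header(other_line))
```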

        In any case I would recommend downloading the latest version (v0.7.2) and rerunning your alignments since several things have changed since version 0.6.1.

        Best,
        Felix



        • #5
          Is it really necessary to rerun my alignments? It took over a week the last time because I have 7 lanes of 100bp HiSeq data.
          Last edited by shawpa; 03-19-2012, 05:00 AM.



          • #6
            The alignment and methylation information should still be the same, but there were several changes that might positively affect the outcome of your alignments, such as:

            - Changed Bismark's behavior for "--directional" mode (default) to run only 2 parallel instances of Bowtie 1/2 against the original top (OT) and bottom (OB) strands, instead of 4 instances against all possible bisulfite strands. This change might result in somewhat faster alignments and higher mapping efficiency. It is still possible to run the 4-strand alignment mode for any combination of input file(s) and choice of aligner by specifying --non_directional.
            - Sequences in FastA format now receive Phred score qualities of 40 throughout (ASCII 'I') to prevent the SAM to BAM conversion in SAMtools from failing
            - If a genomic sequence could not be extracted it will now also be counted and reported for use with Bowtie 1
            - Changed the XX:Z mismatch field in the SAM output to display mismatching nucleotides of the reference sequence (instead of the read sequence ones)
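
As a small worked example of the FastA quality change listed above: Phred 40 corresponding to ASCII 'I' implies the usual offset of 33 (40 + 33 = 73, the code point of 'I'). A Python sketch, with a hypothetical helper name and a made-up read sequence:

```python
def phred_to_ascii(q, offset=33):
    """Encode a Phred quality score as a single ASCII character."""
    return chr(q + offset)


read = "ACGTACGT"  # made-up FastA sequence, for illustration only
# One quality character per base, all set to Phred 40.
quality_string = phred_to_ascii(40) * len(read)
print(quality_string)
```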

            Since Bismark now runs only 2 alignment instances instead of 4 for directional alignments, you should not only see an increase in mapping efficiency, but it should also be quite a bit quicker than running it with 4-strand mapping (I did several lanes of 100bp SE HiSeq mapping with ~240M sequences overnight on a single instance). You may check the change log on the Bismark page to see if there is anything of relevance for you.



            • #7
              Thanks for your advice. I'll go ahead and download the new one and try again.
