  • Bowtie2 chokes on -a flag?

    I am working on a project where I need to get ALL hits for each read, defined fairly stringently. I tried bowtie2 with a command like this:

    bowtie2 --threads 20 --reorder --score-min L,-0.5,-0.2 -a -x trdb -U R15.fq -S 15_tr.sam

    My reads are 100 bp long, so the match parameters here are fairly stringent. I expected that bowtie2 might take a while but would complete the job. Without the '-a' flag the job completed in about 30 mins. But with '-a', I waited nearly 3 days and it still had not finished.

    Judging from the SAM file, bowtie2 completed the alignments for about 70K of the reads (in ~10 mins) and then kept spinning with no further writes to the SAM file.

    I know the bowtie2 manual says it is not optimized for the -a flag. But this looks much worse than unoptimized. It's unusable. Does anyone have experience with this?
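For context on how stringent this setting is: the bowtie2 manual describes `--score-min L,a,b` as the linear function f(x) = a + b * x of the read length x, so for 100 bp reads `L,-0.5,-0.2` works out to a minimum end-to-end score of -20.5. A minimal sketch of that arithmetic is below. The function names are made up for illustration, and the mismatch estimate assumes bowtie2's default maximum mismatch penalty of 6 and ignores gaps, Ns, and quality-scaled penalties.

```python
import math


def min_score_linear(a, b, read_len):
    """Minimum alignment score for a linear --score-min function L,a,b."""
    return a + b * read_len


def max_full_penalty_mismatches(min_score, mismatch_penalty=6):
    """Rough upper bound on mismatches in end-to-end mode.

    Assumes a perfect alignment scores 0 and every mismatch costs the
    full penalty (6 is assumed here as the default maximum).
    """
    return math.floor(-min_score / mismatch_penalty)


threshold = min_score_linear(-0.5, -0.2, 100)
print("minimum score:", threshold)
print("approx. mismatches allowed:", max_full_penalty_mismatches(threshold))
```

Under these assumptions only a handful of mismatches fit inside the threshold for a 100 bp read.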

    Thanks,
    Gulu
    Kamalakar Gulukota,
    Director,
    Center for Bioinformatics and Computational Biology
    NorthShore University Health System, [email protected]

  • #2
    Yep, same results from us. The problem is that bowtie2 handles insertions, mismatches, and (in local mode) read clipping. That's a lot of error types, and they take much longer to account for.

    What you may be able to try to speed things up is to get bowtie2 to dump all the multiple-mapped reads to another file (e.g. with '-k 2'), and only do the '-a' on those reads.
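One way to find the multiple-mapped reads after a capped run is to count alignment records per read name in the SAM output. The sketch below is only an illustration, not a bowtie2 feature: the function names are hypothetical, it assumes unpaired reads, and it relies on the SAM conventions that header lines start with '@' and that FLAG bit 0x4 marks an unaligned read.

```python
from collections import Counter


def count_hits(sam_lines):
    """Count aligned SAM records per read name (QNAME)."""
    hits = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & 0x4:
            continue  # read did not align
        hits[qname] += 1
    return hits


def reads_with_at_least(sam_lines, k):
    """Names of reads that have k or more reported alignments."""
    return {name for name, n in count_hits(sam_lines).items() if n >= k}
```

With a '-k 2' run, `reads_with_at_least(open("k2.sam"), 2)` would give the names of reads that hit more than one location (the file name here is a placeholder).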

    • #3
      Originally posted by gringer
      What you may be able to try to speed things up is to get bowtie2 to dump all the multiple-mapped reads to another file (e.g. with '-k 2'), and only do the '-a' on those reads.
      Thanks, gringer! I will try that.

      • #4
        An update:
        Yes, bowtie2 does have a big issue with the '-a' flag. I ran bowtie2 on about 8.8 million reads. Following gringer's advice, I first ran it with a generous '-k 50' option, i.e.:

        bowtie2 --score-min L,-0.5,-0.2 -k 50 -x trdb -U rd.fq -S k50.sam

        This finished in 20 mins or less. I found that 6,594 of the reads had 50 hits, i.e. they reached the '-k 50' cap. Next, I created a new FASTQ file with just these reads ("The50s.fq") and re-ran bowtie2 with the -a flag:

        bowtie2 --score-min L,-0.5,-0.2 -a -x trdb -U The50s.fq -S 50s_tr.sam

        It's been running for over 2 hours with no results being output. Overall, beware of the '-a' flag in bowtie2.
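For anyone wanting to reproduce the FASTQ subsetting step, here is a rough sketch. It is not the script used above: `subset_fastq` is a hypothetical helper, it assumes plain four-line FASTQ records, and it assumes the read name is the header text up to the first whitespace.

```python
def subset_fastq(fastq_lines, keep_names):
    """Yield the lines of four-line FASTQ records whose name is in keep_names."""
    record = []
    for line in fastq_lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            # Header looks like "@name optional description"
            parts = record[0][1:].split()
            name = parts[0] if parts else ""
            if name in keep_names:
                yield from record
            record = []
```

Usage would be along the lines of writing `"\n".join(subset_fastq(open("rd.fq"), names))` to a new file, where `names` is the set of read names that reached the cap.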

        Now, the 6,594 sequences do appear a bit repetitive - I'll strengthen my filtering upstream. So it's understandable why bowtie2 is choking. Still, it should be possible to put in some defenses against this flailing, right? So, if anyone active in bowtie2 development sees this, I have a request:

        Please have bowtie2 search up to a Max_K parameter and come back more quickly with a message like "6,594 sequences had more than Max_K (1000) hits each - they are being ignored. See filtered.fastq for these sequences".
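Until something like this exists in bowtie2 itself, the behaviour can be approximated after the fact. The sketch below is a hypothetical post-processing step, not a bowtie2 option: it assumes you already have a mapping from read name to number of reported hits, taken from a run with '-k' set to Max_K + 1 so that reads exceeding Max_K can be told apart from reads with exactly Max_K hits.

```python
def apply_max_k(hit_counts, max_k):
    """Split reads into kept and ignored sets based on a Max_K cutoff.

    hit_counts: dict mapping read name -> number of reported alignments.
    Returns (kept, ignored, message).
    """
    kept = {name for name, n in hit_counts.items() if n <= max_k}
    ignored = set(hit_counts) - kept
    message = (
        f"{len(ignored):,} sequences had more than Max_K ({max_k}) hits each"
        " - they are being ignored."
    )
    return kept, ignored, message


kept, ignored, message = apply_max_k({"a": 3, "b": 1001, "c": 1000}, max_k=1000)
print(message)
```

The ignored names could then be written to a separate FASTQ file for inspection.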
