
  • Bowtie-build index file generation, no *.3.ebwt, *.4.ebwt

    Hi all,
    I am working on aligning a set of paired-end reads with bowtie. I am trying to generate my reference index using:

    /data/apps/bowtie/./bowtie-build -r pombe_fasta/chromosome1.fasta,pombe_fasta/chromosome2.fasta,pombe_fasta/chromosome3.fasta pombe_indexes/

    It generates pombe.1.ebwt, pombe.2.ebwt, pombe.1.rev.ebwt, pombe.2.rev.ebwt

    When that failed to generate the files for paired-end alignment, I tried the -3 option, which also didn't generate them.

    I was wondering what I can do to generate the two files needed for paired-end alignments, or if anyone has pre-built indexes for pombe.

    I am utilizing:
    /data/apps/bowtie/bowtie-build version 0.11.3
    64-bit
    Built on privet.umiacs.umd.edu
    Mon Oct 12 18:08:44 EDT 2009
    Compiler: gcc version 3.4.6 20060404 (Red Hat 3.4.6-10)
    Options: -O3 -m64
    Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
    Last edited by Bardj; 12-18-2009, 10:01 AM.

  • #2
    The -r option suppresses those files. Not specifying -r should fix it.

    Thanks,
    Ben



    • #3
      Thanks for the quick reply Ben.

      I ran it again without the -r option:

      /data/apps/bowtie/./bowtie-build pombe_fasta/chromosome1.fasta,pombe_fasta/chromosome2.fasta,pombe_fasta/chromosome3.fasta pombe_indexes/

      And the files still aren't being generated. I tried just having the -3 option, and this is the output:

      /data/apps/bowtie/./bowtie-build -3 pombe_fasta/chromosome1.fasta,pombe_fasta/chromosome2.fasta,pombe_fasta/chromosome3.fasta pombe_indexes/
      Settings:
      Output files: "pombe_indexes/.*.ebwt"
      Line rate: 6 (line is 64 bytes)
      Lines per side: 1 (side is 64 bytes)
      Offset rate: 5 (one in 32)
      FTable chars: 10
      Strings: unpacked
      Max bucket size: default
      Max bucket size, sqrt multiplier: default
      Max bucket size, len divisor: 4
      Difference-cover sample period: 1024
      Reference base cutoff: none
      Endianness: little
      Actual local endianness: little
      Sanity checking: disabled
      Assertions: disabled
      Random seed: 0
      Sizeofs: void*:8, int:4, long:8, size_t:8
      Input files DNA, FASTA:
      pombe_fasta/chromosome1.fasta
      pombe_fasta/chromosome2.fasta
      pombe_fasta/chromosome3.fasta
      Reading reference sizes
      Time reading reference sizes: 00:00:01
      Total time for call to driver() for forward index: 00:00:01
      Reading reference sizes
      Time reading reference sizes: 00:00:00
      Total time for backward call to driver() for mirror index: 00:00:00



      • #4
        The basename you specified ("pombe_indexes/") will result in the files being named pombe_indexes/.3.ebwt and pombe_indexes/.4.ebwt. They're probably there; you just won't see them unless you run "ls -a", which also lists files whose names start with ".". You probably want to use a basename more like "pombe_indexes/pombe".

        Hope that helps,
        Ben
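
To make the naming issue concrete, here is a minimal sketch that needs no bowtie installation. It assumes, based on the `Output files: "pombe_indexes/.*.ebwt"` line in the log above, that each index file name is simply the basename followed by `.N.ebwt`. The `index_name` helper is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: joins a basename and an index number the way the
# "Output files" line in the log suggests (basename + ".N.ebwt").
index_name() {
    printf '%s.%s.ebwt\n' "$1" "$2"
}

index_name "pombe_indexes/" 3        # basename ends in "/": file name starts with "."
index_name "pombe_indexes/pombe" 3   # basename with a prefix: ordinary visible name

# Simulate both cases in a scratch directory with empty placeholder files.
tmp=$(mktemp -d)
mkdir "$tmp/pombe_indexes"
touch "$(index_name "$tmp/pombe_indexes/" 3)" \
      "$(index_name "$tmp/pombe_indexes/pombe" 3)"

echo "plain ls:"
ls "$tmp/pombe_indexes"      # dotfiles are normally omitted here
echo "ls -A:"
ls -A "$tmp/pombe_indexes"   # includes names that start with "."

rm -rf "$tmp"
```

With a basename ending in "/", the dot that separates the basename from the index number becomes the first character of the file name, which is why a plain `ls` normally hides the file.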



        • #5
          Ah thanks very much, a big oversight on my part. The files were indeed there, and changing the base name worked. Thanks very much for the help, I feel rather silly for overlooking that detail!
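
For anyone hitting the same problem, the following is a rough sketch of the corrected invocation plus a quick completeness check. It assumes the usual six-file layout of a bowtie 1 index (`.1`, `.2`, `.3`, `.4`, `.rev.1`, `.rev.2`, each with the `.ebwt` extension). The `missing_index_files` helper is hypothetical, and the bowtie-build line is left commented out so the snippet runs without bowtie installed.

```shell
#!/bin/sh
# Corrected invocation from the thread: no -r, and a basename with a prefix.
# /data/apps/bowtie/bowtie-build pombe_fasta/chromosome1.fasta,pombe_fasta/chromosome2.fasta,pombe_fasta/chromosome3.fasta pombe_indexes/pombe

# Hypothetical helper: print every expected index file that does not exist
# for the given basename. Prints nothing when the index looks complete.
missing_index_files() {
    for suffix in 1 2 3 4 rev.1 rev.2; do
        f="$1.$suffix.ebwt"
        [ -e "$f" ] || echo "$f"
    done
}

missing_index_files "pombe_indexes/pombe"
```

If only the `.3.ebwt` and `.4.ebwt` names are printed, the index was most likely built with -r, as described in reply #2.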
