SEQanswers

Old 06-01-2010, 01:09 AM   #1
anecsulea
Member
 
Location: Lausanne

Join Date: Dec 2009
Posts: 12
Bowtie can't read index files

Dear all,

I'm having a recurrent problem with Bowtie: it fails to read the indexes it has just built.

Here are some details about my configuration: I'm using Bowtie 0.12.5 (but 0.12.3 gave the exact same error), on a Linux x86_64 computer.

I get error messages of this type:

Error reading _plen[] array: 4194272, 55604484

Error reading ebwt array: returned 41750080, length was 168445184

The index had been previously built by the same version of Bowtie. In fact these errors had occurred while running TopHat (which incidentally does not catch the errors thrown by Bowtie and finishes the run with "success", but does not give correct or complete results).

The worst part is that this error does not occur every time: as a test, I've run Bowtie about 100 times on a toy dataset (with the exact same input reads and genome index), and Bowtie crashed only 6 times. It does seem to crash more often when the input is larger, though.

I don't understand what might be the problem. I'm starting to wonder if it might be because the filesystem structure is somehow corrupt on the computers I'm using. This is why I would like to know if anyone else has encountered this problem.

Any comments or suggestions would be much appreciated. Thank you for your help!

Best,

Anamaria
Old 06-01-2010, 04:11 AM   #2
Ben Langmead
Senior Member
 
Location: Baltimore, MD

Join Date: Sep 2008
Posts: 200

Hi Anamaria,

These types of errors occur when the files are genuinely either corrupt or incomplete (e.g. if the disk becomes exhausted during the index-building process). Can you send detailed output from one example where this happens, including a 'ls -l' on the index files after bowtie-build completes?

Thanks,
Ben
Old 06-01-2010, 06:13 AM   #3
anecsulea
Member
 
Location: Lausanne

Join Date: Dec 2009
Posts: 12

Hi Ben,

This is what I originally thought, but I can't see how the exact same index file can be corrupt for one run and OK on the next. I've run several hundred tests, using the same index file and the same reads file, and only a few of these Bowtie jobs crash.

Here is the ls -l of the index files:

##################################

-rw-r--r-- 1 anecsule henrik 58822647 Jun 1 13:52 chr3_ensembl57.1.ebwt
-rw-r--r-- 1 anecsule henrik 23794420 Jun 1 13:52 chr3_ensembl57.2.ebwt
-rw-r--r-- 1 anecsule henrik 180665 Jun 1 13:47 chr3_ensembl57.3.ebwt
-rw-r--r-- 1 anecsule henrik 47588832 Jun 1 13:47 chr3_ensembl57.4.ebwt
-rw-r--r-- 1 anecsule henrik 205509239 May 29 15:37 chr3_ensembl57.fa
-rw-r--r-- 1 anecsule henrik 58822647 Jun 1 13:58 chr3_ensembl57.rev.1.ebwt
-rw-r--r-- 1 anecsule henrik 23794420 Jun 1 13:58 chr3_ensembl57.rev.2.ebwt

##################################

And the output :

##################################

Error reading ebwt array: returned 3953400, length was 54387328
Your index files may be corrupt; please try re-building or re-downloading.
A complete index consists of 6 files: XYZ.1.ebwt, XYZ.2.ebwt, XYZ.3.ebwt,
XYZ.4.ebwt, XYZ.rev.1.ebwt, and XYZ.rev.2.ebwt. The XYZ.1.ebwt and
XYZ.rev.1.ebwt files should have the same size, as should the XYZ.2.ebwt and
XYZ.rev.2.ebwt files.
Command: /home/vital-it/anecsule/Tools/bowtie-0.12.3/bowtie -p 4 -q --phred33-quals -m 1 /scratch/frt/yearly/necsulea/Orthosplice/results/tests_bowtie/index_0.12.3/chr3_ensembl57 /scratch/frt/yearly/necsulea/Orthosplice/results/tests_bowtie/reads.txt /scratch/frt/yearly/necsulea/Orthosplice/results/tests_bowtie/test_1_0.12.3/results_1.txt

##################################
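
The file list in the Bowtie message above can be checked mechanically before each run. The following Python sketch does so under stated assumptions: the function name "check_ebwt_index" is hypothetical, and the rules it applies (six files, equal sizes for the forward and .rev pairs) are taken only from the message text, not from Bowtie's source. The index listed above would pass this check, so it can rule out a truncated index but cannot detect the intermittent read failure itself.

```python
import os

# Suffixes named in the Bowtie 1 error message quoted above.
EBWT_SUFFIXES = (".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt",
                 ".rev.1.ebwt", ".rev.2.ebwt")


def check_ebwt_index(prefix):
    """Return a list of problems for the index at `prefix` (empty if none).

    Hypothetical helper: it only checks file presence and paired sizes.
    """
    problems = []
    sizes = {}
    for suffix in EBWT_SUFFIXES:
        path = prefix + suffix
        if os.path.isfile(path):
            sizes[suffix] = os.path.getsize(path)
        else:
            problems.append("missing file: " + path)
    # The message says forward and reverse files should have equal sizes.
    for fwd, rev in ((".1.ebwt", ".rev.1.ebwt"), (".2.ebwt", ".rev.2.ebwt")):
        if fwd in sizes and rev in sizes and sizes[fwd] != sizes[rev]:
            problems.append("size mismatch: %s (%d bytes) vs %s (%d bytes)"
                            % (fwd, sizes[fwd], rev, sizes[rev]))
    return problems
```

For example, calling check_ebwt_index("chr3_ensembl57") from inside the index directory returns an empty list when all six files exist and the paired sizes agree.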

I'm currently testing one potential solution: I've noticed that in ebwt.h you're using the "read" function in C if BOWTIE_MM is defined (i.e. on Linux) and the "fread" function if not (i.e. on Windows). I was wondering if I would get the same errors with "fread", so I've compiled bowtie as if for Windows, and I'm doing the same tests. I'll let you know if that works ok.

Also, I wanted to ask you if you think it's normal that TopHat does not catch this error thrown by Bowtie. I've had several TopHat runs that finished with apparent "success", but which in fact only gave partial results because reading the Bowtie index for the junction sequences had failed. This seems quite dangerous, as most users will not check the log files for Bowtie errors if TopHat has finished successfully.

Thanks again for your help!
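
The concern that a wrapper can report success after a failed Bowtie call can be made concrete with a small Python sketch. It is not TopHat's code: "run_checked" is a hypothetical helper, and scanning stderr for the string "Error reading" is an assumption based only on the messages quoted in this thread. Whether Bowtie exits with a non-zero status in this situation is also not stated here, which is why the sketch checks both.

```python
import subprocess


def run_checked(cmd, error_markers=("Error reading",)):
    """Run `cmd` (a list of arguments) and fail loudly on any sign of error.

    Hypothetical helper: raises RuntimeError on a non-zero exit status or
    when stderr contains one of `error_markers`; otherwise returns stdout.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError("command failed with exit status %d: %s"
                           % (result.returncode, result.stderr.strip()))
    for marker in error_markers:
        if marker in result.stderr:
            raise RuntimeError("command reported an error: "
                               + result.stderr.strip())
    return result.stdout
```

A pipeline would call run_checked with the Bowtie command line as a list of arguments, so that a failed mapping step stops the run instead of yielding partial results.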
Old 06-01-2010, 06:51 AM   #4
Ben Langmead
Senior Member
 
Location: Baltimore, MD

Join Date: Sep 2008
Posts: 200

Quote:
Originally Posted by anecsulea
Here is the ls -l of the index files:

##################################

-rw-r--r-- 1 anecsule henrik 58822647 Jun 1 13:52 chr3_ensembl57.1.ebwt
-rw-r--r-- 1 anecsule henrik 23794420 Jun 1 13:52 chr3_ensembl57.2.ebwt
-rw-r--r-- 1 anecsule henrik 180665 Jun 1 13:47 chr3_ensembl57.3.ebwt
-rw-r--r-- 1 anecsule henrik 47588832 Jun 1 13:47 chr3_ensembl57.4.ebwt
-rw-r--r-- 1 anecsule henrik 205509239 May 29 15:37 chr3_ensembl57.fa
-rw-r--r-- 1 anecsule henrik 58822647 Jun 1 13:58 chr3_ensembl57.rev.1.ebwt
-rw-r--r-- 1 anecsule henrik 23794420 Jun 1 13:58 chr3_ensembl57.rev.2.ebwt

Yep, looks good. The write is probably not failing and the files are probably not corrupt or incomplete.

Quote:
Originally Posted by anecsulea
I'm currently testing one potential solution: I've noticed that in ebwt.h you're using the "read" function in C if BOWTIE_MM is defined (i.e. on Linux) and the "fread" function if not (i.e. on Windows). I was wondering if I would get the same errors with "fread", so I've compiled bowtie as if for Windows, and I'm doing the same tests. I'll let you know if that works ok.

I'd be interested to know if that works.

Is there anything else of note about the partition/filesystem that the index files are stored on? Is it NFS? The problem seems to be that bowtie-build successfully writes the entire index, but when it then tries to read it back in *immediately*, it gets something incomplete. That *might* be Bowtie's fault, but more likely it's some combination of OS & FS.

Quote:
Originally Posted by anecsulea
Also, I wanted to ask you if you think it's normal that TopHat does not catch this error thrown by Bowtie. I've had several TopHat runs that finished with apparent "success", but which in fact only gave partial results because reading the Bowtie index for the junction sequences had failed. This seems quite dangerous, as most users will not check the log files for Bowtie errors if TopHat has finished successfully.

If you have separate questions about Bowtie and TopHat, it's best to post them separately. Cole reads Seqanswers messages about TopHat and I read ones about Bowtie.

If Bowtie later successfully opens and queries that same set of index files, then they're not actually corrupt; it just appeared that way immediately after they were written, due to OS wackiness. So the TopHat results could very well be fine.

Ben
Old 06-01-2010, 07:19 AM   #5
anecsulea
Member
 
Location: Lausanne

Join Date: Dec 2009
Posts: 12

Quote:

Is there anything else of note about the partition/filesystem that the index files are stored on? Is it NFS? The problem seems to be that bowtie-build successfully writes the entire index, but when it then tries to read it back in *immediately*, it gets something incomplete. That *might* be Bowtie's fault, but more likely it's some combination of OS & FS.

The filesystem is Lustre; I'm doing my computations on a cluster. However, I should tell you that Bowtie does not crash only *immediately* after building the index: in my tests there were at least a few minutes between building the index and running Bowtie.


Quote:
If you have separate questions about Bowtie and TopHat, it's best to post them separately. Cole reads Seqanswers messages about TopHat and I read ones about Bowtie.

Of course, I understand. However, I have already posted two messages about TopHat (in the Bioinformatics and RNASeq forums), with no response yet (nor was there any response to the e-mails I've sent; sorry for insisting, I was getting a bit desperate). Plus, the questions aren't that separate, in my opinion: we're dealing with a Bowtie error that TopHat should catch but does not.


Quote:

If Bowtie later successfully opens and queries that same set of index files, then they're not actually corrupt; it just appeared that way immediately after they were written, due to OS wackiness. So the TopHat results could very well be fine.

No, they are definitely not fine. I'm running TopHat on long reads (76 bp), so TopHat splits each read into three segments and then tries to map the segments to the Bowtie index of the junction sequences. It can happen that only one of the mapping attempts fails while the others succeed, so TopHat can still confirm some junctions. Anyway, I will explain all this in more detail in my TopHat-specific posts.

I'll keep in touch about the Bowtie problem, but if you have any other suggestions for things that I should test, please let me know; I'm running out of ideas. Thanks!

Best,

Anamaria
Old 06-04-2010, 04:10 AM   #6
anecsulea
Member
 
Location: Lausanne

Join Date: Dec 2009
Posts: 12

Hi again,

I've run several more series of tests in which I replaced the occurrences of the "read" function with "fread", and this solution seems to work fine. I haven't had any "Error reading..." messages in hundreds of tests, and the results are as expected.

Actually, the simplest way to make this change without modifying the source code too much was to force BOWTIE_MM = 0 in the Makefile. I also had to manually replace some occurrences of "lseek" in ebwt.h with MM_SEEK to get it to compile (I'm surprised that Windows users, if there are any, haven't complained about this).

Best wishes,

Anamaria
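
One plausible explanation for why switching to "fread" helps, which is not confirmed anywhere in this thread: the POSIX "read" call may return fewer bytes than requested (for example when it is interrupted by a signal), whereas "fread" keeps reading until it has the requested count, reaches end of file, or hits an error. Messages of the form "returned 3953400, length was 54387328" would be consistent with such a short read. The usual remedy is to call "read" in a loop. The sketch below shows that loop in Python, whose os.read wraps the same system call; "read_fully" is a hypothetical name, and this is not Bowtie's code.

```python
import os


def read_fully(fd, length):
    """Read exactly `length` bytes from file descriptor `fd`.

    Hypothetical helper: retries after short reads and raises EOFError
    if the file ends before `length` bytes have been read.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        # os.read may legitimately return fewer bytes than requested.
        chunk = os.read(fd, remaining)
        if not chunk:  # an empty result means end of file
            raise EOFError("expected %d bytes, got %d"
                           % (length, length - remaining))
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The design point is that a short return value is treated as "keep going" rather than as corruption; only a zero-length read (end of file) is reported as an error.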
Old 10-16-2012, 04:46 AM   #7
wanfahmi
Member
 
Location: North Sea

Join Date: Apr 2008
Posts: 34
Could not find Bowtie index files ( genome.*.ebwt)

Hi, I tried to run the sample data (fruit fly) analysis as suggested in the paper by Trapnell et al. (2012), but it fails with this error even though the file in question is already in the same directory. Thanks.


[2012-10-16 13:39:15] Beginning TopHat run (v2.0.4)
-----------------------------------------------
[2012-10-16 13:39:15] Checking for Bowtie
Bowtie 2 not found, checking for older version..
Bowtie version: 0.12.8.0
[2012-10-16 13:39:15] Checking for Samtools
Samtools version: 0.1.18.0
[2012-10-16 13:39:15] Checking for Bowtie index files
Error: Could not find Bowtie index files ( genome.*.ebwt)
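
The log shows TopHat falling back to Bowtie 0.12.8 and therefore looking for Bowtie 1 index files named genome.*.ebwt. Bowtie 2 indexes use the .bt2 extension instead, so one possible cause (an assumption, since the directory listing is not shown) is that only a Bowtie 2 index was present. The Python sketch below reports which kind of index files exist for a given prefix; "index_kinds" is a hypothetical helper name.

```python
import glob


def index_kinds(prefix):
    """Report which index file families exist for `prefix`.

    Hypothetical helper: returns a list containing "bowtie1" and/or
    "bowtie2", based purely on file extensions.
    """
    kinds = []
    # glob.escape keeps special characters in the prefix from being
    # interpreted as wildcards.
    if glob.glob(glob.escape(prefix) + ".*.ebwt"):
        kinds.append("bowtie1")  # Bowtie 1 index files end in .ebwt
    if glob.glob(glob.escape(prefix) + ".*.bt2"):
        kinds.append("bowtie2")  # Bowtie 2 index files end in .bt2
    return kinds
```

If index_kinds("genome") returns only "bowtie2" while the log reports a Bowtie 1 version, the index and the installed aligner do not match.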
Old 10-16-2012, 05:31 AM   #8
JackieBadger
Senior Member
 
Location: Halifax, Nova Scotia

Join Date: Mar 2009
Posts: 381

Why don't you update to the current version of Bowtie2 and see if the problem is resolved?
Old 10-16-2012, 06:32 AM   #9
wanfahmi
Member
 
Location: North Sea

Join Date: Apr 2008
Posts: 34

Quote:
Originally Posted by JackieBadger
Why don't you update to the current version of Bowtie2 and see if the problem is resolved?

Thanks for the suggestion. I have updated to the new Bowtie2 and it's working.
Old 02-20-2013, 09:31 PM   #10
jsimba
Junior Member
 
Location: Australia

Join Date: Feb 2013
Posts: 6
Build my own reference

Hi everybody,

I have a problem trying to create my index with Bowtie on OS X. I want to use multiple FASTQ files, so I first merged all of them into one. When I run bowtie-build, I get this:

Writing header
Reserving space for joined string
Joining reference sequences
Reference file does not seem to be a FASTA file

Then, when I list the outputs, I get only the 4 .ebwt files; the *ebwt files that are needed to run TopHat are missing.

What is the solution to this?

Thanks
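
The message "Reference file does not seem to be a FASTA file" suggests that bowtie-build was given FASTQ input; bowtie-build expects its reference sequences in FASTA format. A quick way to see which format a file is in is to look at the first character of its first non-empty line: FASTA records start with ">" and FASTQ records start with "@". The Python sketch below does only that; "guess_sequence_format" is a hypothetical helper, and it does not validate the rest of the file.

```python
def guess_sequence_format(path):
    """Guess "fasta", "fastq", or "unknown" from the first non-empty line.

    Hypothetical helper: only the leading character is inspected.
    """
    with open(path, "r") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            return "unknown"
    return "unknown"  # empty file
```

If the merged file is reported as "fastq", it contains reads rather than a FASTA reference, which would explain why bowtie-build stopped before writing the complete index.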

Tags
bowtie, error reading ebwt array, tophat

All times are GMT -8.