#1 |
Junior Member
Location: Singapore Join Date: Jan 2010
Posts: 9

Dear all,

We are having trouble indexing our reference genome with bowtie-build, because our reference is larger than 4 billion characters. According to the manual, this is not possible. Is there a solution that does not require modifying the source code? We would like to treat source code modification as a last resort. Even so, we would appreciate any insight into how the source code could be modified to handle a 6 billion character genome.

Regards,
Kevin
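
For readers wondering where a limit of roughly 4 billion characters would come from: a minimal sketch, assuming that reference offsets are stored as unsigned 32-bit integers (the thread later mentions `uint32_t` in the bowtie source). The function name is hypothetical, and the exact ceiling in bowtie itself may be somewhat lower than the value computed here.

```python
# Assumption: every reference position must fit in an unsigned 32-bit integer.
UINT32_MAX = 2**32 - 1  # largest value a uint32_t can hold


def fits_in_uint32_index(reference_length: int) -> bool:
    """Return True if every 0-based position in the reference fits in 32 bits."""
    return reference_length - 1 <= UINT32_MAX


for label, length in [("3 billion", 3_000_000_000), ("6 billion", 6_000_000_000)]:
    print(label, "characters fit:", fits_in_uint32_index(length))
```

Under this assumption, a 6 billion character reference cannot be addressed without widening the offset type.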
#2 | |
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285

Could you split your reference, align to each part separately, and merge the results? This is not as faithful to the bowtie algorithm, but it seems like a practical solution.
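
One way to do the splitting step is to pack whole sequences into groups that each stay under a size budget. This is a minimal sketch, not part of bowtie; the function names and the `max_bases` parameter are hypothetical, and it assumes no single sequence exceeds the budget on its own (such a sequence simply gets its own group).

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_records(records: Iterable[Record], max_bases: int) -> List[List[Record]]:
    """Greedily pack whole sequences into groups of at most max_bases each."""
    groups, current, size = [], [], 0
    for header, seq in records:
        # Start a new group when adding this sequence would exceed the budget.
        if current and size + len(seq) > max_bases:
            groups.append(current)
            current, size = [], 0
        current.append((header, seq))
        size += len(seq)
    if current:
        groups.append(current)
    return groups


demo = [">chr1", "ACGT", "ACGT", ">chr2", "GGGG", ">chr3", "TT"]
for i, group in enumerate(split_records(read_fasta(demo), max_bases=10)):
    print(i, [header for header, _ in group])
```

Each group would then be written to its own FASTA file and indexed separately.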
#3 |
Junior Member
Location: Singapore Join Date: Jan 2010
Posts: 9

Hi,

Thanks for the reply. Can anyone point me to where the pointers I would need to change are located?

Regards,
Kevin
#4 | |
Junior Member
Location: Nova Scotia Join Date: Feb 2010
Posts: 7

Hi Kevin,

Trying to update the source code could be more trouble than it is worth. If it were simply a matter of changing a few pointers, the author would likely have done that rather than adding a disclaimer about the reference size limit to the manual.

If you are committed to Bowtie, splitting your reference sequence into two files will get you up and running, as others have pointed out.
#5 | |
Junior Member
Location: Singapore Join Date: Jan 2010
Posts: 9

Yes, we also think that messing around with the source code is a cumbersome task.

However, we want to do so because we want bowtie to find reads that align uniquely to a given reference genome using the "-m 1 --best --strata" parameters. If we split the reference genome in two, we are essentially running bowtie once per split. Even if we had a correct way to merge these result sets to obtain the unique alignments, this is not the same as running the same parameters on the combined reference. The reason is that we are looking for unique alignments at the "best strata" level. Splitting the reference allows bowtie to report alignments that are "best strata" unique only within a subset of the reference.

Hence, we are left with the last resort, which is to modify the source code. Any form of help is truly appreciated.

Thanks.

Regards,
Kevin
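
The concern above can be made concrete with a small merge sketch. It assumes each split was run so that up to two best-stratum alignments per read are reported (for example `-k 2 --best --strata` instead of `-m 1`), so that a read with several best-stratum hits inside one split stays visible to the merge. The `Hit` type and function name are hypothetical, the input is assumed to be already parsed, and whether the result matches a combined-reference run exactly has not been verified here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    read_id: str
    ref_name: str
    position: int
    mismatches: int  # the alignment's stratum


def unique_best_stratum(hits_per_split: Iterable[Iterable[Hit]]) -> Dict[str, Hit]:
    """Keep reads with exactly one alignment in their best stratum across all splits."""
    by_read: Dict[str, List[Hit]] = defaultdict(list)
    for hits in hits_per_split:
        for hit in hits:
            by_read[hit.read_id].append(hit)

    unique: Dict[str, Hit] = {}
    for read_id, hits in by_read.items():
        best = min(h.mismatches for h in hits)
        in_best = [h for h in hits if h.mismatches == best]
        if len(in_best) == 1:
            unique[read_id] = in_best[0]
    return unique


split_a = [
    Hit("r1", "chrA", 100, 0),
    Hit("r2", "chrA", 500, 1),
    Hit("r3", "chrA", 10, 0),
    Hit("r3", "chrA", 90, 0),
]
split_b = [Hit("r1", "chrB", 40, 1), Hit("r2", "chrB", 7, 1)]
print(unique_best_stratum([split_a, split_b]))
```

In this toy data, r2 is unique within each split but ties across splits, which is exactly the case a per-split `-m 1` run would get wrong.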
#6 | |
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285
|
![]() Quote:
#7 | |
Junior Member
Location: Nova Scotia Join Date: Feb 2010
Posts: 7

Hi Kevin,

Take a look at the ebwt.h file in the bowtie source distribution. This file outlines the ebwt-related classes. Searching for 'int', 'uint32_t', and 'int32_t' should give you an idea of where you can start modifying the code.

You might also find it useful to compile bowtie with the '-ggdb' flag, and then invoke bowtie-build on your large reference sequence within gdb to see exactly where things break down.

-Scott
Last edited by sperry; 03-01-2010 at 08:21 AM.
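
The search suggested above can be scripted. A minimal sketch with a hypothetical helper name; the sample lines are made up for illustration and are not taken from ebwt.h.

```python
import re
from typing import Iterable, List, Tuple

# Match the three type names as whole words, so 'int' does not match inside 'uint32_t'.
TYPE_PATTERN = re.compile(r"\b(?:uint32_t|int32_t|int)\b")


def find_32bit_candidates(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line number, matched type, stripped line) for every occurrence."""
    found = []
    for number, line in enumerate(lines, start=1):
        for match in TYPE_PATTERN.finditer(line):
            found.append((number, match.group(0), line.strip()))
    return found


# Illustrative input only; not real bowtie source.
sample = ["uint32_t len;", "size_t n;", "int32_t off; int i;"]
for number, type_name, text in find_32bit_candidates(sample):
    print(f"{number}: {type_name}: {text}")
```

To scan a real file, pass an open file handle instead of the sample list, for example `find_32bit_candidates(open("ebwt.h"))`, assuming that path exists in your source tree.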
#8 |
Senior Member
Location: US Join Date: Jan 2009
Posts: 392

An old thread, but I am currently in a similar situation. I have a polyploid genome of >10 Gb that I have to work with. Does anybody have recommendations on altering bowtie for this?

Alternatively, are there any good strategies for post-processing data aligned to individual chunks to achieve the same result?
#9 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480

I think BWA can handle larger genomes; that'd be the easiest solution.

BTW, you can split a genome, map all the reads to each of the chunks with bowtie2, and then process the results to produce output equivalent to what you would have gotten by aligning to the whole genome with bowtie2, but it's not completely trivial. This is effectively how bisulfite-seq aligners work (see the source code for Bison if you really want to see how to do this).
#10 |
Senior Member
Location: US Join Date: Jan 2009
Posts: 392

This is for bisulphite sequencing. The problem is that my lab uses a specific pipeline for our analysis, and we work closely with the developers. Bowtie is a standard part of that protocol, and I have already used this pipeline to analyze A LOT of data; this is the first time I have run into problems. I would really like to avoid using any other aligner, because the effort of achieving results identical to Bowtie's would be a headache in itself.

That being said, I think I have successfully modified bowtie-build. Whether or not this works I can't say until it's finished and I have had a chance to align some data, but it seems to be working.
#11 | ||
Junior Member
Location: Sydney Join Date: Aug 2014
Posts: 4
|
![]() Quote:
Quote:
#12 |
Senior Member
Location: Puerto Rico Join Date: Sep 2014
Posts: 106

Hi,

I have to map reads to the yeast genome using bowtie2. Where can I download the genome for this?

http://www.ebi.ac.uk/ena/data/search?query=yeast
http://downloads.yeastgenome.org/seq...nome_releases/
http://www.yeastgenome.org/download-data/sequence
http://www.yeastgenome.org/strain/S288C/overview

Where can I find the reference genome?

Best Regards
Zillur