memyselfandi 08-04-2010 11:00 AM

bowtie_build input help
Hello all,

What is the proper input data for bowtie_build if I would like to build my own index for the latest mouse genome? For example, the bowtie site offers 4 pre-built index downloads (2 from NCBI, 2 from UCSC) that are 2.4 GB each. What input files were used to generate these indices, and what commands were then run on them?

If from NCBI, were they all the chromosome files located at:

or something else?

Could anyone point me to where the files are from and located?

Thank you so much!

raela 08-04-2010 11:08 AM

Yes, download all of the FASTA chromosome sequences from your source of choice.

For UCSC, choose one of the mm# assemblies, go to bigZips/, and get chromFa.tar.gz.

mrawlins 08-04-2010 11:48 AM

To build an index, use bowtie-build, which is included in the bowtie distribution. Run the command

bowtie-build <fasta_file> <bowtie_index_prefix>

This will create a set of files starting with <bowtie_index_prefix> that contain all the information in the FASTA file, but in a format that makes it easy to look up sequences.

If you're mapping SOLiD reads you'll need to add -C before <fasta_file>. There are additional options explained in the bowtie user manual.
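
Since the chromosome download unpacks to one FASTA file per chromosome, it may help to know that bowtie-build also accepts a comma-separated list of FASTA files in place of a single <fasta_file>. Here is a rough sketch of how that list could be put together. The fasta_list helper, the mm9_fa directory, and the mm9 index prefix are names made up for illustration, not anything from the bowtie distribution, and the sketch assumes the unpacked files end in .fa and have no commas or spaces in their paths.

```shell
# Hypothetical helper: print every *.fa file in a directory as one
# comma-separated string, the form bowtie-build takes for multiple inputs.
fasta_list() {
    dir=$1
    list=""
    for f in "$dir"/*.fa; do
        # Skip the unexpanded pattern when the directory has no .fa files.
        [ -e "$f" ] || continue
        # Prepend a comma only when the list already has an entry.
        list="${list:+$list,}$f"
    done
    printf '%s\n' "$list"
}

# Example usage (assumed names; left commented out because it needs the
# downloaded archive and a bowtie installation):
#   mkdir -p mm9_fa
#   tar -xzf chromFa.tar.gz -C mm9_fa
#   bowtie-build "$(fasta_list mm9_fa)" mm9
# For SOLiD reads, add -C before the file list:
#   bowtie-build -C "$(fasta_list mm9_fa)" mm9_color
```

If you would rather leave some sequences out of the index, move those files out of the directory or narrow the glob before building.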

