Hey there!
I am trying to use tophat (1.0.7) with the new human genome build GRCh37 (aka hg19). I get an error quite late in the run:
...
[Sat May 9 23:25:26 2009] Mapping reads against hg19 with Bowtie
[Sat May 9 23:55:13 2009] Searching for junctions via coverage islands
[Sun May 10 00:03:52 2009] Searching for junctions via mate-pair closures
[Sun May 10 07:49:28 2009] Retrieving sequences for splices
[Sun May 10 08:53:56 2009] Indexing splices
Error: Reference sequence has more than 2^32-1 characters! Please divide the
reference into batches or chunks of about 3.6 billion characters or less each
and index each independently.
[FAILED]
Error: Splice sequence indexing failed
...
I have a single FASTA file with hg19 and have built all the bowtie indexes from it.
What is the best way to split the file, and how should I name the resulting pieces?
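To make the question concrete, here is a rough Python sketch of the kind of split I have in mind. It groups whole FASTA records into batches that each stay under a character budget, using the 3.6 billion figure from the error message. The names read_fasta and batch_records are my own placeholders, not anything from tophat or bowtie, and I am only assuming that splitting at record boundaries is what the message is asking for. It also holds the sequences in memory, so it is an illustration rather than something tuned for a full genome.

```python
from typing import Iterable, Iterator, List, Tuple

# Budget taken from the error message text; an assumption, not a bowtie constant.
MAX_CHARS = 3_600_000_000

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batch_records(records: Iterable[Record],
                  max_chars: int = MAX_CHARS) -> List[List[Record]]:
    """Group whole records into batches of at most max_chars sequence characters.

    A single record longer than max_chars still gets a batch of its own,
    because records are never cut in the middle here.
    """
    batches: List[List[Record]] = []
    current: List[Record] = []
    size = 0
    for header, seq in records:
        # Start a new batch when adding this record would exceed the budget.
        if current and size + len(seq) > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append((header, seq))
        size += len(seq)
    if current:
        batches.append(current)
    return batches
```

Each batch would then be written to its own FASTA file and indexed separately. What I do not know is whether tophat expects any particular naming for those files.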
Thanks a lot, and thanks for all the hard work that went into making tophat and bowtie.
a