Hi,
I am using Bowtie 2 (version 2.0.0-beta4) to map raw Illumina reads to a reference file containing a lot of FASTA sequences.
I have Illumina GAIIx data (75 bp reads; the single-end file is 9 GB) and HiSeq data (100 bp reads; the single-end files are 12 GB or 20 GB).
When mapping with these options:
bowtie2 -x repbase_nov11_TEs/ref_fasta.indexed -q -U ../../singleend_rawreads.fastq --sensitive-local -a -p 6 -S mapped.sam
everything works fine for both the GAIIx data and the HiSeq data when I use a FASTA reference with 5 sequences in it.
But if I use a bigger FASTA reference (60 megabytes), it still works fine for the GAIIx data, but with the HiSeq data it seems to run forever.
When I look at the size of the generated SAM file, it grows to a size that differs from run to run and then stops growing, but the processors are still working (although of the 6 threads started, sometimes only 4 keep going after this size is reached).
I thought that the FASTA reference or the raw read input might be too big, so I tried using only a fraction of each, but the same problem occurred again.
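For anyone who wants to reproduce the test with a fraction of the reads, here is a minimal sketch of one way to cut down a FASTQ file. It assumes plain 4-line FASTQ records (no wrapped sequence lines); the function name `subset_fastq`, the read count, and the output file name are only illustrative and not necessarily what was used in the runs described above.

```shell
# Write the first N reads of a FASTQ file to a new file.
# Assumes every record is exactly 4 lines (header, sequence, '+', qualities).
subset_fastq() {
    local in_fastq="$1" n_reads="$2" out_fastq="$3"
    head -n "$(( n_reads * 4 ))" "$in_fastq" > "$out_fastq"
}

# Hypothetical usage: keep the first 1,000,000 reads.
# subset_fastq ../../singleend_rawreads.fastq 1000000 subset_1M.fastq
```

Taking the first reads of a file is not a random sample, so it is only meant as a quick check of whether the stall depends on input size.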
This is strange, because the exact same data with another fasta-ref (the small one) worked just fine.
I also ran it on another system with 48 cores, and there the *.sam file grows to 50 GB and then stops growing, but the CPUs keep running. If I start it with 6 cores, it only gets to about 10 GB.
I waited about 5 days, so I do not think it will ever finish.
I really like Bowtie 2, and it does exactly what I want, so I would be grateful if someone could give me a hint on how to solve this problem.
Best wishes,
Jens