ARDISSON 11-06-2018 11:24 PM

reference size for BWA index
Dear all,
I would like to map reads with bwa mem against a 10 GB genomic reference, but I have a problem with bwa index.
Is there a maximum reference size for bwa index?
How can I build the index for my 10 GB reference?
Thanks for your help.

Gopo 11-07-2018 02:47 AM


bwa index -a bwtsw genome.fasta

Usage:  bwa index [options] <in.fasta>

Options: -a STR    BWT construction algorithm: bwtsw or is [auto]
        -p STR    prefix of the index [same as fasta name]
        -b INT    block size for the bwtsw algorithm (effective with -a bwtsw) [10000000]
        -6        index files named as <in.fasta>.64.* instead of <in.fasta>.*

Warning: `-a bwtsw' does not work for short genomes, while `-a is' and
        `-a div' do not work not for long genomes.

You could also try BBMap; I have used it on the 32 Gbp axolotl genome.

ARDISSON 11-12-2018 01:28 AM

Thank you for your answer, but I have a new problem.
When I try to create the BAM index, I get an error: the BAM index format does not seem to support reference sequences longer than 512 Mb.
With the Picard tools I get this error:

picard.sam.MergeSamFiles INPUT=[./resultat_mapping.Tc3423_tmp/Tc3423.paired.bam, ./resultat_mapping.Tc3423_tmp/Tc3423.single.bam] OUTPUT=resultat_mapping.Tc3423.bam SORT_ORDER=
[Tue Nov 06 13:37:53 CET 2018] Executing as ardisson@cc2-n7 on Linux 2.6.32-504.16.2.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_76-b13; Picard version: 1.130(8b3e8abe25f920f5aa569db482bb999f29cc447b_1427207353) IntelDeflater
INFO 2018-11-06 13:37:54 MergeSamFiles Input files are in same order as output so sorting to temp directory is not needed.
[Tue Nov 06 13:37:57 CET 2018] picard.sam.MergeSamFiles done. Elapsed time: 0.05 minutes.
To get help, see
Exception in thread "main" htsjdk.samtools.SAMException: Exception when processing alignment for BAM index ST-J00115:130:HMNN3BBXX:4:1112:8044:48386 2/2 144b aligned read.
at htsjdk.samtools.BAMFileWriter.writeAlignment(
at htsjdk.samtools.SAMFileWriterImpl.addAlignment(
at picard.sam.MergeSamFiles.doWork(
at picard.cmdline.CommandLineProgram.instanceMain(
at picard.cmdline.PicardCommandLine.instanceMain(
at picard.cmdline.PicardCommandLine.main(
Caused by: htsjdk.samtools.SAMException: Exception creating BAM index for record ST-J00115:130:HMNN3BBXX:4:1112:8044:48386 2/2 144b aligned read.
at htsjdk.samtools.BAMIndexer.processAlignment(
at htsjdk.samtools.BAMFileWriter.writeAlignment(
... 5 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 32775
at htsjdk.samtools.BinningIndexBuilder.processFeature(
at htsjdk.samtools.BAMIndexer$BAMIndexBuilder.processAlignment(
at htsjdk.samtools.BAMIndexer.processAlignment(
... 6 more
Is there a way to solve this?

Thank you for your help

Gopo 11-12-2018 01:35 AM

I am not sure I understand what you are trying to do. Are you trying to merge the mapped paired-end reads BAM file with the mapped single-end reads BAM file? Perhaps a newer version of Picard might help or perhaps using

samtools merge

ARDISSON 11-13-2018 10:11 PM

I mapped the paired-end and single-end reads separately. Then I sorted the BAMs with the Picard tools using CREATE_INDEX=FALSE, and that went well. Then I tried to merge them, but with CREATE_INDEX=TRUE. I have done this many times on smaller references without any problem.

The error message seems to be related to BAM index creation, which cannot handle chromosomes longer than 512 Mb. As I have chromosomes of more than 1 Gb, I am stuck.

Do you know of any way to work around this limitation of the BAM index?
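
As a rough sanity check on the 512 Mb figure: the SAM specification describes the BAI binning scheme as covering coordinates up to 2^29 bp, with a linear index built from 16 kbp (2^14 bp) windows. The sketch below assumes that the 32775 in the ArrayIndexOutOfBoundsException above is such a window number. That reading is a guess from the stack trace, not something confirmed against the htsjdk source.

```shell
#!/usr/bin/env bash
# Upper coordinate bound of a BAI index: 2^29 bp (512 Mbp), per the SAM specification.
bai_limit=$(( 1 << 29 ))

# Assumption: the 32775 reported in the exception is a linear-index window
# number; BAI linear-index windows are 2^14 bp (16 kbp) wide.
failing_window=32775
window_start=$(( failing_window << 14 ))

echo "BAI limit:                 ${bai_limit} bp"
echo "window ${failing_window} starts at:    ${window_start} bp"

if (( window_start >= bai_limit )); then
    echo "window lies beyond the BAI limit"
fi
```

Under that assumption the failing read would sit just past the 2^29 bp boundary, which fits a chromosome of more than 1 Gb.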

Gopo 11-13-2018 10:23 PM

Sorry, I don't know of a solution. The best I can suggest is to report this to the issues page on the Picard github repo.

Again, maybe a newer version of Picard, the newest version of samtools (samtools merge), or perhaps sambamba (sambamba merge) might work?

Another possibility is to do the merge with CREATE_INDEX=FALSE. Then see if samtools index or sambamba index can make the index of the merged BAM.
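
A minimal sketch of that last suggestion, using samtools rather than Picard for the merge. It assumes a FASTA index (.fai, as written by samtools faidx) is available for the reference, and it uses the CSI index format (samtools index -c), which is designed for contigs longer than the 2^29 bp a BAI index can address. The function names and file names are hypothetical, and whether downstream tools accept a .csi index would need checking.

```shell
#!/usr/bin/env bash
# BAI indexes address coordinates up to 2^29 bp (512 Mbp).
BAI_MAX_LEN=$(( 1 << 29 ))

# Print the names of contigs that are too long for a BAI index.
# $1: FASTA index (.fai); column 1 is the contig name, column 2 its length in bp.
contigs_over_bai_limit() {
    awk -v max="$BAI_MAX_LEN" '$2 > max { print $1 }' "$1"
}

# Merge BAMs without indexing, then choose the index format.
# $1: .fai file, $2: output BAM, remaining arguments: input BAMs.
merge_and_index() {
    local fai=$1 out=$2
    shift 2
    samtools merge "$out" "$@" || return 1
    if [ -n "$(contigs_over_bai_limit "$fai")" ]; then
        samtools index -c "$out"   # CSI index (.csi) for long contigs
    else
        samtools index "$out"      # default BAI index (.bai)
    fi
}
```

It could be called as, for example, merge_and_index genome.fasta.fai merged.bam paired.bam single.bam, where all four file names are placeholders.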

All times are GMT -8.