SEQanswers

ARDISSON 11-06-2018 11:24 PM

reference size for BWA index
 
Dear all,
I would like to map reads with bwa mem against a 10 GB genomic reference, but I am having a problem with bwa index.
Is there a maximum reference size for bwa index?
How can I build the index for my 10 GB reference?
Thanks for your help
Morgane ARDISSON

Gopo 11-07-2018 02:47 AM

Code:

bwa index -a bwtsw genome.fasta

Code:

Usage:  bwa index [options] <in.fasta>

Options: -a STR    BWT construction algorithm: bwtsw or is [auto]
        -p STR    prefix of the index [same as fasta name]
        -b INT    block size for the bwtsw algorithm (effective with -a bwtsw) [10000000]
        -6        index files named as <in.fasta>.64.* instead of <in.fasta>.*

Warning: `-a bwtsw' does not work for short genomes, while `-a is' and
        `-a div' do not work not for long genomes.

You could also try BBMap as I have used it on the 32Gbp axolotl genome.

ARDISSON 11-12-2018 01:28 AM

Hello,
Thank you for your answer, but I have run into a new problem.
When I try to create the BAM index, I get an error. The BAM index format does not seem to support reference sequences longer than 512 Mb.
With Picard tools I get this error:
Quote:

picard.sam.MergeSamFiles INPUT=[./resultat_mapping.Tc3423_tmp/Tc3423.paired.bam, ./resultat_mapping.Tc3423_tmp/Tc3423.single.bam] OUTPUT=resultat_mapping.Tc3423.bam SORT_ORDER=coordinate MERGE_SEQUENCE_DICTIONARIES=true VALIDATION_STRINGENCY=SILENT CREATE_INDEX=true ASSUME_SORTED=false USE_THREADING=false VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_MD5_FILE=false
[Tue Nov 06 13:37:53 CET 2018] Executing as ardisson@cc2-n7 on Linux 2.6.32-504.16.2.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_76-b13; Picard version: 1.130(8b3e8abe25f920f5aa569db482bb999f29cc447b_1427207353) IntelDeflater
INFO 2018-11-06 13:37:54 MergeSamFiles Input files are in same order as output so sorting to temp directory is not needed.
[Tue Nov 06 13:37:57 CET 2018] picard.sam.MergeSamFiles done. Elapsed time: 0.05 minutes.
Runtime.totalMemory()=2058354688
To get help, see http://broadinstitute.github.io/pica...ml#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMException: Exception when processing alignment for BAM index ST-J00115:130:HMNN3BBXX:4:1112:8044:48386 2/2 144b aligned read.
at htsjdk.samtools.BAMFileWriter.writeAlignment(BAMFileWriter.java:124)
at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:178)
at picard.sam.MergeSamFiles.doWork(MergeSamFiles.java:158)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:187)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)
Caused by: htsjdk.samtools.SAMException: Exception creating BAM index for record ST-J00115:130:HMNN3BBXX:4:1112:8044:48386 2/2 144b aligned read.
at htsjdk.samtools.BAMIndexer.processAlignment(BAMIndexer.java:92)
at htsjdk.samtools.BAMFileWriter.writeAlignment(BAMFileWriter.java:121)
... 5 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 32775
at htsjdk.samtools.BinningIndexBuilder.processFeature(BinningIndexBuilder.java:136)
at htsjdk.samtools.BAMIndexer$BAMIndexBuilder.processAlignment(BAMIndexer.java:195)
at htsjdk.samtools.BAMIndexer.processAlignment(BAMIndexer.java:90)
... 6 more

Is there a way to solve this?

Thank you for your help
Morgane ARDISSON

Gopo 11-12-2018 01:35 AM

I am not sure I understand what you are trying to do. Are you trying to merge the mapped paired-end reads BAM file with the mapped single-end reads BAM file? Perhaps a newer version of Picard might help or perhaps using
Code:

samtools merge
?

ARDISSON 11-13-2018 10:11 PM

Hi,
I mapped the paired-end and single-end reads separately. Then I sorted the BAMs with Picard tools using the option CREATE_INDEX=FALSE, and that went well. Then I tried to merge them, this time with CREATE_INDEX=TRUE. I have done this many times on smaller references without any problems.

The error message seems to be related to BAM index creation, which cannot handle chromosomes larger than 512 Mb. As I have chromosomes of more than 1 Gb, I am stuck.

Do you know any way to work around this limitation of the BAM index?
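
For reference, the 512 Mb figure matches the coordinate range of the standard BAI index, which covers positions up to 2^29 = 536,870,912 bp per reference sequence. A minimal sketch for listing which sequences in a reference exceed that length, assuming a FASTA index file whose first two tab-separated columns are sequence name and length (as written by `samtools faidx`); the function name `long_contigs` and the file name in the usage line are illustrative, not from the thread:

```shell
# List reference sequences that are too long for a BAI index (limit: 2^29 bp).
# Assumes a .fai file with columns name<TAB>length<TAB>..., for example the
# output of `samtools faidx genome.fasta`.
long_contigs() {
    awk -F'\t' -v max=536870912 '$2 > max { print $1 "\t" $2 }' "$1"
}

# Usage (file name is illustrative):
# long_contigs genome.fasta.fai
```

If this prints nothing, every sequence fits within the BAI range and the indexing error probably has another cause.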

Gopo 11-13-2018 10:23 PM

Sorry, I don't know of a solution. The best I can suggest is to report this to the issues page on the Picard github repo.
https://github.com/broadinstitute/picard/issues

Again, maybe a newer version of Picard, the newest version of samtools (samtools merge), or perhaps sambamba (sambamba merge) might work?

Another possibility is to do the merge with CREATE_INDEX=FALSE. Then see if samtools index or sambamba index can make the index of the merged BAM.
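
The merge-then-index route could be sketched as below with samtools. This is an untested outline under assumptions, not a confirmed fix for this dataset: the wrapper name `merge_and_index`, the `RUN` dry-run variable, and all file names are hypothetical. The `-c` option of `samtools index` asks for a CSI index instead of the default BAI; CSI is designed to handle reference sequences longer than 2^29 bp.

```shell
# Merge sorted BAMs without building an index, then index the result separately.
# Set RUN=echo to print the commands instead of running them (dry run).
merge_and_index() {
    out=$1
    shift
    # samtools merge <out.bam> <in1.bam> <in2.bam> ...
    ${RUN:-} samtools merge "$out" "$@" &&
    # -c writes a CSI index (<out.bam>.csi) rather than a BAI index
    ${RUN:-} samtools index -c "$out"
}

# Usage (file names are illustrative):
# merge_and_index merged.bam paired.sorted.bam single.sorted.bam
```

Whether downstream tools accept a `.csi` index in place of `.bai` varies, so that is worth checking before relying on this route.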

