SEQanswers

Old 04-18-2009, 08:52 AM   #1
adkostic
Junior Member
 
Location: Boston

Join Date: Apr 2009
Posts: 4
Default BWA Alignment Segmentation Fault

I'm trying to align a set of 51bp paired-end Illumina reads from a human cell line cDNA library. BWA does a great job aligning my reads to the indexed human cDNA database (Homo_sapiens.NCBI36.53.cdna.all.fa) from Ensembl or to individual chromosomes (for example, Homo_sapiens.NCBI36.53.dna.chromosome.1.fa), but when I try to align to the full human DNA database (Homo_sapiens.NCBI36.53.dna.toplevel.fa) the 'index' step works fine, but the 'aln' step gives a 'Segmentation fault'.

The output looks like this:

[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
Segmentation fault


This happens even when I run the command on machines on my cluster with 32GB of RAM, so I don't think memory is an issue.
Maybe the database is too big for BWA to handle (I doubt that)? Maybe there's something about the way this database is indexed that BWA doesn't like (I don't understand enough about the way BWA indexes and reads SA coordinates and chromosomal coordinates to know if this is the issue)?
Does anyone have any ideas?

Thanks!
Old 04-19-2009, 02:27 AM   #2
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

I will try that reference file myself to see if I can reproduce the segfault. I have never used that toplevel contig file. The odd thing is that bwa segfaults at a step where very little computation has actually been done.

PS: At least on my machine, it does not segfault before you see "calculate SA coordinate...". Could you check whether your cluster imposes a memory limit by default, or whether you are using a 32-bit build?
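
Both checks can be scripted. The sketch below is only an illustration and not part of bwa: it reports the address-space limit of the current process (the value a shell shows with `ulimit -v`) and reads the header of an executable to tell a 32-bit build from a 64-bit one. It assumes a Linux machine and an ELF binary; the function names are made up for this example.

```python
import resource


def address_space_limit():
    """Return the soft address-space limit in bytes, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_AS)
    return None if soft == resource.RLIM_INFINITY else soft


def elf_word_size(path):
    """Return 32 or 64 for an ELF binary, or None if the file is not ELF.

    Byte 4 of an ELF header (EI_CLASS) is 1 for 32-bit and 2 for 64-bit.
    """
    with open(path, "rb") as handle:
        header = handle.read(5)
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    return {1: 32, 2: 64}.get(header[4])
```

Calling `elf_word_size` on the installed bwa executable (wherever it lives on the cluster) and `address_space_limit()` from inside a submitted job would answer both questions for the environment the job actually runs in, which may differ from the login node.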

Last edited by lh3; 04-19-2009 at 06:12 AM.
Old 04-20-2009, 07:52 AM   #3
adkostic
Junior Member
 
Location: Boston

Join Date: Apr 2009
Posts: 4
Default

I appreciate your looking into this.

When I run the job with a higher memory allocation, it takes a little more time to process, outputs 'calculate SA coordinate...' and runs for about a minute before it segfaults. The version of BWA I'm using is 0.4.6 (the latest version on sourceforge). I'm not sure whether that's a 32-bit version.

It's not so important for me to use this toplevel contig database - is there a specific human genome database that you use which works?

Thanks
Old 04-20-2009, 10:15 AM   #4
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

Is it possible for you to put the first 256k reads on an FTP site? I cannot reproduce the segfault with my data. Many thanks.
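
Cutting out the first 256k reads takes only a few lines. The sketch below assumes that "256k" means 262144 reads (the batch size that shows up in the `aln` progress messages in this thread) and that the FASTQ file uses exactly four lines per read. The function name and file names are hypothetical.

```python
import itertools

BLOCK_SIZE = 262144  # "256k" reads; assumed to match one aln batch


def head_fastq(in_path, out_path, n_reads=BLOCK_SIZE):
    """Copy the first n_reads records of a 4-line FASTQ file.

    Returns the number of complete records copied.
    """
    copied_lines = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        # Four lines per record: header, sequence, '+', qualities.
        for line in itertools.islice(src, 4 * n_reads):
            dst.write(line)
            copied_lines += 1
    return copied_lines // 4
```

For example, `head_fastq("reads.fastq", "first_block.fastq")` would write a file small enough to share, provided the segfault still reproduces with it.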
Old 04-20-2009, 11:42 AM   #5
adkostic
Junior Member
 
Location: Boston

Join Date: Apr 2009
Posts: 4
Default

I've posted one paired set at the address below:

ftp://ftp.broad.mit.edu/outgoing/DFC...ublic/adkostic

This is about 40,000 paired reads that are left from my original set (they did not map using my previous aligning methods). But using the cDNA database mentioned above a good portion of these are mapped by BWA, so this should probably also be the case for the full DNA database.

Thanks again.
Old 04-21-2009, 11:56 AM   #6
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

Thanks for posting this. Unfortunately, I cannot download the "F" reads. The error is "550 30BV1.1.F.unmapped.reads.fastq: Permission denied".

I tried the "R" reads and they map fine against the toplevel fasta. A debugger like valgrind did not report any hidden bugs (bugs that cause a segfault on some machines but not on others). I do not think these are the reads causing the segfault. They are probably reads bridging exon boundaries.

The first 256k reads of the fastq that causes the segfault might be more helpful for finding the reason. Thanks.
Old 04-21-2009, 04:52 PM   #7
adkostic
Junior Member
 
Location: Boston

Join Date: Apr 2009
Posts: 4
Default

Thank you for trying those reads out for yourself. Both the "F" and "R" reads cause the segfault on my machine, so the cause seems to be machine-specific. I have not tried to map anything other than these sets of reads with BWA. I received them from a coworker, who got the rest of the reads to align using Arachne; I asked for them so that I could get used to the aligners before my own sequence data comes in. When it does, I'll be using BWA (as well as Maq and Bowtie) to align it, and I'll post here if I'm still having trouble with my complete data set.

Thanks for your help Heng.

Alex
Old 04-22-2009, 04:07 AM   #8
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

I guess this is caused by the configuration of your machines. When valgrind reports nothing, a hidden bug in bwa is less likely. Another test would be to index half of the genome (say, the first 5 chromosomes) and see whether the segfault still occurs. Note that for that toplevel fasta, bwa requires 2.7GB of memory. Maybe this is the problem. Anyway, this is a wild guess and probably does not help.
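
One way to build such a reduced reference is to copy only the first few records of the FASTA file and index the result. The sketch below is a hypothetical helper, not a bwa feature. It takes the first `n_seqs` records in file order, which is an assumption: the leading records of the toplevel file are not guaranteed to be chromosomes 1 to 5.

```python
def head_fasta(in_path, out_path, n_seqs=5):
    """Copy the first n_seqs records of a FASTA file.

    Returns the number of records copied.
    """
    seen = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                if seen == n_seqs:
                    break  # header of record n_seqs + 1: stop here
                seen += 1
            if seen:  # skip anything before the first header
                dst.write(line)
    return seen
```

The output file can then go through the same 'index' and 'aln' steps as the full reference. If the segfault disappears with the smaller reference, that points toward a memory or size-related cause rather than the reads.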

By the way, about your other questions.

1. This toplevel fasta contains different haplotypes for chr6 and chr22 and is not a good reference for read mapping. You can find the reference genome used by the 1000 Genomes Project somewhere on its FTP site (I do not know exactly where). I have reasons to believe that it is the best reference genome for human mapping.

2. For "R" reads, with the default option, bwa maps 433 reads; a more sensitive mode maps 597. Nearly all of the mapped reads contain short indels (the vast majority) or >=3 mismatches.

Last edited by lh3; 04-22-2009 at 04:33 AM.
Old 05-15-2010, 09:12 AM   #9
Fabien Campagne
Member
 
Location: New York City

Join Date: Feb 2010
Posts: 39
Default similar error on different machine

We observe a segmentation fault with bwa 0.5.7 (and 0.5.5) at approximately the same step of alignment on a different machine. The bug seems to be triggered by some datasets only (reference or input reads).

Here's a valgrind output (shown for version 0.5.7):

gobyweb@spanky FSMIQXN-solid-HBR $ valgrind /home/gobyweb/goby/nextgen-tools/bwa/bwa aln -c -l 35 -o 1 -e -1 /scratchLocal/gobyweb/input-data/reference-db/Transcript-GRCh37.57/homo_sapiens/colorspace/bwa/index.00T 14.fastq
==12461== Memcheck, a memory error detector.
==12461== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==12461== Using LibVEX rev 1658, a library for dynamic binary translation.
==12461== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==12461== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==12461== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==12461== For more details, rerun with: -v
==12461==
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... ==12461== Invalid read of size 4
==12461== at 0x402E4B: bwt_2occ (bwt.c:100)
==12461== by 0x404E26: bwt_cal_width (bwtaln.c:63)
==12461== by 0x40511B: bwa_cal_sa_reg_gap (bwtaln.c:122)
==12461== by 0x405631: bwa_aln_core (bwtaln.c:185)
==12461== by 0x4059A1: bwa_aln (bwtaln.c:297)
==12461== by 0x3CF4E1D993: (below main) (in /lib64/libc-2.5.so)
==12461== Address 0x1580D100 is not stack'd, malloc'd or (recently) free'd
==12461==
==12461== Process terminating with default action of signal 11 (SIGSEGV)
==12461== Access not within mapped region at address 0x1580D100
==12461== at 0x402E4B: bwt_2occ (bwt.c:100)
==12461== by 0x404E26: bwt_cal_width (bwtaln.c:63)
==12461== by 0x40511B: bwa_cal_sa_reg_gap (bwtaln.c:122)
==12461== by 0x405631: bwa_aln_core (bwtaln.c:185)
==12461== by 0x4059A1: bwa_aln (bwtaln.c:297)
==12461== by 0x3CF4E1D993: (below main) (in /lib64/libc-2.5.so)
==12461==
==12461== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 5 from 1)
==12461== malloc/free: in use at exit: 114,637,923 bytes in 1,048,662 blocks.
==12461== malloc/free: 1,048,672 allocs, 10 frees, 114,639,241 bytes allocated.
==12461== For counts of detected errors, rerun with: -v
==12461== searching for pointers to 1,048,662 not-freed blocks.
==12461== checked 111,967,840 bytes.
==12461==
==12461== LEAK SUMMARY:
==12461== definitely lost: 0 bytes in 0 blocks.
==12461== possibly lost: 0 bytes in 0 blocks.
==12461== still reachable: 114,637,923 bytes in 1,048,662 blocks.
==12461== suppressed: 0 bytes in 0 blocks.
==12461== Reachable blocks (those to which a pointer was found) are not shown.
==12461== To see them, rerun with: --show-reachable=yes
Segmentation fault

We have a small dataset that triggers the problem. Let me know if you are interested.

Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.5.5 (r1273)
Contact: Heng Li <lh3@sanger.ac.uk>

Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.5.7 (r1310)
Contact: Heng Li <lh3@sanger.ac.uk>
Old 05-15-2010, 02:40 PM   #10
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

Yes, please send me the example file. Thank you.
Old 08-27-2010, 06:20 AM   #11
dawe
Senior Member
 
Location: 45°30'25.22"N / 9°15'53.00"E

Join Date: Apr 2009
Posts: 258
Default

Hi all, I'm now getting a segfault with bwa. Apparently it happens only if I enable threaded alignment on an NFS file system.
Code:
$ bwa aln -t 2 /db/bwa/hg19/hg19.fa s_1_2.fastq > s_1_2.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... Segmentation fault
while

Code:
$ bwa aln  /db/bwa/hg19/hg19.fa s_1_1.fastq > s_1_1.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 114.28 sec
[bwa_aln_core] write to the disk... 0.08 sec
[bwa_aln_core] 262144 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 118.51 sec
[bwa_aln_core] write to the disk... 0.07 sec
[bwa_aln_core] 524288 sequences have been processed.
[bwa_aln_core] calculate SA coordinate...
works

Code:
Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.5.8 (r1442)
Old 04-01-2011, 10:12 AM   #12
golharam
Member
 
Location: Philadelphia, PA

Join Date: Dec 2009
Posts: 55
Default resolution?

I'm seeing this as well. Has there been any resolution to this? I'm running BWA 64-bit v0.5.7
Old 04-08-2011, 06:48 AM   #13
dp05yk
Member
 
Location: Brock University

Join Date: Dec 2010
Posts: 66
Default

Hi golharam,

Try upgrading to the most recent version. If that doesn't help, here's a reply I made on another thread:

Quote:
Lately I had been encountering inexplicable segmentation faults during the 'aln' command for SOLiD reads. The problem occurs when the first read of a 262144 block has a length of zero. This is why it's so rare and so hard to reproduce. I was able to fix this by initializing the max_l variable at the beginning of the bwa_cal_sa_reg_gap function to -1 instead of 0.
It makes sense that it would occur more often with threading enabled, since the chances are multiplied by the number of threads running.

I'd recommend:

1. Upgrading to BWA 0.5.9.
2. If you are still getting segmentation faults, modify line 82 of bwtaln.c: change "max_l = 0", to "max_l = -1", and recompile. This is what fixed it for me.
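
To check whether a given FASTQ file can hit this case at all, one can scan it for empty reads and see where they fall relative to the 262144-read batches. The sketch below is an illustration under stated assumptions, not code from bwa: it expects four lines per read, treats only a literally empty sequence line as "length zero" (reads that become empty after any trimming done by bwa are not detected), and looks only at batch starts, not at how a batch is divided between threads. The function name is hypothetical.

```python
BLOCK_SIZE = 262144  # reads per batch, as reported in the aln progress messages


def zero_length_reads(fastq_path, block_size=BLOCK_SIZE):
    """Yield (read_index, starts_block) for each empty read in a FASTQ file.

    read_index is 0-based. starts_block is True when the read is the
    first one of a batch, the situation described in the quoted post.
    """
    with open(fastq_path) as handle:
        for line_no, line in enumerate(handle):
            # Line 1 of every 4-line record holds the sequence.
            if line_no % 4 == 1 and not line.strip():
                read_index = line_no // 4
                yield read_index, read_index % block_size == 0
```

Any result with `starts_block` set to `True` marks a read worth removing or inspecting before rerunning `aln`. An empty result does not rule out other causes of the segfault.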
Old 04-08-2011, 07:58 AM   #14
golharam
Member
 
Location: Philadelphia, PA

Join Date: Dec 2009
Posts: 55
Default solved

I upgraded bwa to 0.5.9. I rebuilt the index with the latest version and re-aligned with the latest version and everything seems okay.
Old 07-24-2011, 10:23 AM   #15
oiiio
Senior Member
 
Location: USA

Join Date: Jan 2011
Posts: 105
Default

Quote:
Originally Posted by golharam View Post
I upgraded bwa to 0.5.9. I rebuilt the index with the latest version and re-aligned with the latest version and everything seems okay.
Did you need to modify bwtaln.c as suggested? I am having similar segfault errors, even though I am using the latest version of BWA.
Old 07-24-2011, 07:45 PM   #16
golharam
Member
 
Location: Philadelphia, PA

Join Date: Dec 2009
Posts: 55
Default

No, I didn't make any modifications.
Old 10-20-2011, 01:39 PM   #17
aishsk
Junior Member
 
Location: Indiana

Join Date: Oct 2011
Posts: 2
Default Getting the same error

Hi,
I am getting exactly the same error:
[bwa_aln_core] calculate SA coordinate... Segmentation fault
Old 11-11-2011, 07:33 PM   #18
pepperoni
Member
 
Location: US

Join Date: Oct 2011
Posts: 59
Default

I am getting the same error, any new ideas???
Quote:
bwa aln Mito.fa TpNoDotStd.fastq > alnTpNoDotStd.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 3.00 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 262144 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.00 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 524288 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 3.01 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 786432 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 2.96 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 1048576 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 2.97 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 1310720 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 2.99 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 1572864 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 2.98 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 1835008 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 2.99 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 2097152 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 2.98 sec
[bwa_aln_core] write to the disk... 0.02 sec
[bwa_aln_core] 2359296 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... zsh: segmentation fault bwa aln MitoNC_012920.fa TeopNoDotStd.fastq > alnTeopNoDotStd.sai
Old 12-20-2011, 11:23 PM   #19
kenietz
Member
 
Location: Singapore

Join Date: Nov 2011
Posts: 85
Default

Hi,
I get a segfault like this one with my bwa-0.6.1-r104:
------------------
[bwa_aln_core] refine gapped alignments... sh: line 1: 12337 Segmentation fault
-------------------

It happens when I modify the '-l' switch. When I set it to anything below 19, it produces the error. I got it when aligning SOLiD reads.

Should I rebuild bwa as per dp05yk's suggestion (modify line 82 of bwtaln.c)? Any ideas are appreciated.

Thank you.
Old 12-20-2011, 11:39 PM   #20
kenietz
Member
 
Location: Singapore

Join Date: Nov 2011
Posts: 85
Default

Hi again,
I just rebuilt bwa as suggested but got the same error.

Now I'm out of ideas and I have to stick to -l 19 or larger.
Tags
aligners, bwa, illumina sequencing, mapping, problems
