SEQanswers

Old 09-29-2010, 06:05 AM   #1
scami
Member
 
Location: italy

Join Date: Sep 2010
Posts: 55
Default soap segmentation fault

Hi Guys

I am running SOAPaligner on FASTQ files and I get a "segmentation fault" at the end of the process. I created the input files myself with a script that extracts the reads from an old alignment file, so I suspect there may be a problem with them, although at first sight they look OK. The command I used is inside a file called "go" and is the following:

Code:
./soap -D ./index/genome.fasta.index -a exp_47_s_A1.fastq  -b exp_47_s_A2.fastq -o paired_mapped_v2g3r1_1 -u unpaired_v2g3r1_1 -2 single_mapped_v2g3r1_1 -v 2 -g 3 -m 50 -x 400 -r 1 -t  -p 14

I get the following output:

Code:
Begin Program SOAPaligner/soap2
Wed Sep 29 15:34:14 2010
Reference: ./index/genome.fasta.index
Query File a: exp_47_s_A1.fastq
Query File b: exp_47_s_A2.fastq
Output File: paired_mapped_v2g3r1_1
             single_mapped_v2g3r1_1
             unpaired_v2g3r1_1
Load Index Table ...
lsLoad Index Table OK
Begin Alignment ...
 131072 ok    3.36 sec
..................................
..................................
24510464 ok    3.96 sec
24641536 ok    3.73 sec
24772608 ok    3.79 sec
24903680 ok    3.63 sec
25034752 ok    3.86 sec
./go: line 1: 25344 Segmentation fault
The last lines of my input files are:
Code:
tail exp_47_s_A1.fastq 
+ILLUMINA-C3C24B_0047:1:120:18879:21119#0/1
abb\bb_]__]]]]]KKDOOWZWWWbbabbbbbbbbbbbbb_bbbOODDOOONNNb\bbba`]Xa`Ya^``_[bb
@ILLUMINA-C3C24B_0047:1:120:18877:8210#0/1
agcagatcatgtggtganggactcggctggtcacagtcaggctgtgagccgatggtttgcccctcccccagggat
+ILLUMINA-C3C24B_0047:1:120:18877:8210#0/1
bbbbbbbbbbbbbbb``F^`aaaaabbbbb_ba`baab_a``_`BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
@ILLUMINA-C3C24B_0047:1:120:18872:16339#0/1
CTTGAAAACCTAGAATCAACACAAAATGAAAAAAAAAAAAAGCCCAAAAAAATGGCTTCCAAACCAGAAAactga
+ILLUMINA-C3C24B_0047:1:120:18872:16339#0/1
bbbbbbbbbbbbabbbabb_bbbbbbbbbbbbbb______ZL_[]`_`__b]Y\`]\\`O^]]W^bVb]bBBBBB



tail exp_47_s_A2.fastq 
+ILLUMINA-C3C24B_0047:1:120:18879:21119#0/2
bb^bbbbbbbabbbbS^W[^bbb_bbbbbbVUFZVOIKKO[ZVXWLTWWTT^^^[]RRR__BBBBBBBBBBBBBB
@ILLUMINA-C3C24B_0047:1:120:18877:8210#0/2
GTATTATCTACTGTGAGAGGAGTTGAGATCCGATTGAGTCCCGAGAGTATCTgtcgcattctcgacatcccttcg
+ILLUMINA-C3C24B_0047:1:120:18877:8210#0/2
bbbbbbbbbbbbbbbbbbbbbbbbb_bbbbabbbbbbbc`c__ababab^U^BBBBBBBBBBBBBBBBBBBBBBB
@ILLUMINA-C3C24B_0047:1:120:18872:16339#0/2
CAAATATGCAGCTCAAATGTCATCCCTGCATGCTCTAATACCAATTGATGAACTTTTAaacgacataggatcaca
+ILLUMINA-C3C24B_0047:1:120:18872:16339#0/2
bbbbb`bbbbbbbbbbbbb_bbbbbbbbbb_b^]b`abb^bbb`aaa`^`aaU]^^a_BBBBBBBBBBBBBBBBB

Everything looks right to me... any ideas?

thanks a lot for helping
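Since the input files were generated by a custom script, a structural check can rule out the most obvious formatting problems before blaming the aligner. The following is a minimal sketch, assuming plain 4-line FASTQ records (no wrapped sequences); the function name `check_fastq` is hypothetical and not part of SOAP. It does not prove that the files are acceptable to SOAPaligner, only that each record has the expected shape.

```python
def check_fastq(lines):
    """Yield (record_number, problem) for each malformed 4-line FASTQ record.

    `lines` is any iterable of text lines, for example an open file handle.
    Assumption: records are exactly four lines long (no line wrapping).
    """
    lines = iter(lines)
    record = 0
    for header in lines:
        header = header.rstrip("\r\n")
        if not header:
            # Stray blank lines are reported but not counted as records.
            yield record + 1, "blank line where a header was expected"
            continue
        record += 1
        # Missing lines at end of file are treated as empty strings.
        seq = next(lines, "").rstrip("\r\n")
        plus = next(lines, "").rstrip("\r\n")
        qual = next(lines, "").rstrip("\r\n")
        if not header.startswith("@"):
            yield record, "header does not start with '@'"
        if not plus.startswith("+"):
            yield record, "separator line does not start with '+'"
        elif len(plus) > 1 and plus[1:] != header[1:]:
            yield record, "name on '+' line differs from header"
        if not seq:
            yield record, "empty sequence"
        if len(seq) != len(qual):
            yield record, "sequence and quality lengths differ"


if __name__ == "__main__":
    import sys
    # Usage: python check_fastq.py exp_47_s_A1.fastq
    with open(sys.argv[1]) as handle:
        for number, problem in check_fastq(handle):
            print(f"record {number}: {problem}")
```

Running it on both read files and comparing the record counts would also show whether the two files are still in sync.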
Old 04-16-2012, 09:04 AM   #2
cwisch88
Member
 
Location: St. Louis

Join Date: Jan 2012
Posts: 15
Default

Did you ever find an answer to this? I'm having the exact same problem!
Old 04-16-2012, 10:38 AM   #3
scami
Member
 
Location: italy

Join Date: Sep 2010
Posts: 55
Default

No, unfortunately! I used bwa for my alignments, which is quite good and fast. I have also read great things about bowtie2, which has recently been released. I will give it a try soon. Good luck!
Old 04-16-2012, 11:17 AM   #4
cwisch88
Member
 
Location: St. Louis

Join Date: Jan 2012
Posts: 15
Default

Okay well let me get the full story here.

What does the command "go" do? Are these reads that you suspect to be the problem?

From what I can tell, when this happens to me I get a number like 1310720 ok X.XX sec. and then I receive the segmentation fault.

So far I have deduced that the number represents how many read pairs that it has processed before failing.

Now, it only reports that the alignments are okay for each batch of 131072, so if I take out that block of reads, it continues until it hits something else!

I'm thinking it might be a q-score problem, but I'm having trouble wrapping my mind around the standards:

FASTQ formats
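One way to approach the q-score question is to look at the range of ASCII codes that actually occur in the quality lines. The sketch below is a heuristic only, assuming 4-line FASTQ records and the usual conventions that offset-33 files use characters up to about 'J' (ASCII 74) while offset-64 files never go below ';' (ASCII 59). The function names are hypothetical. The quality strings posted earlier in this thread run from 'B' to 'c', which this heuristic would classify as offset 64.

```python
def quality_range(lines):
    """Return (lowest, highest) ASCII code seen on the quality lines.

    Assumes 4-line records, so every fourth line is a quality string.
    Returns (255, 0) if no quality characters were seen.
    """
    lo, hi = 255, 0
    for index, line in enumerate(lines):
        if index % 4 == 3:
            codes = [ord(char) for char in line.rstrip("\r\n")]
            if codes:
                lo = min(lo, min(codes))
                hi = max(hi, max(codes))
    return lo, hi


def guess_offset(lo, hi):
    """Guess the Phred offset (33 or 64) from the observed code range.

    Heuristic: codes below 59 cannot occur with offset 64, and codes
    above 74 are not expected with offset 33. Returns None if ambiguous.
    """
    if lo < 59:
        return 33
    if hi > 74:
        return 64
    return None


if __name__ == "__main__":
    import sys
    # Usage: python guess_offset.py reads_1.fastq
    with open(sys.argv[1]) as handle:
        lo, hi = quality_range(handle)
    print(f"codes {lo}-{hi}, guessed offset: {guess_offset(lo, hi)}")
```

If the guess disagrees with what the aligner expects, converting the qualities before aligning would be the next thing to try.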
Old 04-16-2012, 09:58 PM   #5
scami
Member
 
Location: italy

Join Date: Sep 2010
Posts: 55
Default

Hi there,

So let me get this clear. When you write:

Now, it only reports that the alignments are okay for each batch of 131072, so if I take out that block of reads, it continues until it hits something else!

do you mean that you cut the file containing the reads from line 131072 to the end, and still got the same error?

Generally a "segmentation fault" happens (at least in C and C++) when a program accesses memory it is not allowed to touch. Common causes include, among others:
1) the software tries to read from a file that could not be opened and does not check for the failure. This is not our case, since all the files are recognised and opened.
2) the available memory is not sufficient, an allocation fails, and the program uses the result anyway.
3) a value read from a file is too big for the memory reserved for it, so the program writes past the end of that variable or buffer.

If all the reads are OK, then point 3 should not happen. To check whether point 2 is happening, I would split the file containing the reads into sub-files 131072 lines long, run the alignment on each, and see whether the software fails again.

Let me know whether this suggestion helps.
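The splitting step could be sketched as follows. This is a minimal example, assuming 4-line FASTQ records and chunking by reads rather than by raw lines, so that no record is cut in half; the function names and the output naming scheme are hypothetical. Using the same `reads_per_chunk` on both paired files keeps the mates in matching chunks.

```python
from itertools import islice


def fastq_chunks(lines, reads_per_chunk):
    """Yield lists of raw lines, each holding up to reads_per_chunk records.

    `lines` is any iterator of text lines, for example an open file handle.
    Assumes every record is exactly four lines long.
    """
    lines_per_chunk = 4 * reads_per_chunk
    while True:
        chunk = list(islice(lines, lines_per_chunk))
        if not chunk:
            return
        yield chunk


def split_fastq(path, reads_per_chunk, prefix):
    """Write chunks to <prefix>_0000.fastq, <prefix>_0001.fastq, and so on.

    Returns the list of file names written.
    """
    names = []
    with open(path) as handle:
        for number, chunk in enumerate(fastq_chunks(handle, reads_per_chunk)):
            name = f"{prefix}_{number:04d}.fastq"
            with open(name, "w") as out:
                out.writelines(chunk)
            names.append(name)
    return names


if __name__ == "__main__":
    # Hypothetical usage: 65536 reads per chunk from each file of the pair.
    split_fastq("exp_47_s_A1.fastq", 65536, "A1_chunk")
    split_fastq("exp_47_s_A2.fastq", 65536, "A2_chunk")
```

Aligning each pair of chunks separately should then show whether the crash follows a particular chunk or depends on the total amount of data.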
Old 04-17-2012, 04:38 AM   #6
cwisch88
Member
 
Location: St. Louis

Join Date: Jan 2012
Posts: 15
Default

A FASTQ file has 4 lines per read. When it segfaults at 25034752, it means that it has gone through 12517376 reads from reads_1 and 12517376 from reads_2, and something in the next 131072 reads makes it choke. At least, that is what I gather from the evidence here.
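To make that arithmetic concrete, here is a small sketch that turns the last "ok" counter into read and line numbers for the suspect block in each file. It assumes, as I guessed above, that the counter is the total number of reads from both files and that each progress line covers 131072 reads; neither is confirmed by the SOAP documentation. The function name is hypothetical.

```python
BATCH = 131072  # reads per progress line, as seen in the log output


def suspect_block(last_ok_count, batch=BATCH, paired=True):
    """Locate the block after the last 'ok' line, per input file.

    Assumption: the counter adds up reads from both files when paired.
    Read and line numbers are 1-based; a FASTQ record is four lines.
    """
    files = 2 if paired else 1
    reads_done = last_ok_count // files   # reads already processed per file
    reads_in_block = batch // files       # reads per file in the next batch
    last_read = reads_done + reads_in_block
    return {
        "batches_done": last_ok_count // batch,
        "first_read": reads_done + 1,
        "last_read": last_read,
        "first_line": 4 * reads_done + 1,
        "last_line": 4 * last_read,
    }


if __name__ == "__main__":
    for key, value in suspect_block(25034752).items():
        print(f"{key}: {value}")
```

Those line numbers can then be used to extract just the suspect block from each file and rerun the aligner on it.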

When I removed the first chunk it did continue further than it had before and then dropped out again.

I thought it was a memory issue, but then I saw it was failing in the same place no matter how much memory I threw at it.

Luckily, I think I have a small dataset that chokes on this. I will try the idea of splitting it into files of 65536 reads from each of the read files and seeing whether something turns up.

Now, I'm really new to soapaligner and I'm not familiar with all of the options, so it is not out of the realm of possibility that my problem is a maximum insert size problem.

Could it be that soapaligner expects a certain insert size and if those qualifications aren't met it can segfault (point 3)?

I hope I have more helpful info today.
Old 04-17-2012, 06:08 AM   #7
cwisch88
Member
 
Location: St. Louis

Join Date: Jan 2012
Posts: 15
Default

This morning has been productive so far: removing the -g option allows the alignment to go through. So I'm wondering whether my original allowed gap size (6 bp) is too large. I guess I'll keep chronicling things until I solve this headache.

Last edited by cwisch88; 04-18-2012 at 06:56 AM.