Mansequencer 09-09-2010 01:40 PM

SOAP: segmentation fault
Hi All,
I am trying to map Solexa PE reads (fastq) from different strains onto a reference scaffold file (fasta) using SOAP2. The questions I have are:
1. The alignment runs, but SOAP does not seem to read the sequence data for all 8 strains given in the command: the number of reads it processes is too low to cover all 8 strains.
2. When I try to sort the PE output file from the alignment with msort (-k 8,n9), it fails with a segmentation fault; however, the SE output sorts fine with the same command.
3. SNP calling with SOAPsnp (-d <ref> -i <SEoutput.sort> -r 0.00005 -e 0.0001 -t -u -L <100>) generates a huge file, many times larger than the original file. Is that usual?
Thanks for reading patiently.

Awesome 11-19-2010 10:47 AM

Hi Mansequencer,

For some reason, msort fails on files over a certain size. It works just fine on small files with this command:

$ msort -k 8,n9 mapped.out > mapped.out.sort
To get around this, you can use the sort command that ships with most Linux and Unix distributions:

sort -t $'\t' -k 8f,8 -k 9n,9 mapped.out > mapped.out.sort
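If the input is very large, it may help to give sort an explicit memory buffer and a scratch directory with plenty of free space. A minimal sketch, assuming GNU coreutils sort (the -S and -T options); the 64M buffer size, the scratch directory, and the three-line sample file are placeholders for illustration, not values from this thread:

```shell
# Hypothetical tiny sample standing in for a large SOAP output file:
# column 8 = reference name, column 9 = position, columns 2-7 are filler.
workdir=$(mktemp -d)
printf 'read1\tx\tx\tx\tx\tx\tx\tscaffold2\t50\n'   > "$workdir/mapped.out"
printf 'read2\tx\tx\tx\tx\tx\tx\tscaffold1\t200\n' >> "$workdir/mapped.out"
printf 'read3\tx\tx\tx\tx\tx\tx\tscaffold1\t30\n'  >> "$workdir/mapped.out"

tab=$(printf '\t')   # portable way to pass a literal tab to -t

# -S sets the size of the in-memory buffer (GNU sort);
# -T sets the directory used for temporary spill files;
# LC_ALL=C compares bytes directly instead of using locale collation.
LC_ALL=C sort -S 64M -T "$workdir" -t "$tab" -k 8f,8 -k 9n,9 \
    -o "$workdir/mapped.out.sort" "$workdir/mapped.out"

# Show the two sort columns of the result.
cut -f 8,9 "$workdir/mapped.out.sort"
```

For a real run, point -T at a disk with enough room for the temporary files and raise -S to whatever the machine can spare.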
To check whether your files are indeed sorted, you can extract just the two columns in question, the chromosome and position, and inspect them:

cut -f 8,9 mapped.out.sort > col_89.txt
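Rather than inspecting the columns by eye, sort itself can verify the order: the -c option checks a file without rewriting it and reports the result through its exit status. A minimal sketch, assuming the same sort keys as above; the three-line sample file and the variable names are made up for illustration:

```shell
# Hypothetical tiny sample: column 8 = reference name, column 9 = position.
workdir=$(mktemp -d)
printf 'read1\tx\tx\tx\tx\tx\tx\tscaffold2\t50\n'   > "$workdir/mapped.out"
printf 'read2\tx\tx\tx\tx\tx\tx\tscaffold1\t200\n' >> "$workdir/mapped.out"
printf 'read3\tx\tx\tx\tx\tx\tx\tscaffold1\t30\n'  >> "$workdir/mapped.out"

tab=$(printf '\t')   # portable way to pass a literal tab to -t

# Sort by column 8 (case-folded), then column 9 (numeric).
LC_ALL=C sort -t "$tab" -k 8f,8 -k 9n,9 "$workdir/mapped.out" \
    > "$workdir/mapped.out.sort"

# -c only checks the order: exit status 0 means sorted, non-zero means not.
sorted_status=0
LC_ALL=C sort -c -t "$tab" -k 8f,8 -k 9n,9 "$workdir/mapped.out.sort" \
    || sorted_status=$?

# The unsorted input should fail the same check (diagnostic suppressed).
unsorted_status=0
LC_ALL=C sort -c -t "$tab" -k 8f,8 -k 9n,9 "$workdir/mapped.out" 2>/dev/null \
    || unsorted_status=$?

echo "sorted file: $sorted_status, unsorted file: $unsorted_status"
```

The check has to use the same -t and -k options (and the same locale) as the original sort, otherwise a correctly sorted file can be reported as out of order.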
SOAPsnp generates a huge file by design. The other output formats, GLFv2 and GPFv2, may produce a smaller file. Try those.

Best of luck,

All times are GMT -8.