SEQanswers


Mansequencer 09-09-2010 02:40 PM

SOAP: segmentation fault
 
Hi All,
I am trying to map Solexa paired-end (PE) reads (FASTQ) from different strains onto a reference scaffolds file (FASTA) using soap2. My questions are:
1. The alignment runs, but soap does not seem to read the sequence data from all 8 strains given in the command: the number of reads it processes is too low to account for all 8 strains.
2. When I try to msort (-k 8,n9) the PE output file from the alignment, it fails with a segmentation fault. The SE output, however, sorts fine with the same command.
3. SNP calling with soapsnp (-d <ref> -i <SEoutput.sort> -r 0.00005 -e 0.0001 -t -u -L <100>) generates a huge file, many times larger than the original file. Is that usual?
Thanks for the patient reading.

Awesome 11-19-2010 11:47 AM

Hi Mansequencer,

For some reason, msort doesn't work on files over a certain size. It works just fine on small files with the command:
Code:

$ msort -k 8,n9 mapped.out > mapped.out.sort
To get around this, you can use the sort command that ships with most Linux and Unix systems:
Code:

sort -t $'\t' -k 8f,8 -k 9n,9 mapped.out > mapped.out.sort
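For anyone who wants to try the key syntax before running it on a full alignment, here is a minimal sketch of the same sort on made-up data. The three rows, the read IDs, and the scratch directory are invented placeholders, not real SOAP output. The `-T` option (available in GNU and BSD sort) sends sort's temporary files to a directory of your choosing, which may help with large inputs if the default temp location is short on space.

```shell
# Made-up 9-column, tab-separated rows; only columns 8 and 9 matter here.
workdir=$(mktemp -d)
mkdir "$workdir/scratch"
tab=$(printf '\t')   # portable stand-in for bash's $'\t'

printf 'r1\tSEQ\tQUAL\t1\ta\t36\t+\tscaffold2\t500\n'  > "$workdir/mapped.out"
printf 'r2\tSEQ\tQUAL\t1\ta\t36\t-\tscaffold1\t90\n'  >> "$workdir/mapped.out"
printf 'r3\tSEQ\tQUAL\t1\ta\t36\t+\tscaffold1\t7\n'   >> "$workdir/mapped.out"

# Same keys as the command above: column 8 as text, then column 9 as a number.
# -T points sort's temporary files at the scratch directory.
sort -T "$workdir/scratch" -t "$tab" -k 8f,8 -k 9n,9 \
    "$workdir/mapped.out" > "$workdir/mapped.out.sort"

# Read IDs in sorted order, joined onto one line for a quick look.
sorted_ids=$(cut -f 1 "$workdir/mapped.out.sort" | paste -s -d ' ' -)
echo "$sorted_ids"
```

Because column 9 is compared numerically, position 7 sorts ahead of position 90 within the same scaffold, which a plain text sort would not do.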
To check that your file is indeed sorted, you can isolate just the two columns in question, the chromosome and position, and look through them:
Code:

cut -f 8,9 mapped.out.sort > col_89.txt
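If eyeballing two columns of a large file is impractical, `sort -c` can do the check instead: it exits with status 0 only when the input is already in order for the given keys. The sketch below runs it on a small made-up file (the rows and the `check_sorted` helper are invented for illustration), and assumes the check uses the same keys and locale as the original sort.

```shell
# Made-up 9-column, tab-separated rows, deliberately out of order.
workdir=$(mktemp -d)
tab=$(printf '\t')   # portable stand-in for bash's $'\t'

printf 'r1\tSEQ\tQUAL\t1\ta\t36\t+\tscaffold2\t500\n'  > "$workdir/mapped.out"
printf 'r2\tSEQ\tQUAL\t1\ta\t36\t-\tscaffold1\t90\n'  >> "$workdir/mapped.out"
printf 'r3\tSEQ\tQUAL\t1\ta\t36\t+\tscaffold1\t7\n'   >> "$workdir/mapped.out"

sort -t "$tab" -k 8f,8 -k 9n,9 "$workdir/mapped.out" > "$workdir/mapped.out.sort"

# Hypothetical helper: prints "sorted" if sort -c accepts the file, else "unsorted".
check_sorted() {
    if sort -c -t "$tab" -k 8f,8 -k 9n,9 "$1" 2>/dev/null; then
        echo "sorted"
    else
        echo "unsorted"
    fi
}

raw_status=$(check_sorted "$workdir/mapped.out")
sorted_status=$(check_sorted "$workdir/mapped.out.sort")
echo "raw: $raw_status, after sort: $sorted_status"
```

Dropping the `2>/dev/null` makes `sort -c` report the first out-of-order line, which is handy for finding where a file goes wrong.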
SOAPsnp generates a huge file on purpose. Either of the other output formats, GLFv2 or GPFv2, may give a smaller output file. Try those.

Best of luck,
-Awesome

