Dear all,
I am trying to run Ray to de novo assemble a nematode genome.
I run into the error shown in the Ray log below (first Code block).
The problem Ray reports (the left and right read files do not contain the same number of sequences) is wrong. Ray does not seem to count the sequences correctly: grep finds the same number of headers in both files (second Code block below).
The head of my read files looks completely normal (regular multi-line FASTA).
I have also run other assemblers successfully on this data, so I know there is no format problem with the files.
Any insights on what could be causing this problem?
Best Wishes,
Sophie
Code:
mpirun \
  -n 32 \
  /mnt/Programs/Ray-2.3.1/Ray \
  -k 81 \
  -o rayk81 \
  -p ../../clean_reads/PE_1.noCont.ec.fa ../../clean_reads/PE_2.noCont.ec.fa \
  -p ../../clean_reads/MP3_1.noCont.ec.fa ../../clean_reads/MP3_2.noCont.ec.fa \
  -p ../../clean_reads/MP5_1.noCont.ec.fa ../../clean_reads/MP5_2.noCont.ec.fa \
  -p ../../clean_reads/MP8_1.noCont.ec.fa ../../clean_reads/MP8_2.noCont.ec.fa
[.....]
Rank 7: File ../../clean_reads/MP8_2.noCont.ec.fa (Number 7) has 10233322 sequences
Rank 6: File ../../clean_reads/MP8_1.noCont.ec.fa (Number 6) has 10231913 sequences
Rank 5: File ../../clean_reads/MP5_2.noCont.ec.fa (Number 5) has 10722610 sequences
Rank 2: File ../../clean_reads/MP3_1.noCont.ec.fa (Number 2) has 14151655 sequences
Rank 4: File ../../clean_reads/MP5_1.noCont.ec.fa (Number 4) has 10722031 sequences
Rank 3: File ../../clean_reads/MP3_2.noCont.ec.fa (Number 3) has 14152522 sequences
Rank 0: File ../../clean_reads/PE_1.noCont.ec.fa (Number 0) has 100860164 sequences
Rank 1: File ../../clean_reads/PE_2.noCont.ec.fa (Number 1) has 100860164 sequences
Rank 0 wrote rayk81/NumberOfSequences.txt
Rank 0 wrote rayk81/SequencePartition.txt
Rank 0 : Error, ../../clean_reads/MP3_1.noCont.ec.fa contains 14151655 sequences and ../../clean_reads/MP3_2.noCont.ec.fa contains 14152522 sequences (must be the same)
Code:
grep -c '^>' ../../clean_reads/MP3_1.noCont.ec.fa
9763950
grep -c '^>' ../../clean_reads/MP3_2.noCont.ec.fa
9763950
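For anyone who wants to repeat the grep check on every left/right pair at once, here is a minimal sketch. The function names count_fasta_records and check_pair are hypothetical helpers of my own, not part of Ray, and the sketch assumes that every FASTA record starts with a header line beginning with '>'. It only compares header counts; it does not reproduce how Ray itself counts sequences.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Ray): compare FASTA header counts
# between the two files of a read pair.

# Print the number of lines that start with '>' in the given file.
count_fasta_records() {
    # grep -c exits with status 1 when the count is 0, so ignore its status.
    grep -c '^>' "$1" || true
}

# Print OK or MISMATCH for a left/right pair; return non-zero on mismatch
# or when either file cannot be read.
check_pair() {
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "UNREADABLE $1 $2"
        return 2
    fi
    left_count=$(count_fasta_records "$1")
    right_count=$(count_fasta_records "$2")
    if [ "$left_count" -eq "$right_count" ]; then
        echo "OK $left_count $1 $2"
    else
        echo "MISMATCH $left_count $right_count $1 $2"
        return 1
    fi
}
```

After sourcing the functions, a pair from the command above could be checked with: check_pair ../../clean_reads/MP3_1.noCont.ec.fa ../../clean_reads/MP3_2.noCont.ec.fa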