I am trying to align a multi-FASTA file containing various regions of interest from a reference genome to a contig assembled from PacBio reads generated from a related strain, just to get a picture of which regions are present and the degree of mismatch/rearrangement.
I've been trying to use nucmer, but I am getting a strange error, which I've posted to the mummer-help mailing list and am still awaiting a response:
"postnuc: tigrinc.cc:337: int Read_String(FILE*, char*&, long int&, char*, int): Assertion `Len > 0 && Line [Len - 1] == '\n'' failed."
The syntax of the error message made me suspect there was a formatting issue with my sequences, but I could find no empty sequences, and no missing newline characters.
I switched to just trying to run the alignment via blastn with the align 2 sequences option, but received the following error:
"Message: Message: NCBI C++ Exception:# "local_db_adapter.cpp", line 123: Error: ncbi::blast::s_CheckForBlastSeqSrcErrors() - NCBI C++ Exception:# "blast_setup.hpp", line 190: Error: Sequence contains no data##"
Once again it appears to be a formatting error in the fasta files, but I cannot find any empty sequences.
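A check of this kind can be sketched in Python as below. This is an illustrative sketch rather than the exact procedure used on these files; `check_fasta` is a hypothetical helper name. It looks for empty records, a missing final newline, and carriage returns, which some parsers may not treat as line ends.

```python
def check_fasta(data: bytes) -> list:
    """Return a list of human-readable problems found in raw FASTA bytes."""
    if not data:
        return ["file is empty"]
    problems = []
    # Carriage returns (Windows or old Mac line endings) can confuse parsers
    # that only look for '\n'.
    if b"\r" in data:
        problems.append("carriage returns present")
    if not data.endswith(b"\n"):
        problems.append("last line has no trailing newline")

    header = None   # name of the record currently being read
    length = 0      # number of sequence characters seen for that record
    # Normalise line endings for the per-record pass only.
    normalised = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    for line in normalised.split(b"\n"):
        line = line.strip()
        if line.startswith(b">"):
            if header is not None and length == 0:
                problems.append(f"empty sequence: {header}")
            header = line[1:].decode(errors="replace")
            length = 0
        elif line:
            if header is None:
                problems.append("sequence data before first header")
                header = "<no header>"
            length += len(line)
    # The last record has no following header, so check it here.
    if header is not None and length == 0:
        problems.append(f"empty sequence: {header}")
    return problems
```

To use it, read a file in binary mode, for example `check_fasta(open("features.fasta", "rb").read())`, where the file name is a placeholder. An empty list means none of these particular problems were found. Reading as bytes matters here, because text mode would silently translate the line endings.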
Looking into it, I thought it might be an issue with the line length in the sequences. I've previously run nucmer on 85 kbp sequences written on a single line in the FASTA file without any issues, but I figured I would try it anyway.
So I transformed the FASTA files to have only 80 characters per line in the sequences (a maximum per-line limit I read about somewhere, though this does not seem to be a standard rule).
These transformed files gave the exact same error messages in both nucmer and blastn.
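The re-wrapping step can be sketched as follows. This is a minimal illustration, not the tool actually used to produce the 80-character files; `wrap_fasta` is a hypothetical helper name and the default width of 80 simply mirrors the limit mentioned above.

```python
def wrap_fasta(text: str, width: int = 80) -> str:
    """Re-wrap every sequence in FASTA text to at most `width` characters per line."""
    out = []   # output lines
    seq = []   # sequence fragments of the record being read

    def flush():
        # Join the fragments collected so far and emit them in fixed-width chunks.
        joined = "".join(seq)
        out.extend(joined[i:i + width] for i in range(0, len(joined), width))
        seq.clear()

    # splitlines() accepts '\n', '\r\n' and '\r', so the output always uses '\n'.
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            flush()
            out.append(line)
        elif line:
            seq.append(line)
    flush()
    return "\n".join(out) + "\n" if out else ""
```

A side effect of this approach is that the output always ends with a newline and uses Unix line endings, whatever the input used.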
I must be missing something, and if these error messages are anything to go by it should be something obvious but I just can't seem to find it.
Couldn't attach the files due to size, so I've uploaded them here:
Features, 80 char per line: http://pastebin.com/E1MyBstr
Features: http://pastebin.com/rfsiiyKJ
Contig (view as Raw): http://pastebin.com/tyVsEP8g
Contig, 80 char per line: http://pastebin.com/DmZBCtiv
Any help would be vastly appreciated!
P.S. I thought I could use bwa to do this, but apparently that is just for aligning FASTQ reads to a reference?