I'm trying to associate some genomic contigs with one of several references using MUMmer. What I can't figure out is how to remove all of the spurious hits.
I align the sequences with nucmer:
nucmer -p <output> <reference> <query>
then remove the short and low-quality hits with delta-filter:
delta-filter -l <length> -i <quality>
That gets me ~90% of the way there, but some query contigs, primarily those around rDNA genes, still align to multiple references, and I can't figure out how to get rid of those extra hits.
delta-filter has the options "-r", "-q", and "-g", which appear to address this sort of thing, but they are neither documented nor explained well, and they don't generate the sort of output that I expect: some pseudo-hits still remain.
Can someone familiar with nucmer/MUMmer please explain how to filter out the secondary hits from some of my query sequences, or explain what the above delta-filter options are supposed to do?
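To make the desired result concrete, here is a minimal Python sketch of a "one reference per contig" post-filter. It is an illustration under assumptions, not part of MUMmer and not a description of what delta-filter does: the function name `keep_best_reference`, the tuple layout (query, reference, alignment length, percent identity), and the scoring rule (summed length times identity per query/reference pair) are all hypothetical choices made for this example.

```python
from collections import defaultdict


def keep_best_reference(hits):
    """Keep only the hits to each query's best-scoring reference.

    `hits` is an iterable of (query, reference, aln_len, pct_identity)
    tuples. This layout is an assumption for illustration; it is not a
    MUMmer file format.
    """
    hits = list(hits)

    # Score every (query, reference) pair by summed length * identity.
    score = defaultdict(float)
    for query, ref, aln_len, pct_idy in hits:
        score[(query, ref)] += aln_len * pct_idy / 100.0

    # For each query, remember the reference with the highest score.
    # On a tie, the reference seen first is kept.
    best = {}
    for (query, ref), s in score.items():
        if query not in best or s > score[(query, best[query])]:
            best[query] = ref

    # Drop every hit that is not to the query's best reference.
    return [h for h in hits if best[h[0]] == h[1]]


if __name__ == "__main__":
    example = [
        ("contig1", "refA", 5000, 99.0),
        ("contig1", "refB", 800, 97.0),   # secondary hit, e.g. an rDNA repeat
        ("contig2", "refB", 3000, 98.5),
    ]
    for hit in keep_best_reference(example):
        print(hit)
```

In this sketch the short contig1 hit to refB is discarded because contig1's alignments to refA score higher in total. The per-hit values would have to be extracted from the alignment results first, and that parsing step is not shown here.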
Sincerely,
Brett