Hi folks,
I'm using Pilon to polish a PacBio-assembled genome with Illumina data, and I'm confused about how Pilon treats single "N" characters that stand for an ambiguous base. There aren't gaps in the scaffolds, just single N characters, each representing one ambiguous base. For many of these there should be good read support to correct the N to an A, C, T, or G, but my current Pilon runs leave them untouched. Pilon is correcting other ambiguous bases (e.g. R, Y, K) to the correct base, but it is ignoring all Ns. Any ideas?
The command I'm running is:
Code:
java -Xmx120g -jar ~/software/anaconda2/pkgs/pilon-1.22-1/share/pilon-1.22-1/pilon-1.22.jar \
    --genome ref.fasta \
    --frags aln.sorted.bam \
    --unpaired u.sorted.bam \
    --changes --vcf --tracks \
    --threads 16 \
    --fix bases,amb \
    --outdir pilon_02
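In case it helps with diagnosing this, here is a rough sketch of how the single-N positions could be listed so they can be compared before and after polishing. The function name `list_isolated_n` is made up for illustration and is not part of Pilon; it assumes a plain FASTA file, treats lowercase `n` the same as `N`, and reports 1-based positions only for Ns that have no N on either side.

```shell
# Hypothetical helper (not part of Pilon): print "contig<TAB>position"
# (1-based) for every N that is not adjacent to another N in a FASTA file.
list_isolated_n() {
    awk '
        # Scan the sequence collected so far and report runs of exactly one N.
        function report(   rest, offset) {
            rest = seq
            offset = 0
            while (match(rest, /N+/)) {
                if (RLENGTH == 1)
                    print name "\t" (offset + RSTART)
                offset += RSTART + RLENGTH - 1
                rest = substr(rest, RSTART + RLENGTH)
            }
        }
        # Header line: finish the previous record, then start a new one.
        /^>/ { report(); name = substr($1, 2); seq = ""; next }
        # Sequence line: join wrapped lines so runs across line breaks are seen.
        { seq = seq toupper($0) }
        END { report() }
    ' "$1"
}
```

Running `list_isolated_n ref.fasta` on the input, and again on the polished FASTA that Pilon writes into `pilon_02`, should show whether any single Ns were changed. Positions in the two files may not line up exactly if Pilon also applied insertions or deletions upstream, so the `--changes` output is probably the safer thing to cross-check against.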