I looked through previous threads but couldn't find anything about this issue: when I run VarScan to retrieve SNPs, some variants are reported with + or - symbols, some followed by a single letter and some by several letters. Examples:
-G
+GGGGCCTGGTACAGCGGCT
-T
-AGAGAGAGAGAGAGAG
+ATTTCCT
+CCTT
+T
-TC
The code I use is below, and it is quite straightforward: I use samtools to create a pileup from the BAM file and pipe it into pileup2snp in VarScan.
My question is whether getting SNPs with - and + symbols is a problem with VarScan, or whether something is going wrong with the script I wrote, or possibly with the BAM file. Since I saw it in multiple runs, I assume it's not specific to the BAM file, so I guess it is either my script or VarScan. If it is VarScan, how can these entries be interpreted as valid results?
Code:
/path/to/samtools-0.1.16/samtools pileup -f $REFERENCE_FILE <bam file with removed duplicates> | java -jar /path/to/VarScan2.2.5/VarScan.v2.2.5.jar pileup2snp --min-coverage 15 --min-var-freq 0.31 > Samplename.snp
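To make the question concrete, here is a minimal sketch of how the entries listed above could be grouped by their leading symbol. It assumes the + and - prefixes follow the samtools pileup convention, where + marks bases inserted relative to the reference and - marks bases deleted; whether VarScan means the same thing in its output is an assumption here, not something confirmed. The function name `classify_variant` is hypothetical.

```python
def classify_variant(allele: str) -> str:
    """Label a variant string by its leading symbol.

    Assumption: '+' marks an insertion and '-' a deletion, as in the
    samtools pileup format. Anything else is treated as a plain base change.
    """
    if allele.startswith("+"):
        return "insertion"
    if allele.startswith("-"):
        return "deletion"
    return "substitution"


# The example strings from the post.
examples = [
    "-G",
    "+GGGGCCTGGTACAGCGGCT",
    "-T",
    "-AGAGAGAGAGAGAGAG",
    "+ATTTCCT",
    "+CCTT",
    "+T",
    "-TC",
]

for allele in examples:
    # Number of bases after stripping the leading sign.
    n_bases = len(allele.lstrip("+-"))
    print(f"{allele}\t{classify_variant(allele)}\t{n_bases} bp")
```

If that reading is right, these rows would be indel-like calls rather than SNPs, and they could be filtered out of the .snp file or tallied separately.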