Hi all, I have a quick question about the pileup format. The documentation on the samtools website was clear enough, but the files that I have are slightly different.
Instead of something like this:
Code:
seq1 272 T 24 ,.$.....,,.,.,...,,,.,..^+. <<<+;<<<<<<<<<<<=<;<;7<&
seq1 273 T 23 ,.....,,.,.,...,,,.,..A <<<;<<<<<<<<<3<=<<<;<<+
I have extra columns between the reference base and the number of reads, the first of which seems to be the variant call. However, I can't figure out what that variant call means. At first I thought it was an automatic interpretation of the SNP as the corresponding amino acid change, but it turns out this is not the case.
Here are some examples of my data:
Code:
chr1 9692883 A R 79 79 35 54 .$,..,,,,g,,,,g..,,gg,g,g,gg,g.g..g.,.,g,g.gggg,,g.,g^Fg^F. BaBBaRBVBHBBMBBZBBBBBBBBBBBBBaBabB^BaBBBBbBBBBBBBbBBBa
chr1 9699171 G S 59 59 36 28 .,,,,,,,,.c,c....c....ccc,^Fc^:c B`aL^`]YQ\FBKB]^\B`^ZaBBBBBB
chr1 9699574 A M 75 75 35 38 ,$,,,,,,,,,cc,,,,..,.,c.,..cc..ccc^Fc^F.^F.^Fc^Fc a\JFFFTBBBDBBBBB`\F_GE_B`aBBaaPBBBbbVB
chr1 9703423 T K 55 55 36 39 GGGGGG.GG,.GG,..,.G,......,.........^F.^F.^:. BBBBBBBBB`BBB]KB_BBaJGZVYK^\XT^]\a^`baa
Everything there is fairly obvious and easy to put together, except I cannot figure out what the fourth column actually corresponds to in the alignment. Obviously the SNPs are g, c, c and G in the above example, but what do the R, S, M and K stand for?
Does anyone have any ideas?
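One hypothesis that can be checked directly against the example lines: R, S, M and K are all IUPAC nucleotide ambiguity codes (R = A/G, S = C/G, M = A/C, K = G/T), so the fourth column may be a heterozygous consensus call rather than anything to do with amino acids. The Python sketch below is only a rough test of that idea. The helper names `observed_bases` and `check_line` are made up for illustration, and the column positions (reference in column 3, call in column 4, read bases in column 9) are assumed from the example data, not taken from any documentation.

```python
import re
from collections import Counter

# Standard IUPAC two-base ambiguity codes.
IUPAC_PAIRS = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}

def observed_bases(ref, read_bases):
    """Count the bases seen in a pileup read-bases string ('.' and ',' mean ref)."""
    counts = Counter()
    i = 0
    while i < len(read_bases):
        ch = read_bases[i]
        if ch == "^":
            # Read start: the next character is a mapping quality, skip both.
            i += 2
            continue
        if ch in "+-":
            # Indel: skip the length digits and the indel sequence itself.
            m = re.match(r"\d+", read_bases[i + 1:])
            if m:
                i += 1 + len(m.group()) + int(m.group())
                continue
        if ch in ".,":
            counts[ref.upper()] += 1
        elif ch.upper() in "ACGTN":
            counts[ch.upper()] += 1
        # '$' (read end) and '*' (deletion placeholder) are ignored.
        i += 1
    return counts

def check_line(line):
    """Return (call, bases implied by the IUPAC code, two most common observed bases)."""
    fields = line.split()
    ref, call, read_bases = fields[2], fields[3], fields[8]  # assumed column layout
    counts = observed_bases(ref, read_bases)
    top_two = "".join(sorted(base for base, _ in counts.most_common(2)))
    return call, IUPAC_PAIRS.get(call.upper(), call.upper()), top_two

example = ("chr1 9692883 A R 79 79 35 54 "
           ".$,..,,,,g,,,,g..,,gg,g,g,gg,g.g..g.,.,g,g.gggg,,g.,g^Fg^F. "
           "BaBBaRBVBHBBMBBZBBBBBBBBBBBBBaBabB^BaBBBBbBBBBBBBbBBBa")
print(check_line(example))  # ('R', 'AG', 'AG')
```

If the second and third values agree for every line, the ambiguity-code reading fits the data: each site would be a mix of the reference base and one alternate base. The three numeric columns after the call are not interpreted here.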