I ran samtools pileup on an Illumina RNA-seq dataset aligned to hg19 with TopHat. Some of the consensus base calls (fourth column) came out as IUPAC ambiguity codes such as R.
SIFT cannot process these ambiguous nucleotides, so I have had to leave those positions out of the SIFT analysis.
Here is an example of my pileup:
chr1 881627 G R 62 62 60 65 ,$a$,$a,aa,,a,,,.aAaaaaAaaa,,aaaa,,aaaA.A,,,A,a.Aaaaaa,,,,a,.,..a,.^~,
chr1 887801 A R 228 228 60 53 G$G$g$..,,,,Gg.g,,gG..Ggg.Gg,ggG..GGG.GGG,gg,.,,,,,.,..^~T
chr1 990773 C T 48 124 60 20 TtTTTTTtttTTTtTtttt^~t
Has anyone found a way to process these ambiguous nucleotides in SIFT so that the information they do hold can be utilized?
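One workaround I am considering is to expand each two-base IUPAC code into its bases and pass only the non-reference allele to SIFT. Below is a minimal Python sketch of that idea. It assumes the first four whitespace-separated pileup columns are chromosome, position, reference base and consensus base, and that a code like R means a heterozygous A/G call. The function names are my own and are not part of samtools or SIFT.

```python
# IUPAC two-base ambiguity codes and the pair of bases each one stands for.
IUPAC_HET = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}

def alt_alleles(ref, consensus):
    """Return the non-reference base(s) implied by a consensus call.

    Hypothetical helper: plain A/C/G/T calls pass through unchanged,
    two-base IUPAC codes are expanded, anything else (N, three-base
    codes, indel rows) yields an empty list.
    """
    ref = ref.upper()
    consensus = consensus.upper()
    bases = IUPAC_HET.get(consensus, consensus)
    if any(b not in "ACGT" for b in bases):
        return []
    return [b for b in bases if b != ref]

def pileup_to_variants(line):
    """Turn one pileup line into (chrom, pos, ref, alt) tuples.

    Assumes columns 1-4 are chromosome, position, reference base and
    consensus base; the position is kept exactly as given in the pileup.
    """
    chrom, pos, ref, consensus = line.split()[:4]
    return [(chrom, int(pos), ref.upper(), alt)
            for alt in alt_alleles(ref, consensus)]

if __name__ == "__main__":
    example = "chr1 881627 G R 62 62 60 65 ,$a$,$a,aa,,a"
    for variant in pileup_to_variants(example):
        print(*variant, sep="\t")
```

If the reference base is not one of the two bases behind the code, this returns both as alternate alleles, and I am not sure how SIFT would want such a site represented.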
In addition, can anyone clarify when it is appropriate to run SIFT with zero-based versus one-based coordinates?