Hi all,
I am using Bowtie to map SOLiD reads to the horse reference genome in order to discover new miRNAs. I use this command line:
Code:
./bowtie --sam -f -a -m 6 -C -v 2 --best --strata <path_fasta> <path_sam>
and I obtain this output:
2_18_1468_F3 0 3 19908716 255 22M ACCACAGGGTAGAACCACGGTC IqqqqqqqqqqqqqqqqqqqqI XA:i:2 MD:Z:20A1 NM:i:1 CM:i:2
2_18_1468_F3 0 18 17330428 255 22M CAACACTTTGCTCCAACAAACT Iqqqqqqqqqqqqqqqq!!qqI XA:i:2 MD:Z:21A0 NM:i:1 CM:i:2
2_18_1468_F3 0 6 76907837 255 22M TGCACACCCATCTTGGTGCCAG I!!qq!!qqqqqqqqqqqqqqI XA:i:2 MD:Z:22 NM:i:0 CM:i:2
2_18_1468_F3 0 22 26347698 255 22M GTTGTTCCCATCTTGGTGCCTC Iqqq!!qqqqqqqqqqqqq!!I XA:i:2 MD:Z:22 NM:i:0 CM:i:2
2_18_1468_F3 16 21 55207881 255 22M GACCGTGGGGAGCGGGACACCA Iqqqqqq!!qqq!!qqqqqqqI XA:i:2 MD:Z:22 NM:i:0 CM:i:2
As I understand the Bowtie manual, the NM field is the number of mismatches between the decoded read and the reference, i.e. mismatches in the nucleotide alignment, and the CM field is the number of colorspace mismatches. Yet in my SAM file, for the same read with 2 colorspace mismatches, the number of nucleotide mismatches changes from one alignment to the next. I don't understand why! Can anybody who works with SOLiD reads help me?
Moreover, I tested Bowtie's --snpfrac option, and once again the results are surprising. For the same read I get:
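To compare the tags side by side, a small Python sketch along these lines could tabulate NM, CM and the number of mismatches written in MD for each alignment. It is only an illustration: `parse_tags` and `md_mismatches` are hypothetical helper names, the records are copied from the output above (which has fewer columns than a full SAM record, so tags are recognised by their TAG:TYPE:VALUE shape rather than by column index), and the MD count assumes ungapped 22M alignments.

```python
import re

# Optional SAM fields look like TAG:TYPE:VALUE, e.g. NM:i:1 or MD:Z:20A1.
TAG_RE = re.compile(r"([A-Za-z][A-Za-z0-9]):([AifZHB]):(.*)")

def parse_tags(sam_line):
    """Collect the optional TAG:TYPE:VALUE fields of one SAM line into a dict."""
    tags = {}
    for token in sam_line.split():
        m = TAG_RE.fullmatch(token)
        if m:
            tag, typ, value = m.groups()
            tags[tag] = int(value) if typ == "i" else value
    return tags

def md_mismatches(md):
    """Count the reference bases spelled out in an MD string (no deletions assumed)."""
    return len(re.findall(r"[A-Za-z]", md))

# Records copied from the output above.
records = [
    "2_18_1468_F3 0 3 19908716 255 22M ACCACAGGGTAGAACCACGGTC IqqqqqqqqqqqqqqqqqqqqI XA:i:2 MD:Z:20A1 NM:i:1 CM:i:2",
    "2_18_1468_F3 0 18 17330428 255 22M CAACACTTTGCTCCAACAAACT Iqqqqqqqqqqqqqqqq!!qqI XA:i:2 MD:Z:21A0 NM:i:1 CM:i:2",
    "2_18_1468_F3 0 6 76907837 255 22M TGCACACCCATCTTGGTGCCAG I!!qq!!qqqqqqqqqqqqqqI XA:i:2 MD:Z:22 NM:i:0 CM:i:2",
]

for line in records:
    rname, pos = line.split()[2:4]
    tags = parse_tags(line)
    print(rname, pos, "NM", tags["NM"], "CM", tags["CM"],
          "MD mismatches", md_mismatches(tags["MD"]))
```

Printed this way, it is easy to see at a glance whether NM follows MD while CM stays constant across the alignments of one read.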
2_18_1468_F3 0 6 76907837 255 22M TGGTGTCCCATCTTGGTGCCAG IqqqqqqqqqqqqqqqqqqqqI XA:i:2 MD:Z:2C0A0C0A16 NM:i:4 CM:i:2
2_18_1468_F3 0 22 26347698 255 22M TGGTGTCCCATCTTGGTGCCAG IqqqqqqqqqqqqqqqqqqqqI XA:i:2 MD:Z:0G0T0T0G0T15T0C0 NM:i:7 CM:i:2
2_18_1468_F3 0 3 19908716 255 22M ACCACAGGGTAGAACCACGGTC IqqqqqqqqqqqqqqqqqqqqI XA:i:2 MD:Z:20A1 NM:i:1 CM:i:2
2_18_1468_F3 0 18 17330428 255 22M CAACACTTTGCTCCAACATTGA IqqqqqqqqqqqqqqqqqqqqI XA:i:2 MD:Z:18A0A0C1 NM:i:3 CM:i:2
2_18_1468_F3 16 21 55207881 255 22M CTGGCACCAAGATGGGACACCA IqqqqqqqqqqqqqqqqqqqqI XA:i:2 MD:Z:0G0A0C0C0G0T0G0G0G0G0A0G0C9 NM:i:13 CM:i:2
Here, the first two alignments are against two chromosomal locations with identical sequence (the two sequences match perfectly), yet their NM fields differ! How can the same read, mapped against two identical sequences, have a different number of mismatches? So I wonder whether Bowtie can handle colorspace (and SNPs), and if so, how?
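One way to check what reference Bowtie reports under each alignment is to rebuild it from the SEQ and MD fields. The Python sketch below does that under stated assumptions: `reference_from_md` is a hypothetical helper name, the alignment is ungapped (CIGAR 22M, no `^` deletions in MD), and MD is taken at face value as describing SEQ as printed in the SAM line.

```python
import re

def reference_from_md(seq, md):
    """Rebuild the reference bases under an ungapped alignment from SEQ and MD:Z."""
    if "^" in md:
        raise ValueError("deletions in MD are not handled in this sketch")
    ref = []
    pos = 0  # current offset into the read sequence
    for run, base in re.findall(r"(\d+)|([A-Za-z])", md):
        if run:
            # A number means that many read bases match the reference.
            n = int(run)
            ref.append(seq[pos:pos + n])
            pos += n
        else:
            # A letter is the reference base at a mismatching position.
            ref.append(base)
            pos += 1
    if pos != len(seq):
        raise ValueError("MD does not span the whole read")
    return "".join(ref)

# SEQ and MD copied from the first two --snpfrac alignments above.
seq = "TGGTGTCCCATCTTGGTGCCAG"
ref_a = reference_from_md(seq, "2C0A0C0A16")
ref_b = reference_from_md(seq, "0G0T0T0G0T15T0C0")
print(ref_a)
print(ref_b)
print("identical reference windows:", ref_a == ref_b)
```

Comparing the two rebuilt windows shows directly whether the MD strings describe the same reference sequence at both locations.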
Thanks