Dear all,
In my previous experience, the third column of the SAM files output by Bismark is usually the chromosome number with a "chr" prefix. My new data, however, has a mixture of chr<number> and <number> in that column. Is that common?
MWR-PRG-0014:106:C25B0ACXX:4:1101:1195:1983_1:N:0:CAAAAG/1 115 15 3399663 255 100M = 3399567 -196 CAAAAT
ACCAAAAAATAAAACACAAAATAAAAAAACTATTCTTCCTACCTAAAAACATAATAACTTCCACATCAATAATTCTTTATTACATAAATTATAN #DDDC@</=BCEEEC>ECCD@B@?7EA=HIIIIGBIHG
F>ACFFF??GGIIIHGCDD<<<B?19CIGHHIIIEHHEIHEIGIGEDEHHFHDHDDDBB=1# NM:i:23 XX:Z:2G3G3G1GGG1G1G5GG4G2GG12T2G1G1G1C2G1G31G4GT XM:Z:.
.x...h...x.hhh.h.h.....xh....h..hh...............x.h.h....h.h...............................h....h. XR:Z:CT XG:Z:GA
MWR-PRG-0014:106:C25B0ACXX:4:1101:1195:1983_1:N:0:CAAAAG/2 179 15 3399567 255 100M = 3399663 196 AACAAA
AACAATTATAAAACTAAACTAAAAAAATCCCAAATCAAAATTTTAATATTAATTTATTCATTCACCTCACAAATAAATAAAAATATTTATCAAA B@@DD;DDFDDHDEBFHIIGGIJJIJJIJIIE>BD<?F
DFGBGGID@CGICCGGIJJJJIJJJI=AECHHAE>BFFCCEE@ECCDAAC@=C>CCCDC;AC NM:i:24 XX:Z:1G2GGG3G2G1GGG3GG3GG1G8GG4G5G2G2G25G4G6G3G1 XM:Z:.
h..xhh...x..h.hhh...xh...xh.h........xh....h.....h..h..h.........................h....h......h...x. XR:Z:GA XG:Z:GA
MWR-PRG-0014:106:C25B0ACXX:4:1101:1234:1999_1:N:0:CAAAAG/1 115 chr9 57387259 255 100M = 57387164 --195 AAACAAGCATCTTAAAATAACTATTAAAATTCAAAAAACTATATATCCTCAAAACTAAAAATAATATCAAATCCATAATCTTAAAATCCTCTTTCTAAGN ?A>5>;-.;(.6:EEDA>?@
?=;DDDDACIIDECB=<??DBBEDDDDDD4DBIDIFCDD>EE?C9BDB<4<DEFEFEAEBDC+A9<B>DDBDB=A11# NM:i:21 XX:Z:1G3G8GG2GG2G2G7GG1G14G1G3G2G1G2G4G11GG15T
XM:Z:.h...xH.......hh..hh..x..h.......xh.h..............x.h...h..h.h..h....h...........hh..............H. XR:Z:CT XG:Z:G
A
MWR-PRG-0014:106:C25B0ACXX:4:1101:1234:1999_1:N:0:CAAAAG/2 179 chr9 57387164 255 100M = 57387259 -195 CAAAATATTTATACTACCTACTATATACACAACACAATACTAAATTCCACTTTACTCCCTAATCTTCCACTATTCCCTCTTCCCTAAACAAAAAAAAACA ?@@FF?D4=E?FF3:AA:
C4C:FGG>3+AFHIJGDH@DHGBDHIGG?@<FGGGHJIDCDHCHBGGIIJIIBEE=EHE;A?E7;7?7;;AA@BDD@? NM:i:19 XX:Z:6G3G1G2G3G2G1G1G4G3GG1G3G10G7G25G4G1G1G3-XM:Z:......h...h.h..x...x..x.h.h....x...zx.h...h..........h.......h.........................h....h.h.h... XR:Z:GA XG:Z:GA
MWR-PRG-0014:106:C25B0ACXX:4:1101:1406:1986_1:N:0:CAAAAG/1 115 12 77985032 255 100M = 77984894 --238 AATATTAACATAAAACCAAAACAACAAAATATCTAAAACACTCCAATCCCCACTCATTCCAAACTTCAAACTACTAAATCAAAAATATTACATTTCATTN CDAEDEED@C>DDBB@EFFFD@
HHCHHGGECCAEHD=.=8HFJHIHCHEGB9IIIHGGGIJIGIGDGIHHGEEJEJGIGIJJJJJIJHHFGHFFFDD=4# NM:i:30 XX:Z:3G2GG1G1GG1G3GG2GG3GG1G3GG8G16G6GG5GGG3GG
GG1G2G9A XM:Z:...h..hh.z.hh.h...xh..zx...hh.h...xh........z................x......xh.....xhh...xhhh.h..h.......... XR:Z:C
T XG:Z:GA
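To see how widespread the mix is across a whole file, the reference names in the third column can be tallied. Below is a minimal Python sketch, assuming an unwrapped, tab-delimited SAM file with one record per line (unlike the wrapped paste above). The function name tally_rname_styles and the demo records are made up for illustration and are not part of Bismark.

```python
from collections import Counter


def tally_rname_styles(sam_lines):
    """Count SAM column 3 (RNAME) values, split by 'chr' prefix or not."""
    prefixed = Counter()  # names such as 'chr9'
    bare = Counter()      # names such as '15'
    for line in sam_lines:
        line = line.rstrip("\n")
        # Skip blank lines and header lines, which start with '@'.
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a complete alignment record
        rname = fields[2]
        if rname == "*":
            continue  # unmapped read, no reference name
        if rname.startswith("chr"):
            prefixed[rname] += 1
        else:
            bare[rname] += 1
    return prefixed, bare


# Hypothetical, shortened records that only mimic the layout of the data above.
demo = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1/1\t115\t15\t3399663\t255\t4M\t=\t3399567\t-100\tACGT\tIIII",
    "read2/1\t115\tchr9\t57387259\t255\t4M\t=\t57387164\t-99\tACGT\tIIII",
]
prefixed, bare = tally_rname_styles(demo)
print("with chr prefix:", dict(prefixed))
print("without prefix:", dict(bare))
```

For a real file, an open file handle can be passed in place of the demo list, for example `tally_rname_styles(open("sample.sam"))`, where sample.sam is a placeholder file name.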