Hi
I am new to next-generation sequencing. I got my exome capture reads and aligned them to the reference genome using MAQ. Someone then told me that my reads are not in Sanger FASTQ format, and when I tried to convert them with the patch for MAQ, the output file was empty.
Here is an example of what my reads look like:
HWI-EAS440_0346:6:1:1488:13263#0/1:ACCTCTAACAGCCTCATCGTCAGCTACATCTGGTTATATGCACTT
TATTTCAGCCATATCAACTTATTTAAGGCTT:Z]RXZJQKKZ_]Y]]Ja\[[]T`^Y\VHSV]^JZZ\_\``NYTVYYZ^
SNT_^^BBBBBBBBBBBBBBBBBBBBBB
HWI-EAS440_0346:6:1:1524:7965#0/1:CTGTTTGTTGTTTAACAAGCCTACCAGGTGATTCTGACTCACATTA
TAGTTTCAGCACAACTTTAAATTCTTTCTT:]]R_LUJTXH\]_aU\bb^bb`acaccbcc`\YaTc^cccccccc]]``
^Za_U\KL[WZ]U_LLTJT`K__aSZ\
HWI-EAS440_0346:6:1:1526:8854#0/1:GGAGCATGGGAACAAATGTTCTTGAAATATTCTGCCTATACTTTCA
AGTGGGATATGGATAATCACTGGCCAAGGG:][`T]ZZZU_b^bLLYY\L\aUUUUKZTT`LT]Yaa^cYY_YU_U``bL
`HWTRSZ__MY[HTYZZ[\Y^\L_`^`
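For reference, the layout above looks like one record per read in the form `ID:SEQUENCE:QUALITY`, with long records wrapped by the forum display. A minimal Python sketch of turning such a record into a four-line FASTQ entry follows. It assumes each record sits on a single line in the real file and that the quality string contains no `:` character; the function name `record_to_fastq` is made up for illustration, and this is not the Perl script mentioned below.

```python
def record_to_fastq(line):
    """Convert one 'ID:SEQ:QUAL' line into a four-line FASTQ record.

    The read ID itself contains colons, so split from the right
    and keep only the last two fields as sequence and quality.
    """
    read_id, seq, qual = line.rstrip("\n").rsplit(":", 2)
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ for " + read_id)
    # Quality characters are copied unchanged; no rescaling happens here.
    return "@{0}\n{1}\n+\n{2}\n".format(read_id, seq, qual)
```

This only changes the layout. The quality characters keep whatever encoding they had in the input.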
Then I used a Perl script that I found online and converted my reads to this format:
@HWI-EAS440_0333_8_1_1016_4512#0/2
TTAGAGACTCCTGGATGCCCTGAGGGAGCGGCTCCAGAGCTTGCCTTCCCTCCTCTGTTTTCACAACGGTCCAGCG
+HWI-EAS440_0333_8_1_1016_4512#0/2
dd^e^d^dcd`ddadeeeeeeaYea`dccd\ddTdddYdbddd\d^ca_^L^`V^dYa`b`bc\Yb\Yba`TT`BB
Then another person from my lab told me that these quality scores do not match Sanger quality scores. Now I am totally confused about what I have. No one else in the lab has done next-gen sequencing before, so there is no one to help, and I am new to both programming and sequencing. Can someone help me with this?
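One rough way to check which quality encoding a file uses is to look at the lowest ASCII character in the quality strings. Sanger FASTQ stores Phred scores with an ASCII offset of 33, while the older Illumina pipelines use an offset of 64, so any character below `;` (ASCII 59) can only come from an offset-33 file. The sketch below is a heuristic under that assumption, not a definitive test; the names `guess_offset` and `to_sanger` are made up for illustration, and the conversion assumes the scores are already Phred-scaled with offset 64 rather than the older Solexa scale.

```python
def guess_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) from the lowest quality character.

    Heuristic only: an offset-33 file that happens to contain nothing
    but very high scores would be misreported as offset 64.
    """
    lowest = min(ord(ch) for q in quality_strings for ch in q)
    return 33 if lowest < 59 else 64


def to_sanger(quality):
    """Shift an offset-64 quality string to offset 33 (64 - 33 = 31)."""
    return "".join(chr(ord(ch) - 31) for ch in quality)


if __name__ == "__main__":
    sample = ["dd^e^d^dcdBB"]  # start and end of the quality line shown above
    print(guess_offset(sample))   # 64
    print(to_sanger("dd^eBB"))    # EE?F##
```

In the quality line shown above the lowest character is `B` (ASCII 66), which is consistent with an offset of 64 rather than the Sanger offset of 33.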
Manav