I'm new to bioinformatics and just received a paired-end (120 bp) set of Illumina sequences. The sequence_2 file looks like what I'm used to, but the sequence_1 file looks like this:
@GRC13_0025_FC:8:1:8024:1022#0/1
NAGTGAGTAGTCAAAAGAATAGTTCTATCCGACTTAACCAAAGCTAACATCTTCTGAACATCAATCCGTGCAGCAGGATCCATTCCAGCAGTTGGTTCATCCAAAAGAATCACTTTACTCC
+GRC13_0025_FC:8:1:8024:1022#0/1
BQNNMMLMRLY[Y[[WVOQQWJWXXYVYYYMMTQOMNOQNVOTVTXXWWW_____TY[[YRRWWMJRRRTWPTRTPTVTV_____BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
@GRC13_0025_FC:8:1:8161:1016#0/1
NTCTTCATCGTCAGGCACTGGAAAGTGATTATGCGTCATCTCATCTTCATGAATGGATTGATCTGATTTTCGGATATAAACAGAATGGAGAAGAGGCAGTGAAGGAAAGAAGCAATTATTT
+GRC13_0025_FC:8:1:8161:1016#0/1
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
@GRC13_0025_FC:8:1:8180:1012#0/1
NTCATTTAGAAAAAATAGAAACGATATTGAAGATAAAGTACGAATAATTATAGACCTGACAGTTGATGAGGTAGAAAGTGTAAAAATTAGATCTGAGAAAATTCAAGTAGATGGGCATTGA
+GRC13_0025_FC:8:1:8180:1012#0/1
BUUUUXXXXXbbQQQQQQ___b____b_________b__________b__b___b__b___bbbb_b_______b___bbbb___b_______T___b__QQ_____Z_Z___________
I have been looking at RAD sequencing data and interpreting the fourth line of each record as FASTQ quality scores, but if that's true for these reads, the quality is really poor. Is that the case, or do paired-end reads use a different file format? Thanks for any advice!
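To make the question concrete, here is a minimal sketch of how the quality line could be turned into numbers. It assumes the file uses the older Illumina Phred+64 encoding (an assumption based on the `#0/1` header style and on the characters ranging from `B` to `b`; it is not confirmed by the sequencing facility). The helper names `decode_quality` and `guess_offset` are hypothetical, and the offset guess is only a heuristic.

```python
def decode_quality(qual, offset=64):
    """Convert a FASTQ quality string into a list of integer scores."""
    return [ord(ch) - offset for ch in qual]


def guess_offset(qual):
    """Heuristic guess of the ASCII offset (33 or 64) for one quality string.

    Phred+64 and Solexa+64 encodings never use characters below ';'
    (ASCII 59), so any lower character implies Phred+33. The reverse is
    not guaranteed: a high-quality Phred+33 read can also stay above ';'.
    """
    return 33 if min(qual) < ";" else 64


# First 20 quality characters of the first read shown above
qual = "BQNNMMLMRLY[Y[[WVOQQ"
offset = guess_offset(qual)
scores = decode_quality(qual, offset)
print(offset, scores[:5])  # 64 [2, 17, 14, 14, 13]
```

Under that assumption, `B` decodes to 2 and `b` to 34. In Illumina 1.5 to 1.7 pipelines a score of 2 (`B`) is a special flag marking a low-quality read segment, so a read made up entirely of `B` would be one the pipeline flagged end to end rather than a sign of a different file format.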