We sequenced 16S rRNA gene amplicons on our collaborator's MiSeq and received a raw FASTQ file that looks like the sequences below.
The file seems to contain only read 1 and read 2; the index reads are missing.
In addition, the 1:N:0 and 2:N:0 fields in the headers are not followed by the sample number normally seen in MiSeq FASTQ files.
Since I need the index reads to feed the data to QIIME for further analysis, I have the following questions:
1. Is the index-read information still in this FASTQ file?
2. If yes, is there a way to extract the index reads?
3. If no, besides asking our collaborator for the index reads (which I have done, but I haven't heard back yet), is there a program similar to QIIME that does not require an index-read file as input?
Thank you for any feedback.
@MISEQ04:37:000000000-A2G8E:1:1101:14157:1957 1:N:0:TCCACAGGAGT
TACAGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTTGTTAAGTTGGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATTCAAAACTGACTGACTAGAGTATGGTAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCAGTGGCGAAGGCGACCACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCAAACA
+
?????BB?DDDDDEDDEEEEFFHIHECFFHHHIIIFHHHHIIIEHHHHEHHHAEFEHHEHHEGHHHHHHHHHHHFFFHHHHHHFFEFEFFFFFFFEEEFFFFFFFFFEFFFFFFFFEFFEFFEEFFFFFFFEEFEEFDEEEEEFFFFFFECEEEFEDDED?AACEEEEEDAEEEFEFEFEEEEEEEEEFECEE>?>?8A>;???EEFEEEFFCEE?*1::A:A0CCECA*14)48AEEEE>;;8;88:AC#
@MISEQ04:37:000000000-A2G8E:1:1101:14157:1957 2:N:0:TCCACAGGAGT
ACGGACTACCCGGGTTTCTAATCCTGTTTGCTCCCCACGCTTTCGCACCTCAGTGTCAGTATCAGTCCAGGTGGTCGCCTTCGCCACTGGTGTTCCTTCCTATATCTACGCATTTCACCGCTACACAGGAAATTCCACCACCCTCTACCATACTCTAGTCAGTCAGTTTTGAATGCAGTTCCCAGGTTGAGCCCGGGGATTTCACATCCAACTTAACAAACCACCAACCCGCGCTTTACGCCCAGCAATTC
+
?????@@BDDDDDDDDEFFFFFCFFHHHHHHGFHHHHHCDDHHDEDEHHHHHHHFEHFHGGGHHHHHHFCFHHHHF=EEEDCDEHEHHFCFHHEFHFHHFFFFFFFFFFFEEDEEDDDDEE<6@EBCEEFFFFECEEEEECEEE8:CEFFAECECEFAEFE?CEEEECAAAEAEEEFFCEFFFFEE?CEEAEFE'.8?88:?*:AE:CE?*1*:?C*?A?EAEE###########################
@MISEQ04:37:000000000-A2G8E:1:1101:14713:1991 1:N:0:TCCACAGGAGT
AACGGAGGGGGCAAGTGTTTCTCGCAATGACTGGGCCTAAAGGGCACGCAGGTGGTTTTCGACAACAGGTATTTCGGTTAAACACTGCAGGCTAACAACAGGTCTGGAATATCTACTAGGAAACTAAGAGTAGTGCTCAGGTCTTTAGAATTGCTAGCGGAGGGGTGGAATCCGGCGAGGCTAGTAGGAATGCTTATGAGTGAAGGCAATTTTCTGGAGCTGACTGACGCTCAGGTGCGCAAGCATGGGGA
+
9?????@@DDDDDDDDFFFFFFIEHHHHHHHIIHHIIHHHIHHHIHHEHHHHAEFEHIIIHHHH=FHHHC=DFHFFHEHHFFFFFFFFFDEEEFFFFFFFFFBEEFE=BEEEFFFFEEEEAECEFFFFFFCCEEFFFF?AECAEFFEEEEFFFFFEEEDD8<>DD)8>AEECEA?D?D?D>C?C??:E1?CEEAE?:CAECEAEFFFE8AEEF:?:A:8?*?*:?CAEEEEADCC*0??DD8<?ECEEEE#
@MISEQ04:37:000000000-A2G8E:1:1101:14713:1991 2:N:0:TCCACAGGAGT
ACGGACTACTGGGGTATCTAATCCTATTTGATCCCCATGCTTGCGCACCTGAGCGTCAGTCAGCTCCAGAAAATTGCCTTCACTCATAAGCATTCCTACTAGCCTCGCCGGATTCCACCCCTCCGCTAGCAATTCTAAAGACCTGAGCACTACTCTTAGTTTCCTAGTAGATATTCCAGACCTGTTGTTAGCCTGCAGTGTTTAACCGAAATACCTGTTGTCGCAAACCACCTGCGTGCCCTTTAGGCCCA
+
AAA?AABBDDDDDD<AFFFGFGHIHFFHHIIHHHHIHIIIIIHHHH@HHIIFHHHHHHHIIHHIIHHGHHFHIIIHHHFCECGHHFHIIHHHHHHHHHHHHHFHDHHHHGGGGGDEEGDEGCGGEEGGGGGGEEGGGEGEEEGCGGCGCEGGGGGGGGGEGGGGEGGEGG?CGGGGGEGGGGGGGGCGEGGCEEGGGGEECEG?C:?828<CCE?EGGGCCCC*.).CC?CEECE8CEC*11CEEE#####
@MISEQ04:37:000000000-A2G8E:1:1101:13997:2108 1:N:0:TCCACAGGAGT
TACGTAGGGTGCGAGCGTTAATCGGAATTACTGGGCGTAAAGCGTGCGCAGGCGGTTAATTAAACCAGTTGTGAAATCCCCGGGGTCAACCTGGGAATTGCATCTGTGACTGTATAGCTAGAGTACGGTAGAGGGGGATGGGATTCAGCGGGTAGCCGGGAAAAGCGTAGATATGCCGAGGAAACACGGAGGCGAAGGGAATTCTCTGGAACTGGACTTGCGCTCCTGCACGAAAAGCTGGGGAGGAAACA
+
?????BB?BDDDBBBDDDEEFFHIHHHHHHHIHHHIHHHHIHHHEHECEHECEHH<<<,,,,5,,44+4C,@D,CF,,@FF);@))34AAC################################################################################################################################################################
@MISEQ04:37:000000000-A2G8E:1:1101:13997:2108 2:N:0:TCCACAGGAGT
ACGGACTACAAGGGTTTCTAATCCTGTTTGCTCCCCACGCTTTCGTGCATGAGCGTCAGTACAGGTCCAGAGGATTGCCTTCGCCATCGGTGTTCCTCCGCATATCTACGCATTTCACTGCTACACGCGGAATTCCATCCCCCTCTACCGTACTCTAGCTATACAGTCACAGATGCAATTCCCAGGTTGAGCCCGGGGATTTCACAACTGTCTTATATAACCGCCTGCGCACGCTTTACGCCCAGCAATTC
+
?????@@BDDDBDD?BEFFFFFFHIIHHHHHIIHHHIC=DDFFGHHFHHIIIHFCCEEHGHIHHH-AEFHDDFFHHHFGGFFHHHHHFHECDEEDHHDFCDEDDFFDFFF@DDED=DEED=,ACFFAEDEDDAEFFFFE?C??8EEEF:8).:AAAEF?CEAECEA?:::CC:?EEEFFE?CCECE*?*:?ADDD84)*1:?EEEECA*00::*::CE:?>'.A?EDD;''')08*AEAD48?######
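
To make the header layout in the records above concrete, here is a minimal Python sketch that splits one header line into its fields. It assumes the headers follow the usual Illumina layout (instrument:run:flowcell:lane:tile:x:y, then read:filter:control:last field), and it only assumes, without confirming, that the last field (TCCACAGGAGT here) is an index sequence rather than a sample number. The names HEADER_RE and parse_header are hypothetical and used only for illustration.

```python
import re

# Assumed layout: @instrument:run:flowcell:lane:tile:x:y read:filter:control:last
# "last" may be an index sequence or a sample number; this sketch does not decide which.
HEADER_RE = re.compile(
    r"^@(?P<instrument>[^:]+):(?P<run>[^:]+):(?P<flowcell>[^:]+):"
    r"(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+) "
    r"(?P<read>[12]):(?P<filtered>[YN]):(?P<control>\d+):(?P<last>\S*)$"
)


def parse_header(line):
    """Return the header fields as a dict, or raise ValueError if the layout differs."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("unexpected FASTQ header layout: " + line)
    return match.groupdict()


# First header from the file excerpt above.
header = "@MISEQ04:37:000000000-A2G8E:1:1101:14157:1957 1:N:0:TCCACAGGAGT"
fields = parse_header(header)
print(fields["read"], fields["last"])
```

If the last field does turn out to be the index sequence, the same parsing could be applied to every fourth line of the file to collect it per read.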