Dear All,
I would like to retrieve sequences (fastq format) from an Illumina fastq data file using the first part of the sequence header.
Example of an Illumina fastq header:
@X01032:109:000000000-AGKF7:1:1101:11950:1779 1:N:0:1
My query:
@X01032:109:000000000-AGKF7:1:1101:11950:1779
I tried usearch (fastx_getseqs), seqtk, and seqret, but none of them worked because of the special characters (e.g. ":", "-") in the header. A simple grep like
Code:
grep -A 3 "@X01032:109:000000000-AGKF7:1:1101:11950:1779" in.fastq
would work, but it would take a long time to finish. I could reformat the headers, but I would prefer not to if possible.
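To make the goal concrete, here is a minimal Python sketch of the lookup I have in mind. It assumes a plain, uncompressed fastq file with exactly four lines per record, and that the ID is separated from the rest of the header by a space. The function name fetch_records is made up for illustration and is not part of any existing tool. Since it compares only the part of the header before the first space as a plain string, the ":" and "-" characters need no escaping.

```python
def fetch_records(fastq_lines, wanted_ids):
    """Yield 4-line FASTQ records whose read ID is in wanted_ids.

    Assumption: every record is exactly four lines and the read ID is the
    header text before the first space (a leading '@' is optional in the
    query IDs).
    """
    # Normalise the queries: drop a single leading '@' if present.
    wanted = {w[1:] if w.startswith("@") else w for w in wanted_ids}

    it = iter(fastq_lines)
    # zip over the same iterator four times groups lines into records and
    # stops cleanly if the file ends with an incomplete record.
    for header, seq, plus, qual in zip(it, it, it, it):
        # Exact string comparison on the ID only, so no regex is involved.
        read_id = header[1:].partition(" ")[0].strip()
        if read_id in wanted:
            yield header + seq + plus + qual


# Hypothetical usage (file name is an assumption):
#   import sys
#   with open("in.fastq") as fh:
#       query = "@X01032:109:000000000-AGKF7:1:1101:11950:1779"
#       sys.stdout.writelines(fetch_records(fh, [query]))
```

This reads the file once and can look up many IDs in the same pass, which is the behaviour I am hoping an existing tool already provides.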
Is there a tool out there that would work with Illumina fastq files?
Thanks for the help!