Hi
I'm trying to work through some of the various assembler programs before actually collecting my own Illumina data. I've found some test datasets here:
but I'm not sure if the file formats are the same as raw data from the Genome Analyzer.
The files are s_4_seq.txt and s_4_prb.txt and the first few lines look like this:
s_4_seq.txt
4 1 56 910 AACTTACAATTGAAAATATAAACTCAT
4 1 64 716 AAGATGATTATATGTCTTCCTTTTCGA
4 1 890 894 TCAAACCAATCAGACCTATGTTTCATA
s_4_prb.txt
40 -40 -40 -40 40 -40 -40 -40 -40 40 -40 -40 -40 -40 -40 40 -40 -40 -40 40 40 -40 -40 -40 -40 40 -40 -40 40 -40 -40 -40 40 -40 -40 -40 -40 -40 -40 40
So my questions are:
1. Is this the raw data format from the machine?
2. How do I get these files into fastq format? The maq converter and sanger perl scripts previously mentioned do not seem to work.
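To make question 2 concrete, here is a rough Python sketch of the conversion I think is needed. It is not a tested tool and rests on several assumptions on my part: that each line of s_4_prb.txt belongs to the read on the same line of s_4_seq.txt, that the scores come in groups of four per base (A, C, G, T), that the highest score in each group is the quality of the called base, and that the scores are on the Solexa scale and need converting to Phred before being written as Sanger-style FASTQ. The function names are my own.

```python
import math


def solexa_to_phred(q):
    """Convert a Solexa-scaled score to a Phred score (rounded to an integer)."""
    return int(round(10 * math.log10(10 ** (q / 10.0) + 1)))


def seq_prb_to_fastq(seq_line, prb_line):
    """Build one Sanger-style FASTQ record from a _seq.txt line and its _prb.txt line."""
    # _seq.txt columns as they appear above: lane, tile, x, y, sequence
    lane, tile, x, y, bases = seq_line.split()
    scores = [int(s) for s in prb_line.split()]

    # Assumption: exactly four scores (A, C, G, T) per base
    if len(scores) != 4 * len(bases):
        raise ValueError("expected 4 scores per base")

    quals = []
    for i in range(len(bases)):
        # Assumption: the best of the four scores is the quality of the called base
        best = max(scores[4 * i:4 * i + 4])
        # Sanger FASTQ stores Phred + 33 as an ASCII character
        quals.append(chr(solexa_to_phred(best) + 33))

    name = "_".join((lane, tile, x, y))
    return "@%s\n%s\n+\n%s\n" % (name, bases, "".join(quals))
```

Calling `seq_prb_to_fastq` on each pair of lines read in parallel from the two files and writing out the results would give one FASTQ file for the lane.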
Thank you!