Dear all,
I just received data from Illumina (100 nt reads).
The file is named: mRNAseq_7_sequence.txt
and it looks like this:
HWUSI-EAS1999_9996:7:1:2:618#0/1:NATGTTTTTTTTTTTCAAGAACGAAAGTTNGGGGCTCGAAGACGATCAGATACCGAGAAAAAAAAAAAATCGTATGCCGTCTTCTGCTTGAAAAAAAAAAA:FNNNMWWTTYeeeeeeeeeeeeeeeRRTTEVVUUUeeeeeeeeeeeeeeeeeeeeeeee\eeeeeee[ee^\\[[\^\\\\^ZY[eee[eeeeeeBBBBB
HWUSI-EAS1999_9996:7:1:2:1071#0/1:GACGACTTCTCCGGGGGGGAAATGATAAGNTTCAGTGGACTTCCCCCCCCCCCGCGGGCAGCGAAAAAAAAAAAAAAACCGAGATCGGAAGAGCTCGTATG:EHHHHMMJJJeeeeeeeeee\eeeeKKJJEPNNPP\^^Y\^\^^Yeeeee\[\\[ee[e[\\^[\eeeeeBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
HWUSI-EAS1999_9996:7:1:2:264#0/1:NGGCCATCAGTAGGGTAAAAAAAAAAAAANTCACGACGGTCTAAACCCAGCTTTTTTTTTTGAAGAGCTCGTATGCCCCCCCCCCCTTGAAAAAAAAAAA:FTVVV^^Y^^eeeeeeeeeeeeeee^^^ZFRRVROeeeeeeeeeeeeeeeeee[eeeeeeeeeeeeeeeeee\eeee[eeeeeee\e[[e\[^BBBBBBB
I have been searching all over the net, but there is too much information, so here is my question: how can I filter for high-quality reads? I want to keep reads that either have an average quality above a certain threshold, or have no more than a certain number of bases below a certain quality threshold. Is there a way to do this?
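To make the two criteria concrete, here is a rough Python sketch of the kind of filter meant above. It assumes each line is colon-separated with the sequence and quality string as the last two fields (as in the sample lines), and that the quality characters are Phred scores with an ASCII offset of 64. The offset is an assumption based on the characters seen in the sample, not something confirmed by the sequencing provider. The function names and the default thresholds are hypothetical and only for illustration.

```python
def parse_line(line):
    """Split one colon-separated line into (read_id, sequence, quality_string).

    The read ID itself contains colons, so split from the right.
    """
    read_id, seq, qual = line.rstrip("\n").rsplit(":", 2)
    return read_id, seq, qual


def phred_scores(qual, offset=64):
    """Convert quality characters to integer scores (offset 64 is an assumption)."""
    return [ord(c) - offset for c in qual]


def mean_quality_ok(scores, min_mean=20):
    """Criterion 1: average quality at or above a threshold."""
    return bool(scores) and sum(scores) / len(scores) >= min_mean


def low_base_count_ok(scores, low_q=10, max_low=10):
    """Criterion 2: at most max_low bases with quality below low_q."""
    return sum(1 for s in scores if s < low_q) <= max_low


def filter_reads(lines, min_mean=20, low_q=10, max_low=10, offset=64):
    """Yield only the lines that pass both criteria (hypothetical thresholds)."""
    for line in lines:
        if not line.strip():
            continue
        _, _, qual = parse_line(line)
        scores = phred_scores(qual, offset)
        if mean_quality_ok(scores, min_mean) and low_base_count_ok(scores, low_q, max_low):
            yield line
```

Either criterion can be used on its own by calling `mean_quality_ok` or `low_base_count_ok` directly. To run it on the file, something like `filter_reads(open("mRNAseq_7_sequence.txt"))` would iterate over the reads that pass.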
Next I want to convert this to 1. Solexa FASTQ; 2. normal (Sanger) FASTQ; ...
As I want to use this data in GALAXY, I should also be able to upload it there.
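For the conversion step, the following is a minimal Python sketch of what is meant, under the same assumptions as before: colon-separated lines with sequence and quality as the last two fields, and Phred-scaled qualities with an ASCII offset of 64. The function name `to_fastq` is hypothetical. It writes a four-line FASTQ record and, optionally, shifts the quality characters to the Sanger offset of 33. It does not handle data whose scores are on the older Solexa scale, which would need a score conversion and not just an offset shift.

```python
def to_fastq(line, to_sanger=True, in_offset=64):
    """Turn one colon-separated line into a four-line FASTQ record.

    If to_sanger is True, quality characters are shifted from in_offset
    (assumed to be 64) to the Sanger offset of 33. Otherwise the quality
    string is written unchanged.
    """
    read_id, seq, qual = line.rstrip("\n").rsplit(":", 2)
    if to_sanger:
        qual = "".join(chr(ord(c) - in_offset + 33) for c in qual)
    return "@%s\n%s\n+\n%s\n" % (read_id, seq, qual)


def convert_file(in_path, out_path, to_sanger=True):
    """Convert a whole file line by line (paths are supplied by the caller)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(to_fastq(line, to_sanger))
```

For example, `convert_file("mRNAseq_7_sequence.txt", "mRNAseq_7_sanger.fastq")` would write a Sanger-style file; the output file name here is only an illustration.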
Any help is more than welcome.
Steven