Hello,
I have several questions:
1-) We did SE 50 bp sequencing on an Illumina platform, and I am trying to analyze the data on the Galaxy server. After uploading the FASTQ files, I saw dots in some of the reads, and they follow a pattern: one set of reads has dots at the 33rd and 34th positions, another set has them somewhere else, but always at the same positions within a set. Reads containing dots seem to make up about 10% of all reads.
After grooming, the dots didn't change, but I ran TopHat anyway. I am not sure whether I need to replace the dots with "N"s to be able to use those reads (do I?). If so, how should I do that? (I am not using Linux, so I'd be happy if you could give a solution that uses Galaxy.)
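To make clear what I mean by replacing the dots, here is a minimal Python sketch of the operation (outside Galaxy, so only as an illustration). It assumes plain 4-line FASTQ records, and the function name `replace_dots_with_n` is my own, not a Galaxy or TopHat tool. It only touches the sequence line, because "." can also be a legitimate quality character.

```python
def replace_dots_with_n(fastq_lines):
    """Return FASTQ lines with '.' replaced by 'N' on sequence lines only.

    Assumes every record is exactly 4 lines: header, sequence, '+', quality.
    """
    out = []
    for i, line in enumerate(fastq_lines):
        # The sequence is the second line of each 4-line record.
        if i % 4 == 1:
            line = line.replace(".", "N")
        out.append(line)
    return out


# Made-up example record, not taken from my data.
record = [
    "@read1",
    "ACGT..ACGT",
    "+",
    "IIIIIIIIII",
]
fixed = replace_dots_with_n(record)
print(fixed[1])
```

The quality line is left unchanged, so the read and its quality string keep the same length.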
2-) One of the FASTQ files gives an overrepresented sequence like this:
Sequence Count Percentage Possible Source
GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCGTATCTCGTAT 34465 0.11089473793609078 TruSeq Adapter, Index 8 (97% over 37bp)
Do I need to remove those reads? I assume they won't map anyway, so they shouldn't be a problem.
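If removal is needed, this is roughly the filtering I have in mind, written as a small Python sketch rather than a Galaxy workflow. It assumes 4-line FASTQ records and drops only reads whose sequence exactly matches the overrepresented sequence reported above; the function name `drop_reads_with_sequence` is hypothetical, and a real adapter trimmer would also handle partial matches.

```python
# The overrepresented sequence exactly as reported in the table above.
OVERREPRESENTED = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCGTATCTCGTAT"


def drop_reads_with_sequence(fastq_lines, unwanted=OVERREPRESENTED):
    """Return FASTQ lines without the records whose sequence equals `unwanted`.

    Assumes every record is exactly 4 lines: header, sequence, '+', quality.
    """
    kept = []
    for start in range(0, len(fastq_lines) - 3, 4):
        record = fastq_lines[start:start + 4]
        # record[1] is the sequence line of this record.
        if record[1] != unwanted:
            kept.extend(record)
    return kept


# Made-up example: one adapter read and one ordinary read.
reads = [
    "@adapter_read", OVERREPRESENTED, "+", "I" * 50,
    "@normal_read", "ACGTACGTAC", "+", "IIIIIIIIII",
]
filtered = drop_reads_with_sequence(reads)
print(len(filtered) // 4, "read(s) kept")
```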
3-) I guess a per-base quality graph like the one below is good, and I don't need any trimming or quality cutoff?