Good morning everyone,
I am new to whole genome sequencing analysis, so if there is already a thread on this type of problem, I would be grateful if you could point me to it. I am currently working on a comparative analysis of plant genome sequences (DNA). We received paired-end sequence data from Illumina, checked the quality with FastQC, and found > 0.20% overrepresented sequences (from TruSeq adapters). I have some questions about those overrepresented sequences.
1) Do I need to remove those overrepresented sequences from the raw genomic DNA reads before proceeding to downstream analysis?
2) If I remove them, the paired files (R1 and R2) may end up with unequal numbers of reads. If I then remove the unpaired reads, I will discard a big chunk of single reads from both the R1 and R2 files. Is there any way to keep those single reads from both files and use them in downstream analysis, for instance when mapping to a reference genome and in annotation?
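To make question 2 concrete, here is a minimal Python sketch of the situation after filtering: reads whose mate survived are kept as synchronized pairs, and the rest are collected as single reads instead of being thrown away. The function name `split_pairs` and the `(read_id, sequence, quality)` tuple layout are assumptions made for illustration only; this is not the behavior of any particular trimming tool.

```python
from typing import List, Tuple

# Hypothetical record layout: (read id without /1 or /2 suffix, sequence, quality)
Read = Tuple[str, str, str]


def split_pairs(r1: List[Read], r2: List[Read]) -> Tuple[List[Read], List[Read], List[Read]]:
    """Split filtered R1/R2 reads into synchronized pairs and leftover singles."""
    r2_by_id = {read[0]: read for read in r2}
    r1_ids = {read[0] for read in r1}

    paired_r1: List[Read] = []
    paired_r2: List[Read] = []
    singles: List[Read] = []

    for read in r1:
        mate = r2_by_id.get(read[0])
        if mate is None:
            # The R2 mate was removed during filtering, so keep this as a single read
            singles.append(read)
        else:
            # Both mates survived: append in the same order to keep the files in sync
            paired_r1.append(read)
            paired_r2.append(mate)

    # R2 reads whose R1 mate was removed are also single reads
    singles.extend(read for read in r2 if read[0] not in r1_ids)
    return paired_r1, paired_r2, singles


if __name__ == "__main__":
    r1 = [("read1", "ACGT", "IIII"), ("read2", "GGCC", "IIII")]
    r2 = [("read1", "TTAA", "IIII"), ("read3", "CCGG", "IIII")]
    paired_r1, paired_r2, singles = split_pairs(r1, r2)
    print(len(paired_r1), len(paired_r2), len(singles))
```

In this sketch the two paired lists always have the same length and order, while the singles would have to be treated as single-end reads in any later step.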
Thank you in advance.
akashrestha