Hello everyone!
I have Illumina paired-end RNA-seq reads and want to proceed with adapter trimming.
A few things confuse me:
1. Do the 5' ends of both the forward and reverse reads start at the first base of the insert? Or could there also be adapter contamination at the 5' end?
From what I have read online, there shouldn't be any adapter at the 5' end. However, in my data about 75 reads (out of 7 million in the forward read file) have adapter at the 5' end. 75 reads isn't many, but I would like to know what causes this.
2. In the forward reads, some 3' ends may contain the indexed adapter. When the indexed adapter occurs within a read, I should remove the adapter and everything after it, right? And what if the indexed adapter is at the 5' end? In that case I assume the whole read should be discarded, because there was no insert between the two adapters.
3. Do the 5' ends of the reverse reads contain barcode sequences or any part of the indexed adapter? I have 12,399 reads (out of 7 million) that contain all or part of the indexed adapter at the 5' end, and a few with it inside the read.
I am new to RNA-seq data analysis and have gone through many tutorials and explanations online, but it all seems very confusing at the moment.
My main concern is: where should I expect adapters in Illumina forward and reverse reads, respectively, and what should I do when I find adapters in unexpected places?
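To make the counts above concrete, here is a minimal sketch of how reads could be sorted by where an adapter match starts. It is only an illustration, not the method used to get the numbers in this post. The adapter string is an assumption (the 13-base start shared by the TruSeq read 1 and read 2 adapters; substitute whatever your library kit uses), the function names are hypothetical, and matching is exact, whereas real trimmers tolerate mismatches.

```python
from collections import Counter

# Assumption: TruSeq-style library. This 13-mer is the shared start of the
# TruSeq read 1 and read 2 adapters; replace it with your kit's sequence.
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def classify_adapter_position(read, adapter=ADAPTER_PREFIX, min_overlap=5):
    """Classify a read by where an exact adapter match begins.

    Returns one of:
      'five_prime'          adapter starts at the first base (no insert before it)
      'internal'            full adapter starts after the first base
      'three_prime_partial' read ends with the first k bases of the adapter
      'none'                no exact match found
    """
    pos = read.find(adapter)
    if pos == 0:
        return "five_prime"
    if pos > 0:
        return "internal"
    # Look for a truncated adapter at the 3' end, longest overlap first.
    for k in range(len(adapter) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return "three_prime_partial"
    return "none"


def count_positions(reads):
    """Tally the classification over an iterable of read sequences."""
    return Counter(classify_adapter_position(r) for r in reads)
```

Calling count_positions on the sequence lines of a FASTQ file would give a tally per category, which makes it easier to see how many reads have the adapter at the 5' end versus further in.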