Hi Everyone,
I have a weird situation. I am getting high k-mer content at the 3' end (right end) of my Illumina reads. I have been using Trimmomatic 0.32, and it's been fairly easy to work with. The data are paired-end, and the problem is present in both the forward and reverse reads.
I have already filtered out adapter sequences using the most common Illumina adapters from iPlant's Illumina adapter file. I have also removed the overrepresented sequences. In the past, filtering out the overrepresented sequences usually took care of the k-mers, but this time it hasn't.
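To see which k-mers are actually piling up at the 3' end, one option is to count them directly. The following is only a minimal sketch: the function name `tail_kmer_counts`, the choice of k = 7, the tail length, and the toy reads are all assumptions made for illustration, not anything taken from my data or from a specific tool.

```python
from collections import Counter


def tail_kmer_counts(sequences, k=7, tail=30):
    """Count every k-mer that occurs in the last `tail` bases of each read."""
    counts = Counter()
    for seq in sequences:
        end = seq[-tail:]  # 3' end of the read (whole read if shorter than `tail`)
        for i in range(len(end) - k + 1):
            counts[end[i:i + k]] += 1
    return counts


# Toy reads, made up for illustration: all three share the same last 8 bases.
reads = [
    "ACGTTGCAAGATCGGA",
    "TTGACCGTAGATCGGA",
    "GGCATCAGAGATCGGA",
]
counts = tail_kmer_counts(reads, k=7, tail=8)
top = counts.most_common(2)
```

If the most frequent k-mers in the tail overlap into one longer sequence, comparing that sequence against the adapter file may show whether an adapter remnant is still present.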
What am I doing wrong? Should I trim the reads so that only 30 bp are left? I would be losing approximately 50 bp per read.
Also, the quality of the reads is great, with scores between 24 and 38.
Any ideas on how to fix this issue would be appreciated.
Thank you,
Zapages
EDIT: I was able to fix the k-mer issue by removing the last 30 bp of the reads. Was this the correct way of handling it, or should I have used a different strategy? Everything passed except for high duplication levels, which is to be expected because the data set is RNA-Seq.
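For reference, removing a fixed number of bases from the 3' end amounts to the operation sketched below. In Trimmomatic this kind of trimming is normally done with the CROP step, which cuts each read to a given length; the Python here is only an illustration of the idea, and the names `crop_tail` and `crop_fastq` as well as the toy record are hypothetical.

```python
def crop_tail(seq, qual, n=30):
    """Drop the last `n` bases and their quality scores from one read."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    keep = max(len(seq) - n, 0)  # never go below an empty read
    return seq[:keep], qual[:keep]


def crop_fastq(lines, n=30):
    """Crop every record in a list of FASTQ lines (4 lines per record).

    A trailing incomplete record, if any, is ignored.
    """
    lines = [ln.rstrip("\n") for ln in lines]
    out = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        seq, qual = crop_tail(seq, qual, n)
        out.extend([header, seq, plus, qual])
    return out


# Toy 40-base record, made up for illustration.
record = ["@read1", "ACGT" * 10, "+", "I" * 40]
cropped = crop_fastq(record, n=30)
```

Because the same number of bases is removed from every read and no reads are discarded, applying this to both files of a paired-end run keeps the forward and reverse reads in the same order.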