Hi everyone.
I'm new to the world of NGS and HTS, and I need some help understanding a few issues.
I've just received a set of RNA-Seq reads from an Illumina platform. I prepared the cDNA library for the sequencing run myself, starting from total RNA from a Gram-negative bacterium.
To get rid of the rRNA I used the RiboZero kit (for Gram-negative bacteria), and the treatment was really successful, except for low amounts of 5S rRNA that still remain. It seems this is normal with this kit.
The expert from the Genome Platform and my supervisor agreed to go ahead with the sequencing, and it turns out that the final data contain "5S rRNA contamination", which is not a surprise, of course.
I use CLC Genomics Workbench for the data analysis. According to the QC report, the sequencing is of very good quality. After trimming, about 40% of the reads were unmapped, which seems reasonable. The mapping was against a reference genome downloaded from NCBI, containing only the chromosome.
I ran the analysis both with "non-specific matches" (max hits = 10) and with "specific matches" (max hits = 1), and in both cases a huge number of reads map to a small region of about 119 bp: the 5S rRNA gene of one of the five rRNA operons in the genome.
When I run an RNA-Seq analysis against the same genome using the "rRNA" track instead of the "genes" or "CDS" tracks, about 18% of the reads map to that same region.
My question is the following. Should I be worried about the accuracy of the RNA-Seq experiment because of this 5S contamination? Should I repeat the sequencing after completely removing the remaining 5S rRNA from the samples? Or could I instead use the expression values obtained if I simply exclude tRNAs and rRNAs from the reference genome? (Running the analysis against a multi-FASTA file of all genes gave almost the same result as mapping against the whole genome.)
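To make the last option concrete, here is a minimal sketch of what I mean by excluding rRNAs and tRNAs before computing expression values. It is not what CLC does internally; it only assumes that I can export per-gene read counts, gene lengths and feature types, and it renormalises the remaining genes as TPM. The function name `tpm_excluding` and all gene names and numbers are hypothetical, made up for illustration.

```python
def tpm_excluding(counts, lengths, feature_types, excluded=("rRNA", "tRNA")):
    """Compute TPM over the genes that remain after dropping excluded feature types.

    counts        -- dict: gene name -> mapped read count (assumed exported from the mapper)
    lengths       -- dict: gene name -> gene length in bp
    feature_types -- dict: gene name -> annotation type, e.g. "CDS", "rRNA", "tRNA"
    """
    # Keep only genes whose annotation type is not in the excluded set.
    kept = [g for g in counts if feature_types.get(g) not in excluded]

    # Length-normalised read rate per gene (reads per base).
    rates = {g: counts[g] / lengths[g] for g in kept}

    total = sum(rates.values())
    if total == 0:
        return {g: 0.0 for g in kept}

    # Scale so that the kept genes sum to one million.
    return {g: rate / total * 1e6 for g, rate in rates.items()}


# Hypothetical toy data: one dominant 5S rRNA gene, one tRNA and two CDS.
counts = {"rrfA": 180000, "tRNA-Ala": 300, "geneA": 500, "geneB": 1500}
lengths = {"rrfA": 119, "tRNA-Ala": 76, "geneA": 1000, "geneB": 1500}
feature_types = {"rrfA": "rRNA", "tRNA-Ala": "tRNA", "geneA": "CDS", "geneB": "CDS"}

tpm = tpm_excluding(counts, lengths, feature_types)
for gene, value in sorted(tpm.items()):
    print(f"{gene}\t{value:.1f}")
```

In this sketch the 5S reads drop out of the normalisation entirely, so the relative values of the remaining genes do not depend on how much 5S is left in each sample. What it cannot fix is the sequencing depth those reads used up.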
Thank you very much in advance