Hello,
I need your help cleaning a sequencing dataset. I have a large FASTQ file from an Illumina sequencing run, and even though it was supposed to have been cleaned before I received it, when I run FastQC the overrepresented sequences section shows a long table with about 12 ribosomal RNA sequences that are reported with "No Hit" as the possible source (I do get hits when I BLAST them on the NCBI site). They are all similar; only a few bases differ between them. I also got odd results such as RNA PCR Primer and, of course, my adapters. I removed the adapters with a simple Cutadapt run, but the other overrepresented sequences are still there. How can I eliminate them? I assume Cutadapt is not suited to that, is it?
Thank you for any answers you can give me.
K.
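To make the request concrete, here is a minimal sketch of the kind of filtering being asked about: dropping every read that contains one of the overrepresented sequences copied from the FastQC table. It is only an illustration, not a recommended pipeline. The function names (`read_fastq`, `drop_contaminated`, `filter_fastq`) are hypothetical, the FASTQ is assumed to be uncompressed with exactly four lines per record, and matching is by exact substring, so each of the roughly 12 variants would have to be listed separately.

```python
def read_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ.

    Assumption: no wrapped sequence lines; reading stops at the first
    empty header line (normally end of file).
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def drop_contaminated(records, contaminants):
    """Keep only records whose sequence contains none of the contaminants.

    Matching is exact substring matching: a read that differs from a
    listed contaminant by even one base is NOT removed.
    """
    for record in records:
        seq = record[1]
        if not any(c in seq for c in contaminants):
            yield record


def filter_fastq(in_handle, out_handle, contaminants):
    """Write the surviving records to out_handle and return how many were kept."""
    kept = 0
    records = drop_contaminated(read_fastq(in_handle), contaminants)
    for header, seq, plus, qual in records:
        out_handle.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
        kept += 1
    return kept
```

In use, `filter_fastq` would be called with an open input file, an open output file, and the list of sequences taken from the FastQC report. Because the rRNA sequences here differ by a few bases, an approach that tolerates mismatches (for example, matching reads against an rRNA reference) would likely catch more of them than this exact-match sketch.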