Dear Colleagues,
I would like to share a strange experience with you and ask for your opinions and help.
I'm working on miRNA sequencing for an uncommon plant. The service company that did the sequencing gave me a FASTQ file.
I have already done RNA-seq analysis, so I'm quite familiar with several tools such as FastQC, Trimmomatic, Bowtie2, cuffdiff, etc.
I removed the 3' and 5' adapters with cutadapt, using the adapter sequences provided by the service company; quality control confirmed that these sequences were correct. After trimming, the read-length distribution has strong peaks between 19 and 39 bp, with some reads between 39 and 51 bp (the original read length, with adapters still attached).
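For anyone who wants to reproduce the length check, the distribution can be tallied directly from the trimmed FASTQ. This is only a minimal sketch: it assumes a plain four-line-per-record FASTQ, and the function name `read_length_histogram` is just illustrative, not part of any tool mentioned here.

```python
from collections import Counter


def read_length_histogram(fastq_lines):
    """Count reads per length, assuming strict 4-line FASTQ records."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        # Line 2 of every 4-line record holds the sequence.
        if i % 4 == 1:
            counts[len(line.strip())] += 1
    return counts


if __name__ == "__main__":
    # Tiny made-up example; in practice pass an open file handle instead.
    demo = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGTAC", "+", "IIIIII"]
    for length, n in sorted(read_length_histogram(demo).items()):
        print(length, n)
```

Passing an open file handle for the trimmed FASTQ instead of the demo list prints one line per read length, which makes it easy to see how many reads fall into each length class.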
I downloaded the hairpin.fa file from miRBase without filtering for a specific organism, changed every U to T, and removed lines containing ambiguous characters (Y, K, etc.).
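To make that preprocessing step concrete, here is a rough sketch of one way to do it. It is not the exact commands I ran: the function names are made up for illustration, and it drops a whole FASTA record when its sequence contains anything other than A, C, G or T, rather than deleting single lines (deleting single lines could leave a header without its full sequence).

```python
VALID_BASES = set("ACGT")


def parse_fasta(lines):
    """Yield (header, sequence) pairs from FASTA lines, joining wrapped sequences."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def clean_hairpins(lines):
    """Convert U to T and skip records with non-ACGT characters."""
    for header, seq in parse_fasta(lines):
        dna = seq.upper().replace("U", "T")
        if set(dna) <= VALID_BASES:
            yield header, dna


if __name__ == "__main__":
    demo = [">hp1", "ACGU", ">hp2", "ACYK"]
    for header, seq in clean_hairpins(demo):
        print(header)
        print(seq)
```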
First strange thing: the alignment rate against these hairpin sequences is very low, about 3%!
So I ran the alignment again, this time against the A. thaliana genome, and the alignment rate increased to 20%.
Second strange thing: when I run htseq-count to count the alignments, I get 0 for all miRNAs!
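One check that may be relevant here: htseq-count only counts features of a single type (by default `exon`, set with `--type`) and groups them by one attribute (by default `gene_id`, set with `--idattr`). Listing the feature types that actually occur in the annotation file shows whether those defaults match anything. The sketch below assumes a tab-separated GFF/GTF file with nine columns; the function name and the demo lines are hypothetical.

```python
from collections import Counter


def gff_feature_types(gff_lines):
    """Count how often each feature type (column 3) occurs in GFF/GTF lines."""
    counts = Counter()
    for line in gff_lines:
        # Skip comments, directives and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 9:
            counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    # Made-up annotation lines, for illustration only.
    demo = [
        "##gff-version 3",
        "Chr1\t.\tmiRNA_primary_transcript\t100\t200\t.\t+\t.\tID=x1;Name=demo-1",
        "Chr1\t.\tmiRNA\t120\t141\t.\t+\t.\tID=x2;Name=demo-1-5p",
    ]
    for feature_type, n in gff_feature_types(demo).items():
        print(feature_type, n)
```

If `exon` does not appear in the output, counts of 0 would be expected with the default settings.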
I'm sure I'm doing something wrong in one of the analysis steps... can someone help me?
Thanks in advance