
**wynstep** · 08-12-2014, 06:34 AM

Originally posted by wynstep

Dear Colleagues,
I want to share a strange experience with you and ask for your opinions and help.

I'm working on miRNA sequencing for an uncommon plant. A service company sent me the data as a FASTQ file.
I have done RNA-seq analysis before, so I'm quite familiar with tools such as FastQC, Trimmomatic, Bowtie2, Cuffdiff, etc.

I removed the 3' and 5' adapters with cutadapt, using the sequences provided by the service company. Quality control confirmed that the adapter sequences were correct. After trimming, the read-length distribution has large peaks between 19 and 39 bp, plus some reads between 39 and 51 bp (51 bp being the original read length, with adapters attached).
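A length distribution like the one described can be tabulated directly from the trimmed FASTQ file. The following is a minimal sketch, assuming a plain four-lines-per-record FASTQ; the function name, the demo reads, and the file name mentioned in the comment are all made up for illustration.

```python
import io
from collections import Counter

def read_length_histogram(fastq_handle):
    """Count reads per length in a FASTQ stream with four lines per record."""
    lengths = Counter()
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of each record holds the sequence
            lengths[len(line.strip())] += 1
    return lengths

# Made-up reads for illustration only; in practice pass open("trimmed.fastq"),
# where the file name is hypothetical.
demo_reads = ["TGACAGAAGAGAGTGAGCAC", "TGACAGAAGAGAGTGAGCACA", "TTGACAGAAGATAGAGAGCAC"]
demo_fastq = "".join(
    f"@read{n}\n{seq}\n+\n{'I' * len(seq)}\n" for n, seq in enumerate(demo_reads, 1)
)
histogram = read_length_histogram(io.StringIO(demo_fastq))
for length in sorted(histogram):
    print(f"{length} nt: {histogram[length]} reads")
```

Reads that remain at the full original length after trimming are worth a separate look, since they may simply be reads in which no adapter was found.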

I downloaded the hairpin.fa file from miRBase without filtering for a specific organism, converted every U to T, and removed lines containing ambiguous characters (Y, K, etc.).
First strange thing: when I align the trimmed reads against it, the alignment rate is very low, about 3%!
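The reference preparation described above (U to T conversion, then discarding entries with ambiguous bases) might look roughly like the sketch below. It is not the poster's actual script. One deliberate difference, stated as an assumption: it drops whole FASTA records rather than individual lines, because removing single lines from a multi-line record would silently shorten that hairpin. The function name and the demo sequences are invented.

```python
import io

def clean_hairpins(fasta_handle):
    """Yield (header, sequence) pairs with U converted to T.

    Records containing anything other than A, C, G, T after conversion
    are skipped entirely.
    """
    def finish(header, chunks):
        seq = "".join(chunks).upper().replace("U", "T")
        if header is not None and seq and set(seq) <= set("ACGT"):
            return header, seq
        return None

    header, chunks = None, []
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            record = finish(header, chunks)
            if record:
                yield record
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    record = finish(header, chunks)  # flush the last record
    if record:
        yield record

# Made-up records: the second contains Y and K and is therefore dropped.
demo_fasta = io.StringIO(
    ">demo-1\nUGACAGAAGA\nGAGUGAGCAC\n"
    ">demo-2\nUGACYGAAGAKAGUGAGCAC\n"
)
cleaned = list(clean_hairpins(demo_fasta))
for name, seq in cleaned:
    print(f">{name}\n{seq}")
```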

So I ran the alignment again, this time against the A. thaliana genome. The alignment rate increased to 20%.
Second strange thing: when I run htseq-count to count the alignments, I get 0 for all miRNAs!

I'm sure I'm doing something wrong in one of the analysis steps... Can someone help me?

Thanks in advance

Can anyone help me? Please!

**NextGenSeq** · 08-12-2014, 07:08 AM

The Illumina miRNA library kit is known to display ligation bias. There is probably something wrong with your library.

Reducing ligation bias of small RNAs in libraries for next generation sequencing - PubMed

http://www.ncbi.nlm.nih.gov/pubmed/22647250

Sequencing bias of small RNAs partially influenced which microRNAs have been studied in depth; therefore most previous small RNA profiling experiments should be re-evaluated. New microRNAs are likely to be found, which were selected against by existing adapters. Preference of currently used adapters …

**wynstep** · 08-12-2014, 07:22 AM


Thank you very much for your help!
So, what do you suggest? How should I proceed to remove or reduce the ligation bias?

Thank you!

**wynstep** · 08-12-2014, 10:20 AM

If anyone wants, I can attach the FastQC reports from after 3' adapter trimming to give a better overview of my strange situation. I hope someone can help me, because I have run out of ideas on how to solve this problem.

Tried adapter trimming with: Trimmomatic, cutadapt, fastx_clipper, novoalign, etc.
Tried mapping with: bowtie, bowtie2, miRDeep2, etc.
For now I only want to know whether there are any known miRNAs...

The only thing I did not try is BLAST.

Please help!

**NextGenSeq** · 08-13-2014, 08:42 AM


The paper at the link describes how to reduce ligation bias.

The Bioo Small RNA kit uses this method for Illumina platforms.

Ion Torrent has used that method for a couple of years for the PGM and Proton sequencers.

**Anton1** · 08-13-2014, 02:39 PM

This seems quite normal to me, since the major population of sRNAs in plants (like A. thaliana) is not miRNAs but siRNAs (a mixture of 21-, 22- and 24-mers), which are not well conserved and are arranged in clusters along the genome. I would guess that your mapped reads include miR390, miR168 and others, since they are well conserved in A. thaliana, as well as particular clusters of siRNAs that are also conserved.

**wynstep** · 08-14-2014, 02:24 AM


I've read the paper you suggested, but I didn't find any bioinformatics advice on how to treat raw sequencing data "affected" by Illumina adapter ligation bias... Am I missing something important in the paper, or do they focus only on an experimental solution (that is, only on library preparation)?

Thanks for your help!

**kerplunk412** · 08-28-2014, 01:29 PM

Hi wynstep,
I have seen that low-mapping libraries can sometimes be attributed to some sort of artifact product that is taking up many of your reads. If this artifact is present in many of your reads, you should be able to find it with FastQC in the overrepresented sequences section. You will probably need to do this after adapter trimming, as otherwise I think the only overrepresented sequences that will be reported are from the 3' adapter. Also, you may want to just try BLASTing some random sequences from your data to see if you can get an indication of what they represent.
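Both checks suggested here can also be approximated outside FastQC. The sketch below assumes the trimmed read sequences have already been loaded into a Python list; the function names, the seed, and the demo sequences are hypothetical. It lists the most abundant identical sequences, then draws a small random sample formatted as FASTA that could be pasted into a BLAST search.

```python
import random
from collections import Counter

def top_sequences(reads, n=5):
    """Return the n most frequent identical read sequences with their counts."""
    return Counter(reads).most_common(n)

def sample_as_fasta(reads, k=3, seed=0):
    """Pick k reads at random (fixed seed for repeatability) and format as FASTA."""
    rng = random.Random(seed)
    picked = rng.sample(reads, min(k, len(reads)))
    return "".join(f">sample_{i}\n{seq}\n" for i, seq in enumerate(picked, 1))

# Made-up data: one sequence dominates, mimicking an artifact product.
artifact = "ACGTACGTACGTACGTACGT"
demo_reads = [artifact] * 6 + [
    "TGACAGAAGAGAGTGAGCAC",
    "TTGACAGAAGATAGAGAGCAC",
    "TCGGACCAGGCTTCATTCCCC",
]

for seq, count in top_sequences(demo_reads, n=3):
    print(f"{count}\t{seq}")
print(sample_as_fasta(demo_reads, k=3))
```

If a single sequence accounts for a large share of the reads, BLASTing that one sequence first is probably more informative than a random sample.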


miRNA Illumina sequencing - low alignment rate
