SEQanswers

Old 08-10-2014, 04:25 AM   #1
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11
miRNA Illumina sequencing - low alignment rate

Dear Colleagues,
I want to share a strange experience with you and ask for your opinions and help.

I'm working on miRNA sequencing for an uncommon plant. A service company sent me the data as a FASTQ file.
I have done RNA-seq analysis before, so I'm quite familiar with tools such as FastQC, Trimmomatic, Bowtie2, Cuffdiff, etc.

I removed the 3' and 5' adapters with cutadapt, using the sequences provided by the service company. Quality control confirmed that the adapter sequences were correct. After trimming, the read-length distribution has strong peaks between 19 and 39 bp, and some reads fall between 39 and 51 bp (the original read length, with adapters attached).
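For readers who want to check a read-length distribution like the one described above without opening FastQC, here is a minimal Python sketch. It assumes a plain, uncompressed FASTQ with exactly four lines per record; the tiny in-memory example and the file name mentioned in the comment are made up for illustration and are not the poster's data.

```python
from collections import Counter
import io

def read_length_counts(handle):
    """Count read lengths in a FASTQ stream with four lines per record."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the sequence is the second line of each record
            counts[len(line.strip())] += 1
    return counts

# Made-up example reads; in practice something like open("trimmed.fastq")
# (hypothetical file name) would be passed instead.
example = io.StringIO(
    "@r1\nACGTACGTACGTACGTACGTA\n+\nIIIIIIIIIIIIIIIIIIIII\n"
    "@r2\nACGTACGTACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIIIIIIIIIII\n"
    "@r3\nACGTACGTACGTACGTACGTA\n+\nIIIIIIIIIIIIIIIIIIIII\n"
)
counts = read_length_counts(example)
for length in sorted(counts):
    print(length, counts[length])
```

A long tail of reads at or near the full read length after trimming may simply be reads in which no adapter was found, so it can be worth looking at those separately.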

I downloaded the hairpin.fa file from miRBase without filtering for a specific organism, converted all U to T, and removed lines containing ambiguous characters (Y, K, etc.).
First strange thing: the alignment rate is very low, about 3%!
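One detail of the reference preparation described above: if hairpin sequences are wrapped over several lines, deleting individual lines that contain ambiguity codes can leave a header with a truncated sequence behind. Whether that happened here is not known. The following Python sketch shows one way to do the U-to-T conversion while dropping whole records instead of single lines; the two records in the example are invented, not real miRBase entries.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def clean_hairpins(handle):
    """Convert U to T and drop whole records that contain non-ACGT letters."""
    kept = []
    for header, seq in read_fasta(handle):
        dna = seq.upper().replace("U", "T")
        if set(dna) <= set("ACGT"):
            kept.append((header, dna))
    return kept

# Invented records for illustration only
example = io.StringIO(
    ">hp1 made-up record\nUGAGGUAG\nUAGGUUGU\n"
    ">hp2 made-up record with an ambiguity code\nACGUYACG\n"
)
cleaned = clean_hairpins(example)
for header, dna in cleaned:
    print(header)
    print(dna)
```

Writing each kept sequence on a single line, as here, also avoids the wrapped-line problem in later steps.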

So I ran the alignment again, this time against the A. thaliana genome. The alignment rate increased to 20%.
Second strange thing: when I run htseq-count to count the alignments, I get 0 for all miRNAs!

I'm sure I'm doing something wrong in one of the analysis steps... can someone help me?

Thanks in advance
Old 08-12-2014, 07:34 AM   #2
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11

Can anyone help me? Please!
Old 08-12-2014, 08:08 AM   #3
NextGenSeq
Senior Member
 
Location: USA

Join Date: Apr 2009
Posts: 482

The Illumina miRNA library kit is known to display ligation bias. There is probably something wrong with your library.

http://www.ncbi.nlm.nih.gov/pubmed/22647250
Old 08-12-2014, 08:22 AM   #4
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11

Quote:
Originally Posted by NextGenSeq View Post
The Illumina miRNA library kit is known to display ligation bias. There is probably something wrong with your library.

http://www.ncbi.nlm.nih.gov/pubmed/22647250
Thank you very much for your help!
So, what do you suggest? How should I proceed to remove or reduce ligation bias?

Thank you!
Old 08-12-2014, 11:20 AM   #5
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11

If anyone wants, I can attach the FastQC reports from after 3' adapter trimming, to give a better overview of my strange situation. I hope someone can help me, because I have run out of ideas on how to solve this problem.

I tried adapter trimming with: Trimmomatic, cutadapt, fastx_clipper, novoalign, etc.
I tried mapping with: bowtie, bowtie2, miRDeep2, etc.
For now I only want to know whether there are any known miRNAs...

The only thing I did not try is BLAST.

Please help!
Old 08-13-2014, 09:42 AM   #6
NextGenSeq
Senior Member
 
Location: USA

Join Date: Apr 2009
Posts: 482

Quote:
Originally Posted by wynstep View Post
Thank you very much for your help!
So, what do you suggest? How should I proceed to remove or reduce ligation bias?

Thank you!
The paper at the link describes how to reduce ligation bias.

The Bioo Small RNA kit uses this method for Illumina platforms.

Ion Torrent has used that method for a couple years for the PGM and Proton sequencers.
Old 08-13-2014, 03:39 PM   #7
Anton1
Junior Member
 
Location: France

Join Date: Sep 2010
Posts: 1

This seems quite normal to me, since the major population of sRNAs in plants (such as A. thaliana) is not miRNAs but siRNAs (a mixture of 21-, 22- and 24-mers), which are not well conserved and are arranged along the genome in clusters. I guess you have miR390, miR168 and others among your mapped miRNAs, since they are well conserved in A. thaliana, as are particular clusters of siRNAs.
Old 08-14-2014, 03:24 AM   #8
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11

Quote:
Originally Posted by NextGenSeq View Post
The paper at the link describes how to reduce ligation bias.

The Bioo Small RNA kit uses this method for Illumina platforms.

Ion Torrent has used that method for a couple years for the PGM and Proton sequencers.
I've read the paper you suggested, but I didn't find any bioinformatics suggestions on how to treat raw sequencing data "affected" by Illumina adapter ligation bias... Am I missing something important in the paper, or do they focus only on an experimental solution (only on library preparation, I mean)?

Thanks for your help!
Old 08-28-2014, 02:29 PM   #9
kerplunk412
Senior Member
 
Location: Bioo Scientific, Austin, TX, USA

Join Date: Jun 2012
Posts: 119

Hi wynstep,
I have seen that low-mapping libraries can sometimes be attributed to some sort of artifact product that is taking up many of your reads. If this artifact is present in many of your reads, you should be able to find it with FastQC in the overrepresented sequences section. You will probably need to do this after adapter trimming, as otherwise I think the only overrepresented sequences that will be reported are from the 3' adapter. Also, you may want to just try BLASTing some random sequences from your data to see if you can get an indication of what they represent.
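As a rough manual counterpart to the two checks suggested above, the Python sketch below lists the most frequent sequences in an adapter-trimmed FASTQ and writes a small random sample in FASTA format that could be pasted into BLAST. It is only an illustration under simple assumptions (plain four-line FASTQ records, everything held in memory); the reads in the example are invented, and FastQC's own overrepresented-sequences report remains the easier route.

```python
from collections import Counter
import io
import random

def read_sequences(handle):
    """Return the sequence line of every four-line FASTQ record."""
    return [line.strip() for i, line in enumerate(handle) if i % 4 == 1]

def top_sequences(seqs, n=5):
    """Return (sequence, count, fraction of all reads) for the n most common reads."""
    total = len(seqs)
    return [(s, c, c / total) for s, c in Counter(seqs).most_common(n)]

def sample_as_fasta(seqs, k, seed=0):
    """Pick k random reads and format them as FASTA text for a BLAST search."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    picked = rng.sample(seqs, min(k, len(seqs)))
    return "".join(f">sample_{i}\n{s}\n" for i, s in enumerate(picked, 1))

# Invented reads: one sequence dominating the library, plus one other read
records = [
    ("r1", "TTTTGGGGCCCCAAAATTTTG"),
    ("r2", "TTTTGGGGCCCCAAAATTTTG"),
    ("r3", "TTTTGGGGCCCCAAAATTTTG"),
    ("r4", "ACGTACGTACGTACGTACGTACGT"),
]
example = io.StringIO(
    "".join(f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n" for name, seq in records)
)
seqs = read_sequences(example)
top = top_sequences(seqs, n=1)
fasta = sample_as_fasta(seqs, k=2)
print(top)
print(fasta)
```

If a single sequence accounts for a large fraction of the reads, that sequence is the first thing worth BLASTing.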