SEQanswers

Old 12-10-2010, 12:29 PM   #1
naluru
Member
 
Location: Woods Hole, Massachusetts

Join Date: Jul 2010
Posts: 16
SmallRNA analysis pipeline

I am a little confused by the smallRNA analysis pipeline used for analyzing SOLiD results.

Here is what I have read everywhere.

1. Trim the adaptors and convert from csfasta to csfastq/fastq
2. Align them to the genome
3. Match them with miRBase to get the number of counts per miRNA sequence.


My question is: why do you need to align to the reference genome? Why can't we just find the unique read sequences and then BLAST them against miRBase to get the counts?

The unaligned reads could then be BLASTed against the genome to discover any new miRNAs.
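The "find the unique read sequences" step amounts to collapsing identical reads and keeping a count for each. A minimal Python sketch, assuming the reads have already been trimmed and loaded into a list of strings (the function name `collapse_reads` and the toy color-space reads are illustrative, not part of any SOLiD tool):

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical read sequences into (sequence, count) pairs,
    most abundant first."""
    return Counter(reads).most_common()


# Toy color-space reads (made up for illustration).
reads = ["0123012", "0123012", "3321001", "0123012", "3321001", "1002331"]
collapsed = collapse_reads(reads)
for seq, count in collapsed:
    print(seq, count)
```

Each unique sequence then only needs to be aligned once, with its count carried along to the annotation step.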

I am really new to this and cannot seem to find a reasonable explanation for aligning first to the reference genome.

If someone can explain this to me or suggest a paper, that would be great!

Thank you
Old 12-11-2010, 07:36 AM   #2
jjohnson
Member
 
Location: Washington DC Metro Area

Join Date: Aug 2009
Posts: 20

You actually align to the genome after aligning to miRBase. The steps are: first filter against a known set of tRNA and rRNA sequences, then align whatever is not filtered to miRBase. Reads that do not match there are aligned to the entire genome for novel miRNA discovery. You get these reads annotated and returned in a GFF-like file. The reason not to use BLAST is that short-read aligners are much better tuned to this type of data, and even if you tuned BLAST to handle short reads better, it does not work in color space.
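The ordering described here (filter, then miRBase, then genome) can be sketched as a tiered assignment in which each read goes to the first tier it matches. This is only an illustration of the ordering: exact string lookup stands in for a real color-space short-read aligner, and the function name, tier names, and sequences are all made up.

```python
def classify_reads(read_counts, filter_seqs, mirna_seqs):
    """Assign each unique read to the first tier it matches.

    Exact string lookup is a stand-in for a real short-read aligner.
    """
    tiers = {"filtered": {}, "known_mirna": {}, "genome_candidates": {}}
    for seq, count in read_counts.items():
        if seq in filter_seqs:        # tier 1: tRNA/rRNA filter set
            tiers["filtered"][seq] = count
        elif seq in mirna_seqs:       # tier 2: known miRNAs
            tiers["known_mirna"][seq] = count
        else:                         # tier 3: left for genome alignment
            tiers["genome_candidates"][seq] = count
    return tiers


# Toy data only: made-up sequences, not real tRNA/rRNA or miRBase entries.
read_counts = {"AAAACCCCGGGG": 40, "ACGTACGTACGT": 120, "TTTTGGGGCCCC": 3}
filter_seqs = {"AAAACCCCGGGG"}
mirna_seqs = {"ACGTACGTACGT"}

tiers = classify_reads(read_counts, filter_seqs, mirna_seqs)
for name, members in tiers.items():
    print(name, members)
```

Only the reads left in the last tier would go on to whole-genome alignment for novel discovery.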
__________________
Justin H. Johnson | Twitter: @BioInfo | LinkedIn: http://bit.ly/LIJHJ | EdgeBio
Old 12-11-2010, 08:44 AM   #3
naluru
Member
 
Location: Woods Hole, Massachusetts

Join Date: Jul 2010
Posts: 16

Thank you very much for the reply. That is exactly how I thought the pipeline should be. But most published papers align to the genome first and then to miRBase. That's what confused me.

Thanks again.
Old 12-11-2010, 08:00 PM   #4
rdeborja
Member
 
Location: Toronto

Join Date: Aug 2008
Posts: 42
miRNA * non* forms and hairpin

I've been working with the SOLiD smRNA pipeline and have been questioning the genome alignment portion.

I've been using Bioscope to align the smRNA reads directly to the whole genome as 1x36 reads. I don't need to trim adapters because of the seed-and-extend process that Bioscope uses. I've been manually fishing for reads that align to the human genome, then searching the flanking +/- 100 bp for a secondary hit with the same read. This would hopefully identify both forms of the miRNA as well as the loop. Novoalign performs this adequately on Illumina reads; has anyone seen anything similar for SOLiD smRNA reads?
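The manual flank search could be automated along these lines. This is a rough sketch under several assumptions: hits have already been parsed into records with half-open coordinates, a partner must lie on the same chromosome and strand, and it must fall entirely within the +/- 100 bp window without overlapping the primary hit. The `Hit` record and `flanking_partners` function are hypothetical and not part of Bioscope or Novoalign.

```python
from collections import namedtuple

# Half-open coordinates [start, end); count is the number of supporting reads.
Hit = namedtuple("Hit", "chrom start end strand count")


def flanking_partners(primary, hits, flank=100):
    """Return hits on the same chromosome and strand that lie within
    `flank` bp of `primary` without overlapping it."""
    window_start = primary.start - flank
    window_end = primary.end + flank
    partners = []
    for h in hits:
        if h.chrom != primary.chrom or h.strand != primary.strand:
            continue
        overlaps = h.start < primary.end and primary.start < h.end
        inside = h.start >= window_start and h.end <= window_end
        if inside and not overlaps:  # the primary hit overlaps itself, so it is skipped
            partners.append(h)
    return partners


# Toy alignments (made-up coordinates).
hits = [
    Hit("chr1", 1000, 1022, "+", 500),  # primary hit
    Hit("chr1", 1040, 1061, "+", 12),   # within the flank: candidate second arm
    Hit("chr1", 1010, 1030, "+", 2),    # overlaps the primary: ignored
    Hit("chr1", 5000, 5022, "+", 3),    # too far away
    Hit("chr2", 1040, 1061, "+", 7),    # different chromosome
]
partners = flanking_partners(hits[0], hits)
print(partners)
```

A candidate pair found this way would still need a hairpin-folding check before being called a miRNA precursor.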
Old 12-12-2010, 09:47 PM   #5
snetmcom
Senior Member
 
Location: USA

Join Date: Oct 2008
Posts: 158

There is a version of Novoalign for SOLiD reads. I imagine you could configure it similarly.
Old 12-13-2010, 12:59 AM   #6
KevinLam
Senior Member
 
Location: SEA

Join Date: Nov 2009
Posts: 203

Quote:
Originally Posted by rdeborja
I've been working with the SOLiD smRNA pipeline and have been questioning the genome alignment portion.

I've been using Bioscope to align the smRNA reads directly to the whole genome as 1x36 reads. I don't need to trim adapters because of the seed-and-extend process that Bioscope uses. I've been manually fishing for reads that align to the human genome, then searching the flanking +/- 100 bp for a secondary hit with the same read. This would hopefully identify both forms of the miRNA as well as the loop. Novoalign performs this adequately on Illumina reads; has anyone seen anything similar for SOLiD smRNA reads?
Hi, just checking: are you using the whole transcriptome pipeline in Bioscope, or just the resequencing pipeline?

I would think that for small RNAs 21-22 nt long you would need to trim the adaptors. (Or are you saying that the seed-and-extend process only maps up to the end of the small RNA and ignores the adaptor sequence?)
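To make the trimming question concrete: in color space the adaptor has to be searched for as colors, and the one color spanning the insert/adaptor junction depends on the last base of the insert, so it is dropped along with the adaptor. A minimal sketch, assuming exact matching (real reads contain color errors, which dedicated trimmers tolerate) and a csfasta-style read consisting of a primer base followed by colors. The adaptor sequence below is a hypothetical placeholder, not a real SOLiD adaptor, and the function names are illustrative.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_colors(bases):
    """Encode a base sequence as the di-base colors between adjacent bases
    (color = XOR of the 2-bit codes of the two bases)."""
    return "".join(
        str(BASE_TO_BITS[a] ^ BASE_TO_BITS[b]) for a, b in zip(bases, bases[1:])
    )


def trim_adaptor(csread, adaptor_bases, min_match=8):
    """Trim a 3' adaptor from a read given as primer base + colors.

    Searches for the first `min_match` adaptor colors (exact match only) and
    cuts one color earlier, because the junction color depends on the last
    insert base. Returns the read unchanged if no trim point is found.
    """
    primer, colors = csread[0], csread[1:]
    probe = encode_colors(adaptor_bases)[:min_match]
    pos = colors.find(probe)
    if pos < 1:  # not found, or no room for a junction color before the match
        return csread
    return primer + colors[: pos - 1]


# Toy example: hypothetical insert and placeholder adaptor.
insert = "TGAGGTAG"
adaptor = "ACGGTCATTGCA"
csread = "T" + encode_colors("T" + insert + adaptor)
trimmed = trim_adaptor(csread, adaptor)
print(csread, "->", trimmed)
```

With a 21-22 nt insert in a 35-color read, only about a dozen adaptor colors are present, so `min_match` has to stay small; that also raises the risk of chance matches.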

For WT mapping you can provide a filter reference, which includes adaptors as well as sequences such as rRNAs and tRNAs.

I am guessing that spurious hits where the small RNA bridges exon-intron-exon boundaries are unlikely, but I am unsure of this.