#1 |
Member
Location: Oxford Join Date: Feb 2009
Posts: 17
Hi,
Can bowtie be used for mapping miRNAs to the genome, and if so, what are the best parameters to use? I have FASTQ files from which I have removed the adapter sequence, leaving 18-23mers. Would bowtie -l 18 --best --strata be appropriate? Thanks.
#2 |
Member
Location: Cambridge, UK Join Date: Jun 2009
Posts: 14
We've been using (to get the top 101 exact matches):
bowtie -k 101 -v 0
Our workflow uniquifies the sequences before alignment, so we're not concerned about quality values. I'm also guessing that the miRNA sequences are sufficiently conserved for us not to worry about mismatches. However, I'm very interested in the views of others on this.
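For readers wondering what "uniquifying" might look like in practice, here is a minimal sketch in Python. It is not the poster's actual workflow: the function names and the `seqN_xCOUNT` ID scheme are assumptions made for illustration. It collapses identical reads into one record each and keeps the read count in the FASTA ID so that abundance can be recovered after alignment.

```python
from collections import Counter


def collapse_reads(sequences):
    """Count identical read sequences, most abundant first."""
    counts = Counter(sequences)
    # Sort by descending count, then by sequence, for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def to_fasta(collapsed):
    """Render collapsed reads as FASTA; the ID scheme is hypothetical."""
    lines = []
    for i, (seq, count) in enumerate(collapsed, start=1):
        lines.append(f">seq{i}_x{count}")
        lines.append(seq)
    return "\n".join(lines) + "\n"


reads = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TGAGGTAGTAGGTTGTATAGTT",
    "TAGCTTATCAGACTGATGTTGA",
]
collapsed = collapse_reads(reads)
print(to_fasta(collapsed), end="")
```

Because the output is FASTA rather than FASTQ, quality values are dropped, which fits the remark above about not being concerned with them.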
#3 |
Member
Location: china Join Date: Nov 2009
Posts: 67
In our deepBase database, we use the options -k 200 -v 0. Specifying these parameters instructs Bowtie to report up to 200 perfect hits for each read.
deepBase is a platform for annotating and discovering small and long ncRNAs from next generation sequencing data. It is available at http://deepbase.sysu.edu.cn
#4 |
Member
Location: wenzhou.zhejiang.china Join Date: Apr 2009
Posts: 23
Are you looking for this?
http://seqanswers.com/forums/showthr...light=mirtools
Last edited by houhuabin; 02-02-2010 at 07:58 AM.
#5 |
Member
Location: Cambridge, UK Join Date: Jun 2009
Posts: 14
Could well be. However, the link is broken. I would be very grateful if you could fix it. Thanks!
#6 |
Member
Location: wenzhou.zhejiang.china Join Date: Apr 2009
Posts: 23
Sorry for that, now it is fixed.
Thanks!
Last edited by houhuabin; 02-02-2010 at 08:03 AM.
#7 |
Member
Location: Cambridge, UK Join Date: Jun 2009
Posts: 14
After a few days of struggling with quality/homopolymer/adaptor trimming of my reads, and reading about 3' RNA edits and so forth, I've decided to try something similar to staylor's original suggestion (similar to the algorithm used by miRanalyzer):
bowtie -n 0 -l 15 --best
This should give the best match(es) for an exact 15bp 5' seed. If anyone is interested in a direct comparison between this and the original (-v 0) parameters, or has another view on this, please let me know.
#9 |
Member
Location: Cambridge, UK Join Date: Jun 2009
Posts: 14
In terms of post-processing, we're loading the alignments into an Ensembl database so that we can screen for known genes and repeats. We then predict novel small RNAs, and estimate transcript counts for all loci based on read coverage. It's designed to be a generic pipeline for metazoa. As everything is in an Ensembl database, the results can be browsed and ad-hoc reports generated.
#10 |
Member
Location: Oxford Join Date: Feb 2009
Posts: 17
whsqwghlm - how did you get on with the mapping? Did the parameters work?
#11 |
Member
Location: Cambridge, UK Join Date: Jun 2009
Posts: 14
Yes! We ended up using:
bowtie -n 0 -l 15 -e 99999 -k 200 --best --chunkmbs 128
We then post-processed the alignments to take the one with the longest 5' exact match (I could not find a way to get bowtie to do this natively). The preparation of our library helped - it had been poly-A filled, and the 3' primer was terminated with a poly-T chain. We did not bother to poly-A trim the reads (i.e. remove the primer) as we did not want to lose any 'real' As at the end of sequences. I'm still generating comparisons with other bowtie configs, and I also need to test the pipeline against a GEO data set with 'normal' primers.
#12 |
Member
Location: Oxford Join Date: Feb 2009
Posts: 17
Ah excellent. I will try that. Thanks for the tip!
#13 |
Senior Member
Location: USA Join Date: Jan 2008
Posts: 482
#14 |
Member
Location: Cambridge, UK Join Date: Jun 2009
Posts: 14
We're aligning against the whole genome. Reads that do not align to the genome are then aligned to miRBase (all species), just in case the assembly is incomplete.
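A rough sketch of how such a two-stage run might be wired up, written in Python purely to assemble the command lines. It assumes bowtie's `--un` option is used to write the reads that fail to align to the genome, and that the same options quoted elsewhere in this thread are reused for the miRBase pass. Both are assumptions; the poster does not say how the second pass was configured. All index and file names are hypothetical.

```python
import shlex

# Options quoted elsewhere in this thread; reusing them for the miRBase
# pass is an assumption, not something the poster confirmed.
BOWTIE_OPTS = ["-n", "0", "-l", "15", "-e", "99999", "-k", "200",
               "--best", "--chunkmbs", "128"]


def two_stage_commands(reads, genome_index, mirbase_index):
    """Build (but do not run) the genome pass and the miRBase fallback pass."""
    unaligned = "unaligned_to_genome.fq"  # hypothetical file name
    genome_cmd = ["bowtie", *BOWTIE_OPTS, "--un", unaligned,
                  genome_index, reads, "genome_hits.map"]
    # Second pass: only the reads that did not align to the genome.
    mirbase_cmd = ["bowtie", *BOWTIE_OPTS,
                   mirbase_index, unaligned, "mirbase_hits.map"]
    return genome_cmd, mirbase_cmd


genome_cmd, mirbase_cmd = two_stage_commands("reads.fq", "genome", "mirbase_all")
for cmd in (genome_cmd, mirbase_cmd):
    print(shlex.join(cmd))
```

The commands are only printed here; passing each list to `subprocess.run` would execute them on a machine where bowtie and the two indexes exist.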
#15 |
Member
Location: Oxford Join Date: Feb 2009
Posts: 17
So are you filtering on the one with the smallest NM value with the longest read?
If you get multiple matches and they all score equally, do you pick one at random?
#16 |
Member
Location: Cambridge, UK Join Date: Jun 2009
Posts: 14
Smallest NM value? Sorry - you lost me...
The idea is to record the hit(s) with the longest identical 5' match(es) to the genome, the theory being that primer artefacts, sequencing errors and RNA edits are all concentrated at the 3' end. We also assume that natural variation is absent for miRNAs. If we get multiple matches with the same score, then all of the matches are recorded.
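The selection rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the poster's pipeline: it assumes each hit already comes with the reference sequence under the read, in read orientation, and the function names and loci are made up. Ties are all kept, as described.

```python
def five_prime_match_length(read, ref):
    """Number of leading bases of `read` that are identical to `ref`."""
    n = 0
    for a, b in zip(read, ref):
        if a != b:
            break
        n += 1
    return n


def best_hits(read, hits):
    """Keep every hit tied for the longest exact 5' match.

    `hits` is a list of (locus, ref_seq) pairs; ref_seq is assumed to be
    the reference sequence under the read, in read orientation.
    """
    scored = [(five_prime_match_length(read, ref), locus) for locus, ref in hits]
    if not scored:
        return []
    top = max(score for score, _ in scored)
    return [(locus, score) for score, locus in scored if score == top]


# A 22-base read followed by four non-templated As (toy data).
read = "TGAGGTAGTAGGTTGTATAGTT" + "AAAA"
hits = [
    ("chr1:100", "TGAGGTAGTAGGTTGTATAGTT" + "CCGT"),
    ("chr2:500", "TGAGGTAGTAGGTTGTAT" + "GGTTAAAA"),
    ("chr3:900", "TGAGGTAGTAGGTTGTATAGTT" + "GGCA"),
]
print(best_hits(read, hits))
```

In this toy case the first and third loci both match for 22 bases before the 3' tail diverges, so both are reported, while the second locus is dropped because it diverges after 18 bases even though its 3' end matches.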
#17 |
Member
Location: Oxford Join Date: Feb 2009
Posts: 17
Clearly not using NM then! :-) In SAM format, NM is the number of nucleotide differences from the reference sequence (the edit distance). I thought it might be a useful tag for filtering. Or do you just count the length of the match?
Do you reject sequences longer than 22bp?
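For anyone who wants to try the NM idea, here is a small Python sketch that reads the `NM:i:` optional field from SAM alignment lines and keeps, for each read, the alignment(s) with the smallest value. It assumes the aligner writes SAM with an NM tag. The helper names and the toy records are hypothetical, and this is a different filter from the longest-5'-match rule discussed in this thread.

```python
def nm_value(sam_line):
    """Return the NM:i value of a SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional fields start at column 12 (index 11), as TAG:TYPE:VALUE.
    for field in fields[11:]:
        if field.startswith("NM:i:"):
            return int(field[len("NM:i:"):])
    return None


def keep_min_nm(sam_lines):
    """For each read, keep every alignment tied for the smallest NM."""
    by_read = {}
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        nm = nm_value(line)
        if nm is None:  # no NM tag, so nothing to rank on
            continue
        name = line.split("\t", 1)[0]
        by_read.setdefault(name, []).append((nm, line))
    kept = []
    for hits in by_read.values():
        best = min(nm for nm, _ in hits)
        kept.extend(line for nm, line in hits if nm == best)
    return kept


def sam(name, pos, nm):
    """Build a toy SAM record (hypothetical data for the example)."""
    seq = "TGAGGTAGTAGGTTGTATAGTT"
    return "\t".join([name, "0", "chr1", str(pos), "255", f"{len(seq)}M",
                      "*", "0", "0", seq, "I" * len(seq), f"NM:i:{nm}"])


lines = [sam("r1", 100, 1), sam("r1", 500, 0), sam("r1", 900, 0)]
kept = keep_min_nm(lines)
print([line.split("\t")[3] for line in kept])
```

Note that NM counts differences anywhere in the read, so a single 3' mismatch and a single 5' mismatch score the same.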
#18 |
Junior Member
Location: Poland Join Date: Jun 2010
Posts: 8
I am new to the miRNA field and I am wondering why you are using the -k 200 (or -k 101) option. In other words, why do you want up to 200 alignments with 0 mismatches, rather than one unique alignment with 0 mismatches?
Thanks!
tomek
#19 |
Member
Location: Cambridge, UK Join Date: Jun 2009
Posts: 14
Each read may map exactly to many places in the genome. We want to capture all of these locations up to a promiscuity threshold, typically 100, above which we discard all of the mappings for that read (i.e. if 101 alignments are returned from the search).
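A minimal Python sketch of that threshold, assuming the alignments have already been parsed into (read ID, locus) pairs; the function name and data layout are hypothetical. With -k 101, a read that comes back with 101 hits has at least 101 locations, so it is dropped entirely.

```python
from collections import defaultdict


def filter_promiscuous(alignments, max_hits=100):
    """Drop every read that has more than `max_hits` alignments."""
    by_read = defaultdict(list)
    for read_id, locus in alignments:
        by_read[read_id].append(locus)
    return {read_id: loci for read_id, loci in by_read.items()
            if len(loci) <= max_hits}


# readA maps to two places; readB hits the -k 101 reporting ceiling.
alignments = [("readA", "chr1:100"), ("readA", "chr7:2000")]
alignments += [("readB", f"chr2:{i}") for i in range(101)]
kept = filter_promiscuous(alignments)
print(sorted(kept))
```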
#20 |
Member
Location: china Join Date: Nov 2009
Posts: 67
You can use the options -a -m 100. Specifying -m 100 instructs bowtie to refrain from reporting any alignments for reads having more than 100 reportable alignments.
Tags |
bowtie parameters mirna |