Anywhere from ten minutes to hours, depending on your seed, gap setting, and number of mismatches. It's very tunable. I haven't used it on human, just C. elegans, and my average runs are ~5 hrs per lane of 5 million quality-filtered reads, with a 1 bp gap, 2 bp mismatch, and 8 bp seed.
-
I see. I have tried it for human miRNA libraries and it is quite slow (especially if you don't have access to a cluster). I would strongly advise using Novoalign for miRNAs instead of SOAP, but I believe the RAM demands for Novoalign are much higher (and perhaps out of reach for some people).
Ryan
-
novoalign
I'd highly recommend novoalign as well. It uses quality scores, supports gapped alignments for indels, etc.
Even though we are an academic institution, we bought the multithreaded version. It runs ~16X faster on our 16-core server. The cost was considerably less than a single-end flowcell, and when you have already bought the instrument, server, storage... (and Colin has his family to feed).
dvh
PS: RAM demands depend on how big you build the index. We use a 10G human genome DNA reference with no problem on the 'standard' 32G server Illumina recommends.
-
Sure, but it is a fiddle to split the reads, rejoin, etc., especially with multiple lanes and flowcells to analyse. There are also some new features available only in the multithreaded version, plus enhanced support. Anyway, you should ask the novocraft people these questions. I'm just very pleased with the results we get.
dvh
-
I am so happy to see such a healthy discussion about something I've been working on for most of this year.
In my experience with small RNA, I found most of these adaptors on the three-prime end. I would subsequently map the reads with quality scores to the reference genome, or alternatively use the tag counts as shown above.
I would usually achieve higher numbers of mapped sequences with the tagCount sequences. The yield is also dependent on how good your sequencing operator is (and this can vary a great deal).
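For anyone unsure what the tag-count route looks like in practice, here is a minimal Python sketch, not the exact pipeline used in this thread: strip the adaptor from the three-prime end, then collapse identical inserts into counts. The function names, the `min_overlap` and `min_len` cutoffs, and the `ADAPTER` sequence are all placeholders of mine; substitute the adaptor from your own library kit.

```python
from collections import Counter

# Placeholder adaptor for illustration only; use your kit's 3' adaptor.
ADAPTER = "TCGTATGCCGTCTTCTGCTTG"


def trim_3prime_adapter(read, adapter, min_overlap=6):
    """Return the read with the 3' adaptor (full or partial) removed."""
    # Full adaptor present: cut at its first occurrence.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Otherwise look for an adaptor prefix running off the end of the read,
    # trying the longest possible overlap first.
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


def tag_counts(reads, adapter, min_len=15):
    """Collapse trimmed reads into {sequence: count}, dropping short inserts."""
    counts = Counter()
    for read in reads:
        insert = trim_3prime_adapter(read, adapter)
        if len(insert) >= min_len:
            counts[insert] += 1
    return counts
```

Each distinct tag can then be aligned once and its count carried along, which is much cheaper than aligning every raw read. Exact matching of the adaptor is a simplification; real reads with sequencing errors in the adaptor would need a mismatch-tolerant trimmer.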
I've used novoalign, maq, and soap in combination for most of this work, but I was able to factor in all those things by providing feedback to the novoalign developer.
You can run multiple instances of novoalign; there is no limit to that. However, you might see some performance hits due to increased IO on your server. There are also some other extras in the commercial version dvh is using that are not related to multithreading, including support.
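If you do go the multiple-instances route, the read-splitting step could be sketched roughly like this in Python. It assumes plain, unwrapped FASTQ (exactly four lines per record); `split_fastq` and the `.partNN.fastq` naming are my own inventions, not anything shipped with novoalign. Launching one aligner process per chunk and rejoining the outputs is left out, since that depends on your aligner options and output format.

```python
from itertools import islice
from pathlib import Path


def read_fastq_records(handle):
    """Yield FASTQ records as 4-line tuples (assumes unwrapped records)."""
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def split_fastq(path, n_chunks, out_dir):
    """Write records round-robin into n_chunks files; return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(path).stem
    chunk_paths = [out_dir / f"{stem}.part{i:02d}.fastq" for i in range(n_chunks)]
    handles = [p.open("w") for p in chunk_paths]
    try:
        with open(path) as src:
            for i, record in enumerate(read_fastq_records(src)):
                handles[i % n_chunks].writelines(record)
    finally:
        for handle in handles:
            handle.close()
    return chunk_paths
```

Round-robin assignment keeps the chunks within one record of each other in size, so the instances should finish at about the same time. Keep in mind the IO caveat above: all the instances read their chunks and the index from the same disks.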
-
Hi Ryan,
That's certainly possible. The multi-threaded version just means that you don't have to split your reads into 16 different files. Support license sales just give you more flexibility and me some money to finance further development and improvements.
Novoalign on full human can run in 8 GB of RAM if you are using a 14-mer index with a step size of 3. For miRNA we usually run at a very low threshold, maybe -t10 or even -t0 (perfect match); adapters are stripped off first. At these settings it's very fast.
Colin
-
Hi dvh and Ryan,
Could you share more about using novoalign for microRNA? I used the -a -m options to map the reads, but would you be able to share the next steps to get at the miRNA and annotation?
mirTools sounded cool, but it seems to be lacking in resources and documentation.
thanks--
bioinfosm