Old 10-18-2012, 12:12 AM   #1
Location: Hyd

Join Date: Jul 2012
Posts: 14
How to perform a Genome Assembly with Rnnotator?

Hello all,

This is my first thread. I hope this is not a redundant query.

I am a beginner in NGS data analysis. I work on a server with 32 GB of allotted RAM. I have been asked to assemble and annotate a highly repetitive, unassembled eukaryotic genome (silkworm).

Since the memory is too low, I have used Rnnotator on my RNA-seq data. I have Illumina paired-end reads, with 4 GB of data for each end. The total number of reads combined is 33968114 (about 34 million). I do not know the coverage exactly, as most of the genome is unannotated. The data I have been given are just reads that were sequenced before I joined.

My aim is to assemble the genome and see whether genes from any other organism (bacterial) occur at high frequency in my sample, so that I can work on their interaction at the genome level. I wanted to do this by first assembling the genome and then searching the resulting contigs for similarities. I also want to look at the expression levels of transcripts at a later stage.

So I tried the assembly with Rnnotator with default options, as follows:

$ perl /opt/Apps/Rnnotator-2.4.12/scripts/ -strP 200 25_comb.fq -n 4 -trim off -kmer_length 25 -a oases -o /home/guest/Rohit/index_25/i25_k25

I tried different k-mer values ranging from 19 to 79. I did not find much change in the assembly from 21 to 65. I also used even k-mer values, which I should not have, but I needed to see whether there would be any change in the data. The table of values is as follows:

K-mer k21 k22 k23 k25 k26 k27 k28 k31 k32 k33 k36 k41 k48 k51 k53 k57 k59 k66 k73 k77 k79

contigs 12325 12368 12399 12389 12345 12349 12339 12380 12354 12354 12332 12299 12276 12387 12366 12411 12358 12368 12379 12356

sequence total 10097 10118 10126 10132 10064 10093 10070 10114 10101 10095 10084 10079 10051 10124 10118 10120 10089 10110 10107 10101

Total bp 3991665 3995059 3990798 4003725 3964061 3981526 3957553 3975101 3971942 3970755 3965013 3973767 3973763 3970775 3992825 3956562 3957711 3970696 3970478 3986746

N50 Length(bp) 657 657 653 656 653 652 650 657 650 653 656 651 653 651 654 647 650 653 651 653

Largest contig 3530 3530 3528 3530 3528 3530 3530 3528 3528 3528 3530 3528 3530 3530 3528 3528 3528 3528 3530 3528

median contig 230 230 230 230 230 230 230 229 230 230 230 230 230 229 229 228 228 230 229 230
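For readers unfamiliar with the N50 row above: N50 is the contig length at which the contigs of that length or longer together cover at least half of the total assembled bases. A minimal Python sketch of that calculation follows. The function name `n50` and the toy lengths are illustrative only; they are not part of Rnnotator and are not taken from the table.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (made-up lengths): total is 54 bp, half is 27 bp,
# and 10 + 9 + 8 reaches 27, so the N50 is 8.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

Because N50 depends only on the length distribution, two assemblies with nearly the same contig set will report nearly the same N50, whatever k-mer label they carry.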

1) Which assembly is the best one to choose? Is there one particular k-mer in this table?
Is it k-mer 25, as it has a high number of bases, contigs and N50?

2) Are there any additional options to get a more accurate assembly with Rnnotator?
I tried using the Oases assembler in Rnnotator instead of Velvet, but it gives an error.

3) Is there any other assembler that runs on low RAM (8-32 GB)?
Please do not suggest commercial ones.


Last edited by rohitngs; 10-18-2012 at 01:37 AM.
Old 10-18-2012, 11:15 PM   #2
Location: Hyd

Join Date: Jul 2012
Posts: 14

I can see that there has been no reply so far, but I emailed the developers of Rnnotator and have just received this reply.

1. Best K-mer value for assembly
There is no single k-mer assembly that gives the best results for all genes; therefore the software executed multiple Velvet assemblies and then merged the resulting contigs using Minimus2. Merging the Velvet-assembled contigs resulted in a much better assembly than running any one particular k-mer assembly.

2. Oases not running on Rnnotator
Most likely oases is not in your PATH or has not been installed. You can set the software path using the $PATH variable. Type "which oases" to see whether the software is available. I recommend using the Velvet assembler.
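The "which oases" check can also be scripted so that every assembler binary is verified before a long run starts. The following is a minimal sketch using Python's standard `shutil.which`. The helper name `find_missing_tools` is hypothetical, and the list of executables (`velveth`, `velvetg`, `oases`) is an assumption about what a Velvet/Oases run needs, not something taken from the Rnnotator documentation.

```python
import shutil


def find_missing_tools(tools):
    """Return the executables from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumed executable names for the Velvet and Oases assemblers.
    required = ["velveth", "velvetg", "oases"]
    missing = find_missing_tools(required)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All assembler executables were found.")
```

If a tool is reported missing, adding its installation directory to PATH in the shell that launches Rnnotator is the usual remedy.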

3. Additional options for better results
There are several sets of advanced options for doing this, e.g., ADVANCED ASSEMBLY OPTIONS, ADVANCED POLISHING OPTIONS and ACCURACY ASSESSMENT OPTIONS. You may play around with these options to improve or assess the contigs.

4. Transcript levels of expression
If you use the --keep_rundir option, the intermediate files are kept in the rnnotator_run directory. counts.txt shows the number of reads that align to each gene (contig). This file provides an accurate estimation of gene expression levels.

5. Final output
The final output file is called final_contigs.fa, which contains de novo assembled final transcripts.
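Regarding the per-contig read counts mentioned in point 4: raw counts grow with contig length and sequencing depth, so they are commonly normalised before contigs are compared. One widely used normalisation is RPKM (reads per kilobase of contig per million mapped reads). The sketch below is an assumption-laden illustration: the layout of counts.txt is not described in this thread, so the counts and contig lengths are passed in as plain dictionaries, and the names `rpkm`, `contig_A` and `contig_B` are hypothetical.

```python
def rpkm(read_counts, contig_lengths):
    """Compute RPKM for each contig.

    read_counts    -- dict mapping contig name to number of aligned reads
    contig_lengths -- dict mapping contig name to contig length in bp

    RPKM = reads * 1e9 / (contig length in bp * total aligned reads)
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("no aligned reads to normalise against")
    return {
        name: count * 1e9 / (contig_lengths[name] * total_reads)
        for name, count in read_counts.items()
    }


# Made-up example: both contigs attract 500 reads, but contig_B is
# twice as long, so its RPKM comes out at half that of contig_A.
counts = {"contig_A": 500, "contig_B": 500}
lengths = {"contig_A": 1000, "contig_B": 2000}
for name, value in rpkm(counts, lengths).items():
    print(name, value)
```

How the real counts.txt would be parsed into these dictionaries depends on its actual format, which should be checked in the rnnotator_run directory.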

This covers the details of Rnnotator for now. Still, it would be great to hear anything else about Rnnotator, or about other assemblers that run on low RAM.

Last edited by rohitngs; 10-19-2012 at 12:24 AM.