**JohnN** · 07-21-2014, 01:12 PM

I recommend two things:

1. Try the mira list where they are very helpful: http://www.freelists.org/list/mira_talk

2. Ask your PacBio sequencing provider for the metadata.xml, bas.h5 and bax.h5 files and run them through the SMRTportal.

I'm sorry I cannot be more helpful but it's a start.

**maubp** · 07-22-2014, 04:18 AM

Your manifest as shown is missing some newlines. For example, the sequencing technology of the read group should be on its own line.
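A minimal sketch of the one-setting-per-line layout, written in Python so the line breaks are explicit. The helper `render_manifest`, the read group name `MyReads` and the file name `reads.fastq` are placeholders chosen for illustration, not values from your project.

```python
def render_manifest(entries):
    """Join (key, value) pairs into manifest text, one 'key = value' per line."""
    return "\n".join(f"{key} = {value}" for key, value in entries) + "\n"


# Placeholder values (assumptions); substitute your own project settings.
entries = [
    ("project", "MyFirstAssembly"),
    ("job", "genome,denovo,accurate"),
    ("readgroup", "MyReads"),
    ("data", "reads.fastq"),
    ("technology", "sanger"),
]

manifest_text = render_manifest(entries)
print(manifest_text)
```

Each setting, including the technology of the read group, ends up on a line of its own.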

**haudi** · 07-22-2014, 05:41 AM

Thanks a lot for your help!

I tried it with:

project = MyFirstAssembly
job = genome,denovo,accurate
parameters = COMMON_SETTINGS -NW:cmrnl=no SANGER_SETTINGS -CO:mrpg=5
readgroup = Sanger
data =xxxx.fastq xxxx2.fastq
technology = sanger
rename_prefix=HWI-ST330:422:C4AVHACXX clostraur

and it has been running for 6 h now... I hope the result is OK.

Silly question: to view the results, should I use gap4 or gap5, or is there a better program?

yours,
haudi

**maubp** · 07-22-2014, 06:30 AM

Good luck

I personally convert MIRA version 4 output to SAM (using miraconvert) and then into a sorted, indexed BAM file using samtools (optionally with 'samtools depad'). Then you can use the BAM viewer of your choice, e.g. Tablet should show MIRA's contig annotation.
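A rough sketch of the samtools half of that workflow, starting from a SAM file that already exists. It assumes samtools 1.3 or newer (where `samtools sort` reads SAM directly and takes `-o`); older releases use a different sort syntax. The function names and the file name `assembly.sam` are made up for illustration, and the optional depad step is left out.

```python
import shlex
import subprocess
from pathlib import Path


def sam_to_bam_commands(sam_path):
    """Return the samtools commands that make a sorted, indexed BAM from a SAM file."""
    sam = Path(sam_path)
    sorted_bam = sam.with_suffix(".sorted.bam")
    return [
        # Coordinate-sort the alignments and write BAM (assumes samtools >= 1.3).
        ["samtools", "sort", "-O", "bam", "-o", str(sorted_bam), str(sam)],
        # Build the .bai index that viewers need for random access.
        ["samtools", "index", str(sorted_bam)],
    ]


def run_commands(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Dry run: print the commands instead of executing them.
commands = sam_to_bam_commands("assembly.sam")
for command in commands:
    print(shlex.join(command))
```

Calling `run_commands(commands)` would execute them; the resulting `assembly.sorted.bam` and its index can then be opened in a BAM viewer.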

If you intend to edit your alignment, gap5 is probably the best choice.

**haudi** · 07-22-2014, 11:09 PM

I don't know the results yet. MIRA has now been running for 24 h and is using 85% (40 GB) of the RAM. I thought that, due to the small genome size, it wouldn't need that much.

OK, so with gap5 I can edit the alignment. I think I'll have a look at it first, and hopefully the alignment is good.

Edit: I will now test it on a cluster with 500 GB of RAM. Does anyone know how to tell MIRA how many CPUs it should use?

**maubp** · 07-23-2014, 02:05 AM

You can set the number of threads in the MIRA v4 manifest, or at the command line, e.g. for eight threads use:

$ mira -t 8 my_manifest.txt

See http://mira-assembler.sourceforge.ne...ideToMIRA.html

Note that not all parts of MIRA take advantage of multiple threads.

**lhon** · 07-24-2014, 12:00 PM

PBcR

Hi, from the looks of it, you probably have uncorrected PacBio reads as input, but MIRA 4.0 can only assemble PacBio reads that have gone through some kind of preassembly/correction. See here:

http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html#sect_pd_pacbio

To assemble from the subreads.fastq directly, I would suggest trying PBcR, a tool that is part of Celera Assembler:

http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR

In particular, the 8.2 beta should let you comfortably assemble your genome on a single node using the MHAP algorithm.

The bas.h5 files would be required for polishing to get high consensus accuracy (by running through Quiver).

**haudi** · 07-27-2014, 11:38 PM

thanks!
How do i know if i have corrected or uncorrected reads?

**lhon** · 07-29-2014, 10:10 AM

subreads

One way to tell is that the uncorrected files will have the word "subreads" in the filename, such as filtered_subreads.fasta. A subread corresponds to a single pass across some or all of the physical insert.
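As a small illustration of that filename check, here is a Python sketch. The function name `looks_uncorrected` and the second file name are hypothetical, and this only tests the naming convention; it says nothing about the reads themselves, so a renamed file would slip through.

```python
from pathlib import Path


def looks_uncorrected(filename):
    """Heuristic: True if the file name contains the word 'subreads'."""
    return "subreads" in Path(filename).name.lower()


for name in ["filtered_subreads.fasta", "corrected_reads.fastq"]:
    label = "probably uncorrected" if looks_uncorrected(name) else "no 'subreads' in name"
    print(f"{name}: {label}")
```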

**haudi** · 07-29-2014, 11:05 PM

OK, I have just 2 subread files :-/ I searched but didn't find any program to convert them to corrected reads (pacBioToCA needs long and short reads). Does anyone have a good solution for my problem?
My MIRA output folder has different MAF files. Which one is the right one: *.LargeContigs_out.maf or *.out.maf?
When I use Tablet to view my results, I see hardly more than 2 aligned 'reads'(?) and over 2200 contigs. The examples from Tablet show far more.

I also used Celera (runCA) to assemble my reads, and now I have asm data. Can I use ca2ace.pl?
Thanks everyone!

**lhon** · 07-30-2014, 12:39 PM

PBcR

Use PBcR as per my earlier post to correct and then assemble the subreads. The latest versions can do self-correction, which is equivalent to the preassembly step in HGAP.

**haudi** · 08-01-2014, 12:08 AM

Thanks again. Sorry for all the questions, but I'm really new to the topic and don't know what is possible. I read through the MIRA manual, but the connection between Celera, MIRA and other programs is still a bit hard to follow.

It worked, and I have now run MIRA with the 2 self-corrected read files. How can I influence (for example, via the manifest file) the fact that I get lots of contigs (973), ranging in size from 2,300,000 bp down to 600 bp? Are contigs pieces which cannot be aligned?
I already know the genome size because it's from Aeromonas salmonicida. How can I use a scaffold to tell MIRA to align the contigs?

MIRA 4.0 denovo PacBio FastQ