  • Strategy for genome assembly

    I am looking for some guidance on assembling a small bacterial genome. This is not working out to be as straightforward as I thought it would be.

    The expected genome size is approximately 1 Mb, with a low (~30%) GC content (could this cause emulsion PCR bias, creating gaps?). This is based on 3 different published reference genomes. I have completed two 454 sequencing runs for one isolate: Run 1 gave 28 Mb at a 400 bp average read length, and Run 2 gave 42 Mb at a 445 bp average read length. I thought this would be more than enough to get a complete assembly. Not so!

    So far I have only done analysis with the 454 software. I am getting really different results depending on how I go about it.

    Using GSReferenceMapper with a single reference genome, the best I can get is 42 large contigs (56 total contigs), average contig size 21,352 bp, longest contig 104,874 bp, 899,615 bases total. A few of these contigs contain very few reads.

    By comparison, using GSReferenceMapper with all 3 reference genomes I get 184 large contigs (252 total contigs), average contig size 2,153 bp, longest contig 31,324 bp, 416,772 bases total. I would have thought more reference information would give a better mapping? There is, however, a large inversion (spanning almost half the genome) in one of the reference isolates.

    The de novo assembler gives me 91 large contigs (199 total contigs), average contig size 10,432 bp, longest contig 78,810 bp, 971,829 bases total. I am inclined to trust these contigs more because they are not based on variable reference genomes.

    How do I go about deciding the best way to assemble this data? Software recommendations? I am limited to Windows for now, so I understand my options are restricted. Any help is appreciated.

  • #2
    With the 454 coverage you have (roughly 70x: ~70 Mb of reads over a ~1 Mb genome), Newbler is your best bet. You may even need to downsample, as 30-50x is usually enough. You could use the Mauve contig mover to order the contigs against a (single) reference genome, provided there are no large rearrangements (so don't use the reference with the inversion).
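
    If it helps, the arithmetic is trivial to script. A minimal Python sketch (genome size and yields taken from your post; the 40x target is only an example):

    ```python
    # Back-of-the-envelope coverage check for the two 454 runs described above.
    genome_size = 1_000_000                  # ~1 Mb expected genome
    run_yields = [28_000_000, 42_000_000]    # bases from run 1 and run 2

    total_bases = sum(run_yields)
    coverage = total_bases / genome_size
    print(f"Total yield: {total_bases / 1e6:.0f} Mb -> ~{coverage:.0f}x coverage")

    # Bases to keep when downsampling to, say, 40x:
    target_coverage = 40
    target_bases = target_coverage * genome_size
    print(f"Keep ~{target_bases / 1e6:.0f} Mb of reads for ~{target_coverage}x")
    ```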

    Mapping with gsMapper against multiple references is of no use; you only confuse the mapper, since many reads can be placed in multiple locations.

    An alternative would be Celera, which seems to work well with 454 data but is somewhat more difficult to use.



    • #3
      In addition to Newbler, where I would probably go the de novo route, I can recommend MIRA. You might need to down-sample a bit, but MIRA should do a really good job. The only reason I am not using it all the time is that it has problems with larger genomes, but a bacterial genome is perfect for it. The support on the mailing list is excellent, so there is always help to be had. It does not do scaffolding, though, so you would need a separate scaffolding program for that.

      In my fungal project, MIRA produced the longest contigs that I also felt I could trust. Celera did well too, but I spotted a few mis-assemblies, so I dropped it.



      • #4
        Thank you for the suggestions. Where is downsampling done within Newbler? Is it adjusted through the expected coverage? I have been adjusting that depending on the input files I have been using.

        After playing around with Mauve I found that my de novo contigs from Newbler fit two of my reference genomes rather well (but not the one with the large rearrangement). Both gave me 11 locally collinear blocks (LCBs). Using Mauve to align the reference-mapper output (42 large contigs, 56 total) gives me 1 LCB. However, there are gaps against the reference genome at the junctions of most of the contigs in my assembly.

        What are my options for confirming contig order and closing these gaps? Doing it all by PCR would be a lot of work (and money), but I don't think adding more coverage would be a cost-effective way to close them either.



        • #5
          Originally posted by mbseq
          Where is downsampling done within Newbler?
          You can use the sfffile command with the '-pick' option, giving the number of bases you want to try; it will randomly select reads up to that number of bases.
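
          sfffile works on the SFF files directly. If you ever want to do the same thing on a FASTA export instead, the idea is just a random pick of reads up to a base budget. A rough Python sketch of that idea (the file names and the 40 Mb budget are placeholders; this is not a substitute for sfffile):

          ```python
          import random

          def downsample_fasta(in_path, out_path, target_bases, seed=1):
              """Randomly keep reads from a FASTA until ~target_bases are selected.
              Mimics the idea behind downsampling with sfffile, not the tool itself."""
              # Read all records into memory (fine for a ~70 Mb 454 dataset).
              records = []
              with open(in_path) as fh:
                  header, seq = None, []
                  for line in fh:
                      line = line.rstrip()
                      if line.startswith(">"):
                          if header is not None:
                              records.append((header, "".join(seq)))
                          header, seq = line, []
                      else:
                          seq.append(line)
                  if header is not None:
                      records.append((header, "".join(seq)))

              # Shuffle reproducibly, then keep reads until the budget is reached.
              random.Random(seed).shuffle(records)
              kept, total = [], 0
              for header, sequence in records:
                  if total >= target_bases:
                      break
                  kept.append((header, sequence))
                  total += len(sequence)

              with open(out_path, "w") as out:
                  for header, sequence in kept:
                      out.write(f"{header}\n{sequence}\n")
              return total

          # e.g. keep ~40 Mb of reads (~40x of a 1 Mb genome):
          # downsample_fasta("reads.fna", "reads_40x.fna", 40_000_000)
          ```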

          Originally posted by mbseq
          What are my options for confirming contig order and closing these gaps? Doing it all by PCR would be a lot of work (and money), but I don't think adding more coverage would be a cost-effective way to close them either.
          PCR is your best option, unless you want to spend money on, say, PacBio sequencing...



          • #6
            I played around with downsampling from 60x down to 10x coverage and running the de novo assembler. There does not seem to be much difference in the Newbler metrics until coverage drops below 20x; after that, the average, N50, and largest contig sizes fall while the number of large contigs goes up (as expected). Is there something specific I should be looking at to validate the downsampling, or to settle on a level of coverage?
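
            In case it is useful, these are the metrics I have been recomputing from the contig lengths to compare runs. A minimal Python sketch of the usual definitions (the 500 bp "large contig" cutoff is my assumption of what Newbler uses):

            ```python
            def contig_stats(lengths, large_cutoff=500):
                """Basic assembly metrics from a list of contig lengths.
                N50 is the contig length at which the running total of the
                sorted lengths first reaches half of the assembly size."""
                lengths = sorted(lengths, reverse=True)
                total = sum(lengths)
                running, n50 = 0, 0
                for length in lengths:
                    running += length
                    if running >= total / 2:
                        n50 = length
                        break
                return {
                    "contigs": len(lengths),
                    "large_contigs": sum(1 for n in lengths if n >= large_cutoff),
                    "total_bases": total,
                    "largest": lengths[0] if lengths else 0,
                    "average": total // len(lengths) if lengths else 0,
                    "N50": n50,
                }

            # e.g. contig_stats([104874, 78810, 31324]) for a toy set of lengths
            ```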

            I compared one of the known isolates with the contigs from the de novo assembler (using both 70 Mb and 40 Mb of input reads) in Mauve. There were small differences in the results: 70 Mb gave 16 LCBs with minimum weight 202, while 40 Mb gave 12 LCBs with minimum weight 658. So using 40 Mb looks somewhat better. Is trial and error the only way to determine the optimal coverage?

            I will be using PCR to close gaps. How much confidence can I put in the LCBs reported by Mauve? Closing 12 LCBs seems much more manageable than confirming the order of 85 large contigs by PCR.
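
            My current plan for the primer design step is to pull the flanking sequence on either side of each junction between consecutive ordered contigs and feed that to a primer-design tool. A rough Python sketch, assuming I already have the contigs as (name, sequence) tuples in their presumed order and orientation:

            ```python
            def junction_flanks(ordered_contigs, flank=500):
                """For contigs given in their presumed genomic order (and already
                oriented), return the sequence flanking each junction: the last
                `flank` bases of contig i and the first `flank` bases of contig
                i + 1. These are the regions to place gap-closing PCR primers in."""
                junctions = []
                for i in range(len(ordered_contigs) - 1):
                    left_name, left_seq = ordered_contigs[i]
                    right_name, right_seq = ordered_contigs[i + 1]
                    junctions.append({
                        "junction": f"{left_name}|{right_name}",
                        "left_flank": left_seq[-flank:],
                        "right_flank": right_seq[:flank],
                    })
                return junctions

            # ordered_contigs would be a list of (name, sequence) tuples parsed from
            # whichever ordered contig FASTA I end up trusting, e.g.:
            # for j in junction_flanks(ordered_contigs):
            #     print(j["junction"], len(j["left_flank"]), len(j["right_flank"]))
            ```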

