**Thorondor** · 05-02-2011, 12:17 AM

So you have reference sequences for the genes you are looking for, and you know what % identity to expect? BLASTing all your reads against your reference genes does not seem to be the smartest way. ;-) Using bwa or vmatch might be a lot faster, but of course your results depend on your sequence identity.

**skingan** · 05-02-2011, 07:35 AM

Hi Thorondor,
The % identity should be very high, <3% divergence for the orthologous sequences. The problem is that there are many repeat elements in and around the genes, so the structure is not conserved. Right now I am pulling the reads that align to the flanking sequence in my bwa alignment and will do a de novo assembly of those reads using mira. Mira claims to be good at assembling repetitive sequence and difficult-to-align regions.

I may then do another iteration. Using bwa, I will map the previously unmapped reads to the contig(s) I built with mira. Then I will pull the singletons whose mates mapped and do another de novo assembly with mira.

Sarah
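
The "pull the singletons whose mates mapped" step can be sketched with plain SAM flag tests. This is only a minimal illustration, not Sarah's actual script: it assumes the bwa output is available as SAM text lines, and it interprets "singletons whose mates mapped" as reads that are themselves unmapped while their mate is mapped. The function names are hypothetical; the flag bits (0x1, 0x4, 0x8) come from the SAM specification.

```python
# SAM flag bits, as defined in the SAM specification.
PAIRED = 0x1
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def unmapped_reads_with_mapped_mates(sam_lines):
    """Yield (name, sequence, quality) for unmapped reads whose mate is mapped."""
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        if (flag & PAIRED) and (flag & READ_UNMAPPED) and not (flag & MATE_UNMAPPED):
            yield fields[0], fields[9], fields[10]


def to_fastq(records):
    """Format (name, sequence, quality) tuples as FASTQ text for the assembler."""
    return "".join(f"@{name}\n{seq}\n+\n{qual}\n" for name, seq, qual in records)
```

The resulting FASTQ, together with the mapped mates, could then be handed to the next assembly round. Pairs where both ends are unmapped are deliberately skipped here, since nothing anchors them to the contig.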

**MadsAlbertsen** · 05-02-2011, 08:10 AM

Originally posted by skingan View Post

I may then do another iteration. Using bwa, I will map the previously unmapped reads to the contig(s) I built with mira. Then pull the singletons whose mates mapped and do another de novo assembly with mira

I've been doing this to manually close gaps that can't be assembled using various short-read assemblers, and it generally works great if you restrict the new de novo assembly to the regions where you have "problems".

E.g. using only the reads in the vicinity of where you expect your gene to be.

We normally use CLC as it is extremely fast and memory efficient (and expensive..). However, most assemblers should be able to handle the repeats if the assembly is just local. In my experience the problem arises when you have the same repeat regions in multiple areas of the genome, and that is solved by doing the local assembly.

rgds
Mads
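
Restricting the assembly to reads near the expected gene location could look roughly like the sketch below. It is an illustration only, not the CLC workflow described above: it assumes the alignment is available as a list of SAM text lines, the `flank` padding and both function names are hypothetical, and the alignment end is approximated from the read length rather than parsed from the CIGAR string.

```python
def read_names_near_region(sam_lines, contig, start, end, flank=500):
    """Return names of reads with an alignment overlapping the padded region.

    Coordinates are 1-based and inclusive, as in SAM.
    """
    lo = max(1, start - flank)
    hi = end + flank
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4 or rname != contig:  # unmapped, or on another contig
            continue
        # Approximation: ignores indels and clipping recorded in the CIGAR.
        aln_end = pos + len(fields[9]) - 1
        if pos <= hi and aln_end >= lo:
            names.add(fields[0])
    return names


def select_pairs(sam_lines, names):
    """Keep every record (both mates, mapped or not) whose read name was selected."""
    return [
        line
        for line in sam_lines
        if not line.startswith("@") and line.split("\t", 1)[0] in names
    ]
```

Selecting by read name in a second pass pulls in the mate of each nearby read even if the mate is unmapped or lands elsewhere, which is usually what a local assembly needs to extend into the repeat.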

**Thorondor** · 05-02-2011, 08:21 AM

Well, if your genes of interest are not well covered, you might also take a look at LOCAS for your assembly:

http://ab.inf.uni-tuebingen.de/software/locas/

**shiva** · 06-20-2011, 07:35 PM

Originally posted by skingan View Post

Hi Thorondor,
The % identity should be very high, <3% divergence for the orthologous sequences. The problem is that there are many repeat elements in and around the genes, so the structure is not conserved. Right now I am pulling the reads that align to the flanking sequence in my bwa alignment and will do a de novo assembly of those reads using mira. Mira claims to be good at assembling repetitive sequence and difficult-to-align regions.

I may then do another iteration. Using bwa, I will map the previously unmapped reads to the contig(s) I built with mira. Then I will pull the singletons whose mates mapped and do another de novo assembly with mira.

Sarah

I'm performing a similar analysis. I did find that targeted de novo assembly deals with short tandem repeats very nicely. I'm now wondering whether there is software that integrates the results from targeted de novo assembly with the reference genome, so that I can still use samtools, for example, for SNP calling? Thanks in advance for any information.

Sue

**pierre350d** · 06-20-2011, 11:10 PM

Dear all,

I'd like to mention a tool called mapsembler. It takes some sequence fragments and a set of (Illumina) reads. It tries to reconstruct each sequence fragment using the reads (allowing some substitutions), and it extends each reconstructed sequence to the left and right by targeted assembly.

The output may be either a fasta file (a contig containing the sequence) or a graph that shows indels, SNPs, or more complex events like gene fusion, exon skipping...

The tool and documentation are accessible here: http://alcovna.genouest.org/mapsembler/

Any comment / feedback welcome.

Pierre

**salmonella** · 07-12-2011, 07:39 AM

mapsembler usage

Pierre
Mapsembler sounds like it may work for one of my projects.
Have you used it before?
Can I import the output into a viewer so that I can see how it attempted to assemble the sequences around the 'starter'?

thanks

**pierre350d** · 09-12-2011, 06:58 AM

Dear salmonella,

Sorry for this late answer...

I'm one of the authors of Mapsembler.
The output of mapsembler, when using the graph output, can be viewed in any viewer able to deal with the xgmml format (I'm using Cytoscape) or .graphml (I'm using Gephi).

Pierre
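
For a quick look at a .graphml file before opening it in Gephi, something like the following sketch may help. It relies only on the standard GraphML namespace and Python's standard library; it makes no assumption about which node or edge attributes Mapsembler writes, and the function name is hypothetical. Treating nodes with more than two incident edges as points where paths diverge is an assumption about how such a graph would be read, not something stated in the Mapsembler documentation.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard GraphML documents.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def summarize_graphml(source):
    """Count nodes and edges and list node ids with more than two incident edges.

    `source` is a file name or a file object, as accepted by ElementTree.parse.
    """
    root = ET.parse(source).getroot()
    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = root.findall(".//g:edge", GRAPHML_NS)
    degree = {node.get("id"): 0 for node in nodes}
    for edge in edges:
        for endpoint in (edge.get("source"), edge.get("target")):
            if endpoint in degree:
                degree[endpoint] += 1
    branching = sorted(node_id for node_id, d in degree.items() if d > 2)
    return {"nodes": len(nodes), "edges": len(edges), "branching": branching}
```

If the counts come back as zero for a file that clearly contains a graph, the file probably does not declare the standard GraphML namespace, and the lookup would need adjusting.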

Targeted de novo assembly