**luc** · 10-19-2014, 07:01 AM

Could you post the FastQC reports?
Have you tried aligning your reads to another Drosophila assembly? Brian's BBMap will quickly give you a very helpful error profile ( http://seqanswers.com/forums/showpos...25&postcount=1 ).
I have never seen IDBA applied to eukaryotic genomes before; SOAP should work, of course.

**bio_informatics** · 10-19-2014, 07:44 AM

I would suggest using one tool at a time. MaSuRCA and SOAPdenovo take a lot of memory. I presume you are not on a cluster/server but on a default 4 GB machine.

Make sure your config files are correct.

PS:
I was unable to get either tool to run successfully because of various errors. I was running out of time, so I later used Velvet, which is easy to run.
You will have to play around with it a lot. The default k-mer settings might not work with your sample.
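
For anyone following along, "playing around" with k usually means running the assembler at several k values and comparing the results. Below is a minimal Python sketch of that bookkeeping, not anything from Velvet itself: `candidate_kmers` and `n50` are hypothetical helper names, the default start and step are arbitrary assumptions, and N50 is just one common metric to compare runs with. The candidates are kept odd because Velvet expects an odd k.

```python
def candidate_kmers(read_length, start=21, step=10):
    """List odd k values from `start` up to (but not including) read_length.

    Hypothetical helper: the defaults are illustrative, not recommendations.
    """
    if start % 2 == 0:
        start += 1          # force an odd starting k
    if step % 2 == 1:
        step += 1           # an even step keeps every value odd
    return list(range(start, read_length, step))


def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty).

    N50 is the length L such that contigs of length >= L contain
    at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0


def best_k(lengths_by_k):
    """Pick the k whose contig set has the highest N50."""
    return max(lengths_by_k, key=lambda k: n50(lengths_by_k[k]))
```

In practice you would fill `lengths_by_k` with the contig lengths from each run (one entry per k) and look at more than N50 alone, since a high N50 can also come from misassemblies.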

**Zapages** · 10-19-2014, 07:52 AM

Which Drosophila species are you trying to assemble? I have assembled a few Drosophila genomes.

I would suggest using Velvet, and using one tool at a time when trying to assemble any genome, unless you are using online/cloud-based resources like Galaxy and iPlant.

**francicco** · 10-19-2014, 11:03 PM

Hi guys,

Thanks for your help, really appreciated. The FastQC profiles are attached (fastqc_data_R1-2.zip).

My species is D. nigrosparsa. The most closely related species with an available genome is D. grimshawi. I know this because I have already done the phylogeny using the mtDNA genome assembled from these reads (data not yet published). The most recent common ancestor of the two species is quite old, around 20 Mya (data not yet published), so I don't think using it as a reference would be very useful, but I'll give it a try. I have also attached another file (Dgri.vs.ShortInsert_reads.31.1.pdf), again produced with KAT, which plots the k-mers shared between my short-insert reads and the D. grimshawi genome. As you can see, the shared fraction is very, very small.
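
To make the "fraction in common" idea concrete, here is a rough Python sketch of what such a comparison measures. It is not how KAT works internally, only a toy illustration; the function names are hypothetical, and k=31 is an assumption (suggested by the attachment's filename, not confirmed). K-mers are counted in canonical form so that a read and its reverse complement match the same reference sequence.

```python
_COMP = str.maketrans("ACGT", "TGCA")


def canonical_kmers(seq, k):
    """Return the set of canonical k-mers in seq.

    A canonical k-mer is the lexicographically smaller of the k-mer
    and its reverse complement. K-mers containing non-ACGT characters
    (e.g. N) are skipped.
    """
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            rc = kmer.translate(_COMP)[::-1]
            kmers.add(min(kmer, rc))
    return kmers


def shared_fraction(reads, reference, k=31):
    """Fraction of distinct read k-mers that also occur in the reference."""
    read_kmers = set()
    for read in reads:
        read_kmers |= canonical_kmers(read, k)
    if not read_kmers:
        return 0.0
    ref_kmers = canonical_kmers(reference, k)
    return len(read_kmers & ref_kmers) / len(read_kmers)
```

Holding everything in Python sets only works for toy inputs; real read sets need a dedicated k-mer counter. A very small shared fraction at large k is expected between divergent species, since a single mismatch breaks every k-mer that overlaps it.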

About my computational power: I'm running these analyses on a cluster, and I have enough resources to run two or three assemblies in parallel.

Yes, I'm also not so sure about the MaSuRCA config file (see attachment: MaSuRCA_Dnig.conf.txt). Maybe somebody could take a look at it.

What do you think, guys?

Thanks a lot

**bio_informatics** · 10-19-2014, 11:08 PM

Could you help me understand how and why the JUMP keyword is used in the MaSuRCA config file?
Apologies for taking this a bit off topic.

**francicco** · 10-19-2014, 11:17 PM

Maybe I got it wrong, but I understood that JUMP stands for jumping libraries, i.e. mate pairs. Is that wrong?

**bio_informatics** · 10-19-2014, 11:18 PM

I do not know; the manual only says it is used for libraries.

Let's wait for the experts to weigh in here.

**francicco** · 10-19-2014, 11:28 PM

I can tell you, they are the same


http://en.wikipedia.org/wiki/Jumping_library

"[...] mate-pair sequencing, which is basically a combination of Next Generation Sequencing with jumping libraries"

F
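
One practical reason assemblers ask you to declare jumping (mate-pair) libraries separately is read orientation: mate-pair reads typically come off the sequencer facing outward, while standard paired-end reads face inward. The Python sketch below shows the usual conversion, which is to reverse-complement both mates. It is a generic illustration with hypothetical function names, not MaSuRCA's own code; check each assembler's manual for the orientation it expects before converting anything.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (N stays N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def flip_pair(read1, read2):
    """Reverse-complement both mates of a pair.

    Each read is a (sequence, quality_string) tuple. The quality string
    is reversed so each score stays attached to its base. Applied to an
    outward-facing pair this yields an inward-facing pair, and vice versa.
    """
    (seq1, qual1), (seq2, qual2) = read1, read2
    return (
        (reverse_complement(seq1), qual1[::-1]),
        (reverse_complement(seq2), qual2[::-1]),
    )
```

Applying `flip_pair` twice returns the original pair, which is a handy sanity check.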

**bio_informatics** · 10-19-2014, 11:30 PM

Thank you.

Failed Genome assembly