Old 02-01-2018, 06:08 AM   #3
berthenet
Junior Member
 
Location: France

Join Date: Jan 2018
Posts: 4

I just lost my post: I was logged out while writing it. Note to self: always copy my answer to the clipboard before posting...

So, let's write it all again.

Thanks for the links you shared. I knew some of them already, and I'll go and have a look at the others. I do work with bacterial genomes.

Most of my assemblies look fine in terms of number of contigs (<100 contigs) once I filter out the smallest ones (<1000 bp). For some of them, however, the number of contigs remains really high, and when I check the total assembly length, I obtain 3 genomes of more than 2.4 Mb when I expect approximately 1.65 Mb. For one of these outlier strains, I checked the 30 largest contigs with a blastn search against the NCBI database and noticed that some of the contigs don't match the species of interest. These contigs have a low coverage value (indicated in the name of the contig): around 1, versus more than 200 for the contigs that match the species of interest.

Do you usually filter your contigs based on this coverage value? Could these low-coverage contigs be why my assembly sizes are larger than expected?
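
To make the question concrete, here is a rough sketch of the kind of filter I have in mind. It is only an illustration under assumptions: it supposes contig names look like `NODE_1_length_5000_cov_210.5` (a SPAdes-like pattern, which may not match your assembler's headers), and the `min_coverage=10.0` cutoff is a hypothetical value picked only because it sits between the ~1 and >200 coverages I observed. The function names are made up for this example.

```python
import re

# Assumed header pattern (SPAdes-like), e.g. "NODE_1_length_5000_cov_210.5".
# Adjust the regex if your assembler names contigs differently.
HEADER_RE = re.compile(r"length_(\d+)_cov_(\d+(?:\.\d+)?)")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(records, min_length=1000, min_coverage=10.0):
    """Keep contigs that pass both the length and the coverage cutoff.

    min_coverage=10.0 is a hypothetical threshold, not a recommendation.
    Contigs whose header does not match the assumed pattern are skipped.
    """
    kept = []
    for header, seq in records:
        match = HEADER_RE.search(header)
        if match is None:
            continue  # coverage unknown: skipped in this sketch
        coverage = float(match.group(2))
        if len(seq) >= min_length and coverage >= min_coverage:
            kept.append((header, seq))
    return kept


def total_length(records):
    """Sum of sequence lengths, to compare against the expected genome size."""
    return sum(len(seq) for _, seq in records)
```

With something like `with open("contigs.fasta") as fh: kept = filter_contigs(read_fasta(fh))` (the file name is a placeholder), I could then compare `total_length(kept)` with the ~1.65 Mb I expect and see whether removing the low-coverage contigs brings the assembly size back in line.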