SEQanswers

07-14-2017, 08:45 AM   #1
sankarithirumal@gmail.com
Junior Member
 
Location: india

Join Date: Jul 2017
Posts: 2
Average nucleotide identity

Hi all,
I'm new to the forum and working on comparative genomics.

I'm comparing around 50 bacterial genomes, most of which contain plasmids; a few do not.

I would like to know: when I calculate average nucleotide identity (ANI) using OrthoANI, should I use only the chromosome, or the chromosome and plasmids together?
07-14-2017, 09:39 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Depends on your goal, but... I'd use everything. I don't see that it matters whether a gene is on a plasmid or not when calculating ANI. It gets a bit more complicated when you have multiple plasmid copies, but for simplicity, I'd just calculate the ANI from the full haploid genome representation.
07-14-2017, 10:00 AM   #3
sankarithirumal@gmail.com
Junior Member
 
Location: india

Join Date: Jul 2017
Posts: 2

Thank you for your reply.
Yes, you are right: some strains have 15 or 20 plasmids and some have none at all. That is why I would like to know whether it is reasonable to use only the chromosome rather than the chromosome and plasmids together.
07-14-2017, 10:19 AM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

There are lots of ways of calculating ANI. The best one depends on your goal. For example, if you want to say "These two bacteria are really closely related" then probably just the main chromosome is important, since plasmids can come and go pretty rapidly. If you want to say "These two bacteria are behaviorally similar" then you need to include all the plasmids as well. Note that ANI is not a sufficient metric in the latter case; you also need to calculate completeness. ANI generally only factors in things that align, so you might get 100% ANI between human chromosome 1 and the full human genome, but that does not mean they are equivalent.
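
To make the ANI-versus-completeness point concrete, here is a toy calculation. It assumes a simplified alignment-style definition (ANI = identical bases / aligned bases, completeness = aligned bases / total reference bases). The function name and the numbers are hypothetical, and this is not how any particular tool computes either value.

```shell
# Toy numbers only (hypothetical, not real measurements).
# Assumed definitions: ANI is taken over aligned bases only;
# completeness is the aligned fraction of the reference.
ani_and_completeness() {
  # $1 = identical aligned bases, $2 = aligned bases, $3 = total reference bases
  awk -v m="$1" -v a="$2" -v t="$3" \
    'BEGIN { printf "ANI=%.1f%% completeness=%.1f%%\n", 100*m/a, 100*a/t }'
}

# A query that matches perfectly but covers only 80 of 1000 reference bases:
ani_and_completeness 80 80 1000
# prints: ANI=100.0% completeness=8.0%
```

The ANI is perfect because everything that aligned was identical, while the low completeness shows how much of the reference was never compared at all.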

I encourage you to try BBMap's CompareSketch:

First, for each genome fasta, run fuse.sh on it to combine the contigs into a single sequence, which makes the all-to-all comparison run at a per-genome rather than a per-sequence level (I'll probably make that automatic at some point). Then:

Code:
comparesketch.sh *.fasta alltoall records=100
That does an all-to-all comparison and reports both ANI and completeness. It's alignment-free and will give different results from alignment-based methods (well, all ANI calculation methods give different results), but it's useful in that it also reports completeness. A bacterium with 100% ANI and 90% genome completeness compared to another will be missing some functionality, even though the two are very closely related.
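
The fuse-then-compare steps above could be scripted roughly as follows. This is only a sketch: it assumes BBMap's scripts are on the PATH and that fuse.sh takes the in=/out= parameters (check its help text for your version). The fuse_all helper, the FUSE variable, and the "fused" output directory are hypothetical names made up for illustration.

```shell
# Assumption: BBMap's fuse.sh is on PATH; override FUSE to point elsewhere.
FUSE="${FUSE:-fuse.sh}"

# fuse_all OUTDIR FILE... : write one fused copy of each genome fasta into OUTDIR
fuse_all() {
  outdir="$1"
  shift
  mkdir -p "$outdir"
  for f in "$@"; do
    # Keep the original file name so results can be traced back to the genome
    "$FUSE" in="$f" out="$outdir/$(basename "$f")"
  done
}

# Usage (run from the directory holding the genome fastas):
#   fuse_all fused *.fasta
#   comparesketch.sh fused/*.fasta alltoall records=100
```

Writing the fused files to a separate directory leaves the original multi-contig assemblies untouched for any other analysis.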

Last edited by Brian Bushnell; 07-14-2017 at 10:28 AM.
Tags
ani

All times are GMT -8.