SEQanswers

Old 02-09-2012, 12:52 AM   #1
Stegger
Member
 
Location: Copenhagen

Join Date: Nov 2008
Posts: 21
Calculation of pan- and core-genome

Hi,
I was hoping someone here could point me in the direction of a good tool for calculating pan and core genomes in prokaryotes! I am looking for one or several tools/scripts that do a number of things:
I have a bucket full of bacterial genome data (mostly in contigs) from the same species and would like, based on various groupings of these, to start by determining the overall pan and core genomes of the isolates.

Besides getting just the number of genes in each group, it would also be very useful to have some sort of gene list output for further analysis.

Finally, I would like to see how the calculated core/pan genome of one group differs from that of another defined group of isolates - again not only a number of genes but an actual list of genes or gene sequences.

The contigs have not been analysed for CDSs or annotated in any way, but I can do this in another pipeline before the pan/core calculation if needed.

Thanks!!!
Old 02-09-2012, 01:02 AM   #2
Zam
Member
 
Location: Oxford

Join Date: Apr 2010
Posts: 51

Can I just clarify - do you have a bunch of reads which are labelled with which isolate of the same species they come from, and on the basis of that you want to pull out the pan (everything) and core (shared) genomes?
Old 02-09-2012, 01:13 AM   #3
Stegger
Member
 
Location: Copenhagen

Join Date: Nov 2008
Posts: 21

Hi Zam,
I have assembled the reads into contigs, and they are named with a name_contigID prefix indicating the species and the specific isolate they come from. And yes, it is from them that I would like to extract the information.
Old 02-09-2012, 01:25 AM   #4
Zam
Member
 
Location: Oxford

Join Date: Apr 2010
Posts: 51

Well then, one approach is to assemble a "multicoloured" graph of your data (one colour per isolate), and then dump contigs with information about how many isolates share each contig. Then you can split things however you like - pull out the contigs that everyone shares, that 95% share, etc. Software for this is here:
http://cortexassembler.sourceforge.net/
and the paper contains an example of something similar:
http://dx.doi.org/10.1038/ng.1028
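
To make the "split things however you like" step concrete, here is a minimal sketch in Python. It is not Cortex output or the Cortex API: it assumes you have already reduced your data to a hypothetical mapping from each contig (or gene) ID to the set of isolates that carry it, and the names `classify_by_sharing`, `presence` and `core_fraction` are made up for illustration.

```python
def classify_by_sharing(presence, n_isolates, core_fraction=1.0):
    """Split items into pan and core sets.

    presence: dict mapping a contig/gene ID to the set of isolates carrying it
              (hypothetical input format, not something a specific tool emits).
    core_fraction: minimum fraction of isolates that must carry an item for it
                   to count as core (1.0 = everyone, 0.95 = "95% share").
    """
    pan = set(presence)  # everything seen in at least one isolate
    core = {
        item
        for item, isolates in presence.items()
        if len(isolates) >= core_fraction * n_isolates
    }
    return pan, core


# Toy data: four isolates, three genes.
presence = {
    "geneA": {"iso1", "iso2", "iso3", "iso4"},
    "geneB": {"iso1", "iso2", "iso3"},
    "geneC": {"iso2"},
}

pan, strict_core = classify_by_sharing(presence, n_isolates=4)
_, soft_core = classify_by_sharing(presence, n_isolates=4, core_fraction=0.75)

print(sorted(pan))          # ['geneA', 'geneB', 'geneC']
print(sorted(strict_core))  # ['geneA']
print(sorted(soft_core))    # ['geneA', 'geneB']
```

Because the result is a set of IDs rather than a count, you can write it straight to a file as the gene or contig list for downstream analysis.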

>Finally, I would like to see what difference there is between the calculated core/pan genome in 1 group compared to another defined set of isolates in another group - again not only a number of genes but an actual list of genes or gene sequences.

You can do any comparisons you like between any subsets you like in this manner. Feel free to contact me directly (zam AT well.ox.ac.uk)
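
As a rough illustration of comparing two groups, the sketch below uses plain Python sets. It assumes a hypothetical dictionary of gene (or contig) IDs per isolate, which you would have to build yourself from whatever tool you use; `group_sets` and the isolate names are invented for the example.

```python
def group_sets(gene_sets_by_isolate, group):
    """Return (pan, core) for the isolates named in `group` (must be non-empty)."""
    sets = [gene_sets_by_isolate[name] for name in group]
    pan = set().union(*sets)         # present in at least one isolate of the group
    core = set.intersection(*sets)   # present in every isolate of the group
    return pan, core


# Toy data: gene IDs per isolate (hypothetical).
gene_sets_by_isolate = {
    "iso1": {"geneA", "geneB", "geneC"},
    "iso2": {"geneA", "geneB"},
    "iso3": {"geneA", "geneD"},
    "iso4": {"geneA", "geneD", "geneE"},
}

pan_1, core_1 = group_sets(gene_sets_by_isolate, ["iso1", "iso2"])
pan_2, core_2 = group_sets(gene_sets_by_isolate, ["iso3", "iso4"])

# Genes in every group-1 isolate and in no group-2 isolate, and vice versa.
core_only_in_group_1 = core_1 - pan_2
core_only_in_group_2 = core_2 - pan_1
# Genes that are core in both groups.
shared_core = core_1 & core_2

print(sorted(core_only_in_group_1))  # ['geneB']
print(sorted(core_only_in_group_2))  # ['geneD']
print(sorted(shared_core))           # ['geneA']
```

Subtracting the other group's pan genome (rather than its core) is a deliberate choice here: it keeps only genes that are completely absent from the other group. Subtract the other group's core instead if you want a looser comparison.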
Old 02-09-2012, 07:37 AM   #5
Adjuvant
Member
 
Location: Chicago, IL

Join Date: Sep 2010
Posts: 13

Good references for how to do the calculations are Kittichotirat W et al., PLoS ONE, July 2011 and Tettelin H et al., PNAS 2005, 102:13950-13955, if you want to try doing the analysis yourself or scripting your own tools. There's also Panseq, which you can try, but I haven't really been able to get it to work all that well for my purposes.
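
If you do script it yourself, one common starting point is to count how the pan and core genome sizes change as genomes are added one at a time, averaged over random orderings. The sketch below shows only that counting step, assuming you already have one set of gene-cluster IDs per genome (orthology clustering is not shown). It does not reproduce the curve fitting used in the papers above, and `accumulation_curves` is a made-up name.

```python
import random


def accumulation_curves(gene_sets, n_permutations=100, seed=0):
    """Mean pan and core genome sizes after adding 1, 2, ..., N genomes.

    gene_sets: list of sets of gene-cluster IDs, one set per genome
               (assumed input; building these sets is up to you).
    """
    rng = random.Random(seed)
    n = len(gene_sets)
    pan_totals = [0] * n
    core_totals = [0] * n
    for _ in range(n_permutations):
        order = gene_sets[:]
        rng.shuffle(order)  # random order of genome addition
        pan, core = set(), None
        for i, genes in enumerate(order):
            pan |= genes
            core = set(genes) if core is None else core & genes
            pan_totals[i] += len(pan)
            core_totals[i] += len(core)
    pan_mean = [total / n_permutations for total in pan_totals]
    core_mean = [total / n_permutations for total in core_totals]
    return pan_mean, core_mean


genomes = [
    {"g1", "g2", "g3"},
    {"g1", "g2", "g4"},
    {"g1", "g5"},
]
pan_mean, core_mean = accumulation_curves(genomes, n_permutations=50)
print(pan_mean[-1], core_mean[-1])  # 5.0 1.0
```

The last point of each curve does not depend on the ordering: it is simply the size of the union (pan) and of the intersection (core) over all genomes.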
Old 02-10-2012, 05:06 AM   #6
Stegger
Member
 
Location: Copenhagen

Join Date: Nov 2008
Posts: 21

Thanks both of you!!

And Zam, I may take you up on that offer. And congratulations on that paper.
Old 02-11-2012, 06:12 AM   #7
koadman
Member
 
Location: Sydney, Australia

Join Date: May 2010
Posts: 65

At the risk of being accused of shameless self-promotion, I will point out that this is something that Mauve, and specifically progressiveMauve, has supported for years. Have a look at the .backbone file output (documentation here).
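
For anyone who wants to post-process that file, here is a hedged Python sketch. It assumes the .backbone file is tab-delimited with one header line and a left-end/right-end column pair per genome, where 0 means the segment is absent from that genome and negative coordinates mean reverse orientation. Please check those assumptions against the Mauve documentation before relying on this; `core_segments` and the example rows are invented.

```python
import csv
import io


def core_segments(backbone_text):
    """Return (header, rows) for segments present in every genome.

    Assumes columns come in (left end, right end) pairs per genome and that
    a left end of 0 marks absence; verify this against the Mauve docs.
    """
    reader = csv.reader(io.StringIO(backbone_text), delimiter="\t")
    header = next(reader)
    core = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        coords = [int(value) for value in row]
        left_ends = coords[0::2]  # one left-end coordinate per genome
        if all(left != 0 for left in left_ends):
            core.append(coords)
    return header, core


# Invented two-genome example in the assumed layout.
example = (
    "seq0_leftend\tseq0_rightend\tseq1_leftend\tseq1_rightend\n"
    "1\t1000\t1\t980\n"
    "1001\t1500\t0\t0\n"
    "1501\t3000\t-2500\t-981\n"
)
header, core = core_segments(example)
print(len(core))  # 2
```

Segments that fail the test (a 0 in at least one genome) are the accessory part, so flipping the condition gives you the regions that are present in only some isolates.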
Old 02-11-2012, 06:24 AM   #8
Zam
Member
 
Location: Oxford

Join Date: Apr 2010
Posts: 51

Koadman - Good for you! (I'm certainly in no position to criticise self-promotion)
Stegger - thanks!
Old 02-11-2012, 09:53 PM   #9
Stegger
Member
 
Location: Copenhagen

Join Date: Nov 2008
Posts: 21

Please self-promote all you can; that just allows me to come back with potential questions to the right people.
Old 04-09-2015, 10:28 PM   #10
shashankgupta
Member
 
Location: Chennai

Join Date: Feb 2015
Posts: 33

Quote:
Originally Posted by Zam
Well then, one approach is to assemble a "multicoloured" graph of your data (one colour per isolate), and then dump contigs with information about how many isolates share each contig. Then you can split things however you like - pull out the contigs that everyone shares, that 95% share, etc. Software for this is here:
http://cortexassembler.sourceforge.net/
and the paper contains an example of something similar:
http://dx.doi.org/10.1038/ng.1028

>Finally, I would like to see what difference there is between the calculated core/pan genome in 1 group compared to another defined set of isolates in another group - again not only a number of genes but an actual list of genes or gene sequences.

You can do any comparisons you like between any subsets you like in this manner. Feel free to contact me directly (zam AT well.ox.ac.uk)

I am also trying to do a pan/core genome analysis, but the above-mentioned software is for someone who is comfortable working on Linux-based systems.

Is there a simple way for a non-bioinformatician to do this kind of analysis?

Cheers !
Shashank