SEQanswers

Old 01-30-2012, 05:01 PM   #1
austic
Member
 
Location: Denver, CO, USA

Join Date: Mar 2011
Posts: 11
Help! SNP determination across multiple assemblies

First off, I apologize for the long post.

Second, I have not found anything via searches that definitively helps me, so I am starting a new thread here.

Ok, so let me give you some info on my project (I am a Ph.D. student in Microbiology) so that you can understand the question I am trying to answer.
1) I study a bacterium
2) If grown on a petri dish with nutrient agar, and allowed to grow up from a single cell, bacteria will form what we call "colonies" which are visible by eye and are usually quite similar in their physical characteristics within a given species
3) My bacterium likes to make different-looking colonies
4) These different-looking colonies are more than just interesting to look at, they also have implications for physiology, behavior and pathogenicity (my bug is an opportunistic pathogen).
5) A collaborator of ours offered to sequence some of these "colony variants" for us and I have the data back and assembled
5a) The data was gathered on a Roche 454 platform
5b) I have been using DNASTAR's SeqMan NGen to assemble the data and DNASTAR's SeqMan Pro to view the assemblies
5c) I have not closed any of the genomes with new sequence data

6) I have not yet been able to find a piece of software that will let me compare all of my sequenced variants at one time to determine whether any given mutation is important. So I built a spreadsheet by hand and manually searched all of the assemblies for the read/base-pair composition at every site of interest (usually found in the SNP report for one of the variants). Unfortunately, this produces about 1,000 SNPs and introduces an unacceptable amount of human error (discovered the hard way). Neither problem can be fixed by brute force, whether with re-sequencing or with man-hours (trust me, I've tried).

I **desperately** need a tool that will:
-take different assemblies from a nearly isogenic collection of samples and arrange them to see what is similar/different about them
-highlight regions that *could* be of interest but would normally be filtered out due to low depth of coverage (and could be filled in by targeted re-sequencing)
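
As a rough illustration of the two requirements above, here is a minimal Python sketch of the kind of cross-sample table being asked for. Everything in it is an assumption made for illustration: the input dictionaries, the function names `build_site_table` and `classify`, and the depth cutoff of 10 are hypothetical and do not correspond to any DNASTAR or SAMtools output format. It assumes each sample's SNP calls and per-site read depths have already been extracted against a common reference.

```python
# Hypothetical depth cutoff; choose one suited to your own coverage.
MIN_DEPTH = 10

def build_site_table(calls_by_sample, depth_by_sample, min_depth=MIN_DEPTH):
    """Merge per-sample SNP calls into one table keyed by site.

    calls_by_sample: {sample: {(contig, position): alt_base}}
    depth_by_sample: {sample: {(contig, position): read_depth}}
    Returns {(contig, position): {sample: alt_base | "ref" | "low_cov"}}.
    """
    # Collect every site that is called as a SNP in at least one sample.
    sites = set()
    for calls in calls_by_sample.values():
        sites.update(calls)

    table = {}
    for site in sorted(sites):
        row = {}
        for sample, calls in calls_by_sample.items():
            depth = depth_by_sample.get(sample, {}).get(site, 0)
            if depth < min_depth:
                row[sample] = "low_cov"   # cannot be trusted either way
            elif site in calls:
                row[sample] = calls[site]
            else:
                row[sample] = "ref"
        table[site] = row
    return table

def classify(row):
    """Label a site by what the confidently covered samples show."""
    confident = {value for value in row.values() if value != "low_cov"}
    if len(confident) > 1:
        return "differs"
    if "low_cov" in row.values():
        return "needs_resequencing"
    return "shared"

# Toy data for three colony variants (made up for this example).
calls = {
    "A": {("c1", 100): "T"},
    "B": {("c1", 100): "T", ("c1", 200): "G"},
    "C": {},
}
depth = {
    "A": {("c1", 100): 30, ("c1", 200): 25},
    "B": {("c1", 100): 40, ("c1", 200): 35},
    "C": {("c1", 100): 3, ("c1", 200): 28},
}

table = build_site_table(calls, depth)
for site, row in table.items():
    print(site, row, classify(row))
```

Sites labelled `needs_resequencing` are the ones a depth filter would normally hide, so they are candidates for targeted re-sequencing; sites labelled `differs` separate the variants among the samples that are confidently covered.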

I was under the impression that SAMtools (SAM stands for Sequence Alignment/Map) could do this for me with its pileup function, and I am in the process of installing it. However, I have run into several snags, the biggest of which is that the GNU Compiler Collection (GCC) will not update from 4.1.2 to anything higher. It looks like I am going to have to find a new OS and re-install because, apparently, my OS (CentOS 5.6) is no longer supported.

Yet I cannot keep throwing my time away on this project. It is supposed to be a preliminary side project and we have been working on it for over a year now, so I have to stop with this trial-and-error nonsense and actually finish the data analysis. I am begging you guys: if you know of any piece of software that would do this, what is it and what do I need to run it? (And will you coach me through getting the thing up and running?) Or, ***if you have personal experience doing this kind of thing (or know someone who does), could you PLEASE contact me*** - I will seriously buy you beer and make you cookies or whatever you want.

Last edited by austic; 01-30-2012 at 05:08 PM.
Old 01-30-2012, 05:19 PM   #2
adaptivegenome
Super Moderator
 
Location: US

Join Date: Nov 2009
Posts: 437

Send me a PM and I can talk with you offline.
Old 02-28-2012, 04:20 AM   #3
austic
Member
 
Location: Denver, CO, USA

Join Date: Mar 2011
Posts: 11

Thank you guys so much for all your help!
Old 02-28-2012, 05:18 AM   #4
Zam
Member
 
Location: Oxford

Join Date: Apr 2010
Posts: 51

Sounds like you are sorted. I only just saw this, so the following is probably not needed, but I have a tool that (I think) does what you want: it will assemble all of your samples simultaneously and tell you what the differences are and which strains have which variants.

Website: cortexassembler.sourceforge.net
Paper:
Z Iqbal, M Caccamo, I Turner, P Flicek, G McVean. De novo assembly and genotyping of variants using colored de Bruijn graphs. Nature Genetics (2012) (doi:10.1038/ng.1028)
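
To give a feel for the "colored" idea named in the paper title, here is a toy Python sketch. It is not Cortex's implementation or interface: the function names are hypothetical, and it only records which samples (colors) contain each k-mer, ignoring graph edges, reverse complements, and sequencing errors, all of which a real assembler has to handle.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def color_kmers(reads_by_sample, k):
    """Map each k-mer to the set of samples (colors) it appears in."""
    colors = {}
    for sample, reads in reads_by_sample.items():
        for read in reads:
            for kmer in kmers(read, k):
                colors.setdefault(kmer, set()).add(sample)
    return colors

def variable_kmers(colors, samples):
    """Keep only the k-mers that are missing from at least one sample."""
    all_samples = set(samples)
    return {kmer: present for kmer, present in colors.items()
            if present != all_samples}

# Two made-up reads that differ by a single base (T vs A at position 4).
reads = {"s1": ["ACGTAC"], "s2": ["ACGAAC"]}
colors = color_kmers(reads, k=3)
variable = variable_kmers(colors, reads)

for kmer in sorted(variable):
    print(kmer, sorted(variable[kmer]))
```

K-mers shared by every sample drop out, so what remains clusters around the sites where the strains differ, each tagged with the strains that carry it.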

Feel free to contact me (zam AT well.ox.ac.uk) if you want to find out more.
Old 02-28-2012, 07:37 AM   #5
Zam
Member
 
Location: Oxford

Join Date: Apr 2010
Posts: 51

PS - if you let me know how many strains you have and what kind of coverage, I can tell you how hard this is. But roughly speaking, I would expect you to be able to get results within a day.
Tags
intrastrain snp
