SEQanswers

Old 01-21-2015, 01:44 PM   #1
kirk nelson
Junior Member
 
Location: San Diego

Join Date: Jan 2015
Posts: 4
Default Genomic NGS sequencing to find spontaneous bacterial mutations

Hi there, I am new to genomic sequencing but have a lot of experience using Sanger sequencing for cloning and for analyzing spontaneous bacterial mutants (novel antibiotics). I need help.

I am currently trying to figure out how to use whole-genome NGS data from Illumina's BaseSpace to look for non-target-based mutations. I have data for the parent strain as well as several mutants.

I think I should be aligning the parent data to a published genome from GenBank or KEGG, and then aligning the mutants to the parent to look for mutations, rather than using de novo assembly for the parent. Does that sound right?

I currently have active demos of DNASTAR and Sequencher, and have performed the alignments I just described in both, only to find that there are hundreds of SNPs and indels that differ between the parent and mutant strains. Clearly most of these are not real and need to be filtered out. How do I go about doing that?

Thanks for any help you can give me!
Old 01-21-2015, 04:23 PM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,049
Default

Depending on how your strain was collected/stored it may have significant differences compared to the reference available in GenBank. You can use the sequence from your strain to judge how different it is from the published reference by doing some alignments. Do you have an idea of fold coverage you likely have for your strain?
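For a rough idea of fold coverage, the usual back-of-the-envelope estimate is total sequenced bases divided by genome size. Here is a minimal Python sketch of that arithmetic; the read count, read length, and genome size are made-up illustrative numbers, not values from this thread.

```python
def fold_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Approximate fold coverage as total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


# Hypothetical run: 2 million reads of 250 bp against a 5 Mb genome.
coverage = fold_coverage(num_reads=2_000_000, read_length=250, genome_size=5_000_000)
print(f"{coverage:.0f}x")  # prints "100x"
```

This ignores reads that fail quality filtering or do not map, so the usable coverage will typically be somewhat lower.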

You have only mentioned GUI-based software packages, but are you familiar with the command line/Unix? The latter is going to give you significantly more flexibility. If you must use a GUI-based program, look into CLC Genomics Workbench. It would be better suited for bacterial genome analysis than the programs you mentioned above.

On the Unix side of things: SPAdes is an excellent de novo assembler for bacterial genomes. If you have enough coverage for your reference (parent), then it may be best to start by assembling that data. Having a good reference would be critical to call relevant SNPs from mutants.

Another program you will find handy is Mauve, which allows genome-level comparisons.
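One common way to cut down a long SNP/indel list is to treat anything the parent's own reads also show against the reference as background, and keep only the calls unique to the mutant that pass a quality threshold. A minimal Python sketch of that idea follows. The `Variant` record, the `mutant_specific` helper, and the quality cutoff of 30 are all assumptions for illustration; they are not the export format or defaults of any particular package.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical variant record; adapt the fields to your tool's export."""
    contig: str
    position: int
    ref: str
    alt: str
    quality: float


def mutant_specific(parent_calls: List[Variant],
                    mutant_calls: List[Variant],
                    min_quality: float = 30.0) -> List[Variant]:
    """Keep mutant calls that pass the quality cutoff and are absent from the parent."""
    background = {(v.contig, v.position, v.ref, v.alt) for v in parent_calls}
    return [
        v for v in mutant_calls
        if v.quality >= min_quality
        and (v.contig, v.position, v.ref, v.alt) not in background
    ]


# Made-up example calls, for illustration only.
parent = [Variant("contig_1", 100, "A", "G", 50.0)]
mutant = [
    Variant("contig_1", 100, "A", "G", 55.0),   # shared with parent: background
    Variant("contig_1", 2500, "C", "T", 60.0),  # mutant-only, good quality
    Variant("contig_2", 10, "G", "A", 5.0),     # mutant-only, low quality
]
for v in mutant_specific(parent, mutant):
    print(v.contig, v.position, v.ref, v.alt)
```

The quality cutoff is a judgment call; stricter filters (minimum read depth, minimum variant frequency) can be added to the same list comprehension.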
Old 01-22-2015, 05:21 AM   #3
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 505
Default

Also, keep in mind that spontaneous mutations in bacteria frequently arise from transposon hopping, and SNP/indel pipelines do a poor job of detecting those. You would need to use de novo assembly plus reference comparison (e.g., Mauve), or software specifically designed for transposon detection.
Old 01-22-2015, 02:16 PM   #4
kirk nelson
Junior Member
 
Location: San Diego

Join Date: Jan 2015
Posts: 4
Default

Thank you both for your replies. They are quite helpful.

GenoMax, I think a GUI-based package is the way I need to go. We are only running Windows at my workplace. I have done a bit of simple programming in Python, but our PhD molecular biologist is less tech-savvy. I was able to download Mauve for Windows, though. It looks like my coverage is ~100x, so I presume I might be able to make a de novo reference from the parent and then align the mutants to it?

HESmith, that is a good point about transposons. I have seen lots of SNPs and indels when looking for target mutations and mutations in regulatory genes, but I should definitely be looking for transposons now that I am looking genome-wide.

I am currently running some de novo assemblies of a parent strain and its mutants, and will try to align them with Mauve.
Old 01-22-2015, 02:56 PM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,049
Default

CLC Genomics Workbench has a good assembler, and since you are looking for something GUI-based, that is an option. It is commercial software (and not inexpensive).

Geneious is another GUI-based alternative for NGS data that includes an assembler. I have heard that it is good for bacterial genome assemblies. This is also commercial software.

I don't know of a GUI based free assembler package.

Word of caution. Sometimes having too deep a coverage can cause problems with assembly. You may want to sub-sample and start with 25-30x reads.
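As a rough illustration of the sub-sampling step, here is a minimal Python sketch that works out what fraction of reads to keep and then randomly keeps that fraction of records from a FASTQ file. It assumes an uncompressed FASTQ with exactly four lines per record; the function names and the seed are placeholders chosen for this example.

```python
import random


def subsample_fraction(current_coverage: float, target_coverage: float) -> float:
    """Fraction of reads to keep to go from current to target coverage."""
    if current_coverage <= 0:
        raise ValueError("current_coverage must be positive")
    return min(1.0, target_coverage / current_coverage)


def subsample_fastq(in_path: str, out_path: str, fraction: float, seed: int = 42) -> int:
    """Keep each 4-line FASTQ record with probability `fraction`; return records kept."""
    rng = random.Random(seed)
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[0]:  # end of file
                break
            if rng.random() < fraction:
                dst.writelines(record)
                kept += 1
    return kept


# Going from ~100x down to ~30x means keeping roughly 30% of the reads.
print(subsample_fraction(current_coverage=100, target_coverage=30))
```

For paired-end data, running `subsample_fastq` on the R1 and R2 files with the same seed and fraction should keep the same records from each, provided both files list the pairs in the same order.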
Old 01-23-2015, 08:47 AM   #6
kirk nelson
Junior Member
 
Location: San Diego

Join Date: Jan 2015
Posts: 4
Default

Thanks again, GenoMax.

I will certainly try those software packages out when I get a chance.
Old 01-26-2015, 10:38 AM   #7
kirk nelson
Junior Member
 
Location: San Diego

Join Date: Jan 2015
Posts: 4
Default

OK, I feel comfortable with the SNP/indel workflow now. I performed a de novo assembly of my parent strain followed by a templated alignment of a mutant to the parent, and rediscovered the SNPs our molecular biologist had previously found by aligning the data in Microsoft Word (obviously not the most time-efficient way to find them).

I need a little clarification as to how to find transpositions in Mauve: I have performed de novo assemblies of the parent and its mutant, which gave ~10 contigs for each genome. Do I just save these as FASTA files and then align them in Mauve, or is there an intermediate step?

Thanks again for all the help!
Old 01-26-2015, 02:30 PM   #8
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,049
Default

Quote:
Originally Posted by kirk nelson View Post

I need a little clarification as to how to find transpositions in Mauve: I have performed de novo assemblies of the parent and its mutant, which gave ~10 contigs for each genome. Do I just save these as FASTA files and then align them in Mauve, or is there an intermediate step?

Thanks again for all the help!
You can align multi-FASTA files. Check this page out: http://darlinglab.org/mauve/user-guide/aligning.html

RAM requirements will change depending on how big your genomes are. Start with hardware that has at least 16 GB of RAM just to be safe. Try two genomes first before adding more, one at a time.
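Before loading the files, it can help to confirm that each multi-FASTA export really contains the contigs you expect. The sketch below is plain Python and does not call Mauve; `read_fasta` and `summarize` are hypothetical helper names, and the file name in the commented usage line is only an example.

```python
from typing import Dict


def read_fasta(path: str) -> Dict[str, str]:
    """Parse a (multi-)FASTA file into {contig_name: sequence}."""
    parts: Dict[str, list] = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                fields = line[1:].split()
                # Fall back to a generated name if the header is empty.
                name = fields[0] if fields else f"contig_{len(parts) + 1}"
                parts[name] = []
            elif name is not None:
                parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def summarize(path: str) -> Dict[str, int]:
    """Report contig count, total bases, and longest contig for one file."""
    lengths = [len(seq) for seq in read_fasta(path).values()]
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "longest": max(lengths) if lengths else 0,
    }


# Example usage (file name is a placeholder):
# print(summarize("parent_contigs.fasta"))
```

If the total length is far from the expected genome size, or the contig count is much higher than ~10, that is worth sorting out before attempting the alignment.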
Old 02-26-2015, 09:36 AM   #9
daffodil
Junior Member
 
Location: tehran

Join Date: Dec 2014
Posts: 2
Default

Hi everybody,
I don't know where to ask my question, so please help me if you can.
I am writing a proposal on male infertility using exome sequencing.
After we find a candidate gene, we use Sanger sequencing for validation, correct?
After that, is there a further validation test, for example a mouse model?
Could you also tell me what the benefit of finding a new gene is? I know it helps with detection and diagnosis; do you know of anything else?