SEQanswers

Old 07-26-2011, 09:27 AM   #1
shuang
Senior Member
 
Location: IL

Join Date: Jul 2011
Posts: 100
Protocol for SNP from Sanger sequences

My project is to find SNPs in Sanger sequences, which I have never done before. Here are the steps I could think of to achieve this. Please suggest appropriate free tools/software for each step, and let me know if I have missed any.

Step 1: quality-trim an ABI file (to FASTA).

Step 2: align to a reference genome (input FASTA, output SAM?)

Step 3: parse the output file to retrieve the SNPs?
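
As a rough illustration of Step 1, here is a minimal sketch of quality trimming, assuming the Phred scores have already been extracted from the trace as a list of integers. The function name `trim_by_quality` and the threshold of 20 are illustrative choices, not taken from any tool named in this thread. It keeps the contiguous stretch that maximises the sum of (quality - threshold), using a maximum-subarray scan.

```python
def trim_by_quality(seq, quals, threshold=20):
    """Return (start, end), a half-open interval of the best-quality stretch.

    Each base scores (quality - threshold); the stretch with the largest
    total score is kept. Returns (0, 0) if no base beats the threshold.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    best_sum, best_start, best_end = 0, 0, 0
    run_sum, run_start = 0, 0
    for i, q in enumerate(quals):
        if run_sum <= 0:
            # The running stretch no longer helps; start a new one here.
            run_sum, run_start = 0, i
        run_sum += q - threshold
        if run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, i + 1
    return best_start, best_end


seq = "ACGTACGT"
quals = [5, 8, 30, 35, 40, 38, 10, 6]
start, end = trim_by_quality(seq, quals)
trimmed_seq, trimmed_quals = seq[start:end], quals[start:end]
```

The same slice is applied to the quality list so that sequence and scores stay in step if the read is later written as FASTQ.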
Old 07-28-2011, 12:07 PM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Why use FASTA? I'd use FASTQ so that any quality scores in the reads can be taken into consideration during mapping and SNP calling. EMBOSS seqret can do this conversion (taking the sequence and quality scores as-is from the ABI file), but I'd suggest using a base caller like TraceTuner, which seems to do a better job than the ABI default pipeline. I wrote a patch to get FASTQ from TraceTuner directly; I'm not sure if it has been integrated yet.

For the mapping step, do you have DNA or RNA reads? And if RNA does your organism do gene splicing? If so, you'll want an intron/exon aware read mapper.

Last edited by maubp; 07-28-2011 at 12:13 PM. Reason: Added more details
Old 07-28-2011, 05:55 PM   #3
DZhang
Senior Member
 
Location: East Coast, US

Join Date: Jun 2010
Posts: 177

Mutation Surveyor or DNASTAR should work for you, but both are commercial software.
Old 08-11-2011, 04:26 AM   #4
gavin.oliver
Senior Member
 
Location: uk

Join Date: Jan 2010
Posts: 110

I have exactly the same problem (DNA-based). Can anyone recommend how to achieve this with open source tools?
Old 08-11-2011, 04:33 AM   #5
DZhang
Senior Member
 
Location: East Coast, US

Join Date: Jun 2010
Posts: 177

Hi gavin.oliver, there are other ways to do this, but it depends on your project scope. Can you share how many Sanger reads you have and how big the reference sequence is?
Old 08-11-2011, 04:35 AM   #6
gavin.oliver
Senior Member
 
Location: uk

Join Date: Jan 2010
Posts: 110

It will only be a single human gene.
Old 08-11-2011, 04:53 AM   #7
DZhang
Senior Member
 
Location: East Coast, US

Join Date: Jun 2010
Posts: 177

I assume you are doing exon sequencing via PCR. Any multiple alignment program (e.g., ClustalW) or BLAST should do if you do not have too many traces and are not expecting too many types of SNPs. More powerful programs, like MIRA or PolyPhred, may be too much for a one-time small project but can handle SNP detection extremely well.
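
For the small-project case, reading SNPs off a pairwise alignment can be sketched as below. This assumes the gapped reference and read strings have already been produced by an aligner such as ClustalW or BLAST; the function name `find_substitutions` and its `ref_start` parameter are hypothetical and not part of either tool.

```python
def find_substitutions(aligned_ref, aligned_read, ref_start=1):
    """List (reference_position, ref_base, read_base) for each mismatch.

    Both inputs are gapped strings of equal length, with '-' marking gaps.
    Positions are 1-based reference coordinates, offset by ref_start.
    Indel columns and columns containing N are skipped.
    """
    if len(aligned_ref) != len(aligned_read):
        raise ValueError("aligned strings must have the same length")
    snps = []
    ref_pos = ref_start - 1
    for r, q in zip(aligned_ref.upper(), aligned_read.upper()):
        if r != "-":
            ref_pos += 1          # only reference bases advance the coordinate
        if r == "-" or q == "-":
            continue              # insertion or deletion, not a substitution
        if r != q and "N" not in (r, q):
            snps.append((ref_pos, r, q))
    return snps


snps = find_substitutions("ACGT-ACGT", "ACTTGAC-T", ref_start=101)
```

If the base caller emits IUPAC ambiguity codes at mixed peaks (for example R for A/G), this sketch simply reports them as mismatches, which may be a useful flag for a possible heterozygous site but should be checked against the trace.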
Old 08-11-2011, 05:29 AM   #8
gavin.oliver
Senior Member
 
Location: uk

Join Date: Jan 2010
Posts: 110

The plan was actually to use Sanger sequencing to sequence an entire 25 kb gene (40 samples). There doesn't seem to be a workable NGS solution on offer.

Last edited by gavin.oliver; 08-11-2011 at 05:35 AM.
Old 08-11-2011, 05:38 AM   #9
DZhang
Senior Member
 
Location: East Coast, US

Join Date: Jun 2010
Posts: 177

You may look into BWA or Bowtie to align Sanger reads; both can handle long reads. Then proceed to SNP/small indel calls (e.g., with samtools). I would trim the Sanger reads at both ends first.

Before NGS emerged a few years ago, Sanger was the most popular way of obtaining sequence, so there are many tools to perform alignments and SNP calls. In your case, I believe Consed (free for academic use) or MIRA should work.
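
The align-then-call route could be wired together roughly as follows. This is only a sketch under stated assumptions: the subcommands shown (`bwa mem`, `samtools sort -o`, `bcftools mpileup` and `bcftools call`) come from current releases of those tools rather than the 2011 versions discussed here, and the file names and the helpers `build_pipeline` and `run_pipeline` are hypothetical. Check each tool's own help output before relying on any flag.

```python
import subprocess


def build_pipeline(reference, fastq, prefix):
    """Return a list of (argv, stdout_path) steps.

    stdout_path is None when the tool writes its own output file.
    """
    sam = prefix + ".sam"
    bam = prefix + ".sorted.bam"
    bcf = prefix + ".pileup.bcf"
    vcf = prefix + ".calls.vcf"
    return [
        (["bwa", "index", reference], None),               # index the reference
        (["bwa", "mem", reference, fastq], sam),           # align; SAM goes to stdout
        (["samtools", "sort", "-o", bam, sam], None),      # coordinate-sort to BAM
        (["samtools", "index", bam], None),
        (["bcftools", "mpileup", "-f", reference,
          "-O", "u", "-o", bcf, bam], None),               # per-base likelihoods
        (["bcftools", "call", "-m", "-v",
          "-O", "v", "-o", vcf, bcf], None),               # variant sites only
    ]


def run_pipeline(steps):
    """Run each step in order, stopping at the first failure."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


steps = build_pipeline("gene.fa", "trimmed_reads.fastq", "sample1")
# run_pipeline(steps)  # requires bwa, samtools and bcftools on the PATH
```

Building the command list separately from running it makes it easy to print and inspect the commands for each of the 40 samples before anything is executed.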
Old 08-11-2011, 05:47 AM   #10
gavin.oliver
Senior Member
 
Location: uk

Join Date: Jan 2010
Posts: 110

Thanks a lot. Do you think the Sanger reads could be converted to FASTQ to work with BWA etc?
Old 08-11-2011, 06:03 AM   #11
DZhang
Senior Member
 
Location: East Coast, US

Join Date: Jun 2010
Posts: 177

I am not aware of any program that can convert the quality scores in a Sanger trace to FASTQ. But you can simply convert Sanger reads in FASTA to FASTQ. One tool I know of is from the MAQ package; it is a simple Perl script.
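
A FASTA-to-FASTQ conversion without real quality scores amounts to giving every base the same placeholder value. The sketch below is not the MAQ script; the function names and the placeholder of Phred 30 are assumptions made for illustration, and the output uses the Sanger FASTQ encoding (Phred + 33).

```python
def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_fastq(text, placeholder_phred=30):
    """Convert FASTA text to FASTQ text with one constant quality value."""
    if not 0 <= placeholder_phred <= 93:
        raise ValueError("Phred score must be in the range 0..93")
    qual_char = chr(33 + placeholder_phred)   # Sanger FASTQ uses ASCII offset 33
    records = []
    for name, seq in parse_fasta(text):
        records.append("@%s\n%s\n+\n%s\n" % (name, seq, qual_char * len(seq)))
    return "".join(records)


fastq_text = fasta_to_fastq(">read1\nACGT\nAC\n")
```

With a constant placeholder the mapper and SNP caller cannot tell reliable bases from unreliable ones, so this is a fallback for when the real scores are not available.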
Old 08-11-2011, 06:06 AM   #12
gavin.oliver
Senior Member
 
Location: uk

Join Date: Jan 2010
Posts: 110

I'll give it a look - thanks again!
Old 08-11-2011, 06:15 AM   #13
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Quote:
Originally Posted by gavin.oliver View Post
Thanks a lot. Do you think the Sanger reads could be converted to FASTQ to work with BWA etc?
Yes, if you have ABI files for your "Sanger" capillary sequence reads you can convert them to FASTQ.

You can use EMBOSS seqret as described here. Note that if you have the new EMBOSS 6.4.0 release, you'll want the patch for this bug:
http://lists.open-bio.org/pipermail/...st/000695.html
http://lists.open-bio.org/pipermail/...st/000698.html

You will also be able to use the next release of Biopython to convert ABI to FASTQ (and other formats).
http://lists.open-bio.org/pipermail/...st/009087.html

You can also use a base caller like TraceTuner which should give slightly better base predictions than the default ABI pipeline. I wrote a patch to offer FASTQ output from TraceTuner, but by default you can get FASTA + QUAL, and convert that to FASTQ.
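
The FASTA + QUAL to FASTQ step can be sketched in plain Python as below. It assumes the QUAL file uses the same record names as the FASTA file and holds whitespace-separated integer Phred scores; the function names are hypothetical and this is not the TraceTuner patch or any existing converter.

```python
def read_records(text):
    """Yield (name, body_lines) for each '>'-headed record in the text."""
    name, body = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, body
            # Use the first word of the header as the record name.
            name, body = (line[1:].split() or [""])[0], []
        elif name is not None:
            body.append(line)
    if name is not None:
        yield name, body


def fasta_qual_to_fastq(fasta_text, qual_text):
    """Merge FASTA and QUAL text into Sanger-style (Phred+33) FASTQ text."""
    quals = {
        name: [int(q) for line in body for q in line.split()]
        for name, body in read_records(qual_text)
    }
    records = []
    for name, body in read_records(fasta_text):
        seq = "".join(body)
        scores = quals.get(name)
        if scores is None or len(scores) != len(seq):
            raise ValueError("no matching quality scores for %r" % name)
        # Clamp to the printable range used by Sanger FASTQ (0..93).
        qual_string = "".join(chr(33 + min(max(q, 0), 93)) for q in scores)
        records.append("@%s\n%s\n+\n%s\n" % (name, seq, qual_string))
    return "".join(records)


fastq_text = fasta_qual_to_fastq(">r1 trace\nACGT\n", ">r1 trace\n10 20\n30 40\n")
```

Raising an error on a missing or mismatched record is a deliberate choice here, since silently padding qualities would hide a base-calling or file-pairing problem.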
Old 08-11-2011, 06:19 AM   #14
gavin.oliver
Senior Member
 
Location: uk

Join Date: Jan 2010
Posts: 110

Brilliant stuff - thanks a lot