
  • Protocol for SNP from Sanger sequences

    My project is to find SNPs in Sanger sequences, which I have never done before. Here are the steps I could think of. Please suggest appropriate free tools/software for each step, and let me know if I have missed any.

    Step 1: Quality-trim an ABI file (to FASTA).

    Step 2: Align to a reference genome (input FASTA, output SAM?).

    Step 3: Parse the output file to retrieve SNPs?

  • #2
    Why use FASTA? I'd use FASTQ so that the quality scores in the reads can be taken into account during mapping and SNP calling. EMBOSS seqret can do this conversion (taking the sequence and quality scores as they are from the ABI file), but I'd suggest using a base caller like TraceTuner, which seems to do a better job than the default ABI pipeline. I wrote a patch to get FASTQ from TraceTuner directly, but I'm not sure whether it has been integrated yet.

    For the mapping step, do you have DNA or RNA reads? If RNA, does your organism splice its genes? If so, you'll want an intron/exon-aware read mapper.
    Last edited by maubp; 07-28-2011, 12:13 PM. Reason: Added more details


    • #3
      Mutation Surveyor or DNASTAR should work for you, but both are commercial software.


      • #4
        I have precisely the same problem (DNA-based). Can anyone recommend how to achieve this with open source tools?


        • #5
          Hi gavin.oliver, there are several ways to do this, but the best one depends on your project scope. Can you share how many Sanger reads you have and how big your reference sequence is?


          • #6
            It will only be a single human gene.


            • #7
              I assume you are doing exon sequencing via PCR. Any multiple alignment program (e.g., ClustalW) or BLAST should do if you do not have too many traces and are not expecting too many different SNPs. More powerful programs, like MIRA or PolyPhred, may be overkill for a one-time small project, but they handle SNP detection extremely well.
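
For a handful of traces, the alignment-based approach above comes down to walking along an aligned reference/read pair and noting where the bases differ. Here is a minimal sketch of that idea, assuming you have already exported the two aligned strings (with `-` for gaps) from whatever aligner you used. The function name `find_mismatches` and its parameters are made up for illustration and do not come from any of the tools mentioned. It only reports substitutions and ignores gaps and `N` calls.

```python
def find_mismatches(ref_aln, read_aln, ref_start=1):
    """Return (ref_position, ref_base, read_base) for each substitution.

    Both strings must come from the same pairwise alignment, so they
    have equal length and use '-' for gap columns.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned sequences must have equal length")
    mismatches = []
    ref_pos = ref_start - 1  # last reference coordinate consumed so far
    for ref_base, read_base in zip(ref_aln.upper(), read_aln.upper()):
        if ref_base != "-":
            ref_pos += 1  # only non-gap reference columns advance the coordinate
        # Skip indel columns and uncalled bases; they are not substitutions.
        if "-" in (ref_base, read_base) or "N" in (ref_base, read_base):
            continue
        if ref_base != read_base:
            mismatches.append((ref_pos, ref_base, read_base))
    return mismatches


if __name__ == "__main__":
    print(find_mismatches("ACGT-ACGT", "ACCTTAC-T"))
```

A mismatch seen in a single low-quality read is not a SNP call by itself, so in practice you would want to check the trace or require support from both forward and reverse reads.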


              • #8
                The plan was actually to use Sanger sequencing to cover an entire 25 kb gene in 40 samples. There doesn't seem to be a workable NGS solution on offer.
                Last edited by gavin.oliver; 08-11-2011, 05:35 AM.


                • #9
                  You may want to look into BWA or Bowtie to align the Sanger reads; both can handle long reads. Then proceed to SNP and small indel calling (e.g., with samtools). I would trim the Sanger reads at both ends first.

                  Before NGS emerged a few years ago, Sanger was the most popular way of obtaining sequence, so there are many tools for alignment and SNP calling. In your case, I believe Consed (free for academic use) or MIRA should work.
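
To illustrate the "trim at both ends" step: Sanger reads typically have poor-quality bases at the start and end, so one simple approach is to clip each end until a base with an acceptable Phred score is reached. The sketch below assumes you already have the bases and their Phred scores in memory. The name `trim_ends` and the cutoff of 20 are illustrative choices, not settings taken from any tool in this thread, and real trimmers usually use a sliding window rather than a single-base test.

```python
def trim_ends(seq, quals, min_q=20):
    """Clip low-quality bases from both ends of a read.

    seq   -- string of base calls
    quals -- list of integer Phred scores, one per base
    min_q -- bases below this score are clipped from each end (assumed cutoff)
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    start, end = 0, len(seq)
    # Advance past the low-quality stretch at the 5' end.
    while start < end and quals[start] < min_q:
        start += 1
    # Back up past the low-quality stretch at the 3' end.
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]


if __name__ == "__main__":
    print(trim_ends("ACGTACGT", [5, 10, 30, 40, 40, 30, 12, 8]))
```

Low-quality bases in the middle of the read are left alone here; the mapper and SNP caller can still take them into account if the scores are carried through as FASTQ.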


                  • #10
                    Thanks a lot. Do you think the Sanger reads could be converted to FASTQ to work with BWA etc?


                    • #11
                      I am not aware of any program that can convert the quality scores in a Sanger trace to FASTQ. But you can simply convert the Sanger reads from FASTA to FASTQ. One tool I know of is in the MAQ package; it is a simple Perl script.
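
A FASTA-to-FASTQ conversion like this has no real quality information to work with, so every base has to be given the same made-up score. Here is a rough sketch of what such a conversion does, written in Python rather than Perl. It is not the MAQ script. The function names and the placeholder score of 30 are assumptions for illustration, and the output uses the standard Sanger FASTQ encoding (Phred + 33).

```python
def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_fastq(text, placeholder_q=30):
    """Convert FASTA text to FASTQ, giving every base the same Phred score."""
    qual_char = chr(placeholder_q + 33)  # Sanger FASTQ: ASCII code = Phred + 33
    out = []
    for name, seq in parse_fasta(text):
        out.append(f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n")
    return "".join(out)


if __name__ == "__main__":
    print(fasta_to_fastq(">read1\nACGT\nAC\n"), end="")
```

Because the scores are invented, a SNP caller will treat every base as equally reliable, which is exactly what you lose by not going from the trace qualities.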


                      • #12
                        I'll give it a look - thanks again!


                        • #13
                          Originally posted by gavin.oliver
                          Thanks a lot. Do you think the Sanger reads could be converted to FASTQ to work with BWA etc?
                          Yes, if you have ABI files for your "Sanger" capillary sequence reads you can convert them to FASTQ.

                          You can use EMBOSS seqret as described here. Note that if you have the new EMBOSS 6.4.0 release, you'll want the patch for this bug:



                          You will also be able to use the next release of Biopython to convert ABI to FASTQ (and other formats).


                          You can also use a base caller like TraceTuner which should give slightly better base predictions than the default ABI pipeline. I wrote a patch to offer FASTQ output from TraceTuner, but by default you can get FASTA + QUAL, and convert that to FASTQ.
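
The last step mentioned above, merging FASTA + QUAL into FASTQ, is mostly bookkeeping: pair each sequence with its list of Phred scores and write the scores as ASCII characters (score + 33 in the standard Sanger FASTQ encoding). Here is a dependency-free sketch of that merge, assuming the QUAL file uses the usual layout of FASTA-style headers followed by whitespace-separated integers. The function names are illustrative and are not part of TraceTuner, EMBOSS, or Biopython. If Biopython is installed, its `Bio.SeqIO` module should cover the same ground with less code.

```python
def read_records(text):
    """Yield (record_id, body_lines) from FASTA-style text (FASTA or QUAL)."""
    name, lines = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, lines
            # Keep only the identifier, dropping any description after it.
            name, lines = line[1:].split()[0], []
        elif name is not None:
            lines.append(line)
    if name is not None:
        yield name, lines


def fasta_qual_to_fastq(fasta_text, qual_text):
    """Merge matching FASTA and QUAL records into Sanger-encoded FASTQ text."""
    seqs = {name: "".join(lines) for name, lines in read_records(fasta_text)}
    out = []
    for name, lines in read_records(qual_text):
        scores = [int(value) for value in " ".join(lines).split()]
        seq = seqs[name]  # KeyError if the QUAL record has no FASTA partner
        if len(seq) != len(scores):
            raise ValueError(f"length mismatch for record {name}")
        # Cap at 93 so every character stays within printable ASCII.
        quals = "".join(chr(min(score, 93) + 33) for score in scores)
        out.append(f"@{name}\n{seq}\n+\n{quals}\n")
    return "".join(out)


if __name__ == "__main__":
    print(fasta_qual_to_fastq(">read1\nACGT\n", ">read1\n10 20\n30 40\n"), end="")
```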


                          • #14
                            Brilliant stuff - thanks a lot

