  • Phylogeny on SNP

    Hello,

    I'm looking for a tool that infers phylogenies of different species from SNP calls. Any suggestions? I have a table of positions and SNP calls and want to infer the phylogeny of the species from those SNPs. At the moment I don't know how to tackle the problem other than binarizing the calls and clustering them with different algorithms.


    Thanks,


    Phil

  • #2
    Most phylogeny estimation tools (PHYLIP, PhyML, PAUP*, MrBayes, *BEAST etc.) expect an alignment as input, in a format such as FASTA, PHYLIP or NEXUS. SNPs alone are tricky for those tools since a lot of data is ignored (everything in between the SNPs), which makes estimating branch lengths difficult.
    Also keep in mind that there might not actually *be* a simple tree underlying your data - recombination and incomplete lineage sorting will make the ancestry of the sequences a potentially complex network, not a simple tree.
    With those caveats, I think making a fasta-formatted input file is your best bet.
    good luck!



    • #3
      Originally posted by brofallon
      .
      With those caveats, I think making a fasta-formatted input file is your best bet.
      good luck!
      So one possibility is to just concatenate the SNP calls into a single sequence and run the analysis on that. Getting a network isn't such a bad thing...



      • #4
        If you concat the SNPs, and therefore ignore all invariant sites, you'll probably get approximately the correct tree topology, but branch lengths that are much too long. Some programs may break under these conditions, I'm not entirely sure. I'd be curious to hear what the results look like if you do it...
        B



        • #5
          Originally posted by brofallon
          If you concat the SNPs, and therefore ignore all invariant sites, you'll probably get approximately the correct tree topology, but branch lengths that are much too long. Some programs may break under these conditions, I'm not entirely sure. I'd be curious to hear what the results look like if you do it...
          B
          Does anybody know how to concatenate the SNPs to make the fasta sequence? I have the same problem now.
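One way the concatenation asked about here could be done, as a minimal sketch only. It assumes the SNP calls are already in memory as (position, {sample: base}) pairs; that layout, the function name `snps_to_fasta` and the sample names are hypothetical, and missing calls are written as N. As noted earlier in the thread, an alignment of variable sites only will tend to inflate branch lengths.

```python
def snps_to_fasta(snp_table, samples):
    """Concatenate per-site SNP calls into one pseudo-sequence per sample.

    snp_table: list of (position, {sample: base}) tuples (hypothetical layout).
    Samples without a call at a site get 'N'.
    """
    seqs = {s: [] for s in samples}
    # Walk the sites in position order so every sample gets the same column order.
    for _pos, calls in sorted(snp_table, key=lambda row: row[0]):
        for s in samples:
            seqs[s].append(calls.get(s, "N"))
    lines = []
    for s in samples:
        lines.append(f">{s}")
        lines.append("".join(seqs[s]))
    return "\n".join(lines) + "\n"


# Made-up example data, not taken from any real dataset.
example = [
    (145, {"sp1": "A", "sp2": "G", "sp3": "A"}),
    (487, {"sp1": "G", "sp2": "G", "sp3": "A"}),
    (682, {"sp1": "C", "sp2": "G"}),  # sp3 has no call at this site
]
print(snps_to_fasta(example, ["sp1", "sp2", "sp3"]), end="")
```

The resulting text can be saved to a .fasta file and passed to whichever tree program accepts FASTA alignments.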



          • #6
            Hi everybody,
            Me too, I really need an answer!

            I have reads from 4 diploid strains and one well-annotated reference genome.
            I called SNPs with CLC and made a Venn diagram to show the similarities and differences between my 4 strains.

            And now I'm stuck!

            I would like to build a phylogenetic tree from the SNP data, using not the number of SNPs but the nucleotide information at each SNP (indel, mutation, mutation rate).

            There should be software that encodes each SNP position with something like a diploid code (AA, A- or --) and builds a tree from that information!

            Can you help me, please?

            Thank you

            Marie-Mathilde



            • #7
              Keep in mind that it's unlikely that there is a single phylogenetic tree underlying the data. Recombination is likely to make the trees differ from SNP to SNP, so taking a bunch of SNPs and forcing them into a non-recombining tree may not be that helpful.
              You can try ACG (arup.utah.edu/acg). It can make recombining trees from SNPs in a VCF (or multiple VCFs) and a reference.



              • #8
                I assume that these programs really only need the number of pairwise differences between the samples, so you should be able to input this difference matrix directly (better for a few samples with long DNA and many differences).

                Making a FASTA from the VCF is also straightforward. I just wrote a program for that (SNPs only) that handles the chromosomes separately. You could also merge the chromosomes, but that gives long FASTA files and you'd be back to the difference-matrix option.
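A rough sketch of the difference-matrix idea, assuming the samples are already available as equal-length strings of SNP calls. The function name `difference_matrix` and the example sequences are made up for illustration, and sites where either sample has N are skipped. Note that only distance-based methods (neighbor joining, for example) take such a matrix as input; likelihood and Bayesian programs need the alignment itself.

```python
from itertools import combinations


def difference_matrix(seqs):
    """Count pairwise differences between equal-length SNP strings.

    seqs: {sample_name: sequence}. Sites where either base is 'N' are ignored.
    Returns a nested dict dist[a][b] with a zero diagonal.
    """
    names = list(seqs)
    dist = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = sum(
            1
            for x, y in zip(seqs[a], seqs[b])
            if x != y and "N" not in (x, y)
        )
        dist[a][b] = dist[b][a] = d  # the matrix is symmetric
    return dist


# Made-up sequences for illustration only.
seqs = {"s1": "AGCT", "s2": "GGCT", "s3": "GACN"}
dist = difference_matrix(seqs)
for name in seqs:
    print(name, *(dist[name][other] for other in seqs))
```

Raw counts like these are uncorrected for multiple substitutions and for the number of comparable sites, so they are only a starting point for a distance method.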

                ----------edit-------------------------------

                Just use mtDNA and the non-recombining region of the Y chromosome for separate maternal and paternal phylogenetic trees (primates?)

                ---------edit------------------------------------

                Hmm, there should be a program that filters out the recombined chunks and computes the distance in the closely related regions only.

                ---------edit--------------------------------

                Take one of the 2 phases/alleles/haplotypes at random (e.g. HapMap has them sorted alphabetically, so always taking the first one can introduce bias).
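The random choice suggested here could look roughly like the following sketch. It assumes unphased diploid genotypes given as two-letter strings such as "AG" and picks one allele independently at each site; the function name `pick_random_alleles` and the example genotypes are hypothetical. For phased data you would pick one whole haplotype instead.

```python
import random


def pick_random_alleles(genotypes, seed=None):
    """Build a haploid pseudo-sequence from unphased diploid genotypes.

    genotypes: list of two-character strings, one per SNP site (e.g. "AG").
    One of the two alleles is drawn at random per site, which avoids the
    bias of always taking the first (e.g. alphabetically sorted) allele.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return "".join(rng.choice(gt) for gt in genotypes)


# Made-up genotypes for illustration only.
genotypes = ["AA", "AG", "CT", "GG"]
print(pick_random_alleles(genotypes, seed=1))
```

Homozygous sites always come out unchanged; only heterozygous sites are affected by the draw.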

                -------------------------------------
                Last edited by gsgs; 12-19-2012, 05:45 PM.



                • #9
                  I have an Excel file of SNPs (mutational and recombinant). How can I extract only the mutational SNPs into a new FASTA file?



                  • #10
                    Save the Excel file as a text file and post a few lines as an example.



                    • #11
                      Still need help

                      Hello everybody,
                      I really need to build a phylogenetic tree from my SNPs.
                      Because I am not a bioinformatician, I used CLC Genomics to "map and call" my SNPs.
                      Now I have a file that looks like this:

                      Chromosome   Region  Reference  Allele  Strain
                      contig_1     145     A          G       d
                      contig_1     487     G          A       a, d, f
                      contig_1     682     C          G       b, d
                      contig_333   1156    T          G       a
                      contig_1234  566     C          T       b
                      contig_1234  612     C          G       b, d

                      So I have 4 strains (a, b, d and f) and 1 reference genome made up of many contigs.
                      Can somebody help me?
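A table in this layout could be turned into a FASTA alignment along the following lines. This is only a sketch under two loud assumptions: a strain not listed in the Strain column is taken to carry the reference base at that position (it might actually be uncovered), and every listed variant is treated as homozygous. The function name `table_to_fasta` is made up, and the header row is assumed to have been removed.

```python
# The data rows from the post above, without the header line.
TABLE = """\
contig_1 145 A G d
contig_1 487 G A a, d, f
contig_1 682 C G b, d
contig_333 1156 T G a
contig_1234 566 C T b
contig_1234 612 C G b, d
"""


def table_to_fasta(text, strains):
    """Turn rows of 'contig position ref alt strain-list' into FASTA text.

    Assumption: strains missing from the strain list carry the reference base.
    Sites are kept in the order they appear in the table.
    """
    seqs = {name: [] for name in ["reference"] + list(strains)}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split into at most 5 fields so "a, d, f" stays together as the last one.
        _contig, _pos, ref, alt, carriers = line.split(None, 4)
        carriers = {c.strip() for c in carriers.split(",")}
        seqs["reference"].append(ref)
        for s in strains:
            seqs[s].append(alt if s in carriers else ref)
    return "".join(f">{name}\n{''.join(bases)}\n" for name, bases in seqs.items())


print(table_to_fasta(TABLE, ["a", "b", "d", "f"]), end="")
```

With only six sites the resulting tree would carry very little signal, so this is meant to show the conversion step rather than a usable analysis.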

                      Thank you very much
