  • Calculating r^2 in vcftools for only one SNP?

    I would like to check for a set of SNPs in a given region how strongly they correlate with one particular SNP.

    I know that vcftools has the --geno-r2 option, which will give me this information. However, it takes very long to run because it calculates r^2 for every possible pair of SNPs, and I only need r^2 for one SNP against all the others. PLINK, for example, has the `--ld-snp` option, which lets you specify a single SNP to calculate against.

    Is there any way I can do this with vcftools?

  • #2
    Yes, you can. Using --chr, --from-bp, and --to-bp you can specify the region of interest.

    • #3
      I know that, but this will still calculate the pairwise r^2 for all SNPs in this area. What I want is just the r^2 from one SNP in this region to all others.
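To make the one-against-all idea concrete, here is a minimal Python sketch. It assumes biallelic SNPs and treats genotype r^2 as the squared Pearson correlation of alt-allele counts (0, 1, 2), which is my understanding of what --geno-r2 reports. The function names and the toy VCF lines are made up for illustration; this is not how vcftools or PLINK implement it.

```python
import math

def dosage(gt):
    """Convert a VCF GT string ('0/1', '1|1') to an alt-allele count, or None if missing."""
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2 or "." in alleles:
        return None
    return sum(1 for a in alleles if a != "0")

def r2(x, y):
    """Squared Pearson correlation over samples where both genotypes are called."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return math.nan
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return math.nan  # one of the sites is monomorphic in these samples
    return sxy * sxy / (sxx * syy)

def r2_against_focal(vcf_lines, focal_id):
    """Return {snp_id: r2} for every SNP other than the focal one."""
    records = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Assumes GT is the first FORMAT subfield of each sample column.
        records[fields[2]] = [dosage(s.split(":")[0]) for s in fields[9:]]
    focal = records[focal_id]
    return {sid: r2(focal, g) for sid, g in records.items() if sid != focal_id}

# Toy data (hypothetical SNP ids and genotypes), four samples.
toy_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4",
    "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/1\t1/1\t0/1",
    "1\t200\trs2\tC\tT\t.\tPASS\t.\tGT\t1/1\t0/1\t0/0\t0/1",
    "1\t300\trs3\tG\tA\t.\tPASS\t.\tGT\t0/0\t0/0\t0/1\t1/1",
]
result = r2_against_focal(toy_vcf, "rs1")
print(result)
```

This does n-1 comparisons for n SNPs instead of n(n-1)/2, which is where the time saving over a full pairwise run would come from.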

      I just ended up making a tped file and running this through PLINK now, although this seems unnecessarily complicated.

      • #4
        I would expect that most of the ways to do this would be complicated because your use case strays a bit from the beaten path. I find it a little odd to talk about r^2 within a "given region", and be concerned about the time it takes to do a full pairwise calculation, because if it takes less than 2h to calculate it's probably a waste of time to look for other solutions. What is the region size? How many SNPs? How many individuals?

        • #5
          Well, the idea is to use this in local Manhattan plots from a GWAS, looking at the r^2 between the most significantly associated SNP and other SNPs around it. As far as I am aware this is not too exotic of an application, but maybe I am wrong.

          And since I want to do this for many cases, more or less on demand, 2h is quite a long time. For comparison, the PLINK method now takes maybe a minute or so.

          • #6
            "And since I want to do this for many cases, more or less on demand, 2h is quite a long time. For comparison, the PLINK method now takes maybe a minute or so."

            Ah, okay. In that case, write a script to do the conversion to TPED and run PLINK. Then while you're collecting money (or saving time) due to the use of your script, hunt around for more efficient solutions.
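A rough sketch of what such a conversion script could look like in Python, assuming SNP-only VCF records with diploid GT calls. The function name and the choice to reuse the sample name as family ID are my own; the column layout follows the PLINK TPED/TFAM format as I understand it (chromosome, SNP id, genetic distance, bp position, then two alleles per sample, with 0 for missing). I believe vcftools can also write this format directly with --plink-tped, which may be simpler if it works for your files.

```python
def vcf_to_tped(vcf_lines):
    """Return (tped_lines, tfam_lines) built from VCF text lines."""
    tped, tfam = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Family ID = sample ID; parents and sex unknown (0), phenotype missing (-9).
            tfam = [f"{s} {s} 0 0 0 -9" for s in fields[9:]]
            continue
        chrom, pos, snp_id, ref, alt = fields[0], fields[1], fields[2], fields[3], fields[4]
        alleles = [ref] + alt.split(",")
        calls = []
        for sample in fields[9:]:
            gt = sample.split(":")[0].replace("|", "/").split("/")
            if len(gt) != 2 or "." in gt:
                calls += ["0", "0"]  # missing genotype
            else:
                calls += [alleles[int(a)] for a in gt]
        # Genetic distance is unknown here, so it is written as 0.
        tped.append(" ".join([chrom, snp_id, "0", pos] + calls))
    return tped, tfam

# Toy input (hypothetical SNP id and samples).
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0/1\t./.",
]
tped_lines, tfam_lines = vcf_to_tped(example)
print("\n".join(tped_lines))
print("\n".join(tfam_lines))
```

Writing the two lists to `prefix.tped` and `prefix.tfam` should then let PLINK read them with --tfile.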

            I vaguely recall diagrams with the most significant SNP identified and r^2 values for surrounding SNPs, but unfortunately can't remember how it was done.

            • #7
              Do you need to account for phase in your r^2 calculation? If not, you can just use PLINK 1.9's VCF import function:

              plink --vcf [vcf filename] --out [new fileset prefix]

              Then you can use --r2 as usual.

              r^2 with the entire rest of the genome:
              plink --bfile [plink fileset prefix] --r2 --inter-chr --ld-snp [snp id]

              r^2 with the rest of the chromosome:
              plink --bfile [plink fileset prefix] --r2 --inter-chr --ld-snp [snp id] --chr [chromosome number]

              r^2 with a limited window:
              plink --bfile [plink fileset prefix] --r2 --ld-snp [snp id] --ld-window [max snps + 1] --ld-window-kb [max kbs]
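If you want to run this on demand for many lead SNPs, the commands above can be wrapped in a small script. The sketch below only builds the argument lists; the file names, the SNP id, and the function names are placeholders I made up. The one flag not taken from the commands above is `--ld-window-r2 0`: as far as I know, PLINK by default only reports pairs with r^2 of at least 0.2, which is probably not what you want for a local Manhattan plot, so treat that as an assumption to check against the PLINK documentation.

```python
import shlex

def import_cmd(vcf_path, prefix):
    """PLINK 1.9 VCF import; writes a binary fileset under the given prefix."""
    return ["plink", "--vcf", vcf_path, "--out", prefix]

def window_r2_cmd(prefix, snp_id, max_snps, max_kb, out_prefix):
    """r^2 between one SNP and all SNPs within a limited window."""
    return [
        "plink", "--bfile", prefix,
        "--r2", "--ld-snp", snp_id,
        "--ld-window", str(max_snps + 1),
        "--ld-window-kb", str(max_kb),
        "--ld-window-r2", "0",  # assumption: report all pairs, not only r^2 >= 0.2
        "--out", out_prefix,
    ]

# Placeholder names, for illustration only.
cmds = [
    import_cmd("study.vcf", "study"),
    window_r2_cmd("study", "rs123", 1000, 500, "rs123_ld"),
]
for cmd in cmds:
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` rather than a shell string avoids quoting problems if a file name or SNP id contains unusual characters.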

              • #8
                Oh wow, PLINK continues on at BGI. I'll have to spread the word to my colleagues (who use PLINK a lot)....

                • #9
                  Did anyone find a solution for this using vcftools? I ask because vcftools offers the --hap-r2 option, which takes phase into account when calculating r^2, as well as --geno-r2, so vcftools would be the more flexible choice.
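For reference, a phase-aware r^2 for one SNP pair can be sketched in a few lines of Python. This assumes phased, biallelic, diploid GT strings (`0|1` style) and uses the textbook definition r^2 = D^2 / (pA(1-pA) pB(1-pB)) with D = pAB - pA*pB computed over haplotypes. It illustrates the statistic only and is not taken from the vcftools source; the function name is hypothetical.

```python
import math

def hap_r2(gts_a, gts_b):
    """Haplotype-based r^2 between two SNPs from per-sample phased GT strings."""
    n = n_a = n_b = n_ab = 0
    for ga, gb in zip(gts_a, gts_b):
        # Skip samples that are unphased or missing at either site.
        if "|" not in ga or "|" not in gb or "." in ga or "." in gb:
            continue
        for a, b in zip(ga.split("|"), gb.split("|")):
            n += 1
            n_a += a == "1"
            n_b += b == "1"
            n_ab += a == "1" and b == "1"
    if n == 0:
        return math.nan
    p_a, p_b, p_ab = n_a / n, n_b / n, n_ab / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return math.nan  # at least one site is monomorphic
    d = p_ab - p_a * p_b
    return d * d / denom

# Two double heterozygotes with opposite phase: no haplotype-level association.
print(hap_r2(["0|1", "0|1"], ["0|1", "1|0"]))
```

In that example every sample is heterozygous at both sites, so a genotype-count correlation is undefined there, while the phased calculation still gives an answer. Looping this over all SNPs against one focal SNP would give the one-against-all table.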
