  • yuanzhi
    Member
    • Aug 2010
    • 19

    1000 genomes VCF format?

    I am trying to figure out the SNP genotype from a 1000 Genomes VCF file:

    Code:
    #CHROM POS     ID        REF ALT QUAL FILTER INFO                              FORMAT      NA00001        NA00002        NA00003
    20     14370   rs6054257 G   A   29   PASS   NS=3;DP=14;AF=0.5;DB;H2           GT:GQ:DP:HQ 0|0:48:1:51,51 1|0:48:8:51,51 1/1:43:5:.,.
    20     17330   .         T   A   3    q10    NS=3;DP=11;AF=0.017               GT:GQ:DP:HQ 0|0:49:3:58,50 0|1:3:5:65,3   0/0:41:3
    20     1110696 rs6040355 A   G,T 67   PASS   NS=2;DP=10;AF=0.333,0.667;AA=T;DB GT:GQ:DP:HQ 1|2:21:6:23,27 2|1:2:0:18,2   2/2:35:4
    Is the SNP genotype just the "ALT" column?

    Thanks
  • quinlana
    Senior Member
    • Sep 2008
    • 119

    #2
    There should be two files: one with the sites of polymorphism (the one you show here) and another with the genotypes at said sites. Look for *.genotypes.* or something like that.

    Comment

    • yuanzhi
      Member
      • Aug 2010
      • 19

      #3
      Hi, Quinlana

      Thanks for your answer! I am trying to figure out if I used the wrong term. When I said "SNP genotype", I meant "SNP call". For example, if the reference is A/A and the SNP call is T/T, "T/T" is the "SNP genotype" that I am looking for.

      Your "genotype" is the genotyping SNP call (not the sequencing SNP call), right?

      I apologize for my lack of knowledge of these terms. I have been looking for a book or similar resource that gives the correct definitions of SNP call, SNP genotype, base call, polymorphism, etc.

      Thanks again

      Comment

      • Jose Blanca
        Member
        • Aug 2009
        • 70

        #4
        You have plenty of information about the format at the 1000genomes vcf page.

        Comment

        • laura
          Senior Member
          • Sep 2008
          • 151

          #5
          The ALT column defines the possible alternative alleles; columns 10 through n each give one individual's genotype.
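
To make that concrete, here is a minimal Python sketch that turns the GT indices in the sample columns into actual bases, using the third record from the first post. The function name `decode_genotypes` is made up for illustration, and splitting on whitespace is a simplifying assumption (real VCF files are tab-delimited).

```python
def decode_genotypes(header_line, record_line):
    """Map each sample name to its called alleles for one VCF record."""
    samples = header_line.split()[9:]        # columns 10..n hold the samples
    fields = record_line.split()             # assumption: no spaces inside fields
    ref, alts = fields[3], fields[4].split(",")
    alleles = [ref] + alts                   # index 0 = REF, 1.. = ALT alleles
    gt_index = fields[8].split(":").index("GT")
    calls = {}
    for name, sample in zip(samples, fields[9:]):
        gt = sample.split(":")[gt_index]
        sep = "|" if "|" in gt else "/"      # "|" = phased, "/" = unphased
        calls[name] = sep.join(
            "." if i == "." else alleles[int(i)] for i in gt.split(sep)
        )
    return calls


header = "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001 NA00002 NA00003"
record = ("20 1110696 rs6040355 A G,T 67 PASS "
          "NS=2;DP=10;AF=0.333,0.667;AA=T;DB GT:GQ:DP:HQ "
          "1|2:21:6:23,27 2|1:2:0:18,2 2/2:35:4")

print(decode_genotypes(header, record))
# {'NA00001': 'G|T', 'NA00002': 'T|G', 'NA00003': 'T/T'}
```

So ALT alone only lists the candidate alleles (G and T here); the genotype of NA00003, for example, is T/T because its GT field is 2/2.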

          Comment

          • laura
            Senior Member
            • Sep 2008
            • 151

            #6
            As an update, the 1000genomes website has recently changed its backend and therefore its URL structure.

            The spec can now be found at:

            http://www.1000genomes.org/wiki/Analysis/Variant Call Format/vcf-variant-call-format-version-40

            Comment

            • johnadam33
              Member
              • Oct 2010
              • 26

              #7
              I am new to this field and am trying to figure out how to use this VCF data myself. I want to open these files so that I can look for SNPs at a given location. Does anyone know how to access the wiki page? Do we need to log in, and if so, how do we register?

              Comment

              • laura
                Senior Member
                • Sep 2008
                • 151

                #8


                These are public pages within the wiki.

                You shouldn't need to log in to see them.

                Most of the 1000 Genomes wiki is an internal project-tracking wiki, so logins are not provided to people outside the project.

                Comment

                • johnadam33
                  Member
                  • Oct 2010
                  • 26

                  #9
                  Thanks a lot Laura. That helps. I guess I have to do some ground work in order to access them.

                  Comment

                  • johnadam33
                    Member
                    • Oct 2010
                    • 26

                    #10
                    Very urgent and important

                    Can anyone tell me why the human reference sequence in the 1000 Genomes browser differs from the sequence in the NCBI, Ensembl, and UCSC browsers? I am looking at variant call data, and the same chromosome position has a different base in the 1000 Genomes browser than in the other three (which all agree).
                    Thanks,

                    Comment

                    • laura
                      Senior Member
                      • Sep 2008
                      • 151

                      #11
                      The pilot analysis for 1000 Genomes was done using the NCBI36 assembly, but the other browsers now all use the GRCh37 assembly, which leads to different coordinates.

                      The 1000 Genomes main project uses GRCh37, and SNPs for it are available from the FTP site, but the 1000 Genomes browser has yet to be updated.

                      Comment

                      • johnadam33
                        Member
                        • Oct 2010
                        • 26

                        #12
                        Thanks for the reply, Laura.
                        So how much difference is there? If I want to compare a location, say chr1:10041132 on GRCh37, with the same location on the NCBI36 build, what should I do?

                        Comment

                        • laura
                          Senior Member
                          • Sep 2008
                          • 151

                          #13
                          For variants, it is safest to use rs numbers, which dbSNP tracks from one assembly to another.

                          To map specific positions, though, Ensembl provides a tool.
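
The rs-number approach can be sketched in a few lines of Python: look records up by the ID column rather than by coordinate, so the lookup does not depend on the assembly. The function name `find_by_rsid` is hypothetical, and the two sample records are the ones from the first post, not real GRCh37 or NCBI36 positions.

```python
def find_by_rsid(vcf_lines, rsids):
    """Return {rsid: (chrom, pos)} for the requested IDs found in a VCF."""
    wanted = set(rsids)
    hits = {}
    for line in vcf_lines:
        if line.startswith("#"):             # skip meta and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        for rsid in fields[2].split(";"):    # the ID column may list several IDs
            if rsid in wanted:
                hits[rsid] = (fields[0], int(fields[1]))
    return hits


# Illustrative tab-delimited records (sites only, no sample columns).
vcf_text = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2\n",
    "20\t17330\t.\tT\tA\t3\tq10\tNS=3;DP=11;AF=0.017\n",
]

print(find_by_rsid(vcf_text, ["rs6054257"]))
# {'rs6054257': ('20', 14370)}
```

Running the same lookup against a VCF from each assembly would give the two coordinates for the same variant. Sites with "." in the ID column have no rs number and cannot be matched this way.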

                          Comment

                          • vyellapa
                            Member
                            • Oct 2011
                            • 59

                            #14
                            I'm looking for a link to the genotypes VCF of the latest 1000 Genomes release and the corresponding reference. I can see one at the link below, but it seems to be the older release (629 individuals, VCF 4.0). Where can I find the same file for the newer release?

                            ftp://ftp.1000genomes.ebi.ac.uk/vol1...notypes.vcf.gz

                            Thank you,
                            Teja

                            Comment

                            • laura
                              Senior Member
                              • Sep 2008
                              • 151

                              #15
                              The final call set for Phase 1 (1092 individuals) you can find here:

                              ftp://ftp.1000genomes.ebi.ac.uk/vol1...ted_call_sets/

                              Please have a look at the README:

                              ftp://ftp.1000genomes.ebi.ac.uk/vol1...l_set_20120621

                              All calls are relative to the GRCh37 / hg19 genome assembly.
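
Given the earlier confusion between assemblies, it can help to check what a VCF file says about itself before comparing positions. Below is a minimal Python sketch that collects the `##key=value` meta lines at the top of a file. The helper name `read_meta` and the sample header values are assumptions for illustration; a `##reference` line is optional, so a real file may not carry one.

```python
def read_meta(vcf_lines):
    """Collect simple ##key=value meta lines from the top of a VCF."""
    meta = {}
    for line in vcf_lines:
        if not line.startswith("##"):
            break                            # meta lines always come first
        key, _, value = line[2:].rstrip("\n").partition("=")
        meta.setdefault(key, value)          # keep the first of repeated keys
    return meta


# Hypothetical header, for illustration only.
sample = [
    "##fileformat=VCFv4.1\n",
    "##reference=GRCh37\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
]

meta = read_meta(sample)
print(meta["fileformat"], meta.get("reference"))
# VCFv4.1 GRCh37
```

For a compressed download, the same function should work on a file opened with `gzip.open(path, "rt")` from the standard library, since it only iterates over text lines.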

                              Comment
