  • NCBI vs UCSC

    Could someone point me in the right direction to find out whether hg19 (UCSC) is exactly the same as GRCh37 (NCBI)? I thought they were the same, but I just found that dbSNP Build 132 is for GRCh37 and dbSNP Build 131 is for hg19.
    I used coordinates from hg19 to design probes to capture my genes, but as a refseq I was going to use the already annotated sequence from NCBI. My worry is that there may be some discrepancies between the two.

  • #2
    hg19 is the same as GRCh37. However, GRCh37 is getting assembly patches from the GRC, so while the main chromosomes may not change, some of the alternative haplotypes might not always be identical.

    Annotations laid on top of the assembly may not always be identical, depending on the method used to place the annotation on the assembly.

    • #3
      I'm even more confused now.
      Does that mean that the chromosome coordinates are not stable between the two (hg19 and GRCh37)?
      As for the annotations, I only need to see the genes/exons and, if possible, SNPs.
      'Annotations laid on top of the assembly may not always be identical': so the genes can be, let's say, shifted?

      • #4
        The chromosomal coordinates should be exactly the same; hg19 is just UCSC's name for GRCh37.

        It means that two different mapping programs may not give the same position for the same piece of DNA. That being said, if they are getting their annotation from a central source, e.g. dbSNP or CCDS, both sites should show coordinates that are the same as the central source.

        • #5
          Thank you very much for the answer!
          So could I use the refseq from NCBI and annotate it with SNPs (Build 131) from UCSC, given that the coordinates are the same anyway?

          • #6
            You should be able to do that, but it might not be the best way.

            What are you actually trying to do?

            If you are looking for which cDNAs overlap your SNPs, you might be better off looking at a tool like the Ensembl Variant Effect Predictor: http://www.ensembl.org/tools.html

            If you are looking for which SNPs overlap your cDNAs of interest, you are probably better off using http://www.ensembl.org/biomart/martview/ or the UCSC Table Browser: http://genome.ucsc.edu/cgi-bin/hgTab...a_doMainPage=1

            • #7
              I'm looking for nucleotide changes in my samples, in the genes I'm interested in, and I would like to compare the results with SNPs that are already in databases.
              The coordinates used to design the probes were taken from hg19, but because the sequences from NCBI are already annotated I thought I could use them.
              I would like to download a file with SNPs not only for the coding part but for the introns as well, but it doesn't seem to be straightforward.

              • #8
                You might be better off looking at NCBI's dbSNP VCF dumps to find all the SNPs of interest in a particular region, then using something like the Ensembl Variant Effect Predictor to annotate their consequences:

                ftp://ftp.ncbi.nih.gov/snp/organisms...9606/VCF/v4.0/
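
Pulling the SNPs for one region out of a VCF dump can be done with a few lines of Python. This is only a minimal sketch: the function name `snps_in_region` and the example records are made up for illustration, and it assumes a plain tab-delimited VCF whose chromosome names match the ones you query with.

```python
def snps_in_region(vcf_lines, chrom, start, end):
    """Yield the first 8 columns of VCF records that fall inside a region.

    Coordinates are 1-based and inclusive, as in the VCF POS column.
    """
    for line in vcf_lines:
        if line.startswith("#"):      # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:           # skip blank or malformed lines
            continue
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            yield fields[:8]


# Made-up records for illustration only (not real dbSNP entries).
example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\trsEXAMPLE1\tA\tG\t.\t.\t.",
    "1\t5000\trsEXAMPLE2\tC\tT\t.\t.\t.",
    "2\t1000\trsEXAMPLE3\tG\tA\t.\t.\t.",
]
hits = list(snps_in_region(example, "1", 900, 2000))
```

For a gzipped dump, a file object opened with `gzip.open(path, "rt")` can be passed in place of the list. This scans the whole file, which is slow but simple; an indexed lookup would be faster for repeated queries.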

                • #9
                  I downloaded the VCF file and it looks like there are more SNPs than I got from UCSC, which is great.
                  I'm trying to compare the information for one of the genes to check how big the differences between the databases are. I'm quite confused, as according to the 1000 Genomes Project data there are almost 200 more SNPs for that gene than in the VCF file. But that data is from the previous genome build, so I'm not sure how (or if) I could use it.

                  • #10
                    Which 1000 Genomes VCF files are you looking at?

                    The main-project 1000 Genomes variants have not yet been submitted to dbSNP, so not all of those 20100804 SNPs will be in dbSNP.
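
One way to see how much two SNP sources overlap is to compare rs IDs rather than coordinates, since IDs do not depend on the genome build. A rough sketch follows; the helper names `rs_ids` and `compare_sources` are hypothetical, and variants without an rs ID (ID column `.`) are simply ignored, so anything not yet submitted to dbSNP will not be counted.

```python
def rs_ids(vcf_lines):
    """Collect the rs IDs found in the ID column (column 3) of a VCF."""
    ids = set()
    for line in vcf_lines:
        if line.startswith("#"):          # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:               # skip blank or malformed lines
            continue
        # The ID column may hold several identifiers separated by semicolons.
        for identifier in fields[2].split(";"):
            if identifier.startswith("rs"):
                ids.add(identifier)
    return ids


def compare_sources(a_lines, b_lines):
    """Return the rs IDs shared by two VCFs and those unique to each."""
    a, b = rs_ids(a_lines), rs_ids(b_lines)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Made-up records for illustration only (not real dbSNP entries).
source_a = ["1\t1000\trsEXAMPLE1\tA\tG", "1\t2000\trsEXAMPLE2\tC\tT"]
source_b = ["1\t1000\trsEXAMPLE1\tA\tG", "1\t3000\t.\tG\tA"]
summary = compare_sources(source_a, source_b)
```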

                    • #11
                      You're right, I was looking at the wrong thing.
                      So I guess it will be OK if I annotate the refseq from NCBI with the SNPs from ftp://ftp.ncbi.nih.gov/snp/organisms...9606/VCF/v4.0/ ?
                      Thank you very much for your help!

                      • #12
                        That should be fine.

                        I do recommend looking at the Ensembl Variant Effect Predictor. It links effects to Ensembl IDs, which can very easily be linked to RefSeq IDs when desired using BioMart or the Ensembl API.

                        • #13
                          I'll try.
                          I was wondering if you know how I could convert the VCF file to GFF/GTF format? I need to have the SNPs in this format to be able to annotate them on the refseq.

                          • #14
                            I am sure converters exist, but I don't know of any myself. I would suggest putting "vcf to gff" into Google and seeing what comes up. You should only need the first 8 columns, so it should be a fairly easy Perl/Python/awk script to write.
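
As a rough idea of what such a script could look like in Python, here is a minimal sketch that uses only the first 8 VCF columns. Several choices are assumptions rather than anything standard for this thread: the function name `vcf_to_gff3`, the `dbSNP` source label, the `SNV` / `sequence_alteration` feature types, the `+` strand, and the GVF-style `Reference_seq` / `Variant_seq` attributes. Adjust them to whatever your annotation tool expects.

```python
def vcf_to_gff3(vcf_lines, source="dbSNP"):
    """Yield GFF3 lines built from the first 8 columns of VCF records."""
    yield "##gff-version 3"
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):   # skip blanks, meta lines, header
            continue
        chrom, pos, vid, ref, alt, qual, _filter, _info = line.split("\t")[:8]
        start = int(pos)                        # VCF and GFF3 are both 1-based
        end = start + len(ref) - 1              # span covered by the REF allele
        # Treat the record as an SNV only if REF and every ALT are single bases.
        bases = {"A", "C", "G", "T"}
        is_snv = ref.upper() in bases and all(
            a.upper() in bases for a in alt.split(",")
        )
        feature = "SNV" if is_snv else "sequence_alteration"
        attributes = []
        if vid != ".":
            attributes.append("ID=" + vid)
        attributes.append("Reference_seq=" + ref)
        attributes.append("Variant_seq=" + alt)
        yield "\t".join(
            [chrom, source, feature, str(start), str(end), qual, "+", ".",
             ";".join(attributes)]
        )


# Made-up records for illustration only (not real dbSNP entries).
example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\trsEXAMPLE1\tA\tG\t.\t.\t.",
    "1\t2000\trsEXAMPLE2\tCAT\tC\t.\t.\t.",
]
gff_lines = list(vcf_to_gff3(example))
```

The FILTER and INFO columns are read but not carried over here; if you need allele frequencies or validation flags from INFO, they would have to be added to the attributes column explicitly.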

                            • #15
                              There are a few minor differences between GRCh37 and hg19.

                              The random contig sequences are the same, but the names are different.
                              Depending on the source of the sequence or annotation, "1" may need to be converted to "chr1", and the PAR on chrY may or may not be masked. In addition, UCSC hg19 is currently using the old mitochondrial sequence, but NCBI and Ensembl have transitioned to NC_012920, the rCRS.

                              > http://genome.ucsc.edu/cgi-bin/hgGat...=Human&db=hg19
                              >
                              > Note on chrM
                              > Since the release of the UCSC hg19 assembly, the Homo sapiens mitochondrion sequence (represented as "chrM" in the Genome Browser) has been replaced in GenBank with the record NC_012920. We have not replaced the original sequence, NC_001807, in the hg19 Genome Browser. We plan to use the Revised Cambridge Reference Sequence (rCRS) in the next human assembly release.
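
The naming differences above can be handled with a small helper when moving annotation between the two conventions. This is a sketch under assumptions: the function name `to_ucsc_name` is made up, `MT` is assumed to be the mitochondrial name on the NCBI/Ensembl side, and random or unplaced contigs are deliberately not handled because their names need an explicit lookup table.

```python
# Primary chromosome names without the "chr" prefix, as used by NCBI/Ensembl.
PRIMARY = {str(n) for n in range(1, 23)} | {"X", "Y"}


def to_ucsc_name(name):
    """Convert a GRCh37-style sequence name to the UCSC hg19 style."""
    if name in PRIMARY:
        return "chr" + name
    if name in ("MT", "chrM"):
        # The sequences themselves differ, so renaming alone would be wrong.
        raise ValueError(
            "mitochondrial sequences differ (hg19 chrM is NC_001807, "
            "NCBI/Ensembl use NC_012920); coordinates are not interchangeable"
        )
    raise ValueError(
        "no simple rule for %r; use an explicit lookup table" % (name,)
    )


converted = [to_ucsc_name(n) for n in ["1", "22", "X"]]
```

Raising an error for the mitochondrion is a design choice here: silently renaming `MT` to `chrM` would make positions look comparable when the underlying sequences are not the same.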
