  • dbSNP 138 same position different reference

    Hello,

    [farukge@bio-hpc05 data]$ grep "1 15926506" dbsnp_138.b37.vcf
    1 15926506 rs373703898 ATC A . . RS=373703898;RSPOS=15926507;dbSNPBuildID=138;SSR=0;SAO=0;VP=0x050000000001000002000200;WGT=1;VC=DIV;OTHERKG
    1 15926506 rs374926831 ATT A,AT . . RS=374926831;RSPOS=15926508;dbSNPBuildID=138;SSR=0;SAO=0;VP=0x050000000001000002000210;WGT=1;VC=DIV;OTHERKG;NOC
    1 159265062 rs2427824 T C . . RS=2427824;RSPOS=159265062;dbSNPBuildID=100;SSR=0;SAO=0;VP=0x05012808000115051f000100;WGT=1;VC=SNV;PM;PMC;SLO;INT;VLD;G5;HD;GNO;KGPhase1;KGPilot123;KGPROD;OTHERKG;PH3;CAF=[0.1915,0.8085];COMMON=1

    When I try to merge two VCF files, I get an error because the REF allele differs at the same position. How can I fix this? I assume there are many such cases, so fixing a few by hand won't be a solution.
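A quick way to gauge how many sites are affected is to scan the file for positions where two records give REF alleles that cannot both be right. The sketch below is a minimal illustration, not a replacement for a VCF library: the function names are hypothetical, it reads only the first four whitespace-separated columns, and it assumes that two REF strings at the same POS conflict when neither is a prefix of the other. Positions are compared as integers, so 159265062 is not mistaken for 15926506 the way the `grep` prefix match in the post picks it up.

```python
from collections import defaultdict
from itertools import combinations

# First five columns (CHROM, POS, ID, REF, ALT) of the records in the post.
RECORDS = """\
1 15926506 rs373703898 ATC A
1 15926506 rs374926831 ATT A,AT
1 159265062 rs2427824 T C
"""


def refs_disagree(ref_a, ref_b):
    """Two REFs anchored at the same POS agree only if one is a prefix of the other."""
    shorter, longer = sorted((ref_a, ref_b), key=len)
    return not longer.startswith(shorter)


def conflicting_sites(lines):
    """Return {(chrom, pos): [((id, ref), (id, ref)), ...]} for disagreeing record pairs."""
    by_site = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        chrom, pos, rsid, ref = line.split()[:4]
        by_site[(chrom, int(pos))].append((rsid, ref))

    conflicts = {}
    for site, records in by_site.items():
        pairs = [(a, b) for a, b in combinations(records, 2)
                 if refs_disagree(a[1], b[1])]
        if pairs:
            conflicts[site] = pairs
    return conflicts


conflicts = conflicting_sites(RECORDS.splitlines())
for (chrom, pos), pairs in sorted(conflicts.items()):
    print(chrom, pos, pairs)
```

On a full dbSNP file the same function could be fed the open file handle line by line; the per-site lists stay small, but the dictionary of all sites may need a lot of memory.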

  • #2
    You should use LiftOver to convert positions from one assembly to another.


    • #3
      What do you mean by "one assembly to another" when 1 15926506 is listed with both ATC and ATT as REF?


      • #4
        I think Tibor is talking about the dbSNP assembly. dbSNP is regularly updated, so there are different versions or "builds/assemblies". For example, you are using the dbSNP 138 build. The difference in rs number may be because one of the VCFs is referencing an older build such as dbSNP 135, 137, etc. Like Tibor said, software like LiftOver can fix this for you. Good luck!


        • #5
          Both are build 138, and they give the REF as ATC and ATT under different rs IDs, for different indels and SNPs. Isn't LiftOver for fixing compatibility between builds, e.g. 135 to 138?

          I assume my problem is that dbSNP uses different REF alleles at the same position.

          So are you saying that if I use two VCF files, one being dbSNP 138 and one produced with GATK using dbSNP 138, I will have no problems? Or do I just need LiftOver? I realize I may not be describing my real problem clearly.


          • #6
            Sorry Omer,

            I think we just didn't pay enough attention to the example you posted. I looked up both rs numbers in dbSNP and, like you said, they are both from dbSNP 138 (I just wanted to confirm it for curiosity's sake). Also as you said, the difference in rs number is basically just due to the different nucleotide change:

            rs373703898 [Homo sapiens]
            CATTGCTCTGGTCCTGCCTAACAAA[-/TC]TTTTTTTTTTTTTTTTTTTTTTGAG
            Chromosome:
            1:15926507

            rs374926831 [Homo sapiens]
            ATTGCTCTGGTCCTGCCTAACAAAT[-/T]TTTTTTTTTTTTTTTTTTTTTGAGA
            Chromosome:
            1:15926508

            I am sorry our suggestion was not at all helpful in this case. The only suggestion I have is to try different software to merge the VCFs. I know vcftools is very popular, but I have not always been successful at merging VCFs correctly with it. I recently used GATK CombineVariants and found that it works very well. If you already use GATK and have it installed, it would be very easy for you to try out.


            • #7
              I used CombineVariants and I get that error. vcf-merge from vcftools didn't help either.


              • #8
                I'm sorry. I hope someone else has an answer for you.


                • #9
                  Well, I think one of the incompatible records won't match the reference. I think you need to filter out those records.
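The filtering suggested here could be sketched as follows: compare each record's REF string with the reference bases at its POS and set aside the records that do not match. Everything in this example is hypothetical, including the function names, the chromosome name `toy`, and the short made-up sequence, which does not represent b37 or say which of the real records is correct. In practice the bases would come from the same reference FASTA the VCF was built against.

```python
def ref_matches(reference, pos, ref_allele):
    """True if ref_allele equals the reference bases starting at 1-based pos."""
    start = pos - 1  # VCF positions are 1-based; Python slices are 0-based
    window = reference[start:start + len(ref_allele)]
    return window.upper() == ref_allele.upper()


def split_by_reference(records, reference_by_chrom):
    """Split (chrom, pos, id, ref) tuples into (kept, dropped) lists."""
    kept, dropped = [], []
    for record in records:
        chrom, pos, _rsid, ref = record
        sequence = reference_by_chrom.get(chrom)
        if sequence is not None and ref_matches(sequence, pos, ref):
            kept.append(record)
        else:
            dropped.append(record)  # unknown chromosome or REF mismatch
    return kept, dropped


# Made-up reference and records: positions 5-7 of "toy" read ATC.
toy_reference = {"toy": "GGGGATCTTTT"}
toy_records = [
    ("toy", 5, "id1", "ATC"),
    ("toy", 5, "id2", "ATT"),
]

kept, dropped = split_by_reference(toy_records, toy_reference)
print("kept:", kept)
print("dropped:", dropped)
```

Writing the dropped records to a separate file rather than discarding them makes it easier to review what was removed before merging.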


                  • #10
                    Originally posted by dGho

                    rs373703898 [Homo sapiens]
                    CATTGCTCTGGTCCTGCCTAACAAA[-/TC]TTTTTTTTTTTTTTTTTTTTTTGAG
                    Chromosome:
                    1:15926507

                    rs374926831 [Homo sapiens]
                    ATTGCTCTGGTCCTGCCTAACAAAT[-/T]TTTTTTTTTTTTTTTTTTTTTGAGA
                    Chromosome:
                    1:15926508
                    Well, the weird thing is that these should not be at the same position. One is at 1:15926507 and the other at 1:15926508, so neither is actually at 1:15926506, which is what you have in your VCF. Am I wrong? Maybe they should not be at the same position... weird. Did you figure this out yet?
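On the position question: under the VCF convention, an indel record includes the base before the event and POS refers to that preceding base, so POS is expected to sit before the first altered base. The sketch below counts the bases shared by REF and ALT to find where they first diverge. The helper name is hypothetical, and prefix trimming is only an assumption about how RSPOS relates to POS, not dbSNP's documented procedure. It does reproduce the RSPOS values in the records from the first post, for rs374926831 only with the `AT` alternate; the `A` alternate gives 15926507.

```python
def first_changed_position(pos, ref, alt):
    """1-based coordinate of the first base where REF and ALT diverge."""
    shared = 0
    for ref_base, alt_base in zip(ref, alt):
        if ref_base != alt_base:
            break
        shared += 1  # this leading base is common to REF and ALT
    return pos + shared


# (POS, REF, ALT) taken from the records in the first post.
examples = {
    "rs373703898 ATC>A": (15926506, "ATC", "A"),
    "rs374926831 ATT>AT": (15926506, "ATT", "AT"),
    "rs374926831 ATT>A": (15926506, "ATT", "A"),
    "rs2427824 T>C": (159265062, "T", "C"),
}

results = {name: first_changed_position(*args) for name, args in examples.items()}
for name, position in results.items():
    print(name, position)
```

Seen this way, two indels with different dbSNP coordinates can still be written at the same VCF POS, because both records are anchored on the same preceding base.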
