
  • Alignment on hg19 or hg38 for exome-seq data

    Hi,

    I've just received the sequencing results from 6 exome-seq experiments.
    The samples come from human patients.

    Would you recommend aligning the samples on hg19 or hg38?
    Have you all switched to hg38, or do you still use hg19?
    Is hg19 still better annotated, and does it still have more related datasets facilitating downstream analyses?

    I will launch the alignment based on your recommendations.

    Sorry if the answer is obvious.
    I haven't done any exome-seq analyses since the release of hg38.

    Thanks

  • #2
    I'm still sticking with hg19, mainly because I annotate variants with population allele frequencies, and the 1000 Genomes/ExAC datasets were still on hg19 the last time I checked. However, once UCSC releases the annotation, you can safely start using the new genome build.



    • #3
      That's a good point.

      The 1000 Genomes Project actually has a post on its Twitter feed stating that it is recruiting a developer to move the data to GRCh38.


      On the other hand, the updates to GRCh38 are based partly on the 1000 Genomes project, so it's ironic not to move on because the 1000 Genomes Project hasn't moved on yet.
      "Sequence updates - Several erroneous bases and misassembled regions in GRCh37 have been corrected in the GRCh38 assembly, and more than 100 gaps have been filled or reduced. Much of the data used to improve the reference sequence was obtained from other genome sequencing and analysis projects, such as the 1000 Genomes Project."


      I understand from my discussions with other analysts that hg19 is still used by the overwhelming majority of bioinformaticians, even though the official announcements about hg38 tout all its improvements.

      If anyone is using hg38 for exome-seq data, I'd be interested to hear about your experience.
      For example, is it simple to use liftover to compare against earlier datasets aligned to hg19, e.g. the 1000 Genomes Project?



      • #4
        Most of the improvements are in the non-coding regions, so if you are doing exomes, you won't notice a difference.

        My concern would be future-proofing. If these are your first exomes, you may as well use hg38. If you need to use some hg19-based datasets, you can always lift them over to hg38. The problem is that if you choose hg19 now and continue with it, you'll have to make the transition a couple of years down the road, and that will be a lot harder once you have a bunch of accumulated results.

        On the other hand, I still see a lot of people using mm9, even though mm10 has been out since 2011.



        • #5
          Well, I ended up trying to run the analyses on hg19 and hg38 concurrently.
          I quickly gave up on the hg38 analysis.
          As vivek_ warned me, the fact that the 1000 Genomes Project data is aligned to hg19 is a problem.
          I was unable to follow the GATK DNASeq Best Practices with hg38, given that there are no VCF files of known indels and SNPs available for hg38.

          I think the switch to hg38 will be effortless once the 1000 Genomes Project has transitioned to hg38. I'll stick to hg19 until that transition has been completed.

          I'll have to tell the researcher I gave up on doing the analysis with hg38. Hopefully id0 is right, and it will have no impact on the results.



          • #6
            How reliable would it be to lift over the 1000 Genomes SNPs to GRCh38/hg38?
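
            To make the question concrete, here is a rough sketch of what a liftover does to a SNP position, assuming the old-to-new alignment for one chromosome is given as a list of ungapped blocks (roughly what a chain file describes). The block coordinates and the names Block, BLOCKS, lift and lift_many are invented for illustration; a real liftover would use the UCSC chain file for the two builds and would also need to handle strand and reference-allele changes, which this toy ignores.

```python
from bisect import bisect_right
from typing import Dict, List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One ungapped aligned block between the old and new assembly.

    Coordinates are 0-based; old_end is exclusive.
    """
    old_start: int
    old_end: int
    new_start: int


# Hypothetical blocks for a single chromosome (made-up numbers, sorted by old_start).
BLOCKS = [
    Block(0, 1000, 0),        # unchanged region
    Block(1000, 2000, 1050),  # 50 bp were inserted in the new build before this block
    Block(2500, 3000, 2550),  # old positions 2000-2499 have no aligned counterpart
]


def lift(pos: int, blocks: List[Block] = BLOCKS) -> Optional[int]:
    """Map an old-build position to the new build, or None if it falls in a gap."""
    starts = [b.old_start for b in blocks]
    i = bisect_right(starts, pos) - 1  # last block starting at or before pos
    if i < 0:
        return None
    block = blocks[i]
    if pos >= block.old_end:
        return None  # position lies between blocks: cannot be lifted
    return block.new_start + (pos - block.old_start)


def lift_many(positions: List[int]) -> Tuple[Dict[int, int], List[int]]:
    """Lift a list of positions; return (mapped, unmapped) so failures stay visible."""
    mapped: Dict[int, int] = {}
    unmapped: List[int] = []
    for p in positions:
        q = lift(p)
        if q is None:
            unmapped.append(p)
        else:
            mapped[p] = q
    return mapped, unmapped


if __name__ == "__main__":
    mapped, unmapped = lift_many([500, 1500, 2200])
    print("lifted:", mapped)
    print("failed to lift:", unmapped)
```

            The point of keeping the unmapped list is that reliability comes down to how many SNPs end up in it, and whether the ones that do map still have the same reference allele in the new build, so both are worth checking before trusting lifted coordinates.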
