  • GATK: Calling variants on chromosome Y

    Hello everyone,

    I apologize if this is a stupid question but I'm new to using GATK.
    I've searched this forum for an answer but nothing seems to fit the bill.

    I'm trying to use GATK 4.1 to analyze WGS data I have.
    The data is 30X deep.

    At this point I'm only testing and optimizing my pipeline so I am only working with 4 samples (2 males and 2 females).

    I've followed the information available at the Best Practices Workflow entitled "Germline short variant discovery (SNPs + Indels)". (https://gatk.broadinstitute.org/hc/e...s/360035535932)

    I have produced the Analysis-Ready BAM files and got down to the VCF files made by GenotypeGVCFs.

    As a quick sanity check I went to see the variant calls on chromosome Y and I'm a little confused by what I see.

    First, for the males in my test cohort, multiple sites are listed as heterozygous. Others are reported as missing calls (i.e. `./.`).

    Second: If I go over to the female samples I see that they have variant calls for the Y chromosome as well.

    In both cases the VCF files exclude the pseudo-autosomal regions on chrY.

    Looking at the BAM files I can clearly see that the males have read counts for known Y chromosome genes and the females do not.
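
A minimal sketch of this kind of sanity check, assuming a standard tab-separated VCF with a `GT` entry in the FORMAT column. The function names `classify_gt` and `tally_genotypes` are hypothetical helpers written for illustration and are not part of GATK. The sketch only counts genotype classes per sample on one contig, so heterozygous and missing calls on chrY can be compared between males and females:

```python
from collections import Counter, defaultdict


def classify_gt(gt):
    """Classify a VCF GT string such as '0/1', '1', './.' or '0|0'."""
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return "missing"
    if len(alleles) == 1:
        return "haploid"
    if len(set(alleles)) > 1:
        return "het"
    return "hom_ref" if alleles[0] == "0" else "hom_alt"


def tally_genotypes(vcf_lines, contig="chrY"):
    """Count genotype classes per sample for records on one contig."""
    samples = []
    counts = defaultdict(Counter)
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and '##' meta-information lines.
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample names start at the tenth column of the header line.
            samples = fields[9:]
            continue
        if fields[0] != contig:
            continue
        # GT is not guaranteed to be present; skip records without it.
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_idx = fmt.index("GT")
        for sample, call in zip(samples, fields[9:]):
            gt = call.split(":")[gt_idx]
            counts[sample][classify_gt(gt)] += 1
    return {s: dict(c) for s, c in counts.items()}
```

To run it on a compressed file, open the file with `gzip.open(path, "rt")` and pass the handle to `tally_genotypes`. A male sample called as haploid on chrY would be expected to show only `haploid` and `missing` entries, with no `het` count.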

    I'm wondering what commands or steps I need to run on the data to get the correct genotyping reported in the VCF file.

    Right now my pipeline produces one genomicDB for each chromosome.
    I'm guessing that when I generate the VCF files with GenotypeGVCFs I need to adjust the command for chromosome Y (?)

    I've tried:
    Code:
    gatk GenotypeGVCFs \
    -R hg38.ref/Homo_sapiens_assembly38.fasta \
    -D hg38.ref/Homo_sapiens_assembly38.dbsnp138.vcf \
    -V gendb://genomicDB_chrY \
    -O chrY.gatk_hg38.vcf.gz \
    -ploidy 1
    But the VCF file remains the same.
    I've posted this on the GATK forum and it's gone unanswered.
    Maybe it's a stupid question and the answer should be obvious but I can't find it.

    Can anyone suggest what I should do to get the correct genotyping?

    Thanks in advance for any and all help

  • #2
    The PAR is typically masked on chrY since the PAR reference is an exact duplicate of chrX sequence.

    chrY unique regions tend to be repetitive and highly variable, with many deletions, duplications, and CNVs.

    The human Y chromosome harbors genes that are responsible for testis development and also for initiation and maintenance of spermatogenesis in adulthood. The long arm of the Y chromosome (Yq) contains many ampliconic and palindromic sequences making it predisposed to self-recombination during spermatogenesis and hence susceptible to intra-chromosomal deletions. Such deletions lead to copy number variation in genes of the Y chromosome resulting in male infertility. Three common Yq deletions that recur in infertile males are termed as AZF (Azoospermia Factor) microdeletions viz. AZFa, AZFb and AZFc. As estimated from data of nearly 40,000 Y chromosomes, the global prevalence of Yq microdeletions is 7.5% in infertile males; however the European infertile men are less susceptible to Yq microdeletions, the highest prevalence is in Americans and East Asian infertile men. In addition, partial deletions of the AZFc locus have been associated with infertility but the effect seems to be ethnicity dependent. Analysis of > 17,000 Y chromosomes from fertile and infertile men has revealed an association of gr/gr deletion with male infertility in Caucasians and Mongolian men, while the b2/b3 deletion is associated with male infertility in African and Dravidian men. Clinically, the screening for Yq microdeletions would aid the clinician in determining the cause of male infertility and decide a rational management strategy for the patient. As these deletions are transmitted to 100% of male offspring born through assisted reproduction, testing of Yq deletions will allow the couples to make an informed choice regarding the perpetuation of male infertility in future generations. With the emerging data on association of Yq deletions with testicular cancers and neuropsychiatric conditions long term follow-up data is urgently needed for infertile men harboring Yq deletions. 
    If so, this information will change the current perspective of androgenetics from infertility and might have broad implications for men's health.


    Author summary The Y chromosome is extraordinary in many respects; it is non-recombining along most of its length, it carries many testis-expressed genes that are often found in palindromes and thus in several copies, and it is generally highly repetitive with very few unique genes. Its evolutionary process is not well understood in general because short-read mapping in such complex sequence is difficult. We combine de novo assembly and mapping to investigate evolution in more than 60% of the length of 62 Y chromosomes of Danish descent. We find that Y chromosome evolution is very dynamic even among the set of closely related Y chromosomes in Denmark with many cases of complex duplications and deletions of large regions including whole genes, clear evidence of GC-biased gene conversion in the palindromes and a tendency for gene conversion to revert mutations to their ancestral state.
