SEQanswers

Old 11-15-2012, 03:44 PM   #1
albertyu
Junior Member
 
Location: Maryland

Join Date: Aug 2010
Posts: 3
Default Exome sequencing analysis problem

Hello All,

I am new to next-generation sequencing analysis. I am working on a VCF file produced by the GATK tools, and I have a question about GT (genotype) for chrY and chrM. The data look like:
GT : GQ : DP : PL 0/1:8.82:96:4088,0,9
GT : GQ : DP : PL 1/1:38.92:1:390,38,0
As I understand it, GT means genotype. Since there is only one copy of chrY or chrM, why do the data show 0/1 (heterozygous) or 1/1 (homozygous)? Or do I misunderstand the meaning of "GT"?

Thanks,
Albert

Last edited by albertyu; 11-15-2012 at 03:47 PM.
Old 11-15-2012, 07:07 PM   #2
Bukowski
Senior Member
 
Location: Aberdeen, Scotland

Join Date: Jan 2010
Posts: 388
Default

Well, for mitochondria it could be heteroplasmy:

http://en.wikipedia.org/wiki/Heteroplasmy

As for the Y - are they in pseudoautosomal regions? Or perhaps badly genotyped indels?
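
One way to check the pseudoautosomal question is to test each chrY position against the PAR intervals. A minimal sketch, assuming the commonly cited GRCh37/hg19 chrY PAR coordinates (please verify them against the reference build actually used); the name `in_par` and the example positions are made up for illustration:

```python
# Assumed GRCh37/hg19 chrY pseudoautosomal intervals (1-based, inclusive).
# Verify against the reference build actually used for alignment.
PAR_CHRY_GRCH37 = {
    "PAR1": (10001, 2649520),
    "PAR2": (59034050, 59363566),
}

def in_par(pos, par_regions=PAR_CHRY_GRCH37):
    """Return the name of the PAR containing pos, or None if outside both."""
    for name, (start, end) in par_regions.items():
        if start <= pos <= end:
            return name
    return None

# Hypothetical positions, for illustration only.
for pos in (1500000, 15000000, 59100000):
    print(pos, in_par(pos))
```

Calls that return None here lie outside both PARs, so a heterozygous genotype at those positions would need another explanation.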
Old 11-19-2012, 10:40 AM   #3
albertyu
Junior Member
 
Location: Maryland

Join Date: Aug 2010
Posts: 3
Default

Quote:
Originally Posted by Bukowski View Post
Well for mitochondria it could be :

http://en.wikipedia.org/wiki/Heteroplasmy

As for the Y - are they in pseudoautosomal regions? Or perhaps badly genotyped indels?

Thanks for your reply. I checked the possibility of pseudoautosomal genes. Most of the variants do not seem to fall within the pseudoautosomal regions PAR1 and PAR2, and many of them are substitutions, not indels. Could it be caused by mis-mapping by BWA?

Actually, I originally thought this was a common problem in exome sequencing data, since I have seen several whole-exome sequencing VCF files containing chrY calls with GT 1/0 or 1/1.
Old 11-19-2012, 11:51 PM   #4
Bukowski
Senior Member
 
Location: Aberdeen, Scotland

Join Date: Jan 2010
Posts: 388
Default

Quote:
Originally Posted by albertyu View Post
Thanks for your reply. I checked the possibility of the pseudoautosomal genes. It seems that most of variants are not at the pseudoautosomal PAR1 and PAR2 genes location. And many of them are substitutions not indels. I would think it's because mis-mapping of BWA. ???

Actually I originally thought this is a popular problem in Exome sequencing data since I saw several whole exome sequencing vcf data containing chrY with GT:1/0 or 1/1.
I just had a look at some exome data, and I think there is definitely some mismapping involved, as most females in my trios have a small number of chrY variants. As I suspected, though, in my data most of the het calls are small indels rather than SNPs.

I don't see what is wrong with a 1/1 call though. On Y that should just mean 'base is different from the reference', surely?
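
To make that reading concrete, here is a minimal sketch that unpacks the two sample columns quoted in the first post and labels each GT. It assumes the caller applied a diploid model to every contig, so on a haploid chromosome "hom-alt" simply means "differs from the reference", while "het" is the suspicious case. The helper names are made up for illustration.

```python
def parse_sample(format_str, sample_str):
    """Pair FORMAT keys with the sample's values, e.g. GT:GQ:DP:PL."""
    return dict(zip(format_str.split(":"), sample_str.split(":")))

def describe_gt(gt):
    """Classify a GT string such as 0/1 or 1|1."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "no call"
    if len(set(alleles)) > 1:
        return "het"
    return "hom-ref" if alleles[0] == "0" else "hom-alt"

# The two sample columns quoted in the first post.
for sample in ("0/1:8.82:96:4088,0,9", "1/1:38.92:1:390,38,0"):
    fields = parse_sample("GT:GQ:DP:PL", sample)
    print(fields["GT"], describe_gt(fields["GT"]), "DP =", fields["DP"])
```

Note that the 1/1 example has DP = 1, so that call rests on a single read.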
Old 11-20-2012, 09:18 AM   #5
albertyu
Junior Member
 
Location: Maryland

Join Date: Aug 2010
Posts: 3
Default

Quote:
Originally Posted by Bukowski View Post
I just had a look at some exome data, and I think there is definitely some mismapping involved as most females in my trios a small number of chrY variations. As I suspected though in my data, most of the het calls are small indels rather than SNPs.

I don't see what is wrong with a 1/1 call though, in Y that should just mean 'base is different to reference' surely?
Indeed, all my female samples have chrY variants.
GATK reported substitutions and indels separately in my data, so I originally thought these calls were all indels, but I eventually realized there are a lot of SNPs in the middle of the VCF files.
If they are mostly due to mismapping, I am reluctant to use these data, so I will filter out the chrY calls.
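
Dropping the chrY records can be done with a plain text filter over the VCF lines. A minimal sketch, assuming the contig is named either `chrY` or `Y`; the function name `drop_contigs` and the tiny example records are made up for illustration:

```python
def drop_contigs(vcf_lines, contigs=("chrY", "Y")):
    """Yield header lines unchanged and data lines not on the given contigs."""
    for line in vcf_lines:
        if line.startswith("#") or line.split("\t", 1)[0] not in contigs:
            yield line

# Tiny made-up VCF body, for illustration only.
example = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\n",
    "chrY\t200\t.\tC\tT\t50\tPASS\t.\tGT\t1/1\n",
]
kept = list(drop_contigs(example))
print("".join(kept), end="")
```

The same generator works on an open file handle, since it only iterates over lines.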
Old 11-21-2012, 04:01 AM   #6
nexgengirl
Member
 
Location: Maryland

Join Date: Apr 2010
Posts: 31
Default

Check out section 6 on this page:

http://gatkforums.broadinstitute.org...fied-genotyper

It explains how the genotypes on the sex chromosomes are called in GATK.
Old 11-23-2012, 06:19 AM   #7
AJERYC
Member
 
Location: Spain

Join Date: Jan 2012
Posts: 26
Default

Quote:
Originally Posted by albertyu View Post
Indeed, all my female samples have chrY variations.
From my data performed by GATK, it separated substitutions and indels. So I originally thought those are all indels and finally I realized there are a lot of SNPs in the middle of vcf files.
If they are mostly mismapping, I am afraid to use these data. So I would filter out those data at chrY.

In any exome sequencing experiment you will get some "false SNPs" from various sources: sequencing errors, reads aligning to pseudogenes, ... That is why in exome sequencing you can see chromosome Y calls in women or heterozygous SNPs in men. You can eliminate these "errors" by filtering more strictly during the alignment process, but the price you pay for tougher filtering is that you can lose true information too.
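
The trade-off can be illustrated with a small sketch. Note that it filters at the genotype level (on DP and GQ) rather than during alignment, simply because that is an easier place to show the same effect. The thresholds are arbitrary and the function name `passes` is made up; the two calls are the ones quoted in the first post.

```python
def passes(fields, min_dp=10, min_gq=20.0):
    """Keep a call only if depth and genotype quality clear the thresholds."""
    return int(fields["DP"]) >= min_dp and float(fields["GQ"]) >= min_gq

# The two calls from the first post (PL omitted).
calls = [
    {"GT": "0/1", "GQ": "8.82", "DP": "96"},
    {"GT": "1/1", "GQ": "38.92", "DP": "1"},
]

# Illustrative thresholds only: a stricter min_gq keeps fewer calls.
for min_gq in (5.0, 20.0):
    kept = [c["GT"] for c in calls if passes(c, min_dp=10, min_gq=min_gq)]
    print("min_gq =", min_gq, "->", kept)
```

With the looser GQ threshold the deep 0/1 call survives, while the stricter one removes both calls, whether or not either was real.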