
**swNGS** · 03-19-2012, 11:44 AM

Hi,
I really feel your pain as I have struggled with the same thing a good few times.
You could save yourself a lot of bother by getting both your reference FASTA and the dbSNP VCF file from GATK; they are more likely to play together.
Chris

**dkrtndhkd** · 03-20-2012, 12:48 AM

Hi, thank you for your reply.

Could you let me know where I can download the dbSNP and FASTA files from GATK?

I need dbSNP 135 and the hg19 reference.

As far as I know, the GATK bundle contains dbSNP 131 (or 129?) and the hg18 reference.

Is it possible to download more recent data from GATK?

Please link the site.

**alexbmp** · 03-20-2012, 01:37 AM

If you are planning to use the Broad bundle,
I reckon a bundle for hg19 is available.

1. Have you tried downloading from the following ftp yet?
ftp://[email protected]/1.2/hg19/

The dbSNP version in the Broad hg19 bundle, as far as I know, is dbSNP 132.
However, unless there is a specific reason to use dbSNP 135 (I might be wrong!), I don't think using dbSNP 132 would cause any problem.

2. Also, you must make sure your reference chromosome order and VCF chromosome order are the same.
(Personally, I recall struggling because dbsnp132_b37 had "MT" at the top of the chromosome ID list.)
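
To make point 2 concrete, here is a minimal Python sketch of an order check. It assumes you have a samtools-style `.fai` index for the reference (contig name in the first tab-separated column) and a plain-text VCF; the function names are made up for illustration and are not part of GATK or any other tool.

```python
def fasta_index_contigs(fai_lines):
    """Contig names, in file order, from the lines of a .fai index."""
    return [line.rstrip("\n").split("\t")[0] for line in fai_lines if line.strip()]


def vcf_contigs(vcf_lines):
    """Contig names in order of first appearance among VCF data lines."""
    order = []
    seen = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        chrom = line.split("\t", 1)[0]
        if chrom not in seen:
            seen.add(chrom)
            order.append(chrom)
    return order


def order_is_compatible(ref_contigs, other_contigs):
    """True if other_contigs follow the same relative order as ref_contigs.

    A contig that is missing from the reference (for example "chr1" vs "1")
    also counts as incompatible.
    """
    rank = {name: i for i, name in enumerate(ref_contigs)}
    try:
        positions = [rank[name] for name in other_contigs]
    except KeyError:
        return False
    return positions == sorted(positions)


# Toy example: the reference lists MT last, the VCF lists it first.
reference_order = ["1", "2", "MT"]
vcf_order = ["MT", "1", "2"]
print(order_is_compatible(reference_order, vcf_order))  # False
```

With real files you would pass open file handles, e.g. `fasta_index_contigs(open("ref.fa.fai"))`, where the file name is only a placeholder.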

**dkrtndhkd** · 03-20-2012, 07:14 AM

thank you

Thank you, alex!
But I have some questions...

1. As you might see in my reference file, the chromosomes are in this order: chr1~chr22, chrX/Y/MT, chrUn~.
However, after I run novoalign, the error message says the chromosome order is weird -> chr1, chr10~19, chr2, chr20~~~~
How can I handle this problem? Fixing the alignment program is out of my hands.

2. Do I need to include the chrUn~ sequences in my reference FASTA file?
These chrUn~ sequences are not included in the VCF file, are they?
If I include them, will the SNP calling step give me trouble again?

**alexbmp** · 03-20-2012, 07:34 AM

I haven't used NovoAlign, so don't fully trust me.

1-1. If you build an alignment index before aligning, check whether your index is in chromosomal order (chr1, chr2, chr3, ..., chrX, chrY, chrM or the equivalent).

1-2. If it is, check whether your alignment program has an output option that emits chromosome ID headers in unsorted or lexicographical (chr1, chr10, chr11, ..., chrM, chrX, chrY) order. I haven't seen this kind of alignment output option yet; I strongly suspect your index file itself is ordered lexicographically, which is the case to check in 1-1 (I had the same error).

2. If you are talking about contigs (chromosome fragments that are not fully assembled), I think it is good to include them in your alignment step.

I reckon reads from sequence that physically exists in such contigs will be mapped there, probably decreasing your error rate. Thinking about it, I'm not sure of this (but I'll write my thoughts anyway; somebody please correct me).

I also think you can just exclude SNPs on contigs if their presence bugs you.
Contigs are not fully assembled chromosomes in the first place.
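
To illustrate both points, here is a rough Python sketch, not something taken from GATK or NovoAlign. `karyotypic_key` is a hypothetical sort key that puts names in the chromosomal order from 1-1 (numbered chromosomes, then X, Y, M/MT, then everything else), and `primary_records` is a hypothetical filter that drops VCF records on any other contig. The contig names in the example are only illustrative, and your reference may use a different order (for example M first), so treat the ranking as an assumption to adjust.

```python
def karyotypic_key(name):
    """Sort key: 1..22 numerically, then X, Y, M/MT, then other contigs by name."""
    core = name[3:] if name.startswith("chr") else name
    if core.isdigit():
        return (0, int(core), "")
    special = {"X": 0, "Y": 1, "M": 2, "MT": 2}
    if core in special:
        return (1, special[core], "")
    # Unplaced, random, and haplotype contigs fall through to the last group.
    return (2, 0, name)


def primary_records(vcf_lines):
    """Yield VCF header lines plus records on numbered chromosomes, X, Y, or M/MT."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
        elif karyotypic_key(line.split("\t", 1)[0])[0] < 2:
            yield line


# Lexicographic order, like the one in the error message, versus chromosomal order.
names = ["chr1", "chr10", "chr2", "chrM", "chrUn_gl000211", "chrX", "chrY"]
print(sorted(names, key=karyotypic_key))
# ['chr1', 'chr2', 'chr10', 'chrX', 'chrY', 'chrM', 'chrUn_gl000211']
```

The same key can be used to check whether a list of contig names is already in chromosomal order: compare the list with `sorted(names, key=karyotypic_key)`.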

Did I understand your questions fully?

GATK error because of the order of reference chr.