Hello all,
I have PED and MAP files that need to be converted to VCF format.
Here are a few genotypes from the PED file:
Sample1 Sample1 0 0 2 1 0 0 G G G G 0 0 C C G G G G A A T T
Sample2 Sample2 0 0 1 2 0 0 G G G G 0 0 C C G G G G A A T T
Sample3 Sample3 0 0 1 1 0 0 G G G G 0 0 C C G G G G A A T T
Sample4 Sample4 0 0 1 1 0 0 G G G G 0 0 C C 0 0 G G A A T T
Sample5 Sample5 0 0 2 1 0 0 G G G G 0 0 C C 0 0 G G G A T T
Sample6 Sample6 0 0 1 1 0 0 G G G G 0 0 C C G G G G A A T T
Sample7 Sample7 0 0 2 1 0 0 G G G G 0 0 C C G G G G A A T T
Sample8 Sample8 0 0 1 1 0 0 G G G G 0 0 C C G G G G A A T T
Sample9 Sample9 0 0 1 1 0 0 G G G G 0 0 C C G G G G A A T T
Sample10 Sample10 0 0 1 2 0 0 G G G G 0 0 C C G G G G A A T T
And a few lines from the MAP file:
1 BICF2G630707759 0 3014448
1 BICF2S2358127 0 3068620
1 BICF2P1173580 0 3079928
1 BICF2G630707846 0 3082514
1 BICF2G630707893 0 3176980
The following steps were performed to convert to VCF:
$PLINK --file sma --make-bed --out temp
echo "PLINK temp files created"
$PLINKSEQ sma-proj new-project
echo "PlinkSeq project file created"
$PLINKSEQ sma-proj load-plink --file temp --id loaded
echo "PlinkSeq temp files loaded"
$PLINKSEQ sma-proj write-vcf > sma.vcf
echo "PlinkSeq finished! VCF file created"
The output looks like this:
##fileformat=VCFv4.1
##source=pseq
##FILTER=<ID=PASS,Description="Passed variant FILTERs">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1 Sample2 Sample3 Sample4 Sample5
chr1 3014448 BICF2G630707759 0 0 . PASS . GT ./. ./. ./. ./. ./.
chr1 3068620 BICF2S2358127 G 0 . PASS . GT 1/1 1/1 1/1 1/1 1/1
chr1 3079928 BICF2P1173580 G 0 . PASS . GT 1/1 1/1 1/1 1/1 1/1
chr1 3082514 BICF2G630707846 0 0 . PASS . GT ./. ./. ./. ./. ./.
chr1 3176980 BICF2G630707893 C 0 . PASS . GT 1/1 1/1 1/1 1/1 1/1
Some records have '0' for REF, ALT, or both. I do not understand what this means or how this error occurs.
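For reference, 0 is PLINK's default missing-allele code in PED files, so one check that can be sketched is to tally which alleles actually occur at each marker. The helper below is hypothetical (the name ped_allele_summary is made up here and is not part of PLINK or PLINK/SEQ) and assumes a whitespace-delimited PED file with the six standard leading columns:

```shell
# Hypothetical helper: list the distinct non-missing alleles seen at each
# marker of a PED file. Assumes six leading columns (FID IID PAT MAT SEX PHENO)
# followed by two allele columns per marker, with '0' as the missing code.
ped_allele_summary() {
  awk '
    {
      n = (NF - 6) / 2                  # number of markers on this line
      if (n > max) max = n
      for (m = 1; m <= n; m++) {
        for (k = 0; k < 2; k++) {
          a = $(6 + 2 * m - 1 + k)      # first or second allele of marker m
          if (a != "0" && !((m, a) in seen)) {
            seen[m, a] = 1
            alleles[m] = (alleles[m] == "" ? a : alleles[m] "," a)
          }
        }
      }
    }
    END {
      # One line per marker: index, then observed alleles or "all-missing"
      for (m = 1; m <= max; m++)
        print m, (alleles[m] == "" ? "all-missing" : alleles[m])
    }
  ' "$1"
}
```

It could be run as `ped_allele_summary sma.ped`. On the ten samples shown above, markers 1 and 4 have only 0 0 genotypes and markers 2, 3 and 5 each show a single allele (G, G and C), which appears to line up with the VCF records that have '0' in REF, ALT, or both.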
It would be very helpful if someone could help me understand the problem and suggest ideas to fix it.