Hi there!!
A friend has passed me data from an Infinium Omni5-4 array. In short, they applied two different mutagens to a cell line. The data set I have been given contains:
3 mother samples (MS)
3 samples mutagen 1 (M1)
3 samples mutagen 2 (M2)
The objective is to identify the mutagenic potential of each mutagen.
After googling, I think what I need to do is make a Manhattan plot to identify the chromosomal regions that have mutated the most for each mutagen.
What I have done so far is:
1 Genotype calling, using the Genotyping Module of GenomeStudio
2 Establish the reference set: SNPs with a 100% call rate and the same genotype in all MS.
3 Assign each SNP in the M1 and M2 samples a 1 if the genotype differs from the reference or a 0 if it does not. SNPs in the M1 and M2 samples that are not in the reference set or that are not 100% called have been discarded.
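Steps 2 and 3 can be sketched roughly as follows in Python. This is only an illustration of the logic, not the original Perl script. The genotype strings ("AA", "AB", "BB"), the "--" no-call symbol, and the function names build_reference and encode_samples are all assumptions made for the example.

```python
from typing import Dict, List

NO_CALL = "--"  # assumed no-call symbol; adjust to whatever the export uses


def build_reference(mother: Dict[str, List[str]]) -> Dict[str, str]:
    """Step 2: keep SNPs called in every MS sample with one shared genotype.

    `mother` maps SNP id -> list of genotype calls, one per mother sample.
    """
    reference = {}
    for snp, calls in mother.items():
        if NO_CALL in calls:
            continue  # not 100% called in the mother samples
        if len(set(calls)) == 1:
            reference[snp] = calls[0]  # identical genotype in all MS
    return reference


def encode_samples(
    treated: Dict[str, Dict[str, str]], reference: Dict[str, str]
) -> Dict[str, Dict[str, int]]:
    """Step 3: 1 if the genotype differs from the reference, else 0.

    SNPs outside the reference set, or with a no-call in any treated
    sample, are dropped.
    """
    kept = [
        snp
        for snp in reference
        if all(calls.get(snp, NO_CALL) != NO_CALL for calls in treated.values())
    ]
    return {
        name: {snp: int(calls[snp] != reference[snp]) for snp in kept}
        for name, calls in treated.items()
    }


# Toy data, invented for illustration only
mother = {
    "SNP_1": ["AA", "AA", "AA"],
    "SNP_2": ["AB", "AB", "AB"],
    "SNP_3": ["AA", "AB", "AA"],  # genotypes disagree -> excluded
    "SNP_4": ["BB", "--", "BB"],  # no-call -> excluded
}
treated = {
    "M1.1": {"SNP_1": "AB", "SNP_2": "AB"},
    "M1.2": {"SNP_1": "AB", "SNP_2": "BB"},
}
reference = build_reference(mother)
matrix = encode_samples(treated, reference)
```

With this toy input, only SNP_1 and SNP_2 end up in the reference set, and the two treated samples get the same 0/1 rows as M1.1 and M1.2 in the file layout described in this post.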
I have done all this with Perl. What I now have is a text file with the following structure:
SNP_1 SNP_2 ...
M1.1 1 0
M1.2 1 1
.
.
M2.3 1 0
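For anyone who wants to work with this layout, here is a minimal Python sketch of how such a file could be read and summarised as the fraction of changed SNPs per sample. It assumes whitespace-separated fields, a header line containing only SNP ids, and one row per sample. The names read_matrix and changed_fraction are hypothetical. It is purely descriptive and not a statistical test.

```python
import io
from typing import Dict, List, TextIO, Tuple


def read_matrix(handle: TextIO) -> Tuple[List[str], Dict[str, Dict[str, int]]]:
    """Read the 0/1 matrix: header of SNP ids, then 'sample v1 v2 ...' rows."""
    header = handle.readline().split()
    rows: Dict[str, Dict[str, int]] = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name, values = fields[0], [int(v) for v in fields[1:]]
        if len(values) != len(header):
            raise ValueError(f"row {name!r} has {len(values)} values, "
                             f"expected {len(header)}")
        rows[name] = dict(zip(header, values))
    return header, rows


def changed_fraction(rows: Dict[str, Dict[str, int]]) -> Dict[str, float]:
    """Fraction of retained SNPs whose genotype differs from the reference."""
    return {
        name: (sum(values.values()) / len(values)) if values else 0.0
        for name, values in rows.items()
    }


# Toy file content mirroring the structure above (values are invented)
text = "SNP_1 SNP_2\nM1.1 1 0\nM1.2 1 1\nM2.3 1 0\n"
snps, rows = read_matrix(io.StringIO(text))
fractions = changed_fraction(rows)
```

Reading from a real file would only mean replacing io.StringIO(text) with an opened file handle.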
And that's all, folks!! I have no idea what I should do now.
Any help will be welcome because, as you may have deduced, I have no experience working with arrays or statistics.
Thank you