Hi,
I have ChIP-seq data generated using a methylcytosine antibody and the Illumina GAII. For each sample, I have an input control (IC) and an immunoprecipitated (IP) sample. Bowtie was used to generate alignments.
MACS will locate peaks, but I have some concerns/questions:
1. Is there a minimum number of reads that a peak should be composed of? I've seen publications where it looks like peaks have hundreds of reads, and others where a peak only has 30 reads.
2. In my samples, some peaks are composed mainly of reads on either the +ve or the -ve strand, while other peaks have an equal distribution of both. Is there any strand "ratio" that defines a "true" peak versus an artifact? It would seem impossible to go through and visually verify each peak.
3. Some people suggest using mfold values between 10 and 30, while others report using 5. The MACS website seems to suggest not going below 10, but is there a rule of thumb to determine the optimal mfold value for a specific data set?
Sorry for so many questions. I just want to make sure the data I report is valid. Thanks in advance,
jjw