Hello All,
I am an undergrad just entering the field, working on ChIP-seq analysis of Illumina data as well as de novo assembly of 454 data. The more I get involved with the various analysis tools (MACS, SPP, and MIRA in particular), the more questions I have about terminology. Admittedly, I have a limited statistical background. I was wondering if I could get some help getting a few concepts and facts straight.
"Tag Density": I have seen this term used quite a bit in connection with detecting binding-site peaks. My current notion is that it refers to the number of reads mapped to a given region of the genome, and that the density in a region can be used to decide whether that region is a binding site. Is that right?
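To make that notion concrete, here is a minimal sketch of what I imagine tag density to be, assuming each tag is reduced to a single mapped coordinate and the genome is cut into fixed-width windows. The function name `tag_density`, the window size, and the coordinates are all made up for illustration; this is not how MACS or SPP actually compute it.

```python
from collections import Counter

def tag_density(tag_positions, window=100):
    """Count tags per fixed-width window; returns {window_start: count}.

    Hypothetical helper: assumes one integer coordinate per tag on a
    single chromosome, and non-overlapping windows.
    """
    counts = Counter((pos // window) * window for pos in tag_positions)
    return dict(counts)

# Made-up coordinates: six tags pile up between 100 and 199.
tags = [105, 110, 120, 130, 150, 160, 420, 980]
density = tag_density(tags, window=100)
print(density)
```

In this toy case the window starting at 100 holds far more tags than the others, which is the kind of region I assume a peak caller would flag as a candidate binding site.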
"Tag Enrichment/Depletion": This one confuses me a bit as well. I believe it refers to regions of an alignment that are over- or under-represented, which I assume is due to sample biases. How is this information (if my interpretation is even correct) used in ChIP-seq?
"Tag Position": Is this simply where the tag maps on the genome?
In addition, I have seen many examples of cross-correlation (or correlation) plots used to show the separation distance between binding peaks. I was wondering what these plots really represent and why they are significant.
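My rough guess at what is being computed, in case it helps someone correct me: shift the forward-strand tags downstream by some distance, count how often they land on reverse-strand tags, and repeat for a range of distances. The sketch below assumes that interpretation; the function name `strand_shift_profile` and the coordinates are invented, and it uses a raw overlap count rather than the normalized correlation the real tools presumably report.

```python
from collections import Counter

def strand_shift_profile(plus_tags, minus_tags, max_shift=300):
    """For each shift, count plus-strand tags that land on minus-strand tags.

    Hypothetical, simplified stand-in for a strand cross-correlation:
    an unnormalized overlap score instead of a correlation coefficient.
    """
    plus = Counter(plus_tags)
    minus = Counter(minus_tags)
    profile = {}
    for shift in range(max_shift + 1):
        # Counter returns 0 for positions with no minus-strand tag.
        profile[shift] = sum(n * minus[pos + shift] for pos, n in plus.items())
    return profile

# Made-up data: minus-strand tags sit 150 bp downstream of plus-strand tags.
plus_tags = [1000, 1002, 1005, 5000, 5003]
minus_tags = [1150, 1152, 1155, 5150, 5153]
profile = strand_shift_profile(plus_tags, minus_tags, max_shift=300)
best_shift = max(profile, key=profile.get)
print(best_shift, profile[best_shift])
```

If this reading is right, the shift with the highest score would be the peak-separation distance shown in those plots, but I am not sure that is what the published graphs actually measure.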
Finally, my current understanding is that "tag" is just another word for an aligned read. Is that right, or do the two terms have different meanings? Hopefully I am not too far off the mark, but any insights would be greatly appreciated.
Thanks!