Hi everyone,
apologies if this has already been asked.
I recently started reading about how to prepare libraries for ATAC-seq, and I've found conflicting information about what the fragment-size distribution of the library is supposed to look like.
From what I understand, I should see peaks in the fragment length distribution at roughly n × 200 bp, corresponding to enriched cutting sites adjacent to mono-, di-, tri-, ... nucleosomes (see https://www.ncbi.nlm.nih.gov/pubmed/24097267, Figure 2). These peaks show up in TapeStation/Bioanalyzer traces as a QC check, and the same distribution can be recovered from the mapping of paired-end reads.
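For anyone wondering how to recover that distribution from the mapped reads: the template length is stored in the TLEN field (column 9) of each SAM/BAM alignment record, so you can tally it directly. Below is a minimal sketch using only the standard library on SAM-format text lines (the record contents are made up for illustration; in practice you'd stream real alignments, e.g. from `samtools view`):

```python
def fragment_lengths(sam_lines):
    """Collect insert sizes (TLEN, column 9) from SAM alignment lines.

    Only the mate with a positive TLEN is counted, so each read pair
    contributes exactly one fragment length.
    """
    lengths = []
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        tlen = int(fields[8])             # observed template length
        if tlen > 0:                      # count each pair once
            lengths.append(tlen)
    return lengths

# Toy SAM records (hypothetical) with one pair in the nucleosome-free
# range (<100 bp) and one around the mono-nucleosome mode (~200 bp):
sam = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t50M\t=\t130\t80\tACGT\tFFFF",
    "r1\t147\tchr1\t130\t60\t50M\t=\t100\t-80\tACGT\tFFFF",
    "r2\t99\tchr1\t500\t60\t50M\t=\t650\t200\tACGT\tFFFF",
    "r2\t147\tchr1\t650\t60\t50M\t=\t500\t-200\tACGT\tFFFF",
]
print(fragment_lengths(sam))   # -> [80, 200]
```

A histogram of these lengths is what should show the ~200 bp periodicity; dedicated tools compute the same thing from BAM files, this is just the idea.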
My question is this: as far as I know, Illumina (or at least the people in our core facility) recommend sequencing only libraries with 200-500 bp insert sizes, since longer fragments do not efficiently form clusters on the flow cell. Does this mean you lose most of your longer fragments during clustering? Is there a bias toward the very small ones? Or does it depend on the sequencing protocol and machine used? I've also heard that some people optimize their library prep to mainly produce short, homogeneous libraries, which to me sounds like over-digesting the chromatin.
If someone could shed a bit of light on these issues, I would be eternally grateful.
Cheers