Hi,
I just got data back from a Solexa run of a ChIP for a histone modification occurring in euchromatic regions of the genome. My tags, though, look like this:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 200126
CACACACACACACACACACACACACACACACACACA 9848
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 3245
TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG 3211
ACACACACACACACACACACACACACACACACACAC 2684
GAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGA 1390
AGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAG 1226
GGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAG 1105
TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTT 905
CACACACACACACACACACACACACACACACACACC 591
GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAA 582
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC 513
GGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTA 446
TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTTTT 406
CTAACCCTAACCCTAACCCTAACCCTAACCCTAACC 386
CACACACACACACACACACACACACACACACACACT 373
CACACACACACACACACACACACACACACACACAGA 322
GGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG 312
CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA 299
TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTTTG 294
GACAGACAGACAGACAGACAGACAGACAGACAGACA 290
CACACACACACACACACACACACACACACACACAAA 290
GGGGCAGAAGCTGCCTGAAAGGTGCTTGAGCAACGT 285
TACACACACACACACACACACACACACACACACACA 268
And it goes on like that. The majority of my reads are, or are almost, straight runs of a single base or of a dinucleotide. Only 5.5% of my reads mapped to the genome at all, and those that did are not where I biologically expect them to be. Blatting a lot of them shows that they are usually in repetitive elements.
Chromatin was fragmented by digestion to mononucleosomes, but the bioanalyzer trace shows a large peak at ~300 bp, which is larger than the expected peak at ~180 bp.
So I'm trying to figure out what went wrong with the run, so this can be avoided in the future. It looks like something's contaminating my sample, but blasting these reads doesn't show any obvious answers. Thoughts?
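To put a number on "the majority", here is a rough sketch of how the tag list could be scored for low-complexity content. It assumes the input is lines of `SEQUENCE COUNT` as shown above. The function names, the period cutoff of 6 bases, and the tolerance of 2 mismatches are all illustrative choices of mine, not part of any pipeline.

```python
def shortest_period(seq, max_period=6, max_mismatches=2):
    """Return the smallest period k (1..max_period) such that seq is a
    near-perfect repeat with that period, or None if there is none.

    A read counts as periodic with period k when at most max_mismatches
    positions differ from the base k positions earlier. The cutoffs are
    arbitrary illustrative values.
    """
    for k in range(1, max_period + 1):
        mismatches = sum(seq[i] != seq[i - k] for i in range(k, len(seq)))
        if mismatches <= max_mismatches:
            return k
    return None


def low_complexity_fraction(lines, **kwargs):
    """Fraction of reads (weighted by tag count) that are near-perfect
    short-period repeats. Lines that are not 'SEQUENCE COUNT' are skipped."""
    total = 0
    low = 0
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        seq, count = parts[0], int(parts[1])
        total += count
        if shortest_period(seq, **kwargs) is not None:
            low += count
    return low / total if total else 0.0


if __name__ == "__main__":
    example = [
        "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 200126",
        "GGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAG 1105",
        "GGGGCAGAAGCTGCCTGAAAGGTGCTTGAGCAACGT 285",
    ]
    print(f"{low_complexity_fraction(example):.4f}")
```

Running this over the full tag table rather than the three example lines would give the overall share of reads that are homopolymers or short repeats, which is easier to compare between runs than eyeballing the top of the list.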