**simonandrews** · 09-26-2012, 11:37 PM

For your high duplication level you might just be saturating your peaks. If your ChIP is really good then you're only looking at a limited region of your genome so eventually duplication becomes inevitable from a random selection of a diverse library. You should be able to see from your results whether you're getting incomplete or uneven coverage in your peaks which might suggest that the duplication is more technical and problematic. If the peaks look smooth and evenly covered then I'd not worry about it too much.
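To put a rough number on the saturation argument above, here is a minimal sketch. It assumes reads are drawn uniformly at random, with replacement, from a fixed pool of distinct fragments. That is a simplification, since real ChIP enrichment is far from uniform, and the function name and parameters are made up for illustration.

```python
def expected_duplicate_fraction(n_reads: int, n_distinct_fragments: int) -> float:
    """Expected fraction of duplicate reads under uniform random sampling.

    Assumption (not from the thread): every one of the n_distinct_fragments
    is equally likely to be sequenced on each draw.
    """
    if n_reads <= 0:
        return 0.0
    k = n_distinct_fragments
    # Expected number of distinct fragments seen after n_reads draws.
    expected_unique = k * (1.0 - (1.0 - 1.0 / k) ** n_reads)
    return 1.0 - expected_unique / n_reads


# Hypothetical numbers: a pool of one million distinct fragments,
# sequenced at increasing depth.
for depth in (100_000, 1_000_000, 10_000_000):
    frac = expected_duplicate_fraction(depth, 1_000_000)
    print(f"{depth:>10} reads: {frac:.1%} expected duplicates")
```

Under these assumptions the duplicate fraction climbs steadily with depth even though the library itself is perfectly diverse, which is the "inevitable" duplication described above.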

For the fragment size it's difficult to know why you're seeing a shift in average size but normally the only size selection during library preparation would be to avoid adapter dimers, which are small, so it would seem odd if the library preparation decreased the average insert size.

For ChIP you really want short insert sizes so you get more specific information about binding locations. If your data looks good then I wouldn't worry about messing around with your protocol.

**biznatch** · 09-26-2012, 11:47 PM

Thank you, this is good to hear. It sounds like the results are pretty much as expected then, and based on what I've looked at so far the peaks do look smooth and evenly covered. It makes sense that for ChIP you want shorter inserts for more specific binding information, so I'm wondering: is 2x100 bp very common for ChIP, or do people tend to use 2x50, 2x75, or even single-end reads? The facility we sent it to said that they pretty much only do 2x100 bp now for everything (ChIP, RNA, etc.). There's nothing wrong with getting extra data, but I think it's usually cheaper to do shorter reads.

**simonandrews** · 09-26-2012, 11:54 PM

Actually we tend to do 1 x 50 for a lot of our ChIP. As long as you know the expected insert size for your library you can simply extend the single end reads to infer where the whole insert would have been. Makes things even cheaper and still seems to work OK if you've got a decent antibody.
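The read extension described above could be sketched roughly as follows. This assumes 0-based, half-open coordinates and a single expected fragment size for the whole library. The function name and arguments are hypothetical, not taken from any particular tool.

```python
from typing import Optional, Tuple


def extend_read(start: int, end: int, strand: str, fragment_size: int,
                chrom_length: Optional[int] = None) -> Tuple[int, int]:
    """Extend a single-end read to the expected fragment size.

    Coordinates are 0-based, half-open (an assumption for this sketch).
    A forward-strand read is extended to the right from its start; a
    reverse-strand read is extended to the left from its end, because the
    5' end of a reverse read is its rightmost aligned position.
    """
    if strand == "+":
        new_start, new_end = start, start + fragment_size
    elif strand == "-":
        new_start, new_end = end - fragment_size, end
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    # Clip to the chromosome boundaries.
    new_start = max(new_start, 0)
    if chrom_length is not None:
        new_end = min(new_end, chrom_length)
    return new_start, new_end


# A 50 bp read on each strand, extended to a hypothetical 200 bp insert.
print(extend_read(1000, 1050, "+", 200))
print(extend_read(1000, 1050, "-", 200))
```

With paired-end data this step is unnecessary, since the two mates mark both ends of the insert directly.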

**biznatch** · 09-27-2012, 12:15 AM

OK, that's kind of what I thought. The place we sent it to said that since they do mostly 2x100 now, it would take a lot longer if we did anything else, I guess because they have to wait until they have enough 1x50 requests to fill the machine? I'm not sure exactly how that works, but we only used 1 lane. The cost even for 2x100 was cheaper than other places with shorter reads, so it wasn't a big deal, but in future we'll have to consider other options.

We did 1x50 a year or so ago at a different facility but for our 5 samples this time it was actually cheaper to do 2x100 at the new place vs 1x50 at the old place.

**mitcherr** · 02-14-2013, 12:47 PM

Biznatch,

Did you do this analysis at TCAG? I am thinking of doing the same thing right now, and was wondering about exactly what you were regarding the read length, and whether to do single end instead of paired end to avoid excess redundancy. Did everything work out okay with your data? Would you have done things differently looking back?

cheers

**biznatch** · 02-14-2013, 02:45 PM

Hi mitcherr, yes it was TCAG. Everything worked out OK with the data; we actually just got our second set back today and I'm in the process of aligning it. The paired end reads seem to give fewer artifacts in a few places. There's one site in particular near a gene of interest that always shows a large peak of non-specific alignment in the 50 bp single end samples and inputs but not in the 2x100 paired end reads, but maybe 1x100 would look fine too.

I don't think paired end reads would increase redundancy. I think you start getting redundancy once you reach a certain number of reads, regardless of whether you have single or paired end reads. The only problem with paired end reads is that maybe you're paying a lot more money for only a small increase in alignment accuracy. From a biological/technical perspective I think paired end can only help.

With the new data set we went with the same 2x100 reads again because the facility couldn't estimate a turnaround time for anything else, and since the 2x100 at TCAG was the same price or less than shorter single end reads elsewhere. But if it wasn't for the turnaround time issue I think single end reads would be fine and we would have gone with that. I'd suggest contacting TCAG and asking about single end reads, maybe it will be faster now.

**mitcherr** · 02-15-2013, 08:29 AM

Thanks for the reply. Pretty funny that I could figure out what facility you used via read length and country of origin lol

**syfo** · 05-16-2013, 06:25 AM

on the advantage/cost of PE vs. SE

Originally posted by simonandrews:

Actually we tend to do 1 x 50 for a lot of our ChIP. As long as you know the expected insert size for your library you can simply extend the single end reads to infer where the whole insert would have been. Makes things even cheaper and still seems to work OK if you've got a decent antibody.

Originally posted by biznatch:

The only problem with paired end reads is that maybe you're paying a lot more money for only a small increase in alignment accuracy. [...] But if it wasn't for the turnaround time issue I think single end reads would be fine and we would have gone with that.

Aren't paired end reads better to detect and remove duplicates?
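One way to picture the question: with single end reads a duplicate can only be defined by the 5' position (and strand), whereas with paired end reads both ends of the fragment are known. A minimal sketch with made-up fragments, all assumed to be on the forward strand (the coordinates and helper name are purely illustrative):

```python
from collections import Counter


def count_duplicates(keys):
    """Number of reads that would be removed if one read is kept per key."""
    counts = Counter(keys)
    return sum(c - 1 for c in counts.values())


# Hypothetical fragments as (chrom, start, end), all on the forward strand.
# The first and third are true duplicates; the second shares only its start.
fragments = [
    ("chr1", 1000, 1180),
    ("chr1", 1000, 1230),
    ("chr1", 1000, 1180),
    ("chr1", 2000, 2150),
]

# Single end: only the 5' position is observed, so the key is (chrom, start).
se_keys = [(chrom, start) for chrom, start, _end in fragments]
# Paired end: both fragment ends are observed, so the key is the full interval.
pe_keys = list(fragments)

se_dups = count_duplicates(se_keys)
pe_dups = count_duplicates(pe_keys)
print("single end duplicates:", se_dups)
print("paired end duplicates:", pe_dups)
```

In this toy case the single end key also discards the fragment that merely starts at the same position, while the paired end key removes only the exact repeat.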

**mxqian** · 08-13-2013, 10:51 PM

@biznatch
Hi, as you can see, nearly all NGS data on the Illumina platform is 2x100 bp now. However, I cannot find suitable analysis software for ChIP-seq with paired reads. MACS only accepts the ELANDMULTI format for paired reads. If the input is SAM/BAM, the most widely used format for mapped reads, MACS will keep only the left mate (5' end) tag. That will work, but I don't think it makes good use of the pairing information. Any suggestions?

Concern about short fragment size and high duplication rate in paired-end ChIP-Seq
