  • [ChIP-seq] MACS 1.4: assertion error

    Hi,

    I just started working with ChIP-seq and I am using MACS to predict binding sites for a fungal genome (12 Mb).
    As mentioned here, I am getting the same problem as some previous users: I get an assertion error when calculating negative peaks. I only managed to work around it by reducing the data to 75% of the reads; with 80% or more of the original reads, the error comes back.

    I have a control file with 31M 36 bp single-end reads (4.1 GB) and a sample file
    with 28M 36 bp single-end reads (3.6 GB).
    I tried it on both a laptop (4 GB RAM) and a server (16 GB RAM); in
    both cases it used 1.6 GB of memory and behaved the same way. I also tried it on Cistrome, with the same result.
    Changing the mfold parameter didn't help.
    With 75% of the data I did get sensible results, but I am not sure how I can move on after discarding 25% of a dataset... it just adds another layer of complexity to the analysis.

    Does anybody have an idea of what causes the problem, and of why
    reducing the amount of data works?
    Also, if anybody knows of a valid alternative tool, feel free to suggest it.
    Thanks!

  • #2
    I just answered this question in the MACS user group; however, since you asked on SEQanswers, I am re-posting it here.


    This error normally happens when you have too many reads in a very small genome. In your case, you used a whole GA2 lane to sequence a single factor in a genome like E. coli. Because of the extremely high coverage, an overflow error occurs, since my function doesn't expect a Poisson rate higher than 740...
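To illustrate why a Poisson rate above roughly 740 can break a direct calculation, here is a minimal Python sketch. It is not MACS's actual code. It assumes one plausible mechanism: the calculation starts from exp(-lambda) in double precision, which becomes 0.0 once lambda exceeds about 745, so every probability derived from it collapses to zero. The function names are hypothetical.

```python
import math


def naive_poisson_pmf(k, lam):
    """P(X = k) built up from exp(-lam); breaks down for very large lam."""
    p = math.exp(-lam)          # becomes exactly 0.0 for lam above ~745
    for i in range(1, k + 1):
        p *= lam / i            # 0.0 stays 0.0 no matter what we multiply by
    return p


def log_space_poisson_pmf(k, lam):
    """Same quantity computed in log space, which avoids the collapse."""
    log_p = -lam + k * math.log(lam) - math.lgamma(k + 1)
    return math.exp(log_p)


if __name__ == "__main__":
    for lam in (2.0, 700.0, 800.0):
        k = int(lam)
        print(lam, naive_poisson_pmf(k, lam), log_space_poisson_pmf(k, lam))
```

For small rates the two functions agree. For a rate of 800 the naive version returns 0.0 while the log-space version still returns a usable probability.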

    In practice, you'd better consider multiplexing to make full use of a single lane, sequencing multiple factors or a single factor under multiple conditions/time points. 30 million reads for a single experiment on a 4 million bp genome is a big waste; you could even assemble the genome of this species with that now...
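As a rough check on how deep such a lane is, the mean coverage can be estimated as reads times read length divided by genome size. The sketch below is only back-of-the-envelope arithmetic using the numbers quoted in this thread (36 bp reads, 30 million reads on a 4 Mb genome, and the original poster's 28 million reads on a 12 Mb genome). The function name is hypothetical, and the estimate ignores unmapped and duplicate reads.

```python
def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Average number of times each base is covered, assuming all reads map."""
    return n_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    # Numbers quoted in the reply: 30 million 36 bp reads, 4 Mb genome.
    print(mean_coverage(30_000_000, 36, 4_000_000))
    # Numbers from the original post: 28 million 36 bp reads, 12 Mb genome.
    print(mean_coverage(28_000_000, 36, 12_000_000))
```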

    Anyway, since you already have your 30 million reads, what you can do (instead of waiting for me to fix it (: ) is subsample your sequencing reads. My impression for human ChIP-seq is that, to reach saturation for peak detection, you need about 300 million reads (from our unpublished Nat Methods paper), which is equivalent to 0.5 million reads in E. coli. You can use "samtools view -s" to subsample a portion of your BAM file.
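The scaling and the subsampling fraction can be worked out with a short Python sketch. It assumes the target read count scales linearly with genome size, and the genome sizes used (about 3.1 Gb for human, 4.6 Mb for E. coli) are approximate values added here for illustration. The helper names, the seed, and the file names are hypothetical. The only samtools behaviour relied on is that `-s` takes a value whose integer part is the random seed and whose fractional part is the fraction of reads to keep.

```python
def scaled_reads(reference_reads, reference_genome_bp, target_genome_bp):
    """Scale a read count from one genome size to another (linear assumption)."""
    return round(reference_reads * target_genome_bp / reference_genome_bp)


def samtools_s_arg(fraction, seed=42):
    """Build the SEED.FRACTION value for `samtools view -s` (4 decimal places)."""
    digits = round(fraction * 10_000)
    if not 1 <= digits <= 9_999:
        raise ValueError("fraction must be strictly between 0 and 1")
    return f"{seed}.{digits:04d}"


if __name__ == "__main__":
    # ~300 million human reads scaled to an E. coli-sized genome (assumed sizes).
    target = scaled_reads(300_000_000, 3_100_000_000, 4_600_000)
    print("target reads:", target)

    # Fraction to keep from a 28 million read sample file.
    fraction = target / 28_000_000
    arg = samtools_s_arg(fraction)
    # in.bam and out.bam are placeholder file names.
    print(f"samtools view -b -s {arg} in.bam > out.bam")
```

For a different genome, replace the target genome size and the total read count with your own values; the same helper then gives the `-s` value to pass to samtools.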
    梦蝶

    • #3
      Thanks for the explanation! I'll try downsampling, then.
