  • sbaheti
    Member
    • Jul 2010
    • 12

    tool to merge pileup

    Hi,

    I need to merge pileup files from multiple samples. Is there an open-source tool that can merge multiple pileups?

    Thanks

    Saurabh
  • mgolo
    Member
    • Apr 2011
    • 10

    #2
    Hi!

    I have the same question, Saurabh. I have 3 biological replicates and I would like to normalize them before I merge their pileup files. Does anyone have a clue how to do this?

    Thanks in advance!

    Maria


    • gringer
      David Eccles (gringer)
      • May 2011
      • 845

      #3
      For a basic merge, 'samtools mpileup' will do this. It displays different columns for each sample; I'm not sure if that is what you want.
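A minimal sketch of that per-sample layout, using two made-up mpileup lines for two samples. The column labels are hypothetical (samtools writes no header line); the assumed layout is sequence name, position and reference base, followed by a depth / bases / qualities triplet for each input file.

```r
# Two made-up mpileup lines for two samples (tab-separated, no header).
# Layout: sequence name, position, reference base, then depth / bases /
# base qualities repeated once per sample.
toy.lines <- c("chr1\t100\tA\t3\t...\tIII\t5\t.....\tIIIII",
               "chr1\t101\tC\t2\t.,\tII\t4\t.,.,\tIIII")
n.samples <- 2
# Column names are illustrative labels, not something samtools writes
col.labels <- c("seq", "pos", "ref",
                paste(c("count", "bases", "qual"),
                      rep(seq_len(n.samples), each = 3), sep = "_"))
# quote = "" because base and quality strings may contain quote characters
toy.df <- read.delim(text = toy.lines, header = FALSE, quote = "",
                     comment.char = "", col.names = col.labels,
                     stringsAsFactors = FALSE)
# Keep only the per-sample depth columns
depth.cols <- paste("count", seq_len(n.samples), sep = "_")
print(toy.df[, c("seq", "pos", depth.cols)])
```

Each input file contributes one count column, so the samples stay side by side rather than being summed or averaged.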


      • mgolo
        Member
        • Apr 2011
        • 10

        #4
        Originally posted by gringer
        For a basic merge, 'samtools mpileup' will do this. It displays different columns for each sample; I'm not sure if that is what you want.

        Hi! Thanks for your fast reply.

        I don't think this is what I need. I would like to merge the results of 3 biological replicates into a single file: basically, for each count, calculate the mean of the 3 runs. But before doing that I need to normalize the 3 files somehow, as they have different coverages. Maybe I have to do this before creating the pileup files... Any ideas?


        • gringer
          David Eccles (gringer)
          • May 2011
          • 845

          #5
          Originally posted by mgolo
          I don't think this is what I need. I would like to merge the results of 3 biological replicates into a single file: basically, for each count, calculate the mean of the 3 runs. But before doing that I need to normalize the 3 files somehow, as they have different coverages. Maybe I have to do this before creating the pileup files... Any ideas?

          mpileup is the only hammer I know (I'm very new to this), so take this with a grain of salt...

          mpileup gives raw count data for each run, so you can extract those columns as your raw coverages per sample. I would then normalise by dividing each sample's counts by a summary statistic of that column (e.g. the 75th percentile). Here's a quick R script that does this:

          Code:
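          # read in pileup data; quote = "" because base/quality strings can contain quote characters
          # (the third mpileup column is the reference base, so it is labelled "ref" here)
          data.df <- read.delim("mpileup_lane1-6.csv", sep = "\t", header = FALSE, quote = "",
                                col.names = c("isoform", "pos", "ref",
                                              paste(c("count", "seq", "qual"), rep(1:6, each = 3), sep = "_")))
          count.cols <- paste("count", 1:6, sep = "_")
          pos.counts <- data.df[, c("isoform", "pos", count.cols)]
          # rough normalisation: 75th percentile of each sample's counts, scaled so that
          # the smallest is 1 (assumes that percentile is non-zero in every sample)
          quantile.counts <- apply(pos.counts[, count.cols], 2, quantile, probs = 0.75)
          quantile.counts <- quantile.counts / min(quantile.counts)
          # divide each sample column by its own scale factor
          pos.counts[, count.cols] <- t(t(pos.counts[, count.cols]) / quantile.counts)
          # mean normalised count per position across the six samples
          pos.counts$mean <- rowMeans(pos.counts[, count.cols])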
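The scaling step above can be checked on made-up numbers. This is only a toy sketch: the three replicate columns are invented so that the second and third are exactly 2x and 3x as deep as the first, in which case the scaled columns should come out identical.

```r
# Made-up per-position counts; rep2 and rep3 are exact multiples of rep1
base.counts <- c(2, 4, 6, 8, 10)
toy.counts <- cbind(rep1 = base.counts,
                    rep2 = base.counts * 2,
                    rep3 = base.counts * 3)
# 75th percentile of each column, scaled so the shallowest sample is 1
uq <- apply(toy.counts, 2, quantile, probs = 0.75)
scale.factors <- uq / min(uq)
# Divide each column by its own scale factor
toy.norm <- sweep(toy.counts, 2, scale.factors, "/")
# Per-position mean of the normalised replicates
toy.mean <- rowMeans(toy.norm)
print(scale.factors)
print(cbind(toy.norm, mean = toy.mean))
```

Real replicates will not be exact multiples of each other, so the normalised columns will only roughly agree; the mean is then taken over those roughly comparable values.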
          A more complex *PKM-style (RPKM/FPKM) normalisation would need to take into account the transcript sizes for each hit, so you'd use something like Cufflinks/DESeq/whatever to get *PKM values for each transcript, then divide by that value. I'm not sure if going that far would be necessary, given that the replicates should be hitting transcripts with a similar relative frequency.
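For comparison, a length-aware value could be computed by hand roughly as in the sketch below, which uses the usual RPKM definition (reads per kilobase of transcript per million mapped reads). The transcript names, lengths, read counts and library size are all invented for illustration; in practice such values would come from a dedicated tool rather than from this calculation.

```r
# Invented read counts and transcript lengths for one sample
reads <- c(tx1 = 500, tx2 = 1000, tx3 = 250)       # reads per transcript
length.bp <- c(tx1 = 1000, tx2 = 4000, tx3 = 500)  # transcript length in bp
total.mapped <- 2e6                                # hypothetical library size
# RPKM = reads / (length in kb * mapped reads in millions)
rpkm <- reads / ((length.bp / 1000) * (total.mapped / 1e6))
print(rpkm)
```

Unlike the per-sample quantile scaling, this also divides by transcript length, which matters when comparing different transcripts but less so when comparing the same positions across replicates.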


          • mgolo
            Member
            • Apr 2011
            • 10

            #6
            Thank you gringer!

            I think you are right about the normalization; there is no need to take transcript sizes into account.
