Go Back   SEQanswers > Bioinformatics > Bioinformatics

Similar Threads
Thread Thread Starter Forum Replies Last Post
"home-made" Tn5 transposase for single-cell RNA-seq Simone78 Literature Watch 2 08-21-2014 02:38 PM
PubMed: Analysis of the Lung Microbiome in the "Healthy" Smoker and in COPD. Newsbot! Literature Watch 0 03-03-2011 03:00 AM
"Systems biology and administration" & "Genome generation: no engineering allowed" seb567 Bioinformatics 0 05-25-2010 01:19 PM
First Helicos Publication! Single Molecule DNA Seq of a "Viral" Genome ECO Helicos / Direct Genomics 15 09-05-2008 07:05 AM

Thread Tools
Old 06-30-2015, 07:36 PM   #1
Location: Singapore

Join Date: May 2012
Posts: 48
Default Have anyone tried analysis in Baslan 2012 "Genome-wide CNV analysis of Single-cell"?

Hi All,

I have been trying to use the analysis flow suggested by the paper, but it seems like I am not getting it. Just a show of hand if anyone of you have successfully replicated the analysis workflow?

wilson90 is offline   Reply With Quote
Old 06-30-2015, 10:56 PM   #2
Location: Singapore

Join Date: May 2012
Posts: 48

peaks <- function(series, span=3, ties.method = "first") {
### This peaks function from Petr Pikal
if((span <- as.integer(span)) %% 2 != 1) stop("'span' must be odd")
z <- embed(series, span)
s <- span%/%2
v <- max.col(z, ties.method=ties.method) == 1 + s
pad <- rep(FALSE, s)
result <- c(pad, v, pad)


#df <- read.table("./", header=T)
#dfs <- read.table("./SRR054616.hg19.50k.k50.nobad.varbin.short.txt", header=T)

df <- read.table("", header=T)
dfs <- read.table("hg19.50k.k50.nobad.varbin.short.txt", header=T)

starts <- c()
ends <- c()
prevEnd <- 0
len <- nrow(dfs)
for (j in 1:len) {
thisStart = prevEnd + 1
thisEnd = thisStart + dfs$num.mark[j] - 1
starts <- c(starts, thisStart)
ends <- c(ends, thisEnd)
prevEnd = thisEnd

amat <- matrix(data=0, nrow=1500000, ncol=1)
counter <- 1
for (j in 1len-1)) {
for (k in (j+1):len) {
N <- round((starts[j] - ends[j] + 1) * (starts[k] - ends[k] + 1)/1000)
D <- abs(2^dfs$seg.mean[j] - 2^dfs$seg.mean[k])
#cat(N, "\t")
if (N > 0) {
amat[(countercounter+N-1)), 1] <-, N)
counter <- counter+N
a3 <- amat[(1:counter),1]
a3.95 <- sort(a3)[round(.95*counter)]
a3d <- density(a3[which(a3 < a3.95)], n=1000)
a3d$x[which(peaks(as.vector(a3d$y), span=3))]
a3d$y[which(peaks(as.vector(a3d$y), span=3))]
cn0 <- a3d$x[which(peaks(as.vector(a3d$y), span=59))][1]
cn1 <- a3d$x[which(peaks(as.vector(a3d$y), span=59))][2]
quantile(a3d$x[which(peaks(as.vector(a3d$y), span=3))],c(0.2,0.4,0.6,0.8,1))
df$cn.ratio <- df$lowratio / cn1
df$cn.seg <- df$seg.mean.LOWESS / cn1
df$copy.number <- round(df$cn.seg)

write.table(df, sep="\t", file=paste("", sep=""), quote=F, row.names=F)

postscript("", height=400, width=600)
plot(a3d, main="seg.mean difference density")

for (a in 1:24) {
postscript(paste("", a, ".ps", sep=""), height=400, width=600)
plot(df$cn.ratio[df$chrom==a], main=paste("SRR054616 chr", a, sep=""), xlab="Bin", ylab="Ratio", col="#CCCCCC")
lines(df$cn.ratio[df$chrom==a], col="#CCCCCC")
points(df$cn.seg[df$chrom==a], col="#0000DD")
lines(df$cn.seg[df$chrom==a], col="#0000DD")
points(df$copy.number[df$chrom==a], col="#DD0000")
lines(df$copy.number[df$chrom==a], col="#DD0000")
This is the part where the author tries to estimate copy number. It is kind of him to share his code, but it comes without any comment or explanation.

Can anyone familiar with CNV field kind enough to explain what this R code is doing?

Thank you.

wilson90 is offline   Reply With Quote

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

All times are GMT -8. The time now is 06:34 PM.

Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2020, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO