Seqanswers Leaderboard Ad

**cllorens** · 07-13-2013, 07:44 AM

Well, It seems thatnot so much people is familiarized with RIp-seq and ripseeker which are both probably quite new technology and associated tool

Ok i will post the solution just if any other find the same issue.

although it would be good to also say that I had anoter issue with this pipeline in addition to that the posted above. The pipeline gave me the following error

***0. Computing optimal bin size.
***0. Computing optimal bin size.
Error in GRangesList(nbhGRList) :
all elements in '...' must be GRanges objects

Thinking about it seems that "Granges" means "Genomic ranges" ( i am not an expert in R) and the firts issue i posted seems to be a problem of calling to biomart because differences in format. Taking into account the examples given in the url of ripseeker are based on the use of the hg19 genome reference files provided by illumina via a link in the tophat web site, which only implements chromosomes 1-22 plus sexuals and "mit" and that i was using the Grch37 last versión which implements a lot of supercontings not assigned to any chromosome because are repetitive or problematic sequences in term of assembly, the solution it was quite obvious.

Use the hg19 as such and voila "problemo resolved", probably the grch37 will work too if the supercontigs not assigned to chromosomes are removed from the fasta and the index o including if such labels are removed from the bam via samtools etc.

Just hope this to be useful for others in order to not to waste time in trying to find the bug.

CArlix

**cllorens** · 07-18-2013, 06:18 AM

One additional comment,
I downloaded the hg19 from the ucsc with all the features (chromosomes plus unasigned supercontigs) and ripseeker work well using this material if previously a vector from list.files to the bampath is assigned before launching the ripseek all in one function.

bamFiles <- list.files(path="your path", recursive=TRUE, full.names=TRUE)

and then

ripSeek(bamPath=bamFiles,... same string to that above)

Cheers

**mgeorge** · 08-23-2013, 10:42 AM

Hi cllorens,
I am using RIPSeeker as well, and received 50+(probably 100s or 1000s) warnings saying for Nbm_em, Alpha had become negative. Have you seen this error? Is this a sign of bad input reads or an issue with the analysis?

**cllorens** · 08-23-2013, 02:05 PM

Hi George,
I did not receive that warning. If it is a warning you might ignore it (careful with this Word, it is just an option of your choice) if the results are reasonable. Nbm_em is the Expectation conditional maximization of likelihood for negative binomial
mixture model of RIPseeker. I am not an expert on this tool. Can only say that It Works well when using (at least in my case) aligned data generated by tophat and using the all-in-one main function ripseek and then computeRPKm and rulebaseRIPseek. I documented the steps and protocol of commands i successfully used after some proofs i have it in spanglish but if your find it useful just let it to known and i will write in english and will upload it here. My recomendation as for the function Nbm_em and your issue is to keep an eye to the tutorial available here (http://bioconductor.org/packages/2.1.../RIPSeeker.pdf) as it is quite useful and also contact to Yue Li (the autor) which is a nice guy that i am sure will be deligth to answer you any question about the theoretical underlying background. I am not sure if it is appropiate to publicly type his email here, but it is available in the NAR paper where he introduce the tool or just write me directly to my email ([email protected]) and i will reply you with the Yue address.

Best

Carlos

**mgeorge** · 08-24-2013, 02:35 PM

Could you post that please? My inputs look like this:
cNAME <- "RIP1_IgG"
> getwd()
[1] "/home/georgem/BAM"
> outDir <- paste(getwd(), "escRIP_Brg1_1", sep="/")
> binSize <- NULL
> biomart <- "ENSEMBL_MART_ENSEMBL"
> dataset <- "mmusculus_gene_ensembl"
> host <- "dec2011.archive.ensembl.org"
> goAnno <- "org.Mm.eg.db"
> seekOut <- ripSeek(bamPath="/home/georgem/BAM", cNAME=cNAME, binSize=binSize, strandType=NULL, paired=TRUE, outDir=outDir, genomeBuild="mm9", biomart=biomart, host=host, biomaRt_dataset = dataset, goAnno = goAnno, uniqueHit = TRUE, assignMultihits = TRUE, rerunWithDisambiguatedMultihits = TRUE, multicore=FALSE).

Everything proceeds nicely from chromosome to chromosome, but it halts with an error at:

*IV. Annotate RIP regions via online ensembl database (mmusculus_gene_ensembl):

Error: require(biomaRt) is not TRUE.

This was after a day+ of computing......

Thanks for your help!

**cllorens** · 08-24-2013, 02:58 PM

ok George, give me a couple of days, anyway contact Yue and check if you are using the appropiate reference reléase for calling Biomart. I do not if you used it but it would be a good idea to use the annotation illumina material provided by tophat in its web site.

**cllorens** · 08-27-2013, 04:03 PM

Hi George,

Here you have the protocol I followed.

I am not sure if it will helps you but it was success in running my analysis of two human samples single end control and rip.

just a comment for your problem. Try with the index and annotation material provided by Tophap and use the last mm release (if so you do not need to include the host info in your command.

Hope it helps,

Carlix

A) mapping

for both Control and RIP

./tophat --read-mismatches 2 --read-gap-length 3 --read-edit-dist 3 --GTF /home/your_path/Homo_sapiens/UCSC/hg19/Annotation/Archives/archive-2013-03-06-11-23-03/Genes/genes.gtf --transcriptome-index=/home/your_path/Homo_sapiens/UCSC/hg19/Sequence/ref_transcriptome_to_create /home/your_path/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome /home/your_path/your.fastq

B) Then RIP-seq analisis using RIPseeker (http://bioconductor.org/packages/2.1...RIPSeeker.html)

1) calling R (assume you have it installed)

R

2) installing RIPseeker

source("http://bioconductor.org/biocLite.R")
biocLite("RIPSeeker")

3) Then load it into console

library(RIPSeeker)

4) Run Ripseek

bamFiles <- list.files(path="/home/your_bampath", recursive=TRUE, full.names=TRUE)

seekOut.CIPF <- ripSeek(bamPath=bamFiles, cNAME="control_accepted_hits.bam", genomeBuild="hg19", uniqueHit=TRUE, assignMultihits=TRUE, rerunWithDisambiguatedMultihits=TRUE, binSize=NULL, strandType=NULL, paired=FALSE, biomart="ensembl", biomaRt_dataset="hsapiens_gene_ensembl", goAnno="org.Hs.eg.db", exportFormat="gff3", annotateFormat="txt", annotateType="TSS", outDir="/home/your_path/output", padjMethod="BH", logOddCutoff = 0, pvalCutoff=1, pvalAdjCutoff = 1, eFDRCutoff = 1)

5) peak counts

ripRPKM <- computeRPKM(bamFiles[1], dataset="hsapiens_gene_ensembl", paired=FALSE, moreGeneInfo=TRUE, justRPKM=FALSE, idType="ensembl_transcript_id", featureType="exon",txDbName="biomart", biomart="ensembl", by="tx", saveData="/home/yourpath/ripRPKM")

rulebase.results <- rulebaseRIPSeek(bamFiles=bamFiles, cNAME="control_accepted_hits.bam", myMin=1, featureGRanges=ripRPKM$featureGRanges, rpkmCutoff = 0.4, fcCutoff = 3, moreRIPGeneInfo=TRUE, idType="ensembl_transcript_id", biomart="ensembl",dataset="hsapiens_gene_ensembl", saveData="/home/yourpath/rulebase.results")

head(ripRPKM$rpkmDF)

df <- rulebase.results$rpkmDF

df <- df[order(df$foldchange, decreasing=TRUE), ]

write.table(df, file="/home/yourpath/csv", append=FALSE, quote=TRUE, sep = ";", eol = "\n", na = "NA", dec = ".", row.names=TRUE, col.names=TRUE, qmethod = c("escape", "double"), fileEncoding = "")

6) now_look_for_coverage

RipGal <- readGappedAlignments("/data1/home/yourpath/control_accepted_hits.bam")

CtlGal <- readGappedAlignments("/data1/home/yourpath/RIP_accepted_hits.bam")

ripGR <- as(RipGal, "GRanges")

ctlGR <- as(CtlGal, "GRanges")

ripGRList <- split(ripGR, seqnames(ripGR))

ripGRList <- ripGRList[elementLengths(ripGRList) != 0]

ctlGRList <- GRangesList(as.list(split(ctlGR, seqnames(ctlGR))))

ctlGRList <- ctlGRList[lapply(ctlGRList, length) != 0]

binSize <- 1000

#c(bottom, left, top, right)

par(mfrow=c(1, 2), mar=c(2, 2, 2, 0) + 0.1)

tmp <- lapply(ripGRList, plotStrandedCoverage, binSize=binSize, xlab="", ylab="", plotLegend=TRUE, legend.cex=1.5, main="RIP", cex.main=1.5)

tmp <- lapply(ctlGRList, plotStrandedCoverage, binSize=binSize, xlab="", ylab="", plotLegend=TRUE, legend.cex=1.5, main="CTL", cex.main=1.5)

**mgeorge** · 01-21-2014, 09:35 AM

Hi cllorens,
In analyzing the RIPregions data produced by RIPSeeker, I am a little confused. In many cases, the region in experimental sample and the control sample show similar read counts and FPKM, but the logOddScore will be drastically different (e.g.: Region 1: FPK:3435, logOddScore:88.6; CTRL_fpk: 3110, CTRL_logOddScore: -59.6). How can this be?
Thanks! I hope your RIP-seq experiments are going well.

**pavol** · 09-12-2014, 02:53 PM

Hi,
I am also working with RipSeeker v 1.4 and got similar issue:

########### portion of output

**D. Re-run NB HMM with unique hits + disambiguated multihits.
*** All chromosomes have at least one alignment
**********
chr1:
**********
***0. Computing optimal bin size.
**********
chr10:
**********
***0. Computing optimal bin size.
Error in GRangesList(nbhGRList) :
all elements in '...' must be GRanges objects

######### the program terminates after over a day of computation.

I have successfully used this program before with long reads (100nt control and RIP)
Here I decided to use smaller reads (100nt control and 50 bp RIP).
Alignment was done with Tophat with the setting described in most recent RIPSeeker paper and it was exactly same for all data sets.
I have made sure that the alignment was done to reference genome with only well annotated chromosomes.

Any suggestion what may be the problem???

##### the script I used
library(RIPSeeker)

bamFiles # will have 2 total files
iggFiles # will only have igg

outDir <- file.path(wd,"ripseek_Med_out")
multicore <- TRUE
biomart <- "ENSEMBL_MART_ENSEMBL" # use archive to get ensembl 65
biomaRt_dataset <- "mmusculus_gene_ensembl" # mouse dataset id name
host <- "dec2011.archive.ensembl.org" # use ensembl 65 for annotation for mm9
goAnno <- "org.Mm.eg.db" # GO annotation database

ripseek_Med <- ripSeek(bamPath=bamFiles, cNAME=iggFiles, genomeBuild="mm9", uniqueHit=TRUE,
assignMultihits=TRUE,
rerunWithDisambiguatedMultihits=TRUE, biomart=biomart, host=host,
biomaRt_dataset=biomaRt_dataset, goAnno=goAnno, multicore=multicore,
outDir=outDir)

##### end script

**pavol** · 10-02-2014, 06:26 AM

Yet another issue using the same command and settings as seen in my previous post.
Any suggestions?

####
*III. Seek RIP regions with control library:

Error in .normargGenome(value, seqnames(x)) :
when length of supplied 'genome' vector is not 1, then it must equal the number of sequences

**donthu** · 06-06-2016, 09:01 AM

RIPSeeker error

Hello Pavol,

I am also getting same error. Were you able to resolve this? If yes, please share your solution. Thank you for your help.

Error in GRangesList(nbhGRList) :
all elements in '...' must be GRanges objects

Originally posted by pavol View Post

Yet another issue using the same command and settings as seen in my previous post.
Any suggestions?

####
*III. Seek RIP regions with control library:

Error in .normargGenome(value, seqnames(x)) :
when length of supplied 'genome' vector is not 1, then it must equal the number of sequences

**pnat** · 01-27-2017, 01:35 PM

RIP-Seq, RIP-Seeker - annotation

I am working with my first RIP-Seq data (plant genome: Arabidopsis Thaliana).

Can I get help with the following information for annotating the RIP regions identified by RIPSeeker?
genomeBuild= "TAIR10"
biomart=
host=
biomaRt_dataset =
goAnno =

thanks

**sbcn** · 01-31-2017, 10:07 AM

Hi,

I think this should work:

biomart="plants_mart"
host="plants.ensembl.org"
biomaRt_dataset ="athaliana_eg_gene"
goAnno ="org.At.tair.db"

source: https://support.bioconductor.org/p/75424/

Cheers

Topics	Statistics	Last Post
Expanding the Horizons of Cellular Research with the Single Cell Atlas by seqadmin Started by seqadmin, Today, 11:49 AM	0 responses 10 views 0 likes	Last Post by seqadmin Today, 11:49 AM
Genetic Variants and Diabetes Risk in Childhood Cancer Survivors by seqadmin Started by seqadmin, Yesterday, 08:47 AM	0 responses 16 views 0 likes	Last Post by seqadmin Yesterday, 08:47 AM
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 61 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 60 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM

Seqanswers Leaderboard Ad

Announcement

problemas with Rip-seq, ripseeker, rownames and R

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News