SEQanswers

Old 11-21-2017, 09:33 AM   #1
PandoraMid
Member
 
Location: Italy

Join Date: Nov 2015
Posts: 18
Help with RAPIDR package

Hi all! I need your help with an R package named RAPIDR. I'm new to this package and I really need it, but I don't know how to run my analysis without errors. Here are my steps.

1) I created a binned counts file with the following code:
Code:
> makeBinnedCountsFile(bam.file.list=c(bam1.bam, bam2.bam,bam3.bam,bam4.bam), sampleIDs=c("001", "003", "005", "013"), binned.counts.fname="output.csv", k=20000)
2) I created a reference set with the following code:
Code:
> rapidr.dir<-system.file(package="RAPIDR")
> data(outcomes)
> data(gcContent)
> T21.pos<-which(outcomes$Dx=="T21")
> chr.lens<-sapply(gcContent, length)
> chr.names<-names(chr.lens)
> header<-c("SampleID")
> for(i in 1:length(chr.lens)){
+ header<-c(header,rep(chr.names[i], chr.lens[i]))}
> nbins<-sum(chr.lens)
> ncols<-nbins+1
> binned.counts<-matrix(nrow=nrow(outcomes), ncol=ncols)
> for(i in 1:nrow(binned.counts)){
+ binned.counts[i,]<-rpois(ncols, lambda=100)
+ if(i%in%T21.pos){
+ binned.counts[i,139087:141493]<-rpois(chr.lens[21], lambda=115)
+ }}
> binned.counts[,1]<-outcomes$SampleID
> colnames(binned.counts)<-header
> t<-tempfile()
> write.table(binned.counts, file=t, col.names=TRUE, row.names=FALSE, quote=FALSE, sep=",")
> "output.csv"<-t
> message(t)
/tmp/RtmpwrkRt5/file33292c2c486
> gcContent.fname<-paste(rapidr.dir, "/media/martina/VERBATIM HD/Altamedica/RAPIDR_0.1.1/RAPIDR/data/gcContent.rda", sep="")
> head(outcomes)
  SampleID     Dx Gender
1     1000 Normal Female
2     1001 Normal Female
3     1002    T21 Female
4     1003 Normal   Male
5     1004 Normal   Male
6     1005 Normal   Male
> ref.set<-createReferenceSetFromCounts("/media/martina/VERBATIM HD/risultati/output.csv", outcomes, gcCorrect=FALSE, PCA=FALSE, filterBin=FALSE, gcContentFile=gcContent.fname)
But I get this error:
Code:
Loading binned counts file
Checking every sampleID has an outcome
No outcomes for Sample 001
No outcomes for Sample 003
No outcomes for Sample 005
No outcomes for Sample 013
Error in `[.data.frame`(sampleIDs.with.outcomes, , "Gender") : 
  undefined columns selected

The BAM files are from 2 males and 2 females without aneuploidies. The data come from Ion Torrent, but I took the FASTQ files, re-aligned them with bowtie2, and converted the alignments into sorted BAM files with samtools.


Why do I get this error? How can I perform the analysis? Does anyone have experience with this package who can help me?

Thank you

Last edited by PandoraMid; 11-21-2017 at 09:37 AM. Reason: I forgot some essential information
Old 11-22-2017, 12:03 AM   #2
Manonathan
Junior Member
 
Location: india

Join Date: Apr 2015
Posts: 6

The sample ID in the outcomes data frame should correspond to the sample file name. The error may be caused by a mismatch between the sample IDs. Check whether both have the same name. If not, modify the sample ID or rename the sample file and run the process again.
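
A minimal sketch of that check in base R. It assumes the binned counts file is comma-separated with a header row and the sample IDs in its first column (as in the simulated example in the first post), and that the outcomes data frame has a `SampleID` column. The helper name `missing_outcomes` and the toy data are made up for illustration; nothing here calls RAPIDR.

```r
# Hypothetical helper (not part of RAPIDR): return the sample IDs that
# appear in the binned counts file but have no row in the outcomes table.
missing_outcomes <- function(counts.fname, outcomes, id.col = "SampleID") {
  # Read every column as character so an ID such as "001" keeps its zeros;
  # check.names = FALSE tolerates the repeated chromosome names in the header.
  counts <- read.csv(counts.fname, header = TRUE, check.names = FALSE,
                     colClasses = "character")
  count.ids <- counts[[1]]
  setdiff(count.ids, as.character(outcomes[[id.col]]))
}

# Toy binned counts file: two samples, three bins
toy.counts <- tempfile(fileext = ".csv")
writeLines(c("SampleID,chr1,chr1,chr2",
             "001,98,102,110",
             "003,101,97,95"), toy.counts)

# Toy outcomes table that only covers one of the two samples
toy.outcomes <- data.frame(SampleID = c("001", "1003"),
                           Dx = c("Normal", "Normal"),
                           Gender = c("Female", "Male"),
                           stringsAsFactors = FALSE)

unmatched <- missing_outcomes(toy.counts, toy.outcomes)
print(unmatched)
```

If the result is not empty, those are the samples that would be reported as having no outcome. In the first post the outcomes came from the packaged example data (IDs 1000, 1001, ...), so none of the four real samples would be expected to match.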
Old 11-22-2017, 12:24 AM   #3
PandoraMid
Member
 
Location: Italy

Join Date: Nov 2015
Posts: 18

Hi! Thanks for your reply! I have a question: the only sample IDs I set are the ones I gave when creating the binned file. Where can I do this check? Is the second part of my code correct? Can I detect all 3 aneuploidies with this code?

Sorry for the questions, but I'm new to this kind of analysis!
Old 11-22-2017, 12:37 AM   #4
Manonathan
Junior Member
 
Location: india

Join Date: Apr 2015
Posts: 6

Hi,
No problem. I also struggled a lot to understand the workflow.

Create a new text file with three columns, like those in the outcomes data frame.
Use sampleid, Gender and Dx. Fill in the ID, gender and Dx of your own sample files and save it as "outcome.txt". Write Female/Male exactly like that, because the values are case sensitive.
Convert the file into a data frame using
outcome<-read.table("outcome.txt",header=T)

head(outcome)

should look similar to the outcomes data frame.

Use this newly created data frame in your script instead of outcomes.
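
The steps above could be sketched as follows. This assumes the same three column names as the packaged `outcomes` data frame shown in the first post (`SampleID`, `Dx`, `Gender`); the file is a temporary one and the values are the four samples from this thread. One detail worth knowing: `read.table()` guesses column types, so an ID written as `001` becomes the number 1 unless the column is read as character. Whether RAPIDR compares the IDs as text or as numbers is not established in this thread.

```r
# Write a small space-separated outcome file (temporary, for illustration)
outcome.file <- tempfile(fileext = ".txt")
writeLines(c("SampleID Dx Gender",
             "001 Normal Female",
             "003 Normal Female",
             "005 Normal Male",
             "013 Normal Male"), outcome.file)

# Default type guessing turns the IDs into integers (leading zeros are lost)
guessed <- read.table(outcome.file, header = TRUE)

# Reading every column as character keeps the IDs exactly as written
outcome <- read.table(outcome.file, header = TRUE,
                      colClasses = "character")

str(guessed)
str(outcome)
```

Comparing the two `str()` outputs shows which version has IDs that match the `sampleIDs` passed when the binned counts file was made.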
Old 11-22-2017, 12:54 AM   #5
PandoraMid
Member
 
Location: Italy

Join Date: Nov 2015
Posts: 18

So, if I understood correctly, I have to create the outcomes manually and then use this file with the function
Code:
createReferenceSetFromCounts
with my binned file, my outcome file (made from output.txt following your instructions), gcCorrect and the other arguments?

Second question: is it OK if I create my outcomes file with a .csv extension?

Third question: do you use RAPIDR-Plus too?
Old 11-22-2017, 03:14 AM   #6
PandoraMid
Member
 
Location: Italy

Join Date: Nov 2015
Posts: 18

Hi! I tried your tips, but I still can't solve the problem. This is my outcome.txt file:

sampleID Gender Dx
001 Female Normal
003 Female Normal
005 Male Normal
013 Male Normal

Here is the code that I used:

Code:
> makeBinnedCountsFile(bam.file.list=c(bam1, bam2, bam3, bam4), sampleIDs=c("1", "3", "5", "13"), binned.counts.fname="/media/martina/VERBATIM HD/Altamedica/risultati/output_uffa.csv", k=20000)
Binning counts in bam files
doing the binning
Binning done in 42.556
doing the binning
Binning done in 31.375
doing the binning
Binning done in 49.443
doing the binning
Binning done in 35.978
> outcome<-read.table("/media/martina/VERBATIM HD/Altamedica/risultati/outcomes.txt", header=T)
> head(outcome)
  sampleID Gender     Dx
1        1 Female Normal
2        3 Female Normal
3        5   Male Normal
4       13   Male Normal
> ref.set<-createReferenceSetFromCounts(output.binned, outcome)
Loading binned counts file
Checking every sampleID has an outcome
No outcomes for Sample 1
No outcomes for Sample 3
No outcomes for Sample 5
No outcomes for Sample 13
Error in `[.data.frame`(sampleIDs.with.outcomes, , "Gender") : 
  undefined columns selected
I tried using "sampleid" and "SampleID" as the column name, but with the same result. Am I making a mistake somewhere?
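
One thing that can be checked without running RAPIDR is whether the outcome data frame has the same column names as the packaged `outcomes` example, which the first post shows as `SampleID`, `Dx`, `Gender`. The sketch below uses a made-up helper, `check_outcome_columns`, and assumes (without confirmation from the package source) that those exact, case-sensitive names are the ones looked up. It only rules out one possible cause of the error.

```r
# Hypothetical helper (not part of RAPIDR): report which of the expected
# column names are absent from an outcome data frame.
check_outcome_columns <- function(outcome,
                                  expected = c("SampleID", "Dx", "Gender")) {
  missing.cols <- setdiff(expected, names(outcome))
  if (length(missing.cols) > 0) {
    message("Missing columns: ", paste(missing.cols, collapse = ", "))
  }
  missing.cols
}

# Data frame shaped like the head(outcome) output above
outcome <- data.frame(sampleID = c(1, 3, 5, 13),
                      Gender = c("Female", "Female", "Male", "Male"),
                      Dx = "Normal",
                      stringsAsFactors = FALSE)

before.rename <- check_outcome_columns(outcome)

# Rename the ID column to the capitalisation used in the packaged example
names(outcome)[names(outcome) == "sampleID"] <- "SampleID"
after.rename <- check_outcome_columns(outcome)
```

Note also that `output.binned` in the call above is not defined anywhere in the code shown, so it is worth confirming that it holds the path of the binned counts file that was just written.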
Old 05-23-2018, 01:11 AM   #7
pooja.solanki2018
Junior Member
 
Location: India

Join Date: May 2018
Posts: 1

I am getting the same error as above. Can anyone please help me with this issue?

Error in `[.data.frame`(sampleIDs.with.outcomes, , "Gender") :
undefined columns selected

Last edited by pooja.solanki2018; 05-23-2018 at 09:17 PM.
Tags
nipt, r package, rapidr

All times are GMT -8.