Hello,
A few of us in my lab want to analyze some HTS data, from an Illumina GAIIx in this case. Before we start, I would love to hear some advice on how to organize a project like this.
Our idea is to build an R "data package" while using Subversion (SVN) to version-control our scripts. We want to have a well-documented "run-all" script (a vignette file) and individual scripts that each solve a small part of the analysis, and to use the R package framework to document our functions and data tables. The big data files would be excluded from the package. By using an R package to bundle all the result tables (somewhat like creating a zip with all the result files), the biologists in the lab will be able to load the results easily on their own computers -- yup, we've been teaching them the R basics.
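To make the layout concrete, here is a rough sketch of the package skeleton described above, written as a small shell script. The package name `htsResults`, the `rawdata` directory, and the DESCRIPTION field values are placeholders chosen only for illustration; nothing here is settled.

```shell
#!/bin/sh
# Hypothetical package name (an assumption, not decided yet).
PKG=htsResults

# Standard R package directories:
#   R/            documented functions
#   data/         small result tables shipped with the package
#   man/          help pages for functions and data tables
#   vignettes/    the "run-all" document
#   inst/scripts/ the individual analysis scripts
mkdir -p "$PKG/R" "$PKG/data" "$PKG/man" "$PKG/vignettes" "$PKG/inst/scripts"

# Minimal DESCRIPTION file; every value is a placeholder.
cat > "$PKG/DESCRIPTION" <<EOF
Package: $PKG
Type: Package
Title: Result Tables From Our HTS Analysis
Version: 0.0.1
Description: Placeholder description. Functions and result tables only.
License: file LICENSE
EOF

# Placeholder license file, since the project is lab-internal for now.
echo "Internal use only (placeholder)." > "$PKG/LICENSE"

# Big raw files live next to the package sources but are kept out of the
# built package: .Rbuildignore holds regular expressions matched against
# paths relative to the package root.
mkdir -p "$PKG/rawdata"
printf '%s\n' '^rawdata$' > "$PKG/.Rbuildignore"
```

The `rawdata` directory would also need to be left out of version control (for example through the `svn:ignore` property) so the big files never reach the repository.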
While the project could eventually be made public, it will be "lab-eyes-only" for a while. With that in mind, I'm puzzled as to which SVN hosting service to use, since I don't think we are really making "open source software". The other option would be to use a single account on a server and use SVN "locally" (check this). Sadly, using a local server as the SVN repository is complicated for us because the IT people are very restrictive -- they've had bad luck with external attacks.
Any tips from your experience are more than welcome. I found this paper to be quite useful.
Thank you and greetings,
Leonardo
PS I'll be asking on the bioc-sig-sequencing (R) mailing list as well.