UNC Lineberger is at the forefront of cancer research and is closely involved with The Cancer Genome Atlas (TCGA) project. This nationwide research project provides an extraordinary opportunity to study the molecular basis of cancer and to implement cutting-edge computational tools in the process. This position focuses on developing, adapting, and deploying software tools to aid in the analysis of sequence data produced by the Lineberger center. The primary focus is software development to support the SeqWare open source project (http://seqware.sf.net), which the center uses for sample tracking, grid and cloud-based analysis, storing metadata in databases, and querying unstructured and semi-structured data with Hadoop project tools (such as MapReduce, HBase, etc.).
RESPONSIBILITIES
30%: Wrapping software tools as Java objects in a workflow environment.
20%: Developing new software tools to support sequence analysis (includes scripts, Java/C programs, and higher-level analysis tools such as MapReduce/Pig/Hive).
10%: Monitoring the production workflow system to ensure sequence data is processed automatically by workflows.
10%: Web application development to support tracking of analysis results.
10%: Storing both metadata and analysis results in databases for querying by users.
10%: Mirroring external tools, databases, and software packages for use in our system for sequence analysis.
5%: Workflow and tool optimization to reduce computational time required to produce analysis results.
5%: Uploading analysis results and raw data to the project-appropriate repository.
REQUIREMENTS
Education
B.S. degree in computer science or related field.
Required Skills
Proficiency in both written and spoken English with excellent communication skills (presentations, reports, etc.).
Excellent programming skills and the desire to learn new technologies, such as Hadoop/HBase/MapReduce, and to participate in large, open source projects.
Proficiency with a scripting language (Perl desired) and Java.
Proficiency with databases (MySQL, Postgres, etc.) and standard SQL.
A good working knowledge of Linux is required. This includes common command line tools, shell environments such as Bash, and also utilities such as rsync for mirroring datasets.
Excellent interpersonal skills and the ability to work with a diverse group of individuals both locally and remotely.
A dedication to following software development best practices and methods (testing, code reviews, etc.).
Must be detail-oriented and focused.
Desired Skills
Experience with clusters (either traditional grid or Hadoop MapReduce).
Experience with one or more web toolkit environments, such as Spring/Hibernate.
Familiarity with other scripting languages (Python, R, etc.) and compiled languages (C, C++, etc.).
Experience with software packaging in the Linux environment.
Experience with Linux system administration and networking.
Experience with the Globus Toolkit.
Experience with unit testing and optimization.
A basic background in biology, molecular biology, or biochemistry would be helpful.
Experience
1+ years of experience working as a software developer.
Please provide references on request.
To apply for this position, please visit our website: http://hr.unc.edu/careers-at-carolin...ions/index.htm. Please reference Position # 0059907 and Department 4226 when applying. EOE.