  • Data Quality Engineer @ University of Chicago

    We're looking for a problem solver with a background in data integrity and testing to ensure that high-quality data and metadata are distributed to the cancer research community. Elevate your career with this opportunity to work with one of the world's largest collections of harmonized cancer genomic data. This role focuses on the Genomic Data Commons, which is at the forefront of both cutting-edge research and production systems supporting cancer research. You will join a team of engineers developing innovative technologies who will keep you challenged in our dynamic environment as we work together to pursue discovery through data-driven cancer research.

    You will join the team as the lead engineer for data quality and integrity. You will focus on leading data quality efforts related to data integration, higher-level data products, and distribution to the cancer research community. To accomplish this, you will work across multiple teams to build and automate frameworks such as anomaly detection, reporting, and alerting to ensure data quality. You will gain expertise not only in the data itself but also in the systems, in order to interrogate the data and understand gaps in data quality. Data and metadata quality has a broad scope; therefore, you are expected to work collaboratively across teams to determine priorities and the best methods for achieving objectives.

    Key Responsibilities


    Data Quality and Integrity - Drive the design of the data QA infrastructure and the execution of testing protocols to validate pipelines, integrated datasets, and data products. Use a combination of exploratory, regression, and automated testing to ensure data quality standards. Assess appropriate inclusion/exclusion of data based on the defined data dictionary; assist in the evaluation of data dictionaries and use data specifications and code to validate data quality.

    Data Quality Improvement - Proactively identify potential data issues and downstream impact. Identify existing data issues and perform research and root cause analyses to determine resolution. Work collaboratively with software engineers and bioinformaticians to achieve and verify resolution. Establish processes and standards to improve data quality assurance and implement efficiencies in data management. Define measurements and metrics to conduct and present routine data reports to the project team and stakeholders.

    Data Management - Participate in data acquisition and integration planning efforts including data modeling, data dictionary definitions, and data harmonization pipeline development. Develop a deep understanding of multiple genomic datasets and the technical data management software and processes of the underlying system. Define data quality and integrity criteria and develop a comprehensive data quality management plan to lead key data QC efforts through team collaboration for all phases of the data management life cycle.

    Technical Writing - Contribute written knowledge and expertise to system documentation, user documentation, scientific manuscripts, reporting, grant proposals and reports, and presentation materials. Stay abreast of existing and emerging technologies and QC tools in the cancer genomics space.

    Qualifications

    REQUIRED


    Bachelor's degree in Computer Science, Bioinformatics, or relevant engineering or scientific field such as Physics or Genomics required.

    5+ years of experience in a progressive technical business analysis role required.

    Experience with Agile methodology required.

    Experience writing technical specifications, with a focus on full stack architecture, including REST APIs, SQL and NoSQL data solutions, and distributed infrastructure required.

    Experience with business analysis and quality assurance professional standards, business processes, workflows, methodologies and leading practices required.

    Experience leading business analysis activities while ensuring the traceability and optimum coverage of defined business requirements required.

    Experience working in a Linux command line environment required.

    PREFERRED

    PhD in a relevant engineering or scientific field highly preferred.

    Experience in Change Management, Release Management, Incident and Problem Management, and Business Intelligence preferred.

    Experience with HIPAA and/or FISMA security regulations preferred.

    Experience with cancer or human genomics preferred.

    Experience with bioinformatics preferred.

    Experience managing a backlog of requirements in an Agile workflow preferred.

    Experience creating user stories from requirements preferred.

    Experience with JIRA project tracking software preferred.

    About the Genomic Data Commons

    The Genomic Data Commons (GDC) is a comprehensive computational facility to centralize and harmonize cancer genomic data generated from NCI-funded programs. The GDC is the foundation for a genomic precision medicine platform and will enable the development of a knowledge system for cancer. The GDC will provide an open-source, scalable, modern informatics framework that uses community standards to make raw and processed genomic data broadly accessible. This will enable previously infeasible collaborative efforts between scientists.

    About the Center for Data Intensive Science

    The Center for Data Intensive Science at the University of Chicago is developing the emerging field of data science with a focus on applications to problems in biology, medicine, and health care. Our vision is a world in which researchers have ready access to the data and tools required to make discoveries that lead to deeper understanding and improved quality of life. We democratize access, speed discovery, create new knowledge and foster innovation through implementation using data at scale. Our scientific data clouds and commons include the Genomic Data Commons, Bionimbus Protected Data Cloud, and Open Science Data Cloud.

    Apply under Requisition#101319 at jobopportunities.uchicago.edu

