Publication date: Jan 14, 2020
In the first story of this series, we made the case for applying common data governance (DG) and tools in organizations to help with the quality and privacy of consumer data.
To have the best chances of success, we suggested in a second story that organizations should wage the battle for a single source of reference of their data assets through a governed, global data catalog.
This recent study, developed by UK scientists at Cambridge, is the first to combine non-genetic factors (e.g., clinical data, lifestyle) with known genetic ones in a risk prediction model for breast cancer.
Non-genetic risk factors (RFs) in the model are classified as:
At a high level, our data scientists must build a pipeline to find, for each patient, the following data:
In the scenario above, our data scientists can rely on some existing data assets, but need others.
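As a rough illustration of that gap between the data the pipeline needs and the governed assets that already exist, the minimal sketch below checks a list of required data categories against a catalog index. Everything in it is an assumption made for illustration: the `CatalogEntry` type, the asset names, the `governed` flag, and the three categories (taken from the genetic, clinical, and lifestyle factors mentioned above) do not come from any specific catalog product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record; field names are illustrative only."""
    name: str
    category: str   # e.g. "genetic", "clinical", "lifestyle"
    governed: bool  # True if the asset has passed DG review


def split_by_availability(required, catalog):
    """Split required categories into those covered by a governed asset and those not."""
    governed_categories = {entry.category for entry in catalog if entry.governed}
    available = [c for c in required if c in governed_categories]
    missing = [c for c in required if c not in governed_categories]
    return available, missing


# Hypothetical catalog contents and pipeline requirements.
catalog = [
    CatalogEntry("ehr_clinical_records", "clinical", True),
    CatalogEntry("genotype_panel", "genetic", True),
    CatalogEntry("wearable_export", "lifestyle", False),
]
required = ["genetic", "clinical", "lifestyle"]

available, missing = split_by_availability(required, catalog)
print("available:", available)
print("missing:", missing)
```

In this sketch an ungoverned asset counts as missing, so it would have to go through the governance process before the pipeline could use it.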
The DG organization has defined the following policies for such data:
The data catalog helps during the governance requirements phase as follows.
In this blog series we have striven to demonstrate that large organizations, which generally have no single source of truth for decision making, must introduce data governance processes to address the challenge of locating trustworthy data and analytic assets that help develop, inform and influence business initiatives. That ability is what it means for an organization to be data-driven.
This blog has shown that a data catalog is a key technology ingredient in supporting DG, including compliance with privacy regulations such as GDPR.
|disease||MESH||coronary artery disease|
- Dr. Eric Topol: How AI will restore ‘humanity of medicine’ this decade
- Predictive and Precision Medicine with Genomic Data
- AI IN MEDICAL DIAGNOSIS: How US health systems are using AI for diagnostic imaging, clinical decision support, and personalized medicine