A Pan-Canadian Data Governance Framework for Health Synthetic Data
Organization
University of Waterloo
Published
2024
Project Leader(s)
Anindya Sen, Helen Chen, Maura Grossman, Shu-Feng Tsao
Summary
Obtaining access to real-world health data is a significant challenge, mainly due to privacy and security implications. High-quality synthetic data has the potential to expedite research and development of novel technologies. However, synthetic health datasets in Canada are scarce, and no existing synthetic health datasets conform to the Findable, Accessible, Interoperable, and Reusable (FAIR) standards. Moreover, while federated machine learning offers the advantage of protecting patient privacy by not requiring the exchange of source data across nodes, it has yet to be optimized in Canada’s health research environment, and there is limited use of federated learning with synthetic health data. This paper explored the ethical considerations and value proposition of generating and sharing synthetic health data.
Researchers explored a governance framework that could pave the way for a more robust and secure synthetic data ecosystem, enabling the generation of valuable insights that can drive positive health outcomes for Canadians. Further, researchers conducted a scoping review, following the PRISMA guidelines, to understand the status of evaluations and governance of health synthetic data. The results showed that when synthetic health data are generated via proper methods, the risk of privacy leaks is low and data quality is comparable to that of real data. However, health synthetic data have been generated on a case-by-case basis rather than at scale. Furthermore, regulations, ethics, and data-sharing rules for health synthetic data have mostly not been made explicit, although common principles for sharing such data do exist.
Project deliverables are available in the following language(s):
English
Key results are available in the papers at these pages:
- Establishing a FAIR, CARE, and Efficient Synthetic Health Data Sharing Ecosystem for Canada (Full Report)
- Health Synthetic Data to Enable Health Learning System and Innovation (A Scoping Review)
OPC Funded Project
This project received funding support through the Office of the Privacy Commissioner of Canada’s Contributions Program. The opinions expressed in the summary and report(s) are those of the authors and do not necessarily reflect those of the Office of the Privacy Commissioner of Canada. Summaries have been provided by the project authors. Please note that the projects appear in their language of origin.
Contact Information
Anindya Sen
Department of Economics
University of Waterloo
Ontario N2L 3G1
Email: asen@uwaterloo.ca