HDR UK Gateway

Diabetes Core Dataset

Description

The Diabetes Core Dataset is a minimum set of 30 variables considered essential for most diabetes research projects. The dataset is designed to be implementable using routinely collected electronic health record data from primary care, with optional linkage to secondary care datasets such as Hospital Episode Statistics (HES). Variables were selected through expert consensus involving clinicians, researchers, data scientists and patient and public representatives.

Core domains include demographics, diabetes characteristics, clinical measurements, biomarkers, diabetes-related complications, lifestyle factors and medications. The dataset aims to provide a standardised, reproducible framework that balances comprehensiveness against what is feasible with the NHS data available for research, enabling consistent analyses across studies while remaining adaptable to different data sources. In particular, it provides a structured starting point for ingesting diabetes-related data into NHS Secure Data Environments (SDEs), making cross-site data preparation for research more efficient and standardised.
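As an illustration only, the core domains listed above could be used to check whether an extract from a given data source covers every domain. The sketch below is a minimal example under stated assumptions: the seven domain names are taken from this description, but the function `missing_domains` and every variable name in `example_extract` are hypothetical placeholders and do not represent the dataset's actual 30 variables.

```python
# Domain names are taken from the dataset description.
# Everything else (function name, variable names) is hypothetical.
CORE_DOMAINS = (
    "demographics",
    "diabetes characteristics",
    "clinical measurements",
    "biomarkers",
    "diabetes-related complications",
    "lifestyle factors",
    "medications",
)


def missing_domains(variable_to_domain):
    """Return the core domains with no variable mapped to them, in domain order."""
    covered = set(variable_to_domain.values())
    unknown = covered - set(CORE_DOMAINS)
    if unknown:
        raise ValueError(f"unknown domain(s): {sorted(unknown)}")
    return [domain for domain in CORE_DOMAINS if domain not in covered]


# Hypothetical extract from one data source: variable name -> domain.
example_extract = {
    "sex": "demographics",
    "age_at_diagnosis": "diabetes characteristics",
    "bmi": "clinical measurements",
    "hba1c": "biomarkers",
}

print(missing_domains(example_extract))
# → ['diabetes-related complications', 'lifestyle factors', 'medications']
```

A check of this kind could run before cross-site analyses to flag which domains a site has not yet populated.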

The Diabetes Core Dataset differs from other datasets such as the National Diabetes Audit in that it has been developed specifically for research applications rather than service evaluation. It is designed as a minimum dataset that can be linked to specialist registries, audits, and external data sources to address more specific research questions.
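The linkage described above can be sketched as a left join from the core dataset to an external source on a shared identifier. This is a simplified illustration, not the project's implementation: the key `patient_id`, the field names, and the assumption of at most one external record per person are all hypothetical.

```python
def link_records(core_rows, external_rows, key="patient_id"):
    """Left-join external rows onto core rows using a shared key.

    Assumes at most one external row per key (hypothetical simplification).
    Every core row is kept; core values win if field names clash.
    """
    external_by_key = {row[key]: row for row in external_rows}
    linked = []
    for row in core_rows:
        extra = external_by_key.get(row[key], {})
        linked.append({**extra, **row})
    return linked


# Hypothetical rows; identifiers and fields are placeholders.
core = [
    {"patient_id": "p1", "hba1c": 58},
    {"patient_id": "p2", "hba1c": 49},
]
registry = [
    {"patient_id": "p1", "registry_flag": True},
]

for record in link_records(core, registry):
    print(record)
```

Keeping every core row, whether or not it has an external match, preserves the minimum dataset as the analysis base, with external fields added only where linkage succeeds.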

Results/Insights

Components of the Diabetes Core Dataset have been implemented in primary care data within the Kent, Medway and Sussex (KMS) Secure Data Environment (SDE), and in secondary care (Epic) data at the Royal Devon University Healthcare NHS Foundation Trust (RDUH). This work has produced standardised SQL scripts that support the initial steps of defining the core dataset. The scripts are openly available in the project's GitHub repository (https://github.com/Exeter-Diabetes/sde-sw-framework), supporting transparency, reproducibility, and reuse across other Secure Data Environments.

Details

License

Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Last Updated

20/05/2026

Associated Authors

Katherine G Young, Andrew P McGovern, John M Dennis