Identifying recurrent breast cancer patients in national health registries using machine learning

8 pages


Publisher
Taylor & Francis
Copyright
© 2023 Acta Oncologica Foundation
ISSN
0284-186X
eISSN
1651-226X
DOI
10.1080/0284186X.2023.2201687

Abstract

Background: More than 4500 women are diagnosed with breast cancer each year in Denmark. Despite adequate treatment, however, 10–30% of patients will experience a recurrence. The Danish Breast Cancer Group (DBCG) stores information on breast cancer recurrence but, to improve data completeness, automated identification of patients with recurrence is needed.

Methods: We included patient data from the DBCG, the National Pathology Database, and the National Patient Registry for patients with an invasive breast cancer diagnosis after 1999. In total, relevant features of 79,483 patients with a definitive surgery were extracted. A machine learning (ML) model was trained, using a simplistic encoding scheme of features, on a development sample covering 5333 patients with known recurrence and three times as many non-recurrent women. The model was validated in a validation sample consisting of 1006 patients with unknown recurrence status.

Results: The ML model identified patients with recurrence with an AUC-ROC of 0.93 (95% CI: 0.93–0.94) in the development sample and an AUC-ROC of 0.86 (95% CI: 0.83–0.88) in the validation sample.

Conclusion: An off-the-shelf ML model, trained using the simplistic encoding scheme, could identify recurrence patients across multiple national registries. This approach might enable researchers and clinicians to identify patients with recurrence better and faster, and reduce manual patient data interpretation.
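To make the reported metric concrete, the following is a minimal sketch of how an AUC-ROC and a 95% confidence interval can be computed from true labels and model scores. The abstract does not state how the intervals were obtained; the percentile bootstrap used here is an assumption, and the function names `auc_roc` and `bootstrap_auc_ci` as well as the toy data are hypothetical and not taken from the paper.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC via the Mann-Whitney U statistic; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc_roc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 1 = recurrence, 0 = no recurrence (illustrative values only).
labels = [0, 0, 1, 1, 0, 1, 0, 0]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.30, 0.05]
point = auc_roc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC-ROC {point:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

An AUC-ROC of 0.5 corresponds to ranking recurrent and non-recurrent patients no better than chance, and 1.0 to perfect separation, which is the scale on which the development (0.93) and validation (0.86) results above should be read.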

Journal

Acta Oncologica, Taylor & Francis

Published: Apr 3, 2023

Keywords: Breast cancer; machine learning; recurrence
