CatBoost Machine Learning Model for Thrombosis Risk Prediction in Critically Ill Cancer Patients: A MIMIC-IV Database Study.
Original Publication: New York, N.Y. : Raven Press, c1995-
Objective: To develop and validate a robust machine learning-based prediction model for assessing the risk of thrombotic events in critically ill cancer patients during their ICU stay.
Methods: This retrospective observational study utilized data from 1892 cancer patients in the MIMIC-IV database for model development and internal validation. A stringent data preprocessing pipeline was applied, including multiple imputation for missing data, exclusion of outliers, and the use of the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance. Feature importance was evaluated using SHAP, leading to the selection of six key predictors. Nine machine learning models were constructed and compared. Model performance was assessed using the area under the curve (AUC), F1-score, recall, Matthews correlation coefficient (MCC), accuracy, and specificity. The optimal model was selected, calibrated, and interpreted using SHAP. Its clinical utility was further evaluated via calibration curves and decision curve analysis (DCA). Finally, external validation was performed on an independent dataset of 200 patients from our institution.
Results: The CatBoost model demonstrated superior performance. In internal validation, the calibrated model achieved an AUC of 0.855 (95% CI: 0.797-0.913), with a sensitivity of 0.971 and a specificity of 0.753 at an optimal threshold of 0.245. In external validation, the model maintained strong performance with an AUC of 0.83 (95% CI: 0.742-0.918), sensitivity of 0.968, and specificity of 0.698. SHAP analysis identified "history of thrombosis" as the most influential predictor. Decision curve analysis confirmed the model's clinical utility across a wide risk threshold range (0.25-0.75). The final model was deployed as an online platform to facilitate real-time, individualized risk assessment.
Conclusion: The developed CatBoost model exhibits excellent discriminatory power, good calibration, and favorable clinical interpretability for predicting thrombosis risk in critically ill cancer patients. It serves as a promising and reliable clinical decision support tool to guide personalized thromboprophylaxis and improve patient outcomes.
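The threshold-dependent metrics named in the Methods (recall/sensitivity, specificity, accuracy, F1-score, MCC) and the net benefit that underlies decision curve analysis can be illustrated with a minimal sketch. This is not the authors' code: the function names and the ten-patient toy data are hypothetical and purely synthetic, and only the threshold value 0.245 is taken from the abstract. Net benefit is computed with the standard formula TP/n - (FP/n) * pt/(1 - pt), where pt is the risk threshold.

```python
import math


def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, tn, fn), predicting positive when prob >= threshold."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def threshold_metrics(y_true, y_prob, threshold):
    """Compute threshold-dependent metrics and DCA net benefit (threshold < 1)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    # Net benefit weighs false positives by the odds of the risk threshold.
    net_benefit = tp / n - (fp / n) * threshold / (1 - threshold)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / n,
        "f1": f1,
        "mcc": mcc,
        "net_benefit": net_benefit,
    }


# Synthetic toy data (hypothetical, NOT from the study): 3 events, 7 non-events.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.90, 0.60, 0.30, 0.50, 0.20, 0.10, 0.20, 0.10, 0.05, 0.30]

# 0.245 is the optimal threshold reported in the abstract.
metrics = threshold_metrics(y_true, y_prob, threshold=0.245)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Evaluating `threshold_metrics` over a grid of thresholds and plotting `net_benefit` against the threshold would give a decision curve of the kind the abstract describes for the 0.25-0.75 range.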
Declaration of Conflicting Interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.