Journal of Practical Hepatology ›› 2026, Vol. 29 ›› Issue (2): 199-204. doi: 10.3969/j.issn.1672-5069.2026.02.010

• Nonalcoholic fatty liver disease •

Establishment and validation of a risk prediction model for fatty liver disease in health checkup individuals*

Zhang Shuizhu, Ding Menghan, Zhou Shuping

  1. Department of Gastroenterology, Graduate School, Bengbu Medical University, Bengbu 233000, Anhui Province, China (Zhang Shuizhu, Ding Menghan); Department of Gastroenterology, First Affiliated Hospital of Anhui University of Science and Technology (Zhou Shuping)
  • Received: 2025-08-18 Online: 2026-03-10 Published: 2026-03-13
  • Corresponding author: Zhou Shuping, E-mail: hnyyzsp@126.com
  • About the author: Zhang Shuizhu, female, 26 years old, master's student, physician. Research interest: diagnosis and treatment of digestive system diseases. E-mail: 2446441013@qq.com
  • Funding:
    *Clinical Medical Research Translation Special Project of the Anhui Provincial Department of Science and Technology (No. 202304295107020036)

Abstract: Objective To build and validate an accurate, low-cost early prediction model for fatty liver disease from routine indicators available in health checkup centers. Methods A retrospective cohort of 1212 individuals who underwent health checkups at the checkup center of the First Affiliated Hospital of Anhui University of Science and Technology between February and May 2024 was analyzed. Fatty liver was diagnosed by ultrasonography, and various indices were calculated from routine clinical measurements. Nested cross-validation (10-fold outer loop for validation, 5-fold inner loop for tuning) was combined with random forest and XGBoost for feature selection, and LASSO logistic regression was used to build the model. Variable importance was interpreted with SHAP values. Internal validation used the bootstrap (1000 iterations) to obtain an optimism-corrected AUC, and external validation was performed on a randomly selected 30% of the data. Results Of the 1212 individuals, 542 (44.7%) had fatty liver. The final model retained four variables: triglyceride-glucose-body mass index (TyG-BMI), body fat percentage, diastolic blood pressure, and the monocyte to high-density lipoprotein cholesterol ratio (MHR). The AUC was 0.874 (95% CI: 0.855-0.893) in nested cross-validation, 0.880 (95% CI: 0.861-0.898) for the final model, 0.878 (95% CI: 0.860-0.897) after optimism correction, and 0.866 (95% CI: 0.830-0.902) in external validation. Calibration and stability were good (calibration slope ≈ 1; Hosmer-Lemeshow P = 0.433), and the model was robust to 30% Gaussian noise (AUC = 0.878). SHAP analysis identified TyG-BMI as the largest contributor. Conclusion The four-variable model shows high discrimination, good calibration, and strong generalizability, and its inputs are easy to obtain; it might offer health checkup centers a "precise, efficient and low-cost" screening tool for fatty liver disease.
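
The abstract names TyG-BMI and MHR as model inputs without stating their formulas. The sketch below assumes the definitions commonly used in the literature: TyG = ln[TG (mg/dL) × FPG (mg/dL) / 2], TyG-BMI = TyG × BMI, and MHR = monocyte count / HDL-C. The function names, the input units, and the mmol/L to mg/dL conversion factors are assumptions for illustration, not details taken from the paper.

```python
import math

# Conventional unit-conversion factors (assumed; the abstract does not state units).
TG_MMOL_TO_MGDL = 88.57    # triglycerides: 1 mmol/L is about 88.57 mg/dL
GLU_MMOL_TO_MGDL = 18.02   # glucose: 1 mmol/L is about 18.02 mg/dL


def tyg_index(tg_mgdl: float, fpg_mgdl: float) -> float:
    """Triglyceride-glucose index: ln(TG [mg/dL] * FPG [mg/dL] / 2)."""
    return math.log(tg_mgdl * fpg_mgdl / 2.0)


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def tyg_bmi(tg_mgdl: float, fpg_mgdl: float, weight_kg: float, height_m: float) -> float:
    """TyG-BMI: the TyG index multiplied by BMI."""
    return tyg_index(tg_mgdl, fpg_mgdl) * bmi(weight_kg, height_m)


def mhr(monocytes_1e9_per_l: float, hdl_c_mmol_l: float) -> float:
    """Monocyte to HDL-C ratio; the value depends on the units chosen for each."""
    return monocytes_1e9_per_l / hdl_c_mmol_l


if __name__ == "__main__":
    # Hypothetical checkup record reported in mmol/L, converted before use.
    tg = 1.7 * TG_MMOL_TO_MGDL
    fpg = 5.5 * GLU_MMOL_TO_MGDL
    print(f"TyG-BMI = {tyg_bmi(tg, fpg, weight_kg=70, height_m=1.75):.1f}")
    print(f"MHR     = {mhr(0.5, 1.25):.2f}")
```

Because MHR is unit-dependent, values are only comparable across studies that use the same units for monocyte count and HDL-C.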

Key words: Fatty liver, Machine learning prediction model, Triglyceride-glucose-body mass index, Health checkup population
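
The bootstrap optimism correction mentioned in the abstract can be sketched as follows. This is a minimal, standard-library illustration of the general procedure (refit the model on each bootstrap resample, compare its AUC on the resample with its AUC on the original data, and subtract the mean difference from the apparent AUC). It is not the authors' code: the toy mean-difference scorer stands in for their LASSO logistic regression, the data are synthetic, and all function names are hypothetical.

```python
import random
from statistics import mean


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_mean_difference(X, y):
    """Toy stand-in model: linear score with weights = class-mean differences."""
    weights = []
    for j in range(len(X[0])):
        m1 = mean(row[j] for row, label in zip(X, y) if label == 1)
        m0 = mean(row[j] for row, label in zip(X, y) if label == 0)
        weights.append(m1 - m0)
    return lambda row: sum(w * x for w, x in zip(weights, row))


def optimism_corrected_auc(fit, X, y, n_boot=200, seed=0):
    """Return (apparent AUC, optimism-corrected AUC) for any fit(X, y) -> scorer."""
    rng = random.Random(seed)
    n = len(y)
    model = fit(X, y)
    apparent = auc([model(row) for row in X], y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:                         # need both classes to fit and score
            continue
        Xb = [X[i] for i in idx]
        boot_model = fit(Xb, yb)
        auc_boot = auc([boot_model(row) for row in Xb], yb)  # on the resample
        auc_orig = auc([boot_model(row) for row in X], y)    # on the original data
        optimism.append(auc_boot - auc_orig)
    return apparent, apparent - mean(optimism)


def make_toy_data(n=120, seed=1):
    """Synthetic data: one informative feature plus three pure-noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(1.0 * label, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(3)])
        y.append(label)
    return X, y


X, y = make_toy_data()
apparent, corrected = optimism_corrected_auc(fit_mean_difference, X, y, n_boot=100)
print(f"apparent AUC = {apparent:.3f}, optimism-corrected AUC = {corrected:.3f}")
```

The `fit` argument is deliberately generic, so an L1-penalized logistic regression could be passed in its place. The study reports 1000 bootstrap iterations; the demo uses 100 only to keep the run short.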