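A Comparative Study of Multiple Machine Learning and Backpropagation Neural Network Models for Classifying Traditional Chinese Medicine Tonifying Prescriptions*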

Authors: Ruan Kaixia (阮开霞)1, Li Bing (李冰)2, Ran Yu (冉宇)3, Zheng Fengjie (郑丰杰)4, Yao Shunyu (姚舜宇)4

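Affiliations: 1. Dongfang Hospital, Second Clinical Medical College, Beijing University of Chinese Medicine, Beijing 100078; 2. School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100029; 3. School of Life Sciences, Beijing University of Chinese Medicine, Beijing 102400; 4. School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029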

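Citation: Ruan Kaixia, Li Bing, Ran Yu, Zheng Fengjie, Yao Shunyu. A comparative study of multiple machine learning and backpropagation neural network models for classifying traditional Chinese medicine tonifying prescriptions [J]. 中医药导报 (Guiding Journal of Traditional Chinese Medicine and Pharmacy), 2025, 31(7): 232-237.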

DOI:10.13862/j.cn43-1446/r.2025.07.040

Abstract: Objective: To construct classification models for traditional Chinese medicine (TCM) tonifying prescriptions using multiple machine learning algorithms and a backpropagation neural network (BPNN), in support of automated classification of these prescriptions. Methods: TCM theory was combined with modern data science methods. Data on TCM tonifying prescriptions were collected and organized, and medicinal-property features (four natures, five flavors, and meridian tropism) were selected for model training and validation. Support vector machine (SVM), stochastic gradient descent (SGD), K-nearest neighbor (KNN), random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and BPNN algorithms were used to build models that classify prescriptions into four categories: Qi-tonifying, blood-tonifying, Yin-tonifying, and Yang-tonifying. SHapley Additive exPlanations (SHAP) was used to interpret and analyze feature importance. Results: A total of 174 samples were compiled: 43 Qi-tonifying, 45 blood-tonifying, 50 Yin-tonifying, and 36 Yang-tonifying prescriptions. After training, the F1-Scores on the training set were SVM 0.5518, SGD 0.7456, KNN 1.0000, RF 0.8932, XGBoost 1.0000, LightGBM 0.9141, and BPNN 0.7399. The F1-Scores on the validation set were SVM 0.5540, SGD 0.8463, KNN 0.7783, RF 0.7105, XGBoost 0.5521, LightGBM 0.7101, and BPNN 0.7709. The main features identified by the SHAP model interpreter were kidney meridian, spleen meridian, lung meridian, and sweet flavor. Conclusion: This study constructed multiple classification models for tonifying prescriptions with high predictive accuracy, among which the BPNN model showed the most balanced overall classification performance of the seven models. The nature, flavor, and meridian-tropism features identified by the SHAP model interpreter are consistent with the characteristics of TCM tonifying prescriptions and can help explain their classification.

Key words: tonifying prescriptions; prescription classification; machine learning; backpropagation neural network; model
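
The training and validation F1-Scores reported in the abstract can be compared side by side. The sketch below is not the authors' code. It only tabulates the figures given in the abstract and computes, for each model, the difference between training and validation F1. That gap is an illustrative measure chosen here as an assumption; the abstract does not define how "balanced" performance was assessed. The names `REPORTED_F1`, `CLASS_COUNTS`, `generalization_gaps`, and `rank_by_validation` are hypothetical.

```python
# F1-Scores exactly as reported in the abstract: (training, validation).
REPORTED_F1 = {
    "SVM": (0.5518, 0.5540),
    "SGD": (0.7456, 0.8463),
    "KNN": (1.0000, 0.7783),
    "RF": (0.8932, 0.7105),
    "XGBoost": (1.0000, 0.5521),
    "LightGBM": (0.9141, 0.7101),
    "BPNN": (0.7399, 0.7709),
}

# Sample counts per category, as reported in the abstract.
CLASS_COUNTS = {"Qi": 43, "blood": 45, "Yin": 50, "Yang": 36}


def generalization_gaps(scores):
    """Return {model: training F1 - validation F1}, rounded to 4 decimals.

    A large positive gap suggests overfitting. This gap is an illustrative
    assumption, not a metric taken from the paper.
    """
    return {name: round(train - val, 4) for name, (train, val) in scores.items()}


def rank_by_validation(scores):
    """Return model names sorted by validation F1, highest first."""
    return sorted(scores, key=lambda name: scores[name][1], reverse=True)


if __name__ == "__main__":
    gaps = generalization_gaps(REPORTED_F1)
    for name in rank_by_validation(REPORTED_F1):
        train, val = REPORTED_F1[name]
        print(f"{name:<9} train={train:.4f}  val={val:.4f}  gap={gaps[name]:+.4f}")
    print("total samples:", sum(CLASS_COUNTS.values()))
```

On these figures, XGBoost and KNN fit the training set perfectly but lose 0.4479 and 0.2217 on validation, SGD has the highest validation F1 (0.8463), and BPNN's training and validation scores differ by 0.0310.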

Published online: 2026-01-06
