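A Comparative Study of Multiple Machine Learning and Backpropagation Neural Network Models for Classifying Tonifying Prescriptions in Traditional Chinese Medicine*
Authors: Ruan Kaixia (阮开霞)1, Li Bing (李冰)2, Ran Yu (冉宇)3, Zheng Fengjie (郑丰杰)4, Yao Shunyu (姚舜宇)4
Affiliations: 1. Dongfang Hospital, Second Clinical Medical College, Beijing University of Chinese Medicine, Beijing 100078; 2. School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100029; 3. School of Life Sciences, Beijing University of Chinese Medicine, Beijing 102400; 4. School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029
Citation: Ruan Kaixia, Li Bing, Ran Yu, Zheng Fengjie, Yao Shunyu. A comparative study of multiple machine learning and backpropagation neural network models for classifying tonifying prescriptions in traditional Chinese medicine[J]. 中医药导报 (Guiding Journal of Traditional Chinese Medicine and Pharmacy), 2025, 31(7): 232-237.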
DOI:10.13862/j.cn43-1446/r.2025.07.040
Abstract:
Objective: To construct classification models for traditional Chinese medicine (TCM) tonifying prescriptions using multiple machine learning algorithms and a backpropagation neural network (BPNN), in support of the automated classification of these prescriptions. Methods: Combining TCM theory with modern data science methods, data on TCM tonifying prescriptions were collected and curated, and medicinal-property features such as the four natures, five flavors, and meridian tropism were selected for model training and validation. Support vector machine (SVM), stochastic gradient descent (SGD), K-nearest neighbor (KNN), random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and BPNN algorithms were used to build prescription classification models for four categories: Qi-tonifying, blood-tonifying, Yin-tonifying, and Yang-tonifying. SHapley Additive exPlanations (SHAP) was used to interpret and analyze feature importance. Results: A total of 174 samples were compiled: 43 Qi-tonifying, 45 blood-tonifying, 50 Yin-tonifying, and 36 Yang-tonifying prescriptions. After training, the F1-Scores on the training set were SVM 0.5518, SGD 0.7456, KNN 1.0000, RF 0.8932, XGBoost 1.0000, LightGBM 0.9141, and BPNN 0.7399. The F1-Scores on the validation set were SVM 0.5540, SGD 0.8463, KNN 0.7783, RF 0.7105, XGBoost 0.5521, LightGBM 0.7101, and BPNN 0.7709. The main features identified by the SHAP model interpreter were the kidney meridian, spleen meridian, lung meridian, and sweet flavor. Conclusion: This study built several classification models for tonifying prescriptions with high predictive accuracy, among which the BPNN model showed the most balanced overall classification performance of the seven models. The natures, flavors, and meridian tropisms highlighted by the SHAP model interpreter are consistent with the characteristics of TCM tonifying prescriptions and can help explain their classification.
Key words: tonifying prescriptions; prescription classification; machine learning; backpropagation neural network; model
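The F1-Scores reported in the abstract can be compared side by side with a short script. The following is only a sketch for reading the published numbers, not part of the authors' method: it ranks the seven models by validation F1 and computes the difference between training and validation F1, where a large positive difference is commonly read as a sign of overfitting. The names `reported_f1`, `f1_gap`, `gaps`, and `by_validation` are hypothetical and introduced here for illustration. The abstract does not state how F1 was averaged across the four classes or which criterion defines "most balanced", so the gap is just one possible way to look at the results.

```python
# F1-Scores as reported in the abstract: (training set, validation set).
reported_f1 = {
    "SVM": (0.5518, 0.5540),
    "SGD": (0.7456, 0.8463),
    "KNN": (1.0000, 0.7783),
    "RF": (0.8932, 0.7105),
    "XGBoost": (1.0000, 0.5521),
    "LightGBM": (0.9141, 0.7101),
    "BPNN": (0.7399, 0.7709),
}


def f1_gap(train_f1, val_f1):
    """Training F1 minus validation F1, rounded to four decimals.

    A large positive value suggests the model fits the training set
    much better than it generalizes (an assumption of this sketch,
    not a criterion stated by the authors).
    """
    return round(train_f1 - val_f1, 4)


# Gap for every model, keyed by model name.
gaps = {name: f1_gap(tr, va) for name, (tr, va) in reported_f1.items()}

# Model names ordered by validation F1, highest first.
by_validation = sorted(
    reported_f1, key=lambda name: reported_f1[name][1], reverse=True
)

for name in by_validation:
    train_f1, val_f1 = reported_f1[name]
    print(f"{name:<9} train={train_f1:.4f} val={val_f1:.4f} gap={gaps[name]:+.4f}")
```

Read this way, KNN and XGBoost both reach a training F1 of 1.0000 but drop noticeably on the validation set, while SVM, SGD, and BPNN do not score higher on training than on validation.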
Published: 2026-01-06