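Quality Evaluation of Machine Translation of Traditional Chinese Medicine Terminology in the Era of Artificial Intelligence: A Case Study of ChatGPT-4 and Google Translate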

Authors: Ren Yuxin (任禹昕)1, Chen Zijie (陈子杰)2, Lin Yongzhen (林咏臻)1, Liu Ping (刘平)1

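Affiliations: 1. School of Humanities, Beijing University of Chinese Medicine, Beijing 102488; 2. School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 102488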

DOI:10.13862/j.cn43-1446/r.2025.12.045

Abstract: Objective: To evaluate the performance of large language models (LLMs), such as ChatGPT-4, and traditional neural machine translation tools, such as Google Translate, in translating traditional Chinese medicine (TCM) terminology, and to explore a human-machine collaborative strategy for TCM translation. Methods: A semi-automatic machine translation evaluation method was adopted. The quality of TCM terminology translations produced by ChatGPT-4 and Google Translate was systematically assessed by combining three automatic evaluation metrics (BLEU, TER, and METEOR) with expert manual scoring. An additional experiment tested whether prompt engineering improves the quality of TCM terminology translation. Results: ChatGPT-4 significantly outperformed Google Translate on all three automatic metrics (BLEU, TER, and METEOR). ChatGPT-4 also scored higher than Google Translate in the manual evaluation, most notably in preserving cultural connotations and adapting to context. The prompt tests showed that optimizing prompts can improve the translation quality of ChatGPT-4. Conclusion: LLMs are the better machine translation tools for supporting TCM translation. They show strong domain robustness, interactivity, in-context learning ability, instruction-following ability, and complex reasoning ability, and they handle metaphorical expressions and culture-loaded words in TCM more effectively. Optimizing prompts can effectively improve the quality of TCM translation by LLMs.

Keywords: machine translation; neural machine translation; large language models; traditional Chinese medicine terminology; translation quality assessment

Published: 2026-01-09
