Article Abstract
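Pan Xuefeng, Chen Jie, Wang Chao. Research on Book Classification Optimization Method Driven by Multimodal Fusion of Large Language Models [J]. Technology Intelligence Engineering (情报工程), 2025, 11(3): 015-026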
Research on Book Classification Optimization Method Driven by Multimodal Fusion of Large Language Models
  
DOI:
Keywords: Unimodal; Multimodal; Large Language Model; Book Classification
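Funding: 2023 Ministry of Education Humanities and Social Sciences Research Planning Fund project "Research on the Mechanism by Which 'Artificial Intelligence-Driven Scientific Research' Influences the Construction of Research Data Infrastructure" (23YJA870002).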
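Author  Affiliation
Pan Xuefeng  Library of Liaoning University of Technology, Jinzhou 121000
Chen Jie  Library of Guilin University of Technology, Guilin 541000
Wang Chao  Library of Liaoning University of Technology, Jinzhou 121000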
Abstract:
      [Objective/Significance] To explore a large language model (LLM) approach to book classification based on multimodal data and to address the bottlenecks of traditional book classification methods. [Methods/Process] A unimodal dataset and a multimodal dataset were constructed, prompts were built using a few-shot learning strategy, and the book classification results of GPT-4 and DeepSeek were compared. A team of human catalogers was established as a reference baseline, and four experiments were conducted, each quantifying model performance through five control groups. [Results/Conclusions] In unimodal classification, the average accuracy of DeepSeek (80.69%) was significantly higher than that of GPT-4 (64.20%). Adding information to the prompt improved DeepSeek's accuracy, whereas GPT-4's performance fluctuated because of information redundancy. In the multimodal setting, DeepSeek outperformed GPT-4 on precision (94.5%), recall (84.0%), and F1 score (88.9%), and senior catalogers working together with DeepSeek achieved an accuracy above 99.0%. Through multimodal collaborative representation and a dynamic knowledge distillation mechanism, large language models significantly improve book classification performance.
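
As a quick consistency check on the multimodal figures reported in the abstract, the sketch below recomputes the F1 score from the stated precision and recall. It assumes the standard definition of F1 as the harmonic mean of precision and recall. The abstract does not say how the metrics were averaged across classes, so this is only a sanity check and not the authors' evaluation procedure. The function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for DeepSeek in the multimodal setting.
precision = 0.945
recall = 0.840

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.1%}")  # prints "F1 = 88.9%"
```

The recomputed value agrees with the 88.9% reported in the abstract, which suggests the three figures are mutually consistent under the standard definition.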