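Du Ruopeng, Zhang Jie, Kou Yuantao. Improvement of Topic Annotation Method for Professional Science and Technology Information Platform User Profile Based on Co-Occurrence Word Analysis[J]. Digital Library Forum, 2023, (9): 58-63 |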
Improvement of Topic Annotation Method for Professional Science and Technology Information Platform User Profile Based on Co-Occurrence Word Analysis |
Submitted: 2023-06-16 |
DOI:10.3772/j.issn.1673-2286.2023.09.007 |
Keywords: Technology Information Platform; Agriculture; User Profile; Information Recommendation; Co-Occurrence Word Analysis; TextRank |
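Funding: This research was supported by the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (No. 2021ZD0113705) and the 2023 Fundamental Research Funds for Public-Interest Research Institutes of the Agricultural Information Institute, Chinese Academy of Agricultural Sciences (No. JBYW-AII-2023-24). |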
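Author | Affiliation |
Du Ruopeng | Agricultural Information Institute, Chinese Academy of Agricultural Sciences / Key Laboratory of Knowledge Mining and Knowledge Services in Agricultural Converging Publishing, National Press and Publication Administration / Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs |
Zhang Jie | Agricultural Information Institute, Chinese Academy of Agricultural Sciences / Key Laboratory of Knowledge Mining and Knowledge Services in Agricultural Converging Publishing, National Press and Publication Administration / Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs |
Kou Yuantao | Agricultural Information Institute, Chinese Academy of Agricultural Sciences / Key Laboratory of Knowledge Mining and Knowledge Services in Agricultural Converging Publishing, National Press and Publication Administration / Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs |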
Abstract views: 721 |
Full-text downloads: 587 |
Abstract: |
Against a background of massive information, user profiling is an effective tool for delivering accurate recommendation services to users. The key step in building user profiles for scientific and technical information is extracting topic words from the literature that users follow. The quality of this extraction directly affects the accuracy of both the user profile and the content recommendations built on it. Commonly used topic word extraction methods suffer from sparse high-dimensional feature representations, poor generalization, and limited usability, so this paper proposes a topic feature extraction method based on text co-occurrence word analysis and the TextRank algorithm. The method is applied to the literature that users of an agricultural science and technology information platform follow and browse; the resulting core feature words serve as the annotated topic words of the user profile, and a user topic recommendation expression is constructed from them to verify literature recommendation performance. With this method, literature recommendation accuracy reaches 93.3%, significantly higher than the high-frequency word method (70.4%), co-occurrence word analysis (74.1%), and the TextRank algorithm (77.8%), indicating that the improved topic word extraction method has good application prospects in agricultural information user profiling and information recommendation services. |
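The abstract names co-occurrence word analysis and TextRank but does not spell out how the two are combined. As a rough illustration only, the following Python sketch shows one generic way to rank words with TextRank over a co-occurrence graph. The function names (`cooccurrence_graph`, `textrank`, `top_keywords`), the window size, the damping factor, and the sample documents are all assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def cooccurrence_graph(docs, window=3):
    """Build an undirected weighted graph from tokenized documents.

    Edge weight = number of times two distinct words occur within
    `window` consecutive tokens of each other (assumed definition).
    """
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in docs:
        for i, a in enumerate(tokens):
            for b in tokens[i + 1:i + window]:
                if a != b:
                    graph[a][b] += 1.0
                    graph[b][a] += 1.0
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Run weighted TextRank (PageRank-style) score propagation."""
    scores = {word: 1.0 for word in graph}
    # Total edge weight leaving each word, used to normalize its votes.
    out_weight = {word: sum(nbrs.values()) for word, nbrs in graph.items()}
    for _ in range(iterations):
        updated = {}
        for word in graph:
            rank = sum(
                graph[nbr][word] / out_weight[nbr] * scores[nbr]
                for nbr in graph[word]
            )
            updated[word] = (1 - damping) + damping * rank
        scores = updated
    return scores


def top_keywords(docs, k=3, window=3):
    """Return the k highest-scoring words; ties are broken alphabetically."""
    scores = textrank(cooccurrence_graph(docs, window))
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


if __name__ == "__main__":
    # Hypothetical, already-tokenized titles of documents a user browsed.
    browsed = [
        ["wheat", "drought", "resistance", "gene"],
        ["wheat", "yield", "drought", "stress"],
        ["maize", "drought", "resistance", "breeding"],
    ]
    print(top_keywords(browsed, k=3))
```

In a setting like the one described, the top-ranked words would play the role of candidate topic labels for a user profile. Real Chinese-language input would first need word segmentation and stop-word filtering, which this sketch leaves out.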