Research on Image and Text Retrieval Method Based on Feature Vectors of Heterogeneous Data

Luo Youlong (骆有隆); Zhu Huiyu (朱卉钰); Liang Songyu (梁松宇); Zhang Teng (张腾)

Article Abstract

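Luo Youlong, Zhu Huiyu, Liang Songyu, Zhang Teng. Research on Image and Text Retrieval Method Based on Feature Vector of Heterogeneous Data [J]. 情报工程, 2021, 7(4): 027-039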

Research on Image and Text Retrieval Method Based on Feature Vector of Heterogeneous Data

DOI: 10.3772/j.issn.2095-915X.2021.04.003

Chinese keywords: cross-media (跨媒体); image-text matching (图文匹配); retrieval (检索); fusion (融合)

English keywords: cross-media; image-text matching; retrieval; integration

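Funding: Open Fund of the Key Laboratory of Rich-Media Knowledge Organization and Service of Digital Publishing Content, "Research on Entity Relation Extraction and Reasoning for Rich-Media Text and Image Data in a Cloud Computing Environment" (ZD2020-09/05); Ministry of Education Industry-University Cooperative Education Project, "Research and Design of an Intelligent Manufacturing Simulation System" (201901083006).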

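Author	Affiliation
Luo Youlong (骆有隆)	1. School of Management, Wuhan University of Technology, Wuhan 430070; 2. Key Laboratory of Rich-Media Knowledge Organization and Service of Digital Publishing Content, Beijing 100038
Zhu Huiyu (朱卉钰)	1. School of Management, Wuhan University of Technology, Wuhan 430070
Liang Songyu (梁松宇)	3. School of Computer Science, Wuhan University of Technology, Wuhan 430070
Zhang Teng (张腾)	3. School of Computer Science, Wuhan University of Technology, Wuhan 430070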

Chinese abstract (translated):

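[Purpose/Significance] Most information on the Internet takes the form of combined text, images, and video, that is, cross-media data. Traditional retrieval methods, such as text-to-text and image-to-image retrieval, can no longer meet today's information retrieval needs, so cross-media data retrieval has emerged, and research on it has considerable significance and practical value. [Method/Process] This paper uses the Doc2vec model and the VGG16 model to process text and image data respectively and obtain their feature values, then matches text and image data according to the relative semantic distance between those feature values, thereby automatically pairing images with text or selecting labels for images. [Results/Conclusions] The proposed method fuses data by extracting features from different forms of data, effectively improves the hit rate of automatic image-text matching, and achieves image-text retrieval based on heterogeneous data features.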

English abstract:

[Objective/Significance] Most information on the Internet combines text, images, video, and other media, which is known as cross-media data. Traditional retrieval methods, namely text-to-text and image-to-image retrieval, can no longer meet today's information retrieval needs, so cross-media data retrieval has emerged; research on it has significant theoretical and practical value. [Methods/Process] This paper uses the Doc2vec model and the VGG16 model to extract feature values from text and image data respectively, and uses a vector similarity distance metric to match text and image data. [Results/Conclusions] The experimental results show that the proposed method can fuse data of different modalities, effectively improves the hit rate of automatic image-text matching, and realises an image-text retrieval method based on feature vectors of heterogeneous data.
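The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which distance metric is used or how Doc2vec and VGG16 outputs are brought to a common dimension, so the sketch assumes cosine distance and assumes both kinds of vector already live in one shared space. The function names (`cosine_distance`, `rank_images`), the file names, and the three-dimensional toy vectors are hypothetical stand-ins for real model outputs.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def rank_images(text_vec, image_vecs, top_k=1):
    """Return the ids of the top_k images closest to text_vec."""
    # Sort (distance, id) pairs so the smallest distance comes first.
    scored = sorted(
        (cosine_distance(text_vec, vec), image_id)
        for image_id, vec in image_vecs.items()
    )
    return [image_id for _, image_id in scored[:top_k]]


# Hypothetical toy data: in practice text_vec would come from a text
# model such as Doc2vec and image_vecs from an image model such as
# VGG16, after mapping both to the same dimension (an assumption here).
text_vec = [0.9, 0.1, 0.0]
image_vecs = {
    "cat.jpg": [0.8, 0.2, 0.1],
    "car.jpg": [0.0, 0.1, 0.9],
    "dog.jpg": [0.5, 0.5, 0.0],
}

print(rank_images(text_vec, image_vecs, top_k=2))
```

Swapping the roles of the arguments (one image vector as the query, a dictionary of label or text vectors as candidates) gives the reverse direction mentioned in the abstract, selecting labels for an image.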
