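Meng Yuebo, Zhang Ziqin, Liu Guanghui, Xu Shengjun. Fusing global aggregation and local mining for architectural image retrieval [J]. Chinese High Technology Letters, 2024, 34(7): 692-704 |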
Fusing global aggregation and local mining for architectural image retrieval |
|
DOI: 10.3772/j.issn.1002-0470.2024.07.003 |
Keywords: architectural image; image retrieval; feature aggregation; feature mining |
Funding: |
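Author | Affiliation |
Meng Yuebo | College of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055 |
Zhang Ziqin | |
Liu Guanghui | |
Xu Shengjun | |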
|
Abstract: |
To address the low retrieval accuracy caused by scale variation and local occlusion in architectural images, this paper proposes an architectural image retrieval network that fuses global aggregation and local mining. The network uses ResNet50 as its backbone, followed by a global branch for multi-scale feature aggregation and a local branch for attention-guided feature mining; an orthogonal fusion module then integrates the complementary features of the two branches. Specifically, the multi-scale feature aggregation module combines hybrid dilated convolutions with channel attention to adaptively weight and aggregate targets at different scales across the whole image, strengthening the network's extraction of multi-scale salient features of buildings. The attention-guided feature mining module uses information-complementary attention to mark and erase the most salient features, thereby mining latent detail in local regions. On the mainstream architectural datasets ROxf and RPar, the proposed method achieves a mean average precision (mAP) of 81.54% (M) and 62.43% (H), and 90.28% (M) and 78.35% (H), respectively. Experimental results indicate that the method effectively overcomes interference from scale variation and local occlusion and significantly improves the accuracy of architectural image retrieval. |
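The abstract names two ideas that are easy to make concrete: orthogonal fusion of the two branches, and the mAP metric used for evaluation. The sketch below is illustrative only and is not the authors' implementation. It assumes that "orthogonal fusion" means removing from a local feature its projection onto the global descriptor and concatenating the remainder with the global descriptor, and that each ranked list covers every relevant database image. The function names (`orthogonal_component`, `fuse`, `average_precision`, `mean_average_precision`) are hypothetical, and the ROxf/RPar handling of ignored ("junk") images is left out.

```python
from typing import List, Sequence


def orthogonal_component(local: Sequence[float], global_: Sequence[float]) -> List[float]:
    """Return the part of `local` that is orthogonal to `global_`."""
    dot = sum(l * g for l, g in zip(local, global_))
    norm_sq = sum(g * g for g in global_)
    if norm_sq == 0.0:
        # A zero global descriptor has no direction to project onto.
        return list(local)
    scale = dot / norm_sq
    # Subtract the projection of `local` onto `global_`.
    return [l - scale * g for l, g in zip(local, global_)]


def fuse(local: Sequence[float], global_: Sequence[float]) -> List[float]:
    """Assumed fusion: global descriptor followed by the orthogonal local part."""
    return list(global_) + orthogonal_component(local, global_)


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """AP for one query, given relevance flags in ranked order.

    Assumes the list contains every relevant database image.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    return sum(average_precision(q) for q in queries) / len(queries)
```

For example, `fuse([1.0, 1.0], [1.0, 0.0])` keeps the global descriptor and appends only the local component that the global descriptor does not already explain, which is the sense in which the two branches are treated as complementary here.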