| 王成瑞* **, 陈宏申***, 蔡恒毅***, 李天浩***, 徐夙龙***, 赵晓芳* ****. Difference-aware contrastive learning for dialogue generation [J]. 高技术通讯 (Chinese edition), 2025, 35(11): 1163-1173 |
| Difference-aware contrastive learning for dialogue generation |
| |
| DOI: 10.3772/j.issn.1002-0470.2025.11.002 |
| Keywords: natural language processing; open-domain dialogue system; contrastive learning; difference-aware method; pre-trained model |
| Funding: |
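| Authors: 王成瑞* **, 陈宏申***, 蔡恒毅***, 李天浩***, 徐夙龙***, 赵晓芳* **** |
| Affiliations: |
| * Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190 |
| ** University of Chinese Academy of Sciences, Beijing 100049 |
| *** JD.com (京东集团), Beijing 100176 |
| **** 中科苏州智能计算技术研究院, Suzhou 215028 |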
| Abstract: |
| Contrastive learning is widely used as an effective fine-tuning method, but the data augmentation it relies on still poses challenges. Because natural language is discrete, traditional augmentation methods can change the semantics of a sample significantly; at the same time, models can become overly sensitive to surface features while neglecting critical semantic differences. To address these problems, this work proposes a difference-aware contrastive learning method. The method uses equivalent contrastive augmentation to keep the model insensitive to semantically equivalent augmented data, and employs a non-equivalent difference discriminator to capture semantic changes in augmented samples, thereby keeping the model sensitive to potentially non-equivalent augmented data. Experiments on two open-domain dialogue datasets show that models fine-tuned with the proposed method achieve significant improvements over baseline models that use previous fine-tuning approaches, in both quantitative evaluation and human assessment. Ablation studies further validate the effectiveness of the individual modules of the method. |
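
The abstract does not give the loss functions, so the following is only a minimal sketch of the general idea, assuming a standard InfoNCE contrastive term for semantically equivalent augmentations and a binary cross-entropy term for a linear difference discriminator. The function names (`info_nce`, `difference_bce`), the linear discriminator, the element-wise difference features, and the temperature value are all hypothetical choices made for illustration, not the authors' formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss (assumed stand-in for the equivalence term).

    The loss is small when the anchor is more similar to its semantically
    equivalent augmentation (positive) than to the negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[0]


def difference_bce(anchor, augmented, is_non_equivalent, weights, bias=0.0):
    """Binary cross-entropy for a hypothetical linear difference discriminator.

    The discriminator scores the element-wise absolute difference between the
    two representations and predicts whether the augmentation changed meaning.
    """
    diff = [abs(a - b) for a, b in zip(anchor, augmented)]
    score = sum(w * d for w, d in zip(weights, diff)) + bias
    prob = 1.0 / (1.0 + math.exp(-score))
    eps = 1e-12  # guards against log(0)
    target = 1.0 if is_non_equivalent else 0.0
    return -(target * math.log(prob + eps)
             + (1.0 - target) * math.log(1.0 - prob + eps))


# Toy 2-d representations: `near` stands for an equivalent augmentation,
# `far` for a non-equivalent one.
anchor, near, far = [1.0, 0.0], [1.0, 0.1], [0.0, 1.0]
equivalence_loss = info_nce(anchor, near, [far])
discriminator_loss = difference_bce(anchor, far, True, weights=[2.0, 2.0])
total_loss = equivalence_loss + discriminator_loss
```

In this toy setup the first term rewards representations that stay close under equivalent augmentation, while the second rewards a discriminator that flags the non-equivalent sample; how the two terms are actually weighted and trained is not stated in the abstract.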