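Wan Yi* **, Li Chunguo***, Yang Feiran*, Yang Jun* **. Real scene synthetic speech detection based on cepstral feature data augmentation [J]. Chinese High Technology Letters (高技术通讯), 2024, 34(10): 1013-1023 |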
Real scene synthetic speech detection based on cepstral feature data augmentation |
|
DOI: 10.3772/j.issn.1002-0470.2024.10.001 |
Keywords: synthetic speech detection; data augmentation; real scenes; frequency masking; generalization ability |
Funding: |
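Authors: Wan Yi* **, Li Chunguo***, Yang Feiran*, Yang Jun* ** |
Affiliations: |
(* Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190)
(** University of Chinese Academy of Sciences, Beijing 100049)
(*** School of Information Science and Engineering, Southeast University, Nanjing 210096) |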
|
Abstract: |
The performance of existing synthetic speech detection systems degrades severely in real scenarios. This paper proposes a data augmentation method for cepstral features based on frequency masking. First, the linear filter bank features (LFBs) of the input signal are masked along the frequency channels to introduce speech distortion consistent with real scenarios. Then, the linear frequency cepstral coefficients (LFCC) of the masked features are computed to reduce the feature dimensionality and improve detection performance. Three detection systems, built on a light convolutional neural network (LCNN), a deep residual network (ResNet), and a one-dimensional convolutional Transformer (OCT), are used to verify the effectiveness of the proposed method. Experiments on real scene datasets show that the proposed method reduces the equal error rate (EER) of the different synthetic speech detection systems by 6.39% to 25.95% relative to the baseline without augmentation. When the proposed method is combined with codec-based augmentation, the EER of the different systems is 31.71% to 42.47% lower than the baseline, which further improves the generalization ability of the systems in real scenarios and outperforms existing data augmentation methods. |
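The two steps described in the abstract (mask the LFBs along the frequency axis, then compute LFCC from the masked features) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the triangular filter design, the frame and filter counts, the single contiguous mask, and the choice to fill masked channels with zero in the log domain are all hypothetical, since the abstract does not specify them.

```python
import numpy as np


def linear_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with linearly spaced centre frequencies (assumed design)."""
    n_bins = n_fft // 2 + 1
    edges = np.linspace(0.0, sample_rate / 2, n_filters + 2)
    freqs = np.linspace(0.0, sample_rate / 2, n_bins)
    fb = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def log_lfb(signal, sample_rate=16000, n_fft=512, hop=160, n_filters=20):
    """Log linear filter bank features, shape (n_frames, n_filters)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[t * hop: t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    fb = linear_filterbank(n_filters, n_fft, sample_rate)
    return np.log(power @ fb.T + 1e-10)


def frequency_mask(lfb, max_width, rng):
    """Mask one random contiguous band of filter channels.

    Assumption: masked channels are set to 0.0; the paper may use another fill value.
    """
    n_filters = lfb.shape[1]
    width = int(rng.integers(0, max_width + 1))          # 0..max_width channels
    start = int(rng.integers(0, n_filters - width + 1))  # band stays in range
    masked = lfb.copy()
    masked[:, start:start + width] = 0.0
    return masked, (start, width)


def lfcc_from_lfb(lfb, n_ceps):
    """Unnormalised DCT-II over the filter axis, keeping the first n_ceps coefficients."""
    n_filters = lfb.shape[1]
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return lfb @ basis.T


# Usage on one second of synthetic noise standing in for a speech signal.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
lfb = log_lfb(signal)
masked_lfb, (start, width) = frequency_mask(lfb, max_width=4, rng=rng)
lfcc = lfcc_from_lfb(masked_lfb, n_ceps=12)
```

Keeping fewer cepstral coefficients than filter channels (12 versus 20 here) is what reduces the feature dimensionality; in training, a fresh mask would typically be drawn for each utterance.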