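Wen Yuheng, Yang Changyuan. Construction and Application of Privacy Compliance Evaluation Dictionary of Digital Library[J]. Digital Library Forum, 2023, (6): 60-68 |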
Construction and Application of Privacy Compliance Evaluation Dictionary of Digital Library |
Received: 2023-05-03 |
DOI:10.3772/j.issn.1673-2286.2023.06.008 |
Keywords: Digital Library; Dictionary Construction; Privacy; Compliance Assessment |
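Funding: This research was supported by the National Social Science Fund of China Youth Project "Research on the Legal Supply for Data Element Rights Confirmation" (No. 21CFX007). |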
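Author | Affiliation |
Wen Yuheng | School of Intellectual Property, Xiangtan University; Hunan Research Center for Data Governance and Smart Justice |
Yang Changyuan | School of Intellectual Property, Xiangtan University |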
Abstract: |
A privacy compliance evaluation dictionary for digital libraries is constructed through steps that include selecting seed words and forming a corpus. The corpus is built from the privacy clauses of digital libraries. Tools including TF-IDF, TextRank, and Synonyms are used for word segmentation, part-of-speech tagging, and synonym expansion. The mapping between content words and seed words is analyzed with pointwise mutual information and co-occurrence analysis in order to identify compliance points. In testing, the dictionary raised the word formation rate of privacy clause segmentation by 6.6%, reduced redundancy by 41.0%, and identified compliance evaluation points with 90.5% accuracy, so that evaluation points can largely be identified automatically and digital library privacy compliance evaluation becomes more efficient. The study helps refine the general standard system for user privacy clauses in digital libraries and offers a reference for intelligent compliance evaluation in other fields. |
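The mapping of content words to seed words by pointwise mutual information can be illustrated with a small sketch. This is not the authors' implementation. It assumes clause-level co-occurrence (two words co-occur when they appear in the same clause), uses an invented toy corpus of already segmented English tokens, and the names `pmi_table` and `map_to_seed` are hypothetical. For a word pair, PMI is taken as log2 of P(a, b) / (P(a) P(b)), with probabilities estimated from clause frequencies.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Compute clause-level PMI for every pair of words that co-occur."""
    n_docs = len(documents)
    word_df = Counter()   # number of clauses containing each word
    pair_df = Counter()   # number of clauses containing both words
    for tokens in documents:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        # sorted vocab guarantees each pair is stored as (smaller, larger)
        pair_df.update(combinations(vocab, 2))
    pmi = {}
    for (a, b), joint in pair_df.items():
        p_ab = joint / n_docs
        p_a = word_df[a] / n_docs
        p_b = word_df[b] / n_docs
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


def map_to_seed(word, seeds, pmi):
    """Return the seed with the highest positive PMI, or None if there is none."""
    best_seed, best_score = None, 0.0
    for seed in seeds:
        key = tuple(sorted((word, seed)))
        score = pmi.get(key, float("-inf"))  # never co-occurred: no evidence
        if score > best_score:
            best_seed, best_score = seed, score
    return best_seed


# Toy corpus (assumption): each list is one segmented privacy clause.
clauses = [
    ["collect", "phone", "number", "consent"],
    ["collect", "email", "consent"],
    ["share", "third_party", "advertising"],
    ["share", "third_party", "partner"],
    ["delete", "account", "request"],
]
seeds = ["collect", "share", "delete"]  # hypothetical seed words

pmi = pmi_table(clauses)
for content_word in ["consent", "third_party", "account", "cookie"]:
    print(content_word, "->", map_to_seed(content_word, seeds, pmi))
```

In this sketch a content word that never co-occurs with any seed (here `cookie`) is left unmapped rather than forced onto a compliance point. A real pipeline would need a Chinese segmenter and a tuned PMI threshold, neither of which is specified in the abstract.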