Volume 42 Issue 4
Apr.  2020
CAO Wen-bin, WU Zhuo-feng, YANG Tao, FAN You-rong. Entity and attribute extraction of terrorism event based on text corpus[J]. Chinese Journal of Engineering, 2020, 42(4): 500-508. doi: 10.13374/j.issn2095-9389.2019.09.13.003

Entity and attribute extraction of terrorism event based on text corpus

doi: 10.13374/j.issn2095-9389.2019.09.13.003
  • Corresponding author: E-mail: 490838330@qq.com
  • Received Date: 2019-09-13
  • Publish Date: 2020-04-01
  • Driven by complex international factors, terrorism events have become increasingly rampant in many countries in recent years, posing a great threat to the global community. With the widespread use of emerging technologies in military and commercial fields, terrorist organizations have begun to use these technologies for destructive activities. As the Internet and information technology have developed, terrorism has spread rapidly in cyberspace. Terrorist organizations have created terrorism websites, established multinational networks, released recruitment information, and even conducted training activities through mainstream websites with a worldwide reach. Compared with traditional terrorist activities, cyber terrorist activities are more destructive, and cybercrime and cyber terrorism have become the most serious challenges for societies. Terrorist organizations take advantage of the Internet to disseminate extremist ideas rapidly and to recruit a large number of terrorists and supporters around the world, especially in developed Western countries. They even use the Internet and "dark net" networks to conduct terrorist training, and their activities are concealed. As a result, "lone wolf" terrorist attacks have emerged in an endless stream in various countries and are difficult to prevent. This study proposes a method for extracting the entities and attributes of terrorism events based on semantic role analysis, providing technical support for monitoring and predicting cyberspace terrorism activities. First, a naive Bayes text classification algorithm is used to identify terrorism events in the cleaned text corpus collected from the Anti-Terrorism Information Site of the Northwest University of Political Science and Law.
Next, the TF-IDF keyword extraction algorithm, combined with natural language processing technology, is used to construct terrorism vocabularies from the classified text corpus. Then, semantic role and syntactic dependency analyses are conducted to mine the attributive post-targeting relationship, the person name/place name/organization relationship, and the mediator-like relationship. Finally, regular expressions and the constructed terrorism-specific vocabularies are used to extract six entities and attributes of terrorism events (occurrence time, occurrence location, casualties, attack methods, weapon types, and terrorist organizations) from the four types of triple short texts. On experimental data of 4221 collected articles, the F1 values for all six types of entity and attribute extraction exceeded 80%. The proposed method therefore has practical significance for maintaining public safety, because it supports monitoring and predicting cyberspace terrorism events.
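
As a rough illustration of three steps named in the abstract (TF-IDF keyword scoring, regular-expression attribute extraction, and F1 evaluation), the following minimal Python sketch may help. It is not the authors' implementation. The English toy sentences, the regular expressions, and all function names (`tf_idf`, `extract_attributes`, `f1_score`) are assumptions made for this example only; the paper works on a Chinese corpus, and its actual patterns and vocabularies are not given here.

```python
import math
import re
from collections import Counter

# Toy English documents standing in for the cleaned corpus (hypothetical data).
DOCS = [
    "On 12 March 2019 a bomb attack in Kabul killed 5 people and injured 12 people",
    "Gunmen attacked a market on 3 June 2018 and killed 7 people",
    "The city council approved a new budget for public transport",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tf_idf(docs):
    """Return one {term: score} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({t: (c / total) * math.log(n / df[t]) for t, c in counts.items()})
    return scores

# Hypothetical patterns for illustration; not the paper's regular expressions.
DATE_RE = re.compile(
    r"\b(\d{1,2} (?:January|February|March|April|May|June|July|"
    r"August|September|October|November|December) \d{4})\b"
)
KILLED_RE = re.compile(r"killed (\d+) people")
INJURED_RE = re.compile(r"injured (\d+) people")

def extract_attributes(text):
    """Extract occurrence time and casualty counts from one short text."""
    date = DATE_RE.search(text)
    killed = KILLED_RE.search(text)
    injured = INJURED_RE.search(text)
    return {
        "time": date.group(1) if date else None,
        "killed": int(killed.group(1)) if killed else None,
        "injured": int(injured.group(1)) if injured else None,
    }

def f1_score(predicted, gold):
    """F1 over sets of extracted items: harmonic mean of precision and recall."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, terms that appear in every document receive a TF-IDF score of zero, so domain words such as "bomb" rank above common words. A per-attribute F1 can be computed by passing the set of extracted values and the set of annotated values for that attribute to `f1_score`.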

     


    Figures(3)  / Tables(8)
