Emotional cause mining method based on dependency syntax and generalized causal network

Document No.: 1520989 Publication date: 2020-02-11

Abstract: This technology, an emotional cause mining method based on dependency syntax and a generalized causal network, was devised by 孙越恒 (Sun Yueheng) and 谢英杰 (Xie Yingjie) on 2018-07-11. The invention belongs to the field of natural language processing and specifically relates to a method for mining emotional causes based on dependency syntax and a generalized causal network. Its main steps are: crawling news data with a crawler framework and inputting the data; preprocessing; extracting the semantic patterns of the text; judging whether a pattern is causal; outputting the causal relationships in the sentences; identifying paired causal events; extracting more specific events; abstracting the events; constructing the network; generalization processing; and evaluation. The invention fully captures the implicit meaning between the words in a sentence, and generalizing the events improves the degree of event matching.

1. A method for mining emotional causes based on dependency syntax and a generalized causal network, characterized by comprising the following main steps: first extracting causal relationships according to the dependency syntax, and then constructing a generalized causal relationship network from the extracted causal relationships; the specific steps are as follows:

1) crawling news data with a crawler framework and inputting the data;

2) preprocessing the input data;

3) extracting the semantic patterns of the text according to the binary relation model and semantic analysis;

4) judging whether each extracted semantic pattern is a causal-relation semantic pattern;

5) outputting the causal relationships between the cause events and the result events in the sentences;

6) identifying pairs of causal events using causal connectives;

7) extracting more specific causal events from the identified events;

8) abstracting each causal event into an event composed of the union of a series of verbs and nouns;

9) constructing a causal relationship network from the cause events and the result events, and establishing connecting edges between the cause events and the result events;

10) carrying out generalization processing on the events;

11) predicting cause events;

12) evaluating the prediction results of the causal relationship network by computing the precision (P value), the recall (R value) and the F value.

2. The method for mining emotional causes based on dependency syntax and a generalized causal network according to claim 1, wherein the crawler framework in step 1) is any one of Heritrix, jspider, webmagic and webselector.

3. The method for mining emotional causes based on dependency syntax and a generalized causal network according to claim 1, wherein the preprocessing in step 2) includes sentence segmentation, word segmentation, part-of-speech tagging or semantic analysis.

4. The method for mining emotional causes based on dependency syntax and a generalized causal network according to claim 3, wherein the sentence segmentation is mainly implemented with an existing word segmenter.

5. The method for mining emotional causes based on dependency syntax and a generalized causal network according to claim 4, wherein the word segmenter is mainly a word segmenter, an Ansj segmenter, a Stanford segmenter or a Lucene & Nutch segmenter.

6. The method for mining emotional causes based on dependency syntax and a generalized causal network according to claim 1, wherein step 9) is specifically as follows: the edges represent the relationships between events, with the tail of each arrow at the cause event and the head pointing to the result event; causality is transitive, that is, it has a chain property, so multiple matching causal relationships can be joined end to end to form a long chain.

7. The method for mining emotional causes based on dependency syntax and a generalized causal network according to claim 1, wherein step 10) is specifically as follows: events of the same class are merged into a single event, and a generalized event usually represents a whole class of events, that is, an abstract event.

8. The method for mining emotional causes based on dependency syntax and a generalized causal network according to claim 1, wherein step 11) is specifically as follows: to predict the cause events that lead to the occurrence of event A, the node of event A is located in the generalized causal relationship network and the cause events connected to that node are found; in this way a family of cause events leading to the occurrence of event A is obtained.

Technical Field

The invention belongs to the field of natural language processing, and particularly relates to an emotional cause mining method based on dependency syntax and a generalized causal network.

Background

With the rapid growth of social networking platforms, more and more people express their emotions on social networks, and emotional cause mining has become a new challenge in natural language processing. In recent years, emotion analysis has focused mainly on emotion classification, but in many cases the events that trigger the generation and transfer of an emotion matter more. For example, manufacturers want to know why product sales are high or low, and governments want to know why social credibility has declined. This section introduces the current state of research on emotional cause mining.

Sophia M. Y. Lee first proposed the concept of emotional cause mining. The associated approach is driven by linguistic rules and extracts the causes corresponding to emotional expressions in news texts. Its experimental performance is not ideal: the accuracy reaches only 67.47%. Building on Lee's work, Chen and Lee proposed a rule-based emotional cause mining method that introduces the OCC model into emotional cause mining and designs new emotional cause extraction rules.

In addition to rule-based methods, Ghazi used CRFs for emotional cause mining. However, that method requires the emotional cause and the emotion keywords to be in the same sentence, which greatly limits its range of application and leaves its extensibility in need of improvement. LinGu proposed a QA-based emotional cause mining method, but in finer-grained scenarios its analysis results may contain errors.

The method of the present invention first extracts the emotional causal relationships in the text according to the dependency syntax relations, then constructs a causal relationship network from the extracted relationships, and finally generalizes the constructed network so that events can be matched more reliably. The method is a substantial advance in both experimental performance and extensibility.

Disclosure of Invention

The purpose of the method is to mine the trigger events that cause emotions to arise and transfer, to discover the causal relationships in texts, and to find the regularities that govern them. The method has broad application value in event prediction, event clustering and stock market prediction, and can also help governments monitor public sentiment in social media effectively.

In order to solve the technical problems described in the background art, the invention adopts the following technical scheme: a method for mining emotional causes based on dependency syntax and a generalized causal network, comprising the following main steps: first extracting causal relationships according to the dependency syntax, and then constructing a generalized causal relationship network from the extracted causal relationships;

the specific steps are as follows:

1) crawling news data with a crawler framework and inputting the data;

2) preprocessing the input data;

3) extracting the semantic patterns of the text according to the binary relation model and semantic analysis;

4) judging whether each extracted semantic pattern is a causal-relation semantic pattern;

5) outputting the causal relationships between the cause events and the result events in the sentences;

6) identifying pairs of causal events using causal connectives;

7) extracting more specific causal events from the identified events;

8) abstracting each causal event into an event composed of the union of a series of verbs and nouns;

9) constructing a causal relationship network from the cause events and the result events, and establishing connecting edges between the cause events and the result events;

10) carrying out generalization processing on the events;

11) predicting cause events;

12) evaluating the prediction results of the causal relationship network by computing the precision (P value), the recall (R value) and the F value.
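The evaluation in step 12) can be sketched as follows. This is a minimal illustration rather than the patent's own implementation: the text does not define the F value, so the sketch assumes the common balanced F-measure (the harmonic mean of P and R), and the function name `precision_recall_f` and the set-based comparison of predicted and reference cause events are assumptions made for the example.

```python
def precision_recall_f(predicted, gold):
    """Return (P, R, F) for predicted cause events against a reference set.

    Assumption: F is the balanced F-measure 2PR / (P + R); the source text
    only names "the F value" without giving a formula.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # predictions that match the reference
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f
```

With four predicted cause events of which two are among three reference events, this gives P = 0.5 and R = 2/3.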

The crawler framework in step 1) is any one of Heritrix, jspider, webmagic and webselector.

The preprocessing in step 2) comprises sentence segmentation, word segmentation, part-of-speech tagging or semantic analysis.

The sentence segmentation is mainly implemented with an existing word segmenter.

The word segmenter is mainly a word segmenter, an Ansj segmenter, a Stanford segmenter or a Lucene & Nutch segmenter.

Step 9) of the invention is specifically as follows: the edges represent the relationships between events, with the tail of each arrow at the cause event and the head pointing to the result event; causality is transitive, that is, it has a chain property, so multiple matching causal relationships can be joined end to end to form a long chain.
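A directed network of this kind, together with the end-to-end chaining of causal relationships, might look like the sketch below. The class name `CausalNetwork`, its methods and the example events are hypothetical and chosen only for illustration; the patent does not prescribe a data structure.

```python
from collections import defaultdict


class CausalNetwork:
    """Directed graph: each edge runs from a cause event to a result event."""

    def __init__(self):
        self.effects = defaultdict(set)  # cause event -> result events
        self.causes = defaultdict(set)   # result event -> cause events

    def add_relation(self, cause, result):
        # Arrow tail at the cause, arrow head at the result.
        self.effects[cause].add(result)
        self.causes[result].add(cause)

    def chains_from(self, start):
        """Return every maximal causal chain that begins at `start`."""
        chains = []

        def walk(path):
            # Skip events already on the path so that cycles terminate.
            nexts = [e for e in sorted(self.effects.get(path[-1], ()))
                     if e not in path]
            if not nexts:
                chains.append(path)
                return
            for event in nexts:
                walk(path + [event])

        walk([start])
        return chains
```

Adding the relations heavy rain → flood, flood → traffic jam and traffic jam → anger and then calling `chains_from("heavy rain")` yields the single chain that links all four events, which is the "long chain" the step describes.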

Step 10) of the invention is specifically as follows: events of the same class are merged into a single event, and a generalized event usually represents a whole class of events, that is, an abstract event.
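One way to picture the generalization is to map each concrete event to a class label and merge the edges that coincide afterwards. The sketch below assumes a hand-written lookup table `event_class`; how events are actually assigned to classes is not specified in this text, so the table, the function name `generalize` and the choice to drop self-loops are all illustrative assumptions.

```python
def generalize(relations, event_class):
    """Replace concrete events by their abstract class and merge duplicates.

    relations   -- iterable of (cause, result) pairs
    event_class -- hypothetical mapping from a concrete event to its class;
                   events missing from the mapping are kept unchanged
    """
    generalized = set()
    for cause, result in relations:
        g_cause = event_class.get(cause, cause)
        g_result = event_class.get(result, result)
        # Assumption: an edge whose two ends merge into one class is dropped.
        if g_cause != g_result:
            generalized.add((g_cause, g_result))
    return generalized
```

For instance, if "rainstorm" and "typhoon" both map to "bad weather", and "flight delay" and "train delay" both map to "transport delay", the two concrete edges collapse into the single abstract edge bad weather → transport delay.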

Step 11) of the invention is specifically as follows: to predict the cause events that lead to the occurrence of event A, the node of event A is located in the generalized causal relationship network and the cause events connected to that node are found; in this way a family of cause events leading to the occurrence of event A is obtained.
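The lookup described in this step can be sketched as a search over incoming edges. The code below is an interpretation under stated assumptions: the names `build_cause_index` and `predict_causes` are hypothetical, the optional `event_class` table stands in for whatever generalization is used, and the `transitive` flag (following the causal chain backwards past the direct causes) is an added option that the text does not explicitly require.

```python
from collections import defaultdict, deque


def build_cause_index(relations):
    """Index (cause, result) pairs by result event."""
    causes = defaultdict(set)
    for cause, result in relations:
        causes[result].add(cause)
    return causes


def predict_causes(causes, event, event_class=None, transitive=False):
    """Return the family of cause events for `event`.

    The event is first mapped to its generalized node (if a mapping is
    given); direct causes are always returned, and indirect causes are
    included when `transitive` is True.
    """
    node = (event_class or {}).get(event, event)
    found, queue = set(), deque([node])
    while queue:
        current = queue.popleft()
        for cause in causes.get(current, ()):
            if cause not in found and cause != node:
                found.add(cause)
                if transitive:
                    queue.append(cause)
    return found
```

Because the query event is generalized before the lookup, a concrete event that never appeared in the corpus can still be matched as long as its class has a node in the network.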

Beneficial effects: existing emotional cause mining methods mainly rely on simple templates; they do not effectively consider the syntactic relations among the words in a sentence, and their event matching is poor. Compared with the prior art, the method has the following main beneficial effects:

First, the method uses syntactic dependency relations to analyze the semantics of a sentence, and thus fully captures the implicit meaning between the words in the sentence.

Second, the events are generalized, which improves the degree of event matching.

Finally, the method has good extensibility: it can be applied not only to emotional cause mining but also to cause mining in general.

Drawings

FIG. 1 is a flow chart of causal relationship extraction based on dependency syntax;

FIG. 2 is an example of simple semantic pattern extraction and analysis;

FIG. 3 is a flow chart of constructing the generalized causal relationship network.

Detailed Description

The invention is described in further detail below with reference to the figures and specific examples.
