Alarm scene mining method

Document No.: 1963624 Publication date: 2021-12-14

Reading note: This technology, "Alarm scene mining method", was designed by 杨康, 葛晓波, 王鹏, and 汪洋 on 2021-09-16. Main content: The invention relates to an alarm scene mining method comprising the following steps: acquiring a history record in which alarm data is stored; clustering the history records and matching alarm templates to obtain alarm data records containing template ids; segmenting the alarm data records by time window to obtain a plurality of alarm data record segments; counting, for each template id, the alarm data record segments in which it appears to form an alarm data record segment set, and recording the template id and its segment set in a hash table; and calculating the correlation between every two template ids and constructing a correlation matrix. The invention applies machine learning to analyze massive alarm data, identifies the alarm scenes the data contains, and records them as alarm templates. Alarm analysis thereby becomes automated, intelligent, and standardized, which effectively helps operation and maintenance personnel diagnose faults and locate problems, and improves their efficiency and problem-solving capability.

1. An alarm scene mining method, characterized by comprising the following steps:

acquiring a history record in which alarm data is stored;

clustering the history records, then performing alarm template matching, and assigning the same template id to alarm data of the same type to obtain alarm data records containing the template id;

setting a time window;

segmenting the alarm data records containing the template id according to the time window to obtain a plurality of alarm data record segments;

processing the alarm data record segments one by one, counting the alarm data record segments in which each template id appears to form an alarm data record segment set for that template id, and recording the template id and its alarm data record segment set in a hash table;

and calculating the correlation between every two template ids and constructing a correlation matrix.

2. The alarm scene mining method according to claim 1, wherein the correlation between every two template ids is calculated and the correlation matrix is constructed by the following specific steps:

forming a plurality of template pairs from the template ids taken two at a time,

obtaining the hash table entries corresponding to the two template ids of a pair,

acquiring the two alarm data record segment sets from the hash table,

calculating the Jaccard similarity of the two alarm data record segment sets, denoted A and B, by the formula

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

that is, the Jaccard similarity of the two alarm data record segment sets equals the size of their intersection divided by the size of their union.

3. The alarm scene mining method according to claim 2, further comprising the step of constructing an acyclic graph based on the correlation matrix.

4. The alarm scene mining method according to claim 3, wherein the specific steps of constructing the acyclic graph based on the correlation matrix are as follows:

each template id is taken as a vertex in the graph,

according to a correlation threshold configured by the user, a template pair whose correlation reaches the threshold is regarded as related, and an edge with weight 1 is added between the two vertices corresponding to the two template ids of the pair;

and the acyclic graph is processed with a community detection algorithm to divide it into communities.

5. The alarm scene mining method according to claim 4, wherein the acyclic graph is processed with a community detection algorithm to divide it into communities by the following specific steps:

determining the community detection algorithm to be used;

setting the objective function used by the community detection algorithm to determine modularity, wherein the modularity, also called the Q value, measures the quality of a community division;

dividing the acyclic graph into communities with the community detection algorithm so that the Q value moves in the increasing direction;

and filtering the community division result, removing isolated communities, and taking each remaining community as an alarm scene.

6. The alarm scene mining method according to claim 5, wherein the Louvain algorithm is determined to be the community detection algorithm used.

7. The alarm scene mining method according to claim 5, wherein, to improve readability, the template id is replaced with the template content and the result is stored as an alarm scene record.

Technical Field

The invention relates to the technical field of IT operations management (ITOM), and in particular to an alarm scene mining method.

Background

Alarm analysis is widely used and very important in operations and management. It helps an enterprise's operation and maintenance personnel track the health of their servers in real time and so avoid losses from faults that are hard to estimate in advance. It also reveals the running condition of software and hardware equipment and lets the root cause be found quickly when a fault occurs, so that the fault can be remedied in time and the availability of the equipment improved.

Generally, when a fault occurs repeatedly, several alarms are generated each time, and these alarms follow a certain regularity. The alarms may be of the same or different types. For example, when fault a occurs, three alarms (alarms 1-3) may be generated; conversely, whenever alarms 1-3 are observed together, it can be presumed that fault a has occurred. Following this rule, alarms that often appear together can be identified and integrated into an alarm template (alarm template mining). The alarms contained in one alarm template constitute an alarm scene, and each alarm scene generally corresponds to one fault.

As server fleets grow, alarm data increases day by day and gradually becomes massive. Alarm analysis over such massive data cannot rely on manual processing, so enterprises need an automated alarm analysis solution.

The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.

Disclosure of Invention

In view of the defects in the prior art, the invention aims to provide an alarm scene mining method that applies machine learning to analyze massive alarm data, identifies the alarm scenes contained in the alarm data, and records them as alarm templates. Alarm analysis thereby becomes automated, intelligent, and standardized, which effectively helps operation and maintenance personnel diagnose faults and locate problems, and improves their efficiency and problem-solving capability.

In order to achieve the above purposes, the technical scheme adopted by the invention is as follows:

an alarm scene mining method, characterized by comprising the following steps:

acquiring a history record in which alarm data is stored;

clustering the history records, then performing alarm template matching, and assigning the same template id to alarm data of the same type to obtain alarm data records containing the template id;

setting a time window;

segmenting the alarm data records containing the template id according to the time window to obtain a plurality of alarm data record segments;

processing the alarm data record segments one by one, counting the alarm data record segments in which each template id appears to form an alarm data record segment set for that template id, and recording the template id and its alarm data record segment set in a hash table;

and calculating the correlation between every two template ids and constructing a correlation matrix.

On the basis of the above technical scheme, the correlation between every two template ids is calculated and the correlation matrix is constructed by the following specific steps:

forming a plurality of template pairs from the template ids taken two at a time,

obtaining the hash table entries corresponding to the two template ids of a pair,

acquiring the two alarm data record segment sets from the hash table,

calculating the Jaccard similarity of the two alarm data record segment sets, denoted A and B, by the formula

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

that is, the Jaccard similarity of the two alarm data record segment sets equals the size of their intersection divided by the size of their union.

On the basis of the above technical scheme, the method further comprises the step of constructing an acyclic graph based on the correlation matrix.

On the basis of the above technical scheme, the specific steps of constructing the acyclic graph based on the correlation matrix are as follows:

each template id is taken as a vertex in the graph,

according to a correlation threshold configured by the user, a template pair whose correlation reaches the threshold is regarded as related, and an edge with weight 1 is added between the two vertices corresponding to the two template ids of the pair;

and the acyclic graph is processed with a community detection algorithm to divide it into communities.

On the basis of the above technical scheme, the acyclic graph is processed with a community detection algorithm to divide it into communities by the following specific steps:

determining the community detection algorithm to be used;

setting the objective function used by the community detection algorithm to determine modularity, wherein the modularity, also called the Q value, measures the quality of a community division;

dividing the acyclic graph into communities with the community detection algorithm so that the Q value moves in the increasing direction;

and filtering the community division result, removing isolated communities, and taking each remaining community as an alarm scene.

On the basis of the above technical scheme, the Louvain algorithm is determined to be the community detection algorithm used.

On the basis of the above technical scheme, to improve readability, the template id is replaced with the template content and the result is stored as an alarm scene record.

The method for mining the alarm scene has the following beneficial effects:

the machine learning technology is adopted to perform alarm analysis on massive alarm data, identify alarm scenes contained in the alarm data and record the alarm scenes as an alarm template, so that automation, intellectualization and standardization of alarm analysis are realized, operation and maintenance personnel can be effectively helped to perform fault diagnosis and problem positioning, and the efficiency and the problem solving capability are improved.

Within the machine learning approach, a community detection algorithm is selected, so that the mined scenes are few in number and high in accuracy, the templates within each scene are strongly correlated, and no two scenes intersect.

Drawings

The invention has the following drawings:

The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:

Fig. 1 is a flowchart of a first embodiment of the alarm scene mining method according to the present invention.

Fig. 2 is a flowchart of a second embodiment of the alarm scene mining method according to the present invention.

Fig. 3 is an example of an acyclic graph constructed based on a correlation matrix.

Fig. 4 is a flowchart of a third embodiment of the alarm scene mining method according to the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings. The detailed description, while indicating exemplary embodiments of the invention, is given by way of illustration only, in which various details of embodiments of the invention are included to assist understanding. Accordingly, it will be appreciated by those skilled in the art that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

As shown in fig. 1, the present invention provides an alarm scene mining method, which comprises the following steps:

acquiring a history record in which alarm data is stored;

for example: the format and contents of the history are shown in table 1 below,

TABLE 1

Alarm id   Time of occurrence   Content of alarm
1          2021-01-01 00:00     Host db01 unable to ping
2          2021-01-01 00:01     Connection of mysql database on db01 failed
3          2021-01-01 00:03     The CPU utilization rate of the host1 exceeds 80 percent
...        ...                  ...
1001       2021-01-03 08:00     Host db02 unable to ping
1002       2021-01-03 08:00     Connection of mysql database on db02 failed
1003       2021-01-03 08:01     The transaction failure rate reaches 10 percent
1004       2021-01-01 08:02     Failure of transfer service invocation over the internet
...        ...                  ...
2001       2021-01-04 18:00     The transaction failure rate reaches 12 percent
2002       2021-01-04 18:01     Failure of transfer service invocation over the internet
2003       2021-01-04 18:02     The host2 has high memory utilization rate
...        ...                  ...

The history record at least comprises an alarm id, occurrence time and alarm content;

clustering the history records, then performing alarm template matching, and assigning the same template id to alarm data of the same type to obtain alarm data records containing the template id;

the clustering may use any existing clustering algorithm; the invention does not concern improvements to the clustering algorithm, so it is not described in detail;

the template id is used to distinguish different types of alarm data;

the purpose of clustering and alarm template matching is to give the alarm content structure and make feature extraction easier; as one alternative embodiment, the clustering can adopt the LCS-based real-time log clustering method disclosed in 202010216937.8;

for example: the records shown in table 1 are clustered and matched against alarm templates, giving alarm data records containing the template id in the format and with the content shown in table 2 below,

TABLE 2

Setting a time window;

as an alternative embodiment, the time window defaults to 5 minutes;

segmenting the alarm data records containing the template id according to the time window to obtain a plurality of alarm data record segments; after segmentation, the difference between the occurrence time of the first record and the occurrence time of the last record in each alarm data record segment is less than or equal to the time window;

for example: the records shown in table 2 are segmented by time window, resulting in several alarm data record segments as shown in table 3 below,

TABLE 3
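The segmentation step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the function name `split_by_time_window`, the `(timestamp, template_id)` record layout, and the sample template ids are hypothetical, and starting a new segment once a record falls more than one window after the segment's first record is only one possible reading of the constraint described above.

```python
from datetime import datetime, timedelta


def split_by_time_window(records, window=timedelta(minutes=5)):
    """Split (timestamp, template_id) records into segments.

    A new segment starts whenever a record lies more than `window`
    after the first record of the current segment, so the time span
    of every segment is at most `window`.
    """
    segments = []
    current = []
    for ts, template_id in sorted(records):
        if current and ts - current[0][0] > window:
            segments.append(current)
            current = []
        current.append((ts, template_id))
    if current:
        segments.append(current)
    return segments


def t(text):
    """Parse a timestamp in the format used in table 1."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


# Hypothetical records: the template ids are made up for illustration.
records = [
    (t("2021-01-01 00:00"), 1),
    (t("2021-01-01 00:01"), 2),
    (t("2021-01-01 00:03"), 3),
    (t("2021-01-03 08:00"), 1),
    (t("2021-01-03 08:00"), 2),
    (t("2021-01-03 08:01"), 4),
]
segments = split_by_time_window(records)
```

With the default 5-minute window, the six sample records fall into two segments, one per burst of alarms.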

Processing the alarm data record segments one by one, counting the alarm data record segments in which each template id appears to form an alarm data record segment set for that template id, and recording the template id and its alarm data record segment set in a hash table;

for example: counting over the segments shown in table 3 forms the alarm data record segment sets, and the results are shown in table 4 below,

TABLE 4
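A minimal sketch of the counting step, assuming each segment is given simply as the list of template ids it contains. The name `build_segment_index`, the 1-based segment numbering, and the sample segments are hypothetical; a Python `dict` plays the role of the hash table.

```python
from collections import defaultdict


def build_segment_index(segments):
    """Map each template id to the set of segment ids in which it appears."""
    index = defaultdict(set)
    for segment_id, template_ids in enumerate(segments, start=1):
        for template_id in template_ids:
            index[template_id].add(segment_id)
    return dict(index)


# Hypothetical segments, each listed by the template ids it contains.
segments = [[1, 2, 3], [1, 2, 4, 5], [4, 5, 6]]
index = build_segment_index(segments)
```

Because the values are sets, a template id that occurs several times within one segment is still counted once for that segment.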

Calculating the correlation between every two template ids and constructing a correlation matrix, wherein the specific steps are as follows:

forming a plurality of template pairs from the template ids taken two at a time,

obtaining the hash table entries corresponding to the two template ids of a pair,

acquiring the two alarm data record segment sets from the hash table,

calculating the Jaccard similarity of the two alarm data record segment sets, denoted A and B, by the formula

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

that is, the Jaccard similarity of the two alarm data record segment sets equals the size of their intersection divided by the size of their union;

for example: the correlation between two template ids is calculated for the example shown in table 4, and a correlation matrix is constructed as shown in table 5 below,

TABLE 5

The higher the Jaccard similarity (Jaccard coefficient), the more likely the two template ids are to appear together; that is, in any given segment either both template ids appear or neither does.
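The pairwise calculation can be sketched as follows. The function names, the choice to store the matrix as a dict keyed by `(id_a, id_b)` pairs, the convention of returning 0.0 for two empty sets, and the sample hash table are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def correlation_matrix(index):
    """Jaccard similarity for every unordered pair of template ids."""
    return {
        (x, y): jaccard(index[x], index[y])
        for x, y in combinations(sorted(index), 2)
    }


# Hypothetical hash table: template id -> set of segment ids.
index = {1: {1, 2}, 2: {1, 2}, 3: {1}, 4: {2, 3}, 5: {2, 3}, 6: {3}}
matrix = correlation_matrix(index)
```

Here templates 1 and 2 appear in exactly the same segments and score 1.0, while templates 3 and 6 never share a segment and score 0.0. Only the upper triangle is stored because the measure is symmetric.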

On the basis of the above technical solution, as shown in fig. 2, the method further comprises the following steps: constructing an acyclic graph based on the correlation matrix;

the specific steps of constructing the acyclic graph based on the correlation matrix are as follows:

each template id is taken as a vertex in the graph,

according to a correlation threshold configured by the user, a template pair whose correlation reaches the threshold is regarded as related, and an edge with weight 1 is added between the two vertices corresponding to the two template ids of the pair;

for example: if the user sets the correlation threshold to 0.9, the acyclic graph constructed from the correlation matrix shown in table 5 is as shown in fig. 3;

processing the acyclic graph with a community detection algorithm to divide it into communities, wherein the specific steps are as follows:

determining the community detection algorithm to be used; for example, the Louvain algorithm is determined to be the community detection algorithm used; the Louvain algorithm is well known and is not detailed here;

setting the objective function used by the community detection algorithm to determine modularity, wherein the modularity, also called the Q value, measures the quality of a community division;

dividing the acyclic graph into communities with the community detection algorithm so that the Q value moves in the increasing direction;

for example: an acyclic graph is constructed and divided into communities according to the example shown in table 5, and the community division result is shown in table 6,

TABLE 6

Community id   Template id set
1              1,2
2              3
3              4,5
4              6
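To illustrate what the Q value measures, the sketch below computes the standard modularity of a given community division for an unweighted, undirected graph, Q = sum over communities of L_c/m - (d_c/(2m))^2, where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its vertices. It does not implement the Louvain algorithm itself, only the objective that the algorithm tries to increase. The function name and the small graph (edges 1-2 and 4-5 only) are assumptions chosen to be consistent with table 6.

```python
def modularity(graph, communities):
    """Modularity Q of a partition of an unweighted, undirected graph."""
    m = sum(len(neighbours) for neighbours in graph.values()) / 2  # edge count
    if m == 0:
        return 0.0
    q = 0.0
    for community in communities:
        members = set(community)
        # Each internal edge is seen from both endpoints, hence the division by 2.
        internal = sum(1 for v in members for u in graph[v] if u in members) / 2
        degree = sum(len(graph[v]) for v in members)
        q += internal / m - (degree / (2 * m)) ** 2
    return q


# Assumed graph: adjacency sets with edges 1-2 and 4-5; 3 and 6 are isolated.
graph = {1: {2}, 2: {1}, 3: set(), 4: {5}, 5: {4}, 6: set()}
q_grouped = modularity(graph, [{1, 2}, {3}, {4, 5}, {6}])
q_singletons = modularity(graph, [{v} for v in graph])
```

On this graph the division from table 6 gives Q = 0.5, whereas putting every vertex in its own community gives Q = -0.25, so merging connected template ids moves Q in the increasing direction.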

As shown in fig. 4, the community division result is filtered: isolated communities are removed, and each remaining community is taken as an alarm scene;

for example: the communities shown in table 6 are filtered to remove isolated communities, and the resulting alarm scenes are shown in table 7 below,

TABLE 7

Community id   Template id set
1              1,2
3              4,5
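The filtering step can be sketched as below, using the communities of table 6 as input. Treating an "isolated community" as a community containing a single template id is an assumption inferred from the difference between tables 6 and 7; the function name is hypothetical.

```python
def filter_isolated(communities):
    """Drop communities that contain only one template id."""
    return {cid: members for cid, members in communities.items() if len(members) > 1}


# Community id -> template id set, as in table 6.
communities = {1: {1, 2}, 2: {3}, 3: {4, 5}, 4: {6}}
scenes = filter_isolated(communities)
```

The result keeps communities 1 and 3, matching table 7.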

On the basis of the above technical solution, as shown in fig. 4, to improve readability, the template id is replaced with the template content and the result is stored as an alarm scene record.

For example: after the alarm scenes shown in table 7 are processed, the alarm scene records are as shown in table 8 below,

TABLE 8
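Replacing template ids with template content amounts to a dictionary lookup, sketched below. The function name and, in particular, the template strings are hypothetical placeholders; they are not the template contents of the original tables.

```python
def to_scene_records(scenes, template_contents):
    """Replace each template id in a scene with its template content."""
    return {
        cid: [template_contents[tid] for tid in sorted(members)]
        for cid, members in scenes.items()
    }


# Alarm scenes as in table 7; the template strings below are made up.
scenes = {1: {1, 2}, 3: {4, 5}}
template_contents = {
    1: "template text for id 1",
    2: "template text for id 2",
    4: "template text for id 4",
    5: "template text for id 5",
}
records = to_scene_records(scenes, template_contents)
```

Sorting the ids before the lookup keeps the order of templates within each stored record stable between runs.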

Matters not described in detail in this specification are well known to those skilled in the art.

The above description is only a preferred embodiment of the present invention, and the scope of the present invention is not limited to the above embodiment, but equivalent modifications or changes made by those skilled in the art according to the present disclosure should be included in the scope of the present invention as set forth in the appended claims.
