Log analysis method and device based on syntax tree analysis

Document No.: 1087395. Published: 2020-10-20.

Reading note: This technology, Log Analysis Method and Device Based on Syntax Tree Parsing (一种基于语法树解析的日志分析方法及装置), was designed by 陈飞, 赵莹, 王国平, and 赵川 on 2020-06-10. Abstract: The invention relates to a log analysis method and device based on syntax tree parsing, and belongs to the technical field of data processing. A syntax tree parser parses a plurality of query statements input by a user to obtain a plurality of query-class and/or analysis-class domain-specific languages; the query-class domain-specific languages are retrieved in parallel; the results of the parallel retrieval are analyzed in parallel using the analysis-class domain-specific languages; and the results of the parallel analysis are aggregated to produce an operations analysis dashboard. The invention thereby performs statistical analysis of log data by combining a plurality of defined common operation instructions and calculation functions with syntax tree parsing.

1. A log analysis method based on syntax tree parsing, the method comprising:

parsing a plurality of query statements input by a user by using a syntax tree parser to obtain a plurality of query-class and/or analysis-class domain-specific languages, wherein the query statements comprise combinations of a plurality of built-in instructions and/or functions;

performing parallel retrieval for each query-class domain-specific language;

performing parallel analysis on the results of the parallel retrieval of each query-class domain-specific language by using the analysis-class domain-specific language;

and aggregating the results of the parallel analysis to produce an operations analysis dashboard.

2. The log analysis method based on syntax tree parsing of claim 1, wherein parsing each of the plurality of query statements comprises:

acquiring the instruction and/or function corresponding to each of the plurality of query statements;

determining, according to the instruction, whether parsing the query statement generates a query-class domain-specific language and/or an analysis-class domain-specific language;

and performing parallel retrieval and/or parallel analysis for each of the domain-specific languages.

3. The log analysis method based on syntax tree parsing of claim 1, wherein performing parallel retrieval for each query-class domain-specific language comprises:

acquiring, through a search box, a plurality of query statements input by a user;

and performing parallel retrieval on log data distributed across different machines by means such as keyword matching, wildcard matching, phrase matching, field matching, and range matching.

4. The log analysis method based on syntax tree parsing of claim 1, wherein the method further comprises: acquiring a plurality of query statements input by a user;

and combining, passing, and nesting the query statements through pipe symbols, wherein there are one or more pipe symbols.

5. The log analysis method based on syntax tree parsing of claim 4, wherein parsing each query statement further comprises:

parsing and evaluating the query statement on the left side of each pipe symbol to obtain a first analysis result;

inputting the first analysis result as the initial data of the query statement on the right side of the pipe symbol, and parsing the query statement on the right side of the pipe symbol to obtain a second analysis result;

and displaying temporary results, including the first analysis result and the second analysis result, when the retrieval results are analyzed in parallel.

6. A log analysis device based on syntax tree parsing, characterized in that the device comprises:

a syntax tree parsing module, configured to parse the query statements input by a user to obtain a plurality of query-class and/or analysis-class domain-specific languages, wherein the query statements input by the user comprise combinations of a plurality of preset common operation instructions and calculation functions;

a parallel retrieval module, configured to perform parallel retrieval for each query-class domain-specific language;

a parallel analysis module, configured to perform parallel analysis on the results of the parallel retrieval of each query-class domain-specific language by using the analysis-class domain-specific language;

a result aggregation module, configured to aggregate the results of the parallel analysis;

and a dashboard generation module, configured to perform statistical calculation and aggregation on the results of the parallel analysis to generate an operations analysis dashboard.

7. The device of claim 6, wherein the syntax tree parsing module comprises:

an instruction parsing submodule, configured to determine the type of the instruction and/or function corresponding to each domain-specific language in the query language;

an instruction conversion submodule, configured to determine, according to the type of the instruction and/or function, whether the domain-specific language is converted into a query-class or an analysis-class domain-specific language when it is analyzed in parallel;

a statement acquisition module, configured to acquire a plurality of query statements input by a user in a search box;

and a statement combination module, configured to combine, pass, and nest the query statements through pipe symbols, wherein one or more pipe symbols are used for combined retrieval and analysis.

8. The device of claim 7, further comprising, for parsing each query statement:

a first analysis module, configured to parse and evaluate the query statement on the left side of each pipe symbol to obtain a first analysis result;

a second analysis module, configured to input the first analysis result as the initial data of the query statement on the right side of the pipe symbol, and to parse the query statement on the right side of the pipe symbol to obtain a second analysis result;

and a temporary display module, configured to display temporary results, including the first analysis result and the second analysis result, when the retrieval results are analyzed in parallel.

Technical Field

The invention relates to a log analysis method and device based on syntax tree parsing, and belongs to the technical field of data processing.

Background

In traditional operations and maintenance (O&M), logs are as important as metrics and application traces: they support troubleshooting, monitoring, security auditing, compliance, and backtracking. Enterprises can also analyze log data to generate operations reports, so logs have great potential analytical value. As enterprises' use of microservice architectures and cloud-native technology matures, IT architectures grow more complex, logs are generated faster, and data volumes become too large to analyze manually; most of the unique content in log data cannot be identified directly by a person. Analyzing logs and finding relevant content is therefore increasingly difficult.

In O&M, logs are commonly processed, stored, and analyzed with big data components such as Hadoop, HBase, and Hive in a distributed big data architecture, but users often have to write a large number of complex big data processing programs to process the data and analyze problems. During analysis, a user usually settles on the final analysis requirement only after revising the analysis target many times, so analyzing logs with a traditional distributed big data architecture greatly increases time cost and learning cost.

Disclosure of Invention

To solve the problems of high complexity and high learning cost of log analysis in the related art, the invention provides a log analysis method and device based on syntax tree parsing, which performs statistical analysis of log data by combining a plurality of defined common operation instructions and calculation functions with syntax tree parsing.

The technical scheme of the invention is as follows. In a first aspect, the invention provides a log analysis method based on syntax tree parsing, the method comprising:

parsing a plurality of query statements input by a user by using a syntax tree parser to obtain a plurality of domain-specific languages, wherein the query statements input by the user comprise a plurality of preset common operation instructions and calculation functions;

the parsing of the plurality of query statements yielding a plurality of types of domain-specific languages;

performing parallel retrieval for each query-class domain-specific language;

performing parallel statistical analysis on the results of the parallel retrieval of each query-class domain-specific language by using the analysis-class domain-specific language;

and aggregating the parallel analysis results to generate an operations analysis dashboard.

In another aspect, performing parallel retrieval for each query-class domain-specific language includes:

acquiring, through a search box, a plurality of query statements input by a user;

and performing parallel retrieval on log data distributed across different machines by means such as keyword matching, wildcard matching, phrase matching, field matching, and range matching.
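The matching modes listed above can be illustrated with a minimal Python sketch. The source does not define the exact semantics of each mode, so the behavior below is an assumption: each mode is modeled as a predicate over a log record, and the record layout (`message`, `status`, `host`), the sample data, and all function names are hypothetical.

```python
import fnmatch
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Predicate = Callable[[Record], bool]


def keyword(term: str) -> Predicate:
    """Keyword match: the term equals one whitespace-separated token of the message."""
    return lambda rec: term.lower() in rec["message"].lower().split()


def wildcard(pattern: str) -> Predicate:
    """Wildcard match: some token matches a shell-style pattern such as 'time*'."""
    return lambda rec: any(
        fnmatch.fnmatchcase(tok, pattern.lower())
        for tok in rec["message"].lower().split()
    )


def phrase(text: str) -> Predicate:
    """Phrase match, simplified here to a case-insensitive substring test."""
    return lambda rec: text.lower() in rec["message"].lower()


def field_equals(name: str, value: Any) -> Predicate:
    """Field match: the named field equals the given value."""
    return lambda rec: rec.get(name) == value


def in_range(name: str, low: float, high: float) -> Predicate:
    """Range match: the named field lies in the closed interval [low, high]."""
    def check(rec: Record) -> bool:
        value = rec.get(name)
        return value is not None and low <= value <= high
    return check


def search(records: List[Record], predicate: Predicate) -> List[Record]:
    """Return the records that satisfy the predicate, in their original order."""
    return [rec for rec in records if predicate(rec)]


# Hypothetical sample data, for illustration only.
LOGS: List[Record] = [
    {"message": "GET /index.html ok", "status": 200, "host": "web-1"},
    {"message": "connection timeout on upstream", "status": 504, "host": "web-2"},
    {"message": "disk almost full", "status": 500, "host": "db-1"},
]

server_errors = search(LOGS, in_range("status", 500, 599))
```

Modeling every mode as a predicate keeps the retrieval step uniform: the same `search` function can be run against each machine's share of the log data regardless of which matching mode the query uses.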

In another aspect, performing parallel analysis on each retrieval result by using the analysis-class domain-specific language includes:

obtaining the type of the operation instruction and/or calculation function corresponding to each domain-specific language;

outputting an analysis-class domain-specific language according to the instruction and/or function type;

and performing parallel statistical analysis for each analysis-class domain-specific language.

In another aspect, in another possible embodiment of the invention, the method further includes:

acquiring, through a search box, a plurality of query statements input by a user;

and combining, passing, and nesting the query statements through pipe symbols, wherein one or more pipe symbols can be used for combined retrieval and analysis.

In another aspect, in another practical implementation of the invention, performing parallel analysis on each retrieval result further includes:

parsing and evaluating the query statement on the left side of each pipe symbol to obtain a first analysis result;

inputting the first analysis result as the initial data of the query statement on the right side of the pipe symbol, and parsing the query statement on the right side of the pipe symbol to obtain a second analysis result;

and displaying temporary results, including the first analysis result and the second analysis result, when the retrieval results are analyzed in parallel.

In a second aspect, the invention further provides a log analysis device based on syntax tree parsing, the device including:

a syntax tree parsing module, configured to parse the query statements input by a user to obtain a plurality of domain-specific languages, wherein the query statements input by the user comprise combinations of a plurality of preset common operation instructions and calculation functions;

a parallel retrieval module, configured to perform parallel retrieval for each query-class domain-specific language;

a parallel analysis module, configured to perform parallel analysis on the results of the parallel retrieval of each query-class domain-specific language by using the analysis-class domain-specific language;

a result aggregation module, configured to aggregate the parallel analysis results uniformly;

and a dashboard generation module, configured to perform statistical calculation and aggregation on the parallel analysis results to generate an operations analysis dashboard.

In the foregoing device, the syntax tree parsing module includes:

an instruction parsing submodule, configured to determine the type of the instruction and/or function corresponding to each domain-specific language in the query language;

an instruction conversion submodule, configured to determine, according to the type of the instruction and/or function, whether the domain-specific language is converted into a query-class or an analysis-class domain-specific language when it is analyzed in parallel;

a statement acquisition module, configured to acquire a plurality of query statements input by a user in a search box;

and a statement combination module, configured to combine, pass, and nest the query statements through pipe symbols, wherein one or more pipe symbols can be used for combined retrieval and analysis.

For parsing each query statement, the device further includes:

a first analysis module, configured to parse and evaluate the query statement on the left side of each pipe symbol to obtain a first analysis result;

a second analysis module, configured to input the first analysis result as the initial data of the query statement on the right side of the pipe symbol, and to parse the query statement on the right side of the pipe symbol to obtain a second analysis result;

and a temporary display module, configured to display temporary results, including the first analysis result and the second analysis result, when the retrieval results are analyzed in parallel.

The beneficial effects of the invention are as follows. The most common instructions and functions are analyzed, defined, and abstracted from log data to form a set, and the query language input by the user is parsed with a syntax tree to form domain-specific languages for retrieval and analysis. Users can flexibly combine, compute with, and nest these instructions and functions, and can analyze logs with the query language without a programming background. This removes the current need for users to write complex big data analysis programs and makes log analysis flexible and easy to use.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.

FIG. 1 is a flow diagram illustrating a method for log analysis based on syntax tree parsing in accordance with an exemplary embodiment;

FIG. 2 is a flow diagram illustrating query implementation in accordance with an illustrative embodiment;

FIG. 3 is a flowchart illustrating sequential querying in accordance with an illustrative embodiment;

FIG. 4 is a block diagram illustrating a log analysis device based on syntax tree parsing in accordance with an exemplary embodiment.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.

The invention relates to a log analysis method and device based on syntax tree parsing, mainly applied to scenarios in which logs are flexibly used for value mining and operations analysis. The basic idea is as follows: the most common instructions and functions are abstracted from an analysis of logs to form a set containing dozens, hundreds, or more instructions and functions drawn from fields such as statistics, machine learning, and data visualization. Users can flexibly combine, compute with, and nest these instructions and functions, and can analyze logs with the query language without a programming background. This removes the current need for users to write complex big data analysis programs and makes log analysis flexible and easy to use.

This embodiment is applicable to a terminal with a central processing unit (CPU) that performs log analysis based on syntax tree parsing. The method may be executed by a device with a CPU; the device may be implemented in hardware and is generally integrated in a terminal, for example a storage device holding the packaged program. The method is executed after access to the log data of the device to be analyzed has been set up. FIG. 1 is a flowchart of the log analysis method based on syntax tree parsing of the invention; the method includes the following steps:

In step 11, a syntax tree parser parses a plurality of query statements input by a user to obtain a plurality of domain-specific languages, wherein the query statements comprise combinations of a plurality of preset instructions and/or functions.

All steps of this embodiment may be executed by a CPU or by an independent log analysis module, which splits the query statements input by the user into built-in instructions and functions, obtains a plurality of domain-specific languages through syntax tree parsing, and uses them to retrieve and analyze the logs, helping the user understand the value of the logs.

The query input by the user may consist of a plurality of query statements joined by pipe symbols, where each query statement comprises one or more combinations of instructions and functions, and there may be one or more pipe symbols. The instructions and functions are supported in advance by the log analysis module: dozens, hundreds, or more of them are summarized and abstracted from fields such as statistics, machine learning, and data visualization. New instruction and function types can be added for specific log types or new log analysis scenarios.
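As a rough illustration of step 11, the Python sketch below splits a piped query into stages and labels each stage as query-class or analysis-class according to its leading instruction. The source does not enumerate its instructions or define its grammar, so the instruction names (`search`, `stats`, `sort`, `top`), the `Stage` type, and the quoting rules are all hypothetical. The flat list of stages stands in for the simplest possible syntax tree, a pipeline without nesting.

```python
import shlex
from dataclasses import dataclass
from typing import List

# Hypothetical instruction sets; the source does not list its instructions.
QUERY_INSTRUCTIONS = {"search"}
ANALYSIS_INSTRUCTIONS = {"stats", "sort", "top"}


@dataclass
class Stage:
    kind: str          # "query" or "analysis"
    instruction: str   # leading instruction of the stage
    args: List[str]    # remaining tokens of the stage


def split_pipes(statement: str) -> List[str]:
    """Split on '|' while leaving pipes inside double quotes untouched."""
    parts: List[str] = []
    buf: List[str] = []
    in_quotes = False
    for ch in statement:
        if ch == '"':
            in_quotes = not in_quotes
            buf.append(ch)
        elif ch == "|" and not in_quotes:
            parts.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf).strip())
    return parts


def parse(statement: str) -> List[Stage]:
    """Turn a piped query into an ordered list of classified stages."""
    stages: List[Stage] = []
    for segment in split_pipes(statement):
        if not segment:
            raise ValueError("empty pipeline segment")
        tokens = shlex.split(segment)  # honors double-quoted arguments
        head, args = tokens[0], tokens[1:]
        if head in QUERY_INSTRUCTIONS:
            kind = "query"
        elif head in ANALYSIS_INSTRUCTIONS:
            kind = "analysis"
        else:
            raise ValueError(f"unknown instruction: {head}")
        stages.append(Stage(kind, head, args))
    return stages


stages = parse('search "a|b" status:500 | stats count by host | sort -count')
```

Classifying by the leading instruction mirrors the description's idea that the instruction type decides whether a statement becomes a query-class or an analysis-class domain-specific language; supporting a new instruction then only requires adding it to one of the two sets.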

In step 12, parallel retrieval is performed for each query-class domain-specific language.

Each query-class domain-specific language obtained is passed to the parallel retrieval module. Parallel retrieval assigns an independent thread to each batch of retrieval tasks to run it and obtain its retrieval results, which improves retrieval efficiency when there are multiple query-class domain-specific languages; running the tasks on separate threads also helps reduce the performance impact that queries have on one another.
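The thread-per-batch retrieval described in step 12 might look like the following Python sketch, which uses the standard library's `concurrent.futures.ThreadPoolExecutor`. The source does not specify its threading model beyond independent threads per batch, so the shard layout, the function names, and the worker count are assumptions; each in-memory list stands in for the log data held on one machine.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Predicate = Callable[[Record], bool]


def retrieve_from_shard(shard: List[Record], predicate: Predicate) -> List[Record]:
    """Scan one shard and keep the records that match the query predicate."""
    return [rec for rec in shard if predicate(rec)]


def parallel_retrieve(
    shards: List[List[Record]],
    predicate: Predicate,
    max_workers: int = 4,
) -> List[Record]:
    """Run one retrieval task per shard on a thread pool and merge the hits."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so the merged
        # output is deterministic even though the tasks run concurrently.
        per_shard = pool.map(lambda s: retrieve_from_shard(s, predicate), shards)
        return [rec for hits in per_shard for rec in hits]


# Hypothetical shards, for illustration only.
shards = [
    [{"msg": "ok", "status": 200}, {"msg": "fail", "status": 500}],
    [{"msg": "bad gateway", "status": 502}],
    [],
]
hits = parallel_retrieve(shards, lambda rec: rec["status"] >= 500)
```

In CPython, threads mainly help when each task waits on I/O, for example fetching from a remote shard; a CPU-bound scan of in-memory data such as this toy example would see little speedup from threads alone.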

In step 13, parallel statistical analysis is performed on the results of the parallel retrieval of each query-class domain-specific language by using the analysis-class domain-specific language.

Analysis instructions and functions abstracted from statistics and machine learning methodologies are applied to the parallel retrieval results to perform statistics, grouping, calculation, concatenation, detection, and prediction, yielding the parallel analysis results. Parallel analysis is executed in a manner similar to parallel retrieval, with independent threads assigned to run the tasks.
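One way to picture step 13 is a grouped count computed per partition in parallel and then merged. This is only a sketch under assumptions: the source names statistics and grouping among its analysis operations but does not define any concrete function, so `count_by`, `parallel_count_by`, and the sample partitions are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List

Record = Dict[str, Any]


def count_by(records: List[Record], field: str) -> Counter:
    """Count the records of one partition, grouped by the given field."""
    return Counter(rec[field] for rec in records)


def parallel_count_by(partitions: List[List[Record]], field: str) -> Counter:
    """Compute a partial count per partition on separate threads, then merge."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda part: count_by(part, field), partitions))
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return total


# Hypothetical retrieval results, one list per retrieval task.
partitions = [
    [{"host": "a"}, {"host": "b"}],
    [{"host": "a"}],
    [],
]
totals = parallel_count_by(partitions, "host")
```

Counting is a convenient example because partial counts can be merged by simple addition; an operation such as a median would need a different merge strategy.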

In step 14, the parallel analysis results are aggregated to generate an operation analysis dashboard.

In this method, the instructions and functions for log analysis are built into the log analysis module, and the user can freely combine the abstracted instructions and functions through pipe symbols, following statistical principles and machine learning methodology. When the user inputs a complex combined query statement to analyze logs, the syntax tree parser parses it into domain-specific languages that can be associated with one another one by one, and querying and calculation are carried out through parallel retrieval and parallel analysis. This greatly reduces the user's workload when analyzing logs and improves efficiency, so that the user can assemble a solution to a problem quickly and easily and apply it flexibly.

FIG. 2 shows the query process for a query statement according to an embodiment of the invention. One or more query statements are recognized and parsed by the syntax tree parsing module and split into query-class and analysis-class domain-specific languages. The query-class domain-specific languages are passed to the parallel retrieval module, which performs distributed parallel data retrieval to speed up retrieval. In one case, if parsing yields only query-class domain-specific languages and no analysis-class domain-specific language, the parallel retrieval results are aggregated directly by the result aggregation module, which sorts and merges them; the query then ends and the collected results are returned to the user. In the other case, if parsing yields both query-class and analysis-class domain-specific languages, the parallel retrieval results are passed to the parallel analysis module, which analyzes them in parallel using the analysis-class domain-specific languages; the parallel analysis results are then aggregated by the result aggregation module, which sorts and merges them, after which the query ends and the collected results are returned to the user.
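The two branches of the FIG. 2 flow can be sketched in Python as a single function with an optional analysis step. This is a simplified illustration, not the patented implementation: retrieval and analysis run sequentially here for brevity, and the function name, the `ts` sort field, and the sample shards are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]


def run_query(
    shards: List[List[Record]],
    query: Callable[[Record], bool],
    analysis: Optional[Callable[[List[Record]], List[Record]]] = None,
    sort_key: str = "ts",
) -> List[Record]:
    """Retrieve, optionally analyze, then aggregate (merge and sort) the results."""
    # Retrieval step (sequential here; the description runs it in parallel).
    retrieved = [[rec for rec in shard if query(rec)] for shard in shards]

    if analysis is None:
        # Query-only branch: retrieval results go straight to aggregation.
        partials = retrieved
    else:
        # Query-plus-analysis branch: each retrieval result is analyzed first.
        partials = [analysis(hits) for hits in retrieved]

    # Aggregation: merge the partial results and sort them before returning.
    merged = [rec for part in partials for rec in part]
    return sorted(merged, key=lambda rec: rec[sort_key])


# Hypothetical shards, for illustration only.
shards = [
    [{"ts": 30, "level": "error"}, {"ts": 10, "level": "info"}],
    [{"ts": 20, "level": "error"}, {"ts": 5, "level": "error"}],
]


def is_error(rec: Record) -> bool:
    return rec["level"] == "error"


query_only = run_query(shards, is_error)
first_per_shard = run_query(shards, is_error, analysis=lambda hits: hits[:1])
```

Passing `analysis=None` models a query that parsed into query-class languages only; supplying a callable models a query that also produced an analysis-class language.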

FIG. 3 is a flowchart of sequential querying according to a specific embodiment of the invention. When parallel retrieval and analysis are performed for each query statement according to the domain-specific languages, the method further includes sequential querying of the combined query statement, which may include the following steps:

In step 31, the query statement on the left side of each pipe symbol is parsed and evaluated to obtain a first analysis result.

In step 32, the first analysis result is input as the initial data of the query statement on the right side of the pipe symbol, and the query statement on the right side of the pipe symbol is parsed to obtain a second analysis result.

In step 33, temporary results, including the first analysis result and the second analysis result, are displayed when the retrieval results are analyzed in parallel.

In the exemplary embodiment of the invention, the pipe symbol is written "|" and connects different query statements. Its form and function are equivalent to the pipe operator in bash: the result on the left side of "|" can be used by the retrieval and statistics combination on the right side of "|". During retrieval, the user can also have the results of the query statement displayed stage by stage, split at the pipe symbols.

By using the pipe symbol as the connecting command between two sequentially executed query statements, the sequential query method of the invention makes the whole complex query process more orderly. It also makes it easier to show the user how the query statement runs, so that the user can follow the execution and analyze logs more easily along the way.
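The left-to-right pipe semantics and the staged display of temporary results can be sketched in Python as follows. This is an illustration under assumptions: each stage is modeled as a labeled function, and the labels, the stage functions, and the sample records are hypothetical rather than taken from the source.

```python
from collections import Counter
from typing import Any, Callable, List, Tuple

# A stage pairs a display label with the function that evaluates it.
Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(source: Any, stages: List[Stage]) -> Tuple[Any, List[Tuple[str, Any]]]:
    """Feed each stage's output to the next and record every intermediate result."""
    trace: List[Tuple[str, Any]] = []
    current = source
    for label, fn in stages:
        current = fn(current)            # left result becomes the right stage's input
        trace.append((label, current))   # kept so the result can be shown per stage
    return current, trace


# Hypothetical records and stages, for illustration only.
logs = [
    {"host": "a", "status": 500},
    {"host": "b", "status": 200},
    {"host": "a", "status": 503},
]
stages: List[Stage] = [
    ("search status>=500", lambda recs: [r for r in recs if r["status"] >= 500]),
    ("stats count by host", lambda recs: dict(Counter(r["host"] for r in recs))),
]
final, trace = run_pipeline(logs, stages)
```

Here `trace[0]` plays the role of the first analysis result and `trace[1]` the second, which is what a staged display would present to the user after each pipe symbol.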

FIG. 4 is a schematic structural diagram of a log analysis device based on syntax tree parsing according to an embodiment of the invention; the device is implemented in software and is generally integrated in a terminal. As shown in the figure, and building on the above embodiments, the device mainly includes a syntax tree parsing module 41, a parallel retrieval module 42, a parallel analysis module 43, a result aggregation module 44, and a dashboard generation module 45.

The syntax tree parsing module 41 is configured to parse the query statements input by a user to obtain a plurality of query-class and/or analysis-class domain-specific languages, where the query statements include combinations of a plurality of preset instructions and/or functions.

The parallel retrieval module 42 is configured to perform parallel retrieval for each query-class domain-specific language.

The parallel analysis module 43 is configured to perform parallel analysis using each analysis-class domain-specific language.

The result aggregation module 44 is configured to aggregate and output the parallel analysis results.

The dashboard generation module 45 is configured to display the output results and generate an operations analysis dashboard.

The syntax tree parsing module according to an exemplary embodiment of the invention includes:

an instruction parsing submodule, configured to determine the type of the instruction and/or function corresponding to each domain-specific language in the query language;

an instruction conversion submodule, configured to determine, according to the type of the instruction and/or function, whether the domain-specific language is converted into a query-class or an analysis-class domain-specific language when it is analyzed in parallel;

a statement acquisition module, configured to acquire a plurality of query statements input by a user in a search box;

and a statement combination module, configured to combine, pass, and nest the query statements through pipe symbols, wherein one or more pipe symbols can be used for combined retrieval and analysis.

In a feasible implementation scenario of the exemplary embodiment, for parallel analysis of each query statement according to the domain-specific languages, the device further includes:

a first analysis module, configured to parse and evaluate the query statement on the left side of each pipe symbol to obtain a first analysis result;

a second analysis module, configured to input the first analysis result as the initial data of the query statement on the right side of the pipe symbol, and to parse the query statement on the right side of the pipe symbol to obtain a second analysis result;

and a temporary display module, configured to display temporary results, including the first analysis result and the second analysis result, when the retrieval results are analyzed in parallel.

While the invention has been described in detail with reference to the embodiments shown in the drawings, it is not limited to those embodiments, and those skilled in the art can make various changes within their knowledge without departing from the spirit of the invention.
