Multi-dimensional statistical analysis method for multi-layer grouping

Document No.: 361618 Publication date: 2021-12-07

Abstract: This technology, "Multi-dimensional statistical analysis method for multi-layer grouping" (一种多层分组的多维统计分析方法), was designed by 陈波, 杜易霖 and 余智华 on 2021-04-30. The invention discloses a multi-dimensional statistical analysis method for multi-layer grouping, comprising the following steps: S1, determining the table to be analyzed according to the table ID parameter; S2, obtaining the table data used for grouping and statistics from the table; S3, converting the table data according to the converter parameter Transform, the converted data serving as the input of step S4; S4, constructing a tree node as the root node for grouping and statistics; S5, starting from the root node, grouping the data of the current node by the field specified in the grouper parameter; S6, constructing a tree node for each group and adding it to the tree as a child node of the current node; S7, obtaining, from the data of the nodes, table data represented as a tree structure based on multi-layer grouping. Beneficial effects: the invention performs multi-dimensional statistics on table data, provides a flexible and extensible statistical analysis function, and can be used in a variety of statistical analysis applications.

1. A multi-dimensional statistical analysis method for multi-layer grouping, comprising the following steps:

S1, determining the table to be analyzed according to the table ID parameter;

S2, obtaining the table data used for grouping and statistics from the table;

S3, converting the table data according to the converter parameter Transform, the converted data serving as the input of step S4;

S4, constructing a tree node as the root node for grouping and statistics;

S5, starting from the root node, grouping the data of the current node by the field specified in the grouper parameter;

S6, constructing a tree node for each group and adding it to the tree as a child node of the current node;

S7, obtaining, from the data of the nodes, table data represented as a tree structure based on multi-layer grouping;

S8, for each leaf-node group, performing statistical calculation on the data of that group according to the statistical function specified in the statistics calculator parameter; and

S9, outputting the entire tree structure as the result.

2. The multi-dimensional statistical analysis method for multi-layer grouping according to claim 1, wherein step S2 of obtaining the table data used for grouping and statistics from the table comprises the following steps:

S21, constructing a query statement according to the filter parameter Query, and filtering data from the table; and

S22, obtaining the table data used for grouping and statistics from the filtered data.

3. The multi-dimensional statistical analysis method for multi-layer grouping according to claim 1, wherein step S7 of obtaining, from the data of the nodes, table data represented as a tree structure based on multi-layer grouping comprises the following steps:

S71, grouping the data of each node by the field specified in the grouper parameter for the level at which the node is located; and

S72, repeating step S6 until all grouping is complete, thereby forming table data represented as a tree structure based on multi-layer grouping.

4. The multi-dimensional statistical analysis method for multi-layer grouping according to claim 1, wherein in step S8, when statistical calculation is performed on the data of each leaf-node group according to the statistical function specified in the statistics calculator parameter, the calculation result is stored in the leaf node that holds the data of that group.

5. The multi-dimensional statistical analysis method for multi-layer grouping according to claim 1, wherein in step S9, when the entire tree structure is output as the result, the grouped data of each node is removed from the output, and only the group identifier id, the field name key used for the next layer of grouping, and the result data of the next layer of grouping are retained;

for the root node, id is null;

for a leaf node, key is null, and data holds the statistical calculation result.

Technical Field

The invention relates to the field of data analysis, in particular to a multi-dimensional statistical analysis method for multi-layer grouping.

Background

In the era of data science, massive amounts of data need to be analyzed. Statistical analysis is a simple and effective form of data analysis that helps people understand the distribution of data intuitively, so that conclusions can be reached quickly or business decisions can be supported.

Common statistical analysis tools require parameters and processing flows to be designed separately for each type of chart.

Data analysis services therefore need a flexible and uniform statistical analysis method.

No effective solution to this problem in the related art has yet been proposed.

Disclosure of Invention

The object of the present invention is to provide a multi-dimensional statistical analysis method for multi-layer grouping that solves the problems described in the background above.

To achieve this object, the invention provides the following technical solution:

A multi-dimensional statistical analysis method for multi-layer grouping, comprising the following steps:

S1, determining the table to be analyzed according to the table ID parameter;

S2, obtaining the table data used for grouping and statistics from the table;

S3, converting the table data according to the converter parameter Transform, the converted data serving as the input of step S4;

S4, constructing a tree node as the root node for grouping and statistics;

S5, starting from the root node, grouping the data of the current node by the field specified in the grouper parameter;

S6, constructing a tree node for each group and adding it to the tree as a child node of the current node;

S7, obtaining, from the data of the nodes, table data represented as a tree structure based on multi-layer grouping;

S8, for each leaf-node group, performing statistical calculation on the data of that group according to the statistical function specified in the statistics calculator parameter; and

S9, outputting the entire tree structure as the result.

Further, step S2 of obtaining the table data used for grouping and statistics from the table comprises the following steps:

S21, constructing a query statement according to the filter parameter Query, and filtering data from the table; and

S22, obtaining the table data used for grouping and statistics from the filtered data.

Further, step S7 of obtaining, from the data of the nodes, table data represented as a tree structure based on multi-layer grouping comprises the following steps:

S71, grouping the data of each node by the field specified in the grouper parameter for the level at which the node is located; and

S72, repeating step S6 until all grouping is complete, thereby forming table data represented as a tree structure based on multi-layer grouping.

Further, in step S8, when statistical calculation is performed on the data of each leaf-node group according to the statistical function specified in the statistics calculator parameter, the calculation result is stored in the leaf node that holds the data of that group.

Further, in step S9, when the entire tree structure is output as the result, the grouped data of each node is removed from the output, and only the group identifier id, the field name key used for the next layer of grouping, and the result data of the next layer of grouping are retained;

for the root node, id is null;

for a leaf node, key is null, and data holds the statistical calculation result.

Compared with the prior art, the invention has the following beneficial effects: the invention performs multi-dimensional statistics on table data. A user specifies the table data, the chart type and the chart parameters as input, and the method outputs the statistical analysis result. The method performs statistical analysis on two-dimensional table data and represents the statistical results uniformly as a multi-level tree structure, which provides a flexible and extensible statistical analysis function that can be used in a variety of statistical analysis applications.

Drawings

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described here show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.

FIG. 1 is the first flowchart of a multi-dimensional statistical analysis method for multi-layer grouping according to an embodiment of the present invention;

FIG. 2 is the second flowchart of a multi-dimensional statistical analysis method for multi-layer grouping according to an embodiment of the present invention;

FIG. 3 is a table of the parameter design of a multi-dimensional statistical analysis method for multi-layer grouping according to an embodiment of the present invention;

FIG. 4 is a table of the output data structure of step S9 of a multi-dimensional statistical analysis method for multi-layer grouping according to an embodiment of the present invention;

FIG. 5 is a table showing a set of example table data with 3 columns and 5 rows for a multi-dimensional statistical analysis method for multi-layer grouping according to an embodiment of the present invention.

Detailed Description

Before the present invention is described further, the terms used in it are briefly explained as follows:

Tabular data

A structured form of data representation. Common database tables, Excel sheets and CSV files can all be processed as table data. A table has a table name, a header that defines the data structure of the table, and table row data, which is a collection of data instances conforming to the header definition.

Statistical analysis

Performing statistical calculation on a group of data instances to obtain summary information. Common statistical calculations include count, sum, maximum, minimum, average and variance.

Filter

A rule function used to filter and screen table data. The input of the filter is a row of data and the output is a Boolean value.

Converter

A function for converting table data. The input of the converter is a row of data and the output is another row of data.

Grouper

A function for grouping table data. The input of the grouper is a set of row data and the output is a number of groups. Each group is a subset of the input data, no two groups intersect, and the union of all groups equals the input.

Statistics calculator

A function for performing statistical calculation on a set of row data. The input of the statistics calculator is a group of row data and the output is a statistical value.
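The four function roles defined above can be sketched as Python callables. This is only an illustrative sketch: the type aliases `Row`, `Filter`, `Converter`, `Grouper` and `Statistic`, the helper `group_by_field`, and the sample rows are all hypothetical names and data introduced here, not part of the source.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical type aliases mirroring the term definitions above.
Filter = Callable[[Row], bool]                          # row -> keep or drop
Converter = Callable[[Row], Row]                        # row -> another row
Grouper = Callable[[List[Row]], Dict[Any, List[Row]]]   # rows -> disjoint groups
Statistic = Callable[[List[Row]], Any]                  # rows -> one value


def group_by_field(field: str) -> Grouper:
    """Build a grouper that partitions rows by the value of one field."""
    def grouper(rows: List[Row]) -> Dict[Any, List[Row]]:
        groups: Dict[Any, List[Row]] = defaultdict(list)
        for row in rows:
            groups[row[field]].append(row)
        return dict(groups)
    return grouper


# Made-up sample rows, for illustration only.
rows = [
    {"city": "A", "amount": 3},
    {"city": "B", "amount": 5},
    {"city": "A", "amount": 7},
]

is_large: Filter = lambda row: row["amount"] > 4
add_double: Converter = lambda row: {**row, "double": row["amount"] * 2}
count: Statistic = len

groups = group_by_field("city")(rows)
```

Because every row lands in exactly one group, the groups are disjoint and their union equals the input, which is the partition property the grouper definition requires.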

The invention is further described below with reference to the drawings and the detailed description.

Referring to FIGS. 1-2, a multi-dimensional statistical analysis method for multi-layer grouping according to an embodiment of the present invention includes the following steps:

Step S1, determining the table to be analyzed according to the table ID parameter;

Step S2, obtaining the table data used for grouping and statistics from the table;

S21, constructing a query statement according to the filter parameter Query, and filtering data from the table;

S22, obtaining the table data used for grouping and statistics;

Step S3, converting the table data according to the converter parameter Transform, the converted data serving as the input of the next step;

Step S4, constructing a tree node as the root node for grouping and statistics;

Step S5, starting from the root node, grouping the data of the current node by the field specified in the grouper parameter;

Step S6, constructing a tree node for each group and adding it to the tree as a child node of the current node;

Step S7, obtaining, from the data of the nodes, table data represented as a tree structure based on multi-layer grouping;

S71, grouping the data of each node by the field specified in the grouper parameter for the level at which the node is located;

S72, returning to step S6 until all grouping is complete, thereby forming table data represented as a tree structure based on multi-layer grouping;

Step S8, for each leaf-node group, performing statistical calculation on the data of that group according to the statistical function specified in the statistics calculator parameter, and storing the calculation result in the leaf node that holds the data of that group;

Step S9, outputting the entire tree structure as the result.

In the output, the grouped data of each node is removed, and only the group identifier id, the field name key used for the next layer of grouping, and the result data of the next layer of grouping are retained. For the root node, id is null; for a leaf node, key is null, and data holds the statistical calculation result.

The "multi-layer" aspect of the invention lies in grouping layer by layer using a tree structure. The "multi-dimensional" aspect lies in the fact that statistical analysis can be performed on the leaf nodes along multiple dimensions.
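Steps S4 to S9 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Node` class, the functions `build_tree`, `_split` and `to_output`, and the sample rows are hypothetical, and the sample rows are not the data of FIG. 5. The output field names id, key and data follow the description of step S9.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


class Node:
    """Hypothetical tree node holding the rows of one group."""
    def __init__(self, id: Any, rows: List[Row]):
        self.id = id                      # group identifier (None for the root)
        self.key: Optional[str] = None    # field used to split this node
        self.rows = rows                  # grouped data, dropped on output
        self.children: List["Node"] = []
        self.stat: Any = None             # statistic stored on leaf nodes


def build_tree(rows: List[Row], group_fields: List[str],
               statistic: Callable[[List[Row]], Any]) -> Node:
    root = Node(None, rows)                       # S4: root node
    _split(root, group_fields, 0, statistic)      # S5-S8
    return root


def _split(node: Node, fields: List[str], level: int,
           statistic: Callable[[List[Row]], Any]) -> None:
    if level == len(fields):                      # leaf: S8, store the result
        node.stat = statistic(node.rows)
        return
    node.key = fields[level]                      # S5/S71: field for this level
    groups: Dict[Any, List[Row]] = defaultdict(list)
    for row in node.rows:
        groups[row[node.key]].append(row)
    for value, group_rows in groups.items():      # S6: one child per group
        child = Node(value, group_rows)
        node.children.append(child)
        _split(child, fields, level + 1, statistic)   # S72: repeat per level


def to_output(node: Node) -> Dict[str, Any]:
    """S9: keep only id, key and data; the grouped rows are removed."""
    if node.key is None:
        data = node.stat
    else:
        data = [to_output(child) for child in node.children]
    return {"id": node.id, "key": node.key, "data": data}


# Made-up rows for illustration; grouped by X1, then X2, counting rows.
rows = [
    {"X1": "a", "X2": "p", "Y": 1},
    {"X1": "a", "X2": "q", "Y": 2},
    {"X1": "a", "X2": "p", "Y": 3},
    {"X1": "b", "X2": "p", "Y": 4},
]
result = to_output(build_tree(rows, ["X1", "X2"], len))
```

With these sample rows, the root has id `None` and key `"X1"`, and the leaf reached through X1 = "a" and X2 = "p" carries the count 2 in its data field. Recursion is used here for brevity; a level-by-level loop would match the wording of S71 and S72 equally well.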

To make the technical solution of the present invention easier to understand, its working principle and mode of operation in practice are described in detail below.

In practice, the parameters used in the above steps are designed as shown in FIG. 3. The complex data structures among these parameters are defined as follows:

Filter data structure definition:

Filter := { // JSON object structure

(field: Match)*

}

field := <data field name>

Match := value | List[Match] | Operator

Operator := SimpleOper | ComplexOper

SimpleOper := {sop: value}

ComplexOper := {cop: List[Operator]}

sop := "$lt" | "$gt" | "$eq" | "$lte" | "$gte" | "$regex"

cop := "$and" | "$or" | "$not"

value := string-value | number-value

string-value := <JSON string>

number-value := <JSON number>
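One way to evaluate a filter of this shape against a row is sketched below. The source defines only the grammar, so the semantics here are assumptions: a bare value means equality, a list of matches means "any of", several fields in one filter are combined with AND, and `$not` means "none of the listed operators match". The names `matches`, `_apply`, `row_passes` and the sample data are hypothetical.

```python
import re
from typing import Any, Dict

# Simple operators from the grammar, as (actual, expected) -> bool.
SIMPLE_OPS = {
    "$lt": lambda actual, expected: actual < expected,
    "$gt": lambda actual, expected: actual > expected,
    "$eq": lambda actual, expected: actual == expected,
    "$lte": lambda actual, expected: actual <= expected,
    "$gte": lambda actual, expected: actual >= expected,
    "$regex": lambda actual, expected: re.search(expected, str(actual)) is not None,
}


def matches(actual: Any, match: Any) -> bool:
    """Evaluate one Match (value, list of matches, or operator object)."""
    if isinstance(match, list):           # ASSUMPTION: a list means "any of"
        return any(matches(actual, m) for m in match)
    if isinstance(match, dict):           # Operator object
        return all(_apply(actual, op, arg) for op, arg in match.items())
    return actual == match                # ASSUMPTION: bare value means equality


def _apply(actual: Any, op: str, arg: Any) -> bool:
    if op in SIMPLE_OPS:
        return SIMPLE_OPS[op](actual, arg)
    results = [matches(actual, sub) for sub in arg]
    if op == "$and":
        return all(results)
    if op == "$or":
        return any(results)
    if op == "$not":                      # ASSUMPTION: none of the operators match
        return not any(results)
    raise ValueError(f"unknown operator: {op}")


def row_passes(row: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """ASSUMPTION: all fields must match; every filtered field exists in the row."""
    return all(matches(row[field], m) for field, m in flt.items())


# Made-up filter and rows for illustration.
flt = {
    "city": ["A", "B"],
    "amount": {"$and": [{"$gte": 5}, {"$lt": 10}]},
    "name": {"$regex": "^x"},
}
rows = [
    {"city": "A", "amount": 7, "name": "xa"},
    {"city": "C", "amount": 7, "name": "xb"},
    {"city": "B", "amount": 12, "name": "xc"},
]
kept = [row for row in rows if row_passes(row, flt)]
```

Only the first sample row passes: the second fails the city list and the third fails the upper bound on amount. In step S21 the source builds a query statement for the underlying table instead; evaluating the filter in memory, as here, is just one possible reading.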

Transform data structure definition:

Transform := {

field: <field to be converted>,

transform: <conversion function>,

outputField: <output field>

}

The following conversion functions are supported:

For the date type

YEAR: year

MONTH: month

DAY: day

HOUR: hour

MINUTE: minute

For the string type

UPPER: convert to upper case

LOWER: convert to lower case

LENGTH: get the length

CAPITAL: capitalize the first letter

For the numeric type

ROUND: round to the nearest integer

FLOOR: round down

CEILING: round up

ABS: absolute value

NEG: negated value

SQURE: squared value

SQRT: square root
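The conversion functions listed above map naturally onto a lookup table. The sketch below is one possible Python rendering, not the source's implementation: the `TRANSFORMS` table, the `apply_transform` helper and the sample row are hypothetical, the function names are taken from the list (in upper case), and the Python built-ins chosen for each name are assumptions about the intended behavior.

```python
import math
from datetime import datetime
from typing import Any, Callable, Dict

# Hypothetical lookup table from conversion function name to a Python callable.
TRANSFORMS: Dict[str, Callable[[Any], Any]] = {
    # date type
    "YEAR": lambda d: d.year,
    "MONTH": lambda d: d.month,
    "DAY": lambda d: d.day,
    "HOUR": lambda d: d.hour,
    "MINUTE": lambda d: d.minute,
    # string type
    "UPPER": str.upper,
    "LOWER": str.lower,
    "LENGTH": len,
    "CAPITAL": str.capitalize,   # note: also lower-cases the remaining letters
    # numeric type
    "ROUND": round,              # note: Python rounds halves to even
    "FLOOR": math.floor,
    "CEILING": math.ceil,
    "ABS": abs,
    "NEG": lambda x: -x,
    "SQURE": lambda x: x * x,    # spelling kept as in the source list
    "SQRT": math.sqrt,
}


def apply_transform(row: Dict[str, Any], spec: Dict[str, str]) -> Dict[str, Any]:
    """Return a new row with spec['outputField'] set to the converted value."""
    func = TRANSFORMS[spec["transform"]]
    return {**row, spec["outputField"]: func(row[spec["field"]])}


# Made-up row for illustration.
row = {"created": datetime(2021, 4, 30, 9, 15), "name": "alice", "score": -2.5}
with_year = apply_transform(
    row, {"field": "created", "transform": "YEAR", "outputField": "year"}
)
with_name = apply_transform(
    row, {"field": "name", "transform": "CAPITAL", "outputField": "display"}
)
with_floor = apply_transform(
    row, {"field": "score", "transform": "FLOOR", "outputField": "score_floor"}
)
```

Returning a new row rather than mutating the input matches the converter definition, whose output is "another row of data". Writing to a separate outputField also keeps the original field available for later grouping.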

Group data structure definition:

The following statistical functions are supported:

For all types

COUNT: simple count; returns the number of rows in the group

DIFF: distinct count; returns the number of distinct values

For the numeric type

SUM: sum

MIN: minimum

MAX: maximum

AVG: average

MEDIAN: median

VAR: variance

STD: standard deviation
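These statistical functions can be sketched with the Python standard library. The `STATISTICS` table, the `compute` helper and the sample rows are hypothetical. The source does not say whether VAR and STD are population or sample statistics; the population forms (`statistics.pvariance`, `statistics.pstdev`) are used here purely as an assumption.

```python
import statistics
from typing import Any, Callable, Dict, List

# Hypothetical lookup table from statistical function name to a callable
# that takes the list of field values of one group.
STATISTICS: Dict[str, Callable[[List[Any]], Any]] = {
    "COUNT": len,
    "DIFF": lambda values: len(set(values)),   # number of distinct values
    "SUM": sum,
    "MIN": min,
    "MAX": max,
    "AVG": statistics.mean,
    "MEDIAN": statistics.median,
    "VAR": statistics.pvariance,   # ASSUMPTION: population variance
    "STD": statistics.pstdev,      # ASSUMPTION: population standard deviation
}


def compute(rows: List[Dict[str, Any]], field: str, func_name: str) -> Any:
    """Apply one named statistical function to a field of a group of rows."""
    values = [row[field] for row in rows]
    return STATISTICS[func_name](values)


# Made-up group of rows for illustration.
group = [{"amount": 2}, {"amount": 4}, {"amount": 4}, {"amount": 6}]
summary = {name: compute(group, "amount", name) for name in STATISTICS}
```

Computing several functions over the same leaf group, as `summary` does, is one way to read the "multi-dimensional" analysis of leaf nodes described earlier.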

Sort data structure definition:

In addition, in one embodiment, FIG. 5 shows example table data with 3 columns and 5 rows. The example data are grouped by X1 and then by X2, the number of rows in each group is counted, and the statistical result is as follows (expressed in JSON):

In summary: the invention performs multi-dimensional statistics on table data. A user specifies the table data, the chart type and the chart parameters as input, and the method outputs the statistical analysis result. The method performs statistical analysis on two-dimensional table data and represents the statistical results uniformly as a multi-level tree structure, which provides a flexible and extensible statistical analysis function that can be used in a variety of statistical analysis applications.

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
