Data detection method and device based on artificial intelligence, server and storage medium

Document No.: 682762 Publication date: 2021-04-30

Reading note: This technology, "Data detection method and device based on artificial intelligence, server and storage medium", was designed by 陈桢博, 郑立颖 and 徐亮 on 2020-12-31. Main content: The application relates to artificial intelligence and provides a data detection method, device, server and storage medium based on artificial intelligence. The method comprises: acquiring current time sequence data of a target detection index; generating a current time sequence vector according to the current time sequence data, and determining a historical time sequence vector of historical time sequence data of the target detection index; associating the historical time sequence vector with the current time sequence vector through a first Transformer model to obtain an associated time sequence vector; inputting the associated time sequence vector to a second Transformer model to obtain predicted data of a next time sequence of the current time sequence data; and acquiring target time sequence data of the next time sequence of the current time sequence data, and performing anomaly detection on the target time sequence data through the predicted data and the target time sequence data. The application can improve the detection accuracy of abnormal data.

1. A data detection method based on artificial intelligence, applied to a server, wherein a time series data prediction model is stored in the server, the time series data prediction model comprises a first Transformer model and a second Transformer model which are trained in advance, and the method comprises the following steps:

acquiring current time sequence data of a target detection index;

generating a current time sequence vector according to the current time sequence data, and determining a historical time sequence vector of historical time sequence data of the target detection index;

associating the historical time sequence vector with the current time sequence vector through the first Transformer model to obtain an associated time sequence vector;

inputting the associated time sequence vector to the second Transformer model to obtain predicted data of a next time sequence of the current time sequence data;

and acquiring target time sequence data of the next time sequence of the current time sequence data, and performing anomaly detection on the target time sequence data through the prediction data and the target time sequence data.

2. The data detection method of claim 1, wherein the historical timing vector corresponds to a time length greater than or equal to a first preset time length, and the current timing vector corresponds to a time length less than or equal to a second preset time length.

3. The data detection method of claim 1, wherein the determining a historical timing vector of historical timing data of the target detection metric comprises:

acquiring historical time sequence data of the target detection index, and determining first time information corresponding to the historical time sequence data;

generating a first time sequence vector according to the historical time sequence data and first time information, and generating a first correction vector according to the historical time sequence data, the first time information and a preset function;

and splicing the first time sequence vector and the first correction vector to obtain a historical time sequence vector.

4. The data detection method according to claim 3, wherein there are k target detection indexes, the first time information includes m pieces of time point information of the historical time series data, and k and m are positive integers greater than or equal to 1;

generating a first time sequence vector according to the historical time sequence data and first time information includes:

and arranging the historical time sequence data of the k target detection indexes according to the m pieces of time point information to generate a k × m matrix vector, and taking the matrix vector as a first time sequence vector.

5. The data detection method of claim 3, wherein generating a first correction vector based on the historical timing data, the first time information, and a preset function comprises:

acquiring a preset function, and substituting the historical time sequence data and the first time information into the preset function to obtain the correction information of the historical time sequence data;

and generating a first correction vector according to the historical time sequence data and the correction information of the historical time sequence data.

6. The data detection method of any one of claims 1-5, wherein the first Transformer model comprises a Transformer operation layer and a feedforward neural network layer; the associating the historical time sequence vector with the current time sequence vector through the first Transformer model to obtain an associated time sequence vector, including:

processing the historical time sequence vector and the current time sequence vector through the Transformer operation layer to obtain associated time sequence data;

and mapping the associated time sequence data through the feedforward neural network layer to obtain an associated time sequence vector.

7. The data detection method of any one of claims 1-5, wherein the method further comprises:

acquiring a plurality of sample data, wherein the sample data comprises first time sequence data, second time sequence data and labeled time sequence data, and the acquisition time of the first time sequence data is later than that of the second time sequence data;

generating a first time sequence vector according to the first time sequence data, and generating a second time sequence vector according to the second time sequence data;

associating the first time sequence vector with the second time sequence vector through a first preset Transformer model to obtain a target time sequence vector;

inputting the target time sequence vector to a second preset Transformer model to obtain predicted time sequence data;

updating model parameters of the first preset Transformer model and the second preset Transformer model according to the predicted time sequence data and the marked time sequence data;

and continuing to perform iterative training on the updated first preset Transformer model and the updated second preset Transformer model until the first preset Transformer model and the second preset Transformer model are converged to obtain the time series data prediction model.

8. A data detection device based on artificial intelligence, applied to a server, wherein a time sequence data prediction model is stored in the server, the time sequence data prediction model comprises a first Transformer model and a second Transformer model which are trained in advance, and the data detection device comprises:

the acquisition module is used for acquiring current time sequence data of the target detection index;

the generation module is used for generating a current time sequence vector according to the current time sequence data and determining a historical time sequence vector of historical time sequence data of the target detection index;

the correlation module is used for associating the historical time sequence vector with the current time sequence vector through the first Transformer model to obtain an associated time sequence vector;

the prediction module is used for inputting the associated time sequence vector to the second Transformer model to obtain prediction data of a next time sequence of the current time sequence data;

the acquisition module is further configured to acquire target time sequence data of the next time sequence of the current time sequence data;

and the detection module is used for performing anomaly detection on the target time sequence data through the prediction data and the target time sequence data.

9. A server, characterized in that the server comprises a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the artificial intelligence based data detection method according to any one of claims 1 to 7.

10. A computer-readable storage medium, having a computer program stored thereon, wherein the computer program, when being executed by a processor, carries out the steps of the artificial intelligence based data detection method according to any one of claims 1 to 7.

Technical Field

The present application relates to the field of intelligent decision making technology in artificial intelligence, and in particular, to a data detection method, apparatus, server and storage medium based on artificial intelligence.

Background

With the continuous development of computer science and technology, more and more internet applications run on servers and generate a large amount of running data. Abnormal fluctuation of the running data may reflect an abnormal state of the application or the server hardware, so the server needs to monitor the running data in real time in order to find abnormal states of the application and the server hardware in time and raise an alarm. Currently, an anomaly detection model is responsible for monitoring indexes of a plurality of objects such as applications and hardware in a server system, so as to determine whether the applications and the hardware are in an abnormal state. However, a conventional anomaly detection model is trained using only historical time series data as training samples, so its predictions for the current time series data are prone to deviation, which reduces the accuracy of anomaly prediction and leads to poor detection of abnormal data.

Disclosure of Invention

The application mainly aims to provide a data detection method, a data detection device, a server and a storage medium based on artificial intelligence, and aims to improve the detection accuracy of abnormal data.

In a first aspect, the present application provides a data detection method based on artificial intelligence, which is applied to a server, where the server stores a time series data prediction model, and the time series data prediction model includes a first Transformer model and a second Transformer model trained in advance, and the method includes:

acquiring current time sequence data of a target detection index;

generating a current time sequence vector according to the current time sequence data, and determining a historical time sequence vector of historical time sequence data of the target detection index;

associating the historical time sequence vector with the current time sequence vector through the first Transformer model to obtain an associated time sequence vector;

inputting the associated time sequence vector to the second Transformer model to obtain predicted data of a next time sequence of the current time sequence data;

and acquiring target time sequence data of the next time sequence of the current time sequence data, and performing anomaly detection on the target time sequence data through the prediction data and the target time sequence data.

In a second aspect, the present application further provides an artificial intelligence-based data detection apparatus, which is applied to a server, where a time series data prediction model is stored in the server, the time series data prediction model includes a first Transformer model and a second Transformer model trained in advance, and the data detection apparatus includes:

the acquisition module is used for acquiring current time sequence data of the target detection index;

the generation module is used for generating a current time sequence vector according to the current time sequence data and determining a historical time sequence vector of historical time sequence data of the target detection index;

the correlation module is used for associating the historical time sequence vector with the current time sequence vector through the first Transformer model to obtain an associated time sequence vector;

the prediction module is used for inputting the associated time sequence vector to the second Transformer model to obtain prediction data of a next time sequence of the current time sequence data;

the acquisition module is further configured to acquire target time sequence data of the next time sequence of the current time sequence data;

and the detection module is used for performing anomaly detection on the target time sequence data through the prediction data and the target time sequence data.

In a third aspect, the present application further provides a server comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the artificial intelligence based data detection method as described above.

In a fourth aspect, the present application further provides a computer-readable storage medium having a computer program stored thereon, where the computer program, when executed by a processor, implements the steps of the artificial intelligence based data detection method as described above.

The application provides a data detection method, device, server and storage medium based on artificial intelligence. Current time sequence data of a target detection index is acquired, a current time sequence vector is generated according to the current time sequence data, and a historical time sequence vector of the historical time sequence data of the target detection index is determined. The historical time sequence vector is associated with the current time sequence vector through a first Transformer model to obtain an associated time sequence vector, and the associated time sequence vector is input into a second Transformer model to obtain predicted data of the next time sequence of the current time sequence data. Finally, target time sequence data of the next time sequence of the current time sequence data is acquired, and anomaly detection is performed on the target time sequence data through the predicted data and the target time sequence data. Because the current time sequence vector and the historical time sequence vector are both processed by the Transformer models, the output predicted data is more accurate, which improves the detection accuracy of abnormal data.
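The final comparison step can be illustrated with a minimal sketch. The passage above states only that the predicted data and the target data are used together for anomaly detection; the absolute-error rule, the `threshold` parameter and the function name `detect_anomalies` below are assumptions made for illustration and are not taken from the application.

```python
def detect_anomalies(predicted, target, threshold):
    """Flag each index whose absolute prediction error exceeds the threshold.

    Assumption: a fixed absolute-error threshold; the application may use
    a different comparison rule.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return [abs(t - p) > threshold for p, t in zip(predicted, target)]


# Hypothetical values for two indexes (e.g. CPU and memory utilization).
predicted = [0.42, 0.55]
target = [0.45, 0.93]
flags = detect_anomalies(predicted, target, threshold=0.2)
print(flags)  # [False, True]
```

In this sketch the second index would be reported as abnormal because the observed value deviates from the prediction by more than the assumed threshold.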

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a schematic flowchart illustrating steps of a data detection method based on artificial intelligence according to an embodiment of the present disclosure;

FIG. 2 is a flow diagram illustrating sub-steps of the data detection method of FIG. 1;

FIG. 3 is a schematic diagram of a scene for outputting associated time series data according to the embodiment;

FIG. 4 is a schematic block diagram of an artificial intelligence-based data detection apparatus according to an embodiment of the present disclosure;

FIG. 5 is a schematic block diagram of a sub-module of the data detection apparatus of FIG. 4;

FIG. 6 is a schematic block diagram of a server according to an embodiment of the present disclosure.

The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation. In addition, although the division of the functional blocks is made in the device diagram, in some cases, it may be divided in blocks different from those in the device diagram.

The embodiment of the application provides a data detection method, a data detection device, a server and a storage medium based on artificial intelligence. The data detection method can be applied to a server, and the server can be a single server or a server cluster consisting of a plurality of servers.

Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.

Referring to fig. 1, fig. 1 is a schematic flowchart illustrating steps of a data detection method based on artificial intelligence according to an embodiment of the present disclosure.

As shown in fig. 1, the data detection method includes steps S101 to S105.

Step S101: acquiring current time sequence data of the target detection index.

When an application program runs on a server, abnormal fluctuations in the server performance indicators may reflect an abnormal state of the application, so the abnormality can be detected from the time series data of the server performance indicators. The target detection index in the present embodiment may include, for example, at least one of the following indexes: CPU utilization rate, IO utilization rate, memory utilization rate, bandwidth utilization rate, inbound network traffic, outbound network traffic, access count, access latency, download volume, number of new users and number of active users. Time series data is a data sequence in which the same index is recorded in chronological order, and it may be recorded over a number of time periods or at a number of time points. The current time series data includes the time series data of the target detection index within the current time sequence, and the current time sequence can be flexibly set by a user; for example, the current time series data is a plurality of memory usage values recorded within 1 hour.

It should be noted that the current time series data in the present embodiment may be time series data of a single target detection index, for example, may be time series data of CPU usage, or may be time series data of bandwidth usage; the current time series data in this embodiment may also be time series data of a plurality of target detection indicators, and may include, for example, time series data of a memory usage rate and a network egress traffic.

In one embodiment, current time series data of the target detection index is obtained, and historical time series data of the target detection index is obtained. Similarly, the historical time series data is the time series data of the target detection indexes within the preset time period, and the historical time series data in the embodiment may also be time series data about a single target detection index or a plurality of target detection indexes, and may be set according to actual conditions.

Further, the historical time series data is long time series data, and the current time series data is short time series data. The time length corresponding to the historical time sequence data is greater than or equal to a first preset time length, the time length corresponding to the current time sequence data is less than or equal to a second preset time length, the time length corresponding to the historical time sequence data is greater than the time length corresponding to the current time sequence data, and the first preset time length and the second preset time length can be set according to actual conditions. For example, if the historical time-series data is time-series data of the target detection index recorded in the past 1 week, the time length corresponding to the historical time-series data is 1 week; if the current time sequence data is the time sequence data of the target detection index recorded in the current 1 hour, the time length corresponding to the current time sequence data is 1 hour.
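The long/short split described above can be sketched as follows. This is a minimal illustration that assumes samples arrive as (timestamp, value) pairs, that the two preset time lengths are 1 week and 1 hour as in the example, and that the historical window ends where the current window begins; the function name `split_windows` and the data layout are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical preset time lengths, taken from the example in the text.
HISTORY_LENGTH = timedelta(weeks=1)   # long time sequence (historical data)
CURRENT_LENGTH = timedelta(hours=1)   # short time sequence (current data)


def split_windows(samples, now):
    """Split (timestamp, value) samples into historical and current windows.

    Assumption: the current window covers the most recent CURRENT_LENGTH and
    the historical window covers the HISTORY_LENGTH immediately before it.
    """
    current_start = now - CURRENT_LENGTH
    history_start = current_start - HISTORY_LENGTH
    history = [(t, v) for t, v in samples if history_start <= t < current_start]
    current = [(t, v) for t, v in samples if current_start <= t <= now]
    return history, current


now = datetime(2020, 12, 31, 12, 0)
# One synthetic memory-usage sample every 10 minutes over the past 8 days.
samples = [(now - timedelta(minutes=10 * i), 50.0 + i % 5) for i in range(8 * 24 * 6)]
history, current = split_windows(samples, now)
print(len(history), len(current))  # 1008 7
```

Because the historical window changes slowly, it only needs to be rebuilt occasionally, while the short current window is refreshed on every detection run.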

When an anomaly detection model performs anomaly detection on time series data, the target detection index is often related to the distribution of the historical data. Using only short time series data may reduce the accuracy of anomaly detection, whereas using long time series data requires a large amount of calculation. Therefore, in this embodiment, the long historical time series data and the short current time series data are used together to predict the next time sequence, which makes anomaly detection more accurate; the long historical time series data can be updated at a lower frequency, which saves calculation.

Step S102: generating a current time sequence vector according to the current time sequence data, and determining a historical time sequence vector of the historical time sequence data of the target detection index.

The historical data records the process information and result information of converting the historical time sequence data of the target detection index into the historical time sequence vector, so that the historical time sequence vector can be reused instead of being generated in real time, which saves calculation and improves the efficiency of anomaly detection.

In an embodiment, the time length corresponding to the historical timing vector is greater than or equal to a first preset time length, and the time length corresponding to the current timing vector is less than or equal to a second preset time length. Because the historical timing vector is generated from the long historical time sequence data and the current timing vector is generated from the short current time sequence data, the accuracy of anomaly detection can be improved; and because the long historical time sequence data remains unchanged for a longer period, it can be updated at a lower frequency, which saves calculation.

For example, if the historical time-series data is time-series data of the target detection index recorded in the past 1 week, the time length corresponding to the generated historical time-series vector is 1 week; and if the current time sequence data is the time sequence data of the target detection index recorded in the current 1 hour, the time length corresponding to the generated current time sequence vector is 1 hour.

In one embodiment, as shown in FIG. 2, determining a historical timing vector of the target detection index includes sub-steps S1021 to S1023.

Sub-step S1021: acquiring historical time series data of the target detection index, and determining first time information corresponding to the historical time series data.

The historical time series data may include a plurality of detection data of the target detection index, and the first time information includes a first time sequence, a first time interval, and/or a first time point of each detection data. The first time sequence is the time length corresponding to the historical time sequence data; if the first time sequence is 1 week, the historical time sequence data comprises a plurality of detection data of the target detection index within a range of 1 week. The time length may also be given as an interval range, for example (14 days, 21 days), in which case the historical time series data includes a plurality of detection data of the target detection index within 14 days to 21 days. The first time interval is the acquisition time interval between every two consecutive detection data in the historical time sequence data. The first time point is the time point corresponding to each detection data in the historical time sequence data. The ratio of the first time sequence to the first time interval is the number of detection data in the historical time sequence data.
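The relationship between the time length, the acquisition interval and the number of detection data can be checked with a short worked example. The 1-minute acquisition interval below is an assumed value chosen for illustration; only the 1-week time length comes from the text.

```python
from datetime import timedelta

first_time_sequence = timedelta(weeks=1)     # time length of the historical data
first_time_interval = timedelta(minutes=1)   # assumed acquisition interval

# The ratio of the time length to the acquisition interval gives the
# number of detection data m in the historical time sequence data.
m = first_time_sequence // first_time_interval
print(m)  # 10080
```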

In one embodiment, there are k target detection indexes, the first time information includes m pieces of time point information of the historical time sequence data, and k and m are positive integers greater than or equal to 1. Each piece of time point information corresponds to one detection data, so m pieces of time point information indicate that each target detection index has m detection data in the historical time sequence data; each target detection index corresponds to one time sequence, so k target detection indexes indicate that k × m detection data exist in the historical time sequence data.

Sub-step S1022: generating a first time sequence vector according to the historical time sequence data and the first time information, and generating a first correction vector according to the historical time sequence data, the first time information and a preset function.

The preset function includes a sine function, a cosine function, or a combination function of a sine function and a cosine function, which is not specifically limited in this embodiment. For example, the first timing vector is a k × m matrix vector, and the first correction vector is an e × m matrix vector.

In one embodiment, there are k target detection indexes, the first time information includes m pieces of time point information of the historical time sequence data, and k and m are positive integers greater than or equal to 1. Generating a first timing vector according to the historical timing data and the first time information comprises: arranging the historical time sequence data of the k target detection indexes according to the m pieces of time point information to generate a k × m matrix vector, and taking the matrix vector as the first time sequence vector. It should be noted that the historical time series data may include a plurality of detection data for each target detection index; each piece of time point information corresponds to one detection data, and each target detection index corresponds to one time sequence, so k target detection indexes with m time points each give k × m detection data in the historical time series data. The arrangement order of the detection data of each target detection index is determined according to the m pieces of time point information, and the historical time sequence data of the k target detection indexes is arranged in that order to obtain a k × m matrix vector of detection data. This k × m matrix vector is the first time sequence vector, where k represents the width or the length of the matrix vector. In this way the first time sequence vector of the historical time sequence data can be generated accurately from the historical time sequence data and the first time information.
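The arrangement of k indexes over m time points can be sketched as below. The dictionary layout, the index names and the function name `build_timing_matrix` are illustrative assumptions; the sketch only shows rows being ordered by time point so that the result is a k × m matrix.

```python
def build_timing_matrix(series_by_index, time_points):
    """Arrange k series over m time points into a k x m matrix (list of rows).

    series_by_index maps an index name to a dict {time point: detection value}.
    Rows follow the sorted index names; columns follow the sorted time points.
    """
    ordered_points = sorted(time_points)
    matrix = []
    for name in sorted(series_by_index):
        readings = series_by_index[name]
        matrix.append([readings[t] for t in ordered_points])
    return matrix


# Hypothetical data: k = 2 indexes, m = 3 time points (seconds since start).
series_by_index = {
    "cpu_usage": {0: 0.31, 60: 0.35, 120: 0.40},
    "memory_usage": {120: 0.58, 0: 0.52, 60: 0.55},  # deliberately unordered
}
time_points = [0, 60, 120]
first_timing_vector = build_timing_matrix(series_by_index, time_points)
print(first_timing_vector)  # [[0.31, 0.35, 0.4], [0.52, 0.55, 0.58]]
```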

In one embodiment, generating a first correction vector according to the historical timing data, the first time information and a preset function includes: acquiring a preset function, and substituting the historical time sequence data and the first time information into the preset function to obtain the correction information of the historical time sequence data; and generating a first correction vector according to the historical time sequence data and the correction information of the historical time sequence data. The preset function is a sine function, a cosine function or a combination of a sine function and a cosine function, and includes a time characteristic parameter. For example, the preset function is a·cos(2πxt), where a is a preset coefficient, t is the time characteristic parameter, and x is an unknown parameter that can be used to represent the detection data in the historical time sequence data. It should be noted that the historical time sequence data and the first time information are substituted into the preset function to calculate the correction information of the detection data at different time points, and the correction information at the different time points is arranged to obtain the first correction vector. The first correction vector is used to correct the first time sequence vector, so that the subsequently calculated prediction data is more accurate and the anomaly detection in the embodiment of the present application is more accurate.

Illustratively, the number of pieces of correction information calculated by the preset function is e, and the first time information includes m pieces of time point information of the historical time sequence data; the e pieces of correction information are arranged according to the m pieces of time point information to generate an e × m matrix vector, and the matrix vector is taken as the first correction vector. Here e represents the length of the matrix vector and m represents the width of the matrix vector, or m represents the length of the matrix vector and e represents the width of the matrix vector, which is not limited in this embodiment.
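One possible reading of the preset function can be sketched as follows. The sketch evaluates the cosine form given in the text for every detection value x at its time feature t, which yields one correction row per index (so e equals k in this illustration). How x and t are scaled, and whether e equals k in general, are not specified in this passage; those choices and the function name `correction_vector` are assumptions.

```python
import math


def correction_vector(timing_matrix, time_features, a=1.0):
    """Evaluate a * cos(2 * pi * x * t) for every detection value x at time t.

    timing_matrix is a list of rows (one per index); time_features holds one
    time characteristic value per column. Both scalings are assumed.
    """
    if any(len(row) != len(time_features) for row in timing_matrix):
        raise ValueError("each row needs one value per time point")
    return [
        [a * math.cos(2 * math.pi * x * t) for x, t in zip(row, time_features)]
        for row in timing_matrix
    ]


# Hypothetical input: one index, three time points.
timing_matrix = [[0.0, 0.5, 1.0]]
time_features = [1.0, 1.0, 0.25]
first_correction_vector = correction_vector(timing_matrix, time_features, a=2.0)
```

With these inputs the three entries are approximately 2, -2 and 0, since the cosine arguments are 0, π and π/2.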

Sub-step S1023: splicing the first time sequence vector and the first correction vector to obtain the historical time sequence vector.

The splicing method includes row splicing or column splicing, and the matrices can be transposed as needed according to actual conditions. For example, the first timing vector is a k × m matrix vector and the first correction vector is an e × m matrix vector; splicing them yields a (k + e) × m matrix vector, which is used as the historical timing vector. Alternatively, the first timing vector is an m × k matrix vector and the first correction vector is an m × e matrix vector; splicing them yields an m × (k + e) matrix vector, which is used as the historical timing vector.
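Both splicing variants can be sketched with plain nested lists. The function names `splice_rows` and `splice_columns` are hypothetical; the sketch only demonstrates how the shapes combine.

```python
def splice_rows(timing, correction):
    """Row splicing: stack a k x m matrix on an e x m matrix -> (k + e) x m."""
    if len(timing[0]) != len(correction[0]):
        raise ValueError("both matrices need the same number of columns (m)")
    return [list(row) for row in timing] + [list(row) for row in correction]


def splice_columns(timing_t, correction_t):
    """Column splicing: join an m x k matrix with an m x e matrix -> m x (k + e)."""
    if len(timing_t) != len(correction_t):
        raise ValueError("both matrices need the same number of rows (m)")
    return [list(a) + list(b) for a, b in zip(timing_t, correction_t)]


# k = 2 indexes, e = 1 correction row, m = 3 time points.
timing = [[0.31, 0.35, 0.40], [0.52, 0.55, 0.58]]
correction = [[1.0, 0.0, -1.0]]
historical = splice_rows(timing, correction)             # 3 x 3, i.e. (k + e) x m

# The transposed layout: m = 3 rows, k = 2 and e = 1 columns.
timing_t = [list(col) for col in zip(*timing)]
correction_t = [list(col) for col in zip(*correction)]
historical_t = splice_columns(timing_t, correction_t)    # 3 x 3, i.e. m x (k + e)
```

The two results are transposes of each other, which is why either layout can be used as long as later steps agree on the orientation.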

In one embodiment, generating a current timing vector from the current timing data includes: acquiring second time information corresponding to the current time sequence data; generating a second time sequence vector according to the current time sequence data and second time information, and generating a second correction vector according to the current time sequence data, the second time information and a preset function; and splicing the second time sequence vector and the second correction vector to obtain the current time sequence vector. The current time sequence data may include a plurality of detection data of the target detection index, and the second time information includes a second time sequence, a second time interval, and/or a second time point of each detection data. Specifically, reference may be made to the foregoing embodiments for generating the history timing vector, which are not described herein again.

Illustratively, the second timing vector is a k × n matrix vector and the second correction vector is an e × n matrix vector; splicing them yields a (k + e) × n matrix vector, which is used as the current timing vector.

In one embodiment, the historical timing vector of the historical timing data of the target detection index is obtained from a memory. It should be noted that, in the embodiments of the present application, the historical timing vector is generated from historical timing data that is long time sequence data. Once generated, the historical timing vector can be stored in the memory and reused throughout a relatively long preset time period; each time it is needed it is read directly from the memory rather than recomputed, which saves computation.
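The reuse of a stored historical timing vector within a preset time period can be sketched as a small cache. This is only one possible arrangement under stated assumptions: the class `HistoryVectorCache`, the builder function, the one-hour period, and the injected clock are all hypothetical names and values introduced for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple

Matrix = List[List[float]]


class HistoryVectorCache:
    """Keeps each historical timing vector for a preset period before rebuilding it."""

    def __init__(self, builder: Callable[[str], Matrix], preset_period_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._builder = builder          # generates the vector from historical data
        self._period = preset_period_s   # how long a stored vector may be reused
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Matrix]] = {}

    def get(self, index_name: str) -> Matrix:
        now = self._clock()
        entry = self._entries.get(index_name)
        if entry is not None and now - entry[0] < self._period:
            return entry[1]              # still valid: read directly from memory
        vector = self._builder(index_name)
        self._entries[index_name] = (now, vector)
        return vector


# Stand-in builder and a fake clock so the behaviour is deterministic.
build_calls: List[str] = []


def build_history_vector(index_name: str) -> Matrix:
    build_calls.append(index_name)
    return [[0.61, 0.64, 0.70], [0.0, 0.5, 1.0]]


fake_now = [0.0]
cache = HistoryVectorCache(build_history_vector, preset_period_s=3600.0,
                           clock=lambda: fake_now[0])
cache.get("memory_usage")   # built at t = 0
fake_now[0] = 1800.0
cache.get("memory_usage")   # reused, still inside the preset period
fake_now[0] = 4000.0
cache.get("memory_usage")   # rebuilt, the preset period has elapsed
```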

Step S103, associating the historical time sequence vector with the current time sequence vector through a first Transformer model to obtain an associated time sequence vector.

The server stores a time sequence data prediction model, and the time sequence data prediction model comprises a first Transformer model and a second Transformer model which are trained in advance.

In one embodiment, the first Transformer model comprises a Transformer operation layer and a feedforward neural network layer FNN. The historical time sequence vector and the current time sequence vector are processed by the Transformer operation layer to obtain associated time sequence data, and the associated time sequence data is mapped by the feedforward neural network layer to obtain the associated time sequence vector. The Transformer operation layer performs the Transformer operation on the historical time sequence vector and the current time sequence vector; the feedforward neural network layer FNN performs fully connected layer mapping on the result output by the Transformer operation layer and outputs the associated result, thereby associating the target detection indexes in the historical time sequence vector with those in the current time sequence vector. Because the associated result covers both the current time sequence data and the historical time sequence data, the next time sequence can be predicted more accurately, which improves the accuracy of anomaly detection.

Illustratively, the associated time sequence data x3 of the historical timing vector history_window and the current timing vector current_window is calculated by the Transformer operation layer and the feedforward neural network layer FNN. Specifically, as shown in fig. 3, history_window and current_window are input to the first Transformer model, which includes a Transformer operation layer and a feedforward neural network layer FNN. The Transformer operation layer performs the Transformer operation on history_window to obtain x1 and on current_window to obtain x2:

$$x_1 = \mathrm{Transformer}_{\times S}(\mathrm{history\_window}), \qquad x_2 = \mathrm{Transformer}_{\times S}(\mathrm{current\_window}),$$

where ×S indicates that S layers of the Transformer operation are applied. The feedforward neural network layer FNN then performs fully connected layer mapping on x1 and x2 and outputs the associated time sequence data x3:

$$x_3 = \mathrm{Dense}^{a}_{out}\left(\mathrm{attention}_{out} \cdot \mathrm{value}_{out}\right), \qquad \mathrm{attention}_{out} = \mathrm{softmax}\left(\mathrm{query}_{out} \cdot \mathrm{key}_{out}\right),$$

$$\mathrm{value}_{out} = \mathrm{Dense}^{v}_{out}(x_1), \qquad \mathrm{key}_{out} = \mathrm{Dense}^{k}_{out}(x_1), \qquad \mathrm{query}_{out} = \mathrm{Dense}^{q}_{out}(x_2).$$

Here query_out, key_out and value_out are the query vector, key vector and value vector of the feedforward neural network layer FNN, Dense denotes a fully connected layer, and softmax is the activation function.
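The mapping from x1 and x2 to x3 can be sketched in plain Python as follows. This is a minimal sketch under several assumptions that the application does not state: each Dense layer is modelled as a bias-free matrix multiplication, the product of query and key is taken as query times the transpose of key, softmax is applied row by row, and no scaling factor is used. The function name `associate` and all weights and inputs are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (r x n) and b (n x c)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def softmax_rows(a: Matrix) -> Matrix:
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in a:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([v / total for v in exps])
    return out


def associate(x1: Matrix, x2: Matrix, w_q: Matrix, w_k: Matrix,
              w_v: Matrix, w_a: Matrix) -> Matrix:
    """x3 = Dense_a(softmax(query . key) . value), with Dense assumed bias-free."""
    query = matmul(x2, w_q)    # from the current window
    key = matmul(x1, w_k)      # from the historical window
    value = matmul(x1, w_v)    # from the historical window
    attention = softmax_rows(matmul(query, transpose(key)))
    return matmul(matmul(attention, value), w_a)


# Hypothetical inputs: 3 historical positions and 1 current position, width 2.
identity = [[1.0, 0.0], [0.0, 1.0]]
x1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x2 = [[1.0, 0.0]]
x3 = associate(x1, x2, identity, identity, identity, identity)
```

Each row of the attention matrix sums to 1, so every row of x3 is a weighted mixture of historical positions, weighted by their similarity to the corresponding current position.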

In one embodiment, there may be n first Transformer models, where n is a positive integer greater than or equal to 2. In that case, the input historical time sequence vector and current time sequence vector are processed by each first Transformer model in turn, in the set hierarchical order, until the n-th first Transformer model completes its computation and outputs the associated time sequence data.

Step S104, inputting the associated time sequence vector into a second Transformer model to obtain predicted data of the next time sequence of the current time sequence data.

The second Transformer model has the same structure as the first Transformer model: it likewise comprises a Transformer operation layer and a feedforward neural network layer FNN, and differs only in its model parameters. The associated time sequence vector is processed by the second Transformer model, which outputs the predicted data of the next time sequence of the current time sequence data. Through the second Transformer model and the associated time sequence vector, the current time sequence data is associated with similar time periods in the historical time sequence data, yielding the predicted data of the next time sequence of the current time sequence data.

Exemplarily, the associated time sequence data x3 is input into the second Transformer model; that is, x3 undergoes the Transformer operation again and is mapped by the fully connected layers of the feedforward neural network layer FNN, and the final output is the predicted data of the next time sequence of the current time sequence data.

Step S105, acquiring target time sequence data of the next time sequence of the current time sequence data, and performing anomaly detection on the target time sequence data through the prediction data and the target time sequence data.

After the predicted data of the next time sequence of the current time sequence data is obtained, the target time sequence data of that next time sequence is acquired. The current time sequence data corresponds to a second time sequence, which is the time length covered by the current time sequence data. For example, if the second time sequence is 1 hour, the current time sequence data includes a plurality of detection data of the target detection index within 1 hour. The time length covered by the target time sequence data is the same as that covered by the current time sequence data; that is, the target time sequence data also corresponds to the second time sequence and includes, for example, a plurality of detection data of the target detection index within 1 hour. Because the target time sequence data is the time sequence data of the next time sequence of the current time sequence data, its time interval is obtained by adding the second time sequence to the second time interval of the current time sequence data. For example, when the second time sequence is 1 hour and the second time interval of the current time sequence data is (12:00, 13:00), the time interval of the target time sequence data is (13:00, 14:00).
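The interval arithmetic in the example above can be sketched with the Python standard library. The function name `next_interval` and the calendar date are hypothetical; only the 1-hour window and the 12-to-13 and 13-to-14 intervals come from the example.

```python
from datetime import datetime, timedelta
from typing import Tuple

Interval = Tuple[datetime, datetime]


def next_interval(current: Interval) -> Interval:
    """Shift the current interval forward by its own length (the second time sequence)."""
    start, end = current
    length = end - start
    if length <= timedelta(0):
        raise ValueError("the interval must have a positive length")
    return start + length, end + length


# Current window of 1 hour, from 12:00 to 13:00 (the date is an arbitrary placeholder).
current_interval = (datetime(2020, 12, 31, 12, 0), datetime(2020, 12, 31, 13, 0))
target_interval = next_interval(current_interval)
```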

In one embodiment, the predicted data is compared with the target time sequence data to obtain the relative error between them, and anomaly detection is performed on the target time sequence data according to the relative error to obtain an anomaly detection result. The anomaly detection result is either "data abnormal" or "data normal". When the relative error between the predicted data and the target time sequence data is greater than or equal to a preset error threshold, the anomaly detection result is determined to be data abnormal; when the relative error is less than the preset error threshold, the anomaly detection result is determined to be data normal. Because the target time sequence data of the next time sequence is predicted from the associated time sequence vector of the current time sequence data and the historical time sequence data, the predicted data can adapt to the way the target detection index itself changes over time, which improves the accuracy of anomaly detection.
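The threshold comparison can be sketched as follows. The application does not give a formula for the relative error, so this sketch assumes the mean of |predicted - target| / |target| over the detection data in the window, with a small guard against division by zero; the function names, the guard value, and the sample numbers are hypothetical.

```python
from typing import Sequence


def relative_error(predicted: Sequence[float], target: Sequence[float],
                   eps: float = 1e-8) -> float:
    """Mean relative error over a window (an assumed definition, see the lead-in)."""
    if len(predicted) != len(target) or not target:
        raise ValueError("predicted and target must be non-empty and of equal length")
    errors = [abs(p - t) / max(abs(t), eps) for p, t in zip(predicted, target)]
    return sum(errors) / len(errors)


def detect(predicted: Sequence[float], target: Sequence[float],
           error_threshold: float) -> str:
    """Return 'abnormal' when the relative error reaches the preset threshold."""
    if relative_error(predicted, target) >= error_threshold:
        return "abnormal"
    return "normal"


# Hypothetical memory-usage readings and a hypothetical threshold of 0.2.
normal_result = detect([0.50, 0.52], [0.50, 0.52], error_threshold=0.2)
abnormal_result = detect([0.50, 0.50], [1.00, 1.00], error_threshold=0.2)
```

The comparison uses "greater than or equal to", matching the embodiment, so a relative error exactly at the threshold is reported as abnormal.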

In one embodiment, when the target time series data is determined to be abnormal data, an alarm may be issued. For example, the server may play an abnormal data alarm sound, or illuminate an abnormal data alarm lamp, or send corresponding alarm information to the relevant responsible personnel, etc. The notification mode of the alarm information includes but is not limited to WeChat, short message, email, enterprise-level communication platform, and the like.

In the data detection method based on artificial intelligence provided in the above embodiment, the current time sequence data of the target detection index is acquired; a current time sequence vector is generated from the current time sequence data, and the historical time sequence vector of the historical time sequence data of the target detection index is determined; the historical time sequence vector is associated with the current time sequence vector through the first Transformer model to obtain an associated time sequence vector; the associated time sequence vector is input to the second Transformer model to obtain the predicted data of the next time sequence of the current time sequence data; finally, the target time sequence data of the next time sequence is acquired, and anomaly detection is performed on it through the predicted data and the target time sequence data. Processing the current time sequence vector and the historical time sequence vector through the Transformer models yields more accurate predicted data, which in turn improves the detection accuracy for abnormal data.

In one embodiment, the data detection method further includes: acquiring a plurality of sample data, wherein the sample data comprises first time sequence data, second time sequence data and labeled time sequence data, and the acquisition time of the first time sequence data is later than that of the second time sequence data; generating a first time sequence vector according to the first time sequence data, and generating a second time sequence vector according to the second time sequence data; associating the first time sequence vector with the second time sequence vector through a first preset Transformer model to obtain a target time sequence vector; inputting the target time sequence vector into a second preset Transformer model to obtain predicted time sequence data; updating model parameters of the first preset Transformer model and the second preset Transformer model according to the predicted time sequence data and the marked time sequence data; and continuing to iteratively train the updated first preset Transformer model and the updated second preset Transformer model until the first preset Transformer model and the second preset Transformer model are converged to obtain a time sequence data prediction model. The time sequence data prediction model is obtained through the associated time sequence vector training of the first time sequence data and the second time sequence data, the model training effect can be effectively improved, and the trained time sequence data prediction model can be used for detecting abnormal data.

The acquisition time of the first time sequence data is later than that of the second time sequence data; for example, the first time sequence data is current time sequence data and the second time sequence data is historical time sequence data. The current time sequence data includes time sequence data of the target detection index within the current time sequence, for example a plurality of memory usage rates recorded within 1 hour. The historical time sequence data includes time sequence data of the target detection index within the historical time sequence, for example the time sequence data of the target detection index recorded within the past 1 week.

In an embodiment, the second time-series data is long-time-series data, and the first time-series data is short-time-series data. The time length corresponding to the second time sequence data is greater than or equal to a first preset time length, the time length corresponding to the first time sequence data is less than or equal to a second preset time length, the time length corresponding to the second time sequence data is greater than the time length corresponding to the first time sequence data, and the first preset time length and the second preset time length can be set according to actual conditions. It should be noted that, in the embodiment, the second time series data of the long time series data and the first time series data of the short time series data are used to predict the next time series, the accuracy of the abnormality detection is higher, and the second time series data of the long time series data can be updated at a lower frequency, so that the calculation amount is saved.

The first time sequence data is converted into a first time sequence vector, and the second time sequence data is converted into a second time sequence vector. The second time sequence vector can be generated by the second time sequence data and stored in the historical data of the memory, and can be directly obtained from the historical data in the subsequent use, so that the second time sequence vector can be reused without repeated generation, thereby saving the calculation amount and improving the model training efficiency.

In an embodiment, the time length corresponding to the second timing vector is greater than or equal to a first preset time length, and the time length corresponding to the first timing vector is less than or equal to a second preset time length. It should be noted that the time length corresponding to the second time sequence vector generated by using the second time sequence data of the long time sequence data is greater than or equal to the first preset time length, and the time length corresponding to the first time sequence vector generated by using the first time sequence data of the short time sequence data is less than or equal to the second preset time length, so that the accuracy of the trained model for assisting in data abnormality detection can be improved, and the second time sequence data of the long time sequence data can be updated at a lower frequency without being changed within a longer period of time, thereby saving the calculation amount.

In an embodiment, generating the second timing vector from the second timing data includes: acquiring second time sequence data of the target detection index, and determining first time information corresponding to the second time sequence data; generating a first time sequence vector according to the second time sequence data and the first time information, and generating a first correction vector according to the second time sequence data, the first time information and a preset function; and splicing the first time sequence vector and the first correction vector to obtain a second time sequence vector. Specifically, reference may be made to the corresponding process in the foregoing embodiment of the data detection method, and details of this embodiment are not repeated herein.

In one embodiment, generating the first timing vector from the first timing data comprises: acquiring second time information corresponding to the first time sequence data; generating a second time sequence vector according to the first time sequence data and the second time information, and generating a second correction vector according to the first time sequence data, the second time information and a preset function; and splicing the second time sequence vector and the second correction vector to obtain a first time sequence vector. The first time sequence data may include a plurality of detection data of the target detection index, and the second time information includes a second time sequence, a second time interval, and/or a second time point of each detection data. Specifically, reference may be made to the embodiment of generating the history timing vector in the data detection method, which is not described herein again.

There may be one or more first preset Transformer models, each comprising a Transformer operation layer and a feedforward neural network layer FNN. The second time sequence vector and the first time sequence vector are processed by the Transformer operation layer to obtain associated time sequence data, and the associated time sequence data is mapped by the feedforward neural network layer to obtain the associated time sequence vector. The Transformer operation layer performs the Transformer operation on the second time sequence vector and the first time sequence vector; the feedforward neural network layer FNN performs fully connected layer mapping on the result output by the Transformer operation layer and outputs the associated result, thereby associating the target detection indexes in the second time sequence vector with those in the first time sequence vector. Because the result of associating the first time sequence vector with the second time sequence vector covers both the first time sequence data and the second time sequence data, the next time sequence can be predicted more accurately, which improves the accuracy of anomaly detection.

The second preset Transformer model has the same structure as the first preset Transformer model: it likewise comprises a Transformer operation layer and a feedforward neural network layer FNN, and differs only in its model parameters. The associated time sequence vector is processed by the second preset Transformer model, which outputs the predicted time sequence data of the next time sequence. Through the second preset Transformer model and the associated time sequence vector, the first time sequence data is associated with similar time periods in the second time sequence data, yielding the predicted time sequence data of the next time sequence of the first time sequence data.

And updating model parameters of the first preset Transformer model and the second preset Transformer model according to the predicted time sequence data and the marked time sequence data, and continuing to carry out iterative training on the updated first preset Transformer model and the updated second preset Transformer model until the first preset Transformer model and the second preset Transformer model are converged to obtain a time sequence data prediction model. The time sequence data prediction model is obtained through the associated time sequence vector training of the first time sequence data and the second time sequence data, the model training effect is better, and the detection precision of abnormal data is higher.

In one embodiment, a first loss function of the first preset Transformer model and a second loss function of the second preset Transformer model are calculated from the predicted time sequence data and the labeled time sequence data; the model parameters of the first preset Transformer model are adjusted based on the first loss function, and the model parameters of the second preset Transformer model are adjusted based on the second loss function; the first preset Transformer model and the second preset Transformer model with the adjusted model parameters are trained iteratively; and when the trained first preset Transformer model and the trained second preset Transformer model are determined to be in a convergence state, the time sequence data prediction model is obtained.

In one embodiment, the first preset Transformer model and the second preset Transformer model are fused to obtain a target Transformer model. A loss function of the target Transformer model is calculated from the predicted time sequence data and the labeled time sequence data, the model parameters of the target Transformer model are determined based on the loss function, the target Transformer model is trained iteratively according to the model parameters, and it is determined whether the trained target Transformer model is in a convergence state; if so, the time sequence data prediction model is obtained. The loss function is, for example, the MSE (mean-square error) loss function, and the Adam optimization algorithm is used to iteratively train the target Transformer model. It should be noted that the trained time sequence data prediction model can predict the time sequence data of the next window (the target time sequence data of the next time sequence of the first time sequence data) from the input second time sequence data and first time sequence data. It can be understood that the adaptive gradient algorithm (AdaGrad) or root mean square propagation (RMSProp) may also be used to iteratively train the target Transformer model; the present application does not specifically limit this.
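To make the roles of the MSE loss and the Adam optimizer concrete, the following sketch trains a deliberately tiny stand-in model, a single weight w with prediction w · x, instead of the target Transformer model. Everything here is an assumption for illustration: the stand-in model, the data, the hyperparameters (learning rate, beta values, step limit), and the convergence test on the loss value are not taken from the application.

```python
import math
from typing import Sequence


def mse(predicted: Sequence[float], labeled: Sequence[float]) -> float:
    """Mean-square error between predicted and labeled time sequence data."""
    if len(predicted) != len(labeled) or not labeled:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(predicted, labeled)) / len(labeled)


def train(inputs: Sequence[float], labels: Sequence[float], lr: float = 0.05,
          beta1: float = 0.9, beta2: float = 0.999, eps: float = 1e-8,
          max_steps: int = 1000, tol: float = 1e-6) -> float:
    """Fit the stand-in model y = w * x with Adam until the MSE falls below tol."""
    w = 0.0
    m = 0.0  # first-moment estimate of the gradient
    v = 0.0  # second-moment estimate of the gradient
    for step in range(1, max_steps + 1):
        predicted = [w * x for x in inputs]
        if mse(predicted, labels) < tol:   # treated here as the convergence state
            break
        grad = sum(2.0 * (p - y) * x
                   for p, y, x in zip(predicted, labels, inputs)) / len(inputs)
        m = beta1 * m + (1.0 - beta1) * grad
        v = beta2 * v + (1.0 - beta2) * grad * grad
        m_hat = m / (1.0 - beta1 ** step)  # bias-corrected moments
        v_hat = v / (1.0 - beta2 ** step)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


inputs = [1.0, 2.0, 3.0]
labels = [2.0, 4.0, 6.0]               # generated by the "true" weight 2.0
initial_loss = mse([0.0 for _ in inputs], labels)
w = train(inputs, labels)
final_loss = mse([w * x for x in inputs], labels)
```

In a real training run the gradient would be taken with respect to all parameters of the fused model, typically by an automatic differentiation framework, but the update rule per parameter has the same form.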

Referring to fig. 4, fig. 4 is a schematic block diagram of an artificial intelligence-based data detection apparatus according to an embodiment of the present disclosure.

As shown in fig. 4, the artificial intelligence based data detecting apparatus 300 includes: an acquisition module 301, a generation module 302, an association module 303, a prediction module 304, and a detection module 305. The data detection apparatus 300 is applicable to a server, which stores a time series data prediction model comprising a first Transformer model and a second Transformer model trained in advance, wherein:

an obtaining module 301, configured to obtain current time series data of a target detection index;

a generating module 302, configured to generate a current time sequence vector according to the current time sequence data, and determine a historical time sequence vector of historical time sequence data of the target detection indicator;

the association module 303 is configured to associate the historical time sequence vector with the current time sequence vector through a first Transformer model to obtain an associated time sequence vector;

the prediction module 304 is configured to input the associated time sequence vector to a second Transformer model to obtain prediction data of a next time sequence of the current time sequence data;

the obtaining module 301 is further configured to obtain target time sequence data of a next time sequence of the current time sequence data;

a detecting module 305, configured to perform anomaly detection on the target time series data through the prediction data and the target time series data.

In one embodiment, the time length corresponding to the historical timing vector is greater than or equal to a first preset time length, and the time length corresponding to the current timing vector is less than or equal to a second preset time length.

In one embodiment, as shown in fig. 5, the generating module 302 includes:

an obtaining submodule 3021, configured to obtain historical time series data of the target detection index, and determine first time information corresponding to the historical time series data;

a generating submodule 3022, configured to generate a first time sequence vector according to the historical time sequence data and the first time information, and generate a first correction vector according to the historical time sequence data, the first time information, and a preset function;

and the splicing submodule 3023 is configured to splice the first timing vector and the first correction vector to obtain a historical timing vector.

In one embodiment, the target detection indexes are k, the first time information includes m time point information of historical time series data, and k and m are positive integers greater than or equal to 1; the generation module 302 is further configured to:

and arranging the historical time sequence data of the k target detection indexes according to the m time point information to generate a k × m matrix vector, and taking the matrix vector as a first time sequence vector.

In one embodiment, the generation module 302 is further configured to:

acquiring a preset function, and substituting the historical time sequence data and the first time information into the preset function to obtain the correction information of the historical time sequence data;

and generating a first correction vector according to the historical time sequence data and the correction information of the historical time sequence data.

In one embodiment, the first Transformer model comprises a Transformer operational layer and a feedforward neural network layer; the association module 303 is further configured to:

calculating the historical time sequence vector and the current time sequence vector through the Transformer calculation layer to obtain associated time sequence data;

and mapping the associated time sequence data through the feedforward neural network layer to obtain an associated time sequence vector.

In one embodiment, the data detection apparatus 300 is further configured to:

acquiring a plurality of sample data, wherein the sample data comprises first time sequence data, second time sequence data and labeled time sequence data, and the acquisition time of the first time sequence data is later than that of the second time sequence data;

generating a first time sequence vector according to the first time sequence data, and generating a second time sequence vector according to the second time sequence data;

associating the first time sequence vector with the second time sequence vector through a first preset Transformer model to obtain a target time sequence vector;

inputting the target time sequence vector to a second preset Transformer model to obtain predicted time sequence data;

updating model parameters of the first preset Transformer model and the second preset Transformer model according to the predicted time sequence data and the marked time sequence data;

and continuing to perform iterative training on the updated first preset Transformer model and the updated second preset Transformer model until the first preset Transformer model and the second preset Transformer model are converged to obtain the time series data prediction model.

It should be noted that, as will be clear to those skilled in the art, for convenience and brevity of description, the specific working processes of the apparatus and the modules and units described above may refer to the corresponding processes in the foregoing embodiments of the data detection method based on artificial intelligence, and are not described herein again.

The apparatus provided by the above embodiment may be implemented in a form of a computer program, and the computer program may be run on a server as shown in fig. 6.

Referring to fig. 6, fig. 6 is a schematic block diagram of a server according to an embodiment of the present disclosure. The server stores a time sequence data prediction model, wherein the time sequence data prediction model comprises a first Transformer model and a second Transformer model which are trained in advance.

As shown in fig. 6, the server includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a nonvolatile storage medium and an internal memory.

The non-volatile storage medium may store an operating system and a computer program. The computer program includes program instructions that, when executed, cause a processor to perform any one of the artificial intelligence based data detection methods.

The processor is used for providing calculation and control capacity and supporting the operation of the whole server.

The internal memory provides an environment for the execution of a computer program on a non-volatile storage medium, which when executed by the processor, causes the processor to perform any one of the artificial intelligence based data detection methods.

The network interface is used for network communication, such as sending assigned tasks and the like. Those skilled in the art will appreciate that the architecture shown in fig. 6 is a block diagram of only the portion of the architecture relevant to the present application and does not limit the servers to which the present application applies; a particular server may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.

It should be understood that the processor may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.

Wherein, in one embodiment, the processor is configured to execute a computer program stored in the memory to implement the steps of:

acquiring current time sequence data of a target detection index;

generating a current time sequence vector according to the current time sequence data, and determining a historical time sequence vector of historical time sequence data of the target detection index;

associating the historical time sequence vector with the current time sequence vector through the first Transformer model to obtain an associated time sequence vector;

inputting the associated time sequence vector to the second Transformer model to obtain predicted data of a next time sequence of the current time sequence data;

and acquiring target time sequence data of the next time sequence of the current time sequence data, and performing anomaly detection on the target time sequence data through the prediction data and the target time sequence data.

In one embodiment, the time length corresponding to the historical timing vector is greater than or equal to a first preset time length, and the time length corresponding to the current timing vector is less than or equal to a second preset time length.

In one embodiment, the processor, when implementing the determining the historical timing vector of the historical timing data of the target detection indicator, is configured to implement:

acquiring historical time sequence data of the target detection index, and determining first time information corresponding to the historical time sequence data;

generating a first time sequence vector according to the historical time sequence data and first time information, and generating a first correction vector according to the historical time sequence data, the first time information and a preset function;

and splicing the first time sequence vector and the first correction vector to obtain a historical time sequence vector.

In one embodiment, the target detection indexes are k, the first time information includes m time point information of historical time series data, and k and m are positive integers greater than or equal to 1;

when generating the first time sequence vector according to the historical time sequence data and the first time information, the processor is configured to implement:

arranging the historical time sequence data of the k target detection indexes according to the m time point information to generate a k × m matrix vector, and taking the matrix vector as the first time sequence vector.
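The arrangement step can be sketched as follows, assuming one row per target detection index and one column per time point, with columns ordered chronologically. The function name `build_time_sequence_matrix`, the dictionary layout of `readings`, and the index names "cpu" and "mem" are illustrative assumptions, not part of the application.

```python
from typing import Dict, List, Sequence


def build_time_sequence_matrix(readings: Dict[str, Dict[int, float]],
                               index_names: Sequence[str],
                               time_points: Sequence[int]) -> List[List[float]]:
    """Row i holds index_names[i]; column j holds the j-th time point in
    ascending order, giving a k x m matrix as nested lists."""
    ordered = sorted(time_points)
    return [[readings[name][t] for t in ordered] for name in index_names]


# k = 2 detection indexes observed at m = 3 time points (given out of order).
readings = {
    "cpu": {3: 0.7, 1: 0.5, 2: 0.6},
    "mem": {1: 0.2, 2: 0.3, 3: 0.4},
}
matrix = build_time_sequence_matrix(readings, ["cpu", "mem"], [3, 1, 2])
print(matrix)
```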

In one embodiment, when generating the first correction vector according to the historical time sequence data, the first time information and the preset function, the processor is configured to implement:

acquiring a preset function, and substituting the historical time sequence data and the first time information into the preset function to obtain the correction information of the historical time sequence data;

and generating a first correction vector according to the historical time sequence data and the correction information of the historical time sequence data.
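The preset function is not specified here, so the sketch below takes it as a caller-supplied callable and uses a daily sinusoid purely as a placeholder. The names `correction_vector`, `splice` and `daily_phase`, and the choice to stack the correction rows beneath the original rows when splicing, are assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def correction_vector(matrix: Matrix, time_points: Sequence[float],
                      preset_fn: Callable[[float, float], float]) -> Matrix:
    """Apply preset_fn(value, time) element-wise to a k x m matrix."""
    return [[preset_fn(v, t) for v, t in zip(row, time_points)] for row in matrix]


def splice(first: Matrix, correction: Matrix) -> Matrix:
    """Concatenate row-wise: k x m and k x m give 2k x m (an assumed layout)."""
    return [list(r) for r in first] + [list(r) for r in correction]


def daily_phase(value: float, hour: float) -> float:
    """Hypothetical preset function: scale the value by a 24-hour sinusoid."""
    return value * math.sin(2 * math.pi * hour / 24.0)


first = [[1.0, 2.0, 3.0]]      # k = 1 index, m = 3 time points
hours = [0.0, 6.0, 12.0]       # time information for each column
historical = splice(first, correction_vector(first, hours, daily_phase))
```

Passing the function as a parameter keeps the sketch neutral about which preset function an implementation would actually use.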

In one embodiment, the first Transformer model comprises a Transformer operation layer and a feedforward neural network layer; when associating the historical time sequence vector with the current time sequence vector through the first Transformer model to obtain the associated time sequence vector, the processor is configured to implement:

performing an operation on the historical time sequence vector and the current time sequence vector through the Transformer operation layer to obtain associated time sequence data;

and mapping the associated time sequence data through the feedforward neural network layer to obtain the associated time sequence vector.
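The internals of the two layers are not detailed in this passage. As a rough illustration only, the sketch below assumes the operation layer is scaled dot-product attention in which each current time step queries all historical time steps, with learned projections omitted, and that the feedforward layer is a single linear map followed by ReLU. Rows are time steps and columns are features; all function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(current: Matrix, historical: Matrix) -> Matrix:
    """Each current step (query) attends over all historical steps, which
    serve as both keys and values; no learned projections are applied."""
    d = len(current[0])
    out = []
    for q in current:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in historical]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, historical))
                    for j in range(d)])
    return out


def feed_forward(x: Matrix, weight: Matrix, bias: List[float]) -> Matrix:
    """One linear layer followed by ReLU, applied to every time step."""
    return [[max(0.0, sum(xi * weight[i][j] for i, xi in enumerate(row)) + bias[j])
             for j in range(len(bias))] for row in x]


historical = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # three historical steps
current = [[1.0, 0.0]]                               # one current step
identity = [[1.0, 0.0], [0.0, 1.0]]                  # placeholder weights
associated = feed_forward(cross_attention(current, historical),
                          identity, [0.0, 0.0])
```

In this toy input the current step resembles the first and third historical steps, so the attention output leans toward the first feature.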

In one embodiment, the processor is further configured to implement:

acquiring a plurality of sample data, wherein the sample data comprises first time sequence data, second time sequence data and labeled time sequence data, and the acquisition time of the first time sequence data is later than that of the second time sequence data;

generating a first time sequence vector according to the first time sequence data, and generating a second time sequence vector according to the second time sequence data;

associating the first time sequence vector with the second time sequence vector through a first preset Transformer model to obtain a target time sequence vector;

inputting the target time sequence vector to a second preset Transformer model to obtain predicted time sequence data;

updating model parameters of the first preset Transformer model and the second preset Transformer model according to the predicted time sequence data and the labeled time sequence data;

and continuing to iteratively train the updated first preset Transformer model and the updated second preset Transformer model until both models converge, thereby obtaining the time series data prediction model.
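The shape of this training procedure (two stages updated jointly from the prediction error until convergence) can be sketched with a toy example. The sketch deliberately replaces both Transformer models with scalar stand-ins: weights `w1` and `w2` play the role of the association stage and `w3` the prediction stage. The mean-squared-error loss, the learning rate and the convergence tolerance are assumptions; none of this is the application's actual model.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (first, second, label) scalar stand-ins


def train(samples: List[Sample], lr: float = 0.05,
          tol: float = 1e-8, max_iters: int = 10000
          ) -> Tuple[List[float], float]:
    """Jointly update both stand-in stages by gradient descent on the mean
    squared error until the loss changes by less than `tol`."""
    w1, w2, w3 = 0.5, 0.5, 0.5
    prev_loss = float("inf")
    loss = prev_loss
    n = len(samples)
    for _ in range(max_iters):
        g1 = g2 = g3 = total = 0.0
        for first, second, label in samples:
            target = w1 * first + w2 * second   # stage 1: associate the inputs
            pred = w3 * target                  # stage 2: predict
            err = pred - label
            total += err * err
            g1 += 2 * err * w3 * first
            g2 += 2 * err * w3 * second
            g3 += 2 * err * target
        loss = total / n
        if abs(prev_loss - loss) < tol:         # convergence check
            break
        prev_loss = loss
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        w3 -= lr * g3 / n
    return [w1, w2, w3], loss


# Toy samples whose label is simply first + second.
samples = [(1.0, 2.0, 3.0), (2.0, 1.0, 3.0), (0.0, 1.0, 1.0), (1.0, 0.0, 1.0)]
weights, final_loss = train(samples)
```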

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of the server described above may refer to the corresponding process in the foregoing embodiments of the artificial intelligence based data detection method, and is not repeated here.

Embodiments of the present application further provide a computer-readable storage medium storing a computer program. The computer program includes program instructions; for the method implemented when the program instructions are executed, reference may be made to the various embodiments of the artificial intelligence based data detection method of the present application.

The computer-readable storage medium may be an internal storage unit of the server according to the foregoing embodiment, for example, a hard disk or a memory of the server. The computer-readable storage medium may also be an external storage device of the server, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the server.

It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or system that comprises the element.

The serial numbers of the embodiments of the present application are for description only and do not indicate the relative merits of the embodiments. While the application has been described with reference to specific embodiments, its scope is not limited thereto, and those skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed in the application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
