Network anomaly detection

Document No.: 348435    Publication date: 2021-12-03

Note: This technology, Network anomaly detection, was devised on 2019-11-22 by 詹姆斯·派洛拉斯, 博吉塔·图克拉尔, 达特·卡拉帕塔普, 安德烈亚斯·泰尔济斯, and 克里希纳·萨亚纳. Abstract: A method (300) includes receiving a control message (128) from a cellular network (100) and extracting one or more features (222) from the control message. The method also includes predicting a potential label (234) of the control message using a predictive model (232), the predictive model (232) configured to receive the one or more features extracted from the control message as feature inputs. Here, the predictive model (232) is trained on a set of training control messages (226), wherein each training control message includes one or more corresponding features and an actual label (224). The method further includes determining that a probability (P) of the potential label satisfies a confidence threshold (236). The method also includes analyzing the control message to determine whether the control message corresponds to a respective network performance issue (202). When the control message affects network performance, the method includes communicating the network performance issue to a network entity (40).

1. A method (300), comprising:

receiving a control message (128) at the data processing hardware (124) from the cellular network (100);

extracting, by the data processing hardware (124), one or more features (222) from the control message (128);

predicting, by the data processing hardware (124), a potential label (234) of the control message (128) using a predictive model (232), the predictive model (232) configured to receive the one or more features (222) extracted from the control message (128) as feature inputs, the predictive model (232) trained on a set of training control messages (226), each training control message (226) comprising one or more corresponding features (222) and an actual label (224);

determining, by the data processing hardware (124), that a probability (P) of the potential label (234) satisfies a confidence threshold (236);

analyzing, by the data processing hardware (124), the control message (128) to determine whether the control message (128) corresponds to a respective network performance issue (202) that affects network performance of the cellular network (100); and

transmitting, by the data processing hardware (124), the network performance issue (202) to a network entity (40) responsible for the network performance issue (202) when the control message (128) corresponds to a respective network performance issue (202) affecting network performance.

2. The method (300) of claim 1, wherein predicting the potential label (234) using the predictive model (232) comprises predicting a probability distribution (P_dis) over potential labels (234), the predicted potential label (234) comprising one of the potential labels (234) in the probability distribution (P_dis) over the potential labels (234).

3. The method (300) of claim 2, wherein predicting the potential label (234) comprises selecting the potential label (234) associated with a highest probability (P) in the probability distribution (P_dis) over the potential labels (234).

4. The method (300) according to any one of claims 1-3, further including: when the control message (128) fails to correspond to the respective network performance issue (202):

receiving, at the data processing hardware (124), a subsequent control message (128) from the cellular network (100);

extracting, by the data processing hardware (124), one or more corresponding features (222) from the subsequent control message (128);

identifying, by the data processing hardware (124), that at least one of the one or more corresponding features (222) extracted from the subsequent control message (128) matches the one or more features (222) extracted from the control message (128); and

removing, by the data processing hardware (124), the identified at least one of the one or more corresponding features (222) extracted from the subsequent control message (128) from the feature inputs to the predictive model (232) prior to using the predictive model (232) to predict a corresponding potential label (234) for the subsequent control message (128).

5. The method (300) according to any one of claims 1-4, further including: when the control message (128) fails to correspond to the respective network performance issue (202):

identifying, by the data processing hardware (124), the one or more features (222) extracted from the control message (128); and

prior to predicting a corresponding potential label (234) for a subsequent control message (128) using the predictive model (232):

modifying, by the data processing hardware (124), the set of training control messages (226) by removing each training control message (226) that includes one or more corresponding features (222) that match any of the identified one or more features (222) extracted from the control message (128); and

retraining, by the data processing hardware (124), the predictive model (232) with the modified set (228) of training control messages (226).

6. The method (300) according to any one of claims 1-5, wherein the predictive model (232) includes a multi-class classification model configured to predict one or more types of labels.

7. The method (300) of any of claims 1-6, wherein the actual label (224) of each training control message (226) comprises a Type Allocation Code (TAC) for a User Equipment (UE) device (102) associated with the training control message (226).

8. The method (300) of any of claims 1-7, wherein the actual label (224) of each training control message (226) comprises an identifier of a network element of the cellular network (100).

9. The method (300) of any of claims 1-8, wherein the cellular network (100) transmits the control message (128) according to the General Packet Radio Service (GPRS) Tunneling Protocol control plane (GTP-C).

10. The method (300) of any of claims 1-9, wherein the cellular network (100) communicates the control message (128) according to a Diameter protocol.

11. The method (300) of any of claims 1-10, wherein the control message (128) corresponds to one of a plurality of control messages (128) sent by a user of the cellular network (100) during a single network session.

12. The method (300) of any of claims 1-11, wherein the one or more features (222) extracted from the control message (128) comprise a message type summary vector representing a number of times a message type occurs within a single session of a user of the cellular network (100).

13. The method (300) of any of claims 1-12, wherein the one or more features (222) comprise an amount of data communicated within a time period associated with a single session of a user of the cellular network (100).

14. The method (300) according to any one of claims 1-13, wherein the predictive model (232) includes a deep neural network or a recurrent neural network.

15. The method (300) of any of claims 1-14, wherein analyzing the control message (128) to determine whether the control message (128) corresponds to the respective network performance issue (202) that affects network performance of the cellular network (100) comprises clustering the control message (128) into clusters that share a respective one of the one or more features (222) extracted from the control message (128).

16. A network gateway device (100) comprising:

data processing hardware (124); and

memory hardware (126) in communication with the data processing hardware (124), the memory hardware (126) storing instructions that, when executed on the data processing hardware (124), cause the data processing hardware (124) to perform operations comprising:

receiving a control message (128) from the cellular network (100);

extracting one or more features (222) from the control message (128);

predicting a potential label (234) of the control message (128) using a predictive model (232), the predictive model (232) configured to receive the one or more features (222) extracted from the control message (128) as feature inputs, the predictive model (232) trained on a set of training control messages (226), each training control message (226) comprising one or more corresponding features (222) and an actual label (224);

determining that a probability (P) of the potential label (234) satisfies a confidence threshold (236);

analyzing the control message (128) to determine whether the control message (128) corresponds to a respective network performance issue (202) that affects network performance of the cellular network (100); and

transmitting the network performance issue (202) to a network entity (40) responsible for the network performance issue (202) when the control message (128) corresponds to a respective network performance issue (202) affecting network performance.

17. The network gateway device (100) of claim 16, wherein predicting the potential label (234) using the predictive model (232) comprises predicting a probability distribution (P_dis) over potential labels (234), the predicted potential label (234) comprising one of the potential labels (234) in the probability distribution (P_dis) over the potential labels (234).

18. The network gateway device (100) of claim 17, wherein predicting the potential label (234) comprises selecting the potential label (234) associated with a highest probability (P) in the probability distribution (P_dis) over the potential labels (234).

19. The network gateway device (100) of any of claims 16 to 18, wherein the operations further comprise: when the control message (128) fails to correspond to the respective network performance issue (202):

receiving a subsequent control message (128) from the cellular network (100);

extracting one or more corresponding features (222) from the subsequent control message (128);

identifying that at least one of the one or more corresponding features (222) extracted from the subsequent control message (128) matches the one or more features (222) extracted from the control message (128); and

removing the identified at least one of the one or more corresponding features (222) extracted from the subsequent control message (128) from the feature inputs to the predictive model (232) prior to using the predictive model (232) to predict a corresponding potential label (234) for the subsequent control message (128).

20. The network gateway device (100) of any of claims 16 to 19, wherein the operations further comprise: when the control message (128) fails to correspond to the respective network performance issue (202):

identifying the one or more features (222) extracted from the control message (128); and

prior to predicting a corresponding potential label (234) for a subsequent control message (128) using the predictive model (232):

modifying the set of training control messages (226) by removing each training control message (226) that includes one or more corresponding features (222) that match any of the identified one or more features (222) extracted from the control message (128); and

retraining the predictive model (232) with the modified set (228) of training control messages (226).

21. The network gateway device (100) of any of claims 16 to 20, wherein the predictive model (232) comprises a multi-class classification model configured to predict one or more types of labels.

22. The network gateway device (100) of any of claims 16 to 21, wherein the actual label (224) of each training control message (226) comprises a Type Allocation Code (TAC) for a User Equipment (UE) device (102) associated with the training control message (226).

23. The network gateway device (100) of any of claims 16 to 22, wherein the actual label (224) of each training control message (226) comprises an identifier of a network element of the cellular network (100).

24. The network gateway device (100) according to any one of claims 16 to 23, wherein the cellular network (100) transmits the control message (128) according to the General Packet Radio Service (GPRS) Tunneling Protocol control plane (GTP-C).

25. The network gateway device (100) according to any one of claims 16 to 24, wherein the cellular network (100) communicates the control message (128) according to a Diameter protocol.

26. The network gateway device (100) of any of claims 16 to 25, wherein the control message (128) corresponds to one of a plurality of control messages (128) sent by a user of the cellular network (100) during a single network session.

27. The network gateway device (100) of any of claims 16 to 26, wherein the one or more features (222) extracted from the control message (128) comprise a message type summary vector representing a number of times a message type occurs within a single session of a user of the cellular network (100).

28. The network gateway device (100) of any of claims 16 to 27, wherein the one or more features (222) comprise an amount of data communicated within a time period associated with a single session of a user of the cellular network (100).

29. The network gateway device (100) of any of claims 16 to 28, wherein the predictive model (232) comprises a deep neural network or a recurrent neural network.

30. The network gateway device (100) of any of claims 16 to 29, wherein analyzing the control message (128) to determine whether the control message (128) corresponds to the respective network performance issue (202) that affects network performance of the cellular network (100) comprises clustering the control message (128) into clusters that share a respective one of the one or more features (222) extracted from the control message (128).

Technical Field

The present disclosure relates to network anomaly detection.

Background

Cellular communication networks provide communication content, such as voice, video, packet data, messaging, and broadcast, for subscriber devices, such as mobile devices and data terminals. A cellular communication network may include a plurality of base stations capable of supporting communication for a plurality of subscriber devices across a dispersed geographic area. Typically, when a user equipment, such as a mobile telephone, moves from the vicinity of one base station to another, mobile and fixed components of the cellular network exchange radio measurement and control messages to ensure that the mobile equipment is always ready to receive data from, and transmit data to, external networks, such as the internet or voice services. Unfortunately, cellular communication networks can develop network performance problems that adversely affect these measurement and control messages. Without an accurate way to detect such problems, the cellular network may be unable to ensure that the user equipment can receive and transmit data as reliably as the network's capabilities allow.

Disclosure of Invention

One aspect of the present disclosure provides a method for detecting network anomalies. The method includes receiving, at data processing hardware, a control message from a cellular network. The method further includes extracting, by the data processing hardware, one or more features from the control message. The method also includes predicting, by the data processing hardware, a potential label of the control message using a predictive model configured to receive as feature inputs the one or more features extracted from the control message. Here, the predictive model is trained on a set of training control messages, wherein each training control message includes one or more corresponding features and an actual label. The method further includes determining, by the data processing hardware, that a probability of the potential label satisfies a confidence threshold. The method also includes analyzing, by the data processing hardware, the control message to determine whether the control message corresponds to a respective network performance issue that affects network performance of the cellular network. When the control message corresponds to a respective network performance issue affecting network performance, the method includes communicating, by the data processing hardware, the network performance issue to a network entity responsible for the network performance issue.
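The claimed flow can be sketched as a small pipeline. This is an illustrative sketch only: the function and parameter names (`extract_features`, `predict`, `analyze`, `transmit`) and the numeric threshold are assumptions, not part of the disclosure.

```python
from dataclasses import dataclass

# Assumed value for illustration; the disclosure does not fix a number.
CONFIDENCE_THRESHOLD = 0.9


@dataclass
class Prediction:
    label: str          # potential label (234)
    probability: float  # probability P of that label


def handle_control_message(message, extract_features, predict, analyze, transmit):
    """Run the claimed steps in order for one control message (128)."""
    features = extract_features(message)   # one or more features (222)
    prediction = predict(features)         # predictive model (232) output
    if prediction.probability < CONFIDENCE_THRESHOLD:
        return None                        # label not confident enough; stop here
    issue = analyze(message)               # network performance issue (202) or None
    if issue is not None:
        transmit(issue)                    # notify the responsible network entity (40)
    return issue
```

Passing the steps in as callables keeps the sketch agnostic to how features are extracted or which model is used, which matches how the claims leave those choices open.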

Implementations of the invention may include one or more of the following optional features. In some embodiments, predicting the potential label using the predictive model includes predicting a probability distribution over potential labels, the predicted potential label comprising one of the potential labels in the probability distribution. In these embodiments, predicting the potential label includes selecting the potential label associated with the highest probability in the probability distribution over the potential labels. In some examples, the predictive model includes a multi-class classification model configured to predict one or more types of labels. The predictive model may include a deep neural network or a recurrent neural network. The actual label of each training control message includes a Type Allocation Code (TAC) for a User Equipment (UE) device associated with the training control message or an identifier of a network element of the cellular network. In some configurations, the cellular network communicates the control messages according to the General Packet Radio Service (GPRS) Tunneling Protocol control plane (GTP-C) or the Diameter protocol. Optionally, the control message corresponds to one of a plurality of control messages sent by a user of the cellular network during a single network session. In some examples, the one or more features extracted from the control message include a message type summary vector that represents a number of times a message type occurs within a single session of a user of the cellular network. In some implementations, the features include an amount of data communicated within a time period associated with a single session of a user of the cellular network.
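Two of these optional features lend themselves to short sketches: building a message type summary vector for a session, and selecting the label with the highest probability from the predicted distribution. The message-type vocabulary below is an assumption made for illustration; the disclosure does not enumerate one.

```python
from collections import Counter

# Assumed fixed vocabulary of control-message types (illustrative only).
MESSAGE_TYPES = ["CreateSession", "ModifyBearer", "DeleteSession"]


def message_type_summary_vector(session_messages):
    """Count how often each message type occurs within one user session."""
    counts = Counter(m["type"] for m in session_messages)
    return [counts.get(t, 0) for t in MESSAGE_TYPES]


def select_label(distribution):
    """Pick the potential label with the highest probability in P_dis."""
    label = max(distribution, key=distribution.get)
    return label, distribution[label]
```

The returned probability can then be compared against the confidence threshold before any further analysis of the control message.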

In some examples, when the control message fails to correspond to a respective network performance issue, the method includes receiving, at the data processing hardware, a subsequent control message from the cellular network and extracting, by the data processing hardware, one or more corresponding features from the subsequent control message. The method also includes identifying, by the data processing hardware, that at least one of the one or more corresponding features extracted from the subsequent control message matches the one or more features extracted from the control message, and removing, by the data processing hardware, the identified at least one feature extracted from the subsequent control message from the feature inputs to the predictive model prior to using the predictive model to predict the corresponding potential label for the subsequent control message.
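The feature-removal step above can be sketched as a simple filter applied before the model is called. The string-encoded feature format is a hypothetical choice for illustration.

```python
def prune_matching_features(subsequent_features, benign_features):
    """Drop any feature of a subsequent control message that matches a feature
    of a message already found not to correspond to a performance issue, so
    the predictive model never receives it as a feature input."""
    benign = set(benign_features)
    return [f for f in subsequent_features if f not in benign]
```

This keeps the model itself unchanged; only its input for the subsequent message is pruned.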

In some embodiments, when the control message fails to correspond to a respective network performance issue, the method includes identifying, by the data processing hardware, the one or more features extracted from the control message. Here, prior to using the predictive model to predict the corresponding potential labels for subsequent control messages, the method further includes modifying, by the data processing hardware, the set of training control messages by removing each training control message that includes one or more corresponding features matching any of the identified one or more features extracted from the control message, and retraining, by the data processing hardware, the predictive model with the modified set of training control messages.
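The training-set modification can be sketched as a filter over the stored training control messages; the model would then be retrained on the filtered set. The dictionary layout of a training message here is an assumption for illustration.

```python
def filter_training_set(training_messages, benign_features):
    """Remove every training control message that shares any feature with a
    control message found not to correspond to a performance issue. The
    predictive model is then retrained on what remains."""
    benign = set(benign_features)
    return [
        m for m in training_messages
        if not benign.intersection(m["features"])
    ]
```

Retraining on the filtered set discourages the model from again flagging messages whose distinguishing features turned out to be benign.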

Another aspect of the present disclosure provides a system for detecting network anomalies. The system includes data processing hardware and memory hardware in communication with the data processing hardware. The memory hardware stores instructions that, when executed on the data processing hardware, cause the data processing hardware to perform operations. The operations include receiving a control message from a cellular network and extracting one or more features from the control message. The operations further include predicting a potential label of the control message using a predictive model configured to receive the one or more features extracted from the control message as feature inputs. Here, the predictive model is trained on a set of training control messages, wherein each training control message includes one or more corresponding features and an actual label. The operations further include determining that the probability of the potential label satisfies a confidence threshold. The operations also include analyzing the control message to determine whether the control message corresponds to a respective network performance issue affecting network performance of the cellular network, and communicating the network performance issue to a network entity responsible for the network performance issue when the control message corresponds to the respective network performance issue affecting network performance.

This aspect may include one or more of the following optional features. In some embodiments, predicting the potential label using the predictive model includes predicting a probability distribution over potential labels, wherein the predicted potential label includes one of the potential labels in the probability distribution. In these embodiments, predicting the potential label includes selecting the potential label associated with the highest probability in the probability distribution over the potential labels. In some examples, the predictive model includes a multi-class classification model configured to predict one or more types of labels. The predictive model may include a deep neural network or a recurrent neural network. The actual label of each training control message includes a Type Allocation Code (TAC) for a User Equipment (UE) device associated with the training control message or an identifier of a network element of the cellular network. In some configurations, the cellular network communicates the control messages according to the General Packet Radio Service (GPRS) Tunneling Protocol control plane (GTP-C) or the Diameter protocol. Optionally, the control message corresponds to one of a plurality of control messages sent by a user of the cellular network during a single network session. In some examples, the one or more features extracted from the control message include a message type summary vector that represents a number of times a message type occurs within a single session of a user of the cellular network. In some implementations, the features include an amount of data communicated within a time period associated with a single session of a user of the cellular network.

In some examples, when the control message fails to correspond to the respective network performance issue, the operations include receiving a subsequent control message from the cellular network and extracting one or more corresponding features from the subsequent control message. Here, the operations further include identifying that at least one of the one or more corresponding features extracted from the subsequent control message matches the one or more features extracted from the control message, and removing the identified at least one feature extracted from the subsequent control message from the feature inputs to the predictive model prior to using the predictive model to predict the corresponding potential label for the subsequent control message.

In some embodiments, when the control message fails to correspond to a respective network performance issue, the operations include identifying the one or more features extracted from the control message. Here, prior to using the predictive model to predict the corresponding potential labels for subsequent control messages, the operations further include modifying the set of training control messages by removing each training control message that includes one or more corresponding features matching any of the identified one or more features extracted from the control message, and retraining the predictive model with the modified set of training control messages.

The details of one or more embodiments of the disclosure are set forth in the accompanying drawings and the description below. Other aspects, features, and advantages will be apparent from the description and drawings, and from the claims.

Drawings

Fig. 1 is a schematic diagram of an example communication network.

Fig. 2A-2D are schematic diagrams of an example anomaly detector for the communication network of fig. 1.

FIG. 3 is a flow diagram of an example method for detecting network anomalies.

FIG. 4 is a schematic diagram of an example computing device that may be used to implement the systems and methods described herein.

Like reference symbols in the various drawings indicate like elements.

Detailed Description

Cellular networks may suffer from a range of network problems (e.g., degraded hardware, misconfigurations between network elements, unreliable updates or upgrades to network devices, etc.). Network problems may affect network performance and result in users of the cellular network (i.e., subscribers of the cellular network) having a poor user experience with the cellular network. Poor user experience can lead to user frustration and may even lead to the user switching network operators (i.e., network providers) as a means to address network performance issues.

Network providers (or operators) have an incentive to address these issues because network issues may affect customer loyalty and may have a detrimental effect on their cellular services. Left unaddressed, these issues may result in loss of business for the network operator and potentially compromise the operator's reputation and/or brand. However, network operators do not typically experience network performance issues firsthand. In other words, it is the users of the cellular network who are typically affected by network performance issues. This means that network operators may often have to rely on network users to report network problems when they occur. User reporting, however, has several shortcomings as a means of addressing network problems. First, network users must not only realize that the problem they are experiencing may be due to the cellular network, but must also spend their own time reporting the problem to the network operator in some way. Clearly, this approach is unlikely to work well for users who are unaware that they are experiencing less-than-ideal performance. For example, users may become accustomed to below-average network performance or may not realize that network performance should be better. This type of user may never inform the network operator that there is a network performance problem, but may simply change cellular network providers, believing that another provider will deliver better performance. In other words, the original cellular provider may never have an opportunity to solve the problem. Furthermore, when a user does report a network performance problem, the network operator performs an investigation of the reported problem. These investigations can be labor-intensive processes that may leave some user problems unsolved due to a lack of available resources to investigate and resolve all reported problems. In particular, network operators may often have to prioritize labor resources toward operating the cellular network rather than investigating reported user issues.

Another approach is for the network operator to monitor the cellular network to detect anomalies that may indicate network performance problems. An anomaly refers to a unique occurrence (or different behavior) during the signaling of the cellular network. Here, the anomaly itself is agnostic as to whether the unique occurrence indicates adverse behavior (e.g., a network performance issue) or non-adverse behavior (e.g., not a network performance issue). However, by identifying the anomaly, the network operator may analyze it to determine whether it corresponds to a network performance issue.

Detecting anomalies within a cellular network has traditionally had its drawbacks. For example, depending on cellular usage and traffic, a cellular network may generate a large amount of log data (e.g., network logs, inter-process logs, usage statistics, etc.). Screening large amounts of data to identify anomalies can be resource intensive. Thus, when an anomaly affecting network performance is detected, the entity (e.g., a network operator) that detects the anomaly may develop rules to more easily detect the same or a similar anomaly in other situations. This conventional form of anomaly detection therefore generates one or more rules to identify deviations from normal behavior. For example, a rule may define that a particular message type typically occurs at a rate of five times per second. The rule allows the system to flag a deviation as an anomaly when that message type occurs more or fewer than five times per second. Unfortunately, a problem with this form of anomaly detection is that the entity must first specify what is considered normal behavior in order to identify anomalies whose behavior falls outside the specified norm. The method therefore applies only to known anomalies captured by known rules. In other words, a new anomaly that affects network performance will not be detected until a rule specifically addresses the new anomaly (or the normal behavior that should occur instead of it). This approach lacks any ability to predict new anomalies that may cause performance problems. A predictive anomaly detector can thus use anomalies to detect network performance problems more accurately.
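The rule-based baseline described above (five occurrences per second as "normal") can be sketched as follows. The tolerance band is an assumption added for illustration; the text fixes no tolerance.

```python
EXPECTED_RATE = 5.0   # the text's example: ~five occurrences per second is normal
TOLERANCE = 0.2       # assumed ±20% band around the expected rate (illustrative)


def rate_rule_anomalous(message_count, window_seconds):
    """Flag a deviation from the hand-written expected rate as an anomaly.

    This is the rule-based baseline the text describes: it catches only
    the deviations that the rule's author anticipated in advance.
    """
    rate = message_count / window_seconds
    return abs(rate - EXPECTED_RATE) > EXPECTED_RATE * TOLERANCE
```

A new anomaly whose signature is not a rate deviation for this message type would pass this check silently, which is exactly the limitation that motivates the predictive approach.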

Fig. 1 illustrates a communication network 100 (also referred to as a cellular network), which may be a Long Term Evolution (LTE) network, a 5G network, and/or a multiple access network supporting multiple access technologies specified by the Third Generation Partnership Project (3GPP), such as General Packet Radio Service (GPRS), Global System for Mobile Communications/Enhanced Data rates for GSM Evolution (GSM/EDGE), Universal Mobile Telecommunications System/High Speed Packet Access (UMTS/HSPA), LTE, and LTE Advanced network technologies. The cellular network 100 (e.g., an LTE network) enables wireless communication of high-speed data packets between subscriber devices 102, 102a-b, such as mobile phones and data terminals, and a base station 104. Subscriber devices 102 may be interchangeably referred to as User Equipment (UE) devices and/or mobile equipment 102. For example, LTE is a wireless communication standard based on GSM/EDGE and UMTS/HSPA network technologies and is configured to increase the capacity and speed of telecommunications by using a different radio interface in addition to core network improvements. Different types of cellular networks 100 may support different frequency bands/frequencies at various bandwidths to allow the UE devices 102 to communicate data (e.g., data packets). To illustrate, LTE supports scalable carrier bandwidths from 1.4 MHz to 20 MHz and supports both Frequency Division Duplex (FDD) and Time Division Duplex (TDD), while 5G supports bandwidths ranging from 5 MHz to 100 MHz, some of which overlap with LTE.

The UE device 102 may be any telecommunication device capable of sending and/or receiving voice/data over the network 100. The UE device 102 may include, but is not limited to, mobile computing devices such as laptops, tablets, smartphones, and wearable computing devices (e.g., headsets and/or watches). The UE device 102 may also include other computing devices having other form factors, such as computing devices included in desktop computers, smart speakers/displays, vehicles, gaming devices, televisions, or other appliances (e.g., networked home automation devices and home appliances). The UE device 102 subscribes to network services provided by a network operator of the communication network 100. The network operator may also be referred to as a Mobile Network Operator (MNO), a wireless service provider, a wireless operator, a cellular company, or a mobile network operator.

The UE device 102 may communicate with an external network 30, such as a Packet Data Network (PDN), through the communication network 100 (or a 5G/3G/2G network). Referring to fig. 1, the communication network 100 is an LTE network that includes a first portion, an evolved universal terrestrial radio access network (E-UTRAN) portion 106, and a second portion, an Evolved Packet Core (EPC) portion 108. The first portion 106 includes an air interface 110 (i.e., evolved universal terrestrial radio access (E-UTRA)), the UE devices 102, and a plurality of base stations 104, and corresponds to the 3GPP LTE upgrade path for mobile networks. The LTE air interface 110 uses Orthogonal Frequency Division Multiple Access (OFDMA) radio access for the downlink and Single Carrier FDMA (SC-FDMA) for the uplink. Accordingly, the first portion 106 provides a Radio Access Network (RAN) supporting radio communication of data packets and/or other services from the external network 30 to the UE devices 102 over the air interface 110 via one or more base stations 104.

Each base station 104 may include an evolved Node B (also referred to as an eNodeB or eNB). The eNB 104 includes hardware connected to the air interface 110 (e.g., a mobile telephony network) to communicate directly with the UE devices 102. For example, the eNB 104 may transmit downlink LTE/3G/5G signals (e.g., communications) to the UE device 102 and receive uplink LTE/3G/5G signals from the UE device 102 over the air interface 110. Each base station 104 may have an associated coverage area 104area, which corresponds to an area where one or more UE devices 102 communicate with the network 100 through that base station 104. The eNB 104 communicates with the EPC 108 using the S1 interface. The S1 interface may include an S1-MME interface for communicating with the Mobility Management Entity (MME) 112 and an S1-U interface for interfacing with the Serving Gateway (SGW) 116. Thus, the S1 interface is associated with a backhaul link for communicating with the EPC 108.

The EPC 108 provides a framework configured to converge voice and data on the LTE network 100. The EPC 108 unifies voice and data on an Internet Protocol (IP) services architecture, and voice is treated as just another IP application. The EPC 108 includes, but is not limited to, several network elements such as the MME 112, a Serving GPRS Support Node (SGSN) 114, the SGW 116, a Policy and Charging Rules Function (PCRF) 118, a Home Subscriber Server (HSS) 120, and a packet data network gateway (PGW) 122. The PGW 122 may also be referred to as the network gateway device 122. When the network corresponds to a 3G network, the network gateway device 122 includes a Gateway GPRS Support Node (GGSN) instead of the PGW 122. Alternatively, when the network corresponds to a 5G or 5G+ network, the network gateway device 122 may include a gateway node having a naming convention defined by the 5G and/or 5G+ network. The MME 112, SGSN 114, SGW 116, PCRF 118, HSS 120, and PGW 122 may be separate components, or at least two of these components may be integrated together. The EPC 108 communicates with the UE devices 102 and the external network 30 to route data packets therebetween.

The network 100 includes interfaces that allow the UE devices 102, base stations 104, and various network elements (e.g., the MME 112, SGSN 114, SGW 116, PCRF 118, HSS 120, and PGW 122) to cooperate with one another during use of the network 100. Information flows along these interfaces throughout the network 100, and in general these interfaces may be divided into a user plane and a control plane. The user plane routes user-plane traffic and includes a user-plane protocol stack with sublayers between the UE device 102 and the base station 104, such as Packet Data Convergence Protocol (PDCP), Radio Link Control (RLC), and Medium Access Control (MAC). Some of the user-plane-specific interfaces, shown in solid lines between network elements, are as follows: the S1-U interface between the base station 104 and the SGW 116 for per-bearer user-plane tunneling and inter-base-station path switching during handover; the S4 interface between a UE device 102 with 2G or 3G access and the PGW 122 for control and mobility support, and in some cases, user-plane tunneling; and the S12 interface (not shown) between the E-UTRAN portion 106 (e.g., the UE device 102) and the SGW 116 for user-plane tunneling as an operator configuration option. Other types of communication networks (e.g., 3G, 5G, etc.) may include other user-plane interfaces in addition to the user-plane interfaces depicted in fig. 1 for the network 100.

The control plane is responsible for controlling and supporting user-plane functions using control-plane protocols. In particular, the control plane controls E-UTRAN access connections (e.g., attaching and detaching the E-UTRAN portion 106 to/from the network 100), controls attributes of established network access connections (e.g., activation of IP addresses), controls routing paths of established network connections (e.g., to support user mobility), and/or controls the assignment of network resources based on demands on the network 100 (e.g., by users of UE devices 102). Some of the control-plane-specific interfaces, shown in dashed lines between network elements, are as follows: the S1-MME interface between the base station 104 and the MME 112, which guarantees delivery of signaling messages; the S3 interface between the SGSN 114 and the MME 112, which enables user/bearer information exchange for inter-3GPP access network mobility in idle and/or active states; the S5/S8 interface between the SGW 116 and the PGW 122, wherein the S5 interface is used in non-roaming scenarios to provide relocation based on UE device 102 mobility and to connect to non-collocated gateways of the PDN, while the S8 interface is used between Public Land Mobile Networks (PLMNs); the S10 interface to coordinate handovers between MMEs 112; the S11 interface between the MME 112 and the SGW 116 for communicating signaling messages; the S6a interface between the MME 112 and the HSS 120, which enables transfer of subscription and authentication data related to user access; the S6d interface between the HSS 120 and the SGSN 114, which likewise enables transfer of subscription and authentication data related to user access; and the S13 interface (not shown) supporting UE device 102 identity checking. Other types of communication networks (e.g., 3G, 5G, etc.) may include other control-plane interfaces in addition to the control-plane interfaces depicted in fig. 1 for the network 100.

When a particular UE device 102 connects to the network 100, one or more control messages 128 are sent between various network elements (e.g., between the Evolved Packet Core 108 and network elements of the E-UTRAN portion 106). For example, as illustrated in fig. 1, the base station 104 sends a control message 128 to the MME 112 indicating that a new UE device 102 is attempting to connect to the network 100. As another example, the SGW 116 sends a control message 128 to the MME 112 indicating that data from the external network 30 has arrived for a particular UE device 102 and that the UE device 102 needs to be woken up (or paged) to establish a tunnel in order to accept the waiting data. The control-plane interfaces may send such control messages 128 using a control-plane protocol such as the GPRS Tunneling Protocol for the control plane (GTP-C) or the Diameter protocol. The type of protocol used to send the control message 128 may depend on the interface. For example, the S3, S5/S8, and S10 interfaces use the GTP-C protocol, and the S11, S6a, S6d, and S13 interfaces use the Diameter protocol.
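The interface-to-protocol mapping in the example above can be sketched as a small lookup table (a hypothetical illustration; the dictionary and function names are ours, not part of any 3GPP API):

```python
# Map each EPC control-plane interface to the protocol it uses, per the
# description above: S3, S5/S8, and S10 use GTP-C, while S11, S6a, S6d,
# and S13 use Diameter.
CONTROL_PLANE_PROTOCOLS = {
    "S3": "GTP-C",
    "S5/S8": "GTP-C",
    "S10": "GTP-C",
    "S11": "Diameter",
    "S6a": "Diameter",
    "S6d": "Diameter",
    "S13": "Diameter",
}

def protocol_for_interface(interface: str) -> str:
    """Return the control-plane protocol used on a given interface."""
    return CONTROL_PLANE_PROTOCOLS[interface]
```

A collector that parses raw control-message logs could use such a table to choose the correct decoder per interface.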

The MME 112 is a key control node for the LTE network 100. The MME 112 manages sessions and states, and authenticates and tracks the UE devices 102 across the network 100. For example, the MME 112 may perform various functions such as, but not limited to, controlling non-access stratum (NAS) signaling and security, authentication and mobility management of the UE device 102, selection of a gateway for the UE device 102, and bearer management functions. The SGSN 114 may function in some ways similarly to the MME 112. For example, the SGSN 114 tracks the location of the UE device 102 and performs security and access control functions. In some examples, the SGSN 114 is responsible for mobility management, logical link management, authentication, charging functions, and/or handling overload conditions (e.g., of standby-mode UE devices 102). The SGW 116 performs various functions related to IP data transfer for the user equipment 102, such as data routing and forwarding and mobility anchoring. The SGW 116 can perform functions such as buffering, routing, and forwarding data packets for the mobile device 102.

PCRF 118 is the node responsible for real-time policy rules and charging in EPC 108. In some examples, PCRF 118 is configured to access a subscriber database (i.e., a UE device user) to make policy decisions. Quality of service management may be controlled through dynamic policy interactions between PCRF 118 and network gateway device 122. Signaling by PCRF 118 may establish or modify attributes of the EPS bearer (i.e., the virtual connection between UE device 102 and PGW 122). In some configurations, such as voice over LTE (VoLTE), PCRF 118 allocates network resources for establishing a call and distributing the requested bandwidth to call bearers having configured attributes.

The HSS 120 refers to a database of all UE devices 102 that includes all UE device user data. In general, the HSS 120 is responsible for authentication for call and session establishment. In other words, the HSS 120 is configured to communicate subscription and authentication data for user access and UE context authentication. The HSS 120 interacts with the MME 112 to authenticate the UE device 102 and/or UE device user. The MME 112 communicates with the HSS 120 on the PLMN using the Diameter protocol (e.g., via the S6a interface).

The PGW 122 (i.e., network gateway device) performs various functions such as, but not limited to, Internet Protocol (IP) address assignment, maintenance of data connectivity for the UE device 102, packet filtering for the UE device 102, service-level gating control and rate enforcement, Dynamic Host Configuration Protocol (DHCP) functions for clients and servers, and Gateway GPRS Support Node (GGSN) functions.

In some implementations, the data processing hardware 124 of the network gateway device 122 (e.g., a PGW, a GGSN, or a gateway node with a naming convention defined by a 5G and/or 5G+ network) receives the control messages 128 associated with at least one UE device 102. The data processing hardware 124 may receive a control message 128 based on the interaction of the at least one UE device 102 with the network 100 within the coverage area 104area of a base station 104.

With further reference to fig. 1, the communication network 100 further comprises an anomaly detector 200. In some examples, the anomaly detector 200 is part of the network gateway device 122 (e.g., a PGW or GGSN or a gateway node with another naming convention defined by a 5G and/or 5G+ network). For example, the data processing hardware 124 and/or memory hardware 126 of the network gateway device 122 host the anomaly detector 200 and perform the functions of the anomaly detector 200. In some embodiments, the anomaly detector 200 is in communication with the E-UTRAN portion 106 and the EPC 108, but resides on the external network 30 (e.g., on data processing hardware corresponding to the external network 30). In other words, the external network 30 may be a distributed system (e.g., a cloud environment) having its own data processing hardware or shared data processing hardware (e.g., shared with the network gateway device 122). In other configurations, network elements other than the network gateway device 122 implement the anomaly detector 200. Additionally or alternatively, the anomaly detector 200 resides on more than one network element of the network 100.

In general, anomaly detector 200 is configured to detect anomalies occurring within network 100 based on one or more control messages 128. Using the detected anomaly, the anomaly detector 200 analyzes whether the anomaly corresponds to a network performance problem 202 that affects the performance of the network 100. In other words, anomaly detector 200 identifies unique events (i.e., anomalies) within network 100 and determines whether the unique events are detrimental to the performance of network 100 (or negatively impact user experience). When the anomaly detector 200 identifies that the detected anomaly affects network performance, the anomaly detector 200 is configured to notify the network entity 40 responsible for the network performance issue 202 or relay the network performance issue 202 to an entity that is aware of or in communication with the responsible entity. For example, the anomaly detector 200 may signal or inform a network operator of network performance issues 202 corresponding to detected anomalies. In some embodiments, anomaly detector 200 transmits one or more control messages 128 to network entity 40 indicating a network anomaly. Here, the network entity 40 may further analyze the one or more control messages 128 to help solve the network problem 202.

Referring to fig. 2A-2D, the anomaly detector 200 generally includes a collector 210, an extractor 220, a predictor 230, and an analyzer 240. The collector 210 is configured to receive at least one control message 128 from the network 100. In some embodiments, the collector 210 includes a data store 212 for collecting the control messages 128 from the network 100 to serve as a central database for recording data corresponding to the control messages 128. Utilizing the collector 210, the anomaly detector 200 may process the control messages 128 in various ways to create training data (e.g., training control messages) that may be used to detect anomalies. For example, the collector 210 groups together the control messages 128 of a single session from the UE device 102 (e.g., within the data store 212). In some examples, a session refers to the period of time from when a user initiates a CreateSessionRequest or CreatePDPContextRequest message (via the UE device 102) to when the user terminates the session with a DeleteSessionResponse or DeletePDPContextRequest message. As another example, the collector 210 groups the control messages 128 together to indicate an amount of data 129 that is transmitted (e.g., in the uplink direction, the downlink direction, or both) within a particular time period (e.g., during a session). With these control messages 128 grouped together, the collector 210 forms a representation of the total amount of data 129 for a particular time period.
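The collector's session-grouping behavior might be sketched as follows (a minimal illustration with hypothetical message records; the field names `subscriber`, `type`, and `bytes` are assumptions, and only the CreateSessionRequest/DeleteSessionResponse pair is modeled for brevity):

```python
def group_by_session(messages):
    """Group control messages into per-subscriber sessions.

    A session opens on a CreateSessionRequest and closes on a
    DeleteSessionResponse; messages in between (including the
    delimiters themselves) are collected together.
    """
    sessions = []
    open_sessions = {}  # subscriber -> messages of the currently open session
    for msg in messages:
        sub = msg["subscriber"]
        if msg["type"] == "CreateSessionRequest":
            open_sessions[sub] = [msg]
        elif sub in open_sessions:
            open_sessions[sub].append(msg)
            if msg["type"] == "DeleteSessionResponse":
                sessions.append(open_sessions.pop(sub))
    return sessions

def session_data_volume(session):
    """Total amount of data reported across a session's messages."""
    return sum(m.get("bytes", 0) for m in session)
```

Grouped this way, each session also yields the total data volume 129 transmitted during its time window.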

In other configurations, the collector 210 collects log data as a sequence such that the control messages 128 are concatenated together in time order (e.g., t0-t3). Here, the strings of control messages 128 may be aggregated by entity (e.g., a particular user or UE device 102) or by session of an entity. If these sequences become too long, the collector 210 may be configured to break the sequences into subsequences of fixed length and associate an identifier of the original sequence with each subsequence. Alternatively, the sequence may have a tag (e.g., a particular entity or UE device 102) that cannot be carried over to the one or more subsequences when the collector 210 breaks up the sequence.
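Breaking a long sequence into fixed-length subsequences while retaining the original sequence's identifier, as described above, can be sketched as (function and field names are hypothetical):

```python
def split_sequence(seq, seq_id, max_len):
    """Break a long message sequence into fixed-length subsequences,
    tagging each subsequence with the identifier of the original
    sequence so it can be traced back after the split."""
    return [
        {"origin": seq_id, "messages": seq[i:i + max_len]}
        for i in range(0, len(seq), max_len)
    ]
```

The final subsequence may be shorter than `max_len`; whether to pad or drop it is a modeling choice left open here.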

The extractor 220 is configured to extract information from one or more control messages 128 and/or log data corresponding to the control messages 128. The extractor 220 may extract one or more features 222 and/or one or more labels 224 from one or more control messages 128 (or portions thereof). Each feature 222 and/or label 224 refers to a characteristic derived from the control message 128. In some examples, the label 224 is a characteristic of a network element, UE device 102, user of the UE device, or base station 104 that is typically obscured due to 3GPP standardization of the network 100. In other words, while the extractor 220 may generate the actual label 224 directly from the control message 128 (or log data related to the control message 128), when the network 100 is 3GPP compliant, it should not be possible to simply determine the actual label 224 contextually from one or more control messages 128. One such example of a label 224 is a Type Allocation Code (TAC) that identifies the wireless device (e.g., the mobile phone type of the UE device 102). Other examples of labels 224 may include, but are not limited to, identifiers corresponding to network elements of the network 100 (e.g., MME identifiers, Base Station Identification Codes (BSICs), International Mobile Equipment Identities (IMEIs), E-UTRAN Cell Identities (ECIs)/E-UTRAN Cell Global Identifiers (ECGIs), etc.).

A feature 222, on the other hand, corresponds to a characteristic derived from the control message 128 that is different from the characteristics forming the labels 224. Here, unlike the label 224, a feature 222 of the control message 128 may be discernable even when the network 100 is 3GPP compliant. Some examples of features 222 include a control message type (e.g., expressed as an integer), a cause type of a GTP-C message, an amount of time elapsed between adjacent messages (e.g., when the control messages 128 are ordered by the collector 210), and so forth. In some examples, the extractor 220 extracts different features 222 from different control message protocols. For example, the features 222 extracted from GTP-C messages will differ from the features 222 extracted from Diameter messages. In some examples, the features 222 extracted by the extractor 220 are crossed to create new features 222. A feature cross refers to a combination of portions of two or more features 222. For example, the extractor 220 crosses the message type feature 222 and the cause value feature 222 to generate a message type-cause value feature 222. With the cross features 222, the extractor 220 may provide additional training data, potentially increasing the ability of the anomaly detector 200 to detect anomalies.
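A feature cross of the kind described above, combining the message type and cause value into one feature, might look like the following sketch (the feature names and the string encoding of the cross are assumptions):

```python
def cross_features(example, names, sep="|"):
    """Combine two or more extracted features into a single cross
    feature, e.g., message type x cause value."""
    crossed_name = sep.join(names)
    crossed_value = sep.join(str(example[n]) for n in names)
    return crossed_name, crossed_value

# Hypothetical extracted features for one GTP-C message.
example = {"message_type": 32, "cause": 16}
name, value = cross_features(example, ["message_type", "cause"])
```

The crossed feature lets the model learn correlations (e.g., a particular message type occurring with a particular cause value) that neither feature exposes alone.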

Whether the extractor 220 extracts the features 222 and/or the labels 224 may depend on the stage of the anomaly detector 200. In a first stage (e.g., a training stage), the anomaly detector 200 trains to be able to predict network anomalies. To train the anomaly detector 200, the extractor 220 extracts information from one or more control messages 128 at the collector 210. The extracted information forms a training control message 226 that includes one or more features 222 and an actual label 224. By including the actual labels 224 as ground truth in the training control messages 226, the anomaly detector 200 learns which features 222 may correspond to which labels 224. In a second stage (e.g., inference), after the anomaly detector 200 is trained, the extractor 220 no longer provides training control messages 226 having both features 222 and labels 224. Instead, the extractor 220 extracts one or more features 222 from the control message 128 and relies on the trained anomaly detector 200 to predict the label 224. In other words, since processing each control message 128 to extract the actual label 224 from it is time-consuming and therefore not feasible in real time, the trained anomaly detector 200 can predict a potential label 234 using only the features 222 extracted from the control message 128 as feature inputs.

The predictor 230 is configured to use a predictive model 232 to predict a potential label 234 for the control message 128 that is associated with the one or more features 222 extracted from the control message 128 by the extractor 220. Ideally, due to 3GPP standardization, the predictor 230 should not be able to generate a prediction P in which the potential label 234 matches (i.e., correctly predicts) the actual label 224 of a given control message 128. Thus, when the predictor 230 predicts, from at least one control message 128 (e.g., the features 222 of the control message 128), a potential label 234 that matches the actual label 224, the match indicates a unique correlation (i.e., a detected anomaly) between the control message(s) 128 and the labels 224, 234.
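The core detection rule, that a correct label prediction is itself the anomaly signal, can be sketched as follows (hypothetical function names; `predict` stands in for the trained model's inference function):

```python
def is_anomalous(predicted_label, actual_label):
    """Under 3GPP standardization the label (e.g., a TAC) should not be
    recoverable from the extracted features alone, so a correct
    prediction signals a unique correlation, i.e., a detected anomaly."""
    return predicted_label == actual_label

def detect_anomalies(records, predict):
    """records: iterable of (features, actual_label) pairs;
    predict: a trained model's inference function.
    Returns the records flagged as anomalous."""
    return [(f, y) for f, y in records if is_anomalous(predict(f), y)]
```

Note the inversion relative to ordinary classification: here a high-accuracy prediction is the unexpected event, not the goal.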

When the predictor 230 generates the correct prediction P, the analyzer 240 analyzes the relevant control messages 128 and/or log data corresponding to the control messages 128. Here, the analyzer 240 analyzes the control message 128 to determine whether the control message 128 corresponds to a network performance issue 202 that affects network performance of the network 100. In other words, the analyzer 240 determines whether the detected anomaly is the only correlation due to adverse behavior, or whether the detected anomaly is the only behavior with little or no impact on network performance or user experience. When the analyzer 240 determines that the detected anomaly of the control message 128 affects network performance, the analyzer 240 marks the adverse behavior as pending remediation. To fix the behavior, the analyzer 240 may communicate the network performance issue 202 to a network entity 40 (e.g., a network operator or UE device provider) responsible for the network performance issue 202.

In some configurations, the analyzer 240 performs clustering. Clustering may be beneficial in situations where the network 100 is experiencing too many anomalies to investigate. Rather than investigating each and every detected anomaly, the analyzer 240 clusters the detected anomalies into similar groups. By clustering into groups, the analyzer 240 may prioritize larger clusters that may have a greater adverse effect on the network 100 (e.g., sorting clusters by network effect or likelihood/probability of network effect). Further, when the analyzer 240 relies on manual analysis to determine whether a detected anomaly corresponds to a network performance issue 202, the analyzer 240 may use an autoencoder to perform dimensionality reduction. Dimensionality reduction by the autoencoder is configured to reduce a large dataset (i.e., a large number of anomalies) by correlating redundant features in the large dataset. Here, as a neural network trained via gradient descent, the autoencoder performs dimensionality reduction by attempting to identify new structure or uniqueness in the dataset. In other words, the autoencoder may isolate the more unique anomalies of the network 100 that may be more likely to be associated with the network performance issues 202 that should be analyzed. By combining clustering and autoencoding, a large number of anomalies can be formed into smaller groups (clusters) and then further reduced to efficiently utilize human and/or computing resources.
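Grouping detected anomalies and prioritizing the largest clusters, as described above, might be sketched like this (the `signature` function, which maps an anomaly to a comparable key, is an assumption; a production analyzer could instead cluster on learned embeddings, e.g., autoencoder codes):

```python
from collections import defaultdict

def cluster_anomalies(anomalies, signature):
    """Group detected anomalies whose signatures match, then order the
    clusters by size so the largest (and likely most impactful)
    clusters are investigated first."""
    clusters = defaultdict(list)
    for anomaly in anomalies:
        clusters[signature(anomaly)].append(anomaly)
    return sorted(clusters.values(), key=len, reverse=True)
```

The sort key could equally be an estimated network impact or probability of impact, per the text.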

The predictor 230 predicts the potential labels 234 using the prediction model 232. In some examples, the predictive model 232 is a neural network (e.g., a Deep Neural Network (DNN), a Recurrent Neural Network (RNN), or a Convolutional Neural Network (CNN)). To generate the prediction P, the prediction model 232 undergoes model training. Here, the training of the predictive model 232 is performed using examples (also referred to as training data or training datasets) corresponding to the control messages 128 and/or their associated log data. In some implementations, the extractor 220 generates the set 228 of training control messages 226 as the examples for training the predictive model 232 (e.g., as shown in fig. 2B). In some configurations, each training control message 226 corresponds to a control message 128 processed at the collector 210. The extractor 220 may form each training control message 226 by extracting one or more features 222 from a control message 128 along with the actual label 224 for the control message 128. In some examples, when more than one control message 128 has the same label 224, the features 222 of these control messages 128 are combined into one example or training control message 226 of the set 228. For example, the extractor 220 creates a message type vector digest to account for each type of control message 128 included in the combination. The message type vector digest may include one entry for each possible message type to represent the number of times a particular control message 128 is encountered (e.g., within a single session).
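A message type vector digest, with one count entry per possible message type, can be sketched as (a minimal illustration; the set of known types is an assumption):

```python
def message_type_digest(session_messages, known_types):
    """Build a message-type vector digest: one entry per possible
    message type, counting how often each type appears in a session."""
    counts = {t: 0 for t in known_types}
    for msg_type in session_messages:
        if msg_type in counts:
            counts[msg_type] += 1
    return [counts[t] for t in known_types]
```

Because the vector has a fixed length and ordering, digests from sessions with different message counts remain directly comparable as model inputs.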

To train the predictive model 232, the predictor 230 divides the set 228 of training control messages 226 into a training set 226T and a validation set 226V. In some examples, in addition to the training set 226T and the validation set 226V, the training control messages 226 are further divided into a test set. The predictive model 232 trains on the training set 226T while the validation set 226V is used to determine when to stop training (e.g., to prevent overfitting). Training may stop when the performance of the predictive model 232 reaches a certain threshold or when the performance of the predictive model 232 on the validation set 226V stops improving. In some examples, the test set is used to evaluate the final performance of the prediction model 232. In some embodiments, the predictive model 232 is trained as a multi-class classification model. As a multi-class classification model, the prediction model 232 outputs a probability distribution PBdis representing a probability PB for each class. For example, when the prediction model 232 predicts TACs, each TAC will be a different class, such that the prediction model 232 will output a probability for each TAC class.
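The split-and-stop procedure can be sketched as a generic early-stopping loop (hypothetical; `train_step` and `validate` stand in for one epoch of training on the training set and evaluation on the validation set):

```python
def train_with_early_stopping(train_step, validate, max_epochs, patience=3):
    """Train epoch by epoch, evaluating a validation loss after each
    epoch; stop once the loss has not improved for `patience` epochs
    (to prevent overfitting)."""
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_step()
        val_loss = validate()
        if val_loss < best_loss:
            best_loss = val_loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_loss, epoch + 1
```

Final performance would then be measured once, on the held-out test set, after the loop exits.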

In some examples, the process of training and evaluating the predictive model 232 occurs continuously to provide early detection of new network problems 202 that may arise. Once training is complete, the prediction P from the training may be fed back into the predictive model 232. These predictions P may correspond to training set 226TA validation set 226v, a test set, or any combination thereof. In other words, the predictive model 232 is configured to evaluate its prediction P from training of training data (e.g., the set 228 of training control messages 226). This approach may ensure that the predictive model 232 has completed training and is ready to predict the potential tags 234.

Referring to fig. 2B and 2D, in some examples, the prediction model 232 of the predictor 230 generates a probability PB for the prediction P of the potential label 234. To evaluate the probability PB of a potential label 234, the predictor 230 may apply a confidence threshold 236. The confidence threshold 236 indicates the level of confidence at which the probability PB of the potential label 234 corresponds to an anomaly that the analyzer 240 is required to evaluate for adverse behavior. In other words, when the predicted probability PB of the potential label 234 satisfies the confidence threshold 236, the predictor 230 passes the control message 128 corresponding to the potential label 234 to the analyzer 240. For example, when the confidence threshold 236 is 90%, a probability PB greater than 90% for a prediction P of a potential label 234 indicating a TAC is a confident prediction P that should be passed to the analyzer 240 for further analysis.

In some configurations, the predictive model 232 outputs/predicts a probability distribution PBdis over the potential labels 234a-n. In these configurations, each potential label 234a-n in the probability distribution PBdis includes a corresponding probability PB. In some examples, the predictor 230 predicts the potential label 234 by selecting the potential label 234a-n having the highest probability PB in the probability distribution PBdis over the potential labels 234a-n. In the example shown in figs. 2B and 2D, the potential label 234a has the highest probability PB, ninety-one percent (91%), in the probability distribution PBdis over the potential labels 234a-n. Thus, the predictor 230 selects the potential label 234a and compares its probability PB (91%) to the confidence threshold 236 (90%). In this example, the predictor 230 determines that the probability PB of the selected potential label 234a satisfies the confidence threshold 236 and passes the corresponding control message 128 to the analyzer 240 to determine whether the control message 128 corresponds to a respective network performance issue 202 that affects network performance. In some scenarios, the predictor 230 communicates to the analyzer 240 each potential label 234a-n in the probability distribution PBdis having a corresponding probability PB that satisfies the confidence threshold 236.
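Selecting the highest-probability potential label from the output distribution and comparing it to the confidence threshold can be sketched as (the dictionary representation of the distribution is an assumption):

```python
def predict_potential_label(distribution, confidence_threshold=0.90):
    """Pick the potential label with the highest probability from the
    model's output distribution; report it only when its probability
    exceeds the confidence threshold, otherwise return no label."""
    label = max(distribution, key=distribution.get)
    p = distribution[label]
    return (label, p) if p > confidence_threshold else (None, p)
```

With a distribution such as {"tac_a": 0.91, "tac_b": 0.06, "tac_c": 0.03} and a 90% threshold, "tac_a" would be selected and forwarded to the analyzer, matching the 91%-versus-90% example in the text.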

In some configurations, the prediction model 232 is an RNN model, which is more suitable (than a DNN model) for sequential data. For the RNN model, the extractor 220 generates sequences of features 222. In other words, the extractor 220 may form the training control messages 226 from sequences of control messages 128 (or sequences of features 222 from sequences of control messages 128). With the sequential features 222, each sequence may be a training example, such that the sequential features 222 are divided into a training dataset, a validation dataset, and a test dataset. The RNN model operates much like the prediction model 232 previously described, except that it is preferred for sequential data.

In some examples, the predictive model 232 has difficulty distinguishing between different potential labels 234 that behave similarly. For example, when predicting the TAC, there may be several TACs (e.g., three TACs) that behave identically. This identical behavior results in the prediction model 232 knowing with confidence that the TAC is one of the three TACs, but not being able to predict exactly which TAC. To overcome this problem, the predictor 230 may use Principal Component Analysis (PCA) to identify groups of labels 234 that behave similarly (e.g., the three similar TACs). Using PCA, the prediction P of the potential label 234 may be a vector, where PCA identifies which groupings of labels 224 are jointly predicted together. For example, PCA will identify that the three TACs should be considered together because the principal component vector for these three TACs will have strong peaks indicating that they should be grouped (or considered) together.

Referring to fig. 2C and 2D, the anomaly detector 200 may further include a filter 250. The filter 250 is configured to prevent redundant analysis of similar detected anomalies. In other words, the anomaly detector 200 generates the filter 250 once an anomaly has been detected. The filter 250 may be used for anomalies with adverse behavior or for anomalies with non-adverse behavior (i.e., acceptable behavior). Once the analyzer 240 has determined whether the control message 128 corresponding to the anomaly affects network performance, performing this same analysis on similar control messages 128 or sequences of similar control messages 128 may divert anomaly detection resources from detecting a new anomaly or an anomaly that needs to be analyzed. Thus, the filter 250 attempts to prevent duplicate analysis. For example, when the analyzer 240 determines that a control message 128 corresponds to a respective network issue 202 affecting network performance, the respective network issue 202 and/or the control message 128 is reported to the responsible network entity 40. Here, reanalyzing similar control messages 128 and reporting them to the network entity 40 would be redundant, as the corresponding network problem 202 has already been reported and will be resolved by the responsible network entity 40 in due time. On the other hand, when the analyzer 240 determines that the control message does not affect network performance, the anomaly associated with the control message 128 is harmless and therefore acceptable. Therefore, it would be pointless to reanalyze subsequent similar control messages 128.

The anomaly detector 200 can generally apply the filter 250 in two scenarios: (1) on the features 222 extracted from a control message 128 before input to the predictive model 232; or (2) on the set 228 of training control messages 226 used to train the predictive model 232. In some examples (i.e., the first scenario), the anomaly detector 200 applies the filter 250 after the prediction model 232 has been trained, but before the one or more features 222 extracted from a subsequent control message 128 are input to the trained prediction model 232 for a prediction P of a subsequent potential label 234. Here, the anomaly detector 200 identifies that at least one of the one or more corresponding features 222 extracted from the subsequent control message 128 matches one or more features 222 extracted from a previous control message 128 having a predicted potential label 234 indicative of a network anomaly (i.e., the predicted potential label 234 satisfies the confidence threshold 236). Thereafter, prior to predicting the corresponding potential label 234 of the subsequent control message 128 using the predictive model 232, the anomaly detector 200 applies the filter 250 to remove the identified at least one of the one or more corresponding features 222 extracted from the subsequent control message 128 from the feature input used by the predictive model 232. Thus, any prediction P of a potential label 234 output by the predictive model 232 at the predictor 230 will not be based on the features 222 extracted from the previous control messages 128 having a predicted potential label 234 indicative of a network anomaly, regardless of whether the analyzer 240 determines that the network anomaly is harmless or affects network performance. For example, fig. 2C illustrates the filter 250 blocking and/or removing one of the three features 222 (shown in gray) that would otherwise be communicated to the predictor 230 to predict the potential label 234 of the subsequent control message 128.
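The first filtering scenario, dropping previously flagged features before they reach the model's feature input, might be sketched as follows (representing each feature as a name/value pair and the flagged set as a plain set are assumptions):

```python
def apply_filter(features, flagged_features):
    """Drop any extracted feature that matches a feature from a prior
    control message already flagged as anomalous, so the model's next
    prediction is not driven by an anomaly that was already analyzed."""
    return {name: value for name, value in features.items()
            if (name, value) not in flagged_features}
```

The same set of flagged (name, value) pairs could later be cleared once the associated network performance issue is resolved, per the filter-removal behavior described below.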

In other examples (i.e., the second scenario), such as in fig. 2D, the anomaly detector 200 retrains the predictive model 232 so that any features 222 matching those extracted from a control message 128 previously identified as having a prediction P of a potential tag 234 indicative of a network anomaly are removed from the set 228 of training control messages 226. This approach applies whether or not the control message 128 corresponds to a network performance issue 202. To retrain the predictive model 232, the anomaly detector 200 first identifies the one or more features 222 extracted from the previous control message 128 whose potential tag 234 indicated a network anomaly. Then, before predicting the corresponding potential tag 234 of a subsequent control message 128 using the predictive model 232, the anomaly detector 200 modifies the set 228 of training control messages 226 by removing each training control message 226 that includes one or more corresponding features 222 matching any of the identified one or more features 222 extracted from the previous control message 128. Thereafter, the anomaly detector 200 retrains the predictive model 232 on the modified set 228 of training control messages 226. For example, fig. 2D depicts the filter 250 modifying the set 228 of training control messages 226 by removing one of the three training control messages 226 from the retraining set (i.e., the modified set 228). Once the one or more training control messages 226 have been removed, the filter 250 retrains the predictive model 232 on the modified set 228 of training control messages 226. In other words, if the predictive model 232 is not trained to recognize the features 222 that indicate the anomaly, the anomaly will not subsequently be detected and is therefore ignored.
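The second scenario amounts to pruning the training set before retraining. The sketch below uses hypothetical message types and labels; the retraining call itself is elided since the patent does not fix a model type.

```python
def modify_training_set(training_msgs, flagged_features):
    """Sketch of filter 250, scenario (2): remove every training control
    message containing a feature that matches one extracted from the
    previously flagged message, producing the modified set 228 on which
    the predictive model 232 is then retrained."""
    return [
        m for m in training_msgs
        if not any((k, v) in flagged_features for k, v in m["features"].items())
    ]


training = [
    {"features": {"msg_type": "CreateSessionRequest"}, "label": "anomaly"},
    {"features": {"msg_type": "DeleteSessionRequest"}, "label": "normal"},
    {"features": {"msg_type": "ModifyBearerRequest"}, "label": "normal"},
]
flagged = {("msg_type", "CreateSessionRequest")}

modified = modify_training_set(training, flagged)
assert len(modified) == 2  # one of the three training messages removed
# model.fit(modified)      # retraining on the modified set, elided
```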

Additionally or alternatively, when a detected anomaly indicates a corresponding network performance issue 202 and that network performance issue 202 has subsequently been resolved, the anomaly detector 200 may be configured to remove any filters 250 associated with the resolved network performance issue 202. In configurations where the predictive model 232 is an RNN model, the anomaly detector 200 may apply the filter 250 selectively. In other words, the filter 250 may remove the portions of a sequence of features 222 that correspond to the particular control message(s) 128 of the detected anomaly, without removing the entire sequence of features 222. Advantageously, the filter 250 may remove these portions before the sequence is divided into smaller sequences. For example, when the filter 250 identifies too many CreateSessionRequest messages within a short time period, those individual messages can be removed in whole or in part.
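Selective pruning of a feature sequence for the RNN case can be sketched as below. The message-type filtering rule and windowing helper are hypothetical; the point is only that individual messages are removed before the sequence is divided into smaller subsequences.

```python
def prune_sequence(seq, flagged_types):
    """Sketch of selective filtering for an RNN-style predictive model 232:
    remove only the individual messages tied to the detected anomaly,
    keeping the rest of the feature sequence intact."""
    return [m for m in seq if m["msg_type"] not in flagged_types]


def split_windows(seq, size):
    # Division of the sequence into smaller subsequences happens afterwards.
    return [seq[i:i + size] for i in range(0, len(seq), size)]


seq = [
    {"msg_type": "CreateSessionRequest", "t": 0.0},
    {"msg_type": "CreateSessionRequest", "t": 0.1},
    {"msg_type": "ModifyBearerRequest",  "t": 0.2},
    {"msg_type": "CreateSessionRequest", "t": 0.3},
]

# A burst of CreateSessionRequest messages was flagged; prune them first:
pruned = prune_sequence(seq, {"CreateSessionRequest"})
assert [m["msg_type"] for m in pruned] == ["ModifyBearerRequest"]

# Only then is the sequence split into smaller training subsequences:
windows = split_windows(pruned, 2)
```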

Fig. 3 illustrates a flow chart of an exemplary method 300 for detecting network anomalies. At operation 302, the method 300 receives a control message 128 from the cellular network 100. At operation 304, the method 300 extracts one or more features 222 from the control message 128. At operation 306, the method 300 predicts a potential tag 234 for the control message 128 using a predictive model 232, the predictive model 232 configured to receive as a feature input the one or more features 222 extracted from the control message 128. The predictive model 232 is trained on a set 228 of training control messages 226, wherein each training control message 226 includes one or more corresponding features 222 and an actual label 224. At operation 308, the method 300 determines that the probability P of the potential tag 234 satisfies the confidence threshold 236. At operation 310, the method 300 analyzes the control message 128 to determine whether the control message 128 corresponds to a respective network performance issue 202 that affects network performance of the cellular network 100. At operation 312, when the control message 128 corresponds to a respective network performance issue 202 that affects network performance, the method 300 communicates the network performance issue 202 to the network entity 40 responsible for the network performance issue 202.
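Operations 302 through 312 can be sketched end to end. Every callable below (the feature extractor, the toy model, the analyzer, the reporter) is a hypothetical stand-in; the patent leaves their implementations open.

```python
def extract_features(msg):
    # Operation 304 (hypothetical extraction): message type and cause code.
    return {"msg_type": msg["type"], "cause": msg.get("cause")}


def detect_anomaly(control_message, model, threshold, analyzer, report):
    """Sketch of method 300: predict a tag for the message, check the
    confidence threshold, analyze the message, and report the issue when
    it affects network performance."""
    features = extract_features(control_message)      # operation 304
    tag, probability = model(features)                # operation 306
    if probability < threshold:                       # operation 308
        return None                                   # no confident anomaly
    issue = analyzer(control_message)                 # operation 310
    if issue is not None:
        report(issue)                                 # operation 312
    return issue


def toy_model(features):
    # Stand-in for predictive model 232: flags CreateSessionRequest floods.
    if features["msg_type"] == "CreateSessionRequest":
        return "anomaly", 0.9
    return "normal", 0.2


reported = []
issue = detect_anomaly(
    {"type": "CreateSessionRequest", "cause": 17},
    toy_model,
    threshold=0.8,
    analyzer=lambda m: "signaling storm",  # pretends it affects performance
    report=reported.append,
)
assert issue == "signaling storm" and reported == ["signaling storm"]
```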

In some examples, when the control message 128 fails to correspond to a respective network performance issue 202, the method 300 receives a subsequent control message 128 from the cellular network 100 and extracts one or more corresponding features 222 from the subsequent control message 128. In these examples, the method 300 also identifies that at least one of the one or more corresponding features 222 extracted from the subsequent control message 128 matches the one or more features 222 extracted from the control message 128. Here, prior to predicting the corresponding potential tag 234 of the subsequent control message 128 using the predictive model 232, the method 300 removes the identified at least one feature 222 from the feature input to the predictive model 232. In some implementations, when the control message 128 fails to correspond to a respective network performance issue 202, the method 300 identifies the one or more features 222 extracted from the control message 128. Here, in addition to identifying the one or more features 222, and prior to using the predictive model 232 to predict the corresponding potential tag 234 of a subsequent control message 128, the method 300 modifies the set 228 of training control messages 226 by removing each training control message 226 that includes one or more corresponding features 222 matching any of the identified one or more features 222 extracted from the control message 128, and retrains the predictive model 232 on the modified set 228 of training control messages 226.

FIG. 4 is a schematic diagram of an exemplary computing device 400 that may be used to implement the systems (e.g., anomaly detector 200) and methods (e.g., method 300) described in this document. Computing device 400 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit embodiments of the inventions described and/or claimed in this document.

Computing device 400 includes a processor 410 (i.e., data processing hardware), memory 420 (i.e., memory hardware), a storage device 430, a high-speed interface/controller 440 connected to the memory 420 and high-speed expansion ports 450, and a low-speed interface/controller 460 connected to a low-speed bus 470 and the storage device 430. Each of the components 410, 420, 430, 440, 450, and 460 is interconnected using various buses, and may be mounted on a common motherboard or in other manners as appropriate. The processor 410 can process instructions for execution within the computing device 400, including instructions stored in the memory 420 or on the storage device 430, to display graphical information for a Graphical User Interface (GUI) on an external input/output device, such as the display 480 coupled to the high-speed interface 440. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Moreover, multiple computing devices 400 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).

The memory 420 stores information non-transitorily within the computing device 400. The memory 420 may be a computer-readable medium, volatile memory unit(s), or non-volatile memory unit(s). The non-transitory memory 420 may be a physical device used to store programs (e.g., sequences of instructions) or data (e.g., program state information) on a temporary or permanent basis for use by the computing device 400. Examples of non-volatile memory include, but are not limited to, flash memory and read-only memory (ROM)/programmable read-only memory (PROM)/erasable programmable read-only memory (EPROM)/electronically erasable programmable read-only memory (EEPROM) (e.g., typically used for firmware, such as boot programs). Examples of volatile memory include, but are not limited to, Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), Phase Change Memory (PCM), and magnetic disks or tape.

The storage device 430 can provide mass storage for the computing device 400. In some implementations, the storage device 430 is a computer-readable medium. In various different implementations, the storage device 430 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state storage device, or an array of devices, including devices in a storage area network or other configurations. In additional implementations, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer- or machine-readable medium, such as the memory 420, the storage device 430, or memory on the processor 410.

High speed controller 440 manages bandwidth-intensive operations for computing device 400, while low speed controller 460 manages lower bandwidth-intensive operations. Such allocation of duties is exemplary only. In some implementations, the high-speed controller 440 is coupled to memory 420, display 480 (e.g., through a graphics processor or accelerator), and high-speed expansion ports 450, which may accept various expansion cards (not shown). In some embodiments, low-speed controller 460 is coupled to storage device 430 and low-speed expansion port 490. The low-speed expansion port 490, which may include various communication ports (e.g., USB, bluetooth, ethernet, wireless ethernet), may be coupled, for example, through a network adapter, to one or more input/output devices such as a keyboard, pointing device, scanner, or network device such as a switch or router.

The computing device 400 may be implemented in a number of different forms, as shown in the figure. For example, the computing device 400 may be implemented as a standard server 400a or multiple times in a group of such servers 400a, as a laptop computer 400b, or as part of a rack server system 400 c.

Various implementations of the systems and techniques described here can be realized in digital electronic and/or optical circuits, integrated circuits, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments can include implementations in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

These computer programs (also known as programs, software, software applications, or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, non-transitory computer-readable medium, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal.

The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit). Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. However, a computer need not have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, such as internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

To provide for interaction with a user, one or more aspects of the present disclosure can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube), LCD (liquid crystal display) monitor or touch screen, for displaying information to the user and an optional keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other types of devices can also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, the computer is able to interact with the user by sending and receiving documents to and from the device used by the user; for example, by sending a Web page to a Web browser on the user's client device in response to a request received from the Web browser.

A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. Accordingly, other implementations are within the scope of the following claims.
