Mobile signaling trajectory denoising method, medium and computing device based on multistage filtering

Document No.: 1925769 | Published: 2021-12-03

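Summary: This technology, "Mobile signaling trajectory denoising method, medium and computing device based on multistage filtering", was designed by 蒋志鹏 (Jiang Zhipeng), 戴帅夫 (Dai Shuaifu) and 刘丙双 (Liu Bingshuang) on 2021-08-18. The invention discloses a mobile signaling trajectory denoising method, medium and computing device based on multistage filtering. User signaling trajectory data are first extracted and sorted by time, and the Geohash grid cell of each trajectory point and the Manhattan distance between pairs of trajectory points are calculated. Then, taking Geohash grid cells and individual trajectory points in turn as the unit, the plausibility of the moving speed and the trajectory angle is judged, so that trajectory noise is filtered from coarse to fine. Continuous trajectory noise caused by inaccurate base station location information usually cannot be removed by denoising methods based on clustering or on a single filter. The invention removes not only the noise data generated by the mobile communication mechanism but also continuous trajectory noise data, and its implementation is simpler and more effective.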

1. A mobile signaling trajectory denoising method based on multistage filtering, characterized by comprising the following steps:

step one: grouping mobile phone signaling data received in real time by unique user identifier, storing only the signaling data generated when a user switches between base stations to form an offline trajectory library, and extracting the trajectory data for a certain period from the offline trajectory library;

step two: grouping the trajectory data by unique user identifier, sorting within each group by signaling time, sequentially calculating the moving speed between adjacent trajectory points in the sorted trajectory, and keeping only the trajectory points whose moving speed is less than the speed threshold;

step three: sequentially encoding the longitude and latitude coordinates of the trajectory points into Geohash strings of a specific number of characters, and assigning the trajectory points to Geohash groups accordingly;

step four: when the i-th Geohash string differs from the (i-1)-th Geohash string, and the (i-1)-th Geohash string also differs from the (i-2)-th Geohash string, decoding the three Geohash strings into the longitude and latitude coordinates of their respective Geohash center points, sequentially calculating the Manhattan distance between the i-th and (i-1)-th Geohash center points, the Manhattan distance between the (i-1)-th and (i-2)-th Geohash center points, and the trajectory angle formed by the i-th, (i-1)-th and (i-2)-th Geohash center points, and performing spatial filtering based on the Geohash grid;

step five: repeating steps three to four until all trajectory points of the user have been traversed, finally generating a new user trajectory;

step six: traversing the trajectory points of the new user trajectory, calculating the coordinates of the trajectory center point and the Manhattan distance between each trajectory point and the trajectory center point, encoding the coordinates of the trajectory point with the minimum Manhattan distance into a Geohash string of a specific number of characters, and recording it as the central Geohash grid cell;

step seven: traversing the trajectory points of the new user trajectory from step five, and performing secondary spatial filtering based on the trajectory point coordinates when the i-th trajectory point satisfies the rationality judgment condition;

step eight: repeating steps three to seven until denoising has been completed for all user trajectories.

2. The mobile signaling trajectory denoising method based on multistage filtering according to claim 1, wherein the trajectory data comprise at least a unique user identifier, a trajectory point longitude, a trajectory point latitude and a signaling time, and the trajectory point longitude and latitude are obtained from the unique base station identifier.

3. The mobile signaling trajectory denoising method based on multistage filtering according to claim 1, wherein performing spatial filtering based on the Geohash grid specifically comprises:

when the trajectory angle formed by the i-th, (i-1)-th and (i-2)-th Geohash center points is smaller than the angle threshold, and the Manhattan distance between the i-th and (i-1)-th Geohash center points and the Manhattan distance between the (i-1)-th and (i-2)-th Geohash center points are both greater than the distance threshold, deleting the i-th Geohash string and all trajectory points contained in the (i-1)-th Geohash group.

4. The method according to claim 1, wherein the trajectory center point coordinates comprise a trajectory center point longitude and a trajectory center point latitude, the trajectory center point longitude being the arithmetic mean of the trajectory point longitudes in the new user trajectory, and the trajectory center point latitude being the arithmetic mean of the trajectory point latitudes in the new user trajectory.

5. The method according to claim 1, wherein satisfying the rationality judgment condition means satisfying one of the following conditions: (1) the Geohash grid cell in which the i-th trajectory point is located is the central Geohash grid cell; (2) the speed at which the user moves from the (i-1)-th trajectory point to the i-th trajectory point is less than the speed threshold; (3) the distance from the (i-1)-th trajectory point to the trajectory center point is greater than the distance from the i-th trajectory point to the trajectory center point; (4) when i is 1, the speed at which the user moves from the i-th trajectory point to the trajectory center point is less than the speed threshold.

6. The mobile signaling trajectory denoising method based on multistage filtering according to claim 3, wherein performing secondary spatial filtering based on the trajectory point coordinates specifically comprises:

sequentially calculating the Manhattan distance and the time interval between the i-th and (i-1)-th trajectory points and between the (i-1)-th and (i-2)-th trajectory points, and calculating the trajectory angle formed by the i-th, (i-1)-th and (i-2)-th trajectory points; and when the trajectory angle is smaller than the angle threshold, the Manhattan distance between the i-th and (i-1)-th trajectory points and the Manhattan distance between the (i-1)-th and (i-2)-th trajectory points are both greater than the distance threshold, and the time interval between the i-th and (i-1)-th trajectory points and the time interval between the (i-1)-th and (i-2)-th trajectory points are both greater than the time interval threshold, deleting the (i-1)-th trajectory point.

7. The mobile signaling trajectory denoising method based on multistage filtering according to claim 6, wherein the angle threshold is 0° to 45°, the distance threshold is 0 km to 10 km, the speed threshold is 40 km/h to 605 km/h, and the time interval threshold is 0 s to 60 s.

8. The mobile signaling trajectory denoising method based on multistage filtering according to claim 1, wherein the specific number of characters of the Geohash string is 5 to 7, and the certain period is one day or one month.

9. A computer-readable storage medium, characterized in that the storage medium stores a computer program comprising instructions for carrying out the steps of the method according to any one of claims 1 to 8.

10. A computing device comprising a processor and a memory for storing a computer program executable by the processor, wherein the processor implements the method of any one of claims 1 to 8 when executing the computer program stored in the memory.

Technical Field

The invention relates to the technical field of signaling data analysis, and in particular to a mobile signaling trajectory denoising method, medium and computing device based on multistage filtering.

Background

Location information is an important component of mobile communication user signaling, and its accuracy directly determines the output of the applications built on it. However, because of complex terrain, external signal interference, inaccurate base station information and similar causes, the location information contains a large amount of noise, which leads to serious false negatives and false positives when trajectory similarity is calculated.

Commonly used denoising methods include speed filtering, median filtering, Kalman filtering and cluster analysis. On the one hand, each of these methods has its own limitations. On the other hand, they mainly target positioning errors generated by the mobile communication mechanism, such as ping-pong data generated when a user is located at the boundary between base stations and memory data generated while a user moves at high speed. In practice, positioning errors caused by inaccurate base station location information also account for a large proportion, and their characteristics differ from those of the errors generated by the mobile communication mechanism, so the existing denoising methods have difficulty handling them effectively.

Specifically, the base station information table contains many human entry errors, so that some base stations in the table are recorded far from their actual positions, and one or more erroneous outlier sub-trajectories appear in a user's signaling trajectory. If the positioning errors are disregarded, the transitions between base stations inside such a sub-trajectory show no anomaly, and when the time span is large enough, the transition from the normal trajectory to the outlier sub-trajectory shows no anomaly either. This noise therefore cannot be detected and removed with a speed threshold or a clustering method alone.

Therefore, how to remove continuous trajectory noise is a problem that those skilled in the art need to solve.

Disclosure of Invention

In view of this, the invention provides a mobile signaling trajectory denoising method based on multistage filtering that removes not only the noise data generated by the mobile communication mechanism but also continuous trajectory noise data, with a simpler and more effective implementation.

To achieve the above purpose, the invention adopts the following technical solution:

A mobile signaling trajectory denoising method based on multistage filtering comprises the following steps:

step one: grouping mobile phone signaling data received in real time by unique user identifier, storing only the signaling data generated when a user switches between base stations to form an offline trajectory library, and extracting the trajectory data for a certain period from the offline trajectory library;

step two: grouping the trajectory data by unique user identifier, sorting within each group by signaling time, sequentially calculating the moving speed between adjacent trajectory points in the sorted trajectory, and keeping only the trajectory points whose moving speed is less than the speed threshold;

step three: sequentially encoding the longitude and latitude coordinates of the trajectory points into Geohash strings of a specific number of characters, and assigning the trajectory points to Geohash groups accordingly;

step four: when the i-th Geohash string differs from the (i-1)-th Geohash string, and the (i-1)-th Geohash string also differs from the (i-2)-th Geohash string, decoding the three Geohash strings into the longitude and latitude coordinates of their respective Geohash center points, sequentially calculating the Manhattan distance between the i-th and (i-1)-th Geohash center points and the Manhattan distance between the (i-1)-th and (i-2)-th Geohash center points, and performing spatial filtering based on the Geohash grid according to these Manhattan distances and the trajectory angle formed by the i-th, (i-1)-th and (i-2)-th Geohash center points;

step five: repeating steps three to four until all trajectory points of the user have been traversed, finally generating a new user trajectory;

step six: traversing the trajectory points of the new user trajectory, calculating the coordinates of the trajectory center point and the Manhattan distance between each trajectory point and the trajectory center point, encoding the coordinates of the trajectory point with the minimum Manhattan distance into a Geohash string of a specific number of characters, and recording it as the central Geohash grid cell;

step seven: traversing the trajectory points of the new user trajectory from step five, and performing secondary spatial filtering based on the trajectory point coordinates when the i-th trajectory point satisfies the rationality judgment condition;

step eight: repeating steps three to seven until denoising has been completed for all user trajectories.

Preferably, the trajectory data comprise at least a unique user identifier, a trajectory point longitude, a trajectory point latitude and a signaling time, and the trajectory point longitude and latitude are obtained from the unique base station identifier.

Preferably, performing spatial filtering based on the Geohash grid specifically comprises:

setting an angle threshold and a distance threshold, and deleting the i-th Geohash string and all trajectory points contained in the (i-1)-th Geohash group when the trajectory angle formed by the i-th, (i-1)-th and (i-2)-th Geohash center points is smaller than the angle threshold, and the Manhattan distance between the i-th and (i-1)-th Geohash center points and the Manhattan distance between the (i-1)-th and (i-2)-th Geohash center points are both greater than the distance threshold.

Preferably, the trajectory center point coordinates comprise a trajectory center point longitude and a trajectory center point latitude, the trajectory center point longitude being the arithmetic mean of the trajectory point longitudes in the new user trajectory, and the trajectory center point latitude being the arithmetic mean of the trajectory point latitudes in the new user trajectory.

Preferably, satisfying the rationality judgment condition means satisfying one of the following conditions: (1) the Geohash grid cell in which the i-th trajectory point is located is the central Geohash grid cell; (2) the speed at which the user moves from the (i-1)-th trajectory point to the i-th trajectory point is less than the speed threshold; (3) the distance from the (i-1)-th trajectory point to the trajectory center point is greater than the distance from the i-th trajectory point to the trajectory center point; (4) when i is 1, the speed at which the user moves from the i-th trajectory point to the trajectory center point is less than the speed threshold.

Preferably, performing secondary spatial filtering based on the trajectory point coordinates specifically comprises:

sequentially calculating the Manhattan distance and the time interval between the i-th and (i-1)-th trajectory points and between the (i-1)-th and (i-2)-th trajectory points, and calculating the trajectory angle formed by the i-th, (i-1)-th and (i-2)-th trajectory points; and setting an angle threshold, a distance threshold and a time interval threshold, and deleting the (i-1)-th trajectory point when the trajectory angle is smaller than the angle threshold, the Manhattan distance between the i-th and (i-1)-th trajectory points and the Manhattan distance between the (i-1)-th and (i-2)-th trajectory points are both greater than the distance threshold, and the time interval between the i-th and (i-1)-th trajectory points and the time interval between the (i-1)-th and (i-2)-th trajectory points are both greater than the time interval threshold.

Preferably, the angle threshold is 0° to 45°, the distance threshold is 0 km to 10 km, the speed threshold is 40 km/h to 605 km/h, and the time interval threshold is 0 s to 60 s.

Preferably, the specific number of characters of the Geohash string is 5 to 7, and the certain period is one day or one month.
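The preferred parameter ranges above can be captured in a small validated configuration object. The following is only an illustrative sketch: the class name `DenoiseThresholds` and its field names are hypothetical and do not come from the patent text, and the Geohash length is interpreted as a number of characters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DenoiseThresholds:
    """Hypothetical container for the thresholds named in the text."""
    angle_deg: float      # trajectory angle threshold, degrees
    distance_km: float    # Manhattan distance threshold, km
    speed_kmh: float      # moving speed threshold, km/h
    interval_s: float     # time interval threshold, seconds
    geohash_length: int   # number of characters in a Geohash string

    def __post_init__(self) -> None:
        # Each check mirrors one of the preferred ranges stated above.
        checks = {
            "angle_deg": 0 <= self.angle_deg <= 45,
            "distance_km": 0 <= self.distance_km <= 10,
            "speed_kmh": 40 <= self.speed_kmh <= 605,
            "interval_s": 0 <= self.interval_s <= 60,
            "geohash_length": 5 <= self.geohash_length <= 7,
        }
        bad = [name for name, ok in checks.items() if not ok]
        if bad:
            raise ValueError("out of preferred range: " + ", ".join(bad))


# Illustrative values only; interval_s=30 is an assumption for this example.
cfg = DenoiseThresholds(angle_deg=34, distance_km=5, speed_kmh=200,
                        interval_s=30, geohash_length=7)
```

Validating once at construction keeps the filtering stages free of range checks.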

A computer-readable storage medium, storing a computer program comprising instructions for carrying out the steps of the above-described method.

A computing device comprising a processor and a memory for storing a processor executable computer program, the processor implementing the above method when executing the computer program stored by the memory.

As can be seen from the above technical solution, compared with the prior art, the invention discloses a mobile signaling trajectory denoising method based on multistage filtering that filters from coarse granularity to fine granularity. On the one hand, by combining multi-dimensional thresholds such as moving speed, moving distance, trajectory angle and time interval, it can delete conventional positioning noise such as ping-pong data and memory data. On the other hand, by exploiting the spatial distribution characteristics of continuous trajectory noise, it compresses an outlier sub-trajectory into a Geohash grid cell of a certain extent and then applies multi-dimensional threshold filtering to delete the continuous trajectory noise.

Drawings

To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a mobile signaling trajectory denoising method based on multistage filtering according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to remove positioning errors and continuous track noise in signaling track data, the invention provides a mobile signaling track denoising method based on multistage filtering, which is further described in detail with reference to the embodiments. It should be noted that the specific embodiments described herein are merely illustrative of the invention and should not be considered as limiting.

The following describes, in combination with an example of signaling data in a certain urban area, an implementation process of the mobile signaling trajectory denoising method based on multistage filtering provided by the present invention:

(1) Group the user signaling data received in real time by unique user identifier, store only the signaling data generated when a user switches between base stations to form an offline trajectory library, extract one day of trajectory data from the offline trajectory library, and obtain the longitude and latitude coordinates of each base station location from the unique base station identifier. The user trajectory data are shown in Table 1.

Table 1. Example user trajectory data

| Unique user identifier | Longitude | Latitude | Signaling time |
| --- | --- | --- | --- |
| user1 | u1_lon1 | u1_lat1 | u1_time1 |
| user2 | u2_lon1 | u2_lat1 | u2_time1 |
| user1 | u1_lon2 | u1_lat2 | u1_time2 |
| user1 | u1_lon3 | u1_lat3 | u1_time3 |
| user2 | u2_lon2 | u2_lat2 | u2_time2 |
| user1 | u1_lon4 | u1_lat4 | u1_time4 |
| user3 | u3_lon1 | u3_lat1 | u3_time1 |
| user1 | u1_lon5 | u1_lat5 | u1_time5 |

(2) Group the trajectory data from step (1) by unique user identifier, sort each group in ascending order of signaling time, sequentially calculate the moving speed between successive trajectory points in the grouped and sorted trajectory, and keep only the trajectory points whose moving speed is less than the speed threshold of 200 km/h. The trajectory data of user1 after speed filtering are shown in Table 2. Only the denoising process for user1 is described here; the other users are processed in the same way and are not described again.

Table 2. Example user1 trajectory data

| Unique user identifier | Longitude | Latitude | Signaling time |
| --- | --- | --- | --- |
| user1 | u1_lon1 | u1_lat1 | u1_time1 |
| user1 | u1_lon2 | u1_lat2 | u1_time2 |
| user1 | u1_lon3 | u1_lat3 | u1_time3 |
| user1 | u1_lon4 | u1_lat4 | u1_time4 |
| user1 | u1_lon5 | u1_lat5 | u1_time5 |
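The speed filtering of step (2) might be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Point`, `manhattan_km` and `speed_filter` are hypothetical, the Manhattan distance is approximated with a local flat-earth projection, and speed is measured against the last retained point. The text specifies none of these details.

```python
import math
from typing import List, NamedTuple

EARTH_RADIUS_KM = 6371.0


class Point(NamedTuple):
    lon: float  # degrees
    lat: float  # degrees
    t: float    # signaling time in seconds


def manhattan_km(a: Point, b: Point) -> float:
    """Approximate Manhattan distance: east-west plus north-south, in km.

    Assumption: equirectangular projection around the mean latitude.
    """
    mean_lat = math.radians((a.lat + b.lat) / 2)
    dx = math.radians(abs(a.lon - b.lon)) * math.cos(mean_lat) * EARTH_RADIUS_KM
    dy = math.radians(abs(a.lat - b.lat)) * EARTH_RADIUS_KM
    return dx + dy


def speed_filter(points: List[Point], max_kmh: float = 200.0) -> List[Point]:
    """Sort by signaling time and drop points reached at max_kmh or faster."""
    ordered = sorted(points, key=lambda p: p.t)
    if not ordered:
        return []
    kept = [ordered[0]]  # assumption: the first point is always retained
    for p in ordered[1:]:
        hours = (p.t - kept[-1].t) / 3600.0
        if hours <= 0:
            continue  # identical timestamps: speed is undefined, skip the point
        if manhattan_km(kept[-1], p) / hours < max_kmh:
            kept.append(p)
    return kept
```

For example, a point roughly a thousand kilometres away that appears ten minutes after its predecessor would be dropped, while the slow points around it are kept.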

(3) Sequentially encode the longitude and latitude coordinates of all trajectory points in the user1 trajectory into 7-character Geohash strings and assign the points to Geohash groups accordingly. The data after Geohash grouping are shown in Table 3.

Table 3. Example Geohash group data
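The encoding and grouping of step (3) can be illustrated with the standard Geohash algorithm, which interleaves longitude and latitude bits and emits one base-32 character per five bits. The sketch below is an assumption-laden illustration: the function names are hypothetical, and grouping only consecutive points that share a Geohash string is one reading of the text, not something it states explicitly.

```python
from itertools import groupby
from typing import List, Tuple

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
LonLat = Tuple[float, float]


def geohash_encode(lat: float, lon: float, length: int = 7) -> str:
    """Encode a coordinate as a Geohash string with `length` characters."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    chars: List[str] = []
    bits, bit_count = 0, 0
    even = True  # bits alternate longitude, latitude, starting with longitude
    while len(chars) < length:
        if even:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | int(bit)
        even = not even
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def group_by_geohash(points: List[LonLat], length: int = 7):
    """Group consecutive time-ordered (lon, lat) points sharing a Geohash."""
    keyed = [(geohash_encode(lat, lon, length), (lon, lat)) for lon, lat in points]
    return [(gh, [p for _, p in run])
            for gh, run in groupby(keyed, key=lambda kp: kp[0])]
```

Each returned entry pairs one Geohash string with the run of trajectory points that fall into it, which is the unit the grid-level filter operates on.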

(4) When Geohash1 differs from Geohash2 and Geohash2 differs from Geohash3, decode the three Geohash strings into the longitude and latitude coordinates of their center points, denoted center1, center2 and center3, and calculate the trajectory angle formed by the three center points together with the Manhattan distances between center1 and center2 and between center2 and center3. When the trajectory angle is smaller than the angle threshold of 34° and both Manhattan distances are greater than 5 km, delete the trajectory points (u1_lon3, u1_lat3) and (u1_lon4, u1_lat4), finally generating a new user1 trajectory, as shown in Table 4.

Table 4. Example trajectory data after Geohash filtering

| Unique user identifier | Longitude | Latitude | Signaling time | Geohash |
| --- | --- | --- | --- | --- |
| user1 | u1_lon1 | u1_lat1 | u1_time1 | Geohash1 |
| user1 | u1_lon2 | u1_lat2 | u1_time2 | Geohash1 |
| user1 | u1_lon5 | u1_lat5 | u1_time5 | Geohash3 |
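The grid-level filtering of step (4) could be sketched as below. Several points are assumptions rather than statements from the text: the trajectory angle is taken as the angle at the middle center point between the legs to its two neighbours (so a small angle means a sharp out-and-back spike), distances use a flat-earth approximation, and the middle group is the one removed, as in the worked example where the points of the second cell are deleted. All function names are hypothetical.

```python
import math
from typing import List, Tuple

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
EARTH_RADIUS_KM = 6371.0
LatLon = Tuple[float, float]


def geohash_center(gh: str) -> LatLon:
    """Decode a Geohash string to the (lat, lon) of its cell center."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    even = True  # bits alternate longitude, latitude, starting with longitude
    for ch in gh:
        idx = _BASE32.index(ch)
        for shift in range(4, -1, -1):
            bit = (idx >> shift) & 1
            if even:
                mid = (lon_lo + lon_hi) / 2
                lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
            else:
                mid = (lat_lo + lat_hi) / 2
                lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
            even = not even
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


def _offset_km(origin: LatLon, p: LatLon) -> Tuple[float, float]:
    """East and north offset of p from origin in km (flat-earth assumption)."""
    mean_lat = math.radians((origin[0] + p[0]) / 2)
    dx = math.radians(p[1] - origin[1]) * math.cos(mean_lat) * EARTH_RADIUS_KM
    dy = math.radians(p[0] - origin[0]) * EARTH_RADIUS_KM
    return dx, dy


def manhattan_km(a: LatLon, b: LatLon) -> float:
    dx, dy = _offset_km(a, b)
    return abs(dx) + abs(dy)


def turn_angle_deg(prev: LatLon, mid: LatLon, nxt: LatLon) -> float:
    """Angle at `mid` between the legs mid->prev and mid->nxt, in degrees."""
    ax, ay = _offset_km(mid, prev)
    bx, by = _offset_km(mid, nxt)
    na, nb = math.hypot(ax, ay), math.hypot(bx, by)
    if na == 0 or nb == 0:
        return 180.0  # degenerate leg: treat as no turn
    cos_t = max(-1.0, min(1.0, (ax * bx + ay * by) / (na * nb)))
    return math.degrees(math.acos(cos_t))


def grid_filter(groups: List[tuple], max_angle_deg: float = 34.0,
                min_dist_km: float = 5.0) -> List[tuple]:
    """Drop the middle (geohash, points) group of each sharp, long spike."""
    kept: List[tuple] = []
    for group in groups:
        kept.append(group)
        if len(kept) < 3:
            continue
        g0, g1, g2 = (g[0] for g in kept[-3:])
        if g0 == g1 or g1 == g2:
            continue  # the rule applies only when adjacent cells differ
        c0, c1, c2 = geohash_center(g0), geohash_center(g1), geohash_center(g2)
        if (turn_angle_deg(c0, c1, c2) < max_angle_deg
                and manhattan_km(c0, c1) > min_dist_km
                and manhattan_km(c1, c2) > min_dist_km):
            del kept[-2]  # remove the outlier cell together with its points
    return kept
```

Working on cell centers rather than raw points is what lets a whole outlier sub-trajectory, whose internal hops look normal, be removed in a single decision.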

(5) Traverse the trajectory points in Table 4, take the arithmetic means of the longitudes and latitudes of the user1 trajectory points as the coordinates of the trajectory center point, calculate the Manhattan distance between each trajectory point and the trajectory center point, and encode the coordinates of the trajectory point with the minimum Manhattan distance into a 6-character Geohash string, which serves as the central Geohash grid cell.
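The center-point computation of step (5) reduces to an arithmetic mean followed by a nearest-point search. A minimal sketch, assuming (lon, lat) tuples and a flat-earth Manhattan distance (the function names are hypothetical):

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0
LonLat = Tuple[float, float]


def manhattan_km(a: LonLat, b: LonLat) -> float:
    """Approximate Manhattan distance in km (equirectangular assumption)."""
    mean_lat = math.radians((a[1] + b[1]) / 2)
    dx = math.radians(abs(a[0] - b[0])) * math.cos(mean_lat) * EARTH_RADIUS_KM
    dy = math.radians(abs(a[1] - b[1])) * EARTH_RADIUS_KM
    return dx + dy


def trajectory_center(points: List[LonLat]) -> LonLat:
    """Arithmetic mean of the longitudes and of the latitudes."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def nearest_to_center(points: List[LonLat]) -> LonLat:
    """Trajectory point with the minimum Manhattan distance to the center."""
    center = trajectory_center(points)
    return min(points, key=lambda p: manhattan_km(p, center))
```

The coordinates returned by `nearest_to_center` would then be Geohash-encoded to obtain the central cell. Anchoring the cell on an observed point rather than on the mean itself keeps it on a location the user actually visited.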

(6) Traverse the trajectory points in Table 4. Although the Geohash grid cell in which a trajectory point is located is not the central Geohash grid cell, the speed at which user1 moves to that point from the preceding trajectory point is less than 200 km/h, so the point satisfies the rationality judgment condition. The trajectory angle, the Manhattan distances and the time intervals are then calculated for the two segments formed by three successive trajectory points. When the trajectory angle is smaller than the angle threshold, both Manhattan distances are greater than the distance threshold, and both time intervals are greater than the time interval threshold, the middle trajectory point is deleted.
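The point-level filtering of step (6) might look like the following sketch. It rests on assumptions the text does not settle: the trajectory angle is measured at the middle point, the central-cell test is supplied by the caller as `in_central_cell`, and condition (4) of the rationality judgment (the case i = 1) is left out because the deletion rule needs two predecessors. All names are hypothetical.

```python
import math
from typing import Callable, List, NamedTuple, Tuple

EARTH_RADIUS_KM = 6371.0


class Point(NamedTuple):
    lon: float  # degrees
    lat: float  # degrees
    t: float    # signaling time in seconds


def _offset_km(origin, p) -> Tuple[float, float]:
    """East and north offset in km; both arguments start with (lon, lat)."""
    mean_lat = math.radians((origin[1] + p[1]) / 2)
    dx = math.radians(p[0] - origin[0]) * math.cos(mean_lat) * EARTH_RADIUS_KM
    dy = math.radians(p[1] - origin[1]) * EARTH_RADIUS_KM
    return dx, dy


def manhattan_km(a, b) -> float:
    dx, dy = _offset_km(a, b)
    return abs(dx) + abs(dy)


def turn_angle_deg(prev, mid, nxt) -> float:
    """Angle at `mid` between the legs mid->prev and mid->nxt, in degrees."""
    ax, ay = _offset_km(mid, prev)
    bx, by = _offset_km(mid, nxt)
    na, nb = math.hypot(ax, ay), math.hypot(bx, by)
    if na == 0 or nb == 0:
        return 180.0
    cos_t = max(-1.0, min(1.0, (ax * bx + ay * by) / (na * nb)))
    return math.degrees(math.acos(cos_t))


def is_plausible(prev: Point, cur: Point, center, in_central_cell,
                 max_kmh: float) -> bool:
    """Conditions (1) to (3) of the rationality judgment; any one suffices."""
    hours = (cur.t - prev.t) / 3600.0
    slow = hours > 0 and manhattan_km(prev, cur) / hours < max_kmh
    closer = manhattan_km(prev, center) > manhattan_km(cur, center)
    return in_central_cell(cur) or slow or closer


def point_filter(points: List[Point], center: Tuple[float, float],
                 in_central_cell: Callable[[Point], bool],
                 max_angle_deg: float, min_dist_km: float,
                 max_kmh: float, min_gap_s: float) -> List[Point]:
    """Delete the middle point of each sharp, long, slow out-and-back spike."""
    kept: List[Point] = []
    for p in points:
        kept.append(p)
        if len(kept) < 3:
            continue
        a, b, c = kept[-3:]
        if not is_plausible(b, c, center, in_central_cell, max_kmh):
            continue
        if (turn_angle_deg(a, b, c) < max_angle_deg
                and manhattan_km(a, b) > min_dist_km
                and manhattan_km(b, c) > min_dist_km
                and b.t - a.t > min_gap_s and c.t - b.t > min_gap_s):
            del kept[-2]
    return kept
```

Unlike the speed filter, this stage targets spikes that are slow enough to pass a speed threshold, which is why the time interval enters the rule as a lower bound.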

The embodiments in this description are described progressively: each embodiment focuses on its differences from the others, and the parts that the embodiments share can be cross-referenced. The device disclosed in the embodiments corresponds to the method disclosed in the embodiments, so it is described only briefly; for the relevant details, refer to the description of the method.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
