Equipment positioning method and system in visible light communication

Document No.: 1693829  Publication date: 2019-12-10

Reading note: This technology, "Equipment positioning method and system in visible light communication" (一种可见光通信中设备定位方法及系统), was created by 王昭诚, 刘沛玺, and 赵培尧 on 2019-08-09. Its main content is as follows: An embodiment of the invention provides a method and a system for positioning a device in visible light communication, wherein the method comprises: acquiring a signal image generated by a camera in visible light communication, and preprocessing the signal image to obtain a characteristic image; and inputting the characteristic image into a trained neural network to obtain the three-dimensional position information of the camera, where the three-dimensional position information of the camera includes, but is not limited to, a combination of one or more of camera position information and camera tilt angle. The method and system provided by the embodiment of the invention use a trained neural network to position the camera in the visible light communication system; compared with traditional geometric methods, higher positioning accuracy can be achieved at the same resolution, with less influence from the camera height.

1. A method for positioning a device in visible light communication, characterized by comprising the following steps:

Acquiring a signal image generated by a camera in visible light communication, and preprocessing the signal image to obtain a characteristic image;

Inputting the characteristic image into a trained neural network to obtain three-dimensional position information of the camera;

Wherein the three-dimensional position information of the camera includes, but is not limited to, a combination of one or more of camera position information and camera tilt angle.

2. The method of claim 1, further comprising:

Acquiring characteristic images generated by a plurality of cameras in visible light communication and position information corresponding to the cameras, and constructing a training sample set;

Training a preset neural network through the training sample set to obtain the trained neural network.

3. The method according to claim 2, wherein the step of obtaining feature images generated by a plurality of cameras in visible light communication and position information corresponding to the cameras to construct a training sample set further comprises:

Acquiring the position information and power information of the LEDs, and constructing a training sample set using a camera simulation system.

4. The method according to claim 1, wherein the step of acquiring a signal image generated by a camera in visible light communication, preprocessing the signal image, and obtaining a feature image specifically comprises:

Acquiring a signal image by receiving signals through the photodiodes on the camera light-sensing plate;

Separating the ambient light and the LED light in the signal image, labeling the LED light in the signal image, and establishing a mapping rule between the LED light and its labels;

Rendering the ambient light and the LED light in the signal image to obtain the characteristic image.

5. The method according to claim 2, wherein the step of training a preset neural network through the training sample set to obtain the trained neural network specifically comprises:

Constructing a neural network according to the characteristics of the training sample set, and determining a loss function of the neural network according to the position prediction and tilt-angle prediction of the camera;

Training the neural network through a back-propagation algorithm until the loss function of the neural network meets a preset condition.

6. The method according to claim 3, wherein the step of acquiring the position information and power information of the LEDs and constructing the training sample set using the camera simulation system specifically comprises:

Modeling according to the position information of the LED, the power information of the LED and the parameter information of the camera;

Arranging the camera at sampling points within a preset area, and obtaining signal images of the LEDs according to the pinhole imaging principle; and recording each signal image with its corresponding sampling point to construct the training sample set.

7. A method for positioning a device in visible light communication, characterized by comprising the following steps:

Acquiring a signal image generated by a camera in visible light communication, and preprocessing the signal image to obtain a characteristic image;

Inputting the characteristic image into a trained neural network, and obtaining the position of the camera relative to the LEDs by combining the position information of the LEDs.

8. A system for positioning a device in visible light communication, comprising:

A preprocessing module, configured to acquire a signal image generated by a camera in visible light communication and preprocess the signal image to obtain a characteristic image;

A positioning module, configured to input the characteristic image into a trained neural network to obtain the three-dimensional position information of the camera;

Wherein the three-dimensional position information of the camera includes, but is not limited to, a combination of one or more of camera position information and camera tilt angle.

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method for positioning a device in visible light communication according to any one of claims 1 to 6 when executing the program.

10. A non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method for positioning a device in visible light communication according to any one of claims 1 to 6.

Technical Field

The present invention relates to the field of computer technology, and in particular to a method and a system for positioning a device in visible light communication.

Background

With the explosive growth of mobile communication users, traditional radio-frequency communication has gradually become unable to meet communication requirements, and Visible Light Communication (VLC) has attracted the attention of researchers due to its high speed, low power consumption, high security, and strong applicability. Visible light communication has great potential in short-range communication scenarios such as indoor communication, vehicular communication, and the Internet of Things. In these scenarios, the demand for Location-Based Services (LBS), such as smart driving, indoor navigation, and behavior detection, is growing. Because signals in visible light communication are mainly transmitted along the line-of-sight (LOS) path, and a visible-light receiver (a photodiode, a camera, etc.) can obtain more accurate signal-incidence-angle information than traditional radio-frequency communication, acquiring the terminal position in visible light communication is a research hotspot.

Traditional visible light communication adopts positioning methods similar to those of radio-frequency communication: the distance between an LED (light-emitting diode) source and a receiver is estimated from the Received Signal Strength (RSS) using a free-space visible-light channel model, and the receiver position is then obtained by trilateration; alternatively, since optical communication mainly uses LOS-path transmission, the distance or distance difference between the LED source and the receiver is obtained using Time of Arrival (TOA) or Time Difference of Arrival (TDOA), and the receiver position is again obtained by trilateration. However, these conventional positioning methods do not exploit the accuracy of signal angle-of-arrival estimation in visible light communication, so methods that determine the receiver position from geometric constraints on the signal's angle of arrival have been proposed.

Because most existing mobile intelligent terminals (automobiles, mobile phones, notebook computers, tablet computers, etc.) are equipped with cameras, and a camera's estimate of the incidence angle of an optical signal is very accurate, the incidence angle can easily be calculated from the imaging coordinates, on the camera's light-sensing plate, of an LED source at a known position. An LED source with a known position, together with the coordinates of its image point, imposes one constraint on the camera's tilt angle; three LEDs yield three angle constraints, from which the camera position can be solved numerically. However, most existing cameras are digital, and it is difficult to acquire the LED image-point coordinates accurately; therefore, when the camera resolution is low or the camera is too far from the LED source, it is difficult to obtain an accurate position, or a position estimate cannot be obtained at all. The positioning accuracy is also affected by parameters such as the camera's focal length and the diopter of its lens.

Disclosure of Invention

To solve the problems in the prior art, embodiments of the present invention provide a method and a system for positioning a device in visible light communication.

In a first aspect, an embodiment of the present invention provides a method for positioning a device in visible light communication, including:

Acquiring a signal image generated by a camera in visible light communication, and preprocessing the signal image to obtain a characteristic image;

Inputting the characteristic image into a trained neural network to obtain three-dimensional position information of the camera;

Wherein the three-dimensional position information of the camera includes, but is not limited to, a combination of one or more of camera position information and camera tilt angle.

Wherein the method further comprises: acquiring characteristic images generated by a plurality of cameras in visible light communication and the position information corresponding to the cameras, and constructing a training sample set; and training a preset neural network through the training sample set to obtain the trained neural network.

The step of obtaining feature images generated by a plurality of cameras in visible light communication and the position information corresponding to the cameras and constructing a training sample set further comprises: acquiring the position information and power information of the LEDs, and constructing a training sample set using a camera simulation system.

The step of acquiring a signal image generated by a camera in visible light communication, preprocessing the signal image, and acquiring a characteristic image specifically comprises: acquiring a signal image by receiving signals through the photodiodes on the camera light-sensing plate; separating the ambient light and the LED light in the signal image, labeling the LED light, and establishing a mapping rule between the LED light and its labels; and rendering the ambient light and the LED light in the signal image to obtain the characteristic image.

The step of training a preset neural network through the training sample set to obtain the trained neural network specifically includes: constructing a neural network according to the characteristics of the training sample set, and determining a loss function of the neural network according to the position prediction and tilt-angle prediction of the camera; and training the neural network through a back-propagation algorithm until the loss function of the neural network meets the preset condition.

The step of acquiring the position information and power information of the LEDs and constructing a training sample set using a camera simulation system specifically comprises: modeling according to the position information of the LEDs, the power information of the LEDs, and the parameter information of the camera; arranging the camera at sampling points within a preset area and obtaining signal images of the LEDs according to the pinhole imaging principle; and recording each signal image with its corresponding sampling point to construct the training sample set.

Wherein the method further comprises: establishing a camera coordinate system centered on the camera; and obtaining the position information of the LED relative to the camera according to the three-dimensional position information of the camera and the position information of the LED.

In a second aspect, an embodiment of the present invention provides a method for positioning a device in visible light communication, including:

Acquiring a signal image generated by a camera in visible light communication, and preprocessing the signal image to obtain a characteristic image;

Inputting the characteristic image into a trained neural network, and obtaining the position of the camera relative to the LEDs by combining the position information of the LEDs.

In a third aspect, an embodiment of the present invention provides a system for positioning a device in visible light communication, including:

A preprocessing module, configured to acquire a signal image generated by a camera in visible light communication and preprocess the signal image to obtain a characteristic image;

A positioning module, configured to input the characteristic image into a trained neural network to obtain the three-dimensional position information of the camera;

Wherein the three-dimensional position information of the camera includes, but is not limited to, a combination of one or more of camera position information and camera tilt angle.

In a fourth aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the steps of the method for positioning a device in visible light communication as provided in the first aspect.

In a fifth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method for positioning a device in visible light communication as provided in the first aspect.

According to the method and system for positioning a device in visible light communication provided by the embodiments of the present invention, a trained neural network is used to position the camera in the visible light communication system; compared with traditional geometric methods, higher positioning accuracy can be achieved at the same resolution, with less influence from the camera height.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a method for positioning a device in visible light communication according to an embodiment of the present invention;

Fig. 2 is a schematic view of an application scenario of a method for positioning a device in visible light communication according to an embodiment of the present invention;

Fig. 3 is a schematic flowchart of a method for positioning a device in visible light communication according to another embodiment of the present invention;

Fig. 4 is a schematic diagram of a camera simulation system in a method for positioning a device in visible light communication according to another embodiment of the present invention;

Fig. 5 is a schematic diagram of a feature image generated by the camera simulation system in a method for positioning a device in visible light communication according to another embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a convolutional neural network in a method for positioning a device in visible light communication according to another embodiment of the present invention;

Fig. 7 is a schematic diagram of a deep residual network structure in a method for positioning a device in visible light communication according to another embodiment of the present invention;

Fig. 8 is a schematic comparison of the training processes of the convolutional neural network and the deep residual network in a method for positioning a device in visible light communication according to yet another embodiment of the present invention;

Fig. 9 is a schematic diagram of the variation of positioning error with height in a method for positioning a device in visible light communication according to yet another embodiment of the present invention;

Fig. 10 is a schematic comparison of the positioning error with that of a conventional geometric method in a method for positioning a device in visible light communication according to yet another embodiment of the present invention;

Fig. 11 is a schematic structural diagram of a system for positioning a device in visible light communication according to an embodiment of the present invention;

Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, fig. 1 is a schematic flowchart of a method for positioning a device in visible light communication according to an embodiment of the present invention, where the method includes:

S1: acquiring a signal image generated by the camera in visible light communication, and preprocessing the signal image to obtain a characteristic image.

S2: inputting the characteristic image into the trained neural network to obtain the three-dimensional position information of the camera.

Wherein the three-dimensional position information of the camera includes, but is not limited to, a combination of one or more of camera position information and camera tilt angle.

Specifically, after an image is obtained during camera communication, it must be processed into a standard characteristic image using the same preprocessing applied offline, so that it can be input into the neural network. The signal received by each Photodiode (PD) on the camera's light-sensing plate is first judged to originate either from the background or from an LED. Each region of the signal image that originates from an LED is rendered according to that LED's ID, while regions originating from the background are rendered black. The image resolution is then adjusted to the input resolution expected by the preset neural network, yielding the characteristic image.
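As an illustration only (not part of the claimed method), the rendering rule above can be sketched in Python; the palette mapping LED IDs to colors, and the use of ID 0 for background, are hypothetical assumptions:

```python
# Hypothetical palette: decoded LED ID -> RGB color. ID 0 (or any unknown
# ID) means the PD received only ambient/background light.
PALETTE = {1: (255, 0, 0), 2: (0, 255, 0), 3: (0, 0, 255)}

def render_feature_image(pd_ids):
    """Render each PD according to its decoded signal source: LED-lit PDs
    take that LED's palette color, background PDs are rendered black."""
    return [[PALETTE.get(pid, (0, 0, 0)) for pid in row] for row in pd_ids]
```

The output is a color image of the same PD-grid dimensions, with the background removed (black) and each LED distinguishable by its color.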

The characteristic image is then input into the trained neural network, and a single forward-propagation pass yields the optical-center position and tilt angle of the camera.

With this method, a trained neural network positions the camera in the visible light communication system; compared with traditional geometric methods, it achieves higher positioning accuracy at the same resolution and is less affected by camera height.

On the basis of the above embodiment, the step of acquiring a signal image generated by the camera in visible light communication, preprocessing the signal image, and obtaining a feature image specifically includes: acquiring a signal image by receiving signals through the photodiodes on the camera light-sensing plate; separating the ambient light and the LED light in the signal image, labeling the LED light, and establishing a mapping rule between the LED light and its labels; and rendering the ambient light and the LED light in the signal image to obtain the characteristic image.

Specifically, after an image is obtained during camera communication, it must be processed into a standard characteristic image using the same preprocessing applied offline, so that it can be input into the neural network. That is, each PD signal on the camera's light-sensing plate is judged either to come from the background or to correspond to the ID of a particular LED.

Different LED images are rendered by assigning different colors to the corresponding PD regions; if a PD receives no signal, it is rendered black. If the preprocessed image resolution does not equal the feature-image resolution used in the offline training stage, the image is upsampled (when its resolution is lower) or downsampled (when its resolution is higher) to obtain the processed feature image.
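The resolution adjustment can be sketched with nearest-neighbor resampling; this is one possible choice of resampling method, not one specified by the patent:

```python
def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resampling of a 2-D image (list of rows): upsamples
    when the input is smaller than the target feature-image resolution and
    downsamples when it is larger."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)]
            for r in range(out_h)]
```

For example, a 2x2 feature image resized to 4x4 repeats each PD value in a 2x2 block, and resizing back recovers the original grid.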

On the basis of the embodiment, characteristic images generated by a plurality of cameras in visible light communication and the position information corresponding to the cameras are acquired, and a training sample set is constructed; a preset neural network is then trained through the training sample set to obtain the trained neural network.

The step of obtaining characteristic images generated by a plurality of cameras in visible light communication and the position information corresponding to the cameras and constructing a training sample set further comprises: acquiring the position information and power information of the LEDs, and constructing a training sample set using a camera simulation system.

The step of acquiring the position information and power information of the LEDs and constructing a training sample set using a camera simulation system specifically comprises: modeling according to the position information of the LEDs, the power information of the LEDs, and the parameter information of the camera; arranging the camera at sampling points within a preset area and obtaining signal images of the LEDs according to the pinhole imaging principle; and recording each signal image with its corresponding sampling point to construct the training sample set.

Specifically, the LEDs send optical signals to the camera, the PD array on the camera receives them through pinhole imaging, and the source of each PD's signal can be determined from what it receives. When the ordinary camera-communication mode is used, a PD determines that its signal originates from a particular LED from the communication signal received over a period of time; when a rolling-shutter camera-communication mode is used, the image obtained by the camera consists of light and dark LED stripe images, and each PD's signal source can be determined from the light or dark stripe region in which the PD lies.

The advantage of camera communication over conventional image positioning is that the ID information of each LED image on the image can be obtained. To convey the identity of each LED image to the neural network, different LED images are rendered in different colors. A one-to-one mapping rule between LEDs and colors is established, and the same rule is applied to all image signals.

All PDs of the camera are rendered according to the mapping rule; a PD whose received signal is only ambient light is rendered black, removing the background. After the feature images and the data set of corresponding positions are acquired, each feature image is given a quality label according to the amount of information it contains: if the image contains m LED source images, n of which are complete, its quality grade is "m-n". Low-quality samples are removed so as not to interfere with neural network training. The high-quality data set is randomly divided into a training set and a validation set and input into the convolutional neural network for training. The network is made to converge by tuning its parameters, and the network with the minimum validation-set error is finally taken as the positioning network.
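The quality grading and data-set split described above might look like the following sketch; the sample field names, the minimum-quality threshold, and the 80/20 split ratio are assumptions for illustration:

```python
import random

def quality_grade(m_sources, n_complete):
    """Grade "m-n": m LED source images appear in the feature image,
    n of them are complete (not clipped)."""
    return "%d-%d" % (m_sources, n_complete)

def filter_and_split(samples, min_complete=1, val_ratio=0.2, seed=0):
    """Remove low-quality samples (fewer than min_complete complete LED
    images), then randomly split the rest into training/validation sets."""
    good = [s for s in samples if s["complete"] >= min_complete]
    rng = random.Random(seed)
    rng.shuffle(good)
    cut = int(round(len(good) * (1 - val_ratio)))
    return good[:cut], good[cut:]
```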

If there are not enough real samples and the positions and powers of the LEDs are known, the offline training phase may use a camera simulation system to obtain a training set, comprising the following steps:

Each LED in free space is modeled as a sphere, where the sphere's position is the real position of the LED and its radius represents the power of the LED source. Different LEDs have different IDs; the mapping rule is likewise established, with different IDs corresponding to different colors.

The camera is a rigid body with six degrees of freedom, described by six parameters: the lens optical-center coordinates (three parameters), the Euler rotation-axis unit vector (two parameters), and the rotation angle (one parameter). The distance between the optical center and the plane of the light-sensing plate is the focal length; the focal length and plate size determine the camera's viewing angle. The PDs are uniformly distributed in the plane of the plate, and their number is determined by the plate size and the camera resolution.

Sampling points are taken uniformly within the target area. For each sampling point, the coverage of each LED's optical signal is computed from linear optics and the camera's pinhole imaging principle, and each PD on the light-sensing plate is judged to be covered by the LED signal or not. If a PD is within the signal range, it is rendered with the color corresponding to the LED's ID; otherwise it is treated as background and rendered black. The result is a characteristic image with background information removed and different LED images distinguished by different colors.
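A minimal sketch of the pinhole projection and the PD-coverage test follows, assuming an untilted camera with its optical axis along +z; the `covered_pds` helper and its circular-spot model are hypothetical simplifications of the coverage computation:

```python
def project_point(led_xyz, cam_xyz, focal_length):
    """Pinhole projection of a point LED onto the light-sensing plate of an
    untilted camera whose optical axis points along +z. Returns plate
    coordinates (u, v), inverted as in pinhole imaging, or None when the
    LED is not in front of the camera."""
    dx = led_xyz[0] - cam_xyz[0]
    dy = led_xyz[1] - cam_xyz[1]
    dz = led_xyz[2] - cam_xyz[2]
    if dz <= 0:
        return None
    return (-focal_length * dx / dz, -focal_length * dy / dz)

def covered_pds(uv, pd_pitch, spot_radius):
    """Which PD grid cells fall inside a circular image spot of radius
    spot_radius centered at uv, with PDs spaced pd_pitch apart."""
    u, v = uv
    r = int(spot_radius / pd_pitch)
    cu, cv = round(u / pd_pitch), round(v / pd_pitch)
    return {(cu + i, cv + j)
            for i in range(-r, r + 1) for j in range(-r, r + 1)
            if (i * i + j * j) * pd_pitch ** 2 <= spot_radius ** 2}
```

The sphere radius of the LED model maps to the spot radius here, so a higher-power LED covers more PDs at the same distance.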

After the feature images and the position data set are acquired, each feature image is quality-labeled according to the amount of information it contains: if the image contains m LED source images, n of which are complete, its quality grade is "m-n". Low-grade samples are removed, and the high-quality data set is randomly divided into a training set and a validation set.

With this method, the sample set is constructed from real data together with the camera simulation system, which satisfies the sample-count requirements of neural network training; meanwhile, using the simulation system to construct training samples enriches the variety of the sample set and makes the network's training more complete.

On the basis of the above embodiment, the step of training a preset neural network through the training sample set to obtain the trained neural network specifically includes: constructing a neural network according to the characteristics of the training sample set, and determining a loss function of the neural network according to the position prediction and tilt-angle prediction of the camera; and training the neural network through a back-propagation algorithm until the loss function of the neural network meets the preset condition.

Specifically, a neural network is constructed. The network input is the characteristic image, so the input matrix has dimensions (m, n, 3), where m and n are the camera resolution and 3 indicates a color image. The output is the camera's three-dimensional position, represented by a 1×3 vector. When the camera angle must also be predicted, two outputs are used, represented by two 1×3 vectors: one for the camera position and one for the camera tilt angle. To extract features from the characteristic image, the backbone is a convolutional neural network whose activation functions are of the ReLU family, and a fully connected network performs the final regression to obtain the camera's angle and position. The convolution kernel size and initialization type of the network can be selected through repeated experiments to find the best combination.
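A toy, single-channel version of that forward pass (convolution, ReLU activation, fully connected regression head) is sketched below; it illustrates the data flow only, not the actual layer sizes, channel counts, or trained weights of the network:

```python
def conv2d_valid(img, kernel):
    """'Valid' single-channel convolution (cross-correlation, as in most
    deep-learning frameworks)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(feat):
    """Elementwise ReLU activation on a 2-D feature map."""
    return [[max(0.0, x) for x in row] for row in feat]

def fc_head(feat, weights, biases):
    """Flatten the feature map and apply one fully connected layer to
    regress a 1x3 output vector (e.g. the camera position)."""
    flat = [x for row in feat for x in row]
    return [sum(w * x for w, x in zip(wrow, flat)) + b
            for wrow, b in zip(weights, biases)]
```

A second, identically shaped fully connected head would produce the 1×3 tilt-angle output when the camera angle is also predicted.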

The training goal of the neural network is to minimize the loss function, which consists of the camera position prediction error (mean square error) and the angle prediction error (mean square error):

$$L=\frac{1}{M}\sum_{i=1}^{M}\left\|P_i^{pre}-P_i^{true}\right\|_2^2+\frac{R}{M}\sum_{i=1}^{M}\left\|\frac{\theta_i^{pre}}{\|\theta_i^{pre}\|_2}-\frac{\theta_i^{true}}{\|\theta_i^{true}\|_2}\right\|_2^2$$

Where M is the number of samples, $P_i^{pre}$ is the position prediction for the i-th feature image, $P_i^{true}$ is the true position corresponding to the i-th feature image, $\theta_i^{pre}$ is the angle prediction for the i-th feature image, and $\theta_i^{true}$ is the true angle vector corresponding to the i-th feature image. R is the weight coefficient of the angle prediction error, representing the attention paid to the angle error: the larger R is, the more accurate the angle prediction and the coarser the position prediction. $\|\cdot\|_2$ is the 2-norm used to normalize the angle vectors.
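The loss described above (position MSE plus R times the MSE between normalized angle vectors) can be sketched directly in Python:

```python
import math

def normalize(v):
    """Normalize a vector to unit 2-norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def positioning_loss(pos_pred, pos_true, ang_pred, ang_true, R=1.0):
    """Mean-square position error plus R times the mean-square error
    between the normalized predicted and true angle vectors."""
    M = len(pos_pred)
    pos_term = sum(sum((p - t) ** 2 for p, t in zip(pp, tt))
                   for pp, tt in zip(pos_pred, pos_true)) / M
    ang_term = sum(sum((a - b) ** 2
                       for a, b in zip(normalize(ap), normalize(at)))
                   for ap, at in zip(ang_pred, ang_true)) / M
    return pos_term + R * ang_term
```

Because the angle vectors are normalized before comparison, a prediction that points in the true direction contributes zero angle error regardless of its magnitude.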

The neural network is trained iteratively using a back-propagation algorithm, gradually reducing the network loss; different optimizers are tried, and the one with the fastest convergence and least oscillation is selected. Finally, the model with the minimum loss on the validation set is saved as the final positioning network for online prediction.

On the basis of the above embodiment, the method further includes: establishing a camera coordinate system centered on the camera; and obtaining the position information of the LED relative to the camera according to the three-dimensional position information of the camera and the position information of the LED.

Specifically, after the angle and position of the camera are obtained, the relationship between the World Coordinate System (WCS) and the Camera Coordinate System (CCS) can be determined; the camera coordinate system, used to describe the camera position and angle, is illustrated in fig. 2. The position of the LED relative to the camera is then obtained through a translation and rotation coordinate transformation:

LED_c = R · (LED_w − C_w)

where LED_w is the coordinates of the LED in the world coordinate system, LED_c is the coordinates of the LED in the camera coordinate system, and C_w is the coordinates of the camera in the world coordinate system. The rotation matrix R, defined by the camera direction vectors, is:

R = [ x_1 x_2 x_3 ; y_1 y_2 y_3 ; z_1 z_2 z_3 ]

where (x_1, x_2, x_3) are the coordinates of the camera's x-axis direction vector in the world coordinate system, and the y-axis and z-axis rows are defined in the same way.
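A numpy sketch of this transformation, with an illustrative rotation (a camera rotated 90 degrees about the z-axis) and an illustrative camera position:

```python
import numpy as np

# Rotation matrix rows are the camera's x, y, z direction vectors in the WCS
# (illustrative values: camera rotated 90 degrees about the z-axis).
R = np.array([[0.0, 1.0, 0.0],    # camera x-axis expressed in world coordinates
              [-1.0, 0.0, 0.0],   # camera y-axis
              [0.0, 0.0, 1.0]])   # camera z-axis
C_w = np.array([1.0, 2.0, 3.0])   # camera position in the WCS

def world_to_camera(LED_w):
    """LED_c = R (LED_w - C_w): translate to the camera origin, then rotate."""
    return R @ (LED_w - C_w)

LED_w = np.array([1.0, 2.0, 5.0])  # an LED directly above the camera
print(world_to_camera(LED_w))      # [0. 0. 2.]
```

An LED 2 units above the camera maps to (0, 0, 2) in the camera coordinate system, as expected since the camera's z-axis is unchanged by the rotation.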

In summary, in the method provided by the embodiment of the invention, the higher the camera height (i.e. the closer the camera is to the lamp), the smaller the average positioning error. Compared with the traditional geometric method, the neural network method has higher positioning precision under the same resolution and is less influenced by the height of the camera.

Referring to fig. 3, fig. 3 is a schematic flowchart of a method for positioning a device in visible light communication according to another embodiment of the present invention, where the method includes:

S31, acquiring a signal image generated by the camera in visible light communication, preprocessing the signal image and acquiring a characteristic image;

And S32, inputting the characteristic image into the trained neural network, and combining the position information of the LED to obtain the position of the camera relative to the LED.

Specifically, after an image captured during camera communication is obtained, it must be processed into a standard feature image using the same preprocessing as in the offline stage before it can be input into the neural network. That is, the received signal of each PD on the camera's light-sensing plate is used to determine whether the signal comes from the background or corresponds to the ID of an LED.

The PD regions corresponding to different LED images are rendered in different colors; if a PD has no received signal, it is rendered black. If the image resolution after preprocessing is not equal to the feature image resolution used in the offline training stage, the image is processed by up-sampling (when its resolution is lower) or down-sampling (when its resolution is higher) before being input into the neural network.

With the processed image as input, a single forward propagation pass through the trained positioning neural network yields the position of the LED array relative to the camera; the position of the camera relative to the LEDs then follows from the LEDs' own position information, which can be obtained through visible light communication.

In training a neural network, a sufficient training sample set needs to be acquired first. The LED sends an optical signal to the camera, the PD array on the camera receives the signal through the pinhole imaging principle, and the signal source of the PD can be judged according to the received signal of the PD.

In order to convey the differences between LED images to the neural network, different LED images are rendered in different colors. A one-to-one mapping rule between LEDs and colors is established, and the same rule is applied to all image signals.

All PDs of the camera are rendered according to the mapping rule; when a PD's received signal is ambient light, it is rendered black in order to remove the background. After the feature images and the data set of corresponding positions are acquired, each feature image is given a quality mark according to the amount of information it contains. Low-quality samples are removed so as not to interfere with neural network training. The high-quality data set is randomly divided into a training set and a validation set and input into the convolutional neural network for training. The neural network is made to converge by adjusting parameters, and finally the network with the minimum error on the validation set is taken as the positioning network.
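The rendering rule can be sketched as follows; the ID-to-color mapping, array layout, and function name are illustrative assumptions:

```python
import numpy as np

# Fixed one-to-one LED-ID -> color mapping (colors are illustrative).
ID_TO_COLOR = {1: (255, 0, 0), 2: (0, 255, 0), 3: (0, 0, 255)}

def render_feature_image(pd_ids):
    """pd_ids: (H, W) array of decoded LED IDs per PD, with 0 meaning
    the PD only received ambient light. Returns an (H, W, 3) color
    feature image with the background rendered black."""
    H, W = pd_ids.shape
    img = np.zeros((H, W, 3), dtype=np.uint8)  # default: black background
    for led_id, color in ID_TO_COLOR.items():
        img[pd_ids == led_id] = color
    return img

pd_ids = np.array([[0, 1, 1],
                   [0, 2, 3]])
img = render_feature_image(pd_ids)
# img[0, 1] is red (LED 1); img[1, 0] stays black (ambient light removed)
```

The same mapping rule must be reused for every image signal, offline and online, so that a given LED always appears in the same color.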

If there are not enough actual samples and the relative positions and powers of the LEDs are known, the offline training phase may utilize a camera simulation system to acquire a training set. The LEDs in free space are modeled as spheres: the relative positions of the spheres are the real relative positions of the LEDs, the radius of each sphere represents the power of the LED source, and different LEDs have different IDs. The same mapping rule is established, with different IDs corresponding to different colors for distinction.

In the camera coordinate system, the camera is a rigid body with 6 degrees of freedom. The lens optical center can be defined (but is not limited to be) the origin, with direction vectors along the x, y, and z axes. The light-sensing plate is a finite plane below and parallel to the xoy plane; the distance between the optical center and the plane of the light-sensing plate is the focal length, and the focal length together with the size of the light-sensing plate determines the camera's viewing angle. The PDs in the camera are uniformly distributed on the plane of the light-sensing plate, and their number is determined by the size of the light-sensing plate and the resolution of the camera.

Sampling is performed uniformly within the target area. For each sampling point, the optical signal coverage of each LED is solved according to linear optics and the camera's pinhole imaging principle, and each PD on the camera's light-sensing plate is judged to be covered by the LED signal or not according to this range. If a PD is within the signal range, it is rendered in the color corresponding to the LED's ID; if it is not within any signal coverage range, it is regarded as background and rendered black. Finally, a feature image is obtained in which background information is removed and different LED images are distinguished by different colors.

After the feature images and the data set of LED positions are acquired, each feature image is given a quality mark according to its information content. If there are m images of LED sources in the image, n of which are complete, the image quality is marked as grade "m-n". Low-grade samples are removed, the high-quality data set is randomly divided into a training set and a validation set, and these are input into the convolutional neural network for training. The neural network is made to converge by adjusting parameters, and finally the network with the minimum error on the validation set is taken as the positioning network.
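The "m-n" grading and filtering step can be sketched as follows; the function names and the minimum-grade threshold are illustrative (a "3-0" cutoff matching the later embodiment is assumed):

```python
def quality_grade(num_sources, num_complete):
    """Grade 'm-n': m LED source images in the picture, n of them complete."""
    return f"{num_sources}-{num_complete}"

def keep_sample(num_sources, num_complete, min_sources=3):
    """Keep only feature images whose grade is at least 'min_sources-0',
    i.e. that show enough LED source images to be informative."""
    return num_sources >= min_sources

# (m, n) counts per feature image in a toy data set.
samples = [(0, 0), (2, 2), (3, 0), (3, 3)]
kept = [s for s in samples if keep_sample(*s)]
print(kept)  # [(3, 0), (3, 3)]
```

Filtering out low-grade samples before the random training/validation split keeps uninformative images from interfering with training.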

In the process of training the neural network, the neural network is first constructed. The network input is a feature image, so the input matrix dimension is (m, n, 3), where m and n are the camera resolution and 3 is the number of color channels. The output is the three-dimensional positions of the LED sources, represented by a 1x3n vector, where n is the number of LEDs in the LED array. To extract features from the feature image, the backbone is a convolutional neural network whose activation functions are ReLU-family functions. To regress the LED positions, a fully connected network is appended for the final regression prediction. The size and initialization type of the network's convolution kernels can be selected through repeated experiments to find the best combination.

The training goal of the neural network is to minimize the loss function, which in this case consists only of the LED array position prediction error (mean square error), defined as follows:

Loss = (1/M) Σ_{i=1}^{M} ‖P_i^pre − P_i^true‖²

where M is the number of samples, P_i^pre is the position prediction for the ith feature image, and P_i^true is the true position of the LED array corresponding to the ith feature image.

The neural network is trained iteratively using the back propagation algorithm so that the network loss decreases gradually; different optimizers are tried, and the one with the fastest convergence and least oscillation is selected. Finally, the model with the minimum loss on the validation set is saved as the final positioning network for online prediction.

In another embodiment of the invention, as shown in fig. 4, the LED is modeled as a light-emitting sphere in free space whose radius is related to the LED power and the camera's light-sensing range. According to linear optics and the camera pinhole imaging principle, the coverage range of the LED optical signal can be calculated: tangent lines are drawn from the optical center C to the sphere L (shown as lines CA and CB), and the cone enclosed by these tangents is the optical signal coverage range. Plane AB is the plane through the center of the sphere parallel to plane A′B′; evidently, when camera C is far from LED source L, the distance between the two planes is negligible. To simplify the calculation, the base of the optical signal cone is approximated by the intersection of plane A′B′ with sphere L, i.e., CA′ is the generatrix and CL is the rotation axis. A point P inside the cone satisfies:

(P − C) · (L − C) ≥ ‖P − C‖ · ‖L − C‖ · cos α

where cos α = √(‖L − C‖² − r²) / ‖L − C‖, α is the half-angle of the cone, and r is the radius of sphere L.

The camera angle is represented by a unit Euler rotation axis k and a rotation angle θ (with the right-hand-rule direction as positive), and the camera optical center is denoted as the column vector C_w in the WCS. The coordinates of a PD on the camera's light-sensing plate in the CCS are known as PD_c; the coordinates of the PD in the WCS are then:

PD_w = R(k, θ) · PD_c + C_w

where the rotation matrix is given by the Rodrigues formula

R(k, θ) = cos θ · I + (1 − cos θ) · k kᵀ + sin θ · [k]×

and k_i denotes the ith element of the axis vector k. Substituting PD_w into the cone equation determines whether the PD lies inside the cone.
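A numpy sketch of the axis-angle rotation (via the standard Rodrigues formula) and the inside-cone test; the geometry follows the tangent-cone construction described above, and all numeric values are illustrative:

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix for unit axis k and angle theta (right-hand rule):
    R = cos(t) I + (1 - cos(t)) k k^T + sin(t) [k]x."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])  # cross-product matrix [k]x
    return (np.cos(angle) * np.eye(3)
            + (1.0 - np.cos(angle)) * np.outer(k, k)
            + np.sin(angle) * K)

def inside_cone(P, C, L, r):
    """Is point P inside the cone of tangents from optical center C
    to the sphere of radius r centered at LED position L?"""
    axis = L - C
    d = np.linalg.norm(axis)
    cos_half = np.sqrt(d * d - r * r) / d  # cosine of the cone half-angle
    v = P - C
    return v @ axis >= np.linalg.norm(v) * d * cos_half

C = np.zeros(3)
L = np.array([0.0, 0.0, 10.0])
assert inside_cone(L, C, L, r=1.0)                           # sphere center: inside
assert not inside_cone(np.array([5.0, 0.0, 5.0]), C, L, r=1.0)  # off-axis: outside

# A PD at PD_c in the CCS maps to the WCS as PD_w = R(k, theta) PD_c + C_w.
R = rodrigues([0, 0, 1], np.pi / 2)          # 90-degree rotation about z
PD_w = R @ np.array([1.0, 0.0, 0.0]) + C     # x-axis maps to the y-axis
```

Each PD is mapped into the WCS this way and then substituted into the cone inequality to decide whether it is covered by the LED signal.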

In this embodiment, 3 LEDs are used for positioning; to distinguish the three LED images in the image signal, they are identified by different colors. If a PD satisfies the inside-cone condition, it is rendered in the corresponding color mark of that LED (as shown by points 1 and 2 in fig. 3); if not (as shown by point 3 in fig. 3), it is marked 0 (black).

After all PDs are rendered, a feature image for localization is obtained, whose resolution is related to the camera resolution, as shown in fig. 5. Feature images are sampled uniformly in a room and their position labels are recorded to form a training set. Different feature images are given quality marks: if there are m images of LED sources in an image, n of which are complete, the image quality is marked as grade "m-n". Low-quality images at grades "0-0" through "2-2" are removed, leaving feature images at grade "3-0" and above, which account for approximately two thirds of the total data set, to form the final training set.

One tenth of the training set is randomly extracted as a validation set, and the rest is input into the neural network for training. When the image resolution is low (between 28 and 224 pixels), the CNN structure shown in fig. 6 can be used for training; when the image resolution is high, a deep residual network (ResNet) as shown in fig. 7 can be used. The networks shown in fig. 6 and fig. 7 mainly consist of convolutional layers (Conv2d), maximum pooling layers (MaxPooling), a flattening layer (Flatten), and fully connected layers (Dense); the dimensional changes of the data features can be seen from the figures. The neural network is trained with the Adam optimizer through the back propagation algorithm, with the goal of minimizing the loss; after training, the network weights with the minimum loss on the validation set are selected as the positioning network. The loss is defined as follows:

Loss = (1/M) Σ_{i=1}^{M} ( ‖P_i^pre − P_i^true‖² + R · ‖θ_i^pre/‖θ_i^pre‖₂ − θ_i^true/‖θ_i^true‖₂‖² )

where P_i^pre is the position prediction for the ith feature image, P_i^true is the true position corresponding to the ith feature image, θ_i^pre is the angle prediction for the ith feature image, and θ_i^true is the true angle vector corresponding to the ith feature image. R is the weight coefficient of the angle prediction error, representing the degree of attention paid to the angle error; in this embodiment R = 0, i.e., the positioning network only considers the position error.

In the online positioning stage, photos obtained during the camera's optical communication are first preprocessed. In ordinary optical communication, the LED sends its ID to the camera receiving end, and the PDs within the optical signal coverage area render the ID information into the corresponding colors according to the mapping rule of the offline training phase so as to distinguish different LEDs, thereby forming a feature image similar to that in fig. 4. For example, when a camera communicates, the PDs on the camera's light-sensing plate can determine the source ID from their received signals, and different PDs are rendered in the corresponding colors to represent the different LED images. When the rolling-shutter mode is used for communication, the camera scans line by line to form light and dark stripes and transmits information using them; in each LED imaging area, the source ID can be read from the distribution of the light and dark stripes, and the area is then rendered in the corresponding color.

after the characteristic image is obtained through preprocessing, the image can be input into a trained neural network model, and the network output is the 3D position of the camera.

In yet another embodiment of the present invention, the test area is a cuboid space of 6m x 6m x 5m, the focal length of the camera is 15mm, the light-sensing plate is that of a 36mm x 24mm full-frame camera, and the receiving half-angle of the camera is 45 degrees. The three LED coordinates are (1000, 0, 0), (0, 1000, 0) and (0, -1000, 0) (in millimeters), the radius of the LED sphere model is set to 50mm, and the camera faces horizontally upward in the data samples. Two training sets are generated, at camera resolutions of 28 pixels and 224 pixels. Sampling is uniform in the cuboid, 30 times each along the length, width, and height, giving 27,000 images per training set; after removing data samples of quality below grade "3-0", a training set of about 18,000 samples is obtained and input into the neural network, which is then iterated with the back propagation algorithm. The loss curves of the two networks are shown in fig. 8.

The network with the minimum loss on the validation set is obtained through training, and 10,000 samples are uniformly sampled from the positioning area for testing: the average positioning error of the resolution-28 test set is 169mm, and that of the resolution-224 test set is 28mm. Analyzing the influence of camera height on the positioning error from the test results, and averaging the positioning errors of cameras at the same height, gives the result in fig. 9: the higher the camera (i.e., the closer the camera is to the lamp), the smaller the average positioning error. Compared with the traditional geometric method, the neural network method has higher positioning accuracy at the same resolution and is less influenced by the camera height, as shown in fig. 10.

Referring to fig. 11, fig. 11 is a schematic structural diagram of a device positioning system in visible light communication according to an embodiment of the present invention, where the system includes: a pre-processing module 1001 and a positioning module 1002.

the preprocessing module 1001 is configured to acquire a signal image generated by the camera in visible light communication, and preprocess the signal image to obtain a feature image.

The positioning module 1002 is configured to input the feature image into a trained neural network, so as to obtain three-dimensional position information of the camera.

Wherein the three-dimensional position information of the camera includes, but is not limited to, a combination of one or more of camera position information and camera tilt angle.

It should be noted that the preprocessing module 1001 and the positioning module 1002 cooperate to execute the device positioning method in visible light communication of the foregoing embodiments; for the specific functions of the system, refer to the foregoing method embodiments, which are not described again here.

Fig. 12 illustrates a schematic structural diagram of an electronic device. As shown in fig. 12, the server may include: a processor 1110, a communications interface 1120, a memory 1130, and a bus 1140, where the processor 1110, the communications interface 1120, and the memory 1130 communicate with each other via the bus 1140. The communications interface 1120 may be used for information transfer between the server and a smart television. The processor 1110 may call logic instructions in the memory 1130 to perform the following method: acquiring a signal image generated by a camera in visible light communication, and preprocessing the signal image to obtain a feature image; inputting the feature image into a trained neural network to obtain three-dimensional position information of the camera; wherein the three-dimensional position information of the camera includes, but is not limited to, a combination of one or more of camera position information and camera tilt angle.

The present embodiments also provide a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, enable the computer to perform the methods provided by the above-described method embodiments, for example, including: acquiring a signal image generated by a camera in visible light communication, and preprocessing the signal image to obtain a characteristic image; inputting the characteristic image into a trained neural network to obtain three-dimensional position information of the camera; wherein the three-dimensional position information of the camera includes, but is not limited to, a combination of one or more of camera position information and camera tilt angle.

The present embodiments provide a non-transitory computer-readable storage medium storing computer instructions that cause the computer to perform the methods provided by the above method embodiments, for example, including: acquiring a signal image generated by a camera in visible light communication, and preprocessing the signal image to obtain a characteristic image; inputting the characteristic image into a trained neural network to obtain three-dimensional position information of the camera; wherein the three-dimensional position information of the camera includes, but is not limited to, a combination of one or more of camera position information and camera tilt angle.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
