Vehicle positioning system using laser radar

Document No.: 1145916    Publication date: 2020-09-11

Reading note: This technology, "Vehicle positioning system using laser radar" (使用激光雷达的车辆定位系统), was designed and created by 年兴, 冯路, and 马腾 on 2018-11-09. Its main content is as follows. The embodiments of the application provide a system and a method for positioning a vehicle. The system includes a communication interface (202) configured to receive a point cloud relating to a scene captured by a sensor equipped on the vehicle. The system further includes a memory (208) configured to store the point cloud and a high definition map. The system also includes a processor (204). The processor (204) is configured to create a first three-dimensional representation of the point cloud and create a second three-dimensional representation of the high definition map associated with the scene. The processor (204) is further configured to determine pose information of the vehicle by comparing the first three-dimensional representation and the second three-dimensional representation. The processor (204) determines a position of the vehicle from the pose information.

1. A system for locating a vehicle, comprising:

a communication interface configured to receive a point cloud relating to a scene captured by a sensor equipped on the vehicle;

a memory configured to store the point cloud and a high definition map; and

a processor configured to:

create a first three-dimensional representation of the point cloud;

create a second three-dimensional representation of the high-definition map relating to the scene;

determine pose information for the vehicle by comparing the first three-dimensional representation and the second three-dimensional representation; and

determine a position of the vehicle based on the pose information.

2. The system of claim 1, wherein the sensor is a lidar.

3. The system of claim 1, wherein comparing the first three-dimensional representation and the second three-dimensional representation comprises comparing a voxel of the first three-dimensional representation to a corresponding voxel of the second three-dimensional representation.

4. The system of claim 1, wherein the processor is further configured to:

determine a first voxel value distribution of the first three-dimensional representation;

determine a second voxel value distribution of the second three-dimensional representation; and

compare the first voxel value distribution to the second voxel value distribution.

5. The system of claim 1, wherein the communication interface is further configured to receive acquired pose information of the vehicle via a positioning sensor equipped on the vehicle, wherein the processor is further configured to use the acquired pose information to refine the pose information.

6. The system of claim 5, wherein to refine the pose information, the processor is configured to apply a filter to the acquired pose information and the determined pose information.

7. The system of claim 6, wherein the filter is an unscented Kalman filter (UKF).

8. The system of claim 1, wherein the processor is further configured to create the first three-dimensional representation of the point cloud based on initial pose information, wherein the initial pose information is estimated based on pose information corresponding to previous positions of the vehicle.

9. The system of claim 1, wherein the processor is further configured to:

update the pose information to increase similarity between the first three-dimensional representation and the second three-dimensional representation.

10. The system of claim 1, wherein the processor is configured to:

create a first three-dimensional representation of the point cloud at a first resolution and a first three-dimensional representation of the point cloud at a second resolution, wherein the first resolution is lower than the second resolution;

create a second three-dimensional representation of the high-definition map at the first resolution and a second three-dimensional representation of the high-definition map at the second resolution;

determine the pose information by comparing the first three-dimensional representation at the first resolution and the second three-dimensional representation at the first resolution; and

refine the pose information by comparing the first three-dimensional representation at the second resolution and the second three-dimensional representation at the second resolution.

11. A method for locating a vehicle, comprising:

receiving a point cloud relating to a scene captured by a sensor equipped on the vehicle;

creating, by a processor, a first three-dimensional representation of the point cloud;

creating, by the processor, a second three-dimensional representation of a high-definition map related to the scene;

determining, by the processor, pose information of the vehicle by comparing the first three-dimensional representation and the second three-dimensional representation; and

determining a position of the vehicle based on the pose information.

12. The method of claim 11, wherein comparing the first three-dimensional representation and the second three-dimensional representation comprises comparing a voxel of the first three-dimensional representation with a corresponding voxel of the second three-dimensional representation.

13. The method of claim 11, further comprising:

determining a first voxel value distribution of the first three-dimensional representation;

determining a second voxel value distribution of the second three-dimensional representation; and

comparing the first voxel value distribution to the second voxel value distribution.

14. The method of claim 11, further comprising:

receiving the acquired pose information of the vehicle; and

using the acquired pose information to refine the pose information.

15. The method of claim 14, wherein refining the pose information further comprises applying a Kalman filter to the acquired pose information and the determined pose information.

16. The method of claim 11, further comprising creating the first three-dimensional representation of the point cloud based on initial pose information, wherein the initial pose information is estimated based on pose information corresponding to previous positions of the vehicle.

17. The method of claim 11, further comprising:

updating the pose information to increase similarity between the first three-dimensional representation and the second three-dimensional representation.

18. The method of claim 11, further comprising:

creating a first three-dimensional representation of the point cloud at a first resolution and a first three-dimensional representation of the point cloud at a second resolution, wherein the first resolution is lower than the second resolution;

creating a second three-dimensional representation of the high-definition map at the first resolution and a second three-dimensional representation of the high-definition map at the second resolution;

determining the pose information by comparing the first three-dimensional representation at the first resolution and the second three-dimensional representation at the first resolution; and

refining the pose information by comparing the first three-dimensional representation at the second resolution and the second three-dimensional representation at the second resolution.

19. A non-transitory computer-readable medium having instructions stored thereon, which, when executed by one or more processors, cause the one or more processors to perform operations comprising:

receiving a point cloud relating to a scene captured by a sensor equipped on a vehicle;

creating a first three-dimensional representation of the point cloud;

creating a second three-dimensional representation of a high-definition map relating to the scene;

determining pose information for the vehicle by comparing the first three-dimensional representation and the second three-dimensional representation; and

determining a position of the vehicle based on the pose information.

20. The computer-readable medium of claim 19, wherein the operations further comprise:

determining a first voxel value distribution of the first three-dimensional representation;

determining a second voxel value distribution of the second three-dimensional representation; and

comparing the first voxel value distribution to the second voxel value distribution.

Technical Field

The present application relates to a positioning system and method using LiDAR (Light Detection and Ranging), and more particularly, to a positioning system and method for estimating a vehicle position using a voxel matching method between LiDAR data and a high definition map.

Background

Autonomous driving and robotic navigation techniques rely heavily on precise positioning. For example, high-definition vehicle positioning is critical for autonomous driving. Conventional positioning methods, such as those using a Global Positioning System (GPS) or an Inertial Measurement Unit (IMU), cannot provide the positioning accuracy required for autonomous driving. For example, the accuracy of a GPS fix is typically on the order of meters. This is often not accurate enough for automatic driving control tasks such as keeping a vehicle within its lane or locating a pedestrian in front of the vehicle, both of which require a positioning error within centimeters. Furthermore, GPS signals are not always available. For example, when the vehicle is in a tunnel or near a high-rise building, the GPS signal is often weak or lost. In addition, measurement errors of the IMU typically accumulate over time or distance, resulting in positioning inaccuracies.

High-resolution maps may be obtained by aggregating images and information acquired by various sensors, probes, and other devices on a vehicle as the vehicle moves about. For example, a vehicle may be equipped with multiple integrated sensors, such as a LiDAR, a GPS receiver, one or more IMU sensors, and one or more cameras, to capture features of the road on which the vehicle is traveling or of surrounding objects. A high-definition map may be obtained by aggregating multiple frames of point clouds based on the corresponding three-dimensional (3-D) pose information (e.g., position and orientation) of the measuring vehicle.

Based on high definition maps, the vehicle can be located by matching and fusing features with real-time sensor measurements, such as camera images and GPS/IMU signals. However, the positioning result is often affected by, for example, the lighting conditions when the camera images are taken and noise in the sensor measurements. Another high-definition positioning method performs point-to-point matching between the real-time point cloud captured by the LiDAR and a high-definition map. However, this method is computationally expensive given the amount of data that needs to be processed and the complexity of the algorithm. Furthermore, it requires an initial position that is sufficiently close to the actual position of the vehicle. Another approach is to segment the point cloud, extract features from the segments, and match the features with features extracted from a high definition map. This method typically cannot be performed in real time and is sensitive to measurement noise.

Embodiments of the present application address the above-mentioned problems with improved systems and methods for locating vehicles.

Disclosure of Invention

Embodiments of the present application provide a system for locating a vehicle. The system includes a communication interface configured to receive a point cloud relating to a scene captured by a sensor equipped on the vehicle. The system further includes a memory configured to store the point cloud and a high definition map. The system also includes a processor. The processor is configured to create a first three-dimensional representation of the point cloud and create a second three-dimensional representation of the high definition map relating to the scene. The processor is further configured to determine pose information of the vehicle by comparing the first three-dimensional representation and the second three-dimensional representation. The processor determines a position of the vehicle based on the pose information.

Embodiments of the present application also provide a method for locating a vehicle. The method includes receiving a point cloud associated with a scene captured by a sensor equipped on the vehicle. The method further includes creating, by a processor, a first three-dimensional representation of the point cloud and creating, by the processor, a second three-dimensional representation of a high-definition map related to the scene. The method also includes determining, by the processor, pose information for the vehicle by comparing the first three-dimensional representation and the second three-dimensional representation, and determining a position of the vehicle based on the pose information.

Embodiments of the present application further provide a non-transitory computer-readable medium having instructions stored thereon, which, when executed by one or more processors, cause the one or more processors to perform operations. The operations include receiving a point cloud associated with a scene captured by a sensor equipped on a vehicle. The operations further include creating a first three-dimensional representation of the point cloud and creating a second three-dimensional representation of a high-definition map relating to the scene. The operations further include determining pose information for the vehicle by comparing the first three-dimensional representation and the second three-dimensional representation, and determining a position of the vehicle based on the pose information.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

Drawings

FIG. 1 shows a schematic diagram of an exemplary vehicle having sensors according to an embodiment of the present application.

FIG. 2 illustrates a block diagram of an exemplary localization server for localizing a vehicle based on real-time point cloud data according to an embodiment of the present application.

FIG. 3 illustrates an exemplary three-dimensional representation of point cloud data according to an embodiment of the present application.

FIG. 4 sets forth a flow chart illustrating an exemplary method for locating a vehicle according to embodiments of the present application.

FIG. 5 shows a flow diagram of an exemplary voxel matching method according to an embodiment of the application.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.

FIG. 1 shows a schematic diagram of an exemplary vehicle 100 having at least two sensors according to an embodiment of the present application. Consistent with some embodiments, vehicle 100 may be a survey vehicle configured to acquire data for constructing high-resolution maps or three-dimensional (3-D) city modeling. It is contemplated that vehicle 100 may be an electric vehicle, a fuel cell vehicle, a hybrid vehicle, or a conventional internal combustion engine vehicle. The vehicle 100 may have a body 110 and at least one wheel 120. The body 110 may be any body type, such as a sports vehicle, a sports car, a sedan, a pick-up truck, a station wagon, a Sport Utility Vehicle (SUV), a minivan, or a retrofit vehicle. In some embodiments, as shown in FIG. 1, the vehicle 100 may include a pair of front wheels and a pair of rear wheels. However, it is contemplated that the vehicle 100 may have fewer wheels or equivalent structures that enable the vehicle 100 to move around. The vehicle 100 may be configured as all-wheel drive (AWD), front-wheel drive (FWD), or rear-wheel drive (RWD). In some embodiments, the vehicle 100 may be configured to be operated by an operator occupying the vehicle, remotely controlled, and/or autonomously operated.

As shown in FIG. 1, the vehicle 100 may be equipped with a LiDAR 140 mounted to the body 110 by a mounting structure 130. The mounting structure 130 may be an electromechanical device mounted or otherwise attached to the body 110 of the vehicle 100. In some embodiments, the mounting structure 130 may use screws, adhesives, or other mounting mechanisms. Vehicle 100 may be additionally equipped with a GPS/IMU 150 inside or outside the body 110 using any suitable mounting mechanism. It is contemplated that the manner in which the LiDAR 140 or GPS/IMU 150 is equipped on the vehicle 100 is not limited by the example shown in FIG. 1 and may be modified to achieve the desired sensing performance.

In some embodiments, the LiDAR 140 and GPS/IMU 150 may be configured to capture data as the vehicle 100 moves along a trajectory. Consistent with the present application, the LiDAR 140 may be configured to scan the surrounding environment and acquire point clouds. A LiDAR measures the distance to a target by illuminating the target with pulsed laser light and measuring the reflected pulses with a sensor. Differences in laser return times and wavelengths can then be used to generate a digital 3-D representation of the target. The light used for LiDAR scanning may be ultraviolet, visible, or near infrared. Because a narrow laser beam can map physical features with very high resolution, LiDAR scanners are particularly well suited for high-resolution map measurements. In some embodiments, a LiDAR scanner may capture a point cloud. As the vehicle 100 moves along the trajectory, the LiDAR 140 may acquire a series of point clouds at multiple points in time (each point cloud is referred to as a point cloud frame acquired at one point in time).

As shown in FIG. 1, the vehicle 100 may be additionally equipped with a GPS/IMU 150 in a navigation unit, which may include sensors such as a GPS receiver and one or more IMU sensors for locating the vehicle 100. GPS is a global navigation satellite system that provides geographic positioning and time information to GPS receivers. An IMU is an electronic device that uses various inertial sensors, such as accelerometers and gyroscopes, and sometimes magnetometers, to measure and provide the vehicle's specific force, angular rate, and sometimes the magnetic field around the vehicle. By combining the GPS receiver and the IMU sensors, the GPS/IMU 150 may provide real-time pose data of the vehicle 100 as the vehicle 100 travels, including the position and orientation (e.g., Euler angles) of the vehicle 100 at each point in time.

Consistent with the present application, the vehicle 100 may include a local controller 160 within the body 110 of the vehicle 100 that communicates with a map server 170 and a location server 180. In some embodiments, each of the map server 170 and the location server 180 may be a local physical server, a cloud server (as shown in FIG. 1), a virtual server, a distributed server, or any other suitable computing device. The map server 170 and the location server 180 may communicate with the LiDAR 140, the GPS/IMU 150, and/or other components of the vehicle 100 via a network, such as a Wireless Local Area Network (WLAN), a Wide Area Network (WAN), a radio network, a cellular network, a satellite communication network, and/or a local or short-range wireless network (e.g., Bluetooth™). For example, the map server 170 and the location server 180 may receive sensor measurements from the LiDAR 140 and GPS/IMU 150. The map server 170 and the location server 180 may also communicate with each other. For example, the location server 180 may receive a high definition map or portions thereof from the map server 170.

Consistent with the present application, the controller 160 and the location server 180 may, separately or jointly, estimate the pose information of the vehicle 100 based on point clouds captured by the LiDAR 140 and pose data captured by the GPS/IMU 150. In some embodiments, the controller 160 or the location server 180 may retrieve a high definition map from the map server 170 based on initial location information provided by the GPS/IMU 150 and create a 3-D representation of the high definition map. The controller 160 or the location server 180 may also receive point clouds acquired by the LiDAR 140 and create a 3-D representation of the point clouds. The 3-D representations of the point cloud and of the high definition map may be created by "voxelizing" the data. The controller 160 or the location server 180 may then perform a voxel matching method on the 3-D representations to optimize the pose information of the vehicle 100. The vehicle 100 may be positioned based on the optimized pose information. In some embodiments, the point cloud and high definition map may be voxelized at multiple resolutions, with the voxel matching method performed first on the 3-D representations at the lower resolution and the obtained pose information used as the initial guess when performing the method on the 3-D representations at the higher resolution. The disclosed systems and methods provide improved accuracy and reduced computational cost.

For example, FIG. 2 shows a block diagram of an exemplary localization server 180 for localizing a vehicle based on real-time point cloud data according to an embodiment of the present application. Consistent with the present application, the localization server 180 may use various types of data for vehicle pose estimation. As the vehicle 100 moves along the trajectory, various types of data regarding the scene surrounding the vehicle 100 may be captured by the LiDAR 140 and GPS/IMU 150 onboard the vehicle 100. The data may include a point cloud 201 captured by the LiDAR 140, which consists of multiple point cloud frames at different points in time. The data may also include initial pose data 203 of the vehicle 100 acquired by the GPS/IMU 150. In some embodiments, the point cloud 201 may be calibrated by transforming the local LiDAR data from a local coordinate system to a global coordinate system (e.g., longitude/latitude coordinates) based on the initial pose data 203 from the GPS receiver and IMU sensors.

The data provided to the positioning server 180 for vehicle pose estimation also includes a high definition map 205 provided by the map server 170. High definition map 205 may be constructed by aggregating images and information acquired by various sensors, detectors, and other devices on a measuring vehicle to capture features of a measured area. In some embodiments, high-definition map 205 may be a portion of a larger high-definition map, e.g., a partial high-definition map. Map server 170 may also receive pose data 203 from the GPS/IMU 150 and retrieve the high definition map 205 of the scene corresponding to the pose data 203.

In some embodiments, as shown in FIG. 2, the location server 180 may include a communication interface 202, a processor 204, a memory 206, and a storage 208. In some embodiments, the location server 180 may be a stand-alone device with its different modules integrated in a single device, such as an Integrated Circuit (IC) chip (implemented as an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA)), or a device with dedicated functionality. In some embodiments, one or more components of the location server 180 may be located inside the vehicle 100, or alternatively in a mobile device, in the cloud, or at another remote location. The components of the location server 180 may be in an integrated device or distributed in different locations but communicating with each other through a network (not shown). For example, the processor 204 may be an onboard processor of the vehicle 100, a processor within a mobile device, or a cloud processor, or any combination thereof.

Communication interface 202 may send data to and receive data from components such as the LiDAR 140, the GPS/IMU 150, and the map server 170 via, for example, a communication cable, a Wireless Local Area Network (WLAN), a Wide Area Network (WAN) such as a radio wave network, a cellular network, and/or a local wireless network (e.g., Bluetooth™ or WiFi™). In some embodiments, communication interface 202 may be an Integrated Services Digital Network (ISDN) card, a cable modem, a satellite modem, or a modem to provide a data communication connection. As another example, communication interface 202 may be a Local Area Network (LAN) adapter to provide a data communication connection to a compatible LAN. Wireless links may also be implemented by the communication interface 202. In such implementations, communication interface 202 may send and receive electrical, electromagnetic, or optical signals that carry digital data streams representing various types of information.

Consistent with some embodiments, the communication interface 202 may receive data captured by the LiDAR 140 and GPS/IMU 150, including the point cloud 201 and the initial pose data 203. The communication interface 202 may also receive the high definition map 205 from the map server 170. The received data may be provided to memory 206 and/or storage 208 for storage or provided to processor 204 for processing. The communication interface 202 may also receive optimized pose information generated by the processor 204 and provide the pose information to any local components in the vehicle 100 or any remote device over a communication link.

The processor 204 may comprise any suitable type of general or special purpose microprocessor, digital signal processor, or microcontroller. The processor 204 may be configured as a separate processor module dedicated to estimating the pose of the vehicle. Alternatively, the processor 204 may be configured as a common processor module that may also perform other functions unrelated to vehicle pose estimation.

As shown in FIG. 2, the processor 204 may include a plurality of modules/units, such as a 3-D representation generation unit 210, a voxel matching unit 212, a localization unit 214, a pose information estimation unit 216, and so on. These modules/units (and any corresponding sub-modules or sub-units) may be hardware units (e.g., portions of an integrated circuit) of the processor 204 designed for use with other components or to execute at least a portion of a program. The program may be stored on a computer-readable medium and, when executed by the processor 204, may perform one or more functions or operations. Although FIG. 2 shows all of the units 210-216 within one processor 204, it is contemplated that the units may be distributed among multiple processors located near or remote from each other.

The 3-D representation generation unit 210 may be configured to create a 3-D representation of the point cloud 201. In some embodiments, the 3-D representation generation unit 210 may divide the 3-D space of the point cloud 201 into a plurality of voxels. For example, FIG. 3 shows an exemplary 3-D representation 310 of the point cloud 201 according to an embodiment of the present application. As shown in FIG. 3, the 3-D representation 310 includes multiple voxels 320. Each voxel is characterized by a set of voxel values, including, for example, a local 3-D surface feature distribution. In some embodiments, the voxel values further include an average intensity value and a three-dimensional distribution of intensity along the x, y, and z directions. For example, the 3-D distribution of intensity values may be a Gaussian/normal distribution.

Thus, voxel 320 may include 3-D spatial information as well as a point cloud intensity distribution. Matching a 3-D representation containing these voxels may improve localization accuracy compared to methods that directly match point clouds.
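As an illustration of the voxelization step described above, the following is a minimal sketch (not the patent's implementation), assuming the point cloud is an N×4 NumPy array of (x, y, z, intensity) rows and a freely chosen cubic voxel size:

```python
import numpy as np
from collections import defaultdict

def voxelize(points, voxel_size=1.0):
    """Group an (N, 4) array of (x, y, z, intensity) points into voxels.

    Returns a dict mapping an integer (ix, iy, iz) voxel index to the
    (M, 4) array of points falling inside that voxel.
    """
    grid = defaultdict(list)
    indices = np.floor(points[:, :3] / voxel_size).astype(int)
    for idx, point in zip(map(tuple, indices), points):
        grid[idx].append(point)
    return {idx: np.vstack(pts) for idx, pts in grid.items()}

# Example: a small synthetic cloud voxelized at a 1 m edge length.
cloud = np.random.rand(1000, 4) * np.array([10.0, 10.0, 3.0, 255.0])
voxels = voxelize(cloud, voxel_size=1.0)
print(len(voxels), "occupied voxels")
```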

In some embodiments, the 3-D representation generation unit 210 may also be configured to create a second 3-D representation, i.e., a 3-D representation of at least a portion of the high definition map 205. High definition map 205 may contain point cloud data acquired during a survey. Thus, the same method disclosed above in connection with FIG. 3 may be used to create a 3-D representation of the high definition map 205.

Referring back to FIG. 2, the voxel matching unit 212 may be configured to perform voxel matching operations between the 3-D representation of the point cloud 201 and the corresponding 3-D representation of the high definition map 205. In some embodiments, voxel matching may find the optimized pose information of the vehicle 100 corresponding to the smallest difference between the 3-D representation of the point cloud 201 and the corresponding 3-D representation of the high definition map 205. For example, the difference may be computed as an aggregate of the differences between corresponding voxels of the two representations. In some embodiments, the difference may equivalently be characterized by the similarity between the voxels of the two representations. The voxel matching unit 212 may then solve an optimization problem to maximize this similarity. An iterative method, such as Newton's method, may be used to solve the optimization problem.

In some embodiments, when solving the optimization problem, the initial pose information T0 may be used as a starting point (also referred to as an initial guess) for the first iteration. For example, T0 = (x0, y0, z0, roll0, pitch0, yaw0), where x0, y0, and z0 are the three-dimensional coordinates of the vehicle position, and roll0, pitch0, and yaw0 represent the vehicle orientation. The 3-D representation may initially be created based on the initial pose information T0.
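The pose vector can be turned into a rigid transform and applied to the captured point cloud before voxelization. A brief sketch follows, assuming the common Z-Y-X (yaw-pitch-roll) Euler convention, which the text above does not specify:

```python
import numpy as np

def pose_to_transform(x, y, z, roll, pitch, yaw):
    """Build a 4x4 homogeneous transform from a 6-DoF pose (Z-Y-X Euler order assumed)."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx
    T[:3, 3] = [x, y, z]
    return T

def transform_points(points_xyz, T):
    """Apply a 4x4 transform to an (N, 3) array of points."""
    homogeneous = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
    return (homogeneous @ T.T)[:, :3]
```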

In some embodiments, T0 may be estimated by the pose information estimation unit 216 using the pose information obtained for the previous point cloud frame. In the optimization process, a search space (X, Y, Z, ROLL, PITCH, YAW) can be predefined, and the pose information T is iteratively refined within that space. After the optimization cost function meets a predetermined stopping criterion, an optimized pose T may be obtained. For example, the stopping criteria may include a matching threshold Thm and/or a computation time limit Tht. If the similarity L exceeds Thm (i.e., the similarity between corresponding voxels of the point cloud data 201 and high definition map 205 is greater than Thm), or the computation time exceeds Tht, the iterative process may stop and the last updated pose information may be used as the optimized point cloud pose information T associated with the current point cloud frame.

In some embodiments, after the point cloud pose information T is obtained, the localization unit 214 may merge the point cloud pose information T obtained by the voxel matching unit 212 with the initial pose data 203 provided by the GPS/IMU 150. For example, an Unscented Kalman Filter (UKF) method may be used to merge the point cloud pose information T and the initial pose data 203 to generate filtered pose information T'. The UKF method uses a deterministic sampling technique known as the Unscented Transform (UT) to select a minimal set of sample points (called sigma points) around the posterior mean. The sigma points are then propagated through the nonlinear functions, from which new posterior mean and covariance estimates are formed. The resulting UKF can more accurately estimate the true mean and covariance. Compared with the Extended Kalman Filter (EKF) used for pose estimation by some existing navigation systems, the UKF has the advantage of not requiring the calculation of Jacobians and is therefore, in theory, better at handling nonlinearity and a priori state uncertainty. Consistent with the present application, the positioning unit 214 may use the filtered pose information T' to position the vehicle 100.
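One way to realize the UKF fusion described above is with the third-party filterpy library; the sketch below treats the 6-DoF pose as the state, uses identity process and measurement models, and all noise magnitudes are illustrative assumptions rather than values from this application:

```python
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

def fx(x, dt):
    # Process model: the pose is assumed to change slowly between frames (random walk).
    return x

def hx(x):
    # Measurement model: each pose measurement observes the full 6-DoF state.
    return x

sigmas = MerweScaledSigmaPoints(n=6, alpha=0.1, beta=2.0, kappa=0.0)
ukf = UnscentedKalmanFilter(dim_x=6, dim_z=6, dt=0.1, fx=fx, hx=hx, points=sigmas)
ukf.x = np.zeros(6)        # e.g., the previous filtered pose
ukf.P *= 0.5               # illustrative initial covariance
ukf.Q = np.eye(6) * 0.01   # illustrative process noise

# Fuse the matched point-cloud pose T with the GPS/IMU pose as two measurement updates.
pose_from_matching = np.array([10.2, 5.1, 0.3, 0.0, 0.0, 0.15])
pose_from_gps_imu = np.array([10.5, 5.0, 0.2, 0.0, 0.0, 0.14])
ukf.predict()
ukf.update(pose_from_matching, R=np.eye(6) * 0.05)  # lidar matching: lower noise assumed
ukf.update(pose_from_gps_imu, R=np.eye(6) * 0.50)   # GPS/IMU: higher noise assumed
print("fused pose:", ukf.x)
```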

In some embodiments, the pose information estimation unit 216 may estimate the initial pose information T0 of the next point cloud frame based on the filtered pose information T' of the current point cloud frame determined by the positioning unit 214. In some embodiments, the pose information estimation unit 216 may estimate T0 based on the pose change between point cloud frames. For example, the pose information estimation unit 216 may calculate the pose change between adjacent point cloud frames based on the 3-D coordinates of the points in each point cloud frame and their associated attributes (e.g., reflected laser intensity). As another example, the pose information estimation unit 216 may calculate the pose change between the pose data 203 acquired corresponding to those point cloud frames.

The estimated initial pose information T0 may be provided to the voxel matching unit 212 for estimating the optimized pose information T of the next point cloud frame. Because the initial pose information T0 is estimated based on the optimized pose T of the current point cloud frame, it is accurate enough to serve as an initial guess and thus helps the optimization process converge quickly.
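For example, a simple constant-velocity extrapolation from the two most recent poses can serve as the estimate of T0; the application leaves the exact estimation method open, so this is only one plausible choice:

```python
import numpy as np

def estimate_initial_pose(prev_pose, curr_pose):
    """Extrapolate the next (x, y, z, roll, pitch, yaw) pose assuming a constant per-frame change."""
    prev_pose, curr_pose = np.asarray(prev_pose, float), np.asarray(curr_pose, float)
    next_guess = curr_pose + (curr_pose - prev_pose)
    # Wrap the angles so the optimizer starts from a sane point.
    next_guess[3:] = (next_guess[3:] + np.pi) % (2 * np.pi) - np.pi
    return next_guess
```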

In some embodiments consistent with the present application, the point cloud 201 and high definition map 205 may be voxelized at different resolutions. For example, the 3-D representation generation unit 210 may generate multiple 3-D representations of the point cloud 201, with voxels 320 of different sizes in the respective 3-D representations. When multi-resolution voxelization is implemented by the 3-D representation generation unit 210, the voxel matching unit 212 matches the 3-D representation of the point cloud with the 3-D representation of the high-definition map at the same resolution.

In some embodiments, the 3-D representation generation unit 210 may generate the first 3-D representations of the point cloud 201 and the high definition map 205 at a first resolution R1. R1 may be a relatively low resolution, i.e., a 3-D representation with a relatively large voxel size. The voxel matching unit 212 and the localization unit 214 may first localize the vehicle 100 based on the 3-D representations at the low resolution. The pose information obtained at resolution R1 may be a rough estimate of the vehicle position. The 3-D representation generation unit 210 may then generate 3-D representations of the point cloud 201 and high definition map 205 at a second resolution R2 higher than R1. When locating the vehicle using the 3-D representations at resolution R2, the pose information estimated at resolution R1 is used as the initial guess.

The accuracy of the estimated position of the vehicle 100 may also be improved as the estimation process is repeated at higher and higher resolutions. Because the pose information estimated at lower resolutions can provide a sufficiently accurate initial guess, the multi-resolution voxelization method can save a large number of iterations required before the process converges, thereby increasing the computation speed.
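The coarse-to-fine schedule reads naturally as a loop over decreasing voxel sizes in which each level's optimized pose seeds the next level. A sketch follows; voxelize_at and match_voxels are hypothetical callables standing in for the 3-D representation generation and voxel matching units:

```python
def coarse_to_fine_localize(point_cloud, hd_map, initial_pose,
                            voxelize_at, match_voxels,
                            voxel_sizes=(4.0, 2.0, 1.0)):
    """Run voxel matching from the coarsest to the finest resolution.

    voxel_sizes lists voxel edge lengths in metres, largest (lowest resolution) first.
    voxelize_at(data, size) and match_voxels(a, b, initial_guess) are hypothetical
    helpers wrapping the voxelization and matching steps described above.
    """
    pose = initial_pose
    for size in voxel_sizes:
        cloud_rep = voxelize_at(point_cloud, size)
        map_rep = voxelize_at(hd_map, size)
        # The pose found at the coarser level is the initial guess at this level.
        pose = match_voxels(cloud_rep, map_rep, initial_guess=pose)
    return pose
```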

Memory 206 and storage 208 may comprise any suitable type of storage device provided to store any type of information that processor 204 may need to process. The memory 206 and storage 208 may be volatile or non-volatile, magnetic, semiconductor-based, tape-based, optical, removable, non-removable, or other types of storage devices or tangible (i.e., non-transitory) computer-readable media, including but not limited to ROM, flash memory, dynamic RAM, and static RAM. The memory 206 and/or storage 208 may be configured to store one or more computer programs that may be executed by the processor 204 to perform the vehicle pose estimation functions disclosed herein. For example, the memory 206 and/or storage 208 may be configured to store programs that may be executed by the processor 204 to control the LiDAR 140 and/or GPS/IMU 150 to capture various types of data as the vehicle 100 moves along the trajectory, and to process the captured data to estimate pose information for the vehicle 100.

The memory 206 and/or storage 208 may be further configured to store information and data used by the processor 204. For example, the memory 206 and/or storage 208 may be configured to store the various types of data captured by the LiDAR 140 and GPS/IMU 150, as well as the estimated pose information. The various types of data may be stored permanently, removed periodically, or discarded immediately after each data frame is processed.

FIG. 4 illustrates a flow chart of an exemplary method 400 for locating the vehicle 100 according to an embodiment of the present application. For example, the method 400 may be implemented by a vehicle pose estimation system for the vehicle 100 that includes, among other things, the controller 160, the map server 170, the location server 180, and the LiDAR 140 and GPS/IMU 150. However, the method 400 is not limited to this exemplary embodiment.

The method 400 may include steps S402-S416 as described below. It should be understood that some steps may be optional to perform the disclosure provided herein. Further, some steps may be performed simultaneously, or in a different order than shown in fig. 4.

At step S402, a point cloud frame PC1 relating to a scene (e.g., point cloud 201) may be captured at a time point t1. For example, point cloud frame PC1 may be captured by the LiDAR 140. In some embodiments, initial pose data IP1 of the vehicle 100 (e.g., initial pose data 203) may also be captured at time point t1. For example, the initial pose data IP1 may be captured by the GPS/IMU 150. The initial pose data IP1 corresponds to point cloud frame PC1 because they are captured at the same point in time. In some embodiments, the captured initial pose data IP1 and point cloud frame PC1 may be sent to and received by the location server 180.

In step S404, a high definition map (e.g., high definition map 205) may be obtained from the map server 170 based on the initial pose data IP1. The initial pose data IP1 indicates the rough location of the scene corresponding to point cloud frame PC1, so that the map server 170 may select a relevant portion of a previously constructed high definition map, e.g., the portion of the high definition map comprising the scene. In some embodiments, the high-definition map may also be received by the positioning server 180.
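Selecting the relevant portion of the high definition map around the rough GPS/IMU position can be as simple as a radius crop; a minimal sketch, assuming the map is stored as an (N, 4) point array in the same global frame as IP1 and that a 100 m radius is adequate:

```python
import numpy as np

def crop_map(hd_map_points, center_xyz, radius=100.0):
    """Return the map points within `radius` metres of the rough vehicle position."""
    distances = np.linalg.norm(hd_map_points[:, :3] - np.asarray(center_xyz, float), axis=1)
    return hd_map_points[distances <= radius]
```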

In steps S406-S410, the positioning server 180 may optimize the pose information T of the vehicle 100 using a voxel matching method. To optimize the pose information T, the positioning server 180 may search the predefined space (X, Y, Z, ROLL, PITCH, YAW) to match the point cloud frame PC1 obtained in step S402 with the point cloud data of the high-definition map obtained in step S404. Consistent with the present application, the location server 180 may generate 3-D representations of point cloud frame PC1 and of the corresponding point cloud data of the high-definition map, and determine an optimized estimated pose that reduces the difference between the two 3-D representations.

In some embodiments, the location server 180 may perform voxel matching at multiple resolutions, e.g., R1 and R2, where resolution R2 is higher than resolution R1. When using a multi-resolution approach, voxel matching may be performed between a first 3-D representation (e.g., a 3-D representation of a point cloud frame) and a second 3-D representation (e.g., a 3-D representation of the high definition map) at the same resolution. In some embodiments, voxel matching may be performed from a lower resolution to a higher resolution. For example, the pose information optimization operation in S408 may be performed at the lowest resolution first. Then, in step S410, it is determined whether all resolutions have been processed. If not, the process of method 400 returns to step S406 to select the next lowest resolution for voxel matching (in the first pass, the "next" lowest resolution is the lowest resolution, e.g., R1). In step S408, the localization server 180 may perform voxel matching on the 3-D representations of the high definition map and the point cloud frame at the selected resolution.

FIG. 5 shows a flow diagram of an exemplary voxel matching method 500 according to an embodiment of the application. The method 500 may be performed by the location server 180 to perform step S408 of the method 400. However, step S408 is not limited to an embodiment of method 500. The method 500 may include steps S502-S516 as described below. It should be understood that some steps may be optional to perform the disclosure provided herein. Further, some steps may be performed simultaneously, or in a different order than shown in fig. 5.

In step S502, a pose T0 may be set as the initial guess of the pose information T. For example, T0 = (x0, y0, z0, roll0, pitch0, yaw0), where x0, y0, and z0 are the three-dimensional coordinates of the vehicle position, and roll0, pitch0, and yaw0 represent the vehicle orientation. In some embodiments, the pose information estimation unit 216 may estimate T0 using the pose information obtained for the previous point cloud frame PC0 (e.g., corresponding to time point t0). When using the multi-resolution method, the optimized pose information at the lower resolution can be used as the initial pose T0 to increase the computation speed.

In step S504, a matching threshold Thm and a time threshold Tht may be set. In some embodiments, both the matching threshold Thm and the time threshold Tht may be part of the stopping criteria. Thm is a threshold on the cost function L used to optimize the pose information T, and Tht is a limit on the computation time. If the value of the cost function L exceeds Thm, or the computation time exceeds Tht, the optimization process may be stopped (see step S512).

In step S506, the positioning server 180 may generate, based on the initial pose information T0, a first 3-D representation of point cloud frame PC1 and a second 3-D representation of the high definition map at the selected resolution. In some embodiments, to generate the 3-D representations, the 3-D representation generation unit 210 may divide the 3-D space of the point cloud data into a plurality of voxels 320 as shown in FIG. 3. Each voxel is characterized by a set of voxel values including, for example, a local three-dimensional surface feature distribution, an average intensity value, and its three-dimensional distribution along the x, y, and z directions.

In step S508, a voxel value distribution of the 3-D representation at the selected resolution may be determined. In some embodiments, the 3-D distribution of voxel values (e.g., intensity values) may be a Gaussian/normal distribution, represented using equations (1)-(3), of which equation (2) defines the matrix of centered points:

M = [p_1 - μ ... p_n - μ]    (2)

where p_i is the vector of the 3-D coordinates of point i within its voxel and the intensity value at point i, μ is the mean of the Gaussian distribution N(μ, Σ) given by equation (1), and Σ is the covariance of the Gaussian distribution given by equation (3).
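Equations (1) and (3) are not reproduced in this transcription, so the sketch below stands in with the usual sample mean and covariance of one voxel's points, assuming the points are stacked as an (n, 4) array of (x, y, z, intensity) values and a 1/n normalization (the patent's exact normalization may differ):

```python
import numpy as np

def voxel_gaussian_stats(voxel_points):
    """Sample mean and covariance of the points in one voxel (stand-in for equations (1)-(3))."""
    p = np.asarray(voxel_points, float)   # (n, 4): x, y, z, intensity per point
    mu = p.mean(axis=0)                   # stand-in for equation (1): sample mean
    M = (p - mu).T                        # equation (2): centered points as columns
    sigma = (M @ M.T) / len(p)            # stand-in for equation (3): covariance, 1/n assumed
    return mu, sigma
```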

In step S510, the localization server 180 may calculate a similarity between the voxel value distributions of the two 3-D representations. For example, the similarity L may be determined using equation (4), which compares the mean vectors μ_i and covariance matrices Σ_i of the Gaussian distributions of corresponding voxels. The pose information T is optimized by maximizing this similarity.
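Equation (4) is likewise not reproduced, so the following uses one plausible Gaussian-overlap style score between corresponding voxels purely to illustrate maximizing similarity over matched voxels; it is an assumption, not the patent's formula:

```python
import numpy as np

def voxel_similarity(mu1, sigma1, mu2, sigma2):
    """Illustrative per-voxel similarity: a Gaussian kernel on the mean difference,
    scaled by the pooled covariance (an assumed form, not equation (4))."""
    diff = mu1 - mu2
    pooled = sigma1 + sigma2 + np.eye(len(diff)) * 1e-6  # regularize for invertibility
    return float(np.exp(-0.5 * diff @ np.linalg.solve(pooled, diff)))

def total_similarity(cloud_voxels, map_voxels):
    """Sum per-voxel similarities over voxel indices present in both representations."""
    shared = cloud_voxels.keys() & map_voxels.keys()
    return sum(voxel_similarity(*cloud_voxels[k], *map_voxels[k]) for k in shared)
```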

In some embodiments, an iterative method, such as Newton's method, may be used to solve the optimization problem. During each iteration, the value of the cost function (e.g., L) and the computation time are tracked. In step S512, if the value of the cost function L exceeds Thm, or the computation time exceeds Tht (S512: YES), the stopping criterion is deemed to be satisfied and the iterative process is deemed to have converged. The voxel matching unit 212 then stops the optimization and provides the latest pose information T as the optimized pose to the positioning unit 214.

On the other hand, if neither L > Thm nor computation time > Tht is satisfied (S512: NO), the method 500 proceeds to S514 to further update the pose T. In some embodiments, the pose T may be refined to further reduce the difference between the 3-D representations of the point cloud frame and the high definition map. With the updated pose T, the method 500 returns to step S506 for another iteration. For example, in steps S506 and S508, the localization server 180 may generate the 3-D representations of the point cloud frame and the high definition map and determine the voxel value distributions based on the latest updated pose T.
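The iterate-score-update cycle of steps S506-S514 with the Thm/Tht stopping criteria can be sketched as below, with update_pose as a hypothetical stand-in for whichever update rule (e.g., a Newton step) is used and Tht expressed in seconds:

```python
import time

def optimize_pose(initial_pose, score_fn, update_pose, thm=0.9, tht=0.05):
    """Refine the pose until the similarity exceeds Thm or the elapsed time exceeds Tht."""
    pose = initial_pose
    start = time.monotonic()
    score = score_fn(pose)
    while score <= thm and (time.monotonic() - start) <= tht:
        pose = update_pose(pose, score)  # hypothetical update step (e.g., one Newton iteration)
        score = score_fn(pose)
    return pose, score
```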

Referring back to FIG. 4, in step S410 it is determined whether all resolutions have been processed. If not (S410: NO), the method 400 returns to S406 to select the next lowest resolution for voxel matching. For example, after resolution R1 has been processed, resolution R2 becomes the next lowest resolution and may be selected in step S406. The first and second 3-D representations at resolution R2 may then be compared and matched in step S408 to determine the optimized pose T at the selected resolution.

If all resolutions have been processed (S410: YES), the method 400 proceeds to step S412, where the optimized pose information T of point cloud frame PC1 and the corresponding initial pose data IP1 are filtered and merged. For example, the positioning server 180 may filter and merge the point cloud pose information T and the initial pose data 203 using the UKF method to obtain merged pose information T'. In step S414, the vehicle 100 may be located in the high definition map based on the merged pose information T'.

In some embodiments, the merged pose information T' may additionally be used to provide estimated pose information for the next point cloud frame. For example, if the merged pose information T' obtained in step S412 corresponds to the point cloud frame PC1 captured at time point t1, the positioning server 180 can estimate the pose information of the point cloud frame PC2 captured at the next time point t2. The estimated pose information may be provided as the initial pose T0 for finding the optimized pose information T corresponding to point cloud frame PC2.

In some embodiments, the pose information estimation unit 216 may estimate the pose information based on the pose change between point cloud frame PC1 and point cloud frame PC2 captured at time points t1 and t2, respectively. For example, the pose information estimation unit 216 may calculate the pose change based on the 3-D coordinates of the points in each point cloud frame and their associated attributes (e.g., reflected laser intensity). For example, the pose change may be calculated based on the X, Y, and Z coordinates and the reflected intensity of each point using a four-dimensional (4-D) Normal Distribution Transform (NDT). The 4-D NDT converts the discrete set of 3-D points reconstructed from a single point cloud frame into a piecewise continuous and differentiable probability density defined in 3-D space. The probability density may consist of a set of easily computable normal distributions. This probability density distribution may be used to represent the point cloud pose information of the corresponding point cloud frame. As another example, the pose information estimation unit 216 may calculate the pose change between the initial pose data IP1 and IP2 captured at time points t1 and t2, respectively.

Another aspect of the present application relates to a non-transitory computer-readable medium having stored thereon instructions that, when executed, cause one or more processors to perform a method as described above. The computer readable medium may be a volatile or non-volatile, magnetic, semiconductor-based, tape-based, optical, removable, non-removable, or other type of computer readable medium or computer readable storage device. For example, a computer-readable medium as in the present application may be a storage device or a memory module having stored thereon computer instructions. In some embodiments, the computer readable medium may be a disk or flash drive having computer instructions stored thereon.

It will be apparent that various modifications and variations can be made in the system and related methods of the present application by those of ordinary skill in the art. Other embodiments will be apparent to those skilled in the art from consideration of the specification and practice of the system and associated method of the present application.

It is intended that the specification and examples be considered as exemplary only, with a true scope being indicated by the following claims and their equivalents.
