Identification method of plane semantic category and image data processing device

Document No.: 39277    Published: 2021-09-24

Reading note: This technology, "Identification method of plane semantic category and image data processing device" (一种平面语义类别的识别方法以及图像数据处理装置), was created by 马超群陈平方晓鑫 on 2020-01-23. Abstract: The application provides a method for identifying plane semantic categories and an image data processing device, which relate to the technical field of image processing and are used to accurately determine the semantic category of a plane. The method comprises: acquiring image data to be processed, wherein the image data to be processed comprises N pixel points; determining a semantic segmentation result of the image data to be processed, wherein the semantic segmentation result comprises a target plane category corresponding to at least part of the N pixel points; obtaining a first dense semantic map according to the semantic segmentation result, wherein the first dense semantic map comprises at least one target plane category corresponding to at least one first three-dimensional point in a first three-dimensional point cloud, and the at least one first three-dimensional point corresponds to at least one pixel point among the at least part of the pixel points; and performing plane semantic category identification according to the first dense semantic map to obtain the plane semantic categories of one or more planes included in the image data to be processed. The method can improve the accuracy of plane semantic recognition.

A method for identifying a plane semantic category, comprising:

acquiring to-be-processed image data, wherein the to-be-processed image data comprises N pixel points, and N is a positive integer;

determining a semantic segmentation result of the image data to be processed, wherein the semantic segmentation result comprises a target plane category corresponding to at least part of pixel points in the N pixel points;

obtaining a first dense semantic map according to the semantic segmentation result, wherein the first dense semantic map comprises at least one target plane category corresponding to at least one first three-dimensional point in a first three-dimensional point cloud, and the at least one first three-dimensional point corresponds to at least one pixel point in the at least part of pixel points;

and performing plane semantic category identification according to the first dense semantic map to obtain plane semantic categories of one or more planes included in the image data to be processed.

The method of claim 1, wherein the obtaining a first dense semantic map according to the semantic segmentation result comprises:

obtaining a second dense semantic map according to the semantic segmentation result and the depth image corresponding to the image data to be processed;

using the second dense semantic map as the first dense semantic map; or

updating a historical dense semantic map by using one or more second three-dimensional points in a second three-dimensional point cloud in the second dense semantic map, to obtain the first dense semantic map.
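A minimal sketch of how a labeled point cloud might be built from a segmentation result and a depth image, and then merged into a historical map. The pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the `ignore_label` value, the voxel-keyed dictionary, and the "newest label wins" merge rule are all assumptions made for illustration; the claim does not specify a camera model or a map data structure.

```python
import numpy as np

def backproject_labels(depth, labels, fx, fy, cx, cy, ignore_label=-1):
    """Lift labeled pixels with valid depth into a labeled 3D point cloud.

    depth  : (H, W) depths; 0 marks a missing measurement (assumption).
    labels : (H, W) integer target plane category per pixel.
    Returns (points, point_labels): (K, 3) float array and (K,) int array.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]              # pixel row (v) and column (u) indices
    valid = (depth > 0) & (labels != ignore_label)
    z = depth[valid]
    x = (u[valid] - cx) * z / fx           # pinhole camera model (assumption)
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1), labels[valid]

def update_map(history, points, point_labels, voxel=0.5):
    """Merge labeled points into a voxel-keyed historical map (a dict)."""
    keys = np.floor(points / voxel).astype(int).tolist()
    for key, label in zip(map(tuple, keys), point_labels.tolist()):
        history[key] = label               # newest observation overwrites the old one
    return history

depth = np.array([[1.0, 0.0], [2.0, 2.0]])
labels = np.array([[0, 0], [1, -1]])
points, point_labels = backproject_labels(depth, labels, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
dense_map = update_map({}, points, point_labels, voxel=0.5)
```

Passing an empty dictionary corresponds to using the second dense semantic map directly as the first one; passing a previously built dictionary corresponds to updating a historical map.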

The method according to claim 1 or 2, wherein after determining the result of semantic segmentation of the image data to be processed, the method further comprises:

performing an optimization operation on the semantic segmentation result according to the image data to be processed and the depth information included in the depth image corresponding to the image data to be processed, wherein the optimization operation is used to correct noise and erroneous portions in the semantic segmentation result.

The method according to any one of claims 1-3, wherein the determining a semantic segmentation result of the image data to be processed comprises:

determining one or more plane categories corresponding to any pixel point in the at least part of pixel points, and the probability of each plane category in the one or more plane categories;

and taking the plane category with the highest probability among the one or more plane categories corresponding to the any pixel point as the target plane category corresponding to the any pixel point, to obtain the semantic segmentation result of the image data to be processed.
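The selection step above amounts to a per-pixel argmax over category probabilities. The sketch below assumes the probabilities arrive as an (H, W, C) array; the `min_probability` cutoff and the `ignore_label` value are hypothetical additions, included only to show one way that just part of the N pixel points might receive a target plane category.

```python
import numpy as np

def segment(probabilities, min_probability=0.0, ignore_label=-1):
    """Pick the most probable plane category for every pixel.

    probabilities : (H, W, C) array of per-pixel category probabilities.
    Returns an (H, W) integer array of target plane categories.
    """
    target = probabilities.argmax(axis=-1)
    best = probabilities.max(axis=-1)
    # Hypothetical cutoff: pixels with a weak best score get no category.
    target[best < min_probability] = ignore_label
    return target

probs = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]],
    [[0.2, 0.2, 0.6], [0.4, 0.35, 0.25]],
])
result = segment(probs, min_probability=0.5)
```

Here the last pixel has no category above the cutoff, so it is left unlabeled and would not contribute a point to the dense semantic map.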

The method of claim 4, wherein the determining the probability of each plane category in the one or more plane categories corresponding to any pixel point in the at least part of pixel points comprises:

performing semantic segmentation on the image data to be processed by using a neural network, to obtain the probability of each plane category in the one or more plane categories corresponding to any pixel point in the at least part of pixel points.

The method according to any one of claims 1 to 5, wherein the performing plane semantic category recognition according to the first dense semantic map to obtain plane semantic categories of one or more planes included in the image data to be processed includes:

determining a plane equation for each of the one or more planes from the image data to be processed;

performing the following steps on any plane in the one or more planes to obtain a plane semantic category of the any plane:

determining one or more target plane categories corresponding to the any plane and the confidence of the one or more target plane categories according to the plane equation of the any plane and the first dense semantic map;

and selecting the target plane category with the highest confidence from the one or more target plane categories as the plane semantic category of the any plane.
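The claim does not say how a plane equation is determined. One common possibility, shown here purely as an assumption, is a least-squares fit to three-dimensional points already attributed to the plane (for example, points recovered from the depth image). The function name `fit_plane` and the representation `normal . p + d = 0` are illustrative choices.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through (K, 3) points.

    Returns (normal, d) such that normal . p + d = 0 for points p on the
    plane, with normal of unit length.
    """
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    d = -normal.dot(centroid)
    return normal, d

pts = np.array([[0, 0, 1.0], [1, 0, 1.0], [0, 1, 1.0], [1, 1, 1.0]])
normal, d = fit_plane(pts)
```

The sign of the normal returned by the decomposition is arbitrary, so a real system would need a separate convention (such as facing the camera) before comparing plane orientations.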

The method of claim 6, wherein the orientation of the one or more target plane categories corresponding to the any plane is consistent with the orientation of the any plane.

The method according to claim 6 or 7, wherein the determining, according to the plane equation of the any plane and the first dense semantic map, one or more target plane categories corresponding to the any plane and the confidence of the one or more target plane categories comprises:

determining M first three-dimensional points from the first dense semantic map according to the plane equation of the any plane, wherein the distance between each of the M first three-dimensional points and the any plane is smaller than a third threshold value, and M is a positive integer;

determining one or more target plane categories corresponding to the M first three-dimensional points as the one or more target plane categories corresponding to the any plane, wherein the orientation of the one or more target plane categories is consistent with the orientation of the any plane;

and counting, among the M first three-dimensional points, the proportion of three-dimensional points corresponding to each target plane category in the one or more target plane categories, to obtain the confidence of the one or more target plane categories.
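The three steps above can be sketched as a point-to-plane distance filter followed by a label histogram. The sketch assumes the plane is given as a unit normal and offset with `normal . p + d = 0`, and that the dense semantic map is available as an array of points with one integer category per point; `max_distance` stands in for the third threshold. The orientation-consistency check is left out for brevity.

```python
from collections import Counter

import numpy as np

def plane_category_confidence(points, point_labels, normal, d, max_distance):
    """Confidence of each category among map points lying near a plane.

    Keeps the M points whose distance to the plane is below max_distance,
    then returns {category: share of those M points}.
    """
    distance = np.abs(points @ normal + d)   # valid because |normal| == 1
    near = distance < max_distance
    m = int(near.sum())
    if m == 0:
        return {}
    counts = Counter(point_labels[near].tolist())
    return {label: n / m for label, n in counts.items()}

points = np.array([
    [0, 0, 1.00], [1, 0, 1.01], [0, 1, 0.99], [1, 1, 1.02], [0, 0, 2.00],
])
point_labels = np.array([0, 0, 0, 1, 2])
conf = plane_category_confidence(
    points, point_labels, np.array([0.0, 0.0, 1.0]), -1.0, max_distance=0.05
)
best = max(conf, key=conf.get)   # category with the highest confidence
```

In this toy input four points lie within the threshold of the plane z = 1, three of them carry category 0, so category 0 gets confidence 0.75 and is selected.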

The method of claim 8, further comprising:

updating the confidence of the one or more target plane categories corresponding to the any plane according to at least one of Bayes' theorem or a voting mechanism.
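The claim names Bayes' theorem and voting but gives no formulas. The sketch below shows one plausible reading of each, under the assumption that the confidence from earlier frames acts as a prior and the confidence from the current frame acts as a likelihood. The category names `"floor"` and `"table"`, the `floor` smoothing constant, and both function names are hypothetical.

```python
def bayes_update(prior, likelihood, floor=1e-6):
    """posterior(c) is proportional to prior(c) * likelihood(c), renormalized."""
    labels = set(prior) | set(likelihood)
    unnormalized = {
        # A small floor keeps a category unseen in one frame from being
        # eliminated permanently (assumption, not from the source).
        c: max(prior.get(c, 0.0), floor) * max(likelihood.get(c, 0.0), floor)
        for c in labels
    }
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}

def vote_update(votes, frame_confidence):
    """Add one vote for the frame's top category; return the vote shares."""
    winner = max(frame_confidence, key=frame_confidence.get)
    votes[winner] = votes.get(winner, 0) + 1
    total = sum(votes.values())
    return {c: n / total for c, n in votes.items()}

posterior = bayes_update({"floor": 0.5, "table": 0.5}, {"floor": 0.75, "table": 0.25})

votes = {}
vote_update(votes, {"floor": 0.6, "table": 0.4})
shares = vote_update(votes, {"floor": 0.4, "table": 0.6})
```

The Bayesian form weights each frame by how decisive it is, while the voting form treats every frame equally; which one suits a given system depends on how trustworthy the per-frame confidences are.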

The method according to any one of claims 1-9, wherein the obtaining a first dense semantic map according to the semantic segmentation result comprises:

judging whether the current state is a motion state;

and under the condition that the current state is a motion state, obtaining a first dense semantic map according to the semantic segmentation result.

The method according to any one of claims 1 to 10, wherein the image data to be processed is post-aligned image data.

An image data processing apparatus characterized by comprising:

the semantic segmentation module is used for acquiring to-be-processed image data comprising N pixel points and determining a semantic segmentation result of the to-be-processed image data, wherein the semantic segmentation result comprises target plane categories corresponding to at least part of the N pixel points, and N is a positive integer;

the semantic map module is used for obtaining a first dense semantic map according to the semantic segmentation result, wherein the first dense semantic map comprises at least one target plane category corresponding to at least one first three-dimensional point in a first three-dimensional point cloud, and the at least one first three-dimensional point corresponds to at least one pixel point in at least part of pixel points;

and the semantic clustering module is used for carrying out plane semantic category identification according to the first dense semantic map to obtain the plane semantic categories of one or more planes included in the image data to be processed.

The apparatus of claim 12, wherein the semantic map module is specifically configured to:

obtaining a second dense semantic map according to the semantic segmentation result and the depth image corresponding to the image data to be processed;

using the second dense semantic map as the first dense semantic map; or

updating a historical dense semantic map by using one or more second three-dimensional points in a second three-dimensional point cloud in the second dense semantic map, to obtain the first dense semantic map.

The apparatus according to claim 12 or 13, wherein after determining the semantic segmentation result of the image data to be processed, the semantic segmentation module is further configured to perform an optimization operation on the semantic segmentation result according to the image data to be processed and depth information included in a depth image corresponding to the image data to be processed, where the optimization operation is used to correct noise and error portions in the semantic segmentation result.

The apparatus according to any one of claims 12 to 14, wherein the semantic segmentation module is specifically configured to determine one or more plane categories corresponding to any pixel point in the at least part of pixel points and the probability of each plane category in the one or more plane categories;

and to take the plane category with the highest probability among the one or more plane categories corresponding to the any pixel point as the target plane category corresponding to the any pixel point, so as to obtain the semantic segmentation result of the image data to be processed.

The apparatus according to claim 15, wherein the semantic segmentation module is specifically configured to perform semantic segmentation on the image data to be processed by using a neural network, so as to obtain the probability of each plane category in the one or more plane categories corresponding to any pixel point in the at least part of pixel points.

The apparatus according to any one of claims 12-16, wherein the semantic clustering module is configured to:

determining a plane equation for each of the one or more planes from the image data to be processed;

performing the following steps on any plane in the one or more planes to obtain the plane semantic category of the any plane:

and selecting the target plane category with the highest confidence from the one or more target plane categories as the plane semantic category of the any plane.

The apparatus of claim 17, wherein the orientation of the one or more target plane categories corresponding to the any plane is consistent with the orientation of the any plane.

The apparatus according to claim 17 or 18, wherein the semantic clustering module is specifically configured to:

The apparatus of claim 19, wherein the semantic clustering module, after counting the proportion of three-dimensional points corresponding to each target plane category in the one or more target plane categories among the M first three-dimensional points to obtain the confidence of the one or more target plane categories, is further configured to update the confidence of the one or more target plane categories corresponding to the any plane according to at least one of Bayes' theorem or a voting mechanism.

The apparatus according to any one of claims 12 to 20, wherein the semantic map module is specifically configured to determine whether a current state is a motion state, and obtain a first dense semantic map according to the semantic segmentation result when it is determined that the current state is the motion state.

The apparatus according to any one of claims 12 to 21, wherein the image data to be processed is post-aligned image data.

A computer-readable storage medium having stored thereon instructions which, when executed, implement the method of any one of claims 1 to 11.

A processing device, comprising a first processor and a second processor, wherein the first processor is configured to acquire image data to be processed, the image data to be processed comprises N pixel points, and N is a positive integer;

the second processor is configured to determine a semantic segmentation result of the image data to be processed, where the semantic segmentation result includes a target plane category corresponding to at least some pixel points in the N pixel points;

the first processor is further configured to obtain a first dense semantic map according to the semantic segmentation result, where the first dense semantic map includes at least one target plane category corresponding to at least one first three-dimensional point in a first three-dimensional point cloud, and the at least one first three-dimensional point corresponds to at least one pixel point in the at least part of pixel points; and to perform plane semantic category identification according to the first dense semantic map to obtain plane semantic categories of one or more planes included in the image data to be processed.

The processing device according to claim 24, wherein the second processor is specifically configured to determine one or more plane categories corresponding to any of the at least some of the pixel points and a probability for each of the one or more plane categories;

The processing device according to claim 25, wherein the second processor is specifically configured to perform semantic segmentation on the image data to be processed according to a neural network, so as to obtain a probability of each of one or more plane classes corresponding to any one of the at least part of the pixel points.

A processing device, comprising: one or more processors configured to execute instructions stored in a memory to perform the method of any of claims 1-11.
