Object grabbing method and device

Document No.: 551872    Publication date: 2021-05-14

Reading note: this technology, "Object grabbing method and device", was designed and created by 段文杰, 夏冬青, 丁有爽, and 邵天兰 on 2021-02-05. Its main content is as follows. The invention discloses an object grabbing method and device, the method including: clustering the point cloud corresponding to a three-dimensional article region to obtain a plurality of point cloud clusters; calculating the pose key point of the article object corresponding to each point cloud cluster and the depth value of the pose key point, and layering the article objects corresponding to the point cloud clusters to obtain a plurality of point cloud layers arranged in order from the top layer to the bottom layer; in order from the top layer to the bottom layer, taking the article objects contained in the current point cloud layer in turn as target article objects, and, when it is determined that the coverage detection region corresponding to the pose key point of a target article object contains an article object in a lower point cloud layer corresponding to the current point cloud layer, determining that article object as an overlapped article object; and determining the other article objects contained in each point cloud layer as the objects to be grabbed. This approach excludes overlapped (pressed) article objects, preventing other articles from being flung out and damaged, and improving the reliability of the grabbing process.

1. An object grabbing method, comprising:

clustering the point cloud corresponding to a three-dimensional article region to obtain a plurality of point cloud clusters corresponding to article objects;

calculating the pose key point of the article object corresponding to each point cloud cluster and the depth value of the pose key point, and layering the article objects corresponding to the point cloud clusters according to the depth values to obtain a plurality of point cloud layers arranged in order from the top layer to the bottom layer;

taking the article objects contained in the current point cloud layer in turn as target article objects in order from the top layer to the bottom layer, and, when it is determined that the coverage detection region corresponding to the pose key point of a target article object contains an article object in a lower point cloud layer corresponding to the current point cloud layer, determining that article object as an overlapped article object;

and determining the article objects contained in each point cloud layer, other than the overlapped article objects, as the objects to be grabbed.

2. The method of claim 1, wherein the coverage detection region corresponding to the pose key point of the target article object comprises: a cone region corresponding to the pose key point of the target article object;

wherein the top of the cone region is determined according to the pose key point of the target article object, and the bottom of the cone region is located in a lower point cloud layer corresponding to the current point cloud layer in which the target article object is located.

3. The method of claim 2, wherein the cone region comprises: a conical region and/or a frustoconical region;

when the cone region is a conical region, the apex of the conical region is determined according to the pose key point of the target article object; when the cone region is a frustoconical region, the upper base of the frustoconical region is determined according to the pose key point of the target article object.

4. The method according to any one of claims 1 to 3, wherein the number of point cloud layers arranged in order from the top layer to the bottom layer is N, and the point cloud layer at the topmost level is the 1st point cloud layer; when the current point cloud layer in which the target article object is located is the Mth point cloud layer, the lower point cloud layers corresponding to the current point cloud layer comprise: the N-M point cloud layers located below the Mth point cloud layer; wherein N and M are both natural numbers, and M is less than or equal to N.

5. The method of any one of claims 1-4, wherein calculating the pose key point of the article object corresponding to each point cloud cluster comprises:

acquiring the three-dimensional position coordinates of each data point contained in the point cloud cluster, and determining the position information of the pose key point of the article object corresponding to the point cloud cluster according to the result of a preset operation on the three-dimensional position coordinates of the data points;

and analyzing the data points contained in the point cloud cluster by principal component analysis, and determining the three-dimensional orientation information of the pose key point according to the analysis result.

6. The method according to any one of claims 1 to 5, wherein layering the article objects corresponding to the point cloud clusters according to the depth values comprises:

arranging the article objects corresponding to the point cloud clusters in order of their depth values, and dividing the sorted point cloud clusters into a plurality of point cloud layers according to a layering threshold;

wherein, if the difference between the depth values of two point cloud clusters is smaller than the layering threshold, the two point cloud clusters belong to the same point cloud layer; and if the difference between the depth values of two point cloud clusters is not smaller than the layering threshold, the two point cloud clusters belong to different point cloud layers.

7. The method according to any one of claims 1 to 6, wherein, after determining the article objects contained in each point cloud layer, other than the overlapped article objects, as the objects to be grabbed, the method further comprises:

sorting the objects to be grabbed according to the depth values of their pose key points, and determining the grabbing order of the objects to be grabbed according to the sorting result;

wherein the depth value is the coordinate value of the article object on a depth coordinate axis, the depth coordinate axis being set according to the photographing direction of the camera, the direction of gravity, or the direction of the perpendicular to the article bearing surface.

8. The method according to claim 7, wherein sorting the objects to be grabbed according to the depth values of their pose key points and determining the grabbing order of the objects to be grabbed according to the sorting result comprises:

sorting the objects to be grabbed according to their distance from the camera or from the article bearing surface, and determining the grabbing order of the objects to be grabbed according to the sorting result;

wherein the closer an object to be grabbed is to the camera, the earlier it is in the grabbing order, and the farther it is from the camera, the later it is in the grabbing order; the closer an object to be grabbed is to the article bearing surface, the later it is in the grabbing order, and the farther it is from the article bearing surface, the earlier it is in the grabbing order.

9. The method according to claim 7 or 8, wherein, after determining the grabbing order of the objects to be grabbed according to the sorting result, the method further comprises:

acquiring the transformation relationship between the camera coordinate system and the robot coordinate system;

and transforming the pose key points of the objects to be grabbed from the camera coordinate system into the robot coordinate system according to the transformation relationship, and outputting the transformed pose key points of the objects to be grabbed to the robot, so that the robot performs the grabbing operation.

10. The method according to any one of claims 1 to 9, wherein the three-dimensional article region contains a plurality of objects to be grabbed stacked along a preset depth direction; and the objects to be grabbed comprise: cartons, envelopes, plastic bags, cosmeceuticals, and/or toys.

11. An object grabbing apparatus, comprising:

a clustering module adapted to cluster the point cloud corresponding to a three-dimensional article region to obtain a plurality of point cloud clusters corresponding to article objects;

a calculation module adapted to calculate the pose key point of the article object corresponding to each point cloud cluster and the depth value of the pose key point, and to layer the article objects corresponding to the point cloud clusters according to the depth values to obtain a plurality of point cloud layers arranged in order from the top layer to the bottom layer;

a determining module adapted to take the article objects contained in the current point cloud layer in turn as target article objects in order from the top layer to the bottom layer, and, when it is determined that the coverage detection region corresponding to the pose key point of a target article object contains an article object in a lower point cloud layer corresponding to the current point cloud layer, to determine that article object as an overlapped article object;

and a grabbing module adapted to determine the article objects contained in each point cloud layer, other than the overlapped article objects, as the objects to be grabbed.

12. The apparatus of claim 11, wherein the coverage detection region corresponding to the pose key point of the target article object comprises: a cone region corresponding to the pose key point of the target article object;

wherein the top of the cone region is determined according to the pose key point of the target article object, and the bottom of the cone region is located in a lower point cloud layer corresponding to the current point cloud layer in which the target article object is located.

13. The apparatus of claim 12, wherein the cone region comprises: a conical region and/or a frustoconical region;

when the cone region is a conical region, the apex of the conical region is determined according to the pose key point of the target article object; when the cone region is a frustoconical region, the upper base of the frustoconical region is determined according to the pose key point of the target article object.

14. The apparatus according to any one of claims 11-13, wherein the number of point cloud layers arranged in order from the top layer to the bottom layer is N, and the point cloud layer at the topmost level is the 1st point cloud layer; when the current point cloud layer in which the target article object is located is the Mth point cloud layer, the lower point cloud layers corresponding to the current point cloud layer comprise: the N-M point cloud layers located below the Mth point cloud layer; wherein N and M are both natural numbers, and M is less than or equal to N.

15. The apparatus according to any of claims 11-14, wherein the calculation module is specifically adapted to:

acquire the three-dimensional position coordinates of each data point contained in the point cloud cluster, and determine the position information of the pose key point of the article object corresponding to the point cloud cluster according to the result of a preset operation on the three-dimensional position coordinates of the data points;

and analyze the data points contained in the point cloud cluster by principal component analysis, and determine the three-dimensional orientation information of the pose key point according to the analysis result.

16. The apparatus according to any of claims 11-15, wherein the calculation module is specifically adapted to:

arrange the article objects corresponding to the point cloud clusters in order of their depth values, and divide the sorted point cloud clusters into a plurality of point cloud layers according to a layering threshold;

wherein, if the difference between the depth values of two point cloud clusters is smaller than the layering threshold, the two point cloud clusters belong to the same point cloud layer; and if the difference between the depth values of two point cloud clusters is not smaller than the layering threshold, the two point cloud clusters belong to different point cloud layers.

17. The apparatus of any one of claims 11-16, wherein the grabbing module is further adapted to:

sort the objects to be grabbed according to the depth values of their pose key points, and determine the grabbing order of the objects to be grabbed according to the sorting result;

wherein the depth value is the coordinate value of the article object on a depth coordinate axis, the depth coordinate axis being set according to the photographing direction of the camera, the direction of gravity, or the direction of the perpendicular to the article bearing surface.

18. The apparatus according to claim 17, wherein the grabbing module is specifically adapted to:

sort the objects to be grabbed according to their distance from the camera or from the article bearing surface, and determine the grabbing order of the objects to be grabbed according to the sorting result;

wherein the closer an object to be grabbed is to the camera, the earlier it is in the grabbing order, and the farther it is from the camera, the later it is in the grabbing order; the closer an object to be grabbed is to the article bearing surface, the later it is in the grabbing order, and the farther it is from the article bearing surface, the earlier it is in the grabbing order.

19. The apparatus according to claim 17 or 18, wherein the grabbing module is further adapted to:

acquire the transformation relationship between the camera coordinate system and the robot coordinate system;

and transform the pose key points of the objects to be grabbed from the camera coordinate system into the robot coordinate system according to the transformation relationship, and output the transformed pose key points of the objects to be grabbed to the robot, so that the robot performs the grabbing operation.

20. The apparatus according to any one of claims 11-19, wherein the three-dimensional article region contains a plurality of objects to be grabbed stacked along a preset depth direction; and the objects to be grabbed comprise: cartons, envelopes, plastic bags, cosmeceuticals, and/or toys.

21. An electronic device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another via the communication bus;

wherein the memory is configured to store at least one executable instruction that causes the processor to perform the operations corresponding to the object grabbing method of any one of claims 1-10.

22. A computer storage medium having stored therein at least one executable instruction that causes a processor to perform the operations corresponding to the object grabbing method of any one of claims 1-10.

Technical Field

The invention relates to the technical field of intelligent program-controlled robots and manipulator control, and in particular to an object grabbing method and device.

Background

At present, with the wide adoption of intelligent program-controlled robots such as manipulators, more and more articles can be grabbed and transported by such robots. For example, logistics packages can be grabbed by an intelligent program-controlled robot, which greatly improves grabbing efficiency. In the prior art, in order to achieve accurate grabbing, the objects to be grabbed contained in an article region need to be identified in advance, so that the intelligent program-controlled robot can be controlled to grab them.

However, in the course of implementing the present invention, the inventors found that, because the articles contained in an article region may overlap one another, once an overlapped article is identified as an object to be grabbed and a grabbing operation is performed on it, the articles resting on top of it may be flung out while it is being grabbed, so that those articles are damaged during the grabbing process. Existing grabbing approaches therefore cannot accurately identify the overlapping relationships between articles.

Disclosure of Invention

In view of the above, the present invention provides an object grabbing method and apparatus that overcome, or at least partially solve, the above problems.

According to one aspect of the present invention, there is provided an object grabbing method, comprising:

clustering the point cloud corresponding to a three-dimensional article region to obtain a plurality of point cloud clusters corresponding to article objects;

calculating the pose key point of the article object corresponding to each point cloud cluster and the depth value of the pose key point, and layering the article objects corresponding to the point cloud clusters according to the depth values to obtain a plurality of point cloud layers arranged in order from the top layer to the bottom layer;

taking the article objects contained in the current point cloud layer in turn as target article objects in order from the top layer to the bottom layer, and, when it is determined that the coverage detection region corresponding to the pose key point of a target article object contains an article object in a lower point cloud layer corresponding to the current point cloud layer, determining that article object as an overlapped article object;

and determining the article objects contained in each point cloud layer, other than the overlapped article objects, as the objects to be grabbed.

Optionally, the coverage detection region corresponding to the pose key point of the target article object comprises: a cone region corresponding to the pose key point of the target article object;

wherein the top of the cone region is determined according to the pose key point of the target article object, and the bottom of the cone region is located in a lower point cloud layer corresponding to the current point cloud layer in which the target article object is located.

Optionally, the cone region comprises: a conical region and/or a frustoconical region;

when the cone region is a conical region, the apex of the conical region is determined according to the pose key point of the target article object; when the cone region is a frustoconical region, the upper base of the frustoconical region is determined according to the pose key point of the target article object.

Optionally, the number of point cloud layers arranged in order from the top layer to the bottom layer is N, and the point cloud layer at the topmost level is the 1st point cloud layer; when the current point cloud layer in which the target article object is located is the Mth point cloud layer, the lower point cloud layers corresponding to the current point cloud layer comprise: the N-M point cloud layers located below the Mth point cloud layer; wherein N and M are both natural numbers, and M is less than or equal to N.

Optionally, calculating the pose key point of the article object corresponding to each point cloud cluster comprises:

acquiring the three-dimensional position coordinates of each data point contained in the point cloud cluster, and determining the position information of the pose key point of the article object corresponding to the point cloud cluster according to the result of a preset operation on the three-dimensional position coordinates of the data points;

and analyzing the data points contained in the point cloud cluster by principal component analysis, and determining the three-dimensional orientation information of the pose key point according to the analysis result.

Optionally, layering the article objects corresponding to the point cloud clusters according to the depth values comprises:

arranging the article objects corresponding to the point cloud clusters in order of their depth values, and dividing the sorted point cloud clusters into a plurality of point cloud layers according to a layering threshold;

wherein, if the difference between the depth values of two point cloud clusters is smaller than the layering threshold, the two point cloud clusters belong to the same point cloud layer; and if the difference between the depth values of two point cloud clusters is not smaller than the layering threshold, the two point cloud clusters belong to different point cloud layers.

Optionally, after determining the article objects contained in each point cloud layer, other than the overlapped article objects, as the objects to be grabbed, the method further comprises:

sorting the objects to be grabbed according to the depth values of their pose key points, and determining the grabbing order of the objects to be grabbed according to the sorting result;

wherein the depth value is the coordinate value of the article object on a depth coordinate axis, the depth coordinate axis being set according to the photographing direction of the camera, the direction of gravity, or the direction of the perpendicular to the article bearing surface.

Optionally, the objects to be grabbed are sorted according to their distance from the camera or from the article bearing surface, and the grabbing order of the objects to be grabbed is determined according to the sorting result;

wherein the closer an object to be grabbed is to the camera, the earlier it is in the grabbing order, and the farther it is from the camera, the later it is in the grabbing order; the closer an object to be grabbed is to the article bearing surface, the later it is in the grabbing order, and the farther it is from the article bearing surface, the earlier it is in the grabbing order.
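As an illustrative sketch only: when the depth coordinate axis follows the camera's photographing direction, this ordering reduces to an ascending sort on the depth values of the pose key points. The object records and their `keypoint_depth` field below are hypothetical names, not taken from the patent.

```python
# Minimal sketch: grab the objects nearest the camera (smallest depth) first.
# `objects` is a hypothetical list of records with a `keypoint_depth` field
# holding the depth value of each object's pose key point.
def grabbing_order(objects):
    """Return the objects to be grabbed, ordered nearest-to-camera first."""
    return sorted(objects, key=lambda o: o.keypoint_depth)
```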

Optionally, after determining the grabbing order of each object to be grabbed according to the sorting result, the method further includes:

acquiring the transformation relationship between the camera coordinate system and the robot coordinate system;

and transforming the pose key points of the objects to be grabbed from the camera coordinate system into the robot coordinate system according to the transformation relationship, and outputting the transformed pose key points of the objects to be grabbed to the robot, so that the robot performs the grabbing operation.
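As an illustrative sketch only, the conversion of pose key points from the camera coordinate system to the robot coordinate system can be expressed with a 4x4 homogeneous transformation matrix; `T_robot_from_camera` is an assumed input (for example, obtained from hand-eye calibration), not something the patent specifies.

```python
import numpy as np

def to_robot_frame(keypoints_cam: np.ndarray,
                   T_robot_from_camera: np.ndarray) -> np.ndarray:
    """Transform an (N, 3) array of pose key points from camera to robot frame."""
    ones = np.ones((keypoints_cam.shape[0], 1))
    homogeneous = np.hstack([keypoints_cam, ones])         # (N, 4) homogeneous points
    return (T_robot_from_camera @ homogeneous.T).T[:, :3]  # back to (N, 3)
```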

Optionally, the three-dimensional article region contains a plurality of objects to be grabbed stacked along a preset depth direction; and the objects to be grabbed comprise: cartons, envelopes, plastic bags, cosmeceuticals, and/or toys.

According to still another aspect of the present invention, there is also provided an object grabbing apparatus, comprising:

a clustering module adapted to cluster the point cloud corresponding to a three-dimensional article region to obtain a plurality of point cloud clusters corresponding to article objects;

a calculation module adapted to calculate the pose key point of the article object corresponding to each point cloud cluster and the depth value of the pose key point, and to layer the article objects corresponding to the point cloud clusters according to the depth values to obtain a plurality of point cloud layers arranged in order from the top layer to the bottom layer;

a determining module adapted to take the article objects contained in the current point cloud layer in turn as target article objects in order from the top layer to the bottom layer, and, when it is determined that the coverage detection region corresponding to the pose key point of a target article object contains an article object in a lower point cloud layer corresponding to the current point cloud layer, to determine that article object as an overlapped article object;

and a grabbing module adapted to determine the article objects contained in each point cloud layer, other than the overlapped article objects, as the objects to be grabbed.

Optionally, the coverage detection region corresponding to the pose key point of the target article object comprises: a cone region corresponding to the pose key point of the target article object;

wherein the top of the cone region is determined according to the pose key point of the target article object, and the bottom of the cone region is located in a lower point cloud layer corresponding to the current point cloud layer in which the target article object is located.

Optionally, the cone region comprises: a conical region and/or a frustoconical region;

when the cone region is a conical region, the apex of the conical region is determined according to the pose key point of the target article object; when the cone region is a frustoconical region, the upper base of the frustoconical region is determined according to the pose key point of the target article object.

Optionally, the number of point cloud layers arranged in order from the top layer to the bottom layer is N, and the point cloud layer at the topmost level is the 1st point cloud layer; when the current point cloud layer in which the target article object is located is the Mth point cloud layer, the lower point cloud layers corresponding to the current point cloud layer comprise: the N-M point cloud layers located below the Mth point cloud layer; wherein N and M are both natural numbers, and M is less than or equal to N.

Optionally, the calculation module is specifically adapted to:

acquire the three-dimensional position coordinates of each data point contained in the point cloud cluster, and determine the position information of the pose key point of the article object corresponding to the point cloud cluster according to the result of a preset operation on the three-dimensional position coordinates of the data points;

and analyze the data points contained in the point cloud cluster by principal component analysis, and determine the three-dimensional orientation information of the pose key point according to the analysis result.

Optionally, the calculation module is specifically adapted to:

arrange the article objects corresponding to the point cloud clusters in order of their depth values, and divide the sorted point cloud clusters into a plurality of point cloud layers according to a layering threshold;

wherein, if the difference between the depth values of two point cloud clusters is smaller than the layering threshold, the two point cloud clusters belong to the same point cloud layer; and if the difference between the depth values of two point cloud clusters is not smaller than the layering threshold, the two point cloud clusters belong to different point cloud layers.

Optionally, the grabbing module is further adapted to:

sort the objects to be grabbed according to the depth values of their pose key points, and determine the grabbing order of the objects to be grabbed according to the sorting result;

wherein the depth value is the coordinate value of the article object on a depth coordinate axis, the depth coordinate axis being set according to the photographing direction of the camera, the direction of gravity, or the direction of the perpendicular to the article bearing surface.

Optionally, the grabbing module is specifically adapted to:

sort the objects to be grabbed according to their distance from the camera or from the article bearing surface, and determine the grabbing order of the objects to be grabbed according to the sorting result;

wherein the closer an object to be grabbed is to the camera, the earlier it is in the grabbing order, and the farther it is from the camera, the later it is in the grabbing order; the closer an object to be grabbed is to the article bearing surface, the later it is in the grabbing order, and the farther it is from the article bearing surface, the earlier it is in the grabbing order.

Optionally, the grabbing module is further adapted to:

acquire the transformation relationship between the camera coordinate system and the robot coordinate system;

and transform the pose key points of the objects to be grabbed from the camera coordinate system into the robot coordinate system according to the transformation relationship, and output the transformed pose key points of the objects to be grabbed to the robot, so that the robot performs the grabbing operation.

Optionally, the three-dimensional article region contains a plurality of objects to be grabbed stacked along a preset depth direction; and the objects to be grabbed comprise: cartons, envelopes, plastic bags, cosmeceuticals, and/or toys.

According to still another aspect of the present invention, there is provided an electronic device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another via the communication bus;

wherein the memory is configured to store at least one executable instruction that causes the processor to perform the operations corresponding to the object grabbing method described above.

According to still another aspect of the present invention, there is provided a computer storage medium having stored therein at least one executable instruction that causes a processor to perform the operations corresponding to the object grabbing method described above.

In the object grabbing method and device provided by the invention, the point cloud corresponding to a three-dimensional article region is first clustered to obtain a plurality of point cloud clusters corresponding to article objects; then the pose key point and depth value of the article object corresponding to each point cloud cluster are calculated, and the article objects corresponding to the point cloud clusters are layered according to the depth values to obtain a plurality of point cloud layers arranged in order from the top layer to the bottom layer; finally, in order from the top layer to the bottom layer, the article objects contained in the current point cloud layer are taken in turn as target article objects, and, when it is determined that the coverage detection region corresponding to the pose key point of a target article object contains an article object in a lower point cloud layer corresponding to the current point cloud layer, that article object is determined to be an overlapped article object, so that the article objects contained in each point cloud layer, other than the overlapped article objects, are determined as the objects to be grabbed. In this way, pressed and overlapped article objects can be accurately identified by setting a coverage detection region and excluded when the objects to be grabbed are determined, preventing other articles from being flung out and damaged and improving the reliability of the grabbing process.

The foregoing is only an overview of the technical solutions of the present invention. In order that the technical means of the present invention may be understood more clearly, and that the above and other objects, features, and advantages of the present invention may become more readily apparent, specific embodiments of the invention are described below.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

FIG. 1 is a flow diagram illustrating an object fetching method according to an embodiment of the present invention;

FIG. 2 is a flow diagram illustrating an object fetching method according to another embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an object grabbing apparatus according to still another embodiment of the present invention;

FIG. 4 is a schematic structural diagram of an electronic device according to the present invention.

Detailed Description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

Fig. 1 shows a schematic flow diagram of an object grabbing method according to an embodiment of the present invention. The method may be performed by an intelligent program-controlled robot, such as a manipulator. As shown in fig. 1, the method includes:

step S110: and clustering the point clouds corresponding to the three-dimensional article areas to obtain a plurality of point cloud clusters corresponding to the article objects.

Here, the three-dimensional article region is a three-dimensional region in which a plurality of articles are stacked. Because the articles in this embodiment may be stacked on one another, the spatial relationships between them cannot be accurately described by a planar image alone, so a three-dimensional article region is used instead.

In addition, a point cloud is a set of data points in a preset coordinate system. The points carry rich information, including three-dimensional coordinates X, Y, Z, color, classification value, intensity value, time, and the like. A point cloud discretizes the real world into points, and high-precision point cloud data can reconstruct it; point cloud information can therefore reflect the three-dimensional characteristics of the three-dimensional article region. In this embodiment, the point cloud corresponding to the three-dimensional article region may be obtained in various ways. For example, a two-dimensional color image of the three-dimensional article region and the depth image corresponding to it may be acquired by a three-dimensional camera, and the point cloud information constructed from the two. As another example, the point cloud may be generated by sensors such as laser detectors, visible-light detectors such as LEDs, infrared detectors, and radar detectors.

Specifically, the point cloud corresponding to the three-dimensional article region is clustered to obtain a plurality of point cloud clusters corresponding to article objects. The clustering may be performed according to the distances between points: for example, a distance threshold is set, and points whose mutual spacing is smaller than the distance threshold are aggregated into one point cloud cluster.

Step S120: calculate the pose key point of the article object corresponding to each point cloud cluster and the depth value of the pose key point, and layer the article objects corresponding to the point cloud clusters according to the depth values to obtain a plurality of point cloud layers arranged in order from the top layer to the bottom layer.

Point cloud clusters correspond to article objects; typically, each point cloud cluster corresponds to one article object. The three-dimensional pose information of an article object describes the pose of the object to be grabbed in the three-dimensional world. A pose key point is a pose point that reflects the three-dimensional position characteristics of the article object; it may, for example, be the pose point corresponding to the center of gravity of the article object. Accordingly, the pose key point of the article object corresponding to each point cloud cluster and the depth value of the pose key point are calculated.

The depth value of a pose key point is the coordinate value of the article object on the depth coordinate axis. The direction of the depth coordinate axis can be set flexibly according to the actual business scenario; for example, it may be set according to the photographing direction of the camera, the direction of gravity, or the direction of the perpendicular to the article bearing surface.

In one implementation, the depth coordinate axis corresponds to the photographing direction of the camera, so the depth value reflects the distance between the article object and the camera. The article objects corresponding to the point cloud clusters are layered according to the depth values to obtain a plurality of point cloud layers arranged in order from the top layer to the bottom layer. Accordingly, the point cloud layer at the top is closest to the camera (farthest from the article bearing surface), while the point cloud layer at the bottom is farthest from the camera (closest to the article bearing surface). Each point cloud layer contains one or more article objects.

Step S130: in order from the top layer to the bottom layer, take the article objects contained in the current point cloud layer in turn as target article objects, and, when it is determined that the coverage detection region corresponding to the pose key point of a target article object contains an article object in a lower point cloud layer corresponding to the current point cloud layer, determine that article object as an overlapped article object.

Specifically, the point cloud layer at the top is taken as the current point cloud layer, an article object in that layer is taken as the target article object, and a coverage detection region corresponding to the pose key point of the target article object is set. The coverage detection region is a geometric region of a preset shape drawn around the pose key point of the target article object as a reference point. The invention does not limit the specific shape of the coverage detection region; any shape may be used, as long as overlapped article objects adjacent to the target article object can be detected.

Correspondingly, when it is determined that the coverage detection region corresponding to the pose key point of the target article object contains an article object in a lower point cloud layer corresponding to the current point cloud layer, that article object is determined to be an overlapped article object. The lower point cloud layers corresponding to the current point cloud layer are the point cloud layers located below the current point cloud layer. The point cloud layers form a hierarchy whose levels decrease in order from the top layer to the bottom layer: the point cloud layer at the top has the highest level, and the point cloud layer at the bottom has the lowest level. Thus, the lower point cloud layers corresponding to the current point cloud layer are all the point cloud layers whose level is lower than that of the current point cloud layer.

Step S140: determine the article objects contained in each point cloud layer, other than the overlapped article objects, as the objects to be grabbed.

Since the overlapped article objects have been detected in the preceding step, they are known to be articles pressed beneath other articles. The article objects contained in each point cloud layer, other than the overlapped article objects, are therefore determined as the objects to be grabbed, which avoids dragging along the articles resting on the upper layers. In this way, the overlapped article objects are excluded, so the objects to be grabbed contain no pressed articles, improving the reliability of grabbing.

It can be seen that, by setting a coverage detection region, pressed and overlapped article objects can be accurately identified and excluded when the objects to be grabbed are determined, preventing other articles from being flung out and damaged and improving the reliability of the grabbing process.

Fig. 2 is a flowchart illustrating an object grabbing method according to another embodiment of the present invention. As shown in fig. 2, the method includes:

step S200: a point cloud corresponding to the three-dimensional object region is obtained.

As before, the three-dimensional article region is a three-dimensional region in which a plurality of articles are stacked. Because the articles in this embodiment may be stacked on one another, the spatial relationships between them cannot be accurately described by a planar image alone, so a three-dimensional article region is used. A point cloud is a set of data points in a preset coordinate system. In this embodiment, the point cloud corresponding to the three-dimensional article region may be obtained in various ways, and the invention does not limit the specific implementation. For example, the point cloud may be generated by sensors such as laser detectors, visible-light detectors such as LEDs, infrared detectors, and radar detectors.

In a specific example, a two-dimensional color image corresponding to the three-dimensional article region and the depth image corresponding to the two-dimensional color image are acquired by a three-dimensional camera, and the point cloud information is constructed from them. Concretely, a two-dimensional color image of the three-dimensional article region and the depth image corresponding to it are acquired along a preset depth direction, and the point cloud information is constructed from the two-dimensional color image and its corresponding depth image.

The preset depth direction can be set flexibly according to the actual business scenario. Specifically, it includes at least one of the following: the photographing direction of the camera, the direction of gravity, and the direction of the perpendicular to the article bearing surface.

In one implementation, the preset depth direction is the depth direction along which the camera photographs, also called the camera photographing direction. Specifically, the line of sight of the camera lens extends from a first position to a second position, and the preset depth direction is the direction from the first position to the second position. For example, when the camera photographs from top to bottom, the preset depth direction is the top-to-bottom direction; when the camera photographs from left to right, the preset depth direction is the left-to-right direction. If one camera is used, the preset depth direction is the direction from the camera toward the article region; if two cameras are used, the preset depth direction is the direction from the midpoint between the two cameras toward the article region. Similarly, for scenes with more cameras, the preset depth direction may be set according to the direction from the center of the cameras toward the article region; the invention does not limit these details.

In another implementation, the preset depth direction is the direction of the perpendicular to the article bearing surface, that is, the direction perpendicular to the article bearing surface. The photographing angle of the camera can be set flexibly; for example, it may form an angle with the direction in which the articles are placed, that is, the camera may be tilted. For accuracy of description, the preset depth direction may therefore also be taken as the direction perpendicular to the article bearing surface. In practice, the preset depth direction may be any direction, for example the vertical direction or some inclined direction; the invention does not limit it.

Here, the article bearing surface is the plane of the carrier on which the three-dimensional articles are placed. For example, when the articles are placed on the ground, the ground is the carrier and the article bearing surface is the plane of the ground; when the articles are placed on a tray, a conveyor belt, or a material basket, the tray, conveyor belt, or basket is the carrier and the article bearing surface is the plane in which it lies. In specific scenarios, such a carrier may be inclined: for example, for ease of loading and unloading, the plane of a conveyor belt may form a preset angle with the horizontal plane; correspondingly, the preset depth direction is perpendicular to the plane of the conveyor belt and therefore also forms a preset angle with the vertical direction.

In addition, the preset depth direction may be the direction of gravity. For example, when the article bearing surface coincides with the horizontal plane, the preset depth direction is the direction of gravity.

In a specific example in which the preset depth direction is the photographing direction of the camera, a two-dimensional color image corresponding to the three-dimensional article region and the depth image corresponding to the two-dimensional color image are obtained by a 3D camera. The two-dimensional color image corresponds to an image of the plane perpendicular to the preset depth direction. Each pixel in the depth map corresponds one-to-one to a pixel in the two-dimensional color image, and its value is the depth value of that pixel, determined by the distance between the article and the camera. For example, when the camera photographs from top to bottom, the two-dimensional color image corresponds to a top view, while the depth map represents the distance of each article from the camera. The two-dimensional color image and its corresponding depth image can therefore accurately describe the spatial relationships between articles from a three-dimensional perspective, and the corresponding point cloud information can be constructed from them.
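As an illustrative sketch only, one common way to construct the point cloud from an aligned depth map is pinhole back-projection. The intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical values that would come from camera calibration; the patent does not prescribe this particular construction.

```python
import numpy as np

def depth_to_point_cloud(depth: np.ndarray, fx: float, fy: float,
                         cx: float, cy: float) -> np.ndarray:
    """Back-project an (H, W) depth map (in meters) into an (N, 3) point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column/row indices
    z = depth
    x = (u - cx) * z / fx  # back-project columns through the pinhole model
    y = (v - cy) * z / fy  # back-project rows
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no valid depth reading
```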

Step S210: cluster the point cloud corresponding to the three-dimensional article region to obtain a plurality of point cloud clusters corresponding to article objects.

Specifically, the clustering is performed according to the distances between points. For example, a distance threshold is set, and points whose mutual spacing is smaller than the distance threshold are aggregated into one point cloud cluster. A point cloud cluster is a cluster of point data composed of many nearby data points. Point cloud clusters correspond to article objects; typically, each point cloud cluster corresponds to one article object.
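As an illustrative sketch only, the distance-threshold clustering can be approximated with scikit-learn's DBSCAN, whose `eps` parameter plays the role of the distance threshold; the parameter values below are illustrative, and the patent does not name a specific clustering algorithm.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_point_cloud(points: np.ndarray,
                        distance_threshold: float = 0.01,
                        min_points: int = 50) -> list:
    """Group an (N, 3) point cloud into clusters of mutually close points."""
    labels = DBSCAN(eps=distance_threshold, min_samples=min_points).fit_predict(points)
    # Label -1 marks noise points that belong to no cluster and are discarded.
    return [points[labels == k] for k in range(labels.max() + 1)]
```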

Step S220: calculate the pose key point of the article object corresponding to each point cloud cluster and the depth value of the pose key point.

The three-dimensional pose information of an article object describes the posture of the object to be grabbed in the three-dimensional world, and a pose key point is a pose point that reflects the three-dimensional position characteristics of the article object. In the subsequent steps the point cloud clusters must be layered according to depth; to measure the depth of a point cloud cluster, a pose key point is set for each cluster, and the depth value of the pose key point is used as the depth value of the cluster.

Specifically, the pose key point of the article object corresponding to each point cloud cluster is calculated as follows:

First, the three-dimensional position coordinates of each data point contained in the point cloud cluster are acquired, and the position information of the pose key point of the corresponding article object is determined from the result of a preset operation on those coordinates. For example, suppose a point cloud cluster obtained by clustering contains one thousand data points: the three-dimensional position coordinates of the thousand data points are acquired, their average value is computed, and the point corresponding to that average is taken as the pose key point of the article object corresponding to the cluster. Besides averaging, the preset operation may be taking the center of gravity, the maximum value, the minimum value, and so on; the invention is not limited in this respect.

Then, the data points contained in the point cloud cluster are analyzed by Principal Component Analysis (PCA), and the three-dimensional orientation information of the pose key point is determined from the analysis result. PCA is a mathematical dimension-reduction method: it uses an orthogonal transformation to convert a set of possibly linearly correlated variables into a set of new, linearly uncorrelated variables called principal components, so that the data can be characterized in fewer dimensions. Principal component analysis can find, among the thousand data points, the direction of least variation and the direction of greatest variation. The direction of least variation is taken as the Z-axis direction, the direction of greatest variation as the X-axis direction, and the Y-axis direction is determined by the right-handed coordinate system, thereby determining the three-dimensional orientation information of the pose key point and reflecting its directional characteristics in three-dimensional space. The Z-axis direction is the direction of the depth coordinate axis, which is consistent with the preset depth direction described above.
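As an illustrative sketch only, the two parts of the pose computation described above (position from a preset averaging operation, orientation axes from PCA) can be combined as follows; the cluster is assumed to be an (N, 3) NumPy array, and this is one reading of the steps above rather than the patent's exact implementation.

```python
import numpy as np

def pose_keypoint(cluster: np.ndarray):
    """Return (position, 3x3 rotation with columns X, Y, Z) for one point cloud cluster."""
    position = cluster.mean(axis=0)        # preset operation: averaging the coordinates
    centered = cluster - position
    # Eigen-decomposition of the 3x3 covariance matrix; eigenvalues are ascending.
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    z_axis = eigvecs[:, 0]                 # direction of least variation -> Z axis
    x_axis = eigvecs[:, 2]                 # direction of greatest variation -> X axis
    y_axis = np.cross(z_axis, x_axis)      # Y completes the right-handed frame
    return position, np.stack([x_axis, y_axis, z_axis], axis=1)
```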

Finally, the pose key point of the article object corresponding to each point cloud cluster and the depth value of the pose key point are obtained. The depth value of a pose key point is the coordinate value of the article object on the depth coordinate axis, where the depth coordinate axis is set according to the photographing direction of the camera, the direction of gravity, or the direction of the perpendicular to the article bearing surface. The direction of the depth coordinate axis is thus consistent with the preset depth direction mentioned above, with the same three possible implementations, which are not repeated here. Accordingly, the depth value reflects the position of the article object on the depth coordinate axis. In specific implementations, the origin and direction of the depth coordinate axis can be set flexibly by those skilled in the art, and the invention does not limit how the origin is chosen. For example, when the depth coordinate axis is set according to the photographing direction of the camera, its origin may be the position of the camera and its direction the direction from the camera toward the articles, so that the depth value of each article corresponds to the distance from the article to the camera. Correspondingly, the depth value of a pose key point reflects the distance between the article object and the camera.

Step S230: layer the article objects corresponding to the point cloud clusters according to the depth values to obtain a plurality of point cloud layers arranged in order from the top layer to the bottom layer.

Specifically, the article objects corresponding to the point cloud clusters are layered according to the depth values to obtain a plurality of point cloud layers arranged in order from the top layer to the bottom layer. In specific implementation, the article objects corresponding to the point cloud clusters are arranged in order of their depth values, and the sorted point cloud clusters are divided into a plurality of point cloud layers according to a layering threshold: if the difference between the depth values of two point cloud clusters is smaller than the layering threshold, the two clusters belong to the same point cloud layer; if the difference is not smaller than the layering threshold, they belong to different point cloud layers. The thickness of each point cloud layer can therefore be controlled by setting the layering threshold appropriately. The layering thresholds of the point cloud layers may be the same or different; for example, the threshold of the top (or bottom) point cloud layer may be set larger or smaller than that of the other layers, thereby controlling the thickness of the top (or bottom) point cloud layer.
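As an illustrative sketch only: one reading of this rule sorts the clusters by the depth values of their pose key points and starts a new layer whenever the gap between consecutive depth values reaches the layering threshold. Both the consecutive-gap interpretation and the threshold value are assumptions, not details fixed by the patent.

```python
def split_into_layers(depths, layer_threshold: float = 0.05):
    """Split cluster indices into layers ordered from the top layer downward."""
    if not depths:
        return []
    order = sorted(range(len(depths)), key=lambda i: depths[i])  # smallest depth = top
    layers, current = [], [order[0]]
    for prev, cur in zip(order, order[1:]):
        if depths[cur] - depths[prev] < layer_threshold:
            current.append(cur)       # difference below threshold: same point cloud layer
        else:
            layers.append(current)    # gap reaches threshold: start a new layer
            current = [cur]
    layers.append(current)
    return layers
```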

The notions of top layer and bottom layer are determined by the stacking relationship between articles: under gravity, bottom-layer articles necessarily lie below top-layer articles. Therefore, the top point cloud layer is farthest from the article bearing surface and the bottom point cloud layer is closest to it; equivalently, the top point cloud layer is closest to the camera and the bottom point cloud layer is farthest from it. Each point cloud layer contains one or more article objects.

Step S240: taking the article objects contained in the current point cloud layer as target article objects in sequence from the top layer to the bottom layer, and, when it is determined that the coverage detection area corresponding to the pose key point of a target article object contains an article object in a lower point cloud layer than the current point cloud layer, determining that article object as an overlapped article object.

Suppose the number of point cloud layers arranged in sequence from the top layer to the bottom layer is N, with the topmost layer being the 1st point cloud layer and the bottommost layer being the Nth point cloud layer. When the current point cloud layer where the target article object is located is the Mth point cloud layer, the lower point cloud layers corresponding to the current layer are the N-M point cloud layers below the Mth layer, where N and M are both natural numbers and M is less than or equal to N. In a specific implementation, in order from the 1st layer to the Nth layer, the article objects contained in the current point cloud layer are taken in turn as target article objects, and it is determined whether the coverage detection area corresponding to the pose key point of the target article object contains any article object in the lower point cloud layers corresponding to the current layer.

For example, assume N is 4. First, the 1st (topmost) point cloud layer is taken as the current layer, an article object L1 in the 1st layer is taken as the target article object, and the coverage detection area corresponding to the pose key point of L1 is drawn; it is then determined whether any article object in the 3 point cloud layers below the 1st layer falls within that coverage detection area. Next, the 2nd point cloud layer is taken as the current layer, an article object L2 in the 2nd layer is taken as the target article object, and its coverage detection area is drawn; it is then determined whether any article object in the 2 point cloud layers below the 2nd layer falls within that area, and so on.
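
The top-to-bottom scan can be sketched as nested loops. The sketch below makes one interpretive assumption: the lower object's pose key point is used as the containment test point (the patent speaks of an article object being contained in the zone without fixing the test point). The covers predicate is sketched after the cone geometry below.

```python
def find_overlapped(layers, keypoints, covers):
    """Scan point cloud layers top-to-bottom and collect overlapped objects.

    layers:    list of layers, each a list of object ids, top layer first.
    keypoints: mapping from object id to its pose key point (3-vector).
    covers:    predicate covers(apex, point) -> True when `point` lies in
               the coverage detection zone whose top is `apex`.
    """
    overlapped = set()
    for m, layer in enumerate(layers):            # current (Mth) layer
        for target in layer:                      # target article object
            apex = keypoints[target]
            for lower_layer in layers[m + 1:]:    # the N-M lower layers
                for obj in lower_layer:
                    if covers(apex, keypoints[obj]):
                        overlapped.add(obj)
    return overlapped
```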

The coverage detection area corresponding to the pose key point of the target article object comprises: a cone region corresponding to the pose key point of the target article object. The top of the cone region is determined according to the pose key point of the target article object, and the bottom of the cone region is located in a lower point cloud layer than the current point cloud layer where the target article object is located; for example, the top of the cone region coincides with the pose key point of the target article object. The cone region comprises: a conical region and/or a truncated-cone region. When the cone region is a conical region, the apex of the cone is determined according to the pose key point of the target article object, for example coinciding with it. When the cone region is a truncated-cone region, the upper base of the frustum is determined according to the pose key point of the target article object, for example with the center of the upper base coinciding with the pose key point.

For example, taking the cone region as a conical region: when the 1st point cloud layer in the above example is the current layer and article object L1 in it is the target article object, the coverage detection area corresponding to the pose key point of L1 is drawn as follows: with the pose key point of L1 as the cone apex, a conical region is drawn downward toward the lower point cloud layers; depending on the height of the cone and the thickness of each layer, the bottom of the cone lies in any of the 2nd to 4th point cloud layers. The coverage of the cone region depends on the opening of the cone. In solid geometry, a cone is defined as follows: taking the line containing one leg of a right triangle as the rotation axis, the solid enclosed by the surfaces swept by the other two sides rotating through 360 degrees is called a cone; the rotation axis is the axis of the cone, the surface swept by the side perpendicular to the axis is the base, the surface swept by the side not perpendicular to the axis is the lateral surface, and that side, in any rotated position, is called a generatrix of the cone. With reference to this definition, the smaller the angle between the base and the lateral surface, the wider the coverage of the cone region; the larger that angle, the narrower the coverage. Likewise, the longer the rotation axis, the more lower point cloud layers the cone region covers; the shorter the axis, the fewer layers it covers. In practice, the angle between the base and the lateral surface can be set as small as possible so that a wide area is covered and the overlap judgment is stricter, and the rotation axis can be set long enough that the cone region covers every point cloud layer below the current one.
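
A point-in-cone test matching this geometry can be sketched as follows. Parametrizing by the half-angle between the axis and a generatrix, rather than by the base-to-lateral-surface angle, is an equivalent formulation chosen for convenience; names are illustrative:

```python
import numpy as np

def covers(apex, axis_dir, half_angle, height, point):
    """Test whether `point` lies inside a conical coverage detection zone.

    apex:       pose key point of the target article object (cone vertex).
    axis_dir:   unit vector of the cone axis, pointing toward the lower
                layers (the preset depth direction).
    half_angle: angle between the axis and a generatrix, in radians; this
                is the complement of the base-to-lateral-surface angle
                discussed above, so a large half_angle means wide coverage.
    height:     length of the cone's rotation axis.
    point:      3-D point to test, e.g. a lower object's pose key point.
    """
    v = np.asarray(point, dtype=float) - np.asarray(apex, dtype=float)
    along = v @ axis_dir                    # signed distance along the axis
    if along < 0 or along > height:
        return False                        # outside the axial extent
    radial = np.linalg.norm(v - along * axis_dir)
    return radial <= along * np.tan(half_angle)
```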

Accordingly, when it is determined that the coverage detection area corresponding to the pose key point of the target article object contains an article object in a lower point cloud layer, that article object is determined as an overlapped article object. An overlapped article object is an article object pressed down by at least one article object in an upper layer, and it should therefore be excluded from the objects to be grabbed.

Step S250: determining the article objects contained in each point cloud layer, other than the overlapped article objects, as the objects to be grabbed.

Since the overlapped article objects have been detected in the previous step, the article objects contained in each point cloud layer other than the overlapped ones are determined as the objects to be grabbed. In this way the overlapped article objects are removed, so that the objects to be grabbed contain no pressed-down articles, which prevents upper-layer articles from being carried away or thrown out during grabbing and improves the reliability of the grabbing process.
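
Tying the sketches above together, with illustrative parameter values (the 75-degree half-angle and unbounded cone height are assumptions chosen to give wide, strict coverage, per the discussion above):

```python
import numpy as np

# assumed preset depth direction: unit vector pointing from camera to articles
depth_dir = np.array([0.0, 0.0, 1.0])

# a large half-angle corresponds to a small angle between the cone's base and
# lateral surface (wide coverage); infinite height covers every lower layer
covers_fn = lambda apex, point: covers(apex, depth_dir,
                                       half_angle=np.radians(75),
                                       height=np.inf, point=point)

overlapped = find_overlapped(layers, keypoints, covers_fn)

# objects to be grabbed: everything not marked as overlapped
graspable = [obj for layer in layers for obj in layer
             if obj not in overlapped]
```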

Step S260: sorting the objects to be grabbed according to the depth values of their pose key points, and determining the grabbing sequence of the objects to be grabbed according to the sorting result.

The objects to be grabbed are sorted according to their distance from the camera or from the article bearing surface, and the grabbing sequence is determined from the sorting result: the closer an object to be grabbed is to the camera, the earlier its grabbing sequence, and the farther from the camera, the later; equivalently, the closer an object is to the article bearing surface, the later its grabbing sequence, and the farther from the bearing surface, the earlier. In general, since the camera photographs from top to bottom, graspable objects close to the camera lie in the top layers and those far from the camera lie in the bottom layers. The sorting result therefore arranges the graspable objects in order from the top layer to the bottom layer, so that during grabbing they are picked sequentially from top to bottom. Here, the distance between a graspable object and the article bearing surface is measured along the normal of the bearing surface; that is, it is the vertical distance between the graspable object and the article bearing surface.
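
With camera-based depth values this ordering is a one-liner; depths_by_id, mapping object ids to key point depth values, is an assumed structure carried over from the earlier sketches:

```python
# ascending depth value: objects closest to the camera are grabbed first
grab_order = sorted(graspable, key=lambda obj: depths_by_id[obj])
```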

Step S270: outputting a grabbing instruction to the robot so that the robot performs the grabbing operation according to the grabbing instruction.

Specifically, the above steps mainly process the information captured by the camera to identify the graspable objects and determine their grabbing sequence. In general, the camera and the robot are different devices, so a grabbing instruction needs to be further output to the robot so that the robot can perform the grabbing operation according to it.

Considering that the camera is usually not located at the same position as the robot, the objects to be grabbed can be located by coordinate-system conversion, specifically as follows: the conversion relation between the camera coordinate system and the robot coordinate system is acquired; according to this conversion relation, the pose key points of each object to be grabbed in the camera coordinate system are converted into the robot coordinate system, and the converted pose key points are output to the robot so that it performs the grabbing operation. The three-dimensional pose information of the objects to be grabbed described in the preceding steps is determined in the camera coordinate system; to facilitate the robot's grabbing operation, it must be converted into the robot coordinate system. The conversion between the camera coordinate system and the robot coordinate system can be determined from the relative relationship between the position of the camera and the position of the robot.
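
With the conversion relation expressed as a 4x4 homogeneous transform, the conversion is a single matrix product. A minimal sketch (obtaining the transform via hand-eye calibration is an assumption; the patent only states that the relation can be determined from the relative positions):

```python
import numpy as np

def to_robot_frame(T_cam_to_robot, keypoints_cam):
    """Convert pose key points from the camera frame to the robot frame.

    T_cam_to_robot: 4x4 homogeneous transform between the two coordinate
                    systems, e.g. from hand-eye calibration (assumed).
    keypoints_cam:  (K, 3) key point positions in the camera frame.
    """
    keypoints_cam = np.asarray(keypoints_cam, dtype=float)
    homo = np.hstack([keypoints_cam, np.ones((len(keypoints_cam), 1))])
    return (homo @ T_cam_to_robot.T)[:, :3]
```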

It can be seen that the three-dimensional article area in this embodiment contains a plurality of graspable objects stacked along the preset depth direction, where the graspable objects include: cartons, envelopes, file pockets, postcards and the like; plastic soft packs (including but not limited to snack packaging, pillow-type milk packs, plastic milk packaging and the like); cosmetic or pharmaceutical bottles and products; and/or irregular toy articles. This approach can assist an industrial robot whose end is equipped with a vacuum suction cup to pick up articles one by one from a chaotically stacked pile, for subsequent stations such as code scanning and loading.

In addition, those skilled in the art can make various modifications and variations to step S260. When sorting the objects to be grabbed, besides the depth values, the objects can be sorted according to the point cloud layer in which they are located: the grabbing sequence of the article objects corresponding to the point cloud clusters is set in order from the top layer to the bottom layer, and when the same point cloud layer contains multiple article objects, the grabbing sequence within that layer is set according to the sizes of the article objects. Specifically, since the grabbing sequence follows the top-to-bottom layer order, article objects in upper layers are grabbed before those in lower layers; grabbing upper-layer articles first prevents the problem of a pressed lower-layer article carrying away the articles on top of it. Furthermore, if the same point cloud layer contains multiple article objects, the size of each article object is acquired, and the grabbing sequence within the layer is set according to size. The size of an article object can be expressed in various ways, for example by its volume or surface area. In practice, since a small article is likely to be pressed by a large one, grabbing the larger article object first prevents pressed articles from flying out. In a specific implementation, the exposed surface area and/or volume of each article object contained in the same point cloud layer is acquired and compared, the article objects are sorted in descending order of exposed surface area and/or volume, and the grabbing sequence is set according to the sorting result: article objects with large exposed surface area and/or volume are grabbed first, and those with small exposed surface area and/or volume later. Because a large article object easily presses other, smaller article objects, preferentially grabbing the large ones effectively avoids other articles flying out when a pressed article is grabbed first; this ordering is sketched below.
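
A sketch of this layer-then-size ordering, assuming exposed surface area has already been computed per object (names are illustrative):

```python
def grab_order_by_layer(layers, exposed_area):
    """Order graspable objects layer by layer, larger objects first.

    layers:       point cloud layers, top layer first, with overlapped
                  objects already removed.
    exposed_area: mapping from object id to exposed surface area
                  (volume could be used instead of, or alongside, area).
    """
    order = []
    for layer in layers:
        order.extend(sorted(layer, key=lambda o: exposed_area[o],
                            reverse=True))
    return order
```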

In conclusion, by setting a coverage detection area, this approach can accurately identify overlapped article objects, so that they are excluded when determining the objects to be grabbed, which prevents other articles from flying out and being damaged and improves the reliability of the grabbing process. In addition, by setting point cloud layers, articles can be grabbed sequentially from top to bottom, and when a layer contains multiple articles, the larger articles can be grabbed preferentially according to their sizes.

Fig. 3 is a schematic structural view of an object grabbing apparatus according to another embodiment of the present invention. As shown in fig. 3, the apparatus includes:

the clustering module 31 is adapted to perform clustering processing on the point clouds corresponding to the three-dimensional article areas to obtain a plurality of point cloud clusters corresponding to the article objects;

the calculation module 32 is adapted to calculate pose key points and depth values of the pose key points of the article objects corresponding to the point cloud clusters, and perform layering processing on the article objects corresponding to the point cloud clusters according to the depth values to obtain a plurality of point cloud layers sequentially arranged from the top layer to the bottom layer;

the determining module 33 is adapted to take the article objects contained in the current point cloud layer as target article objects in sequence from the top layer to the bottom layer, and, when it is determined that the coverage detection area corresponding to the pose key point of a target article object contains an article object in a lower point cloud layer than the current point cloud layer, to determine that article object as an overlapped article object;

and the grabbing module 34 is adapted to determine the article objects contained in each point cloud layer, other than the overlapped article objects, as the objects to be grabbed.

Optionally, the coverage detection area corresponding to the pose key point of the target article object comprises: a cone region corresponding to the pose key point of the target article object;

the top of the cone region is determined according to the pose key point of the target article object, and the bottom of the cone region is located in a lower point cloud layer than the current point cloud layer where the target article object is located.

Optionally, the cone region comprises: a conical region and/or a truncated-cone region;

when the cone region is a conical region, the apex of the cone is determined according to the pose key point of the target article object; when the cone region is a truncated-cone region, the upper base of the frustum is determined according to the pose key point of the target article object.

Optionally, the number of point cloud layers arranged in sequence from the top layer to the bottom layer is N, and the topmost layer is the 1st point cloud layer; when the current point cloud layer where the target article object is located is the Mth point cloud layer, the lower point cloud layers corresponding to the current layer comprise: the N-M point cloud layers below the Mth layer; where N and M are both natural numbers, and M is less than or equal to N.

Optionally, the calculation module is specifically adapted to:

acquiring the three-dimensional position coordinates of each data point contained in the point cloud cluster, and determining the position information of the pose key point of the article object corresponding to the point cloud cluster according to the preset operation result corresponding to the three-dimensional position coordinates of each data point;

and analyzing each data point contained in the point cloud cluster by principal component analysis, and determining the three-dimensional attitude information of the pose key point according to the analysis result.

Optionally, the calculation module is specifically adapted to:

sorting the article objects corresponding to the point cloud clusters according to the depth values, and dividing the sorted point cloud clusters into a plurality of point cloud layers according to a layering threshold;

if the difference between the depth values of two point cloud clusters is smaller than the layering threshold, the two point cloud clusters belong to the same point cloud layer; if the difference between the depth values of two point cloud clusters is not smaller than the layering threshold, the two point cloud clusters belong to different point cloud layers.

Optionally, the grasping module is further adapted to:

sequencing the objects to be grabbed according to the depth values of the pose key points of the objects to be grabbed, and determining the grabbing sequence of the objects to be grabbed according to the sequencing result;

the depth value is a coordinate value of the article object on the depth coordinate axis, where the depth coordinate axis is set according to the photographing direction of the camera, the direction of gravity, or the direction of the normal of the article bearing surface.

Optionally, the grabbing module is specifically adapted to:

sequencing the objects to be grabbed according to the distance between the objects to be grabbed and the camera or the object bearing surface, and determining the grabbing sequence of the objects to be grabbed according to the sequencing result;

the closer an object to be grabbed is to the camera, the earlier its grabbing sequence; the farther from the camera, the later its grabbing sequence; the closer an object is to the article bearing surface, the later its grabbing sequence; the farther from the article bearing surface, the earlier its grabbing sequence.

Optionally, the grasping module is further adapted to:

acquiring a conversion relation between a camera coordinate system and a robot coordinate system;

and converting the pose key points of each object to be grabbed corresponding to the camera coordinate system into a robot coordinate system according to the conversion relation, and outputting the converted pose key points of each object to be grabbed to the robot so as to enable the robot to execute grabbing operation.

Optionally, the three-dimensional article area contains a plurality of objects to be grabbed stacked along a preset depth direction, where the objects to be grabbed include: cartons, envelopes, plastic bags, cosmetic or pharmaceutical products, and/or toys.

The specific structure and the working principle of each module may refer to the description of the corresponding step in the method embodiment, and are not described herein again.

An embodiment of the present application provides a non-volatile computer storage medium storing at least one executable instruction, where the computer-executable instruction can execute the object grabbing method in any of the above method embodiments.

Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the electronic device.

As shown in fig. 4, the electronic device may include: a processor 402, a communications interface 404, a memory 406, and a communications bus 408.

Wherein:

the processor 402, communication interface 404, and memory 406 communicate with each other via a communication bus 408.

A communication interface 404 for communicating with network elements of other devices, such as clients or other servers.

The processor 402 is configured to execute the program 410, and may specifically perform the relevant steps in the above embodiments of the object grabbing method.

In particular, program 410 may include program code comprising computer operating instructions.

The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The electronic device includes one or more processors, which may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.

And a memory 406 for storing a program 410. The memory 406 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.

The program 410 may be specifically configured to cause the processor 402 to perform the operations in the above-described method embodiments.

The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.

In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.

Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.

The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components in an electronic device according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.
