Depth information acquisition method and device, electronic equipment and computer readable medium

Document No.: 1954768    Publication date: 2021-12-10

Reading note: this technology, "Depth information acquisition method and device, electronic equipment and computer readable medium", was designed and created by 徐鑫 and 张亮亮 on 2021-09-06. Its main content is as follows: the embodiments of the present disclosure disclose a depth information acquisition method and apparatus, an electronic device and a computer readable medium. One embodiment of the method comprises: acquiring an image frame sequence, wherein the image frame sequence comprises image frames continuously acquired by an image acquisition device, and the image frames comprise depth information characterizing the distance between an object in the image frames and the image acquisition device; for two adjacent image frames in the image frame sequence, determining the position relation between objects in the two adjacent image frames to obtain a position relation sequence; and constructing a three-dimensional space based on the position relation sequence, and acquiring target depth information of a target object through the three-dimensional space. This embodiment improves the accuracy of the target depth information.

1. A depth information acquisition method, comprising:

acquiring an image frame sequence, wherein the image frame sequence comprises image frames continuously acquired by an image acquisition device, and the image frames comprise depth information used for representing the distance between an object in the image frames and the image acquisition device;

for two adjacent image frames in the image frame sequence, determining the position relation between objects in the two adjacent image frames to obtain a position relation sequence;

and constructing a three-dimensional space based on the position relation sequence, and acquiring target depth information of the target object through the three-dimensional space.

2. The method of claim 1, wherein the determining the position relation between the objects in the two adjacent image frames comprises:

determining a static object image area and a moving object image area in two adjacent image frames;

and determining the position relation between the objects according to the position change of the moving object image area in the two adjacent image frames.

3. The method of claim 2, wherein an image frame of the sequence of image frames comprises at least one semantic label describing an object in the image frame; and

the determining of the static object image area and the moving object image area in the two adjacent image frames comprises:

for each of the two adjacent image frames, determining at least one superpixel region in the image frame, and determining a static object image region and a moving object image region in the at least one superpixel region based on the semantic label.

4. The method of claim 2, wherein the determining the position relation between the objects through the position change of the moving object image area in the two adjacent image frames comprises:

setting at least one corresponding static position marker point in the same static object image area in the two adjacent image frames, and setting at least one corresponding moving position marker point in the same moving object image area;

and determining the position relation between the corresponding moving objects based on the depth information, the at least one static position marker point and the at least one moving position marker point.

5. The method of claim 4, wherein the determining the position relation between corresponding moving objects based on the depth information, the at least one static position marker point, and the at least one moving position marker point comprises:

for each image frame in the two adjacent image frames, correspondingly constructing a plurality of line segments between the at least one static position marker point and the at least one moving position marker point;

determining the position variation of the moving object according to the length difference of the corresponding line segments in the two adjacent image frames;

determining the position relation between corresponding moving objects based on the depth information and the position variation, wherein the position relation comprises any one of the following items: common edge, coplanarity and occlusion.

6. The method of claim 1, wherein the image frame comprises location information; and

the constructing of the three-dimensional space based on the position relation sequence comprises:

for each position relation in the position relation sequence, determining target position information corresponding to the position relation, and determining a perspective three-dimensional image according to the position relation and the target position information;

and constructing a three-dimensional space based on the perspective three-dimensional image set corresponding to the position relation sequence.

7. The method of claim 6, wherein the image frames comprise depth information, wherein the depth information is used to characterize a distance between an object in an image frame and the image acquisition device; and

the determining of the perspective three-dimensional image through the position relation and the target position information comprises:

determining the relative position between the objects according to the depth information and the position relation;

and determining a perspective three-dimensional image according to the target position information and the relative position, wherein the perspective three-dimensional image is used for representing an image captured from the viewing angle of the image acquisition device at a target position corresponding to the target position information.

8. The method according to claim 6, wherein the constructing a three-dimensional space based on the set of perspective three-dimensional images corresponding to the sequence of position relationships comprises:

fusing the perspective three-dimensional images in the perspective three-dimensional image set to obtain an initial three-dimensional space;

and smoothing the initial three-dimensional space to obtain a target three-dimensional space.

9. The method of claim 1, wherein the method further comprises:

and determining the pose information of the object in the three-dimensional space.

10. A depth information acquisition apparatus comprising:

an image frame sequence acquisition unit configured to acquire an image frame sequence comprising image frames continuously acquired by an image acquisition device, the image frames comprising depth information characterizing a distance between an object in the image frame and the image acquisition device;

the position relation acquisition unit is configured to determine the position relation between objects in two adjacent image frames in the image frame sequence to obtain a position relation sequence;

a depth information acquisition unit configured to construct a three-dimensional space based on the sequence of positional relationships and acquire target depth information of a target object through the three-dimensional space.

11. An electronic device, comprising:

one or more processors;

a storage device having one or more programs stored thereon; and

an image acquisition device for acquiring image frames containing depth information and semantic labels,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-9.

12. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1 to 9.

Technical Field

The embodiment of the disclosure relates to the technical field of computers, in particular to a depth information acquisition method, a depth information acquisition device, electronic equipment and a computer readable medium.

Background

With the development of modern industrial technology, and especially the requirements of the robotics and autonomous driving fields, depth estimation in complex scenes has become one of the key areas of interest for researchers. Existing methods usually rely on features such as points, lines and surfaces between objects to perform depth estimation, but they have the following defect:

existing methods generally use depth information of an object acquired at a single position or a single moment by an image acquisition device capable of measuring depth. Due to various real-world interferences, the image acquisition device usually cannot accurately determine the point, line and surface features of an object at a single position or moment, so the accuracy of the acquired depth information is low.

disclosure of Invention

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

Some embodiments of the present disclosure propose a depth information acquisition method, apparatus, electronic device, and computer readable medium to solve the technical problems mentioned in the background section above.

In a first aspect, some embodiments of the present disclosure provide a depth information acquisition method, including: acquiring an image frame sequence, wherein the image frame sequence comprises image frames continuously acquired by an image acquisition device, and the image frames comprise depth information used for characterizing the distance between an object in the image frames and the image acquisition device; for two adjacent image frames in the image frame sequence, determining the position relation between objects in the two adjacent image frames to obtain a position relation sequence; and constructing a three-dimensional space based on the position relation sequence, and acquiring target depth information of the target object through the three-dimensional space.

In a second aspect, some embodiments of the present disclosure provide a depth information acquiring apparatus, including: an image frame sequence acquiring unit configured to acquire an image frame sequence including image frames continuously acquired by an image acquisition device, the image frames including depth information characterizing a distance between an object in the image frames and the image acquisition device; the position relation acquisition unit is configured to determine the position relation between objects in two adjacent image frames in the image frame sequence to obtain a position relation sequence; and a depth information acquisition unit configured to construct a three-dimensional space based on the position relationship sequence and acquire target depth information of the target object through the three-dimensional space.

In a third aspect, some embodiments of the present disclosure provide an electronic device, comprising: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement the method described in any of the implementations of the first aspect.

In a fourth aspect, some embodiments of the present disclosure provide a computer readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method described in any of the implementations of the first aspect.

The above embodiments of the present disclosure have the following beneficial effect: the depth information obtained by the depth information acquisition method of some embodiments of the present disclosure has improved accuracy. Specifically, the reason the accuracy of existing depth information is low is that it is usually acquired by an image acquisition device at a single position or a single moment, and is easily disturbed by various factors. Based on this, the depth information acquisition method of some embodiments of the present disclosure first acquires an image frame sequence obtained by an image acquisition device continuously acquiring image frames; it then determines the position relation between objects in adjacent image frames of the sequence, so that small changes in the position relation between objects can be captured. A three-dimensional space is then constructed based on the position relation sequence, which greatly improves the accuracy of the position relations between objects in the three-dimensional space. On this basis, the target depth information of the target object is obtained through the three-dimensional space, improving the accuracy of the target depth information.

Drawings

The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements are not necessarily drawn to scale.

Fig. 1 is a schematic view of an application scenario of a depth information acquisition method of some embodiments of the present disclosure;

fig. 2 is a flow diagram of some embodiments of a depth information acquisition method according to the present disclosure;

FIG. 3 is a flow diagram of further embodiments of a depth information acquisition method according to the present disclosure;

FIG. 4 is a flow diagram of still further embodiments of depth information acquisition methods according to the present disclosure;

fig. 5 is a schematic structural diagram of some embodiments of a depth information acquisition apparatus according to the present disclosure;

FIG. 6 is a schematic structural diagram of an electronic device suitable for use in implementing some embodiments of the present disclosure.

Detailed Description

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.

It should be noted that, for convenience of description, only the portions related to the relevant invention are shown in the drawings. The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.

It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.

It is noted that references to "a" or "an" in this disclosure are illustrative rather than limiting, and those skilled in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.

The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.

The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 is a schematic diagram of an application scenario of a depth information acquisition method according to some embodiments of the present disclosure.

As shown in fig. 1, an electronic device 101 may acquire a sequence of image frames 102. The image frame sequence 102 includes image frames continuously acquired by an image acquisition device, and the image frames in the image frame sequence 102 include depth information used for characterizing a distance between an object in the image frame and the image acquisition device. The electronic device 101 may determine a positional relationship 103 between objects in two adjacent image frames in the image frame sequence 102, and obtain a positional relationship sequence corresponding to the image frame sequence 102. The electronic device 101 may construct a three-dimensional space based on the position relationship sequence, and the three-dimensional space includes accurate information of the position, point, line, plane, and other features of the object. The electronic device 101 may acquire target depth information of the target object based on the three-dimensional space. Therefore, the accuracy of the acquired depth information is improved.

It should be understood that the number of electronic devices 101 in fig. 1 is merely illustrative. There may be any number of electronic devices 101, as desired for implementation.

With continued reference to fig. 2, fig. 2 illustrates a flow 200 of some embodiments of a depth information acquisition method according to the present disclosure. The depth information acquisition method comprises the following steps:

Step 201, an image frame sequence is acquired.

In some embodiments, the execution subject of the depth information acquisition method (e.g., the electronic device 101 shown in fig. 1) may acquire the image frame sequence through a wired connection or a wireless connection. It should be noted that the wireless connection means may include, but is not limited to, a 3G/4G/5G connection, a WiFi connection, a Bluetooth connection, a WiMAX connection, a ZigBee connection, a UWB (Ultra-Wideband) connection, and other wireless connection means now known or developed in the future.

The execution subject may acquire a sequence of image frames captured by the image acquisition device. The image frame sequence may be acquired by rotating the image acquisition device through a plurality of angles while it is stationary, or by capturing frames while the device is moving, including rotating it through an angle during the movement. The image acquisition device may include a camera equipped with a component for measuring depth information (e.g., a distance sensor). The image acquisition device may also collect point cloud data to obtain the image frame sequence. Correspondingly, the image frame sequence comprises image frames continuously acquired by the image acquisition device, and the image frames comprise depth information characterizing the distance between an object in the image frames and the image acquisition device.
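
By way of illustration only, the frame record consumed by the following steps can be pictured as an aligned color image, depth map, and semantic label map. The sketch below is a minimal rendering of that assumption; the ImageFrame dataclass, its field names, and the capture.read() call are hypothetical and are not part of the disclosure.

```python
# A minimal sketch of one image frame as used below; all names are illustrative.
from dataclasses import dataclass
import numpy as np

@dataclass
class ImageFrame:
    rgb: np.ndarray       # (H, W, 3) color image
    depth: np.ndarray     # (H, W) distance from each pixel to the camera
    semantic: np.ndarray  # (H, W) per-pixel semantic label ids

def acquire_sequence(capture, n_frames):
    """Collect n consecutive frames from a capture source that is assumed
    to return an (rgb, depth, semantic) tuple per read() call."""
    return [ImageFrame(*capture.read()) for _ in range(n_frames)]
```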

Step 202, determining the position relationship between the objects in the two adjacent image frames in the image frame sequence to obtain a position relationship sequence.

In some embodiments, the image frames in the image frame sequence are acquired continuously by the image acquisition device. Two adjacent image frames therefore capture the position change between objects, and the position relation between the objects can be determined from this change. Accordingly, a position relation sequence can be obtained from the image frame sequence, and this sequence can represent the continuous change of the position relations between objects in the image frames. The position relations of objects can thus be described from multiple viewing angles, allowing accurate position relations between objects in three-dimensional space to be acquired.

Step 203, constructing a three-dimensional space based on the position relation sequence, and acquiring target depth information of the target object through the three-dimensional space.

In some embodiments, the position relationship sequence can represent continuous changes of the position relationship between objects in the image frame, and further, the three-dimensional space constructed based on the position relationship sequence can realize multi-angle correction of the position, point, line, surface and other characteristics of the object in the three-dimensional space, so that the accuracy of the three-dimensional space is improved. Further, upon receiving a request to obtain depth information of a target object, accurate target depth information of the target object may be obtained in a three-dimensional space.

The depth information obtained by the depth information acquisition method disclosed in some embodiments of the present disclosure has improved accuracy. Specifically, the reason the accuracy of existing depth information is low is that it is usually acquired by an image acquisition device at a single position or a single moment, and is easily disturbed by various factors. Based on this, the depth information acquisition method of some embodiments of the present disclosure first acquires an image frame sequence obtained by an image acquisition device continuously acquiring image frames; it then determines the position relation between objects in adjacent image frames of the sequence, so that small changes in the position relation between objects can be captured. A three-dimensional space is then constructed based on the position relation sequence, which greatly improves the accuracy of the position relations between objects in the three-dimensional space. On this basis, the target depth information of the target object is obtained through the three-dimensional space, improving the accuracy of the target depth information.

With continued reference to fig. 3, fig. 3 illustrates a flow 300 of some embodiments of a depth information acquisition method according to the present disclosure. The depth information acquisition method comprises the following steps:

Step 301, a sequence of image frames is acquired.

The content of step 301 is the same as that of step 201, and is not described in detail here.

Step 302, for two adjacent image frames in the image frame sequence, determining a static object image region and a moving object image region in the two adjacent image frames.

The execution subject may compare two adjacent image frames in the image frame sequence to determine a static object image region and a moving object image region in the two adjacent image frames. For example, the execution subject may identify a static object and a moving object in the image frames by image recognition or the like, and from these determine the static object image region and the moving object image region.

In some optional implementations of some embodiments, the determining of the static object image region and the moving object image region in the two adjacent image frames may include: for each of the two adjacent image frames, determining at least one superpixel region in the image frame, and determining a static object image region and a moving object image region in the at least one superpixel region based on the semantic labels.

The execution subject may first determine at least one superpixel region in the image frame. A superpixel region may be a region formed by contiguous pixels with the same or similar pixel values. In practice, multiple objects with the same or similar colors may cluster together in an image frame. Because of this, the execution subject may further separate static object image regions from moving object image regions within the superpixel regions using semantic labels. The image frames in the image frame sequence include at least one semantic label, which describes an object in the image frame; the semantic labels can be obtained by performing image recognition on the image frame. The semantic labels thus improve the accuracy of identifying static object image regions and moving object image regions.
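
One way to realize this step, offered as a hedged sketch only: oversegment the frame with the SLIC superpixel algorithm from scikit-image and assign each superpixel the majority class of the per-pixel semantic map. The class-id sets and the segmentation parameters are illustrative assumptions, not values from the disclosure.

```python
# Superpixel segmentation plus semantic majority vote; parameters are assumed.
import numpy as np
from skimage.segmentation import slic

STATIC_CLASSES = {0, 1}  # hypothetical label ids for static objects (e.g., road)
MOVING_CLASSES = {2, 3}  # hypothetical label ids for movable objects (e.g., car)

def split_regions(image, semantic_map, n_segments=200):
    """Return boolean masks of the static and moving object image regions."""
    # Oversegment the frame into superpixels of contiguous, similar pixels.
    segments = slic(image, n_segments=n_segments, compactness=10.0, start_label=0)
    static_mask = np.zeros(segments.shape, dtype=bool)
    moving_mask = np.zeros(segments.shape, dtype=bool)
    for seg_id in np.unique(segments):
        region = segments == seg_id
        # Give each superpixel the majority semantic class of its pixels.
        labels, counts = np.unique(semantic_map[region], return_counts=True)
        majority = labels[np.argmax(counts)]
        if majority in STATIC_CLASSES:
            static_mask |= region
        elif majority in MOVING_CLASSES:
            moving_mask |= region
    return static_mask, moving_mask
```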

Step 303, determining the position relationship between the objects according to the position changes of the moving object image regions in the two adjacent image frames, so as to obtain a position relationship sequence.

After determining the static object image regions and the moving object image regions, the execution subject can compare the position changes of the moving object image regions in the two adjacent image frames, and thereby determine the position relations between the objects.

In some optional implementations of some embodiments, the determining the position relationship between the objects according to the position changes of the moving object image regions in the two adjacent image frames may include:

firstly, at least one static position marker point is set in the same static object image area in the two adjacent image frames, and at least one moving position marker point is set in the same moving object image area.

In order to determine the position relations between objects, the execution subject may set at least one static position marker point and at least one moving position marker point at the same positions on the same objects in the two adjacent image frames. Optionally, the static position marker points and the moving position marker points may be placed at positions, such as corners or edges of an object, whose features accurately characterize that object.

And secondly, determining the position relation between the corresponding moving objects based on the depth information, the at least one static position marker point and the at least one moving position marker point.

The depth information may characterize the distance between an object in the image frame and the image acquisition device. Based on the changes of the static position marker points and the moving position marker points across the two adjacent image frames, together with the depth information, the position relation between each object and the image acquisition device can be determined, and from that the position relations between the objects.

In some optional implementations of some embodiments, the determining the position relation between corresponding moving objects based on the depth information, the at least one static position marker point, and the at least one moving position marker point may include:

firstly, for each image frame in the two adjacent image frames, a plurality of line segments are correspondingly constructed between the at least one static position marker point and the at least one moving position marker point.

The execution subject may construct a plurality of line segments between the static position marker points and the moving position marker points in each of the two adjacent image frames. So that the marker points characterize an object accurately, the line segments may be constructed between one static position marker point and a plurality of moving position marker points, between a plurality of static position marker points and one moving position marker point, or between a plurality of static position marker points and a plurality of moving position marker points.

And secondly, determining the position variation of the moving object according to the length difference of the corresponding line segments in the two adjacent image frames.

The execution subject may compare the lengths of corresponding line segments (segments connecting the same pair of marker points) in the two adjacent image frames, and determine the amount of position change of the moving object from the length difference.

And thirdly, determining the position relation between the corresponding moving objects based on the depth information and the position variation.

From the depth information and the position variation, the execution subject can determine position relations such as common edge, coplanarity and occlusion between the corresponding moving objects. Here, common edge may mean that the position variations of the two objects are the same but their depth information differs; coplanarity may mean that their position variations differ but their depth information is the same; occlusion may mean that neither their position variations nor their depth information is the same. A sketch of this classification follows.
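
This is a minimal sketch under the following assumptions: marker points are 2D pixel coordinates, each object's depth is summarized as a scalar, and the tolerance tol is an illustrative parameter; none of these specifics come from the disclosure.

```python
# Segment lengths, position variation, and the three-way relationship test.
import numpy as np

def segment_lengths(static_pts, moving_pts):
    """Lengths of the segments between every (static, moving) marker pair."""
    s = np.asarray(static_pts, dtype=float)  # (S, 2) pixel coordinates
    m = np.asarray(moving_pts, dtype=float)  # (M, 2) pixel coordinates
    return np.linalg.norm(s[:, None, :] - m[None, :, :], axis=-1)  # (S, M)

def position_variation(lengths_prev, lengths_next):
    """Mean absolute change of corresponding segment lengths across frames."""
    return float(np.mean(np.abs(np.asarray(lengths_next) - np.asarray(lengths_prev))))

def classify_relationship(var_1, depth_1, var_2, depth_2, tol=1e-3):
    """Classify two moving objects per the rules stated above."""
    same_variation = abs(var_1 - var_2) <= tol
    same_depth = abs(depth_1 - depth_2) <= tol
    if same_variation and not same_depth:
        return "common edge"
    if not same_variation and same_depth:
        return "coplanar"
    if not same_variation and not same_depth:
        return "occlusion"
    return "undetermined"  # same variation and same depth is not covered above
```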

Step 304, for each position relation in the position relation sequence, determining target position information corresponding to the position relation, and determining a perspective three-dimensional image according to the position relation and the target position information.

The execution subject may determine the target position information from the position relation. For example, if the image acquisition device is at position 1 when the earlier of the two adjacent image frames is captured and at position 2 when the later frame is captured, then the target position information determined from the position relation can represent the change between the image frames as the device moves from position 1 to position 2, i.e., the amount of position change of the image acquisition device in the three-dimensional space. A perspective three-dimensional image can then be determined from the position relation and the target position information. The perspective three-dimensional image is an image acquired from the viewing angle of the image acquisition device at the target position corresponding to the target position information. Because the target position information is the position change of the image acquisition device in the three-dimensional space, the perspective three-dimensional image may correspondingly be formed from the pixel differences between corresponding pixels of the two adjacent image frames. This establishes the correspondence between images and the positions of the image acquisition device, improves the accuracy of the relative positions between objects and the device in the three-dimensional space, and helps the constructed three-dimensional space yield accurate depth information.

As can be seen from the above description, the position relation is obtained from two adjacent image frames, so the target position information can be regarded as establishing a correspondence between the two adjacent frames. Determining the perspective three-dimensional image through the target position information therefore corrects and fuses the image frames via that information, improving the accuracy of the constructed three-dimensional space.

In some optional implementations of some embodiments, the determining the perspective three-dimensional image according to the position relationship and the target position information may include:

and step one, determining the relative position between the objects according to the depth information and the position relation.

The distance between the image acquisition equipment and the object can be determined through the depth information, and then the relative position between the objects can be determined according to the position relation. The relative position may include a front-back position, a left-right position, and the like.

And secondly, determining a perspective three-dimensional image according to the target position information and the relative position.

The execution subject can determine the viewing angle at the target position corresponding to the target position information, and then, from the relative positions, determine where each object appears within that viewing angle, thereby determining the perspective three-dimensional image. The perspective three-dimensional image represents an image acquired from the viewing angle of the image acquisition device at the target position corresponding to the target position information.
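
As a hedged illustration of forming such a view, the sketch below lifts a depth map to 3D points under a pinhole-camera assumption and moves them to a target viewpoint with a 4x4 pose matrix. The intrinsics fx, fy, cx, cy and the pose convention are assumptions made for illustration; the disclosure does not specify a camera model.

```python
# Back-project a depth map into 3D points seen from a target viewpoint;
# the pinhole model and all parameter names are illustrative assumptions.
import numpy as np

def backproject(depth, fx, fy, cx, cy, pose=None):
    """Return an (H*W, 3) array of 3D points for a depth map.

    pose: optional 4x4 transform taking camera-frame points into the
    target viewpoint's frame (defaults to identity).
    """
    if pose is None:
        pose = np.eye(4)
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel grids, shape (h, w)
    z = depth.ravel().astype(float)
    x = (u.ravel() - cx) * z / fx
    y = (v.ravel() - cy) * z / fy
    pts = np.stack([x, y, z, np.ones_like(z)])  # 4 x (H*W) homogeneous points
    return (pose @ pts)[:3].T
```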

Step 305, constructing a three-dimensional space based on the perspective three-dimensional image set corresponding to the position relation sequence, and acquiring target depth information of the target object through the three-dimensional space.

A perspective three-dimensional image is obtained from two adjacent image frames, and the position relation sequence spans a plurality of image frames, so a set of perspective three-dimensional images can be obtained. The execution subject can project the perspective three-dimensional images into a three-dimensional space and gradually construct that space by matching the same object images across adjacent perspective three-dimensional images. This three-dimensional space is the three-dimensional space corresponding to the image frame sequence.

In some optional implementations of some embodiments, the constructing a three-dimensional space based on the set of perspective three-dimensional images corresponding to the above-mentioned sequence of position relationships may include:

and step one, fusing the view three-dimensional images in the view three-dimensional image set to obtain an initial three-dimensional space.

The perspective three-dimensional images establish the correspondence between images and the positions of the image acquisition device, which helps improve the accuracy of the relative positions between objects and the device in the three-dimensional space. Based on this, the execution subject can fuse the perspective three-dimensional images in the set sequentially, in the order they were obtained, matching object positions against the space once per fusion, and arrive at the initial three-dimensional space through this gradual fusion. The initial three-dimensional space satisfies the position constraints of the objects seen from multiple viewing angles, so it conforms to the actual space as closely as possible.
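
A minimal fusion sketch, assuming each perspective three-dimensional image has already been lifted to a point cloud in a shared world frame (for example with backproject above): the clouds are concatenated and deduplicated on a voxel grid. The voxel size is an illustrative parameter.

```python
# Merge per-view point clouds into one initial space; the voxel size is assumed.
import numpy as np

def fuse_point_clouds(clouds, voxel=0.05):
    """Concatenate (N_i, 3) point clouds, keeping one point per voxel cell."""
    merged = np.concatenate(list(clouds), axis=0)
    keys = np.floor(merged / voxel).astype(np.int64)     # voxel index per point
    _, idx = np.unique(keys, axis=0, return_index=True)  # first point per voxel
    return merged[np.sort(idx)]
```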

And step two, smoothing the initial three-dimensional space to obtain a target three-dimensional space.

In practice, the image acquisition device may be affected by vibration and the like, so the acquired image frames may deviate in multiple directions. The execution subject can smooth the initial three-dimensional space so that the target three-dimensional space conforms to the actual situation, improving the accuracy of the depth information obtained from the three-dimensional space.
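
One possible smoothing pass, offered as a sketch rather than the disclosed procedure: pull each point of the fused cloud toward the centroid of its k nearest neighbors to damp vibration jitter, using a KD-tree from SciPy. The neighbor count k and blend factor alpha are illustrative assumptions.

```python
# Neighborhood-average smoothing of a fused point cloud; k and alpha assumed.
import numpy as np
from scipy.spatial import cKDTree

def smooth_cloud(points, k=8, alpha=0.5):
    """Blend each point with the centroid of its k nearest neighbors
    (the neighbor set includes the point itself at distance zero)."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)       # idx: (N, k) neighbor indices
    centroids = points[idx].mean(axis=1)   # (N, 3) neighborhood means
    return (1.0 - alpha) * points + alpha * centroids
```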

With further reference to fig. 4, a flow 400 of further embodiments of a depth information acquisition method is shown. The process 400 of the depth information obtaining method includes the following steps:

Step 401, an image frame sequence is acquired.

Step 402, determining the position relation between the objects in the two adjacent image frames in the image frame sequence to obtain a position relation sequence.

Step 403, constructing a three-dimensional space based on the position relation sequence, and acquiring target depth information of the target object through the three-dimensional space.

The contents of steps 401 to 403 are the same as those of steps 201 to 203, and are not described in detail here.

Step 404, determining the pose information of the object in the three-dimensional space.

In some embodiments, when the execution subject receives a request to acquire the pose of the target object, a corresponding viewing angle may be set in the three-dimensional space, and the pose information of the object at that viewing angle acquired. Because viewing angles can be placed in the three-dimensional space with great freedom, pose information of the object can be acquired from multiple viewing angles, which facilitates adjusting the motion state of the object based on the pose information.

With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides some embodiments of a depth information acquisition apparatus. These apparatus embodiments correspond to the method embodiments shown in fig. 2, and the apparatus may be applied in various electronic devices.

As shown in fig. 5, the depth information acquiring apparatus 500 of some embodiments includes: an image frame sequence acquisition unit 501, a positional relationship acquisition unit 502, and a depth information acquisition unit 503. The image frame sequence acquiring unit 501 is configured to acquire an image frame sequence, where the image frame sequence includes image frames continuously acquired by an image acquisition device, and the image frames include depth information used for characterizing a distance between an object in the image frame and the image acquisition device; a position relation obtaining unit 502 configured to determine, for two adjacent image frames in the image frame sequence, a position relation between objects in the two adjacent image frames to obtain a position relation sequence; a depth information acquiring unit 503 configured to construct a three-dimensional space based on the positional relationship sequence, and acquire target depth information of the target object through the three-dimensional space.

In an optional implementation manner of some embodiments, the position relation acquisition unit 502 may include: an image area determination subunit (not shown in the figure) and a position relation acquisition subunit (not shown in the figure). The image area determination subunit is configured to determine a static object image area and a moving object image area in the two adjacent image frames; and the position relation acquisition subunit is configured to determine the position relation between the objects according to the position changes of the moving object image areas in the two adjacent image frames.

In an optional implementation of some embodiments, an image frame in the image frame sequence includes at least one semantic label, where the semantic label is used to describe an object in the image frame; and the image area determination subunit may include: an image region determination module (not shown in the figure) configured to determine, for each of the two adjacent image frames, at least one superpixel region in the image frame, and determine a static object image region and a moving object image region in the at least one superpixel region based on the semantic labels.

In an optional implementation manner of some embodiments, the position relation acquisition subunit may include: a marker point setting module (not shown in the figure) and a position relation determination module (not shown in the figure). The marker point setting module is configured to set at least one corresponding static position marker point in the same static object image area in the two adjacent image frames, and set at least one corresponding moving position marker point in the same moving object image area; and the position relation determination module is configured to determine the position relation between the corresponding moving objects based on the depth information, the at least one static position marker point and the at least one moving position marker point.

In an optional implementation manner of some embodiments, the position relation determination module may include: a line segment construction sub-module (not shown), a position variation determination sub-module (not shown), and a position relation determination sub-module (not shown). The line segment construction sub-module is configured to correspondingly construct a plurality of line segments between at least one static position marker point and at least one moving position marker point for each image frame in the two adjacent image frames; the position variation determination sub-module is configured to determine the position variation of the moving object according to the length difference of the corresponding line segments in the two adjacent image frames; and the position relation determination sub-module is configured to determine the position relation between corresponding moving objects based on the depth information and the position variation, the position relation including any one of the following: common edge, coplanarity and occlusion.

In an optional implementation of some embodiments, the image frame includes location information; and the depth information acquisition unit 503 may include: a perspective three-dimensional image determination subunit (not shown in the figure) and a three-dimensional space construction subunit (not shown in the figure). The perspective three-dimensional image determination subunit is configured to determine, for each position relation in the position relation sequence, target position information corresponding to the position relation, and determine a perspective three-dimensional image according to the position relation and the target position information; and the three-dimensional space construction subunit is configured to construct a three-dimensional space based on the perspective three-dimensional image set corresponding to the position relation sequence.

In an optional implementation manner of some embodiments, the image frame includes depth information, where the depth information is used to characterize the distance between an object in the image frame and the image acquisition device; and the perspective three-dimensional image determination subunit may include: a relative position determination module (not shown in the figures) and a perspective three-dimensional image determination module (not shown in the figures). The relative position determination module is configured to determine the relative position between the objects according to the depth information and the position relation; and the perspective three-dimensional image determination module is configured to determine a perspective three-dimensional image according to the target position information and the relative position, wherein the perspective three-dimensional image is used for representing an image captured from the viewing angle of the image acquisition device at a target position corresponding to the target position information.

In an optional implementation manner of some embodiments, the three-dimensional space construction subunit may include: an initial three-dimensional space acquisition module (not shown in the figure) and a three-dimensional space construction module (not shown in the figure). The initial three-dimensional space acquisition module is configured to fuse the perspective three-dimensional images in the perspective three-dimensional image set to obtain an initial three-dimensional space; and the three-dimensional space construction module is configured to smooth the initial three-dimensional space to obtain a target three-dimensional space.

In an optional implementation manner of some embodiments, the depth information obtaining apparatus 500 may further include: a pose information acquisition unit (not shown in the figure) configured to determine pose information of the object in the above-described three-dimensional space.

It will be understood that the elements described in the apparatus 500 correspond to various steps in the method described with reference to fig. 2. Thus, the operations, features and resulting advantages described above with respect to the method are also applicable to the apparatus 500 and the units included therein, and are not described herein again.

As shown in fig. 6, electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the electronic device 600. The processing device 601, the ROM 602 and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.

In particular, according to some embodiments of the present disclosure, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In some such embodiments, the computer program may be downloaded and installed from a network through the communication device 609, or installed from the storage device 608, or installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of some embodiments of the present disclosure.

It should be noted that the computer readable medium described above in some embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.

The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring an image frame sequence, wherein the image frame sequence comprises image frames continuously acquired by an image acquisition device, and the image frames comprise depth information used for representing the distance between an object in the image frames and the image acquisition device; determining the position relation between objects in the two adjacent image frames in the image frame sequence to obtain a position relation sequence; and constructing a three-dimensional space based on the position relation sequence, and acquiring target depth information of the target object through the three-dimensional space.

Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in some embodiments of the present disclosure may be implemented by software, and may also be implemented by hardware. The described units may also be provided in a processor, and may be described as: a processor includes an image frame sequence acquisition unit, a positional relationship acquisition unit, and a depth information acquisition unit. Here, the names of these units do not constitute a limitation on the unit itself in some cases, and for example, the depth information acquisition unit may also be described as "a unit for acquiring depth information of a target object in a three-dimensional space".

The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.

The foregoing description presents merely some preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept; for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure.
