Three-dimensional tracking using hemispherical or spherical visible depth images

Document No.: 24524  Publication date: 2021-09-21

Note: This technology, "Three-dimensional tracking using hemispherical or spherical visible depth images," was created by Lin Yuan, Deng Fan, and He Chaowen on 2019-09-05. Abstract: Three-dimensional tracking includes obtaining a hemispherical visible light depth image that captures the operating environment of a user device. Obtaining the hemispherical visible light depth image includes obtaining a hemispherical visible light image and obtaining a hemispherical non-visible light depth image. Three-dimensional tracking includes generating a perspective-transformed hemispherical visible light depth image, which includes generating a perspective-transformed hemispherical visible light image and generating a perspective-transformed hemispherical non-visible light depth image. Three-dimensional tracking further includes generating, based on the perspective-transformed hemispherical visible light depth image, object recognition and tracking data representing external objects in the operating environment, and outputting the object recognition and tracking data.

1. A method of three-dimensional tracking, the method comprising:

obtaining a hemispherical visible depth image that captures an operating environment of a user device, wherein obtaining the hemispherical visible depth image comprises:

obtaining a hemispherical visible light image; and

obtaining a hemispherical non-visible light depth image;

generating a perspective-transformed hemispherical visible depth image, wherein generating the perspective-transformed hemispherical visible depth image comprises:

generating a perspective-transformed hemispherical visible light image; and

generating a perspective-transformed hemispherical non-visible light depth image;

generating object recognition and tracking data based on the perspective-transformed hemispherical visible depth image, the data representing external objects in the operating environment; and

outputting the object recognition and tracking data.

2. The method of claim 1, wherein the hemispherical visible light image and the hemispherical non-visible light depth image are synchronized spatiotemporally.

3. The method of claim 1, wherein obtaining the hemispherical non-visible light depth image comprises:

projecting hemispherical non-visible light;

detecting reflected non-visible light after the hemispherical non-visible light is projected; and

determining three-dimensional depth information from the detected reflected non-visible light and the projected hemispherical non-visible light.

4. The method of claim 3, wherein projecting the hemispherical non-visible light comprises: projecting a hemispherical non-visible static structured light pattern.

5. The method of claim 4, wherein projecting the hemispherical non-visible static structured light pattern comprises:

emitting infrared light by an infrared light source;

refracting the emitted infrared light to form a hemispherical projection field; and

rectifying the infrared light of the hemispherical projection field to form the hemispherical non-visible static structured light pattern.

6. The method of claim 1, wherein the generating object recognition and tracking data comprises:

performing feature extraction based on the perspective-transformed hemispherical visible depth image to obtain feature information.

7. The method of claim 6, wherein:

the obtaining of the hemispherical visible depth image comprises: obtaining a sequence of hemispherical visible depth images comprising the hemispherical visible depth image, wherein each hemispherical visible depth image of the sequence of hemispherical visible depth images corresponds to a respective spatiotemporal location in the operating environment; and

the generating of the perspective-transformed hemispherical visible depth image comprises: generating a sequence of perspective-transformed hemispherical visible depth images, wherein the sequence of perspective-transformed hemispherical visible depth images comprises the perspective-transformed hemispherical visible depth image, each perspective-transformed hemispherical visible depth image in the sequence of perspective-transformed hemispherical visible depth images corresponding to a respective hemispherical visible depth image in the sequence of hemispherical visible depth images.

8. The method of claim 7, wherein the obtaining of the feature information comprises: obtaining feature information corresponding to each perspective-transformed hemispherical visible depth image in the sequence of perspective-transformed hemispherical visible depth images.

9. The method of claim 8, wherein the generating object recognition and tracking data comprises:

generating feature matching data based on the feature information;

obtaining object state information based on the feature matching data; and

analyzing the object state based on the object state information.

10. A non-transitory computer-readable storage medium comprising executable instructions that, when executed by a processor, perform operations comprising:

obtaining a hemispherical visible depth image that captures an operating environment of a user device, wherein obtaining the hemispherical visible depth image comprises:

obtaining a hemispherical visible light image; and

obtaining a hemispherical non-visible light depth image;

generating a perspective-transformed hemispherical visible depth image, wherein generating the perspective-transformed hemispherical visible depth image comprises:

generating a perspective-transformed hemispherical visible light image; and

generating a perspective-transformed hemispherical non-visible light depth image;

generating object recognition and tracking data based on the perspective-transformed hemispherical visible depth image, the data representing external objects in the operating environment; and

outputting the object recognition and tracking data.

11. The non-transitory computer readable storage medium of claim 10, wherein the hemispherical visible light image and the hemispherical non-visible light depth image are synchronized spatiotemporally.

12. The non-transitory computer readable storage medium of claim 10, wherein obtaining the hemispherical non-visible light depth image comprises:

projecting hemispherical non-visible light;

detecting reflected non-visible light after the hemispherical non-visible light is projected; and

determining three-dimensional depth information from the detected reflected non-visible light and the projected hemispherical non-visible light.

13. The non-transitory computer readable storage medium of claim 12, wherein projecting the hemispherical non-visible light comprises: projecting a hemispherical non-visible static structured light pattern.

14. The non-transitory computer readable storage medium of claim 13, wherein projecting the hemispherical non-visible static structured light pattern comprises:

emitting infrared light by an infrared light source;

refracting the emitted infrared light to form a hemispherical projection field; and

rectifying the infrared light of the hemispherical projection field to form the hemispherical non-visible static structured light pattern.

15. The non-transitory computer readable storage medium of claim 10, wherein the generating object recognition and tracking data comprises:

performing feature extraction based on the perspective-transformed hemispherical visible depth image to obtain feature information.

16. The non-transitory computer-readable storage medium of claim 15, wherein:

the obtaining of the hemispherical visible depth image comprises: obtaining a sequence of hemispherical visible depth images comprising the hemispherical visible depth image, wherein each hemispherical visible depth image of the sequence of hemispherical visible depth images corresponds to a respective spatiotemporal location in the operating environment; and

the generating of the perspective-transformed hemispherical visible depth image comprises: generating a sequence of perspective-transformed hemispherical visible depth images, wherein the sequence of perspective-transformed hemispherical visible depth images comprises the perspective-transformed hemispherical visible depth image, each perspective-transformed hemispherical visible depth image in the sequence of perspective-transformed hemispherical visible depth images corresponding to a respective hemispherical visible depth image in the sequence of hemispherical visible depth images.

17. The non-transitory computer-readable storage medium of claim 16, wherein the obtaining of the feature information comprises: obtaining feature information corresponding to each perspective-transformed hemispherical visible depth image in the sequence of perspective-transformed hemispherical visible depth images.

18. The non-transitory computer readable storage medium of claim 17, wherein the generating object recognition and tracking data comprises:

generating feature matching data based on the feature information;

obtaining object state information based on the feature matching data; and

analyzing the object state based on the object state information.

19. An apparatus for depth detection, the apparatus comprising:

a hemispherical non-visible light projector;

a hemispherical non-visible light sensor;

a hemispherical visible light sensor;

a non-transitory computer readable medium; and

a processor configured to execute instructions stored on the non-transitory computer-readable medium to:

obtain a hemispherical visible depth image that captures an operating environment of a user device, wherein obtaining the hemispherical visible depth image comprises:

obtaining a hemispherical visible light image; and

obtaining a hemispherical non-visible light depth image;

generate a perspective-transformed hemispherical visible depth image, wherein generating the perspective-transformed hemispherical visible depth image comprises:

generating a perspective-transformed hemispherical visible light image; and

generating a perspective-transformed hemispherical non-visible light depth image;

generate object recognition and tracking data based on the perspective-transformed hemispherical visible depth image, the data representing external objects in the operating environment; and

output the object recognition and tracking data.

20. The apparatus of claim 19, wherein the processor is configured to execute instructions stored on the non-transitory computer-readable medium to:

obtain the hemispherical visible depth image by obtaining a sequence of hemispherical visible depth images, the sequence of hemispherical visible depth images comprising the hemispherical visible depth image, wherein each hemispherical visible depth image of the sequence of hemispherical visible depth images corresponds to a respective spatiotemporal location in the operating environment;

generate the perspective-transformed hemispherical visible depth image by generating a sequence of perspective-transformed hemispherical visible depth images, wherein the sequence of perspective-transformed hemispherical visible depth images includes the perspective-transformed hemispherical visible depth image, each perspective-transformed hemispherical visible depth image in the sequence of perspective-transformed hemispherical visible depth images corresponding to a respective hemispherical visible depth image in the sequence of hemispherical visible depth images;

obtain feature information by performing feature extraction based on the perspective-transformed hemispherical visible depth image, wherein obtaining the feature information includes obtaining feature information corresponding to each perspective-transformed hemispherical visible depth image in the sequence of perspective-transformed hemispherical visible depth images;

generate feature matching data based on the feature information;

obtain object state information based on the feature matching data; and

analyze the object state based on the object state information.

Technical Field

The present application relates to three-dimensional (3D) modeling and tracking, for example in user devices, using hemispherical or spherical visible depth images.

Background

Cameras may be used for capturing images or video, for object detection and tracking, for face recognition, and the like. A method and apparatus for three-dimensional tracking using hemispherical or spherical visible depth images would therefore be advantageous.

Disclosure of Invention

The present disclosure provides embodiments for three-dimensional tracking using hemispherical or spherical visible depth images.

The present disclosure provides, in one aspect, a method for three-dimensional tracking using a hemispherical or spherical visible light depth image. The method includes obtaining a hemispherical visible light depth image that captures the operating environment of a user device. Obtaining the hemispherical visible light depth image includes obtaining a hemispherical visible light image and obtaining a hemispherical non-visible light depth image. The method includes generating a perspective-transformed hemispherical visible light depth image, which includes generating a perspective-transformed hemispherical visible light image and generating a perspective-transformed hemispherical non-visible light depth image. The method further includes generating, based on the perspective-transformed hemispherical visible light depth image, object recognition and tracking data representing external objects in the operating environment, and outputting the object recognition and tracking data.
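
To make the claimed flow easier to follow, the sketch below strings the four operations together as plain functions. It is a minimal illustration only: the capture, perspective-transform, and recognition/tracking steps are passed in as hypothetical callables, since the disclosure does not define a programming interface.

```python
from typing import Callable, Dict
import numpy as np

def track_3d(
    capture_visible: Callable[[], np.ndarray],                 # hemispherical visible light image
    capture_depth: Callable[[], np.ndarray],                   # hemispherical non-visible light depth image
    perspective_transform: Callable[[np.ndarray], np.ndarray], # fisheye-to-perspective conversion
    recognize_and_track: Callable[[np.ndarray, np.ndarray], Dict],
) -> Dict:
    """Obtain, perspective-transform, recognize/track, and return the result."""
    visible = capture_visible()                 # obtain the hemispherical visible light image
    depth = capture_depth()                     # obtain the hemispherical non-visible light depth image
    visible_t = perspective_transform(visible)  # perspective-transformed visible light image
    depth_t = perspective_transform(depth)      # perspective-transformed non-visible light depth image
    return recognize_and_track(visible_t, depth_t)  # object recognition and tracking data
```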

Another aspect of the present disclosure provides an apparatus for three-dimensional tracking using a hemispherical or spherical visible light depth image. The apparatus includes a hemispherical non-visible light projector, a hemispherical non-visible light sensor, a hemispherical visible light sensor, a non-transitory computer readable medium, and a processor configured to execute instructions stored on the non-transitory computer readable medium to obtain a hemispherical visible light depth image that captures the operating environment of a user device. Obtaining the hemispherical visible light depth image includes obtaining a hemispherical visible light image and obtaining a hemispherical non-visible light depth image. The instructions further cause the processor to generate a perspective-transformed hemispherical visible light depth image, which includes generating a perspective-transformed hemispherical visible light image and generating a perspective-transformed hemispherical non-visible light depth image, to generate object recognition and tracking data based on the perspective-transformed hemispherical visible light depth image, the data representing external objects in the operating environment, and to output the object recognition and tracking data.

Yet another aspect of the disclosure provides a non-transitory computer-readable storage medium comprising executable instructions that, when executed by a processor, perform three-dimensional tracking using a hemispherical or spherical visible light depth image. The operations include obtaining a hemispherical visible light depth image that captures the operating environment of a user device, where obtaining the hemispherical visible light depth image includes obtaining a hemispherical visible light image and obtaining a hemispherical non-visible light depth image. The operations include generating a perspective-transformed hemispherical visible light depth image, which includes generating a perspective-transformed hemispherical visible light image and generating a perspective-transformed hemispherical non-visible light depth image. The operations further include generating object recognition and tracking data based on the perspective-transformed hemispherical visible light depth image, the data representing external objects in the operating environment, and outputting the object recognition and tracking data.

Drawings

The disclosure may best be understood by reference to the following detailed description when read in conjunction with the accompanying drawings. It is emphasized that, according to common practice, the various features of the drawings are not to scale. On the contrary, the dimensions of the various features are arbitrarily expanded or reduced for clarity.

FIG. 1 shows one example of a user device for digital computing and electronic communications of the present disclosure.

FIG. 2 shows a block diagram of a system for fisheye non-visible light depth detection of the present disclosure.

Fig. 3 shows a schematic diagram of a hemispherical fish-eye invisible light depth detection device in an embodiment of the disclosure.

Fig. 4 shows a schematic diagram of a hemispherical fish-eye invisible light depth detection device in another embodiment of the present disclosure.

Fig. 5 shows a schematic diagram of a hemispherical fisheye non-visible light projection unit in an embodiment of the disclosure.

Fig. 6 shows a schematic diagram of a hemispherical fisheye non-visible light detection unit in an embodiment of the disclosure.

Fig. 7 shows a schematic diagram of a hemispherical fisheye non-visible light floodlight projection unit in an embodiment of the disclosure.

Fig. 8 shows a schematic diagram of a spherical fisheye invisible light depth detection device in an embodiment of the disclosure.

Fig. 9 shows a schematic diagram of a spherical fisheye invisible light depth detection device in another embodiment of the disclosure.

Fig. 10 shows a schematic diagram of a spherical fisheye non-visible light projection unit in an embodiment of the disclosure.

FIG. 11 shows a schematic diagram of a spherical fisheye invisible light detection unit in an embodiment of the disclosure.

FIG. 12 shows a schematic diagram of fisheye non-visible depth detection in an embodiment of the disclosure.

FIG. 13 is a schematic illustration of three-dimensional tracking using hemispherical or spherical visible depth images in an embodiment of the disclosure.

FIG. 14 is a flow chart of artificial neural network-based three-dimensional tracking using hemispherical or spherical non-visible light depth images in an embodiment of the disclosure.

FIG. 15 is a schematic view of a scene tracked in three dimensions using hemispherical or spherical non-visible depth images in an embodiment of the disclosure.

FIG. 16 is a schematic illustration of a visualization of a scene using hemispherical or spherical non-visible depth images for three-dimensional tracking in an embodiment of the disclosure.

Detailed Description

Light sensors, such as cameras, may be used for a variety of purposes including capturing images or video, object detection and tracking, face recognition, and the like. Wide-angle or ultra-wide-angle lenses, such as fisheye lenses, enable cameras to capture panoramic or hemispherical scenes. A pair of fisheye cameras arranged facing opposite directions along a common optical axis enables a capture device to capture a spherical image.

In some systems, a visible light sensor, such as a camera, is used to determine depth information corresponding to the distance between the camera and various external objects in the scene it captures. For example, some cameras implement stereo vision or binocular depth detection, in which multiple overlapping images captured by multiple, spatially separated cameras are evaluated to determine depth based on differences between the content captured in the images. The resource costs, including the cost of multiple cameras and the computational cost, may be high, and the accuracy of binocular depth detection may be limited. The three-dimensional depth detection capability of such cameras may also be limited by their respective fields of view.
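
For context, the sketch below applies the standard rectified-stereo relation used in binocular depth detection, depth = focal length x baseline / disparity. The focal length and baseline values are illustrative assumptions, not parameters from this disclosure.

```python
import numpy as np

def depth_from_disparity(disparity_px: np.ndarray,
                         focal_length_px: float,
                         baseline_m: float) -> np.ndarray:
    """Rectified-stereo relation Z = f * B / d.

    disparity_px: per-pixel horizontal shift between the two rectified views.
    focal_length_px: focal length expressed in pixels.
    baseline_m: separation between the two cameras, in meters.
    """
    d = np.where(disparity_px > 0, disparity_px, np.nan)  # guard against divide-by-zero
    return focal_length_px * baseline_m / d

# Illustrative numbers: a 4-pixel disparity with f = 800 px and B = 0.06 m
# corresponds to a depth of 12 m.
print(depth_from_disparity(np.array([4.0]), 800.0, 0.06))
```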

Spherical or hemispherical non-visible light depth detection can improve the accuracy and efficiency of depth detection relative to non-hemispherical depth detection and visible light depth detection by projecting non-visible light, such as an infrared spherical or hemispherical static point cloud pattern, detecting the reflected non-visible light using a spherical or hemispherical non-visible light detector, and determining three-dimensional depth as a function of the received light corresponding to the projected static point cloud pattern.
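
As a rough illustration of how depth can follow from a projected pattern, the sketch below triangulates a single dot once it has been matched between the projected static point cloud pattern and the detected reflection. The planar geometry, angle convention, and numeric values are assumptions made for illustration; the disclosure does not specify the exact computation.

```python
import math

def structured_light_depth(baseline_m: float,
                           proj_angle_rad: float,
                           cam_angle_rad: float) -> float:
    """Triangulate the depth of one reflected dot.

    baseline_m: separation between the projector and the detector.
    proj_angle_rad / cam_angle_rad: angles between the baseline and the
    outgoing / returning rays for the same dot, identified by matching the
    detection against the known static point cloud pattern.
    """
    cot_projector = 1.0 / math.tan(proj_angle_rad)
    cot_camera = 1.0 / math.tan(cam_angle_rad)
    return baseline_m / (cot_projector + cot_camera)

# A dot leaving the projector at 80 degrees and observed at 78 degrees over a
# 5 cm baseline lies roughly 0.13 m away (illustrative values only).
print(structured_light_depth(0.05, math.radians(80), math.radians(78)))
```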

A three-dimensional map or model representing the operating environment of a user device may be used, for example, for augmented reality or virtual reality. Generating a three-dimensional map or model using images captured by cameras with limited fields of view, such as right-angle or other less-than-hemispherical fields of view, may be inefficient and inaccurate. For example, generating a three-dimensional map or model from such images may include using multiple image capture units, or positioning an image capture unit at a sequence of positions over time, such as by manual positioning, to produce multiple images, and then merging the multiple images, which may be inefficient and inaccurate.

The use of hemispherical or spherical visible depth images for three-dimensional modeling, including fisheye depth detection, may improve the efficiency, speed, and accuracy of three-dimensional modeling relative to three-dimensional modeling based on images with limited, e.g., right-angle or other less-than-hemispherical, fields of view. Three-dimensional modeling using hemispherical or spherical visible depth images may use fewer images and may include fewer image stitching operations. Three-dimensional modeling using hemispherical or spherical visible depth images may also increase the feature information available from each image.

The use of hemispherical or spherical visible depth images for three-dimensional tracking, which may include fisheye depth detection, may improve the efficiency, speed, and accuracy of three-dimensional tracking relative to three-dimensional tracking based on images with limited, e.g., square or other less-than-hemispherical, fields of view.

While the disclosure has been described in connection with certain embodiments, it is to be understood that the disclosure is not limited to the disclosed embodiments, but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims, which scope is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent arrangements as is permitted under the law.

Fig. 1 shows a schematic diagram of a user device 1000 for digital computing and electronic communication in an embodiment of the disclosure. The user device 1000 for digital computing and electronic communication includes an electronic processing unit 1100, an electronic communication interface unit 1200, a data storage unit 1300, a sensor unit 1400, a user interface unit 1500, a power supply unit 1600, and an internal signal distribution unit 1700. User device 1000 for digital computing and electronic communications may implement one or more aspects or elements of the methods and systems described herein. In some embodiments, user device 1000 for digital computing and electronic communications may include other elements not shown in fig. 1. For example, the user device 1000 for digital computing and electronic communication may include a housing or casing, and the electronic processing unit 1100, the electronic communication interface unit 1200, the data storage unit 1300, the sensor unit 1400, the user interface unit 1500, the power supply unit 1600, the internal signal distribution unit 1700, or a combination thereof may be included within the housing.

Although fig. 1 shows each of the electronic processing unit 1100, the electronic communication interface unit 1200, the data storage unit 1300, the sensor unit 1400, the user interface unit 1500, the power supply unit 1600, and the internal signal distribution unit 1700 as separate units, the user device 1000 for digital computing and electronic communication may include any number of electronic processing units, electronic communication interface units, data storage units, sensor units, user interface units, power supply units, and internal signal distribution units.

The electronic processing unit 1100, or processor, is operable to receive data, process and output data. For example, the electronic processing unit 1100 may receive data from the data storage unit 1300, the sensor unit 1400, the electronic communication interface unit 1200, the user interface unit 1500, or a combination thereof. Receiving data may include receiving computer instructions, such as computer instructions stored in data storage unit 1300 via internal signal distribution unit 1700. Processing data may include processing or executing computer instructions, such as implementing or executing one or more elements or aspects of the techniques disclosed herein. The electronic processing unit may output data to the data storage unit 1300, the sensor unit 1400, the electronic communication interface unit 1200, the user interface unit 1500, or a combination thereof through the internal signal distribution unit 1700. Electronic processing unit 1100 is operable to control one or more operations of user device 1000 for digital computing and electronic communications.

Electronic communication interface unit 1200 may communicate with external devices or systems, such as to receive signals, transmit signals, or both, using a wired or wireless electronic communication protocol, such as a Near Field Communication (NFC) electronic communication protocol, a bluetooth electronic communication protocol, an 802.11 electronic communication protocol, an Infrared (IR) electronic communication protocol, or any other electronic communication protocol.

The data storage unit 1300 may store data, retrieve data, or both. For example, data storage unit 1300 may retrieve computer instructions and other data. The data storage unit 1300 may include a persistent memory, such as a hard disk. The data storage unit 1300 may include volatile memory, such as one or more random access memory units.

Sensor unit 1400 may capture, detect, or determine one or more aspects of the operating environment of user device 1000 for digital computing and electronic communication. For example, the sensor unit 1400 may include one or more cameras, or other visible or invisible light detection and capture units. The sensor unit 1400 may communicate sensor signals, e.g., captured image data, representative of sensed aspects of the operating environment of the user device 1000 for digital computing and electronic communications to the internal signal distribution unit 1700, the power supply unit 1600, the data storage unit 1300, the electronic processing unit 1100, the electronic communication interface unit 1200, the user interface unit 1500, or a combination thereof. In some embodiments, the user device 1000 for digital computing and electronic communication may include a plurality of sensor units, such as a camera, a microphone, an infrared receiver, a global positioning system unit, a gyroscope sensor, an accelerometer, a pressure sensor, a capacitance sensor, a biometric sensor, a magnetometer, a radar unit, a lidar unit, an ultrasound unit, a temperature sensor, or any other sensor capable of capturing, detecting, or determining one or more aspects or conditions of the operating environment of the user device 1000 for digital computing and electronic communication.

The user interface unit 1500 may receive a user input. The user interface unit 1500 may communicate data representing the user input to the internal signal distribution unit 1700, the power supply unit 1600, the data storage unit 1300, the electronic processing unit 1100, the sensor unit 1400, the electronic communication interface unit 1200, or a combination thereof. The user interface unit 1500 may output, present, or display data, or representations thereof, to a user of the user device 1000 for digital computing and electronic communication. For example, the user interface unit 1500 may include a light-based display, a sound-based display, or a combination thereof.

The power supply unit 1600 may supply power to the internal signal distribution unit 1700, the data storage unit 1300, the electronic processing unit 1100, the sensor unit 1400, the electronic communication interface unit 1200, and the user interface unit 1500, for example, through the internal signal distribution unit 1700 or through an internal power supply signal distribution unit (not separately shown). For example, the power supply unit 1600 may be a battery. In some embodiments, the power supply unit 1600 may include an interface to connect with an external power supply.

Internal signal distribution unit 1700 may carry or distribute internal data signals, power signals, or both, such as distributing signals to electronic processing unit 1100, electronic communication interface unit 1200, data storage unit 1300, sensor unit 1400, user interface unit 1500, power supply unit 1600, or a combination thereof.

Other embodiments of the configuration of the user device 1000 for digital computing and electronic communication are also applicable. For example, the user device 1000 for digital computing and electronic communication may omit the electronic communication interface unit 1200.

FIG. 2 shows a block diagram of a system for fisheye non-visible light depth detection of the present disclosure. As shown, the system 2000 for fisheye non-visible light depth detection includes a user device 2100, such as the user device 1000 for digital computing and electronic communication shown in fig. 1. In fig. 2, the user device 2100 is shown in electronic communication with an external device 2200, as shown in phantom at 2300. The external device 2200 may be similar to the user device 1000 for digital computing and electronic communication shown in fig. 1, except as described herein or otherwise clear from context. In some embodiments, the external device 2200 may be a server or other infrastructure device.

The user device 2100 may communicate directly with the external device 2200 through a wired or wireless electronic communication medium 2400. The user device 2100 may also communicate with the external device 2200 indirectly via a network 2500, such as the Internet, or via a combination of networks (not separately shown). For example, the user device 2100 may communicate via the network 2500 using a first network communication link 2600, and the external device 2200 may communicate via the network 2500 using a second network communication link 2610.

Fig. 3 shows a schematic diagram of a hemispherical fisheye non-visible light depth detection device 3000 in an embodiment of the disclosure. The hemispherical fisheye non-visible light depth detection device 3000, or fisheye depth camera, may be similar to a user device, such as the user device 1000 for digital computing and electronic communication shown in fig. 1, except as described herein or otherwise clear from context. The hemispherical fisheye non-visible light depth detection device 3000 may be a fisheye camera, that is, an ultra-wide-angle camera that can capture a panoramic or hemispherical image. The hemispherical fisheye non-visible light depth detection device 3000 may be a depth camera that can capture or determine depth information for the scene it captures.

The hemispherical fisheye non-visible light depth detection device 3000 includes a device housing 3100, a hemispherical fisheye non-visible light projection unit 3200, and a fisheye non-visible light detection unit 3300.

The hemispherical fish-eye non-visible light projection unit 3200 may be a fish-eye infrared spot projector. The hemispherical fisheye non-visible light projection unit 3200 may project or emit non-visible light, e.g., infrared light, in a dotted pattern, e.g., a static point cloud pattern, as indicated by a direction line 3210 extending from the surface of the hemispherical fisheye non-visible light projection unit 3200. Although five direction lines 3210 extend from the surface of the hemispherical fish-eye non-visible light projection unit 3200 for simplicity and clarity, the non-visible light static point cloud pattern projected by the hemispherical fish-eye non-visible light projection unit 3200 may have a projection area of 360 degrees in the longitudinal direction and 180 degrees or more in the transverse direction, for example, 183 degrees. An example of a hemispherical fisheye non-visible light projection unit 3200 is shown in fig. 5. In some embodiments, such as panoramic embodiments, the longitudinal projection area may be less than 360 degrees.

The fisheye non-visible light detection unit 3300 may be a fisheye infrared camera. The fisheye non-visible light detection unit 3300 may detect or receive non-visible light, such as infrared light, as indicated by the directional lines 3310 that converge on the surface of the fisheye non-visible light detection unit 3300. For example, the fisheye non-visible light detection unit 3300 may receive non-visible light that is emitted by the hemispherical fisheye non-visible light projection unit 3200 in the static point cloud pattern and reflected to the fisheye non-visible light detection unit 3300 by aspects of the environment, such as objects in the field of view of the fisheye non-visible light detection unit 3300. Although five directional lines 3310 are shown converging on the surface of the fisheye non-visible light detection unit 3300 for simplicity and clarity, the fisheye non-visible light detection unit 3300 may have a field of view of 360 degrees in the longitudinal direction and 180 degrees or more, e.g., 183 degrees, in the transverse direction. An example of a fisheye non-visible light detection unit 3300 is shown in fig. 6.

The hemispherical fisheye non-visible light depth detection apparatus 3000 may implement fisheye non-visible light depth detection by emitting non-visible light in a static point cloud pattern using the hemispherical fisheye non-visible light projection unit 3200, and detecting corresponding reflected non-visible light (detected reflected non-visible light) using the fisheye non-visible light detection unit 3300.

For example, fig. 3 shows an external object 3400 in the environment of the hemispherical fisheye non-visible light depth detection apparatus 3000, for example, in the projection field of the hemispherical fisheye non-visible light projection unit 3200 and the field of view of the fisheye non-visible light detection unit 3300. The non-visible light may be emitted by the hemispherical fisheye non-visible light projection unit 3200 to the external object 3400 as indicated by the directional line at 3212. The non-visible light may be reflected by the surface of the external object 3400 towards the fisheye non-visible light detection unit 3300, as indicated by the directional line at 3312, and captured or recorded by the fisheye non-visible light detection unit 3300.

Fig. 4 shows a schematic diagram of a hemispherical fisheye non-visible light depth detection device 4000 in another embodiment of the present disclosure. The hemispherical fisheye non-visible light depth detection device 4000 may be similar to the hemispherical fisheye non-visible light depth detection device 3000 shown in fig. 3, except as described herein or otherwise clear from context.

The hemispherical fisheye non-visible light depth detection apparatus 4000 includes an apparatus housing 4100, a hemispherical fisheye non-visible light projection unit 4200, a hemispherical fisheye non-visible light detection unit 4300, and a hemispherical fisheye non-visible light floodlight projection unit 4400.

The device housing 4100 may be similar to the device housing 3100 illustrated in fig. 3, except as described herein or otherwise clear from context. The hemispherical fisheye non-visible light projection unit 4200 may be similar to the hemispherical fisheye non-visible light projection unit 3200 shown in fig. 3, except as described herein or otherwise clear from context. The hemispherical fisheye non-visible light detection unit 4300 may be similar to the fisheye non-visible light detection unit 3300 shown in fig. 3, except as described herein or otherwise clear from context.

The hemispherical fisheye non-visible light floodlight projection unit 4400, or infrared floodlight, may be similar to the hemispherical fisheye non-visible light projection unit 3200 shown in fig. 3, except as described herein or otherwise clear from context. The hemispherical fisheye non-visible light floodlight projection unit 4400 may emit a diffuse, uniform field of non-visible light, e.g., infrared light, as shown by the arcs extending from the surface of the hemispherical fisheye non-visible light floodlight projection unit 4400. The diffuse field of non-visible light emitted by the hemispherical fisheye non-visible light floodlight projection unit 4400 may illuminate, in non-visible light, the environment of the hemispherical fisheye non-visible light depth detection device 4000, which may include illuminating external objects near the hemispherical fisheye non-visible light depth detection device 4000.

The hemispherical fisheye non-visible light detection unit 4300 may receive non-visible light that is emitted by the hemispherical fisheye non-visible light floodlight projection unit 4400 and reflected by external objects in the environment of the hemispherical fisheye non-visible light depth detection device 4000, for example for a liveness test portion of a face recognition method or a feature extraction portion of a simultaneous localization and mapping (SLAM) method. Depth detection based on received reflected non-visible light emitted by the hemispherical fisheye non-visible light floodlight projection unit 4400 may be inaccurate, inefficient, or both.

Fig. 5 shows a schematic diagram of the hemispherical fish-eye invisible light projection unit 5000 in the embodiment of the disclosure. A fisheye non-visible light depth detection apparatus, such as the hemispherical fisheye non-visible light depth detection device 3000 shown in fig. 3 or the hemispherical fisheye non-visible light depth detection device 4000 shown in fig. 4, may include a hemispherical fisheye non-visible light projection unit 5000. For example, the hemispherical fish-eye non-visible light projection unit 3200 of the hemispherical fish-eye non-visible light depth detection apparatus 3000 shown in fig. 3 may be used as the hemispherical fish-eye non-visible light projection unit 5000.

The hemispherical fisheye non-visible light projection unit 5000 includes a housing 5100, a non-visible light source 5200, one or more lenses 5300, and a diffractive optical element (DOE) 5400. The hemispherical fisheye non-visible light projection unit 5000 has an optical axis, as indicated by the dotted line at 5500.

The non-visible light source 5200 may be an infrared light source, such as a vertical cavity surface emitting laser (VCSEL). The non-visible light generated by the non-visible light source 5200 is refracted by the lenses 5300 to form a projection field of 360 degrees in the longitudinal direction and 180 degrees or more, e.g., 183 degrees, in the transverse direction. The non-visible light forming the projection field is rectified by the diffractive optical element 5400 to form a static point cloud pattern, as represented by the dashed arc at 5600. An exemplary optical path is represented by a directional line extending from the non-visible light source 5200, through the lenses 5300 and the diffractive optical element 5400, and outward from the diffractive optical element 5400. In some embodiments, the diffractive optical element 5400 may be omitted and the hemispherical fisheye non-visible light projection unit 5000 may include a point cloud cover that forms the static point cloud pattern from the non-visible light generated by the non-visible light source 5200 and refracted by the lenses 5300.

In one embodiment, the non-visible light source 5200 may be an infrared light source that produces infrared light (photons) having a defined wavelength, such as 940 nanometers (nm). Infrared light having a wavelength of 940 nm may be absorbed by water in the atmosphere, and using infrared light having a wavelength of 940 nm may therefore improve the performance and accuracy of fisheye non-visible light depth detection, for example, in outdoor conditions. Other wavelengths may also be used, such as 850 nm, or another infrared or near-infrared wavelength, such as a wavelength in the range of 0.75 microns to 1.4 microns. As used herein, a wavelength of 940 nm refers to light in a narrow band around 940 nm. Using light at a 940 nm wavelength may reduce resource costs and reduce chromatic aberration relative to visible light.

The non-visible light source 5200 produces non-visible light in a plane, and the combination of the lenses 5300 and the diffractive optical element 5400 maps the light emitted by the non-visible light source 5200 into a spherically distributed static point cloud pattern.
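
A simple way to picture a spherically distributed static point cloud pattern is as a set of unit direction vectors covering the hemisphere. The sketch below generates such a set with a Fibonacci spiral; this sampling scheme is an illustrative assumption and is not the actual dot layout produced by the diffractive optical element.

```python
import numpy as np

def hemispherical_dot_directions(n_dots: int) -> np.ndarray:
    """Return n_dots unit direction vectors spread over a hemisphere using a
    Fibonacci spiral, with z >= 0 taken as the projection axis. This stands in
    for the static point cloud pattern; the real dot layout is not specified
    in the text."""
    k = np.arange(n_dots)
    z = (k + 0.5) / n_dots                   # cosine of the polar angle, in (0, 1]
    phi = k * np.pi * (3.0 - np.sqrt(5.0))   # golden-angle steps around the axis
    r = np.sqrt(1.0 - z ** 2)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

dirs = hemispherical_dot_directions(10000)   # shape (10000, 3), all unit length
```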

The number and arrangement of the lenses 5300 shown in fig. 5 is for simplicity and clarity of illustration. Other numbers and configurations of lenses may be provided. The optical structure of the lenses 5300, such as the respective shapes, materials, or both of the lenses 5300, can be optimized based on the refractive index of the non-visible light produced by the non-visible light source 5200.

Fig. 6 shows a schematic diagram of the hemispherical fish-eye invisible light detection unit 6000 in the embodiment of the present disclosure. A fisheye non-visible light depth detection device, such as the hemispherical fisheye non-visible light depth detection device 3000 shown in fig. 3 or the hemispherical fisheye non-visible light depth detection device 4000 shown in fig. 4, may include a hemispherical fisheye non-visible light detection unit 6000. For example, the fisheye invisible light detection unit 3300 of the hemispherical fisheye invisible light depth detection apparatus 3000 shown in fig. 3 may be used as the hemispherical fisheye invisible light detection unit 6000.

The hemispherical fisheye non-visible light detection unit 6000 includes a housing 6100, a non-visible light filter 6200, one or more lenses 6300, and a non-visible light receiver 6400. The hemispherical fisheye non-visible light detection unit 6000 has an optical axis, as indicated by the dashed line at 6500, and a field of view (not shown) of 360 degrees in the longitudinal direction and 180 degrees or more in the transverse direction, centered on the optical axis 6500.

The non-visible filter 6200 may receive light, including non-visible light, such as infrared light. For example, the non-visible light filter 6200 may receive infrared light from a static point cloud pattern that is reflected by an adjacent external object (not shown) after being emitted from a non-visible light projection unit, such as the hemispherical fish-eye non-visible light projection unit 5000 shown in fig. 5.

Light received by the non-visible filter 6200 is filtered by the non-visible filter 6200 to exclude visible light and pass non-visible light. The non-visible light passing through the non-visible filter 6200 is focused by the lens 6300 onto the non-visible light receiver 6400. The combination of the non-visible filter 6200 and the lens 6300 maps the hemispherical field of view of the hemispherical fish-eye non-visible light detection unit 6000 onto the plane of the non-visible light receiver 6400. The non-visible light receiver 6400 may be an infrared light receiver.
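
The mapping from a hemispherical field of view onto the plane of the receiver can be illustrated with a common fisheye camera model. The equidistant model below (image radius r = f * theta) is an assumption chosen for illustration; the disclosure does not state which lens model the detection unit uses.

```python
import numpy as np

def equidistant_fisheye_project(direction: np.ndarray, f_px: float) -> np.ndarray:
    """Project a unit viewing direction (x, y, z), with z along the optical
    axis, onto the sensor plane under the equidistant model r = f * theta."""
    x, y, z = direction
    theta = np.arccos(np.clip(z, -1.0, 1.0))  # angle from the optical axis
    phi = np.arctan2(y, x)                    # azimuth around the axis
    r = f_px * theta                          # radial distance on the sensor
    return np.array([r * np.cos(phi), r * np.sin(phi)])

# A ray 90 degrees off-axis (the rim of the hemispherical field of view)
# lands at radius f * pi / 2 on the receiver plane.
print(equidistant_fisheye_project(np.array([1.0, 0.0, 0.0]), 400.0))
```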

The number and configuration of the lenses 6300 shown in fig. 6 is for simplicity and clarity of illustration. Other numbers and configurations of lenses may be provided. The optical structure of the lenses 6300, such as the shape, material, or both of the individual lenses 6300, may be optimized according to the refractive index of the non-visible light received by the non-visible light receiver 6400.

Fig. 7 shows a schematic diagram of a hemispherical fisheye non-visible light floodlight projection unit 7000 in an embodiment of the disclosure. A fisheye non-visible light depth detection device, such as hemispherical fisheye non-visible light depth detection device 3000 shown in fig. 3 or hemispherical fisheye non-visible light depth detection device 4000 shown in fig. 4, may include hemispherical fisheye non-visible light flood projection unit 7000. For example, the hemispherical fisheye non-visible light floodlight projection unit 4400 of the hemispherical fisheye non-visible light depth detection apparatus 4000 shown in fig. 4 may serve as the hemispherical fisheye non-visible light floodlight projection unit 7000.

The hemispherical fisheye non-visible light floodlight projection unit 7000 includes a housing 7100, a non-visible light source 7200, and one or more lenses 7300. The hemispherical fisheye non-visible light floodlight projection unit 7000 has an optical axis, as indicated by the dashed line at 7400. An exemplary optical path is represented by a directional line extending from the non-visible light source 7200, through the lenses 7300, and outward from the lenses 7300.

Fig. 8 shows a schematic diagram of a spherical fisheye non-visible light depth detection device 8000 in an embodiment of the disclosure. The spherical fisheye non-visible light depth detection device 8000, or fisheye depth camera, may be similar to the hemispherical fisheye non-visible light depth detection device 3000 shown in fig. 3, except as described herein or otherwise clear from context. The spherical fisheye non-visible light depth detection device 8000 may be a dual fisheye camera, that is, an omnidirectional camera that can capture panoramic or spherical images. The spherical fisheye non-visible light depth detection device 8000 may be a depth camera that can capture or determine depth information for a captured scene.

The spherical fisheye non-visible light depth detection apparatus 8000 includes an apparatus housing 8100, a first hemispherical fisheye non-visible light projection unit 8200, a second hemispherical fisheye non-visible light projection unit 8210, a first hemispherical fisheye non-visible light detection unit 8300, and a second hemispherical fisheye non-visible light detection unit 8310.

In some embodiments, the first hemispherical fisheye non-visible light projection unit 8200 may be a first portion of a spherical fisheye non-visible light projection unit and the second hemispherical fisheye non-visible light projection unit 8210 may be a second portion of the spherical fisheye non-visible light projection unit. Fig. 10 illustrates an example of a spherical fisheye non-visible light projection unit.

In some embodiments, the first hemispherical fisheye non-visible light detection unit 8300 may be a first portion of the spherical fisheye non-visible light detection unit and the second hemispherical fisheye non-visible light detection unit 8310 may be a second portion of the spherical fisheye non-visible light detection unit. Fig. 11 illustrates an example of a spherical fisheye invisible light detection unit.

The first hemispherical fisheye non-visible light projection unit 8200 may be similar to the hemispherical fisheye non-visible light projection unit 3200 shown in fig. 3, except as described herein or otherwise clear from context. The second hemispherical fisheye non-visible light projection unit 8210 may likewise be similar to the hemispherical fisheye non-visible light projection unit 3200 shown in fig. 3, except as described herein or otherwise clear from context.

The projection field of the first hemispherical fish-eye non-visible light projection unit 8200 is represented by a dotted arc at 8400. The projection field of the second hemispherical fisheye non-visible light projection unit 8210 is represented by the dashed arc at 8410. The projection field of the first hemispherical fish-eye non-visible light projection unit 8200 may partially overlap with the projection field of the second hemispherical fish-eye non-visible light projection unit 8210 to form a combined projection field, which is a 360-degree omnidirectional projection field. The first hemispherical fish-eye non-visible light projection unit 8200 and the second hemispherical fish-eye non-visible light projection unit 8210 may collectively project or emit a 360-degree omnidirectional static point cloud pattern.

In some embodiments, a portion of the hemispherical portion of the omnidirectional static point cloud pattern projected by the first hemispherical fisheye non-visible light projection unit 8200 may overlap a portion of the hemispherical portion of the omnidirectional static point cloud pattern projected by the second hemispherical fisheye non-visible light projection unit 8210, as shown at 8500. To avoid ambiguity or conflict between the respective projected static point cloud patterns in the overlapping portions, the hemispherical portion of the omnidirectional static point cloud pattern projected by the first hemispherical fisheye non-visible light projection unit 8200 may differ from the hemispherical portion of the omnidirectional static point cloud pattern projected by the second hemispherical fisheye non-visible light projection unit 8210. For example, the hemispherical portion of the omnidirectional static point cloud pattern projected by the first hemispherical fisheye non-visible light projection unit 8200 may use circular dots of non-visible light, and the hemispherical portion projected by the second hemispherical fisheye non-visible light projection unit 8210 may use square dots of non-visible light. In another embodiment, the light projected by each of the hemispherical fisheye non-visible light projection units 8200, 8210 may be time-division multiplexed, as sketched below. Other multiplexing techniques may also be used.
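
As one way to picture the time-multiplexing option, the sketch below alternates the two hemispherical projectors on even and odd frames so that dots detected in the overlap region can be attributed unambiguously. This per-frame scheduling policy is an assumption; the disclosure only names time multiplexing as one possible technique.

```python
def projector_schedule(frame_index: int) -> tuple[bool, bool]:
    """Return (first_projector_on, second_projector_on) for a frame: the two
    hemispherical projectors alternate, so any dot detected in the overlap
    region belongs to the projector active in that frame."""
    first_on = frame_index % 2 == 0
    return first_on, not first_on

# Frames 0..3 -> [(True, False), (False, True), (True, False), (False, True)]
print([projector_schedule(i) for i in range(4)])
```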

The field of view of the first hemispherical fisheye invisible light detecting unit 8300 may partially overlap the field of view of the second hemispherical fisheye invisible light detecting unit 8310 to form a combined field of view, which is a 360 degree omnidirectional field of view. The first hemispherical fisheye invisible light detection unit 8300 and the second hemispherical fisheye invisible light detection unit 8310 may collectively receive or detect reflected light corresponding to a 360-degree omnidirectional static point cloud pattern, such as the 360-degree omnidirectional static point cloud pattern projected by the first hemispherical fisheye invisible light projection unit 8200 and the second hemispherical fisheye invisible light projection unit 8210.

Fig. 9 shows a schematic diagram of a spherical fisheye non-visible light depth detection device 9000 in another embodiment of the disclosure. The spherical fisheye non-visible light depth detection device 9000 may be similar to the spherical fisheye non-visible light depth detection device 8000 shown in fig. 8, except as described herein or otherwise clear from context.

The spherical fisheye non-visible light depth detection device 9000 includes a device housing 9100, a first hemispherical fisheye non-visible light projection unit 9200, a second hemispherical fisheye non-visible light projection unit 9210, a first hemispherical fisheye non-visible light detection unit 9300, a second hemispherical fisheye non-visible light detection unit 9310, a first hemispherical fisheye non-visible light floodlight projection unit 9400, and a second hemispherical fisheye non-visible light floodlight projection unit 9410.

Fig. 10 shows a schematic diagram of a spherical fisheye non-visible light projection unit 10000 in an embodiment of the disclosure. A spherical or omnidirectional fisheye non-visible light depth detection device, such as the spherical fisheye non-visible light depth detection device 8000 shown in fig. 8 or the spherical fisheye non-visible light depth detection device 9000 shown in fig. 9, may include a spherical fisheye non-visible light projection unit 10000. For example, the first hemispherical fisheye non-visible light projection unit 8200 and the second hemispherical fisheye non-visible light projection unit 8210 of the spherical fisheye non-visible light depth detection device 8000 shown in fig. 8 may be used as the spherical fisheye non-visible light projection unit 10000.

The spherical fisheye non-visible light projection unit 10000 includes a housing 10100, a non-visible light source 10200, one or more first lenses 10300, a mirror 10400, a first hemispherical portion 10500, and a second hemispherical portion 10600. The non-visible light source 10200 and the first lenses 10300 are aligned along a first axis 10700.

The first hemispherical portion 10500 includes one or more second lenses 10510 and a first diffractive optical element 10520. The second hemispherical portion 10600 includes one or more third lenses 10610 and a second diffractive optical element 10620. The first hemispherical portion 10500 and the second hemispherical portion 10600 are oriented along the optical axis, as indicated by the dashed lines at 10800.

The non-visible light projected by the non-visible light source 10200 along the first axis 10700 is directed, e.g., diverted and reflected, by the mirror 10400 toward the first hemispherical portion 10500 and the second hemispherical portion 10600, respectively. The non-visible light emitted by the non-visible light source 10200 and directed by the mirror 10400 toward the first hemispherical portion 10500 and the second hemispherical portion 10600 is refracted by the lenses 10510, 10610, respectively, to form a combined projection field of 360 degrees in the longitudinal direction and 360 degrees in the transverse direction. The non-visible light forming the projection field is rectified by the respective diffractive optical elements 10520, 10620 to form a static point cloud pattern. Respective exemplary light paths are represented by directional lines extending from the non-visible light source 10200, through the first lenses 10300, directed by the mirror 10400, through the lenses 10510, 10610 and the diffractive optical elements 10520, 10620, and outward from the diffractive optical elements 10520, 10620.

The non-visible light source 10200 produces non-visible light in a plane, and the combination of the lenses 10300, 10510, 10610, the mirror 10400, and the diffractive optical elements 10520, 10620 maps the light emitted by the non-visible light source 10200 into a spherically distributed static point cloud pattern.

Fig. 11 shows a schematic diagram of a spherical fisheye invisible light detection unit 11000 in an embodiment of the disclosure. A spherical or omnidirectional fisheye non-visible light depth detection device, such as the spherical fisheye non-visible light depth detection device 8000 shown in fig. 8, or the spherical fisheye non-visible light depth detection device 9000 shown in fig. 9, may include a spherical fisheye non-visible light detection unit 11000. For example, the first hemispherical fish-eye invisible light detection unit 8300 and the second hemispherical fish-eye invisible light detection unit 8310 of the spherical fish-eye invisible light depth detection apparatus 8000 shown in fig. 8 may be used as the spherical fish-eye invisible light detection unit 11000.

The spherical fisheye non-visible light detection unit 11000 includes a housing 11100, a first hemispherical portion 11200, a second hemispherical portion 11300, a mirror 11400, one or more first lenses 11500, and a non-visible light receiver 11600. The non-visible light receiver 11600 and the first lens 11500 are arranged along a first axis 11700.

The first hemispherical portion 11200 includes one or more second lenses 11210 and a first non-visible light filter 11220. The second hemispherical portion 11300 includes one or more third lenses 11310 and a second non-visible light filter 11320. The first hemispherical portion 11200 and the second hemispherical portion 11300 are arranged along the optical axis, as indicated by the dashed line at 11800.

The non-visible light filters 11220, 11320 may receive light, including non-visible light, such as infrared light. For example, the non-visible light filters 11220, 11320 may receive infrared light from a static point cloud pattern emitted by a non-visible light projection unit, such as the spherical fisheye non-visible light projection unit 10000 shown in fig. 10, and subsequently reflected by an adjacent external object (not shown).

Light received by the non-visible light filters 11220, 11320 is filtered to exclude visible light and pass non-visible light. The non-visible light passing through the non-visible light filters 11220, 11320 is focused toward the mirror 11400 by the second lenses 11210 and the third lenses 11310, respectively, and is directed through the first lens 11500 to the non-visible light receiver 11600. The combination of the non-visible light filters 11220, 11320, the mirror 11400, and the lenses 11210, 11310, 11500 maps the spherical field of view of the spherical fisheye non-visible light detection unit 11000 onto the plane of the non-visible light receiver 11600.

Fig. 12 shows a schematic diagram of fisheye non-visible light depth detection 12000 in an embodiment of the disclosure. The fisheye non-visible light depth detection 12000 may be implemented in a non-visible light based depth detection device, such as a user device, e.g., the hemispherical fisheye non-visible light depth detection device 3000 shown in fig. 3, the hemispherical fisheye non-visible light depth detection device 4000 shown in fig. 4, the spherical fisheye non-visible light depth detection device 8000 shown in fig. 8, or the spherical fisheye non-visible light depth detection device 9000 shown in fig. 9.

The fisheye non-visible light depth detection 12000 includes projecting a hemispherical or spherical non-visible static point cloud pattern at 12100, detecting non-visible light at 12200, determining three-dimensional depth information at 12300, and outputting the three-dimensional depth information at 12400.

In step 12100, projecting a hemispherical or spherical non-visible light static point cloud pattern includes emitting non-visible light, such as infrared light, from a non-visible light source, such as the non-visible light source 5200 shown in fig. 5 or the non-visible light source 10200 shown in fig. 10. In some embodiments, such as spherical embodiments, projecting the hemispherical or spherical non-visible static point cloud pattern in step 12100 includes directing the emitted non-visible light, e.g., by a mirror such as the mirror 10400 shown in fig. 10, to a first hemispherical portion of the non-visible light based depth detection device, such as the first hemispherical portion 10500 shown in fig. 10, and to a second hemispherical portion of the non-visible light based depth detection device, such as the second hemispherical portion 10600 shown in fig. 10. Projecting the hemispherical or spherical non-visible light static point cloud pattern in step 12100 includes refracting the emitted non-visible light, e.g., by one or more lenses, such as the lens 5300 shown in fig. 5 or the lenses 10300, 10510, 10610 shown in fig. 10, to form a hemispherical or spherical projection field. Projecting the hemispherical or spherical non-visible static point cloud pattern in step 12100 includes rectifying or filtering the non-visible light in the hemispherical or spherical projection field, for example by a diffractive optical element, such as the diffractive optical element 5400 shown in fig. 5 or the diffractive optical elements 10520, 10620 shown in fig. 10, to form the projected hemispherical or spherical non-visible static point cloud pattern.

The projected hemispherical or spherical non-visible light static point cloud pattern, or a portion thereof, may be reflected toward the non-visible light depth detection device by one or more external objects located in the environment of the non-visible light depth detection device.

In step 12200, detecting the non-visible light includes receiving light, including the reflected non-visible light projected in step 12100. In step 12200, detecting the non-visible light includes filtering the received light, e.g., by a non-visible light filter, such as the non-visible light filter 6200 shown in fig. 6 or the non-visible light filters 11220, 11320 shown in fig. 11, to exclude light other than non-visible light, e.g., visible light, and to pass the non-visible light. In step 12200, detecting the non-visible light includes focusing the received non-visible light onto a plane of a non-visible light detector, such as the non-visible light receiver 6400 shown in fig. 6 or the non-visible light receiver 11600 shown in fig. 11, using one or more lenses, such as the lens 6300 shown in fig. 6 or the lenses 11210, 11310, 11500 shown in fig. 11. In some embodiments, such as spherical embodiments, the received light may be received and filtered by a first hemispherical portion of the non-visible light based depth detection device, such as the first hemispherical portion 11200 shown in fig. 11, and by a second hemispherical portion of the non-visible light based depth detection device, such as the second hemispherical portion 11300 shown in fig. 11, focused by the respective hemispherical portions onto a mirror, such as the mirror 11400 shown in fig. 11, and directed by the mirror to a non-visible light receiver.

In step 12300, determining three-dimensional depth information includes evaluating one or more mapping functions, where θ represents the angle, in radians, between a reflected ray and the optical axis of the camera, f represents the focal length of the lens, and R represents the radial position of the corresponding detected ray on the sensor. The mapping function may be, for example, the equidistant mapping function R = f·θ, the stereographic mapping function R = 2f·tan(θ/2), the orthographic mapping function R = f·sin(θ), the equisolid angle mapping function R = 2f·sin(θ/2), or any other hemispherical or spherical mapping function.
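
As a non-limiting illustration, and not part of the disclosed apparatus, the following Python sketch evaluates the mapping functions listed above; the function name fisheye_radius and the model keywords are hypothetical labels chosen for this example.

```python
import math

def fisheye_radius(theta, f, model="equidistant"):
    """Radial position R on the sensor of a ray at angle theta (radians)
    from the optical axis, for a lens of focal length f, under common
    fisheye mapping functions."""
    if model == "equidistant":
        return f * theta
    if model == "stereographic":
        return 2.0 * f * math.tan(theta / 2.0)
    if model == "orthographic":
        return f * math.sin(theta)
    if model == "equisolid":
        return 2.0 * f * math.sin(theta / 2.0)
    raise ValueError("unknown fisheye model: " + model)

# Example: a ray 60 degrees off-axis through a hypothetical 1.8 mm equidistant fisheye lens.
print(fisheye_radius(math.radians(60.0), 1.8))  # ~1.885 mm from the image center
```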

Although fisheye non-visible depth detection is described herein in the context of structured light based fisheye non-visible depth detection, other fisheye non-visible depth detection techniques may also be used, such as dynamic pattern structured light depth detection and time-of-flight (ToF) depth detection. In some embodiments, the structured or dynamic light pattern may be a point cloud pattern, a gray/color coded light stripe pattern, or the like.

For example, fisheye non-visible light time-of-flight depth detection may include: projecting hemispherical non-visible light using a hemispherical fisheye non-visible light floodlight projection unit, such as the hemispherical fisheye non-visible light floodlight projection unit 4400 shown in fig. 4 or the hemispherical fisheye non-visible light floodlight projection unit 7000 shown in fig. 7, or projecting spherical non-visible light using a spherical fisheye non-visible light floodlight projection unit; determining a projection time point corresponding to projecting the non-visible light; receiving the reflected non-visible light using a hemispherical fisheye non-visible light detection unit, such as the hemispherical fisheye non-visible light detection unit 6000 shown in fig. 6, or a spherical fisheye non-visible light detection unit, such as the spherical fisheye non-visible light detection unit 11000 shown in fig. 11; determining one or more reception time points corresponding to receiving the reflected non-visible light; and determining depth information based on a difference between the projection time point and the reception time point. Spatial information corresponding to detecting or receiving the reflected non-visible light may be mapped into the operating environment of the fisheye non-visible light time-of-flight depth detection unit, and the difference between the projection time point and the reception time point corresponding to a respective spatial position may be determined as the depth information of the corresponding spatial point.
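
As a non-limiting illustration of the time-of-flight computation described above (a minimal sketch, not the disclosed implementation), depth may be derived from the difference between a projection time point and a reception time point as d = c·Δt/2; the function name below is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def tof_depth(t_projection_s, t_reception_s):
    """Depth (meters) from the round trip between projecting non-visible
    light and receiving its reflection: d = c * (t_rx - t_tx) / 2."""
    round_trip_s = t_reception_s - t_projection_s
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0

# Example: a reflection received 10 nanoseconds after projection is ~1.5 m away.
print(tof_depth(0.0, 10e-9))  # ~1.499 m
```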

In step 12400, the three-dimensional depth information is output. For example, the three-dimensional depth information may be stored in a data storage unit, or may be transmitted to other elements of the device.

Fig. 13 is a schematic diagram of three-dimensional tracking 13000 using hemispherical or spherical visible light depth images in an embodiment of the disclosure. The three-dimensional tracking 13000 using the hemispherical or spherical visible light depth image may be implemented by a non-visible light based depth detection device, such as a user device, e.g., the hemispherical fish-eye non-visible light depth detection device 3000 shown in fig. 3, the hemispherical fish-eye non-visible light depth detection device 4000 shown in fig. 4, the spherical fish-eye non-visible light depth detection device 8000 shown in fig. 8, or the spherical fish-eye non-visible light depth detection device 9000 shown in fig. 9.

The three-dimensional tracking 13000 using the hemispherical or spherical visible depth image may include generating a three-dimensional map or model, such as a three-dimensional augmented reality model or a three-dimensional virtual reality model, that represents the operating environment of the user device or a portion thereof. The three-dimensional tracking 13000 using hemispherical or spherical visible depth images includes: step 13100, image acquisition and pre-processing, and step 13200, three-dimensional tracking.

In step 13100, image acquisition and pre-processing includes: step 13110, acquiring an image, and step 13120, performing perspective transformation. In step 13110, acquiring the image may include acquiring a hemispherical or spherical visible light image, or images, that include depth information, such as an RGB-D (red green blue depth) image. For simplicity of description, a hemispherical or spherical visible light image that includes depth information, such as a combination of a hemispherical or spherical visible light image and a hemispherical or spherical non-visible light depth image, may be referred to herein as a hemispherical or spherical visible light depth (VL-D) image. The hemispherical or spherical visible light image and the hemispherical or spherical non-visible light depth image are synchronized spatiotemporally.

The hemispherical VL-D image may include a hemispherical visible light image and a hemispherical depth image. For example, the user device may include a hemispherical image capture apparatus, which, except as described herein or as is otherwise clear from context, may be similar to the hemispherical fisheye non-visible light depth detection device 3000 shown in fig. 3 or the hemispherical fisheye non-visible light depth detection device 4000 shown in fig. 4. The hemispherical image capture apparatus may include a hemispherical, e.g., fisheye, visible light image capture unit. The hemispherical visible light image capture unit may be similar to the hemispherical fisheye non-visible light detection unit 6000 shown in fig. 6, except as described herein or as is otherwise clear from context. For example, the hemispherical visible light image capture unit may omit the non-visible light filter 6200 shown in fig. 6 and may otherwise be configured to capture visible light. The hemispherical image capture apparatus may include a hemispherical non-visible light depth detection unit, device, or array, such as the hemispherical fisheye non-visible light depth detection device 3000 shown in fig. 3 or the hemispherical fisheye non-visible light depth detection device 4000 shown in fig. 4. The hemispherical non-visible light depth detection unit, device, or array and the hemispherical visible light image capture unit may be synchronized. The hemispherical visible light image capture unit may acquire or capture a hemispherical visible light image and, concurrently, the hemispherical non-visible light depth detection unit may acquire or capture a corresponding hemispherical non-visible light depth image.

The spherical VL-D image may include a spherical visible light image and a spherical depth image. For example, the user device may include a spherical image capture apparatus, which, except as described herein or as is otherwise clear from context, may be similar to the spherical fisheye non-visible light depth detection device 8000 shown in fig. 8 or the spherical fisheye non-visible light depth detection device 9000 shown in fig. 9. The spherical image capture apparatus may include a spherical visible light image capture unit. The spherical visible light image capture unit may be similar to the spherical fisheye non-visible light detection unit 11000 shown in fig. 11, except as described herein or as is otherwise clear from context. For example, the spherical visible light image capture unit may omit the non-visible light filters 11220, 11320 shown in fig. 11 and may otherwise be configured to capture visible light. The spherical image capture apparatus may include a spherical non-visible light depth detection unit, device, or array, such as the spherical fisheye non-visible light depth detection device 8000 shown in fig. 8 or the spherical fisheye non-visible light depth detection device 9000 shown in fig. 9. The spherical non-visible light depth detection unit, device, or array and the spherical visible light image capture unit may be synchronized. The spherical visible light image capture unit may acquire or capture a spherical visible light image and, concurrently, the spherical non-visible light depth detection unit may acquire or capture a corresponding spherical non-visible light depth image.

In step 13120, perspective conversion may include generating a perspective-converted image, such as a perspective-converted visible light image, a perspective-converted depth image, or both, i.e., perspective projection images. For example, a perspective conversion unit of the user device may receive hemispherical or spherical VL-D images from one or more hemispherical or spherical image capture units of the user device, may generate perspective-converted images based on the hemispherical or spherical VL-D images, and may output the perspective-converted images. For example, a perspective-converted hemispherical VL-D image generated based on a hemispherical VL-D image may resemble a panoramic visible light image and a corresponding panoramic non-visible light depth image. In step 13120, perspective conversion may include mapping each pixel location in the perspective-converted hemispherical or spherical VL-D image to a corresponding location in the hemispherical or spherical VL-D image. In step 13120, perspective conversion may include image processing, such as anti-aliasing, of the visible light image, the depth image, or both. The perspective conversion unit may output a perspective-converted hemispherical or spherical VL-D image, which may include a perspective-converted hemispherical or spherical visible light image and a perspective-converted hemispherical or spherical non-visible light depth image.

In step 13120, perspective transformation may include, for example, spherical perspective projection, in which lines in space are projected into curves in the spherical perspective image in accordance with straight-line spherical perspective projection constraints. For example, a line in space may be projected as an elliptic curve with a semi-major axis in the image plane. In step 13120, perspective transformation may include identifying the ellipse corresponding to a line in space, and determining the center (the optical center) and the semi-major axis of the elliptic curve based on points (u_i, v_i) identified along the elliptic curve from the VL-D image, e.g., five points (i = 1, ..., 5).

In step 13120, perspective transformation may include fitting a curve to the points (u_i, v_i), for example by minimizing a least squares cost function, to identify the coefficients (b, c, d, e, f) of the elliptic curve, which can be expressed as follows:

u² + b·uv + c·v² + d·u + e·v + f = 0.
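
As a non-limiting illustration, a least squares fit of the coefficients (b, c, d, e, f) may be computed as in the following Python sketch; the function name and the use of NumPy are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def fit_ellipse(points):
    """Least-squares fit of the coefficients (b, c, d, e, f) of
    u^2 + b*u*v + c*v^2 + d*u + e*v + f = 0 to points (u_i, v_i).
    At least five points are needed to determine the five coefficients."""
    uv = np.asarray(points, dtype=float)
    u, v = uv[:, 0], uv[:, 1]
    # Each point gives one linear equation in (b, c, d, e, f): A x = -u^2.
    A = np.column_stack([u * v, v ** 2, u, v, np.ones_like(u)])
    y = -(u ** 2)
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs  # b, c, d, e, f

# Example: five points sampled from the circle u^2 + v^2 = 25 (b = d = e = 0, c = 1, f = -25).
pts = [(5, 0), (0, 5), (-5, 0), (0, -5), (3, 4)]
print(np.round(fit_ellipse(pts), 6))  # approximately [0, 1, 0, 0, -25]
```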

In step 13120, perspective transformation may include determining the center point (x_c, y_c) of the ellipse, which may correspond to the optical center of the hemispherical image, and the semi-major axis (a), which may correspond to a radius of the hemispherical image. For each point in the perspective-transformed image, e.g., each pixel (x, y), a corresponding position (u, v) in the VL-D image may be determined, and the value of each point (x, y) in the perspective-transformed image may be determined, e.g., using bilinear interpolation of the values near the corresponding position (u, v) in the VL-D image.
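
As a non-limiting illustration of mapping perspective-image pixels (x, y) back to fisheye-image positions (u, v) and sampling them with bilinear interpolation, the following Python sketch assumes an equidistant fisheye model and a single-channel image; all names and parameters are hypothetical.

```python
import numpy as np

def remap_fisheye_to_perspective(fisheye, xc, yc, f_fish, f_persp, out_size):
    """Build a perspective view from an equidistant fisheye image by mapping
    each output pixel (x, y) to a fisheye position (u, v) and sampling it
    with bilinear interpolation."""
    h, w = out_size
    out = np.zeros((h, w), dtype=float)
    for y in range(h):
        for x in range(w):
            # Ray direction of the output pixel under a pinhole camera model.
            dx, dy = x - w / 2.0, y - h / 2.0
            r = np.hypot(dx, dy)
            theta = np.arctan2(r, f_persp)   # angle from the optical axis
            phi = np.arctan2(dy, dx)         # azimuth around the axis
            # Equidistant fisheye model: radial position R = f * theta.
            R = f_fish * theta
            u, v = xc + R * np.cos(phi), yc + R * np.sin(phi)
            # Bilinear interpolation of the fisheye image at (u, v).
            u0, v0 = int(np.floor(u)), int(np.floor(v))
            if 0 <= u0 < fisheye.shape[1] - 1 and 0 <= v0 < fisheye.shape[0] - 1:
                du, dv = u - u0, v - v0
                out[y, x] = ((1 - du) * (1 - dv) * fisheye[v0, u0]
                             + du * (1 - dv) * fisheye[v0, u0 + 1]
                             + (1 - du) * dv * fisheye[v0 + 1, u0]
                             + du * dv * fisheye[v0 + 1, u0 + 1])
    return out

# Example with a synthetic 200x200 single-channel fisheye image.
img = np.random.rand(200, 200)
view = remap_fisheye_to_perspective(img, xc=100, yc=100, f_fish=60, f_persp=80,
                                    out_size=(120, 120))
```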

In step 13120, perspective transformation may include using intrinsic parameters, extrinsic parameters, or both, which may be obtained through a calibration process. The intrinsic parameters may correspond to lens distortion. The extrinsic parameters may correspond to a transformation between the coordinate system of the hemispherical or spherical visible light image and the coordinate system of the hemispherical or spherical non-visible light depth image. In step 13120, perspective converting may include aligning the perspective converted hemispherical or spherical visible light image with a corresponding perspective converted hemispherical or spherical non-visible light depth image.
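
As a non-limiting illustration of using extrinsic parameters to align the two coordinate systems, the following Python sketch transforms a point from the depth-image coordinate system into the visible-light coordinate system with a rotation R and translation t and projects it with pinhole intrinsics; the pinhole model and all names are assumptions for this example.

```python
import numpy as np

def align_depth_point(point_depth_cam, R, t, fx, fy, cx, cy):
    """Map a 3D point expressed in the depth-camera frame into the
    visible-light camera frame using extrinsics (R, t), then project it
    with intrinsics (fx, fy, cx, cy)."""
    p = R @ np.asarray(point_depth_cam, dtype=float) + np.asarray(t, dtype=float)
    u = fx * p[0] / p[2] + cx
    v = fy * p[1] / p[2] + cy
    return u, v, p[2]  # pixel position in the visible image and depth along its axis

# Example: identity extrinsics place the point at the visible image's principal point.
print(align_depth_point([0.0, 0.0, 2.0], np.eye(3), [0.0, 0.0, 0.0],
                        fx=500, fy=500, cx=320, cy=240))  # (320.0, 240.0, 2.0)
```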

In step 13200, three-dimensional tracking includes obtaining and tracking objects and object state information (three-dimensional tracking information), which may include three-dimensional shape information, object type information, relative or absolute object position information, kinematic object state information, such as direction, velocity, and acceleration information, or other object state information. In some embodiments, three-dimensional tracking in step 13200 may include generating a three-dimensional model of the operating environment of the user device corresponding to the VL-D images captured in step 13100. For example, a three-dimensional tracking unit of the user device may receive a perspective-converted hemispherical or spherical VL-D image from a hemispherical or spherical image capture apparatus and may generate a three-dimensional model based on the received perspective-converted hemispherical or spherical VL-D image, for example using simultaneous localization and mapping (SLAM).

In step 13200, three-dimensional tracking includes, in step 13210, feature extraction; step 13220, feature matching; step 13230, tracking the object state; and step 13240, analyzing the state of the object. Although not separately shown in FIG. 13, in step 13200, three-dimensional tracking may include outputting, such as transmitting or storing, three-dimensional tracking information, perspective transformed hemispherical or spherical VL-D images, or a combination thereof. In some embodiments, the perspective transformation of step 13120 may be omitted, and the implementation of three-dimensional tracking in step 13200 may be based on the hemispherical or spherical VL-D image or sequence of images captured in step 13100.

In step 13210, feature extraction may include performing feature extraction based on the perspective-converted hemispherical or spherical VL-D image, such as feature extraction based on the Scale Invariant Feature Transform (SIFT), feature extraction based on Histograms of Oriented Gradients (HOG), feature extraction based on Speeded Up Robust Features (SURF), Haar feature extraction, feature extraction based on neural networks, and so on. In the perspective-converted hemispherical or spherical VL-D images, one or more features may be identified that correspond to portions of objects captured in the respective images, such as corners or edges of the objects. For example, one or more features may be identified in a perspective-converted hemispherical or spherical visible light image, and one or more features may be identified in the corresponding perspective-converted hemispherical or spherical non-visible light depth image. In some embodiments, a temporal sequence of VL-D images may be obtained, and feature extraction in step 13210 may include extracting and identifying features from two or more VL-D images in the temporal sequence of VL-D images.
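
As a non-limiting illustration of SIFT-based feature extraction on a perspective-converted image, the following Python sketch uses OpenCV; the availability of cv2.SIFT_create depends on the installed OpenCV build, and the function name extract_features and the example file name are hypothetical.

```python
import cv2  # requires opencv-python; SIFT is included in recent releases

def extract_features(gray_image):
    """Detect SIFT keypoints and compute their descriptors on a grayscale image."""
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray_image, None)
    return keypoints, descriptors

# Example usage on a perspective-converted visible-light image loaded from disk:
# image = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
# kp, desc = extract_features(image)
```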

In some embodiments in which a temporal sequence of VL-D images is obtained, feature matching in step 13220 may include identifying correspondences between features identified or extracted in step 13210 from a first VL-D image in the temporal sequence and features identified or extracted in step 13210 from a second, e.g., subsequent, VL-D image in the temporal sequence, which may include aligning the first VL-D image with the second VL-D image.
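
As a non-limiting illustration of matching features between a first and a second VL-D image, the following Python sketch performs a brute-force k-nearest-neighbour descriptor match with a ratio test; it assumes SIFT descriptors as in the previous sketch, and the function name is hypothetical.

```python
import cv2

def match_features(desc_first, desc_second, ratio=0.75):
    """Match descriptors between two frames with a 2-nearest-neighbour search
    and a ratio test to discard ambiguous correspondences."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(desc_first, desc_second, k=2)
    good = []
    for pair in knn:
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return good  # list of cv2.DMatch correspondences
```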

In step 13230, obtaining object state information may include determining three-dimensional shape information, object type information, relative or absolute object position information, kinematic object state information, such as direction, velocity, and acceleration information, or other object state information, which may be based on the features extracted in step 13210 and the features matched in step 13220. The object state information obtained in step 13230 may be output to the image acquisition operation in step 13110, as indicated by the directional line at 13235.
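
As a non-limiting illustration of deriving kinematic object state information from tracked positions, the following Python sketch estimates velocity and acceleration by finite differences over the three most recent 3D positions; the sampling interval dt and all names are assumptions for this example.

```python
import numpy as np

def kinematic_state(positions, dt):
    """Estimate velocity and acceleration of a tracked object from its last
    three 3D positions, sampled dt seconds apart (finite differences)."""
    p = np.asarray(positions, dtype=float)
    velocity = (p[-1] - p[-2]) / dt
    acceleration = (p[-1] - 2.0 * p[-2] + p[-3]) / dt ** 2
    return velocity, acceleration

# Example: an object moving 0.1 m per frame along x at 30 frames per second.
v, a = kinematic_state([[0.0, 0, 1], [0.1, 0, 1], [0.2, 0, 1]], dt=1 / 30)
print(v, a)  # velocity ~[3, 0, 0] m/s, acceleration ~[0, 0, 0]
```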

In step 13240, object state analysis may be performed based on the object state information obtained in step 13230. For example, object state analysis in step 13240 may include determining whether a tracked person is asleep, e.g., determining a probability that the tracked person enters a sleep state within a set period of time, and performing or initiating an operation when it is determined that the person is asleep. For example, when it is determined that the tracked person is asleep, the object state analysis in step 13240 may include turning off an external device or communicating with an external device, e.g., to pause a video or music stream, mute non-important messages, increase a volume setting for important messages, and so on.
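
As a non-limiting illustration of the object state analysis described above, the following Python sketch triggers hypothetical actions when an estimated sleep probability exceeds a threshold; the action names and the threshold value are illustrative only.

```python
def handle_sleep_state(sleep_probability, threshold=0.9):
    """When the estimated probability that a tracked person is asleep exceeds
    a threshold, return the actions the device might initiate."""
    if sleep_probability >= threshold:
        return ["pause_media_stream", "mute_non_important_messages"]
    return []

print(handle_sleep_state(0.95))  # ['pause_media_stream', 'mute_non_important_messages']
```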

Fig. 14 is a flow chart of three-dimensional tracking 14000 using hemispherical or spherical non-visible light depth images based on an artificial neural network in an embodiment of the disclosure. The three-dimensional tracking 14000 using hemispherical or spherical non-visible light depth images based on an artificial neural network may be implemented by a non-visible light based depth detection device, such as a user device, for example, the hemispherical fisheye non-visible light depth detection device 3000 shown in fig. 3, the hemispherical fisheye non-visible light depth detection device 4000 shown in fig. 4, the spherical fisheye non-visible light depth detection device 8000 shown in fig. 8, or the spherical fisheye non-visible light depth detection device 9000 shown in fig. 9.

The three-dimensional tracking 14000 using hemispherical or spherical non-visible light depth images based on an artificial neural network may include: receiving an input image at 14100; feature extraction at 14200; feature matching at 14400; and object recognition at 14500. In one embodiment, the artificial neural network may be a deep learning artificial neural network.

At 14200, feature extraction may include an ordered sequence of artificial neural network layers, such as a first convolution and rectified linear unit (ReLU) layer 14210, a first pooling layer 14220, a second convolution and rectified linear unit (ReLU) layer 14230, and a second pooling layer 14240. Feature extraction at 14200 may output feature extraction data 14300. Other layers may be used, as indicated by the ellipsis between the feature extraction at 14200 and the output at 14300.

At 14400, feature matching may include obtaining flattened feature data 14410 based on the feature extraction output 14300, and obtaining a fully connected data set 14420.

At 14500, object recognition may obtain object recognition information 14510, including one or more object classifications, which may be sorted by probability.
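
As a non-limiting illustration of the layer ordering at 14200 through 14500 (convolution and ReLU, pooling, flattening, a fully connected layer, and probability-sorted classifications), the following PyTorch sketch shows one possible arrangement; the layer sizes, the four-channel (e.g., RGB-D) input, and the class count are assumptions, not the disclosed network.

```python
import torch
from torch import nn

class TrackingNet(nn.Module):
    """Two convolution + ReLU + pooling stages for feature extraction, a
    flatten and fully connected stage, and a softmax over object classes."""
    def __init__(self, in_channels=4, num_classes=10):  # e.g., RGB-D input
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, num_classes),  # assumes 64x64 input images
        )

    def forward(self, x):
        logits = self.classifier(self.features(x))
        return torch.softmax(logits, dim=1)  # class probabilities, sortable

# Example: one 64x64 four-channel image.
net = TrackingNet()
probs = net(torch.rand(1, 4, 64, 64))
print(probs.shape)  # torch.Size([1, 10])
```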

Fig. 15 is a schematic diagram of a scene 15000 three-dimensionally tracked using hemispherical or spherical non-visible depth images, in an embodiment of the disclosure.

The scene 15000 that is three-dimensionally tracked using hemispherical or spherical non-visible light depth images includes a hemispherical or spherical non-visible light depth image capture device 15100, such as a user device, e.g., the hemispherical fisheye non-visible light depth detection device 3000 shown in fig. 3, the hemispherical fisheye non-visible light depth detection device 4000 shown in fig. 4, the spherical fisheye non-visible light depth detection device 8000 shown in fig. 8, or the spherical fisheye non-visible light depth detection device 9000 shown in fig. 9. The hemispherical or spherical non-visible light depth image capture device 15100 is arranged in a circle and surrounded by eight people 15200, 15210, 15220, 15230, 15240, 15250, 15260, 15270, 15280. The hemispherical or spherical non-visible light depth image capture device 15100 may capture a hemispherical or spherical non-visible light depth image, or a sequence of hemispherical or spherical non-visible light depth images, that includes the eight people 15200, 15210, 15220, 15230, 15240, 15250, 15260, 15270, 15280, and may perform three-dimensional tracking using the hemispherical or spherical non-visible light depth image. Although eight tracked individuals are shown in fig. 15, other numbers of individuals or objects may be tracked.

Fig. 16 is a schematic illustration of a visualization 16000 of a scene for three-dimensional tracking using hemispherical or spherical non-visible light depth images in an embodiment of the disclosure. The visualization 16000 shown in fig. 16 may correspond to a scene, such as the scene 15000 shown in fig. 15.

A plurality of individuals surrounding a hemispherical or spherical non-visible light depth image capture device may be captured in a hemispherical or spherical non-visible light depth image, or in a sequence of such images, and perspective correction may be used to generate, from the portions of the captured image or images corresponding to a two-dimensional perspective view of each person, a visualization such as the single-line visualization 16100 shown at the top, the multi-line visualization 16200 shown in the middle, the cylinder visualization 16300 shown at the bottom, or another visualization.

The single-line visualization 16100 shows eight individuals 16110, 16120, 16130, 16140, 16150, 16160, 16170, 16180 positioned around a hemispherical or spherical non-visible light depth image capture device, for example in a circular scene, such as the scene shown in fig. 15, arranged along a single line as shown in fig. 16.

The multi-line visualization 16200 shows eight individuals 16210, 16220, 16230, 16240, 16250, 16260, 16270, 16280 positioned around a hemispherical or spherical non-visible light depth image capture device, for example in a circular scene, such as the scene shown in fig. 15, arranged along multiple lines as shown in fig. 16.

The cylinder visualization 16300 shows three of the individuals 16310, 16320, 16330 positioned around a hemispherical or spherical non-visible light depth image capture device, for example in a circular scene, such as the scene shown in fig. 15, arranged on a cylinder as shown in fig. 16.

The hemispherical or spherical non-visible light depth image capturing device can track the depth information of each human face. The individual faces may be scaled and normalized for visualization 16000. For example, the distance between each individual and a hemispherical or spherical non-visible depth image capture device may be different. For example, a first individual may be further from a hemispherical or spherical non-visible light depth image capture device than a second individual. Generating the visualization 16000 can include scaling and normalizing the visualization portion corresponding to each individual so that the distance between each individual and the hemispherical or spherical non-visible light depth image capture device appears equal.
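
As a non-limiting illustration of scaling a cropped face so that individuals at different distances appear equidistant from the capture device, the following Python sketch rescales by the ratio of the measured face depth to a reference depth using nearest-neighbour resampling; all names and the reference depth are assumptions.

```python
import numpy as np

def normalize_face_scale(face_image, face_depth_m, reference_depth_m=1.0):
    """Rescale a cropped face so that faces captured at different depths
    appear as if captured at the reference depth. Apparent size is inversely
    proportional to depth, so the scale factor is face_depth / reference_depth."""
    scale = face_depth_m / reference_depth_m
    h, w = face_image.shape[:2]
    new_h, new_w = max(1, int(round(h * scale))), max(1, int(round(w * scale)))
    # Nearest-neighbour resampling keeps the sketch dependency-free.
    rows = (np.arange(new_h) * h / new_h).astype(int)
    cols = (np.arange(new_w) * w / new_w).astype(int)
    return face_image[np.ix_(rows, cols)]

# Example: a face captured at 2 m is enlarged relative to the 1 m reference.
face = np.random.rand(50, 40)
print(normalize_face_scale(face, face_depth_m=2.0).shape)  # (100, 80)
```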

The aspects, features, elements, and embodiments of the methods, programs, or algorithms described in this disclosure may be implemented by a computer program, software, or firmware in a computer readable storage medium for execution by a computer or processor, and may take the form of a computer program product accessible, for example, from a tangible computer-usable or computer readable medium.

As used herein, the term "computer" or "computing device" includes any unit or combination of units capable of performing any method of the present disclosure, or any portion thereof. As used in this disclosure, the terms "user equipment," "mobile device," or "mobile computing device" include, but are not limited to, user equipment, wireless transmit/receive units, mobile stations, fixed or mobile subscriber units, pagers, cellular telephones, Personal Digital Assistants (PDAs), computers, or any other type of user equipment capable of operating in a mobile environment.

As used herein, the term "processor" includes a single processor or multiple processors, such as one or more special purpose processors, one or more digital signal processors, one or more microprocessors, one or more controllers, one or more Application Specific Integrated Circuits (ASICs), one or more Application Specific Standard Products (ASSPs); one or more Field Programmable Gate Array (FPGA) circuits, any other type or combination of Integrated Circuits (ICs), one or more state machines, or any combination thereof.

As used herein, the term "memory" includes any computer-usable or computer-readable medium or device that can, for example, tangibly embody, store, communicate, or transmit any signal or information for use by or in connection with any processor. Examples of a computer-readable storage medium may include one or more read-only memories, one or more random-access memories, one or more registers, one or more cache memories, one or more semiconductor memory devices, one or more magnetic media such as internal hard disks and removable disks, one or more magneto-optical media, one or more optical media such as CD-ROM disks and Digital Versatile Disks (DVDs), or any combination thereof.

As used herein, the term "instructions" may include instructions for performing any of the methods disclosed herein, or any portion thereof, and may be implemented in hardware, software, or any combination thereof. For example, the instructions may be implemented as information stored in a memory, such as a computer program, that is executable by a processor to perform any of the respective methods, algorithms, aspects, or combinations thereof described herein. In some embodiments, the instructions, or portions thereof, may be implemented as a special purpose processor or circuitry that may include specialized hardware for performing any of the methods, algorithms, aspects, or combinations thereof described herein. Portions of the instructions may be distributed across multiple processors on the same machine or on different machines, or across a network, such as a local area network, a wide area network, the internet, or a combination thereof.

As used herein, the terms "example," "embodiment," "implementation," "aspect," "feature," or "element" mean serving as an example, instance, or illustration. Any example, embodiment, implementation, aspect, feature, or element is independent of every other example, embodiment, aspect, feature, or element and may be used in combination with any other example, embodiment, implementation, aspect, feature, or element, unless expressly stated otherwise.

As used herein, the terms "determine" and "identify," or any variation thereof, include selecting, determining, calculating, querying, receiving, determining, establishing, obtaining, or otherwise identifying or determining in any manner using one or more devices shown and described herein.

As used herein, the term "or" means an inclusive "or" rather than an exclusive "or". That is, unless specified otherwise, or clear from context, "X comprises A or B" is intended to mean any of the natural inclusive permutations. That is, if X comprises A; x comprises B; or X includes both A and B, then in any of the above cases "X includes A or B" is satisfied. In addition, the use of "a" or "an" in this application and the appended claims should generally be construed to mean "one or more" unless specified otherwise or clear from context to be directed to a singular form.

Moreover, for simplicity of explanation, while the figures and descriptions herein may include a sequence or series of steps or stages, elements of the methods disclosed herein may occur in various orders or concurrently. Moreover, elements of the methods disclosed herein may occur in conjunction with other elements not expressly set forth or described herein. Moreover, not all elements of a method described herein need be implemented in accordance with a method of the present disclosure. Although the aspects, features, and elements are described herein in particular combinations, the aspects, features, or elements can be used alone, or in various combinations with or without other aspects, features, and elements.
