System and method for multi-modal depth sensing in an automated surgical robotic vision system

Document No.: 690445    Publication date: 2021-04-30

Reading note: This technology, "System and method for multi-modal depth sensing in an automated surgical robotic vision system," was designed and created by 托马斯·J·卡列夫, 蒂娜·P·陈, 伊曼纽尔·德马约, 托尼·陈, and 瓦西里·叶夫根耶维奇·布哈林 on 2019-07-19. Abstract: Systems and methods for multi-modal sensing of three-dimensional positional information of a surface of an object are disclosed. In particular, multiple visualization modalities are each used to collect unique positional information of the surface of the object. The positional information calculated by each modality is combined using weighting factors to calculate a final weighted three-dimensional position. In various embodiments, a first depth may be recorded using fiducial markers, a second depth may be recorded using a structured light pattern, and a third depth may be recorded using a light field camera. A weighting factor may be applied to each recorded depth, and a final weighted depth may be calculated.

1. A method, comprising:

recording an image comprising an object, a first plurality of markers disposed on the object, a second plurality of markers disposed on the object, and a third plurality of markers disposed on the object;

calculating a first depth using the image and the first plurality of markers;

calculating a second depth using the image and the second plurality of markers;

calculating a third depth using the image and the third plurality of markers;

assigning a first weight to the first depth, a second weight to the second depth, and a third weight to the third depth; and

calculating a weighted average depth based on the first depth, the second depth, the third depth, the first weight, the second weight, and the third weight.

2. The method of claim 1, wherein recording the image is performed using one or more digital cameras.

3. The method of claim 2, wherein the one or more digital cameras comprise a stereoscopic camera system.

4. The method of claim 2 or 3, wherein the one or more digital cameras comprise plenoptic cameras.

5. The method of any of the preceding claims, wherein the first plurality of markers comprises fiducial markers.

6. The method of claim 5, wherein the fiducial markers comprise liquid ink.

7. The method of any one of the preceding claims, wherein recording the image comprises:

overlaying a structured light pattern on the surface of the object from a structured light source;

recording the structured light pattern on the object; and

calculating a geometric reconstruction of the structured light pattern.

8. The method of any one of the preceding claims, wherein recording the image comprises:

overlaying a light pattern on the surface of the object from a light source;

recording a first image of the light pattern using a first camera at a first location;

recording a second image of the light pattern at a second location using a second camera, the second location being a predetermined distance from the first location; and

calculating a disparity value between the first image and the second image.

9. The method of any of the preceding claims, wherein the third plurality of markers comprises a contrast agent applied to a surface of the object.

10. The method of claim 9, wherein the contrast agent is a nebulized liquid dye.

11. The method of any of claims 1-10, wherein the third weight is greater than the first weight and the second weight.

12. The method of any of claims 1-10, wherein the second weight is greater than the first weight and the third weight.

13. The method of any of claims 1-10, wherein the first weight is greater than the second weight and the third weight.

14. The method of any of claims 1-10, wherein the first weight is equal to the second weight and the third weight.

15. A system, comprising:

an imaging device;

a computing node comprising a computer-readable storage medium having program instructions embodied thereon, the program instructions executable by a processor of the computing node to cause the processor to perform a method comprising:

recording an image, the image comprising an object, a first plurality of markers disposed on the object, a second plurality of markers disposed on the object, and a third plurality of markers disposed on the object;

calculating a first depth using the image and the first plurality of markers;

calculating a second depth using the image and the second plurality of markers;

calculating a third depth using the image and the third plurality of markers;

assigning a first weight to the first depth, a second weight to the second depth, and a third weight to the third depth; and

calculating a weighted average depth based on the first depth, the second depth, the third depth, the first weight, the second weight, and the third weight.

16. The system of claim 15, wherein the imaging device comprises one or more digital cameras.

17. The system of claim 16, wherein the one or more digital cameras comprise a stereo camera system.

18. The system of claim 16 or 17, wherein the one or more digital cameras comprise plenoptic cameras.

19. The system of any of claims 15-18, wherein the first plurality of markers comprises fiducial markers.

20. The system of claim 19, wherein the fiducial markers comprise liquid ink.

21. The system of any of claims 15 to 20, further comprising a structured light source configured to project a structured light pattern on the surface of the object.

22. The system of claim 21, wherein recording the image comprises:

overlaying a structured light pattern on the surface of the object from the structured light source;

recording the structured light pattern on the object; and

calculating a geometric reconstruction of the structured light pattern.

23. The system of any of claims 15 to 22, wherein recording the image comprises:

overlaying a light pattern on a surface of the object from a light source;

recording a first image of the light pattern using a first camera at a first location;

recording a second image of the light pattern at a second location using a second camera, the second location being a predetermined distance from the first location; and

calculating a disparity value between the first image and the second image.

24. The system of any of claims 15 to 23, wherein the third plurality of markers comprises a contrast agent applied to a surface of the object.

25. The system of claim 24, wherein the contrast agent is a nebulized liquid dye.

26. The system of any of claims 15 to 25, wherein the third weight is greater than the first weight and the second weight.

27. The system of any of claims 15 to 25, wherein the second weight is greater than the first weight and the third weight.

28. The system of any of claims 15 to 25, wherein the first weight is greater than the second weight and the third weight.

29. The system of any of claims 15 to 25, wherein the first weight is equal to the second weight and the third weight.

30. The system of any one of claims 15 to 29, further comprising an endoscope having a proximal end and a distal end, wherein the imaging device is disposed at the proximal end.

31. A computer program product comprising a computer readable storage medium having program instructions embodied thereon, the program instructions executable by a processor to cause the processor to perform a method comprising:

recording an image, the image comprising an object, a first plurality of markers disposed on the object, a second plurality of markers disposed on the object, and a third plurality of markers disposed on the object;

calculating a first depth using the image and the first plurality of markers;

calculating a second depth using the image and the second plurality of markers;

calculating a third depth using the image and the third plurality of markers;

assigning a first weight to the first depth, a second weight to the second depth, and a third weight to the third depth; and

calculating a weighted average depth based on the first depth, the second depth, the third depth, the first weight, the second weight, and the third weight.

32. The computer program product of claim 31, wherein recording the image is performed using one or more digital cameras.

33. The computer program product of claim 32, wherein the one or more digital cameras comprise a stereoscopic camera system.

34. The computer program product of claim 32 or 33, wherein the one or more digital cameras comprise plenoptic cameras.

35. The computer program product of any of claims 31-34, wherein the first plurality of markers comprises fiducial markers.

36. The computer program product of claim 35, wherein the fiducial markers comprise liquid ink.

37. The computer program product of any of claims 31-36, wherein recording the image comprises:

overlaying a structured light pattern on the surface of the object from a structured light source;

recording the structured light pattern on the object; and

calculating a geometric reconstruction of the structured light pattern.

38. The computer program product of any of claims 31-37, wherein recording the image comprises:

overlaying a light pattern on a surface of the object from a light source;

recording a first image of the light pattern using a first camera at a first location;

recording a second image of the light pattern at a second location using a second camera, the second location being a predetermined distance from the first location; and

calculating a disparity value between the first image and the second image.

39. The computer program product of any of claims 31 to 38, wherein the third plurality of markers comprises a contrast agent applied to a surface of the object.

40. The computer program product of claim 39, wherein the contrast agent is a nebulized liquid dye.

41. The computer program product of any of claims 31-40, wherein the third weight is greater than the first weight and the second weight.

42. The computer program product of any of claims 31-40, wherein the second weight is greater than the first weight and the third weight.

43. The computer program product of any of claims 31-40, wherein the first weight is greater than the second weight and the third weight.

44. The computer program product of any of claims 31-40, wherein the first weight is equal to the second weight and the third weight.

45. A method, comprising:

recording an image, the image comprising an object, a first plurality of markers disposed on the object, and a second plurality of markers disposed on the object;

calculating a first depth using the image and the first plurality of markers;

calculating a second depth using the image and the second plurality of markers;

assigning a first weight to the first depth and a second weight to the second depth; and

calculating a weighted average depth based on the first depth, the second depth, the first weight, and the second weight.

46. A system, comprising:

an imaging device;

a computing node comprising a computer-readable storage medium having program instructions embodied thereon, the program instructions executable by a processor of the computing node to cause the processor to perform a method comprising:

recording an image, the image comprising an object, a first plurality of markers disposed on the object, and a second plurality of markers disposed on the object;

calculating a first depth using the image and the first plurality of markers;

calculating a second depth using the image and the second plurality of markers;

assigning a first weight to the first depth and a second weight to the second depth; and

calculating a weighted average depth based on the first depth, the second depth, the first weight, and the second weight.

47. A computer program product comprising a computer readable storage medium having program instructions embodied thereon, the program instructions executable by a processor to cause the processor to perform a method comprising:

recording an image, the image comprising an object, a first plurality of markers disposed on the object, and a second plurality of markers disposed on the object;

calculating a first depth using the image and the first plurality of markers;

calculating a second depth using the image and the second plurality of markers;

assigning a first weight to the first depth and a second weight to the second depth; and

calculating a weighted average depth based on the first depth, the second depth, the first weight, and the second weight.

48. An integrated surgical device, comprising:

an endoscope having a proximal end and a distal end;

an imaging device optically coupled to the distal end of the endoscope;

a computing node comprising a computer-readable storage medium having program instructions embodied thereon, the program instructions executable by a processor of the computing node to cause the processor to perform a method comprising:

recording an image, the image comprising an object, a first plurality of markers disposed on the object, a second plurality of markers disposed on the object, and a third plurality of markers disposed on the object;

calculating a first depth using the image and the first plurality of markers;

calculating a second depth using the image and the second plurality of markers;

calculating a third depth using the image and the third plurality of markers;

assigning a first weight to the first depth, a second weight to the second depth, and a third weight to the third depth; and

calculating a weighted average depth based on the first depth, the second depth, the third depth, the first weight, the second weight, and the third weight.

49. An integrated surgical device, comprising:

an endoscope having a proximal end and a distal end;

an imaging device optically coupled to the distal end of the endoscope;

a computing node comprising a computer-readable storage medium having program instructions embodied thereon, the program instructions executable by a processor of the computing node to cause the processor to perform a method comprising:

recording an image, the image comprising an object, a first plurality of markers disposed on the object, and a second plurality of markers disposed on the object;

calculating a first depth using the image and the first plurality of markers;

calculating a second depth using the image and the second plurality of markers;

assigning a first weight to the first depth and a second weight to the second depth; and

calculating a weighted average depth based on the first depth, the second depth, the first weight, and the second weight.

Background

Embodiments of the present disclosure generally relate to multi-modal sensing of three-dimensional positional information of a surface of an object.

Disclosure of Invention

According to embodiments of the present disclosure, systems, methods, and computer program products are provided for determining three-dimensional coordinates on an object. In the method, an image is recorded. The image includes an object, a first plurality of markers disposed on the object, a second plurality of markers disposed on the object, and a third plurality of markers disposed on the object. A first depth is calculated using the image and the first plurality of markers. A second depth is calculated using the image and the second plurality of markers. A third depth is calculated using the image and the third plurality of markers. A first weight is assigned to a first depth, a second weight is assigned to a second depth, and a third weight is assigned to a third depth. A weighted average depth is calculated based on the first depth, the second depth, the third depth, the first weight, the second weight, and the third weight.

In various embodiments, a system for determining three-dimensional coordinates on an object is provided. The system includes an imaging device and a computing node including a computer readable storage medium having program instructions embodied thereon. The program instructions are executable by a processor of the computing node to cause the processor to perform a method in which an imaging device records an image. The image includes an object, a first plurality of markers disposed on the object, a second plurality of markers disposed on the object, and a third plurality of markers disposed on the object. A first depth is calculated using the image and the first plurality of markers. A second depth is calculated using the image and the second plurality of markers. A third depth is calculated using the image and the third plurality of markers. A first weight is assigned to a first depth, a second weight is assigned to a second depth, and a third weight is assigned to a third depth. A weighted average depth is calculated based on the first depth, the second depth, the third depth, the first weight, the second weight, and the third weight.

In various embodiments, a computer program product is provided for determining three-dimensional coordinates on an object. The computer program product includes a computer-readable storage medium having program instructions embodied thereon. The program instructions are executable by a processor to cause the processor to perform a method in which an image is recorded. The image includes an object, a first plurality of markers disposed on the object, a second plurality of markers disposed on the object, and a third plurality of markers disposed on the object. A first depth is calculated using the image and the first plurality of markers. A second depth is calculated using the image and the second plurality of markers. A third depth is calculated using the image and the third plurality of markers. A first weight is assigned to a first depth, a second weight is assigned to a second depth, and a third weight is assigned to a third depth. A weighted average depth is calculated based on the first depth, the second depth, the third depth, the first weight, the second weight, and the third weight.

In various embodiments, systems, methods, and computer program products are provided for determining three-dimensional coordinates on an object. In the method, an image is recorded. The image includes an object, a first plurality of markers disposed on the object, and a second plurality of markers disposed on the object. A first depth is calculated using the image and the first plurality of markers. A second depth is calculated using the image and the second plurality of markers. A first weight is assigned to the first depth and a second weight is assigned to the second depth. A weighted average depth is calculated based on the first depth, the second depth, the first weight, and the second weight.

In various embodiments, an integrated surgical device is provided, comprising: an endoscope having a proximal end and a distal end; an imaging device optically coupled to the distal end of the endoscope; and a computing node comprising a computer readable storage medium having program instructions embodied thereon. The program instructions are executable by a processor of a computing node to cause the processor to perform a method in which an image is recorded. The image includes an object, a first plurality of markers disposed on the object, a second plurality of markers disposed on the object, and a third plurality of markers disposed on the object. A first depth is calculated using the image and the first plurality of markers. A second depth is calculated using the image and the second plurality of markers. A third depth is calculated using the image and the third plurality of markers. A first weight is assigned to a first depth, a second weight is assigned to a second depth, and a third weight is assigned to a third depth. A weighted average depth is calculated based on the first depth, the second depth, the third depth, the first weight, the second weight, and the third weight.

Drawings

Fig. 1 illustrates an exemplary image of a surface having fiducial marks, where the image may be used as a baseline image, according to an embodiment of the present disclosure.

FIG. 2 illustrates an exemplary image of a surface having a matrix of structured light markings overlaying a baseline image according to embodiments of the present disclosure.

Fig. 3A illustrates an exemplary image of a simulated biological tissue according to an embodiment of the disclosure.

Fig. 3B illustrates an exemplary image of a depth map of a simulated biological tissue according to an embodiment of the disclosure.

Fig. 4A illustrates an exemplary image of simulated biological tissue with a contrast agent applied to the surface, according to an embodiment of the disclosure.

Fig. 4B illustrates an exemplary image of a depth map of simulated biological tissue with a contrast agent applied to the surface, according to an embodiment of the disclosure.

Fig. 5 illustrates a 3D surface imaging system imaging tissue according to an embodiment of the present disclosure.

Fig. 6 shows a diagram illustrating a 3D surface imaging system according to an embodiment of the present disclosure.

Fig. 7 illustrates an exemplary flow diagram of a method for determining three-dimensional coordinates on an object according to an embodiment of the present disclosure.

FIG. 8 shows a table of the analyzed sensors and their specifications according to an embodiment of the present disclosure.

Fig. 9A to 9C show graphs of sensor bias results according to an embodiment of the present disclosure.

Fig. 10A to 10C show graphs of results of sensor accuracy according to an embodiment of the present disclosure.

Fig. 11 shows a table of lateral noise for various sensors according to embodiments of the present disclosure.

Fig. 12A to 12D show graphs of precision for different materials and lighting conditions (the lower the better) according to an embodiment of the present disclosure.

Fig. 13A shows a graph of accuracy in a multi-sensor setup (where the index represents the distance to the target) and fig. 13B shows a graph of the NaN ratio (the lower the better) in a multi-sensor setup, according to an embodiment of the present disclosure.

Fig. 14A-14C show graphs of the effect of additional sensors according to embodiments of the present disclosure.

FIG. 15 shows a schematic diagram of an exemplary compute node, according to an embodiment of the present disclosure.

Detailed Description

The ability to accurately discern the three-dimensional positional information (X, Y, Z) of a target object (e.g., biological tissue) is a necessary and critical requirement of an automated surgical robotic system. One approach is to use fiducial markers of known size and shape attached directly to the surface of the object to determine positional information about the surface; however, the spatial resolution of any method using fiducial markers is limited by the number of fiducials applied to the tissue. The fiducial markers must be large enough to be detectable by a computer vision system, but also small enough to maximize the spatial resolution of the surface to which they are attached. Because of these conflicting requirements, there is an upper limit to the spatial resolution provided by fiducial markers, especially in a surgical environment where the automated surgical robotic system may operate in a small, confined space.

Many surgical procedures (e.g., suturing) require highly dexterous and highly precise movement of surgical tools to achieve satisfactory surgical results. In fully automated robotic surgical procedures without active human control, the accuracy of the robotic controlled surgical tool is highly dependent on the spatial resolution of the computer vision system. Since the surgical outcome depends to a large extent on the positional accuracy of the computer vision system guiding the robotic tool, the spatial resolution of the surgical site is even more important in fully automated robotic surgical procedures. The use of fiducial markers alone to guide a fully automated surgical robot does not provide sufficient spatial resolution of the surgical site to ensure satisfactory results.

Accordingly, there is a need for systems and methods that can accurately and reliably sense positional information at resolutions high enough to enable accurate surgical planning and execution, thereby improving the feasibility of robot-assisted surgery.

Embodiments of the present disclosure generally relate to multi-modal sensing of three-dimensional positional information of an object surface. In particular, the present disclosure describes multiple visualization modalities for collecting unique positional information of the object surface, which are then combined using weighting factors to compute the final three-dimensional position. Although the present disclosure generally focuses on sensing three-dimensional positions relative to an automated surgical robot, the systems, methods, and computer program products are applicable to other fields that employ computer vision techniques to identify three-dimensional positions, such as virtual reality applications or augmented reality applications.

Systems for determining three-dimensional coordinates on a surface of an object (e.g., biological tissue) typically include a first imaging system for establishing a baseline image of the object. A baseline image may be established using, for example, a series of fiducial markers attached to the surface of the object to generate positional information of the surface of the object. For example, fiducial markers may be placed on the surface of the tissue via a spray applicator (e.g., a jet catheter). Generally, fiducial markers are specific markers that can be recognized by a computer vision system to determine specific positional information about the surface to which they are attached. Non-limiting examples of fiducial markers may include symbols (e.g., alphanumeric), patterns (e.g., QR code), liquids (e.g., infrared ink), or physical shapes (2D or 3D). This positional information can be used to map the surface of the object and create a computer simulation of the surface in three dimensions. Fiducial markers may be attached to an object in a specific pattern (e.g., a grid pattern) or in a non-specific pattern (e.g., randomly placed).

In various embodiments, the fiducial marker is applied to the target tissue in a liquid state through a syringe needle. Applying liquid markers to the target tissue has several advantages. First, the marker can be mixed in situ, which improves its stability. Second, liquid markers allow precise control over where and how much marker is applied to the target tissue. Third, the marker may be applied in any irregular shape. When the liquid marker is applied with a syringe into the irrigated surgical field, an exothermic reaction solidifies the marker onto the target tissue in a circular shape. Circular markers may help track a single point of interest on the target tissue during the surgical procedure.

In various embodiments, a marking tip, such as a syringe needle or felt tip, may be used to dispense the fiducial marker in a linear pattern. Applied as a continuous line, the marker can be used to define boundaries on the target tissue. Defining a boundary may help to identify areas of diseased tissue or areas where a surgical procedure should not be performed. In yet another embodiment, the liquid marker may be sprayed onto the target tissue as it polymerizes to create a speckle pattern. Speckle patterns may be useful for distinguishing large regions of tissue from one another. In one example, background tissue may be speckled to distinguish it from foreground tissue. Background and foreground information may be used by other components in a robotic or semi-autonomous workflow to plan or control movements or recommendations.

In other embodiments, the liquid marker may be applied through a predetermined mask to apply any arbitrary and predetermined shape of marker on the target tissue.

To acquire position information of the surface of the object using the fiducial markers, the first imaging system may include one or more cameras (e.g., one, two, three, four, or five). In various implementations, the one or more cameras may include a stereo camera. In various embodiments, the stereo camera may be implemented by two separate cameras. In various embodiments, two separate cameras may be disposed at a predetermined distance from each other. In various embodiments, the stereo camera may be located at the distal-most end of the surgical instrument (e.g., laparoscope, endoscope, etc.). The camera may cross-reference the detected position of each fiducial marker relative to a known reference (e.g., a known size and shape of the fiducial) to determine positional information (e.g., depth) of each fiducial marker. The position information used herein may be generally defined as (X, Y, Z) in a three-dimensional coordinate system.

The one or more cameras may be, for example, infrared cameras that emit infrared radiation and detect reflections of the emitted infrared radiation. In other embodiments, the one or more cameras may be digital cameras known in the art. In other embodiments, one or more cameras may be plenoptic cameras. As described in more detail below, one or more cameras may be connected to the compute node.

In addition to fiducial marker tracking, the present disclosure improves upon single-mode approaches that employ only fiducial markers by incorporating other visualization modalities to improve the accuracy of the resulting position information. The second imaging system may be used to generate positional information of the surface of the object (e.g., after recording a baseline image using the first imaging system and acquiring positional information for each fiducial marker), either alone or in combination with the other imaging systems described herein. A structured pattern projected from a structured light source may change in the shape, size, and/or spacing of its pattern features when projected onto a surface. Given the known pattern stored by the second imaging system, the second imaging system may detect these changes and determine positional information based on the changes to the structured light pattern. For example, the second imaging system may include a structured light source (e.g., a projector) that projects a particular structured pattern of lines (e.g., a matrix of dots or a series of stripes) onto the surface of the object. The projected lines appear distorted when viewed from perspectives other than that of the source, and these distortions can be used for geometric reconstruction of the surface shape, thereby providing positional information about the object surface.

The second imaging system may include one or more cameras (e.g., one, two, three, four, or five) capable of detecting the projected pattern from the structured light source. The one or more cameras may be digital cameras known in the art, and may be the same or different cameras used with the first imaging system. As described in more detail below, one or more cameras may be connected to the compute node. Using images from one or more cameras, the compute node may compute position information (X, Y, Z) for any suitable number of points along the surface of the object, thereby generating a depth map of the surface.
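
By way of illustration only, the sketch below shows one hypothetical way a computing node might first localize a projected dot pattern in a camera frame before performing geometric reconstruction. OpenCV is used here purely as an example library, and the detector thresholds are placeholder values, not parameters taken from the disclosure.

```python
import cv2
import numpy as np

def detect_pattern_dots(gray):
    """Locate bright projected structured-light dots in an 8-bit grayscale frame.

    Returns an (N, 2) array of (x, y) pixel centers, which could then be fed
    into a geometric reconstruction step that recovers (X, Y, Z) per dot.
    """
    params = cv2.SimpleBlobDetector_Params()
    params.filterByColor = True
    params.blobColor = 255        # projected dots are bright on a darker surface
    params.filterByArea = True
    params.minArea = 4            # reject single-pixel noise (placeholder value)
    params.maxArea = 400          # reject large bright regions (placeholder value)

    detector = cv2.SimpleBlobDetector_create(params)
    keypoints = detector.detect(gray)
    return np.array([kp.pt for kp in keypoints], dtype=float).reshape(-1, 2)
```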

The third imaging system may be used to generate additional position information of the object surface. The third imaging system may include one or more cameras, for example, light field cameras (e.g., plenoptic cameras), and may be the same or different cameras as used for the first and second imaging systems. By having appropriate zoom and focus depth settings, plenoptic cameras can be used to generate accurate position information of the object surface.

One type of light field (e.g., plenoptic) camera that may be used in accordance with the present disclosure uses a microlens array placed in front of an otherwise conventional image sensor to sense intensity, color, and direction information. Multi-camera arrays are another type of light field camera. A "standard plenoptic camera" is a standardized mathematical model used by researchers to compare different types of plenoptic (or light field) cameras. By definition, a "standard plenoptic camera" has a microlens array placed one focal length away from the image plane of the sensor. Studies have shown that its maximum baseline is limited to the main lens entrance pupil size, which is small compared to a stereoscopic setup. This means that a "standard plenoptic camera" is suited to close-range applications, since it exhibits higher depth resolution at very close distances, which can be predicted in metric terms from the camera parameters. Other types and configurations of plenoptic cameras may be used, such as a focused plenoptic camera, a coded-aperture camera, and/or a stereo camera combined with a plenoptic camera.

Once the position information is generated using the first, second, and third imaging systems, a combined position may be calculated as a weighted average of the positions from the three imaging systems. As shown in Equation 1 below, the combined pixel depth may be calculated as a weighted average of the depths generated by each of the three imaging systems.

Depth_pixel = C_M · Depth_M + C_SL · Depth_SL + C_P · Depth_P    (Equation 1)

In Equation 1, C_M represents the weight assigned to the first imaging system (e.g., the marker-based system), C_SL represents the weight assigned to the second imaging system (e.g., the structured-light-based system), and C_P represents the weight assigned to the third imaging system (e.g., the light-field-based system); Depth_M represents the pixel depth generated by the first imaging system, Depth_SL represents the pixel depth generated by the second imaging system, and Depth_P represents the pixel depth generated by the third imaging system. In various embodiments, each weight may be a value between zero (0) and one (1), and the weight values may sum to one (1).
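
As a concrete illustration of Equation 1, the following minimal sketch fuses three per-pixel depth maps using scalar weights. The array names are illustrative assumptions; pixels for which a given system produced no depth estimate (NaN) are treated as having zero weight for that system, consistent with the zero-weight handling described further below.

```python
import numpy as np

def fuse_depths(depth_m, depth_sl, depth_p, c_m, c_sl, c_p):
    """Weighted per-pixel fusion of three depth maps (Equation 1).

    depth_m, depth_sl, depth_p -- HxW arrays from the marker-based,
    structured-light, and light-field systems; NaN marks pixels where a
    system produced no depth estimate.
    c_m, c_sl, c_p -- scalar weights in [0, 1] that nominally sum to 1.
    """
    depths = np.stack([depth_m, depth_sl, depth_p])                  # 3 x H x W
    weights = np.array([c_m, c_sl, c_p], dtype=float)[:, None, None]
    weights = np.broadcast_to(weights, depths.shape).copy()

    # A system that could not compute depth at a pixel gets zero weight there.
    weights[np.isnan(depths)] = 0.0
    depths = np.nan_to_num(depths, nan=0.0)

    total = weights.sum(axis=0)
    return np.where(total > 0, (weights * depths).sum(axis=0) / total, np.nan)
```

When all three systems report a depth at a pixel and the weights sum to one, this reduces exactly to Depth_pixel = C_M·Depth_M + C_SL·Depth_SL + C_P·Depth_P.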

In various embodiments, the weight C_M assigned to the first imaging system may be equal to the weight C_SL assigned to the second imaging system and the weight C_P assigned to the third imaging system. In other embodiments, the weight C_SL assigned to the second imaging system is greater than the weight C_M assigned to the first imaging system and/or the weight C_P assigned to the third imaging system. In yet another embodiment, the weight C_P assigned to the third imaging system is greater than the weight C_M assigned to the first imaging system and/or the weight C_SL assigned to the second imaging system.

In various embodiments, the weight of each variable in equation 1 may be determined based on one or more factors selected based on the type of imaging system used. For example, if light field imaging is used, the factors may include: (1) a measure of contrast in the image, (2) the number of saturated pixels (which can be used to measure light intensity) and (3) local variations in depth of particular regions of the image. A high weight value may correspond to an image with high contrast, few saturated pixels and small local variations in depth within the scene.

In another example, if structured light imaging is used, the factors may include: (1) the amount of patterns identified and (2) the number of saturated pixels. A high weight value may correspond to an image having most or all of the pattern recognized and few saturated pixels.

In yet another example, if a fiducial marker is used, the factors may include (1) the number of saturated pixels, (2) the ability to identify the shape/size of the fiducial marker and (3) the ability to distinguish the fiducial marker from the surrounding environment. A high weight value may correspond to an image with few saturated pixels, the ability to identify most or all fiducial markers, and the ability to distinguish the fiducial from the surrounding environment.
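
By way of illustration only, the per-modality factors described in the three examples above could each be reduced to a score and then normalized so that the resulting weights sum to one, as Equation 1 requires. The sketch below assumes such scores are available; the saturation threshold and scaling constant are arbitrary choices, not values from the disclosure.

```python
import numpy as np

def saturation_score(gray, threshold=250):
    """Score in [0, 1] that is high when few pixels are saturated.

    A saturated-pixel count appears as a factor in all three examples above.
    `gray` is an 8-bit grayscale image; the threshold and scaling are placeholders.
    """
    saturated_fraction = float(np.mean(np.asarray(gray) >= threshold))
    return max(0.0, 1.0 - 10.0 * saturated_fraction)

def weights_from_scores(score_m, score_sl, score_p):
    """Normalize per-modality quality scores into weights C_M, C_SL, C_P
    that sum to one, for use in Equation 1."""
    total = score_m + score_sl + score_p
    if total == 0:
        return 1.0 / 3, 1.0 / 3, 1.0 / 3   # no usable data; fall back to equal weights
    return score_m / total, score_sl / total, score_p / total
```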

In various embodiments, any combination of two of the imaging modalities described herein may be used to calculate a first depth and a second depth of the surface of the object. In such embodiments, each of the two imaging modalities may have a respective weighting factor that is applied to the depth determined by that particular modality. In various embodiments, the two weighting factors may sum to one. In various embodiments, the pixel depth is calculated in a manner similar to Equation 1 above, except that it depends on only two weighted depth calculations (rather than three).

In various embodiments, the weight associated with each imaging system may depend on the overall quality of the particular imaging system. For example, one particular imaging system may generally provide more accurate data than another imaging system. In this example, data received from an imaging system with higher accuracy will be given higher weight than data received from an imaging system with lower accuracy. In various embodiments, the accuracy and/or precision of various imaging systems may depend on the distance from the object to be imaged, the material to be imaged, and/or the illumination of the operating environment. In various embodiments, the accuracy and/or precision of various imaging systems may depend on the position of the imaging system in the field of view, e.g., a first imaging system has a higher precision in the center of the field of view and drops rapidly towards the edges, while another imaging system may have a consistent precision throughout the field of view.

A discussion of the performance of various sensors in different situations can be found in Halmetschlager-Funek et al., "An Empirical Evaluation of Ten Depth Cameras," which is incorporated herein by reference in its entirety. Figure 8 shows a table of the sensors analyzed in the Halmetschlager-Funek paper. Figures 9 to 14 reproduce graphs from the Halmetschlager-Funek paper relating to bias, accuracy, lateral noise, the effects of material/illumination/distance, and the effect of additional sensors. In particular, with respect to bias (as shown in FIGS. 9A-9C), the paper describes that while the Kinectv2 provides low bias over the entire range, a significant increase in sensor bias is observed for the structured light sensors starting from d > 3 m. While all three structured light sensors and two active stereo cameras (ZR300 and D435) provide lower bias than the Kinectv2 for distances d < 1 m, three sensors (ZR300, Orbbec, and Structure IO) provide lower bias for depth values d < 2.5 m. For all sensors [full range: d = 0-8 m, FIG. 9A; magnified: d = 0-3 m, FIG. 9B], a quadratic increase in bias was observed. The close-range sensors F200 and SR300 [FIG. 9C] show slightly higher bias than their long-range counterparts, while the Ensenso N35 provides low bias over the entire measurement range.

Regarding accuracy (as shown in FIGS. 10A-10C), a quadratic drop in accuracy was found for all long-range sensors [full range: d = 0-8 m, FIG. 10A; magnified: d = 0-3 m, FIG. 10B], although the structured light sensors differ in scale from the Kinectv2. Overall, the R200 and ZR300 sensors performed the worst, while the Structure IO and Orbbec sensors performed very similarly. At distances d < 2 m, all structured light sensors were observed to produce less measurement noise than the Kinectv2. Furthermore, the D435 is able to collect more accurate results than the Kinectv2 at distances d < 1 m, although the accuracy results for the D435 were observed to be more scattered than those of the other sensors. The close-range sensors [FIG. 10C] encounter noise levels of up to 0.0007 m. Within the range specified by the manufacturer, accuracy values of 0.004 m or less can be obtained.

As for lateral noise (FIG. 11), the analysis shows similar results for the three long-range structured light sensors across distances. For d < 3 m, the noise level is independent of distance, at three pixels for the structured light sensors and one pixel for the Kinectv2 (FIG. 11). The two active stereo sensors (D435 and ZR300) provide low lateral noise levels similar to the Kinectv2. For distances less than 2 m, the R200 achieves lateral noise of two pixels. Among the close-range sensors, the Ensenso N35 exhibits the highest lateral noise.

With respect to material/illumination/distance (FIGS. 12A-12D), a total of 384 data points were collected to determine how sensor accuracy is affected by the reflection and absorption characteristics of six different materials combined with four different illumination conditions ranging from 4.2 to 535.75 lux (FIGS. 12A-12D). The tests showed that the Structure IO sensor handles the various object reflectivities and lighting conditions best. Although it is less accurate than other sensors at distances d > 1.5 m, it is able to collect information on highly reflective surfaces (e.g., aluminum) and under bright lighting conditions. While the Structure IO sensor gives dense depth estimates, the Xtion cannot determine depth values. Under bright lighting conditions, the Orbbec may not be able to collect depth information for four of the six surfaces. The Kinectv2 may not be able to collect reliable depth data for aluminum at distances d = 1 m and d = 1.5 m and under bright lighting conditions. Under bright lighting conditions, the accuracy of the F200 and SR300 sensors may be greatly reduced. During the experimental setup, the active stereo cameras (Ensenso and R200) were expected to handle the different illumination conditions better than the structured light sensors due to their technical nature. In FIGS. 12A-12D, an accuracy of zero indicates that the sensor could not collect any depth information.

As for noise caused by additional sensors (FIGS. 13A, 13B and 14A-14C), the results (FIGS. 13A-13B) indicate that the long-range structured light sensors can handle the noise induced by one or two additional sensors. An anomaly occurs when the distance to the target is 1.5 m and two additional sensors are introduced into the scene. No similar effect was observed with the Kinectv2; that sensor gives stable accuracy results regardless of whether one or two additional sensors are present. The accuracy of the close-range sensors F200 and SR300 may be low when additional sensors are present, while the Ensenso N35 is only slightly affected by a third observing sensor. It is noted that the high NaN ratio of the close-range devices derives in part from the experimental setup: half of the scene is beyond the range of these sensors (FIGS. 14A-14C). In summary, the first experiment with one sensor provides a baseline for the measurements in which two and three sensors view the scene. The first differences become visible when just one additional sensor is added; in particular, if another RealSense device is added to the scene, the NaN ratio of the SR300 and F200 sensors increases significantly. For further analysis, the corresponding depth images are shown in FIGS. 14A-14C, where it is apparent that depth extraction is seriously affected by the additional sensors. The Ensenso and Kinectv2 sensors may not be affected by the other sensors.

In various implementations, as described above, the depth data received from one or more cameras may be of higher quality (e.g., more reliable) than the depth data received from other cameras in the imaging system. In various embodiments, the quality of the depth data may depend on supporting features external to the imaging system. For example, when a camera (e.g., an infrared camera) can clearly read a predetermined number of fiducial markers on the tissue, the depth data may be of higher quality and may therefore be given a higher weight. In various implementations, if the camera is unable to read the predetermined number of markers, the depth data may be of lower quality, and the depth data from that camera may therefore be given a lower weight. Similarly, when the camera can clearly read the structured light pattern from the structured light projector, the depth data produced using the structured light may be of higher quality and therefore given a higher weight.

In various embodiments, the weight associated with each imaging system may depend on the confidence of the depth and/or quality of each pixel. In various embodiments, because some imaging systems have one or more "sweet spots" in an image with higher quality image data, and one or more "dead zones" with lower quality image data, each weight associated with the imaging system may be parameterized at the pixel level of the image. In various implementations, one or more (e.g., all) of the weights may be a function of a two-dimensional point (x, y) representing a pixel in the image. In various embodiments, coordinate points may be assigned to pixels in an image in any suitable manner known in the art. For example, the lower left corner of the image may be assigned coordinates (0, 0), and the upper right corner of the image may be assigned the maximum number of pixels on each respective axis (maximum x pixels, maximum y pixels). In one example, one imaging system (e.g., a stereo camera) may have high quality image data in the center of the image and low quality image data in the periphery. In this particular example, pixels at the center of the image may be assigned a higher weight, and the weight may decrease as the pixels move radially away from the center of the image. In various embodiments, the parametric function may be a continuous function. In various embodiments, the parametric function may be a discontinuous function (e.g., a piecewise function). In various embodiments, the parametric function may comprise a linear function. In various embodiments, the parametric function may comprise an exponential function.
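
As one illustrative parameterization of such per-pixel weights, the sketch below builds a weight map that is highest at the image center and falls off linearly with radial distance. The linear falloff and the edge weight are assumptions made for illustration; any of the continuous or piecewise functions mentioned above could be substituted.

```python
import numpy as np

def radial_weight_map(height, width, center_weight=1.0, edge_weight=0.2):
    """Per-pixel weight map that is highest at the image center and
    decreases linearly with radial distance toward the corners."""
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    r = np.hypot(ys - cy, xs - cx)        # radial distance of each pixel from center
    r_max = np.hypot(cy, cx)              # distance from center to a corner
    return center_weight - (center_weight - edge_weight) * (r / r_max)
```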

In various embodiments, when the imaging system is unable to calculate the depth at a particular pixel, the particular pixel may be assigned a zero weight for that particular imaging system (i.e., that particular imaging system will not be helpful in determining the depth at that particular pixel).

In various embodiments, the imaging system may include stereoscopic depth sensing. In various embodiments, stereoscopic depth sensing may work best when one or more uniquely identifiable features are present in an image (or video frame). In various implementations, stereoscopic depth sensing may be performed using two cameras (e.g., digital cameras). In various embodiments, the cameras may be calibrated to each other. For example, the imaging system may be calibrated based on latency, frame rate, the three-dimensional distance between the two cameras, various distances from the imaging system, various illumination levels, marker types/shapes/colors, and so forth. Software known in the art may be used to control the two cameras and enable stereoscopic depth sensing. In various embodiments, a first image (or video frame) is captured at a first camera and a second image (or video frame) is captured at a second camera located at a predetermined distance from the first camera. In various embodiments, a pixel disparity is calculated between the first image (or video frame) and the second image (or video frame). In various embodiments, the depth may be determined from the pixel disparity value. In various embodiments, closer objects have higher pixel disparity values, while more distant objects have lower pixel disparity values. In various embodiments, three-dimensional coordinates (x, y, z) may be calculated from the determined depth and the camera calibration parameters. In various embodiments, stereoscopic depth sensing may be used with fiducial markers to determine depth.
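
A minimal sketch of the disparity-to-depth step for a rectified stereo pair follows. It assumes known calibration parameters (focal length in pixels, baseline in meters, and principal point), which would come from the camera calibration described above; the function name is illustrative.

```python
def disparity_to_point(u, v, disparity_px, focal_px, baseline_m, cx, cy):
    """Back-project a pixel (u, v) with a known disparity into camera
    coordinates for a rectified stereo pair.

    Returns (x, y, z) in meters; larger disparities correspond to closer points.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a valid depth")
    z = focal_px * baseline_m / disparity_px      # depth from triangulation
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z
```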

In various embodiments, the imaging system may include active stereo depth sensing. In various embodiments, the projector may project a pattern that is unique on a local scale. In various embodiments, any suitable pattern may be used, and the pattern need not be known a priori by the imaging system. In various embodiments, the pattern may change over time. In various embodiments, active stereoscopic depth sensing with a projector may provide depth information for featureless images in unstructured environments.

In various implementations, a static mask may be projected onto the surface of an object (e.g., tissue) in a scene. For example, a physical pattern (e.g., a screen) may be placed in front of the light source, and a lens may be used to focus the light pattern on the surface.

In various embodiments, a Digital Micromirror Device (DMD) projector may be used to project a pattern on the surface of an object. In this embodiment, light is shone onto a micromirror array (e.g., 1,000,000 mirrors arranged in a rectangular array). The mirrors may be controlled to allow or prevent light from entering and illuminating the scene. A lens may be used to focus the light pattern onto the scene. In various implementations, the DMD projector may allow for programmable patterns (e.g., QR codes, letters, circles, squares, etc.). It should be appreciated that a similar effect may be obtained using an optical metasurface instead of a DMD.

In various embodiments, a scanning laser projector may be used to project a pattern on the surface of an object. In this embodiment, one or more laser sources are used to project individual pixels onto the surface. A high-definition image may be created by illuminating one pixel at a time at a high frequency. In various embodiments, a scanning laser projector may not require focusing of the pattern. In various implementations, the scanning laser projector may allow for programmable patterns (e.g., QR codes, letters, circles, squares, etc.).

In various embodiments, custom algorithms may be developed for stereo cameras to detect known programmable patterns and determine depth data from the surface onto which the patterns are projected. In various embodiments, the depth data is calculated by determining a disparity value between a first image (or video frame) from a first camera and a second image (or video frame) from a second camera.
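
One hypothetical way to compute such a disparity map is with OpenCV's semi-global block matcher, as sketched below. The frame file names and matcher parameters are placeholders that would need tuning for a particular camera pair and projected pattern; this is offered as an illustration, not as the disclosed algorithm.

```python
import cv2
import numpy as np

# Hypothetical rectified frames from the two cameras viewing the projected pattern.
left = cv2.imread("left_frame.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_frame.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,   # must be a multiple of 16; placeholder value
    blockSize=5,          # placeholder value
)

# StereoSGBM returns fixed-point disparities scaled by 16.
disparity = matcher.compute(left, right).astype(np.float32) / 16.0
disparity[disparity <= 0] = np.nan   # mark pixels with no valid match

# Depth then follows from the triangulation relation sketched earlier:
# depth = focal_px * baseline_m / disparity
```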

In various embodiments, light of a predetermined wavelength may be projected onto the surface of the object, depending on the material of the surface. Different materials may have different absorption and/or reflection characteristics across different wavelengths of light. In various embodiments, the wavelength is selected such that light is reflected off the outermost surface of the object. In various embodiments, if a wavelength that penetrates the object surface is selected, the resulting image may have a washed-out appearance, resulting in inaccurate depth data (e.g., lower accuracy, higher spatiotemporal noise).

In various embodiments, the imaging system may include an interferometer. In various implementations, the light source may illuminate the scene with the object, and the sensor may measure a phase difference between the emitted light and the reflected light. In various embodiments, the depth may be calculated directly from the sensor measurements. In various embodiments, the method may have lower computational resource requirements, faster processing, work on featureless scenes, and/or work at various lighting levels.
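
For a phase-measuring sensor of this kind operating with a single modulation frequency, one commonly used relation converts the measured phase shift directly to distance, as sketched below. Phase unwrapping and calibration offsets are ignored for simplicity, and this is not presented as the disclosed implementation.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def phase_to_depth(phase_shift_rad, modulation_freq_hz):
    """Distance from the phase difference between emitted and reflected light.

    The light travels to the surface and back, so the round-trip distance is
    (phase / 2*pi) * (c / f_mod), and the depth is half of that.
    """
    round_trip = (phase_shift_rad / (2.0 * math.pi)) * (SPEED_OF_LIGHT / modulation_freq_hz)
    return round_trip / 2.0
```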

In various embodiments, the resulting depth map, including the calculated depth at each pixel, may be post-processed. Depth map post-processing refers to processing the depth map so that it is usable for a particular application. In various embodiments, depth map post-processing may be used to increase accuracy. In various embodiments, depth map post-processing may be used to improve performance and/or for aesthetic reasons. There are many specialized post-processing techniques suitable for use with the systems and methods of the present disclosure. For example, if the imaging device/sensor runs at a higher resolution than the application requires, sub-sampling of the depth map may reduce its size, thereby improving throughput and reducing processing time. In various embodiments, the sub-sampling may be biased. For example, the sub-sampling may be biased to discard depth pixels that lack a depth value (e.g., pixels whose depth cannot be computed and/or is zero). In various embodiments, spatial filtering (e.g., smoothing) may be used to reduce noise in a single depth frame, which may include simple spatial averaging as well as non-linear edge-preserving techniques. In various embodiments, temporal filtering may be performed using data from multiple frames to reduce temporal depth noise. In various embodiments, a simple or time-biased average may be employed. In various implementations, holes in the depth map may be filled, for example where pixels inconsistently report depth values. In various implementations, temporal variations in the signal (e.g., motion in the scene) may cause blurring, which may need to be reduced and/or eliminated. In various implementations, some applications may require a depth value to be present at every pixel. For such cases, when high accuracy is not required, post-processing techniques may be used to extrapolate the depth map to every pixel. In various embodiments, the extrapolation may be performed using any suitable form of extrapolation (e.g., linear, exponential, logarithmic, etc.).
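
The sketch below strings together a few of the post-processing steps mentioned above (sub-sampling, spatial smoothing, and hole filling). The stride, filter size, and nearest-neighbour hole filling are arbitrary illustrative choices.

```python
import numpy as np
from scipy import ndimage
from scipy.interpolate import griddata

def postprocess_depth(depth, subsample=2):
    """Simple depth-map post-processing: sub-sample, smooth, and fill holes."""
    # Sub-sample to reduce the size of the depth map (simple striding here;
    # a biased variant could prefer pixels that actually carry a depth value).
    d = depth[::subsample, ::subsample].copy()

    # Spatial smoothing with a median filter; NaNs are temporarily replaced
    # so the filter has valid input, then restored afterwards.
    filled_for_filter = np.where(np.isnan(d), np.nanmedian(d), d)
    d_smooth = ndimage.median_filter(filled_for_filter, size=3)
    d = np.where(np.isnan(d), np.nan, d_smooth)

    # Hole filling: interpolate missing pixels from their nearest valid neighbours.
    ys, xs = np.nonzero(~np.isnan(d))
    if ys.size:
        grid_y, grid_x = np.mgrid[0:d.shape[0], 0:d.shape[1]]
        d = griddata((ys, xs), d[ys, xs], (grid_y, grid_x), method="nearest")
    return d
```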

In various embodiments, the first imaging system, the second imaging system, and the third imaging system use the same one or more cameras (e.g., plenoptic cameras) connected to the compute nodes. The computing node may process the single recorded image to extract the fiducial marks, the structured light pattern, and the light field data as separate components. Each individual component may be used to calculate positional information (e.g., a depth map) of the object surface. A weighting factor may be applied to each calculated position information to calculate a weighted average depth.

In various embodiments, the system may use any combination of the imaging modalities/systems described above to determine positional information about the tissue surface. In various embodiments, the system may determine that the weight value in equation 1 is zero (0). In this case, the system acquires position data using multiple imaging modalities/systems, but determines that at least one of those imaging modalities/systems does not provide reliable position data, and therefore ignores the particular imaging modality/system that does not provide reliable data when applying equation 1.

In some embodiments, a stereo camera may be used as an imaging system, either alone or in combination with any of the above-described imaging systems.

The object from which the location information is obtained may be any suitable biological tissue. For example, the object may be an internal body tissue, such as esophageal tissue, stomach tissue, small/large intestinal tissue and/or muscle tissue. In other embodiments, the subject may be external tissue, such as dermal tissue on the abdomen, back, arms, legs, or any other external body part. Furthermore, the object may be a bone, an internal organ or other internal body structure. The systems and methods of the present disclosure will similarly work with animals in veterinary applications.

In various embodiments, the systems and methods described herein may be used in any suitable application, such as a diagnostic application and/or a surgical application. As an example of a diagnostic application, the systems and methods described herein may be used in colonoscopy to image and size polyps in the gastrointestinal tract. Information such as the size of the polyp can be used by a healthcare professional to determine a treatment plan for the patient (e.g., surgery, chemotherapy, further testing, etc.). In another example, the systems and methods described herein may be used to measure the size of an incision or hole when extracting a portion or the entire internal organ. As an example of a surgical application, the systems and methods described herein may be used in handheld surgical applications, such as handheld laparoscopic surgery, handheld endoscopic surgery, and/or any other suitable surgical application where imaging and depth sensing may be necessary. In various embodiments, the systems and methods described herein may be used to calculate the depth of a surgical field (including tissues, organs, wires, and/or any instruments). In various embodiments, the systems and methods described herein may be capable of measuring in absolute units (e.g., millimeters).

Various embodiments may be adapted for use in the gastrointestinal (GI) tract, for example as part of an endoscope. In particular, the endoscope may include an atomizing sprayer (nebulizer), an IR source, a camera system and optics, a robotic arm, and an image processor.

In various embodiments, a contrast agent may be applied to a surface of an object, such as a surface of biological tissue, to provide contrast on the surface for which three-dimensional positional information is to be generated by the computer vision system. Certain visualization modalities (e.g., light field imaging) have an accuracy that is directly proportional to the contrast and texture of the surface, and a contrast agent can supply that contrast. In various embodiments, where soft tissue is imaged, the color of the surface may be substantially uniform with little texture. In this case, a contrast agent, such as a nebulized dye that adheres to the tissue (e.g., the serosa), may be applied to the tissue. The dye may fluoresce, providing artificial contrast that greatly improves the accuracy of the light field imaging system.

When a contrast agent is to be used on the surface of the tissue, a calibration may be obtained prior to applying the contrast agent in order to determine depth information.

FIG. 1 shows an exemplary image 100 of a surface 102 having fiducial marks 104, where the image may be used as a baseline image. In FIG. 1, the fiducial marks 104 are provided on the surface 102 in the form of liquid markings. The fiducial marks 104 are arranged in a matrix format so that a computer vision system running on a computing node can identify the fiducial marks 104 and compute a three-dimensional surface from the image. The computer vision system may include one or more cameras that record images of the object and provide the images to a computing node running computer vision software.

In various embodiments, the computer vision system generates three-dimensional position information (X, Y, Z) for each fiducial marker 104. The computer vision system may further interpolate the position information between the fiducial markers 104 or may extrapolate to produce a three-dimensional model of the surface 102 of the object.
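A minimal sketch of this interpolation step is given below, assuming the fiducial marker positions have already been resolved to (X, Y, Z) coordinates. The coordinates shown are invented example data, and scipy is used only as one convenient way to interpolate between sparse points.

```python
import numpy as np
from scipy.interpolate import griddata

# (X, Y, Z) positions of detected fiducial markers, in millimetres (example data).
markers = np.array([
    [0.0, 0.0, 10.2],
    [5.0, 0.0, 10.8],
    [0.0, 5.0, 11.1],
    [5.0, 5.0, 11.6],
    [2.5, 2.5, 12.0],
])

# Dense grid over which to reconstruct the surface.
xs, ys = np.meshgrid(np.linspace(0, 5, 50), np.linspace(0, 5, 50))

# Interpolate Z between the markers; any points that fall outside the convex
# hull of the markers are filled with the nearest marker value.
surface = griddata(markers[:, :2], markers[:, 2], (xs, ys), method="linear")
nearest = griddata(markers[:, :2], markers[:, 2], (xs, ys), method="nearest")
surface = np.where(np.isnan(surface), nearest, surface)
```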

FIG. 2 shows an exemplary image 200 of a surface 202 having a matrix of structured light marks 206 overlaying the baseline image 100 of FIG. 1. The matrix of structured light marks 206 is in the form of a grid of dots. The structured light marks 206 are projected from a structured light source (e.g., a laser) onto the surface 202 of the object so that a computer vision system running on a computing node can identify the structured light marks 206 and compute a three-dimensional surface from the image. The computer vision system may include one or more cameras that record an image of the structured light marks 206 projected onto the object and provide the image to a computing node running computer vision software. The computer vision software may analyze the structured light marks 206 from images taken at different perspectives and perform a geometric reconstruction to generate positional information for the surface 202. As shown in FIG. 2, the matrix of structured light marks 206 includes more marks projected onto the surface 202 than the fiducial marks 104 shown in FIG. 1. Thus, the three-dimensional positional information computed using the structured light marks 206 will generally be more accurate, because the computer vision software has more data points from which to generate a three-dimensional model of the surface 202.
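The geometric reconstruction referred to above can be sketched, under simplifying assumptions (a rectified pair of views and a pinhole camera model), as the standard triangulation relation z = f·B/d, where d is the disparity of a given structured light dot between the two views. The camera parameters and dot positions below are assumed values for illustration only.

```python
import numpy as np

def triangulate_depth(x_left, x_right, focal_length_px, baseline_mm):
    """Depth (mm) of each matched dot from its disparity between two rectified views."""
    disparity = np.asarray(x_left, dtype=float) - np.asarray(x_right, dtype=float)
    disparity = np.where(disparity <= 0, np.nan, disparity)  # mark invalid matches
    return focal_length_px * baseline_mm / disparity

# Example: three matched dots with disparities of 8, 10, and 12.5 pixels.
depths_mm = triangulate_depth(
    x_left=[320.0, 410.0, 505.0],
    x_right=[312.0, 400.0, 492.5],
    focal_length_px=800.0,
    baseline_mm=40.0,
)
```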

Fig. 3A shows an exemplary image of a simulated biological tissue 310, while Fig. 3B shows an exemplary image of a depth map 315 of the same simulated biological tissue 310. As shown in Fig. 3A, the simulated biological tissue 310 (e.g., serosa) is substantially uniform in color, without texture, and without artificial markings. The depth map 315 shown in Fig. 3B represents a depth map generated by light field imaging of the simulated tissue 310. As shown in Fig. 3B, the depth map 315 has little or no depth data in areas of minimal contrast, i.e., areas of the tissue 310 that are far from its edges. Depth data is present at the edges because of the contrast between the simulated tissue 310 and the background.

Fig. 4A shows an exemplary image of a simulated biological tissue 410 with a contrast agent applied to the surface, while fig. 4B shows an exemplary image of a depth map 415 of the same simulated biological tissue 410 with a contrast agent. As shown in fig. 4A, a contrast agent (e.g., a nebulized blue dye) is applied to simulated biological tissue 410 (e.g., serosa). The depth map 415 shown in fig. 4B represents a depth map generated by light field imaging of simulated tissue 410 with contrast agent. As shown in fig. 4B, the depth map 415 has more data than the depth map 315 shown in fig. 3B because the contrast agent is applied to the surface of the tissue. Based on the depth map 415, the computer vision system will recognize that the tissue 410 has a curved surface.

Fig. 5 illustrates a 3D surface imaging system 500 for imaging tissue in accordance with an embodiment of the present disclosure. The imaging system 500 includes an endoscope 520 having cameras 521a, 521b that, when used together, produce a stereoscopic image of the tissue 502 (e.g., the stomach). In various embodiments, endoscope 520 may alternatively or additionally include an infrared camera. The tissue 502 has fiducial markers 504 disposed thereon so that a camera (e.g., an infrared camera) can detect the markers 504 against the background of the tissue 502. In various implementations, the imaging system 500 further includes a projector 522. In various implementations, the projector 522 can be configured to project the structured light 506 (e.g., a pattern of dots) onto the tissue 502. In various embodiments, the projector is configured to project infrared light. The imaging system 500 further includes a light field (e.g., plenoptic) camera 524. In various embodiments, the tissue 502 may be sprayed with a contrast fluid as described above to allow the imaging system 500 to determine the depth of the tissue 502.

Fig. 6 shows a diagram illustrating a 3D surface imaging system. The system combines three visualization modalities to improve 3D imaging resolution. The system includes a camera system that can be moved by a robotic arm. For each visualization modality, the camera system captures images of the target tissue through the light guide and optical mechanism in the endoscope. The image is processed by an image processor to determine a virtual structured 3D surface.

In one visualization modality, the camera system includes a light field (e.g., plenoptic) camera for capturing plenoptic images of the target tissue. The image processor determines 3D surface variations and shapes from the plenoptic image using standard techniques.

In a second visualization modality, the system uses an IR (infrared) source/projector to generate an IR dot pattern that is projected onto the target tissue through an optical mechanism and a light guide in the endoscope. The dot pattern may be predefined or random. The camera system includes an IR sensor that captures an image of the IR dots on the target tissue. The image is transmitted to the image processor, which detects distortions in the dot pattern projected on the target tissue to determine the 3D surface variations and shape.
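One hedged way to implement the dot detection step, assuming a grayscale IR frame and a simple intensity threshold, is sketched below; the threshold and the use of scipy.ndimage are illustrative choices rather than part of the disclosed system.

```python
import numpy as np
from scipy import ndimage

def detect_dot_centroids(ir_frame: np.ndarray, threshold: float) -> np.ndarray:
    """Return (row, col) centroids of bright dots in an IR intensity image."""
    mask = ir_frame > threshold
    labels, num_dots = ndimage.label(mask)
    if num_dots == 0:
        return np.empty((0, 2))
    centroids = ndimage.center_of_mass(ir_frame, labels, range(1, num_dots + 1))
    return np.asarray(centroids)

# The displacement of each detected centroid from its position on a flat
# reference plane can then be converted to depth by triangulation, as in the
# earlier structured light sketch.
```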

In a third visualization modality, the system applies nebulized liquid dye to a selected region of the target tissue using a nebulizer/atomizer in the endoscope to increase the number of fiducials. The atomized dye adheres to the target tissue in a pattern of random dots, wherein the concentration of dots is higher than the concentration of infrared dots. The dye may be made to fluoresce to provide enhanced contrast with the tissue to improve the accuracy of the imaging system.

The image processor determines which visualization modality data is most appropriate in a given situation and merges the data where appropriate to further improve the 3D imaging resolution. A weighting algorithm may be used to combine the data. The system thus accurately and reliably senses depth with high resolution, which is required for accurate robotic surgical planning and execution.

Fig. 7 shows a flow chart 700 of a method for determining three-dimensional coordinates on an object. At 702, the method includes recording an image including an object, a first plurality of markers disposed on the object, a second plurality of markers disposed on the object, and a third plurality of markers disposed on the object. At 704, the method includes calculating a first depth using the image and the first plurality of markers. At 706, the method includes calculating a second depth using the image and a second plurality of markers. At 708, the method includes calculating a third depth using the image and a third plurality of markers. At 710, the method includes assigning a first weight to a first depth, a second weight to a second depth, and a third weight to a third depth. At 712, the method includes calculating a weighted average depth based on the first depth, the second depth, the third depth, the first weight, the second weight, and the third weight.
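The flow of Fig. 7 can be summarized in the following hedged sketch, in which the three per-modality depth computations are placeholders standing in for the fiducial, structured light, and light field processing described above; the weights are example values.

```python
import numpy as np

def depth_from_fiducials(image):          # placeholder for steps 702/704
    return np.full(image.shape[:2], 50.0)

def depth_from_structured_light(image):   # placeholder for step 706
    return np.full(image.shape[:2], 52.0)

def depth_from_light_field(image):        # placeholder for step 708
    return np.full(image.shape[:2], 51.0)

def weighted_average_depth(image, weights=(0.2, 0.3, 0.5)):
    """Steps 710-712: weight each modality's depth and compute the weighted average."""
    d1 = depth_from_fiducials(image)
    d2 = depth_from_structured_light(image)
    d3 = depth_from_light_field(image)
    w1, w2, w3 = weights
    return (w1 * d1 + w2 * d2 + w3 * d3) / (w1 + w2 + w3)

frame = np.zeros((480, 640, 3))            # stand-in for a recorded image
depth = weighted_average_depth(frame)
```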

Referring now to FIG. 15, a schematic diagram of an exemplary computing node that may be used with the computer vision system described herein is shown. The computing node 10 is only one example of a suitable computing node and is not intended to suggest any limitation as to the scope of use or functionality of the embodiments described herein. Regardless, computing node 10 is capable of being implemented and/or performing any of the functions set forth above.

In the computing node 10, there is a computer system/server 12, which is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with computer system/server 12 include, but are not limited to, personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems or devices, and the like.

The computer system/server 12 may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. Computer system/server 12 may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.

As shown in FIG. 15, computer system/server 12 in computing node 10 is shown in the form of a general purpose computing device. Components of computer system/server 12 may include, but are not limited to, one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 to the processors 16.

Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.

Computer system/server 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer system/server 12 and includes both volatile and nonvolatile media, removable and non-removable media.

The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer system/server 12 may also include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 34 may be provided for reading from and writing to non-removable, non-volatile magnetic media (not shown and commonly referred to as a "hard disk drive"). Although not shown, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such instances, each may be connected to bus 18 by one or more data media interfaces. As will be further depicted and described below, memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the disclosure.

By way of example, and not limitation, a program/utility 40 having a set (at least one) of program modules 42 may be stored in memory 28 as well as an operating system, one or more application programs, other program modules, and program data. Each of the operating system, one or more application programs, other program modules, and program data or some combination thereof, may include an implementation of a networked environment. Program modules 42 generally perform the functions and/or methodologies of embodiments described herein.

The computer system/server 12 may also communicate with one or more external devices 14, such as a keyboard, pointing device, display 24, etc.; one or more devices that enable a user to interact with the computer system/server 12; and/or any device (e.g., network card, modem, etc.) that enables computer system/server 12 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 22. In addition, the computer system/server 12 may communicate with one or more networks, such as a Local Area Network (LAN), a general Wide Area Network (WAN), and/or a public network (e.g., the Internet) through a network adapter 20. As shown, network adapter 20 communicates with the other components of computer system/server 12 via bus 18. It should be understood that, although not shown, other hardware and/or software components may be used in conjunction with the computer system/server 12. Examples include, but are not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archival storage systems, etc.

In other embodiments, the computer system/server may be connected to one or more cameras (e.g., digital cameras, light field cameras) or other imaging/sensing devices (e.g., infrared cameras or sensors).

The present disclosure includes systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions thereon for causing a processor to perform aspects of the disclosure.

The computer readable storage medium may be a tangible device that can retain and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer-readable storage medium includes the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device such as a punch card or a raised pattern in a groove having instructions recorded thereon, and any suitable combination of the foregoing. As used herein, a computer-readable storage medium should not be construed as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., optical pulses traveling through a fiber optic cable), or an electrical signal transmitted through an electrical wire.

The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a corresponding computing/processing device, or downloaded to an external computer or external storage device over a network (e.g., the internet, a local area network, a wide area network, and/or a wireless network). The network may include copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium within the respective computing/processing device.

Computer-readable program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In various embodiments, electronic circuitry including, for example, programmable logic circuitry, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA) may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.

Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions (which execute via the processor of the computer or other programmable data processing apparatus) create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable storage medium having the instructions stored therein comprises an article of manufacture including instructions which implement various aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other devices to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In various alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The description of the various embodiments of the present disclosure has been presented for purposes of illustration, but is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
