Image orientation based on face orientation detection

Abstract

Methods, systems, computer-readable media, and devices for image processing and utilization are presented. In some embodiments, an image containing a face of a user of a mobile device may be obtained using the mobile device. An orientation of the face of the user within the image may be determined using the mobile device. The orientation of the face of the user may be determined using a plurality of stages: (a) a rotation stage for controlling a rotation applied to a portion of the image to produce a rotated image portion, and (b) an orientation stage for controlling an orientation applied to orientation-specific feature detection performed on the rotated image portion. The determined orientation of the face of the user may be utilized as a control input to modify a display rotation of the mobile device.

1. A method for image processing and utilization, comprising:

using a mobile device, obtaining an image containing a face of a user of the mobile device;

determining, using the mobile device, an orientation of the face of the user within the image, wherein the orientation of the face of the user within the image is determined using a plurality of stages, the plurality of stages comprising:

(a) a rotation stage for controlling a rotation applied to a portion of the image to produce a rotated image portion; and

(b) an orientation stage for controlling an orientation applied to orientation-specific feature detection performed on the rotated image portion; and

modifying a display rotation of the mobile device using the determined orientation of the face of the user as a control input.

2. The method of claim 1, wherein the rotated image portion is stored in a rotation buffer.

3. The method of claim 1, wherein the orientation-specific feature detection is performed using a computer vision (CV) computation unit.

4. The method of claim 1, wherein the orientation of the face of the user within the image is determined based on a plurality of modes, the plurality of modes comprising:

(a) a detection mode for detecting an initial orientation of the face of the user within an initial image; and

(b) a tracking mode to track the orientation of the face of the user within a subsequent image using the detected initial orientation.

5. The method of claim 4,

wherein in the detection mode, the initial orientation of the face of the user within the image is detected by performing feature detection at a first plurality of assumed angles; and

wherein in the tracking mode, the orientation of the face of the user within the subsequent image is tracked by performing feature detection at a second plurality of assumed angles, the second plurality being less than the first plurality.

6. The method of claim 4,

wherein in the tracking mode, the orientation of the face of the user within the subsequent image is tracked upon detection of a trigger condition associated with a non-image sensor.

7. The method of claim 6, wherein the non-image sensor comprises an accelerometer.

8. An apparatus for image processing and utilization, comprising:

an image sensor configured to obtain an image containing at least a face of a user of a mobile device;

an object orientation detector configured to determine an orientation of the face of the user within the image, wherein the object orientation detector is configured to determine the orientation of the face of the user within the image using a plurality of stages comprising:

(a) a rotation stage configured to control a rotation applied to a portion of the image to produce a rotated image portion; and

(b) an orientation stage configured to control an orientation applied to orientation-specific feature detection performed on the rotated image portion; and

an object orientation receiver configured to receive and organize the determined orientation of the face of the user as a control input to modify a display rotation of the mobile device.

9. The apparatus of claim 8, further comprising:

a rotation buffer configured to store the rotated image portion.

10. The apparatus of claim 8, further comprising:

a computer vision (CV) computation unit configured to perform the orientation-specific feature detection.

11. The apparatus of claim 8, wherein the object orientation detector is configured to determine the orientation of the face of the user in a plurality of modes, the plurality of modes comprising:

(a) a detection mode for detecting an initial orientation of the face of the user within an initial image; and

(b) a tracking mode for tracking the orientation of the face of the user within a subsequent image using the detected initial orientation.

12. The apparatus of claim 11,

wherein in the detection mode, the object orientation detector is configured to detect the initial orientation of the face of the user within the image by performing feature detection at a first plurality of assumed angles; and

wherein in the tracking mode, the object orientation detector is configured to track the orientation of the face of the user within the subsequent image by performing feature detection at a second plurality of assumed angles, the second plurality being less than the first plurality.

13. The apparatus of claim 11,

wherein in the tracking mode, the object orientation detector is configured to track the orientation of the face of the user within the subsequent image upon detection of a trigger condition associated with a non-image sensor.

14. The apparatus of claim 13, wherein the non-image sensor comprises an accelerometer.

15. A non-transitory computer-readable medium having instructions embedded thereon for image processing and utilization, which when executed by one or more processing units, cause the one or more processing units to:

using a mobile device, obtaining an image containing a face of a user of the mobile device;

determining, using the mobile device, an orientation of the face of the user within the image, wherein the orientation of the face of the user within the image is determined using a plurality of stages, the plurality of stages comprising:

(a) a rotation stage for controlling a rotation applied to a portion of the image to produce a rotated image portion; and

(b) an orientation stage for controlling an orientation applied to orientation-specific feature detection performed on the rotated image portion; and

modifying a display rotation of the mobile device using the determined orientation of the face of the user as a control input.

16. The non-transitory computer-readable medium of claim 15, wherein the rotated image portion is stored in a rotation buffer.

17. The non-transitory computer-readable medium of claim 15, wherein the orientation-specific feature detection is performed using a Computer Vision (CV) computation unit.

18. The non-transitory computer-readable medium of claim 15, wherein the orientation of the face of the user is determined based on a plurality of modes, the plurality of modes comprising:

(a) a detection mode for detecting an initial orientation of the face of the user within an initial image; and

(b) a tracking mode for tracking the orientation of the face of the user within a subsequent image using the detected initial orientation.

19. The non-transitory computer-readable medium of claim 18,

wherein in the detection mode, the initial orientation of the face of the user within the image is detected by performing feature detection at a first plurality of assumed angles; and

wherein in the tracking mode, the orientation of the face of the user within the subsequent image is tracked by performing feature detection at a second plurality of assumed angles, the second plurality being less than the first plurality.

20. The non-transitory computer-readable medium of claim 18,

wherein in the tracking mode, the orientation of the face of the user within the subsequent image is tracked upon detection of a trigger condition associated with a non-image sensor.

21. The non-transitory computer-readable medium of claim 20, wherein the non-image sensor comprises an accelerometer.

22. A system for image processing and utilization, comprising:

means for obtaining, using a mobile device, an image containing a face of a user of the mobile device;

means for determining, using the mobile device, an orientation of the face of the user within the image, wherein the orientation of the face of the user within the image is determined using a plurality of stages, the plurality of stages comprising:

(a) a rotation stage for controlling a rotation applied to a portion of the image to produce a rotated image portion; and

(b) an orientation stage for controlling an orientation applied to orientation-specific feature detection performed on the rotated image portion; and

means for modifying a display rotation of the mobile device using the determined orientation of the face of the user as a control input.

23. The system of claim 22, wherein the rotated image portion is stored in a rotation buffer.

24. The system of claim 22, wherein the orientation-specific feature detection is performed using a computer vision (CV) computation unit.

25. The system of claim 22, wherein the orientation of the face of the user within the image is determined based on a plurality of modes, the plurality of modes comprising:

(a) a detection mode for detecting an initial orientation of the face of the user within an initial image; and

(b) a tracking mode for tracking the orientation of the face of the user within a subsequent image using the detected initial orientation.

26. The system of claim 25,

wherein in the detection mode, the initial orientation of the face of the user within the image is detected by performing feature detection at a first plurality of assumed angles; and

wherein in the tracking mode, the orientation of the face of the user within the subsequent image is tracked by performing feature detection at a second plurality of assumed angles, the second plurality being less than the first plurality.

27. The system of claim 25,

wherein in the tracking mode, the orientation of the face of the user within the subsequent image is tracked upon detection of a trigger condition associated with a non-image sensor.

28. The system of claim 27, wherein the non-image sensor comprises an accelerometer.

Background

Aspects of the present disclosure relate to detecting and utilizing object orientations associated with images captured by a mobile device. According to embodiments of the present disclosure, the detected orientation may be utilized in different ways. One example is automatic control, by the mobile device, of the rotation of the display presented to the user. Assessing the proper display rotation can be a challenging task, and current techniques are lacking in many respects in terms of accuracy and computational efficiency. For example, a simple automatic display rotation feature implemented in a mobile device typically utilizes the direction of gravity, as measured using an accelerometer, to control the rotation of the display presented to the user (e.g., "portrait" versus "landscape" rotation). However, it is sometimes not possible to accurately determine the appropriate display rotation simply by using the direction of gravity. For example, if the mobile device is placed flat on a table with the display facing skyward, the gravity vector points in a direction perpendicular to the plane of the display and thus cannot provide a useful indication of the proper rotation of the display presented to the user. Typically, the last display orientation in use before the device was laid flat is not a rotation suited to proper viewing by the user. As another example, if a user holds the mobile device in front of him while lying on his side (e.g., in a bed), gravity-vector-controlled automatic display rotation techniques will typically result in an incorrect display rotation from the user's perspective. In this scenario, even small movements may cause the display to be improperly auto-rotated. This is because a system using only the gravity vector will sense only that the mobile device has rotated from an upright orientation to a sideways orientation and will conclude that the display must be rotated in response (e.g., rotated from "portrait" to "landscape"). However, the system cannot sense that the user has also moved from an upright position to a lying position and that, therefore, no rotation of the display is actually required. These and other shortcomings of existing systems highlight the need for improved techniques for image-based orientation detection and utilization.

Disclosure of Invention

Certain embodiments are described for improving image-processing-based display rotation. A method may include obtaining, using a mobile device, an image including a face of a user of the mobile device. The method may further include determining, using the mobile device, an orientation of the face of the user within the image. The orientation of the face of the user within the image may be determined using a plurality of stages, which may include: (a) a rotation stage for controlling a rotation applied to a portion of the image to produce a rotated image portion; and (b) an orientation stage for controlling an orientation applied to orientation-specific feature detection performed on the rotated image portion. The method may further comprise utilizing the determined orientation of the face of the user as a control input to modify a display rotation of the mobile device.

In one embodiment, the rotated image portion is stored in a rotation buffer. In one embodiment, the orientation-specific feature detection is performed using a Computer Vision (CV) computing unit. In one embodiment, the orientation of the face of the user within the image is determined based on a plurality of modes. The plurality of modes may include: (a) a detection mode for detecting an initial orientation of the face of the user within an initial image; and (b) a tracking mode for tracking the orientation of the face of the user within a subsequent image using the detected initial orientation.

According to one embodiment, in the detection mode, the initial orientation of the face of the user within the image is detected by performing feature detection at a first plurality of assumed angles. In the same embodiment, in tracking mode, the orientation of the face of the user within the subsequent image is tracked by performing feature detection at a second plurality of assumed angles, the second plurality being less than the first plurality.

According to a further embodiment, in a tracking mode, the orientation of the face of the user within the subsequent image is tracked upon detection of a trigger condition associated with a non-image sensor. For example, the non-image sensor may include an accelerometer.

Drawings

Aspects of the present disclosure are illustrated as examples. In the drawings, like reference numerals designate like elements. The illustrations are briefly described below:

FIG. 1A depicts a situation in which a mobile device is placed flat on a table and viewed by a user;

FIG. 1B is a different view of the scenario depicted in FIG. 1A, showing the orientation of the mobile device display and the position of the user;

FIG. 2A depicts a subsequent situation in which the mobile device remains in the same position on the table, but the user has moved to view the mobile device from a different user position, i.e., from a different side of the table;

FIG. 2B is a different view of the scenario depicted in FIG. 2A, showing the orientation of the display of the mobile device in response to the changed user position;

FIG. 3A depicts a potentially frustrating situation in which a user, while lying on his side (e.g., in a bed), views an improperly automatically rotated display on a mobile device held in his hand;

FIG. 3B depicts an improved situation in which the user views a correctly auto-rotated display on a mobile device held in his hand while lying on his side (e.g., in a bed);

FIG. 3C depicts a similar situation as FIG. 3B, where the user is lying down on the other side;

FIG. 4 is a high-level block diagram showing a system for detecting and utilizing an orientation of one or more objects in an image, in accordance with various embodiments of the present disclosure;

FIG. 5 is a more detailed block diagram showing exemplary components within an object orientation detector (such as the one shown in FIG. 4), in accordance with certain embodiments of the present disclosure;

FIG. 6 illustrates an example of orientation-specific feature detection performed by an accelerated Computer Vision (CV) computation unit (such as the one shown in FIG. 5), in accordance with certain embodiments of the present disclosure;

FIG. 7 is a table showing various hypothetical feature angles implemented with various combinations of (1) image rotation and (2) CV unit input orientations, in accordance with various embodiments;

FIG. 8 shows different states of an orientation detection controller, each state representing a different combination of (1) image rotation and (2) CV unit input orientation, in accordance with various embodiments;

FIG. 9 illustrates a hysteresis function established in a triggering operation of an orientation tracking technique, in accordance with an embodiment of the present disclosure;

FIG. 10 is a flow chart showing illustrative steps in a process for performing image processing and utilization in accordance with at least one embodiment of the present disclosure; and

FIG. 11 illustrates an example computer system 1100 that can be used to implement features of the present disclosure.

Detailed Description

Several illustrative embodiments will now be described with respect to the accompanying drawings, which form a part hereof. While the following describes particular embodiments in which one or more aspects of the present disclosure may be implemented, other embodiments may be used and various modifications may be made without departing from the scope of the present disclosure or the spirit of the appended claims.

Illustrative use case

FIG. 1A depicts a situation where a mobile device is placed flat on a table and viewed by a user. Here, the user is viewing the display of the mobile device. The display presents the outdoor scene with a "landscape" display rotation.

FIG. 1B is a different view of the scenario depicted in FIG. 1A, showing the orientation of the mobile device display and the position of the user. As shown in the illustration, a user is able to view an outdoor scene presented with a "landscape" display rotation.

FIG. 2A depicts a subsequent situation in which the mobile device remains in the same position on the table, but the user has moved to view the mobile device from a different user position, i.e., from a different side of the table. According to embodiments of the present disclosure, the mobile device automatically adjusts the rotation of the display in response to the new user position to present the outdoor scene in a "portrait" display rotation. The display rotation not only changes to "portrait," but also matches the user's new position; that is, what the user sees is not an upside-down "portrait" view of the outdoor scene, but a right-side-up "portrait" view. According to an embodiment of the present disclosure, the mobile device captures an image of the user using a front-facing camera (i.e., a camera facing the user). The mobile device performs image processing to determine an orientation of the user's face within the captured image. The determined orientation of the user's face is then used to automatically adjust the rotation of the display so that the appropriate display rotation is presented to the user. In this way, the rotation of the display may "follow" the orientation of the user's face. Thus, in response to the user moving to a new user position (i.e., a different side of the table), the display automatically rotates to match the user's new viewing position, as depicted in FIG. 2A.

FIG. 2B is a different view of the scenario depicted in FIG. 2A, showing the orientation of the display of the mobile device in response to the changed user position. As shown in the illustration, the display automatically rotates and presents a right-side-up "portrait" view of the outdoor scene.

FIG. 3A depicts a potentially frustrating situation in which a user, while lying on his side (e.g., in a bed), views an improperly auto-rotated display on a mobile device held in his hand. Here, the mobile device utilizes a conventional automatic display rotation technique, i.e., one based on the direction of gravity sensed by the accelerometer. In the scenario depicted in FIG. 3A, automatic display rotation based on the direction of gravity results in an incorrect rotation of the display from the user's perspective. In this case, a mobile device using only the gravity vector senses only that the device has moved from an upright orientation to a sideways orientation, and concludes that the display should be rotated in response (e.g., rotated from "portrait" to "landscape"). However, the mobile device cannot sense that the user has also moved from an upright position to a lying position, and that rotation of the display is therefore not actually required. As a result, the user is forced to read the text at a 90-degree offset angle.

FIG. 3B depicts an improved situation in which the user views a correctly auto-rotated display on a mobile device held in his hand while lying on his side (e.g., in a bed). According to various embodiments of the present disclosure, the mobile device uses an image of the user captured by a front-facing camera and determines the orientation of the user's face. The mobile device automatically adjusts the rotation of the display based on the determined orientation of the user's face within the captured image. This results in, for example, the display presenting text with the correct rotation for viewing by the user. The user is able to read the text without any significant offset angle.

FIG. 3C depicts a situation similar to FIG. 3B, where the user is lying on his other side. Here again, the rotation of the display may "follow" the orientation of the user's face. Thus, even though the user has now changed his position to lie on the other side, the text presented on the display is still in the correct rotation for viewing by the user. In yet another scenario (not shown), the user may be lying on his back while holding the mobile device directly above his face, with the device display pointing downward toward the user's face. Here, if the rotation of the display is automatically controlled using a conventional system based on the direction of gravity, the display may be improperly and unintentionally rotated by even slight movements of the hand-held mobile device. According to various embodiments of the present disclosure, the detected orientation of the user's face may be used instead of the direction of gravity to determine the proper rotation of the display and thereby avoid such incorrect and unintentional display rotations.

Overall system

FIG. 4 is a high-level block diagram showing a system 400 for detecting and utilizing an orientation of one or more objects in an image, in accordance with various embodiments of the present disclosure. As shown, the system 400 includes an image sensor 402, an object orientation detector 404, and an object orientation receiver 406. According to some embodiments, the system 400 resides on a mobile device, such as a handheld smartphone. The image sensor 402 captures an image of a scene. In some embodiments, the individual sensor elements (e.g., pixels) of the image sensor 402 may be aligned in a rectangular grid; in other embodiments they are not aligned in a rectangular grid. The image is made available to the object orientation detector 404, which may have various components for efficiently determining the orientation of one or more objects within the image. An example of such an object is the face of a user of the mobile device. In that case, the object orientation detector 404 determines the rotational orientation (e.g., in degrees) of the user's face within the image. The object orientation detector 404 may be trained to detect objects over different orientation ranges. In some embodiments, the orientation range covers a full 360-degree rotation. In other embodiments, the object is detected only within a narrow range of orientations, for example, plus or minus 30 degrees. The orientation of the one or more objects in the image is then provided to the object orientation receiver 406, which represents any of a wide variety of possible components that may use the determined orientation of one or more objects within an image to control one or more operations of the mobile device.

For example, object orientation receiver 406 may provide automatic display rotation of the mobile device using the orientation of the user's face within the image determined by object orientation detector 404. Here, the rotation of the display may "follow" the orientation of the user's face. For example, if the user tilts his head while viewing the mobile device display, while keeping the mobile device stationary, the display of the mobile device may rotate and follow the tilt of the user's face. Similarly, if the mobile device is placed flat on a desktop (e.g., as shown in fig. 1A and 2A), or if the mobile device is held by a user lying on his side (e.g., as shown in fig. 3B and 3C), the determined orientation of the user's face may be utilized to perform automatic display rotation of the mobile device.

In some embodiments, automatic display rotation may be limited to only a few possible display rotation results, such as 0-, 90-, 180-, and 270-degree rotations. Such a limited set of display rotation results may be adopted, for example, for devices having rectangular displays. In other embodiments, automatic display rotation may allow a larger number of possible display rotation results. For example, different display rotations separated by finer increments, e.g., 2 degrees, are possible. Such a wider range of display rotation results may be adopted, for example, for displays having a circular or other non-rectangular shape. Further, the display rotation may be used in other ways. For example, the display may provide different information to the user depending on the rotation. As another example, a device such as a display filter, e.g., a polarization filter, may change depending on the display rotation. In one case, the polarization direction of the display filter may be changed to match the polarization direction of glasses worn by the user.
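
Where display rotation is limited to a small set of results, the detected face orientation has to be mapped to the nearest allowed rotation. The following is a minimal sketch of that mapping, assuming the rectangular-display case with the four rotation results mentioned above; the function names and snapping rule are illustrative assumptions, not components named in this disclosure.

```python
# Minimal sketch: snap a detected face orientation to the nearest allowed
# display rotation. The allowed set is the rectangular-display example from
# the text; the snapping rule itself is an illustrative assumption.

ALLOWED_ROTATIONS = (0, 90, 180, 270)

def circular_distance(a, b):
    """Smallest angular distance, in degrees, between two angles."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def choose_display_rotation(face_orientation_deg, allowed=ALLOWED_ROTATIONS):
    """Return the allowed display rotation closest to the face orientation."""
    face_orientation_deg %= 360
    return min(allowed, key=lambda r: circular_distance(face_orientation_deg, r))

print(choose_display_rotation(100))  # -> 90
print(choose_display_rotation(350))  # -> 0 (wraps around rather than choosing 270)
```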

In other instances, the orientation of a Quick Response (QR) code within an image may be detected and utilized to implement a more efficient automatic QR code reader. Current QR code readers must typically be turned on manually to scan a QR code. An alternative is to put the QR code reader in an always-on mode, which unnecessarily consumes computing resources and may quickly drain the battery of the mobile device. The lack of information about the orientation of the QR code captured in the image exacerbates the computational requirements of the QR reader. In other words, if the orientation of the QR code in the image is known, the QR code reader is given a "head start" and may be able to read the QR code more quickly and efficiently. According to various embodiments of the present disclosure, the object orientation detector 404 may determine the orientation of a QR code within an image captured by the image sensor 402. The determined QR code orientation may then be provided to the object orientation receiver 406. In this case, the object orientation receiver 406 may control a QR code reader (not shown) implemented within the mobile device. The object orientation receiver 406 may automatically turn on the QR code reader and give it a "head start" by providing the determined orientation of the QR code. This allows the QR code to be read automatically and in a more efficient manner.
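
To summarize the data flow of FIG. 4, the sketch below passes each captured frame through a detector and hands the detected orientation to one or more receivers. The function names and the two example receivers are illustrative assumptions; the disclosure itself only names the image sensor 402, object orientation detector 404, and object orientation receiver 406.

```python
# Minimal sketch of the FIG. 4 pipeline: image sensor -> object orientation
# detector -> object orientation receiver(s). All callables are stand-ins.

def process_frame(capture_image, detect_orientation, receivers):
    """Run one pass of the pipeline on a single captured frame."""
    image = capture_image()                  # image sensor 402
    orientation = detect_orientation(image)  # object orientation detector 404
    if orientation is None:
        return                               # no target object in this frame
    for receive in receivers:                # object orientation receiver(s) 406
        receive(orientation)

# Example receivers (assumptions): one follows the face with the display,
# one gives a QR code reader a "head start" at the detected orientation.
def display_rotation_receiver(orientation_deg):
    print(f"rotate display to follow face at {orientation_deg} degrees")

def qr_reader_receiver(orientation_deg):
    print(f"decode QR code assuming it is rotated {orientation_deg} degrees")

process_frame(lambda: "frame", lambda img: 90,
              [display_rotation_receiver, qr_reader_receiver])
```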

Two-stage object orientation determination

FIG. 5 is a more detailed block diagram showing exemplary components within the object orientation detector 404 (such as the one shown in FIG. 4), in accordance with certain embodiments of the present disclosure. As shown, the object orientation detector 404 includes an image buffer 502, a rotation buffer 504, an accelerated Computer Vision (CV) computation unit 506, and an orientation detection controller 508. A portion of an image captured by an image sensor is provided to and stored in the image buffer 502. The image portion may comprise the entire image captured by the image sensor (e.g., image sensor 402 of FIG. 4) or a part of such an image. The image portion may then be rotated by a specified amount of rotation (e.g., in degrees) and stored in the rotation buffer 504. The mechanism for performing the image rotation is not shown here, but its operation will be understood by those skilled in the art. The amount of image rotation may be specified using control bits provided by the orientation detection controller 508. Once the rotated image portion has been obtained, it may be provided to the accelerated CV computation unit 506.

In one embodiment, the unrotated image portions may also be provided directly from the image buffer 502 to the accelerated CV computation unit 506. In such embodiments, the rotation buffer 504 may be bypassed during operations that do not require image rotation. In another embodiment, the unrotated image portions may be provided from the rotation buffer 504 (controlled to provide zero rotation) to the accelerated CV computation unit 506. In any case, the rotated and/or non-rotated image portions are provided to the accelerated CV computation unit 506.

According to various embodiments, the accelerated CV computation unit 506 performs orientation-specific feature detection. That is, the accelerated CV computation unit 506 is capable of detecting target features at different specified orientations, as explained in more detail with respect to FIG. 6. The orientation detection controller 508 may provide control bits to indicate the orientation at which the accelerated CV computation unit 506 is to perform feature detection. Upon detection of the target feature, the accelerated CV computation unit 506 generates an object detection tag, which is provided to the orientation detection controller 508.

Thus, according to various embodiments of the present disclosure, the orientation detection controller 508 controls the operation of the rotation buffer 504 and the accelerated CV computation unit 506 in a "multi-stage," e.g., "two-stage," manner. In one stage, the rotation applied to the image portion stored in the rotation buffer 504 is controlled. In the other stage, the input orientation of the accelerated CV computation unit 506 is controlled. In this embodiment, the control bits for both stages are provided simultaneously. Thus, the term "two-stage" broadly refers to the two types of control provided and is not intended to imply operation at two different times.

FIG. 6 illustrates an example of orientation-specific feature detection performed by an accelerated Computer Vision (CV) computation unit 506 (such as the one shown in FIG. 5), in accordance with certain embodiments of the present disclosure. Here, the accelerated CV computation unit 506 can perform feature detection at any of four possible orientations, such as 0 degrees, 90 degrees, 180 degrees, and 270 degrees. The orientation may be specified as an input to the accelerated CV computation unit 506, for example, via control bits provided by the orientation detection controller 508 as previously discussed.

Performing orientation-specific feature detection with a component such as the accelerated CV computation unit 506 in this manner has significant advantages. To perform feature detection at, for example, a 90-degree rotation, it is not necessary to first rotate the image portion by 90 degrees. Image rotation, such as that provided by the rotation buffer 504, can be computationally intensive and involve a large number of read and write cycles. Instead, the unrotated image portion may be fed into the accelerated CV computation unit 506, which may operate directly on the unrotated image portion to detect the target feature at the 90-degree rotational offset. As shown in FIG. 6, the accelerated CV computation unit 506 is able to do so by internally performing an efficient coordinate mapping that maps pixels to the different orientations selected from a limited number of available orientations, such as 0 degrees, 90 degrees, 180 degrees, and 270 degrees. The accelerated CV computation unit 506 thus performs orientation-specific feature detection directly on the image portion received from the image buffer 502 or the rotation buffer 504.
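
One way to picture this coordinate mapping is as index remapping: rather than writing out a rotated copy of the image, the detector reads the original pixel grid through a per-orientation index transform. The sketch below is an illustrative assumption of how such a remap could look for the four supported orientations; it is not the hardware design of the accelerated CV computation unit.

```python
import numpy as np

# Read pixel (x, y) as a detector would see it at a given CV unit input
# orientation, without physically rotating `image`. The remapping formulas
# are an illustrative assumption consistent with 90-degree-step rotations.

def pixel_at(image, x, y, cv_orientation_deg):
    h, w = image.shape
    if cv_orientation_deg == 0:
        return image[y, x]
    if cv_orientation_deg == 90:                 # view rotated 90 deg counter-clockwise
        return image[x, w - 1 - y]
    if cv_orientation_deg == 180:
        return image[h - 1 - y, w - 1 - x]
    if cv_orientation_deg == 270:                # view rotated 90 deg clockwise
        return image[h - 1 - x, y]
    raise ValueError("unsupported CV unit input orientation")

# Sanity check: the remapped reads match an explicit rotation of the image.
img = np.arange(12).reshape(3, 4)
rot90 = np.rot90(img)  # counter-clockwise rotation of the whole image
assert all(rot90[y, x] == pixel_at(img, x, y, 90)
           for y in range(rot90.shape[0]) for x in range(rot90.shape[1]))
```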

FIG. 7 is a table showing various hypothetical feature angles implemented with various combinations of (1) image rotation and (2) CV unit input orientation, according to various embodiments. In this example, the orientation detection controller 508 can implement a series of hypothetical feature angles, i.e., from 0 to 360 degrees in 15-degree increments, by controlling (1) the image rotation and (2) the CV unit input orientation. The possible image rotations in this example are 0, 15, 30, 45, 60, and 75 degrees. The possible CV unit input orientations in this example are 0, 90, 180, and 270 degrees. Here, "CV unit input orientation" refers to the detector orientation, i.e., the orientation at which the detector operates to detect the target feature; it does not refer to a change in the orientation of the image portion provided as input to the CV unit. The combination of image rotation and CV unit input orientation allows the orientation detection controller 508 to cycle through all angles from 0 to 360 degrees in 15-degree increments. At each such angle, feature detection may be performed using the accelerated CV computation unit 506 to see whether the target feature is detected. Upon receiving an indication that the target feature is detected, e.g., via the object detection tag illustrated in FIG. 5, the orientation detection controller 508 may declare that the current feature angle, e.g., 135 degrees, is the detected orientation of the object in the image.
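
As a check on how such a table arises, the combination can be sketched as a simple sum of the two control values: each image rotation plus each CV unit input orientation yields one hypothetical feature angle. The angle sets below come from the example above; treating the combination as plain addition is an assumption consistent with that example.

```python
from itertools import product

# Enumerate the hypothetical feature angles produced by combining the
# rotation stage (image rotation) with the orientation stage (CV unit input
# orientation). Values follow the FIG. 7 example; the addition rule is an
# illustrative assumption.

IMAGE_ROTATIONS = (0, 15, 30, 45, 60, 75)    # degrees, rotation stage
CV_INPUT_ORIENTATIONS = (0, 90, 180, 270)    # degrees, orientation stage

hypothetical_angles = {
    (rot + orient) % 360: (rot, orient)
    for rot, orient in product(IMAGE_ROTATIONS, CV_INPUT_ORIENTATIONS)
}

# 6 x 4 = 24 combinations cover a full revolution in 15-degree increments.
assert sorted(hypothetical_angles) == list(range(0, 360, 15))
print(hypothetical_angles[135])  # -> (45, 90): 45-degree rotation, 90-degree CV orientation
```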

FIG. 8 shows different states of the orientation detection controller 508, each state representing a different combination of (1) image rotation and (2) CV unit input orientation, in accordance with various embodiments. For example, the orientation detection controller 508 may systematically loop through the states depicted in FIG. 8 until the orientation of the object within the image is detected. As shown, the process may begin in a state corresponding to an image rotation of 0 degrees. While keeping the image rotation at 0 degrees, the CV unit input orientation may loop through different values, such as 0 degrees, 90 degrees, 180 degrees, and 270 degrees. Subsequently, the process may move to a state corresponding to an image rotation of 15 degrees. While maintaining the image rotation at 15 degrees, the CV unit input orientation may again loop through the different values. A similar looping pattern through the available CV unit input orientations may be applied at the remaining image rotations, e.g., 30, 45, 60, and finally 75 degrees, until the target feature is identified, at which point the orientation of the object (e.g., a human face) is determined.

It should be noted that the particular order in which the various states are visited may be changed. FIG. 8 shows just one example, in which the process holds the image rotation at each angle while cycling through the different CV unit input orientations (e.g., 0 degrees, 90 degrees, 180 degrees, and 270 degrees). Alternatively, the process may hold the CV unit input orientation at each value while cycling through the different image rotations (e.g., 0, 15, 30, 45, 60, and 75 degrees).
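
Putting the two stages and the state cycling together, detection mode can be sketched as a nested loop: for each image rotation, try each CV unit input orientation until the orientation-specific detector reports a hit. The rotate_image and detect_feature callables below stand in for the rotation buffer and the accelerated CV computation unit and are assumptions, as is returning the sum of the two angles as the detected orientation.

```python
# Minimal sketch of detection mode (FIG. 8): outer loop over image rotations,
# inner loop over CV unit input orientations, stop at the first detection.

def detect_object_orientation(image, rotate_image, detect_feature,
                              image_rotations=(0, 15, 30, 45, 60, 75),
                              cv_orientations=(0, 90, 180, 270)):
    """Return the detected object orientation in degrees, or None if the
    target feature is not found at any hypothetical feature angle."""
    for rotation in image_rotations:
        rotated = rotate_image(image, rotation)       # rotation stage
        for orientation in cv_orientations:           # orientation stage
            if detect_feature(rotated, orientation):  # orientation-specific detection
                return (rotation + orientation) % 360
    return None
```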

Detection mode versus tracking mode

According to various embodiments, the orientation of at least one object within an image is determined based on a plurality of modes, including (a) a detection mode for detecting an initial orientation of the at least one object within an initial image, and (b) a tracking mode for tracking the orientation of the at least one object within a subsequent image using the detected initial orientation. Here, detection and tracking may involve operations that are related but somewhat different. In particular, as discussed in the section above, orientation detection of an object in an image is typically performed without any prior knowledge of the likely orientation of the object. That is, the object is as likely to be at one angle as at any other. Thus, detection of the orientation of the object within the image may be performed by systematically cycling through all possible angles (or angular increments) until a target feature (e.g., a human face) is detected.

In contrast, tracking the orientation of an object within an image is typically performed with some knowledge of the likely orientation of the object. In particular, since tracking generally follows detection, the "last known orientation" may be used and taken into account when performing orientation tracking. By way of example only, consider a sequence of images captured by the image sensor 402 that includes image 0, image 1, image 2, image 3, and so on. By processing image 0, object orientation detection can be performed. For example, the user's face orientation may be determined from image 0 by operating the image buffer 502, the rotation buffer 504, the accelerated CV computation unit 506, and the orientation detection controller 508 in the manner discussed in the section above.

Once the user's face orientation in image 0 is detected, the face orientation in image 1 can simply be tracked, without performing the full set of operations associated with orientation detection. For example, the tracking technique may utilize, as prior knowledge, the "last known orientation" of the user's face determined from image 0. Various techniques may be employed for tracking. One exemplary tracking technique involves starting at the "last known orientation," using it as a seed angle, and then performing feature detection at hypothetical angles that progressively extend from the seed angle along positive and negative relative offsets. For example, if the user's face orientation is determined to be at a 90-degree angle based on the orientation detection performed on image 0, tracking may first attempt detection at a +15-degree offset from 90 degrees (i.e., 105 degrees), followed by an attempt at a -15-degree offset (i.e., 75 degrees), followed by an attempt at a +30-degree offset (i.e., 120 degrees), followed by an attempt at a -30-degree offset (i.e., 60 degrees), and so on until a human face is detected in image 1. The use of such a seed angle provides a shortcut that avoids systematically cycling through all possible feature angles, as is associated with full orientation detection. Thus, tracking with existing knowledge of the "last known orientation" may be significantly more efficient than performing orientation detection without any knowledge of the likely orientation of the object within the image.
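
The search order just described can be sketched as a generator that alternates increasingly large positive and negative offsets around the seed angle. The 15-degree step comes from the worked example above; the maximum offset and the duplicate handling at 180 degrees are assumptions.

```python
# Minimal sketch of the tracking-mode hypothesis order: seed angle first,
# then alternating +/- offsets in 15-degree steps.

def tracking_angles(seed_deg, step=15, max_offset=180):
    """Yield hypothesis angles: seed, seed+step, seed-step, seed+2*step, ..."""
    yield seed_deg % 360
    offset = step
    while offset <= max_offset:
        yield (seed_deg + offset) % 360
        if offset < max_offset:              # seed+180 and seed-180 coincide
            yield (seed_deg - offset) % 360
        offset += step

print(list(tracking_angles(90))[:5])  # -> [90, 105, 75, 120, 60], as in the example above
```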

FIG. 9 illustrates a hysteresis function employed in triggering an orientation tracking technique, according to an embodiment of the present disclosure. In one embodiment, tracking may be triggered by any deviation from the current angle. For example, in a particular image, if the user's face is not detected at the "last known orientation" from the previous image, the tracking mode may be triggered immediately and an attempt may be made to detect the user's face by searching through the hypothetical angles. However, in certain situations, such as in the presence of noise, the failure to detect the user's face at the "last known orientation" may be spurious. Thus, according to some embodiments, a hysteresis function is employed to prevent a single false trigger from causing unnecessary searches through the hypothetical angles.

According to a further embodiment, the non-image sensor may also be used to assist in tracking the orientation of the object in the sequence of images. One example of such a non-image sensor is an accelerometer on a mobile device. The accelerometer readings may indicate angular rotation of the mobile device. The readings from the accelerometer can be used in a variety of ways. One use of accelerometer readings is to provide an additional or alternative source of information for the triggering of the tracking mode of the system. Another use of the accelerometer readings is to provide a seed angle to the tracking mode after it is triggered.

Referring back to FIG. 9, the figure illustrates accelerometer readings used to (1) trigger tracking according to a hysteresis function, and (2) provide seed angles for performing tracking. Here, if the accelerometer readings indicate that the mobile device has rotated 30 degrees or more, then the tracking mode is triggered. Upon entering the tracking mode, the reading from the accelerometer (shown as +S or -S) may be used as a seed angle to begin the tracking operation.
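
A minimal sketch of this accelerometer-assisted trigger follows. The 30-degree threshold is the example given above; the function name, the treatment of the +S/-S reading as a signed offset from the last known orientation, and the return convention are assumptions.

```python
# Minimal sketch of the FIG. 9 behavior: enter tracking mode only when the
# accelerometer-reported device rotation exceeds the hysteresis threshold,
# and seed the tracking search with that reading.

HYSTERESIS_THRESHOLD_DEG = 30  # example threshold from the text

def maybe_trigger_tracking(last_known_orientation_deg, accel_rotation_deg):
    """Return a seed angle for the tracking search if the trigger fires,
    or None to remain at the last known orientation."""
    if abs(accel_rotation_deg) < HYSTERESIS_THRESHOLD_DEG:
        return None                          # small jitter: do not trigger tracking
    # Treat the accelerometer reading (+S or -S in FIG. 9) as a signed offset.
    return (last_known_orientation_deg + accel_rotation_deg) % 360

print(maybe_trigger_tracking(90, 10))   # -> None (below the hysteresis threshold)
print(maybe_trigger_tracking(90, 45))   # -> 135 (tracking triggered, seeded at 135 degrees)
```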

Further, according to some embodiments of the present disclosure, other information may be used to assist in tracking the orientation of an object in a sequence of images. For example, statistical data may be collected with respect to each image, object, and/or region proximate to the object to estimate a likely orientation of the object. As previously discussed, non-image sensor readings may also be used in a similar manner. Such alternatively or additionally obtained object orientation estimates may be used to trigger, cause, or otherwise assist in tracking of the orientation of the object within the one or more images. Statistics such as those discussed above may be generated from an image detector, such as object orientation detector 404, a different image detector, or a non-image sensor.

Fig. 10 is a flow diagram showing illustrative steps in a process 1000 for performing image processing and utilization in accordance with at least one embodiment of the present disclosure.

FIG. 11 illustrates an example computer system 1100 that can be used to implement features of the present disclosure. Computer system 1100 is shown including hardware elements that are electrically coupled (or otherwise in communication, as appropriate) via a bus 1102. The hardware elements may include one or more processors 1104, including but not limited to one or more general-purpose processors and/or one or more special-purpose processors (e.g., digital signal processing chips, graphics processing unit 1122, and/or the like); one or more input devices 1108, which may include, but are not limited to, one or more cameras, sensors, mice, keyboards, microphones configured to detect ultrasonic or other sounds, and/or the like; and one or more output devices 1110, which may include, but are not limited to, a display unit, such as a device used in implementations of the invention, a printer, and/or the like. Additional cameras 1120 may be used to detect limbs and gestures of the user. In some implementations, the input device 1108 may include one or more sensors, such as infrared sensors, depth sensors, and/or ultrasonic sensors. The graphics processing unit 1122 may be used to perform the method for erasing and replacing the objects described above in real time.

In some of the present embodiments, various input devices 1108 and output devices 1110 may be embedded in interfaces such as display devices, tables, floors, walls, and window screens. Further, an input device 1108 and an output device 1110 coupled to the processor may form a multi-dimensional tracking system.

The computer system 1100 may further include (and/or be in communication with): one or more non-transitory storage devices 1106, which may include, but are not limited to, local and/or network accessible storage, and/or may include, but are not limited to, disk drives, drive arrays, optical storage, solid-state storage devices such as Random Access Memory (RAM) and/or read-only memory (ROM), which may be programmable, flash-updateable, and/or the like. These storage devices may be configured to implement any suitable data storage, including but not limited to various file systems, database structures, and/or the like.

Computer system 1100 may also include a communication subsystem 1112, which may include, but is not limited to, modems, network cards (wireless or wired), infrared communication devices, wireless communication devices, and/or chipsets (e.g., Bluetooth devices, 802.11 devices, WiFi devices, WiMax devices, cellular communications facilities, etc.), and/or the like. Communication subsystem 1112 may permit data to be exchanged with a network, other computer systems, and/or any other devices described herein. In many implementations, the computer system 1100 will further include a non-transitory working memory 1118, which may include a RAM or ROM device, as described above.

The computer system 1100 may also include software elements shown as currently located within the working memory 1118, including an operating system 1114, device drivers, executable libraries, and/or other code such as one or more application programs 1116, which may include computer programs provided by the various implementations and/or provided by other implementations that may be designed to implement methods and/or configure systems, as described herein. By way of example only, one or more programs described with respect to the methods discussed above might be implemented as code and/or instructions executable by a computer (and/or a processor within a computer); such code and/or instructions may then, in an aspect, be used to configure and/or adapt a general purpose computer (or other device) to perform one or more operations in accordance with the described methods.

The set of such instructions and/or code might be stored on a computer-readable storage medium, such as storage device 1106 described above. In some cases, the storage medium may be incorporated within a computer system, such as computer system 1100. In other implementations, the storage medium may be separate from the computer system (e.g., a removable medium such as a compact disc), and/or provided in an installation package, such that the storage medium may be used to program, configure and/or adapt a general purpose computer having the instructions/code stored thereon. These instructions may be in the form of executable code that may be executed by computer system 1100, and/or may be in the form of source code and/or installable code that is in the form of executable code after being compiled and/or installed on computer system 1100 (e.g., using various commonly available compilers, installation programs, compression/decompression utilities, etc.).

Substantial variations may be made in accordance with particular requirements. For example, customized hardware might also be used, and/or particular elements might be implemented in hardware, software (including portable software, such as applets, etc.), or both. Additionally, connections to other computing devices, such as network input/output devices, may be employed. In some implementations, one or more elements of the computer system 1100 may be omitted or may be implemented separately from the illustrated system. For example, the processor 1104 and/or other elements may be implemented separately from the input device 1108. In one implementation, the processor may be configured to receive images from one or more cameras that are implemented separately. In some embodiments, elements other than those illustrated in FIG. 4 may also be included in computer system 1100.

Some embodiments may employ a computer system (e.g., computer system 1100) to perform a method in accordance with the present disclosure. For example, some or all of the procedures of the described methods may be performed by the computer system 1100 in response to the processor 1104 executing one or more sequences of one or more instructions (which may be incorporated into the operating system 1114 and/or other code, such as an application 1116) contained in the working memory 1118. Such instructions may be read into the working memory 1118 from another computer-readable medium, such as one or more of the storage devices 1106. For example only, execution of sequences of instructions contained in working memory 1118 may cause processor 1104 to perform one or more procedures of the methods described herein.

As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any medium that participates in providing data that causes a machine to operate in a specific fashion. In some implementations implemented using computer system 1100, various computer-readable media may be involved in providing instructions/code to the processor 1104 for execution and/or may be used to store and/or carry such instructions/code (e.g., as signals). In many implementations, a computer-readable medium is a physical and/or tangible storage medium. Such a medium may take many forms, including but not limited to non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, optical and/or magnetic disks, such as the storage devices 1106. Volatile media include, but are not limited to, dynamic memory such as the working memory 1118. Transmission media include, but are not limited to, coaxial cables, copper wire, and fiber optics, including the wires that comprise the bus 1102, as well as the various components of the communication subsystem 1112 (and/or the media by which the communication subsystem 1112 provides communication with other devices). Thus, transmission media can also take the form of waves (including, but not limited to, radio, acoustic, and/or light waves, such as those generated during radio-wave and infrared data communications).

Common forms of physical and/or tangible computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, a RAM, a PROM, an EPROM, a flash EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read instructions and/or code.

Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 1104 for execution. For example only, the instructions may initially be carried on a magnetic and/or optical disk of a remote computer. The remote computer might load the instructions into its dynamic memory and send the instructions as signals over a transmission medium to be received and/or executed by the computer system 1100. These signals, which may be in the form of electromagnetic signals, acoustic signals, optical signals, and/or the like, are all examples of carrier waves upon which instructions may be encoded according to various embodiments of the present invention.

The communication subsystem 1112 (and/or components thereof) will typically receive signals, and the bus 1102 may then carry the signals (and/or data, instructions, etc. carried by the signals) to the working memory 1118 from which the processor 1104 retrieves and executes the instructions. The instructions received by working memory 1118 may optionally be stored on a non-transitory storage device either before or after execution by processor 1104.

It should be understood that the specific order or hierarchy of steps in the processes disclosed is an example of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged. In addition, certain steps may be combined or omitted. The accompanying method claims present elements of the various steps in a sample order, and are not intended to be limited to the specific order or hierarchy presented.

The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Moreover, nothing disclosed herein is intended to be dedicated to the public.

Although some examples of the methods and systems herein are described in the context of software executed on various machines, the methods and systems may also be implemented as specifically configured hardware, such as a field-programmable gate array (FPGA) specifically adapted to perform the various methods. For example, the examples may be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or a combination thereof. In one example, a device may include one or more processors. The processor includes, or is coupled to, a computer-readable medium, such as a random access memory (RAM). The processor executes computer-executable program instructions stored in the memory, such as one or more computer programs. Such processors may include microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), and state machines. Such processors may further include programmable electronic devices such as PLCs, programmable interrupt controllers (PICs), programmable logic devices (PLDs), programmable read-only memories (PROMs), electronically programmable read-only memories (EPROMs or EEPROMs), or other similar devices.

Such a processor may include, or be in communication with, a medium, such as a computer-readable medium, that stores instructions which, when executed by the processor, cause the processor to perform the steps described herein as being performed or assisted by the processor. Examples of a computer-readable medium may include, but are not limited to, an electronic, optical, magnetic, or other storage device capable of providing a processor, such as a processor in a network server, with computer-readable instructions. Other examples of media include, but are not limited to, a floppy disk, a CD-ROM, a magnetic disk, a memory chip, a ROM, a RAM, an ASIC, a configured processor, all optical media, all magnetic tape or other magnetic media, or any other medium from which a computer processor can read. The processor and the processing described may reside in, and may be dispersed across, one or more structures. The processor may comprise code for performing one or more of the methods (or portions of methods) described herein.

The foregoing description, for purposes of explanation, used specific nomenclature to provide a thorough understanding of the described embodiments. However, it will be apparent to one skilled in the art that the specific details are not required in order to practice the described embodiments. Thus, the foregoing descriptions of specific embodiments are presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the described embodiments to the precise forms disclosed. It will be apparent to those of ordinary skill in the art that many modifications and variations are possible in light of the above teachings.

The foregoing description of some examples has been presented for purposes of illustration and description only and is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Many alterations and modifications thereof will become apparent to those skilled in the art without departing from the spirit and scope of the disclosure.

Reference herein to an example or implementation means that a particular feature, structure, operation, or other characteristic described in connection with the example may be included in at least one implementation of the present disclosure. The present disclosure is not limited to the specific examples or embodiments so described. The appearances of the phrases "in one example," "in an example," "in one embodiment," or "in an embodiment," or variations thereof, in various places in the specification are not necessarily all referring to the same example or embodiment. Any particular feature, structure, operation, or other characteristic described in this specification in connection with one example or embodiment may be combined with other features, structures, operations, or other characteristics described in connection with any other example or embodiment.

Use of the word "or" herein is intended to cover both inclusive-or and exclusive-or conditions. In other words, "A or B or C" includes any or all of the following alternative combinations, as appropriate for a particular usage: A only; B only; C only; A and B only; A and C only; B and C only; and A, B, and C.
