Improving efficiency of image difference calculation

Document No.: 1957993 Publication date: 2021-12-10

Note: This technology, Improving efficiency of image difference calculation, was created by M·鲍尔, B·巴里, and V·托马二世 on 2019-02-26. Abstract: Methods, systems, apparatuses, and articles of manufacture for identifying features within an image are disclosed herein. An example apparatus includes: a Horizontal Cost (HCOST) engine to apply a first row of pixels of a macroblock to an input of a first HCOST unit, the first HCOST unit including a plurality of difference calculators; and a difference calculator engine to apply a corresponding row of pixels of a search window of a source image to a corresponding one of the plurality of difference calculators of the first HCOST unit, the corresponding one of the plurality of difference calculators to calculate a respective Sum of Absolute Difference (SAD) value between (a) the first row of pixels of the macroblock and (b) the corresponding row of pixels of the search window.

1. An apparatus for improving efficiency of image difference calculations, the apparatus comprising:

a Horizontal Cost (HCOST) engine to apply a first row of pixels of a macroblock to an input of a first HCOST unit, the first HCOST unit including a plurality of difference calculators; and

a difference calculator engine to apply a corresponding row of pixels of a search window of a source image to a corresponding one of the plurality of difference calculators of the first HCOST unit, the corresponding one of the plurality of difference calculators to calculate a respective Sum of Absolute Difference (SAD) value between (a) the first row of pixels of the macroblock and (b) the corresponding row of pixels of the search window.

2. The apparatus of claim 1, further comprising: a difference calculator number determiner for calculating a number of difference calculators based on a number of instances of the macroblock that fit within a width of the search window of the source image.

3. The apparatus of claim 1, wherein the HCOST engine is to cause the first HCOST unit to cascade the respective SAD values to a second HCOST unit without rotating the first row of pixels of the macroblock.

4. The apparatus of claim 3, wherein the HCOST engine is to route cascaded SAD values corresponding to the plurality of difference calculators of the first HCOST unit to inputs of respective difference calculators in the second HCOST unit.

5. The apparatus of claim 4, wherein the difference calculator engine is to constrain the input of the respective difference calculator in the second HCOST unit to evaluate a second row of pixels of the macroblock.

6. The apparatus of claim 1, further comprising: a search area engine to determine whether all of the corresponding pixel rows of the search window have been evaluated.

7. The apparatus of claim 6, further comprising: a ranking engine for comparing the respective SAD values to identify a relatively lowest one of the respective SAD values.

8. The apparatus of claim 7, wherein the relatively lowest one of the respective SAD values indicates a match between the macroblock and an image of the search window.

9. The apparatus of claim 7, wherein the ranking engine is to identify a target location corresponding to the relatively lowest one of the respective SAD values.

10. The apparatus of claim 9, wherein the ranking engine is to identify the target location as a pixel coordinate of the search window.

11. A non-transitory computer-readable medium comprising computer-readable instructions that, when executed, cause at least one processor to:

apply a first row of pixels of a macroblock to an input of a first Horizontal Cost (HCOST) unit, the first HCOST unit comprising a plurality of difference calculators;

apply corresponding pixel rows of a search window of a source image to corresponding ones of the plurality of difference calculators of the first HCOST unit; and

calculate respective Sum of Absolute Difference (SAD) values of the corresponding ones of the plurality of difference calculators between (a) the first pixel row of the macroblock and (b) the corresponding pixel rows of the search window.

12. The computer-readable medium of claim 11, wherein the instructions, when executed, cause the at least one processor to calculate a number of difference calculators based on a number of instances of the macroblock that fit within a width of the search window of the source image.

13. The computer-readable medium of claim 11, wherein the instructions, when executed, cause the at least one processor to cause the first HCOST unit to cascade the respective SAD values to a second HCOST unit while bypassing a rotation of the first row of pixels of the macroblock.

14. The computer-readable medium of claim 13, wherein the instructions, when executed, cause the at least one processor to route the cascaded SAD values corresponding to the plurality of difference calculators of the first HCOST unit to inputs of respective difference calculators in the second HCOST unit.

15. The computer-readable medium of claim 14, wherein the instructions, when executed, cause the at least one processor to constrain an input of the respective difference calculator in the second HCOST unit to evaluate a second row of pixels of the macroblock.

16. The computer-readable medium of claim 11, wherein the instructions, when executed, cause the at least one processor to determine whether all of the corresponding pixel rows of the search window have been evaluated.

17. The computer-readable medium of claim 16, wherein the instructions, when executed, cause the at least one processor to compare the respective SAD values to identify a relatively lowest one of the respective SAD values.

18. The computer-readable medium of claim 17, wherein the instructions, when executed, cause the at least one processor to identify a match between the macroblock and an image of the search window based on the relatively lowest one of the respective SAD values.

19. The computer-readable medium of claim 17, wherein the instructions, when executed, cause the at least one processor to identify a target location corresponding to the relatively lowest one of the respective SAD values.

20. The computer-readable medium of claim 19, wherein the instructions, when executed, cause the at least one processor to identify the target location as a pixel coordinate of the search window.

21. A computer-implemented method for improving efficiency of image difference calculations, the method comprising:

applying, by executing instructions with at least one processor, a first row of pixels of a macroblock to an input of a first Horizontal Cost (HCOST) unit, the first HCOST unit comprising a plurality of difference calculators;

applying, by executing instructions with the at least one processor, corresponding pixel rows of a search window of a source image to corresponding ones of the plurality of difference calculators of the first HCOST unit; and

calculating, by executing instructions with the at least one processor, respective Sum of Absolute Difference (SAD) values of the corresponding ones of the plurality of difference calculators between (a) the first pixel row of the macroblock and (b) the corresponding pixel rows of the search window.

22. The method of claim 21, further comprising: calculating a number of difference calculators based on a number of instances of the macroblock that fit within a width of the search window of the source image.

23. The method of claim 21, further comprising: causing the first HCOST unit to cascade the respective SAD values to a second HCOST unit while bypassing a rotation of the first row of pixels of the macroblock.

24. The method of claim 23, further comprising: routing the cascaded SAD values corresponding to the plurality of difference calculators of the first HCOST unit to inputs of respective difference calculators in the second HCOST unit.

25. The method of claim 24, further comprising: constraining an input of the respective difference calculator in the second HCOST unit to evaluate a second row of pixels of the macroblock.

26. The method of claim 21, further comprising: determining whether all of the corresponding pixel rows of the search window have been evaluated.

27. The method of claim 26, further comprising: comparing the respective SAD values to identify a relatively lowest one of the respective SAD values.

28. The method of claim 27, further comprising: identifying a match between the macroblock and an image of the search window based on the relatively lowest one of the respective SAD values.

29. The method of claim 27, further comprising: identifying a target location corresponding to the relatively lowest one of the respective SAD values.

30. The method of claim 29, further comprising: identifying the target location as a pixel coordinate of the search window.

31. An apparatus for improving efficiency of image difference calculations, the apparatus comprising:

a macroblock pixel application unit to apply a first row of pixels of a macroblock to an input of a first Horizontal Cost (HCOST) unit, the first HCOST unit including a plurality of difference calculators; and

a search window application unit to apply a corresponding row of pixels of a search window of a source image to a corresponding one of the plurality of difference calculators of the first HCOST unit, the corresponding one of the plurality of difference calculators to calculate a respective Sum of Absolute Difference (SAD) value between (a) the first row of pixels of the macroblock and (b) the corresponding row of pixels of the search window.

32. The apparatus of claim 31, further comprising: means for determining a number of difference calculators, the means to calculate the number of difference calculators based on a number of instances of the macroblock that fit within a width of the search window of the source image.

33. The apparatus of claim 31, wherein the macroblock pixel application unit is to cause the first HCOST unit to cascade the respective SAD values to a second HCOST unit without rotating the first row of pixels of the macroblock.

34. The apparatus of claim 33, wherein the macroblock pixel application unit is to route cascaded SAD values corresponding to the plurality of difference calculators of the first HCOST unit to inputs of respective difference calculators in the second HCOST unit.

35. The apparatus of claim 34, wherein the search window application unit is to constrain the input of the respective difference calculator in the second HCOST unit to evaluate a second row of pixels of the macroblock.

36. The apparatus of claim 31, further comprising: means for search area evaluation for determining whether all of the corresponding pixel rows of the search window have been evaluated.

37. The apparatus of claim 36, further comprising: a ranking unit for comparing the respective SAD values to identify a relatively lowest one of the respective SAD values.

38. The apparatus of claim 37, wherein the relatively lowest one of the respective SAD values indicates a match between the macroblock and an image of the search window.

39. The apparatus of claim 37, wherein the ranking unit is to identify a target location corresponding to the relatively lowest one of the respective SAD values.

40. The apparatus of claim 39, wherein the ranking unit is to identify the target location as a pixel coordinate of the search window.

Technical Field

The present disclosure relates generally to image searching, and more particularly to methods, systems, apparatus, and articles of manufacture for identifying features within images.

Background

In recent years, vision systems have introduced large amounts of image data into computing resources for performing one or more analysis operations. The large amount of image data is accompanied by ever-increasing bandwidth expectations for the most advanced consumer electronics products (e.g., high definition television systems). In some examples, the analysis operation on the image data attempts to identify whether movement is occurring in one or more source images.

Drawings

FIG. 1 is a schematic diagram of an example image analysis system configured to identify features within an image.

FIG. 2 is a schematic diagram of an example image analyzer of the image analysis system of FIG. 1 for identifying features within an image.

FIGS. 3A and 3B are example search areas evaluated by the example image analyzer of FIGS. 1 and 2.

FIGS. 3C and 3D are example macroblocks indicating pixels to be searched in the example search areas of FIGS. 3A and 3B.

FIG. 4 is a schematic diagram of an example first comparison unit of the example image analysis system of FIG. 1.

FIG. 5 is a schematic diagram of an example horizontal cost unit of the example first comparison unit of FIG. 4.

FIG. 6 is a schematic diagram of the example search area of FIG. 3A and/or FIG. 3B showing candidate rows.

FIG. 7 is a timing diagram of an example first mode of operation corresponding to the example first comparison unit of FIG. 4.

FIGS. 8A-8C are graphical representations of difference calculations corresponding to an example first mode of operation.

FIG. 9 is an example search area diagram showing a non-overlapping set of candidate rows.

FIG. 10 is a schematic diagram of an example second comparison unit of the example image analysis system of FIG. 1.

FIG. 11 is a schematic diagram of an example horizontal cost unit of the example second comparison unit of FIG. 10.

FIG. 12 is a timing diagram of an example second mode of operation corresponding to the example second comparison unit of FIG. 10.

FIGS. 13A-13C are graphical representations of difference calculations corresponding to an example second mode of operation.

FIGS. 14-16 are flow diagrams representing machine readable instructions that may be executed to implement the example image analyzer of FIGS. 1 and 2.

FIG. 17 is a block diagram of an example processing platform configured to execute the instructions of FIGS. 14-16 to implement the example image analyzer of FIGS. 1 and 2 for identifying features within an image.

The figures are not drawn to scale. Generally, the same reference numbers will be used throughout the drawings and the following written description to refer to the same or like parts.

Detailed Description

Computing and/or otherwise determining the occurrence of motion in an image is computationally intensive. In particular, during a first time instance, the source image is analyzed with the reference image to identify a relative position within the source image in which the reference image is located. However, the source image may be dynamic, e.g., a real-time feed from a high definition (e.g., 1080p, 4k, etc.) source (e.g., camera, memory, database, etc.). Accordingly, the source image is analyzed at a second time instance to identify a second relative position within the source image in which the reference image may be located. The information corresponding to the difference in pixel position of the reference image at the first and second time instances indicates the direction and/or speed of the object motion.

To further illustrate, consider that the source image is a frame of a roadway scene that includes a moving vehicle. During the first instance in time, there is no information indicating whether the vehicle is moving (because there are no previous pixel locations associated with the reference image that can be used for the relative position difference calculation). Additionally, no information indicates the direction in which the vehicle is moving. However, if a reference image is selected and/or otherwise designated as part of the vehicle (e.g., one wheel of the vehicle), then the motion of the vehicle is confirmed at a second time instance when the same reference image (e.g., the wheel) is located at a second (different) relative position of the frame scene. Additionally, orientation change information associated with an object of interest in a source image may be determined based on an analysis of relative pixel value changes in a coordinate system (e.g., a Cartesian coordinate system having +/-x-axis pixel positions, +/-y-axis pixel positions, and/or +/-z-axis pixel positions).
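
The relative position difference described above can be sketched in a few lines of Python. This is only an illustration: the function name `motion_between`, the pixel coordinates, and the choice of pixels per second as the speed unit are assumptions and do not come from the disclosure.

```python
import math


def motion_between(first_pos, second_pos, seconds_elapsed):
    """Return (dx, dy, speed) for a reference image located at two pixel positions.

    dx and dy are signed pixel offsets (direction of motion); speed is the
    straight-line displacement in pixels per second. Hypothetical helper.
    """
    dx = second_pos[0] - first_pos[0]
    dy = second_pos[1] - first_pos[1]
    speed = math.hypot(dx, dy) / seconds_elapsed
    return dx, dy, speed


# Assumed positions: the wheel is found at x=100 in one frame and at x=130
# in the next frame of a 30 fps feed (1/30 of a second later).
dx, dy, speed = motion_between((100, 800), (130, 800), 1 / 30)
```

A positive `dx` with a zero `dy` would correspond to the vehicle traveling from left to right in the scene.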

Changes to the source image may occur at an ever-increasing rate in view of newer video standards. For example, a 4k video feed (e.g., a traffic camera surveillance system, or 4k Ultra High Definition (UHD) television broadcasts) is 3,840 pixels wide (e.g., x-axis) and 2,160 pixels high (e.g., y-axis). Additionally, video cameras have corresponding frame rates, e.g., 30 frames per second (fps), 60 fps, etc. In the case of a 30 fps camera, the corresponding duration available to process each retrieved and/or otherwise acquired image is approximately 33 milliseconds (ms) (1/30 of a second). Thus, after a first instance in time of acquiring an image, the image must be searched and compared to a reference image (e.g., an image of a vehicle wheel) to determine its relative position. In other words, within that per-frame duration, a source image is taken, written to memory, post-processed (e.g., increased/decreased brightness levels), retrieved from memory, and the reference image (or in some cases, two or more reference images) is moved around the source image to compare with pixels/kernels within the image to find the region of the image most similar to the reference image. The degree of similarity is measured by a sum of absolute differences, but other difference techniques can be used. A confirmed match corresponds to the pixel location at which the relatively lowest difference value was detected (e.g., the x and y coordinate locations of the pixel grid). As used herein, "match" is based on the relatively lowest difference value, even if the pixel comparison is not 100% matched.
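
As a minimal sketch of the similarity measure named above, the following Python function computes a sum of absolute differences between two equally sized pixel blocks. The function name `sad` and the pixel values are illustrative assumptions; the disclosure describes hardware difference calculators, not this software routine.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D pixel blocks."""
    if len(block_a) != len(block_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(block_a, block_b)
    ):
        raise ValueError("blocks must have the same dimensions")
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


# Assumed 3 x 3 reference block and a candidate that differs in two pixels.
reference = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
candidate = [[10, 22, 30], [40, 50, 57], [70, 80, 90]]
difference = sad(reference, candidate)  # |20 - 22| + |60 - 57| = 5
identical = sad(reference, reference)   # 0, i.e., a 100% pixel match
```

A lower result indicates a closer match, which is why the relatively lowest value is treated as the match even when it is not zero.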

The apparatus, methods, systems, and articles of manufacture disclosed herein improve computational efficiency when identifying features within one or more images of interest. Examples disclosed herein include the structural arrangement of the difference calculator and the grouping (number) of such difference calculators, which eliminates the specific circuitry associated with the prior art for identifying the matching pixel(s) within the source image of interest. Additionally, as described in further detail below, examples disclosed herein reduce the computational burden required of the hardware to identify such matching pixels.

FIG. 1 is a schematic diagram of an example image analysis system 100 configured to identify features within an image in a manner consistent with the teachings of this patent. In the example shown in FIG. 1, the image analysis system 100 includes an example image analyzer 102 communicatively connected to an image (e.g., image from one or more photographs, videos, etc.) source 104, which image source 104 may include example camera(s) 106, example database(s) 108, and so forth. In some examples, the image source 104 is communicatively connected to the image analyzer 102 via a network 110, while in other examples, the image source 104 includes a direct connection 112. The example image analyzer 102 may include its own one or more processing resources and/or, in some examples, the image analyzer 102 may engage processing services from the processing resources 114. Example processing resources 114 include, but are not limited to, computers, servers, cloud-based server farms (e.g., Amazon Web Services (AWS), Rackspace Cloud, Google Cloud, Microsoft Azure, etc.), Field Programmable Gate Arrays (FPGAs), and/or Application Specific Integrated Circuits (ASICs).

Based on the video input data, the example image analyzer 102 identifies location information (e.g., relative location information) associated with one or more pixels and/or portions of the image of interest. In some examples, the location information indicates motion within the image of interest, e.g., motion associated with a sub-image (e.g., vehicle, person, etc.) within the image of interest.

FIG. 2 is a schematic diagram of the example image analyzer 102 of FIG. 1. In the example shown in FIG. 2, the example image analyzer 102 includes an example hardware evaluator 202. The example hardware evaluator 202 includes an example search area determiner 204, an example difference calculator number determiner 206, and an example Horizontal Cost (HCOST) number determiner 208. The example image analyzer 102 of FIG. 2 also includes an example image retriever 210, an example macroblock engine 212, an example search area engine 214, an example ranking engine 216, an example difference calculator engine 218, an example barrel shift engine 220, an example feedback engine 222, and an example HCOST engine 224. In this example, the HCOST engine 224 implements means for macroblock pixel application. The means for macroblock pixel application may additionally or alternatively be implemented by a macroblock pixel application unit. In this example, the difference calculator engine 218 implements means for search window application. The means for search window application may additionally or alternatively be implemented by a search window application unit. In this example, the difference calculator number determiner 206 implements means for determining a number of difference calculators. The means for determining the number of difference calculators may additionally or alternatively be implemented by a difference calculator number determining unit. In this example, the search area engine 214 implements means for search area evaluation. The means for search area evaluation may additionally or alternatively be implemented by a search area evaluation unit. In this example, the ranking engine 216 implements means for ranking. The means for ranking may additionally or alternatively be implemented by a ranking unit.

Examples disclosed herein consider the example image analyzer 102 of FIGS. 1 and/or 2 in conjunction with a 9 pixel (x-direction) by 9 pixel (y-direction) search area and a 3 pixel by 3 pixel macroblock to be searched for. However, the example 9 x 9 search area and/or the example 3 x 3 macroblock are discussed herein for convenience and not limitation. An example search area of 9 pixels (x-direction) by 9 pixels (y-direction) is referred to herein as a 9 x 9 search area, as shown in FIG. 3A. In the example shown in FIG. 3A, the 9 x 9 search area 302 includes an x-direction 304 and a y-direction 306. In some examples, the search area is referred to herein as a search window. The particular pixel size of the search area 302 is determined based on a selected search span value, where the search span value indicates the number of pixels surrounding the macroblock location to be searched. To illustrate, the example shown in FIG. 3A includes a macroblock 308 having a size of 3 pixels (x-direction) by 3 pixels (y-direction). The example macroblock 308 includes a reference pixel 310 (e.g., the top left pixel of the example macroblock 308) that may be moved three (3) pixels in any direction during a search instance within the 9 x 9 search area. As described in further detail below, the example macroblock 308 (e.g., the example sub-image mentioned above) includes pixels to be searched for in a digital image (e.g., an image of interest) and/or a portion of a digital image (e.g., the example search area 302). For example, the image of interest may be a scene with a vehicle, where the scene may be one of many scenes forming a movie or a real-time image. Continuing with the previous example, the macroblock 308 may be a portion of the vehicle (e.g., a wheel) to be searched for in the scene. In the event that a portion of the vehicle (e.g., a wheel) is identified in the scene at a first time as corresponding to a first pixel location (e.g., a lower left region of the image of interest) and the portion of the vehicle is identified in the scene at a second time as corresponding to a second pixel location (e.g., a lower right region of the image of interest), an indication of object movement may be determined (e.g., the vehicle is traveling from left to right in the image/scene of interest).

To illustrate the resulting search area, FIG. 3B shows the example search area 302 having a plurality of individual reference macroblocks 308 located therein. With the example reference pixel 310 at the upper left position of each macroblock, the plurality of individual reference macroblocks 308 are positioned such that a complete macroblock (e.g., a 3 × 3 pixel macroblock) fits within the search area. In the first row 312 of the example search area 302, the example macroblock 308 may be positioned at seven (7) different locations at which the example reference pixel 310 lies in the first row 312 and the full macroblock 308 fits within the 9 x 9 search area 302. Additionally, seven (7) rows of macroblocks (see row 313) may fit within the example 9 x 9 search area 302, resulting in forty-nine (49) possible macroblock locations in the search area that need to be evaluated (before searching one or more additional search areas for a match for the reference macroblock). In operation, because the macroblock is a reference image (e.g., a wheel of a vehicle) that is searched for in the image of interest (the example 9 x 9 search area 302), each of the 49 locations of the search area results in a pixel comparison to determine the degree of similarity. The pixel comparison occurs between (a) the macroblock pixels and (b) the pixels of the image of interest at the relative location of the search area 302. As described in further detail below, each of the example 49 locations yields a difference value, and difference values relatively closer to zero are considered to be closer matches.
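
The 49-location comparison described above can be sketched as a plain exhaustive search in Python. This is a software illustration under stated assumptions, not the hardware arrangement of the disclosure: the function name `find_best_match` and all pixel values are hypothetical, and the search area is filled with zeros except for one planted copy of the macroblock.

```python
def find_best_match(search_area, macroblock):
    """Compare the macroblock at every location where it fully fits.

    Returns ((lowest_sad, x, y), positions), where (x, y) is the location of
    the reference (top-left) pixel and positions is the number of locations
    that were evaluated.
    """
    sa_h, sa_w = len(search_area), len(search_area[0])
    mb_h, mb_w = len(macroblock), len(macroblock[0])
    best = None
    positions = 0
    for y in range(sa_h - mb_h + 1):
        for x in range(sa_w - mb_w + 1):
            positions += 1
            cost = sum(
                abs(search_area[y + r][x + c] - macroblock[r][c])
                for r in range(mb_h)
                for c in range(mb_w)
            )
            if best is None or cost < best[0]:
                best = (cost, x, y)
    return best, positions


# Assumed data: a 9 x 9 area of zeros with the 3 x 3 macroblock planted so
# that its reference pixel sits at x=4, y=2.
macroblock = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
search_area = [[0] * 9 for _ in range(9)]
for r in range(3):
    for c in range(3):
        search_area[2 + r][4 + c] = macroblock[r][c]

(best_sad, best_x, best_y), positions = find_best_match(search_area, macroblock)
```

In this sketch the search visits all 49 locations and reports a difference value of zero at the planted location, which is the closest possible match.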

Fig. 3C shows further details of the example macroblock 308. In the example shown in fig. 3C, macroblock 308 includes reference pixel 310 at pixel location (0, 0), and macroblock 308 is 3 pixels wide by 3 pixels long (high). Fig. 3D shows further details of example macroblock 308, where first row 314 is labeled and/or otherwise designated as reference "R0", second row 316 is labeled and/or otherwise designated as reference "R1", and third row 318 is labeled and/or otherwise designated as reference "R2".

In a first mode of operation of the example image analyzer 102, the example hardware evaluator 202 determines whether hardware parameters are known. As used herein, a hardware parameter refers to an example type and/or number of: a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a difference calculator (sometimes referred to herein as a compare absolute sum of absolute difference (CSAD) unit), an accumulator, a Horizontal Cost (HCOST) unit, a comparison unit, and the like. In general, the example image analyzer 102 of FIGS. 1 and/or 2 achieves improved efficiency when identifying features within an image, in part, through specific operations employing specific hardware configurations/arrangements. Examples disclosed herein consider image processing for a search area having a certain number of x-direction pixels and a certain number of y-direction pixels, but examples disclosed herein are not limited thereto.

For example, FIG. 4 illustrates an example first comparison unit 400 that achieves the improved efficiency of the examples disclosed herein when identifying features within an image. In some examples, the first comparison unit 400 is hardware located in the example processing resource 114 of FIG. 1, while in some examples, the first comparison unit 400 is hardware located in the example image analyzer 102. In some examples, the first comparison unit 400 is an Application Specific Integrated Circuit (ASIC), while in some examples, the first comparison unit 400 is one or more Field Programmable Gate Array (FPGA) resources. In some examples, the image analyzer 102 invokes particular FPGA resources in a manner consistent with input characteristics (e.g., resolution values of the example image source 104, pixel window search area constraints/settings, frame rate expectations of the application, etc.).

In the example shown in FIG. 4, the first comparison unit 400 includes a pixel input 402, the pixel input 402 including pixel data from the example image (e.g., video) source 104 of FIG. 1 and/or the example image retriever 210 of FIG. 2. A respective one of the pixel inputs 402 associated with the reference pixel is provided as an input to a barrel shifter 410. As discussed in further detail below, the example barrel shifter 410 arranges the reference pixel inputs in a manner that allows for comparison and/or difference calculations to be performed on the source image during different clock cycle iterations of the image analyzer 102 and/or corresponding processing resources 114. The example first comparison unit 400 of FIG. 4 includes Horizontal Cost (HCOST) units 404. Continuing with the example of the 3 × 3 reference macroblock 308 and the example 9 × 9 search area 302 identified above (an example for purposes of explanation and not limitation), the example comparison unit 400 shown in FIG. 4 includes an example first HCOST unit 404A, an example second HCOST unit 404B, and an example third HCOST unit 404C (generally referred to herein collectively as HCOST units 404). In some examples, the number of HCOST units is based on the height of the reference macroblock, but other examples may be utilized with varying degrees of optimization and/or efficiency. Each HCOST unit comprises a number of CSAD units 406 (difference calculators) and a corresponding sorting unit 408. To maintain a consistent manner of structural reference to the example shown in FIG. 4, the example first HCOST unit 404A includes respective first CSAD units 406A and a first sorting unit 408A, the example second HCOST unit 404B includes respective second CSAD units 406B and a second sorting unit 408B, and the example third HCOST unit 404C includes respective third CSAD units 406C and a third sorting unit 408C.

FIG. 5 shows additional details of the example first HCOST unit 404A of FIG. 4. In the example shown in FIG. 5, the first HCOST unit 404A includes a first CSAD unit 502A (labeled CSAD [0]) through a seventh CSAD unit 502G (labeled CSAD [6]) (the CSAD units are generally referred to herein collectively as CSAD units 502 or difference calculators 502). The example first CSAD unit 502A is communicatively coupled to a first adder 504A, which first adder 504A is in turn communicatively coupled to a first accumulator 506A. The example first accumulator 506A includes a first feedback path 508A to the example first adder 504A. Although not shown in the example of FIG. 5, all CSAD units of the HCOST unit include similarly constructed adders, accumulators, and feedback paths. Each accumulator output for each CSAD unit 502 (e.g., the example first output 510A of the example first accumulator 506A corresponding to the example first CSAD unit 502A) is communicatively connected to the example ordering network 408A. The example ordering network 408A includes a minimum Sum of Absolute Difference (SAD) output 512 indicating the lowest SAD value among those calculated by the CSAD units 502. The example ordering network 408A also includes a location output 514 that indicates a pixel location value associated with the minimum SAD output 512.
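
The structure of FIG. 5 can be loosely modeled in software as follows. This is a behavioral sketch under several assumptions, not a description of the actual circuit: the class name `HcostUnitModel` and its methods are hypothetical, each "lane" stands in for one CSAD unit with its adder, accumulator, and feedback path, and the way candidate pixels are assigned to lanes (lane i receives the slice starting at horizontal offset i of one search window row) is assumed for illustration only.

```python
class HcostUnitModel:
    """Behavioral model of one HCOST unit with several CSAD lanes (hypothetical)."""

    def __init__(self, num_csad=7):
        # One accumulator per CSAD unit, cleared before a search begins.
        self.accumulators = [0] * num_csad

    def clock(self, macroblock_row, candidate_rows):
        """One evaluation step for every lane.

        Lane i computes the SAD between the macroblock row and
        candidate_rows[i]; the adder then adds that value to the current
        accumulator content (the feedback path).
        """
        for i, candidate in enumerate(candidate_rows):
            row_sad = sum(abs(m - c) for m, c in zip(macroblock_row, candidate))
            self.accumulators[i] += row_sad

    def rank(self):
        """Return (minimum SAD, lane index), mirroring outputs 512 and 514."""
        index = min(range(len(self.accumulators)), key=self.accumulators.__getitem__)
        return self.accumulators[index], index


# Assumed data: one 9-pixel search window row and one 3-pixel macroblock row.
search_row = [0, 0, 5, 6, 7, 0, 0, 0, 0]
macroblock_row = [5, 6, 7]

unit = HcostUnitModel(num_csad=7)
unit.clock(macroblock_row, [search_row[i:i + 3] for i in range(7)])
min_sad, location = unit.rank()
```

Here lane 2 sees the pixels 5, 6, 7 and accumulates a SAD of zero, so the model reports a minimum SAD of 0 at location 2.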

As described above, the example hardware evaluator 202 of FIG. 2 determines whether a hardware parameter is known. The example first comparison unit 400 of fig. 4 and/or the example first HCOST unit 404A of fig. 5 illustrate example hardware parameters analyzed by the example hardware evaluator 202 to identify, in part, capabilities of the example image analysis system 100. In the event that the example hardware evaluator 202 determines that details associated with the hardware are unknown (e.g., the current hardware parameters are not located and/or otherwise depicted in the storage location of the example image analyzer 102), the example hardware evaluator 202 identifies a search span value (e.g., +/-3 pixels in the x and y directions) and a macroblock size (e.g., 3 pixels in the x direction, 3 pixels in the y direction). The example search area determiner 204 calculates the search area value in a manner consistent with example equations 1A and 1B.

SA(width) = (2 × search span in x direction) + (width of macroblock)

Equation 1A

SA(height) = (2 × search span in y direction) + (height of macroblock)

Equation 1B

In the example shown in equations 1A and 1B, SA designates the search area width or height, e.g., the example search area width and height values of fig. 3A. To illustrate, given an example search span of +/-3 pixels and a macroblock width and height of 3 pixels, example equations 1A and 1B each yield a value of nine (2 × 3 + 3). The search area therefore includes nine (9) rows and nine (9) columns, for a total of 81 pixels, as shown in the example of fig. 3A.
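The arithmetic of equations 1A and 1B can be sketched in a few lines of Python. This is an illustrative sketch only; the function name `search_area_size` and its parameter names are hypothetical and do not appear in the disclosure.

```python
def search_area_size(span_x, span_y, mb_width, mb_height):
    """Return (width, height) of the search area in pixels.

    Hypothetical helper that mirrors Equations 1A and 1B.
    """
    width = 2 * span_x + mb_width      # Equation 1A
    height = 2 * span_y + mb_height    # Equation 1B
    return width, height


# Worked example from the text: +/-3 pixel search span, 3 x 3 macroblock.
width, height = search_area_size(span_x=3, span_y=3, mb_width=3, mb_height=3)
total_pixels = width * height
print(width, height, total_pixels)
```

With the example values, the sketch reproduces the nine-by-nine search area of 81 pixels described above.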

While the example search area determiner 204 determines that the search area has an equal number of rows and columns, as shown in fig. 3A, there are a limited number of complete macroblocks that can fit within the example search area (e.g., the example search area 302 of fig. 3A). The search area determiner 204 calculates the number of possible macroblock rows within the search area in a manner consistent with example equation 2.

MB rows = (2 × search span) + 1

Equation 2

In the example shown in equation 2, since the foregoing example includes a search span of 3 pixels, the number of macroblock rows is calculated as a value of seven (7). The number of possible macroblock rows that can fit within an example search area is shown in fig. 3B, where complete macroblocks can be placed within rows 0 (first row 312) through 6 (seventh row 314). However, with the example macroblock 308B (see dashed macroblock) so positioned, the example macroblock 308B will not fit within the example search area 302: reference pixel 310 of example macroblock 308B is placed on eighth row 316. In other words, any macroblock placed in the example eighth row 316 will result in an overlap of missing pixels 330 beyond the evaluation boundary of the example search area 302.
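As a hedged illustration of equation 2, the following Python sketch counts the rows in which a complete macroblock can be placed. It assumes that the reference pixel is the top-left pixel of the macroblock, which is consistent with the description of fig. 3B but is not stated explicitly; the function names are hypothetical.

```python
def macroblock_rows(search_span):
    """Number of rows in which a complete macroblock fits (Equation 2)."""
    return 2 * search_span + 1


def fits(reference_row, mb_height, sa_height):
    """True if a macroblock whose reference pixel sits on `reference_row`
    stays inside the search area.

    Assumption: the reference pixel is the top-left pixel of the macroblock.
    """
    return 0 <= reference_row and reference_row + mb_height <= sa_height


# 9-row search area, 3-row macroblock: rows 0 through 6 are valid.
valid_rows = [row for row in range(9) if fits(row, mb_height=3, sa_height=9)]
print(macroblock_rows(3), valid_rows)
```

Under that assumption, a reference pixel on the eighth row (index 7) would require a row with index 9, which lies outside the nine-row search area.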

In some examples, the number of CSAD units in a given implementation (e.g., available processing devices) is fixed. Depending on the search area selection, a different number of CSAD units may be employed during the analysis effort. In the event that the search area requires more CSAD units than are available, examples disclosed herein enable multiple iterations to fully search the search area of interest. In other examples, where the search area of interest is relatively small in view of the number of CSAD units available, examples disclosed herein enable operation with some CSAD units remaining idle. Determining the number of macroblock rows is important for determining the corresponding number of difference calculators (CSAD units) (e.g., the example difference calculators 502 of fig. 5) used within each HCOST unit. The example difference calculator number determiner 206 determines the number of difference calculators to use based on the number of macroblock rows that may fit within the example search area. Because each difference calculator includes a corresponding accumulator, the corresponding number of accumulators is also known once the number of macroblock rows that may fit within the example search area has been determined. The example HCOST number determiner 208 determines the number of HCOST units used based on the macroblock height. In the non-limiting example discussed herein, since the macroblock height is 3 pixels, the corresponding number of HCOST units is three, shown as HCOST [0] (404A), HCOST [1] (404B), and HCOST [2] (404C) of fig. 4.
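The resource sizing described above can be sketched as follows. This is a rough illustration, not the disclosed implementation: the function `plan_resources`, its return keys, and in particular the ceiling-division rule for the number of iterations are assumptions introduced here for illustration.

```python
import math


def plan_resources(search_span, mb_height, available_csad):
    """Hypothetical sizing helper for one HCOST unit bank.

    needed_csad follows Equation 2; hcost_units follows the macroblock
    height. The number of passes (ceiling division) is an assumption about
    how "multiple iterations" would cover a search area that needs more
    CSAD units than the hardware provides.
    """
    needed_csad = 2 * search_span + 1
    passes = math.ceil(needed_csad / available_csad)
    idle_in_last_pass = passes * available_csad - needed_csad
    return {
        "needed_csad": needed_csad,
        "hcost_units": mb_height,
        "passes": passes,
        "idle_in_last_pass": idle_in_last_pass,
    }


print(plan_resources(search_span=3, mb_height=3, available_csad=7))
print(plan_resources(search_span=3, mb_height=3, available_csad=4))
print(plan_resources(search_span=3, mb_height=3, available_csad=16))
```

The three calls correspond to hardware that exactly matches the example, hardware with too few CSAD units (more than one pass), and hardware with surplus CSAD units (some remain idle).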

In some examples, the hardware evaluator 202 continues to compare images based on available hardware resources, while in some examples the hardware evaluator 202 allocates the hardware resources as, for example, FPGA circuitry. In some examples, hardware evaluator 202 invokes an FPGA circuit configuration to allocate a particular number of HCOST cells, a particular number of difference calculators, and the like. In some examples, the hardware resource is one or more ASICs having a particular number of HCOST cells and/or difference calculators.

Continuing with the example first mode of operation of the example image analyzer 102 of the example first comparison unit 400 of fig. 4 and the example HCOST unit (e.g., the example first HCOST unit 404A) of fig. 5, the example image retriever 210 retrieves a candidate image of interest to be analyzed. The example macroblock engine 212 retrieves and/or otherwise selects a macroblock of interest, which may include a portion of an image of a reference image or candidate image to be searched. As described above, the candidate image may be a frame scene having an image of a car therein, and the example reference image may be a portion of the car (e.g., wheels of the car). The example macroblock engine 212 marks the row and pixel references of an example macroblock (e.g., the example macroblock 308 of fig. 3C and 3D). The example search area engine 214 selects a search area of interest from the retrieved candidate images and marks its row and pixel references (see, e.g., the example search area 302 of fig. 3A and 3B).

The example search area engine 214 identifies and/or otherwise marks candidate rows of the example search area 302, as shown on the right-hand side of fig. 6. In the example shown in fig. 6, candidate rows C0-C6 represent the particular rows of the example search area 302 in which the example reference pixel 310 may be placed such that the complete macroblock 308 fits within the search area. In the example shown in fig. 6, candidate rows C7 and C8 do not contain the reference pixel 310, as placing it there would result in only a partial representation of the macroblock 308 within the example search area 302 (e.g., resulting in a protruding portion of missing pixels 330, as shown in fig. 3B).

Fig. 7 shows an example first mode of operation timing diagram 700 for the example image analyzer 102, and the corresponding operation of the first comparison unit 400 for the 3-pixel by 3-pixel macroblock example discussed herein. In the example shown in fig. 7, the timing diagram 700 includes a first HCOST unit (HCOST 0) 404A, a second HCOST unit (HCOST 1) 404B, and a third HCOST unit (HCOST 2) 404C. The example timing diagram 700 also includes a row 708 of clock cycles (clock cycles 0-9) to illustrate operation of the respective HCOST units when performing a comparison between a candidate row (e.g., C0) and a respective reference row (e.g., R0). For example, during the first clock cycle (clock cycle 0) 710, the example first HCOST unit 404A compares the first reference row R0 to incrementally shifted positions within the example first candidate row C0. In particular, when considering the 3 × 3 example discussed herein, the first clock cycle invokes seven (7) difference calculators (e.g., CSAD [0] 502A through CSAD [6] 502G) to calculate, in parallel, difference values for the first reference row R0 at different overlapping locations on the example first candidate row C0.

To illustrate the first mode of operation of the example image analyzer 102 and the example operation of the corresponding difference calculators of the example first HCOST unit 404A, fig. 8A shows a graphical representation of the difference calculation between the first candidate row C0 802A and the first reference row R0 804A of the reference macroblock 308. In particular, the first difference calculator CSAD [0] 502A calculates the pixel value difference between the first candidate row C0 802A and the first reference row R0 804A for pixels 0, 1, and 2 (shaded) of the first candidate row C0 802A during a first clock cycle (clock cycle 0). Fig. 8A also shows the first candidate row C0 802B for the difference calculation by the second difference calculator CSAD [1] 502B. The second difference calculator CSAD [1] 502B calculates the pixel value difference between pixels 1, 2, and 3 of the first candidate row C0 802B (different from pixels 0, 1, and 2 of interest to CSAD [0] 502A) and the same three pixels (R0) of the reference macroblock 308 during the first clock cycle. In other words, the respective difference calculators (CSAD [0] 502A through CSAD [6] 502G) are responsible for the incremental shifts across candidate row C0 from left to right, and for the resulting difference value of each incremental shift. For ease of explanation, only the first two difference calculators are shown in fig. 8A, and only for the first clock cycle of the first HCOST unit 404A.
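The per-shift row comparison described above can be sketched in software as follows. The hardware evaluates all shifts in parallel during one clock cycle, whereas this illustrative sketch loops over them; the function name `row_sads` and the sample pixel values are hypothetical.

```python
def row_sads(candidate_row, reference_row):
    """Return one sum of absolute differences per left-to-right shift.

    Element s of the result plays the role of CSAD[s]: it compares the
    reference row against candidate pixels s, s+1, ..., s+len(reference_row)-1.
    """
    shifts = len(candidate_row) - len(reference_row) + 1
    return [
        sum(abs(c - r) for c, r in zip(candidate_row[s:], reference_row))
        for s in range(shifts)
    ]


# Hypothetical pixel values: a 9-pixel candidate row and a 3-pixel reference row.
candidate_c0 = [5, 1, 2, 3, 9, 9, 9, 9, 9]
reference_r0 = [1, 2, 3]
print(row_sads(candidate_c0, reference_r0))  # [6, 0, 8, 15, 21, 21, 21]
```

The result has seven entries, one per difference calculator, and the zero at index 1 marks the shift at which the reference row matches the candidate row exactly.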

Returning to the example shown in fig. 7, while the first clock cycle 710 described above performed pixel comparisons between (a) the first candidate row C0 and (b) the first row R0 804A of the reference macroblock (see dashed circle 720), the second (subsequent) clock cycle continues with pixel comparisons of an additional candidate row (e.g., C1) and an additional row of the reference macroblock (e.g., R1) (see dashed circle 722). To illustrate, fig. 8B shows a graphical representation of the difference value calculations between the second candidate row C1 802C and the second reference row R1 804B of the reference macroblock 308. During this second clock cycle (clock cycle 1), the first difference calculator CSAD [0] 502A of HCOST [0] 404A calculates the pixel value differences between pixels 0, 1, and 2 of the second candidate row C1 802C and pixels 0, 1, and 2 of the second row R1 804B of the reference macroblock 308. However, because the difference calculator 502A requires a different row of the reference macroblock (e.g., R1) than the first row R0 used during the first clock cycle described above, the input to the example first comparison unit 400 must be rotated by the example barrel shifter 410 at the beginning of this second clock cycle (clock cycle 1). Similarly, during this second clock cycle, the example second difference calculator CSAD [1] 502B calculates pixel value differences between pixels 1, 2, and 3 (an incremental shift to the right) of the second candidate row C1 802D and the same three pixels 0, 1, and 2 of the second row R1 804B. As described above, only two of the difference calculators (e.g., CSAD [0] 502A and CSAD [1] 502B) are shown in fig. 8B, but the remaining difference calculators (e.g., CSAD [2] through CSAD [6] 502G) calculate the difference values between C1 and R1 in a similar manner as the comparison position is incrementally shifted from left to right.

At this point in the example operation of the first mode of operation, two clock cycles (clock cycle 0 and clock cycle 1) have elapsed, processing two complete candidate rows (C0 and C1) against two rows (R0 and R1) of the reference macroblock. However, a complete reference macroblock (e.g., example macroblock 308) has not yet been analyzed within the search area (e.g., search area 302), and a third clock cycle (see dashed circle 724 of fig. 7) is required to complete a full scan and comparison of candidate rows C0, C1, and C2. Fig. 8C shows a graphical representation of the calculation of the difference between the third candidate row C2 802E of the search area 302 and the third reference row R2 804C of the reference macroblock 308. During this third clock cycle (clock cycle 2), the first difference calculator CSAD [0] 502A calculates the pixel value differences between pixels 0, 1, and 2 of the third candidate row C2 802E and pixels 0, 1, and 2 of the third row R2 804C of the reference macroblock 308. However, similar to the transition between the first clock cycle (clock cycle 0) and the second clock cycle (clock cycle 1) described above, the present transition from the second clock cycle (clock cycle 1) to the third clock cycle (clock cycle 2) applies a different row of the reference macroblock. Thus, the inputs to the example first comparison unit 400 must be rotated by the example barrel shifter 410 at the beginning of this third clock cycle. Similarly, during this third clock cycle, the example second difference calculator CSAD [1] 502B calculates the pixel value differences between pixels 1, 2, and 3 (an incremental shift to the right) of the third candidate row C2 802F and the third row R2 804C. At this point in the example operation of the first mode of operation, the complete reference macroblock 308 has been analyzed within the search area for a number of candidate rows equal to the number of rows in the macroblock. In the example shown in fig. 7, the non-overlapping set of candidate rows "S" is the entire search area portion of the example search area 302. In the example shown in fig. 7, the first complete non-overlapping set of candidate rows is shown as "S0" 730.

Fig. 9 is an example search area map 900 showing the example search area 302 at three separate times in a side-by-side layout. In particular, the example search area map 900 includes a first search area portion 902 (which includes non-overlapping sets of candidate rows S0, S3, and S6), a second search area portion 904 (which includes non-overlapping sets of candidate rows S1 and S4), and a third search area portion 906 (which includes non-overlapping sets of candidate rows S2 and S5). The example first complete non-overlapping set of candidate rows S0 730 is shown in the first search area portion 902, where S0 reflects a comparison of all candidate rows C0, C1, and C2 against the reference macroblock 308. The example search area engine 214 determines, after each clock cycle, whether the non-overlapping set of candidate rows (S) is complete. If the non-overlapping set of candidate rows (S) is not complete, the example difference calculator engine 218 advances by one row for analysis (e.g., from C0 to C1, from R0 to R1, etc.) and applies the new set of pixels as input to the example HCOST unit(s) after a barrel shift rotation by the example barrel shift engine 220.

Notably, at the end of each clock cycle, the example feedback engine 222 activates an example feedback path (e.g., feedback path 508A of fig. 5) for the corresponding difference calculator (e.g., CSAD [0] 502A). In general, each difference calculator (e.g., the example first CSAD unit 502A) calculates a difference value for a row (e.g., R1) of the reference macroblock and, via the example feedback path 508A, adds it to the value retained from the previous row (e.g., R0), if any. Thus, when all rows of the reference macroblock (e.g., R0, R1, and R2) have been used to perform the comparison on the search area, the example ranking engine 216 applies and/or otherwise provides the accumulated sum of absolute differences (SAD) value for each particular difference calculator to the example ranking network 408A.
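A software analogue of this accumulate-then-rank behavior for one non-overlapping set of candidate rows might look as follows. It is a minimal sketch under stated assumptions: the function names, the synthetic pixel pattern, and the use of a simple `min` in place of a sorting network are all illustrative choices, and the barrel shifter rotation is modeled only implicitly by stepping through the reference rows.

```python
def row_sads(candidate_row, reference_row):
    """One SAD per horizontal shift (models the bank of CSAD units)."""
    shifts = len(candidate_row) - len(reference_row) + 1
    return [
        sum(abs(c - r) for c, r in zip(candidate_row[s:], reference_row))
        for s in range(shifts)
    ]


def evaluate_set(search_area, macroblock, first_row):
    """Accumulate row SADs over one set of candidate rows (first mode).

    Each loop iteration stands for one clock cycle; adding into
    `accumulators` stands for the adder plus feedback path.
    """
    shifts = len(search_area[0]) - len(macroblock[0]) + 1
    accumulators = [0] * shifts
    for offset, reference_row in enumerate(macroblock):
        sads = row_sads(search_area[first_row + offset], reference_row)
        accumulators = [acc + sad for acc, sad in zip(accumulators, sads)]
    # Stand-in for the ranking network: lowest SAD and its position.
    position = min(range(shifts), key=lambda s: accumulators[s])
    return accumulators[position], position


# Synthetic 9 x 9 search area (hypothetical values); the macroblock is cut
# from rows 4-6, columns 2-4, so an exact match exists in the set starting at row 4.
search_area = [[(7 * y + 3 * x * x) % 11 for x in range(9)] for y in range(9)]
macroblock = [row[2:5] for row in search_area[4:7]]
min_sad, position = evaluate_set(search_area, macroblock, first_row=4)
print(min_sad, position)
```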

The example ranking engine 216 determines the relatively lowest value and its corresponding position. In particular, the example ranking engine 216 considers all of the difference calculator values to find the lowest one, which indicates the closest match of the reference macroblock image to the candidate source image. Such minimum values and corresponding location information are forwarded to an example Motion Vector (MV) calculator 450, as shown in the example of fig. 4. The example MV calculator 450 calculates MVs based on the pixel coordinates of the macroblock position within the search area having the lowest cumulative difference, and compares all of the non-overlapping sets of rows (S0-S6) to find the absolute minimum difference value and its corresponding location. In other words, the example ranking engine 216 determines which of the 49 possible locations of the macroblock 308 within the example search area 302 is the closest match. In view of the example disclosed above, in which the macroblock 308 is a wheel of a vehicle located in a particular portion of the search area 302, such a match would identify the location of that wheel. Thus, where the previous position of the wheel was at a different location within the image of interest (e.g., search area 302), the new position of the wheel indicates that the vehicle has moved within the image of interest.
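The search over all 49 candidate positions can be illustrated with a brute-force Python sketch. The disclosure does not give a formula for the motion vector, so the definition used here (displacement of the best match from the center position of the search area) is an assumption made for illustration, as are the function names and the synthetic pixel data.

```python
def best_match(search_area, macroblock):
    """Return (min_sad, x, y) over every position where the macroblock fits."""
    mb_h, mb_w = len(macroblock), len(macroblock[0])
    best = None
    for y in range(len(search_area) - mb_h + 1):
        for x in range(len(search_area[0]) - mb_w + 1):
            sad = sum(
                abs(search_area[y + i][x + j] - macroblock[i][j])
                for i in range(mb_h)
                for j in range(mb_w)
            )
            if best is None or sad < best[0]:
                best = (sad, x, y)
    return best


def motion_vector(best_x, best_y, span_x, span_y):
    """Assumed definition: offset of the best match from the center position."""
    return best_x - span_x, best_y - span_y


# Synthetic 9 x 9 search area (hypothetical values) with the macroblock
# taken from rows 4-6, columns 2-4.
search_area = [[(7 * y + 3 * x * x) % 11 for x in range(9)] for y in range(9)]
macroblock = [row[2:5] for row in search_area[4:7]]
min_sad, x, y = best_match(search_area, macroblock)
print(min_sad, (x, y), motion_vector(x, y, span_x=3, span_y=3))
```

In this synthetic example the exact match sits one pixel left of and one pixel below the center position, so the assumed motion vector is (-1, 1).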

Returning briefly to the example shown in fig. 7, which indicates a first mode of operation of the example image analyzer 102, the example disclosed above considers only the first three clock cycles (clock cycles 0, 1, and 2) associated with HCOST [0] 404A. At the end of these three clock cycles, a first non-overlapping set of candidate rows (S0, which includes candidate rows C0, C1, and C2) has been analyzed by the example comparison unit 400 of fig. 4. However, as shown in the example of fig. 7, additional HCOST units are invoked in a similar manner during successive clock cycles to evaluate the other non-overlapping sets of candidate rows (i.e., S1 through S6). Notably, as described above, each effort to evaluate and identify the relatively lowest SAD value for the respective non-overlapping sets of candidate rows requires additional rotation effort and accumulator feedback effort (e.g., via the example feedback path 508A of fig. 5). In other words, during the first three clock cycles of the example first HCOST unit (HCOST [0] 404A), the example barrel shifter 410 rotates the pixels that are input to the HCOST unit from R0 to R1 (see clock cycle 0 to clock cycle 1), rotates the input pixels from R1 to R2 (see clock cycle 1 to clock cycle 2), and then must rotate the input pixels from R2 back to R0 (see clock cycle 2 to clock cycle 3). It is also worth noting, with reference to the example first mode of operation, that each of the example HCOST units in the example shown in fig. 4 includes a corresponding ordering unit (e.g., an example first ordering unit 408A corresponding to the example first HCOST unit 404A, an example second ordering unit 408B corresponding to the example second HCOST unit 404B, and an example third ordering unit 408C corresponding to the example third HCOST unit 404C).

To reduce the number of processing cycles dedicated to the pixel rotation and/or feedback paths, reduce the hardware footprint of the example comparison unit, and reduce dynamic power consumption by reducing the amount of data movement (e.g., removing the barrel shifter task, as described in further detail below), a second mode of operation of the example image analyzer 102 is disclosed below to identify features within an image. Fig. 10 is a schematic diagram illustrating a second comparison unit 1000. In the example shown in fig. 10, the comparison unit 1000 is hardware located in the example processing resource 114 of fig. 1, while in some examples, the second comparison unit 1000 is hardware located in the example image analyzer 102. In some examples, the second comparison unit 1000 is an ASIC, and in some examples, the second comparison unit 1000 is one or more FPGA resources. In some examples, the image analyzer 102 invokes particular FPGA resources in a manner consistent with input characteristics (e.g., resolution values of the example video source 104, pixel window search area constraints/settings, frame rate expectations of the application, etc.).

In the example shown in fig. 10, the second comparison unit 1000 includes a pixel input 1002, the pixel input 1002 including pixel data from the example video source 104 of fig. 1 and/or the example image retriever 210 of fig. 2. The example second comparison unit 1000 of fig. 10 includes an HCOST unit 1004. Continuing with the example of the 3 × 3 reference macroblock 308 and the example 9 × 9 search area 302 identified above (example for purposes of explanation and not limitation), the example second comparison unit 1000 shown in fig. 10 includes an example first HCOST unit 1004A, an example second HCOST unit 1004B, and an example third HCOST unit 1004C (generally collectively referred to herein as HCOST units 1004). Similar to the example first comparison unit 400 of fig. 4, the number of HCOST cells in the example shown in fig. 10 is based on the height of the reference macroblock. Each HCOST unit 1004 includes a number of CSAD units 1006. In particular, the example first HCOST units 1004A include respective first CSAD units 1006A, the example second HCOST units 1004B include respective second CSAD units 1006B, and the example third HCOST units 1004C include respective third CSAD units 1006C.

Fig. 11 shows additional details of the example first HCOST unit 1004A of fig. 10. In the example shown in fig. 11, the first HCOST unit 1004A includes a first CSAD unit 1102A (labeled CSAD [0]) through a seventh CSAD unit 1102G (labeled CSAD [6]) (the CSAD units are generally referred to collectively as CSAD units 1102 or difference calculators 1102). The example first CSAD unit 1102A is communicatively coupled to a first adder 1104A, which in turn is communicatively coupled to a first accumulator 1106A. Although not shown in the example of fig. 11, all CSAD units of the HCOST unit include similarly constructed adders and accumulators. The output value of the example first accumulator 1106A is routed to the example first local SAD line 1110A.

Unlike the example first comparison unit 400 of fig. 4, the outputs 1003 of the corresponding HCOST units 1004 of the example shown in fig. 10 are cascaded such that the output of one HCOST unit is provided as an input to a subsequent HCOST unit. Accordingly, values from the respective local SAD line (e.g., the example first local SAD line 1110A of fig. 11) are provided as input to the respective previous SAD line 1108 (e.g., the example first previous SAD line 1108A of fig. 11) of the subsequent HCOST unit. At least one benefit of the foregoing example hardware configuration of cascaded HCOST units is that no feedback path (e.g., the example feedback path 508A of fig. 5) is required and no barrel shifter rotation effort is required, thereby saving computation cycles and/or energy when evaluating pixel information. In some examples in which the underlying hardware includes one or more barrel shifters for rotation tasks, examples disclosed herein enable bypassing the rotation task(s) and/or otherwise bypassing hardware associated with such rotation tasks (e.g., bypassing the barrel shifter(s)). In other words, examples disclosed herein avoid rotating the first pixel row of a macroblock by cascading SAD values from one HCOST unit to another HCOST unit. Additionally, unlike the example first comparison unit 400 of fig. 4, the example second comparison unit 1000 of fig. 10 does not include and/or otherwise eliminates the need for a barrel shifter (e.g., the example barrel shifter 410 of fig. 4). Thus, the combination of the example second comparison unit 1000 of fig. 10 and the example HCOST unit architecture 1004 of fig. 11 reduces computation cycles, processing power, and hardware footprint size.

To illustrate the operation of the example second comparison unit 1000 of fig. 10 and the associated example HCOST units 1004 of figs. 10 and 11 (e.g., the example first HCOST unit 1004A, the example second HCOST unit 1004B, and the example third HCOST unit 1004C), an example second operation mode timing diagram 1200 is shown in fig. 12. In the example shown in fig. 12, the timing diagram 1200 includes an example first HCOST unit (HCOST 0) 1004A, an example second HCOST unit (HCOST 1) 1004B, and an example third HCOST unit (HCOST 2) 1004C. Similar to the example shown in fig. 7, the example second operation mode timing diagram 1200 of fig. 12 includes a row 1208 of clock cycles (clock cycles 0 through 9) to illustrate operation of the corresponding HCOST units when performing a comparison between the candidate row and the reference row.

Similar to the example timing diagram 700 of fig. 7, in the example shown in fig. 12, three clock cycles are consumed comparing pixels from the first non-overlapping set of candidate rows S0 1230. In particular, the example difference calculator engine 218 calculates the following: (a) pixel difference values between candidate row C0 and reference row R0 during the first clock cycle (clock cycle 0) (see dashed circle 1220); (b) pixel difference values between candidate row C1 and reference row R1 during the second clock cycle (clock cycle 1) (see dashed circle 1222); and (c) pixel difference values between candidate row C2 and reference row R2 during the third clock cycle (clock cycle 2) (see dashed circle 1224). Unlike the example shown in fig. 7, during these three clock cycles, all three HCOST units participate in comparing the pixels of the first non-overlapping set S0 1230 of candidate rows. Thus, the SAD value from each participating HCOST unit is cascaded to the next HCOST unit, thereby eliminating any need to rotate pixels with a barrel shifter. In other words, the example difference calculator engine 218 constrains the input of the respective difference calculators in any particular HCOST unit to evaluate only one pixel row of the macroblock.

It is worth noting that each of the example HCOST units in the example shown in fig. 12 (as well as the associated second comparison unit 1000 of fig. 10 and the associated CSAD architecture of fig. 11) takes only one macroblock row as input for the comparison effort across all clock cycles. Because the macroblock row applied to each HCOST unit is constant, data movement is reduced, which in turn reduces power consumption. That is, the example first HCOST unit 1004A processes only the reference row R0, the example second HCOST unit 1004B processes only the reference row R1, and the example third HCOST unit 1004C processes only the reference row R2, thereby improving computational efficiency by avoiding any need for pixel rotation (e.g., via a barrel shifter).
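The cascaded second mode of operation can be simulated in Python to show that it yields the same SAD values as a direct evaluation of every macroblock position. This is a behavioral sketch only, assuming one candidate row is applied to all HCOST units per clock cycle and that each unit adds its local row SADs to the values its predecessor produced on the previous cycle; the function names and pixel data are hypothetical.

```python
def row_sads(candidate_row, reference_row):
    """One SAD per horizontal shift (models the bank of CSAD units)."""
    shifts = len(candidate_row) - len(reference_row) + 1
    return [
        sum(abs(c - r) for c, r in zip(candidate_row[s:], reference_row))
        for s in range(shifts)
    ]


def cascaded_search(search_area, macroblock):
    """Simulate cascaded HCOST units; unit k always holds reference row k."""
    units = len(macroblock)
    shifts = len(search_area[0]) - len(macroblock[0]) + 1
    previous = [[0] * shifts for _ in range(units)]  # outputs of the last cycle
    completed = []
    for clock, candidate_row in enumerate(search_area):
        current = []
        for k in range(units):
            carried = previous[k - 1] if k > 0 else [0] * shifts
            local = row_sads(candidate_row, macroblock[k])
            current.append([a + b for a, b in zip(carried, local)])
        previous = current
        if clock >= units - 1:
            # The last unit now holds a complete set of accumulated SADs.
            completed.append(current[-1])
    return completed


def brute_force(search_area, macroblock):
    """Direct SAD for every position, used only to check the simulation."""
    mb_h, mb_w = len(macroblock), len(macroblock[0])
    return [
        [
            sum(
                abs(search_area[y + i][x + j] - macroblock[i][j])
                for i in range(mb_h)
                for j in range(mb_w)
            )
            for x in range(len(search_area[0]) - mb_w + 1)
        ]
        for y in range(len(search_area) - mb_h + 1)
    ]


search_area = [[(7 * y + 3 * x * x) % 11 for x in range(9)] for y in range(9)]
macroblock = [row[2:5] for row in search_area[4:7]]
results = cascaded_search(search_area, macroblock)
print(results == brute_force(search_area, macroblock))
```

No reference row is ever moved between units in this simulation; only accumulated SAD values travel from one unit to the next, which mirrors the reduction in data movement described above.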

Additionally, the cascaded architecture of the example second comparison unit 1000 of fig. 10 avoids the need to include a redundant sort unit within each HCOST unit (as is the case in the example shown in fig. 4, see the respective sort units 408A, 408B, and 408C), because all non-overlapping sets S0 through S6 of candidate rows are ultimately determined by the last HCOST unit (see HCOST [2] 1004C in fig. 12). In contrast to the example of fig. 4, the example second comparison unit 1000 of fig. 10 requires only a single ordering unit 1008, because the SAD determination from each respective HCOST unit is cascaded to the subsequent HCOST unit. The example single ordering unit 1008 includes corresponding SAD outputs 1012 and position outputs 1014, which feed an example MV calculator 1050. Thus, the particular architecture of the example second comparison unit 1000 of fig. 10 and the associated timing diagram 1200 of fig. 12 further facilitate computational resource reduction, power savings, and hardware size reduction.

A further illustration of the example second mode of operation of the example image analyzer 102 and the corresponding difference calculators of the example HCOST unit is shown in fig. 13A. Fig. 13A shows a graphical representation of the difference calculation between the candidate row C0 1302 and the first reference row R0 1304 of the reference macroblock 308. In particular, the example first CSAD [0] 1002A calculates, during a first clock cycle (clock cycle 0), a pixel value difference between the candidate row C0 1302A and the first reference row R0 1304A for pixels 0, 1, and 2 (shaded) of the first candidate row C0 1302A. Fig. 13A also shows the first candidate row C0 1302B for the difference calculation by the second difference calculator CSAD [1] 1002B. The second difference calculator CSAD [1] 1002B calculates, during the first clock cycle, differences in pixel values between pixels 1, 2, and 3 of the first candidate row C0 1302B (different from pixels 0, 1, and 2 of interest to CSAD [0] 1002A) and the same three pixels (R0) of the reference macroblock 308.

As described above, the respective difference calculators (CSAD [0] 1002A through CSAD [6] 1002G) are responsible for the incremental shifts across candidate row C0 from left to right, and for the resulting difference value of each incremental shift. For ease of explanation, only the first two difference calculators are shown in fig. 13A, and only for the first clock cycle of the first HCOST unit 1004A. However, as can be seen in the example shown in fig. 11, additional CSAD units may operate in parallel, each of which focuses on a separate shifted pixel grouping of the candidate row of interest.

Returning to the example shown in fig. 12, while the first clock cycle (clock cycle 0) described above performed a comparison of pixels between (a) the first candidate row C0 and (b) the first row R0 1304A of the reference macroblock (see dashed circle 1220), the second (subsequent) clock cycle (clock cycle 1) continues with pixel comparisons of an additional candidate row (e.g., C1) and an additional row of the reference macroblock (e.g., R1) (see dashed circle 1222). Additionally, during the second clock cycle, the first row R0 is compared to the candidate row C1 at the same time that the second row R1 is compared to the candidate row C1 (and so on, as shown in fig. 12). However, unlike the example shown in fig. 7, where the first mode of operation evaluates S0 using the same HCOST unit during all three clock cycles (thereby requiring pixel rotation via the example barrel shifter 410), in the example shown in fig. 12, the additional row of the reference macroblock (e.g., R1) and the candidate row C1 are handled by the next cascaded HCOST unit (see dashed circle 1222 and the associated HCOST 1 1004B).

In the example shown in fig. 13B, a graphical progression is shown from the analysis of C0 and R0 (see dashed circle 1220) to the analysis of C1 and R1 (see dashed circle 1222). In the example shown in fig. 13B, the first difference calculator of the second HCOST unit 1004B is used to evaluate the pixels (e.g., cells 0, 1, and 2) between the second candidate row C1 1302C and the second row R1 1304B of the reference macroblock. Similarly, the second difference calculator of the second HCOST unit 1004B is used to evaluate pixels between the right-shifted portion of the second candidate row C1 1302D (e.g., cells 1, 2, and 3) and the second row R1 1304B of the reference macroblock.

In the example shown in fig. 13C, the graphical progression is shown from the analysis of C1 and R1 (see dashed circle 1222) to the analysis of C2 and R2 (see dashed circle 1224). In the example shown in fig. 13C, the first difference calculator of the third HCOST unit 1004C is used to evaluate pixels (e.g., cells 0, 1, and 2) between the third candidate row C2 1302E and the third row R2 1304C of the reference macroblock. Similarly, the second difference calculator of the third HCOST unit 1004C is used to evaluate pixels between the right-shifted portion of the third candidate row C2 1302F (e.g., cells 1, 2, and 3) and the third row R2 1304C of the reference macroblock.

Returning briefly to the example shown in fig. 11, after the example search region engine 214 determines that all non-overlapping rows of the search region have been evaluated and the corresponding SAD values calculated, the example ranking engine 216 invokes the ranking unit 1008 of fig. 10 to identify the corresponding overall minimum SAD value and sends that value via the example SAD output 1012. Additionally, the example ordering unit 1008 identifies corresponding pixel location information associated with a minimum SAD value and transmits such location information via the example location output 1014. Accordingly, pixel location information regarding where the reference image is located is determined by examples disclosed herein. Furthermore, the determined position information can be compared with additional position information in a subsequent evaluation of the source image (or video) during a later time. This difference in position information from one time to the next confirms the movement of the object within the source image. Additionally, analysis of the location information from one time to the next allows the direction of such movement to be identified.

Although an example manner of implementing the image analyzer 102 of figs. 1 and 2 is shown in figs. 1-13, one or more of the elements, processes and/or devices shown in figs. 1-13 may be combined, divided, rearranged, omitted, eliminated and/or implemented in any other way. Further, the example hardware evaluator 202, the example search area determiner 204, the example difference calculator number determiner 206, the example HCOST number determiner 208, the example image retriever 210, the example macroblock engine 212, the example search area engine 214, the example ranking engine 216, the example difference calculator engine 218, the example barrel shift engine 220, the example feedback engine 222, the example HCOST engine 224, the example comparison unit(s) 400, the example difference calculators 406, 502, the example adder 504, the example accumulator 506, the example ranking networks 408, 1008, the example MV calculators 450, 1050, and/or more generally, the example image analyzer 102 of figs. 1-13 may be implemented by hardware, software, firmware, and/or any combination of hardware, software, and/or firmware. Thus, for example, any of the example hardware evaluator 202, the example search area determiner 204, the example difference calculator number determiner 206, the example HCOST number determiner 208, the example image retriever 210, the example macroblock engine 212, the example search area engine 214, the example ranking engine 216, the example difference calculator engine 218, the example barrel shift engine 220, the example feedback engine 222, the example HCOST engine 224, the example comparison unit(s) 400, the example difference calculators 406, 502, the example adder 504, the example accumulator 506, the example ranking networks 408, 1008, the example MV calculators 450, 1050, and/or more generally, the example image analyzer 102 of figs. 1-13 may be implemented by one or more analog or digital circuits, logic circuits, programmable processors, programmable controllers, graphics processing units (GPUs), digital signal processors (DSPs), application specific integrated circuits (ASICs), programmable logic devices (PLDs), and/or field programmable logic devices (FPLDs). When reading any of the apparatus or system claims of this patent to cover a purely software and/or firmware implementation, at least one of the example hardware evaluator 202, the example search area determiner 204, the example difference calculator number determiner 206, the example HCOST number determiner 208, the example image retriever 210, the example macroblock engine 212, the example search area engine 214, the example ranking engine 216, the example difference calculator engine 218, the example barrel shift engine 220, the example feedback engine 222, the example HCOST engine 224, the example comparison unit(s) 400, the example difference calculators 406, 502, the example adder 504, the example accumulator 506, the example ranking networks 408, 1008, the example MV calculators 450, 1050, and/or more generally, the example image analyzer 102 of figs. 1-13 is expressly defined herein to include a non-transitory computer-readable storage device or storage disk, such as a memory, a Digital Versatile Disk (DVD), a Compact Disk (CD), a Blu-ray disk, etc., that includes the software and/or firmware. Still further, the example image analyzer 102 of figs. 1 and 2 may include one or more elements, processes, and/or devices in addition to, or instead of, those illustrated in figs. 1-13, and/or may include more than one of any or all of the illustrated elements, processes, and devices.
As used herein, the phrase "in communication," including variations thereof, encompasses direct communication and/or indirect communication through one or more intermediate components, and does not require direct physical (e.g., wired) communication and/or continuous communication, but rather additionally includes selective communication at periodic intervals, predetermined intervals, non-periodic intervals, and/or one-time events.

Fig. 14-16 illustrate flow diagrams representing example hardware logic, machine readable instructions, hardware implemented state machines, and/or any combination thereof for implementing the image analyzer 102 of fig. 1-13. The machine readable instructions may be one or more executable programs or portions of executable programs for execution by a computer processor such as the processor 1712 shown in the example processor platform 1700 discussed below in connection with fig. 17. The program may be embodied in software stored on a non-transitory computer readable storage medium (e.g., a CD-ROM, a floppy disk, a hard drive, a DVD, a blu-ray disk, or a memory associated with the processor 1712), but the entire program and/or parts thereof could alternatively be executed by a device other than the processor 1712 and/or embodied in firmware or dedicated hardware. Further, although the example program is described with reference to the flowcharts shown in FIGS. 14-16, many other methods of implementing the example image analyzer 102 may alternatively be used. For example, the order of execution of the blocks may be changed, and/or some of the blocks described may be changed, eliminated, or combined. Additionally or alternatively, any or all of the blocks may be implemented by one or more hardware circuits (e.g., discrete and/or integrated analog and/or digital circuits, FPGAs, ASICs, comparators, operational amplifiers (op-amps), logic circuitry, etc.) configured to perform corresponding operations without the execution of software or firmware.

The machine-readable instructions described herein may be stored in one or more of a compressed format, an encrypted format, a fragmented format, a packetized format, and the like. The machine-readable instructions described herein may be stored as data (e.g., portions of instructions, code representations of code, etc.) that may be used to create, fabricate, and/or produce machine-executable instructions. For example, the machine-readable instructions may be segmented and stored on one or more storage devices and/or computing devices (e.g., servers). The machine-readable instructions may require one or more of installation, modification, adaptation, updating, combining, supplementing, configuring, decrypting, decompressing, unpacking, distributing, reassigning, etc., in order to be directly readable and/or executable by the computing device and/or other machine. For example, machine-readable instructions may be stored in multiple portions that are separately compressed, encrypted, and stored on separate computing devices, where the portions, when decrypted, decompressed, and combined, form a set of executable instructions that implement programs such as those described herein. In another example, the machine-readable instructions may be stored in a state in which they can be read by a computer, but require the addition of a library (e.g., a Dynamic Link Library (DLL)), a Software Development Kit (SDK), an Application Programming Interface (API), or the like, in order to execute the instructions on a particular computing device or other device. In another example, machine readable instructions (e.g., stored settings, data input, recorded network addresses, etc.) may need to be configured before the machine readable instructions and/or corresponding program can be executed in whole or in part. 
Accordingly, the disclosed machine readable instructions and/or corresponding programs are intended to encompass such machine readable instructions and/or programs regardless of the particular format or state of the machine readable instructions and/or programs when stored or otherwise at rest or in transit.

As mentioned above, the example processes of fig. 14-16 may be implemented using executable instructions (e.g., computer and/or machine readable instructions) stored on a non-transitory computer and/or machine readable medium such as a hard disk drive, flash memory, read only memory, compact disk, digital versatile disk, cache, random access memory, and/or any other storage device or storage disk in which information is stored for any duration (e.g., for extended periods of time, permanently, for brief instances, for temporarily buffering, and/or for caching of the information). As used herein, the term "non-transitory computer-readable medium" is expressly defined to include any type of computer-readable storage device and/or storage disk and to exclude propagating signals and to exclude transmission media.

The terms "including" and "comprising" (and all forms and tenses thereof) are used herein as open-ended terms. Thus, whenever a claim employs any form of "include" or "comprise" (e.g., comprises, includes, comprising, including, having, etc.) as a preamble or within a claim recitation of any kind, it is to be understood that additional elements, items, etc. may be present without falling outside the scope of the corresponding claim or recitation. As used herein, the phrase "at least," when used as a transitional term in, for example, the preamble of a claim, is open-ended in the same way that the terms "comprising" and "including" are open-ended. The term "and/or," when used, for example, in a form such as A, B, and/or C, refers to any combination or subset of A, B, C, such as: (1) A alone, (2) B alone, (3) C alone, (4) A and B, (5) A and C, (6) B and C, and (7) A and B and C. As used herein in the context of describing structures, components, items, objects, and/or things, the phrase "at least one of A and B" is intended to refer to implementations that include any one of: (1) at least one A, (2) at least one B, and (3) at least one A and at least one B. Similarly, as used herein in the context of describing structures, components, items, objects, and/or things, the phrase "at least one of A or B" is intended to refer to implementations that include any one of: (1) at least one A, (2) at least one B, and (3) at least one A and at least one B. As used herein in the context of describing the implementation or execution of processes, instructions, actions, activities, and/or steps, the phrase "at least one of A and B" is intended to refer to implementations that include any one of: (1) at least one A, (2) at least one B, and (3) at least one A and at least one B. Similarly, as used herein in the context of describing the implementation or execution of processes, instructions, actions, activities, and/or steps, the phrase "at least one of A or B" is intended to refer to implementations that include any one of: (1) at least one A, (2) at least one B, and (3) at least one A and at least one B.

The example program 1400 of FIG. 14 includes a block 1402 in which the example hardware evaluator 202 determines whether a hardware parameter is known. For example, where the image analyzer 102 has performed one or more analysis and localization operations on a previous occasion, further efforts to perform underlying hardware detection and/or characterization efforts may be avoided. Information associated with the hardware parameters may be stored in a memory of the example hardware evaluator 202, in a memory of the example processing resource 114, and/or in a memory of the example processor platform 1700 of fig. 17. In the event that the example hardware evaluator 202 determines that the hardware parameters are known (block 1402) (e.g., available in a stored memory), control proceeds to block 1418, as described in further detail below.

In the event that the example hardware evaluator 202 determines that hardware parameters (e.g., including information related to image size) are unknown and/or otherwise unavailable (block 1402), the example hardware evaluator 202 identifies a search span and a macroblock size (block 1404). As described above, in some examples, the size and location of the search area depend on the image size (e.g., if the search is performed in the lower left corner of the image, the search area may be truncated where it reaches the border of the image). The example search area determiner 204 calculates a search area based on the identified search span and macroblock size in a manner consistent with example equations 1A and 1B, and in a manner consistent with the image size and the location of the reference macroblock within the image (block 1406). Without limitation, one or more additional and/or alternative inputs may be used to direct the search area operation(s). In some examples, pyramid successive elimination techniques may be used to identify the search area(s) more efficiently, where sensor/camera input may be used to more accurately define a particular portion of an image to be searched. Such example pyramid successive elimination techniques may provide greater analysis efficiency than, for example, raster-based analysis efforts (e.g., "raster patterns") in which predefined portions of an image are analyzed regardless of their likelihood of containing changing pixel activity. Additionally, the example search area determiner 204 calculates the number of possible macroblock rows within the calculated search area in a manner consistent with example equation 2 (block 1408).

To determine the number of difference Calculators (CSADs) to use, the example difference calculator number determiner 206 considers the calculated number of macroblocks that may fit within a row within the example search area (block 1410). As described above, the number of CSADs required depends on the width of the macroblock and the width of the search area. In other words, where possible (e.g., when such hardware resources may accommodate parallelism), some computational efficiency may be achieved by performing the pixel analysis tasks in parallel. Thus, in some examples, the number of difference calculators is equal to the number of macroblock rows that may fit within the example search area. As described above, different numbers and arrangements of difference calculators constitute the HCOST unit. The example HCOST number determiner 208 determines a number of HCOST cells based on the macroblock height (block 1412). In some examples, the hardware evaluator 202 selects a target hardware structure and/or other hardware resources (e.g., number of HCOST cells, particular type of HCOST cell, etc.) (block 1416) and continues to compare images based on such available hardware resources (block 1418). As described above, in some examples, the hardware evaluator 202 allocates required hardware resources based on the analysis identified above (block 1416), e.g., allocates specific FPGA resources to facilitate a specific number and/or type of HCOST units, CSADs, etc.
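The resource planning of blocks 1402-1416 can be sketched as follows. Example equations 1A, 1B, and 2 are not reproduced in this section, so the formula below is an assumption: one difference calculator per horizontal offset at which the macroblock fits within a row of the search area. That assumption yields seven calculators for a 3-pixel-wide macroblock in a 9-pixel-wide row, which is consistent with the seven CSAD units described elsewhere in this description. All function names, the cache, and the dictionary keys are hypothetical.

```python
def num_difference_calculators(search_width, mb_width):
    """Assumed rule: one CSAD per horizontal macroblock position in a row."""
    if mb_width <= 0 or search_width < mb_width:
        raise ValueError("macroblock must fit within the search area width")
    return search_width - mb_width + 1


def num_hcost_units(mb_height):
    """One HCOST unit per macroblock row (block 1412)."""
    return mb_height


# Hypothetical store of previously determined parameters (block 1402).
_known_parameters = {}


def plan_resources(search_width, mb_width, mb_height):
    """Return cached parameters when known, otherwise compute and store them."""
    key = (search_width, mb_width, mb_height)
    if key not in _known_parameters:
        _known_parameters[key] = {
            "csads": num_difference_calculators(search_width, mb_width),
            "hcost_units": num_hcost_units(mb_height),
        }
    return _known_parameters[key]
```

A second call with the same arguments returns the stored result, mirroring the shortcut from block 1402 directly to block 1418 when the hardware parameters are already known.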

FIG. 15 shows additional details associated with initiating the comparison activity of block 1418. In the example shown in FIG. 15, pixel analysis and comparison are described in connection with a first mode of operation of the example image analyzer 102, which employs the example first comparison unit 400 of FIG. 4 and the example HCOST unit of FIG. 5. As described above, the example first mode of operation follows the example timing diagram 700 of FIG. 7. The example image retriever 210 retrieves a candidate image of interest to be analyzed (block 1502), and the example macroblock engine 212 retrieves and/or otherwise selects a macroblock of interest, which may include a portion of a reference image or of the candidate image to be searched (block 1504). In general, at least two images are involved: a reference image and a candidate image in which content of the reference image is searched for. Within the reference image, one or more macroblocks (MBs) to be searched for can be extracted (e.g., MBs can be extracted via a raster pattern, an obtained list of image coordinates, etc.). Thus, MBs are extracted from the reference image, and a search area in which the best match for the MB is sought is selected from the candidate image. The example macroblock engine 212 marks the row and pixel references of an example macroblock (e.g., the example macroblock 308 of FIGS. 3C and 3D) (block 1506), and the example search area engine 214 selects a search area of interest from the retrieved candidate image and marks its row and pixel references (block 1508) (e.g., see the example search area 302 of FIGS. 3A and 3B). The example search area engine 214 identifies and/or otherwise marks candidate rows of the search area (e.g., the example search area 302) (block 1510), as shown on the right-hand side of FIG. 6.

The example shown in fig. 15 is described in connection with one HCOST cell (e.g., the example HCOST cell 404A of fig. 4 and 5), but the example comparison program 1418 of fig. 15 is applicable to any number of HCOST cells given the pixel analysis task. The example difference calculator engine 218 sets the CSAD pointer (P) to manage and/or otherwise control the input loading of one CSAD unit (e.g., the example CSAD [0]502A of FIG. 5) (block 1512). The example difference calculator engine 218 loads the reference pixel to the CSAD unit associated with the pointer (P) (block 1514) and loads the candidate pixels to the same CSAD unit (block 1516). The example difference calculator engine 218 determines whether the CSAD unit associated with the current pointer (P) is the last CSAD unit within the HCOST unit of interest (block 1518). For example, considering first HCOST unit 404A of FIG. 5, there are seven (7) CSAD units operating in parallel, where each CSAD unit performs a comparison of a portion of the candidate row of interest (e.g., the first CSAD unit corresponds to bits 0, 1, and 2 of the candidate row of interest, the second CSAD unit corresponds to right-shifted bits 1, 2, and 3 of the candidate row of interest, etc.). If the example CSAD unit is not the last CSAD unit of the HCOST unit of interest (block 1518), the example barrel shift engine 220 shifts and increments the pointer (P) for the candidate pixel of the reference macroblock (block 1520). Control then returns to block 1512.
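The loading loop of blocks 1512-1520 and the parallel comparison it prepares can be sketched in software. The sketch below is only a functional model under stated assumptions: pixels are plain integers, each CSAD unit sees the candidate row shifted by one more position than the previous unit, and the names `csad` and `hcost_unit` are hypothetical. It does not model hardware timing.

```python
def csad(ref_row, cand_segment):
    """Sum of absolute differences between a reference row and one segment."""
    return sum(abs(r - c) for r, c in zip(ref_row, cand_segment))


def hcost_unit(ref_row, cand_row):
    """Return one SAD value per CSAD unit for a single candidate row.

    CSAD unit p compares the reference row with candidate pixels
    p .. p + len(ref_row) - 1, i.e. the row shifted by p positions.
    """
    width = len(ref_row)
    n_csad = len(cand_row) - width + 1
    return [csad(ref_row, cand_row[p:p + width]) for p in range(n_csad)]
```

With a 3-pixel reference row and a 9-pixel candidate row, `hcost_unit` returns seven values, one for each of the seven CSAD units in the example above.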

In the event that the example difference calculator engine 218 determines that the last CSAD unit has been loaded with input data (pixels to compare) (block 1518), the example difference calculator engine 218 invokes the corresponding loaded CSAD units (e.g., the seven (7) CSAD units 502A-502G of FIG. 5) to perform the comparison that generates the Sum of Absolute Difference (SAD) values (block 1522). The example feedback engine 222 adds a feedback value to the result (block 1524), and the example search area engine 214 determines whether the non-overlapping set of candidate rows is complete (block 1526). As described above, a complete non-overlapping set of candidate rows indicates that a complete macroblock has been applied to the grouping of candidate rows across the complete row width, as shown in the example search area map 900 of FIG. 9. If the non-overlapping set of candidate rows is not complete, the example difference calculator engine 218 increments the row counter (block 1528) to cause the CSAD units to load and compare the subsequent row of interest (e.g., moving from the comparison of pixels in candidate row C0 to candidate row C1, see dashed circle 722 of FIG. 7). The example barrel shift engine 220 again rotates the reference pixels (block 1530) so that the correct candidate row pixels are compared to the correct reference row pixels. Control then returns to block 1512.

Where the non-overlapping set of candidate rows is complete and/or otherwise evaluated to determine SAD values (block 1526) (e.g., see S0 730 of FIG. 9, showing candidate rows C0, C1, and C2), the example ranking engine 216 applies and/or otherwise forwards the respective SAD values to a ranking network (block 1532), e.g., the example ranking network 408A of FIG. 5. As described above, the example ranking engine 216 invokes the example ranking network 408A to determine a relatively low value (e.g., a relatively minimum SAD value) and a corresponding location of the relatively low value (e.g., a low-value pixel coordinate) (block 1534). The example search area engine 214 determines whether there are one or more additional non-overlapping rows to evaluate (block 1536) (see, e.g., the additional non-overlapping sets of candidate rows S1-S6 of FIG. 9), and if so, the example difference calculator engine 218 increments a row counter (block 1538) to cause subsequent rows of the search area of interest to be loaded and compared. The example barrel shift engine 220 rotates the reference pixels (block 1540), and control returns to block 1512. However, where all non-overlapping rows of the search area of interest have been evaluated (e.g., pixels have been loaded and compared to the reference block for all possible locations of the reference block within the search area) (block 1536), the example ranking engine 216 compares all such groupings of non-overlapping rows to determine an overall minimum and its location (block 1542). In other words, each S value (e.g., S0-S6) represents a local minimum value for a particular grouping of candidate rows that have been compared to the reference macroblock. Thus, only one S value may be an absolute minimum value because the reference macroblock corresponds to a portion of the image to be identified for possible evidence of movement within a scene (e.g., a frame from a motion picture scene).
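The first mode of operation as a whole (blocks 1512-1542) can be summarized by the following functional sketch. It is a software model under assumptions, not the disclosed circuit: images are lists of integer rows, each grouping S of candidate rows starts one row lower than the previous one, the accumulator feedback path is modeled as a running list `acc`, and the barrel shifter's rotation of reference rows is modeled only by indexing `macroblock[r]`. The function names are hypothetical.

```python
def row_sads(ref_row, cand_row):
    """SAD of one reference row against every horizontal offset of a row."""
    w = len(ref_row)
    return [sum(abs(r - c) for r, c in zip(ref_row, cand_row[p:p + w]))
            for p in range(len(cand_row) - w + 1)]


def first_mode_search(macroblock, search_area):
    """Return (min_sad, x, y) of the best match using one HCOST unit."""
    mb_h = len(macroblock)
    n_offsets = len(search_area[0]) - len(macroblock[0]) + 1
    best = None
    for top in range(len(search_area) - mb_h + 1):  # one grouping S per top row
        acc = [0] * n_offsets                        # accumulator, cleared per group
        for r in range(mb_h):
            sads = row_sads(macroblock[r], search_area[top + r])
            acc = [a + s for a, s in zip(acc, sads)]  # feedback: add prior sums
        local_min = min(acc)                         # ranking network, per group
        x = acc.index(local_min)
        if best is None or local_min < best[0]:      # overall minimum (block 1542)
            best = (local_min, x, top)
    return best
```

Because the comparison uses a strict less-than, the first grouping that reaches the minimum is reported when several groupings tie; that tie-breaking rule is a choice made for this sketch.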

While the above-described example program 1418 of FIG. 15 corresponds to the example first mode of operation of the example image analyzer 102, the example program 1418 shown in FIG. 16 corresponds to the second mode of operation of the example image analyzer 102. As described above, the example first mode of operation invokes and/or otherwise utilizes the example first comparison unit 400 of FIG. 4 and the corresponding CSAD units of FIG. 5, while the example second mode of operation invokes and/or otherwise utilizes the architecture of the example second comparison unit 1000 of FIG. 10 and the corresponding CSAD units of FIG. 11. The example second mode of operation corresponds to the example timing diagram 1200 of FIG. 12.

In the example shown in fig. 16, the example image retriever 210 retrieves a candidate image of interest to analyze (block 1602), and the example macroblock engine 212 retrieves and/or otherwise selects a macroblock of interest, which may include a portion of an image of a reference image or a candidate image to be searched (block 1604). The example macroblock engine 212 marks the row and pixel references of the example macroblock (e.g., the example macroblock 308 of fig. 3C and 3D) (block 1606), and the example search area engine 214 selects a search area of interest from the retrieved candidate images (e.g., see the example search area 302 of fig. 3A and 3B) and marks its row and pixel references (block 1608). The example search area engine 214 identifies and/or otherwise marks candidate rows of a search area (e.g., the example search area 302) (block 1610), as shown on the right-hand side of fig. 6.

The example shown in FIG. 16 is described in connection with all available HCOST units of a given comparison unit (e.g., the example second comparison unit 1000 of FIG. 10). As described above, because all available HCOST units of a given comparison unit participate in the analysis of a complete non-overlapping set of candidate rows (e.g., S0, S1, etc.), the processing footprint and the number of processing cycles are reduced because any need for an accumulator feedback loop (e.g., the example feedback path 508) is bypassed. The example HCOST engine 224 selects an HCOST unit (block 1612), e.g., a first available HCOST unit when performing an initial iteration with the example second comparison unit 1000 of FIG. 10. The example HCOST engine 224 applies reference row pixels to the selected HCOST unit (block 1614), e.g., the first row R0 of the macroblock. As described above, because each HCOST unit is provided with a single row from the macroblock for the entire evaluation of the search area of interest, efficiency gains are achieved in part by removing any need for barrel shifter circuitry/hardware. The example HCOST engine 224 determines whether the comparison unit has additional HCOST units that have not yet been assigned a corresponding reference row input (block 1616), and increments the reference row pointer if so (block 1618). Control then returns to block 1612.

In the event that the example HCOST engine 224 determines that all available HCOST cells of the comparison cell of interest have been assigned a reference input (block 1616), the example difference calculator engine 218 begins a pixel comparison analysis at the first available row (row 0) (block 1620). The example difference calculator engine 218 sets the CSAD pointer (P) for each available HCOST unit to an initial value of zero to identify the first CSAD unit of HCOST units (block 1622). The example difference calculator engine 218 loads the pixels into the selected CSAD unit as input (block 1624) and determines whether the current CSAD unit is the last unit of the HCOST unit (block 1626). If the current CSAD unit is not the last unit, one or more additional CSAD units need to be configured with input pixels for final comparison with the corresponding reference pixels of the macroblock. The example difference calculator engine 218 increments the pointer (P) to point to the next available CSAD unit (block 1628) and shifts the candidate pixel row to the right (block 1630). Control then returns to block 1624 to load one or more inputs of the additional CSAD units.

If all CSAD units have been provided with candidate row pixel data (block 1626), the example difference calculator engine 218 calculates a difference value for each CSAD unit to generate a local SAD value (block 1632). The example HCOST engine 224 cascades the local SAD value to the next available HCOST cell (block 1634), thereby bypassing any need for an accumulator feedback circuit and associated processing cycles associated therewith. The example search area engine 214 determines whether the non-overlapping set of candidate rows is complete (evaluated) (block 1636), and if the non-overlapping set of candidate rows is not complete, the example difference calculator engine 218 increments the row pointer to focus on the next available candidate row for evaluation (block 1638) (e.g., a comparison between the pixels of the next candidate row and the rows of the reference macroblock). Control then returns to block 1622.

If the example search area engine 214 determines that the non-overlapping set of candidate rows is complete (block 1636) (e.g., S0, which includes candidate rows C0, C1, and C2), the example search area engine 214 saves the corresponding local SAD values (block 1640), which are sorted at a later time to determine the relatively lowest value. The example search area engine 214 determines whether additional rows of interest exist (block 1642) and, if so, control returns to block 1638. Otherwise, the example ranking engine 216 sorts the accumulated/stored S values (e.g., local SAD values associated with S0, S1, S2, etc.) to identify the lowest relative value (block 1644). The example ranking engine 216 then determines a corresponding location (e.g., pixel coordinates in the scene of interest) associated with the lowest relative value (block 1646). Such information may be used from one temporal evaluation of the scene of interest to the next to identify, for example, motion and/or direction of motion in a video stream.
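The second mode of operation (blocks 1612-1646) can be modeled in the same spirit. In this sketch, which is an illustration under assumptions rather than the disclosed hardware, each HCOST unit is permanently bound to one macroblock row and passes its partial sums to the next unit, so no accumulator feedback and no rotation of reference rows are modeled. Images are lists of integer rows, and the names `hcost_stage` and `second_mode_search` are hypothetical.

```python
def hcost_stage(ref_row, cand_row, cascade_in):
    """One HCOST unit: add this row's SADs to the sums cascaded into it."""
    w = len(ref_row)
    return [cascade_in[p]
            + sum(abs(r - c) for r, c in zip(ref_row, cand_row[p:p + w]))
            for p in range(len(cand_row) - w + 1)]


def second_mode_search(macroblock, search_area):
    """Return (min_sad, x, y) using a cascade of HCOST units."""
    n_offsets = len(search_area[0]) - len(macroblock[0]) + 1
    saved = []                                   # stored local results (block 1640)
    for top in range(len(search_area) - len(macroblock) + 1):
        partial = [0] * n_offsets                # input to the first HCOST unit
        for k, ref_row in enumerate(macroblock):
            # HCOST unit k always holds reference row k; only candidates change.
            partial = hcost_stage(ref_row, search_area[top + k], partial)
        local = min(partial)
        saved.append((local, partial.index(local), top))
    # Sorting step (blocks 1644-1646): tuples compare by SAD first, then x, then y.
    return min(saved)
```

For the same inputs, this cascade produces the same minimum SAD as a single HCOST unit with feedback; the difference lies in where the partial sums are kept, which is the point of the second comparison unit.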

FIG. 17 is a block diagram of an example processor platform 1700 configured to execute the instructions of FIGS. 14-16 to implement the image analyzer 102 of FIGS. 1 and 2. The processor platform 1700 may be, for example, a server, a personal computer, a workstation, a self-learning machine (e.g., a neural network), a mobile device (e.g., a cell phone, a smart phone, a tablet such as an iPad™), a Personal Digital Assistant (PDA), an internet appliance, a DVD player, a CD player, a digital video recorder, a Blu-ray player, a game console, a personal video recorder, a set-top box, or any other type of computing device.

The processor platform 1700 of the illustrated example includes a processor 1712. The processor 1712 of the illustrated example is hardware. For example, the processor 1712 may be implemented by one or more integrated circuits, logic circuits, microprocessors, GPUs, DSPs, or controllers from any desired family or manufacturer. The hardware processor may be a semiconductor-based (e.g., silicon-based) device. In this example, the processor implements the example hardware evaluator 202, the example search area determiner 204, the example difference calculator number determiner 206, the example HCOST number determiner 208, the example image retriever 210, the example macroblock engine 212, the example search area engine 214, the example ranking engine 216, the example difference calculator engine 218, the example barrel shift engine 220, the example feedback engine 222, the example HCOST engine 224, the example comparison unit(s) 400, the example difference calculators 406, 502, the example adder 504, the example accumulator 506, the example ranking network 408, 1008, the example minimum calculator 450, 1050, and/or, more generally, the example image analyzer 102.

The processor 1712 of the illustrated example includes local memory 1713 (e.g., a cache). The processor 1712 of the illustrated example is in communication with a main memory including a volatile memory 1714 and a non-volatile memory 1716 via a bus 1718. The volatile memory 1714 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), and/or any other type of random access memory device. The non-volatile memory 1716 may be implemented by flash memory and/or any other desired type of memory device. Access to the main memory 1714, 1716 is controlled by a memory controller.

The processor platform 1700 of the illustrated example also includes interface circuitry 1720. The interface circuit 1720 may be implemented by any type of interface standard (e.g., an Ethernet interface, a Universal Serial Bus (USB) interface, a Near Field Communication (NFC) interface, and/or a PCI express interface).

In the illustrated example, one or more input devices 1722 are connected to the interface circuit 1720. The input device(s) 1722 allow a user to enter data and/or commands into the processor 1712. The input device(s) may be implemented by, for example, an audio sensor, a microphone, a camera (still or video), a keyboard, a button, a mouse, a touch screen, a touch pad, a trackball, an isopoint device, and/or a voice recognition system.

One or more output devices 1724 are also connected to the interface circuit 1720 of the illustrated example. The output devices 1724 may be implemented, for example, by display devices (e.g., Light Emitting Diodes (LEDs), Organic Light Emitting Diodes (OLEDs), Liquid Crystal Displays (LCDs), cathode ray tube displays (CRTs), in-plane switching (IPS) displays, touch screens, etc.), tactile output devices, printers, and/or speakers. Thus, the interface circuit 1720 of the illustrated example typically includes a graphics driver card, a graphics driver chip, and/or a graphics driver processor.

The interface circuit 1720 of the illustrated example also includes a communication device (e.g., a transmitter, receiver, transceiver, modem, residential gateway, wireless access point, and/or network interface) to facilitate exchange of data with external machines (e.g., any kind of computing device) via a network 1726. The communication may be via, for example, an Ethernet connection, a Digital Subscriber Line (DSL) connection, a telephone line connection, a coaxial cable system, a satellite system, a line-of-sight wireless system, a cellular telephone system, or the like.

Processor platform 1700 of the illustrated example also includes one or more mass storage devices 1728 for storing software and/or data. Examples of such mass storage devices 1728 include floppy disk drives, hard drive disks, compact disk drives, blu-ray disk drives, Redundant Array of Independent Disks (RAID) systems, and Digital Versatile Disk (DVD) drives.

The machine-executable instructions 1732 of fig. 14-16 may be stored in the mass storage device 1728, in the volatile memory 1714, in the non-volatile memory 1716, and/or on a removable non-transitory computer-readable storage medium (e.g., a CD or DVD).

Additional examples of the presently described methods, systems, apparatus, articles of manufacture, and devices disclosed herein include the following non-limiting configurations. Each of the following non-limiting examples can exist independently, or can be combined in any permutation or combination with any one or more of the other examples provided below and/or throughout this disclosure.

Example 1 includes an apparatus for improving efficiency of image difference computation, the apparatus comprising: a Horizontal Cost (HCOST) engine to apply a first row of pixels of a macroblock to an input of a first HCOST unit, the first HCOST unit including a plurality of difference calculators; and a difference calculator engine to apply a corresponding line of pixels of a search window of the source image to a corresponding difference calculator of a plurality of difference calculators of the first HCOST unit to calculate a respective Sum of Absolute Difference (SAD) value between (a) the first line of pixels of the macroblock and (b) the corresponding line of pixels of the search window.

Example 2 includes the apparatus as defined in example 1, further comprising: a difference calculator number determiner for calculating the number of difference calculators based on the number of instances of the macroblock that fit within the width of the search window of the source image.
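One possible reading of Example 2 can be sketched as follows. The sketch assumes that "instances of the macroblock that fit within the width" counts overlapping placements at one-pixel steps; if non-overlapping placements are intended instead, the count would be the integer quotient of the two widths. The function name `num_difference_calculators` is hypothetical.

```python
def num_difference_calculators(window_width, macroblock_width):
    """Count the horizontal placements of the macroblock inside the search
    window, assuming one-pixel steps between placements (an assumption,
    not a detail confirmed by the text above)."""
    if macroblock_width <= 0 or macroblock_width > window_width:
        raise ValueError("macroblock must be non-empty and fit in the window")
    return window_width - macroblock_width + 1


count = num_difference_calculators(window_width=16, macroblock_width=8)
print(count)  # 9
```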

Example 3 includes the apparatus as defined in example 1, wherein the HCOST engine is to cause the first HCOST unit to cascade the respective SAD values to a second HCOST unit without rotating the first pixel row of the macroblock.

Example 4 includes the apparatus as defined in example 3, wherein the HCOST engine is to route the cascaded SAD values corresponding to the plurality of difference calculators of the first HCOST unit to inputs of respective difference calculators in the second HCOST unit.

Example 5 includes the apparatus as defined in example 4, wherein the difference calculator engine is to constrain inputs of respective difference calculators in the second HCOST unit to evaluate a second row of pixels of the macroblock.
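The cascading of Examples 3 to 5 may be pictured with the following Python sketch, in which each call to the hypothetical function `hcost_unit` stands in for one HCOST unit dedicated to one macroblock row. The sketch assumes that difference calculator k of every unit handles horizontal offset k, and it models the hardware chain as a sequential loop, which is a simplification.

```python
def hcost_unit(mb_row, window_row, cascaded_in):
    """One HCOST unit: each calculator adds its row SAD to the partial
    sum cascaded in from the matching calculator of the previous unit."""
    return [
        partial + sum(abs(m - window_row[k + i]) for i, m in enumerate(mb_row))
        for k, partial in enumerate(cascaded_in)
    ]


def cascade(macroblock, window_rows):
    """Chain one unit per macroblock row. Each unit is fixed to its own
    macroblock row, so only the partial SAD values move between units."""
    num_calculators = len(window_rows[0]) - len(macroblock[0]) + 1
    sads = [0] * num_calculators  # nothing is cascaded into the first unit
    for mb_row, window_row in zip(macroblock, window_rows):
        sads = hcost_unit(mb_row, window_row, sads)
    return sads


macroblock = [[1, 2], [3, 4]]
window_rows = [[0, 1, 2, 9], [0, 3, 4, 9]]
total = cascade(macroblock, window_rows)
print(total)  # [6, 0, 14]
```

Because the second unit only ever sees the second macroblock row, no pixel row has to be rotated or fed back into an accumulator in this model.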

Example 6 includes an apparatus as defined in example 1, further comprising: a search area engine to determine whether all corresponding pixel rows of the search window have been evaluated.

Example 7 includes an apparatus as defined in example 6, further comprising: a ranking engine to compare the respective SAD values to identify a relatively lowest one of the respective SAD values.

Example 8 includes the apparatus as defined in example 7, wherein a relatively lowest one of the respective SAD values indicates a match between the macroblock and an image of the search window.

Example 9 includes the apparatus as defined in example 7, wherein the ranking engine is to identify a target location corresponding to a relatively lowest one of the respective SAD values.

Example 10 includes the apparatus as defined in example 9, wherein the ranking engine is to identify the target location as a pixel coordinate of the search window.
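The ranking of Examples 7 to 10 may be sketched as a minimum search over the collected SAD values. The function name `rank_sads` and the `[y][x]` layout of the SAD grid are assumptions made for illustration, as is the choice to keep the first of several equal minima.

```python
def rank_sads(sad_map):
    """Return (lowest_sad, (x, y)) for a grid of SAD values indexed [y][x].
    Ties are resolved in favor of the first position scanned (assumption)."""
    best = None
    for y, row in enumerate(sad_map):
        for x, sad in enumerate(row):
            if best is None or sad < best[0]:
                best = (sad, (x, y))
    if best is None:
        raise ValueError("sad_map is empty")
    return best


sad_map = [[6, 3, 14],
           [9, 0, 7]]
lowest, target = rank_sads(sad_map)
print(lowest, target)  # 0 (1, 1)
```

Here `target` plays the role of the target location expressed as a pixel coordinate of the search window.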

Example 11 includes a non-transitory computer-readable medium comprising computer-readable instructions that, when executed, cause at least one processor to: apply a first row of pixels of a macroblock to an input of a first Horizontal Cost (HCOST) unit, the first HCOST unit comprising a plurality of difference calculators; apply corresponding pixel rows of a search window of a source image to corresponding ones of the plurality of difference calculators of the first HCOST unit; and concatenate respective Sum of Absolute Difference (SAD) values for corresponding ones of the plurality of difference calculators between (a) the first pixel row of the macroblock and (b) the corresponding pixel row of the search window.

Example 12 includes a computer-readable medium as defined in example 11, wherein the instructions, when executed, cause the at least one processor to calculate the number of difference calculators based on the number of instances of the macroblock that fit within the width of the search window of the source image.

Example 13 includes a computer-readable medium as defined in example 11, wherein the instructions, when executed, cause the at least one processor to cause the first HCOST unit to cascade the respective SAD values to a second HCOST unit while bypassing rotation of the first pixel row of the macroblock.

Example 14 includes a computer-readable medium as defined in example 13, wherein the instructions, when executed, cause the at least one processor to route the cascaded SAD values corresponding to the plurality of difference calculators of the first HCOST unit to inputs of respective difference calculators in the second HCOST unit.

Example 15 includes a computer-readable medium as defined in example 14, wherein the instructions, when executed, cause the at least one processor to constrain the inputs of the respective difference calculators in the second HCOST unit to evaluate a second row of pixels of the macroblock.

Example 16 includes a computer-readable medium as defined in example 11, wherein the instructions, when executed, cause the at least one processor to determine whether all corresponding pixel rows of the search window have been evaluated.

Example 17 includes a computer-readable medium as defined in example 16, wherein the instructions, when executed, cause the at least one processor to compare the respective SAD values to identify a relatively lowest one of the respective SAD values.

Example 18 includes a computer-readable medium as defined in example 17, wherein the instructions, when executed, cause the at least one processor to identify a match between the macroblock and the image of the search window based on the relatively lowest one of the respective SAD values.

Example 19 includes a computer-readable medium as defined in example 17, wherein the instructions, when executed, cause the at least one processor to identify a target location corresponding to the relatively lowest one of the respective SAD values.

Example 20 includes a computer-readable medium as defined in example 19, wherein the instructions, when executed, cause the at least one processor to identify the target location as a pixel coordinate of the search window.

Example 21 includes a computer-implemented method for improving efficiency of image difference calculations, the method comprising: applying, by executing instructions with at least one processor, a first row of pixels of a macroblock to an input of a first Horizontal Cost (HCOST) unit, the first HCOST unit comprising a plurality of difference calculators; applying, by executing instructions with at least one processor, a corresponding row of pixels of a search window of a source image to a corresponding difference calculator of a plurality of difference calculators of a first HCOST unit; and concatenating, by executing the instructions with the at least one processor, respective Sum of Absolute Difference (SAD) values of corresponding difference calculators of the plurality of difference calculators between (a) the first pixel row of the macroblock and (b) the corresponding pixel row of the search window.

Example 22 includes the method as defined in example 21, further comprising: calculating the number of difference calculators based on the number of instances of the macroblock that fit within the width of the search window of the source image.

Example 23 includes the method as defined in example 21, further comprising: causing the first HCOST unit to cascade the respective SAD values to a second HCOST unit while bypassing rotation of the first pixel row of the macroblock.

Example 24 includes the method as defined in example 23, further comprising: routing the cascaded SAD values corresponding to the plurality of difference calculators of the first HCOST unit to inputs of respective difference calculators in the second HCOST unit.

Example 25 includes the method as defined in example 24, further comprising: constraining the inputs of the respective difference calculators in the second HCOST unit to evaluate a second row of pixels of the macroblock.

Example 26 includes the method as defined in example 21, further comprising: determining whether all corresponding pixel rows of the search window have been evaluated.

Example 27 includes the method as defined in example 26, further comprising: comparing the respective SAD values to identify a relatively lowest one of the respective SAD values.

Example 28 includes the method as defined in example 27, further comprising: identifying a match between the macroblock and the image of the search window based on the relatively lowest one of the respective SAD values.

Example 29 includes the method as defined in example 27, further comprising: identifying a target location corresponding to the relatively lowest one of the respective SAD values.

Example 30 includes the method as defined in example 29, further comprising: identifying the target location as a pixel coordinate of the search window.

Example 31 includes an apparatus for improving efficiency of image difference calculations, the apparatus comprising: a macroblock pixel application unit for applying a first pixel row of a macroblock to an input of a first Horizontal Cost (HCOST) unit, the first HCOST unit comprising a plurality of difference calculators; and a search window application unit for applying corresponding pixel rows of a search window of the source image to corresponding ones of the plurality of difference calculators of the first HCOST unit, the corresponding ones of the plurality of difference calculators for calculating respective Sum of Absolute Difference (SAD) values between (a) the first pixel row of the macroblock and (b) the corresponding pixel row of the search window.

Example 32 includes an apparatus as defined in example 31, further comprising: a difference calculator number determining unit for calculating the number of difference calculators based on the number of instances of the macroblock that fit within the width of the search window of the source image.

Example 33 includes the apparatus as defined in example 31, wherein the macroblock pixel application unit is to cause the first HCOST unit to cascade the respective SAD values to a second HCOST unit without rotating the first pixel row of the macroblock.

Example 34 includes the apparatus defined in example 33, wherein the macroblock pixel application unit is to route the cascaded SAD values corresponding to the plurality of difference calculators of the first HCOST unit to inputs of respective difference calculators in the second HCOST unit.

Example 35 includes the apparatus as defined in example 34, wherein the search window application unit is to constrain input of a respective difference calculator in the second HCOST unit to evaluate a second line of pixels of the macroblock.

Example 36 includes an apparatus as defined in example 31, further comprising: a unit for search area evaluation for determining whether all corresponding pixel rows of the search window have been evaluated.

Example 37 includes an apparatus as defined in example 36, further comprising: a ranking unit for comparing the respective SAD values to identify a relatively lowest one of the respective SAD values.

Example 38 includes the apparatus as defined in example 37, wherein a relatively lowest one of the respective SAD values indicates a match between the macroblock and an image of the search window.

Example 39 includes the apparatus as defined in example 37, wherein the ranking unit is to identify a target location corresponding to a relatively lowest one of the respective SAD values.

Example 40 includes the apparatus as defined in example 39, wherein the ranking unit is to identify the target location as a pixel coordinate of the search window.

Example computer readable media include: first instructions that, when executed, cause a machine to at least one of distribute, configure, assemble, install, instantiate, retrieve, decompress, and decrypt second instructions for execution; and the second instructions that, when executed, cause a machine to: apply a first row of pixels of a macroblock to an input of a first Horizontal Cost (HCOST) unit, the first HCOST unit comprising a plurality of difference calculators; apply corresponding pixel rows of a search window of a source image to corresponding ones of the plurality of difference calculators of the first HCOST unit; and concatenate respective Sum of Absolute Difference (SAD) values for corresponding ones of the plurality of difference calculators between (a) the first pixel row of the macroblock and (b) the corresponding pixel row of the search window.

From the foregoing, it will be appreciated that example methods, apparatus, systems, and articles of manufacture to identify sub-images within an image of interest have been disclosed. For example, examples disclosed herein facilitate object motion detection for images of interest, where the images of interest may be movie frames and/or real-time video feeds of monitored activity (e.g., security cameras). A sub-image (e.g., a wheel of an automobile) of an image of interest (e.g., a scene including an automobile with wheels) is used to search the image of interest at a first time, and when examples disclosed herein detect the sub-image at a second relative position within the image of interest, object motion may be confirmed. Additionally, the disclosed methods, apparatus, systems, and articles of manufacture improve the efficiency of using a computing device by implementing specific hardware configurations, timing diagrams, and/or processes that avoid the need for additional pixel shifting operations typically associated with image analysis. For example, examples disclosed herein avoid and/or otherwise reduce additional barrel shifting circuitry and avoid and/or otherwise reduce additional accumulator feedback circuitry. This unique hardware configuration and/or process enables improved efficiency of the underlying processing resources involved in one or more of the image analysis techniques disclosed herein.
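The motion-detection use described above may be illustrated with the following Python sketch. It uses a plain exhaustive SAD search in software rather than the HCOST hardware arrangement, and the function names `best_match` and `motion_vector` as well as the tiny test frames are hypothetical.

```python
def best_match(sub_image, frame):
    """Return the (x, y) position in the frame with the lowest SAD against
    the sub-image, using a brute-force scan of every placement."""
    h, w = len(sub_image), len(sub_image[0])
    best = None
    for y in range(len(frame) - h + 1):
        for x in range(len(frame[0]) - w + 1):
            sad = sum(
                abs(sub_image[j][i] - frame[y + j][x + i])
                for j in range(h)
                for i in range(w)
            )
            if best is None or sad < best[0]:
                best = (sad, (x, y))
    return best[1]


def motion_vector(sub_image, frame_t0, frame_t1):
    """Displacement of the best match between a first and a second time."""
    x0, y0 = best_match(sub_image, frame_t0)
    x1, y1 = best_match(sub_image, frame_t1)
    return (x1 - x0, y1 - y0)


sub_image = [[9, 8], [7, 6]]
frame_t0 = [[9, 8, 0, 0], [7, 6, 0, 0], [0, 0, 0, 0]]
frame_t1 = [[0, 0, 0, 0], [0, 0, 9, 8], [0, 0, 7, 6]]
motion = motion_vector(sub_image, frame_t0, frame_t1)
print(motion)  # (2, 1)
```

A non-zero displacement, as in this sketch, corresponds to the sub-image being found at a second relative position, which is the condition under which object motion may be confirmed.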

Although certain example methods, apparatus, and articles of manufacture have been disclosed herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all methods, apparatus, and articles of manufacture fairly falling within the scope of the appended claims either literally or under the doctrine of equivalents.
