Entropy-inspired directional filtering for image coding

Document No.: 144848. Publication date: 2021-10-22.

Abstract: This technology, Entropy-inspired directional filtering for image coding, was created by Jyrki Alakuijala, Lode Vandevenne, and Thomas Fischbacher on 2019-11-25. Image blocks are coded using entropy-inspired directional filtering. During encoding, an intensity difference is determined for at least some pixels of an image block based on neighboring pixels of respective ones of the at least some pixels. An angle is estimated for each of those pixels based on its intensity difference. A primary filtering direction of the image block is then determined based on the estimated angles. The image block is filtered according to the primary filtering direction to remove sloping edges associated with the image block, and the filtered image block is then encoded into an encoded image. During decoding, an angle map indicating the angles estimated for the pixels of an encoded image block is received and used to determine the primary filtering direction of the image block. The image block may then be filtered according to the primary filtering direction, and the filtered image block is output for display or storage.

1. A method for encoding an image block, the method comprising:

initializing values at indices of an array of angle candidates to zero, wherein a number of the indices of the array is equal to a number of angle candidates available for encoding the image block, wherein each of the indices corresponds to one of the angle candidates;

for each current pixel of a plurality of pixels of the image block:

determining an intensity difference based on a difference in intensity values of neighboring pixels of the current pixel;

estimating an angle of the current pixel by calculating a function of the intensity difference of the current pixel; and

increasing a value at an index corresponding to the estimated angle within the array of angle candidates;

determining a primary filtering direction of the image block based on an angle candidate having a maximum value in the array;

filtering the image block according to the primary filtering direction to remove one or more oblique edges associated with the image block; and

encoding the filtered image block into an encoded image.

2. The method of claim 1, wherein determining an intensity difference based on a difference in intensity values of neighboring pixels of the current pixel comprises:

determining a first delta value representing a difference in intensity values between a left neighboring pixel of the current pixel and a right neighboring pixel of the current pixel;

determining a second delta value representing a difference in intensity values between an upper neighboring pixel of the current pixel and a lower neighboring pixel of the current pixel; and

determining the intensity difference for the current pixel based on the first delta value and the second delta value.

3. The method of claim 1 or 2, wherein increasing the value at the index corresponding to the estimated angle within the array of angle candidates comprises:

calculating a function of the edge intensity of the current pixel along the estimated angle to determine a floating point value for the current pixel; and

increasing a value at an index of the angle candidate based on the floating point value.

4. The method of claim 3, wherein the function calculated for the edge intensity of the current pixel is one of a logarithmic function or an exponential function.

5. The method of claim 1, wherein increasing the value at the index of the angle candidate within the array for a current pixel comprises:

increasing a value at the index of the angle candidate in the array by a first increment value; and

increasing values at neighboring indices of the index of the angle candidate in the array by a second increment value, the second increment value being less than the first increment value.

6. The method of any of claims 1 to 5, wherein filtering the image block according to the primary filtering direction to remove the one or more oblique edges associated with the image block comprises:

determining an average of values at one or more of the indices of the array; and

selecting a filter type based on a ratio of the maximum value to the average,

wherein the image block is filtered using a filter of the selected filter type.

7. The method of any of claims 1 to 6, further comprising:

generating an angle map used to indicate a primary filtering direction of the image block to a decoder for decoding the encoded image.

8. An apparatus for encoding an image block, the apparatus comprising:

a memory storing an array of angle candidates having a number of indices based on a number of angle candidates available for encoding the image block; and

a processor configured to:

initialize values at the indices of the array of angle candidates to zero;

for each current pixel of a plurality of pixels of the image block:

determine an intensity difference based on a difference in intensity values of neighboring pixels of the current pixel;

estimate an angle of the current pixel by calculating a function of the intensity difference of the current pixel; and

increase a value at an index corresponding to the estimated angle within the array of angle candidates;

determine a primary filtering direction of the image block based on an angle candidate having a maximum value in the array;

filter the image block according to the primary filtering direction to remove one or more oblique edges associated with the image block; and

encode the filtered image block into an encoded image.

9. The apparatus of claim 8, wherein determining an intensity difference based on a difference in intensity values of neighboring pixels to the current pixel comprises:

determining a first delta value representing a difference in intensity values between a left neighboring pixel of the current pixel and a right neighboring pixel of the current pixel;

determining a second delta value representing a difference in intensity values between an upper neighboring pixel of the current pixel and a lower neighboring pixel of the current pixel; and

determining the intensity difference for the current pixel based on the first delta value and the second delta value.

10. The apparatus of claim 8 or 9, wherein increasing the value at the index corresponding to the estimated angle within the array of angle candidates comprises:

calculating a function of the edge intensity of the current pixel along the estimated angle to determine a floating point value for the current pixel; and

increasing a value at an index of the angle candidate based on the floating point value.

11. The apparatus of claim 10, wherein the function calculated for the edge intensity of the current pixel is one of a logarithmic function or an exponential function.

12. The apparatus of claim 8 or 9, wherein increasing the value at the index of the angle candidate within the array for a current pixel comprises:

increasing a value at the index of the angle candidate in the array by a first increment value; and

increasing values at neighboring indices of the index of the angle candidate in the array by a second increment value, the second increment value being less than the first increment value.

13. The apparatus of any of claims 8 to 12, wherein filtering the image block according to the primary filtering direction to remove the one or more oblique edges associated with the image block comprises:

determining an average of values at one or more of the indices of the array; and

selecting a filter type based on a ratio of the maximum value to the average,

wherein the image block is filtered using a filter of the selected filter type.

14. The apparatus of any of claims 8 to 13, wherein the processor is configured to:

generate an angle map used to indicate the primary filtering direction of the image block to a decoder for decoding the encoded image.

15. A method for encoding an image block, the method comprising:

determining respective intensity differences for at least some pixels of the image block based on neighboring pixels of the respective pixels of the at least some pixels;

estimating a respective angle for each of the at least some pixels based on the respective intensity difference;

determining a primary filtering direction of the image block based on the estimated angles;

filtering the image block according to the primary filtering direction to remove one or more oblique edges associated with the image block; and

encoding the filtered image block into an encoded image.

16. The method of claim 15, wherein determining the intensity difference for a current pixel of the at least some pixels based on neighboring pixels of the current pixel comprises:

determining a first delta value representing a difference in intensity between a left neighboring pixel of the current pixel and a right neighboring pixel of the current pixel;

determining a second delta value representing a difference in intensity between an upper neighboring pixel of the current pixel and a lower neighboring pixel of the current pixel; and

determining the intensity difference for the current pixel based on the first delta value and the second delta value.

17. The method of claim 15, wherein an array of angle candidates for encoding the image block has a number of indices based on a number of angle candidates available for encoding the image block, wherein each of the indices corresponds to one of the angle candidates, and wherein the method further comprises:

for a current estimated angle of the estimated angles:

determining, from among the number of angle candidates, a current angle candidate corresponding to the current estimated angle; and

increasing a value at an index of the current angle candidate within the array.

18. The method of claim 17, wherein determining the primary filtering direction of the image block based on the estimated angles comprises:

selecting, as the primary filtering direction, the angle candidate of the number of angle candidates corresponding to the index of the array having a maximum value.

19. The method of claim 17 or 18, further comprising:

initializing a value at the index of the array of angle candidates to zero prior to encoding the image block; and

resetting the value at the index of the array of angle candidates to zero after encoding the image block.

20. The method of any of claims 17 to 19, wherein filtering the image block according to the primary filtering direction to remove the one or more oblique edges associated with the image block comprises:

determining an average of values at one or more of the indices of the array of angle candidates; and

selecting a filter type based on a ratio of the maximum value to the average,

wherein the image block is filtered using a filter of the selected filter type.

21. An apparatus for encoding an image block, comprising:

a memory storing instructions; and

a processor configured to execute the instructions to perform the method of any of claims 15 to 20.

Background

Image content represents a large proportion of online content. A web page may include multiple images, and much of the time and resources spent rendering the web page is dedicated to rendering those images for display. The amount of time and resources required to receive and render an image for display depends in part on how the image is encoded. Thus, by using encoding and decoding techniques that reduce the total data size of an image, the image, and therefore a web page including the image, can be rendered faster.

Disclosure of Invention

Among other things, systems and techniques for image coding using entropy-inspired directional filtering are disclosed.

A method for encoding an image block according to an implementation of the present disclosure includes initializing values at the indices of an angle candidate array to zero. The number of indices of the array is equal to the number of angle candidates available for encoding the image block, and each of the indices corresponds to one of the angle candidates. For each current pixel of a plurality of pixels of the image block: an intensity difference is determined based on a difference in the intensity values of neighboring pixels of the current pixel; an angle of the current pixel is estimated by calculating a function of the intensity difference of the current pixel; and a value is increased at the index corresponding to the estimated angle within the angle candidate array. A primary filtering direction of the image block is determined based on the angle candidate in the array having the largest value. The image block is filtered according to the primary filtering direction to remove one or more sloping edges associated with the image block. The filtered image block is encoded into an encoded image.

An apparatus for encoding an image block according to an implementation of the present disclosure includes a memory and a processor. The memory stores instructions and an angle candidate array having a number of indices based on the number of angle candidates available for encoding the image block. The processor is configured to execute the instructions stored in the memory to initialize values at the indices of the angle candidate array to zero. For each current pixel of a plurality of pixels of the image block, the processor further executes instructions to: determine an intensity difference based on a difference in the intensity values of neighboring pixels of the current pixel; estimate an angle of the current pixel by calculating a function of the intensity difference of the current pixel; and increase a value at the index corresponding to the estimated angle within the angle candidate array. The processor further executes the instructions to determine a primary filtering direction of the image block based on the angle candidate in the array having the largest value, to filter the image block according to the primary filtering direction to remove one or more sloping edges associated with the image block, and to encode the filtered image block into an encoded image.

A method for encoding an image block according to an implementation of the present disclosure includes determining respective intensity differences for at least some pixels of the image block based on neighboring pixels of the respective pixels of the at least some pixels. A respective angle is estimated for each of the at least some pixels based on the respective intensity difference. A primary filtering direction of the image block is determined based on the estimated angles. The image block is filtered according to the primary filtering direction to remove one or more sloping edges associated with the image block. The filtered image block is encoded into an encoded image.

A method for decoding an encoded image block according to an implementation of the present disclosure includes receiving an angle map indicating angle candidates for pixels of the encoded image block. The encoded image block is decoded to produce a decoded image block. Values at the indices of an angle candidate array are initialized to zero. The number of indices of the array is equal to the number of angle candidates available for filtering the decoded image block, and each index corresponds to an angle candidate. For each current pixel of a plurality of pixels of the decoded image block, a value is increased at the index within the angle candidate array corresponding to the current pixel as indicated in the angle map. A primary filtering direction of the decoded image block is determined based on the angle candidate in the array having the largest value. The decoded image block is filtered according to the primary filtering direction to remove one or more sloping edges associated with the decoded image block. The filtered image block is output for display or storage.

An apparatus for decoding an encoded image block according to an implementation of the present disclosure includes a memory and a processor. The memory stores instructions and an angle candidate array having a number of indices based on the number of angle candidates available for filtering the encoded image block. The processor is configured to execute the instructions stored in the memory to receive an angle map indicating angle candidates for pixels of the encoded image block, and to decode the encoded image block to produce a decoded image block. The processor further executes instructions to initialize values at the indices of the angle candidate array to zero. For each current pixel of a plurality of pixels of the decoded image block, the processor further executes instructions to increase a value at the index within the angle candidate array corresponding to the current pixel as indicated in the angle map. The processor further executes the instructions to determine a primary filtering direction of the decoded image block based on the angle candidate in the array having the largest value, to filter the decoded image block according to the primary filtering direction to remove one or more sloping edges associated with the decoded image block, and to output the filtered image block for display or storage.

A method for decoding an encoded image block according to an implementation of the present disclosure includes decoding the encoded image block to produce a decoded image block and receiving an angle map indicating angle candidates for pixels of the encoded image block. A primary filtering direction of the decoded image block is determined based on the angle map. The decoded image block is filtered according to the primary filtering direction to remove one or more sloping edges associated with the decoded image block. The filtered image block is output for display or storage.
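By way of illustration only, and not as part of the original disclosure, the following Python sketch shows one way a decoder might rebuild a histogram of angle candidates from a received angle map and take its maximum as the primary filtering direction. The function name, the use of a NumPy-style array for the angle map, and the candidate count of 64 are assumptions of the example, not requirements of the technique.

    def decoder_primary_direction(angle_map, num_angles=64):
        # angle_map: per-pixel angle candidate indices decoded from the
        # encoded image (e.g., a NumPy array shaped like the block).
        hist = [0] * num_angles             # initialize values at the indices to zero
        for idx in angle_map.flatten():     # one entry per pixel of the block
            hist[int(idx)] += 1             # increase the value at the indicated index
        # The angle candidate with the maximum value gives the primary
        # filtering direction for the decoded block.
        return max(range(num_angles), key=hist.__getitem__)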

Drawings

The disclosure is best understood from the following detailed description when read with the accompanying drawing figures. It is emphasized that, according to common practice, the various features of the drawings are not to scale. On the contrary, the dimensions of the various features are arbitrarily expanded or reduced for clarity.

FIG. 1 is a block diagram of an example of an image coding system.

Fig. 2 is a block diagram of an example of an internal configuration of a computing device that may be used in an image coding system.

Fig. 3 is a block diagram of an example of an image encoder.

Fig. 4 is a block diagram of an example of an image decoder.

Fig. 5 is an illustration of an example of a portion of an image.

FIG. 6 is a flow diagram of an example of a technique for encoding an image block using entropy-inspired directional filtering.

Fig. 7 is a flow diagram of an example of a technique for decoding an encoded image block using entropy-inspired directional filtering.

Detailed Description

Lossy image coding involves reducing the amount of data within the image to be coded, such as using quantization. In exchange for the reduced bit cost of the resulting encoded image, the image suffers a certain quality loss. The degree of quality loss depends largely on the way the image data is quantized during encoding. In particular, quantization of image data may result in discontinuities or artifacts throughout the image. These artifacts may be present along block boundaries, for example due to differences in the coding of respective neighboring blocks within the image. However, in some cases, artifacts may be present on diagonal lines, such as on non-vertical edges of objects shown within the image itself.

These artifacts may be reduced by applying a filter to the coefficients of an image block. For example, a filter may be applied to quantized transform coefficients before those coefficients are entropy encoded into an encoded image during encoding. In another example, a filter may be applied to the coefficients after entropy decoding, dequantization, and inverse transformation of those coefficients during decoding. The filtering may remove artifacts from the image so that the image is rendered closer to its pre-encoded form. However, conventional filtering methods are typically designed to remove artifacts at block boundaries. As such, conventional filtering methods may not be effective at removing certain artifacts, such as artifacts located near the oblique edges of objects within the image.

One solution to filter artifacts at oblique edges in an image uses a directional filter of a given tap size to filter along the edge in question. Based on the tap size used, a number of pixels on each side of the edge are compared and used to calculate a filtered value, which may be, for example, a simple average of the respective pixel values. However, this approach fails to take into account other pixel values within the block, and instead limits its processing to some number of pixels immediately adjacent to the edge. In many cases, this may limit the efficacy of the filtering itself, for example by ignoring other pixel information within the block.

Implementations of the present disclosure address issues such as these using entropy-inspired directional filtering for image coding. During encoding, an intensity difference is determined for at least some pixels of the image block based on neighboring pixels of respective ones of the at least some pixels. An angle is estimated for each of those pixels based on its intensity difference. For example, an array of angle candidates may be used: for each of those pixels, it may be determined that the intensity difference corresponds to one of the angle candidates, and the value at the index of that angle candidate within the array may be increased.

A primary filtering direction of the image block is then determined based on the estimated angles. The image block is filtered according to the primary filtering direction to remove one or more sloping edges associated with the image block. The filtered image block is then encoded into an encoded image. During decoding, an angle map indicating the angles estimated for pixels of an encoded image block (e.g., during encoding) may be received and used to determine the primary filtering direction for the image block. The image block may then be filtered according to the primary filtering direction and output for display or storage.

Further details of these techniques are described herein, initially with reference to systems in which techniques for image coding using entropy-inspired directional filtering can be implemented. Fig. 1 is a diagram of an example of an image coding system 100. Image coding system 100 includes a transmitting station 102, a receiving station 104, and a network 106. The image coding system 100 may be used, for example, to encode and decode images.

The transmitting station 102 is a computing device that encodes and transmits images. Alternatively, transmitting station 102 may include two or more distributed devices for encoding and transmitting images. Receiving station 104 is a computing device that receives and decodes encoded images. Alternatively, receiving station 104 may include two or more distributed devices for receiving and decoding encoded images. An example of a computing device for implementing one or both of transmitting station 102 or receiving station 104 is described below with reference to fig. 2.

Network 106 connects transmitting station 102 and receiving station 104 for encoding, transmitting, receiving, and decoding images. The network 106 may be, for example, the internet. Network 106 may also be a Local Area Network (LAN), Wide Area Network (WAN), Virtual Private Network (VPN), cellular telephone network, or other means of transmitting images from transmitting station 102 to receiving station 104.

The implementation of the coding system 100 may differ from that shown and described with reference to fig. 1, and in some implementations the coding system 100 may omit the network 106. In some implementations, the images may be encoded and then stored for later transmission to the receiving station 104 or another device with memory. In some implementations, the receiving station 104 may receive the encoded image (e.g., via the network 106, a computer bus, and/or some communication path) and store the encoded image for later decoding.

In some implementations, the functionality of transmitting station 102 and receiving station 104 may vary based on the particular operations performed. For example, during operation for encoding an image, transmitting station 102 may be a computing device for uploading the image for encoding to a server, and receiving station 104 may be a server that receives the image from transmitting station 102 and encodes the image for later use (e.g., in rendering a web page). In another example, during operation for decoding an encoded image, transmitting station 102 may be a server that decodes the encoded image, and receiving station 104 may be a computing device that receives the decoded image from transmitting station 102 and renders the decoded image (e.g., as part of a web page).

Fig. 2 is a block diagram of an example of an internal configuration of a computing device 200 that may be used in an image encoding and decoding system (e.g., the image coding system 100 shown in fig. 1). Computing device 200 may, for example, implement one or both of transmitting station 102 or receiving station 104. Computing device 200 may be in the form of a computing system including multiple computing devices, or in the form of one computing device, such as a mobile phone, a tablet computer, a laptop computer, a notebook computer, a desktop computer, and so on.

The processor 202 in the computing device 200 may be a conventional central processing unit. Alternatively, processor 202 may be another type of device or devices, now existing or later developed, that is capable of manipulating or processing information. For example, although the disclosed implementations may be practiced with one processor (e.g., processor 202) as shown, speed and efficiency advantages may be realized by using more than one processor.

In one implementation, the memory 204 in the computing device 200 may be a Read Only Memory (ROM) device or a Random Access Memory (RAM) device. However, other suitable types of memory devices may also be used for memory 204. The memory 204 may include code and data 206 that are accessed by the processor 202 using the bus 212. Memory 204 may also include an operating system 208 and application programs 210, the application programs 210 including at least one program that allows processor 202 to perform the techniques described herein. For example, the application programs 210 may include applications 1 through N, which further include image encoding and/or decoding software that performs some or all of the techniques described herein. Computing device 200 may also include secondary storage 214, which may be, for example, a memory card for use with a mobile computing device. For example, the images may be stored in whole or in part in secondary storage 214 and loaded into memory 204 for processing as needed.

Computing device 200 may also include one or more output devices, such as a display 218. In one example, display 218 may be a touch-sensitive display that combines the display with a touch-sensitive element operable to sense touch inputs. The display 218 may be coupled to the processor 202 via the bus 212. In addition to or in lieu of the display 218, other output devices can be provided that allow a user to program or otherwise utilize the computing device 200. When the output device is or includes a display, the display may be implemented in various ways, including as a Liquid Crystal Display (LCD), a Cathode Ray Tube (CRT) display, or a Light Emitting Diode (LED) display, such as an organic LED (OLED) display.

Computing device 200 may also include or communicate with an image sensing device 220, such as a camera or another image sensing device now existing or later developed that may sense images, such as images of a user operating computing device 200. The image sensing device 220 may be positioned such that it is pointed at a user operating the computing device 200. For example, the position and optical axis of the image sensing device 220 may be configured such that the field of view includes an area directly adjacent to the display 218 from which the display 218 is visible.

Computing device 200 may also include or be in communication with a sound sensing device 222, such as a microphone or another sound sensing device now existing or later developed that may sense sound in the vicinity of computing device 200. The sound sensing device 222 can be positioned such that it is directed toward a user operating the computing device 200 and can be configured to receive sounds, e.g., speech or other utterances, made by the user while operating the computing device 200.

The implementation of computing device 200 may differ from that shown and described with respect to fig. 2. In some implementations, the operations of processor 202 may be distributed across multiple machines (where each machine may have one or more processors), which may be coupled directly or across a local or other network. In some implementations, memory 204 may be distributed across multiple machines, such as a network-based memory or memory in multiple machines that perform operations for computing device 200. In some implementations, the bus 212 of the computing device 200 may be comprised of multiple buses. In some implementations, secondary storage 214 may be directly coupled to other components of computing device 200 or may be accessible via a network and may include an integrated unit, such as a memory card, or multiple units, such as multiple memory cards.

Fig. 3 is a block diagram of an example of an image encoder 300. The image encoder 300 may be, for example, an image encoder implemented at a transmitting station of an image coding system (e.g., the transmitting station 102 of the image coding system 100 shown in fig. 1). The image encoder 300 receives an input image 302 and encodes it to produce an encoded image 304, which may be output to a decoder (e.g., implemented by a receiving station, such as the receiving station 104 shown in fig. 1) or output for storage.

The image encoder 300 includes a filtering stage 306, a transform stage 308, a quantization stage 310, and an entropy coding stage 312. The filtering stage 306 filters the input image 302 to directly or indirectly reduce artifacts resulting from the transformation and/or quantization performed by the transform stage 308 and/or the quantization stage 310, respectively, for example by modifying the pixel values of the input image 302 prior to its transformation and quantization. In particular, the filtering stage 306 may perform filtering that is the inverse of filtering performed later at an image decoder (e.g., the image decoder 400 described below with reference to fig. 4), such as to compensate for blurring caused by the filtering at the image decoder. Implementations and examples for filtering image blocks during encoding are described below with reference to fig. 6.

The transform stage 308 transforms the filtered blocks of the input image 302 into the frequency domain. For example, the transform stage 308 may use a Discrete Cosine Transform (DCT) to transform the filtered blocks of the input image 302 from the spatial domain to the frequency domain. Alternatively, the transform stage 308 may use another Fourier-related transform, a discrete Fourier transform, or another block-based transform to do so.

The quantization stage 310 quantizes the transform coefficients produced as output by the transform stage 308. Using a quantization factor, the quantization stage 310 converts the transform coefficients into discrete quantities, referred to as quantized transform coefficients. For example, the transform coefficients may be divided by the quantization factor and truncated.
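For illustration, a minimal Python sketch of the divide-and-truncate quantization described above, together with the multiply-back dequantization performed at a decoder (see fig. 4); the single scalar quantization factor q and the function names are assumptions of the example.

    import numpy as np

    def quantize(coeffs: np.ndarray, q: float) -> np.ndarray:
        # Divide each transform coefficient by the quantization factor
        # and truncate toward zero to obtain discrete magnitudes.
        return np.trunc(coeffs / q).astype(np.int32)

    def dequantize(qcoeffs: np.ndarray, q: float) -> np.ndarray:
        # Decoder-side inverse: multiply by the same factor. Precision
        # discarded by truncation is not recovered (lossy coding).
        return qcoeffs.astype(np.float64) * q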

The entropy coding stage 312 entropy codes the quantized transform coefficients output from the quantization stage 310 using a lossless coding technique, which may be or include Huffman coding, arithmetic coding, variable length coding, or another coding technique. The encoded image 304 is generated based on the output of the entropy coding stage 312. The encoded image 304 may be stored at a server (e.g., in a database or similar data store) for later retrieval and decoding. For example, the encoded image 304 may be an image hosted on a website or an image provided for display on a web page.

The implementation of the image encoder 300 may differ from that shown and described with respect to fig. 3. In some implementations, the image encoder 300 can be used to encode frames of a video sequence, such as by encoding its input video stream into a bitstream. For example, the input image 302 may be an input frame to be encoded, and the encoded image 304 may be an encoded frame, which may be encoded into a bitstream.

In such implementations, the image encoder 300 may include a prediction stage for predicting motion within a frame. For example, the prediction stage may include functionality for generating a prediction residual for a current block of a frame using inter prediction or intra prediction. In the case of intra prediction, the prediction block may be formed from previously encoded and reconstructed samples in the frame itself. In the case of inter prediction, the prediction block may be formed from samples in one or more previously constructed reference frames. The prediction block may be subtracted from the current block at the prediction stage to generate the prediction residual. The transform stage 308 may then receive and process the prediction residual, e.g., to generate transform coefficients for the current block.

In another example of an implementation in which the image encoder 300 may be used to encode frames of a video sequence, the image encoder 300 may include a reconstruction path to ensure that the image encoder 300 and a decoder (e.g., a decoder implemented at a receiving station of an image coding system, such as the receiving station 104 of the image coding system 100) use the same reference frame to decode a bitstream produced by the image encoder 300 to represent an input video stream.

For example, the reconstruction path may perform functions similar to those occurring during the decoding process (described below with reference to fig. 4), including dequantizing the quantized transform coefficients at a dequantization stage and inverse transforming the dequantized transform coefficients at an inverse transform stage to produce derivative residuals. The prediction block predicted at the prediction stage may then be added to the derivative residual to create a reconstructed block.

In some implementations, the image encoder 300 may omit the filtering stage 306. For example, instead of performing filtering on the input image 302 or image data processed therefrom, the image encoder 300 may estimate an angle for filtering pixels of a block of the input image 302. The estimated angle may be recorded as an angle map of the input image 302. The angle map may be encoded into the encoded image 304, such as into a header or other portion of the encoded image 304. Alternatively, the angle map may be encoded as a separate file from the encoded image 304, but it may be processed during decoding along with the encoded image 304 in order to decode the encoded image 304. In some implementations, the angle map as described above may be generated by the filtering stage 306 or using the output of the filtering stage 306.

Fig. 4 is a block diagram of an example of an image decoder 400. The image decoder 400 may be, for example, an image decoder implemented at a receiving station of an image coding system (e.g., the receiving station 104 of the image coding system 100 shown in fig. 1). The image decoder 400 receives the encoded image 402 (e.g., from storage or memory) and decodes it to produce an output image 404, which can be output for display or storage. The output image 404 is perceptually the same as or similar to an input image encoded using an encoder (e.g., the input image 302 encoded using the image encoder 300 shown in fig. 3). However, where the encoding that produced the encoded image 402 is lossy, the output image 404 may appear substantially the same as the input image without being identical to it.

The image decoder 400 includes an entropy decoding stage 406, a dequantization stage 408, an inverse transform stage 410, and a filtering stage 412. The entropy decoding stage 406 entropy decodes the encoded image data from the encoded image 402 using a lossless coding technique, which may be or include Huffman coding, arithmetic coding, variable length coding, or another coding technique.

The entropy decoding stage 406 entropy decodes the encoded image data to produce quantized transform coefficients. The dequantization stage 408 dequantizes the quantized transform coefficients output from the entropy decoding stage 406, for example, by multiplying the quantized transform coefficients by a quantization factor used to produce the encoded image 402. The inverse transform stage 410 inverse transforms the dequantized transform coefficients, for example, by inverse transforming the dequantized transform coefficients from the frequency domain to the spatial domain.

The filtering stage 412 performs filtering to remove artifacts resulting from the encoding of the encoded image 402. For example, the filtering stage 412 may filter the coefficients for a block of the encoded image 402 output from the inverse transform stage 410 according to a primary filtering direction for that block. Implementations and examples for filtering blocks of an image during decoding are described below with reference to fig. 7.

The implementation of the image decoder 400 may differ from that shown and described with respect to fig. 4. In some implementations, the image decoder 400 may be used to decode encoded frames of a video sequence, such as by decoding the encoded frames into an output video stream from a bitstream. For example, the encoded image 402 may be an encoded frame received from a bitstream, and the output image 404 may be a decoded frame to be output for display or storage, such as within an output video stream.

In such implementations, the image decoder 400 may include a prediction stage for predicting motion within a frame. For example, the prediction stage may include functionality for generating a prediction residual for a current block of the frame based on the output of the inverse transform stage 410 and/or based on the output of the entropy decoding stage using inter-prediction or intra-prediction. For example, using header information decoded from a bitstream and a prediction residual output from an inverse transform, a prediction stage may create the same prediction block as a prediction block created at a prediction stage used to encode a frame.

In the case of intra prediction, the prediction block may be formed from previously decoded and reconstructed samples in the frame itself. In the case of inter prediction, a prediction block may be formed from samples in one or more previously reconstructed reference frames. The prediction block and the prediction residual output from the inverse transform stage 410 may be used to reconstruct the block. The filtering stage 412 may then perform filtering on the reconstructed block. Furthermore, a reconstructed frame generated based on reconstruction of its blocks may be stored as a reference frame for use in reconstructing a frame to be decoded later.

Fig. 5 is an illustration of an example of a portion of an image 500. As shown, the image 500 includes four 64 × 64 blocks 510, in two rows and two columns in a matrix or Cartesian plane. In some implementations, a 64 × 64 block may be the largest coding unit, with N being 64. Each 64 × 64 block may include four 32 × 32 blocks 520. Each 32 × 32 block may include four 16 × 16 blocks 530. Each 16 × 16 block may include four 8 × 8 blocks 540. Each 8 × 8 block 540 may include four 4 × 4 blocks 550. Each 4 × 4 block 550 may include 16 pixels, which may be represented in four rows and four columns in each respective block in the Cartesian plane or matrix.

The pixels may include information representing the image captured in the image 500, such as brightness information, color information, and location information. In some implementations, a block (e.g., a 16 × 16 block of pixels as shown) may include a luma block 560, which may include luma pixels 562, and two chroma blocks 570, 580, such as a U or Cb chroma block 570 and a V or Cr chroma block 580. The chroma blocks 570, 580 may include chroma pixels 590. For example, the luma block 560 may include 16 × 16 luma pixels 562, and each chroma block 570, 580 may include 8 × 8 chroma pixels 590, as shown. Although one arrangement of blocks is shown, any arrangement may be used. Although fig. 5 shows N × N blocks, in some implementations N × M blocks may be used, where N and M are different numbers. For example, 32 × 64 blocks, 64 × 32 blocks, 16 × 32 blocks, 32 × 16 blocks, or blocks of any other size may be used. In some implementations, N × 2N blocks, 2N × N blocks, or a combination thereof may be used.

In some implementations, coding the image 500 may include ordered block-level coding. Ordered block-level coding may include coding the blocks of the image 500 in an order, such as a raster scan order, in which blocks are identified and processed starting with the block in the upper left corner of the image 500, or a portion of the image 500, and proceeding left to right along each row and from the top row to the bottom row, identifying each block in turn for processing. For example, the 64 × 64 block in the top row and left column of the image 500 may be the first block coded, and the 64 × 64 block immediately to its right may be the second block coded. The second row from the top may be the second row coded, such that the 64 × 64 block in the left column of the second row is coded after the 64 × 64 block in the rightmost column of the first row.

In some implementations, coding the blocks of the image 500 may include using quadtree coding, which may include coding smaller block units within a block in raster scan order. For example, the 64 × 64 block shown in the lower left corner of the portion of the image 500 may be coded using quadtree coding, in which the top left 32 × 32 block may be coded, then the top right 32 × 32 block, then the bottom left 32 × 32 block, and then the bottom right 32 × 32 block. Each 32 × 32 block may in turn be coded using quadtree coding, in which the top left 16 × 16 block may be coded, then the top right 16 × 16 block, then the bottom left 16 × 16 block, and then the bottom right 16 × 16 block.

Each 16 × 16 block may be coded using quadtree coding, in which the top left 8 × 8 block may be coded, then the top right 8 × 8 block, then the bottom left 8 × 8 block, and then the bottom right 8 × 8 block. Each 8 × 8 block may be coded using quadtree coding, in which the top left 4 × 4 block may be coded, then the top right 4 × 4 block, then the bottom left 4 × 4 block, and then the bottom right 4 × 4 block. In some implementations, for a 16 × 16 block, the 8 × 8 level may be omitted and the 16 × 16 block may be coded using quadtree coding in which the top left 4 × 4 block is coded first and the other 4 × 4 blocks of the 16 × 16 block are then coded in raster scan order.
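For illustration, the recursive coding order described above can be made concrete with a short sketch; it is illustrative only, and the 4 × 4 leaf size and function name are assumptions of the example.

    def quadtree_order(x, y, size, min_size=4):
        # Yield (x, y, size) for leaf sub-blocks in the order described
        # above: top-left, then top-right, then bottom-left, then
        # bottom-right, recursively at each level.
        if size <= min_size:
            yield (x, y, size)
            return
        half = size // 2
        for dy in (0, half):          # top row of sub-blocks first
            for dx in (0, half):      # left sub-block before right
                yield from quadtree_order(x + dx, y + dy, half, min_size)

For example, list(quadtree_order(0, 0, 64)) enumerates the 256 4 × 4 leaf blocks of a 64 × 64 block in this nested order.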

In some implementations, coding the image 500 may include encoding information included in an original version of the image by, for example, omitting some information from the original version of the image (e.g., an input image, such as the input image 302 shown in fig. 3) from a corresponding encoded image. For example, the coding may include reducing spectral redundancy, reducing spatial redundancy, or a combination thereof.

Reducing spectral redundancy may include using a color model based on a luminance component (Y) and two chrominance components (U and V, or Cb and Cr), which may be referred to as the YUV or YCbCr color model or color space. Using the YUV color model may include using a relatively large amount of information to represent the luminance component of a portion of the image 500 and relatively smaller amounts of information to represent each corresponding chrominance component of that portion. For example, a portion of the image 500 may be represented by a high-resolution luminance component, which may include a 16 × 16 block of pixels, and two lower-resolution chrominance components, each representing the portion of the image as an 8 × 8 block of pixels. A pixel may indicate a value, e.g., in the range from 0 to 255, which may be stored or transmitted using, for example, eight bits. Although the present disclosure is described with reference to the YUV color model, another color model may be used.
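As a worked example of the data reduction described above (illustrative arithmetic only), consider a 16 × 16 region with eight bits per sample:

    luma = 16 * 16               # 256 Y samples at full resolution
    chroma = 2 * 8 * 8           # 64 U + 64 V samples at half resolution
    subsampled = luma + chroma   # 384 samples in total
    full = 3 * 16 * 16           # 768 samples if all three channels were full resolution
    print(subsampled / full)     # 0.5, i.e., half the data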

Reducing spatial redundancy may include transforming a block into the frequency domain using, for example, a discrete cosine transform. For example, a unit of an encoder, such as the transform stage 308 shown in fig. 3, may perform a discrete cosine transform to produce transform coefficient values based on spatial frequency.

Although described herein with reference to a matrix or Cartesian representation of the image 500 for clarity, the image 500 may be stored, transmitted, processed, or any combination thereof, in any data structure such that the pixel values of the image 500 may be efficiently represented. For example, the image 500 may be stored, transmitted, processed, or any combination thereof, in a two-dimensional data structure such as the matrix shown, or in a one-dimensional data structure such as a vector array.

Further, although described herein as illustrating a chroma subsampled image where U and V have half the resolution of Y, image 500 may have a different configuration for its color channels. For example, still referring to the YUV color space, full resolution may be used for all color channels of the image 500. In another example, a color space other than the YUV color space may be used to represent the resolution of the color channels of image 500. Implementations of the present disclosure describe filtering that may be used on images of varying color channel resolutions and/or varying color spaces.

The blocks shown and described with respect to image 500 are square or rectangular. However, the objects displayed within the image 500 may not be square or rectangular. For example, a generally circular object may be included within or otherwise intersect multiple blocks of the image 500. Furthermore, such non-square or non-rectangular objects will have slanted edges that do not align with the boundaries of the blocks of the image 500. Encoding blocks that include a sloped edge may result in block artifacts that occur along the sloped edge. A filtering tool that uses the pixel information of the block to determine the main filtering direction of the block may be used to reduce artifacts along the sloping edges.

Image coding techniques using entropy-inspired directional filtering are now described with reference to fig. 6-7. Fig. 6 is a flow diagram of an example of a technique 600 for encoding an image block using entropy-inspired directional filtering. Fig. 7 is a flow diagram of an example of a technique 700 for decoding an encoded image block using entropy-inspired directional filtering.

One or more of techniques 600 or 700 may be implemented, for example, as a software program executable by a computing device, such as transmitting station 102 or receiving station 104. For example, the software program may include machine-readable instructions that may be stored in a memory (such as the memory 204 or the secondary storage 214) and that, when executed by a processor (such as the processor 202), may cause the computing device to perform one or more of techniques 600 or 700. One or more of techniques 600 or 700 may instead be implemented using dedicated hardware or firmware. As described above, some computing devices may have multiple memories or processors, and the operations described in one or more of techniques 600 or 700 may be distributed using multiple processors, memories, or both.

For simplicity of explanation, the techniques 600 and 700 are each depicted and described as a series of steps or operations. However, steps or operations in accordance with the present disclosure may occur in various orders and/or concurrently. In addition, other steps or operations not presented and described herein may be used. Moreover, not all illustrated steps or operations may be required to implement a technique in accordance with the disclosed subject matter.

Referring initially to fig. 6, a flow diagram of an example of a technique 600 for encoding an image block using entropy-inspired directional filtering is shown. An image block is a block of size M × N, where M and N may be the same number or different numbers. For example, an image block may have a size between 2 × 2 and 32 × 32. At 602, an angle candidate array is initialized. The angle candidate array includes a number of indices equal to the number of angle candidates available for encoding the image block. The angle candidates represent edge angles of varying degrees, and each index of the array corresponds to one of the angle candidates. For example, where 64 angle candidates are considered, the array includes 64 indices, and each of the 64 angle candidates corresponds to a different edge angle that may be represented within the image block. The value at each index of the array is initially set to zero.

At 604, an intensity difference is determined for pixel I, the current pixel of the image block. For example, pixel I may be the first or the last pixel of the image block according to a scan order or other processing order. The intensity difference is the difference between the intensity value of pixel I and the intensity values of one or more other pixels of the image block. For example, the intensity difference of pixel I may be determined based on differences in the intensity values of the neighboring pixels of pixel I. A first delta value may be determined representing the difference in intensity values between the left neighboring pixel of pixel I and the right neighboring pixel of pixel I, and a second delta value may be determined representing the difference in intensity values between the upper neighboring pixel of pixel I and the lower neighboring pixel of pixel I. The first delta value represents the change in pixel intensity along the X-axis at pixel I; the second delta value represents the change in pixel intensity along the Y-axis at pixel I.

The intensity difference for pixel I may be determined based on the first delta value, the second delta value, or both. For example, when the intensity difference of pixel I is determined based on the first delta value, it may be equal to the difference in intensity values between the left and right neighboring pixels of pixel I. In another example, when the intensity difference of pixel I is determined based on the second delta value, it may be equal to the difference in intensity values between the upper and lower neighboring pixels of pixel I. In yet another example, where both the first delta value and the second delta value are used, they may be combined; for example, the intensity difference of pixel I may be determined as the average of the first delta value and the second delta value. The average may be weighted or unweighted.
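A minimal sketch of the delta computation described above, assuming an interior pixel of a grayscale block stored as a NumPy array (boundary handling is not specified by the disclosure and is omitted here):

    import numpy as np

    def intensity_difference(block: np.ndarray, x: int, y: int):
        # First delta: left neighbor minus right neighbor (change along the X-axis).
        dx = float(block[y, x - 1]) - float(block[y, x + 1])
        # Second delta: upper neighbor minus lower neighbor (change along the Y-axis).
        dy = float(block[y - 1, x]) - float(block[y + 1, x])
        return dx, dy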

At 606, the angle of pixel I is estimated. The angle represents the angle of the edge of the object at which pixel I is located. The angle is estimated by calculating a function of the intensity difference of pixel I. The calculated function may be a two-argument arctangent (atan2) function or another suitable function. The estimated angle may correspond to an angle candidate in the angle candidate array.
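For illustration, the deltas can be mapped to one of the angle candidates with atan2. Folding the angle into [0, pi) (since an edge has no sign) and the candidate count of 64 are assumptions of this sketch, not requirements of the disclosure:

    import math

    NUM_ANGLES = 64  # assumed number of angle candidates

    def estimate_angle_index(dx: float, dy: float) -> int:
        # atan2 of the deltas gives an angle in (-pi, pi]; fold it into
        # [0, pi) because an edge at angle t is the same edge at t + pi.
        theta = math.atan2(dy, dx) % math.pi
        # Quantize to the nearest of NUM_ANGLES equally spaced candidates.
        return int(round(theta / math.pi * NUM_ANGLES)) % NUM_ANGLES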

At 608, the value at the index within the angle candidate array corresponding to the estimated angle is increased. Where the estimated angle corresponds to an angle candidate in the array, the index associated with that angle candidate is identified, and the value at that index is then increased. The value at the index may be increased by a floating point value calculated as the output of a function applied to the edge intensity of pixel I along the estimated angle. The function may be a logarithmic function. For example, the logarithmic function may be represented as f = log(x + 1), where x represents the edge intensity of pixel I along the estimated angle and f represents the floating point value. Alternatively, the function may be an exponential function. For example, the exponential function may be represented as f = 1 - exp(-x), where x again represents the edge intensity and f represents the floating point value. In some implementations, the value at the index may instead be increased by one or by another fixed amount.
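A sketch of the weighted increment, using the gradient magnitude of the two deltas as the edge intensity (one plausible reading; the disclosure does not fix how edge intensity is measured):

    import math

    def accumulate(hist, angle_index, dx, dy, use_log=True):
        strength = math.hypot(dx, dy)           # edge intensity at the pixel
        if use_log:
            weight = math.log(strength + 1.0)   # f = log(x + 1)
        else:
            weight = 1.0 - math.exp(-strength)  # f = 1 - exp(-x)
        hist[angle_index] += weight             # increase the value at the index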

At 610, it is determined whether pixel I is the last pixel of the image block, for example, by using a scan order or other processing order to check whether another pixel of the image block remains to be processed. At 612, in response to determining that pixel I is not the last pixel of the image block, the value of I is increased by 1, and the technique 600 returns to 604 to repeat the operations at 604, 606, and 608 for the new pixel I.

At 614, in response to determining that pixel I is the last pixel of the image block, a primary filtering direction of the image block is determined based on the values of the array. The primary filtering direction is determined based on the angle candidate in the array having the maximum value. Thus, determining the primary filtering direction includes identifying the index of the array having the maximum value, after the contributions of all pixels of the image block have been accumulated, and selecting the angle candidate associated with the identified index as the primary filtering direction.
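Tying the operations at 602 through 614 together, a minimal per-block sketch, reusing the helper functions sketched above; the restriction to interior pixels is an assumption of the example. It returns both the winning index and the histogram, since the histogram is reused below when selecting a filter type:

    def primary_filtering_direction(block):
        hist = [0.0] * NUM_ANGLES                # 602: initialize the array to zero
        h, w = block.shape
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                dx, dy = intensity_difference(block, x, y)  # 604
                idx = estimate_angle_index(dx, dy)          # 606
                accumulate(hist, idx, dx, dy)               # 608
        # 614: the angle candidate with the maximum accumulated value.
        return max(range(NUM_ANGLES), key=hist.__getitem__), hist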

At 616, the image block is filtered according to the primary filtering direction. Filtering the image block according to the primary filtering direction includes using a filter of a filter type to change at least one pixel value within the image block along the primary filtering direction. The filter type is one of a directional filter, a semi-directional filter, or a circular filter. The filter type may be selected according to a relationship between the primary filtering direction and an average filtering direction of the image block. For example, an average of the values at one or more indices of the array may be determined. A filter type for filtering along the primary filtering direction may then be selected based on the ratio of the maximum value to the average value. The average filtering direction indicates the angle corresponding to the angle candidate having a value equal to the average of the values. Where no angle candidate has a value equal to the average, the average filtering direction indicates the angle corresponding to the angle candidate having the value closest to the average.

The selection of the filter type is based on a measure of confidence in the main filtering direction determined for the image block. That is, the closer the main filtering direction is to the average filtering direction, the greater the likelihood that the image block includes edges at a number of different angles, and thus the lower the confidence in the main filtering direction. Conversely, the further apart the main filtering direction and the average filtering direction are, the greater the likelihood that the main filtering direction represents the angle of a dominant (or, in some cases, unique) edge of the image block, and thus the higher the confidence in the main filtering direction. A threshold and/or threshold range may be established for indicating which filter type to use based on the relationship between the main filtering direction and the average filtering direction. For example, a first filter type may be used when that relationship indicates high confidence, a second filter type when it indicates medium confidence, and a third filter type when it indicates low confidence. Thus, filtering the image block may include using a filter of the selected filter type. At 618, the filtered image block is encoded into an encoded image.
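A minimal sketch of the threshold-based selection just described follows. The concrete thresholds (4.0 and 2.0) and the peak-to-average ratio as the confidence measure are illustrative assumptions; the text only states that thresholds and/or threshold ranges may be established.

```python
def select_filter_type(histogram):
    peak = max(histogram)
    average = sum(histogram) / len(histogram)
    ratio = peak / average if average > 0 else 0.0
    if ratio >= 4.0:        # peak towers over the average: high confidence
        return "directional"
    if ratio >= 2.0:        # moderate peak: medium confidence
        return "semi-directional"
    return "circular"       # flat histogram: low confidence
```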

In some implementations, for each pixel I, the value at more than one index of the array of angle candidates may be increased. For example, a Gaussian distribution over the angle candidates may be determined based on the estimated angle of pixel I. The index of the array associated with the angle candidate corresponding to the estimated angle of pixel I may be increased by a first increment value (e.g., 50). Each index adjacent to that index may be increased by a second increment value (e.g., 25) that is less than the first. In some such implementations, the next adjacent indices (e.g., each index adjacent to one that received the second increment value) may be increased by progressively smaller increment values according to the Gaussian distribution.
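A sketch of such a spread increment follows. The 50/25 values come from the example above; the third increment value and the wrap-around indexing (treating the candidate angles as circular) are assumptions.

```python
def accumulate_gaussian(histogram, angle_index, increments=(50, 25, 12)):
    # increments[0] goes to the estimated angle's own bin, increments[1] to
    # its immediate neighbors, and so on, falling off Gaussian-style.
    n = len(histogram)
    for offset, inc in enumerate(increments):
        histogram[(angle_index + offset) % n] += inc
        if offset > 0:  # also update the bin on the other side of the peak
            histogram[(angle_index - offset) % n] += inc
```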

In some implementations, the technique 600 includes generating an angle map. For example, the angle map may be an array, matrix, table, or other variable or object in which locations correspond to pixels of the image block. The estimated angle for each pixel of the image block may be recorded at the corresponding position of the angle map. The angle map may be encoded into the encoded image, for example, into a header or other portion of the encoded image. Alternatively, the angle map may be encoded into a file separate from the encoded image, in which case it may be processed along with the encoded image during decoding in order to decode the encoded image.
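Building the angle map might look like the following sketch, which reuses the estimate_angle_index helper from the earlier sketch. Storing one candidate index per pixel, and leaving border pixels (which lack the four neighbors the estimate needs) at a default of 0, are representational assumptions.

```python
def build_angle_map(block):
    rows, cols = len(block), len(block[0])
    angle_map = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            angle_map[r][c] = estimate_angle_index(block, r, c)
    return angle_map
```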

In some implementations, the angle candidate array may not be used. For example, a different structure that can be updated based on the angles estimated for individual pixels or groups of pixels of an image block may be used, such as a matrix, a table, or another variable or object. In another example, no data structure may be used to track the estimated angles. For example, heuristics for direction computation that are not inspired by the entropy of the image block may instead consider various directionalities of the image block.

In some implementations, the angle candidate array may be reset after the image block is encoded into the encoded image. For example, after encoding an image block, the values at the indices of the angle candidate array may be reset to zero.

Referring next to fig. 7, a flow diagram of an example of a technique 700 for decoding an encoded image block using entropy-inspired directional filtering is shown. At 702, an angle map is received. The angle map may be an array, matrix, table, or other variable or object in which locations correspond to pixels of the image block. The estimated angle for each pixel of the image block may be recorded at a corresponding position of the angle map. The angle map may be encoded into the encoded image, for example, into a header or other portion of the encoded image. Alternatively, the angle map may be encoded into a file separate from the encoded image. The angle map indicates angles estimated for pixels of the encoded image.

At 704, an image block is decoded from an encoded image. Decoding an image block from an encoded image may include entropy decoding syntax elements from the encoded image to produce quantized transform coefficients, dequantizing the quantized transform coefficients to produce transform coefficients, and inverse transforming the transform coefficients to produce pixel values for the image block.
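A highly simplified sketch of that three-stage pipeline follows; the callables stand in for codec-specific routines and are assumptions, not a real codec API. In practice these stages are interleaved per coefficient block.

```python
def decode_block(bitstream, entropy_decode, dequantize, inverse_transform):
    quantized = entropy_decode(bitstream)  # syntax elements -> quantized coefficients
    coefficients = dequantize(quantized)   # rescale by the quantizer step
    return inverse_transform(coefficients) # transform coefficients -> pixel values
```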

At 706, an angle candidate array is initialized. The array of angle candidates comprises a number of indices equal to the number of angle candidates available for encoding the image block. The angle candidates represent edge angles of varying degrees. Each of the indices of the array corresponds to one of the angle candidates. For example, in the case where 64 angle candidates are considered, the array includes 64 indices. Each of the 64 angle candidates corresponds to a different edge angle that might be represented within the image block. The value at each index of the array is initially set to zero.

At 708, the value at the index of the array corresponding to the estimated angle of pixel I of the image block is increased. Increasing the value includes determining the estimated angle of pixel I based on the angle map. For example, the location within the angle map corresponding to pixel I may be identified, and the angle indicated at that location identified as the estimated angle of pixel I. The value at the index corresponding to the angle candidate associated with the estimated angle may then be increased.
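The decode-side tally at 706 through 712 can be sketched as follows, assuming the angle map stores one candidate index per pixel as in the earlier encode-side sketch. Unlike the encoder, the decoder does not re-estimate angles from pixel intensities; it tallies the transmitted map directly.

```python
def histogram_from_angle_map(angle_map, num_angles=64):
    histogram = [0] * num_angles
    for row in angle_map:
        for angle_index in row:
            histogram[angle_index % num_angles] += 1
    return histogram
```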

At 710, it is determined whether pixel I is the last pixel in the image block. For example, a scan order or other processing tool may be used to check whether another pixel of the image block is still to be processed. At 712, the value of I is increased by 1 in response to determining that pixel I is not the last pixel in the image block. The technique 700 then returns to 708 to repeat the operation at 708 for the new pixel I.

At 714, in response to determining that pixel I is the last pixel in the image block, a primary filtering direction for the image block is determined based on the values of the array. The main filtering direction is determined based on the angle candidate in the array having the largest value. Thus, determining the main filtering direction comprises identifying the index of the array having the maximum value after the values for all pixels of the image block have been accumulated, and selecting the angle candidate associated with the identified index as the main filtering direction.

At 716, the image block is filtered according to the primary filtering direction. Filtering the image block according to the main filtering direction comprises using a filter of a selected filter type to change at least one pixel value within the image block along the main filtering direction. The filter type is one of a directional filter, a semi-directional filter, or a circular filter. The filter type may be selected according to the relationship between the main filtering direction and the average filtering direction of the image block. For example, an average of the values at the indices of the array may be determined, and a filter type for filtering along the main filtering direction selected based on the ratio of the maximum value to that average value. Thus, filtering the image block may include using a filter of the selected filter type. At 718, the filtered image block is output for display or storage.

In some implementations, for each pixel I, the value at more than one index of the array of angle candidates may be increased. For example, a Gaussian distribution over the angle candidates may be determined based on the estimated angle of pixel I. The index of the array associated with the angle candidate corresponding to the estimated angle of pixel I may be increased by a first increment value (e.g., 50). Each index adjacent to that index may be increased by a second increment value (e.g., 25) that is less than the first. In some such implementations, the next adjacent indices (e.g., each index adjacent to one that received the second increment value) may be increased by progressively smaller increment values according to the Gaussian distribution.

In some implementations, the angle candidate array may not be used. For example, a different structure that can be updated based on the angles estimated for individual pixels or groups of pixels of an image block may be used, such as a matrix, a table, or another variable or object. In another example, no data structure may be used to track the estimated angles. For example, heuristics for direction computation that are not inspired by the entropy of the image block may instead consider various directionalities of the image block. In yet another example, the angle candidate array may simply be omitted. For example, the angle map may be used directly to identify the primary filtering direction of the image block, without processing or other validation using an array or other structure.

In some implementations, the angle candidate array may be reset after the encoded image block is decoded. For example, after decoding an image block, the values at the indices of the angle candidate array may be reset to zero.

The above-described aspects of encoding and decoding illustrate some examples of encoding and decoding techniques and hardware components configured to perform all or a portion of those examples of encoding and/or decoding techniques. It is to be understood, however, that encoding and decoding, as those terms are used in the claims, may mean encoding, decoding, transforming, or another process or change of data.

The word "example" is used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as an "example" is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the word "example" is intended to present concepts in a concrete fashion. As used in this application, the term "or" is intended to mean an inclusive "or" rather than an exclusive "or". That is, unless the context clearly dictates otherwise or is otherwise clear, the expression "X comprises a or B" is intended to mean any of its natural inclusive permutations. That is, if X comprises A; x comprises B; or X includes A and B, then "X includes A or B" is satisfied under any of the foregoing circumstances. In addition, the articles "a" and "an" as used in this application and the appended claims should generally be construed to mean "one or more" unless specified otherwise or the context clearly dictates otherwise. Furthermore, use of the term "implementation" or the term "one implementation" in this disclosure is not intended to require the same implementation unless so described.

All or a portion of the implementations of the disclosure may take the form of a computer program product accessible from, for example, a computer-usable or computer-readable medium. A computer-usable or computer-readable medium may be, for example, any apparatus that can tangibly contain, store, communicate, or transport the program for use by or in connection with any processor. The medium may be, for example, an electronic, magnetic, optical, electromagnetic, or semiconductor device. Other suitable media are also available.

The foregoing implementations, examples, and aspects have been described to facilitate an understanding of the present disclosure, and are not limiting of the present disclosure. On the contrary, the disclosure is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims, which scope is to be accorded the broadest interpretation allowed by law so as to encompass all such modifications and equivalent arrangements.
