Apparatus and method for image coding under selective loop filtering
Abstract: This technology, "Apparatus and method for image coding under selective loop filtering", was created by Zhijie Zhao, Johannes Sauer and Mathias Wien on 2018-03-02. The invention discloses an apparatus, a method and a computer program for image coding under selective loop filtering. In the disclosed concept, a loop filter that would operate on pixels spanning a discontinuous face boundary is not used. This has the advantage of avoiding the artifacts that erroneous loop filter operations would produce. The disclosed concept further provides that loop filter operations are deferred until all pixels across a face boundary are known; the loop filter can then use the correct pixels according to the 3D layout. This operation may be performed at the coding block level or higher. This has the advantage that face boundaries can be loop filtered correctly, e.g. deblocked or adaptive loop filtered, without the need to encode additional pixels for padding.
1. An image coding apparatus (100, 200), characterized in that the image coding apparatus is configured to:
reconstruct pixel values of a coding block in an image of a two-dimensional (2D) representation, the 2D representation being a representation of a spherical video obtained in a projection format, the 2D representation comprising a set of 2D surfaces interconnected by a plurality of boundaries, one or more of the plurality of boundaries being discontinuous in a corresponding three-dimensional (3D) representation of the spherical video in the projection format; the 2D surfaces comprising a first 2D surface and a second 2D surface adjoining at one of the discontinuous boundaries, the first 2D surface comprising the coding block, the coding block adjoining the second 2D surface, the 2D surfaces further comprising a third 2D surface adjoining the first 2D surface at one of the plurality of boundaries that is continuous in the corresponding 3D representation; and
perform loop filtering on the reconstructed pixel values of the coding block according to pixel values of a set of filter reference pixels, wherein the set of filter reference pixels comprises one or more pixels that are part of the third 2D surface.
2. The image coding apparatus (100, 200) of claim 1, wherein the image coding apparatus is further configured to: if one or more pixel values of the pixels in the set of filter reference pixels are not yet suitable for the loop filtering, defer performing the loop filtering on one or more of the reconstructed pixel values of the coding block until the one or more pixel values of the pixels in the set of filter reference pixels are suitable for the loop filtering.
3. The image coding device (100, 200) according to claim 1 or 2, wherein the image coding device is further configured to perform loop filtering on one or more pixels located at an outer image boundary of the 2D representation.
4. The image coding device (100, 200) according to any of claims 1 to 3, further configured to maintain discontinuous boundary pixel information indicating which pixels of the reconstructed pixel values are located at the one or more discontinuous boundaries.
5. The image coding apparatus (100, 200) of claim 4, wherein the image coding apparatus is further configured to maintain reconstruction state information, the reconstruction state information indicating whether a neighboring pixel has been reconstructed, the neighboring pixel being used for loop filtering a pixel indicated by the discontinuous boundary pixel information.
6. The image coding device (100, 200) according to claim 5, wherein the loop filtering comprises a plurality of different loop filtering operations, the image coding device (100, 200) being further configured to maintain at least one of the discontinuous boundary pixel information and the reconstruction state information separately for each of the plurality of different loop filtering operations.
7. The image coding device (100, 200) according to claim 5, wherein the loop filtering comprises a plurality of different loop filtering operations, the image coding device (100, 200) being further configured to maintain at least one of the discontinuous boundary pixel information and the reconstruction state information jointly for the plurality of different loop filtering operations.
8. The image coding device (100, 200) of any of claims 1 to 7, wherein the loop filtering comprises at least one of an in-loop bilateral filtering operation, a deblocking filtering operation, a sample adaptive offset filtering operation, or an adaptive loop filtering operation.
9. The image coding apparatus (100, 200) of any of claims 1 to 8, wherein one or more parameters used for loop filtering the reconstructed pixel values of the coding block are different from corresponding parameters used for loop filtering reconstructed pixel values of one or more other blocks in the image.
10. The image coding device (100, 200) according to any of claims 1 to 9, wherein the projection format comprises a cube format, an icosahedron format, an equirectangular format, or a variant thereof.
11. The image coding device (100) according to any of claims 1 to 10, wherein the image coding device comprises an image encoding device (100).
12. The image coding device (200) according to any of claims 1 to 10, wherein the image coding device comprises an image decoding device (200).
13. A method (700) for coding an image, the method comprising:
reconstructing (702, 703) pixel values of a coding block in an image of a two-dimensional (2D) representation of a spherical video obtained in a projection format, the 2D representation comprising a set of 2D surfaces interconnected by a plurality of boundaries, one or more of the boundaries being discontinuous in a corresponding three-dimensional (3D) representation of the spherical video in the projection format, the 2D surfaces comprising a first 2D surface and a second 2D surface adjoining at one of the discontinuous boundaries, the first 2D surface comprising the coding block, the coding block adjoining the second 2D surface, the 2D surfaces further comprising a third 2D surface adjoining the first 2D surface at a boundary that is continuous in the corresponding 3D representation; and
performing (704, 707) loop filtering on the reconstructed pixel values of the coding block according to pixel values of a set of filter reference pixels, wherein the set of filter reference pixels comprises one or more pixels that are part of the third 2D surface.
14. A computer program comprising program code for performing the method according to claim 13 when the computer program is executed on a computing device.
15. A layout for a cube projection of a two-dimensional (2D) representation of a spherical video, characterized in that the 2D representation comprises a set of 2D cube faces interconnected by boundaries, one or more of the boundaries being discontinuous in a corresponding three-dimensional (3D) representation of the spherical video, wherein only one boundary of each 2D cube face, or the boundary opposite that boundary in the 3D representation, is aligned with a boundary on which loop filtering is to be performed by an image coding apparatus according to any of claims 1 to 12.
Technical Field
The present invention relates to the field of image coding. In particular, the invention relates to improving image encoding and decoding by selective loop filtering.
Background
360-degree video, or spherical video, is a way to experience immersive video using a head-mounted display (HMD) or similar device. This technology can provide an immersive experience for the user by capturing a panoramic view of the world. 360-degree video is typically recorded using a dedicated rig containing multiple cameras, or a dedicated virtual reality (VR) camera containing multiple built-in lenses. The resulting segments are then stitched to form a single video. This process can be done by the camera itself or with video editing software, which analyzes common video content (visuals) to synchronize and concatenate the different video segments (camera feeds) into a spherical panorama around the camera rig. In essence, a camera or camera system maps a 360-degree scene onto a spherical surface.
The stitched image (i.e., the image on the surface of the sphere) is then mapped (or unwrapped) from the spherical representation to a two-dimensional (2D) rectangular representation according to a projection (e.g., an equirectangular projection). The image is then encoded using standard video coding such as H.264/Advanced Video Coding (AVC) or High Efficiency Video Coding (HEVC)/H.265. On the viewing side, after decoding, the video is mapped onto a virtual sphere with the viewer located at its center. The viewer can navigate within the virtual sphere to view the 360-degree world as desired, obtaining an immersive experience.
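As a rough illustration of the unwrapping step, the sketch below maps a viewing direction on the sphere to pixel coordinates in an equirectangular image. The function name and resolution are illustrative assumptions, not part of any codec:

```python
import math

def sphere_to_equirect(yaw, pitch, width, height):
    """Map a viewing direction (yaw in [-pi, pi), pitch in [-pi/2, pi/2])
    to pixel coordinates in a width x height equirectangular image."""
    u = (yaw + math.pi) / (2.0 * math.pi)   # longitude -> horizontal [0, 1)
    v = (math.pi / 2.0 - pitch) / math.pi   # latitude  -> vertical   [0, 1]
    x = min(int(u * width), width - 1)      # clamp the right/bottom edge
    y = min(int(v * height), height - 1)
    return x, y

# Looking straight ahead (yaw = 0, pitch = 0) lands at the image centre.
print(sphere_to_equirect(0.0, 0.0, 3840, 1920))  # (1920, 960)
```

The position-dependent distortion mentioned above shows up here as the stretching of rows near `pitch = ±pi/2`: every x column maps to the same single point on the sphere at the poles.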
As mentioned above, for encoding of 360-degree video, the content needs to be projected to a 2D representation. In addition to equirectangular projections, options include projections onto the faces of cubes, octahedra, and so on. This introduces discontinuities at frame boundaries (borders) and, in some cases, at the boundaries between faces (e.g., the faces of a cube). Therefore, the smoothness of the content across these boundaries is often not maintained during encoding. When the encoded (and decoded) video is presented in a view, artifacts often occur at the re-connected boundary seams. These artifacts can interfere with the viewing experience.
Loop filters (e.g., deblocking filters) are commonly used in video coding to compensate for artifacts caused by hybrid video coding frameworks. However, 360-degree video content poses several problems that affect loop filter performance. For example, in 360-degree video content, not only coding at block boundaries but also unconnected face boundaries can cause artifacts. Furthermore, proper use of a loop filter requires that the boundaries over which it is applied be continuous in a three-dimensional (3D) sense, which is not necessarily the case for 360-degree content. A naively applied loop filter may operate across both continuous and discontinuous boundaries of the 360-degree video content, and using a loop filter across a discontinuous boundary may create artifacts.
The frames or faces to be encoded could be expanded by padding pixels at the boundaries. However, this unnecessarily increases the frame size, resulting in more pixels to encode, with some pixels encoded twice.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
It is an object of the invention to provide improved image encoding and decoding under selective loop filtering. The above and other objects are achieved by the features of the independent claims. Other implementations are apparent from the dependent claims, the description and the drawings.
According to a first aspect, an image coding apparatus is provided. The image coding apparatus is configured to reconstruct pixel values of a coding block in an image of a two-dimensional (2D) representation of a spherical video in a projection format. The 2D representation includes a set of 2D surfaces interconnected by a plurality of boundaries. One or more of the plurality of boundaries are discontinuous in a corresponding three-dimensional (3D) representation of the spherical video in the projection format. The 2D surfaces include a first 2D surface and a second 2D surface that adjoin at one of the discontinuous boundaries. The first 2D surface includes the coding block, the coding block adjoining the second 2D surface. The 2D surfaces further comprise a third 2D surface adjoining the first 2D surface at a boundary that is continuous in the corresponding 3D representation of the spherical video in the projection format. The image coding apparatus is further configured to perform loop filtering on the reconstructed pixel values of the coding block according to pixel values of a set of filter reference pixels, wherein the set of filter reference pixels includes one or more pixels that are part of the third 2D surface. This avoids artifacts.
In an implementation form of the first aspect, the image coding device is further configured to: if one or more pixel values of the pixels in the set of filter reference pixels are not yet suitable for the loop filtering, defer performing the loop filtering on one or more of the reconstructed pixel values of the coding block until those pixel values are suitable for the loop filtering. This avoids artifacts.
In another implementation form of the first aspect, the image coding device is further configured to perform loop filtering on one or more pixel points located at an outer image boundary of the 2D representation.
In yet another implementation form of the first aspect, the image coding device is further configured to maintain discontinuous boundary pixel information, where the discontinuous boundary pixel information indicates which pixels of the reconstructed pixel values are located at the one or more discontinuous boundaries.
In yet another implementation form of the first aspect, the image coding device is further configured to maintain reconstruction state information indicating whether a neighboring pixel used for loop filtering the pixel indicated by the discontinuous boundary pixel information has been reconstructed.
In yet another implementation form of the first aspect, the loop filtering comprises a plurality of different loop filtering operations, and the image coding device is further configured to maintain at least one of the discontinuous boundary pixel information and the reconstruction state information separately for each of the plurality of different loop filtering operations.
In yet another implementation form of the first aspect, the loop filtering comprises a plurality of different loop filtering operations, and the image coding device is further configured to maintain at least one of the discontinuous boundary pixel information and the reconstruction state information jointly for the plurality of different loop filtering operations.
In yet another implementation form of the first aspect, the loop filtering includes at least one of an in-loop bilateral filtering operation, a deblocking filtering operation, a sample adaptive offset filtering operation, or an adaptive loop filtering operation.
In yet another implementation form of the first aspect, one or more parameters used for loop filtering the reconstructed pixel values of the coding block are different from corresponding parameters used for loop filtering reconstructed pixel values of one or more other blocks in the image.
In yet another implementation form of the first aspect, the projection format includes a cube format, an icosahedron format, an equirectangular format, or a variant thereof.
In yet another implementation form of the first aspect, the image coding device comprises an image encoding device.
In yet another implementation form of the first aspect, the image coding device comprises an image decoding device.
In a second aspect, an image coding method is provided. The method comprises: reconstructing (e.g., by an image coding device) pixel values of a coding block in an image of a two-dimensional (2D) representation of a spherical video obtained in a projection format. The 2D representation includes a set of 2D surfaces interconnected by a plurality of boundaries. One or more of the plurality of boundaries are discontinuous in a corresponding three-dimensional (3D) representation of the spherical video in the projection format. The 2D surfaces include a first 2D surface and a second 2D surface that adjoin at one of the discontinuous boundaries. The first 2D surface includes the coding block, the coding block adjoining the second 2D surface. The 2D surfaces further comprise a third 2D surface adjoining the first 2D surface at a boundary that is continuous in the corresponding 3D representation of the spherical video in the projection format. The method further comprises: performing (e.g., by the image coding device) loop filtering on the reconstructed pixel values of the coding block according to pixel values of a set of filter reference pixels, wherein the set of filter reference pixels includes one or more pixels that are part of the third 2D surface.
In yet another implementation form of the second aspect, the method further comprises: if one or more pixel values of the pixels in the set of filter reference pixels are not yet suitable for the loop filtering, the image coding device defers performing the loop filtering on one or more reconstructed pixel values of the coding block until those pixel values are suitable for the loop filtering.
In yet another implementation form of the second aspect, the method further comprises: the image coding device performs loop filtering on one or more pixels located at an outer image boundary of the 2D representation.
In yet another implementation form of the second aspect, the method further comprises: the image coding device maintains discontinuous boundary pixel information indicating which pixels are located at the one or more discontinuous boundaries.
In yet another implementation form of the second aspect, the method further comprises: the image coding device maintains reconstruction state information indicating whether a neighboring pixel used for loop filtering the pixel indicated by the discontinuous boundary pixel information has already been reconstructed.
In a further implementation form of the second aspect, the loop filtering comprises a plurality of different loop filtering operations, the method further comprising: the image coding device maintains at least one of the discontinuous boundary pixel information and the reconstruction state information separately for each of the plurality of different loop filtering operations.
In a further implementation form of the second aspect, the loop filtering comprises a plurality of different loop filtering operations, the method further comprising: the image coding device maintains at least one of the discontinuous boundary pixel information and the reconstruction state information jointly for the plurality of different loop filtering operations.
In yet another implementation form of the second aspect, the loop filtering includes at least one of an in-loop bilateral filtering operation, a deblocking filtering operation, a sample adaptive offset filtering operation, or an adaptive loop filtering operation.
In yet another implementation form of the second aspect, one or more parameters used for loop filtering the reconstructed pixel value of the encoded block are different from corresponding parameters used for loop filtering reconstructed pixel values of one or more other blocks in the image.
In yet another implementation form of the second aspect, the projection format includes a cube format, an icosahedron format, an equirectangular format, or a variant thereof.
In yet another implementation form of the second aspect, the image coding device comprises an image encoding device.
In yet another implementation form of the second aspect, the image coding device comprises an image decoding device.
In a third aspect, a computer program is provided. The computer program comprises program code for performing the method according to the second aspect when the computer program is executed on a computing device.
According to a fourth aspect, a layout of a cube projection for a two-dimensional (2D) representation of a spherical video is provided. The 2D representation includes a set of 2D cube faces interconnected by boundaries. One or more of the boundaries are discontinuous in a corresponding three-dimensional (3D) representation of the spherical video. Only one boundary of each 2D cube face, or the boundary opposite that boundary in the 3D representation, is aligned with a boundary on which the image coding device according to the first aspect is to perform loop filtering.
Many of the attendant advantages will become more readily appreciated as the same becomes better understood by reference to the following detailed description considered in connection with the accompanying drawings.
Drawings
Exemplary embodiments will be described in more detail below with reference to the attached drawings, in which:
FIG. 1 is a block diagram of an exemplary embodiment of a video encoding device;
FIG. 2 is a block diagram of one exemplary embodiment of a video decoding apparatus;
fig. 3A is another block diagram of another exemplary embodiment of a video encoding device;
fig. 3B is another block diagram of another exemplary embodiment of a video decoding apparatus;
FIG. 4A is a schematic diagram of one example of a cube projection format;
FIG. 4B is another schematic diagram of one example of a cube projection format;
FIG. 5 is a schematic diagram of an example of a 2D representation of spherical video from a cube projection format;
FIG. 6A is a schematic diagram of another example of a 2D representation of spherical video from a cube projection format;
FIG. 6B is a schematic diagram of yet another example of a 2D representation of spherical video from a cube projection format;
FIG. 7 is a flow diagram of an exemplary method of image coding and decoding under selective loop filtering;
fig. 8 shows examples of respective boundary directions of a sample adaptive offset filter in image coding under selective loop filtering.
In the following, the same reference signs refer to the same features or at least functionally equivalent features.
Detailed Description
Reference is now made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific aspects in which the invention may be practiced. It is to be understood that other aspects may be utilized and structural or logical changes may be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.
For example, it is to be understood that the disclosure relating to describing a method is equally applicable to the corresponding apparatus or system for performing the method, and vice versa. For example, if a specific method step is described, the corresponding apparatus may comprise means for performing the described method step, even if such means are not explicitly described or illustrated in the figures. On the other hand, for example, if a specific apparatus is described on the basis of functional units, the corresponding method may comprise steps for performing the described functions, even if such steps are not explicitly described or illustrated in the figures. Furthermore, it is to be understood that features of the various exemplary aspects described herein may be combined with each other, unless specifically noted otherwise.
Video coding refers to the digital compression or decompression of a sequence of images that make up a video or video sequence. In the field of video coding, the terms "picture/image" and "frame" may be used as synonyms. Each image is typically partitioned into a set of non-overlapping blocks, and coding is typically performed at the block level: inter prediction or intra prediction is used to generate a prediction block; the prediction block is subtracted from the current block (the currently processed block or block to be processed) to produce a residual block; the residual block is transformed and quantized to reduce the amount of data to be transmitted (compression); and at the decoder side, the encoded/compressed block is inverse processed to reconstruct the current block for representation.
The disclosed concept provides that a loop filter which operates on pixels spanning a discontinuous boundary is not used, or is disabled. This has the advantage of avoiding artifacts resulting from loop filter operations across discontinuous boundaries. The disclosed concept also provides that the loop filter operation is deferred until all pixels that cross the face boundary are reconstructed. The loop filter can then use the correct pixels according to the 3D layout. This operation may be performed at the coding block level or higher. This has the advantage that face boundaries can be loop filtered correctly, e.g. deblocked or adaptive loop filtered, without the need to encode additional pixels for padding.
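The deferral idea can be sketched as bookkeeping over reconstructed blocks. The class and the neighbour mapping below are hypothetical illustrations rather than an actual codec API; the assumption is that each block knows which neighbouring blocks, continuous with it in the 3D layout, its loop filter reads:

```python
# Hypothetical sketch: defer loop filtering of a block until every
# reference-pixel region it needs (per the 3D face layout) is reconstructed.
class DeferredLoopFilter:
    def __init__(self, neighbours_3d):
        # neighbours_3d maps a block id to the ids of blocks whose pixels the
        # filter reads across boundaries that are continuous in 3D.
        self.neighbours_3d = neighbours_3d
        self.reconstructed = set()
        self.pending = []
        self.filtered = []

    def on_block_reconstructed(self, block_id):
        self.reconstructed.add(block_id)
        self.pending.append(block_id)
        # Retry all deferred blocks; filter those whose references are ready.
        still_pending = []
        for b in self.pending:
            if self.neighbours_3d.get(b, set()) <= self.reconstructed:
                self.filtered.append(b)   # loop filtering would run here
            else:
                still_pending.append(b)   # defer: references not yet known
        self.pending = still_pending

# Block 0 needs pixels from block 1 (continuous in 3D); block 1 needs none.
f = DeferredLoopFilter({0: {1}, 1: set()})
f.on_block_reconstructed(0)   # deferred: block 1 not reconstructed yet
f.on_block_reconstructed(1)   # both blocks can now be filtered
print(f.filtered)  # [0, 1]
```

Because filtering is only postponed, never fed wrong pixels, no padding pixels ever need to be encoded.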
As described above, a scene is captured in all directions from a single viewpoint to acquire a 360-degree video sequence. This can be achieved by using a plurality of cameras with different viewing directions arranged close to each other. The captured content is then stitched together and can be considered as a sphere around the camera rig, with textures representing the 360-degree scene. However, current displays and video codecs require a flat (2D) rectangular image of the scene, so the sphere must be converted to some 2D format. There are several ways this can be done, including but not limited to:
(a) The equirectangular format projects the sphere to a rectangle, much as a rectangular world map is created from a globe of the Earth. Here, the distortion depends on the position.
(b) The cube format maps the sphere onto the six faces of a cube. Each face looks like a normal 2D image without any visible geometric distortion. However, there is geometric distortion at the boundary between two faces.
(c) The icosahedron format maps the sphere onto the faces of an icosahedron. Geometric distortion exists at the boundary between two faces, but it is less severe than in the cube format because the angles between adjacent faces are smaller.
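For the cube format, selecting which face a 3D viewing direction lands on reduces to picking the dominant coordinate axis. A minimal sketch (the face labels are illustrative, not a standardized numbering):

```python
def cube_face(x, y, z):
    """Return which cube face a 3D direction hits: one of
    '+x', '-x', '+y', '-y', '+z', '-z' (the dominant axis)."""
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return '+x' if x > 0 else '-x'
    if ay >= ax and ay >= az:
        return '+y' if y > 0 else '-y'
    return '+z' if z > 0 else '-z'

print(cube_face(1.0, 0.2, -0.3))   # '+x'
print(cube_face(0.1, -0.9, 0.2))   # '-y'
```

The per-face distortion noted above comes from the subsequent step (not shown): within the selected face, the remaining two coordinates are divided by the dominant one, which stretches content toward the face edges.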
In discussing the disclosed concepts, the cube format is used as an example. Fig. 4A and 4B illustrate an example 400 of a cube projection format. Fig. 5, 6A and 6B show examples 500, 600 and 650 of 2D representations of spherical video from the cube projection format.
When 360-degree video content is mapped to a 2D representation, discontinuities are introduced in the video content that do not exist on the sphere. Fig. 5 shows an example of a non-compact cube layout or 2D representation that preserves the connectivity of the faces of the cube as much as possible. In other words, the cube is unfolded so that all faces are still connected in 2D, just as in 3D. However, there are two unused areas in this format: one region in the upper right corner (enclosed by 3, 3′, 1′ and 2′) and another region in the lower left corner (enclosed by 10, 10′, 11, 12). Therefore, a compact cube format may be more suitable; examples are shown in fig. 6A and 6B.
Fig. 6A and 6B illustrate the different boundary types in the compact cube format and the way these boundaries are processed during loop filtering. The continuous boundaries (dashed lines) remain continuous in the corresponding 3D representation and can be loop filtered in the usual way.
In the examples of fig. 6A and 6B, at least some of the face boundaries (dotted lines) are discontinuous in the corresponding 3D representation.
Accordingly, in the example of fig. 6A, loop filtering should not be applied across these discontinuous boundaries.
The use of a loop filter across the discontinuity boundary may create artifacts. However, for optimal results of the encoder, all face boundaries should be loop filtered.
The disclosed concepts may have at least some of the following feature sets:
and if the loop filter operates the pixel points crossing the discontinuous surface boundary, the coding block does not use the loop filter. The loop filter may be delayed until all such connected neighboring blocks of the coding block are available or reconstructed. These neighboring blocks are needed for proper use of a particular loop filter.
Since all pixels are available once reconstruction of the decoded frame is completed and before loop filtering, all boundary pixels can then be loop filtered, for example by deblocking filtering or adaptive loop filtering.
Operations according to the disclosed concept may be performed in units of coding blocks, with the option of releasing a fully processed block from memory if it is not needed for other purposes.
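The skip rule for discontinuous boundaries can be illustrated as a simple membership check. The face numbering and the edge bookkeeping below are assumptions for illustration only, not taken from any particular layout:

```python
# Minimal sketch of the skip rule: a loop filter that would read pixels on
# both sides of a face boundary is applied only if that boundary is
# continuous in the 3D layout.
def may_filter_edge(edge, continuous_edges):
    """edge: (face_a, face_b) pair that the filter window straddles."""
    return edge in continuous_edges or tuple(reversed(edge)) in continuous_edges

# Suppose faces 1-2 and 2-3 meet at continuous boundaries while faces 2-5
# meet at a discontinuous one (assumed numbering).
continuous = {(1, 2), (2, 3)}
print(may_filter_edge((2, 1), continuous))  # True  -> filter normally
print(may_filter_edge((2, 5), continuous))  # False -> skip to avoid artifacts
```

A real implementation would derive the continuous-edge set once per layout (e.g., per fig. 6A or 6B) and consult it for every filter window that touches a face boundary.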
The disclosed concepts, when applied to the Joint Exploration Model (JEM) reference software for HEVC, may affect pixels at boundaries where deblocking filters (DBFs) are used, where the boundaries include coding tree block (CTB) boundaries, coding block (CB) boundaries, prediction block (PB) boundaries, and transform block (TB) boundaries co-located with the face boundaries. In addition, the disclosed concepts may affect sample adaptive offset (SAO) operations on boundary pixels. Moreover, when an adaptive loop filter (ALF) or bilateral filter is used across a face boundary, the disclosed concepts may affect the operation of that filter.
Exemplary embodiments of the disclosed concepts are described below with reference to the figures.
Fig. 1 shows an exemplary embodiment of a video encoding device 100.
The input 102 is for receiving an image block 101 of an image, e.g. a still image or an image of a sequence of images forming a video or a video sequence. The image block may also be referred to as a current image block or an image block to be encoded, and the image may also be referred to as a current image or an image to be encoded.
The residual calculation unit 104 is configured to calculate a residual block 105 from the image block 101 and the prediction block 165 (the prediction block 165 is described in detail below), for example by subtracting the pixel values of the prediction block 165 from the pixel values of the image block 101 pixel by pixel, to obtain the residual block 105 in the pixel domain.
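As a toy illustration of this pixel-wise subtraction (the 2×2 blocks and their values are chosen arbitrarily):

```python
# Pixel-wise residual as in the residual calculation unit: the prediction
# block is subtracted from the current block, element by element.
current    = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]

residual = [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]
print(residual)  # [[2, 5], [1, -1]]
```

The better the prediction, the closer the residual values sit to zero, which is what makes the subsequent transform and quantization effective.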
The transform unit 106 is configured to perform a Discrete Cosine Transform (DCT) or a Discrete Sine Transform (DST) on the residual block 105 to obtain transform coefficients 107 in a transform domain. The transform coefficients 107, which may also be referred to as transform residual coefficients, represent the residual block 105 in the transform domain.
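A textbook one-dimensional DCT-II (applied separably in 2D by real codecs; this sketch is not codec-exact and uses the orthonormal scaling) illustrates the energy compaction the transform unit relies on:

```python
import math

def dct_ii(x):
    """Orthonormal 1D DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(s * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                           for i in range(n)))
    return out

# A flat (DC-only) row compacts all of its energy into the first coefficient.
coeffs = dct_ii([10, 10, 10, 10])
print(round(coeffs[0], 6))  # 20.0
```

For typical residual blocks, most of the energy ends up in a few low-frequency coefficients, so many transform coefficients quantize to zero.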
The quantization unit 108 is configured to quantize the transform coefficient 107 by performing scalar quantization, vector quantization, or the like to obtain a quantized transform coefficient 109. The quantized coefficients 109 may also be referred to as quantized residual coefficients 109.
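Scalar quantization can be illustrated by a uniform quantizer round trip. Real codecs derive the step size from a quantization parameter (QP), so the fixed step here is an assumption for illustration:

```python
# Uniform scalar quantization round trip, a simplification of the
# quantization unit: coefficients are divided by a step and rounded.
def quantize(coeffs, step):
    return [int(round(c / step)) for c in coeffs]

def dequantize(levels, step):
    return [level * step for level in levels]

levels = quantize([20.0, 3.2, -1.4, 0.3], step=2.0)
print(levels)                   # [10, 2, -1, 0]
print(dequantize(levels, 2.0))  # [20.0, 4.0, -2.0, 0.0]
```

The rounding is where the (lossy) compression happens: small coefficients collapse to zero, and the dequantized values differ from the originals by up to half a step.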
The inverse quantization unit 110 is configured to perform inverse quantization on the quantized coefficients 109 to obtain dequantized coefficients 111.
The inverse transform unit 112 is configured to perform an inverse transform (e.g., an inverse DCT or inverse DST) on the dequantized coefficients 111 to obtain an inverse transform block 113 in the pixel domain.
The reconstruction unit 114 is configured to combine the inverse transform block 113 and the prediction block 165, e.g., by adding their pixel values pixel by pixel, to obtain a reconstructed block 115 in the pixel domain.
A buffer unit 116 (or simply "buffer" 116), e.g., a line buffer 116, is used to buffer or store reconstructed blocks for intra estimation and/or intra prediction, etc.
Loop filter unit 120 (or simply "loop filter" 120) is used to filter the reconstructed block 115 using a bilateral filter, a deblocking filter, a sample adaptive offset (SAO) filter, an adaptive loop filter, or other filters, to obtain a filtered block 121. The filtered block 121 may also be referred to as filtered reconstructed block 121. Various loop filters are described in more detail in Joint Video Exploration Team (JVET) standards-related documents. For example, JVET-G1001 provides an algorithm description of Joint Exploration Test Model 7 (JEM 7).
The
The
In other words, since frames may be processed as independent processing blocks or coding blocks in some parts of the encoding and/or decoding flow, visible artifacts such as discontinuities may be introduced in the reconstructed frames. Due to these artifacts, the boundaries of the coding blocks can be seen in the reconstructed frame. Loop filtering may be used to mitigate or remove these artifacts, as well as other artifacts created in the encoding/decoding process, from the reconstructed frame. For example, a deblocking filter may apply adaptive smoothing across the boundaries of processing blocks (e.g., prediction blocks and transform blocks). SAO filtering may be performed after deblocking filtering. SAO can be used in a so-called edge offset mode for filtering locally oriented structures in reconstructed frames, and also in a so-called band offset mode for modifying pixel values according to pixel intensities.
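As a toy illustration of smoothing across a block boundary (deliberately much simpler than the actual HEVC/JEM deblocking filter, which adapts its strength to local activity and coding parameters):

```python
import numpy as np

def weak_deblock_vertical(pixels, boundary_col):
    """Toy deblocking: move the two pixel columns adjacent to a vertical
    block boundary (p0 | q0) toward each other, as a stand-in for the
    adaptive smoothing described above. Not the actual HEVC filter."""
    out = pixels.astype(float).copy()
    p0 = out[:, boundary_col - 1]
    q0 = out[:, boundary_col]
    delta = (q0 - p0) / 4.0                 # a quarter of the boundary step
    out[:, boundary_col - 1] = p0 + delta
    out[:, boundary_col] = q0 - delta
    return out

block = np.array([[10, 10, 30, 30],
                  [10, 10, 30, 30]], dtype=np.uint8)
filtered = weak_deblock_vertical(block, boundary_col=2)
# the step of 20 across the boundary is reduced: 10, 15, 25, 30
```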
The decoded
The inter estimation unit 142, also referred to as the inter picture estimation unit, is configured to receive the image block 101 (the current image block of the current image) and one or more previously reconstructed blocks (e.g., reconstructed blocks of one or more other, previously decoded images 231) for inter estimation (or "inter picture estimation"). For example, a video sequence may comprise the current picture and the previously decoded pictures 231; in other words, the current picture and the previously decoded pictures 231 may be part of or form the sequence of pictures forming the video sequence.
For example, the
The
The intra estimation unit 152 is configured to receive the image block 101 (current image block) and one or more previous reconstructed blocks (e.g., reconstructed neighboring blocks) of the same image for intra estimation. For example, the
The
The
The mode selection unit 160 may be used to perform inter estimation/prediction and intra estimation/prediction, or control inter estimation/prediction and intra estimation/prediction, and select a reference block and/or prediction mode (intra or inter prediction mode) for use as the prediction block 165 to calculate the above-described residual block 105 and reconstruct the above-described reconstructed block 115.
The mode selection unit 160 may be used to select the prediction mode so as to provide a minimum residual (minimum residual means better compression) or a minimum signaling overhead or both. The mode selection unit 160 may be configured to determine a prediction mode according to Rate Distortion Optimization (RDO).
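Mode selection by RDO can be sketched as minimizing the Lagrangian cost J = D + λ·R over the candidate modes (the candidate numbers below are invented for the example):

```python
def select_mode(candidates, lmbda):
    """Rate-distortion optimization sketch: choose the prediction mode
    minimizing J = D + lambda * R, where D is the distortion and R the
    signaling rate. A small residual and a small overhead both lower J."""
    return min(candidates, key=lambda c: c["distortion"] + lmbda * c["rate"])

candidates = [
    {"mode": "intra_dc", "distortion": 120.0, "rate": 10.0},
    {"mode": "inter",    "distortion": 40.0,  "rate": 35.0},
    {"mode": "skip",     "distortion": 200.0, "rate": 1.0},
]
best = select_mode(candidates, lmbda=2.0)
# J = 140, 110, 202 respectively, so the inter mode is selected
```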
Entropy encoding unit 170 is configured to apply an entropy encoding algorithm to the quantized residual coefficients 109, the inter prediction parameters 143, the intra prediction parameters 153, and/or the loop filter parameters, individually or jointly (or not at all), to obtain encoded image data 171. For example, the output 172 may output the encoded image data 171 in the form of an encoded code stream 171.
For example,
The
A "pixel point" is a small segment of an image having a pixel value associated with it. The pixel value is a measure of the intensity and/or color of the pixel point. The image may be represented as a 2D array of pixel points. In the art, a pixel point is sometimes referred to as a pixel, although the term "pixel" can also refer to a small element of an image rendering device such as a liquid crystal display (LCD). A pixel value may be an intensity value of a single color, such as blue, green, or red, or a multi-dimensional intensity, i.e. a tuple of intensity values, such as blue, green, and red intensity values.
The
To divide the image into blocks, the
Fig. 2 shows an exemplary video decoder 200 for receiving encoded image data (code stream) 171 encoded by, for example, the encoder 100 of fig. 1.
Decoder 200 includes an input 202, an
Thus, fig. 1 and 2 show examples of the image decoding apparatus. The image decoding apparatus may be an image encoding apparatus (e.g., the
The
The 2D representation includes a set of 2D surfaces interconnected by a plurality of boundaries. One or more of the boundaries are discontinuous in the corresponding 3D representation of the spherical video in projection format. Furthermore, one or more boundaries are continuous in the corresponding 3D representation of the spherical video in projection format. Continuous and discontinuous boundaries are described above in connection with fig. 6A and 6B.
The 2D surfaces include a first 2D surface and a second 2D surface that are contiguous at one of the discontinuous boundaries. The first 2D surface includes the coding block, the coding block adjoining the second 2D surface. The 2D surfaces further include a third 2D surface adjoining the first 2D surface at a continuous boundary.
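How the continuity of a packed face boundary could be classified can be sketched as follows, assuming a hypothetical 3x2 cube-map packing (real layouts also depend on per-face rotations, which this sketch ignores):

```python
# Four 3D neighbors of each cube face (faces sharing an edge on the sphere).
CUBE_NEIGHBORS = {
    "front":  {"left", "right", "top", "bottom"},
    "back":   {"left", "right", "top", "bottom"},
    "left":   {"front", "back", "top", "bottom"},
    "right":  {"front", "back", "top", "bottom"},
    "top":    {"front", "back", "left", "right"},
    "bottom": {"front", "back", "left", "right"},
}

def boundary_is_discontinuous(face_a, face_b):
    """A boundary between two faces packed next to each other in the 2D
    frame is discontinuous if the faces do not share an edge in the 3D
    representation (opposite cube faces, for instance, never do)."""
    return face_b not in CUBE_NEIGHBORS[face_a]

# In a 3x2 packing with "front" stacked above "back", their shared 2D
# boundary is discontinuous; "left" next to "front" shares a real 3D edge,
# so that boundary is continuous in the corresponding 3D representation.
```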
The
In one embodiment, pixel points (e.g., filter reference pixel points) may be obtained from connected or adjacent surfaces on the sphere of the current coding block, for example, as follows:
option 1: directly copying pixel points from the third 2D surface, and if the pixel points are available or reconstructed, executing loop filtering by using the pixel points; or
Option 2: and projecting pixel points of the third 2D surface (if available or reconstructed) onto the 3D spherical surface by using the geometric information, mapping the projection pixel points in the 3D spherical surface to a projection format according to the geometric information and the selected interpolation filter, and then using the mapping pixel points in the projection format as filter reference pixel points for executing loop filtering.
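Option 1 can be sketched as follows; the data structures (a frame array, a coordinate list on the third 2D surface, and a per-pixel reconstructed mask) are illustrative:

```python
import numpy as np

def copy_reference_pixels(frame, coords, reconstructed):
    """Option 1 sketch: fetch filter reference pixels directly from the
    geometrically neighboring (third) 2D surface. `coords` lists the
    (row, col) positions on that surface; `reconstructed` marks which
    pixels have been decoded so far. Returns None when any pixel is not
    yet available, so the loop filter operation can be deferred."""
    if not all(reconstructed[r, c] for r, c in coords):
        return None
    return np.array([frame[r, c] for r, c in coords])

frame = np.arange(16).reshape(4, 4)        # toy reconstructed frame
reconstructed = np.zeros((4, 4), dtype=bool)
coords = [(0, 3), (1, 3)]                  # reference pixels on the third face
early = copy_reference_pixels(frame, coords, reconstructed)  # None: defer
reconstructed[:, 3] = True                 # neighbor column now reconstructed
refs = copy_reference_pixels(frame, coords, reconstructed)   # [3, 7]
```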
The
The
The
Maintaining the reconstruction state information may be accomplished, for example, in one of two ways:
1. In real time: the face arrangement information is used together with the processing order of the blocks (i.e., using slice/block coding). From these it can be deduced whether the connected neighboring blocks have already been decoded and are available; or
2. Via stored flags: a flag is stored for each block at a face boundary. This can be done at the coding block level. After a block is reconstructed, its flag is set to true. Loop filtering may be performed once the flags of all connected neighboring blocks are set to true.
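The flag-based alternative can be sketched with per-block flags and a neighbor map derived from the face arrangement (all names are illustrative):

```python
class DeferredLoopFilter:
    """Sketch of flag-based state tracking: keep a reconstructed flag per
    coding block and release a block for loop filtering only once all of
    its spherically connected neighbor blocks are reconstructed as well.
    The neighbor map is supplied from the face arrangement."""

    def __init__(self, neighbors):
        self.neighbors = neighbors        # block id -> list of connected ids
        self.reconstructed = set()        # the stored "true" flags
        self.filtered = set()

    def on_block_reconstructed(self, block_id):
        """Set the block's flag and return every block that just became
        ready for loop filtering (sorted for determinism)."""
        self.reconstructed.add(block_id)
        ready = sorted(
            b for b in self.reconstructed - self.filtered
            if all(n in self.reconstructed for n in self.neighbors[b])
        )
        self.filtered.update(ready)
        return ready

# Two blocks facing each other across a discontinuous face boundary:
dlf = DeferredLoopFilter({"A": ["B"], "B": ["A"]})
first = dlf.on_block_reconstructed("A")    # [] - B is not decoded yet
second = dlf.on_block_reconstructed("B")   # both blocks become ready
```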
If the loop filtering includes a plurality of different loop filtering operations, the
In one embodiment, the one or more parameters used for in-loop filtering the reconstructed pixel value of the encoded block are different from the corresponding parameters used for in-loop filtering reconstructed pixel values of one or more other blocks in the image. For example, boundary strength (bs) parameter derivation may be modified. For example, bs may be set to 2 for blocks at face boundaries, assuming that the deblocking required at face boundaries is as strong as the deblocking of intra blocks. This also ensures that chroma deblocking can be applied.
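The modified bs derivation can be sketched as follows; the non-face-boundary branches are a heavily simplified stand-in for the actual HEVC derivation:

```python
def boundary_strength(is_face_boundary, p_is_intra, q_is_intra,
                      has_nonzero_coeffs):
    """Sketch of the modified boundary strength (bs) derivation: a face
    boundary is treated like an intra block boundary (bs = 2), so that
    strong deblocking, including chroma deblocking, is applied there.
    The remaining branches are simplified placeholders."""
    if is_face_boundary or p_is_intra or q_is_intra:
        return 2
    return 1 if has_nonzero_coeffs else 0

bs = boundary_strength(is_face_boundary=True, p_is_intra=False,
                       q_is_intra=False, has_nonzero_coeffs=False)
# bs == 2: the face boundary is deblocked as strongly as an intra boundary
```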
Fig. 3A shows another example of the
Fig. 3B shows another example of the image decoding apparatus 200 in fig. 2. The image decoding apparatus 200 may include a processor 280, a memory 285, and/or an input/output interface 290. The processor 280 may be used to perform the functions of one or more of
Fig. 7 shows a flow diagram of an
The
At
If the current coding block does not include a discontinuity boundary, the method proceeds to
If the current coding block includes a discontinuity boundary, the method proceeds to
In other words, when the existing loop filter is used, the pixels in the above-mentioned face boundary pixel list skip the loop filter operation. The loop filter operation is performed after the reconstruction of the associated connected neighboring blocks is completed. A related block is a block that includes pixels in the face boundary pixel list. Once pixels that are unconnected in the 2D representation but connected in the 3D representation have been reconstructed and are available, such loop filter operations can access and modify these pixels. If there are pixels in the face boundary pixel list that have already been used for loop filtering and are no longer needed for other purposes, these pixels may be released from memory.
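The bookkeeping around the face boundary pixel list (skip, defer, filter, release) can be sketched like this, with illustrative structures only:

```python
class FaceBoundaryPixelList:
    """Sketch of the face boundary pixel list: pixels on a discontinuous
    boundary skip the in-order loop filter, are filtered once every block
    containing one of their spherical neighbors is reconstructed, and are
    then released from memory."""

    def __init__(self):
        self.pending = {}                 # pixel position -> blocks awaited

    def defer(self, pos, needed_blocks):
        """Record a skipped pixel and the blocks it still waits for."""
        self.pending[pos] = set(needed_blocks)

    def on_block_done(self, block_id, filter_fn):
        """Call after a block is reconstructed: filter and release every
        pixel that is no longer waiting on anything."""
        released = []
        for pos in list(self.pending):
            self.pending[pos].discard(block_id)
            if not self.pending[pos]:
                filter_fn(pos)            # deferred loop filter operation
                released.append(pos)
                del self.pending[pos]     # free the stored pixel data
        return released

fbl = FaceBoundaryPixelList()
fbl.defer((0, 7), ["blockB", "blockC"])   # pixel waits on two neighbors
done = []
fbl.on_block_done("blockB", done.append)  # still waiting for blockC
fbl.on_block_done("blockC", done.append)  # now filtered and released
```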
In other words, the method comprises: reconstructing spherical video pixel values of encoded blocks in a 2D represented image obtained in projection format. The 2D representation includes a set of 2D surfaces interconnected by a plurality of boundaries. One or more of the boundaries are discontinuous in the corresponding 3D representation of the spherical video in projection format. The 2-D surfaces include a first 2-D surface and a second 2-D surface that are contiguous at one of the discontinuity boundaries. The first 2D surface includes the coding block, the coding block abutting the second 2D surface. The 2D surfaces further comprise a third 2D surface adjoining the first 2D surface at a boundary that is continuous in the corresponding 3D representation of the spherical video in the projection format. The method further comprises the following steps: performing loop filtering on the reconstructed pixel value of the coding block according to the pixel value of the filtering reference pixel point set, where the filtering reference pixel point set includes one or more pixel points that are part of the third 2D plane, that is,
The
The
Diagram 800 in fig. 8 shows an example of the respective edge directions of a sample adaptive offset filter in image coding under selective loop filtering. This example assumes the use of SAO according to HEVC. Here, only the edge offset mode needs to be modified; the band offset mode does not depend on neighboring pixels.
For the edge offset mode, the following cases are distinguished according to the position of the current pixel pc:
(a) The pixel lies directly below the upper boundary of the coding block. In the three cases b, c and d, p0 is obtained from the geometrically neighboring face.
(b) The pixel lies directly to the right of the left boundary of the block. In cases a and c, p0 is obtained from the geometrically neighboring face; in case d, p1 is obtained from the geometrically neighboring face.
(c) The pixel lies directly to the left of the right boundary of the block. In case d, p0 is obtained from the geometrically neighboring face; in cases a and c, p1 is obtained from the geometrically neighboring face.
(d) The pixel lies directly above the lower boundary of the coding block. In the three cases b, c and d, p1 is obtained from the geometrically neighboring face.
If pc lies at a corner of a face, the diagonal neighbors are not available. This corresponds to the case where pc lies at a corner of the frame for the original SAO filter. The rest of the SAO filter does not need to be modified.
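The edge offset categorization itself follows HEVC: the current pixel pc is compared with its two neighbors p0 and p1 along the selected direction, where at a face boundary p0/p1 are fetched from the geometrically neighboring face as enumerated above:

```python
def sao_edge_category(p0, pc, p1):
    """HEVC-style SAO edge offset categorization of the current pixel pc
    against its two neighbors p0 and p1 along the chosen direction
    (horizontal, vertical, or one of the two diagonals)."""
    if pc < p0 and pc < p1:
        return 1          # local minimum
    if (pc < p0 and pc == p1) or (pc == p0 and pc < p1):
        return 2          # concave corner
    if (pc > p0 and pc == p1) or (pc == p0 and pc > p1):
        return 3          # convex corner
    if pc > p0 and pc > p1:
        return 4          # local maximum
    return 0              # none of the above: no offset applied

category = sao_edge_category(5, 3, 5)   # pc below both neighbors -> 1
```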
Fig. 6B shows a
In order for the filtering operation to be performed correctly, only one of each pair of face boundaries x and x' (fig. 6B) needs to be aligned with a boundary processed by the DBF. Deblocking in JEM is applied to the upper or left boundary of a given block. In the disclosed concept, only the upper and left boundaries of the cube faces are therefore considered; the right and lower boundaries are filtered automatically when the opposite boundary is processed. To this end, the DBF is modified so that the upper and left borders of the image are also filtered. This is necessary because 360-degree video is rotationally symmetric: when moving out of the border of the image, another part of the image is reached instead of leaving the scene.
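The modified boundary scheduling can be sketched as follows (illustrative; block indices only, with no handling of face rotations):

```python
def boundaries_to_deblock(num_block_rows, num_block_cols):
    """Sketch of the modified DBF scheduling: only the upper and left
    boundary of each block is filtered. Unlike a conventional DBF, the
    blocks in the first row and first column are included, because in a
    360-degree layout the picture borders wrap around to another part of
    the image instead of leaving the scene."""
    edges = []
    for r in range(num_block_rows):
        for c in range(num_block_cols):
            edges.append(("left", r, c))  # c == 0 is the picture border
            edges.append(("top", r, c))   # r == 0 is the picture border
    return edges

edges = boundaries_to_deblock(2, 2)       # 2x2 blocks -> 8 boundaries
```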
In
An image decoding apparatus and corresponding methods are described herein in connection with various embodiments. However, other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the terms "a" or "an" do not exclude a plurality.
Embodiments of the invention include or are a computer program comprising program code. The program code is for performing any of the methods described herein when executed on a computer.
Embodiments of the invention include or are a computer readable medium containing program code. The program code, when executed by a processor, causes a computer system to perform any of the methods described herein.
Those skilled in the art will appreciate that the "blocks" ("elements") in the various figures represent or describe the functionality of embodiments of the present invention (rather than individual "elements" in hardware or software), and thus, equally describe the functionality or features of apparatus embodiments as well as method embodiments (element equivalent steps).
As described above, the apparatus for image coding and decoding may be implemented in hardware such as the video coding apparatus or the video decoding apparatus described above, or implemented as a method. The method may be implemented as a computer program. The computer program is then executed in a computing device.
A video decoding device, a video encoding device, or any other corresponding image encoding device is used to perform one of the methods described above. The apparatus includes any necessary hardware components. These hardware components may include at least one processor, at least one memory, at least one network connection, a bus, and the like. For example, instead of dedicated hardware components, memory or processors may be shared with other components, or accessed from a cloud service, centralized computing unit, or other resource that may be used via a network connection.
The inventive methods may be implemented in hardware or software or any combination thereof, as desired for certain implementations of the inventive methods.
Such implementations may use a digital storage medium, such as a floppy disk, CD, DVD, Blu-ray disc, ROM, PROM, EPROM, EEPROM, or flash memory, having electronically readable control signals stored thereon which cooperate (or are capable of cooperating) with a programmable computer system such that at least one embodiment of the inventive methods is performed.
Accordingly, another embodiment of the present invention is or includes a computer program product including program code. The program code is stored on a computer readable carrier, the program code being operative for performing at least one inventive method when the computer program product runs on a computer.
In other words, an embodiment of the inventive method is therefore or comprises a computer program comprising program code. The program code is for performing at least one inventive method when the computer program runs on a computer, processor or the like.
Accordingly, another embodiment of the present invention is or includes a machine-readable digital storage medium including a computer program stored thereon. The computer program is operative to perform at least one inventive method when the computer program product is run on a computer, a processor or the like.
Thus, yet another embodiment of the invention is or includes a data stream or signal sequence representing a computer program. The computer program is operative to perform at least one inventive method when the computer program product is run on a computer, a processor or the like.
Accordingly, yet another embodiment of the invention is or includes a computer, processor, or any other programmable logic device for performing at least one of the inventive methods.
Thus, yet another embodiment of the invention is or includes a computer, processor, or any other programmable logic device, such as a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC), on which is stored a computer program operative to perform at least one inventive method when executed.
While the foregoing has been particularly shown and described with reference to particular embodiments thereof, it will be understood by those skilled in the art that various other changes in form and details may be made therein without departing from the spirit and scope thereof. It is therefore to be understood that various changes may be made in adapting to different embodiments without departing from the broader concepts disclosed herein and understood from the claims that follow.