Image encoding apparatus and method
This technology, "Image encoding apparatus and method", was created by 中神央二 on 2015-06-05. Its main content is as follows: The present disclosure relates to an image encoding apparatus and method. The image encoding apparatus includes: an intra prediction unit configured to perform intra-screen motion prediction using, as a reference range, only an upper coding tree unit, of the upper coding tree units belonging to an upper coding tree unit row adjacent to an upper portion of a current coding tree unit row, which is located above an upper-right coding tree unit adjacent to an upper-right side of the current coding tree unit in decoding order; and an encoding unit configured to encode motion information obtained from a result of the intra-screen motion prediction to generate a bitstream. The current coding tree unit row and the upper coding tree unit row belong to different slices.
1. An image encoding apparatus comprising:
an intra prediction unit configured to perform intra-screen motion prediction using, as a reference range, only an upper coding tree unit, of upper coding tree units belonging to an upper coding tree unit row adjacent to an upper portion of a current coding tree unit row, which is located above an upper right coding tree unit adjacent to an upper right side of the current coding tree unit in decoding order; and
an encoding unit configured to encode motion information obtained from a result of the intra-screen motion prediction to generate a bitstream.
2. The image encoding apparatus of claim 1, wherein the current coding tree unit row and the upper coding tree unit row belong to different slices.
3. An image encoding method implemented by an image encoding apparatus, the image encoding method comprising:
performing intra-screen motion prediction using, as a reference range, only an upper coding tree unit, of upper coding tree units belonging to an upper coding tree unit row adjacent to an upper portion of the current coding tree unit row, which is located above an upper right coding tree unit adjacent to an upper right side of the current coding tree unit in decoding order; and
encoding motion information obtained from a result of the intra-screen motion prediction to generate a bitstream.
Technical Field
The present disclosure relates to an image encoding apparatus and method and an image decoding apparatus and method. More particularly, the present disclosure relates to an image encoding apparatus and method and an image decoding apparatus and method, which can improve encoding efficiency of intra block copy (IntraBC).
Background
In recent years, apparatuses that compress images by encoding methods exploiting redundancy unique to image information have become widespread, handling image information as digital data to achieve efficient transmission and accumulation. Such encoding methods compress image information through orthogonal transforms (e.g., the discrete cosine transform) and motion compensation, and include Moving Picture Experts Group (MPEG), H.264, MPEG-4 Part 10 (Advanced Video Coding, hereinafter also referred to as AVC), and the like.
Currently, in order to achieve coding efficiency higher than that of H.264/AVC, a coding method called High Efficiency Video Coding (HEVC) has been developed as a standard by JCT-VC (Joint Collaborative Team on Video Coding), a joint standardization organization of ITU-T and ISO/IEC.
Also, in HEVC, range extensions (HEVC Range Extensions) are being considered in order to support high-end formats, such as images with chroma formats of 4:2:2 and 4:4:4 and profiles for screen content (see, for example, Non-Patent Document 1).
Meanwhile, intra block copy (IntraBC) is a coding tool that performs motion compensation within a picture. IntraBC is a tool that helps improve the efficiency of encoding artificial images, such as computer screens or CG images.
IntraBC, however, was not adopted as a technique in the HEVC range extensions described above, and its standardization is instead being considered for the Screen Content Coding (SCC) extensions.
Reference list
Non-patent document
Non-patent document 1: David Flynn, Joel Sole, and Teruhiko Suzuki, "High Efficiency Video Coding (HEVC) Range Extensions text specification: Draft 4", JCTVC-N1005_v1, 2013-08-08
Disclosure of Invention
Problems to be solved by the invention
Here, low latency is important for SCC applications, and therefore slice division needs to be used in HEVC schemes.
However, when a picture is divided into slices, the improvement in coding efficiency achieved by IntraBC becomes significantly smaller. This is because IntraBC cannot refer to any data outside the current slice.
The present disclosure has been made in view of these circumstances and aims to improve the coding efficiency of IntraBC.
Solution to the problem
According to an aspect of the present disclosure, an image encoding apparatus includes: a setting unit that sets control information for controlling intra-picture motion prediction across slices; an encoding unit that encodes an image to generate a bit stream according to the control information set by the setting unit; and a transmission unit that transmits the control information set by the setting unit and the bit stream generated by the encoding unit.
The setting unit may set, as the control information, a reference permission flag indicating permission to refer to decoding results of the current slice and slices preceding the current slice in raster scan order.
The setting unit may set a reference permission flag within a Picture Parameter Set (PPS).
The setting unit may set a parameter indicating the number of previous slices having a referable decoding result as the control information.
The setting unit may set, as the control information, an on/off flag indicating whether intra-picture motion prediction across slices can be performed.
The setting unit may set an on/off flag within a Sequence Parameter Set (SPS) or a Video Parameter Set (VPS).
When the Wavefront Parallel Processing (WPP) is on, the setting unit may limit the range of reference and set the reference permission flag.
The setting unit may set the reference permission flag when the tile division is not "on".
The setting unit may set the reference permission flag to 'off' when temporal motion-constrained tile sets SEI (MCTS-SEI) is 'on'.
According to an aspect of the present disclosure, an image encoding method is implemented by an image encoding apparatus and includes: setting control information for controlling intra-picture motion prediction across slices; encoding an image to generate a bitstream according to the set control information; and transmitting the set control information and the generated bitstream.
In another aspect of the present disclosure, an image decoding apparatus includes: a receiving unit that receives a bitstream generated by encoding an image; an extraction unit that extracts, from the bitstream received by the receiving unit, control information for controlling intra-picture motion prediction across slices; and a decoding unit that decodes the bitstream received by the receiving unit, using the control information extracted by the extraction unit, to generate an image.
The extraction unit may extract, as the control information, a reference permission flag indicating permission to refer to decoding results of the current slice and slices preceding the current slice in raster scan order.
The extraction unit may extract the reference permission flag from a Picture Parameter Set (PPS).
The extraction unit may extract a parameter indicating the number of previous slices having a referable decoding result as the control information.
The extraction unit may extract an on/off flag indicating whether cross-slice intra-picture motion prediction can be performed, as the control information.
The extraction unit may extract the on/off flag from a Sequence Parameter Set (SPS) or a Video Parameter Set (VPS).
When the Wavefront Parallel Processing (WPP) is on, the extracting unit may restrict the range of reference and extract the reference permission flag.
The extraction unit may extract the reference permission flag when the tile division is "on".
The extraction unit may extract the reference permission flag set to 'off' when temporal motion-constrained tile sets SEI (MCTS-SEI) is 'on'.
According to another aspect of the present disclosure, an image decoding method is implemented by an image decoding apparatus and includes: receiving a bitstream generated by encoding an image; extracting, from the received bitstream, control information for controlling intra-picture motion prediction across slices; and decoding the received bitstream using the extracted control information to generate an image.
In one aspect of the present disclosure, control information for controlling intra-picture motion prediction across slices is set, and an image is encoded to generate a bitstream according to the set control information. Then, the set control information and the generated bit stream are transmitted.
In another aspect of the present disclosure, a bitstream generated by encoding an image is received, and control information for controlling intra-picture motion prediction across slices is extracted from the received bitstream. Then, using the extracted control information, the received bit stream is decoded and an image is generated.
It should be noted that the above-described image encoding device and image decoding device may be independent image processing devices, or may be internal blocks forming the image encoding device and image decoding device.
Effects of the invention
According to one aspect of the present disclosure, an image may be encoded. In particular, the coding efficiency of IntraBC can be improved.
According to another aspect of the disclosure, an image may be decoded. In particular, the coding efficiency of IntraBC can be improved.
It should be noted that the effects of the present technology are not limited to the effects described herein, and may include any effects described in the present disclosure.
Drawings
Fig. 1 is a diagram for explaining an example structure of an encoding unit.
Fig. 2 is a table showing example syntax of SPS and PPS.
FIG. 3 is a table illustrating an example of semantics in accordance with the present technology.
Fig. 4 is a table showing an example syntax of the VPS.
Fig. 5 is a diagram showing an example in which a picture is divided into 4 slices according to the present technology.
Fig. 6 is a diagram showing a case where intraBC_ref_prev_slice_num is 1.
Fig. 7 is a diagram for explaining a combination of the present technology and WPP.
Fig. 8 is a diagram for explaining a combination of the present technology and WPP.
Fig. 9 is a diagram for explaining a combination of the present technology and tile division.
Fig. 10 is a table for explaining advantageous effects.
Fig. 11 is a diagram for explaining the combination of the present technology and MCTS-SEI.
Fig. 12 is a table showing an example of a NOTE added to the semantics of the MCTS-SEI message.
Fig. 13 is a block diagram showing an example configuration of a first embodiment of an encoding apparatus to which the present technology is applied.
Fig. 14 is a block diagram showing an example configuration of the encoding unit shown in fig. 13.
Fig. 15 is a flowchart for explaining the flow generation process.
Fig. 16 is a flowchart for explaining the parameter set setting process.
Fig. 17 is a flowchart for explaining the encoding process shown in fig. 15.
Fig. 18 is a flowchart for explaining the encoding process shown in fig. 15.
Fig. 19 is a flowchart for explaining the intra prediction process shown in fig. 17.
Fig. 20 is a block diagram showing an example configuration of the first embodiment of the decoding apparatus to which the present disclosure is applied.
Fig. 21 is a block diagram showing an example configuration of the decoding unit shown in fig. 20.
Fig. 22 is a flowchart for explaining an image generation process performed by the decoding apparatus shown in fig. 20.
Fig. 23 is a flowchart for explaining the parameter set extraction process.
Fig. 24 is a flowchart for explaining the decoding process in detail.
Fig. 25 is a flowchart for explaining the slice decoding process in detail.
Fig. 26 is a block diagram showing an example configuration of hardware of a computer.
Fig. 27 is a diagram illustrating an example of a multi-view image encoding method.
Fig. 28 is a diagram showing an example configuration of a multi-viewpoint image encoding apparatus to which the present disclosure is applied.
Fig. 29 is a diagram showing an example configuration of a multi-viewpoint image decoding apparatus to which the present disclosure is applied.
Fig. 30 is a diagram illustrating an example of a layered image encoding method.
Fig. 31 is a diagram for explaining an example of spatial scalable coding.
Fig. 32 is a diagram for explaining an example of time-scalable coding.
Fig. 33 is a diagram for explaining an example of SNR scalable coding.
Fig. 34 is a diagram showing an example configuration of a layered image encoding apparatus to which the present disclosure is applied.
Fig. 35 is a diagram showing an example configuration of a layered image decoding apparatus to which the present disclosure is applied.
Fig. 36 is a diagram schematically showing an example configuration of a television apparatus to which the present disclosure is applied.
Fig. 37 is a diagram schematically showing an example configuration of a portable telephone device to which the present disclosure is applied.
Fig. 38 is a diagram schematically showing an example configuration of a recording/reproducing apparatus to which the present disclosure is applied.
Fig. 39 is a diagram schematically showing an example configuration of an imaging apparatus to which the present disclosure is applied.
Fig. 40 is a block diagram showing an example of use of scalable coding.
Fig. 41 is a block diagram showing another use example of scalable coding.
Fig. 42 is a block diagram illustrating yet another use case of scalable coding.
Fig. 43 schematically shows an example configuration of a video set to which the present disclosure is applied.
Fig. 44 schematically shows an example configuration of a video processor to which the present disclosure is applied.
Fig. 45 schematically shows another example configuration of a video processor to which the present disclosure is applied.
Detailed Description
The following is a description of a mode for carrying out the disclosure (hereinafter referred to as an embodiment). It should be noted that the explanation is made in the following order.
0. Overview
1. First embodiment (encoding device and decoding device)
2. Second embodiment (computer)
3. Third embodiment (Multi-view image encoding apparatus and Multi-view image decoding apparatus)
4. Fourth embodiment (layered image encoding apparatus and layered image decoding apparatus)
5. Fifth embodiment (television device)
6. Sixth embodiment (Portable telephone apparatus)
7. Seventh embodiment (recording/reproducing apparatus)
8. Eighth embodiment (image forming apparatus)
9. Example application of scalable coding
10. Other examples of embodiments
<0. Overview>
(encoding method)
The present techniques are described below in connection with an example case where High Efficiency Video Coding (HEVC) is applied in image encoding and decoding.
< description of coding Unit >
Fig. 1 is a diagram for explaining a Coding Unit (CU) that is a coding unit in HEVC.
In HEVC, images with large frame sizes, e.g., Ultra High Definition (UHD) images of 4000 × 2000 pixels, are to be processed; therefore, fixing the coding unit size at 16 × 16 pixels is not optimal. In view of this, the CU is defined as the coding unit in HEVC.
A CU plays a role similar to that of a macroblock in AVC. Specifically, a CU is divided into prediction units (PUs) or into transform units (TUs).
It should be noted that the size of a CU is a square whose side length in pixels is a power of 2 and is variable for each sequence. Specifically, a CU is formed by dividing the largest coding unit (LCU), which is the CU of the maximum size, into halves in the horizontal and vertical directions an appropriate number of times, without becoming smaller than the smallest coding unit (SCU), which is the CU of the minimum size. That is, when the LCU is layered until the SCU is obtained, the size at any hierarchical level is 1/4 of the size at the level one above.
For example, in fig. 1, the size of the LCU is 128, and the size of the SCU is 8. Thus, the hierarchical Depth (Depth) of the LCU is 0 to 4, and the number of hierarchical Depth levels is 5. That is, the division number corresponding to the CU is one of 0 to 4.
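As a quick illustration of this depth arithmetic, the following minimal C++ sketch (not part of the disclosed apparatus; the variable names are introduced here) computes the number of hierarchy levels from the LCU and SCU sizes, since each split halves the side length:

```cpp
#include <cstdio>

int main() {
    const int lcuSize = 128;  // LCU side in pixels (example from fig. 1)
    const int scuSize = 8;    // SCU side in pixels (example from fig. 1)

    // Each quadtree split halves the CU side, so count the halvings
    // from the LCU down to the SCU.
    int maxDepth = 0;
    for (int size = lcuSize; size > scuSize; size /= 2) {
        ++maxDepth;
    }
    // With LCU = 128 and SCU = 8: depth runs from 0 to 4, i.e. 5 levels.
    std::printf("max depth = %d, levels = %d\n", maxDepth, maxDepth + 1);
    return 0;
}
```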
It should be noted that information specifying the sizes of the LCU and the SCU is contained in the SPS. Also, the division number corresponding to a CU is denoted by "split_flag", which indicates whether the CU is further divided at each hierarchical level. CUs are described in detail in Non-Patent Document 1.
The size of a TU can be specified with "split_transform_flag", similarly to "split_flag" for a CU. The maximum numbers of divisions of a TU in inter prediction and intra prediction are specified as "max_transform_hierarchy_depth_inter" and "max_transform_hierarchy_depth_intra" in the SPS, respectively.
Further, in this specification, a coding tree unit (CTU) is a unit that includes the coding tree block (CTB) of an LCU and parameters used when processing is performed at the LCU base (level). Likewise, a CU constituting a CTU is a unit that includes a coding block (CB) and parameters used when processing is performed at the CU base (level).
(mode selection)
Meanwhile, in order to achieve higher coding efficiency by AVC and HEVC coding methods, it is critical to select a suitable prediction mode.
The method implemented in the H.264/MPEG-4 AVC reference software known as the Joint Model (JM) (available at http://iphome.hhi.de/suehring/tml/index.htm) can be used as an example of such a selection method.
In the JM, the two mode determination methods described below can be selected: the high complexity mode and the low complexity mode. With either method, a cost function value is calculated for each prediction mode, and the prediction mode that minimizes the cost function value is selected as the optimum mode for the block or macroblock.
The cost function in the high complexity mode is represented in the following expression (1).
Cost(Mode ∈ Ω) = D + λR ... (1)
Here, Ω denotes the universal set of candidate modes for encoding the block or macroblock, and D denotes the differential energy between the decoded image and the input image when encoding is performed in the current prediction mode. λ denotes the Lagrange undetermined multiplier given as a function of the quantization parameter. R denotes the total code amount, including the orthogonal transform coefficients, in the case of encoding in the current mode.
That is, in order to perform encoding in the high complexity mode, it is necessary to perform provisional encoding processing in all candidate modes to calculate the above parameters D and R, and therefore, a large amount of calculation is required.
The cost function in the low complexity mode is expressed in the following expression (2).
Cost(Mode ∈ Ω) = D + QP2Quant(QP) × HeaderBit ... (2)
Here, D is different from D in the high complexity mode, and denotes the differential energy between the prediction image and the input image. QP2Quant(QP) is given as a function of the quantization parameter QP, and HeaderBit denotes the code amount of information that belongs to the header and does not include orthogonal transform coefficients, such as motion vectors and the mode.
That is, in the low complexity mode, the prediction processing needs to be performed for each candidate mode, but the image does not need to be decoded. Therefore, the encoding process does not need to be performed. Therefore, the amount of calculation is smaller than that in the high complexity mode.
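To make the trade-off concrete, here is a hedged C++ sketch of the low complexity mode decision of expression (2); the Mode struct and qp2Quant() are illustrative assumptions and do not reproduce the exact JM implementation:

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Hedged sketch of the low complexity mode decision of expression (2):
// Cost(Mode) = D + QP2Quant(QP) * HeaderBit. The Mode struct and
// qp2Quant() below are illustrative assumptions, not the actual JM API.
struct Mode {
    double distortion;  // D: differential energy, prediction image vs. input
    double headerBits;  // HeaderBit: code amount of mode/motion-vector info
};

double qp2Quant(int qp) {
    // A monotonic function of QP; the exact JM mapping is not reproduced here.
    return std::pow(2.0, (qp - 12) / 3.0);
}

int selectBestMode(const std::vector<Mode>& candidates, int qp) {
    int best = -1;
    double bestCost = std::numeric_limits<double>::max();
    for (int i = 0; i < static_cast<int>(candidates.size()); ++i) {
        double cost = candidates[i].distortion
                      + qp2Quant(qp) * candidates[i].headerBits;
        if (cost < bestCost) {  // keep the mode minimizing the cost function
            bestCost = cost;
            best = i;
        }
    }
    return best;  // index of the optimum mode, or -1 if no candidates
}
```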
(IntraBC)
Intra block copy (IntraBC) is a coding tool for performing motion compensation within a picture. IntraBC is a tool that helps improve the efficiency of encoding artificial images (e.g., computer screens or CG images).
IntraBC, however, was not adopted as a technique in the HEVC range extensions described above, and its standardization is instead being considered for the Screen Content Coding (SCC) extensions.
In the case of IntraBC, only vector values are transmitted. Therefore, when a picture is divided into slices, the relationship between the current block and other slices is unclear. In the case of temporal prediction, on the other hand, the relationship between the current block and the reference frame is made clear by the combination of the reference list, the index, and the vector values.
A restriction is therefore imposed so that the vector values are based only on data within the current slice. In this way, reference to any slice other than the current slice is implicitly prohibited.
For the above reasons, the effect of IntraBC becomes smaller, and in the case where a picture is divided into slices to achieve low-delay transmission, the coding efficiency becomes worse.
In view of the above, the present technology proposes transmitting intraBC_ref_prev_slice_flag, a reference permission flag indicating that IntraBC may refer to the decoding results of previous slices. When the value of intraBC_ref_prev_slice_flag is 0 (the default value), IntraBC may refer only to the current slice. When the value of intraBC_ref_prev_slice_flag is 1, IntraBC may refer not only to the current slice but also to blocks in previous slices. Note that this flag expresses a relationship between slices and is therefore appropriately set in the Picture Parameter Set (PPS).
Also, the present technology proposes transmitting intraBC_ref_prev_slice_num, a parameter indicating the number of previous slices whose decoding results can be referred to.
For example, when intraBC_ref_prev_slice_num is 5 and the current slice number is 10, the images of slices 5 through 10 may be referred to. If the current slice number is less than 5, the images of all slices from slice 0 through the current slice may be referred to.
Further, the present technology proposes setting sps_cross_intraBC_enable_flag, an on/off flag indicating whether intra-picture motion prediction across slices can be performed, in the Sequence Parameter Set (SPS) or the Video Parameter Set (VPS).
(example syntax of SPS and PPS)
Fig. 2 is a table showing example syntax of SPS and PPS. In the example of fig. 2, the syntax of the PPS is shown below the syntax of the SPS.
In the SPS, intra_block_copy_enabled_flag, a flag indicating that IntraBC is to be performed, is written; below it, sps_cross_intraBC_enable_flag is added as the above-described on/off flag of the present technology.
In the PPS, intraBC_ref_prev_slice_flag, the above-described reference permission flag of the present technology, is added. intraBC_ref_prev_slice_flag is parsed only when sps_cross_intraBC_enable_flag, the on/off flag added to the SPS, is true. Further, below intraBC_ref_prev_slice_flag, intraBC_ref_prev_slice_num, the above-described parameter of the present technology, is added; it is parsed only when intraBC_ref_prev_slice_flag is true.
It should be noted that the information for controlling IntraBC (intra-picture motion prediction across slices), such as the above-described flags and parameters, is hereinafter collectively referred to as IntraBC control information. Also, an intra prediction mode using IntraBC will be hereinafter referred to as the IntraBC mode.
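The parsing dependency described above can be sketched as follows. This is a minimal illustration in C++, assuming a simple Exp-Golomb bit reader; BitReader and parsePpsIntraBC are names introduced here, not HEVC reference-software APIs:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal illustration of the parsing dependency described above. The
// BitReader here is a simple Exp-Golomb reader written for this sketch;
// neither it nor parsePpsIntraBC is HEVC reference-software API.
class BitReader {
public:
    explicit BitReader(const std::vector<std::uint8_t>& data) : data_(data) {}
    bool readFlag() { return readBit() != 0; }
    std::uint32_t readUe() {            // Exp-Golomb ue(v), as used in HEVC
        int leadingZeros = 0;
        while (readBit() == 0) ++leadingZeros;
        std::uint32_t value = 1;
        for (int i = 0; i < leadingZeros; ++i) value = (value << 1) | readBit();
        return value - 1;
    }
private:
    std::uint32_t readBit() {
        std::uint32_t bit = (data_[pos_ >> 3] >> (7 - (pos_ & 7))) & 1u;
        ++pos_;
        return bit;
    }
    const std::vector<std::uint8_t>& data_;
    std::size_t pos_ = 0;
};

struct IntraBCControlInfo {
    bool sps_cross_intraBC_enable_flag = false;   // on/off flag from the SPS
    bool intraBC_ref_prev_slice_flag = false;     // reference permission flag (PPS)
    std::uint32_t intraBC_ref_prev_slice_num = 0; // referable previous slices (PPS)
};

// intraBC_ref_prev_slice_flag is parsed only when the SPS on/off flag is
// true, and intraBC_ref_prev_slice_num only when that flag is in turn
// true; absent syntax elements are inferred to be 0.
void parsePpsIntraBC(BitReader& br, IntraBCControlInfo& info) {
    if (info.sps_cross_intraBC_enable_flag) {
        info.intraBC_ref_prev_slice_flag = br.readFlag();
        if (info.intraBC_ref_prev_slice_flag) {
            info.intraBC_ref_prev_slice_num = br.readUe();  // ue(v) assumed
        }
    }
}
```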
(example of semantics)
Fig. 3 is a table illustrating an example of semantics according to the present technology. In the example of fig. 3, sps_cross_intraBC_enable_flag, intraBC_ref_prev_slice_flag, and intraBC_ref_prev_slice_num are defined as follows.
sps_cross_intraBC_enable_flag equal to 1 specifies that intraBC_ref_prev_slice_flag may have a value equal to 1 in the CVS;
intraBC_ref_prev_slice_flag equal to 1 specifies that a prediction unit whose coding mode is IntraBC (predModeIntraBc equal to 1) in the current slice can refer to previously decoded slice data that precedes the current slice in decoding order within the current picture. intraBC_ref_prev_slice_flag equal to 0 specifies that prediction units whose coding mode is IntraBC do not refer to previously decoded slice data. When not present, the value of intraBC_ref_prev_slice_flag is inferred to be 0;
intraBC_ref_prev_slice_num indicates the one or more slices that can be referred to by prediction units whose coding mode is IntraBC in the current slice. The set of slices is obtained as follows.
Let C be the index of the current slice in the current picture (e.g., 0 for the first slice). A is calculated as follows:
A = (C - intraBC_ref_prev_slice_num) < 0 ? 0 : (C - intraBC_ref_prev_slice_num)
Then, the X-th slices (where X ranges from A to C) are the target slices indicated by this syntax.
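This clamp translates directly into code. The following one-function C++ sketch (firstReferableSlice is a name introduced here for illustration) computes A from the current slice index C:

```cpp
#include <algorithm>

// Direct transcription of the semantics above: given the current slice
// index C and intraBC_ref_prev_slice_num, slices A through C may be
// referred to. firstReferableSlice is a name introduced for this sketch.
int firstReferableSlice(int c, int intraBC_ref_prev_slice_num) {
    // A = (C - intraBC_ref_prev_slice_num) < 0 ? 0 : (C - intraBC_ref_prev_slice_num)
    return std::max(c - intraBC_ref_prev_slice_num, 0);
}
// Example: C = 10, num = 5 -> slices 5..10 are referable.
//          C = 3,  num = 5 -> A clamps to 0, so slices 0..3 are referable.
```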
(example syntax of VPS)
Fig. 4 is a table showing an example syntax of the VPS. In the VPS shown in fig. 4, sps_cross_intraBC_enable_flag of the SPS shown in fig. 2 is written as vps_cross_intraBC_enable_flag.
(detailed description)
Fig. 5 is a diagram showing an example in which a picture is divided into four slices (slice #0 through slice #3) according to the present technology.
In the case where reference to a different slice is prohibited, the range that can be referred to from the current CTU in slice #2 is only the already decoded region within slice #2.
In the case of the present technology, on the other hand, already decoded different slices (slice #0 and slice #1) are included in the referenceable range; thus, for example, as shown in fig. 5, a block within slice #0 or slice #1 can also be referred to from the current CTU.
Fig. 6 is a diagram showing a case where intraBC_ref_prev_slice_num is 1 in the example shown in fig. 5.
Since intraBC_ref_prev_slice_num is 1, the range that can be referred to from the current CTU in slice #2 is limited to the immediately preceding slice #1 and the decoded region of slice #2; slice #0 can no longer be referred to.
(combination with WPP)
Fig. 7 and 8 are diagrams for explaining a combination of the present technology and Wavefront Parallel Processing (WPP).
WPP is processing executed when entropy_coding_sync_enabled_flag in the PPS is 1. There are two methods of performing WPP. The first is a multi-slice encoding method in which one slice is one CTU row. The second is an encoding method using entry_point_offset, in which one slice is one picture. Since the present technology described above can be applied as it is in the case of the second method, the first method is described below.
When the WPP function is turned on, one slice is one CTU row. Therefore, if reference to a different slice is prohibited, the range that can be referred to from the current CTU is only the already decoded CTUs to the left of the current CTU within the same CTU row.
According to the present technology, on the other hand, when the WPP function is turned on, the reference range is not limited to the current CTU row: CTUs in the upper CTU row can also be referred to, up to the CTU located above the CTU adjacent to the upper-right side of the current CTU in decoding order.
That is, when the leftmost CTU in the current CTU row is the current CTU, CTUs up to the second CTU from the left in the upper CTU row can be referred to.
Also, when the second CTU from the left in the current CTU row is the current CTU, CTUs up to the third CTU from the left in the upper CTU row can be referred to.
In this way, the present techniques and WPP may be combined.
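The reference range under WPP described above can be sketched as a simple predicate. This is an illustrative C++ check over CTU coordinates, assuming CTU-granular positions and omitting picture-boundary handling; it is not taken from the disclosed apparatus:

```cpp
// Illustrative check of the WPP-compatible reference range described
// above; coordinates are in CTU units and picture-boundary handling is
// omitted. From the current CTU at column curCol, CTUs in the upper CTU
// row may be referred to only up to the CTU above the upper-right
// neighbor (column curCol + 1), mirroring the two-CTU wavefront delay.
bool isReferableUnderWpp(int curRow, int curCol, int refRow, int refCol) {
    if (refRow == curRow) {
        return refCol < curCol;      // already decoded CTUs to the left
    }
    if (refRow == curRow - 1) {
        return refCol <= curCol + 1; // up to above the upper-right CTU
    }
    return false;                    // rows further up: outside this sketch
}
```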
(combination with tile division)
Fig. 9 is a diagram for explaining a combination of the present technology and tile division.
Tile division is processing executed when tiles_enabled_flag in the PPS is 1. Tiles are specified in HEVC as a tool for parallel processing. A tile is a unit of division of a picture. In the image compression information, the row size and the column size of each tile are specified in LCU units in the SPS or the PPS.
The LCUs contained in each tile are processed in raster scan order, and the tiles contained in each picture are processed in raster scan order. A tile may also include a plurality of slices, and there may be a slice boundary within a tile.
In the case where a picture is vertically divided into two parts, i.e., two tiles each constituting one slice, if reference to a different slice is prohibited, the current CTU in tile #1 can refer only to the decoded region of tile #1 and cannot refer to tile #0.
According to the present technology, on the other hand, intraBC_ref_prev_slice_flag, the reference permission flag, is set to 1 so that a decoded different slice can be referred to. Thus, when the tile division function is turned on, reference to the decoded tile #0 is allowed.
(advantageous effects)
Fig. 10 is a table for explaining advantageous effects.
In the case where reference to different slices is prohibited, the slices can be decoded independently of one another. With the present technology, on the other hand, IntraBC cannot be performed until the specified slices have been fully decoded; therefore, the slices cannot be decoded independently of one another.
In the case where reference to a different slice is prohibited, IntraBC cannot refer to previous slices, and coding efficiency therefore becomes worse. According to the present technology, on the other hand, IntraBC can refer to previous slices, and coding efficiency therefore improves.
(in combination with MCTS-SEI)
Fig. 11 is a diagram for explaining a combination of the present technology and temporal motion-constrained tile sets SEI (MCTS-SEI).
MCTS-SEI is an SEI included in the draft for SHVC (JCTVC-Q1008_v2). By using MCTS-SEI, only the data of specified tiles can be extracted from the bitstream, so that the specified tiles can be decoded independently. It should be noted that, without this SEI, it is not possible to independently decode only some of the tiles within a picture.
In the example shown in fig. 11, the picture is divided into 10 × 6 tiles. The tiles of mcts_id[0], within the region indicated by the bold frame, are only a part of the picture, but these tiles alone can be extracted and decoded (such decoding is hereinafter referred to as independent decoding).
Similarly, the tiles of mcts_id[1], within the frame drawn with a dashed line, can also be decoded independently. MCTS-SEI can specify tile sets for a plurality of regions in this manner, e.g., mcts_id[0] and mcts_id[1] as shown in fig. 11.
Therefore, for the slices within a tile set specified by MCTS-SEI, intraBC_ref_prev_slice_flag needs to be set to 0.
This is because reference to tiles/slices other than the current tile/slice is prohibited in that case.
Fig. 12 is a table showing an example of a NOTE added to the semantics of the MCTS-SEI message.
To combine the present technology with MCTS-SEI, the NOTE shown in fig. 12 is added to the semantics of the MCTS-SEI message according to JCTVC-Q1008_v2.
NOTE - When intraBC_ref_prev_slice_flag is equal to 1, the intra block copying process may introduce decoding dependencies between tiles. Encoders are therefore encouraged to set intraBC_ref_prev_slice_flag equal to 0 in the tiles selected by a temporal motion-constrained tile set.
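An encoder-side guard implementing this NOTE might look as follows; a hedged sketch in C++ in which the tile-set membership test is passed in by the caller rather than derived from real SEI parsing:

```cpp
// Hedged sketch of an encoder-side guard implementing the NOTE above;
// whether the current tile belongs to a motion-constrained tile set is
// passed in by the caller rather than derived from real SEI parsing.
bool chooseIntraBCRefPrevSliceFlag(bool encoderWantsCrossSlice,
                                   bool tileIsInMctsTileSet) {
    if (tileIsInMctsTileSet) {
        // Forced off: cross-slice IntraBC would create decoding
        // dependencies between tiles and break independent decoding.
        return false;
    }
    return encoderWantsCrossSlice;  // otherwise keep the encoder's choice
}
```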
Next, an example application of the present technology described above to a specific device is described.
< first embodiment >
(example configuration of embodiment of encoding apparatus)
Fig. 13 is a block diagram showing an example configuration of an embodiment of an encoding apparatus to which the present disclosure is applied.
The encoding apparatus 10 shown in fig. 13 includes a setting unit 11, an encoding unit 12, and a transmission unit 13.
The setting unit 11 of the encoding apparatus 10 sets parameter sets such as the VPS, SPS, PPS, VUI, and SEI. In particular, the setting unit 11 sets the IntraBC control information in the SPS and the PPS. The setting unit 11 supplies the set parameter sets to the encoding unit 12.
Frame-based images are input to the encoding unit 12. The encoding unit 12 encodes the input images with reference to the parameter sets supplied from the setting unit 11, and supplies the resulting encoded stream to the transmission unit 13.
The transmission unit 13 transmits the encoded stream supplied from the encoding unit 12 to the decoding apparatus 110 described later.
(example configuration of coding Unit)
Fig. 14 is a block diagram showing an example configuration of the
The
The a/
The
The
The
The
The
The
The
The
It should be noted that the encoding information subjected to lossless encoding may be header information (e.g., slice header) regarding the quantization value subjected to lossless encoding.
The
The quantized value output from the
The inverse
The
A deblocking (deblocking)
The adaptive offset
Specifically, the adaptive offset
The adaptive offset
For example, the
Specifically, for each LCU, the
The
It should be noted that although in this example, the adaptive loop filtering process is performed for each LCU, the processing unit in the adaptive loop filtering process is not limited to LCUs. However, the processing can be performed efficiently, in which the adaptive offset
The
IntraBC control information in the SPS and the PPS is supplied from the setting unit 11 to the intra prediction unit 46. Using the peripheral image read from the
Also, the intra prediction unit 46 calculates cost function values (described in detail later) of all candidate intra prediction modes from the image read from the
The intra prediction unit 46 supplies the prediction image generated in the optimal intra prediction mode and the corresponding cost function value to the prediction
The motion prediction/
In this regard, the motion prediction/
The prediction
Based on the encoded data stored in the
(description of processing to be performed by the encoding apparatus)
Fig. 15 is a flowchart for explaining a stream generation process to be performed by the encoding device 10 shown in fig. 13.
In step S11 of fig. 15, the setting unit 11 of the encoding apparatus 10 sets parameter sets such as a VPS and an SPS. The setting unit 11 supplies the set parameter sets to the encoding unit 12.
In step S12, the encoding unit 12 performs the encoding process, which is described in detail later with reference to figs. 17 and 18.
In step S13, the accumulation buffer 37 (fig. 14) of the encoding unit 12 supplies the stored encoded stream to the transmission unit 13.
In step S14, the transmission unit 13 transmits the encoded stream, which includes the parameter sets set by the setting unit 11, to the decoding apparatus 110 described later, and the process ends.
Now, the parameter set setting process of step S11 in fig. 15 is described in detail with reference to the flowchart of fig. 16. In the example shown in fig. 16, IntraBC control information is set in the SPS and PPS.
In step S31, the setting unit 11 shown in fig. 13 sets sps_cross_intraBC_enable_flag. In step S32, the setting unit 11 determines whether sps_cross_intraBC_enable_flag is 1. If it is determined in step S32 that sps_cross_intraBC_enable_flag is 1, the process moves to step S33.
In step S33, the setting unit 11 sets intraBC_ref_prev_slice_flag. In step S34, the setting unit 11 determines whether intraBC_ref_prev_slice_flag is 1.
If it is determined in step S34 that intraBC_ref_prev_slice_flag is 1, the process moves to step S35. In step S35, the setting unit 11 sets intraBC_ref_prev_slice_num.
If it is determined in step S32 that sps_cross_intraBC_enable_flag is 0, steps S33 to S35 are skipped and the parameter set setting process ends. Then, the process returns to step S11 in fig. 15.
If it is determined in step S34 that intraBC_ref_prev_slice_flag is 0, step S35 is skipped and the parameter set setting process ends. Then, the process returns to step S11 in fig. 15.
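The flow of fig. 16 can be summarized in code. The following C++ sketch mirrors steps S31 to S35, with the Params struct and the decision inputs introduced here purely for illustration:

```cpp
// Hedged sketch mirroring the flow of fig. 16 (steps S31 to S35); the
// Params struct and the decision inputs are introduced here purely for
// illustration and are not part of the disclosed apparatus.
struct Params {
    bool sps_cross_intraBC_enable_flag = false;
    bool intraBC_ref_prev_slice_flag = false;
    int intraBC_ref_prev_slice_num = 0;
};

Params setIntraBCControlInfo(bool enableCrossSlice, bool allowPrevSlice,
                             int numPrevSlices) {
    Params p;
    p.sps_cross_intraBC_enable_flag = enableCrossSlice;    // S31
    if (p.sps_cross_intraBC_enable_flag) {                 // S32
        p.intraBC_ref_prev_slice_flag = allowPrevSlice;    // S33
        if (p.intraBC_ref_prev_slice_flag) {               // S34
            p.intraBC_ref_prev_slice_num = numPrevSlices;  // S35
        }
    }
    return p;  // skipped fields keep their default (inferred) values
}
```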
Next, figs. 17 and 18 are flowcharts for explaining in detail the encoding process of step S12 in fig. 15. Frame-based images are input to the encoding unit 12.
In step S61 of fig. 17, the A/D converter 31 (fig. 14) of the encoding unit 12 performs A/D conversion on the input frame-based image.
In step S62, the
In step S63, the intra prediction unit 46 performs the intra prediction process in all the candidate intra prediction modes for each LCU. This intra prediction process is described in detail later with reference to fig. 19. That is, the intra prediction unit 46 calculates the cost function values of all candidate intra prediction modes (including the IntraBC prediction mode) from the image read from the
Meanwhile, in step S64, the motion prediction/
In step S65, the prediction
In step S65, the prediction
Then, in step S66, the motion prediction/
On the other hand, if it is determined in step S65 that the optimal prediction mode is not the optimal inter prediction mode or that the optimal prediction mode is the optimal intra prediction mode, the prediction
In step S69, the
In step S70, the
In step S71, the
In step S72, the
In step S73, the inverse
In step S74, the
In step S75, the
In step S76, the adaptive offset
In step S77, the
In step S78, the
In step S79, the
In step S80, the
In step S81, the
In step S82, the
Now, the intra prediction process at step S63 of fig. 17 is described in detail with reference to the flowchart in fig. 19. The IntraBC control information (i.e., sps_cross_intraBC_enable_flag, intraBC_ref_prev_slice_flag, and intraBC_ref_prev_slice_num) is supplied from the setting unit 11 to the intra prediction unit 46.
In step S91, the intra prediction unit 46 divides the picture into a plurality of slices. In step S92, the intra prediction unit 46 performs intra prediction in a prediction mode other than the IntraBC mode to calculate the cost function value.
In step S93, the intra prediction unit 46 determines whether sps_cross_intraBC_enable_flag is 1. If it is determined in step S93 that sps_cross_intraBC_enable_flag is 1, the process moves to step S94.
In step S94, the intra prediction unit 46 searches for a motion vector of IntraBC. In step S95, the intra prediction unit 46 determines whether the search within the search range is completed. If it is determined in step S95 that the search within the search range is not completed, the process moves to step S96.
In step S96, the intra prediction unit 46 changes the search point. In step S97, the intra prediction unit 46 determines whether the search point changed from the previous search point in step S96 is located within the current slice.
If it is determined in step S97 that the search point is not located within the current slice, the process moves to step S98. In step S98, the intra prediction unit 46 determines whether intraBC_ref_prev_slice_flag is 1. If it is determined in step S98 that intraBC_ref_prev_slice_flag is 1, the process moves to step S99.
In step S99, the intra prediction unit 46 determines whether the position of the search point is within the range specified by intraBC_ref_prev_slice_num.
If it is determined in step S99 that the position of the search point is not within the range specified by intraBC_ref_prev_slice_num, the process returns to step S96, and the above procedures are repeated. If it is determined in step S98 that intraBC_ref_prev_slice_flag is not 1, the process also returns to step S96, and the procedures are repeated.
If it is determined in step S99 that the position of the search point is within the range specified by intraBC_ref_prev_slice_num, the process moves to step S100. If it is determined in step S97 that the search point is located within the current slice, the process also moves to step S100.
In step S100, the intra prediction unit 46 calculates a cost function value in the IntraBC mode. The IntraBC vector corresponding to the minimum cost in the IntraBC mode is stored in a memory (not shown). In step S101, the intra prediction unit 46 determines whether the cost function value calculated in step S100 is less than the minimum cost.
If it is determined in step S101 that the cost function value is smaller than the minimum cost, the process moves to step S102. In step S102, the IntraBC vector in memory is updated along with the minimum cost, and the process returns to step S96. Then, the procedure is repeated.
If it is determined in step S101 that the cost function value is not less than the minimum cost, the procedure returns to step S96, and then the routine is repeated.
If it is determined in step S95 that the search within the search range is completed, the process moves to step S103. If it is determined in step S93 that sps_cross_intraBC_enable_flag is not 1, the process also moves to step S103.
In step S103, the intra prediction unit 46 determines the optimum intra prediction mode from the cost function value, and ends the intra prediction processing.
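The search loop of fig. 19 can be sketched as follows. In this illustrative C++ version, the slice-geometry and cost helpers are placeholder stand-ins (the real encoder evaluates rate-distortion costs over pixel data); only the control flow follows the flowchart:

```cpp
#include <limits>
#include <vector>

// Hedged sketch of the search loop in fig. 19 (steps S94 to S102). The
// slice-geometry and cost helpers are placeholder stand-ins (the real
// encoder evaluates rate-distortion costs over pixel data); only the
// control flow follows the flowchart.
struct SearchPoint { int x = 0; int y = 0; int sliceIdx = 0; };

bool insideCurrentSlice(const SearchPoint& p, int curSlice) {
    return p.sliceIdx == curSlice;                          // S97
}

bool insideAllowedPrevSlices(const SearchPoint& p, int curSlice, int num) {
    int first = (curSlice - num < 0) ? 0 : curSlice - num;  // range from PPS
    return p.sliceIdx >= first && p.sliceIdx <= curSlice;   // S99
}

double intraBCCost(const SearchPoint& p) {
    return static_cast<double>(p.x * p.x + p.y * p.y);      // placeholder (S100)
}

SearchPoint searchIntraBCVector(const std::vector<SearchPoint>& range,
                                int curSlice, bool refPrevSliceFlag,
                                int refPrevSliceNum) {
    SearchPoint best;
    double bestCost = std::numeric_limits<double>::max();
    for (const SearchPoint& p : range) {                    // S95/S96: next point
        if (!insideCurrentSlice(p, curSlice)) {
            if (!refPrevSliceFlag) continue;                // S98: cross-slice off
            if (!insideAllowedPrevSlices(p, curSlice, refPrevSliceNum))
                continue;                                   // S99: outside range
        }
        double cost = intraBCCost(p);                       // S100
        if (cost < bestCost) {                              // S101
            bestCost = cost;                                // S102: update vector
            best = p;
        }
    }
    return best;
}
```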
(example configuration of embodiment of decoding apparatus)
Fig. 20 is a block diagram showing an example configuration of an embodiment of a decoding apparatus to which the present disclosure is applied. The decoding apparatus 110 decodes the encoded stream transmitted from the encoding apparatus 10 shown in fig. 13.
The
The receiving
The
The
(example configuration of decoding Unit)
Fig. 21 is a block diagram showing an example configuration of the
The
The
The
The
The
The
Specifically, the
The inverse
The
The
The adaptive offset
Using the filter coefficients supplied from the
The
The D/a
The frame memory 141 stores the image supplied from the
Using the peripheral image read from the frame memory 141 through the
The
When the intra prediction mode information is supplied from the
(description of procedure to be performed by decoding apparatus)
Fig. 22 is a flowchart for explaining an image generation process performed by the decoding apparatus 110 shown in fig. 20.
In step S111 in fig. 22, the receiving
In step S112, the
In step S113, the
In step S114, using the parameter sets supplied from the
Now, the decoding process at step S114 in fig. 22 is described in detail with reference to the flowchart in fig. 23.
In step S121, the
In step S122, the
In step S123, the
In step S124, the
In step S125, the
On the other hand, if it is determined in step S123 that sps_cross_intraBC_enable_flag is not 1, or if it is determined in step S124 that intraBC_ref_prev_slice_flag is not 1, the process moves to step S126.
In step S126, the
It should be noted that the slice decoding processing in steps S125 and S126 is described later with reference to fig. 25.
Now, with reference to the flowchart in fig. 24, another example of the decoding process at step S114 in fig. 22 is described.
In step S141, the
In step S142, the
In step S143, the
In step S144, the
In step S145, the
If it is determined in step S145 that the slice is a slice having a dependency relationship, the process moves to step S146. In step S146, the
On the other hand, if it is determined in step S145 that the slice is not a slice having a dependency relationship, the process moves to step S147. In step S147, the
If it is determined in step S143 that sps_cross_intraBC_enable_flag is not 1, or if it is determined in step S144 that intraBC_ref_prev_slice_flag is not 1, the process moves to step S148.
In step S148, the
In the above manner, the slices are decoded in parallel or sequentially in accordance with the IntraBC control information.
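The dispatch decision of fig. 24 can be sketched as follows; SliceInfo and planSliceDecoding are names introduced here for illustration, not elements of the disclosed apparatus:

```cpp
#include <cstddef>
#include <vector>

// Hedged sketch of the dispatch logic in fig. 24 (steps S143 to S148):
// when cross-slice IntraBC is off, every slice can be decoded in parallel;
// otherwise slices flagged as depending on earlier slices are decoded
// sequentially. SliceInfo and planSliceDecoding are illustrative names.
enum class Schedule { Parallel, Sequential };

struct SliceInfo { bool usesIntraBCAcrossSlices = false; };

std::vector<Schedule> planSliceDecoding(const std::vector<SliceInfo>& slices,
                                        bool spsCrossIntraBCEnableFlag,
                                        bool intraBCRefPrevSliceFlag) {
    std::vector<Schedule> plan(slices.size(), Schedule::Parallel);
    if (!spsCrossIntraBCEnableFlag || !intraBCRefPrevSliceFlag) {
        return plan;                          // S148: all slices independent
    }
    for (std::size_t i = 0; i < slices.size(); ++i) {
        if (slices[i].usesIntraBCAcrossSlices) {           // S145: dependency?
            plan[i] = Schedule::Sequential;   // S146: wait for earlier slices
        }                                     // S147: otherwise stay parallel
    }
    return plan;
}
```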
It should be noted that the slice decoding processing in steps S146 to S148 is described later with reference to fig. 25.
Now, the slice decoding process is described with reference to a flowchart in fig. 25. This process is performed in parallel or sequentially for slices by the
In step S161 of fig. 25, the accumulation buffer 131 (fig. 21) of the
In step S162, the
The
The
In step S163, the
In step S164, the inverse
In step S165, the
In step S166, the
If it is determined in step S165 that the inter prediction mode information is not provided, or if the intra prediction mode information is provided to the
In step S167, the
In step S168, using the peripheral image read from the frame memory 141 through the
If it is determined in step S167 that the mode is the IntraBC mode, the process moves to step S169. In step S169, the
In step S171, the
In step S172, the
In step S173, the adaptive offset
In step S174, the
In step S175, the frame memory 141 stores the image supplied from the
In step S176, the
In step S177, the D/a
In the above manner, the coding efficiency of IntraBC can be improved.
In the above examples, an HEVC-compliant method is used as the encoding method. However, the present technology is not limited to this method, and other encoding/decoding methods may be used.
It should be noted that the present disclosure can be applied to an image encoding apparatus and an image decoding apparatus, which are used, for example, when image information (bit stream) compressed by orthogonal transformation (e.g., discrete cosine transform and motion compensation) is received through a network medium (e.g., satellite broadcasting, cable television, the internet, or a portable telephone) as in HEVC. The present disclosure can also be applied to an image encoding apparatus and an image decoding apparatus that are used when compressed image information is processed on a storage medium (e.g., an optical or magnetic disk or a flash memory).
< second embodiment >
(description of computer to which the present disclosure applies)
The above-described series of processes may be executed by hardware, and may also be executed by software. When the series of processes is executed by software, a program constituting the software is installed into a computer. Here, the computer may be a computer contained in dedicated hardware, or may be a general-purpose personal computer that can perform various functions, in which various programs are installed.
Fig. 26 is a block diagram showing an example configuration of hardware of a computer that executes the above-described series of processes according to a program.
In the computer, a Central Processing Unit (CPU) 201, a Read Only Memory (ROM) 202, and a Random Access Memory (RAM) 203 are connected to one another by a bus 204.
The input/
The
In the computer having the above-described configuration, for example, the
For example, a program executed by the computer (CPU 201) may be recorded on the
In the computer, when the
It should be noted that the program executed by the computer may be a program for executing the processing in time series according to the order described in the present specification, or may be a program for executing the processing in parallel or executing the processing as needed (for example, when there is a call).
<3. Third embodiment>
(applied to Multi-view image encoding and Multi-view image decoding)
The above series of processes can be applied to multi-view image encoding and multi-view image decoding. Fig. 27 shows an example of the multi-view image encoding method.
As shown in fig. 27, a multi-view image includes images of a plurality of viewpoints. The views of the multi-view image include a base view, which is encoded/decoded using only images of its own view without using images of other views, and non-base views, which are encoded/decoded using images of other views. A non-base view may be processed using images of the base view, or using images of other non-base views.
In the case of encoding/decoding the multi-view image shown in fig. 27, the images of the respective views are encoded/decoded, and the method according to the above-described first embodiment may be applied to the encoding/decoding of the respective views. In this way, the coding efficiency of IntraBC, and hence overall coding efficiency, can be improved.
Further, parameters used in the method according to the above-described first embodiment may be shared when encoding/decoding the respective views. More specifically, VPS, SPS, PPS, etc., which are encoding information, may be shared when encoding/decoding the corresponding views. Necessary information other than those parameter sets may of course be shared when encoding/decoding the respective views.
In this way, transmission of redundant information can be prevented, and the amount of information to be transmitted (bit rate) can be reduced (or reduction in encoding efficiency can be prevented).
(Multi-viewpoint image encoding apparatus)
Fig. 28 is a diagram showing a multi-view image encoding apparatus that performs the above-described multi-view image encoding. As shown in fig. 28, the multi-view
The
The encoding apparatus 10 (fig. 13) can function as the
(Multi-view image decoding apparatus)
Fig. 29 is a diagram showing a multi-view image decoding apparatus that performs the multi-view image decoding described above. As shown in fig. 29, the multi-view image decoding apparatus 610 includes a demultiplexer 611, a decoding unit 612, and a decoding unit 613.
The demultiplexer 611 demultiplexes a multi-view image encoding stream formed by multiplexing a base view image encoding stream and a non-base view image encoding stream, and extracts the base view image encoding stream and the non-base view image encoding stream. The decoding unit 612 decodes the base view image coded stream extracted by the demultiplexer 611, and obtains a base view image. The decoding unit 613 decodes the non-base view image coded stream extracted by the demultiplexer 611, and obtains a non-base view image.
The decoding apparatus 110 (fig. 20) can function as the decoding unit 612 and the decoding unit 613 of the multi-view image decoding apparatus 610. That is, the IntraBC coding efficiency can be improved. Also, the decoding unit 612 and the decoding unit 613 may perform decoding (or share flags and parameters) using the same flags and parameters (e.g., syntax elements related to processing between pictures) between two decoding units. Therefore, a reduction in coding efficiency can be prevented.
<4. Fourth embodiment>
(application to layered image coding and layered image decoding)
The above-described series of processes can be applied to layered image encoding and layered image decoding (scalable encoding and scalable decoding). Fig. 30 shows an example of a layered image encoding method.
Hierarchical image coding (scalable coding) is performed to divide an image into layers (layering), and the layers are coded layer by layer in such a manner that predetermined parameters have a scalable function. The layered image decoding (scalable decoding) is decoding corresponding to layered image encoding.
As shown in fig. 30, in layering images, a single image is divided into a plurality of images (layers) with reference to a predetermined parameter having a scalable function. That is, the layered image includes images of a plurality of layers whose values of the predetermined parameter differ from one another. The layers of the layered image include a base layer, which is encoded/decoded using only images of its own layer without using images of other layers, and non-base layers (also referred to as enhancement layers), which are encoded/decoded using images of other layers. A non-base layer may be processed using images of the base layer, or using images of other non-base layers.
In general, a non-base layer is formed with data of a difference image between its own image and an image of another layer, in order to reduce redundancy. For example, in the case where an image is divided into two layers, i.e., a base layer and a non-base layer (also referred to as an enhancement layer), an image of lower quality than the original image is obtained when only the data of the base layer is used, and the original image (or a high-quality image) is obtained when the data of the base layer and the data of the non-base layer are combined.
Since images are layered in this manner, images having various qualities can be easily obtained according to circumstances. For a terminal having a low processing capability, for example, a portable telephone, only image compression information on the base layer is transmitted, so that a moving image having a low spatial and temporal resolution or poor image quality is reproduced. For a terminal having high processing capability (e.g., a television set or a personal computer), only image compression information on the base layer and the enhancement layer is transmitted, so that a moving image having high spatial and temporal resolution or high image quality can be reproduced. In this way, image compression information according to the capability of the terminal or the network can be transmitted from the server without any transcoding process.
In the case of encoding/decoding the example of the layered image shown in fig. 30, images of respective layers are encoded/decoded, and the method according to the above-described first embodiment may be applied to encoding/decoding the respective layers. In this way, the coding efficiency of IntraBC can be improved. Thus, coding efficiency is increased.
Further, the flags and parameters used in the method according to the first embodiment described above may be shared when encoding/decoding the respective layers. More specifically, VPS, SPS, PPS, etc., which are encoding information, may be shared when encoding/decoding the corresponding layer. Necessary information other than those parameter sets can of course be shared when encoding/decoding the respective layers.
In this way, transmission of redundant information can be prevented, and the amount of information to be transmitted (bit rate) can be reduced (or reduction in encoding efficiency can be prevented).
(Expandable parameters)
In such layered image encoding and layered image decoding (scalable encoding and scalable decoding), parameters having a scalable function are used as needed. For example, the spatial resolution shown in fig. 31 can be used as such a parameter (spatial scalability). With this spatial scalability, the image resolution varies between layers. Specifically, in this case, each picture is divided into two layers, i.e., a base layer having a spatial resolution lower than that of the original image and an enhancement layer that can achieve the original spatial resolution when combined with the base layer, as shown in fig. 31. This number of layers is of course only an example, and each picture may be layered into any suitable number of layers.
Alternatively, for example, the parameter having such scalability may be a time resolution (time scalability) as shown in fig. 32. In the case of this temporal scalability, the frame rate varies between layers. That is, in this case, each picture is divided into two layers, i.e., a base layer having a frame rate lower than that of the original moving image and an enhancement layer that can realize the original frame rate when combined with the base layer, as shown in fig. 32. This number of layers is of course only an example and each picture may be layered into any suitable number of layers.
Further, a parameter with such scalability may be, for example, a signal-to-noise ratio (SNR) (SNR scalability). With this SNR scalability, the SN ratio varies between layers. Specifically, in this case, each picture is divided into two layers, i.e., a base layer having a lower SNR than that of the original image and an enhancement layer that can achieve the original SNR when combined with the base layer, as shown in fig. 33. This number of layers is of course only an example and each picture may be divided into any suitable number of layers.
Some other parameters than the above parameters may of course be used as parameters with scalability. For example, the bit depth may be used as a parameter with scalability (bit depth scalability). With this bit depth scalability, the bit depth varies between layers. In this case, for example, the base layer is formed with an 8-bit image, and the enhancement layer is added to the base layer to obtain a 10-bit image.
Alternatively, the chroma format may be used as a parameter having scalability (chroma scalability). In the case of this chroma scalability, the chroma format changes between layers. In this case, for example, the base layer is formed with a component image having a 4:2:0 format, and an enhancement layer is added to the base layer to obtain a component image having a 4:2:2 format.
(layered image encoding apparatus)
Fig. 34 is a diagram showing a layered image encoding apparatus that performs the above-described layered image encoding. As shown in fig. 34, the layered
The
The encoding apparatus 10 (fig. 13) can function as the
(layered image decoding apparatus)
Fig. 35 is a diagram showing a layered image decoding apparatus that performs the above-described layered image decoding. As shown in fig. 35, the layered
The
The decoding apparatus 110 (fig. 20) can function as the
< fifth embodiment >
(example configuration of television apparatus)
Fig. 36 schematically shows an example configuration of a television apparatus to which the present disclosure is applied. The
The
The
The
The video
The
The audio
The
The
The
It should be noted that, in the
In the television apparatus designed as above, the
< sixth embodiment >
(example configuration of Portable telephone device)
Fig. 37 schematically shows an example configuration of a portable telephone device to which the present disclosure is applied. The
Also, an
The
In the audio communication mode, an audio signal generated at the
In the case of performing mail transmission in the data communication mode, the
Note that the
In the case of transmitting image data in the data communication mode, the image data generated on the camera unit 926 is supplied to the
The multiplexing/
In the portable telephone device designed as above, the
< seventh embodiment >
(example configuration of recording/reproducing apparatus)
Fig. 38 schematically shows an example configuration of a recording/reproducing apparatus to which the present disclosure is applied. For example, the recording/reproducing
The recording/reproducing
The
The
The
The
The
The
The
The
The
The
In the recording/reproducing apparatus designed as above, the
< eighth embodiment >
(example configuration of image Forming apparatus)
Fig. 39 schematically shows an example configuration of an imaging apparatus to which the present disclosure is applied. The
The
The
The camera
The image
The OSD unit 969 generates display data, such as a menu screen or icons formed with symbols, characters, or numbers, and outputs such data to the image
The
The recording medium driven by the media drive 968 may be a readable/rewritable removable medium such as a magnetic disk, a magneto-optical disk, an optical disk, or a semiconductor memory. Further, the recording medium may be any type of removable medium, and may be a tape device, a magnetic disk, or a memory card. The recording medium may of course be a contactless Integrated Circuit (IC) card or the like.
Alternatively, the media drive 968 and the recording medium may be integrated, and may be formed with a fixed storage medium, for example, an internal hard disk drive or a Solid State Drive (SSD).
The
In the imaging apparatus designed as above, the image
< example application of scalable coding >
(first System)
Next, specific examples of the use of scalable encoded data generated by scalable coding (layered coding) are described. As shown in the example of fig. 40, scalable coding is used, for example, in selecting data to be transmitted.
In the data transmission system 1000 shown in fig. 40, a distribution server 1002 reads scalable encoded data stored in a scalable encoded data storage unit 1001 and distributes the scalable encoded data to a terminal, for example, a personal computer 1004, an Audio Visual (AV) device 1005, a tablet device 1006, or a portable telephone device 1007 through a network 1003.
In doing so, the distribution server 1002 selects and transmits encoded data of appropriate quality according to the capability of the terminal device, the communication environment, and the like. If the distribution server 1002 transmits data of unnecessarily high quality, the terminal apparatus does not necessarily obtain a high-quality image, and such transmission may cause delay or overflow. Such data may also unnecessarily occupy the communication band or unnecessarily increase the load on the terminal apparatus. Conversely, if the distribution server 1002 transmits data of unnecessarily low quality, the terminal apparatus may not be able to obtain an image of sufficient quality. Therefore, the distribution server 1002 reads the scalable encoded data stored in the scalable encoded data storage unit 1001 and transmits it as encoded data of a quality appropriate to the capability of the terminal device, the communication environment, and the like.
For example, the scalable encoded data storage unit 1001 stores scalable encoded data (BL + EL)1011 that is scalable encoded. Scalable encoded data (BL + EL)1011 is encoded data containing a base layer and an enhancement layer, and at the time of decoding, can provide a picture of the base layer and a picture of the enhancement layer.
The distribution server 1002 selects an appropriate layer according to the capability of the terminal device to which data is to be transmitted, the communication environment, and the like, and reads the data of that layer. For example, for the personal computer 1004 and the tablet device 1006, which have high processing capability, the distribution server 1002 reads the high-quality scalable encoded data (BL + EL) 1011 from the scalable encoded data storage unit 1001 and transmits it as it is. For the AV device 1005 and the portable telephone device 1007, which have low processing capability, on the other hand, the distribution server 1002 extracts the data of the base layer from the scalable encoded data (BL + EL) 1011 and transmits it as scalable encoded data (BL) 1012, which has the same content as the scalable encoded data (BL + EL) 1011 but is of lower quality.
When scalable encoded data is used in this way, the amount of data can be easily adjusted. Therefore, delay and overflow can be prevented, and the load on the terminal device or the communication medium can be prevented from increasing unnecessarily. Also, the scalable encoded data (BL + EL) 1011 has reduced redundancy between layers, so its data amount can be made smaller than in the case where the encoded data of each layer is handled as separate data. Therefore, the storage area of the scalable encoded data storage unit 1001 can be used more efficiently.
It should be noted that various devices, from the personal computer 1004 to the portable telephone device 1007, can be used as terminal devices, and hardware performance therefore varies between terminal devices. Software capability also varies with the application to be executed by the terminal apparatus. Further, the network 1003 serving as the communication medium may be a wired and/or wireless communication network, such as the Internet or a local area network (LAN), or any other communication network system, and its data transmission capability varies. The data transmission capability may further vary with other traffic and the like.
In view of this, the distribution server 1002 may communicate with the terminal apparatus that is the data transmission destination before starting data transmission, to obtain information about the capability of the terminal apparatus (for example, its hardware performance and the performance of the application (software) it executes) and information about the communication environment (for example, the available bandwidth of the network 1003). Based on the information obtained in this manner, the distribution server 1002 may then select an appropriate layer.
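As a concrete illustration of this selection step, here is a minimal sketch assuming hypothetical terminal categories and a made-up bandwidth threshold; it shows the shape of the decision described above, not the actual logic of the distribution server 1002.

```python
def select_layers(terminal: str, bandwidth_kbps: int) -> str:
    """Return which layers of the stored scalable encoded data to send.

    "BL+EL" stands for the scalable encoded data (BL + EL) 1011;
    "BL" stands for the extracted base-layer data (BL) 1012.
    Terminal names and the threshold are assumptions for illustration.
    """
    HIGH_CAPABILITY = {"personal_computer", "tablet"}
    EL_THRESHOLD_KBPS = 2000  # assumed minimum bandwidth for the enhancement layer

    if terminal in HIGH_CAPABILITY and bandwidth_kbps >= EL_THRESHOLD_KBPS:
        return "BL+EL"  # read the high-quality data and transmit it as it is
    return "BL"         # extract and transmit only the base layer

print(select_layers("tablet", 5000))             # -> BL+EL
print(select_layers("portable_telephone", 500))  # -> BL
```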
It should be noted that the layer extraction may be performed within the terminal device. For example, the personal computer 1004 may decode the transmitted scalable encoded data (BL + EL)1011 and display an image of the base layer and an image of the enhancement layer. Also, the personal computer 1004 may extract the base layer scalable encoded data (BL)1012 from the transmitted scalable encoded data (BL + EL)1011, transmit the scalable encoded data (BL)1012 to another device, or decode the scalable encoded data (BL)1012 to display an image of the base layer.
The numbers of scalable encoded data storage units 1001, distribution servers 1002, networks 1003, and terminal apparatuses may be determined as necessary. Also, in the above example, the distribution server 1002 transmits data to the terminal apparatus, but the use example is not limited thereto. The data transmission system 1000 may be any system that, when transmitting scalable encoded data to a terminal device, selects an appropriate layer according to the capability of the terminal device, the communication environment, and the like.
(second System)
As shown in the example of fig. 41, scalable coding is also used for transmission over communication media.
In the
The
In accordance with a user instruction or the like, the
Also, in accordance with a user instruction or the like, the
As described above, scalable encoded data can be transmitted through a different communication medium for each layer, for example. The load can therefore be distributed, and delay and overflow can be prevented.
Also, the communication medium for transmission may be selected for each layer, as appropriate. For example, scalable encoded data (BL)1121 of a base layer having a large data amount may be transmitted through a communication medium having a wide bandwidth, and scalable encoded data (EL)1122 of an enhancement layer having a small data amount may be transmitted through a communication medium having a narrow bandwidth. Also, the communication medium used to transmit the enhancement layer scalable encoded data (EL)1122 can be switched between the
When the control is performed in this manner, an increase in load due to data transmission can be further reduced.
The number of layers may of course be determined as appropriate, and the number of communication media used for transmission may also be determined as appropriate. The number of
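The per-layer routing described in this second system can be sketched as follows; the medium names, the assumed enhancement-layer bitrate, and the deferral behavior are illustrative assumptions, not the actual control performed by the system of fig. 41.

```python
def route_layers(network_bandwidth_kbps: int) -> dict:
    """Assign each layer to a communication medium, in the spirit of the
    scheme above: the base layer (BL) 1121 goes over a wide-bandwidth
    medium (e.g., broadcasting), while the enhancement layer (EL) 1122
    is sent over the network only when bandwidth allows."""
    EL_BITRATE_KBPS = 1000  # assumed enhancement-layer bitrate
    routes = {"BL": "broadcast"}
    if network_bandwidth_kbps >= EL_BITRATE_KBPS:
        routes["EL"] = "network"
    else:
        routes["EL"] = "deferred"  # hold back until bandwidth is available
    return routes

print(route_layers(1500))  # {'BL': 'broadcast', 'EL': 'network'}
print(route_layers(300))   # {'BL': 'broadcast', 'EL': 'deferred'}
```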
(third System)
As shown in the example of fig. 42, scalable coding is also used in storing encoded data.
In the imaging system 1200 shown in fig. 42, the imaging apparatus 1201 performs scalable encoding on image data obtained by photographing the object 1211, and supplies the result as scalable encoded data (BL + EL) 1221 to the scalable encoded data storage device 1202.
The scalable encoded data storage device 1202 stores the scalable encoded data (BL + EL) 1221 supplied from the imaging apparatus 1201 at a quality level appropriate to the situation. For example, at normal time, the scalable encoded data storage device 1202 extracts the data of the base layer from the scalable encoded data (BL + EL) 1221 and stores it as base-layer scalable encoded data (BL) 1222 of low quality and a small data amount. At observation time, on the other hand, the scalable encoded data storage device 1202 stores the high-quality scalable encoded data (BL + EL) 1221, which has a large data amount, as it is.
In this way, the scalable encoded data storage device 1202 can store images at high quality only when necessary. Accordingly, an increase in the data amount can be prevented while a decrease in the value of each image due to image quality degradation is avoided, and the availability of the storage area can be improved.
The imaging apparatus 1201 is, for example, a monitoring camera. In a case where no monitoring target (e.g., an intruder) appears in the captured image (at normal time), the content of the image is very likely to be unimportant, so priority is given to reducing the data amount, and the image data (scalable encoded data) is stored at a low quality level. In a case where the monitoring target appears in the captured image as the object 1211 (at observation time), on the other hand, the content of the image is very likely to be important, so priority is given to image quality, and the image data (scalable encoded data) is stored at a high quality level.
It should be noted that the scalable encoded data storage device 1202 may determine whether the current time is the normal time or the observation time by analyzing the image, for example. Alternatively, the imaging apparatus 1201 may make the determination and transmit the determination result to the scalable encoded data storage device 1202.
It should be noted that any suitable criterion may be used for determining whether the current time is the normal time or the observation time, and the aspect of the image content to be used as the criterion may be determined as appropriate. Conditions other than the content of the image may of course be used as the determination criterion. For example, switching between the normal time and the observation time may be performed according to the volume or waveform of recorded sound, at predetermined time intervals, or according to an external instruction (e.g., a user instruction).
Further, in the above example, switching is performed between two states, the normal time and the observation time, but the number of states may be determined as appropriate. For example, switching may be performed among three or more states, such as a normal time, a low-level observation time, an observation time, and a high-level observation time. However, the upper limit of the number of switchable states depends on the number of layers in the scalable encoded data.
Also, the imaging apparatus 1201 may determine the number of layers in scalable coding according to the situation. For example, at normal time, the imaging apparatus 1201 may generate base-layer scalable encoded data (BL) 1222 of low quality and a small data amount, and supply it to the scalable encoded data storage device 1202. At observation time, on the other hand, the imaging apparatus 1201 may generate scalable encoded data (BL + EL) 1221 of high quality and a large data amount, and supply it to the scalable encoded data storage device 1202.
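The switching just described can be summarized in a short sketch; the function name, the boolean trigger, and the byte-level handling are hypothetical simplifications of the normal-time/observation-time behavior above.

```python
def store_surveillance_frame(observation_time: bool,
                             data_bl: bytes, data_el: bytes) -> bytes:
    """Decide what to store for one frame, following the scheme above.

    At normal time only the base layer (BL) 1222 is kept; at observation
    time (e.g., a monitoring target is in the frame) the full scalable
    encoded data (BL + EL) 1221 is kept as it is.
    """
    if observation_time:
        return data_bl + data_el  # image content likely important: keep quality
    return data_bl                # image content likely unimportant: save space
```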
Although a monitoring camera is described in the above example, the use of the imaging apparatus 1201 may be determined as appropriate and is not limited to a monitoring camera.
< other examples of embodiment >
Although examples of devices, systems, and the like to which the present disclosure is applied are described above, the present disclosure is not limited to them, and may also be implemented as any structure mounted on such a device or on a device within such a system: for example, as a processor serving as a system Large Scale Integration (LSI) or the like, a module using such processors, a unit using such modules, or a set in which other functions are further added to the unit (that is, as a structure forming part of a device).
(example configuration of video machine)
Now, with reference to fig. 43, an example case where the present disclosure is implemented as such a set is described. Fig. 43 schematically shows an example configuration of a video machine to which the present disclosure is applied.
In recent years, electronic devices have become multifunctional. In their development and manufacture, when a part of the structure of such a device is sold or provided, it is now common not only to provide it as a structure having a single function, but in many cases to combine structures having related functions and provide them as one set having multiple functions.
The
As shown in fig. 43,
A module is an element formed by integrating the functions of several mutually related elements. Although its physical structure is not limited, a module may be formed, for example, by placing electronic circuit elements having respective functions, such as processors, resistors, and capacitors, on a wiring board or the like so as to integrate them. A new module may also be formed by combining one module with another module, a processor, or the like.
In the case of the example shown in fig. 43, the
A processor is formed by integrating structures having predetermined functions into a semiconductor chip as a system on chip (SoC); some processors are called, for example, system Large Scale Integrations (LSIs). The structure having a predetermined function may be a logic circuit (hardware structure), may be a CPU, a ROM, and a RAM together with a program executed by them (software structure), or may be a combination of the two. For example, a processor may include logic circuits as well as a CPU, a ROM, and a RAM, with some functions realized by the logic circuits (hardware structure) and the remaining functions realized by programs executed by the CPU (software structure).
An
The
The
The
It should be noted that
The
The
The front-
The
The
For example, the
The
The
The
The structure described above as a module may be embodied as a processor, and the structure described above as a processor may be embodied as a module.
In the
(example configuration of video processor)
Fig. 44 schematically shows an example configuration of a video processor 1332 (fig. 43) to which the present disclosure is applied.
In the case of the example shown in fig. 44, the
As shown in fig. 44, the
For example, the video input processing unit 1401 acquires a video signal input from the connection device 1321 (fig. 43), and converts the video signal into digital image data. The first image enlargement/reduction unit 1402 performs format conversion, image enlargement/reduction processing, and the like on the image data. The second image enlargement/reduction unit 1403 performs image enlargement/reduction processing on the image data according to the format at the output destination by the video output processing unit 1404, or performs format conversion, image enlargement/reduction processing, and the like as with the first image enlargement/reduction unit 1402. The video output processing unit 1404 performs format conversion, conversion into an analog signal, and the like on the image data, and outputs the result as a reproduced video signal to the connection device 1321 (fig. 43), for example.
The frame memory 1405 is an image data memory shared among the video input processing unit 1401, the first image enlargement/reduction unit 1402, the second image enlargement/reduction unit 1403, the video output processing unit 1404, and the encoding/decoding engine 1407. The frame memory 1405 is designed as a semiconductor memory, for example, a DRAM.
The memory control unit 1406 receives the synchronization signal from the encoding/decoding engine 1407 and controls write and read accesses to the frame memory 1405 in accordance with an access plan to the frame memory 1405 written in the access management table 1406A. The access management table 1406A is updated by the memory control unit 1406 according to processing performed by the encoding/decoding engine 1407, the first image enlarging/reducing unit 1402, the second image enlarging/reducing unit 1403, and the like.
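To illustrate the role of the memory control unit 1406 and the access management table 1406A, here is a minimal sketch; the class, the method names, and the table's representation are invented simplifications of the access scheduling described above.

```python
class MemoryController:
    """Grants frame-memory accesses only as scheduled in an access
    management table, in the spirit of the memory control unit 1406
    and the access management table 1406A (hypothetical simplification)."""

    def __init__(self):
        self.access_table = {}  # unit name -> set of allowed operations

    def update_plan(self, unit: str, operations: set) -> None:
        # Updated as processing progresses in the encoding/decoding
        # engine, the enlargement/reduction units, and so on.
        self.access_table[unit] = operations

    def request(self, unit: str, operation: str) -> bool:
        # A unit's read or write proceeds only if the table permits it.
        return operation in self.access_table.get(unit, set())

mc = MemoryController()
mc.update_plan("encode_decode_engine", {"read", "write"})
print(mc.request("encode_decode_engine", "write"))  # True
print(mc.request("video_input", "write"))           # False
```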
The encoding/decoding engine 1407 performs image data encoding processing and processing of decoding a video stream that is data generated by encoding image data. For example, the encoding/decoding engine 1407 encodes the image data read from the frame memory 1405 and sequentially writes the encoded image data as a video stream into the video ES buffer 1408A. Also, for example, the encoding/decoding engine 1407 sequentially reads and decodes the video stream from the video ES buffer 1408B, and sequentially writes the decoded video stream as image data into the frame memory 1405. At the time of encoding and decoding, the encoding/decoding engine 1407 uses the frame memory 1405 as a work area. For example, at the time of starting processing of a macroblock, the encoding/decoding engine 1407 also outputs a synchronization signal to the memory control unit 1406.
The video ES buffer 1408A buffers the video stream generated by the encoding/decoding engine 1407 and provides the video stream to a Multiplexer (MUX) 1412. The video ES buffer 1408B buffers a video stream supplied from a Demultiplexer (DMUX)1413 and supplies the video stream to the encoding/decoding engine 1407.
The audio ES buffer 1409A buffers the audio stream generated by the audio encoder 1410 and supplies the audio stream to a Multiplexer (MUX) 1412. The audio ES buffer 1409B buffers an audio stream supplied from a Demultiplexer (DMUX)1413 and supplies the audio stream to an audio decoder 1411.
For example, the audio encoder 1410 performs digital conversion on an audio signal input from the connection device 1321 (fig. 43) or the like, and encodes the audio signal by a predetermined method, for example, an MPEG audio method or AudioCode number 3 (AC3). The audio encoder 1410 sequentially writes the audio stream, which is data generated by encoding the audio signal, into the audio ES buffer 1409A. The audio decoder 1411 decodes the audio stream supplied from the audio ES buffer 1409B, performs conversion into an analog signal, for example, and supplies the result as a reproduced audio signal to the connection device 1321 (fig. 43) or the like.
A Multiplexer (MUX)1412 multiplexes the video stream and the audio stream. Any method may be used in this multiplexing (or any format may be used for the bit stream generated by the multiplexing). In this multiplexing, a Multiplexer (MUX)1412 may also add predetermined header information and the like to the bit stream. That is, the Multiplexer (MUX)1412 may convert the stream format by performing multiplexing. For example, a Multiplexer (MUX)1412 multiplexes a video stream and an audio stream to convert the format into a transport stream as a bit stream having a format for transmission. Also, a Multiplexer (MUX)1412 multiplexes the video stream and the audio stream to perform conversion into data of a file format (file data) for recording, for example.
The Demultiplexer (DMUX)1413 demultiplexes a bitstream generated by multiplexing a video stream and an audio stream by a method compatible with multiplexing performed by a Multiplexer (MUX) 1412. Specifically, a Demultiplexer (DMUX)1413 extracts a video stream and an audio stream from (or separates) the bit stream read from the stream buffer 1414. That is, the Demultiplexer (DMUX)1413 can convert the stream format by performing demultiplexing (inverse conversion of the conversion performed by the Multiplexer (MUX) 1412). For example, the Demultiplexer (DMUX)1413 acquires a transport stream supplied from, for example, the
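As a toy illustration of the format conversion performed in this multiplexing and demultiplexing, the following sketch interleaves two elementary streams behind an invented fixed header; the layout is purely illustrative and is not the MPEG-2 transport stream or any real container format.

```python
def mux(video_es: bytes, audio_es: bytes) -> bytes:
    """Multiplex two elementary streams into one bit stream with
    predetermined header information, as the Multiplexer (MUX) 1412
    does conceptually (the header layout here is invented)."""
    header = len(video_es).to_bytes(4, "big") + len(audio_es).to_bytes(4, "big")
    return header + video_es + audio_es

def demux(bitstream: bytes) -> tuple:
    """Inverse conversion, conceptually like the Demultiplexer (DMUX) 1413:
    extract (separate) the video stream and the audio stream."""
    video_len = int.from_bytes(bitstream[0:4], "big")
    audio_len = int.from_bytes(bitstream[4:8], "big")
    video_es = bitstream[8:8 + video_len]
    audio_es = bitstream[8 + video_len:8 + video_len + audio_len]
    return video_es, audio_es

stream = mux(b"VIDEO-ES", b"AUDIO-ES")
assert demux(stream) == (b"VIDEO-ES", b"AUDIO-ES")
```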
The stream buffer 1414 buffers the bit stream. For example, the stream buffer 1414 buffers a transport stream supplied from a Multiplexer (MUX)1412 and supplies the transport stream to, for example, a
Also, for example, the stream buffer 1414 buffers file data supplied from a Multiplexer (MUX)1412, and supplies the file data to a connection device 1321 (fig. 43) or the like at a predetermined time or in response to a request from the outside or the like to record the file data into any type of recording medium.
Further, the stream buffer 1414 buffers a transport stream obtained through, for example, the
Also, for example, the stream buffer 1414 buffers file data read from any type of recording medium in the connection device 1321 (fig. 43), and supplies the file data to a Demultiplexer (DMUX)1413 at a predetermined time or in response to a request from the outside or the like.
Next, an example operation of the
Meanwhile, an audio signal input into the
The video stream in the video ES buffer 1408A and the audio stream in the audio ES buffer 1409A are read into a Multiplexer (MUX)1412, and then multiplexed and converted into a transport stream or file data or the like. The transport stream generated by the Multiplexer (MUX)1412 is buffered by the stream buffer 1414 and then output to an external network through the
Meanwhile, for example, a transport stream input into the
The audio stream is supplied to the audio decoder 1411 through the audio ES buffer 1409B and then decoded to reproduce an audio signal. Meanwhile, the video stream is written into the video ES buffer 1408B, then sequentially read and decoded by the encoding/decoding engine 1407, and written into the frame memory 1405. The decoded image data is subjected to enlargement/reduction processing by the second image enlargement/reduction unit 1403, and written in the frame memory 1405. Then, the decoded image data is read into the video output processing unit 1404, subjected to format conversion to a predetermined format (for example, 4:2:2Y/Cb/Cr format), and further converted into an analog signal, so that a video signal is reproduced and output.
In the case where the present disclosure is applied to the
It should be noted that, in the encoding/decoding engine 1407, the present disclosure (or the functions of the image encoding apparatus and the image decoding apparatus according to one of the above-described embodiments) may be embodied by hardware (e.g., logic circuits), may be embodied by software (e.g., an embedded program), or may be embodied by hardware and software.
(Another example configuration of video processor)
Fig. 45 schematically illustrates another example configuration of a video processor 1332 (fig. 43) to which the present disclosure is applied. In the case of the example shown in fig. 45, the
More specifically, as shown in fig. 45, the
The
As shown in fig. 45, for example, the
Under the control of the
Under the control of the
Under the control of the
The
The
In the example shown in fig. 45, the
The MPEG-2
MPEG-DASH 1551 is a functional block for transmitting and receiving image data by MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH). MPEG-DASH is a technique for transmitting video using the Hypertext Transfer Protocol (HTTP), and one of its features is that an appropriate piece of encoded data is selected and transmitted, segment by segment, from among prepared pieces of encoded data that differ from one another in resolution and the like. MPEG-DASH 1551 generates a stream conforming to the standard and controls the transmission of the stream and the like. For the encoding/decoding of the image data, MPEG-DASH 1551 uses the above-described MPEG-2
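The per-segment selection that characterizes MPEG-DASH can be sketched as follows; the representation list and the bandwidth figures are made-up examples, and no actual MPD parsing or HTTP transfer is shown.

```python
def pick_representation(available_kbps: float, representations: list) -> dict:
    """For one segment, pick the highest-bitrate representation that
    fits the currently measured bandwidth; fall back to the lowest
    if none fits. This mirrors the segment-by-segment selection
    described above, not any particular DASH client implementation."""
    feasible = [r for r in representations if r["bitrate_kbps"] <= available_kbps]
    if not feasible:
        return min(representations, key=lambda r: r["bitrate_kbps"])
    return max(feasible, key=lambda r: r["bitrate_kbps"])

reps = [
    {"resolution": "640x360", "bitrate_kbps": 800},
    {"resolution": "1280x720", "bitrate_kbps": 2400},
    {"resolution": "1920x1080", "bitrate_kbps": 5000},
]
print(pick_representation(3000, reps))  # -> the 1280x720 representation
```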
The
A multiplexer/demultiplexer (MUX DMUX) 1518 multiplexes and demultiplexes various data related to images, such as bit streams of encoded data, image data, and video signals. Any method may be used for this multiplexing/demultiplexing. For example, at the time of multiplexing, the multiplexer/demultiplexer (MUX DMUX) 1518 can not only gather pieces of data into one piece but also add predetermined header information or the like to the data. At the time of demultiplexing, the multiplexer/demultiplexer (MUX DMUX) 1518 can not only divide one piece of data into multiple pieces but also add predetermined header information or the like to each divided piece. That is, the multiplexer/demultiplexer (MUX DMUX) 1518 can convert the data format by multiplexing/demultiplexing. For example, by multiplexing bit streams, the multiplexer/demultiplexer (MUX DMUX) 1518 can convert them into a transport stream, which is a bit stream in a format for transmission, or into data in a file format for recording (file data). The inverse conversion can of course also be performed by demultiplexing.
For example, the
Next, an example operation of the
Further, file data of encoded data generated by encoding image data and read from a recording medium (not shown) by the connection device 1321 (fig. 43) or the like is supplied to a multiplexer/demultiplexer (MUX DMUX)1518 through the
It should be noted that the exchange of image data and other data is performed between respective processing units in the
In the case where the present disclosure is applied to the
It should be noted that, in the
Although two example configurations of
(example application of the device)
The
It should be noted that even one element within the
That is, as with
It should be noted that, in this specification, examples are described in which various pieces of information (for example, the VPS and SPS) are multiplexed with encoded data and transmitted from the encoding side to the decoding side. However, the method of transmitting such information is not limited to these examples. For example, the information may be transmitted or recorded as separate data associated with the encoded data, without being multiplexed with it. Here, the term "associated" means that an image contained in the bit stream (which may be part of an image, such as a slice or a block) can be linked to the information corresponding to that image at the time of decoding. That is, the information may be transmitted through a transmission path different from that of the encoded data. Alternatively, the information may be recorded in a recording medium different from that of the encoded data (or in a different recording area of the same recording medium). Also, the information and the encoded data may be associated with each other in any units, for example, in units of several frames, one frame, or a portion of a frame.
Further, in the present specification, a system denotes an assembly of elements (a device, a module (a part), and the like), and it is not necessary that all the elements are provided in the same housing. In view of this, devices accommodated in different housings and connected to each other through a network form a system, and one device having modules accommodated in one housing is also a system.
The advantageous effects described in this specification are merely examples, and the advantageous effects of the present technology are not limited thereto, and other effects may be included.
It should be noted that the embodiments of the present disclosure are not limited to the above-described embodiments, and various modifications may be made thereto without departing from the scope of the present disclosure.
For example, the present disclosure may also be applied to encoding devices and decoding devices that use encoding methods other than HEVC and can perform transform skip.
The present disclosure may also be applied to an encoding apparatus and a decoding apparatus for receiving an encoded stream through a network medium (e.g., satellite broadcasting, cable television, internet, or portable phone), or for processing an encoded stream within a storage medium (e.g., an optical or magnetic disk or a flash memory).
Further, the present disclosure may be implemented in a cloud computing configuration in which one function is shared among devices through a network, and processing is performed by the devices cooperating with each other.
Further, the respective steps described with reference to the flowcharts described above may be performed by one apparatus or may be shared among apparatuses.
In the case where more than one process is included in one step, the processes included in the step may be performed by one device or may be shared among devices.
Although the preferred embodiments of the present disclosure are described above with reference to the drawings, the present disclosure is not limited to those examples. It is apparent that those skilled in the art can make various changes or modifications within the technical spirit claimed herein, and it is understood that those changes or modifications are within the technical scope of the present disclosure.
It should be noted that the present technology can also be implemented in the structure described below.
(1) An image encoding device comprising:
a setting unit configured to set control information for controlling intra-picture motion prediction across slices;
an encoding unit configured to encode an image to generate a bit stream according to the control information set by the setting unit; and
a transmission unit configured to transmit the control information set by the setting unit and the bit stream generated by the encoding unit.
(2) The image encoding apparatus according to (1), wherein the setting unit sets, as the control information, a reference permission flag indicating that reference to decoding results of a current slice and slices preceding the current slice in raster scan order is permitted.
(3) The image encoding apparatus according to (2), wherein the setting unit sets a reference permission flag within a Picture Parameter Set (PPS).
(4) The image encoding apparatus according to any one of (1) to (3), wherein the setting unit sets a parameter representing the number of previous slices having a referable decoding result as the control information.
(5) The image encoding apparatus according to any one of (1) to (4), wherein the setting unit sets an on/off flag as the control information, the on/off flag indicating whether intra-picture motion prediction across slices can be performed.
(6) The image encoding apparatus according to (5), wherein the setting unit sets an on/off flag within one of a Sequence Parameter Set (SPS) and a Video Parameter Set (VPS).
(7) The image encoding apparatus according to any one of (2) to (6), wherein the setting unit limits a range of reference and sets the reference permission flag when Wavefront Parallel Processing (WPP) is "on".
(8) The image encoding apparatus according to any one of (2) to (6), wherein the setting unit sets the reference permission flag when tile division is "on".
(9) The image encoding apparatus according to any one of (2) to (6), wherein the setting unit sets the reference permission flag to "off" when a temporal motion constrained tile set SEI (MCTS-SEI) is "on".
(10) An image encoding method implemented by an image encoding apparatus,
the image encoding method includes:
setting control information for controlling cross-slice intra-picture motion prediction;
encoding the image to generate a bit stream according to the set control information; and is
The set control information and the generated bit stream are transmitted.
(11) An image decoding apparatus comprising:
a receiving unit configured to receive a bit stream generated by encoding an image;
an extraction unit configured to extract control information from the bitstream received by the reception unit, the control information being designed to control intra-picture motion prediction across slices; and
a decoding unit configured to decode the bitstream received by the receiving unit using the control information extracted by the extracting unit to generate an image.
(12) The image decoding apparatus according to (11), wherein the extraction unit extracts, as the control information, a reference permission flag indicating that reference to decoding results of a current slice and slices preceding the current slice in raster scan order is permitted.
(13) The image decoding apparatus according to (12), wherein the extraction unit extracts the reference permission flag from a Picture Parameter Set (PPS).
(14) The image decoding apparatus according to any one of (11) to (13), wherein the extraction unit extracts, as the control information, a parameter indicating the number of previous slices having a referable decoding result.
(15) The image decoding apparatus according to any one of (11) to (14), wherein the extraction unit extracts, as the control information, an on/off flag indicating whether cross-slice intra-picture motion prediction can be performed.
(16) The image decoding apparatus of (15), wherein the extraction unit extracts an on/off flag from one of a Sequence Parameter Set (SPS) and a Video Parameter Set (VPS).
(17) The image decoding apparatus according to any one of (12) to (16), wherein the extraction unit limits a range of reference and extracts the reference permission flag when Wavefront Parallel Processing (WPP) is "on".
(18) The image decoding apparatus according to any one of (12) to (16), wherein the extraction unit extracts the reference permission flag when tile division is "on".
(19) The image decoding apparatus according to any one of (12) to (16), wherein the extraction unit extracts the reference permission flag set to "off" when a temporal motion constrained tile set SEI (MCTS-SEI) is "on".
(20) An image decoding method implemented by an image decoding apparatus,
the image decoding method includes:
receiving a bit stream generated by encoding an image;
extracting control information from the received bitstream, the control information designed to control intra-picture motion prediction across slices; and is
Using the extracted control information, the received bit stream is decoded to generate an image.
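To make the control information enumerated in (1) through (20) concrete, the following is a minimal sketch of how a setting unit might choose the flags; the field names are illustrative stand-ins, not actual HEVC syntax elements, and the chosen values merely trace the conditions of items (5) through (9).

```python
from dataclasses import dataclass

@dataclass
class ControlInfo:
    """Hypothetical container for the control information above."""
    sps_cross_slice_intrabc_enabled: bool  # on/off flag of items (5)/(6)
    pps_reference_permission_flag: bool    # reference permission flag of items (2)/(3)
    num_referable_prev_slices: int         # parameter of item (4)

def set_control_info(wpp_on: bool, tile_division_on: bool,
                     mcts_sei_on: bool,
                     num_prev_slices: int = 1) -> ControlInfo:
    """Sketch of the setting unit's behavior under items (7) to (9);
    the numeric values are illustrative."""
    if mcts_sei_on:
        # Item (9): the reference permission flag is set to "off".
        return ControlInfo(True, False, 0)
    if wpp_on:
        # Item (7): the flag is set, with the reference range limited
        # (the range limitation itself is not modeled here).
        return ControlInfo(True, True, num_prev_slices)
    if tile_division_on:
        # Item (8): the reference permission flag is set.
        return ControlInfo(True, True, num_prev_slices)
    return ControlInfo(True, True, num_prev_slices)

print(set_control_info(wpp_on=True, tile_division_on=False, mcts_sei_on=False))
```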
Description of the symbols
10: encoding apparatus
11: setting unit
12: coding unit
13: transmission unit
46: intra prediction unit
110: decoding apparatus
111: receiving unit
112: extraction unit
113: decoding unit
143: intra prediction unit