Residual transform and inverse transform in video coding systems and methods

Document No.: 1643431 Publication date: 2019-12-20

Reading note: This technology, "Residual transform and inverse transform in video coding systems and methods", was designed by 丁文鹏, 吴刚, and 蔡家扬 on 2017-02-23. Its main content is as follows. A transform block processing routine determines a maximum coding block size and a maximum transform block size for an unencoded video frame. The unencoded video frame is divided into a plurality of coding blocks including a first coding block, and the first coding block is divided into at least one prediction block and a plurality of transform blocks. The size of the transform blocks depends at least in part on the sizes of the coding block and the corresponding prediction block. The transform blocks are then encoded, thereby generating a video data payload of an encoded bitstream. A frame header of the encoded bitstream is generated, the frame header including a maximum coding block size flag and a maximum transform block size flag.

1. A video encoder device implemented method for encoding an unencoded video frame to generate an encoded bitstream representative of the unencoded video frame, the encoded bitstream including at least a frame header and a video data payload, the video encoder device implemented method comprising:

determining a maximum encoding block size of the unencoded video frame, the maximum encoding block size being defined by a maximum horizontal encoding block size and a maximum vertical encoding block size;

determining a maximum transform block size of the unencoded video frame, the maximum transform block size defined by a maximum horizontal prediction block size and a maximum vertical prediction block size;

encoding the unencoded video frame, thereby generating the video data payload of the encoded bitstream;

generating the frame header of the encoded bit stream, the frame header including a maximum coding block size flag and a maximum transform block size flag; and

wherein the maximum coding block size flag is set to 0 unless the maximum horizontal coding block size and the maximum vertical coding block size are both equal to 64 pixels, and the maximum transform block size flag is set to 0 unless the maximum horizontal prediction block size and the maximum vertical prediction block size are both greater than 16 pixels.
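The flag rule recited above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `frame_header_flags` and its parameter names are hypothetical, the last two parameters stand for the two dimensions that define the maximum transform block size, and a flag that is "not 0" is assumed to take the value 1, which the claim does not state.

```python
def frame_header_flags(max_cb_w, max_cb_h, max_tb_w, max_tb_h):
    """Return (coding_block_flag, transform_block_flag) for a frame header.

    Assumption: a flag that is not 0 takes the value 1.
    """
    # Coding block flag: 0 unless both maximum dimensions equal 64 pixels.
    cb_flag = 1 if (max_cb_w == 64 and max_cb_h == 64) else 0
    # Transform block flag: 0 unless both maximum dimensions exceed 16 pixels.
    tb_flag = 1 if (max_tb_w > 16 and max_tb_h > 16) else 0
    return cb_flag, tb_flag
```

For example, a frame with a 64 x 64 maximum coding block and a 32 x 32 maximum transform block would carry both flags set under these assumptions.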

2. The video encoder apparatus implemented method of claim 1, further comprising: prior to encoding the unencoded video frame,

dividing the unencoded video frame into a plurality of encoding blocks, the plurality of encoding blocks including a first encoding block having a horizontal encoding block size less than or equal to the maximum horizontal encoding block size and having a vertical encoding block size less than or equal to the maximum vertical encoding block size;

dividing the first encoded block into at least one prediction block, each of the at least one prediction block having a horizontal prediction block size and a vertical prediction block size;

dividing the first encoded block into a plurality of transform blocks including a first transform block having a horizontal transform block size less than or equal to the maximum horizontal prediction block size and having a vertical transform block size less than or equal to the maximum vertical prediction block size; and

wherein the horizontal transform block size and the vertical transform block size depend at least in part on the horizontal coding block size, the vertical coding block size, the horizontal prediction block size, and the vertical prediction block size.

3. The video encoder device-implemented method of claim 2, each of the plurality of transform blocks comprising a set of transform coefficients, the video encoder device-implemented method further comprising: for each of the plurality of transform blocks, setting a respective transform block mode flag in a transform block header, wherein the respective transform block mode flag is assigned a first flag value if the set of transform coefficients includes at least one transform coefficient having a non-zero value, and a second flag value otherwise.

4. The video encoder device implemented method of claim 3, wherein the respective transform block mode flag for each transform block in the plurality of transform blocks is listed in the transform block header in raster scan order.
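The transform block mode flags of claims 3 and 4 can be sketched as follows. This is a minimal illustration, not the claimed encoder: the function name is hypothetical, the transform blocks are assumed to arrive already in raster scan order, and the "first" and "second" flag values are assumed to be 1 and 0, which the claims leave open.

```python
def transform_block_mode_flags(blocks, nonzero_flag=1, zero_flag=0):
    """Compute one mode flag per transform block.

    blocks: transform blocks in raster scan order, each a 2-D list of
    integer transform coefficients.
    Assumption: flag values 1 and 0 stand in for the claimed first and
    second flag values.
    """
    flags = []
    for block in blocks:
        # A block gets the first flag value if any coefficient is non-zero.
        has_nonzero = any(c != 0 for row in block for c in row)
        flags.append(nonzero_flag if has_nonzero else zero_flag)
    return flags
```

The returned list keeps the input order, so it can be written to the transform block header as is.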

5. The video encoder apparatus implemented method of claim 2, further comprising determining, during encoding of the unencoded video frame, that the horizontal transform block size and the vertical transform block size are both equal to 4 pixels, and therefore:

obtaining a first set of transform coefficients from the first transform block via a first transform;

obtaining a second set of transform coefficients from the first set of transform coefficients by shifting each transform coefficient in the first set of transform coefficients to the right by 5 bits; and

obtaining a third set of transform coefficients from the second set of transform coefficients via a second transform.

6. The video encoder apparatus implemented method of claim 2, further comprising determining during encoding of the unencoded video frame that the horizontal transform block size and the vertical transform block size are both equal to 8 pixels, and thus:

obtaining a first set of transform coefficients from the first transform block via a first transform;

obtaining a second set of transform coefficients from the first set of transform coefficients by shifting each transform coefficient in the first set of transform coefficients to the right by 2 bits;

obtaining a third set of transform coefficients from the second set of transform coefficients via a second transform; and

obtaining a fourth set of transform coefficients from the third set of transform coefficients by shifting each transform coefficient in the third set of transform coefficients to the right by 2 bits.

7. The method implemented by a video encoder device of claim 6, wherein said first transform and said second transform are represented by the equation y = T_{8×8} x, T_{8×8} being represented by:

8. The video encoder apparatus implemented method of claim 2, further comprising determining during encoding of the unencoded video frame that the horizontal transform block size and the vertical transform block size are both equal to 16 pixels, and thus:

obtaining a first set of transform coefficients from the first transform block via a first transform;

obtaining a second set of transform coefficients from the first set of transform coefficients by shifting each transform coefficient in the first set of transform coefficients to the right by 2 bits;

obtaining a third set of transform coefficients from the second set of transform coefficients via a second transform; and

obtaining a fourth set of transform coefficients from the third set of transform coefficients by shifting each transform coefficient in the third set of transform coefficients to the right by 2 bits.

9. The method of claim 8, wherein the first transform and the second transform are represented by the equation y = T_{16×16} x, T_{16×16} being a matrix with coefficients t0 ... t15,

wherein t0 ... t15 are defined as:

t0 = {26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26}
t1 = {37 35 32 28 23 17 11 4 -4 -11 -17 -23 -28 -32 -35 -37}
t2 = {36 31 20 7 -7 -20 -31 -36 -36 -31 -20 -7 7 20 31 36}
t3 = {35 23 4 -17 -32 -37 -28 -11 11 28 37 32 17 -4 -23 -35}
t4 = {34 14 -14 -34 -34 -14 14 34 34 14 -14 -34 -34 -14 14 34}
t5 = {32 4 -28 -35 -11 23 37 17 -17 -37 -23 11 35 28 -4 -32}
t6 = {31 -7 -36 -20 20 36 7 -31 -31 7 36 20 -20 -36 -7 31}
t7 = {28 -17 -35 4 37 11 -32 -23 23 32 -11 -37 -4 35 17 -28}
t8 = {26 -26 -26 26 26 -26 -26 26 26 -26 -26 26 26 -26 -26 26}
t9 = {23 -32 -11 37 -4 -35 17 28 -28 -17 35 4 -37 11 32 -23}
t10 = {20 -36 7 31 -31 -7 36 -20 -20 36 -7 -31 31 7 -36 20}
t11 = {17 -37 23 11 -35 28 4 -32 32 -4 -28 35 -11 -23 37 -17}
t12 = {14 -34 34 -14 -14 34 -34 14 14 -34 34 -14 -14 34 -34 14}
t13 = {11 -28 37 -32 17 4 -23 35 -35 23 -4 -17 32 -37 28 -11}
t14 = {7 -20 31 -36 36 -31 20 -7 -7 20 -31 36 -36 31 -20 7}
t15 = {4 -11 17 -23 28 -32 35 -37 37 -35 32 -28 23 -17 11 -4}
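The two-stage 16 x 16 transform of claims 8 and 9 can be sketched in plain Python. Several points are assumptions rather than statements of the claims: t0 ... t15 are taken as the rows of the matrix, the first transform is applied as T·X and the second as (·)·Tᵀ, the right shift is an arithmetic shift (Python's `>>` on integers), and the mirror symmetry of each row (even rows symmetric, odd rows antisymmetric) is assumed to hold for every row of the listing. All function names are hypothetical.

```python
# Rows t0 ... t15, assumed to be the rows of the 16x16 transform matrix.
T16 = [
    [26] * 16,
    [37, 35, 32, 28, 23, 17, 11, 4, -4, -11, -17, -23, -28, -32, -35, -37],
    [36, 31, 20, 7, -7, -20, -31, -36, -36, -31, -20, -7, 7, 20, 31, 36],
    [35, 23, 4, -17, -32, -37, -28, -11, 11, 28, 37, 32, 17, -4, -23, -35],
    [34, 14, -14, -34, -34, -14, 14, 34, 34, 14, -14, -34, -34, -14, 14, 34],
    [32, 4, -28, -35, -11, 23, 37, 17, -17, -37, -23, 11, 35, 28, -4, -32],
    [31, -7, -36, -20, 20, 36, 7, -31, -31, 7, 36, 20, -20, -36, -7, 31],
    [28, -17, -35, 4, 37, 11, -32, -23, 23, 32, -11, -37, -4, 35, 17, -28],
    [26, -26, -26, 26, 26, -26, -26, 26, 26, -26, -26, 26, 26, -26, -26, 26],
    [23, -32, -11, 37, -4, -35, 17, 28, -28, -17, 35, 4, -37, 11, 32, -23],
    [20, -36, 7, 31, -31, -7, 36, -20, -20, 36, -7, -31, 31, 7, -36, 20],
    [17, -37, 23, 11, -35, 28, 4, -32, 32, -4, -28, 35, -11, -23, 37, -17],
    [14, -34, 34, -14, -14, 34, -34, 14, 14, -34, 34, -14, -14, 34, -34, 14],
    [11, -28, 37, -32, 17, 4, -23, 35, -35, 23, -4, -17, 32, -37, 28, -11],
    [7, -20, 31, -36, 36, -31, 20, -7, -7, 20, -31, 36, -36, 31, -20, 7],
    [4, -11, 17, -23, 28, -32, 35, -37, 37, -35, 32, -28, 23, -17, 11, -4],
]


def matmul(a, b):
    """Multiply two integer matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def shift_right(m, bits):
    """Arithmetic right shift of every entry (assumed shift semantics)."""
    return [[v >> bits for v in row] for row in m]


def forward_transform_16x16(block):
    """Two-stage transform of a 16x16 block with a 2-bit shift per stage."""
    first = matmul(T16, block)                # first transform (assumed T * X)
    second = shift_right(first, 2)            # first 2-bit right shift
    third = matmul(second, transpose(T16))    # second transform (assumed * T^T)
    return shift_right(third, 2)              # second 2-bit right shift
```

Under these assumptions a flat block, in which every sample has the same value, produces a single non-zero coefficient in the top-left position, which is the behavior expected of a DCT-like transform.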

10. The method of claim 2, the first transform block comprising a set of transform coefficients, each transform coefficient associated with a luma characteristic of a pixel of the unencoded video frame, wherein when the horizontal coding block size, the vertical coding block size, the horizontal prediction block size, and the vertical prediction block size are each equal to 8 pixels, the method further comprises setting the horizontal transform block size and the vertical transform block size each equal to 8 pixels.

11. The method of claim 2, the first transform block comprising a set of transform coefficients, each transform coefficient relating to a luma characteristic of a pixel of the unencoded video frame, wherein when the horizontal coding block size and the vertical coding block size are both equal to 8 pixels and the horizontal prediction block size and the vertical prediction block size are not both equal to 8 pixels, the method further comprises setting the horizontal transform block size and the vertical transform block size to be both equal to 4 pixels.

12. The method of claim 2, the first transform block comprising a set of transform coefficients, each transform coefficient relating to a luma characteristic of a pixel of the unencoded video frame, wherein when the horizontal coding block size, the vertical coding block size, the horizontal prediction block size, and the vertical prediction block size are each equal to 16 pixels, the method further comprises setting the horizontal transform block size and the vertical transform block size each equal to 16 pixels.

13. The method of claim 2, the first transform block comprising a set of transform coefficients, each transform coefficient relating to a luma characteristic of a pixel of the unencoded video frame, wherein when the horizontal coding block size and the vertical coding block size are both equal to 16 pixels and the horizontal prediction block size and the vertical prediction block size are not both equal to 16 pixels, the method further comprises setting the horizontal transform block size and the vertical transform block size each equal to 4 pixels.

14. The method of claim 2, the first transform block comprising a set of transform coefficients, each transform coefficient associated with a luma characteristic of a pixel of the unencoded video frame, wherein when the horizontal coding block size and the vertical coding block size are each greater than 31 pixels, the method further comprises setting the horizontal transform block size and the vertical transform block size to each equal to 16 pixels.

15. The method of claim 2, the first transform block comprising a set of transform coefficients, each transform coefficient associated with a chroma characteristic of a pixel of the unencoded video frame, wherein when the horizontal coding block size and the vertical coding block size are both equal to 8 pixels, the method further comprises setting the horizontal transform block size and the vertical transform block size both equal to 4 pixels.

16. The video encoder implemented method of claim 2, the first transform block comprising a set of transform coefficients, each transform coefficient associated with a chroma characteristic of a pixel of the unencoded video frame, wherein when the horizontal coding block size, the vertical coding block size, the horizontal prediction block size, and the vertical prediction block size are each equal to 16 pixels, the video encoder device implemented method further comprises setting the horizontal transform block size and the vertical transform block size each equal to 8 pixels.

17. The method of claim 2, the first transform block comprising a set of transform coefficients, each transform coefficient relating to a chroma characteristic of a pixel of the unencoded video frame, wherein when the horizontal coding block size and the vertical coding block size are both equal to 16 pixels and the horizontal prediction block size and the vertical prediction block size are not both equal to 16 pixels, the method further comprises setting the horizontal transform block size and the vertical transform block size to both be equal to 4 pixels.

18. The method of claim 2, the first transform block comprising a set of transform coefficients, each transform coefficient associated with a chroma characteristic of a pixel of the unencoded video frame, wherein when the horizontal coding block size and the vertical coding block size are each greater than 31 pixels, the method further comprises setting the horizontal transform block size and the vertical transform block size to each equal to 8 pixels.
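The size selection rules of claims 10 to 18 can be collected into one lookup, sketched below. The function name and signature are hypothetical, and combinations that the claims do not address return `None` rather than a guessed size.

```python
def select_transform_size(cb, pb, chroma=False):
    """Pick the square transform block edge length, in pixels.

    cb, pb: (width, height) of the coding block and prediction block.
    chroma: False for luma transform blocks, True for chroma.
    Returns None for combinations not covered by claims 10 to 18.
    """
    cb_w, cb_h = cb
    if cb_w > 31 and cb_h > 31:
        # Claims 14 (luma) and 18 (chroma): large coding blocks.
        return 8 if chroma else 16
    if cb == (16, 16):
        if pb == (16, 16):
            # Claims 12 (luma) and 16 (chroma).
            return 8 if chroma else 16
        # Claims 13 (luma) and 17 (chroma).
        return 4
    if cb == (8, 8):
        if chroma:
            # Claim 15: chroma with an 8x8 coding block.
            return 4
        # Claims 10 and 11: luma depends on the prediction block.
        return 8 if pb == (8, 8) else 4
    return None
```

For instance, `select_transform_size((16, 16), (16, 8))` follows claim 13 and yields 4, while the same coding block with a 16 x 16 prediction block yields 16 for luma and 8 for chroma.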

Technical Field

The present disclosure relates to encoding and decoding of video signals, and more particularly, to codebook-based encoding and decoding of adaptive filters for impairment compensation.

Background

The advent of digital multimedia such as digital images, speech/audio, graphics, and video has significantly improved various applications and opened up entirely new ones, because content can be stored, communicated, transmitted, searched, and accessed reliably and with relative ease. Overall, the applications of digital multimedia are many, spanning entertainment, information, medicine, and security, and have benefited society in numerous ways. Multimedia captured by sensors such as cameras and microphones is typically analog and is made digital by a digitization process in the form of Pulse Code Modulation (PCM). However, after digitization, the amount of resulting data can be quite large, since it must be sufficient to recreate the analog representation needed by speakers and/or a TV display. Efficient communication, storage, or transmission of large volumes of digital multimedia content therefore requires compression from the raw PCM form into a compressed representation. Accordingly, many techniques for compressing multimedia have been invented. Over the years, video compression techniques have grown very sophisticated, to the point that they can often achieve high compression factors between 10 and 100 while retaining high psycho-visual quality, often similar to that of uncompressed digital video.

Although great advances have been made to date in the art and science of video compression (as exhibited by the many standards-body-driven video coding standards such as MPEG-1, MPEG-2, H.263, MPEG-4 Part 2, MPEG-4 AVC/H.264, MPEG-4 SVC and MVC, as well as industry-driven proprietary standards such as Windows Media Video, RealVideo, On2 VP, and the like), consumers' growing appetite for higher quality, higher definition, and now 3D (stereo) video, accessible anywhere and at any time, requires delivery by various means, such as DVD/BD, over-the-air broadcast, cable/satellite, and wired and mobile networks, to a range of client devices such as PCs/laptops, TVs, set-top boxes, gaming consoles, portable media players/devices, smartphones, and wearable computing devices, and thus fuels the desire for even higher levels of video compression. In the standards-body-driven standards, this is evidenced by the recent ISO MPEG effort on High Efficiency Video Coding (HEVC), which is expected to combine new technology contributions with technology from a number of years of exploratory work on H.265 video compression by the ITU-T standards committee.

All of the above standards employ a general inter-frame predictive coding framework that involves reducing temporal redundancy by compensating for motion between video frames. The basic concept is to remove temporal dependencies between neighboring pictures by using a block matching method. At the beginning of the encoding process, each frame of an unencoded video sequence is grouped into one of three types: I-type frames, P-type frames, and B-type frames. I-type frames are intra-coded. That is, only information from the frame itself is used to encode the picture, and inter-frame motion compensation techniques are not used (although intra-frame motion compensation techniques may be applied).

The other two types of frames, P-type and B-type, are encoded using inter-frame motion compensation techniques. The difference between P and B pictures is the temporal direction of the reference picture used for motion compensation. P-type pictures utilize information from previous pictures (in display order), while B-type pictures can utilize information from previous and future pictures (in display order).

For P-type and B-type frames, each frame is divided into blocks of pixels represented by coefficients of the luminance and chrominance components of each pixel, and one or more motion vectors are obtained for each block (2 motion vectors can be encoded for each block because B-type pictures can utilize information from future and past encoded frames). The Motion Vector (MV) represents the spatial displacement from the position of the current block to the position of a similar block (referred to as reference block and reference frame, respectively) in another previously encoded frame (which may be a past frame or a future frame in display order). The difference, if any, between the reference block and the current block is determined, and a residual (also referred to as "residual signal") is obtained. Thus, for each block of an inter-coded frame, only the residual and motion vectors need to be encoded, rather than the entire content of the block. By removing this temporal redundancy between frames of a video sequence, the video sequence can be compressed.
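The relationship between a motion vector, a reference block, and the residual can be sketched as follows. This is only an illustration of the idea described above: frames are modeled as 2-D lists of sample values, the motion vector is assumed to have whole-pixel precision, no bounds checking is performed, and the function name is hypothetical.

```python
def residual_block(current, reference, x, y, size, mv):
    """Residual of a size x size block at (x, y) in the current frame.

    current, reference: frames as 2-D lists indexed [row][column].
    mv: (dx, dy) displacement from the current block to the reference
    block, assumed to be in whole pixels and to stay inside the frame.
    """
    dx, dy = mv
    return [
        [current[y + r][x + c] - reference[y + dy + r][x + dx + c]
         for c in range(size)]
        for r in range(size)
    ]
```

When the reference block matches the current block exactly, the residual is all zeros, so only the motion vector carries information for that block.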

To further compress the video data, after inter-frame or intra-frame prediction techniques have been applied, the coefficients of the residual signal are often transformed from the spatial domain to the frequency domain (e.g., using a discrete cosine transform ("DCT") or a discrete sine transform ("DST")). For naturally occurring images, such as the types of images that typically make up human-perceptible video sequences, low-frequency energy is generally stronger than high-frequency energy. The residual signal therefore achieves better energy compaction in the frequency domain than in the spatial domain. After the forward transform, the coefficients and motion vectors may be quantized and entropy encoded before being packetized or otherwise processed, e.g., for transmission over a network such as the Internet.
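The energy compaction argument can be illustrated with a small one-dimensional example. The sketch below uses a floating-point orthonormal DCT-II, which is a textbook transform chosen for illustration and not the integer transform of this disclosure; the sample values in `smooth` are made up.

```python
import math


def dct2_1d(signal):
    """Orthonormal DCT-II of a 1-D sequence (floating point)."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(signal)))
    return out


# A smooth, slowly varying row of samples (illustrative values).
smooth = [10, 11, 12, 13, 14, 15, 16, 17]
coeffs = dct2_1d(smooth)
energy = [c * c for c in coeffs]
# Share of the total energy held by the two lowest-frequency coefficients.
low_share = sum(energy[:2]) / sum(energy)
```

For a smooth input like this one, nearly all of the energy lands in the first two coefficients, which is why the remaining coefficients can be quantized coarsely at little cost.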

On the decoder side, inverse quantization and inverse transformation are applied to recover the spatial residual signal. These are typical transform/quantization processes in many video compression standards. A backward prediction process may then be performed to generate a recreated version of the original unencoded video sequence.

In past standards, the blocks used in coding were typically 16 × 16 pixels (called macroblocks in many video coding standards). However, as these standards have evolved, frame sizes have grown larger and larger, and many devices have achieved higher display capabilities than "high definition" (or "HD") frame sizes, e.g., 2048 x 1530 pixels. Therefore, larger blocks may be needed to efficiently encode motion vectors of these frame sizes (e.g., 64 × 64 pixels). Therefore, it is also desirable to increase the size of the residual signal block transformed from the spatial domain to the frequency domain.

Drawings

Fig. 1 illustrates an exemplary video encoding/decoding system according to one embodiment.

FIG. 2 illustrates several components of an exemplary encoding device, according to one embodiment.

Fig. 3 illustrates several components of an exemplary decoding apparatus according to one embodiment.

Fig. 4 illustrates a block diagram of an exemplary video encoder in accordance with at least one embodiment.

Fig. 5 illustrates a block diagram of an exemplary video decoder in accordance with at least one embodiment.

FIG. 6 illustrates a transform block processing routine in accordance with at least one embodiment.

FIG. 7 illustrates a transform block size selection subroutine in accordance with at least one embodiment.

FIG. 8 illustrates a forward integer transform subroutine in accordance with at least one embodiment.

FIG. 9 illustrates a quadratic transformation subroutine in accordance with at least one embodiment.

FIG. 10 illustrates a transform block recovery routine in accordance with at least one embodiment.

FIG. 11 illustrates an inverse integer transform subroutine in accordance with at least one embodiment.

Fig. 12 illustrates a schematic diagram of an exemplary recursive encoding block partitioning scheme in accordance with at least one embodiment.

FIG. 13 illustrates an exemplary encoded block indexing routine in accordance with at least one embodiment.

Fig. 14 illustrates an exemplary encoding block partitioning subroutine in accordance with at least one embodiment.

Figs. 15a-15c illustrate schematic diagrams of applications of the exemplary recursive coding block partitioning scheme illustrated in Fig. 12, in accordance with at least one embodiment.

FIG. 16 illustrates an alternative transform block processing routine in accordance with at least one embodiment.

FIG. 17 illustrates an alternative forward integer transform subroutine in accordance with at least one embodiment.

Detailed Description

The following detailed description is presented primarily in terms of procedures and symbolic representations of operations on conventional computer components, including a processor, memory storage devices for the processor, connected display devices, and input devices. Further, the processes and operations may utilize conventional computer components in a heterogeneous distributed computing environment, including remote file servers, computer servers, and memory storage devices. Each of these conventional distributed computing components is accessible by the processor via a communications network.

The phrases "in one embodiment," "in at least one embodiment," "in various embodiments," "in some embodiments," and the like may be used repeatedly herein. Such phrases are not necessarily referring to the same embodiment. The terms "comprising," "having," and "including" are synonymous, unless the context dictates otherwise. The various embodiments are described in the context of the typical "hybrid" video coding method generally described above, as it uses inter/intra picture prediction and transform coding.

Reference will now be made in detail to a description of the embodiments illustrated in the accompanying drawings. Although embodiments have been described in connection with the drawings and the associated descriptions, it will be appreciated by those of ordinary skill in the art that a replacement and/or equivalent implementation may be substituted for the specific embodiments shown and described without departing from the scope of the present disclosure, including all alternatives, modifications, and equivalents, whether explicitly illustrated and/or described. In various alternative embodiments, additional devices or combinations of the devices shown may be added or combined without limiting the scope of the embodiments disclosed herein.

Exemplary video encoding/decoding System

Fig. 1 illustrates an exemplary encoding/decoding system 100 in accordance with at least one embodiment. The encoding device 200 (shown in fig. 2 and described below) and the decoding device 300 (shown in fig. 3 and described below) are in data communication with the network 104. The encoding device 200 may be in data communication with the unencoded video source 108 via a direct data connection, such as a storage area network ("SAN") or a high-speed serial bus, and/or via other suitable communication techniques, or via the network 104 (as shown by dashed lines in fig. 1). Similarly, the decoding device 300 may be in data communication with the optional encoded video source 112 via a direct data connection, such as a storage area network ("SAN") or a high-speed serial bus, and/or via other suitable communication techniques, or via the network 104 (as shown by the dashed lines in fig. 1). In some implementations, the encoding device 200, the decoding device 300, the encoded video source 112, and/or the unencoded video source 108 may include one or more replicated and/or distributed physical or logical devices. In many embodiments, there may be more encoding devices 200, decoding devices 300, unencoded video sources 108, and/or encoded video sources 112 than shown.

In various embodiments, the encoding device 200 may be a networked computing device generally capable of accepting requests over the network 104, e.g., from the decoding device 300, and providing responses accordingly. In various embodiments, the decoding device 300 may be a networked computing device having a form factor such as a mobile handset, a watch, a head-up display, or other wearable computing device, a dedicated media player, a computing tablet, a motor vehicle head unit, an audio-video on demand (AVOD) system, a dedicated media console, a gaming device, a "set-top box," a digital video recorder, a television, or a general purpose computer. In various implementations, the network 104 may include the Internet, one or more local area networks ("LANs"), one or more wide area networks ("WANs"), a cellular data network, and/or other data networks. At various points, the network 104 may be wired and/or wireless.

Exemplary encoding apparatus

Referring to fig. 2, several components of an exemplary encoding apparatus 200 are shown. In some embodiments, the encoding device may include more components than those shown in fig. 2. However, it is not necessary that all of these generally conventional components be shown in order to disclose an illustrative embodiment. As shown in fig. 2, the example encoding apparatus 200 includes a network interface 204 for connecting to a network (e.g., the network 104). The example encoding device 200 also includes a processing unit 208, a memory 212, an optional user input 214 (e.g., an alphanumeric keyboard, a keypad, a mouse or other pointing device, a touch screen and/or microphone), and an optional display 216, all interconnected with the network interface 204 by a bus 220. The memory 212 typically includes RAM, ROM, and permanent mass storage such as disk drives, flash memory, and the like.

The memory 212 of the example encoding apparatus 200 stores an operating system 224 and program code for a number of software services, such as a software-implemented inter-frame video encoder 400 (described below with reference to fig. 4) having instructions for performing a transform block processing routine 600 (described below with reference to fig. 6). The memory 212 may also store video data files (not shown) that may represent unencoded copies of audio/video media pieces (e.g., movies and/or television episodes). These and other software components may be loaded into the memory 212 of the encoding device 200 using a drive mechanism (not shown) associated with a non-transitory computer-readable medium 232, such as a floppy disk, magnetic tape, DVD/CD-ROM drive, USB drive, memory card, or the like.

In operation, the operating system 224 manages the hardware and other software resources of the encoding device 200 and provides common services for software applications such as the software-implemented interframe video encoder 400. For hardware functions such as network communications via the network interface 204, receiving data via the input 214, outputting data via the optional display 216, and allocating memory 212 for various software applications such as the software-implemented inter-frame video encoder 400, the operating system 224 acts as an intermediary between software and hardware executing on the encoding device.

In some implementations, the encoding device 200 can also include a dedicated unencoded video interface 236 for communicating with an unencoded video source 108, such as a high-speed serial bus. In some implementations, the encoding device 200 may communicate with the unencoded video source 108 via the network interface 204. In other implementations, the unencoded video source 108 may reside in the memory 212 or the computer-readable medium 232.

While the example encoding device 200 has been described generally in accordance with a conventional general purpose computing device, the encoding device 200 may be any of a number of devices capable of executing instructions (e.g., the example software-implemented video encoder 400, the transform block processing routine 600) for encoding video in accordance with various embodiments, such as a video recording device, a video co-processor and/or accelerator, a personal computer, a gaming device, a set-top box, a hand-held or wearable computing device, a smart phone, or any other suitable device.

As an example, the encoding apparatus 200 may be operated in furtherance of an on-demand media service (not shown). In at least one exemplary embodiment, the on-demand media service may operate the encoding device 200 in furtherance of an online on-demand media store that provides digital copies of media works, such as video content, to users on a per-work and/or per-subscription basis. The on-demand media service may obtain digital copies of such media works from the unencoded video source 108.

Exemplary decoding device

Referring to fig. 3, several components of an exemplary decoding apparatus 300 are shown. In some embodiments, the decoding apparatus may include more components than those shown in fig. 3. However, it is not necessary that all of these generally conventional components be shown in order to disclose an illustrative embodiment. As shown in fig. 3, the exemplary decoding apparatus 300 includes a network interface 304 for connecting to a network (e.g., network 104). The exemplary decoding device 300 also includes a processing unit 308, a memory 312, an optional user input 314 (e.g., an alphanumeric keyboard, a keypad, a mouse or other pointing device, a touch screen and/or microphone), an optional display 316, and an optional speaker 318, all interconnected with the network interface 304 by a bus 320. Memory 312 typically includes RAM, ROM, and permanent mass storage such as disk drives, flash memory, and the like.

The memory 312 of the exemplary decoding apparatus 300 may store an operating system 324 and program code for a plurality of software services, such as a software-implemented video decoder 500 (described below with reference to fig. 5) having instructions for performing a transform block recovery routine 1000 (described below with reference to fig. 10). The memory 312 may also store video data files (not shown) that may represent encoded copies of audio/video media works (e.g., movies and/or television episodes). These and other software components may be loaded into the memory 312 of the decoding apparatus 300 using a drive mechanism (not shown) associated with a non-transitory computer-readable medium 332, such as a floppy disk, magnetic tape, DVD/CD-ROM drive, memory card, or the like.

In operation, the operating system 324 manages the hardware and other software resources of the decoding apparatus 300 and provides common services for software applications such as the software-implemented video decoder 500. For hardware functions such as network communications via network interface 304, receiving data via input 314, outputting data via optional display 316 and/or optional speaker 318, and allocating memory 312, operating system 324 acts as an intermediary between the hardware and the software executing on the decoding device.

In some embodiments, the decoding device 300 may also include an optional encoded video interface 336, for example, for communicating with an encoded video source 116, such as a high speed serial bus. In some implementations, the decoding device 300 can communicate with an encoded video source (e.g., the encoded video source 116) via the network interface 304. In other implementations, the encoded video source 116 may reside in the memory 312 or the computer readable medium 332.

Although the exemplary decoding device 300 has been described generally in accordance with a conventional general purpose computing device, the decoding device 300 may be any of a number of devices capable of executing instructions (e.g., the exemplary software-implemented video decoder 500, the transform block recovery routine 1000) for decoding video in accordance with various embodiments, such as a video recording device, a video co-processor and/or accelerator, a personal computer, a gaming device, a set-top box, a hand-held or wearable computing device, a smart phone, or any other suitable device.

As an example, the decoding apparatus 300 may operate in conjunction with an on-demand media service. In at least one exemplary embodiment, the on-demand media service may provide digital copies of media works, such as video content, to the user-operated decoding device 300 on a per-work and/or per-subscription basis. The decoding device 300 may obtain a digital copy of such a media work, originating from the unencoded video source 108, via the network 104 by way of, for example, the encoding device 200.

Software implemented interframe video encoder

Fig. 4 illustrates an overall functional block diagram of a software-implemented inter-frame video encoder 400 (hereinafter "encoder 400") employing residual transform techniques in accordance with at least one embodiment. One or more uncoded video frames (vidfrms) of the video sequence in display order may be provided to the sequencer 404.

Sequencer 404 may assign a predictive coded picture type (e.g., I, P or B) to each unencoded video frame and reorder the frame sequence or groups of frames from the frame sequence into a coding order for motion prediction purposes (e.g., an I-type frame followed by a P-type frame, a P-type frame followed by a B-type frame). The ordered uncoded video frames (seqfrms) are then input to block indexer 408 in coding order.

For each ordered uncoded video frame (seqfrms), block indexer 408 may determine a largest coded block ("LCB") size (e.g., 64 x 64 pixels) for the current frame and partition the uncoded frame into an array of coded blocks (blks). The size of each encoded block within a given frame may vary, for example, from 4 x 4 pixels to the LCB size of the current frame.

Each encoded block may then be input to the differentiator 412 one at a time and may be differentiated from a corresponding prediction signal block (pred) generated from a previously encoded block. To generate the prediction block (pred), the coded block (cblk) is also provided to the motion estimator 416. After the difference operation at differentiator 412, a transformer 420 (discussed below) may forward transform the resulting residual block (res) into a frequency domain representation, resulting in a transform coefficient block (tcof). The block of transform coefficients (tcof) may then be sent to a quantizer 424, resulting in a block of quantized coefficients (qcf), which may then be sent to an entropy encoder 428 and a local decoding loop 430.

At the beginning of the local decoding loop 430, the inverse quantizer 432 may dequantize the block of quantized coefficients (qcf) into a dequantized transform coefficient block (tcof') and pass it to the inverse transformer 436 to generate a dequantized residual block (res'). At adder 440, the prediction block (pred) from motion compensated predictor 442 may be added to the dequantized residual block (res') to generate a locally decoded block (rec). The locally decoded block (rec) may then be sent to a frame assembler and deblocking filter processor 444, which reduces blocking artifacts and assembles a recovered frame (recd) that may be used as a reference frame for the motion estimator 416 and motion compensated predictor 442.

The entropy encoder 428 encodes the quantized transform coefficients (qcf), differential motion vectors (dmv), and other data to generate an encoded video bitstream 448. For each frame of the unencoded video sequence, the encoded video bitstream 448 may include encoded picture data (e.g., encoded quantized transform coefficients (qcf) and differential motion vectors (dmv)) and encoded frame headers (e.g., syntax information such as LCB size of the current frame).

Forward integer transform procedure

Referring to the function of transformer 420, the transformer receives a block of residual values for the luma and chroma values of each encoded block and divides the block of residual values into one or more luma and chroma transform blocks.

In at least one embodiment, the coding block is divided into transform blocks sized according to a current coding block size and a size of a prediction block used for motion estimation of the coding block. For example, the transform block size may be assigned according to the combination shown in table 1 below. The transformer 420 may also set a maximum transform block size flag in a picture header of the current frame.

TABLE 1

After the coding block is divided into transform blocks, the residual values in the transform blocks are converted from the spatial domain to the frequency domain, e.g., via a forward DCT transform operation. In at least one embodiment, to improve coding efficiency, integer equivalents of the residual values of a transform block are obtained and a forward integer DCT transform operation may be performed. To further improve coding efficiency, it may be advantageous to use a Single Instruction Multiple Data (SIMD) instruction architecture in the video coding process. However, the most common implementations of SIMD instruction architectures require a bit width of 16 bits. Thus, in at least one embodiment, a bit-shift operation may be performed on the transform coefficients after some forward transform operations (and, on the decoder side, on the residual values after some inverse transform operations) to ensure that the residual values and transform coefficients can be represented by 16-bit integers.
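By way of illustration only, the effect of such a right shift on the representable range can be sketched as follows. The sample values are hypothetical and were chosen solely to exceed the signed 16-bit range, and the sketch assumes a plain arithmetic right shift with no rounding offset, which the text does not specify.

```python
INT16_MIN = -(1 << 15)
INT16_MAX = (1 << 15) - 1

def fits_int16(values):
    """Return True when every value is representable as a signed 16-bit integer."""
    return all(INT16_MIN <= v <= INT16_MAX for v in values)

def shift_right(values, bits):
    """Arithmetic right shift of each value (Python's >> floors negative values)."""
    return [v >> bits for v in values]

# Hypothetical intermediate coefficients, chosen only to exceed the 16-bit range.
raw_coefficients = [70000, -40000, 1200, -3]

# A 2-bit right shift, as used after the 8 x 8 and 16 x 16 forward transforms.
shifted_coefficients = shift_right(raw_coefficients, 2)
```

In this sketch the raw values do not fit in 16 bits, while the shifted values do, at the cost of discarding the two least significant bits.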

In at least one embodiment, for a 4 × 4 transform block, transformer 420 may perform a forward integer DCT transform operation according to the following equation:

$\vec{y} = T_{4\times 4}\,\vec{x}$

where $\vec{x}$ is the input residual value vector of the current transform block, $\vec{y}$ is the output vector of the transform operation, and $T_{4\times 4}$ is a 4 × 4 forward integer transform matrix given by:

In at least one embodiment, in the case of an 8 × 8 transform block, transformer 420 may perform a forward integer DCT transform operation according to the following equation:

$\vec{y} = T_{8\times 8}\,\vec{x}$

where $\vec{x}$ is the input residual value vector of the current transform block, $\vec{y}$ is the output vector of the transform operation, and $T_{8\times 8}$ is an 8 × 8 forward integer transform matrix given by:

After the 8 × 8 forward integer DCT transform operation, to guarantee 16-bit operation, the transformer 420 may shift the values of the transform coefficients by 2 bits to the right.

In at least one embodiment, in the case of a 16 × 16 transform block, transformer 420 may perform a forward integer DCT transform operation according to the following equation:

$\vec{y} = T_{16\times 16}\,\vec{x}$

where $\vec{x}$ is the input residual value vector of the current transform block, $\vec{y}$ is the output vector of the transform operation, and $T_{16\times 16}$ is a 16 × 16 forward integer transform matrix given by:

where $t_0, t_1, t_2, \ldots, t_{14}, t_{15}$ are as defined in Table 2 below.

After the 16 × 16 forward integer DCT transform operation, to guarantee a 16-bit operation, the transformer 420 may shift the value of the transform coefficient by 2 bits to the right.

Depending on the number of transform blocks per coding block, coding efficiency can be further improved by performing an additional transform operation on the DC coefficients of the transform blocks. The DC coefficients are collected into a DC integer transform block and transformed again, for example, according to one of the forward integer DCT transform operations described above. This process is called a quadratic (i.e., secondary) transform.

Software implemented interframe decoder

Fig. 5 illustrates a general functional block diagram of a corresponding software-implemented inter-frame video decoder 500 (hereinafter "decoder 500") employing inverse residual transform techniques according to at least one embodiment and suitable for use in a decoding apparatus, such as decoding apparatus 300. The decoder 500 may operate similarly to the local decoding loop 430 at the encoder 400.

In particular, the encoded video bitstream 504 to be decoded may be provided to an entropy decoder 508, and the entropy decoder 508 may decode blocks of quantized coefficients (qcf), differential motion vectors (dmv), accompanying message packets (msg data), and other data. The block of quantized coefficients (qcf) may then be inverse quantized by an inverse quantizer 512, resulting in dequantized coefficients (tcof'). The dequantized coefficients (tcof') may then be inverse transformed from the frequency domain by inverse transformer 516 (described below), resulting in a decoded residual block (res'). The adder 520 may add the decoded residual block (res') to the motion compensated prediction block (pred) obtained by using the corresponding motion vector (mv). The resulting decoded video (dv) may be deblock filtered in a frame assembler and deblock filtering processor 524. The blocks (recd) at the output of the frame assembler and deblocking filter processor 528 form a reconstructed frame of the video sequence that can be output from the decoder 500 and also used as a reference frame for a motion compensated predictor 530 for decoding subsequently encoded blocks.

Inverse integer transform procedure

Referring to the function of the inverse transformer 516, the inverse transformer obtains a dequantized 16-bit integer transform coefficient block from the inverse quantizer 512. The inverse transformer 516 performs an inverse integer DCT transform operation on the transform coefficients obtained from the inverse quantizer 512 to reverse the forward integer DCT transform operation performed by the transformer 420 as described above and restore residual values.

If the transform coefficients of the current coding block have been quadratic transformed, the inverse transformer first performs an inverse quadratic transform procedure, as described below. After the DC transform coefficients are inverse transformed and inserted back into their corresponding transform blocks, the inverse transformer performs an inverse integer DCT transform operation on each transform block.

For example, in at least one embodiment, inverse transformer 516 may perform an inverse integer DCT transform operation according to the following equation for a block of 16-bit integer transform coefficients corresponding to a 4 × 4 transform block:

$\vec{x}' = T^{-1}_{4\times 4}\,\vec{y}'$

where $\vec{y}'$ is the vector of quantized transform coefficients, $\vec{x}'$ is the vector of restored residual values, and $T^{-1}_{4\times 4}$ is a 4 × 4 inverse integer transform matrix given by:

After the 4 × 4 inverse integer DCT transform operation, to guarantee 16-bit operation, the inverse transformer may shift the resulting residual values by 5 bits to the right.

In at least one embodiment, inverse transformer 516 may perform an inverse integer DCT transform operation according to the following equation for a block of 16-bit integer transform coefficients corresponding to an 8 × 8 transform block:

$\vec{x}' = T^{-1}_{8\times 8}\,\vec{y}'$

where $\vec{y}'$ is the vector of quantized transform coefficients, $\vec{x}'$ is the vector of restored residual values, and $T^{-1}_{8\times 8}$ is an 8 × 8 inverse integer transform matrix, such as the inverse of the 8 × 8 forward integer transform matrix $T_{8\times 8}$ described above.

After the 8 × 8 inverse integer DCT transform operation, to guarantee a 16-bit operation, the inverse transformer may shift the resulting residual value by 7 bits to the right.

In at least one embodiment, inverse transformer 516 may perform an inverse integer DCT transform operation according to the following equation for a block of 16-bit integer transform coefficients corresponding to a 16 × 16 transform block:

$\vec{x}' = T^{-1}_{16\times 16}\,\vec{y}'$

where $\vec{y}'$ is the vector of quantized transform coefficients, $\vec{x}'$ is the vector of restored residual values, and $T^{-1}_{16\times 16}$ is a 16 × 16 inverse integer transform matrix, such as the inverse of the 16 × 16 forward integer transform matrix $T_{16\times 16}$ described above.

After the 16 × 16 inverse integer DCT transform operation, to guarantee 16-bit operation, the inverse transformer may shift the value of the resulting residual value by 7 bits to the right.

Transform block processing routine

Fig. 6 illustrates a transform block processing routine 600 suitable for use in at least one embodiment (e.g., encoder 400). One of ordinary skill in the art will recognize that not all events in the encoding process are shown in fig. 6. Rather, for the sake of clarity, only those steps reasonably relevant to describing the illustrated embodiments are shown.

At execution block 604, the transform block processing routine 600 obtains a coding block of integer residual values for the current frame being encoded. The transform block processing routine 600 then provides the size of the current coding block and the size of the corresponding prediction block used in motion estimation to a transform block size selection subroutine 700 (described below with reference to fig. 7), which returns the appropriate chroma and luma transform block sizes for the current combination of coding block size and prediction block size.

The transform block processing routine 600 then separates the current encoded block into one or more 16-bit integer residual value transform blocks at execution block 608 according to the chroma and luma transform block sizes returned by the transform block size selection subroutine 700 above.

At start loop block 612, each transform block of the current coding block is processed in turn.

At decision block 616, if each residual value of the current transform block has a zero value, then at execution block 620, the transform block processing routine 600 sets a corresponding transform block mode flag in the transform block header of the current transform block.

Otherwise, at decision block 616, if one or more of the residual values of the current transform block have a non-zero value, the transform block processing routine 600 invokes a forward integer transform subroutine 800 (described below with reference to FIG. 8), which returns the corresponding block of 16-bit integer transform coefficients.

At end loop block 624, the transform block processing routine 600 iterates back to the start loop block 612 to process the next transform block (if any) of the currently encoded block.

At decision block 628, if the transform blocks of the current coding block may be quadratic transformed (e.g., there are 16 or 64 transform blocks in the current coding block), the transform block processing routine 600 may invoke a quadratic transform subroutine 900 (described below with reference to FIG. 9). The quadratic transform subroutine 900 performs an additional transform operation on the DC integer transform coefficients of the transform blocks of the current coding block and returns the corresponding quadratic transform block of 16-bit integer transform coefficients.

After the quadratic transform subroutine 900 returns a quadratic transform block of 16-bit integer transform coefficients, or, referring again to decision block 628, if the current coding block is not suitable for a quadratic transform, the transform block processing routine 600 ends for the current coding block at termination block 699.
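By way of illustration only, the control flow of the transform block processing routine 600 may be sketched as follows. The function name, the dictionary keys, and the `forward_transform` callback are hypothetical names introduced for this sketch; transform blocks are assumed to be lists of rows of integers, and the eligibility test simply mirrors the "16 or 64 transform blocks" example given above.

```python
def process_transform_blocks(transform_blocks, forward_transform):
    """Sketch of blocks 612-628: flag all-zero blocks, transform the others."""
    results = []
    for block in transform_blocks:
        values = [v for row in block for v in row]
        if all(v == 0 for v in values):
            # Decision block 616 / execution block 620: set the mode flag, skip the transform.
            results.append({"all_zero_flag": True, "coefficients": None})
        else:
            # Forward integer transform subroutine 800 (supplied by the caller here).
            results.append({"all_zero_flag": False,
                            "coefficients": forward_transform(block)})
    # Decision block 628: example eligibility test for the quadratic transform.
    quadratic_eligible = len(transform_blocks) in (16, 64)
    return results, quadratic_eligible
```

A caller would pass in whichever forward transform is in use; passing an identity function is sufficient to exercise the control flow.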

Transform block size selection subroutine

FIG. 7 illustrates a transform block size selection subroutine 700 suitable for use in at least one embodiment (e.g., the transform block processing routine 600).

At execution block 704, the transform block size selection subroutine 700 obtains the coding block size of the current coding block and the size of the prediction block used in the motion estimation process for the current coding block.

At decision block 712, if the coding block size of the current coding block is 8 × 8 pixels, the transform block size selection subroutine 700 proceeds to decision block 716.

At decision block 716, if the prediction block size of the current coding block is 8 × 8 pixels, then at execution block 720, the transform block size selection subroutine 700 sets the luma transform block size of the current coding block to 8 × 8 luma transform coefficients, and at execution block 724, the transform block size selection subroutine 700 sets the chroma transform block size of the current coding block to 4 × 4 chroma transform coefficients. The transform block size selection subroutine 700 then returns the luma transform block size and chroma transform block size of the current coding block at return block 799.

Referring again to decision block 716, if the prediction block size of the current coding block is not 8 × 8 pixels, then at execution block 728, the transform block size selection subroutine 700 sets the luma transform block size of the current coding block to 4 × 4 luma transform coefficients. The transform block size selection subroutine 700 then proceeds to execution block 724. As described above, at execution block 724, the transform block size selection subroutine 700 sets the chroma transform block size of the current coding block to 4 × 4 chroma transform coefficients. The transform block size selection subroutine 700 then returns the luma transform block size and chroma transform block size of the current coding block at return block 799.

Referring again to decision block 712, if the coding block size of the current coding block is not 8 × 8 pixels, the transform block size selection subroutine 700 proceeds to decision block 736.

At decision block 736, if the coding block size of the current coding block is 16 × 16 pixels, the transform block size selection subroutine 700 proceeds to decision block 740.

At decision block 740, if the prediction block size of the current coding block is 16 × 16 pixels, then at execution block 744, the transform block size selection subroutine 700 sets the luma transform block size of the current coding block to 16 × 16 luma transform coefficients, and then at execution block 748, the transform block size selection subroutine 700 sets the chroma transform block size of the current coding block to 8 × 8 chroma transform coefficients. The transform block size selection subroutine 700 then returns the luma transform block size and chroma transform block size of the current coding block at return block 799.

Referring again to decision block 740, if the prediction block size of the current coding block is not 16 × 16 pixels, the transform block size selection subroutine 700 proceeds to execution block 728. As described above, at execution block 728, the transform block size selection subroutine 700 sets the luma transform block size of the current coding block to 4 × 4 luma transform coefficients. The transform block size selection subroutine 700 then proceeds to execution block 724. As described above, at execution block 724, the transform block size selection subroutine 700 sets the chroma transform block size of the current coding block to 4 × 4 chroma transform coefficients. The transform block size selection subroutine 700 then returns the luma transform block size and chroma transform block size of the current coding block at return block 799.

Referring again to decision block 736, if the coding block size of the current coding block is not 16 × 16 pixels, the transform block size selection subroutine 700 proceeds to execution block 744. As described above, at execution block 744, the transform block size selection subroutine 700 sets the luma transform block size of the current coding block to 16 × 16 luma transform coefficients, and then at execution block 748, the transform block size selection subroutine 700 sets the chroma transform block size of the current coding block to 8 × 8 chroma transform coefficients. The transform block size selection subroutine 700 then returns the luma transform block size and chroma transform block size of the current coding block at return block 799.
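By way of illustration only, the decision flow of FIG. 7 may be sketched as a single function. The function name and the representation of sizes as (width, height) tuples are assumptions made for this sketch. The sketch follows the flow literally, so any coding block size other than 8 × 8 or 16 × 16 takes the final branch.

```python
def select_transform_block_sizes(coding_block_size, prediction_block_size):
    """Return (luma_size, chroma_size) as side lengths of square transform blocks."""
    if coding_block_size == (8, 8):
        # Decision block 716: luma is 8 x 8 only when the prediction block is also 8 x 8.
        luma = 8 if prediction_block_size == (8, 8) else 4
        return luma, 4  # Execution block 724: chroma is 4 x 4 in both cases.
    if coding_block_size == (16, 16):
        # Decision block 740.
        if prediction_block_size == (16, 16):
            return 16, 8  # Execution blocks 744 and 748.
        return 4, 4       # Execution blocks 728 and 724.
    # Decision block 736, "no" branch: all other coding block sizes.
    return 16, 8
```

For example, a 16 × 16 coding block predicted with a single 16 × 16 prediction block yields 16 × 16 luma and 8 × 8 chroma transform blocks, whereas the same coding block with a smaller prediction block yields 4 × 4 transform blocks for both.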

Forward integer transform subroutine

FIG. 8 illustrates a forward integer transform subroutine 800 suitable for use in at least one embodiment (e.g., the transform block processing routine 600 or the quadratic transform subroutine 900 described below with reference to FIG. 9).

At execution block 804, the forward integer transform subroutine obtains a transform block, for example, from the transform block processing routine 600.

At decision block 808, if the current transform block is a 4 x 4 integer transform coefficient block, then at execution block 812, the forward integer transform subroutine 800 executes a 4 x 4 forward transform, such as the 4 x 4 forward integer transform operation described above. The forward integer transform subroutine 800 then returns the transform coefficients obtained via the 4 x 4 integer transform at return block 899.

Referring again to decision block 808, if the current transform block is not a 4 x 4 integer transform coefficient block, e.g., an 8x8, 16x16, 32 x 32, or 64 x 64 integer transform coefficient block, then forward integer transform subroutine 800 proceeds to decision block 816.

At decision block 816, if the current transform block is an 8 × 8 integer transform coefficient block, then at execution block 820, the forward integer transform subroutine 800 executes an 8 × 8 forward transform, such as the 8 × 8 forward integer transform operation described above. At execution block 824, the forward integer transform subroutine 800 shifts the transform coefficients obtained at execution block 820 by 2 bits to the right to ensure that the transform coefficients can be represented by no more than 16 bits. The forward integer transform subroutine 800 returns the shifted transform coefficients at return block 899.

Referring again to decision block 816, if the current transform block is not an 8x8 integer transform coefficient block (e.g., if it is a 16x16, 32 x 32, or 64 x 64 integer transform coefficient block), the forward integer transform subroutine 800 proceeds to decision block 826.

At decision block 826, if the current transform block is a 16 × 16 integer transform coefficient block, then at execution block 828, the forward integer transform subroutine 800 performs a 16 × 16 forward transform, such as the 16 × 16 forward integer transform operation described above. The forward integer transform subroutine 800 then proceeds to execution block 824. At execution block 824, the forward integer transform subroutine 800 shifts the transform coefficients obtained via the 16 × 16 integer transform at execution block 828 by 2 bits to the right to ensure that the transform coefficients can be represented by no more than 16 bits. The forward integer transform subroutine 800 then returns the shifted transform coefficients at return block 899.

Referring again to decision block 826, if the current transform block is larger than a 16x16 integer transform coefficient block, e.g., a 32 x 32 or 64 x 64 integer transform coefficient block, then at execution block 832, the forward integer transform subroutine 800 executes a large transform procedure. The forward integer transform subroutine 800 returns the result of the large integer transform procedure at return block 899.
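By way of illustration only, the size-based dispatch of the forward integer transform subroutine 800 may be sketched as follows, using the vector form of the transform. The function names and the `matrices` argument are hypothetical; the transform matrices themselves are supplied by the caller and are not reproduced here. The large transform procedure is not detailed in this section and is therefore left unimplemented in the sketch.

```python
# Right shift applied after each forward transform size (4-point: none, 8- and 16-point: 2 bits).
FORWARD_SHIFT = {4: 0, 8: 2, 16: 2}

def mat_vec(matrix, vector):
    """Integer matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def forward_integer_transform(residuals, matrices):
    """Transform a residual vector of length 4, 8 or 16, then apply the size-specific shift.

    `matrices` maps a transform size to its forward integer transform matrix
    (an assumption of this sketch; any square integer matrix may be passed).
    """
    size = len(residuals)
    if size not in FORWARD_SHIFT:
        # Execution block 832: larger sizes use a separate large transform procedure.
        raise NotImplementedError("large transform procedure is not sketched here")
    coefficients = mat_vec(matrices[size], residuals)
    return [c >> FORWARD_SHIFT[size] for c in coefficients]
```

The dispatch can be exercised with any placeholder matrices, for instance a 4-point Hadamard-style matrix, which is not the matrix of the described embodiment.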

Quadratic transform subroutine

FIG. 9 illustrates a quadratic transform subroutine 900 suitable for use in at least one embodiment (e.g., the transform block processing routine 600).

At execution block 904, the quadratic transform subroutine 900 obtains a transform block of intermediate integer transform coefficients for the current coding block.

At execution block 908, the quadratic transform subroutine 900 extracts an intermediate DC coefficient from each block of intermediate integer transform coefficients.

At execution block 912, the quadratic transform subroutine 900 generates a transform block of intermediate DC coefficients.

The quadratic transform subroutine 900 then passes the transform block of intermediate DC coefficients to the forward integer transform subroutine 800, which returns a block of 16-bit integer transform coefficients (now quadratic transformed).

The quadratic transform subroutine 900 returns the quadratic transformed block at return block 999.
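By way of illustration only, the collection of DC coefficients at execution blocks 908 and 912 may be sketched as follows. The function name is hypothetical, and the sketch assumes that the DC coefficient is stored at position [0][0] of each transform block and that the DC block is filled in raster order, neither of which is stated in the text.

```python
import math

def collect_dc_block(transform_blocks):
    """Gather the DC coefficient of each transform block into a square DC block."""
    count = len(transform_blocks)
    side = math.isqrt(count)
    if side * side != count:
        raise ValueError("number of transform blocks must be a perfect square")
    # Assumption: the DC coefficient sits at row 0, column 0 of each block.
    dc_values = [block[0][0] for block in transform_blocks]
    # Assumption: the DC block is filled row by row in the order the blocks are given.
    return [dc_values[r * side:(r + 1) * side] for r in range(side)]
```

With 16 transform blocks the result is a 4 × 4 DC block, and with 64 transform blocks an 8 × 8 DC block, which would then be passed to the forward integer transform.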

Transform block recovery routine

Fig. 10 illustrates a transform block recovery routine 1000 suitable for use in at least one embodiment (e.g., decoder 500). Those of ordinary skill in the art will recognize that not all events in the decoding process are shown in fig. 10. Rather, for clarity, only those steps reasonably relevant to describing the transform block recovery routine 1000 are shown.

At execution block 1004, the transform block recovery routine 1000 obtains a dequantized transform coefficient block, e.g., from the inverse quantizer 512.

At execution block 1005, the transform block recovery routine 1000 determines the size of the currently encoded block.

At execution block 1006, the transform block recovery routine 1000 determines the size of a prediction block used for motion prediction of the current coding block.

At execution block 1007, the transform block recovery routine 1000 looks up the transform block size for the corresponding combination of the current coding block size and the size of the prediction block used for motion prediction of the current coding block.

Then at execution block 1008, the transform block recovery routine 1000 assembles the dequantized transform coefficients into one or more transform blocks of 16-bit integer transform coefficients according to the transform block size obtained at execution block 1007 above.

At decision block 1028, if the transform blocks of the current coding block have not been quadratic transformed, the transform block recovery routine 1000 proceeds to a start loop block 1032, described below. If the transform blocks of the current coding block have been quadratic transformed (e.g., if they include a quadratic transform block of 16-bit integer DC transform coefficients), the transform block recovery routine 1000 invokes an inverse integer transform subroutine 1100 (described below with reference to fig. 11), which performs an initial inverse transform operation on the quadratic transform block of 16-bit integer transform coefficients of the current coding block and returns a corresponding block of intermediate 16-bit integer DC transform coefficients.

At execution block 1030, the transform block recovery routine 1000 inserts the appropriate 16-bit integer DC transform coefficient into the corresponding 16-bit integer transform coefficient block and proceeds to a start loop block 1032, described below.

Beginning at start loop block 1032, the transform block recovery routine 1000 processes each transform block of 16-bit integer transform coefficients in turn.

At decision block 1036, if the transform block mode flag for the corresponding transform block is set in the transform block header, at end loop block 1040, the transform block recovery routine 1000 iterates back to the start loop block 1032 to process the next block (if any) of 16-bit integer transform coefficients for the currently encoded block.

If at decision block 1036 the transform block mode flag for the corresponding transform block is not set in the transform block header, then the transform block recovery routine 1000 calls the inverse integer transform subroutine 1100 (described below with reference to FIG. 11), which returns the block of recovered residual values.

At end loop block 1040, the transform block recovery routine 1000 iterates back to the start loop block 1032 to process the next transform block (if any) of the currently encoded block.

The transform block recovery routine 1000 ends at terminator block 1099.
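By way of illustration only, the flow of the transform block recovery routine 1000 from decision block 1028 onward may be sketched as follows. The function and parameter names are hypothetical, and `inverse_transform` stands in for the inverse integer transform subroutine 1100. The sketch assumes that the DC coefficient occupies position [0][0] of each block, and that a block whose mode flag is set is recovered as all-zero residual values, consistent with the flag being set on the encoder side only for all-zero residual blocks.

```python
def recover_transform_blocks(blocks, zero_flags, inverse_transform, dc_block=None):
    """Return the recovered residual blocks for one coding block."""
    if dc_block is not None:
        # Decision block 1028 / execution block 1030: undo the quadratic transform
        # and insert each recovered DC coefficient into its transform block.
        dc_values = [v for row in inverse_transform(dc_block) for v in row]
        blocks = [[row[:] for row in block] for block in blocks]  # avoid mutating input
        for block, dc in zip(blocks, dc_values):
            block[0][0] = dc
    recovered = []
    for block, flagged in zip(blocks, zero_flags):
        if flagged:
            # Decision block 1036: flag set, so the inverse transform is skipped.
            recovered.append([[0] * len(row) for row in block])
        else:
            recovered.append(inverse_transform(block))
    return recovered
```

As with the encoder-side sketch, an identity function may be substituted for the inverse transform when only the control flow is of interest.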

Inverse integer transform subroutine

FIG. 11 illustrates an inverse integer transform subroutine 1100 suitable for use in at least one embodiment (e.g., the transform block recovery routine 1000).

At execution block 1104, the inverse integer transform subroutine 1100 obtains a transform block, for example, from the transform block recovery routine 1000.

At decision block 1108, if the transform block is a 4 x 4 transform block, at execution block 1110, the inverse integer transform subroutine 1100 performs a 4 x 4 inverse integer transform, such as the 4 x 4 inverse integer transform described above. At execution block 1112, the inverse integer transform subroutine 1100 shifts the resulting integer transform coefficients by 5 bits to the right. The inverse integer transform subroutine 1100 returns the shifted integer transform coefficients at return block 1199.

Referring again to decision block 1108, if the transform block is not a 4 x 4 transform block, then inverse integer transform subroutine 1100 proceeds to decision block 1116.

At decision block 1116, if the transform block is an 8 × 8 transform block, then at execution block 1118, the inverse integer transform subroutine 1100 performs an 8 × 8 inverse integer transform, such as the 8 × 8 inverse integer transform described above. At execution block 1120, the inverse integer transform subroutine 1100 shifts the resulting integer transform coefficients by 7 bits to the right. The inverse integer transform subroutine 1100 returns the shifted integer transform coefficients at return block 1199.

Referring again to decision block 1116, if the transform block is not an 8x8 transform block, the inverse integer transform subroutine 1100 proceeds to decision block 1126.

At decision block 1126, if the transform block is a 16 × 16 transform block, at execution block 1127, the inverse integer transform subroutine 1100 performs a 16 × 16 inverse integer transform, such as the 16 × 16 inverse integer transform described above. At execution block 1128, the inverse integer transform subroutine 1100 shifts the resulting integer transform coefficients by 7 bits to the right. The inverse integer transform subroutine 1100 returns the shifted integer transform coefficients at return block 1199.

Referring again to decision block 1126, if the transform block is larger than a 16 × 16 transform block, such as a 32 × 32 or 64 × 64 transform block, then at execution block 1132, the inverse integer transform subroutine 1100 executes a large inverse transform procedure. At return block 1199, the inverse integer transform subroutine 1100 returns the result of the large inverse transform procedure.
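By way of illustration only, the inverse integer transform subroutine 1100 may be sketched in vector form with its size-specific right shifts of 5, 7 and 7 bits. The function name and the `inverse_matrices` argument are hypothetical, the inverse matrices are supplied by the caller rather than reproduced here, and a plain arithmetic shift without a rounding offset is assumed.

```python
# Right shift applied after each inverse transform size.
INVERSE_SHIFT = {4: 5, 8: 7, 16: 7}

def inverse_integer_transform(coefficients, inverse_matrices):
    """Inverse-transform a coefficient vector of length 4, 8 or 16, then shift.

    `inverse_matrices` maps a transform size to its inverse integer transform
    matrix (an assumption of this sketch).
    """
    size = len(coefficients)
    if size not in INVERSE_SHIFT:
        # Execution block 1132: larger sizes use a separate large inverse transform procedure.
        raise NotImplementedError("large inverse transform procedure is not sketched here")
    raw = [sum(m * c for m, c in zip(row, coefficients))
           for row in inverse_matrices[size]]
    return [value >> INVERSE_SHIFT[size] for value in raw]
```

A scaled identity matrix is enough to check the shifts: with a 4 × 4 matrix of 32 on the diagonal, the 5-bit shift returns the input vector unchanged.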

Recursive coding block partitioning scheme

Fig. 12 illustrates an exemplary recursive coding block partitioning scheme 1100 that may be implemented by the encoder 400, in accordance with various embodiments. At block indexer 408, after the frame is divided into LCB-sized pixel regions (hereinafter referred to as coding block candidates ("CBCs")), each LCB-sized coding block candidate ("LCBC") may be divided into smaller CBCs according to the recursive coding block partitioning scheme 1100. This process may continue recursively until block indexer 408 determines that (1) the current CBC is suitable for encoding (e.g., because the current CBC contains only pixels of a single value) or (2) the current CBC is a CBC of the minimum coding block ("MCB") size (an "MCBC"), whichever occurs first. Block indexer 408 may then index the current CBC as a coding block suitable for encoding.

The square CBC 1102 (e.g., an LCBC) may be divided along one or both of the vertical and horizontal transverse axes 1104, 1106. The division along the vertical transverse axis 1104 vertically divides the square CBC 1102 into a first rectangular coding block structure 1108, as shown by rectangular (1:2) CBCs 1110 and 1112 taken together. The division along the horizontal transverse axis 1106 horizontally divides the square CBC 1102 into a second rectangular coding block structure 1114, as shown by rectangular (2:1) CBCs 1116 and 1118 taken together. The division along both the vertical and horizontal transverse axes 1104, 1106 divides the square CBC 1102 into a four square coding block structure 1120, as shown by the square CBCs 1122, 1124, 1126, and 1128 taken together.

The rectangle (1:2) CBC (e.g., CBC 1112) of the first rectangle coding block structure 1108 may be divided along a horizontal axis 1130 into a first two square coding block structure 1132, as shown by the square CBCs 1134 and 1136 taken together.

The rectangular (2:1) CBC (e.g., CBC 1118) of the second rectangular coding structure 1114 may be divided into a second two square coding block structure 1138, as shown by the combination of square CBCs 1140 and 1142.

The square CBC of the four square coding block structure 1120, the first two square coding block structure 1132 or the second two square coding block structure 1138 may be divided along one or both of the vertical and horizontal lateral axes of the coding blocks in the same manner as the CBC 1102.

For example, a 64 × 64 pixel LCBC-sized coding block may be divided into two 32 × 64 pixel coding blocks, two 64 × 32 pixel coding blocks, or four 32 × 32 pixel coding blocks.
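The three ways of dividing a square CBC can be sketched as follows. This is a minimal illustration only: the `(x, y, width, height)` tuple layout, the function name `split_cbc`, and the mode names are assumptions made for this example and do not appear in the described embodiments.

```python
from typing import List, Tuple

# Hypothetical representation of a CBC: (x, y, width, height) in pixels.
Block = Tuple[int, int, int, int]


def split_cbc(block: Block, mode: str) -> List[Block]:
    """Divide a CBC along its vertical axis, its horizontal axis, or both."""
    x, y, w, h = block
    half_w, half_h = w // 2, h // 2
    if mode == "vertical":
        # Cut along the vertical axis: two (1:2) halves, left then right.
        return [(x, y, half_w, h), (x + half_w, y, half_w, h)]
    if mode == "horizontal":
        # Cut along the horizontal axis: two (2:1) halves, top then bottom.
        return [(x, y, w, half_h), (x, y + half_h, w, half_h)]
    if mode == "quad":
        # Cut along both axes: four squares in raster order.
        return [
            (x, y, half_w, half_h),
            (x + half_w, y, half_w, half_h),
            (x, y + half_h, half_w, half_h),
            (x + half_w, y + half_h, half_w, half_h),
        ]
    raise ValueError(f"unknown split mode: {mode}")


if __name__ == "__main__":
    for mode in ("vertical", "horizontal", "quad"):
        print(mode, [(w, h) for _, _, w, h in split_cbc((0, 0, 64, 64), mode)])
```

Applied to a 64 × 64 block, the three modes yield two 32 × 64 blocks, two 64 × 32 blocks, and four 32 × 32 blocks respectively, matching the example above.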

In the coded bitstream, a 2-bit coded block partition flag may be used to indicate whether the current coded block is further partitioned:

Coding block partition flag value    Partition type
00                                   The current coding block is not divided
01                                   The current coding block is divided horizontally
10                                   The current coding block is divided vertically
11                                   The current coding block is divided both horizontally and vertically
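A decoder-side reading of the 2-bit flag might look like the sketch below. The enum name `PartitionType`, the function `parse_partition_flag`, and the choice of passing the flag as a two-character bit string are assumptions for illustration; only the four flag values and their meanings come from the table above.

```python
from enum import Enum


class PartitionType(Enum):
    """Meaning of the 2-bit coding block partition flag (names are illustrative)."""
    NOT_DIVIDED = 0b00
    HORIZONTAL = 0b01
    VERTICAL = 0b10
    BOTH = 0b11


def parse_partition_flag(bits: str) -> PartitionType:
    """Map a two-character bit string such as '10' to its partition type."""
    if len(bits) != 2 or any(b not in "01" for b in bits):
        raise ValueError(f"expected two binary digits, got {bits!r}")
    return PartitionType(int(bits, 2))


if __name__ == "__main__":
    for flag in ("00", "01", "10", "11"):
        print(flag, parse_partition_flag(flag).name)
```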

Coding block indexing routine

FIG. 13 illustrates an exemplary coding block indexing routine 1300 that may be performed by block indexer 408 in accordance with various embodiments.

The coding block indexing routine 1300 may obtain a frame of a video sequence at execution block 1302.

The coding block indexing routine 1300 may divide the frame into LCBCs at execution block 1304.

At start loop block 1306, the coding block indexing routine 1300 may process each LCBC in turn, e.g., starting from the LCBC in the upper left corner of the frame and proceeding from left to right, top to bottom.

At subroutine block 1400, the coding block indexing routine 1300 calls the coding block partitioning subroutine 1400 (described below with reference to Fig. 14).

At end loop block 1308, the coding block indexing routine 1300 loops back to the start loop block 1306 to process the next LCBC (if any) of the frame.

The coding block indexing routine 1300 ends at return block 1399.
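The left-to-right, top-to-bottom traversal of LCBCs can be sketched as a simple generator. The function name `lcbc_origins` and the default LCB size of 64 are assumptions for this example; handling of frames whose dimensions are not multiples of the LCB size is not addressed in this passage and is left out of the sketch.

```python
from typing import Iterator, Tuple


def lcbc_origins(frame_w: int, frame_h: int, lcb_size: int = 64) -> Iterator[Tuple[int, int]]:
    """Yield the top-left corner of each LCBC in raster order.

    Assumes frame_w and frame_h are multiples of lcb_size.
    """
    for y in range(0, frame_h, lcb_size):        # top to bottom
        for x in range(0, frame_w, lcb_size):    # left to right
            yield (x, y)


if __name__ == "__main__":
    # A 128 x 128 frame holds four 64 x 64 LCBCs.
    print(list(lcbc_origins(128, 128)))
```

Each yielded origin would then be handed to the partitioning step for that LCBC.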

Coding block partitioning subroutine

FIG. 14 illustrates an exemplary coding block partitioning subroutine 1400 that may be performed, for example, by block indexer 408, in accordance with various embodiments.

The coding block partitioning subroutine 1400 obtains a CBC at execution block 1402. The CBC may be provided by the coding block indexing routine 1300 or recursively, as described below.

At decision block 1404, if the obtained CBC is an MCBC, the coding block partitioning subroutine 1400 may proceed to execution block 1406; otherwise the coding block partitioning subroutine 1400 may proceed to execution block 1408.

The coding block partitioning subroutine 1400 may index the obtained CBC as a coding block at execution block 1406. The coding block partitioning subroutine 1400 then terminates at return block 1498.

The coding block partitioning subroutine 1400 may test the encoding suitability of the current CBC at execution block 1408. For example, the coding block partitioning subroutine 1400 may analyze the pixel values of the current CBC and determine whether the current CBC contains only pixels of a single value or whether the current CBC matches a predefined pattern.

At decision block 1410, if the current CBC is suitable for encoding, the coding block partitioning subroutine 1400 may proceed to execution block 1406; otherwise the coding block partitioning subroutine 1400 may proceed to decision block 1412.

At decision block 1412, if the current CBC is a square CBC, the coding block partitioning subroutine 1400 may proceed to execution block 1414; otherwise the coding block partitioning subroutine 1400 may proceed to execution block 1416.

The coding block partitioning subroutine 1400 may select a coding block partitioning structure for the current square CBC at execution block 1414. For example, the coding block partitioning subroutine 1400 may select among the first rectangular coding block structure 1108, the second rectangular coding block structure 1114, or the four square coding block structure 1120 of the recursive coding block partitioning scheme 1100 (described above with reference to Fig. 11).

The coding block partitioning subroutine 1400 may partition the current CBC into 2 or 4 sub-CBCs according to the recursive coding block partitioning scheme 1100 at execution block 1416.

At start loop block 1418, the coding block partitioning subroutine 1400 may process in turn each sub-CBC resulting from the partitioning procedure of execution block 1416.

At subroutine block 1400, the coding block partitioning subroutine 1400 may call itself to process the current sub-CBC in the manner presently described.

At end loop block 1420, the coding block partitioning subroutine 1400 loops back to the start loop block 1418 to process the next sub-CBC (if any) of the current CBC.

The coding block partitioning subroutine 1400 may then terminate at return block 1499.
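The recursive flow of subroutine 1400 can be sketched as below. This is a simplified illustration under stated assumptions: the suitability test and the structure selection are passed in as hypothetical callbacks (`is_suitable`, `choose_split`), the MCB is assumed to be a square of side `mcb`, blocks are hypothetical `(x, y, width, height)` tuples, and "indexing" is modeled as returning a list of leaf blocks.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int, int]  # hypothetical (x, y, width, height)


def split(block: Block, mode: str) -> List[Block]:
    """Divide a block into 2 or 4 sub-blocks (execution block 1416)."""
    x, y, w, h = block
    if mode == "vertical":
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "horizontal":
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "quad":
        return [(x + dx, y + dy, w // 2, h // 2)
                for dy in (0, h // 2) for dx in (0, w // 2)]
    raise ValueError(f"unknown split mode: {mode}")


def partition(block: Block,
              is_suitable: Callable[[Block], bool],
              choose_split: Callable[[Block], str],
              mcb: int = 8) -> List[Block]:
    """Return the coding blocks (leaves) obtained from one CBC."""
    _, _, w, h = block
    # Decision blocks 1404 and 1410: index MCB-sized or suitable CBCs as-is.
    if (w <= mcb and h <= mcb) or is_suitable(block):
        return [block]
    if w == h:
        # Decision block 1412 / execution block 1414: pick a structure.
        mode = choose_split(block)
    else:
        # A rectangular CBC is divided into two squares.
        mode = "horizontal" if h > w else "vertical"
    leaves: List[Block] = []
    for sub in split(block, mode):               # loop 1418 to 1420
        leaves.extend(partition(sub, is_suitable, choose_split, mcb))
    return leaves


if __name__ == "__main__":
    never = lambda b: False
    print(partition((0, 0, 16, 16), never, lambda b: "vertical"))
```

With a suitability test that always fails, a 16 × 16 CBC split vertically is reduced to four 8 × 8 coding blocks, because each 8 × 16 half is rectangular and is therefore divided into two squares.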

Coding block tree partitioning procedure

Figs. 15a-15c illustrate an exemplary coding block tree partitioning procedure 1500 that applies the coding block partitioning scheme 1100 to a "root" LCBC 1502. Fig. 15a shows the individual sub-coding blocks 1504 to 1554 created by the coding block tree partitioning procedure 1500; Fig. 15b shows the coding block tree partitioning procedure as a tree data structure, showing the parent/child relationships between the respective coding blocks 1502 to 1554; Fig. 15c shows the various "leaf node" sub-coding blocks of Fig. 15b located at their respective positions within the configuration of the root coding block 1502 (as indicated by the dashed lines).

Assuming that the 64 × 64 LCBC 1502 is not suitable for encoding, it may be divided into the first rectangular coding block structure 1108, the second rectangular coding block structure 1114, or the four square coding block structure 1120 of the recursive coding block partitioning scheme 1100 (described above with reference to Fig. 11). For this example, assume that the 64 × 64 LCBC 1502 is divided into two 32 × 64 sub-CBCs, namely 32 × 64 CBC 1504 and 32 × 64 CBC 1506. Each of these sub-CBCs may then be processed in turn.

Assuming that the first sub-CBC of the 64 × 64 LCBC 1502, 32 × 64 CBC 1504, is not suitable for encoding, it may be divided into two 32 × 32 sub-CBCs, namely 32 × 32 CBC 1508 and 32 × 32 CBC 1510. Each of these sub-CBCs may then be processed in turn.

Assuming that the first sub-CBC of the 32 × 64 CBC 1504, 32 × 32 CBC 1508, is not suitable for encoding, it may be divided into two 16 × 32 sub-CBCs, namely 16 × 32 CBC 1512 and 16 × 32 CBC 1514. Each of these sub-CBCs may then be processed in turn.

The encoder 400 may determine that the first sub-CBC of the 32 × 32 CBC 1508, 16 × 32 CBC 1512, is suitable for encoding; the encoder 400 may index the 16 × 32 CBC 1512 as coding block 1513 and return to the parent 32 × 32 CBC 1508 to process its next child (if any).

The encoder 400 may determine that the second sub-CBC of the 32 × 32 CBC 1508, 16 × 32 CBC 1514, is not suitable for encoding; it may be divided into two 16 × 16 sub-CBCs, namely 16 × 16 CBC 1516 and 16 × 16 CBC 1518. Each of these sub-CBCs may then be processed in turn.

Assuming that the first sub-CBC of the 16 × 32 CBC 1514, 16 × 16 CBC 1516, is not suitable for encoding, it may be divided into two 8 × 16 sub-CBCs, namely 8 × 16 CBC 1520 and 8 × 16 CBC 1522. Each of these sub-CBCs may then be processed in turn.

The encoder 400 may determine that the first sub-CBC of the 16 × 16 CBC 1516, 8 × 16 CBC 1520, is suitable for encoding; the encoder 400 may index the 8 × 16 CBC 1520 as coding block 1521 and return to the parent 16 × 16 CBC 1516 to process its next child (if any).

The encoder 400 may determine that the second sub-CBC of the 16 × 16 CBC 1516, 8 × 16 CBC 1522, is suitable for encoding; the encoder 400 may index the 8 × 16 CBC 1522 as coding block 1523 and return to the parent 16 × 16 CBC 1516 to process its next child (if any).

All the children of the 16 × 16 CBC 1516 have now been processed, resulting in the indexing of the 8 × 16 coding blocks 1521 and 1523. Thus, the encoder 400 may return to the parent 16 × 32 CBC 1514 to process its next child (if any).

Assuming that the second sub-CBC of the 16 × 32 CBC 1514, 16 × 16 CBC 1518, is not suitable for encoding, it may be divided into two 8 × 16 sub-CBCs, namely 8 × 16 CBC 1524 and 8 × 16 CBC 1526. Each of these sub-CBCs may then be processed in turn.

Assuming that the first sub-CBC of the 16 × 16 CBC 1518, 8 × 16 CBC 1524, is not suitable for encoding, it may be divided into two 8 × 8 sub-CBCs, namely 8 × 8 CBC 1528 and 8 × 8 CBC 1530. Each of these sub-CBCs may then be processed in turn.

The encoder 400 may determine that the first sub-CBC of the 8 × 16 CBC 1524, 8 × 8 CBC 1528, is suitable for encoding; the encoder 400 may index the 8 × 8 CBC 1528 as coding block 1529 and then return to the parent 8 × 16 CBC 1524 to process its next child (if any).

The encoder 400 may determine that the second sub-CBC of the 8 × 16 CBC 1524, 8 × 8 CBC 1530, is suitable for encoding; the encoder 400 may index the 8 × 8 CBC 1530 as coding block 1531 and then return to the parent 8 × 16 CBC 1524 to process its next child (if any).

All the children of the 8 × 16 CBC 1524 have now been processed, resulting in the indexing of the 8 × 8 coding blocks 1529 and 1531. Thus, the encoder 400 may return to the parent 16 × 16 CBC 1518 to process its next child (if any).

The encoder 400 may determine that the second sub-CBC of the 16 × 16 CBC 1518, 8 × 16 CBC 1526, is suitable for encoding; the encoder 400 may index the 8 × 16 CBC 1526 as coding block 1527 and then return to the parent 16 × 16 CBC 1518 to process its next child (if any).

All the children of the 16 × 16 CBC 1518 have now been processed, resulting in the indexing of the 8 × 8 coding blocks 1529 and 1531 and the 8 × 16 coding block 1527. The encoder 400 may return to the parent 16 × 32 CBC 1514 to process its next child (if any).

All the children of the 16 × 32 CBC 1514 have now been processed, resulting in the indexing of the 8 × 8 coding blocks 1529 and 1531 and the 8 × 16 coding blocks 1521, 1523, and 1527. The encoder 400 may return to the parent 32 × 32 CBC 1508 to process its next child (if any).

All the children of the 32 × 32 CBC 1508 have now been processed, resulting in the indexing of the 8 × 8 coding blocks 1529 and 1531, the 8 × 16 coding blocks 1521, 1523, and 1527, and the 16 × 32 coding block 1513. The encoder 400 may return to the parent 32 × 64 CBC 1504 to process its next child (if any).

The encoder 400 may determine that the second sub-CBC of the 32 × 64 CBC 1504, 32 × 32 CBC 1510, is suitable for encoding; the encoder 400 may index the 32 × 32 CBC 1510 as coding block 1511 and then return to the parent 32 × 64 CBC 1504 to process its next child (if any).

All the children of the 32 × 64 CBC 1504 have now been processed, resulting in the indexing of the 8 × 8 coding blocks 1529 and 1531, the 8 × 16 coding blocks 1521, 1523, and 1527, the 16 × 32 coding block 1513, and the 32 × 32 coding block 1511. The encoder 400 may return to the parent, the root 64 × 64 LCBC 1502, to process its next child (if any).

Assuming that the second sub-CBC of the 64 × 64 LCBC 1502, 32 × 64 CBC 1506, is not suitable for encoding, it may be divided into two 32 × 32 sub-CBCs, namely 32 × 32 CBC 1532 and 32 × 32 CBC 1534. Each of these sub-CBCs may then be processed in turn.

Assuming that the first sub-CBC of the 32 × 64 CBC 1506, 32 × 32 CBC 1532, is not suitable for encoding, it may be divided into two 32 × 16 sub-CBCs, namely 32 × 16 CBC 1536 and 32 × 16 CBC 1538. Each of these sub-CBCs may then be processed in turn.

The encoder 400 may determine that the first sub-CBC of the 32 × 32 CBC 1532, 32 × 16 CBC 1536, is suitable for encoding; the encoder 400 may index the 32 × 16 CBC 1536 as coding block 1537 and then return to the parent 32 × 32 CBC 1532 to process its next child (if any).

The encoder 400 may determine that the second sub-CBC of the 32 × 32 CBC 1532, 32 × 16 CBC 1538, is suitable for encoding; the encoder 400 may index the 32 × 16 CBC 1538 as coding block 1539 and then return to the parent 32 × 32 CBC 1532 to process its next child (if any).

All the children of the 32 × 32 CBC 1532 have now been processed, resulting in the indexing of the 32 × 16 coding blocks 1537 and 1539. The encoder 400 may return to the parent 32 × 64 CBC 1506 to process its next child (if any).

Assuming that the second sub-CBC of the 32 × 64 CBC 1506, 32 × 32 CBC 1534, is not suitable for encoding, it may be divided into four 16 × 16 sub-CBCs, namely 16 × 16 CBC 1540, 16 × 16 CBC 1542, 16 × 16 CBC 1544, and 16 × 16 CBC 1546. Each of these sub-CBCs may then be processed in turn.

The encoder 400 may determine that the first sub-CBC of the 32 × 32 CBC 1534, 16 × 16 CBC 1540, is suitable for encoding; the encoder 400 may index the 16 × 16 CBC 1540 as coding block 1541 and then return to the parent 32 × 32 CBC 1534 to process its next child (if any).

The encoder 400 may determine that the second sub-CBC of the 32 × 32 CBC 1534, 16 × 16 CBC 1542, is suitable for encoding; the encoder 400 may index the 16 × 16 CBC 1542 as coding block 1543 and then return to the parent 32 × 32 CBC 1534 to process its next child (if any).

Assuming that the third sub-CBC of the 32 × 32 CBC 1534, 16 × 16 CBC 1544, is not suitable for encoding, it may be divided into four 8 × 8 sub-CBCs, namely 8 × 8 CBC 1548, 8 × 8 CBC 1550, 8 × 8 CBC 1552, and 8 × 8 CBC 1554. Each of these sub-CBCs may then be processed in turn.

The encoder 400 may determine that the first sub-CBC of the 16 × 16 CBC 1544, 8 × 8 CBC 1548, is suitable for encoding; the encoder 400 may index the 8 × 8 CBC 1548 as coding block 1549 and then return to the parent 16 × 16 CBC 1544 to process its next child (if any).

The encoder 400 may determine that the second sub-CBC of the 16 × 16 CBC 1544, 8 × 8 CBC 1550, is suitable for encoding; the encoder 400 may index the 8 × 8 CBC 1550 as coding block 1551 and then return to the parent 16 × 16 CBC 1544 to process its next child (if any).

The encoder 400 may determine that the third sub-CBC of the 16 × 16 CBC 1544, 8 × 8 CBC 1552, is suitable for encoding; the encoder 400 may index the 8 × 8 CBC 1552 as coding block 1553 and then return to the parent 16 × 16 CBC 1544 to process its next child (if any).

The encoder 400 may determine that the fourth sub-CBC of the 16 × 16 CBC 1544, 8 × 8 CBC 1554, is suitable for encoding; the encoder 400 may index the 8 × 8 CBC 1554 as coding block 1555 and then return to the parent 16 × 16 CBC 1544 to process its next child (if any).

All the children of the 16 × 16 CBC 1544 have now been processed, resulting in the indexing of the 8 × 8 coding blocks 1549, 1551, 1553, and 1555. The encoder 400 may return to the parent 32 × 32 CBC 1534 to process its next child (if any).

The encoder 400 may determine that the fourth sub-CBC of the 32 × 32 CBC 1534, 16 × 16 CBC 1546, is suitable for encoding; the encoder 400 may index the 16 × 16 CBC 1546 as coding block 1547 and then return to the parent 32 × 32 CBC 1534 to process its next child (if any).

All the children of the 32 × 32 CBC 1534 have now been processed, resulting in the indexing of the 16 × 16 coding blocks 1541, 1543, and 1547 and the 8 × 8 coding blocks 1549, 1551, 1553, and 1555. The encoder 400 may return to the parent 32 × 64 CBC 1506 to process its next child (if any).

All the children of the 32 × 64 CBC 1506 have now been processed, resulting in the indexing of the 32 × 16 coding blocks 1537 and 1539; the 16 × 16 coding blocks 1541, 1543, and 1547; and the 8 × 8 coding blocks 1549, 1551, 1553, and 1555. The encoder 400 may return to the parent, the root 64 × 64 LCBC 1502, to process its next child (if any).

All the children of the 64 × 64 LCBC 1502 have now been processed, resulting in the indexing of the 8 × 8 coding blocks 1529, 1531, 1549, 1551, 1553, and 1555; the 8 × 16 coding blocks 1521, 1523, and 1527; the 16 × 32 coding block 1513; the 32 × 32 coding block 1511; the 32 × 16 coding blocks 1537 and 1539; and the 16 × 16 coding blocks 1541, 1543, and 1547. The encoder 400 may proceed to the next LCBC (if any) of the frame.
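The walkthrough above can be checked mechanically by writing the tree down and confirming that the leaf coding blocks exactly tile the 64 × 64 root. The nested-tuple encoding `(width, height, payload)` is an assumption made for this sketch, where the payload is either the reference numeral of the indexed coding block (a leaf) or a list of children; the sizes and numerals are taken from the text.

```python
from collections import Counter

# (width, height, payload): payload is a coding block numeral or a child list.
TREE = (64, 64, [                                              # root LCBC 1502
    (32, 64, [                                                 # CBC 1504
        (32, 32, [                                             # CBC 1508
            (16, 32, 1513),                                    # CBC 1512
            (16, 32, [                                         # CBC 1514
                (16, 16, [(8, 16, 1521), (8, 16, 1523)]),      # CBC 1516
                (16, 16, [                                     # CBC 1518
                    (8, 16, [(8, 8, 1529), (8, 8, 1531)]),     # CBC 1524
                    (8, 16, 1527),                             # CBC 1526
                ]),
            ]),
        ]),
        (32, 32, 1511),                                        # CBC 1510
    ]),
    (32, 64, [                                                 # CBC 1506
        (32, 32, [(32, 16, 1537), (32, 16, 1539)]),            # CBC 1532
        (32, 32, [                                             # CBC 1534
            (16, 16, 1541),                                    # CBC 1540
            (16, 16, 1543),                                    # CBC 1542
            (16, 16, [(8, 8, 1549), (8, 8, 1551),
                      (8, 8, 1553), (8, 8, 1555)]),            # CBC 1544
            (16, 16, 1547),                                    # CBC 1546
        ]),
    ]),
])


def leaves(node):
    """Return (numeral, width, height) for every leaf, in processing order."""
    w, h, payload = node
    if isinstance(payload, int):
        return [(payload, w, h)]
    found = []
    for child in payload:
        found.extend(leaves(child))
    # The leaves under a node must cover exactly that node's area.
    assert sum(cw * ch for _, cw, ch in found) == w * h
    return found


LEAVES = leaves(TREE)
SIZE_COUNTS = Counter((w, h) for _, w, h in LEAVES)

if __name__ == "__main__":
    print(len(LEAVES), "coding blocks")
    print(dict(SIZE_COUNTS))
```

The size counts produced this way agree with the final summary: six 8 × 8, three 8 × 16, one 16 × 32, one 32 × 32, two 32 × 16, and three 16 × 16 coding blocks.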

Alternative forward integer transform procedure for rectangular coded blocks

Referring to the function of transformer 420, the transformer receives a block of residual values for the luma and chroma values of each encoded block and divides the block of residual values into one or more luma and chroma transform blocks.

In at least one implementation, the transform block size is equal to the prediction block size, where the prediction block size is equal to the coding block size.

After the predicted values of the coding block are selected, a prediction block is obtained, and the predicted values in the prediction block can be converted from the spatial domain to the frequency domain, e.g., via a forward transform operation. In at least one embodiment, in order to improve coding efficiency, integer equivalents of the residual values of a transform block are obtained and a forward integer transform operation may be performed.

For a rectangular prediction block, transformer 420 may perform a sequence of 2 one-dimensional transforms similar to the transforms described above. However, unlike the case of square coding blocks, the same transform matrix may not be suitable for both transform operations. For example, for a 16 × 16 block of predicted values, in execution block 828 of the forward integer transform subroutine 800 (described above with reference to Fig. 8), the transformer 420 may: (1) perform a 16-point one-dimensional transform, for example using $T_{16 \times 16}$, on the 16 × 16 block of predicted values to obtain a 16 × 16 block of intermediate transform coefficients; (2) transpose the 16 × 16 block of intermediate transform coefficients to obtain a transposed 16 × 16 block of intermediate transform coefficients; and (3) perform the same 16-point one-dimensional transform, for example using $T_{16 \times 16}$, on the 16 × 16 block of transposed intermediate transform coefficients to obtain a 16 × 16 block of transform coefficients.

In the case of a rectangular coding block (e.g., a 16 × 8 coding block), transformer 420 may: (1) perform a 16-point one-dimensional transform, for example using $T_{16 \times 16}$, on the 16 × 8 block of predicted values to obtain a 16 × 8 block of intermediate transform coefficients; (2) transpose the 16 × 8 block of intermediate transform coefficients to obtain an 8 × 16 block of transposed intermediate transform coefficients; and (3) perform an 8-point one-dimensional transform, for example using $T_{8 \times 8}$, on the 8 × 16 block of transposed intermediate transform coefficients to obtain an 8 × 16 block of transform coefficients.

Similarly, for an 8 × 16 coding block, transformer 420 may: (1) perform an 8-point one-dimensional transform, for example using $T_{8 \times 8}$, on the 8 × 16 block of predicted values to obtain an 8 × 16 block of intermediate transform coefficients; (2) transpose the 8 × 16 block of intermediate transform coefficients to obtain a transposed 16 × 8 block of intermediate transform coefficients; and (3) perform a 16-point one-dimensional transform, for example using $T_{16 \times 16}$, on the 16 × 8 block of transposed intermediate transform coefficients to obtain a 16 × 8 block of transform coefficients.
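The transform, transpose, transform sequence for a rectangular block can be sketched as follows. Two assumptions are made loudly here: a block written "R × C" is treated as R rows by C columns (consistent with the row-count dispatch of subroutine 1700), and Sylvester Hadamard matrices stand in for the actual integer transform matrices, which are defined elsewhere in this document and are not reproduced. The helper names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def hadamard(n: int) -> Matrix:
    """Stand-in n x n transform matrix (n a power of two); NOT the patent's T."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain integer matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def forward_transform_2d(block: Matrix) -> Matrix:
    """Apply a rows-point transform, transpose, then apply a second transform.

    For a 16 x 8 input the first pass is 16-point and the second is 8-point,
    so the two passes use different matrices.
    """
    intermediate = matmul(hadamard(len(block)), block)     # step (1)
    transposed = transpose(intermediate)                   # step (2)
    return matmul(hadamard(len(transposed)), transposed)   # step (3)


if __name__ == "__main__":
    flat = [[1] * 8 for _ in range(16)]                    # 16 x 8 block of ones
    out = forward_transform_2d(flat)
    print(len(out), "x", len(out[0]), "DC =", out[0][0])
```

A constant 16 × 8 input produces an 8 × 16 output whose only nonzero coefficient is the DC term, which is what a separable transform with a constant first basis row should give.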

The transform size S may be signaled in the picture header using the flag M according to the following formula:

$$S = 2^{M}$$

Thus, for a 16-point transform, S = 16 and the flag M has a value of 4.
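Since the transform size is 2 raised to the power of the flag, converting between the two is a base-2 logarithm in one direction and a shift in the other. The function names below are hypothetical and chosen only for this sketch.

```python
def transform_size_to_flag(size: int) -> int:
    """Return M such that size == 2 ** M; size must be a power of two."""
    if size <= 0 or size & (size - 1):
        raise ValueError(f"transform size must be a power of two, got {size}")
    return size.bit_length() - 1


def flag_to_transform_size(m: int) -> int:
    """Return the transform size S = 2 ** M signaled by flag value M."""
    if m < 0:
        raise ValueError("flag value must be non-negative")
    return 1 << m


if __name__ == "__main__":
    for size in (4, 8, 16, 32, 64):
        print(size, "->", transform_size_to_flag(size))
```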

Alternative transform block processing routines

Fig. 16 illustrates an exemplary transform block processing routine 1600 suitable for use with the alternative forward integer transform procedure for rectangular coding blocks described above.

At execution block 1602, the transform block processing routine 1600 may obtain a block of predicted values, for example, from the output of the differentiator 412.

As described above, at execution block 1604, the transform block processing routine 1600 may normalize the prediction values to 16-bit integers.

At subroutine block 1700A, the transform block processing routine 1600 may call the forward integer transform subroutine 1700 (described below with reference to fig. 17) to perform a first of 2 forward integer transform operations on a block of predicted values. Subroutine block 1700A may return a block of intermediate coefficients.

The transform block processing routine 1600 may transpose the block of intermediate coefficients at execution block 1606.

At subroutine block 1700B, the transform block processing routine 1600 may again call the forward integer transform subroutine 1700 to perform the second of the 2 forward integer transform operations on the block of intermediate coefficients. The subroutine block 1700B may return a block of transform coefficients.

At decision block 1608, if a shift operation (such as the shift operation described above) is necessary, the transform block processing routine 1600 may proceed to execution block 1608; otherwise the transform block processing routine 1600 may proceed to termination block 1699.

The transform block processing routine 1600 may perform any necessary shifting operations on the block of transform coefficients at execution block 1608.

The transform block processing routine 1600 may return a block of transform coefficients at termination block 1699.

Alternative forward integer transform subroutine

FIG. 17 illustrates an exemplary forward integer transform subroutine 1700 suitable for use in the transform block processing routine 1600 described above.

The forward integer transform subroutine 1700 may obtain a block of 16-bit integer coefficients ("coefficient block") at execution block 1702. According to the current embodiment, the coefficient block may have one of the following sizes: 64 × 64, 64 × 32, 32 × 64, 32 × 32, 32 × 16, 16 × 32, 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, or 4 × 4.

At decision block 1704, if the number of rows of the coefficient block is greater than 16 (e.g., the number of rows is 64 or 32), the forward integer transform subroutine 1700 may proceed to decision block 1706; otherwise (e.g., the number of rows is 16, 8, or 4), forward integer transform subroutine 1700 may proceed to decision block 1708.

At decision block 1706, if the number of rows of the coefficient block is greater than 32 (e.g., the number of rows is 64), the forward integer transform subroutine 1700 may proceed to execution block 1710; otherwise (e.g., the number of rows is 32), the forward integer transform subroutine 1700 may proceed to execute block 1712.

The forward integer transform subroutine 1700 may perform a 64-point forward integer transform operation on the coefficient block at execution block 1710. The forward integer transform subroutine 1700 may then end by returning the resulting transformed coefficient block at termination block 1795.

The forward integer transform subroutine 1700 may perform a 32-point forward integer transform operation on the coefficient block at execution block 1712. The forward integer transform subroutine 1700 may then end by returning the resulting transformed coefficient block at termination block 1796.

Returning to decision block 1708, if the number of rows of the coefficient block is less than 8 (e.g., the number of rows is 4), the forward integer transform subroutine 1700 may proceed to execution block 1714; otherwise (e.g., the number of rows is 8 or 16), the forward integer transform subroutine 1700 may proceed to decision block 1716.

The forward integer transform subroutine 1700 may perform a 4-point forward integer transform operation on the coefficient block at execution block 1714. The forward integer transform subroutine 1700 may then end by returning the resulting transformed coefficient block at termination block 1797.

At decision block 1716, if the number of rows of the coefficient block is greater than 8 (e.g., the number of rows is 16), the forward integer transform subroutine 1700 may proceed to execution block 1718; otherwise (e.g., the number of rows is 8), the forward integer transform subroutine 1700 may proceed to execution block 1720.

The forward integer transform subroutine 1700 may perform a 16-point forward integer transform operation on the coefficient block at execution block 1718. The forward integer transform subroutine 1700 may then end by returning the resulting transformed coefficient block at termination block 1798.

The forward integer transform subroutine 1700 may perform an 8-point forward integer transform operation on the coefficient block at execution block 1720. The forward integer transform subroutine 1700 may then end by returning the resulting transformed coefficient block at termination block 1799.
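The branching of subroutine 1700 amounts to selecting a transform length from the row count of the coefficient block. The sketch below mirrors the order of the decision blocks; the function name `select_transform` and the returned `(points, execution_block)` pair are assumptions for illustration, and the transforms themselves are not implemented here.

```python
from typing import Tuple


def select_transform(rows: int) -> Tuple[int, int]:
    """Return (transform length in points, execution block numeral) for a block."""
    if rows not in (4, 8, 16, 32, 64):
        raise ValueError(f"unsupported row count: {rows}")
    if rows > 16:                      # decision block 1704
        if rows > 32:                  # decision block 1706
            return 64, 1710
        return 32, 1712
    if rows < 8:                       # decision block 1708
        return 4, 1714
    if rows > 8:                       # decision block 1716
        return 16, 1718
    return 8, 1720


if __name__ == "__main__":
    for rows in (4, 8, 16, 32, 64):
        print(rows, select_transform(rows))
```

Because only the row count is examined, a 16 × 8 block and a 16 × 16 block both take the 16-point branch on the first pass; the second pass, after transposition, is selected by the new row count.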

Although specific embodiments have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that a variety of alternate and/or equivalent implementations may be substituted for the specific embodiments shown and described without departing from the scope of the present disclosure. This application is intended to cover any adaptations or variations of the embodiments discussed herein.
