Method and apparatus for interaction between decoder-side intra mode derivation and adaptive intra prediction modes

Document No.: 1943014    Publication date: 2021-12-07

Reading note: This technology, "Method and apparatus for interaction between decoder-side intra mode derivation and adaptive intra prediction modes," was created by Liang Zhao (赵亮), Xin Zhao (赵欣), and Shan Liu (刘杉) on 2021-02-03. Its main content includes: A method for performing intra prediction of a current block of an image of a video sequence, comprising: determining whether a first flag indicates that an intra prediction mode corresponding to the current block is a directional mode; and based on determining that the first flag indicates that the intra prediction mode corresponding to the current block is the directional mode, determining an index of the intra prediction mode in an allowed intra prediction mode (AIPM) list, and performing intra prediction of the current block using the intra prediction mode corresponding to the determined index in the AIPM list.

1. A method for performing intra prediction for a current block of an image of a video sequence, the method being performed by at least one processor and comprising:

determining whether a first flag indicates that an intra-prediction mode corresponding to the current block is a directional mode; and

based on determining that the first flag indicates that an intra-prediction mode corresponding to the current block is the directional mode:

determining an index of the intra prediction mode in an allowed intra prediction mode (AIPM) list; and

performing intra-prediction of the current block using an intra-prediction mode corresponding to the determined index in the AIPM list.

2. The method of claim 1, wherein the method further comprises:

determining whether a second flag indicates that the intra-prediction mode is a decoder-side intra mode derivation (DIMD) mode based on a determination that the first flag indicates that the intra-prediction mode corresponding to the current block is the directional mode;

based on determining that the second flag does not indicate that the intra-prediction mode is the DIMD mode:

determining an index of the intra prediction mode in the AIPM list; and

performing intra-prediction of the current block using an intra-prediction mode corresponding to the determined index in the AIPM list; and

based on determining that the second flag indicates that the intra-prediction mode is the DIMD mode:

performing DIMD to determine the intra-prediction mode; and

performing intra prediction of the current block using the determined intra prediction mode.

3. The method of claim 1, wherein the method further comprises: first inserting, into a first level of the AIPM list, an intra prediction mode determined by performing decoder-side intra mode derivation (DIMD).

4. The method of claim 2, wherein the method further comprises: performing intra prediction of a chroma component of the current block using an intra prediction mode determined by performing the DIMD.

5. The method of claim 1, wherein the method further comprises: based on determining that the first flag does not indicate that an intra-prediction mode corresponding to the current block is the directional mode:

determining the intra-prediction mode to be one of a plurality of non-directional modes; and

performing intra prediction of the current block using the determined intra prediction mode.

6. The method of claim 1, wherein the method further comprises: performing intra-prediction of a luma component of the current block using at least one intra-prediction mode in the AIPM list.

7. The method of claim 1, wherein the AIPM list includes only directional modes, and

the number of directional modes included in the AIPM list is equal to a sum of powers of 2 or a multiple of a power of 2.

8. An apparatus for performing intra prediction of a current block of an image of a video sequence, the apparatus comprising:

at least one memory configured to store computer program code; and

at least one processor configured to access the at least one memory and operate in accordance with the computer program code, the computer program code comprising:

first determining code configured to cause the at least one processor to determine whether a first flag indicates that an intra-prediction mode corresponding to the current block is a directional mode;

second determining code configured to cause the at least one processor to determine, based on determining that the first flag indicates that the intra-prediction mode corresponding to the current block is the directional mode, an index of the intra-prediction mode in an allowed intra-prediction mode (AIPM) list; and

first execution code configured to cause the at least one processor to, based on determining that the first flag indicates that the intra-prediction mode corresponding to the current block is the directional mode, perform intra-prediction for the current block using the intra-prediction mode in the AIPM list that corresponds to the determined index.

9. The apparatus of claim 8, wherein the apparatus further comprises:

third determining code configured to cause the at least one processor to determine whether a second flag indicates that the intra-prediction mode is a decoder-side intra mode derivation (DIMD) mode based on determining that the first flag indicates that the intra-prediction mode corresponding to the current block is the directional mode,

wherein the second determining code is further configured to cause the at least one processor to determine an index of the intra-prediction mode in the AIPM list based on determining that the second flag does not indicate that the intra-prediction mode is the DIMD mode, and

the first executing code is further configured to cause the at least one processor, based on determining that the second flag does not indicate that the intra-prediction mode is the DIMD mode, to perform intra-prediction of the current block using an intra-prediction mode in the AIPM list that corresponds to the determined index; and

second execution code configured to cause the at least one processor, based on determining that the second flag indicates that the intra-prediction mode is the DIMD mode, to execute DIMD to determine the intra-prediction mode,

wherein the first executing code is further configured to cause the at least one processor to perform intra-prediction of the current block using the determined intra-prediction mode based on determining that the second flag indicates that the intra-prediction mode is the DIMD mode.

10. The apparatus according to claim 8, wherein the apparatus further comprises insertion code configured to cause the at least one processor to first insert an intra-prediction mode determined by performing decoder-side intra mode derivation (DIMD) into a first level of the AIPM list.

11. The apparatus of claim 9, wherein the apparatus further comprises third execution code configured to cause the at least one processor to perform intra prediction of a chroma component of the current block using an intra prediction mode determined by performing the DIMD.

12. The apparatus of claim 8, wherein the apparatus further comprises fourth determining code configured to cause the at least one processor to determine that the intra-prediction mode is one of a plurality of non-directional modes based on determining that the first flag does not indicate that the intra-prediction mode corresponding to the current block is the directional mode,

wherein the first executing code is further configured to cause the at least one processor to perform intra-prediction of the current block using the determined intra-prediction mode based on determining that the first flag does not indicate that the intra-prediction mode corresponding to the current block is the directional mode.

13. The apparatus of claim 8, wherein the apparatus further comprises third executing code configured to cause the at least one processor to perform intra prediction of a luma component of the current block using at least one intra prediction mode in the AIPM list.

14. The apparatus of claim 8, wherein the AIPM list includes only directional modes, and

the number of directional modes included in the AIPM list is equal to a sum of powers of 2 or a multiple of a power of 2.

15. A non-transitory computer-readable storage medium storing instructions that cause at least one processor to:

determine whether a first flag indicates that an intra prediction mode corresponding to a current block of an image of a video sequence is a directional mode; and

based on determining that the first flag indicates that the intra prediction mode corresponding to the current block is the directional mode:

determine an index of the intra prediction mode in an allowed intra prediction mode (AIPM) list; and

perform intra prediction of the current block using the intra prediction mode corresponding to the determined index in the AIPM list.

16. The non-transitory computer-readable storage medium of claim 15, wherein the instructions further cause the at least one processor to:

determine whether a second flag indicates that the intra prediction mode is a decoder-side intra mode derivation (DIMD) mode based on determining that the first flag indicates that the intra prediction mode corresponding to the current block is the directional mode;

based on determining that the second flag does not indicate that the intra prediction mode is the DIMD mode:

determine an index of the intra prediction mode in the AIPM list; and

perform intra prediction of the current block using the intra prediction mode corresponding to the determined index in the AIPM list; and

based on determining that the second flag indicates that the intra prediction mode is the DIMD mode:

perform DIMD to determine the intra prediction mode; and

perform intra prediction of the current block using the determined intra prediction mode.

17. The non-transitory computer-readable storage medium of claim 15, wherein the instructions further cause the at least one processor to first insert an intra-prediction mode determined by performing decoder-side intra-mode derivation (DIMD) into a first level of the AIPM list.

18. The non-transitory computer-readable storage medium of claim 16, wherein the instructions further cause the at least one processor to perform intra prediction of chroma components of the current block using an intra prediction mode determined by performing the DIMD.

19. The non-transitory computer-readable storage medium of claim 15, wherein the instructions further cause the at least one processor to, based on determining that the first flag does not indicate that the intra prediction mode corresponding to the current block is the directional mode:

determine the intra prediction mode to be one of a plurality of non-directional modes; and

perform intra prediction of the current block using the determined intra prediction mode.

20. The non-transitory computer-readable storage medium of claim 15, wherein the instructions further cause the at least one processor to perform intra prediction of a luma component of the current block using at least one intra prediction mode in the AIPM list.

Technical Field

Methods and apparatus consistent with embodiments relate to video coding, and more particularly, to methods and apparatus for interaction between decoder-side intra mode derivation and adaptive intra prediction modes.

Background

The video coding format VP9 supports 8 directional modes corresponding to angles from 45 to 207 degrees. To exploit a greater variety of spatial redundancy in directional textures, the video coding format AOMedia Video 1 (AV1) extends the directional intra modes to an angle set with finer granularity. The original 8 angles are slightly changed and become nominal angles, named V_PRED, H_PRED, D45_PRED, D135_PRED, D113_PRED, D157_PRED, D203_PRED, and D67_PRED, as shown in fig. 1. Each nominal angle is extended to 7 finer angles, so AV1 has 56 directional angles in total. The prediction angle is represented by a nominal intra angle plus an angle delta, where the delta is -3 to 3 times a step size of 3 degrees. To implement the directional prediction modes in AV1 in a general way, all 56 directional intra prediction modes in AV1 are implemented with a unified directional predictor that projects each pixel to a reference sub-pixel location and interpolates the reference pixels with a 2-tap bilinear filter.

Disclosure of Invention

According to an embodiment, a method for performing intra prediction of a current block of a picture of a video sequence, performed by at least one processor and comprising: determining whether the first flag indicates that an intra prediction mode corresponding to the current block is a directional mode; and based on determining that the first flag indicates that the intra prediction mode corresponding to the current block is a directional mode, determining an index of the intra prediction mode in an Allowed Intra Prediction Mode (AIPM) list, and performing intra prediction of the current block using the intra prediction mode corresponding to the determined index in the AIPM list.

According to an embodiment, an apparatus for performing intra prediction of a current block of a picture of a video sequence comprises: at least one memory configured to store computer program code; and at least one processor configured to access the at least one memory and operate in accordance with the computer program code. The computer program code includes: determining, by the at least one processor, whether a first flag indicates that an intra-prediction mode corresponding to the current block is a directional mode; determining, by the at least one processor, an index of an intra-prediction mode in an allowed intra-prediction mode (AIPM) list based on determining that the first flag indicates that the intra-prediction mode corresponding to the current block is a directional mode; and first executing code configured to cause the at least one processor to perform intra-prediction of the current block using the intra-prediction mode corresponding to the determined index in the AIPM list based on determining that the first flag indicates that the intra-prediction mode corresponding to the current block is the directional mode.

According to an embodiment, a non-transitory computer-readable storage medium stores instructions that cause at least one processor to: determining whether the first flag indicates that an intra prediction mode corresponding to the current block is a directional mode; and determining an index of an intra prediction mode in an intra prediction mode Allowed (AIPM) list based on the determination that the first flag indicates that the intra prediction mode corresponding to the current block is the directional mode, and performing intra prediction of the current block using the intra prediction mode corresponding to the determined index in the AIPM list.

Drawings

Fig. 1 is a diagram of the 8 nominal angles in AV1.

Fig. 2 is a simplified block diagram of a communication system according to an embodiment.

Fig. 3 is a diagram of placement of a video encoder and a video decoder in a streaming environment, according to an embodiment.

Fig. 4 is a functional block diagram of a video decoder according to an embodiment.

Fig. 5 is a functional block diagram of a video encoder according to an embodiment.

Fig. 6A is a diagram of the upper, left, and upper left positions of the PAETH mode.

Fig. 6B is a diagram of a recursive intra filtering mode.

Fig. 6C is a diagram showing selection of a template from a reconstruction region having T rows of pixels.

Fig. 6D is a diagram showing prediction fusion obtained by weighted averaging of two histogram of gradients (HoG) modes and a planar mode.

FIG. 7 is a flowchart illustrating a method for performing intra prediction of a current block of an image of a video sequence, according to an embodiment.

Fig. 8 is a simplified block diagram of an apparatus for performing intra prediction of a current block of an image of a video sequence, according to an embodiment.

FIG. 9 is a diagram of a computer system suitable for implementing embodiments.

Detailed Description

Fig. 2 is a simplified block diagram of a communication system (200) according to an embodiment. The communication system (200) may include at least two terminals (210) and (220) interconnected via a network (250). For unidirectional data transmission, a first terminal (210) may encode video data at a local location for transmission over the network (250) to the other terminal (220). The second terminal (220) may receive the encoded video data of the other terminal from the network (250), decode the encoded data, and display the recovered video data. Unidirectional data transmission may be common in media serving applications and the like.

Fig. 2 shows a second pair of terminals (230, 240) provided to support bidirectional transmission of encoded video, which may occur, for example, during videoconferencing. For bidirectional data transmission, each terminal (230, 240) may encode video data captured at a local location for transmission over the network (250) to the other terminal. Each terminal (230, 240) may also receive the encoded video data transmitted by the other terminal, may decode the encoded data, and may display the recovered video data on a local display device.

In fig. 2, the terminals (210-240) may be illustrated as servers, personal computers, and smartphones, but the principles of the embodiments are not limited thereto. Embodiments are applicable to laptop computers, tablet computers, media players, and/or dedicated videoconferencing equipment. The network (250) represents any number of networks that convey encoded video data among the terminals (210-240), including, for example, wired and/or wireless communication networks. The communication network (250) may exchange data in circuit-switched and/or packet-switched channels. Representative networks include telecommunications networks, local area networks, wide area networks, and/or the Internet. For the purposes of the present discussion, the architecture and topology of the network (250) may be immaterial to the operation of the embodiments unless explained below.

Fig. 3 shows a diagram of the placement of a video encoder and a video decoder in a streaming environment, according to an embodiment. The disclosed subject matter is equally applicable to other video-enabled applications including, for example, video conferencing, digital TV, storing compressed video on digital media including CDs, DVDs, memory sticks, and the like.

The streaming system may include an acquisition subsystem (313), which may include a video source (301), such as a digital camera, that creates, for example, a stream of uncompressed video samples (302). The sample stream (302), depicted as a bold line to emphasize its high data volume compared to encoded video bitstreams, may be processed by an encoder (303) coupled to the camera (301). The encoder (303) may include hardware, software, or a combination thereof to enable or implement aspects of the disclosed subject matter as described in more detail below. The encoded video bitstream (304), depicted as a thin line to emphasize its lower data volume compared to the sample stream, may be stored on a streaming server (305) for future use. One or more streaming clients (306, 308) may access the streaming server (305) to retrieve copies (307, 309) of the encoded video bitstream (304). A client (306) may include a video decoder (310) that decodes the incoming copy of the encoded video bitstream (307) and creates an outgoing stream of video samples (311) that may be rendered on a display (312) or another rendering device (not depicted). In some streaming systems, the video bitstreams (304, 307, 309) may be encoded according to certain video coding/compression standards. Examples of such standards include ITU-T Recommendation H.265. The video coding standard under development is informally known as Versatile Video Coding (VVC). The disclosed subject matter may be used in the context of VVC.

Fig. 4 is a functional block diagram of a video decoder (310) according to an embodiment.

The receiver (410) may receive one or more encoded video sequences to be decoded by the decoder (310); in the same or another embodiment, one encoded video sequence at a time, where the decoding of each encoded video sequence is independent of the decoding of the other encoded video sequences. The encoded video sequence may be received from a channel (412), which may be a hardware/software link to a storage device that stores the encoded video data. The receiver (410) may receive the encoded video data together with other data, for example encoded audio data and/or ancillary data streams, which may be forwarded to their respective using entities (not depicted). The receiver (410) may separate the encoded video sequence from the other data. To combat network jitter, a buffer memory (415) may be coupled between the receiver (410) and an entropy decoder/parser (420) ("parser" hereafter). When the receiver (410) receives data from a store/forward device of sufficient bandwidth and controllability, or from an isochronous network, the buffer (415) may not be needed or may be small. For use on best-effort packet networks such as the Internet, the buffer (415) may be required, may be comparatively large, and may advantageously be of adaptive size.

The video decoder (310) may include a parser (420) to reconstruct symbols (421) from the entropy-coded video sequence. Categories of those symbols include information used to manage the operation of the decoder (310), and potentially information to control a rendering device such as a display (312) that is not an integral part of the decoder but may be coupled to it, as shown in fig. 4. The control information for the rendering device(s) may be in the form of Supplemental Enhancement Information (SEI) messages or Video Usability Information (VUI) parameter set fragments (not depicted). The parser (420) may parse/entropy-decode the received encoded video sequence. The coding of the encoded video sequence may be in accordance with a video coding technology or standard, and may follow principles well known to those skilled in the art, including variable-length coding, Huffman coding, arithmetic coding with or without context sensitivity, and so forth. The parser (420) may extract from the encoded video sequence a set of subgroup parameters for at least one of the subgroups of pixels in the video decoder, based on at least one parameter corresponding to the group. Subgroups may include Groups of Pictures (GOPs), pictures, tiles, slices, macroblocks, Coding Units (CUs), blocks, Transform Units (TUs), Prediction Units (PUs), and so forth. The entropy decoder/parser may also extract from the encoded video sequence information such as transform coefficients, quantizer parameter (QP) values, motion vectors, and so forth.

The parser (420) may perform entropy decoding/parsing operations on the video sequence received from the buffer (415), thereby creating a symbol (421). A parser (420) may receive the encoded data and selectively decode particular symbols (421). In addition, the parser (420) may determine whether to provide the specific symbol (421) to the motion compensation prediction unit (453), the scaler/inverse transform unit (451), the intra prediction unit (452), or the loop filter unit (454).

The reconstruction of the symbol (421) may involve a number of different units depending on the type of the encoded video image or a portion of the encoded video image (e.g., inter and intra images, inter and intra blocks), among other factors. Which units are involved and the way in which they are involved can be controlled by subgroup control information parsed from the coded video sequence by a parser (420). For clarity, such a subgroup control information flow between parser (420) and a plurality of units below is not depicted.

In addition to the functional blocks already mentioned, the decoder (310) may be conceptually subdivided into several functional units as described below. In a practical implementation operating under business constraints, many of these units interact closely with each other and may be at least partially integrated with each other. However, for the purposes of describing the disclosed subject matter, a conceptual subdivision into the following functional units is appropriate.

The first unit is the scaler/inverse transform unit (451). The scaler/inverse transform unit (451) receives quantized transform coefficients as symbols (421) from the parser (420), together with control information, including which transform to use, block size, quantization factor, quantization scaling matrices, and so forth. The scaler/inverse transform unit (451) may output blocks comprising sample values that may be input into the aggregator (455).

In some cases, the output samples of the scaler/inverse transform unit (451) may pertain to an intra-coded block; that is, a block that does not use predictive information from previously reconstructed pictures, but may use predictive information from previously reconstructed parts of the current picture. Such predictive information may be provided by the intra picture prediction unit (452). In some cases, the intra picture prediction unit (452) generates a block of the same size and shape as the block under reconstruction, using surrounding already-reconstructed information fetched from the current (partly reconstructed) picture (456). The aggregator (455), in some cases, adds, on a per-sample basis, the prediction information that the intra prediction unit (452) has generated to the output sample information provided by the scaler/inverse transform unit (451).

In other cases, the output samples of the scaler/inverse transform unit (451) may pertain to an inter-coded and potentially motion-compensated block. In such a case, the motion compensation prediction unit (453) may access the reference picture memory (457) to fetch samples used for prediction. After motion compensating the fetched samples in accordance with the symbols (421) pertaining to the block, these samples may be added by the aggregator (455) to the output of the scaler/inverse transform unit (in this case called residual samples or residual signal) so as to generate output sample information. The addresses within the reference picture memory from which the motion compensation unit fetches prediction samples may be controlled by motion vectors, available to the motion compensation unit in the form of symbols (421) that may have, for example, X, Y, and reference picture components. Motion compensation may also include interpolation of sample values as fetched from the reference picture memory when sub-sample-exact motion vectors are in use, motion vector prediction mechanisms, and so forth.

The output samples of the aggregator (455) may be subject to various loop filtering techniques in the loop filter unit (454). Video compression technologies may include in-loop filter technologies that are controlled by parameters included in the encoded video bitstream and made available to the loop filter unit (454) as symbols (421) from the parser (420). However, the loop filtering may also be responsive to meta-information obtained during the decoding of previous (in decoding order) parts of the encoded picture or encoded video sequence, as well as to previously reconstructed and loop-filtered sample values.

The output of the loop filter unit (454) may be a sample stream that may be output to a rendering device (312) and stored in a reference picture store (456) for subsequent inter picture prediction.

Once fully reconstructed, some of the coded pictures may be used as reference pictures for future prediction. Once the encoded picture is fully reconstructed and the encoded picture is identified as a reference picture (by, for example, the parser (420)), the current reference picture (456) may become part of the reference picture buffer (457) and new current picture memory may be reallocated before reconstruction of a subsequent encoded picture begins.

The video decoder (310) may perform decoding operations according to a predetermined video compression technology, such as that documented in ITU-T Recommendation H.265. The encoded video sequence may conform to the syntax specified by the video compression technology or standard in use, in the sense that it adheres to the syntax of that technology or standard as specified in the video compression technology document or standard, specifically in its profiles document. Also necessary for compliance may be that the complexity of the encoded video sequence is within bounds defined by the level of the video compression technology or standard. In some cases, levels restrict the maximum picture size, maximum frame rate, maximum reconstruction sample rate (measured in, for example, megasamples per second), maximum reference picture size, and so forth. Limits set by levels may, in some cases, be further restricted through Hypothetical Reference Decoder (HRD) specifications and metadata for HRD buffer management signaled in the encoded video sequence.

In an embodiment, the receiver (410) may receive additional (redundant) data along with the reception of the encoded video. The additional data may be included as part of the encoded video sequence. The additional data may be used by the video decoder (310) to properly decode the data and/or more accurately reconstruct the original video data. The additional data may be in the form of, for example, a temporal, spatial, or signal-to-noise ratio (SNR) enhancement layer, a redundant slice, a redundant picture, a forward error correction code, and so forth.

Fig. 5 is a functional block diagram of a video encoder (303) according to an embodiment.

The encoder (303) may receive video samples from a video source (301) (not part of the encoder) that may capture video images to be encoded by the encoder (303).

The video source (301) may provide the source video sequence to be coded by the encoder (303) in the form of a digital video sample stream that can be of any suitable bit depth (for example 8 bit, 10 bit, 12 bit, ...), any color space (for example BT.601 Y CrCb, RGB, ...), and any suitable sampling structure (for example Y CrCb 4:2:0, Y CrCb 4:4:4). In a media serving system, the video source (301) may be a storage device storing previously prepared video. In a videoconferencing system, the video source (301) may be a camera that captures local image information as a video sequence. Video data may be provided as a plurality of individual pictures that impart motion when viewed in sequence. The pictures themselves may be organized as a spatial array of pixels, where each pixel can comprise one or more samples depending on the sampling structure, color space, and so forth in use. A person skilled in the art can readily understand the relationship between pixels and samples. The description below focuses on samples.

According to an embodiment, the encoder (303) may code and compress the pictures of the source video sequence into an encoded video sequence (543) in real time or under any other time constraints as required by the application. Enforcing the appropriate coding speed is one function of the controller (550). The controller controls, and is functionally coupled to, other functional units as described below. The coupling is not depicted for clarity. Parameters set by the controller may include rate-control-related parameters (picture skip, quantizer, lambda value of rate-distortion optimization techniques, ...), picture size, group-of-pictures (GOP) layout, maximum motion vector search range, and so forth. A person skilled in the art can readily identify other functions of the controller (550) as they may pertain to a video encoder (303) optimized for a certain system design.

Some video encoders operate in what a person skilled in the art readily recognizes as a "coding loop." As an oversimplified description, a coding loop may consist of an encoder (530) ("source encoder" hereafter), responsible for creating symbols based on the input picture to be coded and the reference picture(s), and a (local) decoder (533) embedded in the encoder (303) that reconstructs the symbols to create the sample data a (remote) decoder would also create (as any compression between symbols and the encoded video bitstream is lossless in the video compression technologies considered in the disclosed subject matter). That reconstructed sample stream is input to the reference picture memory (534). As the decoding of a symbol stream leads to bit-exact results independent of decoder location (local or remote), the reference picture buffer content is also bit-exact between the local encoder and the remote encoder. In other words, the prediction part of the encoder "sees" as reference picture samples exactly the same sample values as a decoder would "see" when using prediction during decoding. This principle of reference picture synchronicity (and the resulting drift, if synchronicity cannot be maintained, for example because of channel errors) is well known to a person skilled in the art.

The operation of the "local" decoder (533) may be the same as the operation of the "remote" decoder (310) that has been described in detail above in connection with fig. 4. However, referring briefly to fig. 4 additionally, when symbols are available and the entropy encoder (545) and parser (420) can losslessly encode/decode the symbols into an encoded video sequence, the entropy decoding portion of the decoder (310), including the channel (412), receiver (410), buffer (415), and parser (420), may not be fully implemented in the local decoder (533).

At this point it is observed that any decoder technique other than the parsing/entropy decoding present in the decoder must also be present in the corresponding encoder in substantially the same functional form. The description of the encoder techniques may be simplified because the encoder techniques are reciprocal to the fully described decoder techniques. A more detailed description is needed only in certain areas and is provided below.

As part of its operation, the source encoder (530) may perform motion-compensated predictive coding, which codes an input frame predictively with reference to one or more previously coded frames from the video sequence that were designated as "reference frames." In this manner, the coding engine (532) codes the differences between blocks of pixels of the input frame and blocks of pixels of the reference frame(s) that may be selected as prediction reference(s) to the input frame.

The local video decoder (533) may decode coded video data of frames that may be designated as reference frames, based on the symbols created by the source encoder (530). Operations of the coding engine (532) may advantageously be lossy processes. When the coded video data is decoded at a video decoder (not shown in fig. 4), the reconstructed video sequence typically may be a replica of the source video sequence with some errors. The local video decoder (533) replicates the decoding processes that may be performed by the video decoder on reference frames and may cause reconstructed reference frames to be stored in the reference picture cache (534). In this manner, the encoder (303) may locally store copies of reconstructed reference frames that have common content (absent transmission errors) with the reconstructed reference frames that will be obtained by a far-end video decoder.

The predictor (535) may perform a prediction search against the coding engine (532). That is, for a new frame to be encoded, the predictor (535) may search the reference picture memory (534) for sample data (as candidate reference pixel blocks) or some metadata, such as reference picture motion vectors, block shapes, etc., that may be referenced as appropriate predictions for the new image. The predictor (535) may operate on a block-by-block basis of samples to find a suitable prediction reference. In some cases, the input image may have prediction references derived from multiple reference images stored in a reference image memory (534), as determined by search results obtained by a predictor (535).

The controller (550) may manage encoding operations of the video encoder (530), including, for example, setting parameters and subgroup parameters for encoding video data.

The outputs of all of the above functional units may be entropy encoded in an entropy encoder (545). The entropy encoder losslessly compresses the symbols generated by the various functional units according to techniques known to those skilled in the art, such as huffman coding, variable length coding, arithmetic coding, etc., to transform the symbols into an encoded video sequence.

The transmitter (540) may buffer the encoded video sequence created by the entropy encoder (545) in preparation for transmission over a communication channel (560), which may be a hardware/software link to a storage device that may store encoded video data. The transmitter (540) may combine the encoded video data from the video encoder (530) with other data to be transmitted, such as encoded audio data and/or an auxiliary data stream (sources not shown).

The controller (550) may manage operation of the encoder (303). During coding, the controller (550) may assign to each coded picture a certain coded picture type, which may affect the coding techniques that may be applied to the respective picture. For example, pictures may often be assigned one of the following frame types:

An intra picture (I picture), which may be one that can be coded and decoded without using any other frame in the sequence as a source of prediction. Some video codecs allow for different types of intra pictures, including, for example, Independent Decoder Refresh ("IDR") pictures. A person skilled in the art is aware of those variants of I pictures and their respective applications and features.

A predictive picture (P picture), which may be a picture that can be encoded and decoded using intra prediction or inter prediction that predicts sample values of each block using at most one motion vector and a reference index.

A bi-directionally predictive picture (B picture), which may be one that can be coded and decoded using intra prediction or inter prediction using at most two motion vectors and reference indices to predict the sample values of each block. Similarly, multiple-predictive pictures may use more than two reference pictures and associated metadata for the reconstruction of a single block.

Source pictures commonly may be subdivided spatially into blocks of samples (for example, blocks of 4 × 4, 8 × 8, 4 × 8, or 16 × 16 samples each) and coded on a block-by-block basis. Blocks may be coded predictively with reference to other (already coded) blocks as determined by the coding assignment applied to the blocks' respective pictures. For example, blocks of I pictures may be coded non-predictively, or they may be coded predictively with reference to already-coded blocks of the same picture (spatial prediction or intra prediction). Pixel blocks of P pictures may be coded non-predictively, via spatial prediction, or via temporal prediction with reference to one previously coded reference picture. Blocks of B pictures may be coded non-predictively, via spatial prediction, or via temporal prediction with reference to one or two previously coded reference pictures.

The video encoder (303) may perform coding operations according to a predetermined video coding technology or standard, such as ITU-T Recommendation H.265 or Versatile Video Coding (VVC, H.266). In its operation, the video encoder (303) may perform various compression operations, including predictive coding operations that exploit temporal and spatial redundancies in the input video sequence. The coded video data, therefore, may conform to a syntax specified by the video coding technology or standard being used.

In an embodiment, the transmitter (540) may transmit additional data with the encoded video. The video encoder (530) may include such data as part of the encoded video sequence. The additional data may comprise temporal, spatial, and/or SNR enhancement layers, other forms of redundant data such as redundant pictures and slices, Supplemental Enhancement Information (SEI) messages, Video Usability Information (VUI) parameter set fragments, and so forth.

In AV1, there are 5 non-directional smooth intra prediction modes: DC, PAETH, SMOOTH, SMOOTH_V, and SMOOTH_H. For DC prediction, the average of the left and above neighboring samples is used as the predictor of the block to be predicted. For the PAETH predictor, the top, left, and top-left reference samples are first fetched, and then the value that is closest to (top + left - top-left) is set as the predictor for the pixel to be predicted. Fig. 6A shows the positions of the top, left, and top-left samples for a pixel in the current block. For the SMOOTH, SMOOTH_V, and SMOOTH_H modes, the block is predicted using quadratic interpolation along the vertical direction, the horizontal direction, or the average of both directions.

To capture decaying spatial correlation with references on the edges, filter intra modes are designed for luma blocks. Five filter intra modes are defined for AV1, each represented by a set of eight 7-tap filters reflecting the correlation between pixels in a 4 × 2 patch and the 7 neighbors adjacent to it. In other words, the weighting factors of a 7-tap filter are position-dependent. Taking an 8 × 8 block as an example, it is split into eight 4 × 2 patches, as shown in fig. 6B. These patches are denoted B0, B1, B2, B3, B4, B5, B6, and B7 in fig. 6B. For each patch, its 7 neighbors, denoted R0-R6, are used to predict the pixels in the current patch. For patch B0, all of its neighbors have already been reconstructed. For the other patches, however, not all of the neighbors are reconstructed, and the predicted values of those neighbors are used as references instead. For example, not all of the neighbors of patch B7 are reconstructed, so the prediction samples of its neighbors (i.e., in B5 and B6) are used instead.

Chroma from Luma (CfL) is a chroma-only intra predictor that models chroma pixels as a linear function of coincident reconstructed luma pixels. The CfL prediction is expressed as follows:

CfL(α) = α × L_AC + DC (1),

where L_AC denotes the AC contribution of the luma component, α denotes a parameter of the linear model, and DC denotes the DC contribution of the chroma component. Specifically, the reconstructed luma pixels are subsampled to the chroma resolution, and then the average value is subtracted to form the AC contribution. To approximate the chroma AC component from the AC contribution, instead of requiring the decoder to calculate scaling parameters as in some related approaches, AV1 CfL determines the parameter α based on the original chroma pixels and signals it in the bitstream. This reduces decoder complexity and yields more precise predictions. As for the DC contribution of the chroma component, it is computed using the intra DC mode, which is sufficient for most chroma content and has mature, fast implementations.

Proposals have been made to improve intra mode coding for the Versatile Video Coding (VVC) standard. For example, two sets of intra prediction modes may be defined for each block, named the allowed intra prediction modes (AIPM, also called adaptive intra prediction modes) set and the disallowed intra prediction modes (DIPM) set. The AIPM set is defined as the set whose modes can be used for intra prediction of the current block, and the DIPM set is defined as the set whose modes can be neither signaled nor used for intra prediction of the current block. For each block, the modes in the two sets are derived from the intra prediction modes of neighboring blocks: neighboring modes are included in the AIPM set and excluded from the DIPM set. The numbers of modes in the AIPM set and the DIPM set are predefined and fixed for all blocks. When the size of the AIPM set is S and the number of intra prediction modes derived from neighboring modes is smaller than S, the AIPM set is filled up using default modes.

When AIPM is applied to AV1, all nominal angles are always included in AIPM regardless of the block size of the current block and the prediction modes of the neighboring blocks.

In a decoder-side intra mode derivation (DIMD) process, the intra prediction mode is derived based on previously encoded/decoded pixels, in the same way at both the encoder side and the decoder side. Signaling of an intra prediction mode index is thereby avoided. This process defines a new coding mode, called DIMD. A flag is signaled in the bitstream to indicate whether the DIMD mode is selected. Decoder-side intra mode derivation may also be referred to as derived intra mode, which has been implemented under the experimental flag CONFIG_DERIVED_INTRA_MODE in a proposal.

Two main steps are employed in the DIMD process, as described in detail below.

To implicitly derive the intra prediction mode (IPM) of a DIMD block, a texture gradient analysis is performed at both the encoder side and the decoder side. The process starts with an empty Histogram of Gradients (HoG) with 65 entries, corresponding to the number of angular modes. The amplitudes of those entries are determined during the texture gradient analysis.

In the first step, DIMD picks a template of T = 3 columns and T = 3 rows from the left side and the above side of the current block, respectively, as shown in part (a) of fig. 6C. This region is used as the reference for the gradient-based IPM derivation.

In the second step, horizontal and vertical Sobel filters are applied to all 3 × 3 window positions centered on the pixels of the middle line of the template, as shown in part (b) of fig. 6C. At each window position, the Sobel filters compute the intensities of the pure horizontal and vertical directions as G_hor and G_ver, respectively.

The texture angle of the window is then calculated as:

angle = arctan(G_hor / G_ver) (2),

which may be converted into one of the 65 angular IPMs. Once the IPM index of the current window is derived as idx, the amplitude of its entry HoG[idx] is updated by adding:

ampl = |G_hor| + |G_ver| (3).

Part (c) of fig. 6C shows an example of the HoG computed after applying the above operations to all pixel positions in the template.

If only a single IPM, corresponding to the highest peak of the HoG, is used, the following fusion procedure is not required.

Otherwise, if more than one IPM is derived from the DIMD process, a prediction fusion process may be used.

The prediction fusion is computed as a weighted average of multiple predictors. Fig. 6D shows an example of the fusion algorithm. As shown, the two IPMs corresponding to the two highest peaks of the HoG are detected as M1 and M2. The third IPM is fixed to the planar mode. After pixel prediction is applied and Pred1, Pred2, and Pred3 are obtained with these three IPMs, their fusion is computed as a weighted average of the three predictors. In one example, the weight of the planar mode is fixed at 21/64 (≈1/3). The remaining weight of 43/64 (≈2/3) is then shared between the two HoG IPMs, proportionally to the amplitudes of their HoG bars.

In detail, the first weight ω1, the second weight ω2, and the third weight ω3 may be expressed as follows:

ω1 = (43/64) × ampl(M1) / (ampl(M1) + ampl(M2)) (4),

ω2 = (43/64) × ampl(M2) / (ampl(M1) + ampl(M2)) (5), and

ω3 = 21/64 (6).

Thus, the predictor block may be represented as follows:

Pred = ω1 × Pred1 + ω2 × Pred2 + ω3 × Pred3 (7).

DIMD derives one or more angular IPMs using the neighboring samples of the current block and assigns shorter codewords to these derived IPMs. The AIPM scheme derives the list of selected IPMs using the IPMs of neighboring blocks and assigns shorter codewords to those neighboring modes. Both methods use neighboring information to optimize the signaling of the IPM of the current block. However, there is no existing solution for how to combine these two methods.

Embodiments of methods and apparatus for interaction between decoder-side intra mode derivation and adaptive intra prediction modes are described herein.

In this detailed description, a mode is referred to as an angular mode or a directional mode if it is not a smooth mode, that is, if it generates prediction samples according to a given prediction direction. DIMD is used as a general term: any process that uses neighboring reconstructed samples to derive the intra prediction mode is referred to as a DIMD process.

In an embodiment, for each block, there are two sets of intra prediction modes, referred to as the AIPM set and the DIPM set, respectively. All non-directional modes are always included in the AIPM set regardless of the block size of the current block and the prediction modes of the neighboring blocks.

In an embodiment, all non-directional smooth intra-prediction modes in AV1 are always inserted first in the AIPM set, regardless of the intra-prediction modes of neighboring blocks.

In an embodiment, the DC mode, the PAETH mode, the SMOOTH mode, the SMOOTH_V mode, and the SMOOTH_H mode are always included first in the AIPM set, regardless of the intra prediction modes of the neighboring blocks.

In an embodiment, the modes included in the AIPM set may be divided into K levels, K being a positive integer, e.g., 2, 3, or 4. For the first level, the number of modes is equal to the number of non-directional modes. For the other levels, the number of modes is equal to a power of 2, e.g., 2^L, where L is a positive integer greater than 1. For example, the number of modes in the AIPM set may be S, and the AIPM set may have 3 levels, with S equal to K + 2^L + 2^M. A mode whose index in the AIPM set is less than K is referred to as a first-level mode, a mode whose index in the AIPM set is greater than or equal to K but less than K + 2^L is referred to as a second-level mode, and so on. In an embodiment, all non-directional IPMs are placed in the first level of the AIPM set.

In an embodiment, only directional IPMs are included in the AIPM list, and the number of modes in the AIPM list is set equal to a sum of powers of 2 or a multiple of a power of 2.

In an embodiment, for the signaling of the intra prediction mode, a flag is signaled to indicate whether the current block uses a directional mode. If so, a second flag is signaled to indicate the index of the current mode in the AIPM list. Otherwise, a second flag is signaled to indicate which non-directional mode the current mode is.

In an embodiment, for the signaling of the intra prediction mode, a flag is signaled to indicate whether the current block uses a directional mode. If the current block uses a directional mode, a second flag is signaled to indicate whether the current mode is the DIMD mode. If the current mode is not the DIMD mode, a third flag is signaled to indicate the index of the current mode in the AIPM list. Otherwise, if the current mode is the DIMD mode, the third flag is avoided, and the IPM of the current block is derived at the decoder side. Otherwise, if the current mode is not a directional IPM, the second flag is signaled to indicate which non-directional mode the current mode is.

In an embodiment, the IPM derived from the DIMD process is always inserted in the AIPM list. In an embodiment, the IPM derived from the DIMD process is always inserted first in the AIPM list and placed in the first level of the AIPM list.

In an embodiment, the AIPM scheme is applied only to the luminance component, and the DIMD scheme is applied only to the chrominance component.

FIG. 7 is a flow diagram illustrating a method (700) for performing intra prediction of a current block of an image of a video sequence, according to an embodiment. In some implementations, one or more of the processing blocks of fig. 7 may be performed by the decoder (310). In some implementations, one or more of the processing blocks of fig. 7 may be performed by another device or group of devices (e.g., encoder (303)) separate from or including decoder (310).

Referring to fig. 7, in a first block (710), a method (700) includes: it is determined whether the first flag indicates that the intra prediction mode corresponding to the current block is a directional mode.

Based on determining that the first flag indicates that the intra prediction mode corresponding to the current block is the directional mode (710 - YES), in a second block (720), the method (700) includes: determining whether a second flag indicates that the intra prediction mode is a decoder-side intra mode derivation (DIMD) mode.

Based on determining that the second flag does not indicate that the intra prediction mode corresponding to the current block is the DIMD mode (720 - NO), in a third block (730), the method (700) includes: determining an index of the intra prediction mode in an allowed intra prediction mode (AIPM) list, and in a fourth block (740), the method (700) includes: performing intra prediction of the current block using the intra prediction mode corresponding to the determined index in the AIPM list.

Based on determining that the second flag indicates that the intra prediction mode corresponding to the current block is the DIMD mode (720 - YES), in a fifth block (750), the method (700) includes: performing DIMD to determine the intra prediction mode, and continuing in the fourth block (740), in which the method (700) includes: performing intra prediction of the current block using the determined intra prediction mode.

Based on determining that the first flag does not indicate that the intra prediction mode corresponding to the current block is the directional mode (710 - NO), in a sixth block (760), the method (700) includes: determining the intra prediction mode to be one of a plurality of non-directional modes, and proceeding to the fourth block (740), in which the method (700) includes: performing intra prediction of the current block using the determined intra prediction mode.

The method (700) may further comprise: an intra prediction mode determined by performing the DIMD is first inserted into a first level of the AIPM list.

The method (700) may further comprise: intra prediction of the luma component of the current block is performed using at least one intra prediction mode in the AIPM list.

The method (700) may further comprise: the intra prediction of the chrominance component of the current block is performed using an intra prediction mode determined by performing the DIMD.

The AIPM list may include only directional modes, and the number of directional modes included in the AIPM list may be equal to a sum of powers of 2 or a multiple of a power of 2.

Although fig. 7 shows exemplary blocks of the method (700), in some implementations, the method (700) may include additional blocks, fewer blocks, different blocks, or blocks arranged differently than those shown in fig. 7. Additionally or alternatively, two or more blocks of the method (700) may be performed in parallel.

Fig. 8 is a simplified block diagram of an apparatus (800) for performing intra prediction of a current block of an image of a video sequence according to an embodiment.

Referring to fig. 8, the apparatus (800) includes a first determination code (805), a second determination code (810), a third determination code (815), a first execution code (820), a second execution code (825), and a fourth determination code (830).

The first determining code (805) is configured to cause the at least one processor to determine whether a first flag indicates that an intra-prediction mode corresponding to the current block is a directional mode.

The third determining code (815) is configured to cause the at least one processor to determine whether a second flag indicates that the intra-prediction mode corresponding to the current block is a decoder-side intra mode derivation (DIMD) mode, based on a determination that the first flag indicates that the intra-prediction mode corresponding to the current block is a directional mode.

The second determining code (810) is configured to cause the at least one processor to determine an index of an intra-prediction mode in an allowed intra-prediction mode (AIPM) list based on determining that the second flag does not indicate that the intra-prediction mode corresponding to the current block is a DIMD mode.

The first executing code (820) is configured to cause the at least one processor to perform intra-prediction of the current block using an intra-prediction mode in the AIPM list corresponding to the determined index based on determining that the second flag does not indicate that the intra-prediction mode corresponding to the current block is DIMD mode.

The second executing code (825) is configured to cause the at least one processor to execute the DIMD to determine the intra-prediction mode based on determining that the second flag indicates that the intra-prediction mode corresponding to the current block is DIMD mode.

The first executing code (820) is further configured to cause the at least one processor to perform intra-prediction of the current block using the determined intra-prediction mode based on determining that the second flag indicates that the intra-prediction mode corresponding to the current block is a DIMD mode.

The fourth determining code (830) is configured to cause the at least one processor to determine the intra-prediction mode to be one of a plurality of non-directional modes based on determining that the first flag does not indicate that the intra-prediction mode corresponding to the current block is a directional mode.

The first executing code (820) is further configured to cause the at least one processor to perform intra-prediction of the current block using the determined intra-prediction mode based on determining that the first flag does not indicate that the intra-prediction mode corresponding to the current block is a directional mode.

The apparatus (800) may further include inserting code configured to cause the at least one processor to first insert the intra-prediction mode determined by performing the DIMD into the first level of the AIPM list.

The apparatus (800) may further include third executing code configured to cause the at least one processor to perform intra prediction of the luma component of the current block using at least one intra-prediction mode in the AIPM list.

The third executing code may be further configured to cause the at least one processor to perform intra prediction of the chroma component of the current block using an intra prediction mode determined by performing the DIMD.

The AIPM list may include only directional modes, and the number of directional modes included in the AIPM list may be equal to a power of 2 or a sum of multiples of a power of 2.

FIG. 9 is a diagram of a computer system (900) suitable for implementing embodiments.

The computer software may be encoded using any suitable machine code or computer language that may be subject to assembly, compilation, linking, or similar mechanisms to create code comprising instructions that can be executed directly by one or more computer Central Processing Units (CPUs), Graphics Processing Units (GPUs), and the like, or through interpretation, microcode execution, and so forth.

The instructions may be executed on various types of computers or components thereof, including, for example, personal computers, tablet computers, servers, smart phones, gaming devices, internet of things devices, and so forth.

The components of computer system (900) shown in FIG. 9 are exemplary in nature and are not intended to suggest any limitation as to the scope of use or functionality of the computer software implementing the embodiments. Neither should the configuration of components be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in an embodiment of the computer system (900).

The computer system (900) may include some human interface input devices. Such human interface input devices may be responsive to input by one or more human users through, for example: tactile input (e.g., keystrokes, strokes, data glove movements), audio input (e.g., speech, clapping hands), visual input (e.g., gestures), olfactory input (not depicted). The human interface device may also be used to capture certain media that are not necessarily directly related to human conscious input, such as audio (e.g., voice, music, ambient sounds), images (e.g., scanned images, captured images from still image cameras), video (e.g., two-dimensional video, three-dimensional video including stereoscopic video).

The input human interface device may include one or more of the following (only one shown in each): keyboard (901), mouse (902), touch pad (903), touch screen (910), data glove, joystick (905), microphone (906), scanner (907), camera (908).

The computer system (900) may also include certain human interface output devices. Such human interface output devices may stimulate the senses of one or more human users through, for example, tactile output, sound, light, and smell/taste. Such human interface output devices may include tactile output devices (e.g., tactile feedback from the touch screen (910), data glove, or joystick (905), though there may also be tactile feedback devices that do not serve as input devices), audio output devices (e.g., speakers (909), headphones (not depicted)), visual output devices (e.g., screens (910), including Cathode Ray Tube (CRT) screens, Liquid Crystal Display (LCD) screens, plasma screens, and Organic Light-Emitting Diode (OLED) screens, each with or without touch-screen input capability and each with or without tactile feedback capability, some of which may be capable of outputting two-dimensional visual output or more-than-three-dimensional output through means such as stereographic output; virtual reality glasses (not depicted); holographic displays; and smoke tanks (not depicted)), and printers (not depicted). A graphics adapter (950) generates an image and outputs the image to the touch screen (910).

The computer system (900) may also include human-accessible storage devices and their associated media, such as optical media including a CD/DVD ROM/RW (920) with CD/DVD or similar media (921), a thumb drive (922), a removable hard or solid-state drive (923), legacy magnetic media such as magnetic tape and floppy disk (not depicted), specialized ROM/ASIC/PLD-based devices such as security dongles (not depicted), and the like.

Those skilled in the art will also appreciate that the term "computer-readable medium" used in connection with the presently disclosed subject matter does not encompass transmission media, carrier waves, or other transitory signals.

The computer system (900) may also include an interface to one or more communication networks (955). The network (955) may be, for example, a wireless network, a wired network, or an optical network. The network (955) may further be a local network, a wide-area network, a metropolitan-area network, a vehicular and industrial network, a real-time network, a delay-tolerant network, and so on. Examples of networks (955) include local area networks such as Ethernet and wireless LANs; cellular networks including Global System for Mobile communications (GSM), third generation (3G), fourth generation (4G), fifth generation (5G), and Long Term Evolution (LTE) networks; TV wired or wireless wide-area digital networks including cable TV, satellite TV, and terrestrial broadcast TV; and vehicular and industrial networks including CANBus. Certain networks (955) typically require external network interface adapters attached to certain general-purpose data ports or peripheral buses (949) (for example, Universal Serial Bus (USB) ports of the computer system (900)); other network interfaces are typically integrated into the core of the computer system (900) by attachment to a system bus as described below (for example, an Ethernet interface in a PC computer system or a cellular network interface (954) in a smartphone computer system). Using any of these networks (955), the computer system (900) may communicate with other entities. Such communication may be receive-only unidirectional (e.g., broadcast TV), send-only unidirectional (e.g., CANBus to certain CANBus devices), or bidirectional, for example to other computer systems using local or wide-area digital networks. As described above, certain protocols and protocol stacks may be used on each of those networks (955) and network interfaces (954).

The human interface devices, human-accessible storage devices, and network interfaces (954) described above may be attached to the core (940) of the computer system (900).

The core (940) may include one or more Central Processing Units (CPUs) (941), Graphics Processing Units (GPUs) (942), specialized programmable processing units in the form of Field Programmable Gate Arrays (FPGAs) (943), hardware accelerators (944) for certain tasks, and so forth. These devices, along with Read Only Memory (ROM) (945), Random Access Memory (RAM) (946), and internal mass storage (947) such as internal non-user-accessible hard drives and Solid State Drives (SSDs), may be connected through a system bus (948). In some computer systems, the system bus (948) may be accessible in the form of one or more physical plugs to enable extension by additional CPUs, GPUs, and the like. Peripheral devices may be attached either directly to the core's system bus (948) or to the system bus (948) through a peripheral bus (949). Architectures for a peripheral bus include Peripheral Component Interconnect (PCI), USB, and the like.

The CPU (941), GPU (942), FPGA (943), and accelerator (Accl.) (944) may execute certain instructions that, in combination, can make up the aforementioned computer code. That computer code may be stored in ROM (945) or RAM (946). Transitional data may also be stored in RAM (946), whereas permanent data may be stored, for example, in the internal mass storage (947). Fast storage to and retrieval from any of the memory devices may be enabled through the use of cache memory, which may be closely associated with one or more CPUs (941), GPUs (942), mass storage (947), ROM (945), RAM (946), and the like.

The computer-readable medium may have thereon computer code for performing various computer-implemented operations. The media and computer code may be those specially designed and constructed for the purposes of the embodiments, or they may be of the kind well known and available to those having skill in the computer software arts.

By way of non-limiting example, a computer system having the architecture (900), and in particular the core (940), may provide functionality as a result of one or more processors (including CPUs, GPUs, FPGAs, accelerators, and the like) executing software embodied in one or more tangible computer-readable media. Such computer-readable media may be media associated with the user-accessible mass storage described above, as well as certain non-transitory storage of the core (940), such as the core-internal mass storage (947) or the ROM (945). Software implementing embodiments may be stored in such devices and executed by the core (940). The computer-readable medium may include one or more memory devices or chips, according to particular needs. The software may cause the core (940), and in particular the processors therein (including CPUs, GPUs, FPGAs, and the like), to perform particular processes or particular parts of particular processes described herein, including defining data structures stored in RAM (946) and modifying such data structures according to the processes defined by the software. Additionally or alternatively, the computer system may provide functionality as a result of logic hardwired or otherwise embodied in a circuit (for example, the accelerator (944)) that may operate in place of or together with software to perform particular processes or particular parts of particular processes described herein. Reference to software may encompass logic, and vice versa, where appropriate. Reference to a computer-readable medium may encompass a circuit (such as an Integrated Circuit (IC)) storing software for execution, a circuit embodying logic for execution, or both, where appropriate. Embodiments encompass any suitable combination of hardware and software.

While this disclosure has described several embodiments, there are alterations, permutations, and various substitute equivalents, which fall within the scope of this disclosure. It will thus be appreciated that those skilled in the art will be able to devise numerous systems and methods which, although not explicitly shown or described herein, embody the principles of the disclosure and are thus within the spirit and scope of the disclosure.
