Signal processing device and image display apparatus including the same

Document No.: 1342053    Publication date: 2020-07-17

Description: This invention, Signal processing device and image display apparatus including the same, was designed and created by 金起出, 李准一, 金钟乾, 全善荷, 朴钟河, and 李东润 on 2020-01-08. A signal processing device and an image display apparatus including the same are disclosed. The signal processing device includes: a converter configured to convert a frequency of an input stereo audio signal; a principal component analyzer configured to perform principal component analysis based on the signal from the converter; a feature extractor configured to extract features of a principal component signal based on the signal from the principal component analyzer; an envelope adjuster configured to perform envelope adjustment based on a prediction performed based on a deep neural network model; and an inverse transformer configured to inversely transform the signal from the envelope adjuster to output a multi-channel up-mixed audio signal. Therefore, when a downmixed stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be reduced.

1. A signal processing apparatus, comprising:

a converter configured to convert a frequency of an input stereo audio signal;

a principal component analyzer configured to perform principal component analysis based on the signal from the converter;

a feature extractor configured to extract features of a principal component signal based on the signal from the principal component analyzer;

an envelope adjuster configured to perform envelope adjustment based on a prediction performed based on a deep neural network model; and

an inverse transformer configured to inversely transform the signal from the envelope adjuster to output a multi-channel up-mixed audio signal.

2. The signal processing apparatus of claim 1, further comprising a filter bank configured to filter the frequency converted stereo audio signal from the converter by a plurality of band pass filters.

3. The signal processing apparatus of claim 1, further comprising a model learning predictor configured to perform prediction based on the deep neural network model based on features from the feature extractor.

4. The signal processing apparatus according to claim 1, further comprising a masking unit configured to perform masking on a prediction result from a model learning predictor.

5. The signal processing apparatus according to claim 4, wherein, when each of multiple channels is assumed to be independent in time and frequency, the masking unit performs channel separation by masking time-frequency components based on a prediction result from the model learning predictor.

6. The signal processing apparatus according to claim 1, wherein the envelope adjuster separates channels by correcting an envelope of the signal in a frequency band according to a weight function for the frequency band based on a prediction result from a model learning predictor.

7. The signal processing apparatus of claim 1, wherein the envelope adjuster adjusts the magnitude of each frequency band to follow the envelope of a target channel.

8. The signal processing apparatus according to claim 1, wherein the principal component analyzer separates a principal component signal and a sub-component signal of the input stereo audio signal.

9. The signal processing apparatus according to claim 1, wherein the principal component analyzer performs at least one of a correlation operation between channels of the principal component signals of the input stereo audio signal, a panning gain operation of the principal component signals, and a power operation of the principal component signals.

10. The signal processing apparatus according to claim 9, wherein the feature extractor extracts a panning gain of the principal component signal of the input stereo audio signal and a power of the principal component signal.

11. The signal processing apparatus of claim 1, the signal processing apparatus further comprising:

a second converter configured to convert frequencies of a downmix stereo audio signal or a multi-channel audio signal received from a database;

a second principal component analyzer configured to perform principal component analysis based on the signal from the second converter; and

a second feature extractor configured to extract features of the principal component signals based on signals from the second principal component analyzer,

wherein learning is performed based on the deep neural network model based on the features extracted by the second feature extractor.

12. The signal processing apparatus according to claim 11, further comprising a model learning predictor configured to perform learning based on the deep neural network model based on the features extracted by the second feature extractor.

13. An image display apparatus comprising the signal processing device according to any one of claims 1 to 12.

Technical Field

The present disclosure relates to a signal processing device and an image display apparatus including the same, and more particularly, to a signal processing device capable of reducing spatial distortion when upmixing a downmixed stereo audio signal into a multi-channel audio signal, and an image display apparatus including the same.

Background

A signal processing apparatus is a device that can perform image signal processing or audio signal processing.

Recently, audio codecs such as MPEG-H 3D Audio of ATSC 3.0, Dolby AC-4, Dolby Atmos, and DTS Virtual:X have been standardized, and the corresponding rendering techniques have been widely adopted in audio devices such as mobile devices, home theaters, and image display apparatuses.

That is, the paradigm is changing from a traditional multi-channel audio codec to an immersive audio codec.

In addition, to reproduce an effective spatial impression, audio playback apparatuses have extended the two channels of a left-right speaker pair to 5.1 channels, 7.1 channels, and so on, forming a sound field on a two-dimensional plane.

More recently, to provide realistic audio for ultra-high-definition formats such as UHDTV, rendering has been extended further to multi-channel layouts spanning three-dimensional space, such as 5.1.2 or 22.2 channels.

However, due to the high cost of content production, the transmission facilities needed to deliver content to consumers, the limitations of wired and wireless environments, and the price competitiveness of audio playback devices, content is often delivered to consumers as low-quality stereo sources or as multi-channel sources downmixed to stereo.

In order to reproduce such a downmixed two-channel stereo source effectively on a multi-channel audio playback device, a multi-channel upmix method is required.

A method of re-separating a signal, in which a plurality of channels or sound sources have been combined, back into individual channels or sound sources is called blind upmixing or blind source separation.

Blind upmix and source separation methods include Independent Component Analysis (ICA), which analyzes the mixture under the assumption that the audio sources are statistically independent; Principal Component Analysis (PCA), which analyzes the mixture in terms of principal and ambient component signals; and unsupervised-learning-based Non-negative Matrix Factorization (NMF).
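As an illustration of the PCA approach, the sketch below (a toy in Python/numpy under common definitions, not the patent's implementation) decomposes a stereo pair into a primary component along the dominant panning direction and an ambient residual:

    import numpy as np

    def pca_primary_ambient(left, right):
        """Split a stereo pair into primary and ambient parts via PCA:
        eigen-decompose the 2x2 channel covariance, project onto the
        dominant eigenvector (primary), keep the residual as ambient."""
        x = np.stack([left, right])               # shape (2, n_samples)
        cov = x @ x.conj().T / x.shape[1]         # 2x2 channel covariance
        _, eigvecs = np.linalg.eigh(cov)          # eigenvalues ascending
        v = eigvecs[:, -1]                        # dominant (panning) direction
        primary = np.outer(v, v.conj() @ x)       # projection onto v
        ambient = x - primary                     # residual = ambient part
        return primary, ambient

    # toy usage: a source panned 70/30 plus uncorrelated noise
    rng = np.random.default_rng(0)
    s = rng.standard_normal(1024)
    stereo = np.stack([0.7 * s, 0.3 * s]) + 0.05 * rng.standard_normal((2, 1024))
    primary, ambient = pca_primary_ambient(stereo[0], stereo[1])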

However, with the principal component analysis (PCA) method, the signals separated into principal and ambient components differ from the original multi-channel layout, so the principal and ambient components cannot be matched to the original multi-channel signals.

For example, in a multi-channel playback device, if the principal components are distributed to the front channels and the ambient components are distributed uniformly across all channels, or if components are rendered to the rear or top channels contrary to the content creator's actual intent, distorted spatial sound characteristics occur, such as audio objects appearing to be located only in the front.

Furthermore, since ICA- and NMF-based methods likewise decompose the signal into independent or basis components, it is difficult to map the results to actual channels such as front, center, woofer, rear, and top.

Disclosure of Invention

An object of the present disclosure is to provide a signal processing device capable of reducing spatial distortion when upmixing a downmixed stereo audio signal into a multi-channel audio signal, and an image display apparatus having the same.

Another object of the present disclosure is to provide a signal processing apparatus capable of synthesizing a multi-channel signal using a principal component analysis method and a deep neural network model, and an image display device having the same.

In order to achieve the above object, a signal processing device and an image display apparatus including the same according to an embodiment of the present disclosure include: a converter configured to convert a frequency of an input stereo audio signal; a principal component analyzer configured to perform principal component analysis based on the signal from the converter; a feature extractor configured to extract features of the principal component signals based on the signals from the principal component analyzer; an envelope adjuster configured to perform envelope adjustment based on a prediction performed based on a deep neural network model; and an inverse transformer configured to inversely transform the signal from the envelope adjuster to output a multi-channel up-mixed audio signal.

The signal processing device and the image display apparatus including the same according to the embodiments of the present disclosure further include a filter bank configured to filter the frequency-converted stereo audio signal from the converter through a plurality of band pass filters.

The signal processing apparatus and the image display device including the same according to the embodiments of the present disclosure further include a model learning predictor configured to perform prediction based on the deep neural network model based on the features from the feature extractor.

The signal processing apparatus and the image display device including the same according to the embodiments of the present disclosure further include a masking unit configured to perform masking on a prediction result from the model learning predictor.

When each of the multiple channels is assumed to be independent in time and frequency, the masking unit performs channel separation by masking time-frequency components based on a prediction result from the model learning predictor.
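To make this concrete, the following sketch (hypothetical; it assumes the deep neural network model outputs per-channel magnitude estimates, which the text does not state) builds soft ratio masks over time-frequency bins and applies them to a mixture STFT:

    import numpy as np

    def tf_masks(predicted_mags, eps=1e-8):
        """Soft ratio masks from per-channel magnitude predictions of shape
        (n_channels, n_freq, n_frames). If the channels are (approximately)
        disjoint in time-frequency, each bin is distributed among channels
        in proportion to its predicted magnitude."""
        total = predicted_mags.sum(axis=0, keepdims=True) + eps
        return predicted_mags / total             # masks sum to ~1 per bin

    def apply_masks(stft_mix, masks):
        """Separate channels by masking a (n_freq, n_frames) mixture STFT."""
        return masks * stft_mix[np.newaxis, :, :]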

The envelope adjuster separates channels by correcting the envelope of the signal in each frequency band according to a weight function for that frequency band, based on a prediction result from the model learning predictor.

The envelope adjuster adjusts the magnitude of each frequency band to follow the envelope of the target channel.
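A minimal sketch of such an envelope adjustment, assuming per-band magnitudes and a predicted target envelope; the power-weighted gain used here is an illustrative choice, since the text does not give the weight function's exact form:

    import numpy as np

    def adjust_envelope(band_mags, target_env, weight):
        """Pull per-band magnitudes toward a predicted target envelope.

        band_mags:  (n_bands, n_frames) magnitudes to correct.
        target_env: (n_bands, n_frames) predicted target-channel envelope.
        weight:     (n_bands,) weight function in [0, 1]; 1 makes a band
                    follow the target fully, 0 leaves it unchanged."""
        w = np.asarray(weight)[:, np.newaxis]
        current = np.abs(band_mags) + 1e-8
        gain = (target_env / current) ** w        # weighted envelope correction
        return band_mags * gain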

The principal component analyzer separates the input stereo audio signal into a principal component signal and a sub-component signal.

The principal component analyzer performs at least one of a correlation operation between channels of a principal component signal of the input stereo audio signal, a panning gain operation of the principal component signal, and a power operation of the principal component signal.

The feature extractor extracts the panning gain and the power of the principal component signal of the input stereo audio signal.
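For illustration, the features named above (inter-channel correlation, panning gain, and power) could be computed per frame as follows; these are common definitions, not necessarily the patent's exact formulas:

    import numpy as np

    def pc_features(left, right, eps=1e-12):
        """Per-frame stereo features: inter-channel correlation, panning
        gain, and power, computed from short-time channel statistics."""
        p_ll = np.mean(np.abs(left) ** 2)                 # left power
        p_rr = np.mean(np.abs(right) ** 2)                # right power
        p_lr = np.mean(left * np.conj(right))             # cross term
        correlation = np.abs(p_lr) / np.sqrt(p_ll * p_rr + eps)
        panning_gain = np.sqrt(p_rr / (p_ll + p_rr + eps))  # 0 = hard left, 1 = hard right
        power = p_ll + p_rr
        return correlation, panning_gain, power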

The signal processing device and the image display apparatus including the same according to the embodiment of the present disclosure further include: a second converter configured to convert frequencies of a downmix stereo audio signal or a multi-channel audio signal received from a database; a second principal component analyzer configured to perform principal component analysis based on the signal from the second converter; and a second feature extractor configured to extract features of the principal component signals based on the signals from the second principal component analyzer; wherein the learning is performed based on the deep neural network model based on the features extracted by the second feature extractor.

The signal processing apparatus and the image display device including the same according to the embodiments of the present disclosure further include a model learning predictor configured to perform learning based on the deep neural network model based on the features extracted by the second feature extractor.

Drawings

Embodiments will be described in detail with reference to the following drawings, wherein like reference numerals represent like elements, and wherein:

fig. 1 is a diagram illustrating an image display system according to an embodiment of the present disclosure;

fig. 2 is an example illustrating an internal block diagram of the image display apparatus of fig. 1;

fig. 3 is a diagram of an internal block diagram of the signal processing apparatus shown in fig. 2;

fig. 4A is a diagram illustrating a control method of the remote controller of fig. 2;

FIG. 4B is an internal block diagram of the remote control of FIG. 2;

FIG. 5 is an internal block diagram of the display of FIG. 2;

fig. 6A and 6B are diagrams referred to in describing the organic light emitting diode panel of fig. 5;

fig. 7 is an example of an internal block diagram of a signal processing apparatus according to an embodiment of the present disclosure;

fig. 8 to 9B are diagrams referred to in describing the signal processing apparatus shown in fig. 7;

fig. 10 is an example of an internal block diagram of a signal processing apparatus according to an embodiment of the present disclosure;

fig. 11 is another example of an internal block diagram of a signal processing apparatus according to an embodiment of the present disclosure;

fig. 12 is a diagram referred to in describing fig. 10 to 11;

fig. 13 is a flowchart illustrating an operation of a signal processing apparatus according to an embodiment of the present disclosure; and

fig. 14 is a flowchart illustrating an operation of a signal processing apparatus according to an embodiment of the present disclosure.

Detailed Description

Hereinafter, the present disclosure will be described in detail with reference to the accompanying drawings.

The suffixes "module" and "unit" attached to constituent elements in the following description are assigned merely for ease of drafting the specification and do not carry distinct meanings. Thus, the suffixes "module" and "unit" may be used interchangeably.

Fig. 1 is a diagram illustrating an image display system according to an embodiment of the present disclosure.

Referring to this drawing, the image display system 10 according to an embodiment of the present disclosure may include an image display apparatus 100 including a display 180, a set-top box 300, and a server 600.

The image display apparatus 100 according to the embodiment of the present disclosure may receive an image from the set-top box 300 or the server 600.

For example, the image display apparatus 100 may receive an image signal from the set-top box 300 through the HDMI terminal.

For another example, the image display apparatus 100 may receive an image signal from the server 600 through a network terminal.

Further, the image display apparatus 100 may calculate an original quality of an original image signal received through the external set-top box 300 or the network, set an image quality of the image signal according to the calculated original quality, and perform an image quality process on the image signal according to the set image quality.

In addition, the image display apparatus 100 may calculate the resolution and noise level of the received image signal using a Deep Neural Network (DNN). Accordingly, the original quality calculation can be accurately performed on the received image signal.

Further, the image display apparatus 100 may update the parameters of the DNN from the server 600, and calculate the resolution and noise level of the received image signal based on the updated parameters. Therefore, the original quality of the image signal can be accurately calculated based on the learning.

For example, the display 180 may be any one of a liquid crystal display panel (LCD panel), an organic light emitting diode panel (OLED panel), or an inorganic light emitting diode panel (LED panel).

In the present disclosure, an example in which the display 180 includes an organic light emitting diode panel (OLED panel) is mainly described.

Further, the OLED panel exhibits a faster response speed than the LED panel and is excellent in color reproduction.

Therefore, if the display 180 includes an OLED panel, it is preferable that the signal processing device 170 (see FIG. 2) of the image display apparatus 100 performs image quality processing for the OLED panel.

Further, the image display apparatus 100 in fig. 1 may be a TV, a monitor, a tablet PC, a mobile terminal, a display for a vehicle, and the like.

In addition, the image display apparatus 100 may upmix the input audio signals of the stereo channels into audio signals of multiple channels using a deep neural network.

To this end, the image display apparatus 100 according to the embodiment of the present disclosure includes: a converter 1010 for frequency-converting an input stereo audio signal; a principal component analyzer 1030 for performing principal component analysis based on the signal from the converter 1010; a feature extractor 1040 for extracting features of the principal component signal based on the signal from the principal component analyzer 1030; an envelope adjuster 1060 for performing envelope adjustment based on a prediction performed based on a deep neural network model; and an inverse converter 1070 for inverse-converting the signal from the envelope adjuster 1060 to output a multi-channel up-mixed audio signal. Therefore, when a downmixed stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be reduced. In particular, a multi-channel signal can be readily synthesized using a principal component analysis method and a deep neural network model.
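As a rough end-to-end illustration of this chain, the following sketch strings the stages together, using scipy's STFT/ISTFT as the converter 1010 and inverse converter 1070 and a placeholder model standing in for the deep neural network predictor; every name and parameter here is an assumption, and the ambient path and filter bank are omitted for brevity:

    import numpy as np
    from scipy.signal import stft, istft

    def upmix_stereo(x_left, x_right, model, fs=48000, n_targets=5):
        """Sketch of the chain: converter 1010 -> principal component
        analyzer 1030 -> feature extractor 1040 -> envelope adjustment
        1060 -> inverse converter 1070. `model` is a hypothetical
        stand-in for the deep neural network predictor."""
        _, _, L = stft(x_left, fs=fs, nperseg=1024)   # converter: to frequency domain
        _, _, R = stft(x_right, fs=fs, nperseg=1024)
        out = np.zeros((n_targets,) + L.shape, dtype=complex)

        for t in range(L.shape[1]):                   # process frame by frame
            l, r = L[:, t], R[:, t]
            # principal component analysis: dominant inter-channel direction
            cov = np.array([[np.vdot(l, l), np.vdot(l, r)],
                            [np.vdot(r, l), np.vdot(r, r)]])
            _, vecs = np.linalg.eigh(cov)
            v = vecs[:, -1]
            primary = np.outer(v, np.conj(v) @ np.stack([l, r]))

            # features of the principal component signal: power, panning gain
            power = float(np.real(np.vdot(primary[0], primary[0])
                                  + np.vdot(primary[1], primary[1])))
            pan = abs(v[1]) / (abs(v[0]) + abs(v[1]) + 1e-12)

            # DNN-predicted per-channel gains shape each target's envelope
            gains = model(np.array([power, pan]))     # shape (n_targets,)
            mono_pc = primary.mean(axis=0)
            for ch in range(n_targets):
                out[ch, :, t] = gains[ch] * mono_pc

        # inverse converter: back to time domain, one waveform per channel
        return np.stack([istft(out[ch], fs=fs, nperseg=1024)[1]
                         for ch in range(n_targets)])

    # toy usage with a constant-gain placeholder model
    rng = np.random.default_rng(0)
    x = rng.standard_normal(48000)
    multichannel = upmix_stereo(0.7 * x, 0.3 * x, lambda f: np.full(5, 0.4))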

Fig. 2 is an example illustrating an internal block diagram of the image display apparatus of fig. 1.

Referring to fig. 2, the image display apparatus 100 according to the embodiment includes a broadcast receiving unit 105, a storage unit 140, a user input interface 150, a sensor unit (not shown), a signal processing device 170, a display 180, an audio output unit 185, and a brightness sensor 197.

The broadcast receiving unit 105 may include a tuner unit 110, a demodulator 120, a network interface 135, and an external device interface 130.

Further, unlike the drawing, the broadcast receiving unit 105 may include only the tuner unit 110, the demodulator 120, and the external device interface 130. That is, the network interface 135 may not be included.

The tuner unit 110 selects a Radio Frequency (RF) broadcast signal corresponding to a channel selected by a user, or all pre-stored channels, among RF broadcast signals received through an antenna (not shown). In addition, the selected RF broadcast signal is converted into an intermediate frequency signal or a baseband image or audio signal.

For example, if the selected RF broadcast signal is a digital broadcast signal, it is converted into a digital IF signal (DIF). If the selected RF broadcast signal is an analog broadcast signal, it is converted into an analog baseband image or audio signal (CVBS/SIF). That is, the tuner unit 110 may process a digital broadcast signal or an analog broadcast signal. The analog baseband image or audio signal (CVBS/SIF) output from the tuner unit 110 may be directly input to the signal processing device 170.

In addition, the tuner unit 110 may include a plurality of tuners for receiving broadcast signals of a plurality of channels. Alternatively, a single tuner that simultaneously receives broadcast signals of a plurality of channels may also be used.

The demodulator 120 receives the converted digital IF signal DIF from the tuner unit 110 and performs a demodulation operation.

The demodulator 120 may perform demodulation and channel decoding and then output a stream signal TS. At this time, the stream signal may be a multiplexed signal of an image signal, an audio signal, or a data signal.

The stream signal output from the demodulator 120 may be input to the signal processing device 170. The signal processing device 170 performs separation, image/audio signal processing, and the like, and then outputs an image to the display 180 and an audio to the audio output unit 185.

The external device interface 130 may transmit data to or receive data from a connected external device (not shown), for example, the set-top box 300. To this end, the external device interface 130 may include an A/V input and output unit (not shown).

The external device interface 130 may be connected, by wire or wirelessly, to external devices such as a Digital Versatile Disc (DVD) player, a Blu-ray player, a game device, a camera, a camcorder, a computer (laptop), or a set-top box, and may perform input/output operations with them.

The A/V input and output unit may receive an image signal and an audio signal from an external device. Further, a wireless communication unit (not shown) may perform short-range wireless communication with other electronic devices.

The external device interface 130 may exchange data with the neighboring mobile terminal 600 through a wireless communication unit (not shown). In particular, in the mirror mode, the external device interface 130 may receive device information, executed application information, an application image, and the like from the mobile terminal 600.

The network interface 135 provides an interface for connecting the image display apparatus 100 to a wired/wireless network including the internet. For example, the network interface 135 may receive content or data provided by the internet, a content provider, or a network operator via a network.

Further, the network interface 135 may include a wireless communication unit (not shown).

The storage unit 140 may store a program for each signal processing and control in the signal processing device 170, and may store an image, audio, or data signal after the signal processing.

In addition, the storage unit 140 may be used to temporarily store image, audio, or data signals input to the external device interface 130. In addition, the storage unit 140 may store information on a specific broadcast channel through a channel storage function such as channel mapping.

Although fig. 2 illustrates that the storage unit is provided separately from the signal processing apparatus 170, the scope of the present disclosure is not limited thereto. The storage unit 140 may be included in the signal processing device 170.

The user input interface 150 transmits a signal input by a user to the signal processing device 170 or transmits a signal from the signal processing device 170 to the user.

For example, the user input interface 150 may transmit and receive user input signals, such as power on/off, channel selection, and screen settings, to and from the remote controller 200; may transmit user input signals entered through local keys (not shown), such as a power key, a channel key, a volume key, and a setting key, to the signal processing device 170; may transmit user input signals from a sensor unit (not shown) that senses user gestures to the signal processing device 170; or may transmit signals from the signal processing device 170 to the sensor unit (not shown).

The signal processing device 170 may demultiplex an input stream received through the tuner unit 110, the demodulator 120, the network interface 135, or the external device interface 130, or process the separated signals, to generate and output signals for image or audio output.

For example, the signal processing device 170 may receive a broadcast signal or an HDMI signal received by the broadcast receiving unit 105, and perform signal processing based on the received broadcast signal or HDMI signal, thereby outputting a processed image signal.

The image signal processed by the signal processing device 170 is input to the display 180, and may be displayed as an image corresponding to the image signal. In addition, the image signal processed by the signal processing apparatus 170 may be input to an external output device through the external device interface 130.

The audio signal processed by the signal processing apparatus 170 may be output to the audio output unit 185 as an audio signal. In addition, the audio signal processed by the signal processing apparatus 170 may be input to an external output device through the external device interface 130.

Although not shown in fig. 2, the signal processing device 170 may include a demultiplexer, an image processing unit, and the like. That is, the signal processing apparatus 170 can perform various signal processing, and to this end, it may be implemented in the form of a System On Chip (SOC). This will be described later with reference to fig. 3.

In addition, the signal processing device 170 may control the overall operation of the image display apparatus 100. For example, the signal processing apparatus 170 may control the tuner unit 110 to control tuning of an RF broadcast corresponding to a channel selected by a user or a previously stored channel.

In addition, the signal processing device 170 may control the image display apparatus 100 according to a user command or an internal program input through the user input interface 150.

In addition, the signal processing device 170 may control the display 180 to display an image. At this time, the image displayed on the display unit 180 may be a still image or a moving image, and may be a 2D image or a 3D image.

In addition, the signal processing device 170 may display a specific object in an image displayed on the display 180. For example, the object may be at least one of a connected web page picture (newspaper, magazine, etc.), an Electronic Program Guide (EPG), various menus, widgets, icons, still images, moving images, and text.

Further, the signal processing device 170 may recognize the position of the user based on an image captured by a capturing unit (not shown). For example, the distance (z-axis coordinate) between the user and the image display apparatus 100 may be determined. Additionally, x-axis coordinates and y-axis coordinates corresponding to the user location in the display 180 may be determined.

The display 180 generates a driving signal by converting the image signal, data signal, OSD signal, and control signal processed by the signal processing device 170, or the image signal, data signal, and control signal received from the external device interface 130.

Further, the display 180 may be configured as a touch screen and also serve as an input device in addition to an output device.

The audio output unit 185 receives the signal processed by the signal processing device 170 and outputs the signal as audio.

A photographing unit (not shown) photographs a user. The photographing unit (not shown) may be implemented by a single camera, but the present disclosure is not limited thereto and may be implemented by a plurality of cameras. Image information photographed by a photographing unit (not shown) may be input to the signal processing device 170.

The signal processing device 170 may sense a user's gesture based on an image photographed by the photographing unit (not shown), a signal detected by the sensor unit (not shown), or a combination thereof.

The power supply 190 supplies corresponding power to the image display apparatus 100. In particular, the power supply 190 may supply power to the signal processing apparatus 170, which may be implemented in the form of an SOC, the display 180 for displaying an image, and the audio output unit 185 for outputting audio.

Specifically, the power supply 190 may include a converter for converting AC power into DC power and a DC/DC converter for converting the level of the DC power.

The brightness sensor 197 may sense ambient brightness of the display 180.

Remote control 200 transmits user inputs to user input interface 150. For this, the remote controller 200 may use Bluetooth, Radio Frequency (RF) communication, Infrared (IR) communication, Ultra Wideband (UWB), ZigBee, etc. In addition, the remote controller 200 may receive an image, audio, or data signal output from the user input interface 150 and display it on the remote controller 200 or output it as audio.

Further, the image display apparatus 100 may be a fixed or mobile digital broadcasting receiver capable of receiving digital broadcasting.

Further, the block diagram of the image display apparatus 100 shown in fig. 2 is a block diagram of an embodiment of the present disclosure. Each component in the block diagram may be integrated, added, or omitted according to the specification of the image display apparatus 100 actually implemented. That is, two or more components may be integrated into a single component, or a single component may be divided into two or more components, if desired. The functions performed in each block are described for the purpose of illustrating the embodiments of the present disclosure, and the specific operations or apparatuses thereof do not limit the scope of the present disclosure.

Fig. 3 is a diagram of an internal block diagram of the signal processing apparatus shown in fig. 2.

Referring to this drawing, the signal processing apparatus 170 according to an embodiment of the present disclosure may include a demultiplexer 310, an image processing unit 320, a processor 330, and an audio processing unit 370. In addition, it may further include a data processing unit (not shown).

The demultiplexer 310 may separate an input stream. For example, when an MPEG-2 TS is input, it may be separated into image, audio, and data signals, respectively. Here, the stream signal input to the demultiplexer 310 may be a stream signal output from the tuner unit 110, the demodulator 120, or the external device interface 130.

The image processing unit 320 may perform signal processing on the input image. For example, the image processing unit 320 may perform image processing on the image signal separated by the demultiplexer 310.

To this end, the image processing unit 320 may include an image decoder 325, a scaler 335, an image quality processing unit 635, an image encoder (not shown), an OSD processing unit 340, a frame rate converter 350, a formatter 360, and the like.

The image decoder 325 may decode the separated image signal, and the scaler 335 performs scaling so that the resolution of the decoded image signal may be output from the display 180.

The image decoder 325 may include decoders of various standards. For example, an MPEG-2 decoder, an H.264 decoder, a 3D image decoder for color images and depth images, and a decoder for multi-view images may be provided.

The scaler 335 may scale the input image signal decoded by the image decoder 325 or the like.

For example, if the size or resolution of the input image signal is small, the scaler 335 may upscale the input image signal, and if the size or resolution of the input image signal is large, the scaler 335 may downscale the input image signal.

The image quality processing unit 635 may perform image quality processing on the input image signal decoded by the image decoder 325 or the like.

For example, the image quality processing unit 635 may perform noise reduction processing on the input image signal, extend the grayscale resolution of the input image signal, enhance image resolution, perform High Dynamic Range (HDR)-based signal processing, change the frame rate, and perform image quality processing suited to the characteristics of the panel (particularly an OLED panel).

The OSD processing unit 340 may generate an OSD signal according to a user input or itself. For example, based on a user input signal, the OSD processing unit 340 may generate a signal for displaying various information such as graphics or text on the screen of the display 180. The generated OSD signal may include various data such as a user interface screen, various menu screens, widgets, and icons of the image display apparatus 100. In addition, the generated OSD signal may include a 2D object or a 3D object.

In addition, the OSD processing unit 340 may generate a pointer that can be displayed on the display based on a pointing signal input from the remote controller 200. In particular, such a pointer may be generated by a pointing signal processing device, and the OSD processing unit 340 may include this pointing signal processing device (not shown). Alternatively, the pointing signal processing device (not shown) may be provided separately from the OSD processing unit 340.

The Frame Rate Converter (FRC) 350 may convert the frame rate of an input image. The frame rate converter 350 may also output the input image directly without additional frame rate conversion.

Further, the formatter 360 may change the format of the input image signal into a format suitable for displaying the image signal on the display, and output the image signal in the changed format.

In particular, the formatter 360 may change the format of the image signal to correspond to the display panel.

The processor 330 may control the overall operation of the image display apparatus 100 or the signal processing device 170.

For example, the processor 330 may control the tuner unit 110 to control tuning of an RF broadcast corresponding to a channel selected by a user or a previously stored channel.

In addition, the processor 330 may control the image display apparatus 100 according to a user command input through the user input interface 150 or an internal program.

In addition, the processor 330 may transmit data to the network interface unit 135 or the external device interface 130.

In addition, the processor 330 may control the demultiplexer 310, the image processing unit 320, and the like in the signal processing device 170.

Further, the audio processing unit 370 in the signal processing apparatus 170 may perform audio processing on the separated audio signal. To this end, the audio processing unit 370 may include various decoders.

In addition, the audio processing unit 370 in the signal processing apparatus 170 may process bass, treble, volume control, and the like.

A data processor (not shown) in the signal processing device 170 may perform data processing on the separated data signals. For example, when the separated data signal is an encoded data signal, it can be decoded. The encoded data signal may be electronic program guide information including broadcast information such as a start time and an end time of a broadcast program broadcast on each channel.

The block diagram of the signal processing apparatus 170 shown in fig. 3 is a block diagram of an embodiment of the present disclosure. Each component in the block diagram may be integrated, added, or omitted according to the specification of the signal processing apparatus 170 actually implemented.

In particular, the frame rate converter 350 and the formatter 360 may be provided separately from the image processing unit 320.

Fig. 4A is a diagram illustrating a control method of the remote controller of fig. 2.

Fig. 4A (a) illustrates that a pointer 205 corresponding to the remote controller 200 is displayed on the display 180.

The user can move or rotate the remote controller 200 up and down, left and right ((b) of fig. 4A), and back and forth ((c) of fig. 4A). The pointer 205 displayed on the display 180 of the image display apparatus corresponds to the motion of the remote controller 200. Such a remote controller 200 may be referred to as a spatial remote controller or a 3D pointing device because the pointer 205 moves and is displayed according to the movement in the 3D space, as shown in the drawing.

Fig. 4A (b) illustrates that when the user moves the remote controller 200 to the left, the pointer 205 displayed on the display 180 of the image display apparatus is correspondingly moved to the left.

Information about the motion of the remote controller 200 detected by the sensor of the remote controller 200 is transmitted to the image display apparatus. The image display apparatus may calculate the coordinates of the pointer 205 from the information on the motion of the remote controller 200. The image display device may display the pointer 205 in correspondence with the calculated coordinates.

Fig. 4A (c) illustrates the case where the user moves the remote controller 200 away from the display 180 while pressing a specific button of the remote controller 200. Accordingly, the selection area within the display 180 corresponding to the pointer 205 may be zoomed in and displayed enlarged. Conversely, if the user moves the remote controller 200 closer to the display 180, the selection area corresponding to the pointer 205 may be zoomed out and displayed reduced. Alternatively, the selection area may be reduced when the remote controller 200 moves away from the display 180 and enlarged when the remote controller 200 approaches the display 180.

Further, when a specific button of the remote controller 200 is pressed, recognition of vertical and lateral movement may be excluded. That is, when the remote controller 200 moves away from or toward the display 180, up, down, left, and right movements are not recognized, and only forward and backward movements are recognized. When no specific button of the remote controller 200 is pressed, only the pointer 205 moves according to the up, down, left, and right movements of the remote controller 200.

Further, the moving speed or moving direction of the pointer 205 may correspond to the moving speed or moving direction of the remote controller 200.

Fig. 4B is an internal block diagram of the remote controller of fig. 2.

Referring to this drawing, the remote controller 200 includes a wireless communication unit 425, a user input unit 435, a sensor unit 440, an output unit 450, a power supply 460, a storage unit 470, and a controller 480.

The wireless communication unit 425 transmits signals to and receives signals from any one of the image display apparatuses according to the embodiments of the present disclosure described above. Among them, one image display apparatus 100 will be described as an example.

In the current embodiment, the remote controller 200 may include an RF module 421, and the RF module 421 is used for transmission and reception of signals with the image display apparatus 100 according to an RF communication standard. In addition, the remote controller 200 may include an IR module 423 for transmitting and receiving signals with the image display apparatus 100 according to an IR communication standard.

In the current embodiment, the remote controller 200 transmits a signal containing information on the motion of the remote controller 200 to the image display apparatus 100 through the RF module 421.

In addition, the remote controller 200 may receive a signal transmitted by the image display apparatus 100 through the RF module 421. In addition, the remote controller 200 may transmit commands related to power on/off, channel change, volume control, etc., to the image display apparatus 100 through the IR module 423, if necessary.

The user input unit 435 may be implemented by a keyboard, buttons, a touch pad, a touch screen, and the like. The user may operate the user input unit 435 to input a command related to the image display apparatus 100 to the remote controller 200. If the user input unit 435 includes a hard key button, the user may input a command related to the image display apparatus 100 to the remote controller 200 through a pressing operation of the hard key button. When the user input unit 435 includes a touch screen, the user may touch a soft key of the touch screen to input a command related to the image display apparatus 100 to the remote controller 200. In addition, the user input unit 435 may include various types of input devices that the user can operate, such as a scroll key, a jog key, and the like, and the present disclosure does not limit the scope of the present disclosure.

The sensor unit 440 may include a gyro sensor 441 or an acceleration sensor 443. The gyro sensor 441 can sense information related to the motion of the remote controller 200.

For example, the gyro sensor 441 can sense information related to the operation of the remote controller 200 based on the x-axis, the y-axis, and the z-axis. The acceleration sensor 443 can sense information related to the moving speed of the remote controller 200. Further, a distance measuring sensor may also be provided, and thus, the distance from the display 180 can be sensed.

The output unit 450 may output an image or audio signal corresponding to the operation of the user input unit 435 or a signal transmitted from the image display apparatus 100. Through the output unit 450, the user can recognize whether to operate the user input unit 435 or to control the image display apparatus 100.

For example, the output unit 450 may include: an LED module 451 that lights up when the user input unit 435 is operated or when signals are transmitted to or received from the image display apparatus 100 through the wireless communication unit 425; a vibration module 453 for generating vibration; an audio output module 455 for outputting audio; and a display module 457 for outputting an image.

The power supply 460 supplies power to the remote controller 200. When the remote controller 200 is not moved for a certain time, the power supply 460 may stop supplying power to reduce power waste. The power supply 460 may restart power supply when a specific key provided in the remote controller 200 is operated.

The storage unit 470 may store various types of programs, application data, and the like required to control or operate the remote controller 200. If the remote controller 200 performs wireless transmission and reception of signals with the image display apparatus 100 through the RF module 421, the remote controller 200 and the image display apparatus 100 may transmit and receive signals through a specific frequency band. The controller 480 of the remote controller 200 may store information on a frequency band or the like for wireless transmission and reception of a signal with the image display apparatus 100 paired with the remote controller 200 in the storage unit 470, and may refer to the stored information.

The controller 480 controls all matters related to the control of the remote controller 200. The controller 480 may transmit a signal corresponding to a specific key operation of the user input unit 435, or a signal corresponding to a motion of the remote controller 200 sensed by the sensor unit 440, to the image display apparatus 100 through the wireless communication unit 425.

The user input interface 150 of the image display apparatus 100 includes: a wireless communication unit 151 capable of wireless transmission and reception of signals with the remote controller 200; and a coordinate value calculator 415 that can calculate coordinate values of the pointer corresponding to an operation of the remote controller 200.

The user input interface 150 can perform wireless transmission and reception of signals with the remote controller 200 through the RF module 412. In addition, the user input interface 150 may receive a signal transmitted from the remote controller 200 through the IR module 413 according to an IR communication standard.

The coordinate value calculator 415 may correct hand tremor or errors in the signal corresponding to an operation of the remote controller 200 received through the wireless communication unit 151, and calculate the coordinate values (x, y) of the pointer 205 to be displayed on the display 180.
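For illustration, such a correction could be as simple as a low-pass filter plus clamping; the exponential moving average below is an assumption, since the text does not specify the actual correction algorithm:

    def make_pointer_filter(width=3840, height=2160, alpha=0.2):
        """Pointer smoothing sketch: an exponential moving average damps
        hand tremor, then the result is clamped to the display bounds."""
        state = {"x": width / 2, "y": height / 2}

        def update(raw_x, raw_y):
            state["x"] += alpha * (raw_x - state["x"])   # low-pass: damp jitter
            state["y"] += alpha * (raw_y - state["y"])
            x = min(max(state["x"], 0), width - 1)       # keep pointer on screen
            y = min(max(state["y"], 0), height - 1)
            return round(x), round(y)

        return update

    # usage: feed raw motion-derived coordinates, get smoothed pixel coordinates
    update = make_pointer_filter()
    x, y = update(1920.0, 1080.0)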

The transmission signal of the remote controller 200 input to the image display apparatus 100 through the user input interface 150 is transmitted to the signal processing device 170 of the image display apparatus 100. The signal processing device 170 may determine information on the operation and key operation of the remote controller 200 according to a signal transmitted from the remote controller 200 and, correspondingly, control the image display apparatus 100.

For another example, the remote controller 200 may calculate pointer coordinate values corresponding to an operation and output them to the user input interface 150 of the image display apparatus 100. In this case, the user input interface 150 of the image display apparatus 100 may transmit information on the received pointer coordinate values to the signal processing device 170 without a separate hand-tremor or error correction process.

For another example, unlike the drawing, the coordinate value calculator 415 may be provided in the signal processing apparatus 170, not in the user input interface 150.

Fig. 5 is an internal block diagram of the display of fig. 2.

Referring to fig. 5, the organic light emitting diode panel-based display 180 may include an organic light emitting diode panel 210, a first interface 230, a second interface 231, a timing controller 232, a gate driver 234, a data driver 236, a memory 240, a processor 270, a power supply 290, a current detector 510, and the like.

The display 180 receives the image signal Vd, the first DC power V1, and the second DC power V2, and may display a specific image based on the image signal Vd.

Further, the first interface 230 in the display 180 may receive the image signal Vd and the first DC power V1 from the signal processing apparatus 170.

Here, the first DC power V1 may be used for the operation of the power supply 290 and the timing controller 232 in the display 180.

Next, the second interface 231 may receive the second DC power V2 from the external power supply 190. In addition, the second DC power V2 may be input to the data driver 236 in the display 180.

The timing controller 232 may output the data driving signal Sda and the gate driving signal Sga based on the image signal Vd.

For example, when the first interface 230 converts the input image signal Vd and outputs the converted image signal va1, the timing controller 232 may output the data driving signal Sda and the gate driving signal Sga based on the converted image signal va1.

The timing controller 232 may receive a control signal, a vertical synchronization signal Vsync, and the like, in addition to the image signal Vd from the signal processing device 170.

In addition to the image signal Vd, the timing controller 232 generates a gate driving signal Sga for the operation of the gate driver 234 and a data driving signal Sda for the operation of the data driver 236 based on a control signal, a vertical synchronization signal Vsync, and the like.

At this time, when the panel 210 includes RGBW sub-pixels, the data driving signal Sda may be a data driving signal for driving the RGBW sub-pixels.

In addition, the timing controller 232 may also output a control signal Cs to the gate driver 234.

The gate driver 234 and the data driver 236 supply scan signals and image signals to the organic light emitting diode panel 210 through the gate lines GL and the data lines DL according to the gate driving signal Sga and the data driving signal Sda from the timing controller 232, respectively. Thus, the organic light emitting diode panel 210 displays a specific image.

In order to display an image, a plurality of gate lines GL and data lines DL may be disposed in a matrix form, crossing at each pixel corresponding to the organic light emitting layer.

In addition, the data driver 236 may output a data signal to the organic light emitting diode panel 210 based on the second DC power V2 from the second interface 231.

The power supply 290 may supply various powers to the gate driver 234, the data driver 236, the timing controller 232, and the like.

The current detector 510 may detect a current flowing in the sub-pixel of the organic light emitting diode panel 210. The detected current may be input to processor 270, or the like, for an accumulated current calculation.

The processor 270 may perform various types of control over the display 180. For example, the processor 270 may control the gate driver 234, the data driver 236, the timing controller 232, and the like.

In addition, the processor 270 may receive information of the current flowing in the sub-pixels of the organic light emitting diode panel 210 from the current detector 510.

In addition, the processor 270 may calculate an accumulated current of each sub-pixel of the organic light emitting diode panel 210 based on information of currents flowing through the sub-pixels of the organic light emitting diode panel 210. The calculated accumulated current may be stored in the memory 240.

Further, with respect to aging, the processor 270 may determine whether the accumulated current of the respective sub-pixels of the organic light emitting diode panel 210 is equal to or greater than an allowable value.

For example, if the accumulated current of a sub-pixel of the OLED panel 210 is equal to or higher than 300,000 A, the processor 270 may determine that the corresponding sub-pixel is an aged sub-pixel.

Further, if the accumulated current of a sub-pixel of the OLED panel 210 is close to the allowable value, the processor 270 may determine that the corresponding sub-pixel is a sub-pixel expected to age.

Further, based on the current detected by current detector 510, processor 270 may determine that the subpixel having the largest accumulated current is the expected aging subpixel.
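A sketch of this bookkeeping (array shapes and the flagging logic are assumptions; only the 300,000 A threshold comes from the text):

    import numpy as np

    def update_aging(accumulated, frame_current, limit=300_000.0):
        """Accumulate per-subpixel current and flag aging.

        accumulated, frame_current: (rows, cols) arrays of running current
        sums and per-frame measurements; `limit` is the allowable
        accumulated current (300,000 A in the text). Returns the updated
        sums, a boolean mask of aged subpixels, and the index of the
        subpixel expected to age next (largest accumulated current)."""
        accumulated = accumulated + frame_current
        aged = accumulated >= limit
        expected_next = np.unravel_index(np.argmax(accumulated), accumulated.shape)
        return accumulated, aged, expected_next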

Fig. 6A and 6B are diagrams referred to in describing the organic light emitting diode panel of fig. 5.

First, fig. 6A is a diagram illustrating a pixel in the organic light emitting diode panel 210.

Referring to this figure, the organic light emitting diode panel 210 may include a plurality of scan lines Scan1 to Scann, and a plurality of data lines R1, G1, B1, W1 to Rm, Gm, Bm, Wm crossing the scan lines.

In addition, pixels (sub-pixels) are defined in the crossing regions of the scan lines and the data lines in the organic light emitting diode panel 210. In the figure, pixels including subpixels SR1, SG1, SB1, and SW1 of RGBW are shown.

Fig. 6B illustrates a circuit of any one sub-pixel in the pixel of the organic light emitting diode panel of fig. 6A.

Referring to the drawing, as an active type, the organic light emitting sub-pixel circuit (CRTm) may include a scan switching element SW1, a storage capacitor Cst, a driving switching element SW2, and an organic light emitting layer (OLED).

With the scan line connected to the gate terminal, the scan switching element SW1 is turned on according to the input scan signal Vdscan. When the scan switching element SW1 is turned on, the input data signal Vdata is transmitted to the gate terminal of the driving switching element SW2 or one end of the storage capacitor Cst.

The storage capacitor Cst is formed between the gate terminal and the source terminal of the driving switching element SW2, and stores a certain difference between a level of a data signal transmitted to one terminal of the storage capacitor Cst and a level of DC power (VDD) transmitted to the other terminal of the storage capacitor Cst.

For example, when the data signal has different levels according to a Pulse Amplitude Modulation (PAM) method, the power level stored in the storage capacitor Cst varies according to the level difference of the data signal Vdata.

For another example, when the data signal has different pulse widths according to a Pulse Width Modulation (PWM) method, the power level stored in the storage capacitor Cst varies according to the pulse width difference of the data signal Vdata.

When the driving switching element SW2 is turned on, a driving current (IOLED), proportional to the stored power level, flows in the organic light emitting layer (OLED). Accordingly, the organic light emitting layer OLED performs a light emitting operation.

The organic light emitting layer OLED may include a light emitting layer (EML) of RGBW corresponding to the sub-pixel, and may include at least one of a hole injection layer (HIL), a hole transport layer (HTL), an electron transport layer (ETL), and an electron injection layer (EIL).

In addition, all the sub-pixels emit white light in the organic light emitting layer OLED; however, for the green, red, and blue sub-pixels, separate color filters are provided to realize the colors.

Further, the figure illustrates p-type MOSFETs used for the scan switching element SW1 and the driving switching element SW2; n-type MOSFETs or other switching elements such as JFETs, IGBTs, or SiC devices may also be used.

Further, the pixel is a hold-type element that continuously emits light in the organic light emitting layer (OLED) during a unit display period, specifically during a unit frame, after the scan signal is applied.

Fig. 7 is an example of an internal block diagram of a signal processing apparatus according to an embodiment of the present disclosure, and fig. 8 to 9B are diagrams referred to in describing the signal processing apparatus shown in fig. 7.

First, referring to fig. 7, the image display system 10 according to an embodiment of the present disclosure may include an image display apparatus 100, a server 600, and a set-top box 300.

The server 600 may include: a learning DB 640 configured to receive training images and store them; a quality calculator 670 configured to calculate image source quality using the training images acquired from the learning DB 640 and a Deep Neural Network (DNN); and a parameter updating unit 675 configured to update the parameters of the DNN based on the learning DB 640 and the quality calculator 670.

The parameter updating unit 675 may transmit the updated parameters to the quality calculator 632 of the image display apparatus 100.

The set-top box 300 may receive an input signal from an image provider and transmit an image signal to the HDMI terminal of the image display apparatus 100.

The image display apparatus 100 may include: an image receiving unit 105 configured to receive an image signal via the external set-top box 300 or a network; a signal processing device 170 configured to perform signal processing on the image signal received by the image receiving unit 105; and a display 180 configured to display the image processed by the signal processing device 170.

Further, the image display apparatus 100 may apply optimal tuning to the quality of the input image.

In addition, the image display apparatus 100 may analyze the input image in real time to determine a raw resolution, a noise level, a compression level, and an enhancement level of the input image.

Further, the image display apparatus 100 can change the image quality setting based on the calculated image information data without causing a sense of discomfort or a sense of distortion.

Further, the signal processing device 170 may include: a quality calculator 632, the quality calculator 632 configured to calculate an original quality of an image signal received via the external set-top box 300 or a network; an image quality setting unit 634, the image quality setting unit 634 being configured to set a quality of the image signal; and an image quality processing unit 635, the image quality processing unit 635 configured to perform image quality processing on the image signal according to the set quality.

If the original quality of the input image signal changes at a first point in time, the image quality setting unit 634 sequentially changes the image quality setting from the first setting to the second setting, and the image quality processing unit 635 may perform image quality processing according to the sequential change of the first setting to the second setting. Therefore, it is possible to reduce flicker when the image quality changes due to a change in the original quality of the input image signal. In particular, when the original quality of the image signal changes, the quality may change smoothly rather than aggressively.

Further, if the original quality of the received image signal is modified at a first time point of reproducing the image, the image quality setting unit 634 may sequentially change the image quality setting from the first setting to the second setting. Accordingly, when the original quality of the received image signal is changed, the image quality setting can be changed in real time. In particular, when the original quality of an image signal changes, the image quality may change smoothly rather than aggressively.

Further, if the original quality of the received image signal changes at a first point in time due to a channel change or an input change while the image signal is received from the set-top box 300, the image quality setting unit 634 sequentially changes the image quality from the first setting to the second setting. Accordingly, it is possible to reduce flicker when the image quality is changed due to a change in the original quality of the received image signal. In particular, when the original quality of an image signal changes, the image quality may change smoothly rather than aggressively.
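
By way of illustration, the gradual setting change described above can be sketched as follows (Python; the parameter names and the linear interpolation are illustrative assumptions, not taken from this disclosure):

    from dataclasses import dataclass

    @dataclass
    class QualitySetting:
        noise_reduction: float   # 0.0 .. 1.0
        enhancement: float       # 0.0 .. 1.0
        sharpness: float         # 0.0 .. 1.0

    def transition(first: QualitySetting, second: QualitySetting, num_frames: int):
        """Yield one intermediate setting per frame, moving linearly from
        `first` to `second` so the change is not perceived as flicker."""
        for f in range(1, num_frames + 1):
            t = f / num_frames
            yield QualitySetting(
                noise_reduction=first.noise_reduction
                + t * (second.noise_reduction - first.noise_reduction),
                enhancement=first.enhancement + t * (second.enhancement - first.enhancement),
                sharpness=first.sharpness + t * (second.sharpness - first.sharpness),
            )

    # Example: fade from an HD-tuned setting to an SD-tuned setting over 30 frames.
    for setting in transition(QualitySetting(0.2, 0.8, 0.7),
                              QualitySetting(0.6, 0.4, 0.5), 30):
        pass  # apply `setting` to the image quality processing unit for this frame

Interpolating per frame in this way is one simple way to avoid the abrupt jump that would otherwise be perceived as flicker.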

The quality calculator 632 may classify the input image as a UHD (3840 × 2160 or higher), HD (1280 × 720), or SD (720 × 480 or lower) image.

The quality calculator 632 may calculate a probability for each resolution with respect to the input image, select the resolution with the highest probability as the final resolution, and exclude any resolution whose probability is too low.

In addition to the resolution, the quality calculator 632 may also estimate the noise level and the compression level.

Further, when calculating the compression level, the quality calculator 632 may determine the compression level based on training data obtained by reducing the compression bit rate with reference to the original state.

For example, for FHD, the quality calculator 632 may rate the current digital TV broadcasting standard as 1.0 and perform the calculation such that, as data is lost due to heavy compression, the value decreases toward 0.0.

In addition, the quality calculator 632 may calculate the noise level by measuring a flicker level in the input image.

For example, the quality calculator 632 may calculate the noise level in the input image as one of four levels, a high level, a medium level, a low level, and a noise-free level.

In addition, the quality calculator 632 may calculate the resolution and noise level of the received image signal using DNN. Therefore, the input image can be accurately analyzed.

Further, the quality calculator 632 may update parameters of the DNN from the server 600, and calculate the resolution and noise level of the received image signal based on the updated parameters. Therefore, the original quality of the image signal can be accurately calculated based on the learning.

Further, the quality calculator 632 may extract the first region and the second region from the image signal, and calculate the original resolution of the image signal based on the first region and calculate the noise level of the image signal based on the second region. Therefore, the original quality of the image signal can be accurately calculated based on the extraction of the region suitable for the quality calculation.

Further, the quality calculator 632 may extract a region where the edge component in the image signal is the largest as the first region, and extract a region where the edge component in the image signal is the smallest as the second region. Therefore, the original quality of the image signal can be accurately calculated based on the extraction of the region suitable for the quality calculation.
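
One plausible sketch of this region extraction (Python; the Sobel edge measure and the block size are assumptions for illustration):

    import numpy as np
    from scipy import ndimage

    def select_regions(image: np.ndarray, patch: int = 64):
        """Split a grayscale image into patch x patch blocks and return the
        block with the largest edge energy (first region, for resolution
        detection) and the block with the smallest edge energy (second
        region, for noise detection)."""
        gx = ndimage.sobel(image.astype(np.float32), axis=1)
        gy = ndimage.sobel(image.astype(np.float32), axis=0)
        edge = np.hypot(gx, gy)

        h, w = image.shape
        best, worst, best_e, worst_e = None, None, -np.inf, np.inf
        for y in range(0, h - patch + 1, patch):
            for x in range(0, w - patch + 1, patch):
                e = edge[y:y + patch, x:x + patch].sum()
                if e > best_e:
                    best_e, best = e, (y, x)
                if e < worst_e:
                    worst_e, worst = e, (y, x)
        first = image[best[0]:best[0] + patch, best[1]:best[1] + patch]
        second = image[worst[0]:worst[0] + patch, worst[1]:worst[1] + patch]
        return first, second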

Further, the image quality processing unit 635 may increase the noise reduction processing strength of the image signal as the calculated noise level increases. Accordingly, image quality processing suitable for the noise level of the received image signal can be performed.

Further, the quality calculator 632 may calculate the original resolution, the noise level, and the compression level of the received image signal, and calculate the compression level based on training data obtained by reducing the compression bit rate.

Further, the image quality processing unit 635 may decrease the enhancement intensity of the image signal as the calculated compression level increases. Accordingly, image quality processing suitable for the compression level of the received image signal can be performed.

Further, the image quality processing unit 635 may increase the enhancement intensity of the image signal as the calculated original resolution increases. Accordingly, image quality processing suitable for the original resolution of the received image signal can be performed.

Further, the image quality processing unit 635 may increase the blurring processing strength of the image signal as the calculated compression level increases. Accordingly, image quality processing suitable for the compression level of the received image signal can be performed.

Further, the image quality processing unit 635 may weaken the filtering applied to the image signal as the original resolution of the image signal increases. Accordingly, image quality processing suitable for the original resolution of the received image signal can be performed.

Further, the image quality processing unit 635 may reduce the image signal according to an original resolution of the image signal, perform image quality processing on the reduced image signal, amplify the image quality-processed image signal, and output the amplified image signal. Accordingly, image quality processing suitable for the original resolution of the received image signal can be performed.
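
A minimal sketch of this reduce-process-amplify flow (Python; the zoom factors and the Gaussian filter standing in for the actual quality processing are assumptions):

    import numpy as np
    from scipy import ndimage

    def process_at_native_resolution(frame: np.ndarray, original_res: int,
                                     display_res: int = 2160) -> np.ndarray:
        """Shrink an upscaled frame back toward its detected original
        resolution, run quality processing there, then amplify the result
        to the panel resolution, as described above."""
        scale_down = original_res / frame.shape[0]
        reduced = ndimage.zoom(frame, scale_down, order=1)       # bilinear shrink
        processed = ndimage.gaussian_filter(reduced, sigma=0.6)  # stand-in for NR
        scale_up = display_res / processed.shape[0]
        return ndimage.zoom(processed, scale_up, order=3)        # cubic enlarge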

Fig. 8 is an example of an internal block diagram of the signal processing apparatus 170 in fig. 7.

Further, the signal processing device 170 in fig. 8 may correspond to the signal processing device 170 in fig. 2.

First, referring to fig. 8, the signal processing apparatus 170 according to an embodiment of the present disclosure may include an image analyzer 610 and an image processing unit 635.

The image analyzer 610 may include a quality calculator 632 and an image quality setting unit 634 shown in fig. 7.

The image analyzer 610 may analyze an input image signal and output information related to the analyzed input image signal.

In addition, the image analyzer 610 may distinguish an object region and a background region of the first input image signal. Alternatively, the image analyzer 610 may calculate the probability or percentage of the object region and the background region of the first input image signal.

The input image signal may be an input image signal from the image receiving unit 105 or an image decoded by the image decoder 325 in fig. 3.

In particular, the image analyzer 610 may analyze an input image signal using artificial intelligence and output information about the analyzed input image signal.

Specifically, the image analyzer 610 may output the resolution, the gray level, the noise level, and the pattern of the input image signal, and output information (especially, image setting information) on the analyzed input image signal to the image quality processing unit 635.

The image quality processing unit 635 may include an HDR processing unit 705, a first reducing unit 710, an enhancing unit 750, and a second reducing unit 790.

The HDR processing unit 705 may receive an image signal and perform High Dynamic Range (HDR) processing on the input image signal.

For example, the HDR processing unit 705 may convert a Standard Dynamic Range (SDR) image signal into an HDR image signal.

For another example, the HDR processing unit 705 may receive an image signal and perform gray scale processing on an input image signal for HDR.

Further, if the input image signal is an SDR image signal, the HDR processing unit 705 may bypass the gradation conversion, and if the input image signal is an HDR image signal, the HDR processing unit 705 performs the gradation conversion. Therefore, high gray scale representation of the input image can be improved.

Further, the HDR processing unit 705 may convert the gray levels according to a first gray level conversion mode, in which low gray levels are emphasized and high gray levels are saturated, or a second gray level conversion mode, in which low gray levels and high gray levels are converted relatively uniformly.

Specifically, if the first gray scale conversion mode is implemented, the HDR processing unit 705 may convert the gray scale based on data corresponding to the first gray scale conversion mode in the lookup table.

More specifically, if the first gray-scale conversion mode is implemented, the HDR processing unit 705 may convert the gray-scale based on the equation of the input data and the first gray-scale conversion mode determined by the equation in the lookup table. Here, the input data may include video data and metadata.

Further, if the second gray-scale conversion mode is implemented, the HDR processing unit 705 may convert the gray-scale based on data corresponding to the second gray-scale conversion mode in the lookup table.

More specifically, if the second gray-scale conversion mode is implemented, the HDR processing unit 705 may convert the gray-scale based on the equation of the input data and the second gray-scale conversion mode determined by the equation in the lookup table. Here, the input data may include video data and metadata.
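
For illustration, the two conversion modes can be sketched as lookup tables (Python; the curve shapes are assumptions chosen only to match the described behavior, not the disclosure's actual tables):

    import numpy as np

    def make_lut(mode: str, size: int = 1024) -> np.ndarray:
        """Illustrative LUTs: the first mode emphasizes low gray levels and
        saturates high ones (gamma-like curve); the second converts low and
        high levels almost uniformly (near-linear curve)."""
        x = np.linspace(0.0, 1.0, size)
        if mode == "first":
            return x ** 0.45   # boost low grays, saturate the top
        return x ** 0.9        # nearly uniform conversion across the range

    def convert_gray(levels: np.ndarray, lut: np.ndarray) -> np.ndarray:
        """Apply a LUT to normalized gray levels, as the HDR processing unit
        would for the selected conversion mode."""
        idx = np.clip((levels * (len(lut) - 1)).astype(int), 0, len(lut) - 1)
        return lut[idx]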

Further, the HDR processing unit 705 may select the first gray scale conversion mode or the second gray scale conversion mode according to the third gray scale conversion mode or the fourth gray scale conversion mode in the high gray scale amplification unit 851 in the second reducing unit 790.

For example, if the third gray level conversion mode is implemented, the high gray level amplification unit 851 of the second reducing unit 790 may convert gray levels based on data corresponding to the third gray level conversion mode in the lookup table.

Specifically, if the third gray scale conversion mode is implemented, the high gray scale amplification unit 851 of the second reduction unit 790 may perform conversion of gray scales based on the equation of the input data and data corresponding to the third gray scale conversion mode determined by the equation in the lookup table. Here, the input data may include video data and metadata.

Further, if the fourth gray level conversion mode is implemented, the high gray level amplification unit 851 of the second reducing unit 790 may convert the gray levels based on the data corresponding to the fourth gray level conversion mode in the lookup table.

Specifically, if the fourth gray scale conversion mode is implemented, the high gray scale amplification unit 851 of the second reduction unit 790 may perform conversion of gray scales based on the equation of the input data and data corresponding to the fourth gray scale conversion mode determined by the equation in the lookup table. Here, the input data may include video data and metadata.

For example, if the fourth gray scale conversion mode is implemented in the high gray scale amplification unit 851 of the second reduction unit 790, the HDR processing unit 705 may implement the second gray scale conversion mode.

As another example, if the third gray scale conversion mode is implemented in the high gray scale amplification unit 851 of the second reduction unit 790, the HDR processing unit 705 may implement the first gray scale conversion mode.

Alternatively, the high gray scale amplification unit 851 of the second reduction unit 790 may change the gray scale conversion mode according to the gray scale conversion mode of the HDR processing unit 705.

For example, if the second gray scale conversion mode is implemented in the HDR processing unit 705, the high gray scale amplification unit 851 in the second reduction unit 790 may perform a fourth gray scale conversion mode.

For another example, if the first gray scale conversion mode is implemented in the HDR processing unit 705, the high gray scale amplification unit 851 of the second reduction unit 790 may implement the third gray scale conversion mode.

Further, the HDR processing unit 705 according to the embodiment of the present disclosure may implement a gray level conversion mode such that a low gray level and a high gray level are uniformly converted.

Further, according to the second gray-scale conversion mode in the HDR processing unit 705, the second reducing unit 790 may implement the fourth gray-scale conversion mode and thereby amplify the upper limit of the gray-scale level of the received input signal. Therefore, high gray scale representation of the input image can be improved.

Next, the first reduction unit 710 may perform noise reduction on the input image signal or the image signal processed by the HDR processing unit 705.

Specifically, the first reduction unit 710 may perform a plurality of stages of noise reduction processing and a first stage of gray scale extension processing on the input image signal or the HDR image from the HDR processing unit 705.

To this end, the first reducing unit 710 may include a plurality of noise reducing parts 715 and 720 for reducing noise in a plurality of stages and a first gray level expanding part 725 for expanding gray levels.

Next, the enhancing unit 750 may perform a plurality of stages of image resolution enhancing processes on the image from the first reducing unit 710.

In addition, the enhancing unit 750 may perform object three-dimensional effect enhancing processing. In addition, the enhancing unit 750 may perform color or contrast enhancing processing.

To this end, the enhancing unit 750 may include: a plurality of resolution enhancement units 735, 738, 742 for enhancing the resolution of the image in a plurality of stages; and an object three-dimensional effect enhancing unit 745, the object three-dimensional effect enhancing unit 745 for enhancing the three-dimensional effect of the object; and a color contrast enhancing unit 749, the color contrast enhancing unit 749 for enhancing color or contrast.

Next, the second reducing unit 790 may perform a second stage of gray level extension processing based on the noise-reduced image signal received from the first reducing unit 710.

Further, the second reducing unit 790 may amplify an upper limit of a gray level of the input signal and expand a resolution of a high gray level of the input signal. Therefore, high gray scale representation of the input image can be improved.

For example, the gray scale extension may be uniformly performed for the entire gray scale range of the input signal. Accordingly, gray scale expansion is uniformly performed over the entire area of the input image, thereby improving high gray scale representation.

Further, the second reducing unit 790 may perform gray scale amplification and expansion based on the signal received from the first gray scale expanding section 725. Therefore, high gray scale representation of the input image can be improved.

Also, if the input image signal is an SDR image signal, the second reducing unit 790 may change the degree of amplification based on the user input signal. Thus, high gray scale rendering can be improved in response to user settings.

Further, if the input image signal is an HDR image signal, the second reducing unit 790 may perform amplification according to the set value. Therefore, high gray scale representation of the input image can be improved.

Further, if the input image signal is an HDR image signal, the second reducing unit 790 may change the degree of amplification based on the user input signal. Thus, high gray level rendering can be improved according to user settings.

Further, in the case of expanding the gray level based on the user input signal, the second reducing unit 790 may change the degree of expansion of the gray level. Thus, high gray level rendering can be improved according to user settings.

Further, the second reducing unit 790 may enlarge the upper limit of the gray level according to the gray level conversion mode in the HDR processing unit 705. Therefore, high gray scale representation of the input image can be improved.

The signal processing device 170 includes: an HDR processing unit 705, the HDR processing unit 705 configured to receive an image signal and adjust luminance of an input image signal; and a reduction unit 790 configured to amplify the luminance of the image signal received from the HDR processing unit 705 and increase the grayscale resolution of the image signal, thereby generating an enhanced image signal. The enhanced image signal provides increased brightness and increased grayscale resolution of the image signal while maintaining a high dynamic range of the displayed HDR image.

Further, the luminance range of the image signal is adjusted by the control signal received by the signal processing device 170.

Further, the signal processing apparatus 170 further includes an image analyzer configured to determine whether the input image signal is an HDR signal or an SDR signal, and generate a control signal to be supplied to the HDR processor 705. The luminance range of the input image signal is adjusted by the control signal only when the input image signal is the HDR signal.

Further, a control signal related to the signal processing is received from the controller of the image display apparatus, and the control signal corresponds to the setting of the image display apparatus.

Further, the resolution of the gray scale is increased based on the amplification of the adjusted luminance of the image signal.

Furthermore, the resolution of the grey scale is increased based on the control signal received by the signal processing means 170.

Further, a control signal related to the signal processing is received from the controller of the image display apparatus, and the control signal corresponds to the setting of the image display apparatus.

Further, the reducing unit 790 may include: a high gray level amplification unit 851, the high gray level amplification unit 851 being configured to amplify an upper limit of a gray level of an input signal; and de-contouring units 842 and 844, the de-contouring units 842 and 844 being configured to expand the resolution of the gray scale amplified by the high gray scale amplification unit 851.

The second reducing unit 790 may include a second gray level expanding part 729 for second gray level expansion.

Further, as shown in fig. 8, the image quality processing unit 635 in the signal processing apparatus 170 according to the present disclosure is characterized by performing four stages of reduction processing and four stages of image enhancement processing.

Here, the four-stage reduction processing may include two-stage noise reduction processing and two-stage gray-level extension processing.

Here, the two stages of noise reduction processing may be performed by the first and second noise reducers 715 and 720 in the first reduction unit 710, and the two stages of gray level extension processing may be performed by the first and second gray level extenders 725 and 729 in the first and second reduction units 710 and 790.

Further, these four stages of image enhancement processing may include three stages of image resolution enhancement (bit resolution enhancement) and object three-dimensional effect enhancement.

Here, the three stages of image enhancement processing may be performed by the first, second, and third resolution enhancement units 735, 738, and 742, and the object three-dimensional effect enhancement may be performed by the object three-dimensional enhancement unit 745.

Further, the signal processing device 170 of the present disclosure may apply the same algorithm or a similar algorithm to the image quality processing a plurality of times, thereby enabling to gradually enhance the image quality.

To this end, the image quality processing unit 635 of the signal processing apparatus 170 of the present disclosure may perform image quality processing by applying the same algorithm or a similar algorithm two or more times.

Further, the same algorithm or similar algorithms implemented by the image quality processing unit 635 have different implementation objectives at each stage. In addition, since the image quality processing is performed gradually in a plurality of stages, there is an effect of causing less artifacts to appear in the image, resulting in a more natural and more vivid image processing result.

Further, the same algorithm or similar algorithms are alternately applied a plurality of times with different image quality algorithms, thereby bringing about a stronger effect than a simple continuous process.

Further, the signal processing apparatus 170 of the present disclosure may perform noise reduction processing in a plurality of stages. The noise reduction processing at each stage may include temporal processing and spatial processing.

Further, in order to calculate the original quality of the image signal, the present disclosure uses an up-to-date technique such as Artificial Intelligence (AI). For this, a Deep Neural Network (DNN) may be used.

The quality calculator 632 may use DNN to calculate the resolution and noise level of the input image signal.

The quality calculator 632 may obtain training images for each original resolution and compression rate, and train the network in order to improve the accuracy of the calculation.

Various images that are commonly seen in general broadcast programs are provided as images for training, and thus any input environment can be covered.

Further, in order to reduce the detection time or cost, the quality calculator 632 may perform learning using a convolutional neural network having a small number of layers, Mobile-Net, or the like.

For example, the quality calculator 632 may analyze only partial regions of the entire image (e.g., 224 × 224, 128 × 128, or 64 × 64 regions).

Further, the quality calculator 632 may select a detection region suitable for the detection purpose.

For example, the quality calculator 632 may select a first region having the largest number of edge components when detecting the original resolution, and select a second region having the smallest number of edge components when detecting noise.

In particular, the quality calculator 632 may apply an algorithm of selecting a detection region in a short time in order to increase the processing speed.

For example, the quality calculator 632 may perform preprocessing such as Fast Fourier Transform (FFT) on the detection region.

Fig. 9A is a diagram showing a calculation performed based on a Convolutional Neural Network (CNN).

Referring to this figure, a convolutional neural network is used for a specific region 1015 in the acquired image 1010.

A convolution network and a deconvolution network may be used as the convolutional neural network.

Convolution and pooling are performed repeatedly in the convolutional neural network.

Further, according to the CNN scheme shown in fig. 9A, information on the region 1015 may be used to determine the type of pixels in the region 1015.
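
A hedged sketch of such a region classifier (Python with PyTorch assumed; the layer sizes and class set are illustrative, not the network of this disclosure):

    import torch
    import torch.nn as nn

    class PatchQualityNet(nn.Module):
        """Small CNN in the spirit of fig. 9A: repeated convolution and
        pooling over a 64 x 64 luminance patch, then a classifier head."""
        def __init__(self, num_classes: int = 3):   # e.g. SD / HD / UHD
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.head = nn.Linear(64 * 8 * 8, num_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            z = self.features(x)             # (N, 64, 8, 8) for 64 x 64 input
            return self.head(z.flatten(1))   # resolution-class logits

    # Probability per resolution class for one patch, as described above.
    probs = torch.softmax(PatchQualityNet()(torch.randn(1, 1, 64, 64)), dim=1)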

Fig. 9B is a diagram showing a calculation performed based on Mobile-Net.

According to the scheme shown in the figure, a quality calculation is performed.

Further, as the original quality is changed, the signal processing apparatus 170 of the present disclosure may apply an image quality setting corresponding to the changed quality in real time.

In particular, while reproducing an image, the signal processing apparatus 170 may control the image quality setting to be changed even without a condition such as a channel change or an input change.

In this context, "real-time" refers to the use of temporal processing techniques including Infrared Imaging (IIR) and step-and-step movement.

Further, in order to upmix a downmix stereo audio signal into a multi-channel audio signal, an Independent Component Analysis (ICA) method, a Principal Component Analysis (PCA) method in which the analysis is performed using principal component and ambient component signals, or a Non-negative Matrix Factorization (NMF) method based on unsupervised learning may be used.

Further, according to the Principal Component Analysis (PCA) method, since the signals separated into principal components and ambient components differ from the original multichannel signal, it is impossible to match the principal components and the ambient components to the original multichannel signal.

For example, in a multi-channel playback apparatus, if the main components are placed in the front channel and the ambient components are evenly distributed over all channels, or if they are rendered in the rear or upstream channels differently from the actual intention of the content creator, distorted spatial sound characteristics occur such that the audio objects appear only in the front.

Accordingly, the present disclosure proposes an upmix method capable of synthesizing a multi-channel signal using a principal component analysis method (PCA) and a Deep Neural Network (DNN) model. In particular, when upmixing a downmix stereo audio signal into a multi-channel audio signal, a method of improving spatial distortion is proposed. This will be described below with reference to fig. 10.

Fig. 10 is an example of an internal block diagram of a signal processing apparatus according to an embodiment of the present disclosure.

Referring to this figure, in order to upmix a stereo audio signal into a multi-channel audio signal, the signal processing apparatus 170 may include a converter 1010, a filter bank 1020, a principal component analyzer 1030, a feature extractor 1040, a masking unit 1055, an envelope adjuster 1060, and an inverse converter 1070.

The converter 1010 may convert a frequency of an input stereo audio signal. For example, the converter 1010 may perform a short-time fourier transform (STFT) on the input stereo audio signal.
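
A minimal sketch of this frequency conversion (Python; the sample rate and frame length are assumptions):

    import numpy as np
    from scipy.signal import stft

    fs = 48_000                          # sample rate (assumed)
    stereo = np.random.randn(2, fs)      # placeholder L/R signal, 1 second

    # Short-time Fourier transform of each channel: index i below corresponds
    # to the time frame and k to the frequency band, matching formula 1.
    f, t, X = stft(stereo, fs=fs, nperseg=1024)   # X has shape (2, freq, time)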

Next, the filter bank 1020 may filter the frequency-converted stereo audio signal by using a plurality of band pass filters.

For example, the filter bank 1020 may be implemented as a filter bank based on auditory characteristics, such as a critical-band, octave-band, or gammatone Equivalent Rectangular Bandwidth (ERB) filter bank, and performs the corresponding filtering.

In addition, filter bank 1020 may perform a Quadrature Mirror Filter (QMF) transform.
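
As one stand-in for the auditory filter banks named above, an octave-band Butterworth filter bank can be sketched as follows (Python; the band edges and filter order are assumptions):

    import numpy as np
    from scipy.signal import butter, sosfilt

    def octave_band_filter_bank(x: np.ndarray, fs: int, f_low: float = 62.5,
                                n_bands: int = 8) -> np.ndarray:
        """Split a signal into octave-spaced bands with Butterworth
        band-pass filters, a simple substitute for critical-band, octave-band,
        or gammatone ERB filter banks."""
        bands = []
        lo = f_low
        for _ in range(n_bands):
            hi = min(lo * 2, fs / 2 * 0.99)
            sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
            bands.append(sosfilt(sos, x))
            lo = hi
        return np.stack(bands)   # shape (n_bands, n_samples)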

Furthermore, the two-channel signal of the stereo audio signal is analyzed in time and frequency band by the filter bank 1020 and separated into a main component signal transmitting main information such as speech and audio objects and an ambient component signal presenting reverberation and spatial impression.

Therefore, when the analysis is performed by a Deep Neural Network (DNN), parameters required for the upmixing can be simplified and computational complexity can be reduced.

In addition, the sound source separation method based on Principal Component Analysis (PCA) can be expressed as formula 1.

[ formula 1]

x1[i,k]=s1[i,k]+n1[i,k],

x2[i,k]=s2[i,k]+n2[i,k],

s2=as1

Here, s1[i,k] and s2[i,k] denote the principal component signals at index i in the time domain and frequency band k in the frequency domain, n1[i,k] and n2[i,k] denote the ambient component signals at index i in the time domain and frequency band k in the frequency domain, and a denotes the panning gain.

The main component signal may represent a component having a high correlation between two channels of the stereo signal and having only an amplitude difference, and the ambient component signal may represent a component having a low correlation between two channels such as sound reflected by various paths or reverberated sound.
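
One common closed-form realization of this model (not necessarily the exact procedure of this disclosure) estimates the panning gain from the channel powers and the cross-correlation, and extracts the principal component by least squares (Python):

    import numpy as np

    def pca_separate(x1: np.ndarray, x2: np.ndarray, eps: float = 1e-12):
        """Primary/ambient split for the model of formula 1
        (x1 = s1 + n1, x2 = a*s1 + n2), under the usual assumptions that the
        ambience is uncorrelated with the principal component and across
        channels, with equal ambient power. In practice this is applied per
        frequency band; here it is shown on whole STFT arrays for brevity."""
        p1 = np.mean(x1 * np.conj(x1)).real          # power of channel 1
        p2 = np.mean(x2 * np.conj(x2)).real          # power of channel 2
        r = np.mean(x1 * np.conj(x2)).real           # cross-correlation
        d = (p2 - p1) / (abs(r) + eps)
        a = 0.5 * (d + np.sqrt(d * d + 4.0))         # panning gain estimate
        s = (x1 + a * x2) / (1.0 + a * a)            # least-squares principal part
        return s, x1 - s, x2 - a * s, a              # s, n1, n2, panning gain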

In the principal component analysis method, since specific sources such as direct sounds, voices, and musical instruments are separated into principal components, intelligibility can be efficiently improved by panning for the front channel.

In addition, the principal component analysis approach can be used to maximize the spatial impression, since the background sound is separated into ambient components and rendered consistently across all channels.

However, when the principal component is panned to one side channel or to a position where it is not present, performance may be degraded because the inter-channel correlation is small.

In addition, the principal component signals of all channels (such as the front channel, the center channel, the woofer channel, the rear channel, and the upstream channel) are mixed in the estimated principal component signal, and the ambient components of all the original channels are likewise mixed in the estimated ambient component signal.

Therefore, a Principal Component Analysis (PCA) method may have difficulty in accurately rendering components of corresponding original channels to each speaker of a multi-channel playback device. In addition, in the case of a stereo playback apparatus, since incorrect multi-channel upmixing has a limitation in improving a sound field by virtualization adapted to each channel, spatial distortion may occur differently from the intention of a content creator.

Furthermore, since the ICA- and NMF-based methods also decompose the signal into independent and basis components, it is difficult to match the result to multiple channels such as the actual front, center, woofer, rear, and upstream channels.

Recently, with the development of neural network technology that can improve performance without falling into local minima even for models with many layers, neural networks are being extended to various fields beyond speech recognition, such as classification, recognition, detection, and retrieval of speech and audio signals.

Accordingly, in the present disclosure, Artificial Intelligence (AI) image quality processing using a Deep Neural Network (DNN) model, Artificial Intelligence (AI) sound quality processing, and the like are performed. In particular, the present disclosure suggests an upmix scheme using a Deep Neural Network (DNN) model.

For this, it is necessary to train a Deep Neural Network (DNN) model on downmix audio signals and multi-channel audio signals. This will be described with reference to fig. 11.

Further, the principal component analyzer 1030 may perform principal component analysis based on the signal from the converter 1010.

In particular, the principal component analyzer 1030 may perform principal component analysis based on signals from the filter bank 1020.

The feature extractor 1040 may extract features of the principal component signals based on the signals from the principal component analyzer 1030.

The model learning predictor 1050 may perform deep neural network model-based prediction based on the features from the feature extractor 1040.

The masking unit 1055 may perform masking on the prediction result from the model learning predictor 1050.

When each of the plurality of channels is independent of time and frequency, the masking unit 1055 may perform masking by using the time-frequency component to perform channel separation based on the prediction result from the model learning predictor 1050.

The envelope adjuster 1060 may perform envelope adjustment based on prediction performed based on a deep neural network model.

The envelope adjuster 1060 may separate channels by correcting the envelope of the signal in the frequency band according to a weight function for the frequency band based on a prediction result from the model learning predictor 1050.

The envelope adjuster 1060 may adjust the size of each frequency band to follow the envelope in the target channel.

The inverse transformer 1070 may inversely transform the signal from the envelope adjuster 1060 to output a multi-channel upmixed audio signal.

Fig. 11 is another example of an internal block diagram of a signal processing apparatus according to an embodiment of the present disclosure.

Referring to the figure, in order to upmix the stereo audio signal into the multi-channel audio signal, the signal processing apparatus 170 may further include a second converter 1015, a second filter bank 1025, a second principal component analyzer 1035, and a second feature extractor 1045.

Furthermore, the database 1005 may contain a downmix stereo audio signal and a multi-channel audio signal.

At this time, the database 1005 may be provided in the server 600 or the like.

Furthermore, the second converter 1015 may convert the frequency of the downmix stereo audio signal or the multi-channel audio signal received from the database 1005.

For example, the second converter 1015 may perform a short-time fourier transform (STFT) on the input downmix stereo audio signal or the multi-channel audio signal.

Next, the second filter bank 1025 may filter the frequency-converted stereo audio signal by using a plurality of band pass filters.

For example, the second filter bank 1025 may be implemented as a filter bank based on auditory characteristics, such as a critical-band, octave-band, or gammatone Equivalent Rectangular Bandwidth (ERB) filter bank, and may perform the corresponding filtering.

In addition, the second filter bank 1025 may perform Quadrature Mirror Filter (QMF) conversion.

Furthermore, the two-channel signal of the stereo audio signal or the multi-channel audio signal is analyzed in time and frequency bands by the second filter bank 1025, and may be separated into a main component signal carrying main information such as speech and audio objects and an ambient component signal presenting reverberation and spatial impression.

The second principal component analyzer 1035 may separate the principal component signal and the ambient component signal with respect to a two-channel signal or a multi-channel audio signal of the stereo audio signal, and perform a correlation operation between channels of the principal component signal, a panning gain operation of the principal component signal, a power operation of the principal component signal, and the like.

Further, the second principal component analyzer 1035 may perform a correlation operation between channels of the ambient component signal, a panning gain operation of the ambient component signal, a power operation of the ambient component signal, and the like.

The second feature extractor 1045 may extract features such as a panning gain of the main component signal, a power of the main component signal, a panning gain of the ambient component signal, a power of the ambient component signal, and the like.

Further, the features extracted by the second feature extractor 1045, etc. may be input to the model learning predictor 1050.

Further, the model learning predictor 1050 may perform learning based on the deep neural network model based on the features extracted by the second feature extractor 1045.

In particular, the model learning predictor 1050 may perform learning based on a deep neural network model based on features such as the panning gain of the principal component signal and the power of the principal component signal extracted by the second feature extractor 1045.
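
A hedged sketch of this learning step (Python with PyTorch assumed; the feature layout, network size, and the choice of per-band weights as the training target are assumptions for illustration):

    import torch
    import torch.nn as nn

    # Per-band downmix features (panning gain, principal power, ambient power)
    # in; per-band weights for each upmix channel out. Dimensions are assumed.
    N_BANDS, N_CHANNELS = 24, 8                    # e.g. a 5.1.2 layout
    model = nn.Sequential(
        nn.Linear(N_BANDS * 3, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, N_BANDS * N_CHANNELS), nn.Sigmoid(),
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    def train_step(features: torch.Tensor, target_weights: torch.Tensor) -> float:
        """One gradient step on a pair (features from the second feature
        extractor, targets derived from the original multi-channel signal)."""
        opt.zero_grad()
        loss = loss_fn(model(features), target_weights)
        loss.backward()
        opt.step()
        return loss.item()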

Further, the model learning predictor 1050 may be provided in the signal processing apparatus 170, but may also be provided in the server 600.

Further, the signal processing device 170 shown in fig. 10 may convert a stereo audio signal into a multi-channel audio signal.

To this end, as described above, the signal processing apparatus 170 illustrated in fig. 10 may include a converter 1010, a filter bank 1020, a principal component analyzer 1030, a feature extractor 1040, a masking unit 1055, an envelope adjuster 1060, and an inverse converter 1070.

The converter 1010 may frequency-convert the input downmix stereo audio signal.

Next, the filter bank 1020 may filter the frequency-converted stereo audio signal by using a plurality of band pass filters.

Next, the main component analyzer 1030 separates the main component signal and the ambient component signal with respect to the two-channel signal of the stereo audio signal, and performs a correlation operation between channels of the main component signal, a panning gain operation of the main component signal, a power operation of the main component signal, and the like.

Further, the main component analyzer 1030 may perform a correlation operation between channels of the ambient component signal, a panning gain operation of the ambient component signal, a power operation of the ambient component signal, and the like.

Further, the feature extractor 1040 may extract features such as a panning gain of the main component signal, a power of the main component signal, a panning gain of the ambient component signal, a power of the ambient component signal, and the like.

Further, the features extracted by the feature extractor 1040 and the like may be input to the model learning predictor 1050.

Further, the model learning predictor 1050 may perform prediction based on the deep neural network model based on the features extracted by the feature extractor 1040.

In particular, the model learning predictor 1050 may perform prediction based on a deep neural network model based on features such as the panning gain of the principal component signal and the power of the principal component signal extracted by the feature extractor 1040.

Further, the model learning predictor 1050 may be provided in the signal processing apparatus 170, but may also be provided in the server 600.

Further, if the downmix input signal is ideally separated by principal component analysis, it can be expressed as equations 2 and 3.

[ formula 2]

sd[i,k]=sf[i,k]+sc[i,k]+sw[i,k]+sr[i,k]+sh[i,k]

[ formula 3]

nd[i,k]=nf[i,k]+nr[i,k]+nh[i,k]

Here, sd, sf, sc, sw, sr, and sh denote the principal component of the downmix channel and the principal components of the front, center, woofer, rear, and upstream channels of the original multichannel signal, respectively, and nd, nf, nr, and nh denote the ambient component of the downmix channel and the ambient components of the front, rear, and upstream channels of the original multichannel signal, respectively.

In the principal component of equation 2, the principal components of the front channel, the center channel, the woofer channel, the rear channel, and the upstream channel are mixed. In particular, the principal components are dominant in the center and woofer signals, because the downmix method gives them a high inter-channel correlation.

In equations 2 and 3, the multichannel signal is assumed to be a layout of 5.1.2 channels, but multichannel signals of other layouts may be similarly presented.

The masking unit 1055 may perform masking on the result predicted by the model learning predictor 1050.

When each channel in the multi-channel signal is statistically independent of time and frequency, the masking unit 1055 may perform masking using the time-frequency components to perform per-channel separation, as shown in equation 4.

[ formula 4]

sx[i,k]=Msx[i,k]sd[i,k],

nx[i,k]=Mnx[i,k]nd[i,k]

Here, Msx and Mnx represent the mask functions for separating the principal component and the ambient component of the upmix channel, respectively.

The mask function is determined by finding, for each frequency band, which of the original multiple channels the principal and ambient components of the input signal belong to.

In this matching, the inclusion relationship can easily be determined from the band powers of the principal and ambient components of the multiple channels, which are predicted by the Deep Neural Network (DNN) model.

Thus, the mask function can be regarded as a function that separates the principal component sd and the ambient component nd of the input downmix into the principal component sx and the ambient component nx of the upmix channel.
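
A minimal sketch of this time-frequency masking (Python; the array shapes are assumptions, and the Gaussian smoothing anticipates the refinement discussed below):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def apply_masks(s_d: np.ndarray, n_d: np.ndarray,
                    m_s: np.ndarray, m_n: np.ndarray):
        """Route the downmix principal component s_d and ambient component
        n_d to upmix channel x by multiplying with that channel's mask
        functions (formula 4). All arrays share shape (n_bands, n_frames)."""
        return m_s * s_d, m_n * n_d

    def soften_mask(binary_mask: np.ndarray, sigma: float = 1.0) -> np.ndarray:
        """Smooth a binary time-frequency mask with a Gaussian filter, one
        way to reduce the audible distortion of a rectangular binary mask."""
        return gaussian_filter(binary_mask.astype(float), sigma=sigma)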

The separated principal and ambient components of the target channel are mixed as shown in equation 5 to obtain the final upmixed signal.

[ formula 5]

yx[i,k]=sx[i,k]+nx[i,k]

Here, yx[i,k] denotes the final upmixed signal of the target channel.

When a rectangular window binary mask is used as described above, it has good performance in objective evaluation, but has a disadvantage of severe distortion in subjective sound quality.

Thus, a Gaussian filter may be used to reduce the unnatural-sounding distortion.

However, since actual multi-channel signals are not mutually exclusive in their time-frequency components, they cannot be fully handled by a simple mask.

As a result, the envelope adjuster 1060 can adjust the envelope of the signal output through the mask.

That is, the envelope adjuster 1060 may separate channels by using a weighting function as shown in equation 6. Therefore, a more natural output can be obtained.

[ formula 6]

sx[i,k]=Wsx[i,k]sd[i,k],

nx[i,k]=Wnx[i,k]nd[i,k]

Here, Wsx and Wnx represent the principal component weight function and the ambient component weight function of the target channel, respectively.

The weight function is the function that minimizes the error between the principal and ambient component signals of the downmix signal and those of the target upmix channel.

Thus, the weight function for a frequency band can be considered to correct the envelope of the signal in the frequency band.

That is, the envelope adjuster 1060 may adjust the envelope by calculating weights for the envelopes of the principal and ambient components of the multiple channels.

As shown in equation 7, the final upmix signal to which the weight function is applied is generated.

[ formula 7]

yx[i,k]=Wsx[i,k]sd[i,k]+Wnx[i,k]nd[i,k]

To adjust the envelope, the upmix parameters may be predicted by the Deep Neural Network (DNN) model in the model learning predictor 1050.

Therefore, the multi-channel upmix performance depends on the learning performance of the Deep Neural Network (DNN) in the model learning predictor 1050.

In addition, the envelope adjuster 1060 may adjust the size of each frequency band to follow the envelope of the target channel. In this case, since only the size information of the principal component and the environmental component band needs to be estimated as the feature vector, there is an advantage in terms of implementation.
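
A minimal sketch of this band-wise envelope adjustment (Python; the square-root power ratio is an assumed weight rule, since the disclosure does not spell one out):

    import numpy as np

    def envelope_weights(band_power_est: np.ndarray,
                         band_power_target: np.ndarray,
                         eps: float = 1e-12) -> np.ndarray:
        """Per-band weight that scales the estimated component so its band
        envelope follows the target channel's envelope predicted by the DNN."""
        return np.sqrt(band_power_target / (band_power_est + eps))

    def adjust_envelope(component: np.ndarray, weights: np.ndarray) -> np.ndarray:
        """Apply one weight per frequency band (rows) across all time frames
        (columns), i.e. correct the signal envelope in each band."""
        return component * weights[:, None]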

In addition, since a weight function optimized according to a Deep Neural Network (DNN) model can be applied to the principal component and the environmental component estimated by the conventional principal component analysis method, there is an advantage that real-time tuning can be performed.

Further, the inverse transformer 1070 may inversely transform the signal output from the envelope adjuster 1060 to output the multi-channel up-mixed audio signal.

Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved.

Fig. 12 is a diagram referred to in describing fig. 10 to 11.

Referring to the drawing, a stereo input signal (left signal, right signal) may be separated into a principal component signal (main) and an ambient component signal (ambient) by a principal component analyzer 1030.

Further, the multichannel signal may be upmixed via the envelope adjuster 1060 or the like.

Furthermore, the front channel, the center channel, and the woofer channel of the multi-channel audio signal are uncorrelated with one another in the frequency domain. In addition, since the left and right channels carry the same signal with high correlation, these channels can be decomposed into principal components.

Further, since the correlation between the left channel and the right channel is low for the rear channel and the upstream channel, these channels can be decomposed into ambient components.

Further, the weight functions predicted by the deep neural network model are applied by the envelope adjuster 1060 to the decomposed principal and ambient components to generate the upmix channels.

Fig. 13 is a flowchart illustrating an operation of a signal processing apparatus according to an embodiment of the present disclosure.

Referring to this figure, the signal processing apparatus 170 may receive an input stereo signal (S1405). In particular, a downmix stereo signal can be received.

Next, the converter 1010 in the signal processing apparatus 170 may convert the frequency of the input stereo audio signal (S1410).

As described above, the input stereo audio signal may be converted by using a Short Time Fourier Transform (STFT).

Next, the filter bank 1020 may filter the frequency-converted stereo audio signal by using a plurality of band pass filters (S1415).

Next, the principal component analyzer 1030 may perform principal component analysis based on the signal from the converter 1010 (S1420).

In particular, the principal component analyzer 1030 may perform principal component analysis based on signals from the filter bank 1020.

Next, the feature extractor 1040 may extract features of the principal component signals based on the signals from the principal component analyzer 1030 (S1425).

Next, the model learning predictor 1050 may perform prediction based on the deep neural network model based on the features from the feature extractor 1040 (S1430).

Next, the masking unit 1055 may perform masking on the prediction result from the model learning predictor 1050 (S1435).

Further, when each of the plurality of channels is independent of time and frequency, the masking unit 1055 performs masking with a time-frequency component based on a prediction result from the model learning predictor 1050 to perform channel separation.

Next, the envelope adjuster 1060 may perform envelope adjustment based on the prediction performed based on the deep neural network model (S1440).

Further, based on the prediction result from the model learning predictor 1050, the envelope adjuster 1060 can separate channels by correcting the envelope of the signal in the frequency band according to the weight function for the frequency band.

Further, the envelope adjuster 1060 may adjust the size of each frequency band to follow the envelope in the target channel.

Next, the inverse transformer 1070 may inversely transform the signal from the envelope adjuster 1060 to output a multi-channel upmixed audio signal.

Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved. In particular, a multi-channel signal can be easily synthesized using a principal component analysis method and a deep neural network model.
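
Tying the sketches above together, the flow of fig. 13 might look as follows (Python; `predictor` stands in for the model learning predictor 1050, `pca_separate` is the formula 1 sketch given earlier, and the feature layout is an assumption):

    import numpy as np
    from scipy.signal import stft, istft

    def upmix_stereo(stereo: np.ndarray, fs: int, predictor) -> np.ndarray:
        """End-to-end sketch of fig. 13: STFT -> principal component
        analysis -> feature extraction -> DNN prediction -> masking and
        envelope weighting -> inverse STFT. `predictor` is assumed to
        return, per upmix channel, one principal and one ambient weight
        array shaped like the STFT."""
        _, _, X = stft(stereo, fs=fs, nperseg=1024)     # S1410: frequency conversion
        s, n1, _, a = pca_separate(X[0], X[1])          # S1420: PCA separation
        feats = np.stack([np.abs(s) ** 2,               # S1425: principal power
                          np.abs(n1) ** 2,              # ambient power
                          np.full(s.shape, a)])         # panning gain
        channels = []
        for w_s, w_n in predictor(feats):               # S1430: DNN prediction
            y = w_s * s + w_n * n1                      # S1435/S1440: formula 7
            _, ch = istft(y, fs=fs, nperseg=1024)       # inverse transform
            channels.append(ch)
        return np.stack(channels)                       # multi-channel output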

Fig. 14 is a flowchart illustrating an operation of a signal processing apparatus according to an embodiment of the present disclosure.

Referring to this figure, the signal processing device 170 may receive a downmix stereo audio signal or a multi-channel audio signal from the database 1005.

Next, the second converter 1015 in the signal processing apparatus 170 may convert the frequency of the downmix stereo audio signal or the multi-channel audio signal received from the database 1005 (S1310). As described above, the conversion may be performed by using a Short Time Fourier Transform (STFT).

Next, the second filter bank 1025 may filter the frequency-converted stereo audio signal by using a plurality of band pass filters (S1315).

Next, the second principal component analyzer 1035 may perform principal component analysis based on the signal from the second converter 1015 (S1320).

In particular, the second principal component analyzer 1035 may perform principal component analysis based on signals from the second filter bank 1025.

Next, the second feature extractor 1045 may extract features of the principal component signals based on the signal from the second principal component analyzer 1035 (S1325).

Next, the model learning predictor 1050 may perform learning based on the deep neural network model based on the features from the second feature extractor 1045 (S1330).

Therefore, learning is performed based on the deep neural network model, so that accurate prediction can be performed in the multi-channel prediction of fig. 13.

Further, the upmix channel generation of the audio signal by the signal processing apparatus 170 may be applied to any playback device capable of playing back audio content, such as a portable playback device, a home theater, a sound bar, a car audio, and the like, in addition to an image display device such as a TV, a mobile terminal, a vehicle display device, and the like.

In particular, multi-channel playback devices such as home theaters, car audio, etc. may generate multi-channel audio signals that may be output to each speaker.

In addition, even in portable playback devices used with headphones and earphones, an immersive audio environment can be reproduced by combining the 3D multi-channel signal with externalization techniques.

In addition, even a two-channel speaker playback device in the form of a TV or a sound bar can be combined with multi-channel virtualization technology to reproduce further enhanced three-dimensional audio.

As is apparent from the above description, a signal processing device and an image display apparatus including the same according to an embodiment of the present disclosure include: a converter configured to convert a frequency of an input stereo audio signal; a principal component analyzer configured to perform principal component analysis based on the signal from the converter; a feature extractor configured to extract features of the principal component signals based on the signals from the principal component analyzer; an envelope adjuster configured to perform envelope adjustment based on a prediction performed based on a deep neural network model; and an inverse transformer configured to inversely transform the signal from the envelope adjuster to output a multi-channel up-mixed audio signal. Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved. In particular, a multi-channel signal can be easily synthesized using a principal component analysis method and a deep neural network model.

The signal processing device and the image display apparatus including the same according to the embodiments of the present disclosure further include a filter bank configured to filter the frequency-converted stereo audio signal from the converter through a plurality of band pass filters. Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved.

The signal processing apparatus and the image display device including the same according to the embodiments of the present disclosure further include a model learning predictor configured to perform prediction based on the deep neural network model based on the features from the feature extractor. Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved.

The signal processing device and the image display apparatus including the same according to the embodiments of the present disclosure further include a masking unit configured to perform masking with respect to a prediction result from the model learning predictor. Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved.

When each of the multiple channels is independent of time and frequency, a masking unit performs masking by using time-frequency components based on a prediction result from the model learning predictor to perform channel separation. Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved.

The envelope adjuster separates channels by correcting the envelope of the signal in the frequency band according to a weight function for the frequency band based on a prediction result from the model learning predictor. Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved.

An envelope adjuster adjusts the size of each frequency band to follow the envelope in the target channel. Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved.

The main component analyzer separates a main component signal and an ambient component signal of an input stereo audio signal. Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved.

The main component analyzer performs at least one of a correlation operation between channels of main component signals of the input stereo audio signal, a panning gain operation of the main component signals, and a power operation of the main component signals. Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved.

The feature extractor extracts panning gain of a principal component signal of an input stereo audio signal and power of the principal component signal. Therefore, when a downmix stereo audio signal is upmixed into a multi-channel audio signal, spatial distortion can be improved.

The signal processing device and the image display apparatus including the same according to the embodiment of the present disclosure further include: a second converter configured to convert frequencies of a downmix stereo audio signal or a multi-channel audio signal received from a database; a second principal component analyzer configured to perform principal component analysis based on the signal from the second converter; and a second feature extractor configured to extract features of the principal component signals based on the signals from the second principal component analyzer; wherein the learning is performed based on the deep neural network model based on the features extracted by the second feature extractor. Thus, learning may be performed based on the deep neural network model.

The signal processing apparatus and the image display device including the same according to the embodiments of the present disclosure further include a model learning predictor configured to perform learning based on the deep neural network model based on the features extracted by the second feature extractor. Thus, learning may be performed based on the deep neural network model.

Although the preferred embodiments of the present disclosure have been disclosed for illustrative purposes, those skilled in the art will appreciate that various modifications, additions and substitutions are possible, as disclosed in the accompanying claims. Accordingly, such modifications, additions and substitutions should also be construed as falling within the scope of the present disclosure.

Cross Reference to Related Applications

This application claims the benefit of priority from Korean Patent Application No. 10-2019-0002219, filed on January 8, 2019 with the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference.
