Content classification method, device and system

Document No.: 1921700  Publication date: 2021-12-03

Note: This technology, "Content classification method, device and system", was designed and created by Che Yuan (车元) on 2020-05-29. Its main content is as follows: The application provides a content classification method, device and system. In the system, a server can predict the target classification to which target content belongs by using an algorithm model, and send the target content and at least one sample expression feature to a terminal, where each sample expression feature is associated with the target classification and is an expression feature that should or should not be present when a user reads the target content. The terminal can then acquire at least one expression feature present while the user reads the target content displayed by the terminal, match the at least one expression feature with the at least one sample expression feature to obtain a first matching result, and send the first matching result to the server; the first matching result indicates the likelihood that the target content belongs to the target classification. The server can determine whether the target content belongs to the target classification according to the first matching result. According to the technical solutions of the embodiments of this application, whether content belongs to the classification predicted by an algorithm model can be determined more efficiently.

1. A content classification system, characterized by comprising a terminal and a server, wherein:

the server is used for predicting the target classification to which the target content belongs by utilizing a pre-trained algorithm model;

the server is further used for sending the target content and at least one sample expression feature to the terminal; the at least one sample expression feature is associated with the target classification, and any sample expression feature in the at least one sample expression feature is an expression feature which should exist or should not exist when a user reads the target content;

the terminal is used for displaying the target content and acquiring at least one expression feature existing when a user reads the target content displayed by the terminal; matching the at least one expression feature with the at least one sample expression feature to obtain a first matching result, and sending the first matching result to the server; wherein the first matching result indicates a likelihood that the target content belongs to the target classification;

the server is further configured to determine whether the target content belongs to the target classification according to the first matching result.

2. The system of claim 1,

the at least one sample expression feature comprises: at least one positive sample expression feature and at least one negative sample expression feature; the positive sample expression features are expression features which should exist when the user reads the target content, and the negative sample expression features are expression features which should not exist when the user reads the target content;

the terminal is specifically used for matching the at least one expression feature with the at least one positive sample expression feature to obtain a second matching result; matching the at least one expression feature with the at least one negative sample expression feature to obtain a third matching result; and determining a first matching result according to the second matching result and the third matching result.

3. The system of claim 1,

the terminal is specifically used for collecting at least one face image when a user reads the target content displayed by the terminal; and analyzing and processing the at least one facial image by utilizing at least one pre-trained expression feature recognition model to obtain at least one expression feature existing when the user reads the target content displayed by the terminal.

4. The system according to any one of claims 1 to 3,

the server is further used for obtaining at least one sample expression feature associated with the target classification and associating the target content with the at least one sample expression feature;

the server is also used for receiving a content request from the terminal; wherein the content request is used for requesting the server to send the target content to the terminal;

the server is further configured to determine, according to the content request, the target content to be sent and the at least one sample expression feature associated with the target content.

5. A content classification method is applied to a terminal, and the method comprises the following steps:

receiving target content and at least one sample expression feature from a server; the at least one sample expression feature is associated with a target classification, the target classification is the classification to which a pre-trained algorithm model predicts the target content belongs, and any sample expression feature in the at least one sample expression feature is an expression feature which should exist or should not exist when a user reads the target content;

displaying the target content, and acquiring at least one expression feature existing when a user reads the target content displayed by the terminal;

matching the at least one expression feature with the at least one sample expression feature to obtain a first matching result; wherein the first match result indicates a likelihood that the target content belongs to the target classification;

and sending the first matching result to the server.

6. The method of claim 5,

the at least one sample expression feature comprises: at least one positive sample expression feature and at least one negative sample expression feature; the positive sample expression features are expression features which should exist when the user reads the target content, and the negative sample expression features are expression features which should not exist when the user reads the target content;

the matching of the at least one expression feature with the at least one sample expression feature to obtain a first matching result includes:

matching the at least one expression feature with the at least one positive sample expression feature to obtain a second matching result; matching the at least one expression feature with the at least one negative sample expression feature to obtain a third matching result;

and determining a first matching result according to the second matching result and the third matching result.

7. The method according to claim 5 or 6,

the obtaining of at least one expression feature existing when the user reads the target content displayed by the terminal includes:

collecting at least one face image when a user reads the target content displayed by the terminal;

and analyzing and processing the at least one facial image by utilizing at least one pre-trained expression feature recognition model to obtain at least one expression feature existing when the user reads the target content displayed by the terminal.

8. A content classification method applied to a server, the method comprising:

predicting the target classification to which the target content belongs by using a pre-trained algorithm model;

sending the target content and at least one sample expression feature to a terminal; the at least one sample expression feature is associated with the target classification, and any sample expression feature in the at least one sample expression feature is an expression feature which should exist or should not exist when a user reads the target content;

receiving a first matching result from the terminal; the first matching result is obtained by the terminal by matching at least one expression feature existing when a user reads the target content displayed by the terminal with the at least one sample expression feature, and the first matching result indicates the possibility that the target content belongs to the target classification;

and determining whether the target content belongs to the target classification according to the first matching result.

9. The method of claim 8,

before the sending of the target content and the at least one sample expression feature to the terminal, the method further includes:

obtaining at least one sample expression feature associated with the target classification, and associating the target content with the at least one sample expression feature;

receiving a content request from the terminal; wherein the content request is used for requesting the server to send the target content to the terminal;

and determining the target content to be sent and the at least one sample expression feature associated with the target content according to the content request.

10. A communication device arranged to implement the method of any of claims 5 to 7.

11. A communication device, characterized by being adapted to implement the method of claim 8 or 9.

12. A terminal, characterized in that it comprises a processor for executing computer instructions stored in a memory, such that the terminal implements the method of any of claims 5 to 7.

13. A server, comprising a processor configured to execute computer instructions stored in a memory, such that the server implements the method of claim 8 or 9.

14. A computer readable storage medium storing computer instructions which, when executed by a processor of a terminal, cause the terminal to carry out the method of any one of claims 5 to 7; alternatively, the computer instructions, when executed by a processor of a server, cause the server to implement the method of claim 8 or 9.

Technical Field

The embodiment of the application relates to the field of computers, in particular to a content classification method, device and system.

Background

With the development of computer application technology, there is a need to classify content in many business scenarios. For example, in order to manage a large amount of content, the content needs to be classified; as another example, in order to accurately recommend content of interest to a user, it is necessary to classify content and recommend content belonging to a specific classification to the user.

At present, a pre-trained algorithm model mainly performs semantic analysis on the content itself or on related information, then predicts and outputs the classification to which the content belongs. Depending on the network structure of the algorithm model and the complexity of the content, the predicted classification may be inaccurate. To classify content correctly, a manual review step therefore usually has to be added after the algorithm model predicts the classification, to determine whether the content actually belongs to the predicted classification.
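
For orientation only, the following is a minimal Python sketch of the kind of pre-trained classifier this paragraph describes, assuming a TF-IDF plus logistic-regression text model trained on toy data; the application does not specify any particular model architecture, so everything here is illustrative.

```python
# Illustrative stand-in for the "pre-trained algorithm model" (assumption:
# TF-IDF + logistic regression; the application names no concrete model).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: (content text, classification label).
train_texts = [
    "stand-up special leaves the audience in stitches",
    "market closes lower amid rate worries",
    "new sitcom episode is a laugh riot",
    "central bank signals further tightening",
]
train_labels = ["comedy", "finance", "comedy", "finance"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# Predict the target classification for a piece of target content.
target_content = "a comedy special that made the audience laugh"
target_classification = model.predict([target_content])[0]
print(target_classification)  # e.g. "comedy"; this prediction may still be
                              # wrong, which is why verification is needed
```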

Disclosure of Invention

The embodiment of the application provides a content classification method, device and system, which can more efficiently determine whether content belongs to the classification predicted by an algorithm model.

In a first aspect, a content classification system is provided, which includes a terminal and a server.

In the system, a server can predict the target classification to which target content belongs by using an algorithm model, and send the target content and at least one sample expression feature to a terminal, where each sample expression feature is associated with the target classification and is an expression feature that should or should not be present when a user reads the target content. The terminal can then display the target content, acquire at least one expression feature present while the user reads the displayed target content, match the at least one expression feature with the at least one sample expression feature to obtain a first matching result, and send the first matching result to the server. Because the first matching result reflects the possibility that the target content belongs to the target classification, the server can determine from it whether the target content belongs to the target classification. In this way, whether content belongs to the classification predicted by the algorithm model can be determined more efficiently.
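
As a concrete illustration of the two messages this interaction implies, the following Python sketch defines hypothetical payload structures; the field names and the choice of plain dataclasses are assumptions of this sketch, not anything specified by the application.

```python
# Hypothetical payloads for the server->terminal push and the terminal->server
# report; all field names are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContentPush:
    """Server -> terminal: target content plus its sample expression features."""
    content_id: str
    target_content: str
    target_classification: str
    # Features that should / should not be present while the user reads.
    positive_sample_features: List[str] = field(default_factory=list)
    negative_sample_features: List[str] = field(default_factory=list)

@dataclass
class MatchReport:
    """Terminal -> server: the first matching result for one content item."""
    content_id: str
    first_matching_result: float  # likelihood content belongs to the class
```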

In addition, a large amount of content generally needs to be classified in an actual business scenario, and the classifications predicted by the algorithm model are mostly accurate. When whether content belongs to the predicted classification can be determined more efficiently, only the content that the algorithm model cannot predict correctly needs subsequent manual re-inspection, which avoids a large amount of manual re-inspection operations and helps classify a large amount of content accurately and efficiently.

In one possible embodiment, the at least one sample expression feature comprises: at least one positive sample expression feature and at least one negative sample expression feature; the positive sample expression features are expression features which should exist when the user reads the target content, and the negative sample expression features are expression features which should not exist when the user reads the target content. Correspondingly, the terminal can match the at least one expression feature with the at least one positive sample expression feature to obtain a second matching result; match the at least one expression feature with the at least one negative sample expression feature to obtain a third matching result; and determine the first matching result according to the second matching result and the third matching result.
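
The application does not fix a formula for combining the second and third matching results into the first; the sketch below shows one plausible combination (set overlap against the positive samples, discounted by overlap with the negative samples), purely as an illustration.

```python
# One plausible way to derive the first matching result from the second and
# third matching results; the formula is an assumption of this sketch.
from typing import Set

def first_matching_result(observed: Set[str],
                          positive_samples: Set[str],
                          negative_samples: Set[str]) -> float:
    """Score in [0, 1]; higher means the content more plausibly belongs to
    the target classification predicted by the algorithm model."""
    # Second matching result: fraction of expected expressions observed.
    second = (len(observed & positive_samples) / len(positive_samples)
              if positive_samples else 0.0)
    # Third matching result: fraction of disallowed expressions observed.
    third = (len(observed & negative_samples) / len(negative_samples)
             if negative_samples else 0.0)
    # Reward expected expressions, discount unexpected ones.
    return max(0.0, min(1.0, second * (1.0 - third)))

score = first_matching_result(
    observed={"smile", "laugh"},
    positive_samples={"smile", "laugh", "raised_cheeks"},
    negative_samples={"frown"},
)
print(round(score, 2))  # 0.67 for this toy input
```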

In a possible implementation manner, the terminal can collect at least one face image when a user reads target content displayed by the terminal; and analyzing and processing at least one facial image by utilizing at least one pre-trained expression feature recognition model to obtain at least one expression feature existing when the user reads the target content displayed by the terminal.
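
A minimal sketch of this collection-and-recognition step is given below, using OpenCV for frame capture; the expression-feature recognition model is represented by a hypothetical `recognizer.predict(frame)` call, since the application does not name a concrete recognizer.

```python
# Sketch of the collection step, assuming OpenCV camera capture; the
# recognizer object and its predict() method are hypothetical stand-ins.
import cv2  # pip install opencv-python

def collect_face_images(num_frames: int = 5):
    """Grab a few frames from the (assumed front-facing) camera at index 0."""
    cap = cv2.VideoCapture(0)
    frames = []
    try:
        for _ in range(num_frames):
            ok, frame = cap.read()
            if ok:
                frames.append(frame)
    finally:
        cap.release()
    return frames

def recognize_expressions(frames, recognizer) -> set:
    """Run a pre-trained expression-feature recognition model on each frame.
    `recognizer.predict(frame)` stands in for whatever model is used and is
    assumed to return a set of feature labels such as {"smile"}."""
    features = set()
    for frame in frames:
        features.update(recognizer.predict(frame))
    return features
```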

In one possible implementation, the server may further obtain at least one sample expression feature associated with the target classification and associate the target content with the at least one sample expression feature. When the server receives a content request from the terminal requesting that the target content be sent, it can determine, according to the content request, the target content to be sent and the at least one sample expression feature associated with the target content. In this way, by associating the target content with the corresponding sample expression features, the server can quickly look up the sample expression features that need to be sent to the terminal together with the target content.
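
The following sketch illustrates one way a server might maintain these associations, assuming an in-memory table keyed by content identifier; the classification-to-features mapping and all names are illustrative.

```python
# Illustrative server-side bookkeeping: classifications mapped to their sample
# expression features, and content items linked to the features they carry.
CLASS_FEATURES = {
    "comedy":  {"positive": {"smile", "laugh"}, "negative": {"frown"}},
    "tragedy": {"positive": {"frown", "tears"}, "negative": {"laugh"}},
}

content_store = {}  # content_id -> dict with content, class, and features

def associate(content_id: str, content: str, target_classification: str) -> None:
    """Associate the target content with the sample features of its class."""
    content_store[content_id] = {
        "content": content,
        "classification": target_classification,
        "features": CLASS_FEATURES[target_classification],
    }

def handle_content_request(content_id: str) -> dict:
    """On a content request, quickly look up the content together with the
    sample expression features that must travel with it to the terminal."""
    return content_store[content_id]
```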

In a second aspect, a content classification method is provided; for its beneficial effects, refer to the description of the first aspect. The method may be performed by a terminal and may include: the terminal receives target content and at least one sample expression feature from the server; the at least one sample expression feature is associated with a target classification, the target classification is the classification to which a pre-trained algorithm model predicts the target content belongs, and any sample expression feature in the at least one sample expression feature is an expression feature which should exist or should not exist when a user reads the target content. Then, the terminal displays the target content, obtains at least one expression feature present while the user reads the displayed target content, and matches the at least one expression feature with the at least one sample expression feature to obtain a first matching result. Finally, the terminal may send the first matching result to the server, so that the server determines, according to the first matching result, whether the target content belongs to the target classification predicted by the algorithm model.

In one possible embodiment, the at least one sample expression feature comprises: at least one positive sample expression feature and at least one negative sample expression feature; the positive sample expression features are expression features which should exist when the user reads the target content, and the negative sample expression features are expression features which should not exist when the user reads the target content. Correspondingly, the terminal can match the at least one expression feature with the at least one positive sample expression feature to obtain a second matching result; match the at least one expression feature with the at least one negative sample expression feature to obtain a third matching result; and then determine the first matching result according to the second matching result and the third matching result.

In one possible implementation, the terminal may collect at least one face image while the user reads the target content displayed by the terminal, and analyze the at least one facial image by using at least one pre-trained expression feature recognition model to obtain at least one expression feature present while the user reads the displayed target content.

In a third aspect, a content classification method is provided; for its beneficial effects, refer to the description of the first aspect. The method is executed by a server and may include the following steps. The server predicts the target classification to which the target content belongs by using a pre-trained algorithm model, and then sends the target content and at least one sample expression feature to the terminal; the at least one sample expression feature is associated with the target classification, and any sample expression feature in the at least one sample expression feature is an expression feature which should exist or should not exist when the user reads the target content. The server then receives a first matching result from the terminal; the first matching result is obtained by the terminal by matching at least one expression feature present while the user reads the displayed target content with the at least one sample expression feature. Because the first matching result indicates the possibility that the target content belongs to the target classification, the server can determine, according to the first matching result, whether the target content belongs to the target classification.
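
The application leaves the server's decision rule open; a minimal sketch, assuming a simple fixed threshold on the reported first matching result, might look like this:

```python
# A minimal decision rule; the 0.6 threshold is a purely illustrative
# assumption, not a value taken from the application.
def belongs_to_target_classification(first_matching_result: float,
                                     threshold: float = 0.6) -> bool:
    """True if the terminal's report supports the predicted classification;
    contents that fail could be queued for manual re-inspection instead."""
    return first_matching_result >= threshold
```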

In a possible implementation manner, the server may further obtain at least one sample expression feature associated with the target classification and associate the target content with the at least one sample expression feature. When receiving a content request from the terminal requesting that the server send the target content, the server may determine, according to the content request, the target content to be sent and the at least one sample expression feature associated with the target content.

In a fourth aspect, there is provided a communication device comprising means for performing the steps of the second aspect above.

In a fifth aspect, there is provided a communication device comprising means for performing the steps of the above third aspect.

In a sixth aspect, a terminal is provided, which comprises a processor connected to a memory for executing a program of computer instructions stored in the memory, so that the terminal implements the method provided in the second aspect. The memory may be located within the terminal or external to the terminal.

In a seventh aspect, a server is provided, which comprises a processor connected to a memory and configured to execute a computer instruction program stored in the memory, so that the server implements the method provided in the third aspect.

In an eighth aspect, there is provided a computer readable storage medium storing computer instructions that, when executed by a processor of a terminal, cause the terminal to implement the method provided in the second aspect; alternatively, the computer instructions, when executed by a processor of a server, cause the server to implement the method provided in the third aspect.

In a ninth aspect, a computer program product is provided, the computer program product comprising computer program code which, when run on a computer, causes the computer to perform the method provided by the second or third aspect.

In the above aspects, the processor may be implemented by hardware or may be implemented by software. When the processor is implemented in hardware, the processor may be a logic circuit, an integrated circuit, or the like. When the processor is implemented in software, the processor may be a general-purpose processor, implemented by reading software code stored in a memory; the memory may be integrated into the processor or may be located outside the processor.

In the above aspects, the terminal/server may include one or more processors, and the memory may include one or more memories. The memory may be integral with the processor or provided separately from the processor. In a specific implementation process, the memory and the processor may be integrated on the same chip, or may be respectively disposed on different chips.

In the above aspects, the information transmitting or receiving process may be a process in which the processor outputs or receives the information. For example, the process of sending the target content and the at least one sample expression feature may be the processor outputting the target content and the at least one sample expression feature; the process of receiving the first matching result may be the processor receiving the first matching result. Specifically, information output by the processor may be provided to the transmitter/interface circuitry, and information received by the processor may come from the receiver/interface circuitry. The transmitter and the receiver may be collectively referred to as a transceiver.

Drawings

The drawings that accompany the detailed description can be briefly described as follows.

Fig. 1 is a schematic structural diagram of an exemplary mobile phone in an embodiment of the present application.

Fig. 2 is a schematic structural diagram of an exemplary software system of a mobile phone in an embodiment of the present application.

Fig. 3 is a system framework diagram of the technical solution provided in the embodiment of the present application.

Fig. 4 is a schematic diagram of an interaction process between a server and a terminal in an embodiment of the present application.

Fig. 5A is the first schematic diagram of a graphical user interface displayed by an exemplary terminal in an embodiment of the present application.

Fig. 5B is a second schematic diagram of a graphical user interface displayed by an exemplary terminal in an embodiment of the present application.

Fig. 5C is a third schematic diagram of a graphical user interface displayed by an exemplary terminal in an embodiment of the present application.

Fig. 5D is a fourth schematic diagram of a graphical user interface displayed by an exemplary terminal in an embodiment of the present application.

Fig. 6 is a schematic structural diagram of a terminal provided in an embodiment of the present application.

Fig. 7 is a schematic structural diagram of a server provided in the embodiment of the present application.

Detailed Description

The technical solutions provided in the embodiments of the present application will be described below with reference to the accompanying drawings.

In the embodiments of this application, unless otherwise specified, "/" indicates "or"; for example, A/B indicates A or B. "And/or" merely describes an association between the associated objects and covers three cases; for example, "A and/or B" may mean that A exists alone, that A and B exist simultaneously, or that B exists alone. In addition, "several" means one or more, and "a plurality of" means two or more.

In the embodiments of the present application, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.

In this embodiment of the application, the terminal may be an electronic device having a display screen and a camera, such as a mobile phone, a tablet computer, a notebook computer, an ultra-mobile personal computer (UMPC), a Personal Digital Assistant (PDA), a wearable device, and a virtual reality device, which is not limited in this respect.

Taking a mobile phone as an example of the terminal: as shown in fig. 1, the mobile phone 100 may include a processor 110, an external memory interface 120, and an internal memory 121. It may also include a Universal Serial Bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display 194, a Subscriber Identity Module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.

The processor 110 may include one or more processing units, for example, an Application Processor (AP), a modem, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a Neural-network Processing Unit (NPU). The different processing units may be independent devices or may be integrated into one or more devices.

A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that the processor 110 has just used or used cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from this memory. This avoids repeated accesses, reduces the waiting time of the processor 110, and improves the efficiency of the system.

The wireless communication function of the mobile phone 100 can be realized by the cooperation of the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem, and a baseband processor.

The antenna 1 and the antenna 2 are each used for transmitting and receiving electromagnetic wave signals. Antenna 1 and antenna 2 may each cover a single or multiple communication bands, and may also multiplex different antennas to improve antenna utilization. In some embodiments, antenna 1 may be multiplexed as a diversity antenna for a wireless local area network.

The mobile communication module 150 is used to support solutions of wireless communication technologies such as 2G, 3G, 4G, and 5G applied to the mobile phone 100. The mobile communication module 150 may include a filter, a switch, a power amplifier, and a Low Noise Amplifier (LNA) function module. The mobile communication module 150 may receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and transmit the processed signals to the modem for demodulation. The mobile communication module 150 may also amplify the signal modulated by the modem, and convert the amplified signal into electromagnetic wave through the antenna 1 to radiate the electromagnetic wave. In some embodiments, some of the functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, some functional modules of the mobile communication module 150 may be integrated with some functional modules of the processor 110 in the same device.

The modem may include a modulator and a demodulator. The modulator is used for modulating a low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low frequency baseband signal to a baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then sent to the application processor. The application processor outputs sound signals through audio devices (including but not limited to speaker 170A and receiver 170B) or displays images or video through display screen 194. In some embodiments, the modem may be a stand-alone device. In some embodiments, the modem may be provided in the same device as the mobile communication module 150 or other components, independent of the processor 110.

The wireless communication module 160 is configured to support solutions of wireless communication technologies such as a Wireless Local Area Network (WLAN), a Bluetooth (BT), a Global Navigation Satellite System (GNSS), a Frequency Modulation (FM), a Near Field Communication (NFC), and an Infrared (IR) technology, which are applied to the mobile phone 100. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 may receive electromagnetic waves through the antenna 2, perform frequency modulation and filtering processing on the received electromagnetic wave signals, and transmit the processed signals to the processor 110. The wireless communication module 160 can also receive a signal to be transmitted from the processor 110, perform frequency modulation and amplification processing on the signal, and convert the signal into electromagnetic waves through the antenna 2 to radiate the electromagnetic waves.

The antenna 1 of the handset 100 is coupled to the mobile communication module 150 and the antenna 2 is coupled to the wireless communication module 160 so that the handset 100 can communicate with other devices through wireless communication technology. It is understood that the wireless communication technologies may include global system for mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), time-division code division multiple access (TD-SCDMA), Long Term Evolution (LTE), BT, GNSS, WLAN, NFC, FM, and IR technologies, etc. GNSS includes, but is not limited to, Global Positioning System (GPS), global navigation satellite system (GLONASS), beidou satellite navigation system (BDS), quasi-zenith satellite system (QZSS), and satellite-based augmentation system (SBAS).

The handset 100 implements its display function through the cooperation of the GPU, the display screen 194, and the application processor.

The GPU is an image processing microprocessor that may be coupled to a display screen 194 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.

The display screen 194 is used to display images and video. The display screen 194 includes a display panel. The display panel may adopt a Liquid Crystal Display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, or a quantum dot light-emitting diode (QLED). In some embodiments, the handset 100 may include one or more display screens 194.

The mobile phone 100 can realize the functions of collecting, processing and displaying images by the cooperation of the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor and other components.

The camera 193 (a front camera or a rear camera; one camera may serve as both) is used to capture images or video. For example, when an image or video is captured by the camera 193, light is transmitted through the lens of the camera to its photosensitive element, the optical signal is converted into an electrical signal on the photosensitive element, and the electrical signal is transmitted to the ISP, which can process the electrical signal to obtain an image visible to the human eye. The photosensitive element of the camera 193 may be a Charge Coupled Device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and then transmits the electrical signal to the ISP to be converted into a digital image signal. The ISP can output the digital image signal to the DSP for processing. In some embodiments, the handset 100 may include one or more cameras 193.

The ISP is used to process the data fed back by the camera 193. For example, for processing the electrical signal from the camera 193 to obtain an image visible to the human eye, or for processing the electrical signal from the camera 193 to obtain a digital image signal and passing the digital image signal to the DSP. The ISP can also carry out algorithm optimization on the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided integrally in camera 193.

The DSP is used to convert the digital image signal from the ISP into a standard RGB or YUV format image signal. In some embodiments, the DSP may also be used to process other forms of digital signals; for example, when the mobile phone 100 selects a frequency point, the DSP may perform a Fourier transform on the frequency point energy.

Video codecs are used to compress or decompress digital video. The mobile phone 100 may support one or more video codecs, so that the mobile phone 100 can play or record videos in various encoding formats, such as Moving Picture Experts Group (MPEG)-1, MPEG-2, MPEG-3, and MPEG-4.

The NPU is a neural-network (NN) computing processor that rapidly processes input information by drawing on the structure of biological neural networks and can also learn continuously on its own. The NPU may be used to support intelligent-recognition applications of the mobile phone 100, such as image recognition, face recognition, voice recognition, and text semantic analysis.

The controller may be used as a neural center and a command center of the mobile phone 100, and is configured to generate an operation control signal according to the instruction operation code and the timing signal, so as to complete control of instruction acquisition and instruction execution.

The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to extend the storage capability of the mobile phone 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.

The internal memory 121 may be used to store computer-executable program code, which includes instructions. The internal memory 121 may include a program storage area and a data storage area. The storage program area may store an operating system and application programs corresponding to respective functions (such as a sound playing function and an image playing function) of the mobile phone 100. The storage data area may store data (such as audio data) created during use of the handset 100. Further, the internal memory 121 may include a high-speed random access memory and a nonvolatile memory, such as a magnetic disk memory, a flash memory, and a universal flash memory (UFS). The processor 110 implements various functions and data processing procedures of the mobile phone 100 by executing instructions stored in the internal memory 121 and/or executing instructions stored in a memory provided in the processor.

The handset 100 may implement audio functions, such as recording or playing music, through the cooperation of the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset interface 170D, and the application processor.

The audio module 170 is used to convert digital audio signals from the application processor into analog audio signals and also to convert analog audio signals from the microphone into digital audio signals. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or a part of the functional modules of the audio module 170 may be disposed in the processor 110.

Speaker 170A may also be referred to as a "horn" for converting audio signals from audio module 170 into sound signals. The handset 100 may enable playing music or hands-free calling through the speaker 170A.

The receiver 170B, also referred to as an "earpiece", is used to convert an audio signal from the audio module 170 into a sound signal. The user can answer a call or listen to a voice message by placing the receiver 170B close to the ear.

The microphone 170C, also referred to as a "mic" or "voice transmitter", is used to convert sound signals into electrical signals. When making a call or sending voice information through the mobile phone 100, the user can speak with the mouth close to the microphone 170C, and the microphone 170C receives the corresponding sound signal and converts it into an electrical signal. In some embodiments, one or more microphones 170C may be disposed in the handset 100 to facilitate noise reduction and identification of the sound source while the sound signal is being collected.

The headphone interface 170D is used to connect a wired headphone. The headset interface 170D may be the USB interface 130, a 3.5 mm Open Mobile Terminal Platform (OMTP) standard interface, or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.

The pressure sensor 180A is used to sense a pressure signal and convert it into an electrical signal. There are many kinds of pressure sensors 180A; for example, the pressure sensor 180A may be a resistive, inductive, or capacitive pressure sensor. A capacitive pressure sensor may include at least two parallel plates made of conductive material; when pressure is applied to the pressure sensor 180A, the capacitance between the plates changes, and the processor 110 can determine the strength of the pressure from the change in capacitance. In some embodiments, the pressure sensor 180A may be disposed on the display screen 194; when a touch operation acts on the display screen 194, the processor 110 can detect the intensity of the touch operation through the pressure sensor 180A, and can also calculate the touch position from the detection signal of the pressure sensor 180A. In some embodiments, touch operations that act on the same position but with different intensities may correspond to different operation instructions; for example, when a touch operation whose intensity is smaller than a preset pressure threshold acts on the icon of the short message application, the processor executes an instruction for viewing the short message, and when a touch operation whose intensity is greater than or equal to the preset pressure threshold acts on the same icon, the processor executes an instruction for creating a new short message.

The gyro sensor 180B may be used to determine the motion attitude of the cellular phone 100. In some embodiments, the angular velocity of the handset 100 about three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 180B. The gyro sensor 180B may be used for anti-shake photography. For example, the gyro sensor 180B may detect the angle at which the mobile phone 100 shakes and calculate the distance the lens of the camera 193 needs to compensate for, so that the lens counteracts the shake through a reverse movement, thereby implementing anti-shake during shooting. In some embodiments, the gyro sensor 180B may also be used to support the handset's navigation function and to support the user in playing somatosensory games on the handset 100.

The air pressure sensor 180C is used to measure air pressure. In some embodiments, the processor 110 may calculate an altitude based on the barometric pressure measured by the barometric pressure sensor 180C to support the handset 100 for assisted positioning and navigation functions.

The magnetic sensor 180D includes a Hall sensor. The cellular phone 100 can detect the open/close state of a holster attached to the cellular phone 100 through the magnetic sensor 180D. In some embodiments, when the cellular phone 100 is a flip phone, it can detect the open/close state of its flip cover according to the magnetic sensor 180D. Accordingly, the mobile phone 100 can automatically unlock or lock the display screen 194 according to the detected open/close state of the holster or the flip cover.

The acceleration sensor 180E can detect the acceleration of the cellular phone 100 in various directions. It may also be used to support the step-counting function of the handset 100 and the landscape/portrait switching of the graphical user interface on the display screen 194.

The distance sensor 180F is used to measure a distance. The mobile phone 100 can measure the distance between the target object and the mobile phone 100 by transmitting and receiving infrared light or infrared laser light. In some embodiments, the mobile phone 100 may measure the distance between the subject and the camera 193 using the distance sensor 180F to achieve fast focusing.

The proximity light sensor 180G includes, but is not limited to, a Light Emitting Diode (LED) and a light detector. The light emitting diode may be an infrared light emitting diode, and the light detector may be a photodiode. The cellular phone 100 emits infrared light outward through the light emitting diode and can detect infrared light reflected by a target object through the photodiode. When the photodiode detects infrared light satisfying a certain condition, it can be determined that a target object exists near the cellular phone 100. Using the proximity light sensor 180G, the mobile phone 100 can detect whether it is close to the user's ear during a call, so that the display screen is automatically turned off to save power. The proximity light sensor 180G may also be used to support the holster mode and pocket mode of the handset 100.

The ambient light sensor 180L is used to sense the ambient light level. The processor 110 may adaptively adjust the brightness of the display screen 194 according to the ambient light level sensed by the ambient light sensor 180L. The ambient light sensor 180L may also be used to support automatic white balance adjustment when the handset 100 takes pictures or video via the camera 193. The ambient light sensor 180L may also cooperate with the proximity light sensor 180G to support the mobile phone 100 to detect whether the mobile phone 100 is located in a pocket, thereby avoiding touching the display screen by mistake.

The fingerprint sensor 180H is used to collect a fingerprint of the user's finger, so that the mobile phone 100 can implement fingerprint unlocking, application-lock access, fingerprint photographing, and fingerprint-based call answering according to the collected fingerprint.

The temperature sensor 180J is used to detect temperature. In some embodiments, the handset 100 implements a temperature processing strategy using the temperature detected by the temperature sensor 180J. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the mobile phone 100 reduces the performance of a processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection. In some embodiments, when the reported temperature is lower than another threshold, the mobile phone 100 heats the battery 142 to avoid an abnormal shutdown due to low temperature. In some embodiments, when the reported temperature is lower than still another threshold, the mobile phone 100 boosts the output voltage of the battery 142 to avoid an abnormal shutdown caused by low temperature.

The touch sensor 180K may also be referred to as a "touch device". The touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen". The touch sensor 180K is used to detect a touch operation applied to itself or a nearby area. The touch sensor 180K may pass the detected touch operation to the application processor so that the application processor determines the type of touch event corresponding to the touch operation. In some embodiments, cell phone 100 may provide visual output related to touch operations through display screen 194. In some embodiments, the touch sensor 180K may be disposed on the surface of the mobile phone 100, independent of the display 194.

The bone conduction sensor 180M may acquire a vibration signal. In some embodiments, the bone conduction sensor 180M may acquire a vibration signal of the human vocal part vibrating the bone mass. The bone conduction sensor 180M may also be in contact with a specific part of the human body to collect a pulse signal and a blood pressure signal of the human body. In some embodiments, the bone conduction sensor 180M may be disposed in a headset, forming a bone conduction headset. The audio module 170 may analyze a voice signal based on the vibration signal of the bone mass vibrated by the sound part acquired by the bone conduction sensor 180M, so as to implement a voice function. In some embodiments, the application processor may analyze the heart rate information based on the blood pressure signal acquired by the bone conduction sensor 180M, implementing a heart rate detection function.

Keys 190 include, but are not limited to, a power-on key and a volume key. The keys 190 may be mechanical keys or touch keys. The user may generate input signals/instructions related to user settings and function control of the handset 100 by activating the keys 190.

The motor 191 may be used for incoming-call vibration cues as well as for touch vibration feedback. Specifically, touch operations on the icons of different applications (for example, the camera, calendar, and messaging icons), touch operations on different types of applications (such as instant-messaging, audio, and video applications), and different application scenarios (such as receiving an application notification or playing a game) may each correspond to different vibration feedback effects. Touch vibration feedback may also be customized by the user according to actual business needs.

The indicator 192 may be an indicator light used to indicate the charging status of the mobile phone 100, and may also be used to indicate a missed call or an unviewed message or notification on the mobile phone 100.

The display screen 194 is used to display a graphical user interface for each application at the application layer. It is understood that the handset 100 may include one or more display screens 194. Alternatively, the mobile phone 100 may only include one display screen 194 but the display screen can be divided into a plurality of display areas under the control of the user; for example, the cell phone 100 may include only one foldable flexible display, but the display may be folded under the control of the user and divided into two displays (i.e., into two display areas) along respective fold lines. The multiple display screens 194 of the same mobile phone 100 may display different graphical user interfaces independently, or may display partial areas of the same graphical user interface separately, and cooperate with each other to complete displaying a complete graphical user interface.

The SIM card interface 195 is used for connecting a SIM card, so that the mobile phone 100 can exchange information with a wireless network or a corresponding device through the SIM card, thereby implementing functions such as calls and data communication. A SIM card can be inserted into or pulled out of the SIM card interface 195, bringing it into contact with or separating it from the mobile phone 100; alternatively, the SIM card may be an embedded SIM card that cannot be separated from the mobile phone 100. It is understood that the handset 100 may include one or more SIM card interfaces, and each SIM card interface 195 may be connected to a different SIM card; alternatively, one SIM card interface 195 of the handset 100 may connect multiple SIM cards at the same time.

It should be noted that the structure of the mobile phone 100 exemplarily described in the embodiment of the present application does not constitute a limitation to the specific structure of the mobile phone or other terminals. In fact, for a mobile phone or other terminal, more or less components than the mobile phone 100 shown in fig. 1 may be included, some components in the mobile phone 100 shown in fig. 1 may be combined, some components in the mobile phone 100 shown in fig. 1 may be further separated, and various components in the mobile phone 100 shown in fig. 1 may have other connection relationships.

The software system of the handset 100 may employ a layered architecture, an event-driven architecture, a micro-core architecture, or a cloud architecture. In the embodiment of the present application, taking an Android (Android) system that is a layered architecture adopted as an example for a software system of the mobile phone 100, a structure of the software system of the mobile phone 100 is exemplarily described.

Fig. 2 is a schematic structural diagram of a software system of the mobile phone 100. As shown in fig. 2, the Android system may be divided into four layers, which are, from top to bottom, an application layer, an application framework layer, a system library together with the Android runtime, and a kernel layer. Each layer has a clear role and division of labor, and the layers communicate with each other through software interfaces.

The application layer includes a series of applications deployed on the handset 100. By way of example, the application layer may include, but is not limited to, applications such as a Launcher (Launcher), a browser, a calendar, a camera, a photo, a call, and a text message.

The application framework layer may provide an Application Programming Interface (API) and a programming framework for each application in the application layer. The application framework layer may include some predefined functional modules/services. By way of example, the application framework layer may include, but is not limited to, a Window manager (Window manager), an Activity manager (Activity manager), a Package manager (Package manager), a Resource manager (Resource manager), and a Power manager (Power manager).

The activity manager is used for managing the life cycle of each application program and realizing the navigation backspacing function of each application program. In particular, the Activity manager may be responsible for the creation of an Activity (Activity) process and the maintenance of the entire lifecycle of the created Activity process.

The window manager is used for managing window programs. It will be appreciated that the graphical user interfaces of the various applications at the application layer are typically composed of one or more activities (Activity), which in turn are composed of one or more views (View); the window manager may be used to add views included in a graphical user interface to be displayed to the display screen 194, or to remove views from the graphical user interface displayed on the display screen 194. In some embodiments, the window manager may also obtain the size of the display screen 194, determine whether there is a status bar in the graphical user interface displayed by the display screen 194, and support locking the display screen 194 and capturing the graphical user interface displayed on it.

The package manager may manage the installation packages corresponding to the respective applications, for example, decompressing, verifying, installing, and upgrading the respective packages. More specifically, the package manager may at least maintain the icon and the name of each application's package.

The resource manager may provide access to various non-code resources, such as native strings, graphics, and layout files, for various applications at the application layer.

The power manager is a core service of Android system power management and is mainly used to execute computing tasks related to power management in the Android system. Downward, it makes decisions that direct the underlying Android system to turn hardware devices such as the display screen, the distance sensor, and the proximity light sensor on or off. Upward, it provides operation interfaces that each application at the application layer can call to achieve specific purposes, such as keeping the display screen 194 of the handset 100 lit while the handset 100 plays audio through the application "music", or lighting up the display screen 194 when an application receives a notification.

The system library, the Android runtime, the kernel layer, and other layers located below the application framework layer may be collectively referred to as the underlying system. The underlying system includes an underlying display system for providing display services, which may include, but is not limited to, a surface manager (surface manager) located in the system library and a display driver located in the kernel layer.

It can be appreciated that the Android runtime is responsible for scheduling and management of the Android system and includes the core libraries and the virtual machine. Computer programs of the application layer and the application framework layer run in the virtual machine. More specifically, the virtual machine may execute the java files of the application layer and the application framework layer as binary files; the virtual machine may also implement functions such as object life cycle management, stack management, thread management, security management, and garbage collection.

It will be appreciated that the system library may also include a plurality of functional modules other than the surface manager. For example, it may also include a state monitoring service, media libraries (Media Libraries), a three-dimensional graphics engine (e.g., OpenGL for Embedded Systems), and a two-dimensional graphics engine.

The surface manager may provide a fusion of two-dimensional graphics and three-dimensional graphics for various applications.

The state monitoring service can receive data reported by each driver in the kernel layer.

The media library may support playback and capture of images/audio/video in a variety of commonly used formats.

The three-dimensional graphic engine is used for realizing drawing, rendering and synthesis of three-dimensional images.

The two-dimensional graphic engine is used for realizing drawing and rendering of two-dimensional images.

The kernel layer is the layer between hardware and software and contains drivers for various hardware. Illustratively, the kernel layer may include a display driver, a camera driver, an audio driver, and a touch driver; each driver may collect the information gathered by its corresponding hardware and report the corresponding monitoring data to the state monitoring service or to other functional modules in the system library.

Some applications deployed on terminals, such as a browser/search engine, have a corresponding server that can recommend, to the terminal, content belonging to a category the user is interested in or a category the user has selected. The terminal may present the content from the server to the user through the browser/search engine. If content is classified incorrectly, the content presented to the user may not be content the user is interested in, which not only degrades the user experience but may also cause good-quality content to be discarded because of a reduced click-through rate. Generally, to classify content correctly, a manual review step has to be added: after a pre-trained algorithm model predicts the classification to which a piece of content belongs, a reviewer determines whether the content actually belongs to that classification. The degree of manual intervention is therefore too high, and the efficiency is low.

In order to determine more efficiently whether content belongs to the classification predicted by the algorithm model, embodiments of the present application provide a content classification method, device, and system. At least one expression feature associated with each of a plurality of classifications can be predetermined; the at least one expression feature associated with a given classification may include one or more expression features that should be present when a user reads content belonging to that classification, and/or one or more expression features that should not be present. For target content that a pre-trained algorithm model predicts to belong to a certain target classification among the plurality of classifications, at least one expression feature present while a user reads the target content through a terminal can be acquired; the acquired expression features are then matched against the expression features associated with the target classification, and the resulting matching result reflects the likelihood that the target content belongs to the target classification. Accordingly, whether the target content belongs to the target classification predicted by the algorithm model can be determined more efficiently from the matching result.

In addition, a large amount of content generally needs to be classified in an actual business scenario, and the accuracy with which the algorithm model predicts the classification of content is relatively high. When it can be determined efficiently whether content belongs to the classification predicted by the algorithm model, only the content whose classification the algorithm model failed to predict correctly needs subsequent manual re-review, avoiding large-batch manual re-review and allowing content to be classified more efficiently.

The content classification system provided in the embodiment of the present application is described below with reference to fig. 3. As shown in fig. 3, the content classification system may include at least a server 20 and one or more terminals 10. A corresponding application, such as a browser, may be deployed in the terminal 10; the browser's server 20 is connected to a database 30. The database 30 may be deployed in the server 20 or in another computing device; the database 30 may also be replaced by another form of data management means, such as a corresponding file management system.

At least one expressive feature recognition model may also be deployed in the terminal 10.

The expressive feature recognition model may be deployed in the NPU or other type of AI chip of the terminal 10.

The at least one expression feature recognition model may include, but is not limited to, any one or more of the following: a model for recognizing expression features present in the user's eyebrows, a model for the user's eyes, a model for the user's cheeks, a model for the user's nose, and a model for the user's lips. It is understood that a single expression feature recognition model may also be trained to recognize expression features present in several parts of the user's face; for example, one model may be trained to recognize expression features present in both the eyebrows and the eyes.

In order to facilitate calling and managing the expression feature recognition models, different models can be identified by different model codes. Illustratively, referring to table 1 below, the model identified by code M001 recognizes expression features present in the user's eyebrows, which may include, but are not limited to, eyebrows raised, eyebrows pressed down, and glabellar wrinkles prominent. The model identified by code M002 recognizes expression features present in the user's eyes, which may include, but are not limited to, eyes wide open (area of the white of the eye increased) and eyes focused (squinting). The model identified by code M003 recognizes expression features present in the user's lips, which may include, but are not limited to, lip corners pulled back and up, lips pressed tight, lip corners pulled down, and lips open (teeth exposed). The model identified by code M004 recognizes expression features present in the user's nose, which may include, but are not limited to, nose wrinkled and nostrils flared. The model identified by code M005 recognizes expression features present in the user's cheeks, which may include, but are not limited to, cheeks raised and cheeks lowered.

TABLE 1

Model code | Expressive features
M001 | Eyebrows raised; eyebrows pressed down; glabellar wrinkles prominent
M002 | Eyes wide open (area of the white of the eye increased); eyes focused (squinting)
M003 | Lip corners pulled back and up; lips pressed tight; lip corners pulled down; lips open (teeth exposed)
M004 | Nose wrinkled; nostrils flared
M005 | Cheeks raised; cheeks lowered

It should be noted that the model codes and the expression features shown in table 1 are only used to assist in describing the technical solutions provided in the embodiments of the present application. In practical applications, more or fewer expressive feature recognition models can be deployed in the terminal 10, more or fewer expressive features can be recognized, and the model codes of the expressive feature recognition models can be replaced with corresponding real values.
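For illustration only, such a registry of deployed models might be represented on the terminal as a simple map; the codes and feature names below are the hypothetical values from Table 1, not real product values:

```kotlin
// Hypothetical registry mirroring Table 1: each model code maps to the set of
// expression features the corresponding recognition model can output.
val modelRegistry: Map<String, Set<String>> = mapOf(
    "M001" to setOf("eyebrows raised", "eyebrows pressed down", "glabellar wrinkles prominent"),
    "M002" to setOf("eyes wide open", "eyes focused"),
    "M003" to setOf("lip corners pulled back and up", "lips pressed tight",
                    "lip corners pulled down", "lips open"),
    "M004" to setOf("nose wrinkled", "nostrils flared"),
    "M005" to setOf("cheeks raised", "cheeks lowered"),
)
```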

The terminal 10 may receive configuration information from the server 20 or another communication device. The configuration information includes a model code and configuration parameters; according to the configuration parameters, the terminal 10 may update the already-deployed expression feature recognition model identified by that model code.
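A minimal sketch of this update path, assuming (the original does not specify) that the configuration message carries the parameters as an opaque payload:

```kotlin
// Hypothetical shape of the configuration information described above.
data class ModelConfig(
    val modelCode: String,     // identifies a deployed recognition model, e.g. "M002"
    val parameters: ByteArray  // opaque configuration parameters for that model
)

// Update only a model that is already deployed and identified by the model code.
fun applyConfig(config: ModelConfig, deployedModels: MutableMap<String, ByteArray>) {
    if (config.modelCode in deployedModels) {
        deployedModels[config.modelCode] = config.parameters
    }
}
```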

The database 30 is used for storing at least one expression feature associated with each of a plurality of classifications. The classifications may include, but are not limited to, one or more of video, fashion, military, science and technology, political, history, and the like, and further one or more of entertainment, food, comics, automobiles, sports, finance, disaster, doctor-patient relationship, strange stories, car model beauty, and the like.

For any classification among the plurality of classifications, at least one expression feature that should be present and/or at least one expression feature that should not be present when a user reads content belonging to that classification can be obtained through manual labeling and/or machine learning, and the association between the classification and each such expression feature is maintained in the database.

For example, when a large number of users read contents belonging to the "make a smile" category, the ratio of the number of users having a "raise of eyebrow" phenomenon at the eyebrow part to the total number of users reading contents belonging to the "make a smile" category is extremely large, for example, the ratio is greater than the set value of 90%. The "raised eyebrows" may be determined as an expression feature that should exist when the user reads the content belonging to the "fun" classification, and the association relationship between the "fun" classification and the "raised eyebrows" may be stored in the database 30.

For example, when a large number of users read contents belonging to the "make a smile" category, the ratio of the number of users who have a "eyebrow pressing" phenomenon at the eyebrow part to the total number of users who read contents belonging to the "make a smile" category is very small, for example, the ratio is less than the set value of 10%. "eyebrow press down" may be determined as an expressive feature that should not exist when the user reads content belonging to the "fun" category. The database 30 may store therein an association between the "fun" classification and the "eyebrow pressing down".

For any classification among the plurality of classifications, the expression features that should be present when a user reads content belonging to that classification are also referred to as positive expression features associated with that classification, and the expression features that should not be present are also referred to as negative expression features associated with that classification. Correspondingly, the at least one expression feature that should be present may form a positive expression feature set, and the database 30 stores the association between the classification and the positive expression feature set; and/or the at least one expression feature that should not be present may form a negative expression feature set, and the database 30 stores the association between the classification and the negative expression feature set.

For example, when the user reads the content belonging to the "political" classification, the expressive features "eye focus", "eyebrow pressing", "glabellar wrinkle highlighting", and "tight lips" that should exist may constitute the positive expressive feature set M1, the expressive features "eyes open", "teeth exposed", and "cheeks raised" that should not exist may constitute the negative expressive feature set N1, and the database 30 stores the association relationship between the "political" classification and the positive expressive feature set M1 and the negative expressive feature set N1.

For example, when the user reads the content belonging to the "disaster" classification, the expression features "eyebrow pressing down", "glabellar wrinkle highlighting", "lip closing", and "lip pulling back" that should exist may constitute the positive expression feature set M2, the expression features "eyebrow raising", "tooth exposing", "lip corner pulling back, and raising" that should not exist may constitute the negative expression feature set N2, and the database 30 stores the association relationship between the "disaster" classification and the positive expression feature set M2 and the negative expression feature set N2.

For example, when the user reads the content belonging to the "doctor-patient relationship" classification, the expression features "eyebrow pressing down", "glabellar wrinkle protruding", "lips closed", "nose wrinkle", and "nostril opening" that should exist may constitute the positive expression feature set M3, the expression features "eyebrow raising", "teeth exposed", "lip angle pulling back, and raising" that should not exist may constitute the negative expression feature set N3, and the association relationship between the "doctor-patient relationship" classification and the positive expression feature set M3 and the negative expression feature set N3 is stored in the database 30.

For example, when the user reads the content belonging to the "strange and peculiar" category, the expression features "eyebrow raising", "eyes open", and "cheek falling" that should exist may constitute the positive expression feature set M4, the expression features "eyebrow pressing" and "mouth corner pulling" that should not exist may constitute the negative expression feature set N4, and the database 30 stores the association relationship between the "strange and peculiar" category and the positive expression feature set M4 and the negative expression feature set N4.

For example, when the user reads the content belonging to the category "car model beauty", the expression features "eyes open wide", "eyebrows raise", "nostrils open" and "cheek lift" that should exist may constitute the positive expression feature set M5, the expression features "eyebrows press down" and "mouth corner pull down" that should not exist may constitute the negative expression feature set N5, and the database 30 stores the association relationship between the category "strange things" and the positive expression feature set M5 and the negative expression feature set N5.

For example, when the user reads the content belonging to the "entertainment" category, the expression features "eyes open wide", "eyebrows raise", "teeth are exposed", "lip corners pull back and raise", and "cheek raise" that should exist may constitute the positive expression feature set M6, the expression features "eyebrows press down" and "mouth corners pull down" that should not exist may constitute the negative expression feature set N6, and the database 30 stores the association relationship between the "entertainment" category and the positive expression feature set M6 and the negative expression feature set N6.

For example, when the user reads content belonging to the "food" category, the expression features "eyes open wide", "nostrils open wide", and "lip angle pulled back and up" that should exist may constitute the positive expression feature set M7, the expression features "eyebrow pressing down" and "glabellar wrinkle highlighting" that should not exist may constitute the negative expression feature set N7, and the database 30 stores the association relationship between the "food" category and the positive expression feature set M7 and the negative expression feature set N7.

For example, when the user reads the content belonging to the "fun" classification, the expression features "eyes open wide", "eyebrows raise up", and "cheeks up" that should exist may constitute the positive expression feature set M8, the expression features "eyebrows press down", and "lips corner pull down" that should not exist may constitute the negative expression feature set N8, and the database 30 stores the association relationship between the "fun" classification and the positive expression feature set M8 and the negative expression feature set N8.

It should be noted that the associations between classifications and positive/negative expression feature sets described in the foregoing examples are only used to assist in describing the technical solutions provided in the embodiments of the present application; the database 30 may also store positive and negative expression feature sets associated with other classifications.

To facilitate management and other processing of the expression features, different feature codes can be used to represent different expression features; likewise, to facilitate management and other processing of the classifications, different classification codes can be used to represent different classifications.

Correspondingly, when a classification and its positive and negative expression feature sets are stored in the database 30, the classification may be replaced with its classification code and each expression feature with its feature code, so that the association between the classification and its positive/negative expression feature sets is represented as an association between a classification code and a positive/negative feature code set.
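A possible record layout for these associations, with purely hypothetical codes (the original leaves the concrete values open), might look as follows:

```kotlin
// Hypothetical record kept in database 30 for one classification.
data class CategoryFeatureSets(
    val categoryCode: String,               // e.g. "C008" for the "fun" classification
    val positiveFeatureCodes: Set<String>,  // features that SHOULD be present
    val negativeFeatureCodes: Set<String>   // features that should NOT be present
)

val funCategory = CategoryFeatureSets(
    categoryCode = "C008",
    positiveFeatureCodes = setOf("F001", "F002", "F012"), // eyebrows raised, eyes wide open, cheeks raised
    negativeFeatureCodes = setOf("F003", "F009")          // eyebrows pressed down, lip corners pulled down
)
```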

The following describes, with reference to fig. 4, an exemplary process in which the server 20 and the terminal 10 of the content classification system cooperate to determine whether the target content belongs to the target classification predicted by the algorithm model. As shown in fig. 4, the process may include the following steps.

In step 401, the server 20 obtains the target content.

The target content may be published to the server 20 by a user through a corresponding terminal or input device, or may come from another business system.

The type of the target content may include, but is not limited to, text, an image set, video, or a combination thereof.

It is to be appreciated that the target content can be carried in a corresponding graphical user interface. The terminal may obtain the graphical user interface according to the Uniform Resource Locator (URL) address corresponding to that interface and display it, so as to present the target content to the user.

In step 402, the server 20 predicts, using a pre-trained algorithm model, the target classification to which the target content belongs.

In some embodiments, differing from step 402, if the target content comes from another business system, the server 20 may receive description information of the target content from that business system, perform semantic analysis on the description information using a corresponding algorithm model, and predict and output the target classification to which the target content belongs.

Illustratively, the title is "store clothes are expensive? The small editors teach you: the target content of how to sell the clothes is substantial comes from other business systems, from which the server 20 may receive the following description information of the target content:

The server 20 may perform semantic analysis on the description information using a corresponding algorithm model, and predict and output that the target classification to which the target content belongs is "fun".

It should be noted that the algorithm model for performing semantic analysis on the target content or on its description information may include, but is not limited to, an algorithm model based on a network structure such as BERT, Transformer, or a long short-term memory (LSTM) network, which is not limited in the embodiments of the present application.

In step 403, the server 20 queries the database 30 according to the target classification to obtain the positive expression feature set and the negative expression feature set associated with the target classification, and associates the target content with the two sets.

In some embodiments, rather than step 403, the server 20 may query the database 30 according to the classification code of the target classification to obtain the positive feature code set and the negative feature code set associated with that classification code.

After the target content has been ingested and processed through the foregoing steps, the server can recommend it to a terminal at the terminal's request.

Specifically, in step 404, the terminal 10 may transmit a content recommendation request to the server 20.

In one possible embodiment, the server 20 may learn, from a user profile or by other means, the categories the user is interested in, and the content recommendation request is used to request that the server 20 recommend content belonging to those categories to the terminal 10.

Illustratively, as shown in fig. 5A, the graphical user interface displayed by the terminal 10 is a desktop of the terminal 10, and the desktop may include respective icons of a plurality of applications deployed on the terminal 10 and may further include respective names of some of the applications. Wherein "icon 5" is an icon of the application "browser"; when the user touches the area of the "icon 5" on the desktop, the terminal 10 may sense a touch operation of the user on the "icon 5" through the touch sensor, respond to the touch operation to start the "browser", and trigger the browser to send a content recommendation request to the server 20.

Illustratively, as shown in fig. 5B, the graphical user interface displayed by the terminal 10 is a graphical user interface of an application "browser", and when a user performs a refresh operation on the graphical user interface, such as touching a finger on a display screen of the terminal 10 and sliding the finger in a specific direction on the display screen, the terminal 10 may sense the refresh operation on the graphical user interface by the user through the touch sensor, and respond to the refresh operation to send a content recommendation request to the server 20.

In one possible embodiment, the content recommendation request is used to request that the server 20 recommend content of a user-selected target category to the terminal 10. Illustratively, as shown in fig. 5B, the graphical user interface displayed by the terminal 10 is that of the application "browser"; when the user touches the word for a target category in the graphical user interface, such as the word for the "fun" category, the terminal 10 may sense the touch operation through the touch sensor and, in response, send a content recommendation request to the server 20, the content recommendation request requesting that the server 20 recommend one or more pieces of content belonging to the "fun" category to the terminal 10.

Accordingly, if the server 20 knows that the user holding the terminal 10 is interested in the target category to which the target content belongs, or the content recommendation request is for requesting the server 20 to recommend the content belonging to the target category to the terminal 10, the server 20 may perform step 405 of transmitting the summary information of the target content to the terminal 10.

The summary information may include the title of the target content, and may further include one or more of the source of the target content, its read/play/comment counts, and an image included in the target content.

It should be noted that the server 20 may send a plurality of pieces of summary information to the terminal 10, and the content corresponding to any two pieces of summary information may belong to the same or different target categories. The server 20 may transmit a list including summary information of each of the plurality of contents to the terminal 10. Summary information may also be referred to as list page content or card content.

In step 406, the terminal 10 displays summary information of the target content.

Illustratively, suppose the server 20 learns that the user is interested in the "video", "political", "military", "science and technology", and "fun" categories, and the pre-trained algorithm model predicts, or the server 20 determines by other means, that: content entitled "Heavyweight micro-video" belongs to the "video" category; content entitled "During the two sessions, XXX stands together with the people" belongs to the "political" category; content entitled "Is it expensive to buy clothes at the mall? The editor teaches you how to buy clothes economically" belongs to the "fun" category; content entitled "China comments on XX signing the defense agreement: the fate of the country and the nation" belongs to the "military" category; and content entitled "Huawei's new order!" belongs to the "science and technology" category. The server 20 may then send summary information of the foregoing contents to the terminal 10, and the terminal 10 may display a graphical user interface as shown in fig. 5B so as to display that summary information.

Illustratively, suppose the terminal 10 requests that the server 20 recommend content of the user-selected target category "fun", and the pre-trained algorithm model predicts, or the server 20 determines by other means, that the contents entitled "I feel the world is full of malice", "Confirm that you are cooking, not running an experiment", "Is it expensive to buy clothes at the mall? The editor teaches you how to buy clothes economically", "The meaning of this picture", and "Challenging the longest hotel name" all belong to the "fun" category. Summary information of these contents may then be transmitted by the server 20 to the terminal 10, and the terminal 10 may display a graphical user interface as shown in fig. 5C.

In step 407, the terminal 10 transmits a content request to the server 20.

Illustratively, when the user touches the area where the title "Is it expensive to buy clothes at the mall? The editor teaches you how to buy clothes economically" is located, the terminal 10 may sense the touch operation through the touch sensor and, in response, transmit a content request to the server 20.

In step 408, the terminal 10 receives the target content from the server 20, and receives the positive expression feature set and the negative expression feature set associated with the target content.

Illustratively, the server 20 may send the target content entitled "Is it expensive to buy clothes at the mall? The editor teaches you how to buy clothes economically" to the terminal 10, together with the positive expression feature set M8 = {eyes wide open, eyebrows raised, cheeks raised} and the negative expression feature set N8 = {eyebrows pressed down, lip corners pulled down} associated with the target content.

In some embodiments, unlike step 408, the terminal 10 may receive from the server 20 the positive feature code set and the negative feature code set associated with the classification code of the target classification.

In step 409, the terminal 10 displays the target content.

Illustratively, the terminal 10 may display a graphical user interface as shown in fig. 5D, presenting the target content entitled "Is it expensive to buy clothes at the mall? The editor teaches you how to buy clothes economically" for the user to read.

In step 410, the terminal 10 acquires a face image of the user when reading the target content.

In some embodiments, after the terminal 10 displays the target content, for example with the support of the activity manager and/or the window manager, and the graphical user interface shown in fig. 5D has been loaded on the display screen of the terminal 10, a front camera (or image capture device) of the terminal may be started, and facial images of the user are captured through the camera. It is understood that the terminal 10 may continuously or periodically capture a plurality of facial images while the user reads the target content.
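As one possible realization, and assuming the terminal uses Android's CameraX library (the original does not name a camera API), periodic frame capture from the front camera might look like this:

```kotlin
import android.content.Context
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.ImageProxy
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.core.content.ContextCompat
import androidx.lifecycle.LifecycleOwner

// Bind an image-analysis use case to the front camera; each delivered frame is
// handed to `onFrame`, e.g. the expression feature recognition models.
fun startFaceCapture(context: Context, owner: LifecycleOwner, onFrame: (ImageProxy) -> Unit) {
    val providerFuture = ProcessCameraProvider.getInstance(context)
    providerFuture.addListener({
        val provider = providerFuture.get()
        val analysis = ImageAnalysis.Builder()
            .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
            .build()
        analysis.setAnalyzer(ContextCompat.getMainExecutor(context)) { image ->
            onFrame(image)  // analyze the facial image
            image.close()   // release the frame so the next one can be delivered
        }
        provider.bindToLifecycle(owner, CameraSelector.DEFAULT_FRONT_CAMERA, analysis)
    }, ContextCompat.getMainExecutor(context))
}
```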

In step 411, the terminal 10 analyzes the facial images using at least one pre-deployed expression feature recognition model, obtaining and outputting at least one expression feature present while the user reads the target content.

In some embodiments, for each expression feature recognition model deployed in the terminal 10, the terminal 10 may record in advance the association between the model code of that model and the expression features it can recognize. According to the expression features in the positive expression feature set and the negative expression feature set associated with the target content, the terminal can then selectively start expression feature recognition models to process the facial images.

More specifically, for each expression feature recognition model deployed in the terminal 10, if the positive or negative expression feature set associated with the target content contains one or more expression features that the model can recognize, the terminal 10 may start that model and provide the facial images to it. Conversely, if neither set contains any expression feature that the model can recognize, the terminal 10 need not start that model. This helps to save the computing resources of the terminal 10 and to reduce its load; a sketch of this selection appears below.
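Under the same assumptions as the hypothetical `modelRegistry` shown earlier, the selection reduces to an intersection test:

```kotlin
// Start a recognition model only if it can recognize at least one feature in the
// target content's positive or negative expression feature set.
fun selectModels(
    modelRegistry: Map<String, Set<String>>,
    positiveSet: Set<String>,
    negativeSet: Set<String>
): List<String> =
    modelRegistry.filterValues { recognizable ->
        recognizable.any { it in positiveSet || it in negativeSet }
    }.keys.toList()
```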

In some embodiments, differing from step 411, an expression feature recognition model may analyze the facial image and output the feature code of the corresponding expression feature.

In step 412, the terminal 10 matches the at least one expression feature present while the user reads the target content against the positive expression feature set associated with the target content to obtain a second matching result; matches the at least one expression feature against the negative expression feature set associated with the target content to obtain a third matching result; and obtains the first matching result from the second and third matching results.

In one possible implementation, the positive matching score starts at 0: for each expression feature present while the user reads the target content, if the same expression feature exists in the positive expression feature set, 1 is added to the positive matching score; the final positive matching score is the second matching result. Likewise, the negative matching score starts at 0: for each expression feature present while the user reads the target content, if the same expression feature exists in the negative expression feature set, 1 is subtracted from the negative matching score; the final negative matching score is the third matching result. The sum of the positive matching score and the negative matching score is the first matching result.
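A compact sketch of this scoring rule (feature names are the illustrative ones used throughout this description):

```kotlin
// Compute the first matching result from the observed expression features and the
// positive/negative expression feature sets associated with the target content.
fun firstMatchResult(
    observed: Set<String>,     // expression features present while the user reads
    positiveSet: Set<String>,  // features that should be present for the category
    negativeSet: Set<String>   // features that should not be present
): Int {
    val positiveScore = observed.count { it in positiveSet }   // second matching result
    val negativeScore = -observed.count { it in negativeSet }  // third matching result
    return positiveScore + negativeScore                       // first matching result
}
```

With observed = {"eyebrows pressed down", "eyes wide open", "lip corners pulled down"} and the sets M8 and N8 above, this yields 1 + (-2) = -1, matching the worked example that follows.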

Illustratively, the terminal 10 analyzes the facial images captured while the user reads target content belonging to the "fun" classification using the expression feature recognition models with model codes M001, M002, and M003. The model M001 recognizes that the user's expression features include "eyebrows pressed down", the model M002 recognizes "eyes wide open", and the model M003 recognizes "lip corners pulled down". Matching these three expression features against the positive expression feature set M8 associated with the target content yields a positive matching score of 1 (only "eyes wide open" matches); matching them against the negative expression feature set N8 yields a negative matching score of -2 ("eyebrows pressed down" and "lip corners pulled down" both match). The sum of the two scores is -1, i.e., the first matching result is -1.

It can be understood that the larger the value of the first matching result is, the higher the possibility that the target content belongs to the target classification is; conversely, the smaller the value of the first matching result is, the smaller the probability that the target content belongs to the target classification is.

In some embodiments, differing from step 412, the terminal 10 may match the feature codes corresponding to the at least one expression feature present while the user reads the target content against the positive feature code set and the negative feature code set, respectively, to obtain the first matching result.

In step 413, the terminal 10 transmits the first matching result corresponding to the target content to the server 20.

In step 414, the server 20 determines, according to the first matching result, whether the target content belongs to the target category.

It is understood that the target content may be presented to different users through a plurality of terminals, and the server 20 may receive the first matching result corresponding to the target content from the plurality of different terminals, respectively.

In some embodiments, for a plurality of first matching results corresponding to the target content, if the number of first matching results smaller than a preset threshold (e.g., 0) reaches a first set value, or the proportion of first matching results smaller than the preset threshold reaches a second set value, it may be determined that the target content does not belong to the target category. Otherwise, the server 20 may confirm the target category as the category to which the target content belongs, indicating that the algorithm model correctly predicted the classification.
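A sketch of this aggregation rule; the threshold and the two set values are configurable quantities whose concrete values the original leaves open, so the defaults below are purely illustrative:

```kotlin
// Decide from many terminals' first matching results whether the predicted
// category should be kept or the content routed to manual re-review.
fun belongsToTargetCategory(
    results: List<Int>,
    threshold: Int = 0,          // preset threshold
    firstSetValue: Int = 100,    // absolute-count cut-off (illustrative)
    secondSetValue: Double = 0.3 // proportion cut-off (illustrative)
): Boolean {
    if (results.isEmpty()) return true
    val bad = results.count { it < threshold }
    val reject = bad >= firstSetValue || bad.toDouble() / results.size >= secondSetValue
    return !reject  // false: does not belong; route to manual re-review
}
```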

In some embodiments, after the server 20 determines that the target content does not belong to the target category predicted by the algorithm model, the server 20 may temporarily stop recommending the target content to other terminals, and may label the target content accordingly so that staff can manually review the labeled target content and re-determine its classification.

In some embodiments, during manual review of the labeled content, not only may the category to which the target content belongs be re-determined, but it may also be determined whether the target content is content that is not allowed to be recommended to terminals. If the classification of the target content is re-determined, the server 20 may recommend the target content to the corresponding terminals according to the corresponding recommendation rules; if the target content is determined to be content that is not allowed to be recommended, the server 20 may discard it or apply other business processing, and no longer recommend it to terminals.

Illustratively, the title "middle party talks XX signs the defense agreement: the target content is a target content that is a fortunes or a fortunes of the country, the target content is predicted to belong to the "military" classification by the algorithm model, but after the server 20 receives the first matching results corresponding to the target content from the plurality of terminals 10, the server 20 recognizes that the target content does not belong to the "military" classification predicted by the algorithm model, after the server 20 labels the target content, the target content may be found to belong to the "current administration" classification in the manual review process, and the staff may trigger the server 20 to determine the "current administration" classification as the classification to which the target content belongs. Thereafter, if there is a terminal requesting recommendation of a content belonging to the "hour" category or the server 20 knows that the user holding a specific terminal is interested in the "hour" category, the summary information of the target content may be re-recommended to the terminal.

Illustratively, as shown in fig. 5D, suppose the target content entitled "Is it expensive to buy clothes at the mall? The editor teaches you how to buy clothes economically" is predicted by the algorithm model to belong to the "fun" classification, but after the server 20 receives the first matching results corresponding to the target content from a plurality of terminals 10, the server 20 determines that the target content does not belong to the "fun" classification. After the server 20 labels the target content, manual review may find that the target content is in fact soft advertising copy and belongs to content that is not allowed to be recommended to terminals; staff may then trigger the server 20 to discard the target content or apply other business processing, so that it is no longer recommended to terminals.

In some embodiments, unlike the embodiment shown in fig. 4, the terminal need not deploy an expression feature recognition model. The terminal 10 may transmit the facial images captured while the user reads the target content to the server 20. The server 20 analyzes the facial images from the terminal using the corresponding expression feature recognition models to obtain at least one expression feature present while the user reads the target content. The server 20 may then match the at least one expression feature against the positive expression feature set and the negative expression feature set associated with the target content to obtain the first matching result corresponding to the target content, and finally determine, according to the first matching result, whether the target content belongs to the target classification predicted by the algorithm model.

In some embodiments, unlike the embodiment shown in fig. 4, the terminal 10 may transmit the at least one expression feature to the server 20 after acquiring the at least one expression feature present while the user reads the target content. The server 20 may match the at least one expression feature against the positive expression feature set and the negative expression feature set associated with the target content to obtain the first matching result corresponding to the target content, and finally determine, according to the first matching result, whether the target content belongs to the target classification predicted by the algorithm model.

Based on the same concept as the foregoing method embodiment, an embodiment of the present application further provides a terminal 600. The terminal 600 may perform the operations performed by the terminal in the method embodiment shown in fig. 4. The terminal 600 can include a processor 601, a memory 602, and a transceiver 603. The memory 602 stores computer instructions executable by the processor 601; when the computer instructions are executed by the processor 601, the terminal 600 may perform the operations performed by the terminal in the method embodiment shown in fig. 4. In particular, the processor 601 may perform the data processing operations, and the transceiver 603 may perform the data transmission and/or reception operations.

Based on the same concept as the foregoing method embodiment, an embodiment of the present application further provides a server 700. The server 700 may perform the operations performed by the server in the method embodiment shown in fig. 4. The server 700 may include a processor 701, a memory 702, and a transceiver 703. The memory 702 stores computer instructions executable by the processor 701; when the computer instructions are executed by the processor 701, the server 700 may perform the operations performed by the server in the method embodiment shown in fig. 4. Specifically, the processor 701 may perform the data processing operations, and the transceiver 703 may perform the data transmission and/or reception operations.

It is understood that the processor in the embodiments of the present application may be a Central Processing Unit (CPU), other general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general purpose processor may be a microprocessor, but may be any conventional processor.

The method steps in the embodiments of the present application may be implemented by hardware, or by software instructions executed by a processor. The software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC.

In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in, or transmitted via, a computer-readable storage medium; they may be transmitted from one website, computer, server, or data center to another by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid state disk (SSD)), among others.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the embodiments of the present application.

It should be understood that, in various embodiments of the present application, the sequence numbers of the above-mentioned processes do not imply an order of execution, and the order of execution of the processes should be determined by their functions and inherent logic, and should not limit the implementation processes of the embodiments of the present application.

It will be appreciated that the above-described apparatus embodiments are illustrative, and that the division of the modules/units, for example, is merely one logical division, and that in actual implementation there may be additional divisions, for example, where multiple units or components may be combined or integrated into another system, or where some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The above embodiments are only specific examples of the present application, but the scope of the embodiments of the present application is not limited thereto; any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the embodiments of the present application, and all such changes or substitutions should be covered by the scope of the embodiments of the present application.

Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present application, and do not limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
