Barrage information processing method and device, electronic equipment and storage medium

Document No.: 1524379    Publication date: 2020-02-11

Reading note: This technology, "Barrage information processing method and device, electronic equipment and storage medium" (弹幕信息处理方法、装置、电子设备及存储介质), was designed and created by 肖志婕, 刘平, 涂强, and 况鹰 on 2019-08-12. Its main content is as follows: the embodiments of the present disclosure provide a bullet screen information processing method, a bullet screen information processing device, an electronic device, and a storage medium, relating to the technical field of artificial intelligence and to machine learning technology. The method comprises: acquiring an image to be processed in image data; performing scene recognition on the image to be processed to determine a scene label to which the image to be processed belongs, wherein the scene label is used for representing a scene category to which the image belongs; and taking the selectable bullet screen information associated with the scene label as candidate bullet screen information, and displaying the candidate bullet screen information. The embodiments of the present disclosure guide the user to send bullet screens according to the identified scene label, which improves the utilization rate of bullet screen information; they can trigger appropriate bullet screen information in a targeted manner, avoiding the step of using the platform's generic bullet screen information and improving the accuracy and timeliness of sending bullet screen information.

1. A bullet screen information processing method is characterized by comprising the following steps:

acquiring an image to be processed in image data;

performing scene recognition on the image to be processed to determine a scene label to which the image to be processed belongs, wherein the scene label is used for representing a scene category to which the image belongs;

and taking the selectable bullet screen information associated with the scene label as candidate bullet screen information, and displaying the candidate bullet screen information.

2. The bullet screen information processing method according to claim 1, further comprising:

acquiring an image material and a scene picture category of the image material;

and training a deep learning model according to the image materials and the scene picture categories to obtain an identification model for identifying the scene to which the image to be processed belongs.

3. The bullet screen information processing method according to claim 2, wherein the acquiring of the image material comprises:

determining a characteristic region required by the image material, and taking an image containing the characteristic region as the image material, wherein the characteristic region contains characteristics used for representing the scene category to which the image material belongs.

4. The bullet screen information processing method according to claim 3, wherein taking the image containing the characteristic region as the image material comprises:

intercepting the characteristic region and classifying the characteristic region;

and performing an expansion operation on the characteristic region of each category to obtain new image materials under each category, wherein the expansion operation comprises at least one of contrast transformation and addition of noise information.

5. The bullet screen information processing method according to claim 2, wherein training a deep learning model according to the image material and the scene picture category to obtain a recognition model for recognizing a scene to which the image to be processed belongs comprises:

adjusting the size of the image material;

randomly ordering the adjusted image materials, and performing format conversion on the randomly ordered image materials to obtain a file with a preset format;

and inputting the files in the preset format and the scene picture categories into the deep learning model for training to obtain the recognition model.

6. The bullet screen information processing method according to claim 1, wherein performing scene recognition on the image to be processed to determine the scene tag to which the image to be processed belongs comprises:

if the staying time of the address of the image to be processed in the preset queue exceeds a time threshold, discarding the image to be processed;

and performing multi-frame association identification on the remaining images in the image data to determine the scene labels to which the remaining images belong.

7. The bullet screen information processing method according to claim 6, wherein the performing multi-frame association recognition on the remaining images in the image data comprises:

routing the remaining images belonging to the same image data to the same process using consistent hash routing;

and storing the last identification result for the image data in the same process, and performing the multi-frame association identification on the remaining images in combination with the last identification result.

8. The bullet screen information processing method according to claim 1, wherein displaying the candidate bullet screen information comprises:

and if it is detected that the currently played image in the image data is the same as the target scene picture represented by the scene label, displaying the candidate barrage information.

9. The bullet screen information processing method according to claim 1, wherein displaying the candidate bullet screen information comprises:

determining a target moment corresponding to a target scene picture represented by the scene label;

and if the image data is detected to be played to the target moment, displaying the candidate barrage information.

10. The bullet screen information processing method according to claim 9, wherein determining the target time corresponding to the target scene picture represented by the scene tag comprises:

and acquiring timestamp information of the target scene picture represented by the scene tag, and determining the target time according to the timestamp information, wherein the timestamp information is used for representing the display time of the target scene picture.

11. The bullet screen information processing method according to claim 1, wherein taking selectable bullet screen information associated with the scene tag as candidate bullet screen information, and displaying the candidate bullet screen information comprises:

and generating a plurality of selectable bullet screen information according to the scene label, and determining the candidate bullet screen information from the plurality of selectable bullet screen information for displaying.

12. The bullet screen information processing method according to claim 11, wherein determining the candidate bullet screen information for presentation from the plurality of selectable bullet screen information comprises:

and in response to an operation of a user on a bullet screen control, determining the candidate bullet screen information related to the scene label from the plurality of selectable bullet screen information for displaying.

13. A bullet screen information processing device, characterized by comprising:

the image acquisition module is used for acquiring an image to be processed in the image data;

the scene identification module is used for carrying out scene identification on the image to be processed so as to determine a scene label to which the image to be processed belongs, wherein the scene label is used for representing a scene category to which the image belongs;

and the bullet screen information determining module is used for taking the selectable bullet screen information associated with the scene label as candidate bullet screen information and displaying the candidate bullet screen information.

14. An electronic device, comprising:

a processor; and

a memory for storing executable instructions of the processor;

wherein the processor is configured to execute the bullet screen information processing method of any one of claims 1-12 via execution of the executable instructions.

15. A computer-readable storage medium on which a computer program is stored, the computer program, when being executed by a processor, implementing the bullet screen information processing method according to any one of claims 1 to 12.

Technical Field

The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a bullet screen information processing method, a bullet screen information processing apparatus, an electronic device, and a computer-readable storage medium.

Background

With the development of artificial intelligence technology, the barrage (bullet screen) interaction atmosphere in game live broadcasts has become an important indicator for cultivating highly active live-streaming users.

Disclosure of Invention

The embodiments of the disclosure provide a bullet screen information processing method, a bullet screen information processing device, an electronic device and a computer-readable storage medium, so that appropriate bullet screen information can be triggered at least to a certain extent, the step of using generic bullet screen information is avoided, the accuracy and timeliness of sending bullet screen information are improved, and the pertinence of bullet screen information is also improved.

Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.

According to an aspect of the embodiments of the present disclosure, there is provided a bullet screen information processing method, including: acquiring an image to be processed in image data; performing scene recognition on the image to be processed to determine a scene label to which the image to be processed belongs, wherein the scene label is used for representing a scene category to which the image belongs; and taking the selectable bullet screen information associated with the scene label as candidate bullet screen information, and displaying the candidate bullet screen information.

According to an aspect of an embodiment of the present disclosure, there is provided a bullet screen information processing apparatus including: the image acquisition module is used for acquiring an image to be processed in the image data; the scene identification module is used for carrying out scene identification on the image to be processed so as to determine a scene label to which the image to be processed belongs, wherein the scene label is used for representing a scene category to which the image belongs; and the bullet screen information determining module is used for taking the selectable bullet screen information associated with the scene label as candidate bullet screen information and displaying the candidate bullet screen information.

In some embodiments of the present disclosure, based on the foregoing solution, the apparatus further includes: the material acquisition module is used for acquiring image materials and scene picture categories of the image materials; and the model training module is used for training a deep learning model according to the image material and the scene picture category to obtain an identification model for identifying the scene to which the image to be processed belongs.

In some embodiments of the present disclosure, based on the foregoing solution, the material obtaining module includes: and the material acquisition control unit is used for determining a characteristic region required by the image material and taking an image containing the characteristic region as the image material, wherein the characteristic region contains a characteristic used for representing the scene category of the image material.

In some embodiments of the present disclosure, based on the foregoing, the material acquisition control unit includes: the region classification unit, used for intercepting the characteristic region and classifying the characteristic region; and the image expansion unit, used for performing an expansion operation on the characteristic region of each category to obtain the new image material under each category, wherein the expansion operation comprises at least one of contrast transformation and addition of noise information.

In some embodiments of the present disclosure, based on the foregoing scheme, the model training unit includes: the size adjusting unit is used for adjusting the size of the image material; the format conversion unit is used for randomly sequencing the adjusted image materials and carrying out format conversion on the randomly sequenced image materials to obtain a file with a preset format; and the training control unit is used for inputting the files in the preset format and the scene picture categories into the deep learning model for training so as to obtain the recognition model.

In some embodiments of the present disclosure, based on the foregoing solution, the scene recognition module includes: the image screening unit, used for discarding the image to be processed if the staying time of the address of the image to be processed in a preset queue exceeds a time threshold; and the association identification unit, used for performing multi-frame association identification on the remaining images in the image data so as to determine the scene labels to which the remaining images belong.

In some embodiments of the present disclosure, based on the foregoing scheme, the association identifying unit is configured to: route the remaining images belonging to the same image data to the same process using consistent hash routing; and store the last identification result for the image data in the same process, and perform the multi-frame association identification on the remaining images in combination with the last identification result.

In some embodiments of the present disclosure, based on the foregoing solution, the bullet screen information determining module includes: the first triggering module, used for displaying the candidate barrage information if it is detected that the currently played image in the image data is the same as the target scene picture represented by the scene label.

In some embodiments of the present disclosure, based on the foregoing solution, the bullet screen information determining module includes: the time determining module is used for determining target time corresponding to a target scene picture represented by the scene label; and the second triggering module is used for displaying the candidate barrage information if the image data is detected to be played to the target moment.

In some embodiments of the present disclosure, based on the foregoing, the time determination module is configured to: and acquiring timestamp information of the target scene picture represented by the scene tag, and determining the target time according to the timestamp information, wherein the timestamp information is used for representing the display time of the target scene picture.

In some embodiments of the present disclosure, based on the foregoing solution, the bullet screen information determining module includes: and the selectable bullet screen generating module is used for generating a plurality of selectable bullet screen information according to the scene label and determining the candidate bullet screen information from the plurality of selectable bullet screen information for displaying.

In some embodiments of the present disclosure, based on the foregoing, the selectable barrage generating module is configured to: and responding to the operation of a user on a bullet screen control, and determining the candidate bullet screen information related to the scene label from the plurality of selectable bullet screen information for displaying.

According to an aspect of an embodiment of the present disclosure, there is provided an electronic device including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute any one of the bullet screen information processing methods via execution of the executable instructions.

According to an aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium having a computer program stored thereon, the computer program, when executed by a processor, implementing the bullet screen information processing method of any one of the above.

In the technical solutions provided by some embodiments of the present disclosure, an image to be processed is acquired from image data, and the scene tag of the image to be processed is identified so as to determine the scene category to which the image to be processed belongs; candidate barrage information conforming to the scene tag is then triggered according to the scene tag. According to this technical scheme, on one hand, scene recognition is performed on the image to be processed acquired from the image data to identify its corresponding scene label, so that the user can be guided, according to the determined scene label, to send bullet screens, which improves the utilization rate of bullet screen information. On the other hand, candidate barrage information for the scene type represented by the scene label is triggered automatically according to the scene label, so appropriate barrage information can be triggered in a targeted manner; this avoids the step of using the platform's generic barrage information, improves the accuracy and timeliness of sending barrage information, and also increases the diversity and pertinence of barrage information.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty. In the drawings:

fig. 1 is a schematic diagram showing a system architecture for implementing a bullet screen information processing method in the related art;

FIG. 2 shows a schematic diagram of an exemplary system architecture to which aspects of embodiments of the present disclosure may be applied;

fig. 3 schematically shows a flowchart of a bullet screen information processing method in an embodiment of the present disclosure;

FIG. 4 schematically shows a flow diagram of model training according to one embodiment of the present disclosure;

FIG. 5 schematically shows a flow diagram for obtaining image material according to one embodiment of the present disclosure;

FIG. 6 schematically illustrates an interface diagram for determining a target scene screen according to one embodiment of the present disclosure;

FIG. 7 schematically illustrates an interface diagram for configuring candidate barrage information according to one embodiment of the present disclosure;

fig. 8 schematically shows an interface diagram of a bullet screen information processing method according to an embodiment of the present disclosure;

FIG. 9 schematically illustrates an interaction process according to an embodiment of the present disclosure;

FIG. 10 schematically illustrates an interface diagram of candidate barrage information displayed by a client according to one embodiment of the present disclosure;

fig. 11 schematically shows a block diagram of a bullet screen information processing apparatus in an embodiment of the present disclosure;

FIG. 12 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present disclosure.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.

Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the disclosure.

The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.

The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.

Fig. 1 schematically shows the candidate barrage information presented during a live broadcast by a barrage information processing scheme provided by the inventors. As shown in fig. 1, taking a live video of a game-type anchor as an example, the candidate barrage information provided when the anchor wins is substantially the same every time, specifically the quick barrage pattern for an anchor win shown in fig. 1a. The candidate barrage information provided when the anchor loses is also substantially the same, specifically the quick barrage pattern shown in fig. 1 for when the anchor is defeated. The display of candidate barrage information shown in fig. 1 mainly lets the user select and send the barrage information commonly used on each platform; it has certain limitations and may lead to a poor user experience, because it cannot be accurately targeted at a particular scene picture or a particular moment.

In view of the problems in the foregoing solution, the present disclosure provides a bullet screen information processing method. The bullet screen information processing method of the present disclosure may be used in any processing scene in which bullet screen information can be sent, such as live video, other types of video played on a video website, barrage processing scenes in an image stream, and so on. Barrage (bullet screen) information refers to comments or remarks on image data that a user inputs through a barrage button or other control while watching video or image data such as a live broadcast. The bullet screen information can be displayed at any position in the image data presented by the client, in text form, voice form, or other forms (such as emoticons), and can be displayed according to the trigger time chosen by the user. In the embodiments of the present disclosure, the bullet screen information may be, for example, bullet screen information in a live broadcast, specifically "this wave is stable", or the like.

Fig. 2 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present disclosure can be applied.

As shown in fig. 2, the system architecture 200 may include a first end 201, a network 202, and a second end 203. The first end 201 may be a client; the client may specifically be a terminal device with a display screen, such as a portable computer, a desktop computer, a smart phone, or a smart television, and is configured to display video content by installing an application program or logging in to a website, and to display barrage information on the display screen over the video content. The number of clients may be one or more, and multiple clients may perform the same function. The network 202 serves as a medium for providing a communication link between the first end 201 and the second end 203. The network 202 may include various connection types, such as wired communication links and wireless communication links; in the embodiment of the present disclosure, the network 202 between the first end 201 and the second end 203 may be a wired communication link, for example provided by a serial connection line, or a wireless communication link, for example provided by a wireless network. The second end 203 may be a client or a server, as long as a processor is disposed thereon and it is capable of performing processing operations. When the second end is a client, it may be the same as or different from the client of the first end. When the second end is a server, the server may be a local server or a remote server, another product capable of providing a storage or processing function such as a cloud server, or a server cluster formed by a plurality of servers, and so on; the embodiment of the present disclosure is not particularly limited here.

It should be understood that the number of first ends, networks and second ends in fig. 2 is merely illustrative. There may be any number of first ends, networks, and second ends, as desired for the implementation.

In one embodiment of the present disclosure, a plurality of images to be processed in image data are processed to determine a corresponding target scene picture and a corresponding scene tag; and configuring the bullet screen information according to the specific content of the target scene picture to obtain more accurate candidate bullet screen information aiming at the target scene picture and display the candidate bullet screen information.

It should be noted that the bullet screen information processing method provided in the embodiment of the present disclosure may be completely executed by the second end 203 (server), may also be completely executed by the first end 201 (client), and may also be partially executed by the first end and partially executed by the second end, where an execution subject of the bullet screen information processing method is not particularly limited. Accordingly, the bullet screen information processing device may be disposed in the second end 203 or in the first end 201.

Fig. 3 schematically shows a flowchart of a bullet screen information processing method according to an embodiment of the present disclosure, which is described as an example in the embodiment of the present disclosure in which the bullet screen information processing method is executed by a server. Referring to fig. 3, the bullet screen information processing method at least includes steps S310 to S330, where:

in step S310, an image to be processed in the image data is acquired;

in step S320, performing scene recognition on the image to be processed to determine a scene tag to which the image to be processed belongs, where the scene tag is used to represent a scene category to which the image belongs;

in step S330, the selectable barrage information associated with the scene tag is used as candidate barrage information, and the candidate barrage information is displayed.

According to the technical scheme of the embodiment of the disclosure, on one hand, scene recognition is performed on the image to be processed acquired from the image data to identify the scene label corresponding to the image to be processed, so that the user can be guided, according to the determined scene label, to send bullet screens, which improves the utilization rate of bullet screen information. On the other hand, candidate barrage information for the scene label is triggered automatically according to the scene label, and appropriate barrage information can be triggered in a targeted manner, so that the step of using the platform's generic barrage information is avoided, the accuracy and timeliness of sending barrage information are improved, and the diversity and pertinence of barrage information are also increased.

Next, a bullet screen information processing method in the embodiment of the present disclosure will be described in detail with reference to the drawings.

In step S310, an image to be processed in the image data is acquired.

In an embodiment of the present disclosure, the image data may include, for example, a video stream or a picture stream; the embodiment of the present disclosure takes the image data being a video stream as an example for explanation. A video stream is typically a video segment, which may be of any length, such as one second or several seconds; in general, the video stream may include multiple frames of images to be processed (such as 24 frames per second). In the embodiment of the present disclosure, by content, the image data may be a game video, sports video, movie video, short video, variety video, MV (music video), or the like; by type, the image data may be an internet video stream being played on a video website, a local video stream, or a live video stream in progress on a live platform. The description proceeds with the example of the image data being a live video stream in progress.

In one embodiment of the present disclosure, the video stream may be transmitted so that it can be played and displayed on the operation interface. When the image data is a live video stream on a live platform, the live broadcast may be a game live broadcast, a movie live broadcast, a lifestyle live broadcast, a variety live broadcast, and the like.

Acquiring an image to be processed in image data may specifically mean capturing the image to be processed from the image data, or acquiring an image to be processed from preset image data. The embodiment of the present disclosure is described with the example of capturing the image to be processed from the image data. A corresponding image may be captured from the image data every second as an image to be processed. Specifically, the image for each second may be captured automatically according to a predetermined instruction; for example, the predetermined instruction may be that a time change in the image data is detected, though other instructions may of course also be used. Through step S310, a plurality of images to be processed corresponding to the image data can be obtained.
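As an illustration of the per-second capture described above, the following sketch grabs roughly one frame per second from a video stream as images to be processed. It assumes OpenCV is used; the disclosure does not prescribe a specific capture library, and the function and variable names are hypothetical.

```python
import cv2  # OpenCV; an assumption, the embodiment does not prescribe a capture library


def capture_frames_per_second(stream_url):
    """Yield roughly one frame per second from the video stream as images to be processed."""
    cap = cv2.VideoCapture(stream_url)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25  # fall back to a common frame rate if unknown
    frame_index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_index % int(fps) == 0:  # keep one frame out of every second of video
            yield frame
        frame_index += 1
    cap.release()
```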

In step S320, performing scene recognition on the image to be processed to determine a scene tag to which the image to be processed belongs, where the scene tag is used to represent a scene category to which the image belongs.

In the embodiment of the disclosure, scene recognition may be performed on an image to be processed to obtain a scene corresponding to the image to be processed and a corresponding scene tag. Scene recognition refers to the process of determining to which scene the image to be processed belongs. The scene label is used to indicate the category of the scene in which the image is located, i.e. to which particular scene the image belongs. The scene label may specifically be an identifier corresponding to each type of scene picture, the scene label may be represented by a text identifier or other types of identifiers, and the scene labels corresponding to different types of scenes are different. For example, tag a corresponds to scene category 1, tag B corresponds to scene category 2, and so on.

In the embodiment of the present disclosure, when the scene label describes the scene category in which the image is located, the scene of the scene label may be the scene of a target scene picture. A target scene picture can be understood as a highlight picture, and there are a plurality of different categories of target scene pictures; a highlight picture can specifically be measured by the degree of highlight of the scene picture. The degree of highlight is a relative concept and can be used to reflect how strongly a frame or video stimulates the user's attention, how strongly it guides the user, how rich the information of the video content is, how representative the video content is, and so on. The degree of highlight may be the degree of a state parameter of the virtual object, the degree of an action performed by the virtual object, or the degree of a win or loss in the game, etc., including but not limited to the degree of a speed parameter, the degree of a position parameter, the degree of a skill parameter, a virtual object being defeated after winning, etc. For example, when the virtual object is a racing car, the target scene picture may be one or more of a speed parameter reaching a preset speed, a position parameter reaching a preset position, and a height parameter satisfying a preset height. For another example, when the virtual object is a virtual character, the target scene picture may be the virtual character using a skill or weapon, the virtual character winning, the virtual character being defeated, and so on. Of course, scene labels describing the general scene in which the image is located are also within the scope of the present disclosure. A general scene picture refers to a scene that does not need to be measured by the degree of highlight of the scene picture.

When the scene type of the target scene picture to which the image to be processed belongs and the corresponding scene tag are identified, the identification may be performed according to a recognition model, or in another suitable manner. In order to ensure the accuracy of the recognition result, in this embodiment, before recognition is performed with the recognition model, the recognition model may first be trained, so as to obtain a trained recognition model with better performance.

Fig. 4 schematically shows a flow chart of training a model, and referring to fig. 4, mainly includes the following steps:

in step S410, image material and a scene type of the image material are acquired.

In the embodiment of the present disclosure, the image material refers to images used for training the recognition model, the images may be historical images that have appeared in the game application, and the number of the image material may be plural. The image material can be an image related to the image to be recognized, thereby ensuring the speed and accuracy of the training process. For example, to identify the target scene screen 1 in the game application 1, the image material may be a plurality of images corresponding to the game application 1. The scene classification of the image material is used to specifically indicate which type of scene the scene of the image material belongs to, and the classification may be multiple, for example, the scene classification of the image material 1 is a "winning" scene in the target scene, and the scene classification of the image material 2 is a "failing" scene in the target scene.

Fig. 5 schematically shows a flow chart of acquiring image material, and referring to fig. 5, the flow chart mainly includes the following steps S510 and S520, and step S510 and S520 are specific implementations of step S410, where:

in step S510, a feature region required for the image material is determined.

In the embodiment of the present disclosure, the feature region refers to a region that includes features for characterizing a scene category to which the image material belongs, and it can be determined whether the image material belongs to the target scene picture and which scene category the image material belongs to through these feature regions. The feature areas may include, for example, but are not limited to, areas where skills are located, areas where avatar avatars are located, and so forth. The image may or may not have a characteristic region, and the image is not particularly limited herein. The characteristic region may be determined manually according to a previously identified event belonging to the target scene picture, where the event may include, for example, an event process (e.g., killing) or skill occurring in the game, and may further include a virtual object (a virtual character or hero) and an interactive object (an attack object) related to the event.

For example, referring to fig. 6, region 1, region 2, and region 3 in the display may each be used to determine the characteristic regions required for the image material. After determining the feature region representing the target scene picture, the determined feature region may be labeled. The labeling mode can be a box or other forms, and the position coordinates of each characteristic region can be labeled. It should be noted that, for all images, the feature regions may be manually labeled, so as to facilitate the screening of the images.
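The labeling result described above can be recorded as, for example, region coordinates per image. The record below is a purely illustrative sketch; the field names and values are hypothetical and not prescribed by the disclosure.

```python
# A purely illustrative annotation record for one image material; field names are hypothetical.
annotation = {
    "image": "frame_000123.jpg",
    "scene_category": "win",                 # scene picture category of the image material
    "feature_regions": [                     # labeled boxes, e.g. skill area, avatar area
        {"name": "skill_area",  "bbox": [820, 540, 120, 120]},   # x, y, width, height
        {"name": "avatar_area", "bbox": [40, 20, 160, 60]},
    ],
}
```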

In step S520, an image including the feature region is used as the image material.

In the embodiment of the disclosure, images without a characteristic region have no reference value, so they can be deleted; this reduces the interference of irrelevant images with the acquired material, avoids their influence on the whole training process, and improves accuracy.

For an image having a characteristic region, an image including the characteristic region may be used as an image material. With continuing reference to that shown in fig. 5, the main process of taking an image containing a feature area as an image material includes step S521 and step S522, in which:

in step S521, the feature region is cut out and classified.

In the embodiment of the present disclosure, when the feature region is captured, frames may be captured at a rate of one frame per second using FFmpeg (Fast Forward MPEG). FFmpeg is a set of open-source computer programs that can record and convert digital audio and video and turn them into streams, providing a complete solution for recording, converting, and streaming audio and video. For efficiency, the images containing the feature regions in a plurality of videos can be processed in batch by an automated tool. In classifying, the classification may be by virtual objects, skills, interactive objects, and the like. Meanwhile, a configuration file corresponding to each type can be defined, so as to process target scene pictures with different characteristic regions.

In step S522, an expansion operation is performed on the feature region of each category to obtain new image material under the category, wherein the expansion operation includes at least one of contrast transformation and noise information increase.

In the embodiment of the present disclosure, after the images containing the feature region are classified, an expansion operation may be performed on the classified images of each category. The expansion operation may perform only contrast transformation, only addition of noise information, or both at the same time. The embodiment of the present disclosure takes as an example an expansion operation that includes both contrast transformation and addition of noise information. In particular, the feature region may be contrast-transformed to achieve image enhancement. Contrast transformation is an image processing method that improves image quality by changing the brightness values of image elements and thus the contrast between them; in particular, it can stretch an overly concentrated distribution of brightness values in an image, widen the contrast range, and enhance the layering of the image. For example, the image may be contrast-transformed using the Python imgaug library. The classified characteristic regions of each category can also be expanded by adding noise information, so as to obtain new image materials under each category; adding noise improves robustness. Further, the new image materials under all categories obtained by the expansion operation can be used as the required image materials (i.e., all images containing feature regions, after at least one of contrast transformation and addition of noise information, are used as new image materials), so that the training process of the recognition model can be completed according to the new image materials under each category.
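The Python imgaug library mentioned above can express both expansion operations. The following is a minimal sketch; the specific augmenters, parameters, and file names are illustrative choices rather than values prescribed by the disclosure.

```python
import imageio.v2 as imageio
import imgaug.augmenters as iaa

# Expansion operations: contrast transformation and addition of noise information.
# The augmenters and parameter ranges are illustrative, not prescribed by the disclosure.
augmenter = iaa.Sequential([
    iaa.LinearContrast((0.75, 1.5)),                    # stretch/compress the contrast
    iaa.AdditiveGaussianNoise(scale=(0, 0.05 * 255)),   # add noise to improve robustness
])

image = imageio.imread("feature_region.png")            # one captured feature region
expanded_images = augmenter.augment_images([image] * 4) # several new image materials per source
```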

Next, in step S420, a deep learning model is trained according to the image material and the scene picture category to obtain an identification model for identifying a scene to which the image to be processed belongs.

In the embodiment of the present disclosure, the deep learning model may be any model suitable for identifying the target scene picture, such as a neural network model. The recognition model may be used to identify the scene to which the image to be processed belongs, e.g. whether image to be processed 1 belongs to scene 1 or scene 2, etc. To improve the training speed, the framework used in the embodiment of the present disclosure is Caffe (Convolutional Architecture for Fast Feature Embedding). Caffe is an open-source software framework that provides a basic programming framework, or template framework, for implementing algorithms such as deep convolutional neural networks and deep learning under a GPU-parallel architecture; various convolutional neural network structures can be defined within it, and new code and new algorithms can be added under the framework. The framework works only with convolutional networks, i.e. everything is carried out on models based on convolutional neural networks.

In the context of the Caffe framework, the deep learning model used is a lightweight model, which may include, but is not limited to, MobileNet, SqueezeNet, ShuffleNet, and the like. Such a network deeply compresses the model: the number of parameters is roughly 50 times smaller than that of the AlexNet model, and if the Deep Compression neural network compression technique is added, the compression ratio can reach 461 times. Adopting a lightweight model reduces the space occupied by the network and reduces the amount of computation.

The process of training a deep learning model according to the image materials and scene picture categories to obtain a trained recognition model mainly comprises the following steps. Step one: adjust the size of the image material; specifically, the width and height of the image material may be reset to 256. Step two: randomly order the adjusted image materials, and perform format conversion on the randomly ordered image materials to obtain a file in a preset format. The file in the preset format refers to an LMDB (Lightning Memory-Mapped Database) file in the Caffe framework; all images are converted into a consistent data format, which facilitates reading. Reading LMDB files is more efficient, simultaneous reading by different programs is supported, and disk IO (input/output) is better utilized. Specifically, all images can be converted into an LMDB file by convert_imageset in the Caffe framework. In the process of converting the file format, the image materials can be randomly ordered, and the randomly ordered image materials are format-converted out of order: the order of the images is shuffled rather than following their original order, and after the current image is processed, whichever image is ready next is processed instead of waiting for the image adjacent to the current one. This avoids processor idle time caused by the next image not having arrived after the previous one is processed, improves processing efficiency, and avoids misoperation. Step three: input the file in the preset format and the scene picture categories into the deep learning model to obtain the trained recognition model. Specifically, after format conversion produces the file in the preset format, namely the LMDB file, this file and the scene picture categories corresponding to the image materials can be input into the deep learning model to carry out the training process, so as to obtain the trained recognition model.
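A minimal sketch of steps one and two is shown below, using Caffe's convert_imageset tool to resize, shuffle, and convert the image materials into an LMDB database. The paths, label ids, and flags are illustrative assumptions, and the exact flags should be checked against the installed Caffe version.

```python
import random
import subprocess

# Build a shuffled "image_path label" list file and convert it to an LMDB database with
# Caffe's convert_imageset tool. Paths and label ids are hypothetical examples.
samples = [("win/0001.jpg", 0), ("lose/0001.jpg", 1), ("double_kill/0001.jpg", 2)]
random.shuffle(samples)  # random ordering of the image materials before conversion

with open("train_list.txt", "w") as f:
    for path, label in samples:
        f.write(f"{path} {label}\n")

subprocess.run([
    "convert_imageset",
    "--resize_height=256", "--resize_width=256",  # reset width and height to 256
    "--shuffle",
    "--backend=lmdb",
    "materials/",       # root folder containing the image materials
    "train_list.txt",   # list file with image paths and scene picture categories
    "train_lmdb",       # output LMDB file in the preset format
], check=True)
```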

The specific process of training the recognition model may include: acquiring the manually labeled characteristic regions of a plurality of images, and determining the image materials and the scene picture categories corresponding to the image materials from these labeled images; and training the deep learning model by taking the image materials and the scene picture categories as its input, so as to continuously adjust the weights of the deep learning model until the scene picture category that the deep learning model assigns to an image is consistent with the manually set scene picture category.

In the training process, the GPU machine and the deep learning chip P40 may be used to perform training in a manner of fine tuning the convolution output layer parameters of the deep learning model to adapt to the number of classes to be identified. The number of classes to be identified herein may include the total number of classes of all target scene pictures corresponding to the event or skill, the virtual object related to the event, and the interactive object, so that the target scene picture belonging to which type can be identified after any image is input into the model. Of course, it is within the scope of the embodiments of the present disclosure to use other deep learning chips such as P100.

In the embodiment of the present disclosure, after the trained recognition model is obtained, the to-be-processed image in the image data acquired in step S310 may be input into the recognition model, and the scene to which each to-be-processed image belongs may be analyzed to determine which scene label in the target scene picture the to-be-processed image belongs to or represents which target scene picture. The target scene pictures in a game are preset, and the target scene pictures can include a plurality of different types, such as virtual character weapon use, virtual character skill use, virtual character jump, virtual character win, virtual character failure, and the like.

The recognition model can be a SqueezeNet model, whose structure can comprise a convolutional layer, Fire modules, a convolutional layer, and softmax. The Fire module is composed of two parts: a compression (squeeze) convolutional layer and an expansion (expand) convolutional layer. The squeeze part is a convolutional layer with 1 × 1 convolution kernels, and the following expand part is formed by two convolutional layers, with 1 × 1 and 3 × 3 kernels respectively. After passing through the squeeze and expand layers, the obtained feature maps are stitched together by a concat operation. On this basis, a characteristic region in the image to be processed can first be extracted; the SqueezeNet model then applies a first convolutional layer to the characteristic region, extracts features through the squeeze and expand layers in turn, and performs the subsequent convolution and normalization steps to determine the scene label of the image to be processed. The trained recognition model can thus give the probability that the input image to be processed belongs to a certain scene label, i.e. which type of target scene picture the image belongs to. For example, when the probability that image to be processed 1 belongs to target scene picture 1 is greater than a certain threshold, for example 0.8, image to be processed 1 may be considered to correspond to target scene picture 1. Further, a scene tag, such as tag A, may be added to the target scene picture. The scene tag indicates the scene type to which the target scene corresponding to each image to be processed belongs; the target scene may include, but is not limited to, a single-kill scene, a drift scene, a defeat scene, or a victory scene, for example.
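The Fire module structure described above (a 1 × 1 squeeze layer followed by parallel 1 × 1 and 3 × 3 expand layers whose outputs are concatenated) can be sketched as follows. This is written in PyTorch purely for illustration; the embodiment itself defines the network under the Caffe framework, and the channel counts are assumptions.

```python
import torch
import torch.nn as nn


class Fire(nn.Module):
    """Fire module: 1x1 squeeze layer, then parallel 1x1 and 3x3 expand layers, concatenated.
    Illustrative PyTorch rendering only; the embodiment defines this structure under Caffe."""

    def __init__(self, in_channels, squeeze_channels, expand_channels):
        super().__init__()
        self.squeeze = nn.Conv2d(in_channels, squeeze_channels, kernel_size=1)
        self.expand1x1 = nn.Conv2d(squeeze_channels, expand_channels, kernel_size=1)
        self.expand3x3 = nn.Conv2d(squeeze_channels, expand_channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.squeeze(x))
        # concat the two expand branches along the channel dimension
        return torch.cat([self.relu(self.expand1x1(x)),
                          self.relu(self.expand3x3(x))], dim=1)
```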

In the embodiment of the disclosure, the SqueezeNet model deeply compresses the model, and its number of parameters is roughly 50 times smaller than that of the AlexNet model; with the Deep Compression neural network compression technique added, the compression ratio can reach 461 times. Adopting the lightweight model reduces the space occupied by the network and the amount of computation, and allows the target scene picture and the scene label corresponding to the image to be processed to be identified more accurately and more quickly.

It should be noted that, in the process of performing recognition with the trained recognition model, image processing is a time-consuming operation; if image recognition is performed directly and synchronously, it takes a long time, system throughput is low, and a failed recognition causes the whole chain to fail. Therefore, scene recognition can be performed on the image to be processed through a preset queue. The preset queue refers to RocketMQ, a distributed, queue-model message middleware. After RocketMQ is introduced, the application can be decoupled, so that a failed interface call no longer causes the whole process to fail; images can be processed asynchronously, which reduces the bottleneck of synchronous blocking in the system, reduces delay, and increases throughput; and traffic peaks can be clipped, preventing the application system from being brought down by excessive traffic.

The image recognition process through the preset queue specifically comprises the following steps: if the staying time of the address of the image to be processed in the preset queue exceeds a time threshold, discarding the image to be processed; and performing multi-frame association identification on the remaining images in the image data to determine the scene labels to which the remaining images belong. That is, a service daemon started at system boot may consume the messages in the preset queue, and after a message is taken out of the preset queue, it is sent to the service SVR dedicated to image processing. During image processing, the service daemon also performs rate limiting, and at the same time determines the staying time of the address of the image to be processed in the preset queue, i.e. the time its URL (Uniform Resource Locator) has remained in the queue. A URL is the address of a standard resource on the internet; each file on the internet has a unique URL containing information that indicates the location of the file and how the browser should handle it. If the address of a certain image to be processed has stayed in the preset queue for too long, for example beyond a time threshold of 5 seconds, the image to be processed may be considered delayed, or it may already have been displayed; it can therefore be discarded directly without recognition processing, which preserves recognition accuracy and reduces the amount of computation.
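The discard-if-stale logic described above can be sketched as follows. The queue and service interfaces here are generic placeholders, not the actual RocketMQ client API or the embodiment's service SVR; the message fields and names are assumptions.

```python
import time

STAY_TIME_THRESHOLD = 5.0  # seconds; the time threshold mentioned above


def consume(message_queue, recognition_svr):
    """Consume image-URL messages from the preset queue, dropping stale ones.

    `message_queue` and `recognition_svr` stand in for the RocketMQ consumer and the
    image-processing service SVR; their interfaces here are hypothetical.
    """
    for message in message_queue:                    # e.g. {"url": ..., "enqueue_ts": ...}
        stay_time = time.time() - message["enqueue_ts"]
        if stay_time > STAY_TIME_THRESHOLD:
            continue                                  # image is delayed or already shown: discard
        recognition_svr.process(message["url"])       # hand over for scene recognition
```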

The remaining images refer to the images to be processed that remain in the entire image data after discarded images are removed. Multi-frame association identification can be performed on the remaining images in the image data, and the process comprises: routing the remaining images belonging to one piece of image data to the same process using consistent hash routing; and storing the last identification result for that image data in the same process, and performing the multi-frame association identification on the remaining images in combination with the last identification result. In the embodiment of the disclosure, multiple frame identifications are associated in a consistent and orderly manner, rather than processing each frame in isolation, so as to maintain the continuity of video identification. To support multi-frame association identification, consistent hash routing can be used: images to be processed belonging to the same image data (such as the same live video stream) are consistently routed to the same process of the same service SVR, and the last identification result for that image data is stored in the same process, so that association identification can be performed. Specifically, to achieve consistent hash routing, the consistent hashing of L5 may be used, with the pid (Packet Identifier) of the live broadcast program of the image data as the primary key for consistent hashing, ensuring that the same image data is routed to the same service SVR and thus enabling multi-frame association identification. The last recognition result refers to the last recognized target scene picture of each category, and can be used to confirm and check the recognition results of the remaining images. For example, when a killing skill is executed, a threshold may be set, and the killing operation is considered valid only when the number of interactive objects killed by the virtual object is greater than the threshold. By comparing the last recognition result of the same image data with the recognition results of the remaining images, whether a recognition result is valid can be accurately determined, which improves recognition accuracy.
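The routing idea can be illustrated with the following sketch. For brevity it uses a plain hash-mod mapping rather than the L5 consistent hashing used in the embodiment (true consistent hashing additionally limits re-mapping when workers are added or removed); names are hypothetical.

```python
import hashlib


def route_to_process(pid, processes):
    """Route all frames of one live stream (keyed by its pid) to the same worker process.

    Simplified stand-in for the embodiment's L5 consistent hashing: the hash of the pid
    deterministically selects one process, so frames of the same image data always land
    on the same process, which can keep the last recognition result in memory and perform
    multi-frame association recognition.
    """
    digest = hashlib.md5(str(pid).encode("utf-8")).hexdigest()
    return processes[int(digest, 16) % len(processes)]


workers = ["svr-process-0", "svr-process-1", "svr-process-2"]
# The same pid always maps to the same process.
assert route_to_process("room_42", workers) == route_to_process("room_42", workers)
```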

In the embodiment of the disclosure, a recognition engine based on OpenCV (Open Source Computer Vision Library) can load the trained recognition model, and the recognition environment is deployed on a V4-8-100CPU machine for image recognition. Since image processing takes about 50-100 ms per image, the images to be processed obtained every second can essentially be processed in real time. For example, referring to fig. 6, with the trained recognition model it can be determined that the image to be processed acquired at 12:00 on June 18, 2019 belongs to the target scene picture "double kill", and the scene label of the image to be processed is determined to be "double kill".
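Loading a Caffe-trained model with OpenCV's dnn module and classifying one image to be processed could look like the sketch below. The file names, input size, label list, and threshold are illustrative assumptions rather than the embodiment's actual configuration.

```python
import cv2
import numpy as np

# Load the Caffe-trained recognition model with OpenCV's dnn module; file names are hypothetical.
net = cv2.dnn.readNetFromCaffe("squeezenet_deploy.prototxt", "squeezenet_iter_10000.caffemodel")
labels = ["double_kill", "drift", "defeat", "victory"]   # illustrative scene labels

image = cv2.imread("frame_000123.jpg")                   # one image to be processed
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(227, 227), mean=(104, 117, 123))
net.setInput(blob)
probabilities = net.forward().flatten()

best = int(np.argmax(probabilities))
scene_label = None
if probabilities[best] > 0.8:            # threshold from the example above
    scene_label = labels[best]           # scene label of the image to be processed
```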

Next, in step S330, the selectable barrage information associated with the scene tag is used as candidate barrage information, and the candidate barrage information is displayed.

In the embodiment of the disclosure, after the target scene picture corresponding to the image to be processed is identified, a corresponding scene tag may be added to the target scene picture; then, selectable barrage information corresponding to the target scene picture represented by the scene tag may be generated according to the scene tag, and candidate barrage information is determined from the selectable barrage information and displayed. The selectable barrage information generated here may be a barrage set corresponding to a category of target scene picture or to a scene label. Specifically, the user can be offered a plurality of quick barrage tags and barrage guide words; for different scene labels, the corresponding quick barrage tags and barrage guide words may be the same or different. The barrage guide words and quick barrage tags can be configured according to the specific content of the target scene picture. Referring to fig. 7, when the target scene picture is a "double kill", the barrage guide word may be "Anchor got a double kill, 6?" and the quick barrage tags may be "this wave is stable", "6666", and the like. When the quick barrage information is configured, its quantity may or may not be limited; the quantity is not particularly limited here. The candidate barrage information finally displayed on the operation interface of the client can be part or all of the generated selectable barrage information, and can specifically be triggered and selected by the user.

Further, whether an operation of the user on a bullet screen control has been received can be detected; the bullet screen control may be at least one of the area where the plurality of shortcut bullet screen labels are located and the area where the bullet screen guide words are located. The operation of the user on the bullet screen control may be a click, a double click, a press, or the like. When an operation of the user on the bullet screen control is detected, part or all of the configured selectable bullet screen information is selected, in response to the operation, from all the selectable bullet screen information and used as the candidate bullet screen information of the scene label for display.

The candidate barrage information may be displayed in either of two ways: in the first way, the display is triggered according to the target scene picture corresponding to the scene label; in the second way, the display is triggered according to the target time of the target scene picture corresponding to the scene label.

First, the first way is described. Image data is generally played at the client with a delay of more than one second; that is, the same scene appears in the background first and is displayed at the client more than one second later. Based on this, in the process of playing the image data, the currently played image is captured to identify whether it is the target scene picture represented by the scene label. In other words, it may be determined whether the currently played image of the image data is the same as the target scene picture, or whether the image data has been played to the determined target scene picture, so as to trigger the bullet screen information according to the determination result. Specifically, this includes the following two cases. In case one, if the image data is played to the target scene picture represented by the scene tag, the candidate barrage information associated with that target scene picture is displayed. That is, if the currently played image in the image data is the same as the target scene picture represented by the determined scene tag, the identified target scene picture has been reached, and the candidate barrage information corresponding to the scene tag may be triggered at this time.

When the candidate barrage information is triggered, a plurality of pieces of selectable barrage information for the target scene picture, such as "this wave is steady" and "6666", may be acquired first. Further, whether an operation of the user on the bullet screen control has been received can be detected. When such an operation is detected, part or all of the configured selectable bullet screen information is selected, in response to the operation, from all the selectable bullet screen information and used as the candidate bullet screen information of the scene label for display. For example, if a click on a certain piece of selectable bullet screen information is detected, for example on the area where shortcut bullet screen label 1 is located, shortcut bullet screen label 1 may be triggered as candidate bullet screen information conforming to the target scene picture represented by the scene label, and the candidate bullet screen information may be displayed at any position of the operation interface.

In case two, if the image data has not been played to the target scene picture, the candidate barrage information associated with the scene label is not triggered. That is, if the currently played image of the image data differs from the target scene picture, the currently played image can be considered not to belong to the target scene picture represented by the scene tag; the candidate bullet screen information conforming to the scene tag is therefore not triggered, although generic bullet screen information or the like may be triggered instead.
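The first triggering way can be summarised by a small sketch (a hypothetical helper, not the disclosed implementation): the tag-specific candidates are returned only when the currently played frame matches the recognised target scene picture, otherwise the generic set is used.

```python
def trigger_by_frame(current_frame_label: str, scene_tag: str,
                     candidates: list, generic: list) -> list:
    """Way one: trigger tag-specific barrages only while the played frame
    matches the target scene picture represented by the scene tag."""
    if current_frame_label == scene_tag:
        return candidates   # case one: the target scene picture is being played
    return generic          # case two: fall back to platform-generic barrages
```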

In the embodiment of the disclosure, on one hand, appropriate bullet screen information can be triggered for target scene pictures represented by different scene labels, so that the step of using the bullet screen information universal for the platform is avoided, the diversity and pertinence of the bullet screen information are increased, and the user experience is further improved. On the other hand, when the currently played image in the image data is consistent with the target scene picture, the appropriate bullet screen information can be triggered in time according to the target scene picture, so that the accuracy and timeliness of sending the bullet screen information are improved.

Next, the second way is described. Since timestamp information, or another identifier indicating sequence (e.g., the frame number of the image), may be set in each video stream, the candidate barrage information may be displayed according to the frame number of the image, the playing time, or the like. The embodiment of the present disclosure takes as an example the case where the candidate bullet screen information is triggered by the playing time of the image data. Specifically, after the scene tag of the image to be processed is identified in step S320, the target time corresponding to the target scene picture represented by the scene tag may be determined and issued, so that barrage information for that target scene picture is automatically triggered according to whether the image data has been played to the target time. In this way, suitable bullet screen information can be triggered for the currently played image, the step of using platform-generic bullet screen information is avoided, the accuracy and timeliness of sending bullet screen information are improved, the diversity and pertinence of the bullet screen information are increased, and the user experience is further improved.

Specifically, similar to the triggering manner according to the target scene picture, in the process of playing the image data, the image to be processed in the image data may be identified to determine which target scene picture's scene label the image to be processed belongs to. After the scene label is determined, the target time of the target scene picture corresponding to the image to be processed is further acquired. The target time may be determined according to the timestamp information of the target scene picture in the image data, where the timestamp information indicates the display time of the target scene picture and may specifically be a character sequence that uniquely identifies a moment in time. For example, the timestamp of the image data being played may be 12:00 on June 18, 2019; as another example, if it is determined that the image to be processed at 12:00 on June 18, 2019 belongs to the target scene picture "two continuous shooting broken", the target time is determined to be 12:00 on June 18, 2019.
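As a sketch (the metadata field names are assumptions), the target time could simply be read from the timestamp information of the frame in which the target scene picture was recognised:

```python
from datetime import datetime


def determine_target_time(frame_meta: dict) -> datetime:
    """Derive the target time of a recognised target scene picture from the
    frame's timestamp information; the metadata layout is hypothetical."""
    # e.g. frame_meta = {"pid": "room_42", "timestamp": "2019-06-18T12:00:00"}
    return datetime.fromisoformat(frame_meta["timestamp"])
```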

In the process of playing the image data, after the current playing time of the image data is determined, it can be compared with the determined target time to determine whether the image data has been played to the target time, that is, whether the current playing time coincides with the timestamp information of the target time. The candidate barrage information may then be triggered according to the determination result, which includes the following two cases. In case one, if the image data is played to the target time, the candidate barrage information associated with the target scene picture represented by the scene label is triggered. That is, if the currently played timestamp information in the image data is the same as the target time, the image data has been played to the identified target time, and the candidate bullet screen information of the target scene picture corresponding to the scene tag may be triggered at this time.

When the candidate barrage information is triggered, a plurality of pieces of selectable barrage information for the target time, such as "this wave is steady" and "6666", may be acquired first. Further, when an operation of the user on the bullet screen control is detected, the selectable bullet screen information corresponding to the bullet screen control is triggered as candidate bullet screen information conforming to the target scene picture, and the bullet screen information is displayed.

In case two, if the image data has not been played to the target time, that is, the current playing time of the image data has not reached the target time or has already passed it, the candidate bullet screen information associated with the target scene picture represented by the scene label is not triggered. If the current playing time of the image data differs from the target time, the scene picture being played can be considered not to belong to the target scene picture, so the candidate barrage information conforming to the scene label is not triggered, although generic barrage information or the like may be triggered at this time.
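The second triggering way could look like the following sketch (the one-second tolerance is an assumption; the source only requires the playing time to coincide with the target time):

```python
from datetime import datetime


def trigger_by_time(current_play_time: datetime, target_time: datetime,
                    candidates: list, generic: list,
                    tolerance_s: float = 1.0) -> list:
    """Way two: trigger tag-specific barrages when playback reaches the
    target time, within a small tolerance for timestamp jitter."""
    if abs((current_play_time - target_time).total_seconds()) <= tolerance_s:
        return candidates   # case one: the target time has been reached
    return generic          # case two: before or after the target time
```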

In the embodiment of the present disclosure, the determined target time may be received, and whether to trigger the candidate barrage information of the target scene picture represented by the scene tag is decided according to whether the image data has been played to that target time. On the one hand, when the current playing time coincides with the target time, suitable bullet screen information can be triggered for the target scene picture, the step of using platform-generic bullet screen information is avoided, the diversity and pertinence of the bullet screen information are increased, and the user experience is further improved. On the other hand, the bullet screen information can be triggered promptly when the image data is played to the target time, which improves the accuracy and timeliness of sending bullet screen information.

It should be noted that, for a live video stream, the candidate bullet screen information conforming to the target scene picture may be triggered purely according to the identified target scene picture, that is, when the image currently played by the image data is that target scene picture. Alternatively, the target scene picture of the image to be processed may be identified, the target time determined when the target scene picture is identified, and the candidate barrage information conforming to the target scene picture triggered when the current playing time of the image data coincides with the target time. Triggering by the target time improves efficiency and accuracy, since whether to trigger the candidate barrage information can be decided simply from whether the current playing time has reached the target time.

For a website video stream, the candidate bullet screen information conforming to the target scene picture may likewise be triggered when the image currently played by the image data is the target scene picture; or it may be triggered when the current playing time of the image data coincides with the target time, without identifying and detecting the target scene picture at playback. This is not particularly limited here.

Fig. 8 schematically shows an interface diagram of the whole bullet screen information processing method. Referring to fig. 8, the moment is identified first, the corresponding bullet screen guide word and shortcut bullet screen label are then popped up, and finally the shortcut bullet screen is displayed. Fig. 9 schematically shows the whole interaction process. Referring to fig. 9, the client identifies the moments at which the image data is played, where a moment can be understood as a certain target scene picture (i.e. the scene category of a highlight moment); there may be N highlight moments, such as moment 1, moment 2, moment 3, up to moment N. The server determines the bullet screen guide word and the shortcut bullet screen label associated with each moment; for example, moment 1 corresponds to bullet screen guide word 1 and shortcut bullet screen label 1, moment 2 corresponds to bullet screen guide word 2 and shortcut bullet screen label 2, moment 3 corresponds to bullet screen guide word 3 and shortcut bullet screen label 3, and moment N corresponds to bullet screen guide word N and shortcut bullet screen label N. The user triggers, at the client, bullet screen information composed of the shortcut bullet screen label and the bullet screen guide word, so that the bullet screen information is displayed on the operation interface of the client; the operation interface that finally displays the bullet screen information may be as shown in fig. 10. Referring to fig. 10, for different target scene pictures the corresponding shortcut bullet screen labels and bullet screen guide words may differ, so as to better match the current scene picture and make the bullet screen information more accurate.

According to the above technical scheme, the scene label of the target scene picture of the image to be processed in the live stream is determined by means of image recognition, and the candidate barrage information conforming to the scene label is triggered when the image currently played in the live stream matches the target scene picture. This guides users toward sending bullet screens and reduces the cost of sending them, improves the interactive atmosphere of the live broadcast room, and has a positive influence on the activity and user stickiness of the live broadcast room. In addition, in the embodiment of the disclosure no game-specific interface development is required: the capability is generic across the whole platform and can be continuously extended to various live game scenes, which improves convenience, universality, and range of application.

The following describes embodiments of the apparatus of the present disclosure, which may be used to execute the bullet screen information processing method in the above embodiments of the present disclosure. For details that are not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the bullet screen information processing method described above in the present disclosure.

Fig. 11 schematically shows a block diagram of a bullet screen information processing apparatus according to one embodiment of the present disclosure.

Referring to fig. 11, a bullet screen information processing apparatus 1100 according to one embodiment of the present disclosure includes: the system comprises an image acquisition module 1101, a scene recognition module 1102 and a bullet screen information determination module 1103. The image acquiring module 1101 is configured to acquire an image to be processed in the image data; a scene recognition module 1102, configured to perform scene recognition on the image to be processed to determine a scene tag to which the image to be processed belongs, where the scene tag is used to represent a scene category to which the image belongs; a bullet screen information determining module 1103, configured to use the selectable bullet screen information associated with the scene tag as candidate bullet screen information, and display the candidate bullet screen information.
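For orientation only, a minimal Python skeleton of the three modules might look as follows; the method names and the `classify`/`config` collaborators are assumptions, not part of the disclosure.

```python
class ImageAcquisitionModule:
    """Corresponds to module 1101: obtains the images to be processed."""

    def acquire(self, frames, interval: int = 30):
        # e.g. sample one frame out of every `interval` frames of the image data
        return frames[::interval]


class SceneRecognitionModule:
    """Corresponds to module 1102: maps an image to its scene tag."""

    def __init__(self, classify):
        self._classify = classify      # callable: frame -> scene label

    def recognise(self, frame) -> str:
        return self._classify(frame)


class BarrageInformationModule:
    """Corresponds to module 1103: selects the candidate barrages to display."""

    def __init__(self, config):
        self._config = config          # scene tag -> selectable barrage info

    def candidates(self, scene_tag: str):
        return self._config.get(scene_tag, [])
```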

In some embodiments of the present disclosure, the apparatus further comprises: the material acquisition module is used for acquiring image materials and scene picture categories of the image materials; and the model training module is used for training a deep learning model according to the image material and the scene picture category to obtain an identification model for identifying the scene to which the image to be processed belongs.

In some embodiments of the present disclosure, the material obtaining module includes: and the material acquisition control unit is used for determining a characteristic region required by the image material and taking an image containing the characteristic region as the image material, wherein the characteristic region contains a characteristic used for representing the scene category of the image material.

In some embodiments of the present disclosure, the material acquisition control unit includes: the region classification unit is used for intercepting the characteristic region and classifying the characteristic region; and the image expansion unit is used for performing expansion operation on the characteristic region of each category to obtain the new image material under each category, and the expansion operation comprises at least one of contrast conversion and noise information increase.
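A sketch of the expansion operation, under the assumption that a feature region is a NumPy image array; the concrete contrast factors and noise level are illustrative values only.

```python
import numpy as np


def expand_material(region: np.ndarray) -> list:
    """Generate new image materials from one feature region by contrast
    conversion and by adding noise information."""
    variants = []

    # Contrast conversion: scale pixel values around the region mean.
    mean = region.mean()
    for alpha in (0.8, 1.2):
        converted = np.clip((region - mean) * alpha + mean, 0, 255)
        variants.append(converted.astype(np.uint8))

    # Noise addition: superimpose Gaussian noise (sigma is an assumed value).
    noise = np.random.normal(0.0, 8.0, region.shape)
    noisy = np.clip(region.astype(np.float64) + noise, 0, 255)
    variants.append(noisy.astype(np.uint8))

    return variants
```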

In some embodiments of the present disclosure, the model training module includes: a size adjusting unit, configured to adjust the size of the image material; a format conversion unit, configured to randomly sequence the adjusted image materials and convert the randomly sequenced image materials into a file in a preset format; and a training control unit, configured to input the file in the preset format and the scene picture categories into the deep learning model for training, so as to obtain the trained recognition model.
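The preprocessing chain of the size adjusting, format conversion and training control units could be sketched as follows; the choice of `.npz` as the "preset format" and the 224x224 size are assumptions made for illustration.

```python
import random

import cv2
import numpy as np


def prepare_training_set(materials, labels, size=(224, 224), out_path="train.npz"):
    """Resize the image materials, randomly sequence them, and write them out
    in a single preset file format ready for model training."""
    resized = [cv2.resize(img, size) for img in materials]
    order = list(range(len(resized)))
    random.shuffle(order)                       # random sequencing of the materials
    images = np.stack([resized[i] for i in order])
    targets = np.array([labels[i] for i in order])
    np.savez(out_path, images=images, labels=targets)
    return out_path
```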

In some embodiments of the present disclosure, the scene recognition module comprises: the image screening unit is used for discarding the image to be processed if the staying time of the address of the image to be processed in a preset queue exceeds a time threshold; and the association identification unit is used for carrying out multi-frame association identification on the residual images in the image data so as to determine the scene labels to which the residual images belong.
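The screening step could be sketched as below; the two-second threshold is an assumed value, since the source only speaks of a time threshold.

```python
import time
from collections import deque

STALE_THRESHOLD_S = 2.0  # assumed value for the dwell-time threshold


class FrameQueue:
    """Queue of (enqueue_time, image_address); stale entries are discarded."""

    def __init__(self):
        self._queue = deque()

    def push(self, image_address: str):
        self._queue.append((time.monotonic(), image_address))

    def pop_fresh(self):
        """Return the next address whose dwell time is within the threshold,
        discarding any address that has waited too long in the queue."""
        while self._queue:
            enqueued_at, address = self._queue.popleft()
            if time.monotonic() - enqueued_at <= STALE_THRESHOLD_S:
                return address
            # Dwell time exceeded: drop the frame to keep recognition near real time.
        return None
```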

In some embodiments of the present disclosure, the association identification unit is configured to: routing the remaining images belonging to the same image data to the same process using consistent hash routing; and storing the last identification result aiming at the image data in the same process, and carrying out the multi-frame association identification on the rest images by combining the last identification result.

In some embodiments of the present disclosure, the barrage information determining module includes: and the first triggering module is used for displaying the candidate barrage information if the situation that the currently played image in the image data is the same as the target scene picture represented by the scene label is detected.

In some embodiments of the present disclosure, the barrage information determining module includes: the time determining module is used for determining target time corresponding to a target scene picture represented by the scene label; and the second triggering module is used for displaying the candidate barrage information if the image data is detected to be played to the target moment.

In some embodiments of the present disclosure, the time of day determination module is configured to: and acquiring timestamp information of the target scene picture represented by the scene tag, and determining the target time according to the timestamp information, wherein the timestamp information is used for representing the display time of the target scene picture.

In some embodiments of the present disclosure, the barrage information determining module includes: and the selectable bullet screen generating module is used for generating a plurality of selectable bullet screen information according to the scene label and determining the candidate bullet screen information from the plurality of selectable bullet screen information for displaying.

In some embodiments of the present disclosure, the selectable barrage generation module is configured to: and responding to the operation of a user on a bullet screen control, and determining the candidate bullet screen information related to the scene label from the plurality of selectable bullet screen information for displaying.

FIG. 12 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present disclosure.

It should be noted that the computer system 1200 of the electronic device shown in fig. 12 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.

As shown in fig. 12, the computer system 1200 includes a Central Processing Unit (CPU) 1201 that can perform various appropriate actions and processes in accordance with a program stored in a Read-Only Memory (ROM) 1202 or a program loaded from a storage section 1208 into a Random Access Memory (RAM) 1203. In the RAM 1203, various programs and data necessary for system operation are also stored. The CPU 1201, the ROM 1202, and the RAM 1203 are connected to each other by a bus 1204. An Input/Output (I/O) interface 1205 is also connected to the bus 1204.

The following components are connected to the I/O interface 1205: an input section 1206 including a keyboard, a mouse, and the like; an output section 1207 including a Display device such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 1208 including a hard disk and the like; and a communication section 1209 including a network interface card such as a LAN (Local area network) card, a modem, or the like. The communication section 1209 performs communication processing via a network such as the internet. A driver 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 1210 as necessary, so that a computer program read out therefrom is mounted into the storage section 1208 as necessary.

In particular, the processes described above with reference to the flowcharts may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1209, and/or installed from the removable medium 1211. When executed by the Central Processing Unit (CPU) 1201, the computer program performs the various functions defined in the system of the present application. In some embodiments, the computer system 1200 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.

Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.

Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.

Machine Learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how a computer can simulate or implement human learning behaviors in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way of giving computers intelligence, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.

In the early traditional machine learning era, people need to carefully design how to extract useful features from data, design an objective function aiming at a specific task, and then build a machine learning system by using some universal optimization algorithms. After the rise of deep learning, people largely do not rely on well-designed features, but let neural networks learn useful features automatically. In the embodiment of the disclosure, the features can be automatically extracted according to the neural network, and the scene label of the image to be processed is identified, so that the bullet screen information associated with the scene label is automatically triggered to be displayed.

It should be noted that the computer readable medium shown in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software, or may be implemented by hardware, and the described units may also be disposed in a processor. Wherein the names of the elements do not in some way constitute a limitation on the elements themselves.

As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the method described in the above embodiments.

It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functionality of two or more of the modules or units described above may be embodied in a single module or unit; conversely, the features and functions of one module or unit described above may be further divided among a plurality of modules or units.

Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
