Video image identification method, device, server, terminal and storage medium

Document No.: 1465588    Publication date: 2020-02-21

Reading note: This technology, "Video image identification method, device, server, terminal and storage medium", was designed and created by 刘亦明, 张昕宇, 杨慈航 and 禹慧军 on 2019-11-05. Its main content is as follows: the disclosure provides a video image identification method and apparatus, a server, a terminal and a storage medium. The video image identification method comprises the following steps: receiving instruction information for acquiring the video image; extracting a corresponding video image from a video file according to the instruction information; identifying information in the video image and uploading the information; and receiving and displaying the retrieved matching information of the information. The video image identification method of the disclosure can perform image identification on the people and objects appearing in a video and search for similar results, making it convenient for users to obtain information about the commodities and people in the video.

1. A method of identifying video images, comprising:

receiving instruction information for acquiring the video image;

extracting a corresponding video image from a video file according to the instruction information;

identifying information in the video image and uploading the information; and

receiving and displaying the retrieved matching information of the information.

2. The identification method according to claim 1, wherein the step of receiving instruction information for acquiring the video image comprises:

receiving frame extraction instruction information; and

recording a timestamp at which the frame extraction instruction information is sent.

3. The method according to claim 2, wherein the step of extracting the corresponding video image from the video file according to the instruction information comprises:

extracting the video image corresponding to the timestamp from the video file according to the timestamp.

4. The identification method according to claim 3, wherein the video image includes a main body portion and a background portion;

the step of identifying information in the video image and uploading the information comprises:

identifying the main body portion in the video image and uploading the main body portion; or

identifying label information of the main body portion in the video image, and uploading the label information.

5. The method according to claim 1, wherein the step of receiving instruction information for acquiring the video image further comprises:

receiving screenshot instruction information; and

recording a timestamp at which the screenshot instruction information is sent and screenshot coordinates carried by the screenshot instruction information.

6. The method according to claim 5, wherein the step of extracting the corresponding video image from the video file according to the instruction information further comprises:

extracting a video image corresponding to the timestamp from the video file according to the timestamp, and acquiring a partial screenshot of the video image according to the screenshot coordinates.

7. The identification method according to claim 6, wherein the partial screenshot comprises a main body portion and a background portion;

the step of identifying information in the video image and uploading the information further comprises:

identifying the main body portion of the partial screenshot in the video image and uploading the main body portion; or

identifying label information of the main body portion of the partial screenshot in the video image, and uploading the label information.

8. The identification method according to any one of claims 1-7, characterized in that the method further comprises:

receiving other information except the matching information in the global matching information of the video file; and

displaying the other information after the matching information.

9. A method of identifying video images, comprising:

receiving a frame of video image in a video file; or

receiving a main body portion of the video image; or

receiving signature information of the main body portion of the video image;

retrieving matching information of the video image, the main body portion, or the signature information; and

sending the matching information.

10. The identification method according to claim 9, characterized in that the method further comprises:

receiving all video images of the video file;

identifying all information in all video images;

retrieving global matching information of all the information; wherein the global matching information comprises the matching information; and

sending other information except the matching information in the global matching information.

11. An apparatus for identifying video images, comprising:

a receiving module, configured to receive instruction information for acquiring the video image;

an extraction module, configured to extract a corresponding video image from a video file according to the instruction information;

an identification module, configured to identify information in the video image;

a channel module, configured to upload the information and to receive retrieved matching information of the information; and

a display module, configured to display the matching information.

12. A server, comprising:

a receiving module, configured to receive a frame of video image in a video file, or a main body portion of the video image, or signature information of the main body portion of the video image;

a retrieval module, configured to retrieve matching information of the video image, the main body portion, or the signature information; and

a sending module, configured to send the matching information.

13. A terminal, comprising:

at least one memory and at least one processor;

wherein the at least one memory is configured to store program code and the at least one processor is configured to invoke the program code stored in the at least one memory to perform the method of any of claims 1 to 10.

14. A storage medium for storing program code for performing the method of any one of claims 1 to 10.

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for identifying a video image, a server, a terminal, and a storage medium.

Background

In the current image recognition technology, after a picture is taken, the picture is transmitted to a server, then the server identifies and searches for an article or a person in the picture, and then a search result is returned to a sending end. There is currently a lack of implementations that can search based on video.

Disclosure of Invention

In order to solve the existing problems, the present disclosure provides a video image recognition method and apparatus, a server, a terminal, and a storage medium.

The present disclosure adopts the following technical solutions.

In some embodiments, the present disclosure provides a method of identifying a video image, comprising:

receiving instruction information for acquiring the video image;

extracting a corresponding video image from a video file according to the instruction information;

identifying information in the video image and uploading the information; and

receiving and displaying the retrieved matching information of the information.

In some embodiments, the present disclosure provides a method of identifying a video image, comprising:

receiving a frame of video image in a video file; or

receiving a main body portion of the video image; or

receiving signature information of the main body portion of the video image;

retrieving matching information of the video image, the main body portion, or the signature information; and

sending the matching information.

In some embodiments, the present disclosure provides an apparatus for identifying a video image, comprising:

a receiving module, configured to receive instruction information for acquiring the video image;

an extraction module, configured to extract a corresponding video image from a video file according to the instruction information;

an identification module, configured to identify information in the video image;

a channel module, configured to upload the information and to receive retrieved matching information of the information; and

a display module, configured to display the matching information.

In some embodiments, the present disclosure provides a server comprising:

a receiving module, configured to receive a frame of video image in a video file, or a main body portion of the video image, or signature information of the main body portion of the video image;

a retrieval module, configured to retrieve matching information of the video image, the main body portion, or the signature information; and

a sending module, configured to send the matching information.

In some embodiments, the present disclosure provides a terminal comprising: at least one memory and at least one processor;

wherein the memory is configured to store program code, and the processor is configured to call the program code stored in the memory to perform the above method.

In some embodiments, the present disclosure provides a storage medium for storing program code for performing the above-described method.

The video image identification method of the present disclosure can perform image identification in real time according to the current video content; the identified content includes information displayed in the video image, such as people or articles.

Drawings

The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements are not necessarily drawn to scale.

Fig. 1 is a flowchart of a video image recognition method according to an embodiment of the present disclosure.

Fig. 2 is a schematic diagram of an identification page of a video image according to an embodiment of the disclosure.

Fig. 3 is a flowchart of a video image recognition method according to another embodiment of the present disclosure.

Fig. 4 is a schematic structural diagram of a video image recognition apparatus according to an embodiment of the present disclosure.

Fig. 5 is a schematic structural diagram of a server for identification according to an embodiment of the present disclosure.

Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.

Detailed Description

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.

It should be understood that various steps recited in method embodiments of the present disclosure may be performed in a different order and/or in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.

The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.

It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.

It should be noted that the modifiers "a" and "an" in this disclosure are intended to be illustrative rather than limiting, and those skilled in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.

The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.

The embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.

As shown in fig. 1, which is a flowchart of a video image identification method according to an embodiment of the present disclosure, the method includes the following steps.

S100, receiving instruction information for acquiring the video image.

The instruction information can be issued in two forms, namely frame extraction instruction information or screenshot instruction information. Specifically, the embodiment of the present disclosure may include a starting path, which may be, for example, a frame selection button or an image recognition button of a program running on the terminal. For example, when a video is played to a certain frame and the user finds a point of interest, the user may send instruction information indicating that a screenshot or a frame capture of the video is needed; specifically, the user may click the frame selection button or the image recognition button. The frame selection button differs from the image recognition button in that clicking the image recognition button sends the frame extraction instruction information, whereby a complete image, namely all the information in one frame, is obtained. It can be understood that if the complete frame is obtained, the amount of information to be transmitted is relatively large; in other words, the bandwidth occupied is relatively large. Therefore, preferably, the user may also capture only a part of one frame of image according to personal interest, that is, choose the screenshot instruction information and select only a part of the information in one frame of image to be sent, so as to reduce the data volume to be transmitted. In the above embodiment, in order to locate the selected video image frame, the corresponding timestamp is acquired together with the instruction. That is, the frame extraction instruction information is received, and a timestamp at which the frame extraction instruction information is sent is recorded. If only a partial screenshot is captured, screenshot coordinates are recorded in addition to the timestamp. That is, the screenshot instruction information is received, and a timestamp and screenshot coordinates carried by the screenshot instruction information are recorded.
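By way of illustration only, the two forms of instruction information described above could be represented as in the following minimal Python sketch; the field names (timestamp_ms, box) are assumptions made for this example and are not part of the disclosed message format.

```python
# A minimal sketch of the two instruction forms; field names are illustrative.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class FrameExtractionInstruction:
    timestamp_ms: int  # timestamp at which the frame extraction instruction was sent

@dataclass
class ScreenshotInstruction:
    timestamp_ms: int               # timestamp at which the screenshot instruction was sent
    box: Tuple[int, int, int, int]  # (x, y, width, height) of the selection box
```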

S200, extracting a corresponding video image from the video file according to the instruction information.

Specifically, one timestamp corresponds to only one video image frame; therefore, accurate positioning can be performed according to the timestamp. From the position indicated by the timestamp, the video image that the user specified to be captured can be determined. If the instruction is the frame extraction instruction information, the original video image frame is extracted; if it is the screenshot instruction information, only the image portion within the selected range is retained according to the screenshot coordinates, and the rest is discarded, thereby eliminating redundancy. In particular, in the embodiment of the present disclosure, the screenshot instruction information may carry, for example, the screenshot coordinates/display parameters of the selection box, which may include position information and size information of the selection box, specifically the X coordinate of the first pixel in the upper left corner of the selection box on the video image frame, the Y coordinate of that pixel on the video image frame, and the length and width of the selection box. The selected region can be extracted from the video image frame according to the position and size determined by these four display parameters. In other words, the selection box can be represented by a display starting coordinate, a box length and a box width, which together define the bounded area of the selection box. In addition to a regularly shaped selection box, the selection box may also have an irregular shape; for example, after the starting position and the drag path of the selection box are determined, a function f(x) describing how the boundary varies with the abscissa x can be obtained.
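As a non-limiting sketch of step S200, the following example assumes that OpenCV is available and that the screenshot coordinates are the four display parameters (X coordinate, Y coordinate, length and width) of a rectangular selection box; it shows only one possible way to realize the step.

```python
import cv2  # a minimal sketch using OpenCV, assuming it is available

def extract_frame(video_path: str, timestamp_ms: float):
    """Seek to the timestamp and read the corresponding video image frame."""
    capture = cv2.VideoCapture(video_path)
    capture.set(cv2.CAP_PROP_POS_MSEC, timestamp_ms)
    ok, frame = capture.read()
    capture.release()
    if not ok:
        raise ValueError(f"no frame could be read at {timestamp_ms} ms")
    return frame

def crop_to_selection(frame, x: int, y: int, width: int, height: int):
    """Keep only the part of the frame inside the selection box; discard the rest."""
    return frame[y:y + height, x:x + width]
```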

S300, identifying information in the video image and uploading the information.

In particular, embodiments of the present disclosure may divide a video image into a background portion and a main body portion. The main body portion may include a person or an article, and the background portion may be the portion, other than the main body portion, that has no meaning for identification. The main body portion can then be automatically identified and selected according to its features, such as the features of commodities or people, the remaining parts are marked as background, and only the main body portion is sent to a server for identification, so that the data volume is reduced. Furthermore, the embodiment of the present disclosure may also further recognize the identified main body portion, that is, extract only its label information for uploading. If a plurality of main body portions are identified, they are recognized separately and respectively assigned label information, and the pieces of label information are then sent to an identification end such as a server. For a partial screenshot of the video image, its content can likewise be divided into a main body portion and a background portion, and the other operations are the same as above and are not described herein again.
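A minimal sketch of step S300 under stated assumptions is given below: detect_subjects() is a hypothetical detector returning (bounding box, label) pairs for the main body portions, and upload_url is a hypothetical endpoint; only cv2.imencode and requests.post are real library calls.

```python
import cv2
import requests

def identify_and_upload(frame, upload_url: str, labels_only: bool = True):
    """Identify main body portions in the frame and upload either their label
    information or the cropped image regions themselves (the two alternatives above)."""
    subjects = detect_subjects(frame)  # hypothetical detector: [((x, y, w, h), label), ...]
    if labels_only:
        # Upload only the label information of each main body portion.
        return requests.post(upload_url, json={"labels": [label for _, label in subjects]})
    files = {}
    for i, ((x, y, w, h), _) in enumerate(subjects):
        crop = frame[y:y + h, x:x + w]
        ok, buffer = cv2.imencode(".jpg", crop)  # encode the cropped main body portion
        if ok:
            files[f"subject_{i}"] = buffer.tobytes()
    return requests.post(upload_url, files=files)
```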

S400, receiving and displaying the retrieved matching information of the information.

In the disclosed embodiment, it is possible to identify both an article and a person in one frame of video image. In this case, the search results related to the person and the article may be displayed separately or simultaneously. For separate display, for example, a plurality of targets, which may be people or objects, may each be framed in one video image frame; if one of the selection boxes is selected, the pop-up page shows the search results related to the content in that selection box, and similarly, if another selection box is selected, the content of the pop-up page is replaced with the search results for that other selection box. In another embodiment of the present disclosure, the person and the commodity (or commodities) can be displayed simultaneously, for example in a column layout: two columns (one for the commodity and one for the person) or multiple columns are arranged in sequence below the frame, corresponding to the horizontal positions of the targets in the video image frame. If so much content is identified in one frame that it exceeds the width of the frame, it can be presented selectively in the form of a scroll bar. In this embodiment, the corresponding search result can also be switched by clicking an icon in the bar. Taking fig. 2 as an example, fig. 2 is a schematic diagram of an identification page according to an embodiment of the present disclosure. The identification page can be called up, for example, upon receiving the instruction information. The disclosed embodiment takes three results A, B and C identified in one frame as an example. In this embodiment, A may be a person, and B and C may be commodities. The identification page can use the original video image frame as the background while framing the identified people/objects, as shown by the different dotted boxes in the figure. After the instruction information is sent, a product page may float over the original video image frame; the goods in the page may come from, for example, a product library built into the application program, an external hyperlink, or a link to an applet entry, which is not limited in this disclosure. The product page/results library may include, for example, a single bar and multiple columns. The bar may display the targets framed by the different dotted boxes in the video, that is, the icons in the bar correspond in sequence to the goods/people; it is understood that in other embodiments of the present disclosure other correspondences may also be adopted, which are not limited herein. Accordingly, the columns may correspond to the bar, and each column may display vertically, respectively, other videos associated with person A and the identical or similar commodities for commodities B and C. The bar may be a fixed bar or a scrolling bar: when the number of identified targets is small, the illustrations A, B and C are arranged horizontally; when the number of identified targets is large, they can be displayed selectively by horizontal scrolling. Likewise, the commodities/videos in a column may also be arranged with a vertical scroll bar. As for the arrangement order, the person-related videos may be sorted by click-through rate, for example, and the commodities may be arranged according to their similarity to the target commodity. Of course, in the disclosed embodiments, other arrangement logic may be employed, as long as reasonable results are obtained.
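Purely to illustrate the arrangement logic described above (one column per identified target, person-related videos sorted by click-through rate, commodities sorted by similarity), the following sketch assumes a hypothetical flat result record with target_id, kind, click_rate and similarity fields; the actual page layout is not specified by this sketch.

```python
def arrange_results(matches):
    """Group matches by identified target and sort each column:
    person-related videos by click-through rate, commodities by similarity."""
    columns = {}
    for m in matches:  # each m is assumed to carry target_id, kind, click_rate, similarity
        columns.setdefault(m["target_id"], []).append(m)
    for items in columns.values():
        if items[0]["kind"] == "person":
            items.sort(key=lambda m: m["click_rate"], reverse=True)
        else:
            items.sort(key=lambda m: m["similarity"], reverse=True)
    return columns
```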

In the embodiment of the present disclosure, the step of retrieving the matching information may be performed on the server. As shown in fig. 3, fig. 3 is a flowchart of a video image identification method according to another embodiment of the present disclosure, which includes: receiving a frame of video image in a video file, or receiving a main body portion of the video image, or receiving signature information of the main body portion of the video image; retrieving matching information of the video image, the main body portion, or the signature information; and sending the matching information. In addition to displaying only the matching results for the information identified on the current page, the embodiment of the present disclosure may further include receiving, from the global matching information of the video file, other information than the matching information, and displaying that other information after the matching information. Specifically, when the video file is cached by the server, the main body portion contained in each frame, such as a commodity or a person, can be analyzed in advance; in this case, only the timestamp information needs to be uploaded in order to return the products or other information contained in the corresponding video image frame, such as commodities of the same kind or other videos of the related person found according to the features. The returned results may include the coordinates of the identified regions and the content identified therein. If the server does not hold a cache of the video file, the server can perform online identification in real time, compare the features with the commodities in its product library, and then return identical or similar product results; if the identified content is a person, face information can be extracted, for example, to search by video features for other videos containing the person's image or name. In addition, if the client, such as an application installed on a mobile phone, is provided with its own commodity library, that library can be called directly according to the identification result; further, external links may be set in this product library, through which a jump can be made to, for example, an applet, another app, or a web page. If the server stores the video file, the matching information for all the information identified in advance in the video file may be referred to as global matching information. After the matching information of the current frame is returned, in order to provide more choices, the remaining results in the global matching information can be displayed after the matching information of the current frame; for example, results D, E and the like from the global matching information can be appended after icon C in the bar of fig. 2, thereby enriching the search results.
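The server-side behaviour described above (return pre-analyzed results by timestamp when a cache of the video file exists, otherwise identify online and compare against a product library) might look roughly like the following sketch; the cache structure and the recognize_online/search_library helpers are assumptions made for illustration, not the actual implementation.

```python
from typing import Callable, Dict, List, Optional

def retrieve_matching_info(
    video_id: str,
    timestamp_ms: int,
    frame_bytes: Optional[bytes],
    cache: Dict[str, Dict[int, List[dict]]],   # video_id -> timestamp -> cached results
    recognize_online: Callable[[bytes], List[dict]],
    search_library: Callable[[dict], List[dict]],
) -> List[dict]:
    cached = cache.get(video_id, {}).get(timestamp_ms)
    if cached is not None:
        # Pre-analyzed frame: return identified region coordinates and contents directly.
        return cached
    # No cache for this video file: identify online in real time (assumes the frame
    # or its main body portion was uploaded) and search the library for matches.
    results = []
    for subject in recognize_online(frame_bytes):
        results.extend(search_library(subject))  # identical/similar products or related videos
    return results
```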

As shown in fig. 4, an embodiment of the present disclosure further provides an identification apparatus 10 for video images, which includes a receiving module 11, an extraction module 13, an identification module 15, a channel module 17, and a display module 19. The receiving module 11 may be configured to receive instruction information for acquiring the video image. The extraction module 13 may be configured to extract a corresponding video image from the video file according to the instruction information. The identification module 15 may be configured to identify information in the video image. The channel module 17 may be configured to upload the information and to receive the retrieved matching information of the information. The display module 19 may be configured to display the matching information.

The embodiment of the present disclosure also provides a server. As shown in fig. 5, the server 30 may include a receiving module 33, a retrieval module 35, and a sending module 37. The receiving module 33 may be configured to receive a frame of video image in a video file, or a main body portion of the video image, or signature information of the main body portion of the video image. The retrieval module 35 may be configured to retrieve matching information of the video image, the main body portion, or the signature information. The sending module 37 may be configured to send the matching information. It should be noted that the server 30 may pre-cache the product/person information of each frame of the video, and after receiving an identification matching instruction, directly return the result corresponding to the timestamp, so that the user can obtain the result quickly and the user experience is improved; on the other hand, if the server 30 has no pre-cache, the features may also be identified online for comparison with an information base (including videos and commodities), which is not limited in the embodiment of the present disclosure.

For the embodiments of the apparatus, since they correspond substantially to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described apparatus embodiments are merely illustrative, wherein the modules described as separate modules may or may not be separate. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

The video image identification method and apparatus of the present disclosure are described above based on the embodiments and application examples. In addition, the present disclosure also provides a terminal and a storage medium, which are described below.

Referring now to fig. 6, a schematic diagram of an electronic device (e.g., a terminal device or server) 800 suitable for use in implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.

As shown in fig. 6, the electronic device 800 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 801 that may perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 802 or a program loaded from a storage means 808 into a random access memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the electronic device 800 are also stored. The processing apparatus 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

Generally, the following devices may be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 807 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage 808 including, for example, magnetic tape, hard disk, etc.; and a communication device 809. The communication means 809 may allow the electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. While fig. 6 illustrates an electronic device 800 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication means 809, or installed from the storage means 808, or installed from the ROM 802. The computer program, when executed by the processing apparatus 801, performs the above-described functions defined in the methods of the embodiments of the present disclosure.

It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.

The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.

The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the methods of the present disclosure as described above.

Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software or hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself.

The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

According to one or more embodiments of the present disclosure, there is provided a video image recognition method including:

receiving instruction information for acquiring the video image;

extracting a corresponding video image from a video file according to the instruction information;

identifying information in the video image and uploading the information; and

receiving and displaying the retrieved matching information of the information.

According to one or more embodiments of the present disclosure, there is provided an identification method, wherein the step of receiving instruction information for acquiring the video image includes:

receiving frame extraction instruction information; and

recording a timestamp at which the frame extraction instruction information is sent.

According to one or more embodiments of the present disclosure, there is provided an identification method, wherein the step of extracting a corresponding video image from a video file according to the instruction information includes:

extracting the video image corresponding to the timestamp from the video file according to the timestamp.

According to one or more embodiments of the present disclosure, there is provided an identification method, wherein the video image includes a main body portion and a background portion;

the step of identifying information in the video image and uploading the information comprises:

identifying the main body portion in the video image and uploading the main body portion; or

identifying label information of the main body portion in the video image, and uploading the label information.

According to one or more embodiments of the present disclosure, there is provided an identification method, wherein the step of receiving instruction information for acquiring the video image further includes:

receiving screenshot instruction information; and

recording a timestamp at which the screenshot instruction information is sent and screenshot coordinates carried by the screenshot instruction information.

According to one or more embodiments of the present disclosure, there is provided an identification method, wherein the step of extracting a corresponding video image from a video file according to the instruction information further includes:

extracting a video image corresponding to the timestamp from the video file according to the timestamp, and acquiring a partial screenshot of the video image according to the screenshot coordinates.

According to one or more embodiments of the present disclosure, there is provided an identification method, wherein the partial screenshot includes a main body portion and a background portion;

the step of identifying information in the video image and uploading the information further comprises:

identifying the main body portion of the partial screenshot in the video image and uploading the main body portion; or

identifying label information of the main body portion of the partial screenshot in the video image, and uploading the label information.

According to one or more embodiments of the present disclosure, there is provided an identification method, characterized in that the method further includes:

receiving other information except the matching information in the global matching information of the video file; and

displaying the other information after the matching information.

According to one or more embodiments of the present disclosure, there is provided a video image recognition method including:

receiving a frame of video image in a video file; or

receiving a main body portion of the video image; or

receiving signature information of the main body portion of the video image;

retrieving matching information of the video image, the main body portion, or the signature information; and

sending the matching information.

According to one or more embodiments of the present disclosure, there is provided an identification method, characterized in that the method further includes:

receiving all video images of the video file;

identifying all information in all video images;

retrieving global matching information of all the information; wherein the global matching information comprises the matching information; and

sending other information except the matching information in the global matching information.

According to one or more embodiments of the present disclosure, there is provided an apparatus for recognizing a video image, including:

a receiving module, configured to receive instruction information for acquiring the video image;

an extraction module, configured to extract a corresponding video image from a video file according to the instruction information;

an identification module, configured to identify information in the video image;

a channel module, configured to upload the information and to receive retrieved matching information of the information; and

a display module, configured to display the matching information.

According to one or more embodiments of the present disclosure, there is provided a server including:

a receiving module, configured to receive a frame of video image in a video file, or a main body portion of the video image, or signature information of the main body portion of the video image;

a retrieval module, configured to retrieve matching information of the video image, the main body portion, or the signature information; and

a sending module, configured to send the matching information.

According to one or more embodiments of the present disclosure, there is provided a terminal including: at least one memory and at least one processor;

wherein the at least one memory is configured to store program code, and the at least one processor is configured to call the program code stored in the at least one memory to perform the method of any one of the above.

According to one or more embodiments of the present disclosure, there is provided a storage medium for storing program code for performing the above-described method.

The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to technical solutions formed by the particular combination of features described above, but also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure. For example, technical solutions formed by mutually replacing the above features with technical features having similar functions disclosed in this disclosure (but not limited thereto) are also encompassed.

Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
