Entry data expansion method and device based on face recognition

Document No.: 1556711 | Publication date: 2020-01-21

Reading note: This technology, "Entry data expansion method and device based on face recognition", was designed by 王晨龙 (Wang Chenlong) on 2019-09-16. Main content: The invention discloses a method and a device for expanding entry data based on face recognition, relates to the technical field of data processing, and can effectively solve the problem of mismatched entry data. The method comprises: based on first entry data of an internal database, crawling second entry data related to the first entry data from an external website, wherein the first entry data and the second entry data both comprise a face picture and fields; and recognizing the face pictures in the first entry data and the second entry data, and if the recognition results match, adding and/or updating the fields in the second entry data into the first entry data. The device applies the method provided by the above scheme.

1. A method for expanding entry data based on face recognition, characterized by comprising the following steps:

based on first entry data of an internal database, crawling second entry data related to the first entry data from an external website, wherein the first entry data and the second entry data each comprise at least a face picture and fields, and the fields comprise Chinese and English names, profession, gender, birthday, region, representative works, and related news information;

extracting at least one face picture from each piece of related second entry data;

comparing the face picture extracted from each piece of related second entry data with the face picture extracted from the first entry data to identify the face similarity;

when the face similarity recognition result is that the two are the same person, adding and/or updating the fields in the related second entry data into the first entry data;

and when the face similarity recognition result is inconclusive, further judging whether the two can be associated with the same person through any one or more of the birthday, region, and representative works fields, and if so, adding and/or updating the fields in the related second entry data into the first entry data.

2. The method of claim 1, wherein the internal database is a star database comprising the first entry data in a one-to-one correspondence with a plurality of stars.

3. The method of claim 2, wherein crawling, from an external website, second entry data related to the first entry data based on the first entry data of the internal database comprises:

crawling second entry data of the same star from an external website based on the first entry data of any star in the internal database;

and filtering the plurality of pieces of crawled second entry data by comparing the profession fields, and retaining only the related second entry data.

4. The method according to claim 1, further comprising, when the face similarity recognition result is inconclusive:

if a plurality of face pictures are extracted from the related second entry data, taking another of those face pictures and comparing it with the face picture extracted from the first entry data of the star to identify the face similarity;

and when the face similarity recognition results for all face pictures in the related second entry data are inconclusive, further judging whether the second entry data can be associated with the same person through any one or more of the birthday, region, and representative works fields.

5. An entry data expansion device based on face recognition, characterized by comprising:

a data crawling unit, used for crawling second entry data related to the first entry data from an external website based on first entry data of an internal database, wherein the first entry data and the second entry data both comprise a face picture and fields;

and an identification matching unit, used for recognizing the face pictures in the first entry data and the second entry data, and if the recognition results match, adding and/or updating the fields in the second entry data into the first entry data.

6. The device of claim 5, wherein the data crawling unit comprises:

a data crawler module, used for crawling second entry data of the same star from an external website based on the first entry data of any star in the internal database;

and a data cleaning module, used for filtering the plurality of pieces of crawled second entry data by comparing the profession fields and retaining only the related second entry data.

7. The device of claim 6, wherein the identification matching unit comprises:

a picture extraction module, used for extracting at least one face picture from each piece of related second entry data;

a face recognition module, used for comparing the face picture extracted from each piece of related second entry data with the face picture extracted from the first entry data of the star to identify the face similarity;

and a judgment output module, used for adding and/or updating the fields in the related second entry data into the first entry data when the face similarity recognition result is that the two are the same person; or, when the face similarity recognition result is inconclusive, further judging whether the two can be associated with the same person through any one or more of the birthday, region, and representative works fields, and if so, adding and/or updating the fields in the related second entry data into the first entry data.

8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of the claims 1 to 4.

Technical Field

The invention relates to the technical field of data processing, in particular to a method and a device for expanding entry data based on face recognition.

Background

In recent years, "content is king" has become a high-frequency phrase in the industry. The accuracy and completeness of stars' encyclopedia entry data play a very important role in key services such as video search and recommendation. Building and operating a star picture library requires a large amount of manual work. Although crawler technology is gradually being applied in the industry to complete and update stars' entry data, a scheme that relies only on text matching easily mismatches the entry data of stars who share the same name.

Disclosure of Invention

The invention aims to provide a method and a device for expanding entry data based on face recognition, which can effectively solve the problem of mismatched entry data.

In order to achieve the above object, an aspect of the present invention provides a method for expanding entry data based on face recognition, including:

based on first entry data of an internal database, crawling second entry data related to the first entry data from an external website, wherein the first entry data and the second entry data each comprise at least a face picture and fields;

and recognizing the face pictures in the first entry data and the second entry data, and if the recognition results are matched, adding and/or updating the fields in the second entry data into the first entry data.

Illustratively, the fields include Chinese and English names, profession, gender, birthday, region, representative works, and related news information.

Illustratively, the internal database is a star database, and includes first entry data in one-to-one correspondence with a plurality of stars.
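The entry data and star database described above can be pictured with a small data structure. The following is only a minimal sketch: the class name `Entry`, the field keys, the star ID, and the picture path are illustrative assumptions and are not specified by the invention.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """One piece of entry data: face pictures plus text fields (assumed layout)."""
    face_pictures: List[str] = field(default_factory=list)  # picture paths or URLs
    fields: Dict[str, str] = field(default_factory=dict)    # name, profession, birthday, ...


# Hypothetical internal star database: exactly one first-entry record per star ID.
star_db: Dict[str, Entry] = {
    "star-001": Entry(
        face_pictures=["internal/star-001.jpg"],
        fields={"name_en": "Example Star", "profession": "actor", "gender": "female"},
    ),
}

print(star_db["star-001"].fields["profession"])
```

Keying the database by a star ID is one simple way to express the one-to-one correspondence between stars and first entry data.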

Preferably, crawling the second entry data related to the first entry data from the external website based on the first entry data of the internal database comprises the following steps:

crawling second entry data of the same star from an external website based on the first entry data of any star in the internal database;

and filtering the plurality of pieces of crawled second entry data by comparing the profession fields, and retaining only the related second entry data.
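The profession-based filtering step can be sketched as follows. This is an illustration under stated assumptions, not the invention's implementation: the profession field is assumed to be a comma-separated string under a hypothetical `profession` key, and the function names are invented for the example. Keeping all candidates when the internal entry has no profession is also an assumption.

```python
from typing import Dict, List, Set


def professions(fields: Dict[str, str]) -> Set[str]:
    """Split a comma-separated profession field into a normalized set."""
    raw = fields.get("profession", "")
    return {p.strip().lower() for p in raw.split(",") if p.strip()}


def filter_by_profession(
    first_fields: Dict[str, str], candidates: List[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Keep crawled candidates sharing at least one profession with the internal entry."""
    wanted = professions(first_fields)
    if not wanted:
        # Assumption: with no profession on the internal entry, nothing can be
        # excluded at this stage, so every candidate goes on to face comparison.
        return list(candidates)
    return [c for c in candidates if professions(c) & wanted]


first = {"name": "Li Na", "profession": "Singer"}
crawled = [
    {"name": "Li Na", "profession": "singer, actor"},
    {"name": "Li Na", "profession": "tennis player"},
]
print(filter_by_profession(first, crawled))
```

In this example only the first crawled entry is retained, which removes a same-name person with a different profession before any face comparison is run.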

Preferably, the method for recognizing the face pictures in the first entry data and the second entry data, and if the recognition results are matched, adding and/or updating the fields in the second entry data into the first entry data includes:

extracting at least one face picture from each piece of related second entry data;

comparing the face picture extracted from each piece of related second entry data with the face picture extracted from the first entry data of the star to identify the face similarity;

when the face similarity recognition result is that the two are the same person, adding and/or updating the fields in the related second entry data into the first entry data;

and when the face similarity recognition result is inconclusive, further judging whether the two can be associated with the same person through any one or more of the birthday, region, and representative works fields, and if so, adding and/or updating the fields in the related second entry data into the first entry data.
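The decision logic above can be sketched in a few lines. The text does not say how face similarity is computed, so this sketch assumes each face picture has already been converted to a feature vector and uses cosine similarity with two hypothetical thresholds (`same_at`, `different_below`). The "different person" outcome, the field keys, and the exact-equality field check are likewise assumptions.

```python
import math
from enum import Enum
from typing import Dict, Sequence


class Verdict(Enum):
    SAME = "same person"
    DIFFERENT = "different person"
    INCONCLUSIVE = "cannot be judged"


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def face_verdict(a: Sequence[float], b: Sequence[float],
                 same_at: float = 0.8, different_below: float = 0.4) -> Verdict:
    """Map a similarity score to one of three outcomes (thresholds are assumed)."""
    score = cosine_similarity(a, b)
    if score >= same_at:
        return Verdict.SAME
    if score < different_below:
        return Verdict.DIFFERENT
    return Verdict.INCONCLUSIVE


# Hypothetical keys for the fallback fields named in the text.
FALLBACK_FIELDS = ("birthday", "region", "representative_works")


def fields_agree(first: Dict[str, str], second: Dict[str, str]) -> bool:
    """True if any fallback field is non-empty and equal in both entries."""
    return any(bool(first.get(k)) and first.get(k) == second.get(k)
               for k in FALLBACK_FIELDS)


def should_merge(verdict: Verdict, first: Dict[str, str], second: Dict[str, str]) -> bool:
    """Merge on a face match, or on a field match when the face result is inconclusive."""
    if verdict is Verdict.SAME:
        return True
    if verdict is Verdict.INCONCLUSIVE:
        return fields_agree(first, second)
    return False
```

Requiring agreement on only one fallback field mirrors the "any one or more fields" wording; a stricter deployment could require several fields to agree.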

Preferably, when the face similarity recognition result is inconclusive, the method further includes:

if a plurality of face pictures are extracted from the related second entry data, taking another of those face pictures and comparing it with the face picture extracted from the first entry data of the star to identify the face similarity;

and when the face similarity recognition results for all face pictures in the related second entry data are inconclusive, further judging whether the second entry data can be associated with the same person through any one or more of the birthday, region, and representative works fields.
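The retry over several face pictures can be sketched as a simple loop. Here `compare` is a hypothetical callable standing in for the face comparison; it is assumed to return `True` (same person), `False` (different person), or `None` (inconclusive). Stopping at the first conclusive result is one reasonable reading of the text, not a confirmed detail.

```python
from typing import Callable, Iterable, Optional


def first_conclusive(
    reference: str,
    candidate_pictures: Iterable[str],
    compare: Callable[[str, str], Optional[bool]],
) -> Optional[bool]:
    """Try candidate pictures in turn and return the first conclusive result.

    Returns None when every picture is inconclusive (or there are none),
    which signals the caller to fall back to field comparison.
    """
    for picture in candidate_pictures:
        result = compare(reference, picture)
        if result is not None:
            return result
    return None


def fake_compare(reference: str, picture: str) -> Optional[bool]:
    """Stand-in for a real face comparison, for demonstration only."""
    return {"blurry.jpg": None, "clear.jpg": True}.get(picture)


print(first_conclusive("ref.jpg", ["blurry.jpg", "clear.jpg"], fake_compare))
```

A `None` return value is the cue to continue with the birthday, region, and representative works fields.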

Compared with the prior art, the entry data expansion method based on the face recognition has the following beneficial effects:

According to the entry data expansion method based on face recognition provided by the invention, second entry data related to the first entry data in the internal database is automatically and periodically crawled from an external website. Whether the first entry data and the second entry data belong to the same person is then judged by recognizing the face pictures in both. When the judgment result is yes, the fields in the related second entry data are added and/or updated into the first entry data, so that the first entry data in the internal database is automatically updated and completed.

Therefore, the method combines face recognition technology with data crawler technology and applies them to the entry data expansion of the internal database, which can effectively ensure both the matching accuracy of the crawled data and the timeliness of the entry data expansion.
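Once two entries are judged to belong to the same person, the add and/or update step amounts to merging two field dictionaries. The following is a minimal sketch: the function name, the `update_existing` switch, and the rule that empty crawled values never overwrite anything are assumptions made for illustration.

```python
from typing import Dict


def merge_fields(
    first: Dict[str, str], second: Dict[str, str], update_existing: bool = True
) -> Dict[str, str]:
    """Return a copy of the first entry's fields, expanded from the second entry."""
    merged = dict(first)
    for key, value in second.items():
        if not value:
            continue  # assumption: empty crawled values carry no information
        if not merged.get(key):
            merged[key] = value  # add: field missing or empty internally
        elif update_existing and merged[key] != value:
            merged[key] = value  # update: crawled value replaces the old one
    return merged


internal = {"name": "Example Star", "region": ""}
crawled = {"name": "Example Star", "region": "Beijing", "birthday": "1990-01-01"}
print(merge_fields(internal, crawled))
```

Passing `update_existing=False` gives an add-only mode, corresponding to the "added" half of "added and/or updated".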

Another aspect of the present invention provides an entry data expansion device based on face recognition, which applies the entry data expansion method based on face recognition described in the above technical solution, the device comprising:

a data crawling unit, used for crawling second entry data related to the first entry data from an external website based on first entry data of an internal database, wherein the first entry data and the second entry data each comprise at least a face picture and fields;

and an identification matching unit, used for recognizing the face pictures in the first entry data and the second entry data, and if the recognition results match, adding and/or updating the fields in the second entry data into the first entry data.

Preferably, the data crawling unit includes:

a data crawler module, used for crawling second entry data of the same star from an external website based on the first entry data of any star in the internal database;

and a data cleaning module, used for filtering the plurality of pieces of crawled second entry data by comparing the profession fields and retaining only the related second entry data.

Preferably, the identification matching unit includes:

a picture extraction module, used for extracting at least one face picture from each piece of related second entry data;

a face recognition module, used for comparing the face picture extracted from each piece of related second entry data with the face picture extracted from the first entry data of the star to identify the face similarity;

and a judgment output module, used for adding and/or updating the fields in the related second entry data into the first entry data when the face similarity recognition result is that the two are the same person; or, when the face similarity recognition result is inconclusive, further judging whether the two can be associated with the same person through any one or more of the birthday, region, and representative works fields, and if so, adding and/or updating the fields in the related second entry data into the first entry data.

Compared with the prior art, the beneficial effects of the entry data expansion device based on face recognition provided by the invention are the same as those of the entry data expansion method based on face recognition provided by the technical scheme, and are not repeated herein.

A third aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the steps of the above entry data expansion method based on face recognition.

Compared with the prior art, the beneficial effects of the computer-readable storage medium provided by the invention are the same as the beneficial effects of the entry data expansion method based on face recognition provided by the technical scheme, and the detailed description is omitted here.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:

Fig. 1 is a flowchart of a method for expanding entry data based on face recognition according to an embodiment of the present invention.

Detailed Description

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
