Foreign patent translation requirement identification method and system

Document No.: 1170297 Publication date: 2020-09-18

Reading note: This technology, Foreign patent translation requirement identification method and system (一种涉外专利翻译需求识别方法及系统), was designed by 倪海斌, 施建建 and 徐可欣 on 2020-06-11. The invention provides a method and a system for identifying the translation requirements of foreign-involved patents, and relates to the technical field of foreign-involved patent translation. The method comprises the following steps: S1, creating search keywords, searching for relevant foreign-involved patents across major websites and languages, and obtaining foreign-involved patent samples; S2, carrying out iterative training on the foreign-involved patent samples based on an artificial intelligence algorithm; and S3, judging and analyzing the authenticity of the searched foreign-involved patents, screening out irrelevant and repeated contents, and overwriting incomplete contents. By obtaining foreign-involved patent samples and then training on them iteratively with an artificial intelligence algorithm, the invention improves the relevance of the foreign-involved patents later retrieved by keyword, which reduces the subsequent manual screening and the overall workload, simplifies the translation process, and improves the accuracy of the translated patent documents.

1. A foreign patent translation requirement identification method, characterized by comprising the following steps:

S1, creating search keywords, searching for relevant foreign-involved patents across major websites and languages, and obtaining foreign-involved patent samples;

S2, carrying out iterative training on the foreign-involved patent samples based on an artificial intelligence algorithm;

S3, judging and analyzing the authenticity of the searched foreign patents, screening out irrelevant and repeated contents, and overwriting incomplete contents;

S4, downloading the foreign patent documents, extracting the character information in them, adjusting the format and repairing garbled characters;

S5, setting classification keywords, then classifying and storing all foreign patents;

S6, selecting the language type, translating the foreign patents and exporting the translation results.

2. The method for recognizing the translation requirement of the foreign-involved patent according to claim 1, wherein: the specific content of step S1 is as follows:

1) extracting title keywords according to the foreign patent information to be acquired, and checking whether any of the keywords are repeated;

2) searching for foreign-involved patents on major websites according to the extracted keywords, and then obtaining a foreign-involved patent sample for each keyword from all the foreign-involved patents found, wherein foreign-involved patents with repeated keywords are excluded from the sample.

3. The method for recognizing the translation requirement of the foreign-involved patent according to claim 1, wherein: the specific content of step S2 is as follows:

1) establishing a foreign-involved patent training model and a database system, and importing all the collected foreign-involved patents into the database system;

2) establishing a data transmission channel between the training model and the database system, and synchronizing all data in the database system into the foreign-involved patent training model;

3) importing the obtained foreign patent samples into the foreign patent training model for training, and importing the keywords at the same time, until the training model can accurately match the keywords one-to-one with the foreign patents in the foreign patent samples.

4. The method for recognizing the translation requirement of the foreign-involved patent according to claim 1, wherein: the specific content of step S3 is as follows:

1) analyzing the authenticity of the foreign-involved patent contents under each keyword, extracting part of the content of each foreign-involved patent, and analyzing whether the extracted content is relevant to the corresponding keyword; denoting the relevance as P, if P is less than or equal to 0.1, judging that there is no relevance and deleting all information of that foreign-involved patent;

2) if P is greater than 0.1, judging that there is relevance, retaining all information of the patent, inspecting the patent, and deleting the irrelevant content in the patent document; if the patent appears repeatedly, choosing to overwrite or delete it; if the original patent is incomplete, choosing to overwrite it.

5. The method for recognizing the translation requirement of the foreign-involved patent according to claim 1, wherein: the specific content of step S4 is as follows:

1) for foreign patents whose characters cannot be extracted directly from the document, downloading the document and then extracting its characters with a character extractor; for foreign patents whose characters can be extracted directly from the document, extracting or copying the content directly;

2) placing the characters extracted from each foreign-involved patent into a Word or WPS document, adjusting the format of the content in the document, and repairing garbled characters.

6. The method for recognizing the translation requirement of the foreign-involved patent according to claim 1, wherein: the specific content of step S5 is as follows:

1) dividing all foreign patents according to keywords, and setting a plurality of independent keywords;

2) marking all the keywords, classifying all the foreign-related patents into corresponding classifications, and simultaneously storing all the foreign-related patents.

7. The method for recognizing the translation requirement of the foreign-involved patent according to claim 1, wherein: the specific content of step S6 is as follows:

1) determining the language type of the foreign-involved patent, selecting translation software, importing the foreign-involved patent file, and generating a translated file with one click;

2) checking the translated file, removing erroneous characters, words, punctuation and the like, making modifications online, and finally exporting the translated file.

8. A foreign patent translation requirement recognition system is characterized in that: the system comprises a data acquisition unit, a data analysis unit, a data downloading unit, a data classification unit, a central processing unit, a data storage unit, a data translation unit, a language selection unit, a result derivation unit and a model training unit, wherein the data acquisition unit is connected with the data analysis unit, the data analysis unit is connected with the data downloading unit, the language selection unit is connected with the data translation unit, and the data downloading unit, the data classification unit, the data storage unit, the data translation unit, the result derivation unit and the model training unit are all connected with the central processing unit.

9. The system of claim 8, wherein: the data acquisition unit is used for gathering relevant foreign-involved patents from major websites, the data analysis unit is used for analyzing the authenticity of the searched foreign-involved patents, the data download unit is used for downloading the required foreign-involved patent files, and the data storage unit is used for storing all downloaded foreign-involved patent files.

10. The system of claim 8, wherein: the data translation unit is used for translating foreign patents in different languages into Chinese or other languages, the language selection unit is used for selecting and switching between languages, the result derivation unit is used for exporting the translated result, and the model training unit is used for carrying out iterative training on the foreign patent samples based on an artificial intelligence algorithm.

Technical Field

The invention relates to the technical field of foreign-involved patent translation, in particular to a method and a system for recognizing the translation requirement of a foreign-involved patent.

Background

The Patent Cooperation Treaty (PCT) is a special treaty administered by the World Intellectual Property Organization. Its member countries are members of the Paris Convention, and there are currently 151 of them. Under the PCT, a patent application filed in any one PCT member country is treated as having been filed at the same time in the other designated member countries, so that a single application in one country takes effect in many. The approval process of a PCT application is divided into an international phase and a national phase. The international phase covers receipt, publication, search and preliminary examination; in the national phase, the application is examined and granted by the patent office of each specific country. A PCT application must enter the national phase of a particular country within 30 months from the priority date (the filing date, if no priority is claimed). Thus, when an applicant wishes to protect one invention in multiple countries (typically more than 5), the PCT route is convenient.

At present, as awareness of patent protection grows worldwide, more and more individuals and companies file patents on their products and technologies to obtain legal protection. As the number of patents increases, the patent system becomes more complete, and foreign-language patents become increasingly valuable as references.

Disclosure of Invention

(I) Technical problem to be solved

In view of the deficiencies of the prior art, the invention provides a method and a system for recognizing the translation requirements of foreign patents, which address those deficiencies.

(II) Technical scheme

To achieve the above purpose, the invention is realized by the following technical scheme: a foreign patent translation requirement identification method comprising the following steps:

S1, creating search keywords, searching for relevant foreign-involved patents across major websites and languages, and obtaining foreign-involved patent samples;

S2, carrying out iterative training on the foreign-involved patent samples based on an artificial intelligence algorithm;

S3, judging and analyzing the authenticity of the searched foreign patents, screening out irrelevant and repeated contents, and overwriting incomplete contents;

S4, downloading the foreign patent documents, extracting the character information in them, adjusting the format and repairing garbled characters;

S5, setting classification keywords, then classifying and storing all foreign patents;

S6, selecting the language type, translating the foreign patents and exporting the translation results.

Preferably, the specific content of step S1 is as follows:

1) extracting title keywords according to the foreign patent information to be acquired, and checking whether any of the keywords are repeated;

2) searching for foreign-involved patents on major websites according to the extracted keywords, and then obtaining a foreign-involved patent sample for each keyword from all the foreign-involved patents found, wherein foreign-involved patents with repeated keywords are excluded from the sample.
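The two sub-steps of S1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the case-insensitive comparison, and the shape of `search_results` (keyword mapped to a list of patent identifiers) are all hypothetical, and "patents with repeated keywords" is read here as patents already sampled under an earlier keyword.

```python
def dedupe_keywords(keywords):
    """Return keywords in first-seen order, dropping case-insensitive repeats."""
    seen = set()
    unique = []
    for kw in keywords:
        key = kw.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(kw.strip())
    return unique


def build_sample(search_results):
    """Build a per-keyword sample, skipping patents already sampled.

    search_results: dict mapping keyword -> list of patent ids
    (a hypothetical shape; the source does not define one).
    """
    sampled = set()
    sample = {}
    for kw, patent_ids in search_results.items():
        fresh = [pid for pid in patent_ids if pid not in sampled]
        sample[kw] = fresh
        sampled.update(fresh)
    return sample
```

For instance, if patent "B" is returned for both "battery" and "motor", it is kept only under the keyword that was processed first.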

Preferably, the specific content of step S2 is as follows:

1) establishing a foreign-involved patent training model and a database system, and importing all the collected foreign-involved patents into the database system;

2) establishing a data transmission channel between the training model and the database system, and synchronizing all data in the database system into the foreign-involved patent training model;

3) importing the obtained foreign patent samples into the foreign patent training model for training, and importing the keywords at the same time, until the training model can accurately match the keywords one-to-one with the foreign patents in the foreign patent samples.
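The source does not name the artificial intelligence algorithm used in S2. As one possible illustration, the sketch below substitutes a simple multiclass perceptron over word tokens and repeats training passes until every sample patent is matched to its keyword, which mirrors the stopping condition described above. The functions `tokenize`, `predict` and `train`, and the `(text, keyword)` sample format, are assumptions made for this example only.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real text needs more care)."""
    return text.lower().split()


def predict(weights, keywords, text):
    """Return the keyword whose weight vector scores the text highest."""
    tokens = tokenize(text)
    return max(keywords, key=lambda kw: sum(weights[kw][t] for t in tokens))


def train(samples, max_epochs=50):
    """Train on (patent_text, keyword) pairs with a multiclass perceptron.

    Passes over the samples are repeated until no sample is misclassified
    or max_epochs is reached.
    """
    keywords = sorted({kw for _, kw in samples})
    weights = {kw: defaultdict(float) for kw in keywords}
    for _ in range(max_epochs):
        errors = 0
        for text, gold in samples:
            guess = predict(weights, keywords, text)
            if guess != gold:
                errors += 1
                for t in tokenize(text):
                    weights[gold][t] += 1.0   # pull tokens toward the right keyword
                    weights[guess][t] -= 1.0  # push them away from the wrong one
        if errors == 0:
            break
    return weights, keywords
```

A perceptron only converges when the samples are separable by their tokens, so the `max_epochs` cap is there to guarantee termination on harder data.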

Preferably, the specific content of step S3 is as follows:

1) analyzing the authenticity of the foreign-involved patent contents under each keyword, extracting part of the content of each foreign-involved patent, and analyzing whether the extracted content is relevant to the corresponding keyword; denoting the relevance as P, if P is less than or equal to 0.1, judging that there is no relevance and deleting all information of that foreign-involved patent;

2) if P is greater than 0.1, judging that there is relevance, retaining all information of the patent, inspecting the patent, and deleting the irrelevant content in the patent document; if the patent appears repeatedly, choosing to overwrite or delete it; if the original patent is incomplete, choosing to overwrite it.
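The threshold rule in S3 can be illustrated with a short sketch. Only the 0.1 cut-off comes from the text; the source does not say how P is computed, so the measure below (the share of excerpt tokens that are keyword terms) is an assumption, as are the `id` and `text` field names and the choice to let the longer copy overwrite a shorter duplicate.

```python
RELEVANCE_THRESHOLD = 0.1  # from the text: P <= 0.1 means "no relevance"


def relevance(excerpt, keyword):
    """Assumed measure of P: share of excerpt tokens that are keyword terms."""
    tokens = excerpt.lower().split()
    if not tokens:
        return 0.0
    terms = set(keyword.lower().split())
    hits = sum(1 for t in tokens if t in terms)
    return hits / len(tokens)


def screen(patents, keyword):
    """Filter a list of patents for one keyword.

    patents: list of dicts with hypothetical 'id' and 'text' fields.
    Patents with P <= 0.1 are dropped. When an id repeats, the longer text
    is kept, so a complete copy overwrites an incomplete one.
    """
    kept = {}
    for p in patents:
        if relevance(p["text"], keyword) <= RELEVANCE_THRESHOLD:
            continue
        prev = kept.get(p["id"])
        if prev is None or len(p["text"]) > len(prev["text"]):
            kept[p["id"]] = p
    return list(kept.values())
```

Note that a patent scoring exactly 0.1 is discarded, because the text treats the boundary value as "no relevance".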

Preferably, the specific content of step S4 is as follows:

1) for foreign patents whose characters cannot be extracted directly from the document, downloading the document and then extracting its characters with a character extractor; for foreign patents whose characters can be extracted directly from the document, extracting or copying the content directly;

2) placing the characters extracted from each foreign-involved patent into a Word or WPS document, adjusting the format of the content in the document, and repairing garbled characters.
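The format adjustment and garbled-character repair in S4 might look like the following sketch. It covers only one common kind of damage (UTF-8 bytes that were wrongly decoded as Latin-1) and basic whitespace cleanup; the character extractor itself is not shown, and both function names are hypothetical.

```python
import re
import unicodedata


def repair_mojibake(text):
    """Undo the common case of UTF-8 bytes that were decoded as Latin-1."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text  # not this kind of damage, so leave the text unchanged


def normalize_layout(text):
    """Normalize line endings and whitespace in extracted text."""
    text = unicodedata.normalize("NFC", text)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)    # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)  # allow at most one blank line
    return text.strip()
```

Text that is already valid, such as plain ASCII or correctly decoded Chinese, passes through `repair_mojibake` unchanged.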

Preferably, the specific content of step S5 is as follows:

1) dividing all foreign patents according to keywords, and setting a plurality of independent keywords;

2) marking all the keywords, classifying all the foreign-related patents into corresponding classifications, and simultaneously storing all the foreign-related patents.
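As a rough illustration of S5, the sketch below assigns each patent to the classification keyword that occurs most often in its text. The counting rule, the `unclassified` bucket and the input shapes are assumptions for this example; the source only states that patents are classified by keyword and stored.

```python
from collections import defaultdict


def classify(patents, category_keywords):
    """Group patent ids by classification keyword.

    patents: dict mapping patent id -> text (hypothetical shape).
    category_keywords: list of independent classification keywords.
    Each patent goes to the keyword that occurs most often in its text;
    patents matching no keyword go under 'unclassified'.
    """
    buckets = defaultdict(list)
    for pid, text in patents.items():
        lowered = text.lower()
        counts = {kw: lowered.count(kw.lower()) for kw in category_keywords}
        best = max(counts, key=counts.get) if counts else None
        if best is None or counts[best] == 0:
            buckets["unclassified"].append(pid)
        else:
            buckets[best].append(pid)
    return dict(buckets)
```

The resulting dictionary can then be written to whatever storage the system uses, one entry per classification.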

Preferably, the specific content of step S6 is as follows:

1) determining the language type of the foreign-involved patent, selecting translation software, importing the foreign-involved patent file, and generating a translated file with one click;

2) checking the translated file, removing erroneous characters, words, punctuation and the like, making modifications online, and finally exporting the translated file.
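The flow of S6 can be sketched as below, with the translation software represented by a caller-supplied function, since the source does not name a specific product. The script detection is a coarse guess from Unicode ranges rather than real language identification, and `detect_script`, `translate_document` and the `translate(text, source, target)` signature are all hypothetical.

```python
import re


def detect_script(text):
    """Coarse guess of the source language from Unicode ranges (an assumption)."""
    has = {"kana": False, "hangul": False, "han": False, "latin": False}
    for ch in text:
        cp = ord(ch)
        if 0x3040 <= cp <= 0x30FF:
            has["kana"] = True
        elif 0xAC00 <= cp <= 0xD7A3:
            has["hangul"] = True
        elif 0x4E00 <= cp <= 0x9FFF:
            has["han"] = True
        elif ch.isascii() and ch.isalpha():
            has["latin"] = True
    if has["kana"]:
        return "ja"
    if has["hangul"]:
        return "ko"
    if has["han"]:
        return "zh"
    return "latin" if has["latin"] else "unknown"


def translate_document(text, translate, target="zh"):
    """Translate text and list suspect punctuation for manual checking.

    translate: hypothetical callable (text, source, target) -> str that
    stands in for whichever translation software is selected.
    """
    source = detect_script(text)
    draft = translate(text, source, target)
    # Flag doubled punctuation marks so a reviewer can correct them.
    issues = [m.group(0) for m in re.finditer(r"([,;:!?])\1+", draft)]
    return source, draft, issues
```

The list of issues is only a proofreading aid; the manual check described in sub-step 2) is still needed before the file is exported.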

A foreign patent translation requirement recognition system comprises a data acquisition unit, a data analysis unit, a data downloading unit, a data classification unit, a central processing unit, a data storage unit, a data translation unit, a language selection unit, a result derivation unit and a model training unit, wherein the data acquisition unit is connected with the data analysis unit, the data analysis unit is connected with the data downloading unit, the language selection unit is connected with the data translation unit, and the data downloading unit, the data classification unit, the data storage unit, the data translation unit and the result derivation unit are all connected with the central processing unit.

Preferably, the data acquisition unit is used for gathering relevant foreign-involved patents from major websites, the data analysis unit is used for analyzing the authenticity of the searched foreign-involved patents, the data download unit is used for downloading the required foreign-involved patent files, and the data storage unit is used for storing all downloaded foreign-involved patent files.

Preferably, the data translation unit is used for translating the foreign-involved patents in different languages into Chinese or other languages, the language selection unit is used for selecting and switching between languages, the result derivation unit is used for exporting the translated result, and the model training unit is used for carrying out iterative training on the foreign-involved patent samples based on an artificial intelligence algorithm.
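The connections between the units can be written down as a small graph, which makes it easy to check that every unit reaches the central processing unit. The unit names follow the text; the `LINKS` list, the `neighbours` and `reaches_cpu` helpers, and the decision to include the link from the model training unit (which claim 8 lists) are choices made for this sketch.

```python
CPU = "central processing unit"

# Each pair is one connection described in the text.
LINKS = [
    ("data acquisition unit", "data analysis unit"),
    ("data analysis unit", "data downloading unit"),
    ("language selection unit", "data translation unit"),
    ("data downloading unit", CPU),
    ("data classification unit", CPU),
    ("data storage unit", CPU),
    ("data translation unit", CPU),
    ("result derivation unit", CPU),
    ("model training unit", CPU),  # this link is listed in claim 8
]


def neighbours(unit):
    """Return the units directly connected to the given unit, sorted by name."""
    linked = {b for a, b in LINKS if a == unit} | {a for a, b in LINKS if b == unit}
    return sorted(linked)


def reaches_cpu(unit):
    """Check whether a unit is connected to the CPU, directly or indirectly."""
    seen, stack = set(), [unit]
    while stack:
        current = stack.pop()
        if current == CPU:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(neighbours(current))
    return False
```

In this layout the data acquisition unit and the language selection unit reach the central processing unit only indirectly, through the analysis and downloading units and through the translation unit respectively.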

(III) Advantageous effects

The invention provides a method and a system for recognizing the translation requirement of a foreign patent. The method has the following beneficial effects:

1. By obtaining foreign-involved patent samples and then subjecting them to iterative training and related processing with an artificial intelligence algorithm, the invention improves the relevance of the foreign-involved patents later retrieved by keyword, which reduces the subsequent manual screening and the overall workload, simplifies the translation process, and improves the accuracy of the translated patent files.

2. By marking all the keywords, classifying all the foreign-involved patents into the corresponding categories and storing them, the invention keeps all the patent documents orderly rather than disordered, which provides a solid foundation for the subsequent translation work.

Drawings

FIG. 1 is a schematic flow diagram of the present invention;

FIG. 2 is a block diagram of the system of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
