Knowledge graph-based unstructured power grid data processing method and device

Document No.: 1952865. Publication date: 2021-12-10.

Abstract: This technology, a knowledge graph-based unstructured power grid data processing method and device, was designed by Li Baoping (李保平) on 2021-09-14. The invention discloses a method and a device for processing unstructured power grid data based on a knowledge graph, relates to the field of knowledge graphs, and addresses the poor processing results and slow queries of existing methods for processing unstructured power grid data. The method comprises the following steps: S1, exporting the unstructured data and performing preliminary data retrieval; S2, cleaning the retrieved data; S3, further analyzing and converting the cleaned data so that the unstructured data become recognizable structured data; S4, exporting and integrating the data in the initial database; S5, performing a similarity comparison between the integrated data and the converted structured data; S6, processing and deleting incomplete, erroneous, and duplicate data in the converted and compared structured data. The invention is simple in structure and convenient to use, improves the processing of unstructured data, and improves query efficiency.

1. A knowledge graph-based unstructured power grid data processing method, characterized by comprising the following steps:

S1, exporting unstructured data and performing preliminary data retrieval;

S2, cleaning the retrieved data;

S3, further analyzing and converting the cleaned data so as to convert the unstructured data into recognizable structured data;

S4, exporting and integrating the data in an initial database;

S5, performing a similarity comparison between the integrated data and the converted structured data;

S6, processing and deleting incomplete data, erroneous data, and duplicate data in the converted and compared structured data; and

S7, classifying the processed data, uploading the classified data to a finished product database, and backing up the data through a cloud storage platform.

2. The method of claim 1, wherein the unstructured data comprise office documents in all formats, text, pictures, XML, HTML, reports of various types, images, and audio/video information.

3. The method of claim 1, wherein the data classes in S7 include regulation data, contract data, and safety data, and the classified data are integrated in a unified manner.

4. The method of claim 1, wherein steps S3 and S4 are parallel steps and can be performed simultaneously.

5. A knowledge graph-based unstructured power grid data processing device, applied to the knowledge graph-based unstructured power grid data processing method of any one of claims 1 to 4, comprising:

an unstructured database, configured to store the original unstructured data and to export the unstructured data through an adapter, the unstructured database being connected with a data retrieval module;

the data retrieval module, configured to retrieve and collect the unstructured data in the unstructured database, and connected with a data cleaning module;

the data cleaning module, configured to clean and sort the data retrieved by the data retrieval module, and connected with a data analysis engine;

the data analysis engine, configured to analyze the data cleaned and sorted by the data cleaning module, and connected with a data converter;

the data converter, configured to convert non-numerical data in the unstructured data into numerical data and thereby convert the unstructured data into structured data, and connected with a similarity calculation module;

an initial database, configured to store the structured data among the existing data, and connected with a data integration module;

the data integration module, configured to integrate the data in the initial database, and connected with the similarity calculation module; and

the similarity calculation module, configured to perform similarity comparison and similarity calculation between the converted data and the integrated structured data from the initial database and to extract data, and connected with a data classification module and a data processing module.

6. The device of claim 5, further comprising:

the data classification module, configured to classify data and connected with a finished product database.

7. The device of claim 6, wherein the finished product database is configured to store the classified data and is connected with a cloud storage platform; and

the cloud storage platform is configured to back up the data files in the finished product database.

8. The device of claim 5, wherein the data retrieval module comprises a translator, an optimizer, and an executor;

the translator is configured to translate a query request and generate a query expression;

the optimizer is configured to optimize the query expression to obtain an optimized query plan; and

the executor is configured to select the optimal query plan and execute it to obtain a query result.

9. The device of claim 5, wherein the data classification module classifies and integrates the regulation data, contract data, and safety data in the power grid data.

10. The device of claim 5, wherein the data processing module is configured to process and delete incomplete data, erroneous data, and duplicate data in the structured data after the similarity calculation.

Technical Field

The invention relates to the field of knowledge graphs, in particular to a method and a device for processing unstructured power grid data based on a knowledge graph.

Background

Unstructured data are data whose structure is irregular or incomplete, that have no predefined data model, and that are inconvenient to represent in the two-dimensional logical tables of a database. Such data are widely stored in computer databases, and their volume far exceeds that of structured data. Because unstructured data vary widely in format and follow diverse standards, unstructured information is technically harder to standardize and understand than structured information, and it is difficult to extract and retrieve. Unstructured data therefore need to be processed.

Disclosure of Invention

The invention provides a knowledge graph-based unstructured power grid data processing method and device, which address the poor processing results and slow queries of existing methods for processing unstructured power grid data.

To achieve this purpose, the invention adopts the following technical solution:

A knowledge graph-based unstructured power grid data processing method comprises the following steps:

S1, exporting unstructured data and performing preliminary data retrieval;

S2, cleaning the retrieved data;

S3, further analyzing and converting the cleaned data so as to convert the unstructured data into recognizable structured data;

S4, exporting and integrating the data in the initial database;

S5, performing a similarity comparison between the integrated data and the converted structured data;

S6, processing and deleting incomplete data, erroneous data, and duplicate data in the converted and compared structured data; and

S7, classifying the processed data, uploading the classified data to a finished product database, and backing up the data through a cloud storage platform.

Preferably, the unstructured database includes office documents in all formats, text, pictures, XML, HTML, reports of various types, images, audio/video information, and the like.

Preferably, the data classes in S7 include regulation data, contract data, and safety data, and the classified data are integrated in a unified manner.

Preferably, S3 and S4 are parallel steps and can be performed simultaneously.
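By way of illustration only, the parallel relationship between S3 and S4 could be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function names `convert_cleaned_data`, `export_and_integrate`, and `run_s3_and_s4`, the record layout, and the use of a thread pool are all hypothetical. The only property taken from the text is that neither step consumes the other's output.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_cleaned_data(cleaned_docs):
    """S3 (hypothetical stand-in): turn cleaned text into structured records."""
    return [{"id": i, "text": doc} for i, doc in enumerate(cleaned_docs)]


def export_and_integrate(initial_db_rows):
    """S4 (hypothetical stand-in): export rows and merge those sharing a key."""
    merged = {}
    for row in initial_db_rows:
        merged.setdefault(row["key"], {}).update(row)
    return list(merged.values())


def run_s3_and_s4(cleaned_docs, initial_db_rows):
    # Neither step reads the other's output, so both are submitted together.
    with ThreadPoolExecutor(max_workers=2) as pool:
        s3_future = pool.submit(convert_cleaned_data, cleaned_docs)
        s4_future = pool.submit(export_and_integrate, initial_db_rows)
        # result() blocks until each step finishes; S5 needs both outputs.
        return s3_future.result(), s4_future.result()
```

S5 would then take both return values as its inputs, which is why the sketch waits for both futures before returning.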

A knowledge graph-based unstructured power grid data processing device comprises the following:

an unstructured database, configured to store the original unstructured data and to export the unstructured data through an adapter, the unstructured database being connected with a data retrieval module;

the data retrieval module, configured to retrieve and collect the unstructured data in the unstructured database, and connected with a data cleaning module;

the data cleaning module, configured to clean and sort the data retrieved by the data retrieval module, and connected with a data analysis engine;

the data analysis engine, configured to analyze the data cleaned and sorted by the data cleaning module, and connected with a data converter;

the data converter, configured to convert non-numerical data in the unstructured data into numerical data and thereby convert the unstructured data into structured data, and connected with a similarity calculation module;

an initial database, configured to store the structured data among the existing data, and connected with a data integration module;

the data integration module, configured to integrate the data in the initial database, and connected with the similarity calculation module; and

the similarity calculation module, configured to perform similarity comparison and similarity calculation between the converted data and the integrated structured data from the initial database and to extract data, and connected with a data classification module and a data processing module.
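The text does not specify which similarity measure the similarity calculation module uses. As one possible illustration, the sketch below assumes cosine similarity over bag-of-words term counts; the function names `term_vector`, `cosine_similarity`, and `best_match`, and the `threshold` parameter, are hypothetical and chosen only for this example.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts over lowercase word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between the term vectors of a and b (0.0 if either is empty)."""
    va, vb = term_vector(a), term_vector(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def best_match(converted_text, integrated_texts, threshold=0.5):
    """Return (index, score) of the closest integrated record, or None below the threshold."""
    scored = [(cosine_similarity(converted_text, t), i) for i, t in enumerate(integrated_texts)]
    if not scored:
        return None
    score, index = max(scored)
    return (index, score) if score >= threshold else None
```

A record whose best score falls below the threshold is treated as having no counterpart in the initial database; where to set that threshold is a tuning decision the text leaves open.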

Preferably, the knowledge graph-based unstructured power grid data processing device further comprises:

the data classification module, configured to classify data and connected with a finished product database;

the finished product database, configured to store the classified data and connected with a cloud storage platform; and

the cloud storage platform, configured to back up the data files in the finished product database.
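The storage and backup relationship described above could be sketched as follows. This is an illustration under stated assumptions only: SQLite stands in for the finished product database, a local file path stands in for the cloud storage platform, and the table name `finished` and the functions `store_classified` and `back_up` are hypothetical.

```python
import sqlite3


def store_classified(conn, records):
    """Write (category, content) pairs into a table standing in for the finished product database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS finished (category TEXT NOT NULL, content TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO finished (category, content) VALUES (?, ?)", records)
    conn.commit()


def back_up(conn, backup_path):
    """Copy the whole database to backup_path.

    A real deployment would upload the copy to a cloud storage service instead;
    the local file is only a placeholder for that step.
    """
    target = sqlite3.connect(backup_path)
    try:
        conn.backup(target)
    finally:
        target.close()
```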

Preferably, the data retrieval module comprises a translator, an optimizer, and an executor;

the translator is configured to translate a query request and generate a query expression;

the optimizer is configured to optimize the query expression to obtain an optimized query plan; and

the executor is configured to select the optimal query plan and execute it to obtain a query result.
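The division of labor among translator, optimizer, and executor could be illustrated with the toy sketch below. Everything in it is an assumption made for the example: the `field:value` query syntax, the document layout, and the functions `translate`, `optimize`, and `execute` are hypothetical. The optimizer here costs every predicate ordering by actually filtering the documents, which is only workable for a handful of predicates; a practical optimizer would estimate costs from statistics instead.

```python
from itertools import permutations


def translate(request):
    """Translator: 'field:value field:value' -> list of (field, value) predicates."""
    expression = []
    for token in request.split():
        field, _, value = token.partition(":")
        if not field or not value:
            raise ValueError(f"malformed query term: {token!r}")
        expression.append((field, value.lower()))
    return expression


def matches(doc, predicate):
    """True if the predicate's value occurs in the document's field (case-insensitive)."""
    field, value = predicate
    return value in str(doc.get(field, "")).lower()


def optimize(expression, docs):
    """Optimizer: list every predicate ordering with its cost in predicate checks."""
    plans = []
    for order in permutations(expression):
        remaining, cost = docs, 0
        for predicate in order:
            cost += len(remaining)  # one check per document still in play
            remaining = [d for d in remaining if matches(d, predicate)]
        plans.append((cost, list(order)))
    return plans


def execute(plans, docs):
    """Executor: pick the cheapest plan and run it over the documents."""
    _, plan = min(plans, key=lambda p: p[0])
    result = docs
    for predicate in plan:
        result = [d for d in result if matches(d, predicate)]
    return result
```

A request such as `type:report title:transformer` would pass through `translate`, then `optimize`, then `execute`, mirroring the order in which the three components are described.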

Preferably, the data classification module classifies and integrates the regulation data, contract data, safety data, and other data in the power grid data.
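The text names the three classes but not the classification technique. As a minimal sketch, the example below assumes simple keyword counting; the keyword lists in `CATEGORY_KEYWORDS`, the fallback class `other`, and the functions `classify` and `group_by_category` are hypothetical.

```python
# Hypothetical keyword lists; a real system would derive these from domain data.
CATEGORY_KEYWORDS = {
    "regulation": ("regulation", "standard", "rule"),
    "contract": ("contract", "agreement", "supplier"),
    "safety": ("safety", "hazard", "accident"),
}


def classify(text):
    """Return the category whose keywords occur most often, or 'other' if none occur."""
    lowered = text.lower()
    counts = {
        category: sum(lowered.count(word) for word in words)
        for category, words in CATEGORY_KEYWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "other"


def group_by_category(texts):
    """Integrate classified records into one dict keyed by category."""
    grouped = {}
    for text in texts:
        grouped.setdefault(classify(text), []).append(text)
    return grouped
```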

Preferably, the data processing module is configured to process and delete incomplete data, erroneous data, and duplicate data in the structured data after the similarity calculation.
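How the data processing module decides that a record is incomplete, erroneous, or duplicate is not detailed in the text. The sketch below shows one way such a filter might look, assuming a hypothetical record schema (`REQUIRED_FIELDS`), a hypothetical validity rule (the category must be a known class), and duplicate detection on normalized content.

```python
REQUIRED_FIELDS = ("id", "category", "content")  # hypothetical schema
KNOWN_CATEGORIES = {"regulation", "contract", "safety"}


def is_complete(record):
    """A record is complete when every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def is_valid(record):
    """Hypothetical validity rule: the category must be one of the known classes."""
    return record["category"] in KNOWN_CATEGORIES


def process_records(records):
    """Drop incomplete, invalid, and duplicate records, keeping first occurrences in order."""
    seen = set()
    kept = []
    for record in records:
        if not is_complete(record) or not is_valid(record):
            continue
        # Duplicates are detected on category plus whitespace- and case-normalized content.
        key = (record["category"], record["content"].strip().lower())
        if key in seen:
            continue
        seen.add(key)
        kept.append(record)
    return kept
```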

The beneficial effects of the invention are as follows:

Cleaning the unstructured power grid data removes unusable data and improves the accuracy of data retrieval. Similarity comparison extracts the data that best meet the requirements, which reduces search errors and improves the efficiency of processing unstructured data. Classifying the processed data organizes the power grid data and makes subsequent extraction and use more convenient.

In summary, the invention is simple in structure and convenient to use, improves the processing of unstructured data and the query efficiency, and addresses the poor processing results and slow queries of existing methods for processing unstructured power grid data.

Drawings

FIG. 1 is a flow chart of the knowledge graph-based unstructured power grid data processing method according to the present invention.

FIG. 2 is a structural diagram of the knowledge graph-based unstructured power grid data processing device according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention.

Example 1

Referring to FIG. 1, a knowledge graph-based unstructured power grid data processing method includes the following steps:

S1, exporting unstructured data and performing preliminary data retrieval;

S2, cleaning the retrieved data;

S3, further analyzing and converting the cleaned data so as to convert the unstructured data into recognizable structured data;

S4, exporting and integrating the data in the initial database;

S5, performing a similarity comparison between the integrated data and the converted structured data;

S6, processing and deleting incomplete data, erroneous data, and duplicate data in the converted and compared structured data; and

S7, classifying the processed data, uploading the classified data to a finished product database, and backing up the data through a cloud storage platform.

The unstructured database includes office documents in all formats, text, pictures, XML, HTML, reports of various types, images, audio/video information, and the like.

The data classes in S7 include regulation data, contract data, and safety data, and the classified data are integrated in a unified manner.

Steps S3 and S4 are parallel steps and can be performed simultaneously.
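The data flow through steps S1 to S7 can be summarized in a short sketch. It is illustrative only: the `Pipeline` class and its field names are hypothetical, each step is supplied by the caller as a plain function, and the steps are run one after another for readability even though S3 and S4 may run simultaneously.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Wires the seven steps together; each field is a caller-supplied function."""
    retrieve: Callable            # S1: export and preliminary retrieval
    clean: Callable               # S2: cleaning
    convert: Callable             # S3: unstructured -> structured
    integrate: Callable           # S4: export and integrate the initial database
    compare: Callable             # S5: similarity comparison
    process: Callable             # S6: delete incomplete, erroneous, duplicate data
    classify_and_store: Callable  # S7: classify, store, back up

    def run(self, unstructured_source, initial_db):
        retrieved = self.retrieve(unstructured_source)
        cleaned = self.clean(retrieved)
        structured = self.convert(cleaned)
        integrated = self.integrate(initial_db)  # independent of S1 to S3
        compared = self.compare(integrated, structured)
        processed = self.process(compared)
        return self.classify_and_store(processed)
```

The only ordering constraints the sketch encodes are those stated in the steps themselves: S5 needs the outputs of both S3 and S4, and S6 and S7 follow S5.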

Example 2

Referring to FIG. 2, a knowledge graph-based unstructured power grid data processing device includes the following:

an unstructured database, configured to store the original unstructured data and to export the unstructured data through an adapter, the unstructured database being connected with a data retrieval module;

the data retrieval module, configured to retrieve and collect the unstructured data in the unstructured database, and connected with a data cleaning module; the data retrieval module comprises a translator, an optimizer, and an executor;

the translator is configured to translate a query request and generate a query expression;

the optimizer is configured to optimize the query expression to obtain an optimized query plan;

the executor is configured to select the optimal query plan and execute it to obtain a query result;

the data cleaning module, configured to clean and sort the data retrieved by the data retrieval module, and connected with a data analysis engine;

the data analysis engine, configured to analyze the data cleaned and sorted by the data cleaning module, and connected with a data converter;

the data converter, configured to convert non-numerical data in the unstructured data into numerical data and thereby convert the unstructured data into structured data, and connected with a similarity calculation module;

an initial database, configured to store the structured data among the existing data, and connected with a data integration module;

the data integration module, configured to integrate the data in the initial database, and connected with the similarity calculation module;

the similarity calculation module, configured to perform similarity comparison and similarity calculation between the converted data and the integrated structured data from the initial database and to extract data, and connected with a data classification module and a data processing module;

the data classification module, configured to classify data and connected with a finished product database;

the finished product database, configured to store the classified data and connected with a cloud storage platform; and

the cloud storage platform, configured to back up the data files in the finished product database.

The data classification module classifies and integrates the regulation data, contract data, safety data, and other data in the power grid data.

The data processing module processes and deletes incomplete data, erroneous data, and duplicate data in the structured data after the similarity calculation.
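The data converter's task of turning non-numerical data into numerical data is not elaborated in the text. One possible reading is sketched below, assuming a hypothetical raw record with a free-text `description`, an ISO-format `date`, and a `category` label; the unit table `UNIT_FACTORS`, the functions `parse_voltage` and `to_structured`, and the `LabelEncoder` class are illustrative inventions, not part of the disclosure.

```python
import re
from datetime import date

UNIT_FACTORS = {"V": 1.0, "kV": 1000.0}  # hypothetical unit table


def parse_voltage(text):
    """'110 kV' -> 110000.0 (volts); returns None if no voltage is found."""
    m = re.search(r"(\d+(?:\.\d+)?)\s*(kV|V)\b", text)
    if m is None:
        return None
    return float(m.group(1)) * UNIT_FACTORS[m.group(2)]


class LabelEncoder:
    """Assign stable integer codes to labels in order of first appearance."""

    def __init__(self):
        self.codes = {}

    def encode(self, label):
        # A new label receives the next unused integer; a known label keeps its code.
        return self.codes.setdefault(label, len(self.codes))


def to_structured(raw, encoder):
    """Convert one raw record of strings into a record of numeric fields."""
    return {
        "voltage_v": parse_voltage(raw["description"]),
        "date_ordinal": date.fromisoformat(raw["date"]).toordinal(),
        "category_code": encoder.encode(raw["category"]),
    }
```

Numeric fields of this kind are what a downstream similarity calculation can operate on directly, which is presumably why the converter sits immediately before the similarity calculation module.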

In use, the unstructured power grid data in the unstructured database are first exported through the adapter. The translator translates the query request and generates a query expression, the optimizer optimizes the query expression to obtain an optimized query plan, and the executor selects the optimal query plan and executes it to obtain a query result; in this way the unstructured data in the unstructured database are retrieved and collected. The data cleaning module then cleans and sorts the data retrieved by the data retrieval module, and the data analysis engine analyzes the cleaned and sorted data. The data converter converts the non-numerical data in the unstructured data into numerical data, so that the unstructured data become structured data, while the structured data in the initial database are exported and integrated. The similarity calculation module then performs similarity comparison and similarity calculation between the converted data and the integrated structured data from the initial database and extracts data. The data processing module processes and deletes incomplete, erroneous, and duplicate data in the structured data after the similarity calculation, and at the same time the data classification module classifies the extracted data. After classification, the classified data are stored in the finished product database, and the data files in the finished product database are backed up through the cloud storage platform.
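The cleaning performed by the data cleaning module in the process above is not specified further. For text documents, a minimal sketch might look like the following; the choice of operations (naive tag stripping, entity unescaping, Unicode normalization, removal of control characters, whitespace collapsing) and the function names `clean_text` and `clean_batch` are assumptions made for illustration.

```python
import html
import re
import unicodedata


def clean_text(raw):
    """Strip markup and normalize whitespace in one retrieved text document."""
    text = re.sub(r"<[^>]+>", " ", raw)         # drop HTML/XML tags (naive, not a full parser)
    text = html.unescape(text)                  # decode entities such as &amp;
    text = unicodedata.normalize("NFKC", text)  # fold compatibility forms such as full-width characters
    # Remove control characters but keep whitespace for the collapsing step below.
    text = "".join(ch for ch in text if ch.isprintable() or ch.isspace())
    return re.sub(r"\s+", " ", text).strip()


def clean_batch(docs):
    """Clean every document and drop those that end up empty."""
    cleaned = (clean_text(d) for d in docs)
    return [c for c in cleaned if c]
```

Pictures, audio, and video would need entirely different handling; the sketch covers only the textual formats among those the unstructured database is said to hold.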

The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art within the technical scope disclosed by the present invention, according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.
