Data-driven-based power distribution network reactive power optimization method and device

Document No.: 1924705 Publication date: 2021-12-03

Reading note: This technology, "Data-driven-based power distribution network reactive power optimization method and device" (一种基于数据驱动的配电网无功优化方法及装置), was designed by 曹华珍高崇陈沛东何璇张俊潇王天霖黄烨罗强程苒黄曜林凌雪 on 2021-09-07. Main content: The invention discloses a data-driven reactive power optimization method and device for a power distribution network. The method comprises: inputting historical data into a preset lower-layer neural network for training to obtain a lower-layer training model and voltage data of non-measurable key nodes; inputting the acquired voltage data of the measurable nodes and the non-measurable key nodes of the target area power grid, together with the preset switching state of the discrete reactive power regulation equipment, into a preset upper-layer neural network for training to obtain an upper-layer training model and switching action instructions for the discrete reactive power equipment; deploying the lower-layer training model and the upper-layer training model to an actually operating power distribution network to obtain an overall operation model; and constructing an objective function from the operating network loss and the switching action instructions of the discrete reactive power equipment, and solving the objective function with the A2C algorithm to obtain the reactive power optimization result of the power distribution network. By obtaining the overall operation model and combining the network loss with the switching action instructions, the invention optimizes the voltage distribution and reduces the network loss.

1. A reactive power optimization method for a power distribution network based on data driving is characterized by comprising the following steps:

inputting the acquired voltage data of the measurable nodes and the non-measurable key nodes of the target area power grid, together with the preset switching state of the discrete reactive power regulation equipment, into a preset upper-layer neural network for training, acquiring an upper-layer training model, and acquiring switching action instructions of the discrete reactive power equipment according to the upper-layer training model;

deploying the lower layer training model and the upper layer training model to an actually-operated power distribution network to obtain an overall operation model;

and constructing an objective function according to the network loss of the integral operation model and the switching action instruction of the discrete reactive power equipment, and solving the objective function by adopting an A2C algorithm to obtain a reactive power optimization result of the power distribution network.

2. The reactive power optimization method for the power distribution network based on data driving according to claim 1, wherein historical data of power injection of measurable nodes of the power distribution network are input into a preset lower-layer neural network for training, and a lower-layer training model is obtained, and the method comprises the following steps:

historical data of injection power of measurable nodes of the power distribution network comprise measurable node loads;

and training the preset lower-layer neural network according to the measurable node load to obtain the lower-layer training model.

3. The data-driven-based power distribution network reactive power optimization method according to claim 2, wherein the obtaining of the switching action instruction of the discrete reactive power equipment according to the upper training model comprises:

converting the objective function in the upper-layer training model into a Markov decision process, optimizing the Markov decision process by a policy gradient method, and acquiring the switching action instruction of the discrete reactive power equipment.

4. The data-driven-based reactive power optimization method for the power distribution network according to claim 3, wherein the network loss of the overall operation model is obtained by determining an optimal-network-loss objective function from the minimum squared difference between each node voltage and a preset target voltage value, and calculating the network loss according to that objective function.

5. The method for optimizing the reactive power of the power distribution network based on the data driving according to claim 4, wherein the constructing an objective function according to the network loss of the overall operation model and the switching action instructions of the discrete reactive power devices, and solving the objective function by using an A2C algorithm to obtain the result of optimizing the reactive power of the power distribution network comprises:

the switching action instruction of the discrete reactive equipment comprises the action cost of the discrete reactive equipment;

and constructing the objective function according to the network loss of the integral operation model and the action cost of the discrete reactive power equipment, introducing constraint conditions constructed according to node voltage, reactive power of the nodes and preset control variables, solving the objective function by adopting an A2C algorithm, and obtaining a reactive power optimization result of the power distribution network.

6. A distribution network reactive power optimization device based on data driving is characterized by comprising:

the lower-layer training module is used for inputting historical data of injection power of measurable nodes of the power distribution network into a preset lower-layer neural network for training, acquiring a lower-layer training model and acquiring voltage data of the non-measurable key nodes according to the lower-layer training model;

the upper-layer training module is used for inputting the acquired voltage data of the measurable nodes and the non-measurable key nodes of the target area power grid, together with the preset switching state of the discrete reactive power regulation equipment, into a preset upper-layer neural network for training, acquiring an upper-layer training model, and acquiring switching action instructions of the discrete reactive power equipment according to the upper-layer training model;

the operation module is used for deploying the lower layer training model and the upper layer training model to an actually-operated power distribution network to obtain an integral operation model;

and the optimization module is used for constructing an objective function according to the network loss of the integral operation model and the switching action instruction of the discrete reactive power equipment, solving the objective function by adopting an A2C algorithm and obtaining a reactive power optimization result of the power distribution network.

7. The reactive power optimization device for the power distribution network based on data driving of claim 6, wherein the lower training module is further configured to:

historical data of injection power of measurable nodes of the power distribution network comprise measurable node loads;

and training the preset lower-layer neural network according to the measurable node load to obtain the lower-layer training model.

8. The reactive power optimization device for the power distribution network based on data driving of claim 7, wherein the upper training module is further configured to:

9. The data-driven-based power distribution network reactive power optimization device of claim 8, wherein the optimization module is further configured to:

and the network loss of the integral operation model is obtained by determining an optimal network loss objective function according to the minimum square difference value of each node voltage and a preset target voltage value and calculating according to the optimal network loss objective function.

10. The data-driven-based power distribution network reactive power optimization device of claim 9, wherein the optimization module is further configured to:

the switching action instruction of the discrete reactive equipment comprises the action cost of the discrete reactive equipment;

Technical Field

The invention relates to the technical field of power distribution network optimization, in particular to a data-driven power distribution network reactive power optimization method and device.

Background

With the development of new energy and distributed generation technologies, a large number of distributed generation (DG) units are connected to the power grid, which raises DG penetration in the distribution network and increases uncertainty. According to statistics, only 51% of the DG units in 10 kV distribution networks are connected to the dispatching master station, and fewer than 10% have three-remote or higher access, so the medium-voltage distribution network is characterized by high DG penetration and low perceptibility. Traditional centralized reactive voltage control is mostly based on swarm intelligence optimization algorithms, which are computationally complex and require accurate, complete load forecast data and the target network topology. For a distribution network in a low-perception environment, load forecast data and the complete network topology are difficult to obtain, which poses a major challenge to centralized reactive voltage control of the distribution network.

Disclosure of Invention

The invention aims to provide a data-driven power distribution network reactive power optimization method and device, and aims to solve the problem that limited measurable node voltage data in the prior art cannot completely reflect the voltage fluctuation of a target area power grid.

In order to achieve the purpose, the invention provides a data-driven power distribution network reactive power optimization method, which comprises the following steps:

deploying the lower layer training model and the upper layer training model to an actually-operated power distribution network to obtain an overall operation model;

Preferably, the step of inputting historical data of injection power of measurable nodes of the power distribution network into a preset lower-layer neural network for training to obtain a lower-layer training model includes:

historical data of injection power of measurable nodes of the power distribution network comprise measurable node loads;

and training the preset lower-layer neural network according to the measurable node load to obtain the lower-layer training model.

Preferably, the obtaining of the switching action instruction of the discrete reactive power device according to the upper training model includes:

Preferably, the network loss of the overall operation model is obtained by determining an objective function of the optimal network loss according to the minimum squared difference value between the voltage of each node and a preset target voltage value, and then calculating according to the objective function of the optimal network loss.

Preferably, the constructing an objective function according to the network loss of the integral operation model and the switching action instruction of the discrete reactive power equipment, and solving the objective function by using an A2C algorithm to obtain a reactive power optimization result of the power distribution network includes:

the switching action instruction of the discrete reactive equipment comprises the action cost of the discrete reactive equipment;

The invention provides a data-driven power distribution network reactive power optimization device, which comprises:

the operation module is used for deploying the lower layer training model and the upper layer training model to an actually-operated power distribution network to obtain an integral operation model;

Preferably, the lower layer training module is further configured to:

historical data of injection power of measurable nodes of the power distribution network comprise measurable node loads;

and training the preset lower-layer neural network according to the measurable node load to obtain the lower-layer training model.

Preferably, the upper training module is further configured to:

Preferably, the optimization module is further configured to:

the switching action instruction of the discrete reactive equipment comprises the action cost of the discrete reactive equipment;

Compared with the prior art, the invention has the beneficial effects that:

Historical data of the injection power of the measurable nodes of the power distribution network is input into a preset lower-layer neural network for training to obtain a lower-layer training model, and voltage data of the non-measurable key nodes is obtained from that model. The acquired voltage data of the measurable nodes and the non-measurable key nodes of the target area power grid, together with the preset switching state of the discrete reactive power regulation equipment, is input into a preset upper-layer neural network for training to obtain an upper-layer training model, from which switching action instructions for the discrete reactive power equipment are obtained. The lower-layer and upper-layer training models are deployed to an actually operating power distribution network to obtain an overall operation model. An objective function is constructed from the network loss of the overall operation model and the switching action instructions of the discrete reactive power equipment, and is solved with the A2C algorithm to obtain the reactive power optimization result of the power distribution network. The method optimizes the voltage distribution and reduces the active network loss while simplifying the model. To address the problem that limited measurable node voltage data cannot fully reflect the voltage fluctuation of the target area power grid, the lower-layer training model fits the voltages of the non-measurable key nodes, which improves the rationality and accuracy of the optimization model.

Drawings

In order to more clearly illustrate the technical solution of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a schematic flow chart of a data-driven reactive power optimization method for a power distribution network according to an embodiment of the present invention;

Fig. 2 is a schematic flow chart of a data-driven reactive power optimization method for a power distribution network according to another embodiment of the present invention;

Fig. 3 is a schematic flow chart of a preset upper-layer neural network according to another embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a data-driven reactive power optimization device for a power distribution network according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be understood that the step numbers used herein are for convenience of description only and are not intended as limitations on the order in which the steps are performed.

It is to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

The terms "comprises" and "comprising" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The term "and/or" refers to and includes any and all possible combinations of one or more of the associated listed items.

Referring to fig. 1, an embodiment of the present invention provides a method for optimizing reactive power of a power distribution network based on data driving, including the following steps:

S101: inputting historical data of the injection power of the measurable nodes of the power distribution network into a preset lower-layer neural network for training, obtaining a lower-layer training model, and obtaining voltage data of the non-measurable key nodes according to the lower-layer training model.

Referring to Fig. 2, the overall distribution network control process is divided into a lower-layer training stage, an upper-layer training stage, and a real-time control stage. The three stages complete the optimization through information transfer among themselves and interaction with the distribution network. In the lower-layer training stage, Agent 1 trains a deep neural network on historical data. The injection power of the measurable nodes is the input data; this historical data comprises the measurable node loads, which are used to train the preset lower-layer neural network and obtain the lower-layer training model. The voltages of the non-measurable key nodes are the output data; once fitted, they are passed to the upper-layer training model. Here, measurable nodes are nodes with four-remote access, non-measurable nodes are nodes that are not connected to the dispatching master station, and non-measurable key nodes are nodes that are not connected to the dispatching master station but are most representative of the operating condition of the distribution network.

The preset lower-layer neural network comprises three convolutional blocks and two fully connected layers. Each convolutional block consists of a convolution layer and a batch normalization layer, and a ReLU activation function follows each batch normalization layer and each fully connected layer. The input layer takes the loads of the measurable nodes connected to the dispatching master station. The target area power grid has N + 1 nodes and s lines, of which m nodes have distributed generation connected, so the neural network has 2N input neurons. DG access nodes are treated as PV nodes and ordinary load nodes as PQ nodes, namely:

$$X = [P_{pv,1}, \ldots, P_{pv,m}, P_{pq,1}, \ldots, P_{pq,N-m}, Q_{pv,1}, \ldots, Q_{pv,m}, Q_{pq,1}, \ldots, Q_{pq,N-m}]$$

where $P_{pv,i}$ denotes the active load of a node with DG, $P_{pq,i}$ denotes the active load of a node without DG, $Q_{pv,i}$ denotes the reactive load of a node with DG, and $Q_{pq,i}$ denotes the reactive load of a node without DG.

The output layer gives the voltage data of the non-measurable key nodes:

$$Y = [V_1, \ldots, V_k, \ldots, V_n]$$

where $n$ is the number of non-measurable key nodes and $V_k$ ($k = 1, \ldots, n$) is the voltage of the $k$-th non-measurable key node.
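As a minimal sketch of how the input vector X described above could be assembled, the following Python function orders the loads as PV-node active power, PQ-node active power, PV-node reactive power, and PQ-node reactive power. The function name `build_input_vector`, the (P, Q) tuple layout, and the sample values are illustrative assumptions and are not taken from the patent.

```python
from typing import List, Sequence, Tuple

Load = Tuple[float, float]  # (active power P, reactive power Q) of one node


def build_input_vector(pv_loads: Sequence[Load], pq_loads: Sequence[Load]) -> List[float]:
    """Arrange node loads as X = [P_pv..., P_pq..., Q_pv..., Q_pq...]."""
    p_pv = [p for p, _ in pv_loads]   # active loads of DG (PV) nodes
    p_pq = [p for p, _ in pq_loads]   # active loads of ordinary (PQ) nodes
    q_pv = [q for _, q in pv_loads]   # reactive loads of DG (PV) nodes
    q_pq = [q for _, q in pq_loads]   # reactive loads of ordinary (PQ) nodes
    return p_pv + p_pq + q_pv + q_pq


# Hypothetical example: m = 1 DG node and N - m = 2 ordinary load nodes (N = 3).
x = build_input_vector([(0.5, 0.1)], [(1.2, 0.4), (0.8, 0.3)])
print(len(x))  # 2N entries
```

With N = 3 nodes the vector has 2N = 6 entries, matching the stated number of input neurons.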

S102: inputting the acquired voltage data of the measurable nodes and the non-measurable key nodes of the target area power grid, together with the preset switching state of the discrete reactive power regulation equipment, into a preset upper-layer neural network for training, acquiring an upper-layer training model, and acquiring switching action instructions of the discrete reactive power equipment according to the upper-layer training model.

Referring to Fig. 2, specifically, the objective function in the upper-layer training model is converted into a Markov decision process, which is optimized by a policy gradient method to obtain the switching action instructions of the discrete reactive power equipment. In the upper-layer training stage, the dispatching center extracts the voltage operation data of the measurable nodes of the target area power grid and the voltage operation data of the non-measurable key nodes obtained in the lower-layer training stage. These are combined with the switching states of the discrete regulating equipment to form the joint state input, and the switching action instructions of the discrete reactive power equipment are the action output. Agent 2 fits the dual-network model of the upper-layer deep neural network, and network training is completed through the interaction between the main evaluation network and the main action network.

Referring to Fig. 3, the preset upper-layer neural network adopts an Actor-Critic dual-network model, in which an actor network and a critic network respectively fit the policy function $\pi_{\theta_1}$ and the state value function $V_{\theta_2}$; $\theta_1$ and $\theta_2$ are the network parameters. In Fig. 3, $U_i$ is the original input voltage matrix, which serves as the receptive field, and the features of the input are extracted in units of regional feature maps; $T_i$ and $C_{Ti}$ are the discrete device state inputs, and $s_i$ is the overall state input.

That is, the A2C model mainly comprises three parts: a convolutional network layer that extracts features from the input voltages, an AN (main action network) that fits the mapping from state to action, and a CN (main evaluation network) that fits the state value function. The model input is the state $s$ of the Markov decision process for reactive power optimization of the distribution network, which comprises the target node voltage matrix, the discrete device switching state $T$ (one-hot encoded), and the discrete device switching count $C_T$ (one-hot encoded).

The policy gradient (PG) method is used to optimize the AN. The commonly used policy gradient method has the following optimization objective:
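A sketch of this objective in LaTeX, assuming the standard policy gradient form. The policy symbol $\pi_{\theta_1}$, the log-probability form, and the name $L^{PG}$ are assumptions made for illustration; only $E_t$, $A_t$, $s$, $a$ and $\theta_1$ come from the surrounding text.

```latex
L^{PG}(\theta_1) = E_t\left[ \log \pi_{\theta_1}(a_t \mid s_t)\, A_t \right]
```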

Here, $s$ and $a$ are the state and action of the Markov decision process, respectively, $E_t$ denotes the expectation over samples collected through continued interaction, and $A_t$ denotes the action advantage function, expressed as follows:
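A sketch of the advantage function in its usual one-step form, written with the symbols $q$, $r$, $s'$, $\mu$ and $\theta_2$ that the text defines. The symbol $V_{\theta_2}$ for the critic's state value function and the one-step expansion of $q$ are assumptions.

```latex
A_t = q(s, a) - V_{\theta_2}(s), \qquad q(s, a) = r(s, a) + \mu\, V_{\theta_2}(s')
```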

Here, $V_{\theta_2}(s)$ denotes the state value function of state $s$, $\theta_2$ denotes the parameters of the CN, $q(s, a)$ denotes the value function of taking action $a$ in state $s$, $r(s, a)$ denotes the immediate reward of taking action $a$ in state $s$, $s'$ denotes the successor state after taking action $a$ in state $s$, and $\mu$ denotes the discount coefficient.

The role of the CN is to fit the state value function, whose training label can be obtained from the Bellman equation as follows:
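One common reading of this step, assumed here rather than taken from the patent, is the one-step Bellman target $r(s, a) + \mu V(s')$, with the squared difference between that target and the current estimate $V(s)$ used as the critic loss. A minimal Python sketch with hypothetical function names:

```python
def td_target(reward: float, next_value: float, mu: float) -> float:
    """One-step Bellman target r(s, a) + mu * V(s') used as the critic label."""
    return reward + mu * next_value


def advantage(reward: float, value: float, next_value: float, mu: float) -> float:
    """Advantage estimate: Bellman target minus the current estimate V(s)."""
    return td_target(reward, next_value, mu) - value


def critic_loss(reward: float, value: float, next_value: float, mu: float) -> float:
    """Squared error between the critic output V(s) and its Bellman target."""
    return advantage(reward, value, next_value, mu) ** 2


# Hypothetical numbers: r = 1.0, V(s) = 2.0, V(s') = 3.0, discount mu = 0.5.
adv = advantage(1.0, 2.0, 3.0, 0.5)
loss = critic_loss(1.0, 2.0, 3.0, 0.5)
```

The same advantage value can then weight the log-probability of the chosen action when the AN is updated.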

If the dispatching center that makes the action decisions is regarded as the decision-making agent and the actual power system as the environment, the reactive power optimization model of the distribution network can be converted into a typical multi-step decision problem, and suitable state space encoding ensures the Markov property of the decision process. For the continuous reactive power optimization problem, the variables of the decision process $\langle S, R, A, P, \gamma \rangle$ are defined as follows. The state space $S$ is the input state information, which mainly comprises the voltage information of the non-measurable key nodes and the measurable nodes and the switching state information of the discrete action equipment. The state $s_i$ of the $i$-th decision stage in the switching action instruction of the discrete reactive power equipment is defined as follows:

Here, $U_i$ denotes the target node voltage matrix in the $i$-th decision stage, with dimension $n \times k$; $n = m + p$ is the sum of the number $m$ of nodes with four-remote access to the dispatching center and the number $p$ of key nodes; $k$ is the number of measurements in the decision period; $T_i$ is the switching gear of the discrete action equipment in the $i$-th decision stage; and $C_{Ti}$ is the number of switching operations of the discrete equipment.
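A minimal Python sketch of how such a state could be encoded, assuming the state is the triple $(U_i, T_i, C_{Ti})$ flattened into one vector, with the gear and the switching count one-hot encoded as described earlier. The function names, the single-device simplification, and the sample values are hypothetical.

```python
from typing import List, Sequence


def one_hot(index: int, size: int) -> List[float]:
    """Return a one-hot vector of length `size` with a 1.0 at `index`."""
    if not 0 <= index < size:
        raise ValueError("index out of range for one-hot encoding")
    return [1.0 if i == index else 0.0 for i in range(size)]


def encode_state(voltages: Sequence[Sequence[float]], gear: int, n_gears: int,
                 switch_count: int, max_switches: int) -> List[float]:
    """Flatten the n x k voltage matrix U_i and append one-hot T_i and C_Ti."""
    flat_u = [v for row in voltages for v in row]
    # Switching counts range from 0 to max_switches inclusive.
    return flat_u + one_hot(gear, n_gears) + one_hot(switch_count, max_switches + 1)


# Hypothetical example: n = 2 nodes, k = 2 measurements, one device with
# 3 gears currently at gear 1, and 0 of at most 2 switching operations used.
state = encode_state([[1.00, 1.01], [0.99, 1.00]], gear=1, n_gears=3,
                     switch_count=0, max_switches=2)
```

With several discrete devices, one such pair of one-hot vectors per device would be appended.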

S103: and deploying the lower training model and the upper training model to an actually-operated power distribution network to obtain an overall operation model.

S104: and constructing an objective function according to the network loss of the integral operation model and the switching action instruction of the discrete reactive power equipment, and solving the objective function by adopting an A2C algorithm to obtain a reactive power optimization result of the power distribution network.

Specifically, A2C stands for Advantage Actor-Critic. In real-time operation, the trained upper-layer and lower-layer models are deployed to the actually operating distribution network. In the real-time operation stage, the optimization problem is converted into a Markov decision process; the network loss and the discrete equipment action cost are the optimization objectives, and the discrete equipment action instructions are the control variables. The problem is solved in real time to obtain the optimization strategy (this process interacts with the distribution network in real time and is carried out in Agent 2).

An optimal-network-loss objective function is determined from the minimum squared difference between each node voltage and a preset target voltage value, and the network loss of the overall operation model is obtained from this objective function. The switching action instruction of the discrete reactive power equipment includes the action cost of the equipment. An objective function is constructed from the network loss of the overall operation model and the action cost of the discrete reactive power equipment, constraints built from the node voltages, the nodal reactive power, and preset control variables are introduced, and the objective function is solved with the A2C algorithm to obtain the reactive power optimization result of the distribution network. Neglecting the power factors of adjacent nodes, the network loss is expressed as follows:
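A sketch of the loss expression in LaTeX, under two assumptions that the text does not state explicitly: that the neglected quantity is the phase-angle difference $\theta_{ij}$ between adjacent nodes, and that $G_{ij}$ is the conductance of the branch between nodes $i$ and $j$. Under these assumptions the exact branch loss reduces to a squared voltage difference. The summation over the $N$ instruction cycles in a day is omitted here.

```latex
P_{Loss} = \sum_{(i,j)} G_{ij}\left( V_i^2 + V_j^2 - 2 V_i V_j \cos\theta_{ij} \right)
\;\approx\; \sum_{(i,j)} G_{ij}\,(V_i - V_j)^2 \quad (\theta_{ij} \approx 0)
```

This form makes the later argument visible: when all node voltages converge to a common value, every term $(V_i - V_j)^2$ shrinks and the loss falls.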

Here, $P_{Loss}$ denotes the network loss, $V_i$ and $V_j$ denote the voltage values of any two nodes, $G_{ij}$ denotes the conductance between nodes $i$ and $j$, and $N$ denotes the number of instruction cycles in a day. Let the preset target voltage value be $V_{set}$. Taking the minimum of the squared differences between each node voltage and $V_{set}$ as the objective function makes the voltages of all observed nodes and key nodes across the network converge to $V_{set}$, thereby achieving a relatively optimal network loss for the whole network. Using the least squares method, the objective function is simplified into another form, as follows:

The average of all measurable node voltages and key node voltages is selected as the target voltage value $V_{set}$, as follows:
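A minimal Python sketch of this choice of target voltage. It assumes the simplified objective is the sum of squared deviations of the observed voltages from $V_{set}$ (the exact formula is an assumption); the function names and the per-unit sample values are hypothetical. The mean is the value that minimizes such a sum, which the final comparison illustrates.

```python
from typing import Sequence


def target_voltage(voltages: Sequence[float]) -> float:
    """V_set as the average of all measurable and key node voltages."""
    return sum(voltages) / len(voltages)


def squared_deviation(voltages: Sequence[float], v_set: float) -> float:
    """Sum of squared differences between each node voltage and V_set."""
    return sum((v - v_set) ** 2 for v in voltages)


# Hypothetical per-unit voltages of measurable and key nodes.
voltages = [1.02, 0.98, 1.00, 1.00]
v_set = target_voltage(voltages)
at_mean = squared_deviation(voltages, v_set)
at_other = squared_deviation(voltages, 1.01)  # any other target is no better
print(v_set, at_mean <= at_other)
```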

Making the node voltages consistent reduces the network loss caused by reactive power, reduces the reactive power demand of the target area power grid, and prevents reverse power transmission. Taking system operating economy and the adjustment cost of the discrete equipment into account, the dynamic reactive power optimization objective of the distribution network can be split into network loss and action cost. Because the optimization function already includes voltage optimization, no separate voltage objective needs to be set. Considering the network loss and the equipment switching cost, the objective function is as follows:
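A sketch of such an objective in LaTeX, assembled from the quantities the text defines ($N$, $P_{loss,i}$, $c_j$, $m$, $\lambda_c$ and a 0-1 action function). The additive form and the placeholder name $\delta_{ij}$ for the 0-1 function are assumptions.

```latex
\min F = \sum_{i=1}^{N} \left( P_{loss,i} + \lambda_c \sum_{j=1}^{m} c_j\, \delta_{ij} \right)
```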

where $N$ is the number of instruction cycles in a day, $P_{loss,i}$ is the system network loss (the optimization variable) in the $i$-th instruction cycle, $c_j$ is the action cost of the $j$-th device, the 0-1 function equals 1 when the $j$-th device acts in the $i$-th instruction cycle and 0 otherwise, $m$ is the number of discrete devices, and $\lambda_c$ is the action cost coefficient. The constraints are defined as follows:

$U^{\min} \le U \le U^{\max}$;

$Q^{\min} \le Q \le Q^{\max}$;

$T^{\min} \le T \le T^{\max}$;

$g_i(X, T) = 0, \quad i = 1, 2, \ldots, N$;

Here, $U$, $U^{\max}$ and $U^{\min}$ denote the node voltage and its maximum and minimum values; $Q$, $Q^{\max}$ and $Q^{\min}$ denote the reactive power and its maximum and minimum values; $T$, $T^{\max}$ and $T^{\min}$ denote the switching state of the discrete device and its maximum and minimum values; $g_i(X, T)$ represents the control variable constraint; and a further constraint limits the number of actions of the discrete devices, including the stepped capacitors and the on-load tap changer (OLTC). Adding the node voltage constraint to the objective function changes the optimization objective to the following:

Here, $\gamma_1$ and $\gamma_2$ are penalty coefficients, and $\sigma$ is an indicator function that takes the value 0 when the constraint is satisfied and 1 when it is not. In a low-perception distribution network, the equation $g_i(X, T) = 0$, $i = 1, 2, \ldots, N$, which expresses the theoretical nonlinear power flow relationship, is difficult to establish accurately. Because only some nodes are measured in real time, the constraint $U^{\min} \le U \le U^{\max}$ applies only to the measured nodes and the key nodes. In fact, the measurable node voltages and the key node voltages are strongly representative of the voltage distribution, and even under strong dispersion a node voltage is representative of its neighborhood. It can therefore be approximately assumed that when the target node voltages are within the allowable range, all node voltages across the network are within the allowable range. The A2C algorithm is then used to solve the objective function and obtain the reactive power optimization result of the distribution network.
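A minimal Python sketch of the penalty idea for a single instruction cycle. It assumes the penalized cost adds $\gamma_1 \sigma$ for each observed voltage outside its limits and $\gamma_2 \sigma$ for each device that exceeds its allowed number of actions; this weighting, like every function and parameter name below, is an assumption made for illustration.

```python
from typing import Sequence


def sigma(satisfied: bool) -> int:
    """Indicator function: 0 if the constraint holds, 1 if it is violated."""
    return 0 if satisfied else 1


def penalized_cost(p_loss: float, action_costs: Sequence[float], acted: Sequence[int],
                   lambda_c: float, voltages: Sequence[float], u_min: float,
                   u_max: float, switch_counts: Sequence[int], max_switches: int,
                   gamma1: float, gamma2: float) -> float:
    """Network loss + action cost + penalties for violated constraints."""
    # Action cost: lambda_c times the cost of every device that acted (0-1 flag).
    action_term = lambda_c * sum(c * a for c, a in zip(action_costs, acted))
    # Voltage penalty applies only to the measured and key nodes passed in.
    voltage_penalty = gamma1 * sum(sigma(u_min <= u <= u_max) for u in voltages)
    # Action-count penalty, for example for stepped capacitors and the OLTC.
    count_penalty = gamma2 * sum(sigma(n <= max_switches) for n in switch_counts)
    return p_loss + action_term + voltage_penalty + count_penalty


# Hypothetical cycle: one voltage above its limit, one device over its count limit.
cost = penalized_cost(p_loss=2.0, action_costs=[1.0, 3.0], acted=[1, 0],
                      lambda_c=0.5, voltages=[1.0, 1.08], u_min=0.93, u_max=1.07,
                      switch_counts=[2, 5], max_switches=4,
                      gamma1=10.0, gamma2=4.0)
```

In a reinforcement learning setting, the negative of this cost could serve as the per-step reward, so that constraint violations are discouraged rather than strictly forbidden.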

In this method, an overall model is constructed from the training results of the upper and lower layers and deployed to the actually operating distribution network to obtain the overall operation model. An objective function is constructed from the network loss of the overall operation model and the switching action instructions of the discrete reactive power equipment, and is solved with the A2C algorithm to obtain the reactive power optimization result of the distribution network. The method optimizes the voltage distribution and reduces the active network loss while simplifying the model. To address the problem that limited measurable node voltage data cannot fully reflect the voltage fluctuation of the target area power grid, the lower-layer training model fits the voltages of the non-measurable key nodes, which improves the rationality and accuracy of the optimization model.

Referring to fig. 4, another embodiment of the present invention provides a data-driven reactive power optimization device for a power distribution network, including:

The lower-layer training module 11 is used for inputting historical data of the injection power of the measurable nodes of the power distribution network into a preset lower-layer neural network for training, acquiring a lower-layer training model, and acquiring voltage data of the non-measurable key nodes according to the lower-layer training model.

The upper-layer training module 12 is configured to input the acquired voltage data of the measurable nodes and the non-measurable key nodes of the target area power grid, together with the preset switching state of the discrete reactive power regulation equipment, into a preset upper-layer neural network for training, acquire an upper-layer training model, and acquire switching action instructions of the discrete reactive power equipment according to the upper-layer training model.

And the operation module 13 is configured to deploy the lower training model and the upper training model to an actually-operated power distribution network to obtain an overall operation model.

And the optimization module 14 is configured to construct an objective function according to the network loss of the overall operation model and the switching action instruction of the discrete reactive power equipment, and solve the objective function by using an A2C algorithm to obtain a reactive power optimization result of the power distribution network.

For specific limitations of the data-driven-based reactive power optimization device for the power distribution network, reference may be made to the above limitations of the data-driven-based reactive power optimization method for the power distribution network, and details are not repeated here. All or part of each module in the data-driven-based power distribution network reactive power optimization device can be realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.
