Model predictive control device, model predictive control program, model predictive control system, and model predictive control method

Document No.: 214431 Published: 2021-11-05

Reading note: This technology, "Model predictive control device, model predictive control program, model predictive control system, and model predictive control method", was created by 濑川秀一, 摄津敦, 外山正胜, and 小中裕喜 on 2019-03-29. Main content: An operation path generation unit (210) generates an operation amount time series for an actuator (111) on the basis of a measured state quantity output from a state sensor (101). A prediction model unit (220) generates a state quantity prediction time series by calculating a prediction model using the measured state quantity and the operation amount time series as inputs. A neural network unit (230) corrects the state quantity prediction time series by calculating a neural network using a measured environment quantity output from an environment sensor (102) and the state quantity prediction time series as inputs. A state quantity evaluation unit (240) generates an evaluation result for the corrected state quantity prediction time series. The operation path generation unit outputs the operation amount at the head of the operation amount time series to the actuator when the evaluation result satisfies an appropriate criterion.

1. A model predictive control apparatus, wherein,

the model prediction control device includes:

an operation amount time-series generation unit that generates an operation amount time series for an actuator so as to change a state of a control target, based on a measured state amount output from a state sensor that measures the state of the control target;

a prediction model unit that generates a state quantity prediction time series, which is a predicted state quantity time series of the control target, by calculating a prediction model using the measured state quantity and the operation quantity time series as inputs;

a neural network unit that calculates a neural network by using, as inputs, a measured environment quantity output from an environment sensor that measures an operating environment of the control target and the state quantity prediction time series, and corrects the state quantity prediction time series;

a state quantity evaluation unit that generates an evaluation result for the corrected state quantity time series by calculating an evaluation function with the corrected state quantity prediction time series as an input; and

an operation amount determination unit that outputs an operation amount at the head of the time series of operation amounts to the actuator when the evaluation result satisfies an appropriate criterion.

2. The model predictive control device of claim 1, wherein,

the model prediction control device includes:

a model calculation unit that generates a state quantity learning time series, which is a state quantity time series for learning, by calculating the prediction model using, as inputs, a past state quantity, which is a measured state quantity output from the state sensor, and an operation quantity past time series, which is a time series of operation quantities input to the actuator; and

a weight parameter learning unit that performs machine learning of a weight parameter for the neural network using the state quantity learning time series, a past environment quantity that is a measured environment quantity output from the environment sensor, and a state quantity past time series that is a time series of the measured state quantity output from the state sensor,

the neural network unit calculates a neural network in which a weight parameter obtained by the machine learning is set.

3. The model predictive control device of claim 1 or 2, wherein,

the control target is a vehicle, and

the model predictive control apparatus is used for automatic driving control of the vehicle.

4. A model predictive control program, wherein,

the model predictive control program is for causing a computer to execute:

an operation amount time-series generation process of generating an operation amount time series for an actuator so as to change a state of a control target, based on a measured state amount output from a state sensor that measures the state of the control target;

a prediction model process of generating a state quantity prediction time series, which is a predicted state quantity time series of the control target, by calculating a prediction model using the measured state quantity and the operation quantity time series as inputs;

a neural network process for correcting the state quantity prediction time series by calculating a neural network using, as inputs, a measurement environment quantity output from an environment sensor that measures an operating environment of the control target and the state quantity prediction time series;

a state quantity evaluation process of calculating an evaluation function with the corrected state quantity prediction time series as an input, thereby generating an evaluation result for the corrected state quantity time series; and

an operation amount determination process of outputting an operation amount at the head of the operation amount time series to the actuator when the evaluation result satisfies an appropriate criterion.

5. A model predictive control system, wherein,

the model prediction control system includes:

a state sensor for measuring the state of the control object;

an environment sensor that measures an operating environment of the control target;

an actuator for changing a state of the control object;

an operation amount time-series generation unit that generates an operation amount time series for the actuator based on the measured state amount output from the state sensor;

a neural network unit that corrects the state quantity prediction time series by calculating a neural network using the measured environment quantity output from the environment sensor and the state quantity prediction time series as inputs;

6. The model predictive control system of claim 5, wherein,

the model prediction control system includes:

the neural network unit calculates a neural network in which a weight parameter obtained by the machine learning is set.

7. The model predictive control system of claim 5 or 6, wherein,

the control target is a vehicle, and

the model predictive control system is used for automatic driving control of the vehicle.

8. A model predictive control method, wherein,

the state sensor measures the state of the control object,

an environment sensor measures an operation environment of the control object,

an operation amount time-series generation unit generates an operation amount time series for an actuator for changing a state of the control target based on the measured state amount output from the state sensor,

a prediction model unit generates a state quantity prediction time series which is a predicted state quantity time series of the control target by calculating a prediction model using the measured state quantity and the operation quantity time series as inputs,

the neural network unit corrects the state quantity prediction time series by calculating a neural network using the measured environment quantity output from the environment sensor and the state quantity prediction time series as inputs,

the state quantity evaluation unit generates an evaluation result for the corrected state quantity time series by calculating an evaluation function with the corrected state quantity prediction time series as an input,

the operation amount determination unit outputs the operation amount at the head of the time series of operation amounts to the actuator when the evaluation result satisfies an appropriate criterion.

9. A model predictive control device that supplies an operation amount to an actuator for changing a state of a control target,

the model prediction control device includes:

a neural network unit that calculates a model parameter set in a prediction model for predicting a change in the state of the control target by calculating a neural network using, as inputs, a measured state quantity output from a state sensor that measures the state of the control target and a measured environment quantity output from an environment sensor that measures an operating environment of the control target;

an evaluation formula generation unit that generates an evaluation formula in a quadratic programming method as a formula for evaluating a time series of operation amounts for the actuator, based on a prediction model in which the calculated model parameters are set; and

a solver unit that calculates an operation amount to be supplied to the actuator by solving the evaluation formula in a quadratic programming method.

10. The model predictive control device of claim 9, wherein,

the control target is a vehicle, and

the model predictive control apparatus is used for automatic driving control of the vehicle.

11. A model predictive control program for providing an operation amount to an actuator for changing a state of a controlled object,

the model predictive control program causes a computer to execute:

a neural network process that calculates a model parameter set in a prediction model for predicting a change in the state of the control target by calculating a neural network using, as inputs, a measured state quantity output from a state sensor that measures the state of the control target and a measured environment quantity output from an environment sensor that measures an operating environment of the control target;

an evaluation formula generation process of generating an evaluation formula in a quadratic programming method as a formula for evaluating a time series of operation amounts for the actuator, based on a prediction model in which the calculated model parameters are set; and

a solver process of calculating an operation amount to be supplied to the actuator by solving the evaluation formula in a quadratic programming method.

12. A model predictive control system, wherein,

the model prediction control system includes:

a state sensor for measuring the state of the control object;

an environment sensor that measures an operating environment of the control target;

an actuator for changing a state of the control object;

a solver unit that calculates an operation amount to be supplied to the actuator by solving the evaluation formula in a quadratic programming method.

13. The model predictive control system of claim 12, wherein,

the control target is a vehicle, and

the model predictive control system is used for automatic driving control of the vehicle.

14. A model predictive control method for providing an operation amount to an actuator for changing a state of a control target,

the state sensor measures the state of the control object,

an environment sensor measures an operation environment of the control object,

an evaluation formula generation unit generates an evaluation formula in a quadratic programming method as a formula for evaluating a time series of operation amounts for the actuator, based on a prediction model in which the calculated model parameters are set,

the solver unit calculates an operation amount to be supplied to the actuator by solving the evaluation formula in a quadratic programming method.

Technical Field

The invention relates to model predictive control.

Background

Model predictive control is known in which a control target is controlled using a predictive model.

For example, the model predictive control can be used for automatic driving control of the vehicle.

Patent document 1 discloses a model predictive control system that automatically changes a model in accordance with an external environment.

In this system, a model corresponding to weather at the time of prediction is selected from models prepared for different weathers, the selected model is corrected based on the outside air temperature, and model prediction control is performed using the corrected model.

Documents of the prior art

Patent document

Patent document 1: Japanese Patent Laid-Open Publication No. 2000-99107

Disclosure of Invention

Problems to be solved by the invention

The system disclosed in patent document 1 cannot cope with an external environment other than the assumed environment.

For example, even if a clear weather model, a cloudy weather model, a rainy weather model, and a snowy weather model are prepared, it is not possible to select an appropriate model for special weather such as typhoon. Even if a model suitable for the weather at the time of prediction can be selected, if the outside air temperature at the time of prediction is a temperature outside the assumed range, the model cannot be appropriately corrected.

As a result, the accuracy of the model predictive control may be degraded.

The purpose of the present invention is to maintain the accuracy of model predictive control even in an unexpected environment.

Means for solving the problems

The model prediction control device of the present invention includes: an operation amount time-series generation unit that generates an operation amount time series for an actuator so as to change a state of a control target, based on a measured state amount output from a state sensor that measures the state of the control target; a prediction model unit that generates a state quantity prediction time series, which is a predicted state quantity time series of the control target, by calculating a prediction model using the measured state quantity and the operation quantity time series as inputs; a neural network unit that calculates a neural network by using, as inputs, a measured environment quantity output from an environment sensor that measures an operating environment of the control target and the state quantity prediction time series, and corrects the state quantity prediction time series; a state quantity evaluation unit that generates an evaluation result for the corrected state quantity time series by calculating an evaluation function with the corrected state quantity prediction time series as an input; and an operation amount determination unit that outputs an operation amount at the head of the operation amount time series to the actuator when the evaluation result satisfies an appropriate criterion.

Advantageous effects of the invention

According to the present invention, the state quantity prediction time series is corrected by calculating the neural network using the state quantity prediction time series obtained by the prediction model and the measured environment quantity output from the environment sensor as inputs. Therefore, the state quantity prediction time series can be corrected even in an environment other than the assumed environment. Therefore, the accuracy of the model predictive control can be maintained even in an unexpected environment.

Drawings

Fig. 1 is a configuration diagram of a model predictive control system 100 according to embodiment 1.

Fig. 2 is a configuration diagram of a model predictive control device 200 according to embodiment 1.

Fig. 3 is an explanatory diagram of model prediction control in embodiment 1.

Fig. 4 is an explanatory diagram of model prediction control in embodiment 1.

Fig. 5 is a flowchart of a model prediction control method in embodiment 1.

Fig. 6 is a diagram showing the neural network 231 in embodiment 1.

Fig. 7 is a block diagram of a model predictive control system 190 that does not use the neural network 231.

Fig. 8 is a block diagram of a model predictive control system 190 used for automatic driving control of a vehicle.

Fig. 9 is a diagram illustrating automatic driving control of the vehicle by the model predictive control system 190.

Fig. 10 is an explanatory diagram of automatic driving control of the vehicle.

Fig. 11 is a configuration diagram of the model predictive control system 100 according to embodiment 2.

Fig. 12 is a configuration diagram of a model predictive control device 200 according to embodiment 2.

Fig. 13 is a configuration diagram of the history unit 280 in embodiment 2.

Fig. 14 is a schematic diagram of the learning method in embodiment 2.

Fig. 15 is a flowchart of the learning method in embodiment 2.

Fig. 16 is a configuration diagram of a model predictive control system 300 according to embodiment 3.

Fig. 17 is a configuration diagram of a model predictive control device 400 according to embodiment 3.

Fig. 18 is a flowchart of a model prediction control method according to embodiment 3.

Fig. 19 is a diagram showing the neural network 411 in embodiment 3.

Fig. 20 is a hardware configuration diagram of the model predictive control device 200 according to the embodiment.

Fig. 21 is a hardware configuration diagram of the model predictive control device 400 according to the embodiment.

Detailed Description

In the embodiments and the drawings, the same elements or corresponding elements are denoted by the same reference numerals. The description of the elements denoted by the same reference numerals as those of the already described elements is appropriately omitted or simplified. The arrows in the figure primarily indicate the flow of data or processing.

Embodiment 1.

A model predictive control system 100 using a neural network will be described with reference to figs. 1 to 10.

The model predictive control system 100 is a system for controlling a control target by Model Predictive Control (MPC). Model predictive control is described later.

For example, the model predictive control system 100 can be used to implement automated driving of a vehicle.

Description of the structure

The configuration of the model predictive control system 100 will be described with reference to fig. 1.

The model predictive control system 100 includes a state sensor group, an environment sensor group, an actuator group, and a model predictive control device 200.

The state sensor group is 1 or more state sensors 101.

The state sensor 101 is a sensor for measuring the state of the control target.

For example, the control object is a vehicle, and the state sensor 101 is a speed sensor or a position sensor. The speed sensor measures the speed of the vehicle. The position sensor measures a position of the vehicle.

The environmental sensor group is 1 or more environmental sensors 102.

The environment sensor 102 is a sensor for measuring an operating environment of a control target.

For example, the control object is a vehicle, and the environment sensor 102 is a vehicle weight sensor or a posture sensor. The vehicle weight sensor measures the weight of the vehicle (including the weight of passengers and cargo). The attitude sensor measures the attitude (inclination) of the vehicle. The posture of the vehicle corresponds to the inclination of the road surface.

The actuator group is 1 or more actuators 111.

The actuator 111 changes the state of the control target.

For example, the control target is a vehicle, and the actuator 111 is a steering wheel, a motor, or a brake.

The model predictive control apparatus 200 is an apparatus for controlling a control target by Model Predictive Control (MPC). Model predictive control is described later.

For example, the model predictive control device 200 performs automatic driving control for the vehicle.

The model predictive control device 200 is characterized by being provided with a neural network unit 230.

The structure of the model prediction control device 200 will be described based on fig. 2.

The model predictive control apparatus 200 is a computer including hardware such as a processor 201, a memory 202, an auxiliary storage device 203, an input/output interface 204, and a communication device 205. These pieces of hardware are connected to each other via signal lines.

The processor 201 is an IC that performs arithmetic processing, and controls other hardware. The processor 201 is, for example, a CPU, DSP, or GPU.

IC is an abbreviation for Integrated Circuit.

CPU is an abbreviation for Central Processing Unit.

DSP is an abbreviation for Digital Signal Processor.

GPU is an abbreviation for Graphics Processing Unit.

The memory 202 is a volatile storage device. The memory 202 is also referred to as a main storage device or main memory. For example, the memory 202 is a RAM. The data stored in the memory 202 is stored in the auxiliary storage device 203 as needed.

RAM is an abbreviation for Random Access Memory.

The auxiliary storage device 203 is a nonvolatile storage device. The auxiliary storage device 203 is, for example, a ROM, an HDD, or a flash memory. Data stored in the auxiliary storage device 203 is loaded into the memory 202 as needed.

ROM is an abbreviation for Read Only Memory.

HDD is an abbreviation for Hard Disk Drive.

The input/output interface 204 is a port to which an input device and an output device are connected. For example, the state sensor group, the environment sensor group, and the actuator group are connected to the input/output interface 204.

USB is an abbreviation for Universal Serial Bus.

The communication device 205 is a receiver and a transmitter. The communication device 205 is, for example, a communication chip or NIC.

NIC is an abbreviation for Network Interface Card.

The model predictive control device 200 includes elements such as an operation path generation unit 210, a prediction model unit 220, a neural network unit 230, and a state quantity evaluation unit 240. These elements are implemented in software.

The operation path generation unit 210 includes an operation amount time series generation unit 211 and an operation amount determination unit 212.

The auxiliary storage device 203 stores a model predictive control program for causing a computer to function as the operation path generation unit 210, the prediction model unit 220, the neural network unit 230, and the state quantity evaluation unit 240. The model predictive control program is loaded into the memory 202 and executed by the processor 201.

The OS is also stored in the auxiliary storage device 203. At least a portion of the OS is loaded into memory 202 for execution by processor 201.

The processor 201 executes the model predictive control program while executing the OS.

OS is an abbreviation for Operating System.

The input/output data of the model predictive control program is stored in the storage unit 290.

The memory 202 functions as the storage unit 290. However, a storage device such as the auxiliary storage device 203, a register in the processor 201, and a cache memory in the processor 201 may function as the storage unit 290 instead of the memory 202 or together with the memory 202.

The model predictive control apparatus 200 may include a plurality of processors instead of the processor 201. The plurality of processors share the role of the processor 201.

The model predictive control program may be recorded (stored) in a non-volatile recording medium such as an optical disc or a flash memory in a computer-readable manner.

Model Predictive Control (MPC) will be described based on fig. 3 and 4. Model predictive control is prior art.

First, the model prediction control will be described with reference to fig. 3.

Model predictive control is a control method that calculates an optimal control input by using predictions of the behavior of the control target.

In the model predictive control, a predictive model and an optimizer are used. The predictive model is a model for simulating a control object. The optimizer evaluates the behavior of the prediction model and calculates the optimal control input.

The set of the operation path generation unit 210 and the state quantity evaluation unit 240 corresponds to an optimizer.

Next, the model prediction control will be described with reference to fig. 4. The operation amount u corresponds to the control input u (t) of fig. 3.

In the model predictive control, a time series xi of the predicted state quantity is generated based on a time series ui of the candidates of the manipulated variable, and the quality of the predicted state quantity is determined by an evaluation function. This process is repeated until a highly evaluated predicted state quantity is obtained. Then, the operation amount u1 corresponding to the predicted state amount with the high evaluation is output.
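The candidate-and-evaluate cycle described above can be illustrated with a minimal sketch. The one-dimensional speed model, the time step `dt`, the target value, and the quadratic evaluation function are all assumptions chosen for illustration; the source does not specify them. For brevity the sketch picks the best of a fixed set of candidates rather than repeating until a criterion is met.

```python
def predict_states(x0, u_series, dt=0.1):
    """Roll a toy prediction model forward: the state changes by u * dt per step."""
    states = []
    x = x0
    for u in u_series:
        x = x + u * dt
        states.append(x)
    return states


def evaluate(states, target):
    """Assumed quadratic evaluation function: a smaller value is a better prediction."""
    return sum((x - target) ** 2 for x in states)


def best_first_operation(x0, candidates, target):
    """Return the head operation amount u1 of the best-scoring candidate series."""
    best = min(candidates, key=lambda u: evaluate(predict_states(x0, u), target))
    return best[0]


# Three hypothetical candidate operation amount time series.
candidates = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 0.0, 0.0]]
u1 = best_first_operation(x0=10.0, candidates=candidates, target=10.2)
```

Only the first element of the selected series is returned, which mirrors the way model predictive control outputs u1 and then replans at the next time step.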

Description of the operation

The operation of the model predictive control system 100 corresponds to a model predictive control method. The steps of the model predictive control method by the model predictive control device 200 correspond to the steps of the model predictive control program.

The model prediction control method is explained based on fig. 5.

For ease of explanation, the following description assumes that the state sensor group is one state sensor 101, the environment sensor group is one environment sensor 102, and the actuator group is one actuator 111.

The state sensor 101 regularly measures the state of the control target and outputs a measured state quantity. The measured state quantity is a state quantity obtained by measuring the state of the control target. The state quantity represents the state of the control object.

The environment sensor 102 periodically measures the operating environment of the control target and outputs a measured environment amount. The measured environment amount is an environment amount obtained by measuring an operating environment of the control target. The environment amount represents an operation environment of the control object.

Step S110 to step S160 are repeatedly executed.

In step S110, the operation amount time series generator 211 receives the measured state amount output from the state sensor 101.

The operation amount time-series generator 211 generates an operation amount time series based on the received measured state amount.

Then, the operation amount time series generator 211 outputs the measured state amount and the operation amount time series.

The operation amount time series is a time series ui (see fig. 4) of a plurality of operation amounts arranged in time series and corresponds to a candidate of the operation amount in the conventional model predictive control.

The method of generating the time series of the operation amount is the same as the method of generating the time series ui of the operation amount candidates in the conventional model predictive control.

In step S120, the prediction model unit 220 receives the measured state quantity and the operation quantity time series output from the operation quantity time series generation unit 211.

The prediction model unit 220 calculates a prediction model using the measured state quantity and the operation quantity time series as inputs. Thereby, a state quantity prediction time series is generated.

Then, the prediction model unit 220 outputs a state quantity prediction time series.

The state quantity prediction time series is a state quantity time series predicted by a prediction model.

The state quantity time series is a plurality of state quantities arranged in time series, and corresponds to a time series xi of predicted state quantities in the conventional model prediction control (see fig. 4).

The method of generating the state quantity prediction time series is the same as the method of generating the time series xi of the predicted state quantity in the conventional model prediction control.

In step S130, the neural network unit 230 receives the measured environment quantity output from the environment sensor 102 and the state quantity prediction time series output from the prediction model unit 220.

The neural network unit 230 calculates the neural network 231 using the measurement environment quantity and the state quantity prediction time series as inputs. Thereby correcting the state quantity prediction time series.

Then, the neural network unit 230 outputs the corrected state quantity prediction time series.

The neural network 231 is described later.

In step S140, the state quantity evaluation unit 240 receives the corrected state quantity prediction time series output from the neural network unit 230.

The state quantity evaluation unit 240 calculates an evaluation function with the corrected state quantity prediction time series as an input. Thereby, a state quantity evaluation result is generated.

Then, the state quantity evaluation unit 240 outputs the state quantity evaluation result.

The state quantity evaluation result is an evaluation result of the corrected state quantity prediction time series, and corresponds to an evaluation result of the time series xi of the predicted state quantity in the conventional model prediction control (see fig. 4).

The method of generating the state quantity evaluation result is the same as the method of generating the evaluation result for the time series xi of the predicted state quantity in the conventional model prediction control.

In step S150, the operation amount determination unit 212 receives the state amount evaluation result output from the state amount evaluation unit 240.

Then, the operation amount determination unit 212 determines whether or not the state quantity evaluation result satisfies an appropriate criterion. The appropriate criterion is a predetermined criterion. The determination method is the same as that in the conventional model predictive control.

In the case where the state quantity evaluation result satisfies the appropriate criterion, the operation quantity time-series generated in step S110 is the optimum operation quantity time-series, that is, the optimum solution.

In the case where the operation amount time-series generated in step S110 is the optimal solution, the process proceeds to step S160.

In a case where the operation amount time series generated in step S110 is not the optimal solution, the process returns to step S110. Then, another operation amount time series is generated in step S110.

In step S160, the operation amount determination unit 212 outputs, to the actuator 111, the operation amount at the head of the operation amount time series (the optimal solution) generated in step S110. The operation amount at the head is referred to as the "1st operation amount".

The actuator 111 receives the 1 st operation amount output from the operation amount determination unit 212. Then, the actuator 111 operates according to the received 1 st operation amount. As a result, the state of the control target changes.
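Steps S110 to S160 can be summarized in a short sketch. Everything concrete here is an assumption for illustration: candidates are drawn at random, the appropriate criterion is taken to be "evaluation value at or below a threshold", and the `correct` function is a simple stand-in for the neural network unit 230, not a neural network. The source does not prescribe any of these choices.

```python
import random


def mpc_step(measured_state, measured_env, predict, correct, evaluate,
             threshold, horizon=5, max_tries=1000, seed=0):
    """One pass through steps S110 to S160; returns the 1st operation amount."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        # S110: generate a candidate operation amount time series.
        u_series = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        # S120: prediction model -> state quantity prediction time series.
        x_pred = predict(measured_state, u_series)
        # S130: correct the prediction using the measured environment quantity.
        x_corr = correct(x_pred, measured_env)
        # S140: evaluation function applied to the corrected time series.
        score = evaluate(x_corr)
        # S150: assumed criterion, evaluation value at or below a threshold.
        if score <= threshold:
            # S160: output only the head of the operation amount time series.
            return u_series[0]
    return None  # no candidate satisfied the criterion within max_tries


def predict(x0, u_series, dt=0.1):
    """Toy prediction model: the state changes by u * dt per step."""
    out, x = [], x0
    for u in u_series:
        x += u * dt
        out.append(x)
    return out


def correct(x_pred, env):
    """Stand-in for neural network 231: a fixed offset scaled by the environment."""
    return [x + 0.01 * env for x in x_pred]


u1 = mpc_step(0.0, 1.0, predict, correct,
              evaluate=lambda xs: sum(x * x for x in xs), threshold=1.0)
```

Passing the model, the correction, and the evaluation function as arguments keeps the loop independent of any particular control target.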

The neural network 231 is explained based on fig. 6.

The neural network 231 is a neural network for the model predictive control system 100.

The structure of the neural network is explained.

The neural network has an input layer, a hidden layer, and an output layer.

Each layer has more than 1 node. The circles represent nodes.

The nodes between the layers are connected by edges. The dashed lines indicate edges.

A weight is set for each edge.

The value of the node of the next layer is determined based on the value of the node of the previous layer and the weight set at the edge.

In the neural network 231, the state quantity prediction time series (x1, …, xk) and the measured environment quantity (y0) are the inputs to the input layer. The corrected state quantity prediction time series (x'1, …, x'k) is the output from the output layer.
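The input and output arrangement described above can be sketched as a forward pass through a single hidden layer. The hidden layer size, the tanh activation, and the weight values are assumptions for illustration only; the source states only that node values are determined from the previous layer's node values and the edge weights.

```python
import math


def forward(x_pred, y0, w_hidden, w_out):
    """Feed (x1..xk, y0) through one hidden layer and return (x'1..x'k)."""
    inputs = list(x_pred) + [y0]
    # Each hidden node: tanh of the weighted sum of the input layer nodes.
    hidden = [math.tanh(sum(w * v for w, v in zip(row, inputs)))
              for row in w_hidden]
    # Each output node: weighted sum of the hidden nodes, one per corrected state.
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


k = 3
w_hidden = [[0.1] * (k + 1) for _ in range(4)]  # 4 hidden nodes, k + 1 inputs
w_out = [[0.5] * 4 for _ in range(k)]           # k outputs, 4 hidden nodes
x_corrected = forward([1.0, 1.1, 1.2], 0.3, w_hidden, w_out)
```

The input layer has k + 1 nodes (k predicted state quantities plus one environment quantity) and the output layer has k nodes, so the corrected series has the same length as the predicted one.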

Effects of embodiment 1

Problems of the model predictive control device 191 not using the neural network 231 will be described based on fig. 7 to 10.

Fig. 7 shows the structure of the model predictive control system 190 without using the neural network 231.

The model predictive control system 190 does not have an environmental sensor group.

The model predictive control device 191 does not have a function corresponding to the neural network unit 230.

Therefore, the model predictive control device 191 cannot correct the state quantity prediction time series based on the measurement environment quantity.

However, the state sensor group and the actuator group are exposed to the external environment. Therefore, the state quantity measured by the state sensor group and the state quantity changed by the actuator group do not necessarily coincide with the state quantity prediction time series.

Fig. 8 shows a configuration of a model predictive control system 190 used for automatic driving control of a vehicle.

The model predictive control system 190 includes state sensors such as a vehicle speed sensor and a position sensor. The model predictive control system 190 includes actuators such as a steering wheel, a motor, and a brake.

The model predictive control device 191 determines a steering amount, a motor output, and a brake output based on the speed of the vehicle and the position of the vehicle.

Viewed broadly, the model predictive control system 190 can be regarded as a system that outputs an operation amount based on a state quantity.

Fig. 9 shows a case of automatic driving control of the vehicle based on the model predictive control system 190.

The model predictive control device 191 outputs the operation amount u_i in response to changes in the state quantity x_i (vehicle speed, vehicle position). Thereby, the travel path of the vehicle is controlled.

The automatic driving control of the vehicle will be described based on fig. 10.

In the vehicle, gravity based on the vehicle weight, stress from the road surface, propulsive force of the propeller, and the like are generated.

The acceleration amount Δ_v of the vehicle can be represented by formula (1).

"M" represents the vehicle weight. "θ" represents the inclination of the vehicle. "F" represents the operation amount of the propeller. "g" represents the acceleration of gravity.

"X_gain" denotes a gain correction amount. "X_sens" denotes a measured state quantity. "X_ofs" denotes an offset correction amount.

[ numerical formula 1]

θ = Θ_gain θ_sens + Θ_ofs

M = M_gain M_sens + M_ofs

However, even after each state sensor has been corrected in this way, further correction is needed to account for other errors. In addition, when the measured state quantity has a non-linear characteristic, that characteristic must be considered separately.

Further, the gain correction amount X_gain and the offset correction amount X_ofs depend on the operating environment.

Therefore, if the operation environment is not considered, the accuracy of the automatic driving control for the vehicle may deteriorate.
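The gain and offset correction X = X_gain X_sens + X_ofs, and its dependence on the operating environment, can be illustrated with a short sketch. The environment names and the numeric correction amounts below are hypothetical values chosen only for illustration.

```python
def corrected(sensed, gain, offset):
    """Apply X = X_gain * X_sens + X_ofs to one measured quantity."""
    return gain * sensed + offset

# Hypothetical correction amounts for two operating environments.
params = {
    "unloaded": {"gain": 1.00, "offset": 0.000},
    "loaded": {"gain": 1.05, "offset": -0.002},
}

theta_sens = 0.10  # measured inclination (illustrative value)
theta = {env: corrected(theta_sens, p["gain"], p["offset"])
         for env, p in params.items()}
```

The same measured value yields different corrected values in the two environments, which is why a fixed calibration can degrade control accuracy when the environment changes.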

On the other hand, the model predictive control apparatus 200 according to embodiment 1 realizes control in consideration of the operation environment by using the neural network 231. As a result, various controls can be performed with high accuracy.

For example, it is possible to realize automatic driving control with high accuracy without performing accurate calibration of a state sensor for a vehicle.

Embodiment 2.

A mode in which the weight parameters of the neural network 231 are learned will be described with reference to figs. 11 to 15, focusing mainly on the differences from embodiment 1.

Description of the configuration

The configuration of the model predictive control system 100 will be described with reference to fig. 11.

The configuration of the model predictive control system 100 is the same as that in embodiment 1 except for the configuration of the model predictive control device 200 (see fig. 1).

The configuration of the model prediction control device 200 will be described with reference to fig. 12.

The model prediction control device 200 further includes a learning unit 250. The learning unit 250 includes a model calculation unit 251 and a weight parameter learning unit 252. The learning section 250 is implemented by software.

The model prediction control program also causes the computer to function as the learning unit 250.

The model predictive control device 200 further includes a history unit 280. The history unit 280 is implemented by a storage device such as the memory 202.

The structure of the history unit 280 will be described with reference to fig. 13.

The history unit 280 stores data such as a state quantity history 281, an environment quantity history 282, an operation quantity history 283, and a state quantity learning history 284.

The state quantity history 281 is a history of measured state quantities, that is, a set of past measured state quantities. The past measurement state quantity is referred to as a "past state quantity". The time series of past state quantities is referred to as "state quantity past time series".

The environment amount history 282 is a history of measurement environment amounts, i.e., a set of past measurement environment amounts. The past measurement environment amount is referred to as a "past environment amount".

The operation amount history 283 is a history of operation amounts, that is, a set of past operation amounts. The past operation amount is referred to as a "past operation amount". The time series of past operation amounts is referred to as "operation amount past time series".

The state quantity learning history 284 is a history of the state quantity learning time series, that is, a set of past state quantity learning time series.

The state quantity learning time series is a state quantity time series generated to learn the weight parameters used in the neural network 231.

Description of the operation

An outline of the learning method of the learning unit 250 will be described with reference to fig. 14.

"prediction" refers to a process of generating a state quantity learning time series.

The state quantity learning time series corresponds to the state quantity prediction time series. That is, the state quantity learning time series is generated by calculating the same prediction model as that used to generate the state quantity prediction time series.

The operation amount past time series and the past state quantity are used in the "prediction".

The operation amount past time series is a time series of past operation amounts.

As the operation amount u0 in the operation amount past time series, the operation amount u0 at the 1st time (t = 1) is used.

As the operation amount u1 in the operation amount past time series, the operation amount u0 at the 2nd time (t = 2) is used.

As the operation amount u2 in the operation amount past time series, the operation amount u0 at the 3rd time (t = 3) is used.

As the past state quantity, the state quantity x0 at the 1st time (t = 1) is used.

"learning" refers to a process of learning weight parameters used in the neural network 231.

In "learning", the state quantity learning time series and the state quantity past time series are used.

As the state quantity x1 in the state quantity past time series, the state quantity x0 at the 2nd time (t = 2) is used.

As the state quantity x2 in the state quantity past time series, the state quantity x0 at the 3rd time (t = 3) is used.

A learning method of the learning unit 250 will be described with reference to fig. 15.

The learning method is repeatedly implemented. For example, the learning method is implemented periodically or each time the operation amount is output to the actuator 111.

In the learning method, the history unit 280 operates as follows.

Each time the measurement state quantity is output from the state sensor 101, the history unit 280 stores the output measurement state quantity.

Each time the measurement environment amount is output from the environment sensor 102, the history unit 280 stores the output measurement environment amount.

The history unit 280 stores the output operation amount each time the operation amount determination unit 212 outputs the operation amount to the actuator 111.

In step S210, the model calculation unit 251 acquires the past state quantity and the operation amount past time series from the history unit 280.

Then, the model calculation unit 251 calculates a prediction model using the past state quantity and the past operation quantity time series as inputs. The prediction model calculated by the model calculation unit 251 is the same as the prediction model calculated by the prediction model unit 220.

Thereby, a state quantity time series corresponding to the state quantity prediction time series is generated. The generated state quantity time series is referred to as "state quantity learning time series".

The model calculation unit 251 stores the state quantity learning time series in the history unit 280.
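Step S210 can be sketched as a sliding window over the stored histories. This is only a sketch under assumptions: the scalar prediction model x_{k+1} = a x_k + b u_k, its coefficients, and the names `predict` and `build_learning_data` are hypothetical. Each sample pairs a state quantity learning time series (computed from one past state quantity and the operation amounts that followed it) with the state quantities that were actually measured afterwards.

```python
def predict(x0, u_seq, a=1.0, b=0.5):
    """Hypothetical scalar prediction model x_{k+1} = a * x_k + b * u_k."""
    xs, x = [], x0
    for u in u_seq:
        x = a * x + b * u
        xs.append(x)
    return xs

def build_learning_data(x_hist, u_hist, horizon):
    """Return a list of (learning_series, past_series) pairs, one per start time."""
    samples = []
    for t in range(len(x_hist) - horizon):
        # State quantity learning time series: prediction from the past state at time t.
        learning = predict(x_hist[t], u_hist[t:t + horizon])
        # State quantity past time series: what the state sensor measured afterwards.
        past = x_hist[t + 1:t + 1 + horizon]
        samples.append((learning, past))
    return samples

samples = build_learning_data(
    x_hist=[0.0, 0.6, 1.1, 1.7],
    u_hist=[1.0, 1.0, 1.0, 1.0],
    horizon=2,
)
```

The gap between each learning series and its past series is the prediction error that the weight parameter learning in step S220 tries to remove.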

In step S220, the weight parameter learning unit 252 acquires the past environment amount, the state amount past time series, and the state amount learning time series from the history unit 280.

Then, the weight parameter learning unit 252 performs machine learning of the weight parameters for the neural network 231 using the state quantity learning time series, the past environment quantity, and the state quantity past time series.

Specifically, the weight parameter learning unit 252 adjusts the weight parameters of the neural network 231 such that the corrected state quantity learning time series, obtained by calculating the neural network 231 with the state quantity learning time series and the past environment quantity as inputs, matches the state quantity past time series.
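The fitting objective of step S220 can be sketched with gradient descent on a mean squared error. To keep the example short, the neural network 231 is replaced here by a single linear correction x' = x + w y + c; this replacement, the learning rate, the epoch count, and the name `train` are assumptions for illustration, not the learning procedure of the embodiment.

```python
def train(samples, lr=0.05, epochs=2000):
    """Fit (w, c) so that x + w * y + c matches the measured past state.

    samples: list of (learning_series, past_environment_quantity, past_series).
    """
    w, c = 0.0, 0.0
    n = sum(len(learning) for learning, _, _ in samples)
    for _ in range(epochs):
        grad_w = grad_c = 0.0
        for learning, y, past in samples:
            for x, target in zip(learning, past):
                err = (x + w * y + c) - target   # corrected value minus measured value
                grad_w += 2.0 * err * y / n      # d(MSE)/dw
                grad_c += 2.0 * err / n          # d(MSE)/dc
        w -= lr * grad_w
        c -= lr * grad_c
    return w, c

# Synthetic data generated with w = 0.2 and c = 0.1 (illustrative values).
data = [
    ([1.0, 2.0], 0.0, [1.1, 2.1]),
    ([1.0, 2.0], 1.0, [1.3, 2.3]),
    ([1.0, 2.0], 2.0, [1.5, 2.5]),
]
w, c = train(data)
```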

In step S230, the weight parameter learning unit 252 evaluates the weight parameters (learning results) obtained in the machine learning.

Evaluation of the learning result is performed as follows.

In step S210, the model calculation unit 251 generates a plurality of state quantity learning time series of the learning target period using a plurality of past state quantities of the learning target period and a plurality of operation amount past time series of the learning target period.

In step S220, the weight parameter learning unit 252 performs machine learning of the weight parameters for the neural network 231 using the plurality of state quantity learning time series of the first period, the plurality of past environment quantities of the first period, and the plurality of state quantity past time series of the first period. The first period is a part of the learning target period. For example, the first period is the first half of the learning target period.

In step S230, the weight parameter learning unit 252 temporarily sets the weight parameters obtained in the machine learning in the neural network 231. Next, the weight parameter learning unit 252 calculates the neural network 231 using the plurality of state quantity learning time series of the second period and the plurality of past environment quantities of the second period as inputs. Thereby, a plurality of state quantity correction time series of the second period are obtained. The second period is a part of the learning target period. For example, the second period is the latter half of the learning target period. A state quantity correction time series is a state quantity learning time series after correction. Then, the weight parameter learning unit 252 evaluates the learning result based on the amount of error between the plurality of state quantity correction time series of the second period and the plurality of state quantity past time series of the second period. The learning result is evaluated using a standard index used in deep learning.

If the evaluation shows that the learning result is appropriate, the process proceeds to step S240.

If the evaluation shows that the learning result is not appropriate, the weight parameters obtained in step S220 are discarded, and the learning method ends. In this case, the weight parameters of the neural network 231 are not updated.

In step S240, the weight parameter learning unit 252 sets the weight parameter obtained in step S220 in the neural network 231. Thereby, the weight parameters of the neural network 231 are updated.

After step S240, the neural network unit 230 corrects the state quantity prediction time series by calculating the updated neural network 231.
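The evaluation in step S230 and the conditional update in step S240 can be sketched as a hold-out check. The mean squared error index, the threshold, and the names `holdout_error`, `select_weights`, and `apply_network` are assumptions; the source states only that a standard deep learning index is used.

```python
def holdout_error(weights, samples, apply_network):
    """Mean squared error between corrected series and measured past series."""
    total, count = 0.0, 0
    for learning, env, past in samples:
        corrected = apply_network(weights, learning, env)
        total += sum((c - p) ** 2 for c, p in zip(corrected, past))
        count += len(past)
    return total / count

def select_weights(current, candidate, second_period, apply_network, threshold):
    """Adopt the candidate weights only if they pass the second-period check."""
    err = holdout_error(candidate, second_period, apply_network)
    if err <= threshold:
        return candidate, err   # step S240: update the network
    return current, err         # discard the candidate; keep the current weights

# Hypothetical one-weight "network": x' = x + w * env.
def apply_network(w, xs, env):
    return [x + w * env for x in xs]

second_period = [([1.0, 2.0], 1.0, [1.5, 2.5])]
accepted = select_weights(0.0, 0.5, second_period, apply_network, threshold=0.01)
rejected = select_weights(0.0, 2.0, second_period, apply_network, threshold=0.01)
```

Evaluating on a period that was not used for learning guards against weights that fit the first period well but do not generalize.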

Effects of embodiment 2

The weight parameters of the neural network 231 can be learned. Therefore, the accuracy of the correction based on the neural network 231 improves. As a result, the accuracy of the model predictive control is improved.

Embodiment 3.

A model predictive control system 300 that calculates an operation amount using a quadratic programming method will be described with reference to fig. 16 to 19.

The model predictive control system 300 is a system for controlling a control target by Model Predictive Control (MPC). The model predictive control is as described in embodiment 1.

For example, the model predictive control system 300 can be used to realize automatic driving of a vehicle.

Description of the configuration

The configuration of the model predictive control system 300 will be described with reference to fig. 16.

The model predictive control system 300 includes a state sensor group, an environment sensor group, an actuator group, and a model predictive control device 400.

The state sensor group is 1 or more state sensors 301.

The state sensor 301 is a sensor for measuring the state of the control target.

For example, the control object is a vehicle, and the state sensor 301 is a speed sensor or a position sensor. The speed sensor measures the speed of the vehicle. The position sensor measures a position of the vehicle.

The environmental sensor group is 1 or more environmental sensors 302.

The environment sensor 302 is a sensor for measuring an operating environment of a control target.

For example, the control object is a vehicle, and the environment sensor 302 is a vehicle weight sensor or an attitude sensor. The vehicle weight sensor measures the weight of the vehicle (including the weight of passengers and cargo). The attitude sensor measures the attitude (inclination) of the vehicle. The attitude of the vehicle corresponds to the inclination of the road surface.

The actuator group is 1 or more actuators 311.

The actuator 311 changes the state of the control target.

For example, the control target is a vehicle, and the actuator 311 is a steering wheel, a motor, or a brake.

The model predictive control apparatus 400 is an apparatus for controlling a control target by Model Predictive Control (MPC).

For example, the model predictive control device 400 performs automatic driving control for the vehicle.

The model predictive control device 400 is characterized by being provided with a neural network unit 410.

The configuration of the model prediction control device 400 will be described with reference to fig. 17.

The model predictive control apparatus 400 is a computer including hardware such as a processor 401, a memory 402, an auxiliary storage device 403, an input/output interface 404, and a communication device 405. These pieces of hardware are connected to each other via signal lines.

The processor 401 is an IC that performs arithmetic processing, and controls other hardware. For example, the processor 401 is a CPU, DSP, or GPU.

The memory 402 is a volatile storage device. The memory 402 is also referred to as a main storage device or main memory. For example, the memory 402 is a RAM. The data stored in the memory 402 is stored in the auxiliary storage device 403 as needed.

The auxiliary storage device 403 is a nonvolatile storage device. The auxiliary storage device 403 is, for example, a ROM, HDD, or flash memory. Data stored in the auxiliary storage device 403 is loaded into the memory 402 as needed.

The input/output interface 404 is a port for connecting an input device and an output device. For example, the state sensor group, the environment sensor group, and the actuator group are connected to the input/output interface 404.

The communication device 405 is a receiver and a transmitter. The communication device 405 is, for example, a communication chip or NIC.

The model prediction control device 400 includes elements such as a neural network unit 410, an evaluation formula generation unit 420, and a solver unit 430. These elements are implemented in software.

The auxiliary storage device 403 stores a model prediction control program for causing a computer to function as the neural network unit 410, the evaluation formula generation unit 420, and the solver unit 430. The model predictive control program is loaded into the memory 402 and executed by the processor 401.

The auxiliary storage device 403 also stores an OS. At least a portion of the OS is loaded into the memory 402 and executed by the processor 401.

The processor 401 executes the model predictive control program while executing the OS.

The input/output data of the model predictive control program is stored in the storage unit 490.

The memory 402 functions as a storage unit 490. However, a storage device such as the auxiliary storage device 403, a register in the processor 401, or a cache memory in the processor 401 may function as the storage unit 490 instead of the memory 402 or together with the memory 402.

The model predictive control apparatus 400 may include a plurality of processors instead of the processor 401. The plurality of processors share the role of the processor 401.

The model predictive control program may be recorded (stored) in a non-volatile recording medium such as an optical disc or a flash memory in a computer-readable manner.

Description of the operation

The model prediction control method will be described with reference to fig. 18.

For ease of explanation, the description assumes that the state sensor group is 1 state sensor 301, the environment sensor group is 1 environment sensor 302, and the actuator group is 1 actuator 311.

The state sensor 301 periodically measures the state of the control target and outputs a measured state quantity. The measured state quantity is a state quantity obtained by measuring the state of the control target. The state quantity represents the state of the control object.

The environment sensor 302 periodically measures the operating environment of the control target and outputs a measured environment amount. The measured environment amount is an environment amount obtained by measuring an operating environment of the control target. The environment amount represents an operation environment of the control object.

The steps S310 to S330 are repeatedly performed.

In step S310, the neural network unit 410 receives the measured state quantity output from the state sensor 301.

The neural network unit 410 receives the measured environment quantity output from the environment sensor 302.

The neural network unit 410 calculates the neural network 411 using the measurement state quantity and the measurement environment quantity as inputs. Thus, model parameters set in a prediction model for predicting a change in the state of the controlled object are calculated.

Then, the neural network unit 410 outputs the calculated model parameters.

The prediction model can be represented by equation (2).

x_{k+1} = A x_k + B u_k … (2)

"x_n" is the nth state quantity of the control object.

"u_n" is the nth operation amount for the actuator 311.

"A" is a matrix that is one of the model parameters.

"B" is a vector as one of the model parameters.

The neural network 411 is explained based on fig. 19.

The neural network 411 is a neural network for the model predictive control system 300.

The structure of the neural network is as described in embodiment 1.

In the neural network 411, the measured state quantity x0 and the measured environment quantity y0 are the inputs to the input layer. The model parameters (A, B) are the outputs from the output layer.

(A_00, …, A_ij, …, A_nn) constitute the matrix A.

(B_0, …, B_i, …, B_n) constitute the vector B.
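How the output layer values map onto equation (2) can be sketched as follows. The row-major ordering of the outputs, the scalar operation amount u_k, and the names `unpack_model_parameters` and `rollout` are assumptions made for illustration.

```python
def unpack_model_parameters(outputs, dim):
    """Split a flat output list into matrix A (dim x dim) and vector B (dim).

    Assumes the outputs are ordered A_00, A_01, ..., then B_0, B_1, ...
    """
    if len(outputs) != dim * dim + dim:
        raise ValueError("unexpected number of network outputs")
    A = [outputs[i * dim:(i + 1) * dim] for i in range(dim)]
    B = outputs[dim * dim:]
    return A, B

def rollout(A, B, x0, u_seq):
    """Apply x_{k+1} = A x_k + B u_k repeatedly and return [x_1, ..., x_N]."""
    xs, x = [], list(x0)
    for u in u_seq:
        x = [sum(a * xj for a, xj in zip(row, x)) + b * u
             for row, b in zip(A, B)]
        xs.append(x)
    return xs

# Example: position/velocity model with the operation amount acting on velocity.
A, B = unpack_model_parameters([1.0, 1.0, 0.0, 1.0, 0.0, 1.0], dim=2)
states = rollout(A, B, x0=[0.0, 0.0], u_seq=[1.0, 1.0])
```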

Returning to fig. 18, the description is continued from step S320.

In step S320, the evaluation formula generation unit 420 generates an evaluation formula in the quadratic programming method based on the prediction model in which the calculated model parameters are set. The generated evaluation expression is an expression for evaluating the time series of the operation amount for the actuator 311.

Then, the evaluation formula generation unit 420 outputs an evaluation formula in the quadratic programming method.

The evaluation formula in the quadratic programming method will be explained.

The evaluation function for the prediction model can be represented by equation (3).

"E₁" is an evaluation value obtained by the evaluation function.

"x_Tk" is a target value of the state quantity.

"x_k" is a state quantity obtained by calculating the prediction model in which the matrix A and the vector B are set.

[ numerical formula 2]

The problem of optimizing the evaluation value E₁ of the evaluation function is equivalent to the problem of optimizing the evaluation value E₂ of the evaluation formula. The evaluation formula can be represented by formula (4).

(u₁, …, u_n) is an operation amount time series.

"Q" is a matrix.

"R" is a vector.

[ numerical formula 3]

The evaluation formula generation unit 420 calculates a matrix Q of an evaluation formula and a vector R of the evaluation formula based on a prediction model in which the matrix a and the vector B are set.

Then, the evaluation expression generation unit 420 sets the matrix Q and the vector R in the evaluation expression. The evaluation formula in which the matrix Q and the vector R are set is an evaluation formula in the quadratic programming method.

In step S330, the solver unit 430 calculates the operation amount to be supplied to the actuator 311 by solving the evaluation formula in the quadratic programming method.

Specifically, the solver unit 430 solves the evaluation formula in the quadratic programming method by executing an optimization solver (quadratic programming solver).

Then, the solver unit 430 supplies the calculated operation amount to the actuator 311.
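Steps S320 and S330 can be sketched end to end. Formulas (3) and (4) are not reproduced in this text, so the sketch assumes E₁ is the sum of |x_Tk − x_k|² over the horizon and E₂ = uᵀQu + Rᵀu, with a scalar operation amount per step. It also replaces the quadratic programming solver with the closed-form unconstrained minimizer (2Qu + R = 0), so constraints on the operation amount are not handled. The names `rollout`, `build_qp`, and `solve_unconstrained` are hypothetical.

```python
def rollout(A, B, x0, u_seq):
    """Stacked predictions [x_1; ...; x_N] of x_{k+1} = A x_k + B u_k, flattened."""
    out, x = [], list(x0)
    for u in u_seq:
        x = [sum(a * xj for a, xj in zip(row, x)) + b * u
             for row, b in zip(A, B)]
        out.extend(x)
    return out

def build_qp(A, B, x0, targets):
    """Return (Q, R) such that E2 = u^T Q u + R^T u differs from E1 by a constant."""
    n_steps = len(targets)
    free = rollout(A, B, x0, [0.0] * n_steps)        # response to x0 with zero input
    zero = [0.0] * len(x0)
    # Column j of G is the response to a unit operation amount at step j.
    cols = [rollout(A, B, zero, [1.0 if i == j else 0.0 for i in range(n_steps)])
            for j in range(n_steps)]
    flat_target = [v for t in targets for v in t]
    d = [t - f for t, f in zip(flat_target, free)]   # target minus free response
    Q = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    R = [-2.0 * sum(p * q for p, q in zip(ci, d)) for ci in cols]
    return Q, R

def solve_unconstrained(Q, R):
    """Solve 2 Q u = -R by Gaussian elimination (raises ZeroDivisionError if singular)."""
    n = len(R)
    M = [[2.0 * Q[i][j] for j in range(n)] + [-R[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))   # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    u = [0.0] * n
    for i in reversed(range(n)):
        u[i] = (M[i][n] - sum(M[i][j] * u[j] for j in range(i + 1, n))) / M[i][i]
    return u

A = [[1.0, 1.0], [0.0, 1.0]]
B = [0.0, 1.0]
Q, R = build_qp(A, B, x0=[0.0, 0.0], targets=[[0.0, 1.0], [1.0, 0.0]])
u = solve_unconstrained(Q, R)
```

In this example the targets are exactly reachable, so the minimizer reproduces the operation amounts that generate them. A practical implementation would pass Q and R, together with any limits on the operation amount, to a quadratic programming solver.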

Effects of embodiment 3

The model predictive control system 300 that calculates the operation amount using the quadratic programming method can also provide the same effects as those of embodiment 1. That is, the accuracy of the model predictive control can be maintained even in an environment other than the assumed environment.

Supplement to the embodiments

The hardware configuration of the model prediction control device 200 will be described with reference to fig. 20.

The model predictive control device 200 includes a processing circuit 209.

The processing circuit 209 is hardware that realizes the operation path generation unit 210, the prediction model unit 220, the neural network unit 230, the state quantity evaluation unit 240, and the learning unit 250.

The processing circuit 209 may be dedicated hardware or may be the processor 201 that executes a program stored in the memory 202.

In case the processing circuit 209 is dedicated hardware, the processing circuit 209 is for example a single circuit, a complex circuit, a programmed processor, a parallel programmed processor, an ASIC, an FPGA or a combination thereof.

ASIC is an abbreviation for Application Specific Integrated Circuit.

FPGA is an abbreviation for Field Programmable Gate Array.

The model predictive control apparatus 200 may include a plurality of processing circuits instead of the processing circuit 209. The plurality of processing circuits share the role of the processing circuit 209.

In the model predictive control apparatus 200, a part of the functions may be realized by dedicated hardware, and the remaining functions may be realized by software or firmware.

As such, the processing circuit 209 can be implemented in hardware, software, firmware, or a combination thereof.

The hardware configuration of the model predictive control apparatus 400 will be described with reference to fig. 21.

The model predictive control device 400 includes a processing circuit 409.

The processing circuit 409 is hardware that realizes the neural network unit 410, the evaluation formula generation unit 420, and the solver unit 430.

The processing circuit 409 may be dedicated hardware or may be a processor 401 that executes a program stored in the memory 402.

Where the processing circuitry 409 is dedicated hardware, the processing circuitry 409 may be, for example, a single circuit, a complex circuit, a programmed processor, a parallel programmed processor, an ASIC, an FPGA, or a combination thereof.

The model predictive control apparatus 400 may include a plurality of processing circuits instead of the processing circuit 409. The plurality of processing circuits share the role of the processing circuit 409.

In the model predictive control apparatus 400, a part of the functions may be realized by dedicated hardware, and the remaining functions may be realized by software or firmware.

As such, the processing circuit 409 can be implemented in hardware, software, firmware, or a combination thereof.

The embodiments are illustrative of preferred embodiments and are not intended to limit the technical scope of the present invention. Embodiments may be implemented in part or in combination with other implementations. The steps described with reference to the flowcharts and the like may be changed as appropriate.

The model prediction control device (200, 400) may be configured by a plurality of devices. For example, the server device installed in the cloud may include the learning unit 250, and the processing of the learning method may be executed in the cloud.

The term "unit" used for the elements of the model predictive control device (200, 400) may be read as "process" or "step".

Description of the reference symbols

100 model predictive control system, 101 state sensor, 102 environment sensor, 111 actuator, 190 model predictive control system, 191 model predictive control device, 200 model predictive control device, 201 processor, 202 memory, 203 auxiliary storage device, 204 input/output interface, 209 processing circuit, 210 operation path generation unit, 211 operation amount time series generation unit, 212 operation amount determination unit, 220 prediction model unit, 230 neural network unit, 231 neural network, 240 state quantity evaluation unit, 250 learning unit, 251 model calculation unit, 252 weight parameter learning unit, 280 history unit, 281 state quantity history, 282 environment amount history, 283 operation amount history, 284 state quantity learning history, 290 storage unit, 300 model predictive control system, 301 state sensor, 302 environment sensor, 311 actuator, 400 model predictive control device, 401 processor, 402 memory, 403 auxiliary storage device, 404 input/output interface, 409 processing circuit, 410 neural network unit, 411 neural network, 420 evaluation formula generation unit, 430 solver unit, 490 storage unit.
