Large-scale commercial vehicle anti-collision decision method considering road adhesion conditions

Document No.: 1930590 Publication date: 2021-12-07

Reading note: This technology, "Large-scale commercial vehicle anti-collision decision method considering road adhesion conditions", was designed by Li Xu (李旭), Hu Weiming (胡玮明), Hu Yue (胡悦), Hu Jinchao (胡锦超) and Xu Qimin (徐启敏) on 2021-10-21. Main content: The invention discloses an anti-collision decision method for large commercial vehicles that considers road adhesion conditions. First, a three-degree-of-freedom motion model of the commercial vehicle is established. Second, a road adhesion condition estimation model based on interacting multiple models is established to accurately identify the road adhesion coefficient. Finally, the anti-collision decision problem is described as a Markov decision process, and a reinforcement-learning-based anti-collision driving decision model is established to obtain an anti-collision decision strategy that is accurate, reliable and adaptive to road conditions. The proposed method jointly considers the influence of road adhesion conditions and of forward and rearward obstacles on vehicle collisions, provides the driver with precisely quantified anti-collision strategies such as throttle opening and steering wheel angle control quantities, and overcomes the lack of accuracy and road-condition adaptability in existing anti-collision driving strategies for large commercial vehicles.

1. A large-scale commercial vehicle anti-collision decision method considering road adhesion conditions, characterized in that the method comprises the following steps:

Step one: establishing a dynamic model of vehicle motion

A three-degree-of-freedom model is adopted, that is, longitudinal, lateral and yaw motion are considered, and vehicle dynamics modeling is carried out; point O is the center of mass of the vehicle, the left and right wheels of the front axle are merged into a single point C, and the left and right wheels of the rear axle are merged into a single point D; the dynamic model of the vehicle is described as:

where a dot "·" above a variable denotes its derivative with respect to time (for example, the dotted v_x is the time derivative of v_x); ω_s, v_x, v_y, a_x, a_y respectively denote the yaw rate, longitudinal velocity, lateral velocity, longitudinal acceleration and lateral acceleration of the vehicle; M, δ, I_z respectively denote the vehicle mass, the front wheel steering angle and the moment of inertia about the vertical axis of the vehicle body coordinate system; l_f, l_r respectively denote the distances from the center of mass of the vehicle to the front and rear axles; F_xf, F_xr, F_yf, F_yr respectively denote the longitudinal and lateral forces acting on the front and rear wheels;

wherein the lateral force of the tire is expressed as:

F_yf = C_αf·α_f, F_yr = C_αr·α_r (2)

where C_αf, C_αr respectively denote the cornering stiffness of the front and rear tires, and α_f, α_r respectively denote the slip angles of the front and rear tires, with α_f = δ - (v_y + l_f·ω_s)/v_x and α_r = (l_r·ω_s - v_y)/v_x;
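The slip angle definitions and the linear lateral force law of equation (2) can be illustrated with a minimal Python sketch. The function names and the numeric values are hypothetical; the yaw rate argument `omega_s` is assumed to be the quantity that multiplies l_f and l_r in the slip angle expressions, and the guard against a near-zero v_x is an added assumption, not part of the source.

```python
def slip_angles(v_x, v_y, omega_s, delta, l_f, l_r):
    """Front/rear tire slip angles as written below equation (2).

    alpha_f = delta - (v_y + l_f * omega_s) / v_x
    alpha_r = (l_r * omega_s - v_y) / v_x
    """
    if abs(v_x) < 1e-6:
        # Assumption: the expressions are only used while the vehicle is moving.
        raise ValueError("v_x must be non-zero")
    alpha_f = delta - (v_y + l_f * omega_s) / v_x
    alpha_r = (l_r * omega_s - v_y) / v_x
    return alpha_f, alpha_r


def lateral_forces(alpha_f, alpha_r, c_alpha_f, c_alpha_r):
    """Equation (2): lateral force = cornering stiffness * slip angle."""
    return c_alpha_f * alpha_f, c_alpha_r * alpha_r


# Illustrative (made-up) operating point: 20 m/s forward speed, small steering input.
alpha_f, alpha_r = slip_angles(v_x=20.0, v_y=0.5, omega_s=0.1,
                               delta=0.05, l_f=1.5, l_r=2.0)
F_yf, F_yr = lateral_forces(alpha_f, alpha_r, c_alpha_f=1.0e5, c_alpha_r=1.2e5)
```

The cornering stiffness values above are placeholders; for a real vehicle they would come from tire data.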

The longitudinal force of the tire is expressed as:

where F_xf, F_xr respectively denote the longitudinal forces acting on the front and rear tires; C_xf, C_xr respectively denote the longitudinal stiffness of the front and rear tires; μ is the road adhesion coefficient; F_zf, F_zr respectively denote the vertical loads on the front and rear tires; s_xf, s_xr respectively denote the longitudinal slip ratios of the front and rear tires, obtained from formulas (4) and (5):

where R_tyre is the tire radius; ω_f, ω_r respectively denote the rotational angular velocities of the front and rear wheels, which can be calculated from the linear velocity measured by the wheel speed sensors; v_xf, v_xr respectively denote the velocities along the tire direction at the front and rear axles, with v_xr = v_x and v_xf = v_x·cosδ + (v_y + l_f·ω_s)·sinδ;

Step two: establishing a road adhesion coefficient estimation model based on interacting multiple models

The UKF algorithm is used to recursively estimate the road adhesion coefficient and the yaw rate, lateral velocity and longitudinal velocity of the vehicle, specifically:

with the vehicle and tire models described in formula (1), formula (2), and formula (3), 10 different UKF filter models were established for 10 cases in which the road surface adhesion coefficients were 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0, respectively; therefore, there should be 10 filter state equations established; the 10 models have the same form and are only different in the value of the road surface adhesion coefficient;

first, for the motion process of the vehicle, the system state vector is taken as X_l = [v_x v_y ω_s]^T, where the superscript T on a matrix denotes its transpose and T alone denotes the discretization period; the system state equation is established from the dynamic model described by formula (1):

X_l＝f_l(X_l,U_l,W_l,γ_l) (6)

where the subscript l denotes the l-th model; f_l(·) is a 3-dimensional vector function; W_l is zero-mean system Gaussian white noise; γ_l is the zero-mean Gaussian white noise corresponding to the system external input; U_l is the system external input vector, U_l = [δ F_{l_xf} F_{l_xr}]^T, where δ is the front wheel steering angle, δ = ε_s/ρ_s, ε_s is the steering wheel angle, which can be obtained from the vehicle body CAN bus, and ρ_s is the gear ratio of the steering system; F_{l_xf} and F_{l_xr} respectively denote the longitudinal forces of the front and rear tires in the l-th model and are determined by the brush tire model; γ_l is the zero-mean Gaussian white noise vector corresponding to the system external input vector, whose components are ω_δ, the zero-mean Gaussian white noise corresponding to the external input δ, and the zero-mean Gaussian white noise terms corresponding to F_{l_xf} and F_{l_xr}; these noise terms are implicit in the system external inputs of the state equation;

secondly, selecting an inertial measurement unit as a measurement sensor of the vehicle motion, and taking the longitudinal forward speed and the yaw rate of the vehicle as the system observation vectors, the observation equation of the system can be expressed as:

Z(t)＝h(X(t),V(t)) (7)

where h is the observation function, t denotes time, and the system observation vector is Z = [v_{x_m} ω_{z_m}]^T, where v_{x_m}, ω_{z_m} respectively denote the measured longitudinal forward speed and yaw rate of the vehicle, obtained from the inertial measurement unit;

discretizing equations (6) and (7) gives the discretized system state equation and observation equation, respectively:

where k is the discrete time step; the system process noise is W_l = [w₁ w₂ w₃]^T, where w₁, w₂, w₃ are the three system Gaussian white noise components, and the Gaussian white noise covariance matrix corresponding to W_l(k-1) is formed from the variances of w₁, w₂, w₃; U_l(k-1) is the system external input vector of the l-th model at time k-1; V_l is the system observation noise, V_l = [v₁ v₂]^T, where v₁, v₂ are the two Gaussian white noise components, and the measurement Gaussian white noise covariance matrix corresponding to V_l(k) is formed from the variances of v₁, v₂, which are determined from the statistical characteristics of the sensor's position, velocity and yaw rate measurement noise; the system external input noise consists of the zero-mean Gaussian white noise components corresponding to δ, F_xf, F_xr, which are implicit in the three system external inputs of the system state function f_l; the system state function is:

where:

finally, based on the system state equation and observation equation described by formula (8), a filtering recursion based on interacting multiple models is established using interacting multiple model filtering theory, and parameter estimation is carried out through time updates and measurement updates:

(1) interactive estimation computation

The transition probability among the 10 UKF filter models is p_jl, where the subscripts j, l (j = 1, 2, ..., 10; l = 1, 2, ..., 10) indicate the probability of transition from model j to model l; the predicted model probability ρ_l(k, k-1) of the l-th model and the predicted mixing probability ρ_{j|l}(k-1) are respectively:

then the input of the l-th filter at time k after the interaction estimation is:
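The interaction step can be sketched in Python using the textbook interacting multiple model (IMM) mixing formulas. This is a sketch under the assumption that the patent's missing expressions follow the standard IMM form; the function name `imm_mix` and the array layout are hypothetical.

```python
import numpy as np


def imm_mix(p_trans, rho_prev, x_prev, P_prev):
    """Standard IMM interaction (mixing) step, assumed here for illustration.

    p_trans  : (n, n) array, p_trans[j, l] = probability of switching j -> l
    rho_prev : (n,)   model probabilities at time k-1
    x_prev   : (n, d) state estimates of the n filters at time k-1
    P_prev   : (n, d, d) error covariance matrices at time k-1
    """
    p_trans = np.asarray(p_trans, dtype=float)
    rho_prev = np.asarray(rho_prev, dtype=float)
    x_prev = np.asarray(x_prev, dtype=float)
    P_prev = np.asarray(P_prev, dtype=float)

    # Predicted model probability: rho_l(k, k-1) = sum_j p_jl * rho_j(k-1)
    rho_pred = p_trans.T @ rho_prev
    # Mixing probability: rho_{j|l}(k-1) = p_jl * rho_j(k-1) / rho_l(k, k-1)
    rho_mix = p_trans * rho_prev[:, None] / rho_pred[None, :]
    # Mixed initial state for filter l: sum_j rho_{j|l} * x_j
    x_mix = rho_mix.T @ x_prev

    n = x_prev.shape[0]
    P_mix = np.zeros_like(P_prev)
    for l in range(n):
        for j in range(n):
            dx = (x_prev[j] - x_mix[l])[:, None]
            P_mix[l] += rho_mix[j, l] * (P_prev[j] + dx @ dx.T)
    return rho_pred, rho_mix, x_mix, P_mix
```

The sketch assumes every predicted model probability is strictly positive; a practical implementation would guard the division.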

(2) model conditional filtering

For the state equation and the observation equation described by equations (6) and (7), Kalman filtering theory is applied and a UKF filtering recursion is performed in each filter; the filtering process of the l-th model is as follows:

1) initializing input variables and calculating parameters

where P₀ is the initial error covariance matrix; in the present invention, a variable carrying the hat symbol ^ denotes the filtered estimate of that variable, for example X̂₀ denotes the filtered estimate of the initial value X₀ of the input variable;

2) state estimation

where ξ_i(k-1) is a Sigma point, the square-root term is the i-th column of the square root of the weighted covariance matrix, and x_dim is the dimension of the state vector;

where λ is a distance parameter, λ = x_dim(α² - 1), α is the first scale factor, and the two weight coefficients are those of the mean and of the variance, respectively;

3) time updating

ξ_i(k,k-1) = f_l[ξ_i(k-1)], i = 0, 1, ..., 2x_dim (17)

where X̂_l(k-1) is the optimal estimate at time k-1, and P_l(k, k-1) is the one-step prediction error covariance matrix at time k;

4) observation update

χ_i(k,k-1)＝h_l[ξ_i(k,k-1)] (20)

where χ_i(k, k-1) is the value of the observation equation after transformation of the Sigma point set; the remaining quantities are the one-step predicted observation at time k recursed from time k-1, the predicted observation covariance, and P_XZ, the covariance between the state value and the measured value;

5) filter update

where K_l(k) is the filter gain matrix, X̂_l(k) is the state estimate, and P_l(k) is the estimation error covariance matrix;
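Sub-steps 1) to 5) can be sketched as one UKF recursion in Python. This is a generic unscented Kalman filter with additive noise, written under several assumptions that are not confirmed by the source: the weight formulas are the common scaled-unscented-transform ones, the second scale factor `beta` is hypothetical, and the process and measurement noise covariances `Q` and `R` are simply added to the predicted covariances. Only λ = x_dim(α² - 1) and the reuse of the propagated Sigma points in the observation update, as in equation (20), are taken from the text.

```python
import numpy as np


def ukf_step(x_hat, P, z, f, h, Q, R, alpha=1.0, beta=2.0):
    """One UKF recursion for a single model (sketch, additive noise assumed).

    x_hat, P : previous state estimate (n,) and error covariance (n, n)
    z        : current observation vector (m,)
    f, h     : state transition and observation functions on 1-D arrays
    """
    n = x_hat.size
    lam = n * (alpha ** 2 - 1.0)              # lambda = x_dim * (alpha^2 - 1)

    # 2) Sigma points from the columns of the matrix square root.
    S = np.linalg.cholesky((n + lam) * P)     # S @ S.T == (n + lam) * P
    xi = np.vstack([x_hat, x_hat + S.T, x_hat - S.T])

    # Mean and covariance weights (assumed standard form).
    w_m = np.full(2 * n + 1, 0.5 / (n + lam))
    w_m[0] = lam / (n + lam)
    w_c = w_m.copy()
    w_c[0] += 1.0 - alpha ** 2 + beta

    # 3) Time update: propagate each Sigma point through f.
    xi_pred = np.array([f(p) for p in xi])
    x_pred = w_m @ xi_pred
    dX = xi_pred - x_pred
    P_pred = dX.T @ (w_c[:, None] * dX) + Q

    # 4) Observation update: transform the propagated Sigma points through h.
    chi = np.array([h(p) for p in xi_pred])
    z_pred = w_m @ chi
    dZ = chi - z_pred
    P_zz = dZ.T @ (w_c[:, None] * dZ) + R
    P_xz = dX.T @ (w_c[:, None] * dZ)

    # 5) Filter update: gain, state estimate, error covariance.
    K = P_xz @ np.linalg.inv(P_zz)
    x_new = x_pred + K @ (z - z_pred)
    P_new = P_pred - K @ P_zz @ K.T
    return x_new, P_new, z_pred, P_zz
```

In the patent's setting, `f` would be the discretized state function f_l of one of the 10 models and `h` would select the measured longitudinal speed and yaw rate; the returned `z_pred` and `P_zz` are what a likelihood computation would need.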

(3) model probability update

After each model completes the update of the previous step, the new model probabilities are calculated using the maximum likelihood function Λ_l(k):

according to Bayes' theorem, the model probability ρ_l(k) of the l-th model at time k is:

(4) calculating road surface adhesion coefficient

After the posterior probability of each model has been calculated, first, the state estimates of all filters are summed with probability weights, the weights being the posterior model probabilities, to obtain the final state estimate, namely the filtered longitudinal velocity, lateral velocity and yaw rate of the vehicle; second, the road adhesion coefficient μ at the current time is obtained by probability-weighting the adhesion coefficients assigned to the models:

where μ_l is the road adhesion coefficient of the l-th model, l = 1, 2, ..., 10, with μ₁ = 0.1, μ₂ = 0.2, ..., μ₁₀ = 1.0;
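The model probability update and the probability-weighted fusion can be sketched as follows. The Gaussian form of the likelihood is an assumption (it is the usual choice in IMM filters, but the source formula is not available), and the function names are hypothetical; the 10-point adhesion grid and the probability weighting of states and of μ follow the text.

```python
import numpy as np

# Adhesion coefficients assigned to the 10 models: 0.1, 0.2, ..., 1.0.
MU_GRID = np.linspace(0.1, 1.0, 10)


def gaussian_likelihood(residual, S):
    """Assumed likelihood of a filter: zero-mean Gaussian density of the
    measurement residual with innovation covariance S."""
    residual = np.asarray(residual, dtype=float)
    m = residual.size
    norm = np.sqrt((2.0 * np.pi) ** m * np.linalg.det(S))
    return float(np.exp(-0.5 * residual @ np.linalg.solve(S, residual)) / norm)


def update_model_probs(rho_pred, likelihoods):
    """Bayes update: rho_l(k) is proportional to likelihood_l * rho_l(k, k-1)."""
    w = np.asarray(likelihoods, dtype=float) * np.asarray(rho_pred, dtype=float)
    return w / w.sum()


def fuse(rho, x_hats, mu_grid=MU_GRID):
    """Probability-weighted state estimate and road adhesion coefficient."""
    x_fused = rho @ np.asarray(x_hats, dtype=float)   # sum_l rho_l * x_l
    mu = float(rho @ mu_grid)                          # sum_l rho_l * mu_l
    return x_fused, mu
```

For example, if all 10 models are equally probable the fused adhesion coefficient is the grid mean, 0.55; as the filters disagree with the measurements, the probabilities concentrate on the models whose assumed μ best explains the observed speed and yaw rate.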

step three: establishing an anti-collision driving decision model based on reinforcement learning

An anti-collision driving decision model is established using the SARSA algorithm in order to study adaptive anti-collision driving strategies under different road surface conditions; this specifically comprises the following sub-steps:

substep 1: establishing a state space

The driving safety of a large commercial vehicle depends not only on its own motion state but also on the relative motion state of the obstacles ahead and behind; therefore, a state space is established from the sensor-measured motion state of the commercial vehicle and the relative motion state information, together with the road adhesion coefficient and the vehicle yaw rate output in step two:

S_t＝(v_sx,v_sy,v_sf,v_sr,a_sx,a_sy,d_sf,d_sr,ω_s,θ_str,δ_br,δ_thr,μ) (30)

where v_sf, v_sr respectively denote the relative speed of the large commercial vehicle with respect to the front vehicle and the rear vehicle, in meters per second; a_sx, a_sy respectively denote the transverse acceleration and the longitudinal acceleration of the large commercial vehicle, in meters per second squared; d_sf, d_sr respectively denote the relative distance between the vehicle and the front vehicle and the rear vehicle, in meters; ω_s is the yaw rate of the large commercial vehicle, in radians per second; θ_str is the steering wheel angle of the large commercial vehicle, in degrees; δ_br, δ_thr respectively denote the brake pedal opening and the throttle opening of the large commercial vehicle, in percent;

substep 2: establishing a behavioral space

Considering both the lateral and the longitudinal motion of the vehicle, the steering wheel angle and the normalized acceleration/braking quantities are taken as control quantities, which defines the driving strategy output by the decision model, namely the behavior space:

A_t＝[θ_{str_out},δ_{br_out},δ_{thr_out}] (31)

where A_t is the action decision at time t; θ_{str_out} is the normalized steering wheel angle control quantity, with range [-1, 1]; δ_{br_out}, δ_{thr_out} respectively denote the normalized brake pedal control quantity and the normalized throttle opening control quantity, both with range [0, 1];

Substep 3: establishing a reward function

To evaluate the quality of an action A_t quantitatively, the evaluation is made concrete and numerical by establishing a reward function; if executing A_t makes the running state of the large commercial vehicle safer, the return is a positive reward, otherwise it is a negative reward, so that the anti-collision driving decision model can judge the erroneous action it last executed;

when an anti-collision driving strategy is established, the occurrence of vehicle collision and rollover needs to be considered at the same time, and a reward function is designed as follows:

R_t＝r₁+r₂+r₃ (32)

where R_t is the reward function at time t, r₁ is the safety distance reward function, r₂ is the comfort reward function, and r₃ is the penalty function;

the vehicle safety distance reward function r₁, which takes the road adhesion coefficient into account, is:

where ω₁, ω₂ are the weight coefficients of the safety distance reward function;

designing a comfort reward function r₂＝-|a_sy(t+1)-a_sy(t)|；

Finally, in order to judge the error action of the vehicle, a penalty function r is designed₃：

where S_pen is the penalty; in the present invention S_pen = -500, meaning that when the vehicle collides or rolls over, the decision model receives a penalty of -500;
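The composition of the reward in equation (32) can be sketched in Python. The comfort term and the value S_pen = -500 are taken from the text; the safety distance term r₁ is passed in as a number because its formula is not reproduced here, and returning zero penalty when no collision or rollover occurs is an assumption. All function names are hypothetical.

```python
S_PEN = -500.0  # penalty applied on collision or rollover (value from the text)


def comfort_reward(a_sy_next, a_sy_now):
    """r2 = -|a_sy(t+1) - a_sy(t)|: penalizes abrupt acceleration changes."""
    return -abs(a_sy_next - a_sy_now)


def penalty(collided_or_rolled_over):
    """r3: S_PEN on collision/rollover; 0 otherwise (assumed)."""
    return S_PEN if collided_or_rolled_over else 0.0


def total_reward(r1, a_sy_next, a_sy_now, collided_or_rolled_over):
    """Equation (32): R_t = r1 + r2 + r3, with r1 supplied by the caller."""
    return (r1
            + comfort_reward(a_sy_next, a_sy_now)
            + penalty(collided_or_rolled_over))
```

With this structure a single collision outweighs many steps of ordinary reward, which is what discourages the learned policy from risky maneuvers.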

substep 4: establishing a behavior selection mechanism

Because driving decision learning involves real-time interaction with the actual traffic environment, a Pursuit function is used to establish the update mechanism for the anti-collision decision behavior;

where the first expression gives the probability, at time t+1, of selecting the action decision A_t = argmax Q(S_t, A_t), and π_{t+1}(A_{t+1}) is the probability of selecting any other action;
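A pursuit-style selection rule can be sketched as follows, assuming the textbook pursuit update in which the probability of the greedy action is moved toward 1 and all other probabilities toward 0. The step size `beta`, the function names and the use of a finite (discretized) action set are assumptions, since the patent's own expression and any discretization of the control quantities are not given here.

```python
import numpy as np


def pursuit_update(pi, q_values, beta=0.1):
    """Textbook pursuit update of action-selection probabilities.

    Greedy action a* = argmax Q: pi(a*) <- pi(a*) + beta * (1 - pi(a*))
    Every other action a:        pi(a)  <- pi(a)  + beta * (0 - pi(a))
    """
    pi = np.asarray(pi, dtype=float)
    greedy = int(np.argmax(q_values))
    target = np.zeros_like(pi)
    target[greedy] = 1.0
    return pi + beta * (target - pi)


def sample_action(pi, rng):
    """Draw an action index according to the selection probabilities."""
    return int(rng.choice(len(pi), p=pi))


rng = np.random.default_rng(0)
pi = pursuit_update([0.25, 0.25, 0.25, 0.25], q_values=[0.0, 1.0, 0.0, 0.0], beta=0.2)
action = sample_action(pi, rng)
```

The update keeps the probabilities normalized, so exploration decays gradually rather than being switched off at once.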

Substep 5: training anti-collision driving decision model based on SARSA

1) Initializing a Q value matrix and a behavior selection matrix;

2) Acquire the motion state of the commercial vehicle itself and its motion state relative to other traffic participants using on-board sensors, obtain the road adhesion coefficient from step two, and establish the initial state S₀ using formula (30);

3) Using the Q-value experience, select a driving decision A_t from the set of actions available in state S_t according to the behavior selection strategy;

4) Execute the decision A_t while the commercial vehicle is running, observe the reward R_t and the new state S'_t, and select the decision for the next moment, namely the new action A'_t;

5) Updating the Q value, wherein the updating method comprises the following steps:

Q_t(S_t,A_t)←Q_t(S_t,A_t)+ψ_s[R_t+θ_sQ_t(S′_t,A′_t)-Q_t(S_t,A_t)] (36)

where ψ_s denotes the learning rate and θ_s denotes the discount factor;

6) Assign the new state S'_t to S_t and the new action A'_t to A_t;

7) Repeating the step 3), the step 4), the step 5) and the step 6) until the training process is finished;
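The Q-value update of equation (36) can be sketched with a tabular Q function in Python. Storing Q in a dictionary keyed by hashable (state, action) pairs is an assumption made for illustration, as it presumes the continuous state and action quantities have been discretized; the default values of `psi_s` and `theta_s` are placeholders.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, psi_s=0.1, theta_s=0.9):
    """Equation (36): Q(S,A) <- Q(S,A) + psi_s * [R + theta_s * Q(S',A') - Q(S,A)].

    Q       : mapping from (state, action) to value, e.g. defaultdict(float)
    psi_s   : learning rate
    theta_s : discount factor
    """
    td_target = r + theta_s * Q[(s_next, a_next)]
    Q[(s, a)] += psi_s * (td_target - Q[(s, a)])
    return Q[(s, a)]


# Usage: one on-policy update with made-up state/action indices.
Q = defaultdict(float)
Q[(1, 0)] = 2.0
new_value = sarsa_update(Q, s=0, a=1, r=1.0, s_next=1, a_next=0,
                         psi_s=0.5, theta_s=0.9)
```

Because the target uses the action A' that was actually selected for the next step, the update is on-policy, which distinguishes SARSA from Q-learning.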

substep 6: outputting a driving strategy using an anti-collision driving decision model

All parameters of the state space are input into the trained anti-collision driving decision model, which outputs reasonable steering wheel angle, brake pedal and throttle opening control quantities in real time, providing the driver with accurate, quantitative and reliable anti-collision driving suggestions.

Technical Field

The invention relates to a vehicle anti-collision driving strategy, in particular to a large-scale commercial vehicle anti-collision decision method considering road adhesion conditions, and belongs to the technical field of automobile safety.

Background

Commercial vehicles carry the bulk of road transportation, so their safety directly affects the safety of road transportation as a whole. Unlike small passenger vehicles, most commercial transport vehicles are medium or large vehicles with a large total mass and a narrow wheel track, and they operate at high intensity, for long hours and in complex environments. Once a traffic accident occurs during transportation, it easily leads to serious consequences such as mass casualties, as well as property loss, environmental pollution and ecological damage.

Relevant research shows that collisions are the main form of accident for commercial vehicles and the main cause of mass-casualty accidents. If an anti-collision driving suggestion can be given to the driver promptly and accurately before a collision occurs, mass-casualty accidents caused by collisions can be effectively reduced or even avoided, greatly improving the safety level of road transportation. Research on accurate and reliable anti-collision driving decision strategies therefore plays an important role in ensuring the in-transit safety of commercial vehicles.

The road adhesion coefficient is an important parameter affecting the accuracy and reliability of anti-collision decisions. Anti-collision methods for small passenger vehicles that consider changes in road adhesion conditions have already been studied, but for large commercial vehicles it is especially important to prevent rollover in addition to avoiding collisions. The specific reasons are as follows: compared with passenger vehicles, large commercial vehicles have a high center of mass and a large load capacity, so their braking distance is long and their roll stability is poor. In particular, when a semi-trailer train carrying goods or a semi-trailer tanker carrying dangerous goods performs emergency braking or an emergency lane change to avoid a collision, the goods on the trailer or the liquid in the tank may slosh, further reducing the stability of the vehicle and making it extremely prone to instability and rollover.

In addition, under different road conditions such as wet, icy, snowy and dry surfaces, the braking distance, safety distance and braking time of a vehicle differ greatly, with differences reaching hundreds of meters or about 10 seconds. Meanwhile, on roads with a low adhesion coefficient, large commercial vehicles have poorer handling stability, and sideslip and rollover accidents caused by instability occur more frequently. Therefore, anti-collision driving strategies designed for passenger vehicles are difficult to apply to large commercial vehicles.

In research on anti-collision driving decisions for large commercial vehicles, existing work addresses only collision danger identification and anti-collision driving decisions on dry road surfaces and is difficult to apply to other road conditions, so existing anti-collision decision methods still fall short in accuracy and reliability. In general, an anti-collision decision method that considers the operating characteristics of large commercial vehicles is currently lacking, and in particular there is no anti-collision decision method for large commercial vehicles that is accurate, reliable and adaptive to different road conditions.

Disclosure of Invention

The purpose of the invention is as follows: to address the lack of accuracy and road-condition adaptability in anti-collision decision methods for large commercial vehicles, the invention discloses an anti-collision decision method for large commercial vehicles that considers road adhesion conditions. The method provides the driver with accurate, quantified driving suggestions such as throttle opening, brake pedal opening and steering wheel angle control quantities, adapts to different road adhesion conditions, and improves the accuracy and adaptability of anti-collision decisions for large commercial vehicles.

The technical scheme is as follows: the invention provides an anti-collision driving decision method considering road adhesion conditions for large-scale operation vehicles such as semi-trailer trains and semi-trailer tankers. Firstly, a three-degree-of-freedom commercial vehicle motion model is established. Secondly, a road adhesion condition estimation model based on the interactive multi-model is established, and the road adhesion coefficient is accurately identified. And finally, describing the anti-collision decision problem as a Markov decision process, and establishing an anti-collision driving decision model based on reinforcement learning to obtain an accurate, reliable and self-adaptive anti-collision decision strategy for road conditions. The method comprises the following steps:

Step one: establishing a dynamic model of vehicle motion

While a commercial vehicle is running and an anti-collision strategy is being output, parameters such as the road adhesion coefficient, the vehicle speed and the yaw rate must be acquired accurately. To meet the requirements of complete information and high measurement accuracy, a dynamic model that accurately describes the motion characteristics of the commercial vehicle needs to be established. For the field of application of the invention, the following reasonable assumptions are made for a four-wheeled vehicle with front-wheel steering:

(1) ignoring pitch, roll and bounce up and down motions of the vehicle;

(2) assuming that the two tires of the front axle of the vehicle have the same steering angle, slip angle, longitudinal force and lateral force, and similarly, assuming that the two tires of the rear axle of the vehicle have the same steering angle, slip angle, longitudinal force and lateral force;

(3) neglecting the effect of the vehicle suspension on the tire axle; it is assumed that the direction of the front wheels of the vehicle coincides with the current speed direction of the vehicle.

The vehicle is dynamically modeled according to the above requirements and assumptions. Because the dynamic model of a commercial vehicle is complex and some of its parameters are difficult to obtain, it needs to be appropriately simplified. Single-degree-of-freedom and two-degree-of-freedom dynamic models are too simple: they ignore the influence of factors such as the nonlinear characteristics of the tires on vehicle motion and cannot accurately describe the motion of a commercial vehicle while it is running. Therefore, balancing model accuracy against parameter complexity, the invention adopts a three-degree-of-freedom model, that is, longitudinal, lateral and yaw motion are considered, and vehicle dynamics modeling is carried out.

Point O is the center of mass of the vehicle, the left and right wheels of the front axle are merged into a single point located at point C, and the left and right wheels of the rear axle are merged into a single point located at point D. The dynamic model of the vehicle can be described as:

where a dot "·" above a variable denotes its derivative with respect to time (for example, the dotted v_x is the time derivative of v_x); ω_s, v_x, v_y, a_x, a_y respectively denote the yaw rate, longitudinal velocity, lateral velocity, longitudinal acceleration and lateral acceleration of the vehicle; M, δ, I_z respectively denote the vehicle mass, the front wheel steering angle and the moment of inertia about the vertical axis of the vehicle body coordinate system; l_f, l_r respectively denote the distances from the center of mass of the vehicle to the front and rear axles; F_xf, F_xr, F_yf, F_yr respectively denote the longitudinal and lateral forces acting on the front and rear wheels.

Wherein the lateral force of the tire can be expressed as:

F_yf = C_αf·α_f, F_yr = C_αr·α_r (2)

where C_αf, C_αr respectively denote the cornering stiffness of the front and rear tires, and α_f, α_r respectively denote the slip angles of the front and rear tires, with α_f = δ - (v_y + l_f·ω_s)/v_x and α_r = (l_r·ω_s - v_y)/v_x.

To calculate the tire longitudinal forces in equation (1), a tire model can be used. Common tire models include empirical models, theoretical models and adaptive models. To ensure the accuracy and real-time performance of the vehicle motion parameter measurement, the invention adopts the brush tire model, in which the longitudinal force of the tire can be expressed as:

where F_xf, F_xr respectively denote the longitudinal forces acting on the front and rear tires; C_xf, C_xr respectively denote the longitudinal stiffness of the front and rear tires; μ is the road adhesion coefficient; F_zf, F_zr respectively denote the vertical loads on the front and rear tires; s_xf, s_xr respectively denote the longitudinal slip ratios of the front and rear tires, obtained from equations (4) and (5):

where R_tyre is the tire radius; ω_f, ω_r respectively denote the rotational angular velocities of the front and rear wheels, which can be calculated from the linear velocity measured by the wheel speed sensors; v_xf, v_xr respectively denote the velocities along the tire direction at the front and rear axles, with v_xr = v_x and v_xf = v_x·cosδ + (v_y + l_f·ω_s)·sinδ.

Step two: establishing a road adhesion coefficient estimation model based on interacting multiple models

To calculate the road adhesion coefficient while the commercial vehicle is running, a recursive filtering estimation method can be used, which achieves accurate estimation of the road adhesion coefficient from a small number of system observations. A nonlinear Kalman filter is used to handle the nonlinear system state equation described in step one.

Among commonly used nonlinear filters, the particle filter has high computational complexity, and reducing the number of particles lowers the estimation accuracy. The extended Kalman filter introduces linearization error, which easily degrades the filtering performance for systems with complex models. The unscented Kalman filter (UKF) has computational complexity of the same order as the extended Kalman filter but higher parameter estimation accuracy, so the invention adopts the UKF algorithm to recursively estimate the road adhesion coefficient and the yaw rate, lateral velocity and longitudinal velocity of the vehicle.

With the vehicle and tire models described in expression (1), expression (2), and expression (3), 10 different UKF filter models were established for 10 cases in which the road surface adhesion coefficients were 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0, respectively. Therefore, there should be 10 filter state equations established. The 10 models have the same form, and the difference is only in the value of the road surface adhesion coefficient.

First, for the motion process of the vehicle, the system state vector is taken as X_l = [v_x v_y ω_s]^T, where, in the present invention, the superscript T on a matrix denotes its transpose and T alone denotes the discretization period. The system state equation is established from the dynamic model described by formula (1):

X_l＝f_l(X_l，U_l，W_l，γ_l) (6)

where the subscript l denotes the l-th model; f_l(·) is a 3-dimensional vector function; W_l is zero-mean system Gaussian white noise; γ_l is the zero-mean Gaussian white noise corresponding to the system external input; U_l is the system external input vector, U_l = [δ F_{l_xf} F_{l_xr}]^T, where δ is the front wheel steering angle, δ = ε_s/ρ_s, ε_s is the steering wheel angle, which can be obtained from the vehicle body CAN bus, and ρ_s is the gear ratio of the steering system; F_{l_xf} and F_{l_xr} respectively denote the longitudinal forces of the front and rear tires in the l-th model and are determined by the brush tire model; γ_l is the zero-mean Gaussian white noise vector corresponding to the system external input vector, whose components are ω_δ, the zero-mean Gaussian white noise corresponding to the external input δ, and the zero-mean Gaussian white noise terms corresponding to F_{l_xf} and F_{l_xr}; these noise terms are implicit in the system external inputs of the state equation.

Secondly, selecting an inertial measurement unit as a measurement sensor of the vehicle motion, and taking the longitudinal forward speed and the yaw rate of the vehicle as the system observation vectors, the observation equation of the system can be expressed as:

z(t)＝h(X(t)，V(t)) (7)

where h is an observation equation, t represents time, and a system observation vector z ═ v_{x_m} ω_{z_m}]^TWherein v is_{x_m}，ω_{z_m}Measurements representing the longitudinal forward speed and yaw rate of the vehicle, respectively, may be obtained by inertial measurement unit measurements.

In the actual filtering recursion, a discretized filtering model is required. To this end, equations (6) and (7) are discretized, and the discretized system state equation and observation equation are, respectively:

where k is the discrete time index; the system process noise is W_l = [w₁ w₂ w₃]^T, where w₁, w₂, w₃ are the three Gaussian white system noise components, and the Gaussian white noise covariance matrix corresponding to W_l(k-1) is built from the variances of w₁, w₂, w₃. U_l(k-1) is the system external input vector of the l-th model at time k-1. V_l is the system observation noise, V_l = [v₁ v₂]^T, where v₁, v₂ are the two Gaussian white observation noise components; the measurement noise covariance matrix corresponding to V_l(k) is built from the variances of v₁, v₂, which can be determined from the statistical properties of the sensor's position, velocity and yaw-rate measurement noise. The system external input noise consists of the zero-mean Gaussian white noise components corresponding to δ, F_xf and F_xr, which are implicit in the three external inputs of the system state function f_l. The system state function is:


(1) Interactive estimation computation

Let p_jl (j = 1, 2, ..., 10; l = 1, 2, ..., 10) denote the transition probability from model j to model l among the 10 UKF filtering models. The predicted model probability ρ_l(k, k-1) of the l-th model and the predicted mixing probability ρ_{j|l}(k-1) are, respectively:

Then, after the interactive estimation, the input of the l-th filter at time k is:
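The interaction step described above can be sketched in a few lines. This is a minimal illustration that assumes the standard interacting-multiple-model form (predicted probability as a transition-weighted sum, mixing probability by normalization, mixed state as a mixing-weighted sum); the function name `imm_interaction` and the two-model example values are hypothetical, and the covariance mixing is omitted for brevity.

```python
def imm_interaction(p, rho_prev, x_prev):
    """One IMM interaction (mixing) step, standard form (assumed here).

    p[j][l]     : transition probability from model j to model l
    rho_prev[j] : probability of model j at time k-1
    x_prev[j]   : state estimate of filter j at time k-1, e.g. [v_x, v_y, omega_s]
    """
    n = len(rho_prev)
    # Predicted model probability: rho_l(k, k-1) = sum_j p_jl * rho_j(k-1)
    rho_pred = [sum(p[j][l] * rho_prev[j] for j in range(n)) for l in range(n)]
    # Mixing probability: rho_{j|l}(k-1) = p_jl * rho_j(k-1) / rho_l(k, k-1)
    rho_mix = [
        [p[j][l] * rho_prev[j] / rho_pred[l] if rho_pred[l] > 0 else 0.0
         for l in range(n)]
        for j in range(n)
    ]
    # Mixed input state of filter l: sum_j rho_{j|l}(k-1) * x_j(k-1)
    dim = len(x_prev[0])
    x_mixed = [
        [sum(rho_mix[j][l] * x_prev[j][d] for j in range(n)) for d in range(dim)]
        for l in range(n)
    ]
    return rho_pred, rho_mix, x_mixed


# Hypothetical two-model example (the method itself uses 10 models).
p = [[0.9, 0.1], [0.2, 0.8]]
rho_pred, rho_mix, x_mixed = imm_interaction(
    p, [0.5, 0.5], [[10.0, 0.0, 0.0], [20.0, 0.0, 0.0]]
)
```

Each column of `rho_mix` sums to one, so every filter starts from a convex combination of the previous estimates.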

(2) model conditional filtering

1) Initializing input variables and calculating parameters

In the formula, P₀ is the initial error variance matrix. In the present invention, a variable carrying the superscript symbol ^ denotes the filtered estimate of that variable; for example, X̂₀ denotes the filtered estimate of the initial value X₀ of the input variable.

2) State estimation

In the formula, ξ_i(k-1) is a Sigma point, the square-root term is the i-th column of the square root of the weighted covariance matrix, and x_dim is the dimension of the state vector.

where λ is a distance parameter, λ = x_dim(α² - 1), α is the first scale factor, and the remaining coefficients are the weights of the mean and of the variance, respectively.
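The Sigma point construction can be illustrated with a short sketch. It assumes the standard scaled unscented transform with λ = x_dim(α² - 1) as stated above; the second scale factor `beta`, the default values, and the function names `cholesky` and `sigma_points` are assumptions introduced for illustration, and a Cholesky factor is used as the matrix square root.

```python
import math


def cholesky(a):
    """Lower-triangular factor low with a = low * low^T (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sigma_points(x_hat, p, alpha=0.5, beta=2.0):
    """Return 2*x_dim + 1 Sigma points with their mean and covariance weights."""
    n = len(x_hat)
    lam = n * (alpha ** 2 - 1.0)      # distance parameter lambda
    scale = n + lam                   # equals x_dim * alpha^2
    root = cholesky([[scale * p[r][c] for c in range(n)] for r in range(n)])
    points = [list(x_hat)]            # point 0 is the current estimate
    for sign in (1.0, -1.0):          # points 1..n, then n+1..2n
        for i in range(n):
            points.append([x_hat[r] + sign * root[r][i] for r in range(n)])
    w_mean = [lam / scale] + [1.0 / (2.0 * scale)] * (2 * n)
    w_cov = [lam / scale + (1.0 - alpha ** 2 + beta)] + [1.0 / (2.0 * scale)] * (2 * n)
    return points, w_mean, w_cov


# Hypothetical state [v_x, v_y, omega_s] and diagonal error covariance.
x0 = [20.0, 0.5, 0.1]
p0 = [[1.0, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 0.01]]
points, w_mean, w_cov = sigma_points(x0, p0)
```

Because the points are symmetric about the estimate, their weighted mean reproduces `x0` and their weighted spread reproduces `p0`.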

3) Time updating

ξ_i(k，k-1)＝f_l[ξ_i(k-1)]，i＝0，1，...，2x_dim (17)

In the formula, X̂_l(k-1) is the optimal estimate at time k-1, and P_l(k, k-1) is the one-step prediction error variance matrix at time k.

4) Observation update

χ_i(k，k-1)＝h_l[ξ_i(k，k-1)] (20)

In the formula, χ_i(k, k-1) is the value obtained by passing the Sigma point set through the observation equation; the predicted observation is the one-step prediction for time k recursed from time k-1, with its associated prediction covariance; and P_XZ is the covariance between the state values and the measured values.

5) Filter update

In the formula, K_l(k) is the filter gain matrix, X̂_l(k) is the state estimate, and P_l(k) is the estimation error variance matrix.

(3) Model probability update

After each model completes the update of the previous step, the maximum likelihood function Λ_l(k) is used to calculate the new model probabilities:

According to Bayes' theorem, the model probability ρ_l(k) of the l-th model at time k is:

(4) Calculating the road surface adhesion coefficient

After the posterior probability of each model being correct has been calculated, the state estimates of all filters are first summed with probability weighting, the weighting coefficients being these posterior probabilities. This gives the final state estimate, i.e. the filtered vehicle longitudinal speed, lateral speed and yaw rate. Second, the road adhesion coefficient μ at the current time is obtained by probability-weighting the adhesion coefficients assigned to the models:

In the formula, μ_l is the road surface adhesion coefficient of the l-th model, l = 1, 2, ..., 10, with μ₁ = 0.1, μ₂ = 0.2, ..., μ₁₀ = 1.0.
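The probability update and the adhesion fusion can be sketched together. The sketch assumes the usual Bayes normalization (likelihood times predicted probability, divided by the sum over all models) and the probability-weighted sum of the ten preset coefficients described above; the function names and the example likelihood values are hypothetical.

```python
# Preset adhesion coefficients mu_1 .. mu_10 = 0.1, 0.2, ..., 1.0
MU_GRID = [l / 10 for l in range(1, 11)]


def update_model_probabilities(likelihoods, rho_pred):
    """Bayes update: rho_l(k) is proportional to Lambda_l(k) * rho_l(k, k-1)."""
    weighted = [lk * rp for lk, rp in zip(likelihoods, rho_pred)]
    total = sum(weighted)
    if total <= 0.0:
        raise ValueError("all model likelihoods are zero")
    return [w / total for w in weighted]


def fuse_adhesion(rho):
    """Road adhesion estimate: probability-weighted sum of the preset coefficients."""
    return sum(r * mu for r, mu in zip(rho, MU_GRID))


# Hypothetical case: only models 3 and 4 explain the measurement, equally well.
likelihoods = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
rho = update_model_probabilities(likelihoods, [0.1] * 10)
mu = fuse_adhesion(rho)
```

In this example the estimate falls midway between the two surviving models' coefficients, which shows how the fused value can lie between grid points.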

Step three: establishing an anti-collision driving decision model based on reinforcement learning

To address the lack of accuracy and road-condition adaptability in existing anti-collision driving decision methods for large commercial vehicles, the invention takes the influence of road adhesion conditions on driving decisions into account and establishes an accurate, reliable and road-adaptive anti-collision driving decision model. Reinforcement learning learns by trial and error with the goal of maximizing the obtained reward, generates the driving strategy through interaction with the environment, and has strong decision-making capability. The anti-collision driving decision model is therefore established with a reinforcement learning algorithm.

Common reinforcement learning algorithms fall into offline and online learning modes. An offline learning method yields the optimal behavior only after the learning algorithm has converged, so it cannot meet the need of a commercial vehicle to interact with the actual traffic environment and learn its strategy in real time. An online learning method needs no environment model, and the value function is iterated in step with the running state of the commercial vehicle in the traffic environment. The SARSA algorithm is based on Q-value iteration; with a greedy strategy, the optimal strategy and the action-value function can be guaranteed to converge, so the optimal anti-collision strategy under different road adhesion conditions can be output more reliably. The invention therefore uses the SARSA algorithm to establish the anti-collision driving decision model and to study adaptive anti-collision driving strategies under different road conditions. This comprises the following 6 sub-steps:

Substep 1: establishing a state space

The driving safety of a large commercial vehicle depends on the motion state of the vehicle itself and on its motion relative to the obstacles in front and behind. The state space is therefore built from the sensor-measured motion state of the commercial vehicle, the relative motion state, and the road adhesion coefficient and vehicle yaw rate output by the preceding steps:

S_t＝(v_sx，v_sy，v_sf，v_sr，a_sx，a_sy，d_sf，d_sr，ω_s，θ_str，δ_br，δ_thr，μ) (30)

In the formula, v_sf, v_sr denote the speeds of the large commercial vehicle relative to the front vehicle and the rear vehicle, in meters per second; a_sx, a_sy denote the transverse and longitudinal accelerations of the large commercial vehicle, in meters per second squared; d_sf, d_sr denote the relative distances to the front vehicle and the rear vehicle, in meters; ω_s is the yaw rate of the large commercial vehicle, in radians per second; θ_str is the steering wheel angle of the large commercial vehicle, in degrees; δ_br, δ_thr denote the brake pedal opening and the throttle opening of the large commercial vehicle, in percent; μ is the road adhesion coefficient.

Substep 2: establishing a behavioral space

To establish a more accurate and reliable anti-collision driving strategy, the invention considers both the lateral and the longitudinal motion of the vehicle, takes the steering wheel angle and the normalized acceleration/braking quantities as control quantities, and defines the driving strategy output by the decision model, i.e. the behavior space:

A_t＝[θ_{str_out}，δ_{br_out}，δ_{thr_out}] (31)

In the formula, A_t is the action decision at time t; θ_{str_out} is the normalized steering wheel angle control quantity, in the range [-1, 1]; δ_{br_out}, δ_{thr_out} are the normalized brake pedal control quantity and the normalized throttle opening control quantity, both in the range [0, 1].
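One way to keep an output action inside the stated ranges is to clamp each component before it is used. The helper below is a hypothetical illustration, not part of the described method; only the ranges [-1, 1] and [0, 1] come from the text.

```python
def clip_action(theta_str_out, delta_br_out, delta_thr_out):
    """Clamp an action A_t = [theta_str_out, delta_br_out, delta_thr_out] to its ranges."""
    def clamp(value, low, high):
        return max(low, min(high, value))

    return (
        clamp(theta_str_out, -1.0, 1.0),  # normalized steering wheel angle
        clamp(delta_br_out, 0.0, 1.0),    # normalized brake pedal quantity
        clamp(delta_thr_out, 0.0, 1.0),   # normalized throttle opening
    )


action = clip_action(1.7, -0.2, 0.4)
```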

Substep 3: establishing a reward function

To evaluate the quality of an action A_t quantitatively, the evaluation is expressed numerically through a reward function. If executing A_t makes the driving state of the large commercial vehicle safer, the return value is a positive reward; otherwise it is a negative reward, so that the anti-collision driving decision model can judge the erroneous action executed last time.

Unlike passenger vehicles, large commercial vehicles have a higher center of mass and a larger load capacity, and are prone to rollover during emergency braking, sharp steering and lane changing. When an anti-collision driving strategy is established, both vehicle collision and rollover must therefore be considered, and the reward function is designed as follows:

R_t＝r₁+r₂+r₃ (32)

In the formula, R_t is the reward function at time t, r₁ is the safety distance reward function, r₂ is the comfort reward function, and r₃ is the penalty function.

First, to prevent a collision, the commercial vehicle should keep a certain safety gap to both the front vehicle and the rear vehicle. Since the braking distance is longer on a road surface with a low adhesion coefficient, a safety distance reward function r₁ that takes the road adhesion coefficient into account is designed:

In the formula, ω₁, ω₂ are the weighting coefficients of the safety distance reward function.

Second, to ensure driving comfort, excessive jerk should be avoided as far as possible, so the comfort reward function is designed as r₂ = -|a_sy(t+1) - a_sy(t)|.

Finally, to judge erroneous actions of the vehicle, a penalty function r₃ is designed:

In the formula, S_pen is the penalty. In the present invention S_pen = -500, i.e. the decision model receives a penalty of -500 when the vehicle collides or rolls over.
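The composition R_t = r₁ + r₂ + r₃ can be sketched as follows. Because the explicit expression for r₁ is not reproduced here, the sketch takes r₁ as an input rather than guessing its form; it also assumes that r₃ is zero when neither a collision nor a rollover occurs. The function names are hypothetical.

```python
S_PEN = -500.0  # penalty applied on collision or rollover


def comfort_reward(a_sy_next, a_sy_now):
    """r2 = -|a_sy(t+1) - a_sy(t)|: penalizes abrupt changes in acceleration."""
    return -abs(a_sy_next - a_sy_now)


def penalty(collided, rolled_over):
    """r3: S_PEN on collision or rollover; zero otherwise (an assumption)."""
    return S_PEN if (collided or rolled_over) else 0.0


def total_reward(r1, a_sy_next, a_sy_now, collided, rolled_over):
    """R_t = r1 + r2 + r3, with the safety distance reward r1 supplied by the caller."""
    return r1 + comfort_reward(a_sy_next, a_sy_now) + penalty(collided, rolled_over)


safe_step = total_reward(1.0, 0.5, 0.2, collided=False, rolled_over=False)
crash_step = total_reward(0.0, 0.0, 0.0, collided=True, rolled_over=False)
```

The large magnitude of the penalty relative to the other two terms means a single collision or rollover dominates the return of an episode.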

Substep 4: establishing a behavior selection mechanism

Considering real-time interaction with the actual traffic environment in the driving decision learning process, the method adopts the Pursuit function to establish an anti-collision decision behavior updating mechanism.

where, at time t+1, the action decision A_t = argmax Q(S_t, A_t) is selected with the updated greedy probability, and any other action is selected with probability π_{t+1}(A_{t+1}).
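A pursuit-style update can be illustrated as below. The sketch assumes the textbook pursuit rule over a finite set of candidate actions, in which the probability of the greedy action is moved toward 1 and all other probabilities toward 0; the step size `beta`, the discrete action set and the function names are assumptions for illustration.

```python
import random


def pursuit_update(pi, q_values, beta=0.2):
    """Move selection probabilities toward the action with the largest Q value."""
    greedy = max(range(len(q_values)), key=lambda a: q_values[a])
    return [
        prob + beta * ((1.0 if a == greedy else 0.0) - prob)
        for a, prob in enumerate(pi)
    ]


def sample_action(pi, rng):
    """Draw an action index according to the selection probabilities."""
    return rng.choices(range(len(pi)), weights=pi, k=1)[0]


# Hypothetical example with four candidate actions.
pi_next = pursuit_update([0.25, 0.25, 0.25, 0.25], [0.0, 1.0, 0.5, 0.2])
chosen = sample_action(pi_next, random.Random(0))
```

The update preserves the total probability, so the result can be sampled from directly while still leaving some chance of exploring non-greedy actions.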

Substep 5: training anti-collision driving decision model based on SARSA

1) Initializing a Q value matrix and a behavior selection matrix;

3) Using the learned Q values, selecting a driving decision A_t from the set of actions available in state S_t according to the behavior selection strategy;

5) Updating the Q value, wherein the updating method comprises the following steps:

In the formula, the two coefficients denote the learning rate and the discount factor, respectively.

6) Assigning the new state S'_t to S_t and the new action A'_t to A_t;

7) Repeating the steps 3), 4), 5) and 6) until the training process is finished.
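The Q-value update at the heart of this loop can be sketched with the standard tabular SARSA rule. The sketch assumes that states and actions have been discretized into hashable keys, which the text does not specify; the names `sarsa_update`, `lr` and `gamma` and the example values are hypothetical.

```python
from collections import defaultdict


def sarsa_update(q, s, a, r, s_next, a_next, lr=0.1, gamma=0.9):
    """Standard SARSA rule: Q(s,a) += lr * (r + gamma * Q(s',a') - Q(s,a))."""
    td_target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += lr * (td_target - q[(s, a)])
    return q[(s, a)]


# Q table with unseen (state, action) pairs defaulting to zero.
q = defaultdict(float)
q[("s1", "a1")] = 2.0

# One on-policy step: (S_t, A_t, R_t, S'_t, A'_t), then S_t <- S'_t, A_t <- A'_t.
new_value = sarsa_update(q, "s0", "a0", 1.0, "s1", "a1", lr=0.5, gamma=0.9)
state, action = "s1", "a1"
```

Unlike Q-learning, the target uses the action actually chosen in the next state, which is why the behavior selection mechanism and the update are coupled.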

Substep 6: outputting a driving strategy using an anti-collision driving decision model

Feeding the parameters of the state space into the trained anti-collision driving decision model outputs reasonable steering wheel angle, brake pedal and throttle opening control quantities in real time. This provides the driver with accurate, quantified and reliable anti-collision driving advice, and thereby achieves an accurate and reliable anti-collision driving decision for large commercial vehicles that adapts to road adhesion conditions.

Beneficial effects: compared with general vehicle anti-collision decision strategies, the method provided by the invention is more accurate, more reliable and adaptive, specifically:

(1) the method comprehensively considers the influence of forward and backward obstacles on vehicle collision, accurately quantifies safe driving strategies such as driving speed, steering wheel steering and the like in a numerical form, and realizes accurate and reliable collision-proof driving decision of large commercial vehicles;

(2) the method provided by the invention considers the influence of unstable rollover of a large-scale commercial vehicle on driving safety, so that the output driving decision strategy can not only prevent the occurrence of collision accidents, but also avoid the rollover accidents of the vehicle in the collision avoidance process, and further improve the accuracy and reliability of anti-collision driving decision;

(3) the method provided by the invention can adapt to different road conditions, the output driving strategy can be adaptively adjusted according to the change of the road adhesion conditions, and the defects of the existing anti-collision driving strategy of a large-scale commercial vehicle that the accuracy is poor and the road condition adaptability is poor are overcome.

Drawings

FIG. 1 is a schematic diagram of a technical route of the present invention;

FIG. 2 is a schematic representation of a vehicle dynamics model of the present invention.

Detailed Description

The technical scheme of the invention is further explained by combining the attached drawings.

In order to establish an accurate, reliable and self-adaptive anti-collision driving strategy for road adhesion conditions, the invention provides an anti-collision driving decision method considering the road adhesion conditions for large-scale operation vehicles such as semi-trailer trains and semi-trailer tankers. Firstly, a three-degree-of-freedom commercial vehicle motion model is established. Secondly, a road adhesion condition estimation model based on the interactive multi-model is established, and the road adhesion coefficient is accurately identified. And finally, describing the anti-collision decision problem as a Markov decision process, and establishing an anti-collision driving decision model based on reinforcement learning to obtain an accurate, reliable and self-adaptive anti-collision decision strategy for road conditions. The technical route of the invention is shown in figure 1, and the specific steps are as follows:

Step one: establishing a dynamic model of vehicle motion

(1) ignoring pitch, roll and bounce up and down motions of the vehicle;

(3) neglecting the effect of the vehicle suspension on the tire axle; it is assumed that the direction of the front wheels of the vehicle coincides with the current speed direction of the vehicle.

The vehicle is dynamically modeled under the above requirements and assumptions. Because the dynamic model of a commercial vehicle is complex and some of its parameters are difficult to obtain, it needs to be suitably simplified. Single- and two-degree-of-freedom dynamic models are too simple: they ignore the influence of factors such as the nonlinear characteristics of the tires on vehicle motion and cannot accurately describe the motion of a commercial vehicle while driving. Balancing model accuracy against parameter complexity, the invention therefore uses a three-degree-of-freedom model for vehicle dynamics modeling.

Fig. 2 defines the three-degree-of-freedom dynamic model of the vehicle, which considers longitudinal motion, lateral motion and yaw rotation. Point O is the center of mass of the vehicle; the left and right wheels of the front axle are merged into a single point located at point C, and the left and right wheels of the rear axle are merged into a single point located at point D. According to Fig. 2, the dynamic model of the vehicle can be described as:

In the formula, a dot over a variable denotes its time derivative, e.g. the dotted v_x is the derivative of v_x; ω_s, v_x, v_y, a_x, a_y denote the yaw rate, longitudinal speed, lateral speed, longitudinal acceleration and lateral acceleration of the vehicle, respectively; M, δ, I_z denote the mass of the vehicle, the front wheel steering angle and the moment of inertia about the vertical axis of the vehicle body coordinate system, respectively; l_f, l_r denote the distances from the center of mass of the vehicle to the front and rear axles, respectively; F_xf, F_xr, F_yf, F_yr denote the longitudinal and lateral forces applied to the front and rear wheels, respectively.

Wherein the lateral force of the tire can be expressed as:

F_yf＝C_αf·α_f F_yr＝C_αr·α_r (2)

In the formula, C_αf, C_αr denote the cornering stiffnesses of the front and rear tires, respectively, and α_f, α_r denote the slip angles of the front and rear tires, respectively, with α_f = δ - (v_y + l_f·ω_s)/v_x and α_r = (l_r·ω_s - v_y)/v_x.
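The linear tire relation of formula (2) can be evaluated directly. The sketch below takes the yaw rate as `omega_s` and assumes a nonzero longitudinal speed; the function names and all numerical values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def slip_angles(delta, v_x, v_y, omega_s, l_f, l_r):
    """Front and rear tire slip angles (rad) from the small-angle expressions."""
    if v_x == 0.0:
        raise ValueError("longitudinal speed v_x must be nonzero")
    alpha_f = delta - (v_y + l_f * omega_s) / v_x
    alpha_r = (l_r * omega_s - v_y) / v_x
    return alpha_f, alpha_r


def lateral_forces(c_af, c_ar, alpha_f, alpha_r):
    """Formula (2): F_yf = C_af * alpha_f, F_yr = C_ar * alpha_r."""
    return c_af * alpha_f, c_ar * alpha_r


# Hypothetical operating point: 20 m/s, slight left steer, small yaw rate.
alpha_f, alpha_r = slip_angles(
    delta=0.05, v_x=20.0, v_y=0.2, omega_s=0.1, l_f=1.5, l_r=2.5
)
f_yf, f_yr = lateral_forces(1.0e5, 1.0e5, alpha_f, alpha_r)
```

The linear relation holds only for small slip angles; at larger angles the tire force saturates, at a level that depends on the road adhesion coefficient.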

Step two: establishing a road adhesion coefficient estimation model based on interacting multiple models

X_l＝f_l(X_l，U_l，W_l，γ_l) (6)

In the formula, the subscript l denotes the l-th model; f_l(·) is a 3-dimensional vector function; W_l is the zero-mean Gaussian white system noise; γ_l is the zero-mean Gaussian white noise corresponding to the system external input; U_l is the system external input vector, U_l = [δ F_{l_xf} F_{l_xr}]^T, where δ is the front wheel steering angle, δ = ε_s/ρ_s, ε_s is the steering wheel angle, which can be obtained from the vehicle body CAN bus, and ρ_s is the transmission ratio of the steering system; F_{l_xf} and F_{l_xr} are the longitudinal forces of the front and rear tires in the l-th model, determined by the brush tire model. γ_l is the zero-mean Gaussian white noise vector corresponding to the system external input vector: its component ω_δ is the zero-mean Gaussian white noise corresponding to the external input δ, and its other two components are the zero-mean Gaussian white noises corresponding to F_{l_xf} and F_{l_xr}. These noises are implicit in the system external input of the state equation;

Z(t)＝h(X(t)，V(t)) (7)


(1) Interactive estimation computation

Then, after the interactive estimation, the input of the l-th filter at time k is:

(2) model conditional filtering

1) Initializing input variables and calculating parameters

2) State estimation

In the formula, ξ_i(k-1) is a Sigma point, the square-root term is the i-th column of the square root of the weighted covariance matrix, and x_dim is the dimension of the state vector.

where λ is a distance parameter, λ = x_dim(α² - 1), α is the first scale factor, and the remaining coefficients are the weights of the mean and of the variance, respectively.

3) Time updating

ξ_i(k，k-1)＝f_l[ξ_i(k-1)]，i＝0，1，...，2x_dim (17)

In the formula, X̂_l(k-1) is the optimal estimate at time k-1, and P_l(k, k-1) is the one-step prediction error variance matrix at time k.

4) Observation update

χ_i(k，k-1)＝h_l[ξ_i(k，k-1)] (20)

In the formula, χ_i(k, k-1) is the value obtained by passing the Sigma point set through the observation equation; the predicted observation is the one-step prediction for time k recursed from time k-1, with its associated prediction covariance; and P_XZ is the covariance between the state values and the measured values.

5) Filter update

In the formula, K_l(k) is the filter gain matrix, X̂_l(k) is the state estimate, and P_l(k) is the estimation error variance matrix.

(3) Model probability update

After each model completes the update of the previous step, the maximum likelihood function Λ_l(k) is used to calculate the new model probabilities:

According to Bayes' theorem, the model probability ρ_l(k) of the l-th model at time k is:

(4) Calculating the road surface adhesion coefficient

After the posterior probability of each model being correct has been calculated, the state estimates of all filters are first summed with probability weighting, the weighting coefficients being these posterior probabilities. This gives the final state estimate, i.e. the filtered vehicle longitudinal speed, lateral speed and yaw rate. Second, the road adhesion coefficient μ at the current time is obtained by probability-weighting the adhesion coefficients assigned to the models:

In the formula, μ_l is the road surface adhesion coefficient of the l-th model, l = 1, 2, ..., 10, with μ₁ = 0.1, μ₂ = 0.2, ..., μ₁₀ = 1.0.

Step three: establishing an anti-collision driving decision model based on reinforcement learning

Substep 1: establishing a state space

S_t＝(v_sx，v_sy，v_sf，v_sr，a_sx，a_sy，d_sf，d_sr，ω_s，θ_str，δ_br，δ_thr，μ) (30)

In the formula, v_sf, v_sr denote the speeds of the large commercial vehicle relative to the front vehicle and the rear vehicle, in meters per second; a_sx, a_sy denote the transverse and longitudinal accelerations of the large commercial vehicle, in meters per second squared; d_sf, d_sr denote the relative distances to the front vehicle and the rear vehicle, in meters; ω_s is the yaw rate of the large commercial vehicle, in radians per second; θ_str is the steering wheel angle of the large commercial vehicle, in degrees; δ_br, δ_thr denote the brake pedal opening and the throttle opening of the large commercial vehicle, in percent; μ is the road adhesion coefficient.

Substep 2: establishing a behavioral space

A_t＝[θ_{str_out}，δ_{br_out}，δ_{thr_out}] (31)

Substep 3: establishing a reward function

To evaluate the quality of an action A_t quantitatively, the evaluation is expressed numerically through a reward function. If executing A_t makes the driving state of the large commercial vehicle safer, the return value is a positive reward; otherwise it is a negative reward, so that the anti-collision driving decision model can judge the erroneous action executed last time.

R_t＝r₁+r₂+r₃ (32)

In the formula, R_t is the reward function at time t, r₁ is the safety distance reward function, r₂ is the comfort reward function, and r₃ is the penalty function.

First, to prevent a collision, the commercial vehicle should keep a certain safety gap to both the front vehicle and the rear vehicle. Since the braking distance is longer on a road surface with a low adhesion coefficient, a safety distance reward function r₁ that takes the road adhesion coefficient into account is designed:

In the formula, ω₁, ω₂ are the weighting coefficients of the safety distance reward function.

Finally, to judge erroneous actions of the vehicle, a penalty function r₃ is designed:

In the formula, S_pen is the penalty. In the present invention S_pen = -500, i.e. the decision model receives a penalty of -500 when the vehicle collides or rolls over.

Substep 4: establishing a behavior selection mechanism

where, at time t+1, the action decision A_t = argmax Q(S_t, A_t) is selected with the updated greedy probability, and any other action is selected with probability π_{t+1}(A_{t+1}).