Method and system for identifying driving risk of driver

Document No.: 191563. Publication date: 2021-11-02.

Reading note: This technology, "Method and system for identifying driving risk of driver", was designed by 王旭, 马菲, 廖小棱, 张伟, 于迪, 常玉涛 and 陈西广 on 2021-08-11. The invention belongs to the field of driving behavior data processing and provides a method and a system for identifying the driving risk of a driver. The method comprises: acquiring driving data and extracting driving behavior characteristic indexes from the driving data; identifying the driving behavior based on the driving behavior characteristic indexes and a behavior classification model; and outputting risk early warning information when the driving behavior is of the aggressive type. The driving behaviors comprise three driving styles: aggressive, normal and calm.

1. A method for identifying driving risk of a driver is characterized by comprising the following steps:

acquiring driving data, and extracting driving behavior characteristic indexes from the driving data;

identifying the driving behavior based on the driving behavior characteristic indexes and a behavior classification model, and outputting risk early warning information when the driving behavior is of the aggressive type; the driving behaviors comprise three driving styles: aggressive, normal and calm.

2. The driver driving risk assessment method according to claim 1, wherein the process of extracting driving behavior feature indicators from the driving data comprises:

extracting driving travel events of different drivers from the driving data set, calculating driving characteristic parameters and establishing a driving style index system;

the driving comprehensive evaluation score obtained by principal component analysis is used as an input variable to realize the classification of the driving style;

and obtaining the characteristic index most closely related to the driving behavior by comparing the optimal characteristic screening results of at least two set screening methods.

3. The method for identifying driving risk of driver as claimed in claim 2, wherein the set screening method comprises support vector machine-recursive feature elimination algorithm and random forest-recursive feature elimination algorithm.

4. The method as claimed in claim 3, wherein the best feature screening results of the support vector machine-recursive feature elimination algorithm include average longitudinal acceleration, average vertical acceleration, standard deviation of velocity, minimum longitudinal acceleration, minimum vertical acceleration and maximum velocity.

5. The method for identifying the driving risk of the driver as claimed in claim 4, wherein the optimal feature screening result of the random forest-recursive feature elimination algorithm comprises maximum speed, minimum longitudinal acceleration, minimum vertical acceleration, speed standard deviation, distance and average speed.

6. The method for identifying driving risk of a driver according to claim 5, wherein the characteristic index most closely related to the driving behavior is any one of or any combination of a minimum longitudinal acceleration, a minimum vertical acceleration and a maximum speed.

7. The method for identifying driving risk of driver according to claim 1, wherein the behavior classification model is a K-means++ clustering model.

8. A driver driving risk assessment system, comprising:

the behavior characteristic index extraction module is used for acquiring driving data and extracting driving behavior characteristic indexes from the driving data;

the driving behavior recognition early warning module is used for recognizing the driving behavior based on the driving behavior characteristic index and the behavior classification model and outputting risk early warning information when the driving behavior is aggressive; the driving behaviors comprise three driving styles: aggressive, normal and calm.

9. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of the method for driver driving risk assessment according to any one of claims 1-7.

10. A computer arrangement comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the driver driving risk assessment method according to any one of claims 1-7 when executing the program.

Technical Field

The invention belongs to the field of driving behavior data processing, and particularly relates to a method and a system for identifying driving risk of a driver.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.

Road traffic safety has been a focus of concern in the global traffic field in recent years. Analyses of the causes of a large number of road traffic accidents indicate that over 80% of accidents are associated with driver behavior, and that driving style is strongly correlated with accident rate. The National Highway Traffic Safety Administration (NHTSA) in the United States finds that aggressive driving behavior accounts for about two thirds of all fatal traffic accidents. The more aggressive the driving style of a driver, the more likely bad driving behaviors such as rapid speed changes, frequent lane changes and speeding are to appear during driving. These bad driving behaviors frequently cause serious traffic accidents, which has prompted researchers to pay attention to the influence of driving style on traffic safety.

Driving style refers to the relatively stable behavioral characteristics shown by a driver when operating a vehicle, and is an individual, differentiated behavioral tendency. At present, many scholars at home and abroad extract characteristic parameters that represent the driving style on the basis of analyzing the factors that influence it, and then classify the driving style. Early research on driving style took the form of questionnaires, such as the Driving Behavior Questionnaire (DBQ) and the Multidimensional Driving Style Inventory (MDSI), designed from cultural, gender, regional and other perspectives. Although the questionnaire survey method is simple and feasible, the accuracy and reliability of the survey results are difficult to guarantee because they are influenced by the subjective emotions of the driver.

The development of the internet of vehicles and big data technology has prompted many scholars to build driving style classification systems from objective parameters in naturalistic driving data. For example, Bellem et al. use the mean, standard deviation or extreme values of naturalistic driving data such as acceleration, speed or pedal position as characteristic parameters, then apply the Principal Component Analysis (PCA) algorithm to reduce the dimensionality of the features, and use K-means clustering to divide the driving styles. In general, current research on identifying driving style from naturalistic driving data falls into two categories. 1) The driving style of the driver is identified directly from the driving characteristic parameters using an unsupervised learning algorithm. For example, Bender et al. combine a Bayesian multivariate linear model with a sequence segmentation algorithm to infer driving behaviors from naturalistic driving data, and Koh et al. directly classify driving styles into grades using a Gaussian Mixture Model (GMM). However, such identification requires a large amount of processing and analysis of the sample data to obtain reliable classification results. 2) A clustering algorithm is first used to assign class labels to the driver style samples, and a driving style recognition model is then established and its recognition accuracy optimized.

The K-means method is a simple, easily understood clustering algorithm that researchers often use to divide driving style samples. However, studies show that K-means has certain limitations: its clustering effect is influenced by the number of clusters K and by the random selection of the initial cluster centers. The K-means++ algorithm improves the selection of the initial cluster centers for a given K value. Therefore, both K-means and K-means++ are used to divide the driving style samples in this research, and the division with the better clustering effect is selected as the input of the subsequent recognition model. In addition, the inventors find that existing driving style evaluation indexes are numerous. Selecting too many indexes increases the difficulty of data acquisition and processing and the communication bandwidth required by the identification system, and can also reduce the accuracy of the driving style identification result, so that the driver cannot be given a correct early warning in time and the reliability of the identification system is reduced. Requiring an excessive number of data indexes can also threaten the privacy of the system's users. However, existing research on driving style classification and identification pays little attention to how to select an index set of reasonable size and type that accurately reflects the driving style and supports early warning of dangerous driving behavior.

Disclosure of Invention

In order to solve the technical problems in the background art, the invention provides a method and a system for identifying driving risks of a driver.

In order to achieve the purpose, the invention adopts the following technical scheme:

A first aspect of the invention provides a driver driving risk identification method.

A driver driving risk assessment method, comprising:

acquiring driving data, and extracting driving behavior characteristic indexes from the driving data;

Further, the process of extracting the driving behavior feature index from the driving data includes:

extracting driving travel events of different drivers from the driving data set, calculating driving characteristic parameters and establishing a driving style index system;

the driving comprehensive evaluation score obtained by principal component analysis is used as an input variable to realize the classification of the driving style;

and obtaining the characteristic index most closely related to the driving behavior by comparing the optimal characteristic screening results of at least two set screening methods.

Further, the set screening method comprises a support vector machine-recursive feature elimination algorithm and a random forest-recursive feature elimination algorithm.

Further, the optimal feature screening result of the support vector machine-recursive feature elimination algorithm comprises average longitudinal acceleration, average vertical acceleration, speed standard deviation, minimum longitudinal acceleration, minimum vertical acceleration and maximum speed.

Further, the optimal feature screening result of the random forest-recursive feature elimination algorithm comprises maximum speed, minimum longitudinal acceleration, minimum vertical acceleration, speed standard deviation, distance and average speed.

Further, the characteristic index most closely related to the driving behavior is any one of or any combination of a minimum longitudinal acceleration, a minimum vertical acceleration, and a maximum speed.

Further, the behavior classification model is a K-means++ clustering model.

A second aspect of the invention provides a driver driving risk assessment system.

A driver driving risk assessment system, comprising:

the behavior characteristic index extraction module is used for acquiring driving data and extracting driving behavior characteristic indexes from the driving data;

A third aspect of the invention provides a computer-readable storage medium.

A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for driver driving risk assessment as described above.

A fourth aspect of the invention provides a computer apparatus.

A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the method for driver driving risk assessment as described above when executing the program.

Compared with the prior art, the invention has the beneficial effects that:

The driving behavior characteristic indexes are extracted from the driving data, the driving behavior is identified based on these indexes and the behavior classification model, and risk early warning information is output when the driving behavior is aggressive. The driving behaviors comprise three driving styles (aggressive, normal and calm), and the accuracy of driving style identification is improved.

Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention. They illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention and not to limit it.

FIG. 1 is a flow chart of a method for identifying driving risk of a driver according to an embodiment of the present invention;

FIG. 2 is a principal component contribution ratio graph of an embodiment of the present invention;

FIG. 3 is a graphical representation of feature counts and cross-validation correct classification scores for an embodiment of the present invention;

FIG. 4 is a maximum velocity profile of an embodiment of the present invention;

FIG. 5 is an average velocity profile of an embodiment of the present invention;

FIG. 6 is a diagram of a neural network model according to an embodiment of the present invention;

FIG. 7 is a graph of test sample test results for an embodiment of the present invention.

Detailed Description

The invention is further described with reference to the following figures and examples.

It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.

Example one

As shown in fig. 1, the method for identifying driving risk of a driver in this embodiment specifically includes the following steps:

Step S101: The behavior characteristic index extraction module acquires the driving data and extracts the driving behavior characteristic indexes from the driving data.

In a specific implementation, the process of extracting the driving behavior feature index from the driving data includes:

extracting driving travel events of different drivers from the driving data set, calculating driving characteristic parameters and establishing a driving style index system;

the driving comprehensive evaluation score obtained by principal component analysis is used as an input variable to realize the classification of the driving style;

and obtaining the characteristic index most closely related to the driving behavior by comparing the optimal characteristic screening results of at least two set screening methods.

A driving travel event here refers to the trips made by different drivers (i.e., how many miles each driver travels, the speed during travel, acceleration/deceleration, etc.).

For example, the set screening methods comprise a support vector machine-recursive feature elimination algorithm and a random forest-recursive feature elimination algorithm. The optimal feature screening result of the support vector machine-recursive feature elimination algorithm comprises average longitudinal acceleration, average vertical acceleration, speed standard deviation, minimum longitudinal acceleration, minimum vertical acceleration and maximum speed. The optimal feature screening result of the random forest-recursive feature elimination algorithm comprises maximum speed, minimum longitudinal acceleration, minimum vertical acceleration, speed standard deviation, distance and average speed.

The characteristic index most closely related to the driving behavior is any one of or any combination of the minimum longitudinal acceleration, the minimum vertical acceleration, and the maximum speed.

Preferably, the maximum speed is selected as the characteristic index most closely related to the driving behavior.

In this embodiment, based on the Basic Safety Message (BSM) sub-dataset of the Safety Pilot Model Deployment (SPMD) data set, driving travel events of different drivers are extracted from the naturalistic driving data, driving characteristic parameters are calculated, and a driving style index system is established. The comprehensive driving evaluation scores obtained by principal component analysis are used as input variables, and driving style classification is realized using K-means and K-means++. Finally, the optimal feature screening results of the support vector machine-recursive feature elimination algorithm (Support Vector Classification-Recursive Feature Elimination, SVC-RFE) and the random forest-recursive feature elimination algorithm (Random Forest-Recursive Feature Elimination, RF-RFE) are compared to obtain the characteristic indexes most closely related to driving behavior. The effectiveness of the selected driving style indexes is verified with a neural network driving style recognition model, and the differences in the selected characteristic indexes between drivers of different styles are explored.

This embodiment uses the primary data set, the Basic Safety Message (BSM), of the Safety Pilot Model Deployment (SPMD) program in the United States to observe and study the microscopic driving behavior of drivers. SPMD (https://www.its.dot.gov/data/) conducted multimodal traffic tests on nearly 3000 connected vehicles equipped with vehicle-to-vehicle (V2V) communication devices in Ann Arbor, Michigan, and comprehensively collected the driving state data of each vehicle. It is one of the largest field data collection projects for connected vehicles to date.

The BSM data set mainly contains data on the vehicle motion state (i.e. speed, acceleration and yaw rate) and position. Specifically, data for 4 months in 2013 in the BSM master file "BsmP1" are used in this study. The BsmP1 data are high-resolution microscopic traffic data measured at a frequency of 10 Hz. Although each individual time point observation contains information about speed, acceleration and yaw rate, it lacks the background information about the entire driving event that is needed to describe the driving behavior of the vehicle and to study the driving style. Therefore, this study first ensures that the data format is suitable for driving style clustering. To analyze the driving style of the driver more intuitively and reliably, MATLAB is used to divide the data of each vehicle according to continuous trips and vehicle IDs, and the different trips of the same driver are combined. The time point data of speed, acceleration and yaw rate are then processed statistically to generate quantitative driving style indexes such as the mean, standard deviation, maximum, minimum and trip distance, which yields a quantitative driving style data set for 242 drivers. Further, considering that a driver is more sensitive to a rapid change in acceleration than to the acceleration itself, this study also introduces the rate of change of acceleration (jerk) as a driving style evaluation index. The indexes selected in this example are shown in Table 1.
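As a rough illustration of the statistical processing described above, the following Python sketch aggregates 10 Hz point records into per-driver indexes. The record layout, the function name `summarize_driver`, the units and the sample values are all assumptions made for illustration; the embodiment itself used MATLAB, and the actual BsmP1 field names are not reproduced here.

```python
import numpy as np

SAMPLE_RATE_HZ = 10.0  # BsmP1 is sampled at 10 Hz according to the text


def summarize_driver(speed, accel):
    """Turn one driver's time series into summary indexes.

    speed : speed samples in m/s (hypothetical unit)
    accel : longitudinal acceleration samples in m/s^2 (hypothetical unit)
    """
    speed = np.asarray(speed, dtype=float)
    accel = np.asarray(accel, dtype=float)
    dt = 1.0 / SAMPLE_RATE_HZ
    # Jerk is the rate of change of acceleration between consecutive samples.
    jerk = np.diff(accel) / dt
    return {
        "speed_mean": speed.mean(),
        "speed_std": speed.std(ddof=1),
        "speed_max": speed.max(),
        "accel_mean": accel.mean(),
        "accel_min": accel.min(),
        "accel_max": accel.max(),
        "jerk_mean": jerk.mean(),
        # Distance approximated by integrating speed over time.
        "distance": speed.sum() * dt,
    }


# Hypothetical records: (driver_id, speed, longitudinal acceleration)
records = [
    (1, 10.0, 0.0), (1, 10.5, 0.5), (1, 11.5, 1.0), (1, 11.0, -0.5),
    (2, 20.0, 0.0), (2, 18.0, -2.0), (2, 17.0, -1.0), (2, 17.5, 0.5),
]

# Group the point records by driver, then summarize each group.
by_driver = {}
for driver_id, v, a in records:
    by_driver.setdefault(driver_id, ([], []))
    by_driver[driver_id][0].append(v)
    by_driver[driver_id][1].append(a)

features = {d: summarize_driver(v, a) for d, (v, a) in by_driver.items()}
```

Each entry of `features` corresponds to one row of a driver-by-index matrix; the real index set has 18 columns, of which only a few are sketched here.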

TABLE 1 quantized index set of driving styles

Principal Component Analysis (PCA) is a method commonly used in statistics for reducing the dimension of an index set. Considering the correlation among the 18 driving style indexes and the workload required by the subsequent driving style clustering, this embodiment uses principal component analysis to reduce the dimension of the driving style index set. The main idea is to map the m-dimensional features onto p dimensions (p < m) by an orthogonal transformation; the p dimensions are mutually uncorrelated principal components that retain the information of the original m dimensions. For the driving style evaluation indexes selected in this embodiment, the 242 × 18 data set is imported into Python, the raw data are first normalized, the correlation coefficient matrix, the eigenvalues and the eigenvectors are calculated, and the information contribution rate and the cumulative contribution rate of each principal component are then obtained according to the following formulas.

Calculate the information contribution rate and the cumulative contribution rate of the eigenvalues $\lambda_j$ ($j = 1, 2, \ldots, m$):

$$b_j = \frac{\lambda_j}{\sum_{k=1}^{m} \lambda_k}, \qquad j = 1, 2, \ldots, m$$

where $b_j$ is the information contribution rate of the principal component $y_j$.

$$\alpha_p = \frac{\sum_{k=1}^{p} \lambda_k}{\sum_{k=1}^{m} \lambda_k}$$

where $\alpha_p$ is the cumulative contribution rate of the principal components $y_1, y_2, \ldots, y_p$. When $\alpha_p$ is close to 1, the first p index variables $y_1, y_2, \ldots, y_p$ are selected as the p principal components in place of the original m index variables, so that the p principal components can be analyzed comprehensively.

Fig. 2 shows the contribution rates of the 18 principal components. The abscissa shows the 18 principal component variables. The ordinate of the histogram shows the information contribution rate of each principal component; the larger the value, the more data information the component contains. The ordinate of the line graph shows the cumulative contribution rate. As shown in Fig. 2 and Table 2, the cumulative contribution rate of the first 6 principal components reaches 85%, so these six principal components can be used in place of the original 18 evaluation indexes, and the principal component scores of the 242 drivers can be calculated.
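The dimension reduction described above can be sketched with NumPy as follows. This is only a minimal illustration: the data are synthetic stand-ins for the 242 × 18 index set (not the SPMD data), the function name `pca_contributions` is hypothetical, and the 85% threshold is taken from the text.

```python
import numpy as np


def pca_contributions(data):
    """Return standardized data, eigenvalues, eigenvectors, b_j and alpha_p."""
    # Standardize each index (column) to zero mean and unit variance.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Correlation coefficient matrix of the indexes.
    corr = np.corrcoef(z, rowvar=False)
    # eigh returns ascending eigenvalues for symmetric matrices; sort descending.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contribution = eigvals / eigvals.sum()   # information contribution rate b_j
    cumulative = np.cumsum(contribution)     # cumulative contribution rate alpha_p
    return z, eigvals, eigvecs, contribution, cumulative


rng = np.random.default_rng(0)
# Synthetic stand-in for the 242 x 18 driving style index set.
latent = rng.normal(size=(242, 4))
mixing = rng.normal(size=(4, 18))
data = latent @ mixing + 0.3 * rng.normal(size=(242, 18))

z, eigvals, eigvecs, b, alpha = pca_contributions(data)
# Smallest p whose cumulative contribution rate reaches 85 %.
p = int(np.searchsorted(alpha, 0.85) + 1)
# Principal component scores of every sample, shape (242, p).
scores = z @ eigvecs[:, :p]
```

On the real index set the text reports p = 6; the value of p obtained here depends entirely on the synthetic data.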

TABLE 2 information contribution ratio and cumulative contribution ratio of each principal component

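Calculate the comprehensive score:

$$Z = \sum_{j=1}^{p} b_j y_j$$

where $b_j$ is the information contribution rate of the j-th principal component and Z is the comprehensive principal component score of each driving sample; evaluation can be performed based on the Z value.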

TABLE 3 driver principal component score

Table 3 shows the scores of the first 6 principal components of the 242 drivers, which are used as the input of the subsequent K-means and K-means++ clustering models. In addition, to evaluate the importance of each index in the first six principal components, this embodiment also calculates the factor load of each index, that is, the correlation coefficient between the principal component $Y_j$ and the index $Q_i$; the magnitude of its absolute value reflects how closely the index $Q_i$ is related to the principal component $Y_j$. As can be seen from Tables 2 and 4, the first principal component $Y_1$ contains the most information, 24.2%, and is most closely related to the two indexes $Q_{10}$ (minimum lateral acceleration) and $Q_8$ (average lateral acceleration), with absolute correlation coefficients of 0.710 and 0.705 respectively; the first principal component therefore mainly reflects the lateral acceleration information of the driver. The second principal component $Y_2$ contains 21.9% of the information and is most closely related to $Q_4$ (speed standard deviation) and $Q_9$ (maximum lateral acceleration), with absolute correlation coefficients of 0.676 and 0.646 respectively; it comprehensively reflects the motion state information of the driver. The third principal component $Y_3$ contains 15% of the information and is most closely related to $Q_{16}$ (average longitudinal jerk) and $Q_{17}$ (average vertical jerk), with absolute correlation coefficients of 0.841 and 0.832 respectively, so it can be regarded as representing the change of acceleration. In general, although each principal component includes information from every index, each one places particular emphasis on one or a few indexes.

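Calculate the factor load:

$$\rho(Y_j, Q_i) = \frac{\sqrt{\lambda_j}\, c_{ij}}{\sqrt{\operatorname{var}(Q_i)}}$$

where $\operatorname{var}(Q_i)$ is the variance of index $Q_i$, $\lambda_j$ is the eigenvalue of principal component $Y_j$, and $c_{ij}$ is the coefficient of index $Q_i$ in principal component $Y_j$.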

TABLE 4 factor Loading of the indices
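The factor load is described in the text as the correlation coefficient between a principal component and an index. The sketch below, on synthetic data, computes it in two ways: from the standard closed form (square root of the eigenvalue times the eigenvector coefficient, divided by the standard deviation of the index) and directly as a sample correlation. The data and variable names are assumptions for illustration, and the closed form is the textbook identity rather than a formula confirmed by the source.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-in data: 200 samples, 5 correlated indexes.
raw = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5))
raw = raw + 0.5 * rng.normal(size=(200, 5))
# Standardize so that every index has unit variance.
z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = z @ eigvecs  # column j holds the principal component Y_j

# Closed form: rows are indexes Q_i, columns are components Y_j.
index_std = z.std(axis=0, ddof=1)[:, None]
loadings = eigvecs * np.sqrt(eigvals) / index_std

# Direct measurement: correlation between each Q_i and each Y_j.
m = z.shape[1]
measured = np.corrcoef(z, scores, rowvar=False)[:m, m:]

# For each component, the index with the largest absolute loading.
dominant_index = np.abs(loadings).argmax(axis=0)
```

The two matrices agree up to floating-point error, which is why the largest absolute loadings can be read as "the indexes most closely related to" each principal component.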

Step S102: The driving behavior recognition early warning module recognizes the driving behavior based on the driving behavior characteristic indexes and the behavior classification model, and outputs risk early warning information when the driving behavior is aggressive. The driving behaviors comprise three driving styles: aggressive, normal and calm.

Specifically, the driving behavior characteristic indexes of the drivers are used as input variables of the K-means and K-means++ clustering algorithms, and the drivers are clustered into three driving styles (aggressive, normal and calm) according to the relationships between samples. The quality of the two clustering methods is compared using the silhouette coefficient method, and K-means++, which gives the better clustering effect, is selected to produce the final driving behavior classification result.

Preferably, the behavior classification model is a K-means++ clustering model.

The K-means algorithm is an unsupervised, partition-based dynamic clustering algorithm. It uses the Euclidean distance as the measure of sample similarity: the smaller the distance, the higher the similarity. Objects with higher similarity are therefore assigned to the same cluster, while objects in different clusters have lower similarity. The K-means algorithm first randomly selects k samples from the sample data set as cluster centers, calculates the distance between each sample and each cluster center, assigns each sample to the cluster whose center is most similar to it, and iteratively updates the positions of the k cluster centers; it converges when the sum of squared errors is minimal.

The specific clustering process of the K-means algorithm is as follows:

Step 1: Determine the number of clusters k, and randomly select k samples from the sample data set as the initial cluster centers $C_i$, where $i = 1, 2, \ldots, k$.

Step 2: For each remaining sample $X_i$ in the data set, calculate its distance $D(X_i, C_i)$ to each of the k cluster centers $C_i$, and assign it to the cluster whose center is nearest:

$$D(X_i, C_i) = \sqrt{\sum_{j=1}^{p} (X_{ij} - C_{ij})^2}$$

where p is the dimension of the sample data, and $X_{ij}$ and $C_{ij}$ are the j-th dimensions of $X_i$ and $C_i$.

Step 3: Recalculate the cluster center of each cluster.

Step 4: Repeat Step 2 and Step 3 until the within-cluster sum of squared errors (SSE) reaches its minimum and the cluster centers no longer change, at which point the algorithm has converged.

Step 5: Output the K-means clustering result.
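The five steps above can be sketched in NumPy as follows. This is a minimal illustration on synthetic two-dimensional points, not the embodiment's implementation; the function names `assign` and `kmeans` and the iteration cap `max_iter` are assumptions.

```python
import numpy as np


def assign(samples, centers):
    """Step 2: index of the nearest center (Euclidean distance) per sample."""
    dists = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(samples, k, rng, max_iter=100):
    """Plain K-means with randomly chosen initial centers."""
    # Step 1: pick k distinct samples as the initial cluster centers.
    centers = samples[rng.choice(len(samples), size=k, replace=False)]
    for _ in range(max_iter):
        labels = assign(samples, centers)
        # Step 3: recompute each center as the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            samples[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        # Step 4: stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Step 5: final assignment and within-cluster sum of squared errors.
    labels = assign(samples, centers)
    sse = float(((samples - centers[labels]) ** 2).sum())
    return labels, centers, sse


rng = np.random.default_rng(0)
# Three synthetic blobs standing in for calm, normal and aggressive samples.
blobs = np.vstack([
    rng.normal(loc=center, scale=0.2, size=(30, 2))
    for center in ([0.0, 0.0], [5.0, 5.0], [0.0, 5.0])
])
labels, centers, sse = kmeans(blobs, k=3, rng=rng)
```

Because the initial centers are random, different seeds can end in different local optima, which is the limitation that motivates K-means++.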

The K-means algorithm is prone to converging on a locally optimal solution. Therefore, related work [28] proposes the K-means++ algorithm to improve the selection of the initial cluster centers. A sample in the sample data set is randomly selected as the first cluster center, and the distance between each sample and the currently known cluster centers is calculated; sample points that are farther away are more likely to be selected as the next cluster center. These steps are repeated until all k initial cluster centers are determined, and the clustering operation is then performed with these k initial cluster centers.

The clustering process of the K-means++ algorithm is as follows:

Step 1: Randomly select a sample from the sample data set as the first cluster center $C_1$.

Step 2: For each point $X_i$ in the data set, calculate its distance $D(x)$ to the nearest known cluster center. The probability of the point being selected as the new cluster center is

$$P(x) = \frac{D(x)^2}{\sum_{x \in X} D(x)^2}$$

and a new cluster center is selected by the roulette wheel method.

Step 3: Repeat Step 2 until all k initial cluster centers are determined.

Step 4: For each remaining sample $X_j$ in the data set, calculate its distances to the k cluster centers and assign it to the cluster whose center is nearest.

Step 5: Recalculate the cluster center of each cluster.

Step 6: Repeat Step 4 and Step 5 until the within-cluster sum of squared errors reaches its minimum and the cluster centers no longer change, at which point the algorithm has converged.

Step 7: Output the K-means++ clustering result.
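The seeding stage that distinguishes K-means++ from K-means (Steps 1 to 3) can be sketched as below; the remaining steps are the usual assignment and update iterations. The function name `kmeans_pp_init` and the synthetic points are assumptions for illustration.

```python
import numpy as np


def kmeans_pp_init(samples, k, rng):
    """Choose k initial centers by D(x)^2-weighted (roulette wheel) sampling."""
    n = len(samples)
    # Step 1: the first center is drawn uniformly at random.
    chosen = [int(rng.integers(n))]
    # Squared distance of every sample to its nearest chosen center so far.
    d2 = ((samples - samples[chosen[0]]) ** 2).sum(axis=1)
    while len(chosen) < k:
        # Step 2: selection probability proportional to D(x)^2, so distant
        # samples are more likely (but not certain) to be picked.
        probs = d2 / d2.sum()
        nxt = int(rng.choice(n, p=probs))
        chosen.append(nxt)
        # Update the nearest-center distance with the newly added center.
        d2 = np.minimum(d2, ((samples - samples[nxt]) ** 2).sum(axis=1))
    # Step 3 is the loop condition: stop once k centers are chosen.
    return samples[chosen], chosen


rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=center, scale=0.2, size=(30, 2))
    for center in ([0.0, 0.0], [5.0, 5.0], [0.0, 5.0])
])
init_centers, init_indices = kmeans_pp_init(points, k=3, rng=rng)
```

An already chosen sample has D(x) = 0 and therefore zero selection probability, so the same point is never picked twice (provided the data contain no exact duplicates of it).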

To evaluate the clustering effect of the two methods objectively, this embodiment compares them using the silhouette coefficient method (Silhouette Coefficient). For the driving style samples clustered by unsupervised learning, the silhouette coefficient evaluates the quality of the clustering result from two angles: cohesion a(i) and separation b(i). The value of the silhouette coefficient lies in [-1, 1]; the closer it is to 1, the better the cohesion and separation, and the better the clustering effect. For the i-th driving sample, its silhouette value $S_i$ is calculated as follows:

$$S_i = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$

where a(i) is the within-cluster dissimilarity, i.e. the average dissimilarity between driving sample i and the other samples in the same cluster, and b(i) is the between-cluster dissimilarity, i.e. the minimum over the other clusters of the average dissimilarity between driving sample i and the samples of that cluster.
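A small worked example of the silhouette value follows, using Euclidean distance as the dissimilarity (an assumption consistent with the K-means description). The function name `silhouette_values` and the four sample points are hypothetical.

```python
import numpy as np


def silhouette_values(samples, labels):
    """Per-sample silhouette S_i = (b(i) - a(i)) / max(a(i), b(i))."""
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all samples.
    dists = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=2)
    clusters = np.unique(labels)
    values = np.zeros(len(samples))
    for i in range(len(samples)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # convention: a singleton cluster gets S_i = 0
        # a(i): mean distance to the other members of the same cluster.
        a = dists[i, same].sum() / (n_same - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(dists[i, labels == c].mean() for c in clusters if c != labels[i])
        values[i] = (b - a) / max(a, b)
    return values


points = np.array([[0.0], [1.0], [10.0], [11.0]])
good = silhouette_values(points, [0, 0, 1, 1])  # natural grouping
bad = silhouette_values(points, [0, 1, 0, 1])   # deliberately mixed grouping
```

For the first point under the natural grouping, a = 1 and b = 10.5, so S = 9.5 / 10.5 ≈ 0.905; under the mixed grouping, a = 10 and b = 6, so S = -0.4. The mean over all samples is the figure used to compare two clusterings.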

The driving styles are classified into 3 types, i.e., a calm type, a normal type, and an aggressive type, according to the general classification of the driving styles. The clustering evaluation results of K-means and K-means + + are as follows:

TABLE 5 evaluation of the clustering effects of K-means and K-means++

Comparing the clustering results of K-means and K-means++ shows that, for the same number of iterations, the silhouette coefficient of K-means++ is greater than that of K-means, which indicates that K-means++ clusters better. This embodiment therefore selects the K-means++ clustering result as the input of the Recursive Feature Elimination algorithm in the next step, which completes the labeling of the driving style samples.

Building a simple and clear driving style evaluation system and finding the characteristic indexes that best represent driving style is particularly important for driving behavior research and for upgrading and optimizing future driving assistance systems. Previous research gives no unified answer on how to select driving indexes or how many to select. Selecting too many indexes may reflect driving behavior more fully, but it may also reduce classification accuracy while increasing workload. Moreover, analysis of the correlation coefficients between the first six principal components and the indexes shows that, even when a large number of indexes are selected for dimension reduction, each resulting principal component emphasizes particular indexes, and this emphasis may weaken some important factors and affect the subsequent driving behavior analysis. Therefore, this embodiment selects support vector classification (SVC) and random forest (RF) as the underlying iterative models and constructs SVC-RFE and RF-RFE models to screen the driving style indexes and find the feature parameters that best represent driving style.

Recursive feature elimination (RFE) is a backward-search feature screening method with good performance. The first method selected in this embodiment is support vector classification-recursive feature elimination (SVC-RFE). SVC is a binary classification model whose basic form is the maximum-margin linear classifier defined on the feature space: it finds a separating hyperplane in the N-dimensional sample space and classifies the training samples in that space. The second method is random forest-recursive feature elimination (RF-RFE). A random forest is a classifier that trains multiple decision trees on the samples and randomly selects the candidate features for splitting each tree node; it can still be trained efficiently when the training samples have a high feature dimension, and it is a bagging-type ensemble algorithm. SVC-RFE and RF-RFE rank the importance of the indexes through SVC and RF, respectively, and then use RFE to screen out the important indexes.
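
The backward-elimination loop of RFE can be sketched as follows. To keep the example self-contained, a small logistic-regression model trained by gradient descent stands in for the SVC and random forest models of the embodiment; this substitution, the function names, and the toy data are assumptions for illustration only. As with a linear SVC, the absolute value of each weight is used as the importance of the corresponding feature, which presumes that the features are on comparable scales.

```python
from math import exp


def fit_linear_weights(X, y, steps=200, lr=0.1):
    """Fit logistic-regression weights (no bias term) by batch gradient descent.

    Stand-in for the base model; labels y must be 0 or 1.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + exp(-z))
            for j in range(d):
                grad[j] += (p - yi) * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def rfe(X, y, n_keep):
    """Recursive feature elimination: refit on the remaining features and
    drop the one with the smallest absolute weight, until n_keep are left."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = fit_linear_weights(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: abs(w[k]))
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated


# Hypothetical data: feature 0 separates the classes, features 1 and 2 do not.
X = [
    (2.0, 0.1, 0.5), (1.8, -0.2, 0.4), (2.2, 0.3, 0.6), (1.9, -0.1, 0.5),
    (-2.0, 0.2, 0.5), (-1.8, -0.3, 0.6), (-2.2, 0.1, 0.4), (-1.9, -0.2, 0.5),
]
y = [1, 1, 1, 1, 0, 0, 0, 0]
kept, dropped = rfe(X, y, n_keep=1)
```

Because the model is refitted after every elimination, the importance of a feature is always judged in the context of the features that are still present, which is what distinguishes RFE from a one-off ranking.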

In this embodiment, the number of important indexes is determined by three-fold cross-validation; the optimal number of features and the cross-validated correct classification scores of the two algorithms are shown in Fig. 3. When n = 6, the classification accuracy of both methods exceeds 85%. Further calculating the importance of the top 6 features shows that the screening results of both methods include the maximum speed, the standard deviation of speed, the minimum vertical acceleration, and the minimum longitudinal acceleration, an overlap rate of 66.7%. The two methods differ in their top-ranked feature: as shown in Table 6, SVC-RFE ranks the mean longitudinal acceleration highest, with a score of 4.979, whereas RF-RFE ranks the maximum speed highest, with a score of 0.0867. In FIG. 2, whose ordinate is the cross-validated correct classification score of the two methods, the RF-RFE curve lies above the SVC-RFE curve, i.e. its correct classification score is higher. This embodiment therefore selects the RF-RFE result, which has the higher correct classification score, as the final result.
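
Choosing the number of features by three-fold cross-validation can be sketched as below. A nearest-centroid classifier is used here as a simple stand-in for the SVC and random forest models, and the fold layout, function names, and toy data are likewise assumptions for illustration, not details of the embodiment.

```python
from math import dist


def k_fold_indices(n_samples, k=3):
    """Split the sample indices 0..n-1 into k interleaved folds."""
    return [list(range(f, n_samples, k)) for f in range(k)]


def nearest_centroid_accuracy(X, y, train_idx, test_idx, features):
    """Train a nearest-centroid classifier on train_idx, score it on test_idx."""
    groups = {}
    for i in train_idx:
        groups.setdefault(y[i], []).append([X[i][j] for j in features])
    centroids = {
        lab: [sum(col) / len(rows) for col in zip(*rows)]
        for lab, rows in groups.items()
    }
    correct = 0
    for i in test_idx:
        x = [X[i][j] for j in features]
        pred = min(centroids, key=lambda lab: dist(x, centroids[lab]))
        correct += pred == y[i]
    return correct / len(test_idx)


def cv_score(X, y, features, k=3):
    """Mean correct-classification score over k folds."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for f in range(k):
        train_idx = [i for g in range(k) if g != f for i in folds[g]]
        scores.append(nearest_centroid_accuracy(X, y, train_idx, folds[f], features))
    return sum(scores) / k


def best_feature_count(X, y, ranked_features, k=3):
    """Score the top-n ranked features for every n; ties go to the smaller n."""
    scores = {
        n: cv_score(X, y, ranked_features[:n], k)
        for n in range(1, len(ranked_features) + 1)
    }
    return max(scores, key=scores.get), scores


# Hypothetical data: feature 0 is informative, feature 1 is noise.
X = [(1.0, 5.0), (9.0, 5.5), (1.2, 6.0), (9.1, 4.0), (0.9, 4.5), (8.8, 6.5)]
y = [0, 1, 0, 1, 0, 1]
best_n, scores = best_feature_count(X, y, ranked_features=[0, 1])
```

Plotting the resulting scores against n gives a curve of the kind described for the two methods, from which the feature count can be read off.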

TABLE 6 SVC-RFE and RF-RFE top 6 feature and importance scores

This embodiment selects the maximum speed, which RF-RFE ranks highest, and the average speed, which RF-RFE ranks lowest, to further verify how drivers with different driving styles differ on the selected indexes. In Fig. 4, the abscissa is the driving style of the driver and the ordinate is the maximum speed of the driver. As can be seen from Fig. 4, the average maximum speed of the 58 aggressive drivers reaches 29.40 m·s⁻¹, and the difference from calm or general drivers can reach 10 m·s⁻¹. In FIG. 5, the abscissa is the driving style of the driver and the ordinate is the average speed of the driver. As can be seen from FIG. 5, the average speed of the 58 aggressive drivers is about 16.82 m·s⁻¹, and the gap from calm and general drivers is smaller. Combining the observations of Table 6 and Fig. 3, this embodiment uses the maximum speed as the input variable of the subsequent neural network driving style recognition model. Meanwhile, comparing Tables 2 and 4 obtained by principal component analysis with Fig. 3 and 6 shows that the six principal components used as input for dividing the driving samples emphasize acceleration and other indexes while ignoring the maximum speed, which may have a certain influence on the sample division results of the unsupervised learning algorithm.

This embodiment builds a driving style recognition model with a neural network to verify the rationality of the selected index. Neural network recognition is objective, handles big data and large samples well, and is self-learning, so it is widely used for driving style recognition. Of the 242 driving samples, 170 are randomly selected for neural network training, and the remaining 72 are used to verify the recognition accuracy of the model. The maximum speed of the 242 samples is used as the input of the neural network, and the classification result Y of the 242 samples (calm is encoded as [1, 0, 0], general as [0, 1, 0], and aggressive as [0, 0, 1]) is used as the output of the neural network model. The number of hidden layers is set to 10, and the training function is the trainscg function, as shown in fig. 5. The weight of the model is W, the bias is b, the hidden-layer transfer function is a sigmoid function, and the output-layer transfer function is a softmax function. Fig. 6 shows the classification results of the 72 test samples over the 3 driving styles: 25 calm samples, 30 general samples, and 7 aggressive samples. As shown in fig. 7, the test result shows that the model recognition accuracy reaches 86.1% when only the maximum speed is used as the input of the driving style recognition model. Therefore, this embodiment suggests prioritizing the driver's maximum speed index in driving style recognition work.
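
The structure of such a model (one input, a sigmoid hidden layer, a softmax output over three one-hot encoded styles) can be sketched as a forward pass. Several points are assumptions: the value 10 is read here as the number of hidden units, the weights are random and untrained, and the input is not scaled. The last lines check the arithmetic under the further assumption that the 25, 30, and 7 samples reported for Fig. 6 are the correctly recognized ones, since 62 of 72 matches the stated 86.1%.

```python
import random
from math import exp

STYLES = ["calm", "general", "aggressive"]


def one_hot(style):
    """Encode a style as [1, 0, 0], [0, 1, 0] or [0, 0, 1]."""
    return [1 if s == style else 0 for s in STYLES]


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def softmax(zs):
    m = max(zs)  # subtract the maximum for numerical stability
    es = [exp(z - m) for z in zs]
    total = sum(es)
    return [e / total for e in es]


def forward(v_max, W1, b1, W2, b2):
    """Forward pass: maximum speed -> sigmoid hidden layer -> softmax output."""
    hidden = [sigmoid(w * v_max + b) for w, b in zip(W1, b1)]
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(W2, b2)
    ]
    return softmax(logits)


# Hypothetical, untrained parameters; a trained model would learn W and b.
rng = random.Random(0)
H = 10  # assumption: "10" interpreted as the number of hidden units
W1 = [rng.uniform(-1, 1) for _ in range(H)]
b1 = [rng.uniform(-1, 1) for _ in range(H)]
W2 = [[rng.uniform(-1, 1) for _ in range(H)] for _ in STYLES]
b2 = [rng.uniform(-1, 1) for _ in STYLES]

probs = forward(29.40, W1, b1, W2, b2)  # 29.40 m/s: maximum speed from the text
predicted = STYLES[probs.index(max(probs))]

# Accuracy check, assuming 25 + 30 + 7 are the correctly recognized samples.
accuracy = (25 + 30 + 7) / 72
```

With untrained weights the predicted style is arbitrary; the sketch only shows how a single speed value is mapped to three class probabilities that sum to 1.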

Example two

The embodiment provides a driver driving risk identification system, which specifically includes:

the behavior characteristic index extraction module is used for acquiring driving data and extracting driving behavior characteristic indexes from the driving data;

It should be noted that the modules of the driver driving risk assessment system in this embodiment correspond one-to-one to the steps of the driving risk assessment method in Example one, and the specific implementation is the same, so it is not repeated here.

Example three

The present embodiment provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps in the driver driving risk assessment method as described above.

Example four

The present embodiment provides a computer device, which includes a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of the driving risk assessment method for a driver as described above.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
