Sports score prediction method based on improved fuzzy C-means clustering method

Document No.: 35631. Publication date: 2021-09-24.

Summary: This technology, "A sports score prediction method based on an improved fuzzy C-means clustering method", was created by 田磊, 周近, 蓝晓萍 and 张思维 on 2021-06-02. Its main content is as follows. Step 1, obtaining training samples: collect each student's results for every test item together with the manually assessed score, and determine the score for each item according to the relevant standard, where the test items are: 50 m run, sit-and-reach, standing long jump, pull-ups (male), sit-ups (female), 1000 m run (male), and 800 m run (female). Step 2, mapping and encoding: map and encode the collected results for each item, then concatenate them to form a feature vector. Step 3, offline model training: train the improved fuzzy C-means clustering method with the feature vectors formed from the training samples, and determine the cluster center of each score band. Step 4, online scoring: use the trained model to predict student scores online and output the scores in real time. The invention can accurately predict students' sports scores and has good practical application value.

1. A sports score prediction method based on an improved fuzzy C-means clustering method, comprising the following specific steps:

Step 1, obtaining training samples: collecting each student's results for every test item together with the manually assessed score, and determining the score for each item according to the relevant standard;

Step 2, mapping and encoding: mapping and encoding the collected results for each item, and then concatenating them to form a feature vector;

the encoding rule applied to the results in step 2 can be expressed as follows:

where $x$ denotes the result to be encoded, $x_c$ denotes the encoded result, and $x_{\min}$ and $x_{\max}$ are respectively the minimum and maximum values of the same item over all sample data;

Step 3, offline model training: training the improved fuzzy C-means clustering method with the feature vectors formed from the training samples, and determining the cluster center of each score band;

the specific steps of training the improved fuzzy C-means clustering method in step 3 are as follows:

Step 3.1, defining a cost function $J$: the sample set is $X = \{x_1, x_2, x_3, \ldots, x_N\}$, where $x_1 = \{u_{11}, u_{12}, \ldots, u_{1n}\}$, and the number of fuzzy groups $c$ is set to 10; $J$ is used to determine the cluster center $v_i$ ($i = 1, 2, \ldots, c$) of each group and is defined as the sum of squared distances between each group of data and the cluster center $v_i$; the specific expression is as follows:

where $U = \{u_{ij}\}$ denotes the membership matrix, $V = \{v_i\}$ denotes the cluster center matrix, $m$ is the fuzzy coefficient, set to 2 in this method, and $d_{ij}$ denotes the Euclidean distance between sample $x_i$ and cluster center $c_j$, which this patent redefines as an improved Euclidean distance with the following expression:

where $u_{ik}$ denotes the $k$-th feature of $u_i$, $v_{ik}$ denotes the $k$-th feature of $v_i$, and $s_k$ denotes the standard deviation of the $k$-th component.

Step 3.2, setting the iteration termination threshold $\varepsilon$ and initializing the membership matrix $U$;

Step 3.3, constructing a Lagrange function and updating each cluster center, the update rule being as follows:

Step 3.4, updating the membership matrix, the update rule being as follows:

Step 3.5, judging whether the error between successive membership matrices satisfies the iteration termination threshold $\varepsilon$; if so, stopping the iteration; otherwise, returning to step 3.4 to continue the iteration;

Step 4, online scoring: using the trained model to predict student scores online and outputting the scores in real time.

2. The sports score prediction method based on an improved fuzzy C-means clustering method according to claim 1, wherein the specific steps of predicting student scores online in step 4 are as follows:

Step 4.1, determining the score band corresponding to each cluster center, wherein the score bands corresponding to the 10 cluster centers are 0-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90 and 91-100, and the base values $b_j$ ($j = 1, 2, \ldots, c$) of the cluster centers are 0, 10, 20, 30, 40, 50, 60, 70, 80 and 90, respectively;

Step 4.2, calculating the maximum improved Euclidean distance of each cluster center, $j = 1, 2, \ldots, c$;

Step 4.3, determining the category of the online input data and the corresponding improved Euclidean distance $d_j$;

Step 4.4, determining the final predicted score $s$, calculated by the following formula:

Technical Field

The invention relates to the field of predicting students' sports scores, and in particular to a sports score prediction method based on an improved fuzzy C-means clustering method.

Background

China's economy is currently developing rapidly. College students pay more attention to material life and neglect physical exercise, which has led to a rapid decline in their physical fitness, and physical education courses are the main way to improve it. Predicting the sports scores of college students can help sports administration departments set up reasonable physical education courses; it can also help physical education teachers improve their teaching methods and formulate scientific and reasonable training schemes, thereby improving the physical fitness of college students.

Many researchers have studied the sports score prediction problem, and current methods fall into two main categories: linear prediction methods and nonlinear prediction methods. Linear modeling methods mainly rely on multiple linear regression, which can describe only the linear variation of sports scores and therefore yields large prediction errors. Nonlinear modeling methods mainly rely on artificial neural networks; although these have strong nonlinear modeling capability, they are prone to overfitting, which makes the predicted scores unreliable.

Among domestic patents related to score prediction, "A student achievement prediction system and student achievement prediction method based on deep learning" (202010961528.0) comprises a data management module and a model operation module. The data management module comprises a user information module and an achievement information module, and the user information module implements user registration, user login, and user information modification. A deep learning algorithm is used to process complex nonlinear data, which improves the accuracy of the prediction method, but the generalization of the deep learning model in that patent may be insufficient. The national invention patent "Hybrid method of assessing and predicting athletic performance" (201880087060.8) describes a system comprising a receiver that collects uncertainty data regarding one or more aspects of athletic performance; a deterministic model of athletic performance; a hybrid processor that creates a conditional probability model from these elements; and a display that shows the evaluated or predicted results. However, the prediction in that patent leaves a large number of factors out of consideration, so its accuracy is not guaranteed.

Disclosure of Invention

To solve the above problems, the invention provides a sports score prediction method based on an improved fuzzy C-means clustering method, built on the fuzzy C-means clustering algorithm. A mapping and encoding rule is proposed to highlight the test results of each item as much as possible; the Euclidean distance used in fuzzy clustering is redefined so that the distance between a sample point and a cluster center can be quantified more accurately and effectively; and finally, based on a study of the different cluster centers and the improved Euclidean distance, a new formula for score prediction is proposed. To this end, the invention provides a sports score prediction method based on an improved fuzzy C-means clustering method, comprising the following specific steps:

Step 1, obtaining training samples: collecting each student's results for every test item together with the manually assessed score, and determining the score for each item according to the relevant standard;

Step 2, mapping and encoding: mapping and encoding the collected results for each item, and then concatenating them to form a feature vector;

the encoding rule applied to the results in step 2 can be expressed as follows:

where $x$ denotes the result to be encoded, $x_c$ denotes the encoded result, and $x_{\min}$ and $x_{\max}$ are respectively the minimum and maximum values of the same item over all sample data;

Step 3, offline model training: training the improved fuzzy C-means clustering method with the feature vectors formed from the training samples, and determining the cluster center of each score band;

the specific steps of training the improved fuzzy C-means clustering method in step 3 are as follows:

Step 3.1, defining a cost function $J$: the sample set is $X = \{x_1, x_2, x_3, \ldots, x_N\}$, where $x_1 = \{u_{11}, u_{12}, \ldots, u_{1n}\}$, and the number of fuzzy groups $c$ is set to 10; $J$ is used to determine the cluster center $v_i$ ($i = 1, 2, \ldots, c$) of each group and is defined as the sum of squared distances between each group of data and the cluster center $v_i$; the specific expression is as follows:

where $U = \{u_{ij}\}$ denotes the membership matrix, $V = \{v_i\}$ denotes the cluster center matrix, $m$ is the fuzzy coefficient, set to 2 in this method, and $d_{ij}$ denotes the Euclidean distance between sample $x_i$ and cluster center $c_j$, which this patent redefines as an improved Euclidean distance with the following expression:

where $u_{ik}$ denotes the $k$-th feature of $u_i$, $v_{ik}$ denotes the $k$-th feature of $v_i$, and $s_k$ denotes the standard deviation of the $k$-th component.

Step 3.2, setting the iteration termination threshold $\varepsilon$ and initializing the membership matrix $U$;

Step 3.3, constructing a Lagrange function and updating each cluster center, the update rule being as follows:

Step 3.4, updating the membership matrix, the update rule being as follows:

Step 3.5, judging whether the error between successive membership matrices satisfies the iteration termination threshold $\varepsilon$; if so, stopping the iteration; otherwise, returning to step 3.4 to continue the iteration;

Step 4, online scoring: using the trained model to predict student scores online and outputting the scores in real time.

As a further improvement of the invention, the specific steps of predicting student scores online in step 4 are as follows:

Step 4.1, determining the score band corresponding to each cluster center, wherein the score bands corresponding to the 10 cluster centers are 0-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90 and 91-100, and the base values $b_j$ ($j = 1, 2, \ldots, c$) of the cluster centers are 0, 10, 20, 30, 40, 50, 60, 70, 80 and 90, respectively;

Step 4.2, calculating the maximum improved Euclidean distance of each cluster center;

Step 4.3, determining the category of the online input data and the corresponding improved Euclidean distance $d_j$;

Step 4.4, determining the final predicted score $s$, calculated by the following formula:

The sports score prediction method based on an improved fuzzy C-means clustering method according to the invention has the following beneficial technical effects:

1. The invention proposes a mapping and encoding rule that better highlights the test results of each item;

2. The invention redefines the Euclidean distance used in fuzzy clustering and proposes an improved fuzzy C-means clustering algorithm, which determines the cluster centers corresponding to the different score bands more accurately and effectively;

3. Based on a study of the different cluster centers and the improved Euclidean distance, the invention proposes a new formula for score prediction, achieving accurate prediction of sports scores.

Drawings

FIG. 1 is a flow chart of the present invention;

FIG. 2 is a flow chart of the improved fuzzy C-means clustering algorithm according to the present invention.

Detailed Description

The invention is described in further detail below with reference to a specific embodiment and the accompanying drawings.

The invention provides a sports score prediction method based on an improved fuzzy C-means clustering method, with the aim of predicting students' sports scores accurately and effectively. FIG. 1 is a flow chart of the invention, and the steps of the invention are described in detail with reference to it.

Step 1, obtaining training samples: collecting each student's results for every test item together with the manually assessed score, and determining the score for each item according to the National Student Physical Health Standard, where the test items are: 50 m run, sit-and-reach, standing long jump, pull-ups (male), sit-ups (female), 1000 m run (male), and 800 m run (female);

Step 2, mapping and encoding: mapping and encoding the collected results for each item, and then concatenating them to form a feature vector;

the encoding rule applied to the results in step 2 can be expressed as follows:

where $x$ denotes the result to be encoded, $x_c$ denotes the encoded result, and $x_{\min}$ and $x_{\max}$ are respectively the minimum and maximum values of the same item over all sample data.
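The encoding formula itself is not reproduced in this text. As a minimal sketch only, the snippet below assumes a min-max mapping of each item onto [0, 1], which is one common rule that uses exactly the quantities $x$, $x_{\min}$ and $x_{\max}$ defined above; the actual rule of the invention may differ. The function name `encode_features` and the sample numbers are hypothetical.

```python
import numpy as np

def encode_features(raw, x_min=None, x_max=None):
    """Min-max encode each column (one test item per column).

    ASSUMPTION: x_c = (x - x_min) / (x_max - x_min); the source formula
    is not available, so this is an illustrative stand-in.
    Returns the encoded feature vectors plus the x_min / x_max used.
    """
    raw = np.asarray(raw, dtype=float)
    x_min = raw.min(axis=0) if x_min is None else np.asarray(x_min, dtype=float)
    x_max = raw.max(axis=0) if x_max is None else np.asarray(x_max, dtype=float)
    # A constant column has zero range; divide by 1 there to avoid 0/0.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (raw - x_min) / span, x_min, x_max

# Hypothetical item scores for three students on four test items.
raw_scores = [[60, 70, 80, 90],
              [80, 70, 60, 100],
              [100, 70, 70, 95]]
encoded, x_min, x_max = encode_features(raw_scores)
# Each row of `encoded` is the concatenated feature vector of one student.
```

Passing the training-set `x_min` and `x_max` back into `encode_features` keeps online samples on the same scale as the training samples.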

Step 3, offline model training: training the improved fuzzy C-means clustering method with the feature vectors formed from the training samples, and determining the cluster center of each score band;

the specific steps of training the improved fuzzy C-means clustering method in step 3 are as follows:

Step 3.1, defining a cost function $J$: the sample set is $X = \{x_1, x_2, x_3, \ldots, x_N\}$, where $x_1 = \{u_{11}, u_{12}, \ldots, u_{1n}\}$, and the number of fuzzy groups $c$ is set to 10; $J$ is used to determine the cluster center $v_i$ ($i = 1, 2, \ldots, c$) of each group and is defined as the sum of squared distances between each group of data and the cluster center $v_i$; the specific expression is as follows:

where $U = \{u_{ij}\}$ denotes the membership matrix, $V = \{v_i\}$ denotes the cluster center matrix, $m$ is the fuzzy coefficient, set to 2 in this method, and $d_{ij}$ denotes the Euclidean distance between sample $x_i$ and cluster center $c_j$, which this patent redefines as an improved Euclidean distance with the following expression:

where $u_{ik}$ denotes the $k$-th feature of $u_i$, $v_{ik}$ denotes the $k$-th feature of $v_i$, and $s_k$ denotes the standard deviation of the $k$-th component.
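The two expressions referred to in step 3.1 are not reproduced in this text. Assuming that the method keeps the standard fuzzy C-means objective and that the improved distance is a standardized Euclidean distance (each component difference divided by the standard deviation $s_k$), which is what the surrounding definitions suggest, they could be written as below. Here $i$ indexes samples and $j$ indexes cluster centers; following the source's notation, $u_{ij}$ is a membership degree while $u_{ik}$ is the $k$-th feature of sample $x_i$. This is a reading of the definitions, not the patent's verified formula.

```latex
J(U, V) = \sum_{i=1}^{N} \sum_{j=1}^{c} u_{ij}^{m} \, d_{ij}^{2},
\qquad
d_{ij} = \sqrt{\sum_{k=1}^{n} \left( \frac{u_{ik} - v_{jk}}{s_k} \right)^{2}}
```

Under this reading, dividing by $s_k$ keeps an item with a wide spread of values from dominating the distance.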

Step 3.2, setting the iteration termination threshold $\varepsilon$ and initializing the membership matrix $U$;

Step 3.3, constructing a Lagrange function and updating each cluster center, the update rule being as follows:

Step 3.4, updating the membership matrix, the update rule being as follows:

Step 3.5, judging whether the error between successive membership matrices satisfies the iteration termination threshold $\varepsilon$; if so, stopping the iteration; otherwise, returning to step 3.4 to continue the iteration.
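Steps 3.2 to 3.5 can be sketched as the loop below. The update rules are not reproduced in this text, so the sketch assumes the standard fuzzy C-means updates (membership-weighted mean for the centers, inverse-distance rule for the memberships) combined with a standardized Euclidean distance, and it repeats both updates on every pass. The names `improved_distance` and `train_fcm`, the random initialization, and the demo data are all hypothetical.

```python
import numpy as np

def improved_distance(X, V, s):
    """Standardized Euclidean distance between every sample and every center
    (ASSUMED form of the 'improved' distance). Returns shape (N, c)."""
    diff = (X[:, None, :] - V[None, :, :]) / s   # shape (N, c, n)
    return np.sqrt((diff ** 2).sum(axis=2))

def train_fcm(X, c=10, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Return (U, V, s): memberships (N, c), centers (c, n), per-feature std."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    s = X.std(axis=0)
    s[s == 0] = 1.0                        # guard against constant features
    U = rng.random((X.shape[0], c))        # step 3.2: random initial memberships
    U /= U.sum(axis=1, keepdims=True)      # memberships of a sample sum to 1
    for _ in range(max_iter):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]           # step 3.3: centers
        d = np.fmax(improved_distance(X, V, s), 1e-12)   # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)     # step 3.4: memberships
        converged = np.abs(U_new - U).max() < eps        # step 3.5: stop test
        U = U_new
        if converged:
            break
    return U, V, s

# Demo on two well-separated synthetic groups (c=2 only to keep it small).
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(0.2, 0.02, (20, 3)),
                    rng.normal(0.8, 0.02, (20, 3))])
U, V, s = train_fcm(X_demo, c=2)
```

The stop test compares successive membership matrices element-wise, matching the description of step 3.5; the `max_iter` cap is an added safeguard that the source does not mention.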

Step 4, online scoring: using the trained model to predict student scores online and outputting the scores in real time.

The specific steps of predicting student scores online in step 4 are as follows:

Step 4.1, determining the score band corresponding to each cluster center, wherein the score bands corresponding to the 10 cluster centers are 0-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90 and 91-100, and the base values $b_j$ ($j = 1, 2, \ldots, c$) of the cluster centers are 0, 10, 20, 30, 40, 50, 60, 70, 80 and 90, respectively;

Step 4.2, calculating the maximum improved Euclidean distance of each cluster center;

Step 4.3, determining the category of the online input data and the corresponding improved Euclidean distance $d_j$;

Step 4.4, determining the final predicted score $s$, calculated by the following formula:
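The formula of step 4.4 is not reproduced in this text, so the sketch below covers only steps 4.1 to 4.3 and stops at the three quantities the final score presumably depends on: the base value $b_j$, the distance $d_j$, and the maximum distance of the matched center. It rests on several assumptions: the centers are already ordered so that index $j$ matches the $j$-th score band, the maximum distance of a center is taken over the training samples nearest to it, and the distance is the standardized Euclidean form. All function names and numbers are hypothetical.

```python
import numpy as np

BASE_VALUES = np.arange(0, 100, 10)   # step 4.1: b_j = 0, 10, ..., 90

def improved_distance(X, V, s):
    """Standardized Euclidean distance (ASSUMED form), shape (N, c)."""
    diff = (X[:, None, :] - V[None, :, :]) / s
    return np.sqrt((diff ** 2).sum(axis=2))

def max_distance_per_center(X_train, V, s):
    """Step 4.2 (assumed reading): largest distance among the training
    samples whose nearest center is j; 0.0 for a center with no samples."""
    d = improved_distance(X_train, V, s)
    nearest = d.argmin(axis=1)
    return np.array([d[nearest == j, j].max() if np.any(nearest == j) else 0.0
                     for j in range(V.shape[0])])

def classify(x, V, s):
    """Step 4.3: index j of the nearest center and the distance d_j to it."""
    d = improved_distance(np.asarray(x, dtype=float)[None, :], V, s)[0]
    j = int(d.argmin())
    return j, float(d[j])

# Hypothetical one-feature model: 10 centers spaced evenly on [0.05, 0.95].
V = np.linspace(0.05, 0.95, 10)[:, None]
s = np.array([1.0])
X_train = np.array([[0.04], [0.08], [0.52], [0.95]])

d_max = max_distance_per_center(X_train, V, s)
j, d_j = classify([0.57], V, s)
b_j = int(BASE_VALUES[j])   # base value of the matched score band
```

In this toy case the sample 0.57 falls to the sixth center, whose base value is 50; step 4.4 would then combine `b_j`, `d_j` and `d_max[j]` using the patent's own formula.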

FIG. 2 is a flow chart of the improved fuzzy C-means clustering algorithm of the invention. As the flow chart shows, the improved fuzzy C-means algorithm mainly comprises the following: first, define the cost function $J$ and initialize the membership matrix $U$, the iteration termination threshold, and the total number of categories $c$; next, construct the Lagrange function and update the cluster centers $V$, and at the same time update the cost function $J$ and the membership matrix $U$; then judge whether the current state satisfies the iteration termination condition, terminating if so and continuing to iterate otherwise; finally, output $U$ and $V$ and determine the corresponding cluster centers.

The above description is only a preferred embodiment of the present invention and is not intended to limit the invention in any way; any modifications or equivalent variations made according to the technical spirit of the invention fall within the scope of protection claimed.
