Test question difficulty prediction method and system based on deep semantic representation

Document No. 1922134 | Publication date: 2021-12-03

Reading note: This invention, "Test question difficulty prediction method and system based on deep semantic representation", was designed and created by Zhou Dongdai, Gu Hengnian, Dong Xiaoxiao, Zhong Shaochun and Duan Zhiyi on 2021-09-06. Its main content is as follows.

The invention discloses a test question difficulty prediction method and system based on deep semantic representation. The method comprises the following steps: performing text representation of multi-type test questions with a pre-trained language model; extracting and fusing features of the test question text representations; classifying the fused features with a multilayer perceptron to determine the set of knowledge points to which the questions belong; computing the topological distance of the knowledge points in that set; determining the cognitive level of the questions from the fused features with a deep attention network model; and predicting the difficulty of the questions from the text representation, the knowledge point topological distance, and the cognitive level. On the basis of the determined cognitive level, the invention combines question context features with knowledge-point topological-structure features to build an automatic difficulty evaluation framework based on a hybrid neural network model and cognitive levels, thereby addressing the difficulty of labeling cognitive levels in question texts, the shortage of annotated corpora, and the lack of cognitive guidance in single-criterion difficulty evaluation.

1. A test question difficulty prediction method based on deep semantic representation is characterized by comprising the following steps:

performing text representation on the multi-type test questions based on a pre-trained language model; the multi-type test questions comprise three question types, namely fill-in-the-blank questions, multiple-choice questions and short-answer questions; the three question types comprise four kinds of structural text, namely stem text, answer text, option text and analysis text;

carrying out feature extraction and fusion on the test question text representation;

classifying the fused features based on a multilayer perceptron, and determining a knowledge point set to which the multi-type test questions belong;

calculating the topological distance of each knowledge point in the knowledge point set;

determining the cognitive level of the multi-type test questions according to the fused features based on a deep attention network model;

predicting the difficulty of the multi-type test questions based on the test question text representation, the topological distance of the knowledge points and the cognitive level.

2. The test question difficulty prediction method based on deep semantic representation according to claim 1, wherein the feature extraction and fusion of the test question text representation specifically comprises:

performing feature extraction on the stem text representation and the analysis text representation by adopting a bidirectional long short-term memory (BiLSTM) network model;

performing feature extraction on the answer text representation and the option text representation by adopting a convolutional neural network model;

and performing feature fusion by using a feature fusion model.

3. The test question difficulty prediction method based on deep semantic representation according to claim 2, wherein the feature fusion using a feature fusion model specifically comprises:

for the fill-in-the-blank questions, splicing the extracted stem text features and answer text features, and inputting the spliced features into a BiLSTM layer and an attention mechanism layer for fusion;

for the multiple-choice questions, inputting the text features of each option into an attention mechanism layer, splicing the output with the stem text features of the question, and inputting the result into a BiLSTM layer and an attention mechanism layer for fusion;

for the short-answer questions, splicing the stem text features and the answer text features, and inputting the spliced features into a BiLSTM layer and an attention mechanism layer for fusion; splicing the analysis text features and the answer text features, and inputting them into a BiLSTM layer and an attention mechanism layer for fusion; and inputting the two fused features into a fully connected layer for final fusion.

4. The method for predicting the difficulty of test questions based on deep semantic representation according to claim 1, wherein the predicting the difficulty of the multiple types of test questions based on the text representation of the test questions, the topological distance of the knowledge points and the cognitive level specifically comprises:

in the training stage, the text representation of the test question, the topological distance of the knowledge point and the cognitive level are used as the input of a linear regression model, and the score of the sample test question is obtained from the answer record and is used as a label of the difficulty of the test question;

in the test stage, the score of the current test question is predicted by inputting the text representation of the test question, the topological distance of the knowledge point and the cognitive level, and the difficulty of the test question is determined.

5. The test question difficulty prediction method based on deep semantic representation, wherein the calculation formula of the knowledge point topological distance is as follows:

where d_q denotes the knowledge point topological distance of test question q; k_i and k_j denote the i-th and j-th knowledge points in the knowledge point set K to which test question q belongs, with K = {k_1, k_2, ..., k_N}; and N denotes the number of knowledge points.

6. A test question difficulty prediction system based on deep semantic representation is characterized by comprising:

the text representation module is used for performing text representation on the multi-type test questions based on a pre-trained language model; the multi-type test questions comprise three question types, namely fill-in-the-blank questions, multiple-choice questions and short-answer questions; the three question types comprise four kinds of structural text, namely stem text, answer text, option text and analysis text;

the characteristic extraction and fusion module is used for extracting and fusing the characteristics of the test question text representation;

the knowledge point set determining module is used for classifying the fused features based on the multilayer perceptron and determining the knowledge point sets of the multi-type test questions;

the knowledge point distance calculation module is used for calculating the topological distance of each knowledge point in the knowledge point set;

the cognitive level determining module is used for determining the cognitive levels of the multi-type test questions according to the fused features based on the deep attention network model;

and the difficulty prediction module is used for predicting the difficulty of the multi-type test questions based on the test question text representation, the knowledge point topological distance and the cognitive level.

7. The system for predicting the difficulty of test questions based on the deep semantic representation according to claim 6, wherein the feature extraction and fusion module specifically comprises:

the first feature extraction unit is used for performing feature extraction on the stem text representation and the analysis text representation by adopting a bidirectional long short-term memory (BiLSTM) network model;

the second feature extraction unit is used for performing feature extraction on the answer text representation and the option text representation by adopting a convolutional neural network model;

and the characteristic fusion unit is used for carrying out characteristic fusion by adopting the characteristic fusion model.

8. The test question difficulty prediction system based on deep semantic representation according to claim 6, wherein the calculation formula of the knowledge point topological distance is as follows:

where d_q denotes the knowledge point topological distance of test question q; k_i and k_j denote the i-th and j-th knowledge points in the knowledge point set K to which test question q belongs, with K = {k_1, k_2, ..., k_N}; and N denotes the number of knowledge points.

Technical Field

The invention relates to the technical field of test question characterization, in particular to a test question difficulty prediction method and system based on deep semantic characterization.

Background

In traditional education, the attribute labels of test questions are usually annotated manually by experts, which is time-consuming and labor-intensive, and scientific rigor and consistency are difficult to guarantee. To address this, researchers have manually selected classification features and built models with machine learning techniques to label test question attributes. However, such research still fails to fully exploit the rich semantic information in test question text and the complex contextual relations between the text modules of different question types, so the precision of attribute labeling needs improvement. In addition, the cognitive target of a test question, an important attribute that plays a key role in evaluating learners' thinking, has not received sufficient attention in existing research, and difficulty evaluation under cognitive targets remains under-studied.

Disclosure of Invention

The invention aims to provide a test question difficulty prediction method and system based on deep semantic representation, so as to solve the problems that cognitive levels of test question texts are difficult to label, annotated corpora are insufficient, and the single standard of test question difficulty evaluation lacks cognitive guidance.

In order to achieve the purpose, the invention provides the following scheme:

a test question difficulty prediction method based on deep semantic representation comprises the following steps:

performing text representation on the multi-type test questions based on a pre-trained language model; the multi-type test questions comprise three question types, namely fill-in-the-blank questions, multiple-choice questions and short-answer questions; the three question types comprise four kinds of structural text, namely stem text, answer text, option text and analysis text;

carrying out feature extraction and fusion on the test question text representation;

classifying the fused features based on a multilayer perceptron, and determining a knowledge point set to which the multi-type test questions belong;

calculating the topological distance of each knowledge point in the knowledge point set;

determining the cognitive level of the multi-type test questions according to the fused features based on a deep attention network model;

predicting the difficulty of the multi-type test questions based on the test question text representation, the topological distance of the knowledge points and the cognitive level.

Further, the performing feature extraction and fusion on the test question text representation specifically includes:

performing feature extraction on the stem text representation and the analysis text representation by adopting a bidirectional long short-term memory (BiLSTM) network model;

performing feature extraction on the answer text representation and the option text representation by adopting a convolutional neural network model;

and performing feature fusion by using a feature fusion model.

Further, the feature fusion by using the feature fusion model specifically includes:

for the fill-in-the-blank questions, splicing the extracted stem text features and answer text features, and inputting the spliced features into a BiLSTM layer and an attention mechanism layer for fusion;

for the multiple-choice questions, inputting the text features of each option into an attention mechanism layer, splicing the output with the stem text features of the question, and inputting the result into a BiLSTM layer and an attention mechanism layer for fusion;

for the short-answer questions, splicing the stem text features and the answer text features, and inputting the spliced features into a BiLSTM layer and an attention mechanism layer for fusion; splicing the analysis text features and the answer text features, and inputting them into a BiLSTM layer and an attention mechanism layer for fusion; and inputting the two fused features into a fully connected layer for final fusion.

Further, the predicting the difficulty of the multiple types of test questions based on the test question text representation, the knowledge point topological distance and the cognitive level specifically includes:

in the training stage, the text representation of the test question, the topological distance of the knowledge point and the cognitive level are used as the input of a linear regression model, and the score of the sample test question is obtained from the answer record and is used as a label of the difficulty of the test question;

in the test stage, the score of the current test question is predicted by inputting the text representation of the test question, the topological distance of the knowledge point and the cognitive level, and the difficulty of the test question is determined.

Further, the calculation formula of the topological distance of the knowledge point is as follows:

where d_q denotes the knowledge point topological distance of test question q; k_i and k_j denote the i-th and j-th knowledge points in the knowledge point set K to which test question q belongs, with K = {k_1, k_2, ..., k_N}; and N denotes the number of knowledge points.

The invention also provides a test question difficulty prediction system based on the deep semantic representation, which comprises the following steps:

the text representation module is used for performing text representation on the multi-type test questions based on a pre-trained language model; the multi-type test questions comprise three question types, namely fill-in-the-blank questions, multiple-choice questions and short-answer questions; the three question types comprise four kinds of structural text, namely stem text, answer text, option text and analysis text;

the characteristic extraction and fusion module is used for extracting and fusing the characteristics of the test question text representation;

the knowledge point set determining module is used for classifying the fused features based on the multilayer perceptron and determining the knowledge point sets of the multi-type test questions;

the knowledge point distance calculation module is used for calculating the topological distance of each knowledge point in the knowledge point set;

the cognitive level determining module is used for determining the cognitive levels of the multi-type test questions according to the fused features based on the deep attention network model;

and the difficulty prediction module is used for predicting the difficulty of the multi-type test questions based on the test question text representation, the knowledge point topological distance and the cognitive level.

Further, the feature extraction and fusion module specifically includes:

the first feature extraction unit is used for performing feature extraction on the stem text representation and the analysis text representation by adopting a bidirectional long short-term memory (BiLSTM) network model;

the second feature extraction unit is used for performing feature extraction on the answer text representation and the option text representation by adopting a convolutional neural network model;

and the characteristic fusion unit is used for carrying out characteristic fusion by adopting the characteristic fusion model.

Further, the calculation formula of the topological distance of the knowledge point is as follows:

where d_q denotes the knowledge point topological distance of test question q; k_i and k_j denote the i-th and j-th knowledge points in the knowledge point set K to which test question q belongs, with K = {k_1, k_2, ..., k_N}; and N denotes the number of knowledge points.

According to the specific embodiment provided by the invention, the invention discloses the following technical effects:

the method classifies and summarizes the test question types and the test question language characteristics, comprehensively utilizes the text characteristics of the test question context and is based on the automatic extraction model of the test question text cognitive hierarchy of the deep attention network; on the basis of determining the test question cognitive level, the test question context characteristics and the knowledge point topological structure characteristics are combined, and a test question difficulty automatic evaluation framework based on a hybrid neural network model and a cognitive level is researched, so that the problems that the test question text cognitive level is difficult to label and insufficient in linguistic data, and the test question difficulty evaluation standard is single and lacks cognitive guidance are solved.

Drawings

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required by the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a flowchart of a test question difficulty prediction method based on deep semantic representation according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an embedded representation of test question text according to an embodiment of the present invention;

FIG. 3 is a diagram illustrating context coding of test questions based on a pre-trained language model according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of feature extraction of test question texts according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of multi-type test question text feature fusion according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of the automatic extraction of the cognitive hierarchy of the test question text according to the embodiment of the present invention;

FIG. 7 is a schematic diagram of test question difficulty prediction based on cognitive hierarchy according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The invention aims to provide a test question difficulty prediction method and system based on deep semantic representation, so as to solve the problems that cognitive levels of test question texts are difficult to label, annotated corpora are insufficient, and the single standard of test question difficulty evaluation lacks cognitive guidance.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

As shown in fig. 1, the test question difficulty prediction method based on deep semantic representation provided by the present invention includes the following steps:

step 101: performing text representation on the multi-type test questions based on a pre-training language model; the multi-type test questions comprise three question types, namely blank filling questions, selection questions and short answer questions; the three question types comprise four types of structure texts including a question stem text, an answer text, a choice text and an analysis text.

Step 102: and carrying out feature extraction and fusion on the test question text representation.

Step 103: classifying the fused features based on a multilayer perceptron, and determining the affiliated knowledge point set of the multi-type test questions.

Step 104: and calculating the topological distance of each knowledge point in the knowledge point set.

Step 105: and determining the cognitive level of the multi-type test questions according to the fused features based on the deep attention network model.

Step 106: predicting the difficulty of the multi-type test questions based on the test question text representation, the topological distance of the knowledge points and the cognitive level.
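As a concrete illustration of step 103, classifying the fused features with a multilayer perceptron to obtain the knowledge point set can be sketched as follows. This is a minimal NumPy sketch; the single hidden layer, sigmoid multi-label head, and 0.5 threshold are illustrative assumptions, not the patent's specification:

```python
import numpy as np

def mlp_knowledge_points(fused, W1, b1, W2, b2, threshold=0.5):
    """Map a fused feature vector to a set of knowledge-point indices.

    One ReLU hidden layer followed by a sigmoid multi-label head:
    each output unit scores one candidate knowledge point.
    """
    h = np.maximum(0, fused @ W1 + b1)               # hidden layer, ReLU
    scores = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))    # per-knowledge-point probability
    return {i for i, s in enumerate(scores) if s > threshold}

# Toy usage with random (untrained) weights: an 8-dim fused feature,
# 16 hidden units, 5 candidate knowledge points.
rng = np.random.default_rng(0)
fused = rng.normal(size=8)
W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 5)), np.zeros(5)
kp_set = mlp_knowledge_points(fused, W1, b1, W2, b2)
print(sorted(kp_set))
```

The multi-label head matters here: a single question can examine several knowledge points at once, so each output unit is thresholded independently rather than taking an argmax.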

As an alternative embodiment, step 102: carrying out feature extraction and fusion on the test question text representation, and specifically comprising the following steps:

step 1021: and performing feature extraction on the question stem text representation and the analysis text representation by adopting a bidirectional long-short memory network model.

Step 1022: performing feature extraction on the answer text representation and the option text representation by adopting a convolutional neural network model.

Step 1023: and performing feature fusion by using a feature fusion model.

As an alternative embodiment, step 1023: the method for carrying out feature fusion by adopting the feature fusion model specifically comprises the following steps:

For the fill-in-the-blank questions, the extracted stem text features and answer text features are spliced and input into a BiLSTM layer and an attention mechanism layer for fusion.

For the multiple-choice questions, the text features of each option are input into an attention mechanism layer, then spliced with the stem text features, and the result is input into a BiLSTM layer and an attention mechanism layer for fusion.

For the short-answer questions, the stem text features and the answer text features are spliced and input into a BiLSTM layer and an attention mechanism layer for fusion; the analysis text features and the answer text features are spliced and input into a BiLSTM layer and an attention mechanism layer for fusion; and the two fused features are input into a fully connected layer for final fusion.
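The attention-based fusion step above can be sketched as follows. This is a minimal NumPy sketch of additive-attention pooling over a concatenated feature sequence for a fill-in-the-blank question; the intermediate BiLSTM layer is elided for brevity, and the scoring parameter `w` and feature dimensions are illustrative assumptions, not the patent's specification:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_fuse(seq, w):
    """Additive-attention pooling: score each timestep, softmax, weighted sum."""
    scores = seq @ w               # one scalar score per timestep
    alpha = softmax(scores)        # attention weights over timesteps
    return alpha @ seq             # fused fixed-size vector

# Fill-in-the-blank fusion (sketch): concatenate the stem and answer feature
# sequences along the time axis, then attention-pool the result.
rng = np.random.default_rng(1)
f_stem = rng.normal(size=(6, 4))   # 6 timesteps of stem features, dim 4
f_answer = rng.normal(size=(3, 4)) # 3 timesteps of answer features
w = rng.normal(size=4)             # attention scoring vector (assumed form)
fused = attention_fuse(np.vstack([f_stem, f_answer]), w)
print(fused.shape)  # (4,)
```

The same pooling can be applied per option for multiple-choice questions before splicing with the stem features.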

As an alternative embodiment, step 106: predicting the difficulty of the multi-type test questions based on the test question text representation, the knowledge point topological distance and the cognitive level, which specifically comprises the following steps:

In the training stage, the text representation, the knowledge point topological distance, and the cognitive level of each sample question are used as the input of a linear regression model, and the score of the sample question, obtained from answer records, is used as its difficulty label.

In the test stage, the score of the current question is predicted from its text representation, knowledge point topological distance, and cognitive level, and the question's difficulty is determined accordingly.
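The train/test procedure above can be sketched with ordinary least squares. This is a minimal NumPy sketch with toy data: the 4-dimensional feature vector stands in for the concatenation of [text representation | knowledge point topological distance | cognitive level], and the synthetic scores are illustrative assumptions:

```python
import numpy as np

# Each row concatenates the features of one sample question (toy 4-dim).
rng = np.random.default_rng(2)
X_train = rng.normal(size=(50, 4))
true_w = np.array([0.5, -1.0, 2.0, 0.3])
y_train = X_train @ true_w + 0.7        # scores taken from answer records (toy)

# Training stage: fit the linear regression by least squares.
A = np.hstack([X_train, np.ones((50, 1))])    # append intercept column
coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)

# Test stage: predict the score of a new question from its features;
# the predicted score is then mapped to a difficulty value.
x_new = rng.normal(size=4)
pred = np.append(x_new, 1.0) @ coef
print(float(pred))
```

With noise-free toy data the least-squares fit recovers the generating weights exactly; real answer-record scores would of course be noisy.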

As an alternative embodiment, the calculation formula of the topological distance of the knowledge point in step 104 is as follows:

where d_q denotes the knowledge point topological distance of test question q; k_i and k_j denote the i-th and j-th knowledge points in the knowledge point set K to which test question q belongs, with K = {k_1, k_2, ..., k_N}; and N denotes the number of knowledge points.
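One common reading of d_q, assumed here since the formula image itself is not reproduced above, is the mean pairwise shortest-path distance between the question's knowledge points k_i, k_j in the knowledge graph. A minimal stdlib sketch under that assumption:

```python
from collections import deque

def shortest_path_len(graph, a, b):
    """BFS shortest-path length between knowledge points a and b."""
    seen, frontier = {a}, deque([(a, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == b:
            return d
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return float("inf")

def topo_distance(graph, kp_set):
    """Assumed d_q: mean shortest-path distance over all pairs k_i, k_j (i < j)."""
    kps = sorted(kp_set)
    pairs = [(a, b) for i, a in enumerate(kps) for b in kps[i + 1:]]
    if not pairs:
        return 0.0
    return sum(shortest_path_len(graph, a, b) for a, b in pairs) / len(pairs)

# Toy undirected knowledge graph: k1 - k2 - k3, with k2 - k4.
graph = {"k1": ["k2"], "k2": ["k1", "k3", "k4"], "k3": ["k2"], "k4": ["k2"]}
print(topo_distance(graph, {"k1", "k3", "k4"}))  # pairwise distances 2, 2, 2 -> 2.0
```

A question whose knowledge points are far apart in the graph thus gets a larger topological distance, which feeds the difficulty regressor.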

The above method will be described in detail below:

1. multi-type test question context feature extraction and fusion based on hybrid neural network

On the basis of summarizing question types and analyzing question structures, the method takes three common and universal question types (fill-in-the-blank, multiple-choice, and short-answer questions) as its objects; after the text is vectorized with a pre-trained language model, contextual features are extracted and fused with a hybrid neural network model tailored to the structure of each question type.

(1) Corpus preparation

The invention draws on teaching resources of various subjects across primary and secondary school stages, such as teaching designs, test papers, and multimedia materials; analyzes the question types, question structures, and question language characteristics of each subject; and identifies the common, universal question types as the objects for building the project's universal test-question knowledge-point extraction model. Combining knowledge point sources such as chapter catalogs, question analyses, and course outlines, and taking the Penn Chinese Treebank (PCTB) annotation specification as the basis, it constructs a corpus suitable for downstream question-attribute annotation, and collates and normalizes undetermined and isolated test questions.

(2) Test question text representation and feature extraction

The invention performs text representation separately for fill-in-the-blank, multiple-choice, and short-answer questions, and designs different feature extraction methods for the different structures of each question type, thereby achieving full representation and feature extraction of test question text. First, formal definitions are given for the three question types and their associated knowledge points:

definition 1: filling in the blank

Define T_FQ = [T_stem; T_answer] as the text content of a fill-in-the-blank question FQ (fill-in-the-blank question), where T_stem denotes the stem text and T_answer denotes the blank answer text (by default, single-blank questions are studied).

Definition 2: multiple-choice question

Define T_CQ = [T_stem; T_opt] as the text content of a multiple-choice question CQ (Choice Question), where T_stem denotes the stem text and T_opt denotes all option texts, with T_opt = {t_1, t_2, ..., t_o}, o denoting the number of options, and the correct option t_i ∈ T_opt (by default, single-answer questions are studied).

Definition 3: short-answer question

Define T_PQ = [T_stem; T_answer; T_analysis] as the text content of a short-answer question PQ (practical question), where T_stem denotes the stem text, T_answer denotes the answer text, and T_analysis denotes the answer analysis, an interpretation and supplement of the answer text. Short-answer answers are mostly long texts, and the answer analysis, being an analysis of the answer text, is richer in content and detail than the answer itself.

The invention considers answer analysis only for short-answer questions, mainly because the correspondence between a short-answer question and its answer is more explicit and concrete: the analysis corresponds to the answer in both science and humanities subjects, and the answer text often spans multiple paragraphs, lines, and points. The answers of fill-in-the-blank and multiple-choice questions are mostly short texts whose analyses cannot be closely related to them, so they are not considered for now.
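The three structural definitions above can be captured as plain data types. This is a minimal sketch; the class and field names are illustrative, not from the patent:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FillInBlankQuestion:      # T_FQ = [T_stem; T_answer]
    stem: str
    answer: str

@dataclass
class ChoiceQuestion:           # T_CQ = [T_stem; T_opt]
    stem: str
    options: List[str]
    correct: int                # index of the correct option t_i in T_opt

@dataclass
class ShortAnswerQuestion:      # T_PQ = [T_stem; T_answer; T_analysis]
    stem: str
    answer: str
    analysis: str               # only short-answer questions carry analysis text

q = ShortAnswerQuestion("Prove that ...", "First, ...", "The key step is ...")
print(type(q).__name__)
```

Note that only `ShortAnswerQuestion` carries an `analysis` field, mirroring the design choice explained above.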

Definition 4: knowledge points

For a specific subject and school stage (such as junior middle school mathematics), and according to the teaching outline, all knowledge points are defined as K = {k_1, k_2, ..., k_M}, where M denotes the number of knowledge points, and K_Q denotes the set of knowledge points examined by a test question Q.

by formalized definition, the invention integrates four structural texts of three types of topics,comprising Tstem、Tanalysis、TanswerAnd Topt. In order to understand semantic information, the invention firstly adopts a pre-training language model BERT (bidirectional Encoder reproduction from transformations) to respectively carry out vectorization Representation on four structural texts, and can fully describe the character level, word level and sentence level characteristic information of test question texts. By TstemFor example, BERT will Tstem={w1,w2...wnEach word w inn(Token) is passed through the embedding layer to convert each Token into a vector representation. As shown in FIG. 1, the embedding layer includes three embedded representations, token embedding, segment embedding, and position embedding, respectively, by which the entered text is collectively represented. Unlike other representation methods such as Word2Vec, BERT also devised both Segment embedding and Position. The example of the simulated subject stem "artifact uniformity" is shown in FIG. 2.

The role of token embedding is to convert a word into a fixed-dimension vector representation. Two special tokens are involved: [CLS] at the beginning of a sentence and [SEP] at the end, representing the whole input sentence and the boundary of a sentence pair, respectively. Segment embedding (sentence fragment embedding) distinguishes the two sentence vectors in a sentence pair: the part before [SEP] is sentence 1 and the part after is sentence 2; all tokens of sentence 1 are marked 0, those of sentence 2 are marked 1, and so on. Position embedding records the position attribute of each token in the sentence. BERT represents each token as a 768-dimensional vector; for example, processing n input tokens in the b-th batch yields a tensor of shape (b, n, 768). The embedded representation is thus E_stem = {E_1, E_2, ..., E_n}, where E_n = E_token(w_n) + E_segment(w_n) + E_position(w_n).
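The three-way embedding sum E_n = E_token(w_n) + E_segment(w_n) + E_position(w_n) can be sketched directly. This is a minimal NumPy sketch with random, untrained lookup tables; the vocabulary size, sequence length, and token ids are toy assumptions (only the 768-dimensional hidden size comes from the text):

```python
import numpy as np

D = 768                                    # BERT hidden size per token
rng = np.random.default_rng(3)
vocab, max_len = 100, 32
E_token = rng.normal(size=(vocab, D))      # token embedding table
E_segment = rng.normal(size=(2, D))        # sentence-1 vs sentence-2
E_position = rng.normal(size=(max_len, D)) # one row per position

def embed(token_ids, segment_ids):
    """E_n = E_token(w_n) + E_segment(w_n) + E_position(w_n) per token."""
    return np.stack([
        E_token[t] + E_segment[s] + E_position[p]
        for p, (t, s) in enumerate(zip(token_ids, segment_ids))
    ])

# A 5-token single-sentence input: [CLS] w w w [SEP] (ids are toy values).
E_stem = embed([1, 7, 8, 9, 2], [0, 0, 0, 0, 0])
print(E_stem.shape)  # (5, 768)
```

In the real model these tables are learned jointly with the Transformer layers rather than drawn at random.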

Then, the embedded representation E_stem of the stem text is input into the pre-trained bidirectional Transformer of the BERT model to extract the information implied in lexical, syntactic, and other sentence patterns of the text, yielding the word vector representation X_stem = {x_1, x_2, ..., x_n} of the input text, where x_n = TransformerEncoder(E_n), as shown in FIG. 3.

Through the same BERT text embedding process, the invention obtains the word vector representation of each structural text: X_stem, X_analysis, X_answer, and X_opt. By analyzing the linguistic characteristics of the four structured texts, the invention then adopts two different feature extraction methods. The question stem text X_stem and the parsing text X_analysis emphasize overall understanding of the text and have stronger sequential semantics, so the invention adopts a bidirectional long short-term memory network (BiLSTM) to extract sequential feature information, as shown in FIG. 4. Taking the question stem text X_stem as an example, the hidden state in the BiLSTM network structure can be expressed as:

h_t = f(W_x·x_t + W_h·h_{t-1} + b)

where x_t ∈ X_stem denotes the t-th input word vector, W_x and W_h are the weight matrices for the current word input and the connection to the previous hidden state, and b is the bias. Compared with an RNN (Recurrent Neural Network), an LSTM designs a memory cell structure and controls the storage, update, and forgetting of information in the memory cell c_t through a three-gate structure (input gate, forget gate, and output gate):

i_t = σ(W_xi·x_t + W_hi·h_{t-1} + W_ci·c_{t-1} + b_i)

c_t = (1 - i_t) ⊙ c_{t-1} + i_t ⊙ tanh(W_xc·x_t + W_hc·h_{t-1} + b_c)

o_t = σ(W_xo·x_t + W_ho·h_{t-1} + W_co·c_t + b_o)

h_t = o_t ⊙ tanh(c_t)

where σ is the sigmoid function and ⊙ is the Hadamard product. A standard LSTM processes the t-th word from left to right to obtain the forward hidden state. On this basis, the BiLSTM adds a backward hidden state computed from right to left. Splicing the forward and backward hidden states finally yields the sequential feature representation F_stem of the question stem word vectors X_stem. By the same method, the invention also obtains the sequential feature representation F_analysis of the parsing text X_analysis.
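The coupled-gate LSTM equations above and the bidirectional splicing can be sketched as follows. This is a minimal NumPy illustration with toy dimensions and random weights; parameter names mirror the formulas and are not from the original:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_pass(X, P):
    # One directional pass of the coupled-gate LSTM used in the text:
    #   i_t = sigmoid(Wxi x_t + Whi h_{t-1} + Wci c_{t-1} + b_i)
    #   c_t = (1 - i_t) * c_{t-1} + i_t * tanh(Wxc x_t + Whc h_{t-1} + b_c)
    #   o_t = sigmoid(Wxo x_t + Who h_{t-1} + Wco c_t + b_o)
    #   h_t = o_t * tanh(c_t)
    H = P["Whi"].shape[0]
    h, c, out = np.zeros(H), np.zeros(H), []
    for x in X:
        i = sigmoid(P["Wxi"] @ x + P["Whi"] @ h + P["Wci"] @ c + P["bi"])
        c = (1 - i) * c + i * np.tanh(P["Wxc"] @ x + P["Whc"] @ h + P["bc"])
        o = sigmoid(P["Wxo"] @ x + P["Who"] @ h + P["Wco"] @ c + P["bo"])
        h = o * np.tanh(c)
        out.append(h)
    return np.array(out)

def bilstm(X, Pf, Pb):
    # Concatenate the forward and backward hidden states at every time step.
    fwd = lstm_pass(X, Pf)
    bwd = lstm_pass(X[::-1], Pb)[::-1]
    return np.concatenate([fwd, bwd], axis=1)

rng = np.random.default_rng(0)
d, H, n = 8, 4, 6  # toy word-vector dim, hidden size, sequence length
def params():
    return {k: rng.normal(scale=0.1, size=(H, d) if k.startswith("Wx")
            else (H,) if k.startswith("b") else (H, H))
            for k in ["Wxi", "Whi", "Wci", "bi", "Wxc", "Whc", "bc",
                      "Wxo", "Who", "Wco", "bo"]}

X_stem = rng.normal(size=(n, d))
F_stem = bilstm(X_stem, params(), params())
print(F_stem.shape)  # (6, 8): n time steps, 2*H features each
```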

The two structural texts X_answer and X_opt generally exist as short texts that aggregate comprehensive features of the test question and have strong local semantics, so the invention adopts TextCNN (Text Convolutional Neural Networks) to extract local semantic features, as shown in FIG. 3. Taking the answer text X_answer as an example, convolution kernels of different sizes extract word fragment (n-gram) features; each feature map computed by convolution is max-pooled (Max Pooling) to retain the maximum feature value, and the pooled values are spliced into one vector as the representation of the text. Specifically, X_answer is input into the TextCNN with one-dimensional convolution kernels whose width equals the input word vector dimension d and which differ only in height; p different heights h_1, h_2, ..., h_p are provided. Taking a kernel of height h as an example, it can be represented as a matrix W_conv_h ∈ R^{h×d}; sliding this kernel over the word vectors performs the convolution operation, and when the sliding window covers the i-th to (i+h-1)-th words, the output of the kernel can be expressed as:

conv_{i:i+h-1} = f(W_conv_h · X_{i:i+h-1} + b)

where f is the activation function of the convolution kernel and b is the bias. A kernel matrix of height h performs n-h+1 convolution operations and outputs n-h+1 values, which after splicing give a word fragment feature map (feature_map) of length n-h+1. If there are k convolution kernels of each height, each height produces k vectors of length n-h+1:

conv = [conv_{0:h-1}, conv_{1:h}, ..., conv_{n-h:n-1}]

Then, for each feature_map output by a convolution kernel, a max pooling operation with a pooling kernel of length n-h+1 retains the single maximum value:

pool_h = max(conv_{0:h-1}, conv_{1:h}, ..., conv_{n-h:n-1})

Since there are k convolution kernels of each height and p heights in total, the pooled output values are spliced into a vector of length k×p:

where each element is the value output after convolution and max pooling using the j-th convolution kernel of the i-th height. This vector is the feature representation of the answer text X_answer, denoted F_answer. By the same method, the invention also obtains the feature representation F_opt of the option text X_opt, where each option text x_opt,i ∈ X_opt has the feature representation F_opt,i ∈ F_opt.
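The convolution-and-max-pooling pipeline above can be sketched as follows. This is a toy NumPy illustration; the kernel counts, heights, and dimensions are illustrative assumptions:

```python
import numpy as np

def text_cnn(X, kernels, bias=0.0):
    # X: (n, d) word vectors; kernels: dict {height h: list of (h, d) matrices}.
    # Each kernel slides over the n-h+1 windows, producing a feature map that
    # is max-pooled to one value; all pooled values are spliced together.
    n, d = X.shape
    pooled = []
    for h, mats in kernels.items():
        for W in mats:
            fmap = [np.maximum(0.0, np.sum(W * X[i:i + h]) + bias)  # ReLU conv
                    for i in range(n - h + 1)]
            pooled.append(max(fmap))     # max pooling keeps the strongest n-gram
    return np.array(pooled)              # length k x p

rng = np.random.default_rng(0)
n, d, k = 10, 16, 3                      # toy sizes: 10 words, dim 16, k kernels
heights = [2, 3, 4]                      # p = 3 kernel heights
kernels = {h: [rng.normal(size=(h, d)) for _ in range(k)] for h in heights}
F_answer = text_cnn(rng.normal(size=(n, d)), kernels)
print(F_answer.shape)  # (9,) = k * p
```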

(3) Multi-question multi-feature fusion

On the basis of extracting the features of the four structural texts, the invention designs feature fusion network structures adapted to the text characteristics and answering characteristics of the different question types, as shown in FIG. 5.

1) Fill-in-the-blank questions

For fill-in-the-blank questions, the feature F_stem obtained by representing and extracting the question stem text T_stem and the feature F_answer obtained from the fill-in answer text T_answer are spliced and input into one BiLSTM layer to output the fused feature F_FQ:

F_FQ = BiLSTM(concat(F_stem, F_answer))

Fill-in-the-blank questions differ from multiple-choice questions in the overall text: the answer text of a fill-in-the-blank question combines more closely with the question stem text, because the answer is generated by hollowing out the stem, so syntactic and semantic relations also exist between the fill-in answer and the stem. After feature fusion, F_FQ ∈ R^{N_F}, where N_F denotes the feature vector length. The invention inputs it into an attention layer to process the overall text semantics; the attention probability distribution a_i is:

a_i = exp(e_i) / Σ_{j=1}^{N_F} exp(e_j)

where j ∈ [1, N_F] and u_a is a randomly initialized vector updated gradually during training. The proportion score e_i of the i-th word vector feature representation F_FQ,i is:

e_i = u_a^T · ReLU(W·F_FQ,i + U·u_a + b)

where W and U are weight matrices, b is a bias value, and ReLU is the activation function. After the probability distribution value of each word vector is obtained, all word vector feature representations are weighted and summed to obtain the feature vector F̃_FQ of the overall fill-in-the-blank text T_FQ:

F̃_FQ = Σ_{i=1}^{N_F} a_i · F_FQ,i

2) Multiple-choice questions

To make full use of the matching weights among the options and help the network learn more connections between the question and its options, the invention inputs the feature representation of each option into an attention layer and computes, for each option, the proportion (attention probability distribution) a_i of its matching score against the feature representations of all options:

a_i = exp(e_i) / Σ_{j=1}^{O} exp(e_j)

where j ∈ [1, O], O denotes the number of options, and u_a is a randomly initialized vector updated gradually during training. The matching score e_i of the i-th option's feature representation F_opt,i within F_opt is:

e_i = u_a^T · ReLU(W·F_opt,i + U·u_a + b)

where W and U are weight matrices, b is a bias value, and ReLU is the activation function. After the probability distribution value of each option is obtained, all option feature representations are weighted and summed to obtain the feature vector F̃_opt of the option text T_opt:

F̃_opt = Σ_{i=1}^{O} a_i · F_opt,i

Then the features are combined by splicing and input into one BiLSTM layer to output the fused feature F_CQ.

Finally, one attention-mechanism layer is applied to obtain the final feature F̃_CQ.

3) Short-answer questions

The answering process of a short-answer question embodies a logical thinking process, i.e., applying the conditions in T_stem to the steps in T_answer. Meanwhile, the answer analysis and the answer have a supplementary explanatory relation: T_analysis gives specific parsed content for each step in T_answer. Such a T_stem → T_answer → T_analysis dependency at the semantic level is also reflected between the text features; therefore, the invention adopts attention to characterize and fuse these two dependency processes respectively.

For T_stem → T_answer, the features of the two parts are first spliced and input into one BiLSTM layer to obtain the fused feature F_sa:

F_sa = BiLSTM(concat(F_stem, F_answer))

Then one layer of attention-mechanism weight distribution is applied to obtain the feature vector F̃_sa, computed in the same way as above.

For T_answer → T_analysis, the features of the two parts are likewise spliced and input into one BiLSTM layer to obtain the fused feature F_aa:

F_aa = BiLSTM(concat(F_answer, F_analysis))

Then one layer of attention-mechanism weight distribution is applied to obtain the feature vector F̃_aa, computed in the same way as above.

Finally, the overall feature F̃_SQ depending on both parts is formed through a fully connected layer FC:

F̃_SQ = FC(concat(F̃_sa, F̃_aa))


In summary, feature fusion models are designed according to the characteristics of the different question types, and the overall feature representations of the fill-in-the-blank, multiple-choice, and short-answer question texts are obtained respectively: F̃_FQ, F̃_CQ, and F̃_SQ.

2. test question knowledge point extraction

Taking F̃_CQ as an example, it is input into a multi-layer perceptron (MLP) for classification. This layer consists of two fully connected layers F_1 and F_2, where F_1 uses ReLU as the activation function and the number of nodes in F_2 equals the total number of knowledge points M. Let the knowledge point label set be K = {k_1, k_2, ..., k_M}. The MLP converts the fused feature F̃_CQ into a vector of length M, and finally a Softmax function serves as the classifier to normalize the output of F_2, yielding the probability that the multiple-choice question belongs to each knowledge point:

p = Softmax(F_2(ReLU(F_1(F̃_CQ))))
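The two-layer MLP classifier can be sketched as follows (toy sizes, random weights; the layer names F_1 and F_2 follow the text):

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def knowledge_point_probs(feature, W1, b1, W2, b2):
    # Two fully connected layers: F_1 with ReLU, then F_2 with M output nodes;
    # Softmax normalizes F_2's output into per-knowledge-point probabilities.
    f1 = np.maximum(0.0, W1 @ feature + b1)
    return softmax(W2 @ f1 + b2)

rng = np.random.default_rng(0)
d, hidden, M = 16, 32, 10            # toy sizes; M = number of knowledge points
F_CQ = rng.normal(size=d)            # fused multiple-choice feature (toy input)
p = knowledge_point_probs(F_CQ,
                          rng.normal(size=(hidden, d)), rng.normal(size=hidden),
                          rng.normal(size=(M, hidden)), rng.normal(size=M))
print(p.shape, round(float(p.sum()), 6))  # (10,) 1.0
```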

3. test difficulty assessment under cognitive objective

The test question difficulty assessment based on the cognitive hierarchy mainly solves three problems: first, the difficulty of labeling cognitive levels in test question texts and the shortage of corpora; second, the automatic extraction of cognitive verbs from test question texts and the determination of cognitive levels; and third, an automatic difficulty evaluation model integrating test question features such as the cognitive level.

(1) Test question text cognition level automatic extraction model

The cognitive level of a test question is hidden in the test question text and is a deep feature. On the basis of the above preparation, the invention adopts a Deep Attention Network structure, designs a network block capable of deeply mining the hidden features of the test question text, and combines a residual network to achieve feature enhancement in higher network layers, as shown in FIG. 6.

1) Test question text representation

The test question text is first embedded, here as before, via BERT to obtain the word vector sequence X_T, and the fused feature is obtained using the method above.

2) Multi-head attention network block

Then a multi-head attention structure (Multi-head Attention) is adopted: the text representation is mapped through h sets of linear projections into Q (queries), K (keys), and V (values) matrices, and each head i computes scaled dot-product attention:

Q_i = X·W_i^Q, K_i = X·W_i^K, V_i = X·W_i^V

M_i = softmax(Q_i·K_i^T / √d_k) · V_i

Finally, the deep attention network block outputs the hidden variable Y:

Y=M·W

where M = concat(M_1, ..., M_h).

3) Deep attention network incorporating residual network

The deep attention network is composed of a plurality of network blocks, as shown in fig. 6. To simplify model training and achieve higher accuracy, one residual connection block is used after each network block to stabilize network feature propagation:

Y=X+Block(X)
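A minimal NumPy sketch of one network block with its residual connection, Y = X + Block(X). The head projections and sizes are toy assumptions:

```python
import numpy as np

def softmax_rows(Z):
    Z = Z - Z.max(axis=-1, keepdims=True)
    e = np.exp(Z)
    return e / e.sum(axis=-1, keepdims=True)

def attention_block(X, heads, Wo):
    # One network block: h heads of scaled dot-product attention,
    # concatenated and projected (Y = M . W), as in the multi-head structure.
    outs = []
    for Wq, Wk, Wv in heads:
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        A = softmax_rows(Q @ K.T / np.sqrt(K.shape[1]))
        outs.append(A @ V)
    return np.concatenate(outs, axis=1) @ Wo

def residual_block(X, heads, Wo):
    # The residual connection stabilizes feature propagation: Y = X + Block(X).
    return X + attention_block(X, heads, Wo)

rng = np.random.default_rng(0)
n, d, h, dk = 5, 16, 4, 4                 # toy sizes, with d = h * dk
heads = [tuple(rng.normal(scale=0.1, size=(d, dk)) for _ in range(3))
         for _ in range(h)]
Wo = rng.normal(scale=0.1, size=(h * dk, d))
Y = residual_block(rng.normal(size=(n, d)), heads, Wo)
print(Y.shape)  # (5, 16)
```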

Finally, through Softmax layer mapping, the probabilities of test question T corresponding to the six cognitive levels are output (see the table below), and the level with the highest probability is taken as the cognitive level of the test question text:

however, often a test question may include multiple cognitive levels, such as memory, understanding, and synthesis. Therefore, the invention needs an index f for comprehensively measuring the cognitive level of the test questioncognition:

Cognitive level   | Memory | Understanding | Application | Analysis | Synthesis | Evaluation
Probability p_i   | p_1    | p_2           | p_3         | p_4      | p_5       | p_6
Weight a_i        | 1      | 2             | 3           | 4        | 5         | 6
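Assuming f_cognition is the probability-weighted sum of the level weights in the table (the formula image is missing from the source, so this aggregation is an assumption), it can be computed as:

```python
# Composite cognitive index: the table pairs each level's Softmax probability
# p_i with an integer weight a_i in 1..6; the index is the weighted sum.
weights = {"Memory": 1, "Understanding": 2, "Application": 3,
           "Analysis": 4, "Synthesis": 5, "Evaluation": 6}

def f_cognition(probs):
    # probs: Softmax output over the six cognitive levels, summing to 1
    return sum(weights[level] * p for level, p in probs.items())

probs = {"Memory": 0.5, "Understanding": 0.3, "Application": 0.1,
         "Analysis": 0.1, "Synthesis": 0.0, "Evaluation": 0.0}
print(f_cognition(probs))  # 1.8
```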

(2) Test question difficulty assessment technology based on cognitive hierarchy

On the basis of automatic extraction of test question cognitive levels, the invention designs a test question difficulty evaluation technology based on cognitive levels, which is shown in figure 7. The overall process includes training and testing phases. In the training stage, the test question text representation, the test question knowledge point topological distance representation and the cognitive level are used as model input, and the score of the test question is obtained from the answer record and is used as a label of the test question difficulty; in the test stage, the scoring rate of the test questions, namely the difficulty of the test questions, is predicted by inputting the three types of test question characteristics. Formalization is defined as follows:

definition 5: fraction of the power

For a test question set Q, the scoring rate r_q of a test question q_i ∈ Q can be expressed as:

r_q = (1/|R|) · Σ_{r∈R} score(r) / w_i

where w_i denotes the full score of question q_i, score(r) denotes the score of an answer record r, and R is the set of answer records.

Definition 6: topological distance of test question knowledge points

From Definition 4, K = {k_1, k_2, ..., k_M} is the set of all knowledge points and M is the number of knowledge points; K_q denotes the set of knowledge points examined by test question q. Define the undirected knowledge point relation graph G = (K, E), where K is the knowledge point set and E is the set of association edges between knowledge points. Let the shortest topological distance between two knowledge points k_i and k_j in G be DFS(k_i, k_j). The knowledge point topological distance of test question q is then the average shortest distance over all pairs of knowledge points in K_q:

d_q = (2 / (|K_q|·(|K_q|−1))) · Σ_{k_i,k_j ∈ K_q, i<j} DFS(k_i, k_j)
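A sketch of the pairwise shortest-distance computation on the knowledge point graph. The source labels the shortest distance DFS, but on an unweighted undirected graph a breadth-first search finds it; the averaging over pairs is an assumed aggregation, since the formula image is missing:

```python
from collections import deque
from itertools import combinations

def shortest_distance(graph, src, dst):
    # Shortest path length between two knowledge points in the undirected
    # graph G = (K, E), found by breadth-first search.
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return float("inf")             # no path between the two knowledge points

def topo_distance(graph, Kq):
    # Average pairwise shortest distance over the knowledge points K_q of
    # question q (assumed aggregation).
    pairs = list(combinations(sorted(Kq), 2))
    if not pairs:
        return 0.0
    return sum(shortest_distance(graph, a, b) for a, b in pairs) / len(pairs)

# Toy knowledge-point graph: k1 - k2 - k3, with k2 - k4
G = {"k1": ["k2"], "k2": ["k1", "k3", "k4"], "k3": ["k2"], "k4": ["k2"]}
print(topo_distance(G, {"k1", "k3"}))  # 2.0
```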

First, the invention splices the word vectors X_stem, X_analysis, X_answer, and X_opt of each text part of test question q into one global word vector X_q representing the test question text information:

X_q = concat(X_stem, X_analysis, X_answer, X_opt)

X_q is then successively fed into a BiLSTM layer, a CNN layer, and a fully connected layer FC with d nodes, yielding the feature representation F_q of dimension d (the specific data flow is similar to the above and is not repeated):

F_q = FC(CNN(BiLSTM(X_q)))

On the other hand, the cognitive level feature f_cognition of the test question and the knowledge point topological distance d_q are added and fused into a new feature value f_q, which is then added to every element of F_q:

f_q = add(f_cognition, d_q)

F_q = F_q + f_q

Finally, a linear regression model is adopted whose output is the predicted test question difficulty d(F_q):

d(F_q) = W^T · F_q + b

where W^T is the weight matrix and b is the bias.
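The final fusion and regression steps can be sketched as follows (toy feature sizes; W and b would be learned during training, and the input values are illustrative):

```python
import numpy as np

# End-of-pipeline sketch: fuse the cognitive index and knowledge-point
# topological distance into the text feature, then apply linear regression.
def predict_difficulty(F_q, f_cognition, d_q, W, b):
    f_q = f_cognition + d_q          # f_q = add(f_cognition, d_q)
    F_q = F_q + f_q                  # broadcast f_q onto every element of F_q
    return float(W @ F_q + b)        # d(F_q) = W^T . F_q + b

rng = np.random.default_rng(0)
d = 16
F_q = rng.normal(size=d)             # text feature from FC(CNN(BiLSTM(X_q)))
difficulty = predict_difficulty(F_q, f_cognition=1.8, d_q=2.0,
                                W=rng.normal(size=d), b=0.1)
print(type(difficulty).__name__)  # float
```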

The invention also provides a test question difficulty prediction system based on the deep semantic representation, which comprises the following steps:

the text representation module is used for performing text representation on the multi-type test questions based on the pre-training language model; the multi-type test questions comprise three question types, namely blank filling questions, selection questions and short answer questions; the three question types comprise four types of structure texts which are a question stem text, an answer text, an option text and a parsing text;

the characteristic extraction and fusion module is used for extracting and fusing the characteristics of the test question text representation;

the knowledge point set determining module is used for classifying the fused features based on the multilayer perceptron and determining the knowledge point sets of the multi-type test questions;

the knowledge point distance calculation module is used for calculating the topological distance of each knowledge point in the knowledge point set;

the cognitive level determining module is used for determining the cognitive levels of the multi-type test questions according to the fused features based on the deep attention network model;

and the difficulty prediction module is used for predicting the difficulty of the multi-type test questions based on the test question text representation, the knowledge point topological distance and the cognitive level.

Wherein, the feature extraction and fusion module specifically comprises:

the first feature extraction unit is used for extracting features of the question stem text characterization and the parsing text characterization by adopting a bidirectional long short-term memory network model;

the second feature extraction unit is used for extracting features of the answer text characterization and the option text characterization by adopting a convolutional neural network model;

and the characteristic fusion unit is used for carrying out characteristic fusion by adopting the characteristic fusion model.

The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.

The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.
