Method and system for automatically solving mathematical problem

Document No.: 1490757 Publication date: 2020-02-04

Abstract: This technology, a method and system for automatically solving mathematical problems, was designed by 谢晓华, 罗文杰 and 赖剑煌 on 2019-09-04. The invention discloses a method and a system for automatically solving mathematical problems. The method comprises: acquiring the information of the mathematical problem to be solved and converting it into a corresponding image matrix signal; locating the problem in the image matrix signal, then segmenting and recognizing its characters to obtain a character sequence; and performing spatial structure analysis and semantic analysis on the recognized character sequence so as to solve the mathematical calculation problem automatically. Character recognition uses localization, character segmentation and character thinning, together with a trained character classifier. Spatial structure analysis and semantic analysis produce a parse tree, and syntax-directed translation of the parse tree then yields the solution. The invention removes the limitation that mathematical calculation problems can only be entered through a keyboard. Once the types and symbols of a class of problems are defined, problems of that class can be solved automatically, so the method can handle more problem types and offers better practicability and higher accuracy.

1. A method for automatically solving a mathematical problem, comprising the steps of:

(1) acquiring mathematical calculation problem information to be solved, and converting the mathematical calculation problem information into corresponding image matrix signals;

(2) locating the mathematical calculation problem in the image matrix signal, then segmenting and recognizing its characters to obtain a character sequence;

(3) performing structural analysis of the mathematical calculation problem using the category, size and spatial coordinate information of the character sequence: the problem is parsed with a method based on syntactic rules, the characters are merged step by step, and a tree-structured representation of the problem, namely a parse tree, is finally obtained;

(4) defining attributes and calculation methods for the nodes of the parse tree according to the mathematical semantic rules of the symbols; traversing the parse tree in post-order so that the attribute values of its nodes propagate from bottom to top until the attribute value of the root node is obtained; and taking the attribute value of the root node as the answer to the mathematical calculation problem, completing its automatic solution.

2. The method for automatically solving a mathematical problem according to claim 1, wherein in step (1) the information of the mathematical calculation problem to be solved takes any one of the following forms: an image of the mathematical calculation problem, a handwriting trajectory of the mathematical calculation problem, or a scanned copy of a document containing the mathematical calculation problem.

3. The method for automatically solving a mathematical problem according to claim 1, wherein in step (1) the image matrix signal is preprocessed as follows: thresholding is performed first, followed by Gaussian smoothing, and the image is then denoised with morphological operations to obtain the preprocessed image.

4. The method for automatically solving a mathematical problem according to claim 1, wherein in step (2) the mathematical calculation problem is located in the image matrix signal as follows: different text regions of the image are separated by connected-component analysis or a projection method; features of each region are extracted with deep learning and the regions are classified into text regions and formula regions, a formula region being the location of a mathematical calculation problem.

5. The method for automatically solving a mathematical problem according to claim 1, wherein in step (2) the characters of the located mathematical calculation problem are segmented and recognized as follows: a mathematical symbol library is constructed; according to the definition of the library, a certain number of handwritten samples of each symbol are collected as a data set, and a deep learning object detection model is trained on it to obtain a classifier for recognizing mathematical symbols; the symbols in the image matrix signal are recognized to obtain the recognized character sequence $\{c_1, c_2, \ldots, c_n\}$; each element of the sequence contains the category, size and spatial coordinate information of the character.

6. The method for automatically solving mathematical problems as claimed in claim 1, wherein in the step (3), the mathematical problems are structurally analyzed by:

(3-1) designing an automaton that identifies the elements of the mathematical formula according to the category, size and spatial coordinate information of the characters, forming the minimal element sequence $\{t_1, t_2, \ldots, t_n\}$ of the formula, where each element $t_i$ of the sequence consists of at least one mathematical symbol and the elements are the smallest units of a mathematical calculation problem;

(3-2) defining a syntax rule of the mathematical computation problem, and identifying the structure of the mathematical computation problem through syntax analysis; in the process of analyzing the structure of the calculation problem, mathematical elements are continuously merged into subtrees according to the syntactic rules until a parse tree T of the semantics of the mathematical calculation problem is finally obtained.

7. The method of automatically solving mathematical problems of claim 6 wherein in step (3-2) a parser is constructed to identify the structure of the mathematical computational problem, the parser receiving the sequence of elements as input and using operator precedence or recursive descent to parse the structure of the mathematical computational problem.

8. The method for automatically solving the mathematical problems as claimed in claim 1, wherein in the step (4), the semantic analysis is performed according to the parse tree to realize the automatic solving of the mathematical computation problems, and the method comprises the following steps:

(4-1) writing the attributes of the tree nodes and the corresponding calculation methods for the parse tree T according to the mathematical meaning of the mathematical symbols;

(4-2) performing bottom-up semantic analysis and solving the mathematical expression automatically by a post-order traversal of the parse tree, specifically: the parse tree is traversed in post-order, and each node calls its own calculation method according to its type and returns the result to its parent node; finally, the root node of the parse tree yields the answer to the mathematical calculation problem;

(4-3) obtaining the LaTeX output, the calculation steps and the solution of the mathematical calculation problem and returning them to the user, thereby realizing the automatic solving of the mathematical calculation problem.

9. The method for automatically solving a mathematical problem according to claim 8, wherein in the step (4-1), the attributes of the tree node include at least a state attribute and a value attribute, the state attribute indicates whether the node has been calculated, whether the node is a constant or a variable; the value attribute is a result calculated according to mathematical semantics.

10. A system for automatically solving a mathematical problem, comprising:

the information acquisition module is used for acquiring mathematical calculation problem information to be solved and converting the mathematical calculation problem information into a corresponding image matrix signal;

the character extraction module is used for locating the mathematical calculation problem in the image matrix signal, then segmenting and recognizing its characters to obtain a character sequence;

the structure analysis module is used for performing structural analysis of the mathematical calculation problem using the category, size and spatial coordinate information of the character sequence: the problem is parsed with a method based on syntactic rules, the characters are merged step by step, and a tree-structured representation of the problem, namely a parse tree, is finally obtained;

the semantic analysis module is used for defining attributes and calculation methods for the nodes of the parse tree according to the mathematical semantic rules of the symbols; traversing the parse tree in post-order so that the attribute values of its nodes propagate from bottom to top until the attribute value of the root node is obtained; and taking the attribute value of the root node as the answer to the mathematical calculation problem, completing its automatic solution.

Technical Field

The invention relates to the field of artificial intelligence research, in particular to a method and a system for automatically solving a mathematical problem.

Background

Mathematical calculation problems are a headache for students. To guide students through them, teachers and parents spend a great deal of time reviewing mathematical knowledge and repeating trivial calculations. If a program could solve mathematical calculation problems automatically, students could complete the problems on their own and check their work against reference answers.

To meet this need, researchers have proposed building a system that can recognize and automatically solve mathematical problems. In such a system, the user enters a mathematical calculation problem by photographing it or writing it by hand, and the system outputs the recognized problem together with the solution steps and the answer. The system has many use cases. It can digitize paper-based mathematical literature for permanent storage on electronic media. It can recognize the mathematical formulas in a book and convert them to audio so that visually impaired people can learn mathematics. In education, it can give students the reasoning and calculation process for a mathematical calculation problem, and it can help teachers grade mathematics homework automatically.

Although optical character recognition technology is well developed, the document recognition software on the market can only process one-dimensional text. A mathematical formula has complex two-dimensional spatial relationships and nesting relationships, which makes it very difficult to recognize. Most previous work focuses on locating, segmenting, recognizing and structurally analyzing mathematical formulas, and few researchers have studied the semantic analysis of mathematical formulas in depth. One earlier work proposed recognizing single characters from statistical and structural features, then decomposing the recognized expression into a series of sub-expressions for structural analysis, and finally outputting the recognized formula in LaTeX format. Chensong recognized mathematical symbols by coarse classification followed by fine template matching, combined with the relative positions of the characters and partial recognition results; for structural analysis, a box-merging method assembled the complete expression from the bottom up. Both works handle machine-printed fonts only and do not address the recognition of handwritten mathematical formulas. Furthermore, they stop at structural analysis and perform neither semantic analysis nor automatic evaluation of the formulas.

Many applications already offer automatic question search. Software such as "Ape Search Questions" and "Baidu Search Questions" preprocesses and segments the picture uploaded by the user, recognizes the characters with deep learning, converts the question into text, ranks the questions in a long-accumulated question bank by similarity to it, and finally returns the answer and analysis of the best match. These applications use image processing, character recognition and question bank matching, but they perform no semantic analysis or automatic solving of mathematical formulas. Because they rely on answers entered into the question bank in advance, they do not work well on new problems of the same type that are absent from the bank. Another app in mobile application stores, called "love homework", can photograph and grade oral arithmetic problems. It accurately recognizes multiple oral arithmetic problems in a picture, performs structural and semantic analysis on the expressions, and returns a correct grading result. However, it can only process primary-school arithmetic, fractions and the like, and cannot recognize variables or mathematical function symbols.

Given the shortcomings of the above techniques and methods, a method and system that can automatically solve mathematical problems has considerable research significance and practical value.

Disclosure of Invention

The invention aims to overcome the shortcomings of existing photograph-and-search methods and to provide a method and a system, based on artificial intelligence technology, for automatically solving mathematical problems. The method and system are highly practical, do not return answers that fail to match the question, and apply to many problem types.

The purpose of the invention is realized by the following technical scheme: a method for automatically solving a mathematical problem, comprising the steps of:

(1) acquiring mathematical calculation problem information to be solved, and converting the mathematical calculation problem information into corresponding image matrix signals;

(2) locating the mathematical calculation problem in the image matrix signal, then segmenting and recognizing its characters to obtain a character sequence;

(3) performing structural analysis of the mathematical calculation problem using the category, size and spatial coordinate information of the character sequence: the problem is parsed with a method based on syntactic rules, the characters are merged step by step, and a tree-structured representation of the problem, namely a parse tree, is finally obtained;

(4) defining attributes and calculation methods for the nodes of the parse tree according to the mathematical semantic rules of the symbols; traversing the parse tree in post-order so that the attribute values of its nodes propagate from bottom to top until the attribute value of the root node is obtained; and taking the attribute value of the root node as the answer to the mathematical calculation problem, completing its automatic solution.

As a preferred embodiment of the present invention, in step (1) the information of the mathematical calculation problem to be solved takes any one of the following forms: an image of the mathematical calculation problem, a handwriting trajectory of the mathematical calculation problem, or a scanned copy of a document containing the mathematical calculation problem. The mathematical calculation problems include elementary mathematics, advanced mathematics, linear algebra, probability theory, etc.

As a preferred embodiment of the present invention, in step (1) the image matrix signal $I$ is preprocessed as follows: thresholding is performed first, followed by Gaussian smoothing, and the image is then denoised with morphological operations to obtain the preprocessed image $I_1$. This reduces the influence of irrelevant factors.
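The preprocessing order described here (thresholding, Gaussian smoothing, morphological denoising) can be sketched in plain Python. This is only an illustration under stated assumptions, not the implementation of the invention: the 3x3 Gaussian kernel, the threshold of 128, the choice of a grayscale opening (erosion followed by dilation) as the morphological step, the final re-binarization at 0.5, and all function names are hypothetical.

```python
# Illustrative sketch only; kernel size, thresholds and names are assumptions.

def threshold(img, t=128):
    """Binarize: dark ink (< t) becomes 1.0, light background becomes 0.0."""
    return [[1.0 if px < t else 0.0 for px in row] for row in img]

def _window(img, r, c):
    """Values of the 3x3 neighbourhood around (r, c), replicating edge pixels."""
    h, w = len(img), len(img[0])
    return [img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

GAUSS_3X3 = [1, 2, 1, 2, 4, 2, 1, 2, 1]  # weights sum to 16

def _filter(img, reduce_window):
    """Apply a function of the 3x3 neighbourhood at every pixel."""
    return [[reduce_window(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]

def gaussian_smooth(img):
    return _filter(img, lambda win: sum(k * v for k, v in zip(GAUSS_3X3, win)) / 16)

def erode(img):
    return _filter(img, min)

def dilate(img):
    return _filter(img, max)

def preprocess(img, t=128):
    """Threshold, smooth, then remove small specks with a morphological opening."""
    smoothed = gaussian_smooth(threshold(img, t))
    opened = dilate(erode(smoothed))
    # Assumption: re-binarize so later stages receive a clean 0/1 image.
    return [[1 if v >= 0.5 else 0 for v in row] for row in opened]
```

Calling `preprocess` on a grayscale page given as a list of rows returns a 0/1 image in which isolated single-pixel specks are suppressed while solid strokes are kept. A production system would more likely use an image library for these filters.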

As a preferred embodiment of the present invention, in step (2) the mathematical calculation problem is located in the image matrix signal as follows: different text regions of the image are separated by connected-component analysis or a projection method; features of each region are extracted with deep learning and the regions are classified into text regions and formula regions, a formula region being the location of a mathematical calculation problem.
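As an illustration of the projection method mentioned here, the sketch below splits a binary image into rectangular regions by projecting ink counts onto the rows and then onto the columns of each row band. It covers only the region-splitting step; the deep learning classifier that labels each region as text or formula is not shown. The function names and the `(top, left, bottom, right)` box convention are assumptions.

```python
# Illustrative sketch only; names and box convention are assumptions.

def projection_runs(profile):
    """Return (start, end) pairs, end exclusive, of consecutive non-zero entries."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            runs.append((start, i))        # the run ended at the previous index
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def split_regions(binary):
    """Split a 0/1 image into boxes (top, left, bottom, right), end-exclusive."""
    regions = []
    row_profile = [sum(row) for row in binary]           # ink per row
    for top, bottom in projection_runs(row_profile):
        band = binary[top:bottom]
        col_profile = [sum(col) for col in zip(*band)]   # ink per column in band
        for left, right in projection_runs(col_profile):
            regions.append((top, left, bottom, right))
    return regions
```

Each returned box could then be cropped and passed to a classifier that decides whether it is a text region or a formula region.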

As a preferred embodiment of the present invention, in step (2) the characters of the located mathematical calculation problem are segmented and recognized as follows: a mathematical symbol library is constructed; according to the definition of the library, a certain number of handwritten samples of each symbol are collected as a data set, and a deep learning object detection model is trained on it to obtain a classifier for recognizing mathematical symbols; the symbols in the image matrix signal are recognized to obtain the recognized character sequence $\{c_1, c_2, \ldots, c_n\}$; each element of the sequence contains the category, size and spatial coordinate information of the character.
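One possible in-memory form of such a character sequence is sketched below: each recognized symbol is a small record holding its category, size and spatial coordinates. The class name `Symbol`, its field names, and the choice to order symbols by horizontal center are assumptions made for illustration; the detector that would produce these records is not shown.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Symbol:
    """One recognized character: category plus bounding box (hypothetical layout)."""
    category: str   # class label from the symbol library, e.g. "1" or "+"
    x: float        # left edge of the bounding box
    y: float        # top edge of the bounding box
    width: float
    height: float

    @property
    def center(self):
        """Center point of the bounding box as (cx, cy)."""
        return (self.x + self.width / 2, self.y + self.height / 2)

def reading_order(symbols):
    """Assumption: order symbols left to right by the x coordinate of their center."""
    return sorted(symbols, key=lambda s: s.center[0])
```

For example, detections of `2`, `1` and `+` returned in arbitrary order can be passed to `reading_order` to recover the sequence `1`, `+`, `2` before structural analysis.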

As a preferred embodiment of the present invention, in the step (3), the structure analysis is performed on the mathematical computation questions by a method comprising:

(3-1) designing an automaton that identifies the elements of the mathematical formula according to the category, size and spatial coordinate information of the characters, forming the minimal element sequence $\{t_1, t_2, \ldots, t_n\}$ of the formula, where each element $t_i$ of the sequence consists of at least one mathematical symbol and the elements are the smallest units of a mathematical calculation problem;

(3-2) defining a syntax rule of the mathematical computation problem, and identifying the structure of the mathematical computation problem through syntax analysis; in the process of analyzing the structure of the calculation problem, mathematical elements are continuously merged into subtrees according to the syntactic rules until a parse tree T of the semantics of the mathematical calculation problem is finally obtained.

Further, in step (3-2), a parser is constructed to identify the structure of the mathematical calculation problem; the parser receives the element sequence as input and analyzes the structure using operator precedence or recursive descent parsing.
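Steps (3-1) and (3-2) can be illustrated with a deliberately small sketch. It assumes the characters are already in reading order and handles only one-line arithmetic with `+`, `-`, `*`, `/` and parentheses; two-dimensional relations such as superscripts or fractions are ignored. The element grouping, the grammar and every name below are assumptions, and recursive descent is used simply because it is one of the two parsing strategies named above.

```python
# Illustrative sketch only; grammar and names are assumptions.
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | '(' expr ')'

def group_elements(categories):
    """Tiny automaton: consecutive digit or '.' symbols merge into one number."""
    elements, number = [], ""
    for cat in categories:
        if cat.isdigit() or cat == ".":
            number += cat
            continue
        if number:
            elements.append(("num", number))
            number = ""
        elements.append(("op", cat))
    if number:
        elements.append(("num", number))
    return elements

class Parser:
    def __init__(self, elements):
        self.elements = elements
        self.pos = 0

    def peek(self):
        if self.pos < len(self.elements):
            return self.elements[self.pos]
        return ("end", "")

    def take(self):
        element = self.peek()
        self.pos += 1
        return element

    def expr(self):
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.take()[1]
            node = (op, node, self.term())   # merge two subtrees into one
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            op = self.take()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, text = self.take()
        if kind == "num":
            return ("num", float(text))
        if (kind, text) == ("op", "("):
            node = self.expr()
            if self.take() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected element {text!r}")

def parse(categories):
    """Build a parse tree (nested tuples) from a list of symbol categories."""
    parser = Parser(group_elements(categories))
    tree = parser.expr()
    if parser.peek()[0] != "end":
        raise SyntaxError("trailing elements after expression")
    return tree
```

`parse(list("12+3*(4-1)"))` returns a nested tuple whose root is the `+` node, with the multiplication as its right subtree, so operator precedence is encoded in the shape of the tree.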

As a preferred embodiment of the present invention, in step (4), semantic analysis is performed according to the parse tree to realize automatic solution of the mathematical computation problem, and the method is as follows:

(4-1) writing the attributes of the tree nodes and the corresponding calculation methods for the parse tree T according to the mathematical meaning of the mathematical symbols;

(4-2) performing bottom-up semantic analysis and solving the mathematical expression automatically by a post-order traversal of the parse tree, specifically: the parse tree is traversed in post-order, and each node calls its own calculation method according to its type and returns the result to its parent node; finally, the root node of the parse tree yields the answer to the mathematical calculation problem;

(4-3) obtaining the LaTeX output, the calculation steps and the solution of the mathematical calculation problem and returning them to the user, thereby realizing the automatic solving of the mathematical calculation problem.

Further, in the step (4-1), the attributes of the tree node at least include a state attribute and a value attribute, the state attribute indicates whether the node has been calculated, whether the node is a constant or a variable, and the like; the value attribute is a result calculated according to mathematical semantics.
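The post-order evaluation with state and value attributes can be sketched as follows. This is a minimal illustration assuming a binary arithmetic tree; the `Node` class, the `"pending"`/`"computed"` state labels, the `CALCULATORS` table and the textual step log are all hypothetical choices, and variables, functions and LaTeX output are not covered.

```python
import operator

# Illustrative sketch only; class, state labels and table are assumptions.
CALCULATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

class Node:
    """Parse-tree node with a state attribute and a value attribute."""
    def __init__(self, kind, children=(), value=None):
        self.kind = kind                   # "num" or an operator symbol
        self.children = list(children)
        self.value = value
        # Constants are already computed; operator nodes start as pending.
        self.state = "computed" if kind == "num" else "pending"

def evaluate(node, steps):
    """Post-order traversal: evaluate the children first, then the node itself."""
    if node.state == "computed":
        return node.value
    left, right = (evaluate(child, steps) for child in node.children)
    node.value = CALCULATORS[node.kind](left, right)   # node-type-specific method
    node.state = "computed"
    steps.append(f"{left} {node.kind} {right} = {node.value}")
    return node.value                      # result is returned to the parent

def num(value):
    return Node("num", value=value)

# Tree for 12 + 3 * (4 - 1)
tree = Node("+", [num(12), Node("*", [num(3), Node("-", [num(4), num(1)])])])
steps = []
answer = evaluate(tree, steps)
```

After the call, `answer` holds the attribute value of the root node and `steps` lists the intermediate calculations in the order they were performed, innermost subtree first, which is the kind of step-by-step trace that could be returned to the user alongside the result.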

A system for automatically solving a mathematical problem, comprising:

the information acquisition module is used for acquiring mathematical calculation problem information to be solved and converting the mathematical calculation problem information into a corresponding image matrix signal;

the character extraction module is used for locating the mathematical calculation problem in the image matrix signal, then segmenting and recognizing its characters to obtain a character sequence;

the structure analysis module is used for performing structural analysis of the mathematical calculation problem using the category, size and spatial coordinate information of the character sequence: the problem is parsed with a method based on syntactic rules, the characters are merged step by step, and a tree-structured representation of the problem, namely a parse tree, is finally obtained;

the semantic analysis module is used for defining attributes and calculation methods for the nodes of the parse tree according to the mathematical semantic rules of the symbols; traversing the parse tree in post-order so that the attribute values of its nodes propagate from bottom to top until the attribute value of the root node is obtained; and taking the attribute value of the root node as the answer to the mathematical calculation problem, completing its automatic solution.

Compared with the prior art, the invention has the following advantages and beneficial effects:

1) The invention accepts mathematical calculation problems as pictures taken by a camera, as input from a handwriting device, or as scans, which removes the limitation that such problems can only be entered through a keyboard. It forms a complete algorithm for recognizing handwritten mathematical calculation problems and solving them automatically, and therefore has better practicability.

2) The invention solves mathematical calculation problems automatically by recognizing their structure and semantics; once the types and symbols of a class of problems are defined, problems of that class can be solved automatically. In theory the method can therefore handle many kinds of problems, including arithmetic operations, fraction operations, definite integrals, equation solving, derivatives and limits.

3) Solving mathematical calculation problems by recognizing their structure and semantics differs fundamentally from the current approach of recognizing text and searching a question bank. A question-bank search tends to return answers that do not match the question when the question is absent from the bank, whereas the method of the invention returns a correct analysis provided the problem types and symbols are well defined.

Drawings

FIG. 1 is a flow chart of a method of the present invention;

FIG. 2 is an example of a mathematical symbol library in an embodiment of the present invention;

FIG. 3 is a mathematical computation question parsing tree constructed according to an embodiment of the present invention;

FIG. 4 is a flow chart of automatically solving a mathematical computational problem according to an embodiment of the present invention;

FIG. 5 is a diagram illustrating an example of automatic problem solving according to the present invention.

Detailed Description

The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
