Entropy maximization card-smearing identification method based on pixel probability distribution statistics

Document No.: 1832024. Publication date: 2021-11-12.

Reading note: this technology, an entropy maximization card-smearing identification method based on pixel probability distribution statistics, was designed by 田博帆 on 2021-07-20. Main content: the invention relates to an entropy maximization card-smearing identification method based on pixel probability distribution statistics, which specifically comprises the following steps: S1, answer sheet alignment: align the coated answer sheet with the blank answer sheet; S2, template subtraction: store the digitized matrix images of the blank answer sheet and the coated answer sheet as read in, then subtract the two digitized matrices to obtain the pixel difference matrix; S3, answer area positioning; S4, cropping the answer areas: using the filled-in answer area coordinates obtained for each question in step S3, crop each question separately to obtain the blank answer sheet cropped answer area and the coated answer sheet cropped answer area; S5, region pixel gray-level statistics; S6, calculation of the maximized entropy of the images of the blank answer sheet cropped answer area and the coated answer sheet cropped answer area; S7, recognition result judgment to obtain the card-smearing result.

1. An entropy maximization card-smearing identification method based on pixel probability distribution statistics is characterized by specifically comprising the following steps of:

S1, answer sheet alignment: align the coated answer sheet with the blank answer sheet;

S2, template subtraction: store the digitized matrix images of the blank answer sheet and the coated answer sheet as read in, denoted $I_0$ and $I_1$ respectively; then subtract the digitized matrices of the blank answer sheet and the coated answer sheet to obtain the pixel difference matrix $I_d$;

S3, answer area positioning: first, identify the question number of each question to obtain the coordinates of the question-number characters; then sort and analyze the question-number coordinates in question order to obtain the horizontal and vertical layout of the questions, and record the coordinate position of the character corresponding to each question number; after further processing and analysis, obtain the coordinates of the filled-in answer area corresponding to each question and the total number of answers, and number the answers;

S4, cropping the answer areas: using the filled-in answer area coordinates obtained for each question in step S3, crop each question separately to obtain the blank answer sheet cropped answer area and the coated answer sheet cropped answer area;

S5, region pixel gray-level statistics: from the pixel difference matrix $I_d$ obtained in step S2, count the distribution $D(X)$ of the pixel gray values, and combine it with $I_d$ to obtain the probability density function $F_X(x)$ of the filled, approximately black pixels;

S6, image maximized-entropy calculation: from the probability density function $F_X(x)$ of step S5 and the pixel difference matrix $I_d$, calculate the maximized entropy of the images of the blank answer sheet cropped answer area and the coated answer sheet cropped answer area;

S7, recognition result judgment: after steps S1 to S6 have produced the maximized image entropy of the blank answer sheet cropped answer area and the coated answer sheet cropped answer area, calculate the proportion $P$ of filled, approximately black pixels and judge the card-smearing result.

2. An entropy-maximized card-filling identification method based on pixel probability distribution statistics as claimed in claim 1, wherein the step S1 of answer sheet alignment specifically comprises the following steps:

S11: select two answer sheets as the blank answer sheet and the coated answer sheet respectively, wherein both sheets have a background color and the background color of the blank answer sheet is not pure white;

S12: crop an image region of an arbitrary fixed size from the upper-left corner point (0,0) of the blank answer sheet and of the coated answer sheet to obtain a blank answer sheet region image and a coated answer sheet region image, and compute the coordinate offset and the scaling factor $s$ between the two region images with an affine transformation algorithm; the affine transformation algorithm introduces a homogeneous transformation matrix with translation to scale by a factor of $s$, and applies it to the image to be corrected, mapping it to the corrected image matrix to obtain the corrected image; the specific transformation is given by formula 1-1:

the transformation matrix of the affine transformation has 6 degrees of freedom and is expressed as $(s\cos\theta,\ -s\sin\theta,\ t_x,\ s\sin\theta,\ s\cos\theta,\ t_y)$, where $s$ is the scaling factor; $\theta$ is the angle defined for the affine iteration; $t$ denotes the translational degrees of freedom, $t_x$ being the translation in the horizontal direction and $t_y$ the translation in the vertical direction;

S13: apply the coordinate offset and the scaling factor $s$ of the blank answer sheet region image to the coated answer sheet image so that the blank answer sheet and the coated answer sheet are approximately aligned.

3. An entropy-maximized card-smearing identification method based on pixel probability distribution statistics as claimed in claim 2, wherein the step S3 of positioning the card-smearing answer area comprises the specific steps of:

S31: use a YOLOv3 object detection network to recognize and detect the question numbers in the blank answer sheet and obtain the specific coordinates of each question-number character; sort the character coordinates top to bottom and left to right in question order to obtain the horizontal and vertical layout of the questions, and record the coordinate positions of the characters corresponding to all question numbers;

S32: use an object detection algorithm and a horizontal projection algorithm to further process and analyze the character images of the blank answer sheet in order to identify and locate the coordinate positions of the answers;

S33: from the recorded coordinates of the question-number characters, count the total number of answers in the filled-in answer area corresponding to each question and number the answers in sequence.

4. An entropy-maximized card-smearing identification method based on pixel probability distribution statistics as claimed in claim 3, wherein the object detection algorithm in step S32 is the Fast R-CNN algorithm or the SSD algorithm.

5. An entropy-maximized card-smearing identification method based on pixel probability distribution statistics as claimed in claim 4, wherein the step S5 comprises the following steps:

S51: from the pixel difference matrix $I_d$ obtained in step S2, count the distribution $D(X)$ of the pixel gray values, where $X$ is a continuous random variable representing the pixel gray value;

S52: based on the distribution statistics of the pixel gray values, set a user-defined gray-value range $R_i$, written as $R_i = [v_1, v_2]$ with $v_1 < v_2$, $v_1 \in [0, 255)$, $v_2 \in (0, 255]$;

where $v_1$ and $v_2$ are both pixel gray values;

S53: then, from the pixel difference matrix $I_d$ and the gray-value distribution $D(X)$, obtain the probability density function $F_X(x)$ of the filled, approximately black pixels, written as:

6. An entropy-maximized card-smearing identification method based on pixel probability distribution statistics as claimed in claim 4, wherein the calculation formula in step S6 is given by formula 1-2;

where $x_i$ is a pixel value of the pixel difference matrix $I_d$ and $N$ is the number of pixel values not greater than zero.

7. An entropy-maximized card-smearing identification method based on pixel probability distribution statistics as claimed in claim 6, wherein the calculation formula of the proportion size in the step S7 is shown as 1-3:

where $I_d$ is the pixel difference matrix.

8. An entropy-maximized card-smearing identification method based on pixel probability distribution statistics as claimed in claim 7, wherein the card-smearing result in step S7 is judged as follows: in view of the filling norms for coated answer sheets, the user-defined threshold for the proportion and the empirical threshold for the maximized entropy are set to 70% and 70 respectively; if this condition is not met, the area is judged not filled; otherwise it is judged filled.

Technical Field

The invention relates to the technical field of computer application, in particular to an entropy maximization card-smearing identification method based on pixel probability distribution statistics.

Background

As technology continues to iterate, many new techniques have emerged in the field of automatic marking. For example, some automatic identification methods overcome the shortcomings of traditional methods and offer greater convenience and a better user experience, and are therefore widely used.

For answer sheet fill-in identification, the traditional method has many limitations and usually requires a dedicated scanning card reader. Because it uses the carbon content of the marking pigment as the reference for fill-in recognition, it places strict requirements on the recognition scene and the filling norms, and also on the answer sheet paper. The traditional method is limited not only by paper thickness, the type of pencil used, and the like, but also requires a fixed, standard answer sheet layout. These prerequisites make it inconvenient for general use and place higher demands on how examinees fill in the sheet.

In the prior art, answer sheet recognition relies on image data obtained from a scanning system with a standard format. On a smartphone or other mobile terminal, photographs of answer sheets are affected by objective factors, so images captured by different phones in different environments differ from one another, which seriously interferes with correct recognition of the answer sheet content.

Disclosure of Invention

The technical problem to be solved by the invention is to provide an entropy maximization card-smearing identification method based on pixel probability distribution statistics that recognizes filled-in answers without being limited by the fill-in form or the paper style of the answer sheet. It mainly addresses the shortcomings of the traditional method, removes many of its constraints, and makes recognition more reasonable and user-friendly.

In order to solve the technical problems, the invention adopts the technical scheme that: the entropy maximization card-smearing identification method based on pixel probability distribution statistics specifically comprises the following steps:

S1, answer sheet alignment: align the coated answer sheet with the blank answer sheet;

S2, template subtraction: store the digitized matrix images of the blank answer sheet and the coated answer sheet as read in, denoted $I_0$ and $I_1$ respectively; then subtract the digitized matrix of the blank answer sheet from that of the coated answer sheet to obtain the pixel difference matrix $I_d$, written as $I_d = I_1 - I_0$;

S3, answer area positioning: first, identify the question number of each question to obtain the coordinates of the question-number characters; then sort and analyze the question-number coordinates in question order to obtain the horizontal and vertical layout of the questions, and record the coordinate position of the character corresponding to each question number; after further processing and analysis, obtain the coordinates of the filled-in answer area corresponding to each question and the total number of answers, and number the answers;

S4, cropping the answer areas: using the filled-in answer area coordinates obtained for each question in step S3, crop each question separately to obtain the blank answer sheet cropped answer area and the coated answer sheet cropped answer area;

S5, region pixel gray-level statistics: from the pixel difference matrix $I_d$ obtained in step S2, count the distribution $D(X)$ of the pixel gray values, and combine it with $I_d$ to obtain the probability density function $F_X(x)$ of the filled, approximately black pixels;

S6, image maximized-entropy calculation: from the probability density function $F_X(x)$ of step S5 and the pixel difference matrix $I_d$, calculate the maximized entropy of the images of the blank answer sheet cropped answer area and the coated answer sheet cropped answer area;

S7, recognition result judgment: after steps S1 to S6 have produced the maximized image entropy of the blank answer sheet cropped answer area and the coated answer sheet cropped answer area, calculate the proportion $P$ of filled, approximately black pixels and judge the card-smearing result.

With this technical scheme, an image positioning technique is first used to locate and crop the answer areas corresponding to the blank answer sheet and the coated answer sheet; then the probability distribution of pixel gray values in the cropped answer area images is counted and the corresponding maximized entropy is calculated; finally, the answer result is judged from the relationship between the answer areas, realizing card-smearing identification of the answer sheet. The method is not limited by the fill-in form or the paper style of the answer sheet, addresses the shortcomings of the traditional method, removes many of its constraints, and makes recognition more reasonable and user-friendly.

As a preferred technical solution of the present invention, the step S1 of aligning the answer sheets specifically includes the following steps:

S11: select two answer sheets as the blank answer sheet and the coated answer sheet respectively, wherein both sheets have a background color and the background color of the blank answer sheet is not pure white;

S12: crop an image region of an arbitrary fixed size from the upper-left corner point (0,0) of the blank answer sheet and of the coated answer sheet to obtain a blank answer sheet region image and a coated answer sheet region image, and compute the coordinate offset and the scaling factor $s$ between the two region images with an affine transformation algorithm; the affine transformation algorithm introduces a homogeneous transformation matrix with translation to scale by a factor of $s$, and applies it to the image to be corrected, mapping it to the corrected image matrix to obtain the corrected image; the specific transformation is given by formula 1-1;

the transformation matrix of the affine transformation has 6 degrees of freedom and is expressed as $(s\cos\theta,\ -s\sin\theta,\ t_x,\ s\sin\theta,\ s\cos\theta,\ t_y)$, where $s$ is the scaling factor; $\theta$ is the angle defined for the affine iteration; $t$ denotes the translational degrees of freedom, $t_x$ being the translation in the horizontal direction and $t_y$ the translation in the vertical direction.

S13: apply the coordinate offset and the scaling factor $s$ of the blank answer sheet region image to the coated answer sheet image so that the blank answer sheet and the coated answer sheet are approximately aligned.
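As an illustration only, the transformation of steps S12 and S13 could be sketched in Python as follows. The sketch assumes that formula 1-1 is the usual homogeneous form built from the six listed entries, with a fixed bottom row (0, 0, 1); the names `similarity_matrix` and `transform_points` and the numeric values are hypothetical and not part of the invention.

```python
import numpy as np

def similarity_matrix(s, theta, tx, ty):
    """Build a 3x3 homogeneous matrix from the six listed entries."""
    c, n = s * np.cos(theta), s * np.sin(theta)
    return np.array([[c, -n, tx],
                     [n,  c, ty],
                     [0.0, 0.0, 1.0]])  # assumed bottom row

def transform_points(matrix, points):
    """Apply the matrix to an (N, 2) array of (x, y) coordinates."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    return (homogeneous @ matrix.T)[:, :2]

# Hypothetical values: scale by 2, no rotation, shift by (10, -5).
M = similarity_matrix(2.0, 0.0, 10.0, -5.0)
corners = transform_points(M, [[0, 0], [100, 50]])
```

In practice the same matrix would be applied to the whole coated answer sheet image rather than to a few points; the point form is used here only to keep the sketch short.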

As a preferred technical solution of the present invention, the step S3 of positioning the card-applying response area specifically comprises the steps of:

S31: use a YOLOv3 object detection network to recognize and detect the question numbers in the blank answer sheet; the input image is divided into grids of 13x13, 26x26 and 52x52, and each grid cell is responsible for predicting the question-number character whose center falls within it; each grid cell predicts 3 bounding boxes, and each bounding box prediction includes the specific coordinates, the confidence, and the class probabilities of the character; this yields the specific coordinates of each question-number character; the character coordinates are then sorted top to bottom and left to right in question order to obtain the horizontal and vertical layout of the questions, and the coordinate positions of the characters corresponding to all question numbers are recorded;

S32: use an object detection algorithm and a horizontal projection algorithm to further process and analyze the character images of the blank answer sheet in order to identify and locate the coordinate positions of the answers; the horizontal projection algorithm supplements answers missed by object detection: for example, if object detection misses the character B, the horizontal projection is sorted and analyzed in order and, on the principle that the characters are equally spaced in the horizontal direction, the missing answer is confirmed to be the character B;

S33: from the recorded coordinates of the question-number characters, count the total number of answers in the filled-in answer area corresponding to each question and number the answers in sequence; the answers are numbered 1, 2, 3, 4, and the corresponding results are A, B, C, D.
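The equal-spacing principle of step S32 and the numbering of step S33 can be illustrated with a minimal Python sketch. It works from the x-centers of the detected option characters rather than from a raw projection profile, and it assumes that the first option was detected, that the centers are distinct, and that at least two detected options are adjacent, so the smallest gap equals one spacing step. The function name `fill_missing_options` and the coordinates are hypothetical.

```python
def fill_missing_options(x_centers, labels="ABCD"):
    """Return {label: x_center} for one question, inferring missed options."""
    xs = sorted(x_centers)
    if len(xs) < 2:
        raise ValueError("need at least two detected options")
    # Assumption: the smallest gap between detections is one spacing step.
    step = min(b - a for a, b in zip(xs, xs[1:]))
    filled = [xs[0]]
    for a, b in zip(xs, xs[1:]):
        # A gap of k steps means k - 1 options were missed in between.
        k = round((b - a) / step)
        filled.extend(a + (b - a) * i / k for i in range(1, k))
        filled.append(b)
    # Positions are numbered in order and mapped to A, B, C, D.
    return dict(zip(labels, filled))

# Hypothetical detections where the character B was missed.
options = fill_missing_options([100, 180, 220])
```

Here the gap between 100 and 180 is twice the smallest gap, so one option is inferred midway and receives the label B.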

As a preferred embodiment of the present invention, the object detection algorithm in step S32 is the Fast R-CNN algorithm or the SSD algorithm.

As a preferred technical solution of the present invention, the step S5 specifically includes:

S51: from the pixel difference matrix $I_d$ obtained in step S2, count the distribution of the pixel gray values, denoted $D(X)$, where $X$ is a continuous random variable representing the pixel gray value;

S52: based on the distribution statistics of the pixel gray values, set a user-defined gray-value range $R_i$, written as $R_i = [v_1, v_2]$ with $v_1 < v_2$, $v_1 \in [0, 255)$, $v_2 \in (0, 255]$;

where $v_1$ and $v_2$ are both pixel gray values;

S53: then, from the pixel difference matrix $I_d$ and the gray-value distribution $D(X)$, obtain the probability density function $F_X(x)$ of the filled, approximately black pixels, written as:

As a preferred embodiment of the present invention, the calculation formula in step S6 is given by formula 1-2;

where $x_i$ is a pixel value of the pixel difference matrix $I_d$ and $N$ is the number of pixel values not greater than zero.

As a preferred embodiment of the present invention, the calculation formula of the proportion size in step S7 is shown in 1-3:

where $I_d$ is the pixel difference matrix.

As a preferred technical solution of the present invention, the card-smearing result in step S7 is judged as follows: in view of the filling norms for coated answer sheets, the user-defined threshold for the proportion and the empirical threshold for the maximized entropy are set to 70% and 70 respectively; if this condition is not met, the area is judged not filled; otherwise it is judged filled.
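The judgment rule can be sketched in Python as below. The text does not spell out how the two thresholds are combined, so the sketch assumes that the condition is met when both the proportion and the maximized entropy reach their thresholds; the names `is_filled` and `filled_options` and the sample measurements are hypothetical.

```python
PROPORTION_THRESHOLD = 0.70  # 70 %, as stated in the text
ENTROPY_THRESHOLD = 70.0     # empirical value stated in the text

def is_filled(proportion, max_entropy):
    """Judge one answer area.

    Assumption: the condition is met when BOTH values reach their thresholds.
    """
    return proportion >= PROPORTION_THRESHOLD and max_entropy >= ENTROPY_THRESHOLD

def filled_options(measurements):
    """measurements maps an option label to (proportion, max_entropy)."""
    return [label for label, (p, h) in measurements.items() if is_filled(p, h)]

# Hypothetical measurements for one question.
answer = filled_options({"A": (0.10, 5.0), "B": (0.85, 92.0),
                         "C": (0.75, 40.0), "D": (0.02, 1.0)})
```

Returning a list rather than a single label leaves room for multiple-choice questions and for questions left blank.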

Compared with the prior art, the invention has the following beneficial effects: an image positioning technique is first used to locate and crop the answer areas corresponding to the blank answer sheet and the coated answer sheet; then the probability distribution of pixel gray values in the cropped answer area images is counted and the corresponding maximized entropy is calculated; finally, the answer result is judged from the relationship between the answer areas, realizing card-smearing identification of the answer sheet. The method is not limited by the fill-in form or the paper style of the answer sheet, addresses the shortcomings of the traditional method, removes many of its constraints, and makes recognition more reasonable and user-friendly.

Drawings

The technical scheme of the invention is further described by combining the accompanying drawings as follows:

FIG. 1 is a flow chart of an entropy maximization card-smearing identification method based on pixel probability distribution statistics of the present invention;

FIG. 2 is a diagram of the effect of recognition by the entropy maximization card-smearing recognition method based on pixel probability distribution statistics.

Detailed Description

For the purpose of enhancing the understanding of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and examples, which are provided for the purpose of illustration only and are not intended to limit the scope of the present invention.

Example: as shown in FIG. 1, the entropy maximization card-smearing identification method based on pixel probability distribution statistics specifically includes the following steps:

S1, answer sheet alignment: align the coated answer sheet with the blank answer sheet;

the answer sheet alignment of step S1 specifically includes the following steps:

S11: select two answer sheets as the blank answer sheet and the coated answer sheet respectively, wherein both sheets have a background color and the background color of the blank answer sheet is not pure white;

S12: crop an image region of an arbitrary fixed size from the upper-left corner point (0,0) of the blank answer sheet and of the coated answer sheet to obtain a blank answer sheet region image and a coated answer sheet region image, and compute the coordinate offset and the scaling factor $s$ between the two region images with an affine transformation algorithm; the affine transformation algorithm introduces a homogeneous transformation matrix with translation to scale by a factor of $s$, and applies it to the image to be corrected, mapping it to the corrected image matrix to obtain the corrected image; the specific transformation is given by formula 1-1:

the transformation matrix of the affine transformation has 6 degrees of freedom and is expressed as $(s\cos\theta,\ -s\sin\theta,\ t_x,\ s\sin\theta,\ s\cos\theta,\ t_y)$, where $s$ is the scaling factor; $\theta$ is the angle defined for the affine iteration; $t$ denotes the translational degrees of freedom, $t_x$ being the translation in the horizontal direction and $t_y$ the translation in the vertical direction;

S13: apply the coordinate offset and the scaling factor $s$ of the blank answer sheet region image to the coated answer sheet image so that the blank answer sheet and the coated answer sheet are approximately aligned;

S2, template subtraction: store the digitized matrix images of the blank answer sheet and the coated answer sheet as read in, denoted $I_0$ and $I_1$ respectively; then subtract the digitized matrix of the blank answer sheet from that of the coated answer sheet to obtain the pixel difference matrix $I_d$, written as $I_d = I_1 - I_0$;
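A minimal Python sketch of the subtraction $I_d = I_1 - I_0$ follows. It assumes both sheets are already aligned 8-bit grayscale arrays of the same shape; the conversion to a signed type is an implementation choice of this sketch, made because unsigned 8-bit subtraction would wrap around for darker pixels. The name `pixel_difference` and the sample values are hypothetical.

```python
import numpy as np

def pixel_difference(blank, coated):
    """Return I_d = I_1 - I_0 as signed integers."""
    i0 = np.asarray(blank, dtype=np.int16)
    i1 = np.asarray(coated, dtype=np.int16)
    if i0.shape != i1.shape:
        raise ValueError("sheets must be aligned to the same shape")
    return i1 - i0

# Hypothetical 2x2 patches: one pixel filled in, one slightly darker.
blank = np.array([[200, 200], [200, 200]], dtype=np.uint8)
coated = np.array([[200, 30], [190, 200]], dtype=np.uint8)
diff = pixel_difference(blank, coated)
```

With this sign convention a filled, darker pixel yields a negative difference, which is consistent with the later reference to pixel values not greater than zero.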

S3, answer area positioning: first, identify the question number of each question to obtain the coordinates of the question-number characters; then sort and analyze the question-number coordinates in question order to obtain the horizontal and vertical layout of the questions, and record the coordinate position of the character corresponding to each question number; after further processing and analysis, obtain the coordinates of the filled-in answer area corresponding to each question and the total number of answers, and number the answers;

the answer area positioning of step S3 comprises the following specific steps:

S31: use a YOLOv3 object detection network to recognize and detect the question numbers in the blank answer sheet; the input image is divided into grids of 13x13, 26x26 and 52x52, and each grid cell is responsible for predicting the question-number character whose center falls within it; each grid cell predicts 3 bounding boxes, and each bounding box prediction includes the specific coordinates, the confidence, and the class probabilities of the character; this yields the specific coordinates of each question-number character; the character coordinates are then sorted top to bottom and left to right in question order to obtain the horizontal and vertical layout of the questions, and the coordinate positions of the characters corresponding to all question numbers are recorded;

S32: use an object detection algorithm and a horizontal projection algorithm to further process and analyze the character images of the blank answer sheet in order to identify and locate the coordinate positions of the answers; the horizontal projection algorithm supplements answers missed by object detection: for example, if object detection misses the character B, the horizontal projection is sorted and analyzed in order and, on the principle that the characters are equally spaced in the horizontal direction, the missing answer is confirmed to be the character B; the object detection algorithm is the Fast R-CNN algorithm or the SSD algorithm;

S33: from the recorded coordinates of the question-number characters, count the total number of answers in the filled-in answer area corresponding to each question and number the answers in sequence; the answers are numbered 1, 2, 3, 4, and the corresponding results are A, B, C, D;
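The top-to-bottom, left-to-right ordering of the detected question-number boxes in step S31 might be implemented along the following lines. This is a sketch under stated assumptions: boxes are given as (x, y, w, h) tuples, boxes whose vertical centers differ by no more than a chosen tolerance belong to the same row, and the name `group_into_rows` and the tolerance value are hypothetical.

```python
def group_into_rows(boxes, row_tolerance):
    """Group (x, y, w, h) boxes into rows, top to bottom, each sorted left to right."""
    rows = []
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2):
        cy = box[1] + box[3] / 2  # vertical center of the box
        if rows and abs(cy - rows[-1]["cy"]) <= row_tolerance:
            rows[-1]["boxes"].append(box)
        else:
            # The first box of a row fixes that row's reference center.
            rows.append({"cy": cy, "boxes": [box]})
    return [sorted(r["boxes"], key=lambda b: b[0]) for r in rows]

# Hypothetical detections for four question numbers in a 2x2 layout.
detections = [(120, 12, 10, 10), (20, 10, 10, 10),
              (20, 50, 10, 10), (120, 49, 10, 10)]
layout = group_into_rows(detections, row_tolerance=5)
```

The resulting nested list gives the horizontal and vertical layout directly: the outer index is the row and the inner index is the position within that row.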

S4, cropping the answer areas: using the filled-in answer area coordinates obtained for each question in step S3, crop each question separately to obtain the blank answer sheet cropped answer area and the coated answer sheet cropped answer area;
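Cropping the same answer areas from both aligned sheets amounts to array slicing, as in the minimal Python sketch below. It assumes the coordinates from step S3 are available as (x, y, w, h) boxes in pixel units; the name `crop_regions` and the sample box are hypothetical.

```python
import numpy as np

def crop_regions(blank, coated, boxes):
    """Crop the same (x, y, w, h) boxes from both aligned sheets."""
    crops = []
    for x, y, w, h in boxes:
        # Rows are indexed by y and columns by x in image arrays.
        crops.append((blank[y:y + h, x:x + w], coated[y:y + h, x:x + w]))
    return crops

# Hypothetical 10x20 sheets and one answer box.
blank = np.zeros((10, 20), dtype=np.uint8)
coated = np.ones((10, 20), dtype=np.uint8)
pairs = crop_regions(blank, coated, [(2, 3, 4, 5)])
```

Each returned pair holds the blank and coated crops of one answer area, so later steps can compare them directly.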

S5, region pixel gray-level statistics: from the pixel difference matrix $I_d$ obtained in step S2, count the distribution $D(X)$ of the pixel gray values, and combine it with $I_d$ to obtain the probability density function $F_X(x)$ of the filled, approximately black pixels;

the specific steps of step S5 are:

S51: from the pixel difference matrix $I_d$ obtained in step S2, count the distribution of the pixel gray values, denoted $D(X)$, where $X$ is a continuous random variable representing the pixel gray value;

S52: based on the distribution statistics of the pixel gray values, set a user-defined gray-value range $R_i$, written as $R_i = [v_1, v_2]$ with $v_1 < v_2$, $v_1 \in [0, 255)$, $v_2 \in (0, 255]$;

where $v_1$ and $v_2$ are both pixel gray values;

S53: from the pixel difference matrix $I_d$ and the gray-value distribution $D(X)$, obtain the probability density function $F_X(x)$ of the filled, approximately black pixels, written as:

S6, image maximized-entropy calculation: from the probability density function $F_X(x)$ of step S5 and the pixel difference matrix $I_d$, calculate the maximized entropy of the images of the blank answer sheet cropped answer area and the coated answer sheet cropped answer area;

the calculation formula in step S6 is given by formula 1-2;

where $x_i$ is a pixel value of the pixel difference matrix $I_d$ and $N$ is the number of pixel values not greater than zero;

S7, recognition result judgment: after steps S1 to S6 have produced the maximized image entropy of the blank answer sheet cropped answer area and the coated answer sheet cropped answer area, calculate the proportion $P$ of filled, approximately black pixels; in view of the filling norms for coated answer sheets, the user-defined threshold for the proportion and the empirical threshold for the maximized entropy are set to 70% and 70 respectively; if this condition is not met, the area is judged not filled; otherwise it is judged filled, giving the card-smearing result shown in FIG. 2.

The calculation formula of the proportion in step S7 is given by formula 1-3:

where $I_d$ is the pixel difference matrix.
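The gray-level statistics of step S5 and a proportion of the kind used in step S7 could be computed along these lines. This is not a reproduction of formulas 1-2 or 1-3, which are not shown in the text: the sketch assumes that the statistics are taken over the magnitude of the difference values and that the proportion is the fraction of pixels whose magnitude falls inside the range $[v_1, v_2]$. The names `gray_distribution` and `proportion_in_range` and the sample values are hypothetical.

```python
import numpy as np

def gray_distribution(diff):
    """Normalised histogram of |I_d| over the 256 gray levels (assumed form of D(X))."""
    magnitudes = np.abs(np.asarray(diff)).astype(np.int64).ravel()
    counts = np.bincount(np.clip(magnitudes, 0, 255), minlength=256)
    return counts / counts.sum()

def proportion_in_range(diff, v1, v2):
    """Fraction of pixels whose |I_d| lies in the closed range [v1, v2]."""
    dist = gray_distribution(diff)
    return float(dist[v1:v2 + 1].sum())

# Hypothetical difference patch: one of four pixels changed strongly.
diff = np.array([[0, -170], [-10, 0]], dtype=np.int16)
p = proportion_in_range(diff, 100, 255)
```

Choosing the range bounds is left to the user, in line with the user-defined range $R_i$ of step S52.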

It is obvious to those skilled in the art that the present invention is not limited to the above embodiments, and it is within the scope of the present invention to adopt various insubstantial modifications of the method concept and technical scheme of the present invention, or to directly apply the concept and technical scheme of the present invention to other occasions without modification.
