Image-text separation device, image-text separation method, and computer-readable recording medium
Reading note: this technology, Image-text separation device, image-text separation method, and computer-readable recording medium (图文分离装置、图文分离方法及计算机可读取记录介质), was designed and created by 雷凯 on 2019-03-18. Its main content is as follows. The invention relates to an image-text separation device, an image-text separation method, and a computer-readable recording medium for performing image-text separation on a halftone image taken as the image to be separated. The image-text separation device comprises: a blocking part that divides the image to be separated into a plurality of image blocks; a gray complexity analyzing part that analyzes each image block in sequence based on a mean difference algorithm to obtain the gray complexity of each image block; a first determination unit that determines each image block whose gray complexity is greater than a complexity threshold as a picture block constituting a picture area, and each image block whose gray complexity is not greater than the complexity threshold as a character block constituting a text area; an overall contour extraction part that obtains the image contours of the image to be separated; and a second determination unit that sequentially determines, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, determines all the image blocks through which that image contour passes as picture blocks.
1. An image-text separation device for performing image-text separation on a halftone image containing picture content and text content as an image to be separated so as to obtain a picture area corresponding to the picture content and a text area corresponding to the text content in the halftone image, the device comprising:
a blocking part for dividing the image to be separated into a plurality of image blocks;
a gray complexity analyzing part which analyzes each image block in sequence to obtain the gray complexity of each image block;
a first determination unit configured to determine, based on the gray complexity of each image block and a predetermined complexity threshold, each image block whose gray complexity is greater than the complexity threshold as a picture block constituting the picture area, and each image block whose gray complexity is not greater than the complexity threshold as a character block constituting the text area;
an overall contour extraction part for performing overall contour extraction on the image to be separated to obtain the image contours of the image to be separated; and
a second determination unit configured to sequentially determine, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, to determine all the image blocks through which that image contour passes as picture blocks.
2. The image-text separation device of claim 1, further comprising:
a preprocessing section for preprocessing the halftone image and taking the preprocessed halftone image as the image to be separated,
wherein the preprocessing is mean filtering processing.
3. The image-text separation device of claim 1, further comprising:
an output unit and a control unit,
wherein, when the second determination unit completes the determination operation, the control unit controls the output unit to output, as the separation result, area information of the picture area formed by the plurality of picture blocks and area information of the text area formed by the plurality of character blocks.
4. The image-text separation device of claim 1, wherein:
the overall contour extraction part includes an overall binarization unit and an overall contour identification unit,
the overall binarization unit performs binarization processing on the image to be separated to obtain a binarized image,
and the overall contour identification unit performs overall contour identification on the binarized image to obtain the image contours of the image to be separated.
5. The image-text separation device of claim 1, wherein:
the gray complexity analyzing part has an image block binarization unit, an image block contour extraction unit, an image block mean filtering unit, and a gray complexity calculating unit,
the image block binarization unit sequentially performs binarization processing on each image block to obtain a plurality of binarized image blocks,
the image block contour extraction unit sequentially extracts the image block contour of each binarized image block to obtain a plurality of binarized image block contours,
the image block mean filtering unit sequentially performs mean filtering processing on each binarized image block to obtain a plurality of processed image blocks,
the image block contour extraction unit sequentially extracts the image block contour of each processed image block to obtain a plurality of processed image block contours,
and the gray complexity calculating unit calculates the gray complexity of each image block from the gray values of the pixel points on the binarized image block contour of that image block and the gray values of the pixel points on its processed image block contour.
6. The image-text separation device of claim 5, wherein:
the gray complexity calculating unit calculates the gray complexity of each image block based on a mean difference algorithm, the mean difference algorithm being as follows: the absolute value of the difference between the gray value of each pixel point on the binarized image block contour of the image block and the gray value of the pixel point on the processed image block contour is calculated in sequence, and the sum of these absolute values is then taken as the gray complexity of the image block.
7. An image-text separation method for performing image-text separation on a halftone image containing picture content and text content as an image to be separated so as to obtain a picture area corresponding to the picture content and a text area corresponding to the text content in the halftone image, the method comprising the following steps:
a blocking step of dividing the image to be separated into a plurality of image blocks;
a gray complexity analyzing step of analyzing each image block in sequence to obtain the gray complexity of each image block;
a first determination step of determining, based on the gray complexity of each image block and a predetermined complexity threshold, each image block whose gray complexity is greater than the complexity threshold as a picture block constituting the picture area, and each image block whose gray complexity is not greater than the complexity threshold as a character block constituting the text area;
an overall contour extraction step of performing overall contour extraction on the image to be separated to obtain the image contours of the image to be separated; and
a second determination step of sequentially determining, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, determining all the image blocks through which that image contour passes as picture blocks.
8. A computer-readable recording medium recording a computer program, wherein the computer program causes an image-text separation device, which performs image-text separation on a halftone image containing picture content and text content as an image to be separated so as to obtain a picture area corresponding to the picture content and a text area corresponding to the text content in the halftone image, to perform the following steps:
a blocking step of dividing the image to be separated into a plurality of image blocks;
a gray complexity analyzing step of analyzing each image block in sequence to obtain the gray complexity of each image block;
a first determination step of determining, based on the gray complexity of each image block and a predetermined complexity threshold, each image block whose gray complexity is greater than the complexity threshold as a picture block constituting the picture area, and each image block whose gray complexity is not greater than the complexity threshold as a character block constituting the text area;
an overall contour extraction step of performing overall contour extraction on the image to be separated to obtain the image contours of the image to be separated; and
a second determination step of sequentially determining, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, determining all the image blocks through which that image contour passes as picture blocks.
Technical Field
The invention belongs to the technical field of image-text separation, and particularly relates to an image-text separation device, an image-text separation method and a computer readable recording medium.
Background
Halftone images simulate the color and shade variation of continuous tone images by changing the size or density of multiple pixels. Because the distribution of pixel points in a halftone image depends on the halftone generation algorithm, different algorithms produce different pixel distribution patterns; image-text separation of a halftone image therefore cannot be achieved merely by extracting the color and shape features of its pixels.
Generally, image-text separation of a halftone image is performed based on a cross algorithm, which specifically includes: dividing the halftone image into a plurality of image blocks, binarizing each image block, comparing the amount of gray change of each image block in the horizontal and vertical directions, and judging an image block with a large amount of gray change as a picture block and an image block with a small amount of gray change as a character block. However, when the halftone image contains a character-shaped pattern, that pattern also has a low amount of gray change and is easily misjudged as a text area, which makes the image-text separation inaccurate and affects subsequent image processing and analysis.
Disclosure of Invention
The present invention has been made to solve the above problem, and its object is to provide an image-text separation device, an image-text separation method, and a computer-readable recording medium that can perform image-text separation on a relatively complex halftone image with a mixed image-text layout.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention provides an image-text separation device for performing image-text separation on a halftone image containing picture content and text content as an image to be separated, so as to obtain a picture area corresponding to the picture content and a text area corresponding to the text content in the halftone image, the device comprising: a blocking part for dividing the image to be separated into a plurality of image blocks; a gray complexity analyzing part which analyzes each image block in sequence based on a mean difference algorithm to obtain the gray complexity of each image block; a first determination unit that determines, based on the gray complexity of each image block and a predetermined complexity threshold, each image block whose gray complexity is greater than the complexity threshold as a picture block constituting the picture area, and each image block whose gray complexity is not greater than the complexity threshold as a character block constituting the text area; an overall contour extraction part for performing overall contour extraction on the image to be separated to obtain the image contours of the image to be separated; and a second determination unit which sequentially determines, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, determines all the image blocks through which that image contour passes as picture blocks.
The invention also provides an image-text separation method for performing image-text separation on a halftone image containing picture content and text content as an image to be separated, so as to obtain a picture area corresponding to the picture content and a text area corresponding to the text content in the halftone image, the method comprising the following steps: a blocking step of dividing the image to be separated into a plurality of image blocks; a gray complexity analyzing step of analyzing each image block in sequence based on a mean difference algorithm to obtain the gray complexity of each image block; a first determination step of determining, based on the gray complexity of each image block and a predetermined complexity threshold, each image block whose gray complexity is greater than the complexity threshold as a picture block constituting the picture area, and each image block whose gray complexity is not greater than the complexity threshold as a character block constituting the text area; an overall contour extraction step of performing overall contour extraction on the image to be separated to obtain the image contours of the image to be separated; and a second determination step of sequentially determining, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, determining all the image blocks through which that image contour passes as picture blocks.
The invention also provides a computer-readable recording medium recording a computer program, wherein the computer program causes an image-text separation device, which performs image-text separation on a halftone image containing picture content and text content as an image to be separated so as to obtain a picture area corresponding to the picture content and a text area corresponding to the text content in the halftone image, to execute the following steps: a blocking step of dividing the image to be separated into a plurality of image blocks; a gray complexity analyzing step of analyzing each image block in sequence based on a mean difference algorithm to obtain the gray complexity of each image block; a first determination step of determining, based on the gray complexity of each image block and a predetermined complexity threshold, each image block whose gray complexity is greater than the complexity threshold as a picture block constituting the picture area, and each image block whose gray complexity is not greater than the complexity threshold as a character block constituting the text area; an overall contour extraction step of performing overall contour extraction on the image to be separated to obtain the image contours of the image to be separated; and a second determination step of sequentially determining, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, determining all the image blocks through which that image contour passes as picture blocks.
Action and Effect of the invention
According to the image-text separation device, the image-text separation method, and the computer-readable recording medium of the present invention, the blocking part divides the halftone image into a plurality of image blocks, the gray complexity analyzing part analyzes each image block in sequence based on the mean difference algorithm to obtain the gray complexity of each image block, and the first determination unit determines each image block whose gray complexity is greater than the complexity threshold as a picture block and each image block whose gray complexity is not greater than the complexity threshold as a character block, so that the attribute of each image block is preliminarily determined. Further, the overall contour extraction part extracts the image contours of the halftone image, and the second determination unit sequentially determines, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, determines all the image blocks through which that image contour passes as picture blocks. Image blocks that the first determination unit erroneously determined as character blocks (for example, image blocks constituting a character-shaped figure) are thereby corrected to picture blocks, so that the determination result is more accurate and subsequent image processing and analysis can proceed smoothly.
Drawings
Fig. 1 is a block diagram showing the structure of the image-text separation device in an embodiment of the present invention;
Fig. 2 is an exemplary diagram of a halftone image in an embodiment of the present invention;
Fig. 3 is an exemplary diagram of all image blocks constituting the image to be separated after binarization processing in an embodiment of the present invention;
Fig. 4 is an exemplary diagram of the determination result after the halftone image is determined by the first determination unit in an embodiment of the present invention;
Fig. 5 is an exemplary diagram of the determination result after the halftone image is determined by the second determination unit in an embodiment of the present invention; and
Fig. 6 is a flowchart of the image-text separation operation of the image-text separation device in an embodiment of the present invention.
Detailed Description
In order to make the technical means, inventive features, objectives, and effects of the invention easy to understand, the image-text separation device of the invention is described in detail below with reference to the embodiments and the accompanying drawings.
As a first embodiment, the present invention provides an image-text separation device for performing image-text separation on a halftone image containing picture content and text content as an image to be separated, so as to obtain a picture area corresponding to the picture content and a text area corresponding to the text content in the halftone image, the device comprising: a blocking part for dividing the image to be separated into a plurality of image blocks; a gray complexity analyzing part for analyzing each image block in sequence to obtain the gray complexity of each image block; a first determination unit that determines, based on the gray complexity of each image block and a predetermined complexity threshold, each image block whose gray complexity is greater than the complexity threshold as a picture block constituting the picture area, and each image block whose gray complexity is not greater than the complexity threshold as a character block constituting the text area; an overall contour extraction part for performing overall contour extraction on the image to be separated to obtain the image contours of the image to be separated; and a second determination unit which sequentially determines, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, determines all the image blocks through which that image contour passes as picture blocks.
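As a rough illustration of the blocking part and the first determination unit, the following Python sketch tiles a gray image and labels each tile by comparing a complexity value with a threshold. The names `split_into_blocks`, `first_determination`, and `toy_complexity` are hypothetical, and `toy_complexity` (maximum minus minimum gray value) is only a placeholder that stands in for the gray complexity described in the text; the block size and threshold are likewise assumptions.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]        # rows of gray values in the range 0..255
BlockIndex = Tuple[int, int]   # (block_row, block_col)


def split_into_blocks(image: Image, block_size: int) -> Dict[BlockIndex, Image]:
    """Divide the image into block_size x block_size tiles (edge tiles may be smaller)."""
    height, width = len(image), len(image[0])
    blocks: Dict[BlockIndex, Image] = {}
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            tile = [row[left:left + block_size] for row in image[top:top + block_size]]
            blocks[(top // block_size, left // block_size)] = tile
    return blocks


def first_determination(
    blocks: Dict[BlockIndex, Image],
    complexity_of: Callable[[Image], float],
    threshold: float,
) -> Dict[BlockIndex, str]:
    """Label a block "picture" if its complexity exceeds the threshold, else "character"."""
    return {
        index: "picture" if complexity_of(tile) > threshold else "character"
        for index, tile in blocks.items()
    }


def toy_complexity(tile: Image) -> float:
    """Placeholder measure (NOT the mean difference algorithm): gray value range."""
    values = [value for row in tile for value in row]
    return max(values) - min(values)


if __name__ == "__main__":
    demo = [[0, 0, 10, 200], [0, 0, 250, 30], [5, 5, 7, 7], [5, 5, 7, 7]]
    print(first_determination(split_into_blocks(demo, 2), toy_complexity, 50))
```

Passing the complexity measure in as a function keeps the thresholding rule separate from how the gray complexity is actually computed.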
In addition, the image-text separation device according to the first embodiment may further have the following feature: it further comprises a preprocessing part for preprocessing the halftone image and taking the preprocessed halftone image as the image to be separated, wherein the preprocessing is mean filtering processing.
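Mean filtering replaces each pixel with the average of its neighborhood. The sketch below is a minimal pure-Python version; the function name `mean_filter`, the 3x3 window (`radius=1`), and the handling of borders by averaging only the pixels that fall inside the image are all assumptions, since the text does not specify a window size or border rule.

```python
from typing import List


def mean_filter(image: List[List[float]], radius: int = 1) -> List[List[float]]:
    """Return a copy of the image where each pixel is the mean of its neighborhood.

    The window is (2 * radius + 1) squared; near the borders only the pixels
    that actually lie inside the image are averaged (an assumed border rule).
    """
    height, width = len(image), len(image[0])
    result: List[List[float]] = []
    for y in range(height):
        out_row: List[float] = []
        for x in range(width):
            total = 0.0
            count = 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            out_row.append(total / count)
        result.append(out_row)
    return result


if __name__ == "__main__":
    spike = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
    for row in mean_filter(spike):
        print(row)
```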
In addition, the image-text separation device according to the first embodiment may further have the following feature: it further comprises an output unit and a control unit, wherein the control unit controls the output unit to output, as the separation result, area information of the picture area formed by the plurality of picture blocks and area information of the text area formed by the plurality of character blocks.
In addition, the image-text separation device according to the first embodiment may further have the following feature: the overall contour extraction part comprises an overall binarization unit and an overall contour identification unit, the overall binarization unit performs binarization processing on the image to be separated to obtain a binarized image, and the overall contour identification unit performs overall contour identification on the binarized image to obtain the image contours of the image to be separated.
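The two stages of overall contour extraction can be sketched as follows. This is an illustrative stand-in, not the method of the source: the fixed threshold of 128, the choice of dark pixels as foreground, and the definition of a contour as the boundary pixels of one 8-connected foreground component are all assumptions, and the names `binarize` and `find_contours` are hypothetical.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]  # (y, x)


def binarize(image: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Mark pixels darker than the threshold as foreground (1), others as 0."""
    return [[1 if value < threshold else 0 for value in row] for row in image]


def _is_boundary(binary: List[List[int]], y: int, x: int) -> bool:
    """A foreground pixel is on the boundary if a 4-neighbor is background or outside."""
    height, width = len(binary), len(binary[0])
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if not (0 <= ny < height and 0 <= nx < width) or binary[ny][nx] == 0:
            return True
    return False


def find_contours(binary: List[List[int]]) -> List[List[Pixel]]:
    """Return one sorted list of boundary pixels per 8-connected foreground component."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    contours: List[List[Pixel]] = []
    for y0 in range(height):
        for x0 in range(width):
            if binary[y0][x0] == 0 or seen[y0][x0]:
                continue
            queue = deque([(y0, x0)])  # breadth-first flood fill of one component
            seen[y0][x0] = True
            boundary: List[Pixel] = []
            while queue:
                y, x = queue.popleft()
                if _is_boundary(binary, y, x):
                    boundary.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            contours.append(sorted(boundary))
    return contours
```

A production system would more likely rely on an image-processing library for this step; the pure-Python version is used here only so that the sketch has no dependencies.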
In addition, the image-text separation device according to the first embodiment may further have the following feature: the gray complexity analyzing part has an image block binarization unit, an image block contour extraction unit, an image block mean filtering unit, and a gray complexity calculating unit; the image block binarization unit sequentially performs binarization processing on each image block to obtain a plurality of binarized image blocks; the image block contour extraction unit sequentially extracts the image block contour of each binarized image block to obtain a plurality of binarized image block contours; the image block mean filtering unit sequentially performs mean filtering processing on each binarized image block to obtain a plurality of processed image blocks; the image block contour extraction unit sequentially extracts the image block contour of each processed image block to obtain a plurality of processed image block contours; and the gray complexity calculating unit calculates the gray complexity of each image block from the gray values of the pixel points on the binarized image block contour of that image block and the gray values of the pixel points on its processed image block contour.
In addition, the image-text separation device according to the first embodiment may further have the following feature: the gray complexity calculating unit calculates the gray complexity of each image block based on a mean difference algorithm, the mean difference algorithm being as follows: the absolute value of the difference between the gray value of each pixel point on the binarized image block contour of the image block and the gray value of the pixel point on the processed image block contour is calculated in sequence, and the sum of these absolute values is then taken as the gray complexity of the image block.
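The summation at the heart of the mean difference algorithm can be written in a few lines. The text does not spell out how a pixel on the binarized contour is paired with a pixel on the processed contour, so the sketch below assumes the caller already supplies two equally long sequences of gray values sampled at corresponding positions; the function name `mean_difference_complexity` is hypothetical.

```python
from typing import Sequence


def mean_difference_complexity(
    binarized_contour_grays: Sequence[float],
    processed_contour_grays: Sequence[float],
) -> float:
    """Sum of absolute gray value differences between paired contour pixels.

    Assumes element i of both sequences refers to corresponding pixels; how
    that correspondence is established is left to the caller.
    """
    if len(binarized_contour_grays) != len(processed_contour_grays):
        raise ValueError("both sequences must contain the same number of pixels")
    return sum(
        abs(before - after)
        for before, after in zip(binarized_contour_grays, processed_contour_grays)
    )


if __name__ == "__main__":
    # Gray values on the contour before and after mean filtering (made-up numbers).
    print(mean_difference_complexity([255, 255, 0, 255], [200, 180, 60, 255]))
```

Under this reading, a block whose contour gray values change strongly under mean filtering receives a large complexity, which the first determination unit then compares with the complexity threshold.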
As a second embodiment, the present invention further provides an image-text separation method for performing image-text separation on a halftone image containing picture content and text content as an image to be separated, so as to obtain a picture area corresponding to the picture content and a text area corresponding to the text content in the halftone image, the method comprising: a blocking step of dividing the image to be separated into a plurality of image blocks; a gray complexity analyzing step of analyzing each image block in sequence to obtain the gray complexity of each image block; a first determination step of determining, based on the gray complexity of each image block and a predetermined complexity threshold, each image block whose gray complexity is greater than the complexity threshold as a picture block constituting the picture area, and each image block whose gray complexity is not greater than the complexity threshold as a character block constituting the text area; an overall contour extraction step of performing overall contour extraction on the image to be separated to obtain the image contours of the image to be separated; and a second determination step of sequentially determining, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, determining all the image blocks through which that image contour passes as picture blocks.
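The second determination step can be sketched as a relabeling pass over the contours. The representation is assumed for illustration: each contour is a list of `(y, x)` pixel coordinates, blocks are indexed by `(y // block_size, x // block_size)`, and labels are the strings "picture" and "character". The name `second_determination` is hypothetical, and letting earlier corrections influence later contours is one possible reading of "sequentially".

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]        # (y, x)
BlockIndex = Tuple[int, int]   # (block_row, block_col)


def second_determination(
    labels: Dict[BlockIndex, str],
    contours: List[List[Pixel]],
    block_size: int,
) -> Dict[BlockIndex, str]:
    """Relabel every block crossed by a contour as "picture" if any crossed block is one."""
    corrected = dict(labels)  # leave the first determination result untouched
    for contour in contours:
        # Blocks through which this contour passes.
        touched = {(y // block_size, x // block_size) for y, x in contour}
        if any(corrected.get(index) == "picture" for index in touched):
            for index in touched:
                if index in corrected:
                    corrected[index] = "picture"
    return corrected


if __name__ == "__main__":
    first_pass = {(0, 0): "picture", (0, 1): "character", (0, 2): "character"}
    outlines = [[(0, 1), (0, 2)], [(1, 4), (1, 5)]]
    print(second_determination(first_pass, outlines, block_size=2))
```

In the demo, the first contour crosses a picture block and a character block, so the character block is relabeled; the second contour lies entirely inside a character block and changes nothing.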
As a third embodiment, the present invention provides a computer-readable recording medium recording a computer program, wherein the computer program causes an image-text separation device, which performs image-text separation on a halftone image containing picture content and text content as an image to be separated so as to obtain a picture area corresponding to the picture content and a text area corresponding to the text content in the halftone image, to execute the following steps: a blocking step of dividing the image to be separated into a plurality of image blocks; a gray complexity analyzing step of analyzing each image block in sequence to obtain the gray complexity of each image block; a first determination step of determining, based on the gray complexity of each image block and a predetermined complexity threshold, each image block whose gray complexity is greater than the complexity threshold as a picture block constituting the picture area, and each image block whose gray complexity is not greater than the complexity threshold as a character block constituting the text area; an overall contour extraction step of performing overall contour extraction on the image to be separated to obtain the image contours of the image to be separated; and a second determination step of sequentially determining, for each image contour, whether the image blocks through which the image contour passes include a picture block and, once they do, determining all the image blocks through which that image contour passes as picture blocks.
<Example>
Fig. 1 is a block diagram showing the structure of the image-text separation device in an embodiment of the present invention.
As shown in fig. 1, the
FIG. 2 is an exemplary diagram of a halftone image in an embodiment of the invention.
As shown in fig. 2, the halftone image is generated from a continuous tone image by a halftone generation algorithm, and includes a text region and a picture region. In fig. 2, the left part is a text area and the right part is a picture area. In the text area, the text content is superposed on the picture background; the picture area has a background and a pattern, and the pattern contains a figure in the shape of a character.
The preprocessing
The
The gray
The image
The image block
The image block
The gray level
The mean difference algorithm is as follows: the absolute value of the difference between the gray value of each pixel point on the binarized image block contour of the image block and the gray value of the pixel point on the processed image block contour is calculated in sequence, and the sum of these absolute values is then taken as the gray complexity of the image block.
Fig. 3 is an exemplary diagram of all image blocks constituting an image to be separated after binarization processing in the embodiment of the present invention.
In Fig. 3, part (a) shows all the binarized image blocks, part (b) shows a plurality of binarized image blocks in a text area, and part (c) shows a plurality of binarized image blocks in a character-shaped figure. As can be seen from Fig. 3, the gray complexity of the binarized image blocks in the text area clearly differs from that of the binarized image blocks in the background; meanwhile, the gray complexity of the binarized image blocks in the character-shaped figure is close to that of the binarized image blocks in the text area.
The
The picture block is an image block having an attribute of a picture (corresponding to a picture area), and the character block is an image block having an attribute of a character (corresponding to a character area). The
Fig. 4 is an exemplary diagram of a determination result after the halftone image is determined by the first determination section in the embodiment of the present invention. In fig. 4, (a) is a schematic view of the entire result of the halftone image determined by the
As shown in fig. 4, the image block determined as the text block by the
The overall
The
The overall
The
Fig. 5 is an exemplary diagram of a determination result of the halftone image determined by the second determination unit according to the embodiment of the present invention. In fig. 5, (a) is a schematic view of the entire result of the halftone image determined by the second determination unit in the present embodiment, (b) is a partial enlarged view of a character region of the halftone image, and (c) is a partial enlarged view of a figure of a character shape of the halftone image. In fig. 5, the white line is the image contour extracted by the entire
As can be seen from fig. 4(c), the image block determined as the image block by the
As can be seen from fig. 5(c), the
The
The
The
The operation of the
The image-
Fig. 6 is a flowchart of the image-text separation operation of the image-text separation device in an embodiment of the present invention.
As shown in Fig. 6, in the present embodiment, the flow of the image-text separation operation of
in step S1, the
In step S2, the blocking
In step S3, the image
In step S4, the image block
In step S5, the image block mean
In step S6, the image block
In step S7, the gray
In step S8, the
In step S9, the
In step S10, the entire
In step S11, the
In step S12, the