Text processing method, device and equipment

Document No.: 1889921 Publication date: 2021-11-26

Reading note: this technology, "Text processing method, device and equipment", was designed by 康伟 (Kang Wei) on 2021-09-02. Main content: The embodiment of the application discloses a text processing method. After a text to be processed is acquired, the plurality of characters included in the text are traversed, the character type corresponding to each character is determined, and the number of grids occupied by the character is determined according to its character type. After the number of grids occupied by each character in the text to be processed is determined, the display position of each character is determined according to the number of grids corresponding to the display area in the transverse direction and the number of grids occupied by each character. Further, after the display position of each character in the display area is determined, the text to be processed is displayed, so that the characters of two adjacent lines of the text in the display area are vertically aligned. The scheme provided by the embodiment of the application can therefore display text in a gridded manner and improve the user's browsing experience.

1. A method of text processing, the method comprising:

acquiring a text to be processed, wherein the text to be processed comprises a plurality of characters;

traversing the text to be processed, determining a character type corresponding to each character in the characters, and determining the number of grids occupied by the characters according to the character type corresponding to the characters, wherein the character type comprises at least one of Chinese characters and English characters, the number of the grids occupied by the characters is the number of the grids occupied by the characters in the transverse direction of a display area, the grids are sub-areas obtained by dividing the display area, the Chinese characters occupy a first preset number of grids, and the number of the grids occupied by the English characters is related to the width of the characters;

and determining the display position corresponding to each character according to the grid number occupied by each character and the grid number corresponding to the display area in the transverse direction.

2. The method of claim 1, further comprising:

and displaying the text to be processed according to the display position corresponding to each character.

3. The method according to claim 1 or 2, wherein the determining the number of grids occupied by the character according to the character type corresponding to the character comprises:

responding to the character type corresponding to the character as the English character, and acquiring the width of the character;

and determining the number of grids occupied by the character according to the width of the character and the width of one grid, wherein the width of each grid in a plurality of grids divided by the display area is the same.

4. The method of claim 3, further comprising:

and responding to the fact that the number of grids occupied by a first character in the characters is larger than the number of grids corresponding to the display area in the transverse direction, carrying out word segmentation on the first character to obtain a second character and a third character, wherein the number of grids occupied by the second character is equal to the number of grids corresponding to the display area in the transverse direction, the third character is the rest characters except the second character in the first character, and the first character is a continuous text character except the Chinese character and a symbol in the characters.

5. The method of claim 4, further comprising:

and performing word segmentation processing on the third character in response to the fact that the number of grids occupied by the third character is larger than the number of grids corresponding to the display area in the transverse direction until the number of grids occupied by the remaining characters is smaller than or equal to the number of grids corresponding to the display area in the transverse direction.

6. The method according to any one of claims 1 to 5, wherein determining the number of grids occupied by the character according to the character type corresponding to the character comprises:

and determining that the characters occupy a second preset number of grids in response to the character type corresponding to the characters being a symbol.

7. The method of claim 6, further comprising:

determining that the character and the next character jointly occupy the second preset number of grids in response to that the character type of the next character adjacent to the character is a symbol and the character and the next character can be merged.

8. The method of claim 6, further comprising:

and responding to the character type of the character as a symbol, and adjusting the display position of the character according to the display rule of the character, wherein the display rule is used for indicating the condition required to be met when the character is displayed.

9. The method according to claim 8, wherein the adjusting the display position of the character according to the display rule of the character in response to the character type of the character being a symbol comprises:

when the display position of the character is a head of a line and the display rule is that the character cannot be located at the head of the line, searching a first target character forwards, wherein the first target character is a character which can be located at the head of the line, is not located at the head of the line yet, and a previous character adjacent to the first target character can be located at the tail of the line;

and adjusting the display position of the character according to the relative positions of the first target character and the character.

10. The method according to claim 8, wherein the adjusting the display position of the character according to the display rule of the character in response to the character type of the character being a symbol comprises:

when the display position of the character is the line tail and the display rule is that the character cannot be located at the line tail, a second target character is searched forward, wherein the second target character can be located at the line tail and the next character adjacent to the second target character can be located at the line head;

and determining the display position of the character according to the relative position of the second target character and the character.

11. The method according to any one of claims 1-10, further comprising:

acquiring characters of which the character types are the Chinese characters in the text to be processed;

and acquiring pinyin information corresponding to the characters, and labeling the characters according to the pinyin information.

12. A text processing apparatus, characterized in that the apparatus comprises:

the acquisition unit is used for acquiring a text to be processed, wherein the text to be processed comprises a plurality of characters;

the first determining unit is used for traversing the text to be processed, determining a character type corresponding to each character in the characters, and determining the number of grids occupied by the characters according to the character type corresponding to the characters, wherein the character type comprises at least one of Chinese characters and English characters, the number of the grids occupied by the characters is the number of the grids occupied by the characters in the transverse direction of the display area, the grids are sub-areas obtained by dividing the display area, the Chinese characters occupy a first preset number of grids, and the number of the grids occupied by the English characters is related to the width of the characters;

and the second determining unit is used for determining the display position corresponding to each character according to the grid number occupied by each character and the grid number corresponding to the display area in the transverse direction.

13. An electronic device, the device comprising: a processor and a memory;

the memory for storing instructions or computer programs;

the processor to execute the instructions or computer program in the memory to cause the electronic device to perform the method of any of claims 1-11.

14. A computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method of any of claims 1-11 above.

Technical Field

The present application relates to computer processing technologies, and in particular, to a text processing method, apparatus, and device.

Background

At present, functions related to text display in a terminal only support a default typesetting rule, and contents displayed according to the default typesetting rule may be disordered, so that the display effect is poor, and the use experience of a user is influenced.

Disclosure of Invention

In view of this, embodiments of the present application provide a text processing method, apparatus, and device to realize displaying text content in a grid manner, so as to perform regular display on a text, and improve user experience.

In order to achieve the above purpose, the technical solutions provided in the embodiments of the present application are as follows:

In a first aspect of an embodiment of the present application, a text processing method is provided, where the method includes:

acquiring a text to be processed, wherein the text to be processed comprises a plurality of characters;

In a second aspect of embodiments of the present application, there is provided a text processing apparatus, including:

In a third aspect of embodiments of the present application, there is provided an electronic device, including: a processor and a memory;

the memory for storing instructions or computer programs;

the processor is configured to execute the instructions or the computer program in the memory, so as to enable the electronic device to execute the text processing method according to the first aspect.

In a fourth aspect of embodiments of the present application, there is provided a computer-readable storage medium including instructions that, when executed on a computer, cause the computer to perform the text processing method described in the first aspect above.

Therefore, the embodiment of the application has the following beneficial effects:

According to the technical scheme provided by the embodiment of the application, the total number of grids corresponding to the display area in the transverse direction and the number of grids occupied by characters of different character types are predefined. A grid is a sub-area obtained by dividing the display area. After the text to be processed is obtained, the plurality of characters included in the text are traversed, the character type corresponding to each character is determined, and the number of grids occupied by the character is determined according to its character type. After the number of grids occupied by each character in the text to be processed is determined, the display position of each character is determined according to the total number of grids corresponding to the display area in the transverse direction and the number of grids occupied by each character. Further, after the display position of each character in the display area is determined, the text to be processed is displayed, so that the characters of two adjacent lines of the text in the display area are vertically aligned. The scheme provided by the embodiment of the application can therefore display text in a gridded manner and improve the user's browsing experience.

Drawings

Fig. 1 is a schematic diagram of a text display provided in an embodiment of the present application;

Fig. 2 is a flowchart of a text processing method according to an embodiment of the present application;

Fig. 3 is a schematic diagram of another text display provided by an embodiment of the present application;

Fig. 4a is a schematic diagram of a word segmentation processing flow according to an embodiment of the present application;

Fig. 4b is a schematic diagram of a word segmentation optimization processing flow provided in the embodiment of the present application;

Fig. 5 is a block diagram of a text processing apparatus according to an embodiment of the present application;

Fig. 6 is a block diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the aforementioned objects, features and advantages of the present application more comprehensible, embodiments accompanying the drawings are described in detail below. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad application. It should be noted that, for the convenience of description, only a part related to the present application is shown in the drawings, and not all structures are shown.

When the traditional terminal equipment carries out character typesetting display, only the default typesetting rule is supported. However, when non-Chinese characters such as numbers and English characters appear in the text, the displayed text is disordered due to the default typesetting rule, the reading habit of the user is not met, and the use experience of the user is influenced.

Based on this, the text processing method provided in the embodiment of the present application may divide the display area in advance and determine the total number of sub-areas corresponding to the display area in the lateral direction, that is, the total number of grids corresponding to the display area in the lateral direction. The number of grids occupied by different types of characters can also be predefined. When the text to be processed needs to be displayed, the type of each character in the text is first identified, and the number of grids occupied by the character is determined according to its character type. The display position of each character in the display area is then determined according to the number of grids occupied by each character and the total number of grids corresponding to the display area in the lateral direction, and the characters in the text to be processed are displayed accordingly.

For ease of understanding, refer to the schematic diagram of the application scenario shown in Fig. 1. In this scenario the display area is divided into 6 sub-areas in the horizontal direction, i.e. one row of the display area includes 6 grids, and every grid has the same size. The explanation takes as an example the case in which a single Chinese character occupies one grid, an English word can occupy a plurality of grids, and a punctuation mark occupies one grid. The number of grids occupied by one English word is determined according to the width of the English word and the width of one grid. The text to be processed is a Chinese sentence containing the English word "house" (in translation, roughly "the house of my grandpa and grandma has collapsed yet again."). Because a single Chinese character occupies one grid, the first 6 Chinese characters occupy the first line. "house" is calculated to occupy 3 grids, so "house" and the following 3 Chinese characters occupy the second line. The remaining 2 Chinese characters and the period occupy the first 3 grids of the third line. The final display effect of the text to be processed is shown in Fig. 1. As can be seen from Fig. 1, the characters of two adjacent lines are vertically aligned. Even though the text includes English and punctuation marks, each of them occupies an integer number of grids (for example, "house" occupies 3 grids and the punctuation mark occupies 1 grid), so they still align with the characters above and below. The text content is therefore displayed regularly, which matches the user's browsing habits and improves the user experience.

In order to facilitate understanding of the technical solutions provided by the embodiments of the present application, the following description is made with reference to the accompanying drawings.

Referring to Fig. 2, which is a flowchart of a text display method provided in an embodiment of the present application, the method may include:

S201: acquiring a text to be processed, wherein the text to be processed comprises a plurality of characters.

In this embodiment, when browsing a text through a terminal device, a user may select a part of the text, and the selected part is determined as the text to be processed based on the user's selection operation. The text to be processed may include a plurality of characters. These may all be of the same character type, for example all Chinese characters, or of different character types, for example both Chinese characters and English characters. The character types may include Chinese characters, English characters (English words), numeric characters, punctuation characters, special characters (%, #, &), and so on.

S202: traversing the text to be processed, determining the character type corresponding to each character, and determining the number of grids occupied by the character according to the character type corresponding to the character.

In the process of obtaining the text to be processed, word segmentation processing is carried out on the text to be processed, so that a plurality of characters included in the text to be processed are obtained, a character type corresponding to each character is determined in a traversal mode, and the number of grids occupied by the character is determined according to the character type corresponding to the character. Wherein the number of grids occupied by a single character is the number of grids occupied by the character in the lateral direction of the display area. The grid is a sub-area obtained by dividing the display area, the display area can be divided into a plurality of grids, and the grids are the same in size and width.

In this embodiment, the number of grids occupied by characters of different character types may be predefined, and after a character type corresponding to a certain character in the text to be processed is determined, the number of grids occupied by the character may be determined according to the predefined definition. And after the number of grids occupied by the character is determined, sequentially placing the character into a word segmentation list according to a traversal sequence.
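The traversal and word segmentation list described above can be sketched as follows. This is a minimal illustration rather than the embodiment's implementation: the names `char_type`, `segment` and `SPECIAL`, the use of the CJK Unified Ideographs range to detect Chinese characters, and the grouping of consecutive English letters into one word are all assumptions made for the example.

```python
import unicodedata

# Special characters named in the text; the set itself is an assumption.
SPECIAL = set("%#&")


def char_type(ch):
    """Classify one character into a coarse character type."""
    if "\u4e00" <= ch <= "\u9fff":        # CJK Unified Ideographs block
        return "chinese"
    if ch.isascii() and ch.isalpha():
        return "english"
    if ch.isdigit():
        return "digit"
    if ch in SPECIAL:                      # checked before punctuation
        return "special"
    if unicodedata.category(ch).startswith("P"):
        return "punctuation"
    return "other"


def segment(text):
    """Traverse the text and build the word segmentation list.

    Consecutive English letters are kept together as one word, because
    an English word is measured as a whole.
    """
    tokens = []
    for ch in text:
        kind = char_type(ch)
        if kind == "english" and tokens and tokens[-1][1] == "english":
            tokens[-1] = (tokens[-1][0] + ch, kind)
        else:
            tokens.append((ch, kind))
    return tokens
```

Each entry in the returned list pairs a piece of text with its type, so a later step can look up or compute the number of grids it occupies.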

The number of grids occupied by the characters of different character types can be predefined as that one Chinese character occupies a first preset number of grids, a punctuation character occupies a second preset number of grids, and a special symbol occupies a third preset number of grids. The first preset number of grids, the second preset number of grids, and the third preset number of grids may be set according to practical applications, and this embodiment is not limited herein. For example, it is predefined that a single chinese character occupies one grid, a single punctuation mark occupies one grid, two punctuation marks that can be merged occupy one grid, and a special mark occupies one grid.

The number of grids occupied by an English character is determined according to the width of the character and the width of a single grid. When the resulting number of grids is not an integer, it is rounded up. For example, if the width of an English word is 10 pixels and the width of a single grid is 3 pixels, the quotient is 10 / 3 ≈ 3.33, so the number of grids occupied by the English word is determined to be 4.
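The rounding rule can be written as a one-line helper. The function name `english_grid_count` is hypothetical, and integer pixel widths are assumed.

```python
def english_grid_count(word_width_px, grid_width_px):
    """Number of grids an English word occupies, rounded up."""
    # Integer ceiling division avoids floating-point rounding issues.
    return (word_width_px + grid_width_px - 1) // grid_width_px
```

With the figures from the text, `english_grid_count(10, 3)` gives 4.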

In some application scenarios, the number of grids occupied by a certain character (the first character) may be greater than the total number of grids corresponding to one line (i.e., the number of grids corresponding to the display area in the horizontal direction). In this case, word segmentation is performed on the first character to obtain a second character and a third character. The number of grids occupied by the second character is equal to the number of grids corresponding to the display area in the transverse direction, and the third character is the rest of the first character after the second character is removed. The number of grids occupied by the third character is then calculated. If it is less than or equal to the total number of grids corresponding to the display area in the transverse direction, the third character is put into the word segmentation list. If it is larger, word segmentation continues on the third character until the number of grids occupied by the remaining characters is less than or equal to the number of grids corresponding to the display area in the transverse direction. Here the first character is a continuous run of text characters other than Chinese characters and symbols, such as a run of consecutive English characters.

For example, one row of the display area corresponds to 6 grids, each grid is 3 pixels wide, and an English word is 40 pixels wide. The word therefore needs to be divided twice: divided character 1 corresponds to 18 pixels and occupies line 1; divided character 2 corresponds to 18 pixels and occupies line 2; the remaining characters correspond to 4 pixels, which round up to 2 grids at the start of line 3.
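The repeated division of an over-wide word can be sketched as a loop that cuts off one full row at a time. The helper name `split_into_row_chunks` is hypothetical, and the sketch works on pixel widths only; deciding where the cut falls between individual letters is left out.

```python
def split_into_row_chunks(word_width_px, grid_width_px, row_grids):
    """Return the grid count of each piece after splitting a wide word.

    Every piece except the last fills a whole row; the last piece is
    rounded up to a whole number of grids.
    """
    def grids(px):
        return (px + grid_width_px - 1) // grid_width_px  # ceiling division

    row_px = grid_width_px * row_grids
    chunks = []
    remaining = word_width_px
    while grids(remaining) > row_grids:
        chunks.append(row_grids)   # this piece fills one full row
        remaining -= row_px
    chunks.append(grids(remaining))
    return chunks
```

For a hypothetical 45-pixel word with 3-pixel grids and 6 grids per row, the function returns `[6, 6, 3]`: two full rows and a final piece of 3 grids.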

In some scenarios, when the character type of the currently traversed character is a symbol, it can be judged whether the next adjacent character is also a symbol. If so, it is further judged whether the next character can be merged with the current character. If it can, the two characters are put into the word segmentation list together, they are determined to jointly occupy the second preset number of grids, and the parameter corresponding to the two characters is set to double symbol. If either condition is not met, the current character alone is put into the word segmentation list, it is determined to occupy the second preset number of grids, and the parameter corresponding to the current character is set to single symbol. For example, if the text to be processed is: he said: "the house of my grandpa double collapses.", then the colon and the opening quotation mark can be merged, and the period and the closing quotation mark can be merged.
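A minimal sketch of the merging step follows. The table `MERGEABLE_PAIRS` is hypothetical and holds only the two pairs from the example (colon with opening quotation mark, period with closing quotation mark); `SECOND_PRESET` is set to 1 as in the example values given earlier. The text does not specify how mergeability is decided, so the table lookup is an assumption.

```python
# Hypothetical table of symbol pairs that may share one grid.
MERGEABLE_PAIRS = {("：", "“"), ("。", "”")}
SECOND_PRESET = 1  # grids for one symbol or one merged pair (example value)


def merge_symbols(tokens):
    """tokens: list of (text, is_symbol) pairs in traversal order.

    Returns (text, grids, param) tuples, where param is "single" or
    "double" for symbols and None for everything else.
    """
    merged = []
    i = 0
    while i < len(tokens):
        text, is_symbol = tokens[i]
        if not is_symbol:
            merged.append((text, None, None))  # grids are decided elsewhere
            i += 1
            continue
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if nxt is not None and nxt[1] and (text, nxt[0]) in MERGEABLE_PAIRS:
            merged.append((text + nxt[0], SECOND_PRESET, "double"))
            i += 2                              # both symbols are consumed
        else:
            merged.append((text, SECOND_PRESET, "single"))
            i += 1
    return merged
```

Two adjacent symbols that are not in the table are kept as two single symbols, each with its own grid.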

S203: and determining the display position corresponding to each character according to the grid number occupied by each character and the grid number corresponding to the display area in the transverse direction.

In this embodiment, the display positions of the characters in the display area are determined after the number of grids occupied by each character and the number of grids corresponding to one line of the display area are determined, so as to display the text to be processed according to the display position corresponding to each character, thereby aligning the upper and lower characters of two adjacent lines of the text to be processed in the display area.
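One simple way to turn grid counts into display positions is a greedy fill: place each entry at the next free column and start a new row when it no longer fits. The function name `assign_positions` is hypothetical, and the sketch assumes that every entry already fits within one row (over-wide words having been split beforehand).

```python
def assign_positions(grid_counts, row_grids):
    """Map each entry's grid count to a (row, column) display position."""
    positions = []
    row, col = 0, 0
    for cells in grid_counts:
        if col + cells > row_grids:   # does not fit: move to the next row
            row, col = row + 1, 0
        positions.append((row, col))
        col += cells
    return positions


# Fig. 1 scenario: 6 Chinese characters, "house" (3 grids),
# 5 more Chinese characters and a period, with 6 grids per row.
fig1_counts = [1] * 6 + [3] + [1] * 6
fig1_positions = assign_positions(fig1_counts, 6)
```

For the Fig. 1 scenario this places "house" at row 1, column 0, the next three characters at columns 3 to 5 of the same row, and the last three entries at columns 0 to 2 of row 2, which matches the arrangement described for Fig. 1.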

In some application scenarios, the text to be processed includes punctuation marks or special symbols. Because some of these characters cannot be placed at the head and/or tail of a line, their display positions need to be adjusted so that the adjusted positions conform to the display rules. That is, the display position of a character is adjusted according to the display rule corresponding to the character, where the display rule indicates the condition that the character needs to satisfy when it is placed. For example, the display rules may include: displayable anywhere; not displayable at the head of a line; not displayable at the tail of a line; displayable at neither the head nor the tail of a line. Specifically, the display position of the character is first obtained. If the display position is the head of a line and the display rule corresponding to the character is that the character cannot be located at the head of a line, a first target character is searched for forward in the word segmentation list, and the display position of the character is adjusted according to the relative position of the first target character and the character. The first target character is a character that can be located at the head of a line, is not yet located at the head of a line, and whose adjacent previous character can be located at the tail of a line. After the first target character is determined, because the relative position of the first target character and the character is fixed, once the first target character is moved to the head of a line, the display position of the character can be adjusted according to that relative position, so that the character is no longer located at the head of a line.

It can be understood that the first target character is to be moved to the head of a new line. If the first target character were already at the head of its line, breaking the line there would simply move the whole line down by one line, and the position of the character within its line would not change. The first target character therefore needs to be a character that is not yet located at the head of a line. In addition, the previous character adjacent to the first target character must be able to be located at the tail of a line; otherwise the adjustment would leave a character at the tail of a line that is not allowed there.

In another example, if the display position corresponding to the character is the tail of a line and the display rule corresponding to the character is that the character cannot be located at the tail of a line, a second target character is searched for forward in the word segmentation list, and the display position of the character is adjusted according to the relative position of the second target character and the character. The second target character is a character that can be located at the tail of a line and whose adjacent next character can be located at the head of a line. That is, after the second target character is determined, because the relative position between the second target character and the character is fixed, once the second target character is placed at the tail of a line, the display position of the character can be adjusted according to that relative position, so that the character is no longer located at the tail of a line.

In some application scenarios, in order to improve the reading experience of the user, the Chinese characters in the text to be processed can be labeled with pinyin. As shown in Fig. 3, pinyin is displayed for the Chinese characters in the text to be processed. Specifically, the Chinese characters included in the text to be processed are obtained from the word segmentation list, the pinyin information corresponding to each Chinese character is obtained, and the Chinese character is labeled according to the pinyin information. The pinyin information comprises the pinyin corresponding to the Chinese character and its tone information. For example, after the Chinese characters are obtained, the pinyin information corresponding to each Chinese character may be looked up in a pinyin library, and the Chinese character is labeled with that pinyin information. In some scenarios, when the tone in the obtained pinyin information is already marked on a letter of the pinyin, the pinyin information can be used directly to label the Chinese character. When the tone is not marked on a letter but is given as separate tone information, the letter that carries the tone mark can be determined according to the pinyin rule. For example, if the obtained pinyin information is "hai 3", that is, the pinyin is "hai" and the tone is 3, the letter that carries the tone mark is determined to be a according to the pinyin rule, and the labeling result is "hǎi". The pinyin rule indicates which letter carries the tone mark. For example, the pinyin rule is: 1) whenever any of a, o, e appears, the tone mark must be placed on one of them, with priority a > o > e; 2) if none of a, o, e appears and ui or iu appears as a pair, the tone mark is placed on the last letter of the pair; 3) if none of a, o, e appears and neither iu nor ui appears as a pair, the tone mark can only be placed on i, u or ü, with priority i > u. The overall priority is a > o > e > ui > iu > i > u.
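The pinyin rule can be sketched in a few lines. The function names `tone_letter_index` and `mark_tone` are hypothetical; the sketch assumes tones 1 to 4 are given as integers, that any other value means no tone mark, and that ü is one of the letters covered by the third rule. Tone marks are attached as Unicode combining characters and then composed with NFC normalization.

```python
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}


def tone_letter_index(syllable):
    """Index of the letter that carries the tone mark, or None."""
    for vowel in "aoe":                  # rule 1: a > o > e
        if vowel in syllable:
            return syllable.index(vowel)
    for pair in ("ui", "iu"):            # rule 2: mark the last letter
        if pair in syllable:
            return syllable.index(pair) + 1
    for vowel in "iu\u00fc":             # rule 3: i, u or ü alone
        if vowel in syllable:
            return syllable.index(vowel)
    return None


def mark_tone(syllable, tone):
    """Place the tone mark for `tone` on the right letter of `syllable`."""
    idx = tone_letter_index(syllable)
    mark = TONE_MARKS.get(tone)
    if idx is None or mark is None:      # neutral tone or no vowel found
        return syllable
    marked = unicodedata.normalize("NFC", syllable[idx] + mark)
    return syllable[:idx] + marked + syllable[idx + 1:]
```

Calling `mark_tone("hai", 3)` yields "hǎi", the labeling result given in the text.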

According to the embodiment, the number of grids corresponding to one line of the display area and the number of grids occupied by the characters of different character types are predefined. After the text to be processed is obtained, traversing a plurality of characters included in the text to be processed, determining a character type corresponding to each character, and determining the number of grids occupied by the character according to the character type corresponding to the character. After the grid number occupied by each character in the text to be processed is determined, the display position of each character in the display area is determined according to the grid number corresponding to one line of the display area and the grid number occupied by each character, so that the text to be processed is displayed according to the determined display position, and the upper and lower two characters of two adjacent lines in the text to be processed are aligned. Therefore, the text can be displayed in a gridding mode through the scheme provided by the embodiment of the application, and the browsing effect of a user is improved.

To aid understanding of the embodiment of the present application, refer to the processing flow diagrams shown in Fig. 4a and Fig. 4b. They are described by taking an example in which one line of the display area includes 6 grids, a single Chinese character occupies one grid, a single punctuation mark occupies one grid, and two punctuation marks that can be merged together occupy one grid.

Referring to fig. 4a, the text content is first obtained and traversed to determine the character type of each character. When the character is a Chinese character, it is determined to occupy 1 grid and is put into the word segmentation list. When the characters are English characters, the number of grids they occupy is determined. If that number is greater than 6, the English text is segmented: the text corresponding to 6 grids is cut off and put into the word segmentation list, and the number of grids occupied by the remaining text is calculated again. This repeats until the remaining text occupies 6 grids or fewer, at which point it is put into the word segmentation list. When the character is a punctuation mark, it is determined whether the next character is also a punctuation mark and, if so, whether it can be merged with the current character. If it can, the merged punctuation is determined to occupy 1 grid, the merged character is put into the word segmentation list, and the punctuation parameter of the character is set to double symbol. Otherwise, the current character is determined to occupy 1 grid, it is put into the word segmentation list, and its punctuation parameter is set to single symbol.

It should be noted that, when the word segmentation operation is performed on the text content, the placement position type of each character can also be determined. The placement position types are: can be placed at any position, cannot be placed at the head of a line, cannot be placed at the tail of a line, and cannot be placed at either the head or the tail of a line. For example, a Chinese character or an English character can be placed anywhere, whereas some punctuation marks cannot be placed at the head or the tail of a line.
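The character-type and placement-type decisions described above can be sketched as follows. This is only an illustrative sketch: the names `CharType`, `Placement`, `classify`, and `placement`, the Unicode range used to detect Chinese characters, and the specific punctuation sets are all assumptions, since the source does not enumerate them.

```python
from enum import Enum


class CharType(Enum):
    CHINESE = "chinese"
    ENGLISH = "english"
    SYMBOL = "symbol"


class Placement(Enum):
    ANYWHERE = "anywhere"
    NOT_LINE_HEAD = "not_line_head"
    NOT_LINE_TAIL = "not_line_tail"
    NOT_HEAD_OR_TAIL = "not_head_or_tail"


# Assumed example rule sets; the source does not list specific marks.
NO_HEAD = set("，。！？；：、）》")
NO_TAIL = set("（《")
NO_HEAD_OR_TAIL = set("·")


def classify(ch: str) -> CharType:
    """Classify one character (assumption: CJK Unified Ideographs = Chinese)."""
    if "\u4e00" <= ch <= "\u9fff":
        return CharType.CHINESE
    if ch.isalnum():
        return CharType.ENGLISH
    return CharType.SYMBOL  # punctuation, whitespace, and everything else


def placement(ch: str) -> Placement:
    """Return where a character may be placed within a line."""
    if classify(ch) is not CharType.SYMBOL:
        return Placement.ANYWHERE
    if ch in NO_HEAD_OR_TAIL:
        return Placement.NOT_HEAD_OR_TAIL
    if ch in NO_HEAD:
        return Placement.NOT_LINE_HEAD
    if ch in NO_TAIL:
        return Placement.NOT_LINE_TAIL
    return Placement.ANYWHERE
```

Storing the placement type with each entry of the word segmentation list at this stage means the later line-head and line-tail checks only need a lookup.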

After the word segmentation operation of fig. 4a, the placement position of each character is preliminarily determined. Because some punctuation marks cannot be located at the head or the tail of a line, the word segmentation list needs further optimization. As shown in fig. 4b, each character in the word segmentation list is traversed and the column index of the currently traversed character is obtained. If the column index is 0, the character is at the head of a line. The placement position type of the character is then obtained to determine whether the character may be located at the head of a line. If it may, traversal continues with the next character. If it may not, a search is made forward (toward the preceding characters) for a character that can be located at the head of a line and is not yet at the head of a line, a line feed is performed at that character, the traversal position is reset, and traversal continues with the next character.

If the column index is not 0, it is determined whether the column index is 5. If so, the character is at the tail of a line; its placement position type is obtained to determine whether it may be located at the tail of a line, and if it may, traversal continues with the next character. If it may not, a search is made forward for a character that can be located at the tail of a line and whose next character can be located at the head of a line, a line feed is performed after that character, the traversal position is reset, and traversal continues with the next character.
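A minimal sketch of the line-head and line-tail adjustment of fig. 4b is given here. It is not the patent's exact procedure: instead of laying out first and then resetting the traversal position, it fills each line greedily and walks back to the nearest legal break, where a break is treated as legal when the character after it may start a line and the character before it may end one. The names `Token`, `make_tokens`, and `break_lines` and the two rule sets are assumptions for illustration.

```python
from typing import List, NamedTuple

# Assumed rule sets for illustration; the source does not enumerate them.
NO_HEAD = set("，。！？、）")  # may not start a line
NO_TAIL = set("（")  # may not end a line


class Token(NamedTuple):
    text: str
    grids: int
    can_head: bool
    can_tail: bool


def make_tokens(text: str) -> List[Token]:
    """One token per character, each occupying one grid as in the example."""
    return [Token(ch, 1, ch not in NO_HEAD, ch not in NO_TAIL) for ch in text]


def break_lines(tokens: List[Token], columns: int = 6) -> List[str]:
    lines = []
    start = 0
    while start < len(tokens):
        # Greedy fill: take tokens until the grids of the line are used up.
        end, used = start, 0
        while end < len(tokens) and used + tokens[end].grids <= columns:
            used += tokens[end].grids
            end += 1
        end = max(end, start + 1)  # always make progress
        if end < len(tokens):
            # Walk back to a legal break: tokens[brk] may start a line
            # and tokens[brk - 1] may end one.
            brk = end
            while brk > start + 1 and not (
                tokens[brk].can_head and tokens[brk - 1].can_tail
            ):
                brk -= 1
            if tokens[brk].can_head and tokens[brk - 1].can_tail:
                end = brk  # otherwise keep the greedy break
        lines.append("".join(t.text for t in tokens[start:end]))
        start = end
    return lines
```

With 6 grids per line, `break_lines(make_tokens("今天天气很好，我们去"))` moves "好" down to the second line so that the comma does not start a line, which shortens the first line to five characters.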

After the traversal is completed, the list of Chinese characters can be obtained from the word segmentation list, the pinyin of each Chinese character is obtained, and the pinyin is assigned to the corresponding Chinese character. After the pinyin annotation is finished, the characters in the word segmentation list are displayed on the grid.
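The pinyin annotation step might look like the following sketch. The table `PINYIN_TABLE` is a hypothetical stand-in with a handful of entries, and `annotate` is an assumed name; the source does not say where the pinyin information comes from.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in table; a real system would need a full dictionary.
PINYIN_TABLE: Dict[str, str] = {
    "你": "nǐ",
    "好": "hǎo",
    "中": "zhōng",
    "文": "wén",
}


def annotate(
    segments: List[str], table: Dict[str, str] = PINYIN_TABLE
) -> List[Tuple[str, Optional[str]]]:
    """Pair each segment with its pinyin, or None when it has none."""
    return [(seg, table.get(seg)) for seg in segments]
```

A per-character table like this cannot disambiguate characters with more than one reading, so it should be read only as an outline of where the annotation attaches to the word segmentation list.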

Based on the above method embodiments, the present application provides a text processing apparatus and a processing device, which will be described below with reference to the accompanying drawings.

Fig. 5 is a structural diagram of a text processing apparatus according to an embodiment of the present disclosure. As shown in fig. 5, the apparatus 500 may include: an acquisition unit 501, a first determination unit 502, and a second determination unit 503.

An acquisition unit 501, configured to obtain a text to be processed, where the text to be processed includes multiple characters;

a first determining unit 502, configured to traverse the text to be processed, determine a character type corresponding to each character in the multiple characters, and determine, according to the character type corresponding to the character, a grid number occupied by the character, where the character type includes at least one of a Chinese character and an English character, the grid number occupied by the character is a grid number occupied by the character in a lateral direction of a display area, the grid is a sub-area obtained by dividing the display area, the Chinese character occupies a first preset number of grids, and the grid number occupied by the English character is related to a width of the character;

a second determining unit 503, configured to determine a display position corresponding to each character according to the number of grids occupied by each character and the number of grids corresponding to the display area in the lateral direction.
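As an illustration of what the second determining unit computes, the sketch below turns a list of per-character grid counts into (row, column) positions by filling each line from left to right and wrapping when the next character no longer fits. The function name `assign_positions` and the (row, column) representation are assumptions; the source only states that the position follows from the grid counts and the number of grids per line.

```python
from typing import List, Tuple


def assign_positions(grid_counts: List[int], columns: int) -> List[Tuple[int, int]]:
    """Return the (row, starting column) of each character in order."""
    positions = []
    row, col = 0, 0
    for grids in grid_counts:
        if grids > columns:
            # Wider runs are expected to have been segmented beforehand.
            raise ValueError("character is wider than one line")
        if col + grids > columns:
            row, col = row + 1, 0  # wrap to the start of the next line
        positions.append((row, col))
        col += grids
    return positions
```

Because every position is a whole-number column in a shared grid, characters that start at the same column in adjacent rows line up vertically.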

In a specific implementation manner, the apparatus further includes: a display unit (not shown in the figure);

and the display unit is used for displaying the text to be processed according to the display position corresponding to each character.

In a specific implementation manner, the first determining unit is specifically configured to: obtain the width of the character in response to the character type of the character being the English character; and determine the number of grids occupied by the character according to the width of the character and the width of one grid, where all of the grids into which the display area is divided have the same width.
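One plausible way to derive the grid count from the two widths is to round the ratio up, so that a character never overflows the grids assigned to it. Rounding up is an assumption, as is the function name `grids_for_width`; the source only says the count is determined from the character width and the grid width.

```python
import math


def grids_for_width(char_width: float, grid_width: float) -> int:
    """Number of equal-width grids needed to hold a character (rounded up)."""
    if char_width <= 0 or grid_width <= 0:
        raise ValueError("widths must be positive")
    return math.ceil(char_width / grid_width)
```

For example, with 10-unit grids a 25-unit-wide English word would take 3 grids and a 20-unit-wide one would take 2.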

In a specific implementation manner, the apparatus further includes: a word segmentation unit (not shown in the figure);

the word segmentation unit is configured to, in response to the number of grids occupied by a first character among the characters being greater than the number of grids of the display area in the lateral direction, segment the first character into a second character and a third character, where the number of grids occupied by the second character is equal to the number of grids of the display area in the lateral direction, the third character is the remainder of the first character other than the second character, and the first character is a run of continuous text characters other than Chinese characters and symbols.

In a specific implementation manner, the word segmentation unit is further configured to, in response to the number of grids occupied by the third character being greater than the number of grids of the display area in the lateral direction, segment the third character, and to repeat this until the number of grids occupied by the remaining characters is less than or equal to the number of grids of the display area in the lateral direction.
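The repeated segmentation of an over-long English run can be sketched as a loop that cuts off the longest prefix fitting in one line and then re-measures the remainder. The measuring function `grids_of`, which assumes every letter is half a grid wide, and the name `split_run` are hypothetical; a real implementation would measure rendered text widths.

```python
import math
from typing import Callable, List


def grids_of(text: str, char_width: float = 0.5) -> int:
    """Assumed measure: each letter is half a grid wide, rounded up."""
    return math.ceil(len(text) * char_width)


def split_run(
    text: str, columns: int, measure: Callable[[str], int] = grids_of
) -> List[str]:
    """Split a run so that every piece fits within one line of grids."""
    pieces = []
    rest = text
    while rest and measure(rest) > columns:
        # Shrink the prefix until it fits; keep at least one character.
        cut = len(rest)
        while cut > 1 and measure(rest[:cut]) > columns:
            cut -= 1
        pieces.append(rest[:cut])
        rest = rest[cut:]
    if rest:
        pieces.append(rest)
    return pieces
```

Under these assumptions, `split_run("internationalization", 6)` yields a first piece that fills exactly 6 grids and a remainder of 4 grids, so a single cut is enough.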

In a specific implementation manner, the first determining unit is specifically configured to determine, in response to the character type of the character being a symbol, that the character occupies a second preset number of grids.

In a specific implementation manner, the first determining unit is specifically configured to determine, in response to the character type of the next character adjacent to the character also being a symbol and the two characters being mergeable, that the character and the next character together occupy a second preset number of grids.
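The merging of adjacent symbols can be illustrated with the sketch below. Which pairs of symbols may be merged is not stated in the source, so the set `MERGEABLE` is an assumed example, as are the name `merge_symbols`, the use of `str.isalnum` to detect symbols, and a second preset number of 1 grid.

```python
from typing import List, Optional, Tuple

# Assumed mergeable pairs; the source does not specify them.
MERGEABLE = {"。”", "！”", "？”", "：“", "，“"}
SYMBOL_GRIDS = 1  # assumed value of the second preset number


def merge_symbols(text: str) -> List[Tuple[str, int, Optional[str]]]:
    """Return (text, grids, punctuation parameter) entries for each character."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isalnum():
            out.append((ch, 1, None))  # not a symbol; 1 grid as a placeholder
            i += 1
            continue
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt and not nxt.isalnum() and ch + nxt in MERGEABLE:
            out.append((ch + nxt, SYMBOL_GRIDS, "double"))
            i += 2  # the next symbol was consumed by the merge
        else:
            out.append((ch, SYMBOL_GRIDS, "single"))
            i += 1
    return out
```

Recording "double" or "single" alongside each entry mirrors the punctuation parameter described for the word segmentation list, so a renderer can decide how to fit two marks into the shared grids.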

In a specific implementation manner, the apparatus further includes: an adjusting unit (not shown in the figure);

the adjusting unit is configured to, in response to the character type of the character being a symbol, adjust the display position of the character according to the display rule of the character, where the display rule indicates the conditions that must be met when the character is displayed.

In a specific implementation manner, the adjusting unit is specifically configured to: when the display position of the character is the head of a line and the display rule is that the character cannot be located at the head of a line, search forward for a first target character, where the first target character is a character that can be located at the head of a line, is not yet located at the head of a line, and whose adjacent previous character can be located at the tail of a line; and adjust the display position of the character according to the relative positions of the first target character and the character.

In a specific implementation manner, the adjusting unit is specifically configured to: when the display position of the character is the tail of a line and the display rule is that the character cannot be located at the tail of a line, search forward for a second target character, where the second target character is a character that can be located at the tail of a line and whose adjacent next character can be located at the head of a line; and determine the display position of the character according to the relative positions of the second target character and the character.

In a specific implementation manner, the apparatus further includes: a labeling unit (not shown in the figure);

the acquisition unit is further configured to acquire the characters whose character type is the Chinese character in the text to be processed, together with the pinyin information corresponding to those characters;

and the marking unit is used for marking the characters according to the pinyin information.

It should be noted that, for implementation of each unit in this embodiment, reference may be made to relevant description in the method embodiment shown in fig. 2, and this embodiment is not described herein again.

Referring now to fig. 6, a schematic diagram of an electronic device 600 suitable for implementing embodiments of the present application is shown. The terminal device in the embodiments of the present application may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), and a vehicle-mounted terminal (e.g., a car navigation terminal), and fixed terminals such as a digital TV and a desktop computer. The electronic device shown in fig. 6 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present application.

As shown in fig. 6, the electronic device 600 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 602 or a program loaded from a storage means 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the electronic device 600. The processing means 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.

In particular, according to embodiments of the application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present application.

The electronic device provided by this embodiment of the present application and the text processing method provided by the foregoing embodiments belong to the same inventive concept. Technical details not described in detail in this embodiment can be found in the foregoing embodiments, and this embodiment has the same beneficial effects as the foregoing embodiments.

The embodiment of the present application provides a computer readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the text processing method according to any of the above embodiments.

It should be noted that the computer readable medium mentioned above in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.

The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.

The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the text processing method.

Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present application may be implemented by software or hardware. In some cases the name of a unit/module does not constitute a limitation on the unit itself; for example, a voice data collection module may also be described as a "data collection module".

The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.

In the context of this application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another. Because the system or device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.

It should be understood that, in the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may indicate: only A is present, only B is present, or both A and B are present, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or plural.

It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
