Method for converting Chinese pinyin into Braille ASCII codes

Document No.: 361614 Publication date: 2021-12-07

Reading note: This technology, "Method for converting Chinese pinyin into Braille ASCII codes" (一种汉字拼音到盲文ASCII码的转换方法), was designed by 王丹英 (Wang Danying) and 杨文珍 (Yang Wenzhen) on 2020-06-07. Main content: The invention discloses a method for converting the pinyin of Chinese characters into Braille ASCII codes. Converting Chinese pinyin into Braille ASCII codes is one of the core technologies of a Chinese Braille computer translation system. The pinyin of every Chinese character has a final, and the inventors observed that the first character of every final differs from every character used in the initials. On this basis the invention proposes a flag-bit pinyin segmentation algorithm that obtains the initial, final and tone of each Chinese character. The invention further proposes a three-element pinyin-to-Braille-ASCII matching algorithm that obtains the Braille ASCII code of each Chinese character. The invention converts Chinese pinyin into Braille ASCII codes efficiently, provides an important technique for solving the tone-marking problem of Chinese Braille, and lays a foundation for the digitization of the national universal Braille.

1. A method for converting Chinese pinyin into Braille ASCII codes, characterized in that: the method comprises a flag-bit pinyin segmentation algorithm and a three-element pinyin-to-Braille-ASCII matching algorithm; for each Chinese character pinyin string, which may contain an initial, a final and a tone, the flag-bit pinyin segmentation algorithm builds a final first-character table and an initial character table on the objective fact that the first character of every final differs from every character of the initials, searches for the flag bit Pos from the side of the final, and segments the pinyin string into initial, final and tone; the three-element pinyin-to-Braille-ASCII matching algorithm then looks up the Braille ASCII correspondence dictionaries of the initials, finals and tones to obtain the Braille ASCII code of the initial, of the final and of the tone respectively, and combines them in order into the Braille ASCII code of the Chinese character.

2. The flag-bit pinyin segmentation algorithm as claimed in claim 1, characterized in that:

1) reading the pinyin string of a Chinese character;

2) if the pinyin string is empty, displaying in a message window that the pinyin of the current character is empty, and returning;

3) if the pinyin string is not empty, starting from its first character, judging whether the current character belongs to the final first-character table; if so, taking the position of the current character as the flag bit Pos that splits the initial from the final; if not, moving to the next character of the pinyin string and continuing the judgment until the first character of the final is found;

4) if Pos is equal to 0, the initial of the pinyin string is empty, the final runs from the first character to the second-to-last character of the pinyin string, and the tone is the last character of the pinyin string;

5) if Pos is not equal to 0, the initial runs from the first character to the character just before the flag bit Pos, the final runs from the flag bit Pos to the second-to-last character, and the tone is the last character.

3. The method of claim 1, wherein the final first-character table and the initial character table are established by the following steps:

1) extracting the final first-character table from the 24 finals of Chinese pinyin: a, o, e, i, u, ü, ai, ei, uei (ui), ao, ou, iou (iu), ie, üe, er, an, en, in, uen (un), ün, ang, eng, ing and ong;

First character of final: a o e i u ü

2) extracting the initial character table from the 23 initials of Chinese pinyin: b, p, m, f, d, t, n, l, g, k, h, j, q, x, zh, ch, sh, r, z, c, s, y and w;

Initial character: b p m f d t n l g k h j q x r z c s y w

3) comparing the final first-character table with the initial character table shows that the first character of every final differs from every character of the initials; the flag-bit pinyin segmentation algorithm therefore traverses the characters of the pinyin string and takes the first character of the final as the flag bit Pos that splits the initial from the final.

4. The three-element pinyin-to-Braille-ASCII matching algorithm as claimed in claim 1, characterized in that:

1) if the initial is empty, outputting an empty Braille ASCII code;

2) if the initial is not empty, looking up the initial Braille ASCII correspondence dictionary and outputting the Braille ASCII code of the initial;

3) if the final is empty, outputting an empty Braille ASCII code;

4) if the final is not empty, looking up the final Braille ASCII correspondence dictionary and outputting the Braille ASCII code of the final;

5) looking up the tone Braille ASCII correspondence dictionary and outputting the Braille ASCII code of the tone;

6) combining the Braille ASCII code of the Chinese character in the order initial Braille ASCII code, final Braille ASCII code, tone Braille ASCII code.

Technical Field

The invention relates to a Braille ASCII code conversion method, in particular to a method for converting Chinese pinyin into Braille ASCII codes.

Background

Braille is a tactile writing system specially designed for blind people, who read and write it by touch. The internationally used Braille cell arranges 6 dots in three rows and two columns, which gives 64 dot patterns; each pattern is called a cell.

Chinese Braille is a script based on Chinese characters and pinyin. The Braille of one Chinese character consists of an initial cell, a final cell and a tone cell; the initial cell or the tone cell may be absent. For a long time, mainland China has used the existing Braille (现行盲文) and the double-spelling Braille, with the existing Braille predominant. The existing Braille rule of "marking tone only when necessary" leads to an extremely low tone-marking rate, and the tone-marking rules are arbitrary and complicated and often involve subjective judgment. When polyphonic characters are encountered, blind readers have to guess the tone, which greatly reduces tactile reading efficiency. To overcome this inherent defect of the existing Braille, China has in recent years been promoting the national universal Braille (国家通用盲文). It adopts a full tone-marking strategy, which largely eliminates the ambiguity caused by unmarked tones and lets blind readers read Braille by touch more accurately.

With the development of computer technology, the digitization of Braille is inevitable. Braille ASCII is a subset of the ASCII information interchange code: the 64 ASCII characters with codes 32 to 95 correspond one-to-one to the 64 Braille cells. Braille ASCII has become the standard code of Braille computer equipment and is widely used in digital Braille software and hardware systems.
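As a brief illustration of the 32 to 95 range described above, the following Python sketch enumerates the 64 Braille ASCII characters and pairs each with a Unicode Braille pattern. The names `BRAILLE_ASCII_BY_DOTS`, `BRAILLE_ASCII_RANGE` and `to_unicode_braille` are illustrative and not part of the patent. The ordering string follows the commonly published Braille ASCII table as understood here and should be checked against an authoritative table before any real use.

```python
# Braille ASCII characters ordered by dot pattern: index i is the cell whose
# raised dots are the set bits of i (bit 0 = dot 1, ..., bit 5 = dot 6).
# Assumption: this ordering follows the commonly published Braille ASCII table.
BRAILLE_ASCII_BY_DOTS = (
    " A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="
)

# The 64 codes named in the text: ASCII 32 through 95 inclusive.
BRAILLE_ASCII_RANGE = [chr(code) for code in range(32, 96)]


def to_unicode_braille(text: str) -> str:
    """Map Braille ASCII characters to Unicode Braille patterns (U+2800 block).

    Raises ValueError for characters outside the 64-character set.
    """
    return "".join(chr(0x2800 + BRAILLE_ASCII_BY_DOTS.index(ch)) for ch in text)


if __name__ == "__main__":
    print(len(BRAILLE_ASCII_RANGE))      # number of Braille ASCII characters
    print(to_unicode_braille("H81"))     # dot patterns for the string "H81"
```

Rendering Braille ASCII as Unicode patterns is only a display convenience; the conversion method itself works on the ASCII characters.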

In the information age, whichever of the existing Braille, the double-spelling Braille and the national universal Braille is used, the digitization of Chinese Braille must be solved and a Chinese Braille computer translation system established. Converting Chinese pinyin into Braille ASCII codes is one of the core technologies of such a system. The key to a computer algorithm for this conversion lies in handling the mapping between pinyin and Braille ASCII codes.

In 2010, the patent "Automatic translation and conversion method from Chinese to Braille" (CN1591414B) stated that combined word blocks can be converted into Braille according to the spelling and tone-marking rules of Braille, but did not disclose a specific method for converting pinyin into Braille. In 2011, the paper "Design of a Chinese character Braille conversion system" (Yang Chao, etc.) further stated that the corresponding Braille phonetic codes can be looked up one by one from the pinyin to form Braille text, but did not describe a specific conversion algorithm. In 2016, the paper "Design and implementation of SunBraille, a Chinese conversion software for the visually impaired" (luqian, etc.) mentioned that the pinyin string can be split with a split(",") function to obtain the tone, with 1, 2, 3, 4 and 5 denoting yinping (first tone), yangping (second tone), shangsheng (third tone), qusheng (fourth tone) and the neutral tone respectively, after which a pinyin-to-Braille-ASCII conversion dictionary is used to convert the pinyin into Braille ASCII codes. In 2017, the patent "An automatic efficient translation and conversion method from Chinese to Braille" (CN201710550659.8) mentioned that phoneme recognition and segmentation are performed on the pinyin string to identify whole syllables, initials and finals, and that the 6-bit Braille code of each phoneme is obtained from a self-built phoneme-to-Braille lookup table.

The papers "Design of a Chinese character Braille conversion system" and "Design and implementation of SunBraille, a Chinese conversion software for the visually impaired" do not separate initials from finals. They look up whole pinyin strings one by one, which requires traversing a huge Chinese pinyin library and a pinyin-to-Braille-ASCII conversion dictionary, so the computation is heavy and the conversion inefficient. The patent "An automatic efficient translation and conversion method from Chinese to Braille" uses a forward maximum matching algorithm to recognize and segment the phonemes of the pinyin string: the letters of the pinyin string are matched against the phonemes in the phoneme-to-Braille lookup table with a step length that goes from long to short, the initial step length being the total number of letters in the pinyin string. If matching succeeds at the maximum step length, matching terminates; otherwise the step length is changed to two letters, the length of the longest initial, for pre-matching, and after a successful match the remaining letters are matched directly as the final, so that the string is divided into whole syllables, initials and finals. Although forward maximum matching is computationally more efficient than one-by-one pinyin lookup, it has difficulty separating the initial, final and tone at the same time, does not overcome the inherent defect of the existing Braille, and does not help the digitization of the national universal Braille.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention provides a method for converting pinyin of Chinese characters into Braille ASCII codes.

The technical scheme adopted by the invention is as follows.

A method for converting Chinese pinyin into Braille ASCII codes comprises a flag-bit pinyin segmentation algorithm and a three-element pinyin-to-Braille-ASCII matching algorithm. For each Chinese character pinyin string, which may contain an initial, a final and a tone, the flag-bit pinyin segmentation algorithm builds a final first-character table and an initial character table on the objective fact that the first character of every final differs from every character of the initials, searches for the flag bit Pos from the side of the final, and segments the pinyin string into initial, final and tone. The three-element pinyin-to-Braille-ASCII matching algorithm then looks up the Braille ASCII correspondence dictionaries of the initials, finals and tones to obtain the Braille ASCII code of the initial, of the final and of the tone respectively, and combines them in order into the Braille ASCII code of the Chinese character.

The detailed steps of the flag-bit pinyin segmentation algorithm are: 1) read the pinyin string of a Chinese character; 2) if the pinyin string is empty, display in a message window that the pinyin of the current character is empty, and return; 3) if the pinyin string is not empty, start from its first character and judge whether the current character belongs to the final first-character table; if so, take the position of the current character as the flag bit Pos that splits the initial from the final; if not, move to the next character of the pinyin string and continue the judgment until the first character of the final is found; 4) if Pos is equal to 0, the initial of the pinyin string is empty, the final runs from the first character to the second-to-last character of the pinyin string, and the tone is the last character of the pinyin string; 5) if Pos is not equal to 0, the initial runs from the first character to the character just before the flag bit Pos, the final runs from the flag bit Pos to the second-to-last character, and the tone is the last character.
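The segmentation steps above can be sketched in Python as follows. This is a minimal sketch rather than the inventors' implementation. It assumes that Pos is a 0-based index (so Pos equal to 0 means there is no initial) and that the last character of the string is always the tone digit. The name `split_pinyin` is hypothetical, and raising an exception stands in for the message window mentioned in step 2.

```python
# First characters of the finals, as listed in Table 1 of the text.
FINAL_FIRST_CHARS = frozenset("aoeiuü")


def split_pinyin(pinyin: str) -> tuple[str, str, str]:
    """Split a tone-numbered pinyin string into (initial, final, tone).

    Assumes the last character is always the tone digit, e.g. "hang2".
    """
    if not pinyin:
        # The text shows a message window here; raising is a stand-in.
        raise ValueError("pinyin string is empty")

    # Scan from the front; the first hit is the flag bit Pos (0-based here).
    pos = next(
        (i for i, ch in enumerate(pinyin) if ch in FINAL_FIRST_CHARS), None
    )
    if pos is None:
        # Not covered by the text: a string with no final at all.
        raise ValueError(f"no final found in {pinyin!r}")

    initial = pinyin[:pos]   # empty when pos == 0
    final = pinyin[pos:-1]   # from Pos up to the second-to-last character
    tone = pinyin[-1]        # last character
    return initial, final, tone


if __name__ == "__main__":
    for example in ("hang2", "an1", "zhou1", "lü4"):
        print(example, split_pinyin(example))
```

Each character is tested against a set of only six entries, so the scan stops after at most a few membership checks per syllable.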

The detailed steps of establishing the final first-character table and the initial character table are: 1) extract the final first-character table from the 24 finals of Chinese pinyin, a, o, e, i, u, ü, ai, ei, uei (ui), ao, ou, iou (iu), ie, üe, er, an, en, in, uen (un), ün, ang, eng, ing and ong; the result is shown in Table 1; 2) extract the initial character table from the 23 initials of Chinese pinyin, b, p, m, f, d, t, n, l, g, k, h, j, q, x, zh, ch, sh, r, z, c, s, y and w; the result is shown in Table 2; 3) comparing the final first-character table with the initial character table shows that the first character of every final differs from every character of the initials, so the flag-bit pinyin segmentation algorithm traverses the characters of the pinyin string and takes the first character of the final as the flag bit Pos that splits the initial from the final.

Table 1: Final first-character table.

First character of final: a o e i u ü

Table 2: Initial character table.

Initial character: b p m f d t n l g k h j q x r z c s y w
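The observation behind Tables 1 and 2 can be checked mechanically. The short Python sketch below derives both tables from the lists of finals and initials and confirms that they share no character. The finals are written with ü, üe and ün, matching the ü entry in Table 1, and with the full spellings uei, iou and uen; the variable names are illustrative only.

```python
FINALS = [
    "a", "o", "e", "i", "u", "ü", "ai", "ei", "uei", "ao", "ou", "iou",
    "ie", "üe", "er", "an", "en", "in", "uen", "ün", "ang", "eng", "ing", "ong",
]
INITIALS = [
    "b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h", "j", "q", "x",
    "zh", "ch", "sh", "r", "z", "c", "s", "y", "w",
]

# Table 1: the first character of every final.
final_first_chars = {final[0] for final in FINALS}
# Table 2: every character that occurs in any initial.
initial_chars = {ch for initial in INITIALS for ch in initial}
# For contrast: every character that occurs anywhere in a final.
all_final_chars = {ch for final in FINALS for ch in final}

if __name__ == "__main__":
    print(sorted(final_first_chars))
    print(final_first_chars.isdisjoint(initial_chars))
    print(sorted(all_final_chars & initial_chars))
```

Only the first characters of the finals are disjoint from the initial characters. Later characters of a final (n, g, r) also occur in initials, which is why the scan has to run from the front of the string and stop at the first match.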

The detailed steps of the three-element pinyin-to-Braille-ASCII matching algorithm are: 1) if the initial is empty, output an empty Braille ASCII code; 2) if the initial is not empty, look up the initial Braille ASCII correspondence dictionary (Table 3) and output the Braille ASCII code of the initial; 3) if the final is empty, output an empty Braille ASCII code; 4) if the final is not empty, look up the final Braille ASCII correspondence dictionary (Table 4) and output the Braille ASCII code of the final; 5) look up the tone Braille ASCII correspondence dictionary (Table 5) and output the Braille ASCII code of the tone; 6) combine the Braille ASCII code of the Chinese character in the order initial Braille ASCII code, final Braille ASCII code, tone Braille ASCII code.

Table 3: Initial Braille ASCII code correspondence dictionary.

Table 4: Final Braille ASCII code correspondence dictionary.

Table 5: Tone Braille ASCII code correspondence dictionary.
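The matching steps can be sketched as three dictionary lookups followed by concatenation. The contents of Tables 3 to 5 are not reproduced in this text, so the dictionaries below are placeholders that hold only the three entries given in the worked example (h to H, ang to 8, tone 2 to 1). The function name `match_braille_ascii` is hypothetical, and a real system would need the complete tables.

```python
# Placeholder dictionaries. Only these entries appear in the text's worked
# example; the full Tables 3 to 5 are not reproduced here.
INITIAL_TO_BRAILLE = {"h": "H"}
FINAL_TO_BRAILLE = {"ang": "8"}
TONE_TO_BRAILLE = {"2": "1"}


def match_braille_ascii(initial: str, final: str, tone: str) -> str:
    """Combine the Braille ASCII codes of initial, final and tone, in that order.

    An empty initial or final contributes an empty code (steps 1 and 3).
    A KeyError means the placeholder dictionaries lack the requested entry.
    """
    initial_code = INITIAL_TO_BRAILLE[initial] if initial else ""
    final_code = FINAL_TO_BRAILLE[final] if final else ""
    tone_code = TONE_TO_BRAILLE[tone]
    return initial_code + final_code + tone_code


if __name__ == "__main__":
    print(match_braille_ascii("h", "ang", "2"))
```

Keeping three small dictionaries instead of one dictionary keyed by whole syllables is what makes the segmentation step worthwhile: the tables grow with the number of initials, finals and tones, not with the number of syllables.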

Compared with the prior art, the invention has the following beneficial effects.

(1) Some prior art divides the pinyin string of a Chinese character into pinyin and tone, and some divides it into whole syllables, initials and finals. Neither result matches the rule that the Braille of a Chinese character consists of an initial cell, a final cell and a tone cell. The invention divides the pinyin string of a Chinese character into initial, final and tone, which matches that rule exactly.

(2) Compared with prior art such as one-by-one pinyin lookup and forward maximum matching, the flag-bit pinyin segmentation algorithm rests on the inventors' observation that the first character of every final differs from every character of the initials. It establishes a final first-character table and an initial character table, and only the 6 characters of the final first-character table need to be traversed to split the pinyin string of a Chinese character into initial, final and tone. The computation is small, which improves the efficiency of converting pinyin into Braille ASCII codes.

(3) Building on the initial, final and tone produced by the flag-bit pinyin segmentation algorithm, the invention proposes a three-element pinyin-to-Braille-ASCII matching algorithm that conveniently assembles the Braille ASCII code of a Chinese character, effectively addresses the tone-marking problem of Chinese Braille, and meets the digitization needs of the national universal Braille.

Drawings

FIG. 1 is a flow chart of the flag-bit pinyin segmentation algorithm.

FIG. 2 is a flow chart of the three-element pinyin-to-Braille-ASCII matching algorithm.

Detailed Description

The invention is further described below with reference to the accompanying drawings.

The pinyin string of a Chinese character may consist of an initial, a final and a tone. Chinese pinyin has 23 initials, 24 finals and 5 tones. Table 1 is the final first-character table, Table 2 is the initial character table, Table 3 is the initial Braille ASCII code correspondence dictionary, Table 4 is the final Braille ASCII code correspondence dictionary, and Table 5 is the tone Braille ASCII code correspondence dictionary.

As shown in FIG. 1 and FIG. 2, the invention provides a method for converting Chinese pinyin into Braille ASCII codes, which comprises a flag-bit pinyin segmentation algorithm and a three-element pinyin-to-Braille-ASCII matching algorithm. For each Chinese character pinyin string, which may contain an initial, a final and a tone, the flag-bit pinyin segmentation algorithm builds a final first-character table and an initial character table on the objective fact that the first character of every final differs from every character of the initials, searches for the flag bit Pos from the side of the final, and segments the pinyin string into initial, final and tone. The three-element pinyin-to-Braille-ASCII matching algorithm then looks up the Braille ASCII correspondence dictionaries of the initials, finals and tones to obtain the Braille ASCII code of the initial, of the final and of the tone respectively, and combines them in order into the Braille ASCII code of the Chinese character.

As shown in FIG. 1, the detailed steps of the flag-bit pinyin segmentation algorithm are: 1) read the pinyin string of a Chinese character; 2) if the pinyin string is empty, display in a message window that the pinyin of the current character is empty, and return; 3) if the pinyin string is not empty, start from its first character and judge whether the current character belongs to the final first-character table; if so, take the position of the current character as the flag bit Pos that splits the initial from the final; if not, move to the next character of the pinyin string and continue the judgment until the first character of the final is found; 4) if Pos is equal to 0, the initial of the pinyin string is empty, the final runs from the first character to the second-to-last character of the pinyin string, and the tone is the last character of the pinyin string; 5) if Pos is not equal to 0, the initial runs from the first character to the character just before the flag bit Pos, the final runs from the flag bit Pos to the second-to-last character, and the tone is the last character.

The detailed steps of establishing the final first-character table and the initial character table are: 1) extract the final first-character table from the 24 finals of Chinese pinyin, a, o, e, i, u, ü, ai, ei, uei (ui), ao, ou, iou (iu), ie, üe, er, an, en, in, uen (un), ün, ang, eng, ing and ong; the result is shown in Table 1; 2) extract the initial character table from the 23 initials of Chinese pinyin, b, p, m, f, d, t, n, l, g, k, h, j, q, x, zh, ch, sh, r, z, c, s, y and w; the result is shown in Table 2; 3) comparing the final first-character table with the initial character table shows that the first character of every final differs from every character of the initials, so the flag-bit pinyin segmentation algorithm traverses the characters of the pinyin string and takes the first character of the final as the flag bit Pos that splits the initial from the final.

As shown in FIG. 2, the detailed steps of the three-element pinyin-to-Braille-ASCII matching algorithm are: 1) if the initial is empty, output an empty Braille ASCII code; 2) if the initial is not empty, look up the initial Braille ASCII correspondence dictionary (Table 3) and output the Braille ASCII code of the initial; 3) if the final is empty, output an empty Braille ASCII code; 4) if the final is not empty, look up the final Braille ASCII correspondence dictionary (Table 4) and output the Braille ASCII code of the final; 5) look up the tone Braille ASCII correspondence dictionary (Table 5) and output the Braille ASCII code of the tone; 6) combine the Braille ASCII code of the Chinese character in the order initial Braille ASCII code, final Braille ASCII code, tone Braille ASCII code.

Specific Embodiment

As shown in Table 6, take the character 杭 (hang) of 杭州 (Hangzhou) as an example. Its pinyin string is "hang2", and the length of the string is 5. The flag-bit pinyin segmentation algorithm first takes the first character "h" of the string, traverses the final first-character table and does not find "h". It then takes the second character "a", traverses the final first-character table and finds "a", so the flag bit Pos = 2. It follows that the first character "h" is the initial, the second to fourth characters "ang" are the final, and the fifth and last character "2" is the tone, denoting yangping (the second tone). The three-element pinyin-to-Braille-ASCII matching algorithm then finds that the Braille ASCII code of the initial "h" is H, that of the final "ang" is 8, and that of the tone "2" is 1, and combines them in order to obtain the Braille ASCII code H81.

Table 6: Braille ASCII code of Hangzhou.
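The scan in this example can be traced step by step. The sketch below records each character examined and whether it is in the final first-character table, counting positions from 1 so that the result lines up with Pos = 2 in the example. The function name `trace_flag_scan` is hypothetical, and the 1-based counting is an assumption made to match the example's numbering.

```python
FINAL_FIRST_CHARS = frozenset("aoeiuü")


def trace_flag_scan(pinyin: str) -> list[tuple[int, str, bool]]:
    """Return (1-based position, character, found) for each character examined.

    The scan stops at the first character found in the final first-character
    table, so the last entry marks the flag bit Pos.
    """
    steps = []
    for position, ch in enumerate(pinyin, start=1):
        found = ch in FINAL_FIRST_CHARS
        steps.append((position, ch, found))
        if found:
            break
    return steps


if __name__ == "__main__":
    pinyin = "hang2"
    steps = trace_flag_scan(pinyin)
    for step in steps:
        print(step)

    pos = steps[-1][0]  # 1-based flag bit
    # With a 1-based Pos, the initial is the pos - 1 characters before it.
    initial, final, tone = pinyin[: pos - 1], pinyin[pos - 1 : -1], pinyin[-1]
    print(pos, initial, final, tone)
```

Only two characters are examined for "hang2", which is the source of the small computation cost claimed for the flag-bit approach.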
