Plug-in type voice recognition method, intelligent electronic scale and transaction platform

Document No.: 88054. Published: 2021-10-08.

Reading note: this technology, "Plug-in type voice recognition method, intelligent electronic scale and transaction platform" (一种插件式语音识别方法、智能电子秤及交易平台), was designed by Zheng Jigui (郑吉贵), Xiong Huichao (熊会超) and Zheng Jifu (郑吉富) on 2021-07-07. Its main content is as follows. The invention discloses a plug-in voice recognition method, an intelligent electronic scale and a transaction platform. The method comprises the following steps: S1, voice monitoring: a start mark is set, and when the start mark is detected, the key voice that follows it is acquired; S2, voice coding: the key voice is converted into text data; S3, keyword recognition: an isolated-word length is set according to the text data and keyword recognition is performed on the text data, returning to S1 when no keyword can be recognized; S4, matching and retrieval: the keyword is matched against the data in a database, returning to S1 when there is no matching item, and outputting the matching item and then returning to S1 when the match succeeds. The intelligent electronic scale comprises a voice recognition system and a transaction system. The voice recognition system comprises a voice monitoring module, a voice coding module, a keyword recognition module and a matching retrieval module, connected to the transaction system. The transaction system comprises a display device, a weighing device and a settlement device. The transaction platform comprises the intelligent electronic scale and a server.

1. A plug-in type voice recognition method, characterized by comprising the following steps:

S1, voice monitoring: setting a start mark, and acquiring the key voice that follows when the start mark is detected;

S2, voice coding: converting the key voice into text data;

S3, keyword recognition: setting an isolated-word length according to the text data, performing keyword recognition on the text data, and returning to S1 when no keyword can be recognized;

S4, matching and retrieval: matching the keyword against the data in a database, returning to S1 when there is no matching item, and outputting the matching item and then returning to S1 when the match succeeds.

2. The plug-in voice recognition method according to claim 1, wherein when no keyword can be recognized and/or the recognized keyword has no matching item, the method, after returning to S1, monitors for the key voice directly, and resumes monitoring for the start mark only after the Nth return, where N is a natural number.

3. The plug-in voice recognition method according to claim 1 or 2, wherein in S1, if no key voice is acquired within a preset time, the method returns to monitoring for the start mark.

4. The plug-in voice recognition method according to claim 1 or 2, wherein in S1, a separation mark is set for the key voice in order to distinguish consecutive key voices.

5. The plug-in voice recognition method according to claim 1, wherein the key voice acquired in S1 is subjected to voice noise reduction.

6. The plug-in voice recognition method according to claim 1, wherein an interface is provided for docking with an external system, and the matching item is output to the external system through a return function.

7. The plug-in voice recognition method according to claim 1, wherein the key voice includes dish names and dish classifications, and when a dish classification is acquired first and a matching item for it is found, the method returns to S1, monitors for a subclass of the dish classification and performs voice recognition, until a dish name is detected and recognized.

8. An intelligent electronic scale based on the method of any one of claims 1 to 7, comprising a voice recognition system and a transaction system, wherein the voice recognition system comprises a voice monitoring module, a voice coding module, a keyword recognition module and a matching retrieval module which, together with the transaction system, are connected in sequence, the voice monitoring module also being connected to the keyword recognition module and to the matching retrieval module respectively; the transaction system comprises a display device, a weighing device and a settlement device; the transaction system obtains the matching item, displays its content through the display device and activates the weighing device, and the settlement device obtains payment information and feeds back a payment voucher.

9. The intelligent electronic scale according to claim 8, further comprising an interactive dish selection system for selecting dishes and/or dish classifications and outputting them to the transaction system, wherein the voice recognition system is used for acquiring dishes and/or dish classifications, and the transaction system, after acquiring and displaying a dish classification, acquires the specific dish through the voice recognition system or the interactive dish selection system.

10. A transaction platform for the intelligent electronic scale according to claim 8, comprising the intelligent electronic scale and a server, wherein the intelligent electronic scale uploads access data to the server, the access data comprise settlement data and weighing data, the settlement data are used to reflect the sales volume of dishes and/or dish classifications, and the access data are used to reflect the heat (popularity) of dishes and/or dish classifications.

Technical Field

The invention relates to the technical field of voice recognition, in particular to a plug-in type voice recognition method, an intelligent electronic scale and a transaction platform.

Background

Traditional electronic scales and intelligent electronic scale systems offer only manual input and manual search. When there are many categories or dishes, the seller has to page through several screens on the system display to find the one required. This prolonged paging and searching adds inconvenience to the already busy work of farmers' market operators, and touching and tapping the screen is even less convenient when the hands are wet, slippery or dirty.

Disclosure of Invention

To overcome the shortcomings of the prior art and to achieve plug-in voice recognition and dish searching, the invention adopts the following technical solution:

A plug-in type voice recognition method comprises the following steps:

S1, voice monitoring: a start mark is set, and when the start mark is detected, the key voice that follows is acquired. This avoids the energy consumption of monitoring key voice or recognizing keywords in real time, and avoids false recognition caused by the voices of neighbouring stalls or customers in the farmers' market;

S2, voice coding: the key voice is converted into text data, and the coding also serves to encrypt the data;

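S3, keyword recognition: an isolated-word length is set according to the text data and keyword recognition is performed on the text data, which also filters out irrelevant long sentences or phrases (those longer than the keyword threshold); when no keyword can be recognized, the method returns to S1;

S4, matching and retrieval: the keyword is matched against the data in a database; when there is no matching item, the method returns to S1, and when the match succeeds, the matching item is output and the method returns to S1.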

Further, when no keyword can be recognized and/or the recognized keyword has no matching item, the method, after returning to S1, monitors for the key voice directly, and resumes monitoring for the start mark only after the preset Nth return, where N is a natural number. Because the farmers' market environment is noisy, there is a high probability that a keyword is not recognized from one or two voice inputs. By skipping the start-mark check, the merchant does not need to shout the start mark again and can shout the key voice directly, which makes re-recognition more convenient. The merchant also does not need to wait for the recognition system to report whether a matching item was found (that is, to watch whether the required dish appears on the screen), and can simply speak the key voice two or more times in succession.

Further, in S1, if no key voice is acquired within a preset time, the method returns to monitoring for the start mark. This prevents the system from idling, unable to carry out further voice monitoring, when the merchant or buyer shouts the start mark but then, because of some other matter, does not go on to speak a dish or dish classification. Long utterances that exceed the preset time are also filtered out, which improves the efficiency of noise reduction and voice recognition.

Further, in S1, a separation mark is set for the key voice in order to distinguish consecutive key voices. On the one hand, the key voices before and after the separation mark can be input as a group: for example, two or more dishes or dish classifications can be input one after another as a group, and the subsequent operations (voice noise reduction, voice coding, keyword recognition and so on) are carried out in parallel, which improves the efficiency of key voice recognition. On the other hand, the subsequent operations can be carried out one after another on the key voices before and after the separation mark, which improves the accuracy of recognizing consecutive key voices.

Further, the key voice acquired in S1 is subjected to voice noise reduction: because the farmers' market environment is noisy, the received voice signal needs noise reduction processing.

Further, an interface is provided for docking with an external system, and the matching item is output to the external system through a return function.

Further, the key voice includes dish names and dish classifications. When a dish classification is acquired first and a matching item for it is found, the subclasses of that dish classification are displayed and the method returns to S1 to monitor for and recognize a subclass. Subclasses are displayed and monitored for in this way, cyclically, until a dish name is detected and recognized.

A plug-in voice recognition intelligent electronic scale comprises a voice recognition system and a transaction system. The voice recognition system comprises a voice monitoring module, a voice coding module, a keyword recognition module and a matching retrieval module which, together with the transaction system, are connected in sequence; the voice monitoring module is also connected to the keyword recognition module and to the matching retrieval module respectively. The transaction system comprises a display device, a weighing device and a settlement device. With this transaction system there is no need to tap a dish to select it or to tap to confirm: the customer's payment system is connected to the electronic scale system, and the transaction is completed by default as soon as payment by two-dimensional (QR) code has been made.

Furthermore, the intelligent electronic scale also comprises an interactive dish selection system for selecting dishes and/or dish classifications and outputting them to the transaction system. The voice recognition system is used for acquiring dishes and/or dish classifications, and the transaction system, after acquiring and displaying a dish classification, acquires the specific dish through the voice recognition system or the interactive dish selection system. Noise in the farmers' market may be intermittent, and customers' shopping habits and merchants' operating habits differ: some customers may first want to browse a dish classification on the electronic screen to see which dishes are available, and some merchants may be more used to selecting manually by touch. Whenever interference arises at any stage, the merchant or customer can choose the dish either by manual selection or by voice recognition.

A plug-in type voice recognition transaction platform comprises the intelligent electronic scale and a server. The intelligent electronic scale uploads access data to the server; the access data comprise settlement data and weighing data. The settlement data are used to reflect the sales volume of dishes and/or dish classifications, and the access data are used to reflect the heat (popularity) of dishes and/or dish classifications.

The invention has the advantages and beneficial effects that:

By setting a start mark for voice recognition, the invention avoids the energy consumption of monitoring key voice or recognizing keywords in real time, and also avoids false recognition caused by the voices of neighbouring stalls or customers in the farmers' market. Voice coding converts the key voice into text data, and the coding also serves to encrypt the data. Keyword recognition sets an isolated-word length according to the text data and recognizes keywords in the text data, which also filters out irrelevant long sentences or phrases (those longer than the keyword threshold).

Drawings

FIG. 1 is a flow chart of the method of the present invention.

Detailed Description

The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.

In a traditional intelligent electronic scale system, the dish requested by the customer is usually entered manually or found by paging through screens; the selected dish is tapped, the goods are weighed, the price is output, and the customer pays the weighed price.

Through a plug-in intelligent voice assistant, the intelligent electronic scale system described here offers an optional, parallel alternative to the two steps of manual input or paging and of tapping the selected dish: the seller only needs to select the dish by voice according to the customer's request and then weigh the goods, after which the customer pays.

As shown in FIG. 1, the voice recognition method of the intelligent electronic scale based on the plug-in voice recognition system comprises the following steps:

Step one, voice monitoring: the dish voice signal spoken by the seller is acquired. To save energy and to prevent the system from falsely recognizing the voices of neighbouring stalls or customers in the farmers' market, a voice identifier is used. The voice identifier is spoken before the search keyword, and only when the voice monitoring module detects the voice identifier does it recognize the keyword that follows and carry out the subsequent steps. For example, in "Xiaonong, potato", "Xiaonong" is the voice identifier and "potato" is the key voice.
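The gating behaviour of step one can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a hypothetical upstream component that already delivers recognized words one at a time, and the function name `key_utterances` and the lowercase spelling of the identifier are assumptions made for the example.

```python
from typing import Iterable, Iterator

START_MARK = "xiaonong"  # voice identifier from the example; lowercase form is an assumption


def key_utterances(tokens: Iterable[str], start_mark: str = START_MARK) -> Iterator[str]:
    """Yield only the word that directly follows each start mark.

    `tokens` stands in for the output of a hypothetical word spotter;
    everything not preceded by the start mark is ignored.
    """
    armed = False  # True once the start mark has been heard
    for token in tokens:
        word = token.strip().lower()
        if word == start_mark:
            armed = True          # (re)arm on every start mark
        elif armed:
            armed = False         # consume exactly one key utterance
            yield word


heard = ["cabbage", "Xiaonong", "potato", "tomato"]
accepted = list(key_utterances(heard))  # only "potato" follows the identifier
```

Words spoken without the identifier, such as a neighbouring stall calling out "cabbage", never reach the later steps.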

Step two, voice noise reduction: because the farmers' market environment is noisy, the received voice signal needs noise reduction processing.

Step three, voice coding: the audio data are converted into text data, and the coding also serves to encrypt the data.

Step four, keyword recognition: keywords in the text data are recognized using a Hidden Markov Model (HMM) as the acoustic model. Because dish names are all short words, isolated-word recognition is used, which also filters out irrelevant long sentences or phrases. When no keyword is recognized, the method returns to step one and continues voice monitoring.
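The length-based filtering described in step four can be sketched as below. The HMM acoustic model itself is not reproduced; the sketch only shows how an isolated-word length threshold rejects long sentences. The threshold value of 12 characters and the name `extract_keyword` are assumptions chosen for the English examples.

```python
from typing import Optional

MAX_KEYWORD_LEN = 12  # hypothetical isolated-word length threshold, in characters


def extract_keyword(text: str, max_len: int = MAX_KEYWORD_LEN) -> Optional[str]:
    """Return the text as a keyword candidate, or None if it is empty or too long.

    A None result corresponds to "no keyword recognized", after which the
    method returns to voice monitoring.
    """
    candidate = text.strip()
    if not candidate or len(candidate) > max_len:
        return None
    return candidate


short = extract_keyword(" potato ")
long_sentence = extract_keyword("how much are the potatoes today")
```

Here `short` is accepted as a candidate, while `long_sentence` is filtered out before any database lookup takes place.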

On the other hand, during voice monitoring, key voice is acquired in units of time. For example, the 1 to 2 seconds following "Xiaonong" are used for speaking the key voice of a dish, and longer utterances are filtered out, which improves the efficiency of noise reduction and voice recognition.

When the first recognition fails, the method returns to the voice monitoring module and continues monitoring, without requiring the voice identifier to be recognized again before keyword recognition. If the first input cannot be recognized, a second input is therefore allowed, which improves convenience: the merchant does not need to shout the voice identifier ("Xiaonong") again and can shout "potato" directly. In addition, when the environment is particularly noisy, a single voice input has a high probability of not being recognized. In that case the merchant need not wait for the recognition system to report whether the voice was recognized (that is, watch whether the required dish appears on the screen) and can simply speak the keyword two or more times in succession. For example, after the voice recognition module detects the voice identifier, a preset time (1 to 2 seconds) is allotted to recognizing each keyword, so that "potato" can be recognized in two separate attempts. Suppose the keyword input time is set to 1 second and the merchant shouts "Xiaonong, soil knife, potato", where "soil knife" is a first attempt that cannot be recognized. The first utterance takes up 1 second, the following "potato" is treated as the keyword of the second input, and because the first input cannot be recognized, the "potato" spoken immediately afterwards is recognized instead.
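The retry behaviour can be pictured as a small state machine. The sketch below is illustrative only: the class name `Listener`, the use of exact string matching in place of real recognition, and the limit of two attempts (taken from the "two consecutive recognitions" example) are assumptions.

```python
from typing import Iterable, Optional


class Listener:
    """Tracks whether the next utterance needs the voice identifier first."""

    def __init__(self, known: Iterable[str], start_mark: str = "xiaonong",
                 max_attempts: int = 2) -> None:
        self.known = set(known)
        self.start_mark = start_mark
        self.max_attempts = max_attempts  # N: failed attempts allowed before re-arming
        self.awaiting_key = False
        self.failures = 0

    def hear(self, word: str) -> Optional[str]:
        """Process one utterance; return the matched dish, or None."""
        if not self.awaiting_key:
            if word == self.start_mark:
                self.awaiting_key = True
                self.failures = 0
            return None
        if word in self.known:
            self.awaiting_key = False  # success: wait for the identifier again
            return word
        self.failures += 1
        if self.failures >= self.max_attempts:
            self.awaiting_key = False  # too many failures: require the identifier
        return None


listener = Listener({"potato"})
results = [listener.hear(w) for w in ["xiaonong", "soil knife", "potato", "potato"]]
```

The unrecognized "soil knife" does not force the merchant to repeat "Xiaonong"; the next "potato" is accepted directly, and the final "potato" is ignored because the listener is again waiting for the identifier.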

However, a fixed threshold on keyword recognition time has a problem: dish names differ in length and speakers differ in their habits, so an utterance may overrun the time window. The keywords captured for the first and second inputs are then both incomplete (for example, the first is recognized as "earth-soil" and the second as "bean"; "potato", 土豆, is literally "earth bean"), and neither recognition succeeds. To address this, a separation mark is set between keywords: if no further voice signal is detected within 0.5 seconds after a voice signal ends, capture of that keyword ends, and the voice signal that follows is taken as the keyword for the second recognition. For example, in "Xiaonong, soil knife, potato", a 0.5-second pause marks the end of each keyword, which achieves voice segmentation and continuous recognition. After two consecutive recognitions, the system returns to the voice identifier monitoring state.
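The pause-based separation can be sketched as follows, assuming a hypothetical front end that reports recognized fragments with start and end times in seconds. The `Event` tuple layout and the function name `split_on_silence` are illustrative; only the 0.5-second gap comes from the description.

```python
from typing import List, Optional, Tuple

Event = Tuple[float, float, str]  # (start_s, end_s, recognized fragment), hypothetical format


def split_on_silence(events: List[Event], gap_s: float = 0.5) -> List[str]:
    """Group fragments into keywords, starting a new one after each long pause."""
    segments: List[str] = []
    current: List[str] = []
    last_end: Optional[float] = None
    for start, end, text in events:
        # A silence of at least gap_s since the previous fragment closes the keyword.
        if current and last_end is not None and start - last_end >= gap_s:
            segments.append("".join(current))
            current = []
        current.append(text)
        last_end = end
    if current:
        segments.append("".join(current))
    return segments


events = [
    (0.00, 0.30, "po"), (0.35, 0.60, "ta"), (0.62, 0.90, "to"),  # first keyword
    (1.50, 1.80, "to"), (1.82, 2.10, "ma"), (2.12, 2.40, "to"),  # after a 0.6 s pause
]
keywords = split_on_silence(events)
```

Because segmentation follows the pauses rather than a fixed time window, a long dish name is not cut in half.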

Step five, matching and retrieval: the keyword is matched against the data in a preset library. When there is no matching item, the method returns to step one; when the match succeeds, the matching item is output and the method returns to step one, entering the voice identifier monitoring state.
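A minimal sketch of the matching step, assuming the preset library is an in-memory dictionary keyed by dish name. The record fields and the prices are invented for illustration and are not taken from the source.

```python
from typing import Dict, Optional

# Hypothetical preset library; categories and prices are illustrative values only.
DISHES: Dict[str, dict] = {
    "potato": {"category": "root vegetables", "price_per_kg": 3.0},
    "tomato": {"category": "fruit vegetables", "price_per_kg": 5.0},
}


def match_keyword(keyword: str, database: Dict[str, dict]) -> Optional[dict]:
    """Return the matching item, or None so the caller can go back to monitoring."""
    name = keyword.strip().lower()
    record = database.get(name)
    if record is None:
        return None
    return {"name": name, **record}


hit = match_keyword(" Potato ", DISHES)
miss = match_keyword("soil knife", DISHES)
```

A `None` result corresponds to the "no matching item" branch; any other result is the matching item that is passed on to the transaction system.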

Step six, interface: an interface is provided which, through a return function, delivers the specific word or classification, so that the plug-in can dock with the transaction systems of different intelligent electronic scales.
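One way to picture the interface is a callback that the host transaction system registers with the plug-in. The sketch below is an assumption about how such a return function might be wired; `VoicePlugin`, `submit` and `on_match` are hypothetical names, not part of any real scale's API.

```python
from typing import Callable, Dict, List, Optional

Match = Dict[str, object]


class VoicePlugin:
    """Hands each matching item to whatever callback the host scale supplies."""

    def __init__(self, lookup: Callable[[str], Optional[Match]],
                 on_match: Callable[[Match], None]) -> None:
        self._lookup = lookup      # matching and retrieval step
        self._on_match = on_match  # return function provided by the host system

    def submit(self, keyword: str) -> bool:
        """Return True if a matching item was delivered to the host."""
        item = self._lookup(keyword)
        if item is None:
            return False
        self._on_match(item)
        return True


# Hypothetical host-side wiring: the host simply collects what it receives.
received: List[Match] = []
plugin = VoicePlugin(
    lookup=lambda k: {"name": k} if k in {"potato", "tomato"} else None,
    on_match=received.append,
)
plugin.submit("potato")
plugin.submit("soil knife")  # no match, so nothing is delivered
```

Since the plug-in only depends on the callback signature, the same recognition code could be attached to different scale systems by swapping the `on_match` function.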

Step seven, transaction: according to the keyword, the corresponding dish is displayed on the screen of the electronic scale and weighed directly, the price is displayed, and the receipt is printed after the customer pays by two-dimensional (QR) code. In existing systems, to ensure accuracy, the dish has to be selected and, after the customer pays, checked and confirmed again before the transaction is completed and the transaction record uploaded.

The keyword can also be a dish classification. In that case all dishes of that classification are displayed on the screen of the electronic scale, and the merchant completes the transaction by a further voice input or by tapping manually. Noise in the farmers' market may be intermittent, and customers' shopping habits and merchants' operating habits differ: some customers may first want to browse a dish classification on the electronic screen to see which dishes are available, and some merchants may be more used to selecting manually by touch. For example, a customer first asks to see all dishes in a given classification just when the market happens to be very noisy, so the merchant displays the classification by manual selection. By the time the customer has decided on a dish, the environment is no longer noisy, and the merchant or customer can select the dish by voice input or by manual selection.
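The classification-first flow can be sketched as a walk down a nested catalogue. The catalogue contents and the function name `drill_down` are assumptions for illustration; unrecognized utterances are simply skipped, which stands in for returning to voice monitoring.

```python
from typing import Iterable, Optional

# Hypothetical catalogue: dictionaries are classifications, lists hold dish names.
CATALOG = {
    "vegetables": {
        "root vegetables": ["potato", "carrot"],
        "leafy greens": ["spinach"],
    },
}


def drill_down(utterances: Iterable[str], catalog: dict) -> Optional[str]:
    """Follow spoken classifications until a dish name is recognized."""
    node = catalog
    for word in utterances:
        if isinstance(node, dict):
            if word in node:
                node = node[word]  # descend; the scale would now display this level
        elif word in node:
            return word            # a dish name within the current classification
    return None


dish = drill_down(["vegetables", "noise", "root vegetables", "potato"], CATALOG)
```

A dish name spoken directly, without a classification first, would be handled by the ordinary matching step rather than by this walk.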

Step eight, storing transaction records: the electronic scale has a storage unit that temporarily stores transaction data, which are uploaded in real time or at scheduled intervals. Transaction data are uploaded to the farmers' market server only once the transaction has been completed, which makes the transaction data more accurate, keeps the intelligent transaction accurate and reduces the number of data reads and writes. Data from weighings that did not lead to a transaction can also be uploaded and, combined with the data of completed transactions, form the access data, which are used to show dish heat (popularity). The merchant's hot dishes or transaction-volume ranking can be shown on the electronic screen above the merchant's stall, and the dishes, vegetables, transaction volumes, heat rankings and stall numbers of the whole farmers' market can be shown on the market's electronic screen. This helps customers to choose and locate stalls, saves their purchasing time, improves efficiency, encourages merchants to price reasonably and promotes fair trading.
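The distinction between sales volume and heat can be illustrated with a small aggregation. The record layout, the choice to measure sales by weight, and the name `summarize` are assumptions; the source only states that settlement data reflect sales and that all access data reflect heat.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def summarize(records: Iterable[dict]) -> Tuple[Dict[str, float], Counter]:
    """Return (sales in kg from settled weighings, heat from all weighings)."""
    sales_kg: Dict[str, float] = defaultdict(float)
    heat: Counter = Counter()
    for rec in records:
        heat[rec["dish"]] += 1  # every weighing counts toward heat
        if rec["paid"]:
            sales_kg[rec["dish"]] += rec["weight_kg"]  # only settled ones count as sales
    return dict(sales_kg), heat


# Hypothetical access data uploaded by one scale.
records = [
    {"dish": "potato", "weight_kg": 1.5, "paid": True},
    {"dish": "potato", "weight_kg": 0.5, "paid": True},
    {"dish": "potato", "weight_kg": 2.0, "paid": False},  # weighed but not bought
    {"dish": "tomato", "weight_kg": 1.0, "paid": True},
]
sales_kg, heat = summarize(records)
heat_ranking = [dish for dish, _ in heat.most_common()]
```

The unpaid potato weighing raises the dish's heat without affecting its sales figure, which is the difference between the two measures that a market-wide display would rank.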

The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
