Electronic medical record generation method, device, equipment and medium based on artificial intelligence

Document No.: 1906541    Publication date: 2021-11-30

Reading note: This technology, "Electronic medical record generation method, device, equipment and medium based on artificial intelligence", was designed and created by Sun Yaohui (孙耀辉) on 2021-08-30. Abstract: The invention relates to the technical field of artificial intelligence and discloses an electronic medical record generation method, device, equipment and medium based on artificial intelligence. The method comprises: acquiring the inquiry dialogue voice, patient information and doctor identification carried in a medical record generation request; obtaining a dialogue text through voice role segmentation and voice recognition; performing key symptom recognition to obtain a text of interest; extracting chief complaint features and recognizing a chief complaint result from them; performing medical history recognition and verification according to the patient information to obtain a current medical history result and a past medical history result; and performing medical record generation on the chief complaint result, the current medical history result and the past medical history result through a medical record template generation model to obtain the electronic medical record. The invention thus generates a patient's electronic medical record quickly and accurately on the basis of a doctor-customized medical record template, improving consultation efficiency. The invention is applicable to the field of artificial intelligence and can further advance the construction of smart healthcare.

1. An electronic medical record generation method based on artificial intelligence is characterized by comprising the following steps:

receiving a medical record generation request, and acquiring inquiry dialogue voice, patient information and doctor identification in the medical record generation request;

carrying out voice role segmentation and voice recognition on the inquiry dialogue voice to obtain a dialogue text;

performing key symptom recognition on the dialogue text to obtain a text of interest corresponding to the inquiry dialogue voice;

extracting chief complaint features from the text of interest, recognizing a chief complaint result according to the extracted chief complaint features, and performing medical history recognition and verification on the text of interest according to the patient information to obtain a current medical history result and a past medical history result;

and acquiring a medical record template generation model corresponding to the doctor identification, and performing medical record generation on the chief complaint result, the current medical history result and the past medical history result through the acquired medical record template generation model to obtain an electronic medical record corresponding to the medical record generation request.

2. The method for generating an electronic medical record based on artificial intelligence as claimed in claim 1, wherein after obtaining the electronic medical record, the method further comprises:

receiving a confirmation instruction associated with the doctor identification; wherein the confirmation instruction is generated after the doctor corresponding to the doctor identification checks or modifies the displayed electronic medical record;

and updating and adding a signature to the electronic medical record according to the confirmation instruction to generate the confirmed electronic medical record.

3. The method for generating an electronic medical record based on artificial intelligence as claimed in claim 1, wherein the performing voice role segmentation and voice recognition on the inquiry dialogue voice to obtain a dialogue text comprises:

carrying out segmentation processing on the inquiry dialogue voice to obtain a plurality of voice fragments;

acquiring an audio sample corresponding to the doctor identification, comparing each voice fragment with the audio sample through a role recognition model to obtain the similarity between the audio sample and each voice fragment, marking the voice fragment corresponding to the similarity which is greater than or equal to a preset similarity threshold as a doctor role, and marking the rest voice fragments as patient roles;

extracting frequency domain characteristics of each voice segment through a voice recognition model, and performing character prediction according to the extracted frequency domain characteristics to obtain paragraph texts corresponding to each voice segment;

correspondingly marking the roles of the paragraph texts according to the voice segments marked as the doctor roles and the voice segments marked as the patient roles;

and performing time sequence splicing on the paragraph texts marked by all the roles to obtain the dialog text.

4. The method for generating an electronic medical record based on artificial intelligence as claimed in claim 3, wherein before the extracting of frequency domain features from each of the voice segments through the speech recognition model, the method further comprises:

acquiring a voice sample set; the set of speech samples comprises a plurality of speech samples;

inputting the voice sample into an initial recognition model containing initial parameters;

carrying out audio enhancement processing on the voice sample through the initial recognition model to obtain an audio clip to be processed;

performing teacher acoustic feature extraction on the audio clip to be processed through a teacher network to obtain a first feature vector, and performing student acoustic feature extraction on the audio clip to be processed through a student network to obtain a second feature vector; wherein the initial recognition model comprises the teacher network and the student network; the student network is obtained after distillation learning is carried out on the teacher network;

aligning and comparing the first feature vector, the second feature vector and a dynamic queue in the teacher network to obtain a loss value;

and when the loss value does not reach a preset convergence condition, iteratively updating the initial parameters of the initial recognition model until the loss value reaches the convergence condition, and recording the initial recognition model after convergence as a trained voice recognition model.

5. The method for generating an electronic medical record based on artificial intelligence as claimed in claim 1, wherein the performing key symptom recognition on the dialogue text to obtain a text of interest corresponding to the inquiry dialogue voice comprises:

carrying out symptom recognition on the dialogue text, and identifying a plurality of symptom keywords in the dialogue text;

and performing context semantic analysis and time dimension analysis on each symptom keyword to determine the text of interest in the dialogue text.

6. The method as claimed in claim 1, wherein said identifying and verifying the medical history of the text of interest according to the patient information to obtain the current medical history result and the past medical history result comprises:

carrying out medical history differentiation on the text of interest to obtain the current medical history result and an initial past history result;

and performing medical history verification on the initial past history result according to the historical visit information in the patient information to obtain the past medical history result.

7. The method as claimed in claim 1, wherein the performing medical record generation on the chief complaint result, the current medical history result and the past medical history result through the acquired medical record template generation model to obtain the electronic medical record corresponding to the medical record generation request comprises:

performing template factor feature extraction on the chief complaint result through the acquired medical record template generation model, and generating a medical record template corresponding to the chief complaint result according to the extracted template factor feature;

and filling the current medical history result and the past medical history result into the medical record template corresponding to the chief complaint result to obtain the electronic medical record.

8. An electronic medical record generation device based on artificial intelligence, comprising:

the receiving module is used for receiving a medical record generation request and acquiring inquiry dialogue voice, patient information and doctor identification in the medical record generation request;

the first recognition module is used for carrying out voice role segmentation and voice recognition on the inquiry dialogue voice to obtain a dialogue text;

the second recognition module is used for performing key symptom recognition on the dialogue text to obtain a text of interest corresponding to the inquiry dialogue voice;

the extraction module is used for extracting chief complaint features from the text of interest, recognizing a chief complaint result according to the extracted chief complaint features, and performing medical history recognition and verification on the text of interest according to the patient information to obtain a current medical history result and a past medical history result;

and the generation module is used for acquiring a medical record template generation model corresponding to the doctor identifier, and performing medical record generation on the chief complaint result, the current medical history result and the past medical history result through the acquired medical record template generation model to obtain an electronic medical record corresponding to the medical record generation request.

9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method for generating an electronic medical record based on artificial intelligence according to any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the method for generating an electronic medical record based on artificial intelligence according to any one of claims 1 to 7.

Technical Field

The invention relates to the technical field of model construction of artificial intelligence, in particular to an electronic medical record generation method, device, equipment and medium based on artificial intelligence.

Background

The medical record is the record of the medical activities such as examination, diagnosis and treatment of a patient by a doctor. The medical record is the summary of clinical practice and the legal basis for exploring disease laws and dealing with medical disputes. The medical records have important functions on medical treatment, prevention, teaching, scientific research, hospital management and the like.

However, most patient medical records are still paper records written manually by doctors. Their quality depends on the doctor's experience, the wording is often inconsistent and non-standard, and manual writing lowers both the doctor's working efficiency and the quality of the records.

Disclosure of Invention

The invention provides an electronic medical record generation method and device based on artificial intelligence, a computer device and a storage medium, which can quickly and accurately generate an electronic medical record suited to a patient based on a medical record template customized by a doctor, reduce the doctor's manual input workload, improve the doctor's diagnosis efficiency, and improve the accuracy and timeliness of the medical record.

An electronic medical record generation method based on artificial intelligence comprises the following steps:

receiving a medical record generation request, and acquiring inquiry dialogue voice, patient information and doctor identification in the medical record generation request;

carrying out voice role segmentation and voice recognition on the inquiry dialogue voice to obtain a dialogue text;

performing key symptom recognition on the dialogue text to obtain a text of interest corresponding to the inquiry dialogue voice;

extracting chief complaint features from the text of interest, recognizing a chief complaint result according to the extracted chief complaint features, and performing medical history recognition and verification on the text of interest according to the patient information to obtain a current medical history result and a past medical history result;

and acquiring a medical record template generation model corresponding to the doctor identification, and performing medical record generation on the chief complaint result, the current medical history result and the past medical history result through the acquired medical record template generation model to obtain an electronic medical record corresponding to the medical record generation request.

An electronic medical record generation device based on artificial intelligence, comprising:

the receiving module is used for receiving a medical record generation request and acquiring inquiry dialogue voice, patient information and doctor identification in the medical record generation request;

the first recognition module is used for carrying out voice role segmentation and voice recognition on the inquiry dialogue voice to obtain a dialogue text;

the second recognition module is used for performing key symptom recognition on the dialogue text to obtain a text of interest corresponding to the inquiry dialogue voice;

the extraction module is used for extracting chief complaint features from the text of interest, recognizing a chief complaint result according to the extracted chief complaint features, and performing medical history recognition and verification on the text of interest according to the patient information to obtain a current medical history result and a past medical history result;

and the generation module is used for acquiring a medical record template generation model corresponding to the doctor identifier, and performing medical record generation on the chief complaint result, the current medical history result and the past medical history result through the acquired medical record template generation model to obtain an electronic medical record corresponding to the medical record generation request.

A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the artificial intelligence based electronic medical record generation method when executing the computer program.

A computer-readable storage medium, which stores a computer program that, when executed by a processor, implements the steps of the artificial intelligence based electronic medical record generation method described above.

The invention provides an electronic medical record generation method and device based on artificial intelligence, a computer device and a storage medium. The method receives a medical record generation request and acquires the inquiry dialogue voice, the patient information and the doctor identification in the request; performs voice role segmentation and voice recognition on the inquiry dialogue voice to obtain a dialogue text; performs key symptom recognition on the dialogue text to obtain a text of interest corresponding to the inquiry dialogue voice; extracts chief complaint features from the text of interest, recognizes a chief complaint result according to the extracted chief complaint features, and performs medical history recognition and verification on the text of interest according to the patient information to obtain a current medical history result and a past medical history result; and acquires the medical record template generation model corresponding to the doctor identification and performs medical record generation on the chief complaint result, the current medical history result and the past medical history result through the acquired model to obtain the electronic medical record. The text of interest is recognized automatically through voice role segmentation, voice recognition and key symptom recognition, the chief complaint result corresponding to the text of interest is recognized automatically through chief complaint feature extraction, and the electronic medical record is generated automatically through the corresponding medical record template generation model. An electronic medical record suited to the patient is therefore generated quickly and accurately on the basis of the doctor's customized medical record template, which reduces the doctor's manual input workload, improves consultation efficiency, improves the accuracy and timeliness of the medical record, and improves the patient's satisfaction.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.

FIG. 1 is a schematic diagram of an application environment of an electronic medical record generation method based on artificial intelligence according to an embodiment of the present invention;

FIG. 2 is a flow chart of a method for generating an electronic medical record based on artificial intelligence according to an embodiment of the invention;

FIG. 3 is a flowchart illustrating the step S20 of the method for generating an electronic medical record based on artificial intelligence according to an embodiment of the present invention;

FIG. 4 is a flowchart illustrating the step S30 of the method for generating an electronic medical record based on artificial intelligence according to an embodiment of the present invention;

FIG. 5 is a flowchart illustrating the step S50 of the method for generating an electronic medical record based on artificial intelligence according to an embodiment of the present invention;

FIG. 6 is a flowchart of step S50 of the method for generating an electronic medical record based on artificial intelligence according to another embodiment of the present invention;

FIG. 7 is a schematic block diagram of an apparatus for generating an electronic medical record based on artificial intelligence according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of a computer device in an embodiment of the invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The electronic medical record generation method based on artificial intelligence provided by the invention can be applied to the application environment shown in fig. 1, wherein a client (computer equipment or terminal) communicates with a server through a network. The client (computer device or terminal) includes, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like.

In an embodiment, as shown in fig. 2, an electronic medical record generation method based on artificial intelligence is provided, which mainly includes the following steps S10-S50:

and S10, receiving a medical record generation request, and acquiring the inquiry dialogue voice, the patient information and the doctor identification in the medical record generation request.

Understandably, during a patient's consultation the doctor triggers a recording button in the terminal's application software within the diagnosis window corresponding to the patient, and the terminal starts collecting the speech of the dialogue between the doctor and the patient. After the dialogue ends, the doctor triggers the stop control in the application software to finish the collection, the collected speech is recorded as the inquiry dialogue voice corresponding to the patient, and a medical record generation request is automatically triggered from the inquiry dialogue voice, the patient information and the doctor identification. The medical record generation request therefore contains the inquiry dialogue voice, the patient information and the doctor identification. The doctor identification is a unique identifier granted to the doctor after authentication in the application software. The patient information is the basic information related to the patient in a patient database, for example name, gender, age and historical visit information, where the historical visit information covers all historical visits recorded since the patient identification corresponding to the patient was created.
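For illustration only, the request described above might be represented on the server side roughly as in the following sketch; the field names and types are assumptions of this sketch, not part of the disclosed method.

```python
# Illustrative sketch only: field names are assumptions, not the patent's API.
from dataclasses import dataclass, field

@dataclass
class MedicalRecordRequest:
    """A possible shape for the medical record generation request."""
    inquiry_dialogue_audio: bytes                       # recorded doctor-patient dialogue
    doctor_id: str                                      # unique doctor identification
    patient_info: dict = field(default_factory=dict)    # name, gender, age, historical visit info

# Example with placeholder data
request = MedicalRecordRequest(
    inquiry_dialogue_audio=b"...",                      # audio bytes captured by the terminal
    doctor_id="doc-0001",
    patient_info={"name": "...", "age": 35, "history_visits": []},
)
```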

Further, the application software also comprises a pause button, a continue button and a delete button, wherein the pause button is used for triggering the terminal to pause voice collection, the continue button is used for triggering the terminal to continue voice collection, and the delete button is used for triggering the terminal to finish the voice collection and delete the currently recorded voice.

And S20, performing voice character segmentation and voice recognition on the inquiry dialogue voice to obtain a dialogue text.

Understandably, the inquiry dialogue voice is first segmented into a plurality of voice segments. Voice role segmentation then matches each voice segment against the audio sample corresponding to the doctor identification to distinguish the role of each voice segment, where the roles are doctor and patient. Voice recognition converts the speech of each voice segment into text; it may be performed with an automatic speech recognition (ASR) technology, which converts human speech into text, or with a distillation-learning-based TinyBert speech recognition model. The text output by voice recognition for each voice segment is then marked with the role recognized for that segment, so as to obtain the dialogue text.

In an embodiment, as shown in fig. 3, in step S20, the performing voice role segmentation and voice recognition on the inquiry dialogue voice to obtain a dialogue text includes:

s201, carrying out segmentation processing on the inquiry dialogue voice to obtain a plurality of voice segments.

Understandably, the segmentation processing detects segmentation points in the inquiry dialogue voice with the BIC algorithm and filters the audio between segmentation points with a VAD (Voice Activity Detection) method to obtain a plurality of voice segments. The VAD method performs voice activity detection on the audio between each pair of adjacent segmentation points: if a speech endpoint is detected, the span is kept unchanged; if no speech endpoint is detected, the audio between the two segmentation points is removed. In this way, the segmentation processing extracts the voice segments that contain human speech, removes the silent intervals, and keeps only the useful voice segments.
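As a rough illustration of the "keep only voiced spans" idea described above, the following sketch substitutes a simple energy-based check for a production VAD and assumes the BIC change points are already available; the energy threshold is an arbitrary assumption.

```python
# Minimal sketch: energy-based VAD stand-in, assumed BIC change points as input.
import numpy as np

def is_voiced(segment: np.ndarray, energy_threshold: float = 1e-4) -> bool:
    """Treat a span as speech if its mean energy exceeds a threshold."""
    return float(np.mean(segment ** 2)) > energy_threshold

def filter_segments(audio: np.ndarray, change_points: list[int]) -> list[np.ndarray]:
    """Split audio at (BIC-style) change points and drop silent spans."""
    bounds = [0] + sorted(change_points) + [len(audio)]
    segments = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        segment = audio[start:end]
        if len(segment) and is_voiced(segment):
            segments.append(segment)
    return segments
```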

S202, obtaining an audio sample corresponding to the doctor identification, comparing each voice fragment with the audio sample through a role recognition model to obtain the similarity between the audio sample and each voice fragment, marking the voice fragment corresponding to the similarity which is greater than or equal to a preset similarity threshold as a doctor role, and marking the rest voice fragments as patient roles.

Understandably, the audio sample is an audio file collected in advance that corresponds one-to-one with each doctor identification; it may be a recording of the doctor corresponding to the doctor identification reading fixed content. The role recognition model is a trained model that determines whether an input voice segment is similar to the input audio sample: it extracts voiceprint features from the voice segment and from the audio sample, compares the two sets of voiceprint features, and calculates the similarity between the audio sample and the voice segment, so that the similarity between the audio sample and every voice segment is obtained. A voice segment whose similarity is greater than or equal to the preset similarity threshold is regarded as speech uttered by the doctor and is marked as the doctor role; a voice segment whose similarity is smaller than the preset similarity threshold is regarded as speech uttered by the patient, so the remaining unmarked voice segments are marked as the patient role.

A voiceprint feature characterizes the spectrum of the sound waves produced by a speaker, and the preset similarity threshold is a preset value expressing the required similarity, for example 92% or 95%.
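A minimal sketch of the role-marking step, assuming the role recognition model has already produced voiceprint embeddings for the doctor's audio sample and for each voice segment; cosine similarity and the 0.92 threshold are illustrative choices, not details fixed by the text.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def label_roles(segment_embeddings, doctor_embedding, threshold: float = 0.92):
    """Mark each segment as 'doctor' if its voiceprint is similar enough
    to the doctor's enrolled audio sample, otherwise as 'patient'."""
    roles = []
    for emb in segment_embeddings:
        sim = cosine_similarity(emb, doctor_embedding)
        roles.append("doctor" if sim >= threshold else "patient")
    return roles
```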

S203, extracting frequency domain characteristics of each voice segment through the voice recognition model, and performing character prediction according to the extracted frequency domain characteristics to obtain paragraph texts corresponding to each voice segment.

Understandably, the speech recognition model is a trained model that extracts frequency domain features from an input voice segment and performs text prediction on the extracted features to obtain the characters in that segment. The speech recognition model may be a model that extracts frequency domain features and predicts text using a distillation learning method, in which a teacher network and a student network are trained and text is recognized from the input voice segment through the trained student network; it may also be a text conversion model trained with an automatic speech recognition (ASR) technology, which converts human speech into text.

The frequency domain features are signal features observed in the frequency domain, that is, features related to frequency domain parameters such as the barycentric frequency, average frequency, root mean square frequency and frequency standard deviation. The text prediction process performs masked predictive coding and fine-grained word decoding on the extracted frequency domain features, and then predicts the text content from all the decoded masked sequences with the distillation-learning-based TinyBert student network, thereby obtaining the paragraph text corresponding to each voice segment. Masked predictive coding (MPC) performs predictive coding on a Transformer-based model: 15% of the labels of each masked sequence are randomly masked to select the masked frames, 80% of the selected masked frames are replaced by a zero vector, 10% are replaced by random information from other frames, and the remaining 10% are left unchanged. The distillation-learning-based TinyBert student network is obtained by distillation from the Bert-based teacher network in the speech recognition model. Distillation learning transfers the parameters of corresponding layers and trains a simple model (the student network) with the output of a pre-trained complex model (the teacher network) as the supervision signal. For example, the student network distils at an interval of N layers, that is, a Transformer loss is computed every preset number of teacher layers: if the teacher network has 12 layers and the student network is set to 4 layers, a loss is computed every 3 layers, with the mapping function g(m) = 3 × m, where m is the index of an encoding layer in the student network. The specific correspondence is: layer 1 of the student network corresponds to layer 3 of the teacher network, layer 2 to layer 6, layer 3 to layer 9, and layer 4 to layer 12.
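The interval-distillation correspondence described above (student layer m supervised by teacher layer 3 × m) can be sketched as follows; the mean-squared-error layer loss and equal hidden sizes are assumptions of this sketch, not details fixed by the text.

```python
import numpy as np

def teacher_layer_for(student_layer: int, interval: int = 3) -> int:
    """Mapping g(m) = interval * m: student layer m is supervised by
    teacher layer 3 * m (1 -> 3, 2 -> 6, 3 -> 9, 4 -> 12)."""
    return interval * student_layer

def layer_distillation_loss(student_hidden, teacher_hidden, interval: int = 3) -> float:
    """Mean squared error between each student layer and its mapped teacher layer.
    `student_hidden` / `teacher_hidden` are lists of hidden-state arrays (indexed
    from layer 1) assumed to have the same shape after any projection."""
    loss = 0.0
    for m in range(1, len(student_hidden) + 1):
        t = teacher_layer_for(m, interval)
        loss += float(np.mean((student_hidden[m - 1] - teacher_hidden[t - 1]) ** 2))
    return loss / len(student_hidden)
```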

In an embodiment, before the step S203, that is, before the performing frequency domain feature extraction on each of the speech segments through the speech recognition model, the method includes:

acquiring a voice sample set; the set of speech samples comprises a plurality of speech samples.

Understandably, the voice sample set is a set of all the voice samples, the voice samples are audio files collected historically, the voice samples can be audio files with preset duration, and one section of audio file can be segmented according to the preset duration to obtain the voice samples.

The speech samples are input to an initial recognition model containing initial parameters.

Understandably, the initial recognition model includes the initial parameters, the initial parameters are parameters of each level in the initial recognition model, the initial recognition model includes a teacher network and a student network, and the initial parameters include a teacher parameter corresponding to the teacher network and a student parameter corresponding to the student network.

And carrying out audio enhancement processing on the voice sample through the initial recognition model to obtain an audio clip to be processed.

Understandably, the audio enhancement processing proceeds as follows. First, the high-frequency part of the voice sample is pre-emphasized: because the power spectrum of a speech signal falls as frequency rises, most of the energy is concentrated in the low-frequency part and the signal-to-noise ratio of the high-frequency part is very low, so a first-order or second-order high-pass filter is used to raise the signal-to-noise ratio of the high-frequency part. Second, the pre-emphasized voice sample is framed and windowed: a preset duration (for example 10 ms, 15 ms or 20 ms) is taken as one frame, and to keep a smooth, continuous transition between frames a small overlap (for example 1 ms or 2 ms) is kept between adjacent frames, preferably less than one third of the preset duration; windowing applies a window function to each framed signal. Third, a Fourier transform is applied to each extracted frame signal and the magnitude is squared. Finally, the squared-magnitude signal is filtered and converted to a feature vector through logarithmic power conversion, and the frame signals after audio enhancement are spliced to obtain the audio clip to be processed, which is a clip composed of feature vectors related to the frequency domain features.
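A compact sketch of the pipeline just described (pre-emphasis, framing with a small overlap, windowing, Fourier transform, squared magnitude and log-power conversion); the sampling rate, frame and hop lengths are assumptions, and the filterbank stage is only indicated by a comment.

```python
import numpy as np

def enhance_and_featurize(signal, sr=16000, frame_ms=20, hop_ms=18, pre_emphasis=0.97):
    """Pre-emphasis, framing with a small overlap, Hamming windowing, FFT,
    squared magnitude, and log-power conversion, mirroring the steps above."""
    # 1) pre-emphasis: first-order high-pass filter boosting the high-frequency part
    emphasized = np.append(signal[0], signal[1:] - pre_emphasis * signal[:-1])
    # 2) framing: 20 ms frames with a 2 ms overlap (hop of 18 ms)
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    if len(emphasized) < frame_len:                      # pad very short inputs
        emphasized = np.pad(emphasized, (0, frame_len - len(emphasized)))
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    frames = np.stack([emphasized[i * hop: i * hop + frame_len] for i in range(n_frames)])
    # 3) windowing with a Hamming window
    frames = frames * np.hamming(frame_len)
    # 4) Fourier transform and squared magnitude (power spectrum)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # 5) log-power conversion (a filterbank would normally sit before this step)
    return np.log(power + 1e-10)
```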

Performing teacher acoustic feature extraction on the audio clip to be processed through a teacher network to obtain a first feature vector, and performing student acoustic feature extraction on the audio clip to be processed through a student network to obtain a second feature vector; wherein the initial recognition model comprises the teacher network and the student network; the student network is obtained after distilling learning is carried out on the teacher network.

Understandably, the teacher network is a neural network model trained in advance; it extracts teacher acoustic features from the input audio clip to be processed, outputs a first feature vector according to the extracted teacher acoustic features, and recognizes text content from the output first feature vector. The student network is obtained by performing distillation learning on the teacher network; through distillation learning it extracts student acoustic features from the input audio clip to be processed, outputs a second feature vector according to the extracted student acoustic features, and recognizes text content from the output second feature vector. Preferably, the teacher network is a model constructed on Bert and the student network is a model constructed on TinyBert. Teacher acoustic feature extraction encodes the input audio clip to be processed with the Bert model and normalizes the resulting features, while student acoustic feature extraction encodes in a compressed manner after learning from the teacher network through distillation and then normalizes the features.

The teacher acoustic features are features related to acoustic frequency, namely features that map frequency-domain sequence codes to text content; the student acoustic features are features whose mapping to the teacher acoustic features is learned through distillation. Distillation learning transfers the parameters of corresponding layers and trains a simple model (the student network) with the output of a pre-trained complex model (the teacher network) as the supervision signal.

And aligning and comparing the first feature vector, the second feature vector and the dynamic queue in the teacher network to obtain a loss value.

Understandably, a MoCo training method is applied: a dynamic queue is used to update the feature vectors corresponding to the negative samples, so that training over large numbers of samples is possible while consistency among the negative samples is maintained, and the frequency domain feature extraction is pulled towards the correct samples and pushed away from the negative (incorrect) samples through the dynamic queue. The initial dynamic queue holds the feature vectors of all collected negative samples, that is, feature vectors different from the input voice sample. The alignment and comparison processing adds the first feature vector to the dynamic queue, which relieves the problem that no correct feature vector can be found for alignment and a dead cycle results; the updated dynamic queue then contains the feature vector of one consistent (positive) sample and the feature vectors of a plurality of negative samples. The inner product of the first feature vector with each feature vector in the dynamic queue is computed, the inner product of the second feature vector with each feature vector in the dynamic queue is computed likewise, and the loss value is determined with a cross-entropy formula. The alignment draws the correct acoustic features (teacher and student acoustic features) together while contrasting them with, that is, pushing them away from, other irrelevant features. While outputting the loss value, the student network can also migrate a prediction layer under the distillation learning method, perform masked predictive coding on the second feature vector, and predict the text content corresponding to the second feature vector, so that in later applications of the speech recognition model the text content of an input voice segment can be recognized through the student network alone; this greatly reduces the depth of the model at run time and improves speech recognition efficiency.
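A minimal sketch of the alignment-and-comparison loss in the MoCo style described above: the student feature is treated as the query, the teacher feature as the positive key, and the dynamic queue as the negatives; the temperature value and the L2 normalization are assumptions of this sketch.

```python
import numpy as np

def info_nce_loss(student_vec, teacher_vec, queue, temperature: float = 0.07) -> float:
    """Contrastive loss: align the student (second) feature with the teacher (first)
    feature and push it away from the negatives stored in the dynamic queue."""
    def normalize(v):
        return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-12)

    q = normalize(student_vec)            # query: student acoustic feature
    k_pos = normalize(teacher_vec)        # positive key: teacher acoustic feature
    negatives = normalize(queue)          # negative keys held in the dynamic queue
    logits = np.concatenate(([np.dot(q, k_pos)], negatives @ q)) / temperature
    m = logits.max()                      # cross-entropy with the positive at index 0
    return float((m + np.log(np.sum(np.exp(logits - m)))) - logits[0])

def update_queue(queue, new_key):
    """Enqueue the newest key and drop the oldest so the queue size stays fixed."""
    return np.vstack([queue[1:], new_key])
```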

And when the loss value does not reach a preset convergence condition, iteratively updating the initial parameters of the initial recognition model until the loss value reaches the convergence condition, and recording the initial recognition model after convergence as a trained voice recognition model.

Understandably, the convergence condition may be that the loss value stays small and no longer decreases after 10000 computations, in which case training stops and the converged initial recognition model is recorded as the trained speech recognition model; the convergence condition may also be that the loss value falls below a set convergence threshold, in which case training likewise stops and the converged initial recognition model is recorded as the trained speech recognition model. When the loss value has not reached the convergence condition, the initial parameters of the initial recognition model are adjusted continuously, with the teacher parameters frozen and only the student parameters updated, so that the learning network is drawn ever closer to an accurate result and the accuracy of speech recognition increases. In this way the accuracy of speech recognition and the efficiency of speech-to-text conversion are improved, the size of the speech recognition model is optimized, and the dynamic queue no longer needs to keep growing with negative samples during recognition.
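A hedged sketch of the "freeze the teacher parameters, update only the student parameters" loop, written with PyTorch for concreteness; the model classes, data loader and loss function are placeholders rather than the patent's implementation.

```python
# Sketch only: student/teacher modules, loader and loss_fn are placeholders.
import torch

def train_student(student: torch.nn.Module, teacher: torch.nn.Module,
                  loader, loss_fn, epochs: int = 1, lr: float = 1e-4):
    for p in teacher.parameters():          # teacher parameters stay frozen
        p.requires_grad = False
    teacher.eval()
    optimizer = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for clip in loader:                 # audio clips to be processed
            with torch.no_grad():
                first_vec = teacher(clip)   # teacher acoustic features
            second_vec = student(clip)      # student acoustic features
            loss = loss_fn(second_vec, first_vec)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student
```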

The invention thus acquires a voice sample set comprising a plurality of voice samples; inputs the voice samples into an initial recognition model containing initial parameters; performs audio enhancement processing on the voice samples through the initial recognition model to obtain audio clips to be processed; performs teacher acoustic feature extraction on the audio clips through the teacher network to obtain first feature vectors and student acoustic feature extraction through the student network to obtain second feature vectors, the student network being obtained by distillation learning from the teacher network; aligns and compares the first feature vectors, the second feature vectors and the dynamic queue in the teacher network to obtain a loss value; and, when the loss value has not reached the preset convergence condition, iteratively updates the initial parameters of the initial recognition model until the loss value reaches the convergence condition, recording the converged model as the trained speech recognition model. The audio enhancement processing automatically strengthens the useful audio information without labelling large numbers of voice samples, which saves labour cost; teacher acoustic features are extracted by the teacher network and student acoustic features by the distilled student network, alignment and comparison are performed with the dynamic queue, and the speech recognition model is obtained by iterative training. Training with distillation learning and a self-supervised teacher-student scheme reduces manual labelling time and workload, and performing recognition through the student network speeds up speech recognition, thereby improving its efficiency.

S204, correspondingly marking the roles of the paragraph texts according to the voice segments marked as the doctor role and the voice segments marked as the patient role.

Understandably, the paragraph text corresponding to the voice segment marked as the doctor role is marked as the doctor role, and the paragraph text corresponding to the voice segment marked as the patient role is marked as the patient role.

And S205, performing time sequence splicing on the paragraph texts marked by all the roles to obtain the dialog text.

Understandably, time-sequence splicing splices the role-marked paragraph texts in the order of the time axis, thereby obtaining the dialogue text.
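The splicing step can be sketched as sorting the role-marked paragraph texts by their start time and joining them; the tuple layout and the "role: text" line format are assumptions made for illustration.

```python
def splice_dialogue(labeled_paragraphs):
    """Sort role-labeled paragraph texts by start time and join them into one
    dialogue text. Each item is assumed to be (start_time_sec, role, text)."""
    ordered = sorted(labeled_paragraphs, key=lambda item: item[0])
    return "\n".join(f"{role}: {text}" for _, role, text in ordered)

# Example with placeholder content
dialogue_text = splice_dialogue([
    (12.4, "patient", "I have had a headache for three days."),
    (0.0, "doctor", "What brings you in today?"),
])
```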

The invention thus segments the inquiry dialogue voice into a plurality of voice segments; acquires the audio sample corresponding to the doctor identification, compares each voice segment with the audio sample through the role recognition model to obtain their similarity, marks the voice segments whose similarity is greater than or equal to the preset similarity threshold as the doctor role and the remaining voice segments as the patient role; extracts frequency domain features from each voice segment through the speech recognition model and performs text prediction on the extracted features to obtain the paragraph text corresponding to each voice segment; marks the role of each paragraph text according to the voice segments marked as the doctor role and the patient role; and splices the role-marked paragraph texts in time order to obtain the dialogue text. The inquiry dialogue voice is therefore segmented automatically into the voice segments of the dialogue, the role recognition model distinguishes the segments of the doctor role from those of the patient role by similarity to the audio sample, the speech recognition model predicts the paragraph text of each segment, and the role-marked texts are spliced in time order into the dialogue text. This improves the accuracy of the output dialogue text, and the added role marks provide a data basis for the subsequent output of the text of interest, improving its quality.

And S30, performing key symptom recognition on the dialogue text to obtain a text of interest corresponding to the inquiry dialogue voice.

Understandably, key symptom recognition identifies the words or phrases related to symptoms in the dialogue text and judges from their context whether the semantics are positive or negative, so that the recognized words or phrases with positive semantics are determined as words or phrases of interest. Time dimension analysis is also performed on the dialogue text to extract time-related information and obtain the time text of interest. Finally, the words or phrases of interest and the time text of interest are together recorded as the text of interest.

In an embodiment, as shown in fig. 4, in step S30, the performing key symptom recognition on the dialog text to obtain a text of interest corresponding to the inquiry dialog speech includes:

s301, carrying out symptom identification on the dialog text, and identifying a plurality of symptom keywords in the dialog text.

Understandably, symptom recognition performs word vector conversion on the dialogue text to obtain a vector text corresponding to the dialogue text, extracts the feature vectors related to symptoms in the vector text, performs prediction according to the extracted symptom-related feature vectors to obtain a probability distribution over symptom keywords, and recognizes from this distribution the symptom keywords related to the dialogue text.

S302, performing context semantic analysis and time dimension analysis on each symptom keyword to determine the text of interest in the dialogue text.

Understandably, the context semantic analysis identifies positive or negative terms in the context of each symptom keyword recognized in the dialogue text, so as to determine whether the semantics are positive or negative; that is, it checks whether a term with positive semantics or a term with negative semantics appears in the context of each symptom keyword, determines the semantics according to the order in which they occur, and determines the paragraph text involved in the context semantic analysis as text of interest. The time dimension analysis checks whether terms with a time dimension appear in the context of the recognized symptom keywords, and the paragraph text related to such time-dimension terms is also recorded as text of interest.
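A simplified sketch of combining symptom keyword hits with context semantic (negation) analysis and time dimension analysis to select the text of interest; the negation cue list and the time expression pattern are illustrative English stand-ins, since the actual models and vocabularies are not specified here.

```python
import re

NEGATION_CUES = {"no", "not", "never", "denies", "without"}   # assumed cue list
TIME_PATTERN = re.compile(
    r"\b(\d+\s*(day|days|week|weeks|month|months|year|years)|yesterday|today)\b")

def select_text_of_interest(sentences, symptom_keywords):
    """Keep sentences that mention a symptom keyword with positive semantics
    (no negation cue nearby) or that carry time-dimension information."""
    selected = []
    for sentence in sentences:
        lowered = sentence.lower()
        mentions_symptom = any(k.lower() in lowered for k in symptom_keywords)
        negated = bool(set(lowered.split()) & NEGATION_CUES)
        has_time = bool(TIME_PATTERN.search(lowered))
        if (mentions_symptom and not negated) or has_time:
            selected.append(sentence)
    return selected
```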

The invention thus performs symptom recognition on the dialogue text to recognize a plurality of symptom keywords, and performs context semantic analysis and time dimension analysis on each symptom keyword to determine the text of interest in the dialogue text. Symptom recognition of the dialogue text is therefore performed automatically and in a targeted way, the patient's symptom keywords are identified, and context semantics and the time dimension are analysed to extract the text content that needs attention from the dialogue text. The doctor no longer needs to summarize from whatever information was heard; the text of the inquiry dialogue voice that requires attention is recognized automatically, accurately and quickly, which improves the accuracy and timeliness of the electronic medical record output.

S40, extracting chief complaint features from the text of interest, recognizing a chief complaint result according to the extracted chief complaint features, and performing medical history recognition and verification on the text of interest according to the patient information to obtain a current medical history result and a past medical history result.

Understandably, chief complaint feature extraction performs classified dimension-reduction processing on the content of the text of interest and extracts features of the symptom types after dimension reduction; the chief complaint features are category features of words or phrases of similar symptoms, and the chief complaint result represents the set of similar symptom types in the text of interest. Medical history recognition and verification of the text of interest first performs time-sequence recognition on the dialogue text corresponding to each unit of content (a word, phrase or sentence) in the text of interest and distinguishes whether that unit of content belongs to the present or to the past, thereby obtaining a current medical history result and an initial past history result; the initial past history result is then verified against the historical visit information in the patient information to obtain the past medical history result.

In an embodiment, in step S40, the performing medical history recognition and verification on the text of interest according to the patient information to obtain a current medical history result and a past medical history result includes:

and performing medical history differentiation on the text of interest to obtain the current medical history result and an initial past history result.

Understandably, medical history differentiation recognizes the time sequence in the dialogue text corresponding to each unit of content (a word, phrase or sentence) in the text of interest and distinguishes whether that unit of content belongs to the present or to the past, thereby obtaining the current medical history result and the initial past history result: the current medical history result is the set of paragraph texts whose context relates words or phrases to the present state, and the initial past history result is the set of paragraph texts whose context relates words or phrases to a historical state.

And performing medical history verification on the initial past history result according to the historical visit information in the patient information to obtain the past medical history result.

Understandably, the historical visit information is the information of all historical visits recorded since the patient identification corresponding to the patient was created. Medical history verification performs word vector conversion on the historical visit information and on the initial past history result, compares whether the converted word vectors match or are similar, and thereby determines whether the initial past history result is correct. Matched or similar medical history items are added to the past medical history result, medical history found in the historical visit information but missing from the initial past history result is also added, and if a mismatch or dissimilarity appears, indicating that the patient may have misreported, the mismatched content is added to the past medical history result as well, so that misreporting of the medical history by the patient is avoided.
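A minimal sketch of the verification idea, assuming an embedding function that maps text to vectors and precomputed vectors for the historical visit information; the similarity threshold and the decision to keep mismatched items flagged for review are assumptions of this sketch.

```python
import numpy as np

def verify_past_history(initial_past_items, history_visit_vectors,
                        embed, similarity_threshold: float = 0.8):
    """Check each item of the initial past-history result against vectors built
    from the patient's historical visit records; `embed` maps text to a vector."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    verified = []
    for item in initial_past_items:
        vec = embed(item)
        matched = any(cos(vec, hv) >= similarity_threshold for hv in history_visit_vectors)
        # Matched or not, the item is kept so that possibly misreported history
        # stays visible for the doctor's review, as described above.
        verified.append({"text": item, "matches_history": matched})
    return verified
```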

The invention thus performs medical history differentiation on the text of interest to obtain the current medical history result and the initial past history result, and performs medical history verification on the initial past history result according to the historical visit information in the patient information to obtain the past medical history result. In this way the current medical history result and the initial past history result are distinguished automatically, the initial past history result is verified automatically, and the final past medical history result is output, which ensures that the current and past medical history content of the patient is correct and improves the completeness and correctness of the electronic medical record.

And S50, acquiring a medical record template generation model corresponding to the doctor identifier, and performing medical record generation on the chief complaint result, the current medical history result and the past medical history result through the acquired medical record template generation model to obtain an electronic medical record corresponding to the medical record generation request.

Understandably, each doctor identification corresponds to one medical record template generation model, which is obtained by deep machine learning of the medical record templates specially customized for that doctor from the doctor's historical medical record schemes; the learned content includes, but is not limited to, the doctor's professional terminology for describing diseases, medication suggestions, dosage suggestions and preference features. Medical record generation extracts template factor features from the chief complaint result, generates the medical record template corresponding to the chief complaint result according to the extracted template factor features, and then automatically fills the current medical history result and the past medical history result into the corresponding positions of that medical record template, so that a medical record is obtained and determined as the electronic medical record. The electronic medical record is displayed in a window of the terminal's application software; it reflects the patient's visit information combined with the patient's medical history, conforms to the doctor's customized medical record template, and, because it can be stored in a structured way, makes managing the patient's visit records convenient.

The invention thus receives a medical record generation request and acquires the inquiry dialogue voice, the patient information and the doctor identification in the request; performs voice role segmentation and voice recognition on the inquiry dialogue voice to obtain a dialogue text; performs key symptom recognition on the dialogue text to obtain a text of interest corresponding to the inquiry dialogue voice; extracts chief complaint features from the text of interest, recognizes a chief complaint result according to the extracted chief complaint features, and performs medical history recognition and verification on the text of interest according to the patient information to obtain a current medical history result and a past medical history result; and acquires the medical record template generation model corresponding to the doctor identification and performs medical record generation on the chief complaint result, the current medical history result and the past medical history result through the acquired model to obtain the electronic medical record. The text of interest is recognized automatically through voice role segmentation, voice recognition and key symptom recognition, the chief complaint result corresponding to the text of interest is recognized automatically through chief complaint feature extraction, and the electronic medical record is generated automatically through the corresponding medical record template generation model. An electronic medical record suited to the patient is therefore generated quickly and accurately on the basis of the doctor's customized medical record template, which reduces the doctor's manual input workload, improves consultation efficiency, improves the accuracy and timeliness of the medical record, and improves the patient's satisfaction.

In an embodiment, as shown in fig. 5, in step S50, that is, performing medical record generation on the chief complaint result, the current medical history result, and the past medical history result through the acquired medical record template generation model to obtain an electronic medical record corresponding to the medical record generation request includes:

S501, extracting template factor characteristics of the chief complaint result through the acquired medical record template generation model, and generating a medical record template corresponding to the chief complaint result according to the extracted template factor characteristics.

Understandably, the template factor features are implicit features customized by each doctor according to various combinations in the chief complaints, and the medical record template is a template obtained by learning from the various combination results related to the chief complaint result; the medical record template includes guidance directions, auxiliary test items, medication schemes, and the like.
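
As a minimal sketch, under stated assumptions, of how template factor features might be extracted from the chief complaint result and mapped to a doctor-specific medical record template: MedicalRecordTemplate, extract_template_factors, and predict_template_id are hypothetical names, and in the actual model the factor features are learned implicitly from the doctor's historical records rather than hand-coded.

```python
# Illustrative sketch only; the real model learns these mappings from the doctor's historical records.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MedicalRecordTemplate:
    template_id: str
    guidance_direction: str                       # suggested guidance / follow-up path
    auxiliary_tests: List[str] = field(default_factory=list)
    medication_scheme: str = ""
    body: str = ""                                # template text with placeholders for the history sections


def generate_template(chief_complaint_result: str,
                      model,                                              # doctor-specific model (assumed API)
                      template_library: Dict[str, MedicalRecordTemplate]) -> MedicalRecordTemplate:
    """Extract template factor features from the chief complaint and select the matching template."""
    factors = model.extract_template_factors(chief_complaint_result)      # implicit, doctor-customized features
    template_id = model.predict_template_id(factors)                      # e.g. a classifier over known templates
    return template_library[template_id]
```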

S502, filling the current medical history result and the past medical history result into the medical record template corresponding to the chief complaint result to obtain the electronic medical record.

Understandably, the current medical history result is filled into the position corresponding to the current medical history in the medical record template, and the past medical history result is filled into the position corresponding to the past medical history in the medical record template, so that the electronic medical record is obtained.
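
A minimal sketch of the filling step, assuming the template body marks the two history positions with named placeholders; the placeholder syntax is an assumption for illustration only, and the real positions are learned per doctor.

```python
def fill_template(template, present_history: str, past_history: str) -> str:
    """Fill the two history results into their placeholders to obtain the electronic medical record text."""
    # Placeholder names are hypothetical; "template" is the object selected in the previous step.
    return (template.body
            .replace("{{present_history}}", present_history)
            .replace("{{past_history}}", past_history))

# Illustrative usage:
# record_text = fill_template(template, "Cough and fever for three days ...", "History of hypertension ...")
```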

Therefore, the invention realizes that the medical record template matching the chief complaint result is identified by automatically identifying the template factor features in the chief complaint result, the current medical history result and the past medical history result are automatically filled into the identified medical record template, and the electronic medical record is automatically generated, so that the doctor's consultation time is greatly reduced, consultation efficiency is improved, and the patient's experience satisfaction is improved.

In an embodiment, as shown in fig. 6, after the step S50, that is, after obtaining the electronic medical record, the method includes:

S60, receiving a confirmation instruction associated with the doctor identifier, where the confirmation instruction is generated after the doctor corresponding to the doctor identifier checks or modifies the displayed electronic medical record.

Understandably, after the doctor corresponding to the doctor identifier finishes reviewing the electronic medical record and either confirms that it is correct or enters the corresponding modifications, the doctor triggers a 'confirm/print' button in the application software, so that the confirmation instruction is generated; the confirmation instruction includes the checked or modified electronic medical record.
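
As a small illustrative sketch, the confirmation instruction described above could be carried as a simple structure that bundles the reviewed record with the authentication result mentioned in the following steps; all field names here are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfirmationInstruction:
    """Hypothetical payload for the confirmation instruction; field names are illustrative only."""
    doctor_id: str
    reviewed_record: str                          # the checked or modified electronic medical record
    authentication_result: Optional[dict] = None  # result of the frictionless doctor authentication
```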

And S70, updating and adding a signature to the electronic medical record according to the confirmation instruction to generate the confirmed electronic medical record.

Understandably, the confirmation instruction further includes a doctor authentication result. The checked or modified electronic medical record is updated to a new electronic medical record; then the signature corresponding to the doctor identifier is obtained according to the doctor authentication result, the signature is synthesized into the updated electronic medical record by using an image synthesis technology to complete the signature adding process, and finally the signed electronic medical record is determined to be the confirmed electronic medical record.

The doctor authentication result represents the result of frictionless authentication performed on the doctor corresponding to the doctor identifier; frictionless authentication is an authentication mode that identifies, through technologies such as fingerprint authentication, voiceprint authentication, or facial image authentication, whether the currently acquired fingerprint, audio, or image belongs to the doctor corresponding to the doctor identifier.
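
A minimal sketch of dispatching the captured sample to the matching authentication backend; the verifiers mapping and the matches interface are assumptions, and any fingerprint, voiceprint, or face verification service could stand behind them.

```python
def authenticate_doctor(doctor_id: str, sample: bytes, modality: str, verifiers: dict) -> bool:
    """Frictionless authentication dispatch: route the sample to the verifier for its modality."""
    # verifiers is assumed to look like {"fingerprint": ..., "voiceprint": ..., "face": ...},
    # where each verifier exposes a matches(doctor_id, sample) -> bool method.
    verifier = verifiers.get(modality)
    if verifier is None:
        raise ValueError(f"unsupported authentication modality: {modality}")
    return verifier.matches(doctor_id, sample)
```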

In an embodiment, the step S70, namely, updating the electronic medical record and adding a signature to generate the confirmed electronic medical record, includes:

When the doctor authentication result indicates that the authentication has passed, the signature seal corresponding to the doctor identifier is acquired from a cloud signature seal database, where the cloud signature seal database stores and manages the signature seals of all doctors.

And updating the checked or modified electronic medical record into a new electronic medical record.

And recognizing, by using a text recognition technology, the position of the doctor name corresponding to the doctor identifier in the updated electronic medical record.

Understandably, the text recognition technology performs text recognition on the text in the updated electronic medical record, recognizes the text position of the doctor name corresponding to the doctor identifier, and locates that position.
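
As one possible sketch of locating the doctor's name, the snippet below assumes the updated record is rendered as an image and uses the pytesseract OCR wrapper; the source only specifies "a text recognition technology", so this choice of engine is an assumption.

```python
import pytesseract
from pytesseract import Output


def locate_doctor_name(record_image, doctor_name: str):
    """Return the (x, y, width, height) box of the first word of the doctor's name, or None if not found."""
    data = pytesseract.image_to_data(record_image, output_type=Output.DICT)
    for i, word in enumerate(data["text"]):
        word = word.strip()
        if word and word in doctor_name:
            return data["left"][i], data["top"][i], data["width"][i], data["height"][i]
    return None
```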

And applying an image synthesis technology to stamp the acquired signature seal after the located position, so as to obtain the confirmed electronic medical record.

Understandably, the image synthesis technology is implemented by using image blending operations provided by the OpenCV library; the signing is a process of overlaying the signature seal, as the foreground layer, onto the electronic medical record, which serves as the background, through the image synthesis technology.
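
A minimal sketch of the overlay step with OpenCV, assuming the record is available as a BGR image and the seal is a PNG with an alpha channel; the placement offset relative to the located name box (see the locate_doctor_name sketch above) is an illustrative choice.

```python
import cv2
import numpy as np


def stamp_seal(record_bgr: np.ndarray, seal_bgra: np.ndarray, x: int, y: int) -> np.ndarray:
    """Alpha-blend the seal (foreground) onto the record image (background) at the given top-left corner."""
    # Assumes the seal fits entirely within the record image bounds.
    h, w = seal_bgra.shape[:2]
    roi = record_bgr[y:y + h, x:x + w].astype(np.float32)
    seal_rgb = seal_bgra[:, :, :3].astype(np.float32)
    alpha = seal_bgra[:, :, 3:4].astype(np.float32) / 255.0   # per-pixel transparency of the seal
    record_bgr[y:y + h, x:x + w] = (alpha * seal_rgb + (1.0 - alpha) * roi).astype(np.uint8)
    return record_bgr


# Illustrative usage: place the seal just to the right of the located doctor-name box.
# box = locate_doctor_name(record_img, "Dr. Zhang")
# if box is not None:
#     x, y, w, h = box
#     record_img = stamp_seal(record_img, cv2.imread("seal.png", cv2.IMREAD_UNCHANGED), x + w + 10, y)
```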

According to the invention, the signature seal corresponding to the doctor identifier is acquired from the cloud signature seal database when the doctor authentication result indicates that the authentication has passed; the checked or modified electronic medical record is updated to a new electronic medical record; the position of the doctor name corresponding to the doctor identifier in the updated electronic medical record is recognized by using a text recognition technology; and the acquired signature seal is stamped after that position by applying the image synthesis technology to obtain the confirmed electronic medical record. In this way, the signature seal can be accurately stamped at the signature position, the time and workload of the doctor in applying and verifying the signature seal are saved, centralized management of the signature seals is realized, the doctor's workload during consultation is reduced, and consultation efficiency is improved.

It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.

In an embodiment, an electronic medical record generation device based on artificial intelligence is provided, and the electronic medical record generation device based on artificial intelligence corresponds one to one to the electronic medical record generation method based on artificial intelligence in the above embodiments. As shown in fig. 7, the electronic medical record generation device based on artificial intelligence includes a receiving module 11, a first recognition module 12, a second recognition module 13, an extraction module 14, and a generation module 15. The functional modules are explained in detail as follows:

the receiving module 11 is configured to receive a medical record generation request, and acquire an inquiry dialogue voice, patient information, and a doctor identifier in the medical record generation request;

the first recognition module 12 is configured to perform voice role segmentation and voice recognition on the inquiry dialogue speech to obtain a dialogue text;

the second recognition module 13 is configured to perform key symptom recognition on the dialog text to obtain a focus text corresponding to the inquiry dialog voice;

an extraction module 14, configured to perform chief complaint feature extraction on the text of interest, identify a chief complaint result according to the extracted chief complaint feature, and perform medical history identification and verification on the text of interest according to the patient information to obtain a current medical history result and a past history result;

and the generating module 15 is configured to obtain a medical record template generating model corresponding to the doctor identifier, and perform medical record generation on the chief complaint result, the current medical history result, and the past medical history result through the obtained medical record template generating model to obtain an electronic medical record corresponding to the medical record generating request.
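
As a minimal sketch of how the five modules described above might be wired together; the class and method names are illustrative assumptions, not the actual device interfaces.

```python
class ElectronicMedicalRecordGenerationDevice:
    """Illustrative wiring of the five functional modules; each module is assumed to expose
    the single operation described above."""

    def __init__(self, receiving, first_recognition, second_recognition, extraction, generation):
        self.receiving = receiving                    # module 11
        self.first_recognition = first_recognition    # module 12
        self.second_recognition = second_recognition  # module 13
        self.extraction = extraction                  # module 14
        self.generation = generation                  # module 15

    def handle(self, medical_record_request):
        voice, patient_info, doctor_id = self.receiving.parse(medical_record_request)
        dialogue_text = self.first_recognition.segment_and_transcribe(voice)
        concerned_text = self.second_recognition.recognize_key_symptoms(dialogue_text)
        chief, present, past = self.extraction.extract(concerned_text, patient_info)
        return self.generation.generate(doctor_id, chief, present, past)
```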

For specific limitations of the electronic medical record generation device based on artificial intelligence, reference may be made to the above limitations of the electronic medical record generation method based on artificial intelligence, which are not described herein again. All or part of the modules in the electronic medical record generation device based on artificial intelligence can be implemented by software, hardware, or a combination thereof. The modules can be embedded, in a hardware form, in or be independent of a processor in the computer device, or can be stored, in a software form, in a memory of the computer device, so that the processor can call and execute the operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a client or a server, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a readable storage medium and an internal memory. The readable storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the readable storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an artificial intelligence based electronic medical record generation method.

In one embodiment, a computer device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the artificial intelligence based electronic medical record generation method in the above embodiments is implemented.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the artificial intelligence based electronic medical record generation method in the above embodiments.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.
