ICD (International Classification of Diseases) code conversion method, device, computing equipment and storage medium

Document No.: 1905158 Publication date: 2021-11-30

Reading note: this technology, "ICD code conversion method, device, computing equipment and storage medium", was designed by 徐伟风 and 李易平 on 2021-08-19. Abstract: The invention discloses an ICD code conversion method, a device, a computing device and a storage medium, comprising: extracting and preprocessing the ICD pairs {name class, code class} to be converted; performing fine conversion on each preprocessed ICD pair, including: matching the ICD pair against the standard ICD pairs {standard name class, standard ICD code class}, and selecting the associated ICD pairs as the fine conversion result; performing coarse conversion on each unassociated ICD pair, including: calculating a similarity score between the name class and each standard name class and a distance score between the code class and each standard ICD code class, combining the similarity score and the distance score, and selecting the {standard name class, standard ICD code class} pair with the highest score as the coarse conversion result of the ICD pair. Coding efficiency and coding accuracy are improved, and the workload of coders is reduced.

1. An ICD code conversion method, characterized by comprising the following steps:

extracting and preprocessing the ICD pairs {name class, code class} to be converted;

performing fine conversion on each preprocessed ICD pair, including: matching the ICD pair against the standard ICD pairs {standard name class, standard ICD code class}, and selecting the associated ICD pairs as the fine conversion result;

performing coarse conversion on each unassociated ICD pair, including: calculating a similarity score between the name class and each standard name class and a distance score between the code class and each standard ICD code class, combining the similarity score and the distance score, and selecting the {standard name class, standard ICD code class} pair with the highest score as the coarse conversion result of the ICD pair.

2. The ICD code conversion method according to claim 1, wherein preprocessing the ICD pairs to be converted comprises: deleting spaces and special characters, unifying letter case, and unifying synonyms.

3. The ICD code conversion method according to claim 1, wherein in the fine conversion process, the standard ICD pairs are searched, based on the name class of the ICD pair to be converted, for a standard ICD pair whose standard name class is identical to that name class; if one is found, association matching is performed, that is, the standard code class of that standard ICD pair is taken as the new code class of the ICD pair to be converted, and the associated ICD pair {name class, new code class} is extracted as the fine conversion result.

4. The ICD code conversion method according to claim 1, wherein in the coarse conversion process, the cosine similarity between the name class and each standard name class is calculated as the similarity score.

5. The ICD code conversion method according to claim 1 or 4, wherein calculating the similarity score between the name class and each standard name class comprises:

segmenting the name class and each standard name class into words; merging the segmentation result of the name class with that of each standard name class to obtain a combined segmentation result; and encoding the name class and the standard name class according to whether each element of the combined segmentation result appears in their respective segmentation results, thereby obtaining the word segmentation vector of the name class and the word segmentation vector of the standard name class;

calculating the similarity score between the name class and each standard name class from the word segmentation vector of the name class and the word segmentation vector of the standard name class.

6. The ICD code conversion method according to claim 1, wherein in the coarse conversion process, the Jaro-Winkler distance between the code class and each standard ICD code class is calculated as the distance score.

7. The ICD code conversion method according to claim 1, wherein in the coarse conversion process, combining the similarity score and the distance score and selecting the {standard name class, standard ICD code class} pair with the highest score as the coarse conversion result of the ICD pair comprises:

when the similarity score between the name class and a standard name class is 1, scaling up the similarity score by a set similarity weight;

when the distance score between the code class and a standard ICD code class is greater than a set distance threshold, scaling up the distance score by a set distance weight;

computing a weighted sum of the scaled similarity score and the scaled distance score to obtain a composite score, and selecting the {standard name class, standard ICD code class} pair with the highest composite score as the coarse conversion result of the ICD pair.

8. An ICD code conversion device, comprising:

an acquisition and preprocessing module, configured to extract and preprocess the ICD pairs {name class, code class} to be converted;

a fine conversion module, configured to perform fine conversion on each preprocessed ICD pair, including: matching the ICD pair against the standard ICD pairs {standard name class, standard ICD code class}, and selecting the associated ICD pairs as the fine conversion result;

a coarse conversion module, configured to perform coarse conversion on each unassociated ICD pair, including: calculating a similarity score between the name class and each standard name class and a distance score between the code class and each standard ICD code class, combining the similarity score and the distance score, and selecting the {standard name class, standard ICD code class} pair with the highest score as the coarse conversion result of the ICD pair.

9. A computing device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the ICD code conversion method according to any one of claims 1 to 7 when executing the computer program.

10. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the ICD code conversion method according to any one of claims 1 to 7.

Technical Field

The invention belongs to the field of International Classification of Diseases (ICD) coding, and in particular relates to an ICD code conversion method, an ICD code conversion device, a computing device and a storage medium.

Background

International Classification of Diseases (ICD) codes are a unified coding scheme used by medical institutions such as hospitals, and different countries or regions are allowed to extend them according to local conditions. Over the past 20 years, nearly 20 code versions have been derived in various regions of China from the ICD-10 and ICD-9-CM-3 issued by WHO. The same medical institution has used different code versions in different periods, and may even use different versions at the same time in different systems (such as the electronic medical record and the medical record front page). In addition, many hospitals extend their own in-hospital ICD codes, which makes fine-grained statistical analysis harder and hinders big-data and machine-learning applications based on diagnosis codes or surgery codes.

With the reform of medical insurance payment, in particular the steady rollout of payment based on Diagnosis Related Groups (DRGs), the relevant authorities have strongly promoted the standardization of the various codes. In 2019 the ICD-10 medical insurance edition 1.0 and the ICD-9-CM-3 medical insurance edition 1.0 were issued for disease codes and surgery codes respectively, which makes it easier to convert from other editions, ensures data consistency, and lays a foundation for implementing DRGs and for fine-grained hospital management.

The traditional ICD code conversion method is item-by-item translation: a coder translates each code to be converted into the new standard code, one entry at a time. This ensures accuracy but is inefficient.

Disclosure of Invention

In view of the foregoing, an object of the present invention is to provide an ICD code conversion method, an ICD code conversion device, a computing device and a storage medium, which improve coding efficiency and coding accuracy and reduce the workload of coders.

In a first aspect, embodiments provide an ICD code conversion method, including the following steps:

extracting and preprocessing the ICD pairs {name class, code class} to be converted;

performing fine conversion on each preprocessed ICD pair, including: matching the ICD pair against the standard ICD pairs {standard name class, standard ICD code class}, and selecting the associated ICD pairs as the fine conversion result;

performing coarse conversion on each unassociated ICD pair, including: calculating a similarity score between the name class and each standard name class and a distance score between the code class and each standard ICD code class, combining the similarity score and the distance score, and selecting the {standard name class, standard ICD code class} pair with the highest score as the coarse conversion result of the ICD pair.

In one embodiment, preprocessing the ICD pairs to be converted includes: deleting spaces and special characters, unifying letter case, and unifying synonyms.

In one embodiment, in the fine conversion process, the standard ICD pairs are searched, based on the name class of the ICD pair to be converted, for a standard ICD pair whose standard name class is identical to that name class; if one is found, association matching is performed, that is, the standard code class of that standard ICD pair is taken as the new code class of the ICD pair to be converted, and the associated ICD pair {name class, new code class} is extracted as the fine conversion result.

In one embodiment, in the coarse conversion process, the cosine similarity between the name class and each standard name class is calculated as the similarity score.

In one embodiment, calculating the similarity score between the name class and each standard name class includes:

segmenting the name class and each standard name class into words; merging the segmentation result of the name class with that of each standard name class to obtain a combined segmentation result; and encoding the name class and the standard name class according to whether each element of the combined segmentation result appears in their respective segmentation results, thereby obtaining the word segmentation vector of the name class and the word segmentation vector of the standard name class;

calculating the similarity score between the name class and each standard name class from the word segmentation vector of the name class and the word segmentation vector of the standard name class.

In one embodiment, in the coarse conversion process, the Jaro-Winkler distance between the code class and each standard ICD code class is calculated as the distance score.

In one embodiment, in the coarse conversion process, combining the similarity score and the distance score and selecting the {standard name class, standard ICD code class} pair with the highest score as the coarse conversion result of the ICD pair includes:

when the similarity score between the name class and a standard name class is 1, scaling up the similarity score by a set similarity weight;

when the distance score between the code class and a standard ICD code class is greater than a set distance threshold, scaling up the distance score by a set distance weight;

computing a weighted sum of the scaled similarity score and the scaled distance score to obtain a composite score, and selecting the {standard name class, standard ICD code class} pair with the highest composite score as the coarse conversion result of the ICD pair.

In a second aspect, embodiments provide an ICD code conversion device, including:

an acquisition and preprocessing module, configured to extract and preprocess the ICD pairs {name class, code class} to be converted;

a fine conversion module, configured to perform fine conversion on each preprocessed ICD pair, including: matching the ICD pair against the standard ICD pairs {standard name class, standard ICD code class}, and selecting the associated ICD pairs as the fine conversion result;

a coarse conversion module, configured to perform coarse conversion on each unassociated ICD pair, including: calculating a similarity score between the name class and each standard name class and a distance score between the code class and each standard ICD code class, combining the similarity score and the distance score, and selecting the {standard name class, standard ICD code class} pair with the highest score as the coarse conversion result of the ICD pair.

In a third aspect, embodiments provide a computing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the ICD code conversion method according to the first aspect when executing the computer program.

In a fourth aspect, embodiments provide a computer storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the ICD code conversion method according to the first aspect.

The technical solutions provided by the embodiments have at least the following beneficial effects:

the ICD pairs to be converted undergo fine conversion and coarse conversion against the standard ICD pairs, so that ICD codes are converted quickly; coding efficiency is greatly improved while accuracy is maintained, and the workload of coders is greatly reduced.

Drawings

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 and FIG. 2 are flow charts of the ICD code conversion method provided by an embodiment;

FIG. 3 is a schematic structural diagram of the ICD code conversion device provided by an embodiment.

Detailed Description

To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to illustrate the invention and not to limit its scope.

Analysis of the different code versions shows that all of them are extensions of the ICD codes issued by WHO: diagnosis codes are extended from the ICD-10 subcategories (the first four characters of the code), and surgery codes are extended from the ICD-9-CM-3 subcategories (the first four characters of the code). Even codes maintained by hospitals themselves largely follow this principle. Based on this principle, the embodiments provide an ICD code conversion method and device.

FIG. 1 and FIG. 2 are flow charts of the ICD code conversion method provided by an embodiment. As shown in FIG. 1 and FIG. 2, the embodiment provides an ICD code conversion method, which includes the following steps:

Step 1, extracting and preprocessing the ICD pairs to be converted.

The medical record library contains all diagnosis names to be converted with their diagnosis codes, and all surgery names with their surgery codes. These are extracted and the extracted information is preprocessed to form ICD pairs {name class, code class}. Diagnosis names and surgery names are collectively called the name class; diagnosis codes and surgery codes are collectively called the code class. For a diagnosis the ICD pair is {diagnosis name, diagnosis code}; for a surgery it is {surgery name, surgery code}.

Preprocessing the extracted information comprises: deleting spaces and special characters, unifying letter case, and unifying synonyms. When unifying letter case, all letters may be converted to upper case or all to lower case. When unifying synonyms, for example, different written forms of the number one, such as the Roman numeral "I", are unified as the Arabic numeral "1".
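The preprocessing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `SYNONYMS` table and the function name `preprocess` are hypothetical, the choice of upper case is arbitrary, and keeping the dot that appears inside ICD codes is an assumption, since the source only says that special characters are deleted.

```python
import re

# Hypothetical synonym table; a real table would be curated by coders.
SYNONYMS = {"Ⅰ": "1", "Ⅱ": "2", "一": "1", "二": "2"}


def preprocess(text: str) -> str:
    """Normalize a name or code string before matching."""
    # Unify letter case (upper case is an arbitrary choice here).
    text = text.upper()
    # Unify synonyms, e.g. different written forms of the same number.
    for variant, canonical in SYNONYMS.items():
        text = text.replace(variant, canonical)
    # Delete whitespace and special characters. Assumption: letters, digits,
    # CJK characters and the dot used inside ICD codes are kept.
    return re.sub(r"[^0-9A-Z\u4e00-\u9fff.]", "", text)
```

Synonyms are replaced before special characters are stripped, so that variant forms outside the kept character ranges are mapped rather than silently deleted.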

Taking ICD pairs of the form {diagnosis name, diagnosis code} as an example, the preprocessed ICD pairs to be converted are arranged in the form of Table 1 for subsequent conversion.

TABLE 1

Step 2, performing fine conversion on the ICD pairs.

In an embodiment, fine conversion is performed on each preprocessed ICD pair. The specific process includes: matching the ICD pair against the standard ICD pairs {standard name class, standard ICD code class}, and selecting the associated ICD pairs as the fine conversion result.

Before the fine conversion, the standard ICD pairs must be preprocessed in the same way, that is, spaces and special characters are deleted, letter case is unified, and synonyms are unified, so that the standard ICD pairs are aligned with the ICD pairs to be converted. This facilitates both the fine conversion and the coarse conversion.

In the fine conversion process, the standard ICD pairs are searched, based on the name class of the ICD pair to be converted, for a standard ICD pair whose standard name class is identical to that name class. If one is found, association matching is performed, that is, the standard code class of that standard ICD pair is taken as the new code class of the ICD pair to be converted, and the associated ICD pair {name class, new code class} is extracted as the fine conversion result.
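The exact-name lookup of the fine conversion can be sketched as a dictionary lookup. The function name `fine_convert` and the tuple layout `(name, code)` are illustrative assumptions; the source does not prescribe a data structure. If the standard list contained the same name twice, this sketch would keep the last code, a case the source does not discuss.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (name class, code class)


def fine_convert(pairs: List[Pair],
                 standard: List[Pair]) -> Tuple[List[Pair], List[Pair]]:
    """Split pairs into exact-name matches and unassociated leftovers."""
    # Index the standard pairs by standard name for direct lookup.
    name_to_code: Dict[str, str] = {name: code for name, code in standard}
    associated: List[Pair] = []
    unassociated: List[Pair] = []
    for name, code in pairs:
        if name in name_to_code:
            # Keep the name and adopt the standard code as the new code.
            associated.append((name, name_to_code[name]))
        else:
            # Left for the coarse conversion step.
            unassociated.append((name, code))
    return associated, unassociated
```

Both inputs are assumed to have been preprocessed in the same way, since the lookup only succeeds on identical strings.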

Table 2 shows the result of the fine conversion. As shown in Table 2, the entries numbered 1 and M can be handled by fine conversion: the diagnosis name to be converted is associated with a standard diagnosis name, which determines the associated ICD pair, so the final code for diagnosis code 1 is diagnosis code 1' and the final code for diagnosis code M is diagnosis code M'. The ICD pair numbered 2 is not associated with any standard ICD pair and must be converted by the coarse conversion process of step 3.

TABLE 2

Step 3, performing coarse conversion on the ICD pairs.

In an embodiment, coarse conversion is performed on each unassociated ICD pair. The specific process includes: calculating a similarity score between the name class and each standard name class and a distance score between the code class and each standard ICD code class, combining the similarity score and the distance score, and selecting the {standard name class, standard ICD code class} pair with the highest score as the coarse conversion result of the ICD pair.

In an embodiment, the cosine similarity between the name class and each standard name class is calculated as the similarity score, and the Jaro-Winkler distance between the code class and each standard ICD code class is calculated as the distance score. The cosine similarity score and the Jaro-Winkler distance score are then combined, and the {standard name class, standard ICD code class} pair with the highest score is selected as the coarse conversion result of the ICD pair.

In the embodiment, before the similarity is calculated, the name class and the standard name class must be segmented into words and vectorized; the similarity is then calculated from the word segmentation vector of the name class and the word segmentation vector of the standard name class.

The word segmentation vectorization of the name class and the standard name class is explained taking name class $A$ and the $i$-th standard name class $B_i$ as an example. First, a word segmentation tool, such as the open-source tool jieba, is used to segment $A$ and $B_i$ separately, giving the segmentation results $A = \{a_1, a_2, \dots, a_n\}$ and $B_i = \{b_{i1}, b_{i2}, \dots, b_{im}\}$. The two results are merged into the combined segmentation result $C_i = \{c_{i1}, c_{i2}, \dots, c_{ik}\}$, where $k = m + n$. Ignoring the relations and the order among the segmented words, each element of $C_i$ is encoded as 1 if it appears in $A$ and as 0 otherwise, which gives the word segmentation vector of name class $A$. Similarly, each element of $C_i$ is encoded as 1 if it appears in $B_i$ and as 0 otherwise, which gives the word segmentation vector of standard name class $B_i$. The similarity score between name class $A$ and each standard name class $B_i$ is then calculated from the two word segmentation vectors.
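The vectorization and cosine similarity described above can be sketched as follows. To stay self-contained, the hypothetical `tokenize` helper splits a name into single characters; the source uses a word segmentation tool such as jieba instead. The combined result is built by concatenation so that $k = m + n$ as stated; a common variant removes duplicate tokens first, which yields different values for partially overlapping names.

```python
import math
from typing import List, Tuple


def tokenize(name: str) -> List[str]:
    # Placeholder segmentation (assumption): one token per character.
    # The source segments with a tool such as jieba.
    return list(name)


def presence_vectors(a_tokens: List[str],
                     b_tokens: List[str]) -> Tuple[List[int], List[int]]:
    # Combined segmentation result: tokens of A followed by tokens of B.
    combined = a_tokens + b_tokens
    a_set, b_set = set(a_tokens), set(b_tokens)
    # 1 if the combined element appears in the name, 0 otherwise.
    vec_a = [1 if tok in a_set else 0 for tok in combined]
    vec_b = [1 if tok in b_set else 0 for tok in combined]
    return vec_a, vec_b


def cosine_similarity(u: List[int], v: List[int]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u) * sum(y * y for y in v))
    return dot / norm if norm else 0.0


def name_similarity(a: str, b: str) -> float:
    """Similarity score between a name and one standard name."""
    vec_a, vec_b = presence_vectors(tokenize(a), tokenize(b))
    return cosine_similarity(vec_a, vec_b)
```

Identical names score 1 and names with no token in common score 0, which matches the role the score plays in the later weighting step.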

Because the code class and the standard code class are already expressed directly in coded form, no further encoding is needed, and the Jaro-Winkler distance between the code class and the standard code class is calculated directly. The Jaro-Winkler distance gives extra weight to a shared prefix.

Taking code class $E$ and the $j$-th standard code class $F_j$ as an example, where code class $E$ belongs to the same ICD pair as name class $A$, the Jaro-Winkler distance between them is calculated as:

$$Y_{score}(E, F_j) = sim_{jaro}(E, F_j) + l\,p\,\bigl(1 - sim_{jaro}(E, F_j)\bigr)$$

where $sim_{jaro}(E, F_j)$ is the Jaro similarity, whose standard definition is:

$$sim_{jaro}(E, F_j) = \begin{cases} 0, & m = 0 \\ \dfrac{1}{3}\left(\dfrac{m}{|s_E|} + \dfrac{m}{|s_{F_j}|} + \dfrac{m - t}{m}\right), & m \neq 0 \end{cases}$$

Here $l$ is the length of the common prefix of the string of code class $E$ and the string of standard code class $F_j$, and is at most 4. $p$ is a constant scaling factor that represents the contribution of the common prefix to the similarity: a larger $p$ gives the common prefix more weight, and $p$ is at most 0.25; in this embodiment $p = 0.1$. $|s_E|$ and $|s_{F_j}|$ are the lengths of the strings of code class $E$ and standard code class $F_j$. $m$ is the number of matching characters in the two strings, and $t$ is half the number of transpositions, that is, half the number of matching characters that appear in a different order in the two strings. For example, the strings "bcade" and "abed" share the 4 characters a, b, d and e in a different order, so $t$ is 2.
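A minimal sketch of the Jaro-Winkler score follows. It assumes the standard Jaro matching window, in which two equal characters count as matching only if their positions differ by at most half the longer length minus one; the source does not spell out this window. The function names are illustrative.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity between two strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Assumption: standard matching window of the Jaro similarity.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    m = 0
    for i, ch in enumerate(s1):
        lo, hi = max(0, i - window), min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    # Count matched characters that are out of order; t is half of that.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (m / len1 + m / len2 + (m - t) / m) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the common prefix (length capped at 4)."""
    sim = jaro(s1, s2)
    l = 0
    for a, b in zip(s1, s2):
        if a != b or l == max_prefix:
            break
        l += 1
    return sim + l * p * (1 - sim)
```

With $p = 0.1$, two codes that differ only in their last character, such as the illustrative strings "E11.900" and "E11.901", score about 0.943, while codes that share no characters score 0.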

In an embodiment, combining the similarity score and the distance score and selecting the {standard name class, standard ICD code class} pair with the highest score as the coarse conversion result of the ICD pair includes:

When the similarity score $X_{score}$ between the name class and a standard name class is 1, the similarity score is scaled up by a set similarity weight $\alpha$, that is, $X_{score} = \alpha \cdot X_{score}$. The value of $\alpha$ is greater than 2, preferably 2 to 20, and more preferably 10.

When the distance score $Y_{score}$ between the code class and a standard ICD code class is greater than a set distance threshold $\epsilon$, the distance score is scaled up by a set distance weight $\beta$, that is, $Y_{score} = \beta \cdot Y_{score}$. The value of $\epsilon$ may be 0.95. The value of $\beta$ is greater than 2, preferably 2 to 20, and more preferably 10.

The scaled similarity score and the scaled distance score are combined in a weighted sum to obtain the composite score $F_{score} = \delta \cdot X_{score} + \gamma \cdot Y_{score}$, where $\delta$ and $\gamma$ are weights, for example $\delta = 1$ and $\gamma = 0.8$. The {standard name class $B_i$, standard ICD code class $F_j$} pair with the highest composite score is then selected as the coarse conversion result of the ICD pair.
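The weighting and selection can be sketched as follows, using the example values from the text as defaults ($\alpha = \beta = 10$, $\epsilon = 0.95$, $\delta = 1$, $\gamma = 0.8$). The function names and the candidate layout are illustrative assumptions. The check for a similarity score of 1 uses `math.isclose` to tolerate floating-point rounding, which is an implementation choice not stated in the source.

```python
import math
from typing import List, Tuple

# (standard name, standard code, similarity score, distance score)
Candidate = Tuple[str, str, float, float]


def composite_score(x: float, y: float, alpha: float = 10.0, beta: float = 10.0,
                    epsilon: float = 0.95, delta: float = 1.0,
                    gamma: float = 0.8) -> float:
    """Weighted sum of the scaled similarity and distance scores."""
    if math.isclose(x, 1.0):
        x = alpha * x  # identical name: scale up the similarity score
    if y > epsilon:
        y = beta * y   # near-identical code: scale up the distance score
    return delta * x + gamma * y


def best_candidate(candidates: List[Candidate]) -> Tuple[str, str]:
    """Return the standard pair with the highest composite score."""
    best = max(candidates, key=lambda c: composite_score(c[2], c[3]))
    return best[0], best[1]
```

With these defaults an exact name match (score 1, scaled to 10) outweighs any candidate whose name only partially matches, even if that candidate's code is nearly identical.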

In the embodiment, the fine conversion result and the coarse conversion result together form the ICD code conversion result, which achieves the standard code conversion of the name classes to be standardized in the medical record library. The conversion combines exact matching and fuzzy matching, applying different strategies to the ICD codes to be converted, which improves conversion efficiency. In the fuzzy matching, the ICD codes are converted by combining the similarity score of the name class with the distance score of the code class, which improves conversion accuracy.

FIG. 3 is a schematic structural diagram of the ICD code conversion device provided by an embodiment. As shown in FIG. 3, an embodiment provides an ICD code conversion device 300, including:

an acquisition and preprocessing module 310, configured to extract and preprocess the ICD pairs {name class, code class} to be converted;

a fine conversion module 320, configured to perform fine conversion on each preprocessed ICD pair, including: matching the ICD pair against the standard ICD pairs {standard name class, standard ICD code class}, and selecting the associated ICD pairs as the fine conversion result;

a coarse conversion module 330, configured to perform coarse conversion on each unassociated ICD pair, including: calculating a similarity score between the name class and each standard name class and a distance score between the code class and each standard ICD code class, combining the similarity score and the distance score, and selecting the {standard name class, standard ICD code class} pair with the highest score as the coarse conversion result of the ICD pair.

It should be noted that the division into functional modules described above is only an example of how the ICD code conversion device of the above embodiment performs ICD code conversion. In practice, the functions may be allocated to different functional modules as needed, that is, the internal structure of the terminal or server may be divided into different functional modules to perform all or part of the functions described above. In addition, the ICD code conversion device of the above embodiment and the ICD code conversion method embodiment belong to the same concept; the specific implementation of the device is described in the method embodiment and is not repeated here.

Embodiments also provide a computing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the ICD code conversion method described above when executing the computer program, the method including the following steps:

step 1, extracting and preprocessing the ICD pairs to be converted;

step 2, performing fine conversion on the ICD pairs;

step 3, performing coarse conversion on the ICD pairs.

In practical applications, the memory may be local volatile memory such as RAM, non-volatile memory such as ROM, FLASH, a floppy disk or a mechanical hard disk, or remote cloud storage. The processor may be a Central Processing Unit (CPU), a microprocessor unit (MPU), a Digital Signal Processor (DSP), or a Field Programmable Gate Array (FPGA); that is, the ICD code conversion steps may be implemented by any of these processors.

Embodiments also provide a computer storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the ICD code conversion method, including the following steps:

step 1, extracting and preprocessing the ICD pairs to be converted;

step 2, performing fine conversion on the ICD pairs;

step 3, performing coarse conversion on the ICD pairs.

In embodiments, the computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

The above embodiments illustrate the technical solutions and advantages of the present invention. It should be understood that they are only the most preferred embodiments of the present invention and are not intended to limit it; any modifications, additions, equivalents and the like made within the scope of the principles of the present invention shall fall within the scope of protection of the present invention.
