Data processing method and device, electronic equipment and storage medium

Document No.: 1922139 Publication date: 2021-12-03

Reading note: This technology, "Data processing method and device, electronic equipment and storage medium", was created by 谭传奇, 陈漠沙, 仇伟 and 黄非 on 2020-05-29. Main content: the embodiments of the present disclosure disclose a data processing method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring an object to be processed; retrieving candidate objects matching the object to be processed from an object set comprising standard objects; and determining, by using a recognition model, a target object corresponding to the object to be processed from the candidate objects. By first retrieving and then reranking the candidate objects with the recognition model, this technical solution improves the accuracy of identifying the standard object for the object to be processed.

1. A data processing method, comprising:

acquiring an object to be processed;

retrieving candidate objects matched with the object to be processed from an object set comprising standard objects;

and determining a target object corresponding to the object to be processed from the candidate object by using a recognition model.

2. The method of claim 1, wherein retrieving candidate objects matching the object to be processed from a set of objects comprising standard objects comprises:

acquiring a first object set; the first object set comprises a first standard object;

and retrieving the first object set by using the object to be processed to obtain a first candidate object matched with the object to be processed.

3. The method of claim 1 or 2, wherein retrieving candidate objects matching the object to be processed from a set of objects comprising standard objects comprises:

acquiring a second object set; the second object set comprises an original object and a second standard object corresponding to the original object;

and retrieving the second object set by using the object to be processed, obtaining an original object matched with the object to be processed, and determining a second standard object corresponding to the original object as a second candidate object.

4. The method according to claim 1 or 2, wherein determining a target object corresponding to the object to be processed from the candidate objects by using a recognition model comprises:

processing the object to be processed and the current candidate object by using a feature representation model in the identification model to obtain the correlation features of the object to be processed and the current candidate object;

and determining the target object by using the correlation characteristics.

5. The method of claim 4, wherein determining the target object using the correlation features comprises:

performing dimension reduction processing on the correlation characteristics by using a multilayer perceptron in the recognition model;

processing the correlation characteristics after dimensionality reduction by utilizing a normalization model in the identification model to obtain the correlation between the candidate object and the object to be processed;

and determining the target object according to the correlation.

6. The method of claim 2, wherein retrieving the first set of objects using the object to be processed, obtaining a first candidate object matching the object to be processed, comprises:

calculating the similarity between the object to be processed and the first standard object in the first object set;

and determining the first candidate object according to the similarity.

7. The method of claim 3, wherein retrieving the second set of objects using the object to be processed, obtaining an original object matching the object to be processed, comprises:

calculating the similarity between the object to be processed and the original object;

and determining the original object matched with the object to be processed according to the similarity.

8. A data processing apparatus, comprising:

an acquisition module configured to acquire an object to be processed;

a retrieval module configured to retrieve candidate objects matched with the object to be processed from an object set comprising standard objects;

a determination module configured to determine a target object corresponding to the object to be processed from the candidate object by using a recognition model.

9. An electronic device, comprising a memory and a processor; wherein

the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method of any one of claims 1-7.

10. A computer readable storage medium having computer instructions stored thereon, wherein the computer instructions, when executed by a processor, implement the method of any of claims 1-7.

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a data processing method and apparatus, an electronic device, and a storage medium.

Background

In the big data age, data standardization is an indispensable task in every field. The problem that data standardization solves is finding the corresponding standard representation for the various different representations of the same object. Taking clinical terms in the medical field as an example, the same diagnosis, operation, medicine, examination, assay, or symptom is often written in hundreds of different ways; if these clinical terms are not standardized, it is difficult to perform subsequent statistical analysis on information such as a patient's medical history. When standardizing clinical terms, the clinical names in case documents are generally matched against the standardized clinical terms in a standard knowledge base through semantic similarity. However, because different doctors express the same term in case documents in very diverse ways, a single matching model rarely achieves good results. Therefore, how to implement a more effective data standardization process is one of the technical problems to be solved by those skilled in the relevant field.

Disclosure of Invention

The embodiment of the disclosure provides a data processing method and device, electronic equipment and a computer-readable storage medium.

In a first aspect, an embodiment of the present disclosure provides a data processing method, where the method includes:

acquiring an object to be processed;

retrieving candidate objects matched with the object to be processed from an object set comprising standard objects;

and determining a target object corresponding to the object to be processed from the candidate object by using a recognition model.

Further, retrieving candidate objects matching the object to be processed from an object set including standard objects, including:

acquiring a first object set; the first object set comprises a first standard object;

and retrieving the first object set by using the object to be processed to obtain a first candidate object matched with the object to be processed.

Further, retrieving candidate objects matching the object to be processed from an object set including standard objects, including:

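acquiring a second object set; the second object set comprises an original object and a second standard object corresponding to the original object;

and retrieving the second object set by using the object to be processed, obtaining an original object matched with the object to be processed, and determining a second standard object corresponding to the original object as a second candidate object.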

Further, determining a target object corresponding to the object to be processed from the candidate object by using a recognition model, including:

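processing the object to be processed and the current candidate object by using a feature representation model in the recognition model to obtain the correlation features of the object to be processed and the current candidate object;

and determining the target object by using the correlation features.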

Further, determining the target object using the correlation features includes:

performing dimension reduction processing on the correlation characteristics by using a multilayer perceptron in the recognition model;

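processing the dimension-reduced correlation features by using a normalization model in the recognition model to obtain the correlation between the candidate object and the object to be processed;

and determining the target object according to the correlation.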

Further, retrieving the first object set by using the object to be processed to obtain a first candidate object matching the object to be processed, including:

calculating the similarity between the object to be processed and the first standard object in the first object set;

and determining the first candidate object according to the similarity.

Further, retrieving the second object set by using the object to be processed to obtain an original object matched with the object to be processed, including:

calculating the similarity between the object to be processed and the original object;

and determining the original object matched with the object to be processed according to the similarity.

In a second aspect, an embodiment of the present disclosure provides a data processing apparatus, including:

an acquisition module configured to acquire an object to be processed;

a retrieval module configured to retrieve candidate objects matched with the object to be processed from an object set comprising standard objects;

a determination module configured to determine a target object corresponding to the object to be processed from the candidate object by using a recognition model.

These functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.

In one possible design, the apparatus includes a memory configured to store one or more computer instructions that enable the apparatus to perform the corresponding method, and a processor configured to execute the computer instructions stored in the memory. The apparatus may also include a communication interface for the apparatus to communicate with other devices or a communication network.

In a third aspect, an embodiment of the present disclosure provides an electronic device, including a memory and a processor; wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement the method of any of the above aspects.

In a fourth aspect, the disclosed embodiments provide a computer-readable storage medium for storing computer instructions for use by any of the above-mentioned apparatuses, including computer instructions for performing the method according to any of the above-mentioned aspects.

The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:

In the embodiments of the present disclosure, a plurality of candidate objects that match the object to be processed (that is, have a high correlation with it) are first retrieved from the object set; for example, the similarity between the object to be processed and each standard object may be calculated, and the standard objects with the highest similarity determined as the candidate objects. The recognition model is then used to identify, from the plurality of candidate objects, the target object corresponding to the object to be processed. By retrieving first and then reranking the candidate objects with the recognition model, the accuracy of identifying the standard object for the object to be processed is improved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

Other features, objects, and advantages of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments when taken in conjunction with the accompanying drawings. In the drawings:

FIG. 1 shows a flow diagram of a data processing method according to an embodiment of the present disclosure;

FIG. 2 illustrates a schematic diagram of an application in a clinical terminology normalization scenario, according to an embodiment of the present disclosure;

FIG. 3 illustrates a structural schematic of a recognition model according to an embodiment of the present disclosure;

FIG. 4 shows a schematic structural diagram of an electronic device suitable for implementing a data processing method according to an embodiment of the present disclosure.

Detailed Description

Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily implement them. Also, for the sake of clarity, parts not relevant to the description of the exemplary embodiments are omitted in the drawings.

In the present disclosure, it is to be understood that terms such as "including" or "having," etc., are intended to indicate the presence of the disclosed features, numbers, steps, behaviors, components, parts, or combinations thereof, and are not intended to preclude the possibility that one or more other features, numbers, steps, behaviors, components, parts, or combinations thereof may be present or added.

It should be further noted that the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

The details of the embodiments of the present disclosure are described in detail below with reference to specific embodiments.

Fig. 1 shows a flow diagram of a data processing method according to an embodiment of the present disclosure. As shown in fig. 1, the data processing method includes the steps of:

in step S101, an object to be processed is acquired;

in step S102, candidate objects matching the object to be processed are retrieved from an object set including a standard object;

in step S103, a target object corresponding to the object to be processed is determined from the candidate object by using a recognition model.

In this embodiment, the object to be processed may be an object to be standardized, for example a term name in a relevant field, a text (such as a word or a sentence), a table, or an image. The standard may be one formulated in the relevant field, and a standard object may be a standardized object recognized in that field. Taking clinical terms in the medical field as an example, the object to be processed may be a clinical name written by a doctor in a clinical medical record document, and the standard object may be the standardized form of that clinical name in a recognized clinical knowledge base, such as a surgical name in the ICD9-2017 Peking Union (Xiehe) clinical edition.

In general, a relevant field may formulate a plurality of standard objects that together form a standard object set. The target object corresponding to an object to be processed may be obtained by matching the object to be processed against this set. The matching may compare a representation feature of the object to be processed (for example, word-level features for an object of the term-name class) with the representation features of the standard objects, for example by calculating the similarity between the object to be processed and each standard object. However, when the representation feature of the object to be processed differs greatly from that of its corresponding target object, this approach cannot accurately identify the target object. Therefore, in the data processing method provided by the embodiments of the present disclosure, a plurality of candidate objects matching the object to be processed are first determined, by retrieval, from an object set including a plurality of standard objects; the candidate objects may be a preset number of standard objects in the object set that have a higher correlation with the object to be processed. The recognition model is then used to determine, from the plurality of candidate objects, the target object corresponding to the object to be processed, for example the candidate that best matches the object to be processed.

The recognition model may be obtained by pre-training. The training data may include sample objects and the standard object corresponding to each sample object, and these standard objects may be included in the object set. The standard object corresponding to a sample object may be obtained through manual annotation. Using the annotated correspondence between the sample objects and their standard objects, a recognition model is trained that can identify, from a plurality of candidate objects, the target object corresponding to the object to be processed.
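One way such annotated correspondences could be turned into training examples is sketched below. The pairing scheme (label 1 for the annotated standard object, label 0 for every other retrieved candidate), the function name `build_training_pairs`, and the toy strings are all illustrative assumptions; the text above only states that annotated sample/standard pairs are used for training.

```python
from typing import Dict, List, Tuple


def build_training_pairs(
    annotations: Dict[str, str],
    candidates: Dict[str, List[str]],
) -> List[Tuple[str, str, int]]:
    """Build (sample, candidate, label) triples from annotated data.

    Assumption: the annotated standard object is the positive example
    (label 1) and every other retrieved candidate is a negative (label 0).
    """
    pairs: List[Tuple[str, str, int]] = []
    for sample, gold in annotations.items():
        # Keep the annotated standard object exactly once, followed by
        # the remaining candidates in their retrieved order.
        pool = [gold] + [c for c in candidates.get(sample, []) if c != gold]
        for cand in pool:
            pairs.append((sample, cand, 1 if cand == gold else 0))
    return pairs


# Made-up toy data, for illustration only.
annotations = {"lap chole": "laparoscopic cholecystectomy"}
candidates = {"lap chole": ["cholecystectomy", "laparoscopic cholecystectomy"]}
pairs = build_training_pairs(annotations, candidates)
```

Each triple can then be fed to a binary classifier of the kind described later in this document.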

In the embodiments of the present disclosure, a plurality of candidate objects that match the object to be processed (that is, have a high correlation with it) are first retrieved from the object set; for example, the similarity between the object to be processed and each standard object may be calculated, and the standard objects with the highest similarity determined as the candidate objects. The recognition model is then used to identify, from the plurality of candidate objects, the target object corresponding to the object to be processed. By retrieving first and then reranking the candidate objects with the recognition model, the accuracy of identifying the standard object for the object to be processed is improved.
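The retrieve-then-rerank flow of steps S101 to S103 can be sketched as follows. This is a minimal illustration under stated assumptions: `standardize`, `toy_retrieve`, `toy_score`, and the `STANDARD` list are hypothetical names, and the token-overlap functions merely stand in for the retrieval tool and the trained recognition model.

```python
from typing import Callable, List, Sequence

# Made-up standard object set, for illustration only.
STANDARD: List[str] = [
    "appendectomy",
    "laparoscopic appendectomy",
    "cholecystectomy",
]


def standardize(
    query: str,
    retrieve: Callable[[str], Sequence[str]],
    score: Callable[[str, str], float],
) -> str:
    """Retrieve candidates (S102), then pick the best-scored one (S103)."""
    candidates = list(retrieve(query))
    if not candidates:
        raise ValueError("retrieval returned no candidates")
    return max(candidates, key=lambda cand: score(query, cand))


def toy_retrieve(query: str) -> List[str]:
    """Stand-in retriever: keep standard objects sharing a token with the query."""
    q = set(query.split())
    return [s for s in STANDARD if q & set(s.split())]


def toy_score(query: str, cand: str) -> float:
    """Stand-in for the recognition model: token-level Jaccard overlap."""
    q, c = set(query.split()), set(cand.split())
    return len(q & c) / len(q | c)


result = standardize("laparoscopic appendectomy procedure", toy_retrieve, toy_score)
```

Passing the retriever and scorer as functions keeps the two stages independent, so either one can be replaced without touching the other.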

In an optional implementation manner of this embodiment, in step S102, the step of retrieving, from an object set including a standard object, a candidate object matching the object to be processed further includes the following steps:

acquiring a first object set; the first object set comprises a plurality of first standard objects;

and retrieving the first object set by using the object to be processed to obtain a first candidate object matched with the object to be processed.

In this optional implementation, the first object set may be a standard object set formulated in the relevant field; first standard objects having a higher correlation with the object to be processed are retrieved from the first object set and determined as first candidate objects. In some embodiments, the correlation between the object to be processed and a first standard object in the first object set may be determined by calculating the similarity between the two. In some embodiments, the first standard objects in the first object set may be ranked by similarity, and the top-ranked first standard objects determined as first candidate objects; in other embodiments, the first standard objects whose similarity is higher than a preset value may be determined as first candidate objects.
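The two selection strategies mentioned above (top-ranked objects versus objects above a preset value) can be sketched as follows. The function names and the similarity values are illustrative assumptions.

```python
from typing import Dict, List


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k standard objects with the highest similarity."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:k]


def above_threshold(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return every standard object whose similarity exceeds the preset value."""
    return [name for name, s in scores.items() if s > threshold]


# Made-up similarity values between one query and four standard objects.
scores = {"A": 0.91, "B": 0.40, "C": 0.75, "D": 0.10}
by_rank = top_k(scores, 2)
by_value = above_threshold(scores, 0.5)
```

Top-k always yields a fixed number of candidates for the recognition model, while a threshold yields a variable number that may be zero for queries far from every standard object.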

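In an optional implementation manner of this embodiment, in step S102, the step of retrieving, from an object set including a standard object, a candidate object matching the object to be processed further includes the following steps:

acquiring a second object set; the second object set comprises an original object and a second standard object corresponding to the original object;

and retrieving the second object set by using the object to be processed, obtaining an original object matched with the object to be processed, and determining a second standard object corresponding to the original object as a second candidate object.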

In this optional implementation, the original object may be an object that has not been standardized, and the second standard object may be the object obtained by standardizing the original object. The original objects in the second object set and their corresponding second standard objects may be the sample objects and corresponding standard objects in part or all of the training data used to train the recognition model. It is understood that they may also be different from the sample objects and standard objects in the training data. The original objects and their corresponding second standard objects may be historical annotation data collected in various ways, in which the standardized correspondence between each original object and its second standard object is annotated according to a standard formulated in the relevant field.

After the second object set is obtained, the original object matching the object to be processed can be determined by retrieving the second object set, and the second standard object corresponding to the matched original object is then determined as a second candidate object. The original object matching the object to be processed may be one or more original objects having a higher correlation with the object to be processed, and this correlation may be determined by similarity. In this embodiment, because neither the object to be processed nor the original object has been standardized, the two may be closer in representation, and the second standard object corresponding to the original object is a known standardized object. Matching the object to be processed against the original objects, and then taking the second standard object of the matched original object as the second candidate object, is therefore more accurate than matching the object to be processed directly against the second standard objects in the second object set.
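The indirect lookup described above (match against original objects, then map to their standard objects) might look like the sketch below. The names `second_candidates` and `token_jaccard`, the token-overlap similarity, and the annotated pairs are illustrative assumptions, not the similarity measure used by the embodiments.

```python
from typing import Callable, Dict, List


def token_jaccard(a: str, b: str) -> float:
    """Stand-in similarity: Jaccard overlap of whitespace tokens."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def second_candidates(
    query: str,
    annotated: Dict[str, str],
    similarity: Callable[[str, str], float],
    k: int,
) -> List[str]:
    """Match the query against original objects, then return the standard
    objects annotated for the k best-matching originals (without repeats)."""
    ranked = sorted(annotated, key=lambda orig: similarity(query, orig), reverse=True)
    result: List[str] = []
    for orig in ranked[:k]:
        std = annotated[orig]
        if std not in result:
            result.append(std)
    return result


# Made-up annotation data: original object -> second standard object.
annotated = {
    "lap appy": "laparoscopic appendectomy",
    "lap chole": "laparoscopic cholecystectomy",
    "open appy": "appendectomy",
}
found = second_candidates("lap appy done", annotated, token_jaccard, k=1)
```

Here the query shares no token with the standard name "laparoscopic appendectomy", yet the annotated abbreviation "lap appy" still leads to it.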

In some embodiments, the first object set may include some or all of the standardized objects specified in the relevant field; the second standard object in the second object set may be the standardized object in the first object set that corresponds to the original object. Taking clinical terms in the medical field as an example, the first object set may include some or all of the standard surgical names specified in the ICD9-2017 Peking Union (Xiehe) clinical edition.

It is noted that, in some embodiments, the candidate objects of the object to be processed may be obtained by combining the first object set and the second object set. That is, the object to be processed may be matched against the first standard objects in the first object set to obtain first candidate objects, and also matched against the original objects in the second object set, with the matched original objects then used to obtain second candidate objects. Both the first candidate objects and the second candidate objects may be used as candidate objects of the object to be processed and input to the recognition model for reranking. In this way, candidate objects with wide coverage can be obtained. This addresses the case where searching the first object set alone yields no suitable candidate because the representation features of the object to be processed differ greatly from those of the standard objects, as well as the case where searching the second object set alone yields no candidate because the data in the second object set is incomplete.
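Combining the two candidate lists before reranking can be sketched as below. Removing duplicates while keeping first-seen order is an assumption made for the example; the text only says both lists are passed to the recognition model. `merge_candidates` is a hypothetical helper name.

```python
from typing import Iterable, List


def merge_candidates(first: Iterable[str], second: Iterable[str]) -> List[str]:
    """Union of first and second candidates, in first-seen order.

    Assumption: a standard object found through both object sets only
    needs to be scored once by the recognition model.
    """
    seen = set()
    merged: List[str] = []
    for cand in list(first) + list(second):
        if cand not in seen:
            seen.add(cand)
            merged.append(cand)
    return merged


merged = merge_candidates(
    ["appendectomy", "laparoscopic appendectomy"],
    ["laparoscopic appendectomy", "cholecystectomy"],
)
```

If one retrieval path returns nothing, the merged list simply falls back to the other path, which is the coverage benefit described above.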

In an optional implementation manner of this embodiment, in step S103, the step of determining, by using a recognition model, a target object corresponding to the object to be processed from the candidate object further includes the following steps:

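processing the object to be processed and the current candidate object by using a feature representation model in the recognition model to obtain the correlation features of the object to be processed and the current candidate object;

and determining the target object by using the correlation features.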

In this optional implementation, the recognition model may include a feature representation model, which processes the object to be processed and the candidate object to obtain the correlation features corresponding to the two. The correlation features characterize the correlation between the object to be processed and the candidate object, so this correlation can be determined from them. For example, the candidate object with the highest correlation may be determined as the target object corresponding to the object to be processed.

In some embodiments, for an object of the term-name class, a text-class object, and the like, a language model may be used to extract semantic correlation features between the object to be processed and the candidate object, and the correlation between the two may then be determined from these semantic correlation features. The language model may be a known model such as BERT, ESIM (Enhanced Sequential Inference Model), BiMPM (Bilateral Multi-Perspective Matching), or MwAN (Multiway Attention Networks).

In an optional implementation manner of this embodiment, the step of determining the target object by using the correlation feature further includes the following steps:

performing dimension reduction processing on the correlation characteristics by using a multilayer perceptron in the recognition model;

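processing the dimension-reduced correlation features by using a normalization model in the recognition model to obtain the correlation between the candidate object and the object to be processed;

and determining the target object according to the correlation.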

In this optional implementation, the correlation feature output by the feature representation model is usually a vector with several hundred dimensions. So that the normalization model can normalize it, the multilayer perceptron may first reduce its dimension, for example by mapping it to a 2-dimensional vector. The reduced feature is then input to the normalization model, such as a softmax function, to obtain a correlation score between the object to be processed and the candidate object. A higher score indicates a higher correlation between the two, and hence that the candidate object is more likely to be the target object corresponding to the object to be processed. For example, the candidate object with the highest correlation score may be determined as the target object of the object to be processed.
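The dimension reduction and normalization can be illustrated with a small pure-Python sketch. Several things here are assumptions made for the example: a single linear layer stands in for the multilayer perceptron, index 1 of the 2-dimensional output is treated as the "match" class, and the 3-dimensional features and weights are toy values rather than outputs of a trained model.

```python
import math
from typing import Dict, List, Sequence


def linear(weights: Sequence[Sequence[float]], bias: Sequence[float],
           x: Sequence[float]) -> List[float]:
    """Map a feature vector to len(weights) outputs (here: 2)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def relevance_score(feature: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    bias: Sequence[float]) -> float:
    """Reduce to 2 dimensions, normalize, and read the assumed 'match' class."""
    return softmax(linear(weights, bias, feature))[1]


def pick_target(features: Dict[str, Sequence[float]],
                weights: Sequence[Sequence[float]],
                bias: Sequence[float]) -> str:
    """Return the candidate with the highest relevance score."""
    return max(features, key=lambda c: relevance_score(features[c], weights, bias))


# Toy features and hand-picked weights, for illustration only.
W = [[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
b = [0.0, 0.0]
features = {"candidate A": [0.2, 0.5, 0.1], "candidate B": [1.5, 0.3, 0.9]}
best = pick_target(features, W, b)
```

Because softmax over two outputs always yields a value between 0 and 1, the scores of different candidates are directly comparable.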

In an optional implementation manner of this embodiment, the step of retrieving, by using the object to be processed, the first object set to obtain a first candidate object matching the object to be processed further includes the following steps:

calculating the similarity between the object to be processed and the first standard object in the first object set;

and determining the first candidate object according to the similarity.

In this optional implementation manner, in the retrieval process, it may be determined whether the object to be processed matches the first standard object by calculating the similarity. After the similarity between the object to be processed and each first standard object is obtained through calculation, a preset number of first standard objects with the highest similarity may be determined as first candidate objects, or the first standard objects with the similarity higher than a preset value may be determined as first candidate objects.

Taking an object of the term-name class as an example, the similarity between the object to be processed and each standard object in the standard object set can be calculated by using TF-IDF features. For example, with a vector space model, the similarity can be expressed as the cosine of the angle between the two TF-IDF feature vectors:

$$\mathrm{sim}(q, d) = \frac{v(q) \cdot v(d)}{\lVert v(q) \rVert \, \lVert v(d) \rVert}$$

where q is the object to be processed, d is the current standard object, v(q) is the TF-IDF feature vector of the object to be processed q, and v(d) is the TF-IDF feature vector of the standard object d.

Of course, it is understood that the above similarity calculation formula is only an example, and may be calculated in other ways.
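A TF-IDF similarity of this kind can be sketched in pure Python as cosine similarity between sparse TF-IDF vectors. The smoothed IDF variant, whitespace tokenization, and the toy corpus are assumptions made for the example; retrieval tools such as Lucene apply their own weighting and normalization.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Compute sparse TF-IDF vectors with a smoothed IDF (an assumption)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in tf}
        )
    return vectors


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Made-up corpus: two standard objects followed by the query.
standards = [["appendix", "removal"], ["gallbladder", "removal"]]
query = ["appendix", "removal", "laparoscopic"]
vecs = tfidf_vectors(standards + [query])
sims = [cosine(vecs[-1], vec) for vec in vecs[:-1]]
```

In this toy corpus the query shares two tokens with the first standard object and one with the second, so the first similarity is the larger one.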

In addition, when the object to be processed is of another type, such as a text, an image, or a table, an existing correlation calculation method suited to that type may be used to determine the correlation between the object to be processed and the standard object, which is not limited herein.

In an optional implementation manner of this embodiment, the step of retrieving, by using the object to be processed, the second object set to obtain an original object matching the object to be processed further includes the following steps:

calculating the similarity between the object to be processed and the original object;

and determining the original object matched with the object to be processed according to the similarity.

In this optional implementation, whether the object to be processed matches an original object may be determined by calculating the similarity. After the similarity between the object to be processed and each original object is calculated, the second standard objects corresponding to a preset number of original objects with the highest similarity may be determined as second candidate objects, or the second standard objects corresponding to the original objects whose similarity is higher than a preset value may be determined as second candidate objects.

For the similarity calculation method between the object to be processed and the original object, reference may be made to the similarity calculation method between the object to be processed and the first standard object in the foregoing embodiment, and details are not repeated here.

Fig. 2 shows a schematic diagram of an application in a clinical terminology normalization scenario according to an embodiment of the present disclosure. As shown in fig. 2, the code file may include all standard words (corresponding to the first standard objects in the embodiments of the present disclosure) formulated by the relevant department for clinical medicine, together with the code IDs assigned to them. The annotation file may include collected surgical original words (corresponding to the original objects in the embodiments of the present disclosure) and the standard words corresponding to them (corresponding to the second standard objects in the embodiments of the present disclosure). Retrieval indexes can be built in advance for the code file and the annotation file. After a surgical original word (namely, a surgical name written by a doctor in a case document) input by a user is received from a client, first candidate standard words and second candidate standard words (corresponding to the candidate objects in the embodiments of the present disclosure) are retrieved from the code file and the annotation file respectively. For each first and second candidate standard word, a word sequence S formed by the candidate standard word and the surgical original word is input into the recognition model for scoring, and the target standard word corresponding to the surgical original word to be normalized is output based on the scores given by the recognition model; that is, the candidate answer with the highest score is determined as the target standard word.

The standardized surgical original words obtained by the embodiments of the present disclosure can be provided to hospitals. The same approach can also be used to standardize original words in health records or physical examination records, and to provide patients with standardized guide lists that help them describe their condition in a standardized way.

The following describes the generation process of candidate answers and the scoring process of candidate answers in detail by taking the Lucene search tool and the Transformer framework as examples.

Lucene is a toolkit for full-text indexing and search. Its default ranking is based on TF-IDF and the vector space model, so target results that are textually similar to the query phrase (here, the surgical original word) can be found conveniently and quickly. The retrieval process is as follows:

1) given a surgical original word q and a word d to be searched in the index (a standard word in the coding file or a surgical original word in the labeling file), calculating TF-IDF features for q and d, denoted v(q) and v(d) respectively;

2) calculating the similarity score of q and d with the vector space model, that is, as the cosine similarity of the two feature vectors:

$$\mathrm{score}(q, d) = \frac{v(q) \cdot v(d)}{\lVert v(q) \rVert \, \lVert v(d) \rVert}$$

3) obtaining a plurality of candidate answers from the coding file and the labeling file, respectively, according to the similarity scores.
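The three retrieval steps above can be illustrated with a small pure-Python sketch. It is not Lucene's scoring code: the character-level features, the smoothed IDF formula, and the function names `tfidf_vectors`, `cosine`, and `retrieve` are assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def tfidf_vectors(texts: List[str]) -> Tuple[List[Dict[str, float]], Dict[str, float]]:
    """Build character-level TF-IDF vectors v(d) for every indexed word d."""
    n = len(texts)
    doc_freq = Counter(ch for text in texts for ch in set(text))
    # Smoothed IDF (an assumption) so that every weight stays positive.
    idf = {ch: math.log((1 + n) / (1 + df)) + 1.0 for ch, df in doc_freq.items()}
    vectors = [{ch: tf * idf[ch] for ch, tf in Counter(text).items()} for text in texts]
    return vectors, idf


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors."""
    dot = sum(w * v.get(key, 0.0) for key, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query: str, texts: List[str], top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k indexed words ranked by similarity to the query."""
    vectors, idf = tfidf_vectors(texts)
    unseen_idf = math.log(1 + len(texts)) + 1.0  # IDF for characters absent from the index
    v_q = {ch: tf * idf.get(ch, unseen_idf) for ch, tf in Counter(query).items()}
    scored = [(text, cosine(v_q, v_d)) for text, v_d in zip(texts, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy index standing in for the coding file.
index = ["percutaneous vertebroplasty", "laminectomy", "appendectomy"]
ranked = retrieve("vertebroplasty", index, top_k=2)
print(ranked)
```

In practice the same ranking would be run once against the coding-file index and once against the labeling-file index, each yielding its own candidate list.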

After the candidate standard words are obtained through Lucene retrieval, they are scored by a recognition model based on the Transformer framework.

Fig. 3 illustrates a schematic structural diagram of a recognition model according to an embodiment of the present disclosure. As shown in Fig. 3, the Transformer framework includes a BERT encoder and a decoder composed of a multilayer perceptron and a Softmax function. In the scoring process, following the BERT input specification, the surgical original word to be processed and the current candidate answer are segmented into characters and arranged in the form "[CLS] surgical original word [SEP] standard word [SEP]" before being input into the BERT encoder. That is, a given surgical original word and a candidate answer are concatenated according to the BERT specification into a sequence S. The sequence S is input into the BERT encoder, and the output vector V at the [CLS] position is taken as the vector feature representation of S, where V = BERT(S). V is then input into the multilayer perceptron, which converts V into a 2-dimensional vector P, where P = W^T V. P is normalized by the Softmax operation, and its first dimension is taken as the probability Prob, a value between 0 and 1. During training of the recognition model, the model parameters may be optimized by minimizing a cross-entropy loss function computed from Prob.
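The decoder step described above (P = W^T V, Softmax, and the cross-entropy loss on Prob) can be sketched as follows. This is a minimal sketch under stated assumptions: the 4-dimensional vector V and the weight matrix W are made-up toy values standing in for a real BERT output and trained parameters, the perceptron is reduced to a single linear map without bias, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def linear(weights: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    """Compute P = W^T V, where W has shape (len(v), 2)."""
    return [
        sum(weights[i][j] * v[i] for i in range(len(v)))
        for j in range(len(weights[0]))
    ]


def softmax(p: Sequence[float]) -> List[float]:
    """Numerically stable Softmax."""
    m = max(p)
    exps = [math.exp(x - m) for x in p]
    total = sum(exps)
    return [e / total for e in exps]


def relevance_probability(weights: Sequence[Sequence[float]], v: Sequence[float]) -> float:
    """Prob: the first dimension of Softmax(W^T V)."""
    return softmax(linear(weights, v))[0]


def cross_entropy(prob: float, label: int) -> float:
    """Cross-entropy for one pair; label is 1 if the candidate is the correct standard word."""
    eps = 1e-12  # avoids log(0)
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1.0 - prob + eps))


# Toy values (assumptions): V stands in for BERT(S), W for the trained weights.
V = [0.5, -1.0, 2.0, 0.0]
W = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5], [0.0, 0.0]]

prob = relevance_probability(W, V)
loss_if_correct = cross_entropy(prob, 1)
loss_if_wrong = cross_entropy(prob, 0)
print(f"Prob={prob:.4f} loss(label=1)={loss_if_correct:.4f} loss(label=0)={loss_if_wrong:.4f}")
```

With these toy values P is [1.5, -2.0], so Prob is close to 1 and the loss is small when the candidate is labeled correct and large when it is labeled incorrect.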

As shown in Fig. 3, the surgical original word is "vertebroplasty" and the standard word is "percutaneous vertebroplasty", so "[CLS] vertebroplasty [SEP] percutaneous vertebroplasty [SEP]" is input into the BERT encoder. The output of the BERT encoder, namely the vector feature representation at the "[CLS]" position, is then input into the multilayer perceptron to obtain a 2-dimensional vector. After passing through the Softmax function, the 2-dimensional vector is normalized into a score between 0 and 1.
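As an illustration of the input arrangement in this example, the sketch below splits both words into characters and builds the "[CLS] ... [SEP] ... [SEP]" sequence. The function name `build_sequence` is hypothetical, and the segment ids (0 for the first segment, 1 for the second) follow the usual BERT convention rather than anything stated in this disclosure. A real BERT tokenizer would additionally map tokens to vocabulary ids, which is omitted here.

```python
from typing import List, Tuple

CLS, SEP = "[CLS]", "[SEP]"


def build_sequence(original_word: str, candidate: str) -> Tuple[List[str], List[int]]:
    """Arrange two character-segmented words as [CLS] original [SEP] candidate [SEP]."""
    tokens = [CLS, *original_word, SEP, *candidate, SEP]
    # Segment ids: 0 covers [CLS], the original word and the first [SEP];
    # 1 covers the candidate and the final [SEP] (standard BERT convention).
    segment_ids = [0] * (len(original_word) + 2) + [1] * (len(candidate) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_sequence("vertebroplasty", "percutaneous vertebroplasty")
print(tokens[:4], tokens[-3:])
```

Note that character-level splitting also turns the space in the candidate into a token; this is harmless for the sketch but is one of the details a real tokenizer handles differently.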

The following are apparatus embodiments of the present disclosure, which may be used to perform the method embodiments of the present disclosure.

A data processing apparatus according to an embodiment of the present disclosure may be implemented as part or all of an electronic device by software, hardware, or a combination of both. The data processing apparatus includes:

an acquisition module configured to acquire an object to be processed;

a retrieval module configured to retrieve candidate objects matched with the object to be processed from an object set comprising standard objects;

a determination module configured to determine a target object corresponding to the object to be processed from the candidate object by using a recognition model.

In an optional implementation of this embodiment, the retrieval module includes:

a first acquisition submodule configured to acquire a first object set, the first object set comprising a first standard object;

a first retrieval submodule configured to retrieve from the first object set by using the object to be processed, to obtain a first candidate object matched with the object to be processed.

In an optional implementation of this embodiment, the retrieval module includes:

a second acquisition submodule configured to acquire a second object set, the second object set comprising an original object and a second standard object corresponding to the original object;

a second retrieval submodule configured to retrieve from the second object set by using the object to be processed, obtain an original object matched with the object to be processed, and determine the second standard object corresponding to that original object as a second candidate object.
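The behaviour of the second retrieval submodule, matching against original objects and returning their associated standard objects, can be sketched as below. The sketch makes several assumptions: the second object set is modelled as a plain dictionary with made-up toy entries, the similarity is computed with Python's `difflib.SequenceMatcher` rather than an index-based retrieval tool, and the function name `second_candidates` is hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def second_candidates(
    to_process: str, annotations: Dict[str, str], top_k: int = 2
) -> List[Tuple[str, str, float]]:
    """Match `to_process` against original objects; return (original, standard, similarity)."""
    scored = [
        (original, standard, SequenceMatcher(None, to_process, original).ratio())
        for original, standard in annotations.items()
    ]
    # The most similar original objects come first; their standard objects
    # are the second candidate objects.
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_k]


# Toy second object set (hypothetical entries): original object -> second standard object.
toy_annotations = {
    "vertebroplasty": "percutaneous vertebroplasty",
    "appendix removal": "appendectomy",
    "gallbladder removal": "cholecystectomy",
}

matches = second_candidates("vertebroplasty surgery", toy_annotations, top_k=1)
print(matches[0][1])
```

The point of the indirection is that the query is compared with previously seen original objects, which tend to resemble new input more closely than the standard objects themselves do.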

In an optional implementation of this embodiment, the determination module includes:

a recognition submodule configured to process the object to be processed and the current candidate object by using a feature representation model in the recognition model, to obtain correlation features of the object to be processed and the current candidate object;

a first determination submodule configured to determine the target object by using the correlation features.

In an optional implementation of this embodiment, the first determination submodule includes:

a dimension reduction submodule configured to perform dimension reduction on the correlation features by using a multilayer perceptron in the recognition model;

a processing submodule configured to process the dimension-reduced correlation features by using a normalization model in the recognition model, to obtain the correlation between the candidate object and the object to be processed;

a second determination submodule configured to determine the target object according to the correlation.

In an optional implementation of this embodiment, the first retrieval submodule includes:

a first calculation submodule configured to calculate a similarity between the object to be processed and the first standard object in the first object set;

a third determination submodule configured to determine the first candidate object according to the similarity.

In an optional implementation of this embodiment, the second retrieval submodule includes:

a second calculation submodule configured to calculate a similarity between the object to be processed and the original object;

a fourth determination submodule configured to determine the original object matched with the object to be processed according to the similarity.

The data processing apparatus in this embodiment corresponds to the data processing method described above. For specific details, refer to the description of the data processing method, which is not repeated here.

Fig. 4 is a schematic structural diagram of an electronic device suitable for implementing a data processing method according to an embodiment of the present disclosure.

As shown in Fig. 4, the electronic device 400 includes a processing unit 401, which may be implemented as a CPU, GPU, FPGA, NPU, or other processing unit. The processing unit 401 may execute the various processes in any of the method embodiments of the present disclosure according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. The RAM 403 also stores various programs and data necessary for the operation of the electronic device 400. The processing unit 401, the ROM 402, and the RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.

The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a display device such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN card, a modem, or the like. The communication section 409 performs communication processing via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read from it can be installed into the storage section 408 as needed.

In particular, according to embodiments of the present disclosure, any of the methods described above with reference to embodiments of the present disclosure may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing any of the methods of the embodiments of the present disclosure. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units or modules described in the embodiments of the present disclosure may be implemented by software or hardware. The units or modules described may also be provided in a processor, and the names of the units or modules do not in some cases constitute a limitation of the units or modules themselves.

As another aspect, the present disclosure also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the apparatus in the above-described embodiment; or it may be a separate computer readable storage medium not incorporated into the device. The computer readable storage medium stores one or more programs for use by one or more processors in performing the methods described in the present disclosure.

The foregoing description covers only the preferred embodiments of the present disclosure and illustrates the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the present disclosure is not limited to technical solutions formed by the specific combination of the above features, but also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept. One example is a technical solution formed by replacing the above features with (but not limited to) features having similar functions disclosed in the present disclosure.
