Tissue sample classification method, device, equipment and storage medium

Document No.: 1923591  Publication date: 2021-12-03

Abstract: This technology, "Tissue sample classification method, device, equipment and storage medium" (组织样本的分类方法、装置、设备和存储介质), was created by 蔡德叶虎马兆轩肖凯文韩骁 on 2021-09-01. The application discloses a method, a device, equipment and a storage medium for classifying tissue samples, and belongs to the technical field of computers. The method comprises: acquiring image data of a target tissue sample; determining, based on the image data of the target tissue sample and a suspicious positive cell detection model, feature vectors of a plurality of suspicious positive cells and a score value corresponding to the feature vector of each suspicious positive cell; obtaining, from the feature vectors of the plurality of suspicious positive cells, a plurality of reference feature vectors that meet a preset score value condition; and determining a target sample type of the target tissue sample based on the plurality of reference feature vectors and a sample classification model. The application thus provides a sample classification method that computer equipment can carry out automatically, gives doctors a reference basis for determining the sample classification, and improves the accuracy of the determined target sample type.

1. A method of classifying a tissue sample, the method comprising:

acquiring image data of a target tissue sample;

determining a plurality of feature vectors of suspicious positive cells and a score value corresponding to the feature vector of each suspicious positive cell based on the image data of the target tissue sample and a suspicious positive cell detection model, wherein the score value is used for indicating the classification confidence of the classification result of the feature vector of the suspicious positive cell corresponding to the score value;

obtaining a plurality of reference feature vectors meeting a preset score value condition from the feature vectors of the plurality of suspicious positive cells;

determining a target sample type for the target tissue sample based on the plurality of reference feature vectors and a sample classification model.

2. The method according to claim 1, wherein the obtaining a plurality of reference feature vectors meeting a preset score value condition from the feature vectors of the plurality of suspicious positive cells comprises:

arranging the feature vectors of the plurality of suspicious positive cells in descending order of the corresponding score values, and determining the first preset number of feature vectors as the plurality of reference feature vectors; or,

and acquiring the feature vectors of which the corresponding score values are greater than a preset score threshold value from the feature vectors of the suspicious positive cells, and determining the feature vectors as the plurality of reference feature vectors.
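
The two alternative selection rules in claim 2 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the toy feature values, and the choices `k=2` and `threshold=0.4` are all assumptions made for the example.

```python
import numpy as np

def select_top_k(features, scores, k):
    """First alternative: keep the k feature vectors with the largest scores."""
    order = np.argsort(scores)[::-1][:k]  # indices in descending score order
    return features[order]

def select_above_threshold(features, scores, threshold):
    """Second alternative: keep feature vectors whose score exceeds the threshold."""
    return features[scores > threshold]

# Toy data (made up): 5 detected cells, each with a 3-dimensional feature vector.
features = np.arange(15, dtype=float).reshape(5, 3)
scores = np.array([0.2, 0.9, 0.5, 0.7, 0.1])

top2 = select_top_k(features, scores, k=2)                       # cells 1 and 3
above = select_above_threshold(features, scores, threshold=0.4)  # cells 1, 2, 3
```

The top-k rule always yields a fixed number of reference feature vectors, whereas the threshold rule yields a number that varies from sample to sample.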

3. The method of claim 1, wherein determining the target sample type for the target tissue sample based on the plurality of reference feature vectors and a sample classification model comprises:

determining a plurality of reference feature vector sets based on the plurality of reference feature vectors;

for each reference feature vector set, inputting each reference feature vector in the reference feature vector set into the sample classification model to obtain a probability value of each sample type corresponding to the reference feature vector set;

for each sample type, calculating an average value of probability values of the sample types corresponding to the multiple reference feature vector sets to obtain an average probability value corresponding to each sample type;

and determining the sample type corresponding to the maximum average probability value as the target sample type of the target tissue sample.
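
The averaging and arg-max steps of claim 3 might look like the sketch below. The claims do not specify the architecture of the sample classification model, so `toy_sample_classifier` (mean pooling, a linear layer, then softmax) is a purely hypothetical stand-in, and the weights and feature sets are random placeholders.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def toy_sample_classifier(feature_set, weights):
    """Hypothetical stand-in model: mean-pool the set, linear layer, softmax."""
    pooled = feature_set.mean(axis=0)
    return softmax(pooled @ weights)

def classify_sample(feature_sets, weights):
    # One probability vector per reference feature vector set.
    probs = np.stack([toy_sample_classifier(s, weights) for s in feature_sets])
    # Average over the sets, separately for each sample type.
    mean_probs = probs.mean(axis=0)
    # The sample type with the largest average probability is the target type.
    return int(mean_probs.argmax()), mean_probs, probs

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))                           # 4 features, 3 sample types
feature_sets = [rng.normal(size=(5, 4)) for _ in range(8)]  # 8 sets of 5 vectors
target_type, mean_probs, probs = classify_sample(feature_sets, weights)
```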

4. The method of claim 3, wherein determining a plurality of sets of reference feature vectors based on the plurality of reference feature vectors comprises:

and performing Monte Carlo sampling multiple times on the plurality of reference feature vectors to obtain the plurality of reference feature vector sets, wherein each reference feature vector set comprises a second preset number of reference feature vectors.
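
A minimal sketch of the repeated sampling in claim 4 is given below. The claim does not say whether vectors are drawn with or without replacement; this example assumes sampling without replacement inside each set, and the names `monte_carlo_sets`, `set_size` and `num_sets` are illustrative only.

```python
import numpy as np

def monte_carlo_sets(reference_features, set_size, num_sets, seed=0):
    """Draw num_sets random subsets, each holding set_size reference vectors."""
    rng = np.random.default_rng(seed)
    n = reference_features.shape[0]
    sets = []
    for _ in range(num_sets):
        # Assumption: no vector is repeated within a single set.
        idx = rng.choice(n, size=set_size, replace=False)
        sets.append(reference_features[idx])
    return sets

# Toy data (made up): 10 reference feature vectors of dimension 2.
reference_features = np.arange(20, dtype=float).reshape(10, 2)
sets = monte_carlo_sets(reference_features, set_size=4, num_sets=3)
```

Here `set_size` plays the role of the second preset number, and `num_sets` is the number of sampling rounds.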

5. The method of claim 3, further comprising:

and determining the uncertainty of the target sample type based on the probability value of each sample type corresponding to each reference feature vector set and the average probability value corresponding to each sample type.

6. The method of claim 5, wherein the determining the uncertainty of the target sample type based on the probability value of the target sample type corresponding to each reference feature vector set and the average probability value corresponding to the target sample type comprises:

respectively calculating relative entropies between the probability values of the multiple sample types corresponding to each reference feature vector set and the average probability values corresponding to the multiple sample types to obtain the relative entropy corresponding to each reference feature vector set;

and determining the average value of the relative entropies corresponding to all the reference feature vector sets as the uncertainty of the target sample type.
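
The uncertainty of claim 6 can be sketched as the mean relative entropy (Kullback-Leibler divergence) between each set's probability vector and the average probability vector. The direction of the divergence, KL(p_i || p_mean), and the small clipping constant `eps` are assumptions of this example; the claim does not fix either.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy KL(p || q), clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def uncertainty(probs):
    """Mean relative entropy of each set's probabilities against their average."""
    mean_probs = probs.mean(axis=0)
    return float(np.mean([kl_divergence(p, mean_probs) for p in probs]))

# Made-up probability vectors for 4 reference feature vector sets, 3 sample types.
agree = np.array([[0.8, 0.1, 0.1]] * 4)
disagree = np.array([[0.90, 0.05, 0.05],
                     [0.05, 0.90, 0.05],
                     [0.05, 0.05, 0.90],
                     [0.90, 0.05, 0.05]])

u_agree = uncertainty(agree)        # sets agree, so uncertainty is close to zero
u_disagree = uncertainty(disagree)  # sets disagree, so uncertainty is larger
```

Under this reading, a value near zero means the sampled sets give nearly identical predictions, while a larger value signals that the prediction depends on which cells were sampled.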

7. The method of claim 1, wherein determining the target sample type for the target tissue sample based on the plurality of reference feature vectors and a sample classification model comprises:

inputting the plurality of reference feature vectors into the sample classification model to obtain a probability value of each sample type;

and determining the sample type corresponding to the maximum probability value as the target sample type of the target tissue sample.

8. The method of claim 1, further comprising:

acquiring image data of a training tissue sample and a sample type of the training tissue sample;

determining probability sequence data as reference output data based on the sample types of the training tissue samples, wherein the probability sequence data is sequence data composed of probability values of a plurality of sample types arranged in a preset order, in the probability sequence data, the probability value of the sample type of the training tissue sample is 1, and the probability values of other sample types except the sample type of the training tissue sample are 0;

determining a feature vector of each suspicious positive cell in a plurality of suspicious positive cells corresponding to the training tissue sample and a score value corresponding to the feature vector of each suspicious positive cell based on the image data of the training tissue sample and the suspicious positive cell detection model;

arranging the feature vectors of the plurality of suspicious positive cells in descending order of the corresponding score values, obtaining the first preset number of feature vectors, and determining them as a plurality of sample feature vectors;

performing Monte-Carlo sampling for multiple times in a plurality of sample feature vectors to obtain a sample feature vector set, wherein the sample feature vector set comprises a second preset number of sample feature vectors;

inputting each sample feature vector in the sample feature vector set into a sample classification model to be trained to obtain actual output data;

and training the sample classification model to be trained based on the actual output data and the reference output data to obtain the trained sample classification model.
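
The reference output data of claim 8 is a one-hot probability sequence, which can be sketched as below. The sample type names in `SAMPLE_TYPES` are hypothetical placeholders for the preset order, and the cross-entropy loss is an assumed choice; the claim only states that training is based on the actual and reference output data.

```python
import numpy as np

# Hypothetical sample types in a preset order (names are placeholders).
SAMPLE_TYPES = ["negative", "positive_stage_1", "positive_stage_2"]

def reference_output(sample_type, sample_types=SAMPLE_TYPES):
    """Probability sequence: 1 for the labelled sample type, 0 for all others."""
    target = np.zeros(len(sample_types))
    target[sample_types.index(sample_type)] = 1.0
    return target

def cross_entropy(actual_output, reference, eps=1e-12):
    """Assumed training loss comparing actual output with reference output."""
    return float(-np.sum(reference * np.log(np.clip(actual_output, eps, 1.0))))

reference = reference_output("positive_stage_1")
loss_good = cross_entropy(np.array([0.1, 0.8, 0.1]), reference)  # close to label
loss_bad = cross_entropy(np.array([0.7, 0.2, 0.1]), reference)   # far from label
```

An actual output that puts more probability on the labelled type gives a smaller loss, which is the signal a training loop would minimise.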

9. A device for classifying a tissue sample, the device comprising:

a first acquisition module for acquiring image data of a target tissue sample;

a first determination module, configured to determine, based on the image data of the target tissue sample and a suspicious positive cell detection model, feature vectors of a plurality of suspicious positive cells and a score value corresponding to the feature vector of each suspicious positive cell, where the score value is used to indicate a classification confidence of a classification result of the feature vector of the suspicious positive cell corresponding to the score value;

a second obtaining module, configured to obtain, from the feature vectors of the suspicious positive cells, a plurality of reference feature vectors that satisfy a preset score value condition;

a second determination module to determine a target sample type for the target tissue sample based on the plurality of reference feature vectors and a sample classification model.

10. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction that is loaded and executed by the processor to perform operations performed by the method of classifying a tissue sample according to any one of claims 1 to 8.

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for classifying a tissue sample.

Background

Cancers and tumor diseases can be effectively prevented through early screening and timely treatment. Cervical cancer, for example, is one of the common malignant tumors in women and can be effectively prevented in this way. Exfoliative cytology, a well-established screening tool, plays an important role in early screening: a tissue sample on a patient's exfoliative cytology slide is examined to determine whether the sample is negative or positive and, if positive, which positive type or stage it belongs to.

At present, an exfoliative cytology examination usually involves preparing a smear on an exfoliative cytology slide, after which a doctor observes the morphology of each cell on the slide under a microscope or similar instrument to determine which sample type the tissue sample belongs to.

However, a doctor's manual work is subject to various sources of instability, which may affect the finally determined sample type and reduce the accuracy of the classification result.

Disclosure of Invention

The embodiment of the application provides a method for classifying tissue samples, which can solve the problem that the accuracy of sample types obtained in the prior art is relatively low.

In a first aspect, a method for classifying a tissue sample is provided, the method comprising:

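acquiring image data of a target tissue sample;

determining feature vectors of a plurality of suspicious positive cells and a score value corresponding to the feature vector of each suspicious positive cell based on the image data of the target tissue sample and a suspicious positive cell detection model, wherein the score value indicates the classification confidence of the classification result of the feature vector of the corresponding suspicious positive cell;

obtaining a plurality of reference feature vectors meeting a preset score value condition from the feature vectors of the plurality of suspicious positive cells;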

determining a target sample type for the target tissue sample based on the plurality of reference feature vectors and a sample classification model.

In one possible implementation, the obtaining a plurality of reference feature vectors meeting a preset score value condition from the feature vectors of the plurality of suspicious positive cells includes:

arranging the feature vectors of the plurality of suspicious positive cells in descending order of the corresponding score values, and determining the first preset number of feature vectors as the plurality of reference feature vectors; or,

acquiring, from the feature vectors of the plurality of suspicious positive cells, the feature vectors whose corresponding score values are greater than a preset score threshold, and determining them as the plurality of reference feature vectors.

In one possible implementation, the determining a target sample type of the target tissue sample based on the plurality of reference feature vectors and a sample classification model includes:

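determining a plurality of reference feature vector sets based on the plurality of reference feature vectors;

for each reference feature vector set, inputting each reference feature vector in the reference feature vector set into the sample classification model to obtain a probability value of each sample type corresponding to the reference feature vector set;

for each sample type, calculating the average of the probability values of that sample type over the plurality of reference feature vector sets to obtain an average probability value corresponding to each sample type;

and determining the sample type corresponding to the maximum average probability value as the target sample type of the target tissue sample.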

In one possible implementation, the determining a plurality of reference feature vector sets based on the plurality of reference feature vectors includes:

and performing Monte Carlo sampling multiple times on the plurality of reference feature vectors to obtain the plurality of reference feature vector sets, wherein each reference feature vector set comprises a second preset number of reference feature vectors.

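In one possible implementation, the method further includes:

determining the uncertainty of the target sample type based on the probability value of each sample type corresponding to each reference feature vector set and the average probability value corresponding to each sample type.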

In one possible implementation manner, the determining an uncertainty of the target sample type based on the probability value of the target sample type corresponding to each reference feature vector set and the average probability value corresponding to the target sample type includes: