Biomarker and detection kit for predicting postoperative late-stage recurrence risk of liver cancer patient

Document No.: 128536. Publication date: 2021-10-22.

Reading note: this technology, "Biomarker and detection kit for predicting postoperative late-stage recurrence risk of liver cancer patient", was designed by 李臻, 张玉元, 李鑫, 吕培杰, 詹鹏超, 吴阳, 葛鹏磊, 王彩鸿, 谢滢滢, 谢炳灿 and 叶书文 on 2021-08-27. Main content: The invention belongs to the technical field of medical biotechnology and discloses a biomarker for predicting the postoperative late-stage recurrence risk of a liver cancer patient, wherein the biomarker is a combination of the ANGPT4 gene, FAM78B gene, COLEC12 gene, TRABD2A gene, AMFR gene and LMTK3 gene; the kit contains a reagent for detecting the expression level of the biomarker. By detecting the expression levels of the ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 genes in the liver cancer tissue of an HCC patient, the postoperative late-stage recurrence risk of the liver cancer patient can be predicted.

1. A biomarker for predicting postoperative late stage recurrence risk of a liver cancer patient, wherein the biomarker is a combination of ANGPT4 gene, FAM78B gene, COLEC12 gene, TRABD2A gene, AMFR gene and LMTK3 gene.

2. The use of a reagent for detecting the expression level of the biomarker according to claim 1 in the preparation of a product for predicting the risk of late postoperative recurrence of a liver cancer patient.

3. The use of claim 2, wherein the product is used for detecting the expression level of the biomarker in a sample by RT-PCR, real-time quantitative PCR, in situ hybridization, Northern blotting, microarray or high throughput sequencing platform.

4. The use according to claim 3, wherein the reagent is a specific primer for amplifying the biomarker or a probe that hybridizes to the biomarker.

5. The use according to claim 4, characterized in that the nucleotide sequence of the primers specific for amplifying the ANGPT4 gene is shown in SEQ ID No.1 and SEQ ID No. 2; the nucleotide sequence of the specific primer for amplifying the FAM78B gene is shown as SEQ ID NO.3 and SEQ ID NO. 4; the nucleotide sequence of the specific primer for amplifying the COLEC12 gene is shown as SEQ ID NO.5 and SEQ ID NO. 6; the nucleotide sequence of the specific primer for amplifying the TRABD2A gene is shown as SEQ ID NO.7 and SEQ ID NO. 8; the nucleotide sequence of the specific primer for amplifying the AMFR gene is shown as SEQ ID NO.9 and SEQ ID NO. 10; the nucleotide sequence of the specific primer for amplifying the LMTK3 gene is shown as SEQ ID NO.11 and SEQ ID NO. 12.

6. The use of claim 3, wherein the sample is a tissue, cell or body fluid; the product is a chip, a preparation or a kit.

7. The use of any one of claims 2-6, wherein the formula for calculating the risk prediction value of the product for predicting the risk of late postoperative recurrence of the liver cancer patient is shown as formula I:

Risk prediction value = 0.356 + 0.06 × ANGPT4 + 0.110 × FAM78B + 0.046 × COLEC12 + 0.063 × TRABD2A + 0.049 × AMFR + 0.004 × LMTK3   (Formula I)

In Formula I, ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 represent the expression levels of the ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 genes, respectively, in the test sample.

8. A kit for predicting the risk of late-stage postoperative recurrence of a liver cancer patient, comprising reagents for detecting the expression level of the biomarker of claim 1.

9. The kit of claim 8, wherein the reagent is a specific primer that amplifies the biomarker or a probe that hybridizes to the biomarker.

10. The kit of claim 8, wherein the formula for calculating the risk prediction value of predicting the risk of late postoperative recurrence of a liver cancer patient is shown in formula I:

Risk prediction value = 0.356 + 0.06 × ANGPT4 + 0.110 × FAM78B + 0.046 × COLEC12 + 0.063 × TRABD2A + 0.049 × AMFR + 0.004 × LMTK3   (Formula I)

In Formula I, ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 represent the expression levels of the ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 genes, respectively, in the test sample.

Technical Field

The invention belongs to the technical field of medical biology, and particularly relates to a biomarker and a detection kit for predicting postoperative late-stage recurrence risk of a liver cancer patient.

Background

Hepatocellular carcinoma (HCC) is a highly malignant tumor with a poor prognosis. Hepatectomy and interventional therapy are currently the mainstay of treatment for stage I-III HCC patients, but high recurrence rates remain a major obstacle to improving long-term survival, with nearly 70% of patients relapsing within 5 years of surgery. HCC recurrence is usually divided into early recurrence (ER) and late recurrence (LR) according to the time from surgery to first recurrence, usually with 2 years as the cutoff: counting from surgical resection, recurrence after more than 2 years is classified as late recurrence. In clinical practice, many patients who survive 2 years after radical hepatectomy without tumor recurrence are not monitored regularly and may lose the opportunity for treatment by the time symptoms appear. There is therefore a need to identify patients susceptible to late recurrence of HCC and to provide an optimized strategy for recurrence monitoring and treatment.

The pattern and extent of initial relapse have been reported to differ between patients with early and late recurrence of HCC. Late HCC recurrence, which is often considered a de novo tumor, may be related to the evolution of the underlying chronic liver disease and shows different biological behavior from early HCC recurrence. Studies have shown that the relationship between cirrhosis and late recurrence of HCC reflects a clinical "field effect": chronic hepatitis inflammation and fibrosis accelerate the development of HCC by creating a carcinogenic microenvironment in the liver. Therefore, in view of the critical role of genomic alterations in tumorigenesis and progression, it is necessary to decipher the genomic landscape of patients with late HCC recurrence. In addition, we hope to translate this knowledge into new biomarkers and drug targets that inform disease monitoring and tumor treatment decisions, ultimately improving the clinical outcome of HCC patients.

Clinicians commonly select a treatment strategy based on tumor-node-metastasis (TNM) staging. However, HCC patients with the same TNM stage tend to have different prognoses and therefore require more individualized treatment strategies. Until recently, published studies on late recurrence of HCC remained rare and were mostly limited to clinical features. For example, a study at the Eastern Hepatobiliary Surgery Hospital investigated the risk factors for late recurrence after hepatectomy for hepatitis B virus (HBV)-related HCC. In this study, Wang et al. found that the recurrence rate peaked at 1-2 and 4-5 years post-operatively (23% and 35% per year, respectively), and concluded that male sex, cirrhosis and a high preoperative HBV-DNA load were associated with late recurrence of HCC. However, most previous studies have not addressed how to accurately predict late HCC recurrence or how to intervene rationally in patients at high risk of it. There is therefore a need to develop a product that can be used to predict the risk of late recurrence in HCC patients.

Disclosure of Invention

Aiming at the problems and the defects in the prior art, the invention aims to provide a biomarker and a detection kit for predicting the postoperative late-stage recurrence risk of a liver cancer patient.

In order to realize the purpose of the invention, the technical scheme adopted by the invention is as follows:

In a first aspect, the invention provides a biomarker for predicting the risk of postoperative late-stage recurrence of a liver cancer patient, the biomarker being a combination of the ANGPT4 gene, FAM78B gene, COLEC12 gene, TRABD2A gene, AMFR gene and LMTK3 gene. The ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 genes are up-regulated in the liver cancer tissues of patients with late HCC recurrence.

In a second aspect, the invention provides an application of a reagent for detecting the expression level of the biomarker in the first aspect in preparing a product for predicting the risk of late-stage postoperative recurrence of a liver cancer patient.

According to the above application, preferably, the product detects the expression level of the biomarker in a sample by RT-PCR, real-time quantitative PCR, in situ hybridization, Northern blotting, a chip or a high-throughput sequencing platform.

According to the above-mentioned use, preferably, the reagent is a specific primer for amplifying the biomarker or a probe that hybridizes to the biomarker.

According to the above-mentioned application, preferably, the reagent for detecting the expression level of the biomarker in the sample by RT-PCR or real-time quantitative PCR comprises a specific primer for amplifying the biomarker.

According to the above-mentioned application, preferably, the reagent for detecting the expression level of the biomarker in the sample by in situ hybridization comprises a probe that hybridizes to a nucleotide sequence of the biomarker.

According to the above application, preferably, the reagent for detecting the expression level of the biomarker in the sample by Northern blotting comprises a probe that hybridizes to a nucleotide sequence of the biomarker.

According to the above-mentioned application, preferably, the reagent for detecting the expression level of the biomarker in the sample by the chip comprises a probe that hybridizes to a nucleotide sequence of the biomarker.

According to the above application, preferably, the nucleotide sequence of the specific primer for amplifying the ANGPT4 gene is shown as SEQ ID NO.1 and SEQ ID NO. 2; the nucleotide sequence of the specific primer for amplifying the FAM78B gene is shown as SEQ ID NO.3 and SEQ ID NO. 4; the nucleotide sequence of the specific primer for amplifying the COLEC12 gene is shown as SEQ ID NO.5 and SEQ ID NO. 6; the nucleotide sequence of the specific primer for amplifying the TRABD2A gene is shown as SEQ ID NO.7 and SEQ ID NO. 8; the nucleotide sequence of the specific primer for amplifying the AMFR gene is shown as SEQ ID NO.9 and SEQ ID NO. 10; the nucleotide sequence of the specific primer for amplifying the LMTK3 gene is shown as SEQ ID NO.11 and SEQ ID NO. 12.

According to the above-mentioned application, preferably, the sample includes, but is not limited to, tissue, cells and body fluid (blood, lymph). More preferably, the sample is tissue or blood.

According to the above-mentioned use, preferably, the product is a chip, a preparation or a kit.

According to the application, preferably, the calculation formula of the risk prediction value for predicting the postoperative late-stage recurrence risk of the liver cancer patient is shown as formula I:

Risk prediction value = 0.356 + 0.06 × ANGPT4 + 0.110 × FAM78B + 0.046 × COLEC12 + 0.063 × TRABD2A + 0.049 × AMFR + 0.004 × LMTK3   (Formula I)

In Formula I, ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 represent the expression levels of the ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 genes, respectively, in the test sample. More preferably, each term in Formula I represents the expression level of the corresponding gene in the liver cancer tissue of the HCC patient as detected by qRT-PCR.

In a third aspect, the present invention provides a kit for predicting risk of late stage postoperative recurrence of a liver cancer patient, the kit comprising a reagent for detecting an expression level of the biomarker according to the first aspect.

According to the kit, preferably, the reagent is a reagent for detecting the expression level of the biomarker in a sample by RT-PCR, real-time quantitative PCR, in situ hybridization, Northern blotting, a chip or a high-throughput sequencing platform.

According to the above kit, preferably, the reagent for detecting the expression level of the biomarker in the sample by RT-PCR or real-time quantitative PCR comprises a specific primer for amplifying the biomarker.

According to the above kit, preferably, the reagent for detecting the expression level of the biomarker in the sample by in situ hybridization comprises a probe that hybridizes to a nucleotide sequence of the biomarker.

According to the above kit, preferably, the reagent for detecting the expression level of the biomarker in the sample by Northern blotting comprises a probe that hybridizes to a nucleotide sequence of the biomarker.

According to the above kit, preferably, the reagent for detecting the expression level of the biomarker in the sample by the chip comprises a probe that hybridizes to a nucleotide sequence of the biomarker.

According to the kit, preferably, the nucleotide sequence of the specific primer for amplifying the ANGPT4 gene is shown as SEQ ID NO.1 and SEQ ID NO. 2; the nucleotide sequence of the specific primer for amplifying the FAM78B gene is shown as SEQ ID NO.3 and SEQ ID NO. 4; the nucleotide sequence of the specific primer for amplifying the COLEC12 gene is shown as SEQ ID NO.5 and SEQ ID NO. 6; the nucleotide sequence of the specific primer for amplifying the TRABD2A gene is shown as SEQ ID NO.7 and SEQ ID NO. 8; the nucleotide sequence of the specific primer for amplifying the AMFR gene is shown as SEQ ID NO.9 and SEQ ID NO. 10; the nucleotide sequence of the specific primer for amplifying the LMTK3 gene is shown as SEQ ID NO.11 and SEQ ID NO. 12.

According to the above-mentioned kit, preferably, the sample includes, but is not limited to, tissue, cells and body fluid (blood, lymph). More preferably, the sample is tissue or blood.

According to the above kit, preferably, the product is a chip, a preparation or a kit.

According to the kit, preferably, the calculation formula of the risk prediction value for predicting the postoperative late-stage recurrence risk of the liver cancer patient is shown as formula I:

Risk prediction value = 0.356 + 0.06 × ANGPT4 + 0.110 × FAM78B + 0.046 × COLEC12 + 0.063 × TRABD2A + 0.049 × AMFR + 0.004 × LMTK3   (Formula I)

In Formula I, ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 represent the expression levels of the ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 genes, respectively, in the test sample. More preferably, each term in Formula I represents the expression level of the corresponding gene in the liver cancer tissue of the HCC patient as detected by qRT-PCR.

In the present invention, the term "primer" refers to a nucleic acid sequence having a free 3' hydroxyl group that can bind complementarily to a template and enable reverse transcriptase or DNA polymerase to initiate template replication. A primer is an oligonucleotide whose sequence is complementary to a nucleic acid sequence of a specific gene. In the present invention, unless otherwise indicated, the term "probe" generally refers to a polynucleotide probe that can bind to another polynucleotide (often referred to as a "target polynucleotide") by complementary base pairing. Depending on the stringency of the hybridization conditions, a probe can bind to a target polynucleotide that lacks complete sequence complementarity to the probe. In the present invention, the term "postoperative" refers to the period after a liver cancer patient has undergone liver cancer resection.

Compared with the prior art, the invention has the following positive beneficial effects:

(1) By studying the transcriptome data of liver cancer tissues from patients with late HCC recurrence and patients without HCC recurrence, the invention finds for the first time that the ANGPT4, FAM78B, COLEC12, TRABD2A, AMFR and LMTK3 genes are differentially expressed between the liver cancer tissues of the two groups, and that the difference is statistically significant. The combination of these six genes can therefore serve as a biomarker for predicting the postoperative late-stage recurrence risk of liver cancer patients, and that risk can be predicted by detecting the expression levels of the six genes.

(2) Based on the screened biomarkers, the invention constructs a risk prediction value model for predicting the postoperative late-stage recurrence risk of liver cancer patients. Univariate and multivariate Cox regression analyses show that the risk prediction value given by the model is an independent risk factor for late recurrence in HCC patients. Moreover, when the model is used to distinguish HCC late-recurrence patients from HCC non-recurrence patients, the AUC of the ROC curve reaches 0.94 (validation group), indicating high accuracy. The model can therefore be used to predict the postoperative late-stage recurrence risk of liver cancer patients, so that a clinician can formulate an individualized postoperative review or follow-up scheme according to the predicted probability of recurrence: if the predicted probability of postoperative late recurrence is high, the interval between reviews or follow-ups is shortened and the treatment scheme is adjusted as the disease changes; if the predicted probability is low, the interval can be appropriately lengthened to reduce the patient's economic and psychological burden.
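As a minimal illustration of the AUC quoted above, the following sketch computes the area under the ROC curve as the fraction of (late-recurrence, non-recurrence) patient pairs in which the late-recurrence patient has the higher risk prediction value, counting ties as one half. The function name `roc_auc` and the example scores are hypothetical and are not data from the invention.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: risk prediction values; labels: 1 = late recurrence, 0 = no recurrence.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # correctly ranked pair
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores, for illustration only.
example_auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

An AUC of 1.0 means every late-recurrence patient scores above every non-recurrence patient; 0.5 corresponds to chance-level ranking.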

Drawings

FIG. 1 shows the heat maps and volcano plots of the genes differentially expressed between the LR group and the NR group in the TCGA-LIHC samples and the GSE76427 samples, together with the enrichment analysis of the screened differentially expressed genes; wherein A is the heat map of differentially expressed genes in the TCGA-LIHC samples; B is the volcano plot of differentially expressed genes in the TCGA-LIHC samples; C is the heat map of differentially expressed genes in the GSE76427 samples; D is the volcano plot of differentially expressed genes in the GSE76427 samples; E is the enrichment analysis result for the screened differentially expressed genes;

FIG. 2 is a Kaplan-Meier survival analysis result chart of HCC late stage recurrence high-risk group and HCC late stage recurrence low-risk group in TCGA-LIHC sample and GSE76427 sample; wherein Low represents the group with Low risk of HCC late recurrence; high indicates the group with High risk of HCC late recurrence;

FIG. 3 shows the results of the LASSO logistic regression analysis, wherein A shows the selection of the optimal number of genes for the LASSO model; B shows the ten-fold cross-validation used to select the tuning parameter of the LASSO model;

FIG. 4 is a graph showing statistics of time to relapse, risk score and expression level of 6 genes in each patient in TCGA-LIHC samples and GSE76427 samples; wherein A is TCGA-LIHC sample; b is GSE76427 sample;

FIG. 5 is a Kaplan-Meier survival analysis chart of risk score and relapse-free survival of HCC late-stage recurrence high-risk group and HCC late-stage recurrence low-risk group in TCGA-LIHC sample and GSE76427 sample; wherein Low represents the group with Low risk of HCC late recurrence; high indicates the group with High risk of HCC late recurrence; a is TCGA-LIHC sample; b is GSE76427 sample;

FIG. 6 shows the ROC curves of the risk scoring model for distinguishing patients with late HCC recurrence from patients without HCC recurrence in the TCGA-LIHC samples and the GSE76427 samples; wherein A is the TCGA-LIHC samples; B is the GSE76427 samples;

FIG. 7 is a graph showing the expression levels of 6 different genes in a group with high risk of HCC late stage recurrence and a group with low risk of HCC late stage recurrence in 103 clinical specimens; wherein Low represents the group with Low risk of HCC late recurrence; high indicates the group with High risk of HCC late recurrence;

FIG. 8 is a graph showing the statistics of the time to relapse, risk score and expression level of 6 genes in each patient in 103 clinical specimens;

FIG. 9 shows the Kaplan-Meier survival analysis of risk score versus recurrence-free survival, the ROC curve and the recurrence rate statistics for the HCC late-recurrence high-risk and low-risk groups in 103 clinical samples; wherein A is the Kaplan-Meier survival analysis chart, B is the ROC curve and C is the recurrence rate statistics chart; Low represents the group with low risk of HCC late recurrence; High represents the group with high risk of HCC late recurrence.

Detailed Description

The following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments according to the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, elements, and/or combinations thereof.

Experimental methods in the following examples for which no specific conditions are given follow conventional techniques in the art or the conditions suggested by the manufacturers; reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.

In order to make the technical solutions of the present invention more clearly understood by those skilled in the art, the technical solutions of the present invention will be described in detail below with reference to specific embodiments.

Example 1: screening of genes significantly related to HCC late recurrence and construction of risk scoring model for predicting risk of HCC late recurrence

1. Selection, data acquisition and arrangement of experimental samples

Patients were screened from the TCGA database (TCGA, https://portal.gdc.cancer.gov/) and the GEO database (GEO, https://www.ncbi.nlm.nih.gov/geo/; GEO ID: GSE76427) according to the following inclusion criteria: (1) primary hepatocellular carcinoma; (2) AJCC stage I-III; (3) recurrence data available.

After screening, 73 patients in the TCGA database met the criteria (denoted the TCGA-LIHC samples): 26 patients with late recurrence (denoted the LR group) and 47 patients without recurrence (denoted the NR group). Transcriptome data and clinical data for the 73 patients were obtained from the TCGA database, and the RNA-seq data of the 73 samples were FPKM-normalized and converted to log2 FPKM. Clinical data for the non-recurrence and late-recurrence patients among the 73 samples are summarized in Table 1. In the GEO database, 21 patients met the criteria (denoted the GSE76427 samples): 9 patients with late recurrence (LR) and 12 patients with no recurrence (NR). Microarray expression profiles and clinical data for the 21 patient samples were obtained from the GEO database, and the raw microarray data of the 21 samples were normalized using the lumi R package.
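The log2 conversion of FPKM values mentioned above can be sketched as follows. The text does not say how zero FPKM values were handled, so the pseudocount of 1 used here is an assumption, and the function name `log2_fpkm` is hypothetical.

```python
import math


def log2_fpkm(fpkm_values, pseudocount=1.0):
    """Convert FPKM values to a log2 scale.

    The pseudocount (an assumption, not stated in the source) keeps
    log2 defined for genes with zero FPKM.
    """
    return [math.log2(value + pseudocount) for value in fpkm_values]


# Hypothetical FPKM values for one gene across four samples.
example_log2 = log2_fpkm([0.0, 1.0, 3.0, 7.0])
```

With a different pseudocount, or none at all, the transformed values would differ, so the same convention should be applied to every sample being compared.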

TABLE 1 clinical data information for non-relapsing patients and patients with advanced relapse in TCGA-LIHC samples

2. Screening and validation of genes significantly associated with HCC late-stage recurrence

To identify genes significantly associated with late recurrence of HCC, the transcriptome data of the TCGA-LIHC samples were analyzed with the DESeq2 package to screen for genes differentially expressed between the LR group and the NR group, and the microarray data of the GSE76427 samples were analyzed with the limma R package to screen for genes differentially expressed between the LR group and the NR group. The screening criterion for differentially expressed genes was an adjusted P value of less than 0.05.
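The thresholding step can be sketched as below. The source states only that a corrected P value below 0.05 was required; the Benjamini-Hochberg procedure used here is an assumption about the correction method, and the statistical models that DESeq2 and limma use to produce the raw P values are not reproduced. All function names, gene labels and P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the
    # adjusted values stay monotone in rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_significant(gene_to_p, alpha=0.05):
    """Keep genes whose adjusted P value is below alpha."""
    genes = list(gene_to_p)
    adjusted = benjamini_hochberg([gene_to_p[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q < alpha]


# Hypothetical raw P values, for illustration only.
raw_p = {"geneA": 0.01, "geneB": 0.04, "geneC": 0.03, "geneD": 0.5}
significant = select_significant(raw_p)
```

In this toy input, geneB and geneC have raw P values below 0.05 but adjusted values above it, which is why filtering on the adjusted value is stricter than filtering on the raw value.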

The heat map and volcano plot of the genes differentially expressed between the LR group and the NR group in the TCGA-LIHC samples, as screened with the DESeq2 package, are shown in A and B of FIG. 1, respectively. In total, 1890 differentially expressed genes were identified in the TCGA-LIHC samples, of which 1236 were up-regulated and 654 were down-regulated in the LR group. The heat map and volcano plot of the genes differentially expressed between the LR group and the NR group in the GSE76427 samples, as screened with the limma R package, are shown in C and D of FIG. 1, respectively. In total, 1070 differentially expressed genes were identified in the GSE76427 samples, of which 506 were up-regulated and 564 were down-regulated in the LR group.

Intersecting the up-regulated genes, and separately the down-regulated genes, screened from the TCGA-LIHC samples and the GSE76427 samples gave 14 overlapping up-regulated genes and 16 overlapping down-regulated genes. The biological functions of the 30 overlapping genes were further explored by functional enrichment analysis using KEGG. The KEGG analysis results are shown in E of FIG. 1. KEGG analysis suggests that the 30 overlapping genes are primarily involved in cancer-related and metabolism-related pathways, including the Ras signaling pathway, the MAPK signaling pathway, the alpha-linolenic acid pathway, and vascular smooth muscle contraction.
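The intersection step described above amounts to two set intersections, one per direction of regulation. A minimal sketch, with a hypothetical function name and placeholder gene labels rather than the actual gene lists:

```python
def overlapping_genes(up_a, down_a, up_b, down_b):
    """Intersect up- and down-regulated gene lists from two datasets.

    Genes must change in the same direction in both datasets to be kept.
    """
    overlap_up = sorted(set(up_a) & set(up_b))
    overlap_down = sorted(set(down_a) & set(down_b))
    return overlap_up, overlap_down


# Placeholder gene labels, for illustration only.
up, down = overlapping_genes(
    up_a=["U1", "U2", "U3"], down_a=["D1", "D2"],
    up_b=["U2", "U3", "U4"], down_b=["D2", "U1"],
)
```

Intersecting per direction means a gene that is up-regulated in one dataset and down-regulated in the other (here `U1`) is excluded.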

In the TCGA-LIHC samples, the 30 overlapping genes were further screened by univariate Cox regression analysis and Kaplan-Meier analysis, which identified 6 overlapping genes significantly associated with late HCC recurrence: the ANGPT4 gene, AMFR gene, COLEC12 gene, FAM78B gene, LMTK3 gene and TRABD2A gene. Based on the expression level of the 6 overlapping genes in each patient in the TCGA-LIHC samples, the survminer package was used to determine, for each of the ANGPT4, AMFR, COLEC12, FAM78B, LMTK3 and TRABD2A genes, the optimal cutoff value for distinguishing the LR group from the NR group. Patients in the TCGA-LIHC samples were divided into an HCC late-recurrence high-risk group and an HCC late-recurrence low-risk group according to the optimal cutoff value of each gene, and the relationship between the expression levels of the 6 overlapping genes and recurrence-free survival (RFS) in the two groups was analyzed by the Kaplan-Meier method and the log-rank test; the results are shown in A-F of FIG. 2. As can be seen from A-F of FIG. 2, recurrence-free survival differs significantly between the high-risk and low-risk groups, with patients in the high-risk group having shorter recurrence-free survival and a poorer prognosis. This indicates that the expression values of the ANGPT4, AMFR, COLEC12, FAM78B, LMTK3 and TRABD2A genes are significantly associated with late recurrence of liver cancer.

In the GSE76427 samples, based on the expression level of the 6 overlapping genes in each sample, the survminer package was used to determine, for each of the ANGPT4, AMFR, COLEC12, FAM78B, LMTK3 and TRABD2A genes, the optimal cutoff value for distinguishing the LR group from the NR group. Patients in the GSE76427 samples were divided into an HCC late-recurrence high-risk group and an HCC late-recurrence low-risk group according to the optimal cutoff value of each gene, and the relationship between the expression levels of the 6 overlapping genes and recurrence-free survival (RFS) in the two groups was analyzed by the Kaplan-Meier method and the log-rank test; the results are shown in G-L of FIG. 2. As can be seen from G-L of FIG. 2, recurrence-free survival differs significantly between the high-risk and low-risk groups, with patients in the high-risk group having shorter recurrence-free survival and a poorer prognosis. This indicates that the expression values of the 6 genes are significantly associated with late recurrence of liver cancer.
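The grouping and survival-curve steps described above can be sketched as follows: patients are split by a given expression cutoff, and a Kaplan-Meier recurrence-free survival curve is estimated for a group. This is only an illustration under assumptions: the cutoff is supplied by the caller rather than derived (the optimal-cutoff search is not reproduced), the log-rank test is omitted, and all names and numbers are hypothetical.

```python
def split_by_cutoff(expression, cutoff):
    """Return (high, low) lists of patient indices for one gene."""
    high = [i for i, x in enumerate(expression) if x > cutoff]
    low = [i for i, x in enumerate(expression) if x <= cutoff]
    return high, low


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival probability)].

    times: follow-up time per patient; events: 1 = recurrence observed,
    0 = censored. One point is emitted per distinct recurrence time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        recurrences = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            recurrences += data[i][1]
            leaving += 1
            i += 1
        if recurrences > 0:
            survival *= 1.0 - recurrences / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data, for illustration only.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Running `kaplan_meier` separately on the high and low groups returned by `split_by_cutoff` gives the two curves that a log-rank test would then compare.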

3. Constructing a risk scoring model for predicting the late-stage recurrence risk of liver cancer

In the TCGA-LIHC samples, based on the 6 genes screened out above as significantly associated with late recurrence of HCC (the ANGPT4, AMFR, COLEC12, FAM78B, LMTK3 and TRABD2A genes), LASSO (least absolute shrinkage and selection operator) logistic regression analysis was performed with the glmnet software package to establish a risk prediction model for late recurrence of HCC. Using ten-fold cross-validation, the optimal λ was taken as the value at which the partial likelihood deviance reached its minimum (λ = 0.012). Genes with non-zero coefficients at the optimal λ were selected to build the prediction model. The risk score of each patient was calculated by weighting with the LASSO model coefficients as follows:

wherein n is the number of key genes, coef(i) is the LASSO coefficient of gene i, and exp(i) is the expression level of gene i.
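The model-building step can be illustrated with a minimal sketch of L1-penalised (LASSO) logistic regression and cross-validated selection of λ. It is not the glmnet implementation: glmnet fits by coordinate descent, while this sketch swaps in proximal gradient descent for brevity, and it uses the binomial deviance as the cross-validation criterion, which is assumed here to be the logistic counterpart of the deviance mentioned above. All function names and the parameters `lr`, `n_iter` and `seed` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def _sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def _soft_threshold(v: float, t: float) -> float:
    # Proximal operator of the L1 penalty: shrink towards zero by t.
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_lasso_logistic(X: Sequence[Sequence[float]], y: Sequence[int], lam: float,
                       lr: float = 0.1, n_iter: int = 500) -> Tuple[float, List[float]]:
    """Return (intercept, coefficients); the intercept is not penalised."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [_sigmoid(b0 + sum(b * x for b, x in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        b0 -= lr * sum(resid) / n
        for j in range(p):
            grad_j = sum(r * row[j] for r, row in zip(resid, X)) / n
            beta[j] = _soft_threshold(beta[j] - lr * grad_j, lr * lam)
    return b0, beta


def deviance(X, y, b0: float, beta: Sequence[float]) -> float:
    """Binomial deviance, i.e. -2 times the log-likelihood."""
    eps = 1e-12
    total = 0.0
    for row, yi in zip(X, y):
        p = _sigmoid(b0 + sum(b * x for b, x in zip(beta, row)))
        total += yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return -2.0 * total


def cv_select_lambda(X, y, lambdas: Sequence[float], k: int = 10, seed: int = 0) -> float:
    """Return the lambda with the lowest summed held-out deviance over k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for lam in lambdas:
        total = 0.0
        for fold in folds:
            held = set(fold)
            X_train = [X[i] for i in idx if i not in held]
            y_train = [y[i] for i in idx if i not in held]
            b0, beta = fit_lasso_logistic(X_train, y_train, lam)
            total += deviance([X[i] for i in fold], [y[i] for i in fold], b0, beta)
        scores.append(total)
    return lambdas[scores.index(min(scores))]
```

Genes whose coefficient is exactly zero at the selected λ would be dropped from the model; those with non-zero coefficients are kept, mirroring the selection rule stated above.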

The results of the LASSO logistic regression analysis are shown in A and B of FIG. 3. The LASSO logistic regression analysis yielded the following risk scoring model for postoperative late recurrence in HCC patients: Risk score = 0.356 + 0.06 × ANGPT4 + 0.110 × FAM78B + 0.046 × COLEC12 + 0.063 × TRABD2A + 0.049 × AMFR + 0.004 × LMTK3 (formula I).

In formula I, ANGPT4 represents the expression level of the ANGPT4 gene in the test sample; FAM78B represents the expression level of the FAM78B gene in the test sample; COLEC12 represents the expression level of the COLEC12 gene in the test sample; TRABD2A represents the expression level of the TRABD2A gene in the test sample; AMFR represents the expression level of the AMFR gene in the test sample; and LMTK3 represents the expression level of the LMTK3 gene in the test sample.
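As a worked illustration, formula I can be evaluated with a few lines of Python. The intercept and coefficients are copied from formula I; the function names `risk_score` and `risk_group` and the `cutoff` argument are hypothetical, since the numerical cutoff value is not given in the text.

```python
# Intercept and per-gene coefficients exactly as written in formula I.
FORMULA_I_INTERCEPT = 0.356
FORMULA_I_COEFS = {
    "ANGPT4": 0.06,
    "FAM78B": 0.110,
    "COLEC12": 0.046,
    "TRABD2A": 0.063,
    "AMFR": 0.049,
    "LMTK3": 0.004,
}


def risk_score(expression: dict) -> float:
    """Risk score of one sample from a {gene: expression level} mapping."""
    missing = set(FORMULA_I_COEFS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return FORMULA_I_INTERCEPT + sum(
        coef * expression[gene] for gene, coef in FORMULA_I_COEFS.items()
    )


def risk_group(score: float, cutoff: float) -> str:
    """Assign a group; `cutoff` is a placeholder for the data-derived optimal cutoff."""
    return "high-risk" if score > cutoff else "low-risk"
```

With every expression level equal to zero the score reduces to the intercept, 0.356. Each unit increase in FAM78B, the gene with the largest coefficient, adds 0.110 to the score.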

Example 2: Evaluation and validation of the predictive ability of the risk scoring model for predicting the risk of late recurrence of liver cancer

Using the TCGA-LIHC samples included in Example 1 as the training set and the GSE76427 samples as the validation set, the relationship between the risk scoring model of the invention and the prognosis of late recurrence of HCC was evaluated and validated.

1. Evaluation of the predictive ability of the risk scoring model for predicting the risk of late recurrence of liver cancer

In the training set, the risk score of each sample was calculated according to the risk scoring model constructed in Example 1, and the survminer package was used to determine the optimal cutoff value of the risk score for distinguishing the LR group from the NR group. The 73 patients in the training set were ranked by risk score; the result is shown in A of FIG. 4, which shows, from top to bottom, the recurrence time, the risk score and the expression levels of the 6 genes of the 73 patients. The samples in the training set were then divided into an HCC late-recurrence high-risk group and an HCC late-recurrence low-risk group according to the optimal cutoff value and the risk score of each sample, and the relationship between risk group and recurrence-free survival (RFS) was analyzed by the Kaplan-Meier method and the Log-rank test. The result is shown in A of FIG. 5. As can be seen from A of FIG. 5, recurrence-free survival differs significantly between the two groups (P = 0.003): patients in the high-risk group have shorter recurrence-free survival and a poorer prognosis.
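The Kaplan-Meier estimate used for the survival curves can be sketched as follows. This is a minimal product-limit estimator, not the code used to produce FIG. 5; the function names `kaplan_meier` and `median_rfs` are hypothetical, and an event value of 1 is assumed to mean an observed recurrence while 0 means a censored follow-up.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(time, recurrence-free probability)] at each distinct event time."""
    surv = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        recurred = sum(1 for ti, e in zip(times, events) if ti == t and e)
        # Product-limit step: multiply by the fraction still recurrence-free.
        surv *= 1 - recurred / at_risk
        curve.append((t, surv))
    return curve


def median_rfs(curve: Sequence[Tuple[float, float]]) -> Optional[float]:
    """First time at which the curve drops to 0.5 or below, or None if it never does."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

Computing one curve per risk group and comparing their median RFS gives a numerical summary of the separation that the Log-rank test evaluates.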

To further evaluate the effectiveness of the risk scoring model constructed in Example 1, an ROC curve for distinguishing HCC patients with late recurrence from HCC patients without recurrence was plotted from the risk score of each sample in the training set; the result is shown in A of FIG. 6. As can be seen from A of FIG. 6, the ROC curve of the risk scoring model has an AUC of 0.73, a sensitivity of 76.9% and a specificity of 61.6%, indicating that the risk scoring model constructed by the invention for predicting the risk of late recurrence of liver cancer has high accuracy.
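The ROC quantities reported above can be computed as in the following sketch. The text does not say how the operating point for sensitivity and specificity was chosen; maximising Youden's index is an assumption made here for illustration. The function names are hypothetical, a label of 1 denotes late recurrence and 0 denotes no recurrence, and both classes are assumed to be present.

```python
from typing import Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a recurrent patient scores higher than a
    non-recurrent one, counting ties as one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores: Sequence[float], labels: Sequence[int],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff is called high-risk."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= cutoff)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < cutoff)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < cutoff)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Cutoff maximising sensitivity + specificity - 1 (assumed selection rule)."""
    best_c, best_j = None, -2.0
    for c in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, c)
        j = sens + spec - 1
        if j > best_j:
            best_c, best_j = c, j
    return best_c
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.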

2. Validation of the predictive ability of the risk scoring model for predicting the risk of late recurrence of liver cancer

In the validation set, the risk score of each sample was calculated according to the risk scoring model constructed in Example 1, and the survminer package was used to determine the optimal cutoff value of the risk score for distinguishing the LR group from the NR group. The 21 patients in the validation set were ranked by risk score; the result is shown in B of FIG. 4, which shows, from top to bottom, the recurrence time, the risk score and the expression levels of the 6 genes of the 21 patients. The samples in the validation set were then divided into an HCC late-recurrence high-risk group and an HCC late-recurrence low-risk group according to the optimal cutoff value and the risk score of each sample, and the relationship between risk group and recurrence-free survival (RFS) was analyzed by the Kaplan-Meier method and the Log-rank test. The result is shown in B of FIG. 5. As can be seen from B of FIG. 5, recurrence-free survival differs significantly between the two groups (P = 0.003): patients in the high-risk group have shorter recurrence-free survival and a poorer prognosis.

To further verify the effectiveness of the risk scoring model constructed in Example 1, an ROC curve for distinguishing HCC patients with late recurrence from HCC patients without recurrence was plotted from the risk score of each sample in the validation set; the result is shown in B of FIG. 6. As can be seen from B of FIG. 6, the ROC curve of the risk scoring model has an AUC of 0.94, a sensitivity of 88.9% and a specificity of 100%, indicating that the risk scoring model constructed by the invention for predicting the risk of late recurrence of liver cancer has high accuracy.

3. Evaluation of the risk scoring model as an independent prognostic indicator of RFS in patients with late recurrence of liver cancer:

(1) Evaluating, in the training set samples, whether the risk scoring model can serve as an independent prognostic indicator of RFS (recurrence-free survival) in patients with late recurrence of liver cancer

In the training set, univariate and multivariate Cox regression analyses were performed in R on the clinical indicators potentially related to late recurrence of liver cancer, to evaluate their prognostic value for RFS in patients with late recurrence of liver cancer. The results are shown in Tables 2 and 3.

Table 2 Univariate Cox regression analysis of RFS in patients with late recurrence of liver cancer in the training set samples

As can be seen from Table 2, in the univariate Cox regression analysis the risk score and creatinine were statistically significant and are risk factors for late recurrence of liver cancer, whereas age, sex, stage, grade and prothrombin time were not statistically significant.

Table 3 Multivariate Cox regression analysis of RFS in patients with late recurrence of liver cancer in the training set samples

As can be seen from Table 3, the multivariate Cox regression analysis further shows that the risk score is an independent risk factor for late recurrence of liver cancer.
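For readers unfamiliar with Cox regression, the univariate case can be sketched in a few lines. The analyses above were run in R; the code below is an independent minimal illustration that fits a single covariate by Newton-Raphson on the Breslow partial likelihood. The function name `cox_univariate` and its parameters are hypothetical, and the multivariate case (several covariates fitted jointly) is not covered.

```python
import math
from typing import Sequence, Tuple


def cox_univariate(times: Sequence[float], events: Sequence[int],
                   x: Sequence[float], n_iter: int = 25,
                   tol: float = 1e-9) -> Tuple[float, float, float]:
    """Fit h(t | x) = h0(t) * exp(beta * x).

    Returns (beta, hazard ratio, standard error of beta).
    An event value of 1 is an observed recurrence, 0 a censored follow-up.
    """
    beta = 0.0
    info = 0.0
    for _ in range(n_iter):
        grad, info = 0.0, 0.0
        for ti, ei, xi in zip(times, events, x):
            if not ei:
                continue
            # Risk set: everyone still under observation at time ti.
            risk = [xj for tj, xj in zip(times, x) if tj >= ti]
            w = [math.exp(beta * xj) for xj in risk]
            s0 = sum(w)
            s1 = sum(wj * xj for wj, xj in zip(w, risk))
            s2 = sum(wj * xj * xj for wj, xj in zip(w, risk))
            grad += xi - s1 / s0                   # score contribution
            info += s2 / s0 - (s1 / s0) ** 2       # information contribution
        if info <= 0:
            break
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    se = 1 / math.sqrt(info) if info > 0 else float("nan")
    return beta, math.exp(beta), se
```

A hazard ratio above 1 means that a higher value of the covariate (for example the risk score) is associated with earlier recurrence. A factor is usually called independent when it remains significant after the other clinical indicators are included in a multivariate model.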

(2) Verifying, in the validation set samples, whether the risk scoring model can serve as an independent prognostic indicator of RFS (recurrence-free survival) in patients with late recurrence of liver cancer

In the validation set, univariate and multivariate Cox regression analyses were performed in R on the clinical indicators potentially related to late recurrence of liver cancer, to evaluate their prognostic value for RFS in patients with late recurrence of liver cancer. The results are shown in Tables 4 and 5.

Table 4 Univariate Cox regression analysis of RFS in patients with late recurrence of liver cancer in the validation set samples

As can be seen from Table 4, the risk score and TNM stage are risk factors for late recurrence of liver cancer.

Table 5 Multivariate Cox regression analysis of RFS in patients with late recurrence of liver cancer in the validation set samples

As can be seen from Table 5, the risk score and TNM stage are independent risk factors for late recurrence of liver cancer.

Example 3: Validation in clinical samples of the predictive ability of the risk scoring model for predicting the risk of late recurrence of liver cancer

The expression levels of the ANGPT4, AMFR, COLEC12, FAM78B, LMTK3 and TRABD2A genes in liver cancer tissue samples from 103 clinically collected HCC patients were detected by qRT-PCR, and the predictive ability of the risk scoring model for predicting the risk of late recurrence of liver cancer was validated on the basis of the detection results.

1. Experimental samples

A total of 103 patients with AJCC stage I-III HCC were collected from the First Affiliated Hospital of Zhengzhou University. The inclusion criteria were: (1) primary hepatocellular carcinoma; (2) surgical resection; (3) available recurrence data. Of the 103 patients, 42 had late recurrence (the LR group) and 61 had no recurrence (the NR group). The clinical data of the 103 samples are shown in Table 6.

Table 6 Clinical data of the 103 clinical samples

2. Detection by qRT-PCR of the expression levels of the ANGPT4, AMFR, COLEC12, FAM78B, LMTK3 and TRABD2A genes in the 103 clinical HCC samples

Total RNA was isolated from liver cancer tissue with RNAiso Plus reagent (Takara, Dalian, China) according to the manufacturer's instructions. RNA quality was assessed with a NanoDrop One C (Waltham, MA, USA), and RNA integrity was assessed by agarose gel electrophoresis. Microgram amounts of total RNA were reverse transcribed into complementary DNA (cDNA) with an mRNA reverse transcription kit (TaKaRa BIO, Japan). qRT-PCR was then performed on the cDNA of all samples using SYBR Assay I Low ROX (Eurogentec, USA) and Green PCR Master Mix (Yeason, Shanghai, China) to detect the expression levels of the ANGPT4, AMFR, COLEC12, FAM78B, LMTK3 and TRABD2A genes. The expression value of each target gene was normalized to GAPDH and then log2-transformed for subsequent analysis. The primer sequences of the 6 genes and GAPDH are shown in Table 7.
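The normalization step can be illustrated with a short sketch. The text states only that target gene expression was normalized to GAPDH and log2-transformed; the ΔCt form used below, in which relative expression is 2 raised to the power of minus (Ct of target minus Ct of GAPDH), is an assumption about how that normalization was done, and the function name is hypothetical.

```python
import math


def log2_relative_expression(ct_target: float, ct_gapdh: float) -> float:
    """log2 of GAPDH-normalised expression under the assumed delta-Ct method."""
    delta_ct = ct_target - ct_gapdh
    relative = 2.0 ** -delta_ct   # expression relative to GAPDH
    return math.log2(relative)    # equals -delta_ct
```

Under this assumption the log2-transformed value is simply the negative ΔCt, so a gene that crosses the detection threshold five cycles later than GAPDH has a value of -5.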

Table 7 qRT-PCR amplification primers

Based on the qRT-PCR results of the 103 clinical samples, the survminer package was used to determine, for each of the ANGPT4, AMFR, COLEC12, FAM78B, LMTK3 and TRABD2A genes, the optimal cutoff value for distinguishing the LR group from the NR group, and the patients in the 103 clinical samples were divided into an HCC late-recurrence high-risk group and an HCC late-recurrence low-risk group according to the optimal cutoff value of each gene. The differences in gene expression between the two groups are shown in FIG. 7. As can be seen from FIG. 7, the expression levels of all 6 genes are significantly higher in the high-risk group than in the low-risk group, and the differences are statistically significant.

3. Validation in the 103 clinical samples of the predictive ability of the risk scoring model for predicting the risk of late recurrence of liver cancer

In the 103 clinical samples, the risk score of each sample was calculated according to the risk scoring model from the expression levels of the 6 genes (the ANGPT4, AMFR, COLEC12, FAM78B, LMTK3 and TRABD2A genes) detected by qRT-PCR, and the survminer package was used to determine the optimal cutoff value of the risk score for distinguishing the LR group from the NR group. The 103 clinical patients were ranked by risk score; the result is shown in FIG. 8, which shows, from top to bottom, the risk score, the recurrence time and the expression levels of the 6 genes of the 103 patients. The 103 clinical samples were then divided into an HCC late-recurrence high-risk group and an HCC late-recurrence low-risk group according to the optimal cutoff value and the risk score of each sample, and the relationship between risk group and recurrence-free survival (RFS) was analyzed by the Kaplan-Meier method and the Log-rank test. The result is shown in A of FIG. 9. As can be seen from A of FIG. 9, recurrence-free survival differs significantly between the two groups (P < 0.001): patients in the high-risk group have shorter recurrence-free survival and a poorer prognosis.

To further verify the effectiveness of the risk scoring model constructed in Example 1, an ROC curve for distinguishing HCC patients with late recurrence from HCC patients without recurrence was plotted from the risk score of each of the 103 clinical samples; the result is shown in B of FIG. 9. As can be seen from B of FIG. 9, the ROC curve of the risk scoring model has an AUC of 0.851, a sensitivity of 81% and a specificity of 82%. The risk scoring model constructed by the invention for predicting the risk of late recurrence of liver cancer therefore has high accuracy.

In addition, the number of patients who actually had late recurrence was counted in the HCC late-recurrence high-risk group and in the HCC late-recurrence low-risk group, and the late recurrence rate of each group was calculated from these counts; the results are shown in C of FIG. 9. As can be seen from C of FIG. 9, the actual late recurrence rate was 69% in the high-risk group and 13% in the low-risk group. The rate in the high-risk group is significantly higher than that in the low-risk group, and the difference is statistically significant (P < 0.001).
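A comparison of recurrence rates between two groups can be sketched as below. The text does not state which statistical test produced the P value, nor the patient counts in each group, so the two-sided Fisher exact test used here is an assumption and the function names are hypothetical; real group counts would have to be supplied by the user.

```python
from math import comb


def recurrence_rate(n_recurred: int, n_total: int) -> float:
    """Fraction of patients in a group who actually had late recurrence."""
    return n_recurred / n_total


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are the high-risk and low-risk groups; columns are recurred / not recurred.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k: int) -> float:
        # Hypergeometric probability of k recurrences in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
```

The exact test makes no large-sample approximation, which is one reason it is often preferred when some cells of the table are small.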

Univariate and multivariate Cox regression analyses were performed in R on the clinical indicators potentially related to late recurrence of liver cancer in the 103 clinical samples, to evaluate their prognostic value for RFS in patients with late recurrence of liver cancer. The results of the univariate and multivariate Cox regression analyses are shown in Tables 8 and 9, respectively.

Table 8 Univariate Cox regression analysis of RFS in patients with late recurrence of liver cancer in the 103 clinical samples

Table 9 Multivariate Cox regression analysis of RFS in patients with late recurrence of liver cancer in the 103 clinical samples

As can be seen from Tables 8 and 9, the risk score, male sex, cirrhosis and microvascular invasion are independent risk factors for late recurrence in liver cancer patients, consistent with the expected results of the experiment. The risk scoring model can therefore serve as an independent prognostic indicator of RFS in liver cancer patients.

In summary, the experimental results in the 103 clinical samples are basically consistent with those in the training set and the validation set, which indicates that the risk scoring model constructed by the invention for predicting the risk of late recurrence of HCC can serve as an independent prognostic indicator of postoperative RFS in liver cancer patients, and that its predictions are highly accurate.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Sequence listing

<110> First Affiliated Hospital of Zhengzhou University

<120> biomarker and detection kit for predicting postoperative late stage recurrence risk of liver cancer patient

<160> 12

<170> SIPOSequenceListing 1.0

<210> 1

<211> 24

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 1

tccttaaaga cacctaagcc agtg 24

<210> 2

<211> 24

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 2

ggtcctctgg aaatttacgc ttcc 24

<210> 3

<211> 22

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 3

cctgcactgc tctagctact tc 22

<210> 4

<211> 24

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 4

gatcccaatt tcaactgtga gatc 24

<210> 5

<211> 20

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 5

gaagctagta gactccaagc 20

<210> 6

<211> 20

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 6

ctctcctttc tgtcccttgt 20

<210> 7

<211> 22

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 7

ttctttggca caatccatgt cc 22

<210> 8

<211> 20

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 8

gggagcacat cttggaggtt 20

<210> 9

<211> 23

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 9

gaccaggaag agggagaaac ttc 23

<210> 10

<211> 18

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 10

cctccaggcg aggactga 18

<210> 11

<211> 20

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 11

aatgtctgcg taaccgcacg 20

<210> 12

<211> 18

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 12

ggcgaatcca tcggggtg 18
