Commodity encoding method, electronic device and computer-readable storage medium

Document No.: 379712 | Publication date: 2021-12-10

Reading note: This technology, "Commodity encoding method, electronic device and computer-readable storage medium", was designed by 何浩, 刘佳伟 and 黄小明 on 2021-09-03. Main content: The application relates to the technical field of customs commodity coding and discloses a commodity encoding method, an electronic device and a computer-readable storage medium. The method comprises: acquiring data information of a commodity to be classified; acquiring at least one piece of key information from the data information; determining a plurality of reference commodity codes for each piece of key information; and determining a target commodity code of the commodity to be classified from the plurality of reference commodity codes. In this way, customs commodity codes can be recommended automatically for commodities to be classified, which reduces user operations and improves working efficiency.

1. A method for encoding a customs good, the method comprising:

acquiring data information of commodities to be classified;

acquiring at least one piece of key information from the data information;

determining a plurality of reference commodity codes for each key information;

and determining a target commodity code of the commodity to be classified from a plurality of reference commodity codes.

2. The method of claim 1,

the determining a plurality of reference commodity codes for each key information comprises:

comparing each key information with each key information of all customs commodity codes to obtain a comparison result;

and determining a plurality of reference commodity codes according to the comparison result.

3. The method of claim 2,

the determining a plurality of the reference commodity codes according to the comparison result comprises:

acquiring the comparison result exceeding a preset value;

and determining the customs commodity code corresponding to the comparison result exceeding a preset value as the reference commodity code.

4. The method of claim 1,

the step of determining the target commodity code of the commodity to be classified from a plurality of reference commodity codes comprises the following steps:

calculating the information entropy corresponding to each reference commodity code;

calculating information gain corresponding to each reference commodity code based on the information entropy;

and determining the reference commodity code with the largest information gain as the target commodity code of the commodity to be classified.

5. The method of claim 4,

the calculating of the information entropy corresponding to each reference commodity code comprises the following steps:

calculating with the following formula:

$$\mathrm{Ent}(D) = -\sum_{k=1}^{K} p_k \log_2 p_k$$

wherein $D$ represents all reference commodity codes ($|D|$ being their number), and $p_k$ represents the proportion of the k-th reference commodity code in $D$;

the calculating the information gain corresponding to each reference commodity code based on the information entropy comprises:

calculating with the following formula:

$$\mathrm{Gain}(D, a) = \mathrm{Ent}(D) - \sum_{v=1}^{V} \frac{|D^v|}{|D|}\,\mathrm{Ent}(D^v)$$

wherein Gain represents the information gain, $a$ represents a reference commodity code, $D^v$ represents the samples contained in the v-th node, namely all samples in $D$ whose value on the feature $a$ is $a^v$ ($|D^v|$ being their total number), and $|D^v|/|D|$ represents the weight of the v-th node.

6. The method of claim 1,

wherein the method further comprises:

constructing a customs commodity code recommendation model;

inputting a training sample into the customs commodity code recommendation model to obtain a first prediction code output by the customs commodity code recommendation model;

determining a classification error rate using the first prediction code and the real code in the training sample;

and correcting the customs commodity code recommendation model by using the classification error rate.

7. The method of claim 6,

the customs commodity code recommendation model comprises a plurality of branch nodes;

the correcting the customs commodity code recommendation model by using the classification error rate comprises:

calculating a first classification error rate of each branch node according to the classification error rate of each branch node and the weight of each branch node;

calculating a second classification error rate of each branch node after being pruned;

if the second classification error rate is greater than the first classification error rate, the branch node is retained.

8. The method of claim 7,

the method further comprises the following steps:

acquiring key information of the real code;

inputting the key information into the customs commodity code recommendation model to obtain a second prediction code output by the customs commodity code recommendation model;

determining a classification error rate using the second prediction code and the real code;

and correcting the customs commodity code recommendation model by using the classification error rate.

9. An electronic device, comprising a processor and a memory coupled to the processor;

wherein the memory is adapted to store a computer program and the processor is adapted to execute the computer program to implement the method according to any of claims 1-8.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store a computer program which, when being executed by a processor, is used to carry out the method according to any one of claims 1-8.

Technical Field

The present application relates to the field of customs product coding technologies, and in particular, to a product coding method, an electronic device, and a computer-readable storage medium.

Background

The classification of customs commodities is an important basis for customs taxation, supervision, anti-smuggling and statistics work, and an important guarantee of accurate and uniform law enforcement by customs. Commodity classification refers to the activity of determining the commodity code (hereinafter "commodity code" or "HS code") of imported and exported goods under the commodity classification catalogue system of the International Convention on the Harmonized Commodity Description and Coding System, on the basis of the Import and Export Tariff of the People's Republic of China (the "Tariff"), and in accordance with the explanatory notes to the commodities and headings of the Import and Export Tariff, the notes to the national subheadings of the Import and Export Tariff of the People's Republic of China, and the administrative rulings and commodity classification decisions issued by the General Administration of Customs.

Commodity classification is of great significance in import and export business: the commodity code (HS code) determines the tariff rate of the goods, the trade control requirements, the export tax rebate rate, whether tax reduction or exemption applies, and so on. The accuracy of commodity classification reflects the compliance level of an enterprise's import and export management.

At the same time, commodity classification is technically difficult. Many factors affect correct classification, and even small differences in the specification or model of a commodity can lead to different classification results. A wrong commodity code also significantly affects an enterprise's operating costs, for example by increasing tax costs, subjecting imported and exported goods to unnecessary restrictions, and reducing customs clearance efficiency.

Disclosure of Invention

The technical problem mainly solved by the application is to provide a commodity coding method, an electronic device and a computer readable storage medium, which can realize automatic customs commodity coding recommendation of commodities to be classified, reduce user operation and improve working efficiency.

In order to solve the above problem, a technical solution adopted by the present application is to provide a method for encoding a commodity, including: acquiring data information of commodities to be classified; acquiring at least one piece of key information from the data information; determining a plurality of reference commodity codes for each key information; and determining a target commodity code of the commodity to be classified from the plurality of reference commodity codes.

Wherein determining a plurality of reference commodity codes for each key information comprises: comparing each key information with each key information of all customs commodity codes to obtain a comparison result; and determining a plurality of reference commodity codes according to the comparison result.

Wherein determining a plurality of reference commodity codes according to the comparison result comprises: obtaining the comparison results that exceed a preset value; and determining the customs commodity codes corresponding to the comparison results exceeding the preset value as the reference commodity codes.

Wherein, the step of determining the target commodity code of the commodity to be classified from the plurality of reference commodity codes comprises the following steps: calculating the information entropy corresponding to each reference commodity code; calculating information gain corresponding to each reference commodity code based on the information entropy; and determining the reference commodity code with the largest information gain as the target commodity code of the commodity to be classified.

Wherein calculating the information entropy corresponding to each reference commodity code comprises calculating with the following formula:

$$\mathrm{Ent}(D) = -\sum_{k=1}^{K} p_k \log_2 p_k$$

wherein $D$ represents all reference commodity codes ($|D|$ being their number), and $p_k$ represents the proportion of the k-th reference commodity code in $D$. Calculating the information gain corresponding to each reference commodity code based on the information entropy comprises calculating with the following formula:

$$\mathrm{Gain}(D, a) = \mathrm{Ent}(D) - \sum_{v=1}^{V} \frac{|D^v|}{|D|}\,\mathrm{Ent}(D^v)$$

wherein Gain represents the information gain, $a$ represents a reference commodity code, $D^v$ represents the samples contained in the v-th node, namely all samples in $D$ whose value on the feature $a$ is $a^v$ ($|D^v|$ being their total number), and $|D^v|/|D|$ represents the weight of the v-th node.

Wherein the method further comprises: constructing a customs commodity code recommendation model; inputting a training sample into the customs commodity code recommendation model to obtain a first prediction code output by the model; determining a classification error rate using the first prediction code and the real code in the training sample; and correcting the customs commodity code recommendation model by using the classification error rate.

Wherein the customs commodity code recommendation model comprises a plurality of branch nodes, and correcting the customs commodity code recommendation model by using the classification error rate comprises: calculating a first classification error rate of each branch node according to the classification error rate of each branch node and the weight of each branch node; calculating a second classification error rate of each branch node after pruning; and retaining the branch node if the second classification error rate is greater than the first classification error rate.

Wherein the method further comprises: acquiring key information of the real code; inputting the key information into the customs commodity code recommendation model to obtain a second prediction code output by the model; determining a classification error rate using the second prediction code and the real code; and correcting the customs commodity code recommendation model by using the classification error rate.

In order to solve the above problem, another technical solution adopted by the present application is to provide an electronic device, which includes a processor and a memory coupled to the processor; wherein the memory is used for storing computer programs, and the processor is used for executing the computer programs so as to realize the method provided by the technical scheme.

In order to solve the above problem, another technical solution adopted by the present application is to provide a computer-readable storage medium for storing a computer program, which when executed by a processor is used for implementing the method provided by the above technical solution.

The beneficial effects of the present application are as follows. Unlike the prior art, the commodity encoding method of the present application acquires data information of a commodity to be classified; acquires at least one piece of key information from the data information; determines a plurality of reference commodity codes for each piece of key information; and determines a target commodity code of the commodity to be classified from the plurality of reference commodity codes. In this way, customs commodity codes can be recommended automatically for commodities to be classified, which reduces user operations and improves working efficiency.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort. Wherein:

FIG. 1 is a schematic flowchart of an embodiment of a method for encoding a commodity according to the present application;

FIG. 2 is a schematic flowchart of another embodiment of a method for encoding a commodity according to the present application;

FIG. 3 is a schematic flowchart of another embodiment of a method for encoding a commodity according to the present application;

FIG. 4 is a schematic flowchart of another embodiment of a method for encoding a commodity according to the present application;

FIG. 5 is a schematic flowchart of another embodiment of a method for encoding a commodity according to the present application;

FIG. 6 is a schematic flowchart of an embodiment of step 54 provided in the present application;

FIG. 7 is a schematic flowchart of another embodiment of a method for encoding a commodity according to the present application;

FIG. 8 is a schematic structural diagram of an embodiment of an electronic device provided in the present application;

FIG. 9 is a schematic structural diagram of an embodiment of a computer-readable storage medium provided in the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be further noted that, for the convenience of description, only some of the structures related to the present application are shown in the drawings, not all of the structures. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

In order to solve the above problems, the present application proposes the following technical solutions:

Referring to fig. 1, fig. 1 is a schematic flowchart of an embodiment of a method for encoding a commodity according to the present application. The method comprises the following steps:

Step 11: Acquire data information of the commodity to be classified.

The data information of the commodity to be classified includes information describing that commodity. Taking a liquid crystal display as an example, the data information may be: "liquid crystal display, 42 inch, with backlight, without touch function, video interface, decoder board, high frequency head board, filled liquid crystal, cut, 1.27 kg/sheet".

Step 12: Acquire at least one piece of key information from the data information.

In the liquid crystal display example above, "liquid crystal", "no touch function", "video interface", "high-frequency head board" and "42 inches" may be used as the key information.
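The application does not say how the key information is extracted from the data information. As a minimal sketch only, assuming the description is a comma-separated string and assuming a hypothetical vocabulary `KEY_TERMS` (neither is specified in the application), steps 11 and 12 could look like this:

```python
# Hypothetical vocabulary of terms relevant to classification.
# This list is an assumption for illustration, not part of the application.
KEY_TERMS = ("liquid crystal", "touch function", "video interface",
             "high frequency head board", "inch")


def extract_key_information(data_information: str) -> list[str]:
    """Split a comma-separated description and keep fields containing a key term."""
    fields = [field.strip() for field in data_information.split(",")]
    return [field for field in fields if any(term in field for term in KEY_TERMS)]


description = ("liquid crystal display, 42 inch, with backlight, "
               "without touch function, video interface, decoder board, "
               "high frequency head board, filled liquid crystal, cut, 1.27 kg/sheet")
key_information = extract_key_information(description)
```

A real system would more likely rely on word segmentation and a learned vocabulary; the substring filter here only illustrates the shape of the step.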

Step 13: a plurality of reference commodity codes are determined for each of the key information.

At this point, a plurality of corresponding reference commodity codes can be determined according to each piece of key information.

Step 14: Determine a target commodity code of the commodity to be classified from the plurality of reference commodity codes.

The target commodity code of the commodity to be classified is then selected from these reference commodity codes.

In some embodiments, the acquired data information of the to-be-classified commodities can be input into the trained customs commodity code recommendation model, so that the customs commodity code recommendation model outputs the target commodity code of the to-be-classified commodities.

In the customs goods code recommendation model, a plurality of reference goods codes are determined for each piece of key information, and then the target goods codes of the goods to be classified are determined according to the reference goods codes.

In this embodiment, data information of the commodity to be classified is acquired; at least one piece of key information is acquired from the data information; a plurality of reference commodity codes are determined for each piece of key information; and a target commodity code of the commodity to be classified is determined from the plurality of reference commodity codes. In this way, customs commodity codes can be recommended automatically for commodities to be classified, which reduces user operations and improves working efficiency.

Referring to fig. 2, fig. 2 is a schematic flowchart of another embodiment of a method for encoding a commodity according to the present application. The method comprises the following steps:

Step 21: Acquire data information of the commodity to be classified.

Step 22: Acquire at least one piece of key information from the data information.

Steps 21 to 22 are the same as or similar to the technical solutions of the above embodiments, and are not described herein.

Step 23: Compare each piece of key information with the key information of all customs commodity codes to obtain a comparison result.

In some embodiments, a weight may be set for each customs commodity code, and when step 23 is executed, the comparison result between each piece of key information and the key information of all customs commodity codes may be calculated according to the corresponding weight.

Step 24: Determine a plurality of reference commodity codes according to the comparison result.

In some embodiments, referring to fig. 3, step 24 may be a flow as follows:

Step 241: Obtain the comparison results that exceed a preset value.

Step 242: Determine the customs commodity codes corresponding to the comparison results exceeding the preset value as the reference commodity codes.
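The application does not define how the comparison result is computed. The following sketch of steps 23 and 241 to 242 assumes a Jaccard overlap between key-information sets as the comparison metric; the catalogue, the placeholder codes `CODE_A` to `CODE_C` and the preset value are all hypothetical.

```python
def comparison_result(item_keys: set[str], code_keys: set[str]) -> float:
    """Jaccard overlap between two key-information sets (an assumed metric)."""
    union = item_keys | code_keys
    if not union:
        return 0.0
    return len(item_keys & code_keys) / len(union)


def reference_codes(item_keys: set[str], catalogue: dict[str, set[str]],
                    preset_value: float) -> list[str]:
    """Steps 241-242: keep the codes whose comparison result exceeds the preset value."""
    return [code for code, keys in catalogue.items()
            if comparison_result(item_keys, keys) > preset_value]


# Hypothetical catalogue mapping placeholder codes to their key information.
catalogue = {
    "CODE_A": {"liquid crystal", "video interface", "42 inch"},
    "CODE_B": {"liquid crystal", "touch function"},
    "CODE_C": {"cotton", "woven"},
}
item = {"liquid crystal", "video interface", "42 inch", "backlight"}
candidates = reference_codes(item, catalogue, preset_value=0.15)
```

Any similarity measure that yields a comparable score could be substituted for the Jaccard overlap; only the thresholding against the preset value comes from the text.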

Step 25: Determine a target commodity code of the commodity to be classified from the plurality of reference commodity codes.

In some embodiments, referring to fig. 4, step 25 may be the following flow:

Step 251: Calculate the information entropy corresponding to each reference commodity code.

The following formula is used for calculation:

$$\mathrm{Ent}(D) = -\sum_{k=1}^{K} p_k \log_2 p_k$$

wherein $D$ represents all reference commodity codes ($|D|$ being their number), and $p_k$ represents the proportion of the k-th reference commodity code in $D$.

Step 252: Calculate the information gain corresponding to each reference commodity code based on the information entropy.

The information gain corresponding to each reference commodity code is calculated with the following formula:

$$\mathrm{Gain}(D, a) = \mathrm{Ent}(D) - \sum_{v=1}^{V} \frac{|D^v|}{|D|}\,\mathrm{Ent}(D^v)$$

wherein Gain represents the information gain, $a$ represents key information, $D^v$ represents the samples contained in the v-th node, namely all samples in $D$ whose value on the feature $a$ is $a^v$ ($|D^v|$ being their total number), and $|D^v|/|D|$ represents the weight of the v-th node.

Step 253: Determine the reference commodity code with the largest information gain as the target commodity code of the commodity to be classified.

In this embodiment, customs commodity codes can thus be recommended automatically for commodities to be classified, which reduces user operations and improves working efficiency.
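Steps 251 to 253 can be sketched with the standard entropy and information-gain definitions. The application does not spell out how samples and labels are laid out, so the toy table below (presence flags for two hypothetical key information items `A` and `B`, with binary labels) is an assumption made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Ent(D) = -sum_k p_k * log2(p_k) over the label proportions in D."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Gain(D, a) = Ent(D) - sum_v |D^v|/|D| * Ent(D^v) for feature a."""
    partitions = defaultdict(list)
    for row, label in zip(rows, labels):
        partitions[row[feature]].append(label)
    weighted = sum(len(part) / len(labels) * entropy(part)
                   for part in partitions.values())
    return entropy(labels) - weighted


# Hypothetical toy data: each row flags whether a key information item is present.
rows = [{"A": 1, "B": 0}, {"A": 1, "B": 1}, {"A": 0, "B": 0}, {"A": 0, "B": 1}]
labels = [1, 1, 0, 0]

gains = {feature: information_gain(rows, labels, feature) for feature in ("A", "B")}
best = max(gains, key=gains.get)  # step 253: pick the largest information gain
```

Here `A` separates the labels perfectly while `B` does not separate them at all, so the selection falls on `A`.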

Referring to fig. 5, fig. 5 is a schematic flowchart of another embodiment of a method for encoding a commodity according to the present application.

Step 51: Construct a customs commodity code recommendation model.

Step 52: Input training samples into the customs commodity code recommendation model to obtain a first prediction code output by the model.

Step 53: Determine the classification error rate using the first prediction code and the real code in the training samples.

Step 54: Correct the customs commodity code recommendation model by using the classification error rate.
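The application does not give a formula for the classification error rate of step 53. A minimal sketch, assuming it is simply the share of training samples whose predicted code differs from the real code (the function name and the placeholder codes are hypothetical):

```python
def classification_error_rate(predicted_codes, real_codes):
    """Share of samples whose predicted code differs from the real code."""
    if len(predicted_codes) != len(real_codes):
        raise ValueError("predicted and real code lists must have the same length")
    if not real_codes:
        return 0.0
    wrong = sum(p != r for p, r in zip(predicted_codes, real_codes))
    return wrong / len(real_codes)


# One of four placeholder predictions is wrong.
rate = classification_error_rate(["X", "Y", "Z", "X"], ["X", "Y", "X", "X"])
```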

The description is made in conjunction with the following table 1:

The key to decision tree learning is how to select the optimal partition attribute. The optimal partition attribute is the one that makes the samples in each partition belong to the same class as far as possible, that is, the one giving the highest "purity". To measure the purity obtained with a feature, "information entropy" is used. First, the definition of information entropy: if the proportion of samples of the k-th class in the current sample set $D$ is $p_k$ ($k = 1, 2, \ldots, K$), where $K$ is the total number of classes (for binary classification, $K = 2$), the information entropy of the sample set is:

$$\mathrm{Ent}(D) = -\sum_{k=1}^{K} p_k \log_2 p_k$$

The smaller the value of $\mathrm{Ent}(D)$, the higher the purity of $D$. Next, the concept of information gain: assume that the discrete attribute $a$ has $V$ possible values $\{a^1, a^2, \ldots, a^V\}$. If the feature $a$ is used to divide the data set $D$, $V$ branch nodes are generated, where the v-th node contains all samples in $D$ whose value on the feature $a$ is $a^v$, denoted $D^v$. The information entropy of $D^v$ can be calculated with the formula above. Because different branch nodes contain different numbers of samples, each branch node is given the weight $|D^v|/|D|$, so that branch nodes with more samples have greater influence. The "information gain" obtained by dividing the sample set $D$ by the feature $a$ is then:

$$\mathrm{Gain}(D, a) = \mathrm{Ent}(D) - \sum_{v=1}^{V} \frac{|D^v|}{|D|}\,\mathrm{Ent}(D^v)$$

In general, the greater the information gain, the greater the "purity improvement" obtained by partitioning the data set with the feature $a$. Information gain can therefore be used to select the partition attribute of a decision tree, namely the attribute with the largest information gain; the ID3 algorithm selects partition attributes by information gain.

The data set is 20 samples with binary class, the number of valid samples is 17, and the proportion of positive examples (samples with class 1) is:the proportion of counterexamples (samples of class 0) is:the information entropy of the data set D can be calculated according to the formula of the information entropy as follows:

wherein D1 corresponds to the key information A of Table 1, D2 corresponds to the key information B of Table 1, and D3 corresponds to the key information C of Table 1.

By analogy, information gain calculation is carried out on each key information,

Therefore, the commodity code 9013803090 has the largest information gain and is recommended as the optimal commodity code.

In other embodiments, the customs goods code recommendation model comprises a plurality of branch nodes; referring to fig. 6, step 54 may be the following process:

Step 541: Calculate the first classification error rate of each branch node according to the classification error rate of each branch node and the weight of each branch node.

Step 542: Calculate the second classification error rate of each branch node after pruning.

Step 543: If the second classification error rate is greater than the first classification error rate, retain the branch node.
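The decision in steps 541 to 543 can be sketched as follows, assuming the first classification error rate is the weight-averaged error of a node's branches and that the weights sum to 1 (the application does not state the exact formula; the function names are hypothetical):

```python
def first_error_rate(branches):
    """Step 541: weighted sum of branch error rates; weights are assumed to sum to 1."""
    return sum(weight * error for weight, error in branches)


def keep_branch_node(branches, second_error_rate):
    """Step 543: retain the node if pruning it would raise the error rate."""
    return second_error_rate > first_error_rate(branches)


# (weight, classification error rate) for each branch under one node
branches = [(0.5, 0.10), (0.5, 0.30)]
keep_when_pruning_hurts = keep_branch_node(branches, second_error_rate=0.25)
keep_when_pruning_helps = keep_branch_node(branches, second_error_rate=0.15)
```

With these numbers the unpruned node has an expected error of about 0.2, so it is kept when pruning would give 0.25 and pruned when pruning would give 0.15.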

Pruning the branch nodes in this way reduces the amount of computation required by the customs commodity code recommendation model, which speeds up its operation and improves the efficiency of recommending customs commodity codes for commodities to be classified.

In some embodiments, after the data of customs commodities enters the system, it is processed by natural language recognition, intelligent extraction and intelligent word segmentation. The selections made by customs officers provide positive and negative feedback to the system, which completes the machine learning training, builds a supply chain industry model, and classifies the data into different categories.

In the recommendation system, each time a customs officer makes a selection, the weight of each piece of key information corresponding to the data currently being operated on is recalculated. When the same key information is repeatedly selected for the same commodity code, its weight is increased from the base weight, so that the key fields of the system are weighted. The system recommends the N pieces of data with the highest matching degree, and the customs officer selects among them, gradually forming a data model of the supply chain industry.
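The feedback loop described above can be sketched as a weight table that grows with each selection. The class name `SelectionWeights`, the base weight, the fixed increment and the summed-weight score are all assumptions; the application only states that repeated selections increase the weight from a base value and that the N best matches are recommended.

```python
from collections import defaultdict


class SelectionWeights:
    """Tracks one weight per (key information, commodity code) pair."""

    def __init__(self, base_weight=1.0, increment=1.0):
        self.increment = increment
        self.weights = defaultdict(lambda: base_weight)

    def record_selection(self, key_information, selected_code):
        """Raise the weight of each key information item for the code the officer chose."""
        for key in key_information:
            self.weights[(key, selected_code)] += self.increment

    def recommend(self, key_information, candidate_codes, n):
        """Return the n candidate codes with the highest summed weight."""
        def score(code):
            return sum(self.weights[(key, code)] for key in key_information)
        return sorted(candidate_codes, key=score, reverse=True)[:n]


weights = SelectionWeights()
weights.record_selection(["liquid crystal"], "CODE_B")
top_two = weights.recommend(["liquid crystal"], ["CODE_A", "CODE_B", "CODE_C"], n=2)
```

After one selection of the placeholder code `CODE_B`, it moves to the front of the recommendations for the same key information.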

During the selection process, a decision tree algorithm is added, and after the decision tree has fully grown, redundant branches are pruned. The expected classification error rate of a node when it is not pruned is calculated from the classification error rate and the weight of each of its branches. For each non-leaf node, the classification error rate after pruning is calculated; if pruning increases the classification error rate, pruning is abandoned; otherwise, the node is forced to become a leaf node and is labeled with a category. After a series of pruned candidate decision trees has been generated, the classification accuracy of each candidate is evaluated with test data (data that did not participate in modeling), and the decision tree with the lowest classification error rate is retained.
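The final selection among pruned candidates can be sketched as below. Each candidate tree is represented here by a plain prediction function, and the candidate names, feature layout and placeholder codes are hypothetical; only the rule "keep the tree with the lowest error rate on held-out test data" comes from the text.

```python
def select_best_tree(candidate_trees, test_samples):
    """Return the name of the candidate with the lowest error rate on test data."""
    def error_rate(predict):
        wrong = sum(predict(features) != real_code
                    for features, real_code in test_samples)
        return wrong / len(test_samples)
    return min(candidate_trees, key=lambda name: error_rate(candidate_trees[name]))


# Hypothetical candidates: a full tree and a tree pruned down to a single leaf.
candidate_trees = {
    "full": lambda f: "CODE_A" if f["touch"] else "CODE_B",
    "pruned": lambda f: "CODE_A",
}
# Held-out samples that did not take part in building the trees.
test_samples = [
    ({"touch": True}, "CODE_A"),
    ({"touch": False}, "CODE_B"),
    ({"touch": False}, "CODE_B"),
]
best_tree = select_best_tree(candidate_trees, test_samples)
```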

By the method, the accuracy of the customs commodity code recommendation model for recommending the customs commodity code can be improved.

Further, referring to fig. 7, after performing step 54, the method further comprises:

Step 71: Acquire the key information of the real code.

Step 72: Input the key information into the customs commodity code recommendation model to obtain a second prediction code output by the model.

Step 73: Determine the classification error rate using the second prediction code and the real code.

Step 74: Correct the customs commodity code recommendation model by using the classification error rate.

By the scheme of the steps 71-74, after the customs commodity code recommendation model is trained by using the training samples, the customs commodity code recommendation model is verified by using the key information of the real codes, so that the accuracy of the customs commodity code recommendation model is improved.

During training, this process is iterated many times until the prediction code output by the customs commodity code recommendation model approaches the real code as closely as possible. Training then ends, and the trained model is used in the commodity encoding method.

It can be understood that, after the training of the customs commodity code recommendation model is completed, each piece of key information corresponds to reference commodity codes and weights. When a customs commodity needs to be coded, its data is input into the customs commodity code recommendation model; the model obtains the corresponding reference commodity codes and weights according to the key information in the data, and then calculates the optimal customs commodity code from these reference commodity codes and weights, which is used as the customs commodity code of the commodity.
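The inference step just described can be sketched as a lookup followed by a weighted vote. The application does not say how the weights are combined, so summing them per code is an assumption, and the `model` dictionary with placeholder codes is hypothetical.

```python
def recommend_code(key_information, model):
    """Sum the weights of each reference code over all key information; return the best."""
    totals = {}
    for key in key_information:
        for code, weight in model.get(key, {}).items():
            totals[code] = totals.get(code, 0.0) + weight
    if not totals:
        return None  # no key information matched any reference code
    return max(totals, key=totals.get)


# Hypothetical trained state: key information -> {reference code: weight}
model = {
    "liquid crystal": {"CODE_A": 0.6, "CODE_B": 0.4},
    "video interface": {"CODE_A": 0.7},
    "touch function": {"CODE_B": 0.9},
}
recommended = recommend_code(["liquid crystal", "video interface"], model)
```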

Referring to fig. 8, fig. 8 is a schematic structural diagram of an embodiment of an electronic device 80 provided in the present application, where the electronic device includes a processor 81 and a memory 82 coupled to the processor 81; wherein the memory 82 is used for storing computer programs and the processor 81 is used for executing the computer programs to realize the following methods:

acquiring data information of commodities to be classified; acquiring at least one piece of key information from the data information; determining a plurality of reference commodity codes for each key information; and determining a target commodity code of the commodity to be classified from the plurality of reference commodity codes.

It will be appreciated that the processor 81 is also configured to execute a computer program to implement the method provided by any of the above embodiments.

Referring to fig. 9, fig. 9 is a schematic structural diagram of an embodiment of a computer-readable storage medium provided in the present application. The computer-readable storage medium 90 is used for storing a computer program 91, and the computer program 91, when executed by a processor, implements the following method:

acquiring data information of commodities to be classified; acquiring at least one piece of key information from the data information; determining a plurality of reference commodity codes for each key information; and determining a target commodity code of the commodity to be classified from the plurality of reference commodity codes.

It will be appreciated that the computer program 91, when executed by a processor, is also for implementing the method provided by any of the embodiments described above.

In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the device embodiments described above are merely illustrative. The division into modules or units is only a logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed.

The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

If the integrated unit in the above embodiments is implemented in the form of a software functional unit and sold or used as a separate product, it may be stored in a storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

The above description covers only embodiments of the present application and is not intended to limit its scope. Any equivalent structure or equivalent process derived from the specification and drawings of the present application, and any direct or indirect application in other related technical fields, likewise falls within the scope of the present application.
