Vehicle data cleaning method and device and storage medium

Document No.: 907640. Published: 2021-02-26.

Reading note: this technology, "Vehicle data cleaning method and device and storage medium" (车辆数据清洗方法、装置及存储介质), was designed by 周凯, 金振东, 徐嘉赟 and 张明磊 on 2020-11-06. Main content: The invention discloses a vehicle data cleaning method, a vehicle data cleaning device and a storage medium. The method comprises the following steps. Standard vehicle data is acquired. Original vehicle data is acquired. When the original vehicle data comprises original vehicle type data, the standard vehicle types are screened according to the original vehicle type data and the vehicle type atom library to obtain the specified standard vehicle types. When the original vehicle data comprises original accessory data, the standard accessories are screened according to the original accessory data and the accessory atom library to obtain the specified standard accessories. When the original vehicle data comprises original accessory functional attribute data, the standard accessory functional attributes are screened according to the original accessory functional attribute data and the accessory functional attribute atom library to obtain the specified standard accessory functional attributes. By screening the standard vehicle data according to the original vehicle data, the invention standardizes the original vehicle data and raises the level of intelligence of data cleaning. By performing word segmentation on the original vehicle data, the invention improves the speed and accuracy of the subsequent screening.

1. A vehicle data cleansing method, characterized in that the data cleansing method comprises:

acquiring standard vehicle data;

the standard vehicle data comprises standard vehicle types, a vehicle type atom library, standard accessories, an accessory atom library, standard accessory functional attributes and an accessory functional attribute atom library;

acquiring original vehicle data;

wherein the original vehicle data comprises at least one of: original vehicle type data, original accessory data and original accessory functional attribute data;

when the original vehicle data comprises original vehicle type data, performing word segmentation processing on the original vehicle type data to obtain vehicle type atom information, and screening the standard vehicle types according to the vehicle type atom information and the vehicle type atom library to obtain a specified standard vehicle type;

when the original vehicle data comprises original accessory data, performing word segmentation processing on the original accessory data to obtain accessory atom information, and screening the standard accessories according to the accessory atom information and the accessory atom library to obtain a specified standard accessory;

and when the original vehicle data comprises original accessory functional attribute data, performing word segmentation processing on the original accessory functional attribute data to obtain accessory functional attribute atom information, and screening the standard accessory functional attributes according to the accessory functional attribute atom information and the accessory functional attribute atom library to obtain a specified standard accessory functional attribute.

2. The vehicle data cleaning method according to claim 1, wherein each standard vehicle type is in correspondence with at least one standard accessory, and each standard accessory is in correspondence with at least one standard accessory functional attribute;

the standard vehicle data further comprises original factory accessory codes, and each standard accessory has a corresponding relation with one original factory accessory code;

when the original vehicle data comprises original vehicle type data, the method further comprises obtaining the standard accessories corresponding to the specified standard vehicle type, obtaining the standard accessory functional attributes corresponding to the specified standard vehicle type, and obtaining the original factory accessory codes corresponding to the specified standard vehicle type;

when the original vehicle data comprises original accessory data, the method further comprises obtaining the standard vehicle type corresponding to the specified standard accessory, obtaining the standard accessory functional attributes corresponding to the specified standard accessory, and obtaining the original factory accessory code corresponding to the specified standard accessory;

when the original vehicle data comprises original accessory functional attribute data, the method further comprises obtaining the standard vehicle type corresponding to the specified standard accessory functional attribute, obtaining the standard accessory corresponding to the specified standard accessory functional attribute, and obtaining the original factory accessory code corresponding to the specified standard accessory functional attribute.

3. The vehicle data cleaning method according to claim 2, wherein when the original vehicle data comprises original vehicle type data, the method further comprises acquiring specified original factory vehicle type data according to the original factory accessory code corresponding to the specified standard vehicle type, and verifying the specified standard vehicle type according to the specified original factory vehicle type data;

when the original vehicle data comprises original accessory data, the method further comprises acquiring specified original factory accessory data according to the original factory accessory code corresponding to the specified standard accessory, and verifying the specified standard accessory according to the specified original factory accessory data;

and when the original vehicle data comprises original accessory functional attribute data, the method further comprises acquiring specified original factory accessory functional attribute data according to the original factory accessory code corresponding to the specified standard accessory functional attribute, and verifying the specified standard accessory functional attribute according to the specified original factory accessory functional attribute data.

4. The vehicle data cleaning method according to claim 2, wherein when there are a plurality of specified standard accessories, the method further comprises ranking the plurality of specified standard accessories;

the step of ranking the plurality of specified standard accessories comprises:

setting the score value of each of the specified standard accessories to 0;

acquiring each accessory functional attribute that corresponds to any one of the specified standard accessories, and recording it as a scored accessory functional attribute;

performing the following step for each of the scored accessory functional attributes:

calculating the attribute score of each specified standard accessory relative to the current scored accessory functional attribute, and adding 1 to the score value of the specified standard accessory having the highest attribute score relative to the current scored accessory functional attribute;

and sorting the specified standard accessories by score value from high to low.

5. The vehicle data cleaning method of claim 4, wherein when the scored accessory functional attribute is a tendency attribute, the calculation formula of the attribute score is as follows:

wherein na is a standard accessory, op is a scored accessory functional attribute, prn is the nth configuration code, S_(na, op, have) is the total number of non-deduplicated original factory vehicle data records that include both the standard accessory and the scored accessory functional attribute, and S_(na, op, prn, have) is the total number of original factory vehicle data records that simultaneously include the standard accessory, the scored accessory functional attribute and the nth configuration code;

when the scored accessory functional attribute is a non-tendency attribute, the calculation formula of the attribute score is as follows:

wherein na is a standard accessory, op is a scored accessory functional attribute, prn is the nth configuration code, S_(na, op, none) is the total number of non-deduplicated original factory vehicle data records that do not include the standard accessory and the scored accessory functional attribute, S_(na, op, prn, none) is the total number of original factory vehicle data records that do not include the standard accessory, the scored accessory functional attribute and the nth configuration code, and Max denotes the maximum.

6. The vehicle data cleaning method according to claim 5, wherein when at least two of the specified standard accessories have the same score value, a core scored accessory functional attribute is obtained from the scored accessory functional attributes, and among those specified standard accessories, the one with the higher attribute score relative to the core scored accessory functional attribute is ranked higher.

7. The vehicle data cleaning method according to claim 5, wherein the standard vehicle data further comprises vehicle type configuration scores, one for each of the standard vehicle types;

the vehicle type configuration score is calculated as follows: acquiring the original factory accessory functional attributes corresponding to the standard vehicle type, screening the original factory accessory functional attributes according to the standard accessory functional attributes corresponding to the standard vehicle type to obtain matched original factory accessory functional attributes, and calculating the ratio of the total number of matched original factory accessory functional attributes to the total number of original factory accessory functional attributes;

and when the original vehicle data comprises original vehicle type data, acquiring the vehicle type configuration score corresponding to the specified standard vehicle type.

8. The vehicle data cleaning method according to any one of claims 1 to 7, wherein the standard vehicle type includes at least one of: vehicle type name, Ministry of Industry and Information Technology (MIIT) announcement number, distribution channel sales type, vehicle body form and country.

9. A vehicle data cleaning apparatus, characterized in that the apparatus comprises:

first acquiring means for acquiring standard vehicle data;

the standard vehicle data comprises standard vehicle types, a vehicle type atom library, standard accessories, an accessory atom library, standard accessory functional attributes and an accessory functional attribute atom library;

second acquiring means for acquiring original vehicle data;

wherein the original vehicle data comprises at least one of: original vehicle type data, original accessory data and original accessory functional attribute data;

a first screening device for, when the original vehicle data comprises original vehicle type data, performing word segmentation processing on the original vehicle type data to obtain vehicle type atom information, and screening the standard vehicle types according to the vehicle type atom information and the vehicle type atom library to obtain a specified standard vehicle type;

a second screening device for, when the original vehicle data comprises original accessory data, performing word segmentation processing on the original accessory data to obtain accessory atom information, and screening the standard accessories according to the accessory atom information and the accessory atom library to obtain a specified standard accessory;

and a third screening device for, when the original vehicle data comprises original accessory functional attribute data, performing word segmentation processing on the original accessory functional attribute data to obtain accessory functional attribute atom information, and screening the standard accessory functional attributes according to the accessory functional attribute atom information and the accessory functional attribute atom library to obtain a specified standard accessory functional attribute.

10. A computer-readable storage medium having stored thereon computer instructions, which when executed by a processor, perform the steps of the method of any of claims 1-8.

Technical Field

The invention relates to the field of vehicle data matching, in particular to a vehicle data cleaning method, a vehicle data cleaning device and a storage medium.

Background

In the automobile aftermarket, the accessory data held by organizations such as accessory manufacturers, accessory distributors and accessory e-commerce platforms generally covers many different kinds of accessories: multiple brands, multiple product categories, original factory parts, high-quality imitations, kits and so on. Because vehicle types change quickly and there are many intermediate links, accessory data is miscellaneous, disordered, voluminous and of poor quality, and lacks a unified data management standard. This in turn causes problems such as difficult production management, difficult inventory management, poor information flow, difficult after-sales service and difficult sales management.

FIG. 1 is a prior-art table of a manufacturer's accessory inventory data. As shown in FIG. 1, existing accessory data is usually managed in Excel or Word files, and the vehicle types that a product fits are usually all entered in a single cell. FIG. 2 is a conventional accessory matching table. As shown in FIG. 2, accessory dealers typically convert the data into standard structured data by manual matching, which is labor-intensive and inefficient. When accessory dealers match and label data themselves, they have no unified standard data to rely on. Because of deviations in understanding vehicle type data and missing data, the accuracy of the matched data is extremely low, repeated adjustment and matching are still needed later, and the result is usable only in the short term. Meanwhile, accessory manufacturers and dealers currently lack comprehensive, standard original factory accessory codes and functional attribute data corresponding to vehicle types, so self-matched data is severely limited, which is also the main reason the data becomes harder and harder to manage. In general, a manufacturer capable of data management has to assign a professional data product manager to each category, such as spark plugs, for daily data management, which places high demands on users.

Therefore, how to improve the efficiency, accuracy and intelligence of vehicle data cleaning while reducing operating difficulty and maintenance cost has become a research focus and a technical problem that those skilled in the art urgently need to solve.

Disclosure of Invention

In view of this, embodiments of the present invention provide a vehicle data cleaning method, apparatus, and storage medium, so as to solve the problems of low efficiency, low accuracy, high operation difficulty, and high maintenance cost of the vehicle data cleaning method in the prior art.

Therefore, the embodiment of the invention provides the following technical scheme:

In a first aspect of the present invention, a vehicle data cleaning method is provided, comprising:

acquiring standard vehicle data;

the standard vehicle data comprises standard vehicle types, a vehicle type atom library, standard accessories, an accessory atom library, standard accessory functional attributes and an accessory functional attribute atom library;

acquiring original vehicle data;

wherein the original vehicle data comprises at least one of: original vehicle type data, original accessory data and original accessory functional attribute data;

when the original vehicle data comprises original vehicle type data, performing word segmentation processing on the original vehicle type data to obtain vehicle type atom information, and screening the standard vehicle types according to the vehicle type atom information and the vehicle type atom library to obtain a specified standard vehicle type;

when the original vehicle data comprises original accessory data, performing word segmentation processing on the original accessory data to obtain accessory atom information, and screening the standard accessories according to the accessory atom information and the accessory atom library to obtain a specified standard accessory;

and when the original vehicle data comprises original accessory functional attribute data, performing word segmentation processing on the original accessory functional attribute data to obtain accessory functional attribute atom information, and screening the standard accessory functional attributes according to the accessory functional attribute atom information and the accessory functional attribute atom library to obtain a specified standard accessory functional attribute.

Furthermore, each standard vehicle type has a corresponding relation with at least one standard accessory, and each standard accessory has a corresponding relation with at least one standard accessory functional attribute;

the standard vehicle data further comprises original factory accessory codes, and each standard accessory has a corresponding relation with one original factory accessory code;

when the original vehicle data comprises original vehicle type data, the method further comprises obtaining the standard accessories corresponding to the specified standard vehicle type, obtaining the standard accessory functional attributes corresponding to the specified standard vehicle type, and obtaining the original factory accessory codes corresponding to the specified standard vehicle type;

when the original vehicle data comprises original accessory data, the method further comprises obtaining the standard vehicle type corresponding to the specified standard accessory, obtaining the standard accessory functional attributes corresponding to the specified standard accessory, and obtaining the original factory accessory code corresponding to the specified standard accessory;

when the original vehicle data comprises original accessory functional attribute data, the method further comprises obtaining the standard vehicle type corresponding to the specified standard accessory functional attribute, obtaining the standard accessory corresponding to the specified standard accessory functional attribute, and obtaining the original factory accessory code corresponding to the specified standard accessory functional attribute.
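The correspondences described above can be sketched as simple lookup tables. The following is a minimal illustration only: the table names, the sample vehicle types, the attributes and the codes such as `OE-0001` are all hypothetical and do not come from the source, which does not specify how the correspondences are stored.

```python
# Hypothetical correspondence tables (all names and values are illustrative).
VEHICLE_TYPE_TO_ACCESSORIES = {
    "vehicle type A": ["front bumper skin", "engine cover"],
    "vehicle type B": ["front bumper skin"],
}
ACCESSORY_TO_ATTRIBUTES = {
    "front bumper skin": ["with radar hole"],
    "engine cover": ["aluminium"],
}
# Each standard accessory corresponds to one original factory accessory code.
ACCESSORY_TO_OE_CODE = {
    "front bumper skin": "OE-0001",
    "engine cover": "OE-0002",
}


def expand_accessory(accessory):
    """Starting from a specified standard accessory, collect the related
    standard vehicle types, functional attributes and original factory code."""
    vehicle_types = [
        vehicle_type
        for vehicle_type, accessories in VEHICLE_TYPE_TO_ACCESSORIES.items()
        if accessory in accessories
    ]
    return {
        "vehicle_types": vehicle_types,
        "attributes": ACCESSORY_TO_ATTRIBUTES.get(accessory, []),
        "oe_code": ACCESSORY_TO_OE_CODE.get(accessory),
    }


result = expand_accessory("front bumper skin")
print(result["vehicle_types"])  # ['vehicle type A', 'vehicle type B']
print(result["oe_code"])        # OE-0001
```

The same pattern would apply when starting from a specified standard vehicle type or a specified standard accessory functional attribute, with the lookup direction reversed.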

Further, when the original vehicle data comprises original vehicle type data, the method further comprises acquiring specified original factory vehicle type data according to the original factory accessory code corresponding to the specified standard vehicle type, and verifying the specified standard vehicle type according to the specified original factory vehicle type data;

when the original vehicle data comprises original accessory data, the method further comprises acquiring specified original factory accessory data according to the original factory accessory code corresponding to the specified standard accessory, and verifying the specified standard accessory according to the specified original factory accessory data;

and when the original vehicle data comprises original accessory functional attribute data, the method further comprises acquiring specified original factory accessory functional attribute data according to the original factory accessory code corresponding to the specified standard accessory functional attribute, and verifying the specified standard accessory functional attribute according to the specified original factory accessory functional attribute data.

Further, when there are a plurality of specified standard accessories, the method further comprises ranking the plurality of specified standard accessories;

the step of ranking the plurality of specified standard accessories comprises:

setting the score value of each of the specified standard accessories to 0;

acquiring each accessory functional attribute that corresponds to any one of the specified standard accessories, and recording it as a scored accessory functional attribute;

performing the following step for each of the scored accessory functional attributes:

calculating the attribute score of each specified standard accessory relative to the current scored accessory functional attribute, and adding 1 to the score value of the specified standard accessory having the highest attribute score relative to the current scored accessory functional attribute;

and sorting the specified standard accessories by score value from high to low.
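The ranking steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the attribute score is passed in as a function `attribute_score` because its formula is not reproduced here, the names `rank_accessories` and `attributes_of` are hypothetical, the sample scores are invented, and a tie for the highest attribute score is resolved in favor of the first accessory, which the source does not specify.

```python
def rank_accessories(accessories, attributes_of, attribute_score):
    """Rank specified standard accessories by how many scored accessory
    functional attributes they win.

    accessories     -- list of specified standard accessories
    attributes_of   -- dict: accessory -> list of its functional attributes
    attribute_score -- callable (accessory, attribute) -> number (assumed)
    """
    # Step 1: every specified standard accessory starts with score value 0.
    scores = {accessory: 0 for accessory in accessories}

    # Step 2: collect each attribute tied to any of the accessories.
    scored_attributes = []
    for accessory in accessories:
        for attribute in attributes_of.get(accessory, []):
            if attribute not in scored_attributes:
                scored_attributes.append(attribute)

    # Step 3: for each scored attribute, the accessory with the highest
    # attribute score gains 1 (ties go to the first one: an assumption).
    for attribute in scored_attributes:
        best = max(accessories, key=lambda a: attribute_score(a, attribute))
        scores[best] += 1

    # Step 4: sort by score value, high to low.
    ranked = sorted(accessories, key=lambda a: scores[a], reverse=True)
    return ranked, scores


# Invented sample data for illustration.
sample_scores = {
    ("A", "x"): 0.9, ("B", "x"): 0.1,
    ("A", "y"): 0.4, ("B", "y"): 0.6,
    ("A", "z"): 0.7, ("B", "z"): 0.2,
}
ranked, scores = rank_accessories(
    ["A", "B"],
    {"A": ["x", "y"], "B": ["y", "z"]},
    lambda accessory, attribute: sample_scores[(accessory, attribute)],
)
print(ranked)  # ['A', 'B']
print(scores)  # {'A': 2, 'B': 1}
```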

Further, when the scored accessory functional attribute is a tendency attribute, the calculation formula of the attribute score is as follows:

wherein na is a standard accessory, op is a scored accessory functional attribute, prn is the nth configuration code, S_(na, op, have) is the total number of non-deduplicated original factory vehicle data records that include both the standard accessory and the scored accessory functional attribute, and S_(na, op, prn, have) is the total number of original factory vehicle data records that simultaneously include the standard accessory, the scored accessory functional attribute and the nth configuration code;

when the scored accessory functional attribute is a non-tendency attribute, the calculation formula of the attribute score is as follows:

wherein na is a standard accessory, op is a scored accessory functional attribute, prn is the nth configuration code, S_(na, op, none) is the total number of non-deduplicated original factory vehicle data records that do not include the standard accessory and the scored accessory functional attribute, and S_(na, op, prn, none) is the total number of original factory vehicle data records that do not include the standard accessory, the scored accessory functional attribute and the nth configuration code.

Further, when at least two of the specified standard accessories have the same score value, a core scored accessory functional attribute is obtained from the scored accessory functional attributes, and among those specified standard accessories, the one with the higher attribute score relative to the core scored accessory functional attribute is ranked higher.

Further, the standard vehicle data further comprises vehicle type configuration scores, and each standard vehicle type corresponds to one vehicle type configuration score;

the vehicle type configuration score is calculated as follows: acquiring the original factory accessory functional attributes corresponding to the standard vehicle type, screening the original factory accessory functional attributes according to the standard accessory functional attributes corresponding to the standard vehicle type to obtain matched original factory accessory functional attributes, and calculating the ratio of the total number of matched original factory accessory functional attributes to the total number of original factory accessory functional attributes;

and when the original vehicle data comprises original vehicle type data, acquiring the vehicle type configuration score corresponding to the specified standard vehicle type.
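The vehicle type configuration score described above is a simple ratio, which the following minimal sketch illustrates. The function name `configuration_score`, the sample attribute names and the choice to return 0.0 when there are no original factory attributes are assumptions made for illustration; exact string equality is used here as the matching rule, which the source does not specify.

```python
def configuration_score(oe_attributes, standard_attributes):
    """Ratio of original factory accessory functional attributes that match
    a standard accessory functional attribute of the same vehicle type."""
    if not oe_attributes:
        return 0.0  # assumption: no original factory attributes gives 0
    standard = set(standard_attributes)
    # Screening step: keep only the original factory attributes that match.
    matched = [attribute for attribute in oe_attributes if attribute in standard]
    return len(matched) / len(oe_attributes)


# Invented sample data: 2 of 4 original factory attributes match.
print(configuration_score(
    ["abs", "airbag", "sunroof", "esp"],
    ["abs", "esp", "radar"],
))  # 0.5
```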

Further, the standard vehicle type includes at least one of: vehicle type name, Ministry of Industry and Information Technology (MIIT) announcement number, distribution channel sales type, vehicle body form and country.

In a second aspect of the present invention, there is provided a vehicle data cleaning apparatus, the apparatus comprising:

first acquiring means for acquiring standard vehicle data;

the standard vehicle data comprises standard vehicle types, a vehicle type atom library, standard accessories, an accessory atom library, standard accessory functional attributes and an accessory functional attribute atom library;

second acquiring means for acquiring original vehicle data;

wherein the original vehicle data comprises at least one of: original vehicle type data, original accessory data and original accessory functional attribute data;

a first screening device for, when the original vehicle data comprises original vehicle type data, performing word segmentation processing on the original vehicle type data to obtain vehicle type atom information, and screening the standard vehicle types according to the vehicle type atom information and the vehicle type atom library to obtain a specified standard vehicle type;

a second screening device for, when the original vehicle data comprises original accessory data, performing word segmentation processing on the original accessory data to obtain accessory atom information, and screening the standard accessories according to the accessory atom information and the accessory atom library to obtain a specified standard accessory;

and a third screening device for, when the original vehicle data comprises original accessory functional attribute data, performing word segmentation processing on the original accessory functional attribute data to obtain accessory functional attribute atom information, and screening the standard accessory functional attributes according to the accessory functional attribute atom information and the accessory functional attribute atom library to obtain a specified standard accessory functional attribute.

In a third aspect of the invention, there is provided a computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, carry out the steps of the method according to the first aspect of the invention.

The technical scheme of the embodiment of the invention has the following advantages:

The embodiment of the invention provides a vehicle data cleaning method, a vehicle data cleaning device and a storage medium. Existing vehicle data cleaning generally relies on manual lookup, which is inefficient and difficult to operate. By screening the standard vehicle data according to the original vehicle data, the invention standardizes the original vehicle data and raises the level of intelligence of data cleaning. By performing word segmentation on the original vehicle data, the invention improves the speed and accuracy of the subsequent screening.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a table of factory accessory inventory data in accordance with the prior art.

FIG. 2 is a conventional accessory matching table.

FIG. 3 is a flowchart of a vehicle data cleansing method according to an embodiment of the present invention.

Fig. 4 is a block diagram of a vehicle data cleansing apparatus according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

In the description of the present application, it is to be understood that the terms "center," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," and the like indicate orientations and positional relationships based on those shown in the drawings. They are used only for convenience and simplicity in describing the present application, are not intended to indicate or imply that the referenced devices or elements must have a particular orientation or be constructed and operated in a particular orientation, and are not to be construed as limiting the present application. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.

In the description of the present application, it is to be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "coupled," and "connected" are to be construed broadly. For example, a connection may be fixed, removable or integral; it may be mechanical or electrical, or the parts may be in communication with each other; and it may be direct, indirect through an intervening medium, internal to two elements, or any other interaction between two elements. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.

In this application, unless expressly stated or limited otherwise, the first feature "on" or "under" the second feature may comprise direct contact of the first and second features, or may comprise contact of the first and second features not directly but through another feature in between. Also, the first feature being "on," "above" and "over" the second feature includes the first feature being directly on and obliquely above the second feature, or merely indicating that the first feature is at a higher level than the second feature. A first feature being "under," "below," and "beneath" a second feature includes the first feature being directly under and obliquely below the second feature, or simply meaning that the first feature is at a lesser elevation than the second feature.

The following disclosure provides many different embodiments or examples for implementing different features of the application. In order to simplify the disclosure of the present application, specific example components and arrangements are described below. Of course, they are merely examples and are not intended to limit the present application. Moreover, the present application may repeat reference numerals and/or letters in the various examples, such repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. In addition, examples of various specific processes and materials are provided herein, but one of ordinary skill in the art may recognize applications of other processes and/or use of other materials.

FIG. 3 is a flowchart of a vehicle data cleaning method according to an embodiment of the present invention. As shown in FIG. 3, the vehicle data cleaning method includes the following steps:

S1: acquiring standard vehicle data;

The standard vehicle data comprises standard vehicle types, a vehicle type atom library, standard accessories, an accessory atom library, standard accessory functional attributes and an accessory functional attribute atom library. In this embodiment, the standard vehicle type includes at least one of: vehicle type name, Ministry of Industry and Information Technology (MIIT) announcement number, distribution channel sales type, vehicle body form and country. The vehicle type atom library contains common names of vehicle types, the accessory atom library contains common names of accessories, and the accessory functional attribute atom library contains common names of accessory functional attributes. Each standard vehicle type corresponds to at least one vehicle type common name, each standard accessory corresponds to at least one accessory common name, and each standard accessory functional attribute corresponds to at least one accessory functional attribute common name. In the automotive field, the same part has a written name, such as front bumper skin, engine cover or middle net, which is the standard name of the part, and also common names used in the industry, such as front bumper, head cover or ghost mask. Here, "front bumper" refers to the front bumper skin, "head cover" to the engine cover, and "ghost mask" to the middle net. Moreover, one object may have many different common names; for example, the common names of the front bumper skin include front bumper, front rod, front pump handle, front bumper skin and so on. Standard vehicle types, standard accessories and standard accessory functional attributes all use written names.
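One possible in-memory form of an accessory atom library is sketched below. It is illustrative only: the dictionary layout and the names `ACCESSORY_ATOM_LIBRARY` and `build_reverse_index` are assumptions, and only the part names themselves are taken from the examples in the text.

```python
from collections import defaultdict

# Hypothetical accessory atom library: each standard (written) name maps to
# the common names used for it in the industry.
ACCESSORY_ATOM_LIBRARY = {
    "front bumper skin": ["front bumper", "front rod", "front pump handle"],
    "engine cover": ["head cover"],
    "middle net": ["ghost mask"],
}


def build_reverse_index(atom_library):
    """Map every common name, and each written name itself, to the set of
    standard names it can refer to."""
    index = defaultdict(set)
    for standard_name, common_names in atom_library.items():
        index[standard_name].add(standard_name)
        for common_name in common_names:
            index[common_name].add(standard_name)
    return index


index = build_reverse_index(ACCESSORY_ATOM_LIBRARY)
print(sorted(index["head cover"]))  # ['engine cover']
print(sorted(index["front rod"]))   # ['front bumper skin']
```

A set is used as the value so that a common name shared by several standard names could still be represented, which is consistent with one or more standard items being returned by the screening step.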

S2: acquiring original vehicle data;

The original vehicle data includes at least one of: original vehicle type data, original accessory data, and original accessory functional attribute data. In this embodiment, the original vehicle type data includes a common name of a vehicle type, the original accessory data includes a common name of an accessory, and the original accessory functional attribute data includes a common name of an accessory functional attribute.

S3: when the original vehicle data comprises original vehicle type data, performing word segmentation on the original vehicle type data to obtain vehicle type atom information, and screening the standard vehicle types according to the vehicle type atom information and the vehicle type atom library to obtain specified standard vehicle types. In this embodiment, one or more specified standard vehicle types may be selected. The standard vehicle data is preferably screened by an intelligent search center (Ai Search).

When the original vehicle data comprises original accessory data, word segmentation is performed on the original accessory data to obtain accessory atom information, and the standard accessories are screened according to the accessory atom information and the accessory atom library to obtain specified standard accessories. In this embodiment, one or more specified standard accessories may be selected. The standard vehicle data is preferably screened by an intelligent search center (Ai Search).

When the original vehicle data comprises original accessory functional attribute data, word segmentation is performed on the original accessory functional attribute data to obtain accessory functional attribute atom information, and the standard accessory functional attributes are screened according to the accessory functional attribute atom information and the accessory functional attribute atom library to obtain specified standard accessory functional attributes. In this embodiment, one or more specified standard accessory functional attributes may be selected. The standard vehicle data is preferably screened by an intelligent search center (Ai Search).

In this embodiment, word segmentation splits a field into words. For example, segmenting 'wining sports edition' yields 'wining' and 'sports', and segmenting 'Passat Bourg' yields 'Passat' and 'Bourg'. The standard keywords preferably include brand, manufacturer, chassis, vehicle series, vehicle type, emission, year, engine, transmission, and sales version.
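The segmentation and screening of step S3 can be sketched as follows. This is only an illustration under stated assumptions: the whitespace split, the stop-word list `STOP_WORDS`, the overlap-count ordering, and the library contents are all hypothetical, since the text does not specify how the intelligent search center segments or matches.

```python
from typing import Dict, List, Set

# Assumption: generic words such as "edition" carry no atom information.
STOP_WORDS = {"edition", "version"}


def segment(raw_field: str) -> List[str]:
    """Split a raw field into atoms, dropping the assumed stop words."""
    return [w for w in raw_field.split() if w.lower() not in STOP_WORDS]


def screen(atoms: List[str], atom_library: Dict[str, Set[str]]) -> List[str]:
    """Return standard names sharing at least one atom, best overlap first."""
    wanted = {a.lower() for a in atoms}
    hits = []
    for standard_name, known_atoms in atom_library.items():
        overlap = len(wanted & {k.lower() for k in known_atoms})
        if overlap:
            hits.append((overlap, standard_name))
    # Sort by descending overlap, then by name for a stable result.
    hits.sort(key=lambda h: (-h[0], h[1]))
    return [name for _, name in hits]


# Hypothetical vehicle type atom library: standard name -> common-name atoms.
vehicle_type_atoms = {
    "standard type A": {"Passat", "Bourg"},
    "standard type B": {"Passat"},
    "standard type C": {"wining", "sports"},
}

atoms = segment("Passat Bourg")
print(atoms, screen(atoms, vehicle_type_atoms))
```

Because more than one standard name can share an atom, the function returns a list, which matches the statement that one or more specified standard vehicle types may be selected.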

Existing automobile accessory data cleaning generally relies on manual searching, which is inefficient and difficult to operate. By matching and screening against the original vehicle data, the invention standardizes the original vehicle data and raises the level of automation of data cleaning. Performing word segmentation on the original vehicle data improves the speed and accuracy of the subsequent screening. Word segmentation converts original vehicle data of differing semantics and dimensions into vehicle type data of the finest granularity, which facilitates recognition and logical processing and greatly improves the efficiency of standardization. In use, for example, "tittle 2019 comfort version 1.4" may be converted by the vehicle data cleaning method of an embodiment of the present invention into: brand "popular-popular", vehicle group "soar 0J 2019", vehicle type "soar", displacement and engine number "1.4T-DJSA", sales version "1.4TSI double clutch 280TSI comfortable type", model year "2019", and standard vehicle type information "MJS9208637". Unlike existing search, which repeatedly serves single-vehicle-type queries issued directly by a user, the method performs multi-vehicle-type queries in a catalogue matching scenario.

In a specific embodiment, each standard vehicle type corresponds to at least one standard accessory, and each standard accessory corresponds to at least one standard accessory functional attribute. The standard vehicle data further comprises original factory accessory codes, and each standard accessory corresponds to one original factory accessory code. When the original vehicle data comprises original vehicle type data, the method further comprises obtaining the standard accessories, the standard accessory functional attributes and the original factory accessory codes corresponding to the specified standard vehicle type. When the original vehicle data comprises original accessory data, the method further comprises obtaining the standard vehicle type, the standard accessory functional attributes and the original factory accessory code corresponding to the specified standard accessory. When the original vehicle data comprises original accessory functional attribute data, the method further comprises obtaining the standard vehicle type, the standard accessory and the original factory accessory code corresponding to the specified standard accessory functional attribute.
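The correspondences described above amount to following links between tables. A minimal sketch for the vehicle type case, assuming the correspondences are held in plain dictionaries; the table names, the function `expand_vehicle_type`, and every value (including the codes `OE-0001` and `OE-0002`) are hypothetical.

```python
from typing import Dict, List

# Hypothetical correspondence tables.
ACCESSORIES_BY_VEHICLE_TYPE: Dict[str, List[str]] = {
    "vehicle type A": ["front bumper skin", "engine cover"],
}
ATTRIBUTES_BY_ACCESSORY: Dict[str, List[str]] = {
    "front bumper skin": ["with radar holes"],
    "engine cover": ["aluminium"],
}
OE_CODE_BY_ACCESSORY: Dict[str, str] = {
    "front bumper skin": "OE-0001",
    "engine cover": "OE-0002",
}


def expand_vehicle_type(vehicle_type: str) -> Dict[str, List[str]]:
    """Collect accessories, attributes and factory codes for a vehicle type."""
    accessories = ACCESSORIES_BY_VEHICLE_TYPE.get(vehicle_type, [])
    return {
        "accessories": list(accessories),
        "attributes": [
            attr
            for acc in accessories
            for attr in ATTRIBUTES_BY_ACCESSORY.get(acc, [])
        ],
        "oe_codes": [
            OE_CODE_BY_ACCESSORY[acc]
            for acc in accessories
            if acc in OE_CODE_BY_ACCESSORY
        ],
    }


print(expand_vehicle_type("vehicle type A"))
```

The accessory and functional attribute cases would follow the same links in the opposite direction.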

Compared with the prior art, the vehicle data cleaning method provided by the embodiment of the invention establishes a relation among the original vehicle data, the standard vehicle data and the original factory accessory codes. Because the original vehicle type data, the original accessory data and the original accessory functional attribute data are processed separately, the processing results can be checked against one another when several are obtained, which improves data stability.

In a specific embodiment, when the original vehicle data includes original vehicle type data, the method further includes obtaining specified original factory vehicle type data according to the original factory accessory code corresponding to the specified standard vehicle type, and verifying the specified standard vehicle type against the specified original factory vehicle type data. When the original vehicle data comprises original accessory data, specified original factory accessory data is obtained according to the original factory accessory code corresponding to the specified standard accessory, and the specified standard accessory is verified against the specified original factory accessory data. When the original vehicle data comprises original accessory functional attribute data, specified original factory accessory functional attribute data is obtained according to the original factory accessory code corresponding to the specified standard accessory functional attribute, and the specified standard accessory functional attribute is verified against the specified original factory accessory functional attribute data.
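The verification step for an accessory might look like the sketch below. The text does not say what the check consists of, so this example assumes that verification passes when the original factory record retrieved by code names the same accessory; the function `verify_accessory`, the record layout and the code `OE-0001` are hypothetical.

```python
from typing import Dict


def verify_accessory(
    specified_accessory: str,
    oe_code_by_accessory: Dict[str, str],
    factory_records: Dict[str, Dict[str, str]],
) -> bool:
    """Check a specified standard accessory against original factory data."""
    code = oe_code_by_accessory.get(specified_accessory)
    if code is None:
        return False  # no original factory accessory code to look up
    record = factory_records.get(code)
    # Assumed criterion: the factory record names the same accessory.
    return record is not None and record.get("accessory") == specified_accessory


# Hypothetical data.
oe_codes = {"front bumper skin": "OE-0001"}
records = {"OE-0001": {"accessory": "front bumper skin"}}

print(verify_accessory("front bumper skin", oe_codes, records))
```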

Compared with the prior art, the embodiment of the invention verifies the specified standard vehicle type, the specified standard accessory or the specified standard accessory functional attribute against original factory data, thereby improving the reliability of the data.

In a specific embodiment, when there are a plurality of specified standard accessories, the method further comprises sorting the plurality of specified standard accessories. The step of sorting the plurality of specified standard accessories comprises:

setting the score value of each specified standard accessory to 0, acquiring the accessory functional attributes corresponding to any one of the specified standard accessories, and recording them as scoring accessory functional attributes;

performing the following step for each scoring accessory functional attribute:

calculating the attribute score of each specified standard accessory relative to the current scoring accessory functional attribute, and adding 1 to the score value of the specified standard accessory with the highest attribute score relative to the current scoring accessory functional attribute;

sorting the specified standard accessories from high to low according to score value.
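The sorting steps above can be sketched as follows. The attribute score itself is passed in as a function, because this sketch makes no assumption about how it is calculated; the name `rank_accessories` and the lookup table `demo_scores` are hypothetical illustration data.

```python
from typing import Callable, Dict, List, Tuple


def rank_accessories(
    accessories: List[str],
    scoring_attributes: List[str],
    attribute_score: Callable[[str, str], float],
) -> Tuple[List[str], Dict[str, int]]:
    """Rank specified standard accessories by how many attributes they win."""
    # Step 1: every specified standard accessory starts with a score value of 0.
    score_value = {acc: 0 for acc in accessories}
    # Step 2: for each scoring attribute, the accessory with the highest
    # attribute score gains 1 (on ties, max() keeps the first in input order).
    for attribute in scoring_attributes:
        best = max(accessories, key=lambda acc: attribute_score(acc, attribute))
        score_value[best] += 1
    # Step 3: sort from high to low by score value (stable for equal values).
    ranked = sorted(accessories, key=lambda acc: score_value[acc], reverse=True)
    return ranked, score_value


# Hypothetical attribute scores for three accessories and three attributes.
demo_scores = {
    ("A", "x"): 0.9, ("B", "x"): 0.2, ("C", "x"): 0.1,
    ("A", "y"): 0.3, ("B", "y"): 0.8, ("C", "y"): 0.1,
    ("A", "z"): 0.7, ("B", "z"): 0.6, ("C", "z"): 0.2,
}
ranked, values = rank_accessories(
    ["A", "B", "C"], ["x", "y", "z"], lambda acc, attr: demo_scores[(acc, attr)]
)
print(ranked, values)
```

Here accessory A wins attributes x and z, B wins y, and C wins none, so the order is A, B, C.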

In this embodiment, when the scoring accessory functional attribute is a tendency attribute, the calculation formula of the attribute score is as follows:

wherein na is a standard accessory, op is a scoring accessory functional attribute, prn is the nth configuration code, S(na, op, have) is the total number of non-deduplicated original factory vehicle data records that include both the standard accessory and the scoring accessory functional attribute, and S(na, op, prn, have) is the total number of original factory vehicle data records that simultaneously include the standard accessory, the scoring accessory functional attribute and the nth configuration code;

when the scoring accessory functional attribute is a non-tendency attribute, the calculation formula of the attribute score is as follows:

wherein na is a standard accessory, op is a scoring accessory functional attribute, prn is the nth configuration code, S(na, op, none) is the total number of non-deduplicated original factory vehicle data records that exclude the standard accessory and the scoring accessory functional attribute, and S(na, op, prn, none) is the total number of original factory vehicle data records that exclude the standard accessory, the scoring accessory functional attribute and the nth configuration code.

Compared with the prior art, the vehicle data cleaning method provided by the embodiment of the invention scores the standard accessories according to the similarity between the standard accessories and the original factory accessories, and can thereby determine the configuration level of the standard accessories.

In a specific embodiment, when the score values of at least two specified standard accessories are the same, a core scoring accessory functional attribute is obtained from the scoring accessory functional attributes, and the higher the attribute score of a specified standard accessory relative to the core scoring accessory functional attribute, the higher it is ranked.

In this embodiment, the weight of the core scoring accessory functional attribute may be increased according to actual requirements.
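The tie-breaking rule can be expressed as a two-part sort key. A minimal sketch, assuming the score values and the attribute scores relative to the core scoring accessory functional attribute have already been computed; the function name and the numbers are hypothetical.

```python
from typing import Dict, List


def rank_with_tie_break(
    score_values: Dict[str, int],
    core_attribute_scores: Dict[str, float],
) -> List[str]:
    """Sort by score value, then by attribute score on the core attribute."""
    return sorted(
        score_values,
        key=lambda acc: (score_values[acc], core_attribute_scores.get(acc, 0.0)),
        reverse=True,
    )


# Hypothetical values: A and B tie on score value, B wins on the core attribute.
order = rank_with_tie_break(
    {"A": 2, "B": 2, "C": 1},
    {"A": 0.4, "B": 0.9, "C": 1.0},
)
print(order)
```

The core attribute score only matters among accessories with equal score values, so C stays last despite having the highest core attribute score.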

In a specific embodiment, the standard vehicle data further includes vehicle type configuration scores, one for each standard vehicle type. The vehicle type configuration score is calculated as follows: the original factory accessory functional attributes corresponding to the standard vehicle type are obtained, the original factory accessory functional attributes are screened according to the standard accessory functional attributes corresponding to the standard vehicle type to obtain matched original factory accessory functional attributes, and the ratio of the total number of matched original factory accessory functional attributes to the total number of original factory accessory functional attributes is calculated. When the original vehicle data comprises original vehicle type data, the vehicle type configuration score corresponding to the specified standard vehicle type is obtained.
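The ratio described above can be computed directly. A minimal sketch, assuming that an original factory attribute counts as matched when it appears verbatim among the standard attributes, and that an empty original factory list scores 0; both conventions, the function name and the sample attributes are assumptions.

```python
from typing import Iterable, List


def vehicle_type_configuration_score(
    factory_attributes: List[str],
    standard_attributes: Iterable[str],
) -> float:
    """Ratio of matched original factory attributes to all factory attributes."""
    if not factory_attributes:
        return 0.0  # assumed convention for an empty original factory list
    standard = set(standard_attributes)
    # Screening: keep the factory attributes that the standard data also has.
    matched = [attr for attr in factory_attributes if attr in standard]
    return len(matched) / len(factory_attributes)


# Hypothetical attributes: two of the four factory attributes are matched.
score = vehicle_type_configuration_score(
    ["sunroof", "radar", "heated seats", "xenon lamps"],
    ["sunroof", "heated seats", "tow bar"],
)
print(score)
```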

Compared with the prior art, the vehicle data cleaning method provided by the embodiment of the invention scores the standard vehicle type according to its similarity to the original factory vehicle type, and can thereby determine the configuration level of the standard vehicle type.

In this embodiment, a vehicle data cleaning apparatus is further provided. The apparatus is used to implement the above embodiments and preferred embodiments, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

FIG. 4 is a block diagram showing the configuration of a vehicle data cleaning apparatus according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes the following. A first acquiring device 11 is used for acquiring standard vehicle data, where the standard vehicle data comprises standard vehicle types, a vehicle type atom library, standard accessories, an accessory atom library, standard accessory functional attributes and an accessory functional attribute atom library. A second acquiring device 12 is used for acquiring original vehicle data, where the original vehicle data includes at least one of: original vehicle type data, original accessory data, and original accessory functional attribute data. A first screening device 13 is used for, when the original vehicle data comprises original vehicle type data, performing word segmentation on the original vehicle type data to obtain vehicle type atom information, and screening the standard vehicle types according to the vehicle type atom information and the vehicle type atom library to obtain a specified standard vehicle type. A second screening device 14 is used for, when the original vehicle data comprises original accessory data, performing word segmentation on the original accessory data to obtain accessory atom information, and screening the standard accessories according to the accessory atom information and the accessory atom library to obtain specified standard accessories. A third screening device 15 is used for, when the original vehicle data comprises original accessory functional attribute data, performing word segmentation on the original accessory functional attribute data to obtain accessory functional attribute atom information, and screening the standard accessory functional attributes according to the accessory functional attribute atom information and the accessory functional attribute atom library to obtain the specified standard accessory functional attributes.

Embodiments of the present invention further provide a non-transitory computer storage medium storing computer-executable instructions that can execute the vehicle data cleaning method in any of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory, a Hard Disk Drive (HDD), a Solid State Drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kinds described above.

Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.
