Transfer learning and domain adaptation using distributable data models

Document No.: 1246778  Publication date: 2020-08-18

Reading note: This technology, "Transfer learning and domain adaptation using distributable data models," was created by Jason Crabtree and Andrew Sellers on 2018-12-07. Its main content includes: A system for transfer learning and domain adaptation using a distributable data model is provided, comprising a networked distributable model source configured to serve a plurality of distributable model instances; and a directed computational graph module configured to receive from a networked computing system at least one instance of at least one distributable model, create a second data set from machine learning performed by a transfer engine, train the instance of the distributable model with the second data set, and generate an update report based at least in part on an update to the instance of the distributable model.

1. A system for using a distributable data model for transfer learning and domain adaptation, comprising:

a distributable model source comprising a memory, a processor, and a plurality of programming instructions stored in the memory thereof and executable on the processor thereof, wherein the programming instructions, when executed on the processor, cause the processor to:

store a plurality of machine learning models;

generate a distributable model instance based at least in part on at least one of the machine learning models;

send the distributable model instance over a network;

a transfer engine comprising a memory, a processor, and a plurality of programming instructions stored in the memory thereof and executable on the processor thereof, wherein the programming instructions, when executed on the processor, cause the processor to:

receive at least one distributable model instance from the model source; and

apply a plurality of machine learning algorithms to at least a portion of the received distributable model instance;

a directed computational graph engine comprising a memory, a processor, and a plurality of programming instructions stored in the memory thereof and executable on the processor thereof, wherein the programming instructions, when executed on the processor, cause the processor to:

receive at least one distributable model instance from the model source;

create a second data set from data stored in the memory based at least in part on transfer learning performed by the transfer engine;

train the distributable model instance using the second data set; and

generate an update report based at least in part on an update to the distributable model instance.

2. The system of claim 1, wherein at least one of the machine learning algorithms comprises a probabilistic learning network.

3. The system of claim 2, wherein the probabilistic learning network comprises a Markov logic network.

4. The system of claim 2, wherein the probabilistic learning network comprises a Bayesian network.

5. The system of claim 1, wherein the transfer engine is further configured to:

receive a pre-trained data model;

incorporate at least a portion of the pre-trained data model into a partially unsupervised machine learning process; and

apply the partially unsupervised machine learning process to the distributable model instance.

6. A method for using a distributable data model for transfer learning and domain adaptation, comprising the steps of:

(a) storing a plurality of machine learning models in a distributable model source;

(b) generating a distributable model instance based at least in part on at least one of the machine learning models;

(c) sending the distributable model instance over a network;

(d) receiving at least one distributable model instance at a transfer engine from the distributable model source;

(e) applying a plurality of machine learning algorithms to the distributable model instance;

(f) using a directed computational graph engine to train the distributable model instance, employing the plurality of machine learning algorithms executed by the transfer engine; and

(g) generating an update report based at least in part on an update to the distributable model instance.

7. The method of claim 6, wherein at least one of the machine learning algorithms comprises a probabilistic learning network.

8. The method of claim 7, wherein the probabilistic learning network comprises a Markov logic network.

9. The method of claim 7, wherein the probabilistic learning network comprises a Bayesian network.

10. The method of claim 6, further comprising the steps of:

receiving a pre-trained data model;

incorporating at least a portion of the pre-trained data model into a partially unsupervised machine learning process; and

applying the partially unsupervised machine learning process to the distributable model instance.

Technical Field

The present disclosure relates to the field of machine learning, and more particularly to model improvement using bias included in data distributed across multiple devices.

Background

In conventional machine learning, data is typically collected and processed at a central location, and the collected data may then be used to train a model. However, data that can be collected by means such as web crawling or news gathering is relatively narrow in scope compared to, for example, crime data stored on personal mobile devices or at local police stations. Such data may be difficult to move off the device storing it, both because it may contain sensitive information and because of the bandwidth that transferring it may require.

A generative model trained on the data itself (whether anonymized or not) may be restricted from transmission (e.g., due to GDPR). Existing tools such as SNORKEL™ may provide a substantial mechanism to accelerate the generation of realistic, labeled training data from small seeds. The same concept can support the transfer of valuable models without moving the restricted information itself. This also applies to the sharing of other models, data sets, visualizations, pipelines, and other data, such as in multi-tenant deployments, inter-organizational sharing, or hybrid cloud-edge usage scenarios.

What is needed is a system for using transfer learning to adapt a data model trained in a particular environment or application that may not have an equivalent feature space or distribution within the target data or application.

Disclosure of Invention

Accordingly, the present inventors have contemplated a system and method for using transfer learning to adapt a data model trained in a particular environment or application that may not have an equivalent feature space or distribution within the target data or application.

In a preferred embodiment, the model source may serve instances of various distributable models: generalized models, in which bias in the training data has been weighted and corrected, and bias-specific models, in which a particular bias is deliberately utilized. Instances are served to distributed devices, where they can be trained locally. Each device generates an update report that is transmitted back to the model source, where it can be used to refine the master model.

In one aspect of the invention, a system for refining distributable models using bias contained in distributed data is provided, comprising a networked distributable model source comprising a memory, a processor, and a plurality of programming instructions stored in the memory thereof and executable on the processor thereof, wherein the programming instructions, when executed on the processor, cause the processor to serve instances of a plurality of distributable models; and a directed computational graph module comprising a memory, a processor, and a plurality of programming instructions stored in the memory thereof and executable on the processor thereof, wherein the programming instructions, when executed on the processor, cause the processor to receive at least one instance of a distributable model from the networked computing system, create a sanitized data set from data stored in the memory based at least in part on bias contained within the data stored in the memory, train the instance of the distributable model with the sanitized data set, and generate an update report at least in part by updating the instance of the distributable model.

In another embodiment of the invention, at least a portion of the sanitized data set is data from which sensitive information has been deleted. In another embodiment of the invention, at least a portion of the update reports are used by the networked distributable model source to refine the distributable model. In another embodiment of the invention, at least a portion of the bias contained within the data stored in the memory is based on geographic location. In another embodiment of the invention, at least a portion of the bias contained within the data stored in the memory is based on age. In another embodiment of the invention, at least a portion of the bias contained within the data stored in the memory is based on gender.

In another aspect of the invention, a method for refining a distributable model using bias contained in distributed data is provided, comprising the steps of: (a) serving a plurality of instances of the distributable model with a networked distributable model source; (b) receiving at least one instance of at least one distributable model from a networked computing system with a directed computational graph module; (c) creating, with the directed computational graph module, a sanitized data set from data stored in a memory based at least in part on bias contained within the data stored in the memory; (d) training the instance of the distributable model with the sanitized data set using the directed computational graph module; and (e) generating, with the directed computational graph module, an update report at least in part by updating the instance of the distributable model.
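Steps (a) through (e) above can be sketched in miniature as follows. The class names, the toy training rule, and the format of the update report are illustrative assumptions for exposition only, not part of the disclosure:

```python
class DistributableModelSource:
    """Serves model instances and folds update reports back into the master."""

    def __init__(self, weights):
        self.weights = list(weights)  # master model parameters

    def serve_instance(self):
        # step (a): hand out a copy of the model, never the master itself
        return list(self.weights)

    def apply_update_report(self, report):
        # counterpart of step (e): refine the master from reported deltas
        self.weights = [w + d for w, d in zip(self.weights, report["deltas"])]


def train_instance(instance, sanitized_data):
    # steps (c)/(d): a toy "training" pass that nudges every weight by the
    # mean of the sanitized data (a stand-in for real gradient updates)
    mean = sum(sanitized_data) / len(sanitized_data)
    updated = [w + 0.1 * mean for w in instance]
    report = {"deltas": [u - w for u, w in zip(updated, instance)]}
    return updated, report


source = DistributableModelSource([0.0, 1.0])
instance = source.serve_instance()
instance, report = train_instance(instance, [2.0, 4.0])
source.apply_update_report(report)
```

The point of the sketch is that only the delta report travels back to the source; the sanitized data itself stays on the device.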

Drawings

Detailed Description

The present inventors have contemplated and practiced a system and method for improving a regionalized distributable model with bias included in the distributed data.

One or more different aspects may be described in the present application. Further, many alternative arrangements may be described for one or more of the aspects described herein; it should be understood that these are presented for illustrative purposes only and do not limit the aspects contained herein or the claims presented herein in any way. One or more of the arrangements may be widely applicable to many aspects, as may be readily apparent from the disclosure. In general, arrangements are described in sufficient detail to enable those skilled in the art to practice one or more of the aspects, and it is to be understood that other arrangements may be utilized and that structural, logical, software, electrical, and other changes may be made without departing from the scope of the particular aspects. Particular features of one or more of the aspects described herein may be described with reference to one or more particular aspects or figures that form a part of this disclosure, and in which are shown, by way of illustration, specific arrangements of one or more of the aspects. It should be understood, however, that such features are not limited to use in the one or more particular aspects or figures with reference to which they are described. The present disclosure is neither a literal description of all arrangements of one or more of the aspects nor a listing of features of one or more of the aspects that must be present in all arrangements.

The section headings provided in this patent application and the title of this patent application are for convenience only and should not be construed as limiting the disclosure in any way.

Devices that are in communication with each other need not be in continuous communication with each other, unless expressly specified otherwise. Further, devices that are in communication with each other may communicate logically or physically directly or indirectly through one or more communication mechanisms or intermediaries.

A description of an aspect having multiple components in communication with each other does not imply that all such components are required. Rather, various optional components may be described to illustrate a variety of possible aspects and to more fully describe one or more aspects. Similarly, although process steps, method steps, algorithms, or the like may be described in sequential order, such processes, methods, and algorithms may generally be configured to work in alternate orders, unless specifically stated to the contrary. In other words, any sequence or order of steps described in this patent application does not, in and of itself, indicate a requirement that the steps be performed in that order. The steps of described processes may be performed in any practical order. Further, some steps may be performed simultaneously despite being described or implied as occurring non-simultaneously (e.g., because one step is described after the other step). Moreover, the illustration of a process by its depiction in a drawing does not imply that the illustrated process is exclusive of other variations and modifications thereto, does not imply that the illustrated process or any of its steps is necessary to one or more of the aspects, and does not imply that the illustrated process is preferred. Also, steps are generally described once per aspect, but this does not mean they must occur once, or that they may only occur once each time a process, method, or algorithm is carried out or executed. Some steps may be omitted in some aspects or occurrences, or some steps may be executed more than once in a given aspect or occurrence.

When a single device or article is described herein, it will be readily apparent that more than one device or article may be used in place of a single device or article. Similarly, when more than one device or article is described herein, it will be readily apparent that a single device or article may be used in place of the more than one device or article.

The functionality or features of a device may alternatively be embodied by one or more other devices that are not explicitly described as having such functionality or features. Thus, other aspects need not include the device itself.

The techniques and mechanisms described or referenced herein are sometimes described in singular form for clarity. However, it should be appreciated that a particular aspect may include multiple iterations of a technique or multiple instantiations of a mechanism, unless noted otherwise. Process descriptions or blocks in the figures should be understood as representing modules, segments, or portions of code that include one or more executable instructions for implementing specific logical functions or steps in the process. Alternate implementations are included within the scope of the various aspects, in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those having ordinary skill in the art.

Conceptual architecture

FIG. 1 is an exemplary architecture diagram of a business operating system 100 in accordance with an embodiment of the invention. Clients access the system 105 for specialized data entry, system control, and interaction with system output, such as automated predictive decision making and planning and alternate pathway simulations, through the system's distributed, extensible high-bandwidth cloud interface 110, which uses a versatile, robust web application-driven interface for both input and display of client-facing information, together with a data store 112 such as, but not limited to, MONGODB™, COUCHDB™, CASSANDRA™, or REDIS™, depending on the embodiment. Much of the business data analyzed by the system comes both from data sources within the client's business and from cloud-based, public or proprietary data sources 107, such as, but not limited to: subscribed business field-specific data services, external remote sensors, subscribed satellite image and data feeds, and web sites of general interest and of interest to the client's specific field of business operation. Such data may also enter the system through the cloud interface 110, with the data being passed to a connector module 135, which may have the API routines 135a needed to receive and convert the external data, and which then passes the normalized information to the other analysis and transformation components of the system: a directed computational graph module 155, a high-volume web crawler module 115, a multidimensional time-series database 120, and a graph stack service 145. The directed computational graph module 155 retrieves one or more streams of data from a plurality of sources including, but not limited to, a plurality of physical sensors, network service providers, web-based questionnaires and surveys, monitoring of electronic infrastructure, crowd activity, and human input device information.
Within the directed computational graph module 155, data may be split into two identical streams in a specialized, pre-programmed data pipeline 155a, wherein one sub-stream may be sent for batch processing and storage while the other sub-stream may be reformatted for transformation pipeline analysis. The data may then be passed to a general transformer service module 160 for linear data transformation as part of the analysis, or to a decomposable transformer service module 150 for branching or iterative transformation as part of the analysis. The directed computational graph module 155 represents all data as directed graphs, in which the transformations are nodes and the resulting messages are the edges between transformations. The high-volume web crawler module 115 may use a plurality of server-resident, pre-programmed web spiders which, while self-configuring, may be deployed within a web scraping framework 115a, of which SCRAPY™ is an example, to identify and retrieve data of interest from web-based sources that are not well tagged by conventional web crawling technology. The multidimensional time-series data store module 120 may receive streaming data from a large plurality of sensors, which may be of several different types. The multidimensional time-series data store module 120 may also store any time-series data encountered by the system 100, such as, but not limited to, environmental factors at a protected client's infrastructure sites, component sensor readings and system logs of some or all of a protected client's equipment, weather and catastrophic event reports for the regions in which a protected client's infrastructure resides, political and/or news reports (including, without limitation, news, capital funding opportunities and financial information, sales, and market conditions) from the regions in which the protected client's infrastructure and network services reside, and customer data related to the services.
The multidimensional time-series data store module 120 can process input data with the ability to dynamically allot network bandwidth and server processing channels to accommodate irregular and high-volume surges. Programming wrappers 120a for languages including, but not limited to, C++, PERL, PYTHON, and ERLANG™ allow sophisticated programming logic to be added to the default functionality of the multidimensional time-series database 120 without requiring familiarity with its core programming, greatly extending its breadth of function. Data retrieved by the multidimensional time-series database 120 and the high-volume web crawler module 115 may be further analyzed and transformed into task-optimized results by the directed computational graph module 155 and its associated general transformer service module 160 and decomposable transformer service module 150. Alternatively, data from the multidimensional time-series database and the high-volume web crawler module, often with scripted cuing information 145a identifying vertices of importance, may be sent to the graph stack service module 145, which employs standardized protocols to convert the information stream into a graph representation of that data, for example using an open graph internet technology (although the invention is not reliant on any one standard). Through these steps, the graph stack service module 145 represents the data in graph form, influenced by any pre-determined scripted modifications 145a, and stores it in a graph-based data store 145b such as GIRAPH™, or a key-value pair data store such as REDIS™ or RIAK™, any of which are suitable for storing graph-based information.
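The directed computational graph's representation of transformations as nodes, with each edge carrying the message produced by the upstream transformation, can be illustrated with a minimal sketch. The node names, transform functions, and traversal routine are illustrative assumptions, not the disclosed implementation; note how "normalize" feeds two sub-streams, mirroring the batch-storage and pipeline-analysis split described above:

```python
# Dependency graph: each node lists the upstream nodes whose messages it consumes.
graph = {
    "ingest": [],
    "normalize": ["ingest"],
    "batch_store": ["normalize"],        # sub-stream sent for storage
    "pipeline_analysis": ["normalize"],  # sub-stream sent for analysis
}

transforms = {
    "ingest": lambda inputs: [3, 1, 2],            # raw incoming data
    "normalize": lambda inputs: sorted(inputs[0]),
    "batch_store": lambda inputs: list(inputs[0]),
    "pipeline_analysis": lambda inputs: sum(inputs[0]),
}

def run(node, cache=None):
    """Evaluate a node, memoizing so a shared upstream message computes once."""
    if cache is None:
        cache = {}
    if node not in cache:
        inputs = [run(dep, cache) for dep in graph[node]]
        cache[node] = transforms[node](inputs)
    return cache[node]
```

With a shared cache, both sub-streams reuse the single normalized message rather than recomputing it.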

The results of the transformation analytics process may then be combined with further client directives, additional business rules and practices relevant to the analysis, and situational information external to the data already available to the automated planning service module 130, which also runs powerful information-theory-based predictive statistics functions and machine learning algorithms 130a to allow future trends and outcomes to be rapidly forecast based upon the current system-derived results and the choice among a plurality of possible business decisions. Then, using all or most of the available data, the automated planning service module 130 may propose the business decisions most likely to result in favorable business outcomes, with a usably high level of certainty. Closely related to the automated planning service module 130, in using system-derived results together with possibly externally supplied additional information to assist end-user business decision making, the action outcome simulation module 125, with its discrete event simulator programming module 125a, is coupled with the end-user-facing observation and state estimation service 140, which is highly scriptable 140b as circumstances require and has a game engine 140a, to more realistically stage the possible outcomes of the business decisions under consideration, allowing business decision makers to investigate the probable outcomes of choosing one pending course of action over another based upon analysis of the currently available data.

FIG. 2 is a block diagram of an exemplary system 200 for using transfer learning to adapt a data model trained in a particular environment or application that may not have an equivalent feature space or distribution within the target data or application, in accordance with various embodiments of the invention. The system 200 includes a distributable model source 201 and a plurality of electronic devices 220[a-n]. The model source 201 may include a distributable model 205, a local data store 210, and a synthetic data store 215; and each electronic device may have an instance model 221[a-n] and device data 222[a-n]. The electronic devices 220[a-n] may be any type of computerized hardware commonly used in the art, including, but not limited to, desktop computers, laptop computers, mobile devices, tablet computers, and computer clusters. For simplicity, the following discussion of the system 200 is from the perspective of a single device 220a, which includes an instance model 221a and device data 222a. It should be understood that the other devices may operate independently and in a manner similar to device 220a.

The model source 201 may be configured to send a copy of an instance of the distributable model 205 to the device 220a. It may be any type of computerized hardware used in the art that is configured to use the business operating system 100, and it may communicate with connected devices via a directed computational graph data pipeline 155a, which may allow incoming and outgoing data to be processed in transit. The distributable model 205 may be a model commonly used in the art in machine learning, or it may be a specialized model that has been adapted to more effectively utilize training data distributed among various devices while still being trainable through conventional machine learning methods. The local data 210 may include previously stored data or data in the process of being stored, such as data currently being gathered by monitoring other systems. The synthetic data 215 may be data that has been intelligently and/or predictively generated to fit the context of the model, generally based on real trends and events. Examples of synthetic data 215 may include data gathered from running computer simulations; data generated using the predictive modeling functions of the business operating system 100 together with previously stored data and current trends; or data generated using other specialized software, such as SNORKEL™. The local data 210 and synthetic data 215 may be used by the model source 201 as training data for the distributable model 205. Although depicted in FIG. 2 as being stored within the model source 201, those skilled in the art will appreciate that the local data 210 and synthetic data 215 may originate from external systems and be provided to the model source 201 through some means, such as an Internet or local area network connection.

In the system 200, the device 220a may connect to the model source 201 through a network connection, such as the Internet, a local area network, a virtual private network, and the like, whereupon the device 220a may be provided a copy of an instance of the distributable model 221a. The device 220a may then sanitize and cleanse its own data 222a and train its instance model 221a with the sanitized data. Reports based on updates to the instance model 221a may be used by the model source 201 to refine the distributable model 205. In a preferred embodiment, the devices 220[a-n] may comprise systems configured to use the business operating system 100 and utilize its directed computational graph capabilities, among other functions, to process the data. However, in some embodiments the hardware of the electronic device may be, for example, a mobile device or a computer system running another operating system. Those systems may use specialized software configured to process the data in accordance with established compatibility specifications.

The transfer engine 230 may be used to facilitate sharing of the model 205 with some destination device 220b as needed, such as for adapting the model 205 for use in a different environment than the one in which it was originally created, or for a different purpose. This may be used, for example, to share data models between departments or organizations, which may benefit from each other's data despite changes or differences in hardware or software environments, or differing purposes. The transfer engine 230 may utilize a variety of machine learning schemes to provide transfer learning, taking machine learning from one problem or data set and applying it to different (but related) data. For example, machine learning trained on a data set about cars may be applicable to data about other vehicles, such as trucks or motorcycles, and transfer learning may be used to apply the car-specific learning to the broader set of related data and data types. Transfer learning schemes may include, for example, probabilistic logic such as Markov logic networks (which may be used to apply learned techniques to new data models, for purposes such as facilitating inference under uncertainty or modifying a data model for a new purpose) or Bayesian models that model various conditional dependencies as a directed acyclic graph. These schemes, and possibly others according to various aspects, may be used to analyze a data set according to its intended purpose, such as ensuring that the data complies with any applicable regulations (for example, by reviewing for personally identifiable information that may be present) or that the data conforms to any required specifications to ensure interoperability with an intended destination or use.
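The car-to-truck example above can be illustrated with a deliberately tiny sketch of transfer learning: a one-feature linear model is fitted on plentiful source-domain ("car") data, and its learned weights are then used as the starting point for adapting to a small amount of related target-domain ("truck") data. The gradient-descent routine and the toy data are illustrative assumptions, not the probabilistic schemes (Markov logic networks, Bayesian models) named in the disclosure:

```python
def fit_linear(xs, ys, w=0.0, b=0.0, lr=0.01, epochs=500):
    """One-feature linear regression fitted by plain gradient descent."""
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Source domain: plentiful "car" data following y = 2x + 1
car_x = [0.0, 1.0, 2.0, 3.0, 4.0]
car_y = [1.0, 3.0, 5.0, 7.0, 9.0]
w_src, b_src = fit_linear(car_x, car_y)

# Target domain: only two "truck" points (y = 2x + 2). Starting from the
# source weights transfers what was already learned, so far less data and
# far fewer steps are needed to adapt.
truck_x, truck_y = [1.0, 2.0], [4.0, 6.0]
w_tgt, b_tgt = fit_linear(truck_x, truck_y, w=w_src, b=b_src, epochs=200)
```

The design point is the warm start: fine-tuning from the source weights reduces target-domain error relative to using the source model unchanged.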

The transport engine 230 may use transport learning to adapt to environments or applications that may have different characteristics or meet different criteria so that a data set may be evaluated for suitability for a particular destination. The analysis may be further enhanced by using a plurality of pre-trained models, which may serve as starting points for further refinement during operation, to apply a machine learning algorithm to the models themselves so that they may be modified to accommodate changes in the data or destination information, ensuring that the analysis remains suitable for the intended design purpose as well as the data set being analyzed. Further, the pre-trained models can be used to adapt to environments that do not require in-depth analysis, such as, for example, in resource-limited deployments (e.g., embedded or mobile devices), where initial analysis of data can be performed using the pre-trained models before they are exported for storage or use of more advanced machine learning capabilities. This also provides a mechanism for adapting to environments that may not require localized training, such as when deployed across many mobile devices, it may be necessary to ensure consistency of analysis operations, without the possibility of each device developing its own localized training model.

The transfer engine 230 may be used to facilitate domain adaptation, learning from one data set (the initial distributable data model 205) and applying that machine learning to new data to create a new data model based on the knowledge gained from the initial training. This may be performed in a partially or fully unsupervised manner, learning from the source data without the manual labeling common in traditional approaches. In a partially unsupervised arrangement, a pre-trained model with some labeled data may be used as the starting point for further training, with unsupervised learning incorporating the labeled information and developing a larger model for use based upon it. This may be used to incorporate existing labeled data (for example, data that may have been created or labeled for other applications) and then apply unsupervised machine learning to expand upon it without further manual operation.
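The partially unsupervised arrangement described above can be sketched with a single pseudo-labeling round: a few labeled seed points define class centroids, each unlabeled point is assigned the label of its nearest centroid, and the centroids are then refit on the expanded data, all without further manual labeling. The one-dimensional data and the nearest-centroid rule are illustrative assumptions:

```python
def centroid(points):
    return sum(points) / len(points)

# Seed: two manually labeled points per class (1-D features for brevity)
labeled = {"low": [1.0, 2.0], "high": [8.0, 9.0]}
centroids = {c: centroid(p) for c, p in labeled.items()}

# Unlabeled pool: pseudo-label each point by its nearest class centroid
unlabeled = [0.5, 1.5, 3.0, 7.0, 8.5, 10.0]
for x in unlabeled:
    label = min(centroids, key=lambda c: abs(x - centroids[c]))
    labeled[label].append(x)

# Refit the centroids on the expanded, pseudo-labeled data
centroids = {c: centroid(p) for c, p in labeled.items()}
```

Real systems would iterate this and guard against confidently wrong pseudo-labels; one round suffices to show the mechanism.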

The system 200 may also be configured to allow certain bias in the data to remain intact. In some cases bias, such as bias in the geographic origin of data, may need to be preserved in order to comply with local laws. For example, some countries may have laws in place that restrict the transfer of data to certain other countries. By taking the bias in the geographic origin of the data into account, that bias can be used to classify incoming update reports. The update reports may then be used to train a geographically restricted distributable model without including data originating from restricted countries, while still including data from non-restricted countries.
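The report-routing idea above can be sketched as a filter over incoming update reports: each report carries its geographic origin, and a geographically restricted model folds in only reports from permitted countries. The field names, placeholder country codes, and additive update rule are illustrative assumptions:

```python
RESTRICTED = {"XX"}  # placeholder code for a country barred by local law

reports = [
    {"origin": "DE", "delta": 0.2},
    {"origin": "XX", "delta": 0.9},   # must not reach the restricted model
    {"origin": "FR", "delta": -0.1},
]

def admissible(report):
    """An update report is usable only if its origin is not restricted."""
    return report["origin"] not in RESTRICTED

usable = [r for r in reports if admissible(r)]
model_update = sum(r["delta"] for r in usable)
```

Classifying by origin happens at the report level, so the restricted model never sees the excluded data even indirectly.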

To provide another exemplary application, crime reports and data distributed across multiple municipal police departments may contain information on new crime trends that would be relevant to a crime prediction model. However, the data may contain sensitive, non-public information that must remain within the municipality. With the system described herein, the data can have its sensitive information removed, be used to train an instance of the crime prediction model, and thereby have its relevant content indirectly incorporated into the crime prediction model. This can be done without fear of leaking sensitive information, since the actual data never leaves the municipality's computer systems.
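The sanitization step in this example can be sketched as stripping sensitive fields from each record before local training, so that the raw records never leave the municipality's systems. The field names are illustrative assumptions, not a prescribed schema:

```python
# Fields assumed sensitive for this sketch; a real deployment would derive
# this set from policy and applicable regulation.
SENSITIVE_FIELDS = {"victim_name", "address", "officer_notes"}

def sanitize(record):
    """Return a copy of the record with all sensitive fields removed."""
    return {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}

records = [
    {"crime_type": "burglary", "hour": 23, "victim_name": "J. Doe",
     "address": "12 Main St"},
    {"crime_type": "fraud", "hour": 14, "officer_notes": "confidential"},
]

# Only the sanitized view is ever used to train the local model instance.
training_set = [sanitize(r) for r in records]
```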

It should be understood that the electronic devices 220[a-n] need not all be connected to the model source 201 at the same time, and that any number of devices may be actively connected to, and interacting with, the model source 201 at a given time. Further, the refinement of the distributable model, the refinement of the instance models, and the back-and-forth communication need not run continuously, and may take place only as required. Furthermore, FIG. 2 shows only one example for the sake of simplicity; in practice, the system 200 may be one part of a potentially larger system with multiple distributable model sources, or a distributable model source may store and serve multiple distributable models. The electronic devices 220[a-n] are not limited to interacting with only one distributable model source, and may connect to and interact with as many distributable model sources as required.

Because the biases contained within data may themselves carry valuable information, or yield valuable insight when applied to the right model, it may be prudent to reserve such data for use with a bias-specific model.

FIG. 3 is a block diagram of an exemplary system 300 that can utilize biases contained in data distributed among various sectors to refine a bias-specific distributable model, in accordance with various embodiments of the invention. The system 300 may include a distributable model source 301 and a group of sectors 302, which may include a plurality of sectors 325 a-h. Model source 301 may be a system similar to model source 201, including a source of local data 310 and a source of synthetic data 315 that may be used in an equivalent manner; however, model source 301 may serve instances of two different models: a generalized distributable model 305 and a bias-specific distributable model 320. The generalized distributable model 305 may be a model trained using data distributed across the sectors in group 302, where the biases within the data have been weighted and corrected, similar to distributable model 205. The bias-specific distributable model 320 may be a model in which any biases contained within the data from each sector 325 a-h are deliberately taken into account.

Each of sectors 325 a-h may be a group of devices with distributed data, similar to electronic devices 220 a-n in FIG. 2, that has been assigned a classification, such as devices within a certain area. However, sectors are not limited to geographic classifications. To give some non-geographic examples, sectors 325 a-h may be devices categorized by age group, income category, interest group, and the like. Devices within any of sectors 325 a-h may obtain instances of the generalized distributable model and the bias-specific model. A device may then train the instance models with its own stored data, processed in the manner of the method described in FIG. 7 below, ultimately leading to improvements in both the generalized model and the bias-specific model.

It should be understood that, similar to the system 200 shown in FIG. 2, the embodiment shown in system 300 illustrates only one possible arrangement in which model sources and data sources may be distributed, chosen here for simplicity. In other embodiments, there may be more or fewer sectors in a group, there may be multiple groups of sectors, or there may be multiple models of each kind, both generalized and bias-specific. See the discussion of FIG. 4 below for an example with multiple levels, each level having its own group of sectors.

FIG. 4 is a block diagram of an exemplary hierarchy 400 in which each level may have its own set of distributable models, in accordance with various embodiments of the invention. At the highest level of the hierarchy 400 is a global level 405, which may contain all of the lower levels of the hierarchy 400. Beneath the global level 405 may be a plurality of continent-level 410 sectors, which in turn may contain all levels beneath them. Continuing this pattern, below the continent level may be a number of country-level 415 sectors, beneath which there may be a number of state-level or province-level 420 sectors, which lead to county-level 425 sectors, then city-level 430 sectors, then street-level 435 sectors. Each sector at each level may have a system as described in FIG. 3, with a bias-specific distributable model that exploits the biases in the data distributed within that sector, and a generalized distributable model whose biases have been weighted and corrected to some extent. A model may also be shared between levels if, for example, a particular model at one level is applicable to another level. Nor are the models and data collected within a level limited to use in adjacent levels; for example, by weighting and correcting biases, data collected at the county level may also be relevant to models at the global level as well as models at the street level.

For example, the business offices of a global company may be spread around the globe. The company may have a global office responsible for managing offices across a set of business regions, such as the commonly used business regions Asia-Pacific, the USA, Latin America, Southeast Asia, and so on. Each of these business regions may have a regional office responsible for managing its own group of countries within the region. Each of these countries may have a national office responsible for managing its own group of states/provinces/regions. As shown in FIG. 4, there may be deeper, finer-grained subgroups, but for purposes of this example the deepest level used will be state/province/region.

The global office may have a system configured to use the business operating system 100 and may serve two distributable models: a generalized distributable model configured to analyze cleansed data from the distributed business regions on a global scale, and a bias-specific distributable model that exploits biases in the data distributed among the business regions. An example of a global model may be one that predicts environmental or global climate impact, while a business-region-specific model may be one that predicts business revenue based on regional data, such as each region's economy, development trends, or population. At the next level, a regional office may provide distributable models to the countries within its region: a bias-corrected distributable model for region-wide distributed data, and a bias-specific distributable model that exploits biases within country-level distributed data. These models may be similar to those served by the global office, but more region- and country-centric, e.g., regional environment and climate, and country-level business forecasts. At the next level, which is also the last level of this example, the national office may provide distributable models to the states/provinces/regions of the country, similar to the example models at the global and regional levels. Data distributed at the state level can be collected and used to train these models. It should be noted that the serving of models is divided across levels here only to simplify the example and does not represent any limitation of the invention. In other embodiments, for example, a single system at any level may serve all of the various models at every level of the hierarchy, or there may be dedicated servers serving the models in addition to the global corporate system used in this example.

Detailed description of exemplary feature aspects

FIG. 5 is a flow diagram illustrating a method 500 for cleansing and sanitizing data stored on an electronic device prior to training an instance of a distributable model, in accordance with various embodiments of the invention. In an initial step 505, biases found within the data are identified and corrected so that the data is more suitable for use with a generalized distributable model. Biases may include, but are not limited to, demographic or regional trends, language or ethnicity trends, and gender trends. The biases are intelligently weighted to provide useful insight without overfitting the data set. In some embodiments, bias data may be specifically identified and stored for use with other, more regionalized distributable models. At step 510, the data may be purged of personally identifying information and other sensitive information in order to maintain the privacy of users and systems. Such information may include bank information, medical history, addresses, personal history, and the like. The now cleansed and sanitized data may be marked for use as training data at step 515. At step 520, the data is formatted appropriately for use in training an instance model.
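Steps 505 and 510 of method 500 might be sketched as below. The record layout, the choice of `region` as the biased field, the set of sensitive keys, and the reweighting rule are all assumptions; a real deployment would rely on the business operating system's own bias-weighting and PII-detection components.

```python
# Hedged sketch of method 500 on tabular records held as dicts.
SENSITIVE_KEYS = {"name", "address", "bank_account", "medical_history"}


def correct_bias(records, key, target_share):
    """Step 505 (sketch): attach a sample weight so that no class of the
    biased field dominates training of the generalized model."""
    n = len(records)
    counts = {}
    for r in records:
        counts[r[key]] = counts.get(r[key], 0) + 1
    for r in records:
        r["weight"] = target_share * n / counts[r[key]]
    return records


def sanitize(records):
    """Step 510: purge personally identifying / sensitive fields."""
    return [{k: v for k, v in r.items() if k not in SENSITIVE_KEYS}
            for r in records]


data = [{"region": "north", "name": "Alice", "x": 1.0},
        {"region": "north", "name": "Bob", "x": 2.0},
        {"region": "south", "name": "Carol", "x": 3.0}]
clean = sanitize(correct_bias(data, "region", 0.5))
```

The over-represented "north" records receive a smaller weight (0.75) than the single "south" record (1.5), while names are stripped before the data is marked as training data (steps 515-520).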

It should be noted that method 500 outlines one method of cleansing and sanitizing data. In other embodiments, some steps in method 500 may be skipped, or other steps may be added. In addition to being used to train instance models and ultimately improve distributable models, data may be processed in a similar manner to become more generic and thus more suitable for transfer to models belonging to different domains. This allows previously collected data to be reused in a new model without the sometimes laborious re-collection of data formatted for the context of the new model's particular domain.

FIG. 6 is a flow diagram illustrating a method 600 for improving a distributable model on a device external to the model source, in accordance with various embodiments of the invention. In an initial step 605, an instance of a distributable model is shared from a distributable model source with a device. At step 610, data on the device is cleansed and sanitized; one such method is outlined in FIG. 5. At step 615, the data, now in a format suitable for training the generalized distributable model, may be used to train the shared model (i.e., the shared instance of the distributable model) on the device. At step 620, a report based on the updates to the shared model may be generated and transmitted to the distributable model source. One benefit of generating and transmitting a report, as opposed to transmitting the data itself, is efficiency: the report may be far smaller than all of the data that would otherwise be needed to improve the distributable model. Furthermore, because the raw data never needs to leave the device, this approach may help further ensure that sensitive information is not leaked. At step 625, the distributable model source uses the report to refine the distributable model.
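Method 600 can be sketched as a federated-learning-style loop in which only parameter deltas travel back to the model source. The one-parameter linear model, the learning rate, and the gradient step below are illustrative stand-ins for whatever distributable model is actually shared; none of them come from the patent itself.

```python
# Hedged sketch of method 600: train a local instance on-device, then
# report only the resulting parameter deltas, never the raw data.
def train_instance(weights, data, lr=0.1):
    """Step 615: one pass of gradient descent on local (sanitized) data
    for a one-parameter linear model y = w * x."""
    w = weights[0]
    for x, y in data:
        w -= lr * 2 * x * (w * x - y)  # d/dw of squared error
    return [w]


def make_update_report(old_weights, new_weights):
    """Step 620: the report carries only deltas, not the data itself."""
    return [n - o for o, n in zip(old_weights, new_weights)]


def apply_report(source_weights, report, step=1.0):
    """Step 625: the model source refines the distributable model."""
    return [w + step * d for w, d in zip(source_weights, report)]


shared = [0.0]                                  # instance shared in step 605
local = train_instance(shared, [(1.0, 2.0), (2.0, 4.0)])
report = make_update_report(shared, local)      # transmitted to the source
refined = apply_report([0.0], report)
```

The device's training data (pairs consistent with y = 2x) never leaves the device; the source sees only the single delta in `report`, which is typically far smaller than the data set that produced it.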

FIG. 7 is a flow diagram illustrating a method 700 for refining bias-specific and generalized distributable models using data on distributed devices, in accordance with various embodiments of the invention. In an initial step 705, the device obtains instances of two distributable models from a model source: a generalized distributable model and a bias-specific distributable model. At step 710, a first data set is created using data stored on the device, with the biases in the data intelligently identified and processed by the business operating system 100 for use with the bias-specific model. At step 715, a second data set is created using the data stored on the device, with the biases in the data intelligently weighted and corrected so that the data set is better suited to the generalized model. At step 720, the data sets are cleansed and sensitive information is removed. At step 725, the first and second data sets are used to train instances of their respective models: the first data set trains the instance of the bias-specific model, and the second data set trains the instance of the generalized model. At step 730, a report is generated based on the updates made to the instance of each model. After the reports are generated, the remaining steps are similar to the final steps of method 600, particularly steps 620 and 625: the generated reports are transmitted to the model source, and each report is used to refine its respective model.
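The construction of the two data sets in steps 710 and 715 might look as follows, assuming the bias of interest is a categorical "region" field. The field names and the weighting rule are illustrative assumptions, not part of the described system.

```python
# Sketch of steps 710-715 of method 700: the first data set keeps the
# bias intact for the bias-specific model; the second reweights it for
# the generalized model.
def first_data_set(records, bias_key):
    """Step 710: preserve the bias, tagging each record with its bias
    class so the bias-specific model can exploit it."""
    return [dict(r, bias_class=r[bias_key]) for r in records]


def second_data_set(records, bias_key):
    """Step 715: weight classes inversely to their frequency so that no
    single class dominates the generalized model."""
    counts = {}
    for r in records:
        counts[r[bias_key]] = counts.get(r[bias_key], 0) + 1
    return [dict(r, weight=1.0 / counts[r[bias_key]]) for r in records]


records = [{"region": "north", "x": 1}, {"region": "north", "x": 2},
           {"region": "south", "x": 3}]
ds1 = first_data_set(records, "region")    # trains the bias-specific model
ds2 = second_data_set(records, "region")   # trains the generalized model
```

After cleansing (step 720), `ds1` would train the bias-specific instance and `ds2` the generalized instance, each producing its own update report.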

FIG. 8 is a flow diagram illustrating a method 800 for using transfer learning to adapt a data model trained in a particular environment or application to target data or an application that may not share the same feature space or distribution. In an initial step 801, a first data set may be created from stored data, optionally with biases identified or removed according to one or more of the previously described feature aspects (see FIGS. 4-7). In a next step 802, the destination of the data set (e.g., an external application or service for which the data is intended, or another environment or department with which the data is shared) may be analyzed to identify any requirements on the data set, such as, for example, regulatory, content, or format restrictions to which the data should conform. In a next step 803, various transfer learning algorithms may be applied to the first data set, using machine learning to process the data and determine its suitability for the destination 804, identifying areas in which the data must be adapted to meet the requirements. Many transfer learning approaches may be used according to a particular arrangement, for example using probabilistic learning models such as Bayesian or Markov networks to properly handle any uncertainty that may exist in the data transfer. In a next step 805, the results of the transfer learning process may be used to create a second data set, generating a derived data set containing appropriately adapted data from the first data set such that the second data set is fully compatible with the destination. In a final step 806, the second data set may be provided to the data destination for use.
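A deliberately simplified version of method 800 is sketched below, with the destination's "requirements" reduced to a declared schema (allowed fields and expected units). A real system would use learned models, such as the Bayesian or Markov networks mentioned above, to judge suitability; the rule-based check here, along with every field name and unit, is an assumption made for illustration.

```python
# Sketch of method 800: analyze the destination's requirements, then
# derive a second data set that is fully compatible with them.
def analyze_destination(dest):
    """Step 802 (assumed shape): the destination declares its schema."""
    return dest["allowed_fields"], dest["unit"]


def adapt(first_set, dest):
    """Steps 803-805: drop disallowed fields and convert units so the
    derived second data set meets the destination's requirements."""
    allowed, unit = analyze_destination(dest)
    scale = 1000.0 if unit == "ms" else 1.0   # source assumed in seconds
    return [{k: (v * scale if k == "latency" else v)
             for k, v in r.items() if k in allowed}
            for r in first_set]


dest = {"allowed_fields": {"latency", "region"}, "unit": "ms"}
first = [{"latency": 0.25, "region": "north", "user_id": 7}]
second = adapt(first, dest)   # step 806: ready to hand to the destination
```

The disallowed `user_id` field is dropped and the latency converted from seconds to milliseconds, so `second` conforms to the destination's content and format restrictions.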

FIG. 9 is a flow diagram illustrating a method 900 of using a pre-trained model in transfer learning to provide a starting point for additional model training. In an initial step 901, a first data set may be created from stored data, optionally with biases identified or removed according to one or more of the previously described aspects (see FIGS. 4-7). In a next step 902, the destination of the data set (e.g., an external application or service for which the data is intended, or another environment or department with which the data is shared) may be analyzed to determine any requirements on the data set, such as, for example, regulatory, content, or format restrictions to which the data should conform. In a next step 903, a pre-trained model containing labeled data may be loaded, providing an initial model to serve as a starting point for unsupervised learning based on a portion of previously labeled data (as opposed to a traditional, fully supervised approach, in which all data must be manually labeled for machine learning operations). In a next step 904, unsupervised machine learning may then process the first data set according to the labeled data in the pre-trained model, building on that labeled data to develop a second data model 905 for transfer learning and domain adaptation.
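One minimal way to realize method 900 is sketched below: the "pre-trained model" is represented as per-class centroids learned from previously labeled data, and the new, unlabeled first data set is then absorbed by nearest-centroid self-labeling, a simple semi-supervised scheme standing in for the unsupervised step 904. The data, class names, and centroid representation are all invented for the sketch.

```python
# Hedged sketch of method 900: a pre-trained model (step 903) bootstraps
# learning over unlabeled data (step 904), yielding an adapted model (905).
def pretrain(labeled):
    """Step 903: per-class centroids from labeled (x, label) pairs."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {lab: sums[lab] / counts[lab] for lab in sums}


def adapt_unsupervised(centroids, unlabeled):
    """Step 904: self-label each point with its nearest centroid, then
    refit the centroids - the second, domain-adapted model (905)."""
    assigned = [(x, min(centroids, key=lambda c: abs(x - centroids[c])))
                for x in unlabeled]
    return pretrain(assigned)


base = pretrain([(0.0, "low"), (1.0, "low"), (9.0, "high")])
adapted = adapt_unsupervised(base, [0.2, 0.4, 8.0, 10.0])
```

The previously labeled data provides the starting point (`base`), so none of the first data set has to be manually labeled; the adapted centroids in `adapted` reflect the target domain's distribution.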

Hardware architecture

In general, the techniques disclosed herein may be implemented on hardware or a combination of software and hardware. For example, they may be implemented in an operating system kernel, in a separate user process, in a library package bound into network applications, on a specially constructed machine, on an Application Specific Integrated Circuit (ASIC), or on a network interface card.

A software/hardware hybrid implementation of at least some of the feature aspects disclosed herein may be implemented on a programmable network-resident machine (understood to include an intermittently connected network-aware machine) that is selectively activated or reconfigured by a computer program stored in memory. The network device may have multiple network interfaces that may be configured or designed to utilize different types of network communication protocols. A general architecture for some of the machines may be described herein to illustrate one or more exemplary means by which a given functional unit may be implemented. According to particular feature aspects, at least some features or functions of the various feature aspects disclosed herein may be implemented on one or more general purpose computers associated with one or more networks, such as, for example, end user computer systems, client computers, network servers or other server systems, mobile computing devices (e.g., tablets, mobile phones, smartphones, laptops or other suitable computing devices), consumer electronics, music players or any other suitable electronic device, router, switch or other suitable device, or any combination thereof. In at least some feature aspects, at least some features or functions of various feature aspects disclosed herein may be implemented in one or more virtual computing environments (e.g., a network computing cloud, virtual machines residing on one or more physical computing machines, or other suitable virtual environment).

Referring now to FIG. 10, there is shown a block diagram depicting an exemplary computing device 10 suitable for implementing at least a portion of the features or functionality disclosed herein. For example, the computing device 10 may be any of the computing machines listed in the preceding paragraphs, or indeed any other electronic device capable of executing software-or hardware-based instructions according to one or more programs stored in memory. Computing device 10 may be configured to communicate with a plurality of other computing devices, such as clients or servers, over a communication network, such as a wide area network, metropolitan area network, local area network, wireless network, the internet, or any other network, whether wireless or wired, using known protocols for such communication.

In one aspect, computing device 10 includes one or more Central Processing Units (CPUs) 12, one or more interfaces 15, and one or more buses 14 (e.g., a Peripheral Component Interconnect (PCI) bus). When acting under the control of appropriate software or firmware, the CPU12 may be responsible for implementing specific functions associated with the functions of a particular configured computing device or machine. For example, in at least one feature aspect, computing device 10 may be configured or designed as a server system that utilizes CPU12, local memory 11 and/or remote memory 16, and interface 15. In at least one aspect, the CPU12 may be caused to perform one or more different types of functions and/or operations under the control of software modules or components, which may include, for example, an operating system and any appropriate applications software, drivers, and the like.

The CPU 12 may include one or more processors 13, for example a processor from one of the Intel, ARM, Qualcomm, or AMD families of microprocessors. In some aspects, the processors 13 may include specially designed hardware, such as application-specific integrated circuits (ASICs), electrically erasable programmable read-only memories (EEPROMs), field-programmable gate arrays (FPGAs), and so forth, for controlling the operation of computing device 10. In certain feature aspects, local memory 11, such as non-volatile random access memory (RAM) and/or read-only memory (ROM), including for example one or more levels of cache memory, may also form part of CPU 12. However, there are many different ways in which memory may be coupled to system 10. Memory 11 may be used for a variety of purposes such as, for example, caching and/or storing data, programming instructions, and the like. It should further be appreciated that CPU 12 may be one of a variety of system-on-a-chip (SOC) type hardware that may include additional hardware such as memory or graphics processing chips, such as a QUALCOMM SNAPDRAGON™ or Samsung EXYNOS™ CPU, as is increasingly common in the art, for example in mobile devices or integrated devices.

As used herein, the term "processor" is not limited to just those integrated circuits referred to in the art as a processor, a mobile processor, or a microprocessor, but broadly refers to a microcontroller, a microcomputer, a programmable logic controller, an application specific integrated circuit, and any other programmable circuit.

In one aspect, the interfaces 15 are provided as network interface cards (NICs). Generally, NICs control the transmission and reception of data packets over a computer network; other types of interfaces 15 may, for example, support other peripheral devices used with computing device 10. Among the interfaces that may be provided are Ethernet interfaces, frame relay interfaces, cable interfaces, DSL interfaces, token ring interfaces, graphics interfaces, and the like. In addition, various types of interfaces may be provided, such as Universal Serial Bus (USB), Serial, Ethernet, FIREWIRE™, THUNDERBOLT™, PCI, parallel, radio frequency (RF), BLUETOOTH™, near-field communication (e.g., using near-field magnetics), 802.11 (WiFi), frame relay, TCP/IP, ISDN, fast Ethernet interfaces, gigabit Ethernet interfaces, Serial ATA (SATA) or external SATA (eSATA) interfaces, High-Definition Multimedia Interface (HDMI), Digital Visual Interface (DVI), analog or digital audio interfaces, Asynchronous Transfer Mode (ATM) interfaces, High-Speed Serial Interface (HSSI) interfaces, point-of-sale (POS) interfaces, Fiber Distributed Data Interface (FDDI), and the like. Generally, such interfaces 15 may include physical ports appropriate for communication with the appropriate media. In some cases, they may also include an independent processor (such as a dedicated audio or video processor, as is common in the art for high-fidelity A/V hardware interfaces) and, in some instances, volatile and/or non-volatile memory (e.g., RAM).

Although the system shown in fig. 10 illustrates one particular architecture of a computing device 10 for implementing one or more aspects described herein, it is by no means the only device architecture on which at least a portion of the features and techniques described herein may be implemented. For example, an architecture having one or any number of processors 13 may be used, and such processors 13 may be present in a single device or distributed across any number of devices. In one feature aspect, a single processor 13 handles both communication and routing computations, while in other feature aspects, separate dedicated communication processors may be provided. In various feature aspects, different types of features or functions may be implemented in the system, depending on the feature aspects including the client device (e.g., a tablet device or smartphone running client software) and the server system (e.g., the server system described in more detail below).

Regardless of network device configuration, the system of feature aspects may employ one or more memories or memory modules (e.g., remote memory block 16 and local memory 11) configured to store data, program instructions for the general-purpose network operations, or other information relating to the functionality of the aspects described herein (or any combination of the above). For example, the program instructions may control the execution of or include an operating system and/or one or more application programs. Memory 16 or memories 11, 16 may also be configured to store data structures, configuration data, encryption data, historical system operation information, or any other specific or general non-program information described herein.

Because such information and program instructions may be employed to implement one or more systems or methods described herein, at least some network device aspects may include non-transitory machine-readable storage media, which may be configured or designed, for example, to store program instructions, state information, and the like for performing the various operations described herein. Examples of such non-transitory machine-readable storage media include, but are not limited to, magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as optical disks; hardware devices specially configured to store and execute program instructions, such as read-only memory devices (ROM), flash memory (as is common in mobile devices and integrated systems), solid-state drives (SSDs) and "hybrid SSD" storage drives that may combine the physical components of a solid-state drive and a hard disk drive in a single hardware device (as is becoming increasingly common in the art for personal computers); and random access memory (RAM). It should be understood that such storage devices may be integral and non-removable (e.g., RAM hardware modules that may be soldered onto a motherboard or otherwise integrated into an electronic device), or they may be removable, such as swappable flash memory modules (e.g., "thumb drives" or other removable media designed for rapidly swapping physical storage devices), "hot-swappable" hard or solid-state drives, removable optical disks, or other such removable media; such integral and removable storage media may be used interchangeably. Examples of program instructions include both object code, such as that produced by a compiler; machine code, such as that produced by an assembler or a linker; bytecode, such as that generated by a JAVA™ compiler, which may be executed using a Java virtual machine or equivalent; and files containing higher-level code that may be executed by an interpreter (e.g., scripts written in Python, Perl, Ruby, Groovy, or any other scripting language).

In certain aspects, the system may be implemented on a standalone computing system. Referring now to FIG. 11, there is shown a block diagram depicting a typical exemplary architecture of one or more aspects or components thereof on a standalone computing system. Computing device 20 includes a processor 21 that may run software, such as a client application 24, that carries out one or more functions or aspects. Processor 21 may execute computing instructions under control of an operating system 22 such as, for example, a version of the MICROSOFT WINDOWS™ operating system, the APPLE macOS™ or iOS™ operating systems, some variety of the Linux operating system, a version of the ANDROID™ operating system, and the like. In many cases, one or more shared services 23 may run in system 20 and may be used to provide common services to client applications 24. Services 23 may be, for example, WINDOWS™ services, user-space common services in a Linux environment, or any other type of common service architecture used with operating system 21. Input devices 28 may be of any type suitable for receiving user input, including, for example, a keyboard, a touch screen, a microphone (e.g., for voice input), a mouse, a touch pad, a trackball, or any combination thereof. Output devices 27 may be of any type suitable for providing output to one or more users, whether remote or local to system 20, and may include, for example, one or more screens for visual output, speakers, printers, or any combination thereof. Memory 25 may be random-access memory of any structure and architecture known in the art, for use by processor 21, for example to run software. Storage devices 26 may be any magnetic, optical, mechanical, memristor, or electrical storage device for storing data in digital form (such as those described above with reference to FIG. 10). Examples of storage devices 26 include flash memory, magnetic hard disks, CD-ROMs, and the like.

In some aspects, the system may be implemented on a distributed computing network, such as one having any number of clients and/or servers. Referring now to FIG. 12, there is shown a block diagram depicting an exemplary architecture 30 for implementing at least a portion of a system on a distributed computing network, according to one aspect. According to this feature aspect, any number of clients 33 may be provided. Each client 33 may run software for implementing the client portion of the system; clients may comprise a system 20 such as that illustrated in FIG. 11. In addition, any number of servers 32 may be provided for handling requests received from one or more clients 33. Clients 33 and servers 32 may communicate with one another via one or more electronic networks 31, which may be, in various feature aspects, the Internet, a wide area network, a mobile telephony network (such as a CDMA or GSM cellular network), a wireless network (such as WiFi, WiMAX, LTE, and so forth), or a local area network (or indeed any network topology known in the art; no one network topology is preferred over any other in the feature aspects). Networks 31 may be implemented using any known network protocols, including for example wired and/or wireless protocols.

Further, in some aspects, server 32 may invoke external services 37 to obtain additional information, or reference additional data about a particular call, if desired. For example, communication with external services 37 may be via one or more networks 31. In various feature aspects, the external services 37 may comprise network-enabled services or functions associated with the hardware device itself or installed on the hardware device itself. For example, in one aspect in which the client application 24 is implemented on a smartphone or other electronic device, the client application 24 may retrieve information stored on a server system 32 in the cloud or on external services 37 deployed on one or more sites of a particular enterprise or user.

In some aspects, clients 33 or servers 32 (or both) may make use of one or more dedicated services or devices that may be deployed locally or remotely across one or more networks 31. For example, one or more databases 34 may be used or referred to by one or more aspects. It should be understood by one having ordinary skill in the art that databases 34 may be arranged in a wide variety of architectures and may use a wide variety of data access and manipulation means. For example, in various aspects, one or more databases 34 may comprise a relational database system using a structured query language (SQL), while others may comprise alternative data storage technologies known in the art as "NoSQL" (e.g., HADOOP CASSANDRA™, GOOGLE BIGTABLE™, and so forth). In some feature aspects, variant database architectures may be used according to the aspect, such as column-oriented databases, in-memory databases, clustered databases, distributed databases, or even flat-file data repositories. It will be appreciated by one having ordinary skill in the art that any combination of known or future database technologies may be used as appropriate, unless a specific database technology or a specific arrangement of components is specified for a particular aspect described herein. Further, it should be appreciated that the term "database" as used herein may refer to a physical database machine, a cluster of machines acting as a single database system, or a logical database within an overall database management system. Unless a specific meaning is specified for a given use of the term "database," it should be construed to mean any of these senses of the word, all of which are understood as a plain meaning of the term "database" by those having ordinary skill in the art.

Similarly, some aspects may employ one or more security systems 36 and configuration systems 35. Security and configuration management are common Information Technology (IT) and network functions, some of which are typically associated with any IT or network system. It should be understood by one of ordinary skill in the art that any configuration or security subsystem known in the art now or in the future may be used with the aspects without limitation, unless the description of any particular aspect specifically requires a particular security 36 or configuration system 35 or method.

FIG. 13 shows an exemplary overview of a computer system 40 that may be used in any of the various locations throughout the system. It is exemplary of any computer that may execute code to process data. Various modifications and changes may be made to computer system 40 without departing from the broader scope of the systems and methods disclosed herein. A central processing unit (CPU) 41 is connected to bus 42, to which are also connected memory 43, non-volatile memory 44, display 47, input/output (I/O) unit 48, and network interface card (NIC) 53. I/O unit 48 may typically be connected to a keyboard 49, pointing device 50, hard disk drive 52, and real-time clock 51. NIC 53 connects to network 54, which may or may not be the Internet, and which itself may or may not have a connection to the Internet. Also shown as part of system 40 in this example is a power supply unit 45 connected to a main alternating current (AC) supply 46. Not shown are batteries that may be present, and many other devices and modifications that are well known but are not applicable to the specific novel functions of the present systems and methods disclosed herein. It should be appreciated that some or all of the components illustrated may be combined, such as in various integrated applications, for example in a QUALCOMM or SAMSUNG system-on-a-chip (SOC) device, or whenever it may be appropriate to combine multiple capabilities or functions into a single hardware device (for instance, in mobile devices such as smartphones, video game consoles, in-vehicle computer systems such as navigation or automotive multimedia systems, or other integrated hardware devices).

In various feature aspects, the functionality of a system or method implementing the various aspects may be distributed among any number of client and/or server components. For example, various software modules may be implemented to perform various functions associated with the system in any particular feature aspect, and may be implemented differently to run on server and/or client components.

Those skilled in the art will appreciate the scope of possible modifications to the various feature aspects described above. Therefore, the invention is defined by the claims and their equivalents.
