Method and apparatus for training deep learning models
Reading note: this technique, "Method and apparatus for training deep learning models," was designed and created by He Tianjian, Liu Yi, Dong Daxiang, Ma Yanjun, and Yu Dianhai on 2019-11-25. Its main content is as follows. Embodiments of the present disclosure disclose a method and apparatus for training deep learning models. One embodiment of the method comprises: obtaining model description information and configuration information of a deep learning model; dividing the model description information into at least two segments according to the dividing point variable in the configuration information, and loading the model description information onto the corresponding resources for operation; inputting a batch of training samples into the resource corresponding to the first segment of the model description information, starting training, and using the obtained context information as the input of the resource corresponding to the next segment; repeating these steps until the operation result of the resource corresponding to the last segment is obtained; if the training completion condition is met, outputting the trained deep learning model; otherwise, continuing to obtain the next batch of training samples and performing the training steps until the condition is met. This embodiment realizes free collocation of heterogeneous devices, fully exerts the computing power of different computing devices, and improves training speed.
1. A method for training a deep learning model, comprising:
obtaining model description information and configuration information of a deep learning model, wherein the model description information comprises variables and operations, and the configuration information comprises a dividing point variable and names of the resources allocated to each segment;
dividing the model description information into at least two segments according to the dividing point variable in the configuration information, and loading the model description information onto the corresponding resources for operation according to the names of the resources allocated to the segments;
performing the following training steps: acquiring a batch of training samples, inputting the batch of training samples into the resource corresponding to the first segment of the model description information, starting training, and storing the obtained intermediate result in first context information; inputting the first context information into the resource corresponding to the next segment of the model description information to obtain second context information; repeating the steps until the operation result of the resource corresponding to the last segment of the model description information is obtained; and if a training completion condition is met, outputting the trained deep learning model;
otherwise, continuing to obtain training samples of the next batch and performing the training steps until the training completion condition is met.
2. The method of claim 1, wherein the configuration information further includes a proportion of resources allocated for each segment; and
the loading the model description information onto the corresponding resources for operation according to the names of the resources allocated to each segment comprises:
calculating the quantity of each resource to be allocated according to the proportion of resources allocated to each segment;
and loading the model description information onto the corresponding resources for operation according to the names and quantities of the resources allocated to the segments.
3. The method of claim 1, wherein the dividing model description information into at least two segments according to a dividing point variable in the configuration information comprises:
determining the forward part ending at the dividing point variable as a first segment;
determining, as a second segment, the remaining forward part starting from the dividing point variable plus the reverse part from the loss to the gradient variable corresponding to the dividing point variable;
and determining the remaining reverse part starting from the gradient variable corresponding to the dividing point variable as a third segment.
4. The method of claim 1, wherein the obtaining model description information and configuration information of the deep learning model comprises:
determining, according to the operations in the model description information, a transition variable between memory-interaction-intensive operations and compute-intensive operations as the dividing point variable;
allocating the memory-interaction-intensive operations to a CPU for execution;
and allocating the compute-intensive operations to a GPU for execution.
5. The method of claim 1, wherein the method further comprises:
splitting the training samples into a predetermined number of parts;
training with each part of the training samples to obtain a group of deep learning model parameters;
and synchronizing the parameters of each group of deep learning models once every certain number of rounds.
6. The method of any of claims 1-5, wherein the context information is passed through a queue.
7. An apparatus for training a deep learning model, comprising:
an acquisition unit configured to acquire model description information and configuration information of a deep learning model, wherein the model description information comprises variables and operations, and the configuration information comprises a dividing point variable and names of the resources allocated to each segment;
a segmentation unit configured to divide the model description information into at least two segments according to the dividing point variable in the configuration information, and load the model description information onto the corresponding resources for operation according to the names of the resources allocated to the segments;
a training unit configured to perform the following training steps: acquiring a batch of training samples, inputting the batch of training samples into the resource corresponding to the first segment of the model description information, starting training, and storing the obtained intermediate result in first context information; inputting the first context information into the resource corresponding to the next segment of the model description information to obtain second context information; repeating the steps until the operation result of the resource corresponding to the last segment of the model description information is obtained; and if a training completion condition is met, outputting the trained deep learning model;
and an iteration unit configured to, if the training completion condition is not met, continue to acquire training samples of the next batch and perform the training steps until the training completion condition is met.
8. The apparatus of claim 7, wherein the configuration information further comprises a proportion of resources allocated for each segment; and
the segmentation unit is further configured to:
calculate the quantity of each resource to be allocated according to the proportion of resources allocated to each segment;
and load the model description information onto the corresponding resources for operation according to the names and quantities of the resources allocated to the segments.
9. The apparatus of claim 7, wherein the segmentation unit is further configured to:
determine the forward part ending at the dividing point variable as a first segment;
determine, as a second segment, the remaining forward part starting from the dividing point variable plus the reverse part from the loss to the gradient variable corresponding to the dividing point variable;
and determine the remaining reverse part starting from the gradient variable corresponding to the dividing point variable as a third segment.
10. The apparatus of claim 7, wherein the acquisition unit is further configured to:
determine, according to the operations in the model description information, a transition variable between memory-interaction-intensive operations and compute-intensive operations as the dividing point variable;
allocate the memory-interaction-intensive operations to a CPU for execution;
and allocate the compute-intensive operations to a GPU for execution.
11. The apparatus of claim 7, wherein the apparatus further comprises a merging unit configured to:
split the training samples into a predetermined number of parts;
train with each part of the training samples to obtain a group of deep learning model parameters;
and synchronize the parameters of each group of deep learning models once every certain number of rounds.
12. The apparatus according to one of claims 7-11, wherein the context information is passed through a queue.
13. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
14. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-6.
Technical Field
Embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method and an apparatus for training a deep learning model.
Background
As deep learning models develop toward deeper layers, wider representations, and more complex structures, the GPU (Graphics Processing Unit), with its high computational efficiency, has become an indispensable computing resource in the field. Common parallel schemes fall into two categories: model parallelism and data parallelism.
Model parallelism distributes the model parameters across different devices, each device maintaining a part of the parameters. Because the computation on one device depends on the computation context of the previous device, a model-parallel pipeline improves the utilization of computing devices (such as GPUs) by splitting a large batch of data into several small batches, so that the computations on different small batches can execute concurrently on multiple devices. Model-parallel pipelines can be further divided into synchronous and asynchronous variants. In the synchronous mode, every device blocks on the context required for backward computation until all small-batch forward tasks are finished, or waits for a synchronous parameter update after all backward computation completes, so device utilization is insufficient. In the asynchronous mode, computations on different large batches can proceed at the same time, and the backward computation and parameter updates of different small batches are moved forward as far as possible. However, this scheme still cannot fully utilize the devices with higher computing power when the computing power required by each stage is unequal.
Data parallelism is another parallel scheme. A data-parallel pipeline assigns computing devices to different batches of data; the computation across devices is naturally concurrent, so device utilization is high. However, this scheme cannot fully utilize heterogeneous devices (all data streams can run on only a single kind of device), let alone support proportioning the computing resources of different devices.
Disclosure of Invention
Embodiments of the present disclosure propose methods and apparatuses for training deep learning models.
In a first aspect, an embodiment of the present disclosure provides a method for training a deep learning model, including: obtaining model description information and configuration information of a deep learning model, wherein the model description information comprises variables and operations, and the configuration information comprises a dividing point variable and names of the resources allocated to each segment; dividing the model description information into at least two segments according to the dividing point variable in the configuration information, and loading the model description information onto the corresponding resources for operation according to the names of the resources allocated to the segments; and performing the following training steps: acquiring a batch of training samples, inputting the batch of training samples into the resource corresponding to the first segment of the model description information, starting training, and storing the obtained intermediate result in first context information; inputting the first context information into the resource corresponding to the next segment of the model description information to obtain second context information; repeating the steps until the operation result of the resource corresponding to the last segment of the model description information is obtained; if a training completion condition is met, outputting the trained deep learning model; otherwise, continuing to obtain training samples of the next batch and performing the training steps until the training completion condition is met.
In some embodiments, the configuration information further includes a proportion of resources allocated to each segment; and the loading the model description information onto the corresponding resources for operation according to the names of the resources allocated to each segment comprises: calculating the quantity of each resource to be allocated according to the proportion of resources allocated to each segment; and loading the model description information onto the corresponding resources for operation according to the names and quantities of the resources allocated to the segments.
In some embodiments, dividing the model description information into at least two segments according to the dividing point variable in the configuration information includes: determining the forward part ending at the dividing point variable as a first segment; determining, as a second segment, the remaining forward part starting from the dividing point variable plus the reverse part from the loss to the gradient variable corresponding to the dividing point variable; and determining the remaining reverse part starting from the gradient variable corresponding to the dividing point variable as a third segment.
In some embodiments, obtaining the model description information and configuration information of the deep learning model comprises: determining, according to the operations in the model description information, a transition variable between memory-interaction-intensive operations and compute-intensive operations as the dividing point variable; allocating the memory-interaction-intensive operations to a CPU for execution; and allocating the compute-intensive operations to a GPU for execution.
In some embodiments, the method further comprises: splitting the training samples into a predetermined number of parts; training with each part of the training samples to obtain a group of deep learning model parameters; and synchronizing the parameters of each group of deep learning models once every certain number of rounds.
In some embodiments, the context information is passed through a queue.
In a second aspect, an embodiment of the present disclosure provides an apparatus for training a deep learning model, including: an acquisition unit configured to acquire model description information and configuration information of a deep learning model, wherein the model description information comprises variables and operations, and the configuration information comprises a dividing point variable and names of the resources allocated to each segment; a segmentation unit configured to divide the model description information into at least two segments according to the dividing point variable in the configuration information, and load the model description information onto the corresponding resources for operation according to the names of the resources allocated to the segments; a training unit configured to perform the following training steps: acquiring a batch of training samples, inputting the batch of training samples into the resource corresponding to the first segment of the model description information, starting training, and storing the obtained intermediate result in first context information; inputting the first context information into the resource corresponding to the next segment of the model description information to obtain second context information; repeating the steps until the operation result of the resource corresponding to the last segment of the model description information is obtained; and if a training completion condition is met, outputting the trained deep learning model; and an iteration unit configured to, if the training completion condition is not met, continue to acquire training samples of the next batch and perform the training steps until the training completion condition is met.
In some embodiments, the configuration information further includes a proportion of resources allocated to each segment; and the segmentation unit is further configured to: calculate the quantity of each resource to be allocated according to the proportion of resources allocated to each segment; and load the model description information onto the corresponding resources for operation according to the names and quantities of the resources allocated to the segments.
In some embodiments, the segmentation unit is further configured to: determine the forward part ending at the dividing point variable as a first segment; determine, as a second segment, the remaining forward part starting from the dividing point variable plus the reverse part from the loss to the gradient variable corresponding to the dividing point variable; and determine the remaining reverse part starting from the gradient variable corresponding to the dividing point variable as a third segment.
In some embodiments, the acquisition unit is further configured to: determine, according to the operations in the model description information, a transition variable between memory-interaction-intensive operations and compute-intensive operations as the dividing point variable; allocate the memory-interaction-intensive operations to a CPU for execution; and allocate the compute-intensive operations to a GPU for execution.
In some embodiments, the apparatus further comprises a merging unit configured to: split the training samples into a predetermined number of parts; train with each part of the training samples to obtain a group of deep learning model parameters; and synchronize the parameters of each group of deep learning models once every certain number of rounds.
In some embodiments, the context information is passed through a queue.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; and a storage device having one or more programs stored thereon which, when executed by the one or more processors, cause the one or more processors to implement the method as described in the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method as described in the first aspect.
The method and apparatus for training a deep learning model provided by embodiments of the present disclosure offer an asynchronous pipeline framework that realizes free collocation of heterogeneous computing devices (not limited to CPUs, GPUs, network cards, and the like; more precisely, the computation is carried by operations), so as to fully exploit the characteristics of different devices. For example, the computation and update of an Embedding Lookup are performed on the CPU, whose memory interaction is faster, while compute-intensive operations such as matrix multiplication are performed on the GPU, whose arithmetic is faster. In addition, by allocating computing resources in different proportions to operations with different characteristics and executing them asynchronously and concurrently, the computing power of different devices can be fully exerted and the overall throughput improved.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for training a deep learning model according to the present disclosure;
FIGS. 3a and 3b are schematic diagrams of an application scenario of a method for training a deep learning model according to the present disclosure;
FIG. 4 is a flow diagram of yet another embodiment of a method for training a deep learning model according to the present disclosure;
FIG. 5 is a schematic illustration of yet another application scenario of a method for training a deep learning model according to the present disclosure;
FIG. 6 is a schematic structural diagram of one embodiment of an apparatus for training a deep learning model according to the present disclosure;
FIG. 7 is a schematic block diagram of a computer system suitable for use with an electronic device implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the relevant invention and do not restrict it. It should also be noted that, for convenience of description, only the portions related to the relevant invention are shown in the drawings.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture to which embodiments of the method or apparatus for training a deep learning model of the present disclosure may be applied.
As shown in Fig. 1, the system architecture may include terminal devices, a network, and a server. The network serves as a medium providing communication links between the terminal devices and the server.
The user may use the terminal devices to interact with the server through the network, for example to submit a training request carrying the model description information, configuration information, and training samples of a deep learning model.
The terminal devices may be hardware or software; when hardware, they may be various electronic devices capable of submitting such training requests.
The server may be a server that provides support for the training request, for example one that trains the deep learning model as described below and returns the result.
The computing units of the server may include a CPU, a GPU, and a network card:
the CPU, being general-purpose, is adept at flow control and logic processing, particularly for irregular data structures or unpredictable memory access patterns. In deep learning tasks, the CPU is generally responsible for loading, preprocessing, and dumping data, launching data transfers and function calls on the GPU, starting network transmission, and the like;
the GPU has many cores and is therefore more adept at data-parallel computation; particularly for regular data structures with predictable memory access patterns, it has a great speed advantage. In deep learning tasks, the GPU is therefore generally responsible for computation and is the most critical element in deep learning;
the network card is responsible for uploading and downloading of data and models and communication in distributed training;
Obviously, to improve GPU utilization, operations with low GPU utilization should on the one hand be moved to the CPU for execution, and on the other hand the computations of the different devices should be executed concurrently, so as to avoid GPU idleness caused by serial computation among the CPU, the GPU, and the network card.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., multiple pieces of software or software modules used to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be noted that the method for training a deep learning model provided by the embodiments of the present disclosure is generally performed by the server; accordingly, the apparatus for training a deep learning model is generally provided in the server.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow of one embodiment of a method for training a deep learning model according to the present disclosure is shown. The method comprises the following steps.
In this embodiment, an executing agent of the method for training the deep learning model (e.g., the server shown in Fig. 1) may receive, through a wired or wireless connection, a training request from a terminal with which a user performs model training. The training request may include the model description information and configuration information of the deep learning model, and may also include training samples. The model description information includes variables and operations, and the configuration information includes a dividing point variable and the names of the resources allocated to each segment. Some terminology first: a complete deep learning model is described by a Program (the model description information), which mainly contains two kinds of entities, operations (OPs) and Variables. For example, a fully connected operation or a lookup-embedding operation can each be regarded as an OP, while the parameters of the network and the intermediate representations of various data are described by Variables. The model description information may be written in Python; the back end converts it into an executable C++ program that runs on the specified resources (CPU or GPU).
In this embodiment, the configuration information includes the dividing point variable and the names of the resources allocated to each segment. Through the configuration information, the user sets where the model description information is cut and on which device each cut segment runs. The cut-point variable may be chosen manually, or a program may pick the variable at the junction of the forward computation and the loss computation as the cut point. In normal training, a Program runs as a whole on some device (e.g., a GPU or CPU). Pipeline parallelism, by contrast, supports cutting a Program into at least two pieces (called Sections), and each Section can specify the device on which it runs; the idea of model parallelism is thus continued here. Specifically, the user may specify pipeline cut points in the custom forward Program (each cut point is a Variable, or a list of Variables, i.e., there may be multiple cut-point variables), and the server divides the complete Program, including the forward, backward, and optimization parts, into multiple Sections according to these cut points; the Program of each resulting Section describes its own computation. The forward part corresponds to forward propagation in neural network training, the backward (reverse) part to back propagation, and a further part computes the loss value. The cut can be made directly at the cut-point variable, and the gradient variable corresponding to the cut-point variable can automatically be set as another cut point. That is, one cut-point variable yields three segments; if K cut-point variables are set, 2K+1 segments can be cut.
Fig. 3a depicts a complete example. A complete Program comprises two parts, forward (left) and reverse (right). The user specifies cut_var as the cut point, so the pipeline framework cuts the whole Program into three parts:
1. The forward portion ending with cut_var (Section 0 in the figure);
2. The remaining forward portion starting from cut_var, plus the reverse portion from the loss down to cut_var@GRAD (Section 1 in the figure);
3. The remaining reverse portion after cut_var@GRAD (Section 2 in the figure).
The model description information can be written in Python and then converted into an executable program, such as C++, to run on the allocated resources.
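The three-way cut described above can be sketched as a generic splitting routine: a new section opens whenever an op consumes a cut-point variable (including the automatically added gradient cut point cut_var@GRAD) that an earlier section produced. The op and variable representation below is an illustrative simplification, not the actual Paddle Program format.

```python
def split_into_sections(ops, cut_vars):
    """Split an ordered op list into pipeline sections. A new section starts
    when an op consumes a cut-point variable produced by the ops before it."""
    sections, current, produced = [], [], set()
    pending_cuts = set(cut_vars)
    for op in ops:
        if any(v in pending_cuts and v in produced for v in op["inputs"]):
            sections.append(current)
            current = []
            pending_cuts -= set(op["inputs"])
        current.append(op)
        produced.update(op["outputs"])
    sections.append(current)
    return sections

# Toy forward+backward program cut at cut_var; the framework also cuts at
# the corresponding gradient variable cut_var@GRAD, giving 2K+1 = 3 sections.
ops = [
    {"name": "embedding", "inputs": ["x"], "outputs": ["cut_var"]},
    {"name": "fc", "inputs": ["cut_var"], "outputs": ["pred"]},
    {"name": "loss_grad", "inputs": ["pred", "label"],
     "outputs": ["loss", "cut_var@GRAD"]},
    {"name": "embedding_grad", "inputs": ["cut_var@GRAD"],
     "outputs": ["x@GRAD"]},
]
sections = split_into_sections(ops, {"cut_var", "cut_var@GRAD"})
```

With one user cut point this reproduces the three parts of Fig. 3a: the embedding forward alone, the remaining forward plus the backward down to cut_var@GRAD, and the remaining backward.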
In this embodiment, considering the universality of the Paddle deep learning framework, the basic concepts of the Paddle framework are adopted, and the function is realized by modestly modifying and extending some entities. Configuration information specified by the user when describing the network at the Python end, such as the Program cut points and the device type and parallelism of each Section, is passed to the back-end (C++) training engine in proto format, so that the back end can obtain the configuration at runtime and perform the subsequent series of operations such as initialization and computation.
A brief introduction will be made to the Scope, a concept frequently used hereinafter. Scope is an important general concept in Paddle, used to store the context information of a batch-sample computation, such as intermediate variables. If the same Program runs in several different Scopes, the Scopes keep the variables isolated from one another without interference. In the design of pipeline parallelism, a Scope is the communication entity passed between adjacent Sections: Scopes are created uniformly at the start of the whole program and are passed through the Sections in order during execution. The ScopeQueue mentioned below is the queue that holds Scopes.
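The isolation property can be illustrated with a toy stand-in for Scope (the real Paddle class differs; names here are illustrative only): the same program run against two different scopes leaves their variables independent.

```python
class Scope:
    """Toy per-batch variable store illustrating Scope isolation."""

    def __init__(self):
        self._vars = {}

    def set(self, name, value):
        self._vars[name] = value

    def get(self, name):
        return self._vars[name]

# The "same program" run in two different scopes keeps variables isolated.
def program(scope, x):
    scope.set("hidden", x * 2)   # an intermediate variable of this batch

a, b = Scope(), Scope()
program(a, 1)
program(b, 10)
```

Because each batch's intermediates live in their own Scope, a whole Scope can be handed from one Section to the next as the unit of context transfer.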
The Paddle framework includes a component (class) SectionWorker for managing the whole computation process on a Section. Its main work includes:
1. In the initialization phase, creating the OPs on the Section according to the input proto configuration information.
2. In the execution phase, blocking to wait for and acquire a Scope resource from the input Queue; completing the computation described by the Program based on the current Scope; and placing the Scope containing the computation result into the output Queue.
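A minimal sketch of that execution loop (the names are assumptions; the real SectionWorker is a C++ class inside the training engine):

```python
import queue

def section_worker(in_q, out_q, run_section, num_batches):
    """Blocking-get a Scope, run this Section's computation in it,
    and pass the Scope downstream."""
    for _ in range(num_batches):
        scope = in_q.get()       # blocks until the upstream section is done
        run_section(scope)       # compute this section's ops in the scope
        out_q.put(scope)         # hand the context to the next section

# Single-section demo: scopes are plain dicts here.
in_q, out_q = queue.Queue(), queue.Queue()
for i in range(3):
    in_q.put({"batch": i, "result": None})
section_worker(in_q, out_q,
               lambda s: s.update(result=s["batch"] * s["batch"]), 3)
```

The blocking `get` is what serializes dependent Sections on the same Scope while still letting different Scopes (batches) flow through different Sections concurrently.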
The Paddle framework includes a component (class) PipelineTrainer to manage the life cycle of multiple SectionWorkers. Its main work includes:
1. Allocating and initializing global resources: creating the SectionWorkers, generating the OP list for each SectionWorker, creating the Scope queues between adjacent Sections, and so on.
2. Starting the pipeline in parallel and performing the necessary scheduling: logic such as creating one execution thread per SectionWorker and synchronizing between threads.
It is worth mentioning that, in order to reuse the storage resources allocated in Scopes, the PipelineTrainer creates a sufficient number of Scopes at once during initialization and destroys them after training.
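Putting those responsibilities together, a PipelineTrainer-style setup could look like the following self-contained sketch (names and the dict-as-Scope simplification are assumptions): a Scope pool created once, a queue wired between adjacent Sections, and one thread per Section.

```python
import queue
import threading

def run_pipeline(section_fns, num_batches):
    """Illustrative PipelineTrainer: pre-create the scope pool once, wire a
    queue between adjacent sections, and run one thread per section."""
    boundaries = [queue.Queue() for _ in range(len(section_fns) + 1)]
    for i in range(num_batches):              # one-time Scope creation
        boundaries[0].put({"batch": i, "trace": []})

    def worker(idx, fn):
        for _ in range(num_batches):
            scope = boundaries[idx].get()     # wait for the upstream section
            fn(scope)
            boundaries[idx + 1].put(scope)

    threads = [threading.Thread(target=worker, args=(i, fn))
               for i, fn in enumerate(section_fns)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [boundaries[-1].get() for _ in range(num_batches)]

# Two sections, e.g. a CPU-bound stage followed by a GPU-bound stage.
results = run_pipeline(
    [lambda s: s["trace"].append("cpu"), lambda s: s["trace"].append("gpu")], 3)
```

While batch 1 is in the second Section, batch 2 can already occupy the first, which is the asynchronous concurrency the patent describes.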
An intuitive presentation of the above is shown in fig. 3 b.
Step 204: if the training completion condition is met, outputting the trained deep learning model.
In this embodiment, the training completion condition may include the loss value falling below a predetermined value, the number of training iterations reaching an upper limit, or the like. When training is complete, the deep learning model is output; it can be returned to the terminal device, or sent to a publishing server to be published for other users. If training is not complete, the next batch of training samples is obtained and execution of the training steps continues.
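The outer loop of this step can be sketched as follows (the function names and the scalar loss signal are assumptions for illustration, not the patented interface):

```python
def train(next_batch, run_pipeline_once, loss_threshold=0.01, max_steps=1000):
    """Feed batches through the pipeline until the completion condition
    (loss below a threshold, or an iteration cap) is met."""
    loss = float("inf")
    for step in range(1, max_steps + 1):
        loss = run_pipeline_once(next_batch())
        if loss < loss_threshold:
            break                 # training completion condition met
    return step, loss

# Fake pipeline whose loss shrinks each call, for demonstration only.
losses = iter([0.5, 0.1, 0.005, 0.001])
step, loss = train(lambda: None, lambda batch: next(losses),
                   loss_threshold=0.01)
```

Both halves of the completion condition appear here: the threshold test ends training early, and `max_steps` caps the number of iterations otherwise.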
With further reference to FIG. 4, a flow of another embodiment of the method for training a deep learning model is shown.
Step 401 is substantially the same as step 201 and thus is not described in detail.
The difference is that the configuration information further includes the proportion of resources allocated to each segment, i.e., the parallelism of each segment; the rest is substantially the same as in the embodiment of fig. 2.
Step 403: loading each segment of the model description information onto the corresponding resources for operation according to the names and numbers of the resources allocated to each segment.
Step 405: if the training completion condition is met, outputting the trained deep learning model.
Steps 403-405 are substantially the same as steps 202-204 and thus are not described in detail.
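The proportional resource allocation of step 403 can be sketched as follows (a hypothetical helper; the function name, signature, and device strings are illustrative only, not part of the invention or of Paddle):

```python
def allocate_resources(segment_ratios, resources):
    """Split a pool of devices among pipeline segments according to the
    configured ratios (the 'parallelism' of each segment)."""
    total = sum(segment_ratios)
    counts = [len(resources) * r // total for r in segment_ratios]
    # Give any remainder from integer division to the last segment.
    counts[-1] += len(resources) - sum(counts)
    out, start = [], 0
    for c in counts:
        out.append(resources[start:start + c])
        start += c
    return out
```

A segment with a heavier workload can thus be given more devices, so that a slow stage does not throttle the whole pipeline.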
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow in this embodiment additionally allocates resources to each segment in proportion, so that the computing power of different devices is more fully exploited and the training speed is further improved.
Beyond the above innovations, the invention naturally supports data-parallel extension: the modified Program is copied in full into multiple replicas, and the data is then split across the replicas and trained on simultaneously.
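The data splitting in this extension can be sketched as (illustrative only; the function name is an assumption):

```python
def split_batch(batch, num_replicas):
    """Sketch of the data-parallel extension: the modified Program is
    copied num_replicas times and each replica trains on its own shard
    of the batch simultaneously."""
    size = len(batch) // num_replicas
    shards = [batch[i * size:(i + 1) * size] for i in range(num_replicas)]
    # Leftover samples go to the last replica.
    shards[-1].extend(batch[num_replicas * size:])
    return shards
```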
It can be seen that the present invention has both model-parallel and data-parallel capabilities. While integrating these two capabilities, it adds support for heterogeneous devices and for allocating the resources of different devices in configurable proportions, further enriching the available training modes.
With further reference to fig. 6, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for training a deep learning model, which corresponds to the method embodiment shown in fig. 2, and which can be applied in various electronic devices.
As shown in fig. 6, the apparatus for training a deep learning model of this embodiment comprises: an obtaining unit, a segmentation unit, a training unit and an iteration unit.
In this embodiment, the specific processing of the obtaining unit, the segmentation unit, the training unit and the iteration unit may refer to the corresponding steps in the method embodiment of fig. 2.
In some optional implementations of this embodiment, the configuration information further includes the proportion of resources allocated to each segment; and the model description information is loaded onto the corresponding resources for operation according to the names and proportions of the resources allocated to each segment.
In some optional implementations of the present embodiment, the
In some optional implementations of the present embodiment, the obtaining
In some optional implementations of this embodiment, the
In some optional implementations of this embodiment, the context information is passed through a queue.
Referring now to fig. 7, a schematic diagram of an electronic device (e.g., the server or terminal device of fig. 1) 700 suitable for use in implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a fixed terminal such as a digital TV, a desktop computer, and the like. The terminal device/server shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 7,
Generally, the following devices may be connected to the I/O interface 705:
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such embodiments, the computer program may be downloaded and installed from a network via the communication means 709, or installed from the storage means 708.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtaining model description information and configuration information of a deep learning model, wherein the model description information comprises variables and operation, and the configuration information comprises a dividing point variable and names of resources distributed by each segment; dividing the model description information into at least two sections according to the dividing point variable in the configuration information, and loading the model description information to corresponding resources for operation according to the names of the resources allocated to the sections; the following training steps are performed: acquiring a batch of training samples, inputting the batch of training samples into resources corresponding to the first section of model description information, starting training and storing an obtained intermediate result in first context information; inputting the first context information into a resource corresponding to the next section of model description information to obtain second context information; repeating the steps until obtaining the operation result of the resource corresponding to the last section of model description information; if the training completion condition is met, outputting a deep learning model after training is completed; otherwise, continuing to obtain the training samples of the next batch and executing the training steps until the training completion condition is met.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor comprises an obtaining unit, a segmentation unit, a training unit and an iteration unit. Here, the names of the units do not constitute a limitation to the units themselves in some cases, and for example, the acquiring unit may also be described as a "unit that acquires model description information and configuration information of the deep learning model".
The foregoing description covers only the preferred embodiments of the present disclosure and is illustrative of the principles of the technology employed. Those skilled in the art will appreciate that the scope of the invention in the present disclosure is not limited to technical solutions formed by the specific combination of the above features, and also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.