Artificial intelligence reasoning method, device, equipment and storage medium

Document No.: 1953939    Publication date: 2021-12-10

Reading note: This technology, "Artificial intelligence reasoning method, device, equipment and storage medium", was created by 贾东风 and 程力行 on 2021-09-17. Its main content is as follows: The invention belongs to the technical field of artificial intelligence and discloses an artificial intelligence inference method, apparatus, device, and storage medium. The method comprises: acquiring frame data based on a scene thread and determining the target algorithm model corresponding to the frame data; storing the frame data, based on the target buffer instance corresponding to the target algorithm model, in the storage area of that buffer instance; when the data in the current storage area is detected to meet a preset requirement, fetching the data to be inferred from the current storage area based on the scheduling thread corresponding to the current buffer instance; combining the data to be inferred into batch data; and performing inference on the batch data according to the current buffer instance. In this way, frame data from multiple scenes is scheduled and inferred together: frames from different scenes that target the same algorithm model are combined into batch data and inferred in one batch, which improves hardware utilization, reduces inference blocking caused by scene queuing, improves inference efficiency, and shortens inference time.

1. An artificial intelligence reasoning method, characterized in that the artificial intelligence reasoning method comprises:

acquiring frame data based on a scene thread, and determining a target algorithm model corresponding to the frame data;

storing the frame data to a storage area corresponding to a target buffer instance based on the target buffer instance corresponding to the target algorithm model;

when detecting that the data in the current storage area meets a preset requirement, fetching the data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance;

combining the data to be inferred into batch data;

and performing inference on the batch data according to the current buffer instance.

2. The artificial intelligence reasoning method of claim 1, wherein before the scene-based thread obtains frame data and determines the target algorithm model corresponding to the frame data, the method further comprises:

when a system starting instruction is obtained, creating a scheduling instance corresponding to each scene thread according to an algorithm model corresponding to each scene thread;

the storing the frame data to a storage area corresponding to the target buffer instance based on the target buffer instance corresponding to the target algorithm model comprises:

and transmitting the frame data to a target buffer instance corresponding to the target algorithm model according to a target scheduling instance corresponding to the scene thread, so that the target buffer instance stores the frame data to a storage area corresponding to the target buffer instance.

3. The artificial intelligence reasoning method of claim 2, wherein the target schedule instance transmits frame data to the target buffer instance over a channel.

4. The artificial intelligence reasoning method of claim 3, wherein after reasoning about the batch of data according to the current buffer instance, the method further comprises:

and when it is detected that inference on the batch data is complete, distributing the inference results to each scene thread corresponding to the batch data.

5. The artificial intelligence reasoning method of claim 4, wherein the distributing the reasoning result to each scene thread corresponding to the batch data when the batch data reasoning is detected to be completed comprises:

when it is detected that inference on the batch data is complete, determining the channel identifier corresponding to each frame of data in the batch data;

searching for the corresponding thread identifiers according to the channel identifiers;

and distributing the inference results to the corresponding scene threads according to the thread identifiers.

6. The artificial intelligence reasoning method of claim 3, wherein after reasoning about the batch of data according to the current buffer instance, the method further comprises:

when it is detected that inference on the target frame data corresponding to the target scene thread is complete, searching for a target thread identifier corresponding to the target scene thread according to the target frame data;

and sending a target reasoning result to the target scene thread according to the target thread identifier.

7. The artificial intelligence reasoning method of claim 1, wherein the reasoning about the batch of data according to the current buffer instance comprises:

and sending the batch data to an inference engine according to the current buffer instance so that the inference engine infers the batch data according to a current inference algorithm corresponding to the current buffer instance.

8. An artificial intelligence reasoning apparatus, characterized in that the artificial intelligence reasoning apparatus comprises:

the determining module is used for acquiring frame data based on a scene thread and determining a target algorithm model corresponding to the frame data;

the scheduling module is used for storing the frame data into a storage area corresponding to a target buffer instance based on the target buffer instance corresponding to the target algorithm model;

the detection module is used for taking out data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance when the data in the current storage area meets the preset requirement;

the synthesis module is used for combining the data to be inferred into batch data;

and the reasoning module is used for performing inference on the batch data according to the current buffer instance.

9. An artificial intelligence reasoning apparatus, characterized in that the apparatus comprises: a memory, a processor, and an artificial intelligence reasoning program stored on the memory and operable on the processor, the artificial intelligence reasoning program configured to implement the artificial intelligence reasoning method of any of claims 1 to 7.

10. A storage medium having stored thereon an artificial intelligence reasoning program which, when executed by a processor, implements the artificial intelligence reasoning method of any one of claims 1 to 7.

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to an artificial intelligence reasoning method, an artificial intelligence reasoning device, artificial intelligence reasoning equipment and a storage medium.

Background

In artificial intelligence (AI) scenarios, algorithm execution depends on the inference capability of a Graphics Processing Unit (GPU). Many GPU devices support batched inference, that is, inferring n pictures in a single batch, so the per-picture inference time is roughly 1/n of inferring the n pictures one after another. In practical applications, however, batched inference is typically used only within a single scene, and a single scene may not supply n pictures per submission, which wastes hardware resources.

The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.

Disclosure of Invention

The invention mainly aims to provide an artificial intelligence reasoning method, apparatus, device, and storage medium, and aims to solve the technical problem that existing approaches are inefficient and time-consuming when faced with multi-scene inference.

In order to achieve the above object, the present invention provides an artificial intelligence reasoning method, comprising the steps of:

acquiring frame data based on a scene thread, and determining a target algorithm model corresponding to the frame data;

storing the frame data to a storage area corresponding to a target buffer instance based on the target buffer instance corresponding to the target algorithm model;

when detecting that the data in the current storage area meet preset requirements, taking out the data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance;

combining the data to be inferred into batch data;

and performing inference on the batch data according to the current buffer instance.

Optionally, before the obtaining frame data based on the scene thread and determining the target algorithm model corresponding to the frame data, the method further includes:

when a system starting instruction is obtained, creating a scheduling instance corresponding to each scene thread according to an algorithm model corresponding to each scene thread;

the storing the frame data to a storage area corresponding to the target buffer instance based on the target buffer instance corresponding to the target algorithm model comprises:

and transmitting the frame data to a target buffer instance corresponding to the target algorithm model according to a target scheduling instance corresponding to the scene thread, so that the target buffer instance stores the frame data to a storage area corresponding to the target buffer instance.

Optionally, the target scheduling instance transmits frame data to the target buffering instance through a channel.

Optionally, after the reasoning is performed on the batch data according to the current buffering instance, the method further includes:

and when it is detected that inference on the batch data is complete, distributing the inference results to each scene thread corresponding to the batch data.

Optionally, the distributing the inference result to each scene thread corresponding to the batch data when it is detected that the batch data is inferred, includes:

when it is detected that inference on the batch data is complete, determining the channel identifier corresponding to each frame of data in the batch data;

searching for the corresponding thread identifiers according to the channel identifiers;

and distributing the inference results to the corresponding scene threads according to the thread identifiers.
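The result-distribution steps above can be sketched in a few lines. This is a minimal illustration only; the function and parameter names (`distribute_results`, `channel_to_thread`) are assumptions, not part of the original disclosure:

```python
# Hypothetical sketch: route each frame's inference result back to the scene
# thread that submitted it, via the channel-id -> thread-id lookup described
# above.

def distribute_results(batch_frames, results, channel_to_thread):
    """Group per-frame results by the scene thread that owns each frame.

    batch_frames: list of dicts, each carrying a 'channel_id' key.
    results: per-frame inference results, aligned with batch_frames.
    channel_to_thread: mapping from channel identifier to thread identifier.
    Returns a dict of thread_id -> list of results for that thread.
    """
    per_thread = {}
    for frame, result in zip(batch_frames, results):
        thread_id = channel_to_thread[frame["channel_id"]]
        per_thread.setdefault(thread_id, []).append(result)
    return per_thread
```

Keeping the channel-to-thread mapping outside the function mirrors the claim's two-step lookup: channel identifier first, then thread identifier.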

Optionally, after the reasoning is performed on the batch data according to the current buffering instance, the method further includes:

when it is detected that inference on the target frame data corresponding to the target scene thread is complete, searching for a target thread identifier corresponding to the target scene thread according to the target frame data;

and sending a target reasoning result to the target scene thread according to the target thread identifier.

Optionally, the inferring the batch data according to the current buffer instance includes:

and sending the batch data to an inference engine according to the current buffer instance so that the inference engine infers the batch data according to a current inference algorithm corresponding to the current buffer instance.

Optionally, when it is detected that the data in the current storage area meets the preset requirement, fetching the data to be inferred from the current storage area based on the scheduling thread corresponding to the current buffer instance includes:

detecting the data in the current storage area at regular intervals, according to a preset detection period, based on the scheduling thread corresponding to the current buffer instance;

and when data is detected in the current storage area, fetching the data to be inferred from the current storage area according to the scheduling thread.

Optionally, when it is detected that the data in the current storage area meets the preset requirement, fetching the data to be inferred from the current storage area based on the scheduling thread corresponding to the current buffer instance includes:

and when it is detected that the amount of data in the current storage area has reached a preset quantity threshold, fetching the data to be inferred from the current storage area based on the scheduling thread corresponding to the current buffer instance.

Optionally, when it is detected that the data in the current storage area meets the preset requirement, fetching the data to be inferred from the current storage area based on the scheduling thread corresponding to the current buffer instance includes:

and when it is detected that the size of the data in the current storage area exceeds a preset capacity, fetching the data to be inferred from the current storage area based on the scheduling thread corresponding to the current buffer instance.

In addition, in order to achieve the above object, the present invention further provides an artificial intelligence reasoning apparatus, including:

the determining module is used for acquiring frame data based on a scene thread and determining a target algorithm model corresponding to the frame data;

the scheduling module is used for storing the frame data into a storage area corresponding to a target buffer instance based on the target buffer instance corresponding to the target algorithm model;

the detection module is used for taking out data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance when the data in the current storage area meets the preset requirement;

the synthesis module is used for combining the data to be inferred into batch data;

and the reasoning module is used for performing inference on the batch data according to the current buffer instance.

Optionally, the artificial intelligence reasoning apparatus further comprises a creating module;

the creating module is used for creating a scheduling instance corresponding to each scene thread according to the algorithm model corresponding to each scene thread when a system starting instruction is obtained;

the scheduling module is further configured to transmit the frame data to a target buffer instance corresponding to the target algorithm model according to a target scheduling instance corresponding to the scene thread, so that the target buffer instance stores the frame data to a storage area corresponding to the target buffer instance.

Optionally, the target scheduling instance transmits frame data to the target buffering instance through a channel.

Optionally, the artificial intelligence reasoning apparatus further comprises a distribution module;

and the distribution module is used for distributing the inference result to each scene thread corresponding to the batch data when the batch data inference is detected to be completed.

Optionally, the distribution module is further configured to, when it is detected that the batch data is inferred, determine channel identifiers corresponding to each frame of data in the batch data, respectively search corresponding thread identifiers according to the channel identifiers, and distribute an inference result to each corresponding scene thread according to the thread identifiers.

Optionally, the artificial intelligence reasoning apparatus further comprises a result returning module;

and the result returning module is used for searching a target thread identifier corresponding to the target scene thread according to the target frame data and sending a target inference result to the target scene thread according to the target thread identifier when the target frame data corresponding to the target scene thread is detected to be inferred.

Optionally, the inference module is further configured to send the batch data to an inference engine according to the current buffer instance, so that the inference engine performs inference on the batch data according to a current inference algorithm corresponding to the current buffer instance.

Optionally, the detecting module is further configured to detect data in the current storage area at regular time according to a preset detection period based on a scheduling thread corresponding to the current buffer instance, and when data is detected to exist in the current storage area, fetch data to be inferred from the current storage area according to the scheduling thread.

In addition, in order to achieve the above object, the present invention further provides an artificial intelligence reasoning apparatus, including: a memory, a processor, and an artificial intelligence reasoning program stored on the memory and operable on the processor, the artificial intelligence reasoning program configured to implement the artificial intelligence reasoning method as described above.

In addition, to achieve the above object, the present invention further provides a storage medium having an artificial intelligence reasoning program stored thereon, wherein the artificial intelligence reasoning program, when executed by a processor, implements the artificial intelligence reasoning method as described above.

The method comprises: acquiring frame data based on a scene thread, and determining a target algorithm model corresponding to the frame data; storing the frame data in a storage area corresponding to a target buffer instance based on the target buffer instance corresponding to the target algorithm model; when detecting that the data in the current storage area meets a preset requirement, fetching the data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance; combining the data to be inferred into batch data; and performing inference on the batch data according to the current buffer instance. In this way, the frame data corresponding to multiple scenes is scheduled and inferred together: frame data for the same algorithm model from multiple scenes is stored in the same storage area, then fetched and combined into batch data for batch inference. This improves hardware utilization, reduces inference blocking caused by scene queuing, improves inference efficiency, and shortens inference time.

Drawings

FIG. 1 is a schematic structural diagram of an artificial intelligence reasoning device of a hardware operating environment according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart of a first embodiment of the artificial intelligence reasoning method of the present invention;

FIG. 3 is a schematic diagram of multi-scene frame data processing according to an embodiment of the artificial intelligence inference method of the present invention;

FIG. 4 is a flowchart illustrating a second embodiment of the artificial intelligence reasoning method of the present invention;

FIG. 5 is a flowchart illustrating a third embodiment of the artificial intelligence reasoning method of the present invention;

fig. 6 is a block diagram showing the structure of the artificial intelligence inference apparatus according to the first embodiment of the present invention.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Referring to fig. 1, fig. 1 is a schematic structural diagram of an artificial intelligence reasoning apparatus in a hardware operating environment according to an embodiment of the present invention.

As shown in fig. 1, the artificial intelligence reasoning apparatus may include: a processor 1001, such as a Graphics Processing Unit (GPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. Optionally, the network interface 1004 includes a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as disk storage. Alternatively, the memory 1005 may be a storage device independent of the processor 1001.

Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the artificial intelligence reasoning apparatus, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.

As shown in fig. 1, a memory 1005, which is a kind of storage medium, may include therein an operating system, a network communication module, a user interface module, and an artificial intelligence inference program.

In the artificial intelligence reasoning apparatus shown in fig. 1, the network interface 1004 is mainly used for data communication with a network server; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 of the artificial intelligence reasoning apparatus of the present invention may be disposed in the artificial intelligence reasoning apparatus, and the artificial intelligence reasoning apparatus invokes the artificial intelligence reasoning program stored in the memory 1005 through the processor 1001 and executes the artificial intelligence reasoning method provided by the embodiment of the present invention.

An embodiment of the present invention provides an artificial intelligence reasoning method, and referring to fig. 2, fig. 2 is a schematic flow diagram of a first embodiment of the artificial intelligence reasoning method of the present invention.

In this embodiment, the artificial intelligence reasoning method includes the following steps:

step S10: frame data are obtained based on a scene thread, and a target algorithm model corresponding to the frame data is determined.

It can be understood that the execution subject of this embodiment is an artificial intelligence reasoning device, and the artificial intelligence reasoning device may be a cloud server, an end-side AI box, or other devices having the same or similar functions, which is not limited in this embodiment.

It should be noted that, in this embodiment, each scene thread corresponds to one scene, and each scene may involve multiple algorithm models. Frame data corresponding to each scene is received based on a plurality of scene threads. Optionally, a corresponding scene thread is created per task flow; one task flow may include one or more model inference tasks, and the target algorithm model corresponding to each piece of frame data to be inferred is determined based on those model inference tasks.

Step S20: and storing the frame data into a storage area corresponding to the target buffer instance based on the target buffer instance corresponding to the target algorithm model.

It is understood that each algorithm model corresponds to one buffer instance, and the buffer instance manages the data, scheduling thread, and inference algorithm corresponding to that algorithm model. In a specific implementation, when different scene threads use the same algorithm model, the same buffer instance is called and their frame data is stored in the same storage area.
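The sharing behavior just described can be sketched as a per-model registry. This is a hypothetical illustration; the class name `BufferInstance` and its interface are assumptions, not the patent's implementation:

```python
# Hypothetical sketch of a per-model buffer instance: scene threads that use
# the same algorithm model receive the same instance, so their frames land in
# one shared storage area and can later be batched together.
import threading
from collections import deque

class BufferInstance:
    _instances = {}                      # one buffer instance per algorithm model
    _registry_lock = threading.Lock()

    def __init__(self, model_name):
        self.model_name = model_name
        self.storage = deque()           # shared storage area for this model
        self.lock = threading.Lock()

    @classmethod
    def for_model(cls, model_name):
        # Scene threads asking for the same model get the same instance.
        with cls._registry_lock:
            if model_name not in cls._instances:
                cls._instances[model_name] = cls(model_name)
            return cls._instances[model_name]

    def store(self, frame):
        # Called from scene threads; the lock keeps concurrent stores safe.
        with self.lock:
            self.storage.append(frame)
```

The registry keyed by model name is what makes frames from different scenes converge into one storage area.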

Step S30: and when detecting that the data in the current storage area meets the preset requirement, taking out the data to be inferred from the current storage area based on the scheduling thread corresponding to the current buffer instance.

It should be noted that when different scene threads use the same algorithm model, their frame data is stored in the same storage area; when the storage area meets the preset requirement, the corresponding scheduling thread is called to fetch the data to be inferred. Under multi-scene AI inference, many pieces of frame data accumulate in the storage area managed by each buffer instance within a short time, so multiple pieces of frame data can be inferred together, saving hardware resources and improving their utilization. The preset requirement matches the configured detection mechanism: when the mechanism is timed detection, the requirement is that data exists in the current storage area; when the mechanism is quantitative detection, the requirement is that the number of data items in the current storage area reaches a quantity threshold or their size reaches a capacity threshold. For example, with a quantity threshold of N, when the number of data items in the current storage area is detected to reach N, the data to be inferred is fetched from the current storage area based on the scheduling thread corresponding to the current buffer instance.

Optionally, the step S30 includes: detecting the data in the current storage area at regular intervals, according to a preset detection period, based on the scheduling thread corresponding to the current buffer instance; and when data is detected in the current storage area, fetching the data to be inferred from the current storage area according to the scheduling thread.

It can be understood that the preset detection period may be set according to the actual situation. The scheduling thread corresponding to the current buffer instance checks the data in the current storage area at regular intervals; when data is detected, it fetches the data to be inferred from the current storage area. After the pending data is fetched, the current storage area remains empty until the corresponding current buffer instance stores frame data again.
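A minimal sketch of this timed-detection loop follows. The names (`run_scheduler`, `on_batch`, `stop_event`) are assumptions for illustration; the patent does not prescribe this interface:

```python
# Hypothetical scheduling-thread loop: every `period` seconds it checks the
# storage area and, if any data is present, drains it and hands it off for
# batch synthesis. Afterwards the storage area stays empty until refilled.
import threading
import time
from collections import deque

def run_scheduler(storage, lock, period, on_batch, stop_event):
    """Poll `storage` every `period` seconds; drain it when non-empty."""
    while not stop_event.is_set():
        with lock:
            if storage:                   # preset requirement: data exists
                pending = list(storage)   # take out all data to be inferred
                storage.clear()
            else:
                pending = []
        if pending:
            on_batch(pending)             # hand off for batch synthesis
        time.sleep(period)
```

In practice `on_batch` would combine the fetched frames into batch data and submit them to the inference engine.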

Optionally, the step S30 includes: when it is detected that the amount of data in the current storage area has reached a preset quantity threshold, fetching the data to be inferred from the current storage area based on the scheduling thread corresponding to the current buffer instance.

It should be noted that the preset quantity threshold may be set according to the actual situation. For example, if the upper limit of buffered data in the storage area is set to N, then the number of data items in the current storage area reaching N triggers the operation of fetching the data to be inferred. Optionally, a scheduling instance is provided to transmit frame data to the target buffer instance corresponding to the target algorithm model and store it in the corresponding storage area, with a counting mechanism at the scheduling instance to track how much data each storage area holds. Alternatively, a detector with a counting mechanism is provided to monitor the number of data items in the current storage area and, upon detecting that the preset quantity threshold is reached, call the scheduling thread to fetch the data to be inferred.

Optionally, the step S30 includes: and when the data size in the current storage area is detected to exceed the preset capacity size, taking out the data to be inferred from the current storage area based on the scheduling thread corresponding to the current buffer instance.

It can be understood that the preset capacity may be set according to the actual situation. A detector with space-capacity detection is provided to monitor the size of the data in the current storage area; when the preset capacity is detected to be exceeded, the scheduling thread is called to fetch the data to be inferred.
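The capacity-based trigger can be illustrated as a simple size check. The function name and byte-size accounting are assumptions for illustration; the patent does not specify how data size is measured:

```python
# Hypothetical capacity detector: report that the pending data should be
# fetched once its total size exceeds the preset capacity.

def should_fetch_by_capacity(frames, capacity_bytes):
    """Return True when the buffered frames exceed `capacity_bytes` in total.

    frames is assumed here to be a list of byte strings; a real system would
    account for whatever representation the storage area actually holds.
    """
    total = sum(len(f) for f in frames)
    return total > capacity_bytes
```

A detector thread would evaluate this check and, when it returns True, invoke the scheduling thread to drain the storage area.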

Step S40: combining the data to be inferred into batch data.

It should be noted that, in this embodiment, when the current storage area meets the preset requirement, the batch-data synthesis interface is called and the data to be inferred is assembled by batch, yielding the batch data.
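Batch synthesis can be sketched as grouping the fetched frames into fixed-size batches. This is an illustrative interface only; the patent does not fix a batch size or a remainder policy:

```python
# Illustrative batch synthesis: group the data to be inferred into batches of
# at most `batch_size`, leaving a shorter final batch if the count does not
# divide evenly.

def synthesize_batches(frames, batch_size):
    """Split `frames` into consecutive batches of at most `batch_size`."""
    return [frames[i:i + batch_size] for i in range(0, len(frames), batch_size)]
```

For example, five frames with a batch size of 2 yield three batches, the last holding a single frame.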

Step S50: performing inference on the batch data according to the current buffer instance.

Specifically, the step S50 includes: and sending the batch data to an inference engine according to the current buffer instance so that the inference engine infers the batch data according to a current inference algorithm corresponding to the current buffer instance.

It can be understood that the current buffer instance manages the corresponding inference algorithm and sends the batch data to an inference engine for inference. In a specific implementation, multiple inference engines are provided to execute inference tasks. Optionally, a hardware distribution mapping table is provided that records a one-to-one correspondence between algorithm models and inference hardware identifiers; the current buffer instance looks up the corresponding inference hardware identifier in the table and transmits the batch data to that inference engine. Optionally, when transmitting the batch data, the current buffer instance acquires the state parameters of all inference engines and selects an idle engine according to those parameters, so that the selected engine executes the batch inference operation.
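The engine-selection step can be sketched as a table lookup followed by an idle-state preference. The names (`pick_engine`, `hardware_map`, `engine_states`) and the fallback rule are assumptions for illustration:

```python
# Hypothetical dispatch helper: consult the hardware distribution mapping
# table for the engines assigned to a model, then prefer an idle engine.

def pick_engine(model_name, hardware_map, engine_states):
    """Return the id of an idle engine assigned to `model_name`.

    hardware_map: model name -> list of engine ids (the mapping table).
    engine_states: engine id -> "idle" or "busy" (the state parameters).
    Falls back to the first assigned engine if none are idle.
    """
    candidates = hardware_map[model_name]
    for engine_id in candidates:
        if engine_states.get(engine_id) == "idle":
            return engine_id
    return candidates[0]
```

The fallback keeps the dispatch non-blocking; a real scheduler might instead queue the batch until an engine frees up.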

It should be noted that, referring to fig. 3, fig. 3 is a schematic diagram of multi-scene frame data processing according to an embodiment of the artificial intelligence inference method of the present invention. AlgoScheduler is the scheduling instance: it encapsulates the various AI algorithms, and each scene thread creates one scheduling instance. BufferMgr is the buffer instance: each algorithm model corresponds to one buffer instance, which manages the data, storage area, scheduling thread, and inference algorithm of that model. Algo(object_detector) is the inference algorithm: each algorithm model corresponds to one inference algorithm. BufferMap is the storage area, used to store the data transmitted by all channels of a specific model. BufferThread is the scheduling thread, used to fetch frame data from the storage area when its data meets the preset requirement. The specific inference process is as follows: at system startup, a single scheduling instance (AlgoScheduler) or scheduling instances combining multiple models are created; frame data acquired by a scene thread is passed through its scheduling instance into the BufferMap of the corresponding buffer instance (BufferMgr), and the current channel blocks; the scheduling thread (BufferThread) periodically checks the data in the BufferMap, fetches it, combines multiple items into batch data, and passes the batch to the hardware for inference; after inference completes, the inference results are distributed to the corresponding scene threads.
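The end-to-end flow just described can be condensed into a small sketch. All names here (`buffer_map`, `scheduler_push`, `drain_and_infer`, `infer_fn`) are illustrative assumptions, not the patent's actual components:

```python
# Hypothetical end-to-end sketch: scene threads push frames through per-scene
# schedulers into a per-model buffer; a drain step synthesizes one batch,
# runs a single batched inference call, and groups results back per scene.
from collections import defaultdict

buffer_map = defaultdict(list)            # model -> buffered (scene_id, frame)

def scheduler_push(scene_id, model, frame):
    """Per-scene scheduling instance routing a frame to its model's buffer."""
    buffer_map[model].append((scene_id, frame))

def drain_and_infer(model, infer_fn):
    """Drain the model's buffer, infer the whole batch once, group by scene."""
    pending = buffer_map.pop(model, [])
    results = infer_fn([frame for _, frame in pending])   # one batched call
    by_scene = defaultdict(list)
    for (scene_id, _), result in zip(pending, results):
        by_scene[scene_id].append(result)
    return dict(by_scene)
```

The single `infer_fn` call over all pending frames is the batching step that replaces one inference call per scene.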

The method comprises: acquiring frame data based on a scene thread, and determining a target algorithm model corresponding to the frame data; storing the frame data in a storage area corresponding to a target buffer instance based on the target buffer instance corresponding to the target algorithm model; when detecting that the data in the current storage area meets a preset requirement, fetching the data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance; combining the data to be inferred into batch data; and performing inference on the batch data according to the current buffer instance. In this way, the frame data corresponding to multiple scenes is scheduled and inferred together: frame data for the same algorithm model from multiple scenes is stored in the same storage area, then fetched and combined into batch data for batch inference, which improves hardware utilization, reduces inference blocking caused by scene queuing, improves inference efficiency, and shortens inference time.

Referring to fig. 4, fig. 4 is a flowchart illustrating an artificial intelligence reasoning method according to a second embodiment of the present invention.

Based on the first embodiment, before the step S10, the artificial intelligence inference method in this embodiment further includes:

step S101: when a system starting instruction is obtained, a scheduling instance corresponding to each scene thread is created according to the algorithm model corresponding to each scene thread.

It can be understood that, at the time of system startup, a single scheduling instance or a scheduling instance combining multiple models is created, and the scheduling instance is used for scheduling and distributing frame data corresponding to each algorithm model received by a scene thread.
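
The startup step can be sketched as below; the stand-in classes and the `scene_models` mapping are illustrative assumptions, and the point is that every scheduling instance for the same model shares one buffer instance.

```python
# Illustrative sketch of scheduling-instance creation at system startup:
# one scheduling instance per scene thread, one shared buffer instance per model.
class BufferMgr:       # stand-in: one buffer instance per algorithm model
    def __init__(self, model_name):
        self.model_name = model_name

class AlgoScheduler:   # stand-in: one scheduling instance per scene thread
    def __init__(self, buffers):
        self.buffers = buffers   # model name -> shared BufferMgr

def on_system_start(scene_models):
    """scene_models: scene thread id -> list of model names used by that scene."""
    all_models = {m for models in scene_models.values() for m in models}
    buffers = {m: BufferMgr(m) for m in all_models}   # one buffer per model
    # a scheduling instance may wrap a single model or a combination of models
    return {scene: AlgoScheduler({m: buffers[m] for m in models})
            for scene, models in scene_models.items()}
```

Because the per-model `BufferMgr` objects are shared, frames from different scene threads that target the same model converge on the same storage area.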

The step S20 includes:

step S201: and transmitting the frame data to a target buffer instance corresponding to the target algorithm model according to a target scheduling instance corresponding to the scene thread, so that the target buffer instance stores the frame data to a storage area corresponding to the target buffer instance.

It should be noted that each scene thread corresponds to one scheduling instance, and each scheduling instance corresponds to a plurality of buffer instances, one per algorithm model; each buffer instance manages the data received by all channels corresponding to one algorithm model, the storage area for storing the frame data, the scheduling thread, and the inference algorithm. The buffer instance stores the frame data transmitted by every scheduling instance in the same storage area; that is, frame data of the same algorithm model received by multiple scene threads is stored in the same storage area, from which a plurality of frame data are fetched and combined into batch data for inference, thereby realizing multi-scene batch-data inference.

When a system start instruction is acquired, a scheduling instance corresponding to each scene thread is created according to the algorithm model corresponding to each scene thread; frame data is acquired based on a scene thread, and a target algorithm model corresponding to the frame data is determined; the frame data is transmitted to a target buffer instance corresponding to the target algorithm model according to the target scheduling instance corresponding to the scene thread, so that the target buffer instance stores the frame data in its corresponding storage area; when it is detected that the data in the current storage area meets the preset requirement, the data to be inferred is taken out of the current storage area based on the scheduling thread corresponding to the current buffer instance; the data to be inferred is combined into batch data; and inference is performed on the batch data according to the current buffer instance. In this way, a scheduling instance is created for each scene thread, the frame data corresponding to multiple scenes is transmitted via the scheduling instances to the storage area managed by the buffer instance of the corresponding algorithm model, frame data of the same algorithm model from multiple scenes is stored in the same storage area, and the frame data is fetched and combined into batch data for batch inference, which improves hardware utilization, reduces inference blocking caused by scene queuing, improves inference efficiency, and reduces inference time.

Referring to fig. 5, fig. 5 is a flowchart illustrating an artificial intelligence reasoning method according to a third embodiment of the present invention.

Based on the first embodiment, after the step S50, the artificial intelligence inference method in this embodiment further includes:

step S501: and when the batch data are detected to be reasoned, distributing the inference result to each scene thread corresponding to the batch data.

It can be understood that this embodiment distributes inference results in two ways. In the first way, when the inference of each batch of data is completed, the inference results are distributed batch by batch; in the second way, when the inference of each piece of frame data is completed, the inference result is distributed to the scene thread corresponding to that frame data.

Further, the target scheduling instance transmits frame data to the target buffering instance through a channel.

It should be noted that, when the target scheduling instance transmits frame data through a channel, if the target buffer instance has not yet received the data, the sending operation keeps blocking on the channel.

Specifically, the step S501 includes: when the batch data are detected to be inferred, determining channel identifications corresponding to each frame of data in the batch data respectively; respectively searching corresponding thread identifications according to the channel identifications; and distributing the inference result to each corresponding scene thread according to the thread identifications.

It can be understood that, when the system starts, the scene thread corresponding to each channel creates its own scheduling instance. The scheduling instance transmits the frame data received by the scene thread to the storage area of the corresponding buffer instance through the channel and binds a thread identifier (thread ID) to the channel; the corresponding channel is then woken up through the thread ID, and the inference result is distributed to the corresponding scene thread through the woken channel.
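
The channel-to-thread binding and wake-up described above can be sketched as follows; `ResultDispatcher` and its method names are illustrative assumptions, with a blocking queue standing in for the channel.

```python
import threading
import queue

class ResultDispatcher:
    """Sketch of result distribution: each channel is bound to the thread
    identifier of its scene thread; a completed result wakes the matching
    blocked channel."""
    def __init__(self):
        self.channel_to_thread = {}   # channel id -> thread id (bound at submit time)
        self.wakeups = {}             # thread id -> blocking queue ("channel")

    def bind(self, channel_id):
        tid = threading.get_ident()   # bind the channel to the calling scene thread
        self.channel_to_thread[channel_id] = tid
        self.wakeups[tid] = queue.Queue(maxsize=1)
        return self.wakeups[tid]      # the scene thread blocks on .get()

    def distribute(self, channel_id, result):
        tid = self.channel_to_thread[channel_id]   # look up the thread identifier
        self.wakeups[tid].put(result)              # wake the corresponding channel
```

A scene thread calls `bind` and blocks on the returned queue; when the scheduling thread finishes a batch, it calls `distribute` per channel, which unblocks exactly the scene thread that submitted that frame.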

In an embodiment, after the step S50, the method further includes: when the target frame data corresponding to the target scene thread is detected to be inferred, searching a target thread identifier corresponding to the target scene thread according to the target frame data; and sending a target reasoning result to the target scene thread according to the target thread identifier.

In a specific implementation, when frame data is transmitted, the frame data is marked according to the scene thread from which it was received; when it is detected that the inference of the frame data is completed, the corresponding thread identifier is looked up according to the frame data, the corresponding channel is woken up according to the thread identifier, and the inference result is distributed to the corresponding scene thread through the woken channel.

The method comprises the steps of obtaining frame data based on a scene thread, and determining a target algorithm model corresponding to the frame data; storing the frame data in a storage area corresponding to a target buffer instance, based on the target buffer instance corresponding to the target algorithm model; when it is detected that the data in the current storage area meets the preset requirement, taking the data to be inferred out of the current storage area based on a scheduling thread corresponding to the current buffer instance; combining the data to be inferred into batch data; performing inference on the batch data according to the current buffer instance; and, when it is detected that the inference of the batch data is completed, distributing the inference results to the scene threads corresponding to the batch data. In this way, the frame data corresponding to multiple scenes is scheduled and inferred: frame data of the same algorithm model from multiple scenes is stored in the same storage area, fetched, combined into batch data, and inferred in batches, and the inference results are returned to the respective scene threads, thereby realizing batch inference of multi-scene inference tasks, improving hardware utilization, reducing inference blocking caused by scene queuing, improving inference efficiency, and reducing inference time.

In addition, an embodiment of the present invention further provides a storage medium, where an artificial intelligence reasoning program is stored, and when the artificial intelligence reasoning program is executed by a processor, the artificial intelligence reasoning method is implemented.

Since the storage medium adopts all technical solutions of all the embodiments, at least all the beneficial effects brought by the technical solutions of the embodiments are achieved, and no further description is given here.

Referring to fig. 6, fig. 6 is a block diagram illustrating a first embodiment of the artificial intelligence reasoning apparatus according to the present invention.

As shown in fig. 6, the artificial intelligence inference apparatus provided in the embodiment of the present invention includes:

the determining module 10 is configured to acquire frame data based on a scene thread and determine a target algorithm model corresponding to the frame data.

And the scheduling module 20 is configured to store the frame data into a storage area corresponding to the target buffer instance based on the target buffer instance corresponding to the target algorithm model.

And the detecting module 30 is configured to, when it is detected that the data in the current storage area meets a preset requirement, take out the data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance.

And the synthesis module 40 is used for synthesizing the data to be reasoned into batch data.

And the reasoning module 50 is used for reasoning the batch data according to the current buffer instance.

It should be understood that the above is only an example, and the technical solution of the present invention is not limited in any way, and in a specific application, a person skilled in the art may set the technical solution as needed, and the present invention is not limited thereto.

The method comprises the steps of obtaining frame data based on a scene thread, and determining a target algorithm model corresponding to the frame data; storing the frame data in a storage area corresponding to a target buffer instance, based on the target buffer instance corresponding to the target algorithm model; when it is detected that the data in the current storage area meets the preset requirement, taking the data to be inferred out of the current storage area based on a scheduling thread corresponding to the current buffer instance; combining the data to be inferred into batch data; and performing inference on the batch data according to the current buffer instance. In this way, the frame data corresponding to multiple scenes is scheduled and inferred: frame data of the same algorithm model from multiple scenes is stored in the same storage area, fetched, and combined into batch data for batch inference, which improves hardware utilization, reduces inference blocking caused by scene queuing, improves inference efficiency, and reduces inference time.

It should be noted that the above-described work flows are only exemplary, and do not limit the scope of the present invention, and in practical applications, a person skilled in the art may select some or all of them to achieve the purpose of the solution of the embodiment according to actual needs, and the present invention is not limited herein.

In addition, the technical details that are not described in detail in this embodiment may refer to the artificial intelligence reasoning method provided in any embodiment of the present invention, and are not described herein again.

In one embodiment, the artificial intelligence reasoning apparatus further comprises a creating module;

the creating module is used for creating a scheduling instance corresponding to each scene thread according to the algorithm model corresponding to each scene thread when a system starting instruction is obtained;

the scheduling module 20 is further configured to transmit the frame data to a target buffer instance corresponding to the target algorithm model according to a target scheduling instance corresponding to the scene thread, so that the target buffer instance stores the frame data in a storage area corresponding to the target buffer instance.

In one embodiment, the target scheduling instance transmits frame data to the target buffering instance over a channel.

In one embodiment, the artificial intelligence reasoning apparatus further comprises a distribution module;

and the distribution module is used for distributing the inference result to each scene thread corresponding to the batch data when the batch data inference is detected to be completed.

In an embodiment, the distribution module is further configured to, when it is detected that the batch data is inferred, determine channel identifiers corresponding to each frame of data in the batch data, respectively search corresponding thread identifiers according to the channel identifiers, and distribute an inference result to corresponding scene threads according to the thread identifiers.

In one embodiment, the artificial intelligence reasoning device further comprises a result returning module;

and the result returning module is used for searching a target thread identifier corresponding to the target scene thread according to the target frame data and sending a target inference result to the target scene thread according to the target thread identifier when the target frame data corresponding to the target scene thread is detected to be inferred.

In an embodiment, the inference module 50 is further configured to send the batch data to an inference engine according to the current buffer instance, so that the inference engine performs inference on the batch data according to a current inference algorithm corresponding to the current buffer instance.

In an embodiment, the detecting module 30 is further configured to periodically detect the data in the current storage area according to a preset detection period based on the scheduling thread corresponding to the current buffer instance, and, when data is detected in the current storage area, fetch the data to be inferred from the current storage area according to the scheduling thread.

In an embodiment, the detecting module 30 is further configured to, when it is detected that the number of data in the current storage area reaches a preset number threshold, fetch the data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance.

In an embodiment, the detecting module 30 is further configured to, when it is detected that the size of the data in the current storage area exceeds the preset capacity size, fetch the data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance.
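
The three "preset requirement" checks described for the detecting module 30 (periodic check, quantity threshold, capacity threshold) can be sketched in one predicate; the function name and the threshold values are illustrative assumptions.

```python
# Sketch of the three drain conditions the scheduling thread may use before
# fetching data to be inferred from the storage area; thresholds are assumed.
def should_drain(frames, now, last_check, *, period=0.1,
                 count_threshold=8, capacity_bytes=1 << 20):
    """frames: byte strings currently held in the storage area."""
    timer_due = now - last_check >= period                    # periodic check elapsed
    count_hit = len(frames) >= count_threshold                # quantity threshold reached
    size_hit = sum(len(f) for f in frames) > capacity_bytes   # capacity exceeded
    # on the timer path there must actually be data to fetch
    return (timer_due and bool(frames)) or count_hit or size_hit
```

Any one condition suffices, so a busy model drains on count or size well before the timer fires, while a quiet model is still drained periodically.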

Further, it is to be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention or portions thereof that contribute to the prior art may be embodied in the form of a software product, where the computer software product is stored in a storage medium (e.g. Read Only Memory (ROM)/RAM, magnetic disk, optical disk), and includes several instructions for enabling a terminal device (e.g. a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

The invention discloses an artificial intelligence reasoning method A1, which comprises the following steps:

acquiring frame data based on a scene thread, and determining a target algorithm model corresponding to the frame data;

storing the frame data to a storage area corresponding to a target buffer instance based on the target buffer instance corresponding to the target algorithm model;

when detecting that the data in the current storage area meets the preset requirement, taking out the data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance;

synthesizing the data to be reasoned into batch data;

and reasoning the batch data according to the current buffer instance.

A2, the artificial intelligence reasoning method as in A1, wherein before the acquiring of frame data based on a scene thread and the determining of a target algorithm model corresponding to the frame data, the method further includes:

when a system starting instruction is obtained, creating a scheduling instance corresponding to each scene thread according to an algorithm model corresponding to each scene thread;

the storing the frame data to a storage area corresponding to the target buffer instance based on the target buffer instance corresponding to the target algorithm model comprises:

and transmitting the frame data to a target buffer instance corresponding to the target algorithm model according to a target scheduling instance corresponding to the scene thread, so that the target buffer instance stores the frame data to a storage area corresponding to the target buffer instance.

A3, the artificial intelligence reasoning method as in A2, the target schedule instance transmitting frame data to the target buffer instance through a channel.

A4, the artificial intelligence reasoning method of A3, further comprising, after reasoning about the batch data according to the current buffered instance:

and when the batch data are detected to be reasoned, distributing the inference result to each scene thread corresponding to the batch data.

A5, the artificial intelligence reasoning method as in A4, wherein the distributing of the inference result to each scene thread corresponding to the batch data when it is detected that the reasoning of the batch data is completed includes:

when the batch data are detected to be inferred, determining channel identifications corresponding to each frame of data in the batch data respectively;

respectively searching corresponding thread identifications according to the channel identifications;

and distributing the inference result to each corresponding scene thread according to the thread identifications.

A6, the artificial intelligence reasoning method of A3, further comprising, after reasoning about the batch data according to the current buffered instance:

when the target frame data corresponding to the target scene thread is detected to be inferred, searching a target thread identifier corresponding to the target scene thread according to the target frame data;

and sending a target reasoning result to the target scene thread according to the target thread identifier.

A7, the artificial intelligence reasoning method of A1, wherein the reasoning the batch data according to the current buffer instance comprises:

and sending the batch data to an inference engine according to the current buffer instance so that the inference engine infers the batch data according to a current inference algorithm corresponding to the current buffer instance.

A8, the artificial intelligence reasoning method as in any one of A1-A7, wherein the fetching of the data to be reasoned from the current storage area based on the scheduling thread corresponding to the current buffer instance, when it is detected that the data in the current storage area meets the preset requirement, includes:

the method comprises the steps that data in a current storage area are detected regularly according to a preset detection period on the basis of a scheduling thread corresponding to a current buffering instance;

and when detecting that data exists in the current storage area, taking out the data to be reasoned from the current storage area according to the scheduling thread.

A9, the artificial intelligence reasoning method as in any one of A1-A7, wherein the fetching of the data to be reasoned from the current storage area based on the scheduling thread corresponding to the current buffer instance, when it is detected that the data in the current storage area meets the preset requirement, includes:

and when detecting that the data quantity in the current storage area reaches a preset quantity threshold value, taking out the data to be reasoned from the current storage area based on a scheduling thread corresponding to the current buffer instance.

A10, the artificial intelligence reasoning method as in any one of A1-A7, wherein the fetching of the data to be reasoned from the current storage area based on the scheduling thread corresponding to the current buffer instance, when it is detected that the data in the current storage area meets the preset requirement, includes:

and when the data size in the current storage area is detected to exceed the preset capacity size, taking out the data to be inferred from the current storage area based on the scheduling thread corresponding to the current buffer instance.

The invention also discloses B11 and an artificial intelligence reasoning device, which comprises:

the determining module is used for acquiring frame data based on a scene thread and determining a target algorithm model corresponding to the frame data;

the scheduling module is used for storing the frame data into a storage area corresponding to a target buffer instance based on the target buffer instance corresponding to the target algorithm model;

the detection module is used for taking out data to be inferred from the current storage area based on a scheduling thread corresponding to the current buffer instance when the data in the current storage area meets the preset requirement;

the synthesis module is used for synthesizing the data to be reasoned into batch data;

and the reasoning module is used for reasoning the batch data according to the current buffer example.

B12, the artificial intelligence reasoning device as B11, further comprising a creating module;

the creating module is used for creating a scheduling instance corresponding to each scene thread according to the algorithm model corresponding to each scene thread when a system starting instruction is obtained;

the scheduling module is further configured to transmit the frame data to a target buffer instance corresponding to the target algorithm model according to a target scheduling instance corresponding to the scene thread, so that the target buffer instance stores the frame data to a storage area corresponding to the target buffer instance.

B13, the artificial intelligence reasoning device as B12, the target schedule instance transmitting frame data to the target buffer instance through a channel.

B14, the artificial intelligence reasoning device as B13, further comprising a distribution module;

and the distribution module is used for distributing the inference result to each scene thread corresponding to the batch data when the batch data inference is detected to be completed.

B15, the artificial intelligence reasoning apparatus as in B14, wherein the distribution module is further configured to, when it is detected that the reasoning of the batch data is completed, determine the channel identifiers corresponding to each frame of data in the batch data, search for the corresponding thread identifiers according to the channel identifiers, and distribute the reasoning results to the corresponding scene threads according to the thread identifiers.

B16, the artificial intelligence reasoning device as B13, the artificial intelligence reasoning device also comprises a result returning module;

and the result returning module is used for searching a target thread identifier corresponding to the target scene thread according to the target frame data and sending a target inference result to the target scene thread according to the target thread identifier when the target frame data corresponding to the target scene thread is detected to be inferred.

B17, the artificial intelligence reasoning device of B11, wherein the reasoning module is further configured to send the batch data to a reasoning engine according to the current buffer instance, so that the reasoning engine can reason the batch data according to a current reasoning algorithm corresponding to the current buffer instance.

B18, the artificial intelligence reasoning device as in any one of B11-B17, wherein the detecting module is further configured to periodically detect the data in the current storage area according to a preset detection period based on the scheduling thread corresponding to the current buffer instance, and, when data is detected in the current storage area, fetch the data to be reasoned from the current storage area according to the scheduling thread.

The invention also discloses C19, an artificial intelligence reasoning device, the device includes: a memory, a processor, and an artificial intelligence reasoning program stored on the memory and operable on the processor, the artificial intelligence reasoning program configured to implement the artificial intelligence reasoning method as claimed in any one of a1-a 10.

The invention also discloses D20 and a storage medium, wherein the storage medium is stored with an artificial intelligence reasoning program, and the artificial intelligence reasoning program is executed by a processor to realize the artificial intelligence reasoning method of any one of A1 to A10.
