Image analysis device and image analysis system
Reading note: This technology, "Image analysis device and image analysis system", was designed and created by 宇野礼于 and 土田安纩 on 2020-04-01. Its main content is as follows: In the image analysis device and the image analysis system, even when the type of object recognition for an image input from a certain camera differs from the type of object recognition for an image input from another camera, an appropriate inference processor is assigned to each inference process of the neural network models corresponding to these object recognition and object detection tasks. The allocation of chips (inference processors), among a plurality of chips, to the inference processing of each of the neural network models is performed based on the inference time and use frequency of each of the object detection neural network model and the object recognition neural network models included in the artificial intelligence (AI) inference instance (S3, S5, S9, S13, S16, and S19). Thus, even when the type of object recognition for an image input from a certain camera differs from the type of object recognition for an image input from another camera, an appropriate chip is assigned to each inference process of the neural network models corresponding to these object recognition and object detection tasks.
1. An image analysis apparatus connected to a plurality of cameras,
the image analysis apparatus comprising:
an image analysis unit that analyzes an image input from each of the plurality of cameras using each instance of an image analysis program including a learned object detection neural network model for detecting an object captured in an image input from each of the cameras and one or more learned object recognition neural network models for recognizing an object detected by the learned object detection neural network model;
a plurality of inference processors for performing inference processing in the learned object detection neural network model and the learned object recognition neural network models; and
a processor allocation unit that allocates inference processors, among the plurality of inference processors, to the inference processing in the learned object detection neural network model and the inference processing in each learned object recognition neural network model, based on the inference time and use frequency required for the inference processing in each of the learned object detection neural network model and the learned object recognition neural network models included in each instance of the image analysis program.
2. The image analysis apparatus according to claim 1,
the processor allocation unit estimates the number of inference processors necessary for the inference processing of each learned object recognition neural network model, based on the inference time necessary for the inference processing of each learned object recognition neural network model and the use frequency of each learned object recognition neural network model.
3. The image analysis apparatus according to claim 1 or 2,
the processor allocation unit estimates the number of inference processors necessary for the inference processing of the learned object detection neural network model, based on the inference time necessary for the inference processing in the learned object detection neural network model and the number of cameras that are input sources of images of objects to be detected by the learned object detection neural network model.
4. The image analysis apparatus according to claim 2,
the processor allocation unit estimates the number of inference processors necessary for the inference processing of each learned object recognition neural network model, based on the inference time necessary for the inference processing of each learned object recognition neural network model, the use frequency of each learned object recognition neural network model, and the target number of frames to be processed by inference of each learned object recognition neural network model in a predetermined time.
5. The image analysis apparatus according to claim 3,
the processor allocation unit estimates the number of inference processors necessary for the inference processing of the learned object detection neural network model, based on the inference time necessary for the inference processing in the learned object detection neural network model, the number of cameras that are input sources of images of objects to be detected by the learned object detection neural network model, and the target number of frames to be processed by inference of the learned object detection neural network model in a predetermined time.
6. The image analysis apparatus according to claim 1,
the image analysis apparatus further includes an image accumulation unit that accumulates images input from the cameras,
and when the processor allocation unit is unable, at one point in time, to allocate an inference processor to the inference processing of the learned object detection neural network model or a learned object recognition neural network model, then, after the processor allocation unit becomes able to allocate an inference processor to the inference processing of the corresponding learned object detection neural network model or learned object recognition neural network model, the image analysis apparatus performs the inference processing of the corresponding learned object detection neural network model or learned object recognition neural network model in non-real time based on the past images accumulated in the image accumulation unit.
7. The image analysis apparatus according to claim 1,
cameras connected to the image analysis apparatus are classified into a plurality of camera groups, and the image analysis program corresponding to each of the plurality of camera groups is configured using a combination of the learned object detection neural network model and the learned object recognition neural network models that differs from group to group.
8. An image analysis system is provided with:
a plurality of image analysis devices;
a plurality of cameras respectively connected to the image analysis devices; and
a management server that manages the image analysis apparatuses and the cameras, including the installation of the image analysis program to each image analysis apparatus,
each of the image analysis devices includes:
an image analysis circuit that analyzes an image input from each of the plurality of cameras using each instance of an image analysis program including a learned object detection neural network model for detecting an object captured in an image input from each of the cameras and one or more learned object recognition neural network models for recognizing an object detected by the learned object detection neural network model;
a plurality of inference processors for performing inference processing in the learned object detection neural network model and the learned object recognition neural network models; and
a processor allocation circuit configured to allocate inference processors, among the plurality of inference processors, to the inference processing in the learned object detection neural network model and the inference processing in each learned object recognition neural network model, based on the inference time and use frequency required for the inference processing in each of the learned object detection neural network model and the learned object recognition neural network models included in each instance of the image analysis program.
9. The image analysis system of claim 8,
cameras connected to the plurality of image analysis devices are classified into a plurality of camera groups, and the image analysis program corresponding to each of the plurality of camera groups is configured using a combination of the learned object detection neural network model and the learned object recognition neural network models that differs from group to group.
Technical Field
The present invention relates to an image analysis device and an image analysis system.
Background
Conventionally, the following devices or systems are known: an object such as a person captured in a frame image captured by a camera such as a monitoring camera is detected by an object detection neural network or the like, and the detected object is recognized by an object recognition neural network (for example, see patent document 1).
Prior art documents
Patent document
Patent document 1: JP 2017-224925 publication
Disclosure of Invention
(problems to be solved by the invention)
However, in the above-described apparatus or system that detects and recognizes objects using neural networks (hereinafter simply referred to as an "apparatus or system that performs object detection and recognition"), both object detection and object recognition are processes that require a large amount of computer resources. In addition, the time required to recognize all the objects in one frame image depends on the number of (detected) objects contained in that frame.
Therefore, it takes a long time to recognize the objects in a frame image in which many objects are detected. As a result, so-called frame dropping occurs: objects in the frame images input during the period following a frame image containing many objects cannot be recognized (recognition omissions occur).
As a conventional countermeasure to the above problem, there is a method in which the thread for object detection processing and the thread for object recognition processing are separated and run in parallel, and a large number of inference processors such as GPUs (Graphics Processing Units) are allocated to the inference processing of the object recognition neural network in order to speed up the object recognition processing.
However, when the apparatus or system that performs object detection and recognition performs object detection and object recognition on the input images from one camera, the types of object detection and object recognition performed on the input images are fixed, so the above conventional method can be applied. In contrast, when object detection and object recognition are performed on the input images from a plurality of cameras, the types of object detection and object recognition performed on the input images from the respective cameras are not necessarily all the same, so the above method cannot be applied as it is. More specifically, when such an apparatus or system performs object detection and object recognition on input images from a plurality of cameras and the type of object recognition for an image input from a certain camera differs from the type of object recognition for images input from other cameras, assigning a large number of GPUs to (the inference processing of) the neural networks corresponding to all types of object recognition makes the cost too high. It is therefore necessary to assign an appropriate inference processor to the inference processing of each neural network model in consideration of the processing time (inference time) and the use frequency of each neural network model corresponding to the plurality of types of object recognition and to object detection.
The present invention has been made to solve the above-described problems, and an object of the present invention is to provide an image analysis device and an image analysis system capable of assigning an appropriate inference processor to each inference process of each of neural network models corresponding to object recognition and object detection even when the kind of object recognition with respect to an image input from one of a plurality of cameras is different from the kind of object recognition with respect to an image input from another camera.
(means for solving the problems)
In order to solve the above problem, an image analysis device according to a first aspect of the present invention is connected to a plurality of cameras, and includes: an image analysis unit that analyzes an image input from each of the plurality of cameras using each instance of an image analysis program including a learned object detection neural network model for detecting an object captured in an image input from each of the cameras and one or more learned object recognition neural network models for recognizing an object detected by the learned object detection neural network model; a plurality of inference processors for performing inference processing in the learned object detection neural network model and the learned object recognition neural network models; and a processor allocation unit that allocates inference processors, among the plurality of inference processors, to the inference processing in the learned object detection neural network model and the inference processing in each learned object recognition neural network model, based on the inference time and use frequency required for the inference processing of each of the learned object detection neural network model and the learned object recognition neural network models included in each instance of the image analysis program.
In the image analysis device, the processor assigning unit may estimate the number of the inference processors necessary for the inference process of each of the learned object recognition neural network models, based on the inference time necessary for the inference process of each of the learned object recognition neural network models and the use frequency of each of the learned object recognition neural network models.
In the image analysis device, the processor allocation unit may estimate the number of inference processors necessary for the inference processing of the learned object detection neural network model from the inference time necessary for the inference processing in the learned object detection neural network model and the number of cameras that are input sources of images of objects to be detected by the learned object detection neural network model.
In the image analysis device, the processor assigning unit may estimate the number of the inference processors necessary for the inference process of each of the learned object recognition neural network models, based on an inference time necessary for the inference process of each of the learned object recognition neural network models, a frequency of use of each of the learned object recognition neural network models, and a target frame number for the inference process of each of the learned object recognition neural network models in a predetermined time.
In the image analysis device, the processor allocation unit may estimate the number of inference processors necessary for the inference processing of the learned object detection neural network model based on the inference time necessary for the inference processing in the learned object detection neural network model, the number of cameras that are input sources of images of objects to be detected by the learned object detection neural network model, and the target number of frames to be processed by inference of the learned object detection neural network model in a predetermined time.
The image analysis apparatus may further include an image accumulation unit that accumulates images input from the cameras. When the processor allocation unit is unable, at one point in time, to allocate an inference processor to the inference processing of the learned object detection neural network model or a learned object recognition neural network model, then, after the processor allocation unit becomes able to allocate an inference processor to the inference processing of the corresponding learned object detection neural network model or learned object recognition neural network model, the image analysis device performs the inference processing of the corresponding learned object detection neural network model or learned object recognition neural network model in non-real time on the basis of the past images accumulated in the image accumulation unit.
In the image analysis device, the cameras connected to the image analysis device may be classified into a plurality of camera groups, and the image analysis program corresponding to each of the plurality of camera groups may be configured using a combination of the learned object detection neural network model and the learned object recognition neural network models that differs from group to group.
An image analysis system according to a second aspect of the present invention includes: a plurality of image analysis devices; a plurality of cameras respectively connected to the image analysis devices; and a management server that manages the image analysis devices and the cameras, including the installation of the image analysis program to each image analysis device. Each of the image analysis devices includes: an image analysis circuit that analyzes an image input from each of the plurality of cameras using each instance of an image analysis program including a learned object detection neural network model for detecting an object captured in an image input from each of the cameras and one or more learned object recognition neural network models for recognizing an object detected by the learned object detection neural network model; a plurality of inference processors for performing inference processing in the learned object detection neural network model and the learned object recognition neural network models; and a processor allocation circuit configured to allocate inference processors, among the plurality of inference processors, to the inference processing in the learned object detection neural network model and the inference processing in each learned object recognition neural network model, based on the inference time and use frequency required for the inference processing in each of the learned object detection neural network model and the learned object recognition neural network models included in each instance of the image analysis program.
Preferably, in the image analysis system, the cameras connected to the plurality of image analysis devices are classified into a plurality of camera groups, and the image analysis program corresponding to each of the plurality of camera groups is configured using a combination of the learned object detection neural network model and the learned object recognition neural network models that differs from group to group.
(effect of the invention)
According to the image analysis device of the first aspect of the present invention, the allocation of inference processors, among the plurality of inference processors, to the inference processing in the learned object detection neural network model and the inference processing in each learned object recognition neural network model is performed based on the inference time and use frequency of each of the learned object detection neural network model and the learned object recognition neural network models included in each instance of the image analysis program. Thus, even when the type of object recognition for an image input from one of the plurality of cameras differs from the type of object recognition for an image input from another camera, an appropriate inference processor can be assigned to the inference processing of each of the neural network models corresponding to object recognition and object detection, taking into account the processing time (inference time) and the use frequency of each neural network model. Accordingly, efficient object recognition can be performed on the images input from each of the plurality of cameras using a limited number of inference processors.
In addition to the above-described effects, the image analysis system according to the second aspect of the present invention can manage the image analysis apparatus including the installation of the image analysis program into the image analysis apparatus by using the management server.
Drawings
Fig. 1 is a block configuration diagram showing a schematic configuration of an image analysis system including an analysis box according to an embodiment of the present invention.
Fig. 2 is a block diagram showing a schematic hardware configuration of the analysis box.
Fig. 3 is a functional block configuration diagram of the CPU in the analysis box.
Fig. 4 is a configuration diagram of the main software in the analysis box.
Fig. 5 is an explanatory diagram of a method of calculating the number of chips required for the inference processing of each object recognition NN model in the analysis box.
Fig. 6 is an explanatory diagram of communication between an Artificial Intelligence (AI) inference instance in the analysis box and the GPU server.
Fig. 7 is a flowchart of the assignment process of assigning the NN model to the chip by the GPU server.
Fig. 8 is an explanatory diagram of a camera group and an application group in the image analysis system.
Fig. 9 is an explanatory diagram showing an example of the application group.
Fig. 10 is an explanatory diagram of vectorization processing by the vectorization model in fig. 9.
Fig. 11 is a block diagram showing connection between each camera and each analysis box in the store and the management server.
Fig. 12 is an explanatory diagram showing an example of a unit of management of the cameras connected to each analysis box in the image analysis system.
Detailed Description
Hereinafter, an image analysis device and an image analysis system according to an embodiment of the present invention will be described with reference to the drawings. Fig. 1 is a block configuration diagram showing a schematic configuration of an image analysis system including the analysis box according to the present embodiment.
As shown in fig. 1, the image analysis system includes a plurality of analysis boxes, a plurality of cameras connected to the analysis boxes, and the management server 7. The management server 7 manages the plurality of analysis boxes and the cameras, including the installation of the image analysis program (application package) to each analysis box.
Next, the hardware configuration of the analysis box will be described with reference to fig. 2.
The (inference) chips 14a to 14h are preferably processors (inference-dedicated chips) optimized for DNN (Deep Neural Network) inference, but may be general-purpose GPUs (Graphics Processing Units) or other general-purpose processors. Each of the chips 14a to 14h executes the inference processing of the NN model (the object detection NN model or an object recognition NN model) assigned to it.
As shown in fig. 2, the (inference) chips 14a to 14h are connected to the CPU 11 through PCI Express or USB; some of the chips are connected through PCI Express and the others through USB.
The communication control IC15 has a LAN port 16 as a connection port for connecting to an Ethernet-compliant LAN.
Fig. 3 shows the functional blocks of the CPU 11 in the analysis box.
The image analysis unit 18 analyzes the images input from the cameras using each instance of the image analysis program.
Next, the main software configuration in the analysis box will be described with reference to fig. 4.
The AI inference instances 23a to 23c are each an instance of the application package (corresponding to the image analysis program described above).
The analysis box OS 24 controls applications such as the AI inference instances 23a to 23c.
Next, a basic guideline for assigning each of the object detection NN model and the object recognition NN models to the chips 14a to 14h will be described.
1. Although the NN models (the object detection NN model or the object recognition NN models) assigned to the chips 14a to 14h can be changed, the cost (time) of replacing an NN model on a chip is taken into account (see 2-3 below).
2. Each NN model is assigned to the chips in accordance with the following sub-guidelines.
2-1. At a frame rate at which all recognition objects can be recognized with the current chip configuration (the types and number of chips), the object detection NN model is assigned to the minimum number of chips capable of performing object detection for the images from all cameras. Here, "minimum" means minimum from the viewpoint of both the number of chips and their performance.
2-2. Each object recognition NN model is assigned to an appropriate number of chips based on its recognition inference time (for example, for classification) (the inference time required for the inference processing of the object recognition NN model) and the necessity of the recognition (its use frequency and priority).
2-3. When the assignments in 2-1 and 2-2 above change from one point in time to another, NN models may be replaced (exchanged) on the chips, taking into account the minimum time (cost) required for replacing one NN model.
3. Recognition processing with lower priority can be postponed (need not be performed in real time). That is, such recognition processing may be performed in an idle time period based on (the data of) the frame images from the VMS server 20 (see fig. 4).
By assigning the respective NN models to the chips in accordance with the above guidelines, efficient inference processing can be performed with a limited number of chips.
Next, an example of a manner in which the number of chips required for the inference processing of each NN model is calculated will be described.
The above points are explained with reference to fig. 5. In fig. 5, the object detection NN model detects recognition objects (for example, persons) captured in each frame image 33 input from the cameras, and the object recognition NN models perform recognition on the detected objects.
The number of recognition objects obtained as a result of object detection for each frame image 33, and the time taken to perform the inference processing of the object recognition NN models on these recognition objects, differ depending on the time period. For example, the number of recognition objects differs between a busy evening time period and a quiet afternoon time period, so the time required for the object recognition processing of these recognition objects also differs. Therefore, this situation is expected to require assigning the respective NN models (for example, the object recognition NN models) to different numbers of chips depending on the time period.
The management server 7 (see fig. 1) allows a manager to allocate each NN model necessary for the analysis (object recognition) to be performed to the chips by a simple procedure (by inputting the model performance, the number of cameras, the target performance, and the like used in the "required chip number estimation" processing described later). Specifically, when the manager inputs the model performance, the number of cameras, the target performance, and the like on the management server 7, the management server 7 transmits the input information to the analysis box (the GPU server 25).
Next, an example of the method of assigning the NN models to the chips by the GPU server 25 will be described.
Fig. 6 shows the communication between an AI inference instance 23 and the GPU server 25 in the analysis box. First, as shown in fig. 6, when an AI inference instance 23 is started, it requests the GPU server 25 to perform the inference processing of the NN models that the instance contains. The GPU server 25 assigns chips to these NN models and executes the requested inference processing in the corresponding inference threads.
The processing performed by the GPU server 25 is described below.
Next, the example of the assignment method of the NN models to the chips described above with reference to fig. 5 will be described in more detail with reference to fig. 7. In the following description, the assignment processing is described as being performed by the GPU server 25.
Fig. 7 is a flowchart of the assignment process in which the GPU server 25 (more precisely, the processor assigning unit 19) assigns the NN models to the chips. The assignment process is performed as a timer-based batch process at every given time (for example, once an hour). Before starting the processing shown in the flowchart of fig. 7, the GPU server 25 acquires, from each AI inference instance 23, the model ID, the model path, the model performance (value), and the priority of each NN model included in the instance, as well as the priority of the AI inference instance 23 itself.
Then, the GPU server 25 performs the following processes (1) to (3).
(1) Based on the pieces of information acquired as described above (the model ID, the model path, the model performance (value), the priority of each NN model itself, and the priority of each AI inference instance 23 itself), parameter groups (model ID, model path, model performance (value), priority) are generated for all the NN models included in all the AI inference instances 23, and a list L (l1, l2, ……, lN) is created by rearranging these parameter groups in order of priority.
(2) The current assignments of the NN models to the chips at this time point are reset (released).
(3) The processes from S1 onward in fig. 7 are executed sequentially for each element li in the list L, starting from the top element, and the NN models are thereby assigned to the chips.
Here, when the priority is a negative value (when a negative value is set for the priority of the NN model itself), the NN model is allowed to remain unassigned to a chip in the following processing. When the NN model is not assigned to a chip, the inference processing based on that NN model is postponed and performed using the frame images stored in the VMS server 20, as described in paragraph (0036).
The priority of an AI inference instance 23 is calculated from the actual amounts of inference performed in the past by that AI inference instance 23 (all the NN models it contains). For example, the priority of the AI inference instance 23 may be calculated from the accumulated value of its inference times in the previous period, or the inference time of the AI inference instance 23 in the next period may be predicted from the accumulated values of its inference times in each weekday/time period over roughly the past month, and the priority may be set accordingly.
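The preparation steps (1) to (3) above can be sketched as follows. This is a minimal illustration rather than the patent's implementation; the data layout (a `ModelParams` record, and a list of instances each holding its NN models) is an assumption introduced for the sketch.

```python
from dataclasses import dataclass

@dataclass
class ModelParams:
    model_id: str
    model_path: str
    model_perf: float  # inference time (s) per frame ("model performance")
    priority: int      # negative => the model may be left unassigned

def build_assignment_list(instances):
    """Step (1): gather the parameter group of every NN model contained in
    every AI inference instance, then sort by priority, highest first.
    The returned list corresponds to L = (l1, l2, ..., lN)."""
    params = [model for instance_models in instances for model in instance_models]
    return sorted(params, key=lambda m: m.priority, reverse=True)
```

Models with negative priority end up at the tail of L, which matches the rule that they are considered for chips only after all NN models with non-negative priority have been handled.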
When the chip assignment process for a certain element li (the corresponding NN model) in the list L in (3) above is started (yes in S1), the GPU server 25 determines whether the NN model of the element li has already been assigned to a chip, that is, whether the model ID of the element li is already stored in the model ID array (S2). If not (no in S2), the GPU server 25 performs the required chip number estimation process for the NN model (S3).
Next, when there remain chips to which no NN model has yet been assigned among the (inference) chips 14a to 14h shown in fig. 2 (yes in S4), the GPU server 25 assigns the estimated number of chips to the NN model of the element li (S5).
As a result of the chip assignment process at S5, the NN model to which chips have been assigned by the assignment process can be used in the analysis box (a "true" state) (S7). In fig. 7, "true" indicates that at least one chip is assigned to the corresponding NN model, meaning that the NN model can be used. In contrast, "false" indicates that not even one chip is assigned to the corresponding NN model, meaning that the NN model cannot be used. In the "false" case, the inference processing based on this NN model is postponed, as described in paragraph (0036).
In the determination process at S4, if no unassigned chip remains (no at S4), the GPU server 25 determines whether the priority of the element li is 0 or more and whether a chip can be called from another inference thread (S8). If so (yes at S8), the GPU server 25 calls a chip from the other inference thread and assigns it to the NN model of the element li (S9).
As a result of the chip assignment process called at S9, the NN model to which chips have been assigned by this process can be used in the analysis box (a "true" state) (S11). However, if, as a result of the determination at S8, the priority of the element li is less than 0 (negative) or a chip cannot be called from another inference thread (no at S8), the GPU server 25 does not assign a chip to the NN model of the element li, which therefore remains unusable (in the "false" state).
In the determination processing at S2, when the NN model of the corresponding element li has already been assigned to a chip (when the model ID of the element li is already stored in the model ID array) (yes at S2), the number of NN models assigned to chips does not increase, but the number of instances of the corresponding NN model increases. Therefore, the GPU server 25 performs the required chip number estimation process for the corresponding NN model again (S13).
In the estimation process at S13, since the number of instances of the NN model has increased compared to when the required chip number estimation process for the NN model (the estimation process at S3) was first performed, the number of cameras on which the required chip number estimation for the object detection NN model and the object recognition NN models is based, and the average number of persons (average number of recognition objects) captured by each camera on which the required chip number estimation for the object recognition NN models is based, increase. Therefore, the GPU server 25 determines whether it is necessary to add chips assigned to the corresponding NN model (S14).
If the result of the determination at S14 is that it is necessary to add chips assigned to the corresponding NN model (yes at S14), and if there remain chips to which no NN model has been assigned (yes at S15), the GPU server 25 additionally assigns the necessary number of unassigned chips to the corresponding NN model (S16).
On the other hand, if the result of the determination at S15 is that no unassigned chip remains (no at S15), the GPU server 25 determines whether the priority of the element li is 0 or more and whether a chip can be called from another inference thread (S18). If so (yes at S18), the GPU server 25 calls a chip from the other inference thread and additionally assigns it to the corresponding NN model (S19).
When the additional assignment process at S19 is performed, two or more chips are assigned to the corresponding NN model, so the corresponding NN model is naturally in the "true" state (S20). When, as a result of the determination at S18, the priority of the element li is less than 0 (negative) or a chip cannot be called from another inference thread (no at S18), additional assignment of a chip to the corresponding NN model cannot be performed. However, even in this case, chips should already have been assigned to the NN model of the corresponding model ID, based on the determination result at S2 (the determination that the NN model has already been assigned to a chip), so the NN model is naturally in the "true" state (S21).
In the above description, after all the assignment processing for the NN models having a priority of 0 or higher is completed, if any unallocated chip remains (yes at S4 and yes at S15), a chip is assigned even to an NN model having a negative priority. However, the present invention is not limited to this; even when no unallocated chips remain, if there is a thread of an NN model that sufficiently exceeds the target performance described later, a chip allocated to that NN model may be called and reallocated to the NN model having a negative priority.
As a result of the determination at S14, if there is no need to add chips assigned to the corresponding NN model (no at S14), that is, if the number of currently assigned chips is sufficient even if the number of instances of the corresponding NN model increases, the corresponding NN model is naturally in the "true" state (S22).
Next, the required chip number estimation process at S3 and S13 will be described in detail. The content of the required chip number estimation process differs depending on whether the target NN model is an object detection NN model or an object recognition NN model. In either case, however, the GPU server 25 (processor assigning unit 19) estimates the number of inference processors necessary for the inference process of each NN model from the inference time and the use frequency of the inference process of that NN model. In the following description of the required chip number estimation process, for the sake of simplicity, an example will be described in which 1
First, the required chip number estimation process for the object detection NN model will be described. In this case, the GPU server 25 (processor assigning unit 19) estimates the number of chips (required performance) necessary for the inference process of the object detection NN model by the following equation, from the number K of cameras serving as input sources of images for object detection by the NN model, the model performance T of the NN model (the inference time in seconds required for one inference process of the object detection NN model), and the target performance F of the NN model (the target number of frames (fps: frames per second) for which the object detection NN model performs inference processing within a given time (1 second)).
Required performance (number of chips) = K × F × T
For example, if the number of cameras K is 3 (units), the model performance T is 0.05 (seconds), and the target performance F is 6 (fps), the required performance of the object detection NN model (the number of chips required for its inference process) is calculated by the following equation.
Required performance (number of chips) = 3 × 6 × 0.05 = 0.9
Therefore, in the above example, 1 chip is required (the value 0.9 is rounded up to a whole chip). The target performance F serves as a reference value for estimating the required number of chips, and is also used to compare the performance and the degree of resource fullness with other threads (each corresponding to an NN model).
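The detection-side estimate above can be sketched in a few lines of Python (a minimal illustration, not part of the embodiment; the function name and the round-up to a whole chip are assumptions drawn from the worked example):

```python
import math

def required_chips_detection(num_cameras: int, target_fps: float,
                             inference_time_s: float) -> int:
    """Required performance = K * F * T, rounded up to a whole chip."""
    required = num_cameras * target_fps * inference_time_s
    return math.ceil(required)

# Worked example from the text: K = 3 cameras, F = 6 fps, T = 0.05 s
# 3 * 6 * 0.05 = 0.9 -> 1 chip
print(required_chips_detection(3, 6, 0.05))  # 1
```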
Next, the required chip number estimation process for the object recognition NN model will be described. In this case, the GPU server 25 (processor assigning unit 19) estimates the number of chips (required performance) necessary for the inference process of the object recognition NN model by the following equation, from the average numbers of persons N1, N2, … captured by each camera serving as an input source of images for the object recognition NN model (that is, the frequency of use of the object recognition NN model), the model performance T of the NN model (the inference time in seconds required for one inference process of the object recognition NN model), and the target performance F of the NN model (the target number of frames (fps) for which the object recognition NN model performs inference processing within a given time).
Required performance (number of chips) = sum(N1, N2, …) × F × T
(where sum(N1, N2, …) represents the sum (total) of N1, N2, ….)
For example, if there are 3 cameras as input sources of images for the object recognition NN model, the average numbers of persons captured by the cameras are 5, 2, and 3, the model performance T is 0.03 seconds, and the target performance F is 6 fps, the required performance of the object recognition NN model (the number of chips required for its inference process) is calculated by the following equation.
Required performance (number of chips) = (5 + 2 + 3) × 6 × 0.03 = 1.8
Therefore, in the above example, 2 chips are required (1.8 rounded up). As in the case of the object detection NN model, the target performance F serves as a reference value for estimating the required number of chips, and is also used to compare the performance and the degree of resource fullness with other threads (each corresponding to an NN model).
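The recognition-side estimate differs from the detection-side one only in that the per-camera average person counts replace the camera count. A minimal sketch (illustrative only; the function name and round-up are assumptions consistent with the worked example):

```python
import math

def required_chips_recognition(avg_persons_per_camera, target_fps: float,
                               inference_time_s: float) -> int:
    """Required performance = sum(N1, N2, ...) * F * T, rounded up."""
    required = sum(avg_persons_per_camera) * target_fps * inference_time_s
    return math.ceil(required)

# Worked example from the text: averages 5, 2, 3 persons; F = 6 fps; T = 0.03 s
# (5 + 2 + 3) * 6 * 0.03 = 1.8 -> 2 chips
print(required_chips_recognition([5, 2, 3], 6, 0.03))  # 2
```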
Next, the calling process for calling a chip from another inference thread, which was mentioned in the explanations of S8, S9, S18, and S19 above, is described in more detail. Before describing the specific procedure, the basic principle of calling a chip from another inference thread is described. In S4 and S15 above, when a chip is called from another inference thread (the thread of the corresponding NN model) because no unallocated chip remains, the assignment of only 1 chip should be changed at a time, rather than changing the assignment of more chips at once. This is because, when the number of chips allocated to the thread of the object detection NN model is increased or decreased by a large amount at once, the inflow amount of data (mainly frame data of objects to be recognized) flowing into the threads of the object recognition NN models, which perform object recognition after the object detection process, also increases or decreases greatly, so that the required number of chips for the thread of each NN model would need to be estimated again (re-estimated).
Next, a specific procedure for calling the processing of the chip from another inference thread will be described.
1. First, the threads of NN models to which a plurality of chips are assigned are enumerated.
2. If there are threads with a data loss rate of 0 (described later) among the threads listed in 1 above, these threads are listed again. On the other hand, when there is no thread having a data loss rate of 0 among the threads listed in 1, the threads listed in 1 are sorted in ascending order of priority, and threads of NN models having the same priority are further sorted in descending order of the recognition rate described later (i.e., threads with a higher recognition rate come first).
3. If threads with a data loss rate of 0 were listed in 2 above, 1 chip is released from the top (first) of those threads. When there are a plurality of threads having a data loss rate of 0, it is preferable to release 1 chip from the thread whose NN model has the lowest priority. When there is no thread having a data loss rate of 0, 1 chip is released from the top (first) thread of the list sorted in 2 above.
4. The released chip is assigned to the thread (of an NN model) that needs a chip (the thread of an NN model with no assigned chip, or the thread with the lowest recognition rate).
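The donor-selection logic in steps 1 to 3 can be sketched as follows (a hypothetical Python illustration; the `InferenceThread` fields and function name are assumptions, and the data loss rate and recognition rate are the quantities defined next in the text):

```python
from dataclasses import dataclass

@dataclass
class InferenceThread:
    name: str
    chips: int               # chips currently assigned to this thread
    priority: int            # priority of the NN model the thread runs
    data_loss_rate: float    # share of inflowing data discarded unprocessed
    recognition_rate: float  # R = Fr / Fa

def release_one_chip(threads):
    """Select a donor thread and release exactly 1 chip from it (steps 1-3)."""
    # Step 1: only threads holding 2 or more chips can donate.
    candidates = [t for t in threads if t.chips >= 2]
    if not candidates:
        return None
    # Steps 2-3: prefer a zero-loss thread (lowest priority first); otherwise
    # sort by ascending priority, ties broken by descending recognition rate.
    zero_loss = [t for t in candidates if t.data_loss_rate == 0]
    if zero_loss:
        donor = min(zero_loss, key=lambda t: t.priority)
    else:
        donor = sorted(candidates,
                       key=lambda t: (t.priority, -t.recognition_rate))[0]
    donor.chips -= 1  # never move more than 1 chip at a time
    return donor
```

Step 4 then simply increments the chip count of the thread that needed the chip; keeping the two halves separate mirrors the one-chip-at-a-time principle above.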
Next, the above data loss rate will be described. The data loss rate is the ratio of data that is discarded without undergoing detection or recognition processing, among the data flowing into each thread corresponding to the object detection NN model or an object recognition NN model.
The recognition rate R is expressed by the target performance Fa (fps) and the actual performance Fr (fps) as follows.
R=Fr/Fa
In the above equation, the actual performance Fr represents the measured number of data items (frames) for which the corresponding thread (of the NN model) performed the inference process within a given time (1 second). This actual performance Fr is meaningful only when the above data loss rate is greater than 0. This is because, when the data loss rate is 0, the thread processes all incoming data, so the number of data items (frames) that the thread could infer within the given time (1 second) may be equal to or greater than the measured value (the actual performance Fr). The target performance Fa in the above equation is substantially the same as the target performance F of the object detection NN model and the object recognition NN model, and represents the target number of frames (fps) for which the thread corresponding to each NN model performs inference processing within a given time (1 second).
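As a small worked example of this ratio (illustrative only; the function name is an assumption):

```python
def recognition_rate(actual_fps: float, target_fps: float) -> float:
    """R = Fr / Fa; meaningful only while the data loss rate is above 0."""
    return actual_fps / target_fps

# A thread that actually infers 4.5 frames per second against a 6 fps target:
print(recognition_rate(4.5, 6))  # 0.75
```

A value of R below 1 indicates the thread is falling short of its target performance, which is why step 4 of the calling procedure gives the released chip to the thread with the lowest recognition rate.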
Next, an application group (hereinafter, referred to as an "application group") in the
Fig. 9 shows an example of the application group. The entrance face
Note that the registration of the priorities of the NN models themselves used in the assignment process of the NN models to the chips described above with reference to fig. 7 is performed as follows. That is, when the administrator inputs the priorities of the
On the other hand, the
The GPU server 25 (processor assigning unit 19) assigns the
The vectorization processing performed by the
When the vector V1 obtained from the image 57a of the person detected in the frame image 33a captured by the
Therefore, the image analysis unit 18 of the CPU11 can determine whether or not the person captured by the
Fig. 11 is a diagram showing the connection of the management server 7, each
At this time, if the
Further, in fig. 11, the first application 51a of the magazine 1a is associated with the
The applications 51a to 51c correspond to the application groups in fig. 8. The
Although the correspondence relationship between the
In the above description of fig. 8, 9, 11, and the like, an example is shown in which the management server 7 manages the
The management server 7 can manage each
As described above, according to the
In addition, according to the
In addition, according to the
In addition, according to the
In addition, according to the
Further, according to the
In addition to the above-described effects, the
Modification example:
the present invention is not limited to the configurations of the above embodiments, and various modifications can be made without departing from the scope of the present invention. Next, a modified example of the present invention will be described.
Modification 1:
in the above-described embodiment, an example is shown in which the processor assigning unit 19 (the GPU server 25) estimates the number of chips 14 (inference processors) necessary for the inference process of each NN model (each of the NN model for object detection and the NN model for object recognition), and assigns the estimated number of
Modification 2:
in the necessary chip number estimation processing of the NN model for object detection in the above embodiment, the processor assigning unit 19 (the GPU server 25) estimates the chip number (necessary performance) necessary for the inference processing of the NN model for object detection based on the number K of cameras that are input sources of images of objects to be detected by the NN model for object detection, the model performance T of the NN model for object detection (the inference time necessary for the inference processing of the NN model for object detection), and the target performance F of the NN model for object detection (the target frame number for the NN model for object detection to perform the inference processing in a predetermined time). However, the
In the necessary chip number estimation processing of the NN model for object recognition in the above-described embodiment, the processor assigning unit 19 (the GPU server 25) estimates the number of chips necessary for the inference processing of the NN model for object recognition based on the average number of people N1, N2, … … (that is, the frequency of use of the NN model for object recognition) taken by each camera as an input source of an image of an object to be object recognized based on the NN model for object recognition, the model performance T of the NN model for object recognition (the inference time (seconds) necessary for the inference processing of the NN model for object recognition), and the target performance F of the NN model for object recognition (the number of target frames for inference processing of the NN model for object recognition in a predetermined time). However, the present invention is not limited to this, and the
Modification 3:
although the above embodiment shows an example in which the
(description of reference numerals)
1 analysis box (image analysis device)
2 network camera (Camera)
7 management server
10 image analysis system
14 a-14 h chip (reasoning processor)
18 image analysis section
19 processor distribution part
20 VMS server
22 storage device (image accumulating unit)
23, 23a, 23b, 23c AI inference instances (instances of the image analysis program)
32 decoder
51 application (image analysis program)
K number of cameras
T model performance (the inference time required for the inference process of each learned object recognition neural network model, and the inference time required for the inference process of the learned object detection neural network model)
F target performance (the target number of frames for which each learned object recognition neural network model performs inference processing within a given time, and the target number of frames for which the learned object detection neural network model performs inference processing within a given time).