Method, apparatus, electronic device, and medium for generating video

Document No.: 1834889 | Publication date: 2021-11-12 | Views: 38 | Original language: Chinese

Reading note: This technique, "Method, apparatus, electronic device, and medium for generating video," was designed and created by Liu Ran on 2021-08-13. Abstract: Embodiments of the present disclosure disclose methods, apparatuses, electronic devices, and media for generating video. One embodiment of the method comprises: acquiring a background image of a region to be monitored, wherein the region to be monitored comprises a preset alarm region, and the background image is an image of the region to be monitored in which no preset intruding object is present; acquiring an image of the preset intruding object; generating at least two fused images, wherein the fused images present the preset intruding object at different positions in the region to be monitored; and converting the at least two fused images into a video. This embodiment enables flexible, on-demand construction of video data sets, enriches the ways in which video can be generated, and provides a basis for constructing test videos for the security field efficiently and at low cost.

1. A method for generating a video, comprising:

acquiring a background image of a region to be monitored, wherein the region to be monitored comprises a preset alarm region, and the background image is an image of the region to be monitored in which no preset intruding object is present;

acquiring an image of the preset intruding object;

generating at least two fused images, wherein the fused images present the preset intruding object at different positions in the region to be monitored; and

converting the at least two fused images into a video.

2. The method of claim 1, wherein the generating at least two fused images comprises:

acquiring a movement trajectory of the preset intruding object in the region to be monitored, wherein the movement trajectory comprises at least two key points; and

generating, for each position in the region to be monitored indicated by the at least two key points, a fused image presenting the preset intruding object at that position.

3. The method of claim 2, wherein the movement trajectory comprises a key point located in the preset alarm region.

4. The method of claim 2, wherein the movement trajectory comprises a key point located outside the preset alarm region but within the region to be monitored.

5. The method of claim 2, wherein the generating a fused image presenting the preset intruding object at the position indicated by the at least two key points comprises:

generating a coordinate sequence group between each pair of adjacent key points in the movement trajectory, wherein each coordinate sequence in the group fits the movement trajectory between its adjacent key points; and

generating, for each position in the region to be monitored indicated by the coordinates in the coordinate sequence group, a fused image presenting the preset intruding object at that position.

6. The method of claim 5, wherein the generating a coordinate sequence group between adjacent key points in the movement trajectory comprises:

selecting a pair of adjacent key points from the movement trajectory;

determining, from the coordinates of the selected adjacent key points and a preset pixel span value, the number of coordinates in the coordinate sequence corresponding to the selected adjacent key points; and

interpolating, according to the coordinates of the selected adjacent key points and the slope they indicate, to generate the coordinates in the coordinate sequence corresponding to the selected adjacent key points.

7. The method according to any one of claims 1-6, further comprising:

inputting the video into a preset intruding-object detection model to generate a detection result, wherein the detection result indicates whether to raise an alarm for the presence of the preset intruding object in the preset alarm region; and

adjusting the intruding-object detection model according to the detection result.

8. An apparatus for generating a video, comprising:

a first acquisition unit configured to acquire a background image of a region to be monitored, wherein the region to be monitored comprises a preset alarm region, and the background image is an image of the region to be monitored in which no preset intruding object is present;

a second acquisition unit configured to acquire an image of the preset intruding object;

a generating unit configured to generate at least two fused images, wherein the fused images present the preset intruding object at different positions in the region to be monitored; and

a conversion unit configured to convert the at least two fused images into a video.

9. An electronic device, comprising:

one or more processors;

a storage device having one or more programs stored thereon;

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.

10. A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1-7.

Technical Field

Embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method, an apparatus, an electronic device, and a medium for generating a video.

Background

With the development of artificial intelligence, anomaly recognition in video surveillance is increasingly applied in the security field. In the prior art, a high-definition camera is combined with electronic-fence technology, and an anomaly recognition algorithm judges whether a person or object has intruded into a specific space, thereby monitoring a core area.

To verify the effectiveness of the anomaly recognition algorithm, a video stream to be tested is prepared in advance and played back to simulate a real environment, so as to verify whether the algorithm can effectively detect the intrusion of a person or object.

In the prior art, the monitored area is often a user's core area, so acquiring the corresponding video requires process approvals and the like, making the acquisition cycle long and the acquisition difficult. When the monitored object is large equipment (such as a forklift or truck), an intrusion-scene video can only be captured by coordinating professional operators. Moreover, if a video stream returned from the site is used for testing, only the motion trajectory actually recorded in the video can be tested; intrusion trajectories cannot be constructed flexibly, so the covered test range is limited. Therefore, how to flexibly and efficiently construct test data sets is an urgent problem for testing in this field.

Disclosure of Invention

Embodiments of the present disclosure propose methods, apparatuses, electronic devices, and media for generating videos.

In a first aspect, an embodiment of the present disclosure provides a method for generating a video, the method including: acquiring a background image of a region to be monitored, wherein the region to be monitored comprises a preset alarm region, and the background image is an image of the region to be monitored in which no preset intruding object is present; acquiring an image of the preset intruding object; generating at least two fused images, wherein the fused images present the preset intruding object at different positions in the region to be monitored; and converting the at least two fused images into a video.

In some embodiments, the generating at least two fused images includes: acquiring a movement trajectory of the preset intruding object in the region to be monitored, wherein the movement trajectory comprises at least two key points; and generating, for each position in the region to be monitored indicated by the at least two key points, a fused image presenting the preset intruding object at that position.

In some embodiments, the movement trajectory includes a key point located in the preset alarm region.

In some embodiments, the movement trajectory includes a key point located outside the preset alarm region but within the region to be monitored.

In some embodiments, the generating a fused image presenting the preset intruding object at the position indicated by the at least two key points includes: generating a coordinate sequence group between each pair of adjacent key points in the movement trajectory, wherein the coordinate sequences in the group fit the movement trajectory between adjacent key points; and generating, for each position in the region to be monitored indicated by the coordinates in the coordinate sequence group, a fused image presenting the preset intruding object at that position.

In some embodiments, the generating a coordinate sequence group between adjacent key points in the movement trajectory includes: selecting a pair of adjacent key points from the movement trajectory; determining, from the coordinates of the selected adjacent key points and a preset pixel span value, the number of coordinates in the corresponding coordinate sequence; and interpolating, according to the coordinates of the selected adjacent key points and the slope they indicate, to generate the coordinates in that sequence.

In some embodiments, the method further comprises: inputting the video into a preset intruding-object detection model to generate a detection result, wherein the detection result indicates whether to raise an alarm for the presence of the preset intruding object in the preset alarm region; and adjusting the intruding-object detection model according to the detection result.

In a second aspect, an embodiment of the present disclosure provides an apparatus for generating a video, the apparatus including: a first acquisition unit configured to acquire a background image of a region to be monitored, wherein the region to be monitored comprises a preset alarm region, and the background image is an image of the region to be monitored in which no preset intruding object is present; a second acquisition unit configured to acquire an image of the preset intruding object; a generating unit configured to generate at least two fused images, wherein the fused images present the preset intruding object at different positions in the region to be monitored; and a conversion unit configured to convert the at least two fused images into a video.

In some embodiments, the generating unit includes: an acquisition subunit configured to acquire a movement trajectory of the preset intruding object in the region to be monitored, the movement trajectory comprising at least two key points; and a generating subunit configured to generate, for each position in the region to be monitored indicated by the at least two key points, a fused image presenting the preset intruding object at that position.

In some embodiments, the movement trajectory includes a key point located in the preset alarm region.

In some embodiments, the movement trajectory includes a key point located outside the preset alarm region but within the region to be monitored.

In some embodiments, the generating subunit includes: a first generation module configured to generate a coordinate sequence group between each pair of adjacent key points in the movement trajectory, wherein the coordinate sequences in the group fit the movement trajectory between adjacent key points; and a second generation module configured to generate, for each position in the region to be monitored indicated by the coordinates in the coordinate sequence group, a fused image presenting the preset intruding object at that position.

In some embodiments, the first generation module is further configured to: select a pair of adjacent key points from the movement trajectory; determine, from the coordinates of the selected adjacent key points and a preset pixel span value, the number of coordinates in the corresponding coordinate sequence; and interpolate, according to the coordinates of the selected adjacent key points and the slope they indicate, to generate the coordinates in that sequence.

In some embodiments, the apparatus is further configured to: input the video into a preset intruding-object detection model to generate a detection result, wherein the detection result indicates whether to raise an alarm for the presence of the preset intruding object in the preset alarm region; and adjust the intruding-object detection model according to the detection result.

In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; a storage device having one or more programs stored thereon; when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method as described in any implementation of the first aspect.

In a fourth aspect, embodiments of the present disclosure provide a computer-readable medium on which a computer program is stored, which when executed by a processor implements the method as described in any of the implementations of the first aspect.

In the method, apparatus, electronic device, and medium for generating a video according to the present disclosure, images presenting the preset intruding object at different positions in the region to be monitored are synthesized from the background image of the region and the image of the preset intruding object, and a video is generated from them. This enables flexible, on-demand construction of video data sets, enriches the ways in which video can be generated, and provides a basis for constructing test videos for the security field efficiently and at low cost.

Drawings

Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:

FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;

FIG. 2a is a flow diagram for one embodiment of a method for generating video in accordance with the present disclosure;

FIG. 2b is a schematic diagram of a method of determining a set of coordinate series in one embodiment of a method for generating a video according to the present disclosure;

FIG. 3 is a schematic diagram of one application scenario of a method for generating video in accordance with an embodiment of the present disclosure;

FIG. 4 is a flow diagram of yet another embodiment of a method for generating video in accordance with the present disclosure;

FIG. 5 is a schematic block diagram illustrating one embodiment of an apparatus for generating video in accordance with the present disclosure;

FIG. 6 is a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.

Detailed Description

The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the relevant invention and are not restrictive of it. It should also be noted that, for convenience of description, only the portions related to the relevant invention are shown in the drawings.

It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 illustrates an exemplary architecture 100 to which the method for generating video or the apparatus for generating video of the present disclosure may be applied.

As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

The terminal devices 101, 102, 103 interact with a server 105 via a network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, an intruder identification application, and the like.

The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices having a display screen and supporting human-computer interaction, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above and implemented either as multiple pieces of software or software modules (e.g., for providing distributed services) or as a single piece of software or software module. No specific limitation is imposed here.

The server 105 may be a server providing various services, such as a background server providing support for intruding object identification type applications on the terminal devices 101, 102, 103. The background server can acquire a background image of an area to be monitored and an image of a preset intruding object from the terminal equipment, generate a video for showing the movement of the preset intruding object in the area to be monitored according to the acquired background image and the image of the preset intruding object, and feed back the generated video to the terminal equipment, so that the terminal equipment can test an algorithm applied to the identification of the intruding object by using the video.

The server may be hardware or software. When the server is hardware, it may be implemented as a distributed cluster of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., for providing distributed services) or as a single piece of software or software module. No specific limitation is imposed here.

It should be noted that the method for generating video provided by the embodiments of the present disclosure is generally performed by the server 105; accordingly, the apparatus for generating video is generally disposed in the server 105. Optionally, provided their computing capability suffices, the method may also be executed by the terminal devices 101, 102, and 103, in which case the apparatus may likewise be disposed in those terminal devices.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continued reference to fig. 2a, a flow 200 of one embodiment of a method for generating video in accordance with the present disclosure is shown. The method for generating video comprises the following steps:

step 201, obtaining a background image of a region to be monitored.

In this embodiment, an execution subject of the method for generating a video (such as the server 105 shown in fig. 1) may acquire the background image of the region to be monitored via a wired or wireless connection. The region to be monitored may be the area captured by a camera and may include a preset alarm region. The preset alarm region is set in advance and generally refers to an area where an intruding object triggers an alarm, for example, a security-critical area. The background image is an image of the region to be monitored in which no preset intruding object is present.

In this embodiment, as an example, the execution subject may acquire a locally pre-stored background image of the region to be monitored, or may acquire, from a communicatively connected electronic device (e.g., a camera), an image of the monitored region in which no intruding object is present, and use it as the background image of the region to be monitored.

Step 202, acquiring an image of a preset intruding object.

In this embodiment, the execution subject may acquire the image of the preset intruding object in various ways. The preset intruding object indicates the monitored object and may be determined according to the actual application scenario, e.g., a person or a forklift. As an example, the execution subject may obtain the image of the preset intruding object directly from local storage or from a communicatively connected device. As another example, the execution subject may first select, from images captured by a camera, an image in which the preset intruding object appears; then, using a matting technique, it may remove the irrelevant background from the selected image, so as to crop out an image containing only the preset intruding object.

It should be noted that the size of the image of the preset intruding object is generally smaller than the size of the background image of the monitored area.
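As a minimal sketch of the matting step above (not the disclosure's actual matting technique), the intruding object can be roughly extracted by differencing a frame against the clean background and cropping to the bounding box of the changed pixels. The function name and threshold below are illustrative assumptions:

```python
import numpy as np

def crop_intruder(frame, background, thresh=30):
    """Very rough matting sketch: keep pixels that differ from the clean
    background by more than `thresh`, then crop to their bounding box."""
    diff = np.abs(frame.astype(int) - background.astype(int)).max(axis=2)
    ys, xs = np.nonzero(diff > thresh)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    return frame[y0:y1, x0:x1]

# Toy example: a clean background and a frame with a synthetic "intruder".
bg = np.zeros((50, 50, 3), dtype=np.uint8)
frame = bg.copy()
frame[10:20, 15:25] = 200  # bright 10x10 patch standing in for the intruder
patch = crop_intruder(frame, bg)
```

A real pipeline would use a proper matting or segmentation method; simple differencing breaks down under lighting changes and shadows.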

Step 203, generating at least two fused images.

In this embodiment, the execution subject may generate the at least two fused images in various ways. A fused image presents the preset intruding object at a position in the region to be monitored. As an example, the execution subject may paste the image of the preset intruding object acquired in step 202 at different positions of the background image acquired in step 201, used as the base image, thereby generating at least two fused images. Note that each fused image presents at most one instance of any specific preset intruding object. When there is only one preset intruding object, each fused image presents exactly one image of it. When there are multiple preset intruding objects (e.g., a, b, c), the number of instances of the same preset intruding object in each fused image cannot exceed 1: a fused image may present a, b, c, a+b, a+c, b+c, or a+b+c, but never multiple instances of a, of b, or of c simultaneously.
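The pasting operation described above can be sketched as follows, assuming NumPy arrays in (height, width, channel) layout. The function name is illustrative, and a production implementation would additionally handle alpha blending and boundary clipping:

```python
import numpy as np

def make_fused_images(background, intruder, positions):
    """Paste the intruder patch onto a copy of the background at each
    (x, y) top-left position, yielding one fused image per position."""
    h, w = intruder.shape[:2]
    fused = []
    for x, y in positions:
        frame = background.copy()
        frame[y:y + h, x:x + w] = intruder  # simple overwrite, no blending
        fused.append(frame)
    return fused

# Toy 100x100 grey background and a 10x10 white "intruder" patch.
bg = np.full((100, 100, 3), 128, dtype=np.uint8)
obj = np.full((10, 10, 3), 255, dtype=np.uint8)
frames = make_fused_images(bg, obj, [(0, 0), (40, 50), (80, 80)])
```

Each call on a fresh copy of the background guarantees that every fused image presents the intruding object exactly once, matching the constraint described above.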

In some optional implementations of this embodiment, the executing body may generate at least two fused images according to the following steps:

the method comprises the steps of firstly, obtaining a moving track of a preset intruding object in an area to be monitored.

In these implementations, the execution subject may acquire, in various ways, the movement trajectory of the preset intruding object in the region to be monitored. The movement trajectory comprises at least two key points, usually expressed as coordinates.

As an example, the execution subject may load a pre-stored movement trajectory of the preset intruding object from local storage. As another example, it may receive a movement trajectory sent by a communicatively connected electronic device. As yet another example, it may randomly generate a movement trajectory, e.g., by first randomly generating N key points and then connecting them into a trajectory.
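The random-generation example above can be sketched as follows; the function name and parameters are illustrative assumptions, with frame dimensions standing in for the bounds of the region to be monitored:

```python
import random

def random_trajectory(n_points, width, height, seed=None):
    """Randomly generate N key points inside the monitored frame;
    connecting them in order yields a candidate movement trajectory."""
    rng = random.Random(seed)  # seeding makes a test trajectory reproducible
    return [(rng.randrange(width), rng.randrange(height)) for _ in range(n_points)]

# Five random key points within a 1920x1080 frame.
track = random_trajectory(5, 1920, 1080, seed=42)
```

Seeding the generator lets the same positive or negative test trajectory be regenerated deterministically across test runs.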

Optionally, the movement trajectory may include a key point located in the preset warning area.

Optionally, the movement track includes a key point located in a non-preset alarm area in the area to be monitored.

In this way, videos serving as positive or negative samples can be flexibly generated according to the positional relation between the key points and the preset alarm region.

And secondly, generating a fusion image which shows the preset intruding object at the position according to the position, indicated by the at least two key points, in the area to be monitored.

In these implementations, given the positions in the region to be monitored indicated by the at least two key points of the movement trajectory acquired in the first step, the execution subject may generate, in various ways, a fused image presenting the preset intruding object at each such position. As an example, the execution subject may paste the image of the preset intruding object onto the background image of the region to be monitored, used as the base image, thereby generating a fused image, where the pasted position coincides with a position indicated by the at least two key points.

Based on this optional implementation, fused images matching the trajectory can be generated from the acquired movement trajectory.

Optionally, building on the above implementation, according to the positions in the region to be monitored indicated by the at least two key points, the execution subject may also generate the fused images presenting the preset intruding object at those positions through the following steps:

and S1, generating a coordinate sequence group between each two adjacent key points in the moving track.

In these implementations, the execution subject may determine, in various ways, the coordinate sequence group between adjacent key points of the movement trajectory acquired in the first step. The coordinate sequences in the group may be used to fit the movement trajectory between adjacent key points.

And S2, generating a fusion image which presents the preset intruding object at the position according to the position in the area to be monitored, which is indicated by the coordinates included in the coordinate sequence group.

In these implementations, the execution subject may generate a fused image presenting the preset intruding object at each position indicated by the coordinates in the coordinate sequence group determined in step S1. As an example, the execution subject may paste the image of the preset intruding object onto the background image of the region to be monitored, used as the base image, thereby generating a fused image, where the pasted position coincides with a position in the region to be monitored indicated by a coordinate in the coordinate sequence group.

Optionally, the pasted position may also be consistent with a position in the area to be monitored, which is indicated by at least two key points included in the movement trajectory obtained in the first step.

Based on this optional implementation, more fused images can be generated by determining the coordinate sequence group between each pair of adjacent key points in the movement trajectory, which helps to subsequently generate smoother videos.

Optionally, based on the optional implementation manner, the executing body may further generate a coordinate series group between adjacent key points in the movement trajectory according to the following steps:

and S11, selecting adjacent key points from the moving track.

In these implementations, the execution subject may select adjacent key points from the movement trajectory in various ways. As an example, assume the movement trajectory includes the key points E0, E1, …, En (n > 2). The execution subject may select adjacent pairs sequentially or at random.

And S12, determining the number of coordinates in the coordinate sequence corresponding to the selected adjacent key points according to the coordinates of the selected adjacent key points and the preset pixel span value.

In these implementations, according to the coordinates of the adjacent key points selected in step S11 and the preset pixel span value, the execution subject may determine, in various ways, the number of coordinates in the coordinate sequence corresponding to the selected adjacent key points. The preset pixel span value represents the pixel spacing between each newly generated image.

As an example, referring to FIG. 2b, the coordinates of E0 and E1 may be (x0, y0) and (x1, y1), respectively. Let the preset pixel span value be denoted T; its value is adjustable, e.g., T = 10. Dividing the segment evenly, the number m of coordinates in the coordinate sequence corresponding to E0 and E1 may be determined as follows:

when x0 ≠ x1, m = [(x1 − x0)/T] − 1, where the bracketed term is rounded up to an integer;

when x0 = x1, m = [(y1 − y0)/T] − 1, where the bracketed term is rounded up to an integer.

And S13, performing interpolation according to the coordinates of the selected adjacent key points and the indicated slope, and generating the coordinates in the coordinate sequence corresponding to the selected adjacent key points.

In these implementations, the execution subject may, in various ways, perform interpolation according to the coordinates of the adjacent key points selected in step S11 and the slope they indicate, thereby generating the coordinates in the coordinate sequence corresponding to the selected adjacent key points.

As an example, referring to FIG. 2b, the coordinates in the coordinate sequence corresponding to E0 and E1 may be determined as follows:

when x0 ≠ x1, the slope is k = (y1 − y0)/(x1 − x0), and each coordinate (xi, yi) in the generated sequence satisfies xi = x0 + T·i and yi = y0 + k·(xi − x0), where 0 < i ≤ m;

when x0 = x1, each coordinate (xi, yi) in the generated sequence satisfies xi = x0 and yi = y0 + T·i, where 0 < i ≤ m.
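Steps S11 to S13 for a single pair of key points can be sketched as follows, under the even-division assumption with span T. As an extension beyond the formulas above (which assume x1 > x0 and y1 > y0), this sketch also handles pairs ordered right-to-left or bottom-to-top by choosing the sign of the step:

```python
import math

def coord_sequence(p0, p1, span=10):
    """Interpolate intermediate coordinates between two adjacent key points,
    spaced `span` pixels apart along the dominant axis."""
    (x0, y0), (x1, y1) = p0, p1
    if x0 != x1:
        m = math.ceil(abs(x1 - x0) / span) - 1   # number of intermediate coords
        step = span if x1 > x0 else -span
        k = (y1 - y0) / (x1 - x0)                # slope indicated by the pair
        return [(x0 + step * i, y0 + k * step * i) for i in range(1, m + 1)]
    # Vertical segment: interpolate along y instead.
    m = math.ceil(abs(y1 - y0) / span) - 1
    step = span if y1 > y0 else -span
    return [(x0, y0 + step * i) for i in range(1, m + 1)]

# One segment of a trajectory: from (0, 0) to (100, 50) with T = 10.
seq = coord_sequence((0, 0), (100, 50), span=10)
```

Repeating this over every adjacent pair (as described next) yields the coordinate sequence group; denser spans produce more fused images and hence a smoother video.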

Alternatively, the execution subject may repeat steps S11 to S13 above, so that a corresponding coordinate sequence is computed for every pair of adjacent key points in the movement trajectory, forming the coordinate sequence group.

Optionally, the coordinate sequence group may further include the coordinates of the at least two key points of the movement trajectory themselves; the positions of these coordinates naturally coincide with the movement trajectory.

Based on this optional implementation, the coordinate sequence group can be generated by interpolation according to the preset pixel span value, which enriches the ways the coordinate sequence group can be generated and provides a technical basis for subsequently generating smoother videos.

Step 204, converting the at least two fused images into a video.

In this embodiment, the execution subject may convert the at least two fused images generated in step 203 into a video in various ways. Specifically, the execution body may arrange the generated fused images in a time series to generate the video. As an example, the execution body may arrange the generated fused images randomly. As still another example, the execution body may arrange the generated fused images in the order indicated by the corresponding movement trajectory.

With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of a method for generating video according to an embodiment of the present disclosure. In the application scenario of fig. 3, the server 301 obtains an image of a warehouse without people entering from the camera 302 as a background image 303. The server 301 may also obtain a person image 304. The server 301 then generates a fused image 305 for presenting the person at different locations in the warehouse. Thereafter, the server 301 converts the above-described fused image 305 into a video 306.
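The fusion in the scenario above can be sketched as pasting the intruder image onto a copy of the background at a trajectory position. This is a minimal sketch assuming NumPy image arrays; the function name `fuse` and the simple overwrite strategy are illustrative choices, since the disclosure leaves the exact fusion method open (a real system might alpha-blend or mask the intruder's silhouette):

```python
import numpy as np

def fuse(background, intruder, top_left):
    """Paste the intruder image onto a copy of the background at top_left (y, x)."""
    fused = background.copy()          # keep the original background intact
    y, x = top_left
    h, w = intruder.shape[:2]
    fused[y:y + h, x:x + w] = intruder  # overwrite the region with the intruder
    return fused
```

Calling this once per coordinate in the coordinate sequence group yields the set of fused images presenting the preset intruding object at different positions.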

At present, the prior art generally adopts directly captured real videos. This leads to long acquisition cycles, makes it difficult to obtain videos of large monitored objects (e.g., forklifts or trucks), and allows testing only against the real motion trajectories recorded in the videos, so intrusion trajectories cannot be flexibly constructed and the covered test range is limited. In the method provided by the embodiments of the present disclosure, the background image of the area to be monitored and the image of the preset intruding object are used to synthesize images presenting the preset intruding object at different positions in the area to be monitored, and a video is generated from them. This realizes flexible, on-demand construction of video data sets, enriches the ways in which videos can be generated, and provides a basis for constructing test videos in the security field efficiently and at low cost.

With further reference to fig. 4, a flow 400 of yet another embodiment of a method for generating a video is shown. The flow 400 of the method for generating a video comprises the steps of:

step 401, obtaining a background image of a region to be monitored.

Step 402, acquiring an image of a preset intruding object.

At step 403, at least two fused images are generated.

Step 404, converting the at least two fused images into a video.

The above steps 401 to 404 are respectively consistent with the steps 201 to 204 in the foregoing embodiment, and the above description of the steps 201 to 204 and the optional implementation manner thereof also applies to the steps 401 to 404, which is not described herein again.

Step 405, inputting the video to a preset intruder detection model to generate a detection result.

In this embodiment, an executing entity of the method for generating a video (for example, the server 105 shown in fig. 1) may input the video converted in step 404 into a preset intruder detection model in various ways to generate a detection result. The detection result may be used to indicate whether an alarm is raised for the preset intruding object in the preset alarm area. The preset intruder detection model may include various algorithms for detecting an intruding object. The model detects the input video and, when an image of the preset intruding object is detected in the preset alarm area, raises an alarm, for example by displaying a pop-up window or drawing a box around the intruding object.
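A minimal sketch of such a detection pass follows. It assumes the model exposes a per-frame detector returning bounding boxes; this interface, the function names, and the axis-aligned overlap test are illustrative assumptions rather than the disclosure's actual model:

```python
def boxes_intersect(a, b):
    """Axis-aligned overlap test; boxes are (x1, y1, x2, y2)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def run_detection(frames, detector, alarm_region):
    """Feed frames to a detector and report whether any detected box
    overlaps the preset alarm region (i.e., whether an alarm is raised)."""
    for frame in frames:
        for box in detector(frame):          # detector: frame -> list of boxes
            if boxes_intersect(box, alarm_region):
                return True                  # detection result: alarm raised
    return False                             # detection result: no alarm
```

Comparing this result against the ground truth of the constructed intrusion trajectory then tells the tester whether the model behaved correctly on the generated video.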

And step 406, adjusting the intruder detection model according to the detection result.

In this embodiment, the execution body may adjust the intruder detection model in various ways according to the detection result generated in step 405. For example, if the detection result indicates that an alarm was raised for the preset intruding object present in the preset alarm area, the execution body may leave the parameters of the intruder detection model unchanged. As another example, if the detection result indicates that no alarm was raised although the preset intruding object was present in the preset alarm area, the execution body may continue to train the intruder detection model.

As can be seen from fig. 4, the flow 400 of the method for generating a video in this embodiment adds the steps of inputting the video into a preset intruder detection model to generate a detection result, and adjusting the intruder detection model according to the detection result. The scheme described in this embodiment can therefore be applied to testing in the security field, meeting test requirements by generating test videos flexibly and at low cost.

With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for generating a video, which corresponds to the method embodiment shown in fig. 2 or fig. 4, and which is particularly applicable in various electronic devices.

As shown in fig. 5, the apparatus 500 for generating video provided by the present embodiment includes a first acquiring unit 501, a second acquiring unit 502, a generating unit 503, and a converting unit 504. The first obtaining unit 501 is configured to obtain a background image of a region to be monitored, where the region to be monitored includes a preset alarm region, and the background image includes an image without a preset intruder in the region to be monitored; a second acquisition unit 502 configured to acquire an image of a preset intruding object; a generating unit 503 configured to generate at least two fused images, wherein the fused images are used for presenting that the preset intruding object is located at different positions in the area to be monitored; a conversion unit 504 configured to convert the at least two fused images into a video.

In the present embodiment, in the apparatus 500 for generating a video: the specific processing of the first obtaining unit 501, the second obtaining unit 502, the generating unit 503 and the converting unit 504 and the technical effects thereof can refer to the related descriptions of step 201, step 202, step 203 and step 204 in the corresponding embodiment of fig. 2, which are not described herein again.

In some optional implementations of this embodiment, the generating unit 503 may include: an acquiring subunit (not shown in the drawings) configured to acquire a movement track of a preset intrusion object in an area to be monitored, wherein the movement track may include at least two key points; and the generating subunit (not shown in the figure) is configured to generate a fusion image for presenting the preset intruding object at the position according to the position, indicated by the at least two key points, in the area to be monitored.

In some optional implementation manners of this embodiment, the movement trajectory may include a key point located in a preset alarm area.

In some optional implementation manners of this embodiment, the movement trajectory may include a key point located in a non-preset alarm area in the area to be monitored.

In some optional implementations of this embodiment, the generating subunit may include: a first generating module (not shown in the figure) configured to generate a set of coordinate sequences between adjacent key points in the moving track, wherein a coordinate sequence in the set of coordinate sequences can be used to fit the moving track between the adjacent key points; and a second generating module (not shown in the figure) configured to generate a fusion image that presents the preset intruding object at the position according to the position in the area to be monitored, which is indicated by the coordinates included in the coordinate sequence group.

In some optional implementations of this embodiment, the first generating module may be further configured to: selecting adjacent key points from the moving track; determining the number of coordinates in a coordinate sequence corresponding to the selected adjacent key points according to the coordinates of the selected adjacent key points and a preset pixel span value; and performing interpolation according to the coordinates of the selected adjacent key points and the indicated slope, and generating the coordinates in the coordinate sequence corresponding to the selected adjacent key points.

In some optional implementations of this embodiment, the apparatus 500 for generating video may be further configured to: input the video into a preset intruder detection model to generate a detection result, where the detection result may be used to indicate whether an alarm is raised for the preset intruding object in the preset alarm area; and adjust the intruder detection model according to the detection result.

According to the device provided by the above embodiment of the present disclosure, the generation unit 503 synthesizes the background image of the to-be-monitored area acquired by the first acquisition unit 501 and the image of the preset intruder acquired by the second acquisition unit 502 to present images of the preset intruder at different positions of the to-be-monitored area, and then generates a video through the conversion unit 504, so that a video data set is flexibly configured as required, the generation manner of the video is enriched, and a basis is provided for efficiently and inexpensively configuring a test video in the security field.

Referring now to fig. 6, shown is a schematic diagram of an electronic device (e.g., a server or a terminal device in fig. 1) 600 suitable for implementing embodiments of the present application. The terminal device in the embodiments of the present application may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a fixed terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.

As shown in fig. 6, electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic device 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.

In particular, according to embodiments of the application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present application.

It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (Radio Frequency), etc., or any suitable combination of the foregoing.

The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring a background image of a region to be monitored, wherein the region to be monitored comprises a preset alarm region, and the background image comprises an image without a preset intruder in the region to be monitored; acquiring an image of a preset intruder; generating at least two fusion images, wherein the fusion images are used for showing different positions of the preset intruding object in the area to be monitored; and converting the at least two fused images into a video.

Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as "C", Python, or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a first acquisition unit, a second acquisition unit, a generation unit, and a conversion unit. For example, the first acquiring unit may be further described as a unit that acquires a background image of the area to be monitored, wherein the area to be monitored includes a preset alarm area, and the background image includes an image without a preset intruder in the area to be monitored.

The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and also covers, without departing from the inventive concept, other technical solutions formed by any combination of the above technical features or their equivalents, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the embodiments of the present disclosure.
