Method and system for composing video material

Document No.: 682757. Publication date: 2021-04-30.

Note: This technology, "Method and system for composing video material", was created by 维克托·A·安德松, 莫阿·莱恩哈特, 约阿基姆·图尔贝里 and 克里斯特·内斯林德 on 2020-10-22. Abstract: The present invention relates to a method and system for composing video material, and in particular to a method and system for composing video material of a course of action along a trajectory in an area monitored by a plurality of video cameras. A first sequence of user inputs defining a trajectory in the area monitored by the plurality of video cameras is received. Each user input in the first sequence is associated with a time stamp and is received as an indication of a location in a map of the area being monitored by the plurality of video cameras. For each user input in the first sequence, video recordings are collected from those of the plurality of video cameras having a field of view covering the location indicated by the user input. The collected video recordings are recorded in a time period starting at the time stamp associated with the user input and ending at the time stamp associated with the next user input in the first sequence or when an indication to stop collecting is received. Video material is then composed from the video recordings collected for the user inputs in the first sequence.

1. A method for composing video material of a course of action along a trajectory in an area monitored by a plurality of video cameras, comprising:

receiving a first sequence of user inputs defining a trajectory in the area monitored by the plurality of video cameras,

wherein each user input in the first sequence is associated with a timestamp and is received as an indication of a location in a map of the area being monitored by the plurality of video cameras;

for each user input in the first sequence, collecting video recordings from those of the plurality of video cameras that have a field of view that covers the location indicated by the user input, the collected video recordings being recorded for a period of time that starts at the timestamp associated with the user input and ends at a timestamp associated with a next user input in the first sequence or when an indication to stop the collecting is received; and

composing video material from the video recordings collected for each user input in the first sequence.

2. The method of claim 1, wherein at least one of the user inputs further indicates a section around the location in the map of the area being monitored, a size of the section reflecting a degree of uncertainty of the indicated location, and

wherein, in collecting video recordings for the at least one user input, video recordings are collected from those of the plurality of video cameras that have a field of view that overlaps the section around the location indicated by the user input.

3. The method of claim 1, further comprising:

in response to receiving a user input in the first sequence of user inputs, pointing one or more of the plurality of video cameras to the indicated location in the map of the area being monitored.

4. The method of claim 1, further comprising:

in response to receiving a user input in the first sequence of user inputs, displaying a video recording from those of the plurality of video cameras having a field of view that covers the location indicated by the user input, the video recording beginning at the timestamp associated with the user input.

5. The method of claim 1, further comprising:

in response to receiving a user input in the first sequence of user inputs, displaying, in the map of the area being monitored, one or more suggestions of a location for a next user input.

6. The method of claim 5, wherein the one or more suggestions for a location of a next user input are determined based on the location indicated by a most recently received user input in the first sequence of user inputs and the locations of the plurality of video cameras in the area being monitored.

7. The method of claim 1, further comprising:

storing the first sequence of user inputs, and

accessing the stored first sequence of user inputs at a later point in time to perform the steps of collecting video recordings and composing video material.

8. The method of claim 7, further comprising:

modifying a user input in the stored first sequence of user inputs prior to performing said steps of collecting video recordings and composing video material.

9. The method of claim 8, wherein the user input in the stored first sequence of user inputs is modified by adjusting the location indicated by the user input in the map of the area being monitored.

10. The method of claim 1, wherein the timestamp associated with a user input corresponds to a point in time at which the user input was made.

11. The method of claim 1, further comprising:

receiving and storing video recordings recorded by the plurality of video cameras during a first time period,

wherein the step of receiving a first sequence of user inputs is performed after the first time period, and wherein each user input is associated with a timestamp corresponding to a time within the first time period.

12. The method of claim 1, further comprising:

for each user input in the first sequence, collecting data from other data sources arranged within a predetermined distance from the location indicated by the user input, the collected data from the other data sources being generated in a time period starting at the timestamp associated with the user input and ending at a timestamp associated with a next user input in the first sequence or when an indication to stop the collecting is received; and

adding the data from the other data sources to the video material.

13. The method of claim 1, further comprising:

receiving a second sequence of user inputs defining a second trajectory in the area monitored by the plurality of video cameras,

wherein the first sequence of user inputs and the second sequence of user inputs overlap in that they share at least one user input;

for each user input in the second sequence that is not shared with the first sequence of user inputs, collecting video recordings from those of the plurality of video cameras that have a field of view that covers the location indicated by the user input, the collected video recordings being recorded for a period of time that starts at a timestamp associated with the user input and ends at a timestamp associated with the next user input in the second sequence or when an indication to stop the collecting is received; and

including in the video material the video recordings collected for each user input in the second sequence of user inputs that is not shared with the first sequence of user inputs.

14. A system for composing video material of a course of action along a trajectory in an area monitored by a plurality of video cameras, comprising:

a user interface arranged to receive a first sequence of user inputs defining a trajectory in the area being monitored by the plurality of video cameras, wherein the user interface is arranged to receive each user input in the first sequence as an indication of a location in a map of the area being monitored by the plurality of video cameras and to associate each user input with a timestamp;

a data store arranged to store video recordings from the plurality of video cameras; and

a processor arranged to:

receiving the first sequence of user inputs from the user interface;

for each user input in the first sequence, collecting from the data store video recordings from those of the plurality of video cameras having a field of view covering the location indicated by the user input, the collected video recordings being recorded for a period of time starting at the timestamp associated with the user input and ending at a timestamp associated with the next user input in the first sequence or when an indication to stop the collecting is received; and

composing video material from the video recordings collected for each user input in the first sequence.

15. A non-transitory computer readable medium having stored thereon computer code instructions which, when executed by a processor, cause the processor to perform the method of claim 1.

Technical Field

The present invention relates to the field of video surveillance of an area by a plurality of cameras. In particular, the invention relates to a method and system for composing video material of a course of action along a trajectory in an area monitored by a plurality of video cameras.

Background

Video cameras are commonly used for surveillance purposes. A video surveillance system typically includes a plurality of video cameras installed in an area to be monitored and a video management system. The video recorded by the cameras is sent to the video management system for storage and display to an operator. For example, an operator may display video recordings from one or more selected video cameras via the video management system in order to track events and incidents that occur in the monitored area. Further, in the event of an incident, video material that can be used as forensic evidence can be composed from the video recordings stored in the video management system.

However, as the number of cameras in a video surveillance system increases, it becomes a challenge to obtain an overview of the video recorded by all cameras. It is not uncommon for a video surveillance installation to have hundreds of cameras. For example, it is difficult for an operator to follow a particular course of action in a scene, such as when a person or other moving object moves through the monitored area. Further, a large amount of manual input is required to compose video material of a specific incident that has occurred in the monitored area, which becomes cumbersome work.

Therefore, there is a need for methods and systems that make it easier to compose video material of a course of action in a monitored area.

Disclosure of Invention

In view of the above, it is therefore an object of the present invention to alleviate the above problems and to simplify the process of composing video material of a course of action in an area monitored by a plurality of video cameras.

According to a first aspect, there is provided a method for composing video material of a course of action along a trajectory in an area monitored by a plurality of video cameras, comprising:

receiving a first sequence of user inputs, the first sequence of user inputs defining a trajectory in an area monitored by a plurality of video cameras,

wherein each user input in the first sequence is associated with a timestamp and is received as an indication of a location in a map of an area being monitored by the plurality of video cameras;

for each user input in the first sequence, collecting video recordings from those of the plurality of video cameras having a field of view covering the location indicated by the user input, the collected video recordings being recorded for a period of time starting at the timestamp associated with the user input and ending at the timestamp associated with the next user input in the first sequence or when the indication to stop collecting is received; and

composing the video material from the video recordings collected for each user input in the first sequence.

The first sequence of user inputs is typically received and processed sequentially. Thus, when a user input is received, video recordings for that user input may be collected; then the next user input is received and video recordings for it are collected in turn. The process is repeated until all user inputs in the first sequence have been received and processed.
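To make the sequential processing concrete, the following is a minimal Python sketch of the collection loop. The `UserInput` and `Clip` types and the `cameras_covering` callback are illustrative assumptions, not part of the claimed method; the sketch only shows how each input's timestamp opens a collection period that the next input's timestamp (or a stop indication) closes.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class UserInput:
    x: float              # indicated position in map coordinates (hypothetical layout)
    y: float
    timestamp: float      # seconds since epoch
    radius: float = 0.0   # optional uncertainty section around the position

@dataclass
class Clip:
    camera_id: str
    start: float
    end: float

def compose_video_material(inputs: List[UserInput],
                           stop_time: float,
                           cameras_covering: Callable[[UserInput], List[str]]) -> List[Clip]:
    """For each user input, collect clips from every camera whose field of
    view covers the indicated location, from that input's timestamp until
    the next input's timestamp (or until collection is stopped)."""
    clips: List[Clip] = []
    for i, ui in enumerate(inputs):
        end = inputs[i + 1].timestamp if i + 1 < len(inputs) else stop_time
        for cam in cameras_covering(ui):
            clips.append(Clip(cam, ui.timestamp, end))
    return clips

# Toy usage: one camera covers everything; two inputs, stop indication at t=30.
clips = compose_video_material(
    [UserInput(1.0, 2.0, 10.0), UserInput(4.0, 2.0, 20.0)],
    stop_time=30.0,
    cameras_covering=lambda ui: ["10-1"])
print(clips)   # clips [10, 20) and [20, 30) from camera 10-1
```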

With this approach, the video material is automatically composed according to a trajectory defined via user input. Accordingly, the user does not have to browse through all of the video material recorded by the plurality of cameras to identify the video recordings that are relevant to the event of interest. Instead, the user need only define a trajectory in the monitored area, and the associated video recordings depicting the course of action along the trajectory are collected and included in the video material.

The method further allows the user to freely select a desired trajectory in the monitored area. This is advantageous over methods in which the relevant video recordings are simply identified by analyzing the recorded video content.

Video material refers to a collection of video files. The video material may be in the form of an output file in which a plurality of video files are included.

At least one of the user inputs may further indicate a section around the location in the map of the area being monitored, the size of the section reflecting the degree of uncertainty of the indicated location. When collecting video recordings for the at least one user input, video recordings are then collected from those of the plurality of video cameras having a field of view that overlaps the section around the location indicated by the user input. The degree of uncertainty may also be thought of as the accuracy of the user input. In that case, a smaller section reflects a higher accuracy, and vice versa.

In this way, the user may indicate a section around one or more of the user inputs, and video is collected from those cameras having fields of view that overlap the section. A larger section generally results in more video recordings being collected, since potentially more cameras will have fields of view that overlap a larger section than a smaller one. This is advantageous when the user is unsure of the location of the next user input. For example, a user may be tracking an object in the monitored area and be uncertain whether the object will turn left or right. The user may then indicate a location between the left and right locations, together with a section large enough to cover both. As another example, a user may be tracking a group of objects through the monitored area and may indicate a section around the indicated location large enough that all objects in the group fall within it.
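How a camera's field of view is tested against the section is not prescribed by the method. As a minimal sketch in Python, one might model the field of view as a circular sector and the section as a disc; the function name and the geometric model are illustrative assumptions.

```python
import math

def fov_overlaps_section(cam_x, cam_y, heading_deg, half_angle_deg, range_m,
                         loc_x, loc_y, section_radius):
    """Approximate test: does a sector-shaped field of view overlap a disc
    (the uncertainty section) around the indicated location?"""
    dx, dy = loc_x - cam_x, loc_y - cam_y
    dist = math.hypot(dx, dy)
    if dist - section_radius > range_m:
        return False                 # the disc lies entirely beyond camera range
    if dist <= section_radius:
        return True                  # the camera itself lies inside the disc
    bearing = math.degrees(math.atan2(dy, dx))
    # angular half-width of the disc as seen from the camera
    margin = math.degrees(math.asin(section_radius / dist))
    diff = abs((bearing - heading_deg + 180.0) % 360.0 - 180.0)
    return diff <= half_angle_deg + margin

# Camera at the origin facing +x with a 30-degree half-angle and 15 m range:
print(fov_overlaps_section(0, 0, 0, 30, 15, 10, 2, 1.0))   # True
```

Note that a larger `section_radius` widens the angular margin, which is exactly why larger sections tend to match more cameras.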

One or more of the plurality of cameras may have a variable field of view. For example, there may be one or more pan-tilt-zoom cameras. In response to receiving a user input in the first sequence of user inputs, the method may further point one or more of the plurality of video cameras to the indicated location in the map of the area being monitored. In this way, cameras that are pointing in another direction may be redirected to the indicated location so that they capture video of events at that location.
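As an illustration of the redirection, the pan and tilt angles for pointing a camera at an indicated map location follow from simple geometry. A minimal Python sketch, assuming a ceiling-mounted camera at a known height, a map frame with pan 0 along the +x axis, and tilt 0 at the horizon (all hypothetical conventions):

```python
import math

def pan_tilt_to(cam_x, cam_y, cam_height, loc_x, loc_y):
    """Pan/tilt angles (degrees) pointing the camera at a floor location."""
    dx, dy = loc_x - cam_x, loc_y - cam_y
    pan = math.degrees(math.atan2(dy, dx))                            # rotate toward the location
    tilt = -math.degrees(math.atan2(cam_height, math.hypot(dx, dy)))  # look down at the floor
    return pan, tilt

print(pan_tilt_to(0.0, 0.0, 3.0, 4.0, 4.0))   # approximately (45.0, -27.9)
```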

In response to receiving a user input in the first sequence of user inputs, the method may further display video recordings from those of the plurality of video cameras having a field of view covering the location indicated by the user input, the video recordings beginning at the timestamp associated with the user input. This allows the user to view the video recordings collected for the current user input and to use them as a guide for the next user input. For example, a user may see in a video recording that an object is turning in a certain direction in the monitored area. In response, the user may place the next user input in that direction on the map of the monitored area.

The method may further give guidance on the next user input via the map of the monitored area. In particular, in response to receiving a user input in the first sequence of user inputs, the method may display one or more suggestions for a location of a next user input in the map of the area being monitored. Such guidance saves time and simplifies the user's decision. It is also advantageous where the monitored area has blind spots that are not covered by any of the cameras. If an object enters a blind spot, it cannot be inferred from the video data where the object will reappear after passing through it. In this case, a suggested location may indicate to the user where objects typically reappear after passing through the blind spot. For example, if the currently indicated location on the map is at the beginning of an unmonitored hallway leading in several directions, the suggested location may indicate at which monitored location objects typically appear after passing through the hallway.

The one or more suggestions for the location of the next user input may be determined based on the location indicated by the most recently received user input in the first sequence of user inputs and the locations of the plurality of video cameras in the area being monitored. Alternatively or additionally, the suggestions may be based on statistical data about common trajectories in the monitored area. Such statistical data may be collected from historical data and used to calculate the one or more most likely next locations given the current position along the trajectory. One or more of the most likely next locations may then be presented to the user as suggestions in the map of the monitored area. In this way, a priori knowledge of the locations of the plurality of video cameras and/or of typical trajectories may be used to guide the user's decision on the next user input.
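As a sketch of the statistical variant, assuming historical trajectories have been discretized into map cells (the cell representation and function names are illustrative, not part of the method):

```python
from collections import Counter, defaultdict

def build_transition_stats(historical_trajectories):
    """Count how often objects moved from one map cell to another in
    historical trajectories (each trajectory is a list of cell ids)."""
    stats = defaultdict(Counter)
    for traj in historical_trajectories:
        for a, b in zip(traj, traj[1:]):
            stats[a][b] += 1
    return stats

def suggest_next(stats, current_cell, n=3):
    """Most likely next cells given the current position along the trajectory."""
    return [cell for cell, _ in stats[current_cell].most_common(n)]

# Example: three historical trajectories over map cells A..D
stats = build_transition_stats([["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"]])
print(suggest_next(stats, "B"))   # ['C', 'D']
```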

The trajectory that has been input by the user may be stored for later use. Specifically, the method may further comprise: storing the first sequence of user inputs; and accessing the stored first sequence of user inputs at a later point in time to perform the steps of collecting the video recordings and composing the video material. In this way, the user can return to the stored trajectory later and use it to compose the video material, for example when video material needs to be generated and output as forensic evidence. It may also happen that additional video recordings, which were recorded while the trajectory was input but were not available at that time, become available at a later point in time. For example, video recordings from mobile cameras carried by objects in the monitored area are not available until after those cameras have uploaded their video. In that case, the stored trajectory may be accessed once the additional video recordings are available, to compose video material that also includes some of the additional recordings.

Another advantage of storing the trajectory is that it can be modified before the video material is composed. In more detail, the method may include modifying a user input in the stored first sequence of user inputs prior to performing the steps of collecting the video recordings and composing the video material. In this way, the user can adjust the stored trajectory so that the resulting video material better reflects the course of action in the monitored area.

For example, a user input in the stored first sequence of user inputs may be modified by adjusting the location indicated by the user input in the map of the area being monitored. Modifying may also include adding one or more user inputs to, or removing one or more user inputs from, the first sequence, and/or modifying the timestamps associated with one or more user inputs. It is also possible to offset all timestamps of the user inputs in the first sequence by a certain value. The latter may advantageously be used to compose video material reflecting the course of action along the trajectory at a point in time before or after that indicated by the timestamps, such as 24 hours before or 24 hours after the timestamps associated with the trajectory.
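For illustration, offsetting all timestamps of a stored trajectory is a one-line transformation; the dictionary layout of a stored user input is a hypothetical choice.

```python
def offset_trajectory(inputs, offset_seconds):
    """Shift every timestamp of a stored sequence of user inputs.
    Each input is a dict such as {"x": 3.0, "y": 7.5, "t": 1603361160.0}."""
    return [{**ui, "t": ui["t"] + offset_seconds} for ui in inputs]

# Compose video material for the same trajectory 24 hours earlier:
shifted = offset_trajectory([{"x": 3.0, "y": 7.5, "t": 1603361160.0}], -24 * 3600)
print(shifted[0]["t"])   # 1603274760.0
```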

The trajectory may be defined via user input in real time, that is, while the video recordings are being made. In that case, the timestamp associated with a user input corresponds to the point in time at which the user input was made.

Alternatively, the trajectory may be defined via user input after the video has been recorded. More specifically, the method may include receiving and storing video recordings recorded by the plurality of video cameras during a first time period, wherein the step of receiving the first sequence of user inputs is performed after the first time period, and wherein each user input is associated with a timestamp corresponding to a time within the first time period. Accordingly, in this case the timestamp of a user input does not correspond to the point in time at which the user input is made. Instead, the timestamp associated with a user input may be generated by offsetting the point in time at which the user input was made by some user-specified value. For example, the user may specify an appropriate timestamp within the first time period for the first user input of the trajectory, and the timestamps of further user inputs in the trajectory may then be set relative to that first timestamp.
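A sketch of this relative timestamping, under the assumption that each user input records the wall-clock time at which it was made and the user anchors the first input at a chosen time within the first period:

```python
def assign_recording_timestamps(input_times, anchor_time):
    """Map the wall-clock times at which user inputs were made onto the
    recording timeline: the first input is anchored at the user-specified
    `anchor_time` within the first time period; later inputs keep their
    relative spacing."""
    t0 = input_times[0]
    return [anchor_time + (t - t0) for t in input_times]

# Inputs made 0, 20 and 45 seconds into the session, anchored at
# timestamp 1603350000.0 within the first time period:
print(assign_recording_timestamps([36000.0, 36020.0, 36045.0], 1603350000.0))
# [1603350000.0, 1603350020.0, 1603350045.0]
```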

In addition to the plurality of video cameras, data sources of other data types may be arranged in the monitored area. This may include sensors and/or detectors such as microphones, radar sensors, door sensors, temperature sensors, thermal cameras, face detectors, license plate detectors, and the like. The method may further comprise: for each user input in the first sequence, collecting data from other data sources arranged within a predetermined distance from the location indicated by the user input, the collected data being generated in a time period starting at the timestamp associated with the user input and ending at the timestamp associated with the next user input in the first sequence or when an indication to stop collecting is received; and adding the data from the other data sources to the video material. Thus, the composed video material includes not only video recordings but also data from other types of sensors and detectors that can provide forensic evidence about the course of action along the trajectory in the monitored area.
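A minimal sketch of selecting such data sources, assuming each source is configured with its own predetermined distance (the data layout is an illustrative assumption):

```python
import math

def nearby_data_sources(sources, loc_x, loc_y):
    """Select data sources within their (type-specific) predetermined
    distance from the indicated location. `sources` is a list of dicts:
    {"id": "14-1", "x": ..., "y": ..., "max_dist": ...}."""
    return [s["id"] for s in sources
            if math.hypot(s["x"] - loc_x, s["y"] - loc_y) <= s["max_dist"]]

sources = [{"id": "14-1", "x": 2.0, "y": 3.0, "max_dist": 5.0},
           {"id": "14-2", "x": 20.0, "y": 3.0, "max_dist": 5.0}]
print(nearby_data_sources(sources, 4.0, 4.0))   # ['14-1']
```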

Sometimes, two trajectories in the monitored area may overlap. For example, two objects of interest may first follow a common trajectory and then separate, forming two branches. Conversely, two objects may first follow two separate trajectories and then join each other along a common trajectory. In such cases, it may be of interest to compose a single video material comprising video recordings of both trajectories. To this end, the method may further include:

receiving a second sequence of user inputs, the second sequence of user inputs defining a second trajectory in an area monitored by the plurality of video cameras,

wherein the first sequence of user inputs and the second sequence of user inputs overlap in that they share at least one user input;

for each user input in the second sequence that is not shared with the first sequence of user inputs, collecting video recordings from those of the plurality of video cameras that have a field of view that covers the location indicated by the user input, the collected video recordings being recorded for a period of time that starts at a timestamp associated with the user input and ends at a timestamp associated with the next user input in the second sequence or when an indication to stop the collecting is received; and

including in the video material the video recordings collected for each user input in the second sequence of user inputs that is not shared with the first sequence of user inputs.

Alternatively, the first sequence of user inputs and the second sequence of user inputs may be considered to overlap if there are a user input in the first sequence and a user input in the second sequence whose locations are covered by the field of view of the same camera during an overlapping time period. In that case, it may be sufficient to collect video recordings of the second sequence only for those user inputs and time periods that do not overlap with the first sequence.
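Under the shared-input definition of overlap, identifying which user inputs of the second sequence still require collection reduces to a set difference. A sketch, representing each input as an (x, y, timestamp) tuple (an illustrative choice):

```python
def inputs_unique_to_second(first_seq, second_seq):
    """User inputs of the second sequence that are not shared with the
    first; only these need additional video recordings to be collected."""
    shared = set(first_seq)
    return [ui for ui in second_seq if ui not in shared]

first = [(1, 1, 10.0), (2, 1, 20.0), (3, 1, 30.0)]
second = [(1, 1, 10.0), (2, 1, 20.0), (2, 3, 30.0), (2, 5, 40.0)]
print(inputs_unique_to_second(first, second))   # [(2, 3, 30.0), (2, 5, 40.0)]
```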

According to a second aspect, there is provided a system for composing video material of a course of action along a trajectory in an area monitored by a plurality of video cameras, comprising:

a user interface arranged to receive a first sequence of user inputs defining a trajectory in an area being monitored by the plurality of video cameras, wherein the user interface is arranged to receive each user input in the first sequence as an indication of a location in a map of the area being monitored by the plurality of video cameras and to associate each user input with a timestamp;

a data store arranged to store video recordings from a plurality of video cameras; and

a processor arranged to:

receiving a first sequence of user inputs from a user interface;

for each user input in the first sequence, collecting from the data store video recordings from those of the plurality of video cameras having a field of view covering the location indicated by the user input, the collected video recordings being recorded for a period of time starting at the timestamp associated with the user input and ending at the timestamp associated with the next user input in the first sequence or when the indication to stop collecting is received; and

composing the video material from the video recordings collected for each user input in the first sequence.

According to a third aspect, there is provided a computer program product comprising a non-transitory computer readable medium having stored thereon computer code instructions which, when executed by a processor, cause the processor to perform the method according to the first aspect.

The second and third aspects may generally have the same features and advantages as the first aspect. It should also be noted that the present invention relates to all possible combinations of features, unless explicitly stated otherwise.

Drawings

The above and further objects, features and advantages of the present invention will be better understood by the following illustrative and non-limiting detailed description of embodiments thereof with reference to the accompanying drawings, in which like reference numerals will be used for similar elements, and in which:

fig. 1 schematically illustrates a video surveillance system according to an embodiment.

FIG. 2 is a flow chart of a method for composing video material according to a first set of embodiments.

Figures 3a to 3d schematically illustrate a sequence of user inputs received as indications of locations in a map of a monitored area.

Fig. 4 illustrates the video recordings collected for the user inputs in the sequence illustrated in figs. 3a to 3d.

FIG. 5 is a flow chart of a method for composing video material according to a second set of embodiments.

FIG. 6 is a flow chart of a method for composing video material according to a third set of embodiments.

Fig. 7 schematically illustrates two overlapping user input sequences.

Detailed Description

The present invention now will be described more fully hereinafter with reference to the accompanying drawings, in which embodiments of the invention are shown.

Fig. 1 illustrates a video surveillance system 1 including a plurality of video cameras 10 mounted to monitor an area 12, and a video management system 100. The video management system 100 will also be referred to herein as a system for composing video material of a course of action along a trajectory in the area 12 monitored by the plurality of cameras 10.

The monitored area 12 is illustrated in the form of a map showing the plan of the area 12. In this example, it is assumed that the area 12 is an indoor area, with walls separating different portions of the area 12 to form rooms, hallways, and other spaces. However, it should be understood that the concepts described herein are equally applicable to other types of areas, including outdoor areas. A plurality of video cameras 10, illustrated here by twelve enumerated cameras 10-1 through 10-12, are disposed in the monitored area 12. Each of the cameras 10 has a field of view covering a part of the monitored area. Some of the cameras 10 may be fixed cameras, meaning that they have a fixed field of view. Other cameras may have a variable field of view, meaning that the camera may be zoomed and/or controlled to move in the pan or tilt direction so that its field of view covers different portions of the area 12 at different points in time. As a special case of a camera with a variable field of view, there may be cameras carried by objects moving around in the area 12, such as a mobile phone camera, a wearable camera, or an onboard camera. Preferably, the plurality of cameras 10 are arranged in the monitored area 12 such that each point in the monitored area 12 falls, or may fall, within the field of view of at least one of the plurality of cameras 10. However, this is not necessary for implementing the concepts described herein.

In addition to the video cameras 10, a plurality of data sources 14 may be disposed in the area 12. A data source 14 may generally generate any type of data that provides evidence of an action or event that has occurred in the monitored area 12. This includes sensors and/or detectors such as microphones, radar sensors, door sensors, temperature sensors, thermal cameras, face detectors, license plate detectors, and the like. The data sources 14 may also include a point-of-sale system that registers sales and returns of purchases made in the area 12. By configuration, a data source 14 may be associated with a camera 10 or with another data source. For example, the data source 14-1 may be associated with the camera 10-1, or the data source 14-2 may be associated with the data source 14-1. Furthermore, chains of such associations may be formed: for example, the data source 14-2 may be associated with the data source 14-1, which in turn is associated with the camera 10-1. Such associations and chains of associations may be used when collecting data from a camera 10 or a data source 14. For example, if video is to be collected from a camera during a time period, data may also be automatically collected from its associated data sources during that time period.
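A sketch of following such chains of associations when deciding what to collect alongside a camera's video; the mapping-based configuration format is a hypothetical assumption:

```python
def associated_sources(camera_id, associations):
    """All data sources whose chain of configured associations ends at the
    given camera. `associations` maps a device id to the id it is
    associated with (data source -> data source, or data source -> camera)."""
    # invert the association map: target id -> devices associated with it
    reverse = {}
    for src, dst in associations.items():
        reverse.setdefault(dst, []).append(src)
    found, stack = [], [camera_id]
    while stack:
        for src in reverse.get(stack.pop(), []):
            if src not in found:        # guard against accidental cycles
                found.append(src)
                stack.append(src)       # follow the chain of associations
    return found

# 14-2 is associated with 14-1, which in turn is associated with camera 10-1
print(associated_sources("10-1", {"14-2": "14-1", "14-1": "10-1"}))
# ['14-1', '14-2']
```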

A plurality of cameras 10 and additional data sources 14 (if available) communicate with the video management system 100 via a communication link 16. The communication link 16 may be provided by any type of network, such as any known wired or wireless network. For example, multiple cameras 10 may send recorded video to the video management system 100 for display or storage via the communication link 16. Further, the video management system 100 may send control instructions to multiple cameras 10 to start and stop recording or to redirect or change the zoom level of one or more of the cameras 10.

The video management system 100 includes a user interface 102, a data store 104, and a processor 106. The video management system 100 may also include a non-transitory computer-readable memory 108, such as a non-volatile memory. The computer-readable memory 108 may store computer code instructions that, when executed by the processor 106, cause the processor 106 to perform any of the methods described herein.

The user interface 102 may include a graphical user interface through which an operator may view video recorded by one or more of the plurality of cameras 10. The user interface 102 may also display a map of the monitored area, similar to the map of the area shown at the top of FIG. 1. As will be explained in more detail later, the operator may interact with the map to indicate a location in the map, for example by clicking on the location in the map with a mouse cursor. If the operator indicates several locations in sequence on the map, the indicated locations will define a trajectory in the area 12.

The data store 104, which may be a database, stores the video recordings received from the plurality of cameras over the communication link 16. The data store 104 may further store one or more trajectories that have been defined by an operator through interaction with the user interface 102.

The processor 106 interacts with the user interface 102 and the data store 104 to compose video material of a course of action along such a trajectory. This will now be explained in more detail with reference to the flow chart of fig. 2, which shows a first set of embodiments of a method for composing video material. The dashed lines in fig. 2 (and in figs. 5 and 6) indicate optional steps.

In the first set of embodiments, shown in fig. 2, it is assumed that the operator provides the input regarding the trajectory in the area 12 in real time, that is, while the video is being recorded.

The method starts at step S102 by receiving a first user input via the user interface 102. The user input is received in the form of an indication of a location in a map of the monitored area 12. This is illustrated in more detail in fig. 3 a. On the user interface 102, a map 32 of the monitored area 12 may be displayed. Via the user interface 102, the user may enter an indication 34-1 of a location in the map 32, for example, by clicking with a mouse cursor on the desired location in the map 32. Here, the indication 34-1 is graphically represented by a star icon, with the center of the icon representing the indicated location. However, it should be understood that this is only one of many possibilities.

Optionally, the user input may also indicate a section around the location. The purpose of this section is to associate a position indication with a degree of uncertainty. The degree of uncertainty reflects the degree to which the user determines the precise location of the input. In other words, the size of the section indicates the accuracy of the input. For example, a larger section may indicate a more uncertain or less accurate position indication than a smaller section. To specify a section, the user may input a graphical icon such as a circle or rectangle or a star as shown in fig. 3a, where the center of the icon indicates the desired position and the size of the icon reflects the degree of uncertainty.

Upon receiving the user input, the processor 106 associates the received user input with a timestamp. In a first set of embodiments in which user input is made while video is being captured, the timestamp corresponds to the time at which the user input was made. In the example of FIG. 3a, the first user input identifying location 34-1 is associated with a timestamp T1.

In some cases, particularly when there are video cameras 10 with variable fields of view, the processor 106 may control one or more of the video cameras 10 to point at the indicated location 34-1 in step S103. For example, assuming that the video camera 10-2 has pan-tilt-zoom functionality, the processor 106 may control the video camera 10-2 to point at the indicated location 34-1. It should be appreciated that in step S103 the processor 106 need not redirect all cameras having a variable field of view, but only those cameras whose field of view, when redirected or zoomed, may cover the location 34-1. If an uncertainty section around the location 34-1 has been provided by the user input, it may be sufficient that the field of view of the camera, when redirected or zoomed, overlaps the indicated section. For example, the processor 106 would not need to redirect the camera 10-3 to the location 34-1, because there is a wall between the camera 10-3 and the location 34-1. The processor 106 may identify candidate cameras to redirect based on the locations of the cameras 10 relative to the indicated location 34-1 and using knowledge of the plan of the area, such as where walls or other obstacles are located.

In step S104, the processor 106 then collects video recordings associated with the user input received in step S102. Video recordings are collected for a time period beginning at the timestamp associated with the user input. To do so, the processor 106 first identifies those of the plurality of cameras 10 that have a field of view covering the location 34-1 indicated by the user input. The processor 106 may identify those cameras by using information about the locations in the area 12 where the cameras 10 are installed and information about the plan of the area, such as where walls and obstacles are located. Such information is typically provided when the cameras are installed in the area and may be stored in the data store 104 of the video management system 100. In the example illustrated in fig. 3a, the processor 106 identifies the camera 10-1 and the camera 10-2 (after redirection as described above). These cameras are drawn with black fill in the figure.

In the event that the user input further defines a section around the location 34-1, the processor 106 may more generally identify the video cameras having a field of view that overlaps the section. Thus, the larger the section indicated by the user input, the more cameras may be identified by the processor 106. Where a wall in the area 12 bisects the section associated with the indicated location, video cameras 10 located on the other side of the wall from the indicated location may be excluded from identification.

Further, in the event that additional data sources 14 are present in the area 12, the processor 106 may also identify data sources 14 that are located within a predetermined distance from the indicated location 34-1. The predetermined distance may be different for different types of data sources 14 and may also vary depending on where the data source is located in the area 12. In the example of fig. 3a, the processor 106 identifies that the sensor 14-1 is within the predetermined distance from the location 34-1.

After identifying the cameras 10 and possibly the additional data sources 14 as described above, the processor 106 collects video and data from these cameras 10 and additional data sources 14. This is further illustrated in fig. 4, which shows a timeline together with the video recordings 43 and the data 44 generated by the cameras 10-1 to 10-12 and the additional data sources 14-1 to 14-4. The timestamps associated with the user inputs, such as the timestamp T1 associated with the first user input indicating the location 34-1 on the map, are marked along the timeline. The processor 106 collects video recordings from the identified cameras, and data from the identified data sources (if any), starting at the timestamp T1 associated with the first user input. Video and data are collected until another user input is received or until an indication that there will be no more user input is received. Thus, in the illustrated example, video recordings from the cameras 10-1 and 10-2 and data from the data source 14-1 are collected until the next user input, associated with the timestamp T2, is received. In fig. 4, the collected recordings are indicated by the shaded areas.

Optionally, in step S105, the processor 106 may display the collected video recordings on the user interface 102. In this way, the user is able to track the current action at the indicated location 34-1. This also facilitates the user making a decision regarding the next user input. For example, a user may see from a video that an object is moving in a certain direction, and may then decide to indicate a location in that direction on a map in order to track the object.

As an option, the processor 106 may also provide one or more suggestions to the user via the user interface 102 regarding the location of the next user input. The one or more suggestions may be shown in the map 32 by using predefined graphical symbols. In the example of fig. 3a, the processor 106 suggests the location 35-1 as a possible location for the next user input. The suggestion guides the user in selecting the next location. The processor 106 may base its suggestions on a number of factors. For example, they may be based on the location 34-1 of the current user input and the locations of the video cameras 10 and/or the additional data sources 14 in the area 12. In this way, the processor 106 may suggest a next location that is covered by one or more cameras. The suggestions may further be based on the plan of the area 12, which provides useful input as to the trajectories an object may take given where walls and other obstacles are located. Additionally or alternatively, the processor 106 may utilize historical data obtained by tracking objects in the area 12. Based on statistics of historical object trajectories in the area 12, the processor 106 can infer along which trajectories objects typically move through the area 12. Assuming that the current user input 34-1 lies along such a trajectory, the processor 106 may suggest a next location along it. The suggested locations along the trajectory may be selected such that at least one of the cameras 10 has a field of view covering the suggested location.

The processor 106 then awaits further user input via the map 32 shown on the user interface 102.

If further user input is received, the processor 106 repeats steps S102, S104, and optionally also steps S103, S105, S106, for new user input.

Returning to the example, fig. 3b illustrates a second user input indicating the location 34-2. The processor 106 associates the second user input with a timestamp T2 corresponding to the point in time at which the second user input was received. The second user input may be provided by accepting the suggested location 35-1 (e.g., by clicking the suggested location 35-1 with a mouse cursor). Alternatively, the user input may be provided by simply indicating a desired location in the map 32. In this case, the user input defines a section around the location 34-2 that is larger than the corresponding section of the location 34-1. This is illustrated by the star icon at the location 34-2 being larger than the star icon at the location 34-1. Thus, the user input reflects that the uncertainty of the indicated location 34-2 is greater than that of the indicated location 34-1.

In response to the second user input, the processor 106 may optionally proceed to point one or more of the cameras 10 at the location 34-2, as described above in connection with step S103. Further, the processor 106 may identify which cameras 10 have a field of view that covers the indicated location 34-2, or at least overlaps the section around the indicated location 34-2 defined by the second user input. In this case, the cameras 10-2, 10-4, 10-5, 10-8, 10-9 and 10-12 are identified. Further, the processor 106 may identify whether any of the data sources 14 are within the predetermined distance from the indicated location 34-2. In this case, the data source 14-1 is identified. The processor 106 then collects video recordings from the identified video cameras, and data from the identified data sources (if any), in step S104. As shown in fig. 4, the collection begins at the timestamp T2 and continues until another user input, associated with the timestamp T3, is received. Optionally, the collected video recordings may be displayed on the user interface 102 to allow the user to follow the course of action at the location 34-2 in real time.

Further, as shown in FIG. 3b, the processor 106 suggests a plurality of locations 35-2 as candidates for the next user input.

As shown in figs. 3c and 3d, the processor 106 repeats the above process for a third user input indicating the location 34-3 and a fourth user input indicating the location 34-4, respectively. The third user input is associated with the timestamp T3 and the fourth user input with the timestamp T4. After the fourth user input, the processor 106 receives an indication via the user interface 102 that this is the last user input. For the third user input, and as shown in fig. 4, video recordings are collected from the cameras 10-4, 10-11 and 10-12 between the timestamps T3 and T4, and data is collected from the data source 14-4. In addition, a candidate location 35-3 for the next user input is suggested in the map 32. For the fourth user input, video recordings are collected from the cameras 10-4, 10-11 and 10-12 between the timestamp T4 and the time at which the indication that the fourth user input was the last user input is received (the time indicated by "stop" in fig. 4).

As best seen in fig. 3d, the sequence of received user inputs defines a trajectory 36 in the area 12 monitored by the plurality of cameras 10. In particular, the trajectory 36 is defined by the locations 34-1, 34-2, 34-3, 34-4 indicated by these user inputs. Further, the video recordings collected by the processor 106 as described above depict the course of action along the trajectory 36 in the area 12.

The processor 106 then composes video material from the collected video recordings in step S107. Further, the data collected from the data sources 14 may be added to the video material. The video material may be in the form of an output file to which the collected recordings are added. The video material may be output, for example, to constitute forensic evidence. The video material may also be stored in the data store 104 for future use.

The video material may also include the first sequence of user inputs defining the trajectory 36 in the monitored area. In particular, the locations 34-1, 34-2, 34-3, 34-4 and the associated timestamps T1, T2, T3, T4 may be included in the video material. The video material may further include a representation of the map 32 of the area 12. This allows the recipient of the video material not only to play back the video included in the video material, but also to simultaneously display a map in which the trajectory is indicated.

The video material may also include metadata associated with the cameras 10. The metadata may include an indication of the field of view of each camera 10 and possibly how the field of view changes over time. In particular, the fields of view of the cameras from which the video recordings were collected may be included as metadata. Having such metadata in the video material allows the fields of view of the cameras 10, and how they change over time, to be displayed in the map 32. In other words, the metadata may be used to animate the map 32. For a mobile camera, such as a mobile phone camera or a wearable camera, the metadata included in the video material may instead relate to the location of the camera and how the location changes over time.

In a similar manner, the video material may also include metadata associated with the additional data sources 14. In this case, the metadata may relate to how the values of the additional data sources 14 change over time. The metadata of the data sources 14 may be used to animate the map 32, for example by animating the opening and closing of doors in the map according to the values of door sensors.

The video material may be provided with a signature that prevents the video material from being edited and enables detection of whether data in the video material has been tampered with. This is advantageous in the case where video material is to be used as forensic evidence.
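The signature scheme itself is not prescribed here. As one hedged example, a keyed HMAC computed over the bytes of the output file lets any party holding the key detect later tampering:

```python
import hashlib
import hmac

def sign_material(material_bytes: bytes, key: bytes) -> str:
    """Attach an HMAC so later tampering with the composed video material
    can be detected by anyone holding the key."""
    return hmac.new(key, material_bytes, hashlib.sha256).hexdigest()

def verify_material(material_bytes: bytes, key: bytes, signature: str) -> bool:
    expected = sign_material(material_bytes, key)
    return hmac.compare_digest(expected, signature)

key = b"shared-secret"
sig = sign_material(b"<output file bytes>", key)
print(verify_material(b"<output file bytes>", key, sig))   # True
print(verify_material(b"<tampered bytes>", key, sig))      # False
```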

In other cases, the video material is editable. In this case, an editing history may be provided for the video material so that changes made to the video material after it is created can be easily tracked.

Optionally, in step S108, the processor 106 may also store the sequence of user inputs. For example, the indicated locations 34-1, 34-2, 34-3, 34-4 may be stored in the data store 104 together with their associated timestamps.

In the second set of embodiments, the input regarding the trajectory in the area 12 is not provided in real time, that is, not while the video is being recorded. More specifically, it is assumed that the video cameras 10 record video during a first time period and that the input regarding the trajectory in the area 12 is received after the first time period. In other words, the operator wishes to generate video material of a course of action that occurred along a trajectory in the area during the first time period, but the trajectory is not specified until after the first time period. Thus, the second set of embodiments allows a user to generate video material of a course of action along a particular trajectory from pre-recorded video.

The second set of embodiments will now be explained in more detail with reference to the flow chart of fig. 5.

In step S201, the processor 106 receives and stores video recordings captured by the plurality of cameras 10 during a first time period. Such video recordings may be stored by the processor 106 in the data store 104.

The processor 106 then continues to receive user input at step S202 and collects video recordings for the user input at step S204. Optionally, the processor 106 may also display the video recordings collected for the user input at step S205 and display suggested locations for the next user input at step S206. These steps correspond to steps S102, S104, S105, S106 of the first set of embodiments. It is noted, however, that the camera-pointing step S103 of fig. 2 is not possible here, because the method operates on previously recorded video data. Further, in contrast to the first set of embodiments, steps S202, S204, S205, S206 are performed after the first time period during which the video was recorded.

In order to track a course of action that occurred during the first time period, the trajectory defined by the sequence of user inputs needs to be related to points in time within the first time period. Thus, the timestamp associated with a user input should not correspond to the point in time at which the user input was received. Instead, the processor 106 associates the user input with a timestamp corresponding to a point in time within the first time period. For example, the processor 106 may receive a user input specifying which point in time during the first time period should be the timestamp of the first user input defining the start of the trajectory. The timestamps of subsequent user inputs may then be set relative to the timestamp of the first user input. In practice, this may correspond to the user viewing the recorded material, finding an event of interest, and then starting to track an object of interest as if it were a live view. The difference, of course, is that the user can quickly browse through a large amount of material when locating the next appropriate user input. Another difference is that the user can also track the object backwards in time, although the associated timestamps will then be coupled to the time of recording rather than the time of user input.

When the processor 106 receives an indication that no more user input will be received, it proceeds to step S207 to compose video material from the collected video recordings. Optionally, it may also store the received sequence of user inputs in the data store 104 in step S208. These steps are performed in the same manner as steps S107 and S108 described in conjunction with fig. 2.

In a third set of embodiments illustrated in the flow chart of fig. 6, the method operates on a stored sequence of user inputs. In more detail, in step S302, the processor 106 receives a stored sequence of user inputs defining trajectories in the area monitored by the plurality of cameras 10. For example, the processor 106 may access a stored sequence of user inputs from the data store 104 (as previously stored in step S108 of the first set of embodiments or in step S208 of the second set of embodiments).

Optionally, the processor 106 may modify one or more of the user inputs in the received sequence of user inputs in step S303. The modification may be made in response to user input. For example, the processor 106 may display the received sequence of user inputs together with the map 32 of the monitored area on the user interface 102. The user may then adjust one of the indicated locations, for example by moving a graphical representation of the location using a mouse cursor.

The processor 106 may then proceed to collect video recordings for each user input in the sequence at step S304 and compose a video material from the collected video recordings at step S307. Optionally, the processor 106 may also display the collected video recordings for user input at step S305 and store the possibly modified user input sequence at step S308. Steps S304, S305, S307 and S308 are performed in the same manner as the corresponding steps of the first and second sets of embodiments, and thus will not be described in detail.

The embodiments described herein may advantageously be used to compose video material of objects moving through the monitored area, such as a person walking through the area. In some cases, the trajectories of two objects may overlap, meaning that they share at least one location indicated by user input. For example, two objects may first move together along a common trajectory and then separate, such that the trajectory splits into two sub-trajectories. This is illustrated in fig. 7, where the first trajectory 36-1 is defined by the locations 34-1, 34-2, 34-3, 34-4 and the second, overlapping trajectory 36-2 is defined by the locations 34-1, 34-2, 34-5, 34-6. Alternatively, two objects may first move along two separate trajectories and then join each other to move along a common trajectory. In such cases, it may be of interest to compose common video material for the overlapping trajectories. This may be accomplished by receiving a second sequence of user inputs defining the second trajectory 36-2 in addition to the first sequence of user inputs defining the first trajectory 36-1. The processor 106 may then identify the user inputs of the second trajectory 36-2 that are not shared with the user inputs of the first trajectory 36-1. In the example of fig. 7, the processor 106 would identify the user inputs corresponding to the locations 34-5 and 34-6. The processor 106 may then proceed, in the same manner as explained in connection with steps S104, S204 and S304, to collect video recordings from those of the plurality of cameras 10 having a field of view covering the identified locations of the user inputs of the second trajectory 36-2. In the example of fig. 7, the processor 106 may collect a video recording from the video camera 10-9 for the user input indicating the location 34-5. The video may be collected between the timestamp associated with that user input and the timestamp associated with the next user input, indicating the location 34-6. In addition, data may be collected from the data source 14-2 within the predetermined distance from the indicated location 34-5. For the user input indicating the location 34-6, the processor 106 may collect video recordings from the video camera 10-10, starting at the timestamp associated with that user input and ending when an indication to stop collecting is received. Further, data may be collected from the data source 14-3 within the predetermined distance from the indicated location 34-6.

It will be appreciated that the embodiments described above may be modified in numerous ways by a person skilled in the art and still use the advantages of the invention as shown in the above embodiments. Accordingly, the present invention should not be limited to the illustrated embodiments, but should be defined only by the following claims. Additionally, the illustrated embodiments may be combined as understood by those skilled in the art.
