Segmentation method, segmentation system and non-transitory computer readable medium
Reading note: This technology, "Segmentation method, segmentation system and non-transitory computer readable medium", was designed and created by 詹诗涵 and 柯兆轩 on 2019-02-01. Its main content includes: The present disclosure relates to a segmentation method, a segmentation system, and a non-transitory computer readable medium. The segmentation method comprises the following steps: receiving a video content, wherein the video content comprises an image data and a sound data; performing segmentation processing on the image data to generate at least one image segment mark; performing segmentation processing on the sound data to generate at least one sound segment mark; and comparing a difference between an image mark time of the at least one image segment mark and a sound mark time of the at least one sound segment mark to generate at least one video content mark.
1. A segmentation method, comprising:
receiving a video content, wherein the video content comprises an image data and a sound data;
performing segmentation processing on the image data to generate at least one image segment mark;
performing segmentation processing on the sound data to generate at least one sound segment mark; and
comparing a difference between an image mark time of the at least one image segment mark and a sound mark time of the at least one sound segment mark to generate at least one video content mark.
2. The segmentation method of claim 1, wherein the performing segmentation processing on the image data to generate the at least one image segment mark further comprises:
selecting M units of the image data, and dividing the selected image data into a first image segment;
determining a content of the first image segment to generate an image content result, wherein the image content result comprises a dynamic content and a static content; and
detecting a changed content in the image data based on the image content result, and generating the at least one image segment mark according to a time position of the changed content.
3. The segmentation method of claim 2, wherein the determining the content of the first image segment to generate the image content result further comprises:
selecting T units from the first image segment, calculating an image similarity among the T units, and generating an image difference result;
if the image difference result is greater than a first image threshold, determining the content of the first image segment to be the dynamic content; and
if the image difference result is not greater than the first image threshold, determining the content of the first image segment to be the static content.
4. The segmentation method of claim 2, wherein the detecting the changed content in the image data based on the image content result and generating the at least one image segment mark according to the time position of the changed content further comprises:
if the content of the first image segment is the dynamic content, calculating a similarity between an image of the Mth unit and an image of the (M+1)th unit to generate an image difference value;
if the image difference value is greater than a second image threshold, merging the image of the (M+1)th unit into the first image segment; and
if the image difference value is not greater than the second image threshold, generating the at least one image segment mark at the time position of the image of the (M+1)th unit, selecting M units of the image data, and dividing the selected image data into a second image segment.
5. The segmentation method of claim 2, wherein the detecting the changed content in the image data based on the image content result and generating the at least one image segment mark at the time position of the changed content further comprises:
if the content of the first image segment is the static content, calculating a similarity between an image of the Mth unit and an image of the (M+1)th unit to generate an image difference value;
if the image difference value is not greater than a second image threshold, merging the image of the (M+1)th unit into the first image segment; and
if the image difference value is greater than the second image threshold, generating the at least one image segment mark at the time position of the image of the (M+1)th unit, selecting M units of the image data, and dividing the selected image data into a second image segment.
6. The segmentation method of claim 1, wherein the performing segmentation processing on the sound data to generate the at least one sound segment mark further comprises:
converting the sound data into a sound time domain signal and a sound frequency domain signal, respectively;
selecting a time domain section from the sound time domain signal, determining whether an amplitude of the time domain section is smaller than a first threshold, and if the amplitude of the time domain section is smaller than the first threshold, generating the at least one sound segment mark; and
selecting a first frequency domain section and a second frequency domain section from the sound frequency domain signal, determining whether a difference between spectral intensities of the first frequency domain section and the second frequency domain section is greater than a second threshold, and if the difference is greater than the second threshold, generating the at least one sound segment mark.
7. A segmentation system, comprising:
a storage unit, configured to store a video content and at least one video content mark; and
a processor, electrically connected to the storage unit and configured to receive the video content, wherein the video content comprises an image data and a sound data, and the processor comprises:
an image segmentation unit, configured to perform segmentation processing on the image data to generate at least one image segment mark;
a sound segmentation unit, electrically connected to the image segmentation unit and configured to perform segmentation processing on the sound data to generate at least one sound segment mark; and
a segment mark generating unit, electrically connected to the image segmentation unit and the sound segmentation unit, and configured to compare a difference between an image mark time of the at least one image segment mark and a sound mark time of the at least one sound segment mark to generate the at least one video content mark.
8. The segmentation system of claim 7, wherein the image segmentation unit is further configured to select M units of the image data, divide the selected image data into a first image segment, and determine a content of the first image segment to generate an image content result, wherein the image content result comprises a dynamic content and a static content; and to detect a changed content in the image data based on the image content result, and generate the at least one image segment mark according to a time position of the changed content.
9. The segmentation system of claim 8, wherein the image segmentation unit is further configured to select T units from the first image segment, calculate an image similarity among the T units, and generate an image difference result; if the image difference result is greater than a first image threshold, determine the content of the first image segment to be the dynamic content; and if the image difference result is not greater than the first image threshold, determine the content of the first image segment to be the static content.
10. The segmentation system of claim 8, wherein the image segmentation unit is further configured to, when the content of the first image segment is the dynamic content, calculate a similarity between an image of the Mth unit and an image of the (M+1)th unit to generate an image difference value; if the image difference value is greater than a second image threshold, merge the image of the (M+1)th unit into the first image segment; and if the image difference value is not greater than the second image threshold, generate the at least one image segment mark at the time position of the image of the (M+1)th unit, select M units of the image data, and divide the selected image data into a second image segment.
11. The segmentation system of claim 8, wherein the image segmentation unit is further configured to, when the content of the first image segment is the static content, calculate a similarity between the image of the Mth unit and the image of the (M+1)th unit to generate an image difference value; if the image difference value is not greater than the second image threshold, merge the image of the (M+1)th unit into the first image segment; and if the image difference value is greater than the second image threshold, generate the at least one image segment mark at the time position of the image of the (M+1)th unit, select M units of the image data, and divide the selected image data into a second image segment.
12. The segmentation system of claim 7, wherein the sound segmentation unit is further configured to convert the sound data into a sound time domain signal and a sound frequency domain signal, respectively; select a time domain section from the sound time domain signal, determine whether an amplitude of the time domain section is smaller than a first threshold, and if the amplitude of the time domain section is smaller than the first threshold, generate the at least one sound segment mark; and select a first frequency domain section and a second frequency domain section from the sound frequency domain signal, determine whether a difference between spectral intensities of the first frequency domain section and the second frequency domain section is greater than a second threshold, and if the difference is greater than the second threshold, generate the at least one sound segment mark.
13. A non-transitory computer readable medium containing at least one program instruction which, when executed by a processor, performs a segmentation method, the segmentation method comprising:
receiving a video content, wherein the video content comprises an image data and a sound data;
performing segmentation processing on the image data to generate at least one image segment mark;
performing segmentation processing on the sound data to generate at least one sound segment mark; and
comparing a difference between an image mark time of the at least one image segment mark and a sound mark time of the at least one sound segment mark to generate at least one video content mark.
Technical Field
The present disclosure relates to a segmentation method, a segmentation system and a non-transitory computer readable medium, and more particularly, to a segmentation method, a segmentation system and a non-transitory computer readable medium for a video source.
Background
An online learning platform is a network service that stores a plurality of learning materials on a server, so that users can connect to the server through the Internet and browse the learning materials at any time. Existing online learning platforms provide learning materials of various types, including videos, audio files, presentations, documents, and forums.
Because the amount of learning materials stored on an online learning platform is huge, the audio-visual content of the learning materials needs to be segmented automatically for the users' convenience. Therefore, how to process a learning video according to the correlation between its sound content and its image content, so as to segment the learning video automatically, is a problem to be solved in the art.
Disclosure of Invention
A first aspect of the present disclosure provides a segmentation method. The segmentation method comprises the following steps: receiving a video content, wherein the video content comprises an image data and a sound data; performing segmentation processing on the image data to generate at least one image segment mark; performing segmentation processing on the sound data to generate at least one sound segment mark; and comparing a difference between an image mark time of the at least one image segment mark and a sound mark time of the at least one sound segment mark to generate at least one video content mark.
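The image-side processing elaborated in the dependent claims (classifying an initial segment of M units as dynamic or static, then merging each following unit into the segment or cutting a segment mark) can be sketched as below. This is a minimal illustration under assumptions not taken from the disclosure: each "unit" is reduced to a single number, `diff()` stands in for a real frame-difference measure, and the two thresholds and M are arbitrary.

```python
# Hypothetical sketch of per-unit image segmentation: a first segment of M
# units is classified as dynamic or static from the differences among its
# units; each following unit is then either merged into the segment or
# triggers a segment mark, at which point a new segment of M units begins.

def segment_image(units, m=3, first_thresh=10.0, second_thresh=10.0):
    """Return unit indices (time positions) at which segment marks are cut."""
    def diff(a, b):
        return abs(a - b)  # stand-in for a real frame-difference measure

    marks = []
    start = 0
    while start + m <= len(units):
        seg = units[start:start + m]
        # Classify the segment: dynamic if its units differ strongly.
        diffs = [diff(seg[i], seg[i + 1]) for i in range(m - 1)]
        dynamic = max(diffs) > first_thresh
        pos = start + m
        while pos < len(units):
            d = diff(units[pos - 1], units[pos])
            # A dynamic segment absorbs further change; a static one absorbs
            # only non-change. Anything else ends the segment.
            if (dynamic and d > second_thresh) or (not dynamic and d <= second_thresh):
                pos += 1            # merge the unit into the current segment
            else:
                marks.append(pos)   # behaviour changed: generate a mark here
                break
        start = pos                 # next segment starts at the marked unit
        if pos >= len(units):
            break
    return marks

print(segment_image([0, 20, 40, 41, 42, 43]))   # dynamic run settles down
print(segment_image([0, 1, 2, 3, 50, 51, 52]))  # static run jumps
```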
A second aspect of the present disclosure provides a segmentation system, which includes a storage unit and a processor. The storage unit is configured to store a video content and at least one video content mark. The processor is electrically connected to the storage unit and configured to receive the video content, wherein the video content comprises an image data and a sound data. The processor includes an image segmentation unit, a sound segmentation unit, and a segment mark generating unit. The image segmentation unit is configured to perform segmentation processing on the image data to generate at least one image segment mark. The sound segmentation unit is electrically connected to the image segmentation unit and configured to perform segmentation processing on the sound data to generate at least one sound segment mark. The segment mark generating unit is electrically connected to the image segmentation unit and the sound segmentation unit, and is configured to compare a difference between an image mark time of the at least one image segment mark and a sound mark time of the at least one sound segment mark to generate the at least one video content mark.
A third aspect of the present disclosure provides a non-transitory computer readable medium containing at least one program instruction which, when executed by a processor, performs a segmentation method comprising: receiving a video content, wherein the video content comprises an image data and a sound data; performing segmentation processing on the image data to generate at least one image segment mark; performing segmentation processing on the sound data to generate at least one sound segment mark; and comparing a difference between an image mark time of the at least one image segment mark and a sound mark time of the at least one sound segment mark to generate at least one video content mark.
The segmentation method, segmentation system and non-transitory computer readable medium of the present disclosure mainly solve the problem that marking video segments manually consumes a great deal of labor and time. Segment marks are generated for the image data and the sound data separately, and video content marks are then generated from the segment marks of the image data and the segment marks of the sound data, so that a learning video is segmented automatically.
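The final comparison step, in which image mark times and sound mark times are compared to produce the content marks, can be sketched as follows. This is a minimal illustration, assuming the marks are lists of times in seconds and assuming an arbitrary matching tolerance; neither the function name nor the tolerance value comes from the disclosure.

```python
# Hypothetical sketch: a video content mark is emitted wherever an image
# segment mark and a sound segment mark fall within a tolerance of each other.

def merge_marks(image_marks, sound_marks, tolerance=1.0):
    """Return content mark times where an image mark and a sound mark coincide."""
    content_marks = []
    for img_t in image_marks:
        for snd_t in sound_marks:
            if abs(img_t - snd_t) <= tolerance:
                # Place the content mark at the earlier of the two times.
                content_marks.append(min(img_t, snd_t))
                break
    return content_marks

print(merge_marks([10.0, 35.2, 60.0], [10.4, 48.0, 59.5]))
```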
Drawings
In order to make the aforementioned and other objects, features, advantages and embodiments of the present disclosure more comprehensible, the following description is made with reference to the accompanying drawings:
FIG. 1 is a schematic diagram of a segmentation system according to some embodiments of the present application;
FIG. 2 is a flowchart of a segmentation method according to some embodiments of the present application;
FIG. 3 is a flowchart of step S220 according to some embodiments of the present application;
FIG. 4 is a flowchart of step S222 according to some embodiments of the present application;
FIG. 5A is a flowchart of step S223 according to some embodiments of the present application;
FIG. 5B is a flowchart of step S223 according to some embodiments of the present application; and
FIG. 6 is a flowchart of step S230 according to some embodiments of the present application.
[Description of reference numerals]
100: segmentation system
110: storage unit
130: processor with a memory having a plurality of memory cells
DB: course database
131: image segmentation unit
132: sound segmentation unit
133: paragraph mark generating unit
200: segmentation method
S210 to S240, S221 to S223, S2221 to S2223, S2231a to S2233a, S2231b to S2233b, and S231 to S233: Steps
Detailed Description
Reference will now be made in detail to the embodiments of the present application, examples of which are illustrated in the accompanying drawings. It should be understood, however, that the implementation details described are not intended to limit the present application; that is, in some embodiments of the present disclosure, such practical details are unnecessary. In addition, for simplicity, some conventional structures and elements are shown in the drawings in a simple schematic manner.
When an element is referred to as being "connected" or "coupled", it can mean "electrically connected" or "electrically coupled". "Connected" or "coupled" may also be used to indicate that two or more elements cooperate or interact with each other. Moreover, although terms such as "first" and "second" may be used herein to describe various elements, these terms are used merely to distinguish one element or operation from another described in similar technical terms. Unless the context clearly dictates otherwise, the terms do not specifically refer to or imply an order or sequence, nor are they intended to limit the present disclosure.
Please refer to FIG. 1. FIG. 1 is a schematic diagram of a segmentation system according to some embodiments of the present application.
As mentioned above, the
Please refer to FIG. 2. FIG. 2 is a flowchart of a segmentation method according to some embodiments of the present application.
Next, the
Next, the
Next, the
Next, the
In accordance with the above, the
In view of the above, step S223 further includes steps S2231B-S2233B, please refer to fig. 5B, and fig. 5B is a flowchart of step S223 according to some embodiments of the present disclosure. As shown in fig. 5B, the
In the above, the
In another embodiment, the similarity between images may be compared using the peak signal-to-noise ratio (PSNR), the structural similarity index (SSIM), the texture or color of the images, or a specific shape (pattern); the present disclosure is not limited thereto.
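As a minimal illustration of one of the measures named above, PSNR between two frames can be computed as below; the frame size, the `max_val` of 255 for 8-bit gray values, and the tiny test frames are assumptions for the sketch, not values from the disclosure.

```python
# Illustrative frame-similarity computation: PSNR between two frames held as
# numpy arrays of 8-bit gray values. A higher PSNR means more similar frames.
import numpy as np

def psnr(frame_a, frame_b, max_val=255.0):
    """Peak signal-to-noise ratio between two frames, in dB."""
    mse = np.mean((frame_a.astype(np.float64) - frame_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10((max_val ** 2) / mse)

a = np.zeros((4, 4), dtype=np.uint8)
b = a.copy()
b[0, 0] = 16  # one slightly changed pixel
print(psnr(a, a))  # identical frames give infinite PSNR
print(psnr(a, b))
```

A segmenter could then compare such a value against an image threshold to decide whether to merge a unit or cut a mark.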
Then, the
In light of the above, the
In view of the above, the
Next, the
According to the embodiments of the present application, the problem in the prior art that marking video segments manually consumes a great deal of labor and time is mainly solved. Segment marks are generated for the image data and the sound data separately, and a video content mark is then generated according to the segment marks of the image data and the segment marks of the sound data, thereby achieving automatic segmentation of a learning video.
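The sound-side processing described in the embodiments (a mark where the time-domain amplitude of a section falls below a first threshold, or where the spectral intensity difference between adjacent sections exceeds a second threshold) might be sketched as follows; the window length, threshold values, and test signal are illustrative assumptions only.

```python
# Hypothetical sketch of sound segmentation: the signal is cut into fixed
# windows; a mark is generated for a silent window (time-domain test) or for
# a window whose spectrum differs strongly from the previous one
# (frequency-domain test).
import numpy as np

def sound_marks(signal, rate, window=1.0, amp_thresh=0.01, spec_thresh=5.0):
    """Return mark times (seconds) for silent or spectrally changing windows."""
    n = int(rate * window)
    marks = []
    prev_spectrum = None
    for i in range(0, len(signal) - n + 1, n):
        chunk = signal[i:i + n]
        spectrum = np.abs(np.fft.rfft(chunk))
        t = i / rate
        if np.max(np.abs(chunk)) < amp_thresh:
            marks.append(t)  # time-domain test: amplitude below threshold
        elif prev_spectrum is not None and \
                np.sum(np.abs(spectrum - prev_spectrum)) > spec_thresh:
            marks.append(t)  # frequency-domain test: spectrum changed
        prev_spectrum = spectrum
    return marks

rate = 100
tone = np.sin(2 * np.pi * 5 * np.arange(rate) / rate)  # 1 s of 5 Hz tone
silence = np.zeros(rate)                               # 1 s of silence
print(sound_marks(np.concatenate([tone, silence, tone]), rate))
```

Here the silent middle second triggers the time-domain mark, and the return of the tone triggers the frequency-domain mark.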
Additionally, the above illustration includes exemplary steps in sequential order, but the steps need not be performed in the order shown. It is within the contemplation of the disclosure that these steps may be performed in a different order. Steps may be added, substituted, changed in order, and/or omitted as appropriate within the spirit and scope of embodiments of the disclosure.
Although the present disclosure has been described with reference to the above embodiments, it should be understood that various changes and modifications can be made by one skilled in the art without departing from the spirit and scope of the disclosure, and therefore, the scope of the disclosure should be determined by that of the appended claims.