Audio processing method, device, system, browser module and readable storage medium

Document No.: 972878 | Publication date: 2020-11-03

Reading note: This technology, "Audio processing method, device, system, browser module and readable storage medium", was created by 范林峰 (Fan Linfeng), 刘楚文 (Liu Chuwen), and 黄勇尤 (Huang Yongyou) on 2020-04-14. Abstract: The disclosure relates to an audio processing method, device, system, browser module, and readable storage medium, in the field of computer technology. The method of the disclosure comprises: the browser module, in response to a preview request from a user, acquires a plurality of audios to be synthesized and the configuration information of each audio, wherein the configuration information of each audio comprises the start time of the audio in the synthesized audio and the time period of the audio used for synthesis; the browser module receives a play request from the user, the play request comprising the play start time of the synthesized audio; and the browser module plays each audio within its time period for synthesis, according to the play start time of the synthesized audio and the start time of each audio in the synthesized audio, so as to show the user the effect of the synthesized audio.

1. An audio processing method, comprising:

the browser module responds to a preview request of a user and acquires a plurality of audios to be synthesized and configuration information of each audio; wherein the configuration information of each audio comprises: the starting time of the audio in the synthesized audio, and the time period of the audio for synthesis;

the browser module receives a play request of the user, wherein the play request comprises: the play start time of the synthesized audio;

and the browser module plays each audio within its time period for synthesis, according to the play start time of the synthesized audio and the start time of each audio in the synthesized audio, so as to show the user the effect of the synthesized audio.

2. The audio processing method according to claim 1,

the playing, by the browser module, the audio within the time period for synthesizing each audio according to the playing start time of the synthesized audio and the start time of each audio in the synthesized audio includes:

the browser module determines the time difference between the playing starting time of the synthesized audio and the starting time of each audio in the synthesized audio;

for each audio, the browser module sets the value of the start timer corresponding to the audio to the time difference when the time difference corresponding to the audio is greater than zero, and starts playing the audio within the time period of the audio used for synthesis when that start timer expires.

3. The audio processing method according to claim 2,

the playing, by the browser module, the audio within the time period for synthesizing each audio according to the playing start time of the synthesized audio and the start time of each audio in the synthesized audio further includes:

for each audio, the browser module starts to play the audio within a time period for synthesizing the audio under the condition that the time difference corresponding to the audio is equal to zero;

or, for each audio, when the time difference corresponding to the audio is less than zero, the browser module determines the difference value between the start time of the time period of the audio used for synthesis and the time difference, and starts playing the audio from the time corresponding to the difference value.

4. The audio processing method according to claim 2 or 3,

the playing, by the browser module, the audio within the time period for synthesizing each audio according to the playing start time of the synthesized audio and the start time of each audio in the synthesized audio further includes:

and for each audio, the browser module sets an end timer corresponding to the audio according to the end time of the time period of the audio used for synthesis when it starts playing the audio, and stops playing the audio when the end timer expires.

5. The audio processing method according to claim 1,

the configuration information for each audio further includes: the number of repetitions of the audio;

the playing, by the browser module, the audio within the time period for synthesizing each audio according to the playing start time of the synthesized audio and the start time of each audio in the synthesized audio includes:

and for each audio, when the current playback of the audio finishes, the browser module judges whether the number of repetitions of the audio has been reached, and replays the audio within the time period for synthesis if the number of repetitions has not been reached.

6. The audio processing method of claim 1, further comprising:

in response to the user confirming the effect of the synthesized audio displayed by the browser module, the server acquires the plurality of audios to be synthesized and the configuration information of each audio;

and the server synthesizes the audios into one audio according to the configuration information of the audios.

7. The audio processing method according to claim 6,

the server synthesizes each audio into one audio according to the configuration information of each audio, and the method comprises the following steps:

the server converts the configuration information of each audio into a filter_complex parameter of FFmpeg (Fast Forward Moving Picture Experts Group);

and the server synthesizes the audios into one audio using FFmpeg according to the filter_complex parameters of the audios.

8. The audio processing method according to claim 7,

the server converting the configuration information of each audio into the filter_complex parameter of FFmpeg comprises:

for each audio in turn, the server adds a trimming (atrim) filter and configures the parameters of the atrim filter corresponding to the audio in the filter_complex parameter according to the time period of the audio used for synthesis, and adds a delay (adelay) filter and configures the parameters of the adelay filter corresponding to the audio in the filter_complex parameter according to the start time of the audio in the synthesized audio;

and the server adds a mixing (amix) filter and configures the parameters of the amix filter in the filter_complex parameter according to the total number of audios.

9. The audio processing method according to claim 8,

the configuration information for each audio further includes: the number of repetitions of the audio;

the server converting the configuration information of each audio into the filter_complex parameter of FFmpeg further comprises:

for each audio in turn, the server adds a looping (aloop) filter and configures the parameters of the aloop filter corresponding to the audio in the filter_complex parameter according to the number of repetitions of the audio.

10. A browser module, comprising:

an acquisition unit configured to, in response to a preview request of a user, acquire a plurality of audios to be synthesized and the configuration information of each audio; wherein the configuration information of each audio comprises: the start time of the audio in the synthesized audio, and the time period of the audio used for synthesis;

a receiving unit, configured to receive a play request of the user, wherein the play request includes: the play start time of the synthesized audio;

and a playing unit configured to play each audio within its time period for synthesis according to the play start time of the synthesized audio and the start time of each audio in the synthesized audio, so as to show the user the effect of the synthesized audio.

11. An audio processing system comprising:

the browser module of claim 10; and

a server configured to, in response to the user confirming the effect of the synthesized audio displayed by the browser module, acquire the plurality of audios to be synthesized and the configuration information of each audio; and synthesize the audios into one audio according to the configuration information of the audios.

12. An audio processing apparatus, comprising:

a processor; and

a memory coupled to the processor for storing instructions that, when executed by the processor, cause the processor to perform the audio processing method of any of claims 1-9.

13. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the steps of the audio processing method of any of claims 1-9.

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to an audio processing method, apparatus, system, browser module, and readable storage medium.

Background

In order to enhance the expressiveness of audio, multiple audios are usually synthesized into one. For example, specific sound effects are added to a main audio, such as the sound of a punch in a kung-fu film or a falling leaf.

At present, if a user wants to preview the synthesized audio, the server generally synthesizes the individual audios and sends the newly synthesized audio to the browser for playback.

Disclosure of Invention

The inventors found that when a user previews audio through a browser, the user may repeatedly adjust the audio configuration parameters (e.g., the time and number of times a sound effect plays). If the audio is synthesized by the server after each adjustment and then returned to the browser for playback, the server may be overloaded, especially when there are many users. Moreover, waiting for the server to synthesize the audio and return it to the browser takes a long time, which also results in a poor user experience.

One technical problem to be solved by the present disclosure is: how to improve the playback efficiency and real-time performance of synthesized audio when a user previews it, and how to reduce the load on the server.

According to some embodiments of the present disclosure, there is provided an audio processing method including: the browser module, in response to a preview request of a user, acquires a plurality of audios to be synthesized and the configuration information of each audio, wherein the configuration information of each audio comprises the start time of the audio in the synthesized audio and the time period of the audio used for synthesis; the browser module receives a play request of the user, wherein the play request comprises the play start time of the synthesized audio; and the browser module plays each audio within its time period for synthesis according to the play start time of the synthesized audio and the start time of each audio in the synthesized audio, so as to show the user the effect of the synthesized audio.

In some embodiments, the browser module playing the audio within the time period for the respective audio to be synthesized according to the play start time of the synthesized audio and the start time of the respective audio in the synthesized audio includes: the browser module determines the time difference between the playing starting time of the synthesized audio and the starting time of each audio in the synthesized audio; for each audio, the browser module sets the value of the start timer corresponding to the audio to be the time difference when the time difference corresponding to the audio is greater than zero, and starts to play the audio within the time period for synthesizing the audio when the start timer corresponding to the audio is finished.

In some embodiments, the browser module playing each audio within its time period for synthesis according to the play start time of the synthesized audio and the start time of each audio in the synthesized audio further includes: for each audio, the browser module starts playing the audio within its time period for synthesis if the time difference corresponding to the audio is equal to zero; or, for each audio, if the time difference corresponding to the audio is less than zero, the browser module determines the difference value between the start time of the time period for synthesis and the time difference, and starts playing the audio from the time corresponding to the difference value.

In some embodiments, the browser module playing the audio within the time period for the respective audio to be synthesized according to the playing start time of the synthesized audio and the start time of the respective audio in the synthesized audio further includes: and for each audio, the browser module sets an end timer corresponding to the audio according to the end time of the time period for synthesizing the audio when the browser module starts to play the audio, and finishes playing the audio when the end timer finishes.

In some embodiments, the configuration information of each audio further comprises the number of repetitions of the audio; and the browser module playing each audio within its time period for synthesis according to the play start time of the synthesized audio and the start time of each audio in the synthesized audio includes: for each audio, when the current playback of the audio finishes, the browser module judges whether the number of repetitions of the audio has been reached, and replays the audio within the time period for synthesis if it has not.

In some embodiments, the method further comprises: in response to the user confirming the effect of the synthesized audio displayed by the browser module, the server acquires the plurality of audios to be synthesized and the configuration information of each audio; and the server synthesizes the audios into one audio according to the configuration information of the audios.

In some embodiments, the server synthesizing the audios into one audio according to the configuration information of the audios includes: the server converts the configuration information of each audio into a filter_complex parameter of FFmpeg (Fast Forward Moving Picture Experts Group); and the server synthesizes the audios into one audio using FFmpeg according to the filter_complex parameters of the audios.

In some embodiments, the server converting the configuration information of each audio into the filter_complex parameter of FFmpeg includes: for each audio in turn, the server adds a trimming (atrim) filter and configures the parameters of the atrim filter corresponding to the audio in the filter_complex parameter according to the time period of the audio used for synthesis, and adds a delay (adelay) filter and configures the parameters of the adelay filter corresponding to the audio in the filter_complex parameter according to the start time of the audio in the synthesized audio; and the server adds a mixing (amix) filter and configures the parameters of the amix filter in the filter_complex parameter according to the total number of audios.

In some embodiments, the configuration information of each audio further comprises the number of repetitions of the audio; and the server converting the configuration information of each audio into the filter_complex parameter of FFmpeg further includes: for each audio in turn, the server adds a looping (aloop) filter and configures the parameters of the aloop filter corresponding to the audio in the filter_complex parameter according to the number of repetitions of the audio.
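For illustration only (this sketch is not taken from the patent), the conversion described above can be pictured as assembling a filter_complex string. atrim, adelay, aloop, and amix are real FFmpeg filters; the exact option layout, the stereo assumption for adelay, and the fixed sample rate used to size aloop are assumptions of this sketch:

```javascript
// Build an FFmpeg filter_complex string from per-clip configuration.
// Each clip: { start, startTime, endTime, times } (seconds / repeat count).
function buildFilterComplex(clips, sampleRate = 44100) {
  const chains = clips.map((c, i) => {
    const steps = [`atrim=start=${c.startTime}:end=${c.endTime}`];
    if (c.times > 1) {
      // aloop's `loop` counts extra repeats; `size` is the loop length in samples.
      const sizeSamples = Math.round((c.endTime - c.startTime) * sampleRate);
      steps.push(`aloop=loop=${c.times - 1}:size=${sizeSamples}`);
    }
    // adelay expects milliseconds, one value per channel (stereo assumed).
    const ms = Math.round(c.start * 1000);
    steps.push(`adelay=${ms}|${ms}`);
    return `[${i}:a]${steps.join(",")}[a${i}]`;
  });
  const labels = clips.map((_, i) => `[a${i}]`).join("");
  return `${chains.join(";")};${labels}amix=inputs=${clips.length}`;
}
```

The resulting string would be passed to FFmpeg via `-filter_complex "..."`; whether the trimmed streams additionally need a timestamp reset (e.g. asetpts) depends on the surrounding command and is left out of this sketch.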

According to further embodiments of the present disclosure, there is provided a browser module including: an acquisition unit configured to, in response to a preview request of a user, acquire a plurality of audios to be synthesized and the configuration information of each audio, wherein the configuration information of each audio comprises the start time of the audio in the synthesized audio and the time period of the audio used for synthesis; a receiving unit configured to receive a play request of the user, wherein the play request includes the play start time of the synthesized audio; and a playing unit configured to play each audio within its time period for synthesis according to the play start time of the synthesized audio and the start time of each audio in the synthesized audio, so as to show the user the effect of the synthesized audio.

In some embodiments, the playing unit is configured to determine a time difference between a playing start time of the synthesized audio and a start time of each audio in the synthesized audio; and for each audio, setting the value of the start timer corresponding to the audio as the time difference under the condition that the time difference corresponding to the audio is larger than zero, and starting to play the audio in the time period for synthesizing the audio when the start timer corresponding to the audio is finished.

In some embodiments, the playing unit is further configured to, for each audio, start playing the audio within its time period for synthesis if the time difference corresponding to the audio is equal to zero; or, for each audio, if the time difference corresponding to the audio is less than zero, determine the difference value between the start time of the time period for synthesis and the time difference, and start playing the audio from the time corresponding to the difference value.

In some embodiments, the playing unit is further configured to, for each audio, set an end timer corresponding to the audio according to an end time of a time period for synthesizing the audio while starting playing the audio, and end playing the audio when the end timer ends.

In some embodiments, the configuration information of each audio further comprises the number of repetitions of the audio; the playing unit is further configured to, for each audio, judge whether the number of repetitions of the audio has been reached when the current playback of the audio finishes, and replay the audio within the time period for synthesis if it has not.

According to still further embodiments of the present disclosure, there is provided an audio processing system including: the browser module of any of the preceding embodiments; and a server configured to, in response to the user confirming the effect of the synthesized audio displayed by the browser module, acquire the plurality of audios to be synthesized and the configuration information of each audio, and synthesize the audios into one audio according to the configuration information of the audios.

In some embodiments, the server is configured to convert the configuration information of each audio into a filter_complex parameter of FFmpeg (Fast Forward Moving Picture Experts Group), and to synthesize the audios into one audio using FFmpeg according to the filter_complex parameters of the audios.

In some embodiments, for each audio in turn, the server is configured to add a trimming (atrim) filter and configure the parameters of the atrim filter corresponding to the audio in the filter_complex parameter according to the time period of the audio used for synthesis, and to add a delay (adelay) filter and configure the parameters of the adelay filter corresponding to the audio in the filter_complex parameter according to the start time of the audio in the synthesized audio; and to add a mixing (amix) filter and configure the parameters of the amix filter in the filter_complex parameter according to the total number of audios.

In some embodiments, the configuration information of each audio further comprises the number of repetitions of the audio; and for each audio in turn, the server is further configured to add a looping (aloop) filter and configure the parameters of the aloop filter corresponding to the audio in the filter_complex parameter according to the number of repetitions of the audio.

According to still further embodiments of the present disclosure, there is provided an audio processing apparatus including: a processor; and a memory coupled to the processor for storing instructions that, when executed by the processor, cause the processor to perform the audio processing method of any of the preceding embodiments.

According to further embodiments of the present disclosure, there is provided a non-transitory computer readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the steps of the audio processing method of any of the preceding embodiments.

The browser module of the present disclosure, in response to a preview request of a user, acquires a plurality of audios to be synthesized and the configuration information of each audio. The configuration information of each audio includes the start time of the audio in the synthesized audio and the time period of the audio used for synthesis. When the user initiates a play request, the browser module plays each audio within its time period for synthesis according to the play start time of the synthesized audio and the start time of each audio in the synthesized audio. In this way, a user who frequently modifies the audio can preview it directly in the browser module, without notifying the server to re-synthesize the audio and return it to the browser module each time; this improves the playback efficiency and real-time performance of previewing synthesized audio and reduces the load on the server.

Other features of the present disclosure and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.

Drawings

In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings described below are only some embodiments of the present disclosure; other drawings can be obtained by those skilled in the art without creative effort.

Fig. 1 illustrates a flow diagram of an audio processing method of some embodiments of the present disclosure.

Fig. 2 shows a flow diagram of an audio processing method of further embodiments of the present disclosure.

Fig. 3 shows a flow diagram of an audio processing method of further embodiments of the present disclosure.

FIG. 4 illustrates a structural schematic diagram of a browser module of some embodiments of the present disclosure.

Fig. 5 shows a schematic structural diagram of an audio processing system of some embodiments of the present disclosure.

Fig. 6 shows a schematic structural diagram of an audio processing apparatus of some embodiments of the present disclosure.

Fig. 7 shows a schematic structural diagram of an audio processing device according to further embodiments of the present disclosure.

Detailed Description

The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.

The present disclosure provides an audio processing method, which can be applied to a scene in which a user previews a synthesized audio, and is described below with reference to fig. 1.

Fig. 1 is a flow diagram of some embodiments of the disclosed audio processing method. As shown in fig. 1, the method of this embodiment includes: steps S102 to S106.

In step S102, the browser module acquires a plurality of audios to be synthesized and configuration information of each audio in response to a preview request of a user.

The configuration information of each audio includes, for example: the start time of the audio in the synthesized audio, and the time period of the audio used for synthesis. For example, the start time of the audio in the synthesized audio is denoted start; start = 2 means the audio begins playing at the 2nd second of the synthesized audio. The time period used for synthesis can be represented by startTime and endTime; startTime = 3 and endTime = 5 means that seconds 3 to 5 of the audio are used for synthesis. Alternatively, the time period can be represented by startTime and duration; startTime = 3 and duration = 2 likewise means that seconds 3 to 5 of the audio are used for synthesis. The representation of the configuration information may be set according to actual requirements and is not limited to these examples.
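As a concrete illustration (not part of the patent text), the two equivalent representations above can be sketched as plain objects; the field names start, startTime, endTime, and duration follow the examples in this paragraph and are illustrative only:

```javascript
// Hypothetical configuration objects for one audio clip, following the
// field names used in the examples above (illustrative, not a fixed API).
const clipWithEnd = { start: 2, startTime: 3, endTime: 5 };       // seconds 3-5, placed at 2 s
const clipWithDuration = { start: 2, startTime: 3, duration: 2 }; // the same period

// Normalize either representation to an explicit end time.
function endTimeOf(cfg) {
  return cfg.endTime !== undefined ? cfg.endTime : cfg.startTime + cfg.duration;
}
```

Both objects describe the same period, so `endTimeOf` returns 5 for each.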

Further, the configuration information of each audio may also include the number of repetitions of the audio, denoted, for example, times; times = 5 indicates that the audio is repeated 5 times in the synthesized audio. The plurality of audios may include a primary audio and at least one secondary audio, where the primary audio starts at time zero in the synthesized audio.

Prior to step S102, the browser module may receive an audio synthesis request from the user and provide selectable audios to the user, or receive audios uploaded by the user. The user then sets configuration information for the selected or uploaded audios, and the browser module receives the configuration information of each audio set by the user. The user can audition each audio and set, via corresponding controls, the start time of each audio in the synthesized audio and the time period of each audio used for synthesis. The user may then issue a preview request to the browser module, for example by clicking a preview button, in order to preview the effect of the synthesized audio.

The browser module reads the configuration information of each audio and loads the audios in parallel. The browser module may generate a configuration list from the configuration information; the list may include, for each audio, at least one of: an identifier of the audio, its start time in the synthesized audio, its time period used for synthesis (startTime and endTime, or duration), and its number of repetitions. The browser module may also obtain the total duration of each audio and store it in the configuration list to facilitate subsequent playback operations.

In step S104, the browser module receives a play request of the user.

After receiving the preview request, the browser module can display a play window for the synthesized audio, and the user sends a play request by clicking a play button. The play request may include the play start time of the synthesized audio, i.e., the user can choose to start playing at any time, e.g., at the 10th second of the synthesized audio, or directly from 0 seconds.

In step S106, the browser module plays the audio within the time period for synthesizing each audio according to the playing start time of the synthesized audio and the start time of each audio in the synthesized audio, so as to show the effect of the synthesized audio to the user.

When the playback time ranges of multiple audios intersect, the browser module plays those audios simultaneously, thereby showing the user the effect of the synthesized audio. For example, a secondary audio is played while the primary audio is playing, presenting the user with the effect of adding the secondary audio to the primary audio.

In some embodiments, the browser module determines the time difference between the play start time of the synthesized audio and the start time of each audio in the synthesized audio. For each audio: if the time difference corresponding to the audio is greater than zero, the browser module sets the value of the start timer corresponding to the audio to the time difference, and starts playing the audio within its time period for synthesis when that start timer expires; if the time difference is equal to zero, the browser module starts playing the audio within its time period for synthesis immediately; and if the time difference is less than zero, the browser module determines the difference value between the start time of the time period for synthesis and the time difference, and starts playing the audio from the time corresponding to the difference value.
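The three cases above can be sketched as a small pure function (a hedged illustration; the media-element calls that would act on the result are omitted). All values are in seconds, and the time difference is the audio's start time in the synthesized audio minus the chosen play start time:

```javascript
// Decide how to start one clip, per the three cases described above.
// startInSynth: clip's start time in the synthesized audio
// playStart:    user-chosen play start time of the synthesized audio
// periodStart:  start of the clip's time period used for synthesis
function schedulePlayback(startInSynth, playStart, periodStart) {
  const dt = startInSynth - playStart;
  if (dt > 0) {
    // Clip has not started yet: arm a start timer of dt seconds.
    return { action: "delay", timerSeconds: dt, seekTo: periodStart };
  }
  if (dt === 0) {
    // Clip starts exactly at the chosen play start time.
    return { action: "play", seekTo: periodStart };
  }
  // dt < 0: the clip is already -dt seconds in; seek forward accordingly.
  return { action: "play", seekTo: periodStart - dt };
}
```

With the worked example of step S211 (clip placed at second 2, play start at second 3, synthesis period starting at second 4), `schedulePlayback(2, 3, 4)` yields immediate playback from the 5th second.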

As shown in fig. 2, for each audio, the browser module may process in the following way.

In step S202, the play start time of the synthesized audio is set.

In step S204, configuration information of the audio is read.

The browser module may traverse the previously generated configuration list, and read information such as the start time of each audio in the configuration list in the synthesized audio, the time period for the synthesis, and the like.

In step S206, a time difference between the start time of the audio in the synthetic audio and the playback start time of the synthetic audio is calculated.

For example, the time differences for the respective audios are denoted Δt0, Δt1, …, Δtn, one per audio.

In step S208, it is determined whether the time difference corresponding to the audio is greater than 0. If greater than 0, step S209 is performed, if equal to 0, step S210 is performed, and if less than 0, step S211 is performed.

In step S209, the value of the start timer corresponding to the audio is set to the corresponding time difference. Step S212 is then performed.

If Δti > 0 (0 ≤ i ≤ n, i an integer), audio i has not yet started playing; the browser module sets the value of the start timer corresponding to audio i to Δti, and audio i starts playing after Δti has elapsed.

In step S210, the playing of audio within a time period for synthesis is started.

If Δti = 0, audio i starts playing exactly at the play start time of the synthesized audio.

In step S211, a difference between the start time of the time period for synthesizing the audio and the time difference is determined, and playback is started from the time corresponding to the difference.

If Δti < 0, audio i would already have been playing for some time at the play start time of the synthesized audio. For example, suppose audio i starts at the 2nd second in the synthesized audio and the play start time of the synthesized audio selected by the user is the 3rd second; then Δti = −1, i.e. audio i has already played for 1 s. If the time period for synthesis in the configuration information of audio i is 4–7 s, the start of that period is the 4th second, so at the play start time of the synthesized audio, audio i should be played from the 5th second.

In step S212, in response to the start timer ending, audio within a time period for the synthesis starts to be played.

In step S214, an end timer corresponding to the audio is set.

In some embodiments, for each audio, the browser module sets an end timer corresponding to the audio according to an end time of a time period for synthesizing the audio while starting playing the audio, and ends playing the audio when the end timer ends. The browser module may calculate, for each audio, the time period for synthesis remaining for that audio at the start of playback, and set the value of the end timer to the length of the time period remaining for synthesis. For example, at the playing start time of the synthesized audio, the time period of the audio i for synthesis is 4-7 s, the audio i starts to play from the 5 th s, the remaining time period for synthesis is 5-7 s, and the value of the end timer is set to 2 s.
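The value of the end timer can be computed from the segment boundaries and the position from which playback actually starts (a minimal sketch; the function name is illustrative):

```typescript
// Remaining length (in seconds) of the time period used for synthesis,
// given the position inside the audio from which playback actually starts.
//   segStart, segEnd - boundaries of the time period for synthesis
//   playFrom         - position in the audio where playback begins
function endTimerValue(segStart: number, segEnd: number, playFrom: number): number {
  const from = Math.max(segStart, playFrom); // playback never starts before the segment
  return Math.max(0, segEnd - from);         // never negative
}
```

For the example above (segment 4–7 s, playing from the 5th second) this gives an end-timer value of 2 s.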

In step S216, the audio is finished playing in response to the end timer ending.

In step S218, it is determined whether the number of times of playing the audio reaches the number of times of repetition, and if so, the process is terminated, otherwise, the process returns to step S210 to start execution.

In some embodiments, for each audio, the browser module determines whether the number of repetitions of the audio is reached when the current playing of the audio is finished, and re-plays the audio within a time period for synthesis when the number of repetitions of the audio is not reached.

For each audio, if parameters such as start and startTime are not configured, they may default to 0; if endTime is not configured, the end of the time period for synthesis may default to the end time of the audio; and if times is not configured, the number of repetitions may default to 1. These default values may also be stored in the configuration list.
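The defaulting rules above can be sketched as a small normalization step. The field names (path, start, startTime, endTime, times) follow the text; the interface and function names are illustrative.

```typescript
// One item of audio configuration information, before defaults are applied.
interface AudioConfig {
  path: string;        // location of the audio file (required)
  start?: number;      // start time of the audio in the synthesized audio (s)
  startTime?: number;  // start of the time period used for synthesis (s)
  endTime?: number;    // end of the time period used for synthesis (s)
  times?: number;      // number of repetitions
}

// Fill in the default values described in the text.
// `audioDuration` is the full length of the audio file in seconds.
function withDefaults(item: AudioConfig, audioDuration: number): Required<AudioConfig> {
  if (!item.path) throw new Error("path is required");
  return {
    path: item.path,
    start: item.start ?? 0,           // default: plays from the beginning of the mix
    startTime: item.startTime ?? 0,   // default: segment starts at 0
    endTime: item.endTime ?? audioDuration, // default: segment ends at the audio's end
    times: item.times ?? 1,           // default: played once
  };
}
```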

The browser module may implement the synthesized-audio preview scheme of the above embodiments using the Web Audio API or Audio-node technology, and is not limited to the illustrated examples.

In the above embodiment, the browser module, in response to a preview request of a user, acquires a plurality of audios to be synthesized and configuration information of each audio. The configuration information of each audio includes: the start time of the audio in the synthesized audio, and the time period of the audio used for synthesis. When a user initiates a play request, the browser module plays the audio within each audio's time period for synthesis according to the play start time of the synthesized audio and the start time of each audio in the synthesized audio. With the method of this embodiment, when the user frequently modifies the audio, the user can preview directly in the browser module, without notifying the server to re-synthesize the audio and return it to the browser module each time. This improves the efficiency and real-time performance of previewing the synthesized audio and reduces the pressure on the server.

After the user previews the effect of the synthesized audio through the browser, the user can readjust the configuration of each audio, and then previewing can be performed again according to the method. The user may also confirm the synthesized audio, and the present disclosure also provides some embodiments of a method for the server to synthesize multiple audios, as described below in conjunction with fig. 3.

Fig. 3 is a flow diagram of further embodiments of the audio processing method of the present disclosure. As shown in fig. 3, the method of this embodiment includes: steps S302 to S304.

In step S302, in response to the user confirming the effect of the synthesized audio presented by the browser module, the server acquires a plurality of audios to be synthesized and configuration information of each audio.

After the preview is completed, the user can generate the synthesized audio by triggering the confirmation function. The preview stage in the browser module and the synthesis stage on the server can share the same configuration information, forming a complete front-end/back-end audio solution.

In step S304, the server synthesizes the respective audios into one audio according to the configuration information of the respective audios.

The server may use applications such as FFmpeg (Fast Forward MPEG) or SoX (Sound eXchange) to synthesize the audios into one audio, and is not limited to the illustrated examples. Whichever application is adopted, the configuration information of each audio needs to be converted into the corresponding parameters of that application, and the corresponding functions called to synthesize the audios. FFmpeg is taken as an example below.

In some embodiments, the server converts the configuration information of each audio into a filter_complex parameter in FFmpeg, and then uses FFmpeg to synthesize the audios into one audio according to the filter_complex parameters of the respective audios.

The filter_complex parameter in FFmpeg can apply arbitrary modifications to the audio by combining various built-in filters, such as atrim (audio trimming), adelay (delayed audio playback), aloop (looped audio playback), amix (audio stream mixing), and so on. Since filter_complex supports streaming configuration and automatically synthesizes the audio according to the configured logic, the incoming audio configuration information needs to be converted into the filter_complex format.

In some embodiments, for each audio in turn, the server adds a trimming atrim filter and configures its parameters in the filter_complex parameter according to the time period for synthesizing that audio; adds a delay adelay filter and configures its parameters in the filter_complex parameter according to the start time of that audio in the synthesized audio; and finally adds a mixing amix filter and configures its parameters in the filter_complex parameter according to the total number of audios.

Further, the configuration information of each audio may further include: the number of repetitions of the audio. For each audio in turn, the server adds a looping aloop filter and configures its parameters in the filter_complex parameter according to the number of repetitions of that audio.

For example, the audio configuration information may be represented in the form of a configuration list. The following method may be employed for the conversion and processing of the parameters for each audio in the list.

(1) Reading the ith item of configuration information in the configuration list, and marking as item.

(2) Judge whether item.path is set; if so, execute (3), otherwise throw an error. item.path represents the path of the audio, so this step determines whether the path of the audio has been set.

(3) Judge whether item.startTime and item.endTime (or item.duration) exist. If item.startTime does not exist, set it to 0; if item.endTime (or item.duration) does not exist, set item.endTime to the end time of the audio (or set item.duration to the length from startTime to the end of the audio); if they exist, execute (4).

That is, judge whether the configuration information of the audio includes the information related to the time period for synthesis; if so, execute (4), otherwise set the values according to the above method.

(4) Add an atrim filter to trim the audio according to item.startTime and item.endTime. For example, when startTime of the audio is 26 and endTime is 34, the corresponding entry in the filter_complex parameter is atrim=start=26:end=34.

(5) Judge whether item.times exists; if not, set item.times to 1, otherwise execute (6). That is, judge whether the number of repetitions of the audio has been set; if not, it defaults to 1.

(6) Add an aloop filter to set the number of times the audio is played according to item.times. For example, if item.times of the audio is 4, the aloop filter in the filter_complex parameter is configured so that the audio is played 4 times.

(7) Judge whether item.start exists; if not, set item.start to 0, otherwise execute (8). That is, judge whether the start time of the audio in the synthesized audio has been set; if not, it defaults to 0.

(8) Add an adelay filter to set the start time of the audio in the synthesized audio according to item.start. For example, if start of the audio is 2 (in seconds), the adelay entry in the filter_complex parameter is 2000, since adelay is specified in milliseconds.

(9) The individual filter parameters are combined.

(10) Judge whether i = n, where n is the number of audios. If so, execute (11); otherwise set i = i + 1 and return to (1).

(11) Add the amix filter to mix the multiple audio streams into one.
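Steps (1)–(11) can be sketched as a function that assembles a filter_complex expression from the configuration list. This is a sketch under assumptions: it follows FFmpeg's documented conventions (millisecond-valued adelay, aloop counting repetitions beyond the first play), but the stream labels and option spellings here are illustrative rather than copied from the source.

```typescript
// One configuration item after defaults have been applied; times in seconds.
interface TrackConfig {
  startTime: number; // start of the segment used for synthesis (s)
  endTime: number;   // end of the segment (s)
  start: number;     // start time of this audio in the synthesized audio (s)
  times: number;     // number of repetitions
}

// Assemble an FFmpeg filter_complex expression: per input, trim -> loop -> delay,
// then mix all labeled streams with amix.
function buildFilterComplex(tracks: TrackConfig[]): string {
  const chains = tracks.map((t, i) => {
    const filters = [`atrim=start=${t.startTime}:end=${t.endTime}`];
    if (t.times > 1) {
      // aloop's `loop` counts extra repetitions beyond the first play (assumed here);
      // `size` is the number of samples to loop (set to a large bound).
      filters.push(`aloop=loop=${t.times - 1}:size=2147483647`);
    }
    const ms = Math.round(t.start * 1000);  // adelay is specified in milliseconds
    filters.push(`adelay=${ms}|${ms}`);     // delay both stereo channels equally
    return `[${i}:a]${filters.join(",")}[a${i}]`;
  });
  const labels = tracks.map((_, i) => `[a${i}]`).join("");
  return `${chains.join(";")};${labels}amix=inputs=${tracks.length}[out]`;
}
```

For the examples in steps (4), (6) and (8), a track with startTime 26, endTime 34, start 2 and times 4 produces a chain containing `atrim=start=26:end=34`, `aloop=loop=3:...` and `adelay=2000|2000`.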

In the method of this embodiment, the audio preview in the browser module and the audio synthesis on the server share the same configuration parameters, forming a complete front-end/back-end audio solution. The server synthesizes the audio only when the user confirms the audio previewed in the browser, so the server does not need to re-synthesize the audio and return it to the browser module each time. This improves the efficiency and real-time performance of previewing the synthesized audio and reduces the pressure on the server.

The present disclosure also provides a browser module, described below in conjunction with fig. 4.

FIG. 4 is a block diagram of some embodiments of a browser module of the present disclosure. As shown in fig. 4, the browser module 40 of this embodiment includes: an acquisition unit 410, a receiving unit 420, and a playing unit 430.

The obtaining unit 410 is configured to obtain a plurality of audios to be synthesized and configuration information of each audio in response to a preview request of a user; wherein the configuration information of each audio comprises: the starting time of the audio in the synthesized audio, and the time period for which the audio is to be synthesized.

The receiving unit 420 is configured to receive a play request of a user, where the play request includes: the play start time of the synthesized audio.

The playing unit 430 is configured to play the audio within the time period for synthesizing each audio according to the playing start time of the synthesized audio and the start time of each audio in the synthesized audio, so as to show the effect of the synthesized audio to the user.

In some embodiments, the playing unit 430 is configured to determine, for each audio, the time difference between the start time of that audio in the synthesized audio and the play start time of the synthesized audio; and, for each audio, if the time difference corresponding to the audio is greater than zero, set the value of the start timer corresponding to the audio to the time difference, and start to play the audio within its time period for synthesis when the start timer expires.

In some embodiments, the playing unit 430 is further configured to, for each audio, start playing the audio within its time period for synthesis when the time difference corresponding to the audio is equal to zero; or, for each audio, when the time difference corresponding to the audio is less than zero, determine the difference between the start of the audio's time period for synthesis and the time difference, and start playing the audio from the position corresponding to that difference.

In some embodiments, the playing unit 430 is further configured to, for each audio, set an end timer corresponding to the audio according to an end time of a time period for synthesizing the audio while starting playing the audio, and end playing the audio when the end timer ends.

In some embodiments, the configuration information for each audio further comprises: the number of repetitions of the audio; the playing unit 430 is further configured to determine, for each audio, whether the number of repetitions of the audio is reached when the current playing of the audio is finished, and replay the audio within a time period for synthesizing the audio when the number of repetitions of the audio is not reached.

The present disclosure also provides an audio processing system, described below in conjunction with fig. 5.

Fig. 5 is a block diagram of some embodiments of an audio processing system of the present disclosure. As shown in fig. 5, the audio processing system 5 of this embodiment includes: a browser module 40 and a server 52.

The server 52 is configured to, in response to the user confirming the effect of the synthesized audio presented by the browser module 40, obtain a plurality of audios to be synthesized and configuration information of each audio; and synthesizing the audio frequencies into one audio frequency according to the configuration information of the audio frequencies.

In some embodiments, the server 52 is configured to convert the configuration information of each audio into a filter_complex parameter in FFmpeg, and to use FFmpeg to synthesize the audios into one audio according to the filter_complex parameters of the respective audios.

In some embodiments, for each audio in turn, the server 52 is configured to add a trimming atrim filter and configure its parameters in the filter_complex parameter according to the time period for synthesizing that audio; add a delay adelay filter and configure its parameters in the filter_complex parameter according to the start time of that audio in the synthesized audio; and add a mixing amix filter and configure its parameters in the filter_complex parameter according to the total number of audios.

In some embodiments, the configuration information of each audio further includes: the number of repetitions of the audio. For each audio in turn, the server 52 is further configured to add a looping aloop filter and configure its parameters in the filter_complex parameter according to the number of repetitions of that audio.

The audio processing apparatus including the browser module and the server in the embodiments of the present disclosure may be implemented by various computing devices or computer systems, which are described below in conjunction with fig. 6 and 7.

Fig. 6 is a block diagram of some embodiments of an audio processing device of the present disclosure. As shown in fig. 6, the apparatus 60 of this embodiment includes: a memory 610 and a processor 620 coupled to the memory 610, the processor 620 configured to perform the audio processing method in any of the embodiments of the present disclosure based on instructions stored in the memory 610.

Memory 610 may include, for example, system memory, fixed non-volatile storage media, and the like. The system memory stores, for example, an operating system, an application program, a Boot Loader (Boot Loader), a database, and other programs.

Fig. 7 is a block diagram of further embodiments of an audio processing device of the present disclosure. As shown in fig. 7, the apparatus 70 of this embodiment includes: memory 710 and processor 720, which are similar to memory 610 and processor 620, respectively. An input/output interface 730, a network interface 740, a storage interface 750, and the like may also be included. These interfaces 730, 740, 750, as well as the memory 710 and the processor 720, may be connected, for example, by a bus 760. The input/output interface 730 provides a connection interface for input/output devices such as a display, a mouse, a keyboard, and a touch screen. The network interface 740 provides a connection interface for various networking devices, such as a database server or a cloud storage server. The storage interface 750 provides a connection interface for external storage devices such as an SD card or a USB flash drive.

As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only exemplary of the present disclosure and is not intended to limit the present disclosure, so that any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.
