Recording processing and playing method and device, server, terminal and storage medium

Document No.: 1616997 Publication date: 2020-01-10

Abstract: This technology, "Recording processing and playing method and device, server, terminal and storage medium", was designed by 李�杰 on 2018-07-03. Embodiments of the present invention provide a recording processing method, a recording playing method, corresponding apparatuses, a server, a terminal, and a storage medium. The server optimizes the independent call audio data of the N call ends in an N-end call according to a sound mixing processing rule that is generated from the object feature information of an object, so the resulting optimized recording data depends on that object feature information: when the objects differ, the optimized recording data obtained by optimizing the original audio data according to the sound mixing processing rule is not entirely the same. Optimized recording data with different optimization tendencies can therefore be obtained from the same original audio data, so that a user can play back and listen to the N-end call recording from different "angles" and understand the content of the N-end call more fully and clearly. In addition, because users can hear different effects, the recording also becomes considerably more engaging.

1. A method of recording processing, comprising:

acquiring object characteristic information of an object;

generating a sound mixing processing rule according to the object characteristic information;

processing original audio data from a terminal according to the sound mixing processing rule to obtain optimized recording data, wherein the original audio data comprises independent call audio data of N call ends in N-end call, and N is more than or equal to 2;

and sending the optimized recording data to a terminal.

2. The audio recording processing method according to claim 1, wherein the object characteristic information includes basic object characteristic information and/or subject object characteristic information, and the basic object characteristic information includes current physiological information and/or social information of the object; the subject object feature information is object feature information of the object under the call subject of the N-terminal call.

3. The audio recording processing method of claim 2, wherein the object feature information includes basic object feature information, and acquiring the basic object feature information of the object includes:

receiving a recording optimization instruction sent by a terminal;

and extracting basic object characteristic information of the object from the recording optimization instruction.

4. The audio recording processing method according to claim 2, wherein the object feature information includes subject object feature information, and acquiring the subject object feature information of the object includes:

fusing the call audio data of each call end in the original audio data according to a time axis to generate pre-mixed audio data;

performing semantic recognition on the pre-mixed audio data;

and determining the subject object characteristic information of the object according to the semantic recognition result.

5. The audio recording processing method according to claim 4, wherein the determining the subject object feature information of the object according to the semantic recognition result comprises:

determining the call theme and/or call keywords of the current N-terminal call according to the semantic recognition result;

and determining, according to the call theme and/or the call keywords, information whose importance to the object reaches a preset threshold value as the subject object feature information.

6. The recording processing method of any of claims 1-5, wherein the optimized recording data includes at least one optimized audio clip; the sound mixing processing rule comprises an optimization processing mode for optimizing an original audio clip in original audio data to obtain an optimized audio clip; the processing the original audio data from the terminal according to the audio mixing processing rule to obtain optimized recording data comprises:

and processing the original audio clip according to the optimization processing mode corresponding to the original audio clip in the sound mixing processing rule to obtain a corresponding optimized audio clip.

7. The recording processing method of claim 6, wherein the optimization processing means includes at least one of the following:

after the playing volume of the call audio data of at least one call terminal is increased/decreased, carrying out fusion processing on the call audio data of each call terminal according to a time axis;

increasing/decreasing the playback speed of the original audio data;

and determining, for each call end, overlapped audio data whose content overlaps that of other call ends on the time axis, and placing the overlapped audio data of each call end separately and in sequence on the time axis of the optimized recording data according to a semantic recognition result of the original audio data.

8. The audio recording processing method according to claim 7, further comprising: and sending the sound mixing processing rule to the terminal.

9. A recording playing method, comprising:

acquiring object characteristic information of an object and sending the object characteristic information to a server;

receiving optimized recording data sent by the server, wherein the optimized recording data is obtained by processing original audio data according to an audio mixing processing rule after the server generates the audio mixing processing rule according to the object characteristic information, the original audio data comprises independent call audio data of N call ends in N-end call, and N is more than or equal to 2;

and playing the optimized recording data.

10. The audio record playing method of claim 9, wherein before playing the optimized audio record data, further comprising: receiving a semantic recognition result which is sent by the server and is obtained by performing semantic recognition on pre-mixing audio data, wherein the pre-mixing audio data are generated by fusing call audio data of all call ends in the original audio data according to a time axis;

when playing the optimized recording data, the method further comprises: and synchronously displaying semantic identification result content corresponding to the currently played audio data according to the time axis of the optimized recording data.

11. The sound recording playing method according to claim 9 or 10, wherein the optimized sound recording data includes at least one optimized audio clip; the playing the optimized recording data comprises:

displaying an audio selection control corresponding to each optimized audio clip in the optimized recording data;

receiving a selection instruction for the audio selection control through a display screen;

and playing the optimized audio clip corresponding to the audio selection control.

12. The audio record playing method of claim 11, wherein before playing the optimized audio record data, further comprising: receiving a sound mixing processing rule which is sent by the server and corresponds to the optimized recording data, wherein the sound mixing processing rule comprises an optimization starting and ending time and a mode of optimizing original audio data within the optimization starting and ending time;

the displaying of the audio selection control corresponding to each optimized audio clip in the optimized recording data includes:

and marking and displaying the audio selection control corresponding to each optimized audio clip on the playing time axis of the pre-mixed audio data according to the optimized starting and stopping time.

13. The audio playback method of claim 11, wherein the audio selection control displays keywords corresponding to the optimized audio segment.

14. A recording processing apparatus, characterized by comprising:

the information acquisition module is used for acquiring object characteristic information of an object;

the rule generating module is used for generating a sound mixing processing rule according to the object characteristic information;

the optimization processing module is used for processing original audio data from a terminal according to the sound mixing processing rule to obtain optimized recording data, wherein the original audio data comprises independent call audio data of N call ends in N-end call, and N is more than or equal to 2;

and the recording sending module is used for sending the optimized recording data to a terminal.

15. A recording/playback apparatus, comprising:

the information sending module is used for acquiring object characteristic information of an object and sending the object characteristic information to the server;

the recording receiving module is used for receiving optimized recording data sent by the server, the optimized recording data is obtained by processing original audio data according to the sound mixing processing rule after the sound mixing processing rule is generated by the server according to the object characteristic information, the original audio data comprises independent call audio data of N call ends in N-end call, and N is more than or equal to 2;

and the recording playing module is used for playing the optimized recording data.

16. A server, comprising a first processor, a first memory, and a first communication bus;

the first communication bus is used for realizing connection communication between the first processor and the first memory;

the first processor is configured to execute one or more programs stored in the first memory to implement the steps of the sound recording processing method according to any one of claims 1 to 8.

17. A terminal, comprising a second processor, a second memory, and a second communication bus;

the second communication bus is used for realizing connection communication between the second processor and the second memory;

the second processor is configured to execute one or more programs stored in the second memory to implement the steps of the recording playing method according to any one of claims 9 to 13.

18. A storage medium storing a recording processing program and/or a recording playback program, the recording processing program being executable by one or more processors to implement the steps of the recording processing method according to any one of claims 1 to 8; the sound recording playback program is executable by one or more processors to implement the steps of the sound recording playback method according to any one of claims 9 to 13.

Technical Field

The present invention relates to the multimedia field, and in particular, to a recording processing method, a recording playing method, an apparatus, a server, a terminal, and a storage medium.

Background

Recording is a very common way of retaining information in daily life and work. For example, a writer records an interviewee's words on a voice recorder and replays the recording while writing to better understand the interviewee's state of mind; a person passes information to relatives and friends through a recording, so that they learn the information and also sense the speaker's feelings; a company records its teleconferences with a client, so that even if the client raises many requirements or detailed points during the call, the company can confirm the client's expectations by playing back the recording. Recording is now a basic function of intelligent terminals, and a user can start it whenever needed. For example, during a multi-party call, a user who wants a recording to help capture the conference information can turn on the recording function directly from the call interface.

Disclosure of Invention

Embodiments of the present invention provide a recording processing method, a recording playing method, a recording processing device, a recording playing device, a server, a terminal and a storage medium, which mainly address the following technical problem: existing recording schemes cannot process the recorded audio file, so the recording plays back poorly and the user experience suffers.

To solve the foregoing technical problem, an embodiment of the present invention provides a recording processing method, including:

acquiring object characteristic information of an object;

generating a sound mixing processing rule according to the object characteristic information;

processing original audio data from a terminal according to a sound mixing processing rule to obtain optimized recording data, wherein the original audio data comprises independent call audio data of N call ends in N-end call, and N is more than or equal to 2;

and sending the optimized recording data to the terminal.
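The four steps above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the claimed implementation: the names `MixRule`, `generate_rule`, `apply_rule` and `process_recording`, the feature key `focus_call_end`, and the policy of doubling the gain of one call end are all hypothetical, and audio is modelled as plain lists of samples.

```python
from dataclasses import dataclass


@dataclass
class MixRule:
    # Hypothetical rule format: call-end id -> linear gain applied before fusion.
    gains: dict


def generate_rule(features, call_ends):
    """Build a rule from object feature information (assumed policy:
    double the volume of the call end the object cares about most)."""
    focus = features.get("focus_call_end")
    return MixRule({end: (2.0 if end == focus else 1.0) for end in call_ends})


def apply_rule(original, rule):
    """Scale each call end's independent audio by its gain, then sum the
    call ends sample by sample on a shared time axis."""
    length = max(len(samples) for samples in original.values())
    mixed = [0.0] * length
    for end, samples in original.items():
        gain = rule.gains.get(end, 1.0)
        for i, x in enumerate(samples):
            mixed[i] += gain * x
    return mixed


def process_recording(features, original):
    """Acquire features -> generate rule -> process original audio.
    Sending the result to the terminal is left to the caller."""
    rule = generate_rule(features, list(original))
    return apply_rule(original, rule), rule
```

With the same original audio, two objects with different feature information obtain different optimized recordings, which is the effect the method relies on.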

Optionally, the object feature information includes basic object feature information and/or subject object feature information, and the basic object feature information includes current physiological information and/or social information of the object; the subject object feature information is object feature information of an object under a call subject of the N-terminal call.

Optionally, the object feature information includes basic object feature information, and acquiring the basic object feature information of the object includes:

receiving a recording optimization instruction sent by a terminal;

and extracting basic object characteristic information of the object from the recording optimization instruction.
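As an illustration of extracting basic object feature information from a recording optimization instruction, the sketch below assumes the instruction is a JSON message. The message layout and the field names `basic_features`, `age`, `hearing` and `role` are hypothetical; the source does not specify an instruction format.

```python
import json

# Hypothetical whitelist: physiological fields (age, hearing) and a social field (role).
BASIC_FEATURE_KEYS = ("age", "hearing", "role")


def extract_basic_features(instruction_bytes):
    """Parse a recording optimization instruction and keep only the
    recognised basic object feature fields."""
    message = json.loads(instruction_bytes)
    basic = message.get("basic_features", {})
    return {key: basic[key] for key in BASIC_FEATURE_KEYS if key in basic}
```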

Optionally, the object feature information includes subject object feature information, and acquiring the subject object feature information of the object includes:

fusing the call audio data of each call end in the original audio data according to a time axis to generate pre-mixed audio data;

performing semantic recognition on the pre-mixed audio data;

and determining the subject object characteristic information of the object according to the semantic recognition result.
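The fusion step, in which the call audio of each call end is merged along a time axis into pre-mixed audio data, could look like the sketch below. It assumes each call end's audio is a list of samples with a known start offset on the shared time axis; the function name `premix` and this representation are illustrative assumptions. Semantic recognition would then run on the returned data and is not shown.

```python
def premix(tracks):
    """tracks: list of (start_sample, samples), one entry per call end.
    Returns the sample-wise sum of all tracks aligned on one time axis."""
    total = max(start + len(samples) for start, samples in tracks)
    mixed = [0.0] * total
    for start, samples in tracks:
        for i, x in enumerate(samples):
            mixed[start + i] += x
    return mixed
```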

Optionally, determining the subject object feature information of the object according to the semantic recognition result includes:

determining the call theme and/or call keywords of the current N-terminal call according to the semantic recognition result;

and determining, according to the call subject and/or the call keywords, information whose importance to the object reaches a preset threshold value as the subject object feature information.
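The threshold step can be illustrated as a simple filter. The sketch assumes that an importance score per keyword is already available for the object; how such scores are derived is not specified here, and the name `subject_features` and the default threshold of 0.6 are hypothetical.

```python
def subject_features(keywords, importance, threshold=0.6):
    """Keep the call keywords whose importance to the object reaches
    the preset threshold; unknown keywords count as importance 0."""
    return [k for k in keywords if importance.get(k, 0.0) >= threshold]
```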

Optionally, the optimized recording data includes at least one optimized audio clip; the sound mixing processing rule comprises an optimization processing mode for optimizing an original audio clip in original audio data to obtain an optimized audio clip; the method for processing the original audio data from the terminal according to the audio mixing processing rule to obtain the optimized recording data comprises the following steps:

and processing the original audio clip according to the optimization processing mode corresponding to the original audio clip in the sound mixing processing rule to obtain the corresponding optimized audio clip.

Optionally, the optimization processing mode includes at least one of the following:

after the playing volume of the call audio data of at least one call terminal is increased/decreased, carrying out fusion processing on the call audio data of each call terminal according to a time axis;

increasing/decreasing the play speed of the original audio data;

and determining, for each call end, overlapped audio data whose content overlaps that of other call ends on the time axis, and placing the overlapped audio data of each call end separately and in sequence on the time axis of the optimized recording data according to a semantic recognition result of the original audio data.
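The third mode, separating overlapped speech, can be sketched as follows. This is a simplification under stated assumptions: speech is represented as (call end, start, end) segments in seconds, and overlapping segments are ordered by their original start time rather than by a semantic recognition result, which this sketch does not model. The name `serialize_overlaps` is hypothetical.

```python
def serialize_overlaps(segments):
    """segments: list of (call_end, start, end) in seconds.
    Returns the segments shifted later where needed so that no two
    overlap on the time axis of the optimized recording."""
    placed = []
    cursor = 0.0  # end time of the last segment placed so far
    for call_end, start, end in sorted(segments, key=lambda s: s[1]):
        new_start = max(start, cursor)
        new_end = new_start + (end - start)  # duration is preserved
        placed.append((call_end, new_start, new_end))
        cursor = new_end
    return placed
```

Segments that did not overlap keep their original position; only a segment that would start before the previous one finishes is pushed back.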

Optionally, the recording processing method further includes: and sending the mixing processing rule to the terminal.

The embodiment of the invention also provides a recording playing method, which comprises the following steps:

acquiring object characteristic information of an object and sending the object characteristic information to a server;

receiving optimized recording data sent by the server, wherein the optimized recording data is obtained by the server generating a sound mixing processing rule according to the object characteristic information and then processing original audio data according to the sound mixing processing rule, the original audio data comprises independent call audio data of N call ends in N-end call, and N is more than or equal to 2;

and playing the optimized recording data.

Optionally, before playing the optimized recording data, the method further includes: receiving a semantic recognition result which is sent by the server and is obtained by performing semantic recognition on pre-mixed audio data, wherein the pre-mixed audio data is generated by fusing the call audio data of each call end in the original audio data according to a time axis;

when the optimized recording data is played, the method further comprises the following steps: and synchronously displaying semantic identification result content corresponding to the currently played audio data according to the time axis of the optimized recording data.
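Synchronous display of the semantic recognition result can be sketched as a lookup by playback time. The sketch assumes the recognition result arrives as a list of (start time in seconds, text) pairs sorted by start time; that format and the name `caption_at` are assumptions for illustration.

```python
import bisect


def caption_at(captions, t):
    """captions: list of (start_seconds, text) sorted by start time.
    Returns the text that is current at playback time t, or "" if
    playback has not yet reached the first entry."""
    starts = [start for start, _ in captions]
    i = bisect.bisect_right(starts, t) - 1
    return captions[i][1] if i >= 0 else ""
```

A player would call this on each position update and refresh the displayed text when the returned value changes.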

Optionally, the optimized recording data includes at least one optimized audio clip; playing the optimized recording data comprises:

displaying an audio selection control corresponding to each optimized audio clip in the optimized recording data;

receiving a selection instruction for the audio selection control through the display screen;

and playing the optimized audio clip corresponding to the audio selection control.

Optionally, before playing the optimized recording data, the method further includes: receiving a sound mixing processing rule which is sent by a server and corresponds to the optimized recording data, wherein the sound mixing processing rule comprises an optimization starting and ending time and a mode of optimizing original audio data within the optimization starting and ending time;

displaying the audio selection control corresponding to each optimized audio clip in the optimized recording data comprises:

and marking and displaying the audio selection control corresponding to each optimized audio clip on the playing time axis of the pre-mixed audio data according to the optimized starting and ending time.
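Placing the audio selection controls on the playing time axis can be sketched as mapping each optimization start and end time to a pixel range on a progress bar. The rule entry fields `start`, `end` and `mode`, the pixel-based layout, and the name `marker_positions` are hypothetical.

```python
def marker_positions(rule_entries, total_seconds, bar_width_px):
    """rule_entries: list of dicts with 'start', 'end' (seconds) and 'mode'.
    Returns one (x_px, width_px, mode) tuple per optimized audio clip."""
    markers = []
    for entry in rule_entries:
        x = round(entry["start"] * bar_width_px / total_seconds)
        width = round((entry["end"] - entry["start"]) * bar_width_px / total_seconds)
        # Keep very short clips visible and selectable.
        markers.append((x, max(width, 1), entry["mode"]))
    return markers
```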

Optionally, keywords corresponding to the optimized audio segments are displayed on the audio selection control.

An embodiment of the present invention further provides a recording processing apparatus, including:

the information acquisition module is used for acquiring object characteristic information of an object;

the rule generating module is used for generating a sound mixing processing rule according to the object characteristic information;

the optimization processing module is used for processing original audio data from the terminal according to the sound mixing processing rule to obtain optimized recording data, the original audio data comprises independent call audio data of N call ends in N-end call, and N is more than or equal to 2;

and the recording sending module is used for sending the optimized recording data to the terminal.

An embodiment of the present invention further provides a recording and playing device, which is characterized by including:

the information sending module is used for acquiring object characteristic information of the object and sending the object characteristic information to the server;

the recording receiving module is used for receiving optimized recording data sent by the server, the optimized recording data is obtained by processing original audio data according to the sound mixing processing rule after the sound mixing processing rule is generated by the server according to the object characteristic information, the original audio data comprises independent call audio data of N call ends in N-end call, and N is more than or equal to 2;

and the recording playing module is used for playing the optimized recording data.

The embodiment of the invention also provides a server, which comprises a first processor, a first memory and a first communication bus;

the first communication bus is used for realizing connection communication between the first processor and the first memory;

the first processor is used for executing one or more programs stored in the first memory to realize the steps of the sound recording processing method.

The embodiment of the invention also provides a terminal, which comprises a second processor, a second memory and a second communication bus;

the second communication bus is used for realizing connection communication between the second processor and the second memory;

the second processor is configured to execute one or more programs stored in the second memory to implement the steps of the recording playing method as described in any one of the above.

The embodiment of the invention also provides a storage medium, wherein the storage medium stores a recording processing program and/or a recording playing program, and the recording processing program can be executed by one or more processors to realize the steps of the recording processing method as described in any one of the above; the sound recording and playback program may be executable by one or more processors to implement the steps of the sound recording and playback method as in any one of the above.

The invention has the beneficial effects that:

According to the recording processing and playing methods and apparatuses, the server, the terminal and the storage medium provided by the embodiments of the present invention, when a recording audio file is generated from the independent call audio data of the N call ends in an N-end call, the server first acquires object feature information of an object, generates a sound mixing processing rule from that information, processes the original audio data (the independent call audio data of the N call ends) according to the rule to obtain optimized recording data, and sends the optimized recording data to the terminal, which plays it to the user. Because the sound mixing processing rule is generated from the object feature information, the optimized recording data depends on that information: different objects have different object feature information, which yields different sound mixing processing rules, so the optimized recording data obtained from the same original audio data is not entirely the same. The scheme can therefore produce optimized recording data with different optimization tendencies from the same original audio data, so that a user can play back and listen to the N-end call recording from different "angles" and understand the content of the N-end call more fully and clearly.
In addition, because the same original audio data can be optimized differently for different objects, users can hear different effects, which makes the recording considerably more engaging.

Additional features and corresponding advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

FIG. 1 is a flowchart illustrating a recording processing method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a three-terminal call according to a first embodiment of the present invention;

FIG. 3 is a flowchart of acquiring subject object feature information according to a first embodiment of the present invention;

FIG. 4 is a schematic diagram of audio clips and corresponding optimization processing manners on a time axis of original audio data according to a first embodiment of the present invention;

FIG. 5 is an audio waveform of three-terminal call audio data according to an embodiment of the present invention;

FIG. 6 is an audio waveform of the optimized recording data according to one embodiment of the present invention;

FIG. 7 is a flowchart of a recording playing method according to a second embodiment of the present invention;

FIG. 8 is a flowchart illustrating playing of optimized recording data by a terminal according to a second embodiment of the present invention;

FIG. 9 is a schematic diagram of a display interface of the terminal according to the second embodiment of the present invention;

FIG. 10 is a schematic structural diagram of a recording optimization system according to a third embodiment of the present invention;

FIG. 11 is an interaction diagram of a terminal and a server side in a recording processing scheme and a recording playing scheme provided in the third embodiment of the present invention;

FIG. 12 is a schematic structural diagram of a recording processing apparatus according to a fourth embodiment of the present invention;

FIG. 13 is a schematic structural diagram of a recording and playing device according to a fifth embodiment of the present invention;

FIG. 14 is a schematic hardware configuration diagram of a server according to a sixth embodiment of the present invention;

FIG. 15 is a schematic diagram of a hardware structure of a terminal according to a sixth embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
