Audio distribution method, device and storage medium

Document No.: 1628137 Publication date: 2020-01-14

Abstract: This technology, "Audio distribution method, device and storage medium", was created by 彭捷杨益 on 2019-09-02. The embodiments of the application disclose an audio distribution method, an audio distribution device and a storage medium. The method comprises: acquiring first user information and audio attributes of audio to be labeled, and second user information and processing attributes of each labeling party in a plurality of labeling parties; determining a safety value of each labeling party from a preset scoring list corresponding to the audio attribute according to the first user information and each piece of second user information; selecting, according to the safety value of each labeling party, the labeling parties with a safety value larger than a first threshold value from the plurality of labeling parties to obtain a plurality of labeling parties to be distributed; selecting a target labeling party from the plurality of labeling parties to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed; and distributing the labeling task corresponding to the audio to be labeled to the target labeling party. By adopting the application, the accuracy and the safety of audio labeling task distribution can be improved.

1. An audio distribution method, comprising:

acquiring first user information and audio attributes of audio to be labeled, and acquiring second user information and processing attributes of each labeling party in a plurality of labeling parties;

according to the first user information and each piece of second user information, determining a safety value of each labeling party from a preset scoring list corresponding to the audio attribute; the information in the preset scoring list is used for describing the corresponding relation among the first user information, the second user information and the safety value;

according to the safety value of each labeling party, selecting a labeling party with a safety value larger than a first threshold value from the plurality of labeling parties to obtain a plurality of labeling parties to be distributed;

selecting a target labeling party from the plurality of labeling parties to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed;

and distributing the labeling task corresponding to the audio to be labeled to the target labeling party.

2. The method according to claim 1, wherein the selecting a target labeling party from the plurality of labeling parties to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed comprises:

acquiring a labeling progress corresponding to each labeling party to be distributed;

determining a distribution probability of each labeling party to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed;

determining an evaluation value of each labeling party to be distributed according to the labeling progress and the distribution probability corresponding to each labeling party to be distributed, to obtain a plurality of evaluation values;

and taking the labeling party to be distributed corresponding to the maximum value among the plurality of evaluation values as the target labeling party.

3. The method according to claim 2, wherein the acquiring a labeling progress corresponding to each labeling party to be distributed comprises:

acquiring a distribution list corresponding to each labeling party to be distributed to obtain a plurality of distribution lists;

acquiring a pre-stored average labeling rate corresponding to each labeling party to be distributed to obtain a plurality of average labeling rates;

acquiring the size of the labeling data corresponding to each labeling party to be distributed according to the plurality of distribution lists to obtain a plurality of labeling data sizes;

and acquiring the labeling progress corresponding to each labeling party to be distributed according to the plurality of labeling data sizes and the plurality of average labeling rates to obtain a plurality of labeling progresses.

4. The method according to any one of claims 1 to 3, wherein the preset scoring list comprises a plurality of preset scoring dimensions, and the determining the safety value of each of the annotating parties from the preset scoring list corresponding to the audio attribute according to the first user information and each of the second user information comprises:

determining an evaluation value corresponding to each preset scoring dimension according to the first user information and the second user information;

and determining the safety value of each labeling party according to the preset weight and the evaluation value corresponding to each preset scoring dimension.

5. The method according to any one of claims 1 to 3, wherein the distributing the labeling task corresponding to the audio to be labeled to the target labeling party comprises:

separating the audio to be labeled to obtain a plurality of audio segments;

and distributing the labeling tasks corresponding to the plurality of audio segments to the target labeling party.

6. The method according to claim 5, wherein the separating the audio to be labeled to obtain a plurality of audio segments comprises:

performing speech recognition on the audio to be labeled to obtain text information;

segmenting the text information to obtain a plurality of text segments;

and separating the audio to be labeled according to the time information of each text segment to obtain a plurality of audio segments.

7. The method according to any one of claims 1 to 3, wherein after the distributing the labeling task corresponding to the audio to be labeled to the target labeling party, the method further comprises:

receiving a target labeling file sent, for the labeling task, by a labeling device corresponding to the target labeling party;

comparing the target labeling file with a reference labeling file corresponding to the audio to be labeled to obtain an identification rate;

and if the identification rate is smaller than a second threshold value, sending prompt information to the labeling device, wherein the prompt information is used for prompting the target labeling party to label the audio to be labeled again.

8. An audio distribution apparatus, comprising:

a processing unit, configured to: acquire first user information and audio attributes of audio to be labeled, and acquire second user information and processing attributes of each labeling party in a plurality of labeling parties; determine, according to the first user information and each piece of second user information, a safety value of each labeling party from a preset scoring list corresponding to the audio attribute, wherein the information in the preset scoring list is used for describing the corresponding relation among the first user information, the second user information and the safety value; select, according to the safety value of each labeling party, the labeling parties with a safety value larger than a first threshold value from the plurality of labeling parties to obtain a plurality of labeling parties to be distributed; and select a target labeling party from the plurality of labeling parties to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed;

and a communication unit, configured to distribute the labeling task corresponding to the audio to be labeled to the target labeling party.

9. An electronic device comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps of the method of any of claims 1-7.

10. A computer-readable storage medium for storing a computer program, wherein the computer program causes a computer to perform the method according to any one of claims 1-7.

Technical Field

The application relates to the technical field of computers, and mainly relates to an audio distribution method, an audio distribution device and a storage medium.

Background

In the prior art, audio tagging tasks are basically distributed based on task quantity requirements, that is, the number of tasks requiring audio tagging is counted first, and then the tasks requiring audio tagging are distributed evenly according to the number of tagging parties. However, the security levels corresponding to different audio annotation tasks are different, and the average distribution may cause inaccurate distribution of the audio annotation tasks, thereby affecting the security of the audio.

Disclosure of Invention

The embodiment of the application provides an audio distribution method, an audio distribution device and a storage medium, which can improve the accuracy and the safety of audio labeling task distribution.

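In a first aspect, an embodiment of the present application provides an audio distribution method, including:

acquiring first user information and audio attributes of audio to be labeled, and acquiring second user information and processing attributes of each labeling party in a plurality of labeling parties;

according to the first user information and each piece of second user information, determining a safety value of each labeling party from a preset scoring list corresponding to the audio attribute; the information in the preset scoring list is used for describing the corresponding relation among the first user information, the second user information and the safety value;

according to the safety value of each labeling party, selecting the labeling parties with a safety value larger than a first threshold value from the plurality of labeling parties to obtain a plurality of labeling parties to be distributed;

selecting a target labeling party from the plurality of labeling parties to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed;

and distributing the labeling task corresponding to the audio to be labeled to the target labeling party.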

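In a second aspect, an embodiment of the present application provides an audio distribution apparatus, including:

a processing unit, configured to: acquire first user information and audio attributes of audio to be labeled, and acquire second user information and processing attributes of each labeling party in a plurality of labeling parties; determine, according to the first user information and each piece of second user information, a safety value of each labeling party from a preset scoring list corresponding to the audio attribute, wherein the information in the preset scoring list is used for describing the corresponding relation among the first user information, the second user information and the safety value; select, according to the safety value of each labeling party, the labeling parties with a safety value larger than a first threshold value from the plurality of labeling parties to obtain a plurality of labeling parties to be distributed; and select a target labeling party from the plurality of labeling parties to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed;

and a communication unit, configured to distribute the labeling task corresponding to the audio to be labeled to the target labeling party.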

In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for some or all of the steps described in the first aspect.

In a fourth aspect, the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, where the computer program makes a computer perform some or all of the steps as described in the first aspect of the present application.

In a fifth aspect, embodiments of the present application provide a computer program product, where the computer program product comprises a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform some or all of the steps as described in the first aspect of embodiments of the present application. The computer program product may be a software installation package.

The embodiment of the application has the following beneficial effects:

With the audio distribution method, device and storage medium described above, the first user information and the audio attributes of the audio to be labeled, and the second user information and the processing attributes of each labeling party in a plurality of labeling parties, are acquired. Then, according to the first user information and each piece of second user information, a safety value of each labeling party is determined from a preset scoring list corresponding to the audio attribute, and the labeling parties with a safety value larger than a first threshold value are taken as labeling parties to be distributed. Then, a target labeling party is determined according to the audio attribute of the audio to be labeled and the processing attribute of each labeling party to be distributed, and the labeling task corresponding to the audio to be labeled is distributed to the target labeling party. Therefore, the accuracy and the safety of audio labeling task distribution can be improved.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Wherein:

fig. 1 is a schematic flowchart of an audio distribution method according to an embodiment of the present application;

fig. 2 is a schematic structural diagram of an audio distribution apparatus according to an embodiment of the present disclosure;

fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art without any inventive work according to the embodiments of the present application are within the scope of the present application.

The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

The following describes embodiments of the present application in detail.

Referring to fig. 1, an embodiment of the present application provides a flow chart illustrating an audio distribution method. The audio distribution method is applied to electronic devices, and the electronic devices according to embodiments of the present disclosure may include various handheld devices, wearable devices, computing devices, or other processing devices connected to a wireless modem, as well as various forms of User Equipment (UE), Mobile Station (MS), terminal equipment (terminal), and so on. For convenience of description, the above-mentioned devices are collectively referred to as electronic devices.

Specifically, as shown in fig. 1, an audio distribution method is applied to an electronic device, where:

S101: Acquire first user information and audio attributes of audio to be labeled, and acquire second user information and processing attributes of each labeling party in a plurality of labeling parties.

In this embodiment of the application, the audio to be labeled may be an audio file that is not labeled, or an audio file that is used in a training process of a labeling party and has been labeled, which is not limited herein.

The first user information of the audio to be labeled refers to user information of the entry person, that is, the person who recorded the audio to be labeled. The first user information may include the native place, region, age, occupation, gender, education background, work experience, and the like of the entry person, which is not limited herein.

The audio attributes of the audio to be labeled may include audio type, audio capacity, audio source, audio content, and the like. The audio capacity is used for describing the data size of the audio to be labeled. The audio source is used for describing the uploading information of the audio to be labeled; for example, if the audio source is a WeChat account, the audio to be labeled is audio input by the entry person in the WeChat application. The audio content may include summary information corresponding to the audio. The audio type may be classified by application type, for example: a browser, an instant messaging application, a financial management application, etc. The audio type may also be classified by language type, for example: Chinese, English, Mandarin, dialect, etc. The audio type may also be classified by input type, for example: search, voice chat, etc. Alternatively, the audio type may be classified by audio content, for example: a dialog scenario, an authentication scenario, etc., which is not limited herein.

In the embodiment of the present application, the annotation party may be a person who is registered in an audio annotation system in the electronic device and can process an audio annotation task. The second user information of the annotating party refers to the user information of the annotating party, such as the native place, the region, the age, the occupation, the sex, the education background, the work experience and the like of the annotating party, and is not limited herein.

In the embodiment of the present application, the annotating party can also be an electronic device, that is, an audio annotation task is processed based on a computer program in the electronic device. The second user information of the labeling party refers to hardware information of the labeling party, such as capacity, remaining memory size, physical address, network speed, and the like, which is not limited herein.

The processing attributes of the labeling party may include processed audio types, an average labeling rate, and the like. The processed audio types include the audio types that the labeling party has been trained on. The average labeling rate is the average rate at which the labeling party processes audio labeling tasks. Further, since the processing efficiency differs across types of audio labeling tasks, the average labeling rate may be broken down into an average labeling rate for each audio type.

S102: Determine the safety value of each labeling party from a preset scoring list corresponding to the audio attribute according to the first user information and each piece of second user information.

In the embodiment of the application, the safety value is used for describing how safe it is for the labeling party to process the audio to be labeled; the greater the safety value, the safer it is for that labeling party to process the audio to be labeled. The information in the preset scoring list is used for describing the corresponding relation among the first user information, the second user information and the safety value. The preset scoring list may describe the various kinds of information that may be encountered, or information derived from them, such as an association value between the entry person corresponding to the audio to be labeled and the labeling party.

For example, assume that the preset scoring list corresponding to the audio attribute is as shown in Table 1 below. The preset scoring list has two columns, an information type and a scoring criterion, where the scoring criterion describes the score corresponding to the region and the occupation in the first user information and the second user information. When the first user information indicates that the entry person corresponding to the audio to be labeled is in Shenzhen and is a teacher, and the second user information indicates that the labeling party is in Chongqing and is a doctor, the region score (different regions, 2) and the occupation score (unrelated occupations, 2) are summed according to Table 1 to obtain a safety value of 4.

TABLE 1

Information type	Scoring criteria
Region	Same region: 0; different regions: 2
Occupation	Same occupation: 0; related occupation: 1; unrelated occupation: 2
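The Table 1 example can be sketched as a small lookup. This is only an illustrative sketch, not the applicant's implementation: the function names, the dictionary keys and the `RELATED_OCCUPATIONS` set are assumptions introduced here, since the source does not say how related occupations are identified.

```python
# Hypothetical set of occupation pairs treated as "related".
# The source does not define which occupations count as related.
RELATED_OCCUPATIONS = {frozenset({"doctor", "nurse"})}


def region_score(entry_region, labeler_region):
    """Table 1: same region scores 0, different regions score 2."""
    return 0 if entry_region == labeler_region else 2


def occupation_score(entry_job, labeler_job):
    """Table 1: same occupation 0, related occupation 1, unrelated occupation 2."""
    if entry_job == labeler_job:
        return 0
    if frozenset({entry_job, labeler_job}) in RELATED_OCCUPATIONS:
        return 1
    return 2


def safety_value(first_user, second_user):
    """Sum the per-row scores, as in the Shenzhen/Chongqing example."""
    return (region_score(first_user["region"], second_user["region"])
            + occupation_score(first_user["occupation"], second_user["occupation"]))


entry_person = {"region": "Shenzhen", "occupation": "teacher"}
labeling_party = {"region": "Chongqing", "occupation": "doctor"}
print(safety_value(entry_person, labeling_party))  # 4
```

A higher score here means the entry person and the labeling party have less in common, which matches the statement that a greater safety value is safer.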

In one possible example, the preset scoring list includes a plurality of preset scoring dimensions, and the specific implementation of step S102 includes steps A1-A2, wherein:

A1: Determine an evaluation value corresponding to each preset scoring dimension according to the first user information and the second user information.

In this example, the preset scoring dimension may be each information type shared by the first user information and the second user information, and may also include associated information corresponding to each information type, for example: an association value between the entry person corresponding to the audio to be labeled and the labeling party, a distance between the entry person and the labeling party, a similarity value between the entry person and the labeling party, and the like.

A2: Determine the safety value of each labeling party according to the preset weight and the evaluation value corresponding to each preset scoring dimension.

In this example, weights corresponding to different preset scoring dimensions may be preset. For example, when the preset scoring dimension is the association value between the entry person and the labeling party, the corresponding preset weight is 0.5; when the preset scoring dimension is the distance between the entry person and the labeling party, the corresponding preset weight is 0.2; and when the preset scoring dimension is the similarity value between the entry person and the labeling party, the corresponding preset weight is 0.3.

In this example, the evaluation value corresponding to each preset scoring dimension may be weighted by its preset weight and summed to obtain the safety value of each labeling party. For example, assume that the preset scoring list corresponding to the audio attribute is as shown in Table 2 below. According to Table 2, when the association value between the entry person and the labeling party is 0.3, the corresponding evaluation value is 2; when the distance between the entry person and the labeling party is 2 kilometers, the corresponding evaluation value is 3; and when the similarity value between the entry person and the labeling party is 0.5, the corresponding evaluation value is 3. Assuming that the preset weight corresponding to the association value is 0.5, the preset weight corresponding to the distance is 0.2, and the preset weight corresponding to the similarity value is 0.3, the weighted sum over the preset scoring dimensions is 0.5 × 2 + 0.2 × 3 + 0.3 × 3 = 2.5, so the safety value is 2.5.

TABLE 2

It can be understood that, in steps A1 and A2, the evaluation value corresponding to each preset scoring dimension is determined according to the first user information and the second user information, and the safety value of each labeling party is then determined according to the preset weight corresponding to each preset scoring dimension, so that the accuracy of determining the safety value is improved.

S103: Select, according to the safety value of each labeling party, the labeling parties with a safety value larger than a first threshold value from the plurality of labeling parties to obtain a plurality of labeling parties to be distributed.

In the embodiment of the present application, the first threshold is not limited. In one possible example, the method further comprises: and determining an audio type according to the audio attribute, and taking a preset marking time length corresponding to the audio type as the first threshold value.

The audio type can be directly obtained from the audio attribute, can be determined according to the audio content and/or the audio scene, and can also be determined according to the application type and/or the input type. It can be understood that the audio attribute can embody the audio type, and the audio type of the audio to be labeled is determined according to the audio attribute, so that the accuracy of determining the audio type can be improved.

It is understood that, in this possible example, the preset tagging duration corresponding to the audio type of the audio to be tagged is used as the first threshold. Therefore, different to-be-distributed labeling parties can be selected according to the audio type, and the accuracy of selecting the to-be-distributed labeling parties is improved.
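Step S103 amounts to a strict greater-than filter whose threshold depends on the audio type. The sketch below is illustrative only: the threshold values, party names and safety values are hypothetical, since the source gives no concrete numbers for the first threshold.

```python
# Hypothetical preset thresholds per audio type; the source gives no concrete values.
FIRST_THRESHOLD_BY_AUDIO_TYPE = {"dialog": 2.0, "authentication": 3.0}


def select_parties_to_distribute(safety_values, audio_type):
    """Keep only the labeling parties whose safety value exceeds the first threshold."""
    threshold = FIRST_THRESHOLD_BY_AUDIO_TYPE[audio_type]
    return [party for party, value in safety_values.items() if value > threshold]


# Hypothetical safety values for three labeling parties.
safety_values = {"party_a": 4.0, "party_b": 2.5, "party_c": 2.0}
print(select_parties_to_distribute(safety_values, "dialog"))          # ['party_a', 'party_b']
print(select_parties_to_distribute(safety_values, "authentication"))  # ['party_a']
```

Note that a party whose safety value equals the threshold is excluded, because the method requires the value to be larger than the first threshold.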

S104: Select a target labeling party from the plurality of labeling parties to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed.

In the embodiment of the present application, the target labeling party is the labeling party to which the labeling task corresponding to the audio to be labeled is to be distributed, that is, the target labeling party processes the labeling task after receiving it. It can be understood that, because the target labeling party is selected according to the audio attribute, the safety value and the processing attribute of each labeling party, the safety and the efficiency of processing the labeling task corresponding to the audio to be labeled can be improved.

The method for selecting the target labeling party is not limited in the present application. In a possible example, the specific implementation of step S104 includes steps B1-B4, where:

B1: Acquire the labeling progress corresponding to each labeling party to be distributed.

The labeling progress is the progress of the labeling party to be distributed in completing its current audio tasks. The method for acquiring the labeling progress is not limited in the present application. In a possible example, the specific implementation of step B1 includes steps B11-B14, where:

B11: Acquire a distribution list corresponding to each labeling party to be distributed to obtain a plurality of distribution lists.

The distribution list is used for recording the audio already distributed to each labeling party to be distributed, together with the first user information and the audio attributes of each distributed audio.

B12: Acquire the pre-stored average labeling rate corresponding to each labeling party to be distributed to obtain a plurality of average labeling rates.

The average labeling rate is used for describing the labeling efficiency of each labeling party to be distributed, and can be obtained by analyzing the audio capacity and the completion time of the tasks handled by each labeling party to be distributed.

B13: Acquire the size of the labeling data corresponding to each labeling party to be distributed according to the plurality of distribution lists to obtain a plurality of labeling data sizes.

The labeling data size is used for describing the task amount of the distributed audio, and can be obtained from the capacity of each distributed audio.

B14: Acquire the labeling progress corresponding to each labeling party to be distributed according to the plurality of labeling data sizes and the plurality of average labeling rates to obtain a plurality of labeling progresses.

It can be understood that, in steps B11-B14, the allocation list and the average annotation rate of each to-be-allocated annotating party are obtained first, then the size of the annotation data corresponding to each to-be-allocated annotating party is obtained according to each allocation list, and finally the annotation progress corresponding to the to-be-allocated annotating party is obtained according to the size of the annotation data corresponding to each to-be-allocated annotating party and the average annotation rate. Therefore, the annotation progress is obtained according to the allocated annotation task and the average annotation rate of the annotation party to be allocated, and the accuracy of obtaining the annotation progress can be improved.
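Steps B11-B14 can be sketched as below. The source does not state the formula that combines the labeling data size with the average labeling rate, so this sketch assumes one plausible reading: the backlog is expressed as the estimated time needed to finish the already distributed audio. The field name `capacity_mb`, the megabyte and minute units, and the function names are all assumptions.

```python
def labeling_data_size(distribution_list):
    """B13: total capacity of the audio already distributed to one labeling party."""
    return sum(item["capacity_mb"] for item in distribution_list)


def estimated_backlog_minutes(distribution_list, avg_rate_mb_per_min):
    """B14 (assumed formula): data size divided by the average labeling rate."""
    if avg_rate_mb_per_min <= 0:
        raise ValueError("average labeling rate must be positive")
    return labeling_data_size(distribution_list) / avg_rate_mb_per_min


# B11/B12: hypothetical distribution list and pre-stored rate for one labeling party.
distribution_list = [
    {"audio_id": "a1", "capacity_mb": 30},
    {"audio_id": "a2", "capacity_mb": 10},
]
print(labeling_data_size(distribution_list))            # 40
print(estimated_backlog_minutes(distribution_list, 4))  # 10.0
```

A deployment that expresses progress as a percentage, as in the later 60% example, would need an additional normalization step that the source does not describe.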

B2: Determine the distribution probability of each labeling party to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed.

The distribution probability is used for describing the probability that each labeling party to be distributed processes the audio to be labeled. Specifically, the service type required by the audio attribute and the service capability in the processing attribute of each labeling party to be distributed can be obtained. For example, the plurality of labeling parties to be distributed include a first, a second and a third labeling party to be distributed. The audio attribute is English; the average labeling rate at which the first labeling party to be distributed processes English audio is 2 words per minute, that of the second is 5 words per minute, and that of the third is 4 words per minute. Thus, it can be determined that the distribution probability of the first labeling party to be distributed is 0.5, that of the second is 0.8, and that of the third is 0.7.

And B3, determining the evaluation value of each party to be assigned according to the corresponding annotation progress and the assignment probability of each party to be assigned, so as to obtain a plurality of evaluation values.

The evaluation value is used for describing the arrangement sequence of the audio to be labeled distributed to the labeling party to be distributed. The method for determining the evaluation value is not limited, and the weights corresponding to the labeling progress and the distribution probability can be respectively set and then weighted with the labeling progress and the distribution probability to obtain the evaluation value of each to-be-distributed labeling party. For example, suppose that the annotation progress of the annotation party to be allocated is 60%, and the allocation probability is 0.5. When the weights corresponding to the labeling progress and the distribution probability are 0.5 and 0.5 respectively, the evaluation value is 0.55.

And B4, taking the to-be-distributed annotation party corresponding to the maximum value in the plurality of annotation progress as a target annotation party.

It can be understood that, in steps B1-B4, the evaluation value of each labeling party to be allocated is determined according to its labeling progress and allocation probability, and the labeling party to be allocated with the maximum evaluation value is then taken as the target labeling party. Therefore, the labeling efficiency can be improved.
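Steps B3 and B4 taken together might look like the sketch below. The `Candidate` class and `select_target` function are hypothetical names. The allocation probabilities 0.5, 0.8 and 0.7 come from the example above, while the progress values for the second and third parties are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    progress: float      # labeling progress in [0, 1] (step B1)
    probability: float   # allocation probability in [0, 1] (step B2)


def select_target(candidates, w_progress=0.5, w_probability=0.5):
    """Return the candidate with the largest weighted evaluation value (B3, B4)."""
    def score(c):
        return w_progress * c.progress + w_probability * c.probability
    return max(candidates, key=score)


candidates = [
    Candidate("first", 0.60, 0.5),   # evaluation value 0.55
    Candidate("second", 0.40, 0.8),  # evaluation value 0.60 (progress is illustrative)
    Candidate("third", 0.30, 0.7),   # evaluation value 0.50 (progress is illustrative)
]
target = select_target(candidates)
print(target.name)
```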

S105: distributing the labeling task corresponding to the audio to be labeled to the target labeling party.

It can be understood that, in the audio allocation method shown in fig. 1, first user information and audio attributes of the audio to be labeled are obtained, and second user information and processing attributes of each of the plurality of labeling parties are obtained. Then, according to the first user information and each piece of second user information, a safety value of each labeling party is determined from a preset scoring list corresponding to the audio attribute, and each labeling party with a safety value larger than a first threshold value is taken as a labeling party to be allocated. Next, a target labeling party is determined according to the audio attribute of the audio to be labeled and the processing attribute of each labeling party to be allocated, and the labeling task corresponding to the audio to be labeled is allocated to the target labeling party. Therefore, the accuracy and the safety of allocating audio labeling tasks can be improved.
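The safety screening described in this summary could be sketched roughly as below. The layout of `SCORING_LISTS`, the identifiers such as `client_a` and `annotator_1`, and the choice to treat a missing entry as a safety value of 0.0 are all assumptions made for illustration; the text only states that the preset scoring list relates the first user information, the second user information and the safety value.

```python
# Hypothetical preset scoring lists, one per audio attribute.
# Each list maps (first user info, second user info) to a safety value.
SCORING_LISTS = {
    "english": {
        ("client_a", "annotator_1"): 0.9,
        ("client_a", "annotator_2"): 0.4,
        ("client_a", "annotator_3"): 0.7,
    },
}


def parties_to_allocate(audio_attribute, first_user, labeling_parties, first_threshold):
    """Keep only labeling parties whose safety value exceeds the first threshold."""
    scoring_list = SCORING_LISTS[audio_attribute]
    return [
        party for party in labeling_parties
        # Assumption: a pair absent from the list gets a safety value of 0.0.
        if scoring_list.get((first_user, party), 0.0) > first_threshold
    ]


eligible = parties_to_allocate(
    "english", "client_a",
    ["annotator_1", "annotator_2", "annotator_3"],
    first_threshold=0.6,
)
print(eligible)
```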

In one possible example, the specific implementation of step S105 includes step C1 and step C2, wherein:

C1, separating the audio to be labeled to obtain a plurality of audio segments.

The audio to be labeled may be separated by identifying the users in it with a voiceprint recognition method, so that each audio segment corresponds to one user. It may also be separated by audio channel, that is, by grouping the audio captured by different pickup devices; for example, two-channel audio is divided into 2 audio segments and three-channel audio into 3 audio segments. The separation method is not limited herein.
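For the channel-based separation, a minimal sketch is given below. It assumes the audio is available as a flat list of interleaved samples (sample 0 of channel 0, sample 0 of channel 1, and so on), which is a common PCM layout but is not stated in the text; `split_channels` is a hypothetical helper name.

```python
def split_channels(interleaved, num_channels):
    """Split interleaved samples into one sample list per channel."""
    if num_channels <= 0 or len(interleaved) % num_channels != 0:
        raise ValueError("sample count must be a multiple of the channel count")
    # Every num_channels-th sample, starting at offset ch, belongs to channel ch.
    return [interleaved[ch::num_channels] for ch in range(num_channels)]


stereo = [1, -1, 2, -2, 3, -3]          # toy two-channel signal
left, right = split_channels(stereo, 2)
print(left, right)
```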

In one possible example, the audio attribute comprises an audio type, and embodiments of step C1 include steps C11-C13, wherein:

C11, performing speech recognition on the audio to be labeled to obtain text information.

Speech recognition technology converts the lexical content of human speech into computer-readable input, such as keystrokes, binary codes, or character sequences.

C12, segmenting the text information to obtain a plurality of text segments.

In this example, the segmentation may be performed according to sentence completeness, that is, the text belonging to one complete sentence is placed in one text segment.

C13, separating the audio to be labeled according to the time information of each text segment to obtain a plurality of audio segments.

It can be understood that, in steps C11-C13, speech recognition is performed on the audio to be labeled to obtain text information, and the text information is then segmented to obtain a plurality of text segments, which can improve the accuracy of the text segmentation. The audio to be labeled is then separated according to the time information of each text segment to obtain a plurality of audio segments, thereby improving the accuracy of the audio segmentation.
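Steps C12 and C13 can be illustrated with the sketch below. It assumes that the speech recognition of step C11 has already produced a list of `(word, start_seconds, end_seconds)` tuples and that sentences end at `.`, `?` or `!`; both the data format and the helper names `segment_by_sentence` and `cut_audio` are assumptions, and no real speech recognition library is called.

```python
SENTENCE_END = (".", "?", "!")


def segment_by_sentence(words):
    """Group timed words into sentences; return (text, start, end) per sentence."""
    groups, current = [], []
    for word in words:
        current.append(word)
        if word[0].endswith(SENTENCE_END):   # sentence boundary reached
            groups.append(current)
            current = []
    if current:                              # trailing words without punctuation
        groups.append(current)
    return [(" ".join(w[0] for w in g), g[0][1], g[-1][2]) for g in groups]


def cut_audio(samples, sample_rate, segments):
    """Slice the sample list using the start and end time of each text segment."""
    return [samples[int(start * sample_rate):int(end * sample_rate)]
            for _, start, end in segments]


words = [("Hello", 0.0, 0.5), ("there.", 0.5, 1.0),
         ("How", 1.5, 1.75), ("are", 1.75, 2.0), ("you?", 2.0, 2.5)]
segments = segment_by_sentence(words)
samples = list(range(40))                    # 2.5 s of audio at a toy rate of 16 Hz
clips = cut_audio(samples, 16, segments)
print(segments)
```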

C2, distributing the labeling tasks corresponding to the audio clips to the target labeling party.

It can be understood that, in steps C1 and C2, the audio to be labeled is separated to obtain a plurality of audio segments, and the labeling tasks corresponding to the plurality of audio segments are then allocated to the target labeling party, so that the target labeling party can label each audio segment separately while taking the surrounding context into account, which helps improve the efficiency and accuracy of labeling.

In one possible example, after step S105, steps D1-D3 may also be performed, wherein:

D1, receiving a target annotation file sent, for the labeling task, by the labeling device corresponding to the target labeling party.

The target annotation file is a file obtained by the target labeling party labeling the audio to be labeled. The target annotation file may include, but is not limited to, a word translation, a speech rate, an emotion, a role, a gender, an identity, and the like of the audio to be labeled.

D2, comparing the target annotation file with the reference annotation file corresponding to the audio to be labeled to obtain the recognition rate.

The reference annotation file is a standard annotation file stored in advance. The recognition rate describes the accuracy of the target annotation file.

D3, if the recognition rate is smaller than a second threshold, sending prompt information to the labeling device, wherein the prompt information is used for prompting the target labeling party to label the audio to be labeled again.

The second threshold is not limited and can be set according to training.

It can be understood that, in steps D1-D3, the target annotation file sent by the target labeling party through the labeling device is received and then compared with the reference annotation file to obtain the recognition rate. The recognition rate is then compared with the second threshold, and if it is smaller than the second threshold, prompt information is sent to the labeling device to prompt the target labeling party to label the audio to be labeled again. Checking the result in this way helps improve the labeling service capability of the target labeling party.
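The check in steps D2 and D3 might be sketched as follows. The text does not say how the recognition rate is computed, so the field-by-field match ratio used here is purely an assumption, as are the dictionary representation of an annotation file and the function names.

```python
def recognition_rate(target, reference):
    """Fraction of reference fields that the target annotation reproduces exactly."""
    if not reference:
        raise ValueError("reference annotation must not be empty")
    matches = sum(1 for key, value in reference.items() if target.get(key) == value)
    return matches / len(reference)


def needs_relabel(target, reference, second_threshold):
    """True when the prompt to label the audio again should be sent (step D3)."""
    return recognition_rate(target, reference) < second_threshold


reference = {"text": "hello there", "emotion": "neutral",
             "gender": "female", "role": "customer"}
target = {"text": "hello there", "emotion": "happy",
          "gender": "female", "role": "customer"}

rate = recognition_rate(target, reference)
prompt = needs_relabel(target, reference, second_threshold=0.8)
print(rate, prompt)
```

In this toy case three of the four fields agree, so the rate falls below the illustrative threshold of 0.8 and a prompt would be sent.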

Referring to fig. 2, fig. 2 is a schematic structural diagram of an audio distribution apparatus according to an embodiment of the present disclosure, where the audio distribution apparatus is applied to an electronic device. As shown in fig. 2, the audio distribution apparatus 200 includes:

the processing unit 201 is configured to obtain first user information and audio attributes of an audio to be labeled, and obtain second user information and processing attributes of each labeling party in a plurality of labeling parties; according to the first user information and each piece of second user information, determining a safety value of each labeling party from a preset scoring list corresponding to the audio attribute; the information in the preset scoring list is used for describing the corresponding relation among the first user information, the second user information and the safety value; according to the safety value of each labeling party, selecting a labeling party with a safety value larger than a first threshold value from the plurality of labeling parties to obtain a plurality of labeling parties to be distributed; selecting a target labeling party from the plurality of labeling parties to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed;

the communication unit 202 is configured to allocate the annotation task corresponding to the audio to be annotated to the target annotating party.

It can be understood that first user information and audio attributes of the audio to be labeled are obtained, and second user information and processing attributes of each of the plurality of labeling parties are obtained. Then, according to the first user information and each piece of second user information, a safety value of each labeling party is determined from a preset scoring list corresponding to the audio attribute, and each labeling party with a safety value larger than a first threshold value is taken as a labeling party to be allocated. Next, a target labeling party is determined according to the audio attribute of the audio to be labeled and the processing attribute of each labeling party to be allocated, and the labeling task corresponding to the audio to be labeled is allocated to the target labeling party. Therefore, the accuracy and the safety of allocating audio labeling tasks can be improved.

In a possible example, in the aspect that the target annotation party is selected from the multiple annotation parties to be allocated according to the audio attribute and the processing attribute of each annotation party to be allocated, the processing unit 201 is specifically configured to obtain an annotation progress corresponding to each annotation party to be allocated, so as to obtain multiple annotation progresses; determining the distribution probability of each party to be distributed with labels according to the audio attributes and the processing attributes of each party to be distributed with labels; determining an evaluation value of each to-be-distributed labeling party according to the labeling progress and the distribution probability corresponding to each to-be-distributed labeling party to obtain a plurality of evaluation values; and taking the to-be-distributed labeling party corresponding to the maximum value in the evaluation values as a target labeling party.

In a possible example, in the aspect of obtaining the annotation progress corresponding to each to-be-allocated annotating party to obtain multiple annotation progresses, the processing unit 201 is specifically configured to obtain an allocation list corresponding to each to-be-allocated annotating party to obtain multiple allocation lists; acquiring the pre-stored average marking rate corresponding to each marking party to be distributed so as to obtain a plurality of average marking rates; acquiring the size of the marking data corresponding to each marking party to be distributed according to the distribution lists to obtain a plurality of sizes of the marking data; and acquiring the annotation progress corresponding to each annotation party to be distributed according to the sizes of the plurality of annotation data and the plurality of average annotation rates so as to obtain a plurality of annotation progresses.

In a possible example, the preset scoring list includes a plurality of preset scoring dimensions, and in terms of determining the safety value of each labeling party from the preset scoring list corresponding to the audio attribute according to the first user information and each piece of second user information, the processing unit 201 is specifically configured to determine an evaluation value corresponding to each preset scoring dimension according to the first user information and the second user information; and determine the safety value of each labeling party according to the preset weight and the evaluation value corresponding to each preset scoring dimension.
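Combining the per-dimension evaluation values into a safety value could look like the sketch below. The dimension names (`region`, `industry`, `history`), the weights and the scores are hypothetical, and a plain weighted sum is assumed because the text only says the safety value is determined from the preset weights and evaluation values.

```python
def safety_value(dimension_scores, dimension_weights):
    """Weighted sum of the evaluation values of all preset scoring dimensions."""
    return sum(dimension_weights[dim] * score
               for dim, score in dimension_scores.items())


# Hypothetical dimensions with preset weights that sum to 1.
weights = {"region": 0.5, "industry": 0.25, "history": 0.25}
# Hypothetical evaluation values for one labeling party.
scores = {"region": 1.0, "industry": 0.5, "history": 0.5}

value = safety_value(scores, weights)
print(value)
```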

In one possible example, the processing unit 201 is further configured to separate the audio to be labeled to obtain a plurality of audio segments; the communication unit 202 is specifically configured to allocate the labeling tasks corresponding to the plurality of audio segments to the target labeling party.

In one possible example, in terms of separating the audio to be labeled to obtain a plurality of audio segments, the processing unit 201 is specifically configured to perform speech recognition on the audio to be labeled to obtain text information; segmenting the text information to obtain a plurality of text segments; and separating the audio to be marked according to the time information of each text segment to obtain a plurality of audio segments.

In a possible example, after the labeling task corresponding to the audio to be labeled is allocated to the target labeling party, the communication unit 202 is further configured to receive a target annotation file sent, for the labeling task, by the labeling device corresponding to the target labeling party; the processing unit 201 is further configured to compare the target annotation file with a reference annotation file corresponding to the audio to be labeled to obtain a recognition rate; the communication unit 202 is further configured to send prompt information to the labeling device if the recognition rate is smaller than a second threshold, where the prompt information is used to prompt the target labeling party to label the audio to be labeled again.

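Referring to fig. 3, fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 3, the electronic device 300 comprises a processor 310, a memory 320, a communication interface 330, and one or more programs 340, wherein the one or more programs 340 are stored in the memory 320 and configured to be executed by the processor 310, and wherein the program 340 comprises instructions for:

acquiring first user information and audio attributes of audio to be labeled, and acquiring second user information and processing attributes of each labeling party in a plurality of labeling parties;

according to the first user information and each piece of second user information, determining a safety value of each labeling party from a preset scoring list corresponding to the audio attribute; the information in the preset scoring list is used for describing the corresponding relation among the first user information, the second user information and the safety value;

according to the safety value of each labeling party, selecting a labeling party with a safety value larger than a first threshold value from the plurality of labeling parties to obtain a plurality of labeling parties to be distributed;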

selecting a target labeling party from the plurality of labeling parties to be distributed according to the audio attribute and the processing attribute of each labeling party to be distributed;

and distributing the labeling task corresponding to the audio to be labeled to the target labeling party.

It can be understood that the safety value of each labeling party is determined from the preset scoring list corresponding to the audio attribute according to the first user information of the audio to be labeled and the second user information of each labeling party, and then the labeling party with the safety value greater than the first threshold value is used as the labeling party to be allocated. And then, determining a target labeling party according to the audio attribute of the audio to be labeled and the processing attribute of each labeling party to be allocated, and allocating a labeling task corresponding to the audio to be labeled to the target labeling party. Therefore, the accuracy and the safety of the task of distributing the audio annotation can be improved.

In one possible example, in the aspect of selecting a target annotating party from the plurality of annotating parties to be allocated according to the audio attribute and the processing attribute of each annotating party to be allocated, the program 340 is specifically configured to execute the following steps:

acquiring a corresponding labeling progress of each to-be-distributed labeling party to obtain a plurality of labeling progresses;

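determining the distribution probability of each labeling party to be distributed according to the audio attributes and the processing attributes of each labeling party to be distributed;

determining an evaluation value of each labeling party to be distributed according to the labeling progress and the distribution probability corresponding to each labeling party to be distributed to obtain a plurality of evaluation values;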

and taking the to-be-distributed labeling party corresponding to the maximum value in the evaluation values as a target labeling party.

In a possible example, in the aspect of obtaining the annotation progress corresponding to each of the to-be-allocated annotators to obtain a plurality of annotation progresses, the program 340 is specifically configured to execute the following steps:

acquiring a distribution list corresponding to each to-be-distributed labeling party to obtain a plurality of distribution lists;

acquiring the pre-stored average marking rate corresponding to each marking party to be distributed so as to obtain a plurality of average marking rates;

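acquiring the size of the labeling data corresponding to each labeling party to be distributed according to the distribution lists to obtain a plurality of labeling data sizes;

and acquiring the labeling progress corresponding to each labeling party to be distributed according to the plurality of labeling data sizes and the plurality of average labeling rates, so as to obtain a plurality of labeling progresses.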

In one possible example, the preset scoring list includes a plurality of preset scoring dimensions, and in the aspect of determining the safety value of each annotator from the preset scoring list corresponding to the audio attribute according to the first user information and each second user information, the program 340 is specifically configured to execute the following steps:

determining an evaluation value corresponding to each preset scoring dimension according to the first user information and the second user information;

and determining the safety value of each labeling party according to the preset weight and the evaluation value corresponding to each preset scoring dimension.

In one possible example, in terms of allocating the annotation task corresponding to the audio to be annotated to the target annotating party, the program 340 is specifically configured to execute the following steps:

separating the audio to be marked to obtain a plurality of audio segments;

and distributing the labeling tasks corresponding to the plurality of audio segments to the target labeling party.

In one possible example, in terms of the separating the audio to be labeled to obtain a plurality of audio segments, the program 340 is specifically configured to execute the following steps:

performing voice recognition on the audio to be marked to obtain text information;

segmenting the text information to obtain a plurality of text segments;

and separating the audio to be marked according to the time information of each text segment to obtain a plurality of audio segments.

In one possible example, after the assigning the annotation task corresponding to the audio to be annotated to the target annotator, the program 340 is further configured to execute the following steps:

receiving a target labeling file which is sent by a labeling device corresponding to the target labeling party aiming at the labeling task;

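comparing the target annotation file with a reference annotation file corresponding to the audio to be labeled to obtain a recognition rate;

and if the recognition rate is smaller than a second threshold, sending prompt information to the labeling device, wherein the prompt information is used for prompting the target labeling party to label the audio to be labeled again.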

Embodiments of the present application also provide a computer storage medium, where the computer storage medium stores a computer program for causing a computer to execute a part or all of the steps of any one of the methods as described in the method embodiments, and the computer includes an electronic device.

Embodiments of the application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods as recited in the method embodiments. The computer program product may be a software installation package and the computer comprises the electronic device.

It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art will also appreciate that the embodiments described in this specification are presently preferred and that no particular act or mode of operation is required in the present application.

In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative. The division into units is merely a logical division, and an actual implementation may use another division; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or in another form.

Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a hardware mode or a software program mode.

The integrated unit, if implemented in the form of a software program module and sold or used as a stand-alone product, may be stored in a computer-readable memory. With such an understanding, the technical solution of the present application may be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash disk, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.

Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable memory, which may include: flash disk, ROM, RAM, magnetic or optical disk, and the like.

The foregoing detailed description of the embodiments of the present application has been presented to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the method and the core concept of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.
