Voice communication method, electronic device and readable medium

Document No.: 173022  Publication date: 2021-10-29

Note: This technology, "Voice communication method, electronic device and readable medium", was created by Jiang Chenyang (姜晨阳) on 2021-07-23. Abstract: The present application relates to the field of communications technologies and discloses a voice communication method, an electronic device, and a readable medium. The voice communication method includes: a terminal sends, to a network-side device, a voice coding/decoding list corresponding to the terminal's current location information, where the voice coding/decoding modes in the list are determined based on the terminal's historical voice call quality at the current location; the network-side device selects, from the list, the voice coding/decoding mode best suited to the terminal for the voice communication and sends the selected mode to the terminal. By adjusting the voice coding/decoding modes the terminal reports to the network-side device, the method adjusts the mode the network-side device screens out, so that the screened-out mode is a genuinely effective preferred mode that better matches the user's real needs, improving the user's voice call quality and experience.

1. A method of voice communication, comprising:

a terminal sends a voice coding/decoding list of the terminal to a network-side device, wherein the voice coding/decoding modes in the voice coding/decoding list are determined based on historical voice call quality of the terminal at its current location;

and the terminal receives, from the network-side device, the voice coding/decoding mode that the network-side device has selected from the voice coding/decoding list for the terminal to use in the voice communication.

2. The method according to claim 1, before the terminal sends the voice coding/decoding list of the terminal to the network-side device, further comprising:

and the terminal deletes, from an existing voice coding/decoding list, the voice coding/decoding modes, among those supported by the terminal, whose historical voice call quality at the current location does not meet the call requirement.

3. The method according to claim 1, before the terminal sends the voice coding/decoding list of the terminal to the network-side device, further comprising:

the terminal adjusts, in an existing voice coding/decoding list, the ordering of the voice coding/decoding modes, among those supported by the terminal, whose historical voice call quality at the current location does not meet the call requirement.

4. The method according to any one of claims 1 to 3, wherein the terminal is a calling end, and the voice coding/decoding list is included in an origination message sent by the calling end to the network-side device.

5. The method according to any of claims 1 to 3, wherein the terminal is a called terminal, and the voice coding/decoding list is included in a response message sent by the called terminal to the network side device.

6. The method according to any of claims 1 to 5, wherein the voice communication is carried over VoLTE traffic.

7. The method according to claim 1, further comprising initializing the voice coding/decoding list by the terminal when the terminal satisfies any one of the following conditions:

the terminal is powered off or powered on;

the signal connection between the terminal and the SIM card is interrupted;

the voice coding/decoding list of the terminal has existed for a preset duration;

the terminal has deleted all voice coding/decoding modes in the voice coding/decoding list;

the terminal has adjusted the ordering of all the voice coding/decoding modes in the voice coding/decoding list.

8. The method according to claim 7, wherein the manner of initializing the voice coding/decoding list comprises any one of the following:

the terminal restores, to the voice coding/decoding list, all the voice coding/decoding modes supported by the terminal;

the terminal restores the ordering of the voice coding/decoding modes in the voice coding/decoding list;

and the terminal deletes the voice coding/decoding list and generates a new voice coding/decoding list from all the voice coding/decoding modes supported by the terminal.

9. A method of voice communication, comprising:

a terminal sends a voice coding/decoding list of the terminal to a network-side device, wherein the voice coding/decoding modes in the voice coding/decoding list are determined based on historical voice call quality of the terminal at its current location;

and the network-side device selects, according to the voice coding/decoding list, a voice coding/decoding mode to be used by the terminal in voice communication and sends the selected voice coding/decoding mode to the terminal.

10. An electronic device, comprising:

a memory storing instructions;

a processor coupled to the memory, wherein the instructions stored in the memory, when executed by the processor, cause the electronic device to perform the voice communication method of any one of claims 1 to 9.

11. A readable medium having instructions stored therein, wherein the instructions, when executed on an electronic device, cause the electronic device to perform the voice communication method according to any one of claims 1 to 9.

Technical Field

The present application relates to the field of communications technologies, and in particular, to a voice communication method, an electronic device, and a readable medium.

Background

With the rapid development of communication technologies, more and more terminal devices are able to support Voice over LTE (VoLTE) services carried over 4G networks or Voice over NR (VoNR) services carried over 5G networks. The VoLTE service is a voice service that is based on the IP Multimedia Subsystem (IMS) and carried over a Long Term Evolution (LTE) network; the voice signal in the service is transmitted in Internet Protocol (IP) data packets.

At present, when a VoLTE service is set up between the terminal device that initiates a voice call (hereinafter, the calling end) and the terminal device that is called (hereinafter, the called end), the calling end and the called end each report the voice coding/decoding modes they support to a network-side device (e.g., a base station). The network-side device selects a voice coding/decoding mode for each end according to a preset screening method, and the calling end and the called end then use the selected modes to encode/decode the voice signals exchanged between them.
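The related-art negotiation described above can be sketched as follows. This is a minimal illustrative sketch, not the patent's implementation; the names `NETWORK_SUPPORTED` and `network_select`, and the rule of picking the first mutually supported codec, are assumptions made for illustration.

```python
# Codecs the network-side device supports, in its order of preference.
# The set and the ordering are hypothetical.
NETWORK_SUPPORTED = ["EVS", "AMR-WB", "AMR"]

def network_select(reported_codecs):
    """Pick the first codec reported by the terminal that the network
    side also supports; None means negotiation failed."""
    for codec in reported_codecs:
        if codec in NETWORK_SUPPORTED:
            return codec
    return None

# A terminal reporting EVS first is assigned EVS even if local support is poor,
# which is exactly the problem the application addresses.
assert network_select(["EVS", "AMR-WB"]) == "EVS"
```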

However, in some application scenarios, the voice coding/decoding mode screened out by the network-side device is not necessarily the mode best suited to the link between the calling end (or called end) and the network-side device. For example, the network-side device may have been deployed before the screened-out coding/decoding mode was introduced; even if the device has since been upgraded, it may not fully support that mode. In that case, when the calling end (or called end) uses the screened-out mode to encode/decode the voice signals it exchanges with the network-side device, the voice call between the calling end and the called end may be intermittent or silent, or the call may even fail, degrading voice call quality and the user experience.

Disclosure of Invention

The present application aims to solve the technical problem that a voice call between a calling end and a called end can be intermittent or silent, or even fail, which degrades voice call quality and the user experience. To that end, the application provides a voice communication method, an electronic device, and a readable medium. The voice communication method is as follows: the terminal stores a voice coding/decoding list for each piece of location information. When a user wants to hold a voice call with another user through the terminal, the terminal sends the voice coding/decoding list corresponding to its current location information to the network-side device, so that the network-side device screens out, from that list, the voice coding/decoding mode best suited to the terminal and the network-side device. The voice coding/decoding modes in the list sent by the terminal have been updated according to the quality of the user's historical voice call data at the current location. By adjusting the voice coding/decoding modes the terminal reports to the network-side device, the method reasonably adjusts the preferred mode screened out by the network-side device without changing the network device's screening method, so that the screened-out mode is a genuinely effective preferred mode that better matches the user's real needs, improving the user's voice call quality and experience.

A first aspect of the present application provides a voice communication method, including: the terminal sends a voice coding/decoding list of the terminal to the network-side device, wherein the voice coding/decoding modes in the list are determined based on the historical voice call quality of the terminal at its current location; and the terminal receives, from the network-side device, the voice coding/decoding mode that the network-side device has selected from the list for the terminal to use in the voice communication.

The terminal may be any electronic device that can hold a SIM card and has a voice call function, such as a mobile phone, a tablet computer, or a watch. The terminal may be a calling end or a called end, and the calling end and the called end may be the same or different types of electronic device. The network-side device may be a base station provided by any operator. The voice coding/decoding mode may be Adaptive Multi-Rate (AMR), Adaptive Multi-Rate Wideband (AMR-WB), Enhanced Voice Services (EVS), or another voice coding/decoding mode. The voice coding/decoding list is the set of voice coding/decoding modes supported by the terminal that corresponds to the current location information; the terminal stores one such list per piece of location information. The current location is the terminal's present location. Historical voice call data is voice call data that meets preset conditions after a voice call ends. For example, it may include the terminal's current location information and the audio data from the past week at that location; as another example, it may include the current location information and the two most recent voice calls at that location. Historical voice call quality is the call quality of that historical voice call data.

That is, in this embodiment, the terminal stores a voice coding/decoding list for each piece of location information. When a user wants to hold a voice call with another user at a given location, the terminal sends the list corresponding to its current location information to the network-side device. The network-side device screens out, from the list, the voice coding/decoding mode best suited to the terminal and itself and returns it to the terminal, which then performs voice communication with the network-side device using that mode. The list can be continuously updated according to the call quality of the historical voice call data for that location; specifically, the update may change which voice coding/decoding modes appear in the list and in what order.
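The per-location list storage described above might be sketched as follows. `CodecStore`, `DEFAULT_CODECS`, and the location key are hypothetical names, and a real terminal would persist this state in NVRAM rather than in memory.

```python
# Initial list = all codecs the terminal supports, in its preferred order.
DEFAULT_CODECS = ["EVS", "AMR-WB", "AMR"]

class CodecStore:
    """Keeps one voice coding/decoding list per location; a location
    that has no history yet falls back to the default list."""

    def __init__(self):
        self._lists = {}  # location key -> ordered codec list

    def list_for(self, location):
        # Create the per-location list lazily from the default.
        return self._lists.setdefault(location, list(DEFAULT_CODECS))

    def report(self, location):
        """The list the terminal would send to the network side
        (e.g. inside an origination or response message) here."""
        return list(self.list_for(location))

store = CodecStore()
# A never-visited location reports the full supported set.
assert store.report("cell-4217") == ["EVS", "AMR-WB", "AMR"]
```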

For example, when the screening scheme of this application is applied between the calling end and the network-side device, the calling end first queries its current location information and sends the latest voice coding/decoding list corresponding to that location to the network-side device. The network-side device screens out, according to its screening logic, the calling voice coding/decoding mode best suited to the calling end and itself from the received list and returns it to the calling end, which then performs voice communication with the network-side device using that mode. The list can be continuously updated according to the call quality of the historical voice call data for that location.

As another example, when the screening scheme of this application is applied between the network-side device and the called end, the network-side device, upon receiving the origination message sent by the calling end, sends a paging message to the called end according to the target number carried in the origination message. After receiving the paging message, the called end queries its current location information and sends the latest voice coding/decoding list corresponding to that location to the network-side device. The network-side device screens out, according to its screening logic, the called voice coding/decoding mode best suited to the called end and itself from the received list and returns it to the called end, so that the called end performs voice communication with the network-side device using that mode. The list can be continuously updated according to the call quality of the historical voice call data for that location.

In this voice communication method, the historical voice call data is used to adjust, separately for each location, the voice coding/decoding modes the terminal reports to the network-side device, which in turn adjusts the preferred mode the network-side device screens out, without changing the network device's screening method. The screened-out mode is thus a genuinely effective preferred mode for the terminal that is closer to the user's real needs, so the screening scheme improves the user's voice call quality and experience.

In another possible implementation of the first aspect, before the terminal sends its voice coding/decoding list to the network-side device, the method further includes: the terminal deletes, from the existing voice coding/decoding list, the voice coding/decoding modes, among those it supports, whose historical voice call quality at the current location does not meet the call requirement.

That is, in this embodiment, the voice coding/decoding list is updated by deleting the voice coding/decoding mode at a given priority or position in the list and keeping the modes at the other priorities or positions. In other words, among all the voice coding/decoding modes it supports, the terminal deletes those that were in use during calls with poor quality at the current location, obtains the set of modes corresponding to the current location, and thereby completes the update of the list.

For example, the preset update mode in the calling end is to delete the voice coding/decoding mode with the highest priority in the list and keep the modes with the other priorities, obtaining the updated list. Correspondingly, the screening logic of the network-side device is a data processing rule that screens the calling voice coding/decoding list by the priority of the voice coding/decoding modes.

As another example, the preset update mode in the calling end is to delete the first voice coding/decoding mode in the list and move the modes at the other positions forward in order, obtaining the updated list. Correspondingly, the screening logic of the network-side device is a data processing rule that takes the first voice coding/decoding mode in the list as the preferred mode.

This way of updating the voice coding/decoding list prevents a mode with poor call quality from being reported to the network side again when the terminal is at the current location, and thus prevents the network side from screening out that mode again, improving the user's call quality and experience.
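A minimal sketch of the deletion-based update, assuming historical quality is summarized as a per-codec score (e.g. a MOS-like value) and a threshold stands in for "meets the call requirement"; the function name, score scale, and threshold are illustrative, not taken from the patent.

```python
def update_by_deletion(codec_list, quality_by_codec, threshold=3.0):
    """Drop codecs whose historical call quality at this location fell
    below the threshold; keep the remaining codecs in their order.
    Codecs with no history are kept (treated as meeting the requirement)."""
    return [c for c in codec_list
            if quality_by_codec.get(c, threshold) >= threshold]

codecs = ["EVS", "AMR-WB", "AMR"]
quality = {"EVS": 1.8, "AMR-WB": 4.1}  # EVS performed poorly here; AMR has no history
assert update_by_deletion(codecs, quality) == ["AMR-WB", "AMR"]
```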

In another possible implementation of the first aspect, before the terminal sends its voice coding/decoding list to the network-side device, the method further includes: the terminal adjusts, in the existing voice coding/decoding list, the ordering of the voice coding/decoding modes, among those it supports, whose historical voice call quality at the current location does not meet the call requirement.

That is, in this embodiment, the voice coding/decoding modes in the list corresponding to the user's current location information may be the combination obtained after the terminal continuously adjusts the ordering of all the modes it supports according to the call quality of the historical voice call data at the current location.

For example, the preset update mode in the calling end is to move the first voice coding/decoding mode in the list to the last position in the list and move the other modes forward in order. The screening logic may be a data processing rule that screens the voice coding/decoding mode at a predetermined position in the calling voice coding/decoding list.

This way of updating the voice coding/decoding list lowers the priority of a mode with poor call quality among the modes reported to the network side when the terminal is at the current location, and thus prevents the network side from screening out that mode again.
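The reorder-based update might look like the following sketch; `demote_codec` is a hypothetical name, and the patent only specifies moving a poorly performing mode toward the end of the list while the remaining modes shift forward.

```python
def demote_codec(codec_list, bad_codec):
    """Move a codec with poor historical quality at this location to
    the end of the list; the other codecs shift forward in order.
    Returns a new list; a codec not present leaves the list unchanged."""
    if bad_codec not in codec_list:
        return list(codec_list)
    return [c for c in codec_list if c != bad_codec] + [bad_codec]

# EVS performed poorly here, so it drops from first choice to last resort.
assert demote_codec(["EVS", "AMR-WB", "AMR"], "EVS") == ["AMR-WB", "AMR", "EVS"]
```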

In another possible implementation of the first aspect, the terminal is a calling end, and the voice coding/decoding list is included in an origination message sent by the calling end to the network side device.

For example, a modem in the calling end sends the origination message to the network-side device; the SDP (Session Description Protocol) body of the origination message carries the voice coding/decoding list corresponding to the calling end's location information.

In another possible implementation of the first aspect, the terminal is a called end, and the voice coding/decoding list is included in a response message sent by the called end to the network side device.

For example, the network-side device sends a paging message to the called end; the called end queries its current location information and sends a response message to the network-side device. The SDP body of the response message carries the voice coding/decoding list corresponding to the called end's location information.
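The codec list could be carried in the SDP audio description roughly as sketched below. The port, payload type numbers, and clock rates follow common RTP conventions (e.g. AMR-WB at 16000 Hz) and are assumptions for illustration, not values taken from the patent.

```python
def build_sdp_audio(codec_list):
    """Render an SDP m=audio line plus a=rtpmap attributes for the
    given codec list, in list order (hypothetical payload numbering)."""
    payload = {"EVS": (96, 16000), "AMR-WB": (97, 16000), "AMR": (98, 8000)}
    pts = [str(payload[c][0]) for c in codec_list]
    lines = ["m=audio 49170 RTP/AVP " + " ".join(pts)]
    for c in codec_list:
        pt, rate = payload[c]
        lines.append(f"a=rtpmap:{pt} {c}/{rate}")
    return "\r\n".join(lines)

# At a location where EVS was deleted from the list, the offer omits it:
sdp = build_sdp_audio(["AMR-WB", "AMR"])
assert sdp.splitlines()[0] == "m=audio 49170 RTP/AVP 97 98"
```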

In another possible implementation of the first aspect, the voice communication is carried over VoLTE services.

For example, in the VoLTE service the network-side device always selects EVS as the voice coding/decoding mode. However, some network-side devices deployed earlier cannot support this mode because their hardware has not been upgraded, or support it poorly even after an upgrade, so voice communication between a terminal and such a network-side device in the local area may be intermittent or silent, or the call may even fail.

As a new generation of audio coding/decoding mode, EVS is preferred by many operators in audio services because it provides higher audio quality and stronger interference resistance. When the voice coding/decoding modes the terminal reports to the network-side device include EVS, the network-side device screens out EVS. However, at the initial stage of IP Multimedia Subsystem network construction, many operators that adopted the VoLTE service later have poor EVS coding/decoding support or compatibility problems, so voice calls using EVS in some areas, or within range of individual base stations, are intermittent or silent, or even fail, resulting in poor voice call quality for users in those areas and a degraded user experience. The voice communication method of this application is therefore particularly suitable for the VoLTE service: it prevents a voice coding/decoding mode with poor call quality from being reported to the network side again when the terminal is at the current location, and thus prevents the network side from screening out that mode again.

In another possible implementation of the first aspect, the method further includes initializing, by the terminal, the voice coding/decoding list when the terminal satisfies any one of the following conditions: the terminal is powered off or powered on; the signal connection between the terminal and the SIM card is interrupted; the voice coding/decoding list of the terminal has existed for a preset duration; the terminal has deleted all voice coding/decoding modes in the list; the terminal has adjusted the ordering of all the voice coding/decoding modes in the list.

That is, in this embodiment, when the terminal detects that an initialization condition is triggered, it initializes its voice coding/decoding list so that the list returns to the initialized state. The list in the initialized state is a data list containing all the voice coding/decoding modes supported by the terminal in their initial order. An initialization condition is a trigger condition that causes the terminal to initialize the current voice coding/decoding list.

The interruption of the signal between the terminal and the SIM card may be caused by hot-plugging the SIM card in the terminal device, which breaks the connection between the SIM card and the terminal.

In this voice communication method, when the terminal has screened coding/decoding modes according to the scheme above but no mode well suited to the terminal has been obtained, the deleted or reordered voice coding/decoding modes supported by the terminal are restored. This increases the number of modes the terminal reports to the network side and widens the network-side device's range of choices.

In another possible implementation of the first aspect, the manner of initializing the voice coding/decoding list includes any one of the following: the terminal restores, to the voice coding/decoding list, all the voice coding/decoding modes it supports; the terminal restores the ordering of the voice coding/decoding modes in the list; or the terminal deletes the list and generates a new one from all the voice coding/decoding modes it supports.

That is, in this embodiment, when the list is updated by deleting the modes that do not meet the call requirement, initialization includes the terminal restoring, to the list, all the voice coding/decoding modes it supports. When the list is updated by reordering the modes that do not meet the call requirement, initialization includes the terminal restoring the ordering of the modes in the list. In addition, for a list updated in either way, the list may also be initialized by deleting the original list and generating a new one from all the voice coding/decoding modes supported by the terminal.
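The initialization triggers and the "delete and regenerate" option might be sketched as follows; all names, the one-week default age, and the boolean-flag interface are illustrative assumptions, not details from the patent.

```python
ONE_WEEK_S = 7 * 24 * 3600  # hypothetical "preset duration" for the list

def should_initialize(power_cycled, sim_signal_lost, list_age_s, codec_list,
                      max_age_s=ONE_WEEK_S):
    """True when any trigger condition above holds: power off/on,
    SIM signal interruption, list older than the preset duration,
    or all codecs having been deleted from the list."""
    return (power_cycled
            or sim_signal_lost
            or list_age_s >= max_age_s
            or not codec_list)

def initialize(supported_codecs):
    """'Delete and regenerate': build a fresh list from all codecs
    supported by the terminal, in their initial order."""
    return list(supported_codecs)

# An emptied list triggers re-initialization to the full supported set.
assert should_initialize(False, False, 0, []) is True
assert initialize(["EVS", "AMR-WB", "AMR"]) == ["EVS", "AMR-WB", "AMR"]
```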

A second aspect of the present application provides a voice communication method, which specifically includes: the terminal sends a voice coding/decoding list of the terminal to the network-side device, wherein the voice coding/decoding modes in the list are determined based on the historical voice call quality of the terminal at its current location; and the network-side device selects, according to the list, a voice coding/decoding mode to be used by the terminal in the voice communication and sends the selected mode to the terminal.

A third aspect of the present application provides a voice communication method, which specifically includes: a network-side device receives a voice coding/decoding list of a terminal, wherein the voice coding/decoding modes in the list are determined based on the historical voice call quality of the terminal at its current location; and the network-side device selects, according to the list, a voice coding/decoding mode to be used by the terminal in the voice communication and sends the selected mode to the terminal.

A fourth aspect of the present application provides an electronic device, comprising: a memory storing instructions; and a processor coupled to the memory, wherein the instructions stored in the memory, when executed by the processor, cause the electronic device to perform any of the voice communication methods of the first, second, and third aspects above.

A fifth aspect of the present application provides a readable medium having instructions stored therein, wherein the instructions, when executed, cause an electronic device to perform any of the voice communication methods of the first, second, and third aspects above.

Drawings

FIG. 1(a) is an application scenario in some embodiments of the present application;

fig. 1(b) illustrates a transmission process of an analog voice signal between a calling end 100, a network-side device 200, and a called end 300 according to some embodiments of the present application;

fig. 2(a) is a negotiation scheme of voice encoding/decoding modes between the calling terminal 100, the network-side device 200 and the called terminal 300 in some related art schemes;

fig. 2(b) is a negotiation scheme of a voice encoding/decoding mode between the calling terminal 100, the network-side device 200 and the called terminal 300 in some embodiments of the present application;

fig. 3 is a schematic structural diagram of a calling end 100 and a network side device 200 according to some embodiments of the present application;

fig. 4 is an interaction diagram of the calling terminal 100 and the network side device 200 screening the voice encoding/decoding manner in some embodiments of the present application;

fig. 5 is a schematic diagram of calling location information of the calling terminal 100 according to some embodiments of the present application;

fig. 6 is a flowchart illustrating the updating of the voice codec list in the calling terminal 100 according to some embodiments of the present application;

fig. 7 is a flowchart illustrating updating of a speech codec list in the calling terminal 100 according to another embodiment of the present application;

fig. 8 is a schematic structural diagram of a network-side device 200 and a called end 300 according to some embodiments of the present application;

fig. 9 is an interaction diagram of the network side device 200 and the called terminal 300 when screening the voice encoding/decoding manner in some embodiments of the present application;

fig. 10 is an interaction diagram of the calling terminal 100, the network-side device 200 and the called terminal 300 when the voice encoding/decoding mode is selected in some embodiments of the present application;

FIG. 11 is a schematic diagram of a handset 100' according to some embodiments of the present application;

fig. 12 is a block diagram illustrating a software structure of a handset 100' according to some embodiments of the present application.

Reference numerals in the drawings: 100-calling end; 101-dial; 102-dialing application; 103-modem; 103a-NVRAM; 104-voice encoding/decoding module; 104a-receiving submodule; 104b-decoding submodule; 104c-statistics submodule; 104d-calculation submodule; 104e-encoding submodule; 200-network side device; 300-called end; 303-modem; 303a-NVRAM; 304-voice encoding/decoding module; 304a-receiving submodule; 304b-decoding submodule; 304c-statistics submodule; 304d-calculation submodule; 304e-encoding submodule.

Detailed Description

To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

It can be understood that, in the technical solution of the present application, the network side device may be an evolved Node B (eNB) in an LTE system, a base station device in a 5G network, a Transmission and Reception Point (TRP), and the like. For convenience of explanation, the following description takes an evolved Node B in the LTE system as an example of the network side device.

In order to facilitate understanding of the technical solution of the present application, a scenario in which a terminal device performs a voice call through a base station is introduced first below. Specifically, fig. 1(a) shows an example of a scenario of a voice call according to an embodiment of the present application. As shown in fig. 1(a), the calling terminal 100 performs a voice call with the called terminal 300-1 and the called terminal 300-2 through the network-side device 200.

It is understood that there is one calling terminal 100, while there may be several called terminals 300. The calling terminal 100 and the called terminal 300 may be any electronic device into which a SIM card can be inserted and which has a voice call function, such as a mobile phone, a tablet computer, or a watch. In addition, the calling terminal 100 and the called terminal 300 may be the same type of electronic device, for example, both mobile phones; they may also be different types, for example, the calling terminal 100 is a mobile phone and the called terminal 300 is a tablet computer. The network side device 200 is a base station provided by an operator. The voice encoding/decoding mode may be Adaptive Multi-Rate (AMR), Adaptive Multi-Rate Wideband (AMR-WB), Enhanced Voice Services (EVS), or another voice encoding/decoding mode.

Since the voice encoding/decoding method supported by the terminal (the calling terminal and/or the called terminal) is related to the model of the terminal and the category of the operator, a calling voice encoding/decoding method between the calling terminal 100 and the network-side device 200 and a called voice encoding/decoding method between the called terminal 300 and the network-side device 200 need to be negotiated respectively. For ease of understanding, before describing the negotiation scheme of the voice encoding/decoding scheme, a brief description will be given of the flow of analog voice signals between the calling terminal 100 and the called terminal 300.

As shown in fig. 1(b), in the calling terminal 100, the voice encoding/decoding module 104 encodes the analog voice signal according to the calling voice encoding/decoding mode to obtain audio data, and sends the audio data to the Modem (Modem)103, and the Modem103 converts the audio data into a calling voice packet, and sends the calling voice packet to the network-side device 200.

In the network-side device 200, a Modem (Modem)203 receives the calling voice packet, converts the calling voice packet into audio data, and then sends the audio data to a voice encoding/decoding module 204. The voice encoding/decoding module 204 first decodes the audio data in the calling voice encoding/decoding manner to obtain an analog voice signal, encodes the analog voice signal in the called voice encoding/decoding manner to form audio data, and sends the newly formed audio data to the Modem 203. The Modem 203 converts the audio data into a called voice packet, and transmits the called voice packet to the called terminal 300.

In the called terminal 300, a Modem (Modem)303 receives a called voice packet, converts the called voice packet into audio data, and then sends the audio data to a voice encoding/decoding module 304, and the voice encoding/decoding module 304 decodes the audio data according to a called voice encoding/decoding mode to obtain an analog voice signal.

At this point, the transmission of the analog voice signal between the calling terminal 100 and the called terminal 300 is completed once.
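The end-to-end flow above can be sketched as a small model; the encode/decode functions below are placeholders standing in for real AMR/EVS codecs, and all names are this sketch's own assumptions, not from the source.

```python
def encode(analog_signal, codec):
    # Placeholder for a voice encoding/decoding module encoding an analog
    # voice signal into audio data under the given codec.
    return {"codec": codec, "payload": analog_signal}

def decode(audio_data, codec):
    # Placeholder for decoding audio data back into an analog signal;
    # both sides must agree on the negotiated codec.
    assert audio_data["codec"] == codec, "codec mismatch"
    return audio_data["payload"]

def relay_through_network(analog_signal, calling_codec, called_codec):
    # Calling end 100: encode with the calling voice codec.
    audio = encode(analog_signal, calling_codec)
    # Network-side device 200: decode with the calling codec, then
    # re-encode with the called voice codec.
    audio = encode(decode(audio, calling_codec), called_codec)
    # Called end 300: decode with the called voice codec.
    return decode(audio, called_codec)

print(relay_through_network("hello", "EVS", "AMR-WB"))  # → hello
```

The transcoding step in the middle is exactly why the calling and called codecs can be negotiated independently of each other.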

It can be seen that the calling voice encoding/decoding mode and the called voice encoding/decoding mode are particularly important for the voice call between the calling terminal 100 and the called terminal 300. A negotiation scheme used in some related art schemes is described below with reference to fig. 2(a). The calling terminal 100 sends an origination message (denoted as an INVITE message) to the network side device 200, reporting all voice encoding/decoding modes supported by the calling terminal 100. The network side device 200 matches the called end 300 according to the target number carried in the origination message and initiates a paging message (denoted as a QUERY message) to the called end 300; at the same time, the network side device 200 selects a calling voice encoding/decoding mode from the modes reported by the calling end 100 according to predetermined screening logic. After the called end 300 returns a response message, the network side device 200 returns the calling voice encoding/decoding mode to the calling end 100. In addition, the called end 300 reports all voice encoding/decoding modes it supports to the network side device 200 through the response message. The network side device 200 selects a called voice encoding/decoding mode from these modes according to the predetermined screening logic and returns it to the called end 300.

However, in this negotiation scheme, the voice encoding/decoding modes reported by the calling terminal 100 (or the called terminal 300) to the network side device 200 are always all the modes the terminal supports. That is, the reported modes never change, and the screening logic of the network-side device 200 never changes, so the network-side device 200 always selects the same voice encoding/decoding mode (e.g., EVS) as the calling (or called) voice encoding/decoding mode. However, some network-side devices 200 built earlier cannot support that mode because their hardware has not been upgraded, or support it poorly even after a late upgrade. As a result, when a voice call between a terminal in such an area and the network-side device 200 uses EVS, the call may be interrupted, go silent, or even fail to be established.

To solve the above problem, the present application provides a voice communication method with the following screening scheme for the voice encoding/decoding mode: voice encoding/decoding lists corresponding to different location information are stored in the terminal, and when a user wants to make a voice call through the terminal at a certain location, the terminal sends the voice encoding/decoding list corresponding to its current location information to the network side device, so that the network side device screens out, from that list, the voice encoding/decoding mode most suitable for the terminal and the network side device. The voice encoding/decoding list is the set of voice encoding/decoding modes supported by the terminal that correspond to the current location information. In addition, the list can be continuously updated according to the call quality of the historical voice call data corresponding to the location information. Updating the list includes updating which voice encoding/decoding modes it contains and updating their arrangement order.

It can be understood that, in some embodiments of the present application, the speech encoding/decoding manners in the speech encoding/decoding list corresponding to the current location information of the user may be obtained by updating the speech encoding/decoding manners in the speech encoding/decoding list according to the call quality of the historical speech call data of the user at the current location, for example, deleting the speech encoding/decoding manners at the time of the speech call with poor call quality at the current location from all the speech encoding/decoding manners supported by the terminal, so as to obtain a set of the speech encoding/decoding manners corresponding to the current location. The voice coding/decoding list can prevent the voice coding/decoding mode with poor call quality from being reported to the network side again when the terminal is at the current position, and further prevent the network side from screening the voice coding/decoding mode with poor call quality again.

It can be understood that, in some other embodiments of the present application, the voice encoding/decoding manners in the voice encoding/decoding list corresponding to the current location information of the user may be a combination of the voice encoding/decoding manners obtained after the terminal continuously adjusts the ranking order of all the voice encoding/decoding manners supported by the terminal according to the call quality of the historical voice call data when the terminal is at the current location. The voice coding/decoding list can reduce the priority of the voice coding/decoding mode with poor call quality in the voice coding/decoding modes reported to the network side when the terminal is at the current position, and further avoid the network side from screening the voice coding/decoding mode with poor call quality again.
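The deletion-based update described above can be sketched as follows. The quality scores (MOS-like values), the threshold, and the function name are hypothetical, introduced only for illustration.

```python
def prune_codec_list(supported_codecs, quality_by_codec, threshold=3.0):
    # Keep codecs whose historical call quality at the current location
    # meets the threshold; codecs with no call history are kept as-is.
    kept = [c for c in supported_codecs
            if c not in quality_by_codec or quality_by_codec[c] >= threshold]
    # Never report an empty list: fall back to everything supported.
    return kept if kept else list(supported_codecs)

history = {"EVS": 1.8, "AMR-WB": 4.1}  # EVS calls were poor at this location
print(prune_codec_list(["EVS", "AMR-WB", "AMR"], history))
# → ['AMR-WB', 'AMR']
```

Pruning ensures the poorly performing mode is never reported again from this location, so the network side cannot re-select it.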

For example, as shown in fig. 2(b), in an implementation manner, when the screening scheme of the voice encoding/decoding manner in the present application is applied between the calling terminal 100 and the network-side device 200 in the above scenario, the calling terminal 100 first queries current location information of the calling terminal 100, and sends a latest voice encoding/decoding list corresponding to the current location information to the network-side device 200, and the network-side device 200 screens out, from the received voice encoding/decoding list, a calling voice encoding/decoding manner that is most suitable for the calling terminal 100 and the network-side device 200 according to the screening logic.

Correspondingly, for the called terminal 300, when the screening scheme of the voice encoding/decoding manner in the present application is applied between the network-side device 200 and the called terminal 300 in the above scenario, the network-side device 200 sends the paging message to the called terminal 300 according to the target number carried in the origination message when the network-side device 200 receives the origination message sent by the calling terminal 100. After receiving the paging message, the called end 300 queries the current location information of the called end 300, and sends the latest voice encoding/decoding list corresponding to the current location information to the network side device 200. The network-side device 200 selects the called voice encoding/decoding mode most suitable for the called terminal 300 and the network-side device 200 from the received voice encoding/decoding list according to the selection logic.

According to the voice coding/decoding mode screening scheme, the voice coding/decoding modes reported to the network side equipment by the terminal are respectively adjusted at different positions through the historical voice call data, and then the preferred voice coding/decoding modes screened by the network side equipment are reasonably adjusted, so that the voice coding/decoding modes screened by the network side equipment are the real and effective preferred modes of the terminal and are more in line with and close to the real requirements of users. Therefore, the screening scheme of the voice coding/decoding mode can improve the voice call quality of the user and improve the experience of the user.

From the above description of the screening scheme provided by the present application, it is easy to see that the negotiation process of the calling voice encoding/decoding mode between the calling terminal 100 and the network-side device 200 is similar to that of the called voice encoding/decoding mode between the called terminal 300 and the network-side device 200. For convenience of description, the technical solution of the present application is described in detail below mainly taking the negotiation between the calling terminal 100 and the network-side device 200 as an example.

Fig. 3 is a schematic structural diagram illustrating a calling end 100 and a network-side device 200 capable of implementing the technical solution of the present application in the voice call scenario illustrated in fig. 1. As shown in fig. 3, the calling terminal 100 includes a dial 101, a dialing application 102, a Modem103, a voice encoding/decoding module 104, and the like, and the network-side device 200 includes a filtering module 201 and a Modem 203 for filtering a calling voice encoding/decoding mode from a voice encoding/decoding list.

The Modem103 is configured to send an INVITE message and a calling voice packet, receive a 200OK message, and implement interconversion between the calling voice packet and audio data. In addition, the Modem103 can also receive a calling voice packet sent by the Modem 203, where the calling voice packet is a voice packet sent by the called end 300 to the calling end 100 through the network-side device 200, and audio data corresponding to the calling voice packet is used to update a voice encoding/decoding manner in the voice encoding/decoding list, and a specific scheme will be described in detail below. The 200OK message is used to indicate that the network-side device 200 has successfully received and understood the INVITE message sent by the calling terminal 100, that is, the network-side device 200 has successfully responded to the INVITE message. The Modem103 further includes a Non-Volatile Random Access Memory (NVRAM) 103 a. The NVRAM 103a stores the current location information of the calling terminal 100, and the voice encoding/decoding list 1, the voice encoding/decoding list 2, the voice encoding/decoding list 3, … …, and the voice encoding/decoding list n corresponding to different location information. The voice encoding/decoding module 104 is configured to encode the analog voice signal into audio data in a calling voice encoding/decoding manner, and to decode the audio data into the analog voice signal in the calling voice encoding/decoding manner. 
The voice encoding/decoding module 104 specifically includes a receiving submodule 104a for receiving audio data, a decoding submodule 104b for decoding the audio data according to the calling voice encoding/decoding mode, a counting submodule 104c for counting the number of voice packets and the number of frames corresponding to the decoded audio data, a calculating submodule 104d for calculating the voice packet decoding effect corresponding to the audio data, and an encoding submodule 104e for encoding the analog voice signal according to the calling voice encoding/decoding mode.

Fig. 4 shows an interaction diagram between the calling terminal 100 and the network-side device 200 when the speech coding/decoding mode screening scheme of the present application is applied to the calling terminal 100 and the network-side device 200 in the speech call scenario shown in fig. 1. The following describes in detail a negotiation process for determining a voice encoding/decoding mode between the calling terminal 100 and the network-side device 200 when the calling terminal 100 wants to perform a VoLTE voice call with the called terminal 300 according to the schematic structural diagram in fig. 3 and the interaction diagram in fig. 4. As can be seen from fig. 3 and fig. 4, the negotiation process of the voice encoding/decoding method between the calling terminal 100 and the network-side device 200 in the present application specifically includes the following steps:

step S410: the calling terminal 100 queries the current calling location information of the calling terminal 100.

In one implementation, the calling terminal 100 opens the dialing application 102 according to a touch operation of a user on the touch panel, and the dial 101 in the dialing application 102 obtains a target number input by the user and dials in response to the dialing operation of the user. Then, the dialing application 102 sends a dialing message including the destination number to the Modem 103. After receiving the dialing message, the Modem103 queries the calling location information currently corresponding to the calling terminal 100. The calling location information refers to current location information of the calling terminal 100. The calling location information may be stored in the NVRAM 103a, or may be stored in other storage media of the calling terminal 100.

In one implementation, the calling location information is an area code (denoted as Cell ID) that can characterize the area where the calling terminal 100 is currently located. The region code may be a code corresponding to a region divided from a region around the network-side device 200.

Fig. 5 is a schematic diagram illustrating the encoding of each region around the network-side device 200 according to some embodiments of the present application. As shown in fig. 5, the area around the network-side device 200 may be divided according to the distance from the network-side device 200 and the orientation with respect to it. Specifically, rays l1, l2, l3, l4, l5, l6, l7 and l8 take point O (where P_O is located) as their origin and extend radially outward, dividing the area around the network-side device 200 into 8 sector areas. Furthermore, 6 of these sector areas are each further divided by curves C1, C2 and C3 into 4 regions (1 small sector region and 3 annular regions), so that the area around the network-side device 200 is divided into 26 coding regions in total, each coding region corresponding to one region code.

The network-side device 200 transmits different radio spectrum resources w to the 26 coding regions. For example, the network-side device 200 transmits radio spectrum resource w1 to coding region I, radio spectrum resource w2 to coding region II, and radio spectrum resource w3 to coding region III. A radio spectrum resource refers to radio electromagnetic waves below 3000 GHz that propagate in space without an artificial waveguide; different radio spectrum resources refer to radio electromagnetic waves of different bands. The calling terminal 100 determines the coding region in which it is currently located according to the radio electromagnetic wave it receives. Any two points in the same coding region receive the same radio electromagnetic wave and therefore have the same region code.

As shown in fig. 5, P_O is the location of the network-side device 200 (e.g., a base station). Points P_A and P_B are both located in coding region I, and the radio electromagnetic wave received by the calling terminal 100 at either point is w1. P_A and P_B therefore have the same region code; that is, when the calling terminal 100 is at P_A or at P_B, its calling location information is the same. Point P_C is located in coding region II and point P_D in coding region III; the calling terminal 100 receives radio electromagnetic wave w2 at P_C and w3 at P_D. P_C and P_D therefore have different region codes; that is, when the calling terminal 100 is at P_C or at P_D, its calling location information is different.
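The mapping from received radio spectrum resource to region code can be sketched as a simple lookup; the table entries below are invented for illustration and are not taken from the source.

```python
# Hypothetical mapping from the received radio spectrum resource (band)
# to the region code (Cell ID) of the coding region, per fig. 5.
SPECTRUM_TO_REGION_CODE = {"w1": "I", "w2": "II", "w3": "III"}

def region_code(received_spectrum):
    # Any two points that receive the same radio electromagnetic wave
    # lie in the same coding region and share a region code.
    return SPECTRUM_TO_REGION_CODE[received_spectrum]

# P_A and P_B both receive w1, so their codes match; P_C (w2) and
# P_D (w3) receive different waves, so their codes differ.
print(region_code("w1"), region_code("w2"), region_code("w3"))  # → I II III
```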

In an alternative other implementation, the calling location information is coordinate information capable of representing the current location of the calling terminal 100.

Step S420: the calling terminal 100 sends an origination message to the network-side device 200. The origination message carries the voice encoding/decoding list corresponding to the calling location information.

In one implementation, the Modem103 in the calling end 100 sends an origination message to the network-side device 200. The SDP (Session Description Protocol) body in the origination message carries the voice encoding/decoding list corresponding to the calling terminal 100 and the calling location information. The text of the SDP body includes the media streams and the encoding/decoding set corresponding to each media stream; the encoding/decoding set is the voice encoding/decoding list. The voice encoding/decoding list may be stored in the NVRAM 103a, or in another storage medium of the calling terminal 100.
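A minimal sketch of what the SDP audio description carrying the codec list might look like. The port number, dynamic payload type numbers, and helper name are assumptions; the clock rates follow common RTP conventions (AMR at 8000 Hz, AMR-WB and EVS at 16000 Hz).

```python
def sdp_audio_block(codec_list):
    # Render an m=audio line plus one rtpmap attribute per codec, in list
    # order (list order expresses the terminal's preference).
    payload_type = {"EVS": 96, "AMR-WB": 97, "AMR": 98}  # dynamic PTs, assumed
    clock_rate = {"EVS": 16000, "AMR-WB": 16000, "AMR": 8000}
    pts = [payload_type[c] for c in codec_list]
    lines = ["m=audio 49152 RTP/AVP " + " ".join(map(str, pts))]
    for codec, pt in zip(codec_list, pts):
        lines.append(f"a=rtpmap:{pt} {codec}/{clock_rate[codec]}")
    return "\n".join(lines)

print(sdp_audio_block(["EVS", "AMR-WB", "AMR"]))
```

For the full list this prints an `m=audio` line followed by one `a=rtpmap` line per codec, which is the shape of the encoding/decoding set the network side parses.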

It can be understood that, in general, the calling terminal 100 is a mobile terminal, and when the coding region where the calling terminal 100 is located changes, that is, the calling location information of the calling terminal 100 changes, the voice encoding/decoding manner applicable to the calling terminal 100 may change, so that for each calling location information, the calling terminal 100 has a voice encoding/decoding list corresponding to the calling location information.

The voice encoding/decoding list corresponding to the calling location information includes the voice encoding/decoding modes, stored in the calling terminal 100 and supported by it, that apply when the calling terminal 100 is at that location. For convenience of description, this list is hereinafter referred to as the calling voice encoding/decoding list. The calling voice encoding/decoding list is a data list stored in the NVRAM 103a and includes at least one voice encoding/decoding mode. It is updated in real time according to historical voice call data; that is, the voice encoding/decoding modes in the list can be updated as the scene switches. For example, the calling voice encoding/decoding list includes the voice encoding/decoding modes whose historical voice call data at the current location show better call quality.

For example, the calling end 100 deletes, according to the scene switching, the voice encoding/decoding modes in the calling voice encoding/decoding list that are not applicable between the calling end 100 and the network side device 200. When the calling terminal 100 is at the first calling location information, in some scenarios the calling voice encoding/decoding list includes three voice encoding/decoding modes, namely EVS, AMR-WB and AMR; in other scenarios it includes only two, AMR-WB and AMR. These scenarios may, for example, correspond to different time periods.

For another example, the arrangement order of the voice encoding/decoding modes in the calling voice encoding/decoding list can be adjusted according to the scene switching. When the calling terminal 100 is at the second calling location information, in some scenarios the calling voice encoding/decoding list stores three voice encoding/decoding modes in the order EVS, AMR-WB, AMR; in other scenarios it stores them in the order AMR-WB, AMR, EVS. These scenarios may, for example, correspond to different time periods.

For another example, the calling terminal 100 can both delete voice encoding/decoding modes from the calling voice encoding/decoding list and adjust their arrangement order according to the scene switching; this is a combination of the two technical solutions above and is not repeated here.
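The reordering-based update can be sketched in the same spirit as the pruning case; the quality scores and names below are hypothetical.

```python
def reorder_codec_list(codec_list, quality_by_codec):
    # Sort descending by historical call quality at the current location.
    # sorted() is stable, so codecs without history keep their relative
    # order while sinking to the end.
    return sorted(codec_list,
                  key=lambda c: quality_by_codec.get(c, float("-inf")),
                  reverse=True)

history = {"EVS": 2.0, "AMR-WB": 4.2, "AMR": 3.5}
print(reorder_codec_list(["EVS", "AMR-WB", "AMR"], history))
# → ['AMR-WB', 'AMR', 'EVS']
```

Unlike deletion, reordering keeps every supported mode available as a fallback while lowering the priority of the poorly performing one.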

In one implementation, when the NVRAM 103a does not store the voice coding/decoding list corresponding to the calling location information, the voice coding/decoding list in the initial state is used as the voice coding/decoding list corresponding to the calling location information. In the initial state, the voice encoding/decoding list stores the voice encoding/decoding modes supported by the calling terminal 100.

Step S430: the network side device 200 screens out the calling voice coding/decoding mode according to the screening logic and the voice coding/decoding list carried in the received call-initiating message.

In some embodiments, the network-side device 200 obtains the filtering logic, extracts the voice coding/decoding list carried in the received origination message, further filters the voice coding/decoding mode adapted to the calling terminal 100 from the extracted voice coding/decoding list according to the filtering logic, and uses the filtered voice coding/decoding mode as the calling voice coding/decoding mode.

In one implementation, the screening logic is a data processing rule for screening the calling voice encoding/decoding list according to the priority of the voice encoding/decoding modes. The screening module 201 in the network side device 200 can screen out the supported mode with the highest current priority according to this logic: when the modes reported by the calling terminal 100 include the mode with the highest priority, that mode is used as the calling voice encoding/decoding mode; when they do not, the screening module determines whether the calling voice encoding/decoding list includes the mode with the next-highest priority, and so on.

For example, the calling voice encoding/decoding list reported by the calling terminal 100 to the network side device 200 includes three voice encoding/decoding modes, EVS, AMR-WB and AMR, whose priority in descending order is: EVS, AMR-WB, AMR. The screening logic uses EVS as the calling voice encoding/decoding mode when the list includes EVS; when it does not, the logic checks whether the list includes AMR-WB, and so on, until the network side device 200 screens out the highest-priority mode present in the current calling voice encoding/decoding list and uses it as the calling voice encoding/decoding mode.
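The priority-walk just described can be sketched as follows; the priority order models an operator policy and is an assumption of this sketch.

```python
OPERATOR_PRIORITY = ["EVS", "AMR-WB", "AMR"]  # highest priority first (assumed)

def screen_calling_codec(reported_list, priority=OPERATOR_PRIORITY):
    # Walk the operator's priority order; the first codec also present in
    # the list reported by the calling end becomes the calling voice codec.
    reported = set(reported_list)
    for codec in priority:
        if codec in reported:
            return codec
    return None  # no codec in common: negotiation cannot proceed

print(screen_calling_codec(["AMR", "AMR-WB"]))         # → AMR-WB
print(screen_calling_codec(["EVS", "AMR-WB", "AMR"]))  # → EVS
```

Because the network side only ever picks from what the terminal reports, pruning or reordering the reported list is enough to steer this selection without changing the network's screening logic.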

In some embodiments, the priority level of the voice encoding/decoding manner is mainly determined by the operation policy of the operator. For example, when the network-side device 200 is upgraded to improve the voice call quality of one of the voice encoding/decoding schemes, the operator adjusts the preference policy of the network-side device 200 to improve the priority level of the voice encoding/decoding scheme. In addition, the screening logic and the priority level of the speech encoding/decoding scheme are both stored in the network-side device 200.

In another implementation, the voice encoding/decoding modes in the calling voice encoding/decoding list are sequentially arranged according to an order, and the screening logic is a data processing rule for screening the voice encoding/decoding modes with predetermined orders in the calling voice encoding/decoding list. For example, the screening logic is to screen out the first-order speech codec mode in the calling speech codec list as the calling speech codec mode.

Step S440: the network side device 200 returns a 200OK message to the calling terminal 100. The 200OK message carries the calling voice encoding/decoding mode. For example, the network-side device 200 returns a 200OK message to the Modem103 in the calling terminal 100.

In one implementation, the network-side device 200 fills the selected calling voice encoding/decoding manner in the SDP protocol in the 200OK message, and then the network-side device 200 sends the 200OK message to the calling terminal 100. The 200OK message is a message indicating that the call request of the calling terminal 100 has been received by the called terminal 300.

Step S450: the calling terminal 100 determines that the voice encoding/decoding mode used for the voice call with the network side device 200 is the calling voice encoding/decoding mode.

In an implementation manner, the calling terminal 100 performs a voice call with the network-side device 200 according to the received calling voice encoding/decoding mode, so as to implement transmission of the analog voice signal between the calling terminal 100 and the network-side device 200. The voice packets may be carried over the standard protocols RTP (Real-time Transport Protocol) and RTCP (RTP Control Protocol) to realize transmission of analog voice signals and video signals.

Having introduced the screening scheme for the voice encoding/decoding mode during a voice call, note that the scheme relies on a voice encoding/decoding list associated with the current location information of the calling terminal 100, and that this list is continuously updated with the historical voice call data generated by the user through the calling terminal 100. To express the technical solution of the present application clearly and completely, the updating scheme of the voice encoding/decoding list is therefore described in detail below. Fig. 6 shows an updating scheme of the voice encoding/decoding list corresponding to the calling terminal 100 in the voice encoding/decoding mode screening scheme of the present application. As shown in fig. 3, fig. 4 and fig. 6, the method for updating the voice encoding/decoding list in the calling terminal 100 specifically includes the following steps:

step S610: the calling terminal 100 acquires historical voice call data.

In some embodiments, after a voice call ends, the calling terminal 100 obtains the voice call data of that call and treats the voice call data meeting a preset condition as historical voice call data. It will be appreciated that each piece of historical voice call data includes the current location information and the audio data transmitted during the voice call. For example, in fig. 3, the Modem 203 in the network-side device 200 transmits a calling voice packet to the Modem 103 in the calling terminal 100. The calling voice packet is the voice packet corresponding to the analog voice signal sent by the called terminal 300 to the calling terminal 100; that is, the historical voice call data in this scenario is the audio data corresponding to the calling voice packet in fig. 3.

In one implementation, the preset condition may be that the voice call data corresponds to the current location information of the calling terminal 100 and falls within a preset period. For example, the historical voice call data includes the current location information of the calling terminal 100 and one week of audio data corresponding to that location information.

In another implementation, the preset condition may also be the most recent preset number of pieces of voice call data corresponding to the current location information of the calling terminal 100. For example, the historical voice call data includes the current location information of the calling terminal 100 and the two most recent pieces of voice call data corresponding to that location information.

It can be understood that the term "calling terminal" merely indicates that, in a particular voice call, a certain electronic device actively initiated the call according to the user's intention. The device itself is an ordinary electronic device and does not always act as the calling party; the term only reflects the user's initiative in that call. Accordingly, the historical voice call data in the calling terminal 100 is not necessarily voice call data from calls in which the calling terminal 100 acted as the calling party; it may also be voice call data from calls in which the calling terminal 100 acted as the called party, or it may include both.

Step S620: the calling terminal 100 obtains the frame error rate and the packet loss rate of each piece of historical voice call data according to each piece of historical voice call data.

In one implementation, in the voice encoding/decoding module 104 of the calling terminal 100, after the receiving submodule 104a receives the audio data, the decoding submodule 104b decodes it. When the decoding submodule 104b successfully decodes all of the audio data converted from one calling voice packet according to the calling voice encoding/decoding mode, the statistics submodule 104c increments the total number of calling voice packets by one. When the decoding submodule 104b cannot decode all of the audio data converted from one calling voice packet, the statistics submodule 104c increments both the total number of calling voice packets and the number of calling voice packets that failed to decode by one. The calculation submodule 104d then computes the packet loss rate of the voice call data from the total number of calling voice packets and the number of failed calling voice packets counted by the statistics submodule 104c. The packet loss rate is the ratio of the number of calling voice packets that failed to decode to the total number of calling voice packets.

In another implementation, when the Modem 103 does not receive a calling voice packet on time, so that the receiving submodule 104a does not receive the audio data on time, the calling voice packet is considered lost even if the Modem 103 eventually receives it (that is, even if the receiving submodule 104a eventually receives the audio data). In this case, the statistics submodule 104c increments both the total number of calling voice packets and the number of calling voice packets that failed to decode by one. The calculation submodule 104d then computes the packet loss rate of the voice call data from the total number of calling voice packets and the number of failed calling voice packets counted by the statistics submodule 104c. The packet loss rate is the ratio of the number of calling voice packets that failed to decode to the total number of calling voice packets.

In addition, when the decoding submodule 104b decodes the audio data according to the calling voice encoding/decoding mode, the statistics submodule 104c also counts the total number of data frames in the audio data and the number of frames with decoding errors. The frame error rate is the ratio of the number of frames with decoding errors to the total number of data frames.
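The packet and frame counting described in step S620 can be sketched as follows. The class and method names are hypothetical; a packet that fails to decode, or that does not arrive on time, counts as lost, matching the two implementations above:

```python
# Sketch of the per-call statistics kept by the terminal while decoding.
class CallStats:
    def __init__(self):
        self.total_packets = 0
        self.failed_packets = 0   # decode failures plus late/lost packets
        self.total_frames = 0
        self.error_frames = 0

    def on_packet(self, decoded_ok, arrived_on_time=True):
        self.total_packets += 1
        # A packet that could not be fully decoded, or that did not
        # arrive on time, is counted as lost.
        if not decoded_ok or not arrived_on_time:
            self.failed_packets += 1

    def on_frame(self, decoded_ok):
        self.total_frames += 1
        if not decoded_ok:
            self.error_frames += 1

    def packet_loss_rate(self):
        return self.failed_packets / self.total_packets if self.total_packets else 0.0

    def frame_error_rate(self):
        return self.error_frames / self.total_frames if self.total_frames else 0.0
```

For example, a call with 2 failed packets out of 10 yields a packet loss rate of 20%, which is exactly the example threshold used in the update condition below.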

Step S630: the calling terminal 100 determines whether the voice coding/decoding list meets the update condition according to the frame error rate and the packet loss rate of each piece of historical voice call data.

If so, the frame error rate and packet loss rate obtained by the calling terminal 100 from the historical voice call data indicate that the current calling voice encoding/decoding list is no longer applicable to the calling terminal 100 and the network-side device 200 at the current location; that is, the calling voice encoding/decoding list of the calling terminal 100 needs to be updated, and the process proceeds to step S640. If not, the frame error rate and packet loss rate indicate that the calling voice encoding/decoding mode screened by the network-side device 200 from the current calling voice encoding/decoding list remains applicable to voice calls between the calling terminal 100 and the network-side device 200; that is, the current calling voice encoding/decoding list does not need to be updated. The current list is retained, and the process returns to step S610.

In some embodiments of the present application, the update condition is that the packet loss rates of a preset number of pieces of historical voice call data all exceed a packet loss threshold and/or the frame error rates of a preset number of pieces of historical voice call data all exceed a frame error threshold.

In one implementation, the update condition is that packet loss rates of two consecutive pieces of historical voice call data both exceed a packet loss threshold and/or frame error rates of two consecutive pieces of historical voice call data both exceed a frame error threshold.

For example, the packet loss threshold is 20% and the frame error threshold is 20%. It can be understood that the packet loss threshold may be adjusted as required; its value is not specifically limited, and any packet loss threshold that supports the judgment of the update condition falls within the protection scope of the present application. Likewise, the frame error threshold may be adjusted as required; its value is not specifically limited, and any frame error threshold that supports the judgment of the update condition falls within the protection scope of the present application.

In another implementation, the update condition is that the packet loss rates of two accumulated (not necessarily consecutive) pieces of historical voice call data both exceed the packet loss threshold and/or the frame error rates of two accumulated pieces of historical voice call data both exceed the frame error threshold.
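The threshold-based update condition of step S630 can be sketched as follows, assuming the 20% example thresholds and the consecutive-calls policy described above; all constants are illustrative:

```python
# Illustrative thresholds from the example above (both 20%).
PACKET_LOSS_THRESHOLD = 0.20
FRAME_ERROR_THRESHOLD = 0.20

def needs_update(history, n=2):
    """history: list of (packet_loss_rate, frame_error_rate) per call,
    newest last. True when the last n calls all breach a threshold."""
    if len(history) < n:
        return False
    recent = history[-n:]
    all_loss = all(p > PACKET_LOSS_THRESHOLD for p, _ in recent)
    all_fer = all(f > FRAME_ERROR_THRESHOLD for _, f in recent)
    return all_loss or all_fer
```

The accumulated (non-consecutive) variant would filter the history for breaching calls and compare their count against n instead of inspecting only the tail.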

Step S640: the calling terminal 100 updates the voice coding/decoding list.

In one implementation, the Modem 103 in the calling terminal 100 updates the calling voice encoding/decoding list according to a preset updating manner and replaces the original calling voice encoding/decoding list with the updated one.

In one implementation, the preset updating manner in the calling terminal 100 is to delete the first voice encoding/decoding mode in the list, with the remaining modes moving forward in order. In this case, the screening logic of the network-side device 200 may be a data processing rule that screens the calling voice encoding/decoding list according to the priority of each voice encoding/decoding mode.

In another implementation, the preset updating manner in the calling terminal 100 is to move the first voice encoding/decoding mode in the list to the last position, with the other modes moving forward in order. In this case, the screening logic may be a data processing rule that selects the voice encoding/decoding mode at a predetermined position in the calling voice encoding/decoding list.
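The two preset updating manners can be sketched as follows; codec names are illustrative:

```python
def update_by_deletion(codec_list):
    """Drop the first codec; the rest move forward in order.
    Pairs with priority-based screening."""
    return codec_list[1:]

def update_by_rotation(codec_list):
    """Move the first codec to the last position.
    Pairs with position-based (first-entry) screening."""
    return codec_list[1:] + codec_list[:1]

lst = ["EVS", "AMR-WB", "AMR-NB"]
print(update_by_deletion(lst))  # ['AMR-WB', 'AMR-NB']
print(update_by_rotation(lst))  # ['AMR-WB', 'AMR-NB', 'EVS']
```

Deletion permanently removes a poorly performing codec from consideration, while rotation merely demotes it, which is why the two manners need the different initialization strategies described later.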

In some other embodiments of the present application, fig. 7 illustrates another updating method of the voice encoding/decoding list corresponding to the calling terminal 100 in the voice encoding/decoding mode screening scheme of the present application. As shown in fig. 7, this method for updating the voice encoding/decoding list in the calling terminal 100 specifically includes the following steps:

step S710: the calling terminal 100 acquires historical voice call data.

The step of acquiring the historical voice call data in step S710 is the same as step S610, and is not described herein again.

Step S720: the calling terminal 100 obtains the frame error rate and the packet loss rate of each piece of historical voice call data according to each piece of historical voice call data.

The step of obtaining the frame error rate and the packet loss rate of each piece of historical voice call data in step S720 is the same as step S620, and is not described herein again.

Step S730: the calling terminal 100 calculates a decoding score of the voice encoding/decoding list according to the frame error rate and the packet loss rate of each piece of historical voice call data.

In one implementation, when the frame error rate of a piece of historical voice call data exceeds the frame error threshold and/or its packet loss rate exceeds the packet loss threshold, the statistics submodule 104c in the calling terminal 100 records that piece of historical voice call data as unsuccessful and subtracts 1 from the decoding score corresponding to the current calling voice encoding/decoding mode.

For example, in the initial case, the decoding score corresponding to the calling voice encoding/decoding mode is 100 points; each time the decoding submodule 104b fails to decode a calling voice packet according to that mode, the decoding score is decreased by 1.

Step S740: the calling terminal 100 determines, according to the decoding score, whether the voice encoding/decoding list meets the update condition. If yes, the process proceeds to step S750; if not, the process returns to step S710.

In one implementation, the update condition is that the difference between the decoding score corresponding to the calling voice encoding/decoding mode and the decoding score corresponding to the other voice encoding/decoding modes (e.g., 100 points) is greater than a difference threshold. For example, the difference threshold is 5. It can be understood that the difference threshold may be adjusted as required; its value is not specifically limited, and any difference threshold that supports the judgment of the update condition falls within the protection scope of the present application.
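The score-based scheme of steps S730 and S740 can be sketched as follows, assuming the 100-point initial score and difference threshold of 5 from the examples above; all constants are illustrative:

```python
INITIAL_SCORE = 100   # illustrative starting score per codec
DIFF_THRESHOLD = 5    # illustrative difference threshold

def apply_call(score, packet_loss, frame_error,
               loss_th=0.20, fer_th=0.20):
    """Deduct one point when a call breaches either threshold."""
    if packet_loss > loss_th or frame_error > fer_th:
        return score - 1
    return score

def needs_update(current_score, other_score=INITIAL_SCORE):
    """Update once the current codec trails the others by more
    than the difference threshold."""
    return (other_score - current_score) > DIFF_THRESHOLD
```

Under these constants, six consecutive poor calls drop the score from 100 to 94; the gap of 6 then exceeds the threshold of 5 and triggers an update of the list.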

Step S750: the calling terminal 100 updates the voice encoding/decoding list. Step S750 is the same as step S640 and is not described herein again.

In some application scenarios, a user may want to initialize the voice encoding/decoding list in the calling terminal 100. Accordingly, in some embodiments of the present application, when the calling terminal 100 detects that an initialization condition is triggered, it restores the voice encoding/decoding list to its initialized state.

The voice encoding/decoding list in the initialized state may be a data list containing all voice encoding/decoding modes supported by the calling terminal 100. It may also be a data list that contains all supported voice encoding/decoding modes arranged in their original order. The initialization condition is a trigger condition that causes the calling terminal 100 to initialize the current calling voice encoding/decoding list.

In some implementations, the initialization condition is an active condition, for example, the user powering the calling terminal 100 on or off, or hot-plugging the SIM card in the calling terminal 100. Hot-plugging the SIM card means pulling the SIM card out of the calling terminal 100 and reinserting it, so that the connection between the SIM card and the calling terminal 100 is interrupted and then re-established.

In some implementations of the present application, for the case where the initialization condition is an active condition, when the calling terminal 100 detects that the user's voice call quality is poor, it may present an initialization prompt message to the user. The prompt message asks the user to perform the corresponding initialization trigger operation so that the calling terminal 100 meets the initialization condition.

In other implementations, the initialization condition is that the voice encoding/decoding list in the calling terminal 100 has remained in use for a preset duration. For example, the preset duration is one week; that is, the list is initialized after it has been in use for one week.

In other implementations, the initialization condition may also be that the calling terminal 100 has deleted all voice encoding/decoding modes from the list, or that only one voice encoding/decoding mode remains in the list, or that the calling terminal 100 has adjusted the arrangement order of the voice encoding/decoding modes in the list.

In some embodiments, when the updating manner of the voice encoding/decoding list is to delete voice encoding/decoding modes that do not satisfy the call requirement, the initialization manner is for the calling terminal 100 to restore all supported voice encoding/decoding modes to the list. When the updating manner is to adjust the arrangement order of modes that do not meet the call requirement, the initialization manner is for the calling terminal 100 to restore the original arrangement order of the modes in the list. In addition, for any updating manner, the list may also be initialized by deleting the original list and generating a new one from all the voice encoding/decoding modes supported by the calling terminal 100.
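The correspondence between updating manner and initialization manner can be sketched as follows; the supported-codec set and its factory order are hypothetical:

```python
# Hypothetical full set of supported codecs in their factory order.
SUPPORTED = ["EVS", "AMR-WB", "AMR-NB"]

def init_after_deletion(_current_list):
    """Update manner: deletion. Initialization restores every
    supported codec in factory order."""
    return list(SUPPORTED)

def init_after_reorder(current_list):
    """Update manner: reordering. Initialization restores the
    factory order of the codecs still in the list."""
    return sorted(current_list, key=SUPPORTED.index)

print(init_after_deletion(["AMR-WB"]))        # ['EVS', 'AMR-WB', 'AMR-NB']
print(init_after_reorder(["AMR-NB", "EVS"]))  # ['EVS', 'AMR-NB']
```

The third option in the text (discard the list and regenerate it from all supported codecs) is equivalent to `init_after_deletion` regardless of which updating manner was in use.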

Since the interaction process between the calling terminal 100 and the network-side device 200 is similar to that between the called terminal 300 and the network-side device 200, the application of the voice encoding/decoding mode screening scheme to the scenario between the network-side device 200 and the called terminal 300 is described only briefly below.

In some application scenarios, the calling terminal 100 deletes a certain voice encoding/decoding mode based on historical voice call data, but after the network-side device 200 is upgraded, the deleted mode becomes supported again; that is, using the deleted mode for voice calls between the calling terminal 100 and the network-side device 200 would now improve call quality. To update the voice encoding/decoding list reported by the calling terminal 100 in time, in some embodiments of the present application the calling terminal 100 adds the voice encoding/decoding mode back to the list according to an addition instruction. The addition instruction is an adjustment instruction, generated by the calling terminal 100 or the network-side device 200, for adding a voice encoding/decoding mode to the list when that mode is absent from the historical voice encoding/decoding list.

In one implementation, the addition instruction may be an instruction sent by the network-side device 200 to the calling terminal 100 instructing it to add the voice encoding/decoding mode. The network-side device 200 generates this instruction according to the historical voice encoding/decoding list most recently reported by the calling terminal 100.

In another implementation, the addition instruction may also be generated when the user operates the calling terminal 100. The network-side device 200 generates an addition prompt message (e.g., an operator notification message) according to the historical voice encoding/decoding list most recently reported by the calling terminal 100, and broadcasts the prompt message to the calling terminal 100.

The calling terminal 100 presents the received prompt message to the user. The prompt message includes prompt content and tabs; for example, the prompt content is "Dear user, would you like to optimize the current voice call mode?", and the tabs include "Upgrade now", "Not now" and "Do not remind me again".

The calling terminal 100 determines whether to generate an addition instruction according to the tab selected by the user and the voice encoding/decoding modes it supports. When the mode to be added is not among the modes supported by the calling terminal 100, the calling terminal 100 and the network-side device 200 cannot conduct a voice call through that mode, so no addition instruction is generated. When the mode to be added is among the supported modes, the calling terminal 100 and the network-side device 200 can conduct a voice call through that mode, and whether to generate an addition instruction is then determined by the tab selected by the user.

When the user selects "Upgrade now", the calling terminal 100 generates an addition instruction; when the user selects "Not now", it does not. When the user selects "Do not remind me again", the calling terminal 100 does not display such prompt messages for a preset duration; alternatively, it suppresses a preset number of subsequent prompts.

In some embodiments of the present application, after the network-side device 200 completes its upgrade, it sends an addition instruction or a prompt message to the calling terminal 100. In this technical solution, adding a voice encoding/decoding mode to the list and deleting a mode from the list are two mutually independent processes.

In some other embodiments of the present application, after the network-side device 200 completes its upgrade and the voice call between the calling terminal 100 and the network-side device 200 ends, the network-side device 200 sends an addition instruction or a prompt message to the calling terminal 100. In this technical solution, adding a mode to the list and deleting a mode from it may be combined into a single process; that is, voice encoding/decoding modes can be deleted and added simultaneously during one update of the list.

When the user selects the "Do not remind me again" option, the user is satisfied with the current voice call quality and does not want additional voice encoding/decoding modes. To improve the user experience and avoid repeatedly displaying prompts the user does not need, in some embodiments of the present application a timer is provided in the calling terminal 100. When the user selects this option, the timer prevents the calling terminal 100 from displaying the addition prompt message within a preset period, or suppresses a preset number of subsequent addition prompts.

Fig. 8 is a schematic structural diagram of a called terminal 300 and a network-side device 200 capable of implementing the technical solution of the present application in the voice call scenario of fig. 1. As shown in fig. 8, the called terminal 300 includes a Modem 303, a voice encoding/decoding module 304, and the like, and the network-side device 200 includes a screening module 201 for screening, from the voice encoding/decoding list, the voice encoding/decoding mode most suitable for the called terminal 300. The Modem 303 is configured to send the 200 OK message and called voice packets, receive the QUERY message, and convert between called voice packets and audio data. Furthermore, the Modem 303 can also receive a called voice packet sent by the Modem 203, where the called voice packet is a voice packet sent by the calling terminal 100 to the called terminal 300 through the network-side device 200. The Modem 303 further includes a Non-Volatile Random Access Memory (NVRAM) 303a. The NVRAM 303a stores the current location information of the called terminal 300, as well as voice encoding/decoding list 1, voice encoding/decoding list 2, voice encoding/decoding list 3, ..., and voice encoding/decoding list n corresponding to different location information. The voice encoding/decoding module 304 is used to encode the analog voice signal into audio data using the called voice encoding/decoding mode, and to decode audio data into the analog voice signal using the called voice encoding/decoding mode.
The voice encoding/decoding module 304 specifically includes: a receiving submodule 304a for receiving audio data; a decoding submodule 304b for decoding the audio data according to the called voice encoding/decoding mode; a statistics submodule 304c for counting the numbers of voice packets and frames corresponding to the decoded audio data; a calculation submodule 304d for calculating the decoding result of the voice packets corresponding to the audio data; and an encoding submodule 304e for encoding the analog voice signal according to the called voice encoding/decoding mode.

Fig. 9 shows an interaction diagram between the network-side device 200 and the called terminal 300 when the voice encoding/decoding mode screening scheme of the present application is applied to them in the voice call scenario of fig. 1. The negotiation process for determining the voice encoding/decoding mode between the network-side device 200 and the called terminal 300 when the calling terminal 100 wants to conduct a VoLTE voice call with the called terminal 300 is described briefly below with reference to the structural diagram in fig. 8 and the interaction diagram in fig. 9. As can be seen from fig. 8 and fig. 9, this negotiation process specifically includes the following steps:

step S910: the network-side device 200 sends a paging message to the called terminal 300. In one implementation, the call-initiating message sent by the calling terminal 100 to the network-side device 200 also carries the destination number corresponding to the paged called terminal 300. The network-side device 200 identifies the corresponding called terminal 300 from the destination number and then sends the paging message to it. It is understood that the destination number may be a Public Switched Telephone Network (PSTN) number or a Session Initiation Protocol (SIP) number in one-to-one correspondence with the called terminal 300. The paging message is used to initiate a signaling connection to the called terminal 300, so as to establish a call connection between the calling terminal 100 and the called terminal 300 through the network-side device 200.

Step S920: the called terminal 300 inquires the called location information currently corresponding to the called terminal 300. The called location information refers to a region code that can represent the current region of the called end 300, and step S920 is similar to step S410 and is not described herein again.

Step S930: the called terminal 300 sends a response message to the network-side device 200. Wherein, the response message carries the voice coding/decoding list corresponding to the called position information. Step S930 is similar to step S420 and will not be described herein.

Step S940: the network side device 200 screens out the called voice encoding/decoding mode according to the screening logic and the received voice encoding/decoding list. The called voice encoding/decoding mode refers to a voice encoding/decoding mode that is selected by the network side device 200 from all voice encoding/decoding modes supported by the called terminal 300 and reported by the called terminal 300 and is most suitable for the voice call of the called terminal 300, and step S940 is similar to step S430 and is not described herein again.

Step S950: the network-side device 200 returns a 200OK message to the called terminal 300. The 200OK message carries the called voice encoding/decoding mode. Step S950 is similar to step S440, and is not described herein.

Step S960: the called terminal 100 determines that the voice encoding/decoding method used for the voice call with the network side device 200 is the called voice encoding/decoding method. Step S950 is similar to step S450, and is not described herein.

Fig. 10 shows an interaction diagram of the voice encoding/decoding mode screening scheme among the calling terminal 100, the network-side device 200 and the called terminal 300 in the application scenario of fig. 1. The following describes, with reference to the structural diagrams in fig. 3 and fig. 8 and the interaction diagram in fig. 10, the procedure by which the calling terminal 100, the network-side device 200 and the called terminal 300 conduct a voice call using the VoLTE service. In some embodiments of the present application, when the calling terminal 100 wants to conduct a VoLTE call with the called terminal 300, the negotiation of the voice encoding/decoding mode among the calling terminal 100, the network-side device 200 and the called terminal 300 specifically includes the following steps:

step S1010: the calling terminal 100 queries the calling location information currently corresponding to the calling terminal 100. Step S1010 is the same as step S410, and is not described herein.

Step S1020: the calling terminal 100 sends an origination message to the network-side device 200. Wherein, the calling message carries a voice coding/decoding list corresponding to the calling position information. Step S1020 is the same as step S420, and is not described herein.

Step S1030 a: the network side device 200 screens out the calling voice encoding/decoding mode adapted to the calling terminal 100 according to the screening logic and the voice encoding/decoding list carried in the received call-initiating message. Step S1030a is the same as step S430, and is not described herein.

Step S1030b: the network-side device 200 sends a paging message to the called terminal 300. Step S1030b is the same as step S910 and is not described herein again.

Step S1040a: the network-side device 200 sends a 100 Trying message to the calling terminal 100. The 100 Trying message informs the calling terminal 100 that the network-side device 200 has received the INVITE message and is processing it, so that the calling terminal 100 stops its timer and need not resend the INVITE message to the network-side device 200.

Step S1040 b: the called terminal 300 inquires the called location information currently corresponding to the called terminal 300. Step S1040b is the same as step S920, and is not described herein.

Step S1050: the called terminal 300 sends a response message to the network-side device 200. Wherein, the response message carries the voice coding/decoding list corresponding to the called position information. Step S1050 is the same as step S930, and is not described herein.

Step S1060 a: the network side device 200 returns a 200OK message to the calling terminal 100. The 200OK message carries the calling voice encoding/decoding mode. Step S1060a is the same as step S440, and is not described herein.

Step S1060 b: the network side device 200 screens out the called voice encoding/decoding mode according to the screening logic and the received voice encoding/decoding list. Step S1060b is the same as step S940, and is not described herein.

Step S1070 a: the calling terminal 100 determines that the voice encoding/decoding mode used for the voice call with the network side device 200 is the calling voice encoding/decoding mode. Step S1070a is the same as step S450, and will not be described herein.

Step S1070 b: the network-side device 200 returns a 200OK message to the called terminal 300. The 200OK message carries the called voice encoding/decoding mode. Step S1070b is the same as step S950, and is not described herein.

Step S1080: the called terminal 300 determines that the voice encoding/decoding method used for the voice call with the network side device 200 is the called voice encoding/decoding method. Step S1080 is the same as step S960, and is not described herein.
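As an illustrative sketch only (the patent's actual screening logic is defined in steps S430 and S940), the network-side screening in steps S1030a and S1060b can be pictured as choosing the highest-priority codec that appears both in the terminal's reported voice encoding/decoding list and in the network side's own preference order. The codec names and the preference order below are assumptions for illustration, not taken from the disclosure:

```python
# Hypothetical network-side screening: pick the first codec in the
# network's assumed preference order that the terminal also reported.
NETWORK_PRIORITY = ["EVS", "AMR-WB", "AMR-NB"]  # assumed preference order

def screen_codec(terminal_list, network_priority=NETWORK_PRIORITY):
    """Return the highest-priority codec the terminal reported,
    or None if the two sides have no codec in common."""
    reported = set(terminal_list)
    for codec in network_priority:
        if codec in reported:
            return codec
    return None

# The calling leg (S1030a) and the called leg (S1060b) apply the same
# logic to each terminal's own location-specific list:
calling_codec = screen_codec(["AMR-WB", "AMR-NB"])  # -> "AMR-WB"
called_codec = screen_codec(["AMR-NB"])             # -> "AMR-NB"
```

Because each leg is screened against that terminal's own list, the calling and called ends may end up with different codecs, which matches the separate "calling" and "called" voice encoding/decoding modes in the steps above.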

In the above-mentioned interaction among the calling end 100, the network-side device 200 and the called end 300, steps S1010, S1020, S1030a, S1040a, S1060a and S1070a are interaction steps between the calling end 100 and the network-side device 200, and steps S1030b, S1040b, S1050, S1060b, S1070b and S1080 are interaction steps between the network-side device 200 and the called end 300. Apart from the ordering constraint between step S1050 and step S1060a (the network-side device 200 returns the 200 OK message carrying the calling voice encoding/decoding mode to the calling end 100 only after the called end 300 has returned the response message to the network-side device 200), the interaction steps between the calling end 100 and the network-side device 200 and the interaction steps between the network-side device 200 and the called end 300 are independent of each other and do not interfere with each other.

That is, steps S1010, S1020, S1030a, S1040a, S1060a and S1070a are performed sequentially in that order, steps S1030b, S1040b, S1050, S1060b, S1070b and S1080 are performed sequentially in that order, and step S1060a occurs after step S1050; the order between the remaining steps is not limited.
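The ordering constraints just described can be sketched as two independent per-leg chains plus one cross-leg edge (S1050 before S1060a). The following minimal sketch, with step names mirroring the text, checks whether a global interleaving of the steps respects those constraints:

```python
# The two legs are independent chains; the only cross-leg constraint is
# that S1060a (200 OK to the calling end) occurs after S1050 (response
# message from the called end).
CALLING_LEG = ["S1010", "S1020", "S1030a", "S1040a", "S1060a", "S1070a"]
CALLED_LEG = ["S1030b", "S1040b", "S1050", "S1060b", "S1070b", "S1080"]

def valid_order(schedule):
    """Check that a global interleaving respects both per-leg orders
    and the single cross-leg constraint S1050 -> S1060a."""
    pos = {step: i for i, step in enumerate(schedule)}
    chains = [CALLING_LEG, CALLED_LEG, ["S1050", "S1060a"]]
    return all(pos[a] < pos[b]
               for chain in chains
               for a, b in zip(chain, chain[1:]))
```

For example, an interleaving that delivers the 200 OK to the calling end before the called end has responded violates the S1050 -> S1060a constraint and is rejected, while any interleaving that preserves both chains and that one edge is accepted.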

In other application scenarios, for an application scenario in which the number of the calling terminal 100 is one and the number of the called terminals 300 is multiple, the establishment process of the voice call between the calling terminal 100 and each of the called terminals 300 is the same as that in the foregoing embodiment, and is not described herein again.

In some embodiments of the present application, the terminal (the calling terminal 100 and/or the called terminal 300) in the speech encoding/decoding mode screening scheme of the present application is a mobile phone 100'. Fig. 11 shows a schematic diagram of a structure of a handset 100' according to an embodiment of the application. The mobile phone 100' may include a modem 103, an NVRAM 103a, a voice coding/decoding module 104, a receiving sub-module 104a, a decoding sub-module 104b, a statistics sub-module 104c, a calculation sub-module 104d, an encoding sub-module 104e, a processor 110, an external memory interface 120, an internal memory 121, a Universal Serial Bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a Subscriber Identity Module (SIM) card interface 195. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.

It is to be understood that the illustrated structure of the embodiment of the present invention does not specifically limit the mobile phone 100'. In other embodiments of the present application, the handset 100' may include more or fewer components than shown, or some components may be combined, some components may be separated, or a different arrangement of components may be used. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

Processor 110 may include one or more processing units, such as: the processor 110 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, an audio codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.

A memory may also be provided in processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 110. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 110, thereby increasing the efficiency of the system.

In some embodiments, processor 110 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.

Internal memory 121, as a computer-readable storage medium, may include one or more tangible, non-transitory computer-readable media for storing data and/or instructions. For example, the internal memory 121 may include any suitable non-volatile memory such as flash memory and/or any suitable non-volatile storage device, such as one or more hard-disk drives (HDDs), one or more compact disc (CD) drives, and/or one or more digital versatile disc (DVD) drives. According to some embodiments of the present application, the memory 121 serving as a computer-readable storage medium stores instructions that, when executed on a computer, cause the processor 110 to execute the audio processing method according to the embodiments of the present application, which may specifically refer to the methods of the above embodiments, and will not be described herein again.

It should be understood that the connection relationship between the modules according to the embodiment of the present invention is only an exemplary illustration, and does not limit the structure of the mobile phone 100'. In other embodiments of the present application, the mobile phone 100' may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.

The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to extend the memory capability of the mobile phone 100'. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.

The internal memory 121 may be used to store computer-executable program code, which includes instructions. The internal memory 121 may include a program storage area and a data storage area. The storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required by at least one function, and the like. The data storage area can store data (such as audio data, a phone book, etc.) created during the use of the mobile phone 100'. In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, universal flash storage (UFS), and the like. The processor 110 performs various functional applications and data processing of the handset 100' by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.

The handset 100' may implement audio functions via the audio module 170, speaker 170A, receiver 170B, microphone 170C, headset interface 170D, and application processor, etc. Such as music playing, recording, etc.

The audio module 170 is used to convert digital audio information into an analog audio signal output and also to convert an analog audio input into a digital audio signal. The speech coding/decoding module 104 in the audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.

The speaker 170A, also called a "horn", is used to convert the audio electrical signal into an acoustic signal. The cellular phone 100' can listen to music through the speaker 170A or listen to a hands-free call.

The receiver 170B, also called "earpiece", is used to convert the electrical audio signal into an acoustic signal. When the handset 100' receives a call or voice message, it can receive voice by placing the receiver 170B close to the ear.

The microphone 170C, also referred to as a "microphone," is used to convert sound signals into electrical signals. When making a call or transmitting voice information, the user can input a voice signal to the microphone 170C by speaking with the mouth close to the microphone 170C. The handset 100' may be provided with at least one microphone 170C. In other embodiments, the handset 100' may be provided with two microphones 170C to achieve a noise reduction function in addition to collecting sound signals. In other embodiments, the mobile phone 100' may further include three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, perform directional recording, and so on.

The headphone interface 170D is used to connect a wired headphone. The headset interface 170D may be the USB interface 130, or may be a 3.5mm open mobile electronic device platform (OMTP) standard interface, a cellular telecommunications industry association (cellular telecommunications industry association of the USA, CTIA) standard interface.

The speech coding/decoding module 104 is used to encode and decode audio signals. The speech encoding/decoding module 104 includes a receiving sub-module 104a, a decoding sub-module 104b, a statistics sub-module 104c, a calculation sub-module 104d, and an encoding sub-module 104e. The receiving sub-module 104a is configured to receive voice packets transmitted by the network side device 200, the decoding sub-module 104b is configured to decode the voice packets received by the receiving sub-module according to the screened voice encoding/decoding mode, the statistics sub-module 104c is configured to count the number of decoded voice packets and the number of frames of audio signals in the voice packets, the calculation sub-module 104d is configured to calculate a decoding result, and the encoding sub-module 104e is configured to encode the voice information to be transmitted according to the screened voice encoding/decoding mode.
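The division of labor between the statistics sub-module 104c and the calculation sub-module 104d can be sketched as follows. The patent says only that packet and frame counts are accumulated and a "decoding result" is calculated; the frame-level success ratio used as that result here is an assumption for illustration:

```python
# Illustrative sketch of the statistics (104c) and calculation (104d)
# sub-modules: accumulate packet/frame counts, then derive a decode
# success ratio. The ratio as the "decoding result" is an assumption.
class DecodeStats:
    def __init__(self):
        self.packets = 0        # decoded voice packets (104c)
        self.frames = 0         # audio frames seen in those packets (104c)
        self.failed_frames = 0  # frames that failed to decode (assumed)

    def record(self, frames_in_packet, failed=0):
        """Statistics step: count one decoded packet and its frames."""
        self.packets += 1
        self.frames += frames_in_packet
        self.failed_frames += failed

    def success_ratio(self):
        """Calculation step: fraction of frames decoded successfully."""
        if self.frames == 0:
            return 1.0
        return 1.0 - self.failed_frames / self.frames
```

A quality metric of this kind could then feed back into the history of voice call quality that the terminal uses to build its location-specific voice encoding/decoding list.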

The bone conduction sensor 180M may acquire a vibration signal. In some embodiments, the bone conduction sensor 180M may acquire a vibration signal of the human vocal part vibrating the bone mass. The bone conduction sensor 180M may also contact the human pulse to receive the blood pressure pulsation signal. In some embodiments, the bone conduction sensor 180M may also be disposed in a headset, integrated into a bone conduction headset. The audio module 170 may decode a voice signal based on the vibration signal of the bone block vibrated by the sound part acquired by the bone conduction sensor 180M, so as to implement a voice function. The application processor can decode heart rate information based on the blood pressure beating signal acquired by the bone conduction sensor 180M, and realize a heart rate detection function.

The motor 191 may generate a vibration cue. The motor 191 may be used for incoming call vibration cues, as well as for touch vibration feedback. For example, touch operations applied to different applications (e.g., photographing, audio playing, etc.) may correspond to different vibration feedback effects. The motor 191 may also respond to different vibration feedback effects for touch operations applied to different areas of the display screen 194. Different application scenes (such as time reminding, receiving information, alarm clock, game and the like) can also correspond to different vibration feedback effects. The touch vibration feedback effect may also support customization.

The SIM card interface 195 is used to connect a SIM card. The SIM card can be attached to or detached from the mobile phone 100' by inserting it into the SIM card interface 195 or pulling it out of the SIM card interface 195. The handset 100' may support 1 or N SIM card interfaces, N being a positive integer greater than 1. The SIM card interface 195 may support a Nano SIM card, a Micro SIM card, a SIM card, etc. Multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of the multiple cards may be the same or different. The SIM card interface 195 may also be compatible with different types of SIM cards. The SIM card interface 195 may also be compatible with external memory cards. The mobile phone 100' interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the handset 100' employs an eSIM, i.e., an embedded SIM card. The eSIM card can be embedded in the mobile phone 100' and cannot be separated from the mobile phone 100'. In the present application, hot plugging of the SIM card in the mobile phone 100' is realized by releasing the connection between the SIM card and the SIM card interface 195.

Fig. 12 shows an architectural diagram of a handset 100' according to an embodiment of the application. As shown in FIG. 12, the application framework layers may include a window manager, content provider, view system, phone manager, resource manager, notification manager, and the like.

The window manager is used for managing window programs. The window manager can obtain the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.

The content provider is used to store and retrieve data and make it accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.

The view system includes visual controls such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, the display interface including the short message notification icon may include a view for displaying text and a view for displaying pictures.

The phone manager is used to provide the communication functions of the handset 100'. Such as management of call status (including on, off, etc.).

The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and the like.

The notification manager enables the application to display notification information in the status bar. It can be used to convey notification-type messages, which can disappear automatically after a short stay without user interaction. For example, the notification manager is used to notify download completion, message alerts, etc. The notification manager may also present notifications in the form of a chart or scroll-bar text in the top status bar of the system, such as notifications of background running applications, or notifications that appear on the screen in the form of a dialog window. Examples include prompting text information in the status bar, sounding a prompt tone, vibrating the electronic device, flashing an indicator light, and so on.

The Android Runtime comprises a core library and a virtual machine. The Android Runtime is responsible for scheduling and managing the Android system.

The core library comprises two parts: one part is the function libraries that the Java language needs to call, and the other part is the core libraries of Android.

The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine is used to perform functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.

The kernel layer is a layer between hardware and software. The inner core layer at least comprises a display driver, a camera driver, an audio driver and a sensor driver.

The system library may include a plurality of functional modules. For example: surface Managers (SM), Media Libraries (ML), three-dimensional graphics processing Libraries (e.g., OpenGL ES), 2D graphics engines (e.g., SGL), and the like.

The media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files, among others. The media library may support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.

Reference in the specification to "some embodiments" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one example embodiment or technology in accordance with the present disclosure. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment.

The present disclosure also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer-readable medium, including, but not limited to, any type of disk such as floppy disks, optical disks, CD-ROMs, magneto-optical disks, Read-Only Memories (ROMs), Random Access Memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, Application Specific Integrated Circuits (ASICs), or any type of media suitable for storing electronic instructions, each of which may be coupled to a computer system bus. Further, the computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.

The processes and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform one or more method steps. The structure for a variety of these systems is discussed in the description that follows. In addition, any particular programming language sufficient to implement the techniques and embodiments of the present disclosure may be used. Various programming languages may be used to implement the present disclosure as discussed herein.

Moreover, the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the disclosed subject matter. Accordingly, the present disclosure is intended to be illustrative, but not limiting, of the scope of the concepts discussed herein.
