Dialogue system, vehicle, and method for controlling vehicle

Document No.: 1600017    Publication date: 2020-01-07

Note: This invention, "Dialogue system, vehicle, and method for controlling vehicle," was created by 金桂润, 石东熙, 申东洙, 李廷馣, 金佳熙, 金宣我, 朴贞美, and 卢熙真 on 2018-11-30. Abstract: The invention discloses a dialogue system, a vehicle, and a method for controlling the vehicle. A method for controlling a vehicle includes: acquiring utterances and speech patterns by recognizing speech when speech of a plurality of speakers is input through a speech input device; classifying the conversation content of each speaker based on the acquired utterances and speech patterns; acquiring a relationship between the speakers based on the acquired utterances; understanding the intention and context of each speaker based on the acquired relationship between the speakers and the acquired conversation content of each speaker; determining an action corresponding to the acquired relationship and to the acquired intention and context of each speaker; outputting an utterance corresponding to the determined action; generating a control instruction corresponding to the determined action; and controlling a load based on the generated control instruction.

1. A dialog system, comprising:

a speech input processor configured to:

acquire an utterance and a speech pattern by recognizing input speech when speech of a plurality of speakers is input, and

classify conversation content of each of the plurality of speakers based on the acquired utterance and speech pattern;

a contextual information processor configured to:

acquire a relationship between speakers among the plurality of speakers based on the acquired utterance, and

determine an intention and a context of each of the plurality of speakers based on the acquired relationship between the speakers of the plurality of speakers and the classified conversation content of each of the plurality of speakers;

a storage device configured to store a relationship between the speakers of the plurality of speakers and a speech pattern of each speaker of the plurality of speakers;

a dialog manager configured to determine an action corresponding to the acquired relationship and the determined intention and context of each of the plurality of speakers; and

a result processor configured to output an utterance corresponding to the determined action.

2. The dialog system of claim 1, wherein:

when the contextual information processor receives contact information from a mobile device, the contextual information processor is configured to: acquire at least one of a name, a title, or a phone number of a speaker having a conversation with the user based on the received contact information, and acquire a relationship between the speakers of the plurality of speakers based on the acquired at least one of the name, the title, or the phone number of the speaker having a conversation with the user.

3. The dialog system of claim 1, wherein:

the context information processor is configured to determine a priority between speakers of the plurality of speakers based on the acquired relationship between the speakers of the plurality of speakers,

the storage device is configured to store function control information for each of the plurality of speakers,

the dialog manager is configured to determine an action corresponding to the determined priority and the stored function control information,

the result processor is configured to generate a control instruction to control a function based on the determined priority and the stored function control information of each of the plurality of speakers, and output the generated control instruction to at least one device.

4. The dialog system of claim 1, further comprising:

a pre-utterance determiner configured to determine a pre-utterance context,

wherein the context information processor is configured to determine a priority between speakers of the plurality of speakers based on the obtained relationship between speakers of the plurality of speakers, and to identify a speaker of the plurality of speakers having a highest priority based on the determined priority;

the dialog manager is configured to determine an action corresponding to the speaker with the highest priority and the pre-utterance context.

5. A vehicle, comprising:

a plurality of loads;

a voice input device configured to receive voice;

a dialog system configured to:

acquiring an utterance and a voice pattern by recognizing the voice input via the voice input device,

classifying conversation content of each of the plurality of speakers based on the acquired utterance and speech pattern,

acquiring a relationship between speakers among the plurality of speakers based on the acquired utterance,

determining an intention and context of each of the plurality of speakers based on the obtained relationships between the speakers of the plurality of speakers and the classified conversation content of each of the plurality of speakers;

determining an action corresponding to the acquired relationship between the speakers of the plurality of speakers and the acquired intention and context of each of the plurality of speakers,

outputting an utterance corresponding to the determined action,

generating a control instruction corresponding to the determined action;

a vehicle controller configured to control at least one load of the plurality of loads based on the received control instruction when the control instruction is received.

6. The vehicle of claim 5, further comprising:

a communication device configured to communicate with a mobile device,

wherein the dialog system is configured to:

when contact information is received from the mobile device, acquiring at least one of a name, a title, or a phone number of a speaker conversing with the user based on the contact information,

acquiring a relationship between speakers among the plurality of speakers based on the acquired at least one of the name, title, or phone number of the speaker having a conversation with the user, and

storing at least one of a name, title, or phone number of each of the plurality of speakers, a relationship between the speakers of the plurality of speakers, and a speech pattern of each of the plurality of speakers.

7. The vehicle of claim 6, wherein the dialog system is configured to:

receiving function control information for controlling vehicle functions from mobile devices of the plurality of speakers,

storing the received function control information for each of the plurality of speakers in a storage device,

determining a priority between speakers of the plurality of speakers based on a relationship between speakers of the plurality of speakers,

determining an action corresponding to the determined priority and the stored function control information,

generating a control instruction to control at least one function based on the determined priority and the stored function control information,

and outputting the generated control instruction to at least one load of the plurality of loads.

8. The vehicle according to claim 7, wherein,

the dialogue system is configured to recognize a voice when the voice is input through the voice input device, and recognize a speaker by comparing a voice pattern of the recognized voice with a voice pattern stored in the storage device.

9. The vehicle according to claim 7, wherein,

the function control information includes: seat inclination angle information of each of the plurality of seats, horizontal position information of each of the front and rear rows of seats, airflow volume and direction information of an air conditioner, seat heater on/off information, seat heater temperature information, seat ventilation on/off information, and lumbar support information.

10. The vehicle of claim 9, further comprising:

an information input device other than voice,

wherein the dialogue system is configured to store a seat position of the speaker input through the information input device other than voice.

11. The vehicle according to claim 9, wherein,

the dialogue system is configured to identify an utterance position based on a time of arrival of a speech signal at the speech input device, identify a seat position corresponding to the identified utterance position, identify a speech pattern of speech arriving at the speech input device, match the speech pattern of the identified speech with the identified seat position, and store information that the speech pattern of the identified speech matches the identified seat position.

12. The vehicle of claim 11, further comprising:

a detector provided in the plurality of seats, the detector configured to detect seat inclination information of the plurality of seats and horizontal positions of the front and rear seats,

wherein the dialog system is configured to match the seat inclination information and the horizontal position with the identified seat position and to store information that the seat inclination information and the horizontal position match with the identified seat position.

13. The vehicle according to claim 5, wherein,

the dialog system is configured to determine a pre-utterance context based on a relationship between speakers of the plurality of speakers, and output an utterance corresponding to the pre-utterance context.

14. The vehicle according to claim 13, wherein,

the dialog system is configured to determine a priority between speakers of the plurality of speakers based on a relationship between speakers of the plurality of speakers, identify a speaker with a highest priority based on the determined priority, and determine an action corresponding to the speaker with the highest priority and the pre-utterance context.

15. The vehicle according to claim 5, wherein,

upon determining that the intent of the currently input utterance is a response to the previously input speech, the dialog system is configured to obtain a relationship with a speaker of the currently input utterance based on an utterance corresponding to the previously input speech.

16. The vehicle according to claim 15, wherein,

the dialogue system is configured to match the acquired speaker relationship of the currently input speech with the speech pattern of the currently input speech, and store the matched information.

17. A method for controlling a vehicle, the method comprising:

acquiring utterances and speech patterns by speech recognition when speech of a plurality of speakers is input through a speech input device;

classifying conversation content of each of the plurality of speakers based on the acquired utterance and speech pattern;

acquiring a relationship between speakers among the plurality of speakers based on the acquired utterance;

determining an intent and context for each of a plurality of speakers based on the obtained relationships;

determining an action corresponding to the obtained relationship and the obtained intention and context of each of the plurality of speakers, and outputting an utterance corresponding to the determined action;

generating a control instruction corresponding to the determined action;

controlling at least one load of the plurality of loads based on the generated control instruction.

18. The control method of claim 17, further comprising:

obtaining at least one of a name, a title, or a phone number of a speaker of the plurality of speakers based on the contact information received from the mobile device;

acquiring a relationship between speakers of the plurality of speakers based on at least one of a name, title, or phone number of the speakers of the plurality of speakers;

storing, in a storage device, at least one of a name, a title, or a phone number of each speaker and the relationship between the speakers of the plurality of speakers.

19. The control method of claim 18, further comprising:

receiving function control information for controlling a vehicle function from mobile devices of a plurality of speakers;

determining a priority between speakers of the plurality of speakers based on a relationship between speakers of the plurality of speakers;

determining an action corresponding to the determined priority and the received function control information of each speaker;

generating a control instruction to control at least one function based on the determined priority and the received function control information of each speaker;

and outputting the generated control instruction to at least one load of the plurality of loads.

20. The control method of claim 17, further comprising:

acquiring a seat position of a speaker;

storing the seat position of the speaker.

21. The control method of claim 17, further comprising:

identifying an utterance position based on a time at which a voice signal arrives at the voice input device;

identifying a seat position corresponding to the identified utterance position;

recognizing a voice pattern of a voice arriving at the voice input device;

matching the voice pattern of the recognized voice with the recognized seat position, and storing the matched information.

22. The control method of claim 21, further comprising:

determining a priority between speakers of the plurality of speakers based on a relationship between speakers of the plurality of speakers;

identifying a speaker having the highest priority based on the determined priority, and determining a pre-utterance context;

outputting, when the pre-utterance context is determined, an utterance corresponding to the speaker with the highest priority and the pre-utterance context.

23. The control method according to claim 17, wherein

obtaining a relationship between speakers of the plurality of speakers based on the obtained utterance includes:

upon determining that the intent of the currently input utterance is a response to a previously input speech, obtaining a relationship with a speaker of the currently input utterance based on an utterance corresponding to the previously input speech;

and matching the acquired speaker relationship of the currently input utterance with the speech pattern of the currently input speech, and storing the matched information.

Technical Field

The present invention relates to a dialogue system that provides information or services required by a user, a vehicle having the dialogue system, and a method for controlling the vehicle.

Background

The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.

In audio-video-navigation (AVN) devices of vehicles and most mobile devices, the small screen and small buttons provided in the device may cause inconvenience to the user when visual information is provided to the user or the user's input is received.

In particular, during driving, taking a hand off the steering wheel or looking away from the road to check visual information and operate a device may pose a serious danger to safe driving.

Accordingly, when the vehicle employs a dialogue system capable of recognizing the user's intention and providing information or services required by the user by dialogue with the user, it is possible to provide services in a more convenient and safer manner.

Disclosure of Invention

The present invention provides a dialogue system capable of generating relationships between a plurality of speakers by acquiring a voice pattern of each of the plurality of speakers, recognizing which speaker uttered a voice based on the recognized voice and the acquired voice patterns, selecting a leader of the dialogue based on the recognized speaker and the generated relationships, conducting a dialogue with the selected leader of the dialogue, and controlling at least one function; a vehicle having the dialogue system; and a method for controlling the vehicle.

Another aspect of the present invention is to provide a dialogue system capable of providing a service according to a user's real intention, or a service most desired by the user, by accurately recognizing the user's intention based on various information such as a dialogue with the user during driving of the vehicle, vehicle state information, driving environment information, and user information; a vehicle having the dialogue system; and a method for controlling the vehicle.

Additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

According to one aspect of the invention, a dialog system includes a speech input processor, a contextual information processor, a storage device, a dialog manager, and a result processor. The speech input processor is configured to acquire an utterance and a speech pattern by recognizing input speech when speech of a plurality of speakers is input, and to classify the conversation content of each of the plurality of speakers based on the acquired utterance and speech pattern. The contextual information processor is configured to acquire a relationship between speakers of the plurality of speakers based on the acquired utterance, and to determine an intention and a context of each speaker based on the acquired relationship between the speakers and the classified conversation content of each speaker. The storage device is configured to store the relationships between the speakers and a speech pattern of each speaker. The dialog manager is configured to determine an action corresponding to the acquired relationship and the determined intention and context of each speaker. The result processor is configured to output an utterance corresponding to the determined action.

When the context information processor receives the contact information from the mobile device, the context information processor may acquire at least one of a name, a title, or a phone number of a speaker having a conversation with the user based on the received contact information, and acquire a relationship between the speakers based on the acquired at least one of the name, the title, or the phone number of the speaker having a conversation with the user.
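As a non-limiting illustration of how a relationship could be derived from received contact information, the sketch below maps title keywords to relationship labels. The Contact structure, the keyword table, and the example values are assumptions made purely for illustration; they are not part of the disclosed system.

# Illustrative sketch: deriving a speaker relationship from contact information.
# The title keywords and data structures below are assumptions for illustration only.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Contact:
    name: str
    title: str          # e.g. "Mom", "Manager Kim", "Daughter"
    phone_number: str

# Hypothetical mapping from title keywords to a relationship label.
TITLE_TO_RELATIONSHIP = {
    "mom": "mother", "dad": "father",
    "daughter": "child", "son": "child",
    "manager": "colleague", "director": "colleague",
}

def infer_relationship(contact: Contact) -> Optional[str]:
    """Return a relationship label for the speaker, or None if unknown."""
    title = contact.title.lower()
    for keyword, relationship in TITLE_TO_RELATIONSHIP.items():
        if keyword in title:
            return relationship
    return None

# Example: a contact received from the mobile device (hypothetical values).
print(infer_relationship(Contact("Jane", "Daughter", "010-1234-5678")))  # -> "child"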

The context information processor may determine a priority between speakers among the plurality of speakers based on the acquired relationship of the speakers, the storage device may further store function control information of each speaker, the dialog manager may determine an action corresponding to the determined priority and the stored function control information, and the result processor may generate a control instruction to control at least one function based on the determined priority and the stored function control information of each speaker and output the generated control instruction to the at least one device.

The dialog system may further include a pre-utterance determiner configured to determine whether it is a pre-utterance context, wherein the context information processor may determine priorities between speakers based on the acquired relationships between the speakers, and identify the speaker having the highest priority based on the determined priorities; and the dialog manager may determine the action corresponding to the speaker with the highest priority and the pre-utterance context.

According to another aspect of the present invention, a vehicle includes: a plurality of loads; a voice input device configured to receive voice; a dialog system configured to: acquiring an utterance and a speech pattern by recognizing speech input via a speech input device, classifying a conversation content of each of a plurality of speakers based on the acquired utterance and the speech pattern, acquiring a relationship between the speakers of the plurality of speakers based on the acquired utterance, and determining an intention and a context of each of the plurality of speakers based on the acquired relationship between the speakers and the classified conversation content of each of the speakers; determining an action corresponding to the acquired relationship between the speakers and the intention and context of each of the acquired plurality of speakers, outputting an utterance corresponding to the determined action, and generating a control instruction corresponding to the determined action; and a vehicle controller configured to control at least one load of the plurality of loads based on the received control instruction when the control instruction is received.

The vehicle may further include a communication device configured to communicate with the mobile device. When contact information is received from the mobile device, the dialogue system may acquire at least one of a name, a title, or a phone number of a speaker having a conversation with the user based on the received contact information, acquire a relationship between speakers of the plurality of speakers based on the acquired at least one of the name, the title, or the phone number of that speaker, and store at least one of the name, title, or phone number of each speaker, the relationship between the speakers of the plurality of speakers, and a voice pattern of each speaker of the plurality of speakers.

The dialogue system may receive function control information for controlling a vehicle function from mobile devices of a plurality of speakers, store the received function control information of each speaker in a storage device, determine a priority between speakers of the plurality of speakers based on a relationship between speakers of the plurality of speakers, determine an action corresponding to the determined priority and the stored function control information, generate a control instruction to control at least one function based on the determined priority and the stored function control information, and output the generated control instruction to at least one load of the plurality of loads.
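A minimal sketch of how speaker priority could be used to resolve conflicting function control information is given below. The relationship-to-priority ordering, the setting keys, and the example values are illustrative assumptions, not the actual control logic of the system.

# Illustrative sketch: resolving conflicting function control information by speaker priority.
# The priority ordering and the setting keys are assumptions for illustration only.
RELATIONSHIP_PRIORITY = {"driver": 0, "mother": 1, "father": 1, "child": 2, "guest": 3}

# Function control information previously received from each speaker's mobile device.
function_control = {
    "driver": {"ac_temperature_c": 22, "seat_heater": "off"},
    "child":  {"ac_temperature_c": 26, "seat_heater": "on"},
}

def select_control(function_control, relationships):
    """Apply the settings of the highest-priority speaker when settings conflict."""
    ranked = sorted(function_control,
                    key=lambda s: RELATIONSHIP_PRIORITY.get(relationships.get(s, "guest"), 99))
    merged = {}
    for speaker in reversed(ranked):              # lower priority first,
        merged.update(function_control[speaker])  # higher priority overwrites
    return merged

print(select_control(function_control, {"driver": "driver", "child": "child"}))
# -> {'ac_temperature_c': 22, 'seat_heater': 'off'} because the driver outranks the child here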

When a voice is input through the voice input device, the dialog system may recognize the voice and recognize a speaker by comparing a voice pattern of the recognized voice with a voice pattern stored in the storage device.
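The following sketch illustrates one way a voice pattern could be compared with stored patterns. Representing a voice pattern as a small feature vector, the cosine-similarity measure, and the threshold are assumptions made for illustration only.

# Illustrative sketch: identifying a speaker by comparing a voice pattern (here a simple
# feature vector) against stored patterns. The vectors and threshold are assumptions only.
import math

stored_patterns = {                # speaker -> enrolled voice-pattern vector
    "driver": [0.9, 0.1, 0.3],
    "child":  [0.2, 0.8, 0.5],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def identify_speaker(pattern, threshold=0.85):
    """Return the best-matching enrolled speaker, or None if no match is close enough."""
    best, best_score = None, 0.0
    for speaker, stored in stored_patterns.items():
        score = cosine(pattern, stored)
        if score > best_score:
            best, best_score = speaker, score
    return best if best_score >= threshold else None

print(identify_speaker([0.88, 0.12, 0.28]))   # -> "driver"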

The function control information may include: seat inclination angle information of each of the plurality of seats, horizontal position information of each of the front and rear rows of seats, airflow volume and direction information of an air conditioner, seat heater on/off information, seat heater temperature information, seat ventilation on/off information, and lumbar support information.

The vehicle may further include an information input device other than voice, wherein the dialogue system may store the seat position of each speaker input through the information input device other than voice.

The dialog system may identify an utterance position based on a time at which a voice signal arrives at the voice input device, identify a seat position corresponding to the identified utterance position, identify a voice pattern of the voice arriving at the voice input device, match the voice pattern of the identified voice with the identified seat position, and store information that the voice pattern of the identified voice matches the identified seat position.
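As a rough illustration of identifying the utterance position from arrival times, the sketch below assumes one microphone near each seat and simply picks the seat whose microphone the signal reaches first. The microphone names, placement, and timings are hypothetical.

# Illustrative sketch: estimating the utterance (seat) position from the arrival times of a
# voice signal at several in-cabin microphones. Microphone placement and the mapping from
# the earliest-arrival microphone to a seat are assumptions for illustration only.
arrival_times_ms = {            # microphone -> time of arrival of the same utterance
    "mic_front_left": 2.1,
    "mic_front_right": 2.6,
    "mic_rear_left": 3.4,
    "mic_rear_right": 3.8,
}

MIC_TO_SEAT = {
    "mic_front_left": "driver seat",
    "mic_front_right": "front passenger seat",
    "mic_rear_left": "rear left seat",
    "mic_rear_right": "rear right seat",
}

def identify_seat(arrival_times_ms):
    """Assume the speaker sits closest to the microphone the signal reaches first."""
    first_mic = min(arrival_times_ms, key=arrival_times_ms.get)
    return MIC_TO_SEAT[first_mic]

seat = identify_seat(arrival_times_ms)        # -> "driver seat"
# The identified seat can then be matched with the recognized voice pattern and stored.
print(seat)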

The vehicle may further include a detector disposed in the plurality of seats, the detector configured to detect seat inclination information of the plurality of seats and horizontal positions of the front and rear seats, wherein the dialogue system is configured to match the seat inclination information and the horizontal positions with the identified seat positions, and store information that the seat inclination information and the horizontal positions match with the identified seat positions.

The dialog system may determine whether it is a pre-utterance context, and when it is determined to be the pre-utterance context, the dialog system may output an utterance corresponding to the pre-utterance context based on a relationship between speakers of the plurality of speakers.

The dialog system may determine a priority between speakers of the plurality of speakers based on a relationship between speakers of the plurality of speakers, identify a speaker with a highest priority based on the determined priority, and determine an action corresponding to the speaker with the highest priority and the pre-utterance context.

Upon determining that the intent of the currently input utterance is a response to the previously input speech, the dialog system is configured to obtain a relationship with a speaker of the currently input utterance based on an utterance corresponding to the previously input speech.

The dialog system may match the acquired speaker relationship of the currently input voice with the voice pattern of the currently input voice, and store the matched information.

According to another aspect of the present invention, a method for controlling a vehicle includes: acquiring utterances and speech patterns by speech recognition when speech of a plurality of speakers is input through a speech input device; classifying conversation content of each of the plurality of speakers based on the acquired utterance and speech pattern; acquiring a relationship between speakers among the plurality of speakers based on the acquired utterance; determining an intent and context for each of a plurality of speakers based on the obtained relationships; determining an action corresponding to the obtained relationship and the obtained intention and context of each of the plurality of speakers, and outputting an utterance corresponding to the determined action; generating a control instruction corresponding to the determined action; and controlling at least one load of the plurality of loads based on the generated control instruction.

The control method may further include: obtaining at least one of a name, a title, or a phone number of a speaker of the plurality of speakers based on the contact information received from the mobile device; acquiring a relationship between speakers of the plurality of speakers based on at least one of the name, title, or phone number of the speakers of the plurality of speakers; and storing, in a storage device, at least one of the name, title, or phone number of each speaker and the relationship between the speakers of the plurality of speakers.

The control method may further include: receiving function control information for controlling a vehicle function from mobile devices of a plurality of speakers; determining a priority between speakers of the plurality of speakers based on a relationship between speakers of the plurality of speakers; determining an action corresponding to the determined priority and the received function control information of each speaker; generating a control instruction to control at least one function based on the determined priority and the received function control information of each speaker; and outputting the generated control instruction to at least one load of the plurality of loads.

The control method may further include: acquiring a seat position of a speaker; and storing the seat position of the speaker.

The control method may further include: recognizing a sound production position based on a time when the voice signal arrives at the voice input device; identifying a seat position corresponding to the identified sound production position; recognizing a voice pattern of a voice arriving at the voice input device; and matching the voice pattern of the recognized voice with the recognized seat position, and storing the matched information.

The control method may further include: determining a priority between speakers of the plurality of speakers based on a relationship between speakers of the plurality of speakers; identifying a speaker having the highest priority based on the determined priority and determining whether it is a pre-utterance context; and when the pre-utterance context is determined, outputting an utterance corresponding to the speaker with the highest priority and the pre-utterance context.

Obtaining a relationship between speakers of the plurality of speakers based on the obtained utterance may include: upon determining that the intent of the currently input utterance is a response to a previously input speech, obtaining a relationship with a speaker of the currently input utterance based on an utterance corresponding to the previously input speech; and matching the acquired speaker relationship of the currently input utterance with the speech pattern of the currently input speech, and storing the matched information.

Further areas of applicability will become apparent from the description provided herein. It should be understood that the description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the present disclosure.

Drawings

In order that the invention may be better understood, various embodiments of the invention will be described by way of example only, with reference to the accompanying drawings, in which:

FIG. 1 is a control block diagram showing a dialog system;

FIG. 2 is a schematic view showing the interior of a vehicle;

FIGS. 3 to 5 are schematic diagrams showing examples of a dialog generated between the dialog system and the driver;

FIGS. 6 and 7 are control block diagrams schematically illustrating the connections between the dialog system and components of the vehicle;

FIGS. 8 and 9 are control block diagrams schematically illustrating the connections between components of the dialog system and components of the vehicle;

FIG. 10 is a control block diagram showing a vehicle-independent method in which a dialogue system is provided in a vehicle;

FIGS. 11 and 12 are control block diagrams showing a vehicle gateway method in which a dialogue system is provided in a remote server and a vehicle is used as a gateway for connecting a user to the dialogue system;

FIG. 13 is a control block diagram showing a case where the vehicle is capable of partial input processing and output processing in the vehicle gateway method;

FIG. 14 is a control block diagram showing a hybrid method in which both the remote dialogue system server and the vehicle perform dialogue processing;

FIGS. 15 and 16 are control block diagrams illustrating a mobile gateway method in which a mobile device connected to a vehicle connects a user to a remote dialog system server;

FIG. 17 is a control block diagram showing a mobile-independent method in which a dialogue system is provided in a mobile device;

FIGS. 18, 19A, and 19B are control block diagrams showing in detail the configuration of an input processor in the configuration of the dialog system;

FIGS. 20A and 20B are diagrams showing examples of information stored in the contextual understanding table;

FIG. 21 is a control block diagram showing a dialog system suitable for use in a situation where the dialog system first outputs an utterance before receiving user input;

FIGS. 22A, 22B, and 22C are diagrams showing examples of information stored in the pre-utterance condition table;

FIG. 23 is a control block diagram showing the configuration of the dialog manager in detail;

FIG. 24 is a diagram showing an example of information stored in the relationship action DB;

FIG. 25 is a diagram showing an example of information stored in the action execution condition DB;

FIG. 26 is a diagram showing an example of information stored in the action parameter DB;

FIG. 27 is a table showing an example of information stored in the ambiguity resolution information DB;

FIGS. 28A and 28B are tables showing various examples of vehicle control executed as a result of the ambiguity resolver resolving the ambiguity by referring to the ambiguity resolution information DB and extracting the action;

FIG. 29 is a control block diagram showing the configuration of the result processor in detail;

FIGS. 30 to 42 are diagrams showing specific examples in which the dialogue system processes input, manages dialogue, and outputs a result when a user inputs an utterance related to route guidance;

FIG. 43 is a flowchart showing a method of processing a user input in the dialogue processing method;

FIG. 44 is a flowchart showing a method of managing dialogs using the output of an input processor in a dialog processing method;

FIG. 45 is a flowchart illustrating a result processing method for generating a response corresponding to a result of dialog management in a dialog processing method according to an embodiment;

FIGS. 46 to 48 are flowcharts showing a case where the dialogue system outputs a pre-utterance before a user inputs an utterance in the dialogue processing method;

FIG. 49 is a flowchart illustrating processing of repetitive tasks when the dialog system outputs a pre-utterance before a user inputs an utterance in a dialog processing method;

FIG. 50 is a control block diagram showing a vehicle in which a dialogue system is provided;

FIG. 51 is a detailed control block diagram showing a dialogue system;

FIG. 52 is a control block diagram showing an input processor of the dialog system;

FIG. 53 is a detailed control block diagram showing an input processor of the dialogue system;

FIG. 54 is a control block diagram showing a result processor of the dialog system;

FIG. 55 is a control block diagram showing a vehicle having a dialogue system;

FIGS. 56A-56D are diagrams that illustrate examples of contact information stored in a mobile device in communication with a dialog system;

FIG. 57 is a diagram illustrating an example of contact information stored in the dialog system;

FIG. 58 is a diagram showing an example of a conversation between speakers in a vehicle having a dialogue system;

FIGS. 59A and 59B are diagrams showing an example of a speaker in a vehicle having a dialogue system;

FIG. 60 is a diagram showing an example of function control information stored in a mobile device communicating with a dialogue system; and

FIG. 61 is an example showing a dialog between a user and a dialog system.

The drawings described herein are for illustration purposes only and are not intended to limit the scope of the present disclosure in any way.

Description of reference numerals:

100: dialogue system

110: input processor

120: dialog manager

130: result processor

200: vehicle

210: voice input device

220: information input device other than speech

230: dialogue output device

280: a communication device.

Detailed Description

The following description is merely exemplary in nature and is not intended to limit the present disclosure, application, or uses. It should be understood that throughout the drawings, corresponding reference numerals indicate like or corresponding parts and features.

Well-known functions or constructions are not described in detail since they would obscure one or more exemplary embodiments in unnecessary detail. Terms such as "unit," "module," "member," and "block" may be implemented as hardware or software. Depending on the embodiment, a plurality of "units", "modules", "members" and "blocks" may be implemented as a single component, or a single "unit", "module", "member" and "block" may include a plurality of components.

It will be understood that when an element is referred to as being "connected" to another element, it can be directly or indirectly connected to the other element, wherein indirect connection includes "connection through a wireless communication network".

Further, when a component is described as "comprising" or "including" an element, the component may further include other elements, and does not exclude them, unless there is a specific description to the contrary.

As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

The identification code is used for convenience of description, but is not intended to illustrate the order of each step. Unless the context clearly dictates otherwise, each step may be performed in a different order than that shown.

Reference will now be made in detail to the embodiments of the present invention, examples of which are illustrated in the accompanying drawings.

In one embodiment, the dialog system may be configured to recognize the user's intention by using the user's voice and another input other than the voice, and to provide a service suited to the user's intention or needs. The dialog system may conduct a dialog with the user by outputting a system utterance, which is one of the tools used to provide a service or to clearly recognize the user's intention.

In an exemplary embodiment, the service provided to the user may include all types of operations according to the user's needs or the user's intention, wherein all types of operations may include providing information, controlling a vehicle, performing an audio/video/navigation function, and providing content from an external server.

In another embodiment, the dialogue system provides dialogue processing techniques that are specific to the vehicle environment in order to accurately identify the user's intent in the particular environment (i.e., the vehicle).

The gateway connecting the dialog system to the user may be a vehicle or a mobile device connected to a vehicle. As described below, the dialogue system may be provided in a vehicle or a remote server outside the vehicle to transmit or receive data by communicating with the vehicle or a mobile device connected to the vehicle.

Some components of the dialog system may be located in the vehicle and other components may be located in a remote server. Thus, the vehicle and the remote server may perform a part of the operation of the dialogue system.

Fig. 1 is a control block diagram illustrating a dialog system according to an embodiment of the present invention.

Referring to fig. 1, the dialogue system 100 may include an input processor 110, a dialogue manager 120, a result processor 130, and a storage device 140, the input processor 110 processing a user's input including a user's voice and an input other than the user's voice, or an input including vehicle-related information or user-related information; the dialogue manager 120 recognizes the user's intention and vehicle state using the processing result of the input processor 110 and determines an action corresponding to the user's intention or vehicle state; the result processor 130 provides a specific service or outputs a system utterance for continuing a dialog according to an output result of the dialog manager 120; the storage device 140 stores various information for the operations described later.
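A minimal sketch of how these four components could interact is shown below; the class names, method names, and the trivial keyword-based intent logic are assumptions made purely for illustration and do not reflect the actual implementation.

# Illustrative sketch of the overall flow between the input processor, dialog manager, and
# result processor described above. Names and logic are assumptions for illustration only.
class InputProcessor:
    def process(self, speech_text, context_info):
        # In the real system this covers speech recognition, natural language
        # understanding, and context understanding.
        intent = "gas_station_recommendation" if "gas station" in speech_text.lower() else "unknown"
        return {"intent": intent, "context": context_info}

class DialogManager:
    def determine_action(self, processed):
        # Choose an action matching the recognized intent and current context.
        if processed["intent"] == "gas_station_recommendation":
            return {"action": "guide_to_gas_station", "params": processed["context"]}
        return {"action": "ask_clarification", "params": {}}

class ResultProcessor:
    def respond(self, action):
        # Generate the system utterance and, if needed, a vehicle control instruction.
        if action["action"] == "guide_to_gas_station":
            return "Starting route guidance to the nearest gas station."
        return "Could you tell me more about what you need?"

processed = InputProcessor().process("Let me know a nearby gas station", {"remaining_fuel_km": 43})
print(ResultProcessor().respond(DialogManager().determine_action(processed)))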

The input processor 110 may receive two kinds of input: the user's speech and an input other than speech. The input other than speech may include a gesture of the user, an input of the user entered through operation of an input device, vehicle state information indicating a state of the vehicle, driving environment information related to driving of the vehicle, and user information indicating a state of the user. Further, in addition to the above information, any information related to the user and the vehicle may be input to the input processor 110 as long as the information can be used to recognize the user's intention or provide a service to the user or the vehicle. The users may include drivers and passengers.

The input processor 110 converts the user's speech into a text-type utterance by recognizing the user's speech, and recognizes the user's intention by applying a natural language understanding algorithm to the user's utterance.

The input processor 110 collects information related to a vehicle state or a driving environment of the vehicle other than the user's voice and then understands a context using the collected information.

The input processor 110 transmits the user intention acquired through the natural language understanding technology and the information related to the context to the dialog manager 120.

The dialog manager 120 determines an action corresponding to the user's intention or the current context based on the user's intention and the context-related information transmitted from the input processor 110, and manages parameters required to perform the corresponding action.

In an exemplary embodiment, the action may represent various actions for providing a specific service, and the kinds of actions may be predetermined. Depending on the case, providing a service may correspond to performing an action.

For example, actions such as route guidance, vehicle state check, and gas station recommendation may be predefined in the domain/action inference rule DB 141 (refer to fig. 19A), and an action corresponding to an utterance of the user, that is, an action expected by the user, may be extracted according to a stored inference rule. An action related to an event occurring in the vehicle may be defined in advance and then stored in the relationship action DB 146b (refer to fig. 21).
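The following sketch illustrates the idea of extracting a predefined action from an utterance using stored inference rules; the keyword rules and action names are illustrative assumptions and are not the contents of the actual domain/action inference rule DB 141.

# Illustrative sketch: extracting a predefined action from a user utterance using stored
# inference rules. The keyword rules and action names are assumptions for illustration only.
INFERENCE_RULES = [
    (("route", "guide", "navigate"), "route_guidance"),
    (("check", "status", "tire", "engine"), "vehicle_state_check"),
    (("gas station", "fuel", "gasoline"), "gas_station_recommendation"),
]

def extract_action(utterance: str):
    """Return the first predefined action whose keywords appear in the utterance."""
    text = utterance.lower()
    for keywords, action in INFERENCE_RULES:
        if any(keyword in text for keyword in keywords):
            return action
    return None

print(extract_action("Let me know a nearby gas station"))  # -> "gas_station_recommendation"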

The kind of action is not limited. Any operation may serve as an action as long as the dialog system 100 can perform it via the vehicle 200 or the mobile device 400, and its inference rule or its relationship to other actions/events is defined and stored in advance.

The dialog manager 120 sends information related to the determined action to the results processor 130.

The results processor 130 generates and outputs a dialog response and instructions necessary to perform the sent action. The dialog response may be output in text, image or audio type. When the instruction is output, services such as vehicle control and provision of external content corresponding to the output instruction may be executed.

The storage device 140 stores various information for conversation processing and providing services. For example, the storage device 140 may store in advance information related to domains, actions, language behaviors, and entity names used for natural language understanding, and a context understanding table used for understanding a context from input information. Further, the storage device 140 may store in advance data detected by sensors provided in the vehicle, information related to the user, and information required for actions. The information stored in the storage device 140 will be described in detail later.

As described above, the dialogue system 100 provides dialogue processing techniques specific to the vehicle environment. All or some of the components of the dialog system 100 may be present in the vehicle. The dialog system 100 may be provided in a remote server and the vehicle may act as a gateway between the dialog system 100 and the user. In either case, the dialog system 100 may be connected to the user via the vehicle or a mobile device connected to the vehicle.

Fig. 2 is a schematic view showing the interior of the vehicle.

Referring to fig. 2, a display device 231 and an input button 221 may be provided on the center dashboard 203; the display device 231 is configured to display a screen required for vehicle control including an audio function, a video function, a navigation function, and a call function; the input button 221 is configured to receive a control instruction of a user; the center dashboard 203 corresponds to a center portion of the dashboard inside the vehicle 200.

For the convenience of the user's operation, input buttons may be provided on the steering wheel 207, and a knob 225 serving as an input button may be provided on the center console region 202 disposed between the driver seat 254a and the passenger seat 254b.

The module including the display device 231, the input buttons 221, and a processor controlling various functions may correspond to an Audio Video Navigation (AVN) terminal or a head unit.

The display device 231 may be implemented by any of various display devices, for example, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED), a Plasma Display Panel (PDP), an Organic Light Emitting Diode (OLED), and a Cathode Ray Tube (CRT).

The input buttons 221 may be disposed in a region adjacent to the display device 231 in a physical key type, as shown in fig. 2. Alternatively, when the display device 231 is implemented by a touch screen, the display device 231 may perform the function of the input button 221.

The vehicle 200 may receive the user control instruction as voice via the voice input device 210. The voice input device 210 may include a microphone configured to receive sound and then convert the sound into an electrical signal.

The voice input device 210 may be mounted to the headliner 205 for efficient voice input, as shown in fig. 2, however, exemplary embodiments of the vehicle 200 are not limited thereto. Thus, the voice input device 210 may be mounted to the dashboard 201 or the steering wheel 207. In addition, the voice input device 210 may be installed at any location as long as the location is suitable for receiving the user's voice.

Inside the vehicle 200, a speaker 232 may be provided, and the speaker 232 may be configured to have a dialogue with a user or to output sounds required to provide a service desired by the user. For example, the speaker 232 may be provided inside the driver seat door 253a and the passenger seat door 253 b.

The speaker 232 may output voice for navigation route guidance, sound or voice present in audio and video content, voice for providing information or services desired by the user, and system utterances generated as a response to utterances by the user.

In one embodiment, the dialogue system 100 provides services suited to the user's lifestyle by using dialogue processing technology suited to the vehicle environment, and the dialogue system 100 may implement new services using technologies such as connected cars, the Internet of Things (IoT), and artificial intelligence (AI).

When dialogue processing technology suited to the vehicle environment, such as the dialogue system 100, is applied, key contexts can be easily identified and responded to while the driver is directly driving the vehicle. A service may be provided by giving weight to parameters that affect driving, such as a fuel shortage or drowsy driving, and information required for a service, such as travel time and destination information, can easily be acquired because in most cases the vehicle is moving toward a destination.

Further, an intelligent service configured to provide a function can be easily implemented by recognizing the intention of the driver, because real-time information and actions are prioritized while the driver is directly driving. For example, when a driver searches for a gas station while driving, it can be interpreted as the driver intending to go to the gas station. However, when the user searches for a gas station while not driving the vehicle, it can be interpreted as another intention, such as a location information query, a phone number query, or a price query, rather than an intention to go to the gas station.

Further, although the vehicle is a limited space, various situations may occur therein. For example, a driver may use the dialog system 100 in various situations, such as driving a vehicle with an unfamiliar interface (e.g., a rented vehicle), using driver services, vehicle management situations (e.g., car washes), situations where there is a baby in the vehicle, and situations where a particular destination is visited.

Further, various service and dialogue situations may occur in each stage of vehicle travel and in the stages before and after travel (for example, a vehicle inspection stage, a start preparation stage, a travel stage, and a parking stage). Specifically, the driver can use the dialogue system 100 in various situations, for example, a situation in which the driver does not know how to deal with a problem, a situation in which the vehicle is associated with various external devices, a situation in which driving habits (such as fuel mileage) are checked, a situation in which a safety support function (such as smart cruise control) is used, a situation in which a navigation operation is performed, a situation in which the driver is drowsy, a situation in which the driver travels along the same route every day, and a situation in which the driver checks whether parking is available at a place.

Fig. 3 to 5 are schematic diagrams showing examples of a dialog generated between the dialog system and the driver.

Referring to fig. 3, although the driver does not input an utterance inquiring about the current remaining fuel amount or requesting gas station guidance, the dialogue system 100 may recognize the current remaining fuel on its own, and when the recognized remaining fuel is less than a predetermined value, the dialogue system 100 may first output an utterance providing information related to the current remaining fuel (S1: You can drive 43 km with the remaining fuel).

In response to the utterance, the driver may input an utterance asking for a nearby gas station to receive route guidance (U1: Let me know a nearby gas station), and the dialogue system 100 may output an utterance providing information about the gas stations closest to the current location (S2: The closest gas stations are the A oil Seong-rim gas station, the B oil Jang-dae gas station, and the C oil Pacific gas station).

The driver may additionally input an utterance asking about the fuel price (U2: Where is it the cheapest?), and the dialogue system 100 may output an utterance providing price information by fuel type (S3: The lowest gasoline price is at the B oil Jang-dae gas station, 1,294 won per liter, whereas the lowest diesel price is at the A oil Seong-rim gas station, 985 won per liter).

The driver may input an utterance asking for guidance to the B oil Jang-dae gas station (U3), and the dialogue system 100 may output an utterance indicating the start of guidance to the gas station selected by the driver (S4: Starting the route to the B oil Jang-dae gas station).

That is, the dialogue system 100 may determine that the currently required service is a gas station guidance service based on the state information of the vehicle received via the input processor 110, and output a pre-utterance to provide the required service. Further, the driver can be guided to a nearby gas station that sells the current vehicle's fuel type at the lowest price through a dialogue with the dialogue system 100. Here, a "pre-utterance" represents an utterance that is first output from the dialog system 100 before the user speaks.
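A minimal sketch of such a pre-utterance trigger is shown below; the 50 km threshold and the wording of the proactive utterance are assumptions for illustration only.

# Illustrative sketch: outputting a pre-utterance when the remaining fuel is low, before the
# user says anything. The threshold value and the utterance wording are assumptions only.
LOW_FUEL_THRESHOLD_KM = 50

def check_pre_utterance(drivable_distance_km):
    """Return a proactive system utterance when a pre-utterance condition is met."""
    if drivable_distance_km < LOW_FUEL_THRESHOLD_KM:
        return f"You can drive about {drivable_distance_km} km with the remaining fuel."
    return None

print(check_pre_utterance(43))   # -> "You can drive about 43 km with the remaining fuel."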

Meanwhile, when a gas station is selected in the example shown in fig. 3, the dialogue system 100 can omit part of the questions and directly provide information, and thus steps and time for dialogue can be reduced.

For example, the dialogue system 100 may recognize in advance that the fuel type of the current vehicle is gasoline and the criterion for the driver to select a gas station is price. Information relating to the fuel type of the vehicle may be acquired from the vehicle, and the criteria for the driver to select a gas station may be input by the driver in advance, or acquired by learning the driver's dialogue history or gas station selection history. This information may be stored in the storage device 140 in advance.

In this case, when the driver does not input the utterance (U2) requesting information about the fuel price, i.e., omits U2, as shown in fig. 4, the dialogue system 100 may actively output an utterance providing information related to the fuel price (S2 + S3 = S3'), specifically, the gasoline price corresponding to the fuel type of the current vehicle.

The driver may omit the utterance (U2) for requesting information about the fuel price, and the response of the dialogue system 100 may be formed such that the utterance (S2) for guiding a nearby gas station and the utterance (S3) for guiding the fuel price are integrated into a single response to reduce the steps and time of the dialogue.
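The following sketch illustrates how stored preferences (fuel type and selection criterion) could be used to collapse the two steps S2 and S3 into a single response. The station list, the prices other than the 1,294 won figure mentioned above, and the preference fields are illustrative assumptions.

# Illustrative sketch: collapsing two dialogue steps (S2 and S3) into one response when the
# fuel type and the driver's selection criterion are already known. Data is illustrative only.
stations = [
    {"name": "A oil Seong-rim gas station", "gasoline": 1310, "distance_km": 1.2},
    {"name": "B oil Jang-dae gas station",  "gasoline": 1294, "distance_km": 2.0},
    {"name": "C oil Pacific gas station",   "gasoline": 1320, "distance_km": 2.4},
]

driver_preferences = {"fuel_type": "gasoline", "criterion": "price"}  # learned or pre-stored

def merged_response(stations, prefs):
    """Recommend one station directly instead of first listing stations, then prices."""
    key = prefs["fuel_type"] if prefs["criterion"] == "price" else "distance_km"
    best = min(stations, key=lambda s: s[key])
    return (f"The cheapest {prefs['fuel_type']} nearby is at the {best['name']}, "
            f"{best[prefs['fuel_type']]} won per liter. Shall I guide you there?")

print(merged_response(stations, driver_preferences))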

Further, the dialogue system 100 can identify the intention of the driver as searching for a gas station by itself based on the situation where the driver inquires about the current remaining fuel amount.

In this case, as shown in fig. 5, although the driver does not input an utterance asking for a nearby gas station (U1), i.e., omits U1, the dialogue system 100 may actively output an utterance providing information related to the fuel price (S2 + S3 = S3'').

When the gas station closest to the current location and the gas station providing the lowest fuel price are the same, the utterance (S3'') providing information related to the fuel price may include a question asking whether to guide to the corresponding gas station. Thus, the user can request route guidance to the corresponding gas station simply by inputting an utterance (U3': Yes) agreeing to the question of the dialogue system 100, without inputting a specific utterance asking for guidance to a specific gas station.

As described above, the dialogue system 100 can recognize the real intention of the user by considering the content that the user has not spoken and actively provide information corresponding to the intention based on the information acquired in advance. Accordingly, a dialog step and time for providing a service desired by a user can be reduced.

Fig. 6 and 7 are control block diagrams schematically showing connections between the dialog system and components of the vehicle.

Referring to fig. 6, a user voice input to the dialogue system 100 may be input via a voice input device 210 provided in the vehicle 200. As shown in fig. 2, the voice input device 210 may include a microphone disposed inside the vehicle 200.

The input other than voice among the user input may be input through the information input device 220 other than voice. The information input device 220 other than voice may include input buttons 221 and 223 and a knob 225 for receiving an instruction by an operation of a user.

The information input device 220 other than voice may include a camera that images the user. From the image captured by the camera, a gesture, an expression, or a gaze direction of the user serving as an instruction input tool can be recognized. Alternatively, the state of the user (a drowsy state, etc.) may be recognized from the image captured by the camera.

Information related to the vehicle may be input into the dialog system 100 via the vehicle controller 240. The vehicle-related information may include vehicle state information or surrounding environment information acquired by various sensors provided in the vehicle 200, and information originally stored in the vehicle 200, such as a fuel type of the vehicle.

The dialogue system 100 may recognize the intention and context of the user using the user's voice input via the voice input device 210, the input other than the user's voice input via the information input device 220 other than the voice, and various information input via the vehicle controller 240. The dialog system 100 outputs a response to perform an action corresponding to the user's intent.

The dialog output device 230 is a device configured to provide output to a speaker in a visual, auditory, or tactile manner. The dialogue output device 230 may include a display device 231 and a speaker 232 provided in the vehicle 200. The display device 231 and the speaker 232 may output a response to an utterance of the user, a question about the user, or information requested by the user in a visual or audible manner. Further, the vibration may be output by mounting a vibrator in the steering wheel 207.

Further, according to the response output from the dialogue system 100, the vehicle controller 240 may control the vehicle 200 to perform an action corresponding to the user's intention or the current situation.

Meanwhile, the vehicle 200 may collect information acquired from the external content server 300 or an external device, for example, driving environment information and user information such as traffic conditions, weather, temperature, passenger information, and driver personal information, via the communication device 280, in addition to information acquired by a sensor provided in the vehicle 200, and then the vehicle 200 may transmit the information to the dialogue system 100.

As shown in fig. 7, information (e.g., a remaining fuel amount, a rainfall speed, surrounding obstacle information, a speed, an engine temperature, a tire pressure, a current position) acquired by a sensor provided in the vehicle 200 may be input to the dialogue system 100 via the internal signal controller 241.

Driving environment information acquired from the outside through vehicle-to-everything (V2X) communication may be input to the dialogue system 100 via the external signal controller 242. V2X refers to a vehicle exchanging and sharing various useful information (e.g., traffic conditions) during travel by communicating with the road infrastructure and with other vehicles.

The V2X communications may include vehicle-to-infrastructure (V2I) communication, vehicle-to-vehicle (V2V) communication, and vehicle-to-nomadic device (V2N) communication. Therefore, by using V2X communication, information such as traffic information ahead, the approach of another vehicle, or the risk of collision with another vehicle can be transmitted and received through communication performed directly between vehicles or through communication with infrastructure installed on the road, and the driver can thus be notified of the information.

Accordingly, the driving environment information input to the dialogue system 100 via the external signal controller 242 may include traffic information ahead, approach information of neighboring vehicles, a warning of collision with another vehicle, real-time traffic conditions, accident situations, and traffic flow control states.

Although not shown in the drawings, the signal acquired via V2X may also be input to the vehicle 200 via the communication device 280.

The vehicle controller 240 may include a memory storing a program for executing the above-described operations and operations described later, and a processor; the processor is configured to execute a stored program. At least one memory and at least one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on one chip or physically separated.

In addition, the internal signal controller 241 and the external signal controller 242 may be implemented by the same processor and memory, or by separate processors and memories.

Fig. 8 and 9 are control block diagrams schematically showing connections between the dialog system and components of the vehicle.

Referring to fig. 8, a user's voice transmitted from the voice input device 210 may be input to the voice input processor 111 provided in the input processor 110, and an input other than the user's voice transmitted from the information input device 220 other than the voice may be input to the contextual information processor 112 provided in the input processor 110.

In addition, information input via the internal signal controller 241 or the external signal controller 242 is input to the context information processor 112 provided in the input processor 110.

The context information input to the context information processor 112 may include vehicle state information, driving environment information, and user information, which are input from an information input device 220 other than voice and the vehicle controller 240. The context information processor 112 can identify a context based on the input context information. The dialog system 100 can accurately recognize the user's intention or efficiently find a service required by the user by recognizing the context.

The response output from the result processor 130 may be input to the dialogue output device 230 or the vehicle controller 240 to allow the vehicle 200 to provide the service required by the user. Further, a response may be sent to the external content server 300 to request a desired service.

The vehicle state information, the driving environment information, and the user information transmitted from the vehicle controller 240 may be stored in the storage device 140.

Referring to fig. 9, the storage device 140 may include a long term memory 143 and a short term memory 144. The data stored in the storage device 140 may be classified into a short term memory and a long term memory according to the importance and persistence of the data and the intention of a designer.

The short-term memory 144 may store previously performed dialogues. A previous dialogue may be a dialogue performed within a reference time from the current time. Alternatively, the dialogues may be continuously stored until the volume of utterance content between the user and the dialogue system 100 reaches a reference value.
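
As a purely illustrative sketch, and not part of the disclosed system, the pruning rule described above (keeping only dialogues within a reference time and below a capacity reference value) could be expressed as follows; the class name, the reference time, and the capacity value are hypothetical assumptions.

    import time
    from collections import deque

    class ShortTermMemory:
        """Illustrative sketch: keep recent dialogue turns bounded by age and count."""

        def __init__(self, reference_seconds=600, max_turns=50):
            self.reference_seconds = reference_seconds  # hypothetical reference time
            self.max_turns = max_turns                  # hypothetical capacity reference value
            self._turns = deque()

        def add_turn(self, speaker, utterance):
            self._turns.append((time.time(), speaker, utterance))
            self._prune()

        def _prune(self):
            now = time.time()
            # Drop turns older than the reference time.
            while self._turns and now - self._turns[0][0] > self.reference_seconds:
                self._turns.popleft()
            # Drop the oldest turns once the capacity reference value is exceeded.
            while len(self._turns) > self.max_turns:
                self._turns.popleft()

        def recent(self):
            return list(self._turns)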

For example, when it is mealtime, the vehicle 200 may output, via the speaker 232, an utterance asking whether to guide the user to a restaurant. Whether it is mealtime may be identified based on whether the current time is within a predetermined meal time range. When the user says "Let me know restaurants near Gangnam Station" or "Let me know restaurants" while the current position of the vehicle 200 is near Gangnam Station, the dialogue system 100 may search for restaurants near Gangnam Station through the external content server 300 and then provide the user with information about the found restaurants. For example, the dialogue system 100 may display a list of restaurants on the display device 231, and when the user says "the first one", the dialogue content from the restaurant request to the restaurant selection may be stored in the short-term memory 144.

Alternatively, not only the entire dialogue content but also specific information contained in the dialogue content may be stored. For example, the first restaurant on the restaurant list may be stored in the short-term memory 144 or the long-term memory 143 as the restaurant selected by the user.

When the user asks the dialogue system 100 "How is the weather?" after the dialogue about restaurants near Gangnam Station, the dialogue system 100 can assume, based on the dialogue stored in the short-term memory 144, that the place the user is interested in is Gangnam Station, and then output a response such as "It is raining at Gangnam Station".

Next, when the user says "Recommend a menu from that restaurant", the dialogue system 100 may assume that "that restaurant" refers to the restaurant near Gangnam Station based on the dialogue stored in the short-term memory, and acquire information related to the recommended menu of the corresponding restaurant through a service provided by the external content server 300. Thus, the dialogue system 100 may output a response such as "Noodles are the best menu item at that restaurant".

The long-term memory 143 may store data according to whether the persistence of the data is guaranteed. For example, when it is determined that the persistence of data is guaranteed, the data may be stored in the long-term memory 143; such data may include point of interest (POI) information (e.g., home), the phone numbers of friends, family, or companies, and the user's preferences for certain parameters. Conversely, when it is determined that the persistence of data is not guaranteed, the data may be stored in the short-term memory 144.

For example, the user's current location may be temporary data, and thus stored in short-term memory 144, while the user's preferences for restaurants may be persistent data that is available later, and thus stored in long-term memory 143.

When the user says "Is there a restaurant nearby?", the dialogue system 100 may identify the user's current location and find the user's favorite Chinese restaurant in the long-term memory 143.

In addition, the dialogue system 100 can actively provide services and information to the user using the data stored in the long-term memory 143 and the short-term memory 144.

For example, information related to the user's home may be stored in the long-term memory 143. The dialogue system 100 may acquire information related to the user's home from the external content server 300 and then provide information indicating "Water is expected to be cut off this Friday due to the cleaning of the apartment."

Information related to the vehicle battery state may be stored in the short-term memory 144. The dialogue system 100 may analyze the vehicle battery state stored in the short-term memory 144 and then provide information indicating "The battery is in a bad state. Have it repaired before winter."

Fig. 10 is a control block diagram showing a vehicle-independent method in which a dialogue system is provided in a vehicle.

According to the vehicle-independent method, as shown in fig. 10, the dialogue system 100, which includes the input processor 110, the dialogue manager 120, the result processor 130, and the storage device 140, may be included in the vehicle 200.

When the dialogue system 100 is included in the vehicle 200, the vehicle 200 can process the dialogue with the user by itself and provide the service required by the user. However, information required for the dialogue processing and the provision of the service may also be acquired from the external content server 300.

Vehicle state information or running environment information (e.g., remaining fuel amount, rainfall speed, surrounding obstacle information, speed, engine temperature, tire pressure, current location) detected by the vehicle detector 260 may be input to the dialogue system 100 via the vehicle controller 240.

In accordance with the response output from the dialogue system 100, the vehicle controller 240 may control an air conditioner 251, a window 252, a door 253, a seat 254, or an AVN 255 provided in the vehicle 200.

For example, when the dialogue system 100 determines that the user's intention or the service required by the user is to lower the temperature in the vehicle 200 and then generates and outputs a corresponding instruction, the vehicle controller 240 may lower the temperature in the vehicle 200 by controlling the air conditioner 251.

For another example, when the dialogue system 100 determines that the user's intention or the service required by the user is to raise the window 252a of the driver seat and generate and output a corresponding instruction, the vehicle controller 240 may raise the window 252a of the driver seat by controlling the window 252.

For another example, when the dialogue system 100 determines that the user's intention or the service required by the user is route guidance to a specific destination and then generates and outputs a corresponding instruction, the vehicle controller 240 may perform the route guidance by controlling the AVN 255. As necessary, the communication device 280 may acquire map data and POI information from the external content server 300 and then provide the service using the information.

Fig. 11 and 12 are control block diagrams showing a vehicle gateway method in which a dialogue system is set in a remote server and a vehicle is used as a gateway for connecting a user to the dialogue system.

According to the vehicle gateway method, as shown in fig. 11, a remote dialogue system server 1 may be provided outside the vehicle 200, and a communication device 280 and a dialogue system client 270, which is connected to the remote dialogue system server 1 via the communication device 280, may be provided in the vehicle 200. The communication device 280 serves as a gateway connecting the vehicle 200 and the remote dialogue system server 1.

The dialog system client 270 may serve as an interface to connect to input/output devices and collect, send, and receive data.

When the voice input device 210 and the information input device 220 other than the voice input provided in the vehicle 200 receive the input of the user and transmit the user input to the dialogue system client 270, the dialogue system client 270 may transmit the input data to the remote dialogue system server 1 via the communication device 280.

The vehicle controller 240 may also transmit data detected by the vehicle detector 260 to the dialogue system client 270, and the dialogue system client 270 may transmit data detected by the vehicle detector 260 to the remote dialogue system server 1 via the communication device 280.

Since the above-described dialogue system 100 is provided in the remote dialogue system server 1, the remote dialogue system server 1 can execute all of the following processes: input data processing, dialogue processing based on a result of the input data processing, and result processing based on a result of the dialogue processing.

Further, the remote dialogue system server 1 may acquire information or contents necessary for input data processing, dialogue management, or result processing from the external content server 300.

The vehicle 200 may acquire information or contents of the service required by the user from the external content server 300 according to the response transmitted from the remote dialogue system server 1.

Referring to fig. 12, the communication device 280 may include at least one communication module configured to communicate with an external device. For example, the communication device 280 may include at least one of a short-range communication module 281, a wired communication module 282, and a wireless communication module 283.

The short-range communication module 281 may include various short-range communication modules configured to transmit and receive signals over a short distance via a wireless communication network, for example, a Bluetooth module, an infrared communication module, a Radio Frequency Identification (RFID) communication module, a Wireless Local Area Network (WLAN) communication module, an NFC communication module, and a ZigBee communication module.

The wired communication module 282 may include various wired communication modules, such as a Local Area Network (LAN) module, a Wide Area Network (WAN) module, or a Value Added Network (VAN) module, and various cable communication modules, such as Universal Serial Bus (USB), High Definition Multimedia Interface (HDMI), Digital Visual Interface (DVI), Recommended Standard 232 (RS-232), power line communication, or Plain Old Telephone Service (POTS).

The wireless communication module 283 may include wireless communication modules supporting various wireless communication methods, for example, a Wi-Fi module, a wireless broadband module, and modules supporting Global System for Mobile communications (GSM), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Time Division Multiple Access (TDMA), Long Term Evolution (LTE), 4G, and 5G.

In addition, the communication device 280 may further include an internal communication module (not shown) for communication between electronic devices in the vehicle 200. The communication protocol of the vehicle 200 may utilize a Controller Area Network (CAN), a Local Interconnect Network (LIN), FlexRay, and ethernet.

The dialogue system 100 can transmit data to the external content server 300 or the remote dialogue system server 1 via the wireless communication module 283 and receive data from the external content server 300 or the remote dialogue system server 1. The dialog system 100 may perform V2X communication using the wireless communication module 283. Further, using the short-range communication module 281 or the wired communication module 282, the dialogue system 100 can transmit and receive data to and from a mobile device connected to the vehicle 200.

Fig. 13 is a control block diagram showing a case where the vehicle is capable of partial input processing and output processing in the vehicle gateway method.

As described above, the dialogue system client 270 of the vehicle 200 may merely collect, transmit, and receive data; however, since the input processor 271, the result processor 273, and the storage device 274 are included in the dialogue system client 270, as shown in fig. 13, the dialogue system client 270 may also process data input from the user or the vehicle, or perform processing related to providing a service determined to be required by the user. That is, the operations of the input processor 110 and the result processor 130 may be performed not only by the remote dialogue system server 1 but also by the vehicle 200.

In this case, the dialog system client 270 may perform all or part of the operations of the input processor 110. The dialog system client 270 may perform all or part of the operations of the results processor 130.

The task sharing between the remote dialog system server 1 and the dialog system client 270 may be determined in consideration of the capacity of data to be processed and the data processing speed.

Fig. 14 is a control block diagram showing a hybrid method in which both the remote dialogue system server and the vehicle perform dialogue processing.

According to the hybrid method, as shown in fig. 14, the input processor 110, the dialogue manager 120, the result processor 130, and the storage device 140 are provided in the remote dialogue system server 1, so the remote dialogue system server 1 can perform dialogue processing, and a terminal dialogue system 290 provided with an input processor 291, a dialogue manager 292, a result processor 293, and a storage device 294 is provided in the vehicle 200, so the vehicle 200 can also perform dialogue processing.

However, there may be a difference in capacity or performance between the processor and the memory provided in the vehicle 200 and the processor or the memory provided in the remote dialogue system server 1. Accordingly, when the terminal dialog system 290 can output a result by processing all input data and managing a dialog, the terminal dialog system 290 can perform the entire process. Otherwise, processing may be requested from the remote dialogue system server 1.

Before performing the dialogue process, the terminal dialogue system 290 may determine whether the dialogue process can be performed based on the data type, and the terminal dialogue system 290 may directly perform the process or request the process from the remote dialogue system server 1 based on the result of the determination.

When an event occurs in which the terminal dialog system 290 cannot perform a process during the execution of a dialog process, the terminal dialog system 290 may request a process from the remote dialog system server 1 while transmitting the result of the terminal dialog system 290 itself to the remote dialog system server 1.

For example, when high-performance computing power or long-term data processing is required, the remote dialog system server 1 may perform dialog processing, and when real-time processing is required, the terminal dialog system 290 may perform dialog processing. For example, when a situation requiring immediate processing occurs and thus data needs to be processed before synchronization, the terminal dialog system 290 may be set to process the data first.

Further, when there is an unregistered speaker in the vehicle and thus user confirmation is required, the remote dialogue system server 1 may process the dialogue.

In addition, when the terminal dialogue system 290 cannot complete the dialogue processing by itself in a state where the terminal dialogue system 290 cannot connect to the remote dialogue system server 1 via the communication device 280, the user may be notified, via the dialogue output device 230, that the dialogue processing cannot be performed.
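
For illustration only, the kind of routing decision between the terminal dialogue system 290 and the remote dialogue system server 1 described above might be sketched as follows; the function name, flags, and return labels are hypothetical and not part of the disclosure.

    def choose_processor(needs_real_time, needs_heavy_compute,
                         unregistered_speaker, server_reachable):
        """Illustrative routing between the terminal dialogue system and the remote server."""
        if needs_real_time:
            return "terminal"                    # process immediately, before synchronization
        if (needs_heavy_compute or unregistered_speaker) and server_reachable:
            return "remote"                      # high-performance, long-term, or confirmation tasks
        if not server_reachable:
            return "terminal_or_notify_user"     # process locally if possible, otherwise notify the user
        return "terminal"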

The data stored in the terminal dialog system 290 and the data stored in the remote dialog system server 1 may be determined according to the data type or data capacity. For example, in the case of data having a risk of invading privacy due to personal identification, the data may be stored in the storage 294 of the terminal dialog system 290. Further, a large amount of data may be stored in the storage device 140 of the remote dialog system server 1, and a small amount of data may be stored in the storage device 294 of the terminal dialog system 290. Alternatively, a small amount of data may be stored in the storage device 140 of the remote dialog system server 1 and the storage device 294 of the terminal dialog system 290.

Fig. 15 and 16 are control block diagrams illustrating a mobile gateway method in which a mobile device connected to a vehicle connects a user to a remote dialog system server.

According to the mobile gateway method, as shown in fig. 15, the mobile device 400 may receive vehicle state information, driving environment information, and the like from the vehicle 200 and transmit user input and the vehicle state information to the remote dialogue system server 1. That is, the mobile device 400 may act as a gateway connecting the user to the remote dialogue system server 1 or connecting the vehicle 200 to the remote dialogue system server 1.

The mobile device 400 may represent an electronic device that is portable and capable of transmitting and receiving data to and from an external server and a vehicle by communicating with the external server and the vehicle, wherein the mobile device 400 may include a smart phone, a smart watch, smart glasses, a PDA, and a tablet computer.

The mobile device 400 may include a voice input device 410, an information input device 420 other than voice, an output device 430, a communication device 480, and a dialog system client 470, the voice input device 410 receiving a user's voice; the information input means 420 other than voice receives an input other than user voice; the output device 430 outputs a response in a visual, audible, or tactile manner; the communication device 480 transmits and receives data to and from the remote dialogue system server 1 and the vehicle 200 by communication; the dialog system client 470 collects input data from the user via the communication device 480 and transmits the data to the remote dialog system server 1.

The voice input device 410 may include a microphone that receives sound, converts the sound into an electrical signal, and outputs the electrical signal.

The information input device 420 other than voice may include an input button provided in the mobile device 400, a touch screen, or a camera.

The output device 430 may include a display device, a speaker, or a vibrator provided in the mobile device 400.

The voice input device 410, the information input device 420 and the output device 430, which are provided in the mobile device 400, may serve as input and output interfaces for the user. Further, the voice input device 210, the information input device 220 other than voice, and the dialogue output device 230 provided in the vehicle 200 may be used as input and output interfaces for the user.

When the vehicle 200 transmits data and user input detected by the vehicle detector 260 to the mobile device 400, the dialogue system client 470 of the mobile device 400 may transmit the data and user input to the remote dialogue system server 1.

The dialogue system client 470 may transmit a response or an instruction transmitted from the remote dialogue system server 1 to the vehicle 200. When the dialogue system client 470 utilizes the dialogue output device 230 provided in the vehicle 200 as an input and output interface of the user, an utterance of the dialogue system 100 or a response to the utterance of the user can be output via the dialogue output device 230. When the conversation system client 470 utilizes the output device 430 provided in the mobile device 400, an utterance of the conversation system 100 or a response to the utterance of the user can be output via the output device 430.

An instruction for vehicle control may be transmitted to the vehicle 200, and the vehicle controller 240 may perform control corresponding to the transmitted instruction, thereby providing a service required by the user.

The dialog system client 470 may collect input data and transmit the input data to the remote dialog system server 1. The dialog system client 470 may also perform all or part of the functionality of the input processor 110 and the result processor 130 of the dialog system 100.

Referring to fig. 16, the communication device 480 of the mobile device 400 may include at least one communication module configured to communicate with an external device. For example, the communication device 480 may include at least one of a short-range communication module 481, a wired communication module 482, and a wireless communication module 483.

The short-range communication module 481 may include various short-range communication modules configured to transmit and receive signals over a short distance via a wireless communication network, such as a Bluetooth module, an infrared communication module, a Radio Frequency Identification (RFID) communication module, a Wireless Local Area Network (WLAN) communication module, an NFC communication module, and a ZigBee communication module.

The wired communication module 482 may include various wired communication modules, such as a Local Area Network (LAN) module, a Wide Area Network (WAN) module, or a Value Added Network (VAN) module, and various cable communication modules, such as Universal Serial Bus (USB), High Definition Multimedia Interface (HDMI), Digital Visual Interface (DVI), Recommended Standard 232 (RS-232), power line communication, or Plain Old Telephone Service (POTS).

The wireless communication module 483 may include wireless communication modules supporting various wireless communication methods, for example, a Wi-Fi module, a wireless broadband module, and modules supporting Global System for Mobile communications (GSM), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Time Division Multiple Access (TDMA), Long Term Evolution (LTE), 4G, and 5G.

For example, the mobile device 400 may be connected to the vehicle 200 via the short range communication module 481 or the wired communication module 482, and the mobile device 400 may be connected to the remote conversation system server 1 or the external content server 300 via the wireless communication module 483.

Fig. 17 is a control block diagram showing a movement independent method in which a dialogue system is set in a mobile device.

According to the mobile independent method, as shown in fig. 17, the dialogue system 100 may be provided in the mobile device 400.

Accordingly, the mobile device 400 can process a dialogue with the user by itself and provide the service required by the user without connecting to the remote dialogue system server 1 for dialogue processing. However, the mobile device 400 may acquire information required for the dialogue processing and the provision of the service from the external content server 300.

The components forming the dialog system 100 may be physically separated from each other, or some of the components may be omitted, according to any of the methods described above. For example, even if the dialogue system 100 is provided in the remote dialogue system server 1, a part of the components forming the dialogue system 100 may be provided in a separate server or vehicle. The operator or manager of the separate server may be the same as or different from the operator or manager of the remote dialog system server 1. For example, a speech recognizer or a natural language understanding part described later may be provided in a separate server, and the dialogue system 100 may receive a result of speech recognition or a result of natural language understanding on an utterance of the user from the separate server. Alternatively, the storage device 140 may be provided in a separate server.

Hereinafter, the configuration and operation of each component of the dialogue system 100 will be described in detail. In the embodiments described later, it is assumed, for convenience of explanation, that the dialogue system 100 is provided in the vehicle 200. The specific components of the dialogue system 100 described later may be classified according to their operations, and there may be no limitation as to whether the components are implemented by the same processor and memory, or as to the physical locations of the processor and the memory.

Fig. 18, 19A, and 19B are control block diagrams showing in detail the configuration of an input processor in the configuration of the dialog system.

Referring to fig. 18, the input processor 110 may include a voice input processor 111 processing a voice input and a context information processor 112 processing context information.

The user's voice transmitted from the voice input device 210 may be input to the voice input processor 111, and the input other than the user's voice transmitted from the information input device other than voice 220 may be input to the contextual information processor 112.

The vehicle controller 240 may transmit the vehicle state information, the driving environment information, and the user information to the context information processor 112. The driving environment information and the user information may be provided to the external content server 300 or the mobile device 400 connected to the vehicle 200.

User input other than voice may be included in the context information. That is, the context information may include vehicle state information, driving environment information, and user information.

The vehicle state information may include information indicating a vehicle state and acquired by sensors provided in the vehicle 200, and information related to the vehicle and stored in the vehicle, such as a fuel type of the vehicle.

The running environment information may be information acquired by a sensor provided in the vehicle 200. The running environment information may include image information acquired by a front camera, a rear camera, or a stereo camera, obstacle information acquired by a sensor (e.g., radar, lidar, ultrasonic sensor), information related to rainfall, and rainfall speed information acquired by a rainfall sensor.

The driving environment information may further include traffic state information, traffic light information, and adjacent vehicle visit or adjacent vehicle collision risk information acquired through V2X.

The user information may include: information related to a user's state measured by a camera or a biometric reader provided in the vehicle, information related to the user directly input using an input device provided in the vehicle by the user, information related to the user and stored in the external content server 300, and information stored in the mobile device 400 connected to the vehicle.

The voice input processor 111 may include: a speech recognizer 111a, a natural language understanding part 111b, and a dialogue input manager 111c, the speech recognizer 111a outputting an utterance of a text type by recognizing an inputted voice of a user; the natural language understanding section 111b recognizes the intention of the user contained in the utterance by applying a natural language understanding technique to the utterance of the user; the dialog input manager 111c transmits the result of the understanding of the natural language and the context information to the dialog manager 120.

The speech recognizer 111a may include a speech recognition engine, and the speech recognition engine may recognize a speech uttered by a user and generate a recognition result by applying a speech recognition algorithm to the input speech.

To convert the input speech into a form more useful for speech recognition, the speech recognizer 111a may detect the actual speech portion included in the input by detecting a start point and an end point in the speech signal. This is called End Point Detection (EPD).
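
The following is a minimal, hypothetical sketch of an energy-based end point detection over a sampled speech signal; the frame length and energy threshold are arbitrary assumptions, and an actual EPD implementation would be considerably more elaborate.

    import numpy as np

    def detect_end_points(signal, frame_len=400, energy_threshold=0.01):
        """Illustrative EPD: return (start, end) sample indices of the detected speech region."""
        frames = [signal[i:i + frame_len] for i in range(0, len(signal) - frame_len, frame_len)]
        energies = [float(np.mean(frame ** 2)) for frame in frames]
        voiced = [i for i, energy in enumerate(energies) if energy > energy_threshold]
        if not voiced:
            return None                      # no speech portion detected
        start = voiced[0] * frame_len        # start point of the speech portion
        end = (voiced[-1] + 1) * frame_len   # end point of the speech portion
        return start, end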

The speech recognizer 111a may extract feature vectors of the input speech from the detected portion by applying a feature vector extraction technique such as the cepstrum, Linear Prediction Coefficients (LPC), Mel-Frequency Cepstral Coefficients (MFCC), or filter bank energies.
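
As a hedged illustration, MFCC feature extraction of the kind mentioned above is available in common open-source toolkits; the sketch below assumes the third-party librosa library and a 16 kHz sampling rate, neither of which is specified by the disclosure.

    import librosa

    def extract_mfcc(wav_path, n_mfcc=13):
        """Illustrative MFCC extraction; librosa is an assumed third-party dependency."""
        samples, sample_rate = librosa.load(wav_path, sr=16000)   # assumed 16 kHz sampling rate
        mfcc = librosa.feature.mfcc(y=samples, sr=sample_rate, n_mfcc=n_mfcc)
        return mfcc.T                                             # one feature vector per frame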

The speech recognizer 111a may obtain a recognition result by comparing the extracted feature vectors with trained reference patterns. At this time, the speech recognizer 111a may utilize an acoustic model, which models and compares the signal characteristics of speech, and a language model, which models the linguistic ordering relationship of words or syllables corresponding to the recognition vocabulary. To this end, the storage device 140 may store an acoustic model DB and a language model DB.

The acoustic model may be classified into a direct comparison method, which sets the recognition target as a feature vector model and compares it with the feature vectors of the speech signal, and a statistical method, which statistically processes the feature vectors of the recognition target.

The direct comparison method sets units to be recognized, such as words or phonemes, as a feature vector model and compares the received speech with the model to determine the similarity between them. A representative example of the direct comparison method is vector quantization, which maps the feature vectors of the received speech signal to a codebook serving as a reference model, encodes the mapped results as representative values, and compares these representative values with each other.

The statistical model method configures the units of the recognition target as state sequences and uses the relationships between the state sequences. Each state sequence may be configured with a plurality of nodes. Methods using the relationships between state sequences include Dynamic Time Warping (DTW), Hidden Markov Models (HMM), and methods using neural networks.

DTW is a method that compensates for differences on the time axis, in view of the dynamic characteristics of speech (the length of the signal varies over time even when the same person utters the same words), when comparing the received speech with a reference model. The HMM is a recognition method that models speech as a Markov process having state transition probabilities and observation probabilities of nodes (output symbols) in each state, estimates the state transition probabilities and the node observation probabilities from training data, and calculates the probability that the received speech was generated by the estimated model.
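
A minimal sketch of the DTW alignment idea described above, assuming two sequences of feature vectors stored as NumPy arrays; it is for illustration only and omits the constraints and optimizations of practical implementations.

    import numpy as np

    def dtw_distance(seq_a, seq_b):
        """Illustrative DTW: align two feature-vector sequences of different lengths."""
        n, m = len(seq_a), len(seq_b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                local = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])   # frame-to-frame distance
                cost[i, j] = local + min(cost[i - 1, j],              # insertion
                                         cost[i, j - 1],              # deletion
                                         cost[i - 1, j - 1])          # match
        return cost[n, m]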

Meanwhile, the language model models the linguistic ordering relationships of words, syllables, and the like, and can reduce acoustic ambiguity and recognition errors by applying the ordering relationships between the units constituting a language to the units obtained through speech recognition. The language models include statistical language models and Finite State Automata (FSA)-based models. Statistical language models use the chain probabilities of words, such as unigrams, bigrams, and trigrams.

The speech recognizer 111a may perform speech recognition using any of the methods described above. For example, the speech recognizer 111a may use an acoustic model to which an HMM is applied, or an N-best search method in which an acoustic model is combined with a language model. The N-best search method may improve recognition performance by selecting up to N recognition result candidates using an acoustic model and a language model and then re-estimating the ranking of the candidates.
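
The following hypothetical sketch illustrates the N-best rescoring idea, combining an acoustic score with a bigram language-model score; the function names, the weighting, and the sentence-start symbol are assumptions made for the example.

    def rescore_n_best(candidates, bigram_log_prob, lm_weight=0.8):
        """Illustrative N-best rescoring: combine acoustic and bigram language-model scores.

        candidates: list of (word_list, acoustic_log_prob) pairs
        bigram_log_prob: assumed function (previous_word, word) -> log probability
        """
        rescored = []
        for words, acoustic_score in candidates:
            lm_score = 0.0
            for previous, word in zip(["<s>"] + words[:-1], words):
                lm_score += bigram_log_prob(previous, word)   # chain probability of the bigram LM
            rescored.append((words, acoustic_score + lm_weight * lm_score))
        return sorted(rescored, key=lambda item: item[1], reverse=True)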

The speech recognizer 111a may calculate a confidence to ensure the reliability of the recognition result. The confidence is a measure of how reliable the speech recognition result is. For example, for a phoneme or word in the recognition result, the confidence may be defined as a relative value of the probability that the corresponding phoneme or word was uttered, compared with the probabilities of the other phonemes or words. Thus, the confidence may be expressed as a value between 0 and 1 or between 0 and 100.

When the confidence is greater than a predetermined threshold, the speech recognizer 111a may output a recognition result to allow an operation corresponding to the recognition result to be performed. When the confidence is equal to or less than the threshold, the speech recognizer 111a may reject the recognition result.
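
A minimal sketch of the confidence gating described above; the threshold value is an arbitrary assumption.

    def accept_recognition(result_text, confidence, threshold=0.6):
        """Illustrative confidence gate: accept or reject a speech recognition result."""
        if confidence > threshold:
            return result_text   # forward the utterance to natural language understanding
        return None              # reject; the system may ask the user to speak again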

An utterance in text form, as the recognition result of the speech recognizer 111a, may be input to the natural language understanding section 111b.

The natural language understanding section 111b can recognize the intention of the user contained in the utterance by applying a natural language understanding technique. Accordingly, the user can input a control command through a natural dialogue, and the dialogue system 100 may also induce the input of a control command through dialogue and provide the service desired by the user.

The natural language understanding section 111b may perform morphological analysis on the utterance in the text form. A morpheme is the smallest unit of meaning and represents the smallest semantic element that cannot be subdivided. Therefore, morphological analysis is the first step in natural language understanding and converts an input string into a morpheme string.

The natural language understanding section 111b may extract a domain from the utterance based on the morphological analysis result. The domain may be used to identify the subject of the user's utterance, and domains indicating various subjects (e.g., route guidance, weather search, traffic search, schedule management, fuel management, and air conditioning control) may be stored as a database.

The natural language understanding section 111b can recognize the entity name from the utterance. The entity name may be a proper noun, such as a person name, a place name, an organization name, a time, a date, and a currency, and the entity name identification may be configured to identify the entity name in the sentence and determine a type of the identified entity name. The natural language understanding section 111b can extract important keywords from the sentence using the entity name recognition and recognize the meaning of the sentence.

The natural language understanding section 111b may analyze the speech behavior of the utterance. The speech behavior analysis is configured to recognize the intention of the user's utterance, for example, whether the user is asking a question, making a request, responding, or simply expressing an emotion.

The natural language understanding section 111b extracts an action corresponding to the utterance intention of the user. The natural language understanding section 111b can recognize the intention of the utterance of the user based on information such as a domain, an entity name, and a speech behavior, and extract an action corresponding to the utterance. Actions may be defined by objects and operators.

The natural language understanding section 111b may extract parameters related to the execution of the action. The parameter related to the action execution may be a valid parameter directly required for the action execution or an invalid parameter for extracting a valid parameter.

For example, when the utterance of the user is "Let's go to Seoul Station", the natural language understanding section 111b may extract "navigation" as the domain corresponding to the utterance and "route guidance" as the action, and the speech behavior corresponds to a "request".

The entity name "Seoul Station" may correspond to [parameter: destination], but a specific exit number of the station or GPS information may be required to actually guide the route via the navigation system. In this case, [parameter: destination: Seoul Station] may be a candidate parameter for searching for the "Seoul Station" actually desired by the user among a plurality of Seoul Station POIs.

The natural language understanding part 111b may extract a tool configured to express a relationship between words or sentences, for example, a syntax tree.

The processing result of the natural language understanding section 111b, which includes the morphological analysis result, domain information, action information, speech behavior information, extracted parameter information, entity name information, and the syntax tree, may be transmitted to the dialogue input manager 111c.
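
For illustration, the processing result transmitted to the dialogue input manager 111c might be represented by a structure of the following shape for the utterance "Let's go to Seoul Station"; all field names and values are hypothetical and only indicate the kind of information involved.

    # Hypothetical shape of a natural language understanding result.
    nlu_result = {
        "morphemes": ["let", "us", "go", "to", "Seoul", "Station"],
        "domain": "navigation",
        "action": {"object": "route", "operator": "guidance"},
        "speech_behavior": "request",
        "entities": [{"type": "POI", "value": "Seoul Station"}],
        "parameters": {"destination": "Seoul Station"},   # candidate parameter; a POI search is still needed
    }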

The contextual information processor 112 may include a contextual information collector 112a, a contextual information collection manager 112b, and a context understanding part 112c; the contextual information collector 112a collects information from the information input device 220 other than voice and from the vehicle controller 240; the contextual information collection manager 112b manages the collection of contextual information; and the context understanding part 112c understands the context based on the result of the natural language understanding and the collected contextual information.

The input processor 110 may include a memory in which a program for performing the above-described operations and operations described later is stored, and a processor; the processor is configured to execute a stored program. At least one memory and at least one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on one chip or physically separated.

The speech input processor 111 and the context information processor 112 present in the input processor 110 may be implemented by the same processor and memory, or may be implemented by separate processors and memories.

Hereinafter, a method in which components of the input processor 110 process input data using information stored in the storage device 140 will be described in detail with reference to fig. 19A and 19B.

Referring to fig. 19A, the natural language understanding part 111b may perform domain extraction, entity recognition, speech behavior analysis, and action extraction using the domain/action inference rule DB 141.

In the domain/action inference rule DB 141, a domain extraction rule, a speech behavior analysis rule, an entity name conversion rule, and an action extraction rule may be stored.

Other information such as user input other than voice, vehicle state information, driving environment information, and user information may be input to the context information collector 112a and then stored in the context information DB 142, the long term memory 143, or the short term memory 144.

For example, the raw data detected by the vehicle detector 260 may be divided into a sensor type and a sensor value and then stored in the context information DB 142.

In the short term memory 144 and the long term memory 143, data meaningful to the user may be stored, wherein the data may include the current user state, the user's preferences and orientation, or data for determining the user's preferences and orientation.

As described above, information that ensures durability and is thus available for a long term may be stored in the long term memory 143, wherein the information may include a user's phone book, schedule, preferences, educational history, personality, work, and family-related information.

Information that is not guaranteed to be persistent or uncertain and therefore available for short-term use may be stored in the short-term storage 144, where the information may include current and previous locations, today's schedule, previous conversation content, conversation participants, environment, domain, and driver status. The data stored in at least two storage devices among the context information DB 142, the short term memory 144, and the long term memory 143 may be duplicated according to the data type.

Further, among the information stored in the short-term memory 144, data determined to ensure durability may be transmitted to the long-term memory 143.

The information to be stored in the long-term memory 143 can be acquired using the information stored in the short-term memory 144 and the context information DB 142. For example, the user's preference may be acquired by analyzing destination information or conversation contents stored for a specific duration, and the acquired user's preference may be stored in the long-term memory 143.

Acquiring information to be stored in the long-term memory 143 by using the information stored in the short-term memory 144 or the context information DB 142 may be performed in the dialogue system 100 or in an additional external system.

The former case may be performed in the memory manager 135 of the result processor 130. In this case, among the data stored in the short-term memory 144 or the context information DB 142, the data used for acquiring meaningful information (e.g., the user's preferences or orientation, or persistent information) may be stored in the long-term memory 143 in a log file format. The memory manager 135 may acquire persistent data by analyzing the data stored for more than a certain duration and store the acquired data in the long-term memory 143. In the long-term memory 143, the location where the persistent data is stored may be different from the location where the data stored in the log file format is stored.

The memory manager 135 may determine persistent data among the data stored in the short-term memory 144 and move and store the determined data into the long-term memory 143.
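
A minimal sketch of the promotion of persistent data from the short-term memory 144 to the long-term memory 143 described above; the dictionary-based stores and the persistence predicate are hypothetical stand-ins.

    def promote_persistent_data(short_term, long_term, is_persistent):
        """Illustrative sketch: move data judged persistent from short-term to long-term memory.

        short_term, long_term: dict-like stores; is_persistent: predicate over (key, value).
        """
        for key in list(short_term.keys()):
            value = short_term[key]
            if is_persistent(key, value):
                long_term[key] = value   # store in the long-term memory
                del short_term[key]      # remove from the short-term memory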

When acquiring information to be stored in the long-term memory 143 using information stored in the short-term memory 144 or the context information DB 142 is performed in an additional external system, the data management system 800 may be used, the data management system 800 being provided with a communicator 810, a storage device 820, and a controller 830, as shown in fig. 19B.

The communicator 810 can receive the data stored in the context information DB 142 or the short-term memory 144. All of the stored data may be transmitted to the communicator 810, or only the data used for acquiring meaningful information (e.g., the user's preferences or orientation, or persistent information) may be selected and transmitted. The received data may be stored in the storage device 820.

The controller 830 may acquire the persistent data by analyzing the stored data and then transmit the acquired data to the dialog system 100 via the communicator 810. The transmitted data may be stored in long-term memory 143 of dialog system 100.

Further, the dialogue input manager 111c can acquire the context information related to action execution by transmitting the output of the natural language understanding section 111b to the context understanding part 112c.

By referring to the context information stored according to the action in the context understanding table 145, the context understanding part 112c can determine the context information related to the action execution corresponding to the intention of the user utterance.

Fig. 20A and 20B are diagrams illustrating examples of information stored in the contextual understanding table.

Referring to the example of fig. 20A, context information and types of context information related to execution of an action may be stored in the contextual understanding table 145 according to each action.

For example, when the action is route guidance, the current location may be required as the context information, and the type of the context information may be GPS information. When the action is a vehicle state check, a travel distance may be required as the context information, and the type of the context information may be an integer. When the action is a gas station recommendation, a remaining fuel amount and a remaining Distance To Empty (DTE) may be required as the context information, and the type of the context information may be an integer.
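
For illustration only, the mapping held in the context understanding table 145 might be rendered as follows; the keys, type labels, and lookup function are assumptions made for the sketch.

    # Hypothetical rendering of an action-to-context mapping.
    CONTEXT_UNDERSTANDING_TABLE = {
        "route_guidance":             {"context": ["current_location"], "type": "GPS"},
        "vehicle_state_check":        {"context": ["travel_distance"], "type": "integer"},
        "gas_station_recommendation": {"context": ["remaining_fuel", "distance_to_empty"], "type": "integer"},
    }

    def required_context(action):
        """Look up the context information needed to execute a given action."""
        return CONTEXT_UNDERSTANDING_TABLE.get(action)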

When context information related to the execution of an action corresponding to the intention of the user utterance is stored in the context information DB 142, the long-term memory 143, or the short-term memory 144 in advance, the context understanding part 112c may acquire corresponding information from the context information DB 142, the long-term memory 143, or the short-term memory 144 and transmit the corresponding information to the dialog input manager 111c.

When the context information related to the execution of the action corresponding to the intention of the user utterance is not stored in the context information DB 142, the long-term memory 143, or the short-term memory 144, the context understanding part 112c may request the context information collection manager 112b for the required information. The context information collection manager 112b may allow the context information collector 112a to collect the required information.

The contextual information collector 112a may collect data periodically or only when a particular event occurs. In addition, the context information collector 112a may collect data periodically and then additionally collect data when a specific event occurs. In addition, the contextual information collector 112a may collect data when a data collection request is received from the contextual information collection manager 112 b.

The context information collector 112a may collect the required information and then store the information in the context information DB 142 or the short term memory 144. The contextual information collector 112a may send a confirmation signal to the contextual information collection manager 112b.

The context information collection manager 112b can send a confirmation signal to the context understanding part 112c, and the context understanding part 112c can retrieve the required information from the long term memory 143 or the short term memory 144 and then send the information to the dialog input manager 111c.

Specifically, when the action corresponding to the intention of the utterance of the user is route guidance, the context understanding section 112c may search the context understanding table 145 and recognize that the context information related to the route guidance is the current location.

When the current position is stored in the short-term memory 144 in advance, the context understanding section 112c can acquire the current position and transmit the current position to the dialogue input manager 111c.

When the current location is not stored in the short-term memory 144, the context understanding portion 112c can request the current location from the context information collection manager 112b, and the context information collection manager 112b can allow the context information collector 112a to obtain the current location from the vehicle controller 240.

The contextual information collector 112a may obtain the current location and then store the current location in the short-term memory 144. The contextual information collector 112a may send a confirmation signal to the contextual information collection manager 112b. The context information collection manager 112b can send a confirmation signal to the context understanding part 112c, and the context understanding part 112c can retrieve the current location information from the short-term memory 144 and then send the information to the dialog input manager 111c.
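
A minimal sketch of the get-or-collect flow described above for context information such as the current location; the store and collector arguments are hypothetical stand-ins for the short-term memory 144 and the contextual information collector 112a.

    def get_context_value(name, short_term, collect):
        """Illustrative get-or-collect flow for a piece of context information.

        short_term: dict standing in for the short-term memory; collect: callable that gathers
        the value (e.g., from the vehicle controller) when it is not stored yet.
        """
        if name in short_term:        # already stored: use it directly
            return short_term[name]
        value = collect(name)         # otherwise request collection
        short_term[name] = value      # store the collected value for later reuse
        return value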

The dialogue input manager 111c can transmit the output of the natural language understanding section 111b and the output of the context understanding part 112c to the dialogue manager 120, and the dialogue input manager 111c may prevent duplicate input from entering the dialogue manager 120. At this time, the output of the natural language understanding section 111b and the output of the context understanding part 112c may be combined into one output and then transmitted to the dialogue manager 120, or may be transmitted to the dialogue manager 120 independently.

When the contextual information collection manager 112b determines that a particular event has occurred because the data collected by the contextual information collector 112a meets a predetermined condition, the contextual information collection manager 112b may send a trigger for an action to the context understanding portion 112c.

The context understanding part 112c can search the context understanding table 145 for the context information related to the corresponding event, and when the found context information is not already stored, the context understanding part 112c can again transmit a context information request signal to the context information collection manager 112b.

As shown in fig. 20B, context information and types of context information related to an event can be stored in the context understanding table 145 according to each event.

For example, when the generated event is an engine temperature warning, the engine temperature in integer form may be stored as context information associated with the event. When the generated event is driver fatigue driving detection, the driver fatigue driving state in the form of an integer may be stored as the contextual information related to the event. When the generated event is insufficient tire pressure, the tire pressure in the form of an integer may be stored as context information associated with the event. When the generated event is a fuel warning, a Distance To Empty (DTE) may be stored in integer form as the event related context information. When the generated event is a sensor error, a sensor name in the form of text may be stored as context information associated with the event.

The context information collection manager 112b can collect the required context information via the context information collector 112a and send a confirmation signal to the context understanding part 112c. The context understanding part 112c can acquire the required context information from the context information DB 142, the long term memory 143, or the short term memory 144 and then transmit the context information together with the action information to the dialog input manager 111c.

The dialog input manager 111c can input the output of the context understanding portion 112c to the dialog manager 120.

Hereinafter, a case where the dialogue system 100 outputs a preliminary utterance by itself before an utterance of a user is input will be described.

Fig. 21 is a control block diagram showing a dialogue system suitable for a case where the dialogue system first outputs an utterance before receiving a user input, and fig. 22A, 22B, and 22C are diagrams showing examples of information stored in the pre-utterance condition table.

Referring to fig. 21, the input processor 110 of the dialogue system 100 may further include a pre-utterance determiner 151, which determines whether the current context is a pre-utterance context, and a repetitive task processor 152. The storage device 140 may further include a pre-utterance condition table 145a storing pre-utterance conditions, and a task processing DB 145b.

The data stored in the context information DB 142, the long term memory 143, and the short term memory 144 may be transmitted to the pre-utterance determiner 151. The pre-utterance determiner 151 may analyze the transmitted data and determine whether the transmitted data satisfies the pre-utterance conditions stored in the pre-utterance condition table 145a.

Referring to the example of fig. 22A, in the pre-utterance condition table 145a, pre-utterance conditions related to context information and pre-utterance messages output when the corresponding pre-utterance conditions are satisfied may be stored for each context information.

The pre-utterance determiner 151 may determine that it is a pre-utterance context and generate a pre-utterance trigger signal when the context information transmitted from the context information DB 142 satisfies a pre-utterance condition.

The pre-utterance determiner 151 may send the pre-utterance trigger signal and the pre-utterance message corresponding to the corresponding pre-utterance context to the context understanding portion 112c. Further, the pre-utterance determiner 151 may transmit information related to the corresponding pre-utterance context. The information related to the corresponding pre-utterance context may include a pre-utterance condition corresponding to the corresponding pre-utterance context or an action corresponding to the pre-utterance context, which is described later.

For example, the pre-utterance condition may be satisfied when the context information relates to the tire air pressure and the tire air pressure is equal to or less than a predetermined reference value. When the pre-utterance condition related to the tire air pressure is satisfied, the pre-utterance determiner 151 may determine that a pre-utterance context caused by insufficient tire air pressure has occurred, and generate a pre-utterance trigger signal.

The pre-utterance determiner 151 may transmit the pre-utterance trigger signal and the pre-utterance message to the context understanding portion 112c. For example, in a pre-utterance context caused by insufficient tire air pressure, a pre-utterance message indicating that the tire air pressure is low (such as "The tire pressure is too low") may be sent to the context understanding portion 112c.

Further, the pre-utterance condition may be satisfied when the context information relates to the engine temperature and the engine temperature is equal to or higher than a predetermined reference value. When the pre-utterance condition related to the engine temperature is satisfied, the pre-utterance determiner 151 may determine that a pre-utterance context caused by an abnormal engine temperature has occurred, and generate a pre-utterance trigger signal.

The pre-utterance determiner 151 may transmit the pre-utterance trigger signal and the pre-utterance message to the context understanding portion 112c. For example, in a pre-utterance context caused by an abnormal engine temperature, a pre-utterance message indicating that the engine is overheated (e.g., "The engine temperature is too high") may be sent to the context understanding portion 112c.

Further, the pre-utterance condition may be satisfied when the context information relates to the remaining amount of gasoline and the remaining amount of gasoline is equal to or less than a predetermined reference value. When the user has set a destination using the navigation service of the vehicle, the predetermined reference value may be set based on the distance from the current location to the destination. When no destination is set, a default value may be applied as the reference value; for example, a value smaller than the reference value for turning on the low-fuel warning lamp may be set as the reference value of the pre-utterance condition related to the insufficient remaining amount of gasoline. When the pre-utterance condition related to the remaining amount of gasoline is satisfied, the pre-utterance determiner 151 may determine that a pre-utterance context caused by the insufficient remaining amount of gasoline has occurred, and generate a pre-utterance trigger signal.

The pre-utterance determiner 151 may transmit the pre-utterance trigger signal and the pre-utterance message to the context understanding portion 112c. For example, in a pre-utterance context caused by the insufficient remaining amount of gasoline, a pre-utterance message indicating that the remaining amount of gasoline is insufficient (e.g., "The remaining fuel is insufficient to reach the destination") may be sent to the context understanding portion 112c.

However, the pre-utterance conditions and the pre-utterance messages shown in fig. 22A are only examples that may be applied to the dialog system 100. In the above examples, the case where the pre-utterance message corresponding to the pre-utterance context is content notifying the current situation has been described. However, the dialog system 100 may also first suggest a specific function or service needed in the pre-utterance context.

Referring to fig. 22B, when the pre-utterance context is caused by insufficient tire air pressure or an abnormal engine temperature, a pre-utterance message that proactively suggests a repair shop reservation service, such as "do you want to reserve a repair shop?", may be stored.

Further, when the pre-utterance context is caused by an insufficient remaining amount of gasoline, a pre-utterance message that proactively suggests a gas station guidance service, such as "do you want guidance to a gas station?", may be stored.

Further, the pre-utterance condition may be satisfied when the pre-utterance context is caused by the interior temperature of the vehicle and the interior temperature of the vehicle is out of a predetermined reference range. When the pre-utterance condition of the vehicle interior temperature is satisfied, the context understanding portion 112c may determine that the pre-utterance context is caused by an abnormality of the vehicle interior temperature, and generate a pre-utterance trigger signal.

In a pre-utterance context caused by an abnormality of the vehicle interior temperature, a pre-utterance message that proactively suggests an interior temperature control function, for example, "do you want to operate the air conditioner?", may be stored.

When the pre-utterance condition related to the microphone input is satisfied, the context understanding part 112c may determine that it is a pre-utterance context for changing the mood and generate a pre-utterance trigger signal; accordingly, a pre-utterance message that proactively suggests a multimedia play service, for example, "do you want to play music?", may be stored.

Further, the pre-utterance condition may be satisfied when the context information relates to the opening and closing of the window and to whether it is raining, and the window is open while it is raining. When the window is open and it is raining, the context understanding portion 112c can determine that the pre-utterance context is caused by the window being open and generate a pre-utterance trigger signal.

In a pre-utterance context caused by the window being open, a pre-utterance message that proactively suggests a window closing function, such as "do you want to close the window?", may be stored.

In the above-described examples of fig. 22A and 22B, the case where the pre-utterance message corresponding to each pre-utterance context is stored in advance in the pre-utterance condition table 145a has been described. However, the example of the dialog system 100 is not limited thereto, and an action corresponding to the pre-utterance context may be stored in advance instead.

As described above, when an utterance of a user is input, the natural language understanding part 111b may refer to the domain/action inference rule DB 141 to extract an action corresponding to the utterance of the user. When the dialog system 100 outputs a pre-utterance, an action corresponding to the pre-utterance context may be stored in advance for each pre-utterance context, as shown in fig. 22C.

For example, when the pre-utterance context is caused by an abnormality in the tire air pressure or the engine temperature, "service shop guidance" may be stored as the corresponding action, and when the pre-utterance context is caused by a shortage of the remaining amount of gasoline, "gas station guidance" may be stored as the corresponding action.

Further, when the pre-utterance context is caused by an abnormality of the vehicle interior temperature, "air-conditioning operation" may be stored as the corresponding action, and when the pre-utterance context is for changing the mood, "multimedia play" may be stored as the corresponding action. When the pre-utterance context is caused by the window being open, "opening and closing of the window" may be stored as the corresponding action.

As described above, when the action corresponding to the pre-utterance context is stored in advance, the pre-utterance trigger signal and the action corresponding to the pre-utterance context may be transmitted to the context understanding part 112c, and the dialog input manager 111c may input the pre-utterance trigger signal and the action corresponding to the pre-utterance context to the dialog manager 120. In this case, the same operation as in the case where a user utterance is input may be performed in the dialog manager 120.

As another example, in the pre-utterance condition table 145a, the pre-utterance context may be stored in such a manner that the pre-utterance context matches a virtual user utterance corresponding to each pre-utterance context, and the pre-utterance determiner 151 may generate the virtual user utterance corresponding to the pre-utterance context. The pre-utterance determiner 151 may transmit the virtual user utterance stored in the pre-utterance condition table 145a or generated by the pre-utterance determiner 151 to the natural language understanding section 111b in text form. For example, when the pre-utterance context is caused by an abnormality in the tire pressure, a virtual user utterance such as "check tire pressure" or "guide to a repair shop" may be stored or generated. Further, when the pre-utterance context is caused by an abnormality of the vehicle interior temperature, a virtual user utterance such as "turn on the air conditioner" may be stored or generated.
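
As a rough, non-limiting sketch, the two alternatives described above (storing an action per pre-utterance context as in fig. 22C, or storing or generating a virtual user utterance) could be represented as follows; the keys, action names, and utterance strings are paraphrased assumptions, not the actual stored entries.

    # Hypothetical mapping from pre-utterance contexts to pre-stored actions (fig. 22C style).
    PRE_UTTERANCE_ACTIONS = {
        "tire_pressure_abnormal": "service_shop_guidance",
        "engine_temperature_abnormal": "service_shop_guidance",
        "fuel_remaining_low": "gas_station_guidance",
        "interior_temperature_abnormal": "air_conditioner_operation",
        "mood_change": "multimedia_play",
        "window_open_in_rain": "window_open_close",
    }

    # Hypothetical virtual user utterances, sent to natural language understanding instead.
    VIRTUAL_UTTERANCES = {
        "tire_pressure_abnormal": "check tire pressure",
        "interior_temperature_abnormal": "turn on the air conditioner",
    }

    def handle_pre_utterance(context, use_virtual_utterance):
        """Return either a pre-stored action or a virtual utterance for the context."""
        if use_virtual_utterance:
            return ("utterance", VIRTUAL_UTTERANCES.get(context))
        return ("action", PRE_UTTERANCE_ACTIONS.get(context))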

Further, according to the mobile gateway method in which the mobile device 400 serves as a gateway between the vehicle and the dialogue system 100, the dialogue system client 470 of the mobile device 400 may perform part of the operation of the pre-utterance determiner 151. In this case, the dialog system client 470 can generate a virtual user utterance corresponding to the pre-utterance context and send the virtual user utterance to the natural language understanding section 111 b.

The natural language understanding part 111b may extract a domain and an action corresponding to the transmitted virtual user utterance and transmit the domain and the action to the dialog input manager 111 c. The action extracted by the natural language understanding section 111b may be an action corresponding to the pre-utterance context. The processing performed after the action corresponding to the pre-utterance context is transmitted to the dialog manager 120 may be performed in the same manner as in the case where the user utters first.

The above-described context information, pre-utterance conditions, pre-utterance messages, and actions are merely examples of implementations that may be applied to the dialog system 100, but the implementations of the dialog system 100 are not limited in this regard. In addition, various other context information, pre-utterance conditions, pre-utterance messages, and actions may be stored.

When the pre-utterance determiner 151 sends the pre-utterance trigger signal and the pre-utterance context related information to the context understanding portion 112c, the context understanding portion 112c may send the pre-utterance context related information to the repetitive task processor 152.

The repetitive task processor 152 can determine whether a task associated with the currently occurring pre-utterance context has already been processed or whether it is a repetitive task.

In the task processing DB 145b, information related to tasks that have already been processed or are currently being processed may be stored. For example, a dialog history (including the dialog content and each dialog time), the vehicle state, whether a task was completed, and the like may be stored. Further, tasks processed irrespective of a dialog, such as route guidance using the navigation function, and the results of such processing may also be stored.

Specifically, when the pre-utterance context is caused by a shortage of the remaining amount of gasoline, the repetitive task processor 152 may determine whether a gas station guidance task is currently being processed based on the information stored in the task processing DB 145b. When a dialog for gas station guidance is currently being conducted or a gas station guidance action is currently being performed, the repetitive task processor 152 can determine that the task associated with the current pre-utterance context is a repetitive task and terminate the pre-utterance context.

Further, when an utterance for gas station guidance was previously output and there is a dialog history in which the user rejected gas station guidance, the repetitive task processor 152 may determine that the task related to the current pre-utterance context is a repetitive task and terminate the pre-utterance context.

Further, when a gas station guidance task using the navigation function is currently being processed, even without a dialog history of gas station guidance, the repetitive task processor 152 may determine that the task related to the current pre-utterance context is a repetitive task and terminate the pre-utterance context. The repetitive task processor 152 can recognize that a gas station guidance task using the navigation function is currently being processed based on the information stored in the task processing DB 145b.

Further, when the reference time period has not elapsed since the dialog related to the guidance of the remaining amount of gasoline was conducted, it may be assumed that the user is driving to a gas station on his or her own even though gas station guidance is not currently being performed. Thus, the repetitive task processor 152 can determine that the task associated with the current pre-utterance context is a repetitive task and terminate the pre-utterance context.

Further, in a state where the pre-utterance context is for notifying a schedule based on information stored in the long-term memory 143 (such as the user's birthday or a family member's birthday), when there is a history of a dialog in which the same schedule was previously notified and the reference time period has not elapsed from the time when the corresponding dialog was conducted, the repetitive task processor 152 may determine that the task related to the current pre-utterance context is a repetitive task and terminate the pre-utterance context.

That is, the repetitive task processor 152 may determine, based on the dialog history stored in the task processing DB 145b, whether the pre-utterance was previously output and what the user's intention regarding the pre-utterance context was. The repetitive task processor 152 may determine whether it is a repetitive task based on the stored dialog time, the user's intention, the vehicle state, or whether the task has been completed.

In the repetitive task processor 152, a policy for determining whether it is a repetitive task (i.e., whether to terminate the pre-utterance context) based on the information stored in the task processing DB 145b may be stored. The repetitive task processor 152 may determine whether the task associated with the current pre-utterance context is a repetitive task according to the stored policy, and when it is determined to be a repetitive task, the repetitive task processor 152 may terminate the pre-utterance context.
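
One possible sketch of such a policy check is shown below; the record fields, status values, and the one-hour reference period are illustrative assumptions and do not represent the actual stored policy.

    # Repetition test over task records in the spirit of the policy described above.
    import time

    REFERENCE_PERIOD_S = 3600  # assumed reference period

    def is_repetitive_task(task_records, task_name, now=None):
        """Return True when the pre-utterance context should be terminated as repetitive."""
        now = time.time() if now is None else now
        for record in task_records:
            if record["task"] != task_name:
                continue
            if record["status"] == "in_progress":       # same task currently handled
                return True
            if record["status"] == "rejected_by_user":  # user previously declined it
                return True
            if now - record["time"] < REFERENCE_PERIOD_S:  # handled too recently
                return True
        return False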

In the above example, the case where the dialog system 100 includes the pre-utterance determiner 151, the repetitive task processor 152, the pre-utterance condition table 145a, and the task processing DB 145b has been described.

However, the example of the dialog system 100 is not limited thereto, and thus the operations of the above-described components may be performed using the components shown in fig. 19A and 19B.

For example, the context understanding portion 112c may perform the operation of the pre-utterance determiner 151 corresponding to determining whether the pre-utterance condition is satisfied, and the operation of the repetitive task processor 152 corresponding to processing the repetitive task.

The information stored in the pre-utterance condition table 145a may be stored in the context understanding table 145, and the information stored in the task processing DB 145b may be stored in a dialog and action state DB 147, which will be described later.

Fig. 23 is a control block diagram showing the configuration of the dialog manager in detail, fig. 24 is a diagram showing an example of information stored in the relational action DB, fig. 25 is a diagram showing an example of information stored in the action execution condition DB, and fig. 26 is a diagram showing an example of information stored in the action parameter DB.

Referring to fig. 23, the dialog manager 120 may include a dialog flow manager 121, a dialog action manager 122, an ambiguity resolver 123, a parameter manager 124, an action priority determiner 125, and an external information manager 126, the dialog flow manager 121 requesting generation, deletion, and update of a dialog or an action; the dialog action manager 122 generates, deletes, and updates a dialog or action according to a request of the dialog flow manager 121; the ambiguity resolver 123 clarifies the user's intention by resolving ambiguities of contexts and dialogs; the parameter manager 124 manages parameters required for action execution; the action priority determiner 125 determines whether each of a plurality of candidate actions is executable; the external information manager 126 manages an external content list and related information, and manages parameter information for external content queries.

The dialog manager 120 may include a memory in which a program for performing the above-described operations and the operations described later is stored, and a processor configured to execute the stored program. At least one memory and at least one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on one chip or physically separated.

Each of the components present in dialog manager 120 may be implemented by the same processor or may be implemented by separate processors.

Further, the dialog manager 120 and the input processor 110 may be implemented by the same processor, or may be implemented by separate processors.

When a user utterance is input or when a user utterance that matches the pre-utterance context is transmitted to the natural language understanding section 111b, the dialog input manager 111c may transmit a result of natural language understanding (output of the natural language understanding section) and context information (output of the context understanding section) to the dialog flow manager 121. In addition, the dialog input manager 111c can send a pre-vocalized trigger signal when a pre-vocalization context occurs.

The output of the natural language understanding section 111b may include information related to the content of the user's utterance (e.g., the morphological analysis result), as well as information such as the domain and the action. The output of the context understanding portion 112c may include the event determined by the context information collection manager 112b as well as the context information.

The dialog flow manager 121 can search the dialog and action state DB 147 for whether there is a dialog task or an action task corresponding to the input of the dialog input manager 111 c.

The dialog and action state DB 147 may be a storage space for managing dialog states and action states, and thus the dialog and action state DB 147 may store dialogs and actions currently in progress, and dialog states and action states related to preliminary actions to be processed. For example, the dialog and action state DB 147 may store states related to completed dialogs and actions, stopped dialogs and actions, ongoing dialogs and actions, and pending dialogs and actions.

The dialog and action state DB 147 may store whether an action is switched or nested, the switched action index, the action change time, and the final output state related to the screen/voice/instruction output.

For example, when a domain and an action corresponding to a user utterance are extracted and there is a dialog and an action corresponding to that domain and action among the recently stored dialogs, the dialog and action may be determined as the dialog task or action task corresponding to the input from the dialog input manager 111c.

When the domain and the action corresponding to the user utterance are not extracted, a random task may be generated, or the dialog action manager 122 may be requested to refer to the most recently stored task.

When there is no dialog task or action task corresponding to the input of the input processor 110 in the dialog and action state DB 147, the dialog flow manager 121 may request the dialog action manager 122 to generate a new dialog task or action task.

Further, when the trigger signal of a pre-utterance is transmitted from the input processor 110, although there is a dialog task or an action task currently being executed, the dialog task or the action task may be temporarily stopped, and a dialog task or an action task corresponding to the pre-utterance context may be generated first. Further, which task is given priority may be selected according to established rules.

When the pre-vocalized trigger signal and the action corresponding to the pre-vocalized trigger signal are input from the dialog input manager 111c, the dialog flow manager 121 may request the dialog action manager 122 to generate a new dialog task or action task in the same manner as the case of acquiring the action from the user utterance.

Further, when the pre-spoken trigger signal and the pre-spoken message corresponding to the pre-spoken trigger signal are input from the dialog input manager 111c, the dialog flow manager 121 may request the dialog action manager 122 to generate a new dialog task or action task for outputting the input pre-spoken message.

When the dialog flow manager 121 manages the dialog flow, the dialog flow manager 121 may refer to the dialog policy DB 148. The dialog policy DB 148 may store policies for continuing a dialog, wherein the policies may represent policies for selecting, starting, suggesting, stopping, and terminating a dialog.

In addition, the dialog policy DB 148 may store policies regarding the time point at which the system outputs a response and the methodology of the output. The dialog policy DB 148 may store a policy for generating a response by linking a plurality of services, and may store a policy for deleting a previous action and replacing it with another action.

For example, two policies may be allowed, where the two policies may include a policy that generates a response for two actions at once (e.g., "Do you want to perform action B after performing action A?"), and a policy that generates a separate response for one action after generating a response for the other action (e.g., "Action A has been performed." followed by "Do you want to perform action B?").

The dialog policy DB 148 may also store policies for determining priorities among candidate actions. The priority determination policy will be described later.

The dialog action manager 122 may allocate memory space in the dialog and action state DB 147 and generate a dialog task and an action task corresponding to the output of the input processor 110.

The dialog action manager 122 may generate a random dialog state when domains and actions cannot be extracted from the user's utterance. In this case, as described later, the ambiguity resolver 123 may recognize the user's intention based on the content of the utterance of the user, environmental conditions, vehicle states, and user information, and determine an action suitable for the user's intention.

When there is a dialog task or an action task corresponding to the output of the input processor 110 in the dialog and action state DB 147, the dialog flow manager 121 may request the dialog action manager 122 to refer to the corresponding dialog task or action task.

The action priority determiner 125 may search the relational action DB 146b for a list of actions related to the action or event contained in the output of the input processor 110, and then the action priority determiner 125 may extract candidate actions. As shown in fig. 24, the relational action DB 146b may store actions related to each other and the relationships between the actions, as well as actions related to events and the relationships between the events and the actions. For example, route guidance, a vehicle status check, and a gas station recommendation may be classified as actions related to each other, and the relationships among them may correspond to associations.

Therefore, when performing route guidance, the vehicle status check and the gas station recommendation can be performed together. In this case, "to be executed together" may include a case where the vehicle state check and the gas station recommendation are executed before or after the route guidance and a case where the vehicle state check and the gas station recommendation are executed during the route guidance (for example, added as a stopover).

A warning lamp output event may be stored as an event related to the service shop guide action, and the relationship between them may also correspond to an association.

When a warning lamp output event occurs, a service shop guide action may be performed according to the type of warning lamp or whether service is required.

When the input processor 110 transmits an action corresponding to the utterance of the user together with the event determined by the context information collection manager 112b, the action related to the action corresponding to the utterance of the user and the action related to the event may become candidate actions.
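
As a simplified, non-limiting illustration of this candidate-action extraction, a relational-action lookup in the spirit of fig. 24 might be sketched as follows; the table contents and names are assumptions.

    # Illustrative relational-action table and candidate-action extraction.
    RELATIONAL_ACTIONS = {
        "route_guidance": ["vehicle_status_check", "gas_station_recommendation"],
        "warning_lamp_output_event": ["service_shop_guidance"],
    }

    def extract_candidate_actions(action=None, event=None):
        """Collect the input action plus every action related to the action or the event."""
        candidates = []
        if action:
            candidates.append(action)
            candidates.extend(RELATIONAL_ACTIONS.get(action, []))
        if event:
            candidates.extend(RELATIONAL_ACTIONS.get(event, []))
        return candidates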

The extracted candidate action list may be sent to the dialog action manager 122, and the dialog action manager 122 may update the action state of the dialog and action state DB 147 by adding the candidate action list.

The action priority determiner 125 may search the action execution condition DB 146c for a condition for executing each candidate action.

As shown in fig. 25, the action execution condition DB 146c may store, for each action, the conditions required to execute the action and the parameters used to determine whether the respective conditions are satisfied.

For example, the execution condition for the vehicle status check may be that the destination distance is equal to or greater than 100 km, and the parameter for determining the condition may correspond to the destination distance. The execution condition for the gas station recommendation may be that the destination distance is greater than the distance to empty (DTE), and the parameters for determining the condition may correspond to the destination distance and the distance to empty (DTE).
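
A minimal sketch of this execution-condition check is given below; the two conditions restate the examples just described, while the structure and names are assumptions rather than the actual contents of the action execution condition DB 146c.

    # Illustrative execution-condition table and check in the spirit of fig. 25.
    EXECUTION_CONDITIONS = {
        "vehicle_status_check":
            lambda p: p["destination_distance_km"] >= 100,
        "gas_station_recommendation":
            lambda p: p["destination_distance_km"] > p["distance_to_empty_km"],
    }

    def is_executable(action, condition_params):
        """Return True when the stored execution condition for the action is satisfied."""
        condition = EXECUTION_CONDITIONS.get(action)
        return True if condition is None else condition(condition_params)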

The action priority determiner 125 may send the execution conditions of the candidate actions to the dialogue action manager 122, and the dialogue action manager 122 may add the execution conditions and update the action state of the dialogue and action state DB 147 according to each candidate action.

The action priority determiner 125 may search the context information DB 142, the long term memory 143, the short term memory 144, or the dialogue and action state DB 147 for parameters (hereinafter, referred to as condition determination parameters) required to determine action execution conditions, and determine whether a candidate action can be executed using the searched parameters.

When the parameters for determining the action execution condition are not stored in the context information DB 142, the long term memory 143, the short term memory 144, or the dialog and action state DB 147, the action priority determiner 125 may acquire the required parameters from the external content server 300 via the external information manager 126.

The action priority determiner 125 may determine whether the candidate action may be performed using the parameter for determining the action performing condition. Further, the action priority determiner 125 may determine the priority of the candidate action based on whether to execute the candidate action and the priority determination rule stored in the dialog policy DB 148.

A score for each candidate action may be calculated based on the current situation, and a higher priority may be given to a candidate action having a higher calculated score. For example, the following parameters may be used to calculate the score: whether the action corresponds to the user utterance, a safety score, a convenience score, a processing time, a processing time point (whether immediate processing is needed), a user preference (the user's acceptance level when the service was suggested, or a preference predetermined by the user), an administrator score, a score related to the vehicle state, and an action success rate (dialog success rate), as shown in the following equation 1. w1, w2, w3, w4, w5, w6, w7, w8, and w9 represent the weight values of the respective parameters.

[Equation 1]

Priority score = w1 × user utterance action + w2 × safety score + w3 × convenience score + w4 × processing time + w5 × processing time point + w6 × user preference + w7 × administrator score + w8 × score related to vehicle state + w9 × action success rate × possibility of action execution (1: possible or not yet known, 0: impossible) × action completion state (completed: 1, not completed: 0)
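
Read literally, equation 1 might be transcribed as the following sketch; the weight and parameter names are placeholders, and, as written above, the execution-possibility and completion factors multiply the w9 term.

    # Literal transcription of equation 1 with placeholder names.
    def priority_score(p, w):
        return (w["w1"] * p["user_utterance_action"]
                + w["w2"] * p["safety_score"]
                + w["w3"] * p["convenience_score"]
                + w["w4"] * p["processing_time"]
                + w["w5"] * p["processing_time_point"]
                + w["w6"] * p["user_preference"]
                + w["w7"] * p["administrator_score"]
                + w["w8"] * p["vehicle_state_score"]
                + w["w9"] * p["action_success_rate"]
                * p["execution_possibility"]   # 1: possible or not yet known, 0: impossible
                * p["completion_state"])       # 1: completed, 0: not completed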

As described above, the action priority determiner 125 may provide the user with the most needed service by searching for the action directly associated with the user's utterance, the context information, and the list of actions related thereto, and by determining the priorities among them.

The action priority determiner 125 may transmit the likelihood and priority of candidate action execution to the dialog action manager 122, and the dialog action manager 122 may update the action state of the dialog and action state DB 147 by adding the transmitted information.

The parameter manager 124 may search the action parameter DB 146a for a parameter (hereinafter, referred to as an action parameter) for performing each candidate action.

As shown in fig. 26, the action parameter DB 146a may store, for each action, necessary parameters, alternative parameters, initial values of the parameters, and reference positions from which the parameters are acquired. In a state where the initial value of a parameter is stored, when there is no parameter value corresponding to the parameter in the user utterance and the context information output from the input processor 110, and when there is no parameter value in the context information DB 142, an action may be performed according to the stored initial value, or whether to perform the action according to the stored initial value may be confirmed with the user.

For example, the necessary parameters for route guidance may include the current location and the destination, and the alternative parameters may include the route type. The initial value of the alternative parameter may be stored as the fast route. The current location and the destination can be acquired by searching the dialog and action state DB 147, the context information DB 142, the short-term memory 144, or the long-term memory 143 in that order.

The necessary parameters for the vehicle status check may include the vehicle state information, and the alternative parameters may include the portion to be checked (hereinafter referred to as the "check portion"). The entire portion may be stored as the initial value of the alternative parameter. The vehicle state information can be acquired from the context information DB 142.

The alternative parameters for the gas station recommendation may include a favorite gas station, and "A-Oil" may be stored as the initial value of the alternative parameter. The favorite gas station can be acquired from the long-term memory 143. The alternative parameters may further include the fuel type and the fuel price of the vehicle.

As described above, the parameter manager 124 may acquire the values of the parameters found in the action parameter DB 146a from their corresponding reference positions. The reference position from which a parameter value is acquired may be at least one of the context information DB 142, the short-term memory 144, the long-term memory 143, the dialog and action state DB 147, and the external content server 300.
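
As an illustration only, a parameter record and a lookup of this kind might be sketched as follows; the field names, initial values, and lookup order are assumptions in the spirit of fig. 26, not the actual stored data.

    # Hypothetical action-parameter records and parameter resolution.
    ACTION_PARAMETERS = {
        "route_guidance": {
            "necessary": ["current_location", "destination"],
            "alternative": {"route_type": "fast_route"},   # stored initial value
        },
        "vehicle_status_check": {
            "necessary": ["vehicle_state_info"],
            "alternative": {"check_portion": "all"},
        },
        "gas_station_recommendation": {
            "necessary": [],
            "alternative": {"favorite_gas_station": "A-Oil"},
        },
    }

    def resolve_parameters(action, *reference_positions):
        """Fill parameters from the given sources in order, falling back to initial values."""
        spec = ACTION_PARAMETERS[action]
        resolved = dict(spec["alternative"])               # start from the initial values
        for name in spec["necessary"] + list(spec["alternative"]):
            for source in reference_positions:             # e.g. dialog/action state DB,
                if name in source:                         # context info DB, memories, ...
                    resolved[name] = source[name]
                    break
        return resolved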

The parameter manager 124 may obtain parameter values from the external content server 300 through the external information manager 126. The external information manager 126 can determine where to acquire information by referring to the external service set DB 146 d.

The external service set DB 146d may store information related to the external content servers connected to the dialog system 100. For example, the external service set DB 146d may store the external service name, a description of the external service, the type of information provided by the external service, the method of using the external service, and the subject providing the external service.

The parameter values acquired by the parameter manager 124 may be sent to the dialog action manager 122, and the dialog action manager 122 may update the dialog and action state DB 147 by adding the parameter values of the candidate actions to the action state.

The parameter manager 124 may acquire the parameter values of all candidate actions, or the parameter manager 124 may acquire only the parameter values of the candidate actions determined to be executable by the action priority determiner 125.

The parameter manager 124 may also selectively use one of different types of parameter values indicating the same information. For example, "Seoul Station" in text form, indicating a destination, may be converted into "Seoul Station" in POI form by using the destination search service of the navigation system.

When there is no ambiguity in the dialog and context, the above-described operations according to the action priority determiner 125, the parameter manager 124, and the external information manager 126 can acquire required information and manage the dialog and action. When there is ambiguity in the dialog and context, it may be difficult to provide a service desired by the user using only the operations of the action prioritizer 125, the parameter manager 124, and the external information manager 126.

In this case, the ambiguity resolver 123 can handle ambiguity in the dialog or in the context. For example, when an anaphoric expression is included in a dialog (e.g., "that person," "that place from yesterday," "father," "mother," "grandmother," or "daughter"), there may be ambiguity because the person or object that the expression refers to is unclear. In this case, the ambiguity resolver 123 may resolve the ambiguity by referring to the context information DB 142, the long-term memory 143, or the short-term memory 144, or provide guidance for resolving the ambiguity.

For example, ambiguous words contained in utterances such as "the place from yesterday," "the market near the house," and "the Seoul Station I went to yesterday" may correspond to parameter values of the action parameters or parameter values of the condition determination parameters. However, in this case, due to the ambiguity of the words, an actual action cannot be performed and an action execution condition cannot be determined by using the corresponding words as they are.

The ambiguity resolver 123 may resolve the ambiguity of such values by referring to the information stored in the context information DB 142, the long-term memory 143, or the short-term memory 144. The ambiguity resolver 123 may also obtain the required information from the external content server 300 by using the external information manager 126, as needed.

For example, the ambiguity resolver 123 may search for where the user went yesterday by referring to the short-term memory 144 and convert "the place from yesterday" into information usable as the destination of the route guidance action. The ambiguity resolver 123 can also search for the user's house address by referring to the long-term memory 143 and acquire location information related to the A market near the user's house address from the external content server 300, thereby converting "the market near the house" into information usable as the destination of the route guidance action.
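
For illustration only, such a conversion of a referring expression into a usable destination might be sketched as follows; the memory keys and the external lookup callable are assumptions, not the actual interfaces of the system.

    # Hypothetical reference resolution using the memories and an external POI lookup.
    def resolve_reference(phrase, short_term_memory, long_term_memory, external_lookup):
        """Map a vague referring expression to a concrete destination value."""
        if phrase == "the place from yesterday":
            return short_term_memory.get("yesterday_destination")
        if phrase == "the market near the house":
            home = long_term_memory.get("home_address")
            return external_lookup("A market", near=home)  # e.g. a POI search service
        return phrase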

When the input processor 110 does not clearly extract the action (object and operator) or when the user's intention is unclear, the ambiguity resolver 123 can recognize the user's intention by referring to the ambiguity resolution information DB 146e and determine the action corresponding to the recognized intention.

Fig. 27 is a table showing an example of information stored in the ambiguity resolution information DB.

Based on the vehicle state information and the surrounding environment information, the ambiguity resolution information DB 146e may store utterances matched with the actions corresponding to the utterances. The utterances stored in the ambiguity resolution information DB 146e may be utterances from which an action cannot be extracted by natural language understanding. Fig. 27 shows a case where the content of the utterance according to the morphological analysis result is "hands are extremely cold" or "hands are cold".

The surrounding environment information may include an external temperature of the vehicle and whether it is raining, and the vehicle state information may include on/off of an air conditioner and a heater, an air volume and a wind direction of the air conditioner, and on/off of an electric heating wire of a steering wheel.

Specifically, in a state where the outside temperature exceeds 20 degrees while it is raining, when the air conditioner is turned ON (ON), it can be recognized that the air conditioner temperature setting is low, and thus "raise the air conditioner temperature by 3 degrees" can be stored as a vehicle control action corresponding thereto.

In a state where the outdoor temperature exceeds 20 degrees while raining, when the air conditioner is turned OFF (OFF), it can be recognized that the user feels cold due to raining, and thus the "heater on" can be stored as a vehicle control action corresponding thereto.

In a state where the outside temperature exceeds 20 degrees while there is no rain, when the air conditioner is turned ON (ON) and the wind direction of the air conditioner is upward, it can be recognized that the hand is extremely cold because the wind of the air conditioner directly affects the hand, and thus "change the wind direction of the air conditioner downward" can be stored as a vehicle control action corresponding thereto.

In a state where the outside temperature exceeds 20 degrees while there is no rain, when the air conditioner is turned ON (ON), the wind direction of the air conditioner is downward, and the air volume is set to be larger than the middle range, it can be recognized that the user feels cold due to the air volume of the air conditioner being excessively large, and thus "reducing the air volume of the air conditioner" can be stored as a vehicle control operation corresponding thereto.

In a state where the outside temperature exceeds 20 degrees while there is no rain, when the air conditioner is turned ON (ON), the wind direction of the air conditioner is downward, and the air volume is set weak, "increase the air conditioner temperature by 3 degrees" may be stored as a vehicle control action corresponding thereto.

In a state where the outside temperature is lower than 20 degrees, when the heater is turned OFF (OFF), it can be recognized that the hand is extremely cold due to cold weather, and thus "turn on the heater" can be stored as a vehicle control action corresponding thereto.

In a state where the external temperature is lower than 20 degrees, when the heater is turned ON (ON) and the steering wheel heating wire is turned off, it can be recognized that the hand is extremely cold because hot air is not transferred to the hand, and thus "turn on the steering wheel heating wire" can be stored as a vehicle control action corresponding thereto.

In a state where the external temperature is lower than 20 degrees, when the heater and the steering wheel heating wire are turned ON (ON) and the wind direction of the heater is downward, it can be recognized that the hand is extremely cold because the wind of the heater is not transferred to the hand, and thus "change the wind direction of the heater to bi-directional" can be stored as a vehicle control action corresponding thereto.

In a state where the external temperature is lower than 20 degrees, the heater and the steering wheel heating wire are turned ON (ON), and the wind direction of the heater is upward, when the heater temperature is set to be lower than the maximum, "increase the temperature of the heater" may be stored as a vehicle control action corresponding thereto.

In a state where the external temperature is lower than 20 degrees, the heater and the steering wheel heating wire are turned ON (ON), the wind direction of the heater is upward, and the heater temperature is set to the highest, when the air volume of the heater is not set to the highest, "increase the air volume of the heater" may be stored as a vehicle control action corresponding thereto.

In a state where the external temperature is lower than 20 degrees, the heater and the steering wheel heating wire are turned ON (ON), the wind direction of the heater is upward, and the heater temperature and the air volume of the heater are set to the highest, when the seat heating wire is turned off, "turn on the seat heating wire" may be stored as a vehicle control action corresponding thereto.

In a state where the external temperature is lower than 20 degrees, the heater and the steering wheel heating wire are turned ON (ON), the wind direction of the heater is upward, and the heater temperature and the air volume of the heater are set to the highest, when the seat heating wire is turned on, a notification such as "please wait a moment because the heater is now in full operation" may be stored as a vehicle control action corresponding thereto.
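
As a condensed, non-limiting restatement of a few of the branches above, a rule of this kind for the "hands are cold" utterance might be sketched as follows; the thresholds and field names are assumptions, and most branches of fig. 27 are collapsed or omitted.

    # Illustrative, heavily condensed rule for the "hands are cold" utterance.
    def action_for_cold_hands(outside_temp_c, raining, ac_on, heater_on, wheel_wire_on):
        if outside_temp_c > 20 and raining:
            return "raise the air conditioner temperature by 3 degrees" if ac_on else "turn on the heater"
        if outside_temp_c < 20:
            if not heater_on:
                return "turn on the heater"
            if not wheel_wire_on:
                return "turn on the steering wheel heating wire"
            return "adjust the heater wind direction, temperature, or air volume"
        return None  # remaining branches of fig. 27 are omitted in this sketch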

Fig. 28A and 28B are tables showing various examples of executing vehicle control as a result of the ambiguity resolver resolving the ambiguity by referring to the ambiguity resolving information DB and extracting the action.

For example, as shown in fig. 28A and 28B, in a state where the content of the utterance according to the morphological analysis result is "hands are extremely cold" or "hands are cold", when the surrounding environment is summer, and the vehicle state is such that the wind direction of the air conditioner is toward the passenger's head (upward), the air conditioner set temperature is 19 degrees, and the air volume of the air conditioner is high, it can be recognized that the hands are extremely cold because the wind of the air conditioner is directed at the hands. An air conditioning control action for reducing the air volume while changing the wind direction to the foot side (downward) may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is winter, and the vehicle state is such that the wind direction of the air conditioner is toward the passenger's feet, the air conditioner set temperature is 25 degrees, and the air volume of the air conditioner is in the high range, it can be recognized that the hands are extremely cold because hot air is not transferred to the hands. The action of "turning on the steering wheel heating wire" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

In a state where the content of the utterance according to the morphological analysis result is "stuffiness", when the vehicle speed is 30 km/h or less and the front-rear inter-vehicle distance is less than 30 cm, it can be recognized that the stuffiness is caused by heavy traffic. Accordingly, "change a route option (fast route guidance) in the route guidance action", "play multimedia content such as music", or "turn on a chat function" can be extracted as the action corresponding to the utterance, and the vehicle can be controlled according to the extracted action.

In a state where the content of the utterance according to the morphological analysis result is "drowsiness", when the vehicle state is the interior air circulation mode, it can be recognized that the drowsiness is caused by a lack of air circulation. Therefore, "change to the outside air mode" can be extracted as the action corresponding to the utterance, and the vehicle can be controlled according to the extracted action.

For an utterance having the same content, when the vehicle state is the outside air mode and the heater is turned ON (ON), it can be recognized that the drowsiness is caused by hot air discharged from the heater. "Opening the window" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

In a state where the content of the utterance according to the morphological analysis result is "sweating" or "hot", when the surrounding environment is winter and the heater is turned ON (ON), it can be recognized that the heat is caused by hot air discharged from the heater. Thus, "lowering the heater temperature" or "reducing the air volume" may be stored as the action corresponding to the utterance.

For an utterance having the same content, when the surrounding environment is winter and the heater is turned OFF (OFF), it can be recognized that the heat is caused by the body heat of the user. Therefore, "open the window" or "suggest opening the window" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is summer and the air conditioner is turned OFF (OFF), it can be recognized that the heat is caused by an increase in the interior temperature of the vehicle. Therefore, "turning on the air conditioner" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is summer and the air conditioner is turned ON (ON), it can be recognized that the heat is caused by a high air conditioner temperature setting. Therefore, "lowering the air conditioner temperature" or "increasing the air volume of the air conditioner" can be extracted as the action corresponding to the utterance, and the vehicle can be controlled according to the extracted action.

In a state where the content of the utterance according to the morphological analysis result is "cold", when the surrounding environment is summer and the air conditioner is turned ON (ON), it can be recognized that the cold is caused by the air conditioner temperature being set too low or by the wind of the air conditioner being too strong. Therefore, "increasing the air conditioner temperature" or "decreasing the air volume" can be extracted as the action corresponding to the utterance, and the vehicle can be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is summer and the air conditioner is turned OFF (OFF), it can be recognized that the cold is caused by the physical condition of the user. "Heater operation" or "checking the user's biorhythm" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is winter and the heater is turned ON (ON), it can be recognized that the cold is caused by a low heater temperature setting or a weak air volume. Therefore, "increasing the heater temperature" or "increasing the air volume" can be extracted as the action corresponding to the utterance, and the vehicle can be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is winter and the heater is turned OFF (OFF), it can be recognized that the cold is caused by the heater not being operated. "Heater operation" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

In a state where the content of the utterance according to the morphological analysis result is "headache", when the surrounding environment is winter and the heater is turned ON (ON), it can be recognized that the headache is caused by a lack of air circulation. Therefore, "change to the outside air mode" or "open the window" can be extracted as the action corresponding to the utterance, and the vehicle can be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is winter and the heater is turned OFF (OFF), it can be recognized that the headache is caused by the cold. "Heater operation" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is summer and the air conditioner is turned OFF (OFF), it can be recognized that the headache is caused by the heat. "Air conditioner operation" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is summer and the air conditioner is turned ON (ON), it can be recognized that the headache is caused by the air conditioner. "Changing the wind direction or the air volume of the air conditioner" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

In a state where the content of the utterance according to the morphological analysis result is "uncomfortable", when the surrounding environment is winter and it is raining, it can be recognized that the discomfort is caused by high humidity. Therefore, "defogging function operation" or "dehumidification function operation" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is summer and it is not raining, it can be recognized that the discomfort is caused by the seasonal characteristics and the heat. Therefore, "operating the air conditioner at the lowest temperature" can be extracted as the action corresponding to the utterance, and the vehicle can be controlled according to the extracted action.

For an utterance having the same content, when the surrounding environment is summer and it is raining, it can be recognized that the discomfort is caused by the heat and the high humidity. Therefore, "operating the air conditioner in the dehumidification mode" can be extracted as the action corresponding to the utterance, and the vehicle can be controlled according to the extracted action.

According to the operation of the above-described ambiguity resolver 123, even when there is ambiguity in the user's utterance or situation, the ambiguity resolver 123 can accurately identify the action actually desired or required by the user and provide it, by considering the user's utterance, the surrounding environment information, and the vehicle state information as a whole.

Information related to the action determined by the ambiguity resolver 123 may be transmitted to the dialogue action manager 122, and the dialogue action manager 122 may update the dialogue and action state DB 147 based on the transmitted information.

As described above, the action priority determiner 125 and the parameter manager 124 may determine action execution conditions with respect to the action determined by the ambiguity resolver 123, determine priorities thereof, and acquire parameter values.

When all of the parameter values used to perform each action that can be obtained through the current context and dialog have been acquired, the dialog action manager 122 may send a signal to the dialog flow manager 121.

When necessary parameter values for action execution and condition determination have to be acquired from the user because they are not present in the dialog and action state DB 147, the external content server 300, the long-term memory 143, the short-term memory 144, or the context information DB 142, the result processor 130 may generate a dialog response asking the user for the parameter values.

The dialog flow manager 121 may send information related to the action corresponding to the first priority action and the dialog state to the results processor 130. Further, the dialog flow manager 121 may send information related to a plurality of candidate actions according to the dialog policy.

When the dialog system 100 outputs a pre-utterance (i.e., a pre-utterance trigger signal is generated by the input processor 110), the dialog state transmitted to the result processor 130 may include the pre-utterance trigger signal. However, the pre-utterance trigger signal does not have to be present in the dialog state; any type of information indicating the pre-utterance context may be present in the dialog state. When information indicating the pre-utterance context is present in the dialog state, the result processor 130 may output the dialog response prior to, or together with, other types of responses.

When a pre-utterance message corresponding to the pre-utterance context is input from the dialog input manager 111c in a state where the dialog system 100 outputs the pre-utterance, the pre-utterance message may be transmitted to the result processor 130 without the above-described procedures of ambiguity resolution, parameter management, and action priority determination.

When an action corresponding to the pre-utterance context is input from the dialog input manager 111c in a state where the dialog system 100 outputs the pre-utterance, a pre-utterance message may be transmitted to the result processor 130 with or without the above-described procedure of the ambiguity resolution scheme, parameter management, and action priority determination.

Fig. 29 is a control block diagram showing the configuration of the result processor in detail.

Referring to fig. 29, the result handler 130 may include a response generation manager 131, a dialog response generator 132, an output manager 133, a service editor 134, a memory manager 135, and an instruction generator 136, the response generation manager 131 managing generation of a response required to perform an action input from the dialog manager 120; the dialog response generator 132 generates a response of a text, image or audio type according to the request of the response generation manager 131; the instruction generator 136 generates an instruction for vehicle control or providing service using external contents according to a request of the response generation manager 131; the service editor 134 continuously or intermittently executes a plurality of services and collects the results thereof to provide a service desired by a user; the output manager 133 outputs the generated text type response, image type response, or audio type response, outputs the instruction generated by the instruction generator 136, or determines the order of output when the output is plural; the memory manager 135 manages the long-term memory 143 and the short-term memory 144 based on the outputs of the response generation manager 131 and the output manager 133.

The result processor 130 may include a memory in which a program for performing the above-described operations and the operations described later is stored, and a processor configured to execute the stored program. At least one memory and at least one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on a single chip or physically separated.

Each of the components present in the results processor 130 may be implemented by the same processor or may be implemented by separate processors.

Further, the result processor 130, the dialog manager 120 and the input processor 110 may be implemented by the same processor, or may be implemented by separate processors.

The responses output in correspondence to the user's utterance or context may include a dialog response, vehicle control, and provision of external content. The dialog response may include an initial dialog, a question, and an answer including information. The dialog responses may be stored as a database in the response template 149.

The response generation manager 131 may request the dialog response generator 132 and the instruction generator 136 to generate a response required to perform an action, which is determined by the dialog manager 120. To this end, the response generation manager 131 may transmit information related to an action to be performed, which may include an action name and a parameter value, to the dialog response generator 132 and the instruction generator 136. When generating a response, dialog response generator 132 and instruction generator 136 may reference the current dialog state and action state.

The dialog response generator 132 may extract a dialog response template by searching the response template 149 and generate a dialog response by populating the extracted dialog response template with parameter values. The generated dialog response may be transmitted to the response generation manager 131. When the parameter values required to generate the dialog response are not transmitted from the dialog manager 120, or when an instruction to use external content is transmitted, the dialog response generator 132 may receive the parameter values from the external content server 300 or search the long-term memory 143, the short-term memory 144, or the context information DB 142.

For example, when the action determined by the dialog manager 120 corresponds to route guidance, the dialog response generator 132 may search the response template 149 and extract the dialog response template "From [current location: -] to [destination: -] it will take [duration: -]. Do you want to start guidance?".

Among the parameters that need to be populated in the dialog response template, [current location] and [destination] may be sent from the dialog manager 120, while the parameter value for [duration] may not be sent. In this case, the dialog response generator 132 may request the duration from [current location] to [destination] from the external content server 300.
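To make the template-filling step concrete, the following is a minimal Python sketch of how a response template might be selected and populated; the template text, parameter names, and the external-lookup function are illustrative assumptions and are not part of the embodiments described above.

    import re

    # Hypothetical response templates keyed by action name (illustrative only).
    RESPONSE_TEMPLATES = {
        "route_guidance": "From [current_location] to [destination] it will take "
                          "[duration]. Do you want to start guidance?",
    }

    def fill_template(action, params, fetch_missing):
        """Fill the dialog response template for `action` with `params`.

        `fetch_missing` stands in for a lookup against the external content
        server or the context information DB / long-term / short-term memory.
        """
        template = RESPONSE_TEMPLATES[action]
        needed = re.findall(r"\[(\w+)\]", template)          # parameters the template requires
        for name in needed:
            if name not in params:                           # e.g. [duration] was not sent
                params[name] = fetch_missing(name, params)   # ask an external source
        for name, value in params.items():
            template = template.replace(f"[{name}]", str(value))
        return template

    # Example usage with an assumed external lookup for the missing duration.
    def fake_external_lookup(name, params):
        return "30 minutes" if name == "duration" else "unknown"

    print(fill_template(
        "route_guidance",
        {"current_location": "the current location", "destination": "Seoul Station Exit 4"},
        fake_external_lookup,
    ))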

When the response to the user utterance or context includes vehicle control or providing external content, the instruction generator 136 may generate an instruction to perform vehicle control or provide external content. For example, when the action determined by the dialog manager 120 is to control an air conditioner, a window, and an AVN, the instruction generator 136 may generate an instruction to perform the control and then transmit the instruction to the response generation manager 131.

When the action determined by the dialog manager 120 requires the provision of external content, the instruction generator 136 may generate an instruction to receive corresponding content from the external content server 300 and then transmit the instruction to the response generation manager 131.

When the instruction generator 136 provides a plurality of instructions, the service editor 134 may determine the method and order of executing the plurality of instructions and transmit them to the response generation manager 131.

The response generation manager 131 may transmit the response transmitted from the dialog response generator 132, the instruction generator 136, or the service editor 134 to the output manager 133.

The output manager 133 can determine the output time, output sequence, and output position of the dialog response generated by the dialog response generator 132, and the instruction generated by the instruction generator 136.

The output manager 133 may output the response by: the dialog responses generated by the dialog response generator 132 and the instructions generated by the instruction generator 136 are sent to the appropriate output locations at the appropriate time and in the appropriate order. The output manager 133 may output a text-to-speech (TTS) response via the speaker 232 and a text response via the display device 231. When outputting a dialog response of TTS type, the output manager 133 may utilize a TTS module provided in the vehicle 200, or the output manager 133 may include a TTS module.

Depending on the control target, an instruction may be sent to the vehicle controller 240 or the communication device 280 communicating with the external content server 300.
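A rough sketch of this output routing is given below; the class and method names are invented for illustration and do not correspond to an actual interface of the output manager 133.

    from dataclasses import dataclass

    @dataclass
    class Response:
        kind: str        # "tts", "text", or "instruction"
        payload: str
        target: str = "" # control target for instructions, e.g. "vehicle" or "external"

    class OutputManager:
        """Illustrative router: TTS to the speaker, text to the display,
        instructions to the vehicle controller or the communication device."""

        def __init__(self, speaker, display, vehicle_controller, communication_device):
            self.speaker = speaker
            self.display = display
            self.vehicle_controller = vehicle_controller
            self.communication_device = communication_device

        def output(self, responses):
            # Dialog responses first, then instructions, mirroring a simple output order.
            for r in sorted(responses, key=lambda r: r.kind == "instruction"):
                if r.kind == "tts":
                    self.speaker(r.payload)
                elif r.kind == "text":
                    self.display(r.payload)
                elif r.kind == "instruction" and r.target == "vehicle":
                    self.vehicle_controller(r.payload)
                else:
                    self.communication_device(r.payload)

    # Example usage with print functions standing in for the real devices.
    om = OutputManager(print, print, print, print)
    om.output([Response("instruction", "route_guidance", "vehicle"),
               Response("tts", "Start route guidance.")])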

The response generation manager 131 may also transmit the response transmitted from the dialog response generator 132, the instruction generator 136, or the service editor 134 to the memory manager 135.

The output manager 133 may transmit the response output by itself to the memory manager 135.

The memory manager 135 may manage the long term memory 143 or the short term memory 144 based on the contents transmitted from the response generation manager 131 and the output manager 133. For example, based on the generated and output dialog responses, the memory manager 135 may update the short-term memory 144 by storing dialog content between the user and the system. The memory manager 135 may update the long-term memory 143 by storing information related to the user, which is acquired through a dialog with the user.

Among the information stored in the short-term memory 144, persistent information (e.g., a user's preference or orientation) or information for obtaining persistent information may be stored in the long-term memory 143.

Based on the vehicle control and external content requests corresponding to the generated and output instructions, the user preferences or vehicle control history stored in long-term memory 143 may be updated.
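The memory-update behaviour described above can be pictured with the following small sketch; the dictionary-based stores and the set of keys treated as persistent are simplifying assumptions.

    short_term_memory = []   # recent dialog content
    long_term_memory = {}    # persistent user information

    # Assumed examples of persistent items (preferences or orientations).
    PERSISTENT_KEYS = {"preferred_gas_station", "favorite_route_type"}

    def update_memories(dialog_turn, extracted_user_info):
        """Store the dialog turn in short-term memory and promote
        persistent user information into long-term memory."""
        short_term_memory.append(dialog_turn)
        for key, value in extracted_user_info.items():
            if key in PERSISTENT_KEYS:
                long_term_memory[key] = value

    update_memories(
        {"user": "Add the A-Oil gas station as a stopover", "system": "Stopover added"},
        {"preferred_gas_station": "A-Oil"},
    )
    print(short_term_memory, long_term_memory)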

Meanwhile, in a state in which the dialog system 100 outputs a pre-utterance before the user inputs an utterance, when an action corresponding to a pre-utterance context is input from the dialog input manager 111c, the dialog response generator 132, which receives information related to the action, may extract a dialog response template by searching the response template 149 and generate a dialog response by filling the extracted dialog response template with parameter values. The generated dialog response may be transmitted to the response generation manager 131. The dialog response may become a pre-utterance of the dialog system 100.

The response generation manager 131 may transmit the dialog response transmitted from the dialog response generator 132 to the output manager 133.

The output manager 133 may output the dialog response generated by the dialog response generator 132 via the speaker 232.

When the result processor 130 receives a pre-utterance message corresponding to the pre-utterance context from the dialog flow manager 121, the input pre-utterance message may become the dialog response and may be sent to the output manager 133.

The output manager 133 may output the transmitted pre-utterance message via the speaker 232.

When a user utterance is input after the dialogue system 100 outputs the pre-utterance, it may be processed in the same manner as an ordinary user utterance.

According to the above-described embodiment, the dialogue system 100 can provide a service that is most suitable for a user by considering various situations occurring inside the vehicle. The dialogue system 100 can determine a service required by the user on its own based on the context information collected by itself or the driver information without inputting the utterance of the user, and provide the service on its own initiative.

For example, the evaluation criterion for the vehicle state may be changed depending on the situation when the vehicle is started, and feedback may thus be actively provided. The travel start time may be defined as the vehicle start time, the time point at which the electronic parking brake (EPB) is released, or the time point at which a navigation destination is set. The vehicle condition evaluation system that calculates the driving availability score may assign a weight to each individual device and change the variable weight applied to each individual device according to situational factors. When it is determined that there is a problem with the vehicle state, a solution regarding the individual device, such as guidance to a service shop, may be provided.

By considering the destination at the time of vehicle start, it can be determined whether the remaining fuel is sufficient to reach the destination. When the fuel is insufficient, as feedback for the fuel shortage, an automatic stopover may be performed in which a gas station that the user likes is added to the route to the destination, and the user may be notified of the added stopover. Further, the gas station added as an automatic stopover may be changed according to the user's response.

Even when the current vehicle state does not indicate a fuel shortage, a gas station or a refueling time can be proactively suggested by integrating the user's next schedule, main movement history, and remaining fuel quantity.

By acquiring information related to the physical condition and sleep record of the driver, it is possible to conditionally allow the vehicle to start based on the acquired information. For example, when a risk of fatigue driving is recognized from the acquired physical condition and a sleep record obtained outside the vehicle, the user may be advised not to drive the vehicle. Alternatively, information related to a recommended driving time may be provided according to the physical condition or sleep record.

When a trigger indicating a risk of fatigue driving repeatedly occurs, the risk of fatigue driving may be detected, and a warning may be output or feedback may be provided according to the degree of risk, such as automatically changing the route to pass through a rest area. A trigger indicating the risk of fatigue driving may be acquired by passively measuring the driver state and the vehicle state, for example when the heart rate is reduced, when the following distance is a reference distance or more, or when the vehicle speed is a reference speed or less; or it may be acquired actively via a dialogue, for example by uttering a question to the driver and measuring the driver's response speed to the question.
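The passive fatigue-driving check described above could look roughly like the following sketch; the threshold values and field names are placeholders chosen for illustration, not values taken from the embodiments.

    def fatigue_triggers(driver_state, vehicle_state,
                         hr_floor=55, headway_ref_m=120.0, speed_ref_kph=60.0):
        """Return the list of passive triggers indicating a risk of fatigue driving.
        Thresholds are illustrative assumptions."""
        triggers = []
        if driver_state["heart_rate"] < hr_floor:         # heart rate reduced
            triggers.append("low_heart_rate")
        if vehicle_state["headway_m"] >= headway_ref_m:   # following distance at/above reference
            triggers.append("long_headway")
        if vehicle_state["speed_kph"] <= speed_ref_kph:   # speed at/below reference
            triggers.append("low_speed")
        return triggers

    def feedback_for(trigger_count):
        # Escalate feedback with the degree of risk: warn first, then reroute to a rest area.
        if trigger_count >= 3:
            return "reroute_to_rest_area"
        if trigger_count >= 1:
            return "output_warning"
        return "none"

    t = fatigue_triggers({"heart_rate": 50}, {"headway_m": 150.0, "speed_kph": 55.0})
    print(t, feedback_for(len(t)))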

When a user inputs an utterance indicating an emotion, the dialog system 100 may not be able to extract a specific domain or action from the utterance of the user. However, the dialogue system 100 may recognize the user's intention by using the surrounding environment information, the vehicle state information, and the user state information, and then continue the dialogue. As described above, the implementation may be performed by the ambiguity resolver 123 resolving the ambiguity of the user utterance.

Hereinafter, an example of a dialogue process using the dialogue system 100 will be described in detail.

Fig. 30 to 42 are diagrams showing specific examples in which the dialogue system 100 processes input, manages dialogue, and outputs a result when a user inputs an utterance related to route guidance.

As shown in fig. 30, when the user inputs the utterance "Let's go to the Seoul station we went to yesterday", the speech recognizer 111a can output the speech of the user as an utterance in text form ("Let's go to the Seoul station we went to yesterday").

The natural language understanding section 111b can perform morphological analysis and output [ domain: navigation ], [ action: route guidance ], [ verbal behavior: request ] and [ parameter: NLU: destination: seoul station ], then input them to the dialog input manager 111 c.

Referring to fig. 31, the dialog input manager 111c may transmit the natural language understanding result of the natural language understanding section 111b to the context understanding part 112c and, when additional information exists in the context understanding part 112c, request the context understanding part 112c to transmit the additional information.

The context understanding part 112c can search the context understanding table 145 and extract the fact that the context information related to [domain: navigation] and [action: route guidance] is the current location, and that the type of this context information is a GPS value.

The context understanding part 112c can extract the GPS value of the current location by searching the context information DB 142. When the GPS value of the current location is not stored in the context information DB 142, the context understanding part 112c may request the GPS value of the current location from the context information collection manager 112 b.

The context information collection manager 112b can send a signal to the context information collector 112a to cause the context information collector 112a to collect the GPS value of the current location. The context information collector 112a may collect the GPS value of the current location from the vehicle controller 240 and then store the GPS value of the current location in the context information DB 142 while transmitting a GPS value collection confirmation signal to the context information collection manager 112 b. When the context information collection manager 112b transmits a GPS value collection confirmation signal to the context understanding part 112c, the context understanding part 112c may extract the GPS value of the current location from the context information DB 142 and then transmit the GPS value of the current location to the dialog input manager 111 c.

The dialog input manager 111c may combine the natural language understanding result, [domain: navigation], [action: route guidance], [verbal behavior: request], and [parameter: NLU: destination: Seoul station], with the context information, [context information: current position: Exgine station (GPS value)], and then send the combined information to the dialog manager 120.

Referring to fig. 32, the dialog flow manager 121 may search the dialog and action state DB 147 and determine whether there is a dialog task or an action task currently in progress. At this time, the dialog flow manager 121 may refer to the dialog policy DB 148. In this embodiment, it is assumed that there is no dialog task or action task currently in progress.

The dialog flow manager 121 may request the dialog action manager 122 to generate an action task and a dialog task corresponding to an output of the input processor 110. Generating the action task and the dialog task may represent specifying a storage space for storing and managing information related to the action state and the dialog state.

Accordingly, the dialog action manager 122 may specify a storage space in the dialog and action state DB 147 to store information related to action states and dialog states.

The dialog action manager 122 may send the action state and the dialog state to the action prioritizer 125.

The action priority determiner 125 may search the relational action DB 146b and find that the vehicle state check action and the gas station recommendation action are related to route guidance. The route guidance action and these related actions may become the candidate actions.

The action priority determiner 125 may prioritize the candidate actions according to pre-stored rules. The priority may be determined before determining the execution conditions of the candidate actions, or only the priority of the candidate actions satisfying the execution conditions may be determined after the execution conditions have been determined.

The candidate action list may again be sent to the dialog action manager 122, and the dialog action manager 122 may update the action state by adding the searched related actions.

Referring to fig. 33, the action priority determiner 125 may search the action execution condition DB 146c for the execution condition of each candidate action, or for the parameters used to determine the execution condition. The action priority determiner 125 may also determine the priority between the candidate actions.

For example, the condition for vehicle state check may be a case where the destination distance is equal to or greater than 100km, wherein the parameter for determining the condition may correspond to the destination distance.

The condition for gas station recommendation may be a case where the destination distance is greater than a remaining fuel travelable Distance (DTE), wherein the parameter for determining the condition may correspond to the destination distance and the remaining fuel travelable Distance (DTE).
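The two execution conditions above can be expressed in a few lines of Python; the 100 km threshold and the comparison against the DTE follow the text, while the data structures and action names are assumptions.

    CANDIDATE_ACTIONS = {
        "route_guidance": lambda p: True,  # executable once its necessary parameters exist
        "vehicle_state_check": lambda p: p["destination_distance_km"] >= 100,
        "gas_station_recommendation": lambda p: p["destination_distance_km"] > p["dte_km"],
    }

    def executable_actions(params):
        """Return the candidate actions whose execution condition is satisfied."""
        return [name for name, cond in CANDIDATE_ACTIONS.items() if cond(params)]

    # With an 80 km destination distance and a 40 km DTE, the vehicle state check
    # is excluded and the gas station recommendation remains executable.
    print(executable_actions({"destination_distance_km": 80, "dte_km": 40}))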

The dialogue-action manager 122 may update the action state by adding a condition for executing each candidate action and a parameter required for determining the condition to the dialogue-and-action state DB 147.

The action priority determiner 125 may search the dialog and action state DB 147, the context information DB 142, the long term memory 143, or the short term memory 144 for parameter values required for determining whether the candidate action satisfies the execution condition, and acquire the parameter values from the dialog and action state DB 147, the context information DB 142, the long term memory 143, or the short term memory 144.

The action priority determiner 125 may obtain the parameter value from the dialog and action state DB 147 when the parameter value exists in the previous dialog contents, in the context information related to the dialog contents, or in the context information related to the generated event.

When the action priority determiner 125 cannot retrieve the parameter values from the dialog and action state DB 147, the context information DB 142, the long-term memory 143, or the short-term memory 144, the action priority determiner 125 may request the parameter values from the external information manager 126.

For example, the destination distance may be acquired from the external content server 300 providing the navigation service, and the DTE may be acquired from the context information DB 142 via the external information manager 126. Meanwhile, in order to search for a destination distance, correct destination information for a navigation service may be required. In this embodiment, the destination entered from the user's utterance may correspond to "seoul station", where "seoul station" may include various places having names beginning with "seoul station", as well as "seoul station" having a particular meaning. Therefore, it may be difficult to search for the correct destination distance using only "seoul station".

The parameter values may be obtained from a mobile device 400 connected to the vehicle 200 as needed. For example, when user information (e.g., contacts and schedules not stored in the long-term memory 143) is required as parameter values, the external information manager 126 may request the mobile device 400 for the required information and then acquire the required parameter values.

When the parameter values cannot be obtained via the storage device 140, the external content server 300, or the mobile device 400, the required parameter values may be obtained by asking the user.
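The fallback order for obtaining a parameter value, as described in the preceding paragraphs, is sketched below; the store names mirror the text, but the interfaces are invented for illustration.

    def acquire_parameter(name, stores, external_sources, ask_user):
        """Look up `name` in the internal stores first, then in external sources,
        and finally fall back to asking the user.

        `stores` stands for the dialog/action state DB, context information DB,
        long-term memory and short-term memory; `external_sources` for the
        external content server and a connected mobile device.
        """
        for store in stores:
            if name in store:
                return store[name]
        for source in external_sources:
            value = source(name)   # e.g. request the destination distance from a navigation service
            if value is not None:
                return value
        return ask_user(name)      # last resort: generate a question to the user

    value = acquire_parameter(
        "destination_distance_km",
        stores=[{}, {"dte_km": 40}],   # nothing stored locally for this parameter
        external_sources=[lambda n: 80 if n == "destination_distance_km" else None],
        ask_user=lambda n: input(f"Please tell me the {n}: "),
    )
    print(value)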

The action priority determiner 125 may determine whether the execution condition of each candidate action is satisfied by using the parameter values. Since the destination distance has not yet been retrieved, the determination of the execution conditions for the vehicle state check action and the gas station recommendation action may be postponed.

As shown in fig. 34, the dialogue-action manager 122 may update the action state by adding the acquired parameter value to the dialogue-and-action state DB 147 along with whether the action execution condition is satisfied, which is determined by using the corresponding parameter value.

The dialog action manager 122 may request a parameter list from the parameter manager 124 for performing the candidate action.

The parameter manager 124 may extract the current location and destination from the action parameter DB 146a as necessary parameters for performing the route guidance action, and extract the route type (initial value: express route) as a substitute parameter.

The parameter manager 124 may extract a check part (initial value: whole part) for performing the vehicle state check action as an alternative parameter, and extract a favorite gas station (initial value: a-oil) as an alternative parameter for performing the gas station recommended action.

The extracted parameter list may be sent to the dialog action manager 122 and used to update the action state.

The parameter manager 124 may search the reference location of each parameter in the dialog and action state DB 147, the context information DB 142, the long term memory 143, and the short term memory 144 for a corresponding parameter value to obtain a parameter value corresponding to a necessary parameter and a substitute parameter of a candidate action. When the parameter value needs to be provided via the external service, the parameter manager 124 may request the required parameter value to the external content server 300 via the external information manager 126.

The parameters for determining the execution conditions of the candidate action and the parameters for executing the candidate action may be duplicated. When there is a parameter corresponding to the parameter (necessary parameter and substitute parameter) for performing the candidate action among the parameter values acquired by the action priority determiner 125 and then stored in the dialogue and action state DB 147, the corresponding parameter may be used.

Referring to fig. 35, the dialog action manager 122 may update the action state by adding parameter values acquired by the parameter manager 124.

As described above, when the destination (seoul station) extracted from the utterance of the user is utilized as a parameter of the route guidance action, there may be ambiguity. Therefore, the parameter of the route guidance action (destination), the parameter of the vehicle state checking action (destination distance), and the parameter of the gas station recommendation (destination distance) may not have been acquired yet.

When [ parameter: NLU: destination: seoul station ] is converted into destination parameters suitable for route guidance actions, the ambiguity resolver 123 may check whether an ambiguity exists. As mentioned above, "seoul station" may include various places having names beginning with "seoul station", as well as "seoul station" having a user-specific meaning.

The ambiguity resolver 123 can confirm, by referring to the morphological analysis result, that a modifier of "Seoul station" is present in the user utterance. The ambiguity resolver 123 can then search the schedule, the movement locations, and the contacts stored in the long-term memory 143 or the short-term memory 144 to identify the location of "the Seoul station we went to yesterday".

For example, the ambiguity resolver 123 can confirm from the user's movement locations of yesterday that "the Seoul station we went to yesterday" is "Seoul Station Exit 4". After confirming that a POI named "Seoul Station Exit 4" exists, the ambiguity resolver 123 may obtain the corresponding value.

The destination information obtained by the ambiguity resolver 123 may be sent to the dialog action manager 122, and the dialog action manager 122 may update the action state by adding "seoul station exit 4" to the destination parameter of the candidate action.
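A toy version of this disambiguation step is shown below; the movement-history records and the matching heuristic are assumptions introduced only to illustrate how "the Seoul station we went to yesterday" might be resolved to a concrete POI.

    from datetime import date, timedelta

    # Assumed movement history, e.g. recovered from a mobile device or short-term memory.
    MOVEMENT_HISTORY = [
        {"date": date.today() - timedelta(days=1), "poi": "Seoul Station Exit 4"},
        {"date": date.today() - timedelta(days=7), "poi": "Seoul Station Exit 1"},
    ]

    def resolve_destination(base_name, modifier_days_ago):
        """Pick the POI whose name contains `base_name` and that was visited
        `modifier_days_ago` days ago (here: 1 for 'yesterday')."""
        wanted = date.today() - timedelta(days=modifier_days_ago)
        for record in MOVEMENT_HISTORY:
            if record["date"] == wanted and base_name.lower() in record["poi"].lower():
                return record["poi"]
        return None   # still ambiguous: the dialog system would have to ask the user

    print(resolve_destination("Seoul Station", 1))   # -> "Seoul Station Exit 4"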

The parameter manager 124 may acquire destination information (seoul station exit 4) from the dialog and action state DB 147 and request a destination distance value to the external content server 300 providing the navigation service via the external information manager 126.

Referring to fig. 36, when the external information manager 126 acquires a destination distance value (80km) from the external content server 300 and then transmits the destination distance value to the parameter manager 124, the parameter manager 124 may transmit the destination distance value to the dialogue action manager 122 to allow the action state to be updated.

The action priority determiner 125 may determine whether each candidate action is executable by referring to the action state and adjust the priority of the candidate actions. Since the parameter values of the current position and the destination, which are the necessary parameters, have been acquired, it can be determined that the route guidance action is executable. Since the destination distance (80km) is less than 100km, it can be determined that the vehicle state check action is not executable. Since the destination distance (80km) is greater than the DTE, it can be determined that the gas station recommendation action is executable.

Since the vehicle state check action is not executable, it may be excluded from the priority determination. Thus, the route guidance action may be ranked first and the gas station recommendation action may be ranked second.

The dialog action manager 122 may update the action state based on whether each candidate action can be performed and on the modified priority.

The dialog flow manager 121 can check the dialog state and the action state stored in the dialog and action state DB 147 and can develop a dialog policy by referring to the dialog policy DB 148 to continue the dialog. For example, the dialog flow manager 121 may select the highest-priority action among the executable actions, and may request the response generation manager 131 to generate a response for continuing the dialog according to the dialog policy DB 148.

The dialog state and action state stored in the dialog and action state DB 147 may be updated to [ state: confirm the start of route guidance ].

Referring to fig. 37, the response generation manager 131 may generate a TTS response and a text response by searching the response template 149.

The dialog response generator 132 may generate a dialog response configured to output, in TTS form and text form, "It is expected to take 30 minutes from Exvan to Seoul Station Exit 4. Do you want to start guidance?".

The response generation manager 131 may transmit the TTS response and the text response generated by the dialog response generator 132 to the output manager 133 and the memory manager 135, and the output manager 133 may transmit the TTS response to the speaker 232 and the text response to the display device 231. The output manager 133 may send the TTS response to the speaker 232 after the TTS response passes through a TTS module configured to synthesize the text into speech.

The memory manager 135 may store the user-requested route guidance in the short-term memory 144 or the long-term memory 143.

The dialog response configured to ask "It is expected to take 30 minutes from Shiwang station to Seoul Station Exit 4. Do you want to start guidance?" may be output through the display device 231 and the speaker 232. As shown in fig. 38, when the user says "Yes", the utterance of the user may be input to the voice recognizer 111a and output as [text: yes], and the natural language understanding section 111b may output [domain: -], [action: -], [speech act: -], and [morphological analysis result: yes/IC].

The natural language understanding result may be transmitted to the dialog input manager 111c, and the dialog input manager 111c may transmit the natural language understanding result to the dialog manager 120.

Referring to fig. 39, the dialog flow manager 121 may search the dialog and action state DB 147 and analyze a previous dialog state. The dialog flow manager 121 may request the dialog action manager 122 to update the dialog/action related to the currently executed route guidance.

The dialog action manager 122 may update the dialog state and the action state to [ state: route guidance starts ].

Dialog flow manager 121 may request that result processor 130 generate a response for initiating route guidance.

Referring to fig. 40, the dialog action manager 122 may update the dialog state to [state: proceed to next dialog] and update the action state to [state: execute].

The dialog flow manager 121 may request the response generation manager 131 to generate a response for route guidance.

The dialog response generator 132 may generate a dialog response configured to output "start route guidance" in a TTS form and a text form and then transmit the dialog response to the response generation manager 131.

The instruction generator 136 may generate an instruction to perform route guidance, [target: navigation, instruction: route guidance, destination: Seoul Station Exit 4, start: Exwaning station], and then send the instruction to the response generation manager 131.

The response generation manager 131 may transmit the generated dialog response and the instruction to the output manager 133. The output manager 133 may output the dialog response via the display device 231 and the speaker 232. The output manager 133 may transmit a route guidance instruction to the AVN 230 of the vehicle 200 or to the external content server 300 providing the navigation service via the vehicle controller 240.
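The route guidance instruction and its dispatch could be represented as follows; the dictionary fields mirror the bracketed notation above, while the dispatch targets are simplified assumptions.

    def generate_route_guidance_instruction(destination, start):
        # Mirrors [target: navigation, instruction: route guidance, destination: ..., start: ...]
        return {
            "target": "navigation",
            "instruction": "route_guidance",
            "destination": destination,
            "start": start,
        }

    def dispatch(instruction, send_to_avn, send_to_external_server):
        """Send the instruction to the in-vehicle AVN, or to an external
        navigation service, depending on the control target."""
        if instruction["target"] == "navigation":
            send_to_avn(instruction)
        else:
            send_to_external_server(instruction)

    instr = generate_route_guidance_instruction("Seoul Station Exit 4", "the current location")
    dispatch(instr, send_to_avn=print, send_to_external_server=print)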

Referring to fig. 41, the dialog flow manager 121 may select a gas station recommendation as the next executable action and request the response generation manager 131 to generate a response configured to ask the user whether to recommend a gas station.

The dialog state and the action state may be updated to [ state: check for relevant service recommendations ].

The response generation manager 131 may request the dialog response generator 132 to generate a TTS response and a text response, and the dialog response generator 132 may generate a dialog response configured to output, in TTS form and text form, "There is not enough fuel to reach the destination. Do you want to add the A-Oil gas station as a stopover?".

Response generation manager 131 may transmit the TTS response and the text response generated by dialog response generator 132 to output manager 133 and memory manager 135, and output manager 133 may transmit the TTS response to speaker 232 and the text response to display device 231.

The dialog response configured to ask "There is not enough fuel to reach the destination. Do you want to add the A-Oil gas station as a stopover?" may be output through the display device 231 and the speaker 232. As shown in fig. 42, when the user says "No", the utterance of the user may be input to the voice recognizer 111a and output as [text: no], and the natural language understanding section 111b may output [domain: -], [action: -], [speech act: -], and [morphological analysis result: no/IC].

The dialog flow manager 121 may request the dialog action manager 122 to update the dialog state and the action state.

The dialog action manager 122 may update the dialog state to [state: proceed to next dialog] and update the action state to [state: cancel].

The dialog flow manager 121 can request the response generation manager 131 to generate a response indicating that the gas station recommendation service is cancelled, and the dialog flow manager 121 can check whether there is a dialog to be continued. When a dialog to be continued does not exist, the dialog flow manager 121 may update the dialog state to [ state: idle ] and wait for user input.

The above-described data processing flow is only an example applied to the dialogue system 100. Accordingly, the order in which data is processed by each component of the dialog system 100 is not limited to the above-described example, and thus a plurality of components may process data simultaneously, or a plurality of components may process data in an order different from the above-described example.

Hereinafter, a dialogue processing method according to an embodiment will be described. In one embodiment, the dialogue processing method may be applied to the dialogue system 100 or the vehicle 200 provided with the dialogue system 100 described above. Therefore, the description of fig. 1 to 42 will be applied to the dialogue processing method in the same manner.

Fig. 43 is a flowchart illustrating a method of processing user input in a dialog processing method in one embodiment. The method of processing user input may be performed in the input processor 110 of the dialog system 100.

Referring to fig. 43, when an utterance of a user is input (yes of 500), the speech recognizer 111a may recognize the input utterance of the user (510). The user's utterance may be input to the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.

The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form.

The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form (520) and output a result of the natural language understanding.

Specifically, the natural language understanding process (520) may include performing morphological analysis (521) on the utterance in text form, extracting a domain from the utterance based on a result of the morphological analysis (522), recognizing an entity name (523), analyzing the speech behavior (524), and extracting an action (525).

The extraction of the domain, the identification of the entity name and the extraction of the action may be performed by referring to the domain/action inference rule DB 141.

The output of the natural language understanding part 111b, i.e., the result of the natural language understanding, may include the result of the domain, the action, the speech behavior, and the morphological analysis corresponding to the utterance of the user.
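The shape of the natural language understanding result can be illustrated with the following stub; the keyword rules are placeholders and do not represent the domain/action inference rule DB 141.

    def naive_nlu(utterance_text):
        """Return a result in the same shape as the NLU output described above:
        domain, action, speech act, parameters, and a (token, tag) morphological result.
        The keyword matching is purely illustrative."""
        tokens = utterance_text.lower().split()
        morphemes = [(tok, "UNK") for tok in tokens]   # stand-in for real morphological analysis
        result = {"domain": None, "action": None, "speech_act": None,
                  "parameters": {}, "morphological_analysis": morphemes}
        if "station" in tokens or "go" in tokens:
            result.update(domain="navigation", action="route_guidance", speech_act="request")
            result["parameters"]["destination"] = utterance_text.split("to ")[-1]
        return result

    print(naive_nlu("go to Seoul Station"))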

Context information related to the extracted action may be searched for. Context information related to the extracted action may be stored in the context understanding table 145. The context understanding part 112c may search the context understanding table 145 for the context information related to the extracted action, and the context information processor 112 may acquire the information value of the searched context information from the context information DB 142, the long-term memory 143, or the short-term memory 144.

When additional context information is needed (yes of 530), i.e., when the context information cannot be retrieved from the context information DB 142, the long-term memory 143, or the short-term memory 144, the context understanding part 112c may request collection of the corresponding context information (540). Inputs other than voice, such as vehicle state information, surrounding environment information, and driver information, may be input via the context information collector 112a separately from the input of the user's utterance.

The information may be entered periodically or only when a particular event occurs. Further, information may be input periodically and then additionally input when a specific event occurs. In any case, when information collection is requested, the corresponding information may be actively collected.

Accordingly, when the context information related to the action has been collected, the corresponding information may be acquired from the context information DB 142, the long term memory 143, or the short term memory 144, and otherwise, the corresponding information may be collected by the context information collector 112 a.

When the context information collector 112a that has received the request for collecting the context information collects the corresponding context information and stores the information in the context information DB 142, the context understanding part 112c can acquire the corresponding context information from the context information DB 142.

When the context information collection manager 112b determines that a particular event occurs because the data collected by the context information collector 112a satisfies a predetermined condition, the context information collection manager 112b may send a trigger signal for an action to the context understanding part 112 c.

The context understanding part 112c can search the context understanding table 145 for the context information related to the corresponding event, and when the searched context information is not stored in the context information DB 142, the context understanding part 112c can again transmit a context information request signal to the context information collection manager 112b.

When the collection of the required context information is completed, the result of the natural language understanding and the context information may be transmitted to the dialog manager 120 (560). When an event occurs, information related to the event (which event occurred) and context information related to the occurred event may also be transmitted.

FIG. 44 is a flow diagram that illustrates a method for managing conversations using the output of an input handler in a conversation processing method in one embodiment. The dialog processing method may be performed by the dialog manager 120 of the dialog system 100.

Referring to fig. 44, the dialog flow manager 121 may search the dialog and action state DB 147 for a relevant dialog history (600).

In this embodiment, a case where a domain and an action are extracted from the utterance of the user has been described as an example, but there may be cases where a domain and an action cannot be extracted from the user's utterance due to ambiguity in the content or context of the utterance. In such a case, the dialog action manager 122 may generate an arbitrary dialog state, and the ambiguity resolver 123 may recognize the user's intention based on the content of the user's utterance, the environmental conditions, the vehicle state, and the user information, and determine an action suitable for that intention.

When there is a related dialog history (yes of 600), the related dialog history may be referred to (690). When no relevant conversation history exists (NO of 600), new conversation tasks and action tasks may be generated (610).

A list of actions related to the action extracted from the utterance of the user (hereinafter referred to as the input action) may be searched for in the relational action DB 146b, and a candidate action list may be generated (620). The input action and the actions related to the input action may constitute the candidate action list.

The execution condition DB 146c may be searched for the execution condition of each candidate action (630). The execution condition may represent a necessary condition for executing the action. Thus, when the respective condition is satisfied, the action may be determined to be executable, but when the respective condition is not satisfied, the action may be determined not to be executable. In the action execution condition DB 146c, information related to the type of parameter for determining the action execution condition may also be stored.

Parameter values for determining action execution conditions may be obtained (640). The parameter for determining the action execution condition may be referred to as a condition determination parameter. Parameter values of the condition determining parameters can be acquired by searching the context information DB 142, the long term memory 143, the short term memory 144, or the dialogue and action state DB 147. When the parameter value of the condition determining parameter needs to be provided via the external service, the required parameter value may be provided from the external content server 300 via the external information manager 126.

When the required parameter values cannot be acquired due to ambiguities in the context and utterances, the required parameter values can be acquired by resolving the ambiguities with the ambiguity resolver 123.

Even when an acquired parameter is an invalid parameter from which it is difficult to determine the action execution condition, the ambiguity resolver 123 may obtain a valid parameter from the invalid parameter.

Based on the obtained condition determination parameters, it may be determined whether each candidate action is executable (650), and a priority of the candidate action may be determined (660). Rules for determining the priority of the candidate action may be pre-stored. The action priority determiner 125 may determine the priority of the candidate actions by considering only executable candidate actions after determining whether each candidate action is executable. Alternatively, regardless of whether each candidate action is executable, after determining the priority of the candidate action, the priority of the candidate action may be modified based on whether each candidate action is executable.

The action parameter DB 146a may be searched for a parameter list for performing a candidate action (670). The parameters for performing the candidate action may correspond to the action parameters. The action parameters may include necessary parameters and alternative parameters.

Parameter values for performing the candidate action may be obtained (680). Parameter values of the action parameters can be acquired by searching the context information DB 142, the long term memory 143, the short term memory 144, or the dialogue and action state DB 147. When the parameter values of the action parameters need to be provided via the external service, the required parameter values may be provided from the external content server 300 via the external information manager 126.

When the required parameter values cannot be acquired due to ambiguities in the context and utterances, the required parameter values can be acquired by resolving the ambiguities with the ambiguity resolver 123.

Even when an acquired parameter is an invalid parameter from which it is difficult to perform the candidate action, the ambiguity resolver 123 may obtain a valid parameter from the invalid parameter.

The dialog state and the action state may be managed by the dialog action manager 122 throughout the above-described steps, and the dialog state and the action state may be updated whenever the state changes.

Upon retrieving all available parameter values, the dialog flow manager 121 may send information related to the candidate actions and the dialog state to the result processor 130. According to the dialog policy, the dialog flow manager 121 may transmit information related to the action corresponding to the first priority, or information related to a plurality of candidate actions.

When the required parameter values can be acquired only by the user because the required parameter values do not exist in the external content server 300, the long term memory 143, the short term memory 144, and the context information DB 142, a dialog response for inquiring the user about the parameter values may be output.

FIG. 45 is a flow diagram that illustrates a result processing method for generating a response corresponding to a result of dialog management in a dialog processing method in one embodiment. The result processing method may be performed by the result processor 130 of the dialog system 100.

Referring to fig. 45, when a dialog response needs to be generated (yes of 700), the dialog response generator 132 may search the response template 149 (710). The dialog response generator 132 may extract a dialog response template corresponding to the current dialog state and action state and populate the response template with the required parameter values to generate a dialog response (720).

When the parameter values required to generate a dialog response are not transmitted from the dialog manager 120, or when an instruction to use external content is transmitted, the required parameter values may be provided from the external content server 300 or searched for in the long-term memory 143, the short-term memory 144, or the context information DB 142. When the required parameter values can be acquired only from the user because they do not exist in the external content server 300, the long-term memory 143, the short-term memory 144, or the context information DB 142, a dialog response for asking the user about the parameter values may be generated.

When instructions need to be generated (760), the instruction generator 136 may generate instructions for vehicle control or external content (770).

The generated dialog responses or instructions may be input to the output manager 133, and the output manager 133 may determine an output order between the dialog responses and the instructions or an output order of the plurality of instructions (730).

The memory may be updated based on the generated dialog response or instructions (740). Based on the generated dialog responses or instructions, the memory manager 135 may update the short-term memory 144 by storing dialog content between the user and the system, and update the long-term memory 143 by storing user-related information acquired by the user's dialog. The memory manager 135 may update the user's preferences and vehicle control history stored in the long-term memory 143 based on the generated and output vehicle control and external content requests.

The output manager 133 may output the response by sending the dialog response and the instruction to the appropriate output location (750). TTS responses may be output via speaker 232 and text responses may be output on display device 231. The instruction may be transmitted to the vehicle controller 240 or the instruction may be transmitted to the external content server 300 according to the control target. Further, the instruction may be transmitted to a communication device 280 configured to communicate with the external content server 300.

Fig. 46 to 48 are flowcharts showing a case when the dialogue system outputs a preliminary utterance before a user inputs an utterance in the dialogue processing method according to one embodiment.

Referring to fig. 46, the context information collector 112a and the context information collection manager 112b collect context information (810). Specifically, the vehicle controller 240 may input information acquired by sensors provided in the vehicle (e.g., the remaining amount of fuel, the amount of rainfall, the rainfall speed, surrounding obstacle information, speed, engine temperature, tire pressure, current position, and running environment information) to the context information processor 112. User information input via the information input device 220 other than voice and information acquired from the external content server 300 or an external device may be input to the contextual information processor 112. The collected context information may be stored in the context information DB 142, the long term memory 143, or the short term memory 144.

The pre-utterance determiner 151 determines a pre-utterance condition based on the context information (811). The pre-utterance conditions may be stored in the pre-utterance condition table 145a. As shown in figs. 22A to 22C, the pre-utterance condition related to the context information may be stored in the pre-utterance condition table 145a for each item of context information.

When the context information transmitted from the context information DB 142, the long-term memory 143, or the short-term memory 144 satisfies the pre-utterance condition (yes of 812), the pre-utterance determiner 151 determines that it is a pre-utterance context and generates a pre-utterance trigger signal (813).

The pre-utterance determiner 151 extracts an action corresponding to the pre-utterance context (814). As shown in fig. 22C, the action corresponding to the pre-utterance context may be stored in advance in the pre-utterance condition table 145 a. The pre-utterance determiner 151 may extract an action corresponding to the pre-utterance context from the pre-utterance condition table 145 a. Further, the pre-utterance determiner 151 may generate an action corresponding to the pre-utterance context according to the established rule.

When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the action corresponding to the pre-utterance context to the dialog input manager 111c, the dialog input manager 111c transmits the action corresponding to the pre-utterance context to the dialog manager 120 (815). In this case, the pre-utterance trigger signal and a signal indicating the pre-utterance context may also be sent.

After sending the action corresponding to the pre-utterance context to the dialog manager 120, a series of processes, such as generating a dialog task and an action task, and acquiring action parameters, may be performed, as shown in fig. 44. When other dialog tasks or action tasks are also being executed, the dialog flow manager 121 may first generate and process tasks related to the pre-utterance context, or may select a priority according to established rules.

When the dialog manager 120 transmits information related to an action performed first to the results processor 130, the dialog response generator 132 may extract a dialog response template by searching the response template 149 and generate a dialog response by populating the extracted dialog response template with parameter values. The generated dialog response may be sent to the output manager 133 via the response generation manager 131. The output manager 133 may output the generated dialogue response via a speaker provided in the vehicle 200 or the mobile device 400.

Further, a pre-utterance message corresponding to the pre-utterance context may be retrieved or generated in a similar manner. Referring to fig. 47, the context information collector 112a and the context information collection manager 112b collect context information (820), and the pre-utterance determiner 151 determines a pre-utterance condition based on the context information (821).

When the context information transmitted from the context information DB 142, the long term memory 143, or the short term memory 144 satisfies the pre-utterance condition (yes of 822), the pre-utterance determiner 151 determines that it is a pre-utterance context and generates a pre-utterance trigger signal (823).

The pre-utterance determiner 151 extracts a pre-utterance message corresponding to the pre-utterance context (824). As shown in figs. 22A and 22B, the pre-utterance message corresponding to the pre-utterance context may be stored in advance in the pre-utterance condition table 145a. The pre-stored pre-utterance message may be content indicating the current context or content pre-emptively suggesting a function or service required in the pre-utterance context. Further, the pre-utterance determiner 151 may generate a pre-utterance message according to an established rule.

When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120 (825). In this case, the pre-utterance trigger signal and a signal indicating the pre-utterance context may also be sent.

The dialog manager 120 may generate a dialog task for outputting the transmitted pre-spoken message and transmit the dialog task to the results processor 130. Results processor 130 may output the incoming pre-audible message via speaker 232.
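Checking the collected context information against the pre-utterance condition table, and extracting the matching pre-utterance message, might look like the following; the table contents are invented examples in the spirit of the text.

    # Assumed pre-utterance condition table: context item -> (condition, message).
    PRE_UTTERANCE_TABLE = {
        "remaining_fuel_km": (lambda v: v < 30,
                              "Fuel is running low. Shall I look for a nearby gas station?"),
        "tire_pressure_psi": (lambda v: v < 28,
                              "Tire pressure is low. Shall I guide you to a service shop?"),
    }

    def check_pre_utterance(context_info):
        """Return (trigger, message) pairs for every satisfied pre-utterance condition."""
        results = []
        for key, value in context_info.items():
            if key in PRE_UTTERANCE_TABLE:
                condition, message = PRE_UTTERANCE_TABLE[key]
                if condition(value):
                    results.append((key, message))
        return results

    print(check_pre_utterance({"remaining_fuel_km": 22, "tire_pressure_psi": 33}))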

Further, virtual user utterances corresponding to the pre-utterance context can be extracted. Referring to fig. 48, the context information collector 112a and the context information collection manager 112b collect context information (830), and the pre-utterance determiner 151 determines a pre-utterance condition based on the context information (831).

When the context information transmitted from the context information DB 142, the long term memory 143, or the short term memory 144 satisfies the pre-utterance condition (yes of 832), the pre-utterance determiner 151 determines that it is a pre-utterance context and generates a pre-utterance trigger signal (833).

The pre-utterance determiner 151 extracts a virtual user utterance corresponding to the pre-utterance context (834). Although not shown in the drawings, a virtual user utterance corresponding to the pre-utterance context may be stored in advance in the pre-utterance condition table 145 a. The pre-utterance determiner 151 may extract a virtual user utterance corresponding to the pre-utterance context from the pre-utterance condition table 145 a. Further, the pre-utterance determiner 151 may generate a virtual user utterance corresponding to the pre-utterance context according to the established rules.

When the preliminary utterance determiner 151 transmits the virtual user utterance to the natural language understanding section 111b in text form (835), the natural language understanding section 111b can extract a domain and an action from the virtual user utterance in the same manner as in the case where the user actually utters.

The dialog input manager 111c transmits the pre-spoken trigger signal and the natural language understanding result to the dialog manager 120 (836). The result of the natural language understanding may include a domain and an action extracted from the virtual user utterance, and the extracted domain and action may become a domain and action corresponding to the pre-utterance context.

For example, according to the mobile gateway method in which the mobile device 400 serves as a gateway between the vehicle and the dialogue system 100, the dialogue system client 470 of the mobile device 400 may perform part of the operation of the preliminary utterance determiner 151. In this case, the dialog system client 470 can generate a virtual user utterance corresponding to the pre-utterance context and send the virtual user utterance to the natural language understanding section 111 b.

After the pre-uttered trigger signal and the natural language understanding result are transmitted to the dialogue manager 120, a series of processes such as generating a dialogue task and an action task and acquiring an action parameter may be performed, as shown in fig. 44. When other dialog tasks or action tasks are also being executed, the dialog flow manager 121 may first generate and process tasks related to the pre-utterance context, or may select a priority according to established rules.

When the dialog manager 120 transmits information related to an action performed first to the results processor 130, the dialog response generator 132 may extract a dialog response template by searching the response template 149 and generate a dialog response by populating the extracted dialog response template with parameter values. The generated dialog response may be sent to the output manager 133 via the response generation manager 131. The output manager 133 may output the generated dialogue response via a speaker provided in the vehicle 200 or the mobile device 400.

Fig. 49 is a flowchart showing processing of a repetitive task when the dialogue system outputs a preliminary utterance before a user inputs an utterance in the dialogue processing method according to the embodiment.

Referring to fig. 49, the context information collector 112a and the context information collection manager 112b collect context information (840), and the preliminary utterance determiner 151 determines preliminary utterance conditions based on the context information (841).

The pre-utterance determiner 151 determines whether the context information transmitted from the context information DB 142, the long-term memory 143, or the short-term memory 144 satisfies a pre-utterance condition, and when the context information satisfies the pre-utterance condition (yes of 842), the repetitive task processor 152 determines whether the task related to the currently occurring pre-utterance context is a repetitive task (843).

Specifically, based on the information stored in the task processing DB 145b related to tasks previously executed or currently being executed in the dialog system 100, the repetitive task processor 152 may determine whether a task, such as a dialog or an action related to the currently occurring pre-utterance context, has already been executed or is being executed.

For example, the repetitive task processor 152 may determine that the task related to the current pre-utterance context is a repetitive task when the dialog related to the currently occurring pre-utterance context has been performed and when the reference time period has not elapsed since the dialog time point. Further, the repetitive task processor 152 can determine that the task associated with the current pre-utterance context is a repetitive task while the dialog and action associated with the current pre-utterance context is being performed.

That is, based on the dialog history stored in the task processing DB145b and whether the task is executed, the repeat task processor 152 may determine whether the preliminary utterance has been output and the user's intention with respect to the preliminary utterance. Based on the stored dialog time, the user's intention, or whether the task is processed, the repetitive task processor 152 may determine whether it is a repetitive task.
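The repeat-task check reduces to comparing the current pre-utterance context against the task history within a reference period; a minimal version under those assumptions follows, with the reference period and record format chosen for illustration.

    from datetime import datetime, timedelta

    REFERENCE_PERIOD = timedelta(hours=1)   # assumed reference time period

    def is_repeat_task(context_id, task_history, now=None):
        """`task_history` stands in for the task processing DB: a list of
        {"context_id", "time", "status"} records. A task is treated as a repeat
        if the same context is currently in progress, or was handled within
        the reference period."""
        now = now or datetime.now()
        for task in task_history:
            if task["context_id"] != context_id:
                continue
            if task["status"] == "in_progress":
                return True
            if now - task["time"] < REFERENCE_PERIOD:
                return True
        return False

    history = [{"context_id": "low_fuel",
                "time": datetime.now() - timedelta(minutes=20),
                "status": "completed"}]
    print(is_repeat_task("low_fuel", history))   # -> True: terminate the pre-utterance context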

When it is determined that the task related to the current pre-utterance context is a repetitive task (yes of 843), the repetitive task processor 152 terminates the pre-utterance context.

When it is determined that the task related to the current pre-utterance context is not a repetitive task (no of 843), the pre-utterance operation (844) described in the above embodiments may be performed. For example, a pre-utterance trigger signal and an action or pre-utterance message corresponding to the pre-utterance context may be sent to the dialogue manager 120. Alternatively, a virtual user utterance corresponding to the pre-utterance context may be transmitted to the natural language understanding part 111b, and the natural language understanding result and a pre-utterance trigger signal may be transmitted to the dialogue manager 120.
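For illustration only, the flow of steps 841 to 844, including the repetitive-task check, could be sketched as follows. The class and method names, the table layout, the status values, and the reference period are assumptions for this sketch rather than part of the embodiment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable

REFERENCE_PERIOD = timedelta(hours=1)  # assumed reference time period

@dataclass
class PreUtteranceCondition:
    context_key: str                      # item of context information, e.g. "tire_pressure"
    predicate: Callable[[object], bool]   # True when the pre-utterance condition is satisfied
    message: str                          # pre-utterance message to output

def handle_context_update(context, condition_table, task_db, dialogue_manager):
    """Steps 841 to 844: check the pre-utterance condition, then the repetitive-task check."""
    for cond in condition_table:
        value = context.get(cond.context_key)
        if value is None or not cond.predicate(value):
            continue                                   # 842: condition not satisfied
        if is_repetitive_task(cond.context_key, task_db):
            return None                                # 843: repetitive, terminate the pre-utterance context
        # 844: send a pre-utterance trigger signal and message to the dialogue manager
        return dialogue_manager.trigger_pre_utterance(cond.context_key, cond.message)
    return None

def is_repetitive_task(context_key, task_db):
    """A task is repetitive if the related dialogue was already performed recently
    or if a related dialogue/action is still being executed."""
    record = task_db.get(context_key)
    if record is None:
        return False
    if record["status"] == "in_progress":
        return True
    return datetime.now() - record["dialogue_time"] < REFERENCE_PERIOD
```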

According to the above-described embodiment, additional components (such as the pre-utterance determiner 151 and the repetitive task processor 152) and additional storage (such as the pre-utterance condition table 145a and the task processing DB 145b) are used to perform the pre-utterance dialogue processing method. However, the embodiment of the dialogue processing method is not limited thereto; the context understanding part 112c may also perform the operations of the pre-utterance determiner 151 and the repetitive task processor 152, and the information stored in the pre-utterance condition table 145a and the task processing DB 145b may also be stored in the context understanding table 145.

The dialog processing method in one embodiment is not limited to the order in the above-described flowcharts. The flow according to the flowcharts of fig. 41 to 49 may be only an example applied to the dialogue processing method. Therefore, a plurality of steps can be performed at the same time, and the order of each step can also be changed.

Fig. 50 is a control block diagram showing a dialogue system 100a according to another embodiment of the present invention and an apparatus provided with the dialogue system. The device may represent a household appliance or terminal as well as a vehicle.

In this embodiment, the vehicle 200a will be described as an apparatus provided with a dialogue system.

Further, among the components of the vehicle and the dialogue system according to another embodiment, components that perform the same operations as in the foregoing embodiment are given the same reference numerals.

According to another embodiment, the vehicle 200a includes a dialogue system 100a, a voice input device 210, an information input device 220 other than voice, a dialogue output device 230, a vehicle controller 240a, a plurality of loads 251 to 255, a vehicle detector 260, and a communication device 280.

When the dialogue system 100a is provided in the vehicle 200a, the vehicle 200a can process the dialogue with the user by itself and provide the service required by the user. However, information required for the dialogue processing and the provision of the service may also be acquired from the external content server 300.

As described above, the dialogue system 100a according to another embodiment provides a dialogue processing method suitable for a vehicle environment. All or some of the components of the dialog system 100a may be present in the vehicle.

Alternatively, the dialogue system 100a may be provided in a remote server, and the vehicle may be used only as a gateway between the dialogue system 100a and the user. In either case, the dialogue system 100a may be connected to the user via the vehicle or via at least one of the plurality of mobile devices 400a, 400b, 400c, and 400d connected to the vehicle.

The dialog system 100a may be configured to recognize the user's intent and context by: using the voice of the user input through the voice input device 210, another input other than the voice input through the information input device 220 other than the voice, and various information related to the vehicle input through the vehicle controller 240 a; and the dialog system 100a can output a response to perform an action corresponding to the user's intent.

The various information related to the vehicle may include vehicle state information or surrounding environment information acquired by various sensors provided in the vehicle 200a, and information originally stored in the vehicle 200a, such as the fuel type of the vehicle.

The state information of the vehicle may be state information of the device provided with the dialogue system, in particular, information indicating the states of the various loads provided in the device.

The dialogue system 100a receives contact information stored in the mobile device of the user (i.e., the owner of the vehicle), and acquires the names, titles, and phone numbers of a plurality of speakers who may converse with the user in the vehicle based on the received contact information. Based on the acquired information, the dialogue system 100a can acquire the relationship between the speakers.

The dialog system 100a may receive contact information for a plurality of mobile devices and compare the contact information contained in the vehicle owner's mobile device with the contact information contained in the remaining mobile devices to select a group that includes the vehicle owner and the passenger. Thus, the dialog system 100a may receive only contact information relating to the selected group.

The dialog system 100a may select a group including the owner (i.e., user) and the passenger based on the owner name or title in the contact information stored in the passenger mobile device, and may receive only the contact information designated as the selected group contact information.

The group may include a family, a company, a superior or inferior level, a friend, and a club.
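For illustration only, selecting the shared group and filtering the received contact information could proceed as sketched below. The contact layout and field names are assumptions; the embodiment does not specify a data format.

```python
def select_shared_group(passenger_contacts, owner_phone):
    """Pick the group that contains both the owner and the passenger.

    Each contact is assumed to be a dict such as
    {"name": "senior manager Kim", "group": "company", "phone": "010-..."}.
    The passenger's contact list is searched for the owner's phone number,
    and the group stored for that entry is treated as the shared group.
    """
    for entry in passenger_contacts:
        if entry.get("phone") == owner_phone:
            return entry.get("group")          # e.g. "company", "family", "club"
    return None

def contacts_for_group(contacts, group):
    """Return only the contact entries designated as belonging to the selected group."""
    return [c for c in contacts if c.get("group") == group]
```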

When the title of the speaker is input through an information input device other than the voice, the dialogue system 100a may request the input of the voice of the speaker, and when the voice of the speaker is input, the dialogue system 100a may acquire a voice pattern through voice recognition. Accordingly, the dialogue system 100a can match the acquired voice pattern with the title and store the matched information.

The conversation system 100a can acquire names, titles, and telephone numbers of a plurality of speakers who can have a conversation with a user in a vehicle based on contact information transmitted from mobile devices of the plurality of speakers in the vehicle.

The dialogue system 100a can perform speech recognition on the chronologically input voices, and recognize whether the relationship between the recognized voices is a question and a response or an instruction and a response. The dialogue system 100a can acquire the relationship between speakers based on an utterance contained in speech information recognized in a dialogue having a question-and-response relationship or an instruction-and-response relationship.

When a voice uttered by a speaker in a conversation in the vehicle is input, the dialogue system 100a performs voice recognition in the order in which the voices are input and sequentially acquires an utterance and a voice pattern from the recognized voice information.

The dialogue system 100a acquires a title for acquiring a relationship with the user based on the utterance in the previously recognized voice information. When a voice corresponding to a response to a previous voice is input, the dialogue system 100a recognizes the input voice and acquires a voice pattern in the currently recognized voice information. Accordingly, the dialogue system 100a can match the previously acquired title with the currently acquired voice pattern and store the matched information.

The previous speech and the current speech may be speech having a query and response relationship, or speech having an instruction and response relationship.

Further, the dialogue system 100a may determine whether a title is included in the voice information of the response, and when the title is included in the voice information, the dialogue system 100a may match a voice pattern previously acquired through voice recognition with the acquired title, and store the matched information.

That is, the dialogue system 100a can perform speech recognition on the chronologically input speech, and recognize whether the relationship between the recognized speech is a question and a response, or an instruction and a response, based on the first speech and the second speech chronologically arranged in the recognized speech. The dialogue system 100a may acquire the relationship between the speakers based on the recognized relationship between the first speech and the second speech and the speech pattern of each speaker.

The dialogue system 100a acquires dialogue content of each speaker based on a speech pattern of each speaker and an utterance of each speaker.

The dialogue system 100a recognizes the intention and context of the currently speaking speaker based on the relationship between the speakers, the dialogue content of each speaker, and the acquired utterance, determines an action corresponding to the relationship between the speakers, the intention and context of the speaking speaker, and the recognized utterance, and outputs an utterance corresponding to the determined action. In addition, the dialogue system 100a may generate a control instruction corresponding to the determined action.

The dialogue system 100a may determine a priority based on the relationship of the passengers in the vehicle, and generate a control instruction for controlling one of the mobile devices belonging to the plurality of speakers in the vehicle based on the determined priority.

The dialog system 100a may determine a priority based on the relationship of the passengers in the vehicle and generate a communication connection instruction for communicating with one of the mobile devices of the plurality of speakers in the vehicle based on the determined priority.

When the seat position of each speaker is input through the information input device 220 other than voice, the dialogue system 100a may generate a control instruction for controlling the seat function of each seat based on the function control information of the mobile devices of a plurality of speakers in the vehicle.

The functions of the seat may include at least one of: a seat angle adjusting function, a seat front-rear distance adjusting function, a seat heating wire on/off function, a seat heating wire temperature adjusting function, and a seat ventilation on/off function.

The dialogue system 100a may determine the seat position of each speaker based on the voice signal input via the voice input device 210 and the acquired voice pattern.

The dialogue system 100a may estimate the utterance direction based on the time difference between the speech signals arriving at at least two microphones and the distance between the two microphones. Thus, the dialogue system 100a may determine the seat position of the speaker based on the estimated utterance direction.
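A minimal sketch of estimating the utterance direction from the arrival-time difference between two microphones, under a far-field assumption, is shown below. The seat mapping thresholds are purely illustrative and would depend on the actual microphone layout in the cabin.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at room temperature

def estimate_utterance_angle(delay_seconds, mic_distance_m):
    """Estimate the utterance direction from the arrival-time difference between
    two microphones: sin(theta) = (c * delay) / d, with theta measured from the
    broadside of the microphone pair (far-field assumption)."""
    ratio = SPEED_OF_SOUND * delay_seconds / mic_distance_m
    ratio = max(-1.0, min(1.0, ratio))          # clamp against measurement noise
    return math.degrees(math.asin(ratio))

def seat_from_angle(angle_deg):
    """Map an estimated angle onto a seat position; the angle ranges are illustrative."""
    if angle_deg < -30:
        return "driver seat"
    if angle_deg < 0:
        return "passenger seat"
    if angle_deg < 30:
        return "left rear seat"
    return "right rear seat"
```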

Alternatively, the dialog system 100a may receive the seat position of the speaker directly through a voice input device. That is, the dialog system may receive the location of each speaker through the user's voice.

When determined to be a pre-utterance context, the dialog system 100a may determine an action corresponding to the pre-utterance context based on the determined priority, and may output an utterance corresponding to the determined action.

The dialogue system 100a will be described in detail later.

The voice input device 210 may receive a user control instruction as a voice of the user in the vehicle 200 a. The voice input device 210 may include a microphone configured to receive sound and then convert the sound into an electrical signal.

The voice input device 210 may include one microphone or two or more microphones.

When a single microphone is provided, the microphone may be directional.

When two or more microphones are provided, the two or more microphones may be implemented in a microphone array.

Two or more microphones may be arranged at a distance.

The information input device 220 other than voice receives an instruction input by the user through a method other than voice.

The information input means 220 other than voice may include at least one of an input button and a knob for receiving an instruction by an operation of a user.

The information input device 220 other than voice may include a camera configured to image the user. In this case, the vehicle may receive an instruction through the image acquired by the camera. That is, the vehicle may recognize a gesture, an expression, or a gaze direction of the user in the image, and take the recognized information as a user instruction. Further, the vehicle can recognize the state of the user (such as a drowsy state) from the image acquired by the camera.

The information input device 220 other than voice may receive contact information and function control information from at least one of the plurality of mobile devices 400a, 400b, 400c, and 400d, and the information input device 220 other than voice may transmit the received information to the contextual information processor. The information input device other than voice 220 may receive a title and a seat position of a speaker in the vehicle, and the information input device other than voice 220 may receive a voice input instruction to acquire a voice pattern of the speaker.

The dialog output device 230 is a device configured to provide output to a speaker in a visual, auditory, or tactile manner. The dialogue output device 230 may include a display device 231 and a speaker 232 provided in the vehicle 200 a.

The display device 231 and the speaker 232 may output a response to an utterance of the user, a question about the user, or information requested by the user in a visual or audible manner. Further, the vibration may be output by mounting a vibrator in the steering wheel 207.

The vehicle controller 240a may transmit information acquired from sensors provided in the vehicle 200a, such as the remaining amount of fuel, the amount of rainfall, the rainfall speed, surrounding obstacle information, the speed, the engine temperature, the tire pressure, and the current location, to the dialogue system 100 a.

In addition to the data (i.e., information) acquired by the sensors provided in the vehicle 200a, the vehicle controller 240a may transmit, through the communication device 280, information acquired from the external content server 300, the mobile devices 400a, 400b, 400c, and 400d, or other external devices, including driving environment information and user information such as traffic conditions, weather, temperature, passenger information, and driver personal information. The vehicle controller 240a may also transmit function control information regarding a plurality of functions of the vehicle to the dialogue system 100a.

The vehicle controller 240a may transmit driving environment information acquired from the outside through vehicle-to-everything (V2X) communication to the dialogue system 100a.

The driving environment information transmitted to the dialogue system 100a may include traffic information ahead, approach information of nearby vehicles, a collision warning with another vehicle, real-time traffic conditions, accident situations, and traffic flow control states.

According to the response output from the dialogue system 100a, the vehicle controller 240a may control the vehicle 200a to perform an action corresponding to the user's intention or current situation. That is, the vehicle controller 240a may receive a control instruction for at least one function transmitted from the dialogue system 100a, and control the operation of at least one load to perform the at least one function based on the received control instruction.

The at least one function may include a window opening/closing function, a broadcast channel changing function, an air conditioner on/off function, an air conditioner temperature control function, a seat heating wire on/off function, a steering wheel heating wire on/off function, an audio type changing function, a volume adjusting function, and a communication connection function with a mobile device.

The at least one load performing the at least one function may include an air conditioner 251, a window 252, a door 253, a seat heater wire 254, and an AVN 255, and may further include a steering wheel heater wire, a broadcast receiver, and the communication device 280.

For example, when the dialogue system 100a determines that the user's intention or the service required by the user is to lower the temperature in the vehicle 200a and then generates and outputs a corresponding instruction, the vehicle controller 240a may lower the temperature in the vehicle 200a by controlling the air conditioner 251 according to the received instruction.

For another example, when the dialogue system 100a determines that the user's intention or the service required by the user is a route to guide to a specific destination and generates and outputs a corresponding instruction, the vehicle controller 240a may perform route guidance by controlling the AVN 255. The communication device 280 may acquire map data and POI information from the external content server 300 and then provide a service using the information, as necessary.

The vehicle controller 240a may monitor the status of at least one mobile device 400a, 400b, 400c, and 400d capable of communication and transmit the status information of the mobile device to the dialogue system 100 a.

The vehicle controller 240a may receive contact information from at least one of the plurality of mobile devices 400a, 400b, 400c, and 400d and transmit the received contact information to the dialogue system 100 a.

When the identification information and the function control information of at least one mobile device are received, the vehicle controller 240a transmits the received identification information and the function control information of the mobile device to the dialogue system.

When a control instruction of at least one mobile device is received from the dialogue system 100a, the vehicle controller 240a may transmit the received control instruction to at least one mobile device 400a, 400b, 400c, and 400 d.

The vehicle controller 240a may receive the titles of the plurality of speakers input through the information input device 220 other than voice, and transmit the received titles of the plurality of speakers to the dialogue system 100 a.

The vehicle controller 240a may receive the seat positions of the plurality of speakers input through the information input device 220 other than voice, and may transmit the received seat positions of the plurality of speakers to the dialogue system 100 a.

The vehicle controller 240a may recognize seat information, and match the seat information with each speaker and store the matched information.

The seat information may include at least one of seat inclination information, seat front-rear position information, seat heater wire on/off information, seat heater wire temperature information, and seat ventilation device on/off information.

The vehicle controller 240a may acquire the seat information detected by the detector, and may transmit the acquired seat information to the dialogue system.

The vehicle controller 240a may include a memory storing a program for performing the above-described operations and the operations described later, and a processor for executing the stored program. At least one memory and at least one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on one chip or physically separated.

The vehicle detector 260 (i.e., a detector) may detect vehicle state information such as the remaining amount of fuel, the tire pressure, the current vehicle position, the engine temperature, the vehicle speed, the brake pedal pressure, the accelerator pedal pressure, and the maintenance time.

The vehicle detector 260 detects traveling environment information such as an outside temperature, an inside temperature, whether the passenger is seated in the passenger seat, the left rear seat, or the right rear seat, a seat heater wire on/off, a seat ventilation on/off, a seat front-rear position, an outside humidity, an inside humidity, a rainfall speed, and adjacent obstacle information.

That is, the vehicle detector may include a plurality of sensors to detect vehicle state information and environmental information.

The detector configured to detect whether a passenger is seated in the passenger seat, the left rear seat, or the right rear seat may include a seat belt detector configured to detect whether each seat belt is fastened, or a weight detector or pressure detector provided in the seat to detect whether the passenger is seated.

The detector configured to detect the seat inclination angle and the front-rear position may detect the adjustment performed by an angle adjuster, which is provided in the seat to adjust the seat inclination angle, and by a position adjuster, which is configured to adjust the fore-aft position of the seat.

Further, the vehicle controller may receive a seat inclination angle adjusted by the angle adjuster and a seat position adjusted by the position adjuster.

The communication device 280 may communicate with the external content server 300 and the plurality of mobile devices 400a, 400b, 400c, and 400d, and may communicate with other vehicles and infrastructure.

The communication device 280 is also configured to transmit the received information to at least one of the dialogue system and the vehicle controller, and configured to transmit the information of the dialogue system 100a and the vehicle controller 240a to the outside.

The external content server 300 provides the dialogue system with information necessary for providing the user with a desired service, in response to a request transmitted from the dialogue system.

The plurality of mobile devices 400a, 400b, 400c, and 400d communicate with at least one of the vehicle controller 240a and the dialogue system 100a via the communication device 280 of the vehicle.

The plurality of mobile devices 400a, 400b, 400c, and 400d may be mobile devices of passengers in the vehicle. One of the mobile devices may be the mobile device of the owner of the vehicle, while the remaining mobile devices may be the mobile devices of passengers other than the owner of the vehicle.

Among the plurality of mobile devices 400a, 400b, 400c, and 400d, the mobile devices of the passengers other than the owner of the vehicle may provide only predetermined limited information to at least one of the vehicle controller 240a and the dialogue system 100a.

For example, the predetermined limited information may include function control information and contact information related to functions of the vehicle.

Fig. 51 is a detailed control block diagram showing a dialogue system according to another embodiment of the present invention, and will be described with reference to fig. 52 to 54.

Fig. 52 is a control block diagram illustrating an input processor of a dialogue system according to another embodiment of the present invention, fig. 53 is a detailed control block diagram showing the input processor of the dialogue system according to another embodiment of the present invention, and fig. 54 is a control block diagram showing a result processor of the dialogue system according to another embodiment of the present invention.

As shown in fig. 51, the dialogue system 100a includes an input processor 110a, a dialogue manager 120a, a result processor 130a, and a storage device 140a.

The input processor 110a may receive two types of input: the user's speech and input other than speech. The input other than speech may include the user's gestures, input other than speech entered through the operation of an input device, vehicle state information indicating the state of the vehicle, driving environment information related to the travel of the vehicle, and user information indicating the state of the user.

Further, in addition to the above information, information related to the user and the vehicle may be input to the input processor 110a as long as the information is used to recognize the user's intention or provide a service to the user or the vehicle. The users may include drivers and passengers.

The input processor 110a converts the user's speech into a text-type utterance by recognizing the user's speech, and recognizes the user's intention by applying a natural language understanding algorithm to the user's utterance.

The input processor 110a collects information related to a vehicle state or a driving environment of the vehicle, other than the user's voice, and then understands a context using the collected information.

The input processor 110a transmits the user's intention acquired through the natural language understanding technology and the information related to the context to the dialog manager 120 a.

As shown in fig. 52, the input processor 110a may include a voice input processor a1 and a contextual information processor a2, the voice input processor a1 being configured to receive information on a user's voice transmitted from the voice input device 210; the contextual information processor a2 is configured to receive input information other than user speech transmitted from the information input device other than speech 220.

The input processor 110a may further include a pre-utterance determiner 151.

As shown in fig. 52 to 53, the voice input processor a1 may include a voice recognizer a11, a natural language understanding part a12, and a dialogue input manager a13, the voice recognizer a11 outputting an utterance of a text type by recognizing an inputted voice of a user; the natural language understanding part a12 recognizes the user's intention contained in the utterance by applying a natural language understanding technique to the utterance of the user; the dialog input manager a13 sends the results of the natural language understanding and the context information to the dialog manager 120 a.

The speech recognizer a11 may include a speech recognition engine, and the speech recognition engine may recognize a speech uttered by a user and generate a recognition result by applying a speech recognition algorithm to the input speech.

The speech recognizer a11 recognizes the voice pattern of the input voice to identify the speaker.

An utterance in a text form corresponding to a recognition result of the speech recognizer a11 is input to the natural language understanding section a 12.

The speech recognizer a11 may also estimate the utterance location by utilizing a beamforming algorithm configured to generate a beam in a particular direction. The utterance location may represent a location of a speaker's seat.

The beamforming algorithm estimates the direction of speech production using the time difference between the signals arriving at at least two microphones and the distance between the microphones.

The beamforming algorithm may enhance only the speech signal at the estimated utterance position, and may remove or suppress the speech from the remaining positions as interference noise.

As described above, by using beamforming, the voice recognizer a11 can improve the performance of sound separation to remove or separate noise sources, or the performance of recognizing the utterance position, and by using post filtering, the voice recognizer a11 can reduce noise or reverberation having no directivity.
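As a simplified illustration of the enhancement step, a delay-and-sum beamformer aligned to the estimated utterance position could look as follows. This is a sketch under stated assumptions, not the speech recognizer's actual implementation.

```python
import numpy as np

def delay_and_sum(signals, delays_samples):
    """Minimal delay-and-sum beamformer.

    `signals` is an (n_mics, n_samples) array and `delays_samples` holds the integer
    delay (in samples) that aligns each microphone to the estimated utterance
    position.  Aligning and averaging reinforces the speech from that position,
    while signals from other positions add incoherently and are attenuated.
    """
    signals = np.asarray(signals, dtype=float)
    aligned = np.zeros_like(signals)
    for i, d in enumerate(delays_samples):
        aligned[i] = np.roll(signals[i], -int(d))   # crude alignment; edge samples ignored
    return aligned.mean(axis=0)
```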

The natural language understanding part a12 can recognize the user's intention contained in the utterance by applying a natural language understanding technique. Accordingly, the user can input a control instruction through a natural dialogue, and the dialogue system 100a can also induce the input of the control instruction and provide a service desired by the user through the dialogue.

The natural language understanding part a12 may perform morphological analysis on the utterance in text form. A morpheme is the smallest unit of meaning and represents the smallest semantic element that cannot be subdivided. Therefore, morphological analysis is the first step in natural language understanding and converts an input string into a morpheme string.

The natural language understanding part a12 may extract a domain from the utterance based on the morphological analysis result. The domain may be used to recognize the subject of the user utterance.

The natural language understanding part a12 may recognize an entity name from an utterance. The entity name may be a proper noun, such as a person name, a place name, an organization name, a time, a date, a currency, and a title, and the entity name recognition may be configured to recognize the entity name in the sentence and determine a type of the recognized entity name. The natural language understanding section a12 can extract important keywords from a sentence using entity name recognition and recognize the meaning of the sentence.

The natural language understanding part a12 can analyze the speech behavior present in the utterance. The verbal behavioral analysis may be configured to recognize the intent of the user utterance, e.g., whether the user asks a question, whether the user makes a request, whether the user gives instructions, whether the user responds, or whether the user simply expresses an emotion.

Based on the information stored in the relationship rule DB 141b, the natural language understanding part a12 can recognize whether an utterance is honorific or non-honorific and whether the utterance contains a respectful form of address.

The relationship rule DB 141b may store information related to groups, titles within a group, ranks corresponding to the titles, respectful forms of address, honorific expressions, and non-honorific expressions.

The group may be a group to which the user belongs and may include company, family, school, club, superior, and friend.

The natural language understanding part a12 may recognize the meaning of the sentence and the intention of the utterance based on at least one of the relationship stored in the short-term memory 144a, the priority of the speaker, the seat position of the speaker, and the function control information.

The natural language understanding section a12 extracts an action corresponding to the utterance intention of the user. The natural language understanding part a12 may recognize the intention of an utterance of a user based on information such as a domain, an entity name, and a speech behavior, and extract an action corresponding to the utterance. Actions may be defined by objects and operators.

The natural language understanding part a12 may extract parameters related to the execution of the action. The parameter related to the action execution may be a valid parameter directly required for the action execution or an invalid parameter for extracting a valid parameter.

The morphological analysis result, the domain information, the action information, the speech behavior information, the extracted parameter information, the entity name information, and the syntax tree, which are the processing results of the natural language understanding part a12, may be transmitted to the dialog input manager a 13.
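As an illustration only, the processing results handed from the natural language understanding part a12 to the dialogue input manager a13 could be represented by a structure like the one below. The field names and the example values are assumptions made for this sketch, not part of the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class NluResult:
    """Container for the natural language understanding results sent to the dialogue input manager."""
    morphemes: List[str]                 # morphological analysis result
    domain: str                          # e.g. "navigation", "vehicle_control"
    action: Dict[str, str]               # e.g. {"object": "air_conditioner", "operator": "on"}
    speech_act: str                      # question / request / instruction / response / emotion
    entities: Dict[str, str]             # entity name -> type (person, place, title, ...)
    parameters: Dict[str, str] = field(default_factory=dict)  # parameters related to action execution
    honorific: bool = False              # whether the utterance uses a respectful form

# A hypothetical result for "Senior manager Kim, shall we set the destination to the head office?"
example = NluResult(
    morphemes=["senior", "manager", "Kim", "destination", "head", "office", "set"],
    domain="navigation",
    action={"object": "destination", "operator": "set"},
    speech_act="question",
    entities={"senior manager Kim": "title", "head office": "place"},
    parameters={"destination": "head office"},
    honorific=True,
)
```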

The contextual information processor a2 receives contextual information, such as vehicle state information, driving environment information, and user information, input from the information input device 220 and the vehicle controller 240a, in addition to voice.

As shown in fig. 52 to 53, the context information processor a2 may include a context information collector a21, a context information collection manager a22, and a context understanding part a23, the context information collector a21 collecting information from the information input device 220 and the vehicle controller 240a other than voice; the context information collection manager a22 manages the collection of context information; the context understanding section a23 understands a context based on the result of natural language understanding and collected context information.

More specifically, the context information collector a21 of the context information processor a2 may collect data periodically or only when a specific event occurs. In addition, the context information collector a21 may collect data periodically and then additionally collect data when a specific event occurs.

Further, when receiving a data collection request from the context information collection manager a22, the context information collector a21 may collect data.

The context information collector a21 may collect required information and then store the information in the context information DB142a or the short term memory 144 a. The context information collector a21 may send a confirmation signal to the context information collection manager a 22.

The context information collector a21 may receive contact information from at least one of a plurality of mobile devices and function control information from at least one of the plurality of mobile devices.

The function control information may be control information regarding a function controlled by a user among functions performed in the vehicle.

The context information collection manager a22 sends a confirmation signal to the context understanding part a 23. The context understanding part a23 collects required information from the context information DB142a, the long term memory 143a or the short term memory 144a, and transmits the collected information to the dialog input manager a 13.

For example, the context information collector a21 may receive destination information, travel time information, travel distance information, high speed travel information, window opening/closing information, broadcast on/off information, broadcast channel change information, air conditioner on/off information, air conditioner temperature control information, seat heater wire on/off information, seat front/rear position adjustment information, steering wheel heater wire on/off information, audio type change information, volume adjustment information, communication connection information with a mobile device, contact information of the mobile device, function control information of the mobile device, external temperature information, internal temperature information, external humidity information, internal humidity information, brake pedal pressure information, accelerator pedal pressure information, maintenance information, and fuel information.

The context information collection manager a22 monitors various information received and sends the monitored information to the context understanding part a 23.

The context understanding section a23 acquires the relationship between speakers based on the received contact information.

The context understanding section a23 determines a ranking (which is a priority) between speakers in the vehicle based on at least one of the received function control information, the relationship, and the title.

Context understanding portion a23 may identify the conversation flow and context or intent of each speaker based on the conversation content of each speaker over time.

When the pre-utterance determiner 151 determines that it is a pre-utterance context, the context understanding part a23 may recognize priorities among speakers and recognize a speaker having the highest priority among speakers in the vehicle.

Context understanding portion a23 may also identify the identification information or phone number of the mobile device of the speaker with the highest priority.

The pre-utterance determiner 151 may analyze the transmitted data and determine whether the transmitted data satisfies a pre-utterance condition stored in the pre-utterance condition table 145a.

The pre-utterance condition table 145a may store, for each item of context information, a pre-utterance condition related to that context information and a pre-utterance message to be output when the corresponding pre-utterance condition is satisfied.

When the transmitted context information satisfies a pre-utterance condition stored in the pre-utterance condition table 145a, the pre-utterance determiner 151 may determine that the situation is a pre-utterance context and generate a pre-utterance trigger signal.

The pre-utterance determiner 151 may send pre-utterance trigger signals and pre-utterance messages corresponding to corresponding pre-utterance contexts to the context understanding portion a 23. Further, the pre-utterance determiner 151 may transmit information related to a corresponding pre-utterance context. The information related to the corresponding pre-utterance context may include a pre-utterance condition corresponding to the corresponding pre-utterance context or an action corresponding to the pre-utterance context, which is described later.
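For illustration, the pre-utterance condition table 145a described above might be organized as follows. The context items, thresholds, and messages are assumed values chosen for this sketch, not those of the embodiment.

```python
# Each row pairs an item of context information with the condition under which a
# pre-utterance is triggered and the message to be output; all values are illustrative.
PRE_UTTERANCE_CONDITION_TABLE = [
    {
        "context_info": "tire_pressure",
        "condition": lambda v: v <= 30,           # assumed threshold (psi)
        "message": "The tire pressure is too low.",
    },
    {
        "context_info": "engine_temperature",
        "condition": lambda v: v >= 120,          # assumed threshold (degrees C)
        "message": "There is a problem with the engine temperature.",
    },
    {
        "context_info": "remaining_fuel",
        "condition": lambda v: v <= 5,            # assumed threshold (liters)
        "message": "The remaining fuel is low. Shall I guide you to a gas station?",
    },
]

def check_pre_utterance(context_info, value):
    """Return the pre-utterance message when the transmitted context information
    satisfies its stored condition, otherwise None."""
    for row in PRE_UTTERANCE_CONDITION_TABLE:
        if row["context_info"] == context_info and row["condition"](value):
            return row["message"]
    return None
```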

When the context information related to the execution of the action corresponding to the intention of the user utterance is not stored in the context information DB142a, the long-term memory 143a, or the short-term memory 144a, the context understanding part a23 requests the context information collection manager a22 for the required information.

The dialogue manager 120a determines an action corresponding to the user's intention or the current context based on the user's intention, the relationship between speakers, and the context-related information transmitted from the input processor 110a, and manages the parameters required to perform the corresponding action.

According to an embodiment, the action may represent various actions for providing a specific service, and the kind of the action may be predetermined. Providing the service may correspond to performing the action, as desired.

For example, actions such as route guidance, vehicle state check, and gas station recommendation may be predefined in the domain/action inference rule DB 141a (refer to fig. 53), and an action corresponding to an utterance of the user, that is, an action expected by the user, may be extracted according to a stored inference rule. The action related to the event occurring in the vehicle may be predefined and then stored in the context understanding table 145 (refer to fig. 53).

The kind of the action is not limited. Any operation may be such an action as long as the dialogue system 100a can perform it via the vehicle 200a or the mobile devices 400a, 400b, 400c, and 400d, and its inference rule or its relationship to other actions/events is predefined and stored.

Dialog manager 120a sends information related to the determined action to results processor 130 a.

The results processor 130a generates and outputs a dialog response and instructions necessary to perform the sent action. The dialog response may be output in text, image or audio type. When the instruction is output, services such as vehicle control and provision of external content corresponding to the output instruction may be executed.

Referring to fig. 54, the result handler 130a may include a response generation manager 131a, a dialog response generator 132a, an output manager 133a, a service editor 134a, a memory manager 135a, and an instruction generator 136a, the response generation manager 131a managing generation of a response required to perform an action input from the dialog manager 120 a; the dialog response generator 132a generates a response of a text, image or audio type according to the request of the response generation manager 131 a; the instruction generator 136a generates an instruction for vehicle control or providing service using external contents according to the request of the response generation manager 131 a; the service editor 134a continuously or intermittently executes a plurality of services and collects the results thereof to provide a service desired by a user; the output manager 133a outputs the generated text type response, image type response, or audio type response, outputs the instruction generated by the instruction generator 136a, or determines the order of output when the output is plural; the memory manager 135a manages the long-term memory 143a and the short-term memory 144a based on the outputs of the response generation manager 131a and the output manager 133 a.

The mobile device information storage 144b stores contact information and function control information of a plurality of mobile devices. In addition, the mobile device information storage 144b may also store the positions of the seats occupied by the owners of the plurality of mobile devices.

The short-term memory 144a stores the conversation contents of each speaker in the vehicle, stores the conversation contents in chronological order, stores the context and intention of each speaker, and stores the seat position of the speaker in the vehicle.

The instruction generator 136a may generate a seat control instruction for each seat based on the seat position of the speaker and the function control information stored in the mobile device of the speaker.

The instruction generator 136a may identify the speaker having the highest priority based on the acquired priority, and the instruction generator 136a may generate a control instruction for controlling the vehicle function based on the function control information stored in the mobile device of the speaker having the highest priority.

Based on the relationship between the pre-utterance context and the speaker, instruction generator 136a may generate control instructions for controlling at least one function performed in the vehicle.
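A minimal sketch of the instruction generation described above is given below. The data layouts and the priority convention (1 being the highest priority) are assumptions made for illustration.

```python
def generate_seat_instructions(speaker_seats, function_control_info):
    """Build one seat-control instruction per occupied seat.

    `speaker_seats` maps a speaker id to a seat position, and `function_control_info`
    maps a speaker id to the seat settings read from that speaker's mobile device
    (angle, fore-aft position, heating wire, ventilation).
    """
    instructions = []
    for speaker, seat in speaker_seats.items():
        settings = function_control_info.get(speaker, {})
        instructions.append({"target": seat, "settings": settings})
    return instructions

def generate_priority_instruction(priorities, function_control_info, vehicle_function):
    """Control a shared vehicle function (e.g. air-conditioner temperature) using the
    settings stored in the mobile device of the speaker with the highest priority."""
    top_speaker = min(priorities, key=priorities.get)   # 1 = highest priority (assumed convention)
    settings = function_control_info.get(top_speaker, {})
    return {"function": vehicle_function, "value": settings.get(vehicle_function)}
```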

The service editor 134a outputs various information to the external server 300a.

The various information may be information on a control instruction for controlling the mobile device or information for requesting the mobile device to provide information.

Detailed components of the result processor 130a are the same as those according to the embodiment, and thus a description thereof will be omitted.

The storage device 140a stores various information for conversation processing and providing services.

For example, the storage 140a may previously store information related to domains, actions, verbal behaviors, and entity names for natural language understanding, and a context understanding table for understanding a context by inputting information. Further, the storage device 140a may store data detected by a sensor provided in the vehicle, information related to the user, and information required for the action in advance.

The storage device 140a stores vehicle state information and running environment information.

The storage device 140a stores contact information and function control information stored in a plurality of mobile devices.

The storage device 140a may store the relationship of speakers who may be riding in the vehicle and the speech patterns of the speakers. Further, the storage device 140a may store the relationship of speakers in groups.

The storage device 140a may store a title of the speaker, identification information of the speaker's mobile device, a phone number, a name, and function control information.

The storage device 140a may store the priority of the talker in the vehicle, and may store the seat position of the talker.

Fig. 55 is a control block diagram showing a vehicle having a dialogue system according to another embodiment of the present invention, which will be described with reference to fig. 56A to 61.

When the ignition is turned on, the vehicle supplies power to various loads to drive. At this time, the dialogue system provided in the vehicle may be activated together with the voice input device, the information input device other than voice, and the communication device.

The vehicle may communicate with a plurality of mobile devices via a communication device. At this time, the vehicle may receive predetermined limited information from a plurality of mobile devices. Further, the vehicle may receive predetermined limited information from the owner's mobile device.

The limited information indicates information that can be shared among information stored in the mobile devices of the vehicle occupants, at a level that does not infringe the privacy of each occupant. The limited information may include contact information, function control information of the vehicle, and music information.

Information received from multiple mobile devices may be sent to the dialog system.

As shown in fig. 56A to 56D, the dialogue system of the vehicle may receive contact information stored in each mobile device from the mobile devices of the passengers of the vehicle, and may also receive the name and phone number of the owner of the mobile device.

The contact information may include names, groups, and phone numbers. Further, the name may include a title.

The dialogue system of the vehicle stores contact information received from the plurality of mobile devices. At this time, the dialogue system compares the names and phone numbers of the received contacts, searches for contacts having the same name and phone number, determines that the retrieved contacts are duplicates, and deletes the duplicates so that only one copy is kept.
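The duplicate-contact removal described above can be sketched as follows, assuming each contact entry is a dict with name, group, and phone fields (an assumed layout, not specified by the embodiment).

```python
def merge_contacts(contact_lists):
    """Merge contact lists received from several mobile devices, keeping only one
    copy of contacts that share both name and phone number."""
    merged = {}
    for contacts in contact_lists:
        for entry in contacts:
            key = (entry.get("name"), entry.get("phone"))
            merged.setdefault(key, entry)      # first occurrence wins, duplicates are dropped
    return list(merged.values())
```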

The dialog system of the vehicle obtains the relationship between the passengers based on the grouping of the contact information and the name.

For example, when the contact information is stored in the "company" group and the name is "senior manager Kim," the dialogue system may determine that the relationship between the owner of the mobile device storing the contact information and the passenger with the job title "senior manager" is that of co-workers of the same company or employees of a partner company.

When no grouping is specified, the dialog system may determine whether a title exists in the names included in the contact information, and when it is determined that the title is included in the names, the dialog system may obtain the relationship based on the title.

For example, when the name in the contact information is stored as "senior manager Kim," the dialogue system may determine that the relationship between the owner of the mobile device storing the contact information and the passenger with the job title "senior manager" is that of co-workers of the same company or employees of a partner company.
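A minimal sketch of deriving a relationship from the group and from a title contained in a contact name is shown below; the title list and the relationship strings are illustrative assumptions rather than the content of the relationship rule DB.

```python
# Illustrative title-to-relationship rules; the real relationship rule DB would be richer.
KNOWN_TITLES = {
    "senior manager": "co-worker of the same company or employee of a partner company",
    "sub-manager": "co-worker of the same company or employee of a partner company",
    "mother": "family",
    "professor": "school",
}

def relationship_from_contact(entry):
    """Derive a relationship from the group first, then fall back to a title found in the name."""
    group = entry.get("group")
    name = entry.get("name", "").lower()
    for title, relationship in KNOWN_TITLES.items():
        if title in name:
            return {"title": title, "group": group, "relationship": relationship}
    return {"title": None, "group": group, "relationship": group}
```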

Further, the vehicle may receive information on the name and relationship of each passenger through an information input device other than voice, and store the information on the name and relationship with the passenger in the dialogue system. At this time, the relationship of each passenger may be the title of each passenger, i.e., the title of each speaker.

The vehicle may receive information about the name and relationship of each passenger as voice via the voice input device and transmit the input voice to the dialogue system. At this time, the dialogue system of the vehicle may perform voice recognition on the input voice, acquire the name and relationship of each passenger based on the recognized voice information, and store the acquired name and relationship of each passenger.

The dialogue system of the vehicle may be configured to perform voice recognition on voice of each speaker input through the voice input device, and store a voice pattern in voice information of the recognized voice for each speaker.

That is, as shown in fig. 57, the dialogue system of the vehicle may store information on speakers who can speak in the vehicle, wherein the information may include a name, a group, a phone number, a relationship, and a voice pattern of each speaker. Alternatively, the dialog system of the vehicle may store only the name, relationship, and speech pattern of each speaker.

The dialogue system of the vehicle can directly receive information on a speaker who can speak in the vehicle through the voice input device and the information input device other than voice.

Further, the dialogue system of the vehicle may automatically acquire the speakers in the vehicle by analyzing and recognizing the voices of the speakers input via the voice input device. This will be described below.

When a conversation is conducted between speakers in a vehicle, voices of the speakers in the vehicle may be input in chronological order through a voice input device (1001).

That is, the vehicle can receive voice in real time through the voice input device. At this time, the dialogue system of the vehicle performs voice recognition in the order of voice input to the voice input device.

Performing speech recognition proceeds in the following manner: acquiring a text-type utterance of the voice spoken by the user by applying a speech recognition algorithm; recognizing the user's intention contained in the utterance by applying a natural language understanding algorithm; performing morphological analysis on the utterance in text form; and extracting a domain from the utterance based on the morphological analysis result. The domain represents the subject of the recognized user utterance.

Performing voice recognition also includes recognizing an entity name from the utterance, analyzing the speech behavior contained in the utterance, recognizing whether the utterance is honorific or non-honorific based on the information stored in the relationship rule DB 141b, and recognizing whether the utterance contains a respectful form of address.

The entity name may be a proper noun, such as a person name, a place name, an organization name, a time, a date, a currency, or a title, and entity name recognition may be configured to recognize the entity name in a sentence and determine the type of the recognized entity name.

The verbal behavioral analysis may be configured to recognize the intent of the user utterance, e.g., whether the user asks a question, whether the user makes a request, whether the user gives instructions, whether the user responds, or whether the user simply expresses an emotion.

By utilizing a beamforming algorithm configured to generate a beam in a particular direction, performing speech recognition may also include estimating the utterance location, which is the seat position of the speaker.

Performing speech recognition may also include recognizing the voice pattern of the input speech.

The dialogue system of the vehicle may acquire a seat position of each speaker by recognizing a position from which a voice having the recognized voice pattern is uttered, and store the acquired seat position of each speaker.

As described above, by utilizing the dialogue system, the vehicle can perform speech recognition on a dialogue in progress in the vehicle in real time, acquire an utterance and a voice pattern from the speech of the speaker recognized in real time (1002), classify the utterances according to the voice pattern, and store the classified utterances. That is, since the vehicle classifies the utterances according to voice pattern and stores them, the vehicle can classify the dialogue content by speaker (1003).
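A minimal sketch of classifying the dialogue content by speaker based on voice patterns (steps 1002 and 1003) is given below. The similarity function, the threshold, and the speaker naming are assumptions made for illustration.

```python
def classify_by_speaker(recognized_segments, known_patterns, similarity, threshold=0.8):
    """Group chronologically recognized utterances by speaker.

    `recognized_segments` is a list of (utterance_text, voice_pattern) tuples in input
    order, `known_patterns` maps a speaker id to a stored voice pattern, and
    `similarity` is a pattern-comparison function (for example, cosine similarity of
    speaker embeddings).  The threshold for declaring a new speaker is an assumed value.
    """
    conversations = {}
    for text, pattern in recognized_segments:
        speaker = None
        for known_id, known_pattern in known_patterns.items():
            if similarity(pattern, known_pattern) >= threshold:
                speaker = known_id
                break
        if speaker is None:                       # no match: register a new speaker
            speaker = f"speaker_{len(known_patterns) + 1}"
            known_patterns[speaker] = pattern
        conversations.setdefault(speaker, []).append(text)
    return conversations
```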

The dialogue system of the vehicle may determine whether a title is included in the acquired utterance through a process of recognizing an entity name from the utterance, and when it is determined that the title is included in the utterance, the dialogue system of the vehicle may acquire a relationship between the speaker and another speaker based on the determined title (1004).

The dialogue system of the vehicle determines whether a respectful form is included in the acquired utterance through the process of recognizing entity names from the utterance, and when it is determined that a respectful form is included in the utterance, the dialogue system of the vehicle may acquire the relationship between the speaker and another speaker based on the determined title, and acquire the rank (i.e., level) between the speaker and the other speaker.

More specifically, the dialogue system of the vehicle may perform speech recognition on chronologically input voices, and recognize whether a relationship between the recognized voices is a question and response, an instruction and a response, or a request and a response, based on a first speech and a second speech chronologically arranged in the recognized voices. The dialogue system 100a can acquire the relationship between speakers based on an utterance corresponding to speech information recognized in a dialogue having a query and response relationship, an instruction and response relationship, or a request and response relationship, and sequentially acquire the speech pattern of each speaker.

That is, the dialogue system of the vehicle may acquire a title to acquire a relationship with the user based on an utterance in the voice information of the previous voice recognition, and when a voice corresponding to a response to the previous utterance is input, the dialogue system of the vehicle may recognize the input voice. The dialogue system of the vehicle may acquire a voice pattern in the voice information of the current voice recognition, match a previously acquired title with the voice pattern of the current acquisition, and store the matched information.

The dialog system of the vehicle may determine that a speaker having a title contained in previously uttered speech utters current speech in response to the previous speech. Accordingly, the dialogue system of the vehicle stores the title contained in the previous voice as a relationship, matches the voice pattern of the current voice with the relationship, and stores the matched information.

Further, the dialogue system of the vehicle may determine whether the title is included in the current voice, and when it is determined that the title is included in the current voice, the dialogue system of the vehicle may match a voice pattern acquired through previously performed voice recognition with the currently acquired title and store the matched information.

That is, since the dialogue system determines that the current response voice is the response voice of the speaker who uttered the previous question voice, the dialogue system can store the title included in the response voice as the relation. The dialog system may match the relationship to the speech pattern of the previous speech and store the matched information.
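The title-to-voice-pattern matching described above can be sketched as follows. The segment layout and the extract_title helper are assumptions introduced for illustration.

```python
def match_title_to_responder(previous, current, extract_title):
    """If the previous utterance addresses someone by a title and the current utterance
    is its response, bind that title to the responder's voice pattern; conversely, a
    title inside the response is bound to the questioner's voice pattern.

    `previous` and `current` are dicts like {"text": ..., "voice_pattern": ...,
    "speech_act": "question"/"response"}; `extract_title` pulls a title out of text.
    """
    bindings = []
    if previous["speech_act"] in ("question", "instruction", "request") \
            and current["speech_act"] == "response":
        title = extract_title(previous["text"])
        if title:                                 # title in the question -> responder
            bindings.append((title, current["voice_pattern"]))
        title = extract_title(current["text"])
        if title:                                 # title in the response -> questioner
            bindings.append((title, previous["voice_pattern"]))
    return bindings
```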

Next, the dialog system of the vehicle may understand the intention and context of each speaker based on the relationship between speakers in the vehicle and the dialog content of each speaker (1005). The dialogue system of the vehicle may determine an action based on at least one of a relationship between speakers, an intention and context of each speaker, vehicle state information, and vehicle driving environment information, and output a speech corresponding to the determined action (1007).

The dialog system of the vehicle may determine an action based on at least one of the plurality of mobile devices capable of communicating in the vehicle and output an utterance corresponding to the determined action.

This will be described with reference to fig. 58.

The dialogue system of the vehicle may perform voice recognition on the voice of a first speaker (U1) in the vehicle, acquire the title "senior manager" based on the result of the voice recognition, recognize that the utterance intention is a question, and acquire the voice pattern of the first speaker (U1).

The dialogue system of the vehicle may perform voice recognition on the voice of a second speaker (U2) in the vehicle, recognize that the utterance intention is a response, and acquire the voice pattern of the second speaker (U2).

The dialogue system of the vehicle may recognize that the relationship between the second speaker and the other speakers is "senior manager" based on the dialogue between the first speaker and the second speaker. The dialogue system of the vehicle may match the second speaker's title "senior manager" with the second speaker's voice pattern and store the matched information.

In the dialogue between the first speaker and the second speaker, the dialogue system of the vehicle can detect that the speech of the first speaker contains honorific expressions and that the speech of the second speaker contains non-honorific expressions. Accordingly, the dialogue system of the vehicle can acquire the ranking between the first speaker and the second speaker. That is, the dialogue system of the vehicle may acquire level information indicating that the level of the second speaker is higher than the level of the first speaker.

The vehicle may recognize the vehicle state information and the vehicle driving environment information, determine an action corresponding to the conversation content between the first speaker and the second speaker and at least one of the recognized vehicle state information and vehicle driving environment information, and output an utterance corresponding to the determined action (S1).

For example, when the final destination is determined by a conversation between the first speaker and the second speaker, the dialog system may compare the navigation destination of the AVN with the final destination, and when the two destinations are different, the dialog system may output an utterance indicating that the navigation destination of the AVN is set as the final destination according to the conversation between the first speaker and the second speaker.

When the speech of the first speaker is input after completion of the speech recognition of the second speaker, the dialogue system of the vehicle may recognize a relationship between the utterance intention of the speech of the first speaker currently input and the utterance intention of the speech of the second speaker previously input. When it is determined that there is no relationship between the current first speaker's speech and the previous second speaker's speech, the dialog system may perform speech recognition on the current first speaker's speech to obtain an utterance, a speech pattern, and an utterance intention, and store the obtained utterance, speech pattern, and utterance intention. Next, the dialog system can prepare for the input of the next speech.

At this time, the dialogue system of the vehicle may acquire the title "sub-manager C" based on the result of the voice recognition, acquire a question voice corresponding to the utterance intention, and acquire the voice pattern of the first speaker (U1). By comparing the voice patterns acquired from the previous two dialogues with the currently acquired voice pattern, the dialogue system of the vehicle can determine that the currently speaking speaker is the same person as the first speaker.

When the voice of the third speaker is input, the dialogue system of the vehicle can acquire the utterance, the voice pattern, and the utterance intention by performing voice recognition on the voice of the third speaker. The dialogue system of the vehicle may acquire a response voice corresponding to the utterance intention based on a result of the voice recognition, and acquire a voice pattern of the third talker (U3).

That is, the dialogue system of the vehicle may determine whether a voice pattern identical to the currently acquired voice pattern exists among the voice patterns acquired from the previous three dialogues, and when it is determined that no identical voice pattern exists, the dialogue system of the vehicle may recognize that the speaker currently speaking is a new speaker.
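
One way to realize this comparison, assuming each voice pattern is represented as a feature vector, is sketched below; the cosine-similarity representation and the 0.9 threshold are assumptions, not values taken from the disclosure.

```python
# Illustrative assumption: identify a speaker by comparing the current voice
# pattern (feature vector) against stored patterns; no match -> new speaker.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def identify_speaker(current_pattern, stored_patterns, threshold=0.9):
    """Return the id of the matching speaker, or None for a new speaker."""
    best_id, best_sim = None, 0.0
    for speaker_id, pattern in stored_patterns.items():
        sim = cosine_similarity(current_pattern, pattern)
        if sim > best_sim:
            best_id, best_sim = speaker_id, sim
    return best_id if best_sim >= threshold else None

stored = {"U1": [0.9, 0.1, 0.3], "U2": [0.2, 0.8, 0.1]}
print(identify_speaker([0.15, 0.82, 0.12], stored))  # 'U2'
print(identify_speaker([0.1, 0.1, 0.95], stored))    # None -> new speaker
```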

The dialog system of the vehicle may recognize that the relationship between the third speaker and the other speaker is "sub-manager" based on the dialog between the first speaker and the third speaker. The dialog system of the vehicle may match the third speaker "sub-manager" with the third speaker's voice pattern and store the matched information.

In the conversation between the first speaker and the third speaker, the dialogue system of the vehicle can detect that the utterance of the first speaker contains non-honorific (casual) expressions and that the utterance of the third speaker contains honorific expressions. Accordingly, the dialogue system of the vehicle can acquire the ranking between the first speaker and the third speaker. That is, the dialogue system of the vehicle may acquire level information indicating that the level of the first speaker is higher than the level of the third speaker.

When the voice of the third speaker is input after the voice recognition of the third speaker is completed, the dialogue system of the vehicle may recognize a relationship between the utterance intention of the currently input voice of the third speaker and the utterance intention of the previously input voice of the third speaker. When it is determined that there is no relationship between the current third speaker's voice and the previous third speaker's voice, the dialog system may perform voice recognition on the current third speaker's voice to acquire the utterance, the voice pattern, and the utterance intention, and store the acquired utterance, the voice pattern, and the utterance intention. Next, the dialog system can prepare for the input of the next speech.

At this time, the dialogue system of the vehicle may acquire the title "director" based on the result of the voice recognition, acquire a question voice corresponding to the utterance intention, and acquire the voice pattern of the third speaker (U3). By comparing the voice patterns acquired from the previous four dialogues with the currently acquired voice pattern, the dialogue system of the vehicle can determine that the currently speaking speaker is the same person as the third speaker.

When the voice of the fourth speaker (U4) is input, the dialogue system of the vehicle can acquire an utterance, a voice pattern, and an utterance intention by performing voice recognition on the voice of the fourth speaker. The dialogue system of the vehicle may acquire a response voice corresponding to the utterance intention based on a result of the voice recognition, and acquire a voice pattern of the fourth speaker (U4).

That is, the dialogue system of the vehicle may determine whether a voice pattern identical to the currently acquired voice pattern exists among the voice patterns acquired from the previous four dialogues, and when it is determined that no identical voice pattern exists, the dialogue system of the vehicle may recognize that the speaker currently speaking is a new speaker.

The dialogue system of the vehicle may recognize that the relationship of the fourth speaker to the other speakers is "director" based on the dialogue between the third speaker and the fourth speaker. The dialogue system of the vehicle may match the title "director" with the voice pattern of the fourth speaker and store the matched information.

In the dialogue between the third speaker and the fourth speaker, the dialogue system of the vehicle may detect that the utterance of the third speaker contains honorific expressions and that the utterance of the fourth speaker contains non-honorific (casual) expressions. Accordingly, the dialogue system of the vehicle can acquire the ranking between the third speaker and the fourth speaker. That is, the dialogue system of the vehicle may acquire level information indicating that the level of the fourth speaker is higher than the level of the third speaker.

The vehicle may recognize the vehicle state information and the vehicle driving environment information, determine an action corresponding to the conversation content between the third speaker and the fourth speaker and at least one of the recognized vehicle state information and vehicle driving environment information, and output an utterance corresponding to the determined action (S1).

For example, when an action of increasing the temperature of the air conditioner by one degree is determined through the conversation between the third speaker and the fourth speaker, the dialogue system may output an utterance indicating that the target temperature of the air conditioner will be increased by one degree.

Further, the dialog system of the vehicle may recognize the ranks of the first speaker, the second speaker, the third speaker, and the fourth speaker through the dialogues of the first speaker, the second speaker, the third speaker, and the fourth speaker, and may determine the priorities corresponding to the ranks.

As a result, the dialogue system of the vehicle may recognize that the job position of the second speaker is "senior manager" because the first speaker addresses the second speaker by the title "senior manager". The dialogue system of the vehicle may recognize that the job position of the third speaker is "sub-manager" because the first speaker addresses the third speaker by the title "sub-manager". The dialogue system of the vehicle may recognize that the job position of the fourth speaker is "director" because the third speaker addresses the fourth speaker by the title "director".

Since the first speaker addresses the second speaker with honorifics, the dialogue system of the vehicle may recognize that the second speaker is ranked higher than the first speaker. Since the third speaker addresses the first speaker with honorifics, the dialogue system of the vehicle may recognize that the first speaker is ranked higher than the third speaker. Since the third speaker addresses the fourth speaker with honorifics, the dialogue system of the vehicle may recognize that the fourth speaker is ranked higher than the third speaker.

In addition, the dialogue system may not be able to identify the job position of the first speaker from the dialogue. However, the dialogue system may estimate that the job position of the first speaker is "manager", or that the job position of the first speaker is the same as that of the third speaker but with the first speaker being more senior than the third speaker, or that the job position of the first speaker is the same as that of the second speaker but with the first speaker being less senior than the second speaker. Thus, the dialogue system may store the estimated rank as a priority.
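
A sketch of such an estimate is given below. The job-position ladder and its ordering are assumptions used only to illustrate the reasoning; the estimated position would then be mapped to a priority as described above.

```python
# Illustrative assumption: job-position ladder ordered from low to high.
LADDER = ["sub-manager", "manager", "senior manager", "director"]

def estimate_position(below, above):
    """Positions consistent with ranking above `below` and below `above`
    (equal positions remain possible when seniority breaks the tie)."""
    lo, hi = LADDER.index(below), LADDER.index(above)
    return LADDER[lo:hi + 1]

# The first speaker ranks above the sub-manager and below the senior manager.
print(estimate_position("sub-manager", "senior manager"))
# ['sub-manager', 'manager', 'senior manager'] -> "manager" is the middle estimate
```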

The dialog system of the vehicle may determine whether it is a pre-utterance context, and when it is determined to be the pre-utterance context, the dialog system may determine an action corresponding to the pre-utterance context and output an utterance corresponding to the determined action.

When it is determined that control of the at least one function is required in response to the determined action, the dialog system of the vehicle may generate control instructions to control the at least one function based on the determined action (1008), and based on the generated control instructions, the dialog system of the vehicle may control the at least one function by controlling at least one operation of a plurality of loads provided in the vehicle (1009).

More specifically, the dialogue system of the vehicle determines whether the context information transmitted from the context information DB 142a, the long-term memory 143a, or the short-term memory 144a satisfies a pre-utterance condition of the pre-utterance condition table 145a, and when the context information satisfies a pre-utterance condition, the dialogue system of the vehicle determines whether a task related to the currently occurring pre-utterance context is repeated.

The dialogue system of the vehicle may terminate the pre-utterance context when it is determined that the task related to the currently occurring pre-utterance context is repeated, and may perform the pre-utterance operation when it is determined that the task is not repeated.

Performing the pre-utterance operation may include outputting, via the speaker 232, a pre-utterance message corresponding to the pre-utterance context. When a speaker's voice is input, the dialogue system may recognize the input voice and perform a function corresponding to the recognized voice.

For example, when a route to a destination needs to be suggested due to a change in traffic information, the dialogue system may output a voice regarding the route change suggestion as a preliminary utterance, and when a response voice indicating agreement to the route change is input, the dialogue system may output a control instruction regarding the route change to the AVN so as to change the route to the destination. The AVN changes a route to a destination and outputs route guidance information based on the changed route.
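
The pre-utterance flow described above can be summarized in a short sketch; the table contents, context identifier, and message text are assumptions introduced only for illustration.

```python
# Illustrative assumption: a pre-utterance condition table and a check that
# the related task is not already in progress before the system speaks first.
PRE_UTTERANCE_TABLE = {
    "traffic_change": "Traffic conditions have changed. Shall I suggest a new route?",
}

active_tasks = set()

def handle_pre_utterance(context_id, condition_satisfied):
    if not condition_satisfied:
        return None                      # condition not met: stay silent
    if context_id in active_tasks:
        return None                      # task repeated: terminate the context
    active_tasks.add(context_id)         # not repeated: perform the pre-utterance
    return PRE_UTTERANCE_TABLE[context_id]

print(handle_pre_utterance("traffic_change", True))   # outputs the suggestion
print(handle_pre_utterance("traffic_change", True))   # None: duplicate suppressed
```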

The dialogue system of the vehicle may acquire the relationship, the voice pattern, and the seat position based on the dialogue between the speakers in the vehicle, or the dialogue system of the vehicle may acquire information about the speakers based on information stored in the storage device. This will be described later.

The dialogue system of the vehicle may acquire a voice pattern by voice-recognizing a dialogue between speakers in the vehicle, recognize the speakers in the vehicle by comparing the acquired voice pattern with the voice patterns stored in the storage device 140a, and acquire information about the speakers stored in the storage device 140a.

The acquired information of the speaker may include a name, a phone number, a relationship, and a voice pattern.

By utilizing a beamforming algorithm configured to generate a beam in a particular direction, the dialog system may acquire an utterance location corresponding to a seat location of a speaker. The dialogue system of the vehicle can acquire the seat position of each speaker by checking the utterance position from which the acquired voice pattern is uttered.
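
A simplified sketch of mapping an estimated utterance direction to a seat is shown below. The azimuth convention and sector boundaries are assumptions; the actual direction estimate would come from the beamforming front end.

```python
# Illustrative assumption: azimuth sectors (degrees, vehicle frame) per seat.
SEAT_SECTORS = {
    "driver seat":      (-90, -30),
    "left rear seat":   (-30, 0),
    "right rear seat":  (0, 30),
    "passenger seat":   (30, 90),
}

def seat_from_azimuth(azimuth_deg):
    # Return the seat whose angular sector contains the utterance direction.
    for seat, (lo, hi) in SEAT_SECTORS.items():
        if lo <= azimuth_deg < hi:
            return seat
    return None

speaker_seats = {}
speaker_seats["pattern_X"] = seat_from_azimuth(-45.0)  # direction of an utterance
print(speaker_seats)   # {'pattern_X': 'driver seat'}
```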

The dialogue system of the vehicle acquires a relationship between the speakers based on the collected information about the speakers, and determines a priority between the speakers based on the acquired relationship.

As shown in fig. 59A, when a manager, a senior manager, a sub-manager, and a director are riding in the vehicle, the dialogue system may determine, according to job positions, that the priority of the director is the highest, the priority of the senior manager is the second highest after the director, the priority of the manager is the third highest after the senior manager, and the priority of the sub-manager is the lowest.

As shown in fig. 59B, when a daughter, a father, and a mother are riding in the vehicle, the dialogue system may determine, according to the ranking in the family relationship, that the priority of the father is the highest, the priority of the mother is the second highest after the father, and the priority of the daughter is the lowest.
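
The priority determination for both examples can be sketched as below; the numeric rank tables are assumptions introduced only to make the ordering explicit.

```python
# Illustrative assumption: rank tables for job positions and family roles.
JOB_RANK = {"director": 4, "senior manager": 3, "manager": 2, "sub-manager": 1}
FAMILY_RANK = {"father": 3, "mother": 2, "daughter": 1}

def prioritize(speakers, rank_table):
    """Return speakers ordered from highest to lowest priority."""
    return sorted(speakers, key=lambda s: rank_table.get(s["relation"], 0), reverse=True)

office = [{"id": "U1", "relation": "manager"}, {"id": "U2", "relation": "senior manager"},
          {"id": "U3", "relation": "sub-manager"}, {"id": "U4", "relation": "director"}]
print([s["relation"] for s in prioritize(office, JOB_RANK)])
# ['director', 'senior manager', 'manager', 'sub-manager']

family = [{"id": "F1", "relation": "daughter"}, {"id": "F2", "relation": "father"},
          {"id": "F3", "relation": "mother"}]
print([s["relation"] for s in prioritize(family, FAMILY_RANK)])
# ['father', 'mother', 'daughter']
```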

The vehicle's dialog system requests communication with the speaker's mobile device.

When performing communication with a plurality of mobile devices, a dialogue system of a vehicle receives function control information to control a function of the vehicle, and stores the received function control information of the plurality of mobile devices.

The dialogue system of the vehicle may recognize a mobile device of each speaker by comparing recognition information such as a name and a phone number of the speaker with phone numbers and owner names of a plurality of mobile devices, and acquire function control information of each speaker based on the recognition result.
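
A sketch of this matching step is shown below; the field names (phone, owner, function_control) are assumptions chosen only for illustration.

```python
# Illustrative assumption: match each connected mobile device to a recognized
# speaker by phone number or owner name, then collect its function control info.
def match_devices(speakers, devices):
    """Return {speaker_id: function_control_info} for matched devices."""
    matched = {}
    for speaker in speakers:
        for device in devices:
            if (speaker.get("phone") == device.get("phone")
                    or speaker.get("name") == device.get("owner")):
                matched[speaker["id"]] = device["function_control"]
                break
    return matched

speakers = [{"id": "U4", "name": "Kim", "phone": "010-1234-5678"}]
devices = [{"owner": "Kim", "phone": "010-1234-5678",
            "function_control": {"ac_temp": 24, "radio_channel": "95.1 MHz"}}]
print(match_devices(speakers, devices))
```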

The vehicle may recognize seat positions of speakers who take the vehicle, and control operation of the seats based on the recognized seat position of each speaker and function control information of each speaker.

The vehicle may control at least one function of a plurality of functions performed in the vehicle based on the function control information stored in the mobile device of the speaker having the highest priority.

Further, when two or more speakers having the highest priority are riding in the vehicle, the vehicle may select one of the two or more speakers by outputting a voice request, recognize a speaker's voice when it is input through the voice input device, identify the speaker indicated in the utterance of the recognized voice, and control at least one function among the plurality of functions performed in the vehicle based on the function control information stored in the mobile device of the identified speaker.
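
The tie-breaking step can be sketched as follows; the wording of the selection question and the name-matching logic are assumptions about one possible realization.

```python
# Illustrative assumption: when several speakers share the highest priority,
# ask by voice whose settings to apply and match the name in the answer.
def select_controlling_speaker(top_speakers, recognized_answer):
    """top_speakers: speakers sharing the highest priority."""
    if len(top_speakers) == 1:
        return top_speakers[0]
    # The system would utter, e.g., "Whose settings shall I apply?" and then
    # look for one of the candidate names in the recognized answer.
    for speaker in top_speakers:
        if speaker["name"].lower() in recognized_answer.lower():
            return speaker
    return None

top = [{"name": "Kim", "id": "U4"}, {"name": "Lee", "id": "U5"}]
print(select_controlling_speaker(top, "Use Kim's settings, please"))  # -> Kim
```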

A configuration for controlling at least one function performed in a vehicle based on a priority of a speaker will be described as an example.

As shown in fig. 59A, when the manager sits on the driver seat, the director sits on the front passenger seat, the senior manager sits on the left rear seat, and the sub-manager sits on the right rear seat, the dialogue system may set the director to have the highest priority and request communication with the director's mobile device. The dialogue system may receive function control information for controlling a function of the vehicle when communicatively connected to the director's mobile device, and store the received function control information.

As shown in fig. 60, the dialogue system may receive broadcast channel information, air-conditioning information, and seat information stored in the mobile device of the director, and store the received broadcast channel information, air-conditioning information, and seat information.

Next, the dialog system may determine an action corresponding to the stored broadcast channel information, air conditioning information, and seat information, and output a speech corresponding to the determined action.

As shown in fig. 61, the dialogue system may determine that the director prefers to listen to the broadcast based on the function control information, and output an utterance about listening to the broadcast while addressing the director by the title "director".

The dialogue system recognizes a voice input through the voice input device, and acquires an utterance and a voice pattern from the voice information of the recognized voice. When it is determined that the acquired voice pattern is the director's voice pattern and the utterance contains a positive word, the dialogue system may generate a control instruction for turning on the broadcast function.

When it is determined that a plurality of broadcast channels are included in the function control information, the dialogue system may output information on the plurality of broadcast channels as an utterance.

The dialogue system recognizes a voice input through the voice input device, and acquires an utterance and a voice pattern from the voice information of the recognized voice. When it is determined that the acquired voice pattern is the director's voice pattern, the dialogue system may acquire information on the channel contained in the utterance and output an utterance confirming the acquired channel information.

The dialogue system generates a control instruction for the acquired channel information and transmits the generated broadcast-on command and channel control command to the broadcast receiver provided in the vehicle.

The dialogue system may acquire the air conditioning information preferred by the director based on the function control information stored in the director's mobile device and output an utterance about the acquired air conditioning information.

The dialogue system recognizes a voice input through the voice input device, and acquires an utterance and a voice pattern from the voice information of the recognized voice. When it is determined that the acquired voice pattern is the director's voice pattern and the utterance contains a positive word, the dialogue system may output an utterance regarding the change of the air conditioning function.

The dialog system generates a control command for the air conditioning function and outputs the generated control command to the air conditioner.

At this time, the air conditioner compares the current air conditioner setting information with information corresponding to the received control instruction, and when it is determined that the two kinds of information are different, the air conditioner changes the setting information of the air conditioner based on the received control instruction.

That is, when the current set temperature is 25 degrees and the received target temperature is 24 degrees, the air conditioner changes the target temperature to 24 degrees, and when the current set air volume is "medium" and the received target air volume is "weak", the air conditioner changes the air volume to "weak".

Further, the air conditioner maintains the current temperature at 24 degrees when the current set temperature is 24 degrees and the received target temperature is 24 degrees, and maintains the air volume at "weak" when the current set air volume is "weak" and the received target air volume is "weak".
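
The comparison performed by the air conditioner in the two cases above can be sketched as follows; the setting names are assumptions, and only values that differ from the current settings are changed.

```python
# Illustrative assumption: apply a received control instruction only where the
# target value differs from the current air conditioner setting.
def apply_ac_command(current, target):
    """Return the new settings and a list of changes actually made."""
    changes = []
    new_settings = dict(current)
    for key, value in target.items():
        if current.get(key) != value:
            new_settings[key] = value
            changes.append(f"{key}: {current.get(key)} -> {value}")
    return new_settings, changes

target = {"temperature": 24, "air_volume": "weak"}
print(apply_ac_command({"temperature": 25, "air_volume": "medium"}, target))
# ({'temperature': 24, 'air_volume': 'weak'},
#  ['temperature: 25 -> 24', 'air_volume: medium -> weak'])

print(apply_ac_command({"temperature": 24, "air_volume": "weak"}, target))
# no changes: both values already match the target
```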

The dialogue system generates control instructions for controlling the tilt, the front-rear position, the lumbar support, and the heating wire of the seat based on the seat information in the function control information stored in the director's mobile device, and transmits the generated control instructions to the front passenger seat.

At this time, the front passenger seat recognizes the tilt information, the front-rear position information, the lumbar support information, and the heating wire information corresponding to the received control instruction, and controls the tilt, the front-rear position, the lumbar support, and the heating wire accordingly.

Further, the dialogue system may receive the function control information stored in the mobile devices of the senior manager and the sub-manager, generate control instructions for controlling the seat tilt, the front-rear position, the lumbar support, and the heating wires, and transmit the generated control instructions to the left and right rear seats.

The left and right rear seats recognize the tilt information, the front-rear position information, the lumbar support information, and the heating wire information corresponding to the received control instructions, and control the tilt, the front-rear position, the lumbar support, and the heating wire accordingly.

As is apparent from the above description, according to the proposed dialogue system, vehicle, and method for controlling a vehicle, it is possible to provide a service suitable for a user's intention or a service required by the user by accurately recognizing the user's intention based on various information (e.g., dialogue with the user during vehicle driving and vehicle state information, driving environment information, and user information).

When the dialog system is required to perform a pre-utterance, the dialog system determines whether a plurality of speakers are riding in a car. When it is determined that a plurality of speakers are riding in a vehicle, the dialogue system may select a leader of a dialogue based on a relationship between the speakers, continue the dialogue with the selected leader through voice recognition, suggest controlling at least one function among a plurality of functions provided in the vehicle, and make the dialogue between the system and the plurality of speakers proceed smoothly.

Through the dialogue function, the quality of the vehicle can be improved, the marketability can be increased, the satisfaction of the user can be increased, and the convenience of the user and the safety of the vehicle can be improved.

Although a few embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.

According to the proposed dialogue processing apparatus, the vehicle having the dialogue processing apparatus, and the dialogue processing method, it is possible to provide a service suitable for a user's intention or a service required by the user by using a vehicle-specific dialogue processing method.

In addition, by considering various contexts occurring in the vehicle, a service desired by the user can be provided. Specifically, regardless of the utterance of the user, the service required by the user may be determined and actively provided based on the context information or the driver information collected by the dialogue system 100.
