By Marc Cavazza
Recent progress in affective dialogue systems makes it possible to consider a new application for Embodied Conversational Agents (ECAs), which can become virtual companions to their users. Previous research has mostly developed ECAs as personal assistants, serving as interfaces to various services, such as electronic TV guides or e-commerce sites. By comparison, a companion agent should be able to depart from task-based dialogue, engage in natural conversations with its owner, and establish personal relationships. The COMPANIONS project addresses this research challenge, and has recently released its final demonstration. The system presents itself as an ECA with which the user can engage in an open conversation, albeit on a limited number of topics. As an application scenario, we wanted an everyday life domain that would support conversation with some affective content. We opted for a scenario in which the user, a typical office worker, returns home and complains about their day in the office. We refer to this as the “How was your day?” (HWYD) scenario. The prototype currently supports over 30 work-based topics of conversation corresponding, for example, to representative events such as meetings, company restructuring, and relationships with colleagues, across a range of situations whose discussion is likely to include emotional elements.
One specific innovation of the project has been to develop a conversational approach to dialogue, departing from task-based dialogue, and allowing long user utterances as well as user interruptions. A natural conversation is an important prerequisite for affective dialogue systems, since users may naturally get carried away in their descriptions of events. In a similar fashion, they may react strongly to long tirades from the agent and interrupt them. The system implements a real-time interaction strategy supporting different feedback loops and backchannels to preserve the quality of interaction. These include mechanisms for accepting user interruptions during agent utterances and processing the contents of these interruptions as part of the conversation.
In such a context, the need for robustness in speech understanding has led us to explore different natural language processing solutions, such as Information Extraction, a set of methods originally developed for text analysis, which instantiate event descriptions from user utterances up to 60 words in length.
This prototype supports end-to-end affective dialogue, from the emotional content of the user’s discourse to the generation of multimodal expressions for the conversational agent. User input is analyzed for emotional content at both the speech and text levels, resulting in an emotional category being attached to each event description. This category is then used to determine the appropriateness of the user’s emotional reaction to the events they are reporting.
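The appropriateness check described above can be illustrated with a minimal sketch. It is not the project's implementation: the three valence labels, the `EventDescription` fields and the sign-based rule are all assumptions made for illustration, since the article does not specify the emotional categories or the comparison actually used.

```python
from dataclasses import dataclass

# Hypothetical valence labels; the article does not list the real categories.
NEGATIVE, NEUTRAL, POSITIVE = -1, 0, 1


@dataclass
class EventDescription:
    topic: str             # illustrative topic name, e.g. "department_merger"
    expected_valence: int  # how such an event is usually experienced (assumed)
    user_valence: int      # category detected from speech and text (assumed)


def reaction_is_appropriate(event: EventDescription) -> bool:
    """Assumed rule: a reaction is appropriate unless its valence
    contradicts the valence expected for the reported event."""
    return event.expected_valence * event.user_valence >= 0


merger = EventDescription("department_merger", NEGATIVE, POSITIVE)
print(reaction_is_appropriate(merger))
```

Under these assumptions, worrisome news reported in a positive mood is flagged as a mismatch, which is the situation the companion is described as reacting to.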
The companion uses an affective strategy to influence the user, which is based on its perceived appropriateness of the user’s emotional reaction. Depending on its analysis, it will choose to comfort the user or provide some warning about the possible evolution of the situation. The affective strategy is composed of a set of influencing operators whose sequence is meant to gradually convey the message to the user, for instance, by initially expressing agreement and empathy.
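As a rough sketch of this choice, the strategy can be modelled as an ordered list of influencing operators selected from the appropriateness judgment. The operator names, the two sequences and the mapping from "appropriate" to "comfort" are hypothetical; the article only states that the sequence may start with agreement and empathy and that the outcome is either comfort or a warning.

```python
from typing import List

# Hypothetical operator sequences; both open with empathy and agreement,
# as the article suggests, before comforting or warning the user.
COMFORT: List[str] = ["express_empathy", "agree", "reassure"]
WARN: List[str] = [
    "express_empathy",
    "agree",
    "question_reaction",
    "warn_of_consequences",
]


def choose_operators(reaction_appropriate: bool) -> List[str]:
    """Assumed policy: comfort when the user's reaction seems appropriate,
    otherwise build up gradually to a warning."""
    return list(COMFORT if reaction_appropriate else WARN)


print(choose_operators(False))
```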
The sequence of influence operators is translated into a tirade of automatically generated system utterances. Each of these is also associated with emotional content, which is expressed by the companion using Text-To-Speech parameters and real-time animation of facial expressions and gestures.
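One simple way to picture this translation step is template-based generation, shown in the sketch below. This is an assumption for illustration only: the article does not say how utterances are generated, and the operator names, templates and emotion tags here are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Utterance:
    text: str
    emotion: str  # in a full system this would drive TTS and animation


# Hypothetical operator-to-template table: (text template, emotion tag).
TEMPLATES: Dict[str, Tuple[str, str]] = {
    "observe_mood": ("You seem to be in a {mood} mood.", "neutral"),
    "question_reaction": (
        "But are you sure this is appropriate for the situation?",
        "concern",
    ),
    "agree": ("I also think the {event} will go ahead.", "neutral"),
    "warn": (
        "Remember that the consequences of the {event} are likely to be bad.",
        "worry",
    ),
}


def generate_tirade(operators: List[str], **slots: str) -> List[Utterance]:
    """Turn each operator into one utterance carrying an emotion tag."""
    tirade = []
    for op in operators:
        template, emotion = TEMPLATES[op]
        tirade.append(Utterance(template.format(**slots), emotion))
    return tirade


for u in generate_tirade(["observe_mood", "agree", "warn"],
                         mood="good", event="merger"):
    print(f"[{u.emotion}] {u.text}")
```

Keeping the emotion tag alongside each utterance mirrors the description above, in which every generated utterance has its own emotional content for speech and animation.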
The following dialogue example illustrates the companion’s affective strategy when it concludes that the user’s attitude may be unrealistic or inappropriate. In this case, the user is reporting worrisome news, but the analysis of the user’s input detects a positive mood.
<User>: We’ve just learned her department is going to be merged with the Barcelona one I’ve always liked working in the finance department
<System>: What do you think will happen next?
<User>: I heard some people will be moved to the other office in the next couple of months hopefully I won’t get laid off or forced to move with them
<System>: You seem to be in a good mood.
<System>: But are you sure this is appropriate for the situation?
<System>: I also think the merger will go ahead.
<System>: What is more the merger is likely to create redundancies.
<System>: I’m worried that this may mean more stress for you.
<System>: Remember that the consequences of the merger are likely to be bad.
The latest developments in the project consist of incorporating some form of computational humor in the agent, enabling it to use irony as part of its persuasive strategy.
More information about the project, including consortium members, can be found at http://www.companionsproject.org/.
Acknowledgment: This work was funded by the European Commission as part of the Information Society Technologies (IST) programme under EC grant number IST-FP6-034434.
Marc Cavazza, M.D., Ph.D. Teesside University U.K. firstname.lastname@example.org