Learning dialogue POMDP model components from expert dialogues

Authors: Chinaei, Hamid Reza
Advisor: Chaib-draa, Brahim; Lamontagne, Luc D.
Abstract: Spoken dialogue systems should recognize the user's intentions and maintain a natural and efficient dialogue with users. This is a difficult task: spoken language is naturally ambiguous and uncertain, and the automatic speech recognition (ASR) output is noisy. In addition, the human user may change their intention during the interaction with the machine. To tackle this task, the partially observable Markov decision process (POMDP) framework has been applied in dialogue systems as a formal framework that represents uncertainty explicitly while supporting automated policy solving. In this context, estimating the dialogue POMDP model components is a significant challenge, as they have a direct impact on the optimized dialogue POMDP policy. This thesis proposes methods for learning dialogue POMDP model components from noisy and unannotated dialogues. Specifically, we introduce techniques to learn the set of possible user intentions from dialogues, use them as the dialogue POMDP states, and learn a maximum likelihood POMDP transition model from data. Since it is crucial to reduce the size of the observation set, we then propose two observation models: the keyword model and the intention model. Using these two models, the number of observations is reduced significantly while the POMDP performance remains high, particularly in the intention POMDP. In addition to these model components, POMDPs also require a reward function. We therefore propose new algorithms for learning the POMDP reward model from dialogues based on inverse reinforcement learning (IRL). In particular, we propose the POMDP-IRL-BT algorithm (BT for belief transition), which works on the belief states available in the dialogues. This algorithm learns the reward model by estimating a belief transition model, similar to MDP (Markov decision process) transition models.
Ultimately, we apply the proposed methods to a healthcare domain and learn a dialogue POMDP essentially from real unannotated and noisy dialogues.
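As a rough illustration of the maximum likelihood transition model mentioned in the abstract, the sketch below estimates T(s' | s, a) by counting transitions in dialogues. It is not the thesis's implementation: it assumes each dialogue turn has already been labeled with a learned intention (the state), and the function name, the `smoothing` parameter, and the example intention and action labels are all hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transition_model(dialogues, smoothing=0.0):
    """Estimate T(s' | s, a) from counted transitions.

    dialogues: list of dialogues, each a list of
               (state, action, next_state) triples (assumed input format).
    smoothing: optional additive (Laplace) constant; 0.0 gives the plain
               maximum likelihood estimate.
    """
    counts = defaultdict(Counter)
    states = set()
    for dialogue in dialogues:
        for s, a, s_next in dialogue:
            counts[(s, a)][s_next] += 1
            states.update((s, s_next))

    model = {}
    for (s, a), next_counts in counts.items():
        total = sum(next_counts.values()) + smoothing * len(states)
        # Counter returns 0 for next states never observed after (s, a).
        model[(s, a)] = {
            s2: (next_counts[s2] + smoothing) / total for s2 in states
        }
    return model


# Hypothetical intentions and machine actions, for illustration only.
example_dialogues = [
    [("ask_hours", "greet", "ask_hours"), ("ask_hours", "greet", "ask_location")],
    [("ask_hours", "greet", "ask_location")],
]
T = estimate_transition_model(example_dialogues)
print(T[("ask_hours", "greet")])
```

With `smoothing=0.0`, state-action pairs that never occur in the data get no entry at all; a small positive `smoothing` value is one common way to avoid zero probabilities for unseen next states.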
Document Type: Doctoral thesis
Issue Date: 2013
Open Access Date: 19 April 2018
Permalink: http://hdl.handle.net/20.500.11794/24185
Grantor: Université Laval
Collection: Thèses et mémoires

Files in this item:
29690.pdf (1.74 MB, Adobe PDF)