Drafts
Pierre Lison. Probabilistic Dialogue Models with Prior Domain Knowledge, accepted for publication at the 13th SIGDIAL Meeting on Discourse and Dialogue, 2012.
Probabilistic models such as Bayesian Networks are now in widespread use in spoken
dialogue systems, but their scalability to complex interaction domains remains a
challenge. One central limitation is that the state space of such models grows
exponentially with the problem size, which makes parameter estimation increasingly
difficult, especially for domains where only limited training data is available.
In this paper, we show how to capture the underlying structure of a dialogue domain
in terms of probabilistic rules operating on the dialogue state. The
probabilistic rules are associated with a small, compact set of parameters which
can be directly estimated from data. We argue that the introduction of this
abstraction mechanism yields probabilistic models which are both easier to learn
and generalise better than their unstructured counterparts. We empirically
demonstrate the benefits of such an approach to learn a dialogue policy for a
human-robot interaction domain based on a Wizard-of-Oz data set.
@unpublished{rulebasedmodels-sigdial2012,
Author = {Pierre Lison},
Title = {Probabilistic Dialogue Models with Prior Domain Knowledge},
Year = {2012}
}
NB: Title of a previous version of this paper: "Rule-based probabilistic inference for spoken dialogue systems".
Conference Papers (peer-reviewed)
Pierre Lison. Multi-Policy Dialogue Management. In Proceedings of the 12th SIGDIAL
meeting on Discourse and Dialogue (SIGDIAL 2011), 2011.
We present a new approach to dialogue management based on the use of multiple,
interconnected policies. Instead of capturing the complexity of the interaction
in a single large policy, the dialogue manager operates with a collection of small
local policies combined concurrently and hierarchically. The meta-control of
these policies relies on an activation vector updated before and after each turn.
@InProceedings{lison:2011:SIGDIAL2011,
author = {Lison, Pierre},
title = {Multi-Policy Dialogue Management},
booktitle = {Proceedings of the SIGDIAL 2011 Conference},
month = {June},
year = {2011},
address = {Portland, Oregon},
publisher = {Association for Computational Linguistics},
pages = {294--300},
url = {http://www.aclweb.org/anthology/W11-2033}
}
Pierre Lison and Geert-Jan M. Kruijff. Policy activation for open-ended
dialogue management. In Proceedings of the AAAI Fall
Symposium on Dialog with Robots, 2010.
An important difficulty in developing spoken dialogue systems for
robots is the open-ended nature of most interactions. Robotic agents must typically
operate in complex, continuously changing environments which are difficult to model
and do not provide any clear, predefined goal. Directly capturing this complexity
in a single, large dialogue policy is thus inadequate. This paper presents a new
approach which tackles the complexity of open-ended interactions by breaking it
into a set of small, independent policies, which can be activated and deactivated
at runtime by a dedicated mechanism. The approach is currently being implemented in
a spoken dialogue system for autonomous robots.
@inproceedings{policyactivation-aaai2010,
Author = {Pierre Lison and Geert-Jan M. Kruijff},
Booktitle = {Proceedings of the AAAI Fall Symposium on Dialog with Robots},
Title = {Policy activation for open-ended dialogue management},
Url = {http://folk.uio.no/plison/pdfs/cl/main.policyactivation.aaai2010.pdf},
Year = {2010}}
Pierre Lison.
Towards relational POMDPs for adaptive dialogue management.
In Proceedings of the ACL 2010 Student Research Workshop. Association
for Computational Linguistics, 2010.
Open-ended spoken interactions are typically characterised by both structural
complexity and high levels of uncertainty, making dialogue management in such
settings a particularly challenging problem. Traditional approaches have
focused on providing theoretical accounts for either the uncertainty or the
complexity of spoken dialogue, but rarely considered the two issues in tandem.
This paper describes ongoing work on a new approach to dialogue management
which attempts to fill this gap. We represent the interaction as a Partially
Observable Markov Decision Process (POMDP) over a rich state space incorporating
both dialogue, user, and environment models. The tractability of the resulting
POMDP can be preserved using a mechanism for dynamically constraining the action
space based on prior knowledge over locally relevant dialogue structures. These
constraints are encoded in a small set of general rules expressed as a Markov
Logic network. The first-order expressivity of Markov Logic enables us to
leverage the rich relational structure of the problem and efficiently abstract over
large regions of the state and action spaces.
@InProceedings{lison:2010:SRW,
author = {Lison, Pierre},
title = {Towards Relational {P}{O}{M}{D}{P}s for Adaptive Dialogue Management},
booktitle = {Proceedings of the ACL 2010 Student Research Workshop},
month = {July},
year = {2010},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {7--12},
url = {http://www.aclweb.org/anthology/P10-3002}
}
Pierre Lison, Carsten Ehrler, and Geert-Jan M. Kruijff.
Belief modelling for situation awareness in human-robot interaction. In Proceedings of the 19th International
Symposium on Robot and Human Interactive Communication (RO-MAN 2010), Viareggio, Italy, 2010.
To interact naturally with humans, robots need to be aware of their own
surroundings. This awareness is usually encoded in some implicit or explicit
representation of the situated context. In this paper, we present a new
framework for constructing rich belief models of the robot's environment.
Key to our approach is the use of Markov Logic as a unified
representation formalism. Markov Logic is a combination of first-order
logic and probabilistic graphical models. Its expressive power allows us
to capture both the rich relational structure of the environment and the
uncertainty arising from the noise and incompleteness of low-level sensory
data. The constructed belief models evolve dynamically over time and
incorporate various contextual information such as spatio-temporal framing,
multi-agent epistemic status, and saliency measures. Beliefs can also be
referenced and extended "top-down" via linguistic communication. The
approach is being integrated into a cognitive architecture for mobile robots
interacting with humans using spoken dialogue.
@inproceedings{roman2010-beliefs,
Author = {Pierre Lison and Carsten Ehrler and Geert-Jan M. Kruijff},
Booktitle = {Proceedings of the 19th International Symposium on Robot and
Human Interactive Communication (RO-MAN 2010)},
Location = {Viareggio, Italy},
Title = {Belief Modelling for Situation Awareness in Human-Robot Interaction},
Url = {http://folk.uio.no/plison/pdfs/cogx/main.beliefs.roman2010.pdf},
Year = {2010}}
Geert-Jan M. Kruijff, Miroslav Janíček, and Pierre Lison.
Continual processing of situated dialogue in human-robot
collaborative activities. In Proceedings of the 19th International Symposium on Robot and Human Interactive
Communication (RO-MAN 2010), Viareggio, Italy, 2010.
The paper presents an implemented approach of processing situated dialogue
between a human and a robot. The focus is on task-oriented dialogue, set in
the larger context of human-robot collaborative activity. The approach models
understanding and production of dialogue to include intension (what is being
talked about), intention (the goal of why something is being said), and attention
(what is being focused on). These dimensions are directly construed in terms of
assumptions and assertions on situated multi-agent belief models. The approach is
continual in that it allows for interpretations to be dynamically retracted,
revised, or deferred. This makes it possible to deal with the inherent asymmetry
in how robots and humans tend to understand dialogue, and the world it is set in.
The approach has been fully implemented, and integrated into a cognitive robot.
The paper discusses the implementation, and illustrates it in a collaborative learning
setting.
@inproceedings{roman2010-cca,
Author = {Geert-Jan M. Kruijff and Miroslav Jani\v{c}ek and Pierre Lison},
Booktitle = {Proceedings of the 19th International Symposium on Robot and Human
Interactive Communication (RO-MAN 2010)},
Location = {Viareggio, Italy},
Title = {Continual Processing of Situated Dialogue in Human-Robot
Collaborative Activities},
Url = {http://folk.uio.no/plison/pdfs/cogx/main.cca.roman2010.pdf},
Year = {2010}}
Pierre Lison.
Robust processing of situated spoken dialogue.
In Christian Chiarcos, Richard Eckart de Castilho and Manfred Stede, editors, Von der Form zur Bedeutung: Texte automatisch
verarbeiten / From Form to Meaning: Processing Texts Automatically. Narr Verlag, Proceedings of
the Biennial GSCL Conference 2009, Potsdam, Germany, 2009.
Spoken dialogue is notoriously hard to process with standard language
processing technologies. Dialogue systems must indeed meet two major challenges.
First, natural spoken dialogue is replete with disfluent, partial, elided or
ungrammatical utterances. Second, speech recognition remains a highly
error-prone task, especially for complex, open-ended domains. We present an integrated
approach for addressing these two issues, based on a robust incremental
parser. The parser takes word lattices as input and is able to handle ill-formed
and misrecognised utterances by selectively relaxing its set of grammatical rules.
The choice of the most relevant interpretation is then realised via a
discriminative model augmented with contextual information. The approach is fully implemented
in a dialogue system for autonomous robots. Evaluation results on a
Wizard of Oz test suite demonstrate very significant improvements in accuracy
and robustness compared to the baseline.
@inproceedings{plison.robustprocessing.gscl2009,
Author = {Pierre Lison},
Booktitle = {Von der Form zur Bedeutung: Texte automatisch verarbeiten /
From Form to Meaning: Processing Texts Automatically},
Editor = {Christian Chiarcos and Richard Eckart de Castilho and Manfred Stede},
Note = {Proceedings of the Biennial GSCL Conference 2009, Potsdam, Germany},
Publisher = {Narr Verlag},
Title = {Robust processing of situated spoken dialogue},
Url = {http://folk.uio.no/plison/pdfs/cl/plison.robustprocessing.gscl2009.pdf},
Year = {2009}}
Pierre Lison and Geert-Jan M. Kruijff.
Efficient parsing of spoken inputs for human-robot interaction.
In Proceedings of the 18th IEEE International Symposium on Robot
and Human Interactive Communication (RO-MAN 09), Toyama, Japan, 2009.
The use of deep parsers in spoken dialogue systems is usually subject to strong
performance requirements. Real-time dialogue applications must be capable of responding
quickly to any given utterance, even in the presence of noisy, ambiguous or distorted
input. The parser must therefore ensure that the number of analyses remains bounded at
every processing step. The paper presents a practical approach to address this issue in
the context of deep parsers designed for spoken dialogue. The approach is based on a word
lattice parser for Combinatory Categorial Grammar combined with a discriminative model for
parse selection. Each word lattice is parsed incrementally, word by word, and a
discriminative model is applied at each incremental step to prune the set of resulting
partial analyses. The model incorporates a wide range of linguistic and contextual
features and can be trained with a simple perceptron. The approach is fully implemented
as part of a spoken dialogue system for human-robot interaction. Evaluation results on a
Wizard-of-Oz test suite demonstrate significant improvements in parsing time.
@inproceedings{plison-kruijff.efficientparsing.roman2009,
Address = {Toyama, Japan},
Author = {Pierre Lison and Geert-Jan M. Kruijff},
Booktitle = {Proceedings of the 18th IEEE International Symposium on Robot
and Human Interactive Communication (RO-MAN 09)},
Title = {Efficient Parsing of Spoken Inputs for Human-Robot Interaction},
Url = {http://folk.uio.no/plison/pdfs/cl/plison-kruijff.robustprocessing.srsl2009.pdf},
Year = {2009}}
Pierre Lison and Geert-Jan M. Kruijff.
Robust processing of situated spoken dialogue.
In KI 2009: Advances in Artificial Intelligence, Lecture Notes
in Artificial Intelligence, Vol. 5803. Springer Verlag, 32nd Annual
German Conference on AI, Paderborn, Germany, 2009.
Spoken dialogue is notoriously hard to process with standard language processing
technologies. Dialogue systems must indeed meet two major challenges. First, natural
spoken dialogue is replete with disfluent, partial, elided or ungrammatical
utterances. Second, speech recognition remains a highly error-prone task, especially
for complex, open-ended domains. We present an integrated approach for addressing
these two issues, based on a robust incremental parser. The parser takes word
lattices as input and is able to handle ill-formed and misrecognised utterances by
selectively relaxing its set of grammatical rules. The choice of the most relevant
interpretation is then realised via a discriminative model augmented with contextual
information. The approach is fully implemented in a dialogue system for autonomous
robots. Evaluation results on a Wizard of Oz test suite demonstrate very significant
improvements in accuracy and robustness compared to the baseline.
@inproceedings{plison-kruijff.robustprocessing.ki2009,
Address = {Paderborn, Germany},
Author = {Pierre Lison and Geert-Jan M. Kruijff},
Booktitle = {KI 2009: Advances in Artificial Intelligence. Proceedings of the 32nd
Annual German Conference on AI, Paderborn, Germany, September 15-18, 2009},
Editor = {B\"{a}rbel Mertsching and Marcus Hund and Zaheer Aziz},
Publisher = {Springer Verlag},
Series = {Lecture Notes in Artificial Intelligence, Vol. 5803},
Title = {Robust processing of situated spoken dialogue},
Url = {http://folk.uio.no/plison/pdfs/cl/plison-kruijff.robustprocessing.ki2009.pdf},
Year = {2009}}
Pierre Lison and Geert-Jan M. Kruijff.
Salience-driven contextual priming of speech recognition for
human-robot interaction.
In Proceedings of the 18th European Conference on Artificial
Intelligence (ECAI 2008), Patras, Greece, 2008.
The paper presents an implemented model for priming speech recognition,
using contextual information about salient entities. The underlying hypothesis is
that, in human-robot interaction, speech recognition performance can be improved
by exploiting knowledge about the immediate physical situation and the dialogue
history. To this end, visual salience (objects perceived in the physical scene)
and linguistic salience (objects, events already mentioned in the dialogue) are
integrated into a single cross-modal salience model. The model is dynamically
updated as the environment changes. It is used to establish expectations about which
words are most likely to be heard in the given context. The update is realised by
continuously adapting the word-class probabilities specified in a statistical
language model. The paper discusses the motivations behind the approach, and presents
the implementation as part of a cognitive architecture for mobile robots. Evaluation
results on a test suite show a statistically significant improvement of
salience-driven priming speech recognition (WER) over a commercial baseline system.
@inproceedings{Lison/Kruijff:2008,
Address = {Patras, Greece},
Author = {Pierre Lison and Geert-Jan M. Kruijff},
Booktitle = {Proceedings of the 18th European Conference on Artificial Intelligence
(ECAI 2008)},
Keywords = {human-robot interaction, speech recognition, statistical language
models, salience modeling, cognitive systems},
Title = {Salience-driven Contextual Priming of Speech Recognition for
Human-Robot Interaction},
Url = {http://folk.uio.no/plison/pdfs/cl/main.sitASR.ecai08.pdf},
Year = {2008}}
Geert-Jan M. Kruijff, Pierre Lison, Trevor Benjamin, Henrik Jacobsson, and Nick
Hawes.
Incremental, multi-level processing for comprehending situated
dialogue in human-robot interaction.
In Proceedings of the Symposium on Language and Robots
(LangRo'2007), Aveiro, Portugal, 2007.
The paper presents work in progress on an implemented
model of situated dialogue processing. The underlying assumption
is that to understand situated dialogue, communicated
meaning needs to be related to situation(s) it refers to.
The model couples incremental processing to a notion of bidirectional
connectivity, inspired by how humans process visually
situated language. Analyzing an utterance in a "word by
word" fashion, a representation of possible utterance interpretations
is gradually built up. In a top-down fashion,
the model tries to ground these interpretations in situation
awareness, through which they can prime what is focused
on in a situation. In a bottom-up fashion, the (im)possibility
to ground certain interpretations primes how the analysis of
the utterance further unfolds. The paper discusses the implementation
of the model in a distributed, cognitive architecture
for human-robot interaction, and presents an evaluation on a
test suite. The evaluation shows (and quantifies) the effects
linguistic interpretation has on priming incremental utterance
processing, and discusses how such evaluation can be
extended to include situation-relative interpretation.
@inproceedings{aveiro07,
Address = {Aveiro, Portugal},
Author = {Geert-Jan M. Kruijff and Pierre Lison and Trevor Benjamin
and Henrik Jacobsson and Nick Hawes},
Booktitle = {Proceedings of the Symposium on Language and Robots (LangRo'2007)},
Title = {Incremental, multi-level processing for comprehending situated
dialogue in human-robot interaction},
Url = {pdfs/cl/main.incrsitdial.langro2007.pdf},
Year = {2007}}
Workshop Papers (lightly reviewed)
Danijel Skočaj, Matej Kristan, Aleš Leonardis, Alen Vrečko, Miroslav Janíček,
Geert-Jan M. Kruijff, Pierre Lison and Michael Zillich.
A system approach to interactive learning of visual concepts.
In Tenth International Conference on Epigenetic Robotics, 2010.
In this work we present a system and underlying representations and
mechanisms for continuous learning of visual concepts in dialogue
with a human tutor.
@inproceedings{george-epirob2010,
Author = {Danijel Sko\v{c}aj and Matej Kristan and Ale\v{s} Leonardis and
Alen Vre\v{c}ko and Miroslav Jani\v{c}ek and Geert-Jan M. Kruijff and
Pierre Lison and Michael Zillich},
Booktitle = {Tenth International Conference on Epigenetic Robotics},
Title = {A system approach to interactive learning of visual concepts},
Url = {http://folk.uio.no/plison/pdfs/cogx/epirob2010george.pdf},
Year = {2010}}
Danijel Skočaj, Matej Kristan, Aleš Leonardis, Alen Vrečko, Miroslav Janíček,
Geert-Jan M. Kruijff, Pierre Lison and Michael Zillich.
A basic cognitive system for interactive learning of simple visual concepts.
In RSS workshop on learning for human-robot interaction modelling, 2010.
Danijel Skočaj, Miroslav Janíček, Matej Kristan, Geert-Jan M.
Kruijff, Aleš Leonardis, Pierre Lison, Alen Vrečko, and Michael
Zillich.
A basic cognitive system for interactive continuous learning of
visual concepts.
In Proceedings of the workshop on Interactive Communication for
Autonomous Intelligent Robots, ICRA 2010, 2010.
Interactive continuous learning is an important characteristic of a
cognitive agent that is supposed to operate and evolve in an ever-changing
environment. In this paper we present representations and mechanisms that
are necessary for continuous learning of visual concepts in dialogue with
a tutor. We present an approach for modelling beliefs stemming from
multiple modalities and we show how these beliefs are created by processing
visual and linguistic information and how they are used for learning.
We also present a system that exploits these representations and mechanisms,
and demonstrate these principles in the case of learning about object colours
and basic shapes in dialogue with the tutor.
@inproceedings{george-icair2010,
Address = {Anchorage, AK, USA},
Author = {Danijel Sko\v{c}aj and Miroslav Jani\v{c}ek and Matej Kristan
and Geert-Jan M. Kruijff and Ale\v{s} Leonardis and Pierre Lison
and Alen Vre\v{c}ko and Michael Zillich},
Booktitle = {Proceedings of the workshop on Interactive Communication for
Autonomous Intelligent Robots, ICRA 2010},
Month = {May},
Pages = {30--36},
Title = {A basic cognitive system for interactive continuous learning
of visual concepts},
Url = {http://folk.uio.no/plison/pdfs/cogx/icair10main.pdf},
Year = {2010}}
Pierre Lison.
A method to improve the efficiency of deep parsers with incremental
chart pruning.
In Proceedings of the ESSLLI Workshop on Parsing with Categorial
Grammars, Bordeaux, France, 2009.
The use of deep parsers in spoken dialogue systems is usually subject to strong
performance requirements. Real-time dialogue applications must be capable of responding
quickly to any given utterance, even in the presence of noisy, ambiguous or distorted
input. The parser must therefore ensure that the number of analyses remains bounded at
every processing step. The paper presents a practical approach to address this issue in
the context of deep parsers designed for spoken dialogue. The approach is based on a word
lattice parser for Combinatory Categorial Grammar combined with a discriminative model for
parse selection. Each word lattice is parsed incrementally, word by word, and a
discriminative model is applied at each incremental step to prune the set of resulting
partial analyses. The model incorporates a wide range of linguistic and contextual
features and can be trained with a simple perceptron. The approach is fully implemented
as part of a spoken dialogue system for human-robot interaction. Evaluation results on a
Wizard-of-Oz test suite demonstrate significant improvements in parsing time.
@inproceedings{plison.chartpruning.cgparsing2009,
Address = {Bordeaux, France},
Author = {Pierre Lison},
Booktitle = {Proceedings of the ESSLLI Workshop on Parsing with Categorial Grammars},
Title = {A Method to Improve the Efficiency of Deep Parsers with Incremental
Chart Pruning},
Url = {http://folk.uio.no/plison/pdfs/cl/main.incrchartpruning.cgparsing2009.pdf},
Year = {2009}}
Pierre Lison and Geert-Jan M. Kruijff.
An integrated approach to robust processing of situated spoken
dialogue.
In Proceedings of the Second International Workshop on the
Semantic Representation of Spoken Language (SRSL 09), Athens, Greece, 2009.
Spoken dialogue is notoriously hard to process with standard NLP technologies.
Natural spoken dialogue is replete with disfluent, partial, elided or ungrammatical
utterances, all of which are difficult to accommodate in a dialogue system. Furthermore,
speech recognition is known to be a highly error-prone task, especially for complex,
open-ended domains. The combination of these two problems -- ill-formed and/or
misrecognised speech inputs -- raises a major challenge to the development of robust
dialogue systems. We present an integrated approach for addressing these two issues,
based on an incremental parser for Combinatory Categorial Grammar. The parser takes word
lattices as input and is able to handle ill-formed and misrecognised utterances by
selectively relaxing its set of grammatical rules. The choice of the most relevant
interpretation is then realised via a discriminative model augmented with contextual
information. The approach is fully implemented in a dialogue system for autonomous robots.
Evaluation results on a Wizard of Oz test suite demonstrate very significant improvements
in accuracy and robustness compared to the baseline.
@InProceedings{lison-kruijff:2009:SRSL,
author = {Lison, Pierre and Kruijff, Geert-Jan M.},
title = {An Integrated Approach to Robust Processing of Situated Spoken Dialogue},
booktitle = {Proceedings of SRSL 2009, the 2nd Workshop on Semantic Representation of Spoken Language},
month = {March},
year = {2009},
address = {Athens, Greece},
publisher = {Association for Computational Linguistics},
pages = {58--65},
url = {http://www.aclweb.org/anthology/W09-0508}
}
Pierre Lison.
A salience-driven approach to speech recognition for human-robot
interaction.
In Proceedings of the 13th ESSLLI student session (ESSLLI
2008), Hamburg, Germany, 2008.
We present an implemented model for speech recognition in natural environments
which relies on contextual information about salient entities to prime
utterance recognition. The hypothesis underlying our approach is that, in situated
human-robot interactions, the speech recognition performance can be significantly
enhanced by exploiting knowledge about the immediate physical environment and
the dialogue history. To this end, visual salience (objects perceived in the physical
scene) and linguistic salience (previously referred expressions within the current
dialogue) are integrated into a single cross-modal salience model. The model is dynamically
updated as the environment evolves, and is used to establish expectations
about uttered words which are most likely to be heard given the context. The update
is realised by continuously adapting the word-class probabilities specified in the
statistical language model. The present article discusses the motivations behind our
approach, describes our implementation as part of a distributed, cognitive architecture
for mobile robots, and reports the evaluation results on a test suite.
@inproceedings{ESSLLI2008,
Address = {Hamburg, Germany},
Author = {Pierre Lison},
Booktitle = {Proceedings of the 13th ESSLLI student session (ESSLLI 2008)},
Keywords = {human-robot interaction, speech recognition, statistical language
models, salience modeling, cognitive systems},
Title = {A salience-driven approach to speech recognition for
human-robot interaction},
Url = {http://folk.uio.no/plison/pdfs/cl/situatedASR-ESSLLI08.pdf},
Year = {2008}}
Articles, book chapters and collections
Jeremy L. Wyatt, Alper Aydemir, Michael Brenner, Marc Hanheide, Nick Hawes,
Patric Jensfelt, Geert-Jan M. Kruijff, Matej Kristan, Pierre Lison, Andrzej
Pronobis, Kristoffer Sjöö, Danijel Skočaj, Alen Vrečko, Hendrik Zender, and
Michael Zillich, "Self-Understanding and Self-Extension: A Systems and Representational Approach",
IEEE Transactions on Autonomous Mental Development,
vol.2, no.4, pp.282-303, Dec. 2010
There are many different approaches to building a system that can engage in
autonomous mental development. In this paper we present an approach based on
what we term self-understanding, by which we mean the use of explicit
representation of and reasoning about what a system does and doesn't know, and
how that understanding changes under action. We present a coherent architecture
and a set of representations used in two robot systems that exhibit a limited
degree of autonomous mental development, what we term self-extension. The
contributions include: representations of gaps and uncertainty for specific
kinds of knowledge, and a motivational and planning system for setting and
achieving learning goals.
@article{tamd-architecture,
Author = {Jeremy~L.~Wyatt and Alper Aydemir and Michael Brenner and Marc Hanheide
and Nick Hawes and Patric Jensfelt and Matej Kristan and Geert-Jan M. Kruijff and
Pierre Lison and Andrzej Pronobis and Kristoffer Sj\"{o}\"{o} and Danijel Sko\v{c}aj
and Alen Vre\v{c}ko and Hendrik Zender and Michael Zillich},
Journal = {IEEE Transactions on Autonomous Mental Development},
Month = {December},
Number = {4},
Pages = {282--303},
Title = {Self-Understanding \& Self-Extension: A Systems and Representational
Approach},
Url = {http://folk.uio.no/plison/pdfs/cogx/ieee.tamd.architectures.main.pdf},
Volume = {2},
Year = {2010}}
Pierre Lison.
A salience-driven approach to speech recognition for human-robot
interaction, Interfaces: Explorations in Logic, Language and Computation, Springer Verlag, 2010.
(extended reprint of the 2008 ESSLLI paper).
We present an implemented model for speech recognition in natural environments
which relies on contextual information about salient entities to prime utterance
recognition. The hypothesis underlying our approach is that, in situated
human-robot interactions, the speech recognition performance can be significantly
enhanced by exploiting knowledge about the immediate physical environment and
the dialogue history. To this end, visual salience (objects perceived in the
physical scene) and linguistic salience (previously referred expressions within
the current dialogue) are integrated into a single cross-modal salience model.
The model is dynamically updated as the environment evolves, and is used to
establish expectations about uttered words which are most likely to be heard
given the context. The update is realised by continuously adapting the word-class
probabilities specified in the statistical language model. The present article
discusses the motivations behind our approach, describes our implementation
as part of a distributed, cognitive architecture for mobile robots, and reports
the evaluation results on a test suite.
@incollection{ESSLLI2008-springerreprint,
Author = {Pierre Lison},
Booktitle = {Interfaces: Explorations in Logic, Language and Computation:
ESSLLI 2008 and ESSLLI 2009 Student Sessions, Selected Papers},
Editor = {Thomas Icard and Reinhard Muskens},
Keywords = {speech recognition, human-robot interaction, spoken dialogue systems},
Note = {(extended reprint of the 2008 ESSLLI paper)},
Pages = {o.A.},
Publisher = {Springer Verlag},
Title = {A salience-driven approach to speech recognition for human-robot
interaction},
Url = {http://folk.uio.no/plison/pdfs/cl/main.sitASR.selectedESSLLI.pdf},
Year = {2010}}
Geert-Jan M. Kruijff, Pierre Lison, Trevor Benjamin, Henrik Jacobsson, Hendrik
Zender, and Ivana Kruijff-Korbayová.
Cognitive Systems, volume 8 of Cognitive Systems
Monographs, chapter Situated Dialogue Processing for Human-Robot
Interaction.
Springer Verlag, Heidelberg, Germany, May 2010.
@inbook{cosybook:dialogue,
Address = {Heidelberg, Germany},
Author = {Geert-Jan M. Kruijff and Pierre Lison and Trevor Benjamin
and Henrik Jacobsson and Hendrik Zender and Ivana Kruijff-Korbayov\'{a}},
Chapter = {Situated Dialogue Processing for Human-Robot Interaction},
Editor = {Christensen, Henrik Iskov and Sloman, Aaron and Kruijff,
Geert-Jan M. and Wyatt, Jeremy L.},
Month = {July},
Publisher = {Springer Verlag},
Series = {Cognitive Systems Monographs},
Title = {Cognitive Systems},
Url = {http://folk.uio.no/plison/pdfs/cl/my-cosy-book.pdf},
Volume = {8},
Year = {2010}}
Books
Pierre Lison, Mattias Nilsson and Marta Recasens.
Proceedings of the Student Research Workshop at the 13th Conference of the European Chapter of the Association for Computational Linguistics.
Association for Computational Linguistics, 2012.
@Book{SRWEACL2012:2012,
editor = {Pierre Lison and Mattias Nilsson and Marta Recasens},
title = {Proceedings of the Student Research Workshop at the 13th Conference
of the European Chapter of the Association for Computational Linguistics},
month = {April},
year = {2012},
address = {Avignon, France},
publisher = {Association for Computational Linguistics},
url = {http://www.aclweb.org/anthology/E12-3}
}
Pierre Lison.
Robust Processing of Spoken Situated Dialogue - A Study in
Human-Robot Interaction.
Diplomica Verlag, Hamburg, March 2010.
Recent years have witnessed a surge of interest in service robots endowed with
communicative abilities. Such robots could take care of routine tasks, in homes,
offices, schools or hospitals, help disabled or mentally impaired persons,
serve as social companions for the elderly, or simply entertain us. They would
assist us in our daily life activities. These robots are, by definition, meant
to be deployed in social environments, and their capacity to interact naturally
with humans is thus a crucial factor. The development of such talking robots led
to the emergence of a new research field, Human-Robot Interaction (HRI), which
draws from a wide range of scientific disciplines such as artificial intelligence,
robotics, linguistics and cognitive science.
This work focuses on the issue of robust speech understanding - that is, how to
process spoken dialogue automatically to extract the intended meaning. The book
presents a new approach which combines linguistic resources with statistical
techniques and context-sensitive interpretation to achieve both deep and robust
spoken dialogue comprehension.
The first part of the book provides a general introduction to the field of human-robot
interaction and details the major linguistic properties of spoken dialogue, as well as
some grammatical formalisms used to analyse them. The second part describes the approach
itself, devoting one chapter to context-sensitive speech recognition for HRI, and
one chapter to the robust parsing of spoken inputs via grammar relaxation and
statistical parse selection. All the algorithms presented are fully implemented, and
integrated as part of a distributed cognitive architecture for autonomous robots.
A complete evaluation of our approach using Wizard-of-Oz experiments is also
provided in this book. The results demonstrate very significant improvements in
accuracy and robustness compared to the baseline.
@book{robustprocessing-diplomica2010,
Author = {Pierre Lison},
Month = {March},
Publisher = {Diplomica Verlag},
Title = {Robust Processing of Spoken Situated Dialogue - A Study in
Human-Robot Interaction},
Year = {2010}}
Master's theses
Pierre Lison.
Robust processing of situated spoken dialogue.
Master's thesis, Universität des Saarlandes, Saarbrücken,
December 2008.
[Abstract]
[BibTeX]
[PDF]
Spoken dialogue is often considered as one of the most natural means
of interaction between a human and a machine. It is, however, notoriously
hard to process using NLP technology. As many corpus studies have shown,
natural spoken dialogue is replete with disfluent, partial, elided or
ungrammatical utterances, all of which are very hard to accommodate
in a dialogue system. Furthermore, automatic speech recognition [ASR]
is known to be a highly error-prone task, especially when dealing with
complex, open-ended discourse domains. The combination of these two
problems -- ill-formed and/or misrecognised speech inputs -- raises a
major challenge to the development of robust dialogue systems.
This thesis presents an integrated approach for addressing these issues
in the context of domain-specific dialogues for human-robot interaction
[HRI]. Several new techniques and algorithms have been developed to this
end. They can be divided into two main lines of work.
The first line of work pertains to speech recognition. We describe a
new model for context-sensitive speech recognition, which is specifically
suited to HRI. The underlying hypothesis is that, in situated human-robot
interaction, ASR performance can be significantly improved by exploiting
contextual knowledge about the physical environment (objects perceived in
the visual scene) and the dialogue history (previously referred-to objects
within the current dialogue). The language model is dynamically updated as
the environment changes, and is used to establish expectations about uttered
words which are most likely to be heard given the context.
The second line of work deals with the robust parsing of spoken inputs.
We present a new approach for this task, based on an incremental parser
for Combinatory Categorial Grammar [CCG]. The parser takes word lattices
as input and is able to handle ill-formed and misrecognised utterances by
selectively relaxing and extending its set of grammatical rules. This
operation is done via the introduction of non-standard CCG rules into the
grammar. The choice of the most relevant interpretation is then realised
via a discriminative model augmented with contextual information. The model
includes a broad range of linguistic and contextual features, and can
be trained with a simple perceptron algorithm.
All the algorithms presented in this thesis are fully implemented, and
integrated as part of a distributed cognitive architecture for autonomous
robots. We performed an extensive evaluation of our approach using a set of
Wizard-of-Oz experiments. The results obtained demonstrate very significant
improvements in accuracy and robustness compared to the baseline.
@mastersthesis{LisonThesis2008,
Address = {Saarbr\"{u}cken},
Author = {Pierre Lison},
Month = {December},
School = {Universit\"at des Saarlandes},
Title = {Robust Processing of Situated Spoken Dialogue},
Url = {http://folk.uio.no/plison/pdfs/thesis/main.thesis.plison2008.pdf},
Year = {2008}}
Pierre Lison.
Implémentation d'une interface sémantique-syntaxe basée
sur des grammaires d'unification polarisées.
Master's thesis, Université Catholique de Louvain,
Louvain-la-Neuve, Belgium, 2006.
[Abstract]
[BibTeX]
[PDF]
This work relates to {\it Natural Language Processing} [NLP],
a scientific research field situated at the intersection of several classical
disciplines such as computer science, linguistics, mathematics, psychology,
and whose object is the design of computational systems able to {\it process}
(i.e. understand and/or generate) linguistic data, whether oral or written. \\
In order to achieve that goal, it is often necessary to design {\it formal models}
able to simulate the behaviour of complex linguistic phenomena. Several theories
have been elaborated to this end. Significant divergences do exist between them
concerning linguistic foundations as well as grammatical formalisms and related
computer tools. Nevertheless, many efforts have recently been made to bring them
closer together, and two major trends clearly seem to emerge from the main
contemporary theories:
\begin{itemize}
\item They are all built around a {\it modular architecture}, explicitly distinguishing
the semantic, syntactic, morphological and phonological representation levels;
\item They all give a central position to the {\it lexicon}, rightly seen as a
crucial resource for the establishment of efficient and wide-coverage systems \\
\end{itemize}
This study examines an essential component of all these models: the
{\it semantics-syntax interface}, responsible for the mapping between the semantic
and syntactic levels of the architecture. Indeed, many distortion phenomena
can be found in every human language between these two levels. Let us mention
as examples the handling of idioms and locutions, the active/passive alternation,
the so-called "extraction" phenomena (relative clauses, interrogative clauses),
elliptic coordination, and many others. \\
We approach this issue in the framework of a particular linguistic theory, the
{\it Meaning-Text Unification Grammars} [MTUG], an articulated mathematical model
of language recently devised by S. Kahane, and its related description formalism,
{\it Polarized Unification Grammars} [PUG]. \\
The first part of our work deals with the general study of the role and inner workings
of the semantics-syntax interface within this theory. We then propose a concrete
implementation of it based on {\it Constraint Programming}. This implementation is
grounded on an axiomatization of our initial formalism into a
{\it Constraint Satisfaction Problem}. \\
Rather than developing the software entirely from scratch, we have instead chosen
to reuse an existing tool, the {\it XDG Development Kit}, and to adapt it to our needs.
It is a grammar development environment for the meta-grammatical formalism of
{\it Extensible Dependency Grammar} [XDG], entirely based on Constraint Programming. \\
Practically, this work makes three original contributions to NLP research:
\begin{enumerate}
\item An {\it axiomatization} of MTUG/PUG into a Constraint Satisfaction Problem,
enabling us to give a solid formal ground to our implementation;
\item An {\it implementation} of our semantics-syntax interface by means of a compiler
from MTUG/PUG grammars to XDG grammars called \texttt{auGUSTe} as well as by the
integration of eight new "principles" (i.e. constraint sets) into XDG;
\item And finally, the {\it application} of our compiler to a small hand-crafted grammar
centered on culinary vocabulary in order to experimentally validate our work.
\end{enumerate}
@mastersthesis{LisonMscThesis06,
Author = {Pierre Lison},
Keywords = {dependency grammar, constraint programming, meaning-text
unification grammars, polarized unification grammars, semantics-syntax
interface, extensible dependency grammar},
School = {Universit{\'e} Catholique de Louvain, Louvain-la-Neuve, Belgium},
Title = {Impl{\'e}mentation d'une Interface S{\'e}mantique-Syntaxe bas{\'e}e
sur des Grammaires d'Unification Polaris{\'e}es},
Url = {http://folk.uio.no/plison/pdfs/thesisUCL/memoire_plison.pdf},
Year = 2006}
Internal reports and project deliverables
Pierre Lison.
Dialogue management for rich, open-ended domains.
PhD thesis proposal, University of Oslo, January 2011.
[Abstract]
[BibTeX]
[PDF]
The following PhD project proposal presents a first attempt at delineating
the research objectives pursued in my PhD thesis. I start by introducing the
general issue of dialogue management and its role in spoken dialogue systems.
I also describe some of the most important challenges and open issues which need
to be addressed when developing dialogue strategies for domains which go beyond
the classical ``slot-filling'' applications which have been traditionally
studied in the literature so far.
I then outline the main research directions I plan to investigate in my thesis.
All these directions are subsumed by a common, overarching goal: develop a new,
unified framework for dialogue management in rich, open-ended domains. The proposed
approach will be ``hybrid'' in the sense that it will combine probabilistic models
and policies optimised via reinforcement learning techniques with linguistic/pragmatic
insights about dialogue structure and pragmatic behaviour.
The framework is to be implemented in a new, generic dialogue system architecture
(released under an open source license), and will be evaluated in specific interaction
scenarios, which are still to be determined.
@techreport{phdproposal,
Author = {Pierre Lison},
Institution = {University of Oslo},
Month = {January},
Title = {Dialogue management for rich, open-ended domains},
Url = {http://folk.uio.no/plison/pdfs/dissertation/phdproposal.pdf},
Year = {2011}
}
Jeremy Wyatt, Geert-Jan Kruijff, Pierre Lison, Michael Zillich, Kai Zhou,
Thomas Mörwald, Michael Brenner, Charles Gretton, Patric Jensfelt,
Kristoffer Sjöö, Andrzej Pronobis, Matej Kristan, Marko Mahnič,
and Danijel Skočaj.
Unifying representations of beliefs about beliefs and knowledge
producing actions.
Technical report, CogX Project, 2010.
WP 1, year 2 deliverable, 134 pages.
[Abstract]
[BibTeX]
[PDF]
Representing the epistemic state of the robot and how that epistemic
state changes under action is one of the key tasks in CogX. In this
report we describe progress on this in the first 18 months of the
project, and set out a typology of epistemic knowledge. We describe
the specific representations we have developed for different domains
or modalities, or are planning to develop, and how those are related
to one another.
@techreport{wp1-dr2010,
Author = {Jeremy Wyatt and Geert-Jan Kruijff and Pierre Lison and Michael
Zillich and Thomas M\"{o}rwald and Kai Zhou and Michael Brenner and Charles
Gretton and Patric Jensfelt and Kristoffer Sj\"{o}\"{o} and Andrzej
Pronobis and Matej Kristan and Marko Mahni\v{c} and Danijel Sko\v{c}aj},
Institution = {CogX Project},
Note = {WP 1, year 2 deliverable, 134 pages},
Title = {Unifying representations of beliefs about beliefs and knowledge
producing actions},
Url = {http://folk.uio.no/plison/pdfs/cogx/wp1.M21.cogxreport.pdf},
Year = {2010}}
Geert-Jan M. Kruijff, Miroslav Janiček, Ivana Kruijff-Korbayová, Pierre
Lison, Raveesh Meena, and Hendrik Zender.
Transparency in situated dialogue for interactive learning.
Technical report, CogX Project, July 2009.
WP 6, year 1 deliverable, 38 pages.
[Abstract]
[BibTeX]
[PDF]
A robot can use dialogue to try to learn more about the world. For
this to work, the robot and a human need to establish a mutually agreed-upon
understanding of what is being talked about, and why. Thereby it is particularly
important for the human to understand what the robot is after. The notion of
\emph{transparency} tries to capture this. It involves the relation between why
a question is asked, how it relates to private and shared beliefs, and how it reveals
what the robot does or does not know. For year 1, WP6 investigated means for
establishing transparency in situated dialogue for interactive learning. This
covered two aspects: how to phrase questions for knowledge gathering and refinement,
and how to verbalize knowledge. Results include methods for verbalizing what
the robot does and does not know about referents and aspects of the environment,
based on a mixture of prior and autonomously acquired knowledge and basic methods for
self-understanding (Task 6.1); and novel algorithms for determining content and
context for question subdialogues to gather more information to help resolve
misunderstandings or fill gaps (Task 6.2). WP6 also reports results on making
spoken situated dialogue more robust, employing probabilistic models for using
multi-modal information to reduce uncertainty in comprehension.
@techreport{wp6-dr2009,
Author = {Geert-Jan M. Kruijff and Miroslav Jani\v{c}ek and Ivana
Kruijff-Korbayov\'a and Pierre Lison and Raveesh Meena and Hendrik Zender},
Institution = {CogX Project},
Month = {July},
Note = {WP 6, year 1 deliverable, 38 pages},
Title = {Transparency in situated dialogue for interactive learning},
Url = {http://folk.uio.no/plison/pdfs/cogx/DR.6.1.PUBLIC.pdf},
Year = {2009}}
Others
Pierre Lison and Geert-Jan M. Kruijff.
On the proper treatment of disfluencies in spoken dialogue, 2009.
(submitted).
[Abstract]
[BibTeX]
[PDF]
Speech disfluencies such as filled pauses, repetitions and
self-corrections are unavoidable and pervasive in natural spoken
dialogue. A robust dialogue system should therefore be able to
accommodate such linguistic constructions. In this paper, we present
ongoing work on the treatment of disfluencies for spoken dialogue
comprehension. We describe an incremental parser capable of handling
slightly ill-formed utterances by selectively relaxing and extending
its set of grammatical rules. To this end, a set of non-standard rules
is introduced into the grammar. These rules are specifically tailored
to deal with the type of disfluencies encountered in natural spoken
dialogue. A discriminative model for parse selection including a
broad range of linguistic and contextual features is coupled to the
parser in order to rank the resulting semantic interpretations. The approach
has been implemented as part of an integrated spoken dialogue system
for human-robot interaction.
@misc{sigdial2009,
Author = {Pierre Lison and Geert-Jan M. Kruijff},
Note = {(submitted)},
Title = {On the Proper Treatment of Disfluencies in Spoken Dialogue},
Url = {http://folk.uio.no/plison/pdfs/cl/main.disfluencies.sigdial2009.pdf},
Year = {2009}}
Pierre Lison.
Aux sources de l'inspiration.
Revue Louvain, (145):14-15, March 2004.
Dossier thématique 'Comment apprendre la paix'.
[BibTeX]
[PDF]
@article{louvain04,
Author = {Pierre Lison},
Journal = {Revue Louvain},
Month = {March},
Note = {Dossier th\'{e}matique 'Comment apprendre la paix'},
Number = {145},
Pages = {14-15},
Title = {Aux sources de l'inspiration},
Url = {http://folk.uio.no/plison/pdfs/others/Lv_145_2.pdf},
Year = {2004}}