My name is Pierre Lison and I am currently working as a Postdoctoral Research Fellow in the Language Technology Group of the Department of Informatics, University of Oslo. My research is funded by a three-year research grant from the Norwegian Research Council.
My current research concentrates on probabilistic dialogue modelling and its applications to various language technology tasks. For my PhD, I developed a hybrid approach to (spoken) dialogue management which combines expert domain knowledge and statistical models into a unified framework. One outcome of this work was the release of the OpenDial toolkit for building robust and adaptive dialogue systems.
I am now focusing on the use of dialogue modelling for another application domain, namely statistical machine translation. In spite of great progress in recent years, machine translation remains poor at adapting translations to the relevant context. To translate a dialogue (for instance, film subtitles), current systems generally operate one utterance at a time and ignore the global coherence and structure of the conversation. The goal of this research is to enhance the contextual awareness of translation systems by incorporating dialogue-level features into the statistical translation models and exploiting them to dynamically modulate the translation outputs.
Jörg Tiedemann and I have just published a new major release of the OpenSubtitles collection of parallel corpora. The release is compiled from a large database of movie and TV subtitles and includes 2.6 billion sentences across 60 languages! This makes it the world’s largest freely available collection of parallel corpora. See our paper at LREC for more details on the corpus.
My article “A hybrid approach to dialogue management based on probabilistic rules” will soon appear in Computer Speech & Language (see the “Publications” section). The article is essentially a summary of my PhD work and describes the formalisation of probabilistic rules, the statistical estimation of unknown parameters, and the empirical evaluation of the framework in a human-robot interaction domain.