
Document representation by neural networks for spoken document understanding (La représentation des documents par réseaux de neurones pour la compréhension de documents parlés)

Abstract : Spoken language understanding applications aim to extract relevant items of meaning from a spoken signal. There are two distinct types of spoken language understanding: understanding of human/human dialogue and understanding of human/machine dialogue. Depending on the type of conversation, the structure of the dialogues and the goal of the understanding process vary. However, in both cases, automatic systems most often include a speech recognition step that generates a textual transcript of the spoken signal. In adverse conditions, even the most advanced speech recognition systems produce erroneous or partly erroneous transcripts. These errors can be explained by the presence of information of various natures and functions, such as speaker and ambient specificities. They can have a significant adverse impact on the performance of the understanding process. The first part of the contribution of this thesis shows that deep autoencoders produce a more abstract latent representation of the transcript. This latent representation allows a spoken language understanding system to be more robust to automatic transcription errors. In the second part, we propose two different approaches to generate more robust representations by combining multiple views of a given dialogue in order to improve the results of the spoken language understanding system. The first approach combines multiple thematic spaces to produce a better representation. The second one introduces new autoencoder architectures that use supervision in denoising autoencoders. These contributions show that these architectures reduce the difference in performance between a spoken language understanding system using automatic transcripts and one using manual transcripts.

Cited literature : 70 references
Contributor : Abes Star
Submitted on : Wednesday, June 27, 2018 - 3:09:06 PM
Last modification on : Tuesday, January 14, 2020 - 10:38:06 AM
Long-term archiving on : Thursday, September 27, 2018 - 4:28:56 AM


Version validated by the jury (STAR)


  • HAL Id : tel-01824741, version 1



Killian Janod. La représentation des documents par réseaux de neurones pour la compréhension de documents parlés. Intelligence artificielle [cs.AI]. Université d'Avignon, 2017. Français. ⟨NNT : 2017AVIG0222⟩. ⟨tel-01824741⟩


