The RAVEL Corpora

HUMAVIPS Project (FP7-ICT-2009-247525)

Robots have gradually moved from factory floors to populated spaces, and there is a need to design robots able to operate in populated, unstructured and unconstrained environments. In order to interact and communicate with people in a natural way, robots must make full use of their sensory, communication and motor abilities.

The RAVEL corpora (Robots with Auditory and Visual AbiLities) contains typical scenarios useful for the development and benchmarking of human-robot interaction (HRI) systems. RAVEL is freely accessible for research purposes and for non-commercial applications. The RAVEL Corpora is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Please cite the following publication:

X. Alameda-Pineda, J. Sanchez-Riera, J. Wienke, V. Franc, J. Cech, K. Kulkarni, A. Deleforge, R. Horaud. RAVEL: An Annotated Corpus for Training Robots with Audiovisual Abilities. Journal on Multimodal User Interfaces, 7(1-2):79-91, 2013.


@Article{Alameda-Ravel2013,
  author  = "Alameda-Pineda, Xavier and Sanchez-Riera, Jordi and Wienke, Johannes and Franc, Vojtech and Cech, Jan and Kulkarni, Kaustubh and Deleforge, Antoine and Horaud, Radu P.",
  title   = "RAVEL: An Annotated Corpus for Training Robots with Audiovisual Abilities",
  journal = "Journal on Multimodal User Interfaces",
  volume  = "7",
  number  = "1-2",
  pages   = "79-91",
  year    = "2013",
  url     = "http://hal.inria.fr/hal-00720734/en"
}

Bi-binaural (two microphone pairs) and binocular (a stereoscopic camera pair) recordings.

The RAVEL data set consists of synchronized auditory and visual data gathered with four microphones and two cameras. The stability of the acquisition device ensures the repeatability of the recordings and, hence, the significance of experiments that use the data set. In addition, the scenarios were designed to benchmark algorithms targeting different applications: action recognition, gender identification, audio-visual object detection, dialog modeling, etc. The data were recorded within the framework of the HUMAVIPS European project in December 2010 at INRIA Grenoble Rhône-Alpes.
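For orientation only, the sketch below shows one way such a synchronized sequence could be read in Python with OpenCV and the soundfile package, pairing each stereo video frame with the corresponding block of four-channel audio samples. The file names and container formats are illustrative assumptions, not the actual organization of the RAVEL recordings.

# Illustrative sketch only: file names and formats are assumptions,
# not the actual layout of the RAVEL recordings.
import cv2              # pip install opencv-python
import soundfile as sf  # pip install soundfile

left = cv2.VideoCapture("scene_left.avi")    # left camera of the stereo pair (assumed name)
right = cv2.VideoCapture("scene_right.avi")  # right camera of the stereo pair (assumed name)
audio, rate = sf.read("scene_mics.wav")      # assumed 4-channel microphone recording

fps = left.get(cv2.CAP_PROP_FPS)
samples_per_frame = int(round(rate / fps))

frame_idx = 0
while True:
    ok_l, img_l = left.read()
    ok_r, img_r = right.read()
    if not (ok_l and ok_r):
        break
    # Audio samples (all four channels) that overlap this video frame.
    chunk = audio[frame_idx * samples_per_frame:(frame_idx + 1) * samples_per_frame, :]
    # ... process the stereo image pair (img_l, img_r) and the audio chunk ...
    frame_idx += 1

left.release()
right.release()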

The RAVEL data set: an overview

The data set is composed of over 40 audio-visual sequences, roughly 100 minutes of recordings of different scenarios. These scenarios are split into three classes: action recognition, robot gestures and interaction. A detailed description of each class, together with the corresponding download links, can be found on the action recognition, robot gestures and interaction pages respectively.

