The CAVA database is a unique set of audiovisual recordings made with a binocular camera pair and a binaural microphone pair, both mounted on a person’s head. The database was gathered to develop computational methods and cognitive models for audiovisual scene analysis, as part of the European project POP (Perception on Purpose, FP6-IST-027268). It was recorded in May 2007 by two POP partners: the University of Sheffield and INRIA Grenoble Rhône-Alpes. We recorded a large variety of scenarios representative of typical audiovisual tasks, such as tracking a speaker in a complex and dynamic environment: multiple speakers participating in an informal meeting, both static and dynamic speakers, presence of acoustic noise, occluded speakers, speakers’ faces turning away from the cameras, etc.
The CAVA database is freely accessible for scientific research purposes and for non-commercial applications.