Depth information has proved highly effective in the image processing community, and since the introduction of the Kinect, RGB-D data has been explored extensively for a wide range of applications. This makes the development of Kinect image and video databases crucial.

Our aim here is to create a Kinect face database of images of different facial expressions under varying lighting and occlusion conditions, to serve a variety of research purposes.


The dataset consists of multimodal facial images of 52 people (14 female, 38 male) acquired with the Kinect. The data was captured in two sessions held about half a month apart. In each session, the dataset provides facial images of each person in 9 states covering different facial expressions, lighting, and occlusion conditions: neutral, smile, open mouth, left profile, right profile, eyes occluded, mouth occluded, occlusion by paper, and light on [Figure 1]. All images are provided in three modalities: the RGB color image, the depth map (in two forms: a bitmap depth image and a text file containing the original depth levels sensed by the Kinect), and a 3D object. In addition, the dataset comes with manual landmarks for 6 facial positions: left eye, right eye, tip of the nose, left mouth corner, right mouth corner, and chin [Figure 2]. Other information about each person, such as gender, year of birth, glasses (whether the person wears glasses), and the capture time of each session, is also available.
Figure 1: Illustration of the different facial variations acquired in our database: (a) neutral face; (b) smiling; (c) mouth open; (d) strong illumination; (e) occlusion by sunglasses; (f) occlusion by hand; (g) occlusion by paper; (h) right profile face and (i) left profile face. (Top row: the RGB images. Bottom row: the depth maps aligned with the RGB images above.)
Figure 2: The marker point positions in the neutral image and the corresponding .txt file.
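As a minimal sketch of how the 6 manual landmarks might be read, the helper below assumes the marker .txt file stores one "x y" pair per line, in the order listed above (left eye, right eye, nose tip, left mouth corner, right mouth corner, chin); the actual file layout should be checked against Figure 2 before relying on it.

```python
def read_landmarks(path):
    """Read the 6 manual landmark coordinates from a marker .txt file.

    Assumption: one whitespace-separated 'x y' pair per line, in the
    order left eye, right eye, nose tip, left mouth corner,
    right mouth corner, chin. Verify against the real files.
    """
    names = ["left_eye", "right_eye", "nose_tip",
             "mouth_left", "mouth_right", "chin"]
    with open(path) as f:
        coords = [tuple(map(float, line.split()))
                  for line in f if line.strip()]
    return dict(zip(names, coords))
```

A caller would then access, for example, `read_landmarks(path)["nose_tip"]` to obtain the nose-tip coordinates.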

Acquiring process

The recording took place in an indoor laboratory environment at the EURECOM Institute. Each person was recorded in two sessions held at different times. In each session, the person was asked to perform the different face states in front of the Kinect camera at a distance of around 1 meter. The images were then normalized by cropping a 256x256 region centered on the face, with the upper-left corner at coordinates (192, 74) [Figure 3].
Figure 3: Cropping region.
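The cropping step above can be sketched as follows: a 640x480 Kinect frame is cropped to a 256x256 face region whose upper-left corner sits at pixel (x=192, y=74). This is only an illustration of the stated geometry, not code shipped with the database.

```python
import numpy as np

def crop_face(frame):
    """Crop a 640x480 Kinect frame to the database's 256x256 face region.

    The upper-left corner of the crop is at (x=192, y=74), as described
    in the acquisition section.
    """
    x, y, size = 192, 74, 256
    return frame[y:y + size, x:x + size]

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # dummy RGB frame
face = crop_face(frame)
assert face.shape == (256, 256, 3)
```

The same slice applies unchanged to single-channel depth images of the same 640x480 resolution.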


The structure of the database is illustrated by the following hierarchy [Figure 4].

Figure 4: The hierarchy structure of the database.

The meaning of each part is as follows:

0001: Folder named with the person identifier 0001; each person is assigned a 4-digit identifier.
Info.txt: Information about the person: ID, gender, year of birth, glasses (whether the person wears glasses), and the capture time of each session (format: yyyy:mm:dd hh:mm:ss).
S1: Session 1 (there are 2 sessions: S1 and S2).
RGB: Contains the RGB images; file name format: rgb_personIdentifier_session_faceStatus.bmp
Depth: Contains the depth information.
DepthBMP: Contains the .bmp depth images; file name format: depth_personIdentifier_session_faceStatus.bmp
DepthKinect: Contains the .txt files with the depth level from the sensor for each pixel in the original coordinates (before cropping, at size 640x480); file name format: depth_personIdentifier_session_faceStatus.txt
3DObj: Contains the .obj 3D object files; file name format: depth_personIdentifier_session_faceStatus.obj
Mark: Contains the information about the marked points.
Mark3DObj: Contains the .txt files with the coordinates in 3D object space; file name format: depth_personIdentifier_session_faceStatus_Points_OBJ.txt
MarkDepth: Contains the .txt files with the coordinates in the original depth image coordinates (before cropping, at size 640x480); file name format: depth_personIdentifier_session_faceStatus_Points_TXT.txt
MarkRGB: Contains the .txt files with the coordinates in 2D RGB image space; file name format: rgb_personIdentifier_session_faceStatus_Points.txt
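As a hedged sketch of working with this layout, the helpers below build file names following the convention above and load a DepthKinect .txt file into a 640x480 array. The exact session and faceStatus tokens used in real file names, and the assumption that the depth .txt stores whitespace-separated raw depth levels in row-major order, should be verified against the actual files.

```python
import numpy as np

def build_filename(modality, person_id, session, face_status, ext):
    """Build a file name following the database convention.

    The session and faceStatus token spellings are assumptions;
    check them against the actual files.
    """
    return f"{modality}_{person_id:04d}_{session}_{face_status}.{ext}"

def load_depth_txt(path):
    """Load a DepthKinect .txt file into a 480x640 depth array.

    Assumption: whitespace-separated raw depth levels in row-major
    order, one value per pixel of the original 640x480 frame.
    """
    values = np.loadtxt(path)
    return values.reshape(480, 640)

print(build_filename("rgb", 1, "s1", "neutral", "bmp"))
# prints "rgb_0001_s1_neutral.bmp"
```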


Obtaining the database

Please fill in this form online to request the database. You must be a representative of your organization (student requests are not accepted) and use your official organizational email address. After submitting the form, you will automatically receive an email containing the Usage Agreement between your organization and EURECOM, along with instructions for obtaining the database.


Any publication using this database must cite the following paper:

R. Min, N. Kose, and J.-L. Dugelay, "KinectFaceDB: A Kinect Database for Face Recognition," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 44, no. 11, pp. 1534-1548, Nov. 2014, doi: 10.1109/TSMC.2014.2331215.

@article{min2014kinectfacedb,
  author={Min, Rui and Kose, Neslihan and Dugelay, Jean-Luc},
  journal={IEEE Transactions on Systems, Man, and Cybernetics: Systems},
  title={KinectFaceDB: A Kinect Database for Face Recognition},
  year={2014},
  volume={44},
  number={11},
  pages={1534-1548},
  doi={10.1109/TSMC.2014.2331215}
}

Interested researchers can also refer to the Florence Superface dataset in order to test their algorithms on a separate dataset, and thus use distinct sets for training and testing.



If you have any questions or requests regarding the EURECOM Kinect Face Dataset, please contact Prof. Jean-Luc Dugelay.