Datasets
Early Salient Region Selection Does Not Drive Rapid Visual Categorization
This archive contains images and ground-truth labels used in Tsotsos et al., "Early Salient Region Selection Does Not Drive Rapid Visual Categorization," arXiv preprint arXiv:1901.04908 (2019) [https://arxiv.org/abs/1901.04908]. Additional details, sufficient for reproducing the experiments in that paper, can be found in its Methods section.
The Thorpe dataset originally appeared in Thorpe et al., "Speed of processing in the human visual system," Nature. 1996;381(6582):520–522. The images can be purchased from the Corel Stock Photo Library (https://www.amazon.com/Corel-Stock-Photo-Library-2/dp/B000V933GI); ground-truth masks are provided for the 996 images containing targets. DOWNLOAD
The original set of images used in Potter et al., "Detecting meaning in RSVP at 13 ms per picture," Attention, Perception, & Psychophysics. 2014;76(2):270–279, is provided with permission from M. Potter and her colleagues. DOWNLOAD
The ground truth masks are available for 365 targets as defined in the experiment by Potter et al. DOWNLOAD
To download the full package: DOWNLOAD
ACCP Image Dataset
The Active Control of Camera Parameters (ACCP) image dataset was created to evaluate how sensitive various object detection algorithms are to the ambient environment and to camera parameters. It consists of 2240 images of 5 static objects, captured with different shutter speeds and voltage gains under various lighting conditions.
Reference: Wu, Yulong, and John Tsotsos. “Active control of camera parameters for object detection algorithms.” arXiv preprint arXiv:1705.05685 (2017).
Joint Attention in Autonomous Driving (JAAD)
JAAD is a new dataset (by I. Kotseruba, A. Rasouli, and J.K. Tsotsos) for studying joint attention in the context of autonomous driving. It contains an annotated collection of short video clips representing scenes typical of everyday urban driving in various weather conditions.
The JAAD dataset contains 346 high-resolution video clips (most are 5-10 seconds long) extracted from approximately 240 hours of driving video filmed at several locations in North America and Eastern Europe.
It is available at: http://data.nvision.eecs.yorku.ca/JAAD_dataset/
Place Recognition Dataset
A dataset of images of 17 indoor places, collected by two robots (virtualMe and Pioneer) under different lighting conditions (day and night), used for place recognition in the following paper:
R. Sahdev and J. K. Tsotsos, "Indoor Place Recognition for Localization of Mobile Robots," In 13th International Conference on Computer and Robot Vision (CRV), Victoria, BC, June 1-3, 2016.
We respectfully ask that if you use the dataset, you cite the above paper as its source. DOWNLOAD
Sensor Parameters Dataset
A dataset of images captured under variable sensor shutter speeds and gain values. The dataset was compiled and used as part of the following paper:
A. Andreopoulos, J. K. Tsotsos. “On Sensor Bias In Experimental Methods for Comparing Interest Point, Saliency and Recognition Algorithms”. IEEE Transactions On Pattern Analysis and Machine Intelligence (2011, in press).
We respectfully ask that if you use the dataset, you cite the above paper as its source. DOWNLOAD
Cardiac MRI Dataset
This webpage contains a dataset of short-axis cardiac MR images together with ground-truth endocardial and epicardial segmentations of the left ventricle. The dataset was first compiled and used as part of the following paper:
Alexander Andreopoulos, John K. Tsotsos, Efficient and Generalizable Statistical Models of Shape and Appearance for Analysis of Cardiac MRI, Medical Image Analysis, Volume 12, Issue 3, June 2008, Pages 335-357. PDF
Use is free of charge; we respectfully ask that if you use this dataset, you cite the above paper as its source.
The authors would like to acknowledge Dr. Paul Babyn, Radiologist-in-Chief, and Dr. Shi-Joon Yoo, Cardiac Radiologist, of the Hospital for Sick Children, Toronto, for the data sets and their assistance with this research project.
Disclaimer: The dataset is provided for research purposes only, and no warranties are provided nor liabilities assumed by York University or the researchers involved in producing the dataset.
Downloads
-
Cardiac MR images acquired from 33 subjects. Each subject's sequence consists of 20 frames and 8-15 slices along the long axis, for a total of 7980 images. The sequence for each subject x is in a separate .mat (MATLAB) file named sol_yxzt_patx.mat. These are the raw, unprocessed images, originally stored as 16-bit DICOM files. DOWNLOAD
-
Segmentations of the above sequences. We manually segmented each of the 7980 images in which both the endocardium and epicardium of the left ventricle were visible, for a total of 5011 segmented MR images and 10022 contours. The segmentation for each subject x is in a separate .mat (MATLAB) file named manual_seg_32points_patx.mat. Each contour is described by 32 points given in pixel coordinates (see the loading sketch after this list). DOWNLOAD
-
Two small MATLAB functions for visualizing the segmentations on their corresponding images. Please see the included README file for examples of their use. DOWNLOAD
-
Metadata containing the pixel spacing (mm per pixel) and the spacing between slices along the long axis (mm per slice) of each subject's sequence, as well as each subject's age and diagnosis. DOWNLOAD
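For orientation, below is a minimal MATLAB sketch of how one might open a subject's image file and inspect it. The variable names stored inside the .mat files are not documented on this page, so the sketch looks them up with fieldnames rather than assuming them, and the rows x columns x slice x frame ordering is an assumption based on the sol_yxzt file naming; the included visualization functions and README remain the authoritative reference.

    % Minimal sketch, assuming the y-x-z-t ordering implied by the file name;
    % the variable name inside the .mat file is looked up rather than hard-coded.
    S = load('sol_yxzt_pat1.mat');   % raw image sequence for subject 1
    f = fieldnames(S);
    vol = S.(f{1});                  % assumed 4-D array: rows x cols x slice x frame
    disp(size(vol));                 % check the actual dimensions first
    imagesc(vol(:,:,1,1));           % display slice 1, frame 1 (assumed ordering)
    colormap gray; axis image;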
Fixation Data and Code
Fixation data and code are available here: AIM.zip. The code is written in MATLAB and includes a variety of learned ICA bases. Note that the code expects a relatively low-resolution image, since the receptive fields are small; for a larger, high-resolution image you may wish to try some larger receptive fields. If you have any questions about the code, feel free to ask. Within MATLAB, you should be able to simply do something along the lines of info = AIM('21.jpg', 0.5); with the second parameter being a rescaling factor. It is also possible to vary a number of parameters both on the command line and within the code itself, so feel free to experiment. There are also some comments and notes specific to psychophysics examples within one of the included files.
Note that all of these bases should result in better performance than that based on the *very* small 7×7 filters used in the original NIPS paper. The eye-tracking data may be found at eyetrackingdata.zip. This includes, in addition to the raw data, binary maps for each image indicating which pixel locations were fixated. Correspondence is best addressed to Neil.Bruce [at] sophia.inria.fr
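As a concrete illustration of the call described above, here is a minimal MATLAB sketch; treating the returned value as a 2-D map that can be displayed directly is an assumption, so check the included README for the exact output format.

    % Sketch of the usage described above; assumes AIM.m from AIM.zip is on the MATLAB path.
    info = AIM('21.jpg', 0.5);   % second argument: rescaling factor applied to the input image
    imagesc(info);               % assumes the output is a 2-D saliency/information map
    colormap gray; axis image;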
Facial Gestures Dataset
This webpage contains a dataset of images of facial gestures taken by a camera mounted on a wheelchair. The dataset was first compiled by Gregory Fine and used as part of his Master of Science thesis:
Examining the feasibility of face gesture detection using a wheelchair mounted camera.
Use is free of charge; we respectfully ask that if you use this dataset, you cite the above thesis as its source.
Disclaimer: The dataset is provided for research purposes only, and no warranties are provided nor liabilities assumed by York University or the researchers involved in producing the dataset.
Downloads:
-
Facial gesture images acquired from 10 subjects. Each subject's sequence consists of 10 gestures and 100 images for each gesture, for a total of 9140 images. DOWNLOAD
-
Images used to train the AAM (Active Appearance Model) algorithm to detect the eyes and mouth. The set contains the images along with ground-truth contours of the eyes and mouth. DOWNLOAD
-
Images of facial gestures used to test the false-positive rate of the algorithm. The set contains 440 images of facial gestures produced by 5 subjects. DOWNLOAD