PETS 2007
Rio de Janeiro, Brazil - 14 October 2007

In Conjunction with 11th IEEE International Conference on Computer Vision 2007

PETS 2007 Benchmark Data

Overview

The datasets are multisensor sequences containing the following 3 scenarios, with increasing scene complexity: 1. loitering, 2. attended luggage removal (theft), 3. unattended luggage. The results of processing the datasets are to be submitted in XML format (details below).

The electronic proceedings for this workshop can be downloaded here (PDF, 30Mb).

Please e-mail datasets@pets2007.net if you require assistance obtaining these datasets for the workshop.

Aims and Objectives

The aim of this workshop is to employ existing (or new) systems for the detection of one or more of 3 types of security/criminal events, within a real-world environment. The scenarios are filmed from multiple cameras and involve multiple actors.

Preliminaries

Please read (and re-read) the following information carefully before processing the dataset, as the details are essential to the understanding of when warning/alarm events should be generated by your system. In particular, please observe the spatial and temporal requirements for each scenario. For the purposes of all scenarios, "entering the scene" refers to a person or persons entering the field of view of camera 3 for the first time.

1. Definition of Loitering

Loitering is defined as a person who enters the scene, and remains within the scene for more than t seconds. For the purposes of PETS 2007, t = 60 seconds.
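
The loitering rule can be expressed as a simple duration check. The following is a minimal sketch, assuming a tracker already reports the first and last frame at which a person is visible in camera 3. The function name and its arguments are hypothetical and are not part of the PETS 2007 tools; the frame rate of 25 frames per second is taken from the sequence description.

```python
import math

FPS = 25          # frame rate of the PETS 2007 sequences
T_LOITER = 60.0   # loitering threshold t, in seconds


def loitering_alarm_frame(entry_frame, last_frame, fps=FPS, t=T_LOITER):
    """Return the first frame at which the person has been in the scene
    for MORE than t seconds, or None if that never happens.

    entry_frame / last_frame are hypothetical tracker outputs: the first
    and last frame index at which the person is visible in camera 3.
    """
    # Smallest frame offset whose elapsed time strictly exceeds t seconds.
    offset = math.floor(t * fps) + 1
    alarm_frame = entry_frame + offset
    return alarm_frame if alarm_frame <= last_frame else None
```

With the default values, a person who enters at frame 100 triggers the event at frame 1601, provided they are still in the scene at that frame.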

2. Definition of Left-Luggage

Left-luggage in the context of PETS 2007 is defined as items of luggage that have been abandoned by their owner. It is based on the definition used for PETS 2006.

To implement a system based on this definition there are three additional components that need to be defined:

A. What items are classed as luggage? Luggage is defined to include all types of baggage that can be carried by hand e.g. trunks, bags, rucksacks, backpacks, parcels, and suitcases.

Four common types of luggage are considered in this study:

  1. Handbag
  2. Carry-on case
  3. 70 litre backpack
  4. Ski gear carrier

B. What constitutes attended and unattended luggage? In this study three rules are used to determine whether luggage is attended to by a person (or not):

  1. An item of luggage is owned and attended to by the person or persons who enter the scene with it, until such point that the luggage is no longer in physical contact with the person (contextual rule).
  2. At this point the luggage is attended to by the owner ONLY when they are within a distance a metres of the luggage (spatial rule). All distances are measured between object centroids on the ground plane (i.e. z=0). If a person is within a (=2) metres of their luggage no alarm should be raised by the system.
  3. A luggage item is unattended when the owner is further than b metres (where b>=a *) from the luggage. If a person crosses the line at b (=3) metres the system should use the spatio-temporal rule in item C, below, to detect whether this item of luggage has been abandoned (an alarm event).
* If b > a, the distance between radii a and b is determined to be a warning zone where the luggage is neither attended to nor left unattended. This zone is defined to separate the detection points of the two states, reducing uncertainties introduced due to calibration / detection errors in the sensor system etc. If a person crosses the line at a (=2) metres, but within the radius b (=3) metres, the system can be set up to trigger a warning event, using a rule similar to the spatio-temporal rule in item C, below. Both warning and alarm events will be given in the ground truth.
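
The spatial rules above can be sketched as a three-way classification of the owner-to-luggage distance. This is a minimal sketch, assuming ground-plane centroids (x, y) in metres are already available from your own tracker and calibration. The function names and state labels are hypothetical, and treating a distance of exactly a or b metres as belonging to the inner zone is an assumption, since the text does not specify the boundary case.

```python
import math

A_METRES = 2.0   # attended radius a
B_METRES = 3.0   # unattended radius b (b >= a)


def ground_distance(p, q):
    """Euclidean distance between two ground-plane centroids (x, y), z = 0."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def luggage_state(owner_xy, luggage_xy, a=A_METRES, b=B_METRES):
    """Classify luggage that is no longer in physical contact with its owner.

    Returns one of the hypothetical labels:
      "attended"     - owner within a metres
      "warning_zone" - owner between a and b metres
      "unattended"   - owner further than b metres
    """
    d = ground_distance(owner_xy, luggage_xy)
    if d <= a:            # assumption: the boundary itself counts as attended
        return "attended"
    if d <= b:            # empty zone when b == a
        return "warning_zone"
    return "unattended"
```

For example, `luggage_state((0, 0), (0, 2.5))` falls in the warning zone, because the owner is further than 2 metres but not further than 3 metres from the luggage.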


C. What constitutes abandonment of luggage by the owner? The abandonment of an item of luggage is defined spatially and temporally. Abandonment (causing an alarm) is defined as:

  1. An item of luggage that has been left unattended by the owner for a period of t (=25) consecutive seconds in which time the owner has not re-attended to the luggage, nor has the luggage been attended to by a second party (instigated by physical contact, in which case a theft / tampering event may be raised). If an item of luggage is left unattended for t (=25) seconds, the alarm event is triggered.
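
The temporal part of this rule amounts to finding a run of t consecutive unattended seconds. The following is a minimal sketch, assuming a per-frame boolean has already been computed that is true when the owner is further than b metres away and no second party is in contact with the luggage. The function name and the input representation are hypothetical.

```python
def abandonment_alarm_frame(unattended_flags, fps=25, t=25.0):
    """Return the index of the first frame at which the luggage has been
    unattended for t consecutive seconds, or None if no alarm is raised.

    unattended_flags: hypothetical per-frame booleans from your own system.
    """
    needed = int(round(t * fps))   # 625 frames for t = 25 s at 25 fps
    run_start = None               # first frame of the current unattended run
    for i, flag in enumerate(unattended_flags):
        if not flag:
            run_start = None       # owner re-attended: the timer restarts
            continue
        if run_start is None:
            run_start = i
        if i - run_start >= needed:
            return i
    return None
```

Any frame in which the luggage is attended resets the timer, which reflects the requirement that the 25 seconds be consecutive.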

3. Definition of Attended Luggage Removal (Theft)

The theft of an item of luggage is defined using a spatial constraint only. Theft is defined as an item of luggage moved further than b (=3) metres away from the owner. A warning can be issued at a (=2) metres away from the owner.
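
Because theft uses a spatial constraint only, it can be sketched as a scan for the first frames at which the owner-to-luggage distance exceeds a and b. This is a minimal sketch, assuming per-frame ground-plane positions in metres for the owner and the luggage. The function name and the track representation are hypothetical.

```python
import math


def theft_events(owner_track, luggage_track, a=2.0, b=3.0):
    """Return (warning_frame, alarm_frame) for a theft scenario.

    owner_track / luggage_track: hypothetical lists of (x, y) ground-plane
    positions in metres, one entry per frame. Either result may be None.
    """
    warning = alarm = None
    for i, (o, l) in enumerate(zip(owner_track, luggage_track)):
        d = math.hypot(o[0] - l[0], o[1] - l[1])
        if warning is None and d > a:
            warning = i            # luggage further than a metres from owner
        if d > b:
            alarm = i              # luggage further than b metres: theft
            break
    return warning, alarm
```

No timer is involved here, in contrast to the abandonment rule: the alarm is raised on the first frame at which the distance exceeds b metres.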

Calibration Data

Equidistant markers placed on the floor of the terminal were used for calibration purposes. The following point locations were used as the calibration points:

The image contains an example of the "real-world co-ordinate system" that the cameras convert to, EXCEPT that the green units correspond to one "square with black crosses at corners" distance, which in real-world units is 1.8m (to a tolerance of +/- 1cm). (Each black-cross square is composed of 3x3 floor tiles, each of which is 0.6m by 0.6m.) The (0,0) point is both the origin of the calibration and the point at which bags are put down on the ground. The "pixel positions" take the top left corner as (0,0), with travelling horizontally right increasing the first coordinate and travelling vertically down increasing the second coordinate (i.e. the xv convention).
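
The relation between the green grid units in the image and metres can be written out explicitly. This is a minimal sketch of the arithmetic only; the function name is hypothetical and the code does not read the provided calibration files.

```python
TILE_SIZE_M = 0.6                            # side of one floor tile, in metres
TILES_PER_UNIT = 3                           # each black-cross square spans 3x3 tiles
UNIT_SIZE_M = TILE_SIZE_M * TILES_PER_UNIT   # one green unit, approx. 1.8 m


def grid_to_metres(gx, gy):
    """Convert a position in green grid units to metres on the ground plane.

    The origin (0, 0) is the calibration origin, which is also the point
    at which bags are put down.
    """
    return gx * UNIT_SIZE_M, gy * UNIT_SIZE_M
```

For example, a point two units along the first axis and one unit back along the second lies at roughly (3.6, -1.8) metres from the origin.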

All spatial measurements are in metres. The provided calibration parameters were obtained using the freely available Tsai Camera Calibration Software by Reg Willson. For instructions on how to use Reg Willson's software visit Chris Needham's helpful page. More information on the Tsai camera model is available on CVonline.

An example of the provided calibration parameter XML file is given here. This XML file contains Tsai camera parameters obtained from Reg Willson's software, using the calibration points image shown above and this set of points. C++ code (available here) is provided to allow you to load and use the calibration parameters in your program (courtesy of project ETISEO). Please note that separate calibration parameters are provided for each Scenario (located within each Scenario .zip).

The DV cameras used to film all datasets are:

Camera 1: Canon MV-1 1xCCD w/progressive scan

Camera 2: Sony DCR-PC1000E 3xCMOS

Camera 3: Canon MV-1 1xCCD w/progressive scan

Camera 4: Sony DCR-PC1000E 3xCMOS

All sequences are at PAL standard resolution (full colour, 768 x 576 pixels, 25 frames per second) and are compressed as JPEG image sequences (approx. 90% quality).
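
Since the time thresholds in the scenario definitions are given in seconds while the data are image sequences, it can help to convert between the two at 25 frames per second. This is a minimal sketch with hypothetical helper names.

```python
FPS = 25   # frame rate of all PETS 2007 sequences


def frames_to_seconds(n_frames, fps=FPS):
    """Duration in seconds of a sequence of n_frames frames."""
    return n_frames / fps


def seconds_to_frames(seconds, fps=FPS):
    """Number of frames spanned by a duration in seconds (rounded)."""
    return int(round(seconds * fps))
```

For instance, a sequence of 4001 frames lasts about 160.04 seconds, and the thresholds t = 60 seconds and t = 25 seconds correspond to 1500 and 625 frames respectively.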

XML Schema

All scenarios come with four XML files. The XML files contain the camera calibration parameters for camera views 1-4 respectively.

The XML schema for the configuration / submission is given here.

For submitted XML not all details need to be provided. An example of the (minimum) data to be submitted is given here.

Training Data

Background images are provided of the monitored surveillance system. Note that the scene is never completely empty of people. However, it is envisaged that the data is useful for training some systems. Note that testing (and corresponding generation of results) should not be performed on the training data, but only on some or all of the sequences S0-S8 below.

Download

The background training data (including the calibration data) background.zip (1000 frames, 40s, 241 Mb)

Dataset S0

Scenario: nothing happening

Elements: no actors, no bags, medium density crowd

Ground truth parameters: N/A

Subjective Difficulty:

This scenario is a control sequence in which none of the events defined (loitering, theft, unattended luggage) takes place.

Sample Images

The following images show representative images captured from cameras 1-4.

Download

The entire scenario including the calibration data s00.zip (4500 frames, 180s, 862 Mb)

The ground truth for this scenario is available here. Please check regularly for updates.

Dataset S1

Scenario: general loitering 1

Elements: 1 actor, no bags, medium density crowd

Ground truth parameters: t = 60 seconds

Subjective Difficulty:

This scenario contains one person who enters the scene and then loiters, remaining almost motionless at times, then leaves the scene.

Sample Images

The following images show representative images captured from cameras 1-4.

Download

The entire scenario including the calibration data s01.zip (4001 frames, 160.04s, 862 Mb)

The ground truth for this scenario is available here. Please check regularly for updates.

Dataset S2

Scenario: general loitering 2

Elements: 1 actor, 1 large bag, medium density crowd

Ground truth parameters: t = 60 seconds

Subjective Difficulty:

This scenario contains a person who walks into the scene carrying a bag which they then proceed to put down on the ground. The person then loiters in the middle of the scene before exiting.

Sample Images

The following images show representative images captured from cameras 1-4.

Download

The entire scenario including the calibration data s02.zip (4500 frames, 180s, 839 Mb)

The ground truth for this scenario is available here. Please check regularly for updates.

Dataset S3

Scenario: theft 1

Elements: 2 actors, 1 small bag, low density crowd

Ground truth parameters: N/A

Subjective Difficulty:

This scenario contains two persons who enter the scene, one carrying a shoulder bag. Both persons walk to the centre of the scene before the bag owner places the bag on the ground. The second person picks up the bag and both persons then proceed to walk out of the scene. No warning/alarm should be generated as the bag remains within a/b metres of the owner at all times.

Sample Images

The following images show representative images captured from cameras 1-4.

Download

The entire scenario including the calibration data s03.zip (2971 frames, 118.84s, 693 Mb)

The ground truth for this scenario is available here. Please check regularly for updates.

Dataset S4

Scenario: theft 2

Elements: 4 actors, 1 large bag, low density crowd

Ground truth parameters: N/A

Subjective Difficulty:

This scenario contains four persons who walk into the scene, one carrying a rucksack. One of the other persons picks up the bag and all walk out of the scene. No warning/alarm should be generated as the rucksack remains within a/b metres of the owner at all times.

Sample Images

The following images show representative images captured from cameras 1-4.

Download

The entire scenario including the calibration data s04.zip (3500 frames, 140s, 789 Mb)

The ground truth for this scenario is available here. Please check regularly for updates.

Dataset S5

Scenario: theft 3

Elements: 2 actors, 1 large bag, medium density crowd

Ground truth parameters: a = 2 metres, b = 3 metres

Subjective Difficulty:

This scenario contains one person who enters the scene carrying a large rucksack, which is placed on the ground. A second person (thief) picks up the bag and walks out of the scene, without the bag owner immediately noticing.

Sample Images

The following images show representative images captured from cameras 1-4.

Download

The entire scenario including the calibration data s05.zip (2900 frames, 116s, 639 Mb)

The ground truth for this scenario is available here. Please check regularly for updates.

Dataset S6

Scenario: theft 4

Elements: 4 actors, 2 large bags, medium density crowd

Ground truth parameters: a = 2 metres, b = 3 metres

Subjective Difficulty:

This scenario contains two persons who enter the scene carrying two large bags. They place the bags down on the ground to give directions to a passer-by (third person). While the bag owner is distracted, a fourth person (thief) picks up and walks away with one of the bags, without the owner immediately noticing.

Sample Images

The following images show representative images captured from cameras 1-4.

Download

The entire scenario including the calibration data s06.zip (2735 frames, 109.4s, 577 Mb)

The ground truth for this scenario is available here. Please check regularly for updates.

Dataset S7

Scenario: left luggage 1

Elements: 1 actor, 1 small & 1 large bag, low density crowd

Ground truth parameters: a = 2 metres, b = 3 metres, t = 25 seconds

Subjective Difficulty:

This scenario contains a single person with two bags. The individual enters the scene, stops in the middle of the scene, before walking away whilst accidentally leaving one bag on the ground. The bag owner then returns to the scene to retrieve the bag.

Sample Images

The following images show representative images captured from cameras 1-4.

Download

The entire scenario including the calibration data s07.zip (3000 frames, 120s, 708 Mb)

The ground truth for this scenario is available here. Please check regularly for updates.

Dataset S8

Scenario: left luggage 2

Elements: 1 actor, 1 large bag, low density crowd

Ground truth parameters: a = 2 metres, b = 3 metres, t = 25 seconds

Subjective Difficulty:

This scenario contains an individual who enters the scene carrying a large bag, which is placed on the ground. The owner then walks away from the bag before retrieving it, and leaving the scene.

Sample Images

The following images show representative images captured from cameras 1-4.

Download

The entire scenario including the calibration data s08.zip (3000 frames, 120s, 646 Mb)

The ground truth for this scenario is available here. Please check regularly for updates.

Additional Information

The scenarios can also be downloaded from ftp://ftp.cs.rdg.ac.uk/pub/PETS2007/ (use anonymous login). Warning: ftp://ftp.pets.rdg.ac.uk is not listing files correctly on some ftp clients. If you experience problems you can connect to the http server at http://ftp.cs.rdg.ac.uk/PETS2007/.

Legal note: The UK Information Commissioner has agreed that the PETS 2007 datasets described here may be made publicly available for the purposes of academic research. The video sequences are copyright UK EPSRC REASON Project consortium and permission is hereby granted for free download for the purposes of the PETS 2007 workshop.