Friday, March 16, 2007

CMU vision talk: A Logical Theory for Detecting Humans in Surveillance Video (with an excursion into collaborative filtering)

LARRY S. DAVIS, University of Maryland
_______________________
> March 19, 2007
> 3:30 PM
> Hamerschlag Hall 1112
> Refreshments 3:15 PM
________________________

The capacity to robustly detect humans in video is a critical component of automated visual surveillance systems. This talk describes a bilattice-based logical reasoning approach that exploits contextual information and knowledge about interactions between humans, and augments it with the output of low-level body part detectors for human detection. Detections from low-level parts-based detectors are treated as logical facts and used to reason explicitly about the presence or absence of humans in the scene. Positive and negative information from different sources, as well as uncertainties from detections and logical rules, are integrated within the bilattice framework. This approach also generates proofs or justifications for each hypothesis it proposes. These justifications (or lack thereof) are further employed by the system to explain and validate, or reject, potential hypotheses. This allows the system to explicitly reason about complex interactions between humans and handle occlusions. These proofs are also available to the end user as an explanation of why the system thinks a particular hypothesis is actually a human. We employ a detector based on a boosted cascade of gradient histograms to detect individual body parts. We have applied this framework to analyze the presence of humans in static images from different datasets.
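
As a rough illustration of how positive and negative evidence might be combined in a bilattice, here is a minimal Python sketch. It is not the speaker's formulation. It assumes the textbook bilattice over pairs (evidence for, evidence against) in [0, 1], with min/max as the lattice operations. The class name Evidence, the detector scores, and the ground-plane cue are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """A bilattice element: (evidence for, evidence against), each in [0, 1]."""
    support: float
    against: float

    def neg(self) -> "Evidence":
        # Negation swaps the roles of supporting and opposing evidence.
        return Evidence(self.against, self.support)

    def conj(self, other: "Evidence") -> "Evidence":
        # Meet in the truth order: weakest support, strongest opposition.
        return Evidence(min(self.support, other.support),
                        max(self.against, other.against))

    def disj(self, other: "Evidence") -> "Evidence":
        # Join in the truth order: strongest support, weakest opposition.
        return Evidence(max(self.support, other.support),
                        min(self.against, other.against))

    def combine(self, other: "Evidence") -> "Evidence":
        # Join in the knowledge order: accumulate evidence from both sources,
        # even if they contradict each other.
        return Evidence(max(self.support, other.support),
                        max(self.against, other.against))


# Hypothetical detector outputs, treated as logical facts.
head = Evidence(0.8, 0.0)
torso = Evidence(0.6, 0.0)

# Hypothetical positive rule: human <- head AND torso.
positive = head.conj(torso)

# Hypothetical negative cue: the candidate is not on the ground plane.
negative = Evidence(0.0, 0.3)

# The final belief keeps both the support and the opposition.
human = positive.combine(negative)
print(human)
```

Keeping support and opposition as separate components means a hypothesis can be both supported and contradicted at once, which is what lets conflicting sources be recorded rather than averaged away.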

I will also talk about the application of this framework to the problem of collaborative filtering. In these applications, there is potentially a large number of cues that can contribute to making a final recommendation for a given user. The proposed bilattice-based logical reasoning approach exploits these multiple, noisy, and potentially contradictory sources of information to predict movie preferences. We report results on the publicly available MovieLens dataset and compare our approach against a number of state-of-the-art ranking algorithms for collaborative filtering.
