Tuesday, April 23, 2013

Lab meeting Apr 24th 2013 (Hank Lin): Scene Parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers

Presented by: Hank Lin

From: Proc. of the International Conference on Machine Learning (ICML'12), Edinburgh, Scotland, 2012.

Authors: C. Farabet, C. Couprie, L. Najman, Y. LeCun

Link: Paper Video

Scene parsing, or semantic segmentation, consists in labeling each pixel in an image with the category of the object it belongs to. It is a challenging task that involves the simultaneous detection, segmentation and recognition of all the objects in the image.
The scene parsing method proposed here starts by computing a tree of segments from a graph of pixel dissimilarities. Simultaneously, a set of dense feature vectors is computed which encodes regions of multiple sizes centered on each pixel. The feature extractor is a multiscale convolutional network trained from raw pixels. The feature vectors associated with the segments covered by each node in the tree are aggregated and fed to a classifier which produces an estimate of the distribution of object categories contained in the segment. A subset of tree nodes that cover the image are then selected so as to maximize the average “purity” of the class distributions, hence maximizing the overall likelihood that each segment will contain a single object. The convolutional network feature extractor is trained end-to-end from raw pixels, alleviating the need for engineered features. After training, the system is parameter free.
The system yields record accuracies on the Stanford Background Dataset (8 classes), the Sift Flow Dataset (33 classes) and the Barcelona Dataset (170 classes) while being an order of magnitude faster than competing approaches, producing a 320 × 240 image labeling in less than 1 second.
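
The purity-based selection of tree nodes described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes that impurity is measured by the Shannon entropy of each node's predicted class distribution, and that each leaf simply picks the lowest-entropy node on its path to the root. The function names (`entropy`, `select_cover`, `label_leaves`) and the toy tree are hypothetical.

```python
import math

def entropy(dist):
    """Shannon entropy of a class distribution; lower means purer."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def select_cover(parent, dists):
    """For every leaf, pick the purest node on its leaf-to-root path.

    parent: dict mapping node -> parent node (the root maps to None).
    dists:  dict mapping node -> predicted class distribution (assumed given,
            e.g. by a classifier applied to pooled features).
    Returns a dict mapping each leaf to its selected node.
    """
    internal = {p for p in parent.values() if p is not None}
    leaves = [n for n in parent if n not in internal]
    choice = {}
    for leaf in leaves:
        best, best_cost = None, float("inf")
        node = leaf
        while node is not None:          # walk up towards the root
            cost = entropy(dists[node])
            if cost < best_cost:
                best, best_cost = node, cost
            node = parent[node]
        choice[leaf] = best
    return choice

def label_leaves(choice, dists):
    """Label each leaf with the most likely class of its selected node."""
    return {leaf: max(range(len(dists[n])), key=lambda k: dists[n][k])
            for leaf, n in choice.items()}

# Toy tree: root 0 has children 1 and 2; node 1 has leaves 3 and 4.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1}
dists = {0: [0.5, 0.5], 1: [0.9, 0.1], 2: [0.1, 0.9],
         3: [0.6, 0.4], 4: [0.7, 0.3]}
cover = select_cover(parent, dists)
labels = label_leaves(cover, dists)
```

In this toy tree, leaves 3 and 4 both select their parent (node 1), because its distribution is purer than their own, while leaf 2 keeps itself. A full system would also need the segmentation tree and the learned classifier, which are outside the scope of this sketch.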

Wednesday, April 17, 2013

Lab meeting Apr 17th 2013 (Bang-Cheng Wang): Biped Walking Pattern Generation by using Preview Control of Zero-Moment Point

Presented by Bang-Cheng Wang

From Proceedings of the 2003 IEEE
International Conference on Robotics & Automation
Taipei, Taiwan, September 14-19, 2003.

Shuuji KAJITA, Fumio KANEHIRO, Kenji KANEKO, Kiyoshi FUJIWARA, Kensuke HARADA, Kazuhito YOKOI and Hirohisa HIRUKAWA

We introduce a new method of biped walking pattern
generation by using a preview control of the zero-moment
point (ZMP). First, the dynamics of a biped
robot are modeled as a running cart on a table, which
gives a convenient representation for treating the ZMP. After
reviewing conventional methods of ZMP-based pattern
generation, we formalize the problem as the design of a
ZMP tracking servo controller. It is shown that we can
realize such a controller by adopting the preview control
theory that uses the future reference. It is also shown
that a preview controller can be used to compensate for
the ZMP error caused by the difference between a simple
model and the precise multibody model. The effectiveness
of the proposed method is demonstrated by a
simulation of walking on spiral stairs.
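
The cart-on-a-table model mentioned above can be sketched in a few lines. The sketch assumes the usual form of that model: the center of mass moves at a constant height `z_c`, the control input is the jerk of the center of mass, and the ZMP is p = x - (z_c / g) * x''. It covers only the model, not the preview controller itself; the function names and the numeric values are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2

def cart_table_step(state, jerk, dt):
    """Advance the cart state (position, velocity, acceleration) by dt,
    assuming the jerk input is held constant over the step."""
    x, v, a = state
    return (x + v * dt + a * dt ** 2 / 2 + jerk * dt ** 3 / 6,
            v + a * dt + jerk * dt ** 2 / 2,
            a + jerk * dt)

def zmp(state, z_c, g=G):
    """ZMP of the cart-table model: p = x - (z_c / g) * acceleration."""
    x, _, a = state
    return x - (z_c / g) * a

# A cart at rest has its ZMP directly below the center of mass.
p_rest = zmp((0.1, 0.0, 0.0), z_c=0.8)

# Accelerating forward moves the ZMP backward relative to the cart.
p_accel = zmp((0.0, 0.0, 1.0), z_c=0.8)

# One integration step under a constant jerk input.
next_state = cart_table_step((0.0, 0.0, 0.0), jerk=6.0, dt=1.0)
```

A ZMP tracking controller would choose the jerk at each step so that `zmp(state, z_c)` follows a reference trajectory. Preview control does this using future reference values, which this sketch does not implement.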


Tuesday, April 09, 2013

Lab Meeting April 10, 2013 (Jimmy): Geodesic Flow Kernel for Unsupervised Domain Adaptation

Title: Geodesic Flow Kernel for Unsupervised Domain Adaptation
Authors: Boqing Gong, Yuan Shi, Fei Sha, Kristen Grauman
In: CVPR2012

In real-world applications of visual recognition, many factors—such as pose, illumination, or image quality—can cause a significant mismatch between the source domain on which classifiers are trained and the target domain to which those classifiers are applied. As such, the classifiers often perform poorly on the target domain. Domain adaptation techniques aim to correct the mismatch. Existing approaches have concentrated on learning feature representations that are invariant across domains, and they often do not directly exploit low-dimensional structures that are intrinsic to many vision datasets. In this paper, we propose a new kernel-based method that takes advantage of such structures. Our geodesic flow kernel models domain shift by integrating an infinite number of subspaces that characterize changes in geometric and statistical properties from the source to the target domain. Our approach is computationally advantageous, automatically inferring important algorithmic parameters without requiring extensive crossvalidation or labeled data from either domain. We also introduce a metric that reliably measures the adaptability between a pair of source and target domains. For a given target domain and several source domains, the metric can be used to automatically select the optimal source domain to adapt and avoid less desirable ones. Empirical studies on standard datasets demonstrate the advantages of our approach over competing methods.
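
The idea of integrating over a continuum of subspaces between the source and target domains can be illustrated with a toy case. The sketch below is not the paper's closed-form kernel: it assumes one-dimensional subspaces of the plane, each given by an angle, and approximates G = ∫ Φ(t) Φ(t)ᵀ dt numerically with the midpoint rule, where Φ(t) is the unit direction interpolated from source to target. It also assumes the two angles differ by at most π/2, so that the interpolation follows the shortest path between the two lines. All names are hypothetical.

```python
import math

def toy_geodesic_flow_kernel(theta_src, theta_tgt, steps=1000):
    """Approximate the integral of Phi(t) Phi(t)^T for t in [0, 1], where
    Phi(t) is the unit vector at angle theta_src + t * (theta_tgt - theta_src)."""
    g11 = g12 = g22 = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps                      # midpoint rule
        theta = theta_src + t * (theta_tgt - theta_src)
        c, s = math.cos(theta), math.sin(theta)
        g11 += c * c
        g12 += c * s
        g22 += s * s
    return [[g11 / steps, g12 / steps],
            [g12 / steps, g22 / steps]]

def kernel_value(G, x, y):
    """Kernel between two 2-D feature vectors: x^T G y."""
    return sum(x[i] * G[i][j] * y[j] for i in range(2) for j in range(2))

# Source subspace along the x-axis, target subspace rotated by 60 degrees.
G_toy = toy_geodesic_flow_kernel(0.0, math.pi / 3)
k = kernel_value(G_toy, [1.0, 0.0], [0.0, 1.0])
```

The resulting matrix is symmetric with unit trace, since every Φ(t) is a unit vector. Features that lie along directions swept between the source and the target receive more weight than those outside that range.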