MACHINE LEARNING READING
GROUP ARCHIVE
Meets bi-weekly, Wednesdays, 10:00 - 12:00,
Columbia Conference Room.
02/01/2006 (led by Anindya)
CONDENSATION—Conditional
Density Propagation for Visual Tracking
Original
Condensation Algorithm
01/18/2006 (led by Andri)
C. Xu and J. L. Prince, "Snakes, Shapes, and Gradient Vector Flow", IEEE Transactions on Image Processing, 7(3),
pp. 359-369, March 1998
C. Xu and J.L. Prince, "Gradient Vector Flow: A New External Force for Snakes," Proc. IEEE Conf. on Comp. Vis. Patt. Recog. (CVPR), Los Alamitos: Comp.
Soc. Press, pp. 66-71, June 1997.
M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active contour models. In Proc. 1st ICCV, pages 259-268, June 1987. London, UK.
Powerpoint slides of the seminar
12/07/2005 (led by Bing)
Principal Manifolds and Nonlinear Dimension Reduction via Local Tangent Space Alignment
Supporting Papers:
Adaptive manifold learning
Local smoothing for manifold learning
Isometric embedding and continuum ISOMAP
Regularized principal manifolds
11/23/2005 (led by Tian)
Learning and Design of Principal Curves
11/09/05 (led by Tian)
Principal curves
Trevor Hastie, Werner Stuetzle: Principal Curves
Miguel Carreira-Perpinan: Dimensionality Reduction
Kegl, B.; Krzyzak, A.; Linder, T.; Zeger, K.: Learning and design of principal curves, sequel
Kui-Yu Chang; Ghosh, J.: A unified model for probabilistic principal surfaces
10/12/05, 10/26/05 (led by Houwu Bai)
Fast Gauss Transform and applications to machine learning
A. Elgammal, R. Duraiswami and L. Davis: "Efficient
Kernel Density Estimation Using the Fast Gauss Transform with Applications
to Color Modeling and Tracking", IEEE PAMI 2003
C. Yang, R. Duraiswami and L. Davis: "Efficient
Kernel Machines Using the Improved Fast Gauss Transform", NIPS
2005
Changjiang Yang, Ramani Duraiswami, Nail A. Gumerov and Larry Davis:
"Improved Fast Gauss Transform and Efficient Kernel Density Estimation", ICCV 2003
Changjiang Yang, Ramani Duraiswami, Nail A. Gumerov:
"Improved Fast Gauss Transform", UMD TR 2003
1/21/2005
Yair Weiss. Segmentation using eigenvectors: a
unifying view. Proceedings IEEE International
Conference on Computer Vision p. 975-982
(1999)
http://www.cs.huji.ac.il/~yweiss/iccv99.pdf
ABSTRACT
Automatic grouping and segmentation of images
remains a challenging problem in computer vision.
Recently, a number of authors have demonstrated
good performance on this task using methods that
are based on eigenvectors of the affinity matrix.
These approaches are extremely attractive in that
they are based on simple eigendecomposition
algorithms whose stability is well understood.
Nevertheless, the use of eigendecompositions in
the context of segmentation is far from well
understood. In this paper we give a unified
treatment of these algorithms, and show the close
connections between them while highlighting their
distinguishing features. We then prove results on
eigenvectors of block matrices that allow us to
analyze the performance of these algorithms in
simple grouping settings. Finally, we use our
analysis to motivate a variation on the existing
methods that combines aspects from different
eigenvector segmentation algorithms. We
illustrate our analysis with results on real and
synthetic images.
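As a concrete illustration of the eigenvector-based grouping the abstract refers to, here is a minimal sketch in plain Python. It is not the variation the paper proposes: it thresholds the second eigenvector of the degree-normalised affinity matrix, which is one common formulation. The Gaussian affinity, the `sigma` parameter, the 1-D toy points, and all function names are assumptions made for illustration.

```python
import math


def affinity_matrix(points, sigma):
    """Gaussian affinities W[i][j] = exp(-(x_i - x_j)^2 / (2 sigma^2))."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in points]
            for a in points]


def second_eigenvector(W, iters=500):
    """Second eigenvector of N = D^{-1/2} W D^{-1/2} via deflated power iteration."""
    n = len(W)
    deg = [sum(row) for row in W]
    N = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # The leading eigenvector of N is known in closed form:
    # sqrt(deg), normalised, with eigenvalue 1.
    norm = math.sqrt(sum(deg))
    v1 = [math.sqrt(d) / norm for d in deg]
    # Deterministic, non-constant start vector (an arbitrary choice).
    v = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        # Remove the component along v1, multiply by N, renormalise.
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        v = [sum(N[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    return v


def bipartition(points, sigma=1.0):
    """Split points into two groups by the sign of the second eigenvector."""
    v2 = second_eigenvector(affinity_matrix(points, sigma))
    return [1 if a > 0 else 0 for a in v2]


if __name__ == "__main__":
    # Two well-separated 1-D clusters (toy data).
    print(bipartition([0.0, 0.1, 0.2, 5.0, 5.1, 5.2], sigma=1.0))
```

When the affinity matrix is close to block diagonal, as in this toy case, the second eigenvector is nearly constant within each block and changes sign between blocks, so a threshold at zero recovers the groups. Which group receives label 1 depends on the sign of the eigenvector and carries no meaning.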
10/8/2004
Matthias Seeger. Gaussian Processes for Machine
Learning. International Journal of Neural Systems
14(2), 2004, 69--106.
http://www.cs.berkeley.edu/~mseeger/papers/bayesgp-tut.pdf
ABSTRACT
Gaussian process models are routinely used to
solve hard machine learning problems. They are
attractive because of their flexible
non-parametric nature and computational
simplicity, and their main drawback of heavy
computational scaling has recently been
alleviated by the introduction of generic sparse
approximations.
The mathematical literature on GPs is large and
often uses deep concepts which are not required
to fully understand their machine learning
applications. In this tutorial paper, we aim to
present characteristics of GPs relevant to
machine learning and to show up precise
connections to other ``kernel machines'' popular
in the community. Our focus is on a simple
presentation, but references to more detailed
sources are provided.
5/28/2004
S. Geman and D. Geman: "Stochastic relaxation,
Gibbs distributions, and the Bayesian restoration
of images." IEEE TPAMI, 6, 721-741, 1984.
Download the paper from here
http://www.dam.brown.edu/people/geman/Papers/stochastic%20relaxation.pdf
This paper is about Bayesian restoration of
images. It assumes that a Markov Random Field
(MRF) model is used to construct a prior for the
images and makes quite general assumptions on the
degradation of an image by a noise model which
includes blurring by an imaging system, sensor
noise and sensor nonlinearities. By recognizing
the equivalence between MRF models and a certain
class of distributions over the images (so called
"Gibbs distributions"), and by assuming Gaussian
noise it is possible to arrive at a fairly simple
expression for the a posteriori distribution of
the original image given the degraded image (i.e.
a Gibbs distribution again).
Now the process of reconstruction is essentially
MCMC sampling of that a posteriori distribution,
starting from the degraded image. But unlike in
the book chapter last time, the goal is not to
estimate a mean but to find a point (image) that
maximizes the a posteriori distribution. The
trick now is that the sampled distribution will
be changed slowly over the course of sampling but
in a way so that the points of maximum
probability stay the same and "attract" all the
probability mass over time. The parameter that is
altered to change the distribution is called
"temperature" (in reference to probability
distributions of this form that occur in
statistical mechanics) and the whole idea is also
known as "simulated annealing" in
optimization.
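The annealing procedure described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the paper's full model: it assumes a binary image with labels in {-1, +1}, an Ising-type MRF prior with coupling `beta`, additive Gaussian noise with standard deviation `sigma` (no blur or sensor nonlinearity), and a logarithmic cooling schedule `c / log(1 + k)`. All names and default values are assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def anneal_restore(y, sigma=0.5, beta=1.0, sweeps=30, c=1.0, seed=0):
    """Approximate MAP restoration by Gibbs sampling with a falling temperature.

    Posterior energy (assumed model):
        E(x) = sum_i (y_i - x_i)^2 / (2 sigma^2) - beta * sum_<i,j> x_i x_j
    """
    rng = random.Random(seed)
    rows, cols = len(y), len(y[0])
    # Start from the degraded image, thresholded to the label set.
    x = [[1 if v > 0 else -1 for v in row] for row in y]
    for k in range(1, sweeps + 1):
        temp = c / math.log(1.0 + k)  # logarithmic cooling schedule
        for i in range(rows):
            for j in range(cols):
                # Sum of the 4-connected neighbours' current labels.
                s = 0
                if i > 0:
                    s += x[i - 1][j]
                if i < rows - 1:
                    s += x[i + 1][j]
                if j > 0:
                    s += x[i][j - 1]
                if j < cols - 1:
                    s += x[i][j + 1]
                # Energy difference E(x_i = -1) - E(x_i = +1).
                delta = 2.0 * y[i][j] / sigma ** 2 + 2.0 * beta * s
                # Sample from the conditional of the tempered posterior.
                p_plus = sigmoid(delta / temp)
                x[i][j] = 1 if rng.random() < p_plus else -1
    return x


if __name__ == "__main__":
    noise = random.Random(1)
    truth = [[-1] * 4 + [1] * 4 for _ in range(8)]
    noisy = [[t + noise.gauss(0.0, 0.5) for t in row] for row in truth]
    for row in anneal_restore(noisy):
        print("".join("#" if v > 0 else "." for v in row))
```

Dividing the energy by the temperature leaves the location of the posterior maximum unchanged while sharpening the distribution around it, which is why the samples concentrate on the MAP image as the temperature falls.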
Quite interesting is the idea of constructing a
more sophisticated image prior by use of an
"adjoint" stochastic process. Such an adjoint
process models features of the image that cannot
be observed directly. An example used in the
paper is an MRF process that generates lines
(edges) separating regions of homogeneous
intensity.
5/14/2004
Introduction to Markov Chain Monte Carlo
The actual paper(s) we will cover (or at least
use for background material) are the
following.
- Chapter 1 ("Introducing Markov Chain Monte
Carlo") from the book "Markov Chain Monte Carlo
in Practice", edited by W. R. Gilks, S.
Richardson & D. J. Spiegelhalter. Chapman
& Hall / CRC, 1996.
I made a number of photo-copies of the relevant
chapter (you can pick it up from the bookcase in
front of my office, 150-L in BCB).
For further insight and background reading, the
following tutorial by Andrieu et al is also
recommended:
- C. Andrieu, N. de Freitas, A. Doucet and M. I.
Jordan. "An Introduction to MCMC for Machine
Learning", in Machine Learning, 2002.
http://www.cs.ubc.ca/~nando/papers/mlintro.pdf
4/30/2004
Volker Tresp. Mixtures of Gaussian Processes.
Advances in Neural Information Processing Systems
13. MIT Press, 2001.
http://wwwbrauer.informatik.tu-muenchen.de/~trespvol/papers/moe_gpr2.ps.gz
ABSTRACT
We introduce the mixture of Gaussian processes
(MGP) model which is useful for applications in
which the optimal bandwidth of a map is input
dependent. The MGP is derived from the mixture of
experts model and can also be used for modeling
general conditional probability densities. We
discuss how Gaussian processes --- in particular
in form of Gaussian process classification, the
support vector machine and the MGP model --- can
be used for quantifying the dependencies in
graphical models.
4/9/2004
David Mackay. Introduction to Gaussian
Processes. Extended version of a tutorial at
ICANN'97
ftp://wol.ra.phy.cam.ac.uk/pub/mackay/gpB.ps.gz
Feedforward neural networks such as multilayer
perceptrons are popular tools for nonlinear
regression and classification problems. From a
Bayesian perspective, a choice of a neural
network model can be viewed as defining a prior
probability distribution over non-linear
functions, and the neural network's learning
process can be interpreted in terms of the
posterior probability distribution over the
unknown function. (Some learning algorithms
search for the function with maximum posterior
probability and other Monte Carlo methods draw
samples from this posterior probability). In the
limit of large but otherwise standard networks,
Neal (1996) has shown that the
prior distribution over non-linear functions
implied by the Bayesian neural network falls in a
class of probability distributions known as
Gaussian processes. The hyperparameters of the
neural network model determine the characteristic
lengthscales of the Gaussian process. Neal's
observation motivates the idea of discarding
parameterized networks and working directly with
Gaussian processes. Computations in which the
parameters of the network are optimized are then
replaced by simple matrix operations using the
covariance matrix of the Gaussian process. In
this chapter I will review work on this idea by
Williams and Rasmussen (1996),
Neal, Williams (1996),
and Gibbs and MacKay (1997), and will assess
whether, for supervised regression and
classification tasks, the feedforward network has
been superseded.
Known typos in this paper:
equation 25 should read:
C_{nn'} = ... + \sigma_{\nu}^2 \delta_{nn'}
instead of:
C_{nn'} = ... + \delta_{nn'}
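To make the role of the corrected noise term concrete, the following sketch performs Gaussian process regression in plain Python with the noise variance added on the diagonal of the covariance matrix. The elided part of equation 25 is not reproduced here: a squared-exponential kernel stands in for it as an assumption, and the names `kernel`, `solve`, `gp_predict`, `noise_var`, and `length` are all hypothetical.

```python
import math


def kernel(a, b, length=1.0):
    """Squared-exponential covariance (an assumed stand-in for the elided terms)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def gp_predict(xs, ts, x_star, noise_var=1e-4, length=1.0):
    """Predictive mean and latent-function variance at x_star."""
    n = len(xs)
    # C_{nn'} = k(x_n, x_n') + noise_var * delta_{nn'}
    C = [[kernel(xs[i], xs[j], length) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [kernel(x, x_star, length) for x in xs]
    alpha = solve(C, ts)       # C^{-1} t
    beta = solve(C, k_star)    # C^{-1} k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = kernel(x_star, x_star, length) - sum(
        k * b for k, b in zip(k_star, beta))
    return mean, var


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ts = [math.sin(x) for x in xs]
    print(gp_predict(xs, ts, 1.5))
```

Scaling the Kronecker delta by the noise variance is what lets the model trade off fitting the targets against smoothness; with a coefficient fixed at 1 the noise level would be hard-wired regardless of the data.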
11/7/2003
A Tutorial on Particle Filters for Online
Nonlinear/Non-Gaussian Bayesian Tracking, M. S.
Arulampalam, S. Maskell, N. Gordon, and T. Clapp,
IEEE Trans. on Signal Processing, vol. 50, no. 2,
pp. 174-188, Feb. 2002
http://moody.engr.uconn.edu/cyberlab/Jianhui_file/Particle_filter/Tutorial_Particle_Filter_Online_Nonlinear_NonGaussian_Bayesian_Tracking_Arulampalam.pdf
6/27/2003
Advances in Large Margin Classifiers, Llew
Mason, Jonathan Baxter, Peter Bartlett, and
Marcus Frean. MIT Press, 1999.
http://www.lsmason.com/papers/LMC-DOOMII.pdf