Lecturer | Giorgio Panin, Ph.D.
Module | IN3150
Type | Lecture
Language | English
Semester | WS 2009/2010
ECTS | 3.0
SWS | 2V
Audience | Elective course for students of Informatics (Master, Diploma)
Time & Place | Tue 10:00 - 12:00, MI 03.07.023
Certificate | Upon passing an oral examination
News
Exam: Thursday, 04.03.10, starting from 14:00, in Seminarraum 03.07.023.
Registration: through TUM-Online.
Open and currently running Theses
Thesis proposals can be found in the Vision section of our student projects webpage.
For information about our research group, see also the ITrackU webpage and the OpenTL library.
Course description
The course provides a structured overview of model-based object tracking: estimating and following in real time the spatial pose (rotation, translation, etc.) of one or more objects, using digital cameras and fast computer vision techniques.
The first part of the course will introduce the general tools for object tracking:
1. Pose and deformation models, and camera projection
2. Methods for pose estimation from geometric feature correspondences
3. Bayesian tracking concepts (state dynamics, measurement likelihood)
4. Bayesian filters for linear and nonlinear models, with single or multi-hypothesis state distributions
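As a flavour of items 3-4 above, the predict/update cycle of a linear Bayesian filter can be sketched with a minimal Kalman filter. This is an illustrative example, not course material: the constant-velocity model and the noise covariances are arbitrary choices for the sketch.

```python
import numpy as np

# Constant-velocity state [position, velocity]; illustrative noise levels.
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition (dt = 1)
H = np.array([[1.0, 0.0]])               # we observe position only
Q = 0.01 * np.eye(2)                     # process noise covariance
R = np.array([[0.25]])                   # measurement noise covariance

def kalman_step(x, P, z):
    """One predict/update cycle of the linear Kalman filter."""
    # Predict: propagate mean and covariance through the dynamics.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: correct with the measurement z via the Kalman gain.
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new

# Track a target moving at unit speed from noisy position readings.
rng = np.random.default_rng(0)
x, P = np.array([0.0, 0.0]), np.eye(2)
for t in range(1, 21):
    z = np.array([t * 1.0 + rng.normal(0, 0.5)])
    x, P = kalman_step(x, P, z)
```

After a few steps the state estimate settles close to the true position and velocity, despite the noisy scalar measurements.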
Afterwards, we will concentrate on the visual part. Among the many available modalities, we will focus in particular on the following:
1. Color-based: matching color statistics between the visible object surface and the underlying image area.
2. Keypoint- and motion-based: detection and tracking of single point features, possibly making use of image motion information (optical flow).
3. Contour-based: matching the object boundary line as it deforms with the object roto-translation (also called Active Contours).
4. Template-based: registration of a fully textured surface (template) to the image gray-level intensities.
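As an illustration of the color-based modality, the color statistics of a reference region and a candidate image region are often compared via histograms. A minimal sketch in Python/NumPy (the bin count and the Bhattacharyya coefficient as similarity measure are common choices in the literature, not prescribed by the course):

```python
import numpy as np

def color_histogram(pixels, bins=8):
    """Normalized joint RGB histogram of an (N, 3) pixel array with values in [0, 255]."""
    idx = np.clip((pixels // (256 // bins)).astype(int), 0, bins - 1)
    hist = np.zeros((bins, bins, bins))
    for r, g, b in idx:
        hist[r, g, b] += 1
    return hist / hist.sum()

def bhattacharyya(p, q):
    """Bhattacharyya coefficient: 1.0 for identical distributions, 0.0 for disjoint ones."""
    return float(np.sum(np.sqrt(p * q)))

# A reddish reference region matches itself perfectly and a bluish region not at all.
red = np.tile([200, 30, 30], (100, 1))
blue = np.tile([30, 30, 200], (100, 1))
print(bhattacharyya(color_histogram(red), color_histogram(red)))   # 1.0
print(bhattacharyya(color_histogram(red), color_histogram(blue)))  # 0.0
```

In a tracker, the similarity score would serve as the measurement likelihood for each pose hypothesis.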
Finally, the last lecture will introduce advanced topics: multiple cameras, multiple simultaneous objects, and data fusion of multiple modalities (colors, edges, ...).
Pre-requisites
The course will also cover the following prerequisites in a self-contained fashion (basic prior knowledge is nevertheless recommended):
- Basic math and algebra (nonlinear functions and derivatives, matrix computation)
- Basic geometry: 3D transformations, projective geometry, camera imaging
- Probability theory and statistics
- Basic image processing (representation, filtering etc.)
- System theory: state-space representation, dynamics, observation
Textbook
The reference text for this course is:
Giorgio Panin, Model-based Visual Tracking, Wiley-Blackwell (to appear around the end of 2010).
Please check also the OpenTL webpage for more information.
Slides
Lecture slides for WS09/10 are currently in preparation.
IMPORTANT: the slides are password-protected and restricted to course participants. Please contact me by email to obtain the password.
Part I - General tools for object tracking
Part II - Visual modalities
- Lecture6.pdf: Color-based object tracking
- Lecture7.pdf: Keypoint tracking and image motion
- Lecture8.pdf: Invariant keypoints: detection, description and matching
- Lecture9.pdf: Contour-based tracking: re-projection of contour points and lines
- Lecture10.pdf: Contour-based tracking: Snakes, Condensation and the CCD algorithm
- Lecture11.pdf: Template-based tracking: Active Appearance Models
- Lecture12.pdf: Introduction to multi-camera/-modal/-target tracking
Bibliographical references
Lecture 1:
- Survey [1] (Introduction)
- Survey paper [12]
Lecture 2:
- General transformations: [3], Chapter 2
- Rigid body motion, exponential representation: [1], Sec. 2.2; [7]
- Camera model: [1], Sec 2.1 (and references), [3], Chapter 6
- Camera calibration: [3], Chapter 7
Lecture 3:
- Pose estimation from corresponding features: [3], Chapter 4
- P3P problem: [1], Sec. 2.3.3
- Similarity estimation (in N-dimensions): [11]
- Linear and Nonlinear LSE: [1], Sec. 2.4 (and references)
- Robust LSE: [10], and [1], Sec.2.5
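The closed-form similarity estimation of [11] (optimal rotation, translation and scale between two corresponding point sets) can be sketched via the SVD. A minimal Python/NumPy version, including the determinant check that avoids reflections; the toy 2D example at the end is illustrative:

```python
import numpy as np

def umeyama(src, dst):
    """Least-squares similarity transform (c, R, t) such that dst ~ c * R @ src_i + t,
    following Umeyama's closed-form SVD solution [11]."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    var_s = (xs ** 2).sum() / len(src)             # variance of the source set
    cov = xd.T @ xs / len(src)                     # cross-covariance matrix
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(src.shape[1])
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:   # enforce a proper rotation
        S[-1, -1] = -1.0
    R = U @ S @ Vt
    c = np.trace(np.diag(D) @ S) / var_s           # optimal scale
    t = mu_d - c * R @ mu_s
    return c, R, t

# Recover a known 2D similarity transform from noiseless correspondences.
rng = np.random.default_rng(1)
src = rng.normal(size=(20, 2))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
dst = 2.0 * src @ R_true.T + np.array([1.0, -2.0])
c, R, t = umeyama(src, dst)
```

With noiseless correspondences the scale, rotation and translation are recovered exactly up to numerical precision; with noisy or outlier-contaminated data, this estimator would be wrapped in a robust scheme such as RANSAC [10].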
Lecture 4:
- General tracking concepts (not only vision): [6], Introduction
- Dynamical models: [5], Chapter 9, [6], Chapters 4 and 6 (until 6.3)
- The three levels of visual measurements: taken from the data fusion literature [8] (data-, feature-, decision-level)
- General Bayesian tracking equations: [1], Sec. 2.6
Lecture 5:
- Kalman Filter (and EKF): tutorial by Greg Welch (in particular, the introductory paper [13]); [1], Sec. 2.6.1 and references; [6], Chapter 5 (KF) and Chapter 10.3 (EKF)
- Particle Filters: [1], Sec. 2.6.2; paper [16]; the Condensation webpage
- Unscented Kalman Filter: paper [14]; tutorial [15]
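The bootstrap particle filter underlying Condensation [16] can be illustrated for a scalar state. This is a sketch only: the random-walk dynamics, Gaussian likelihood and all numeric values are illustrative choices, not taken from the course.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 2000
particles = rng.normal(0.0, 5.0, N)      # initial state hypotheses
true_state, meas_std, proc_std = 3.0, 0.5, 0.2

for _ in range(30):
    # Predict: propagate each particle through the random-walk dynamics.
    particles += rng.normal(0.0, proc_std, N)
    # Update: weight each particle by the Gaussian measurement likelihood.
    z = true_state + rng.normal(0.0, meas_std)
    w = np.exp(-0.5 * ((z - particles) / meas_std) ** 2)
    w /= w.sum()
    # Resample: draw a new particle set proportionally to the weights.
    particles = particles[rng.choice(N, size=N, p=w)]

estimate = particles.mean()              # posterior mean, close to true_state
```

The multi-hypothesis state distribution is carried by the particle set itself, which is what lets such filters survive the multi-modal likelihoods typical of visual clutter.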
Lecture 6:
- Color spaces: (Wikipedia links)
- Color distributions:
- Mean-shift: for more recent applications and videos, see the homepage of D. Comaniciu
- Blob matching: [2], Chapter 5 (morphology) and Chapter 8 (matching contours)
- Color-based particle filter: [17] (feature-level) and [18] (pixel-level)
Lecture 7:
- A general list of keypoint detectors (for this and the next lecture) can be found here.
- KLT algorithm: material can be found at the KLT homepage by Stan Birchfield; in particular, [19] is the reference paper for this method.
- Back-projection: for more information about depth maps, see the Z-buffering webpage.
- Feature detection vs. tracking: see also [1]
- Optical flow: see the Wikipedia page and the original Lucas-Kanade paper [21]
- Harris detector: see the paper [20]
Lecture 8:
- Harris corners: (see Lecture 7)
- Scale-space theory: the book by T. Lindeberg [9], plus some references (for a quicker introduction) at this webpage
- SIFT: webpage (by D. Lowe)
Lecture 9:
- Edge-based tracking methods: [1], Sec. 4.1
- Sampling model contour points with the GPU: paper [29] (general concepts) and paper [26] (our implementation)
- Canny edge detector: Wikipedia page, and the original paper [22]
- Marr-Hildreth edge detector: Wikipedia page, and paper [24]
- Harris (RAPiD) tracker: original paper [22] and robust improvement with RANSAC [25]
- Segment detection: paper [27]; segment-based object tracking: paper [28]
Lecture 10:
- B-splines: [5], Chapter 3, and the Wikipedia page
- Snakes: the original paper [30], another webpage and slides
- Multi-hypothesis likelihood for particle filters: see the original CONDENSATION paper [16]
- The CCD algorithm: original paper [31], a faster implementation [32] and our real-time version [33]
Lecture 11:
- AAM/ASM (including learning): the homepage of Tim Cootes, his paper [34] and the paper [35]
- Computational improvements (forwards-, inverse-compositional approach): paper [34], and the face tracking project webpage at CMU
- From 2D to 3D face tracking: paper [36]
- See also the Wikipedia webpages about PCA and SVD
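The shape and appearance subspaces of AAMs/ASMs [34, 35] are built with PCA, which in practice is computed through the SVD of the centered training matrix. A minimal sketch in Python/NumPy (the toy 3D "shapes" are illustrative data, not an actual shape model):

```python
import numpy as np

def pca(data, n_components):
    """PCA via SVD of the centered data matrix (rows = samples, columns = features),
    as used to build AAM/ASM shape and appearance subspaces."""
    mean = data.mean(axis=0)
    U, S, Vt = np.linalg.svd(data - mean, full_matrices=False)
    components = Vt[:n_components]                  # principal axes (rows)
    variances = S[:n_components] ** 2 / (len(data) - 1)
    return mean, components, variances

# Toy training set: samples lying near a one-dimensional subspace of R^3.
rng = np.random.default_rng(3)
coeffs = rng.normal(size=(50, 1))
basis = np.array([[1.0, 2.0, -1.0]])
data = coeffs @ basis + 0.01 * rng.normal(size=(50, 3))

mean, comps, var = pca(data, 1)
# Project onto the learned subspace and reconstruct: the residual is tiny.
recon = mean + ((data - mean) @ comps.T) @ comps
residual = np.abs(data - recon).max()
```

During tracking, new shapes are projected onto the learned components in the same way, so that only the few subspace coefficients need to be estimated.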
Lecture 12:
- Data fusion books: [37] and [38] (only the first Chapters; in particular, have a look at the JDL data fusion scheme)
References:
- [6] Y. Bar-Shalom, X.-R. Li, T. Kirubarajan, Estimation with Applications to Tracking and Navigation, J. Wiley & Sons, 2001
- [7] R. Murray, Z. Li, S. Sastry, A Mathematical Introduction to Robotic Manipulation, CRC Press, 2002
- [8] D. Hall, J. Llinas, Handbook of multisensor data fusion, CRC Press, 2nd Edition, 2008
- [10] M. A. Fischler, R. C. Bolles, Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography, Comm. of the ACM, Vol. 24, pp. 381-395, 1981
- [11] S. Umeyama, Least-Squares Estimation of Transformation Parameters Between Two Point Patterns, IEEE Trans. Pattern Anal. Mach. Intell. 13(4): 376-380, 1991
- [12] A. Yilmaz, O. Javed, M. Shah, Object Tracking: A Survey, ACM Comput. Surv. 38(4), Article 13, Dec. 2006
- [14] S. Julier, J. Uhlmann, H. F. Durrant-Whyte, A New Method for the Nonlinear Transformation of Means and Covariances in Filters and Estimators, IEEE Transactions on Automatic Control, Vol. 45, No. 3, pp. 477-482, 2000
- [19] J. Shi, C. Tomasi. Good Features to Track, IEEE Conference on Computer Vision and Pattern Recognition, pages 593-600, 1994.
- [22] C. J. Harris, Tracking with rigid models. In A. Blake and A. Yuille, editors, Active Vision. MIT Press, Cambridge, MA, 1992.
- [24] D. Marr, E. Hildreth, Theory of Edge Detection, Proceedings of the Royal Society of London B 207, pp. 187-217, 1980
- [25] M. Armstrong, A. Zisserman, Robust object tracking. In Proc. Asian Conference on Computer Vision, volume I (1995).
- [26] E. Roth, G. Panin, and A. Knoll. Sampling feature points for contour tracking with graphics hardware. In International Workshop on Vision, Modeling and Visualization (VMV), Konstanz, Germany, October 2008.
- [27] D. Lowe, Three-Dimensional Object Recognition from Single Two-Dimensional Images. Artif. Intell. 31(3): 355-395 (1987)
- [28] D. Koller, K. Daniilidis, H.-H. Nagel, Model-Based Object Tracking in Monocular Image Sequences of Road Traffic Scenes, International Journal of Computer Vision 10(3): 257-281, 1993
- [30] M. Kass, A. Witkin, D. Terzopoulos, Snakes: Active Contour Models, International Journal of Computer Vision 1(4): 321-331, 1987 (Marr Prize Special Issue)
- [31] R. Hanek, M. Beetz, The Contracting Curve Density Algorithm: Fitting Parametric Curve Models to Images Using Local Self-Adapting Separation Criteria, Int. J. Comput. Vision 59(3): 233-258, Sep. 2004
- [32] R. Hanek, T. Schmitt, S. Buck, M. Beetz, Toward RoboCup Without Color Labeling, AI Mag. 24(2): 47-50, Jun. 2003
- [34] T. Cootes, G. Edwards, C. Taylor, Active Appearance Models, in Proc. European Conference on Computer Vision 1998 (H. Burkhardt & B. Neumann, Eds.), Vol. 2, pp. 484-498, Springer, 1998