Computer Vision
Semester: | SS 2019 |
Type: | Lecture |
Lecturer: | |
Credits: | V3 + Ü1 (6 ECTS credits) |
Find a list of current courses on the Teaching page.
Type | Date | Room |
---|---|---|
Lecture / Exercise | Monday, 10:30 - 12:00 | TEMP2 |
Lecture / Exercise | Tuesday, 14:30 - 16:00 | H03 |
Lecture Description
Cameras and images form an ever-growing part of our daily lives. Billions of images and massive amounts of video data are becoming available on the Internet, and large search engines are being built to make sense of this data. More and more commercial applications are emerging, e.g. in surveillance and security, on consumer devices, for video special effects, in mobile robotics and automotive contexts, and for medical image processing. All of these applications build on visual capabilities. For us humans, those capabilities come naturally. But how do we actually accomplish them? And how can we teach a machine to perform similar tasks for us?
The goal of Computer Vision is to develop methods that enable a machine to "understand" or analyze images and videos. This lecture will teach the fundamental Computer Vision techniques that underlie such capabilities. In addition, it will show current research developments and how they are applied to solve real-world tasks. The lecture is accompanied by programming exercises that will allow you to gain hands-on experience with the algorithms introduced in the lecture (there will be one exercise sheet roughly every two weeks).
Literature
Over the last decades, Computer Vision has evolved into a rapidly growing field, with research going in so many directions that no single book can cover them all. We will mainly make use of the following books:
- D. Forsyth, J. Ponce, Computer Vision - A Modern Approach, Prentice Hall, 2002
- R. Szeliski, Computer Vision - Algorithms and Applications, Springer, 2010
- R. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, 2nd Edition, Cambridge University Press, 2004
- I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016
However, a good part of the material presented in this class is the result of very recent research, so it hasn't found its way into textbooks yet. Wherever research papers are necessary for a deeper understanding, we will make them available on this web page or in the Moodle course room.
Date | Title | Content | Material |
---|---|---|---|
| | -- | no class (RWTH DIES) | |
| | -- | no class (RWTH DIES) | |
| | -- | no class | |
| | Introduction | Why vision? Applications, Challenges, Image Formation | |
| | Image Processing I | Linear Filters, Gaussian Smoothing, Multi-scale Representations | |
| | -- | no class | |
| | -- | no class (Easter Monday) | |
| | Image Processing II | Image Derivatives, Edge detection, Canny | |
| | Structure Extraction | Line Fitting, Hough Transform, Gen. Hough Transform | |
| | Segmentation I | Segmentation as Clustering, k-means, EM, Mean-Shift Segmentation | |
| | Exercise 1 | Filtering, Derivatives, Edges, Hough Transform | |
| | Segmentation II | Segmentation as Energy Minimization, Markov Random Fields, Graph Cuts | |
| | Categorization | Sliding Window-based Object Detection, HOG, SVMs, Viola-Jones detector, AdaBoost | |
| | Local Features I | Interest points, Harris Detector, Hessian Detector, Scale Invariance, Local Descriptors, SIFT | |
| | Local Features II | Specific Object Recognition with Local Features, Geometric Verification, RANSAC | |
| | Exercise 2 | Mean-Shift and Graph Cut Segmentation, Sliding-Window Detection | |
| | Deep Learning I | Intro Neural Networks, Backpropagation, etc. | |
| | Deep Learning II | CNNs, Current Architectures, VGGNet, GoogLeNet, ResNet | |
| | Exercise 3 | Interest Point Detection & Matching, Homography Estimation | |
| | -- | no class (RWTH DIES Sports Day) | |
| | -- | no class (Excursion Week) | |
| | -- | no class (Excursion Week) | |
| | Deep Learning III | CNNs for Object Detection | |
| | Deep Learning IV | CNNs for Semantic Segmentation, Human Pose Estimation, Matching | |
| | Exercise 4 | CNNs | |
| | 3D Reconstruction I | Multi-View Stereo Basics, Disparity, Triangulation, Epipolar Geometry, Essential Matrix, Correspondence Search | |
| | 3D Reconstruction II | Camera Parameters, Calibration, Triangulation, DLT | |
| | 3D Reconstruction III | Fundamental Matrix, Eight-Point Algorithm, Active Stereo, Outlook to SfM | |
| | Repetition | Repetition | |
| | Exercise 5 | Eight-Point Algorithm, RANSAC, Triangulation | |