Objective
The student will review and analyze state-of-the-art techniques in three-dimensional computer vision. The objective is to understand how these techniques work and what their scope is, in order to apply the most suitable one to real-world problems in computer engineering.
[Español] El estudiante revisará y analizará las técnicas más recientes en la visión computacional tridimensional con el objetivo de comprender el funcionamiento y alcance de las mismas en problemas relacionados con la tecnología de cómputo.
Requirements
- Obligatory:
- Linear algebra
- Calculus
- Object-oriented programming
- Statistics
- Pattern recognition
- Desirable:
- Introduction to computer vision
- Deep learning
Content
- Background review
- Linear algebra
- Basic operations
- Partial derivatives
- 3D Solid object transformations
- Rotation matrices
- Rotation conventions
- Homogeneous transformations
- Quaternions
- Deep learning
- Perceptron
- Gradient descent
- Multi-layer perceptron (MLP)
- Backpropagation
- Multinomial logistic classifier
- Convolutional neural networks
- Introduction
- The 3D world
- Camera calibration
- Extrinsic and intrinsic parameters
- Depth estimation
- Depth image
- Shape from X
- Stereo vision
- Multi-view geometry
- Data-driven depth estimation
- 3D representation
- Point clouds
- Polyhedrons
- Uniform grids
- Hierarchical grids
- Probabilistic grids
- Model integration
- Registration
- Voxelization
- Bayes filter
- Camera localization
- Feature points
- Kalman Filter
- Graph SLAM
- ORB SLAM
- View Planning
- Model-based view planning
- Global information
- Traveling salesman problem (TSP)
- Next-best-view planning
- Greedy approaches
- Information gain
- Data-driven next-best-view
- Model-based view planning
- Model completion
- 3D Autoencoder
- Surface inference
- Semantic segmentation
- U-Net-based segmentation
- Surface segmentation
- Tissue segmentation
- 3D object recognition
- 3D-CNN recognition
- Disease detection
- Next-best-view for recognition
- Image-based rendering
- Neural rendering
Bibliography
- Richard Hartley and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2nd edition, 2004.
- Jeff Heaton. Ian Goodfellow, Yoshua Bengio, and Aaron Courville: Deep Learning, 2018.
- J. Irving Vasquez-Gomez. Planificación de Vistas para Reconstrucción Tridimensional de Objetos con Robots Móviles. Tesis de doctorado, INAOE, 2014.
- Richard Szeliski. Computer Vision: Algorithms and Applications. Springer Science & Business Media, 2010.
- Sebastian Thrun. Probabilistic robotics. Communications of the ACM, 45(3):52–57, 2002.
- Telea, A. C. (2014). Data visualization: principles and practice. CRC Press.
Papers
- Niklaus, S., Mai, L., Yang, J., & Liu, F. (2019). 3D Ken Burns effect from a single image. ACM Transactions on Graphics (TOG), 38(6), 1-15.
- Chen, R., Mahmood, F., Yuille, A., & Durr, N. J. (2018). Rethinking monocular depth estimation with adversarial training. arXiv preprint arXiv:1808.07528.
- Besl, P. J., & McKay, N. D. (1992, April). Method for registration of 3-D shapes. In Sensor fusion IV: control paradigms and data structures (Vol. 1611, pp. 586-606). SPIE.
- Mendoza, M., Vasquez-Gomez, J. I., Taud, H., Sucar, L. E., & Reta, C. (2020). Supervised learning of the next-best-view for 3D object reconstruction. Pattern Recognition Letters. https://doi.org/10.1016/j.patrec.2020.02.024 (I.F. 2.810)