Deep Learning Stereo Matching Algorithm using Siamese Network
Keywords: LIDAR, Stereo Vision, Siamese Deep Neural Network, GPU.
Abstract: Autonomous vehicles have become a very active research topic in recent years. One of the important sensors used in these vehicles is the stereo camera. Stereo vision systems estimate depth from two cameras installed on a robot or vehicle. This method can deliver the 3D position of all objects captured in the scene at a lower cost and higher density than LIDAR. Recently, neural networks have been widely investigated for image processing problems, and deep learning networks have surpassed traditional computer vision methods, especially in object recognition. In this paper, we propose to use a GPU with a new Siamese deep learning method to speed up the stereo matching algorithm. We use a high-end Nvidia DGX workstation to train and test our algorithm and compare the results with ordinary GPUs and CPUs. Based on numerical evaluation, the Nvidia DGX can train a neural network with higher input image resolution approximately 8 times faster than an ordinary GPU and 40 times faster than an 8-core Core i7 CPU. Because it can train at a higher resolution, the network can be trained for more iterations, which results in higher accuracy.
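To make the idea of Siamese stereo matching concrete, the following is a minimal NumPy sketch, not the network proposed in the paper. It assumes a single shared linear layer with ReLU as a stand-in for the trained branches, cosine similarity as the matching score, and rectified images, so a left-image pixel at column x appears at column x - d in the right image. All function names, the patch size, and the random weights are hypothetical and chosen only for illustration. The last function applies the standard pinhole relation depth = focal length × baseline / disparity.

```python
import numpy as np

def embed(patch, weights):
    """Shared branch: the SAME weights embed left and right patches."""
    feat = np.maximum(weights @ patch.ravel(), 0.0)  # linear layer + ReLU
    norm = np.linalg.norm(feat)
    return feat / norm if norm > 0 else feat         # unit length for cosine similarity

def best_disparity(left, right, row, col, weights, half=2, max_disp=16):
    """Return the disparity whose right patch is most similar to the left patch."""
    f_left = embed(left[row - half:row + half + 1, col - half:col + half + 1], weights)
    scores = []
    for d in range(max_disp + 1):
        c = col - d                                  # candidate column in the right image
        if c - half < 0:                             # patch would leave the image
            break
        patch = right[row - half:row + half + 1, c - half:c + half + 1]
        scores.append(float(f_left @ embed(patch, weights)))
    return int(np.argmax(scores))

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo relation: depth = f * B / d (d must be non-zero)."""
    return focal_px * baseline_m / disparity_px

# Toy usage: the right image is the left image shifted by 5 pixels.
rng = np.random.default_rng(0)
left = rng.random((20, 40))
right = np.roll(left, -5, axis=1)
weights = rng.standard_normal((32, 25))              # placeholder for trained weights
d = best_disparity(left, right, row=10, col=20, weights=weights)
z = depth_from_disparity(d, focal_px=700.0, baseline_m=0.5)
```

In a trained system the shared layer would be replaced by learned convolutional branches, and the per-pixel scores would typically be refined by a further aggregation step; both are outside the scope of this sketch.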
Copyright of articles that appear in Elektrika belongs exclusively to Penerbit Universiti Teknologi Malaysia (Penerbit UTM Press). This copyright covers the rights to reproduce the article, including reprints, electronic reproductions, or any other reproductions of similar nature.