Danijela Ristić-Durrant, Marten Franke, Kai Michels, Vlastimir Nikolić, Milan Banić, Miloš Simonović



In this paper, a system consisting of deep learning (DL)-based object detection followed by neural network-based object distance estimation is considered. The accuracy of the distance estimate depends strongly on the size of the bounding box (BB) that the DL-based object detector extracts for the detected object. A method for improving the accuracy of the object BB is proposed, which applies traditional computer vision-based edge segmentation to the BB image region. The proposed method is evaluated on real-world images of railway scenes with obstacles on the rail tracks, captured by thermal and RGB cameras. The evaluation results demonstrate the potential of traditional computer vision methods to complement state-of-the-art DL methods for accurate object detection and distance estimation.
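
The BB refinement idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name refine_bbox, the parameter rel_thresh, and the use of a plain gradient-magnitude threshold in place of a full edge detector are all assumptions made for illustration. The sketch takes a loose detector box and shrinks it to the extent of the strong edges found inside it.

```python
import numpy as np


def refine_bbox(gray, bbox, rel_thresh=0.25):
    """Shrink a loose box (x0, y0, x1, y1) to the strong edges inside it.

    Hypothetical helper: a gradient-magnitude threshold stands in for a
    real edge segmentation step. Coordinates are pixel indices with x1
    and y1 exclusive.
    """
    x0, y0, x1, y1 = bbox
    roi = gray[y0:y1, x0:x1].astype(float)

    # Central-difference gradients along rows (gy) and columns (gx).
    gy, gx = np.gradient(roi)
    mag = np.hypot(gx, gy)

    # A flat region has no edges, so the detector box is kept unchanged.
    if mag.max() == 0:
        return bbox

    # Keep pixels whose gradient is a given fraction of the strongest edge.
    ys, xs = np.nonzero(mag >= rel_thresh * mag.max())

    # Map the edge extent back to full-image coordinates.
    return (
        int(x0 + xs.min()),
        int(y0 + ys.min()),
        int(x0 + xs.max() + 1),
        int(y0 + ys.max() + 1),
    )


if __name__ == "__main__":
    # Synthetic scene: a bright 30x30 object on a dark background.
    image = np.zeros((100, 100), dtype=np.uint8)
    image[30:60, 40:70] = 255

    loose = (20, 10, 90, 80)  # deliberately oversized detector box
    print(refine_bbox(image, loose))
```

On real thermal or RGB frames, the thresholded gradient would pick up background clutter as well, so a sketch like this would need a proper edge detector and some filtering of edges that do not belong to the object.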


Autonomous obstacle detection; deep learning-based object detection; object bounding box-based distance estimation; traditional computer vision edge-based segmentation








Print ISSN: 1820-6417
Online ISSN: 1820-6425