Erhan Gundogdu, Ph.D.

I have been a Computer Vision Scientist at Amazon Berlin since October 2019. I worked as a Postdoctoral Researcher at the Computer Vision Laboratory (CVLab), École Polytechnique Fédérale de Lausanne (EPFL), from March 2018 until October 2019.

I received my B.Sc., M.Sc. and Ph.D. degrees at Middle East Technical University, Turkey. During my M.Sc. and Ph.D. studies, I was advised by Prof. Dr. A. Aydın Alatan.

Research Interests

My research interests include, but are not limited to, video understanding, multi-modal image/video representation learning, (visible and infrared) object tracking, recognition and (weakly-supervised) detection, deep metric learning, and 3D object understanding (3D cloth fitting, 3D shape recognition and extraction).

For my full publication list, please visit my Google Scholar Page. My Ph.D. thesis is about visual object tracking (lib.metu) and my M.Sc. thesis is about local feature detection and description learning for fast image matching (lib.metu).

Honors and Awards
Selected Projects
Cross-Modal Recipe Retrieval
Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning
(CVF) (Code)
A. Salvador, E. Gundogdu, L. Bazzani, M. Donoser, published in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

In this work, we revisit existing approaches for cross-modal recipe retrieval and propose a simplified end-to-end model based on well established and high performing encoders for text and images. We leverage transformers more effectively with a hierarchical design and exploit self-supervised text representation learning where we support different food descriptions to be similar but not the same. As a result, our proposed method achieves state-of-the-art performance in the cross-modal recipe retrieval task on the Recipe1M dataset. We make code and models publicly available.

3D Cloth Draping by Deep Learning
  • GarNet++: Improving Fast and Accurate Static 3D Cloth Draping by Curvature Loss (arXiv Preprint) E. Gundogdu, V. Constantin, S. Parashar, A. Seifoddini, M. Dang, M. Salzmann, P. Fua, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020 (bibtex, webpage)

  • GarNet: A Two-stream Network for Fast and Accurate 3D Cloth Draping (arXiv Preprint) E. Gundogdu, V. Constantin, A. Seifoddini, M. Dang, M. Salzmann, P. Fua, IEEE International Conference on Computer Vision, 2019 (bibtex, webpage)

In this work, we tackle the problem of static 3D cloth draping on virtual human bodies. We introduce a two-stream deep network model that produces a visually plausible draping of a template cloth on virtual 3D bodies by extracting features from both the body and garment shapes. Our network learns to mimic a Physics-Based Simulation (PBS) method while requiring two orders of magnitude less computation time.

Shape Reconstruction
Shape Reconstruction by Learning Differentiable Surface Representations
(arXiv Preprint)
J. Bednarik, S. Parashar, E. Gundogdu, M. Salzmann, P. Fua, accepted to IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

In this paper, we show that we can exploit the inherent differentiability of deep networks to leverage differential surface properties during training so as to prevent patch collapse and strongly reduce patch overlap.

Deep Learning for Correlation Filters
Good Features to Correlate for Visual Tracking
(arXiv Preprint)
E. Gundogdu, A. A. Alatan, IEEE Transactions on Image Processing, 2018
code bibtex

In this work, we formulate the problem of learning deep fully convolutional features for correlation filter based (CFB) visual tracking. To learn the proposed model, a novel and efficient backpropagation algorithm is presented based on the loss function of the network. The proposed learning framework enables the network model to be flexible for a custom design. Moreover, it alleviates the dependency on the network trained for classification. The proposed tracking method is the winner of the VOT2017 Challenge, organized by IEEE ICCV 2017.

Improving Correlation Filters
  • Extending Correlation Filter based Visual Tracking by Tree-Structured Ensemble and Spatial Windowing
    E. Gundogdu, H. Ozkan, A. A. Alatan, IEEE Transactions on Image Processing, 2017
  • Spatial Windowing for Correlation Filter Based Visual Tracking
    E. Gundogdu, A. A. Alatan, IEEE International Conference on Image Processing (ICIP), 2016
  • Ensemble of Adaptive Correlation Filters for Robust Visual Tracking
    E. Gundogdu, H. Ozkan, A. A. Alatan, IEEE International Conference on Advanced Video and Signal-based Surveillance (AVSS), 2016

In the studies above, we improve upon conventional correlation filters by proposing two methods. First, we present an approach to learn a spatial window at each frame during the course of tracking. When the learned window is element-wise multiplied by the object patch/correlation filter, it can suppress the irrelevant regions of the object patch. Second, a tree-structured ensemble of trackers is proposed to combine multiple correlation filter-based trackers while hierarchically keeping the appearance model of the object at the tree nodes. At each frame, only the relevant node trackers are activated and combined into the final tracking decision. The combination of these two approaches also yields better performance.
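
As a rough illustration of the element-wise windowing step described above, here is a minimal Python sketch. It is not the method from the papers: the window there is learned at each frame, whereas this sketch assumes a fixed cosine (Hann) window as a stand-in, and the function names `hann_window` and `apply_window` are hypothetical.

```python
import math


def hann_window(height, width):
    """Fixed 2D cosine (Hann) window.

    Assumption: this fixed window only stands in for the learned
    spatial window; it is not learned from data.
    """
    def hann(n, size):
        return 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (size - 1))

    return [[hann(r, height) * hann(c, width) for c in range(width)]
            for r in range(height)]


def apply_window(patch, window):
    """Element-wise multiply a 2D patch by a window of the same size."""
    return [[p * w for p, w in zip(patch_row, window_row)]
            for patch_row, window_row in zip(patch, window)]


# A toy 5x5 "object patch" with constant intensity.
patch = [[1.0] * 5 for _ in range(5)]
window = hann_window(5, 5)
windowed = apply_window(patch, window)
```

With this stand-in window, the values near the patch border are scaled toward zero while the center is left unchanged, which is the suppression effect the paragraph describes; a learned window could instead adapt its shape to the object in each frame.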

Visual Recognition for Maritime Vessels
  • MARVEL: A Large-Scale Image Dataset for Maritime Vessels (SpringerLink)
    E. Gundogdu, B. Solmaz, V. Yucesoy, A. Koc, Asian Conference on Computer Vision, 2016
  • Generic and Attribute-specific Deep Representations for Maritime Vessels (SpringerOpen)
    B. Solmaz, E. Gundogdu, V. Yucesoy, A. Koc, IPSJ Transactions on Computer Vision and Applications, 2017
  • Fine-Grained Recognition of Maritime Vessels and Land Vehicles by Deep Feature Embedding (IET Digital Lib.)
    B. Solmaz, E. Gundogdu, V. Yucesoy, A. Koc, A. A. Alatan, IEEE, IET Computer Vision, 2018
bibtex / dataset page

In the studies above, we first construct a large-scale maritime vessel dataset by distilling 2M annotated vessel images. Based on a semi-supervised clustering scheme, 26 hyper-classes for vessel types are constructed. Four potential applications are introduced, namely vessel classification, verification, retrieval and recognition, together with baseline results.

Tracking and Recognition in Infrared Spectrum
  • Comparison of Infrared and Visible Imagery for Object Tracking: Toward Trackers with Superior IR Performance
    E. Gundogdu, H. Ozkan, H. S. Demir, H. Ergezer, E. Akagunduz, S. K. Pakin
    IEEE Computer Vision and Pattern Recognition Workshops, 2015
  • Object classification in infrared images using deep representations
    E. Gundogdu, A. Koc, A. A. Alatan
    IEEE International Conference on Image Processing (ICIP), 2016
  • Evaluation of Feature Channels for Correlation-Filter-Based Visual Object Tracking in Infrared Spectrum
    E. Gundogdu, A. Koc, B. Solmaz, R. I. Hammoud, A. A. Alatan
    IEEE Computer Vision and Pattern Recognition Workshops, 2016

Unlike in the visible spectrum, object recognition and tracking have not been extensively studied in the infrared (IR) spectrum. In these studies, we provide the first benchmark comparison, in which the available tracking methods are evaluated on 20 paired IR and visible videos, and we present a novel ensemble-of-trackers method.
