Suman Saha

Senior Data Scientist
Research
(Alumni)

Suman pursued his doctoral studies in the Visual Artificial Intelligence Laboratory at Oxford Brookes University, United Kingdom, focusing on spatiotemporal human action localization using deep learning techniques. He received a Ph.D. in Computer Science and Mathematics from Oxford Brookes University in 2017 and served there as a postdoctoral fellow between December 2017 and July 2018. He then moved to the Computer Vision Lab (CVL) at ETH Zurich, where he held a postdoctoral researcher position until June 2023. Suman's research has centered on unsupervised domain adaptation (UDA) for visual scene understanding (semantic and panoptic segmentation), human behavior understanding, and vision-based biometrics (face anti-spoofing). He also worked on semi-supervised learning for semantic segmentation by leveraging self-supervised depth estimation. His internship at Disney Research Zurich involved designing deep generative models for unsupervised facial expression learning. Additionally, Suman tackled research problems in multi-task learning (MTL), addressing two common challenges in developing multi-task models: incremental learning and task interference.

Projects

LAMP

Completed
Lensless Actinic Metrology for EUV Photomasks
Large-scale Infrastructures

Inter-Detect

In Progress
Quantifying Plant-Pollinator Interactions Using Computer Vision
Climate & Environment

Publications

Unal, O.; Sakaridis, C.; Saha, S.; Van Gool, L. "Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding" Computer Vision – ECCV 2024 196-213 2025
Mansour, E. A.; Unal, O.; Saha, S.; Bejar, B.; Van Gool, L. "Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation" 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 1637-1648 2025
Hassan, M.; Stapf, S.; Rahimi, A.; Rezende, P. M. B.; Haghighi, Y.; Brüggemann, D.; Katircioglu, I.; Zhang, L.; Chen, X.; Saha, S.; et al. "GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control" Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 22404-22415 2025
Ansuinelli, P.; Saha, S.; Flores, L. F. B.; Haro, B. B.; Ekinci, Y.; Mochi, I. "Prior–primed deep neural network based EUV mask inspection" Optics Express 33 6 12572 2025
Saha, S.; Ansuinelli, P.; Barba, L.; Mochi, I.; Haro, B. B. "Ptycho-LDM: A Hybrid Framework for Efficient Phase Retrieval of EUV Photomasks Using Conditional Latent Diffusion Models" Photonics 12 9 900 2025
Saha, S.; Hoyer, L.; Obukhov, A.; Dai, D.; Van Gool, L. "EDAPS: Enhanced Domain-Adaptive Panoptic Segmentation" Proceedings of the IEEE/CVF International Conference on Computer Vision 19234-19245 2023
