IRMA

Interpretable and Robust Machine Learning for Mobility Analysis

Started

November 1, 2021

Status

Completed

Artificial intelligence (AI) is transforming many areas of our lives and ushering in a new era of technological advancement. The transportation sector in particular stands to benefit from progress in AI through the development of intelligent transportation systems, which require an intricate combination of artificial intelligence and mobility analysis. The past few years have seen rapid development in transportation applications using advanced deep neural networks. However, such networks are often difficult to interpret and lack robustness, which slows the deployment of these AI-powered algorithms in practice. To improve their usability in deployment, increasing research effort has been devoted to developing interpretable and robust machine learning methods. Among these, causal inference has recently gained traction because it can provide interpretable and actionable information. However, most methods are developed for image or sequential data and cannot satisfy the specific requirements of mobility data analysis. These requirements have been studied intensively in the Geographic Information Science (GIScience) field but have not yet been well utilized in developing machine learning models. The goal of our project is to bring together the knowledge of GIScience and machine learning, advancing our understanding of how interpretable and robust machine learning methods can be tailored to mobility analysis with the support of causal inference. The outcome of this research will deepen our understanding of how to integrate AI technologies and GIScience for mobility analysis, making AI in the transportation sector more interpretable and reliable. Ultimately, we aim to facilitate the deployment of AI in intelligent transportation systems and build a safer, more efficient, and more sustainable transportation system in the future.

People

Collaborators

SDSC Team:

Simon Dirmeier

Senior Data Scientist

Simon joined the SDSC as a senior data scientist in April 2022. He conducted his doctoral studies on statistical modeling of genetic data at ETH Zürich and obtained his MSc and BSc degrees in computer science at Technical University Munich. Before joining the SDSC, Simon worked as a freelance statistical consultant and as an ML scientist at an AI startup in Lugano, where he built experience in topics ranging from generative modeling and Bayesian optimization to time series forecasting. Simon's research interests and expertise lie broadly in probabilistic machine learning and deep learning, causal inference, generative modeling, and their application in the natural sciences. Simon is an avid open-source software contributor and particularly enthusiastic about probabilistic programming languages, such as Stan.

Fernando Perez-Cruz

Former Deputy Executive Director & Chief Data Scientist

Fernando Perez-Cruz received a PhD in Electrical Engineering from the Technical University of Madrid. He is Titular Professor in the Computer Science Department at ETH Zurich and Head of Machine Learning Research and AI at Spiden. He has been a member of the technical staff at Bell Labs and a Machine Learning Research Scientist at Amazon. Fernando has been a visiting professor at Princeton University under a Marie Curie Fellowship and an associate professor at University Carlos III in Madrid. He held positions at the Gatsby Unit (London), Max Planck Institute for Biological Cybernetics (Tuebingen), and BioWulf Technologies (New York). Fernando served as Chief Data Scientist at the SDSC from 2018 to 2023 and as Deputy Executive Director of the SDSC from 2022 to 2023.

PI | Partners:

ETH Zurich, Mobility Information Engineering Lab:

Prof. Dr. Martin Raubal
Dr. Yanan Xin
Ye Hong

Description

Motivation

Recent research on computational methods for mobility analysis has focused on black-box deep learning methods because of their superior predictive power compared to conventional methods in many mobility-related applications. Despite their predictive performance, deep learning-based algorithms typically have several shortcomings: a) they lack interpretability, b) they generally do not provide uncertainty estimates, c) it is unclear whether they are robust to distributional shifts in the input data, and d) they are typically not privacy-preserving, i.e., the trained neural network weights can reflect confidential information, for instance when trained on personal GPS tracking and location data.

Proposed Approach / Solution

For this project, we develop novel methods for enhancing the interpretability and robustness of machine learning models for mobility analysis. We first develop a benchmarking framework and datasets by simulating synthetic mobility data using both mechanistic models and generative AI models (Figure 1). The mechanistic models are designed to generate controlled interventional data for evaluating the robustness of neural networks (Hong et al., 2023). The generative denoising diffusion model is developed to simulate privacy-preserving mobility data (Dirmeier et al., 2024), since it does not rely on statistics of the data that might reveal information about individuals. Based on the simulated data, we assess how robust deep learning-based predictors are to distributional shifts in the input data, and we present an approach that can discern in-distribution data from out-of-distribution data based on density estimation (Figure 2; Dirmeier et al., 2023). In addition, we present a data-driven approach to inform decision-making in mobility using counterfactual explanations (Figure 3; Wang et al., 2024).
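
To illustrate how a mechanistic model can produce controlled interventional data, the following is a minimal sketch of a simplified exploration and preferential return (EPR) process, one of the mechanistic model families named in the Figure 1 caption. It is not the project's implementation: the function name `simulate_epr`, the default values of `rho` and `gamma`, and the omission of any spatial component (jump lengths, waiting times) are assumptions made for illustration only.

```python
import random
from collections import Counter


def simulate_epr(n_steps, rho=0.6, gamma=0.21, seed=0):
    """Simulate one location sequence with a simplified EPR process.

    At each step the agent explores a new location with probability
    rho * S**(-gamma), where S is the number of distinct locations
    visited so far. Otherwise it returns to a known location chosen
    in proportion to its past visit count (preferential return).
    rho and gamma are illustrative defaults, not values from the project.
    """
    rng = random.Random(seed)
    visits = Counter({0: 1})  # the agent starts at location 0
    trajectory = [0]
    for _ in range(n_steps - 1):
        n_distinct = len(visits)
        p_explore = rho * n_distinct ** (-gamma)
        if rng.random() < p_explore:
            location = n_distinct  # new locations are labelled consecutively
        else:
            locations = list(visits)
            weights = [visits[loc] for loc in locations]
            location = rng.choices(locations, weights=weights, k=1)[0]
        visits[location] += 1
        trajectory.append(location)
    return trajectory


# A hypothetical "intervention": raise the exploration propensity rho and
# compare the number of distinct locations with the baseline simulation.
baseline = simulate_epr(500, rho=0.6)
intervened = simulate_epr(500, rho=0.9)
print(len(set(baseline)), len(set(intervened)))
```

Because the data-generating parameters are known and can be changed one at a time, a predictor trained on the baseline sequences can be evaluated on the intervened sequences to probe its robustness.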

Impact

The outcome of this research will deepen our understanding of how to integrate AI technologies and GIScience for mobility analysis, making AI in the transportation sector more interpretable and reliable. Ultimately, we aim to facilitate the deployment of AI in intelligent transportation systems, which could make the transportation system safer, more efficient, and more sustainable in the future.

***Figure 1:*** Comparison of the privacy-preserving CDPM to mechanistic models (EPR, dEPR, dtEPR, IPT) from the mobility literature. Among other statistics that are commonly used in the mobility literature, we compare the entropy distribution of a set of simulated location trajectories (blue and green) with the entropy distribution of the observed location trajectories (grey). In this evaluation, the CDPM distribution is close to that of the real trajectories, which indicates that the simulated and observed sequences have similar information content.
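
As a rough illustration of the statistic behind Figure 1, the sketch below computes one entropy value per trajectory and collects these values into a distribution. It assumes the entropy is the Shannon entropy of location visit frequencies; the exact entropy definition used in the project is not stated here, and the toy trajectories and function names are hypothetical.

```python
import math
from collections import Counter


def trajectory_entropy(trajectory):
    """Shannon entropy (in bits) of the location visit frequencies."""
    counts = Counter(trajectory)
    total = len(trajectory)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_distribution(trajectories):
    """One entropy value per trajectory, ready to be plotted as a histogram."""
    return [trajectory_entropy(t) for t in trajectories]


# Toy data standing in for observed and simulated location sequences.
observed = [["home", "work", "home", "gym"], ["home", "home", "home", "work"]]
simulated = [["a", "b", "a", "c"], ["a", "a", "b", "b"]]
print(entropy_distribution(observed))
print(entropy_distribution(simulated))
```

If the histogram of simulated entropies overlaps closely with the histogram of observed entropies, the simulator reproduces this aspect of the real data.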

***Figure 2:*** Out-of-distribution detection via epistemic uncertainty quantification. We developed an approach that can discern in-distribution from out-of-distribution data. When our model is applied to in-distribution test data (dark blue), it produces uncertainty estimates similar to those obtained on the training data (grey). When the model is applied to out-of-distribution data (green), the histogram of uncertainty estimates is shifted further to the left. One can then apply conventional distributional tests to detect whether the two distributions differ significantly.
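
The final step described in the Figure 2 caption can be sketched with a two-sample Kolmogorov-Smirnov test, which is one possible choice of "conventional distributional test" and is an assumption here, not necessarily the test used in the project. The uncertainty estimates below are synthetic Gaussian draws standing in for model outputs, and the function names are hypothetical.

```python
import math
import random


def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.05:  # the p-value is numerically 1 for such small statistics
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2 * total))


rng = random.Random(0)
# Synthetic stand-ins for per-sample uncertainty estimates.
train = [rng.gauss(0.0, 1.0) for _ in range(500)]
in_dist = [rng.gauss(0.0, 1.0) for _ in range(500)]
out_dist = [rng.gauss(-1.5, 1.0) for _ in range(500)]  # shifted to the left

for name, sample in [("in-distribution", in_dist), ("out-of-distribution", out_dist)]:
    d = ks_statistic(train, sample)
    print(name, round(d, 3), ks_pvalue(d, len(train), len(sample)))
```

A small p-value for a batch of test data suggests that its uncertainty estimates come from a different distribution than those of the training data, which flags the batch as likely out-of-distribution.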

***Figure 3:*** Counterfactual explanations for retrospective decision making. Counterfactual explanations illuminate how alterations in the input variables affect predicted outcomes, thereby enhancing the transparency of the deep learning model. We investigated the impact of contextual features on traffic speed prediction under varying spatial and temporal conditions.
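
The idea of a counterfactual explanation can be sketched as a search for the smallest change to the inputs that moves the prediction to a desired value. The example below is a toy illustration only: `predict_speed` is a hypothetical linear stand-in for a trained deep learning model, the feature names are invented, and the exhaustive grid search with an unscaled L1 distance is a simplification rather than the method of Wang et al. (2024).

```python
import itertools


def predict_speed(features):
    """Hypothetical stand-in for a trained traffic speed predictor (km/h)."""
    return (60.0
            - 0.8 * features["poi_density"]
            - 5.0 * features["is_peak_hour"]
            + 2.0 * features["n_lanes"])


def find_counterfactual(features, target_speed, candidates):
    """Return the closest candidate input whose prediction reaches the target.

    candidates maps each mutable feature name to the values it may take.
    Closeness is the L1 distance over the mutable features (no rescaling).
    Returns None if no candidate combination reaches the target.
    """
    names = list(candidates)
    best, best_cost = None, float("inf")
    for values in itertools.product(*(candidates[n] for n in names)):
        trial = dict(features, **dict(zip(names, values)))
        if predict_speed(trial) < target_speed:
            continue
        cost = sum(abs(trial[n] - features[n]) for n in names)
        if cost < best_cost:
            best, best_cost = trial, cost
    return best


observed = {"poi_density": 20, "is_peak_hour": 1, "n_lanes": 2}
candidates = {"poi_density": range(0, 21, 5), "n_lanes": [2, 3, 4]}
counterfactual = find_counterfactual(observed, 50.0, candidates)
print(predict_speed(observed), counterfactual)
```

The returned input reads as a statement of the form "the predicted speed would have reached the target if these contextual features had taken these values instead", which is the kind of actionable information counterfactual explanations aim to provide.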

Bibliography

Publications



Wang, R.; Xin, Y.; Zhang, Y.; Perez-Cruz, F.; Raubal, M. "Counterfactual Explanations for Deep Learning-Based Traffic Forecasting". Preprint, 2024.

Dirmeier, S.; Hong, Y.; Perez-Cruz, F. "Synthetic location trajectory generation using categorical diffusion models". Preprint, 2024.

Hong, Y.; Xin, Y.; Dirmeier, S.; Perez-Cruz, F.; Raubal, M. "A causal intervention framework for synthesizing mobility data and evaluating predictive neural networks". Preprint, 2023.

Dirmeier, S.; Hong, Y.; Xin, Y.; Perez-Cruz, F. "Uncertainty quantification and out-of-distribution detection using surjective normalizing flows". Preprint, 2023.

GitHub organisation (including code for all case studies): Interpretable and robust machine learning for mobility analysis
