Private sector

Job title standardization through entity alignment of knowledge graphs

SDSC Team:

Saurabh Bhargava

Principal Data Scientist

Saurabh Bhargava joined the SDSC as a Principal Data Scientist in the Industry Cell at the Zürich office in 2022. Saurabh previously worked in the retail sector and the advertising industry in Germany, where he led and built various data products for customers, using state-of-the-art machine learning methods and industrializing them to add value for those customers. He completed his PhD at ETH Zürich in June 2017, specializing in machine learning applications on audio data. He obtained his Master’s and Bachelor’s degrees from EPFL and the Indian Institute of Technology (IIT), Roorkee, India, in 2011 and 2009 respectively. His interests and expertise are in combining state-of-the-art data science and data engineering tools to build scalable data products.

Lucas Chizzali

Senior Data Scientist

Lucas joined the SDSC's Industry Cell as a Data Scientist in November 2020, having previously worked in data-related roles at the New York State Attorney and at Ericsson. He holds a BSc in Economics from Bocconi University, an MSc in Urban Science and Informatics from New York University, and an MSc in Machine Learning from KTH Royal Institute of Technology. Over the course of his academic and professional career he has worked on a variety of topics, from computer vision tasks for automated driving to financial fraud detection to generating data-driven insights to inform urban policy decisions.


Context

The Adecco Group is one of the largest HR providers and staffing firms in the world. In order to find the best candidate for a given job vacancy, it is necessary to write precise job descriptions and to identify successful candidate profiles. Achieving this relies on curating unified and standardized job information. The focus of this project is on standardizing the terminology of job titles.

Objectives

Job information is scattered across various heterogeneous sources, such as ESCO or O*NET, that differ in their terminology and in the completeness of their data. To make the best use of the information in these sources, they must be unified and standardized.

One approach to achieving this is to represent each data source as a knowledge graph (KG) and to apply a technique called “entity alignment”, which identifies nodes in different KGs that refer to the same entity (i.e. concept). KGs generally contain different types of relationships (edges) and different types of entities (nodes). Crucially though, all of the constructed KGs have one type of entity in common, namely job titles.

Examples of relationships are those identifying alternative titles (e.g. Software Architect vs Application Architect), job categories (e.g. IT professionals) or skill requirements (e.g. Python). A Deep Learning model was trained to identify nodes that refer to the same job title. Its inputs are the connectivity of each node and embeddings of job titles and their descriptions, obtained from fine-tuned Natural Language Processing models. This hybrid approach makes it possible to incorporate both the semantic and the graph-based similarity of job titles.
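To make the idea of combining semantic and graph-based similarity more concrete, here is a minimal sketch in Python. It is not the trained Deep Learning model described above: it swaps in a fixed weighted average of cosine similarity (on text embeddings) and Jaccard overlap (on neighboring nodes). The embedding vectors, the neighbor sets, the `weight` and `threshold` values, and all function names are made-up assumptions for illustration. The sketch also assumes that neighbors such as skills and categories carry identical labels in both graphs, which real sources may not guarantee.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Share of neighbors that two nodes have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def hybrid_score(node_a, node_b, weight=0.5):
    """Weighted mix of semantic and graph-based similarity (weight is an assumption)."""
    semantic = cosine(node_a["embedding"], node_b["embedding"])
    structural = jaccard(node_a["neighbors"], node_b["neighbors"])
    return weight * semantic + (1 - weight) * structural

def align(kg_a, kg_b, threshold=0.6):
    """For each job title in kg_a, pick the best-scoring title in kg_b above the threshold."""
    matches = {}
    for title_a, node_a in kg_a.items():
        best_title, best_score = None, threshold
        for title_b, node_b in kg_b.items():
            score = hybrid_score(node_a, node_b)
            if score >= best_score:
                best_title, best_score = title_b, score
        if best_title is not None:
            matches[title_a] = best_title
    return matches

# Toy job-title nodes from two hypothetical sources; all values are invented.
kg_a = {
    "Software Architect": {
        "embedding": [0.9, 0.1, 0.2],
        "neighbors": {"Python", "IT professionals"},
    },
    "Nurse": {
        "embedding": [0.1, 0.9, 0.3],
        "neighbors": {"Patient care", "Health professionals"},
    },
}
kg_b = {
    "Application Architect": {
        "embedding": [0.8, 0.2, 0.2],
        "neighbors": {"Python", "IT professionals", "Java"},
    },
    "Registered Nurse": {
        "embedding": [0.2, 0.8, 0.3],
        "neighbors": {"Patient care"},
    },
}

matches = align(kg_a, kg_b)
print(matches)
```

In this toy setup the two architect titles are matched because they are close in embedding space and also share most of their neighbors. A learned model would replace the fixed weighting with parameters fitted to known alignments.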

Benefits

Aligned job titles as identified by the developed Deep Learning model are merged and represented by a single, standardized job title. Having a standardized terminology of job titles and their descriptions allows recruiters to describe job postings and assess candidate profiles more efficiently. This ensures faster and more accurate staffing, thereby raising labor productivity.
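The merging step can be illustrated with a small sketch, assuming the alignment model outputs pairs of titles judged to be the same job. The union-find grouping, the function name `merge_aligned_titles`, and the rule of taking the alphabetically first title as the standardized one are all assumptions made for this example; the source does not say how the representative title is chosen. "Solution Architect" is an invented title, added only to show a chain of alignments.

```python
def merge_aligned_titles(pairs):
    """Group aligned titles and map each one to a single representative title."""
    parent = {}

    def find(title):
        # Register unseen titles as their own group, then walk up to the root.
        parent.setdefault(title, title)
        while parent[title] != title:
            parent[title] = parent[parent[title]]  # path halving
            title = parent[title]
        return title

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Assumption: the alphabetically first title represents the group.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    return {title: find(title) for title in list(parent)}

# Hypothetical aligned pairs, e.g. produced by an entity alignment model.
aligned_pairs = [
    ("Software Architect", "Application Architect"),
    ("Application Architect", "Solution Architect"),
    ("Nurse", "Registered Nurse"),
]

standardized = merge_aligned_titles(aligned_pairs)
for title, canonical in sorted(standardized.items()):
    print(f"{title} -> {canonical}")
```

Grouping transitively means that if title A is aligned with B and B with C, all three end up under one standardized title, even when A and C were never compared directly.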

Notes

The SDSC would like to thank the following people at Adecco Group: Pencho Yordanov, Riccardo Menoli, Sarah Mathews, Giovanna Favia, Helmi Boussetta, Marco Totolo.
