MUTIGER

MUTations, Interactions and GEne Regulation

Started: June 1, 2023
Status: In Progress

Abstract

Non-coding mutations constitute more than 95% of all mutations; however, they remain understudied in the context of diseases such as cancer. Several recent studies have documented how non-coding mutations in cancer alter the activity of regulatory elements and, in turn, the expression of cancer-related genes. Nevertheless, a comprehensive study evaluating the effects of non-coding mutations on the 3D structure of enhancer-promoter looping across various types of cancer is currently missing. Moreover, there is no available computational approach to predict and evaluate the effects of large structural variants (e.g., translocations or large genomic duplications and deletions) on enhancer-promoter looping, and consequently on gene expression, when the gene itself is not affected by the rearrangement.

This project's primary goal is to build a computational approach to reliably predict the effect of each non-coding genomic variant in a tumor genome in a cell-type-specific manner via explicitly modeling changes in the activity of regulatory elements and 3D chromatin structure.

Overall, this work will further investigate the role of non-coding variation, including structural rearrangements, in cancer development, with a specific emphasis on variants affecting 3D DNA structure and the activity of regulatory elements. We are confident that applying our method will make it possible to extract a very small number of truly functional non-coding variants that affect the expression of neighboring genes. By applying our analysis to hundreds of available cancer whole-genome sequencing samples, we aim to improve our understanding of cancer drivers and further elucidate oncogenic mechanisms in human cancers.

People

Collaborators

SDSC Team:
Lin Zhang
Till Muser
Ekaterina Krymova

PI | Partners:

ETH Zurich, Computational Cancer Genomics Lab:

  • Prof. Valentina Boeva
  • Aayush Grover


Motivation

Non-coding mutations remain under-explored in cancer research. No comprehensive study has yet examined how these mutations affect the 3D structure of enhancer-promoter looping across different cancer types. While experimental methods are available, they are expensive and constrained by technical limitations, making them impractical for high-throughput analysis. This project aims to develop computational approaches that assess the effects of non-coding variants and large structural variants in a cell-type-specific manner by explicitly modelling the alterations in regulatory element activity and 3D chromatin structure that these variants cause.

Proposed Approach / Solution

SDSC is developing computational methods that predict the cell-type-specific effects of non-coding variants from unmatched open chromatin data and DNA sequence. The objective is to create a user-friendly tool capable of predicting the influence of non-coding variants on regulatory element activity and, through it, on the expression of target genes.
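One common way to frame such a prediction is to score the reference and the mutated sequence with the same model and take the difference. The sketch below illustrates only that framing. It is not the MUTIGER method: `toy_activity_model` is a hypothetical stand-in (it simply measures GC content) for a trained, cell-type-specific model, and all function names are assumptions made for illustration.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (len, 4) one-hot array; unknown bases stay all-zero."""
    encoded = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            encoded[i, j] = 1.0
    return encoded

def toy_activity_model(x):
    """Hypothetical placeholder for a trained model: returns the GC fraction of the window."""
    return float(x[:, 1].sum() + x[:, 2].sum()) / len(x)

def variant_effect(ref_seq, pos, alt_base, model=toy_activity_model):
    """Return model(alt) - model(ref) for a single-base substitution at 0-based position pos."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return model(one_hot(alt_seq)) - model(one_hot(ref_seq))

# Illustrative sequence only: substitute T -> G at position 2.
effect = variant_effect("ATATATAT", 2, "G")
print(f"predicted change in activity: {effect:+.3f}")
```

A positive score means the placeholder model predicts higher activity for the variant sequence than for the reference. With a real model, the same difference would be computed per cell type.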

Figure 1: An enhancer-promoter interaction causing a change in gene expression. Enhancers and promoters both belong to the non-coding region of the genome. Promoters lie very close to the start site of the target gene, whereas enhancers can be up to a few megabase pairs (Mbp) away. The binding of transcription factors (TFs) to the DNA sequence regulates these enhancer-promoter interactions.

Figure 2: Interactions between different regions of the genome are captured using a biological experiment called Hi-C. Accurately modelling these interactions is an important stepping stone for the MUTIGER project. Interactions can be visualized using the Hi-C matrix, here pictured within chromosome 2 for a section around 139 Mbp. The brightness of the spot (labelled “A↔B”) where the diagonals extending from two regions A and B meet signifies their interaction strength. The intensity of the interaction is primarily governed by the distance between A and B and by specific transcription factors binding within non-coding regions in their vicinity.
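To make the Hi-C matrix readout concrete, the sketch below looks up the contact between two regions A and B in a binned matrix and divides it by the average contact at the same genomic distance (an observed-over-expected ratio), so that the dominant distance effect is factored out. The matrix is synthetic, and the coordinates, bin size, and function names are assumptions chosen for illustration. They do not come from the project's data.

```python
import numpy as np

def bin_index(position_bp, region_start_bp, resolution_bp):
    """Map a genomic coordinate to its bin (row/column) in the contact matrix."""
    return (position_bp - region_start_bp) // resolution_bp

def expected_by_distance(matrix):
    """Mean contact at each bin separation d, i.e. the mean of the d-th diagonal."""
    n = matrix.shape[0]
    return np.array([np.diagonal(matrix, offset=d).mean() for d in range(n)])

def observed_over_expected(matrix, i, j):
    """Contact between bins i and j relative to the average at the same separation."""
    expected = expected_by_distance(matrix)[abs(i - j)]
    return matrix[i, j] / expected if expected > 0 else float("nan")

# Synthetic 6-bin matrix: contacts decay with distance as 1 / (1 + d).
n_bins = 6
idx = np.arange(n_bins)
hic = 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))

# Hypothetical regions A and B at 500 kbp resolution (illustrative coordinates).
region_start, resolution = 138_000_000, 500_000
a = bin_index(138_700_000, region_start, resolution)
b = bin_index(140_200_000, region_start, resolution)

# Add an artificial loop between A and B on top of the distance decay.
hic[a, b] = hic[b, a] = 1.0

print(f"A<->B observed/expected: {observed_over_expected(hic, a, b):.2f}")
```

A ratio above 1 indicates that A and B contact each other more often than their separation alone would suggest, which is the kind of signal the bright off-diagonal spot in Figure 2 represents.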

Impact

This project aims to delve deeper into the role of non-coding mutations and structural variants in cancer development. The methodology devised in this project offers a valuable tool for identifying truly functional non-coding mutations and structural variants that influence the expression of cancer-related genes. This advancement can significantly deepen our understanding of oncogenic mechanisms in human cancers.

Bibliography

  1. Tan, J., Shenker-Tauris, N., Rodriguez-Hernaez, J. et al. Cell-type-specific prediction of 3D chromatin organization enables high-throughput in silico genetic screening. Nat Biotechnol 41, 1140–1150 (2023). https://doi.org/10.1038/s41587-022-01612-8  
  2. Fudenberg, G., Kelley, D.R. & Pollard, K.S. Predicting 3D genome folding from DNA sequence with Akita. Nat Methods 17, 1111–1117 (2020). https://doi.org/10.1038/s41592-020-0958-x  
  3. Kelley, D.R. Cross-species regulatory sequence activity prediction. PLOS Comput Biol 16(7), e1008050 (2020). https://doi.org/10.1371/journal.pcbi.1008050
  4. Avsec, Ž., Agarwal, V., Visentin, D. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat Methods 18, 1196–1203 (2021). https://doi.org/10.1038/s41592-021-01252-x
