In the last five years, the field of machine learning has been revolutionized by the success of deep learning. Thanks to the increasing availability of data and computing power, we are now able to train very deep and complex models and hence solve challenging tasks better than ever before.

Nevertheless, deep learning succeeds when the network architecture exploits properties of the data, allowing efficient and principled learning. For example, convolutional neural networks (CNNs) revolutionized computer vision because their architecture was specifically designed for images. CNNs have many advantages, but their main strength is that they are built for translation equivariance: if the input is translated, so is the output. This property allows spatial weight sharing, which requires few parameters compared to a full input parametrization and makes CNNs computationally efficient.

The translation equivariance property is extremely valuable in segmentation tasks, where a translated input should always result in the same, but translated, output. Because of these properties, it is not surprising that CNNs are at the core of state-of-the-art research in image classification, segmentation, restoration, and generation.
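The translation equivariance described above can be checked numerically. The sketch below is a toy 1D illustration (not an excerpt from any real CNN library): a convolution with circular padding commutes with a shift of the input.

```python
import numpy as np

def circular_conv(x, w):
    """Correlate signal x with filter w using circular (wrap-around) padding."""
    n, k = len(x), len(w)
    pad = k // 2
    xp = np.concatenate([x[-pad:], x, x[:pad]])  # wrap the borders
    return np.array([np.dot(xp[i:i + k], w) for i in range(n)])

rng = np.random.default_rng(0)
x = rng.standard_normal(16)   # toy 1D "image"
w = rng.standard_normal(3)    # toy filter

shift = 5
# Translating the input and then convolving ...
a = circular_conv(np.roll(x, shift), w)
# ... gives the same result as convolving and then translating the output.
b = np.roll(circular_conv(x, w), shift)
assert np.allclose(a, b)
```

The same check fails for operations that break equivariance, e.g. a convolution followed by position-dependent weights, which is one way to see why weight sharing and equivariance go together.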

Unfortunately, not all datasets are images, and we need architectures that adapt to other types of data, encoding both domain-specific knowledge and data-specific characteristics. For instance, at the Swiss Data Science Center (SDSC), we deal with spherical data, i.e., curved images on a sphere, without clear borders and with arbitrary orientation. This kind of data is very common a) in cosmology, where most observations are made from the Earth and are hence spherical (see Figure 1), b) in climate science, as the Earth is a sphere, and c) in virtual reality, where one often needs to work with user-centered 360-degree images.


Figure 1 : Example maps on the sphere: (left) the CMB (cosmic microwave background) temperature map (K) from Planck, (middle) a map of galaxy number counts (number of galaxies per arcmin²), and (right) a simulated weak lensing convergence map (dimensionless).


To succeed with spherical data, new architectures are needed. As with traditional CNNs, we would ideally leverage some equivariance property. However, instead of translation, the spherical domain naturally suggests rotation equivariance, with respect to the rotation group SO(3): a rotation of the input implies the same rotation of the output.

So far, mainly two approaches have been followed. In the first, the data is transformed using a planar projection and a modified CNN is applied [1]. This strategy has the advantage of building on a traditional CNN and hence being efficient. Nevertheless, the distortions induced by the projection make the translation equivariance of CNNs differ from the desired rotation equivariance. In simple words, we are destroying the spherical structure of the data. The second approach [2] leverages the convolution on the rotation group SO(3). This convolution generalizes the planar convolution to the sphere and, similarly, can be performed as a multiplication in the spectral/Fourier domain. In this case, rotation equivariance is naturally obtained. However, the computational cost of the spectral projections (Fourier transforms) is significant, limiting the size and depth of these architectures. An attempt to reduce the computational cost of this architecture was made in [3].
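The idea that convolution becomes a pointwise multiplication in the spectral domain, and that this yields equivariance for free, can be illustrated on the circle, the 1D counterpart of the sphere. This is only an analogy, not the actual SO(3) machinery of [2]; all names below are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(32)   # signal on a ring (1D analogue of a spherical map)
h = rng.standard_normal(32)   # filter, also defined on the ring

def spectral_conv(x, h):
    """Circular convolution computed as a pointwise product in Fourier space."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

def direct_conv(x, h):
    """Same circular convolution, computed directly in the signal domain."""
    n = len(x)
    return np.array([sum(x[(i - j) % n] * h[j] for j in range(n)) for i in range(n)])

# Convolution theorem: both implementations agree.
assert np.allclose(spectral_conv(x, h), direct_conv(x, h))

# Equivariance: rotating the input rotates the output by the same amount.
shift = 7
a = spectral_conv(np.roll(x, shift), h)
b = np.roll(spectral_conv(x, h), shift)
assert np.allclose(a, b)
```

On the sphere the Fourier transform is replaced by spherical harmonic transforms, whose cost grows much faster than the FFT's, which is precisely the bottleneck mentioned above.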

In joint work with Michaël Defferrard from LTS2 at EPFL and Tomasz Kacprzak from the Cosmology Research Group at ETHZ, we developed a new architecture that is almost rotation equivariant while remaining computationally inexpensive.

Our idea is to perform the convolution on a graph that approximates the sphere. Like the traditional convolution, the graph convolution can be computed as a weighted average of neighboring pixels. Thanks to this property, we avoid computing Fourier transforms and obtain an operation whose complexity is linear in the data size.
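One common way to realize such a graph convolution, and the one DeepSphere builds on, is to filter the signal with a polynomial of the graph Laplacian: each matrix-vector product only mixes neighboring vertices, so the cost stays linear in the number of edges. The sketch below is a minimal illustration on a toy ring graph, not the actual DeepSphere implementation; the function and variable names are ours, and for a proper Chebyshev basis the Laplacian should additionally be rescaled to [-1, 1], which we omit here.

```python
import numpy as np
import scipy.sparse as sp

def poly_graph_conv(L, x, theta):
    """Filter signal x with a polynomial of the graph Laplacian L.

    Uses the Chebyshev-style recurrence T_0 = x, T_1 = L x,
    T_k = 2 L T_{k-1} - T_{k-2}; only sparse matrix-vector products
    are needed, so the cost is linear in the number of edges.
    """
    t_prev, t_curr = x, L @ x
    y = theta[0] * t_prev + theta[1] * t_curr
    for k in range(2, len(theta)):
        t_prev, t_curr = t_curr, 2 * (L @ t_curr) - t_prev
        y += theta[k] * t_curr
    return y

# Toy example: a ring graph with 8 vertices (stand-in for a sphere graph).
n = 8
rows = np.arange(n)
adj = sp.coo_matrix((np.ones(n), (rows, (rows + 1) % n)), shape=(n, n))
adj = adj + adj.T                                       # symmetric adjacency
lap = sp.diags(np.asarray(adj.sum(1)).ravel()) - adj    # combinatorial Laplacian

x = np.random.default_rng(2).standard_normal(n)
theta = np.array([0.5, 0.3, 0.2])   # filter coefficients (learned in practice)
y = poly_graph_conv(lap, x, theta)
assert y.shape == x.shape
```

Because only the Laplacian changes with the sampling, the same filtering code works for a full sphere, a partial sky coverage, or any other pixelization.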

Figure 2 : DeepSphere's overall architecture, showing two convolutional layers acting as feature extractors followed by a fully connected layer with softmax acting as the classifier.


DeepSphere, presented in Figure 2, is a neural network architecture designed to help cosmologists process the spherical data they deal with daily. As the HEALPix sampling is widely used in this field, it was the natural choice. HEALPix is a hierarchical sampling of the sphere based on a rhombic dodecahedron, i.e., a polyhedron made of 12 congruent rhombic faces. Hence, it has the property that each point corresponds to a surface of the same area, and it comes with a natural pooling operation (see Figure 3). Nevertheless, the approach can be adapted to almost any sampling, as graphs are very flexible objects. In fact, we discovered during our research that graphs had already been used similarly in [4]. One significant advantage of using graphs to approximate the sphere is that it becomes trivial to use only a sub-region of the sphere. This is very useful, as most cosmological data do not span the full sphere (see Figure 1).

Figure 3 : Two levels of coarsening and pooling: groups of 4 cells are merged into one, and the data on them is summarized into a single value. The coarsest cell covers 1/12 of the sphere.
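The pooling illustrated in Figure 3 is particularly cheap with HEALPix: in the NESTED pixel ordering, the 4 children of each coarse cell are stored contiguously, so one level of pooling is just a reshape followed by a reduction. The sketch below assumes this NESTED ordering; the function name is ours.

```python
import numpy as np

def healpix_pool(x, agg=np.mean):
    """Coarsen a HEALPix map given in NESTED ordering by one level.

    In NESTED ordering the 4 children of each coarse cell are
    contiguous, so pooling reduces to a reshape plus a reduction.
    """
    return agg(x.reshape(-1, 4), axis=1)

nside = 4
npix = 12 * nside**2                  # HEALPix has 12 * nside^2 pixels
x = np.arange(npix, dtype=float)      # toy map in NESTED ordering
y = healpix_pool(x)
assert len(y) == 12 * (nside // 2)**2   # one level coarser
```

Applying the reshape repeatedly walks down the hierarchy, until the 12 base faces of the rhombic dodecahedron remain.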


DeepSphere was tested on a classification task against traditional classifiers used in cosmology. To assess its performance, a specific dataset composed of spherical convergence maps was created. Convergence maps represent the dimensionless distribution of over- and under-densities of mass in the universe, projected on the sky plane. The samples composing the two classes were obtained by simulations using identical initial conditions but different cosmological parameters (see Figure 4). We then tested the classification of the data when corrupted by different noise levels. As shown in Figure 5, DeepSphere achieves a significant increase in classification performance.

Figure 4 : Example maps from the two classes to be discriminated.

Figure 5 : Performance of DeepSphere versus classical classifiers used in astronomy (the higher, the better).

We try to make our research pipeline as open as possible. While we are not able to share our dataset, the code used in all experiments is available as a Python package on GitHub. You can also directly play with this notebook.

 Nathanaël Perraudin, Sr. Data Scientist, Swiss Data Science Center


Further reading

To understand the convolution on the rotation group SO(3), you can check section 2 of [3].
You can also check the papers in the reference list.

References (non-exhaustive)

[1] Boomsma, W., & Frellsen, J. (2017). Spherical convolutions and their application in molecular modelling. In Advances in Neural Information Processing Systems (pp. 3433-3443).

[2] Cohen, T. S., Geiger, M., Köhler, J., & Welling, M. (2018). Spherical CNNs. arXiv preprint arXiv:1801.10130.

[3] Kondor, R., Lin, Z., & Trivedi, S. (2018). Clebsch-Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network. arXiv preprint arXiv:1806.09231.

[4] Khasanova, R., & Frossard, P. (2017). Graph-based classification of omnidirectional images. In IEEE International Conference on Computer Vision Workshops (ICCVW) (pp. 860-869).

[5] Perraudin, N., Defferrard, M., Kacprzak, T., & Sgier, R. (2018). DeepSphere: Efficient spherical Convolutional Neural Network with HEALPix sampling for cosmological applications. arXiv preprint arXiv:1810.12186.
