Towards understanding glasses with graph neural networks

Under a microscope, a pane of window glass doesn’t look like a collection of orderly molecules, as a crystal would, but rather a jumble with no discernible structure. Glass is made by starting with a glowing mixture of high-temperature melted sand and minerals. Once cooled, its viscosity (a measure of the friction in the fluid) increases a trillion-fold, and it becomes a solid, resisting tension from stretching or pulling. Yet the molecules in the glass remain in a seemingly disordered state, much like the original molten liquid – almost as though the disordered liquid state had been flash-frozen in place. The glass transition, then, appears at first to be a dramatic arrest in the movement of the glass molecules. Whether this process corresponds to a structural phase transition (as in water freezing, or the superconducting transition) is a major open question in the field. Understanding the nature of the dynamics of glass is fundamental to understanding how atomic-scale properties define the visible features of many solid materials. In the words of the late Nobel laureate Philip W. Anderson, whose pioneering work shaped the field of solid-state physics:

The deepest and most interesting unsolved problem in solid state theory is probably the theory of the nature of glass and the glass transition.

Philip W. Anderson

Figure 1: A liquid, when cooled too quickly past its crystallisation point, turns into a supercooled liquid which, upon further cooling, turns into a disordered, amorphous glass. If cooled slowly enough, it may instead transform into an ordered crystal.

The practical implications of modelling glass

The glass transition is a ubiquitous phenomenon which manifests in more than just window (silica) glass. For instance, when ironing, polymers in a fabric are heated, become mobile, and are then oriented by the weight of the iron. More broadly, a similar and related transition, the jamming transition, can be found in colloidal suspensions (such as ice cream), granular materials (such as a static pile of sand), and biological systems (e.g., models of cell migration during embryonic development), as well as social behaviours (for instance, traffic jams). These systems all operate under local constraints, where the position of some elements inhibits the motion of others (termed frustration). Their dynamics are complex and cooperative, taking the form of large-scale, collective rearrangements which propagate through space in a heterogeneous manner. Glasses are considered archetypal of these kinds of complex systems, so better understanding them will have implications across many research areas. This understanding might yield practical benefits – for example, materials with a stable glass structure, rather than a crystalline one, dissolve more quickly, which could lead to new drug delivery methods. Understanding the glass transition may also enable other applications of disordered materials, in fields as diverse as biorenewable polymers and food processing. The study of glasses has already led to insights in apparently very different domains, such as constraint satisfaction problems in computer science and, more recently, the training dynamics of under-parameterized neural networks.

A deeper understanding of glasses may lead to practical advances in the future, but their mysterious properties also raise many fundamental research questions. Though humans have been making silica glasses for at least four thousand years, they remain enigmatic to scientists: there are many unknowns about the underlying physical correlates of, for example, the trillion-fold increase in viscosity that happens over the cooling process. Our interest in this field was also motivated by the fact that glasses are an excellent testbed for applying modern machine learning methods to physical problems: they’re easy to simulate, and easy to input to particle-based machine learning models. Crucially, we can then examine these models to understand what they’ve learned about the system, gaining deeper qualitative insights about the nature of glass and the structural quantities which underpin its mysterious dynamical qualities. Our new work, published in Nature Physics, could help us understand the structural changes that may occur near the glass transition. More practically, this research could lead to insights about the mechanical constraints of glasses (e.g., where a glass will break).

Leveraging graph neural networks to model glassy dynamics

Glasses can be modelled as particles interacting via a short-range repulsive potential, which essentially prevents particles from getting too close to each other. This potential is relational (only pairs of particles interact) and local (only nearby particles interact with each other), which suggests that a model respecting this local and relational structure should be effective. In other words, given that the system is underpinned by a graph-like structure, we reasoned it would be best modelled by a graph-structured network, and set out to apply graph neural networks to predict physical aspects of a glass.
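To illustrate what "short-range repulsive" means here, a generic inverse-power pair potential can stand in for the simulated interaction (the functional form, `sigma`, `epsilon` and the exponent below are our illustrative assumptions, not the potential actually used in the simulations):

```python
import numpy as np

def repulsive_potential(r, sigma=1.0, epsilon=1.0, n=12):
    """Illustrative inverse-power repulsion: very large when two
    particles overlap (r < sigma), negligible beyond ~2 diameters."""
    return epsilon * (sigma / np.asarray(r, dtype=float)) ** n

# The interaction is effectively local: strong for overlapping
# particles, essentially zero by two particle diameters.
print(repulsive_potential(0.9))  # strong repulsion when overlapping
print(repulsive_potential(2.0))  # ~0.00024, effectively zero
```

The rapid decay of this kind of potential is what justifies connecting only nearby particles in the graph described below.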

We first created an input graph in which nodes represent particles and edges represent interactions between particles, labelled with their relative distances. A particle was connected to its neighbouring particles within a certain radius (in this case, 2 particle diameters). We then trained a neural network, described below, to predict a single real number for each node of the graph. This prediction was regressed towards the mobilities of particles obtained from computer simulations of glasses. Mobility is a measure of how much a particle typically moves (more technically, it corresponds to the average distance travelled when averaging over initial velocities).
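The graph construction can be sketched in a few lines (function and variable names are ours; a real implementation would also handle periodic boundary conditions, typically with a cell list rather than a dense distance matrix):

```python
import numpy as np

def build_graph(positions, cutoff=2.0):
    """Connect every pair of particles closer than `cutoff` (here,
    two particle diameters). Returns directed edge indices and the
    relative distances used as edge labels."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Exclude self-edges (distance 0) and pairs beyond the cutoff.
    senders, receivers = np.where((dist < cutoff) & (dist > 0.0))
    edge_labels = dist[senders, receivers]
    return senders, receivers, edge_labels

# Toy example: three particles on a line, spaced 1.5 diameters apart.
pos = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.0, 0.0]])
senders, receivers, labels = build_graph(pos)
# Particles 0-1 and 1-2 are connected (distance 1.5 < 2);
# 0-2 (distance 3.0) is not.
```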

Figure 2: Model architecture. a) From the 3-d inputs, nodes at distance less than 2 are connected to form a graph. After processing, the network predicts mobilities (represented by different colours) for each particle. b) The graph network’s core first updates edges based on their previous embedding and those of their adjacent nodes, and then nodes based on their previous embeddings and those of incoming edges. c) The graph network consists of an encoder, several applications of the core, followed by a decoder. Each application of the core increases the shell of particles contributing to a given particle’s prediction, here shown in colour for the central particle (dark blue).

Our network architecture was a typical graph network, consisting of several neural networks. We first embedded the node and edge labels in a high-dimensional vector space using two encoder networks (standard multi-layer perceptrons). Next, we iteratively updated the embedded node and edge labels using the two update networks visualised in Fig. 2b. First, each edge was updated based on its previous embedding and the embeddings of the two nodes it connects. After all edges were updated in parallel using the same network, each node was updated based on the sum of its neighbouring edge embeddings and its previous embedding, using a second network. We repeated this procedure several times (typically 7), allowing local information to propagate throughout the graph, as shown in Fig. 2c. Finally, we extracted the mobility of each particle from the final embedding of the corresponding node using a decoder network. The resulting network has all the required properties: it is inherently relational, it is invariant under permutation of the nodes and edges of the graph, and it updates embeddings through a composition of local operations. The network parameters were trained via stochastic gradient descent.
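One application of the core update can be sketched in plain numpy. Here a single random linear-plus-ReLU map stands in for each trained MLP, and the embedding size and all names are illustrative, not those of the actual model:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # illustrative embedding dimension

# Stand-ins for the learned edge and node update networks.
W_edge = rng.normal(size=(3 * D, D)) * 0.1
W_node = rng.normal(size=(2 * D, D)) * 0.1

def core_step(nodes, edges, senders, receivers):
    """One core application: update every edge from its embedding and
    its two endpoint nodes, then every node from its embedding and
    the sum of its incoming edge embeddings."""
    edge_in = np.concatenate([edges, nodes[senders], nodes[receivers]], axis=-1)
    new_edges = np.maximum(edge_in @ W_edge, 0.0)  # shared edge network (ReLU)

    # Sum the updated incoming edge embeddings for each receiving node.
    agg = np.zeros_like(nodes)
    np.add.at(agg, receivers, new_edges)
    node_in = np.concatenate([nodes, agg], axis=-1)
    new_nodes = np.maximum(node_in @ W_node, 0.0)  # shared node network (ReLU)
    return new_nodes, new_edges

# Toy graph: 3 nodes, directed edges 0→1 and 1→2.
nodes, edges = np.ones((3, D)), np.ones((2, D))
senders, receivers = np.array([0, 1]), np.array([1, 2])
for _ in range(7):  # repeated applications propagate information outwards
    nodes, edges = core_step(nodes, edges, senders, receivers)
```

Because every edge and node is updated by the same shared network, the computation is automatically invariant to how the particles are numbered.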

To study the full dynamical evolution of glasses, we constructed several datasets corresponding to predictions of mobilities on different time horizons and for different temperatures. We note that each particle will have collided several thousands of times over those timescales. Thus, the network must find a way to coarsely represent the long-term dynamics of the system.

Connecting the network’s prediction with physics 

After applying graph networks to the three-dimensional glasses that we simulated, we found that they strongly outperformed existing models, ranging from standard physics-inspired baselines to state-of-the-art machine learning models. Comparing the predicted mobilities (colour gradients, Figure 3) with the ground truth simulation (dots, Figure 3), we see that the agreement is extremely good at short times and remains strong up to the relaxation time of the glass. Looking at a glass over the timescale of its relaxation time – for actual glass, this would be thousands of years – is like looking at a liquid over about a picosecond (10⁻¹² seconds): the relaxation time is, loosely, when particles have collided enough to start losing information about their initial positions. In numbers, the correlation between our prediction and the simulation's ground truth is 96% for very short timescales, and remains high at 64% at the relaxation time of the glass (an improvement of 40% compared to the previous state of the art).

3D prediction over short timescales

Figure 3a: GNN-predicted mobilities (coloured from least mobile in blue to most mobile in red) compared to the position of the most mobile particles in the simulation (dots) in a slice of our 3-dimensional box. Better performance corresponds to greater alignment of red areas and dots. This panel corresponds to a prediction over a short timescale: a regime in which our network attains a very strong performance.

3D prediction over long timescales

Figure 3b. In this panel, corresponding to a timescale 28,000 times longer than the top panel, particles in the glass have started to diffuse. The dynamics are heterogeneous – particle mobilities are correlated locally, but heterogeneous at macroscopic scales – yet our network still makes predictions in agreement with the ground truth simulation.

We don’t want to simply model glass, however: we want to understand it.  We therefore explored what factors were important to our model’s success in order to infer what properties are important in the underlying system. A central unsolved question in the dynamics of glass is how particles influence one another as a function of distance, and how this evolves over time. We investigated this by designing an experiment leveraging the specific architecture of the graph network. Recall that repeated applications of the edge and node updates define shells of particles around any given particle: the first shell consists of all particles one step away from this "marked" particle, the second shell consists of all particles one step away from the first shell, and so on (see the different shades of blue on Figure 2c). By measuring the sensitivity of the prediction that the network makes for the central particle when the n-th shell is modified, we can measure how large an area the network uses to extract its prediction, which provides an estimate of the distance over which particles influence each other in the physical system.
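The shells described above can be computed with a breadth-first search over the particle graph (a sketch; `adjacency` and `shells` are our illustrative names, with `adjacency` mapping each particle to its connected neighbours):

```python
from collections import deque

def shells(adjacency, center):
    """Group particles into shells around `center`: shell 1 holds the
    centre's graph neighbours, shell 2 their neighbours, and so on.
    Each application of the network core grows the receptive field of
    the centre's prediction by one such shell."""
    hops = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    out = {}
    for node, d in hops.items():
        out.setdefault(d, set()).add(node)
    return out

# Toy chain 0-1-2-3: shells around particle 0.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(shells(adj, 0))  # {0: {0}, 1: {1}, 2: {2}, 3: {3}}
```

The sensitivity experiment then perturbs or removes the particles in a given shell and measures how much the network's prediction for the central particle changes.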

Figure 4: Ablation experiment. On the left experiment, all particles beyond the first shell around one central particle are removed. On the right experiment, the input is perturbed by increasing the distance between the first and second shells of particles.

We found that when predicting what happens in the near future, or in the liquid phase, drastic modifications of the third shell (for instance, removing it altogether; Figure 4, left) did not modify the prediction that the network made for the marked particle. On the other hand, when making predictions at low temperature and in the far future, after the glass starts to relax, even tiny perturbations of the fifth shell (Figure 4, right) affected the prediction for the marked particle. These findings are consistent with a physical picture in which a correlation length (a measure of the distance over which particles influence each other) grows upon approaching the glass transition. The definition and study of correlation lengths is a cornerstone of the study of phase transitions in physics, and one that is still an open point of debate for glasses. While this "machine-learned" correlation length cannot be directly transformed into a physically measurable quantity, it provides compelling evidence that growing spatial correlations are present in the system upon approaching the glass transition, and that our network has learned to extract them.

Conclusion

Our results show that graph networks are a powerful tool for predicting the long-term dynamics of glassy systems, leveraging the structure hidden in a local neighbourhood of particles. We expect our technique to be useful for predicting other physical quantities of interest in glasses, and hope that it will lead to more insights for glassy system theorists – we are open-sourcing our models and trained networks to aid this effort. More generally, graph networks are a versatile tool being applied to many other physical systems that consist of many-body interactions, in contexts including traffic, crowd simulations, and cosmology. The network analysis methods used here could also yield deeper understanding in other fields: graph networks may not only help us make better predictions for a range of systems, but also indicate what physical correlates are important for modelling them – in this work, how interactions between local particles in a glassy material evolve over time.

We believe that our results advocate using structured models when applying machine learning to the physical sciences; in our case, the ability to analyse the inner workings of a neural network indicated that it had discovered a quantity that correlates with an elusive physical quantity. This demonstrates that machine learning can be used not only to make quantitative predictions, but also to gain qualitative understanding of physical systems. This could mean that machine learning systems might be able to eventually assist researchers in deriving fundamental physical theories, ultimately helping to augment, rather than replace, human understanding.

Work done in collaboration with: E. D. Cubuk, S. S. Schoenholz, A. Obika, A. W. R. Nelson, T. Back, D. Hassabis and P. Kohli

Figure design by Paulo Estriga and Adam Cain

Read more

We’re continuing to develop methods for applying machine learning to a broad range of fundamental science questions. We’re always looking to hire more scientists – read about our science programme and openings for more information.