Hamiltonian Neural Networks

20 Jun 2019

Introduction

The paper proposes a very cool idea at the intersection of deep learning and physics.
The idea is to train a neural network whose architecture builds on Hamiltonian mechanics (from physics), so that it learns physical conservation laws in an unsupervised manner.
Link to the paper
Link to the code
Link to author’s blog

Hamiltonian mechanics is a branch of physics that describes systems which obey conservation laws and invariants.
Consider a set of N pairs of coordinates [(q₁, p₁), …, (q_N, p_N)], where q = [q₁, …, q_N] denotes the positions of a set of objects and p = [p₁, …, p_N] denotes their momenta.
Together these N pairs completely describe the system.
A scalar function H(q, p), called the Hamiltonian, is defined such that the partial derivative of H with respect to p equals the time derivative of q, and the negative of the partial derivative of H with respect to q equals the time derivative of p.
This can be expressed as the following equations:

$$\frac{dq}{dt} = \frac{\partial H}{\partial p}, \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q}$$

The Hamiltonian can be tied to the total energy of the system and can be used in any system where the total energy is conserved.
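
To make the link between the Hamiltonian and energy conservation concrete, here is a minimal sketch for an ideal mass-spring. It is not taken from the paper's code: the function names, the unit mass and spring constant, the finite-difference derivatives, and the symplectic Euler integrator are all assumptions chosen for illustration.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """Total energy of an ideal mass-spring: kinetic plus potential.
    The values of m and k are illustrative assumptions."""
    return p * p / (2.0 * m) + k * q * q / 2.0

def time_derivatives(H, q, p, eps=1e-6):
    """Hamilton's equations, with the partial derivatives of H
    approximated by central finite differences."""
    dH_dq = (H(q + eps, p) - H(q - eps, p)) / (2.0 * eps)
    dH_dp = (H(q, p + eps) - H(q, p - eps)) / (2.0 * eps)
    return dH_dp, -dH_dq  # (dq/dt, dp/dt)

def simulate(H, q, p, dt=1e-3, steps=10000):
    """Integrate with symplectic Euler: update p first, then q
    using the updated p. Returns the list of (q, p) states."""
    trajectory = [(q, p)]
    for _ in range(steps):
        _, dp_dt = time_derivatives(H, q, p)
        p = p + dt * dp_dt
        dq_dt, _ = time_derivatives(H, q, p)
        q = q + dt * dq_dt
        trajectory.append((q, p))
    return trajectory

trajectory = simulate(hamiltonian, q=1.0, p=0.0)
energies = [hamiltonian(q, p) for q, p in trajectory]
energy_drift = max(energies) - min(energies)
print("energy drift over the rollout:", energy_drift)
```

With these settings the position should stay close to the analytic solution cos(t), and the energy should vary only slightly over the rollout.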

The Hamiltonian H can be parameterized by a neural network, which then learns conserved quantities from the data in an unsupervised manner.
The loss function is as follows:

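$$\mathcal{L}_{HNN} = \left\lVert \frac{\partial H_\theta}{\partial p} - \frac{\partial q}{\partial t} \right\rVert_2 + \left\lVert \frac{\partial H_\theta}{\partial q} + \frac{\partial p}{\partial t} \right\rVert_2$$

Here H_θ denotes the Hamiltonian parameterized by network weights θ.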

The partial derivatives can be obtained by computing the in-graph gradient of the network's scalar output H with respect to its inputs (q, p).
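
The training objective can be illustrated with a deliberately tiny stand-in for the network. In this sketch, which is an assumption-laden toy rather than the authors' implementation, the "network" is a hypothetical two-parameter model H_θ(q, p) = a·q² + b·p², the input gradients are written out by hand instead of coming from automatic differentiation, the residuals are squared for convenience, and the training data is synthetic mass-spring data with an assumed mass and spring constant.

```python
import math

M, K = 1.0, 4.0  # assumed mass and spring constant for the synthetic data

def make_data(n=100, dt=0.1):
    """Samples (q, p, dq/dt, dp/dt) from the analytic mass-spring solution."""
    omega = math.sqrt(K / M)
    data = []
    for i in range(n):
        t = i * dt
        q = math.cos(omega * t)
        p = -M * omega * math.sin(omega * t)
        data.append((q, p, p / M, -K * q))
    return data

def h_theta_grads(theta, q, p):
    """Input gradients of the toy model H_theta(q, p) = a*q^2 + b*p^2.
    A real HNN would obtain these via automatic differentiation."""
    a, b = theta
    return 2.0 * a * q, 2.0 * b * p  # (dH/dq, dH/dp)

def hnn_loss(theta, data):
    """Mean squared residual of Hamilton's equations over the data."""
    total = 0.0
    for q, p, dq_dt, dp_dt in data:
        dH_dq, dH_dp = h_theta_grads(theta, q, p)
        total += (dH_dp - dq_dt) ** 2 + (dH_dq + dp_dt) ** 2
    return total / len(data)

def loss_grad(theta, data):
    """Gradient of hnn_loss with respect to (a, b), derived by hand."""
    a, b = theta
    ga = gb = 0.0
    for q, p, dq_dt, dp_dt in data:
        ga += 2.0 * (2.0 * a * q + dp_dt) * 2.0 * q
        gb += 2.0 * (2.0 * b * p - dq_dt) * 2.0 * p
    return ga / len(data), gb / len(data)

def fit(data, theta=(0.0, 0.0), lr=0.05, steps=200):
    """Plain gradient descent on the two parameters."""
    a, b = theta
    for _ in range(steps):
        ga, gb = loss_grad((a, b), data)
        a, b = a - lr * ga, b - lr * gb
    return a, b

data = make_data()
theta = fit(data)
final_loss = hnn_loss(theta, data)
print("learned (a, b):", theta, "loss:", final_loss)
```

The parameters should approach a = K/2 and b = 1/(2M), so the fitted scalar matches the true energy K·q²/2 + p²/(2M), even though the energy itself is never given as a training target.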

For setups where the energy must be conserved exactly (e.g., the ideal mass-spring and the ideal pendulum), the HNN learns to preserve an energy-like scalar.
For setups where the energy is not conserved exactly, the HNN still learns to conserve it, which highlights a limitation of HNNs.
In the two-body problem, the HNN is shown to be much more robust than the baselines when making predictions over longer time horizons.
In the final experiment, the model is trained on pixel observations and not state observations. In this case, two auxiliary losses are added: auto-encoder reconstruction loss and a loss on the latent space representations. Similar to the previous experiments, the HNN model makes robust predictions over much longer time horizons.