Hamiltonian Neural Networks
20 Jun 2019

Introduction

The paper proposes a very cool idea at the intersection of deep learning and physics.

The idea is to train a neural network architecture that builds on the concept of Hamiltonian Mechanics (from Physics) to learn physical conservation laws in an unsupervised manner.
Hamiltonian Mechanics

It is a branch of physics that can describe systems which follow some conservation laws and invariants.

Consider a set of N pairs of coordinates [(q_{1}, p_{1}), …, (q_{N}, p_{N})], where q = [q_{1}, …, q_{N}] denotes the positions of a set of objects and p = [p_{1}, …, p_{N}] denotes their momenta.

Together these N pairs completely describe the system.

A scalar function H(q, p), called the Hamiltonian, is defined such that the partial derivative of H with respect to p equals the time derivative of q, and the negative of the partial derivative of H with respect to q equals the time derivative of p.

This can be expressed as the following equations:

dq/dt = ∂H/∂p,  dp/dt = -∂H/∂q

The Hamiltonian can be tied to the total energy of the system and can be used in any system where the total energy is conserved.
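As a concrete illustration (not from the paper), consider the ideal mass-spring with m = k = 1, whose Hamiltonian is H(q, p) = p²/2 + q²/2. Integrating Hamilton's equations with a symplectic (energy-respecting) Euler step shows the conserved quantity in action:

```python
# Ideal mass-spring with m = k = 1: H(q, p) = p^2/2 + q^2/2.
# Hamilton's equations give dq/dt = dH/dp = p and dp/dt = -dH/dq = -q.
def hamiltonian(q, p):
    return 0.5 * p ** 2 + 0.5 * q ** 2

q, p, dt = 1.0, 0.0, 0.01
E0 = hamiltonian(q, p)

for _ in range(1000):
    # Symplectic Euler: update p using -dH/dq, then q using dH/dp.
    p = p - dt * q
    q = q + dt * p

# hamiltonian(q, p) stays approximately equal to E0 along the trajectory.
```

The system oscillates while the value of H (the total energy) remains approximately constant, which is exactly the invariant an HNN is meant to learn from data.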
Hamiltonian Neural Network (HNN)

The Hamiltonian H can be parameterized using a neural network and can learn conserved quantities from the data in an unsupervised manner.

The loss function looks as follows:

L = ‖∂H_θ/∂p − dq/dt‖² + ‖∂H_θ/∂q + dp/dt‖²

The partial derivatives can be obtained by computing the in-graph gradient of the output variables with respect to the input variables.
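A minimal PyTorch sketch of this idea (the architecture details here are assumptions, not the paper's exact hyperparameters): the network outputs a scalar H, its in-graph gradient with respect to the input (q, p) is taken with `torch.autograd.grad`, and the symplectic gradient is matched against the observed time derivatives.

```python
import torch

# Sketch of an HNN: a small MLP parameterizes the scalar Hamiltonian H_theta.
class HNN(torch.nn.Module):
    def __init__(self, dim=2):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))

    def time_derivative(self, x):
        # x holds (q, p) pairs; H_theta is a scalar per sample.
        x = x.requires_grad_(True)
        H = self.net(x).sum()
        # In-graph gradient of H w.r.t. the inputs (create_graph=True keeps
        # it differentiable so the loss can be backpropagated through it).
        dH = torch.autograd.grad(H, x, create_graph=True)[0]
        dHdq, dHdp = dH.chunk(2, dim=-1)
        # Hamilton's equations: dq/dt = dH/dp, dp/dt = -dH/dq.
        return torch.cat([dHdp, -dHdq], dim=-1)

x = torch.randn(8, 2)          # toy states (q, p)
dxdt_true = torch.randn(8, 2)  # stand-in for observed time derivatives

model = HNN()
loss = ((model.time_derivative(x) - dxdt_true) ** 2).mean()
loss.backward()
```

Because the supervision signal is only the observed time derivatives, no energy labels are needed; conservation of H_θ falls out of the symplectic structure, which is what makes the training unsupervised with respect to the conserved quantity.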
Observations

For setups where the energy must be conserved exactly (e.g., the ideal mass-spring and the ideal pendulum), the HNN learns to preserve an energy-like scalar.

For setups where the energy need not be conserved exactly, the HNN still learns to preserve an energy-like scalar, highlighting a limitation of HNNs.

In the case of the two-body problem, the HNN model is shown to be much more robust when making predictions over longer time horizons as compared to the baselines.

In the final experiment, the model is trained on pixel observations rather than state observations. In this case, two auxiliary losses are added: an autoencoder reconstruction loss and a loss on the latent-space representations. As in the previous experiments, the HNN model makes robust predictions over much longer time horizons.
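The combined objective for the pixel experiment can be sketched as follows. Everything here is illustrative: the shapes, the encoder/decoder architecture, and the placeholder loss terms are assumptions, and the exact form of the latent-space loss is given in the paper.

```python
import torch

# Hypothetical sketch of the pixel experiment's combined objective.
pixels = torch.rand(8, 1, 28, 28)  # stand-in pixel observations

# Toy autoencoder mapping pixels to a 2-D latent (q, p) and back.
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 2))
decoder = torch.nn.Sequential(torch.nn.Linear(2, 28 * 28), torch.nn.Sigmoid())

z = encoder(pixels)                 # latent coordinates, treated as (q, p)
recon = decoder(z).view_as(pixels)

recon_loss = ((recon - pixels) ** 2).mean()  # autoencoder reconstruction term
hnn_loss = torch.tensor(0.0)        # placeholder: HNN loss in latent space
latent_loss = torch.tensor(0.0)     # placeholder: auxiliary latent-space term
total_loss = hnn_loss + recon_loss + latent_loss
total_loss.backward()
```

The key design point is that the HNN loss is applied in the learned latent space, so the autoencoder must produce coordinates that behave like canonical (q, p) pairs.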