Papers I Read: Notes and Summaries

Representation Tradeoffs for Hyperbolic Embeddings

Introduction

  • The paper describes a combinatorial approach to embed trees into hyperbolic spaces without performing optimization.

  • The resulting construction is analyzed to obtain tradeoffs between embedding dimensionality and the numeric precision required.

  • To embed general metric spaces into hyperbolic space, a hyperbolic generalization of multidimensional scaling (h-MDS) is proposed.

  • Link to the paper: https://arxiv.org/abs/1804.03329

Preliminaries

  • Hyperbolic Spaces

    • Have the “tree-like” property, i.e., the shortest path between a pair of points is almost the same as the path through the origin, much as paths in a tree pass through a common ancestor (illustrated by the snippet below).

    • Generally, the Poincare ball model is used given advantages such as conformality (angles agree with the underlying Euclidean space).
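
For concreteness, a minimal NumPy sketch of the Poincare ball distance, illustrating the tree-like property for two points near the boundary (the function name poincare_dist is mine, not from the paper):

```python
import numpy as np

def poincare_dist(x, y):
    """Hyperbolic distance between points x, y (arrays with norm < 1)
    in the Poincare ball model."""
    num = 2 * np.sum((x - y) ** 2)
    den = (1 - np.sum(x ** 2)) * (1 - np.sum(y ** 2))
    return np.arccosh(1 + num / den)

# Two points near the boundary: the direct path is almost as long as
# the path through the origin, just as in a tree.
x = np.array([0.99, 0.0])
y = np.array([0.0, 0.99])
print(poincare_dist(x, y))                                           # ~9.9
print(poincare_dist(x, np.zeros(2)) + poincare_dist(np.zeros(2), y)) # ~10.6
```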

  • Fidelity Measures

    • Mean Average Precision (MAP)

      • A local metric that measures how well the embedded distances rank each node's true immediate neighbors.

    • Distortion

      • A global metric that depends on the actual values of the underlying distances, not just on the local relationships between them (a sketch of both measures follows below).
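
A rough sketch of both measures, assuming a true distance matrix, an adjacency matrix (for MAP), and an embedded distance matrix; the paper's exact definitions differ in details such as tie-breaking:

```python
import numpy as np

def average_distortion(D_true, D_emb):
    """Global measure: mean relative deviation of the embedded
    distances from the true distances, over all pairs."""
    iu = np.triu_indices(D_true.shape[0], k=1)
    return np.mean(np.abs(D_emb[iu] - D_true[iu]) / D_true[iu])

def mean_average_precision(adj, D_emb):
    """Local measure: for each node, rank all other nodes by embedded
    distance and average the precision at each true neighbor."""
    node_ap = []
    for a in range(adj.shape[0]):
        if not adj[a].any():
            continue
        order = np.argsort(D_emb[a])       # closest first
        order = order[order != a]          # skip the node itself
        hits, precisions = 0, []
        for rank, b in enumerate(order, start=1):
            if adj[a, b]:
                hits += 1
                precisions.append(hits / rank)
        node_ap.append(np.mean(precisions))
    return np.mean(node_ap)
```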

Combinatorial Construction for embedding hierarchies into Hyperbolic spaces

  • Embed the given graph G = (V, E) into a tree T.

  • Embed the tree T into the Poincare ball H_d of dimension d.

Sarkar’s construction to embed points in a 2-d Poincare ball

  • Consider two points a and b (from the tree) where b is the parent of a.

  • Assume that a is embedded at f(a), b is embedded at f(b), and the children of a need to be embedded.

  • Reflect f(a) and f(b) across a geodesic such that f(a) is mapped to 0 (origin) while f(b) is mapped to some new point z.

  • Children of a are placed at points y_i that are equally spaced around a circle of radius (e^r - 1) / (e^r + 1) (the Euclidean radius corresponding to hyperbolic distance r from the origin) and maximally separated from z, where r is a scaling factor.

  • Then all the points are reflected back across the geodesic, so that all children end up at hyperbolic distance r from f(a).

  • To embed the tree itself, place the root node at the origin, place its children around it in a circle, then place their children, and so on (a code sketch of one placement step follows this section).

  • In this construction, the required precision scales logarithmically with the maximum degree of the tree but linearly with the maximum path length.
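
A minimal sketch of one placement step, using complex numbers for the 2-d disk. The paper uses circle inversions for the reflections; here a standard Mobius isometry sending f(a) to the origin plays the same role, and all names are mine:

```python
import numpy as np

def to_origin(a, z):
    """Mobius isometry of the Poincare disk (complex coordinates)
    that sends the point a to the origin."""
    return (z - a) / (1 - np.conj(a) * z)

def from_origin(a, z):
    """Inverse of to_origin: sends the origin back to a."""
    return (z + a) / (1 + np.conj(a) * z)

def place_children(f_a, f_b, num_children, r):
    """One step of the construction: embed the children of a, given
    a's embedding f_a and its parent's embedding f_b, so that each
    child lies at hyperbolic distance r from f_a."""
    z = to_origin(f_a, f_b)                  # parent's image once a is centered
    rho = (np.exp(r) - 1) / (np.exp(r) + 1)  # Euclidean radius for hyperbolic distance r
    theta = np.angle(z)                      # direction of the parent's image
    children = []
    for k in range(1, num_children + 1):
        # spread the children and the parent direction evenly around the circle
        phi = theta + 2 * np.pi * k / (num_children + 1)
        children.append(from_origin(f_a, rho * np.exp(1j * phi)))
    return children

# Hypothetical usage: node at 0, parent at 0.3, two children at distance r = 1.0
print(place_children(0j, 0.3 + 0j, 2, 1.0))
```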

d-dimensional hyperbolic spaces

  • In the d-dimensional space, children are placed on hyperspheres (instead of circles).

  • The number of children that can be placed at a given angle of separation grows with the dimension (see the sketch after this list).

  • Increasing the dimension therefore helps with bushy trees (trees with high node degree).
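
A crude illustration of the dimension effect (the paper instead uses coding-theoretic constructions to place points): greedily collect unit vectors whose pairwise angle stays above a threshold, and watch the count grow with dimension.

```python
import numpy as np

rng = np.random.default_rng(0)

def greedy_sphere_points(dim, min_angle, trials=5000):
    """Greedily accept random unit vectors in R^dim whose angle with
    every previously accepted vector is at least min_angle."""
    pts = []
    for _ in range(trials):
        v = rng.normal(size=dim)
        v /= np.linalg.norm(v)
        if all(np.dot(v, p) <= np.cos(min_angle) for p in pts):
            pts.append(v)
    return len(pts)

for dim in (2, 3, 5, 10):
    # more room for well-separated children as the dimension grows
    print(dim, greedy_sphere_points(dim, np.pi / 3))
```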

Hyperbolic multidimensional scaling (h-MDS)

  • Given the pairwise distances between a set of points in hyperbolic space, how can the points themselves be recovered?

  • The corresponding problem in the Euclidean space is solved using MDS.

  • A variant of MDS, called h-MDS, is proposed.

  • MDS makes a centering assumption, namely that the points have zero mean. In h-MDS, a new notion of mean (called the pseudo-Euclidean mean) is introduced to enable recovery via matrix factorization.

  • Instead of the Poincare model, the hyperboloid model is used (though the points can be mapped back and forth).

Pseudo-Euclidean mean

  • A set of points can always be centered, without affecting their pairwise distances, by finding their mean and sending it to 0 via an isometry.

Recovery via matrix factorization

  • Given the pairwise distances, a new matrix Y is constructed by applying cosh entrywise to the pairwise distances.

  • Running PCA on -Y recovers X up to rotation (a sketch follows below).
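
A rough NumPy sketch of the recovery step under the centering assumption; the paper's algorithm treats signs and centering more carefully, and the name h_mds is mine:

```python
import numpy as np

def h_mds(D, d):
    """Given pairwise hyperbolic distances D (n x n), recover
    hyperboloid-model points up to rotation, assuming the points are
    centered at the pseudo-Euclidean mean."""
    Y = np.cosh(D)                            # Y_ij = cosh(d_ij) = -<x_i, x_j> (Minkowski)
    # For centered points, -Y equals Z Z^T minus a rank-one term, so a
    # PCA-style eigendecomposition of -Y exposes the reduced coordinates Z.
    eigvals, eigvecs = np.linalg.eigh(-Y)
    top = np.argsort(eigvals)[::-1][:d]       # top-d (positive) eigenvalues
    Z = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0, None))
    x0 = np.sqrt(1 + np.sum(Z ** 2, axis=1))  # lift back onto the hyperboloid
    return np.column_stack([x0, Z])
```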

Dimensionality Reduction with PGA (Principal Geodesic Analysis)

  • PGA is the counterpart of PCA in hyperbolic spaces.

  • First the Karcher mean of the given points is computed.

  • All points x_i are reflected so that their mean lies at 0 in the Poincare disk model.

  • Combining this with the Euclidean reflection formula and the hyperbolic metric leads to a non-convex loss function, which can be optimized with gradient descent (a rough sketch follows below).
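
A toy sketch of the idea for a single geodesic component in the 2-d disk, assuming centered points. It scans candidate projections along a diameter and uses numerical gradients, whereas the paper derives the loss and its gradients analytically; all names are mine:

```python
import numpy as np

def poincare_dist(u, v):
    """Hyperbolic distance in the Poincare disk model."""
    num = 2 * np.sum((u - v) ** 2)
    den = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + num / den)

def pga_loss(theta, X, ts=np.linspace(-0.99, 0.99, 199)):
    """Sum of squared hyperbolic distances from each point to the
    geodesic through the origin in direction (cos theta, sin theta).
    Geodesics through 0 are diameters, so candidate projections are
    scanned along the diameter (a crude stand-in for a closed form)."""
    w = np.array([np.cos(theta), np.sin(theta)])
    return sum(min(poincare_dist(x, t * w) ** 2 for t in ts) for x in X)

def fit_pga_direction(X, lr=0.1, steps=200, eps=1e-4):
    """Minimize the non-convex loss over theta with numerical
    gradient descent."""
    theta = 0.0
    for _ in range(steps):
        g = (pga_loss(theta + eps, X) - pga_loss(theta - eps, X)) / (2 * eps)
        theta -= lr * g
    return np.array([np.cos(theta), np.sin(theta)])

# Hypothetical usage on a few roughly centered points in the disk
X = np.array([[0.3, 0.2], [-0.25, -0.15], [0.1, 0.05], [-0.12, -0.08]])
print(fit_pga_direction(X))
```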

Experiments

  • Datasets

    • Trees: fully balanced trees and phylogenetic trees expressing genetic heritage.
    • Tree-like hierarchies: the WordNet hypernym hierarchy and a graph of Ph.D. advisor-advisee relationships.
    • Non-tree-like graphs: disease relationships, protein interactions, etc.
  • Results

    • The combinatorial construction outperforms optimization-based approaches in terms of both MAP and distortion.
    • For example, on WordNet the combinatorial approach achieves a MAP of 0.989 with just 2 dimensions, while the previous best was 0.87 with 200 dimensions.