Accurate, Large Minibatch SGD - Training ImageNet in 1 Hour

09 Jan 2020

Introduction

Training models with large minibatches (using distributed synchronous SGD) can lead...

Superposition of many models into one

02 Jan 2020

Introduction

The paper proposes a technique (called Parameter Superposition or PSP) for...

Towards a Unified Theory of State Abstraction for MDPs

26 Dec 2019

Introduction

The paper studies five different techniques for state abstraction in MDPs...

ALBERT - A Lite BERT for Self-supervised Learning of Language Representations

19 Dec 2019

Introduction

The paper proposes parameter-reduction techniques to lower the memory consumption (and...

Everything Happens for a Reason - Discovering the Purpose of Actions in Procedural Text

12 Dec 2019

Introduction

Procedural text comprehension tasks focus on modeling the effect of actions...

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

05 Dec 2019

Introduction

The paper presents the MuZero algorithm that performs planning with a...

Contrastive Learning of Structured World Models

28 Nov 2019

Introduction

The paper introduces Contrastively-trained Structured World Models (C-SWMs).
These...

Gossip based Actor-Learner Architectures for Deep RL

12 Sep 2019

The paper considers the task of...

How to train your MAML

05 Sep 2019

Introduction

The paper proposes MAML++ - a modification of the MAML algorithm that...

PHYRE - A New Benchmark for Physical Reasoning

29 Aug 2019

Introduction

The paper proposes the PHYRE (PHYsical REasoning) benchmark - consisting of...
