Papers I Read Notes and Summaries

Superposition of many models into one


  • The paper proposes a technique (called Parameter Superposition or PSP) for...

Towards a Unified Theory of State Abstraction for MDPs


  • The paper studies five different techniques for state abstraction in MDPs...

ALBERT - A Lite BERT for Self-supervised Learning of Language Representations


  • The paper proposes parameter-reduction techniques to lower the memory consumption (and...

Everything Happens for a Reason - Discovering the Purpose of Actions in Procedural Text


  • Procedural text comprehension tasks focus on modeling the effect of actions...

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model


  • The paper presents the MuZero algorithm that performs planning with a...

Contrastive Learning of Structured World Models


  • The paper introduces Contrastively-trained Structured World Models (C-SWMs).

  • These...

Gossip based Actor-Learner Architectures for Deep RL

How to train your MAML


  • The paper proposes MAML++ - a modification of the MAML algorithm that...

PHYRE - A New Benchmark for Physical Reasoning


  • The paper proposes the PHYRE (PHYsical REasoning) benchmark - consisting of...

Large Memory Layers with Product Keys


  • The paper proposes a structured key-value memory layer that:
    • Can scale...