Archive

Blog Posts

10 Feb 2023 » Toolformer - Language Models Can Teach Themselves to Use Tools
29 Mar 2021 » Synthesized Policies for Transfer and Adaptation across Tasks and Environments
22 Mar 2021 » Deep Neural Networks for YouTube Recommendations
15 Mar 2021 » The Tail at Scale
08 Mar 2021 » Practical Lessons from Predicting Clicks on Ads at Facebook
01 Mar 2021 » Ad Click Prediction - a View from the Trenches
22 Feb 2021 » Anatomy of Catastrophic Forgetting - Hidden Representations and Task Semantics
15 Feb 2021 » When Do Curricula Work?
08 Feb 2021 » Continual learning with hypernetworks
01 Feb 2021 » Zero-shot Learning by Generating Task-specific Adapters
25 Jan 2021 » HyperNetworks
18 Jan 2021 » Energy-based Models for Continual Learning
11 Jan 2021 » GPipe - Easy Scaling with Micro-Batch Pipeline Parallelism
04 Jan 2021 » Compositional Explanations of Neurons
21 Dec 2020 » Design patterns for container-based distributed systems
14 Dec 2020 » Cassandra - a decentralized structured storage system
07 Dec 2020 » CAP twelve years later - How the rules have changed
30 Nov 2020 » Consistency Tradeoffs in Modern Distributed Database System Design
23 Nov 2020 » Exploring Simple Siamese Representation Learning
16 Nov 2020 » Data Management for Internet-Scale Single-Sign-On
09 Nov 2020 » Searching for Build Debt - Experiences Managing Technical Debt at Google
02 Nov 2020 » One Solution is Not All You Need - Few-Shot Extrapolation via Structured MaxEnt RL
19 Oct 2020 » Learning Explanations That Are Hard To Vary
12 Oct 2020 » Remembering for the Right Reasons - Explanations Reduce Catastrophic Forgetting
28 Sep 2020 » A Foliated View of Transfer Learning
21 Sep 2020 » Harvest, Yield, and Scalable Tolerant Systems
14 Sep 2020 » MONet - Unsupervised Scene Decomposition and Representation
07 Sep 2020 » Revisiting Fundamentals of Experience Replay
31 Aug 2020 » Deep Reinforcement Learning and the Deadly Triad
24 Aug 2020 » Alpha Net - Adaptation with Composition in Classifier Space
14 Aug 2020 » Outrageously Large Neural Networks - The Sparsely-Gated Mixture-of-Experts Layer
06 Aug 2020 » Gradient Surgery for Multi-Task Learning
30 Jul 2020 » GradNorm - Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks
23 Jul 2020 » TaskNorm - Rethinking Batch Normalization for Meta-Learning
16 Jul 2020 » Averaging Weights leads to Wider Optima and Better Generalization
09 Jul 2020 » Decentralized Reinforcement Learning - Global Decision-Making via Local Economic Transactions
02 Jul 2020 » When to use parametric models in reinforcement learning?
25 Jun 2020 » Network Randomization - A Simple Technique for Generalization in Deep Reinforcement Learning
18 Jun 2020 » On the Difficulty of Warm-Starting Neural Network Training
30 Apr 2020 » Supervised Contrastive Learning
09 Apr 2020 » CURL - Contrastive Unsupervised Representations for Reinforcement Learning
12 Mar 2020 » Competitive Training of Mixtures of Independent Deep Generative Models
05 Mar 2020 » What Does Classifying More Than 10,000 Image Categories Tell Us?
27 Feb 2020 » mixup - Beyond Empirical Risk Minimization
20 Feb 2020 » ELECTRA - Pre-training Text Encoders as Discriminators Rather Than Generators
13 Feb 2020 » Gradient based sample selection for online continual learning
06 Feb 2020 » Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One
30 Jan 2020 » Massively Multilingual Neural Machine Translation in the Wild - Findings and Challenges
23 Jan 2020 » Observational Overfitting in Reinforcement Learning
16 Jan 2020 » Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML
09 Jan 2020 » Accurate, Large Minibatch SGD - Training ImageNet in 1 Hour
02 Jan 2020 » Superposition of many models into one
26 Dec 2019 » Towards a Unified Theory of State Abstraction for MDPs
19 Dec 2019 » ALBERT - A Lite BERT for Self-supervised Learning of Language Representations
12 Dec 2019 » Everything Happens for a Reason - Discovering the Purpose of Actions in Procedural Text
05 Dec 2019 » Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
28 Nov 2019 » Contrastive Learning of Structured World Models
12 Sep 2019 » Gossip based Actor-Learner Architectures for Deep RL
05 Sep 2019 » How to train your MAML
29 Aug 2019 » PHYRE - A New Benchmark for Physical Reasoning
22 Aug 2019 » Large Memory Layers with Product Keys
15 Aug 2019 » Abductive Commonsense Reasoning
08 Aug 2019 » Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
01 Aug 2019 » Assessing Generalization in Deep Reinforcement Learning
25 Jul 2019 » Quantifying Generalization in Reinforcement Learning
18 Jul 2019 » Set Transformer - A Framework for Attention-based Permutation-Invariant Neural Networks
27 Jun 2019 » Measuring abstract reasoning in neural networks
20 Jun 2019 » Hamiltonian Neural Networks
13 Jun 2019 » Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
08 Jun 2019 » Meta-Reinforcement Learning of Structured Exploration Strategies
01 Jun 2019 » Relational Reinforcement Learning
21 May 2019 » Good-Enough Compositional Data Augmentation
14 May 2019 » Multiple Model-Based Reinforcement Learning
09 Apr 2019 » Towards a natural benchmark for continual learning
02 Apr 2019 » Meta-Learning Update Rules for Unsupervised Representation Learning
26 Mar 2019 » GNN Explainer - A Tool for Post-hoc Explanation of Graph Neural Networks
16 Mar 2019 » To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks
12 Mar 2019 » Model Primitive Hierarchical Lifelong Reinforcement Learning
19 Feb 2019 » TuckER - Tensor Factorization for Knowledge Graph Completion
05 Feb 2019 » Linguistic Knowledge as Memory for Recurrent Neural Networks
29 Jan 2019 » Diversity is All You Need - Learning Skills without a Reward Function
22 Jan 2019 » Modular meta-learning
15 Jan 2019 » Hierarchical RL Using an Ensemble of Proprioceptive Periodic Policies
08 Jan 2019 » Efficient Lifelong Learning with A-GEM
02 Jan 2019 » Pre-training Graph Neural Networks with Kernels
25 Dec 2018 » Smooth Loss Functions for Deep Top-k Classification
18 Dec 2018 » Hindsight Experience Replay
11 Dec 2018 » Representation Tradeoffs for Hyperbolic Embeddings
01 Nov 2018 » Learned Optimizers that Scale and Generalize
25 Oct 2018 » One-shot Learning with Memory-Augmented Neural Networks
18 Oct 2018 » BabyAI - First Steps Towards Grounded Language Learning With a Human In the Loop
11 Oct 2018 » Poincaré Embeddings for Learning Hierarchical Representations
04 Oct 2018 » When Recurrent Models Don’t Need To Be Recurrent
27 Sep 2018 » HoME - a Household Multimodal Environment
12 Sep 2018 » Emergence of Grounded Compositional Language in Multi-Agent Populations
21 Aug 2018 » A Semantic Loss Function for Deep Learning with Symbolic Knowledge
16 Aug 2018 » Hierarchical Graph Representation Learning with Differentiable Pooling
08 Aug 2018 » Imagination-Augmented Agents for Deep Reinforcement Learning
19 Jul 2018 » Kronecker Recurrent Units
11 Jul 2018 » Learning Independent Causal Mechanisms
04 Jul 2018 » Memory-based Parameter Adaptation
09 Jun 2018 » Born Again Neural Networks
21 May 2018 » Net2Net - Accelerating Learning via Knowledge Transfer
06 May 2018 » Learning to Count Objects in Natural Images for Visual Question Answering
08 Apr 2018 » Neural Message Passing for Quantum Chemistry
02 Apr 2018 » Unsupervised Learning by Predicting Noise
25 Mar 2018 » The Lottery Ticket Hypothesis - Training Pruned Neural Networks
18 Mar 2018 » Cyclical Learning Rates for Training Neural Networks
11 Mar 2018 » Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning
05 Mar 2018 » An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks
24 Feb 2018 » Learning a SAT Solver from Single-Bit Supervision
17 Feb 2018 » Neural Relational Inference for Interacting Systems
11 Feb 2018 » Stylistic Transfer in Natural Language Generation Systems Using Recurrent Neural Networks
05 Feb 2018 » Get To The Point - Summarization with Pointer-Generator Networks
29 Jan 2018 » StarSpace - Embed All The Things!
22 Jan 2018 » Emotional Chatting Machine - Emotional Conversation Generation with Internal and External Memory
14 Jan 2018 » Exploring Models and Data for Image Question Answering
06 Jan 2018 » How transferable are features in deep neural networks?
31 Dec 2017 » Distilling the Knowledge in a Neural Network
24 Dec 2017 » PTE - Predictive Text Embedding through Large-scale Heterogeneous Text Networks
11 Dec 2017 » Revisiting Semi-Supervised Learning with Graph Embeddings
28 Nov 2017 » Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension
19 Nov 2017 » Higher-order organization of complex networks
12 Nov 2017 » Network Motifs - Simple Building Blocks of Complex Networks
05 Nov 2017 » Word Representations via Gaussian Embedding
28 Oct 2017 » HARP - Hierarchical Representation Learning for Networks
22 Oct 2017 » Swish - a Self-Gated Activation Function
15 Oct 2017 » Reading Wikipedia to Answer Open-Domain Questions
01 Oct 2017 » Task-Oriented Query Reformulation with Reinforcement Learning
22 Sep 2017 » Refining Source Representations with Relation Networks for Neural Machine Translation
27 Aug 2017 » Pointer Networks
21 Aug 2017 » Learning to Compute Word Embeddings On the Fly
07 Aug 2017 » R-NET - Machine Reading Comprehension with Self-matching Networks
24 Jul 2017 » ReasoNet - Learning to Stop Reading in Machine Comprehension
17 Jul 2017 » Principled Detection of Out-of-Distribution Examples in Neural Networks
09 Jul 2017 » Ask Me Anything - Dynamic Memory Networks for Natural Language Processing
01 Jul 2017 » One Model To Learn Them All
26 Jun 2017 » Two/Too Simple Adaptations of Word2Vec for Syntax Problems
17 Jun 2017 » A Decomposable Attention Model for Natural Language Inference
03 Jun 2017 » A Fast and Accurate Dependency Parser using Neural Networks
23 May 2017 » Neural Module Networks
14 May 2017 » Making the V in VQA Matter - Elevating the Role of Image Understanding in Visual Question Answering
07 May 2017 » Conditional Similarity Networks
28 Apr 2017 » Simple Baseline for Visual Question Answering
27 Apr 2017 » VQA - Visual Question Answering

Papers I Read Notes and Summaries
