gammamodels.github.io - Gamma-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction

Description: Gamma-models are predictive models with an infinite probabilistic horizon, trained using a generative adaptation of temporal difference learning.

models (3482) deep (573) representation (440) gamma (172) successor (3) michael janner (1) igor mordatch (1) sergey levine (1) gamma models (1)

Example domain paragraphs

If you cannot access YouTube, please download our video here .

Summary We train predictive models of environment dynamics with infinite probabilistic horizons using a generative adaptation of temporal difference learning. The resulting gamma-model is a continuous, generative analogue of the successor representation and a hybrid between model-free and model-based mechanisms. Like a value function, it contains information about the long-term future; like a standard predictive model, it is independent of reward.

Gamma-model rollouts Replacing single-step models with gamma-models leads to generalizations of the procedures that form the foundation of model-based control. Generalized rollouts have a negative binomial distribution over time per model step. The first step has a geometric distribution from the special case of NegBinom(1, p ) = Geom(1 – p ) .

Links to gammamodels.github.io (2)