The dm_control software package is a collection of Python libraries and task suites for reinforcement learning agents in an articulated-body simulation. A MuJoCo wrapper provides convenient bindings to functions and data structures. The PyMJCF and Composer libraries enable procedural model manipulation and task authoring. The Control Suite is a fixed set of tasks with standardised structure, intended to serve as performance benchmarks. The Locomotion framework provides high-level abstractions and examples of locomotion tasks. A set of configurable manipulation tasks with a robot arm and snap-together bricks is also included.
dm_control is publicly available at https://github.com/deepmind/dm_control
A public Colab notebook with a tutorial for dm_control is also available.
Exploiting MuJoCo’s support for naming all model elements, we allow strings to index and slice into arrays. Instead of relying on obscure, fragile numerical indexing, you can look up array entries by element name, leading to a much more robust, readable codebase.
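As a minimal sketch of the contrast between numerical and named indexing, the snippet below reads the height of a geom both ways. The two-link arm model and its element names (including "fingertip") are made up for illustration; the calls used are `mujoco.Physics.from_xml_string`, `physics.model.name2id` and the `physics.named.data` view from the MuJoCo wrapper.

```python
# A tiny hypothetical two-link arm; all element names are illustrative only.
ARM_XML = """
<mujoco>
  <worldbody>
    <body name="upper_arm" pos="0 0 1">
      <joint name="shoulder" type="hinge" axis="0 1 0"/>
      <geom name="upper_arm_geom" type="capsule" fromto="0 0 0 0.3 0 0" size="0.02"/>
      <body name="forearm" pos="0.3 0 0">
        <joint name="elbow" type="hinge" axis="0 1 0"/>
        <geom name="forearm_geom" type="capsule" fromto="0 0 0 0.3 0 0" size="0.02"/>
        <geom name="fingertip" type="sphere" pos="0.3 0 0" size="0.03"/>
      </body>
    </body>
  </worldbody>
</mujoco>
"""


def fingertip_height():
    """Returns the fingertip height read by numeric index and by name."""
    # Imported lazily so that defining this sketch does not require MuJoCo.
    from dm_control import mujoco

    physics = mujoco.Physics.from_xml_string(ARM_XML)

    # Numeric indexing: the row depends on the order of geoms in the model,
    # so it silently breaks if a geom is added or removed.
    geom_id = physics.model.name2id('fingertip', 'geom')
    by_index = physics.data.geom_xpos[geom_id, 2]

    # Named indexing: row selected by geom name, column by axis label.
    by_name = physics.named.data.geom_xpos['fingertip', 'z']
    return by_index, by_name
```

Calling `fingertip_height()` requires dm_control and MuJoCo to be installed; both returned values should refer to the same array entry.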
The PyMJCF library creates a Python object hierarchy with 1 : 1 correspondence to a MuJoCo model. It introduces the attach() method which allows models to be attached to one another. For example, in our tutorial we create procedural multi-legged creatures by attaching legs to bodies and creatures to the scene.
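The attachment mechanism can be sketched roughly as follows. This is not the tutorial's code: the creature and leg layout, sizes and names are assumptions chosen for brevity. Each leg is its own `mjcf.RootElement`, attached to a site on the torso with `attach()`.

```python
import math


def build_creature(num_legs=4):
    """Builds a hypothetical multi-legged model by attaching legs to a torso."""
    # Imported lazily so that defining this sketch does not require MuJoCo.
    from dm_control import mjcf

    creature = mjcf.RootElement(model='creature')
    torso = creature.worldbody.add('body', name='torso')
    torso.add('geom', name='torso_geom', type='sphere', size=[0.1])

    for i in range(num_legs):
        # Each leg is a separate model with a unique model name.
        leg = mjcf.RootElement(model='leg_{}'.format(i))
        thigh = leg.worldbody.add('body', name='thigh')
        thigh.add('joint', name='hip', type='hinge', axis=[0, 1, 0])
        thigh.add('geom', name='thigh_geom', type='capsule',
                  fromto=[0, 0, 0, 0.2, 0, 0], size=[0.02])

        # Place an attachment site on the torso surface, rotated so the leg
        # points outwards (MJCF angles default to degrees).
        angle = 2 * math.pi * i / num_legs
        site = torso.add(
            'site', name='hip_site_{}'.format(i),
            pos=[0.1 * math.cos(angle), 0.1 * math.sin(angle), 0],
            euler=[0, 0, math.degrees(angle)])
        site.attach(leg)

    return creature
```

With dm_control installed, `build_creature().to_xml_string()` returns the combined MJCF, in which the elements of each attached leg are namespaced by the leg's model name, and `mjcf.Physics.from_mjcf_model(build_creature())` compiles it for simulation.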
Composer is the “game engine” framework, which defines a particular order of runtime function calls, and abstracts the affordances of reward, termination and observation. These abstractions allowed us to create useful submodules:
composer.Observable: An abstract observation wrapper which can add noise, delays, buffering and filtering to any sensor.
composer.Variation: A set of tools for randomising simulation quantities, allowing for agent robustification and sim-to-real via model variation.
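To make the two ideas concrete, here is a toy illustration in plain Python. It does not use or reproduce the Composer API: the class names `DelayedNoisyObservable` and `UniformVariation` and their parameters are hypothetical, and only show what wrapping a sensor with noise and delay, or randomising a simulation quantity, might look like.

```python
import collections
import random


class DelayedNoisyObservable:
    """Toy observation wrapper: adds Gaussian noise and a fixed delay."""

    def __init__(self, sensor_fn, noise_std=0.0, delay=0, rng=None):
        self._sensor_fn = sensor_fn
        self._noise_std = noise_std
        self._rng = rng or random.Random()
        # Holds the most recent `delay + 1` readings; the oldest is returned.
        self._buffer = collections.deque(maxlen=delay + 1)

    def read(self):
        value = self._sensor_fn()
        self._buffer.append(value + self._rng.gauss(0.0, self._noise_std))
        # Until the buffer fills up, the first reading is repeated.
        return self._buffer[0]


class UniformVariation:
    """Toy variation: draws a fresh value from [low, high] on each call."""

    def __init__(self, low, high, rng=None):
        self._low = low
        self._high = high
        self._rng = rng or random.Random()

    def __call__(self):
        return self._rng.uniform(self._low, self._high)


# Usage: a noiseless sensor delayed by two steps, and a randomised friction.
_counter = iter(range(10))
delayed = DelayedNoisyObservable(lambda: float(next(_counter)), delay=2)
readings = [delayed.read() for _ in range(5)]
friction = UniformVariation(0.5, 1.5, rng=random.Random(0))
```

Keeping the noise and delay in a wrapper, rather than in the sensor itself, means the same corruption can be applied to any sensor, which is the point of the abstraction described above.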
The Locomotion framework introduces the following abstractions:
Walker: A controllable entity with common locomotion-related methods, like projection of vectors into an egocentric frame.
Arena: A self-scaling randomised scene, in which the walker can be placed and given a task to perform.
For example, using just four function calls, we can instantiate a humanoid walker and a WallsCorridor arena, and combine them in a RunThroughCorridor task.
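A sketch of those four calls is shown below. The module paths follow the dm_control locomotion package layout, and every constructor is called with its default arguments, which is an assumption made for brevity; the `CMUHumanoid` walker is one possible choice of humanoid.

```python
def build_corridor_env():
    """Builds a run-through-corridor environment with default settings."""
    # Imported lazily so that defining this sketch does not require MuJoCo.
    from dm_control import composer
    from dm_control.locomotion.arenas import corridors as corridor_arenas
    from dm_control.locomotion.tasks import corridors as corridor_tasks
    from dm_control.locomotion.walkers import cmu_humanoid

    # 1. A controllable humanoid walker.
    walker = cmu_humanoid.CMUHumanoid()
    # 2. A corridor arena with walls to run around.
    arena = corridor_arenas.WallsCorridor()
    # 3. A task that places the walker in the arena and defines the reward.
    task = corridor_tasks.RunThroughCorridor(walker=walker, arena=arena)
    # 4. A Composer environment that exposes the task to an agent.
    return composer.Environment(task=task)
```

With dm_control installed, `env = build_corridor_env()` followed by `env.reset()` returns the first timestep, and `env.action_spec()` describes the actions the agent can take.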
Video: a fast-paced montage of dm_control-based tasks from DeepMind.