Single-robot Gymnasium Environment

class enki_env.EnkiEnv(scenario: Scenario, config: GroupConfig, name: str = '', time_step: float = 0.1, max_duration: float = -1, physics_substeps: int = 3, render_mode: str | None = None, render_fps: float = 10.0, render_kwargs: dict[str, Any] = {}, notebook: bool | None = None, success_info: bool = True, default_success: bool | None = None)

Bases: SingleAgentEnv[str, dict[str, ndarray[tuple[Any, …], dtype[float64]]], ndarray[tuple[Any, …], dtype[float64]]]

A gymnasium.Env that exposes a single robot in a pyenki.World.

Internally, it creates an enki_env.ParallelEnkiEnv with a single group composed of the robot, and then forwards methods like gymnasium.Env.reset(), gymnasium.Env.step(), and gymnasium.Env.render() to it.
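
The forwarding pattern described above can be illustrated with a small, self-contained sketch. It is not the library's implementation: ToyParallelEnv and ToySingleEnv are hypothetical stand-ins that only assume the parallel environment returns one dictionary entry per robot, keyed by name.

```python
from __future__ import annotations

from typing import Any


class ToyParallelEnv:
    """Hypothetical stand-in for a parallel (multi-robot) environment.

    This is NOT enki_env.ParallelEnkiEnv; it only mimics a dict-per-agent
    convention so that the forwarding pattern can be run in isolation.
    """

    def __init__(self, agent: str) -> None:
        self.agent = agent
        self.steps = 0

    def reset(self, seed: int | None = None):
        self.steps = 0
        return {self.agent: [0.0]}, {self.agent: {}}

    def step(self, actions: dict[str, Any]):
        self.steps += 1
        obs = {self.agent: [float(self.steps)]}
        rewards = {self.agent: -abs(actions[self.agent])}
        terminated = {self.agent: self.steps >= 3}
        truncated = {self.agent: False}
        infos = {self.agent: {}}
        return obs, rewards, terminated, truncated, infos


class ToySingleEnv:
    """Exposes the only agent of a parallel environment as a single agent."""

    def __init__(self, agent: str = "robot") -> None:
        self._agent = agent
        self._penv = ToyParallelEnv(agent)

    def reset(self, seed: int | None = None):
        obs, infos = self._penv.reset(seed=seed)
        return obs[self._agent], infos[self._agent]

    def step(self, action: Any):
        # Wrap the action in a dict, forward it, then unwrap each returned dict.
        results = self._penv.step({self._agent: action})
        return tuple(result[self._agent] for result in results)


env = ToySingleEnv()
first_obs, first_info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(0.5)
```

The single-robot wrapper holds no simulation state of its own; it only translates between per-robot dictionaries and plain values.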

Observations, rewards, and information returned by gymnasium.Env.reset() and gymnasium.Env.step() are generated from the robot's sensors and internal state using enki_env.GroupConfig.observation, enki_env.GroupConfig.reward, and enki_env.GroupConfig.info. Termination criteria are specified in enki_env.GroupConfig.terminations. Actions are actuated according to enki_env.GroupConfig.action.

Rendering is performed:

The environment is registered under id "Enki". To create an environment, you need to first

  1. define a scenario with at least one robot, e.g.

    import enki_env
    import pyenki
    
    class MyScenario(enki_env.BaseScenario):
    
        def init(self, world: pyenki.World) -> None:
            robot = pyenki.Thymio2()
            robot.angle = world.random_generator.uniform(-1, 1)
            world.add_object(robot)
    
  2. define a configuration, e.g., the default configuration associated with the robot

    config = enki_env.ThymioConfig()
    

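Then, you can call the factory function, customizing the other parameters as you see fit:

    import gymnasium

    # gymnasium.make() forwards only keyword arguments to the environment
    # constructor; its second positional parameter is max_episode_steps.
    env = gymnasium.make("Enki", scenario=MyScenario(), config=config, max_duration=10)
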
__init__(scenario: Scenario, config: GroupConfig, name: str = '', time_step: float = 0.1, max_duration: float = -1, physics_substeps: int = 3, render_mode: str | None = None, render_fps: float = 10.0, render_kwargs: dict[str, Any] = {}, notebook: bool | None = None, success_info: bool = True, default_success: bool | None = None) → None

Constructs a new instance. It takes the same arguments as enki_env.ParallelEnkiEnv, but they refer to a single robot.

Parameters:
  • scenario – The scenario that generates worlds at gymnasium.Env.reset().

  • config – The configuration for the group containing the robot.

  • name – The name of the robot.

  • time_step – The time step of the simulation [s].

  • max_duration – The maximum duration of the episodes [s].

  • physics_substeps – The number of physics sub-steps for each simulation step, see pyenki.World.step().

  • render_mode – The render mode (one of None, "rgb_array", or "human").

  • render_fps – The render fps (only relevant when render_mode="human").

  • render_kwargs – The keyword arguments forwarded to pyenki.viewer.render() when rendering an environment.

  • notebook – Whether to use a notebook-compatible renderer. If None, a notebook-compatible renderer is selected automatically when running in a notebook.

  • success_info – Whether to include key "is_success" in the final info dictionary for each robot. It will be included only if it has been set by one of enki_env.GroupConfig.terminations or if default_success is not None.

  • default_success – The value associated with "is_success" in the final info dictionary when, at the end of the episode, the robot has not yet been terminated.
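
The interaction between success_info and default_success can be summarized with a small sketch. The helper final_info and its success argument are hypothetical names introduced here for illustration; the sketch only encodes the rule stated in the parameter descriptions and is not the library's code.

```python
from __future__ import annotations

from typing import Any


def final_info(
    info: dict[str, Any],
    success: bool | None,
    success_info: bool = True,
    default_success: bool | None = None,
) -> dict[str, Any]:
    """Returns a copy of `info`, adding "is_success" when applicable.

    `success` is the outcome set by a termination criterion,
    or None if no criterion has set one by the end of the episode.
    """
    result = dict(info)
    if not success_info:
        # The key is never added when success_info is disabled.
        return result
    if success is None:
        # Fall back to the default for robots that were not terminated.
        success = default_success
    if success is not None:
        result["is_success"] = success
    return result


set_by_termination = final_info({}, success=True)
no_default = final_info({}, success=None)
with_default = final_info({}, success=None, default_success=False)
disabled = final_info({}, success=True, success_info=False)
```

Under these assumptions, a robot that reaches the end of an episode without being terminated reports "is_success" only if default_success was given.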

property config: GroupConfig

The robot configuration.

display_in_notebook() → None

Displays the environment in a notebook using an interactive pyenki.buffer.EnkiRemoteFrameBuffer.

Requires render_mode="human" and a notebook.

make_world(policy: Predictor | None = None, seed: int = 0, deterministic: bool = True, cutoff: float = 0) → pyenki.World

Generates a world using the scenario and, if specified, assigns a policy to the robot controller.

Parameters:
  • policy – The policy to assign to the robot controller.

  • seed – The random seed.

  • deterministic – Whether to evaluate the policy deterministically.

  • cutoff – When the absolute value of actions is below this threshold, they will be set to zero.

Returns:

The world.
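
The effect of the cutoff parameter can be sketched as follows. The helper apply_cutoff is a hypothetical name, and plain Python lists are used instead of NumPy arrays to keep the example self-contained. It assumes a strict comparison ("below this threshold"), so the default cutoff=0 leaves actions unchanged.

```python
from __future__ import annotations


def apply_cutoff(action: list[float], cutoff: float = 0.0) -> list[float]:
    """Sets to zero every component whose absolute value is below `cutoff`."""
    return [0.0 if abs(a) < cutoff else a for a in action]


filtered = apply_cutoff([0.05, -0.02, 0.5, -1.0], cutoff=0.1)
unchanged = apply_cutoff([0.05, -0.02, 0.5, -1.0])
```

A small positive cutoff acts as a dead zone: it suppresses near-zero outputs from a policy that would otherwise keep the actuators jittering.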

rollout(policy: Predictor | None = None, max_steps: int = -1, seed: int = 0, deterministic: bool = True, cutoff: float = 0) → Rollout

Performs the rollout of an episode.

Parameters:
  • policy – The policy to apply; if not provided, it will randomly generate actions.

  • max_steps – The maximum number of steps to perform.

  • seed – The random seed.

  • deterministic – Whether to evaluate the policy deterministically.

  • cutoff – When the absolute value of actions is below this threshold, they will be set to zero.

Returns:

The data collected during the rollout.
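
To make the rollout semantics concrete, here is a minimal sketch of such a loop over a toy environment that follows the Gymnasium reset/step convention. Everything in it is an assumption made for illustration: CountdownEnv is a hypothetical environment, the returned dictionary is not the library's Rollout type, and a negative max_steps is assumed to mean "no step limit".

```python
from __future__ import annotations

import random
from typing import Callable


class CountdownEnv:
    """Hypothetical toy environment with Gymnasium-style reset() and step()."""

    def __init__(self, horizon: int = 5) -> None:
        self.horizon = horizon
        self.t = 0

    def reset(self, seed: int | None = None):
        self.t = 0
        return float(self.horizon), {}

    def step(self, action: float):
        self.t += 1
        obs = float(self.horizon - self.t)
        reward = -abs(action)
        terminated = self.t >= self.horizon
        return obs, reward, terminated, False, {}


def rollout(
    env: CountdownEnv,
    policy: Callable[[float], float] | None = None,
    max_steps: int = -1,
    seed: int = 0,
) -> dict[str, list[float]]:
    """Runs one episode and collects observations, actions, and rewards."""
    rng = random.Random(seed)
    obs, _ = env.reset(seed=seed)
    data: dict[str, list[float]] = {"observation": [obs], "action": [], "reward": []}
    steps = 0
    # Assumption: a negative max_steps means that there is no step limit.
    while max_steps < 0 or steps < max_steps:
        # Without a policy, actions are sampled at random.
        action = policy(obs) if policy is not None else rng.uniform(-1, 1)
        obs, reward, terminated, truncated, _ = env.step(action)
        data["observation"].append(obs)
        data["action"].append(action)
        data["reward"].append(reward)
        steps += 1
        if terminated or truncated:
            break
    return data


full = rollout(CountdownEnv(5), policy=lambda obs: 0.5)
short = rollout(CountdownEnv(5), policy=lambda obs: 0.5, max_steps=2)
random_a = rollout(CountdownEnv(5), seed=1)
random_b = rollout(CountdownEnv(5), seed=1)
```

Seeding both the environment and the action sampler is what makes two rollouts with the same seed comparable.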