Single-robot Gymnasium Environment
- class enki_env.EnkiEnv(scenario: Scenario, config: GroupConfig, name: str = '', time_step: float = 0.1, max_duration: float = -1, physics_substeps: int = 3, render_mode: str | None = None, render_fps: float = 10.0, render_kwargs: dict[str, Any] = {}, notebook: bool | None = None, success_info: bool = True, default_success: bool | None = None)
Bases: SingleAgentEnv[str, dict[str, ndarray[tuple[Any, …], dtype[float64]]], ndarray[tuple[Any, …], dtype[float64]]]

A gymnasium.Env that exposes a single robot in a pyenki.World.

Internally, it creates an enki_env.ParallelEnkiEnv with a single group composed of the robot, and then forwards to it methods like gymnasium.Env.reset(), gymnasium.Env.step(), and gymnasium.Env.render().

Observations, rewards, and information returned by gymnasium.Env.reset() and gymnasium.Env.step() are generated from the robot sensors and internal state using enki_env.GroupConfig.observation, enki_env.GroupConfig.reward, and enki_env.GroupConfig.info. Termination criteria are specified in enki_env.GroupConfig.terminations. Actions are actuated according to enki_env.GroupConfig.action.

Rendering is performed:
- by a pyenki.viewer.WorldView if render_mode="human" and we are not in a Jupyter notebook.
- by a pyenki.buffer.EnkiRemoteFrameBuffer if render_mode="human" and we are in a Jupyter notebook.
- by pyenki.viewer.render() if render_mode="rgb_array".
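The selection rules above can be summarized as a small decision function. This is only an illustrative sketch: `select_renderer` is a hypothetical helper, not part of enki_env, and it returns the documented class or function names as strings instead of instantiating anything.

```python
from __future__ import annotations


def select_renderer(render_mode: str | None, in_notebook: bool) -> str | None:
    """Return the name of the pyenki renderer that the rules above
    associate with a render mode (hypothetical helper, illustration only)."""
    if render_mode is None:
        return None  # no rendering requested
    if render_mode == "rgb_array":
        return "pyenki.viewer.render"
    if render_mode == "human":
        if in_notebook:
            return "pyenki.buffer.EnkiRemoteFrameBuffer"
        return "pyenki.viewer.WorldView"
    raise ValueError(f"unsupported render_mode: {render_mode!r}")
```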
The environment is registered under id "Enki". To create an environment, you first need to:
- define a scenario with at least one robot, e.g.

    import enki_env
    import pyenki

    class MyScenario(enki_env.BaseScenario):

        def init(self, world: pyenki.World) -> None:
            robot = pyenki.Thymio2()
            robot.angle = world.random_generator.uniform(-1, 1)
            world.add_object(robot)

- define a configuration, e.g., the default configuration associated with the robot

    config = enki_env.ThymioConfig()

Then, you can call the factory function, customizing the other parameters as you see fit:
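    import gymnasium

    # gymnasium.make() forwards only keyword arguments to the environment
    # constructor, so scenario and config are passed by name.
    env = gymnasium.make("Enki", scenario=MyScenario(), config=config, max_duration=10)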
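Once created, the environment is driven through the standard Gymnasium `reset()`/`step()` protocol. The sketch below shows such a loop as a function that accepts any object following that protocol. `run_episode`, `StubEnv`, and the constant-action policy are hypothetical names introduced here; `StubEnv` merely stands in for an `EnkiEnv` so that the snippet runs without pyenki installed, and its observations and rewards are made up.

```python
from __future__ import annotations

from typing import Any, Callable


def run_episode(env: Any, policy: Callable[[Any], Any], seed: int = 0,
                max_steps: int = 1000) -> tuple[float, int]:
    """Run one episode using the Gymnasium reset/step protocol.

    Returns the accumulated reward and the number of steps performed.
    """
    obs, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps


class StubEnv:
    """Hypothetical stand-in for an EnkiEnv: truncates after 5 steps."""

    def reset(self, seed=None, options=None):
        self._t = 0
        return {"dummy": 0.0}, {}

    def step(self, action):
        self._t += 1
        truncated = self._t >= 5
        # observation, reward, terminated, truncated, info
        return {"dummy": float(self._t)}, 1.0, False, truncated, {}


total, steps = run_episode(StubEnv(), policy=lambda obs: 0.0)
```

With the real package installed, the same function could be called on the `env` returned by `gymnasium.make`, with a policy that maps observations to actions valid for the robot configuration.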
- __init__(scenario: Scenario, config: GroupConfig, name: str = '', time_step: float = 0.1, max_duration: float = -1, physics_substeps: int = 3, render_mode: str | None = None, render_fps: float = 10.0, render_kwargs: dict[str, Any] = {}, notebook: bool | None = None, success_info: bool = True, default_success: bool | None = None) → None
Constructs a new instance. It takes the same arguments as enki_env.ParallelEnkiEnv but referring to a single robot.
- Parameters:
scenario – The scenario that generates worlds at gymnasium.Env.reset().
config – The configuration for the group containing the robot.
name – The name of the robot.
time_step – The time step of the simulation [s].
max_duration – The maximum duration of the episodes [s].
physics_substeps – The number of physics sub-steps for each simulation step, see pyenki.World.step().
render_mode – The render mode (one of None, "rgb_array", or "human").
render_fps – The render fps (only relevant when render_mode="human").
render_kwargs – The render keyword arguments forwarded to pyenki.viewer.render() when rendering an environment.
notebook – Whether to use a notebook-compatible renderer. If None, one is selected when running in a notebook.
success_info – Whether to include key "is_success" in the final info dictionary for each robot. It will be included only if it has been set by one of enki_env.GroupConfig.terminations or if default_success is not None.
default_success – The value associated with "is_success" in the final info dictionary when, at the end of the episode, the robot has not yet terminated.
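The interaction between `success_info`, `default_success`, and the termination criteria can be sketched as follows. `final_info` and its `termination_success` argument are hypothetical; the function restates the rules described above in plain Python and is an assumption about the behaviour, not the library's implementation.

```python
from __future__ import annotations

from typing import Any


def final_info(info: dict[str, Any], success_info: bool,
               termination_success: bool | None,
               default_success: bool | None) -> dict[str, Any]:
    """Add "is_success" to a copy of the final info dictionary.

    termination_success is the value set by a termination criterion,
    or None if the robot had not terminated when the episode ended.
    """
    result = dict(info)
    if not success_info:
        return result  # the key is never added
    if termination_success is not None:
        result["is_success"] = termination_success
    elif default_success is not None:
        result["is_success"] = default_success
    return result
```

Under these assumptions, an episode that reaches `max_duration` without any termination criterion firing reports `"is_success"` only when `default_success` has been set.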
- property config: GroupConfig
The robot configuration.
- display_in_notebook() → None
Displays the environment in a notebook using an interactive pyenki.buffer.EnkiRemoteFrameBuffer.
Requires render_mode="human" and a notebook.
- make_world(policy: Predictor | None = None, seed: int = 0, deterministic: bool = True, cutoff: float = 0) → pyenki.World
Generates a world using the scenario and, if specified, assigns a policy to the robot controller.
- Parameters:
policy – The policy to assign to the robot controller, if any.
seed – The random seed.
deterministic – Whether to evaluate the policy deterministically.
cutoff – Actions whose absolute value is below this threshold are set to zero.
- Returns:
The world.
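The `cutoff` parameter acts as a dead zone on the actions. A minimal sketch of that thresholding, assuming actions are flat sequences of floats and that values exactly at the threshold are kept (`apply_cutoff` is a hypothetical helper, not an enki_env function):

```python
from __future__ import annotations


def apply_cutoff(action: list[float], cutoff: float = 0.0) -> list[float]:
    """Zero every action component whose absolute value is below cutoff."""
    return [0.0 if abs(a) < cutoff else a for a in action]


filtered = apply_cutoff([0.05, -0.2, 0.1, -0.01], cutoff=0.1)
```

With the default `cutoff=0`, no component satisfies the strict comparison, so actions pass through unchanged.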
- rollout(policy: Predictor | None = None, max_steps: int = -1, seed: int = 0, deterministic: bool = True, cutoff: float = 0) → Rollout
Performs the rollout of an episode.
- Parameters:
policy – The policy to apply; if not provided, it will randomly generate actions.
max_steps – The maximum number of steps to perform.
seed – The random seed.
deterministic – Whether to evaluate the policy deterministically.
cutoff – When the absolute value of actions is below this threshold, they will be set to zero.
- Returns:
The data collected during the rollout.