Types#

protocol enki_env.types.Termination#

A criterion to decide the success/failure of an episode for a given robot. Should return True for success, False for failure, and None if not yet decided.

For example, for a task where a robot must travel along the positive x-direction, we may declare failure if it exits a narrow band around the x-axis and success once it has travelled far enough:

def my_criterion(robot: pyenki.DifferentialWheeled
                 ) -> bool | None:
    if robot.position[0] > 100:
        return True
    if abs(robot.position[1]) > 10:
        return False
    return None

Classes that implement this protocol must have the following methods / attributes:

__call__(robot: pyenki.DifferentialWheeled) bool | None#

Decides if the episode should terminate for a given robot

Parameters:

robot – The robot

Returns:

True to terminate with success, False to terminate with failure, None to not terminate.

type enki_env.types.Array = numpy.typing.NDArray[numpy.float64]#

Floating-point array.

type enki_env.types.BoolArray = numpy.typing.NDArray[numpy.bool_]#

Boolean array.

type enki_env.types.Observation = dict[str, numpy.typing.NDArray[numpy.float64]]#

enki_env environments use dictionaries of floating-point arrays as observations.

type enki_env.types.Action = numpy.typing.NDArray[numpy.float64]#

enki_env environments use floating-point arrays as actions.
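As an illustration, an observation is a plain dictionary of float64 arrays and an action is a single float64 array. The key names and shapes below are made up for the example, not taken from any actual enki_env environment:

```python
import numpy as np

# Hypothetical observation: the keys ("position", "scan") and shapes
# are illustrative, not the actual keys of any enki_env environment.
observation: dict[str, np.ndarray] = {
    "position": np.array([0.0, 0.0], dtype=np.float64),
    "scan": np.zeros(8, dtype=np.float64),
}

# A hypothetical action: a single float64 array, e.g. two wheel speeds.
action: np.ndarray = np.array([0.5, -0.5], dtype=np.float64)
```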

type enki_env.types.Info = dict[str, Any]#

Generic info dictionary

type enki_env.types.State = tuple[Array, ...]#

Tuple of state arrays (e.g., the hidden states of a recurrent policy).

type enki_env.types.EpisodeStart = Array#

Array that flags episode starts.

type enki_env.types.PathLike = os.PathLike[str] | str#

Anything that can be converted to a file path

protocol enki_env.types.Predictor#

This class describes the predictor protocol.

Same as stable_baselines3.common.type_aliases.PolicyPredictor, included here to be self-contained.

Classes that implement this protocol must have the following methods / attributes:

property action_space: gym.Space[Any]#
property observation_space: gym.Space[Any]#
predict(observation: Observation, state: State | None = None, episode_start: EpisodeStart | None = None, deterministic: bool = False) tuple[Action, State | None]#

Get the policy action from an observation (and optional hidden state). Includes sugar-coating to handle different observations (e.g. normalizing images).

Parameters:
  • observation – the input observation

  • state – The last hidden states (can be None, used in recurrent policies)

  • episode_start – The last masks (can be None, used in recurrent policies); they correspond to the beginnings of episodes, where the hidden states of the RNN must be reset.

  • deterministic – Whether or not to return deterministic actions.

Returns:

the model’s action and the next hidden state (used in recurrent policies)
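Any object with the two space properties and a matching `predict` method satisfies this protocol, so a hand-written policy can stand in wherever a trained stable_baselines3 model is expected. The sketch below is illustrative (the class name is made up, and the space properties are left as plain attributes to keep the sketch dependency-free; a real implementation would return gymnasium spaces):

```python
import numpy as np

class ConstantPredictor:
    """A minimal sketch of the Predictor protocol: always returns the
    same action and passes the hidden state through unchanged."""

    def __init__(self, action: np.ndarray,
                 action_space=None, observation_space=None) -> None:
        self._action = np.asarray(action, dtype=np.float64)
        self.action_space = action_space
        self.observation_space = observation_space

    def predict(self, observation, state=None, episode_start=None,
                deterministic: bool = False):
        # Stateless policy: ignore the observation and return a fixed
        # action together with the (unchanged) hidden state.
        return self._action, state
```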

type enki_env.types.PyTorchObs = torch.Tensor | dict[str, torch.Tensor]#

The type of observations in PyTorch

protocol enki_env.types.PyTorchPolicy#

A protocol for policies implemented in PyTorch.

Classes that implement this protocol must have the following methods / attributes:

__call__(obs: PyTorchObs, deterministic: bool = False) torch.Tensor#

Evaluate the policy

Parameters:
  • obs – The observations

  • deterministic – Whether or not to return deterministic actions.

forward(obs: PyTorchObs, deterministic: bool = False) torch.Tensor#

Evaluate the policy

Parameters:
  • obs – The observations

  • deterministic – Whether or not to return deterministic actions.
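Since `torch.nn.Module.__call__` dispatches to `forward`, subclassing `nn.Module` satisfies both required methods at once. The sketch below is an illustrative assumption (the class name and layer sizes are made up, and it handles only the tensor variant of `PyTorchObs`, not the dict variant):

```python
import torch
from torch import nn

class LinearPolicy(nn.Module):
    """A minimal sketch of the PyTorchPolicy protocol: a single linear
    layer mapping observation tensors to action tensors."""

    def __init__(self, obs_dim: int = 4, act_dim: int = 2) -> None:
        super().__init__()
        self.linear = nn.Linear(obs_dim, act_dim)

    def forward(self, obs: torch.Tensor,
                deterministic: bool = False) -> torch.Tensor:
        # This policy is already deterministic: `deterministic` is
        # accepted to satisfy the protocol but has no effect here.
        return self.linear(obs)
```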