Types#

protocol enki_env.types.Termination#

A criterion to decide the success/failure of an episode for a given robot. Should return True for success, False for failure, and None if not yet decided.

For example, for a task where a robot must travel along the positive x-direction, we may declare failure if it exits a narrow band around the x-axis and success once it has travelled far enough:

def my_criterion(robot: pyenki.DifferentialWheeled
                 ) -> bool | None:
    if robot.position[0] > 100:
        return True
    if abs(robot.position[1]) > 10:
        return False
    return None

Classes that implement this protocol must have the following methods / attributes:

__call__(robot: pyenki.DifferentialWheeled) bool | None#

Decides if the episode should terminate for a given robot

Parameters:

robot – The robot

Returns:

True to terminate with success, False to terminate with failure, None to not terminate.

type enki_env.types.Array = numpy.typing.NDArray[numpy.float64]#

Floating-point array.

type enki_env.types.BoolArray = numpy.typing.NDArray[numpy.bool_]#

Boolean array.

type enki_env.types.Observation = dict[str, numpy.typing.NDArray[numpy.float64]]#

enki_env environments use dictionaries of floating-point arrays as observations.

type enki_env.types.Action = numpy.typing.NDArray[numpy.float64]#

enki_env environments use floating-point arrays as actions.
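As an illustration, an observation is a plain dictionary of float64 arrays and an action is a single float64 array. The key names and shapes below are made up for the example, not taken from any actual enki_env environment:

```python
import numpy as np

# Hypothetical observation: the keys ("position", "scan") and shapes
# are illustrative, not the actual keys of any enki_env environment.
observation: dict[str, np.ndarray] = {
    "position": np.array([0.0, 0.0], dtype=np.float64),
    "scan": np.zeros(8, dtype=np.float64),
}

# A hypothetical action: a single float64 array, e.g. two wheel speeds.
action: np.ndarray = np.array([0.5, -0.5], dtype=np.float64)
```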

type enki_env.types.Info = dict[str, Any]#

Generic info dictionary

type enki_env.types.State = tuple[Array, ...]#

Tuple of state arrays (e.g., the hidden states of a recurrent policy).

type enki_env.types.EpisodeStart = Array#

Array that flags episode starts.

type enki_env.types.PathLike = os.PathLike[str] | str#

Anything that can be converted to a file path

protocol enki_env.types.Predictor#

This class describes the predictor protocol.

Same as stable_baselines3.common.type_aliases.PolicyPredictor, included here to be self-contained.

Classes that implement this protocol must have the following methods / attributes:

property action_space: gym.Space[Any]#
property observation_space: gym.Space[Any]#
predict(observation: Observation, state: State | None = None, episode_start: EpisodeStart | None = None, deterministic: bool = False) tuple[Action, State | None]#

Get the policy action from an observation (and optional hidden state). Includes sugar-coating to handle different observations (e.g. normalizing images).

Parameters:
  • observation – the input observation

  • state – The last hidden states (can be None, used in recurrent policies)

  • episode_start – The last masks (can be None, used in recurrent policies); they correspond to the beginnings of episodes, where the hidden states of the RNN must be reset.

  • deterministic – Whether or not to return deterministic actions.

Returns:

the model’s action and the next hidden state (used in recurrent policies)
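Any object with the two space properties and a matching `predict` method satisfies this protocol, so a hand-written policy can stand in wherever a trained stable_baselines3 model is expected. The sketch below is illustrative (the class name is made up, and the space properties are left as plain attributes to keep the sketch dependency-free; a real implementation would return gymnasium spaces):

```python
import numpy as np

class ConstantPredictor:
    """A minimal sketch of the Predictor protocol: always returns the
    same action and passes the hidden state through unchanged."""

    def __init__(self, action: np.ndarray,
                 action_space=None, observation_space=None) -> None:
        self._action = np.asarray(action, dtype=np.float64)
        self.action_space = action_space
        self.observation_space = observation_space

    def predict(self, observation, state=None, episode_start=None,
                deterministic: bool = False):
        # Stateless policy: ignore the observation and return a fixed
        # action together with the (unchanged) hidden state.
        return self._action, state
```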

type enki_env.types.PyTorchObs = torch.Tensor | dict[str, torch.Tensor]#

The type of observations in PyTorch

protocol enki_env.types.PyTorchPolicy#

A protocol for policies implemented in PyTorch.

Classes that implement this protocol must have the following methods / attributes:

__call__(obs: PyTorchObs, deterministic: bool = False) torch.Tensor#

Evaluate the policy

Parameters:
  • obs – The observations

  • deterministic – Whether or not to return deterministic actions.

forward(obs: PyTorchObs, deterministic: bool = False) torch.Tensor#

Evaluate the policy

Parameters:
  • obs – The observations

  • deterministic – Whether or not to return deterministic actions.
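Since `torch.nn.Module.__call__` dispatches to `forward`, subclassing `nn.Module` satisfies both required methods at once. The sketch below is an illustrative assumption (the class name and layer sizes are made up, and it handles only the tensor variant of `PyTorchObs`, not the dict variant):

```python
import torch
from torch import nn

class LinearPolicy(nn.Module):
    """A minimal sketch of the PyTorchPolicy protocol: a single linear
    layer mapping observation tensors to action tensors."""

    def __init__(self, obs_dim: int = 4, act_dim: int = 2) -> None:
        super().__init__()
        self.linear = nn.Linear(obs_dim, act_dim)

    def forward(self, obs: torch.Tensor,
                deterministic: bool = False) -> torch.Tensor:
        # This policy is already deterministic: `deterministic` is
        # accepted to satisfy the protocol but has no effect here.
        return self.linear(obs)
```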