TorchActorCritic

class maze.core.agent.torch_actor_critic.TorchActorCritic(policy: maze.core.agent.torch_policy.TorchPolicy, critic: Union[maze.core.agent.torch_state_critic.TorchStateCritic, maze.core.agent.torch_state_action_critic.TorchStateActionCritic], device: str)

Encapsulates a structured torch policy and critic for training actor-critic algorithms in structured environments.

Parameters
  • policy – A structured torch policy for training in structured environments.

  • critic – A structured torch critic for training in structured environments.

  • device – Device the model (networks) should reside on ("cpu" or "cuda").

compute_actor_critic_output(record: maze.core.trajectory_recording.records.structured_spaces_record.StructuredSpacesRecord, temperature: float = 1.0) → Tuple[maze.core.agent.torch_policy_output.PolicyOutput, maze.core.agent.state_critic_input_output.StateCriticOutput]

Computes the policy and critic output in a single pass, managing the sub-steps, the individual critic types, and shared network embeddings.

Parameters
  • record – The StructuredSpacesRecord holding the observation and actor ids.

  • temperature – (Optional) The temperature used for initializing the probability distribution of the action heads.

Returns

A tuple of the policy and critic output.
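To illustrate what "policy and critic output in one go" means, here is a hedged sketch of the pattern: one loop over the sub-step observations in a record, producing a temperature-scaled action distribution and a state value per sub-step. The function and parameter names (`record` as a plain list, `policy_net`/`value_net` as callables) are assumptions for the sketch, not the actual Maze signatures:

```python
import math


def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: higher temperature flattens the distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]


def compute_actor_critic_output(record, policy_net, value_net, temperature=1.0):
    """Sketch: for each sub-step observation in `record`, build the action
    probability distribution and the critic's value estimate in one pass."""
    policy_out, critic_out = [], []
    for obs in record:
        logits = policy_net(obs)
        policy_out.append(softmax(logits, temperature))
        critic_out.append(value_net(obs))
    return policy_out, critic_out
```

In the real class the two networks may share embeddings, so computing both outputs in the same pass avoids running the shared encoder twice.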

property device

Implementation of TorchModel.

eval() → None

(overrides TorchModel)

Implementation of TorchModel.

load_state_dict(state_dict: Dict) → None

(overrides TorchModel)

Implementation of TorchModel.

parameters() → List[torch.Tensor]

(overrides TorchModel)

Implementation of TorchModel.

state_dict() → Dict

(overrides TorchModel)

Implementation of TorchModel.

to(device: str)

(overrides TorchModel)

Implementation of TorchModel.

train() → None

(overrides TorchModel)

Implementation of TorchModel.