Synchronizes policy updates and other information across actors on the local machine.
Used for dummy and sub-process distribution scenarios.
All actor workers can read the BroadcastingContainer object to update their policy, and the main thread can write to it to publish the updated policy from the learner.
policy_state_dict() → Dict
Return the current policy state dict.
policy_version() → int
Return the current policy version number (to check whether fetching a new state dict is necessary).
set_policy_state_dict(state_dict: Dict) → NoReturn
Store a new version of the policy state dict.
Parameters: state_dict – New state dict to store.
set_stop_flag() → NoReturn
Signal to the workers to exit after they finish the current rollout.
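The methods above suggest a simple polling pattern: an actor compares its local version number against policy_version() and fetches the state dict only when the two differ. The following is a minimal sketch of that pattern, not the library's implementation. SketchContainer is a hypothetical stand-in that only mirrors the documented method names; the version increment on every write, the lock, the stop_requested() reader, and the actor_step helper are all assumptions made for illustration.

```python
import threading
from typing import Dict, Optional, Tuple


class SketchContainer:
    """Hypothetical stand-in that mirrors the documented method names."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._state_dict: Optional[Dict] = None
        self._version = 0
        self._stop = False

    def policy_state_dict(self) -> Optional[Dict]:
        with self._lock:
            return self._state_dict

    def policy_version(self) -> int:
        with self._lock:
            return self._version

    def set_policy_state_dict(self, state_dict: Dict) -> None:
        with self._lock:
            self._state_dict = state_dict
            self._version += 1  # assumption: every write bumps the version

    def set_stop_flag(self) -> None:
        with self._lock:
            self._stop = True

    def stop_requested(self) -> bool:
        # Assumed reader for the stop flag; not part of the listing above.
        with self._lock:
            return self._stop


def actor_step(container: SketchContainer, local_version: int,
               local_policy: Optional[Dict]) -> Tuple[int, Optional[Dict]]:
    """One actor iteration: refetch the policy only if the version changed."""
    shared_version = container.policy_version()
    if shared_version != local_version:
        local_policy = container.policy_state_dict()
        local_version = shared_version
    return local_version, local_policy


container = SketchContainer()
container.set_policy_state_dict({"w": 1.0})               # learner publishes
version, policy = actor_step(container, 0, None)          # actor fetches
version, policy = actor_step(container, version, policy)  # unchanged, no fetch
container.set_policy_state_dict({"w": 2.0})               # learner publishes again
version, policy = actor_step(container, version, policy)  # actor fetches again
container.set_stop_flag()                                 # ask workers to exit
```

In this sketch the version check and the fetch are two separate calls, so a write that lands between them leaves the actor with a newer policy than the version it recorded. The actor then refetches once on its next step, which is harmless for a polling loop of this kind.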