Maze documentation

StateCriticComposer

maze.perception.models.critics.StateCriticComposer

alias of StepStateCriticComposer
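
To illustrate what "alias of" means here, the following is a minimal sketch using a stand-in class rather than the real Maze implementation. The constructor argument `networks` and the value `"net_step_0"` are hypothetical and only serve to make the example runnable; the actual signature of StepStateCriticComposer is not shown on this page. The point is only that an alias is a second name bound to the same class object, so no separate subclass or wrapper is involved.

```python
# Stand-in class: NOT the real Maze implementation, just an illustration.
class StepStateCriticComposer:
    """Hypothetical placeholder for the class the alias points to."""

    def __init__(self, networks):
        # 'networks' is an assumed argument used only for this sketch.
        self.networks = networks


# An alias is simply a second name bound to the same class object.
StateCriticComposer = StepStateCriticComposer

# Instantiating through the alias creates an instance of the original class.
composer = StateCriticComposer(networks=["net_step_0"])

print(StateCriticComposer is StepStateCriticComposer)   # True
print(type(composer).__name__)                          # StepStateCriticComposer
print(isinstance(composer, StepStateCriticComposer))    # True
```

Under this reading, whichever of the two names is used, the resulting objects are instances of StepStateCriticComposer, and its documentation is the place to look for constructor arguments and behavior.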

Copyright © 2025, EnliteAI GmbH