[GitHub Code for this Chapter: chapter2_3 branch](https://github.com/Lab-of-AI-and-Robotics/IsaacLab-Tutorial/tree/chapter2_3)
In this chapter, we deconstruct the `ManagerBasedRLEnvCfg` class, the declarative "blueprint" that defines our entire reinforcement learning task. We delve into the most critical file in the project: the environment configuration (`_env_cfg.py`). By examining the example generated for the default `test` project, we will gain a deep understanding of how to define a scene and how the managers for actions, observations, rewards, and events are configured.
## The `_env_cfg.py` File
In the last chapter, we established that the Manager-Based workflow separates configuration (what to do) from implementation (how to do it). We now shift our focus to that configuration.
Navigate to your project's task folder: `.../test/source/test/test/tasks/manager_based/test/`. Inside, you'll find `test_env_cfg.py`. Think of this file as a detailed recipe. It tells Isaac Lab's "engine" (`ManagerBasedRLEnv`) exactly what ingredients to use (a robot, a ground plane) and the rules for success (the rewards and terminations). Our job as researchers is primarily to be the "chef" who fine-tunes this recipe.
Let's dissect the `TestEnvCfg` class to understand its main components.
### The `TestEnvCfg` Class

Opening `test_env_cfg.py` reveals the main `TestEnvCfg` class. This top-level class acts as a container, assembling all the individual configuration modules (`TestSceneCfg`, `ActionsCfg`, etc.) into a single, cohesive definition for our environment. This modular assembly is the core principle of the Manager-Based workflow.
The code below shows how these components are brought together. We will then break down each of these modules one by one.
```python
@configclass
class TestEnvCfg(ManagerBasedRLEnvCfg):
    # Scene settings
    scene: TestSceneCfg = TestSceneCfg(num_envs=4096, env_spacing=4.0)
    # Basic settings
    observations: ObservationsCfg = ObservationsCfg()
    actions: ActionsCfg = ActionsCfg()
    events: EventCfg = EventCfg()
    # MDP settings
    rewards: RewardsCfg = RewardsCfg()
    terminations: TerminationsCfg = TerminationsCfg()

    # Post initialization
    def __post_init__(self) -> None:
        """Post initialization."""
        # general settings
        self.decimation = 2
        self.episode_length_s = 5
        # viewer settings
        self.viewer.eye = (8.0, 0.0, 5.0)
        # simulation settings
        self.sim.dt = 1 / 120
        self.sim.render_interval = self.decimation
```
Notice the `__post_init__` method. This special method runs after all the configuration fields are initialized, allowing us to set global parameters: `decimation` (the number of physics steps simulated per policy action), the episode length `episode_length_s`, and the simulation timestep `sim.dt`.
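These three values jointly determine the control rate and the episode length in steps. A minimal sketch of the arithmetic (plain Python, independent of Isaac Lab):

```python
# Timing arithmetic implied by the settings in __post_init__ above.
physics_dt = 1 / 120                   # self.sim.dt: physics integrates at 120 Hz
decimation = 2                         # physics steps per policy action
control_dt = physics_dt * decimation   # 1/60 s, so the policy acts at 60 Hz
episode_length_s = 5
steps_per_episode = round(episode_length_s / control_dt)
print(control_dt, steps_per_episode)   # ~0.0167 s, 300 policy steps per episode
```

Keep this in mind when tuning: changing `sim.dt` or `decimation` changes how many policy steps fit into a 5-second episode.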
### The `TestSceneCfg` Class

This class defines the physical world. As shown below, `TestSceneCfg` creates a ground plane, spawns the robot from the pre-defined `CARTPOLE_CFG`, and adds a dome light. The parallel-simulation parameters `num_envs` and `env_spacing` are not set here; they are passed in when the scene is instantiated inside the main `TestEnvCfg`.
```python
@configclass
class TestSceneCfg(InteractiveSceneCfg):
    """Configuration for a cart-pole scene."""

    # ground plane
    ground = AssetBaseCfg(
        prim_path="/World/ground",
        spawn=sim_utils.GroundPlaneCfg(size=(100.0, 100.0)),
    )
    # robot
    robot: ArticulationCfg = CARTPOLE_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")
    # lights
    dome_light = AssetBaseCfg(
        prim_path="/World/DomeLight",
        spawn=sim_utils.DomeLightCfg(color=(0.9, 0.9, 0.9), intensity=500.0),
    )
```
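Because `CARTPOLE_CFG` is itself a config object, variations are expressed through further `.replace()` calls rather than by editing the asset. The sketch below is a hypothetical tweak, not part of the tutorial project: it assumes Isaac Lab's stock cart-pole exposes a `cart_to_pole` joint and an `init_state` field whose `joint_pos` dictionary can be overridden, and the 0.1 rad tilt is an illustrative value.

```python
# Hypothetical: spawn each cart-pole with the pole tilted by 0.1 rad so the
# policy never starts exactly at the unstable equilibrium.
robot: ArticulationCfg = CARTPOLE_CFG.replace(
    prim_path="{ENV_REGEX_NS}/Robot",
    init_state=CARTPOLE_CFG.init_state.replace(
        joint_pos={"slider_to_cart": 0.0, "cart_to_pole": 0.1},
    ),
)
```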
### The `ActionsCfg` Class

This class defines the agent's action space. Here, a single `joint_effort` term lets the policy apply a force (effort) to the `slider_to_cart` joint; each raw action from the policy is multiplied by a scale of `100.0` before being applied.
```python
@configclass
class ActionsCfg:
    """Action specifications for the MDP."""

    joint_effort = mdp.JointEffortActionCfg(
        asset_name="robot", joint_names=["slider_to_cart"], scale=100.0
    )
```
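The `scale` here is a design choice: policy networks typically output actions in roughly [-1, 1], so multiplying by 100.0 maps that range to forces strong enough to move the cart. Swapping the action space is likewise a one-line change. As a hedged sketch (it assumes `mdp.JointPositionActionCfg` is paired with actuators configured to track position targets, which the default effort-driven cart-pole is not):

```python
@configclass
class ActionsCfg:
    """Alternative action specification: joint position targets instead of efforts."""

    # Each policy output is interpreted as a target position for the cart joint.
    joint_pos = mdp.JointPositionActionCfg(
        asset_name="robot", joint_names=["slider_to_cart"], scale=1.0
    )
```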