API Overview

Reference documentation for the Mesozoic Labs environments.

Gymnasium Registration

All environments are registered with the MesozoicLabs namespace:

import gymnasium as gym

# Import to trigger registration
import environments.velociraptor.envs.raptor_env # noqa: F401

env = gym.make("MesozoicLabs/Raptor-v0")

Available environment IDs:

  • MesozoicLabs/Raptor-v0 - Velociraptor
  • MesozoicLabs/Brachio-v0 - Brachiosaurus
  • MesozoicLabs/TRex-v0 - T-Rex

Velociraptor Environment

The main environment for velociraptor locomotion and predatory strike training.

from environments.velociraptor.envs.raptor_env import RaptorEnv

env = RaptorEnv(
    render_mode="human",       # "human" or "rgb_array"
    frame_skip=5,              # Action repeat steps
    max_episode_steps=1000,
    forward_vel_weight=1.0,    # Reward for forward movement
    alive_bonus=0.1,           # Bonus for staying upright
    strike_bonus=10.0,         # Reward for claw-prey contact
)

observation, info = env.reset(seed=42)
action = env.action_space.sample()
obs, reward, terminated, truncated, info = env.step(action)

Observation Space (67 dimensions)

| Component | Dims | Description |
|---|---|---|
| Joint positions | 24 | All qpos excluding root freejoint (24 hinge joints) |
| Joint velocities | 24 | All qvel excluding root freejoint (24 hinge joints) |
| Pelvis orientation | 4 | Quaternion from framequat sensor |
| Pelvis angular velocity | 3 | Gyroscope reading |
| Pelvis linear velocity | 3 | Root body velocity |
| Pelvis acceleration | 3 | Accelerometer reading |
| Foot contact | 2 | Left/right touch sensors (on central digit 3) |
| Prey direction | 3 | Unit vector toward prey |
| Prey distance | 1 | Scalar distance to prey |
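
As a rough guide to slicing the flat observation, the sketch below derives index ranges from the component sizes in the table. It assumes the components are concatenated in the order listed, which the table does not state, so check the environment source before relying on these offsets. The names OBS_COMPONENTS and build_layout are illustrative and not part of the package.

```python
# Component sizes copied from the observation table.
# ASSUMPTION: components are concatenated in table order.
OBS_COMPONENTS = [
    ("joint_pos", 24),
    ("joint_vel", 24),
    ("pelvis_quat", 4),
    ("pelvis_gyro", 3),
    ("pelvis_linvel", 3),
    ("pelvis_accel", 3),
    ("foot_contact", 2),
    ("prey_direction", 3),
    ("prey_distance", 1),
]


def build_layout(components):
    """Map each component name to a slice into the flat observation."""
    layout, start = {}, 0
    for name, dims in components:
        layout[name] = slice(start, start + dims)
        start += dims
    return layout, start


LAYOUT, TOTAL_DIMS = build_layout(OBS_COMPONENTS)

# Example: pull the prey distance out of a (dummy) flat observation.
obs = [0.0] * TOTAL_DIMS
prey_distance = obs[LAYOUT["prey_distance"]][0]
```

The component sizes sum to 67, matching the stated observation dimension.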

Action Space (22 dimensions)

Continuous actions in [-1, 1], scaled to actuator control ranges:

  • Right leg: hip pitch, hip roll, knee, ankle, toe d3, toe d4 (6)
  • Right sickle claw (1)
  • Left leg: hip pitch, hip roll, knee, ankle, toe d3, toe d4 (6)
  • Left sickle claw (1)
  • Tail: pitch 1, yaw 1, pitch 2, pitch 3 (4)
  • Right arm: shoulder pitch, shoulder roll (2)
  • Left arm: shoulder pitch, shoulder roll (2)
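
The scaling from [-1, 1] to actuator control ranges can be pictured as a linear map. The sketch below is one common way to do it and is an assumption about the environment, not its confirmed implementation. The control ranges in the example are made up for illustration, and scale_action is a hypothetical helper.

```python
# Actuator counts per group, copied from the list above.
ACTION_GROUPS = {
    "right_leg": 6,
    "right_sickle_claw": 1,
    "left_leg": 6,
    "left_sickle_claw": 1,
    "tail": 4,
    "right_arm": 2,
    "left_arm": 2,
}
ACTION_DIMS = sum(ACTION_GROUPS.values())


def scale_action(action, ctrl_low, ctrl_high):
    """Linearly map each action in [-1, 1] onto [low, high], clipping first.

    ASSUMPTION: the environment uses a linear map like this one.
    """
    scaled = []
    for a, lo, hi in zip(action, ctrl_low, ctrl_high):
        a = max(-1.0, min(1.0, a))  # clip out-of-range inputs
        scaled.append(lo + (a + 1.0) * 0.5 * (hi - lo))
    return scaled


# Hypothetical control ranges for two actuators; not taken from the model.
ctrl = scale_action([0.0, 1.0], ctrl_low=[-0.5, 0.0], ctrl_high=[0.5, 2.0])
```

With this map, an action of 0 lands at the midpoint of the control range and ±1 at its limits.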

Reward Components

| Component | Weight | Description |
|---|---|---|
| forward_vel_weight | 1.0 | Reward proportional to forward velocity |
| alive_bonus | 0.1 | Per-step survival bonus |
| energy_penalty_weight | 0.001 | Penalizes large actions |
| tail_stability_weight | 0.05 | Penalizes tail angular velocity |
| strike_bonus | 10.0 | Bonus when sickle claw contacts prey |
| strike_approach_weight | 1.0 | Reward for closing distance to prey |
| fall_penalty | -100.0 | Penalty on termination from falling |
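
To make the interaction between the weights concrete, here is a minimal sketch of how such terms might be combined into a per-step reward. The exact formulas are assumptions: squared-magnitude penalties for actions and tail angular velocity, and an approach term based on the change in prey distance. The environment may compute each term differently. RewardWeights and step_reward are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    # Default values copied from the reward table.
    forward_vel_weight: float = 1.0
    alive_bonus: float = 0.1
    energy_penalty_weight: float = 0.001
    tail_stability_weight: float = 0.05
    strike_bonus: float = 10.0
    strike_approach_weight: float = 1.0
    fall_penalty: float = -100.0


def step_reward(w, forward_vel, action, tail_ang_vel,
                prev_prey_dist, prey_dist, struck, fell):
    """ASSUMED combination of the reward terms; the real env may differ."""
    reward = w.forward_vel_weight * forward_vel
    reward += w.alive_bonus
    reward -= w.energy_penalty_weight * sum(a * a for a in action)
    reward -= w.tail_stability_weight * sum(v * v for v in tail_ang_vel)
    reward += w.strike_approach_weight * (prev_prey_dist - prey_dist)
    if struck:
        reward += w.strike_bonus
    if fell:
        reward += w.fall_penalty  # the weight is already negative
    return reward


# A quiet step: moving forward at 1 m/s, no actuation, no change in distance.
r = step_reward(RewardWeights(), forward_vel=1.0, action=[0.0] * 22,
                tail_ang_vel=[0.0] * 4, prev_prey_dist=2.0, prey_dist=2.0,
                struck=False, fell=False)
```

Under these assumptions, a single fall outweighs roughly a hundred steps of forward progress at 1 m/s, which is why the fall penalty dominates early training.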

T-Rex Environment

Large bipedal predator with bite-attack behavior.

from environments.trex.envs.trex_env import TRexEnv

env = TRexEnv(
    render_mode="human",
    bite_bonus=10.0,            # Reward for head-prey contact
    bite_approach_weight=1.0,   # Reward for closing distance
)
  • Observation: 83 dimensions
  • Action: 21 dimensions (3 neck/head + 7 per leg + 4 tail)

Brachiosaurus Environment

Quadrupedal sauropod with food-reaching behavior.

from environments.brachiosaurus.envs.brachio_env import BrachioEnv

env = BrachioEnv(render_mode="human")
  • Observation: 83 dimensions
  • Action: 30 dimensions (6 neck + 24 legs)

Training with Stable-Baselines3

from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize
from stable_baselines3.common.monitor import Monitor

from environments.velociraptor.envs.raptor_env import RaptorEnv

def make_env():
    env = RaptorEnv(forward_vel_weight=0.0, alive_bonus=1.0)
    return Monitor(env)

vec_env = DummyVecEnv([make_env])
vec_env = VecNormalize(vec_env, norm_obs=True, norm_reward=True)

model = PPO("MlpPolicy", vec_env, learning_rate=3e-4)
model.learn(total_timesteps=1_000_000, progress_bar=True)
model.save("raptor_stage1")

Alternatively, use the included training script. The curriculum command runs all three stages in a single call, and each stage loads its own hyperparameters from the TOML config automatically:

cd environments/velociraptor

# Full 3-stage curriculum (recommended) — one command, stages 1-3
python scripts/train_sb3.py curriculum --algorithm ppo --n-envs 4

# Or control stages individually
python scripts/train_sb3.py train --stage 1 --algorithm ppo --timesteps 1000000
python scripts/train_sb3.py train --stage 2 --algorithm ppo --timesteps 1000000 \
    --load logs/stage1/models/stage1_final.zip
python scripts/train_sb3.py eval logs/stage2/models/stage2_final.zip --algorithm ppo --stage 2

# Use SAC instead
python scripts/train_sb3.py curriculum --algorithm sac

# Override hyperparameters without editing TOML files
python scripts/train_sb3.py train --stage 1 \
    --override ppo.learning_rate=1e-3 env.alive_bonus=3.0

# Write outputs to a specific directory (e.g. GCS mount for cloud training)
python scripts/train_sb3.py curriculum --output-dir /mnt/gcs/training/velociraptor
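
The --override flag takes dotted section.key=value pairs. How train_sb3.py parses them is not shown here. The sketch below is one way such pairs could be folded into a nested config dictionary, using ast.literal_eval to recover numeric values. parse_overrides is a hypothetical helper, not the script's actual function.

```python
import ast


def parse_overrides(pairs):
    """Turn ["ppo.learning_rate=1e-3", ...] into a nested dict.

    Hypothetical helper; the real script may parse overrides differently.
    """
    config = {}
    for pair in pairs:
        dotted_key, _, raw = pair.partition("=")
        if not dotted_key or not raw:
            raise ValueError(f"expected section.key=value, got {pair!r}")
        try:
            value = ast.literal_eval(raw)  # numbers, booleans, quoted strings
        except (ValueError, SyntaxError):
            value = raw                    # fall back to the raw string
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


overrides = parse_overrides(["ppo.learning_rate=1e-3", "env.alive_bonus=3.0"])
```

The resulting dictionary can then be merged over the values loaded from the TOML file, so command-line overrides take precedence.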

Diagnostic Metrics

The LocomotionMetrics class (from environments.shared.metrics) computes a set of diagnostic metrics during evaluation. Call record_step(info, reward) on every step, then call compute() at the end of the episode.

from environments.shared.metrics import LocomotionMetrics

metrics = LocomotionMetrics()
obs, info = env.reset()
for _ in range(1000):
    action = model.predict(obs)[0]
    obs, reward, terminated, truncated, info = env.step(action)
    metrics.record_step(info, reward)
    if terminated or truncated:
        break

report = metrics.compute()
agg = LocomotionMetrics.aggregate_episodes([report])

Locomotion Health Metrics

| Metric key | Description | Species |
|---|---|---|
| mean_forward_velocity | Mean forward speed (m/s) | All |
| cost_of_transport | Energy per unit distance per unit weight | All |
| mean_pelvis_height | Mean pelvis z-position (m) | T-Rex, Velociraptor |
| gait_symmetry | Left/right stride symmetry ∈ [0, 1] | All |
| stride_frequency | Mean step frequency (Hz) | All |
| mean_tilt_angle | Mean body tilt angle (rad) | All |
| termination_reason | Reason episode ended | All |
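
For reference, cost of transport is conventionally the dimensionless ratio E / (m · g · d). The sketch below computes it from an energy estimate. The energy term, summed |torque × joint velocity| × dt, is an assumption about how energy might be estimated and may not match LocomotionMetrics. Both function names are illustrative.

```python
G = 9.81  # gravitational acceleration, m/s^2


def mechanical_energy(torques, joint_vels, dt):
    """ASSUMED energy estimate: sum of |torque * velocity| * dt over all steps."""
    return sum(
        abs(t * v) * dt
        for step_t, step_v in zip(torques, joint_vels)
        for t, v in zip(step_t, step_v)
    )


def cost_of_transport(energy_j, mass_kg, distance_m, eps=1e-8):
    """Dimensionless CoT = E / (m * g * d); eps guards against zero distance."""
    return energy_j / (mass_kg * G * max(distance_m, eps))


# One timestep, two joints: |2*1| + |-1*3| = 5 W over 0.5 s gives 2.5 J.
energy = mechanical_energy([[2.0, -1.0]], [[1.0, 3.0]], dt=0.5)
cot = cost_of_transport(energy, mass_kg=10.0, distance_m=1.0)
```

Lower values indicate more efficient locomotion. A policy that thrashes in place scores very high because the distance term stays near zero.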

Behavioral Metrics

| Metric key | Description | Species |
|---|---|---|
| mean_heading_alignment | cos θ alignment toward prey ∈ [-1, 1] | T-Rex, Velociraptor |
| mean_contact_asymmetry | Left/right contact imbalance ∈ [0, 1] | All |
| success_rate | Fraction of steps with a success event | All |
| min_prey_distance | Minimum distance reached to prey (m) | T-Rex, Velociraptor |
| min_prey_distance | Minimum distance reached to food (m) | Brachiosaurus |
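
The heading alignment value is the cosine of the angle between the body heading and the direction to the prey. A minimal sketch follows, assuming both vectors are taken in the horizontal plane. That choice, and the function names, are illustrative rather than taken from the metrics module.

```python
import math


def heading_alignment(heading_xy, to_prey_xy):
    """cos(theta) between the heading and the direction to the prey."""
    hx, hy = heading_xy
    px, py = to_prey_xy
    norm = math.hypot(hx, hy) * math.hypot(px, py)
    if norm == 0.0:
        return 0.0  # direction undefined; treat as no alignment
    return (hx * px + hy * py) / norm


def mean_heading_alignment(headings, prey_dirs):
    """Average the per-step alignment over an episode."""
    values = [heading_alignment(h, p) for h, p in zip(headings, prey_dirs)]
    return sum(values) / len(values) if values else 0.0


facing = heading_alignment((1.0, 0.0), (2.0, 0.0))   # straight at the prey
away = heading_alignment((1.0, 0.0), (-1.0, 0.0))    # facing away
side = heading_alignment((1.0, 0.0), (0.0, 3.0))     # perpendicular
```

A value near 1 means the agent is facing the prey, 0 means it is side-on, and -1 means it is facing away.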

Species-Specific Key Mapping

| Metric | T-Rex | Velociraptor | Brachiosaurus |
|---|---|---|---|
| Height key | pelvis_height | pelvis_height | pelvis_height (aliased from torso) |
| Success key | bite_success | strike_success | food_reached |
| Prey/food key | prey_distance | prey_distance | prey_distance (aliased from head_food_distance) |
| Heading | | | N/A |
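
Because only the success key differs by species, a species-aware success rate reduces to a dictionary lookup. The sketch below illustrates the idea on a list of per-step info dictionaries. The lowercase species identifiers and the success_rate helper are hypothetical; only the info keys come from the table.

```python
# Success keys copied from the mapping table.
# The species identifiers used as dict keys are hypothetical.
SUCCESS_KEY = {
    "trex": "bite_success",
    "velociraptor": "strike_success",
    "brachiosaurus": "food_reached",
}


def success_rate(infos, species):
    """Fraction of steps whose info dict flags a success event."""
    key = SUCCESS_KEY[species]
    if not infos:
        return 0.0
    hits = sum(1 for info in infos if info.get(key, False))
    return hits / len(infos)


# Four recorded steps, one of which registers a claw strike.
episode_infos = [
    {"strike_success": False},
    {"strike_success": False},
    {"strike_success": True},
    {},  # a missing key counts as no success
]
rate = success_rate(episode_infos, "velociraptor")
```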

The eval command in each train_sb3.py script prints all of the metrics above, grouped into Core Performance, Velocity, Gait Quality, Balance, and a species-specific Hunting or Food Reaching section.