
ICARUS

Autonomous Rocket Landing. RL-trained 6-DOF rocket simulation with curriculum learning and real-time telemetry.

"Land from any angle, any altitude, every time."
ICARUS trains reinforcement learning agents to master rocket propulsive landing through curriculum learning and domain randomization.
Interactive Demonstration

Mission Gallery

From hover to full suborbital hop — every stage visualized.

Precision Landing

Watch AI agents nail propulsive landings from any entry angle, altitude, or velocity.

Curriculum Mastery

5-stage training from hover to full suborbital hop with progressive difficulty.

Live Telemetry

Real-time altitude, velocity, throttle, attitude, and neural activity visualization.

Multi-Agent GNC

Four specialized agents coordinate guidance, navigation, control, and safety.

ICARUS (Intelligent Control for Autonomous Rocket Upright Stabilization) is a full-fidelity rocket flight simulation with AI-powered guidance, navigation, and control. The system uses PPO/SAC reinforcement learning with a 5-stage curriculum to train agents that achieve 95%+ landing success rates.

The physics engine runs at 1kHz with RK4 numerical integration, modeling US Standard Atmosphere 1976, Dryden wind turbulence, J2 gravity perturbations, and full 6-DOF rigid body dynamics. Four specialized agents handle GNC, attitude control, throttle management, and safety monitoring.

"The future of autonomous flight is training in simulation and transferring to reality. ICARUS proves that reinforcement learning can solve the hardest control problems in aerospace when given the right curriculum and physics fidelity."

The Philosophy

ICARUS isn't just a simulation — it's a proving ground for the thesis that RL can solve safety-critical control problems when paired with the right curriculum.

01

Sim-to-Real Transfer

The gap between simulation and reality is the central challenge of robotics RL. ICARUS addresses this with domain randomization — varying wind, gravity, mass properties, and sensor noise across training episodes so the agent learns policies robust to the real-world variations it will encounter. The physics engine's fidelity (US Std Atmosphere 1976, Dryden turbulence, J2 gravity) ensures the simulation is close enough to reality that transfer is possible.
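The per-episode randomization described above can be sketched as follows. This is a minimal illustration, not ICARUS's actual code; the parameter ranges and field names are assumptions chosen for plausibility.

```python
import random

def randomize_episode(base_mass=25_000.0, base_g=9.80665):
    """Sample one episode's physical parameters for domain randomization.

    All ranges below are illustrative assumptions, not ICARUS's real values.
    """
    return {
        "wind_speed_mps":  random.uniform(0.0, 15.0),          # surface gust magnitude
        "wind_heading_deg": random.uniform(0.0, 360.0),
        "gravity":          base_g * random.uniform(0.995, 1.005),
        "dry_mass_kg":      base_mass * random.uniform(0.97, 1.03),
        "imu_noise_std":    random.uniform(0.001, 0.01),       # gyro noise, rad/s
    }

params = randomize_episode()
```

Because every episode draws fresh parameters, the policy cannot overfit to one nominal vehicle or atmosphere, which is the whole point of the technique.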

02

Why Curriculum Learning Works

You wouldn't teach a student calculus before algebra. Similarly, ICARUS uses a 5-stage curriculum that builds mastery incrementally: Stage 1 (Hover) teaches basic thrust control. Stage 2 (Vertical Drop) adds altitude management. Stage 3 (Offset Landing) introduces lateral correction. Stage 4 (High Entry) adds full trajectory planning. Stage 5 (Full Mission) combines everything into a complete suborbital hop. Each stage's reward function shapes behavior toward the next stage's starting conditions.
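The five stages can be expressed as a simple lookup table that the environment reads at reset time. The altitude and offset numbers here are hypothetical placeholders, not ICARUS's actual stage parameters.

```python
# Hypothetical stage table mirroring the 5-stage curriculum described above.
CURRICULUM = [
    {"stage": 1, "name": "Hover",          "init_altitude_m": 50,   "lateral_offset_m": 0},
    {"stage": 2, "name": "Vertical Drop",  "init_altitude_m": 500,  "lateral_offset_m": 0},
    {"stage": 3, "name": "Offset Landing", "init_altitude_m": 500,  "lateral_offset_m": 100},
    {"stage": 4, "name": "High Entry",     "init_altitude_m": 2000, "lateral_offset_m": 500},
    {"stage": 5, "name": "Full Mission",   "init_altitude_m": 5000, "lateral_offset_m": 1000},
]

def stage_config(stage: int) -> dict:
    """Return the initial-condition parameters for a given stage (1-indexed)."""
    return CURRICULUM[stage - 1]
```

Each row expands the envelope of starting states, so the terminal states the agent masters in one stage become the initial states it sees in the next.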

03

Multi-Agent GNC Architecture

Rather than training a single monolithic policy, ICARUS decomposes the control problem into four specialized agents: GNC (trajectory planning and guidance), ATT (attitude control and stabilization), THR (throttle management and fuel optimization), and SAF (safety monitoring and abort decisions). Each agent trains on its specific subtask while communicating through a shared observation space. This decomposition mirrors real aerospace GNC architectures and produces more interpretable, more robust policies.
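One way to picture the shared observation space is as a single state vector from which each agent reads only the slices relevant to its subtask. The layout and agent-to-field mapping below are illustrative assumptions, not the project's actual wiring.

```python
# Hypothetical layout of the shared observation vector (14 values total).
SHARED_OBS_LAYOUT = {
    "position": slice(0, 3),    # x, y, z in meters
    "velocity": slice(3, 6),    # vx, vy, vz in m/s
    "attitude": slice(6, 10),   # quaternion w, x, y, z
    "ang_rate": slice(10, 13),  # body rates in rad/s
    "fuel":     slice(13, 14),  # remaining propellant fraction
}

# Hypothetical mapping of the four agents to the fields they consume.
AGENT_INPUTS = {
    "GNC": ["position", "velocity", "fuel"],
    "ATT": ["attitude", "ang_rate"],
    "THR": ["velocity", "fuel"],
    "SAF": ["position", "velocity", "attitude", "fuel"],
}

def observe(agent: str, shared_obs: list) -> list:
    """Extract one agent's view from the shared observation vector."""
    return [x for key in AGENT_INPUTS[agent]
              for x in shared_obs[SHARED_OBS_LAYOUT[key]]]
```

Slicing a common vector keeps the agents decoupled at the policy level while guaranteeing they act on a consistent snapshot of vehicle state.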

Kingly's Approach

We built ICARUS as a full-stack RL platform combining aerospace-grade physics with modern deep RL training infrastructure.

1kHz Physics Engine

The core simulation runs at 1000Hz with RK4 numerical integration for stability. We model drag, gravity (with J2 perturbation), thrust vectoring, mass flow dynamics, atmospheric density variation, and wind turbulence. This fidelity is essential — agents trained on simplified physics fail when exposed to real-world conditions.
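The classical RK4 step at a 1 kHz timestep can be sketched on a toy 1-D vertical model. The dynamics below (point mass, fixed throttle, quadratic drag) are a deliberate simplification for illustration; the real engine integrates full 6-DOF state.

```python
def rk4_step(f, state, t, dt):
    """One classical Runge-Kutta 4 step for state' = f(t, state)."""
    k1 = f(t, state)
    k2 = f(t + dt/2, [s + dt/2 * k for s, k in zip(state, k1)])
    k3 = f(t + dt/2, [s + dt/2 * k for s, k in zip(state, k2)])
    k4 = f(t + dt,   [s + dt   * k for s, k in zip(state, k3)])
    return [s + dt/6 * (a + 2*b + 2*c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def vertical_dynamics(t, state):
    """Toy 1-D model: altitude and vertical velocity under thrust, gravity, drag."""
    h, v = state
    g, rho, cd_area, mass = 9.81, 1.225, 10.0, 25_000.0  # illustrative constants
    thrust = 300_000.0                                   # N, fixed throttle
    drag = 0.5 * rho * cd_area * v * abs(v)              # opposes velocity
    return [v, thrust / mass - g - drag / mass]

state, t, dt = [1000.0, -80.0], 0.0, 1e-3  # 1 kHz step, descending at 80 m/s
for _ in range(1000):                       # simulate one second
    state = rk4_step(vertical_dynamics, state, t, dt)
    t += dt
```

RK4's fourth-order local error is what lets a fixed 1 ms step stay stable through the fast thrust and drag transients that a first-order Euler step would mangle.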

Training Infrastructure

ICARUS uses Stable-Baselines3 for PPO and SAC training with custom Gymnasium environments. Training runs are managed via Weights & Biases for experiment tracking, with automatic curriculum advancement based on rolling success-rate thresholds. A full training run from Stage 1 to Stage 5 takes approximately 48 hours on a single A100 GPU.
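The rolling success-rate gate that drives curriculum advancement can be sketched as a small stateful tracker. The window size and threshold here are illustrative assumptions; the project's actual values are not stated on this page.

```python
from collections import deque

class CurriculumGate:
    """Advance to the next stage once the rolling success rate clears a threshold.

    Window size and threshold are illustrative, not ICARUS's real settings.
    """

    def __init__(self, n_stages=5, window=100, threshold=0.9):
        self.stage = 1
        self.n_stages = n_stages
        self.threshold = threshold
        self.results = deque(maxlen=window)  # rolling record of landing outcomes

    def record(self, landed_ok: bool) -> int:
        """Log one episode outcome; return the (possibly advanced) stage."""
        self.results.append(landed_ok)
        window_full = len(self.results) == self.results.maxlen
        rate = sum(self.results) / len(self.results)
        if window_full and rate >= self.threshold and self.stage < self.n_stages:
            self.stage += 1
            self.results.clear()  # fresh window for the harder stage
        return self.stage
```

Clearing the window on advancement matters: otherwise successes from the easier stage would inflate the rate measured against the harder one.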

Real-Time 3D Dashboard

The visualization layer uses Three.js for 3D rocket rendering with real-time telemetry overlays. The dashboard shows altitude, velocity, throttle, attitude, neural activity patterns, mission phase progress, and agent status — all updated at 60fps from the simulation backend via WebSocket.
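A telemetry frame pushed over the WebSocket might look like the JSON sketch below. The field names and payload shape are assumptions for illustration; the actual backend message format may differ.

```python
import json

def telemetry_frame(sim: dict) -> str:
    """Serialize one simulation snapshot as a JSON WebSocket frame.

    Field names are hypothetical, chosen to match the telemetry listed above.
    """
    return json.dumps({
        "t":        sim["t"],         # mission elapsed time, s
        "altitude": sim["altitude"],  # m
        "velocity": sim["velocity"],  # m/s, negative = descending
        "throttle": sim["throttle"],  # 0.0 to 1.0
        "attitude": sim["attitude"],  # quaternion [w, x, y, z]
        "phase":    sim["phase"],     # current mission phase label
    })

frame = telemetry_frame({
    "t": 12.5, "altitude": 420.0, "velocity": -35.2,
    "throttle": 0.72, "attitude": [1, 0, 0, 0],
    "phase": "TERMINAL_DESCENT",
})
```

At 60 fps the dashboard only needs a 16 ms cadence, so the 1 kHz backend can downsample heavily before sending frames.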

The Future

Autonomous landing is just the beginning. The simulation platform extends to multiple aerospace control problems.

Atmospheric Re-entry

Extending the simulation to model full orbital re-entry with ablative heat shielding, plasma effects, and skip trajectories.

Multi-Vehicle Coordination

Training swarms of rockets to perform coordinated landings, formation flying, and cooperative docking maneuvers.

Hardware-in-the-Loop

Connecting the simulation to actual flight computers and actuators for pre-flight validation of trained policies.

Lunar/Mars Landing

Adapting the physics and training curriculum for reduced-gravity environments with different atmospheric conditions.

Tech Stack

6-DOF Physics
PPO/SAC RL
Curriculum Learning
Three.js
Next.js
Python/PyTorch

What This Is Used For

01

Propulsive landing R&D for reusable launch vehicles

02

RL curriculum design for safety-critical control domains

03

High-fidelity aerospace simulation and visualization

04

Multi-agent GNC system prototyping

Install

NPM Package Coming Soon

Sign up for the newsletter to get notified when it's released.
