Why CartPole?

CartPole is the hello world of continuous control. A cart can move left or right, while a pole pivots freely on top. With no controller, gravity quickly pulls the pole over. To balance it, you must push left and right to keep the system within a narrow range.

This demo focuses on continuous state spaces rather than grid cells. Every small nudge changes four real-valued variables, and even a short delay in reacting can end the episode, which is why learned controllers are attractive for this kind of problem.

The State Vector

  • Cart position x ∈ [−2.4, 2.4] meters
  • Cart velocity ẋ (m/s)
  • Pole angle θ ∈ [−12°, 12°] from vertical
  • Pole angular velocity θ̇ (rad/s)

Actions: discrete left or right pushes. Use the left/right arrows (or A/D). A neutral action (no push) is included for completeness, but makes balancing harder.
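As a rough sketch of how the four state variables and the three actions might be represented in code: the names `CartPoleState`, `LEFT`, `NEUTRAL`, `RIGHT`, and `action_to_force` are hypothetical, and the demo's actual encoding is not shown. The 10 N magnitude is the force listed under the dynamics, and the angle is stored in radians here, as Gym does, even though the page displays degrees.

```python
from typing import NamedTuple

FORCE_MAG = 10.0  # newtons; magnitude of a single push


class CartPoleState(NamedTuple):
    x: float          # cart position (m)
    x_dot: float      # cart velocity (m/s)
    theta: float      # pole angle from vertical (rad)
    theta_dot: float  # pole angular velocity (rad/s)


# Hypothetical action codes chosen for this sketch only.
LEFT, NEUTRAL, RIGHT = 0, 1, 2


def action_to_force(action: int) -> float:
    """Map a discrete action to the horizontal force applied to the cart."""
    if action == LEFT:
        return -FORCE_MAG
    if action == RIGHT:
        return FORCE_MAG
    if action == NEUTRAL:
        return 0.0  # no push; gravity alone acts on the pole
    raise ValueError(f"unknown action: {action}")
```

Keeping the force mapping separate from the physics makes it easy to drop the neutral action later if an agent expects a strict two-action interface.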

Gym-Compatible Dynamics

  • Force magnitude: 10 N applied to the cart (left or right)
  • Masses: cart 1.0 kg, pole 0.1 kg; pole half-length 0.5 m
  • Update rate: 50 Hz (τ=0.02 s) using Euler integration
  • Gravity: 9.8 m/s^2; pole torque couples cart acceleration and angle
  • Termination: |x| > 2.4 m, |θ| > 12°, or 500 steps

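These are the same equations and parameters as OpenAI Gym's CartPole-v1, so you can swap in an RL agent later without changing the environment. The one exception is the neutral action, which Gym's two-action version does not have.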
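The parameters listed above can be turned into a single physics update. The following is a minimal sketch of one Euler step in the form used by Gym's CartPole, not the demo's own source; the function name `step` and the tuple layout are assumptions. The force is passed in directly, so a zero force covers the neutral action.

```python
import math

# Parameters from the list above.
GRAVITY = 9.8       # m/s^2
MASS_CART = 1.0     # kg
MASS_POLE = 0.1     # kg
HALF_LENGTH = 0.5   # m, half the pole's length
TAU = 0.02          # s, one physics step at 50 Hz


def step(state, force):
    """Advance (x, x_dot, theta, theta_dot) by one explicit Euler step.

    theta is in radians; force is in newtons (+10, -10, or 0 in this demo).
    """
    x, x_dot, theta, theta_dot = state
    total_mass = MASS_CART + MASS_POLE
    pole_mass_length = MASS_POLE * HALF_LENGTH
    sin_t, cos_t = math.sin(theta), math.cos(theta)

    # Shared term: applied force plus the centrifugal effect of the pole.
    temp = (force + pole_mass_length * theta_dot**2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_LENGTH * (4.0 / 3.0 - MASS_POLE * cos_t**2 / total_mass)
    )
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass

    # Explicit Euler: positions are advanced with the old velocities.
    return (
        x + TAU * x_dot,
        x_dot + TAU * x_acc,
        theta + TAU * theta_dot,
        theta_dot + TAU * theta_acc,
    )
```

Starting from rest, `step((0.0, 0.0, 0.0, 0.0), 10.0)` gives the cart a positive velocity and the pole a negative angular velocity: pushing the cart right tips the pole to the left, which is the coupling between cart acceleration and pole angle.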

How to Play

  • Controls: hold left/right arrows (or A/D) to push the cart
  • Goal: keep the pole upright and the cart on the track
  • Score: +1 every physics step; max episode length 500
  • Fail conditions: pole angle beyond ±12°, or cart beyond ±2.4 m
  • Reset: click "Reset Episode" to randomize the initial state

Watch the state values update in real time. Even tiny delays push the system toward failure.
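The fail conditions and scoring rule above can be sketched as a small check. The names `episode_over` and `score` are hypothetical. The page does not say whether the failing step itself earns a point, so this sketch assumes it does not and counts only the steps survived before a limit is crossed.

```python
import math

X_LIMIT = 2.4                      # m, cart must stay within +/- this
THETA_LIMIT = math.radians(12.0)   # rad, pole must stay within +/- 12 degrees
MAX_STEPS = 500                    # episode length cap


def episode_over(x, theta, steps_taken):
    """Return True once a fail condition or the step cap is reached."""
    return (
        abs(x) > X_LIMIT
        or abs(theta) > THETA_LIMIT
        or steps_taken >= MAX_STEPS
    )


def score(states):
    """Count +1 per step survived, given (x, theta) pairs after each step.

    Assumption: the step that crosses a limit is not counted.
    """
    steps = 0
    for x, theta in states:
        if episode_over(x, theta, steps):
            break
        steps += 1
    return steps
```

For example, a trajectory whose third state has the cart at 2.5 m scores 2, and any trajectory that never fails is capped at 500.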

What to Notice

  • The best strategy is gentle, rapid corrections—overcorrection makes the pole whip around.
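  • CartPole highlights why tabular RL (as used in grid worlds) does not carry over directly to continuous spaces: four real-valued variables cannot be enumerated in a table without discretizing them.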
  • This manual controller sets the stage for Deep RL (e.g., DQN) to automate balancing.
  • Look at angular velocity: reacting to θ̇ early is key to staying upright.
  • Try letting go: within seconds the pole diverges, showing the inherent instability.
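The point about reacting to θ̇ early can be illustrated with a hand-written rule. This is only a sketch of one possible heuristic, not the demo's logic and not a learned policy; the function name `choose_push` and the `gain` value are assumptions. It pushes toward the side the pole is about to fall to, judged by the angle plus a weighted angular velocity.

```python
def choose_push(theta, theta_dot, gain=0.5):
    """Pick a push direction from pole angle (rad) and angular velocity (rad/s).

    Positive theta means the pole leans right, so the cart is pushed right
    to move back underneath it. The gain on theta_dot (a hypothetical value)
    makes the rule react before the angle itself changes sign.
    """
    predicted_lean = theta + gain * theta_dot
    return "right" if predicted_lean > 0 else "left"
```

With `choose_push(-0.01, 0.5)` the pole still leans slightly left but is swinging right quickly, so the rule already pushes right. A rule that looked only at θ would keep pushing left and overcorrect.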

Live Competition Mode

Challenge: Keep the pole balanced for as long as possible in slow motion! Compete for the highest survival time in your class.

⚠️ Keep this tab active - switching to another tab may temporarily remove you from the leaderboard

Competition Rules:

  • Everyone uses the same initial CartPole physics
  • Rendering runs at 10 FPS (slow motion) for fairness
  • Physics still runs at 50 Hz internally - no accuracy compromises
  • Unlimited attempts - your best survival time counts
  • Real-time leaderboard shows top 10 performers
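To see what the slow-motion rule means in wall-clock terms, here is a small arithmetic sketch. It rests on an assumption the page does not confirm: that each rendered frame advances exactly one 0.02 s physics step (`STEPS_PER_FRAME = 1`). All names are hypothetical.

```python
PHYSICS_DT = 0.02     # s of simulated time per physics step (50 Hz)
RENDER_FPS = 10       # frames drawn per wall-clock second in competition mode
STEPS_PER_FRAME = 1   # ASSUMPTION: one physics step per rendered frame


def wall_clock_seconds(steps):
    """Wall-clock time needed to play through the given number of steps."""
    return steps / (RENDER_FPS * STEPS_PER_FRAME)


def slowdown_factor():
    """How many times slower than real time the simulation appears."""
    simulated_per_second = PHYSICS_DT * RENDER_FPS * STEPS_PER_FRAME
    return 1.0 / simulated_per_second
```

Under that assumption, the simulation appears five times slower than real time, and a full 500-step episode covers 10 s of simulated time but takes 50 s to play.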

Scoring:

  • Score = survival time (number of steps before termination)
  • Maximum possible score: 500 steps
  • Your best attempt is automatically submitted

Strategy Tips:

  • Slow motion gives you more time to react
  • Watch angular velocity closely for early corrections
  • Practice in normal mode first to understand the physics

Continuous Control

Balance the CartPole by hand

Feel the instability that Deep RL algorithms must master.


Live state variables

Watch how the continuous state evolves as you apply discrete pushes.

x = 0.03 m
ẋ = 0.04 m/s
θ = 1.3°
θ̇ = 0.02 rad/s
Score = 0