What is a Grid World?

A grid world is a simple environment where an agent (marked with "A") must navigate from a starting position to a goal while avoiding obstacles. Think of it like a maze or a simple video game level.

This demo lets you experience the navigation problem firsthand by manually controlling the agent. You'll see why finding the optimal path isn't always obvious!

Why Grid Worlds Matter:

  • Simple but not trivial - Easy to visualize, but planning ahead is still challenging
  • Real-world parallels - Robot navigation, warehouse routing, game AI
  • Foundation for RL - This problem structure appears throughout reinforcement learning

Try navigating the grid yourself to understand the challenge before learning how algorithms can solve it automatically!

The Navigation Challenge

Your Mission:

Navigate the agent from the starting position (lower left) to the goal (upper right) in as few steps as possible.

The Environment:

  • Grid: 5 × 5 cells
  • Start: Blue cell with "A" in the lower-left corner
  • Goal: Yellow cell in the upper-right corner
  • Obstacles: Gray cells block your path
  • Valid cells: White cells are walkable

Movement Rules:

  • You can move: ↑ Up, ↓ Down, ← Left, → Right
  • If you try to move into an obstacle or outside the grid, you stay in place
  • Every move counts as one step
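
To make the movement rules concrete, here is a minimal Python sketch of a single move. The obstacle positions are assumptions made up for illustration (only the obstacle directly to the right of the start is suggested by this page), and the names step, MOVES, and OBSTACLES are hypothetical, not taken from the demo's code. Cells are (row, column) pairs, with row 0 at the top.

```python
GRID_SIZE = 5

# Hypothetical obstacle layout, assumed for illustration only.
OBSTACLES = {(4, 1), (3, 1), (2, 1), (2, 3)}

# Each action maps to a (row change, column change) offset.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(position, action):
    """Return the agent's new (row, col) after attempting one move."""
    row, col = position
    d_row, d_col = MOVES[action]
    target = (row + d_row, col + d_col)
    in_bounds = 0 <= target[0] < GRID_SIZE and 0 <= target[1] < GRID_SIZE
    if not in_bounds or target in OBSTACLES:
        return position  # blocked: the agent stays in place
    return target
```

With this assumed layout, step((4, 0), "right") returns (4, 0) because the cell to the right is blocked, while step((4, 0), "up") returns (3, 0). Either way the move still counts as one step.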

Scoring:

  • Each step costs you 1 point (a reward of -1)
  • Reaching the goal gives you +10 points
  • Fewer steps = higher score
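
The scoring rules boil down to one line of arithmetic. The sketch below shows it in Python; the function name episode_score and its parameters are hypothetical, chosen only for this example.

```python
STEP_REWARD = -1   # every move, including a blocked one, costs 1 point
GOAL_REWARD = 10   # bonus for reaching the goal


def episode_score(steps_taken, reached_goal=True):
    """Total reward for one run: step costs plus the goal bonus, if earned."""
    bonus = GOAL_REWARD if reached_goal else 0
    return steps_taken * STEP_REWARD + bonus
```

For example, reaching the goal in 12 steps gives 10 - 12 = -2 points, and giving up after 5 steps without reaching the goal gives -5.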

The Challenge:

The obstacles create a barrier that forces you to plan your route. Dead ends waste steps, so you need to think ahead!

How to Use This Demo

Manual Control:

  • Use the arrow buttons or keyboard arrow keys to move the agent
  • Watch the total reward accumulate with each step
  • Try to find the shortest path to the goal
  • Click "Reset" to start over from the beginning

Visual Guide:

  • Blue cell with "A": Agent's current position
  • Yellow cell with "GOAL": Target destination
  • Gray cells: Obstacles (impassable)
  • White cells: Empty, walkable cells
  • Light blue cells: Your path history

Understanding the Metrics:

  • Steps: Total number of moves taken
  • Total Reward: Cumulative reward (starts at 0, decreases by 1 per step, +10 at goal)
  • Status: Current state (Ready / In Progress / Goal Reached!)

Navigation Strategy

Key Points:

  • The optimal path is 8 steps, for a best score of +2 points (10 for the goal minus 8 for the steps)
  • Obstacles form a wall requiring strategic navigation
  • Don't move right immediately - there's an obstacle blocking the way
  • Plan ahead to avoid dead ends and backtracking
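
One way to check a claim like "the optimal path is 8 steps" is a breadth-first search, which explores cells in order of distance from the start. The sketch below is a minimal Python version. The obstacle layout is an assumption made up for illustration (the demo's real layout is not listed on this page), and shortest_path_length is a hypothetical helper name. Cells are (row, column) pairs, with row 0 at the top.

```python
from collections import deque

GRID_SIZE = 5
START, GOAL = (4, 0), (0, 4)  # lower-left start, upper-right goal

# Hypothetical obstacle layout, assumed for illustration only.
OBSTACLES = {(4, 1), (3, 1), (2, 1), (2, 3)}


def shortest_path_length(start, goal, obstacles, size):
    """Return the fewest moves from start to goal, or None if unreachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (row, col), dist = queue.popleft()
        if (row, col) == goal:
            return dist
        for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nxt = (row + d_row, col + d_col)
            in_bounds = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if in_bounds and nxt not in obstacles and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None
```

On this assumed layout, shortest_path_length(START, GOAL, OBSTACLES, GRID_SIZE) returns 8, which matches the best score of 10 - 8 = +2. Breadth-first search needs the full map up front, so it is a planning tool rather than the trial-and-error learning that RL uses.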

Practice Approach:

  • Try 1: Reach the goal any way you can
  • Try 2: Count steps and optimize your path
  • Challenge: Find the 8-step optimal solution
  • Pro tip: Try mentally tracking the "value" (expected future reward) of each square you visit - this is exactly what RL algorithms learn!
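
To see what the "value" of a square looks like in numbers, here is a minimal value iteration sketch in Python. It assumes the demo's rewards (-1 per step, +10 on reaching the goal), no discounting, and a made-up obstacle layout. The names next_state and value_iteration are hypothetical. This is one standard way to compute values when the map is known, not necessarily the method the demo or later lessons use.

```python
GRID_SIZE = 5
START, GOAL = (4, 0), (0, 4)

# Hypothetical obstacle layout, assumed for illustration only.
OBSTACLES = {(4, 1), (3, 1), (2, 1), (2, 3)}
MOVES = ((-1, 0), (1, 0), (0, -1), (0, 1))


def next_state(state, move):
    """Apply a move; blocked moves leave the agent where it is."""
    target = (state[0] + move[0], state[1] + move[1])
    inside = 0 <= target[0] < GRID_SIZE and 0 <= target[1] < GRID_SIZE
    return target if inside and target not in OBSTACLES else state


def value_iteration(sweeps=50):
    """Estimate the best achievable future reward from every walkable cell."""
    states = [(r, c) for r in range(GRID_SIZE) for c in range(GRID_SIZE)
              if (r, c) not in OBSTACLES]
    values = {s: 0.0 for s in states}
    for _ in range(sweeps):
        new_values = {}
        for s in states:
            if s == GOAL:
                new_values[s] = 0.0  # the episode ends at the goal
                continue
            candidates = []
            for move in MOVES:
                nxt = next_state(s, move)
                reward = -1 + (10 if nxt == GOAL else 0)
                candidates.append(reward + values[nxt])
            new_values[s] = max(candidates)
        values = new_values
    return values
```

On this assumed layout, value_iteration()[START] comes out to 2.0, the same +2 you earn by walking the 8-step optimal path, and a cell next to the goal has value 9.0. Moving toward higher-valued neighbors from any cell traces out a shortest route.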

Next: Learn how RL algorithms automatically discover optimal paths through trial and error!

Live Competition Mode

Challenge: Navigate the 6×6 grid world over 3 attempts and compete for the highest total score in your class!

⚠️ Keep this tab active - switching to another tab may temporarily remove you from the leaderboard

Competition Rules:

  • Everyone plays the same 6×6 grid with fog of war
  • You must complete all 3 attempts
  • Your score is submitted automatically after attempt 3 completes
  • One-time only: Your first completion counts - no retries
  • Real-time leaderboard shows top performers

Three-Attempt Structure:

  • Attempt 1: Pure exploration - fog of war, no information
  • Attempt 2: Value estimates shown on visited cells from attempt 1
  • Attempt 3: Back to fog of war - test your memory!

Scoring:

  • Goal reward: +20 points
  • Step penalty: -1 per move
  • Small treats: +1 and +2 along the way (repeatable)
  • Final Score: Sum of all rewards across 3 attempts
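
The competition score can be sketched the same way. The Python below assumes that each treat simply adds its value when collected and that the three attempt totals are summed. The names attempt_score and final_score are hypothetical, and exactly how repeatable treats are counted in the live game is not specified here.

```python
GOAL_REWARD = 20
STEP_PENALTY = -1


def attempt_score(steps, treats_collected, reached_goal):
    """Score for one attempt: step penalties, treat values, and goal bonus."""
    bonus = GOAL_REWARD if reached_goal else 0
    return steps * STEP_PENALTY + sum(treats_collected) + bonus


def final_score(attempts):
    """Sum the scores of all attempts, each given as (steps, treats, reached_goal)."""
    return sum(attempt_score(*attempt) for attempt in attempts)
```

For example, an exploratory first attempt of 30 steps with a +1 and a +2 treat scores -30 + 3 + 20 = -7. Followed by attempts scoring 8 and 8, the final score would be 9.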

Strategy Tips:

  • Explore efficiently in attempt 1 to build good value estimates
  • Use value information in attempt 2 to find the optimal path
  • Memorize the best route for attempt 3
