Markov Decision Process Explorer

Visualize, simulate, and optimize Markov Decision Processes with advanced reinforcement learning algorithms

Load Preset Examples

Choose from pre-built MDP examples to get started quickly

Simple 3-State MDP (beginner)

A basic 3-state MDP with deterministic transitions. Great for learning the basics.

3 states · 2 actions · γ = 0.9
2x2 Grid World (beginner)

A classic grid world problem where an agent navigates to reach a goal while avoiding obstacles.

4 states · 4 actions · γ = 0.95
Gambler's Problem (intermediate)

A classic reinforcement learning problem where a gambler tries to reach a target amount.

6 states · 5 actions · γ = 0.9
Robot Navigation (intermediate)

A robot navigating through a simple environment with obstacles and goals.

5 states · 4 actions · γ = 0.8
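As a sketch of what a preset encodes, here is one possible Python representation of a small 3-state, 2-action MDP with deterministic transitions, solved with value iteration. The transition structure, rewards, and variable names are illustrative assumptions, not the tool's actual data format.

```python
# Hypothetical encoding of a preset like the Simple 3-State MDP.
# The successor states and rewards below are illustrative assumptions.

GAMMA = 0.9

# P[state][action] -> list of (next_state, probability, reward)
P = {
    "S0": {"a": [("S1", 1.0, 0.0)], "b": [("S2", 1.0, 0.0)]},
    "S1": {"a": [("S2", 1.0, 1.0)], "b": [("S0", 1.0, 0.0)]},
    "S2": {"a": [("S2", 1.0, 0.0)], "b": [("S0", 1.0, 0.0)]},
}

def value_iteration(P, gamma, tol=1e-8):
    """Compute optimal state values by repeatedly applying the
    Bellman optimality update until values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(
                sum(p * (r + gamma * V[s2]) for s2, p, r in outcomes)
                for outcomes in P[s].values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration(P, GAMMA)
```

With these (assumed) transitions, the optimal policy cycles S1 → S2 → S0 → S1, earning reward 1 every three steps, so V(S1) = 1 / (1 − γ³).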

MDP Configuration

Configure your Markov Decision Process using the interactive controls below

States

Define the possible states in your MDP

1, 2, 3

Actions

Define the available actions in your MDP

a, b

Discount Factor (γ)

Controls how much future rewards are valued relative to immediate rewards

Current value: 0.95 (high discount: future rewards matter more)
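The discount factor weights a reward received t steps in the future by γ^t, so the return of an episode is the sum of γ^t · r_t. A quick numeric sketch, assuming a constant stream of unit rewards, shows how strongly γ changes the total:

```python
# Effect of the discount factor on the return of a constant reward stream.
# With reward 1 at every step, the infinite-horizon return is 1 / (1 - gamma).

def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum(gamma**t * r for t, r in enumerate(rewards))

rewards = [1.0] * 200  # long but finite stream of unit rewards

low = discounted_return(rewards, 0.5)    # near 1 / (1 - 0.5)  = 2: myopic
high = discounted_return(rewards, 0.95)  # near 1 / (1 - 0.95) = 20: far-sighted
```

At γ = 0.5 the agent effectively looks only a couple of steps ahead, while at γ = 0.95 rewards dozens of steps away still contribute meaningfully.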

Transitions

Configure state transitions, probabilities, and rewards for each action

State S0
  Action a: probabilities 0.90, 0.10 (total 1.000)
  Action b: probability 1.00 (total 1.000)

State S1
  Action a: probability 1.00 (total 1.000)
  Action b: probability 1.00 (total 1.000)

State S2
  Action a: probability 1.00 (total 1.000)
  Action b: probability 1.00 (total 1.000)
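One way to hold a transition table like the one above in code is a mapping from (state, action) pairs to probability distributions over next states. The successor states below are placeholders of my own choosing (the panel lists only probability mass, not which state each entry points to); the validation helper checks the same "Total: 1.000" invariant the configurator enforces.

```python
# Transition model mirroring the probabilities configured above.
# Successor states are illustrative assumptions: the panel shows only
# probability mass, not which state each entry points to.

transitions = {
    ("S0", "a"): {"S1": 0.90, "S2": 0.10},
    ("S0", "b"): {"S1": 1.00},
    ("S1", "a"): {"S2": 1.00},
    ("S1", "b"): {"S0": 1.00},
    ("S2", "a"): {"S0": 1.00},
    ("S2", "b"): {"S1": 1.00},
}

def validate(transitions, tol=1e-9):
    """Check that every (state, action) row is a proper probability
    distribution: non-negative entries that sum to 1."""
    for key, dist in transitions.items():
        total = sum(dist.values())
        assert abs(total - 1.0) < tol, f"{key} sums to {total}"
        assert all(p >= 0 for p in dist.values()), f"{key} has a negative probability"

validate(transitions)
```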

Monte Carlo Simulation

Configure simulation parameters and run Monte Carlo analysis
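A Monte Carlo analysis of the kind this panel runs can be sketched as repeated rollouts: sample episodes from the transition model under some policy and average their discounted returns. The model, rewards, and uniformly random policy below are illustrative assumptions, not the tool's actual simulator.

```python
import random

# Minimal Monte Carlo sketch: sample episodes under a uniformly random
# policy and average the discounted return from the start state.
# The transition model and rewards here are illustrative assumptions.

GAMMA = 0.95

# P[state][action] -> list of (next_state, probability, reward)
P = {
    "S0": {"a": [("S1", 0.9, 0.0), ("S2", 0.1, 0.0)], "b": [("S2", 1.0, 0.0)]},
    "S1": {"a": [("S2", 1.0, 1.0)], "b": [("S0", 1.0, 0.0)]},
    "S2": {"a": [("S0", 1.0, 0.0)], "b": [("S1", 1.0, 0.0)]},
}

def rollout(P, start, gamma, horizon, rng):
    """Sample one episode of fixed length; return its discounted return."""
    s, g, discount = start, 0.0, 1.0
    for _ in range(horizon):
        action = rng.choice(sorted(P[s]))   # uniformly random policy
        r, cum = rng.random(), 0.0
        for s2, p, reward in P[s][action]:  # sample the next state
            cum += p
            if r <= cum:
                break
        g += discount * reward
        discount *= gamma
        s = s2
    return g

def monte_carlo_value(P, start, gamma, episodes=2000, horizon=100, seed=0):
    """Average discounted return over many sampled episodes."""
    rng = random.Random(seed)
    returns = [rollout(P, start, gamma, horizon, rng) for _ in range(episodes)]
    return sum(returns) / len(returns)

est = monte_carlo_value(P, "S0", GAMMA)
```

Since every per-step reward is at most 1, the estimate is bounded above by 1 / (1 − γ) = 20; increasing `episodes` tightens the estimate at the usual 1/√n Monte Carlo rate.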


Configure your MDP using the visual configurator above to see the graph.