├── .gitignore
├── LICENSE
├── README.md
├── gym_box.ipynb
├── sprites
│   ├── bot_blue.png
│   ├── floor.png
│   └── package.png
├── v0_warehouse_robot.py
├── v0_warehouse_robot_env.py
└── v0_warehouse_robot_train.py
/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__
2 | models
3 | logs
4 | *solution*.pkl
5 | *solution*.png
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2025 johnnycode8
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | # Gymnasium Custom Reinforcement Learning Environments
4 |
5 | Tutorials on how to create custom Gymnasium-compatible Reinforcement Learning environments using the [Gymnasium Library](https://gymnasium.farama.org/), formerly OpenAI’s Gym library. Each tutorial has a companion video explanation and code walkthrough on my YouTube channel [@johnnycode](https://www.youtube.com/@johnnycode).
6 |
7 |
8 | ## Custom Gym Environment part 1 - Warehouse Robot v0
9 | This basic tutorial shows, end to end, how to create a custom Gymnasium-compatible Reinforcement Learning environment. The tutorial is divided into three parts:
10 | 1. Model your problem.
11 | 2. Convert your problem into a Gymnasium-compatible environment.
12 | 3. Train an agent on your custom environment in two ways: with Q-Learning and with the Stable Baselines3 library.
13 |
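To make step 3 more concrete, here is a minimal, standard-library-only sketch of tabular Q-Learning on the same 4x5 grid with the same four actions. It is an illustration under stated assumptions, not the code from `v0_warehouse_robot_train.py`: the `step`, `train`, and `greedy_path` helpers, the reward of 1 for reaching the target, the fixed target cell, and all hyperparameters are hypothetical.

```python
import random

ROWS, COLS = 4, 5  # default grid size of WarehouseRobot
# (d_row, d_col) per action index, in RobotAction order: LEFT, DOWN, RIGHT, UP
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def step(pos, action, target):
    """Move one cell, clamped to the grid. Assumed reward: 1 on reaching the target."""
    d_row, d_col = MOVES[action]
    row = min(max(pos[0] + d_row, 0), ROWS - 1)
    col = min(max(pos[1] + d_col, 0), COLS - 1)
    new_pos = (row, col)
    done = new_pos == target
    return new_pos, (1.0 if done else 0.0), done


def train(target, episodes=1000, alpha=0.9, gamma=0.9, seed=0):
    """Tabular Q-Learning with epsilon-greedy exploration (hypothetical settings)."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * 4 for r in range(ROWS) for c in range(COLS)}
    epsilon = 1.0
    for _ in range(episodes):
        pos = (0, 0)  # the robot always starts in the top-left cell
        for _ in range(200):  # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.randrange(4)  # explore
            else:
                action = max(range(4), key=lambda i: q[pos][i])  # exploit
            nxt, reward, done = step(pos, action, target)
            best_next = 0.0 if done else max(q[nxt])
            # Q-Learning update rule
            q[pos][action] += alpha * (reward + gamma * best_next - q[pos][action])
            pos = nxt
            if done:
                break
        epsilon = max(0.05, epsilon * 0.995)  # decay exploration over time
    return q


def greedy_path(q, target, max_steps=20):
    """Follow the highest-valued action from the start cell."""
    pos, path = (0, 0), [(0, 0)]
    for _ in range(max_steps):
        action = max(range(4), key=lambda i: q[pos][i])
        pos, _, done = step(pos, action, target)
        path.append(pos)
        if done:
            break
    return path


if __name__ == "__main__":
    q_table = train(target=(2, 3))
    print(greedy_path(q_table, target=(2, 3)))
```

The Q-table here is keyed only by the robot position because the target is fixed. With a randomly placed target, as in `WarehouseRobot.reset()`, the state would also need to include the target position.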
14 | ##### Code Reference:
15 | * v0_warehouse_robot*.py
16 |
17 | ##### YouTube Tutorial:
18 |
19 |
20 |
21 | ## Custom Gym Environment part 2 - Visualization with Pygame
22 | In part 1, we created a very simple custom Reinforcement Learning environment that is compatible with Farama Gymnasium (formerly OpenAI Gym). In this tutorial, we'll do a minor upgrade and visualize our environment using Pygame.
23 |
24 | ##### Code Reference:
25 | * v0_warehouse_robot*.py
26 |
27 | ##### YouTube Tutorial:
28 |
29 |
30 |
31 |
32 | # Additional Resources
33 |
34 | ## How gymnasium.spaces.Box Works
35 | The Box space type is used in many Gymnasium environments and you'll likely need it for your custom environment. The Box action space can be used to validate agent actions or generate random actions. The Box observation space can be used to validate the environment's state. This video explains and demos how to create boxes with different shapes, lower (low) and upper (high) bounds, and int or float data types.
36 |
37 | ##### YouTube Tutorial:
38 |
39 |
40 | ## Reinforcement Learning Tutorials
41 | For more Reinforcement Learning and Deep Reinforcement Learning tutorials, check out my
42 | [Gym Solutions](https://github.com/johnnycode8/gym_solutions) repository.
43 |
45 |
--------------------------------------------------------------------------------
/gym_box.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "import gymnasium as gym\n",
10 | "from gymnasium import spaces\n",
11 | "import numpy as np"
12 | ]
13 | },
14 | {
15 | "cell_type": "code",
16 | "execution_count": 318,
17 | "metadata": {},
18 | "outputs": [
19 | {
20 | "data": {
21 | "text/plain": [
22 | "array([[ 1.54535348, 12.69607217, 24.35233266],\n",
23 | " [ 2.30901039, 18.2606693 , -1.68016234]])"
24 | ]
25 | },
26 | "execution_count": 318,
27 | "metadata": {},
28 | "output_type": "execute_result"
29 | }
30 | ],
31 | "source": [
32 | "myspace = spaces.Box( \n",
33 | " low = -np.inf,\n",
34 | " high = np.array([[2, 15, 25], \n",
35 | " [3, 19, np.inf]]),\n",
36 | " shape=(2,3),\n",
37 | " dtype=np.float64\n",
38 | ")\n",
39 | "\n",
40 | "myspace.sample()"
41 | ]
42 | }
43 | ],
44 | "metadata": {
45 | "kernelspec": {
46 | "display_name": "gymenv",
47 | "language": "python",
48 | "name": "python3"
49 | },
50 | "language_info": {
51 | "codemirror_mode": {
52 | "name": "ipython",
53 | "version": 3
54 | },
55 | "file_extension": ".py",
56 | "mimetype": "text/x-python",
57 | "name": "python",
58 | "nbconvert_exporter": "python",
59 | "pygments_lexer": "ipython3",
60 | "version": "3.11.3"
61 | }
62 | },
63 | "nbformat": 4,
64 | "nbformat_minor": 2
65 | }
66 |
--------------------------------------------------------------------------------
/sprites/bot_blue.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/johnnycode8/gym_custom_env/84262cede528d5dd36847c465a40ba58b0668861/sprites/bot_blue.png
--------------------------------------------------------------------------------
/sprites/floor.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/johnnycode8/gym_custom_env/84262cede528d5dd36847c465a40ba58b0668861/sprites/floor.png
--------------------------------------------------------------------------------
/sprites/package.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/johnnycode8/gym_custom_env/84262cede528d5dd36847c465a40ba58b0668861/sprites/package.png
--------------------------------------------------------------------------------
/v0_warehouse_robot.py:
--------------------------------------------------------------------------------
1 | '''
2 | This module models the problem to be solved. In this very simple example, the problem is to optimize a Robot that works in a Warehouse.
3 | The Warehouse is divided into a rectangular grid. A Target is randomly placed on the grid and the Robot's goal is to reach the Target.
4 | '''
5 | import random
6 | from enum import Enum
7 | import pygame
8 | import sys
9 | from os import path
10 |
11 | # Actions the Robot is capable of performing i.e. go in a certain direction
12 | class RobotAction(Enum):
13 |     LEFT=0
14 |     DOWN=1
15 |     RIGHT=2
16 |     UP=3
17 |
18 | # The Warehouse is divided into a grid. Use these 'tiles' to represent the objects on the grid.
19 | class GridTile(Enum):
20 |     _FLOOR=0
21 |     ROBOT=1
22 |     TARGET=2
23 |
24 |     # Return the first letter of tile name, for printing to the console.
25 |     def __str__(self):
26 |         return self.name[:1]
27 |
28 | class WarehouseRobot:
29 |
30 |     # Initialize the grid size and rendering speed (fps). To make randomness (Targets) repeatable, pass an integer seed to reset().
31 |     def __init__(self, grid_rows=4, grid_cols=5, fps=1):
32 |         self.grid_rows = grid_rows
33 |         self.grid_cols = grid_cols
34 |         self.reset()
35 |
36 |         self.fps = fps
37 |         self.last_action=''
38 |         self._init_pygame()
39 |
40 |     def _init_pygame(self):
41 |         pygame.init() # initialize pygame
42 |         pygame.display.init() # Initialize the display module
43 |
44 |         # Game clock
45 |         self.clock = pygame.time.Clock()
46 |
47 |         # Default font
48 |         self.action_font = pygame.font.SysFont("Calibre",30)
49 |         self.action_info_height = self.action_font.get_height()
50 |
51 |         # For rendering
52 |         self.cell_height = 64
53 |         self.cell_width = 64
54 |         self.cell_size = (self.cell_width, self.cell_height)
55 |
56 |         # Define game window size (width, height)
57 |         self.window_size = (self.cell_width * self.grid_cols, self.cell_height * self.grid_rows + self.action_info_height)
58 |
59 |         # Initialize game window
60 |         self.window_surface = pygame.display.set_mode(self.window_size)
61 |
62 |         # Load & resize sprites
63 |         file_name = path.join(path.dirname(__file__), "sprites/bot_blue.png")
64 |         img = pygame.image.load(file_name)
65 |         self.robot_img = pygame.transform.scale(img, self.cell_size)
66 |
67 |         file_name = path.join(path.dirname(__file__), "sprites/floor.png")
68 |         img = pygame.image.load(file_name)
69 |         self.floor_img = pygame.transform.scale(img, self.cell_size)
70 |
71 |         file_name = path.join(path.dirname(__file__), "sprites/package.png")
72 |         img = pygame.image.load(file_name)
73 |         self.goal_img = pygame.transform.scale(img, self.cell_size)
74 |
75 |
76 |     def reset(self, seed=None):
77 |         # Initialize Robot's starting position
78 |         self.robot_pos = [0,0]
79 |
80 |         # Random Target position
81 |         random.seed(seed)
82 |         self.target_pos = [
83 |             random.randint(1, self.grid_rows-1),
84 |             random.randint(1, self.grid_cols-1)
85 |         ]
86 |
87 |     def perform_action(self, robot_action:RobotAction) -> bool:
88 |         self.last_action = robot_action
89 |
90 |         # Move Robot to the next cell
91 |         if robot_action == RobotAction.LEFT:
92 |             if self.robot_pos[1]>0:
93 |                 self.robot_pos[1]-=1
94 |         elif robot_action == RobotAction.RIGHT:
95 |             if self.robot_pos[1]<self.grid_cols-1:
96 |                 self.robot_pos[1]+=1
97 |         elif robot_action == RobotAction.UP:
98 |             if self.robot_pos[0]>0:
99 |                 self.robot_pos[0]-=1
100 |         elif robot_action == RobotAction.DOWN:
101 |             if self.robot_pos[0]