├── .gitignore
├── LICENSE
├── README.md
├── gym_box.ipynb
├── sprites
    ├── bot_blue.png
    ├── floor.png
    └── package.png
├── v0_warehouse_robot.py
├── v0_warehouse_robot_env.py
└── v0_warehouse_robot_train.py


/.gitignore:
--------------------------------------------------------------------------------
__pycache__
models
logs
*solution*.pkl
*solution*.png


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2025 johnnycode8

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

# Gymnasium Custom Reinforcement Learning Environments

Tutorials on how to create custom Gymnasium-compatible Reinforcement Learning environments using the [Gymnasium Library](https://gymnasium.farama.org/), formerly OpenAI’s Gym library. Each tutorial has a companion video explanation and code walkthrough from my YouTube channel [@johnnycode](https://www.youtube.com/@johnnycode).

## Custom Gym Environment part 1 - Warehouse Robot v0
This basic tutorial shows, end to end, how to create a custom Gymnasium-compatible Reinforcement Learning environment. The tutorial is divided into three parts:
1. Model your problem.
2. Convert your problem into a Gymnasium-compatible environment.
3. Train your custom environment in two ways: using Q-Learning and using the Stable Baselines3 library.

##### Code Reference:
* v0_warehouse_robot*.py

##### YouTube Tutorial:
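
##### Q-Learning Sketch:
To make step 3 above more concrete, here is a rough, standalone sketch of tabular Q-Learning on a 4x5 grid like the one used by the Warehouse Robot. It is not the repository's training script: it deliberately avoids Gymnasium and Pygame, and the names `step`, `train`, `greedy_path`, the reward of 1.0 on reaching the target, the fixed target cell, and all hyperparameter values are assumptions chosen for illustration.

```python
import random

ROWS, COLS = 4, 5
# (d_row, d_col) per action, in the same order as the RobotAction enum:
# 0=LEFT, 1=DOWN, 2=RIGHT, 3=UP
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]
START = (0, 0)


def step(pos, action, target):
    """Move one cell, staying inside the grid. Returns (new_pos, reward, done)."""
    d_row, d_col = MOVES[action]
    row = min(max(pos[0] + d_row, 0), ROWS - 1)
    col = min(max(pos[1] + d_col, 0), COLS - 1)
    new_pos = (row, col)
    done = new_pos == target
    # Assumed reward scheme: 1.0 on reaching the target, 0.0 otherwise.
    return new_pos, (1.0 if done else 0.0), done


def train(target=(2, 3), episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table mapping each grid cell to four action values."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(MOVES) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        pos = START
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(MOVES))  # explore
            else:
                best = max(q[pos])  # exploit, breaking ties at random
                action = rng.choice([a for a, v in enumerate(q[pos]) if v == best])
            new_pos, reward, done = step(pos, action, target)
            # Q-Learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            future = 0.0 if done else gamma * max(q[new_pos])
            q[pos][action] += alpha * (reward + future - q[pos][action])
            pos = new_pos
            if done:
                break
    return q


def greedy_path(q, target, max_steps=50):
    """Follow the highest-valued action from START and record the visited cells."""
    pos, visited = START, [START]
    for _ in range(max_steps):
        action = max(range(len(MOVES)), key=lambda a: q[pos][a])
        pos, _, done = step(pos, action, target)
        visited.append(pos)
        if done:
            break
    return visited
```

Calling `greedy_path(train(), (2, 3))` returns the list of cells the Robot visits when it follows the learned Q-table from the start cell. A dictionary keyed by grid cell keeps the sketch readable; a NumPy array indexed by row, column and action would work equally well.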



## Custom Gym Environment part 2 - Visualization with Pygame
In part 1, we created a very simple custom Reinforcement Learning environment that is compatible with Farama Gymnasium (formerly OpenAI Gym). In this tutorial, we'll do a minor upgrade and visualize our environment using Pygame.

##### Code Reference:
* v0_warehouse_robot*.py

##### YouTube Tutorial:




# Additional Resources

## How gymnasium.spaces.Box Works
The Box space type is used in many Gymnasium environments and you'll likely need it for your custom environment. The Box action space can be used to validate agent actions or generate random actions. The Box observation space can be used to validate the environment's state. This video explains and demos how to create boxes of different sizes/shapes, with lower (low) and upper (high) boundaries, and with int or float data types.

##### YouTube Tutorial:


## Reinforcement Learning Tutorials
For more Reinforcement Learning and Deep Reinforcement Learning tutorials, check out my [Gym Solutions](https://github.com/johnnycode8/gym_solutions) repository.



--------------------------------------------------------------------------------
/gym_box.ipynb:
--------------------------------------------------------------------------------
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import gymnasium as gym\n",
    "from gymnasium import spaces\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 318,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.54535348, 12.69607217, 24.35233266],\n",
       "       [ 2.30901039, 18.2606693 , -1.68016234]])"
      ]
     },
     "execution_count": 318,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "myspace = spaces.Box( \n",
    "    low = -np.inf,\n",
    "    high = np.array([[2, 15, 25], \n",
    "                     [3, 19, np.inf]]),\n",
    "    shape=(2,3),\n",
    "    dtype=np.float64\n",
    ")\n",
    "\n",
    "myspace.sample()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "gymenv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


--------------------------------------------------------------------------------
/sprites/bot_blue.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/johnnycode8/gym_custom_env/84262cede528d5dd36847c465a40ba58b0668861/sprites/bot_blue.png


--------------------------------------------------------------------------------
/sprites/floor.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/johnnycode8/gym_custom_env/84262cede528d5dd36847c465a40ba58b0668861/sprites/floor.png


--------------------------------------------------------------------------------
/sprites/package.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/johnnycode8/gym_custom_env/84262cede528d5dd36847c465a40ba58b0668861/sprites/package.png


--------------------------------------------------------------------------------
/v0_warehouse_robot.py:
--------------------------------------------------------------------------------
'''
This module models the problem to be solved. In this very simple example, the problem is to optimize a Robot that works in a Warehouse.
The Warehouse is divided into a rectangular grid. A Target is randomly placed on the grid and the Robot's goal is to reach the Target.
'''
import random
from enum import Enum
import pygame
import sys
from os import path

# Actions the Robot is capable of performing i.e. go in a certain direction
class RobotAction(Enum):
    LEFT=0
    DOWN=1
    RIGHT=2
    UP=3

# The Warehouse is divided into a grid. Use these 'tiles' to represent the objects on the grid.
class GridTile(Enum):
    _FLOOR=0
    ROBOT=1
    TARGET=2

    # Return the first letter of tile name, for printing to the console.
    def __str__(self):
        return self.name[:1]

class WarehouseRobot:

    # Initialize the grid size and the rendering speed (fps).
    # To make randomness (Targets) repeatable, pass an integer seed to reset().
    def __init__(self, grid_rows=4, grid_cols=5, fps=1):
        self.grid_rows = grid_rows
        self.grid_cols = grid_cols
        self.reset()

        self.fps = fps
        self.last_action=''
        self._init_pygame()

    def _init_pygame(self):
        pygame.init() # initialize pygame
        pygame.display.init() # Initialize the display module

        # Game clock
        self.clock = pygame.time.Clock()

        # Default font
        self.action_font = pygame.font.SysFont("Calibre",30)
        self.action_info_height = self.action_font.get_height()

        # For rendering
        self.cell_height = 64
        self.cell_width = 64
        self.cell_size = (self.cell_width, self.cell_height)

        # Define game window size (width, height)
        self.window_size = (self.cell_width * self.grid_cols, self.cell_height * self.grid_rows + self.action_info_height)

        # Initialize game window
        self.window_surface = pygame.display.set_mode(self.window_size)

        # Load & resize sprites
        file_name = path.join(path.dirname(__file__), "sprites/bot_blue.png")
        img = pygame.image.load(file_name)
        self.robot_img = pygame.transform.scale(img, self.cell_size)

        file_name = path.join(path.dirname(__file__), "sprites/floor.png")
        img = pygame.image.load(file_name)
        self.floor_img = pygame.transform.scale(img, self.cell_size)

        file_name = path.join(path.dirname(__file__), "sprites/package.png")
        img = pygame.image.load(file_name)
        self.goal_img = pygame.transform.scale(img, self.cell_size)


    def reset(self, seed=None):
        # Initialize Robot's starting position
        self.robot_pos = [0,0]

        # Random Target position
        random.seed(seed)
        self.target_pos = [
            random.randint(1, self.grid_rows-1),
            random.randint(1, self.grid_cols-1)
        ]

    def perform_action(self, robot_action:RobotAction) -> bool:
        self.last_action = robot_action

        # Move Robot to the next cell, unless it is already at the edge of the grid
        if robot_action == RobotAction.LEFT:
            if self.robot_pos[1]>0:
                self.robot_pos[1]-=1
        elif robot_action == RobotAction.RIGHT:
            if self.robot_pos[1]<self.grid_cols-1:
                self.robot_pos[1]+=1
        elif robot_action == RobotAction.UP:
            if self.robot_pos[0]>0:
                self.robot_pos[0]-=1
        elif robot_action == RobotAction.DOWN:
            if self.robot_pos[0]