├── README.md
├── images
│   ├── world.gif
│   └── world.png
├── launch
│   └── serp_rl.launch.py
├── package.xml
├── param
│   └── params.yaml
├── resource
│   └── serp_rl
├── rviz
│   └── robot_navigation.rviz
├── serp_rl
│   └── __init__.py
├── setup.cfg
├── setup.py
├── test
│   ├── test_copyright.py
│   ├── test_flake8.py
│   └── test_pep257.py
└── world
    ├── empty.png
    ├── empty.yaml
    ├── end_beacon.model.yaml
    ├── serp.model.yaml
    ├── turn.png
    ├── turn.yaml
    └── world.yaml
/README.md:
--------------------------------------------------------------------------------
1 | # Flatland Reinforcement Learning Tutorial using ROS 2
2 | 
3 | The [previous tutorial](https://github.com/FilipeAlmeidaFEUP/ros2_teleopkeys_tutorial) focused on explaining how to use ROS 2 and Flatland to create a robot and control it. In this tutorial, you will learn how to use Reinforcement Learning (RL) inside the same setup to teach the robot how to perform a simple task. The packages used for the RL algorithms are [Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/) and OpenAI's [Gym](https://www.gymlibrary.dev/).
4 | 
5 | This tutorial was developed alongside a scientific publication ([link](https://www.researchgate.net/publication/380135583_An_Educational_Kit_for_Simulated_Robot_Learning_in_ROS_2)). If you found this tutorial useful for your robotics project, we would be grateful if you included the citation below. Thank you!
6 | 
7 | (IEEE citation format)
8 | ```
9 | F. Almeida, G. Leão, A. Sousa, “An Educational Kit for Simulated Robot Learning in ROS 2”, Iberian Robotics Conference, pp. 513-525, 2023.
10 | ```
11 | 
12 | (BibTeX citation)
13 | ```
14 | @inproceedings{almeida2023educational,
15 |   title={An Educational Kit for Simulated Robot Learning in ROS 2},
16 |   author={Almeida, Filipe and Le{\~a}o, Gon{\c{c}}alo and Sousa, Armando},
17 |   booktitle={Iberian Robotics conference},
18 |   pages={513--525},
19 |   year={2023},
20 |   publisher={Springer}
21 | }
22 | ```
23 | 
24 | ## First Run
25 | 
26 | ### Prerequisites
27 | 
28 | All the prerequisites are carried over from the previous tutorial. In addition, you also need to install the Stable-Baselines3 package. Follow the [installation guide](https://stable-baselines3.readthedocs.io/en/master/guide/install.html) from the documentation. For this example, you can install the most basic version, either for [Windows](https://stable-baselines3.readthedocs.io/en/master/guide/install.html#windows-10) or using [pip](https://stable-baselines3.readthedocs.io/en/master/guide/install.html#stable-release) for other operating systems.
29 | 
30 | The Stable-Baselines3 installation should automatically install all missing dependencies, including the Gym package. Nevertheless, pay attention during the installation and make sure there are no errors. If any dependency fails to install, try installing it manually.
31 | 
32 | If you are using the [VM provided](https://drive.google.com/file/d/1Wte7yGi9puJU5gR8mpzAtvtKOPtoYKEJ/view?usp=sharing), the package is ready to run.
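For a quick reference, installing the stable release with pip looks like the line below (the exact command for your setup is in the installation guide linked above):
```
pip install stable-baselines3
```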
33 | 
34 | ### Running the code
35 | 
36 | Clone the repository into the `src` folder of your workspace:
37 | ```
38 | git clone https://github.com/FilipeAlmeidaFEUP/ros2_flatland_rl_tutorial.git
39 | ```
40 | 
41 | Install dependencies and build the project:
42 | ```
43 | rosdep install -i --from-path src --rosdistro humble -y
44 | colcon build
45 | source install/setup.bash
46 | ```
47 | 
48 | Run the launch file:
49 | ```
50 | ros2 launch serp_rl serp_rl.launch.py
51 | ```
52 | 
53 | At this point, if no errors occur, you should see the following world:
54 | 
55 | ![RL world](images/world.png)
56 | 
57 | ![RL world](images/world.gif)
58 | 
59 | The robot is currently using the Proximal Policy Optimization (PPO) algorithm to learn how to navigate the hallway from one end to the other. In the beginning, the robot is just exploring the environment, taking practically random actions. Over time, it starts to learn what to do in the different scenarios and improves at the task.
60 | 
61 | The target area is represented by the green circle and, every time the task is restarted, the initial and final positions swap so the robot learns how to turn both left and right. The task is restarted if it fails (there is a collision or it takes too much time) or if it succeeds (reaches the end).
62 | 
63 | These are all the components in the world and how they changed from the previous tutorial:
64 | - Map: now has a simpler layout, representing a hallway with a 90º turn. Same configuration with a different [image](world/turn.png).
65 | - SERP robot model with LiDAR: this model is exactly the same as before.
66 | - End Beacon: new model added to detect when the SERP robot reaches the end. It is represented by a green circle and its body does not collide with any other component.
67 | - End Beacon LiDAR: laser plugin inside the end beacon model that ignores walls and only detects the SERP robot. The smallest value read is the distance from the robot to the end and, if it is below a threshold, the target area has been reached.
68 | 
69 | NOTE: The layers were set up in order to accommodate the necessary collisions and lasers. If you're having trouble understanding the logic, revisit this [Google Slides presentation on how to manage layers](https://docs.google.com/presentation/d/1KqJDQR_PBaGtS-kA5KgsTbaRThu_uP2U8NJeoSP0GDE/edit?usp=sharing). The last slide shows the layer graph for this world.
70 | 
71 | Since using RL can be very time-consuming, this simulation was sped up. This can be done by modifying the `update_rate` in Flatland, as mentioned in the [previous tutorial](https://github.com/FilipeAlmeidaFEUP/ros2_teleopkeys_tutorial#launch-file).
72 | 
73 | NOTE: If you are using the VM and are running into performance issues, it might be a good idea to lower the `update_rate`.
74 | 
75 | The next sections explain how the controller code needs to be organized to use any RL algorithm provided by Stable-Baselines3.
76 | 
77 | ## Setting up the Environment
78 | 
79 | Important definitions before you start:
80 | - RL Agent: the component that decides what action to take based on the state of its environment.
81 | - RL Environment: the component that contains information about all possible actions for every state and the respective reward (how good an action is for a given state).
82 | 
83 | 
84 | Every RL algorithm has the same goal: find the best action for the agent to take in every possible state of the environment.
The first step to using RL in robotics (also known as Robot Learning) is to turn the robot (and its sensors) and the map into the environment for the agent to use.
85 | 
86 | This is exactly what OpenAI's Gym package allows us to do, with an `Env` class that can be inherited. In this package, we can reuse the node class:
87 | ```
88 | from gym import Env
89 | 
90 | class SerpControllerEnv(Node, Env):
91 |     [...]
92 | ```
93 | 
94 | This will then allow us to use variables and override functions that define the environment. The next sections explain how to configure this environment, but you can also check the documentation on [how to make your own custom environment](https://www.gymlibrary.dev/content/environment_creation/).
95 | 
96 | ### \_\_init\_\_ function
97 | 
98 | Inside the init function, you need to initialize the `action_space`, `observation_space` and `state` variables.
99 | 
100 | Let us start by choosing the actions the robot can perform. For this project, there are three possible actions, executed by changing the robot's linear and angular velocity:
101 | 1. Move Forward
102 | 2. Rotate Left
103 | 3. Rotate Right
104 | 
105 | Then we need to choose what a state of our environment looks like (the `observation_space`). For that, we'll use the readings from the LiDAR. Since using all 90 rays would make the state space too large for the agent to train in a reasonable time, we need to sample the readings. To do that, the LiDAR was divided into 9 equal sections, and from each one we take the closest reading. This means that our observation space is composed of 9 floating-point values.
106 | 
107 | NOTE: In more conventional definitions of a state in RL, the agent has total knowledge of the environment (position of all entities, shape of the map, etc.). In this case (as in a lot of other Robot Learning applications), the agent can only know what the robot knows, which is the LiDAR readings.
108 | 
109 | To actually initialize the variables, we need to define their shape using [Gym Spaces](https://www.gymlibrary.dev/api/spaces/). For this project, these shapes are used (assuming `numpy` is imported):
110 | ```
111 | from gym.spaces import Discrete, Box
112 | 
113 | # action is an integer between 0 and 2 (total of 3 actions)
114 | self.action_space = Discrete(3)
115 | 
116 | # state is represented by a numpy array with size 9 and values between 0 and 2
117 | self.observation_space = Box(0, 2, shape=(9,), dtype=numpy.float64)
118 | 
119 | # after the observation space is defined, you can declare the initial state
120 | self.state = numpy.array(self.lidar_sample)
121 | ```
122 | 
123 | ### step function
124 | 
125 | In RL, a step is the process that is constantly repeated and consists of:
126 | 1. Receiving the action decided by the agent for the current state and executing it.
127 | 2. Determining the new state.
128 | 3. Calculating the reward based on the old and new states and the action.
129 | 4. Checking if a final state was reached and the environment needs to be reset.
130 | 
131 | In the code for this package, this equates to:
132 | 1. Changing the robot speed based on the chosen action.
133 | 2. Waiting for the action to be completed. In this case, the action finishes when the next LiDAR reading is published, so each step lasts one LiDAR update period (0.1 seconds).
134 | 3. Calculating the reward based on events detected in the simulation while performing the action and other available information.
135 | 4. Determining if a final state was reached, based either on events from subscribed topics during the action (a collision or the end being reached) or on the number of actions already taken (times out at 200 steps).
136 | 
137 | Returns: the current state, the reward, whether a final state was reached (boolean) and an info dictionary.
138 | 
139 | #### Calculating the Reward
140 | 
141 | The events that trigger rewards for an action are:
142 | 
143 | - Detecting the end was reached (large positive reward)
144 | - Detecting a collision (large negative reward)
145 | - Exceeding the maximum number of steps (large negative reward)
146 | - Choosing the action to move forward (small positive reward)
147 | 
148 | NOTE: The reward calculation can still be improved, allowing the agent to train faster. A good exercise for you would be to think of other things that can be added to help the robot complete the task. Hint: take advantage of all the information available, like the old and new LiDAR readings and the action taken.
149 | 
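For reference, here is a condensed sketch of how this package's `step` function implements the points above (the full version, with logging and the end-state info dictionary, is in `serp_rl/__init__.py`):
```
def step(self, action):
    # 1. execute the action by setting the robot's linear and angular speeds
    self.change_robot_speeds(self.pub, self.actions[action][0], self.actions[action][1])

    # 2. the action lasts until the next (sampled) LiDAR reading arrives
    self.lidar_sample = []
    self.wait_lidar_reading()
    self.change_robot_speeds(self.pub, 0.0, 0.0)
    self.state = np.array(self.lidar_sample)
    self.step_number += 1

    # 3. and 4. compute the reward and check for final states
    done = False
    if self.collision:                           # collision: large penalty
        reward, done = -200, True
    elif self.distance_to_end < self.end_range:  # end reached: large bonus (more if faster)
        reward, done = 400 + (200 - self.step_number), True
    elif self.step_number >= self.max_steps:     # timeout: large penalty
        reward, done = -300, True
    elif action == 0:                            # moving forward: small bonus
        reward = 2
    else:
        reward = 0

    return self.state, reward, done, {}
```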
150 | ### reset function
151 | 
152 | An episode, in the context of RL, is the set of steps between an initial state and a final state. The reset function has the task of starting a new episode by setting the environment back to an initial state. In this case, this means:
153 | - Placing all the models (SERP and end beacon) in their starting positions.
154 | - Resetting all variables that need to be reset.
155 | - Determining the new initial state (the next reading from the LiDAR).
156 | 
157 | Returns: the initial state.
158 | 
159 | ### Other functions
160 | 
161 | - render: Used to render your environment. Not needed here since it is already rendered in Flatland.
162 | - close: Runs when the environment is no longer needed. Can be used to close ROS 2 processes that are no longer necessary.
163 | 
164 | ## Running the RL algorithm
165 | 
166 | The most recently developed RL algorithms are very effective but also very hard to understand. The good news is that, thanks to the Stable-Baselines3 package, you can completely abstract from how the algorithm works. In fact, if you have a properly set up environment, getting an algorithm to run an agent on it only takes a few lines of code:
167 | 
168 | ```
169 | from stable_baselines3 import PPO
170 | from stable_baselines3.common.env_checker import check_env
171 | 
172 | env = SerpControllerEnv()
173 | 
174 | # optional, but raises exceptions with helpful messages to debug your environment if it has errors
175 | check_env(env)
176 | 
177 | # create the agent for the PPO algorithm and assign it one of the predefined policies and the environment
178 | agent = PPO("MlpPolicy", env)
179 | 
180 | # let the agent learn for 25000 steps
181 | agent.learn(total_timesteps=25000)
182 | ```
183 | 
184 | Go to the [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/algos.html) to see all the RL algorithms available in this package. Notice that you can swap between algorithms with very few changes to the code.
185 | 
186 | After the training is complete, you can test your model by manually calling the environment functions:
187 | ```
188 | obs = env.reset()
189 | while True:
190 |     # returns an action based on what the agent learned
191 |     action, _states = agent.predict(obs)
192 | 
193 |     obs, reward, done, info = env.step(action)
194 |     if done:
195 |         obs = env.reset()
196 | ```
197 | 
198 | You can easily store your trained agent in a file and load it later with the functions:
199 | ```
200 | agent.save("ppo")
201 | agent = PPO.load("ppo")
202 | ```
203 | 
204 | This code needs to run in parallel with the ROS 2 processes, so threading is used.
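This is the pattern used in this package's `main` function (in `serp_rl/__init__.py`, where `rclpy` and the environment class are already imported): the RL loop runs in a background thread while `rclpy.spin` keeps processing the ROS 2 callbacks (LiDAR readings, collisions, etc.) in the main thread:
```
import threading

def main(args=None):
    rclpy.init(args=args)

    serp = SerpControllerEnv()

    # run the RL training and testing loop in a separate thread
    thread = threading.Thread(target=serp.run_rl_alg)
    thread.start()

    # process subscriptions and publications in the main thread
    rclpy.spin(serp)
```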
205 | 
206 | The function `learn` trains the agent for a given number of steps, which does not guarantee that by the end of training the agent will be capable of performing the task. To make sure your agent is properly trained, you need to think about the best strategy to train it. The next section presents one option to resolve this issue.
207 | 
208 | ### Training Strategies
209 | 
210 | The goal is to make sure you end up with an agent capable of completing the task, meaning that it needs to be validated somehow. One possible solution, which is used in this project, is to follow these steps (sketched in code below):
211 | 
212 | 1. Train for a given number of steps.
213 | 2. Test the agent for a set number of episodes and determine the accuracy (number of successful episodes divided by the total number of episodes).
214 | 3. If the accuracy is above a given threshold, finish training; otherwise, go back to step 1.
215 | 
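In simplified form, this is what `run_rl_alg` in `serp_rl/__init__.py` does, with the values used in this project (5000 training steps per iteration, 20 test episodes and a 0.8 accuracy threshold); here `env` stands for the environment instance:
```
min_accuracy = 0.8    # target accuracy
n_test_episodes = 20  # episodes tested after each training iteration
accuracy = 0.0

while accuracy < min_accuracy:
    # 1. train for a given number of steps
    agent.learn(total_timesteps=5000)

    # 2. test the agent and measure the accuracy
    successful_episodes = 0
    for _ in range(n_test_episodes):
        if env.run_episode(agent):
            successful_episodes += 1
    accuracy = successful_episodes / n_test_episodes

    # 3. the loop ends once the accuracy reaches the threshold
```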
216 | 
217 | ## Next steps
218 | 
219 | Now that you know how Flatland, ROS 2 and RL can work together, you can start developing your own projects and experimenting with different maps, setups or tasks.
--------------------------------------------------------------------------------
/images/world.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FilipeAlmeidaFEUP/ros2_flatland_rl_tutorial/e7ed1180dc5155f801af8ed8d8ba08a19a60b326/images/world.gif
--------------------------------------------------------------------------------
/images/world.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FilipeAlmeidaFEUP/ros2_flatland_rl_tutorial/e7ed1180dc5155f801af8ed8d8ba08a19a60b326/images/world.png
--------------------------------------------------------------------------------
/launch/serp_rl.launch.py:
--------------------------------------------------------------------------------
1 | from launch import LaunchDescription
2 | from launch.actions import DeclareLaunchArgument, SetEnvironmentVariable, ExecuteProcess
3 | from launch.substitutions import LaunchConfiguration, PathJoinSubstitution, FindExecutable
4 | import launch.conditions as conditions
5 | from launch_ros.substitutions import FindPackageShare
6 | from launch_ros.actions import Node
7 | 
8 | 
9 | def generate_launch_description():
10 |     pkg_share = FindPackageShare("serp_rl")
11 | 
12 |     world_path = LaunchConfiguration("world_path")
13 |     update_rate = LaunchConfiguration("update_rate")
14 |     step_size = LaunchConfiguration("step_size")
15 |     show_viz = LaunchConfiguration("show_viz")
16 |     viz_pub_rate = LaunchConfiguration("viz_pub_rate")
17 |     use_sim_time = LaunchConfiguration("use_sim_time")
18 | 
19 |     ld = LaunchDescription(
20 |         [
21 |             # Flatland parameters.
22 |             # You can either change these values directly here or override their default values in the launch command. Example:
23 |             # ros2 launch serp_rl serp_rl.launch.py update_rate:="20.0"
24 |             DeclareLaunchArgument(
25 |                 name="world_path",
26 |                 default_value=PathJoinSubstitution([pkg_share, "world/world.yaml"]),
27 |             ),
28 |             DeclareLaunchArgument(name="update_rate", default_value="1000.0"),
29 |             DeclareLaunchArgument(name="step_size", default_value="0.01"),
30 |             DeclareLaunchArgument(name="show_viz", default_value="true"),
31 |             DeclareLaunchArgument(name="viz_pub_rate", default_value="30.0"),
32 |             DeclareLaunchArgument(name="use_sim_time", default_value="true"),
33 | 
34 |             SetEnvironmentVariable(name="ROSCONSOLE_FORMAT", value="[${severity} ${time} ${logger}]: ${message}"),
35 | 
36 |             # **** Nodes launched by this file ****
37 |             # launch the Flatland server
38 |             Node(
39 |                 name="flatland_server",
40 |                 package="flatland_server",
41 |                 executable="flatland_server",
42 |                 output="screen",
43 |                 parameters=[
44 |                     # use the arguments passed into the launch file for this node
45 |                     {"world_path": world_path},
46 |                     {"update_rate": update_rate},
47 |                     {"step_size": step_size},
48 |                     {"show_viz": show_viz},
49 |                     {"viz_pub_rate": viz_pub_rate},
50 |                     {"use_sim_time": use_sim_time},
51 |                 ],
52 |             ),
53 |             # runs the code in the file serp_rl/__init__.py that is used to control the robot
54 |             Node(
55 |                 name="serp_rl",
56 |                 package="serp_rl",
57 |                 executable="serp_rl",
58 |                 output="screen",
59 |             ),
60 | 
61 |             # static transform between the map and odom frames
62 |             Node(
63 |                 name="tf",
64 |                 package="tf2_ros",
65 |                 executable="static_transform_publisher",
66 |                 arguments=["0", "0", "0", "0", "0", "0", "map", "odom"],
67 |             ),
68 | 
69 |             # visualisation
70 |             Node(
71 |                 name="rviz",
72 |                 package="rviz2",
73 |                 executable="rviz2",
74 |                 arguments=["-d", PathJoinSubstitution([pkg_share, "rviz/robot_navigation.rviz"])],
75 |                 parameters=[{"use_sim_time": use_sim_time}],
76 |                 condition=conditions.IfCondition(show_viz),
77 |             ),
78 |         ]
79 |     )
80 |     return ld
81 | 
82 | 
83 | if __name__ == "__main__":
84 |     generate_launch_description()
85 | 
--------------------------------------------------------------------------------
/package.xml:
--------------------------------------------------------------------------------
1 | <?xml version="1.0"?>
2 | <?xml-model href="http://download.ros.org/schema/package_format3.xsd" schematypens="http://www.w3.org/2001/XMLSchema"?>
3 | <package format="3">
4 |   <name>serp_rl</name>
5 |   <version>0.0.0</version>
6 |   <description>TODO: Package description</description>
7 |   <maintainer email="filipe.almeida.18@gmail.com">filipe</maintainer>
8 |   <license>TODO: License declaration</license>
9 | 
10 | 
11 |   <depend>rclpy</depend>
12 |   <depend>flatland_msgs</depend>
13 | 
14 |   <export>
15 |     <build_type>ament_python</build_type>
16 |   </export>
17 | </package>
18 | 
--------------------------------------------------------------------------------
/param/params.yaml:
--------------------------------------------------------------------------------
1 | do-reset: True
2 | do-stop: True
3 | do-odometry: False
4 | max-linVel: 2.0
5 | linAcc: 1.5
6 | linDec: 3.0
7 | max-angVel: 3.0
8 | angAcc: 8.0
9 | angDec: 8.0
--------------------------------------------------------------------------------
/resource/serp_rl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FilipeAlmeidaFEUP/ros2_flatland_rl_tutorial/e7ed1180dc5155f801af8ed8d8ba08a19a60b326/resource/serp_rl
--------------------------------------------------------------------------------
/rviz/robot_navigation.rviz:
--------------------------------------------------------------------------------
1 | Panels:
2 |   - Class: rviz_common/Displays
3 |     Help Height: 78
4 |     Name: Displays
5 |     Property Tree Widget:
6 |       Expanded:
7 |         - /Global Options1
8 |         - /TF1/Frames1
9 |         - /TF1/Tree1
10 |         - /Local Planning1
11 |         - /Global Planning1
12 |       Splitter Ratio: 0.5
13 |     Tree Height: 595
14 |   - Class: rviz_common/Selection
15 |     Name: Selection
16 |   - Class: 
rviz_common/Tool Properties 17 | Expanded: 18 | - /2D Pose Estimate1 19 | - /2D Nav Goal1 20 | Name: Tool Properties 21 | Splitter Ratio: 0.588679016 22 | - Class: rviz_common/Views 23 | Expanded: 24 | - /Current View1 25 | Name: Views 26 | Splitter Ratio: 0.5 27 | - Class: rviz_common/Time 28 | Experimental: false 29 | Name: Time 30 | SyncMode: 0 31 | SyncSource: LaserScan (kinect) 32 | - Class: nav2_rviz_plugins/Navigation 2 33 | Name: Navigation 2 34 | Visualization Manager: 35 | Class: "" 36 | Displays: 37 | - Alpha: 0.1 38 | Cell Size: 1 39 | Class: rviz_default_plugins/Grid 40 | Color: 224; 224; 224 41 | Enabled: true 42 | Line Style: 43 | Line Width: 0.0299999993 44 | Value: Lines 45 | Name: Grid 46 | Normal Cell Count: 0 47 | Offset: 48 | X: 0 49 | Y: 0 50 | Z: 0 51 | Plane: XY 52 | Plane Cell Count: 100 53 | Reference Frame: 54 | Value: true 55 | - Class: rviz_default_plugins/TF 56 | Enabled: false 57 | Frame Timeout: 15 58 | Frames: 59 | All Enabled: false 60 | Marker Scale: 1 61 | Name: TF 62 | Show Arrows: true 63 | Show Axes: true 64 | Show Names: false 65 | Tree: 66 | {} 67 | Update Interval: 0 68 | Value: false 69 | - Alpha: 1 70 | Autocompute Intensity Bounds: true 71 | Autocompute Value Bounds: 72 | Max Value: 10 73 | Min Value: -10 74 | Value: true 75 | Axis: Z 76 | Channel Name: intensity 77 | Class: rviz_default_plugins/LaserScan 78 | Color: 255; 0; 0 79 | Color Transformer: FlatColor 80 | Decay Time: 0 81 | Enabled: false 82 | Invert Rainbow: false 83 | Max Color: 255; 255; 255 84 | Max Intensity: 1 85 | Min Color: 0; 0; 0 86 | Min Intensity: 1 87 | Name: LaserScan (kinect) 88 | Position Transformer: XYZ 89 | Queue Size: 10 90 | Selectable: true 91 | Size (Pixels): 3 92 | Size (m): 0.0700000003 93 | Style: Squares 94 | Topic: /static_laser 95 | Unreliable: false 96 | Use Fixed Frame: true 97 | Use rainbow: true 98 | Value: false 99 | - Alpha: 1 100 | Autocompute Intensity Bounds: true 101 | Autocompute Value Bounds: 102 | Max Value: 10 103 | Min Value: -10 104 | Value: true 105 | Axis: Z 106 | Channel Name: intensity 107 | Class: rviz_default_plugins/LaserScan 108 | Color: 255; 255; 255 109 | Color Transformer: Intensity 110 | Decay Time: 0 111 | Enabled: true 112 | Invert Rainbow: false 113 | Max Color: 255; 255; 255 114 | Max Intensity: 285 115 | Min Color: 0; 0; 0 116 | Min Intensity: 2 117 | Name: PointCloud (bumpers) 118 | Position Transformer: XYZ 119 | Queue Size: 10 120 | Selectable: true 121 | Size (Pixels): 3 122 | Size (m): 0.0399999991 123 | Style: Flat Squares 124 | Topic: /mobile_base/sensors/bumper_pointcloud 125 | Unreliable: false 126 | Use Fixed Frame: true 127 | Use rainbow: true 128 | Value: true 129 | - Alpha: 0.699999988 130 | Class: rviz_default_plugins/Map 131 | Color Scheme: map 132 | Draw Behind: false 133 | Enabled: true 134 | Name: Map 135 | Topic: /map 136 | Unreliable: false 137 | Use Timestamp: false 138 | Value: false 139 | - Class: rviz_common/Group 140 | Displays: 141 | - Alpha: 0.699999988 142 | Class: rviz_default_plugins/Map 143 | Color Scheme: costmap 144 | Draw Behind: false 145 | Enabled: true 146 | Name: Costmap 147 | Topic: /move_base/local_costmap/costmap 148 | Unreliable: false 149 | Use Timestamp: false 150 | Value: true 151 | - Alpha: 1 152 | Buffer Length: 1 153 | Class: rviz_default_plugins/Path 154 | Color: 0; 12; 255 155 | Enabled: true 156 | Head Diameter: 0.300000012 157 | Head Length: 0.200000003 158 | Length: 0.300000012 159 | Line Style: Lines 160 | Line Width: 0.0299999993 161 | Name: Planner 162 | Offset: 163 | 
X: 0 164 | Y: 0 165 | Z: 0 166 | Pose Color: 255; 85; 255 167 | Pose Style: None 168 | Radius: 0.0299999993 169 | Shaft Diameter: 0.100000001 170 | Shaft Length: 0.100000001 171 | Topic: /move_base/DWAPlannerROS/local_plan 172 | Unreliable: false 173 | Value: true 174 | - Alpha: 0.400000006 175 | Autocompute Intensity Bounds: true 176 | Autocompute Value Bounds: 177 | Max Value: 0 178 | Min Value: 0 179 | Value: true 180 | Axis: Z 181 | Channel Name: goal_cost 182 | Class: rviz_default_plugins/PointCloud2 183 | Color: 255; 255; 255 184 | Color Transformer: Intensity 185 | Decay Time: 0 186 | Enabled: true 187 | Invert Rainbow: false 188 | Max Color: 255; 255; 255 189 | Max Intensity: 84 190 | Min Color: 0; 0; 0 191 | Min Intensity: 0 192 | Name: Cost Cloud 193 | Position Transformer: XYZ 194 | Queue Size: 10 195 | Selectable: false 196 | Size (Pixels): 5 197 | Size (m): 0.0399999991 198 | Style: Squares 199 | Topic: /move_base/DWAPlannerROS/cost_cloud 200 | Unreliable: false 201 | Use Fixed Frame: true 202 | Use rainbow: true 203 | Value: true 204 | - Alpha: 1 205 | Autocompute Intensity Bounds: true 206 | Autocompute Value Bounds: 207 | Max Value: 10 208 | Min Value: -10 209 | Value: true 210 | Axis: Z 211 | Channel Name: total_cost 212 | Class: rviz_default_plugins/PointCloud2 213 | Color: 255; 255; 255 214 | Color Transformer: Intensity 215 | Decay Time: 0 216 | Enabled: true 217 | Invert Rainbow: false 218 | Max Color: 255; 255; 255 219 | Max Intensity: 15.1999998 220 | Min Color: 0; 0; 0 221 | Min Intensity: 0 222 | Name: Trajectory Cloud 223 | Position Transformer: XYZ 224 | Queue Size: 10 225 | Selectable: true 226 | Size (Pixels): 3 227 | Size (m): 0.0399999991 228 | Style: Flat Squares 229 | Topic: /move_base/DWAPlannerROS/trajectory_cloud 230 | Unreliable: false 231 | Use Fixed Frame: true 232 | Use rainbow: false 233 | Value: true 234 | Enabled: true 235 | Name: Local Planning 236 | - Class: rviz_common/Group 237 | Displays: 238 | - Alpha: 0.400000006 239 | Class: rviz_default_plugins/Map 240 | Color Scheme: costmap 241 | Draw Behind: true 242 | Enabled: true 243 | Name: Costmap 244 | Topic: /move_base/global_costmap/costmap 245 | Unreliable: false 246 | Use Timestamp: false 247 | Value: true 248 | - Alpha: 1 249 | Buffer Length: 1 250 | Class: rviz_default_plugins/Path 251 | Color: 255; 0; 0 252 | Enabled: true 253 | Head Diameter: 0.300000012 254 | Head Length: 0.200000003 255 | Length: 0.300000012 256 | Line Style: Lines 257 | Line Width: 0.0299999993 258 | Name: Planner 259 | Offset: 260 | X: 0 261 | Y: 0 262 | Z: 0 263 | Pose Color: 255; 85; 255 264 | Pose Style: None 265 | Radius: 0.0299999993 266 | Shaft Diameter: 0.100000001 267 | Shaft Length: 0.100000001 268 | Topic: /move_base/DWAPlannerROS/global_plan 269 | Unreliable: false 270 | Value: true 271 | Enabled: true 272 | Name: Global Planning 273 | - Alpha: 1 274 | Axes Length: 1 275 | Axes Radius: 0.100000001 276 | Class: rviz_default_plugins/Pose 277 | Color: 255; 25; 0 278 | Enabled: true 279 | Head Length: 0.200000003 280 | Head Radius: 0.100000001 281 | Name: Pose (move_base) 282 | Shaft Length: 1 283 | Shaft Radius: 0.0500000007 284 | Shape: Arrow 285 | Topic: /move_base/current_goal 286 | Unreliable: false 287 | Value: true 288 | - Alpha: 1 289 | Arrow Length: 0.200000003 290 | Axes Length: 0.300000012 291 | Axes Radius: 0.00999999978 292 | Class: rviz_default_plugins/PoseArray 293 | Color: 0; 192; 0 294 | Enabled: true 295 | Head Length: 0.0700000003 296 | Head Radius: 0.0299999993 297 | Name: ParticleCloud 298 
| Shaft Length: 0.230000004 299 | Shaft Radius: 0.00999999978 300 | Shape: Arrow (Flat) 301 | Topic: /particlecloud 302 | Unreliable: false 303 | Value: true 304 | - Alpha: 1 305 | Buffer Length: 1 306 | Class: rviz_default_plugins/Path 307 | Color: 25; 255; 0 308 | Enabled: true 309 | Head Diameter: 0.300000012 310 | Head Length: 0.200000003 311 | Length: 0.300000012 312 | Line Style: Lines 313 | Line Width: 0.0299999993 314 | Name: Path (global) 315 | Offset: 316 | X: 0 317 | Y: 0 318 | Z: 0 319 | Pose Color: 255; 85; 255 320 | Pose Style: None 321 | Radius: 0.0299999993 322 | Shaft Diameter: 0.100000001 323 | Shaft Length: 0.100000001 324 | Topic: /move_base/NavfnROS/plan 325 | Unreliable: false 326 | Value: true 327 | - Angle Tolerance: 0.100000001 328 | Class: rviz_default_plugins/Odometry 329 | Covariance: 330 | Orientation: 331 | Alpha: 0.5 332 | Color: 255; 255; 127 333 | Color Style: Unique 334 | Frame: Local 335 | Offset: 1 336 | Scale: 1 337 | Value: true 338 | Position: 339 | Alpha: 0.300000012 340 | Color: 204; 51; 204 341 | Scale: 1 342 | Value: true 343 | Value: true 344 | Enabled: false 345 | Keep: 10 346 | Name: Odometry 347 | Position Tolerance: 0.100000001 348 | Shape: 349 | Alpha: 1 350 | Axes Length: 1 351 | Axes Radius: 0.100000001 352 | Color: 255; 25; 0 353 | Head Length: 0.300000012 354 | Head Radius: 0.100000001 355 | Shaft Length: 1 356 | Shaft Radius: 0.0500000007 357 | Value: Arrow 358 | Topic: /odom 359 | Unreliable: false 360 | Value: false 361 | - Class: rviz_default_plugins/MarkerArray 362 | Enabled: false 363 | Topic: 364 | Value: /move_base/EBandPlannerROS/eband_visualization_array 365 | Name: EBand boubles 366 | Namespaces: 367 | {} 368 | Queue Size: 100 369 | Value: false 370 | - Class: rviz_default_plugins/MarkerArray 371 | Enabled: true 372 | Topic: 373 | Value: /layers/l_collisions_layer 374 | Depth: 1 375 | History Policy: Keep Last 376 | Reliability Policy: Reliable 377 | Durability Policy: Transient Local 378 | Name: World 379 | Namespaces: 380 | "": true 381 | Queue Size: 100 382 | Value: true 383 | - Class: rviz_default_plugins/MarkerArray 384 | Enabled: true 385 | Topic: 386 | Value: /layers/l_serp_laser_layer 387 | Depth: 1 388 | History Policy: Keep Last 389 | Reliability Policy: Reliable 390 | Durability Policy: Transient Local 391 | Name: World 392 | Namespaces: 393 | "": true 394 | Queue Size: 100 395 | Value: true 396 | - Class: rviz_default_plugins/MarkerArray 397 | Enabled: true 398 | Topic: 399 | Value: /layers/l_end_laser_layer 400 | Depth: 1 401 | History Policy: Keep Last 402 | Reliability Policy: Reliable 403 | Durability Policy: Transient Local 404 | Name: World 405 | Namespaces: 406 | "": true 407 | Queue Size: 100 408 | Value: true 409 | - Class: rviz_default_plugins/MarkerArray 410 | Enabled: true 411 | Topic: 412 | Value: /layers/l_end_beacon_layer 413 | Depth: 1 414 | History Policy: Keep Last 415 | Reliability Policy: Reliable 416 | Durability Policy: Transient Local 417 | Name: World 418 | Namespaces: 419 | "": true 420 | Queue Size: 100 421 | Value: true 422 | - Class: rviz_default_plugins/MarkerArray 423 | Enabled: true 424 | Topic: 425 | Value: /models/m_serp 426 | Name: Turtlebot 427 | Namespaces: 428 | "": true 429 | Queue Size: 100 430 | Value: true 431 | - Class: rviz_default_plugins/MarkerArray 432 | Enabled: true 433 | Topic: 434 | Value: /models/m_end_beacon 435 | Name: Turtlebot 436 | Namespaces: 437 | "": true 438 | Queue Size: 100 439 | Value: true 440 | - Alpha: 1 441 | Autocompute Intensity Bounds: true 442 | 
Autocompute Value Bounds: 443 | Max Value: 10 444 | Min Value: -10 445 | Value: true 446 | Axis: Z 447 | Channel Name: intensity 448 | Class: rviz_default_plugins/LaserScan 449 | Color: 255; 170; 0 450 | Color Transformer: FlatColor 451 | Decay Time: 0 452 | Enabled: true 453 | Invert Rainbow: false 454 | Max Color: 255; 255; 255 455 | Max Intensity: 4096 456 | Min Color: 0; 0; 0 457 | Min Intensity: 0 458 | Name: LaserScan 459 | Position Transformer: XYZ 460 | Queue Size: 10 461 | Selectable: true 462 | Size (Pixels): 3 463 | Size (m): 0.100000001 464 | Style: Flat Squares 465 | Topic: /scan_3d 466 | Unreliable: false 467 | Use Fixed Frame: true 468 | Use rainbow: true 469 | Value: true 470 | - Class: rviz_default_plugins/InteractiveMarkers 471 | Enable Transparency: true 472 | Enabled: true 473 | Interactive Markers Namespace: /interactive_model_markers 474 | Name: InteractiveMarkers 475 | Show Axes: false 476 | Show Descriptions: true 477 | Show Visual Aids: false 478 | Value: true 479 | Enabled: true 480 | Global Options: 481 | Background Color: 48; 48; 48 482 | Fixed Frame: map 483 | Frame Rate: 30 484 | Name: root 485 | Tools: 486 | - Class: rviz_default_plugins/MoveCamera 487 | - Class: flatland_rviz_plugins/Interact 488 | Hide Inactive Objects: true 489 | - Class: flatland_rviz_plugins/TogglePause 490 | - Class: flatland_rviz_plugins/SpawnModel 491 | - Class: rviz_default_plugins/Select 492 | - Class: rviz_default_plugins/SetInitialPose 493 | Topic: /initialpose 494 | - Class: rviz_default_plugins/SetGoal 495 | Topic: /move_base_simple/goal 496 | - Class: rviz_default_plugins/Measure 497 | Value: true 498 | Views: 499 | Current: 500 | Angle: 0 501 | Class: rviz_default_plugins/TopDownOrtho 502 | Enable Stereo Rendering: 503 | Stereo Eye Separation: 0.0599999987 504 | Stereo Focal Distance: 1 505 | Swap Stereo Eyes: false 506 | Value: false 507 | Invert Z Axis: false 508 | Name: Current View 509 | Near Clip Distance: 0.00999999978 510 | Scale: 250.0 511 | Target Frame: 512 | Value: TopDownOrtho (rviz) 513 | X: 0.5 514 | Y: 1.0 515 | Saved: ~ 516 | Window Geometry: 517 | Displays: 518 | collapsed: false 519 | Height: 837 520 | Hide Left Dock: false 521 | Hide Right Dock: true 522 | QMainWindow State: 000000ff00000000fd0000000400000000000001e1000002e4fc0200000005fb0000001200530065006c0065006300740069006f006e00000001e10000009b0000005400fffffffb0000001e0054006f006f006c002000500072006f007000650072007400690065007302000001ed000001df00000185000000a3fb000000120056006900650077007300200054006f006f02000001df000002110000018500000122fb000000200054006f006f006c002000500072006f0070006500720074006900650073003203000002880000011d000002210000017afb000000100044006900730070006c0061007900730100000041000002e4000000bb00ffffff000000010000010f000002e4fc0200000003fb0000001e0054006f006f006c002000500072006f00700065007200740069006500730100000041000000780000000000000000fb0000000a005600690065007700730000000041000002e4000000a000fffffffb0000001200530065006c0065006300740069006f006e010000025a000000b200000000000000000000000200000490000000a9fc0100000001fb0000000a00560069006500770073030000004e00000080000002e10000019700000003000004a00000003efc0100000002fb0000000800540069006d00650000000000000004a00000022600fffffffb0000000800540069006d0065010000000000000450000000000000000000000459000002e400000004000000040000000800000008fc0000000100000002000000010000000a0054006f006f006c00730100000000ffffffff0000000000000000 523 | Selection: 524 | collapsed: false 525 | Time: 526 | collapsed: false 527 | Tool Properties: 528 | collapsed: false 
529 | Views: 530 | collapsed: true 531 | Width: 1600 532 | X: 0 533 | Y: 30 534 | -------------------------------------------------------------------------------- /serp_rl/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import rclpy 3 | from rclpy.publisher import Publisher 4 | from rclpy.node import Node 5 | from geometry_msgs.msg import Twist, Pose2D 6 | from sensor_msgs.msg import LaserScan 7 | from flatland_msgs.srv import MoveModel 8 | from flatland_msgs.msg import Collisions 9 | 10 | from gym import Env 11 | from gym.spaces import Discrete, Box 12 | 13 | from stable_baselines3 import PPO 14 | from stable_baselines3.common.env_checker import check_env 15 | 16 | import numpy as np 17 | import time 18 | import threading 19 | 20 | class SerpControllerEnv(Node, Env): 21 | def __init__(self) -> None: 22 | super().__init__("SerpControllerEnv") 23 | 24 | # Predefined speed for the robot 25 | linear_speed = 0.5 26 | angular_speed = 1.57079632679 27 | 28 | # Set of actions. Defined by their linear and angular speed 29 | self.actions = [(linear_speed, 0.0), # move forward 30 | (0.0, angular_speed), # rotate left 31 | (0.0, -angular_speed)] # rotate right 32 | 33 | # How close the robot needs to be to the target to finish the task 34 | self.end_range = 0.2 35 | 36 | # Number of divisions of the LiDAR 37 | self.n_lidar_sections = 9 38 | self.lidar_sample = [] 39 | 40 | # Variables that track a possible end state 41 | # current distance to target 42 | self.distance_to_end = 10.0 43 | # true if a collision happens 44 | self.collision = False 45 | 46 | # Possible starting positions 47 | self.start_positions = [(0.0, 0.0, 1.57079632679), (1.6, 1.6, 3.14159265359)] 48 | # Current position 49 | self.position = 0 50 | 51 | self.step_number = 0 52 | 53 | # Maximum number of steps before it times out 54 | self.max_steps = 200 55 | 56 | # Records previous action taken. 
At the start of an episode, there is no prior action so -1 is assigned 57 | self.previous_action = -1 58 | 59 | # Used for data collection during training 60 | self.total_step_cnt = 0 61 | self.total_episode_cnt = 0 62 | self.training = False 63 | 64 | # **** Create publishers **** 65 | self.pub:Publisher = self.create_publisher(Twist, "/cmd_vel", 1) 66 | # *************************** 67 | 68 | # **** Create subscriptions **** 69 | self.create_subscription(LaserScan, "/static_laser", self.processLiDAR, 1) 70 | 71 | self.create_subscription(LaserScan, "/end_beacon_laser", self.processEndLiDAR, 1) 72 | 73 | self.create_subscription(Collisions, "/collisions", self.processCollisions, 1) 74 | # ****************************** 75 | 76 | # **** Define action and state spaces **** 77 | 78 | # action is an integer between 0 and 2 (total of 3 actions) 79 | self.action_space = Discrete(len(self.actions)) 80 | # state is represented by a numpy.Array with size 9 and values between 0 and 2 81 | self.observation_space = Box(0, 2, shape=(self.n_lidar_sections,), dtype=np.float64) 82 | 83 | # **************************************** 84 | 85 | # Initial state 86 | self.state = np.array(self.lidar_sample) 87 | 88 | # Resets the environment to an initial state 89 | def reset(self): 90 | # Make sure the robot is stopped 91 | self.change_robot_speeds(self.pub, 0.0, 0.0) 92 | 93 | if self.total_step_cnt != 0: self.total_episode_cnt += 1 94 | 95 | # **** Move robot and end beacon to new positions **** 96 | start_pos = self.start_positions[self.position] 97 | self.position = 1 - self.position 98 | end_pos = self.start_positions[self.position] 99 | 100 | self.move_model('serp', start_pos[0], start_pos[1], start_pos[2]) 101 | self.move_model('end_beacon', end_pos[0], end_pos[1], 0.0) 102 | # **************************************************** 103 | 104 | # **** Reset necessary values **** 105 | self.lidar_sample = [] 106 | self.wait_lidar_reading() 107 | self.state = np.array(self.lidar_sample) 108 | 109 | # Flatland can sometimes send several collision messages. 
110 |         # This makes sure that no collisions are wrongfully detected at the start of an episode
111 |         time.sleep(0.1)
112 | 
113 |         self.distance_to_end = 10.0
114 |         self.collision = False
115 |         self.step_number = 0
116 |         self.previous_action = -1
117 |         # ********************************
118 | 
119 |         return self.state
120 | 
121 |     # Performs a step for the RL agent
122 |     def step(self, action):
123 | 
124 |         # **** Performs the action and waits for it to be completed ****
125 |         self.change_robot_speeds(self.pub, self.actions[action][0], self.actions[action][1])
126 | 
127 |         self.lidar_sample = []
128 |         self.wait_lidar_reading()
129 |         self.change_robot_speeds(self.pub, 0.0, 0.0)
130 |         # **************************************************************
131 | 
132 |         # Register current state
133 |         self.state = np.array(self.lidar_sample)
134 | 
135 |         self.step_number += 1
136 |         self.total_step_cnt += 1
137 | 
138 |         # **** Calculates the reward and determines if an end state was reached ****
139 |         done = False
140 | 
141 |         end_state = ''
142 | 
143 |         if self.collision:
144 |             end_state = "collision"
145 |             reward = -200
146 |             done = True
147 |         elif self.distance_to_end < self.end_range:
148 |             end_state = "finished"
149 |             reward = 400 + (200 - self.step_number)
150 |             done = True
151 |         elif self.step_number >= self.max_steps:
152 |             end_state = "timeout"
153 |             reward = -300
154 |             done = True
155 |         elif action == 0:
156 |             reward = 2
157 |         else:
158 |             reward = 0
159 |         # **************************************************************************
160 | 
161 |         info = {'end_state': end_state}
162 | 
163 |         if done and self.training:
164 |             self.get_logger().info('Training - Episode ' + str(self.total_episode_cnt) + ' end state: ' + end_state)
165 |             self.get_logger().info('Total steps: ' + str(self.total_step_cnt))
166 | 
167 |         return self.state, reward, done, info
168 | 
169 |     def render(self): pass
170 | 
171 |     def close(self): pass
172 | 
173 |     def reset_counters(self):
174 |         self.total_step_cnt = 0
175 |         self.total_episode_cnt = 0
176 | 
177 |     # Change the speed of the robot
178 |     def change_robot_speeds(self, publisher, linear, angular):
179 |         twist_msg = Twist()
180 |         twist_msg.linear.x = linear
181 |         twist_msg.angular.z = angular
182 |         publisher.publish(twist_msg)
183 | 
184 |     # Waits for a new LiDAR reading.
185 |     # A new LiDAR reading is also used to signal when an action should finish being performed.
186 |     def wait_lidar_reading(self):
187 |         while len(self.lidar_sample) != self.n_lidar_sections: pass
188 | 
189 |     # Send a request to move a model
190 |     def move_model(self, model_name, x, y, theta):
191 |         client = self.create_client(MoveModel, "/move_model")
192 |         client.wait_for_service()
193 |         request = MoveModel.Request()
194 |         request.name = model_name
195 |         request.pose = Pose2D()
196 |         request.pose.x = x
197 |         request.pose.y = y
198 |         request.pose.theta = theta
199 |         client.call_async(request)
200 | 
201 |     # Sample LiDAR data
202 |     # Divide it into sections and sample the lowest value from each
203 |     def processLiDAR(self, data):
204 |         self.lidar_sample = []
205 | 
206 |         rays = data.ranges
207 |         rays_per_section = len(rays) // self.n_lidar_sections
208 | 
209 |         for i in range(self.n_lidar_sections - 1):
210 |             self.lidar_sample.append(min(rays[rays_per_section * i:rays_per_section * (i + 1)]))
211 |         self.lidar_sample.append(min(rays[(self.n_lidar_sections - 1) * rays_per_section:]))
212 | 
213 | 
214 |     # Handle end beacon LiDAR data
215 |     # The lowest value is the distance from the robot to the target
216 |     def processEndLiDAR(self, data):
217 |         clean_data = [x for x in data.ranges if not np.isnan(x)]
218 |         if not clean_data: return
219 |         self.distance_to_end = min(clean_data)
220 | 
221 |     # Process collisions
222 |     def processCollisions(self, data):
223 |         if len(data.collisions) > 0:
224 |             self.collision = True
225 | 
226 |     # Run an entire episode manually for testing purposes
227 |     # Returns true if successful
228 |     def run_episode(self, agent):
229 | 
230 |         com_reward = 0
231 | 
232 |         obs = self.reset()
233 |         done = False
234 |         while not done:
235 |             action, states = agent.predict(obs)
236 |             obs, rewards, done, info = self.step(action)
237 |             com_reward += rewards
238 | 
239 |         self.get_logger().info('Episode concluded. End state: ' + info['end_state'] + ' Cumulative reward: ' + str(com_reward))
240 | 
241 |         return info['end_state'] == 'finished'
242 | 
243 |     def run_rl_alg(self):
244 | 
245 |         check_env(self)
246 |         self.wait_lidar_reading()
247 | 
248 |         # Create the agent
249 |         agent = PPO("MlpPolicy", self, verbose=1)
250 | 
251 |         # Target accuracy
252 |         min_accuracy = 0.8
253 |         # Current accuracy
254 |         accuracy = 0
255 |         # Number of tested episodes in each iteration
256 |         n_test_episodes = 20
257 | 
258 |         training_iterations = 0
259 | 
260 |         while accuracy < min_accuracy:
261 |             training_steps = 5000
262 |             self.get_logger().info('Starting training for ' + str(training_steps) + ' steps')
263 | 
264 |             self.training = True
265 |             self.reset_counters()
266 | 
267 |             # Train the agent
268 |             agent.learn(total_timesteps=training_steps)
269 | 
270 |             self.training = False
271 | 
272 |             successful_episodes = 0
273 | 
274 |             # Test the agent
275 |             for i in range(n_test_episodes):
276 |                 self.get_logger().info('Testing episode number ' + str(i + 1) + '.')
277 |                 if self.run_episode(agent): successful_episodes += 1
278 | 
279 |             # Calculate the accuracy
280 |             accuracy = successful_episodes / n_test_episodes
281 | 
282 |             self.get_logger().info('Testing finished. Accuracy: ' + str(accuracy))
283 | 
284 |             training_iterations += 1
285 | 
286 |         self.get_logger().info('Training Finished. 
Training iterations: ' + str(training_iterations) + ' Accuracy: ' + str(accuracy))
287 | 
288 | 
289 |         agent.save("src/ros2_flatland_rl_tutorial/ppo")
290 | 
291 | def main(args=None):
292 |     rclpy.init(args=args)
293 | 
294 |     serp = SerpControllerEnv()
295 | 
296 |     thread = threading.Thread(target=serp.run_rl_alg)
297 |     thread.start()
298 | 
299 |     rclpy.spin(serp)
300 | 
301 | 
302 | 
303 | if __name__ == "__main__":
304 |     main()
305 | 
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | [develop]
2 | script_dir=$base/lib/serp_rl
3 | [install]
4 | install_scripts=$base/lib/serp_rl
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | from setuptools import setup
2 | from glob import glob
3 | 
4 | package_name = 'serp_rl'
5 | 
6 | setup(
7 |     name=package_name,
8 |     version='0.0.0',
9 |     # Packages to export
10 |     packages=[package_name],
11 |     # Files we want to install, specifically launch files
12 |     data_files=[
13 |         ('share/' + package_name, ['package.xml']),
14 |         ('share/' + package_name + "/launch/", glob("launch/*launch*")),
15 |         ('share/' + package_name + "/rviz/", glob("rviz/*")),
16 |         ('share/' + package_name + "/world/", glob('world/*')),
17 |         ('share/ament_index/resource_index/packages',
18 |             ['resource/' + package_name]),
19 |     ],
20 |     install_requires=['setuptools'],
21 |     zip_safe=True,
22 |     maintainer='filipe',
23 |     maintainer_email='filipe.almeida.18@gmail.com',
24 |     description='TODO: Package description',
25 |     license='TODO: License declaration',
26 |     tests_require=['pytest'],
27 |     # Declare the entry point for your python scripts
28 |     entry_points={
29 |         'console_scripts': [
30 |             'serp_rl = serp_rl.__init__:main'
31 |         ],
32 |     },
33 | )
34 | 
--------------------------------------------------------------------------------
/test/test_copyright.py:
--------------------------------------------------------------------------------
1 | # Copyright 2015 Open Source Robotics Foundation, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | #     http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | 
15 | from ament_copyright.main import main
16 | import pytest
17 | 
18 | 
19 | # Remove the `skip` decorator once the source file(s) have a copyright header
20 | @pytest.mark.skip(reason='No copyright header has been placed in the generated source file.')
21 | @pytest.mark.copyright
22 | @pytest.mark.linter
23 | def test_copyright():
24 |     rc = main(argv=['.', 'test'])
25 |     assert rc == 0, 'Found errors'
26 | 
--------------------------------------------------------------------------------
/test/test_flake8.py:
--------------------------------------------------------------------------------
1 | # Copyright 2017 Open Source Robotics Foundation, Inc. 
2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | from ament_flake8.main import main_with_errors 16 | import pytest 17 | 18 | 19 | @pytest.mark.flake8 20 | @pytest.mark.linter 21 | def test_flake8(): 22 | rc, errors = main_with_errors(argv=[]) 23 | assert rc == 0, \ 24 | 'Found %d code style errors / warnings:\n' % len(errors) + \ 25 | '\n'.join(errors) 26 | -------------------------------------------------------------------------------- /test/test_pep257.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Open Source Robotics Foundation, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | from ament_pep257.main import main 16 | import pytest 17 | 18 | 19 | @pytest.mark.linter 20 | @pytest.mark.pep257 21 | def test_pep257(): 22 | rc = main(argv=['.', 'test']) 23 | assert rc == 0, 'Found code style errors / warnings' 24 | -------------------------------------------------------------------------------- /world/empty.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FilipeAlmeidaFEUP/ros2_flatland_rl_tutorial/e7ed1180dc5155f801af8ed8d8ba08a19a60b326/world/empty.png -------------------------------------------------------------------------------- /world/empty.yaml: -------------------------------------------------------------------------------- 1 | image: empty.png 2 | resolution: 0.04 3 | origin: [-0.5, -0.3, 0.0] 4 | negate: 0 5 | occupied_thresh: 0.65 6 | free_thresh: 0.196 7 | -------------------------------------------------------------------------------- /world/end_beacon.model.yaml: -------------------------------------------------------------------------------- 1 | #End beacon model with LiDAR 2 | 3 | bodies: 4 | - name: base_prox 5 | enabled: true 6 | pose: [0, 0, 0] 7 | type: dynamic 8 | color: [0, 1, 0, 0.75] 9 | footprints: 10 | - type: circle 11 | radius: 0.1 12 | center: [0.0, 0.0] 13 | density: 1 14 | layers: ["end_beacon_layer"] 15 | 16 | plugins: 17 | # lidar identical to the one from serp model but ignores walls 18 | - type: Laser 19 | name: end_beacon_laser 20 | frame: end_beacon_link 21 | topic: end_beacon_laser 22 | body: base_prox 23 | broadcast_tf: true 24 | origin: [0.0, 0.0, 0.0] 25 | range: 10 26 | angle: {min: -3.14159265359, max: 3.14159265359, increment: 0.06981317007} 27 | noise_std_dev: 0.015 28 | update_rate: 10 29 | layers: ["end_laser_layer"] 30 | 
-------------------------------------------------------------------------------- /world/serp.model.yaml: -------------------------------------------------------------------------------- 1 | #SERP model with LiDAR 2 | 3 | bodies: 4 | 5 | - name: base_link 6 | enabled: true 7 | pose: [0, 0, 0] 8 | type: dynamic 9 | color: [0, 0, 1, 0.75] 10 | footprints: 11 | - type: polygon 12 | density: 1 13 | layers: ["collisions_layer", "end_laser_layer"] 14 | points: [ [-0.117, -0.055], 15 | [0.048, -0.055], 16 | [0.048, 0.055], 17 | [-0.117, 0.055] ] 18 | 19 | - name: left_wheel 20 | color: [1, 1, 0, 0.75] 21 | footprints: 22 | - type: polygon 23 | density: 1.0 24 | layers: ["collisions_layer", "end_laser_layer"] 25 | points: [ [ -0.035, -0.0125], 26 | [ 0.035, -0.0125], 27 | [ 0.035, 0.0125], 28 | [ -0.035, 0.0125] ] 29 | - name: right_wheel 30 | color: [1, 1, 0, 0.75] 31 | footprints: 32 | - type: polygon 33 | density: 1.0 34 | layers: ["collisions_layer", "end_laser_layer"] 35 | points: [ [ -0.035, -0.0125], 36 | [ 0.035, -0.0125], 37 | [ 0.035, 0.0125], 38 | [ -0.035, 0.0125] ] 39 | 40 | joints: 41 | - type: weld 42 | name: left_wheel_weld 43 | bodies: 44 | - name: left_wheel 45 | anchor: [0, 0] 46 | - name: base_link 47 | anchor: [0, -0.0725] 48 | 49 | - type: weld 50 | name: right_wheel_weld 51 | bodies: 52 | - name: right_wheel 53 | anchor: [0, 0] 54 | - name: base_link 55 | anchor: [0, 0.0725] 56 | 57 | plugins: 58 | - type: DiffDrive 59 | name: serp_diff_drive 60 | body: base_link 61 | pub_rate: 10 62 | twist_sub: cmd_vel 63 | odom_frame_id: odom 64 | odom_pub: odom # topic odom is published on 65 | 66 | # needed for some visualization components 67 | - type: ModelTfPublisher 68 | name: tf_publisher 69 | publish_tf_world: false 70 | 71 | # simulates a lidar by having 90 lasers evenly spread around the model 72 | - type: Laser 73 | name: static_laser 74 | frame: static_laser_link 75 | topic: static_laser 76 | body: base_link 77 | broadcast_tf: true 78 | origin: [0.0, 0.0, 0.0] 79 | range: 10 80 | angle: {min: -3.14159265359, max: 3.14159265359, increment: 0.06981317007} 81 | noise_std_dev: 0.015 82 | update_rate: 10 83 | layers: ["serp_laser_layer"] 84 | 85 | - type: Bumper 86 | name: bumper 87 | topic: collisions 88 | update_rate: 10 -------------------------------------------------------------------------------- /world/turn.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FilipeAlmeidaFEUP/ros2_flatland_rl_tutorial/e7ed1180dc5155f801af8ed8d8ba08a19a60b326/world/turn.png -------------------------------------------------------------------------------- /world/turn.yaml: -------------------------------------------------------------------------------- 1 | image: turn.png 2 | resolution: 0.04 3 | origin: [-0.5, -0.3, 0.0] 4 | negate: 0 5 | occupied_thresh: 0.65 6 | free_thresh: 0.196 7 | -------------------------------------------------------------------------------- /world/world.yaml: -------------------------------------------------------------------------------- 1 | properties: {} 2 | layers: 3 | - name: ["collisions_layer", "serp_laser_layer"] 4 | map: "turn.yaml" 5 | color: [1.0, 1.0, 1.0, 1] 6 | - name: ["end_laser_layer", "end_beacon_layer"] 7 | map: "empty.yaml" 8 | color: [1.0, 1.0, 1.0, 1] 9 | models: 10 | - name: serp 11 | model: "serp.model.yaml" 12 | pose: [0.0, 0.0, 1.57079632679] 13 | - name: end_beacon 14 | model: "end_beacon.model.yaml" 15 | pose: [1.6, 1.6, 0.0] 
--------------------------------------------------------------------------------