├── docs ├── charts.png ├── errors.png ├── model.png ├── output.png ├── contours.png ├── labeler.png ├── arrow_types.png ├── binarized.png ├── grayscale.png ├── hue_output.png ├── input_image.png ├── pipelines.png ├── find_matches.png ├── find_rotated.png ├── preprocessing.png ├── reduced_noise.png ├── rune_example.png ├── search_region.png ├── augmented_output.png ├── old_rune_system.png ├── selected_contours.png ├── preprocessor_window.png └── difficult_rune_example.png ├── requirements.txt ├── operations ├── revert_dataset.py └── make_dataset.py ├── LICENSE ├── common.py ├── results ├── denoising.txt ├── thresholding.txt ├── pipeline.txt └── grayscale.txt ├── .gitignore ├── preprocessing ├── label.py └── preprocess.py ├── model ├── classify.py └── train.py └── README.md /docs/charts.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/charts.png -------------------------------------------------------------------------------- /docs/errors.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/errors.png -------------------------------------------------------------------------------- /docs/model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/model.png -------------------------------------------------------------------------------- /docs/output.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/output.png -------------------------------------------------------------------------------- /docs/contours.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/contours.png -------------------------------------------------------------------------------- /docs/labeler.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/labeler.png -------------------------------------------------------------------------------- /docs/arrow_types.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/arrow_types.png -------------------------------------------------------------------------------- /docs/binarized.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/binarized.png -------------------------------------------------------------------------------- /docs/grayscale.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/grayscale.png -------------------------------------------------------------------------------- /docs/hue_output.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/hue_output.png -------------------------------------------------------------------------------- /docs/input_image.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/input_image.png -------------------------------------------------------------------------------- /docs/pipelines.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/pipelines.png -------------------------------------------------------------------------------- /docs/find_matches.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/find_matches.png -------------------------------------------------------------------------------- /docs/find_rotated.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/find_rotated.png -------------------------------------------------------------------------------- /docs/preprocessing.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/preprocessing.png -------------------------------------------------------------------------------- /docs/reduced_noise.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/reduced_noise.png -------------------------------------------------------------------------------- /docs/rune_example.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/rune_example.png -------------------------------------------------------------------------------- /docs/search_region.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/search_region.png -------------------------------------------------------------------------------- /docs/augmented_output.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/augmented_output.png -------------------------------------------------------------------------------- /docs/old_rune_system.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/old_rune_system.png -------------------------------------------------------------------------------- /docs/selected_contours.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/selected_contours.png -------------------------------------------------------------------------------- /docs/preprocessor_window.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/preprocessor_window.png -------------------------------------------------------------------------------- /docs/difficult_rune_example.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gbrlfaria/rune-breaker/HEAD/docs/difficult_rune_example.png -------------------------------------------------------------------------------- /requirements.txt: 
-------------------------------------------------------------------------------- 1 | Keras == 2.3.1 2 | colorful == 0.5.4 3 | matplotlib == 3.2.1 4 | numpy == 1.18.1 5 | opencv_python == 4.2.0.34 6 | pandas == 1.0.3 7 | scikit_image == 0.16.2 8 | tensorflow == 2.2.0rc3 9 | -------------------------------------------------------------------------------- /operations/revert_dataset.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | 4 | sys.path.insert(0, os.path.abspath('.')) 5 | 6 | import common 7 | 8 | 9 | def main(): 10 | common.create_directories() 11 | 12 | print("Reverting images from the training directory...") 13 | revert_files(common.TRAINING_DIR) 14 | 15 | print("Reverting images from the validation directory...") 16 | revert_files(common.VALIDATION_DIR) 17 | 18 | print("Reverting images from the testing directory...") 19 | revert_files(common.TESTING_DIR) 20 | 21 | print("Finished!") 22 | 23 | 24 | def revert_files(src_dir): 25 | images = common.get_files(src_dir) 26 | 27 | for path, filename in images: 28 | os.rename(path, common.SAMPLES_DIR + filename) 29 | 30 | print("Reverted {} images.\n".format(len(images))) 31 | 32 | 33 | if __name__ == "__main__": 34 | main() 35 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Gabriel Faria 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /common.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | 4 | 5 | # Classification 6 | INPUT_SHAPE = (60, 60, 1) 7 | CLASSES = ['down', 'left', 'right', 'up'] 8 | 9 | # Directories 10 | DATA_DIR = './data/' 11 | 12 | SAMPLES_DIR = DATA_DIR + 'samples/' 13 | TRAINING_DIR = DATA_DIR + 'training/' 14 | VALIDATION_DIR = DATA_DIR + 'validation/' 15 | TESTING_DIR = DATA_DIR + 'testing/' 16 | LABELED_DIR = DATA_DIR + 'labeled/' 17 | PREPROCESSED_DIR = DATA_DIR + 'preprocessed/' 18 | SCREENSHOTS_DIR = DATA_DIR + 'screenshots/' 19 | 20 | MODEL_DIR = './model/' 21 | 22 | 23 | # Functions 24 | def get_files(directory): 25 | result = [] 26 | 27 | for name in os.listdir(directory): 28 | path = directory + name 29 | 30 | if os.path.isfile(path): 31 | result.append((path, name)) 32 | else: 33 | result.extend(get_files(path + '/')) 34 | 35 | return result 36 | 37 | 38 | def arrow_labels(name): 39 | tokens = re.split('_', name) 40 | arrow_direction, arrow_type = tokens[1], tokens[0] 41 | 42 | return arrow_direction, arrow_type 43 | 44 | 45 | def create_directories(): 46 | directories = [ 47 | SCREENSHOTS_DIR, 48 | LABELED_DIR, 49 | PREPROCESSED_DIR, 50 | SAMPLES_DIR 51 | ] 52 | 53 | for d in [TRAINING_DIR, VALIDATION_DIR, TESTING_DIR]: 54 | for c in CLASSES: 55 | directories.append(d + c + '/') 56 | 57 | for d in directories: 58 | os.makedirs(d, exist_ok=True) 59 | -------------------------------------------------------------------------------- /results/denoising.txt: -------------------------------------------------------------------------------- 1 | n1 = 660, n2 = 100 2 | 3 | (threshold, connectivity): [arrow center misses] 4 | ================================================ 5 | (8, 1): [10, 15] 6 | (8, 0): [10, 15] 7 | (64, 8): [12, 11] 8 | (64, 4): [12, 11] 9 | (64, 2): [12, 11] 10 | (32, 8): [12, 11] 11 | (32, 4): [12, 11] 12 | (32, 2): [12, 11] 13 | (16, 8): [12, 11] 14 | (16, 4): [12, 11] 15 | (16, 2): [12, 11] 16 | (8, 8): [12, 11] 17 | (8, 4): [12, 11] 18 | (8, 2): [12, 11] 19 | (4, 8): [12, 11] 20 | (4, 4): [12, 11] 21 | (4, 2): [12, 11] 22 | (4, 1): [12, 14] 23 | (4, 0): [12, 14] 24 | (2, 8): [12, 11] 25 | (2, 4): [12, 11] 26 | (2, 2): [12, 11] 27 | (1, 8): [12, 11] 28 | (1, 4): [12, 11] 29 | (1, 2): [12, 11] 30 | (1, 1): [12, 11] 31 | (1, 0): [12, 11] 32 | (0, 8): [12, 11] 33 | (0, 4): [12, 11] 34 | (0, 2): [12, 11] 35 | (0, 1): [12, 11] 36 | (0, 0): [12, 11] 37 | (128, 8): [13, 12] 38 | (128, 4): [13, 12] 39 | (128, 2): [13, 12] 40 | (2, 1): [13, 12] 41 | (2, 0): [13, 12] 42 | (16, 1): [14, 15] 43 | (16, 0): [14, 15] 44 | (32, 1): [16, 17] 45 | (32, 0): [16, 17] 46 | (64, 1): [19, 16] 47 | (64, 0): [19, 16] 48 | (128, 1): [25, 18] 49 | (128, 0): [25, 18] 50 | (256, 8): [95, 31] 51 | (256, 4): [95, 31] 52 | (256, 2): [95, 31] 53 | (256, 1): [150, 46] 54 | (256, 0): [150, 46] 55 | (512, 8): [1319, 215] 56 | (512, 4): [1319, 215] 57 | (512, 2): [1319, 215] 58 | (512, 1): [1971, 310] 59 | (512, 0): [1971, 310] 60 | (1024, 8): [2277, 369] 61 | (1024, 4): [2277, 369] 62 | (1024, 2): [2277, 369] 63 | (1024, 1): [2625, 400] 64 | (1024, 0): [2625, 400] 65 | -------------------------------------------------------------------------------- /results/thresholding.txt: -------------------------------------------------------------------------------- 1 | n1 = 660, n2 = 100 2 | (block_size, C): [arrow center misses] 3 | ====================================== 4 
| (5, -1): [12, 11] 5 | (7, -2): [16, 17] 6 | (5, -2): [21, 16] 7 | (7, -1): [24, 8] 8 | (7, 2): [27, 22] 9 | (5, 1): [29, 19] 10 | (9, -2): [32, 13] 11 | (7, 1): [32, 12] 12 | (9, -1): [33, 10] 13 | (11, -1): [34, 13] 14 | (11, -2): [34, 18] 15 | (7, 0): [34, 12] 16 | (5, 0): [35, 10] 17 | (9, 1): [36, 13] 18 | (9, 2): [37, 16] 19 | (7, -3): [37, 18] 20 | (3, -1): [37, 17] 21 | (13, -1): [45, 16] 22 | (11, 1): [45, 15] 23 | (9, 0): [45, 10] 24 | (9, -3): [45, 19] 25 | (11, 2): [46, 17] 26 | (13, 1): [47, 14] 27 | (11, 0): [49, 14] 28 | (9, 3): [49, 28] 29 | (13, 2): [53, 17] 30 | (11, -3): [54, 23] 31 | (13, 0): [55, 16] 32 | (13, -2): [55, 17] 33 | (15, -1): [56, 19] 34 | (11, 3): [56, 24] 35 | (7, 3): [56, 32] 36 | (15, 2): [58, 21] 37 | (15, 0): [58, 17] 38 | (15, -2): [60, 18] 39 | (15, 1): [61, 18] 40 | (5, -3): [61, 35] 41 | (13, 3): [62, 30] 42 | (13, -3): [63, 21] 43 | (9, -4): [64, 32] 44 | (5, 2): [65, 27] 45 | (15, 3): [68, 28] 46 | (7, -4): [69, 32] 47 | (13, 4): [74, 30] 48 | (11, 4): [74, 34] 49 | (11, -4): [74, 34] 50 | (15, -3): [76, 23] 51 | (13, -4): [78, 31] 52 | (9, 4): [81, 38] 53 | (15, 4): [84, 38] 54 | (15, -4): [84, 28] 55 | (3, -2): [86, 43] 56 | (3, 0): [98, 24] 57 | (7, 4): [115, 39] 58 | (5, -4): [125, 48] 59 | (5, 3): [194, 44] 60 | (3, 1): [254, 46] 61 | (3, -3): [261, 70] 62 | (5, 4): [408, 79] 63 | (3, -4): [501, 112] 64 | (3, 2): [667, 89] 65 | (3, 3): [1135, 168] 66 | (3, 4): [1598, 251] 67 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | 131 | # models 132 | *.h5 133 | 134 | data/* 135 | logs/* 136 | -------------------------------------------------------------------------------- /preprocessing/label.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | import sys 4 | import time 5 | 6 | sys.path.insert(0, os.path.abspath('.')) 7 | 8 | import matplotlib.pyplot as plt 9 | import numpy as np 10 | 11 | import common 12 | 13 | type_label = None 14 | direction_label = '' 15 | 16 | plt_text = None 17 | 18 | type_dictionary = {'1': 'round', '2': 'wide', '3': 'narrow'} 19 | 20 | 21 | def main(): 22 | common.create_directories() 23 | 24 | print(" Q = ignore image") 25 | print(" 1 = label as round") 26 | print(" 2 = label as wide") 27 | print(" 3 = label as narrow") 28 | print("ARROW KEYS = label directions\n") 29 | 30 | global type_label 31 | global direction_label 32 | global plt_text 33 | 34 | unlabeled_imgs = common.get_files(common.SCREENSHOTS_DIR) 35 | 36 | num_labeled = 0 37 | for path, filename in unlabeled_imgs: 38 | print("Processing {}...".format(filename)) 39 | 40 | img = plt.imread(path) 41 | 42 | ax = plt.gca() 43 | fig = plt.gcf() 44 | plot = ax.imshow(img) 45 | 46 | plt.axis('off') 47 | plt.tight_layout() 48 | plt_text = plt.text(0, 0, "") 49 | 50 | fig.canvas.mpl_connect('key_press_event', on_press) 51 | 52 | mng = plt.get_current_fig_manager() 53 | mng.window.state('zoomed') 54 | 55 | plt.show() 56 | 57 | if type_label and direction_label: 58 | dst_filename = "{}_{}_{}.png".format( 59 | type_dictionary[type_label], direction_label, time.strftime("%Y%m%d-%H%M%S")) 60 | 61 | os.rename(path, common.LABELED_DIR + dst_filename) 62 | 63 | direction_label = '' 64 | type_label = None 65 | 66 | num_labeled += 1 67 | 68 | if len(unlabeled_imgs) > 0: 69 | print("\nLabeled {} out of {} images ({}%).".format( 70 | num_labeled, len(unlabeled_imgs), 100 * num_labeled // len(unlabeled_imgs))) 71 | print("Finished!") 72 | else: 73 | print("\nThere are no images to label.") 74 | 75 | 76 | def on_press(event): 77 | global type_label 78 | global direction_label 79 | 80 | if event.key in ['1', '2', '3']: 81 | type_label = event.key 82 | if len(direction_label) == 4: 83 | plt.close() 84 | return 85 | 86 | elif event.key in ['left', 'right', 'up', 'down']: 87 | if len(direction_label) < 4: 88 | direction_label += event.key[0] 89 | if len(direction_label) >= 4 and type_label: 90 | plt.close() 91 | return 92 | 93 | elif event.key == 'z': 94 | type_label = None 95 | direction_label = '' 96 | 97 | if event.key != 'q': 98 | if not type_label: 99 | t = '-' 100 | else: 101 | t = type_dictionary[type_label] 102 | 103 | plt_text.set_text(make_text(t, direction_label)) 104 | plt.draw() 105 | 106 | 107 | def make_text(type_label, direction_label): 108 | directions = [] 109 | 110 | for d in direction_label: 111 | if d == 'd': 112 | directions.append('down') 113 | elif d == 'l': 114 | 
directions.append('left') 115 | elif d == 'r': 116 | directions.append('right') 117 | elif d == 'u': 118 | directions.append('up') 119 | 120 | for x in range(len(direction_label), 4): 121 | directions.append('-') 122 | 123 | return "%s: { %s, %s, %s, %s }" % (type_label, directions[0], directions[1], directions[2], directions[3]) 124 | 125 | 126 | if __name__ == "__main__": 127 | main() 128 | -------------------------------------------------------------------------------- /operations/make_dataset.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import random 4 | import re 5 | import sys 6 | 7 | sys.path.insert(0, os.path.abspath('.')) 8 | 9 | import colorful as cf 10 | import numpy as np 11 | import pandas as pd 12 | 13 | import common 14 | 15 | TRAINING_SET_RATIO = 0.9 16 | VALIDATION_SET_RATIO = 0.5 17 | 18 | 19 | def main(training_set_ratio): 20 | common.create_directories() 21 | 22 | arrows = pd.DataFrame( 23 | np.zeros((3, 4), dtype=np.int32), 24 | index=('round', 'wide', 'narrow'), 25 | columns=('down', 'left', 'right', 'up') 26 | ) 27 | 28 | images = [(p, f) for p, f in common.get_files(common.SAMPLES_DIR) if f[-5] != 'F'] 29 | 30 | if images: 31 | for _, filename in images: 32 | arrow_direction, arrow_type = common.arrow_labels(filename) 33 | 34 | arrows[arrow_direction][arrow_type] += 1 35 | 36 | num_samples = int(arrows.min().min() * training_set_ratio) 37 | 38 | print("Samples per type: {}".format(num_samples * 4)) 39 | 40 | for t, _ in arrows.iterrows(): 41 | print("\nProcessing {} arrows...".format(t)) 42 | 43 | for direction in arrows: 44 | candidates = [(p, f) for p, f in images if common.arrow_labels(f) == (direction, t)] 45 | 46 | print("{}: {}".format(direction, len(candidates))) 47 | 48 | training = random.sample(candidates, num_samples) 49 | for path, filename in training: 50 | dst_dir = common.TRAINING_DIR + direction + '/' 51 | os.rename(path, dst_dir + filename) 52 | os.rename(flipped(path), dst_dir + flipped(filename)) 53 | 54 | candidates = [c for c in candidates if c not in training] 55 | 56 | validation = random.sample( 57 | candidates, int(len(candidates) * VALIDATION_SET_RATIO) 58 | ) 59 | for path, filename in validation: 60 | dst_dir = common.VALIDATION_DIR + direction + '/' 61 | os.rename(path, dst_dir + filename) 62 | os.rename(flipped(path), dst_dir + flipped(filename)) 63 | 64 | testing = [c for c in candidates if c not in validation] 65 | for path, filename in testing: 66 | dst_dir = common.TESTING_DIR + direction + '/' 67 | os.rename(path, dst_dir + filename) 68 | os.rename(flipped(path), dst_dir + flipped(filename)) 69 | 70 | show_summary() 71 | 72 | print("\nFinished!") 73 | 74 | 75 | def flipped(s): 76 | return s[:-4] + 'F' + s[-4:] 77 | 78 | 79 | def show_summary(): 80 | print("\n" + cf.skyBlue("Training set")) 81 | print(get_summary_matrix(common.TRAINING_DIR)) 82 | 83 | print("\n" + cf.salmon("Validation set")) 84 | print(get_summary_matrix(common.VALIDATION_DIR)) 85 | 86 | print("\n" + cf.lightGreen("Testing set")) 87 | print(get_summary_matrix(common.TESTING_DIR)) 88 | 89 | 90 | def get_summary_matrix(directory): 91 | matrix = pd.DataFrame( 92 | np.zeros((4, 5), dtype=np.int32), 93 | index=('round', 'wide', 'narrow', 'total'), 94 | columns=('down', 'left', 'right', 'up', 'total') 95 | ) 96 | 97 | images = common.get_files(directory) 98 | 99 | for _, filename in images: 100 | arrow_direction, arrow_type = common.arrow_labels(filename) 101 | 102 | 
matrix[arrow_direction][arrow_type] += 1 103 | 104 | matrix['total'][arrow_type] += 1 105 | matrix[arrow_direction]['total'] += 1 106 | matrix['total']['total'] += 1 107 | 108 | return matrix 109 | 110 | 111 | if __name__ == "__main__": 112 | os.system('color') 113 | 114 | parser = argparse.ArgumentParser() 115 | 116 | parser.add_argument('-r', '--ratio', type=float, default=TRAINING_SET_RATIO, 117 | help="Specifies the training/validation set proportion") 118 | 119 | args = parser.parse_args() 120 | 121 | main(args.ratio) 122 | -------------------------------------------------------------------------------- /model/classify.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import re 4 | import sys 5 | 6 | sys.path.insert(0, os.path.abspath('.')) 7 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 8 | 9 | import colorful as cf 10 | import cv2 11 | import numpy as np 12 | import pandas as pd 13 | import tensorflow.keras 14 | 15 | import common 16 | 17 | 18 | def main(src_subdir, verbose, model_name): 19 | common.create_directories() 20 | 21 | src_dir = common.DATA_DIR + src_subdir + "/" 22 | 23 | model = tensorflow.keras.models.load_model(common.MODEL_DIR + model_name) 24 | 25 | # real (index) x predicted (column) 26 | confusion_matrix = pd.DataFrame(np.zeros((4, 4), dtype=np.int32), 27 | index=('down', 'left', 'right', 'up'), 28 | columns=('down', 'left', 'right', 'up')) 29 | 30 | classification_matrix = pd.DataFrame(np.zeros((4, 3)), 31 | index=('down', 'left', 'right', 'up'), 32 | columns=('precision', 'recall', 'f1')) 33 | 34 | type_matrix = pd.DataFrame(np.zeros((4, 2), dtype=np.int32), 35 | index=('round', 'wide', 'narrow', 'total'), 36 | columns=('correct', 'incorrect')) 37 | 38 | images = common.get_files(src_dir) 39 | 40 | print("Processing {} file(s) in {}/...\n".format(len(images), src_subdir)) 41 | 42 | for path, filename in images: 43 | img = cv2.imread(path, cv2.IMREAD_GRAYSCALE) 44 | 45 | data = np.reshape(img, (1, ) + common.INPUT_SHAPE) 46 | prediction = model.predict(data) 47 | 48 | class_index = np.argmax(prediction) 49 | predicted_class = common.CLASSES[class_index] 50 | 51 | real_class, arrow_type = common.arrow_labels(filename) 52 | 53 | if verbose and real_class != predicted_class: 54 | print(path) 55 | print("Expected {} but got {}: {}\n".format( 56 | cf.lightGreen(real_class), 57 | cf.lightCoral(predicted_class), 58 | str(prediction[0]))) 59 | 60 | confusion_matrix[predicted_class][real_class] += 1 61 | 62 | if real_class == predicted_class: 63 | type_matrix['correct'][arrow_type] += 1 64 | type_matrix['correct']['total'] += 1 65 | else: 66 | type_matrix['incorrect'][arrow_type] += 1 67 | type_matrix['incorrect']['total'] += 1 68 | 69 | print("\n" + cf.sandyBrown("Confusion matrix")) 70 | print(confusion_matrix) 71 | 72 | classification_matrix['precision'] = confusion_matrix.apply(precision) 73 | classification_matrix['recall'] = confusion_matrix.apply(recall, axis=1) 74 | 75 | classification_matrix['f1'] = classification_matrix.apply(f1, axis=1) 76 | 77 | print("\n" + cf.skyBlue("Classification summary")) 78 | print(classification_matrix) 79 | 80 | type_matrix['accuracy'] = type_matrix.apply(type_accuracy, axis=1) 81 | 82 | print("\n" + cf.plum("Accuracy by type")) 83 | print(type_matrix) 84 | 85 | print("\nFinished!") 86 | 87 | 88 | def precision(x): 89 | return round(x[x.name] / sum(x), 4) 90 | 91 | 92 | def recall(x): 93 | return round(x[x.name] / sum(x), 4) 94 | 95 | 96 | def f1(x): 97 | return 
round(2 * (x['precision'] * x['recall']) / (x['precision'] + x['recall']), 4) 98 | 99 | 100 | def type_accuracy(x): 101 | return round(x['correct'] / (x['correct'] + x['incorrect']), 4) 102 | 103 | 104 | if __name__ == "__main__": 105 | os.system('color') 106 | np.set_printoptions(suppress=True) 107 | 108 | parser = argparse.ArgumentParser() 109 | 110 | parser.add_argument('-d', '--dir', type=str, default='testing', 111 | help="Specifies the directory from which images will be classified") 112 | parser.add_argument('-v', '--verbose', action='store_true', 113 | help="Enables logging of misclassified examples") 114 | parser.add_argument('-m', '--model', type=str, default='arrow_model.h5', 115 | help="Specifies the model file name") 116 | 117 | args = parser.parse_args() 118 | 119 | main(args.dir, args.verbose, args.model) 120 | -------------------------------------------------------------------------------- /results/pipeline.txt: -------------------------------------------------------------------------------- 1 | ---------------------------------------------------------------------- 2 | IMAGE PREPROCESSING 3 | ---------------------------------------------------------------------- 4 | preprocessing_accuracy = 0.9975 5 | 6 | $ python preprocessing/preprocess.py -m binarized 7 | 8 | ... 9 | 10 | Approved 798 out of 800 images (99%). 11 | 12 | Samples summary 13 | down left right up total 14 | round 1872 1872 1872 1872 7488 15 | wide 2632 2632 2632 2632 10528 16 | narrow 1880 1880 1880 1880 7520 17 | total 6384 6384 6384 6384 25536 18 | 19 | Finished! 20 | ---------------------------------------------------------------------- 21 | DATASET SPLITTING 22 | ---------------------------------------------------------------------- 23 | split = 0.90, 0.05, 0.05 24 | real_split = 0.79, 0.105, 0.105 25 | 26 | $ python operations/make_dataset.py -r 0.9 27 | 28 | ... 29 | 30 | Training set 31 | down left right up total 32 | round 1684 1684 1684 1684 6736 33 | wide 1684 1684 1684 1684 6736 34 | narrow 1684 1684 1684 1684 6736 35 | total 5052 5052 5052 5052 20208 36 | 37 | Validation set 38 | down left right up total 39 | round 94 94 94 94 376 40 | wide 474 474 474 474 1896 41 | narrow 98 98 98 98 392 42 | total 666 666 666 666 2664 43 | 44 | Testing set 45 | down left right up total 46 | round 94 94 94 94 376 47 | wide 474 474 474 474 1896 48 | narrow 98 98 98 98 392 49 | total 666 666 666 666 2664 50 | 51 | Finished! 52 | ---------------------------------------------------------------------- 53 | MODEL TRAINING 54 | ---------------------------------------------------------------------- 55 | $ python model/train.py -m binarized_model128.h5 -b 128 56 | 57 | ... 58 | 59 | Settings 60 | value 61 | max_epochs 240 62 | patience 80 63 | batch_size 128 64 | 65 | Creating model... 66 | 67 | Creating generators... 68 | Found 20208 images belonging to 4 classes. 69 | Found 2664 images belonging to 4 classes. 70 | 71 | Fitting model... 72 | 73 | ... 74 | 75 | Epoch 228/240 76 | - 28s - loss: 0.0095 - accuracy: 0.9974 - val_loss: 0.0000e+00 - val_accuracy: 0.9945 77 | 78 | ... 79 | 80 | Best epoch: 228 81 | 82 | Saving model... 83 | Model saved to ./model/binarized_model128.h5 84 | 85 | Finished! 86 | ---------------------------------------------------------------------- 87 | VALIDATION PERFORMANCE 88 | ---------------------------------------------------------------------- 89 | $ python model/classify.py -m binarized_model128.h5 -d validation 90 | Processing 2664 file(s) in validation/... 
91 | 92 | Confusion matrix 93 | down left right up 94 | down 662 0 3 1 95 | left 0 665 0 1 96 | right 0 0 666 0 97 | up 0 1 2 663 98 | 99 | Classification summary 100 | precision recall f1 101 | down 1.0000 0.9940 0.9970 102 | left 0.9985 0.9985 0.9985 103 | right 0.9925 1.0000 0.9962 104 | up 0.9970 0.9955 0.9962 105 | 106 | Accuracy by type 107 | correct incorrect accuracy 108 | round 374 2 0.9947 109 | wide 1892 4 0.9979 110 | narrow 390 2 0.9949 111 | total 2656 8 0.9970 112 | 113 | Finished! 114 | ---------------------------------------------------------------------- 115 | FINAL PERFORMANCE 116 | ---------------------------------------------------------------------- 117 | $ python model/classify.py -m binarized_model128.h5 -d testing 118 | Processing 2064 file(s) in testing/... 119 | 120 | Confusion matrix 121 | down left right up 122 | down 664 0 2 0 123 | left 0 665 0 1 124 | right 0 0 666 0 125 | up 0 0 1 665 126 | 127 | Classification summary 128 | precision recall f1 129 | down 1.0000 0.9970 0.9985 130 | left 1.0000 0.9985 0.9992 131 | right 0.9955 1.0000 0.9977 132 | up 0.9985 0.9985 0.9985 133 | 134 | Accuracy by type 135 | correct incorrect accuracy 136 | round 376 0 1.0000 137 | wide 1892 4 0.9979 138 | narrow 392 0 1.0000 139 | total 2660 4 0.9985 140 | ---------------------------------------------------------- 141 | P(Binarized) = 0.9975 * (0.9985)^4 = 0.9915 142 | 143 | normal_test_arrow_accuracy = (1)^4 = 1 144 | normal_test_rune_accuracy = 20 out of 20 145 | 146 | hard_test_arrow_accuracy = (0.9504)^4 = 0.8159 147 | hard_test_rune_accuracy = 16 out of 20 = 0.8 148 | -------------------------------------------------------------------------------- /model/train.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import sys 4 | from datetime import datetime 5 | 6 | sys.path.insert(0, os.path.abspath('.')) 7 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 8 | 9 | import colorful as cf 10 | import numpy as np 11 | import pandas as pd 12 | from keras.callbacks import EarlyStopping, TensorBoard 13 | from keras.layers import Activation, Dense, Flatten, Dropout 14 | from keras.layers import Conv2D, MaxPooling2D 15 | from keras.models import Sequential 16 | from keras.preprocessing.image import ImageDataGenerator 17 | 18 | import common 19 | 20 | BATCH_SIZE = 128 21 | MAX_EPOCHS = 240 22 | PATIENCE = MAX_EPOCHS // 3 23 | 24 | LOG_DIR = './logs/' 25 | 26 | IMAGE_SHAPE = (common.INPUT_SHAPE[0], common.INPUT_SHAPE[1]) 27 | 28 | 29 | def main(batch_size, model_name): 30 | common.create_directories() 31 | 32 | show_settings(batch_size) 33 | 34 | model = make_model() 35 | 36 | training, validation = make_generators(batch_size) 37 | 38 | fit(model, training, validation, batch_size) 39 | 40 | save(model, model_name) 41 | 42 | print("\nFinished!") 43 | print("Run " + cf.skyBlue("classify.py") + 44 | " to test the model and get information about its performance.") 45 | print("More information available with " + cf.orange("Tensorboard") + ".") 46 | 47 | 48 | def show_settings(batch_size): 49 | print("Classification model training application started.\n") 50 | 51 | settings = pd.DataFrame(index=('max_epochs', 'patience', 'batch_size'), 52 | columns=('value', )) 53 | 54 | settings['value']['max_epochs'] = MAX_EPOCHS 55 | settings['value']['patience'] = PATIENCE 56 | settings['value']['batch_size'] = batch_size 57 | 58 | print(cf.skyBlue("Settings")) 59 | print(settings) 60 | 61 | 62 | def make_model(): 63 | print("\nCreating 
model...") 64 | 65 | model = Sequential() 66 | 67 | # Convolution block 1 68 | model.add(Conv2D(32, (3, 3), padding='same', input_shape=common.INPUT_SHAPE)) 69 | model.add(Activation('relu')) 70 | model.add(MaxPooling2D(pool_size=(2, 2))) 71 | 72 | # Convolution block 2 73 | model.add(Conv2D(48, (3, 3), padding='same')) 74 | model.add(Activation('relu')) 75 | model.add(MaxPooling2D(pool_size=(2, 2))) 76 | 77 | # Convolution block 3 78 | model.add(Conv2D(64, (3, 3), padding='same')) 79 | model.add(Activation('relu')) 80 | model.add(MaxPooling2D(pool_size=(2, 2))) 81 | 82 | model.add(Flatten()) 83 | 84 | model.add(Dense(64, activation='relu')) 85 | model.add(Dropout(0.2)) 86 | 87 | model.add(Dense(len(common.CLASSES), activation='softmax')) 88 | 89 | model.compile(optimizer='adam', 90 | loss='categorical_crossentropy', 91 | metrics=['accuracy']) 92 | 93 | return model 94 | 95 | 96 | def make_generators(batch_size): 97 | print("\nCreating generators...") 98 | 99 | aug = ImageDataGenerator( 100 | width_shift_range=0.125, height_shift_range=0.125, zoom_range=0.2) 101 | 102 | training = aug.flow_from_directory( 103 | common.TRAINING_DIR, 104 | target_size=IMAGE_SHAPE, 105 | color_mode='grayscale', 106 | batch_size=batch_size, 107 | class_mode='categorical') 108 | 109 | validation = aug.flow_from_directory( 110 | common.VALIDATION_DIR, 111 | target_size=IMAGE_SHAPE, 112 | color_mode='grayscale', 113 | batch_size=batch_size, 114 | class_mode='categorical') 115 | 116 | return training, validation 117 | 118 | 119 | def fit(model, training, validation, batch_size): 120 | print("\nFitting model...") 121 | 122 | history = model.fit_generator( 123 | training, 124 | epochs=MAX_EPOCHS, 125 | validation_data=validation, 126 | steps_per_epoch=training.n // batch_size, 127 | validation_steps=validation.n // batch_size, 128 | callbacks=setup_callbacks(), 129 | workers=2, 130 | verbose=2) 131 | 132 | best_epoch = np.argmin(history.history['val_loss']) + 1 133 | print("\n" + cf.lightGreen("Best epoch: {}".format(best_epoch))) 134 | 135 | 136 | def setup_callbacks(): 137 | log_dir = LOG_DIR + datetime.now().strftime("%Y%m%d-%H%M%S") 138 | tensorboard_callback = TensorBoard(log_dir=log_dir) 139 | 140 | early_stopping = EarlyStopping(patience=PATIENCE, verbose=1, restore_best_weights=True) 141 | 142 | return [tensorboard_callback, early_stopping] 143 | 144 | 145 | def save(model, model_name): 146 | print("\nSaving model...") 147 | 148 | path = common.MODEL_DIR + model_name 149 | 150 | model.save(path) 151 | print("Model saved to " + cf.skyBlue(path)) 152 | 153 | 154 | if __name__ == "__main__": 155 | os.system('color') 156 | 157 | parser = argparse.ArgumentParser() 158 | 159 | parser.add_argument('-b', '--batch_size', type=int, default=BATCH_SIZE, 160 | help="Specifies the batch size") 161 | parser.add_argument('-m', '--model', type=str, default="arrow_model.h5", 162 | help="Specifies the output model name") 163 | 164 | args = parser.parse_args() 165 | 166 | main(args.batch_size, args.model) 167 | -------------------------------------------------------------------------------- /results/grayscale.txt: -------------------------------------------------------------------------------- 1 | n = 660 2 | 3 | (h, s, v): arrow center misses 4 | ============================== 5 | (0.05, 0.65, 0.30): 36 6 | (0.10, 0.55, 0.35): 37 7 | (0.00, 0.70, 0.30): 37 8 | (0.10, 0.50, 0.40): 40 9 | (0.05, 0.55, 0.40): 40 10 | (0.20, 0.50, 0.30): 41 11 | (0.05, 0.70, 0.25): 42 12 | (0.00, 0.55, 0.45): 42 13 | (0.05, 0.50, 0.45): 43 14 | 
(0.00, 0.65, 0.35): 43 15 | (0.15, 0.50, 0.35): 44 16 | (0.00, 0.60, 0.40): 44 17 | (0.10, 0.65, 0.25): 45 18 | (0.25, 0.45, 0.30): 46 19 | (0.20, 0.55, 0.25): 46 20 | (0.15, 0.60, 0.25): 46 21 | (0.15, 0.55, 0.30): 46 22 | (0.10, 0.60, 0.30): 46 23 | (0.00, 0.80, 0.20): 46 24 | (0.30, 0.45, 0.25): 47 25 | (0.35, 0.45, 0.20): 48 26 | (0.20, 0.45, 0.35): 48 27 | (0.00, 0.85, 0.15): 48 28 | (0.00, 0.75, 0.25): 48 29 | (0.25, 0.50, 0.25): 49 30 | (0.30, 0.50, 0.20): 50 31 | (0.15, 0.65, 0.20): 50 32 | (0.15, 0.45, 0.40): 51 33 | (0.00, 0.50, 0.50): 51 34 | (0.35, 0.40, 0.25): 53 35 | (0.30, 0.40, 0.30): 53 36 | (0.25, 0.60, 0.15): 53 37 | (0.20, 0.60, 0.20): 53 38 | (0.15, 0.70, 0.15): 53 39 | (0.00, 0.90, 0.10): 53 40 | (0.20, 0.35, 0.45): 54 41 | (0.40, 0.45, 0.15): 55 42 | (0.20, 0.40, 0.40): 55 43 | (0.10, 0.80, 0.10): 55 44 | (0.05, 0.75, 0.20): 55 45 | (0.05, 0.45, 0.50): 55 46 | (0.30, 0.55, 0.15): 56 47 | (0.25, 0.55, 0.20): 56 48 | (0.20, 0.65, 0.15): 56 49 | (0.10, 0.85, 0.05): 57 50 | (0.35, 0.35, 0.30): 58 51 | (0.15, 0.75, 0.10): 58 52 | (0.25, 0.65, 0.10): 59 53 | (0.05, 0.90, 0.05): 59 54 | (0.00, 0.95, 0.05): 59 55 | (0.45, 0.40, 0.15): 60 56 | (0.40, 0.50, 0.10): 60 57 | (0.35, 0.50, 0.15): 60 58 | (0.20, 0.80, 0.00): 60 59 | (0.10, 0.45, 0.45): 60 60 | (0.05, 0.80, 0.15): 60 61 | (0.30, 0.65, 0.05): 61 62 | (0.15, 0.40, 0.45): 61 63 | (0.10, 0.70, 0.20): 61 64 | (0.45, 0.35, 0.20): 62 65 | (0.25, 0.35, 0.40): 62 66 | (0.20, 0.75, 0.05): 62 67 | (0.05, 0.95, 0.00): 62 68 | (0.45, 0.50, 0.05): 63 69 | (0.25, 0.70, 0.05): 63 70 | (0.45, 0.30, 0.25): 64 71 | (0.40, 0.35, 0.25): 64 72 | (0.30, 0.30, 0.40): 64 73 | (0.10, 0.75, 0.15): 64 74 | (0.25, 0.40, 0.35): 65 75 | (0.10, 0.40, 0.50): 65 76 | (0.00, 0.45, 0.55): 65 77 | (0.45, 0.45, 0.10): 66 78 | (0.40, 0.40, 0.20): 66 79 | (0.15, 0.80, 0.05): 66 80 | (0.15, 0.85, 0.00): 67 81 | (0.05, 0.40, 0.55): 67 82 | (0.10, 0.90, 0.00): 68 83 | (0.30, 0.70, 0.00): 70 84 | (0.25, 0.30, 0.45): 71 85 | (0.50, 0.40, 0.10): 72 86 | (0.40, 0.60, 0.00): 72 87 | (0.35, 0.65, 0.00): 72 88 | (0.40, 0.55, 0.05): 73 89 | (0.55, 0.20, 0.25): 74 90 | (0.50, 0.35, 0.15): 74 91 | (0.45, 0.55, 0.00): 74 92 | (0.25, 0.75, 0.00): 74 93 | (0.40, 0.30, 0.30): 76 94 | (0.55, 0.25, 0.20): 77 95 | (0.50, 0.30, 0.20): 77 96 | (0.50, 0.45, 0.05): 78 97 | (0.30, 0.25, 0.45): 79 98 | (0.55, 0.30, 0.15): 80 99 | (0.45, 0.25, 0.30): 80 100 | (0.15, 0.35, 0.50): 80 101 | (0.50, 0.50, 0.00): 81 102 | (0.50, 0.25, 0.25): 81 103 | (0.40, 0.25, 0.35): 81 104 | (0.75, 0.15, 0.10): 84 105 | (0.65, 0.20, 0.15): 85 106 | (0.60, 0.25, 0.15): 85 107 | (0.60, 0.20, 0.20): 86 108 | (0.35, 0.25, 0.40): 86 109 | (0.00, 0.40, 0.60): 87 110 | (0.70, 0.15, 0.15): 88 111 | (0.70, 0.10, 0.20): 89 112 | (0.60, 0.15, 0.25): 89 113 | (0.70, 0.25, 0.05): 91 114 | (0.10, 0.35, 0.55): 92 115 | (0.55, 0.40, 0.05): 93 116 | (0.50, 0.20, 0.30): 93 117 | (0.20, 0.30, 0.50): 93 118 | (0.65, 0.25, 0.10): 94 119 | (0.65, 0.15, 0.20): 94 120 | (0.55, 0.15, 0.30): 95 121 | (0.15, 0.30, 0.55): 95 122 | (0.75, 0.20, 0.05): 96 123 | (0.65, 0.30, 0.05): 97 124 | (0.55, 0.45, 0.00): 97 125 | (0.25, 0.25, 0.50): 97 126 | (0.60, 0.40, 0.00): 98 127 | (0.70, 0.05, 0.25): 101 128 | (0.05, 0.35, 0.60): 103 129 | (0.75, 0.25, 0.00): 104 130 | (0.45, 0.20, 0.35): 104 131 | (0.75, 0.05, 0.20): 105 132 | (0.65, 0.10, 0.25): 105 133 | (0.80, 0.15, 0.05): 106 134 | (0.40, 0.20, 0.40): 106 135 | (0.35, 0.20, 0.45): 106 136 | (0.65, 0.35, 0.00): 108 137 | (0.70, 0.30, 0.00): 109 138 | (0.50, 0.15, 0.35): 110 139 | 
(0.10, 0.30, 0.60): 110 140 | (0.80, 0.20, 0.00): 111 141 | (0.75, 0.10, 0.15): 111 142 | (0.60, 0.10, 0.30): 112 143 | (0.20, 0.25, 0.55): 112 144 | (0.00, 0.35, 0.65): 113 145 | (0.80, 0.10, 0.10): 114 146 | (0.80, 0.05, 0.15): 115 147 | (0.65, 0.05, 0.30): 116 148 | (0.85, 0.15, 0.00): 119 149 | (0.75, 0.00, 0.25): 119 150 | (0.30, 0.20, 0.50): 119 151 | (0.85, 0.00, 0.15): 121 152 | (0.05, 0.30, 0.65): 123 153 | (0.85, 0.10, 0.05): 125 154 | (0.45, 0.15, 0.40): 125 155 | (0.40, 0.15, 0.45): 128 156 | (0.90, 0.05, 0.05): 130 157 | (0.80, 0.00, 0.20): 130 158 | (0.15, 0.25, 0.60): 131 159 | (0.70, 0.00, 0.30): 132 160 | (0.90, 0.10, 0.00): 134 161 | (0.65, 0.00, 0.35): 135 162 | (0.25, 0.20, 0.55): 135 163 | (0.55, 0.10, 0.35): 137 164 | (0.00, 0.30, 0.70): 138 165 | (0.35, 0.15, 0.50): 142 166 | (0.90, 0.00, 0.10): 148 167 | (0.50, 0.10, 0.40): 148 168 | (0.45, 0.10, 0.45): 148 169 | (0.95, 0.05, 0.00): 149 170 | (0.30, 0.15, 0.55): 150 171 | (0.20, 0.20, 0.60): 150 172 | (0.95, 0.00, 0.05): 155 173 | (0.55, 0.05, 0.40): 155 174 | (0.50, 0.05, 0.45): 160 175 | (0.10, 0.25, 0.65): 160 176 | (0.60, 0.00, 0.40): 161 177 | (0.05, 0.25, 0.70): 162 178 | (0.40, 0.10, 0.50): 164 179 | (0.10, 0.20, 0.70): 168 180 | (0.55, 0.00, 0.45): 172 181 | (0.25, 0.15, 0.60): 172 182 | (0.15, 0.20, 0.65): 173 183 | (0.00, 0.25, 0.75): 175 184 | (0.35, 0.10, 0.55): 184 185 | (0.00, 0.20, 0.80): 190 186 | (0.20, 0.15, 0.65): 193 187 | (0.15, 0.15, 0.70): 193 188 | (0.45, 0.05, 0.50): 196 189 | (0.25, 0.10, 0.65): 196 190 | (0.05, 0.20, 0.75): 196 191 | (0.30, 0.10, 0.60): 198 192 | (0.10, 0.15, 0.75): 207 193 | (0.35, 0.05, 0.60): 208 194 | (0.50, 0.00, 0.50): 211 195 | (0.40, 0.05, 0.55): 213 196 | (0.15, 0.10, 0.75): 214 197 | (0.20, 0.10, 0.70): 220 198 | (0.00, 0.15, 0.85): 223 199 | (0.05, 0.15, 0.80): 224 200 | (0.45, 0.00, 0.55): 229 201 | (0.10, 0.10, 0.80): 231 202 | (0.30, 0.05, 0.65): 237 203 | (0.40, 0.00, 0.60): 239 204 | (0.20, 0.05, 0.75): 239 205 | (0.25, 0.05, 0.70): 244 206 | (0.35, 0.00, 0.65): 260 207 | (0.15, 0.05, 0.80): 262 208 | (0.00, 0.10, 0.90): 262 209 | (0.05, 0.10, 0.85): 263 210 | (0.30, 0.00, 0.70): 281 211 | (0.10, 0.05, 0.85): 289 212 | (0.25, 0.00, 0.75): 292 213 | (0.20, 0.00, 0.80): 296 214 | (0.15, 0.00, 0.85): 298 215 | (0.05, 0.05, 0.90): 299 216 | (0.00, 0.05, 0.95): 299 217 | (0.10, 0.00, 0.90): 317 218 | (0.05, 0.00, 0.95): 332 219 | -------------------------------------------------------------------------------- /preprocessing/preprocess.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import re 4 | import uuid 5 | import sys 6 | 7 | sys.path.insert(0, os.path.abspath('.')) 8 | 9 | import colorful as cf 10 | import cv2 11 | import numpy as np 12 | import pandas as pd 13 | from skimage import morphology 14 | 15 | import common 16 | 17 | OUTPUT_WIDTH = common.INPUT_SHAPE[0] 18 | 19 | ARROW_BOX_DIST = 100 20 | SEARCH_REGION_WIDTH = 120 21 | SEARCH_REGION_HEIGHT = 100 22 | 23 | EXIT_KEY = 113 # q 24 | APPROVE_KEY = 32 # space 25 | 26 | 27 | def main(inspection, mode, automatic): 28 | common.create_directories() 29 | 30 | print(" SPACE = approve") 31 | print("OTHER KEYS = skip") 32 | print(" Q = quit\n") 33 | 34 | labeled_imgs = common.get_files(common.LABELED_DIR) 35 | 36 | approved = 0 37 | for path, filename in labeled_imgs: 38 | print("Processing " + cf.skyBlue(path)) 39 | 40 | arrows = [] 41 | 42 | display = cv2.imread(path) 43 | height, width, _ = display.shape 44 | 45 | # manually tuned 
values 46 | search_x, search_y = width // 5 + 35, height // 4 47 | search_width, search_height = SEARCH_REGION_WIDTH, height // 2 - search_y 48 | 49 | for _ in range(4): 50 | x0 = search_x 51 | x1 = x0 + search_width 52 | 53 | y0 = search_y 54 | y1 = y0 + search_height 55 | 56 | img = display[y0:y1, x0:x1] 57 | (cx, cy), arrow_box = process_arrow(img, mode) 58 | 59 | search_x += int(cx + ARROW_BOX_DIST - SEARCH_REGION_WIDTH / 2) 60 | search_y += int(cy - SEARCH_REGION_HEIGHT / 2) 61 | 62 | search_width = SEARCH_REGION_WIDTH 63 | search_height = SEARCH_REGION_HEIGHT 64 | 65 | arrows.append(arrow_box) 66 | 67 | if not automatic: 68 | arrow_type, directions, _ = re.split('_', filename) 69 | reference = get_reference_arrows(directions, arrows[0].shape) 70 | 71 | cv2.imshow(arrow_type, np.vstack([np.hstack(arrows), reference])) 72 | 73 | key = cv2.waitKey() 74 | cv2.destroyAllWindows() 75 | else: 76 | key = APPROVE_KEY 77 | 78 | if key == APPROVE_KEY: 79 | if not inspection: 80 | save_arrow_imgs(arrows, filename) 81 | approved += 1 82 | elif key == EXIT_KEY: 83 | break 84 | else: 85 | print("Skipped!") 86 | 87 | if len(labeled_imgs) > 0: 88 | print("\nApproved {} out of {} images ({}%).\n".format( 89 | approved, len(labeled_imgs), 100 * approved // len(labeled_imgs))) 90 | else: 91 | print("There are no images to preprocess.\n") 92 | 93 | show_summary() 94 | 95 | print("Finished!") 96 | 97 | 98 | def process_arrow(img, mode): 99 | # gaussian blur 100 | img = cv2.GaussianBlur(img, (3, 3), 0) 101 | 102 | # color transform 103 | img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV) 104 | 105 | coefficients = (0.0445, 0.6568, 0.2987) # (h, s, v) 106 | img = cv2.transform(img, np.array(coefficients).reshape((1, 3))) 107 | 108 | if mode == 'gray': 109 | output = img.copy() 110 | 111 | # binarization 112 | img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 5, -1) 113 | 114 | # noise removal 115 | denoise(img, threshold=8, conn=2) 116 | 117 | if mode == 'binarized': 118 | output = img.copy() 119 | 120 | # processing 121 | cx, cy = compute_arrow_centroid(img) 122 | 123 | # result cropping 124 | max_height, max_width = img.shape 125 | 126 | x0 = max(int(cx - OUTPUT_WIDTH / 2), 0) 127 | y0 = max(int(cy - OUTPUT_WIDTH / 2), 0) 128 | 129 | x1 = int(x0 + OUTPUT_WIDTH) 130 | if x1 >= max_width: 131 | x0 -= x1 - max_width 132 | x1 = max_width 133 | 134 | y1 = int(y0 + OUTPUT_WIDTH) 135 | if y1 >= max_height: 136 | y0 -= y1 - max_height 137 | y1 = max_height 138 | 139 | box = output[y0:y1, x0:x1] 140 | 141 | return (cx, cy), box 142 | 143 | 144 | def denoise(img, threshold=64, conn=2): 145 | processed = img > 0 146 | 147 | processed = morphology.remove_small_objects( 148 | processed, min_size=threshold, connectivity=conn) 149 | 150 | processed = morphology.remove_small_holes( 151 | processed, area_threshold=threshold, connectivity=conn) 152 | 153 | mask_x, mask_y = np.where(processed == True) 154 | img[mask_x, mask_y] = 255 155 | 156 | mask_x, mask_y = np.where(processed == False) 157 | img[mask_x, mask_y] = 0 158 | 159 | 160 | def compute_arrow_centroid(img): 161 | contours, _ = cv2.findContours( 162 | img, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE) 163 | 164 | # filter contours by area 165 | candidates = [] 166 | 167 | for contour in contours: 168 | score, (cx, cy), area = circle_features(contour) 169 | 170 | if area > 784 and area < 3600: 171 | candidates.append(((cx, cy), score)) 172 | 173 | if candidates: 174 | match = max(candidates, key=lambda x: x[1]) 175 | (cx, cy), score = 
match 176 | 177 | if score > 0.8: 178 | return (int(cx), int(cy)) 179 | 180 | print("Centroid not found! Returning the center point...") 181 | 182 | height, width = img.shape 183 | return (width // 2, height // 2) 184 | 185 | 186 | def circle_features(contour): 187 | hull = cv2.convexHull(contour) 188 | 189 | if len(hull) < 5: 190 | return 0, (-1, -1), -1 191 | 192 | hull_area = cv2.contourArea(hull) 193 | 194 | (ex, ey), (d1, d2), angle = cv2.fitEllipse(hull) 195 | ellipse_area = np.pi * (d1 / 2) * (d2 / 2) 196 | 197 | (cx, cy), r = cv2.minEnclosingCircle(hull) 198 | circle_area = np.pi * r ** 2 199 | 200 | s1 = abs(ellipse_area - hull_area) / max(ellipse_area, hull_area) 201 | s2 = abs(ellipse_area - circle_area) / max(ellipse_area, circle_area) 202 | 203 | score = 1 - np.mean([s1, s2]) 204 | 205 | return score, (ex, ey), ellipse_area 206 | 207 | 208 | def get_reference_arrows(directions, shape): 209 | reference = [] 210 | 211 | for d in directions: 212 | arrow = np.zeros(shape, dtype=np.uint8) 213 | 214 | w, h = shape[1], shape[0] 215 | cx, cy = w // 2, h // 3 216 | 217 | # upward arrow 218 | points = np.array([(cx - w // 5, cy + h // 8), 219 | (cx + w // 5, cy + h // 8), 220 | (cx, cy - h // 8)]) 221 | 222 | cv2.fillConvexPoly(arrow, points, (255, 255, 255)) 223 | cv2.line(arrow, (cx, cy), (cx, 3 * h // 5), (255, 255, 255), 10) 224 | 225 | rotations = 0 226 | 227 | if d == 'r': 228 | rotations = 1 229 | elif d == 'd': 230 | rotations = 2 231 | elif d == 'l': 232 | rotations = 3 233 | 234 | for _ in range(rotations): 235 | arrow = cv2.rotate(arrow, cv2.ROTATE_90_CLOCKWISE) 236 | 237 | reference.append(arrow) 238 | 239 | return np.hstack(reference) 240 | 241 | 242 | def save_arrow_imgs(arrows, labeled_filename): 243 | words = re.split('_', labeled_filename) 244 | arrow_type = words[0] 245 | directions = words[1] 246 | 247 | # save individual arrows + their rotated and flipped versions 248 | for x, arrow_img in enumerate(arrows): 249 | for rotation in range(4): 250 | if rotation > 0: 251 | arrow_img = cv2.rotate(arrow_img, cv2.ROTATE_90_CLOCKWISE) 252 | 253 | direction = get_direction(directions[x], rotation) 254 | arrow_path = "{}{}_{}_{}".format(common.SAMPLES_DIR, arrow_type, direction, uuid.uuid4()) 255 | 256 | cv2.imwrite(arrow_path + ".png", arrow_img) 257 | 258 | if direction in ['down', 'up']: 259 | flipped_img = cv2.flip(arrow_img, 1) 260 | else: 261 | flipped_img = cv2.flip(arrow_img, 0) 262 | 263 | cv2.imwrite(arrow_path + "F.png", flipped_img) 264 | 265 | os.rename(common.LABELED_DIR + labeled_filename, 266 | common.PREPROCESSED_DIR + labeled_filename) 267 | 268 | 269 | def get_direction(direction, rotation): 270 | direction_dict = { 271 | 'l': 'left', 272 | 'u': 'up', 273 | 'r': 'right', 274 | 'd': 'down' 275 | } 276 | rotation_list = ['l', 'u', 'r', 'd'] 277 | 278 | new_index = (rotation_list.index(direction) + 279 | rotation) % len(rotation_list) 280 | new_direction = rotation_list[new_index] 281 | 282 | return direction_dict[new_direction] 283 | 284 | 285 | def show_summary(): 286 | matrix = pd.DataFrame(np.zeros((4, 5), dtype=np.int32), index=( 287 | 'round', 'wide', 'narrow', 'total'), columns=('down', 'left', 'right', 'up', 'total')) 288 | 289 | images = common.get_files(common.SAMPLES_DIR) 290 | 291 | for _, filename in images: 292 | arrow_direction, arrow_type = common.arrow_labels(filename) 293 | 294 | matrix[arrow_direction][arrow_type] += 1 295 | 296 | matrix['total'][arrow_type] += 1 297 | matrix[arrow_direction]['total'] += 1 298 | matrix['total']['total'] += 1 
299 | 300 | print(cf.salmon("Samples summary")) 301 | print(matrix, "\n") 302 | 303 | 304 | if __name__ == "__main__": 305 | parser = argparse.ArgumentParser() 306 | 307 | parser.add_argument('-i', '--inspection', action='store_true', 308 | help="Toggles the inspection mode, which disables the output") 309 | parser.add_argument('-m', '--mode', default='binarized', type=str, 310 | choices=['binarized', 'gray'], 311 | help="Sets the output mode to binarized or grayscale") 312 | parser.add_argument('-a', '--automatic', action='store_true', 313 | help="Toggles the automatic mode, which approves all screenshots") 314 | 315 | args = parser.parse_args() 316 | 317 | main(args.inspection, args.mode, args.automatic) 318 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Rune Breaker 2 | This project aims to test the effectiveness of one of MapleStory's anti-botting mechanisms: the rune system. In the process, a model that can solve [**up to 99.15%***](#errors-and-biases) of the runes was created. 3 | 4 | The project provides an end-to-end pipeline that encompasses every step necessary to replicate this model. Additionally, the article below goes over all the details involved during the development of the model. Toward the end, it also considers possible improvements and alternatives to the current rune system. Make sure to read it. 5 | 6 | > This project was created solely for research and learning purposes. The usage of this project for malicious purposes in-game is against Nexon's terms of service and, thus, *discouraged*. 7 | 8 | ## Summary 9 | 1. [Introduction](#introduction) 10 | - [What are runes?](#what-are-runes) 11 | - [What is the challenge?](#what-is-the-challenge) 12 | - [How can it be solved?](#how-can-it-be-solved) 13 | 2. [The Pipeline](#the-pipeline) 14 | - [Overview](#overview) 15 | - [Labeling](#labeling) 16 | - [Preprocessing](#preprocessing) 17 | - [Dataset](#dataset) 18 | - [Classification model](#classification-model) 19 | 3. [Results and Observations](#results-and-observations) 20 | - [Performance](#performance) 21 | - [Observations](#observations) 22 | 4. [Final Considerations](#final-considerations) 23 | 24 | ## Introduction 25 | MapleStory is a massively multiplayer online role-playing game published by Nexon and available in many countries around the world. 26 | 27 | ### What are runes? 28 | In an attempt to protect the game from botting, MapleStory employs a kind of [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) system. From time to time, a rune is spawned somewhere on the player's map. When the user walks into it and presses the activation button, the game displays a panel containing four arrows. See an example below. 29 | 30 | ![Picture of a rune being activated](./docs/rune_example.png) 31 | 32 | The player is then asked to type the direction of the four arrows. If the directions are correct, the rune disappears and the player receives a buff. However, if the user does not activate it within a certain amount of time, they will stop receiving any rewards for hunting monsters, hampering the performance of bots that cannot handle runes. 33 | 34 | That said, our goal is to create a computer program capable of determining the direction of the arrows when given a screenshot from the game. 35 | 36 | ### What is the challenge? 
37 | If arrows were static textures, it would be possible to apply a template matching algorithm and reliably determine the arrow directions. Interestingly, that is how runes were initially implemented. 38 | 39 | ![Picture of the old rune system](./docs/old_rune_system.png) 40 | 41 | Nevertheless, the game was later [updated](http://maplestory.nexon.net/news/24928/updated-v-188-tune-up-patch-notes) to add randomness to the arrows. For instance, both the position of the panel and the position of the arrows within it are randomized. Moreover, every color is randomized and often mixed in a semi-transparent gradient. Finally, there are four different arrow shapes, which then undergo rotation and scaling transformations. 42 | 43 | ![Picture of heavily obfuscated arrows](./docs/difficult_rune_example.png) 44 | 45 | As shown in the picture above, these obfuscation mechanisms can make it hard to determine the arrow directions, even for humans. Together, these techniques make it more difficult for machines to determine the location, the shape, and the orientation of the arrows. 46 | 47 | Is it possible to overcome these challenges? 48 | 49 | ### How can it be solved? 50 | As we have just discussed, there are several layers of difficulty that must be dealt with to enable a computer program to determine the arrow directions. With that in mind, splitting the problem into different steps is a good idea. 51 | 52 | First, the program should be able to determine the location of each arrow on the screen. After that, the software should classify them as **down**, **left**, **right**, or **up**. 53 | 54 | #### Locating the arrows 55 | Some investigation reveals that it is possible to determine the location of the arrows using less sophisticated techniques. Even though this task can be performed by an artificial neural network, the simplest option is to explore solutions within traditional computer vision. 56 | 57 | #### Classifying the arrow directions 58 | On the other hand, classifying arrow directions is a great candidate for machine learning solutions. First of all, each arrow could fit inside a grayscale 60×60×1 image, which is a relatively small input size. Moreover, it is very convenient to gather data, since it is possible to generate 32 different sample images for each screenshot. The only requirement is to rotate and flip each of the four arrows in the screenshot. 59 | 60 | ## The Pipeline 61 | In this section, we will see how to put these steps together to create a program that classifies rune arrows. 62 | 63 | ### Overview 64 | First, we should define two pipelines: the **model pipeline** and the **runtime pipeline**. See the diagram below. 65 | 66 | ![Pipeline diagram](./docs/pipelines.png) 67 | 68 | The model pipeline is responsible for taking a set of screenshots as input and performing a sequence of steps to create an arrow classification model. The runtime pipeline, on the other hand, represents hypothetical botting software. This software would use parts of the model pipeline in addition to the trained classifier to activate runes during the game runtime. 69 | 70 | Considering the purposes of this project, only the model pipeline will be discussed. 71 | 72 | Each step of this pipeline will be thoroughly explained in the sections below. Also, the experimental results obtained during the development of this project will be presented. 73 | 74 | ### Labeling 75 | First, we should collect game screenshots and label them according to the direction and shape of the arrows.
To make this process easier, the [`label.py`](./preprocessing/label.py) script was created. The next figure shows the application screen when running the command below. 76 | 77 | ``` 78 | $ python preprocessing/label.py 79 | ``` 80 | 81 | ![Labeling application screen](./docs/labeler.png) 82 | 83 | When run, the program iterates over all images inside the `./data/screenshots/` directory. For each screenshot, the user should press the arrow keys in the correct order to label the directions and `1`, `2`, or `3` to label the arrow shape. 84 | 85 | There are five different types of arrows, which can be arranged into three groups, as shown below. 86 | 87 | ![Arrow shapes](./docs/arrow_types.png) 88 | 89 | This distinction is made to avoid problems related to data imbalance. This topic is explained in more detail in the [dataset](#dataset) section. 90 | 91 | After a screenshot is labeled, the application moves it to the `./data/labeled/` folder. Finally, the file is renamed so that its name contains its labels and a unique ID (e.g. `wide_rdlu_123.png`, where each letter in `rdlu` represents a direction). 92 | 93 | > Note: instead of manually collecting and labeling game screenshots, it is also possible to automatically generate artificial samples by directly using the game assets. 94 | 95 | ### Preprocessing 96 | With our screenshots correctly labeled, we can move on to the preprocessing stage. 97 | 98 | Simply put, this step is responsible for locating the arrows within the labeled screenshots and producing preprocessed output images. These images are the arrow samples that will feed the classification model later on. 99 | 100 | This process is handled by the [`preprocess.py`](./preprocessing/preprocess.py) script. Its internals can be visualized through the following diagram. 101 | 102 | ![Preprocessing diagram](./docs/preprocessing.png) 103 | 104 | Each screenshot undergoes a sequence of transformations that simplify the image in several aspects. Next, the transformed image is analyzed and the position of each arrow is determined. Lastly, the script generates the output samples. 105 | 106 | Let's examine each one of these operations individually. 107 | 108 | #### Cropping 109 | Our first objective is to make it easier to find the location of the arrows within a screenshot. Consider the following input image. 110 | 111 | ![Input screenshot](./docs/input_image.png) 112 | 113 | The input has a width of 800 pixels, a height of 600 pixels, and 3 RGB color channels (800×600×3 = 1,440,000 values), which is a considerable amount of data to be processed. This makes detecting objects difficult and may ultimately harm the accuracy of our model. What can be done to overcome this obstacle? 114 | 115 | One alternative relies on processing each arrow separately. We can accomplish this by first restricting the search to a small portion of the screen. Next, the coordinates of the leftmost arrow are determined. After that, these coordinates are used to define the next search region. Then, the process repeats until all arrows have been found. This is possible because, although the positions of the arrows are random, they still fall within a predictable zone (a minimal code sketch of this idea is shown at the end of this subsection). 116 | 117 | This is the first search region for our input image: 118 | 119 | ![First search region](./docs/search_region.png) 120 | 121 | As can be seen, the search problem has been reduced to an image much smaller than the original one (120×150×3 = 54,000 values). However, the resulting picture still has three color channels. This issue will be addressed next.
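Before moving on, here is the promised sketch of the iterative cropping. It is only meant to illustrate the idea: the window size, the starting position, and the horizontal step are illustrative assumptions, not the exact values used by [`preprocess.py`](./preprocessing/preprocess.py).

```py
import cv2

# Illustrative values only; the real script derives them from the rune panel layout.
REGION_W, REGION_H = 120, 150   # size of each search window
STEP = 120                      # assumed horizontal distance between consecutive arrows

def search_regions(screenshot, start_x, start_y):
    """Yield one BGR search window per arrow, moving from left to right."""
    x, y = start_x, start_y
    for _ in range(4):  # a rune always contains four arrows
        yield screenshot[y:y + REGION_H, x:x + REGION_W]
        # In the actual pipeline, the detected arrow center refines this step.
        x += STEP

screenshot = cv2.imread('./data/labeled/wide_rdlu_123.png')
for region in search_regions(screenshot, start_x=250, start_y=180):
    print(region.shape)  # (150, 120, 3)
```

Each yielded window is what the transformations described next operate on.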
122 | 123 | #### Transforming the input 124 | The key to identifying the arrows is [edge detection](https://en.wikipedia.org/wiki/Edge_detection). Nevertheless, the obfuscation and the complexity of the input image make the process more difficult. 125 | 126 | To make things simpler, we can compress the information of the image from three color channels into a single grayscale channel. However, we should not directly combine the R, G, and B channels. Since the arrows may be of almost any color, we cannot define a single formula that works well for any image. Instead, we will combine the HSV representation of the image, which separates color and intensity information into three different components. 127 | 128 | After some experimentation, the following linear transformation yielded the best results. 129 | 130 | ``` 131 | Grayscale = (0.0445 * H) + (0.6568 * S) + (0.2987 * V) 132 | ``` 133 | 134 | Additionally, we can apply Gaussian Blur to the image to eliminate small noise. The result of these transformations is shown below. 135 | 136 | ![Grayscale search region](./docs/grayscale.png) 137 | 138 | Although the new image has only one channel instead of three (120×150×1 = 18,000), the outlines of the arrow can still be identified. We have roughly the same information with fewer data. This same result is observed in several other examples, even if at different degrees. 139 | 140 | However, we can still do more. Each pixel in the current image now ranges from 0 to 255, since it is a grayscale image. In a binarized image, on the other hand, each pixel is either 0 (black) or 255 (white). In this case, the contours are objectively clear and can be easily processed by an algorithm (or a neural network). 141 | 142 | Since the image has many variations of shading and lighting, the most appropriate way to binarize it is to use adaptive thresholding. Fortunately, [OpenCV](https://opencv.org/) provides this functionality out-of-the-box, which can be configured with several parameters. Applying this algorithm to the grayscale image produces the following result. 143 | 144 | ![Binarized search region](./docs/binarized.png) 145 | 146 | The current result is already considerably positive. Yet, notice that there are some small "patches" scattered throughout the image. To make it easier for the detection algorithm and for the arrow classifier, we can use the morphology operations provided by the [skimage library](https://scikit-image.org/) to mitigate this noise. The next figure shows the resulting image. 147 | 148 | ![Search region with reduced noise](./docs/reduced_noise.png) 149 | 150 | The image is now ready to be analyzed. Next, we will discuss the details of the algorithm that determines the position of the arrow. 151 | 152 | #### Locating the arrow 153 | The first step of the process consists in computing the contours of the image. To do this, an algorithm is used to transform the edges in the image into sets of points, forming many polygons. See the figure below. 154 | 155 | ![Contours of the search region](./docs/contours.png) 156 | 157 | Each polygon constitutes a different contour. But then, how to determine which contours belong to the arrow? It turns out that directly identifying the arrow is not the best alternative. Frequently, the obfuscation heavily distorts the outline of the arrow, which makes identifying it very difficult. 158 | 159 | Fortunately, some outlines are much simpler to identify: the circles that surround the arrow. There are two reasons for this. 
First, these contours tend to keep their shape intact despite the obfuscation. Second, circumferences have much clearer morphological characteristics. 160 | 161 | Still, another question remains. How to identify these circles then? There are *several* ways to solve this problem. One of the first alternatives that may come to mind is the [Circle Hough Transform](https://en.wikipedia.org/wiki/Circle_Hough_Transform). Unfortunately, this method is known to be unstable when applied to noisy images. Preliminary tests confirmed that this method produces unreliable results for our input images. Because of this, I decided to use morphological analysis instead. 162 | 163 | ##### Algorithm 164 | Our goal is to determine which contours correspond to the surrounding circles and then calculate the position of the arrows, which sit in the middle of the circumferences. To do this, we can use the following algorithm. 165 | 166 | 1. First, the program removes all contours whose area is too small or too large compared to a real surrounding circle. 167 | 168 | 2. Then, a score is calculated for each remaining contour. This score is based on the similarity between the contour, a perfect ellipse, and a perfect circle. 169 | 170 | 3. Next, the contour with the best score is selected. 171 | 172 | 4. Finally, if the score of the selected contour exceeds a certain threshold, the algorithm outputs its center. Otherwise, it outputs the center of the search area as a fallback, which makes the algorithm more robust. 173 | 174 | The code snippet below illustrates the process. 175 | 176 | ```py 177 | candidates = [c for c in contours if area(c) > MIN and area(c) < MAX] 178 | 179 | best_candidate = max(candidates, key=lambda c: score(c)) 180 | 181 | if score(best_candidate) > THRESHOLD: 182 | return center(best_candidate) 183 | 184 | return center(search_region) 185 | ``` 186 | 187 | ##### Similarity score 188 | The key to the accuracy of the operation above is the quality of the score. Its principle is simple: the surrounding circles tend to resemble a perfect circumference more than other objects do. With that in mind, it is possible to measure the likeliness that a contour is a surrounding circle using the following metrics. 189 | 190 | 1. The difference between the convex hull of the contour and its minimum enclosing ellipse. Considering that the area of the ellipse is larger: 191 | ``` 192 | s1 = (area(ellipse) - area(hull)) / area(ellipse) 193 | ``` 194 | 195 | 2. The difference between the minimum enclosing ellipse and its minimum enclosing circle. Considering that the area of the circle is larger: 196 | ``` 197 | s2 = (area(circle) - area(ellipse)) / area(circle) 198 | ``` 199 | 200 | The resulting score is one minus the arithmetic mean of these two metrics. 201 | ``` 202 | score = 1 - (s1 + s2) / 2 203 | ``` 204 | 205 | By doing so, the closer a contour is to a perfect circle, the closer the score will be to one. The more distinct, the closer it will get to zero. 206 | 207 | ##### Iteration output 208 | The figure below shows the filtered candidates (green), the best contour among them (blue), and the coordinate returned by one iteration of the algorithm (red). 209 | 210 | ![Arrow center](./docs/selected_contours.png) 211 | 212 | Of course, there are errors associated with this algorithm. Sometimes, the operation yields false positives due to the obfuscation, causing the center of the arrow to be defined incorrectly. 
In some cases, these errors are significant enough to prevent the arrow from being in the output image, as we will see next. Nevertheless, these cases are rather uncommon and have a relatively small impact on the accuracy of the model. 213 | 214 | Finally, the coordinate returned by the algorithm is then used to define the next search region, as shown below. After that, the algorithm repeats until it processes all four arrows. 215 | 216 |

217 | `(x_search, y_search) = (x_center + d, y_center)` 218 |
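To tie the pieces together, the sketch below outlines one possible implementation of a single iteration: score the contours of a binarized search region, pick the most circle-like one, and fall back to the region center when nothing is convincing. The thresholds are assumptions, and `cv2.fitEllipse` is used merely as a stand-in for the minimum enclosing ellipse described above; the actual logic lives in [`preprocess.py`](./preprocessing/preprocess.py).

```py
import math
import cv2

# Illustrative thresholds; the real values are tuned in preprocess.py.
MIN_AREA, MAX_AREA = 300, 3000
SCORE_THRESHOLD = 0.8

def circle_score(contour):
    """Return a score close to 1 when the contour resembles a surrounding circle."""
    hull_area = cv2.contourArea(cv2.convexHull(contour))
    (_, (width, height), _) = cv2.fitEllipse(contour)   # fitted ellipse as an approximation
    ellipse_area = math.pi * (width / 2) * (height / 2)
    (_, radius) = cv2.minEnclosingCircle(contour)
    circle_area = math.pi * radius ** 2
    s1 = (ellipse_area - hull_area) / ellipse_area
    s2 = (circle_area - ellipse_area) / circle_area
    return 1 - (s1 + s2) / 2

def locate_arrow(binary_region):
    """Return the (x, y) coordinate of the arrow within one binarized search region."""
    contours, _ = cv2.findContours(binary_region, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x
    candidates = [c for c in contours
                  if len(c) >= 5 and MIN_AREA < cv2.contourArea(c) < MAX_AREA]
    if candidates:
        best = max(candidates, key=circle_score)
        if circle_score(best) > SCORE_THRESHOLD:
            moments = cv2.moments(best)
            return (int(moments['m10'] / moments['m00']),
                    int(moments['m01'] / moments['m00']))
    # Fallback: the center of the search region keeps the pipeline moving.
    height, width = binary_region.shape
    return width // 2, height // 2
```

The returned coordinate, converted back to screenshot space and shifted by `d`, then becomes the center of the next search region.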

219 | 220 | #### Generating the output 221 | After finding the positions of the arrows, the program should generate an output. To do so, it first crops a region around the center of each arrow. The dimension of the cropped area is 60×60, which is just enough to fit an arrow. See the results in the next figure. 222 | 223 | ![Preprocessing output](./docs/output.png) 224 | 225 | But, as mentioned earlier, we can augment the output by rotating and flipping the arrows. By doing this, we both multiply the sample size by **eight** and eliminate class imbalance, which is extremely helpful. The next figure shows the 32 generated samples. 226 | 227 | ![Augmented preprocessing output](./docs/augmented_output.png) 228 | 229 | #### Application interface 230 | Before discussing the experimental results, let's talk about the preprocessing application interface. When the user runs the script, it displays the following window for each screenshot in the `./data/labeled/` folder. 231 | 232 | ``` 233 | $ python preprocessing/preprocess.py 234 | ``` 235 | 236 | ![Preprocess script window](./docs/preprocessor_window.png) 237 | 238 | For each one of them, the user is given the option to either process or skip the screenshot. If the user chooses to process it, the script produces output samples and places them inside the `./data/samples/` folder. 239 | 240 | The user should skip a screenshot when it is impossible to determine the direction of an arrow. In other words, the screenshot should be skipped when the arrow is completely corrupted, or when the algorithm misses its location. See a couple of examples. 241 | 242 | ![Preprocessing errors](./docs/errors.png) 243 | 244 | #### Results 245 | Using 800 labeled screenshots as input and following the guidelines mentioned in the previous section, the following results were obtained. 246 | 247 | ``` 248 | Approved 798 out of 800 images (99%). 249 | ``` 250 | **Samples summary** 251 | | | Down | Left | Right | Up | Total | 252 | |------------|------:|------:|-------:|------:|------:| 253 | | **Round** | 1872 | 1872 | 1872 | 1872 | 7488 | 254 | | **Wide** | 2632 | 2632 | 2632 | 2632 | 10528 | 255 | | **Narrow** | 1880 | 1880 | 1880 | 1880 | 7520 | 256 | | **Total** | 6384 | 6384 | 6384 | 6384 | 25536 | 257 | 258 | In other words, the preprocessing stage had an accuracy of 99.75%. Plus, there are now 25,536 arrow samples that we can use to create the dataset that will be used by the classification model. Great! 259 | 260 | ### Dataset 261 | In this step of the pipeline, the objective is to split the generated samples into training, validation, and testing sets. Each set has its own folder inside the `./data/` directory. The produced dataset will then be used to train the arrow classification model. Note that the quality of the dataset has a direct impact on the performance of the model. Thus, it is essential to perform this step correctly. 262 | 263 | Let's start by defining and understanding the purpose of each set. 264 | 265 | #### Training set 266 | As the name suggests, this is the dataset that will be directly used to fit the model in the training process. 267 | 268 | #### Validation set 269 | This is the dataset used to evaluate the generalization of the model and to optimize any hyperparameters the classifier may use. 270 | 271 | #### Testing set 272 | This is the set used to determine the final performance of the model. 273 | 274 | With these definitions in place, let's address some issues. 
One of the main obstacles that classification models may face is data imbalance. When the training set is asymmetric, the model may not generalize well, causing it to perform poorly on the underrepresented classes. Therefore, the training set should have an (approximately) equal number of samples from each shape and direction. Additionally, it is necessary to define the proportion of samples in each set. An (80%, 10%, 10%) division is usually a good starting point. 275 | 276 | The [`make_dataset.py`](./operations/make_dataset.py) script is responsible for creating a dataset that meets all the criteria above. 277 | 278 | > Note: to avoid data leakage, the script also ensures that each sample and its flipped counterpart are always placed in the same set. 279 | 280 | Running the script with the ratio set to `0.9` produces the following results. 281 | 282 | ``` 283 | $ python operations/make_dataset.py -r 0.9 284 | ``` 285 | **Training set** 286 | | | Down | Left | Right | Up | Total | 287 | |------------|-----:|-----:|------:|-----:|------:| 288 | | **Round** | 1684 | 1684 | 1684 | 1684 | 6736 | 289 | | **Wide** | 1684 | 1684 | 1684 | 1684 | 6736 | 290 | | **Narrow** | 1684 | 1684 | 1684 | 1684 | 6736 | 291 | | **Total** | 5052 | 5052 | 5052 | 5052 | 20208 | 292 | 293 | **Validation set** 294 | | | Down | Left | Right | Up | Total | 295 | |------------|-----:|-----:|------:|----:|------:| 296 | | **Round** | 94 | 94 | 94 | 94 | 376 | 297 | | **Wide** | 474 | 474 | 474 | 474 | 1896 | 298 | | **Narrow** | 98 | 98 | 98 | 98 | 392 | 299 | | **Total** | 666 | 666 | 666 | 666 | 2664 | 300 | 301 | **Testing set** 302 | | | Down | Left | Right | Up | Total | 303 | |------------|-----:|-----:|------:|----:|------:| 304 | | **Round** | 94 | 94 | 94 | 94 | 376 | 305 | | **Wide** | 474 | 474 | 474 | 474 | 1896 | 306 | | **Narrow** | 98 | 98 | 98 | 98 | 392 | 307 | | **Total** | 666 | 666 | 666 | 666 | 2664 | 308 | 309 | Notice that the training set is perfectly balanced on both axes. You can also see that the split between the sets approximately follows the (80%, 10%, 10%) proportion. 310 | 311 | Now that the dataset is ready, we can move on to the main stage: the construction of the classification model. 312 | 313 | ### Classification model 314 | In this stage, the dataset created in the previous step is used to adjust and train a machine learning model that classifies arrows into four directions when given an input image. The trained classifier can then be applied in the runtime pipeline to determine the arrow directions within the game. 315 | 316 | ![Model simplification](./docs/model.png) 317 | 318 | To quickly validate the viability of the idea, we can use [Google's Teachable Machine](https://teachablemachine.withgoogle.com/train/image), which allows any user to train a neural network image classifier. An experiment using only **1200** arrow samples results in a model with an overall accuracy of **90%**. It is quite clear that a classification model based on a neural network is feasible. 319 | 320 | In fact, we could export the model trained by Google's tool and call it a day. However, since a rune is only solved when all four of its arrows are classified correctly, the accuracy per rune would be only about 66% (see the quick calculation below). 321 | 322 | Fortunately, there is plenty of room for improvement. For instance, this model transforms our 60×60×1 input into 224×224×3, making the process much more complex than it needs to be. Additionally, Google's tool lacks many configuration options. 323 | 324 | With this in mind, the best alternative is to create a model from scratch.
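For reference, the per-rune figure quoted above simply raises the per-arrow accuracy to the fourth power, since all four classifications must be correct at once:

```
P(rune solved) = P(arrow correct)^4 ≈ 0.90^4 ≈ 0.66
```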
325 | 326 | ### Building the model 327 | Our model will use a convolutional neural network (CNN), which is a type of neural network widely used for image classification. 328 | 329 | We will use Keras to create and train the model. Take a look at the code that generates the structure of the neural network. 330 | 331 | ```py 332 | def make_model(): 333 | model = Sequential() 334 | 335 | # Convolution block 1 336 | model.add(Conv2D(32, (3, 3), padding='same', input_shape=(60, 60, 1))) 337 | model.add(Activation('relu')) 338 | model.add(MaxPooling2D(pool_size=(2, 2))) 339 | 340 | # Convolution block 2 341 | model.add(Conv2D(48, (3, 3), padding='same')) 342 | model.add(Activation('relu')) 343 | model.add(MaxPooling2D(pool_size=(2, 2))) 344 | 345 | # Convolution block 3 346 | model.add(Conv2D(64, (3, 3), padding='same')) 347 | model.add(Activation('relu')) 348 | model.add(MaxPooling2D(pool_size=(2, 2))) 349 | 350 | model.add(Flatten()) 351 | 352 | model.add(Dense(64, activation='relu')) 353 | model.add(Dropout(0.2)) 354 | 355 | # Output layer 356 | model.add(Dense(4, activation='softmax')) 357 | 358 | model.compile(optimizer='adam', 359 | loss='categorical_crossentropy', 360 | metrics=['accuracy']) 361 | 362 | return model 363 | ``` 364 | 365 | The architecture was inspired by one of the best performing MNIST classification models on [Kaggle](https://www.kaggle.com/). While simple, it should be more than enough for our purposes. 366 | 367 | ### More data augmentation 368 | In addition to rotation and flipping, as we saw earlier, we can apply other transformations to the arrow samples. We should do this to help reducing overfitting, especially regarding the position and size of the arrows. 369 | 370 | Keras provides this and many other functionalities out-of-the-box. Using an image data generator, we can automatically augment the data from both training and validation sets during the fitting process. See the code snippet below, which was taken from the training script. 371 | 372 | ```py 373 | aug = ImageDataGenerator(width_shift_range=0.125, height_shift_range=0.125, zoom_range=0.2) 374 | ``` 375 | 376 | ### Training the model 377 | With that set, we can fit the model to the data with the [`train.py`](./model/train.py) script. Based on a few input parameters, the script creates the neural network, performs data augmentation, fits the model to the data, and saves the trained model to a file. Moreover, to improve the performance of the training process, the program applies mini-batch gradient descent and early stopping. 378 | 379 | Let's train a model named `binarized_model128.h5` with the batch size set to `128`. 380 | ``` 381 | $ python model/train.py -m binarized_model128.h5 -b 128 382 | 383 | ... 384 | 385 | Settings 386 | value 387 | max_epochs 240 388 | patience 80 389 | batch_size 128 390 | 391 | Creating model... 392 | 393 | Creating generators... 394 | Found 20208 images belonging to 4 classes. 395 | Found 2664 images belonging to 4 classes. 396 | 397 | Fitting model... 398 | 399 | ... 400 | 401 | Epoch 228/240 402 | - 28s - loss: 0.0095 - accuracy: 0.9974 - val_loss: 0.0000e+00 - val_accuracy: 0.9945 403 | 404 | ... 405 | 406 | Best epoch: 228 407 | 408 | Saving model... 409 | Model saved to ./model/binarized_model128.h5 410 | 411 | Finished! 412 | ``` 413 | 414 | As we can see, the model had a **99.45%** accuracy in the best epoch for the validation set. The next figure shows the accuracy and loss charts for this training session. 
415 | 416 | ![Evolution per epoch charts](./docs/charts.png) 417 | 418 | ## Results and Observations 419 | Now that our model has been trained and validated, it is time to evaluate its performance using the testing set. In the sections below, we will present some metrics calculated for the classifier and offer some considerations about them. 420 | 421 | > Note: you can access the complete results [here](./results). 422 | 423 | ### Performance 424 | To calculate the performance of the model on the testing set, we can run the [`classify.py`](./model/classify.py) script with the following parameters. 425 | ``` 426 | $ python model/classify.py -m binarized_model128.h5 -d testing 427 | ``` 428 | **Confusion matrix** 429 | | | Down | Left | Right | Up | 430 | |-----------|-----:|-----:|------:|----:| 431 | | **Down** | 664 | 0 | 2 | 0 | 432 | | **Left** | 0 | 665 | 0 | 1 | 433 | | **Right** | 0 | 0 | 666 | 0 | 434 | | **Up** | 0 | 0 | 1 | 665 | 435 | 436 | **Classification summary** 437 | | | Precision | Recall | F1 | 438 | |-----------|----------:|-------:|-------:| 439 | | **Down** | 1.0000 | 0.9970 | 0.9985 | 440 | | **Left** | 1.0000 | 0.9985 | 0.9992 | 441 | | **Right** | 0.9955 | 1.0000 | 0.9977 | 442 | | **Up** | 0.9985 | 0.9985 | 0.9985 | 443 | 444 | **Accuracy by arrow type** 445 | | | Correct | Incorrect | Accuracy | 446 | |------------|--------:|----------:|---------:| 447 | | **Round** | 376 | 0 | 1.0000 | 448 | | **Wide** | 1892 | 4 | 0.9979 | 449 | | **Narrow** | 392 | 0 | 1.0000 | 450 | | **Total** | 2660 | 4 | 0.9985 | 451 | 452 | Notice the F1 scores and the accuracy per arrow type. Both are balanced, which means that the model is not biased toward any shape or direction. 453 | 454 | The results are remarkable. The model was able to correctly classify **99.85%** of the arrows! Now, let's calculate the overall accuracy of the pipeline. 455 | 456 | | Preprocessing | Four arrows | Total | 457 | |------------------|------------------|----------| 458 | | 0.9975 | 0.9985⁴ ≈ 0.9940 | 0.9915 | 459 | 460 | In other words, the pipeline is expected to solve a rune **99.15%** of the time. That is roughly equivalent to a single mistake for every hundred runes. Amazing! 461 | 462 | > Note: similar results may be accomplished with far fewer than 800 screenshots. 463 | 464 | ### Observations 465 | Before we wrap up, it is important to make some observations about the development process and gain some insight into the results. 466 | 467 | #### The bottleneck 468 | We can see which arrows the model was unable to classify by applying the `-v` flag to the classification script. Analyzing the results, we see that most of them were almost off-screen and a few of them were heavily distorted. Thus, we conclude that the main bottleneck of the pipeline lies in the preprocessing stage. 469 | 470 | If improvements are to be made, they should focus on the algorithms responsible for transforming and locating the arrows. As mentioned earlier, one alternative to the algorithm that finds the position of the arrows is to use machine learning. This could increase the accuracy of the pipeline, especially when dealing with 'hard-to-see' images. Another possible improvement is to optimize the parameters of the preprocessing stage for runes that are very hard to solve. 471 | 472 | It is also possible to improve the performance of the model in the **runtime pipeline**. For instance, the software may take three screenshots of the rune at different moments and classify all of them.
After that, the directions of the arrows can be determined by combining the three classifications. This may reduce the error rate caused by specific frames, such as when damage numbers stay behind the arrows. 473 | 474 | #### Errors and biases 475 | For the sake of integrity, we must understand some of the underlying errors and biases behind the result we just saw. The *real* performance of the model can only be measured in practice. 476 | 477 | For instance, the input screenshots are biased toward the maps and situations in which they were taken, even though the images are significantly diverse. Moreover, remember that some of the screenshots were filtered during the preprocessing stage. The human interpretation of whether or not an arrow is corrupted may influence the resulting metrics. Also, when a screenshot is skipped, it is assumed that the model would not be able to solve the rune, which is not always the case. There is still a probability that the model correctly classifies an arrow 'by chance' (25% if you consider it a random event). 478 | 479 | #### Further testing 480 | In a small test with 20 new screenshots, the model was able to solve all the runes. However, in another experiment using exclusively screenshots with lots of action, *especially ones with damage numbers flying around*, the model was able to solve 16 out of 20 runes (80% of the runes and 95% of the arrows). 481 | 482 | In reality, the performance of the model varies according to many factors, such as the player's class, the spawn location of the rune, and the monsters nearby. Nonetheless, the *overall* performance is very good for the purposes of the model. Additionally, we can still apply the improvements discussed in [the bottleneck section](#the-bottleneck). 483 | 484 | ## Final Considerations 485 | With a moderate amount of work, we created a model capable of solving runes up to **99.15%** of the time. We have also seen some ways to improve this result even further. But what do these results tell us? 486 | 487 | ### Conclusion 488 | First of all, we proved that current artificial intelligence technology can easily bypass the employed visual obfuscation, reaching human-level performance. We also know that these obfuscation methods bring significant usability drawbacks since there have been multiple complaints from the player base, including colorblind people. 489 | 490 | Moreover, while the current system forced us to develop a new model from scratch, one would expect it to be more elaborate and expensive. Also, even if a perfect model existed, bot developers would still have to create an algorithm or a hack to control the character and move it to the rune. This is not only a whole different problem but possibly a more difficult one, which already provides a notable level of protection. This makes us wonder whether the part of the rune system update that introduced the visual obfuscation had a significant impact on the botting ecosystem. 491 | 492 | Additionally, "commercial-level" botting tools have been immune to visual tricks since runes first appeared. As we will see, they do not need to process any image. 493 | 494 | Therefore, we are left with two issues. First, the current rune system hurts user interaction without providing meaningful security benefits. Second, it is somewhat ineffective toward some of the botting software available out there. 495 | 496 | ### How to improve it? 
497 | The solution to the first problem is straightforward: reduce or remove the semi-transparency effects and use colorblind-friendly colors only. 498 | 499 | The second issue, on the other hand, is trickier. First, there is software capable of bypassing the anti-botting system internally. Second, recent advancements in AI technology have been closing the gap between humans and machines. Even though it is simple to patch the game visuals and prevent this project from functioning, it is just as easy to make it work again. It is no coincidence that traditional CAPTCHAs have disappeared from the web in the last years. **The best alternative seems to be to move away from this type of CAPTCHA**. 500 | 501 | Despite that, there are still some opportunities that can be explored within this type of anti-botting system. Let's see some of them. 502 | 503 | #### Overall improvements 504 | The changes proposed next may enhance both the quality and security of the game, regardless of the CAPTCHA system used. 505 | 506 | ##### Dynamic difficulty 507 | An anti-botting system should keep a healthy balance between user-friendliness and protection against malicious software. To do so, the system could reward successes while penalizing failures. For example, if a user correctly answers many runes, the next ones could contain fewer arrows. On the other hand, the game could propose a harder challenge if someone (or some program) makes many mistakes in a row. 508 | 509 | In addition to this, the delays between attempts could be adjusted. Currently, no matter how many times a user fails to solve a rune, the delay between activations is only five seconds. A better alternative is to increase the amount of time exponentially after a certain amount of errors. This measure impairs the performance of inaccurate models that need several attempts to solve a rune. As a side-effect, it is also harder for an attacker to collect data from the game. 510 | 511 | ##### Better robustness 512 | Currently, the game client is responsible for both generating and validating rune challenges, which allows hackers to perform code/packet injection and forge rune validation. Because of this, botting tools can bypass this protection system without processing the game screen, as we have mentioned previously. 513 | 514 | While it is considerably difficult to accomplish such a feat, which involves bypassing the protection mechanisms of the game, this vulnerability is enough to defeat the purpose of the runes to a significant extent. 515 | 516 | One way to solve this issue is to move both generation and validation processes to the server. The game server would send the generated image to the client and validate the response of the client. The client, on the other hand, would simply display the received image and forward the user's input. By doing so, the only possible way to bypass this mechanism would be through image processing. Although it was reasonably simple to circumvent the current rune system through image processing, it is not nearly as trivial with other CAPTCHAs. 517 | 518 | #### Alternative systems 519 | While the changes above could improve the overall quality of the rune system, they alone are not capable of solving its main vulnerabilities. For this reason, the following alternatives were proposed based on the resources readily available within the game and the capabilities of modern image classifiers. 
520 | 521 | #### Finding image matches 522 | ![Finding image matches](./docs/find_matches.png) 523 | 524 | In this system, the user is asked to find all the monsters, NPCs, or characters that match the specified labels. 525 | 526 | #### Finding non-rotated images 527 | ![Finding non-rotated images](./docs/find_rotated.png) 528 | 529 | In this system, the user is asked to find all the monsters, NPCs, or characters that are not rotated. 530 | 531 | Both systems have their advantages and disadvantages compared to the current one. Let's examine them more closely. 532 | 533 | #### Advantages 534 | MapleStory already features thousands of different sprites, which may be rotated, scaled, flipped, and placed on an obfuscated background, resulting in **hundreds of thousands** of different combinations. Thus, when coupled with the [better robustness](#better-robustness) improvements, the proposed systems could force botting software to become extremely expensive, inaccurate, and possibly impractical. 535 | 536 | In addition, the first system has an extra advantage: attackers would also have to discover all of the labels that may be specified by the rune system, increasing the cost of developing an automated tool even further. 537 | 538 | #### Disadvantages 539 | Although they may provide benefits, the proposed systems do not come without caveats. Both mechanisms are *considerably more expensive* to implement, especially the first one. Also, they are *not as universal* as the current runes, which only require the user to identify arrow directions. Despite this, the latter issue can be mitigated by employing dynamic difficulty. 540 | 541 | --- 542 | 543 | That concludes the article. I hope you have learned something new. Feel free to open an issue if you have any questions; I will be more than happy to help. 544 | --------------------------------------------------------------------------------