├── .gitignore ├── README.md └── michi.py /.gitignore: -------------------------------------------------------------------------------- 1 | patterns.* 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Michi --- Minimalistic Go MCTS Engine 2 | ===================================== 3 | 4 | Michi aims to be a minimalistic but full-fledged Computer Go program based 5 | on state-of-the-art methods (Monte Carlo Tree Search) and written in Python. 6 | Our goal is to make it easier for new people to enter the domain of 7 | Computer Go, peek under the hood of a "real" playing engine and be able 8 | to learn by hassle-free experiments - tweak the algorithms, add heuristics, 9 | etc. 10 | 11 | The algorithm code size is 540 lines of code (without user interface, tables 12 | and empty lines / comments). Currently, it can often win against GNUGo 13 | on 9×9 on an old i3 notebook, be about even with GNUGo on 15×15 on a modern 14 | higher end computer and about two stones weaker on 19×19 (spending no more 15 | than 30s per move). 16 | 17 | This is not meant to be a competitive engine; simplicity and clear code are 18 | preferred over optimization (after all, it's in Python!). But compared to 19 | other minimalistic engines, this one should be able to beat beginner to 20 | intermediate human players, and I believe that a *fast* implementation 21 | of exactly the same heuristics would be around 4k KGS or even better. 22 | 23 | Michi is distributed under the MIT licence. Now go forth, hack and peruse! 24 | 25 | Usage 26 | ----- 27 | 28 | If you want to try it out, just start the script. 
You can also pass the 29 | gtp argument and start it in gogui, or let it play GNUGo: 30 | 31 | gogui/bin/gogui-twogtp -black './michi.py gtp' -white 'gnugo --mode=gtp --chinese-rules --capture-all-dead' -size 9 -komi 7.5 -verbose -auto 32 | 33 | It is *highly* recommended that you download Michi's large-scale pattern files 34 | (patterns.prob, patterns.spat): 35 | 36 | http://pachi.or.cz/michi-pat/ 37 | 38 | Store and unpack them in the current directory for Michi to find. 39 | 40 | Understanding and Hacking 41 | ------------------------- 42 | 43 | Note that while all strong Computer Go programs currently use the MCTS 44 | algorithm, there are many variants, differing particularly 45 | regarding the playout policy mechanics and the way tree node priors 46 | are constructed and incorporated (this is not about individual heuristics 47 | but the way they are integrated). Michi uses the MCTS flavor used in 48 | Pachi and Fuego, but e.g. Zen and CrazyStone take quite a different 49 | approach to this. For a general introduction to the Michi-style MCTS algorithm, 50 | see Petr Baudis' Master Thesis http://pasky.or.cz/go/prace.pdf, esp. 51 | Sec. 2.1 to 2.3 and Sec. 3.3 to 3.4. 52 | 53 | The etymology of Michi is "Minimalistic Pachi". If you would like 54 | to try your hand at hacking a competitive Computer Go engine, try Pachi! :-) 55 | Michi has been inspired by Sunfish - a minimalistic chess engine. Sadly 56 | (or happily?), for computers Go is a lot more complicated than chess, even 57 | if you want to just implement the rules. 58 | 59 | We would like to encourage you to experiment with the heuristics and try 60 | to add more. 
But please realize that if some heuristic seems to work well, 61 | you should verify how it works in a more competitive engine (in conjunction 62 | with other heuristics and many more playouts) and play-test it on at least 63 | a few hundred games with a reference opponent (not just the program without 64 | the heuristic; self-play testing greatly exaggerates any improvements). 65 | 66 | TODO 67 | ---- 68 | 69 | Strong Computer Go programs tend to accumulate many specialized, 70 | sophisticated, individually low-yield heuristics. These are mostly 71 | out of scope of Michi in order to keep things short and simple. 72 | However, other than that, there are certainly things that Michi should 73 | or could contain, roughly in order of priority: 74 | 75 | * Superko support. 76 | * Support for early passing and GTP stone status protocol. 77 | * gogui visualization support. 78 | * Group/liberty tracking in the board position implementation. 79 | 80 | If you would like to increase the strength of the program, the lowest 81 | hanging fruit is likely: 82 | 83 | * Tune parameters using Rémi Coulom's CLOP. 84 | * Simple time management. (See the Pachi paper.) 85 | * Pondering (search game tree during opponent's move) support. 86 | * Make it faster - either by optimizations (see group tracking above) 87 | or 1:1 rewrite in a faster language. 88 | * Two/three liberty semeai reading in playouts. (See also CFG patterns.) 89 | * Tsumego improvements - allow single-stone selfatari only for throwins 90 | and detect nakade shapes. 91 | * Try true probability distribution playouts + Rémi Coulom's MM patterns. 92 | 93 | (Most of the mistakes Michi makes are caused by the awfully low number of 94 | playouts; I believe that with 20× speedup, which seems very realistic, the same 95 | algorithm could easily get to KGS 4k on 19×19. 
One of the things I would hope 96 | to inspire is a rewrite of the same algorithm in different, faster programming 97 | languages; hopefully seeing a finished real-world thing is easier than developing 98 | it from scratch. What about a Go engine in the Go language?) 99 | 100 | **michi-c** is such a rewrite of Michi, in plain C. It seems to play even with 101 | GNUGo when given 3.3s/move: https://github.com/db3108/michi-c2. 102 | 103 | Note: there is a clone version of the michi python code (slower than michi-c2) 104 | that is available at https://github.com/db3108/michi-c. 105 | This simpler version can be read in parallel with the michi.py python code. 106 | 107 | **michi-go** is a rewrite of Michi in the Go language: 108 | https://github.com/traveller42/michi-go 109 | -------------------------------------------------------------------------------- /michi.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env pypy 2 | # -*- coding: utf-8 -*- 3 | # 4 | # (c) Petr Baudis 2015 5 | # MIT licence (i.e. almost public domain) 6 | # 7 | # A minimalistic Go-playing engine attempting to strike a balance between 8 | # brevity, educational value and strength. It can beat GNUGo on a 13x13 board 9 | # on a modest 4-thread laptop. 10 | # 11 | # When benchmarking, note that at the beginning of the first move the program 12 | # runs much slower because pypy is JIT compiling in the background! 13 | # 14 | # To start reading the code, begin either: 15 | # * Bottom up, by looking at the goban implementation - starting with 16 | # the 'empty' definition below and Position.move() method. 17 | # * In the middle, by looking at the Monte Carlo playout implementation, 18 | # starting with the mcplayout() function. 19 | # * Top down, by looking at the MCTS implementation, starting with the 20 | # tree_search() function. 
It can look a little confusing due to the 21 | # parallelization, but really is just a loop of tree_descend(), 22 | # mcplayout() and tree_update() round and round. 23 | # It may be better to jump around a bit instead of just reading straight 24 | # from start to end. 25 | 26 | from __future__ import print_function 27 | from collections import namedtuple 28 | from itertools import count 29 | import math 30 | import multiprocessing 31 | from multiprocessing.pool import Pool 32 | import random 33 | import re 34 | import sys 35 | import time 36 | 37 | 38 | # Given a board of size NxN (N=9, 19, ...), we represent the position 39 | # as an (N+1)*(N+2) string, with '.' (empty), 'X' (to-play player), 40 | # 'x' (other player), and whitespace (off-board border to make rules 41 | # implementation easier). Coordinates are just indices in this string. 42 | # You can simply print(board) when debugging. 43 | N = 13 44 | W = N + 2 45 | empty = "\n".join([(N+1)*' '] + N*[' '+N*'.'] + [(N+2)*' ']) 46 | colstr = 'ABCDEFGHJKLMNOPQRST' 47 | MAX_GAME_LEN = N * N * 3 48 | 49 | N_SIMS = 1400 50 | RAVE_EQUIV = 3500 51 | EXPAND_VISITS = 8 52 | PRIOR_EVEN = 10 # should be even number; 0.5 prior 53 | PRIOR_SELFATARI = 10 # negative prior 54 | PRIOR_CAPTURE_ONE = 15 55 | PRIOR_CAPTURE_MANY = 30 56 | PRIOR_PAT3 = 10 57 | PRIOR_LARGEPATTERN = 100 # most moves have relatively small probability 58 | PRIOR_CFG = [24, 22, 8] # priors for moves in cfg dist. 
1, 2, 3 59 | PRIOR_EMPTYAREA = 10 60 | REPORT_PERIOD = 200 61 | PROB_HEURISTIC = {'capture': 0.9, 'pat3': 0.95} # probability of heuristic suggestions being taken in playout 62 | PROB_SSAREJECT = 0.9 # probability of rejecting suggested self-atari in playout 63 | PROB_RSAREJECT = 0.5 # probability of rejecting random self-atari in playout; this is lower than above to allow nakade 64 | RESIGN_THRES = 0.2 65 | FASTPLAY20_THRES = 0.8 # if at 20% playouts winrate is >this, stop reading 66 | FASTPLAY5_THRES = 0.95 # if at 5% playouts winrate is >this, stop reading 67 | 68 | pat3src = [ # 3x3 playout patterns; X,O are colors, x,o are their inverses 69 | ["XOX", # hane pattern - enclosing hane 70 | "...", 71 | "???"], 72 | ["XO.", # hane pattern - non-cutting hane 73 | "...", 74 | "?.?"], 75 | ["XO?", # hane pattern - magari 76 | "X..", 77 | "x.?"], 78 | # ["XOO", # hane pattern - thin hane 79 | # "...", 80 | # "?.?", "X", - only for the X player 81 | [".O.", # generic pattern - katatsuke or diagonal attachment; similar to magari 82 | "X..", 83 | "..."], 84 | ["XO?", # cut1 pattern (kiri) - unprotected cut 85 | "O.o", 86 | "?o?"], 87 | ["XO?", # cut1 pattern (kiri) - peeped cut 88 | "O.X", 89 | "???"], 90 | ["?X?", # cut2 pattern (de) 91 | "O.O", 92 | "ooo"], 93 | ["OX?", # cut keima 94 | "o.O", 95 | "???"], 96 | ["X.?", # side pattern - chase 97 | "O.?", 98 | "   "], 99 | ["OX?", # side pattern - block side cut 100 | "X.O", 101 | "   "], 102 | ["?X?", # side pattern - block side connection 103 | "x.O", 104 | "   "], 105 | ["?XO", # side pattern - sagari 106 | "x.x", 107 | "   "], 108 | ["?OX", # side pattern - cut 109 | "X.O", 110 | "   "], 111 | ] 112 | 113 | pat_gridcular_seq = [ # Sequence of coordinate offsets of progressively wider diameters in gridcular metric 114 | [[0,0], 115 | [0,1], [0,-1], [1,0], [-1,0], 116 | [1,1], [-1,1], [1,-1], [-1,-1], ], # d=1,2 is not considered separately 117 | [[0,2], [0,-2], [2,0], [-2,0], ], 118 | [[1,2], [-1,2], [1,-2], [-1,-2], [2,1], 
[-2,1], [2,-1], [-2,-1], ], 119 | [[0,3], [0,-3], [2,2], [-2,2], [2,-2], [-2,-2], [3,0], [-3,0], ], 120 | [[1,3], [-1,3], [1,-3], [-1,-3], [3,1], [-3,1], [3,-1], [-3,-1], ], 121 | [[0,4], [0,-4], [2,3], [-2,3], [2,-3], [-2,-3], [3,2], [-3,2], [3,-2], [-3,-2], [4,0], [-4,0], ], 122 | [[1,4], [-1,4], [1,-4], [-1,-4], [3,3], [-3,3], [3,-3], [-3,-3], [4,1], [-4,1], [4,-1], [-4,-1], ], 123 | [[0,5], [0,-5], [2,4], [-2,4], [2,-4], [-2,-4], [4,2], [-4,2], [4,-2], [-4,-2], [5,0], [-5,0], ], 124 | [[1,5], [-1,5], [1,-5], [-1,-5], [3,4], [-3,4], [3,-4], [-3,-4], [4,3], [-4,3], [4,-3], [-4,-3], [5,1], [-5,1], [5,-1], [-5,-1], ], 125 | [[0,6], [0,-6], [2,5], [-2,5], [2,-5], [-2,-5], [4,4], [-4,4], [4,-4], [-4,-4], [5,2], [-5,2], [5,-2], [-5,-2], [6,0], [-6,0], ], 126 | [[1,6], [-1,6], [1,-6], [-1,-6], [3,5], [-3,5], [3,-5], [-3,-5], [5,3], [-5,3], [5,-3], [-5,-3], [6,1], [-6,1], [6,-1], [-6,-1], ], 127 | [[0,7], [0,-7], [2,6], [-2,6], [2,-6], [-2,-6], [4,5], [-4,5], [4,-5], [-4,-5], [5,4], [-5,4], [5,-4], [-5,-4], [6,2], [-6,2], [6,-2], [-6,-2], [7,0], [-7,0], ], 128 | ] 129 | spat_patterndict_file = 'patterns.spat' 130 | large_patterns_file = 'patterns.prob' 131 | 132 | 133 | ####################### 134 | # board string routines 135 | 136 | def neighbors(c): 137 | """ generator of coordinates for all neighbors of c """ 138 | return [c-1, c+1, c-W, c+W] 139 | 140 | def diag_neighbors(c): 141 | """ generator of coordinates for all diagonal neighbors of c """ 142 | return [c-W-1, c-W+1, c+W-1, c+W+1] 143 | 144 | 145 | def board_put(board, c, p): 146 | return board[:c] + p + board[c+1:] 147 | 148 | 149 | def floodfill(board, c): 150 | """ replace continuous-color area starting at c with special color # """ 151 | # This is called so much that a bytearray is worthwhile... 
152 | byteboard = bytearray(board) 153 | p = byteboard[c] 154 | byteboard[c] = ord('#') 155 | fringe = [c] 156 | while fringe: 157 | c = fringe.pop() 158 | for d in neighbors(c): 159 | if byteboard[d] == p: 160 | byteboard[d] = ord('#') 161 | fringe.append(d) 162 | return str(byteboard) 163 | 164 | 165 | # Regex that matches various kinds of points adjacent to '#' (floodfilled) points 166 | contact_res = dict() 167 | for p in ['.', 'x', 'X']: 168 | rp = '\\.' if p == '.' else p 169 | contact_res_src = ['#' + rp, # p at right 170 | rp + '#', # p at left 171 | '#' + '.'*(W-1) + rp, # p below 172 | rp + '.'*(W-1) + '#'] # p above 173 | contact_res[p] = re.compile('|'.join(contact_res_src), flags=re.DOTALL) 174 | 175 | def contact(board, p): 176 | """ test if point of color p is adjacent to color # anywhere 177 | on the board; use in conjunction with floodfill for reachability """ 178 | m = contact_res[p].search(board) 179 | if not m: 180 | return None 181 | return m.start() if m.group(0)[0] == p else m.end() - 1 182 | 183 | 184 | def is_eyeish(board, c): 185 | """ test if c is inside a single-color diamond and return the diamond 186 | color or None; this could be an eye, but also a false one """ 187 | eyecolor = None 188 | for d in neighbors(c): 189 | if board[d].isspace(): 190 | continue 191 | if board[d] == '.': 192 | return None 193 | if eyecolor is None: 194 | eyecolor = board[d] 195 | othercolor = eyecolor.swapcase() 196 | elif board[d] == othercolor: 197 | return None 198 | return eyecolor 199 | 200 | def is_eye(board, c): 201 | """ test if c is an eye and return its color or None """ 202 | eyecolor = is_eyeish(board, c) 203 | if eyecolor is None: 204 | return None 205 | 206 | # Eye-like shape, but it could be a falsified eye 207 | falsecolor = eyecolor.swapcase() 208 | false_count = 0 209 | at_edge = False 210 | for d in diag_neighbors(c): 211 | if board[d].isspace(): 212 | at_edge = True 213 | elif board[d] == falsecolor: 214 | false_count += 1 215 | if 
at_edge: 216 | false_count += 1 217 | if false_count >= 2: 218 | return None 219 | 220 | return eyecolor 221 | 222 | 223 | class Position(namedtuple('Position', 'board cap n ko last last2 komi')): 224 | """ Implementation of simple Chinese Go rules; 225 | n is how many moves were played so far """ 226 | 227 | def move(self, c): 228 | """ play as player X at the given coord c, return the new position """ 229 | 230 | # Test for ko 231 | if c == self.ko: 232 | return None 233 | # Are we trying to play in enemy's eye? 234 | in_enemy_eye = is_eyeish(self.board, c) == 'x' 235 | 236 | board = board_put(self.board, c, 'X') 237 | # Test for captures, and track ko 238 | capX = self.cap[0] 239 | singlecaps = [] 240 | for d in neighbors(c): 241 | if board[d] != 'x': 242 | continue 243 | # XXX: The following is an extremely naive and SLOW approach 244 | # at things - to do it properly, we should maintain some per-group 245 | # data structures tracking liberties. 246 | fboard = floodfill(board, d) # get a board with the adjacent group replaced by '#' 247 | if contact(fboard, '.') is not None: 248 | continue # some liberties left 249 | # no liberties left for this group, remove the stones! 250 | capcount = fboard.count('#') 251 | if capcount == 1: 252 | singlecaps.append(d) 253 | capX += capcount 254 | board = fboard.replace('#', '.') # capture the group 255 | # Set ko 256 | ko = singlecaps[0] if in_enemy_eye and len(singlecaps) == 1 else None 257 | # Test for suicide 258 | if contact(floodfill(board, c), '.') is None: 259 | return None 260 | 261 | # Update the position and return 262 | return Position(board=board.swapcase(), cap=(self.cap[1], capX), 263 | n=self.n + 1, ko=ko, last=c, last2=self.last, komi=self.komi) 264 | 265 | def pass_move(self): 266 | """ pass - i.e. 
return simply a flipped position """ 267 | return Position(board=self.board.swapcase(), cap=(self.cap[1], self.cap[0]), 268 | n=self.n + 1, ko=None, last=None, last2=self.last, komi=self.komi) 269 | 270 | def moves(self, i0): 271 | """ Generate a list of moves (includes false positives - suicide moves; 272 | does not include true-eye-filling moves), starting from a given board 273 | index (that can be used for randomization) """ 274 | i = i0-1 275 | passes = 0 276 | while True: 277 | i = self.board.find('.', i+1) 278 | if passes > 0 and (i == -1 or i >= i0): 279 | break # we have looked through the whole board 280 | elif i == -1: 281 | i = 0 282 | passes += 1 283 | continue # go back and start from the beginning 284 | # Test for to-play player's one-point eye 285 | if is_eye(self.board, i) == 'X': 286 | continue 287 | yield i 288 | 289 | def last_moves_neighbors(self): 290 | """ generate a randomly shuffled list of points including and 291 | surrounding the last two moves (but with the last move having 292 | priority) """ 293 | clist = [] 294 | for c in self.last, self.last2: 295 | if c is None: continue 296 | dlist = [c] + list(neighbors(c) + diag_neighbors(c)) 297 | random.shuffle(dlist) 298 | clist += [d for d in dlist if d not in clist] 299 | return clist 300 | 301 | def score(self, owner_map=None): 302 | """ compute score for to-play player; this assumes a final position 303 | with all dead stones captured; if owner_map is passed, it is assumed 304 | to be an array of statistics with average owner at the end of the game 305 | (+1 black, -1 white) """ 306 | board = self.board 307 | i = 0 308 | while True: 309 | i = self.board.find('.', i+1) 310 | if i == -1: 311 | break 312 | fboard = floodfill(board, i) 313 | # fboard is board with some continuous area of empty space replaced by # 314 | touches_X = contact(fboard, 'X') is not None 315 | touches_x = contact(fboard, 'x') is not None 316 | if touches_X and not touches_x: 317 | board = fboard.replace('#', 'X') 318 
| elif touches_x and not touches_X: 319 | board = fboard.replace('#', 'x') 320 | else: 321 | board = fboard.replace('#', ':') # seki, rare 322 | # now that area is replaced either by X, x or : 323 | komi = self.komi if self.n % 2 == 1 else -self.komi 324 | if owner_map is not None: 325 | for c in range(W*W): 326 | n = 1 if board[c] == 'X' else -1 if board[c] == 'x' else 0 327 | owner_map[c] += n * (1 if self.n % 2 == 0 else -1) 328 | return board.count('X') - board.count('x') + komi 329 | 330 | 331 | def empty_position(): 332 | """ Return an initial board position """ 333 | return Position(board=empty, cap=(0, 0), n=0, ko=None, last=None, last2=None, komi=7.5) 334 | 335 | 336 | ############### 337 | # go heuristics 338 | 339 | def fix_atari(pos, c, singlept_ok=False, twolib_test=True, twolib_edgeonly=False): 340 | """ An atari/capture analysis routine that checks the group at c, 341 | determining whether (i) it is in atari (ii) if it can escape it, 342 | either by playing on its liberty or counter-capturing another group. 343 | 344 | N.B. this is maybe the most complicated part of the whole program (sadly); 345 | feel free to just TREAT IT AS A BLACK-BOX, it's not really that 346 | interesting! 347 | 348 | The return value is a tuple of (boolean, [coord..]), indicating whether 349 | the group is in atari and how to escape/capture (or [] if impossible). 350 | (Note that (False, [...]) is possible in case the group can be captured 351 | in a ladder - it is not in atari but some capture attack/defense moves 352 | are available.) 
353 | 354 | singlept_ok means that we will not try to save one-point groups; 355 | twolib_test means that we will check for 2-liberty groups which are 356 | threatened by a ladder 357 | twolib_edgeonly means that we will check the 2-liberty groups only 358 | at the board edge, allowing check of the most common short ladders 359 | even in the playouts """ 360 | 361 | def read_ladder_attack(pos, c, l1, l2): 362 | """ check if a capturable ladder is being pulled out at c and return 363 | a move that continues it in that case; expects its two liberties as 364 | l1, l2 (in fact, this is a general 2-lib capture exhaustive solver) """ 365 | for l in [l1, l2]: 366 | pos_l = pos.move(l) 367 | if pos_l is None: 368 | continue 369 | # fix_atari() will recursively call read_ladder_attack() back; 370 | # however, ignore 2lib groups as we don't have time to chase them 371 | is_atari, atari_escape = fix_atari(pos_l, c, twolib_test=False) 372 | if is_atari and not atari_escape: 373 | return l 374 | return None 375 | 376 | fboard = floodfill(pos.board, c) 377 | group_size = fboard.count('#') 378 | if singlept_ok and group_size == 1: 379 | return (False, []) 380 | # Find a liberty 381 | l = contact(fboard, '.') 382 | # Ok, any other liberty? 383 | fboard = board_put(fboard, l, 'L') 384 | l2 = contact(fboard, '.') 385 | if l2 is not None: 386 | # At least two liberty group... 387 | if twolib_test and group_size > 1 \ 388 | and (not twolib_edgeonly or line_height(l) == 0 and line_height(l2) == 0) \ 389 | and contact(board_put(fboard, l2, 'L'), '.') is None: 390 | # Exactly two liberty group with more than one stone. Check 391 | # that it cannot be caught in a working ladder; if it can, 392 | # that's as good as in atari, a capture threat. 393 | # (Almost - N/A for countercaptures.) 394 | ladder_attack = read_ladder_attack(pos, c, l, l2) 395 | if ladder_attack: 396 | return (False, [ladder_attack]) 397 | return (False, []) 398 | 399 | # In atari! 
If it's the opponent's group, that's enough... 400 | if pos.board[c] == 'x': 401 | return (True, [l]) 402 | 403 | solutions = [] 404 | 405 | # Before thinking about defense, what about counter-capturing 406 | # a neighboring group? 407 | ccboard = fboard 408 | while True: 409 | othergroup = contact(ccboard, 'x') 410 | if othergroup is None: 411 | break 412 | a, ccls = fix_atari(pos, othergroup, twolib_test=False) 413 | if a and ccls: 414 | solutions += ccls 415 | # XXX: floodfill is better for big groups 416 | ccboard = board_put(ccboard, othergroup, '%') 417 | 418 | # We are escaping. Will playing our last liberty gain 419 | # at least two liberties? Re-floodfill to account for connecting 420 | escpos = pos.move(l) 421 | if escpos is None: 422 | return (True, solutions) # oops, suicidal move 423 | fboard = floodfill(escpos.board, l) 424 | l_new = contact(fboard, '.') 425 | fboard = board_put(fboard, l_new, 'L') 426 | l_new_2 = contact(fboard, '.') 427 | if l_new_2 is not None: 428 | # Good, there is still some liberty remaining - but if it's 429 | # just the two, check that we are not caught in a ladder... 430 | # (Except that we don't care if we already have some alternative 431 | # escape routes!) 432 | if solutions or not (contact(board_put(fboard, l_new_2, 'L'), '.') is None 433 | and read_ladder_attack(escpos, l, l_new, l_new_2) is not None): 434 | solutions.append(l) 435 | 436 | return (True, solutions) 437 | 438 | 439 | def cfg_distances(board, c): 440 | """ return a board map listing common fate graph distances from 441 | a given point - this corresponds to the concept of locality while 442 | contracting groups to single points """ 443 | cfg_map = W*W*[-1] 444 | cfg_map[c] = 0 445 | 446 | # flood-fill like mechanics 447 | fringe = [c] 448 | while fringe: 449 | c = fringe.pop() 450 | for d in neighbors(c): 451 | if board[d].isspace() or 0 <= cfg_map[d] <= cfg_map[c]: 452 | continue 453 | cfg_before = cfg_map[d] 454 | if board[d] != '.' 
and board[d] == board[c]: 455 | cfg_map[d] = cfg_map[c] 456 | else: 457 | cfg_map[d] = cfg_map[c] + 1 458 | if cfg_before < 0 or cfg_before > cfg_map[d]: 459 | fringe.append(d) 460 | return cfg_map 461 | 462 | 463 | def line_height(c): 464 | """ Return the line number above nearest board edge """ 465 | row, col = divmod(c - (W+1), W) 466 | return min(row, col, N-1-row, N-1-col) 467 | 468 | 469 | def empty_area(board, c, dist=3): 470 | """ Check whether there are any stones in Manhattan distance up 471 | to dist """ 472 | for d in neighbors(c): 473 | if board[d] in 'Xx': 474 | return False 475 | elif board[d] == '.' and dist > 1 and not empty_area(board, d, dist-1): 476 | return False 477 | return True 478 | 479 | 480 | # 3x3 pattern routines (those patterns stored in pat3src above) 481 | 482 | def pat3_expand(pat): 483 | """ All possible neighborhood configurations matching a given pattern; 484 | used just for a combinatoric explosion when loading them in an 485 | in-memory set. """ 486 | def pat_rot90(p): 487 | return [p[2][0] + p[1][0] + p[0][0], p[2][1] + p[1][1] + p[0][1], p[2][2] + p[1][2] + p[0][2]] 488 | def pat_vertflip(p): 489 | return [p[2], p[1], p[0]] 490 | def pat_horizflip(p): 491 | return [l[::-1] for l in p] 492 | def pat_swapcolors(p): 493 | return [l.replace('X', 'Z').replace('x', 'z').replace('O', 'X').replace('o', 'x').replace('Z', 'O').replace('z', 'o') for l in p] 494 | def pat_wildexp(p, c, to): 495 | i = p.find(c) 496 | if i == -1: 497 | return [p] 498 | return reduce(lambda a, b: a + b, [pat_wildexp(p[:i] + t + p[i+1:], c, to) for t in to]) 499 | def pat_wildcards(pat): 500 | return [p for p in pat_wildexp(pat, '?', list('.XO ')) 501 | for p in pat_wildexp(p, 'x', list('.O ')) 502 | for p in pat_wildexp(p, 'o', list('.X '))] 503 | return [p for p in [pat, pat_rot90(pat)] 504 | for p in [p, pat_vertflip(p)] 505 | for p in [p, pat_horizflip(p)] 506 | for p in [p, pat_swapcolors(p)] 507 | for p in pat_wildcards(''.join(p))] 508 | 509 | pat3set 
= set([p.replace('O', 'x') for p in pat3src for p in pat3_expand(p)]) 510 | 511 | def neighborhood_33(board, c): 512 | """ return a string containing the 9 points forming 3x3 square around 513 | a certain move candidate """ 514 | return (board[c-W-1 : c-W+2] + board[c-1 : c+2] + board[c+W-1 : c+W+2]).replace('\n', ' ') 515 | 516 | 517 | # large-scale pattern routines (those patterns living in patterns.{spat,prob} files) 518 | 519 | # are you curious how these patterns look in practice? get 520 | # https://github.com/pasky/pachi/blob/master/tools/pattern_spatial_show.pl 521 | # and try e.g. ./pattern_spatial_show.pl 71 522 | 523 | spat_patterndict = dict() # hash(neighborhood_gridcular()) -> spatial id 524 | def load_spat_patterndict(f): 525 | """ load dictionary of positions, translating them to numeric ids """ 526 | for line in f: 527 | # line: 71 6 ..X.X..OO.O..........#X...... 33408f5e 188e9d3e 2166befe aa8ac9e 127e583e 1282462e 5e3d7fe 51fc9ee 528 | if line.startswith('#'): 529 | continue 530 | neighborhood = line.split()[2].replace('#', ' ').replace('O', 'x') 531 | spat_patterndict[hash(neighborhood)] = int(line.split()[0]) 532 | 533 | large_patterns = dict() # spatial id -> probability 534 | def load_large_patterns(f): 535 | """ dictionary of numeric pattern ids, translating them to probabilities 536 | that a move matching such a pattern will be played when it is available """ 537 | # The pattern file contains other features like capture, selfatari too; 538 | # we ignore them for now 539 | for line in f: 540 | # line: 0.004 14 3842 (capture:17 border:0 s:784) 541 | p = float(line.split()[0]) 542 | m = re.search(r's:(\d+)', line) 543 | if m is not None: 544 | s = int(m.groups()[0]) 545 | large_patterns[s] = p 546 | 547 | 548 | def neighborhood_gridcular(board, c): 549 | """ Yield progressively wider-diameter gridcular board neighborhood 550 | stone configuration strings, in all possible rotations """ 551 | # Each rotations element is (xyindex, xymultiplier) 552 | 
rotations = [((0,1),(1,1)), ((0,1),(-1,1)), ((0,1),(1,-1)), ((0,1),(-1,-1)), 553 | ((1,0),(1,1)), ((1,0),(-1,1)), ((1,0),(1,-1)), ((1,0),(-1,-1))] 554 | neighborhood = ['' for i in range(len(rotations))] 555 | wboard = board.replace('\n', ' ') 556 | for dseq in pat_gridcular_seq: 557 | for ri in range(len(rotations)): 558 | r = rotations[ri] 559 | for o in dseq: 560 | y, x = divmod(c - (W+1), W) 561 | y += o[r[0][0]]*r[1][0] 562 | x += o[r[0][1]]*r[1][1] 563 | if y >= 0 and y < N and x >= 0 and x < N: 564 | neighborhood[ri] += wboard[(y+1)*W + x+1] 565 | else: 566 | neighborhood[ri] += ' ' 567 | yield neighborhood[ri] 568 | 569 | 570 | def large_pattern_probability(board, c): 571 | """ return probability of large-scale pattern at coordinate c. 572 | Multiple progressively wider patterns may match a single coordinate, 573 | we consider the largest one. """ 574 | probability = None 575 | matched_len = 0 576 | non_matched_len = 0 577 | for n in neighborhood_gridcular(board, c): 578 | sp_i = spat_patterndict.get(hash(n)) 579 | prob = large_patterns.get(sp_i) if sp_i is not None else None 580 | if prob is not None: 581 | probability = prob 582 | matched_len = len(n) 583 | elif matched_len < non_matched_len < len(n): 584 | # stop when we did not match any pattern with a certain 585 | # diameter - it ain't going to get any better! 586 | break 587 | else: 588 | non_matched_len = len(n) 589 | return probability 590 | 591 | 592 | ########################### 593 | # montecarlo playout policy 594 | 595 | def gen_playout_moves(pos, heuristic_set, probs={'capture': 1, 'pat3': 1}, expensive_ok=False): 596 | """ Yield candidate next moves in the order of preference; this is one 597 | of the main places where heuristics dwell, try adding more! 598 | 599 | heuristic_set is the set of coordinates considered for applying heuristics; 600 | this is the immediate neighborhood of last two moves in the playout, but 601 | the whole board while prioring the tree. 
""" 602 | 603 | # Check whether any local group is in atari and fill that liberty 604 | # print('local moves', [str_coord(c) for c in heuristic_set], file=sys.stderr) 605 | if random.random() <= probs['capture']: 606 | already_suggested = set() 607 | for c in heuristic_set: 608 | if pos.board[c] in 'Xx': 609 | in_atari, ds = fix_atari(pos, c, twolib_edgeonly=not expensive_ok) 610 | random.shuffle(ds) 611 | for d in ds: 612 | if d not in already_suggested: 613 | yield (d, 'capture '+str(c)) 614 | already_suggested.add(d) 615 | 616 | # Try to apply a 3x3 pattern on the local neighborhood 617 | if random.random() <= probs['pat3']: 618 | already_suggested = set() 619 | for c in heuristic_set: 620 | if pos.board[c] == '.' and c not in already_suggested and neighborhood_33(pos.board, c) in pat3set: 621 | yield (c, 'pat3') 622 | already_suggested.add(c) 623 | 624 | # Try *all* available moves, but starting from a random point 625 | # (in other words, suggest a random move) 626 | x, y = random.randint(1, N), random.randint(1, N) 627 | for c in pos.moves(y*W + x): 628 | yield (c, 'random') 629 | 630 | 631 | def mcplayout(pos, amaf_map, disp=False): 632 | """ Start a Monte Carlo playout from a given position, 633 | return score for to-play player at the starting position; 634 | amaf_map is board-sized scratchpad recording who played at a given 635 | position first """ 636 | if disp: print('** SIMULATION **', file=sys.stderr) 637 | start_n = pos.n 638 | passes = 0 639 | while passes < 2 and pos.n < MAX_GAME_LEN: 640 | if disp: print_pos(pos) 641 | 642 | pos2 = None 643 | # We simply try the moves our heuristics generate, in a particular 644 | # order, but not with 100% probability; this is on the border between 645 | # "rule-based playouts" and "probability distribution playouts". 
646 | for c, kind in gen_playout_moves(pos, pos.last_moves_neighbors(), PROB_HEURISTIC): 647 | if disp and kind != 'random': 648 | print('move suggestion', str_coord(c), kind, file=sys.stderr) 649 | pos2 = pos.move(c) 650 | if pos2 is None: 651 | continue 652 | # check if the suggested move did not turn out to be a self-atari 653 | if random.random() <= (PROB_RSAREJECT if kind == 'random' else PROB_SSAREJECT): 654 | in_atari, ds = fix_atari(pos2, c, singlept_ok=True, twolib_edgeonly=True) 655 | if ds: 656 | if disp: print('rejecting self-atari move', str_coord(c), file=sys.stderr) 657 | pos2 = None 658 | continue 659 | if amaf_map[c] == 0: # Mark the coordinate with 1 for black 660 | amaf_map[c] = 1 if pos.n % 2 == 0 else -1 661 | break 662 | if pos2 is None: # no valid moves, pass 663 | pos = pos.pass_move() 664 | passes += 1 665 | continue 666 | passes = 0 667 | pos = pos2 668 | 669 | owner_map = W*W*[0] 670 | score = pos.score(owner_map) 671 | if disp: print('** SCORE B%+.1f **' % (score if pos.n % 2 == 0 else -score), file=sys.stderr) 672 | if start_n % 2 != pos.n % 2: 673 | score = -score 674 | return score, amaf_map, owner_map 675 | 676 | 677 | ######################## 678 | # montecarlo tree search 679 | 680 | class TreeNode(): 681 | """ Monte-Carlo tree node; 682 | v is #visits, w is #wins for to-play (expected reward is w/v) 683 | pv, pw are prior values (node value = w/v + pw/pv) 684 | av, aw are amaf values ("all moves as first", used for the RAVE tree policy) 685 | children is None for leaf nodes """ 686 | def __init__(self, pos): 687 | self.pos = pos 688 | self.v = 0 689 | self.w = 0 690 | self.pv = PRIOR_EVEN 691 | self.pw = PRIOR_EVEN/2 692 | self.av = 0 693 | self.aw = 0 694 | self.children = None 695 | 696 | def expand(self): 697 | """ add and initialize children to a leaf node """ 698 | cfg_map = cfg_distances(self.pos.board, self.pos.last) if self.pos.last is not None else None 699 | self.children = [] 700 | childset = dict() 701 | # Use playout 
generator to generate children and initialize them 702 | # with some priors to bias search towards more sensible moves. 703 | # Note that there can be many ways to incorporate the priors in 704 | # next node selection (progressive bias, progressive widening, ...). 705 | for c, kind in gen_playout_moves(self.pos, range(N, (N+1)*W), expensive_ok=True): 706 | pos2 = self.pos.move(c) 707 | if pos2 is None: 708 | continue 709 | # gen_playout_moves() will generate duplicate suggestions 710 | # if a move is yielded by multiple heuristics 711 | try: 712 | node = childset[pos2.last] 713 | except KeyError: 714 | node = TreeNode(pos2) 715 | self.children.append(node) 716 | childset[pos2.last] = node 717 | 718 | if kind.startswith('capture'): 719 | # Check how big group we are capturing; coord of the group is 720 | # second word in the ``kind`` string 721 | if floodfill(self.pos.board, int(kind.split()[1])).count('#') > 1: 722 | node.pv += PRIOR_CAPTURE_MANY 723 | node.pw += PRIOR_CAPTURE_MANY 724 | else: 725 | node.pv += PRIOR_CAPTURE_ONE 726 | node.pw += PRIOR_CAPTURE_ONE 727 | elif kind == 'pat3': 728 | node.pv += PRIOR_PAT3 729 | node.pw += PRIOR_PAT3 730 | 731 | # Second pass setting priors, considering each move just once now 732 | for node in self.children: 733 | c = node.pos.last 734 | 735 | if cfg_map is not None and cfg_map[c]-1 < len(PRIOR_CFG): 736 | node.pv += PRIOR_CFG[cfg_map[c]-1] 737 | node.pw += PRIOR_CFG[cfg_map[c]-1] 738 | 739 | height = line_height(c) # 0-indexed 740 | if height <= 2 and empty_area(self.pos.board, c): 741 | # No stones around; negative prior for 1st + 2nd line, positive 742 | # for 3rd line; sanitizes opening and invasions 743 | if height <= 1: 744 | node.pv += PRIOR_EMPTYAREA 745 | node.pw += 0 746 | if height == 2: 747 | node.pv += PRIOR_EMPTYAREA 748 | node.pw += PRIOR_EMPTYAREA 749 | 750 | in_atari, ds = fix_atari(node.pos, c, singlept_ok=True) 751 | if ds: 752 | node.pv += PRIOR_SELFATARI 753 | node.pw += 0 # negative prior 754 | 755 
| patternprob = large_pattern_probability(self.pos.board, c) 756 | if patternprob is not None and patternprob > 0.001: 757 | pattern_prior = math.sqrt(patternprob) # tone up 758 | node.pv += pattern_prior * PRIOR_LARGEPATTERN 759 | node.pw += pattern_prior * PRIOR_LARGEPATTERN 760 | 761 | if not self.children: 762 | # No possible moves, add a pass move 763 | self.children.append(TreeNode(self.pos.pass_move())) 764 | 765 | def rave_urgency(self): 766 | v = self.v + self.pv 767 | expectation = float(self.w+self.pw) / v 768 | if self.av == 0: 769 | return expectation 770 | rave_expectation = float(self.aw) / self.av 771 | beta = self.av / (self.av + v + float(v) * self.av / RAVE_EQUIV) 772 | return beta * rave_expectation + (1-beta) * expectation 773 | 774 | def winrate(self): 775 | return float(self.w) / self.v if self.v > 0 else float('nan') 776 | 777 | def best_move(self): 778 | """ best move is the most simulated one """ 779 | return max(self.children, key=lambda node: node.v) if self.children is not None else None 780 | 781 | 782 | def tree_descend(tree, amaf_map, disp=False): 783 | """ Descend through the tree to a leaf """ 784 | tree.v += 1 785 | nodes = [tree] 786 | passes = 0 787 | while nodes[-1].children is not None and passes < 2: 788 | if disp: print_pos(nodes[-1].pos) 789 | 790 | # Pick the most urgent child 791 | children = list(nodes[-1].children) 792 | if disp: 793 | for c in children: 794 | dump_subtree(c, recurse=False) 795 | random.shuffle(children) # randomize the max in case of equal urgency 796 | node = max(children, key=lambda node: node.rave_urgency()) 797 | nodes.append(node) 798 | 799 | if disp: print('chosen %s' % (str_coord(node.pos.last),), file=sys.stderr) 800 | if node.pos.last is None: 801 | passes += 1 802 | else: 803 | passes = 0 804 | if amaf_map[node.pos.last] == 0: # Mark the coordinate with 1 for black 805 | amaf_map[node.pos.last] = 1 if nodes[-2].pos.n % 2 == 0 else -1 806 | 807 | # updating visits on the way *down* represents 
"virtual loss", relevant for parallelization 808 | node.v += 1 809 | if node.children is None and node.v >= EXPAND_VISITS: 810 | node.expand() 811 | 812 | return nodes 813 | 814 | 815 | def tree_update(nodes, amaf_map, score, disp=False): 816 | """ Store simulation result in the tree (@nodes is the tree path) """ 817 | for node in reversed(nodes): 818 | if disp: print('updating', str_coord(node.pos.last), score < 0, file=sys.stderr) 819 | node.w += score < 0 # score is for to-play, node statistics for just-played 820 | # Update the node children AMAF stats with moves we made 821 | # with their color 822 | amaf_map_value = 1 if node.pos.n % 2 == 0 else -1 823 | if node.children is not None: 824 | for child in node.children: 825 | if child.pos.last is None: 826 | continue 827 | if amaf_map[child.pos.last] == amaf_map_value: 828 | if disp: print(' AMAF updating', str_coord(child.pos.last), score > 0, file=sys.stderr) 829 | child.aw += score > 0 # reversed perspective 830 | child.av += 1 831 | score = -score 832 | 833 | 834 | worker_pool = None 835 | 836 | def tree_search(tree, n, owner_map, disp=False): 837 | """ Perform MCTS search from a given position for a given #iterations """ 838 | # Initialize root node 839 | if tree.children is None: 840 | tree.expand() 841 | 842 | # We could simply run tree_descend(), mcplayout(), tree_update() 843 | # sequentially in a loop. This is essentially what the code below 844 | # does, if it seems confusing! 845 | 846 | # However, we also have an easy (though not optimal) way to parallelize 847 | # by distributing the mcplayout() calls to other processes using the 848 | # multiprocessing Python module. mcplayout() consumes maybe more than 849 | # 90% CPU, especially on larger boards. (Except that with large patterns, 850 | # expand() in the tree descent phase may be quite expensive - we can tune 851 | # that tradeoff by adjusting the EXPAND_VISITS constant.) 
852 | 853 | n_workers = multiprocessing.cpu_count() if not disp else 1 # set to 1 when debugging 854 | global worker_pool 855 | if worker_pool is None: 856 | worker_pool = Pool(processes=n_workers) 857 | outgoing = [] # positions waiting for a playout 858 | incoming = [] # positions that finished evaluation 859 | ongoing = [] # currently ongoing playout jobs 860 | i = 0 861 | while i < n: 862 | if not outgoing and not (disp and ongoing): 863 | # Descend the tree so that we have something ready when a worker 864 | # stops being busy 865 | amaf_map = W*W*[0] 866 | nodes = tree_descend(tree, amaf_map, disp=disp) 867 | outgoing.append((nodes, amaf_map)) 868 | 869 | if len(ongoing) >= n_workers: 870 | # Too many playouts running? Wait a bit... 871 | ongoing[0][0].wait(0.01 / n_workers) 872 | else: 873 | i += 1 874 | if i > 0 and i % REPORT_PERIOD == 0: 875 | print_tree_summary(tree, i, f=sys.stderr) 876 | 877 | # Issue an mcplayout job to the worker pool 878 | nodes, amaf_map = outgoing.pop() 879 | ongoing.append((worker_pool.apply_async(mcplayout, (nodes[-1].pos, amaf_map, disp)), nodes)) 880 | 881 | # Anything to store in the tree? (We do this step out-of-order 882 | # picking up data from the previous round so that we don't stall 883 | # ready workers while we update the tree.) 884 | while incoming: 885 | score, amaf_map, owner_map_one, nodes = incoming.pop() 886 | tree_update(nodes, amaf_map, score, disp=disp) 887 | for c in range(W*W): 888 | owner_map[c] += owner_map_one[c] 889 | 890 | # Have any playouts finished yet? (Iterate over a copy, since finished jobs are removed from ongoing inside the loop.) 891 | for job, nodes in list(ongoing): 892 | if not job.ready(): 893 | continue 894 | # Yes! Queue them up for storing in the tree.
895 | score, amaf_map, owner_map_one = job.get() 896 | incoming.append((score, amaf_map, owner_map_one, nodes)) 897 | ongoing.remove((job, nodes)) 898 | 899 | # Early stop test 900 | best_wr = tree.best_move().winrate() 901 | if i > n*0.05 and best_wr > FASTPLAY5_THRES or i > n*0.2 and best_wr > FASTPLAY20_THRES: 902 | break 903 | 904 | for c in range(W*W): 905 | owner_map[c] = float(owner_map[c]) / i 906 | dump_subtree(tree) 907 | print_tree_summary(tree, i, f=sys.stderr) 908 | return tree.best_move() 909 | 910 | 911 | ################### 912 | # user interface(s) 913 | 914 | # utility routines 915 | 916 | def print_pos(pos, f=sys.stderr, owner_map=None): 917 | """ print visualization of the given board position, optionally also 918 | including an owner map statistic (probability of that area of board 919 | eventually becoming black/white) """ 920 | if pos.n % 2 == 0: # to-play is black 921 | board = pos.board.replace('x', 'O') 922 | Xcap, Ocap = pos.cap 923 | else: # to-play is white 924 | board = pos.board.replace('X', 'O').replace('x', 'X') 925 | Ocap, Xcap = pos.cap 926 | print('Move: %-3d Black: %d caps White: %d caps Komi: %.1f' % (pos.n, Xcap, Ocap, pos.komi), file=f) 927 | pretty_board = ' '.join(board.rstrip()) + ' ' 928 | if pos.last is not None: 929 | pretty_board = pretty_board[:pos.last*2-1] + '(' + board[pos.last] + ')' + pretty_board[pos.last*2+2:] 930 | rowcounter = count() 931 | pretty_board = [' %-02d%s' % (N-i, row[2:]) for row, i in zip(pretty_board.split("\n")[1:], rowcounter)] 932 | if owner_map is not None: 933 | pretty_ownermap = '' 934 | for c in range(W*W): 935 | if board[c].isspace(): 936 | pretty_ownermap += board[c] 937 | elif owner_map[c] > 0.6: 938 | pretty_ownermap += 'X' 939 | elif owner_map[c] > 0.3: 940 | pretty_ownermap += 'x' 941 | elif owner_map[c] < -0.6: 942 | pretty_ownermap += 'O' 943 | elif owner_map[c] < -0.3: 944 | pretty_ownermap += 'o' 945 | else: 946 | pretty_ownermap += '.' 
947 | pretty_ownermap = ' '.join(pretty_ownermap.rstrip()) 948 | pretty_board = ['%s %s' % (brow, orow[2:]) for brow, orow in zip(pretty_board, pretty_ownermap.split("\n")[1:])] 949 | print("\n".join(pretty_board), file=f) 950 | print(' ' + ' '.join(colstr[:N]), file=f) 951 | print('', file=f) 952 | 953 | 954 | def dump_subtree(node, thres=N_SIMS/50, indent=0, f=sys.stderr, recurse=True): 955 | """ print this node and all its children with v >= thres. """ 956 | print("%s+- %s %.3f (%d/%d, prior %d/%d, rave %d/%d=%.3f, urgency %.3f)" % 957 | (indent*' ', str_coord(node.pos.last), node.winrate(), 958 | node.w, node.v, node.pw, node.pv, node.aw, node.av, 959 | float(node.aw)/node.av if node.av > 0 else float('nan'), 960 | node.rave_urgency()), file=f) 961 | if not recurse: 962 | return 963 | for child in sorted(node.children, key=lambda n: n.v, reverse=True): 964 | if child.v >= thres: 965 | dump_subtree(child, thres=thres, indent=indent+3, f=f) 966 | 967 | 968 | def print_tree_summary(tree, sims, f=sys.stderr): 969 | best_nodes = sorted(tree.children, key=lambda n: n.v, reverse=True)[:5] 970 | best_seq = [] 971 | node = tree 972 | while node is not None: 973 | best_seq.append(node.pos.last) 974 | node = node.best_move() 975 | print('[%4d] winrate %.3f | seq %s | can %s' % 976 | (sims, best_nodes[0].winrate(), ' '.join([str_coord(c) for c in best_seq[1:6]]), 977 | ' '.join(['%s(%.3f)' % (str_coord(n.pos.last), n.winrate()) for n in best_nodes])), file=f) 978 | 979 | 980 | def parse_coord(s): 981 | if s == 'pass': 982 | return None 983 | return W+1 + (N - int(s[1:])) * W + colstr.index(s[0].upper()) 984 | 985 | 986 | def str_coord(c): 987 | if c is None: 988 | return 'pass' 989 | row, col = divmod(c - (W+1), W) 990 | return '%c%d' % (colstr[col], N - row) 991 | 992 | 993 | # various main programs 994 | 995 | def mcbenchmark(n): 996 | """ run n Monte-Carlo playouts from empty position, return avg. 
score """ 997 | sumscore = 0 998 | for i in range(0, n): 999 | sumscore += mcplayout(empty_position(), W*W*[0])[0] 1000 | return float(sumscore) / n 1001 | 1002 | 1003 | def game_io(computer_black=False): 1004 | """ A simple minimalistic text mode UI. """ 1005 | 1006 | tree = TreeNode(pos=empty_position()) 1007 | tree.expand() 1008 | owner_map = W*W*[0] 1009 | while True: 1010 | if not (tree.pos.n == 0 and computer_black): 1011 | print_pos(tree.pos, sys.stdout, owner_map) 1012 | 1013 | sc = raw_input("Your move: ") 1014 | try: 1015 | c = parse_coord(sc) 1016 | except: 1017 | print('An incorrect move') 1018 | continue 1019 | if c is not None: 1020 | # Not a pass 1021 | if tree.pos.board[c] != '.': 1022 | print('Bad move (not empty point)') 1023 | continue 1024 | 1025 | # Find the next node in the game tree and proceed there 1026 | nodes = filter(lambda n: n.pos.last == c, tree.children) 1027 | if not nodes: 1028 | print('Bad move (rule violation)') 1029 | continue 1030 | tree = nodes[0] 1031 | 1032 | else: 1033 | # Pass move 1034 | if tree.children[0].pos.last is None: 1035 | tree = tree.children[0] 1036 | else: 1037 | tree = TreeNode(pos=tree.pos.pass_move()) 1038 | 1039 | print_pos(tree.pos) 1040 | 1041 | owner_map = W*W*[0] 1042 | tree = tree_search(tree, N_SIMS, owner_map) 1043 | if tree.pos.last is None and tree.pos.last2 is None: 1044 | score = tree.pos.score() 1045 | if tree.pos.n % 2: 1046 | score = -score 1047 | print('Game over, score: B%+.1f' % (score,)) 1048 | break 1049 | if float(tree.w)/tree.v < RESIGN_THRES: 1050 | print('I resign.') 1051 | break 1052 | print('Thank you for the game!') 1053 | 1054 | 1055 | def gtp_io(): 1056 | """ GTP interface for our program. We can play only on the board size 1057 | which is configured (N), and we ignore color information and assume 1058 | alternating play! 
""" 1059 | known_commands = ['boardsize', 'clear_board', 'komi', 'play', 'genmove', 1060 | 'final_score', 'quit', 'name', 'version', 'known_command', 1061 | 'list_commands', 'protocol_version', 'tsdebug'] 1062 | 1063 | tree = TreeNode(pos=empty_position()) 1064 | tree.expand() 1065 | 1066 | while True: 1067 | try: 1068 | line = raw_input().strip() 1069 | except EOFError: 1070 | break 1071 | if line == '': 1072 | continue 1073 | command = [s.lower() for s in line.split()] 1074 | if re.match('\d+', command[0]): 1075 | cmdid = command[0] 1076 | command = command[1:] 1077 | else: 1078 | cmdid = '' 1079 | owner_map = W*W*[0] 1080 | ret = '' 1081 | if command[0] == "boardsize": 1082 | if int(command[1]) != N: 1083 | print("Warning: Trying to set incompatible boardsize %s (!= %d)" % (command[1], N), file=sys.stderr) 1084 | ret = None 1085 | elif command[0] == "clear_board": 1086 | tree = TreeNode(pos=empty_position()) 1087 | tree.expand() 1088 | elif command[0] == "komi": 1089 | # XXX: can we do this nicer?! 1090 | tree.pos = Position(board=tree.pos.board, cap=(tree.pos.cap[0], tree.pos.cap[1]), 1091 | n=tree.pos.n, ko=tree.pos.ko, last=tree.pos.last, last2=tree.pos.last2, 1092 | komi=float(command[1])) 1093 | elif command[0] == "play": 1094 | c = parse_coord(command[2]) 1095 | if c is not None: 1096 | # Find the next node in the game tree and proceed there 1097 | if tree.children is not None and filter(lambda n: n.pos.last == c, tree.children): 1098 | tree = filter(lambda n: n.pos.last == c, tree.children)[0] 1099 | else: 1100 | # Several play commands in row, eye-filling move, etc. 
1101 | tree = TreeNode(pos=tree.pos.move(c)) 1102 | 1103 | else: 1104 | # Pass move (children is None if the node was never expanded) 1105 | if tree.children is not None and tree.children[0].pos.last is None: 1106 | tree = tree.children[0] 1107 | else: 1108 | tree = TreeNode(pos=tree.pos.pass_move()) 1109 | elif command[0] == "genmove": 1110 | tree = tree_search(tree, N_SIMS, owner_map) 1111 | if tree.pos.last is None: 1112 | ret = 'pass' 1113 | elif float(tree.w)/tree.v < RESIGN_THRES: 1114 | ret = 'resign' 1115 | else: 1116 | ret = str_coord(tree.pos.last) 1117 | elif command[0] == "final_score": 1118 | score = tree.pos.score() 1119 | if tree.pos.n % 2: 1120 | score = -score 1121 | if score == 0: 1122 | ret = '0' 1123 | elif score > 0: 1124 | ret = 'B+%.1f' % (score,) 1125 | elif score < 0: 1126 | ret = 'W+%.1f' % (-score,) 1127 | elif command[0] == "name": 1128 | ret = 'michi' 1129 | elif command[0] == "version": 1130 | ret = 'simple go program demo' 1131 | elif command[0] == "tsdebug": 1132 | print_pos(tree_search(tree, N_SIMS, W*W*[0], disp=True).pos) # tree_search() returns a TreeNode; print_pos() needs its position 1133 | elif command[0] == "list_commands": 1134 | ret = '\n'.join(known_commands) 1135 | elif command[0] == "known_command": 1136 | ret = 'true' if command[1] in known_commands else 'false' 1137 | elif command[0] == "protocol_version": 1138 | ret = '2' 1139 | elif command[0] == "quit": 1140 | print('=%s \n\n' % (cmdid,), end='') 1141 | break 1142 | else: 1143 | print('Warning: Ignoring unknown command - %s' % (line,), file=sys.stderr) 1144 | ret = None 1145 | 1146 | print_pos(tree.pos, sys.stderr, owner_map) 1147 | if ret is not None: 1148 | print('=%s %s\n\n' % (cmdid, ret,), end='') 1149 | else: 1150 | print('?%s ???\n\n' % (cmdid,), end='') 1151 | sys.stdout.flush() 1152 | 1153 | 1154 | if __name__ == "__main__": 1155 | try: 1156 | with open(spat_patterndict_file) as f: 1157 | print('Loading pattern spatial dictionary...', file=sys.stderr) 1158 | load_spat_patterndict(f) 1159 | with open(large_patterns_file) as f: 1160 | print('Loading large patterns...', file=sys.stderr) 1161 |
load_large_patterns(f) 1162 | print('Done.', file=sys.stderr) 1163 | except IOError as e: 1164 | print('Warning: Cannot load pattern files: %s; will be much weaker, consider lowering EXPAND_VISITS 5->2' % (e,), file=sys.stderr) 1165 | if len(sys.argv) < 2: 1166 | # Default action 1167 | game_io() 1168 | elif sys.argv[1] == "white": 1169 | game_io(computer_black=True) 1170 | elif sys.argv[1] == "gtp": 1171 | gtp_io() 1172 | elif sys.argv[1] == "mcdebug": 1173 | print(mcplayout(empty_position(), W*W*[0], disp=True)[0]) 1174 | elif sys.argv[1] == "mcbenchmark": 1175 | print(mcbenchmark(20)) 1176 | elif sys.argv[1] == "tsbenchmark": 1177 | t_start = time.time() 1178 | print_pos(tree_search(TreeNode(pos=empty_position()), N_SIMS, W*W*[0], disp=False).pos) 1179 | print('Tree search with %d playouts took %.3fs with %d threads; speed is %.3f playouts/thread/s' % 1180 | (N_SIMS, time.time() - t_start, multiprocessing.cpu_count(), 1181 | N_SIMS / ((time.time() - t_start) * multiprocessing.cpu_count()))) 1182 | elif sys.argv[1] == "tsdebug": 1183 | print_pos(tree_search(TreeNode(pos=empty_position()), N_SIMS, W*W*[0], disp=True).pos) 1184 | else: 1185 | print('Unknown action', file=sys.stderr) 1186 | --------------------------------------------------------------------------------
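The way `TreeNode.rave_urgency()` in michi.py blends the prior-augmented winrate with the AMAF (RAVE) winrate can be tried in isolation. The following is only a minimal sketch of that arithmetic: the helper name `blended_urgency`, its argument order, and the numbers in the usage lines are illustrative assumptions, and `rave_equiv` stands in for the `RAVE_EQUIV` constant, whose actual value is not shown in this part of the file.

```python
def blended_urgency(w, v, pw, pv, aw, av, rave_equiv):
    """Mix the prior-augmented winrate with the AMAF (RAVE) winrate.

    Mirrors the arithmetic of TreeNode.rave_urgency(); this standalone
    helper is a hypothetical illustration, not part of michi.py.
    Assumes pv > 0 so the visit count below is never zero.
    """
    visits = v + pv                       # real visits plus prior visits
    expectation = float(w + pw) / visits  # prior-augmented winrate
    if av == 0:
        return expectation                # no AMAF samples yet
    rave_expectation = float(aw) / av     # AMAF winrate
    # beta shrinks towards 0 as visits grow, shifting weight from the
    # AMAF estimate to the real simulation results
    beta = av / (av + visits + float(visits) * av / rave_equiv)
    return beta * rave_expectation + (1 - beta) * expectation


# Illustrative values only: a fresh node with an even prior,
# first without and then with AMAF samples.
print(blended_urgency(w=0, v=0, pw=5, pv=10, aw=0, av=0, rave_equiv=10))
print(blended_urgency(w=0, v=0, pw=5, pv=10, aw=10, av=10, rave_equiv=10))
```

With an even prior and no AMAF data the result is just the prior winrate. Once AMAF samples exist they pull the urgency towards the AMAF winrate, and that pull weakens as the node collects real visits.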