├── LICENSE
└── README.md


/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2020 Haiguang Liao

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# RLCO-Papers

Reinforcement Learning based combinatorial optimization (**RLCO**) is a very interesting research area.
Combinatorial Optimization Problems include: Travelling Salesman Problem (**TSP**), Single-Source Shortest Paths (**SSP**), Minimum Spanning Tree (**MST**), Vehicle Routing Problem (**VRP**), Orienteering Problem, Knapsack Problem, Maximum Independent Set (**MIS**), Maximum Cut (**MC**), Minimum Vertex Cover (**MVC**), Maximum Clique, Integer Linear Programming (**ILP**), Network Optimization (Routing), Graph Coloring Problem (**GCP**), Bin Packing Problem, Graph Partitioning, **EDA** problems. Most of them are NP-hard or NP-complete.
Traditionally, combinatorial problems are solved with exact methods or heuristic methods (genetic algorithms, simulated annealing, etc.). Recently, learning-based solvers have begun to emerge.

This is a collection of research & application papers of RLCO. Papers are sorted by time and category. Some related supervised learning papers are also listed for reference.


These references are shared for research purposes. If any authors do not want their paper to be listed here, please feel free to contact [Haiguang Liao] (Email: haiguanl [AT] andrew.cmu.edu). Feedback on any mistakes in the repo is also welcome.

## Review Papers:
* [Reinforcement Learning for Combinatorial Optimization: A Survey](https://arxiv.org/pdf/2003.03600.pdf) Nina Mazyavkina et al. (Skolkovo Institute of Science and Technology, Russia)

## Research Papers:
Papers are categorized based on the solution approaches and ordered by time:
### 1. Policy RL + (GNN)
* [POMO: Policy Optimization with Multiple Optima for Reinforcement Learning](https://arxiv.org/pdf/2010.16011.pdf) Yeong-Dae Kwon et al. NeurIPS 2020 (Samsung SDS)
* [Reinforcement Learning for Integer Programming: Learning to Cut](https://proceedings.icml.cc/static/paper_files/icml/2020/943-Paper.pdf) Yunhao Tang et al.
ICML 2020 (Columbia University)
* [Learning to Perform Local Rewriting for Combinatorial Optimization](https://arxiv.org/pdf/1810.00337.pdf) Xinyun Chen, Yuandong Tian. NeurIPS 2019 (UC Berkeley & Facebook AI Research)
* [Attention, Learn to Solve Routing Problems!](https://arxiv.org/pdf/1803.08475.pdf?source=post_page---------------------------) Wouter Kool et al. ICLR 2019 (Architecture: Graph Attention Network)
* [Learning to Solve Combinatorial Optimization Problems on Real-World Graphs in Linear Time](https://arxiv.org/pdf/2006.03750.pdf) Iddo Drori et al. 2020 (Columbia University, Cornell University)
* [Reinforcement Learning Driven Heuristic Optimization](https://arxiv.org/pdf/1906.06639.pdf) Qingpeng Cai, Azalia Mirhoseini et al. DRL4KDD 2019 (Tsinghua University, Google Brain, Google AI)
* [Neural Combinatorial Optimization with Reinforcement Learning](https://arxiv.org/pdf/1611.09940.pdf) Irwan Bello et al. ICLR 2017 (Google Brain)


### 2. Value RL + (GNN)
* [Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case](https://arxiv.org/pdf/1803.08475.pdf?source=post_page---------------------------) Paul Almasan et al. 2020
* [Learning Combinatorial Optimization Algorithms over Graphs](https://arxiv.org/pdf/1704.01665.pdf) Hanjun Dai, Le Song et al. NeurIPS 2017 (GaTech)

### 3. Supervised Learning + Tree Search + Graph Embedding
* [Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search](https://papers.nips.cc/paper/7335-combinatorial-optimization-with-graph-convolutional-networks-and-guided-tree-search.pdf) Zhuwen Li et al. NeurIPS 2018 (Intel)

## Application Papers:
Papers are categorized based on the application domains and ordered by time:
### 1. Electronic Design Automation (EDA)
EDA is not easy.
Some problems in EDA, such as physical design (**floorplan**, **placement**, **routing**, etc.), can be formulated as combinatorial optimization problems. Thus, some of them have been tackled with RLCO.
* [Track-Assignment Detailed Routing Using Attention-based Policy Model With Supervision](https://arxiv.org/pdf/2010.13702.pdf) Haiguang Liao et al. 2020 (CMU & Cadence)
* [AutoCkt: Deep Reinforcement Learning of Analog Circuit Designs](https://arxiv.org/pdf/2001.01808.pdf) Keertana Settaluri et al. 2020 (UC Berkeley)
* [Attention Routing: track-assignment detailed routing using attention-based reinforcement learning](https://arxiv.org/pdf/2004.09473.pdf) Haiguang Liao et al. 2020 (CMU & Cadence)
* [Chip Placement with Deep Reinforcement Learning](https://arxiv.org/pdf/2004.10746.pdf) Azalia Mirhoseini et al. 2020 (Google)
* [Placement Optimization with Deep Reinforcement Learning](https://dl.acm.org/doi/pdf/10.1145/3372780.3378174) Anna Goldie, Azalia Mirhoseini. ISPD 2020 (Google Brain)
* [Placeto: Learning Generalizable Device Placement Algorithms for Distributed Machine Learning](https://arxiv.org/pdf/1906.08879.pdf) Ravichandra Addanki et al. 2019 (MIT CSAIL)
* [GDP: Generalized Device Placement for Dataflow Graphs](https://arxiv.org/pdf/1910.01578.pdf) Yanqi Zhou et al. 2019 (Google Brain)
* [A Deep Reinforcement Learning Approach for Global Routing](https://arxiv.org/pdf/1906.08809.pdf) Haiguang Liao et al. 2019 (CMU)
* [A Hierarchical Model for Device Placement](https://arxiv.org/pdf/1906.06639.pdf) Azalia Mirhoseini et al. ICLR 2018 (Google)
* [Device Placement Optimization with Reinforcement Learning](https://arxiv.org/pdf/1706.04972.pdf) Azalia Mirhoseini et al. ICML 2017 (Google Brain)

### 2. Path Planning (Routing)
RL has been extensively applied to robotic path planning; the papers listed here are only a small subset that we consider relevant to RLCO.
* [Multi-Agent Routing Value Iteration Network](https://arxiv.org/pdf/2007.05096.pdf) Quinlan Sykora et al. PMLR 2020 (Uber)
* [Value Iteration Networks](https://arxiv.org/pdf/1602.02867.pdf) (Code: https://github.com/kentsommer/pytorch-value-iteration-networks) Aviv Tamar et al. NeurIPS 2016 (UC Berkeley)

### 3. Resource Management
* [Resource Management with Deep Reinforcement Learning](https://www.cl.cam.ac.uk/~ey204/teaching/ACS/R244_2018_2019/papers/mao_HOTNETS_2016.pdf) Hongzi Mao et al. ACM HotNets 2016 (MIT & Microsoft Research)


## Relevant Papers:
### 1. MLCO (Machine Learning for Combinatorial Optimization)
* [Machine learning for combinatorial optimization: a methodological tour d’horizon](https://arxiv.org/pdf/1811.06128.pdf) Yoshua Bengio et al. 2020 (Université de Montréal, etc.)
* [Learning Combinatorial Optimization on Graphs: A Survey with Applications to Networking](https://arxiv.org/pdf/2005.11081.pdf) Natalia Vesselinova et al. 2020
### 2. ILCO (Imitation for Combinatorial Optimization)
* [Exact Combinatorial Optimization with Graph Convolutional Neural Networks](https://arxiv.org/pdf/1906.01629.pdf) Maxime Gasse et al. NeurIPS 2019 (Mila, Polytechnique Montréal, etc.)
### 3. GNN and CO
* [Combinatorial optimization and reasoning with graph neural networks](https://arxiv.org/pdf/2102.09544.pdf) Quentin Cappart et al. (Polytechnique Montréal, UToronto, DeepMind, etc.)


## Relevant Resources:
### Frameworks:
1. [PyTorch Geometric](https://arxiv.org/pdf/1903.02428.pdf) (TU Dortmund University)
2. [Deep Graph Library](https://arxiv.org/pdf/1909.01315.pdf) (Amazon Web Services, AWS Shanghai AI Lab, New York University, NYU Shanghai)

### Libraries:
1. Facilitate the learning of heuristics for CO problems, similar to OpenAI Gym.
    * [OR-Gym](https://arxiv.org/pdf/2008.06319.pdf) (CMU)
    * [OpenGraphGym](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302566/)
2. Facilitate the learning of configuration parameters for CO solvers.
    * [MIPLearn](https://anl-ceeesa.github.io/MIPLearn)
3. Offer a general, extensible framework for implementing and evaluating machine learning-enhanced CO, also based on OpenAI Gym.
    * [Ecole](https://arxiv.org/pdf/2011.06069.pdf) (Mila, Polytechnique Montréal)
### Environments/Problems:



--------------------------------------------------------------------------------