├── LICENSE
└── README.md


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2020 Haiguang Liao
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # RLCO-Papers
 2 | 
 3 | Reinforcement learning based combinatorial optimization (**RLCO**) is an active and very interesting research area.
 4 | Combinatorial optimization problems include: Travelling Salesman Problem (**TSP**), Single-Source Shortest Paths (**SSP**), Minimum Spanning Tree (**MST**), Vehicle Routing Problem (**VRP**), Orienteering Problem, Knapsack Problem, Maximal Independent Set (**MIS**), Maximum Cut (**MC**), Minimum Vertex Cover (**MVC**), Maximal Clique (**MC**), Integer Linear Programming (**ILP**), Network Optimization (Routing), Graph Coloring Problem (**GCP**), Bin Packing Problem, Graph Partitioning, and Electronic Design Automation (**EDA**) problems. Most of them are NP-hard or NP-complete.
 5 | Combinatorial problems are traditionally solved with exact methods or heuristic methods (genetic algorithms, simulated annealing, etc.). More recently, learning-based solvers have been emerging.
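To make the difference between exact and heuristic methods concrete, here is a minimal Python sketch on a tiny random TSP instance. It is only an illustration and is not taken from any of the papers listed below: the function names (`tour_length`, `solve_exact`, `solve_annealing`), the cooling schedule, and the instance size are all assumptions chosen for readability.

```python
import itertools
import math
import random


def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def solve_exact(points):
    """Exact method: enumerate every tour that starts at city 0."""
    rest = range(1, len(points))
    best = min(
        ((0,) + perm for perm in itertools.permutations(rest)),
        key=lambda tour: tour_length(points, tour),
    )
    return list(best)


def solve_annealing(points, steps=20000, t_start=1.0, t_end=1e-3, seed=0):
    """Heuristic method: simulated annealing with segment-reversal moves."""
    rng = random.Random(seed)
    tour = list(range(len(points)))
    cur_len = tour_length(points, tour)
    best, best_len = tour[:], cur_len
    for step in range(steps):
        # Geometric cooling schedule from t_start down to t_end (an assumption).
        temp = t_start * (t_end / t_start) ** (step / steps)
        # Pick two positions (city 0 stays fixed) and reverse the segment between them.
        i, j = sorted(rng.sample(range(1, len(points)), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        cand_len = tour_length(points, cand)
        # Always accept improvements; accept worse tours with Boltzmann probability.
        if cand_len < cur_len or rng.random() < math.exp((cur_len - cand_len) / temp):
            tour, cur_len = cand, cand_len
            if cur_len < best_len:
                best, best_len = tour[:], cur_len
    return best


# Hypothetical 8-city instance; small enough for brute force (7! = 5040 tours).
rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(8)]
exact_tour = solve_exact(points)
heuristic_tour = solve_annealing(points)
```

The exact solver is guaranteed optimal but its cost grows factorially with the number of cities, while the annealing heuristic scales to larger instances without any optimality guarantee. Learning-based solvers aim to replace hand-designed heuristics of this kind with learned policies.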
 6 | 
 7 | This is a collection of research and application papers on RLCO. Papers are grouped by category and sorted by time. Some related supervised learning papers are also listed for reference.
 8 | 
 9 | 
10 | These references are shared here for research purposes. If any authors do not want their paper to be listed here, please feel free to contact [Haiguang Liao] (Email: haiguanl [AT] andrew.cmu.edu). Feedback on any mistakes in the repo is also welcome.
11 | 
12 | ## Review Papers:
13 | * [Reinforcement Learning for Combinatorial Optimization: A Survey](https://arxiv.org/pdf/2003.03600.pdf) Nina Mazyavkina et al. (Skolkovo Institute of Science and Technology, Russia)
15 | 
16 | ## Research Papers:
17 | Papers are categorized by solution approach and ordered by time:
18 | ### 1. Policy RL + (GNN)
19 | * [POMO: Policy Optimization with Multiple Optima for Reinforcement Learning](https://arxiv.org/pdf/2010.16011.pdf) Yeong-Dae Kwon et al. NeurIPS 2020 (Samsung SDS)
20 | * [Reinforcement Learning for Integer Programming: Learning to Cut](https://proceedings.icml.cc/static/paper_files/icml/2020/943-Paper.pdf) Yunhao Tang et al. ICML 2020 (Columbia University)
21 | * [Learning to Perform Local Rewriting for Combinatorial Optimization](https://arxiv.org/pdf/1810.00337.pdf) Xinyun Chen, Yuandong Tian. NeurIPS 2019 (UC Berkeley & Facebook AI Research)
22 | * [Attention, Learn to Solve Routing Problems!](https://arxiv.org/pdf/1803.08475.pdf) Wouter Kool, Herke van Hoof, Max Welling. ICLR 2019 (Architecture: Graph Attention Network)
23 | * [Learning to Solve Combinatorial Optimization Problems on Real-World Graphs in Linear Time](https://arxiv.org/pdf/2006.03750.pdf) Iddo Drori et al. 2020 (Columbia University, Cornell University)
25 | * [Reinforcement Learning Driven Heuristic Optimization](https://arxiv.org/pdf/1906.06639.pdf) Qingpeng Cai, Azalia Mirhoseini et al. DRL4KDD 2019 (Tsinghua University, Google Brain, Google AI)
26 | * [Neural Combinatorial Optimization with Reinforcement Learning](https://arxiv.org/pdf/1611.09940.pdf) Irwan Bello et al. ICLR 2017 (Google Brain)
27 | 
28 | 
29 | ### 2. Value RL + (GNN)
30 | * [Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case](https://arxiv.org/pdf/1910.07421.pdf) Paul Almasan et al. 2020
32 | * [Learning Combinatorial Optimization Algorithms over Graphs](https://arxiv.org/pdf/1704.01665.pdf) Hanjun Dai, Le Song et al. NeurIPS 2017 (GaTech)
33 | 
34 | ### 3. Supervised Learning + Tree Search + Graph Embedding
35 | * [Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search](https://papers.nips.cc/paper/7335-combinatorial-optimization-with-graph-convolutional-networks-and-guided-tree-search.pdf) Zhuwen Li et al. NeurIPS 2018 (Intel)
36 | 
37 | ## Application Papers:
38 | Papers are categorized by application domain and ordered by time:
39 | ### 1. Electronic Design Automation (EDA)
40 | EDA is not easy. Some EDA problems, such as those in physical design (**floorplanning**, **placement**, **routing**, etc.), can be formulated as combinatorial optimization problems, so some of them are solved with RLCO.
41 | * [Track-Assignment Detailed Routing Using Attention-based Policy Model With Supervision](https://arxiv.org/pdf/2010.13702.pdf) Haiguang Liao et al. 2020 (CMU & Cadence)
42 | * [AutoCkt: Deep Reinforcement Learning of Analog Circuit Designs](https://arxiv.org/pdf/2001.01808.pdf) Keertana Settaluri et al. 2020 (UC Berkeley)
44 | * [Attention Routing: track-assignment detailed routing using attention-based reinforcement learning](https://arxiv.org/pdf/2004.09473.pdf) Haiguang Liao et al. 2020 (CMU & Cadence)
45 | * [Chip Placement with Deep Reinforcement Learning](https://arxiv.org/pdf/2004.10746.pdf) Azalia Mirhoseini et al. 2020 (Google)
46 | * [Placement Optimization with Deep Reinforcement Learning](https://dl.acm.org/doi/pdf/10.1145/3372780.3378174) Anna Goldie, Azalia Mirhoseini. ISPD 2020 (Google Brain)
48 | * [Placeto: Learning Generalizable Device Placement Algorithms for Distributed Machine Learning](https://arxiv.org/pdf/1906.08879.pdf) Ravichandra Addanki et al. 2019 (MIT CSAIL)
50 | * [GDP: Generalized Device Placement for Dataflow Graphs](https://arxiv.org/pdf/1910.01578.pdf) Yanqi Zhou et al. 2019 (Google Brain)
51 | * [A Deep Reinforcement Learning Approach for Global Routing](https://arxiv.org/pdf/1906.08809.pdf) Haiguang Liao et al. 2019 (CMU)
52 | * [A Hierarchical Model for Device Placement](https://arxiv.org/pdf/1906.06639.pdf) Azalia Mirhoseini et al. ICLR 2018 (Google)
53 | * [Device Placement Optimization with Reinforcement Learning](https://arxiv.org/pdf/1706.04972.pdf) Azalia Mirhoseini et al. ICML 2017 (Google Brain)
54 |  
55 |  ### 2. Path Planning (Routing)
56 | RL has been extensively applied to robotic path planning; the papers listed here are only a small subset that we think are relevant to RLCO.
57 | * [Multi-Agent Routing Value Iteration Network](https://arxiv.org/pdf/2007.05096.pdf) Quinlan Sykora et al. PMLR 2020 (Uber)
58 | * [Value Iteration Networks](https://arxiv.org/pdf/1602.02867.pdf) (Code: https://github.com/kentsommer/pytorch-value-iteration-networks) Aviv Tamar et al. NeurIPS 2016 (UC Berkeley)
59 | 
60 |  ### 3. Resource Management
61 |  * [Resource Management with Deep Reinforcement Learning](https://www.cl.cam.ac.uk/~ey204/teaching/ACS/R244_2018_2019/papers/mao_HOTNETS_2016.pdf) Hongzi Mao et al. ACM_HotNets 2016 (MIT & Microsoft Research)
62 | 
63 |  
64 |  ## Relevant Papers:
65 | ### 1. MLCO (Machine Learning for Combinatorial Optimization)
66 | * [Machine learning for combinatorial optimization: a methodological tour d’horizon](https://arxiv.org/pdf/1811.06128.pdf) Yoshua Bengio et al. 2020 (Université de Montréal, etc.)
67 | * [Learning Combinatorial Optimization on Graphs: A Survey with Applications to Networking](https://arxiv.org/pdf/2005.11081.pdf) Natalia Vesselinova et al. 2020
68 | ### 2. ILCO (Imitation for Combinatorial Optimization)
69 | * [Exact Combinatorial Optimization with Graph Convolutional Neural Networks](https://arxiv.org/pdf/1906.01629.pdf) Maxime Gasse et al. NeurIPS 2019 (Mila, Polytechnique Montréal, etc.)
71 | ### 3. GNN and CO
72 | * [Combinatorial optimization and reasoning with graph neural networks](https://arxiv.org/pdf/2102.09544.pdf) Quentin Cappart et al. (Polytechnique Montréal, UToronto, DeepMind, etc.)
73 | 
74 | 
75 |  ## Relevant Resources:
76 |  ### Frameworks:
77 |  1. [PyTorch Geometric](https://arxiv.org/pdf/1903.02428.pdf) (TU Dortmund University)
78 |  2. [Deep Graph Library](https://arxiv.org/pdf/1909.01315.pdf) (Amazon Web Services, AWS Shanghai AI Lab, New York University, NYU Shanghai)
79 | 
80 |  ### Libraries:
81 |  1. Facilitate the learning of heuristics for CO problems, similar to OpenAI Gym.
82 |  * [OR-Gym](https://arxiv.org/pdf/2008.06319.pdf) (CMU)
83 |  * [OpenGraphGym](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302566/) 
84 |  2. Facilitate the learning of configuration parameters for CO solvers. 
85 |  * [MIPLearn](https://anl-ceeesa.github.io/MIPLearn)
86 |  3. Offer a general, extensible framework for implementing and evaluating machine learning-enhanced CO, also based on OpenAI Gym.
87 |  * [Ecole](https://arxiv.org/pdf/2011.06069.pdf) (Mila, Polytechnique Montréal)
88 |  ### Environments/Problems:
89 |  
90 |  
91 |  
92 | 


--------------------------------------------------------------------------------