# Explainable Reinforcement Learning (XRL) Resources

This repository maintains an up-to-date list of research on explainable reinforcement learning (XRL). If you find it helpful, please star the repository and share it.

Missing a resource or paper, or have a problem or question? Open an issue here or email contact (at) xrl (dot) ai.

## Resources

* Awesome Explainable Reinforcement Learning. [Link](https://github.com/Plankson/awesome-explainable-reinforcement-learning)

## Newly Added Papers

| #/Link | Title | Venue/Journal | Year |
|-----|-----|-----|-----|
| [1](https://doi.org/10.1609/AAAI.V38I13.29372) | CrystalBox: Future-Based Explanations for Input-Driven Deep RL Systems | AAAI | 2024 |
| [2](https://dl.acm.org/doi/10.5555/3635637.3662930) | Causal Explanations for Sequential Decision-Making in Multi-Agent Systems | AAMAS | 2024 |
| [3](https://doi.org/10.1016/J.ARTINT.2024.104182) | ASQ-IT: Interactive explanations for reinforcement-learning agents | Artif. Intell. | 2024 |
| [4](https://doi.org/10.1109/CAI59869.2024.00023) | Abstracted Trajectory Visualization for Explainability in Reinforcement Learning | CAI | 2024 |
| [5](https://doi.org/10.48550/arXiv.2408.09841) | Demystifying Reinforcement Learning in Production Scheduling via Explainable AI | CoRR | 2024 |
| [6](https://arxiv.org/abs/2408.08230) | Explaining an Agent's Future Beliefs through Temporally Decomposing Future Reward Estimators | ECAI | 2024 |
| [7](https://openreview.net/forum?id=0P3kaNluGj) | End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations | ICML | 2024 |
| [8](https://doi.org/10.1109/TSMC.2023.3312411) | Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning | IEEE Trans. Syst. Man Cybern. Syst. | 2024 |
| [9](https://doi.org/10.48550/ARXIV.2402.12685) | XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques | KDD | 2024 |
| [10](http://arxiv.org/abs/2409.05435) | Semifactual Explanations for Reinforcement Learning | Proc. of HAI | 2024 |
| [11](https://doi.org/10.1038/s41598-024-70701-2) | Information based explanation methods for deep learning agents--with applications on large open-source chess models | Scientific Reports | 2024 |
| [12](https://doi.org/10.1609/AIIDE.V19I1.27516) | Learning of Generalizable and Interpretable Knowledge in Grid-Based Reinforcement Learning Environments | AIIDE | 2023 |
| [13](https://doi.org/10.48550/ARXIV.2310.16410) | Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero | CoRR | 2023 |
| [14](https://doi.org/10.1007/978-3-031-44067-0_13) | Explaining Deep Reinforcement Learning-Based Methods for Control of Building HVAC Systems | xAI | 2023 |

## Survey Papers

| #/Link | Title | Venue/Journal | Year |
|-----|-----|-----|-----|
| [1](https://doi.org/10.1145/3616864) | Explainable Reinforcement Learning: A Survey and Comparative Review | ACM Comput. Surv. | 2024 |
| [2](https://doi.org/10.1145/3648472) | Redefining Counterfactual Explanations for Reinforcement Learning: Overview, Challenges and Opportunities | ACM Comput. Surv. | 2024 |
| [3](https://www.taylorfrancis.com/chapters/edit/10.1201/9781003355281-2/survey-global-explanations-reinforcement-learning-yotam-amitai-ofra-amir) | A Survey of Global Explanations in Reinforcement Learning | Explainable Agency in Artificial Intelligence | 2024 |
| [4](https://doi.org/10.1007/s10994-024-06543-w) | A survey on interpretable reinforcement learning | Mach. Learn. | 2024 |
| [5](https://doi.org/10.1007/S10994-023-06479-7) | Explainable reinforcement learning (XRL): a systematic literature review and taxonomy | Mach. Learn. | 2024 |
| [6](https://doi.org/10.1007/978-3-031-47518-4) | Explainable and Interpretable Reinforcement Learning for Robotics | SLAIML | 2024 |
| [7](https://doi.org/10.1145/3623377) | Explainability in Deep Reinforcement Learning, a Review into Current Methods and Applications | ACM Comput. Surv. | 2023 |
| [8](https://doi.org/10.48550/ARXIV.2211.06665) | A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges | CoRR | 2023 |
| [9](https://doi.org/10.1201/9781003324140-5) | Advances in Explainable Reinforcement Learning: An Intelligent Transportation Systems Perspective | Explainable Artificial Intelligence for Intelligent Transportation Systems | 2023 |
| [10](https://doi.org/10.1007/S00521-023-08423-1) | Explainable reinforcement learning for broad-XAI: a conceptual framework and survey | Neural Comput. Appl. | 2023 |
| [11](https://doi.org/10.1145/3527448) | Explainable Deep Reinforcement Learning: State of the Art and Challenges | ACM Comput. Surv. | 2022 |
| [12](https://doi.org/10.48550/ARXIV.2203.11547) | Explainability in reinforcement learning: perspective and position | CoRR | 2022 |
| [13](https://doi.org/10.3389/FRAI.2021.550030) | Explainable AI and Reinforcement Learning - A Systematic Review of Current Approaches and Trends | Frontiers Artif. Intell. | 2021 |
| [14](https://doi.org/10.1016/J.KNOSYS.2020.106685) | Explainability in deep reinforcement learning | Knowl. Based Syst. | 2021 |
| [15](https://doi.org/10.1007/978-3-030-57321-8_5) | Explainable Reinforcement Learning: A Survey | CD-MAKE | 2020 |
| [16](https://doi.org/10.1109/ACCESS.2020.3023394) | Reinforcement Learning Interpretation Methods: A Survey | IEEE Access | 2020 |

## Papers

| #/Link | Title | Venue/Journal | Year |
|-----|-----|-----|-----|
| [1](https://doi.org/10.48550/arXiv.2312.11118) | Explaining Reinforcement Learning Agents Through Counterfactual Action Outcomes | AAAI | 2024 |
| [2](https://proceedings.mlr.press/v222/bekkemoen24a.html) | ASAP: Attention-Based State Space Abstraction for Policy Summarization | ACML | 2024 |
| [3](https://proceedings.mlr.press/v236/lu24a.html) | Causal State Distillation for Explainable Reinforcement Learning | CLeaR | 2024 |
| [4](https://doi.org/10.48550/ARXIV.2402.12939) | Discovering Behavioral Modes in Deep Reinforcement Learning Policies Using Trajectory Clustering in Latent Space | CoRR | 2024 |
| [5](https://doi.org/10.48550/ARXIV.2401.05821) | Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents | CoRR | 2024 |
| [6](https://doi.org/10.48550/ARXIV.2405.14956) | Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning | CoRR | 2024 |
| [7](https://hdl.handle.net/10125/106551) | Detection of Important States through an Iterative Q-value Algorithm for Explainable Reinforcement Learning | HICSS | 2024 |
| [8](https://openreview.net/forum?id=0BdPwBncot) | "You just can't go around killing people" Explaining Agent Behavior to a Human Terminator | ICML Workshop MHFAIA | 2024 |
| [9](https://doi.org/10.48550/ARXIV.2406.01178) | Deep Reinforcement Learning Behavioral Mode Switching Using Optimal Control Based on a Latent Space Objective | MED | 2024 |
| [10](https://doi.org/10.1109/MED61351.2024.10566218) | Explaining Deep Reinforcement Learning Policies with SHAP, Decision Trees, and Prototypes | MED | 2024 |
| [11](https://doi.org/10.1609/AAAI.V37I7.26081) | Local Explanations for Reinforcement Learning | AAAI | 2023 |
| [12](https://ceur-ws.org/Vol-3433/paper5.pdf) | Explainable Reinforcement Learning Based on Q-Value Decomposition by Expected State Transitions | AAAI-MAKE | 2023 |
| [13](https://dl.acm.org/doi/10.5555/3545946.3598751) | GANterfactual-RL: Understanding Reinforcement Learning Agents' Strategies through Visual Counterfactual Explanations | AAMAS | 2023 |
| [14](https://doi.org/10.1007/S10489-022-03788-7) | Interpreting a deep reinforcement learning model with conceptual embedding and performance analysis | Appl. Intell. | 2023 |
| [15](https://doi.org/10.1007/978-3-031-43421-1_13) | Deep Explainable Relational Reinforcement Learning: A Neuro-Symbolic Approach | ECML PKDD | 2023 |
| [16](https://doi.org/10.1007/978-3-031-40878-6_10) | Inherently Interpretable Deep Reinforcement Learning Through Online Mimicking | EXTRAAMAS | 2023 |
| [17](https://doi.org/10.1109/ICDL55364.2023.10364407) | A Closer Look at Reward Decomposition for High-Level Robotic Explanations | ICDL | 2023 |
| [18](https://openreview.net/pdf?id=hWwY_Jq0xsN) | Towards Interpretable Deep Reinforcement Learning with Human-Friendly Prototypes | ICLR | 2023 |
| [19](https://proceedings.mlr.press/v202/beechey23a.html) | Explaining Reinforcement Learning with Shapley Values | ICML | 2023 |
| [20](https://doi.org/10.48550/ARXIV.2307.13192) | Counterfactual Explanation Policies in RL | ICML Workshop on Counterfactuals in Minds and Machines | 2023 |
| [21](https://doi.org/10.1007/978-3-031-30047-9_25) | Explaining Black Box Reinforcement Learning Agents Through Counterfactual Policies | IDA | 2023 |
| [22](https://doi.org/10.1109/TCSS.2022.3225362) | Extracting Decision Tree From Trained Deep Reinforcement Learning in Traffic Signal Control | IEEE Trans. Comput. Soc. Syst. | 2023 |
| [23](https://doi.org/10.1016/j.ifacol.2023.10.1328) | Real-Time Counterfactual Explanations For Robotic Systems With Multiple Continuous Outputs | IFAC-PapersOnLine | 2023 |
| [24](https://doi.org/10.24963/IJCAI.2023/7) | Explainable Multi-Agent Reinforcement Learning for Temporal Queries | IJCAI | 2023 |
| [25](https://doi.org/10.24963/IJCAI.2023/505) | Explainable Reinforcement Learning via a Causal World Model | IJCAI | 2023 |
| [26](https://doi.org/10.24963/IJCAI.2023/541) | Unveiling Concepts Learned by a World-Class Chess-Playing Agent | IJCAI | 2023 |
| [27](https://doi.org/10.1016/J.INS.2022.12.080) | Extracting tactics learned from self-play in general games | Inf. Sci. | 2023 |
| [28](https://doi.org/10.1007/s10994-022-06295-5) | Learning state importance for preference-based reinforcement learning | Mach. Learn. | 2023 |
| [29](https://openreview.net/forum?id=PbMBfRpVgU) | Interpretable and Explainable Logical Policies via Neurally Guided Symbolic Abstraction | NeurIPS | 2023 |
| [30](https://openreview.net/forum?id=xGz0wAIJrS) | State2Explanation: Concept-Based Explanations to Benefit Agent Learning and User Understanding | NeurIPS | 2023 |
| [31](https://openreview.net/forum?id=pzc6LnUxYN) | StateMask: Explaining Deep Reinforcement Learning through State Mask | NeurIPS | 2023 |
| [32](https://doi.org/10.1007/s00521-023-08696-6) | Comparing explanations in RL | Neural Comput. Appl. | 2023 |
| [33](https://doi.org/10.1007/S00521-021-06425-5) | Explainable robotic systems: understanding goal-driven actions in a reinforcement learning scenario | Neural Comput. Appl. | 2023 |
| [34](https://doi.org/10.1007/S00521-022-07280-8) | Hierarchical goals contextualize local reward decomposition explanations | Neural Comput. Appl. | 2023 |
| [35](https://doi.org/10.1016/J.NEUNET.2023.01.025) | Achieving efficient interpretability of reinforcement learning via policy distillation and selective input gradient regularization | Neural Networks | 2023 |
| [36](https://doi.org/10.1016/J.NEUCOM.2022.10.014) | Model tree methods for explaining deep reinforcement learning agents in real-time robotic applications | Neurocomputing | 2023 |
| [37](https://doi.org/10.1007/978-3-031-37616-0_27) | Integrating Policy Summaries with Reward Decomposition for Explaining Reinforcement Learning Agents | PAAMS | 2023 |
| [38](https://doi.org/10.1007/978-3-031-44067-0_4) | Contrastive Visual Explanations for Reinforcement Learning via Counterfactual Rewards | xAI | 2023 |
| [39](https://doi.org/10.1007/978-3-031-44064-9_20) | IxDRL: A Novel Explainable Deep Reinforcement Learning Toolkit Based on Analyses of Interestingness | xAI | 2023 |
| [40](https://doi.org/10.1609/AAAI.V36I5.20463) | "I Don't Think So": Summarizing Policy Disagreements for Agent Comparison | AAAI | 2022 |
| [41](https://doi.org/10.5555/3535850.3535950) | CAPS: Comprehensible Abstract Policy Summaries for Explaining Reinforcement Learning Agents | AAMAS | 2022 |
| [42](https://doi.org/10.5555/3535850.3535865) | Interpretable Preference-based Reinforcement Learning with Tree-Structured Reward Functions | AAMAS | 2022 |
| [43](https://doi.org/10.5555/3535850.3535926) | Lazy-MDPs: Towards Interpretable RL by Learning When to Act | AAMAS | 2022 |
| [44](https://doi.org/10.1109/ACSOS55765.2022.00023) | Explaining Online Reinforcement Learning Decisions of Self-Adaptive Systems | ACSOS | 2022 |
| [45](https://doi.org/10.3390/A15030091) | Analysis of Explainable Goal-Driven Reinforcement Learning in a Continuous Simulated Environment | Algorithms | 2022 |
| [46](https://doi.org/10.3390/app122110947) | BEERL: Both Ends Explanations for Reinforcement Learning | Applied Sciences | 2022 |
| [47](https://doi.org/10.3390/app12115380) | Energy-Efficient Driving for Adaptive Traffic Signal Control Environment via Explainable Reinforcement Learning | Applied Sciences | 2022 |
| [48](https://proceedings.mlr.press/v205/zabounidis23a.html) | Concept Learning for Interpretable Multi-Agent Reinforcement Learning | CoRL | 2022 |
| [49](https://doi.org/10.1109/DESTION56136.2022.00017) | Comparing Strategies for Visualizing the High-Dimensional Exploration Behavior of CPS Design Agents | DESTION | 2022 |
| [50](https://doi.org/10.1007/978-3-031-19839-7_22) | InAction: Interpretable Action Decision Making for Autonomous Driving | ECCV | 2022 |
| [51](https://doi.org/10.1016/j.epsr.2022.107932) | Enhanced Oblique Decision Tree Enabled Policy Extraction for Deep Reinforcement Learning in Power System Emergency Control | Electric Power Systems Research | 2022 |
| [52](https://doi.org/10.3390/electronics11213599) | Attributation Analysis of Reinforcement Learning-Based Highway Driver | Electronics | 2022 |
| [53](https://doi.org/10.1007/978-3-031-02056-8_18) | Multi-objective Genetic Programming for Explainable Reinforcement Learning | EuroGP | 2022 |
| [54](https://doi.org/10.5220/0010796300003116) | Deep-Learning-based Fuzzy Symbolic Processing with Agents Capable of Knowledge Communication | ICAART | 2022 |
| [55](https://openreview.net/forum?id=o-1v9hdSult) | Bridging the Gap: Providing Post-Hoc Symbolic Explanations for Sequential Decision-Making Problems with Inscrutable Representations | ICLR | 2022 |
| [56](https://openreview.net/forum?id=AJsI-ymaKn_) | POETREE: Interpretable Policy Learning with Adaptive Decision Trees | ICLR | 2022 |
| [57](https://openreview.net/forum?id=6Tk2noBdvxt) | Programmatic Reinforcement Learning without Oracles | ICLR | 2022 |
| [58](https://arxiv.org/abs/2201.12462) | Explaining Reinforcement Learning Policies through Counterfactual Trajectories | ICML Workshop on HILL | 2022 |
| [59](https://doi.org/10.1145/3523111.3523127) | Mean-variance Based Risk-sensitive Reinforcement Learning with Interpretable Attention | ICMVA | 2022 |
| [60](https://doi.org/10.1109/ICPR56361.2022.9956245) | Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning | ICPR | 2022 |
| [61](https://doi.org/10.1109/ACCESS.2022.3176104) | Explaining Intelligent Agent's Future Motion on Basis of Vocabulary Learning With Human Goal Inference | IEEE Access | 2022 |
| [62](https://doi.org/10.1109/LRA.2022.3146555) | Interpretable Autonomous Flight Via Compact Visualizable Neural Circuit Policies | IEEE Robotics Autom. Lett. | 2022 |
| [63](https://doi.org/10.1109/TCSS.2021.3096824) | Explainable AI in Deep Reinforcement Learning Models for Power System Emergency Control | IEEE Trans. Comput. Soc. Syst. | 2022 |
| [64](https://doi.org/10.1109/TITS.2021.3096998) | Hierarchical Program-Triggered Reinforcement Learning Agents for Automated Driving | IEEE Trans. Intell. Transp. Syst. | 2022 |
| [65](https://doi.org/10.1109/TITS.2020.3046646) | Interpretable End-to-End Urban Autonomous Driving With Latent Deep Reinforcement Learning | IEEE Trans. Intell. Transp. Syst. | 2022 |
| [66](https://doi.org/10.1109/TPAMI.2021.3103132) | Continuous Action Reinforcement Learning From a Mixture of Interpretable Experts | IEEE Trans. Pattern Anal. Mach. Intell. | 2022 |
| [67](https://doi.org/10.1109/TPAMI.2020.3037898) | Self-Supervised Discovering of Interpretable Features for Reinforcement Learning | IEEE Trans. Pattern Anal. Mach. Intell. | 2022 |
| [68](https://doi.org/10.1109/TPAMI.2021.3133717) | Temporal-Spatial Causal Interpretations for Vision-Based Reinforcement Learning | IEEE Trans. Pattern Anal. Mach. Intell. | 2022 |
| [69](https://doi.org/10.1109/TVCG.2021.3076749) | Visual Analytics for RNN-Based Deep Reinforcement Learning | IEEE Trans. Vis. Comput. Graph. | 2022 |
| [70](https://doi.org/10.1109/TCYB.2022.3180664) | Toward Interpretable-AI Policies Using Evolutionary Nonlinear Decision Trees for Discrete-Action Systems | IEEE Transactions on Cybernetics | 2022 |
| [71](https://doi.org/10.1109/TNNLS.2022.3184956) | Understanding via Exploration: Discovery of Interpretable Features With Deep Reinforcement Learning | IEEE Transactions on Neural Networks and Learning Systems | 2022 |
| [72](https://arxiv.org/abs/2201.07749) | Summarising and Comparing Agent Dynamics with Contrastive Spatiotemporal Abstraction | IJCAI Workshop on XAI | 2022 |
| [73](https://doi.org/10.1007/S12650-021-00793-9) | ACMViz: a visual analytics approach to understand DRL-based autonomous control model | J. Vis. | 2022 |
| [74](https://doi.org/10.1007/978-3-031-10986-7_16) | Incorporating Explanations to Balance the Exploration and Exploitation of Deep Reinforcement Learning | KSEM | 2022 |
| [75](https://doi.org/10.1007/978-3-031-10986-7_44) | Towards Explainable Reinforcement Learning Using Scoring Mechanism Augmented Agents | KSEM | 2022 |
| [76](http://papers.nips.cc/paper_files/paper/2022/hash/dbef234be68d8b170240511639610fd1-Abstract-Conference.html) | Explainable Reinforcement Learning via Model Transforms | NeurIPS | 2022 |
| [77](http://papers.nips.cc/paper_files/paper/2022/hash/7dd309df03d37643b96f5048b44da798-Abstract-Conference.html) | GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis | NeurIPS | 2022 |
| [78](http://papers.nips.cc/paper_files/paper/2022/hash/672e44a114a41d5f34b97459877c083d-Abstract-Conference.html) | Inherently Explainable Reinforcement Learning in Natural Language | NeurIPS | 2022 |
| [79](http://papers.nips.cc/paper_files/paper/2022/hash/b157cfde6794e93b2353b9712bbd45a5-Abstract-Conference.html) | Non-Markovian Reward Modelling from Trajectory Labels via Interpretable Multiple Instance Learning | NeurIPS | 2022 |
| [80](http://papers.nips.cc/paper_files/paper/2022/hash/ae5bf4f35236240c9460e761c60fa53d-Abstract-Conference.html) | ProtoX: Explaining a Reinforcement Learning Agent via Prototyping | NeurIPS | 2022 |
| [81](https://doi.org/10.48550/ARXIV.2211.07719) | (When) Are Contrastive Explanations of Reinforcement Learning Helpful? | NeurIPS workshop on HiLL | 2022 |
| [82](https://doi.org/10.1016/J.NEUNET.2022.03.022) | MoËT: Mixture of Expert Trees and its application to verifiable reinforcement learning | Neural Networks | 2022 |
| [83](https://doi.org/10.1016/J.NEUCOM.2022.04.005) | Analysing deep reinforcement learning agents trained with domain randomisation | Neurocomputing | 2022 |
| [84](https://doi.org/10.1109/PACIFICVIS53943.2022.00020) | Why? Why not? When? Visual Explanations of Agent Behaviour in Reinforcement Learning | PacificVis | 2022 |
| [85](https://doi.org/10.1016/J.PATCOG.2021.108421) | Driving behavior explanation with multi-level fusion | Pattern Recognit. | 2022 |
| [86](https://doi.org/10.1073/pnas.2206625119) | Acquisition of chess knowledge in AlphaZero | Proc. Natl. Acad. Sci. U.S.A. | 2022 |
| [87](https://doi.org/10.15607/RSS.2022.XVIII.068) | Learning Interpretable, High-Performing Policies for Autonomous Driving | Robotics: Science and Systems | 2022 |
| [88](https://doi.org/10.1007/S10270-021-00952-4) | Event-driven temporal models for explanations - ETeMoX: explaining reinforcement learning | Softw. Syst. Model. | 2022 |
| [89](https://doi.org/10.1111/TOPS.12573) | Toward a Psychology of Deep Reinforcement Learning Agents Using a Cognitive Architecture | Top. Cogn. Sci. | 2022 |
| [90](https://doi.org/10.1609/AAAI.V35I9.16935) | DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning | AAAI | 2021 |
| [91](https://doi.org/10.1609/AAAI.V35I11.17192) | Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods | AAAI | 2021 |
| [92](https://doi.org/10.1609/AAAI.V35I13.17360) | TripleTree: A Versatile Interpretable Representation of Black Box Agents and their Environments | AAAI | 2021 |
| [93](https://ojs.aaai.org/index.php/AIIDE/article/view/18894) | Explaining Deep Reinforcement Learning Agents in the Atari Domain through a Surrogate Model | AIIDE | 2021 |
| [94](https://doi.org/10.1080/01691864.2021.1946423) | A framework of explanation generation toward reliable autonomous robots | Adv. Robotics | 2021 |
| [95](https://doi.org/10.1016/j.ast.2021.107052) | Explainable Deep Reinforcement Learning for UAV autonomous path planning | Aerospace Science and Technology | 2021 |
| [96](https://doi.org/10.1002/ail2.52) | Explaining robot policies | Applied AI Letters | 2021 |
| [97](https://doi.org/10.1016/J.ARTINT.2021.103455) | Counterfactual state explanations for reinforcement learning agents via generative deep learning | Artif. Intell. | 2021 |
| [98](https://doi.org/10.1016/J.ARTINT.2021.103571) | Local and global explanations of agent behavior: Integrating strategy summaries with saliency maps | Artif. Intell. | 2021 |
| [99](https://doi.org/10.1145/3459637.3482494) | XPM: An Explainable Deep Reinforcement Learning Framework for Portfolio Management | CIKM | 2021 |
| [100](https://doi.org/10.1109/COG52621.2021.9618999) | Interactive Explanations: Diagnosis and Repair of Reinforcement Learning Based Agent Behaviors | CoG | 2021 |
| [101](https://doi.org/10.48550/arXiv.2011.07553) | CDT: Cascading Decision Trees for Explainable Reinforcement Learning | CoRR | 2021 |
| [102](https://arxiv.org/abs/2112.09462) | Contrastive Explanations for Comparing Preferences of Reinforcement Learning Agents | CoRR | 2021 |
| [103](https://doi.org/10.23919/ECC54610.2021.9655007) | Approximating a deep reinforcement learning docking agent using linear model trees | ECC | 2021 |
| [104](https://doi.org/10.23919/ECC54610.2021.9654850) | Robotic Lever Manipulation using Hindsight Experience Replay and Shapley Additive Explanations | ECC | 2021 |
| [105](https://doi.org/10.1007/978-3-030-86520-7_38) | Off-Policy Differentiable Logic Reinforcement Learning | ECML PKDD | 2021 |
| [106](https://doi.org/10.18653/V1/2021.EMNLP-MAIN.283) | Neuro-Symbolic Reinforcement Learning with First-Order Logic | EMNLP | 2021 |
| [107](https://doi.org/10.5220/0010256208740881) | Explainable Reinforcement Learning for Longitudinal Control | ICAART | 2021 |
| [108](https://doi.org/10.1145/3490354.3494415) | Explainable deep reinforcement learning for portfolio management: an empirical approach | ICAIF | 2021 |
| [109](https://doi.org/10.1109/ICAR53236.2021.9659472) | Explainable Reinforcement Learning for Human-Robot Collaboration | ICAR | 2021 |
| [110](https://doi.org/10.1109/ICCV48922.2021.00752) | DRIVE: Deep Reinforced Accident Anticipation with Visual Explanation | ICCV | 2021 |
| [111](https://openreview.net/forum?id=Ud3DSz72nYR) | Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions | ICLR | 2021 |
| [112](https://openreview.net/forum?id=unI5ucw_Jk) | Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning | ICLR | 2021 |
| [113](https://openreview.net/forum?id=h0de3QWtGG) | Learning "What-if" Explanations for Sequential Decision-Making | ICLR | 2021 |
| [114](http://proceedings.mlr.press/v139/landajuela21a.html) | Discovering symbolic policies with deep reinforcement learning | ICML | 2021 |
| [115](http://proceedings.mlr.press/v139/danesh21a.html) | Re-understanding Finite-State Representations of Recurrent Policy Networks | ICML | 2021 |
| [116](https://doi.org/10.1007/978-3-030-79457-6_15) | Explainable Reinforcement Learning with the Tsetlin Machine | IEA/AIE | 2021 |
| [117](https://doi.org/10.1109/ACCESS.2021.3100007) | A Blood Glucose Control Framework Based on Reinforcement Learning With Safety and Interpretability: In Silico Validation | IEEE Access | 2021 |
| [118](https://doi.org/10.1109/ACCESS.2021.3119000) | Symbolic Regression Methods for Reinforcement Learning | IEEE Access | 2021 |
| [119](https://doi.org/10.1109/LRA.2021.3068906) | Efficient Robotic Object Search Via HIEM: Hierarchical Policy Learning With Intrinsic-Extrinsic Modeling | IEEE Robotics Autom. Lett. | 2021 |
| [120](https://doi.org/10.1109/LRA.2021.3091885) | Learning to Discover Task-Relevant Features for Interpretable Reinforcement Learning | IEEE Robotics Autom. Lett. | 2021 |
| [121](https://doi.org/10.1109/TFUZZ.2020.2999776) | Explaining Deep Learning Models Through Rule-Based Approximation and Visualization | IEEE Trans. Fuzzy Syst. | 2021 |
| [122](https://doi.org/10.1109/TVT.2021.3098321) | Interpretable Decision-Making for Autonomous Vehicles at Highway On-Ramps With Latent Space Reinforcement Learning | IEEE Trans. Veh. Technol. | 2021 |
| [123](https://doi.org/10.1016/j.ifacol.2021.10.086) | Explainable AI methods on a deep reinforcement learning agent for automatic docking | IFAC-PapersOnLine | 2021 |
| [124](https://doi.org/10.1109/IJCNN52387.2021.9534363) | Visual Explanation using Attention Mechanism in Actor-Critic-based Deep Reinforcement Learning | IJCNN | 2021 |
| [125](https://doi.org/10.1007/978-3-030-97454-1_11) | Programmatic Policy Extraction by Iterative Local Search | ILP | 2021 |
| [126](https://doi.org/10.1109/IROS51168.2021.9636594) | Explaining the Decisions of Deep Policy Networks for Robotic Manipulations | IROS | 2021 |
| [127](https://doi.org/10.1109/IROS51168.2021.9636759) | XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision Trees | IROS | 2021 |
| [128](https://doi.org/10.1109/ITSC48978.2021.9565053) | Mixed Autonomous Supervision in Traffic Signal Control | ITSC | 2021 |
| [129](https://doi.org/10.1109/IV48863.2021.9575328) | Can You Trust Your Autonomous Car? Interpretable and Verifiably Safe Reinforcement Learning | IV | 2021 |
| [130](https://doi.org/10.3390/jmse9111178) | Explaining a Deep Reinforcement Learning Docking Agent Using Linear Model Trees with User Adapted Visualization | Journal of Marine Science and Engineering | 2021 |
| [131](https://doi.org/10.3837/TIIS.2021.03.003) | Visual Analysis of Deep Q-network | KSII Trans. Internet Inf. Syst. | 2021 |
| [132](https://doi.org/10.1007/S10994-021-05963-2) | Automatic discovery of interpretable planning strategies | Mach. Learn. | 2021 |
| [133](https://proceedings.neurips.cc/paper/2021/hash/65c89f5a9501a04c073b354f03791b1f-Abstract.html) | EDGE: Explaining Deep Reinforcement Learning Policies | NeurIPS | 2021 |
| [134](https://proceedings.neurips.cc/paper/2021/hash/a35fe7f7fe8217b4369a0af4244d1fca-Abstract.html) | Learning Tree Interpretation from Object Representation for Deep Reinforcement Learning | NeurIPS | 2021 |
| [135](https://proceedings.neurips.cc/paper/2021/hash/d37124c4c79f357cb02c655671a432fa-Abstract.html) | Learning to Synthesize Programs as Interpretable and Generalizable Policies | NeurIPS | 2021 |
| [136](https://proceedings.neurips.cc/paper/2021/hash/d58e2f077670f4de9cd7963c857f2534-Abstract.html) | Machine versus Human Attention in Deep Reinforcement Learning Tasks | NeurIPS | 2021 |
| [137](https://arxiv.org/abs/2106.03775) | Explainable Artificial Intelligence (XAI) for Increasing User Trust in Deep Reinforcement Learning Driven Autonomous Systems | NeurIPS Workshop on Deep RL | 2021 |
| [138](https://arxiv.org/abs/2101.03309) | Identifying Decision Points for Safe and Interpretable Reinforcement Learning in Hypotension Treatment | NeurIPS Workshop on Machine Learning for Health | 2021 |
| [139](https://doi.org/10.1109/SMC52423.2021.9658917) | Feature-Based Interpretable Reinforcement Learning based on State-Transition Models | SMC | 2021 |
| [140](https://doi.org/10.1109/SSCI50451.2021.9660048) | A co-evolutionary approach to interpretable reinforcement learning in environments with continuous action spaces | SSCI | 2021 |
| [141](https://doi.org/10.1109/SSCI50451.2021.9659552) | Interpretable AI Agent Through Nonlinear Decision Trees for Lane Change Problem | SSCI | 2021 |
| [142](https://doi.org/10.1109/SSCI50451.2021.9660192) | Learning Sparse Evidence-Driven Interpretation to Understand Deep Reinforcement Learning Agents | SSCI | 2021 |
| [143](https://doi.org/10.1609/AAAI.V34I03.5631) | Explainable Reinforcement Learning through a Causal Lens | AAAI | 2020 |
| [144](https://ceur-ws.org/Vol-2600/short4.pdf) | Attribution-based Salience Method towards Interpretable Reinforcement Learning | AAAI-MAKE | 2020 |
| [145](https://doi.org/10.5555/3398761.3398777) | Learning an Interpretable Traffic Signal Control Policy | AAMAS | 2020 |
| [146](http://proceedings.mlr.press/v108/silva20a.html) | Optimization Methods for Interpretable Differentiable Decision Trees Applied to Reinforcement Learning | AISTATS | 2020 |
| [147](https://doi.org/10.1016/J.ARTINT.2020.103367) | Interestingness elements for explainable reinforcement learning: Understanding agents' capabilities and limitations | Artif. Intell. | 2020 |
| [148](https://doi.org/10.1007/S10458-020-09451-0) | Model primitives for hierarchical lifelong reinforcement learning | Auton. Agents Multi Agent Syst. | 2020 |
| [149](https://doi.org/10.1007/978-3-030-63710-1_12) | Understanding the Behavior of Reinforcement Learning Agents | BIOMA | 2020 |
| [150](https://doi.org/10.1109/BIGDATA50022.2020.9377735) | Methodology for Interpretable Reinforcement Learning Model for HVAC Energy Control | Big Data | 2020 |
| [151](https://doi.org/10.1109/CVPRW50498.2020.00178) | Explaining Autonomous Driving by Learning End-to-End Visual Attention | CVPRW | 2020 |
| [152](https://arxiv.org/abs/2012.05862) | Understanding Learned Reward Functions | CoRR | 2020 |
| [153](https://doi.org/10.1007/s40747-020-00175-y) | Interpretable policy derivation for reinforcement learning based on evolutionary feature synthesis | Complex & Intelligent Systems | 2020 |
| [154](https://doi.org/10.1111/CGF.13962) | DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning | Comput. Graph. Forum | 2020 |
| [155](https://doi.org/10.23915/distill.00029) | Understanding RL Vision | Distill | 2020 |
| [156](https://doi.org/10.1016/J.ENGAPPAI.2020.103559) | Interpretable policies for reinforcement learning by empirical fuzzy sets | Eng. Appl. Artif. Intell. | 2020 |
| [157](https://doi.org/10.1145/3377930.3389847) | Neuroevolution of self-interpretable agents | GECCO | 2020 |
| [158](https://doi.org/10.5220/0008913303700377) | Topological Visualization Method for Understanding the Landscape of Value Functions and Structure of the State Space in Reinforcement Learning | ICAART | 2020 |
| [159](https://doi.org/10.1007/978-3-030-61609-0_29) | Identifying Critical States by the Action-Based Variance of Expected Return | ICANN | 2020 |
| [160](https://ojs.aaai.org/index.php/ICAPS/article/view/6671) | TLdR: Policy Summarization for Factored SSP Problems Using Temporal Abstractions | ICAPS | 2020 |
| [161](https://openreview.net/forum?id=SJgzLkBKPB) | Explain Your Move: Understanding Agent Actions Using Specific and Relevant Feature Attribution | ICLR | 2020 |
| [162](https://openreview.net/forum?id=rkl3m1BFDB) | Exploratory Not Explanatory: Counterfactual Analysis of Saliency Maps for Deep Reinforcement Learning | ICLR | 2020 |
| [163](https://openreview.net/forum?id=rylvYaNYDH) | Finding and Visualizing Weaknesses of Deep Reinforcement Learning Agents | ICLR | 2020 |
| [164](http://proceedings.mlr.press/v119/gottesman20a.html) | Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions | ICML | 2020 |
| [165](https://doi.org/10.1109/LRA.2020.3011912) | Deep Reinforcement Learning for Safe Local Planning of a Ground Vehicle in Unknown Rough Terrain | IEEE Robotics Autom. Lett. | 2020 |
| [166](https://doi.org/10.1587/TRANSINF.2019EDP7170) | Towards Interpretable Reinforcement Learning with State Abstraction Driven by External Knowledge | IEICE Trans. Inf. Syst. | 2020 |
| [167](https://doi.org/10.1109/IJCNN48605.2020.9207648) | Improved Policy Extraction via Online Q-Value Distillation | IJCNN | 2020 |
| [168](https://doi.org/10.1109/IJCNN48605.2020.9206675) | Visualization of topographical internal representation of learning robots | IJCNN | 2020 |
| [169](https://doi.org/10.1007/s12008-020-00717-1) | Explainable navigation system using fuzzy reinforcement learning | IJIDeM | 2020 |
| [170](https://doi.org/10.1109/ITSC45102.2020.9294213) | Explainability of Intelligent Transportation Systems using Knowledge Compilation: a Traffic Light Controller Case | ITSC | 2020 |
| [171](https://doi.org/10.1145/3394486.3403186) | xGAIL: Explainable Generative Adversarial Imitation Learning for Explainable Human Decision Analysis | KDD | 2020 |
| [172](https://proceedings.neurips.cc/paper/2020/hash/d5ab8dc7ef67ca92e41d730982c5c602-Abstract.html) | What Did You Think Would Happen? Explaining Agent Behaviour through Intended Outcomes | NeurIPS | 2020 |
| [173](https://arxiv.org/abs/2011.09004) | Explaining Conditions for Reinforcement Learning Behaviors from Real and Imagined Data | NeurIPS Workshop on Challenges of Real-World RL | 2020 |
| [174](https://doi.org/10.1109/PACIFICVIS48177.2020.7127) | DynamicsExplorer: Visual Analytics for Robot Control Tasks involving Dynamics and LSTM-based Control Policies | PacificVis | 2020 |
| [175](https://doi.org/10.1016/J.ROBOT.2020.103568) | Combining reinforcement learning with rule-based controllers for transparent and general decision-making in autonomous driving | Robotics Auton. Syst. | 2020 |
| [176](https://doi.org/10.1007/978-3-030-73959-1_16) | Modelling Agent Policies with Interpretable Imitation Learning | TAILOR | 2020 |
| [177](https://doi.org/10.1007/978-3-031-04083-2_11) | Interpretable, Verifiable, and Robust Reinforcement Learning via Program Synthesis | xxAI - Beyond Explainable AI | 2020 |
| [178](https://doi.org/10.1609/AAAI.V33I01.33012514) | Generation of Policy-Level Explanations for Reinforcement Learning | AAAI | 2019 |
| [179](https://doi.org/10.1609/AAAI.V33I01.33012970) | SDRL: Interpretable and Data-Efficient Deep Reinforcement Learning Leveraging Symbolic Planning | AAAI | 2019 |
| [180](https://doi.org/10.1609/AAAI.V33I01.33014561) | Towards Better Interpretability in Deep Q-Networks | AAAI | 2019 |
| [181](https://dl.acm.org/doi/10.5555/3306127.3332017) | Toward Robust Policy Summarization | AAMAS | 2019 |
| [182](http://proceedings.mlr.press/v101/yang19a.html) | Towards Governing Agent's Efficacy: Action-Conditional β-VAE for Deep Transparent Reinforcement Learning | ACML | 2019 |
| [183](https://doi.org/10.1007/978-3-030-35288-2_6) | Memory-Based Explainable Reinforcement Learning | AI | 2019 |
| [184](https://doi.org/10.1007/S10458-019-09418-W) | Summarizing agent strategies | Auton. Agents Multi Agent Syst. | 2019 |
| [185](https://doi.org/10.1007/S10514-018-9771-0) | Enabling robots to communicate their objectives | Auton. Robots | 2019 |
| [186](https://doi.org/10.1109/CIG.2019.8847950) | Visualization of Deep Reinforcement Learning using Grad-CAM: How AI Plays Atari Games?
| CoG | 2019 | 241 | | [187](https://aaai.org/ocs/index.php/FLAIRS/FLAIRS19/paper/view/18275) | Explaining Reward Functions in Markov Decision Processes | FLAIRS | 2019 | 242 | | [188](https://doi.org/10.1109/HRI.2019.8673104) | Explanation-Based Reward Coaching to Improve Human Performance via Reinforcement Learning | HRI | 2019 | 243 | | [189](https://doi.org/10.1109/ICCVW.2019.00522) | Free-Lunch Saliency via Attention in Atari Agents | ICCVW | 2019 | 244 | | [190](https://openreview.net/forum?id=HkxaFoC9KQ) | Deep reinforcement learning with relational inductive biases | ICLR | 2019 | 245 | | [191](https://openreview.net/forum?id=S1gOpsCctm) | Learning Finite State Representations of Recurrent Policy Networks | ICLR | 2019 | 246 | | [192](http://proceedings.mlr.press/v97/jiang19a.html) | Neural Logic Reinforcement Learning | ICML | 2019 | 247 | | [193](https://doi.org/10.1109/ICMLA.2019.00041) | Interpretable Approximation of a Deep Reinforcement Learning Agent as a Set of If-Then Rules | ICMLA | 2019 | 248 | | [194](https://doi.org/10.1109/ICRA.2019.8794437) | Semantic Predictive Control for Explainable and Efficient Policy Learning | ICRA | 2019 | 249 | | [195](https://doi.org/10.1109/TVCG.2018.2864504) | DQNViz: A Visual Analytics Approach to Understand Deep Q-Networks | IEEE Trans. Vis. Comput. Graph. 
| 2019 | 250 | | [196](https://doi.org/10.1007/978-3-030-37442-6_11) | Visualizing Deep Q-Learning to Understanding Behavior of Swarm Robotic System | IES | 2019 | 251 | | [197](https://doi.org/10.24963/IJCAI.2019/194) | Exploring Computational User Models for Agent Policy Summarization | IJCA | 2019 | 252 | | [198](https://doi.org/10.24963/IJCAI.2019/184) | Explaining Reinforcement Learning to Mere Mortals: An Empirical Study | IJCAI | 2019 | 253 | | [199](http://arxiv.org/abs/1909.12969) | Counterfactual States for Atari Agents via Generative Deep Learning | IJCAI Workshop on XAI | 2019 | 254 | | [200](https://sites.google.com/view/xai2019/home) | Distilling Deep Reinforcement Learning Policies in Soft Decision Trees | IJCAI Workshop on XAI | 2019 | 255 | | [201](https://doi.org/10.1109/IROS40897.2019.8968488) | Dot-to-Dot: Explainable Hierarchical Reinforcement Learning for Robotic Manipulation | IROS | 2019 | 256 | | [202](https://doi.org/10.1109/ITSC.2019.8917519) | Reinforcement Learning with Explainability for Traffic Signal Control | ITSC | 2019 | 257 | | [203](https://ceur-ws.org/Vol-2327/IUI19WS-ExSS2019-1.pdf) | Interestingness Elements for Explainable Reinforcement Learning through Introspection | IUI Workshops | 2019 | 258 | | [204](https://web.engr.oregonstate.edu/~afern/papers/reward_decomposition__workshop_final.pdf) | Explainable Reinforcement Learning via Reward Decomposition | JCAI Workshop on XAI | 2019 | 259 | | [205](https://doi.org/10.1007/978-3-030-30179-8_16) | Enhancing Explainability of Deep Reinforcement Learning Through Selective Layer-Wise Relevance Propagation | KI | 2019 | 260 | | [206](https://proceedings.neurips.cc/paper/2019/hash/5a44a53b7d26bb1e54c05222f186dcfb-Abstract.html) | Imitation-Projected Programmatic Reinforcement Learning | NeurIPS | 2019 | 261 | | [207](https://proceedings.neurips.cc/paper/2019/hash/e9510081ac30ffa83f10b68cde1cac07-Abstract.html) | Towards Interpretable Reinforcement Learning Using Attention Augmented 
Agents | NeurIPS | 2019 | 262 | | [208](https://doi.org/10.1109/RO-MAN46459.2019.8956301) | Verbal Explanations for Deep Reinforcement Learning Neural Networks with Attention on Extracted Features | RO-MAN | 2019 | 263 | | [209](https://doi.org/10.1126/SCIROBOTICS.AAY6276) | A formal methods approach to interpretable reinforcement learning for robotic planning | Sci. Robotics | 2019 | 264 | | [210](http://dl.acm.org/citation.cfm?id=3237869) | HIGHLIGHTS: Summarizing Agent Behavior to People | AAMAS | 2018 | 265 | | [211](https://doi.org/10.1145/3278721.3278736) | Rationalization: A Neural Machine Translation Approach to Generating Natural Language Explanations | AIES | 2018 | 266 | | [212](https://doi.org/10.1145/3278721.3278776) | Transparency and Explanation in Deep Reinforcement Learning Neural Networks | AIES | 2018 | 267 | | [213](https://doi.org/10.1007/978-3-030-31978-6_12) | Visual Rationalizations in Deep Reinforcement Learning for Atari Games | BNAIC | 2018 | 268 | | [214](https://doi.org/10.1007/978-3-030-01216-8_35) | Textual Explanations for Self-Driving Vehicles | ECCV | 2018 | 269 | | [215](https://doi.org/10.1007/978-3-030-10928-8_25) | Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees | ECML PKDD | 2018 | 270 | | [216](https://doi.org/10.1016/J.ENGAPPAI.2018.09.007) | Interpretable policies for reinforcement learning by genetic programming | Eng. Appl. Artif. Intell. 
| 2018 | 271 | | [217](https://doi.org/10.1145/3205651.3208277) | Generating interpretable fuzzy controllers using particle swarm optimization and genetic programming | GECCO | 2018 | 272 | | [218](https://openreview.net/forum?id=SJJQVZW0b) | Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning | ICLR | 2018 | 273 | | [219](http://proceedings.mlr.press/v80/verma18a.html) | Programmatically Interpretable Reinforcement Learning | ICML | 2018 | 274 | | [220](http://proceedings.mlr.press/v80/greydanus18a.html) | Visualizing and Understanding Atari Agents | ICML | 2018 | 275 | | [221](https://doi.org/10.1109/ICMLA.2018.00095) | Deep Reinforcement Learning Monitor for Snapshot Recording | ICMLA | 2018 | 276 | | [222](http://arxiv.org/abs/1807.08706) | Contrastive Explanations for Reinforcement Learning in terms of Expected Consequences | IJCAI Workshop on XAI | 2018 | 277 | | [223](https://par.nsf.gov/biblio/10096985) | Explaining Deep Adaptive Programs via Reward Decomposition | IJCAI/ECAI Workshop XAI | 2018 | 278 | | [224](https://doi.org/10.1109/IROS.2018.8593649) | Establishing Appropriate Trust via Critical States | IROS | 2018 | 279 | | [225](https://proceedings.neurips.cc/paper/2018/hash/96f2b50b5d3613adf9c27049b2a888c7-Abstract.html) | Unsupervised Video Object Segmentation for Deep Reinforcement Learning | NeurIPS | 2018 | 280 | | [226](https://proceedings.neurips.cc/paper/2018/hash/e6d8545daa42d5ced125a4bf747b3688-Abstract.html) | Verifiable Reinforcement Learning via Policy Extraction | NeurIPS | 2018 | 281 | | [227](https://doi.org/10.1109/SSCI.2018.8628887) | Visual Sparse Bayesian Reinforcement Learning: A Framework for Interpreting What an Agent Has Learned | SSCI | 2018 | 282 | | [228](https://doi.org/10.1016/J.ENGAPPAI.2017.07.005) | Particle swarm optimization for generating interpretable fuzzy reinforcement learning policies | Eng. Appl. Artif. Intell. 
| 2017 | 283 | | [229](https://doi.org/10.1145/3125739.3125746) | Autonomous Self-Explanation of Behavior for Interactive Reinforcement Learning Agents | HAI | 2017 | 284 | | [230](https://doi.org/10.1145/2909824.3020233) | Improving Robot Controller Transparency Through Autonomous Policy Explanation | HRI | 2017 | 285 | | [231](https://doi.org/10.1109/ICCV.2017.320) | Interpretable Learning for Self-Driving Cars by Visualizing Causal Attention | ICCV | 2017 | 286 | | [232](https://doi.org/10.1007/978-3-319-70087-8_11) | Application of Instruction-Based Behavior Explanation to a Reinforcement Learning Agent with Changing Policy | ICONIP | 2017 | 287 | | [233](http://proceedings.mlr.press/v48/zahavy16.html) | Graying the black box: Understanding DQNs | ICML | 2016 | 288 | --------------------------------------------------------------------------------