├── .gitignore
├── LICENSE
├── README_ZH.md
└── README.md

/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.

"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner.
For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted.
If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:

(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License.
You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.

You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE.
You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information.
(Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

--------------------------------------------------------------------------------
/README_ZH.md:
--------------------------------------------------------------------------------

[**English**](https://github.com/destwang/CTCResources) | [**中文**](https://github.com/destwang/CTCResources/blob/main/README_ZH.md)
# Chinese Text Correction Resources

Resources for Chinese Text Correction (CTC), compiled and maintained by Baoxin Wang (王宝鑫) and Honghong Zhao (赵红红) of the Joint Laboratory of HIT and iFLYTEK Research (HFL).

## Contents

- [Chinese Text Correction Resources](#chinese-text-correction-resources)
- [Contents](#contents)
- [Definitions](#definitions)
- [Chinese Spelling Check (CSC)](#chinese-spelling-check-csc)
- [Grammatical Error Correction (GEC)](#grammatical-error-correction-gec)
- [Shared Tasks](#shared-tasks)
- [CAIL 2022 Document Proofreading (2022.09.01-2022.11.15)](#cail-2022-document-proofreading-20220901-20221115)
- [CCL2022-CLTC Chinese Learner Text Correction Evaluation (2022.06.05-2022.09.10)](#ccl2022-cltc-chinese-learner-text-correction-evaluation-20220605-20220910)
- [Results on Public Test Sets](#results-on-public-test-sets)
- [SIGHAN 2015](#sighan-2015)
- [Papers](#papers)
- [Chinese Spelling Check Papers](#chinese-spelling-check-papers)
- [Grammatical Error Correction Papers](#grammatical-error-correction-papers)
- [Datasets](#数据集)
- [CTC 2021:](#ctc-2021-)
- [Others:](#其他)
- [Systems & API](#系统--api)
- [Other Resources](#其他资源)
- [Related Articles](#相关文章)
- [Shared Tasks](#评测比赛-1)


## Definitions
### Chinese Spelling Check (CSC)
The Chinese Spelling Check task detects and corrects spelling errors (misused characters and misused words) in Chinese text.

### Grammatical Error Correction (GEC)
The Grammatical Error Correction task corrects different types of errors in text, including spelling, punctuation, and grammatical errors.

## Shared Tasks
### CAIL 2022 Document Proofreading (2022.09.01-2022.11.15)![](https://img.shields.io/badge/hot-red.svg)
The [CAIL 2022 document proofreading task](http://cail.cipsc.org.cn/task2.html?raceID=2&cail_tag=2022) aims to use automated text proofreading to help judicial personnel detect and correct errors in legal documents. The task covers four error types found in legal documents: misused characters, redundant text, missing text, and word-order errors.

### CCL2022-CLTC Chinese Learner Text Correction Evaluation (2022.06.05-2022.09.10)

The [Chinese Learner Text Correction task](https://github.com/blcuicall/CCL2022-CLTC) (CLTC) aims to automatically detect and correct punctuation, spelling, grammatical, and semantic errors in text written by learners of Chinese, producing correct sentences that preserve the original meaning.


## Results on Public Test Sets
### SIGHAN 2015
*The commonly used character-level evaluation script has a bug, so we have not compiled character-level results. We recommend that researchers either **not** use character-level evaluation or fix the script before using it.*

In the tables below, D and C denote detection and correction; P, R, and F denote precision, recall, and F1.

* Models without further pretraining

| Model | D-P | D-R | D-F | C-P | C-R | C-F |
| - | - | - | - | - | - | - |
| [FASPell](https://github.com/iqiyi/FASPell) | 67.6 | 60.0 | 63.5 | 66.6 | 59.1 | 62.6 |
| BERT | 73.7 | 78.2 | 75.9 | 70.9 | 75.2 | 73.0 |
| RoBERTa | 74.7 | 77.3 | 76.0 | 72.1 | 74.5 | 73.3 |
| [SpellGCN](https://github.com/ACL2020SpellGCN/SpellGCN) | 74.8 | **80.7** | 77.7 | 72.1 | **77.7** | 74.8 (75.9) |
| [DCN](https://github.com/destwang/DCN) | **76.6** | 79.8 | **78.2** | **74.2** | 77.3 | **75.7** |


* Models with further pretraining

| Model | D-P | D-R | D-F | C-P | C-R | C-F |
| - | - | - | - | - | - | - |
| [BERT_CRS + GAD](https://aclanthology.org/2021.findings-acl.122.pdf) | 75.6 | 80.4 | 77.9 | 73.2 | 77.8 | 75.4 |
| [DCN-pretrain](https://github.com/destwang/DCN) | 77.1 | 80.9 | 79.0 | 74.5 | 78.2 | 76.3 |
| [REALISE](https://github.com/DaDaMrX/ReaLiSe) | 77.3 | 81.3 | 79.3 | 75.9 | 79.9 | 77.8 |
| [PLOME](https://github.com/liushulinle/PLOME) | 77.4 | 81.5 | 79.4 | 75.3 | 79.3 | 77.2 |
| [Soft-Masked BERT](https://aclanthology.org/2020.acl-main.82.pdf) | 73.7 | 73.2 | 73.5 | 66.7 | 66.2 | 66.4 |
| [Soft-Masked BERT_SSCL](https://aclanthology.org/2021.emnlp-main.281.pdf) | **86.3** | 72.5 | 78.8 | **85.2** | 66.0 | 74.4 |
| [MLM-phonetics](https://aclanthology.org/2021.findings-acl.198.pdf) | 77.5 | **83.1** | 80.2 | 74.9 | 80.2 | 77.5 |
| [MDCSpell](https://aclanthology.org/2022.findings-acl.98.pdf) | 80.8 | 80.6 | **80.7** | 78.4 | 78.2 | 78.3 |
| [ECOPO(BERT)](https://aclanthology.org/2022.findings-acl.252.pdf) | 78.2 | 82.3 | 80.2 | 76.6 | 80.4 | 78.4 |
| [ECOPO(REALISE)](https://aclanthology.org/2022.findings-acl.252.pdf) | 77.5 | 82.6 | 80.0 | 76.1 | **81.2** | **78.5** |


## Papers
### Chinese Spelling Check Papers
> ### 2022
**MDCSpell: A Multi-task Detector-Corrector Framework for Chinese Spelling Correction**. Findings of ACL 2022.
Chenxi Zhu, Ziqiang Ying, Boyu Zhang, Feng Mao. [[pdf](https://aclanthology.org/2022.findings-acl.98.pdf)]

**CRASpell: A Contextual Typo Robust Approach to Improve Chinese Spelling Correction**. Findings of ACL 2022.
Shulin Liu, Shengkang Song, Tianchi Yue, Tao Yang, Huihui Cai, TingHao Yu, Shengli Sun. [[pdf](https://aclanthology.org/2022.findings-acl.237.pdf)]

**The Past Mistake is the Future Wisdom: Error-driven Contrastive Probability Optimization for Chinese Spell Checking**. Findings of ACL 2022.
Yinghui Li, Qingyu Zhou, Yangning Li, Zhongli Li, Ruiyang Liu, Rongyi Sun, Zizhen Wang, Chao Li, Yunbo Cao, Hai-Tao Zheng. [[pdf](https://aclanthology.org/2022.findings-acl.252.pdf)]

> ### 2021
**PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction**. ACL 2021.
Shulin Liu, Tao Yang, Tianchi Yue, Feng Zhang and Di Wang. [[pdf](https://aclanthology.org/2021.acl-long.233.pdf)], [[code](https://github.com/liushulinle/PLOME)].

**PHMOSpell: Phonological and Morphological Knowledge Guided Chinese Spelling Check**. ACL 2021.
Li Huang, Junjie Li, Weiwei Jiang, Zhiyu Zhang, Minchuan Chen, Shaojun Wang and Jing Xiao. [[pdf](https://aclanthology.org/2021.acl-long.464.pdf)].

**Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models**. ACL 2021.
Chong Li, Cenyuan Zhang, Xiaoqing Zheng and Xuanjing Huang. [[pdf](https://arxiv.org/pdf/2105.14813.pdf)], [[code](https://github.com/FDChongli/TwoWaysToImproveCSC)].

**Dynamic Connected Networks for Chinese Spelling Check**. Findings of ACL 2021.
Baoxin Wang, Wanxiang Che, Dayong Wu, Shijin Wang, Guoping Hu and Ting Liu. [[pdf](https://aclanthology.org/2021.findings-acl.216.pdf)], [[code](https://github.com/destwang/DCN)].

**Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking**. Findings of ACL 2021.
Heng-Da Xu, Zhongli Li, Qingyu Zhou, Chao Li, Zizhen Wang, Yunbo Cao, Heyan Huang and Xian-Ling Mao. [[pdf](https://arxiv.org/pdf/2105.12306.pdf)], [[code](https://github.com/DaDaMrX/ReaLiSe)].

**Global Attention Decoder for Chinese Spelling Error Correction**. Findings of ACL 2021.
Zhao Guo, Yuan Ni, Keqiang Wang, Wei Zhu and Guotong Xie. [[pdf](https://aclanthology.org/2021.findings-acl.122.pdf)].

**Correcting Chinese Spelling Errors with Phonetic Pre-training**. Findings of ACL 2021.
Ruiqing Zhang, Chao Pang, Chuanqiang Zhang, Shuohuan Wang, Zhongjun He, Yu Sun, Hua Wu and Haifeng Wang. [[pdf](https://aclanthology.org/2021.findings-acl.198.pdf)].

**DCSpell: A Detector-Corrector Framework for Chinese Spelling Error Correction**. SIGIR 2021.
Jing Li, Gaosheng Wu, Dafei Yin, Haozhao Wang, Yonggang Wang. [[pdf](https://dl.acm.org/doi/pdf/10.1145/3404835.3463050)].

**An Alignment-Agnostic Model for Chinese Text Error Correction**. EMNLP 2021 short.
Liying Zheng, Yue Deng, Weishun Song, Liang Xu and Jing Xiao.

> ### 2020
**Chunk-based Chinese Spelling Check with Global Optimization**. Findings of EMNLP 2020.
Zuyi Bao, Chen Li and Rui Wang. [[pdf](https://aclanthology.org/2020.findings-emnlp.184.pdf)].

**SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check**. ACL 2020.
Xingyi Cheng, Weidi Xu, Kunlong Chen, Shaohua Jiang, Feng Wang, Taifeng Wang, Wei Chu and Yuan Qi. [[pdf](https://aclanthology.org/2020.acl-main.81.pdf)], [[code](https://github.com/ACL2020SpellGCN/SpellGCN)].

**Spelling Error Correction with Soft-Masked BERT**. ACL 2020.
Shaohua Zhang, Haoran Huang, Jicong Liu and Hang Li. [[pdf](https://aclanthology.org/2020.acl-main.82.pdf)].

> ### 2019
**FASPell: A Fast, Adaptable, Simple, Powerful Chinese Spell Checker Based On DAE-Decoder Paradigm**. EMNLP 2019 Workshop W-NUT.
Yuzhong Hong, Xianguo Yu, Neng He, Nan Liu, Junhui Liu. [[pdf](https://aclanthology.org/D19-5522.pdf)], [[code](https://github.com/iqiyi/FASPell)].

**Confusionset-guided Pointer Networks for Chinese Spelling Check**. ACL 2019.
Dingmin Wang, Yi Tay, Li Zhong. [[pdf](https://aclanthology.org/P19-1578.pdf)], [[code](https://github.com/sunnyqiny/Confusionset-guided-Pointer-Networks-for-Chinese-Spelling-Check)].

### Grammatical Error Correction Papers
> ### 2022

**Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction**. ACL 2022.
Maksym Tarnavskyi, Artem Chernodub, Kostiantyn Omelianchuk. [[pdf](https://aclanthology.org/2022.acl-long.266/)]

**Interpretability for Language Learners Using Example-Based Grammatical Error Correction**. ACL 2022.
Masahiro Kaneko, Sho Takase, Ayana Niwa, Naoaki Okazaki.
[[pdf](https://aclanthology.org/2022.acl-long.496/)]

**Adjusting the Precision-Recall Trade-Off with Align-and-Predict Decoding for Grammatical Error Correction**. ACL 2022 short.
Xin Sun, Houfeng Wang. [[pdf](https://aclanthology.org/2022.acl-short.77/)]

**“Is Whole Word Masking Always Better for Chinese BERT?”: Probing on Chinese Grammatical Error Correction**. Findings of ACL 2022.
Yong Dai, Linyang Li, Cong Zhou, Zhangyin Feng, Enbo Zhao, Xipeng Qiu, Piji Li, Duyu Tang. [[pdf](https://aclanthology.org/2022.findings-acl.1/)]

**Type-Driven Multi-Turn Corrections for Grammatical Error Correction**. Findings of ACL 2022.
Shaopeng Lai, Qingyu Zhou, Jiali Zeng, Zhongli Li, Chao Li, Yunbo Cao, Jinsong Su. [[pdf](https://aclanthology.org/2022.findings-acl.254/)]

**Reusing a Multi-lingual Setup to Bootstrap a Grammar Checker for a Very Low Resource Language without Data**. ComputEL 2022 Workshop.
Inga Lill Sigga Mikkelsen, Linda Wiechetek, Flammie A Pirinen. [[pdf](https://aclanthology.org/2022.computel-1.19/)]


> ### 2021

**Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding**. ACL 2021.
Xin Sun, Tao Ge, Furu Wei, Houfeng Wang. [[pdf](https://aclanthology.org/2021.acl-long.462/)], [[code](https://github.com/AutoTemp/Shallow-Aggressive-Decoding)].

**Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction**. ACL 2021.
Piji Li, Shuming Shi. [[pdf](https://aclanthology.org/2021.acl-long.385/)], [[code](https://github.com/lipiji/TtT)].

**A Simple Recipe for Multilingual Grammatical Error Correction**. ACL 2021 short.
Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn. [[pdf](https://aclanthology.org/2021.acl-short.89/)].

**Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models**. EACL-BEA 2021.
Felix Stahlberg, Shankar Kumar. [[pdf](https://aclanthology.org/2021.bea-1.4/)].

**Document-level grammatical error correction**. EACL-BEA 2021.
Zheng Yuan, Christopher Bryant. [[pdf](https://aclanthology.org/2021.bea-1.8/)].

**Data Strategies for Low-Resource Grammatical Error Correction**. EACL-BEA 2021.
Simon Flachs, Felix Stahlberg, Shankar Kumar. [[pdf](https://aclanthology.org/2021.bea-1.12/)].

**Assessing Grammatical Correctness in Language Learning**. EACL-BEA 2021.
Anisia Katinskaia, Roman Yangarber. [[pdf](https://aclanthology.org/2021.bea-1.15/)].

**Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction**. NAACL 2021.
Zhenghao Liu, Xiaoyuan Yi, Maosong Sun, Liner Yang, Tat-Seng Chua. [[pdf](https://aclanthology.org/2021.naacl-main.429/)].

**Comparison of Grammatical Error Correction Using Back-Translation Models**. NAACL 2021 workshop.
Aomi Koyama, Kengo Hotate, Masahiro Kaneko, Mamoru Komachi. [[pdf](https://aclanthology.org/2021.naacl-srw.16/)].

**LM-Critic: Language Models for Unsupervised Grammatical Error Correction**. EMNLP 2021.
Michihiro Yasunaga, Jure Leskovec and Percy Liang. [[pdf](https://arxiv.org/abs/2109.06822)].

**Multi-Class Grammatical Error Detection for Correction: A Tale of Two Systems**. EMNLP 2021.
Zheng Yuan, Shiva Taslimipoor, Christopher Davis and Christopher Bryant.

> ### 2020

**On the Robustness of Language Encoders against Grammatical Errors**. ACL 2020.
Fan Yin, Quanyu Long, Tao Meng, Kai-Wei Chang. [[pdf](https://aclanthology.org/2020.acl-main.310/)].

**Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction**. ACL 2020.
Masahiro Kaneko, Masato Mita, Shun Kiyono, Jun Suzuki, Kentaro Inui. [[pdf](https://aclanthology.org/2020.acl-main.391/)].

**Grammatical Error Correction Using Pseudo Learner Corpus Considering Learner’s Error Tendency**. ACL 2020 workshop.
Yujin Takahashi, Satoru Katsumata, Mamoru Komachi. [[pdf](https://aclanthology.org/2020.acl-srw.5/)].

**GECToR – Grammatical Error Correction: Tag, Not Rewrite**. ACL-BEA 2020.
Kostiantyn Omelianchuk, Vitaliy Atrasevych, Artem Chernodub, Oleksandr Skurzhanskyi. [[pdf](https://aclanthology.org/2020.bea-1.16/)].

**A Comparative Study of Synthetic Data Generation Methods for Grammatical Error Correction**. ACL-BEA 2020.
Max White, Alla Rozovskaya. [[pdf](https://aclanthology.org/2020.bea-1.21/)].

**Improving Grammatical Error Correction Models with Purpose-Built Adversarial Examples**. EMNLP 2020.
Lihao Wang, Xiaoqing Zheng. [[pdf](https://aclanthology.org/2020.emnlp-main.228/)].

**Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction**. EMNLP 2020.
Mengyun Chen, Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou. [[pdf](https://aclanthology.org/2020.emnlp-main.581/)].

**Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses**. EMNLP 2020.
Simon Flachs, Ophélie Lacroix, Helen Yannakoudakis, Marek Rei, Anders Søgaard. [[pdf](https://aclanthology.org/2020.emnlp-main.680/)].

**Adversarial Grammatical Error Correction**. Findings of EMNLP 2020.
Vipul Raheja, Dimitris Alikaniotis. [[pdf](https://aclanthology.org/2020.findings-emnlp.275/)].

**A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction**. Findings of EMNLP 2020.
Masato Mita, Shun Kiyono, Masahiro Kaneko, Jun Suzuki, Kentaro Inui. [[pdf](https://aclanthology.org/2020.findings-emnlp.26/)].

**Improving Grammatical Error Correction with Machine Translation Pairs**. Findings of EMNLP 2020.
Wangchunshu Zhou, Tao Ge, Chang Mu, Ke Xu, Furu Wei, Ming Zhou. [[pdf](https://aclanthology.org/2020.findings-emnlp.30/)].

**Chinese Grammatical Correction Using BERT-based Pre-trained Model**. AACL 2020.
Hongfei Wang, Michiki Kurosawa, Satoru Katsumata, Mamoru Komachi. [[pdf](https://aclanthology.org/2020.aacl-main.20/)].

**Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model**. AACL 2020.
Satoru Katsumata, Mamoru Komachi. [[pdf](https://aclanthology.org/2020.aacl-main.83/)].

**Generating Diverse Corrections with Local Beam Search for Grammatical Error Correction**. COLING 2020.
Kengo Hotate, Masahiro Kaneko, Mamoru Komachi. [[pdf](https://aclanthology.org/2020.coling-main.193/)].

**Heterogeneous Recycle Generation for Chinese Grammatical Error Correction**. COLING 2020.
Charles Hinson, Hen-Hsen Huang, Hsin-Hsi Chen. [[pdf](https://aclanthology.org/2020.coling-main.199/)].

**Improving Grammatical Error Correction with Data Augmentation by Editing Latent Representation**. COLING 2020.
Zhaohong Wan, Xiaojun Wan, Wenguang Wang. [[pdf](https://aclanthology.org/2020.coling-main.200/)].

**Cross-lingual Transfer Learning for Grammatical Error Correction**. COLING 2020.
Ikumi Yamashita, Satoru Katsumata, Masahiro Kaneko, Aizhan Imankulova, Mamoru Komachi. [[pdf](https://aclanthology.org/2020.coling-main.415/)].

> ### 2019

**Cross-Sentence Grammatical Error Correction**. ACL 2019.
Shamil Chollampatt, Weiqi Wang, Hwee Tou Ng. [[pdf](https://aclanthology.org/P19-1042/)].

**Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study**. ACL 2019.
Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou. [[pdf](https://aclanthology.org/P19-1609/)].

**Controlling Grammatical Error Correction Using Word Edit Rate**. ACL 2019.
Kengo Hotate, Masahiro Kaneko, Satoru Katsumata, Mamoru Komachi. [[pdf](https://aclanthology.org/P19-2020/)].

**Context is Key: Grammatical Error Detection with Contextual Word Representations**. ACL-BEA 2019.
Samuel Bell, Helen Yannakoudakis, Marek Rei. [[pdf](https://aclanthology.org/W19-4410/)].

**The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction**. ACL-BEA 2019.
Dimitris Alikaniotis, Vipul Raheja. [[pdf](https://aclanthology.org/W19-4412/)].

**(Almost) Unsupervised Grammatical Error Correction using Synthetic Comparable Corpus**. ACL-BEA 2019.
Satoru Katsumata, Mamoru Komachi. [[pdf](https://aclanthology.org/W19-4413/)].

**Learning to combine Grammatical Error Corrections**. ACL-BEA 2019.
Yoav Kantor, Yoav Katz, Leshem Choshen, Edo Cohen-Karlik, Naftali Liberman, Assaf Toledo, Amir Menczel, Noam Slonim. [[pdf](https://aclanthology.org/W19-4414/)].

**Erroneous data generation for Grammatical Error Correction**. ACL-BEA 2019.
Shuyao Xu, Jiehao Zhang, Jin Chen, Long Qin. [[pdf](https://aclanthology.org/W19-4415/)].

**The CUED’s Grammatical Error Correction Systems for BEA-2019**. ACL-BEA 2019.
Felix Stahlberg, Bill Byrne. [[pdf](https://aclanthology.org/W19-4417/)].

**CUNI System for the Building Educational Applications 2019 Shared Task: Grammatical Error Correction**. ACL-BEA 2019.
Jakub Náplava, Milan Straka. [[pdf](https://aclanthology.org/W19-4419/)].

**Noisy Channel for Low Resource Grammatical Error Correction**. ACL-BEA 2019.
Simon Flachs, Ophélie Lacroix, Anders Søgaard. [[pdf](https://aclanthology.org/W19-4420/)].

**TMU Transformer System Using BERT for Re-ranking at BEA 2019 Grammatical Error Correction on Restricted Track**. ACL-BEA 2019.
Masahiro Kaneko, Kengo Hotate, Satoru Katsumata, Mamoru Komachi. [[pdf](https://aclanthology.org/W19-4422/)].
282 | 283 | **A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning**. ACL-BEA 2019. 284 | Yo Joong Choe, Jiyeon Ham, Kyubyong Park, Yeoil Yoon.[[pdf](https://aclanthology.org/W19-4423/)]. 285 | 286 | **Neural and FST-based approaches to grammatical error correction**. ACL-BEA 2019. 287 | Zheng Yuan, Felix Stahlberg, Marek Rei, Bill Byrne, Helen Yannakoudakis.[[pdf](https://aclanthology.org/W19-4424/)]. 288 | 289 | **Improving Precision of Grammatical Error Correction with a Cheat Sheet**. ACL-BEA 2019. 290 | Mengyang Qiu, Xuejiao Chen, Maggie Liu, Krishna Parvathala, Apurva Patil, Jungyeul Park.[[pdf](https://aclanthology.org/W19-4425/)]. 291 | 292 | **Multi-headed Architecture Based on BERT for Grammatical Errors Correction**. ACL-BEA 2019. 293 | Bohdan Didenko, Julia Shaptala.[[pdf](https://aclanthology.org/W19-4426/)]. 294 | 295 | **Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data**. ACL-BEA 2019. 296 | Roman Grundkiewicz, Marcin Junczys-Dowmunt, Kenneth Heafield.[[pdf](https://aclanthology.org/W19-4427/)]. 297 | 298 | **The Unbearable Weight of Generating Artificial Errors for Grammatical Error Correction**. ACL-BEA 2019. 299 | Phu Mon Htut, Joel Tetreault.[[pdf](https://aclanthology.org/W19-4449/)]. 300 | 301 | **An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction**. EMNLP 2019. 302 | Shun Kiyono, Jun Suzuki, Masato Mita, Tomoya Mizumoto, Kentaro Inui.[[pdf](https://arxiv.org/abs/1909.00502)]. 303 | 304 | **Encode, Tag, Realize: High-Precision Text Editing**. EMNLP 2019. 305 | Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn.[[pdf](https://aclanthology.org/D19-1510/)]. 306 | 307 | **Personalizing Grammatical Error Correction: Adaptation to Proficiency Level and L1**. EMNLP 2019. 308 | Maria Nadejde, Joel Tetreault.[[pdf](https://aclanthology.org/D19-5504/)]. 
309 | 310 | **Grammatical Error Correction in Low-Resource Scenarios**. EMNLP 2019. 311 | Jakub Náplava, Milan Straka.[[pdf](https://aclanthology.org/D19-5545/)]. 312 | 313 | **Minimally-Augmented Grammatical Error Correction**. EMNLP 2019. 314 | Roman Grundkiewicz, Marcin Junczys-Dowmunt.[[pdf](https://aclanthology.org/D19-5546/)]. 315 | 316 | **Parallel Iterative Edit Models for Local Sequence Transduction**. EMNLP 2019. 317 | Abhijeet Awasthi, Sunita Sarawagi, Rasna Goyal, Sabyasachi Ghosh, Vihari Piratla.[[pdf](https://aclanthology.org/D19-1435/)]. 318 | 319 | **Learning to Copy for Automatic Post-Editing**. EMNLP 2019. 320 | Xuancheng Huang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun.[[pdf](https://aclanthology.org/D19-1634/)]. 321 | 322 | **Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data**. NAACL 2019. 323 | Wei Zhao, Liang Wang, Kewei Shen, Ruoyu Jia, Jingming Liu.[[pdf](https://aclanthology.org/N19-1014/)]. 324 | 325 | **Cross-Corpora Evaluation and Analysis of Grammatical Error Correction Models — Is Single-Corpus Evaluation Enough?**. NAACL 2019. 326 | Masato Mita, Tomoya Mizumoto, Masahiro Kaneko, Ryo Nagata, Kentaro Inui.[[pdf](https://aclanthology.org/N19-1132/)]. 327 | 328 | **Corpora Generation for Grammatical Error Correction**. NAACL 2019. 329 | Jared Lichtarge, Chris Alberti, Shankar Kumar, Noam Shazeer, Niki Parmar, Simon Tong.[[pdf](https://aclanthology.org/N19-1333/)]. 330 | 331 | **Neural Grammatical Error Correction with Finite State Transducers**. NAACL 2019. 332 | Felix Stahlberg, Christopher Bryant, Bill Byrne.[[pdf](https://aclanthology.org/N19-1406/)]. 333 | 334 | > ### 2018 335 | 336 | **Inherent Biases in Reference-based Evaluation for Grammatical Error Correction**. ACL 2018. 337 | Leshem Choshen, Omri Abend.[[pdf](https://aclanthology.org/P18-1059/)]. 338 | 339 | **Fluency Boost Learning and Inference for Neural Grammatical Error Correction**. ACL 2018. 
340 | Tao Ge, Furu Wei, Ming Zhou.[[pdf](https://aclanthology.org/P18-1097/)]. 341 | 342 | **Automatic Metric Validation for Grammatical Error Correction**. ACL 2018. 343 | Leshem Choshen, Omri Abend.[[pdf](https://aclanthology.org/P18-1127/)]. 344 | 345 | **Overview of NLPTEA-2018 Share Task Chinese Grammatical Error Diagnosis**. ACL 2018 NLPTEA. 346 | Gaoqi Rao, Qi Gong, Baolin Zhang, Endong Xun.[[pdf](https://aclanthology.org/W18-3706/)]. 347 | 348 | **Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task**. NAACL 2018. 349 | Marcin Junczys-Dowmunt, Roman Grundkiewicz, Shubha Guha, Kenneth Heafield.[[pdf](https://aclanthology.org/N18-1055/)]. 350 | 351 | **Noising and Denoising Natural Language: Diverse Backtranslation for Grammar Correction**. NAACL 2018. 352 | Ziang Xie, Guillaume Genthial, Stanley Xie, Andrew Ng, Dan Jurafsky.[[pdf](https://aclanthology.org/N18-1057/)]. 353 | 354 | **Reference-less Measure of Faithfulness for Grammatical Error Correction**. NAACL 2018 short. 355 | Leshem Choshen, Omri Abend.[[pdf](https://aclanthology.org/N18-2020/)]. 356 | 357 | **Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation**. NAACL 2018 short. 358 | Roman Grundkiewicz, Marcin Junczys-Dowmunt.[[pdf](https://aclanthology.org/N18-2046/)]. 359 | 360 | **Language Model Based Grammatical Error Correction without Annotated Training Data**. NAACL 2018 BEA. 361 | Christopher Bryant, Ted Briscoe.[[pdf](https://aclanthology.org/W18-0529/)]. 362 | 363 | **A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction**. AAAI 2018. 364 | Shamil Chollampatt, Hwee Tou Ng.[[pdf](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/17308/16137)]. 365 | 366 | **Neural Quality Estimation of Grammatical Error Correction**. EMNLP 2018. 367 | Shamil Chollampatt, Hwee Tou Ng.[[pdf](https://aclanthology.org/D18-1274/)]. 
368 | 369 | **Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection**. EMNLP 2018. 370 | Sudhanshu Kasewa, Pontus Stenetorp, Sebastian Riedel.[[pdf](https://aclanthology.org/D18-1541/)]. 371 | 372 | **Using Wikipedia Edits in Low Resource Grammatical Error Correction**. EMNLP 2018. 373 | Adriane Boyd.[[pdf](https://aclanthology.org/W18-6111/)]. 374 | 375 | **Cool English: a Grammatical Error Correction System Based on Large Learner Corpora**. COLING 2018. 376 | Yu-Chun Lo, Jhih-Jie Chen, Chingyu Yang, Jason Chang.[[pdf](https://aclanthology.org/C18-2018/)]. 377 | 378 | **A Reassessment of Reference-Based Grammatical Error Correction Metrics**. COLING 2018. 379 | Shamil Chollampatt, Hwee Tou Ng.[[pdf](https://aclanthology.org/C18-1231/)]. 380 | 381 | > ### earlier 382 | 383 | **A Nested Attention Neural Hybrid Model for Grammatical Error Correction**. ACL 2017. 384 | Jianshu Ji, Qinlong Wang, Kristina Toutanova, Yongen Gong, Steven Truong, Jianfeng Gao.[[pdf](https://aclanthology.org/P17-1070/)]. 385 | 386 | **Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction**. ACL 2017. 387 | Christopher Bryant, Mariano Felice, Ted Briscoe.[[pdf](https://aclanthology.org/P17-1074/)]. 388 | 389 | **Neural Sequence-Labelling Models for Grammatical Error Correction**. EMNLP 2017. 390 | Helen Yannakoudakis, Marek Rei, Øistein E. Andersen, Zheng Yuan.[[pdf](https://aclanthology.org/D17-1297/)]. 391 | 392 | **JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction**. EACL 2017. 393 | Courtney Napoles, Keisuke Sakaguchi, Joel Tetreault.[[pdf](https://aclanthology.org/E17-2037/)]. 394 | 395 | **Grammatical Error Detection Using Error- and Grammaticality-Specific Word Embeddings**. IJCNLP 2017. 396 | Masahiro Kaneko, Yuya Sakaizawa, Mamoru Komachi.[[pdf](https://aclanthology.org/I17-1005/)]. 
397 | 398 | **Reference-based Metrics can be Replaced with Reference-less Metrics in Evaluating Grammatical Error Correction Systems**. IJCNLP 2017. 399 | Hiroki Asano, Tomoya Mizumoto, Kentaro Inui.[[pdf](https://aclanthology.org/I17-2058/)]. 400 | 401 | **Grammatical Error Correction with Neural Reinforcement Learning**. IJCNLP 2017. 402 | Keisuke Sakaguchi, Matt Post, Benjamin Van Durme.[[pdf](https://aclanthology.org/I17-2062/)]. 403 | 404 | **Grammatical Error Correction: Machine Translation and Classifiers**. ACL 2016. 405 | Alla Rozovskaya, Dan Roth.[[pdf](https://aclanthology.org/P16-1208/)]. 406 | 407 | **Compositional Sequence Labeling Models for Error Detection in Learner Writing**. ACL 2016. 408 | Marek Rei, Helen Yannakoudakis.[[pdf](https://aclanthology.org/P16-1112/)]. 409 | 410 | **Grammatical error correction using neural machine translation**. NAACL 2016 short. 411 | Zheng Yuan, Ted Briscoe.[[pdf](https://aclanthology.org/N16-1042/)]. 412 | 413 | **Discriminative Reranking for Grammatical Error Correction with Statistical Machine Translation**. NAACL 2016 short. 414 | Tomoya Mizumoto, Yuji Matsumoto.[[pdf](https://aclanthology.org/N16-1133/)]. 415 | 416 | **Phrase-based Machine Translation is State-of-the-Art for Automatic Grammatical Error Correction**. EMNLP 2016. 417 | Marcin Junczys-Dowmunt, Roman Grundkiewicz.[[pdf](https://aclanthology.org/D16-1161/)]. 418 | 419 | **Adapting Grammatical Error Correction Based on the Native Language of Writers with Neural Network Joint Models**. EMNLP 2016. 420 | Shamil Chollampatt, Duc Tam Hoang, Hwee Tou Ng.[[pdf](https://aclanthology.org/D16-1195/)]. 421 | 422 | **There’s No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction**. EMNLP 2016. 423 | Courtney Napoles, Keisuke Sakaguchi, Joel Tetreault.[[pdf](https://aclanthology.org/D16-1228/)]. 424 | 425 | **Chinese Preposition Selection for Grammatical Error Diagnosis**. COLING 2016. 
426 | Hen-Hsen Huang, Yen-Chi Shao, Hsin-Hsi Chen.[[pdf](https://aclanthology.org/C16-1085/)]. 427 | 428 | **Neural Network Translation Models for Grammatical Error Correction**. IJCAI 2016. 429 | S Chollampatt, K Taghipour, HT Ng.[[pdf](https://arxiv.org/pdf/1606.00189.pdf)]. 430 | 431 | **Exploiting N-Best Hypotheses to Improve an SMT Approach to Grammatical Error Correction**. IJCAI 2016. 432 | DT Hoang, S Chollampatt, HT Ng.[[pdf](http://www.researchgate.net/publication/303749890_Exploiting_N-Best_Hypotheses_to_Improve_an_SMT_Approach_to_Grammatical_Error_Correction)]. 433 | 434 | **How Far are We from Fully Automatic High Quality Grammatical Error Correction?**. ACL 2015. 435 | Christopher Bryant, Hwee Tou Ng.[[pdf](https://aclanthology.org/P15-1068/)]. 436 | 437 | **Ground Truth for Grammatical Error Correction Metrics**. ACL 2015 short. 438 | Courtney Napoles, Keisuke Sakaguchi, Matt Post, Joel Tetreault.[[pdf](https://aclanthology.org/P15-2097/)]. 439 | 440 | **Towards a standard evaluation method for grammatical error detection and correction**. NAACL 2015. 441 | Mariano Felice, Ted Briscoe.[[pdf](https://aclanthology.org/N15-1060/)]. 442 | 443 | **Human Evaluation of Grammatical Error Correction Systems**. EMNLP 2015. 444 | Roman Grundkiewicz, Marcin Junczys-Dowmunt, Edward Gillian.[[pdf](https://aclanthology.org/D15-1052/)]. 445 | 446 | **Ground Truth for Grammatical Error Correction Metrics**. IJCNLP 2015. 447 | Courtney Napoles, Keisuke Sakaguchi, Matt Post, Joel Tetreault.[[pdf](https://aclanthology.org/P15-2097/)]. 448 | 449 | **Go Climb a Dependency Tree and Correct the Grammatical Errors**. EMNLP 2014. 450 | Longkai Zhang, Houfeng Wang.[[pdf](https://aclanthology.org/D14-1033/)]. 451 | 452 | **System Combination for Grammatical Error Correction**. EMNLP 2014. 453 | Raymond Hendy Susanto, Peter Phandi, Hwee Tou Ng.[[pdf](https://aclanthology.org/D14-1102/)]. 454 | 455 | **Data Driven Grammatical Error Detection in Transcripts of Children’s Speech**. 
EMNLP 2014. 456 | Eric Morley, Anna Eva Hallin, Brian Roark.[[pdf](https://aclanthology.org/D14-1106/)]. 457 | 458 | **Generating artificial errors for grammatical error correction**. EACL 2014. 459 | Mariano Felice, Zheng Yuan.[[pdf](https://aclanthology.org/E14-3013/)]. 460 | 461 | **Correcting Grammatical Verb Errors**. EACL 2014. 462 | Alla Rozovskaya, Dan Roth, Vivek Srikumar.[[pdf](https://aclanthology.org/E14-1038/)]. 463 | 464 | **Detecting Learner Errors in the Choice of Content Words Using Compositional Distributional Semantics**. COLING 2014. 465 | Ekaterina Kochmar, Ted Briscoe.[[pdf](https://aclanthology.org/C14-1164/)]. 466 | 467 | **A Sentence Judgment System for Grammatical Error Detection**. COLING 2014. 468 | Lung-Hao Lee, Liang-Chih Yu, Kuei-Ching Lee, Yuen-Hsien Tseng, Li-Ping Chang, Hsin-Hsi Chen.[[pdf](https://aclanthology.org/C14-2015/)]. 469 | 470 | **Automated Grammatical Error Correction for Language Learners**. COLING 2014. 471 | Joel Tetreault, Claudia Leacock.[[pdf](https://aclanthology.org/C14-3004/)]. 472 | 473 | **Grammatical Error Correction Using Integer Linear Programming**. ACL 2013. 474 | Yuanbin Wu, Hwee Tou Ng.[[pdf](https://aclanthology.org/P13-1143/)]. 475 | 476 | **Joint Learning and Inference for Grammatical Error Correction**. EMNLP 2013. 477 | Alla Rozovskaya, Dan Roth.[[pdf](https://aclanthology.org/D13-1074/)]. 478 | 479 | **Automated Grammar Correction Using Hierarchical Phrase-Based Statistical Machine Translation**. IJCNLP 2013. 480 | Bibek Behera, Pushpak Bhattacharyya.[[pdf](https://aclanthology.org/I13-1122/)]. 481 | 482 | **Grammatical Error Correction Using Feature Selection and Confidence Tuning**. IJCNLP 2013. 483 | Yang Xiang, Yaoyun Zhang, Xiaolong Wang, Chongqiang Wei, Wen Zheng, Xiaoqiang Zhou, Yuxiu Hu, Yang Qin.[[pdf](https://aclanthology.org/I13-1148/)]. 484 | 485 | **A Meta Learning Approach to Grammatical Error Correction**. ACL 2012. 
486 | Hongsuck Seo, Jonghoon Lee, Seokhwan Kim, Kyusong Lee, Sechun Kang, Gary Geunbae Lee.[[pdf](https://aclanthology.org/P12-2064/)]. 487 | 488 | **Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation**. ACL 2012. 489 | Kenji Imamura, Kuniko Saito, Kugatsu Sadamitsu, Hitoshi Nishikawa.[[pdf]()]. 490 | 491 | **Better Evaluation for Grammatical Error Correction**. NAACL 2012 short. 492 | Daniel Dahlmeier, Hwee Tou Ng.[[pdf](https://aclanthology.org/N12-1067/)]. 493 | 494 | **A Beam-Search Decoder for Grammatical Error Correction**. EMNLP 2012. 495 | Daniel Dahlmeier, Hwee Tou Ng.[[pdf](https://aclanthology.org/D12-1052/)]. 496 | 497 | **Problems in Evaluating Grammatical Error Detection Systems**. COLING 2012. 498 | Martin Chodorow, Markus Dickinson, Ross Israel, Joel Tetreault.[[pdf](https://aclanthology.org/C12-1038/)]. 499 | 500 | **The Effect of Learner Corpus Size in Grammatical Error Correction of ESL Writings**. COLING 2012. 501 | Tomoya Mizumoto, Yuta Hayashibe, Mamoru Komachi, Masaaki Nagata, Yuji Matsumoto.[[pdf](https://aclanthology.org/C12-2084/)]. 502 | 503 | **They Can Help: Using Crowdsourcing to Improve the Evaluation of Grammatical Error Detection Systems**. ACL 2011. 504 | Nitin Madnani, Martin Chodorow, Joel Tetreault, Alla Rozovskaya.[[pdf](https://aclanthology.org/P11-2089/)]. 505 | 506 | **Grammatical Error Correction with Alternating Structure Optimization**. ACL 2011. 507 | Daniel Dahlmeier, Hwee Tou Ng.[[pdf](https://aclanthology.org/P11-1092/)]. 508 | 509 | **Automated Whole Sentence Grammar Correction Using a Noisy Channel Model**. ACL 2011. 510 | Y. Albert Park, Roger Levy.[[pdf](https://aclanthology.org/P11-1094/)]. 511 | 512 | **Grammatical Error Detection for Corrective Feedback Provision in Oral Conversations**. AAAI 2011. 513 | Sungjin Lee, Hyungjong Noh, Kyusong Lee, Gary Geunbae Lee.[[pdf](https://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/view/3501/3954)]. 
514 | 515 | **Evaluating performance of grammatical error detection to maximize learning effect**. COLING 2010. 516 | Ryo Nagata, Kazuhide Nakatani.[[pdf](https://aclanthology.org/C10-2103/)]. 517 | 518 | 519 | 520 | ## Datasets 521 | 522 | ### [CTC 2021](https://destwang.github.io/CTC2021-explorer/) : 523 | | dataset | download | 524 | |-|-| 525 | |training data|[download](http://pan.iflytek.com/#/link/0CF7992308445E72001ED08F53266D25) (password: girA)| 526 | |validation data|[download](https://pan.baidu.com/s/19n-NxkMkcnOuHZPsZyFbjw) (password: 5jg2)| 527 | 528 | ### Others: 529 | | dataset | task | # sents | source | language | 530 | |-|-|-|-|-| 531 | | SIGHAN 2013 | CSC | 350 & 974 | SIGHAN | Zh | 532 | | SIGHAN 2014 | CSC | 6,526 & 526 | SIGHAN | Zh | 533 | | SIGHAN 2015 | CSC | 3,174 & 550 | SIGHAN | Zh | 534 | |[OCR dataset](https://github.com/iqiyi/FASPell)|CSC|4575|[FASPell(iqiyi)](https://www.aclweb.org/anthology/D19-5522.pdf)|Zh| 535 | | [HybridSet](https://github.com/wdimmy/Automatic-Corpus-Generation) | CSC | 270K | - | Zh | 536 | | NLPCC 2018 GEC | GEC | - | NLPCC | Zh | 537 | | CGED | GED | - | HSK | Zh | 538 | | CoNLL 2013 | GEC | 1,381 | CONLL | En | 539 | | CoNLL 2014 | GEC | 1,312 | CONLL | En | 540 | | JFLEG | GEC | 747 | [JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction](https://aclanthology.org/E17-2037.pdf) | En | 541 | | NUCLE | GEC | 57k | [Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English](https://aclanthology.org/W13-1703.pdf) | En | 542 | | Lang-8 | GEC | 1M+ | Lang-8 | En | 543 | | Write&Improve+LOCNESS | GEC | 63,683 & 7,632 | - | En | 544 | |MMC+PsyTAR (medical)||512 & 79| - |En| 545 | |birkbeck+holbrook-tagged+holbrook-missp+aspell+wikipedia|(Misspelling word)| 36133/6136 & 1791/1200& 531/450& 2455/1922|[BBK](https://www.dcs.bbk.ac.uk/~ROGER/corpora.htm)|En| 546 | |[TOEFL-Spell](https://github.com/EducationalTestingService/toefl-spell)|-|-|[A Benchmark Corpus of English Misspellings and a 
Minimally-supervised Model for Spelling Correction](https://www.aclweb.org/anthology/W19-4407.pdf)|En| 547 | |[NUC-GEC](https://www.comp.nus.edu.sg/~nlp/corpora.html)|GEC|500 essays|[How Far are We from Fully Automatic High Quality Grammatical Error Correction?](https://www.aclweb.org/anthology/P15-1068.pdf)|En| 548 | | [BEA2019](https://www.cl.cam.ac.uk/research/nl/bea2019st/) | GEC | 34,308 | BEA | En | 549 | | PIE-synthetic | GEC | 9,000,000 | [Parallel iterative edit models for local sequence transduction](https://aclanthology.org/D19-1435/) | En | 550 | | [clang8](https://github.com/google-research-datasets/clang8) | GEC | 2,372,119 & 114,405 & 44,830 | - | En,GE,RU | 551 | | CTC2021 | CSC | 217,634 | - | Zh | 552 | 553 | ## Systems & API 554 | 飞鹰 text proofreading system: http://check.hfl-rc.com/ 555 | 飞鹰 proofreading API: https://www.xfyun.cn/services/textCorrection 556 | 557 | 558 | ## Other Resources 559 | ### Related Articles 560 | * [The Current State of Research on Grammatical Error Correction](https://mp.weixin.qq.com/s/0_qp1WsrEsjnj8ST4zQyTQ) 561 | * [An Incomplete Survey of Text Grammatical Error Correction](https://mp.weixin.qq.com/s/Dj8KIe6LbVGonV-Kk9mO2Q) 562 | 563 | ### Shared Task 564 | * [CTC 2021](https://github.com/destwang/CTC2021) 565 | 566 | ----- 567 | *The resources above are for academic research only. If any of them infringes copyright, please contact us and we will remove it.* 568 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [**English**](https://github.com/destwang/CTCResources) | [**中文**](https://github.com/destwang/CTCResources/blob/main/README_ZH.md) 2 | 3 | # CTCResources 4 | 5 | Resources for Chinese text correction (CTC). The resource list is mainly maintained by Baoxin Wang and Honghong Zhao from HFL (哈工大讯飞联合实验室). 
6 | 7 | ## Contents 8 | 9 | - [CTCResources](#ctcresources) 10 | - [Contents](#contents) 11 | - [Definition](#definition) 12 | - [Chinese Spelling Check (CSC)](#chinese-spelling-check-csc) 13 | - [Grammatical Error Correction (GEC)](#grammatical-error-correction-gec) 14 | - [Competition & Shared Task](#competition--shared-task) 15 | - [CAIL 2022 Legal Instrument Correction(2022.09.01-2022.11.15)!](#cail-2022-legal-instrument-correction20220901-20221115) 16 | - [CCL2022 Chinese Learner Text Correction(2022.06.05-2022.09.10)](#ccl2022-chinese-learner-text-correction20220605-20220910) 17 | - [Experimental Results on Public Datasets](#experimental-results-on-public-datasets) 18 | - [SIGHAN 2015](#sighan-2015) 19 | - [Papers](#papers) 20 | - [CSC Papers](#csc-papers) 21 | - [GEC Papers](#gec-papers) 22 | - [Datasets](#datasets) 23 | - [CTC 2021 :](#ctc-2021-) 24 | - [Others:](#others) 25 | - [Systems & API](#systems--api) 26 | - [Other Resources](#other-resources) 27 | - [Related Articles](#related-articles) 28 | - [Shared Task](#shared-task) 29 | - [* CTC 2021](#-ctc-2021) 30 | 31 | 32 | ## Definition 33 | ### Chinese Spelling Check (CSC) 34 | Chinese spelling check (CSC) is a task to detect and correct spelling errors in Chinese text. 35 | 36 | ### Grammatical Error Correction (GEC) 37 | Grammatical Error Correction (GEC) is the task of correcting different kinds of errors in text such as spelling, punctuation, grammatical, and word choice errors. 38 | 39 | 40 | ## Competition & Shared Task 41 | ### CAIL 2022 Legal Instrument Correction(2022.09.01-2022.11.15)![](https://img.shields.io/badge/hot-red.svg) 42 | [Legal Instrument Correction](http://cail.cipsc.org.cn/task2.html?raceID=2&cail_tag=2022) aims at assisting judicial personnel to automatically detect and correct errors in legal documents through machine learning. This task covers four types of errors in legal documents: spelling errors, redundant errors, missing errors and word order errors. 
43 | 44 | ### CCL2022 Chinese Learner Text Correction(2022.06.05-2022.09.10) 45 | 46 | [CLTC](https://github.com/blcuicall/CCL2022-CLTC) (Chinese Learner Text Correction) aims to automatically detect and correct punctuation, spelling, grammatical, semantic, and other errors in Chinese learners' texts, so as to obtain correct sentences. 47 | 48 | ## Experimental Results on Public Datasets 49 | ### SIGHAN 2015 50 | *We have not collected character-level results because the commonly used character-level evaluation script has some bugs. We recommend that researchers either **do not** use the character-level evaluation script or fix it before using it.* 51 | 52 | * Without Pretraining 53 | 54 | | Model | D-P | D-R | D-F | C-P | C-R | C-F | 55 | | - | - | - | - | - | - | - | 56 | | [FASPell](https://github.com/iqiyi/FASPell) | 67.6 | 60.0 | 63.5 | 66.6 | 59.1 | 62.6 | 57 | | BERT | 73.7 | 78.2 | 75.9 | 70.9 | 75.2 | 73.0 | 58 | | RoBERTa | 74.7 | 77.3 | 76.0 | 72.1 | 74.5 | 73.3 | 59 | | [SpellGCN](https://github.com/ACL2020SpellGCN/SpellGCN) | 74.8 | **80.7** | 77.7 | 72.1 | **77.7** | 74.8 (75.9) | 60 | | [DCN](https://github.com/destwang/DCN) | **76.6** | 79.8 | **78.2** | **74.2** | 77.3 | **75.7** | 61 | 62 | * With Pretraining 63 | 64 | | Model | D-P | D-R | D-F | C-P | C-R | C-F | 65 | | - | - | - | - | - | - | - | 66 | | [BERT_CRS + GAD](https://aclanthology.org/2021.findings-acl.122.pdf) | 75.6 | 80.4 | 77.9 | 73.2 | 77.8 | 75.4 | 67 | | [DCN-pretrain](https://github.com/destwang/DCN) | 77.1 | 80.9 | 79.0 | 74.5 | 78.2 | 76.3 | 68 | | [REALISE](https://github.com/DaDaMrX/ReaLiSe) | 77.3 | 81.3 | 79.3 | 75.9 | 79.9 | 77.8 | 69 | | [PLOME](https://github.com/liushulinle/PLOME) | 77.4 | 81.5 | 79.4 | 75.3 | 79.3 | 77.2 | 70 | | [Soft-Masked BERT](https://aclanthology.org/2020.acl-main.82.pdf) | 73.7 | 73.2 | 73.5 | 66.7 | 66.2 | 66.4 | 71 | | [Soft-Masked BERT_SSCL](https://aclanthology.org/2021.emnlp-main.281.pdf) | **86.3** | 72.5 | 
78.8 | **85.2** | 66.0 | 74.4 | 72 | | [MLM-phonetics](https://aclanthology.org/2021.findings-acl.198.pdf) | 77.5 | **83.1** | 80.2 | 74.9 | 80.2 | 77.5 | 73 | | [MDCSpell](https://aclanthology.org/2022.findings-acl.98.pdf) | 80.8 | 80.6 | **80.7** | 78.4 | 78.2 | 78.3 | 74 | | [ECOPO(BERT)](https://aclanthology.org/2022.findings-acl.252.pdf) | 78.2 | 82.3 | 80.2 | 76.6 | 80.4 | 78.4 | 75 | | [ECOPO(REALISE)](https://aclanthology.org/2022.findings-acl.252.pdf) | 77.5 | 82.6 | 80.0 | 76.1 | **81.2** | **78.5** | 76 | 77 | ## Papers 78 | ### CSC Papers 79 | > ### 2022 80 | **MDCSpell: A Multi-task Detector-Corrector Framework for Chinese Spelling Correction**. Findings of ACL 2022. 81 | Chenxi Zhu, Ziqiang Ying, Boyu Zhang, Feng Mao. [[pdf](https://aclanthology.org/2022.findings-acl.98.pdf)] 82 | 83 | **CRASpell: A Contextual Typo Robust Approach to Improve Chinese Spelling Correction**. Findings of ACL 2022. 84 | Shulin Liu, Shengkang Song, Tianchi Yue, Tao Yang, Huihui Cai, TingHao Yu, Shengli Sun. [[pdf](https://aclanthology.org/2022.findings-acl.237.pdf)] 85 | 86 | **The Past Mistake is the Future Wisdom: Error-driven Contrastive Probability Optimization for Chinese Spell Checking**. Findings of ACL 2022. 87 | Yinghui Li, Qingyu Zhou, Yangning Li, Zhongli Li, Ruiyang Liu, Rongyi Sun, Zizhen Wang, Chao Li, Yunbo Cao, Hai-Tao Zheng. [[pdf](https://aclanthology.org/2022.findings-acl.252.pdf)] 88 | 89 | > ### 2021 90 | **PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction**. ACL 2021. 91 | Shulin Liu, Tao Yang, Tianchi Yue, Feng Zhang and Di Wang. [[pdf](https://aclanthology.org/2021.acl-long.233.pdf)], [[code](https://github.com/liushulinle/PLOME)]. 92 | 93 | **PHMOSpell: Phonological and Morphological Knowledge Guided Chinese Spelling Check**. ACL 2021. 94 | Li Huang, Junjie Li, Weiwei Jiang, Zhiyu Zhang, Minchuan Chen, Shaojun Wang and Jing Xiao. [[pdf](https://aclanthology.org/2021.acl-long.464.pdf)]. 
95 | 96 | **Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models**. ACL 2021. 97 | Chong Li, Cenyuan Zhang, Xiaoqing Zheng and Xuanjing Huang. [[pdf](https://arxiv.org/pdf/2105.14813.pdf)], [[code](https://github.com/FDChongli/TwoWaysToImproveCSC)]. 98 | 99 | **Dynamic Connected Networks for Chinese Spelling Check**. Findings of ACL 2021. 100 | Baoxin Wang, Wanxiang Che, Dayong Wu, Shijin Wang, Guoping Hu and Ting Liu. [[pdf](https://aclanthology.org/2021.findings-acl.216.pdf)], [[code](https://github.com/destwang/DCN)]. 101 | 102 | **Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking**. Findings of ACL 2021. 103 | Heng-Da Xu, Zhongli Li, Qingyu Zhou, Chao Li, Zizhen Wang, Yunbo Cao, Heyan Huang and Xian-Ling Mao. [[pdf](https://arxiv.org/pdf/2105.12306.pdf)], [[code](https://github.com/DaDaMrX/ReaLiSe)]. 104 | 105 | **Global Attention Decoder for Chinese Spelling Error Correction**. Findings of ACL 2021. 106 | Zhao Guo, Yuan Ni, Keqiang Wang, Wei Zhu and Guotong Xie. [[pdf](https://aclanthology.org/2021.findings-acl.122.pdf)]. 107 | 108 | **Correcting Chinese Spelling Errors with Phonetic Pre-training**. Findings of ACL 2021. 109 | Ruiqing Zhang, Chao Pang, Chuanqiang Zhang, Shuohuan Wang, Zhongjun He, Yu Sun, Hua Wu and Haifeng Wang. [[pdf](https://aclanthology.org/2021.findings-acl.198.pdf)]. 110 | 111 | **DCSpell: A Detector-Corrector Framework for Chinese Spelling Error Correction**. SIGIR 2021. 112 | Jing Li, Gaosheng Wu, Dafei Yin, Haozhao Wang, Yonggang Wang. [[pdf](https://dl.acm.org/doi/pdf/10.1145/3404835.3463050)]. 113 | 114 | **SpellBERT: A Lightweight Pretrained Model for Chinese Spelling Check**. EMNLP 2021. 115 | Tuo Ji, Hang Yan, Xipeng Qiu. [[pdf](https://aclanthology.org/2021.emnlp-main.287.pdf)]. 116 | 117 | **Self-Supervised Curriculum Learning for Spelling Error Correction**. EMNLP 2021. 118 | Zifa Gan, Hongfei Xu, Hongying Zan. 
[[pdf](https://aclanthology.org/2021.emnlp-main.281.pdf)]. 119 | 120 | **An Alignment-Agnostic Model for Chinese Text Error Correction**. Findings of EMNLP 2021. 121 | Liying Zheng, Yue Deng, Weishun Song, Liang Xu and Jing Xiao. [[pdf](https://aclanthology.org/2021.findings-emnlp.30.pdf)]. 122 | 123 | **Domain-Shift Conditioning Using Adaptable Filtering Via Hierarchical Embeddings for Robust Chinese Spell Check**. TASLP 2021. 124 | Minh Nguyen, Gia H. Ngo, and Nancy F. Chen. [[pdf](https://ieeexplore.ieee.org/abstract/document/9439969)]. 125 | 126 | > ### 2020 127 | **Chunk-based Chinese Spelling Check with Global Optimization**. Findings of EMNLP 2020. 128 | Zuyi Bao, Chen Li and Rui Wang. [[pdf](https://aclanthology.org/2020.findings-emnlp.184.pdf)]. 129 | 130 | **SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check**. ACL 2020. 131 | Xingyi Cheng, Weidi Xu, Kunlong Chen, Shaohua Jiang, Feng Wang, Taifeng Wang, Wei Chu and Yuan Qi. [[pdf](https://aclanthology.org/2020.acl-main.81.pdf)], [[code](https://github.com/ACL2020SpellGCN/SpellGCN)]. 132 | 133 | **Spelling Error Correction with Soft-Masked BERT**. ACL 2020. 134 | Shaohua Zhang, Haoran Huang, Jicong Liu and Hang Li. [[pdf](https://aclanthology.org/2020.acl-main.82.pdf)]. 135 | 136 | > ### 2019 137 | **FASPell: A Fast, Adaptable, Simple, Powerful Chinese Spell Checker Based On DAE-Decoder Paradigm**. EMNLP 2019 Workshop W-NUT. 138 | Yuzhong Hong, Xianguo Yu, Neng He, Nan Liu, Junhui Liu. [[pdf](https://aclanthology.org/D19-5522.pdf)], [[code](https://github.com/iqiyi/FASPell)]. 139 | 140 | **Confusionset-guided Pointer Networks for Chinese Spelling Check**. ACL 2019. 141 | Dingmin Wang, Yi Tay, Li Zhong. [[pdf](https://aclanthology.org/P19-1578.pdf)], [[code](https://github.com/sunnyqiny/Confusionset-guided-Pointer-Networks-for-Chinese-Spelling-Check)]. 
142 | 143 | ### GEC Papers 144 | > ### 2022 145 | 146 | **Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction**. ACL 2022. 147 | Maksym Tarnavskyi, Artem Chernodub, Kostiantyn Omelianchuk. [[pdf](https://aclanthology.org/2022.acl-long.266/)] 148 | 149 | **Interpretability for Language Learners Using Example-Based Grammatical Error Correction**. ACL 2022. 150 | Masahiro Kaneko, Sho Takase, Ayana Niwa, Naoaki Okazaki. [[pdf](https://aclanthology.org/2022.acl-long.496/)] 151 | 152 | **Adjusting the Precision-Recall Trade-Off with Align-and-Predict Decoding for Grammatical Error Correction**. ACL 2022 short. 153 | Xin Sun, Houfeng Wang. [[pdf](https://aclanthology.org/2022.acl-short.77/)] 154 | 155 | **“Is Whole Word Masking Always Better for Chinese BERT?”: Probing on Chinese Grammatical Error Correction**. Findings of ACL 2022. 156 | Yong Dai, Linyang Li, Cong Zhou, Zhangyin Feng, Enbo Zhao, Xipeng Qiu, Piji Li, Duyu Tang. [[pdf](https://aclanthology.org/2022.findings-acl.1/)] 157 | 158 | **Type-Driven Multi-Turn Corrections for Grammatical Error Correction**. Findings of ACL 2022. 159 | Shaopeng Lai, Qingyu Zhou, Jiali Zeng, Zhongli Li, Chao Li, Yunbo Cao, Jinsong Su. [[pdf](https://aclanthology.org/2022.findings-acl.254/)] 160 | 161 | **Reusing a Multi-lingual Setup to Bootstrap a Grammar Checker for a Very Low Resource Language without Data**. ComputEL 2022 Workshop. 162 | Inga Lill Sigga Mikkelsen, Linda Wiechetek, Flammie A Pirinen. [[pdf](https://aclanthology.org/2022.computel-1.19/)] 163 | 164 | > ### 2021 165 | 166 | **Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding**. ACL 2021. 167 | Xin Sun, Tao Ge, Furu Wei, Houfeng Wang.[[pdf](https://aclanthology.org/2021.acl-long.462/)].[[code](https://github.com/AutoTemp/Shallow-Aggressive-Decoding)]. 168 | 169 | **Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction**. ACL 2021. 
170 | Piji Li, Shuming Shi.[[pdf](https://aclanthology.org/2021.acl-long.385/)].[[code](https://github.com/lipiji/TtT)]. 171 | 172 | **A Simple Recipe for Multilingual Grammatical Error Correction**. ACL 2021 short. 173 | Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn.[[pdf](https://aclanthology.org/2021.acl-short.89/)]. 174 | 175 | **Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models**. EACL-BEA 2021. 176 | Felix Stahlberg, Shankar Kumar.[[pdf](https://aclanthology.org/2021.bea-1.4/)]. 177 | 178 | **Document-level grammatical error correction**. EACL-BEA 2021. 179 | Zheng Yuan, Christopher Bryant.[[pdf](https://aclanthology.org/2021.bea-1.8/)]. 180 | 181 | **Data Strategies for Low-Resource Grammatical Error Correction**. EACL-BEA 2021. 182 | Simon Flachs, Felix Stahlberg, Shankar Kumar.[[pdf](https://aclanthology.org/2021.bea-1.12/)]. 183 | 184 | **Assessing Grammatical Correctness in Language Learning**. EACL-BEA 2021. 185 | Anisia Katinskaia, Roman Yangarber.[[pdf](https://aclanthology.org/2021.bea-1.15/)]. 186 | 187 | **Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction**. NAACL 2021. 188 | Zhenghao Liu, Xiaoyuan Yi, Maosong Sun, Liner Yang, Tat-Seng Chua.[[pdf](https://aclanthology.org/2021.naacl-main.429/)]. 189 | 190 | **Comparison of Grammatical Error Correction Using Back-Translation Models**. NAACL 2021 workshop. 191 | Aomi Koyama, Kengo Hotate, Masahiro Kaneko, Mamoru Komachi.[[pdf](https://aclanthology.org/2021.naacl-srw.16/)]. 192 | 193 | **LM-Critic: Language Models for Unsupervised Grammatical Error Correction**. EMNLP 2021. 194 | Michihiro Yasunaga, Jure Leskovec and Percy Liang.[[pdf](https://arxiv.org/abs/2109.06822)]. 195 | 196 | **Multi-Class Grammatical Error Detection for Correction: A Tale of Two Systems**. EMNLP 2021. 197 | Zheng Yuan, Shiva Taslimipoor, Christopher Davis and Christopher Bryant. 
198 | 199 | > ### 2020 200 | 201 | **On the Robustness of Language Encoders against Grammatical Errors**. ACL 2020. 202 | Fan Yin, Quanyu Long, Tao Meng, Kai-Wei Chang.[[pdf](https://aclanthology.org/2020.acl-main.310/)]. 203 | 204 | **Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction**. ACL 2020. 205 | Masahiro Kaneko, Masato Mita, Shun Kiyono, Jun Suzuki, Kentaro Inui.[[pdf](https://aclanthology.org/2020.acl-main.391/)]. 206 | 207 | **Grammatical Error Correction Using Pseudo Learner Corpus Considering Learner’s Error Tendency**. ACL 2020 workshop. 208 | Yujin Takahashi, Satoru Katsumata, Mamoru Komachi.[[pdf](https://aclanthology.org/2020.acl-srw.5/)]. 209 | 210 | **GECToR – Grammatical Error Correction: Tag, Not Rewrite**. ACL-BEA 2020. 211 | Kostiantyn Omelianchuk, Vitaliy Atrasevych, Artem Chernodub, Oleksandr Skurzhanskyi.[[pdf](https://aclanthology.org/2020.bea-1.16/)]. 212 | 213 | **A Comparative Study of Synthetic Data Generation Methods for Grammatical Error Correction**. ACL-BEA 2020. 214 | Max White, Alla Rozovskaya.[[pdf](https://aclanthology.org/2020.bea-1.21/)]. 215 | 216 | **Improving Grammatical Error Correction Models with Purpose-Built Adversarial Examples**. EMNLP 2020. 217 | Lihao Wang, Xiaoqing Zheng.[[pdf](https://aclanthology.org/2020.emnlp-main.228/)]. 218 | 219 | **Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction**. EMNLP 2020. 220 | Mengyun Chen, Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou.[[pdf](https://aclanthology.org/2020.emnlp-main.581/)]. 221 | 222 | **Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses**. EMNLP 2020. 223 | Simon Flachs, Ophélie Lacroix, Helen Yannakoudakis, Marek Rei, Anders Søgaard.[[pdf](https://aclanthology.org/2020.emnlp-main.680/)]. 224 | 225 | **Adversarial Grammatical Error Correction**. Findings of EMNLP 2020.
226 | Vipul Raheja, Dimitris Alikaniotis.[[pdf](https://aclanthology.org/2020.findings-emnlp.275/)]. 227 | 228 | **A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction**. Findings of EMNLP 2020. 229 | Masato Mita, Shun Kiyono, Masahiro Kaneko, Jun Suzuki, Kentaro Inui.[[pdf](https://aclanthology.org/2020.findings-emnlp.26/)]. 230 | 231 | **Improving Grammatical Error Correction with Machine Translation Pairs**. Findings of EMNLP 2020. 232 | Wangchunshu Zhou, Tao Ge, Chang Mu, Ke Xu, Furu Wei, Ming Zhou.[[pdf](https://aclanthology.org/2020.findings-emnlp.30/)]. 233 | 234 | **Chinese Grammatical Correction Using BERT-based Pre-trained Model**. AACL 2020. 235 | Hongfei Wang, Michiki Kurosawa, Satoru Katsumata, Mamoru Komachi.[[pdf](https://aclanthology.org/2020.aacl-main.20/)]. 236 | 237 | **Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model**. AACL 2020. 238 | Satoru Katsumata, Mamoru Komachi.[[pdf](https://aclanthology.org/2020.aacl-main.83/)]. 239 | 240 | **Generating Diverse Corrections with Local Beam Search for Grammatical Error Correction**. COLING 2020. 241 | Kengo Hotate, Masahiro Kaneko, Mamoru Komachi.[[pdf](https://aclanthology.org/2020.coling-main.193/)]. 242 | 243 | **Heterogeneous Recycle Generation for Chinese Grammatical Error Correction**. COLING 2020. 244 | Charles Hinson, Hen-Hsen Huang, Hsin-Hsi Chen.[[pdf](https://aclanthology.org/2020.coling-main.199/)]. 245 | 246 | **Improving Grammatical Error Correction with Data Augmentation by Editing Latent Representation**. COLING 2020. 247 | Zhaohong Wan, Xiaojun Wan, Wenguang Wang.[[pdf](https://aclanthology.org/2020.coling-main.200/)]. 248 | 249 | **Cross-lingual Transfer Learning for Grammatical Error Correction**. COLING 2020. 250 | Ikumi Yamashita, Satoru Katsumata, Masahiro Kaneko, Aizhan Imankulova, Mamoru Komachi.[[pdf](https://aclanthology.org/2020.coling-main.415/)].
251 | 252 | > ### 2019 253 | 254 | **Cross-Sentence Grammatical Error Correction**. ACL 2019. 255 | Shamil Chollampatt, Weiqi Wang, Hwee Tou Ng.[[pdf](https://aclanthology.org/P19-1042/)]. 256 | 257 | **Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study**. ACL 2019. 258 | Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou.[[pdf](https://aclanthology.org/P19-1609/)]. 259 | 260 | **Controlling Grammatical Error Correction Using Word Edit Rate**. ACL 2019. 261 | Kengo Hotate, Masahiro Kaneko, Satoru Katsumata, Mamoru Komachi.[[pdf](https://aclanthology.org/P19-2020/)]. 262 | 263 | **Context is Key: Grammatical Error Detection with Contextual Word Representations**. ACL-BEA 2019. 264 | Samuel Bell, Helen Yannakoudakis, Marek Rei.[[pdf](https://aclanthology.org/W19-4410/)]. 265 | 266 | **The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction**. ACL-BEA 2019. 267 | Dimitris Alikaniotis, Vipul Raheja.[[pdf](https://aclanthology.org/W19-4412/)]. 268 | 269 | **(Almost) Unsupervised Grammatical Error Correction using Synthetic Comparable Corpus**. ACL-BEA 2019. 270 | Satoru Katsumata, Mamoru Komachi.[[pdf](https://aclanthology.org/W19-4413/)]. 271 | 272 | **Learning to combine Grammatical Error Corrections**. ACL-BEA 2019. 273 | Yoav Kantor, Yoav Katz, Leshem Choshen, Edo Cohen-Karlik, Naftali Liberman, Assaf Toledo, Amir Menczel, Noam Slonim.[[pdf](https://aclanthology.org/W19-4414/)]. 274 | 275 | **Erroneous data generation for Grammatical Error Correction**. ACL-BEA 2019. 276 | Shuyao Xu, Jiehao Zhang, Jin Chen, Long Qin.[[pdf](https://aclanthology.org/W19-4415/)]. 277 | 278 | **The CUED’s Grammatical Error Correction Systems for BEA-2019**. ACL-BEA 2019. 279 | Felix Stahlberg, Bill Byrne.[[pdf](https://aclanthology.org/W19-4417/)]. 280 | 281 | **CUNI System for the Building Educational Applications 2019 Shared Task: Grammatical Error Correction**. ACL-BEA 2019. 
282 | Jakub Náplava, Milan Straka.[[pdf](https://aclanthology.org/W19-4419/)]. 283 | 284 | **Noisy Channel for Low Resource Grammatical Error Correction**. ACL-BEA 2019. 285 | Simon Flachs, Ophélie Lacroix, Anders Søgaard.[[pdf](https://aclanthology.org/W19-4420/)]. 286 | 287 | **TMU Transformer System Using BERT for Re-ranking at BEA 2019 Grammatical Error Correction on Restricted Track**. ACL-BEA 2019. 288 | Masahiro Kaneko, Kengo Hotate, Satoru Katsumata, Mamoru Komachi.[[pdf](https://aclanthology.org/W19-4422/)]. 289 | 290 | **A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning**. ACL-BEA 2019. 291 | Yo Joong Choe, Jiyeon Ham, Kyubyong Park, Yeoil Yoon.[[pdf](https://aclanthology.org/W19-4423/)]. 292 | 293 | **Neural and FST-based approaches to grammatical error correction**. ACL-BEA 2019. 294 | Zheng Yuan, Felix Stahlberg, Marek Rei, Bill Byrne, Helen Yannakoudakis.[[pdf](https://aclanthology.org/W19-4424/)]. 295 | 296 | **Improving Precision of Grammatical Error Correction with a Cheat Sheet**. ACL-BEA 2019. 297 | Mengyang Qiu, Xuejiao Chen, Maggie Liu, Krishna Parvathala, Apurva Patil, Jungyeul Park.[[pdf](https://aclanthology.org/W19-4425/)]. 298 | 299 | **Multi-headed Architecture Based on BERT for Grammatical Errors Correction**. ACL-BEA 2019. 300 | Bohdan Didenko, Julia Shaptala.[[pdf](https://aclanthology.org/W19-4426/)]. 301 | 302 | **Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data**. ACL-BEA 2019. 303 | Roman Grundkiewicz, Marcin Junczys-Dowmunt, Kenneth Heafield.[[pdf](https://aclanthology.org/W19-4427/)]. 304 | 305 | **The Unbearable Weight of Generating Artificial Errors for Grammatical Error Correction**. ACL-BEA 2019. 306 | Phu Mon Htut, Joel Tetreault.[[pdf](https://aclanthology.org/W19-4449/)]. 307 | 308 | **An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction**. EMNLP 2019. 
309 | Shun Kiyono, Jun Suzuki, Masato Mita, Tomoya Mizumoto, Kentaro Inui.[[pdf](https://arxiv.org/abs/1909.00502)]. 310 | 311 | **Encode, Tag, Realize: High-Precision Text Editing**. EMNLP 2019. 312 | Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn.[[pdf](https://aclanthology.org/D19-1510/)]. 313 | 314 | **Personalizing Grammatical Error Correction: Adaptation to Proficiency Level and L1**. EMNLP 2019. 315 | Maria Nadejde, Joel Tetreault.[[pdf](https://aclanthology.org/D19-5504/)]. 316 | 317 | **Grammatical Error Correction in Low-Resource Scenarios**. EMNLP 2019. 318 | Jakub Náplava, Milan Straka.[[pdf](https://aclanthology.org/D19-5545/)]. 319 | 320 | **Minimally-Augmented Grammatical Error Correction**. EMNLP 2019. 321 | Roman Grundkiewicz, Marcin Junczys-Dowmunt.[[pdf](https://aclanthology.org/D19-5546/)]. 322 | 323 | **Parallel Iterative Edit Models for Local Sequence Transduction**. EMNLP 2019. 324 | Abhijeet Awasthi, Sunita Sarawagi, Rasna Goyal, Sabyasachi Ghosh, Vihari Piratla.[[pdf](https://aclanthology.org/D19-1435/)]. 325 | 326 | **Learning to Copy for Automatic Post-Editing**. EMNLP 2019. 327 | Xuancheng Huang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun.[[pdf](https://aclanthology.org/D19-1634/)]. 328 | 329 | **Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data**. NAACL 2019. 330 | Wei Zhao, Liang Wang, Kewei Shen, Ruoyu Jia, Jingming Liu.[[pdf](https://aclanthology.org/N19-1014/)]. 331 | 332 | **Cross-Corpora Evaluation and Analysis of Grammatical Error Correction Models — Is Single-Corpus Evaluation Enough?**. NAACL 2019. 333 | Masato Mita, Tomoya Mizumoto, Masahiro Kaneko, Ryo Nagata, Kentaro Inui.[[pdf](https://aclanthology.org/N19-1132/)]. 334 | 335 | **Corpora Generation for Grammatical Error Correction**. NAACL 2019. 336 | Jared Lichtarge, Chris Alberti, Shankar Kumar, Noam Shazeer, Niki Parmar, Simon Tong.[[pdf](https://aclanthology.org/N19-1333/)]. 
337 | 338 | **Neural Grammatical Error Correction with Finite State Transducers**. NAACL 2019. 339 | Felix Stahlberg, Christopher Bryant, Bill Byrne.[[pdf](https://aclanthology.org/N19-1406/)]. 340 | 341 | > ### 2018 342 | 343 | **Inherent Biases in Reference-based Evaluation for Grammatical Error Correction**. ACL 2018. 344 | Leshem Choshen, Omri Abend.[[pdf](https://aclanthology.org/P18-1059/)]. 345 | 346 | **Fluency Boost Learning and Inference for Neural Grammatical Error Correction**. ACL 2018. 347 | Tao Ge, Furu Wei, Ming Zhou.[[pdf](https://aclanthology.org/P18-1097/)]. 348 | 349 | **Automatic Metric Validation for Grammatical Error Correction**. ACL 2018. 350 | Leshem Choshen, Omri Abend.[[pdf](https://aclanthology.org/P18-1127/)]. 351 | 352 | **Overview of NLPTEA-2018 Share Task Chinese Grammatical Error Diagnosis**. ACL 2018 NLPTEA. 353 | Gaoqi Rao, Qi Gong, Baolin Zhang, Endong Xun.[[pdf](https://aclanthology.org/W18-3706/)]. 354 | 355 | **Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task**. NAACL 2018. 356 | Marcin Junczys-Dowmunt, Roman Grundkiewicz, Shubha Guha, Kenneth Heafield.[[pdf](https://aclanthology.org/N18-1055/)]. 357 | 358 | **Noising and Denoising Natural Language: Diverse Backtranslation for Grammar Correction**. NAACL 2018. 359 | Ziang Xie, Guillaume Genthial, Stanley Xie, Andrew Ng, Dan Jurafsky.[[pdf](https://aclanthology.org/N18-1057/)]. 360 | 361 | **Reference-less Measure of Faithfulness for Grammatical Error Correction**. NAACL 2018 short. 362 | Leshem Choshen, Omri Abend.[[pdf](https://aclanthology.org/N18-2020/)]. 363 | 364 | **Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation**. NAACL 2018 short. 365 | Roman Grundkiewicz, Marcin Junczys-Dowmunt.[[pdf](https://aclanthology.org/N18-2046/)]. 366 | 367 | **Language Model Based Grammatical Error Correction without Annotated Training Data**. NAACL 2018 BEA. 
368 | Christopher Bryant, Ted Briscoe.[[pdf](https://aclanthology.org/W18-0529/)]. 369 | 370 | **A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction**. AAAI 2018. 371 | Shamil Chollampatt, Hwee Tou Ng.[[pdf](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/17308/16137)]. 372 | 373 | **Neural Quality Estimation of Grammatical Error Correction**. EMNLP 2018. 374 | Shamil Chollampatt, Hwee Tou Ng.[[pdf](https://aclanthology.org/D18-1274/)]. 375 | 376 | **Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection**. EMNLP 2018. 377 | Sudhanshu Kasewa, Pontus Stenetorp, Sebastian Riedel.[[pdf](https://aclanthology.org/D18-1541/)]. 378 | 379 | **Using Wikipedia Edits in Low Resource Grammatical Error Correction**. EMNLP 2018. 380 | Adriane Boyd.[[pdf](https://aclanthology.org/W18-6111/)]. 381 | 382 | **Cool English: a Grammatical Error Correction System Based on Large Learner Corpora**. COLING 2018. 383 | Yu-Chun Lo, Jhih-Jie Chen, Chingyu Yang, Jason Chang.[[pdf](https://aclanthology.org/C18-2018/)]. 384 | 385 | **A Reassessment of Reference-Based Grammatical Error Correction Metrics**. COLING 2018. 386 | Shamil Chollampatt, Hwee Tou Ng.[[pdf](https://aclanthology.org/C18-1231/)]. 387 | 388 | > ### earlier 389 | 390 | **A Nested Attention Neural Hybrid Model for Grammatical Error Correction**. ACL 2017. 391 | Jianshu Ji, Qinlong Wang, Kristina Toutanova, Yongen Gong, Steven Truong, Jianfeng Gao.[[pdf](https://aclanthology.org/P17-1070/)]. 392 | 393 | **Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction**. ACL 2017. 394 | Christopher Bryant, Mariano Felice, Ted Briscoe.[[pdf](https://aclanthology.org/P17-1074/)]. 395 | 396 | **Neural Sequence-Labelling Models for Grammatical Error Correction**. EMNLP 2017. 397 | Helen Yannakoudakis, Marek Rei, Øistein E. Andersen, Zheng Yuan.[[pdf](https://aclanthology.org/D17-1297/)]. 
398 | 399 | **JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction**. EACL 2017. 400 | Courtney Napoles, Keisuke Sakaguchi, Joel Tetreault.[[pdf](https://aclanthology.org/E17-2037/)]. 401 | 402 | **Grammatical Error Detection Using Error- and Grammaticality-Specific Word Embeddings**. IJCNLP 2017. 403 | Masahiro Kaneko, Yuya Sakaizawa, Mamoru Komachi.[[pdf](https://aclanthology.org/I17-1005/)]. 404 | 405 | **Reference-based Metrics can be Replaced with Reference-less Metrics in Evaluating Grammatical Error Correction Systems**. IJCNLP 2017. 406 | Hiroki Asano, Tomoya Mizumoto, Kentaro Inui.[[pdf](https://aclanthology.org/I17-2058/)]. 407 | 408 | **Grammatical Error Correction with Neural Reinforcement Learning**. IJCNLP 2017. 409 | Keisuke Sakaguchi, Matt Post, Benjamin Van Durme.[[pdf](https://aclanthology.org/I17-2062/)]. 410 | 411 | **Grammatical Error Correction: Machine Translation and Classifiers**. ACL 2016. 412 | Alla Rozovskaya, Dan Roth.[[pdf](https://aclanthology.org/P16-1208/)]. 413 | 414 | **Compositional Sequence Labeling Models for Error Detection in Learner Writing**. ACL 2016. 415 | Marek Rei, Helen Yannakoudakis.[[pdf](https://aclanthology.org/P16-1112/)]. 416 | 417 | **Grammatical error correction using neural machine translation**. NAACL 2016 short. 418 | Zheng Yuan, Ted Briscoe.[[pdf](https://aclanthology.org/N16-1042/)]. 419 | 420 | **Discriminative Reranking for Grammatical Error Correction with Statistical Machine Translation**. NAACL 2016 short. 421 | Tomoya Mizumoto, Yuji Matsumoto.[[pdf](https://aclanthology.org/N16-1133/)]. 422 | 423 | **Phrase-based Machine Translation is State-of-the-Art for Automatic Grammatical Error Correction**. EMNLP 2016. 424 | Marcin Junczys-Dowmunt, Roman Grundkiewicz.[[pdf](https://aclanthology.org/D16-1161/)]. 425 | 426 | **Adapting Grammatical Error Correction Based on the Native Language of Writers with Neural Network Joint Models**. EMNLP 2016. 
427 | Shamil Chollampatt, Duc Tam Hoang, Hwee Tou Ng.[[pdf](https://aclanthology.org/D16-1195/)]. 428 | 429 | **There’s No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction**. EMNLP 2016. 430 | Courtney Napoles, Keisuke Sakaguchi, Joel Tetreault.[[pdf](https://aclanthology.org/D16-1228/)]. 431 | 432 | **Chinese Preposition Selection for Grammatical Error Diagnosis**. COLING 2016. 433 | Hen-Hsen Huang, Yen-Chi Shao, Hsin-Hsi Chen.[[pdf](https://aclanthology.org/C16-1085/)]. 434 | 435 | **Neural Network Translation Models for Grammatical Error Correction**. IJCAI 2016. 436 | S Chollampatt, K Taghipour, HT Ng.[[pdf](https://arxiv.org/pdf/1606.00189.pdf)]. 437 | 438 | **Exploiting N-Best Hypotheses to Improve an SMT Approach to Grammatical Error Correction**. IJCAI 2016. 439 | DT Hoang, S Chollampatt, HT Ng.[[pdf](http://www.researchgate.net/publication/303749890_Exploiting_N-Best_Hypotheses_to_Improve_an_SMT_Approach_to_Grammatical_Error_Correction)]. 440 | 441 | **How Far are We from Fully Automatic High Quality Grammatical Error Correction?**. ACL 2015. 442 | Christopher Bryant, Hwee Tou Ng.[[pdf](https://aclanthology.org/P15-1068/)]. 443 | 444 | **Ground Truth for Grammatical Error Correction Metrics**. ACL 2015 short. 445 | Courtney Napoles, Keisuke Sakaguchi, Matt Post, Joel Tetreault.[[pdf](https://aclanthology.org/P15-2097/)]. 446 | 447 | **Towards a standard evaluation method for grammatical error detection and correction**. NAACL 2015. 448 | Mariano Felice, Ted Briscoe.[[pdf](https://aclanthology.org/N15-1060/)]. 449 | 450 | **Human Evaluation of Grammatical Error Correction Systems**. EMNLP 2015. 451 | Roman Grundkiewicz, Marcin Junczys-Dowmunt, Edward Gillian.[[pdf](https://aclanthology.org/D15-1052/)]. 452 | 453 | **Ground Truth for Grammatical Error Correction Metrics**. IJCNLP 2015. 454 | Courtney Napoles, Keisuke Sakaguchi, Matt Post, Joel Tetreault.[[pdf](https://aclanthology.org/P15-2097/)].
455 | 456 | **Go Climb a Dependency Tree and Correct the Grammatical Errors**. EMNLP 2014. 457 | Longkai Zhang, Houfeng Wang.[[pdf](https://aclanthology.org/D14-1033/)]. 458 | 459 | **System Combination for Grammatical Error Correction**. EMNLP 2014. 460 | Raymond Hendy Susanto, Peter Phandi, Hwee Tou Ng.[[pdf](https://aclanthology.org/D14-1102/)]. 461 | 462 | **Data Driven Grammatical Error Detection in Transcripts of Children’s Speech**. EMNLP 2014. 463 | Eric Morley, Anna Eva Hallin, Brian Roark.[[pdf](https://aclanthology.org/D14-1106/)]. 464 | 465 | **Generating artificial errors for grammatical error correction**. EACL 2014. 466 | Mariano Felice, Zheng Yuan.[[pdf](https://aclanthology.org/E14-3013/)]. 467 | 468 | **Correcting Grammatical Verb Errors**. EACL 2014. 469 | Alla Rozovskaya, Dan Roth, Vivek Srikumar.[[pdf](https://aclanthology.org/E14-1038/)]. 470 | 471 | **Detecting Learner Errors in the Choice of Content Words Using Compositional Distributional Semantics**. COLING 2014. 472 | Ekaterina Kochmar, Ted Briscoe.[[pdf](https://aclanthology.org/C14-1164/)]. 473 | 474 | **A Sentence Judgment System for Grammatical Error Detection**. COLING 2014. 475 | Lung-Hao Lee, Liang-Chih Yu, Kuei-Ching Lee, Yuen-Hsien Tseng, Li-Ping Chang, Hsin-Hsi Chen.[[pdf](https://aclanthology.org/C14-2015/)]. 476 | 477 | **Automated Grammatical Error Correction for Language Learners**. COLING 2014. 478 | Joel Tetreault, Claudia Leacock.[[pdf](https://aclanthology.org/C14-3004/)]. 479 | 480 | **Grammatical Error Correction Using Integer Linear Programming**. ACL 2013. 481 | Yuanbin Wu, Hwee Tou Ng.[[pdf](https://aclanthology.org/P13-1143/)]. 482 | 483 | **Joint Learning and Inference for Grammatical Error Correction**. EMNLP 2013. 484 | Alla Rozovskaya, Dan Roth.[[pdf](https://aclanthology.org/D13-1074/)]. 485 | 486 | **Automated Grammar Correction Using Hierarchical Phrase-Based Statistical Machine Translation**. IJCNLP 2013. 
487 | Bibek Behera, Pushpak Bhattacharyya.[[pdf](https://aclanthology.org/I13-1122/)]. 488 | 489 | **Grammatical Error Correction Using Feature Selection and Confidence Tuning**. IJCNLP 2013. 490 | Yang Xiang, Yaoyun Zhang, Xiaolong Wang, Chongqiang Wei, Wen Zheng, Xiaoqiang Zhou, Yuxiu Hu, Yang Qin.[[pdf](https://aclanthology.org/I13-1148/)]. 491 | 492 | **A Meta Learning Approach to Grammatical Error Correction**. ACL 2012. 493 | Hongsuck Seo, Jonghoon Lee, Seokhwan Kim, Kyusong Lee, Sechun Kang, Gary Geunbae Lee.[[pdf](https://aclanthology.org/P12-2064/)]. 494 | 495 | **Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation**. ACL 2012. 496 | Kenji Imamura, Kuniko Saito, Kugatsu Sadamitsu, Hitoshi Nishikawa.[[pdf]()]. 497 | 498 | **Better Evaluation for Grammatical Error Correction**. NAACL 2012 short. 499 | Daniel Dahlmeier, Hwee Tou Ng.[[pdf](https://aclanthology.org/N12-1067/)]. 500 | 501 | **A Beam-Search Decoder for Grammatical Error Correction**. EMNLP 2012. 502 | Daniel Dahlmeier, Hwee Tou Ng.[[pdf](https://aclanthology.org/D12-1052/)]. 503 | 504 | **Problems in Evaluating Grammatical Error Detection Systems**. COLING 2012. 505 | Martin Chodorow, Markus Dickinson, Ross Israel, Joel Tetreault.[[pdf](https://aclanthology.org/C12-1038/)]. 506 | 507 | **The Effect of Learner Corpus Size in Grammatical Error Correction of ESL Writings**. COLING 2012. 508 | Tomoya Mizumoto, Yuta Hayashibe, Mamoru Komachi, Masaaki Nagata, Yuji Matsumoto.[[pdf](https://aclanthology.org/C12-2084/)]. 509 | 510 | **They Can Help: Using Crowdsourcing to Improve the Evaluation of Grammatical Error Detection Systems**. ACL 2011. 511 | Nitin Madnani, Martin Chodorow, Joel Tetreault, Alla Rozovskaya.[[pdf](https://aclanthology.org/P11-2089/)]. 512 | 513 | **Grammatical Error Correction with Alternating Structure Optimization**. ACL 2011. 514 | Daniel Dahlmeier, Hwee Tou Ng.[[pdf](https://aclanthology.org/P11-1092/)]. 
515 | 516 | **Automated Whole Sentence Grammar Correction Using a Noisy Channel Model**. ACL 2011. 517 | Y. Albert Park, Roger Levy.[[pdf](https://aclanthology.org/P11-1094/)]. 518 | 519 | **Grammatical Error Detection for Corrective Feedback Provision in Oral Conversations**. AAAI 2011. 520 | Sungjin Lee, Hyungjong Noh, Kyusong Lee, Gary Geunbae Lee.[[pdf](https://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/view/3501/3954)]. 521 | 522 | **Evaluating performance of grammatical error detection to maximize learning effect**. COLING 2010. 523 | Ryo Nagata, Kazuhide Nakatani.[[pdf](https://aclanthology.org/C10-2103/)]. 524 | 525 | 526 | 527 | ## Datasets 528 | 529 | ### [CCTC: A Cross-Sentence Chinese Text Correction Dataset for Native Speakers](https://aclanthology.org/2022.coling-1.294.pdf) 530 | | dataset | download | 531 | |-|-| 532 | |CCTC |[download](https://pan.baidu.com/s/1FzWzlshVWBVqZvLt07zqiQ) (password: 45ok)| 533 | 534 | 535 | ### [CTC 2021](https://destwang.github.io/CTC2021-explorer/) 536 | | dataset | download | 537 | |-|-| 538 | |training data|[download](https://pan.baidu.com/s/14lSJquTV4eZBGYlnr5Oq3Q) (password: 1yie)| 539 | |validation data|[download](https://pan.baidu.com/s/1dnpCxGK0m8v-R-wMpYH2kQ) (password: asrb)| 540 | 541 | ### Others: 542 | | dataset | task | # sents | source | language | 543 | |-|-|-|-|-| 544 | | SIGHAN 2013 | CSC | 350 & 974 | SIGHAN | Zh | 545 | | SIGHAN 2014 | CSC | 6,526 & 526 | SIGHAN | Zh | 546 | | SIGHAN 2015 | CSC | 3,174 & 550 | SIGHAN | Zh | 547 | |[OCR dataset](https://github.com/iqiyi/FASPell)|CSC|4575|[FASPell(iqiyi)](https://www.aclweb.org/anthology/D19-5522.pdf)|Zh| 548 | | [HybridSet](https://github.com/wdimmy/Automatic-Corpus-Generation) | CSC | 270K | - | Zh | 549 | | NLPCC 2018 GEC | GEC | - | NLPCC | Zh | 550 | | CGED | GED | - | HSK | Zh | 551 | | CoNLL 2013 | GEC | 1,381 | CONLL | En | 552 | | CoNLL 2014 | GEC | 1,312 | CONLL | En | 553 | | JFLEG | GEC | 747 | [JFLEG: A Fluency Corpus and Benchmark for 
Grammatical Error Correction](https://aclanthology.org/E17-2037.pdf) | En | 554 | | NUCLE | GEC | 57k | [Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English](https://aclanthology.org/W13-1703.pdf) | En | 555 | | Lang-8 | GEC | 1M+ | Lang-8 | En | 556 | | Write&Improve+LOCNESS | GEC | 63,683 & 7,632 | - | En | 557 | | MMC+PsyTAR (medical) | | 512 & 79 | - | En | 558 | | birkbeck+holbrook-tagged+holbrook-missp+aspell+wikipedia | (Misspelled words) | 36133/6136 & 1791/1200 & 531/450 & 2455/1922 | [BBK](https://www.dcs.bbk.ac.uk/~ROGER/corpora.htm) | En | 559 | | [TOEFL-Spell](https://github.com/EducationalTestingService/toefl-spell) | - | - | [A Benchmark Corpus of English Misspellings and a Minimally-supervised Model for Spelling Correction](https://www.aclweb.org/anthology/W19-4407.pdf) | En | 560 | | [NUC-GEC](https://www.comp.nus.edu.sg/~nlp/corpora.html) | GEC | 500 essays | [How Far are We from Fully Automatic High Quality Grammatical Error Correction?](https://www.aclweb.org/anthology/P15-1068.pdf) | En | 561 | | [BEA2019](https://www.cl.cam.ac.uk/research/nl/bea2019st/) | GEC | 34,308 | BEA | En | 562 | | PIE-synthetic | GEC | 9,000,000 | [Parallel iterative edit models for local sequence transduction](https://aclanthology.org/D19-1435/) | En | 563 | | [clang8](https://github.com/google-research-datasets/clang8) | GEC | 2,372,119 & 114,405 & 44,830 | - | En, De, Ru | 564 | | CTC2021 | CSC | 217,634 | - | Zh | 565 | 566 | 567 | ## Systems & API 568 | Feiying System: http://check.hfl-rc.com/ 569 | Feiying API: https://www.xfyun.cn/services/textCorrection 570 | 571 | 572 | ## Other Resources 573 | ### Related Articles 574 | * [The State of Research on Grammatical Error Correction (语法纠错的研究现状)](https://mp.weixin.qq.com/s/0_qp1WsrEsjnj8ST4zQyTQ) 575 | * [An Incomplete Survey of Text Grammatical Error Correction (文本语法纠错不完全调研)](https://mp.weixin.qq.com/s/Dj8KIe6LbVGonV-Kk9mO2Q) 576 | 577 | ### Shared Task 578 | * [CTC 2021](https://github.com/destwang/CTC2021) 579 | ----- 580 | *The above resources are for academic research only.
If any of them infringes copyright, please contact us and we will remove it.* 581 | --------------------------------------------------------------------------------