├── .github
│   └── ISSUE_TEMPLATE
│       └── add_paper.md
├── LICENSE
└── README.md

/.github/ISSUE_TEMPLATE/add_paper.md:
--------------------------------------------------------------------------------
---
name: "Paper Review 🔬"
about: Add a review of a new paper.
title: Please preserve the original paper's title as much as possible!
labels: "paper"
assignees: ""
---

## What is this paper about? 👋

Briefly describe what the paper covers! (One or two short lines is fine!)

## Abstract 🕵🏻‍♂️

Paste the paper's original abstract!

## Tell us what we can learn from this paper! 🤔

What knowledge can we gain by reading this paper thoroughly?

## Are there any related articles or issues worth reading alongside it?

If so, feel free to list them!

## Share the reference URL! 🔗

Don't shorten it with markdown; paste the original link as-is!

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2020 modulabs

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# beyondBERT
A repository for organizing the discussion notes of beyondBERT (cohort 11.5).

## Process
- One host leads the session for each paper.
- The host summarizes the assigned paper, focusing on its main ideas.
- As preparation, each participant prepares at least one question on that week's papers and posts it in the thread of the paper's issue.
- Participants give a thumbs-up to questions in the thread, and the session focuses on discussing and resolving the questions with the most thumbs-ups.

## Papers by week
### week01
- ice breaking
- deciding how to run the sessions
### week02
- [The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives](https://arxiv.org/abs/1909.01380)
- [How multilingual is Multilingual BERT?](https://arxiv.org/abs/1906.01502)
### week03
- [ALBERT: A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942)
- [ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://arxiv.org/abs/2003.10555)
### week04
- [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461)
- [Data Augmentation using Pre-trained Transformer Models](https://arxiv.org/abs/2003.02245)
### week05
- [Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451)
- [Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150)
### week06
- [Mask-Predict: Parallel Decoding of Conditional Masked Language Models](https://arxiv.org/abs/1904.09324)
- [Unsupervised Data Augmentation for Consistency Training](https://arxiv.org/abs/1904.12848)
### week07
- [You Impress Me: Dialogue Generation via Mutual Persona Perception](https://arxiv.org/abs/2004.05388)
- [Recipes for building an open-domain chatbot](https://arxiv.org/abs/2004.13637)
### week08
- [ToD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogues](https://arxiv.org/abs/2004.06871)
- [A Simple Language Model for Task-Oriented Dialogue](https://arxiv.org/abs/2005.00796)
### week09
- [ReCoSa: Detecting the Relevant Contexts with Self-Attention for Multi-turn Dialogue Generation](https://arxiv.org/abs/1907.05339)
- [FastBERT: a Self-distilling BERT with Adaptive Inference Time](https://arxiv.org/abs/2004.02178)
### week10
- [PoWER-BERT: Accelerating BERT inference for Classification Tasks](https://arxiv.org/abs/2001.08950)
- [TinyBERT: Distilling BERT for Natural Language Understanding](https://arxiv.org/abs/1909.10351)
### week11
- [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683)
- [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
--------------------------------------------------------------------------------