├── NAR-ACL 2022.pdf
├── README.md
└── index.html

/NAR-ACL 2022.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NAR-tutorial/acl2022/e3b0c43e7226f6ef7beeffddfa2ffd117cc53419/NAR-ACL 2022.pdf
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# Non-Autoregressive Sequence Generation
Tutorial @ [ACL 2022](https://www.2022.aclweb.org/tutorials), May 22, 2022

## Speakers
[Jiatao Gu](https://jiataogu.me/), Facebook AI Research, jgu@fb.com
[Xu Tan](https://www.microsoft.com/en-us/research/people/xuta/), Microsoft Research Asia, xuta@microsoft.com


## Abstract
Non-autoregressive sequence generation (NAR) attempts to generate the entire or partial output sequences in parallel to speed up the generation process and avoid potential issues (e.g., label bias, exposure bias) in autoregressive generation. While it has received much research attention and has been applied in many sequence generation tasks in natural language and speech, naive NAR models still face many challenges in closing the performance gap with state-of-the-art autoregressive models because of a lack of modeling power. In this tutorial, we will provide a thorough introduction and review of non-autoregressive sequence generation in four sections: 1) Background, which covers the motivation of NAR generation, the problem definition, the evaluation protocol, and the comparison with standard autoregressive generation approaches. 2) Method, which includes different aspects: model architecture, objective function, training data, learning paradigm, and additional inference tricks. 3) Application, which covers different tasks in text and speech generation, and some advanced topics in applications. 4) Conclusion, in which we describe several research challenges and discuss potential future research directions. We hope this tutorial can serve both academic researchers and industry practitioners working on non-autoregressive sequence generation.


## Outline

PART 1 Introduction (~ 20 minutes)
  1.1 Problem definition
  1.2 Evaluation protocol
  1.3 Multi-modality problem

PART 2 Methods (~ 80 minutes)
  2.1 Model architectures
    2.1.1 Fully NAR models
    2.1.2 Iteration-based NAR models
    2.1.3 Partially NAR models
    2.1.4 Locally AR models
    2.1.5 NAR models with latent variables
  2.2 Objective functions
    2.2.1 Loss with latent variables
    2.2.2 Loss beyond token level
  2.3 Training data
  2.4 Learning paradigms
    2.4.1 Curriculum learning
    2.4.2 Self-supervised pre-training
  2.5 Inference methods and tricks

PART 3 Applications (~ 60 minutes)
  3.1 Task overview in text/speech/image generation
  3.2 NAR generation tasks
    3.2.1 Neural machine translation
    3.2.2 Text error correction
    3.2.3 Automatic speech recognition
    3.2.4 Text-to-speech / singing voice synthesis
    3.2.5 Image (pixel/token) generation
  3.3 Summary of NAR Applications
    3.3.1 Benefits of NAR for different tasks
    3.3.2 Addressing target-target/source dependency
    3.3.3 Data difficulty vs. model capacity
    3.3.4 Streaming vs. NAR, AR vs. iterative NAR

PART 4 Open problems, future directions, Q&A (~ 20 minutes)

## Materials
[Slides](https://nar-tutorial.github.io/acl2022/NAR-ACL%202022.pdf)
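
## AR vs. NAR decoding (illustrative sketch)

The abstract contrasts autoregressive (AR) generation, which emits one token per sequential step, with NAR generation, which fills all output positions in parallel. The toy Python sketch below illustrates only that control-flow difference; `toy_model`, `ar_decode`, and `nar_decode` are hypothetical stand-ins invented for this illustration, not code from the tutorial or from any NAR toolkit.

```python
from typing import List

def toy_model(src: List[str], prefix: List[str], position: int) -> str:
    """Hypothetical stand-in for a trained sequence model: it ignores its
    inputs and simply returns a canned token for each output position."""
    canned = ["hello", "world", "!"]
    return canned[position] if position < len(canned) else "<eos>"

def ar_decode(src: List[str], max_len: int = 8) -> List[str]:
    # Autoregressive decoding: one model call per output token, strictly
    # sequential, because step t conditions on the t tokens generated so far.
    out: List[str] = []
    for t in range(max_len):
        tok = toy_model(src, out, t)
        if tok == "<eos>":
            break
        out.append(tok)
    return out

def nar_decode(src: List[str], target_len: int) -> List[str]:
    # Non-autoregressive decoding: the output length is decided first (here it
    # is simply given), then every position is predicted without looking at
    # the other output tokens, so all positions could be computed in parallel
    # within a single forward pass.
    return [toy_model(src, [], t) for t in range(target_len)]

if __name__ == "__main__":
    src = ["bonjour", "le", "monde"]
    print(ar_decode(src))                  # ['hello', 'world', '!']
    print(nar_decode(src, target_len=3))   # ['hello', 'world', '!']
```

Because NAR positions are predicted independently, tokens from several equally valid outputs can be mixed within one hypothesis; this is the multi-modality problem listed in Part 1 of the outline, and the iterative and partially NAR models in Part 2 reintroduce some output-side dependency to address it.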

--------------------------------------------------------------------------------

/index.html:
--------------------------------------------------------------------------------
Non-Autoregressive Sequence Generation | Tutorial @ ACL 2022

Non-Autoregressive Sequence Generation
Tutorial @ ACL 2022, May 22, 2022

Speakers

Jiatao Gu, Facebook AI Research, jgu@fb.com
Xu Tan, Microsoft Research Asia, xuta@microsoft.com

Abstract

Non-autoregressive sequence generation (NAR) attempts to generate the entire or partial output sequences in parallel to speed up the generation process and avoid potential issues (e.g., label bias, exposure bias) in autoregressive generation. While it has received much research attention and has been applied in many sequence generation tasks in natural language and speech, naive NAR models still face many challenges in closing the performance gap with state-of-the-art autoregressive models because of a lack of modeling power. In this tutorial, we will provide a thorough introduction and review of non-autoregressive sequence generation in four sections: 1) Background, which covers the motivation of NAR generation, the problem definition, the evaluation protocol, and the comparison with standard autoregressive generation approaches. 2) Method, which includes different aspects: model architecture, objective function, training data, learning paradigm, and additional inference tricks. 3) Application, which covers different tasks in text and speech generation, and some advanced topics in applications. 4) Conclusion, in which we describe several research challenges and discuss potential future research directions. We hope this tutorial can serve both academic researchers and industry practitioners working on non-autoregressive sequence generation.

Outline

1. Introduction (~ 20 minutes)
   1.1 Problem definition
   1.2 Evaluation protocol
   1.3 Multi-modality problem

2. Methods (~ 80 minutes)
   2.1 Model architectures
       2.1.1 Fully NAR models
       2.1.2 Iteration-based NAR models
       2.1.3 Partially NAR models
       2.1.4 Locally AR models
       2.1.5 NAR models with latent variables
   2.2 Objective functions
       2.2.1 Loss with latent variables
       2.2.2 Loss beyond token level
   2.3 Training data
   2.4 Learning paradigms
       2.4.1 Curriculum learning
       2.4.2 Self-supervised pre-training
   2.5 Inference methods and tricks

3. Applications (~ 60 minutes)
   3.1 Task overview in text/speech/image generation
   3.2 NAR generation tasks
       3.2.1 Neural machine translation
       3.2.2 Text error correction
       3.2.3 Automatic speech recognition
       3.2.4 Text-to-speech / singing voice synthesis
       3.2.5 Image (pixel/token) generation
   3.3 Summary of NAR Applications
       3.3.1 Benefits of NAR for different tasks
       3.3.2 Addressing target-target/source dependency
       3.3.3 Data difficulty vs. model capacity
       3.3.4 Streaming vs. NAR, AR vs. iterative NAR

4. Open problems, future directions, Q&A (~ 20 minutes)

Materials

Slides

--------------------------------------------------------------------------------