└── README.md
/README.md:
--------------------------------------------------------------------------------
1 | # Awesome Multi-task Learning
2 |
3 | Feel free to contact me or contribute if you find that an interesting paper is missing!
4 |
5 | ## Table of Contents
6 |
7 | - [Survey & Study](#survey--study)
8 | - [Benchmarks & Code](#benchmarks--code)
9 | - [Papers](#papers)
10 | - [Awesome Multi-domain Multi-task Learning](#awesome-multi-domain-multi-task-learning)
11 | - [Workshops](#workshops)
12 | - [Online Courses](#online-courses)
13 | - [Related awesome list](#related-awesome-list)
14 |
15 | ## Survey & Study
16 |
17 | * Revisit the Imbalance Optimization in Multi-task Learning: An Experimental Analysis (arXiv, 2025) [[paper](https://arxiv.org/abs/2509.23915)]
18 |
19 | * Unleashing the Power of Multi-Task Learning: A Comprehensive Survey Spanning Traditional, Deep, and Pretrained Foundation Model Eras (arXiv, 2024) [[paper](https://arxiv.org/pdf/2404.18961)] [[code](https://github.com/junfish/Awesome-Multitask-Learning)]
20 |
21 | * A Survey on Mixture of Experts (arXiv, 2024) [[paper](https://arxiv.org/pdf/2407.06204)] [[code](https://github.com/withinmiaov/A-Survey-on-Mixture-of-Experts)]
22 |
23 | * Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types (TPAMI, 2022) [[paper](https://arxiv.org/pdf/2103.13318.pdf)]
24 |
25 | * Multi-Task Learning for Dense Prediction Tasks: A Survey (TPAMI, 2021) [[paper](https://arxiv.org/abs/2004.13379)] [[code](https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorch)]
26 |
27 | * A Survey on Multi-Task Learning (TKDE, 2021) [[paper](https://ieeexplore.ieee.org/abstract/document/9392366)]
28 |
29 | * Multi-Task Learning with Deep Neural Networks: A Survey (arXiv, 2020) [[paper](http://arxiv.org/abs/2009.09796)]
30 |
31 | * Taskonomy: Disentangling Task Transfer Learning (CVPR, 2018, **Best Paper**) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Zamir_Taskonomy_Disentangling_Task_CVPR_2018_paper.pdf)] [[dataset](http://taskonomy.stanford.edu/)]
32 |
33 | * A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks (IEEE Access, 2019) [[paper](https://ieeexplore.ieee.org/document/8848395)]
34 |
35 | * An Overview of Multi-Task Learning in Deep Neural Networks (arXiv, 2017) [[paper](http://arxiv.org/abs/1706.05098)]
36 |
37 | ## Benchmarks & Code
38 |
39 | Benchmarks
40 |
41 | ### Dense Prediction Tasks
42 |
43 | * **[NYUv2]** Indoor Segmentation and Support Inference from RGBD Images (ECCV, 2012) [[paper](https://cs.nyu.edu/~silberman/papers/indoor_seg_support.pdf)] [[dataset](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html)]
44 |
45 | * **[Cityscapes]** The Cityscapes Dataset for Semantic Urban Scene Understanding (CVPR, 2016) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7780719)] [[dataset](https://www.cityscapes-dataset.com/)]
46 |
47 | * **[PASCAL-Context]** The Role of Context for Object Detection and Semantic Segmentation in the Wild (CVPR, 2014) [[paper](https://cs.stanford.edu/~roozbeh/pascal-context/mottaghi_et_al_cvpr14.pdf)] [[dataset](https://cs.stanford.edu/~roozbeh/pascal-context/)]
48 |
49 | * **[Taskonomy]** Taskonomy: Disentangling Task Transfer Learning (CVPR, 2018, **Best Paper**) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Zamir_Taskonomy_Disentangling_Task_CVPR_2018_paper.pdf)] [[dataset](http://taskonomy.stanford.edu/)]
50 |
51 | * **[KITTI]** Vision meets robotics: The KITTI dataset (IJRR, 2013) [[paper](http://www.cvlibs.net/publications/Geiger2013IJRR.pdf)] [[dataset](http://www.cvlibs.net/datasets/kitti/)]
52 |
53 | * **[SUN RGB-D]** SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite (CVPR, 2015) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7298655)] [[dataset](https://rgbd.cs.princeton.edu)]
54 |
55 | * **[BDD100K]** BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning (CVPR, 2020) [[paper](https://openaccess.thecvf.com/content_CVPR_2020/papers/Yu_BDD100K_A_Diverse_Driving_Dataset_for_Heterogeneous_Multitask_Learning_CVPR_2020_paper.pdf)] [[dataset](https://bdd-data.berkeley.edu/)]
56 |
57 | * **[Omnidata]** Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans (ICCV, 2021) [[paper](https://arxiv.org/pdf/2110.04994.pdf)] [[project](https://omnidata.vision)]
58 |
59 | * **[Cityscapes-3D]** Joint 2D-3D Multi-task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation [[dataset and code](https://github.com/prismformore/Multi-Task-Transformer/tree/main/TaskPrompter)]
60 |
61 | ### Image Classification
62 |
63 | * **[Meta-dataset]** Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples (ICLR, 2020) [[paper](https://openreview.net/pdf?id=rkgAGAVKPr)] [[dataset](https://github.com/google-research/meta-dataset)]
64 |
65 | * **[Visual Domain Decathlon]** Learning multiple visual domains with residual adapters (NeurIPS, 2017) [[paper](https://arxiv.org/abs/1705.08045)] [[dataset](https://www.robots.ox.ac.uk/~vgg/decathlon/)]
66 |
67 | * **[CelebA]** Deep Learning Face Attributes in the Wild (ICCV, 2015) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7410782)] [[dataset](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html)]
68 |
69 |
70 |
71 |
72 | Code
73 |
74 | * [[TorchJD](https://github.com/TorchJD/torchjd)]: A library for multi-objective optimization (focusing on gradient combination) of PyTorch models.
75 |
76 | * [[Multi-Task-Transformer](https://github.com/prismformore/Multi-Task-Transformer)]: Transformer for Multi-task Learning including dense prediction problems and 3D detection on Cityscapes.
77 |
78 | * [[Multi-Task-Learning-PyTorch](https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorch)]: Multi-task Dense Prediction.
79 |
80 | * [[Auto-λ](https://github.com/lorenmt/auto-lambda)]: Multi-task Dense Prediction, Robotics.
81 |
82 | * [[UniversalRepresentations](https://github.com/VICO-UoE/UniversalRepresentations)]: [Multi-task Dense Prediction](https://github.com/VICO-UoE/UniversalRepresentations/tree/main/DensePred) (including different loss weighting strategies), [Multi-domain Classification](https://github.com/VICO-UoE/UniversalRepresentations/tree/main/VisualDecathlon), [Cross-domain Few-shot Learning](https://github.com/VICO-UoE/URL).
83 |
84 | * [[MTAN](https://github.com/lorenmt/mtan)]: Multi-task Dense Prediction, Multi-domain Classification.
85 |
86 | * [[ASTMT](https://github.com/facebookresearch/astmt)]: Multi-task Dense Prediction.
87 |
88 | * [[LibMTL](https://github.com/median-research-group/libmtl)]: Multi-task Dense Prediction.
89 |
90 | * [[MTPSL](https://github.com/VICO-UoE/MTPSL)]: Multi-task Partially-supervised Learning for Dense Prediction.
91 |
92 | * [[Residual Adapter](https://github.com/srebuffi/residual_adapters)]: Multi-domain Classification.
93 |
94 |
95 | ## Papers
96 |
97 | ### 2025
98 |
99 | * Revisit the Imbalance Optimization in Multi-task Learning: An Experimental Analysis (arXiv, 2025) [[paper](https://arxiv.org/abs/2509.23915)]
100 | * Beyond Losses Reweighting: Empowering Multi-Task Learning via the Generalization Perspective (ICCV Highlight, 2025) [[paper](https://arxiv.org/abs/2211.13723)] [[code](https://github.com/VietHoang1512/FS-MTL)]
101 | * Jacobian Descent for Multi-Objective Optimization (arXiv, 2025) [[paper](https://arxiv.org/abs/2406.16232)]
102 |
103 | ### 2024
104 |
105 | * Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models (EMNLP, 2024) [[paper](https://aclanthology.org/2024.emnlp-main.847.pdf)]
106 |
107 | * MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders (ECCV, 2024) [[paper](https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/08907.pdf)] [[code](https://github.com/EnVision-Research/MTMamba)]
108 |
109 | * Learning Representation for Multitask Learning through Self-Supervised Auxiliary Learning (ECCV, 2024) [[paper](https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/10369.pdf)]
110 |
111 | * Fair Resource Allocation in Multi-Task Learning (ICML, 2024) [[paper](https://arxiv.org/abs/2402.15638)] [[code](https://github.com/OptMN-Lab/fairgrad)]
112 |
113 | * Bayesian Uncertainty for Gradient Aggregation in Multi-Task Learning (ICML, 2024) [[paper](https://arxiv.org/pdf/2402.04005)] [[code](https://github.com/ssi-research/BayesAgg_MTL)]
114 |
115 | * Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning (arXiv, 2024) [[paper](https://arxiv.org/pdf/2403.10252.pdf)]
116 |
117 | * Multi-Task Dense Prediction via Mixture of Low-Rank Experts (CVPR, 2024) [[paper](https://arxiv.org/abs/2403.17749)] [[code](https://github.com/YuqiYang213/MLoRE)]
118 |
119 | * Joint-Task Regularization for Partially Labeled Multi-Task Learning (CVPR, 2024) [[paper](https://arxiv.org/pdf/2404.01976v1.pdf)] [[code](https://github.com/KentoNishi/JTR-CVPR-2024)]
120 |
121 | * MTLoRA: Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR, 2024) [[paper](https://arxiv.org/pdf/2403.20320)] [[code](https://github.com/scale-lab/MTLoRA)]
122 |
123 | * FedHCA2: Towards Hetero-Client Federated Multi-Task Learning (CVPR, 2024) [[paper](https://arxiv.org/pdf/2311.13250)] [[code](https://github.com/innovator-zero/FedHCA2)]
124 |
125 | * Going Beyond Multi-Task Dense Prediction with Synergy Embedding Models (CVPR, 2024) [[paper](https://openaccess.thecvf.com/content/CVPR2024/papers/Huang_Going_Beyond_Multi-Task_Dense_Prediction_with_Synergy_Embedding_Models_CVPR_2024_paper.pdf)]
126 |
127 | * Efficient Multitask Dense Predictor via Binarization (CVPR, 2024) [[paper](https://openaccess.thecvf.com/content/CVPR2024/papers/Shang_Efficient_Multitask_Dense_Predictor_via_Binarization_CVPR_2024_paper.pdf)]
128 |
129 | * DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data (CVPR, 2024) [[paper](https://arxiv.org/abs/2403.15389)] [[code](https://github.com/prismformore/DiffusionMTL)]
130 |
133 | * Representation Surgery for Multi-Task Model Merging (arXiv, 2024) [[paper](https://arxiv.org/pdf/2402.02705.pdf)] [[code](https://github.com/EnnengYang/RepresentationSurgery)]
134 |
135 | * Multi-task Learning with 3D-Aware Regularization (ICLR, 2024) [[paper](https://openreview.net/attachment?id=TwBY17Hgiy&name=pdf)] [[code](https://github.com/VICO-UoE/MTPSL)]
136 |
137 | * AdaMerging: Adaptive Model Merging for Multi-Task Learning (ICLR, 2024) [[paper](https://openreview.net/attachment?id=nZP6NgD3QY&name=pdf)] [[code](https://github.com/EnnengYang/AdaMerging)]
138 |
139 | * Merging Multi-Task Models via Weight-Ensembling Mixture of Experts (ICLR, 2024) [[paper](https://openreview.net/pdf?id=nLRKnO74RB)]
140 |
141 | * ZipIt! Merging Models from Different Tasks without Training (ICLR, 2024) [[paper](https://openreview.net/attachment?id=LEYUkvdUhq&name=pdf)] [[code](https://github.com/gstoica27/ZipIt)]
142 |
143 | * Denoising Task Routing for Diffusion Models (ICLR, 2024) [[paper](https://openreview.net/attachment?id=MY0qlcFcUg&name=pdf)] [[code](https://byeongjun-park.github.io/DTR/)]
144 |
145 | * Active Learning with Task Consistency and Diversity in Multi-Task Networks (WACV, 2024) [[paper](https://openaccess.thecvf.com/content/WACV2024/papers/Hekimoglu_Active_Learning_With_Task_Consistency_and_Diversity_in_Multi-Task_Networks_WACV_2024_paper.pdf)] [[code](https://github.com/aralhekimoglu/mtal)]
146 |
147 | ### 2023
148 |
149 | * Direction-oriented Multi-objective Learning: Simple and Provable Stochastic Algorithms (NeurIPS, 2023) [[paper](https://arxiv.org/abs/2305.18409)] [[code](https://github.com/OptMN-Lab/sdmgrad)]
150 |
151 | * Addressing Negative Transfer in Diffusion Models (NeurIPS, 2023) [[paper](https://openreview.net/pdf?id=3G2ec833mW)] [[code](https://github.com/gohyojun15/ANT_diffusion)]
152 |
153 | * Rethinking of Feature Interaction for Multi-task Learning on Dense Prediction (arXiv, 2023) [[paper](https://arxiv.org/pdf/2312.13514.pdf)]
154 |
155 | * PolyMaX: General Dense Prediction with Mask Transformer (arXiv, 2023) [[paper](https://arxiv.org/pdf/2311.05770.pdf)] [[code](https://github.com/google-research/deeplab2)]
156 |
157 | * Challenging Common Assumptions in Multi-task Learning (arXiv, 2023) [[paper](https://arxiv.org/pdf/2311.04698.pdf)]
158 |
159 | * Data exploitation: multi-task learning of object detection and semantic segmentation on partially annotated data (BMVC, 2023) [[paper](https://arxiv.org/pdf/2311.04040.pdf)] [[code](https://github.com/lhoangan/multas)]
160 |
161 | * Factorized Tensor Networks for Multi-task and Multi-domain Learning (arXiv, 2023) [[paper](https://arxiv.org/pdf/2310.06124.pdf)]
162 |
163 | * UMT-Net: A Uniform Multi-Task Network with Adaptive Task Weighting (TIV, 2023) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10264163&casa_token=M3FwWSHrnG8AAAAA:lgQdSxiw05Xt5enCrn9wWxCoxxn40vmtkdw_U3gdoqmCjN_ge36-iDWScvODpvLWck6zx1VlyQQ?tag=1)]
164 |
165 | * Label Budget Allocation in Multi-Task Learning (arXiv, 2023) [[paper](https://arxiv.org/pdf/2308.12949.pdf)]
166 |
167 | * Efficient Controllable Multi-Task Architectures (arXiv, 2023) [[paper](https://arxiv.org/pdf/2308.11744.pdf)]
168 |
169 | * Foundation Model is Efficient Multimodal Multitask Model Selector (arXiv, 2023) [[paper](https://arxiv.org/abs/2308.06262)] [[code](https://github.com/OpenGVLab/Multitask-Model-Selector)]
170 |
171 | * Deformable Mixer Transformer with Gating for Multi-Task Learning of Dense Prediction (arXiv, 2023) [[paper](https://arxiv.org/abs/2308.05721)] [[code](https://github.com/yangyangxu0/DeMTG)]
172 |
173 | * AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts (ICCV, 2023) [[paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Chen_AdaMV-MoE_Adaptive_Multi-Task_Vision_Mixture-of-Experts_ICCV_2023_paper.pdf)] [[code](https://github.com/google-research/google-research/tree/master/moe_mtl)]
174 |
175 | * Deep Multitask Learning with Progressive Parameter Sharing (ICCV, 2023) [[paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Shi_Deep_Multitask_Learning_with_Progressive_Parameter_Sharing_ICCV_2023_paper.pdf)]
176 |
177 | * Achievement-based Training Progress Balancing for Multi-Task Learning (ICCV, 2023) [[paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Yun_Achievement-Based_Training_Progress_Balancing_for_Multi-Task_Learning_ICCV_2023_paper.pdf)] [[code](https://github.com/samsung/Achievement-based-MTL)]
178 |
179 | * Multi-Task Learning with Knowledge Distillation for Dense Prediction (ICCV, 2023) [[paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Xu_Multi-Task_Learning_with_Knowledge_Distillation_for_Dense_Prediction_ICCV_2023_paper.pdf)]
180 |
181 | * Vision Transformer Adapters for Generalizable Multitask Learning (ICCV, 2023) [[paper](https://arxiv.org/abs/2308.12372)] [[code](https://ivrl.github.io/VTAGML/)]
182 |
183 | * TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts (ICCV, 2023) [[paper](https://arxiv.org/pdf/2307.15324.pdf)]
184 |
185 | * Prompt Guided Transformer for Multi-Task Dense Prediction (arXiv, 2023) [[paper](https://arxiv.org/pdf/2307.15362.pdf)]
186 |
187 | * Auxiliary Learning as an Asymmetric Bargaining Game (ICML, 2023) [[paper](https://arxiv.org/pdf/2301.13501.pdf)] [[code](https://github.com/AvivSham/auxinash)]
188 |
189 | * Learning to Modulate pre-trained Models in RL (arXiv, 2023) [[paper](https://arxiv.org/abs/2306.14884)] [[code](https://github.com/ml-jku/L2M)]
190 |
191 | * **[InvPT++]** InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding (arXiv, 2023) [[paper](https://arxiv.org/pdf/2306.04842.pdf)] [[code](https://github.com/prismformore/Multi-Task-Transformer/tree/main/InvPT)]
192 |
193 | * FAMO: Fast Adaptive Multitask Optimization (arXiv, 2023) [[paper](https://arxiv.org/pdf/2306.03792.pdf)] [[code](https://github.com/Cranial-XIX/FAMO)]
194 |
195 | * Sample-Level Weighting for Multi-Task Learning with Auxiliary Tasks (arXiv, 2023) [[paper](https://arxiv.org/pdf/2306.04519.pdf)]
196 |
197 | * DynaShare: Task and Instance Conditioned Parameter Sharing for Multi-Task Learning (arXiv, 2023) [[paper](https://arxiv.org/abs/2305.17305)]
198 |
199 | * Planning-oriented Autonomous Driving (CVPR, 2023, **Best Paper**) [[paper](https://arxiv.org/pdf/2212.10156.pdf)] [[code](https://github.com/OpenDriveLab/UniAD)]
200 |
201 | * MDL-NAS: A Joint Multi-domain Learning Framework for Vision Transformer (CVPR, 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_MDL-NAS_A_Joint_Multi-Domain_Learning_Framework_for_Vision_Transformer_CVPR_2023_paper.pdf)]
202 |
203 | * Hierarchical Prompt Learning for Multi-Task Learning (CVPR, 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_Hierarchical_Prompt_Learning_for_Multi-Task_Learning_CVPR_2023_paper.pdf)]
204 |
205 | * Independent Component Alignment for Multi-Task Learning (CVPR, 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/papers/Senushkin_Independent_Component_Alignment_for_Multi-Task_Learning_CVPR_2023_paper.pdf)] [[code](https://github.com/SamsungLabs/MTL)]
206 |
207 | * ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning (TMLR, 2023) [[paper](https://arxiv.org/abs/2301.12618)]
208 |
209 | * MetaMorphosis: Task-oriented Privacy Cognizant Feature Generation for Multi-task Learning (arXiv, 2023) [[paper](https://arxiv.org/abs/2305.07815)]
210 |
211 | * ESSR: Evolving Sparse Sharing Representation for Multi-task Learning (arXiv, 2023) [[paper](https://ieeexplore.ieee.org/abstract/document/10114675)]
212 |
213 | * AutoTaskFormer: Searching Vision Transformers for Multi-task Learning (arXiv, 2023) [[paper](https://arxiv.org/pdf/2304.08756.pdf)]
214 |
215 | * AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations (arXiv, 2023) [[paper](https://arxiv.org/pdf/2304.04959.pdf)]
216 |
217 | * A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision (arXiv, 2023) [[paper](https://arxiv.org/pdf/2303.17376.pdf)]
218 |
219 | * Efficient Computation Sharing for Multi-Task Visual Scene Understanding (arXiv, 2023) [[paper](https://arxiv.org/pdf/2303.09663.pdf)]
220 |
221 | * Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners (CVPR, 2023) [[paper](https://arxiv.org/pdf/2212.08066.pdf)] [[code](https://vis-www.cs.umass.edu/mod-squad/)]
222 |
223 | * Mitigating Task Interference in Multi-Task Learning via Explicit Task Routing with Non-Learnable Primitives (CVPR, 2023) [[paper](http://hal.cse.msu.edu/assets/pdfs/papers/2023-cvpr-multi-task-learning-non-learnable-task-routing.pdf)] [[code](https://github.com/zhichao-lu/etr-nlp-mtl)]
224 |
225 | * Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach (ICLR, 2023) [[paper](https://openreview.net/forum?id=dLAYGdKTi2)]
226 |
227 | * Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching (ICLR, 2023) [[paper](https://openreview.net/pdf?id=88nT0j5jAn)]
228 |
229 | * TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding (ICLR, 2023) [[paper](https://openreview.net/forum?id=-CwPopPJda)] [[code](https://github.com/prismformore/Multi-Task-Transformer/tree/main/TaskPrompter)] [[dataset](https://arxiv.org/pdf/2304.00971.pdf)]
230 |
231 | * Contrastive Multi-Task Dense Prediction (AAAI, 2023) [[paper](https://laos-y.github.io/uploads/yang2023AAAI/2437.YangS.pdf)]
232 |
233 | * Composite Learning for Robust and Effective Dense Predictions (WACV, 2023) [[paper](https://arxiv.org/abs/2210.07239)]
234 |
235 | * Toward Edge-Efficient Dense Predictions with Synergistic Multi-Task Neural Architecture Search (WACV, 2023) [[paper](https://arxiv.org/abs/2210.01384)]
236 |
237 | * Cross-task Attention Mechanism for Dense Multi-task Learning (WACV, 2023) [[paper](https://openaccess.thecvf.com/content/WACV2023/papers/Lopes_Cross-Task_Attention_Mechanism_for_Dense_Multi-Task_Learning_WACV_2023_paper.pdf)] [[code](https://github.com/astra-vision/DenseMTL)]
238 |
239 | ### 2022
240 |
241 | * RepMode: Learning to Re-parameterize Diverse Experts for Subcellular Structure Prediction (arXiv, 2022) [[paper](https://arxiv.org/abs/2212.10066)]
242 |
243 | * Learning Useful Representations for Shifting Tasks and Distributions (arXiv, 2022) [[paper](https://arxiv.org/abs/2212.07346)]
244 |
245 | * Sub-Task Imputation via Self-Labelling to Train Image Moderation Models on Sparse Noisy Data (ACM CIKM, 2022) [[paper](https://dl.acm.org/doi/pdf/10.1145/3511808.3557149)]
246 |
247 | * Multi-Task Meta Learning: learn how to adapt to unseen tasks (arXiv, 2022) [[paper](https://arxiv.org/pdf/2210.06989.pdf)]
248 |
249 | * M3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design (NeurIPS, 2022) [[paper](https://openreview.net/pdf?id=cFOhdl1cyU-)] [[code](https://github.com/VITA-Group/M3ViT)]
250 |
251 | * AutoMTL: A Programming Framework for Automating Efficient Multi-Task Learning (NeurIPS, 2022) [[paper](https://arxiv.org/abs/2110.13076)] [[code](https://github.com/zhanglijun95/AutoMTL)]
252 |
253 | * Association Graph Learning for Multi-Task Classification with Category Shifts (NeurIPS, 2022) [[paper](https://arxiv.org/pdf/2210.04637.pdf)] [[code](https://github.com/autumn9999/MTC-with-Category-Shifts)]
254 |
255 | * Do Current Multi-Task Optimization Methods in Deep Learning Even Help? (NeurIPS, 2022) [[paper](https://arxiv.org/abs/2209.11379)]
256 |
257 | * Task Discovery: Finding the Tasks that Neural Networks Generalize on (NeurIPS, 2022) [[paper](https://taskdiscovery.epfl.ch/static/paper/arxiv.pdf)]
258 |
259 | * **[Auto-λ]** Auto-λ: Disentangling Dynamic Task Relationships (TMLR, 2022) [[paper](https://arxiv.org/pdf/2202.03091.pdf)] [[code](https://github.com/lorenmt/auto-lambda)]
260 |
261 | * **[Universal Representations]** Universal Representations: A Unified Look at Multiple Task and Domain Learning (arXiv, 2022) [[paper](https://arxiv.org/pdf/2204.02744.pdf)] [[code](https://github.com/VICO-UoE/UniversalRepresentations)]
262 |
263 | * MTFormer: Multi-Task Learning via Transformer and Cross-Task Reasoning (ECCV, 2022) [[paper](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136870299.pdf)]
264 |
265 | * Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space (ECCV, 2022) [[paper](https://arxiv.org/abs/2207.03036)] [[code](https://github.com/TencentARC/SFDA)]
266 |
267 | * Factorizing Knowledge in Neural Networks (ECCV, 2022) [[paper](https://arxiv.org/abs/2207.03337)] [[code](https://github.com/Adamdad/KnowledgeFactor)]
268 |
269 | * **[InvPT]** Inverted Pyramid Multi-task Transformer for Dense Scene Understanding (ECCV, 2022) [[paper](https://arxiv.org/pdf/2203.07997.pdf)] [[code](https://github.com/prismformore/InvPT)]
270 |
271 | * **[MultiMAE]** MultiMAE: Multi-modal Multi-task Masked Autoencoders (ECCV, 2022) [[paper](https://arxiv.org/pdf/2204.01678.pdf)] [[code](https://multimae.epfl.ch)]
272 |
273 | * A Multi-objective / Multi-task Learning Framework Induced by Pareto Stationarity (ICML, 2022) [[paper](https://proceedings.mlr.press/v162/momma22a.html)]
274 |
275 | * Mitigating Modality Collapse in Multimodal VAEs via Impartial Optimization (ICML, 2022) [[paper](https://proceedings.mlr.press/v162/javaloy22a.html)]
276 |
277 | * Active Multi-Task Representation Learning (ICML, 2022) [[paper](https://proceedings.mlr.press/v162/chen22j.html)]
278 |
279 | * Generative Modeling for Multi-task Visual Learning (ICML, 2022) [[paper](https://proceedings.mlr.press/v162/bao22c.html)] [[code](https://github.com/zpbao/multi-task-oriented_generative_modeling)]
280 |
281 | * Multi-Task Learning as a Bargaining Game (ICML, 2022) [[paper](https://proceedings.mlr.press/v162/navon22a.html)] [[code](https://github.com/AvivNavon/nash-mtl)]
282 |
283 | * Multi-Task Learning with Multi-query Transformer for Dense Prediction (arXiv, 2022) [[paper](https://arxiv.org/pdf/2205.14354.pdf)]
284 |
285 | * **[Gato]** A Generalist Agent (arXiv, 2022) [[paper](https://arxiv.org/pdf/2205.06175.pdf)]
286 |
287 | * **[MTPSL]** Learning Multiple Dense Prediction Tasks from Partially Annotated Data (CVPR, 2022, **Best Paper Finalist**) [[paper](https://arxiv.org/pdf/2111.14893.pdf)] [[code](https://github.com/VICO-UoE/MTPSL)]
288 |
289 | * **[TSA]** Cross-domain Few-shot Learning with Task-specific Adapters (CVPR, 2022) [[paper](https://arxiv.org/pdf/2107.00358.pdf)] [[code](https://github.com/VICO-UoE/URL)]
290 |
291 | * **[OMNIVORE]** OMNIVORE: A Single Model for Many Visual Modalities (CVPR, 2022) [[paper](https://arxiv.org/pdf/2201.08377.pdf)] [[code](https://github.com/facebookresearch/omnivore)]
292 |
293 | * Task Adaptive Parameter Sharing for Multi-Task Learning (CVPR, 2022) [[paper](https://arxiv.org/pdf/2203.16708.pdf)]
294 |
295 | * Controllable Dynamic Multi-Task Architectures (CVPR, 2022) [[paper](https://arxiv.org/pdf/2203.14949.pdf)] [[code](https://www.nec-labs.com/~mas/DYMU/)]
296 |
297 | * **[SHIFT]** SHIFT: A Synthetic Driving Dataset for Continuous Multi-Task Domain Adaptation (CVPR, 2022) [[paper](https://openaccess.thecvf.com/content/CVPR2022/papers/Sun_SHIFT_A_Synthetic_Driving_Dataset_for_Continuous_Multi-Task_Domain_Adaptation_CVPR_2022_paper.pdf)] [[code](https://www.vis.xyz/shift/)]
298 |
299 | * DiSparse: Disentangled Sparsification for Multitask Model Compression (CVPR, 2022) [[paper](https://openaccess.thecvf.com/content/CVPR2022/papers/Sun_DiSparse_Disentangled_Sparsification_for_Multitask_Model_Compression_CVPR_2022_paper.pdf)] [[code](https://github.com/SHI-Labs/DiSparse-Multitask-Model-Compression)]
300 |
301 | * **[MulT]** MulT: An End-to-End Multitask Learning Transformer (CVPR, 2022) [[paper](https://openaccess.thecvf.com/content/CVPR2022/papers/Bhattacharjee_MulT_An_End-to-End_Multitask_Learning_Transformer_CVPR_2022_paper.pdf)] [[code](https://github.com/IVRL/MulT)]
302 |
303 | * Sound and Visual Representation Learning with Multiple Pretraining Tasks (CVPR, 2022) [[paper](https://openaccess.thecvf.com/content/CVPR2022/papers/Vasudevan_Sound_and_Visual_Representation_Learning_With_Multiple_Pretraining_Tasks_CVPR_2022_paper.pdf)]
304 |
305 | * Medusa: Universal Feature Learning via Attentional Multitasking (CVPR Workshop, 2022) [[paper](https://arxiv.org/abs/2204.05698)]
306 |
307 | * An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems (arXiv, 2022) [[paper](https://arxiv.org/pdf/2205.12755.pdf)] [[code](https://github.com/google-research/google-research/tree/master/muNet)]
308 |
309 | * Combining Modular Skills in Multitask Learning (arXiv, 2022) [[paper](https://arxiv.org/pdf/2202.13914.pdf)]
310 |
311 | * Visual Representation Learning over Latent Domains (ICLR, 2022) [[paper](https://openreview.net/pdf?id=kG0AtPi6JI1)]
312 |
313 | * AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning (ICLR, 2022) [[paper](https://openreview.net/pdf?id=8H5bpVwvt5)] [[code](https://github.com/Adaptive-RL/AdaRL-code)]
314 |
315 | * Towards a Unified View of Parameter-Efficient Transfer Learning (ICLR, 2022) [[paper](https://openreview.net/pdf?id=0RDcd5Axok)] [[code](https://github.com/jxhe/unify-parameter-efficient-tuning)]
316 |
317 | * **[Rotograd]** Rotograd: Dynamic Gradient Homogenization for Multi-Task Learning (ICLR, 2022) [[paper](https://openreview.net/pdf?id=T8wHz4rnuGL)] [[code](https://github.com/adrianjav/rotograd)]
318 |
319 | * Relational Multi-task Learning: Modeling Relations Between Data and Tasks (ICLR, 2022) [[paper](https://openreview.net/pdf?id=8Py-W8lSUgy)]
320 |
321 | * Weighted Training for Cross-task Learning (ICLR, 2022) [[paper](https://openreview.net/pdf?id=ltM1RMZntpu)] [[code](https://github.com/CogComp/TAWT)]
322 |
323 | * Semi-supervised Multi-task Learning for Semantics and Depth (WACV, 2022) [[paper](https://openaccess.thecvf.com/content/WACV2022/papers/Wang_Semi-Supervised_Multi-Task_Learning_for_Semantics_and_Depth_WACV_2022_paper.pdf)]
324 |
325 | * In Defense of the Unitary Scalarization for Deep Multi-Task Learning (arXiv, 2022) [[paper](https://arxiv.org/pdf/2201.04122.pdf)]
326 |
327 | ### 2021
328 |
329 | * Variational Multi-Task Learning with Gumbel-Softmax Priors (NeurIPS, 2021) [[paper](https://arxiv.org/pdf/2111.05323.pdf)] [[code](https://github.com/autumn9999/VMTL)]
330 |
331 | * Efficiently Identifying Task Groupings for Multi-Task Learning (NeurIPS, 2021) [[paper](http://arxiv.org/abs/2109.04617)]
332 |
333 | * **[CAGrad]** Conflict-Averse Gradient Descent for Multi-task Learning (NeurIPS, 2021) [[paper](https://openreview.net/pdf?id=_61Qh8tULj_)] [[code](https://github.com/Cranial-XIX/CAGrad)]
334 |
335 | * A Closer Look at Loss Weighting in Multi-Task Learning (arXiv, 2021) [[paper](https://arxiv.org/pdf/2111.10603.pdf)]
336 |
337 | * Exploring Relational Context for Multi-Task Dense Prediction (ICCV, 2021) [[paper](http://arxiv.org/abs/2104.13874)] [[code](https://github.com/brdav/atrc)]
338 |
339 | * Multi-Task Self-Training for Learning General Representations (ICCVW, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/papers/Ghiasi_Multi-Task_Self-Training_for_Learning_General_Representations_ICCV_2021_paper.pdf)]
340 |
341 | * Task Switching Network for Multi-task Learning (ICCV, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/html/Sun_Task_Switching_Network_for_Multi-Task_Learning_ICCV_2021_paper.html)] [[code](https://github.com/GuoleiSun/TSNs)]
342 |
343 | * Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans (ICCV, 2021) [[paper](https://arxiv.org/pdf/2110.04994.pdf)] [[project](https://omnidata.vision)]
344 |
345 | * Robustness via Cross-Domain Ensembles (ICCV, 2021) [[paper](https://arxiv.org/abs/2103.10919)] [[code](https://github.com/EPFL-VILAB/XDEnsembles)]
346 |
347 | * Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation (ICCV, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Domain_Adaptive_Semantic_Segmentation_With_Self-Supervised_Depth_Estimation_ICCV_2021_paper.pdf)] [[code](https://qin.ee/corda)]
348 |
349 | * **[URL]** Universal Representation Learning from Multiple Domains for Few-shot Classification (ICCV, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Universal_Representation_Learning_From_Multiple_Domains_for_Few-Shot_Classification_ICCV_2021_paper.pdf)] [[code](https://github.com/VICO-UoE/URL)]
350 |
351 | * **[tri-M]** A Multi-Mode Modulator for Multi-Domain Few-Shot Classification (ICCV, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_A_Multi-Mode_Modulator_for_Multi-Domain_Few-Shot_Classification_ICCV_2021_paper.pdf)] [[code](https://github.com/csyanbin/tri-M-ICCV)]
352 |
353 | * MultiTask-CenterNet (MCN): Efficient and Diverse Multitask Learning using an Anchor Free Approach (ICCV Workshop, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021W/ERCVAD/papers/Heuer_MultiTask-CenterNet_MCN_Efficient_and_Diverse_Multitask_Learning_Using_an_Anchor_ICCVW_2021_paper.pdf)]
354 |
355 | * See Yourself in Others: Attending Multiple Tasks for Own Failure Detection (arXiv, 2021) [[paper](https://arxiv.org/pdf/2110.02549.pdf)]
356 |
357 | * A Multi-Task Cross-Task Learning Architecture for Ad-hoc Uncertainty Estimation in 3D Cardiac MRI Image Segmentation (CinC, 2021) [[paper](https://www.cinc.org/2021/Program/accepted/115_Preprint.pdf)] [[code](https://github.com/SMKamrulHasan/MTCTL)]
358 |
359 | * Multi-Task Reinforcement Learning with Context-based Representations (ICML, 2021) [[paper](http://arxiv.org/abs/2102.06177)]
360 |
361 | * **[FLUTE]** Learning a Universal Template for Few-shot Dataset Generalization (ICML, 2021) [[paper](https://arxiv.org/pdf/2105.07029.pdf)] [[code](https://github.com/google-research/meta-dataset)]
362 |
363 | * Towards a Unified View of Parameter-Efficient Transfer Learning (arXiv, 2021) [[paper](http://arxiv.org/abs/2110.04366)]
364 |
365 | * UniT: Multimodal Multitask Learning with a Unified Transformer (arXiv, 2021) [[paper](http://arxiv.org/abs/2102.10772)]
366 |
367 | * Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation (CVPR, 2021) [[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Saha_Learning_To_Relate_Depth_and_Semantics_for_Unsupervised_Domain_Adaptation_CVPR_2021_paper.pdf)] [[code](https://github.com/susaha/ctrl-uda)]
368 |
369 | * CompositeTasking: Understanding Images by Spatial Composition of Tasks (CVPR, 2021) [[paper](https://openaccess.thecvf.com/content/CVPR2021/html/Popovic_CompositeTasking_Understanding_Images_by_Spatial_Composition_of_Tasks_CVPR_2021_paper.html)] [[code](https://github.com/nikola3794/composite-tasking)]
370 |
371 | * Anomaly Detection in Video via Self-Supervised and Multi-Task Learning (CVPR, 2021) [[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Georgescu_Anomaly_Detection_in_Video_via_Self-Supervised_and_Multi-Task_Learning_CVPR_2021_paper.pdf)]
372 |
373 | * Taskology: Utilizing Task Relations at Scale (CVPR, 2021) [[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Lu_Taskology_Utilizing_Task_Relations_at_Scale_CVPR_2021_paper.pdf)]
374 |
375 | * Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation (CVPR, 2021) [[paper](https://arxiv.org/pdf/2012.10782.pdf)] [[code](https://github.com/lhoyer/improving_segmentation_with_selfsupervised_depth)]
376 |
377 | * Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with Self-Supervised Depth Estimation (arXiv, 2021) [[paper](https://arxiv.org/pdf/2108.12545.pdf)] [[code](https://github.com/lhoyer/improving_segmentation_with_selfsupervised_depth)]
378 |
379 | * Counter-Interference Adapter for Multilingual Machine Translation (Findings of EMNLP, 2021) [[paper](https://aclanthology.org/2021.findings-emnlp.240)]
380 |
381 | * Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data (ICLR, 2021) [[paper](https://openreview.net/forum?id=de11dbHzAMF)] [[code](https://github.com/CAMTL/CA-MTL)]
382 |
383 | * **[Gradient Vaccine]** Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models (ICLR, 2021) [[paper](https://openreview.net/forum?id=F1vEjWK-lH_)]
384 |
385 | * **[IMTL]** Towards Impartial Multi-task Learning (ICLR, 2021) [[paper](https://openreview.net/forum?id=IMPnRXEWpvr)]
386 |
387 | * Deciphering and Optimizing Multi-Task Learning: A Random Matrix Approach (ICLR, 2021) [[paper](https://openreview.net/forum?id=Cri3xz59ga)]
388 |
389 | * **[URT]** A Universal Representation Transformer Layer for Few-Shot Image Classification (ICLR, 2021) [[paper](https://arxiv.org/pdf/2006.11702.pdf)] [[code](https://github.com/liulu112601/URT)]
390 |
391 | * Flexible Multi-task Networks by Learning Parameter Allocation (ICLR Workshop, 2021) [[paper](http://arxiv.org/abs/1910.04915)]
392 |
393 | * Multi-Loss Weighting with Coefficient of Variations (WACV, 2021) [[paper](https://openaccess.thecvf.com/content/WACV2021/papers/Groenendijk_Multi-Loss_Weighting_With_Coefficient_of_Variations_WACV_2021_paper.pdf)] [[code](https://github.com/rickgroen/cov-weighting)]
394 |
395 | ### 2020
396 |
397 | * Multi-Task Reinforcement Learning with Soft Modularization (NeurIPS, 2020) [[paper](http://arxiv.org/abs/2003.13661)] [[code](https://github.com/RchalYang/Soft-Module)]
398 | * AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning (NeurIPS, 2020) [[paper](http://arxiv.org/abs/1911.12423)] [[code](https://github.com/sunxm2357/AdaShare)]
399 |
400 | * **[GradDrop]** Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout (NeurIPS, 2020) [[paper](https://proceedings.NeurIPS.cc//paper/2020/file/16002f7a455a94aa4e91cc34ebdb9f2d-Paper.pdf)] [[code](https://github.com/tensorflow/lingvo/blob/master/lingvo/core/graddrop.py)]
401 |
402 | * **[PCGrad]** Gradient Surgery for Multi-Task Learning (NeurIPS, 2020) [[paper](http://arxiv.org/abs/2001.06782)] [[tensorflow](https://github.com/tianheyu927/PCGrad)] [[pytorch](https://github.com/WeiChengTseng/Pytorch-PCGrad)]
403 |
404 | * On the Theory of Transfer Learning: The Importance of Task Diversity (NeurIPS, 2020) [[paper](https://proceedings.NeurIPS.cc//paper/2020/file/59587bffec1c7846f3e34230141556ae-Paper.pdf)]
405 |
406 | * A Study of Residual Adapters for Multi-Domain Neural Machine Translation (WMT, 2020) [[paper](https://www.aclweb.org/anthology/2020.wmt-1.72/)]
407 |
408 | * Multi-Task Adversarial Attack (arXiv, 2020) [[paper](http://arxiv.org/abs/2011.09824)]
409 |
410 | * Automated Search for Resource-Efficient Branched Multi-Task Networks (BMVC, 2020) [[paper](http://arxiv.org/abs/2008.10292)] [[code](https://github.com/brdav/bmtas)]
411 | * Branched Multi-Task Networks: Deciding What Layers To Share (BMVC, 2020) [[paper](http://arxiv.org/abs/1904.02920)]
412 |
413 | * MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning (ECCV, 2020) [[paper](http://arxiv.org/abs/2001.06902)] [[code](https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorch)]
414 |
415 | * Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference (ECCV, 2020) [[paper](http://arxiv.org/abs/2007.12540)] [[code](https://github.com/menelaoskanakis/RCM)]
416 |
417 | * Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification (ECCV, 2020) [[paper](https://arxiv.org/pdf/2003.09338.pdf)] [[code](https://github.com/dvornikita/SUR)]
418 |
419 | * Multitask Learning Strengthens Adversarial Robustness (ECCV, 2020) [[paper](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123470154.pdf)] [[code](https://github.com/columbia/MTRobust)]
420 |
421 | * Duality Diagram Similarity: a generic framework for initialization selection in task transfer learning (ECCV, 2020) [[paper](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123710494.pdf)] [[code](https://github.com/cvai-repo/duality-diagram-similarity)]
422 |
423 | * **[KD4MTL]** Knowledge Distillation for Multi-task Learning (ECCV Workshop, 2020) [[paper](https://arxiv.org/pdf/2007.06889.pdf)] [[code](https://github.com/VICO-UoE/KD4MTL)]
424 |
425 | * MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning (CVPR, 2020) [[paper](https://arxiv.org/abs/2003.14058)] [[code](https://github.com/bhpfelix/MTLNAS)]
426 |
427 | * Robust Learning Through Cross-Task Consistency (CVPR, 2020) [[paper](https://consistency.epfl.ch/Cross_Task_Consistency_CVPR2020.pdf)] [[code](https://github.com/EPFL-VILAB/XTConsistency)]
428 |
429 | * 12-in-1: Multi-Task Vision and Language Representation Learning (CVPR, 2020) [[paper](https://openaccess.thecvf.com/content_CVPR_2020/papers/Lu_12-in-1_Multi-Task_Vision_and_Language_Representation_Learning_CVPR_2020_paper.pdf)] [[code](https://github.com/facebookresearch/vilbert-multi-task)]
430 |
431 | * A Multi-task Mean Teacher for Semi-supervised Shadow Detection (CVPR, 2020) [[paper](https://openaccess.thecvf.com/content_CVPR_2020/papers/Chen_A_Multi-Task_Mean_Teacher_for_Semi-Supervised_Shadow_Detection_CVPR_2020_paper.pdf)] [[code](https://github.com/eraserNut/MTMT)]
432 |
433 | * MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer (EMNLP, 2020) [[paper](https://doi.org/10.18653/v1/2020.emnlp-main.617)]
434 |
435 | * Masking as an Efficient Alternative to Finetuning for Pretrained Language Models (EMNLP, 2020) [[paper](http://arxiv.org/abs/2004.12406)] [[code](https://github.com/ptlmasking/maskbert)]
436 |
437 | * Efficient Continuous Pareto Exploration in Multi-Task Learning (ICML, 2020) [[paper](http://proceedings.mlr.press/v119/ma20a/ma20a.pdf)] [[code](https://github.com/mit-gfx/ContinuousParetoMTL)]
438 |
439 | * Which Tasks Should Be Learned Together in Multi-task Learning? (ICML, 2020) [[paper](http://arxiv.org/abs/1905.07553)] [[code](https://github.com/tstandley/taskgrouping)]
440 |
441 | * Learning to Branch for Multi-Task Learning (ICML, 2020) [[paper](https://arxiv.org/abs/2006.01895)]
442 |
443 | * Partly Supervised Multitask Learning (ICMLA, 2020) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9356271)]
444 |
445 | * Understanding and Improving Information Transfer in Multi-Task Learning (ICLR, 2020) [[paper](https://arxiv.org/abs/2005.00944)]
446 |
447 | * Measuring and Harnessing Transference in Multi-Task Learning (arXiv, 2020) [[paper](https://arxiv.org/abs/2010.15413)]
448 |
449 | * Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion Recognition (arXiv, 2020) [[paper](https://arxiv.org/pdf/1907.06078.pdf)]
450 |
451 | * Learning Sparse Sharing Architectures for Multiple Tasks (AAAI, 2020) [[paper](http://arxiv.org/abs/1911.05034)]
452 |
453 | * AdapterFusion: Non-Destructive Task Composition for Transfer Learning (arXiv, 2020) [[paper](http://arxiv.org/abs/2005.00247)]
454 |
455 | ### 2019
456 |
457 | * Adaptive Auxiliary Task Weighting for Reinforcement Learning (NeurIPS, 2019) [[paper](https://papers.nips.cc/paper/2019/hash/0e900ad84f63618452210ab8baae0218-Abstract.html)]
458 |
459 | * Pareto Multi-Task Learning (NeurIPS, 2019) [[paper](http://papers.nips.cc/paper/9374-pareto-multi-task-learning.pdf)] [[code](https://github.com/Xi-L/ParetoMTL)]
460 |
461 | * Modular Universal Reparameterization: Deep Multi-task Learning Across Diverse Domains (NeurIPS, 2019) [[paper](http://arxiv.org/abs/1906.00097)]
462 |
463 | * Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes (NeurIPS, 2019) [[paper](https://proceedings.neurips.cc/paper/2019/file/1138d90ef0a0848a542e57d1595f58ea-Paper.pdf)] [[code](https://github.com/cambridge-mlg/cnaps)]
464 |
465 | * **[Orthogonal]** Regularizing Deep Multi-Task Networks using Orthogonal Gradients (arXiv, 2019) [[paper](http://arxiv.org/abs/1912.06844)]
466 |
467 | * Many Task Learning With Task Routing (ICCV, 2019) [[paper](https://openaccess.thecvf.com/content_ICCV_2019/papers/Strezoski_Many_Task_Learning_With_Task_Routing_ICCV_2019_paper.pdf)] [[code](https://github.com/gstrezoski/TaskRouting)]
468 |
469 | * Stochastic Filter Groups for Multi-Task CNNs: Learning Specialist and Generalist Convolution Kernels (ICCV, 2019) [[paper](https://arxiv.org/abs/1908.09597)]
470 |
471 | * Deep Elastic Networks with Model Selection for Multi-Task Learning (ICCV, 2019) [[paper](http://arxiv.org/abs/1909.04860)] [[code](https://github.com/rllab-snu/Deep-Elastic-Network)]
472 |
473 | * Feature Partitioning for Efficient Multi-Task Architectures (arXiv, 2019) [[paper](https://arxiv.org/abs/1908.04339)] [[code](https://github.com/google/multi-task-architecture-search)]
474 |
475 | * Task Selection Policies for Multitask Learning (arXiv, 2019) [[paper](http://arxiv.org/abs/1907.06214)]
476 |
477 | * BAM! Born-Again Multi-Task Networks for Natural Language Understanding (ACL, 2019) [[paper](https://www.aclweb.org/anthology/P19-1595/)] [[code](https://github.com/google-research/google-research/tree/master/bam)]
478 |
479 | * OmniNet: A unified architecture for multi-modal multi-task learning (arXiv, 2019) [[paper](http://arxiv.org/abs/1907.07804)]
480 |
481 | * NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction (CVPR, 2019) [[paper](https://arxiv.org/abs/1801.08297)] [[code](https://github.com/ethanygao/NDDR-CNN)]
482 |
483 | * **[MTAN + DWA]** End-to-End Multi-Task Learning with Attention (CVPR, 2019) [[paper](http://arxiv.org/abs/1803.10704)] [[code](https://github.com/lorenmt/mtan)]
484 |
485 | * Attentive Single-Tasking of Multiple Tasks (CVPR, 2019) [[paper](http://arxiv.org/abs/1904.08918)] [[code](https://github.com/facebookresearch/astmt)]
486 |
487 | * Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation (CVPR, 2019) [[paper](https://openaccess.thecvf.com/content_CVPR_2019/papers/Zhang_Pattern-Affinitive_Propagation_Across_Depth_Surface_Normal_and_Semantic_Segmentation_CVPR_2019_paper.pdf)]
488 |
489 | * Representation Similarity Analysis for Efficient Task Taxonomy & Transfer Learning (CVPR, 2019) [[paper](https://arxiv.org/abs/1904.11740)] [[code](https://github.com/kshitijd20/RSA-CVPR19-release)]
490 |
491 | * **[Geometric Loss Strategy (GLS)]** MultiNet++: Multi-Stream Feature Aggregation and Geometric Loss Strategy for Multi-Task Learning (CVPR Workshop, 2019) [[paper](http://arxiv.org/abs/1904.08492)]
492 |
493 | * Parameter-Efficient Transfer Learning for NLP (ICML, 2019) [[paper](http://arxiv.org/abs/1902.00751)]
494 |
495 | * BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning (ICML, 2019) [[paper](http://arxiv.org/abs/1902.02671)] [[code](https://github.com/AsaCooperStickland/Bert-n-Pals)]
496 |
497 | * Tasks Without Borders: A New Approach to Online Multi-Task Learning (ICML Workshop, 2019) [[paper](https://openreview.net/pdf?id=HkllV5Bs24)]
498 |
499 | * AutoSeM: Automatic Task Selection and Mixing in Multi-Task Learning (NAACL, 2019) [[paper](https://arxiv.org/abs/1904.04153)] [[code](https://github.com/HanGuo97/AutoSeM)]
500 |
501 | * Multi-Task Deep Reinforcement Learning with PopArt (AAAI, 2019) [[paper](https://doi.org/10.1609/aaai.v33i01.33013796)]
502 |
503 | * SNR: Sub-Network Routing for Flexible Parameter Sharing in Multi-Task Learning (AAAI, 2019) [[paper](https://ojs.aaai.org/index.php/AAAI/article/view/3788/3666)]
504 |
505 | * Latent Multi-task Architecture Learning (AAAI, 2019) [[paper](https://arxiv.org/abs/1705.08142)] [[code](https://github.com/sebastianruder/sluice-networks)]
506 |
507 | * Multi-Task Deep Neural Networks for Natural Language Understanding (ACL, 2019) [[paper](https://arxiv.org/pdf/1901.11504.pdf)]
508 |
509 | ### 2018
510 |
511 | * Learning to Multitask (NeurIPS, 2018) [[paper](https://papers.nips.cc/paper/2018/file/aeefb050911334869a7a5d9e4d0e1689-Paper.pdf)]
512 |
513 | * **[MGDA]** Multi-Task Learning as Multi-Objective Optimization (NeurIPS, 2018) [[paper](http://arxiv.org/abs/1810.04650)] [[code](https://github.com/isl-org/MultiObjectiveOptimization)]
514 |
515 | * Adapting Auxiliary Losses Using Gradient Similarity (arXiv, 2018) [[paper](http://arxiv.org/abs/1812.02224)] [[code](https://github.com/szkocot/Adapting-Auxiliary-Losses-Using-Gradient-Similarity)]
516 |
517 | * Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights (ECCV, 2018) [[paper](https://openaccess.thecvf.com/content_ECCV_2018/papers/Arun_Mallya_Piggyback_Adapting_a_ECCV_2018_paper.pdf)] [[code](https://github.com/arunmallya/piggyback)]
518 |
519 | * Dynamic Task Prioritization for Multitask Learning (ECCV, 2018) [[paper](https://openaccess.thecvf.com/content_ECCV_2018/papers/Michelle_Guo_Focus_on_the_ECCV_2018_paper.pdf)]
520 |
521 | * A Modulation Module for Multi-task Learning with Applications in Image Retrieval (ECCV, 2018) [[paper](https://arxiv.org/abs/1807.06708)]
522 |
523 | * Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts (KDD, 2018) [[paper](https://dl.acm.org/doi/pdf/10.1145/3219819.3220007)]
524 |
525 | * Unifying and Merging Well-trained Deep Neural Networks for Inference Stage (IJCAI, 2018) [[paper](http://arxiv.org/abs/1805.04980)] [[code](https://github.com/ivclab/NeuralMerger)]
526 |
527 | * Efficient Parametrization of Multi-domain Deep Neural Networks (CVPR, 2018) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Rebuffi_Efficient_Parametrization_of_CVPR_2018_paper.pdf)] [[code](https://github.com/srebuffi/residual_adapters)]
528 |
529 | * PAD-Net: Multi-tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing (CVPR, 2018) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Xu_PAD-Net_Multi-Tasks_Guided_CVPR_2018_paper.pdf)]
530 |
531 | * NestedNet: Learning Nested Sparse Structures in Deep Neural Networks (CVPR, 2018) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Kim_NestedNet_Learning_Nested_CVPR_2018_paper.pdf)]
532 |
533 | * PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning (CVPR, 2018) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Mallya_PackNet_Adding_Multiple_CVPR_2018_paper.pdf)] [[code](https://github.com/arunmallya/packnet)]
534 |
535 |
536 | * **[Uncertainty]** Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics (CVPR, 2018) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Kendall_Multi-Task_Learning_Using_CVPR_2018_paper.pdf)]
537 |
538 | * Deep Asymmetric Multi-task Feature Learning (ICML, 2018) [[paper](http://proceedings.mlr.press/v80/lee18d/lee18d.pdf)]
539 |
540 | * **[GradNorm]** GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks (ICML, 2018) [[paper](http://arxiv.org/abs/1711.02257)]
541 |
542 | * Pseudo-task Augmentation: From Deep Multitask Learning to Intratask Sharing---and Back (ICML, 2018) [[paper](http://arxiv.org/abs/1803.04062)]
543 |
544 | * Gradient Adversarial Training of Neural Networks (arXiv, 2018) [[paper](http://arxiv.org/abs/1806.08028)]
545 |
546 | * Auxiliary Tasks in Multi-task Learning (arXiv, 2018) [[paper](http://arxiv.org/abs/1805.06334)]
547 |
548 | * Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning (ICLR, 2018) [[paper](http://arxiv.org/abs/1711.01239)] [[code](https://github.com/cle-ros/RoutingNetworks)]
549 |
550 | * Beyond Shared Hierarchies: Deep Multitask Learning through Soft Layer Ordering (ICLR, 2018) [[paper](http://arxiv.org/abs/1711.00108)]
551 |
552 | ### 2017
553 |
554 | * Learning multiple visual domains with residual adapters (NeurIPS, 2017) [[paper](https://papers.nips.cc/paper/2017/file/e7b24b112a44fdd9ee93bdf998c6ca0e-Paper.pdf)] [[code](https://github.com/srebuffi/residual_adapters)]
555 |
556 | * Learning Multiple Tasks with Multilinear Relationship Networks (NeurIPS, 2017) [[paper](https://proceedings.NeurIPS.cc/paper/2017/file/03e0704b5690a2dee1861dc3ad3316c9-Paper.pdf)] [[code](https://github.com/thuml/MTlearn)]
557 |
558 | * Federated Multi-Task Learning (NeurIPS, 2017) [[paper](https://proceedings.NeurIPS.cc/paper/2017/file/6211080fa89981f66b1a0c9d55c61d0f-Paper.pdf)] [[code](https://github.com/gingsmith/fmtl)]
559 |
560 | * Multi-task Self-Supervised Visual Learning (ICCV, 2017) [[paper](http://arxiv.org/abs/1708.07860)]
561 |
562 | * Adversarial Multi-task Learning for Text Classification (ACL, 2017) [[paper](http://arxiv.org/abs/1704.05742)]
563 |
564 | * UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory (CVPR, 2017) [[paper](https://arxiv.org/abs/1609.02132)]
565 |
566 | * Fully-adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification (CVPR, 2017) [[paper](https://openaccess.thecvf.com/content_cvpr_2017/papers/Lu_Fully-Adaptive_Feature_Sharing_CVPR_2017_paper.pdf)]
567 |
568 | * Modular Multitask Reinforcement Learning with Policy Sketches (ICML, 2017) [[paper](http://arxiv.org/abs/1611.01796)] [[code](https://github.com/jacobandreas/psketch)]
569 |
570 |
571 | * SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization (ICML, 2017) [[paper](http://proceedings.mlr.press/v70/kim17b.html)] [[code](https://github.com/dalgu90/splitnet-wrn)]
572 |
573 | * One Model To Learn Them All (arXiv, 2017) [[paper](http://arxiv.org/abs/1706.05137)] [[code](https://github.com/tensorflow/tensor2tensor)]
574 |
575 | * **[AdaLoss]** Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing (arXiv, 2017) [[paper](http://arxiv.org/abs/1708.06832)]
576 |
577 | * Deep Multi-task Representation Learning: A Tensor Factorisation Approach (ICLR, 2017) [[paper](https://arxiv.org/abs/1605.06391)] [[code](https://github.com/wOOL/DMTRL)]
578 |
579 | * Trace Norm Regularised Deep Multi-Task Learning (ICLR Workshop, 2017) [[paper](http://arxiv.org/abs/1606.04038)] [[code](https://github.com/wOOL/TNRDMTL)]
580 |
581 | * When is multitask learning effective? Semantic sequence prediction under varying data conditions (EACL, 2017) [[paper](http://arxiv.org/abs/1612.02251)] [[code](https://github.com/bplank/multitasksemantics)]
582 |
583 | * Identifying beneficial task relations for multi-task learning in deep neural networks (EACL, 2017) [[paper](http://arxiv.org/abs/1702.08303)]
584 |
585 | * PathNet: Evolution Channels Gradient Descent in Super Neural Networks (arXiv, 2017) [[paper](http://arxiv.org/abs/1701.08734)] [[code](https://github.com/jsikyoon/pathnet)]
586 |
587 | * Attributes for Improved Attributes: A Multi-Task Network Utilizing Implicit and Explicit Relationships for Facial Attribute Classification (AAAI, 2017) [[paper](https://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/viewFile/14749/14282)]
588 |
589 | ### 2016 and earlier
590 |
591 | * Learning values across many orders of magnitude (NeurIPS, 2016) [[paper](https://arxiv.org/abs/1602.07714)]
592 |
593 | * Integrated Perception with Recurrent Multi-Task Neural Networks (NeurIPS, 2016) [[paper](https://proceedings.neurips.cc/paper/2016/file/06409663226af2f3114485aa4e0a23b4-Paper.pdf)]
594 |
595 | * Unifying Multi-Domain Multi-Task Learning: Tensor and Neural Network Perspectives (arXiv, 2016) [[paper](http://arxiv.org/abs/1611.09345)]
596 |
597 | * Progressive Neural Networks (arXiv, 2016) [[paper](https://arxiv.org/abs/1606.04671)]
598 |
599 | * Deep multi-task learning with low level tasks supervised at lower layers (ACL, 2016) [[paper](https://www.aclweb.org/anthology/P16-2038.pdf)]
600 |
601 | * **[Cross-Stitch]** Cross-Stitch Networks for Multi-task Learning (CVPR, 2016) [[paper](https://arxiv.org/abs/1604.03539)] [[code](https://github.com/helloyide/Cross-stitch-Networks-for-Multi-task-Learning)]
602 |
603 | * Asymmetric Multi-task Learning based on Task Relatedness and Confidence (ICML, 2016) [[paper](http://proceedings.mlr.press/v48/leeb16.pdf)]
604 |
605 | * MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving (arXiv, 2016) [[paper](http://arxiv.org/abs/1612.07695)] [[code](https://github.com/MarvinTeichmann/MultiNet)]
606 |
607 | * A Unified Perspective on Multi-Domain and Multi-Task Learning (ICLR, 2015) [[paper](http://arxiv.org/abs/1412.7489)]
608 |
609 | * Facial Landmark Detection by Deep Multi-task Learning (ECCV, 2014) [[paper](https://personal.ie.cuhk.edu.hk/~ccloy/files/eccv_2014_deepfacealign.pdf)] [[code](http://mmlab.ie.cuhk.edu.hk/projects/TCDCN.html)]
610 |
611 | * Learning Task Grouping and Overlap in Multi-task Learning (ICML, 2012) [[paper](http://arxiv.org/abs/1206.6417)]
612 |
613 | * Learning with Whom to Share in Multi-task Feature Learning (ICML, 2011) [[paper](http://www.cs.utexas.edu/~grauman/papers/icml2011.pdf)]
614 |
615 | * Semi-Supervised Multi-Task Learning with Task Regularizations (ICDM, 2009) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5360282)]
616 |
617 | * Semi-Supervised Multitask Learning (NeurIPS, 2007) [[paper](https://proceedings.neurips.cc/paper/2007/file/a34bacf839b923770b2c360eefa26748-Paper.pdf)]
618 |
619 | * Multitask Learning (Machine Learning, 1997) [[paper](https://link.springer.com/content/pdf/10.1023/A:1007379606734.pdf)]
620 |
621 | ## [Awesome Multi-domain Multi-task Learning](https://github.com/WeiHongLee/Awesome-Multi-Domain-Multi-Task-Learning)
622 |
623 | ## Workshops
624 |
625 | * [Universal Representations for Computer Vision Workshop at BMVC 2022](https://sites.google.com/view/universalrepresentations)
626 |
627 | * [Workshop on Multi-Task Learning in Computer Vision (DeepMTL) at ICCV 2021](https://sites.google.com/view/deepmtlworkshop/home)
628 |
629 | * [Adaptive and Multitask Learning: Algorithms & Systems Workshop (AMTL) at ICML 2019](https://www.amtl-workshop.org)
630 |
631 | * [Workshop on Multi-Task and Lifelong Reinforcement Learning at ICML 2015](https://sites.google.com/view/mtlrl)
632 |
633 | * [Transfer and Multi-Task Learning: Trends and New Perspectives at NeurIPS 2015](https://nips.cc/Conferences/2015/Schedule?showEvent=4939)
634 |
635 | * [Second Workshop on Transfer and Multi-task Learning at NeurIPS 2014](https://sites.google.com/site/multitaskwsnips2014/)
636 |
637 | * [New Directions in Transfer and Multi-Task: Learning Across Domains and Tasks Workshop at NeurIPS 2013](https://sites.google.com/site/learningacross/home)
638 |
639 | ## Online Courses
640 |
641 | * [CS 330: Deep Multi-Task and Meta Learning](https://cs330.stanford.edu)
642 |
643 | ## Related awesome list
644 |
645 | * https://github.com/SimonVandenhende/Awesome-Multi-Task-Learning
646 |
647 | * https://github.com/Manchery/awesome-multi-task-learning
648 |
649 | * https://github.com/junfish/Awesome-Multitask-Learning
650 |
651 |
652 |
653 |
--------------------------------------------------------------------------------