└── README.md


/README.md:
--------------------------------------------------------------------------------
  1 | # Awesome Multi-task Learning
  2 | 
  3 | Feel free to contact me or contribute if you find any interesting paper is missing!
  4 | 
  5 | ## Table of Contents
  6 | 
  7 | - [Survey & Study](#survey--study)
  8 | - [Benchmarks & Code](#benchmarks--code)
  9 | - [Papers](#papers)
 10 | - [Awesome Multi-domain Multi-task Learning](#awesome-multi-domain-multi-task-learning)
 11 | - [Workshops](#workshops)
 12 | - [Online Courses](#online-courses)
 13 | - [Related awesome list](#related-awesome-list)
 14 | 
 15 | ## Survey & Study
 16 | 
 17 | * Revisit the Imbalance Optimization in Multi-task Learning: An Experimental Analysis (arXiv, 2025) [[paper](https://arxiv.org/abs/2509.23915)]
 18 | 
 19 | * Unleashing the Power of Multi-Task Learning: A Comprehensive Survey Spanning Traditional, Deep, and Pretrained Foundation Model Eras (arXiv, 2024) [[paper](https://arxiv.org/pdf/2404.18961)] [[code](https://github.com/junfish/Awesome-Multitask-Learning)]
 20 | 
 21 | * A Survey on Mixture of Experts  (arXiv, 2024) [[paper](https://arxiv.org/pdf/2407.06204)] [[code](https://github.com/withinmiaov/A-Survey-on-Mixture-of-Experts)]
 22 | 
 23 | * Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types (TPAMI, 2022) [[paper](https://arxiv.org/pdf/2103.13318.pdf)]
 24 | 
 25 | * Multi-Task Learning for Dense Prediction Tasks: A Survey (TPAMI, 2021) [[paper](https://arxiv.org/abs/2004.13379)] [[code](https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorch)]
 26 | 
 27 | * A Survey on Multi-Task Learning (TKDE, 2021) [[paper](https://ieeexplore.ieee.org/abstract/document/9392366)]
 28 | 
 29 | * Multi-Task Learning with Deep Neural Networks: A Survey (arXiv, 2020) [[paper](http://arxiv.org/abs/2009.09796)]
 30 | 
 31 | * Taskonomy: Disentangling Task Transfer Learning (CVPR, 2018, **Best Paper**) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Zamir_Taskonomy_Disentangling_Task_CVPR_2018_paper.pdf)] [[dataset](http://taskonomy.stanford.edu/)]
 32 | 
 33 | * A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks (IEEE Access, 2019) [[paper](https://ieeexplore.ieee.org/document/8848395)]
 34 | 
 35 | * An Overview of Multi-Task Learning in Deep Neural Networks (arXiv, 2017) [[paper](http://arxiv.org/abs/1706.05098)]
 36 | 
 37 | ## Benchmarks & Code
 38 | <details>
 39 |   <summary>Benchmarks</summary>
 40 | 
 41 | ### Dense Prediction Tasks
 42 | 
 43 | * **[NYUv2]** Indoor Segmentation and Support Inference from RGBD Images (ECCV, 2012) [[paper](https://cs.nyu.edu/~silberman/papers/indoor_seg_support.pdf)] [[dataset](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html)]
 44 | 
 45 | * **[Cityscapes]** The Cityscapes Dataset for Semantic Urban Scene Understanding (CVPR, 2016) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7780719)] [[dataset](https://www.cityscapes-dataset.com/)]
 46 | 
 47 | * **[PASCAL-Context]** The Role of Context for Object Detection and Semantic Segmentation in the Wild (CVPR, 2014) [[paper](https://cs.stanford.edu/~roozbeh/pascal-context/mottaghi_et_al_cvpr14.pdf)] [[dataset](https://cs.stanford.edu/~roozbeh/pascal-context/)]
 48 | 
 49 | * **[Taskonomy]** Taskonomy: Disentangling Task Transfer Learning (CVPR, 2018 [best paper]) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Zamir_Taskonomy_Disentangling_Task_CVPR_2018_paper.pdf)] [[dataset](http://taskonomy.stanford.edu/)]
 50 | 
 51 | * **[KITTI]** Vision meets robotics: The KITTI dataset (IJRR, 2013) [[paper](http://www.cvlibs.net/publications/Geiger2013IJRR.pdf)] [dataset](http://www.cvlibs.net/datasets/kitti/)
 52 | 
 53 | * **[SUN RGB-D]** SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite (CVPR 2015) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7298655)] [[dataset](https://rgbd.cs.princeton.edu)]
 54 | 
 55 | * **[BDD100K]** BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning (CVPR, 2020) [[paper](https://openaccess.thecvf.com/content_CVPR_2020/papers/Yu_BDD100K_A_Diverse_Driving_Dataset_for_Heterogeneous_Multitask_Learning_CVPR_2020_paper.pdf)] [[dataset](https://bdd-data.berkeley.edu/)]
 56 | 
 57 | * **[Omnidata]** Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans (ICCV, 2021) [[paper](https://arxiv.org/pdf/2110.04994.pdf)] [[project](https://omnidata.vision)]
 58 | 
 59 | * **Cityscapes-3D** Joint 2D-3D Multi-task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation. [[dataset and code](https://github.com/prismformore/Multi-Task-Transformer/tree/main/TaskPrompter)]
 60 | 
 61 | ### Image Classification
 62 | 
 63 | * **[Meta-dataset]** Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples (ICLR, 2020) [[paper](https://openreview.net/pdf?id=rkgAGAVKPr)] [[dataset](https://github.com/google-research/meta-dataset)]
 64 | 
 65 | * **[Visual Domain Decathlon]** Learning multiple visual domains with residual adapters (NeurIPS, 2017) [[paper](https://arxiv.org/abs/1705.08045)] [[dataset](https://www.robots.ox.ac.uk/~vgg/decathlon/)]
 66 | 
 67 | * **[CelebA]** Deep Learning Face Attributes in the Wild (ICCV, 2015) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7410782)] [[dataset](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html)]
 68 | </details>
 69 | 
 70 | 
 71 | <details>
 72 |   <summary>Code</summary>
 73 | 
 74 | * [[TorchJD](https://github.com/TorchJD/torchjd)]: A library for multi-objective optimization (focusing on gradient combination) of pytorch models.
 75 | 
 76 | * [[Multi-Task-Transformer](https://github.com/prismformore/Multi-Task-Transformer)]: Transformer for Multi-task Learning including dense prediction problems and 3D detection on Cityscapes.
 77 | 
 78 | * [[Multi-Task-Learning-PyTorch](https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorch)]: Multi-task Dense Prediction.
 79 | 
 80 | * [[Auto-λ](https://github.com/lorenmt/auto-lambda)]: Multi-task Dense Prediction, Robotics.
 81 | 
 82 | * [[UniversalRepresentations](https://github.com/VICO-UoE/UniversalRepresentations)]: [Multi-task Dense Prediction](https://github.com/VICO-UoE/UniversalRepresentations/tree/main/DensePred) (including different loss weighting strategies), [Multi-domain Classification](https://github.com/VICO-UoE/UniversalRepresentations/tree/main/VisualDecathlon), [Cross-domain Few-shot Learning](https://github.com/VICO-UoE/URL).
 83 | 
 84 | * [[MTAN](https://github.com/lorenmt/mtan)]: Multi-task Dense Prediction, Multi-domain Classification.
 85 | 
 86 | * [[ASTMT](https://github.com/facebookresearch/astmt)]: Multi-task Dense Prediction.
 87 | 
 88 | * [[LibMTL](https://github.com/median-research-group/libmtl)]: Multi-task Dense Prediction.
 89 | 
 90 | * [[MTPSL](https://github.com/VICO-UoE/MTPSL)]: Multi-task Partially-supervised Learning for Dense Prediction.
 91 | 
 92 | * [[Resisual Adapater](https://github.com/srebuffi/residual_adapters)]: Multi-domain Classification.
 93 | </details>
 94 | 
 95 | ## Papers
 96 | 
 97 | ### 2025
 98 | 
 99 | * Revisit the Imbalance Optimization in Multi-task Learning: An Experimental Analysis (arXiv, 2025) [[paper](https://arxiv.org/abs/2509.23915)]
100 | * Beyond Losses Reweighting: Empowering Multi-Task Learning via the Generalization Perspective (ICCV Highlight, 2025) [[paper](https://arxiv.org/abs/2211.13723)] [[code](https://github.com/VietHoang1512/FS-MTL)]
101 | * Jacobian Descent for Multi-Objective Optimization (arXiv, 2025) [[paper](https://arxiv.org/abs/2406.16232)]
102 | 
103 | ### 2024
104 | 
105 | * Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models (EMNLP, 2024) [[paper](https://aclanthology.org/2024.emnlp-main.847.pdf)]
106 | 
107 | * MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders (ECCV, 2024) [[paper](https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/08907.pdf)] [[code](https://github.com/EnVision-Research/MTMamba)]
108 |   
109 | * Learning Representation for Multitask Learning through Self-Supervised Auxiliary Learning (ECCV, 2024) [[paper](https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/10369.pdf)]
110 | 
111 | * Fair Resource Allocation in Multi-Task Learning (ICML, 2024) [[paper](https://arxiv.org/abs/2402.15638)] [[code](https://github.com/OptMN-Lab/fairgrad)]
112 |   
113 | * Bayesian Uncertainty for Gradient Aggregation in Multi-Task Learning (ICML, 2024) [[paper](https://arxiv.org/pdf/2402.04005)] [[code](https://github.com/ssi-research/BayesAgg_MTL)]
114 | 
115 | * Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning (arXiv, 2024) [[paper](https://arxiv.org/pdf/2403.10252.pdf)]
116 | 
117 | * Multi-Task Dense Prediction via Mixture of Low-Rank Experts (CVPR, 2024) [[paper](https://arxiv.org/abs/2403.17749)] [[code](https://github.com/YuqiYang213/MLoRE)]
118 | 
119 | * Joint-Task Regularization for Partially Labeled Multi-Task Learning (CVPR, 2024) [[paper](https://arxiv.org/pdf/2404.01976v1.pdf)] [[code](https://github.com/KentoNishi/JTR-CVPR-2024)]
120 |   
121 | * MTLoRA: Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR, 2024) [[paper](https://arxiv.org/pdf/2403.20320)] [[code](https://github.com/scale-lab/MTLoRA)]
122 | 
123 | * FedHCA2: Towards Hetero-Client Federated Multi-Task Learning (CVPR, 2024) [[paper](https://arxiv.org/pdf/2311.13250)] [[code](https://github.com/innovator-zero/FedHCA2)]
124 | 
125 | * Going Beyond Multi-Task Dense Prediction with Synergy Embedding Models (CVPR, 2024) [[paper](https://openaccess.thecvf.com/content/CVPR2024/papers/Huang_Going_Beyond_Multi-Task_Dense_Prediction_with_Synergy_Embedding_Models_CVPR_2024_paper.pdf)]
126 |   
127 | * Efficient Multitask Dense Predictor via Binarization (CVPR, 2024) [[paper](https://openaccess.thecvf.com/content/CVPR2024/papers/Shang_Efficient_Multitask_Dense_Predictor_via_Binarization_CVPR_2024_paper.pdf)]
128 | 
129 | * DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data (CVPR, 2024) [[paper](https://arxiv.org/abs/2403.15389)] [[code](https://github.com/prismformore/DiffusionMTL)]
130 | 
131 | * Bayesian Uncertainty for Gradient Aggregation in Multi-Task Learning (arXiv, 2024) [[paper](https://arxiv.org/pdf/2402.04005.pdf)] [[code](https://github.com/ssi-research/BayesAgg_MTL)]
132 | 
133 | * Representation Surgery for Multi-Task Model Merging (arXiv, 2024) [[paper](https://arxiv.org/pdf/2402.02705.pdf)] [[code](https://github.com/EnnengYang/RepresentationSurgery)]
134 | 
135 | * Multi-task Learning with 3D-Aware Regularization (ICLR, 2024) [[paper](https://openreview.net/attachment?id=TwBY17Hgiy&name=pdf)] [[code](https://github.com/VICO-UoE/MTPSL)]
136 | 
137 | * AdaMerging: Adaptive Model Merging for Multi-Task Learning (ICLR, 2024) [[paper](https://openreview.net/attachment?id=nZP6NgD3QY&name=pdf)] [[code](https://github.com/EnnengYang/AdaMerging)]
138 |   
139 | * Merging Multi-Task Models via Weight-Ensembling Mixture of Experts (ICLR, 2024) [[paper](https://openreview.net/pdf?id=nLRKnO74RB)]
140 |   
141 | * ZipIt! Merging Models from Different Tasks without Training (ICLR, 2024) [[paper](https://openreview.net/attachment?id=LEYUkvdUhq&name=pdf)] [[code](https://github.com/gstoica27/ZipIt)]
142 | 
143 | * Denoising Task Routing for Diffusion Models (ICLR, 2024) [[paper](https://openreview.net/attachment?id=MY0qlcFcUg&name=pdf)] [[code](https://byeongjun-park.github.io/DTR/)]
144 | 
145 | * Active Learning with Task Consistency and Diversity in Multi-Task Networks (WACV, 2024) [[paper](https://openaccess.thecvf.com/content/WACV2024/papers/Hekimoglu_Active_Learning_With_Task_Consistency_and_Diversity_in_Multi-Task_Networks_WACV_2024_paper.pdf)] [[code](https://github.com/aralhekimoglu/mtal)]
146 | 
147 | ### 2023
148 | 
149 | * Direction-oriented Multi-objective Learning: Simple and Provable Stochastic Algorithms (Neurips, 2023) [[paper](https://arxiv.org/abs/2305.18409)] [[code](https://github.com/OptMN-Lab/sdmgrad)]
150 | 
151 | * Addressing Negative Transfer in Diffusion Models (Neurips, 2023) [[paper](https://openreview.net/pdf?id=3G2ec833mW)] [[code](https://github.com/gohyojun15/ANT_diffusion)]
152 | 
153 | * Rethinking of Feature Interaction for Multi-task Learning on Dense Prediction (arXiv, 2023) [[paper](https://arxiv.org/pdf/2312.13514.pdf)]
154 | 
155 | * PolyMaX: General Dense Prediction with Mask Transformer (arXiv, 2023) [[paper](https://arxiv.org/pdf/2311.05770.pdf)] [[code](https://github.com/google-research/deeplab2)]
156 | 
157 | * Challenging Common Assumptions in Multi-task Learning (arXiv, 2023) [[paper](https://arxiv.org/pdf/2311.04698.pdf)]
158 | 
159 | * Data exploitation: multi-task learning of object detection and semantic segmentation on partially annotated data (BMVC, 2023) [[paper](https://arxiv.org/pdf/2311.04040.pdf)] [[code](https://github.com/lhoangan/multas)]
160 | 
161 | * Factorized Tensor Networks for Multi-task and Multi-domain Learning (arXiv, 2023) [[paper](https://arxiv.org/pdf/2310.06124.pdf)]
162 | 
163 | * UMT-Net: A Uniform Multi-Task Network with Adaptive Task Weighting (TIV, 2023) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10264163&casa_token=M3FwWSHrnG8AAAAA:lgQdSxiw05Xt5enCrn9wWxCoxxn40vmtkdw_U3gdoqmCjN_ge36-iDWScvODpvLWck6zx1VlyQQ?tag=1)]
164 | 
165 | * Label Budget Allocation in Multi-Task Learning (arXiv, 2023) [[paper](https://arxiv.org/pdf/2308.12949.pdf)]
166 | 
167 | * Efficient Controllable Multi-Task Architectures (arXiv, 2023) [[paper](https://arxiv.org/pdf/2308.11744.pdf)]
168 | 
169 | * Foundation Model is Efficient Multimodal Multitask Model Selector (arXiv, 2023) [[paper](https://arxiv.org/abs/2308.06262)] [[code](https://github.com/OpenGVLab/Multitask-Model-Selector)]
170 | 
171 | * Deformable Mixer Transformer with Gating for Multi-Task Learning of Dense Prediction (arXiv, 2023) [[paper](https://arxiv.org/abs/2308.05721)] [[code](https://github.com/yangyangxu0/DeMTG)]
172 | 
173 | * AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts (ICCV, 2023) [[paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Chen_AdaMV-MoE_Adaptive_Multi-Task_Vision_Mixture-of-Experts_ICCV_2023_paper.pdf)] [[code](https://github.com/google-research/google-research/tree/master/moe_mtl)]
174 | 
175 | * Deep Multitask Learning with Progressive Parameter Sharing (ICCV, 2023) [[paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Shi_Deep_Multitask_Learning_with_Progressive_Parameter_Sharing_ICCV_2023_paper.pdf)]
176 | 
177 | * Achievement-based Training Progress Balancing for Multi-Task Learning (ICCV, 2023) [[paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Yun_Achievement-Based_Training_Progress_Balancing_for_Multi-Task_Learning_ICCV_2023_paper.pdf)] [[code](https://github.com/samsung/Achievement-based-MTL)]
178 | 
179 | * Multi-Task Learning with Knowledge Distillation for Dense Prediction (ICCV, 2023) [[paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Xu_Multi-Task_Learning_with_Knowledge_Distillation_for_Dense_Prediction_ICCV_2023_paper.pdf)]
180 | 
181 | * Vision Transformer Adapters for Generalizable Multitask Learning (ICCV, 2023) [[paper](https://arxiv.org/abs/2308.12372)] [[code](https://ivrl.github.io/VTAGML/)]
182 | 
183 | * TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts (ICCV, 2023) [[paper](https://arxiv.org/pdf/2307.15324.pdf)] 
184 | 
185 | * Prompt Guided Transformer for Multi-Task Dense Prediction (arXiv, 2023) [[paper](https://arxiv.org/pdf/2307.15362.pdf)]
186 | 
187 | * Auxiliary Learning as an Asymmetric Bargaining Game (ICML, 2023) [[paper](https://arxiv.org/pdf/2301.13501.pdf)] [[code](https://github.com/AvivSham/auxinash)]
188 | 
189 | * Learning to Modulate pre-trained Models in RL (arXiv, 2023) [[paper](https://arxiv.org/abs/2306.14884)] [[code](https://github.com/ml-jku/L2M)]
190 | 
191 | * **[InvPT++]**: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding (arXiv, 2023) [[paper](https://arxiv.org/pdf/2306.04842.pdf)] [[code](https://github.com/prismformore/Multi-Task-Transformer/tree/main/InvPT)]
192 | 
193 | * FAMO: Fast Adaptive Multitask Optimization (arXiv, 2023) [[paper](https://arxiv.org/pdf/2306.03792.pdf)] [[code](https://github.com/Cranial-XIX/FAMO)]
194 | 
195 | * Sample-Level Weighting for Multi-Task Learning with Auxiliary Tasks (arXiv, 2023) [[paper](https://arxiv.org/pdf/2306.04519.pdf)]
196 | 
197 | * DynaShare: Task and Instance Conditioned Parameter Sharing for Multi-Task Learning (arXiv, 2023) [[paper](https://arxiv.org/abs/2305.17305)]
198 | 
199 | * Planning-oriented Autonomous Driving (CVPR, 2023, **Best Paper**) [[paper](https://arxiv.org/pdf/2212.10156.pdf)] [[code](https://github.com/OpenDriveLab/UniAD)]
200 | 
201 | * MDL-NAS: A Joint Multi-domain Learning Framework for Vision Transformer (CVPR, 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_MDL-NAS_A_Joint_Multi-Domain_Learning_Framework_for_Vision_Transformer_CVPR_2023_paper.pdf)]
202 | 
203 | * Hierarchical Prompt Learning for Multi-Task Learning (CVPR, 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_Hierarchical_Prompt_Learning_for_Multi-Task_Learning_CVPR_2023_paper.pdf)]
204 | 
205 | * Independent Component Alignment for Multi-Task Learning (CVPR, 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/papers/Senushkin_Independent_Component_Alignment_for_Multi-Task_Learning_CVPR_2023_paper.pdf)] [[code](https://github.com/SamsungLabs/MTL)]
206 | 
207 | * ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning (TMLR, 2023) [[paper](https://arxiv.org/abs/2301.12618)] [[code]()]
208 | 
209 | * MetaMorphosis: Task-oriented Privacy Cognizant Feature Generation for Multi-task Learning (arXiv, 2023) [[paper](https://arxiv.org/abs/2305.07815)]
210 | 
211 | * ESSR: Evolving Sparse Sharing Representation for Multi-task Learning (arXiv, 2023) [[paper](https://ieeexplore.ieee.org/abstract/document/10114675)]
212 | 
213 | * AutoTaskFormer: Searching Vision Transformers for Multi-task Learning (arXiv, 2023) [[paper](https://arxiv.org/pdf/2304.08756.pdf)]
214 | 
215 | * AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations (arXiv, 2023) [[paper](https://arxiv.org/pdf/2304.04959.pdf)]
216 | 
217 | * A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision (arXiv, 2023) [[paper](https://arxiv.org/pdf/2303.17376.pdf)]
218 | 
219 | * Efficient Computation Sharing for Multi-Task Visual Scene Understanding (arXiv, 2023) [[paper](https://arxiv.org/pdf/2303.09663.pdf)]
220 | 
221 | * Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners (CVPR, 2023) [[paper](https://arxiv.org/pdf/2212.08066.pdf)] [[code](https://vis-www.cs.umass.edu/mod-squad/)]
222 | 
223 | * Mitigating Task Interference in Multi-Task Learning via Explicit Task Routing with Non-Learnable Primitives (CVPR, 2023) [[paper](http://hal.cse.msu.edu/assets/pdfs/papers/2023-cvpr-multi-task-learning-non-learnable-task-routing.pdf)] [[code](https://github.com/zhichao-lu/etr-nlp-mtl)]
224 | 
225 | * Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach (ICLR, 2023) [[paper](https://openreview.net/forum?id=dLAYGdKTi2)] 
226 | 
227 | * UNIVERSAL FEW-SHOT LEARNING OF DENSE PREDIC- TION TASKS WITH VISUAL TOKEN MATCHING (ICLR, 2023) [[paper](https://openreview.net/pdf?id=88nT0j5jAn)]
228 | 
229 | * TASKPROMPTER: SPATIAL-CHANNEL MULTI-TASK PROMPTING FOR DENSE SCENE UNDERSTANDING (ICLR, 2023) [[paper](https://openreview.net/forum?id=-CwPopPJda)] [[code](https://github.com/prismformore/Multi-Task-Transformer/tree/main/TaskPrompter)] [[dataset](https://arxiv.org/pdf/2304.00971.pdf)]
230 | 
231 | * Contrastive Multi-Task Dense Prediction (AAAI 2023) [[paper](https://laos-y.github.io/uploads/yang2023AAAI/2437.YangS.pdf)]
232 | 
233 | * Composite Learning for Robust and Effective Dense Predictions (WACV, 2023) [[paper](https://arxiv.org/abs/2210.07239)]
234 | 
235 | * Toward Edge-Efficient Dense Predictions with Synergistic Multi-Task Neural Architecture Search (WACV, 2023) [[paper](https://arxiv.org/abs/2210.01384)] 
236 | 
237 | * Cross-task Attention Mechanism for Dense Multi-task Learning (WACV, 2023) [[paper](https://openaccess.thecvf.com/content/WACV2023/papers/Lopes_Cross-Task_Attention_Mechanism_for_Dense_Multi-Task_Learning_WACV_2023_paper.pdf)] [[code](https://github.com/astra-vision/DenseMTL)]
238 | 
239 | ### 2022
240 | 
241 | * RepMode: Learning to Re-parameterize Diverse Experts for Subcellular Structure Prediction (arXiv, 2022) [[paper](https://arxiv.org/abs/2212.10066)]
242 | 
243 | * LEARNING USEFUL REPRESENTATIONS FOR SHIFTING TASKS AND DISTRIBUTIONS (arXiv, 2022) [[paper](https://arxiv.org/abs/2212.07346)]
244 | 
245 | * Sub-Task Imputation via Self-Labelling to Train Image Moderation Models on Sparse Noisy Data (ACM CIKM, 2022) [[paper](https://dl.acm.org/doi/pdf/10.1145/3511808.3557149)]
246 | 
247 | * Multi-Task Meta Learning: learn how to adapt to unseen tasks (arXiv, 2022) [[paper](https://arxiv.org/pdf/2210.06989.pdf)]
248 | 
249 | * M<sup>3</sup>ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design (NeurIPS, 2022) [[paper](https://openreview.net/pdf?id=cFOhdl1cyU-)] [[code](https://github.com/VITA-Group/M3ViT)]
250 | 
251 | * AutoMTL: A Programming Framework for Automating Efficient Multi-Task Learning (NeurIPS, 2022) [[paper](https://arxiv.org/abs/2110.13076)] [[code](https://github.com/zhanglijun95/AutoMTL)]
252 | 
253 | * Association Graph Learning for Multi-Task Classification with Category Shifts (NeurIPS, 2022) [[paper](https://arxiv.org/pdf/2210.04637.pdf)] [[code](https://github.com/autumn9999/MTC-with-Category-Shifts)]
254 | 
255 | * Do Current Multi-Task Optimization Methods in Deep Learning Even Help? (NeurIPS, 2022) [[paper](https://arxiv.org/abs/2209.11379)]
256 | 
257 | * Task Discovery: Finding the Tasks that Neural Networks Generalize on (NeurIPS, 2022) [[paper](https://taskdiscovery.epfl.ch/static/paper/arxiv.pdf)]
258 | 
259 | * **[Auto-λ]** Auto-λ: Disentangling Dynamic Task Relationships (TMLR, 2022) [[paper](https://arxiv.org/pdf/2202.03091.pdf)] [[code](https://github.com/lorenmt/auto-lambda)]
260 | 
261 | * **[Universal Representations]** Universal Representations: A Unified Look at Multiple Task and Domain Learning (arXiv, 2022) [[paper](https://arxiv.org/pdf/2204.02744.pdf)] [[code](https://github.com/VICO-UoE/UniversalRepresentations)]
262 | 
263 | * MTFormer: Multi-Task Learning via Transformer and Cross-Task Reasoning (ECCV, 2022) [[paper](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136870299.pdf)] 
264 | 
265 | * Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space (ECCV, 2022) [[paper](https://arxiv.org/abs/2207.03036)] [[code](https://github.com/TencentARC/SFDA)]
266 | 
267 | * Factorizing Knowledge in Neural Networks (ECCV, 2022) [[paper](https://arxiv.org/abs/2207.03337)] [[code](https://github.com/Adamdad/KnowledgeFactor)]
268 | 
269 | * **[InvPT]** Inverted Pyramid Multi-task Transformer for Dense Scene Understanding (ECCV, 2022) [[paper](https://arxiv.org/pdf/2203.07997.pdf)] [[code](https://github.com/prismformore/InvPT)]
270 | 
271 | * **[MultiMAE]** MultiMAE: Multi-modal Multi-task Masked Autoencoders (ECCV, 2022) [[paper](https://arxiv.org/pdf/2204.01678.pdf)] [[code](https://multimae.epfl.ch)]
272 | 
273 | * A Multi-objective / Multi-task Learning Framework Induced by Pareto Stationarity (ICML, 2022) [[paper](https://proceedings.mlr.press/v162/momma22a.html)]
274 | 
275 | * Mitigating Modality Collapse in Multimodal VAEs via Impartial Optimization (ICML, 2022) [[paper](https://proceedings.mlr.press/v162/javaloy22a.html)]
276 | 
277 | * Active Multi-Task Representation Learning (ICML, 2022) [[paper](https://proceedings.mlr.press/v162/chen22j.html)]
278 | 
279 | * Generative Modeling for Multi-task Visual Learning (ICML, 2022) [[paper](https://proceedings.mlr.press/v162/bao22c.html)] [[code](https://github.com/zpbao/multi-task-oriented_generative_modeling)]
280 | 
281 | * Multi-Task Learning as a Bargaining Game (ICML, 2022) [[paper](https://proceedings.mlr.press/v162/navon22a.html)] [[code](https://github.com/AvivNavon/nash-mtl)]
282 | 
283 | * Multi-Task Learning with Multi-query Transformer for Dense Prediction (arXiv, 2022) [[paper](https://arxiv.org/pdf/2205.14354.pdf)]
284 | 
285 | * **[Gato]** A Generalist Agent (arXiv, 2022) [[paper](https://arxiv.org/pdf/2205.06175.pdf)]
286 | 
287 | * **[MTPSL]** Learning Multiple Dense Prediction Tasks from Partially Annotated Data (CVPR, 2022, **Best Paper Finalist**) [[paper](https://arxiv.org/pdf/2111.14893.pdf)] [[code](https://github.com/VICO-UoE/MTPSL)]
288 | 
289 | * **[TSA]** Cross-domain Few-shot Learning with Task-specific Adapters (CVPR, 2022) [[paper](https://arxiv.org/pdf/2107.00358.pdf)] [[code](https://github.com/VICO-UoE/URL)]
290 | 
291 | * **[OMNIVORE]** OMNIVORE: A Single Model for Many Visual Modalities (CVPR, 2022) [[paper](https://arxiv.org/pdf/2201.08377.pdf)] [[code](https://github.com/facebookresearch/omnivore)]
292 | 
293 | * Task Adaptive Parameter Sharing for Multi-Task Learning (CVPR, 2022) [[paper](https://arxiv.org/pdf/2203.16708.pdf)]
294 | 
295 | * Controllable Dynamic Multi-Task Architectures (CVPR, 2022) [[paper](https://arxiv.org/pdf/2203.14949.pdf)] [[code](https://www.nec-labs.com/~mas/DYMU/)]
296 | 
297 | * **[SHIFT]** SHIFT: A Synthetic Driving Dataset for Continuous Multi-Task Domain Adaptation (CVPR, 2022) [[paper](https://openaccess.thecvf.com/content/CVPR2022/papers/Sun_SHIFT_A_Synthetic_Driving_Dataset_for_Continuous_Multi-Task_Domain_Adaptation_CVPR_2022_paper.pdf)] [[code](https://www.vis.xyz/shift/)]
298 | 
299 | * DiSparse: Disentangled Sparsification for Multitask Model Compression (CVPR, 2022) [[paper](https://openaccess.thecvf.com/content/CVPR2022/papers/Sun_DiSparse_Disentangled_Sparsification_for_Multitask_Model_Compression_CVPR_2022_paper.pdf)] [[code](https://github.com/SHI-Labs/DiSparse-Multitask-Model-Compression)]
300 | 
301 | * **[MulT]** MulT: An End-to-End Multitask Learning Transformer (CVPR, 2022) [[paper](https://openaccess.thecvf.com/content/CVPR2022/papers/Bhattacharjee_MulT_An_End-to-End_Multitask_Learning_Transformer_CVPR_2022_paper.pdf)] [[code](https://github.com/IVRL/MulT)]
302 | 
303 | * Sound and Visual Representation Learning with Multiple Pretraining Tasks (CVPR, 2022) [[paper](https://openaccess.thecvf.com/content/CVPR2022/papers/Vasudevan_Sound_and_Visual_Representation_Learning_With_Multiple_Pretraining_Tasks_CVPR_2022_paper.pdf)]
304 | 
305 | * Medusa: Universal Feature Learning via Attentional Multitasking (CVPR Workshop, 2022) [[paper](https://arxiv.org/abs/2204.05698)]
306 | 
307 | * An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems (arXiv, 2022) [[paper](https://arxiv.org/pdf/2205.12755.pdf)] [[code](https://github.com/google-research/google-research/tree/master/muNet)]
308 | 
309 | * Combining Modular Skills in Multitask Learning (arXiv, 2022) [[paper](https://arxiv.org/pdf/2202.13914.pdf)]
310 | 
311 | * Visual Representation Learning over Latent Domains (ICLR, 2022) [[paper](https://openreview.net/pdf?id=kG0AtPi6JI1)]
312 | 
313 | * ADARL: What, Where, and How to Adapt in Transfer Reinforcement Learning (ICLR, 2022) [[paper](https://openreview.net/pdf?id=8H5bpVwvt5)] [[code](https://github.com/Adaptive-RL/AdaRL-code)]
314 | 
315 | * Towards a Unified View of Parameter-Efficient Transfer Learning (ICLR, 2022) [[paper](https://openreview.net/pdf?id=0RDcd5Axok)] [[code](https://github.com/jxhe/unify-parameter-efficient-tuning)]
316 | 
317 | * **[Rotograd]** Rotograd: Dynamic Gradient Homogenization for Multi-Task Learning (ICLR, 2022) [[paper](https://openreview.net/pdf?id=T8wHz4rnuGL)] [[code](https://github.com/adrianjav/rotograd)]
318 | 
319 | * Relational Multi-task Learning: Modeling Relations Between Data and Tasks (ICLR, 2022) [[paper](https://openreview.net/pdf?id=8Py-W8lSUgy)]
320 | 
321 | * Weighted Training for Cross-task Learning (ICLR, 2022) [[paper](https://openreview.net/pdf?id=ltM1RMZntpu)] [[code](https://github.com/CogComp/TAWT)]
322 | 
323 | * Semi-supervised Multi-task Learning for Semantics and Depth (WACV, 2022) [[paper](https://openaccess.thecvf.com/content/WACV2022/papers/Wang_Semi-Supervised_Multi-Task_Learning_for_Semantics_and_Depth_WACV_2022_paper.pdf)]
324 | 
325 | * In Defense of the Unitary Scalarization for Deep Multi-Task Learning (arXiv, 2022) [[paper](https://arxiv.org/pdf/2201.04122.pdf)]
326 | 
327 | ### 2021
328 | 
329 | * Variational Multi-Task Learning with Gumbel-Softmax Priors (NeurIPS, 2021) [[paper](https://arxiv.org/pdf/2111.05323.pdf)] [[code](https://github.com/autumn9999/VMTL)]
330 | 
331 | * Efficiently Identifying Task Groupings for Multi-Task Learning (NeurIPS, 2021) [[paper](http://arxiv.org/abs/2109.04617)]
332 | 
333 | * **[CAGrad]** Conflict-Averse Gradient Descent for Multi-task Learning (NeurIPS, 2021) [[paper](https://openreview.net/pdf?id=_61Qh8tULj_)] [[code](https://github.com/Cranial-XIX/CAGrad)]
334 | 
335 | * A Closer Look at Loss Weighting in Multi-Task Learning (arXiv, 2021) [[paper](https://arxiv.org/pdf/2111.10603.pdf)]
336 | 
337 | * Exploring Relational Context for Multi-Task Dense Prediction (ICCV, 2021) [[paper](http://arxiv.org/abs/2104.13874)] [[code](https://github.com/brdav/atrc)]
338 | 
339 | * Multi-Task Self-Training for Learning General Representations (ICCVW, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/papers/Ghiasi_Multi-Task_Self-Training_for_Learning_General_Representations_ICCV_2021_paper.pdf)]
340 | 
341 | * Task Switching Network for Multi-task Learning (ICCV, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/html/Sun_Task_Switching_Network_for_Multi-Task_Learning_ICCV_2021_paper.html)] [[code](https://github.com/GuoleiSun/TSNs)]
342 | 
343 | * Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans (ICCV, 2021) [[paper](https://arxiv.org/pdf/2110.04994.pdf)] [[project](https://omnidata.vision)]
344 | 
345 | * Robustness via Cross-Domain Ensembles (ICCV, 2021) [[paper](https://arxiv.org/abs/2103.10919)] [[code](https://github.com/EPFL-VILAB/XDEnsembles)]
346 | 
347 | * Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation (ICCV, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Domain_Adaptive_Semantic_Segmentation_With_Self-Supervised_Depth_Estimation_ICCV_2021_paper.pdf)] [[code](https://qin.ee/corda)]
348 | 
349 | * **[URL]** Universal Representation Learning from Multiple Domains for Few-shot Classification (ICCV, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Universal_Representation_Learning_From_Multiple_Domains_for_Few-Shot_Classification_ICCV_2021_paper.pdf)] [[code](https://github.com/VICO-UoE/URL)]
350 | 
351 | * **[tri-M]** A Multi-Mode Modulator for Multi-Domain Few-Shot Classification (ICCV, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_A_Multi-Mode_Modulator_for_Multi-Domain_Few-Shot_Classification_ICCV_2021_paper.pdf)] [[code](https://github.com/csyanbin/tri-M-ICCV)]
352 | 
353 | * MultiTask-CenterNet (MCN): Efficient and Diverse Multitask Learning using an Anchor Free Approach (ICCV Workshop, 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021W/ERCVAD/papers/Heuer_MultiTask-CenterNet_MCN_Efficient_and_Diverse_Multitask_Learning_Using_an_Anchor_ICCVW_2021_paper.pdf)]
354 | 
355 | * See Yourself in Others: Attending Multiple Tasks for Own Failure Detection (arXiv, 2021) [[paper](https://arxiv.org/pdf/2110.02549.pdf)]
356 | 
357 | * A Multi-Task Cross-Task Learning Architecture for Ad-hoc Uncertainty Estimation in 3D Cardiac MRI Image Segmentation (CinC, 2021) [[paper](https://www.cinc.org/2021/Program/accepted/115_Preprint.pdf)] [[code](https://github.com/SMKamrulHasan/MTCTL)]
358 | 
359 | * Multi-Task Reinforcement Learning with Context-based Representations (ICML, 2021) [[paper](http://arxiv.org/abs/2102.06177)]
360 | 
361 | * **[FLUTE]** Learning a Universal Template for Few-shot Dataset Generalization (ICML, 2021) [[paper](https://arxiv.org/pdf/2105.07029.pdf)] [[code](https://github.com/google-research/meta-dataset)]
362 | 
363 | * Towards a Unified View of Parameter-Efficient Transfer Learning (arXiv, 2021) [[paper](http://arxiv.org/abs/2110.04366)]
364 | 
365 | * UniT: Multimodal Multitask Learning with a Unified Transformer (arXiv, 2021) [[paper](http://arxiv.org/abs/2102.10772)]
366 | 
367 | * Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation (CVPR, 2021) [[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Saha_Learning_To_Relate_Depth_and_Semantics_for_Unsupervised_Domain_Adaptation_CVPR_2021_paper.pdf)] [[code](https://github.com/susaha/ctrl-uda)]
368 | 
369 | * CompositeTasking: Understanding Images by Spatial Composition of Tasks (CVPR, 2021) [[paper](https://openaccess.thecvf.com/content/CVPR2021/html/Popovic_CompositeTasking_Understanding_Images_by_Spatial_Composition_of_Tasks_CVPR_2021_paper.html)] [[code](https://github.com/nikola3794/composite-tasking)]
370 | 
371 | * Anomaly Detection in Video via Self-Supervised and Multi-Task Learning (CVPR, 2021) [[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Georgescu_Anomaly_Detection_in_Video_via_Self-Supervised_and_Multi-Task_Learning_CVPR_2021_paper.pdf)]
372 | 
373 | * Taskology: Utilizing Task Relations at Scale (CVPR, 2021) [[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Lu_Taskology_Utilizing_Task_Relations_at_Scale_CVPR_2021_paper.pdf)]
374 | 
375 | * Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation (CVPR, 2021) [[paper](https://arxiv.org/pdf/2012.10782.pdf)] [[code](https://github.com/lhoyer/improving_segmentation_with_selfsupervised_depth)]
376 | 
377 | * Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with Self-Supervised Depth Estimation (arXiv, 2021) [[paper](https://arxiv.org/pdf/2108.12545.pdf)] [[code](https://github.com/lhoyer/improving_segmentation_with_selfsupervised_depth)]
378 | 
379 | * Counter-Interference Adapter for Multilingual Machine Translation (Findings of EMNLP, 2021) [[paper](https://aclanthology.org/2021.findings-emnlp.240)]
380 | 
381 | * Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data (ICLR) [[paper](https://openreview.net/forum?id=de11dbHzAMF)] [[code](https://github.com/CAMTL/CA-MTL)]
382 | 
383 | * **[Gradient Vaccine]** Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models (ICLR, 2021) [[paper](https://openreview.net/forum?id=F1vEjWK-lH_)] 
384 | 
385 | * **[IMTL]** Towards Impartial Multi-task Learning (ICLR, 2021) [[paper](https://openreview.net/forum?id=IMPnRXEWpvr)]
386 | 
387 | * Deciphering and Optimizing Multi-Task Learning: A Random Matrix Approach (ICLR, 2021) [[paper](https://openreview.net/forum?id=Cri3xz59ga)]
388 | 
389 | * **[URT]** A Universal Representation Transformer Layer for Few-Shot Image Classification (ICLR, 2021) [[paper](https://arxiv.org/pdf/2006.11702.pdf)] [[code](https://github.com/liulu112601/URT)]
390 | 
391 | * Flexible Multi-task Networks by Learning Parameter Allocation (ICLR Workshop, 2021) [[paper](http://arxiv.org/abs/1910.04915)]
392 | 
393 | * Multi-Loss Weighting with Coefficient of Variations (WACV, 2021) [[paper](https://openaccess.thecvf.com/content/WACV2021/papers/Groenendijk_Multi-Loss_Weighting_With_Coefficient_of_Variations_WACV_2021_paper.pdf)] [[code](https://github.com/rickgroen/cov-weighting)]
394 | 
395 | ### 2020
396 | 
397 | * Multi-Task Reinforcement Learning with Soft Modularization (NeurIPS, 2020) [[paper](http://arxiv.org/abs/2003.13661)] [[code](https://github.com/RchalYang/Soft-Module)] 
398 | * AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning (NeurIPS, 2020) [[paper](http://arxiv.org/abs/1911.12423)] [[code](https://github.com/sunxm2357/AdaShare)]
399 | 
400 | * **[GradDrop]** Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout (NeurIPS, 2020) [[paper](https://proceedings.NeurIPS.cc//paper/2020/file/16002f7a455a94aa4e91cc34ebdb9f2d-Paper.pdf)] [[code](https://github.com/tensorflow/lingvo/blob/master/lingvo/core/graddrop.py)]
401 | 
402 | * **[PCGrad]** Gradient Surgery for Multi-Task Learning (NeurIPS, 2020) [[paper](http://arxiv.org/abs/2001.06782)] [[tensorflow](https://github.com/tianheyu927/PCGrad)] [[pytorch](https://github.com/WeiChengTseng/Pytorch-PCGrad)]
403 | 
404 | * On the Theory of Transfer Learning: The Importance of Task Diversity (NeurIPS, 2020) [[paper](https://proceedings.NeurIPS.cc//paper/2020/file/59587bffec1c7846f3e34230141556ae-Paper.pdf)]
405 | 
406 | * A Study of Residual Adapters for Multi-Domain Neural Machine Translation (WMT, 2020) [[paper](https://www.aclweb.org/anthology/2020.wmt-1.72/)]
407 | 
408 | * Multi-Task Adversarial Attack (arXiv, 2020) [[paper](http://arxiv.org/abs/2011.09824)]
409 | 
410 | * Automated Search for Resource-Efficient Branched Multi-Task Networks (BMVC, 2020) [[paper](http://arxiv.org/abs/2008.10292)] [[code](https://github.com/brdav/bmtas)]
411 | * Branched Multi-Task Networks: Deciding What Layers To Share (BMVC, 2020) [[paper](http://arxiv.org/abs/1904.02920)]
412 | 
413 | * MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning (ECCV, 2020) [[paper](http://arxiv.org/abs/2001.06902)] [[code](https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorch)]
414 | 
415 | * Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference (ECCV, 2020) [[paper](http://arxiv.org/abs/2007.12540)] [[code](https://github.com/menelaoskanakis/RCM)]
416 | 
417 | * Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification (ECCV, 2020) [[paper](https://arxiv.org/pdf/2003.09338.pdf)] [[code](https://github.com/dvornikita/SUR)]
418 | 
419 | * Multitask Learning Strengthens Adversarial Robustness (ECCV 2020) [[paper](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123470154.pdf)] [[code](https://github.com/columbia/MTRobust)]
420 | 
421 | * Duality Diagram Similarity: a generic framework for initialization selection in task transfer learning (ECCV, 2020) [[paper](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123710494.pdf)] [[code](https://github.com/cvai-repo/duality-diagram-similarity)]
422 | 
423 | * **[KD4MTL]** Knowledge Distillation for Multi-task Learning (ECCV Workshop) [[paper](https://arxiv.org/pdf/2007.06889.pdf)] [[code](https://github.com/VICO-UoE/KD4MTL)]
424 | 
425 | * MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning (CVPR, 2020) [[paper](https://arxiv.org/abs/2003.14058)] [[code](https://github.com/bhpfelix/MTLNAS)]
426 | 
427 | * Robust Learning Through Cross-Task Consistency (CVPR, 2020) [[paper](https://consistency.epfl.ch/Cross_Task_Consistency_CVPR2020.pdf)] [[code](https://github.com/EPFL-VILAB/XTConsistency)]
428 | 
429 | * 12-in-1: Multi-Task Vision and Language Representation Learning (CVPR, 2020) [paper](https://openaccess.thecvf.com/content_CVPR_2020/papers/Lu_12-in-1_Multi-Task_Vision_and_Language_Representation_Learning_CVPR_2020_paper.pdf) [[code](https://github.com/facebookresearch/vilbert-multi-task)]
430 | 
431 | * A Multi-task Mean Teacher for Semi-supervised Shadow Detection (CVPR, 2020) [[paper](https://openaccess.thecvf.com/content_CVPR_2020/papers/Chen_A_Multi-Task_Mean_Teacher_for_Semi-Supervised_Shadow_Detection_CVPR_2020_paper.pdf)] [[code](https://github.com/eraserNut/MTMT)]
432 | 
433 | * MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer (EMNLP, 2020) [[paper](https://doi.org/10.18653/v1/2020.emnlp-main.617)]
434 | 
435 | * Masking as an Efficient Alternative to Finetuning for Pretrained Language Models (EMNLP, 2020) [[paper](http://arxiv.org/abs/2004.12406)] [[code](https://github.com/ptlmasking/maskbert)]
436 | 
437 | * Effcient Continuous Pareto Exploration in Multi-Task Learning (ICML, 2020) [[paper](http://proceedings.mlr.press/v119/ma20a/ma20a.pdf)] [[code](https://github.com/mit-gfx/ContinuousParetoMTL)]
438 | 
439 | * Which Tasks Should Be Learned Together in Multi-task Learning? (ICML, 2020) [[paper](http://arxiv.org/abs/1905.07553)] [[code](https://github.com/tstandley/taskgrouping)]
440 | 
441 | * Learning to Branch for Multi-Task Learning (ICML, 2020) [[paper](https://arxiv.org/abs/2006.01895)]
442 | 
443 | * Partly Supervised Multitask Learning (ICMLA, 2020) [paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9356271)
444 | 
445 | * Understanding and Improving Information Transfer in Multi-Task Learning (ICLR, 2020) [[paper](https://arxiv.org/abs/2005.00944)]
446 | 
447 | * Measuring and Harnessing Transference in Multi-Task Learning (arXiv, 2020) [[paper](https://arxiv.org/abs/2010.15413)]
448 | 
449 | * Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion Recognition (arXiv, 2020) [[paper](https://arxiv.org/pdf/1907.06078.pdf)]
450 | 
451 | * Learning Sparse Sharing Architectures for Multiple Tasks (AAAI, 2020) [[paper](http://arxiv.org/abs/1911.05034)]
452 | 
453 | * AdapterFusion: Non-Destructive Task Composition for Transfer Learning (arXiv, 2020) [[paper](http://arxiv.org/abs/2005.00247)]
454 | 
455 | ### 2019
456 | 
457 | * Adaptive Auxiliary Task Weighting for Reinforcement Learning (NeurIPS, 2019) [[paper](https://papers.nips.cc/paper/2019/hash/0e900ad84f63618452210ab8baae0218-Abstract.html)]
458 | 
459 | * Pareto Multi-Task Learning (NeurIPS, 2019) [[paper](http://papers.nips.cc/paper/9374-pareto-multi-task-learning.pdf)] [[code](https://github.com/Xi-L/ParetoMTL)]
460 | 
461 | * Modular Universal Reparameterization: Deep Multi-task Learning Across Diverse Domains (NeurIPS, 2019) [[paper](http://arxiv.org/abs/1906.00097)]
462 | 
463 | * Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes (NeurIPS, 2019) [[paper](https://github.com/cambridge-mlg/cnaps)] [[code](https://proceedings.neurips.cc/paper/2019/file/1138d90ef0a0848a542e57d1595f58ea-Paper.pdf)]
464 | 
465 | * **[Orthogonal]** Regularizing Deep Multi-Task Networks using Orthogonal Gradients (arXiv, 2019) [[paper](http://arxiv.org/abs/1912.06844)]
466 | 
467 | * Many Task Learning With Task Routing (ICCV, 2019) [[paper](https://openaccess.thecvf.com/content_ICCV_2019/papers/Strezoski_Many_Task_Learning_With_Task_Routing_ICCV_2019_paper.pdf)] [[code](https://github.com/gstrezoski/TaskRouting)]
468 | 
469 | * Stochastic Filter Groups for Multi-Task CNNs: Learning Specialist and Generalist Convolution Kernels (ICCV, 2019) [[paper](https://arxiv.org/abs/1908.09597)]
470 | 
471 | * Deep Elastic Networks with Model Selection for Multi-Task Learning (ICCV, 2019) [[paper](http://arxiv.org/abs/1909.04860)] [[code](https://github.com/rllab-snu/Deep-Elastic-Network)]
472 | 
473 | * Feature Partitioning for Efficient Multi-Task Architectures (arXiv, 2019) [[paper](https://arxiv.org/abs/1908.04339)] [[code](https://github.com/google/multi-task-architecture-search)]
474 | 
475 | * Task Selection Policies for Multitask Learning (arXiv, 2019) [[paper](http://arxiv.org/abs/1907.06214)]
476 | 
477 | * BAM! Born-Again Multi-Task Networks for Natural Language Understanding (ACL, 2019) [[paper](https://www.aclweb.org/anthology/P19-1595/)] [[code](https://github.com/google-research/google-research/tree/master/bam)]
478 | 
479 | * OmniNet: A unified architecture for multi-modal multi-task learning (arXiv, 2019) [[paper](http://arxiv.org/abs/1907.07804)]
480 | 
481 | * NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction (CVPR, 2019) [[paper](https://arxiv.org/abs/1801.08297)] [[code](https://github.com/ethanygao/NDDR-CNN)]
482 | 
483 | * **[MTAN + DWA]** End-to-End Multi-Task Learning with Attention (CVPR, 2019) [[paper](http://arxiv.org/abs/1803.10704)] [[code](https://github.com/lorenmt/mtan)] 
484 | 
485 | * Attentive Single-Tasking of Multiple Tasks (CVPR, 2019) [[paper](http://arxiv.org/abs/1904.08918)] [[code](https://github.com/facebookresearch/astmt)]
486 | 
487 | * Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation (CVPR, 2019) [[paper](https://openaccess.thecvf.com/content_CVPR_2019/papers/Zhang_Pattern-Affinitive_Propagation_Across_Depth_Surface_Normal_and_Semantic_Segmentation_CVPR_2019_paper.pdf)]
488 | 
489 | * Representation Similarity Analysis for Efficient Task Taxonomy & Transfer Learning (CVPR, 2019) [[paper](https://arxiv.org/abs/1904.11740)] [[code](https://github.com/kshitijd20/RSA-CVPR19-release)]
490 | 
491 | * **[Geometric Loss Strategy (GLS)]** MultiNet++: Multi-Stream Feature Aggregation and Geometric Loss Strategy for Multi-Task Learning (CVPR Workshop, 2019) [[paper](http://arxiv.org/abs/1904.08492)]
492 | 
493 | * Parameter-Efficient Transfer Learning for NLP (ICML, 2019) [[paper](http://arxiv.org/abs/1902.00751)]
494 | 
495 | * BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning (ICML, 2019) [[paper](http://arxiv.org/abs/1902.02671)] [[code](https://github.com/AsaCooperStickland/Bert-n-Pals)]
496 | 
497 | * Tasks Without Borders: A New Approach to Online Multi-Task Learning (ICML Workshop, 2019) [[paper](https://openreview.net/pdf?id=HkllV5Bs24)]
498 | 
499 | * AutoSeM: Automatic Task Selection and Mixing in Multi-Task Learning (NACCL, 2019) [[paper](https://arxiv.org/abs/1904.04153)] [[code](https://github.com/HanGuo97/AutoSeM)]
500 | 
501 | * Multi-Task Deep Reinforcement Learning with PopArt (AAAI, 2019) [[paper](https://doi.org/10.1609/aaai.v33i01.33013796)]
502 | 
503 | * SNR: Sub-Network Routing for Flexible Parameter Sharing in Multi-Task Learning (AAAI, 2019) [[paper](https://ojs.aaai.org/index.php/AAAI/article/view/3788/3666)]
504 | 
505 | * Latent Multi-task Architecture Learning (AAAI, 2019) [[paper](https://arxiv.org/abs/1705.08142)] [[code](https://github.com/ sebastianruder/sluice-networks)]
506 | 
507 | * Multi-Task Deep Neural Networks for Natural Language Understanding (ACL, 2019) [[paper](https://arxiv.org/pdf/1901.11504.pdf)]
508 | 
509 | ### 2018
510 | 
511 | * Learning to Multitask (NeurIPS, 2018) [[paper](https://papers.nips.cc/paper/2018/file/aeefb050911334869a7a5d9e4d0e1689-Paper.pdf)]
512 | 
513 | * **[MGDA]** Multi-Task Learning as Multi-Objective Optimization (NeurIPS, 2018) [[paper](http://arxiv.org/abs/1810.04650)] [[code](https://github.com/isl-org/MultiObjectiveOptimization)]
514 | 
515 | * Adapting Auxiliary Losses Using Gradient Similarity (arXiv, 2018) [[paper](http://arxiv.org/abs/1812.02224)] [[code](https://github.com/szkocot/Adapting-Auxiliary-Losses-Using-Gradient-Similarity)]
516 | 
517 | * Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights (ECCV, 2018) [[paper](https://openaccess.thecvf.com/content_ECCV_2018/papers/Arun_Mallya_Piggyback_Adapting_a_ECCV_2018_paper.pdf)] [[code](https://github.com/arunmallya/piggyback)]
518 | 
519 | * Dynamic Task Prioritization for Multitask Learning (ECCV, 2018) [[paper](https://openaccess.thecvf.com/content_ECCV_2018/papers/Michelle_Guo_Focus_on_the_ECCV_2018_paper.pdf)]
520 | 
521 | * A Modulation Module for Multi-task Learning with Applications in Image Retrieval (ECCV, 2018) [[paper](https://arxiv.org/abs/1807.06708)]
522 | 
523 | * Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts (KDD, 2018) [[paper](https://dl.acm.org/doi/pdf/10.1145/3219819.3220007)] 
524 | 
525 | * Unifying and Merging Well-trained Deep Neural Networks for Inference Stage (IJCAI, 2018) [[paper](http://arxiv.org/abs/1805.04980)] [[code](https://github.com/ivclab/NeuralMerger)]
526 | 
527 | * Efficient Parametrization of Multi-domain Deep Neural Networks (CVPR, 2018) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Rebuffi_Efficient_Parametrization_of_CVPR_2018_paper.pdf)] [[code](https://github.com/srebuffi/residual_adapters)]
528 | 
529 | * PAD-Net: Multi-tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing (CVPR, 2018) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Xu_PAD-Net_Multi-Tasks_Guided_CVPR_2018_paper.pdf)]
530 | 
531 | * NestedNet: Learning Nested Sparse Structures in Deep Neural Networks (CVPR, 2018) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Kim_NestedNet_Learning_Nested_CVPR_2018_paper.pdf)]
532 | 
533 | * PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning (CVPR, 2018) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Mallya_PackNet_Adding_Multiple_CVPR_2018_paper.pdf)] [[code](https://github.com/arunmallya/packnet)]
534 | 
535 | 
536 | * **[Uncertainty]** Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics (CVPR, 2018) [[paper](https://openaccess.thecvf.com/content_cvpr_2018/papers/Kendall_Multi-Task_Learning_Using_CVPR_2018_paper.pdf)]
537 | 
538 | * Deep Asymmetric Multi-task Feature Learning (ICML, 2018) [[paper](http://proceedings.mlr.press/v80/lee18d/lee18d.pdf)]
539 | 
540 | * **[GradNorm]** GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks (ICML, 2018) [[paper](http://arxiv.org/abs/1711.02257)]
541 | 
542 | * Pseudo-task Augmentation: From Deep Multitask Learning to Intratask Sharing---and Back (ICML, 2018) [[paper](http://arxiv.org/abs/1803.04062)]
543 | 
544 | * Gradient Adversarial Training of Neural Networks (arXiv, 2018) [[paper](http://arxiv.org/abs/1806.08028)]
545 | 
546 | * Auxiliary Tasks in Multi-task Learning (arXiv, 2018) [[paper](http://arxiv.org/abs/1805.06334)]
547 | 
548 | * Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning (ICLR, 2018) [[paper](http://arxiv.org/abs/1711.01239)] [[code](https://github.com/cle-ros/RoutingNetworks)
549 | 
550 | * Beyond Shared Hierarchies: Deep Multitask Learning through Soft Layer Ordering (ICLR, 2018) [[paper](http://arxiv.org/abs/1711.00108)]
551 | 
552 | ### 2017
553 | 
554 | * Learning multiple visual domains with residual adapters (NeurIPS, 2017) [[paper](https://papers.nips.cc/paper/2017/file/e7b24b112a44fdd9ee93bdf998c6ca0e-Paper.pdf)] [[code](https://github.com/srebuffi/residual_adapters)]
555 | 
556 | * Learning Multiple Tasks with Multilinear Relationship Networks (NeurIPS, 2017) [[paper](https://proceedings.NeurIPS.cc/paper/2017/file/03e0704b5690a2dee1861dc3ad3316c9-Paper.pdf)] [[code](https://github.com/thuml/MTlearn)]
557 | 
558 | * Federated Multi-Task Learning (NeurIPS, 2017) [[paper](https://proceedings.NeurIPS.cc/paper/2017/file/6211080fa89981f66b1a0c9d55c61d0f-Paper.pdf)] [[code](https://github.com/gingsmith/fmtl)]
559 | 
560 | * Multi-task Self-Supervised Visual Learning (ICCV, 2017) [[paper](http://arxiv.org/abs/1708.07860)]
561 | 
562 | * Adversarial Multi-task Learning for Text Classification (ACL, 2017) [[paper](http://arxiv.org/abs/1704.05742)]
563 | 
564 | * UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory (CVPR, 2017) [[paper](https://arxiv.org/abs/1609.02132)]
565 | 
566 | * Fully-adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification (CVPR, 2017) [[paper](https://openaccess.thecvf.com/content_cvpr_2017/papers/Lu_Fully-Adaptive_Feature_Sharing_CVPR_2017_paper.pdf)]
567 | 
568 | * Modular Multitask Reinforcement Learning with Policy Sketches (ICML, 2017) [[paper](http://arxiv.org/abs/1611.01796)] [[code](https://github.com/jacobandreas/psketch)]
569 | 
570 | 
571 | * SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization (ICML, 2017) [[paper](http://proceedings.mlr.press/v70/kim17b.html)] [[code](https://github.com/dalgu90/splitnet-wrn)]
572 | 
573 | * One Model To Learn Them All (arXiv, 2017) [[paper](http://arxiv.org/abs/1706.05137)] [[code](https://github.com/tensorflow/tensor2tensor)]
574 | 
575 | * **[AdaLoss]** Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing (arXiv, 2017) [[paper](http://arxiv.org/abs/1708.06832)]
576 | 
577 | * Deep Multi-task Representation Learning: A Tensor Factorisation Approach (ICLR, 2017) [[paper](https://arxiv.org/abs/1605.06391)] [[code](https://github.com/wOOL/DMTRL)]
578 | 
579 | * Trace Norm Regularised Deep Multi-Task Learning (ICLR Workshop, 2017) [[paper](http://arxiv.org/abs/1606.04038)] [[code](https://github.com/wOOL/TNRDMTL)]
580 | 
581 | * When is multitask learning effective? Semantic sequence prediction under varying data conditions (EACL, 2017) [[paper](http://arxiv.org/abs/1612.02251)] [[code](https://github.com/bplank/multitasksemantics)]
582 | 
583 | * Identifying beneficial task relations for multi-task learning in deep neural networks (EACL, 2017) [[paper](http://arxiv.org/abs/1702.08303)]
584 | 
585 | * PathNet: Evolution Channels Gradient Descent in Super Neural Networks (arXiv, 2017) [[paper](http://arxiv.org/abs/1701.08734)] [[code](https://github.com/jsikyoon/pathnet)]
586 | 
587 | * Attributes for Improved Attributes: A Multi-Task Network Utilizing Implicit and Explicit Relationships for Facial Attribute Classiﬁcation (AAAI, 2017) [[paper](https://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/viewFile/14749/14282)]
588 | 
589 | ### 2016 and earlier
590 | 
591 | * Learning values across many orders of magnitude (NeurIPS, 2016) [[paper](https://arxiv.org/abs/1602.07714)]
592 | 
593 | * Integrated Perception with Recurrent Multi-Task Neural Networks (NeurIPS, 2016) [[paper](https://proceedings.neurips.cc/paper/2016/file/06409663226af2f3114485aa4e0a23b4-Paper.pdf)] 
594 | 
595 | * Unifying Multi-Domain Multi-Task Learning: Tensor and Neural Network Perspectives (arXiv, 2016) [[paper](http://arxiv.org/abs/1611.09345)]
596 | 
597 | * Progressive Neural Networks (arXiv, 2016) [[paper](https://arxiv.org/abs/1606.04671)]
598 | 
599 | * Deep multi-task learning with low level tasks supervised at lower layers (ACL, 2016) [[paper](https://www.aclweb.org/anthology/P16-2038.pdf)]
600 | 
601 | * **[Cross-Stitch]** Cross-Stitch Networks for Multi-task Learning (CVPR,2016) [[paper](https://arxiv.org/abs/1604.03539)] [[code](https://github.com/helloyide/Cross-stitch-Networks-for-Multi-task-Learning)]
602 | 
603 | * Asymmetric Multi-task Learning based on Task Relatedness and Confidence (ICML, 2016) [[paper](http://proceedings.mlr.press/v48/leeb16.pdf)]
604 | 
605 | * MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving (arXiv, 2016) [[paper](http://arxiv.org/abs/1612.07695)] [[code](https://github.com/MarvinTeichmann/MultiNet)]
606 | 
607 | * A Unified Perspective on Multi-Domain and Multi-Task Learning (ICLR, 2015) [[paper](http://arxiv.org/abs/1412.7489)]
608 | 
609 | * Facial Landmark Detection by Deep Multi-task Learning (ECCV, 2014) [[paper](https://personal.ie.cuhk.edu.hk/~ccloy/files/eccv_2014_deepfacealign.pdf)] [[code](http://mmlab.ie.cuhk.edu.hk/projects/TCDCN.html)]
610 | 
611 | * Learning Task Grouping and Overlap in Multi-task Learning (ICML, 2012) [[paper](http://arxiv.org/abs/1206.6417)]
612 | 
613 | * Learning with Whom to Share in Multi-task Feature Learning (ICML, 2011) [[paper](http://www.cs.utexas.edu/~grauman/papers/icml2011.pdf)]
614 | 
615 | * Semi-Supervised Multi-Task Learning with Task Regularizations (ICDM, 2009) [[paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5360282)]
616 | 
617 | * Semi-Supervised Multitask Learning (NeurIPS, 2008) [[paper](https://proceedings.neurips.cc/paper/2007/file/a34bacf839b923770b2c360eefa26748-Paper.pdf)]
618 | 
619 | * Multitask Learning (1997) [[paper](https://link.springer.com/content/pdf/10.1023/A:1007379606734.pdf)]
620 | 
621 | ## [Awesome Multi-domain Multi-task Learning](https://github.com/WeiHongLee/Awesome-Multi-Domain-Multi-Task-Learning)
622 | 
623 | ## Workshops
624 | 
625 | * [Universal Representations for Computer Vision Workshop at BMVC 2022](https://sites.google.com/view/universalrepresentations)
626 | 
627 | * [Workshop on Multi-Task Learning in Computer Vision (DeepMTL) at ICCV 2021](https://sites.google.com/view/deepmtlworkshop/home)
628 | 
629 | * [Adaptive and Multitask Learning: Algorithms & Systems Workshop (AMTL) at ICML 2019](https://www.amtl-workshop.org)
630 | 
631 | * [Workshop on Multi-Task and Lifelong Reinforcement Learning at ICML 2015](https://sites.google.com/view/mtlrl)
632 | 
633 | * [Transfer and Multi-Task Learning: Trends and New Perspectives at NeurIPS 2015](https://nips.cc/Conferences/2015/Schedule?showEvent=4939)
634 | 
635 | * [Second Workshop on Transfer and Multi-task Learning at NeurIPS 2014](https://sites.google.com/site/multitaskwsnips2014/)
636 | 
637 | * [New Directions in Transfer and Multi-Task: Learning Across Domains and Tasks Workshop at NeurIPS 2013](https://sites.google.com/site/learningacross/home)
638 | 
639 | ## Online Courses
640 | 
641 | * [CS 330: Deep Multi-Task and Meta Learning](https://cs330.stanford.edu)
642 | 
643 | ## Related awesome list
644 | 
645 | * https://github.com/SimonVandenhende/Awesome-Multi-Task-Learning
646 | 
647 | * https://github.com/Manchery/awesome-multi-task-learning
648 | 
649 | * https://github.com/junfish/Awesome-Multitask-Learning
650 | 
651 | 
652 | 
653 | 


--------------------------------------------------------------------------------