├── .gitignore ├── LICENSE ├── README.md ├── images └── logo.png └── requirements.txt /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | .idea 6 | # C extensions 7 | *.so 8 | */images 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 CtrlZ1 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

2 | 
3 | 
4 | 
5 | # Domain-Adaptation-Algorithms
6 | 
7 | Welcome to my workspace. I'm Yiyang Li. Over the next three years (2021-2024) of my graduate studies, I will use this repository to record what I study about domain adaptation, such as literature introductions and code implementations. I look forward to exchanging ideas with scholars and experts in the field, and I welcome your comments.
8 | 
9 | **PS.** This code base is organized by **model**, which makes it more convenient to study a single model. If you want to avoid the cumbersome boilerplate code around it (such as data reading, etc.), you can visit the following link: https://github.com/CtrlZ1/Domain-Adaptive-CodeBase. It presents the various domain adaptation methods in the form of complete projects.
10 | 
11 | **About the code**: For various reasons it is not convenient to publish the code on GitHub. If you need it, you can reach me on QQ to collaborate: 1005461421.
12 | 
13 | # Contents
14 | 
15 | - [Domain-Adaptation-Algorithms](#Domain-Adaptation-Algorithms)
16 | - [Contents](#contents)
17 | - [Installation](#installation)
18 | - [Implementations](#implementations)
19 | - [GAN](#gan)
20 | - [WGAN](#wgan)
21 | - [WGAN-GP](#wgan-gp)
22 | - [LargeScaleOT](#LargeScaleOT)
23 | - [JCPOT](#JCPOT)
24 | - [JDOT](#JDOT)
25 | - [Deep-JDOT](#Deep-JDOT)
26 | - [DCWD](#DCWD)
27 | - [DAN](#DAN)
28 | - [WDGRL](#WDGRL)
29 | - [DDC](#DDC)
30 | - [JAN](#JAN)
31 | - [MCD](#MCD)
32 | - [SWD](#SWD)
33 | - [JPOT](#JPOT)
34 | - [NW](#NW)
35 | - [WDAN](#WDAN)
- [MCDA](#MCDA)
36 | - [ADDA](#ADDA)
37 | - [CoGAN](#CoGAN)
38 | - [CDAN](#CDAN)
39 | - [M3SDA](#M3SDA)
40 | - [CMSS](#CMSS)
41 | - [LtC-MSDA](#LtC-MSDA)
42 | - [Dirt-T](#Dirt-T)
43 | 
44 | # Installation
45 | 
46 | ```
47 | $ cd yourfolder
48 | $ git clone https://github.com/CtrlZ1/Domain-Adaptation-Algorithms.git
49 | ```
50 | 
51 | # Implementations
52 | 
53 | ## GAN
54 | 
55 | **title**
56 | 
57 | Generative Adversarial Nets
58 | 
59 | **Times**
60 | 
61 | 2014 NIPS
62 | 
63 | **Authors**
64 | 
65 | Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley,
66 | Sherjil Ozair, Aaron Courville, Yoshua Bengio
67 | 
68 | **Abstract**
69 | 
70 | We propose a new framework for estimating generative models via an adversarial
71 | process, in which we simultaneously train two models: a generative model G that
72 | captures the data distribution, and a discriminative model D that estimates the
73 | probability that a sample came from the training data rather than G. The
74 | training procedure for G is to maximize the probability of D making a mistake.
75 | This framework corresponds to a minimax two-player game. In the space of
76 | arbitrary functions G and D, a unique solution exists, with G recovering the
77 | training data distribution and D equal to 1/2 everywhere. In the case where G and
78 | D are defined by multilayer perceptrons, the entire system can be trained with
79 | backpropagation. There is no need for any Markov chains or unrolled approximate
80 | inference networks during either training or generation of samples. Experiments
81 | demonstrate the potential of the framework through qualitative and quantitative
82 | evaluation of the generated samples.
83 | 
84 | **Content introduction**
85 | 
86 | https://blog.csdn.net/qq_41076797/article/details/118483802
87 | 
88 | **Paper address**
89 | 
90 | https://arxiv.org/abs/1406.2661
91 | 
92 | ## WGAN
93 | 
94 | **title**
95 | 
96 | Wasserstein GAN
97 | 
98 | **Times**
99 | 
100 | 2017
101 | 
102 | **Authors**
103 | 
104 | Martin Arjovsky, Soumith Chintala, Léon Bottou
105 | 
106 | **Abstract**
107 | 
108 | We introduce a new algorithm named WGAN, an alternative to traditional GAN training.
In this new model, we show that we can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches. Furthermore, we show that the corresponding optimization problem is sound, and provide extensive theoretical work highlighting the deep connections to other distances between distributions. 109 | 110 | **Content introduction** 111 | 112 | https://blog.csdn.net/qq_41076797/article/details/116898649 113 | 114 | **Paper address** 115 | 116 | https://arxiv.org/abs/1701.07875 117 | 118 | ## WGAN-GP 119 | 120 | **title** 121 | 122 | Improved Training of Wasserstein GANs 123 | 124 | **Times** 125 | 126 | 2017 127 | 128 | **Authors** 129 | 130 | Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron 131 | Courville 132 | 133 | **Abstract** 134 | 135 | Generative Adversarial Networks (GANs) are powerful generative models, but 136 | suffer from training instability. The recently proposed Wasserstein GAN (WGAN) 137 | makes progress toward stable training of GANs, but sometimes can still generate 138 | only poor samples or fail to converge. We find that these problems are often due 139 | to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the 140 | critic, which can lead to undesired behavior. We propose an alternative to 141 | clipping weights: penalize the norm of gradient of the critic with respect to 142 | its input. Our proposed method performs better than standard WGAN and enables 143 | stable training of a wide variety of GAN architectures with almost no 144 | hyperparameter tuning, including 101-layer ResNets and language models with 145 | continuous generators. We also achieve high quality generations on CIFAR-10 and 146 | LSUN bedrooms. 147 | 148 | **Content introduction** 149 | 150 | https://blog.csdn.net/qq_41076797/article/details/118458028 151 | 152 | **Paper address** 153 | 154 | https://arxiv.org/abs/1704.00028 155 | 156 | 157 | 158 | ## LargeScaleOT 159 | 160 | **title** 161 | 162 | Large scale optimal transport and mapping estimation 163 | 164 | **Times** 165 | 166 | 2018 167 | 168 | **Authors** 169 | 170 | Vivien Seguy、Bharath Bhushan Damodaran、Rémi Flamary、Nicolas Courty、Antoine Rolet、Mathieu Blondel 171 | 172 | **Abstract** 173 | 174 | This paper presents a novel two-step approach for the fundamental problem of 175 | learning an optimal map from one distribution to another. First, we learn an 176 | optimal transport (OT) plan, which can be thought as a one-to-many map between 177 | the two distributions. To that end, we propose a stochastic dual approach of 178 | regularized OT, and show empirically that it scales better than a recent related 179 | approach when the amount of samples is very large. Second, we estimate a Monge 180 | map as a deep neural network learned by approximating the barycentric projection 181 | of the previously-obtained OT plan. This parameterization allows generalization 182 | of the mapping outside the support of the input measure. We prove two 183 | theoretical stability results of regularized OT which show that our estimations 184 | converge to the OT plan and Monge map between the underlying continuous 185 | measures. We showcase our proposed approach on two applications: domain 186 | adaptation and generative modeling. 
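
**Code sketch (illustrative)**

The paper's contribution is a stochastic dual solver for regularized OT plus a learned barycentric (Monge) map; the sketch below only shows the classic Sinkhorn fixed-point iteration for entropy-regularized OT between two small empirical distributions, as background for what a "regularized OT plan" is. The function names and toy data are assumptions for illustration, not the authors' implementation.

```
# Entropy-regularized OT via Sinkhorn iterations (background only, not the
# stochastic dual solver proposed in the paper).
import numpy as np

def sinkhorn(a, b, C, eps=0.05, n_iter=200):
    """a, b: histograms that sum to 1; C: cost matrix; returns a coupling."""
    K = np.exp(-C / eps)                  # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                 # scale columns to match b
        u = a / (K @ v)                   # scale rows to match a
    return u[:, None] * K * v[None, :]

x = np.random.randn(5, 1)                 # toy source samples
y = np.random.randn(6, 1) + 2.0           # toy target samples
C = (x - y.T) ** 2                        # squared Euclidean cost matrix
P = sinkhorn(np.full(5, 1 / 5), np.full(6, 1 / 6), C)
print(P.shape, P.sum())                   # (5, 6) coupling whose mass sums to ~1
```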
187 | 
188 | **Content introduction**
189 | 
190 | https://blog.csdn.net/qq_41076797/article/details/118878524
191 | 
192 | **Paper address**
193 | 
194 | https://arxiv.org/abs/1711.02283
195 | 
196 | ## JCPOT
197 | 
198 | **title**
199 | 
200 | Optimal Transport for Multi-source Domain Adaptation under Target
201 | Shift
202 | 
203 | **Times**
204 | 
205 | 2019
206 | 
207 | **Authors**
208 | 
209 | Ievgen Redko, Nicolas Courty, Rémi Flamary, Devis Tuia
210 | 
211 | **Abstract**
212 | 
213 | In this paper, we tackle the problem of reducing discrepancies between multiple
214 | domains, i.e. multi-source domain adaptation, and consider it under the target
215 | shift assumption: in all domains we aim to solve a classification problem with
216 | the same output classes, but with different label proportions. This problem,
217 | generally ignored in the vast majority of domain adaptation papers, is
218 | nevertheless critical in real-world applications, and we theoretically show its
219 | impact on the success of the adaptation. Our proposed method is based on optimal
220 | transport, a theory that has been successfully used to tackle adaptation
221 | problems in machine learning. The introduced approach, Joint Class Proportion
222 | and Optimal Transport (JCPOT), performs multi-source adaptation and target shift
223 | correction simultaneously by learning the class probabilities of the unlabeled
224 | target sample and the coupling that aligns two (or more) probability
225 | distributions. Experiments on both synthetic and real-world data (a satellite
226 | image pixel classification task) show the superiority of the proposed method
227 | over the state-of-the-art.
228 | 
229 | **Content introduction**
230 | 
231 | https://blog.csdn.net/qq_41076797/article/details/117151400
232 | 
233 | **Paper address**
234 | 
235 | http://proceedings.mlr.press/v89/redko19a/redko19a.pdf
236 | 
237 | ## JDOT
238 | 
239 | **title**
240 | 
241 | Joint distribution optimal transportation for domain adaptation
242 | 
243 | **Times**
244 | 
245 | 2017
246 | 
247 | **Authors**
248 | 
249 | Nicolas Courty, Rémi Flamary, Amaury Habrard, Alain Rakotomamonjy
250 | 
251 | **Abstract**
252 | 
253 | This paper deals with the unsupervised domain adaptation problem, where one
254 | wants to estimate a prediction function f in a given target domain without any
255 | labeled sample by exploiting the knowledge available from a source domain where
256 | labels are known. Our work makes the following assumption: there exists a
257 | non-linear transformation between the joint feature/label space distributions of
258 | the two domains P_s and P_t. We propose a solution to this problem with optimal
259 | transport, which allows us to recover an estimated target distribution P^f_t = (X, f(X)) by
260 | simultaneously optimizing the optimal coupling and f. We show that our method
261 | corresponds to the minimization of a bound on the target error, and provide an
262 | efficient algorithmic solution, for which convergence is proved. The versatility
263 | of our approach, in terms of both the hypothesis class and the loss function, is
264 | demonstrated with real-world classification and regression problems, for which
265 | we reach or surpass state-of-the-art results.
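
**Code sketch (illustrative)**

To make the "joint feature/label" coupling cost concrete, here is a hedged toy sketch of a JDOT-style transport cost that mixes a feature distance with a label loss between source labels and current target predictions. The weight `alpha`, the dummy classifier `f_t`, and the cross-entropy choice are assumptions for illustration; the resulting cost matrix would be fed to any OT solver (such as the Sinkhorn sketch above) to obtain the coupling.

```
# Toy JDOT-style cost: alpha * feature distance + label loss (illustration only).
import numpy as np

def jdot_cost(Xs, ys, Xt, f_t, alpha=0.1, num_classes=3):
    D = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(-1)   # (ns, nt) feature distances
    P = f_t(Xt)                                            # (nt, C) target predictions
    Y = np.eye(num_classes)[ys]                            # (ns, C) one-hot source labels
    L = -Y @ np.log(P + 1e-12).T                           # (ns, nt) cross-entropy terms
    return alpha * D + L

rng = np.random.default_rng(0)
Xs, ys = rng.normal(size=(8, 4)), rng.integers(0, 3, size=8)
Xt = rng.normal(size=(6, 4))
f_t = lambda X: np.full((len(X), 3), 1 / 3)                # placeholder uniform classifier
C = jdot_cost(Xs, ys, Xt, f_t)
print(C.shape)                                             # (8, 6) cost matrix for an OT solver
```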
266 | 267 | **Content introduction** 268 | 269 | https://blog.csdn.net/qq_41076797/article/details/116608774 270 | 271 | **Paper address** 272 | 273 | https://proceedings.neurips.cc/paper/2017/file/0070d23b06b1486a538c0eaa45dd167a-Paper.pdf 274 | 275 | ## Deep-JDOT 276 | 277 | **title** 278 | 279 | DeepJDOT: Deep Joint Distribution Optimal Transport for Unsupervised Domain Adaptation 280 | 281 | **Times** 282 | 283 | 2018 284 | 285 | **Authors** 286 | 287 | Bharath Bhushan Damodaran, Benjamin Kellenberger, Remi Flamary, Devis Tuia, Nicolas Courty 288 | 289 | **Abstract** 290 | 291 | In computer vision, one is often confronted with problems of domain shifts, 292 | which occur when one applies a classifier trained on a source dataset to target 293 | data sharing similar characteristics (e.g. same classes), but also different 294 | latent data structures (e.g. different acquisition conditions). In such a 295 | situation, the model will perform poorly on the new data, since the classifier 296 | is specialized to recognize visual cues specific to the source domain. In this 297 | work we explore a solution, named DeepJDOT, to tackle this problem: through a 298 | measure of discrepancy on joint deep representations/labels based on optimal 299 | transport, we not only learn new data representations aligned between the source 300 | and target domain, but also simultaneously preserve the discriminative 301 | information used by the classifier. We applied DeepJDOT to a series of visual 302 | recognition tasks, where it compares favorably against state-of-the-art deep 303 | domain adaptation methods. 304 | 305 | **Content introduction** 306 | 307 | https://blog.csdn.net/qq_41076797/article/details/116698770 308 | 309 | **Paper address** 310 | 311 | https://arxiv.org/abs/1803.10081 312 | 313 | ## DCWD 314 | 315 | **title** 316 | 317 | Domain-attention Conditional Wasserstein Distance 318 | for Multi-source Domain Adaptation 319 | 320 | **Times** 321 | 322 | 2020 323 | 324 | **Authors** 325 | 326 | HANRUI WU 、YUGUANG YAN 、 MICHAEL K. NG 、QINGYAO WU 327 | 328 | **Abstract** 329 | 330 | Multi-source domain adaptation has received considerable attention due to its 331 | effectiveness of leveraging the knowledge from multiple related sources with 332 | different distributions to enhance the learning performance. One of the 333 | fundamental challenges in multi-source domain adaptation is how to determine the 334 | amount of knowledge transferred from each source domain to the target domain. To 335 | address this issue, we propose a new algorithm, called Domain-attention 336 | Conditional Wasserstein Distance (DCWD), to learn transferred weights for 337 | evaluating the relatedness across the source and target domains. In DCWD, we 338 | design a new conditional Wasserstein distance objective function by taking the 339 | label information into consideration to measure the distance between a given 340 | source domain and the target domain. We also develop an attention scheme to 341 | compute the transferred weights of different source domains based on their 342 | conditional Wasserstein distances to the target domain. After that, the 343 | transferred weights can be used to reweight the source data to determine their 344 | importance in knowledge transfer. We conduct comprehensive experiments on 345 | several real-world data sets, and the results demonstrate the effectiveness and 346 | efficiency of the proposed method. 
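
**Code sketch (illustrative)**

The attention scheme described above amounts to turning per-source discrepancies into normalized transfer weights. The snippet below is only a hedged illustration of that weighting step with placeholder distance values; it is not DCWD's conditional Wasserstein distance computation.

```
# Softmax-style attention over per-source discrepancies (placeholder values).
import numpy as np

dists = np.array([0.8, 0.3, 1.5])                 # discrepancy of each source domain to the target
weights = np.exp(-dists) / np.exp(-dists).sum()   # attention weights over the source domains
print(weights, weights.sum())                     # closer sources get larger weights; weights sum to 1
```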
347 | 
348 | **Content introduction**
349 | 
350 | https://blog.csdn.net/qq_41076797/article/details/118358520
351 | 
352 | **Paper address**
353 | 
354 | https://dl.acm.org/doi/10.1145/3391229
355 | 
356 | ## WDGRL
357 | 
358 | **title**
359 | 
360 | Wasserstein Distance Guided Representation Learning
361 | for Domain Adaptation
362 | 
363 | **Times**
364 | 
365 | 2018
366 | 
367 | **Authors**
368 | 
369 | Jian Shen, Yanru Qu, Weinan Zhang, Yong Yu
370 | 
371 | **Abstract**
372 | 
373 | Domain adaptation aims at generalizing a high-performance learner on a target
374 | domain via utilizing the knowledge distilled from a source domain which has a
375 | different but related data distribution. One solution to domain adaptation is to
376 | learn domain invariant feature representations while the learned representations
377 | should also be discriminative in prediction. To learn such representations,
378 | domain adaptation frameworks usually include a domain invariant representation
379 | learning approach to measure and reduce the domain discrepancy, as well as a
380 | discriminator for classification. Inspired by Wasserstein GAN, in this paper we
381 | propose a novel approach to learn domain invariant feature representations,
382 | namely Wasserstein Distance Guided Representation Learning (WDGRL). WDGRL
383 | utilizes a neural network, denoted by the domain critic, to estimate empirical
384 | Wasserstein distance between the source and target samples and optimizes the
385 | feature extractor network to minimize the estimated Wasserstein distance in an
386 | adversarial manner. The theoretical advantages of Wasserstein distance for
387 | domain adaptation lie in its gradient property and promising generalization
388 | bound. Empirical studies on common sentiment and image classification adaptation
389 | datasets demonstrate that our proposed WDGRL outperforms the state-of-the-art
390 | domain invariant representation learning approaches.
391 | 
392 | **Content introduction**
393 | 
394 | https://blog.csdn.net/qq_41076797/article/details/116942752
395 | 
396 | **Paper address**
397 | 
398 | https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17155
399 | 
400 | ## DDC
401 | 
402 | **title**
403 | 
404 | Deep Domain Confusion: Maximizing for Domain Invariance
405 | 
406 | **Times**
407 | 
408 | 2014
409 | 
410 | **Authors**
411 | 
412 | Eric Tzeng, Judy Hoffman, Ning Zhang, Kate Saenko, Trevor Darrell
413 | 
414 | **Abstract**
415 | 
416 | Recent reports suggest that a generic supervised deep CNN model trained on a
417 | large-scale dataset reduces, but does not remove, dataset bias on a standard
418 | benchmark. Fine-tuning deep models in a new domain can require a significant
419 | amount of data, which for many applications is simply not available. We propose
420 | a new CNN architecture which introduces an adaptation layer and an additional
421 | domain confusion loss, to learn a representation that is both semantically
422 | meaningful and domain invariant. We additionally show that a domain confusion
423 | metric can be used for model selection to determine the dimension of an
424 | adaptation layer and the best position for the layer in the CNN architecture.
425 | Our proposed adaptation method offers empirical performance which exceeds
426 | previously published results on a standard benchmark visual domain adaptation
427 | task.
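
**Code sketch (illustrative)**

DDC's training objective combines the usual classification loss with a domain-confusion (MMD) term computed on an adaptation layer. The sketch below uses a linear-kernel MMD and a placeholder classification loss purely to show the shape of that combined objective; DDC itself uses a kernel MMD and chooses the layer's size and position via the confusion metric, so treat this as a simplified stand-in.

```
# Classification loss + lambda * MMD on adaptation-layer features (simplified stand-in).
import torch

def mmd_linear(fs, ft):
    # squared distance between the mean embeddings of the two feature batches
    return ((fs.mean(dim=0) - ft.mean(dim=0)) ** 2).sum()

source_feats = torch.randn(32, 256)           # adaptation-layer activations, source batch
target_feats = torch.randn(32, 256) + 0.5     # adaptation-layer activations, target batch
classification_loss = torch.tensor(0.7)       # placeholder for the cross-entropy term
total_loss = classification_loss + 0.25 * mmd_linear(source_feats, target_feats)  # 0.25 is arbitrary
print(total_loss.item())
```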
428 | 
429 | **Content introduction**
430 | 
431 | https://blog.csdn.net/qq_41076797/article/details/119698726
432 | 
433 | **Paper address**
434 | 
435 | https://arxiv.org/abs/1412.3474
436 | 
437 | ## DAN
438 | 
439 | **title**
440 | 
441 | Learning Transferable Features with Deep Adaptation Networks
442 | 
443 | **Times**
444 | 
445 | 2015
446 | 
447 | **Authors**
448 | 
449 | Mingsheng Long, Yue Cao, Jianmin Wang, Michael I. Jordan
450 | 
451 | **Abstract**
452 | 
453 | Recent studies reveal that a deep neural network can learn transferable features
454 | which generalize well to novel tasks for domain adaptation. However, as deep
455 | features eventually transition from general to specific along the network, the
456 | feature transferability drops significantly in higher layers with increasing
457 | domain discrepancy. Hence, it is important to formally reduce the dataset bias
458 | and enhance the transferability in task-specific layers. In this paper, we
459 | propose a new Deep Adaptation Network (DAN) architecture, which generalizes deep
460 | convolutional neural networks to the domain adaptation scenario. In DAN, hidden
461 | representations of all task-specific layers are embedded in a reproducing kernel
462 | Hilbert space where the mean embeddings of different domain distributions can be
463 | explicitly matched. The domain discrepancy is further reduced using an optimal
464 | multi-kernel selection method for mean embedding matching. DAN can learn
465 | transferable features with statistical guarantees, and can scale linearly via an
466 | unbiased estimate of the kernel embedding. Extensive empirical evidence shows that
467 | the proposed architecture yields state-of-the-art image classification error
468 | rates on standard domain adaptation benchmarks.
469 | 
470 | **Content introduction**
471 | 
472 | https://blog.csdn.net/qq_41076797/article/details/119829512
473 | 
474 | **Paper address**
475 | 
476 | http://proceedings.mlr.press/v37/long15.html
477 | 
478 | ## JAN
479 | 
480 | **title**
481 | 
482 | Deep Transfer Learning with Joint Adaptation Networks
483 | 
484 | **Times**
485 | 
486 | 2017
487 | 
488 | **Authors**
489 | 
490 | Mingsheng Long, Han Zhu, Jianmin Wang, Michael I. Jordan
491 | 
492 | **Abstract**
493 | 
494 | Deep networks have been successfully applied to learn transferable features for
495 | adapting models from a source domain to a different target domain. In this
496 | paper, we present joint adaptation networks (JAN), which learn a transfer
497 | network by aligning the joint distributions of multiple domain-specific layers
498 | across domains based on a joint maximum mean discrepancy (JMMD) criterion.
499 | An adversarial training strategy is adopted to maximize JMMD such that the
500 | distributions of the source and target domains are made more distinguishable.
501 | Learning can be performed by stochastic gradient descent with the gradients
502 | computed by back-propagation in linear time. Experiments testify that our model
503 | yields state-of-the-art results on standard datasets.
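
**Code sketch (illustrative)**

Both DAN and JAN minimize an MMD-type statistic between domain features (multi-kernel MMD in DAN, joint MMD over several layers in JAN). As background, here is a hedged sketch of a multi-bandwidth RBF MMD estimate between two feature batches; the bandwidths and batch sizes are arbitrary choices, and this is not the papers' multi-kernel selection or JMMD implementation.

```
# Multi-bandwidth RBF MMD between source and target feature batches (background sketch).
import torch

def rbf_mmd(x, y, sigmas=(1.0, 2.0, 4.0)):
    z = torch.cat([x, y], dim=0)
    d2 = torch.cdist(z, z) ** 2                               # pairwise squared distances
    k = sum(torch.exp(-d2 / (2 * s ** 2)) for s in sigmas) / len(sigmas)
    n = x.size(0)
    kxx, kyy, kxy = k[:n, :n], k[n:, n:], k[:n, n:]
    return kxx.mean() + kyy.mean() - 2 * kxy.mean()

print(rbf_mmd(torch.randn(16, 64), torch.randn(16, 64) + 1.0).item())
```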
504 | 
505 | **Content introduction**
506 | 
507 | https://blog.csdn.net/qq_41076797/article/details/119850543
508 | 
509 | **Paper address**
510 | 
511 | http://proceedings.mlr.press/v70/long17a.html
512 | 
513 | ## MCD
514 | 
515 | **title**
516 | 
517 | Maximum Classifier Discrepancy for Unsupervised Domain Adaptation
518 | 
519 | **Times**
520 | 
521 | 2018
522 | 
523 | **Authors**
524 | 
525 | Kuniaki Saito, Kohei Watanabe, Yoshitaka Ushiku, and Tatsuya Harada
526 | 
527 | **Abstract**
528 | 
529 | In this work, we present a method for unsupervised domain adaptation. Many
530 | adversarial learning methods train domain classifier networks to distinguish the
531 | features as either a source or target and train a feature generator network to
532 | mimic the discriminator. Two problems exist with these methods. First, the
533 | domain classifier only tries to distinguish the features as a source or target
534 | and thus does not consider task-specific decision boundaries between classes.
535 | Therefore, a trained generator can generate ambiguous features near class
536 | boundaries. Second, these methods aim to completely match the feature
537 | distributions between different domains, which is difficult because of each
538 | domain’s characteristics. To solve these problems, we introduce a new approach
539 | that attempts to align distributions of source and target by utilizing the
540 | task-specific decision boundaries. We propose to maximize the discrepancy
541 | between two classifiers’ outputs to detect target samples that are far from the
542 | support of the source. A feature generator learns to generate target features
543 | near the support to minimize the discrepancy. Our method outperforms other
544 | methods on several datasets of image classification and semantic segmentation.
545 | 
546 | **Content introduction**
547 | 
548 | https://blog.csdn.net/qq_41076797/article/details/119991815
549 | 
550 | **Paper address**
551 | 
552 | https://openaccess.thecvf.com/content_cvpr_2018/html/Saito_Maximum_Classifier_Discrepancy_CVPR_2018_paper.html
553 | 
554 | ## SWD
555 | 
556 | **title**
557 | 
558 | Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation
559 | 
560 | **Times**
561 | 
562 | 2019
563 | 
564 | **Authors**
565 | 
566 | Chen-Yu Lee, Tanmay Batra, Mohammad Haris Baig, Daniel Ulbricht
567 | 
568 | **Abstract**
569 | 
570 | In this work, we connect two distinct concepts for unsupervised domain
571 | adaptation: feature distribution alignment between domains by utilizing the
572 | task-specific decision boundary [57] and the Wasserstein metric [72]. Our
573 | proposed sliced Wasserstein discrepancy (SWD) is designed to capture the natural
574 | notion of dissimilarity between the outputs of task-specific classifiers. It
575 | provides geometrically meaningful guidance to detect target samples that are
576 | far from the support of the source and enables efficient distribution alignment
577 | in an end-to-end trainable fashion. In the experiments, we validate the
578 | effectiveness and genericness of our method on digit and sign recognition, image
579 | classification, semantic segmentation, and object detection.
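
**Code sketch (illustrative)**

SWD measures the disagreement between the two task-specific classifiers with a sliced 1-D Wasserstein distance: project their outputs onto random directions, sort each projection, and compare. The sketch below is a hedged toy version; the number of projections and the squared difference are arbitrary choices, not the paper's exact formulation.

```
# Sliced Wasserstein discrepancy between two classifiers' softmax outputs (toy sketch).
import torch

def swd(p1, p2, n_proj=128):
    theta = torch.randn(p1.size(1), n_proj)
    theta = theta / theta.norm(dim=0, keepdim=True)      # random unit projection directions
    proj1 = (p1 @ theta).sort(dim=0).values              # sorted 1-D projections
    proj2 = (p2 @ theta).sort(dim=0).values
    return ((proj1 - proj2) ** 2).mean()

p1 = torch.softmax(torch.randn(32, 10), dim=1)           # outputs of classifier 1
p2 = torch.softmax(torch.randn(32, 10), dim=1)           # outputs of classifier 2
print(swd(p1, p2).item())
```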
580 | 
581 | **Content introduction**
582 | 
583 | https://blog.csdn.net/qq_41076797/article/details/119979243
584 | 
585 | **Paper address**
586 | 
587 | https://openaccess.thecvf.com/content_CVPR_2019/html/Lee_Sliced_Wasserstein_Discrepancy_for_Unsupervised_Domain_Adaptation_CVPR_2019_paper.html
588 | 
589 | ## JPOT
590 | 
591 | **title**
592 | 
593 | Joint Partial Optimal Transport for Open Set Domain Adaptation
594 | 
595 | **Times**
596 | 
597 | 2020
598 | 
599 | **Authors**
600 | 
601 | Renjun Xu, Pelen Liu, Yin Zhang, Fang Cai, Jindong Wang, Shuoying Liang, Heting
602 | 
603 | **Abstract**
604 | 
605 | Domain adaptation (DA) has achieved resounding success in learning a good
606 | classifier by leveraging labeled data from a source domain to adapt to an
607 | unlabeled target domain. However, in a general setting when the target domain
608 | contains classes that are never observed in the source domain, namely in Open
609 | Set Domain Adaptation (OSDA), existing DA methods fail to work because of the
610 | interference of the extra unknown classes. This is a much more challenging
611 | problem, since it can easily result in negative transfer due to the mismatch
612 | between the unknown and known classes. Existing methods are susceptible to
613 | misclassification when unknown target-domain samples in the feature space are
614 | distributed near the decision boundary learned from the labeled source domain.
615 | To overcome this, we propose Joint Partial Optimal Transport (JPOT), fully
616 | utilizing information of not only the labeled source domain but also the
617 | discriminative representation of the unknown class in the target domain. The
618 | proposed joint discriminative prototypical compactness loss can not only achieve
619 | intra-class compactness and inter-class separability, but also estimate the mean
620 | and variance of the unknown class through backpropagation, which remains
621 | intractable for previous methods due to the blindness about the structure of the
622 | unknown classes. To the best of our knowledge, this is the first optimal transport
623 | model for OSDA. Extensive experiments demonstrate that our proposed model can
624 | significantly boost the performance of open set domain adaptation on standard DA
625 | datasets.
626 | 
627 | **Content introduction**
628 | 
629 | https://blog.csdn.net/qq_41076797/article/details/120133702
630 | 
631 | **Paper address**
632 | 
633 | https://www.ijcai.org/proceedings/2020/352
634 | 
635 | ## NW
636 | 
637 | **title**
638 | 
639 | Normalized Wasserstein for Mixture Distributions with Applications in
640 | Adversarial Learning and Domain Adaptation
641 | 
642 | **Times**
643 | 
644 | 2019
645 | 
646 | **Authors**
647 | 
648 | Yogesh Balaji, Rama Chellappa, Soheil Feizi
649 | 
650 | **Abstract**
651 | 
652 | Understanding proper distance measures between distributions is at the core of
653 | several learning tasks such as generative models, domain adaptation, clustering,
654 | etc. In this work, we focus on mixture distributions that arise naturally in
655 | several application domains where the data contains different sub-populations.
656 | For mixture distributions, established distance measures such as the Wasserstein
657 | distance do not take into account imbalanced mixture proportions. Thus, even if
658 | two mixture distributions have identical mixture components but different
659 | mixture proportions, the Wasserstein distance between them will be large. This
660 | often leads to undesired results in distance-based learning methods for mixture
661 | distributions. In this paper, we resolve this issue by introducing the
662 | Normalized Wasserstein measure. The key idea is to introduce mixture proportions
663 | as optimization variables, effectively normalizing mixture proportions in the
664 | Wasserstein formulation. Using the proposed normalized Wasserstein measure leads
665 | to significant performance gains for mixture distributions with imbalanced
666 | mixture proportions compared to the vanilla Wasserstein distance. We demonstrate
667 | the effectiveness of the proposed measure in GANs, domain adaptation and
668 | adversarial clustering in several benchmark datasets.
669 | 
670 | **Content introduction**
671 | 
672 | https://blog.csdn.net/qq_41076797/article/details/120086168
673 | 
674 | **Paper address**
675 | 
676 | https://arxiv.org/abs/1902.00415
677 | 
678 | ## WDAN
679 | 
680 | **title**
681 | 
682 | Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised
683 | Domain Adaptation
684 | 
685 | **Times**
686 | 
687 | 2017
688 | 
689 | **Authors**
690 | 
691 | Hongliang Yan, Yukang Ding, Peihua Li, Qilong Wang, Yong Xu, Wangmeng Zuo
692 | 
693 | **Abstract**
694 | 
695 | In domain adaptation, maximum mean discrepancy (MMD) has been widely adopted as
696 | a discrepancy metric between the distributions of source and target domains.
697 | However, existing MMD-based domain adaptation methods generally ignore the
698 | changes of class prior distributions, i.e., class weight bias across domains.
699 | This remains an open problem but ubiquitous for domain adaptation, which can be
700 | caused by changes in sample selection criteria and application scenarios. We
701 | show that MMD cannot account for class weight bias and results in degraded
702 | domain adaptation performance. To address this issue, a weighted MMD model is
703 | proposed in this paper. Specifically, we introduce class-specific auxiliary
704 | weights into the original MMD for exploiting the class prior probability on the
705 | source and target domains; the challenge lies in the fact that the class labels in the
706 | target domain are unavailable. To account for this, our proposed weighted MMD model
707 | is defined by introducing an auxiliary weight for each class in the source
708 | domain, and a classification EM algorithm is suggested by alternating between
709 | assigning the pseudo-labels, estimating auxiliary weights and updating model
710 | parameters. Extensive experiments demonstrate the superiority of our weighted
711 | MMD over conventional MMD for domain adaptation.
712 | 
713 | **Content introduction**
714 | 
715 | https://blog.csdn.net/qq_41076797/article/details/120054974
716 | 
717 | **Paper address**
718 | 
719 | https://arxiv.org/abs/1705.00609
720 | 
721 | ## MCDA
722 | 
723 | **title**
724 | 
725 | Deep multi-Wasserstein unsupervised domain adaptation
726 | 
727 | **Times**
728 | 
729 | 2019
730 | 
731 | **Authors**
732 | 
733 | Tien-Nam Le, Amaury Habrard, Marc Sebban
734 | 
735 | **Abstract**
736 | 
737 | In unsupervised domain adaptation (DA), one aims at learning from labeled source
738 | data and fully unlabeled target examples a model with a low error on the target
739 | domain. In this setting, standard generalization bounds prompt us to minimize
740 | the sum of three terms: (a) the source true risk, (b) the divergence between
741 | the source and target domains, and (c) the combined error of the ideal joint
742 | hypothesis over the two domains.
Many DA methods – especially those
743 | using deep neural networks – have focused on the first two terms by using
744 | different divergence measures to align the source and target distributions on a
745 | shared latent feature space, while ignoring the third term, assuming it is
746 | negligible to perform the adaptation. However, it has been shown that purely
747 | aligning the two distributions while minimizing the source error may lead to
748 | so-called negative transfer. In this paper, we address this issue with a new
749 | deep unsupervised DA method – called MCDA – minimizing the first two terms while
750 | controlling the third one. MCDA benefits from highly-confident target samples
751 | (using softmax predictions) to minimize class-wise Wasserstein distances and
752 | efficiently approximate the ideal joint hypothesis. Empirical results show that
753 | our approach outperforms state-of-the-art methods.
754 | 
755 | **Content introduction**
756 | 
757 | https://blog.csdn.net/qq_41076797/article/details/120110987
758 | 
759 | **Paper address**
760 | 
761 | https://linkinghub.elsevier.com/retrieve/pii/S0167865519301400
762 | 
763 | ## ADDA
764 | 
765 | **title**
766 | 
767 | Adversarial Discriminative Domain Adaptation
768 | 
769 | **Times**
770 | 
771 | 2017
772 | 
773 | **Authors**
774 | 
775 | Eric Tzeng, Judy Hoffman, Kate Saenko, Trevor Darrell
776 | 
777 | **Abstract**
778 | 
779 | Adversarial learning methods are a promising approach to training robust deep
780 | networks, and can generate complex samples across diverse domains. They can also
781 | improve recognition despite the presence of domain shift or dataset bias: recent
782 | adversarial approaches to unsupervised domain adaptation reduce the difference
783 | between the training and test domain distributions and thus improve
784 | generalization performance. However, while generative adversarial networks
785 | (GANs) show compelling visualizations, they are not optimal on discriminative
786 | tasks and can be limited to smaller shifts. On the other hand, discriminative
787 | approaches can handle larger domain shifts, but impose tied weights on the model
788 | and do not exploit a GAN-based loss. In this work, we first outline a novel
789 | generalized framework for adversarial adaptation, which subsumes recent
790 | state-of-the-art approaches as special cases, and use this generalized view to
791 | better relate prior approaches. We then propose a previously unexplored instance
792 | of our general framework which combines discriminative modeling, untied weight
793 | sharing, and a GAN loss, which we call Adversarial Discriminative Domain
794 | Adaptation (ADDA). We show that ADDA is more effective yet considerably simpler
795 | than competing domain-adversarial methods, and demonstrate the promise of our
796 | approach by exceeding state-of-the-art unsupervised adaptation results on
797 | standard domain adaptation tasks as well as a difficult cross-modality object
798 | classification task.
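
**Code sketch (illustrative)**

ADDA first trains a source encoder and classifier, then learns a separate (untied) target encoder by playing a GAN game against a domain discriminator. The sketch below shows one adversarial update of that second stage with toy linear modules; the architectures, optimizers and learning rates are placeholder assumptions, not the authors' implementation.

```
# One ADDA-style adversarial step with untied encoders (toy modules, illustration only).
import torch
import torch.nn as nn

src_enc = nn.Sequential(nn.Linear(100, 64), nn.ReLU())   # frozen after source pre-training
tgt_enc = nn.Sequential(nn.Linear(100, 64), nn.ReLU())   # in practice initialized from src_enc
disc = nn.Linear(64, 1)                                  # domain discriminator
bce = nn.BCEWithLogitsLoss()
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(tgt_enc.parameters(), lr=1e-4)

xs, xt = torch.randn(32, 100), torch.randn(32, 100)      # source / target batches
# 1) discriminator step: label source features 1, target features 0
d_loss = bce(disc(src_enc(xs).detach()), torch.ones(32, 1)) + \
         bce(disc(tgt_enc(xt).detach()), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()
# 2) target-encoder step: fool the discriminator with inverted labels (GAN loss)
g_loss = bce(disc(tgt_enc(xt)), torch.ones(32, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
print(d_loss.item(), g_loss.item())
```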
799 | 800 | **Content introduction** 801 | 802 | https://blog.csdn.net/qq_41076797/article/details/120273707 803 | 804 | **Paper address** 805 | 806 | https://openreview.net/forum?id=B1Vjl1Stl 807 | 808 | ## CoGAN 809 | 810 | **title** 811 | 812 | Coupled generative adversarial networks 813 | 814 | **Times** 815 | 816 | 2016 817 | 818 | **Authors** 819 | 820 | Ming-Yu Liu , Oncel Tuzel 821 | 822 | **Abstract** 823 | 824 | We propose coupled generative adversarial network (CoGAN) for learning a joint 825 | distribution of multi-domain images. In contrast to the existing approaches, 826 | which require tuples of corresponding images in different domains in the 827 | training set, CoGAN can learn a joint distribution without any tuple of 828 | corresponding images. It can learn a joint distribution with just samples drawn 829 | from the marginal distributions. This is achieved by enforcing a weight-sharing 830 | constraint that limits the network capacity and favors a joint distribution 831 | solution over a product of marginal distributions one. We apply CoGAN to several 832 | joint distribution learning tasks, including learning a joint distribution of 833 | color and depth images, and learning a joint distribution of face images with 834 | different attributes. For each task it successfully learns the joint 835 | distribution without any tuple of corresponding images. We also demonstrate its 836 | applications to domain adaptation and image transformation. 837 | 838 | **Content introduction** 839 | 840 | https://blog.csdn.net/qq_41076797/article/details/120347149 841 | 842 | **Paper address** 843 | 844 | https://proceedings.neurips.cc/paper/2016/hash/502e4a16930e414107ee22b6198c578f-Abstract.html 845 | 846 | ## CDAN 847 | 848 | **title** 849 | 850 | Conditional Adversarial Domain Adaptation 851 | 852 | **Times** 853 | 854 | 2018 855 | 856 | **Authors** 857 | 858 | Mingsheng Long, Zhangjie Cao, Jianmin Wang, and Michael I. Jordan 859 | 860 | **Abstract** 861 | 862 | Adversarial learning has been embedded into deep networks to learn disentangled 863 | and transferable representations for domain adaptation. Existing adversarial 864 | domain adaptation methods may not effectively align different domains of 865 | multimodal distributions native in classification problems. In this paper, we 866 | present conditional adversarial domain adaptation, a principled framework that 867 | conditions the adversarial adaptation models on discriminative information 868 | conveyed in the classifier predictions. Conditional domain adversarial networks 869 | (CDANs) are designed with two novel conditioning strategies: multilinear 870 | conditioning that captures the crosscovariance between feature representations 871 | and classifier predictions to improve the discriminability, and entropy 872 | conditioning that controls the uncertainty of classifier predictions to 873 | guarantee the transferability. With theoretical guarantees and a few lines of 874 | codes, the approach has exceeded state-of-the-art results on five datasets. 
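
**Code sketch (illustrative)**

CDAN's multilinear conditioning feeds the domain discriminator not the features alone but the flattened outer product of features and classifier predictions, so the domain alignment is conditioned on the predicted class. The sketch below shows just that tensor operation with toy sizes; the discriminator head and dimensions are placeholder assumptions.

```
# Multilinear conditioning: discriminator input = outer product of features and predictions.
import torch
import torch.nn as nn

feat = torch.randn(16, 32)                                # feature extractor outputs
pred = torch.softmax(torch.randn(16, 10), dim=1)          # classifier predictions
cond = torch.bmm(pred.unsqueeze(2), feat.unsqueeze(1)).flatten(1)   # (16, 10 * 32)
disc = nn.Linear(10 * 32, 1)                              # domain discriminator head
print(disc(cond).shape)                                   # (16, 1) domain logits
```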
875 | 876 | **Content introduction** 877 | 878 | https://blog.csdn.net/qq_41076797/article/details/120622652 879 | 880 | **Paper address** 881 | 882 | https://proceedings.neurips.cc/paper/2018/hash/ab88b15733f543179858600245108dd8-Abstract.html 883 | 884 | ## M3SDA 885 | 886 | **title** 887 | 888 | Moment Matching for Multi-Source Domain Adaptation 889 | 890 | **Times** 891 | 892 | 2019 893 | 894 | **Authors** 895 | 896 | Xingchao Peng, Qinxun Bai, Xide Xia, Zijun Huang, Kate Saenko, Bo Wang 897 | 898 | **Abstract** 899 | 900 | Conventional unsupervised domain adaptation (UDA) assumes that training data are 901 | sampled from a single domain. This neglects the more practical scenario where 902 | training data are collected from multiple sources, requiring multi-source domain 903 | adaptation. We make three major contributions towards addressing this problem. 904 | First, we collect and annotate by far the largest UDA dataset, called DomainNet, 905 | which contains six domains and about 0.6 million images distributed among 345 906 | categories, addressing the gap in data availability for multi-source UDA 907 | research. Second, we propose a new deep learning approach, Moment Matching for 908 | Multi-Source Domain Adaptation (M3SDA), which aims to transfer knowledge learned 909 | from multiple labeled source domains to an unlabeled target domain by 910 | dynamically aligning moments of their feature distributions. Third, we provide 911 | new theoretical insights specifically for moment matching approaches in both 912 | single and multiple source domain adaptation. Extensive experiments are 913 | conducted to demonstrate the power of our new dataset in benchmarking 914 | state-of-the-art multi-source domain adaptation methods, as well as the 915 | advantage of our proposed model. Dataset and Code are available at 916 | http://ai.bu.edu/M3SDA/ 917 | 918 | **Content introduction** 919 | 920 | https://blog.csdn.net/qq_41076797/article/details/120819629 921 | 922 | **Paper address** 923 | 924 | https://arxiv.org/abs/1812.01754 925 | 926 | ## CMSS 927 | 928 | **title** 929 | 930 | Curriculum manager for source selection in multi- source domain adaptation 931 | 932 | **Times** 933 | 934 | 2020 935 | 936 | **Authors** 937 | 938 | Luyu Yang, Yogesh Balaji, Ser-Nam Lim, Abhinav Shrivastava 939 | 940 | **Abstract** 941 | 942 | The performance of Multi-Source Unsupervised Domain Adaptation depends significantly on the effectiveness of transfer from labeled source domain samples. In this paper, we proposed an adversarial agent that learns a dynamic curriculum for source samples, called Curriculum Manager for Source Selection (CMSS). The Curriculum Manager, an independent network module, constantly updates the curriculum during training, and iteratively learns which domains or samples are best suited for aligning to the target. The intuition behind this is to force the Curriculum Manager to constantly re-measure the transferability of latent domains over time to adversarially raise the error rate of the domain discriminator. CMSS does not require any knowledge of the domain labels, yet it outperforms other methods on four well-known benchmarks by significant margins. We also provide interpretable results that shed light on the proposed method. 
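
**Code sketch (illustrative)**

The core mechanism above is an independent network that scores source samples and uses those scores to re-weight a per-sample adversarial loss, so that the most transferable samples dominate training at any given time. The snippet below only illustrates that weighting idea with placeholder losses; it is not CMSS's full adversarial curriculum.

```
# Curriculum-style re-weighting of a per-sample loss by an auxiliary scorer (illustration only).
import torch
import torch.nn as nn

feats = torch.randn(32, 64)                        # source-sample features
per_sample_loss = torch.rand(32)                   # placeholder per-sample (e.g. discriminator) loss
scorer = nn.Linear(64, 1)                          # curriculum-manager scoring head
w = torch.softmax(scorer(feats).squeeze(1), dim=0) # weights over the batch, sum to 1
weighted_loss = (w * per_sample_loss).sum()
print(weighted_loss.item())
```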
943 | 
944 | **Content introduction**
945 | 
946 | https://blog.csdn.net/qq_41076797/article/details/120877511
947 | 
948 | **Paper address**
949 | 
950 | https://arxiv.org/abs/2007.01261
951 | 
952 | ## LtC-MSDA
953 | 
954 | **title**
955 | 
956 | Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation
957 | 
958 | **Times**
959 | 
960 | 2020
961 | 
962 | **Authors**
963 | 
964 | Hang Wang, Minghao Xu, Bingbing Ni, and Wenjun Zhang
965 | 
966 | **Abstract**
967 | 
968 | Transferring knowledge learned from multiple source domains to a target domain is a more practical and challenging task than conventional single-source domain adaptation. Furthermore, the increase of modalities brings more difficulty in aligning feature distributions among multiple domains. To mitigate these problems, we propose a Learning to Combine for Multi-Source Domain Adaptation (LtC-MSDA) framework via exploring interactions among domains. In a nutshell, a knowledge graph is constructed on the prototypes of various domains to realize the information propagation among semantically adjacent representations. On this basis, a graph model is learned to predict query samples under the guidance of correlated prototypes. In addition, we design a Relation Alignment Loss (RAL) to facilitate the consistency of categories’ relational interdependency and the compactness of features, which boosts features’ intra-class invariance and inter-class separability. Comprehensive results on public benchmark datasets demonstrate that our approach outperforms existing methods with a remarkable margin. Our code is available at https://github.com/ChrisAllenMing/LtC-MSDA.
969 | 
970 | **Content introduction**
971 | 
972 | https://blog.csdn.net/qq_41076797/article/details/120978951
973 | 
974 | **Paper address**
975 | 
976 | https://arxiv.org/abs/2007.08801
977 | 
978 | ## Dirt-T
979 | 
980 | **title**
981 | 
982 | A DIRT-T approach to unsupervised domain adaptation
983 | 
984 | **Times**
985 | 
986 | 2018
987 | 
988 | **Authors**
989 | 
990 | Rui Shu, Hung H. Bui, Hirokazu Narui, & Stefano Ermon
991 | 
992 | **Abstract**
993 | 
994 | Domain adaptation refers to the problem of leveraging labeled data in a source domain to learn an accurate model in a target domain where labels are scarce or unavailable. A recent approach for finding a common representation of the two domains is via domain adversarial training (Ganin & Lempitsky, 2015), which attempts to induce a feature extractor that matches the source and target feature distributions in some feature space. However, domain adversarial training faces two critical limitations: 1) if the feature extraction function has high capacity, then feature distribution matching is a weak constraint, 2) in non-conservative domain adaptation (where no single classifier can perform well in both the source and target domains), training the model to do well on the source domain hurts performance on the target domain. In this paper, we address these issues through the lens of the cluster assumption, i.e., decision boundaries should not cross high-density data regions. We propose two novel and related models: 1) the Virtual Adversarial Domain Adaptation (VADA) model, which combines domain adversarial training with a penalty term that punishes violation of the cluster assumption; 2) the Decision-boundary Iterative Refinement Training with a Teacher (DIRT-T) model, which takes the VADA model as initialization and employs natural gradient steps to further minimize the cluster assumption violation.
Extensive empirical results demonstrate that the combination of these two models significantly improves the state-of-the-art performance on the digit, traffic sign, and Wi-Fi recognition domain adaptation benchmarks.
995 | 
996 | **Content introduction**
997 | 
998 | https://blog.csdn.net/qq_41076797/article/details/121226438
999 | 
1000 | **Paper address**
1001 | 
1002 | https://openreview.net/pdf?id=H1q-TM-AW
1003 | 
--------------------------------------------------------------------------------
/images/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CtrlZ1/Domain-Adaptation-Algorithms/cf8dcc5f5d2116005205ad76efdf5e6445013df3/images/logo.png
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | torch>=0.4.0
2 | torchvision
3 | matplotlib
4 | numpy
5 | scipy
6 | pillow
7 | urllib3
8 | scikit-image
9 | tqdm
--------------------------------------------------------------------------------