├── .gitignore
├── LICENSE
├── README.md
├── images
│   └── logo.png
└── requirements.txt
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 | .idea
6 | # C extensions
7 | *.so
8 | */images
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | pip-wheel-metadata/
24 | share/python-wheels/
25 | *.egg-info/
26 | .installed.cfg
27 | *.egg
28 | MANIFEST
29 |
30 | # PyInstaller
31 | # Usually these files are written by a python script from a template
32 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
33 | *.manifest
34 | *.spec
35 |
36 | # Installer logs
37 | pip-log.txt
38 | pip-delete-this-directory.txt
39 |
40 | # Unit test / coverage reports
41 | htmlcov/
42 | .tox/
43 | .nox/
44 | .coverage
45 | .coverage.*
46 | .cache
47 | nosetests.xml
48 | coverage.xml
49 | *.cover
50 | *.py,cover
51 | .hypothesis/
52 | .pytest_cache/
53 |
54 | # Translations
55 | *.mo
56 | *.pot
57 |
58 | # Django stuff:
59 | *.log
60 | local_settings.py
61 | db.sqlite3
62 | db.sqlite3-journal
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 |
74 | # PyBuilder
75 | target/
76 |
77 | # Jupyter Notebook
78 | .ipynb_checkpoints
79 |
80 | # IPython
81 | profile_default/
82 | ipython_config.py
83 |
84 | # pyenv
85 | .python-version
86 |
87 | # pipenv
88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
91 | # install all needed dependencies.
92 | #Pipfile.lock
93 |
94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
95 | __pypackages__/
96 |
97 | # Celery stuff
98 | celerybeat-schedule
99 | celerybeat.pid
100 |
101 | # SageMath parsed files
102 | *.sage.py
103 |
104 | # Environments
105 | .env
106 | .venv
107 | env/
108 | venv/
109 | ENV/
110 | env.bak/
111 | venv.bak/
112 |
113 | # Spyder project settings
114 | .spyderproject
115 | .spyproject
116 |
117 | # Rope project settings
118 | .ropeproject
119 |
120 | # mkdocs documentation
121 | /site
122 |
123 | # mypy
124 | .mypy_cache/
125 | .dmypy.json
126 | dmypy.json
127 |
128 | # Pyre type checker
129 | .pyre/
130 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2021 CtrlZ1
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 | # Domain-Adaptation-Algorithms
6 |
7 | Welcome to my workspace. I'm Yiyang Li. Over the next three years (2021-2024), I will use this repository to record what I study about Domain Adaptation during my graduate studies, including literature introductions and code implementations. I look forward to exchanging ideas with scholars and experts in this field, and I welcome your comments.
8 |
9 | **PS.** This code base is organized by **model**, which makes it convenient to study a single model in isolation. If you want to skip the routine boilerplate (data loading, etc.), you can visit the following link: https://github.com/CtrlZ1/Domain-Adaptive-CodeBase. It presents the various domain adaptation implementations as complete projects.
10 |
11 | **About the code**: For various reasons it is not convenient to publish the code on GitHub. If you need it, contact me on QQ to collaborate: 1005461421
12 |
13 | # Contents
14 |
15 | - [Domain-Adaptation-Algorithms](#Domain-Adaptation-Algorithms)
16 | - [Contents](#contents)
17 | - [Installation](#installation)
18 | - [Implementations](#implementations)
19 | - [GAN](#gan)
20 | - [WGAN](#wgan)
21 | - [WGAN-GP](#wgan-gp)
22 | - [LargeScaleOT](#LargeScaleOT)
23 | - [JCPOT](#JCPOT)
24 | - [JDOT](#JDOT)
25 | - [Deep-JDOT](#Deep-JDOT)
26 | - [DCWD](#DCWD)
27 | - [DAN](#DAN)
28 | - [WDGRL](#WDGRL)
29 | - [DDC](#DDC)
30 | - [JAN](#JAN)
31 | - [MCD](#MCD)
32 | - [SWD](#SWD)
33 | - [JPOT](#JPOT)
34 | - [NW](#NW)
35 | - [WDAN](#WDAN)
- [MCDA](#MCDA)
36 | - [ADDA](#ADDA)
37 | - [CoGAN](#CoGAN)
38 | - [CDAN](#CDAN)
39 | - [M3SDA](#M3SDA)
40 | - [CMSS](#CMSS)
41 | - [LtC-MSDA](#LtC-MSDA)
42 | - [Dirt-T](#Dirt-T)
43 |
44 | # Installation
45 |
46 | ```
47 | $ cd yourfolder
48 | $ git clone https://github.com/CtrlZ1/Domain-Adaptation-Algorithms.git
49 | ```
50 |
51 | # Implementations
52 |
53 | ## GAN
54 |
55 | **title**
56 |
57 | Generative Adversarial Nets
58 |
59 | **Times**
60 |
61 | 2014 NIPS
62 |
63 | **Authors**
64 |
65 | Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley,
66 | Sherjil Ozair, Aaron Courville, Yoshua Bengio
67 |
68 | **Abstract**
69 |
70 | We propose a new framework for estimating generative models via an adversarial
71 | process, in which we simultaneously train two models: a generative model G that
72 | captures the data distribution, and a discriminative model D that estimates the
73 | probability that a sample came from the training data rather than G. The
74 | training procedure for G is to maximize the probability of D making a mistake.
75 | This framework corresponds to a minimax two-player game. In the space of
76 | arbitrary functions G and D, a unique solution exists, with G recovering the
77 | training data distribution and D equal to 1/2 everywhere. In the case where G and
78 | D are defined by multilayer perceptrons, the entire system can be trained with
79 | backpropagation. There is no need for any Markov chains or unrolled approximate
80 | inference networks during either training or generation of samples. Experiments
81 | demonstrate the potential of the framework through qualitative and quantitative
82 | evaluation of the generated samples.
83 |
84 | **Content introduction**
85 |
86 | https://blog.csdn.net/qq_41076797/article/details/118483802
87 |
88 | **Paper address**
89 |
90 | https://arxiv.org/abs/1406.2661
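
**Code sketch (unofficial)**

To make the minimax game concrete, here is a minimal PyTorch training loop on toy 2-D Gaussian data. It is not the paper's code; the network sizes, learning rates, and the non-saturating generator loss are illustrative assumptions.

```python
# Minimal GAN sketch on toy 2-D data (illustrative; not the original implementation).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))                 # generator
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, 2) * 0.5 + 3.0          # samples from the "data" distribution
    fake = G(torch.randn(64, 8))

    # discriminator step: maximize log D(x) + log(1 - D(G(z)))
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # generator step (non-saturating form): maximize log D(G(z))
    g_loss = bce(D(G(torch.randn(64, 8))), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```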
91 |
92 | ## WGAN
93 |
94 | **title**
95 |
96 | Wasserstein GAN
97 |
98 | **Times**
99 |
100 | 2017
101 |
102 | **Authors**
103 |
104 | Martin Arjovsky, Soumith Chintala, Léon Bottou
105 |
106 | **Abstract**
107 |
108 | We introduce a new algorithm named WGAN, an alternative to traditional GAN training. In this new model, we show that we can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches. Furthermore, we show that the corresponding optimization problem is sound, and provide extensive theoretical work highlighting the deep connections to other distances between distributions.
109 |
110 | **Content introduction**
111 |
112 | https://blog.csdn.net/qq_41076797/article/details/116898649
113 |
114 | **Paper address**
115 |
116 | https://arxiv.org/abs/1701.07875
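
**Code sketch (unofficial)**

A minimal sketch of the training changes WGAN makes relative to a standard GAN: the critic outputs an unbounded score, is updated several times per generator step, and its weights are clipped to enforce the Lipschitz constraint. Architectures and constants below are assumptions for illustration.

```python
# Minimal WGAN sketch (illustrative; not the original implementation).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
C = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))   # critic: no sigmoid
opt_g = torch.optim.RMSprop(G.parameters(), lr=5e-5)
opt_c = torch.optim.RMSprop(C.parameters(), lr=5e-5)
clip, n_critic = 0.01, 5

for step in range(1000):
    for _ in range(n_critic):
        real = torch.randn(64, 2) * 0.5 + 3.0
        fake = G(torch.randn(64, 8)).detach()
        c_loss = -(C(real).mean() - C(fake).mean())   # maximize the Wasserstein-1 estimate
        opt_c.zero_grad(); c_loss.backward(); opt_c.step()
        for p in C.parameters():                      # weight clipping (Lipschitz constraint)
            p.data.clamp_(-clip, clip)
    g_loss = -C(G(torch.randn(64, 8))).mean()         # generator minimizes -E[C(G(z))]
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```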
117 |
118 | ## WGAN-GP
119 |
120 | **title**
121 |
122 | Improved Training of Wasserstein GANs
123 |
124 | **Times**
125 |
126 | 2017
127 |
128 | **Authors**
129 |
130 | Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron
131 | Courville
132 |
133 | **Abstract**
134 |
135 | Generative Adversarial Networks (GANs) are powerful generative models, but
136 | suffer from training instability. The recently proposed Wasserstein GAN (WGAN)
137 | makes progress toward stable training of GANs, but sometimes can still generate
138 | only poor samples or fail to converge. We find that these problems are often due
139 | to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the
140 | critic, which can lead to undesired behavior. We propose an alternative to
141 | clipping weights: penalize the norm of gradient of the critic with respect to
142 | its input. Our proposed method performs better than standard WGAN and enables
143 | stable training of a wide variety of GAN architectures with almost no
144 | hyperparameter tuning, including 101-layer ResNets and language models with
145 | continuous generators. We also achieve high quality generations on CIFAR-10 and
146 | LSUN bedrooms.
147 |
148 | **Content introduction**
149 |
150 | https://blog.csdn.net/qq_41076797/article/details/118458028
151 |
152 | **Paper address**
153 |
154 | https://arxiv.org/abs/1704.00028
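
**Code sketch (unofficial)**

A minimal sketch of the gradient penalty that replaces weight clipping: the critic's gradient norm is pushed toward 1 at random interpolates between real and fake samples. `C` is any critic module mapping samples to a scalar score; the penalty weight 10 follows the paper, the rest is illustrative.

```python
# Gradient-penalty sketch for a WGAN critic (illustrative).
import torch

def gradient_penalty(C, real, fake, lambda_gp=10.0):
    eps = torch.rand(real.size(0), 1, device=real.device)         # per-sample mixing weight
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)  # random interpolates
    grads, = torch.autograd.grad(C(x_hat).sum(), x_hat, create_graph=True)
    return lambda_gp * ((grads.norm(2, dim=1) - 1) ** 2).mean()

# critic loss:  -(C(real).mean() - C(fake).mean()) + gradient_penalty(C, real, fake)
```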
155 |
156 |
157 |
158 | ## LargeScaleOT
159 |
160 | **title**
161 |
162 | Large scale optimal transport and mapping estimation
163 |
164 | **Times**
165 |
166 | 2018
167 |
168 | **Authors**
169 |
170 | Vivien Seguy, Bharath Bhushan Damodaran, Rémi Flamary, Nicolas Courty, Antoine Rolet, Mathieu Blondel
171 |
172 | **Abstract**
173 |
174 | This paper presents a novel two-step approach for the fundamental problem of
175 | learning an optimal map from one distribution to another. First, we learn an
176 | optimal transport (OT) plan, which can be thought as a one-to-many map between
177 | the two distributions. To that end, we propose a stochastic dual approach of
178 | regularized OT, and show empirically that it scales better than a recent related
179 | approach when the amount of samples is very large. Second, we estimate a Monge
180 | map as a deep neural network learned by approximating the barycentric projection
181 | of the previously-obtained OT plan. This parameterization allows generalization
182 | of the mapping outside the support of the input measure. We prove two
183 | theoretical stability results of regularized OT which show that our estimations
184 | converge to the OT plan and Monge map between the underlying continuous
185 | measures. We showcase our proposed approach on two applications: domain
186 | adaptation and generative modeling.
187 |
188 | **Content introduction**
189 |
190 | https://blog.csdn.net/qq_41076797/article/details/118878524
191 |
192 | **Paper address**
193 |
194 | https://arxiv.org/abs/1711.02283
195 |
196 | ## JCPOT
197 |
198 | **title**
199 |
200 | Optimal Transport for Multi-source Domain Adaptation under Target
201 | Shift
202 |
203 | **Times**
204 |
205 | 2019
206 |
207 | **Authors**
208 |
209 | Ievgen Redko, Nicolas Courty, Rémi Flamary, Devis Tuia
210 |
211 | **Abstract**
212 |
213 | In this paper, we tackle the problem of reducing discrepancies between multiple
214 | domains, i.e. multi-source domain adaptation, and consider it under the target
215 | shift assumption: in all domains we aim to solve a classification problem with
216 | the same output classes, but with different labels proportions. This problem,
217 | generally ignored in the vast majority of domain adaptation papers, is
218 | nevertheless critical in real-world applications, and we theoretically show its
219 | impact on the success of the adaptation. Our proposed method is based on optimal
220 | transport, a theory that has been successfully used to tackle adaptation
221 | problems in machine learning. The introduced approach, Joint Class Proportion
222 | and Optimal Transport (JCPOT), performs multi-source adaptation and target shift
223 | correction simultaneously by learning the class probabilities of the unlabeled
224 | target sample and the coupling allowing to align two (or more) probability
225 | distributions. Experiments on both synthetic and real-world data (satellite
226 | image pixel classification) task show the superiority of the proposed method
227 | over the state-of-the-art.
228 |
229 | **Content introduction**
230 |
231 | https://blog.csdn.net/qq_41076797/article/details/117151400
232 |
233 | **Paper address**
234 |
235 | http://proceedings.mlr.press/v89/redko19a/redko19a.pdf
236 |
237 | ## JDOT
238 |
239 | **title**
240 |
241 | Joint distribution optimal transportation for domain adaptation
242 |
243 | **Times**
244 |
245 | 2017
246 |
247 | **Authors**
248 |
249 | Nicolas Courty, Rémi Flamary, Amaury Habrard, Alain Rakotomamonjy
250 |
251 | **Abstract**
252 |
253 | This paper deals with the unsupervised domain adaptation problem, where one
254 | wants to estimate a prediction function f in a given target domain without any
255 | labeled sample by exploiting the knowledge available from a source domain where
256 | labels are known. Our work makes the following assumption: there exists a
257 | non-linear transformation between the joint feature/label space distributions of
258 | the two domain Ps and Pt. We propose a solution of this problem with optimal
259 | transport, that allows to recover an estimated target P^f_t= (X, f(X)) by
260 | optimizing simultaneously the optimal coupling and f. We show that our method
261 | corresponds to the minimization of a bound on the target error, and provide an
262 | efficient algorithmic solution, for which convergence is proved. The versatility
263 | of our approach, both in terms of class of hypothesis or loss functions is
264 | demonstrated with real world classification and regression problems, for which
265 | we reach or surpass state-of-the-art results.
266 |
267 | **Content introduction**
268 |
269 | https://blog.csdn.net/qq_41076797/article/details/116608774
270 |
271 | **Paper address**
272 |
273 | https://proceedings.neurips.cc/paper/2017/file/0070d23b06b1486a538c0eaa45dd167a-Paper.pdf
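
**Code sketch (unofficial)**

A minimal sketch of one JDOT coupling step using the POT library: the ground cost mixes a feature distance with a label/prediction loss, and the coupling is solved with exact OT; alternating this with re-fitting f gives the full algorithm. The toy data, the weight `alpha`, and the dummy classifier `f` are assumptions.

```python
# One JDOT coupling step with POT (pip install pot); illustrative only.
import numpy as np
import ot

def jdot_coupling(Xs, ys_onehot, Xt, f, alpha=0.1):
    C_feat = ot.dist(Xs, Xt)                               # squared Euclidean feature cost
    pred_t = f(Xt)                                         # current target predictions (n_t, K)
    C_lab = ((ys_onehot[:, None, :] - pred_t[None, :, :]) ** 2).sum(-1)
    C = alpha * C_feat + C_lab                             # joint feature/label ground cost
    a = np.full(len(Xs), 1.0 / len(Xs))
    b = np.full(len(Xt), 1.0 / len(Xt))
    return ot.emd(a, b, C)                                 # optimal coupling gamma

Xs, Xt = np.random.randn(30, 5), np.random.randn(40, 5) + 1.0
ys = np.eye(3)[np.random.randint(0, 3, 30)]
gamma = jdot_coupling(Xs, ys, Xt, f=lambda X: np.full((len(X), 3), 1.0 / 3))
```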
274 |
275 | ## Deep-JDOT
276 |
277 | **title**
278 |
279 | DeepJDOT: Deep Joint Distribution Optimal Transport for Unsupervised Domain Adaptation
280 |
281 | **Times**
282 |
283 | 2018
284 |
285 | **Authors**
286 |
287 | Bharath Bhushan Damodaran, Benjamin Kellenberger, Remi Flamary, Devis Tuia, Nicolas Courty
288 |
289 | **Abstract**
290 |
291 | In computer vision, one is often confronted with problems of domain shifts,
292 | which occur when one applies a classifier trained on a source dataset to target
293 | data sharing similar characteristics (e.g. same classes), but also different
294 | latent data structures (e.g. different acquisition conditions). In such a
295 | situation, the model will perform poorly on the new data, since the classifier
296 | is specialized to recognize visual cues specific to the source domain. In this
297 | work we explore a solution, named DeepJDOT, to tackle this problem: through a
298 | measure of discrepancy on joint deep representations/labels based on optimal
299 | transport, we not only learn new data representations aligned between the source
300 | and target domain, but also simultaneously preserve the discriminative
301 | information used by the classifier. We applied DeepJDOT to a series of visual
302 | recognition tasks, where it compares favorably against state-of-the-art deep
303 | domain adaptation methods.
304 |
305 | **Content introduction**
306 |
307 | https://blog.csdn.net/qq_41076797/article/details/116698770
308 |
309 | **Paper address**
310 |
311 | https://arxiv.org/abs/1803.10081
312 |
313 | ## DCWD
314 |
315 | **title**
316 |
317 | Domain-attention Conditional Wasserstein Distance
318 | for Multi-source Domain Adaptation
319 |
320 | **Times**
321 |
322 | 2020
323 |
324 | **Authors**
325 |
326 | Hanrui Wu, Yuguang Yan, Michael K. Ng, Qingyao Wu
327 |
328 | **Abstract**
329 |
330 | Multi-source domain adaptation has received considerable attention due to its
331 | effectiveness of leveraging the knowledge from multiple related sources with
332 | different distributions to enhance the learning performance. One of the
333 | fundamental challenges in multi-source domain adaptation is how to determine the
334 | amount of knowledge transferred from each source domain to the target domain. To
335 | address this issue, we propose a new algorithm, called Domain-attention
336 | Conditional Wasserstein Distance (DCWD), to learn transferred weights for
337 | evaluating the relatedness across the source and target domains. In DCWD, we
338 | design a new conditional Wasserstein distance objective function by taking the
339 | label information into consideration to measure the distance between a given
340 | source domain and the target domain. We also develop an attention scheme to
341 | compute the transferred weights of different source domains based on their
342 | conditional Wasserstein distances to the target domain. After that, the
343 | transferred weights can be used to reweight the source data to determine their
344 | importance in knowledge transfer. We conduct comprehensive experiments on
345 | several real-world data sets, and the results demonstrate the effectiveness and
346 | efficiency of the proposed method.
347 |
348 | **Content introduction**
349 |
350 | https://blog.csdn.net/qq_41076797/article/details/118358520
351 |
352 | **Paper address**
353 |
354 | https://dl.acm.org/doi/10.1145/3391229
355 |
356 | ## WDGRL
357 |
358 | **title**
359 |
360 | Wasserstein Distance Guided Representation Learning
361 | for Domain Adaptation
362 |
363 | **Times**
364 |
365 | 2018
366 |
367 | **Authors**
368 |
369 | Jian Shen, Yanru Qu, Weinan Zhang, Yong Yu
370 |
371 | **Abstract**
372 |
373 | Domain adaptation aims at generalizing a high-performance learner on a target
374 | domain via utilizing the knowledge distilled from a source domain which has a
375 | different but related data distribution. One solution to domain adaptation is to
376 | learn domain invariant feature representations while the learned representations
377 | should also be discriminative in prediction. To learn such representations,
378 | domain adaptation frameworks usually include a domain invariant representation
379 | learning approach to measure and reduce the domain discrepancy, as well as a
380 | discriminator for classification. Inspired by Wasserstein GAN, in this paper we
381 | propose a novel approach to learn domain invariant feature representations,
382 | namely Wasserstein Distance Guided Representation Learning (WDGRL). WDGRL
383 | utilizes a neural network, denoted by the domain critic, to estimate empirical
384 | Wasserstein distance between the source and target samples and optimizes the
385 | feature extractor network to minimize the estimated Wasserstein distance in an
386 | adversarial manner. The theoretical advantages of Wasserstein distance for
387 | domain adaptation lie in its gradient property and promising generalization
388 | bound. Empirical studies on common sentiment and image classification adaptation
389 | datasets demonstrate that our proposed WDGRL outperforms the state-of-the-art
390 | domain invariant representation learning approaches.
391 |
392 | **Content introduction**
393 |
394 | https://blog.csdn.net/qq_41076797/article/details/116942752
395 |
396 | **Paper address**
397 |
398 | https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17155
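
**Code sketch (unofficial)**

A minimal sketch of the adversarial feature alignment: a domain critic maximizes the difference between its mean scores on source and target features (the empirical Wasserstein estimate), while the feature extractor minimizes it. The networks and toy data are assumptions, and the paper's gradient penalty and task loss are omitted for brevity.

```python
# WDGRL-style feature alignment sketch (illustrative).
import torch
import torch.nn as nn

feat = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 32))   # feature extractor
critic = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))  # domain critic
opt_f = torch.optim.Adam(feat.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-4)

xs, xt = torch.randn(64, 10), torch.randn(64, 10) + 0.5   # toy source / target batches

for step in range(200):
    # critic step: maximize E[critic(h_s)] - E[critic(h_t)]
    hs, ht = feat(xs).detach(), feat(xt).detach()
    wd = critic(hs).mean() - critic(ht).mean()
    opt_c.zero_grad(); (-wd).backward(); opt_c.step()

    # feature-extractor step: minimize the same estimate (plus a task loss in practice)
    wd = critic(feat(xs)).mean() - critic(feat(xt)).mean()
    opt_f.zero_grad(); wd.backward(); opt_f.step()
```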
399 |
400 | ## DDC
401 |
402 | **title**
403 |
404 | Deep Domain Confusion: Maximizing for Domain Invariance
405 |
406 | **Times**
407 |
408 | 2014
409 |
410 | **Authors**
411 |
412 | Eric Tzeng, Judy Hoffman, Ning Zhang, Kate Saenko, Trevor Darrell
413 |
414 | **Abstract**
415 |
416 | Recent reports suggest that a generic supervised deep CNN model trained on a
417 | large-scale dataset reduces, but does not remove, dataset bias on a standard
418 | benchmark. Fine-tuning deep models in a new domain can require a significant
419 | amount of data, which for many applications is simply not available. We propose
420 | a new CNN architecture which introduces an adaptation layer and an additional
421 | domain confusion loss, to learn a representation that is both semantically
422 | meaningful and domain invariant. We additionally show that a domain confusion
423 | metric can be used for model selection to determine the dimension of an
424 | adaptation layer and the best position for the layer in the CNN architecture.
425 | Our proposed adaptation method offers empirical performance which exceeds
426 | previously published results on a standard benchmark visual domain adaptation
427 | task.
428 |
429 | **Content introduction**
430 |
431 | https://blog.csdn.net/qq_41076797/article/details/119698726
432 |
433 | **Paper address**
434 |
435 | https://arxiv.org/abs/1412.3474
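
**Code sketch (unofficial)**

A minimal sketch of the domain confusion term: the (linear-kernel) MMD between source and target activations of the adaptation layer, added to the classification loss with a trade-off weight. The function name and the usage line are assumptions.

```python
# Linear-kernel MMD sketch for a DDC-style adaptation layer (illustrative).
import torch

def mmd_linear(hs: torch.Tensor, ht: torch.Tensor) -> torch.Tensor:
    """Squared MMD with a linear kernel: ||mean(hs) - mean(ht)||^2."""
    delta = hs.mean(dim=0) - ht.mean(dim=0)
    return (delta * delta).sum()

# total_loss = cls_loss + lam * mmd_linear(h_source, h_target)
```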
436 |
437 | ## DAN
438 |
439 | **title**
440 |
441 | Learning Transferable Features with Deep Adaptation Networks
442 |
443 | **Times**
444 |
445 | 2015
446 |
447 | **Authors**
448 |
449 | Mingsheng Long, Yue Cao, Jianmin Wang, Michael I. Jordan
450 |
451 | **Abstract**
452 |
453 | Recent studies reveal that a deep neural network can learn transferable features
454 | which generalize well to novel tasks for domain adaptation. However, as deep
455 | features eventually transition from general to specific along the network, the
456 | feature transferability drops significantly in higher layers with increasing
457 | domain discrepancy. Hence, it is important to formally reduce the dataset bias
458 | and enhance the transferability in task-specific layers. In this paper, we
459 | propose a new Deep Adaptation Network (DAN) architecture, which generalizes deep
460 | convolutional neural network to the domain adaptation scenario. In DAN, hidden
461 | representations of all task-specific layers are embedded in a reproducing kernel
462 | Hilbert space where the mean embeddings of different domain distributions can be
463 | explicitly matched. The domain discrepancy is further reduced using an optimal
464 | multi-kernel selection method for mean embedding matching. DAN can learn
465 | transferable features with statistical guarantees, and can scale linearly by
466 | unbiased estimate of kernel embedding. Extensive empirical evidence shows that
467 | the proposed architecture yields state-of-the-art image classification error
468 | rates on standard domain adaptation benchmarks.
469 |
470 | **Content introduction**
471 |
472 | https://blog.csdn.net/qq_41076797/article/details/119829512
473 |
474 | **Paper address**
475 |
476 | http://proceedings.mlr.press/v37/long15.html
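
**Code sketch (unofficial)**

A minimal sketch of a multi-kernel Gaussian MMD between source and target activations, applied per task-specific layer. The fixed, equally weighted bandwidths are a simplifying assumption; the paper selects kernel weights by an optimality criterion and uses a linear-time unbiased estimator.

```python
# Multi-kernel (Gaussian) MMD sketch for DAN-style layer adaptation (illustrative).
import torch

def mk_mmd(hs, ht, sigmas=(1.0, 2.0, 4.0, 8.0)):
    x = torch.cat([hs, ht], dim=0)
    d2 = torch.cdist(x, x) ** 2                                   # pairwise squared distances
    k = sum(torch.exp(-d2 / (2 * s ** 2)) for s in sigmas) / len(sigmas)
    n = hs.size(0)
    k_ss, k_tt, k_st = k[:n, :n], k[n:, n:], k[:n, n:]
    return k_ss.mean() + k_tt.mean() - 2 * k_st.mean()

# example: mk_mmd(torch.randn(32, 256), torch.randn(32, 256) + 0.3)
```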
477 |
478 | ## JAN
479 |
480 | **title**
481 |
482 | Deep Transfer Learning with Joint Adaptation Networks
483 |
484 | **Times**
485 |
486 | 2017
487 |
488 | **Authors**
489 |
490 | Mingsheng Long, Han Zhu, Jianmin Wang, Michael I. Jordan
491 |
492 | **Abstract**
493 |
494 | Deep networks have been successfully applied to learn transferable features for
495 | adapting models from a source domain to a different target domain. In this
496 | paper, we present joint adaptation networks (JAN), which learn a transfer
497 | network by aligning the joint distributions of multiple domain-specific layers
498 | across domains based on a joint maximum mean discrepancy (JMMD) criterion.
499 | Adversarial training strategy is adopted to maximize JMMD such that the
500 | distributions of the source and target domains are made more distinguishable.
501 | Learning can be performed by stochastic gradient descent with the gradients
502 | computed by back-propagation in linear-time. Experiments testify that our model
503 | yields state of the art results on standard datasets.
504 |
505 | **Content introduction**
506 |
507 | https://blog.csdn.net/qq_41076797/article/details/119850543
508 |
509 | **Paper address**
510 |
511 | http://proceedings.mlr.press/v70/long17a.html
512 |
513 | ## MCD
514 |
515 | **title**
516 |
517 | Maximum Classifier Discrepancy for Unsupervised Domain Adaptation
518 |
519 | **Times**
520 |
521 | 2018
522 |
523 | **Authors**
524 |
525 | Kuniaki Saito, Kohei Watanabe, Yoshitaka Ushiku, and Tatsuya Harada
526 |
527 | **Abstract**
528 |
529 | In this work, we present a method for unsupervised domain adaptation. Many
530 | adversarial learning methods train domain classifier networks to distinguish the
531 | features as either a source or target and train a feature generator network to
532 | mimic the discriminator. Two problems exist with these methods. First, the
533 | domain classifier only tries to distinguish the features as a source or target
534 | and thus does not consider task-specific decision boundaries between classes.
535 | Therefore, a trained generator can generate ambiguous features near class
536 | boundaries. Second, these methods aim to completely match the feature
537 | distributions between different domains, which is difficult because of each
538 | domain’s characteristics. To solve these problems, we introduce a new approach
539 | that attempts to align distributions of source and target by utilizing the
540 | task-specific decision boundaries. We propose to maximize the discrepancy
541 | between two classifiers’ outputs to detect target samples that are far from the
542 | support of the source. A feature generator learns to generate target features
543 | near the support to minimize the discrepancy. Our method outperforms other
544 | methods on several datasets of image classification and semantic segmentation.
545 |
546 | **Content introduction**
547 |
548 | https://blog.csdn.net/qq_41076797/article/details/119991815
549 |
550 | **Paper address**
551 |
552 | https://openaccess.thecvf.com/content_cvpr_2018/html/Saito_Maximum_Classifier_Discrepancy_CVPR_2018_paper.html
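
**Code sketch (unofficial)**

A minimal sketch of the discrepancy term: the mean absolute difference between the two classifiers' softmax outputs on target samples. In the paper it is first maximized with respect to the classifiers and then minimized with respect to the feature generator; `G`, `F1`, `F2` in the comments are placeholder networks.

```python
# Classifier-discrepancy sketch for MCD-style training (illustrative).
import torch
import torch.nn.functional as F

def classifier_discrepancy(logits1: torch.Tensor, logits2: torch.Tensor) -> torch.Tensor:
    return (F.softmax(logits1, dim=1) - F.softmax(logits2, dim=1)).abs().mean()

# step B (update F1, F2): loss = source_cls_loss - classifier_discrepancy(F1(G(xt)), F2(G(xt)))
# step C (update G only): loss = classifier_discrepancy(F1(G(xt)), F2(G(xt)))
```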
553 |
554 | ## SWD
555 |
556 | **title**
557 |
558 | Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation
559 |
560 | **Times**
561 |
562 | 2019
563 |
564 | **Authors**
565 |
566 | Chen-Yu Lee, Tanmay Batra, Mohammad Haris Baig, Daniel Ulbricht
567 |
568 | **Abstract**
569 |
570 | In this work, we connect two distinct concepts for unsupervised domain
571 | adaptation: feature distribution alignment between domains by utilizing the
572 | task-specific decision boundary [57] and the Wasserstein metric [72]. Our
573 | proposed sliced Wasserstein discrepancy (SWD) is designed to capture the natural
574 | notion of dissimilarity between the outputs of task-specific classifiers. It
575 | provides a geometrically meaningful guidance to detect target samples that are
576 | far from the support of the source and enables efficient distribution alignment
577 | in an end-to-end trainable fashion. In the experiments, we validate the
578 | effectiveness and genericness of our method on digit and sign recognition, image
579 | classification, semantic segmentation, and object detection.
580 |
581 | **Content introduction**
582 |
583 | https://blog.csdn.net/qq_41076797/article/details/119979243
584 |
585 | **Paper address**
586 |
587 | https://openaccess.thecvf.com/content_CVPR_2019/html/Lee_Sliced_Wasserstein_Discrepancy_for_Unsupervised_Domain_Adaptation_CVPR_2019_paper.html
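
**Code sketch (unofficial)**

A minimal sketch of the sliced Wasserstein discrepancy between two classifiers' output distributions: project the outputs onto random unit directions, sort each 1-D projection, and compare. The number of projections and the squared-difference form are illustrative choices.

```python
# Sliced Wasserstein discrepancy sketch (illustrative).
import torch

def sliced_wasserstein_discrepancy(p1, p2, n_proj=128):
    # p1, p2: (batch, n_classes) softmax outputs of the two classifiers on the same batch
    theta = torch.randn(p1.size(1), n_proj, device=p1.device)
    theta = theta / theta.norm(dim=0, keepdim=True)       # random unit projection directions
    proj1 = torch.sort(p1 @ theta, dim=0)[0]              # sorted 1-D projections
    proj2 = torch.sort(p2 @ theta, dim=0)[0]
    return ((proj1 - proj2) ** 2).mean()
```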
588 |
589 | ## JPOT
590 |
591 | **title**
592 |
593 | Joint Partial Optimal Transport for Open Set Domain Adaptation
594 |
595 | **Times**
596 |
597 | 2020
598 |
599 | **Authors**
600 |
601 | Renjun Xu, Pelen Liu, Yin Zhang, Fang Cai, Jindong Wang, Shuoying Liang, Heting
602 |
603 | **Abstract**
604 |
605 | Domain adaptation (DA) has achieved a resounding success to learn a good
606 | classifier by leveraging labeled data from a source domain to adapt to an
607 | unlabeled target domain. However, in a general setting when the target domain
608 | contains classes that are never observed in the source domain, namely in Open
609 | Set Domain Adaptation (OSDA), existing DA methods failed to work because of the
610 | interference of the extra unknown classes. This is a much more challenging
611 | problem, since it can easily result in negative transfer due to the mismatch
612 | between the unknown and known classes. Existing researches are susceptible to
613 | misclassification when target domain unknown samples in the feature space
614 | distributed near the decision boundary learned from the labeled source domain.
615 | To overcome this, we propose Joint Partial Optimal Transport (JPOT), fully
616 | utilizing information of not only the labeled source domain but also the
617 | discriminative representation of unknown class in the target domain. The
618 | proposed joint discriminative prototypical compactness loss can not only achieve
619 | intra-class compactness and inter-class separability, but also estimate the mean
620 | and variance of the unknown class through backpropagation, which remains
621 | intractable for previous methods due to the blindness about the structure of the
622 | unknown classes. To our best knowledge, this is the first optimal transport
623 | model for OSDA. Extensive experiments demonstrate that our proposed model can
624 | significantly boost the performance of open set domain adaptation on standard DA
625 | datasets.
626 |
627 | **Content introduction**
628 |
629 | https://blog.csdn.net/qq_41076797/article/details/120133702
630 |
631 | **Paper address**
632 |
633 | https://www.ijcai.org/proceedings/2020/352
634 |
635 | ## NW
636 |
637 | **title**
638 |
639 | Normalized Wasserstein for Mixture Distributions with Applications in
640 | Adversarial Learning and Domain Adaptation
641 |
642 | **Times**
643 |
644 | 2019
645 |
646 | **Authors**
647 |
648 | Yogesh Balaji, Rama Chellappa, Soheil Feizi
649 |
650 | **Abstract**
651 |
652 | Understanding proper distance measures between distributions is at the core of
653 | several learning tasks such as generative models, domain adaptation, clustering,
654 | etc. In this work, we focus on mixture distributions that arise naturally in
655 | several application domains where the data contains different sub-populations. For
656 | mixture distributions, established distance measures such as the Wasserstein
657 | distance do not take into account imbalanced mixture proportions. Thus, even if
658 | two mixture distributions have identical mixture components but different
659 | mixture proportions, the Wasserstein distance between them will be large. This
660 | often leads to undesired results in distance-based learning methods for mixture
661 | distributions. In this paper , we resolve this issue by introducing the
662 | Normalized Wasserstein measure. The key idea is to introduce mixture proportions
663 | as optimization variables, effectively normalizing mixture proportions in the
664 | Wasserstein formulation. Using the proposed normalized Wasserstein measure leads
665 | to significant performance gains for mixture distributions with imbalanced
666 | mixture proportions compared to the vanilla Wasserstein distance. We demonstrate
667 | the effectiveness of the proposed measure in GANs, domain adaptation and
668 | adversarial clustering in several benchmark datasets.
669 |
670 | **Content introduction**
671 |
672 | https://blog.csdn.net/qq_41076797/article/details/120086168
673 |
674 | **Paper address**
675 |
676 | https://arxiv.org/abs/1902.00415
677 |
678 | ## WDAN
679 |
680 | **title**
681 |
682 | Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised
683 | Domain Adaptation
684 |
685 | **Times**
686 |
687 | 2017
688 |
689 | **Authors**
690 |
691 | Hongliang Yan, Yukang Ding, Peihua Li, Qilong Wang, Yong Xu, Wangmeng Zuo
692 |
693 | **Abstract**
694 |
695 | In domain adaptation, maximum mean discrepancy (MMD) has been widely adopted as
696 | a discrepancy metric between the distributions of source and target domains.
697 | However, existing MMD-based domain adaptation methods generally ignore the
698 | changes of class prior distributions, i.e., class weight bias across domains.
699 | This remains an open problem but ubiquitous for domain adaptation, which can be
700 | caused by changes in sample selection criteria and application scenarios. We
701 | show that MMD cannot account for class weight bias and results in degraded
702 | domain adaptation performance. To address this issue, a weighted MMD model is
703 | proposed in this paper. Specifically, we introduce class-specific auxiliary
704 | weights into the original MMD for exploiting the class prior probability on
705 | source and target domains, whose challenge lies in the fact that the class label in
706 | target domain is unavailable. To account for it, our proposed weighted MMD model
707 | is defined by introducing an auxiliary weight for each class in the source
708 | domain, and a classification EM algorithm is suggested by alternating between
709 | assigning the pseudo-labels, estimating auxiliary weights and updating model
710 | parameters. Extensive experiments demonstrate the superiority of our weighted
711 | MMD over conventional MMD for domain adaptation.
712 |
713 | **Content introduction**
714 |
715 | https://blog.csdn.net/qq_41076797/article/details/120054974
716 |
717 | **Paper address**
718 |
719 | https://arxiv.org/abs/1705.00609
720 |
721 | ## MCDA
722 |
723 | **title**
724 |
725 | Deep multi-Wasserstein unsupervised domain adaptation
726 |
727 | **Times**
728 |
729 | 2019
730 |
731 | **Authors**
732 |
733 | Tien-Nam Le, Amaury Habrard, Marc Sebban
734 |
735 | **Abstract**
736 |
737 | In unsupervised domain adaptation (DA), one aims at learning from labeled source
738 | data and fully unlabeled target examples a model with a low error on the target
739 | domain. In this setting, standard generalization bounds prompt us to minimize
740 | the sum of three terms: (a) the source true risk, (b) the divergence between
741 | the source and target domains, and (c) the combined error of the ideal joint
742 | hypothesis over the two domains. Many DA methods – especially those
743 | using deep neural networks – have focused on the first two terms by using
744 | different divergence measures to align the source and target distributions on a
745 | shared latent feature space, while ignoring the third term, assuming it is
746 | negligible to perform the adaptation. However, it has been shown that purely
747 | aligning the two distributions while minimizing the source error may lead to
748 | so-called negative transfer. In this paper, we address this issue with a new
749 | deep unsupervised DA method – called MCDA – minimizing the first two terms while
750 | controlling the third one. MCDA benefits from highly-confident target samples
751 | (using softmax predictions) to minimize class-wise Wasserstein distances and
752 | efficiently approximate the ideal joint hypothesis. Empirical results show that
753 | our approach outperforms state of the art methods.
754 |
755 | **Content introduction**
756 |
757 | https://blog.csdn.net/qq_41076797/article/details/120110987
758 |
759 | **Paper address**
760 |
761 | https://linkinghub.elsevier.com/retrieve/pii/S0167865519301400
762 |
763 | ## ADDA
764 |
765 | **title**
766 |
767 | Adversarial Discriminative Domain Adaptation
768 |
769 | **Times**
770 |
771 | 2017
772 |
773 | **Authors**
774 |
775 | Eric Tzeng, Judy Hoffman, Kate Saenko, Trevor Darrell
776 |
777 | **Abstract**
778 |
779 | Adversarial learning methods are a promising approach to training robust deep
780 | networks, and can generate complex samples across diverse domains. They can also
781 | improve recognition despite the presence of domain shift or dataset bias: recent
782 | adversarial approaches to unsupervised domain adaptation reduce the difference
783 | between the training and test domain distributions and thus improve
784 | generalization performance. However, while generative adversarial networks
785 | (GANs) show compelling visualizations, they are not optimal on discriminative
786 | tasks and can be limited to smaller shifts. On the other hand, discriminative
787 | approaches can handle larger domain shifts, but impose tied weights on the model
788 | and do not exploit a GAN-based loss. In this work, we first outline a novel
789 | generalized framework for adversarial adaptation, which subsumes recent
790 | state-of-the-art approaches as special cases, and use this generalized view to
791 | better relate prior approaches. We then propose a previously unexplored instance
792 | of our general framework which combines discriminative modeling, untied weight
793 | sharing, and a GAN loss, which we call Adversarial Discriminative Domain
794 | Adaptation (ADDA). We show that ADDA is more effective yet considerably simpler
795 | than competing domain-adversarial methods, and demonstrate the promise of our
796 | approach by exceeding state-of-the-art unsupervised adaptation results on
797 | standard domain adaptation tasks as well as a difficult cross-modality object
798 | classification task.
799 |
800 | **Content introduction**
801 |
802 | https://blog.csdn.net/qq_41076797/article/details/120273707
803 |
804 | **Paper address**
805 |
806 | https://openreview.net/forum?id=B1Vjl1Stl
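
**Code sketch (unofficial)**

A minimal sketch of the adversarial adaptation stage: with the source encoder frozen, an untied target encoder (initialized from it) is trained GAN-style so that a domain discriminator cannot tell target features from source features. Network shapes, optimizers, and the toy batches are assumptions; the source pre-training stage is omitted.

```python
# ADDA-style adaptation sketch (illustrative).
import copy
import torch
import torch.nn as nn

src_enc = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 32))  # pretrained, frozen
tgt_enc = copy.deepcopy(src_enc)                                          # untied target encoder
disc = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
opt_t = torch.optim.Adam(tgt_enc.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

xs, xt = torch.randn(64, 10), torch.randn(64, 10) + 0.5

for step in range(200):
    # discriminator: source features -> 1, target features -> 0
    d_loss = bce(disc(src_enc(xs).detach()), torch.ones(64, 1)) \
           + bce(disc(tgt_enc(xt).detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # target encoder: fool the discriminator (inverted labels, GAN loss)
    g_loss = bce(disc(tgt_enc(xt)), torch.ones(64, 1))
    opt_t.zero_grad(); g_loss.backward(); opt_t.step()
```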
807 |
808 | ## CoGAN
809 |
810 | **title**
811 |
812 | Coupled generative adversarial networks
813 |
814 | **Times**
815 |
816 | 2016
817 |
818 | **Authors**
819 |
820 | Ming-Yu Liu, Oncel Tuzel
821 |
822 | **Abstract**
823 |
824 | We propose coupled generative adversarial network (CoGAN) for learning a joint
825 | distribution of multi-domain images. In contrast to the existing approaches,
826 | which require tuples of corresponding images in different domains in the
827 | training set, CoGAN can learn a joint distribution without any tuple of
828 | corresponding images. It can learn a joint distribution with just samples drawn
829 | from the marginal distributions. This is achieved by enforcing a weight-sharing
830 | constraint that limits the network capacity and favors a joint distribution
831 | solution over a product of marginal distributions one. We apply CoGAN to several
832 | joint distribution learning tasks, including learning a joint distribution of
833 | color and depth images, and learning a joint distribution of face images with
834 | different attributes. For each task it successfully learns the joint
835 | distribution without any tuple of corresponding images. We also demonstrate its
836 | applications to domain adaptation and image transformation.
837 |
838 | **Content introduction**
839 |
840 | https://blog.csdn.net/qq_41076797/article/details/120347149
841 |
842 | **Paper address**
843 |
844 | https://proceedings.neurips.cc/paper/2016/hash/502e4a16930e414107ee22b6198c578f-Abstract.html
845 |
846 | ## CDAN
847 |
848 | **title**
849 |
850 | Conditional Adversarial Domain Adaptation
851 |
852 | **Times**
853 |
854 | 2018
855 |
856 | **Authors**
857 |
858 | Mingsheng Long, Zhangjie Cao, Jianmin Wang, and Michael I. Jordan
859 |
860 | **Abstract**
861 |
862 | Adversarial learning has been embedded into deep networks to learn disentangled
863 | and transferable representations for domain adaptation. Existing adversarial
864 | domain adaptation methods may not effectively align different domains of
865 | multimodal distributions native in classification problems. In this paper, we
866 | present conditional adversarial domain adaptation, a principled framework that
867 | conditions the adversarial adaptation models on discriminative information
868 | conveyed in the classifier predictions. Conditional domain adversarial networks
869 | (CDANs) are designed with two novel conditioning strategies: multilinear
870 | conditioning that captures the cross-covariance between feature representations
871 | and classifier predictions to improve the discriminability, and entropy
872 | conditioning that controls the uncertainty of classifier predictions to
873 | guarantee the transferability. With theoretical guarantees and a few lines of
874 | codes, the approach has exceeded state-of-the-art results on five datasets.
875 |
876 | **Content introduction**
877 |
878 | https://blog.csdn.net/qq_41076797/article/details/120622652
879 |
880 | **Paper address**
881 |
882 | https://proceedings.neurips.cc/paper/2018/hash/ab88b15733f543179858600245108dd8-Abstract.html
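
**Code sketch (unofficial)**

A minimal sketch of multilinear conditioning: the domain discriminator sees the flattened outer product of features and classifier softmax predictions, so alignment is conditioned on the predicted class. The sizes are illustrative; the paper additionally proposes a randomized multilinear map for high dimensions and entropy conditioning.

```python
# Multilinear conditioning sketch for a CDAN-style discriminator (illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

feat_dim, n_classes = 32, 10
disc = nn.Sequential(nn.Linear(feat_dim * n_classes, 64), nn.ReLU(), nn.Linear(64, 1))

def multilinear_map(features, logits):
    g = F.softmax(logits, dim=1)                       # classifier predictions (batch, K)
    outer = features.unsqueeze(2) * g.unsqueeze(1)     # f(x) (x) g(x): (batch, feat_dim, K)
    return outer.flatten(1)                            # (batch, feat_dim * K)

# domain_logit = disc(multilinear_map(features, classifier_logits))
```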
883 |
884 | ## M3SDA
885 |
886 | **title**
887 |
888 | Moment Matching for Multi-Source Domain Adaptation
889 |
890 | **Times**
891 |
892 | 2019
893 |
894 | **Authors**
895 |
896 | Xingchao Peng, Qinxun Bai, Xide Xia, Zijun Huang, Kate Saenko, Bo Wang
897 |
898 | **Abstract**
899 |
900 | Conventional unsupervised domain adaptation (UDA) assumes that training data are
901 | sampled from a single domain. This neglects the more practical scenario where
902 | training data are collected from multiple sources, requiring multi-source domain
903 | adaptation. We make three major contributions towards addressing this problem.
904 | First, we collect and annotate by far the largest UDA dataset, called DomainNet,
905 | which contains six domains and about 0.6 million images distributed among 345
906 | categories, addressing the gap in data availability for multi-source UDA
907 | research. Second, we propose a new deep learning approach, Moment Matching for
908 | Multi-Source Domain Adaptation (M3SDA), which aims to transfer knowledge learned
909 | from multiple labeled source domains to an unlabeled target domain by
910 | dynamically aligning moments of their feature distributions. Third, we provide
911 | new theoretical insights specifically for moment matching approaches in both
912 | single and multiple source domain adaptation. Extensive experiments are
913 | conducted to demonstrate the power of our new dataset in benchmarking
914 | state-of-the-art multi-source domain adaptation methods, as well as the
915 | advantage of our proposed model. Dataset and Code are available at
916 | http://ai.bu.edu/M3SDA/
917 |
918 | **Content introduction**
919 |
920 | https://blog.csdn.net/qq_41076797/article/details/120819629
921 |
922 | **Paper address**
923 |
924 | https://arxiv.org/abs/1812.01754
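
**Code sketch (unofficial)**

A minimal sketch of the moment-matching term: align the first k feature moments between every source domain and the target, and between pairs of source domains. Using two moments and toy tensors is an illustrative choice; the full method also trains per-source classifiers and combines their predictions.

```python
# Moment-matching loss sketch for M3SDA-style multi-source adaptation (illustrative).
import torch

def moment_distance(h1, h2, k_moments=2):
    d = 0.0
    for k in range(1, k_moments + 1):
        d = d + (h1.pow(k).mean(dim=0) - h2.pow(k).mean(dim=0)).norm(p=2)
    return d

def m3sda_loss(source_feats, target_feat):
    # source_feats: list of (batch, dim) feature tensors, one per source domain
    loss, pairs = 0.0, 0
    for i, hs in enumerate(source_feats):
        loss, pairs = loss + moment_distance(hs, target_feat), pairs + 1
        for hj in source_feats[i + 1:]:
            loss, pairs = loss + moment_distance(hs, hj), pairs + 1
    return loss / pairs

# example: m3sda_loss([torch.randn(32, 64), torch.randn(32, 64)], torch.randn(32, 64))
```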
925 |
926 | ## CMSS
927 |
928 | **title**
929 |
930 | Curriculum manager for source selection in multi- source domain adaptation
931 |
932 | **Times**
933 |
934 | 2020
935 |
936 | **Authors**
937 |
938 | Luyu Yang, Yogesh Balaji, Ser-Nam Lim, Abhinav Shrivastava
939 |
940 | **Abstract**
941 |
942 | The performance of Multi-Source Unsupervised Domain Adaptation depends significantly on the effectiveness of transfer from labeled source domain samples. In this paper, we proposed an adversarial agent that learns a dynamic curriculum for source samples, called Curriculum Manager for Source Selection (CMSS). The Curriculum Manager, an independent network module, constantly updates the curriculum during training, and iteratively learns which domains or samples are best suited for aligning to the target. The intuition behind this is to force the Curriculum Manager to constantly re-measure the transferability of latent domains over time to adversarially raise the error rate of the domain discriminator. CMSS does not require any knowledge of the domain labels, yet it outperforms other methods on four well-known benchmarks by significant margins. We also provide interpretable results that shed light on the proposed method.
943 |
944 | **Content introduction**
945 |
946 | https://blog.csdn.net/qq_41076797/article/details/120877511
947 |
948 | **Paper address**
949 |
950 | https://arxiv.org/abs/2007.01261
951 |
952 | ## LtC-MSDA
953 |
954 | **title**
955 |
956 | Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation
957 |
958 | **Times**
959 |
960 | 2020
961 |
962 | **Authors**
963 |
964 | Hang Wang, Minghao Xu, Bingbing Ni, and Wenjun Zhang
965 |
966 | **Abstract**
967 |
968 | Transferring knowledges learned from multiple source domains to target domain is a more practical and challenging task than conventional single-source domain adaptation. Furthermore, the increase of modalities brings more difficulty in aligning feature distributions among multiple domains. To mitigate these problems, we propose a Learning to Combine for Multi-Source Domain Adaptation (LtC-MSDA) framework via exploring interactions among domains. In the nutshell, a knowledge graph is constructed on the prototypes of various domains to realize the information propagation among semantically adjacent representations. On such basis, a graph model is learned to predict query samples under the guidance of correlated prototypes. In addition, we design a Relation Alignment Loss (RAL) to facilitate the consistency of categories’ relational interdependency and the compactness of features, which boosts features’ intra-class invariance and inter-class separability. Comprehensive results on public benchmark datasets demonstrate that our approach outperforms existing methods with a remarkable margin. Our code is available at https://github.com/ChrisAllenMing/LtC-MSDA.
969 |
970 | **Content introduction**
971 |
972 | https://blog.csdn.net/qq_41076797/article/details/120978951
973 |
974 | **Paper address**
975 |
976 | https://arxiv.org/abs/2007.08801
977 |
978 | ## Dirt-T
979 |
980 | **title**
981 |
982 | A DIRT-T approach to unsupervised domain adaptation
983 |
984 | **Times**
985 |
986 | 2018
987 |
988 | **Authors**
989 |
990 | Rui Shu, Hung H. Bui, Hirokazu Narui, & Stefano Ermon
991 |
992 | **Abstract**
993 |
994 | Domain adaptation refers to the problem of leveraging labeled data in a source domain to learn an accurate model in a target domain where labels are scarce or unavailable. A recent approach for finding a common representation of the two domains is via domain adversarial training (Ganin & Lempitsky, 2015), which attempts to induce a feature extractor that matches the source and target feature distributions in some feature space. However, domain adversarial training faces two critical limitations: 1) if the feature extraction function has high-capacity, then feature distribution matching is a weak constraint, 2) in non-conservative domain adaptation (where no single classifier can perform well in both the source and target domains), training the model to do well on the source domain hurts performance on the target domain. In this paper, we address these issues through the lens of the cluster assumption, i.e., decision boundaries should not cross high-density data regions. We propose two novel and related models: 1) the Virtual Adversarial Domain Adaptation (VADA) model, which combines domain adversarial training with a penalty term that punishes violation of the cluster assumption; 2) the Decision-boundary Iterative Refinement Training with a Teacher (DIRT-T) model, which takes the VADA model as initialization and employs natural gradient steps to further minimize the cluster assumption violation. Extensive empirical results demonstrate that the combination of these two models significantly improve the state-of-the-art performance on the digit, traffic sign, and Wi-Fi recognition domain adaptation benchmarks.
995 |
996 | **Content introduction**
997 |
998 | https://blog.csdn.net/qq_41076797/article/details/121226438
999 |
1000 | **Paper address**
1001 |
1002 | https://openreview.net/pdf?id=H1q-TM-AW
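
**Code sketch (unofficial)**

A minimal sketch of two pieces of the refinement objective: the conditional-entropy penalty that enforces the cluster assumption on target predictions, and the KL term that keeps the refined model close to the previous-iterate teacher. The virtual adversarial training term used by VADA/DIRT-T is omitted, and the weights `lam_e` and `beta` are placeholders.

```python
# Cluster-assumption terms for a DIRT-T-style refinement step (illustrative).
import torch
import torch.nn.functional as F

def conditional_entropy(logits):
    p = F.softmax(logits, dim=1)
    return -(p * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

def teacher_kl(student_logits, teacher_logits):
    return F.kl_div(F.log_softmax(student_logits, dim=1),
                    F.softmax(teacher_logits, dim=1), reduction="batchmean")

# target loss per step:  lam_e * conditional_entropy(f(xt)) + beta * teacher_kl(f(xt), f_teacher(xt))
```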
1003 |
--------------------------------------------------------------------------------
/images/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CtrlZ1/Domain-Adaptation-Algorithms/cf8dcc5f5d2116005205ad76efdf5e6445013df3/images/logo.png
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | torch>=0.4.0
2 | torchvision
3 | matplotlib
4 | numpy
5 | scipy
6 | pillow
7 | urllib3
8 | scikit-image
9 | tqdm
--------------------------------------------------------------------------------