Common Adaptation Modules¶
Domain Discriminator¶
- class dalib.modules.domain_discriminator.DomainDiscriminator(in_feature, hidden_size, batch_norm=True)[source]¶
Domain discriminator model from Domain-Adversarial Training of Neural Networks (ICML 2015)
Distinguishes whether the input features come from the source domain or the target domain. The source domain label is 1 and the target domain label is 0.
- Parameters
in_feature (int) – dimension of the input feature
hidden_size (int) – dimension of the hidden features
batch_norm (bool) – whether to use BatchNorm1d. Uses Dropout if batch_norm is False. Default: True
- Shape:
Inputs: \((minibatch, in\_feature)\)
Outputs: \((minibatch, 1)\)
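A minimal sketch of such a discriminator head, assuming PyTorch is available. The helper name, the single hidden layer, and the final Sigmoid are illustrative simplifications, not the library's exact architecture:

```python
import torch
import torch.nn as nn

def make_domain_discriminator(in_feature: int, hidden_size: int,
                              batch_norm: bool = True) -> nn.Sequential:
    """Binary classifier head: source (label 1) vs. target (label 0) features."""
    # BatchNorm1d when batch_norm=True, Dropout otherwise, mirroring the docstring
    norm_or_dropout = nn.BatchNorm1d(hidden_size) if batch_norm else nn.Dropout(0.5)
    return nn.Sequential(
        nn.Linear(in_feature, hidden_size),
        norm_or_dropout,
        nn.ReLU(),
        nn.Linear(hidden_size, 1),
        nn.Sigmoid(),  # output in (0, 1): probability that the feature is from the source domain
    )

disc = make_domain_discriminator(in_feature=256, hidden_size=1024)
features = torch.randn(8, 256)   # (minibatch, in_feature)
domain_prob = disc(features)     # (minibatch, 1)
```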
GRL¶
- class dalib.modules.grl.WarmStartGradientReverseLayer(alpha=1.0, lo=0.0, hi=1.0, max_iters=1000.0, auto_step=False)[source]¶
Gradient Reverse Layer \(\mathcal{R}(x)\) with warm start
The forward and backward behaviours are:
\[\begin{aligned}\mathcal{R}(x) &= x,\\\dfrac{d\mathcal{R}}{dx} &= -\lambda I.\end{aligned}\]

\(\lambda\) is initiated at \(lo\) and is gradually changed to \(hi\) using the following schedule:

\[\lambda = \dfrac{2(hi-lo)}{1+\exp(-\alpha \dfrac{i}{N})} - (hi-lo) + lo\]

where \(i\) is the iteration step.
- Parameters
alpha (float, optional) – \(α\). Default: 1.0
lo (float, optional) – Initial value of \(\lambda\). Default: 0.0
hi (float, optional) – Final value of \(\lambda\). Default: 1.0
max_iters (int, optional) – \(N\). Default: 1000
auto_step (bool, optional) – If True, increase \(i\) each time forward is called. Otherwise use function step to increase \(i\). Default: False
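The warm-start schedule is plain arithmetic, so it can be reproduced directly. The helper below is a hypothetical illustration (the function name is mine, not the library's); note that \(\lambda\) starts exactly at \(lo\) at \(i = 0\) and only saturates toward \(hi\) as \(i\) grows well past \(N\):

```python
import math

def warm_start_coefficient(i: int, alpha: float = 1.0, lo: float = 0.0,
                           hi: float = 1.0, max_iters: float = 1000.0) -> float:
    """Coefficient lambda at iteration i: starts at `lo`, saturates toward `hi`."""
    return 2.0 * (hi - lo) / (1.0 + math.exp(-alpha * i / max_iters)) - (hi - lo) + lo

print(warm_start_coefficient(0))     # 0.0 -> starts at lo
print(warm_start_coefficient(1000))  # ~0.462 at i = N; hi is approached asymptotically
```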
Kernels¶
- class dalib.modules.kernels.GaussianKernel(sigma=None, track_running_stats=True, alpha=1.0)[source]¶
Gaussian Kernel Matrix
Gaussian Kernel k is defined by
\[k(x_1, x_2) = \exp \left( - \dfrac{\| x_1 - x_2 \|^2}{2\sigma^2} \right)\]

where \(x_1, x_2 \in \mathbb{R}^d\) are 1-d tensors.
Gaussian Kernel Matrix K is defined on input group \(X=(x_1, x_2, ..., x_m),\)
\[K(X)_{i,j} = k(x_i, x_j)\]

Also, by default, during training this layer keeps running estimates of the mean of squared L2 distances, which are then used to set the hyperparameter \(\sigma\). Mathematically, the estimate is \(\sigma^2 = \dfrac{\alpha}{n^2}\sum_{i,j} \| x_i - x_j \|^2\). If track_running_stats is set to False, this layer does not keep running estimates and uses a fixed \(\sigma\) instead.

- Parameters
sigma (float, optional) – bandwidth \(\sigma\). Default: None
track_running_stats (bool, optional) – If True, this module tracks the running mean of \(\sigma^2\). Otherwise, it won't track such statistics and always uses a fixed \(\sigma^2\). Default: True
alpha (float, optional) – \(\alpha\), which decides the magnitude of \(\sigma^2\) when track_running_stats is set to True
- Inputs:
X (tensor): input group \(X\)
- Shape:
Inputs: \((minibatch, F)\) where F means the dimension of input features.
Outputs: \((minibatch, minibatch)\)
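The kernel-matrix computation and the batch-based \(\sigma^2\) rule can be sketched in plain Python. This is a simplified, framework-free illustration of the formulas above, not the library's tensor implementation:

```python
import math

def gaussian_kernel_matrix(X, alpha=1.0, sigma=None):
    """K[i][j] = exp(-||x_i - x_j||^2 / (2 sigma^2)) over a group of vectors X.

    If sigma is None, set it from the batch as in the running-stats rule:
    sigma^2 = (alpha / n^2) * sum_{i,j} ||x_i - x_j||^2.
    """
    n = len(X)
    # pairwise squared L2 distances
    sq_dist = [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
               for i in range(n)]
    if sigma is None:
        sigma2 = alpha * sum(map(sum, sq_dist)) / (n * n)
    else:
        sigma2 = sigma ** 2
    return [[math.exp(-d / (2.0 * sigma2)) for d in row] for row in sq_dist]

# (minibatch, F) input -> (minibatch, minibatch) kernel matrix
K = gaussian_kernel_matrix([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
```

As expected for a Gaussian kernel, the matrix is symmetric with ones on the diagonal, since \(k(x, x) = \exp(0) = 1\).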
Entropy¶
- dalib.modules.entropy.entropy(predictions, reduction='none')[source]¶
Entropy of prediction. The definition is:
\[entropy(p) = - \sum_{c=1}^C p_c \log p_c\]

where \(C\) is the number of classes.
- Parameters
predictions (tensor) – Classifier predictions. Expected to contain normalized probability scores for each class
reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean'. 'none': no reduction will be applied; 'mean': the sum of the output will be divided by the number of elements in the output. Default: 'none'
- Shape:
predictions: \((minibatch, C)\) where C means the number of classes.
Output: \((minibatch, )\) by default. If reduction is 'mean', then scalar.
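Per sample, the definition reduces to a one-liner. A plain-Python sketch under the convention \(0 \log 0 = 0\) (the library operates on tensors; these function names are illustrative):

```python
import math

def entropy_per_sample(p):
    """H(p) = -sum_c p_c log p_c for one row of class probabilities."""
    return -sum(p_c * math.log(p_c) for p_c in p if p_c > 0)

def entropy(predictions, reduction='none'):
    """predictions: (minibatch, C) rows of probabilities; 'mean' averages over rows."""
    per_sample = [entropy_per_sample(p) for p in predictions]
    return sum(per_sample) / len(per_sample) if reduction == 'mean' else per_sample

# uniform predictions maximize entropy at log(C); peaked predictions score lower
h = entropy([[0.25, 0.25, 0.25, 0.25], [0.7, 0.1, 0.1, 0.1]])
```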