Common Adaptation Modules¶
Domain Discriminator¶
- class dalib.modules.domain_discriminator.DomainDiscriminator(in_feature, hidden_size, batch_norm=True)[source]¶
Domain discriminator model from Domain-Adversarial Training of Neural Networks (ICML 2015)
Distinguishes whether the input features come from the source domain or the target domain. The source domain label is 1 and the target domain label is 0.
- Parameters
in_feature (int) – dimension of the input feature
hidden_size (int) – dimension of the hidden features
batch_norm (bool) – whether to use BatchNorm1d. Uses Dropout if batch_norm is False. Default: True
- Shape:
Inputs: \((minibatch, in\_feature)\)
Outputs: \((minibatch, 1)\)
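A minimal sketch of such a discriminator head, assuming PyTorch is available. The helper name, the single hidden layer, and the final Sigmoid are illustrative simplifications, not the library's exact architecture:

```python
import torch
import torch.nn as nn

def make_domain_discriminator(in_feature: int, hidden_size: int,
                              batch_norm: bool = True) -> nn.Sequential:
    """Binary classifier head: source (label 1) vs. target (label 0) features."""
    # BatchNorm1d when batch_norm=True, Dropout otherwise, mirroring the docstring
    norm_or_dropout = nn.BatchNorm1d(hidden_size) if batch_norm else nn.Dropout(0.5)
    return nn.Sequential(
        nn.Linear(in_feature, hidden_size),
        norm_or_dropout,
        nn.ReLU(),
        nn.Linear(hidden_size, 1),
        nn.Sigmoid(),  # output in (0, 1): probability that the feature is from the source domain
    )

disc = make_domain_discriminator(in_feature=256, hidden_size=1024)
features = torch.randn(8, 256)   # (minibatch, in_feature)
domain_prob = disc(features)     # (minibatch, 1)
```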
GRL¶
- class dalib.modules.grl.WarmStartGradientReverseLayer(alpha=1.0, lo=0.0, hi=1.0, max_iters=1000.0, auto_step=False)[source]¶
Gradient Reverse Layer \(\mathcal{R}(x)\) with warm start
The forward and backward behaviours are:
\[\begin{aligned}\mathcal{R}(x) &= x,\\\dfrac{d\mathcal{R}}{dx} &= -\lambda I.\end{aligned}\]

\(\lambda\) is initiated at \(lo\) and is gradually changed to \(hi\) using the following schedule:

\[\lambda = \dfrac{2(hi-lo)}{1+\exp(-\alpha \dfrac{i}{N})} - (hi-lo) + lo\]

where \(i\) is the iteration step.
- Parameters
alpha (float, optional) – \(α\). Default: 1.0
lo (float, optional) – Initial value of \(\lambda\). Default: 0.0
hi (float, optional) – Final value of \(\lambda\). Default: 1.0
max_iters (int, optional) – \(N\). Default: 1000
auto_step (bool, optional) – If True, increase \(i\) each time forward is called. Otherwise use function step to increase \(i\). Default: False
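The warm-start schedule is plain arithmetic, so it can be reproduced directly. The helper below is a hypothetical illustration (the function name is mine, not the library's); note that \(\lambda\) starts exactly at \(lo\) at \(i = 0\) and only saturates toward \(hi\) as \(i\) grows well past \(N\):

```python
import math

def warm_start_coefficient(i: int, alpha: float = 1.0, lo: float = 0.0,
                           hi: float = 1.0, max_iters: float = 1000.0) -> float:
    """Coefficient lambda at iteration i: starts at `lo`, saturates toward `hi`."""
    return 2.0 * (hi - lo) / (1.0 + math.exp(-alpha * i / max_iters)) - (hi - lo) + lo

print(warm_start_coefficient(0))     # 0.0 -> starts at lo
print(warm_start_coefficient(1000))  # ~0.462 at i = N; hi is approached asymptotically
```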
Kernels¶
- class dalib.modules.kernels.GaussianKernel(sigma=None, track_running_stats=True, alpha=1.0)[source]¶
Gaussian Kernel Matrix
Gaussian Kernel k is defined by
\[k(x_1, x_2) = \exp \left( - \dfrac{\| x_1 - x_2 \|^2}{2\sigma^2} \right)\]

where \(x_1, x_2 \in \mathbb{R}^d\) are 1-d tensors.
Gaussian Kernel Matrix K is defined on input group \(X=(x_1, x_2, ..., x_m),\)
\[K(X)_{i,j} = k(x_i, x_j)\]

Also, by default, during training this layer keeps running estimates of the mean of squared L2 distances, which are then used to set the hyperparameter \(\sigma\). Mathematically, the estimate is \(\sigma^2 = \dfrac{\alpha}{n^2}\sum_{i,j} \| x_i - x_j \|^2\). If track_running_stats is set to False, this layer does not keep running estimates and uses a fixed \(\sigma\) instead.

- Parameters
sigma (float, optional) – bandwidth \(\sigma\). Default: None
track_running_stats (bool, optional) – If True, this module tracks the running mean of \(\sigma^2\). Otherwise, it won't track such statistics and always uses a fixed \(\sigma^2\). Default: True
alpha (float, optional) – \(\alpha\), which decides the magnitude of \(\sigma^2\) when track_running_stats is set to True
- Inputs:
X (tensor): input group \(X\)
- Shape:
Inputs: \((minibatch, F)\) where F means the dimension of input features.
Outputs: \((minibatch, minibatch)\)
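The kernel-matrix computation and the batch-based \(\sigma^2\) rule can be sketched in plain Python. This is a simplified, framework-free illustration of the formulas above, not the library's tensor implementation:

```python
import math

def gaussian_kernel_matrix(X, alpha=1.0, sigma=None):
    """K[i][j] = exp(-||x_i - x_j||^2 / (2 sigma^2)) over a group of vectors X.

    If sigma is None, set it from the batch as in the running-stats rule:
    sigma^2 = (alpha / n^2) * sum_{i,j} ||x_i - x_j||^2.
    """
    n = len(X)
    # pairwise squared L2 distances
    sq_dist = [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
               for i in range(n)]
    if sigma is None:
        sigma2 = alpha * sum(map(sum, sq_dist)) / (n * n)
    else:
        sigma2 = sigma ** 2
    return [[math.exp(-d / (2.0 * sigma2)) for d in row] for row in sq_dist]

# (minibatch, F) input -> (minibatch, minibatch) kernel matrix
K = gaussian_kernel_matrix([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
```

As expected for a Gaussian kernel, the matrix is symmetric with ones on the diagonal, since \(k(x, x) = \exp(0) = 1\).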
Entropy¶
- dalib.modules.entropy.entropy(predictions, reduction='none')[source]¶
Entropy of prediction. The definition is:
\[entropy(p) = - \sum_{c=1}^C p_c \log p_c\]

where \(C\) is the number of classes.
- Parameters
predictions (tensor) – Classifier predictions. Expected to contain normalized probability scores for each class
reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean'. 'none': no reduction will be applied; 'mean': the sum of the output will be divided by the number of elements in the output. Default: 'none'
- Shape:
predictions: \((minibatch, C)\) where C means the number of classes.
Output: \((minibatch, )\) by default. If reduction is 'mean', then scalar.
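Per sample, the definition reduces to a one-liner. A plain-Python sketch under the convention \(0 \log 0 = 0\) (the library operates on tensors; these function names are illustrative):

```python
import math

def entropy_per_sample(p):
    """H(p) = -sum_c p_c log p_c for one row of class probabilities."""
    return -sum(p_c * math.log(p_c) for p_c in p if p_c > 0)

def entropy(predictions, reduction='none'):
    """predictions: (minibatch, C) rows of probabilities; 'mean' averages over rows."""
    per_sample = [entropy_per_sample(p) for p in predictions]
    return sum(per_sample) / len(per_sample) if reduction == 'mean' else per_sample

# uniform predictions maximize entropy at log(C); peaked predictions score lower
h = entropy([[0.25, 0.25, 0.25, 0.25], [0.7, 0.1, 0.1, 0.1]])
```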