DEep Learning Transfer using Feature Map with Attention (DELTA)

class ftlib.finetune.delta.L2Regularization(model)[source]

The L2 regularization of parameters \(w\) can be described as:

\[{\Omega} (w) = \dfrac{1}{2} \Vert w\Vert_2^2 .\]

Parameters

model (torch.nn.Module) – The model to apply the L2 penalty to.

Shape:
  • Output: scalar.
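
Example:

A minimal sketch of the penalty above, computed directly from the formula rather than through the ftlib class, so the helper name and call convention here are illustrative only:

    import torch
    import torch.nn as nn

    def l2_penalty(model: nn.Module) -> torch.Tensor:
        # Omega(w) = 1/2 * ||w||_2^2, summed over all trainable parameters.
        return 0.5 * sum(p.pow(2).sum() for p in model.parameters())

    model = nn.Linear(10, 2)
    print(l2_penalty(model))  # a scalar tensor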

class ftlib.finetune.delta.SPRegularization(source_model, target_model)[source]

The SP (Starting Point) regularization from Explicit inductive bias for transfer learning with convolutional networks (ICML 2018)

The SP regularization of parameters \(w\) can be described as:

\[{\Omega} (w) = \dfrac{1}{2} \Vert w-w^0\Vert_2^2 ,\]

where \(w^0\) is the parameter vector of the model pretrained on the source problem, acting as the starting point (SP) in fine-tuning.

Parameters
  • source_model (torch.nn.Module) – The model pretrained on the source problem, providing the starting point \(w^0\).

  • target_model (torch.nn.Module) – The model being fine-tuned on the target problem, whose parameters \(w\) are penalized.

Shape:
  • Output: scalar.
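
Example:

A from-scratch sketch of the SP penalty above (not the ftlib class itself); it keeps a frozen copy of the pretrained parameters \(w^0\) and penalizes the squared distance of the fine-tuned parameters from them:

    import copy
    import torch
    import torch.nn as nn

    def sp_penalty(target_model: nn.Module, source_weights: dict) -> torch.Tensor:
        # Omega(w) = 1/2 * ||w - w0||_2^2, summed over all named parameters.
        penalty = torch.zeros(())
        for name, p in target_model.named_parameters():
            penalty = penalty + 0.5 * (p - source_weights[name]).pow(2).sum()
        return penalty

    pretrained = nn.Linear(10, 2)                    # stands in for the source-pretrained model
    finetuned = copy.deepcopy(pretrained)            # fine-tuning starts from w^0
    w0 = {n: p.detach().clone() for n, p in pretrained.named_parameters()}
    print(sp_penalty(finetuned, w0))                 # zero at the starting point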

class ftlib.finetune.delta.BehavioralRegularization[source]

The behavioral regularization from DELTA: DEep Learning Transfer using Feature Map with Attention for convolutional networks (ICLR 2019)

It can be described as:

\[{\Omega} (w) = \sum_{j=1}^{N} \Vert FM_j(w, \boldsymbol x)-FM_j(w^0, \boldsymbol x)\Vert_2^2 ,\]

where \(w^0\) is the parameter vector of the model pretrained on the source problem, acting as the starting point (SP) in fine-tuning, and \(FM_j(w, \boldsymbol x)\) is the feature map generated by the \(j\)-th layer of the model parameterized with \(w\), given the input \(\boldsymbol x\).

Inputs:

layer_outputs_source (OrderedDict): The dictionary for the source model, where the keys are layer names and the values are the corresponding feature maps.

layer_outputs_target (OrderedDict): The dictionary for the target model, where the keys are layer names and the values are the corresponding feature maps.

Shape:
  • Output: scalar.
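
Example:

A direct sketch of the behavioral penalty above: the squared L2 distance between source and target feature maps, summed over the selected layers. The OrderedDict inputs mirror the documented call convention; the helper name is illustrative:

    from collections import OrderedDict

    import torch

    def behavioral_penalty(layer_outputs_source: OrderedDict,
                           layer_outputs_target: OrderedDict) -> torch.Tensor:
        penalty = torch.zeros(())
        for fm_src, fm_tgt in zip(layer_outputs_source.values(),
                                  layer_outputs_target.values()):
            # Detach the source maps so that only the target model receives gradients.
            penalty = penalty + torch.norm(fm_tgt - fm_src.detach()) ** 2
        return penalty

    src = OrderedDict(layer1=torch.randn(2, 8, 4, 4))
    tgt = OrderedDict(layer1=torch.randn(2, 8, 4, 4))
    print(behavioral_penalty(src, tgt))              # a scalar tensor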

class ftlib.finetune.delta.AttentionBehavioralRegularization(channel_attention)[source]

The behavioral regularization with attention from DELTA: DEep Learning Transfer using Feature Map with Attention for convolutional networks (ICLR 2019)

It can be described as:

\[{\Omega} (w) = \sum_{j=1}^{N} W_j(w) \Vert FM_j(w, \boldsymbol x)-FM_j(w^0, \boldsymbol x)\Vert_2^2 ,\]

where \(w^0\) is the parameter vector of the model pretrained on the source problem, acting as the starting point (SP) in fine-tuning. \(FM_j(w, \boldsymbol x)\) is the feature map generated by the \(j\)-th layer of the model parameterized with \(w\), given the input \(\boldsymbol x\). \(W_j(w)\) is the channel attention of the \(j\)-th layer of the model parameterized with \(w\).

Parameters

channel_attention (list) – The channel attention of the feature map generated by each selected layer. For a layer with C channels, the channel attention is a tensor of shape [C].

Inputs:

layer_outputs_source (OrderedDict): The dictionary for the source model, where the keys are layer names and the values are the corresponding feature maps.

layer_outputs_target (OrderedDict): The dictionary for the target model, where the keys are layer names and the values are the corresponding feature maps.

Shape:
  • Output: scalar.
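
Example:

A sketch of the attention-weighted variant, applying \(W_j\) channel-wise to match the documented [C]-shaped attention per layer; the helper name and exact weighting are illustrative rather than the ftlib implementation:

    from collections import OrderedDict

    import torch

    def attention_behavioral_penalty(channel_attention,   # list of [C] tensors
                                     layer_outputs_source: OrderedDict,
                                     layer_outputs_target: OrderedDict) -> torch.Tensor:
        penalty = torch.zeros(())
        for attention, fm_src, fm_tgt in zip(channel_attention,
                                             layer_outputs_source.values(),
                                             layer_outputs_target.values()):
            # Feature maps have shape [B, C, H, W]; the distance is computed per channel.
            diff = (fm_tgt - fm_src.detach()).pow(2).sum(dim=(2, 3))  # [B, C]
            penalty = penalty + (attention * diff).sum()
        return penalty

    attn = [torch.softmax(torch.randn(8), dim=0)]        # one attention vector per selected layer
    src = OrderedDict(layer1=torch.randn(2, 8, 4, 4))
    tgt = OrderedDict(layer1=torch.randn(2, 8, 4, 4))
    print(attention_behavioral_penalty(attn, src, tgt))  # a scalar tensor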

class ftlib.finetune.delta.IntermediateLayerGetter(model, return_layers, keep_output=True)[source]

Wraps a model to get intermediate output values of selected layers.

Parameters
  • model (torch.nn.Module) – The model to collect intermediate layer feature maps.

  • return_layers (list) – The names of the selected modules whose outputs are returned.

  • keep_output (bool) – If True, the model’s final output is also returned; otherwise None is returned in its place. Default: True

Returns

  • An OrderedDict of intermediate outputs. The keys are the selected layer names from return_layers and the values are the corresponding feature maps, in the same order as return_layers.

  • The model’s final output, or None if keep_output is False.
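
Example:

A usage sketch that collects feature maps from two intermediate layers of a torchvision ResNet, assuming the wrapper is called like the wrapped model and returns the (intermediate outputs, final output) pair described above:

    import torch
    import torchvision.models as models
    from ftlib.finetune.delta import IntermediateLayerGetter

    backbone = models.resnet50(pretrained=True)
    return_layers = ['layer3', 'layer4']             # names of the modules to capture
    getter = IntermediateLayerGetter(backbone, return_layers)

    x = torch.randn(2, 3, 224, 224)
    intermediates, output = getter(x)
    for name, fm in intermediates.items():
        print(name, fm.shape)                        # e.g. layer3 -> torch.Size([2, 1024, 14, 14])
    print(output.shape)                              # torch.Size([2, 1000])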
