DEep Learning Transfer using Feature Map with Attention (DELTA)
class ftlib.finetune.delta.L2Regularization(model)

The L2 regularization of parameters \(w\) can be described as:

\[{\Omega} (w) = \dfrac{1}{2} \Vert w\Vert_2^2 ,\]

- Parameters
model (torch.nn.Module) – The model to which the L2 penalty is applied.
- Shape:
Output: scalar.
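The following standalone helper is a minimal sketch of the quantity this regularizer computes, following the formula above; it is illustrative and not the library implementation:

import torch
import torch.nn as nn

def l2_penalty(model: nn.Module) -> torch.Tensor:
    """Return 0.5 * ||w||_2^2 summed over all parameters, as a scalar tensor."""
    return 0.5 * sum(p.pow(2).sum() for p in model.parameters())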
class ftlib.finetune.delta.SPRegularization(source_model, target_model)

The SP (Starting Point) regularization from Explicit inductive bias for transfer learning with convolutional networks (ICML 2018).

The SP regularization of parameters \(w\) can be described as:

\[{\Omega} (w) = \dfrac{1}{2} \Vert w-w^0\Vert_2^2 ,\]

where \(w^0\) is the parameter vector of the model pretrained on the source problem, acting as the starting point (SP) in fine-tuning.
- Parameters
source_model (torch.nn.Module) – The source (starting point) model.
target_model (torch.nn.Module) – The target (fine-tuning) model.
- Shape:
Output: scalar.
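A minimal sketch of the SP penalty, following the formula above. Pairing the source and target parameters by name is an assumption made for illustration, not a description of the library's internal bookkeeping:

import torch
import torch.nn as nn

def sp_penalty(source_model: nn.Module, target_model: nn.Module) -> torch.Tensor:
    """Return 0.5 * sum over shared parameters of ||w - w0||_2^2 as a scalar tensor."""
    source_params = dict(source_model.named_parameters())
    penalty = torch.tensor(0.)
    for name, w in target_model.named_parameters():
        if name in source_params:
            w0 = source_params[name].detach()  # the starting point, never updated
            penalty = penalty + 0.5 * (w - w0).pow(2).sum()
    return penalty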
class ftlib.finetune.delta.BehavioralRegularization

The behavioral regularization from DELTA: DEep Learning Transfer using Feature Map with Attention for convolutional networks (ICLR 2019).

It can be described as:
\[{\Omega} (w) = \sum_{j=1}^{N} \Vert FM_j(w, \boldsymbol x)-FM_j(w^0, \boldsymbol x)\Vert_2^2 ,\]

where \(w^0\) is the parameter vector of the model pretrained on the source problem, acting as the starting point (SP) in fine-tuning, and \(FM_j(w, \boldsymbol x)\) is the feature map generated by the \(j\)-th layer of the model parameterized with \(w\), given the input \(\boldsymbol x\).
- Inputs:
layer_outputs_source (OrderedDict): The dictionary for the source model, where the keys are layer names and the values are the corresponding feature maps.
layer_outputs_target (OrderedDict): The dictionary for the target model, where the keys are layer names and the values are the corresponding feature maps.
- Shape:
Output: scalar.
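A minimal sketch of the behavioral penalty, following the formula above and the documented OrderedDict inputs; it is illustrative, not the library implementation:

from collections import OrderedDict
import torch

def behavioral_penalty(layer_outputs_source: OrderedDict,
                       layer_outputs_target: OrderedDict) -> torch.Tensor:
    """Return sum_j ||FM_j(w, x) - FM_j(w0, x)||_2^2 as a scalar tensor."""
    penalty = torch.tensor(0.)
    for name, fm_target in layer_outputs_target.items():
        fm_source = layer_outputs_source[name].detach()  # source feature maps act as fixed targets
        penalty = penalty + (fm_target - fm_source).pow(2).sum()
    return penalty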
class ftlib.finetune.delta.AttentionBehavioralRegularization(channel_attention)

The behavioral regularization with attention from DELTA: DEep Learning Transfer using Feature Map with Attention for convolutional networks (ICLR 2019).

It can be described as:
\[{\Omega} (w) = \sum_{j=1}^{N} W_j(w) \Vert FM_j(w, \boldsymbol x)-FM_j(w^0, \boldsymbol x)\Vert_2^2 ,\]

where \(w^0\) is the parameter vector of the model pretrained on the source problem, acting as the starting point (SP) in fine-tuning, \(FM_j(w, \boldsymbol x)\) is the feature map generated by the \(j\)-th layer of the model parameterized with \(w\), given the input \(\boldsymbol x\), and \(W_j(w)\) is the channel attention of the \(j\)-th layer of the model parameterized with \(w\).
- Parameters
channel_attention (list) – The channel attention of the feature map generated by each selected layer. For a layer with C channels, the channel attention is a tensor of shape [C].
- Inputs:
layer_outputs_source (OrderedDict): The dictionary for the source model, where the keys are layer names and the values are the corresponding feature maps.
layer_outputs_target (OrderedDict): The dictionary for the target model, where the keys are layer names and the values are the corresponding feature maps.
- Shape:
Output: scalar.
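A minimal sketch of the attention-weighted penalty, following the formula above. Feature maps of shape [B, C, H, W] are an assumption made for illustration; the sketch is not the library implementation:

from collections import OrderedDict
import torch

def attention_behavioral_penalty(channel_attention,
                                 layer_outputs_source: OrderedDict,
                                 layer_outputs_target: OrderedDict) -> torch.Tensor:
    """Return sum_j W_j * ||FM_j(w, x) - FM_j(w0, x)||_2^2 as a scalar tensor."""
    penalty = torch.tensor(0.)
    for attention, (name, fm_target) in zip(channel_attention,
                                            layer_outputs_target.items()):
        fm_source = layer_outputs_source[name].detach()
        squared_diff = (fm_target - fm_source).pow(2)   # [B, C, H, W]
        per_channel = squared_diff.sum(dim=(0, 2, 3))   # squared norm per channel -> [C]
        penalty = penalty + (attention * per_channel).sum()
    return penalty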
class ftlib.finetune.delta.IntermediateLayerGetter(model, return_layers, keep_output=True)

Wraps a model to get the intermediate outputs of selected layers.
- Parameters
model (torch.nn.Module) – The model from which to collect intermediate layer feature maps.
return_layers (list) – The names of the selected modules whose outputs are returned.
keep_output (bool) – If True, the model’s final output is also returned; otherwise the final output is None. Default: True
- Returns
An OrderedDict of intermediate outputs. The keys are the selected layer names in return_layers and the values are the corresponding feature maps, in the same order as return_layers.
The model’s final output, or None if keep_output is False.
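A usage sketch based on the signature and Returns description above. The two-value return (the intermediate OrderedDict plus the final output) and the choice of a torchvision ResNet-50 backbone are assumptions made for illustration:

import torch
from torchvision import models
from ftlib.finetune.delta import IntermediateLayerGetter

backbone = models.resnet50(pretrained=True)     # any torch.nn.Module with named submodules works
getter = IntermediateLayerGetter(backbone, return_layers=['layer3', 'layer4'])

x = torch.randn(2, 3, 224, 224)
intermediate_outputs, final_output = getter(x)  # assumed two-value return, per the Returns section
for name, feature_map in intermediate_outputs.items():
    print(name, feature_map.shape)              # e.g. layer3 -> torch.Size([2, 1024, 14, 14])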