DEep Learning Transfer using Feature Map with Attention (DELTA)
class ftlib.finetune.delta.L2Regularization(model)

The L2 regularization of parameters \(w\) can be described as:

\[{\Omega} (w) = \dfrac{1}{2} \Vert w\Vert_2^2 ,\]

- Parameters
model (torch.nn.Module) – The model to which the L2 penalty is applied.
- Shape:
Output: scalar.
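The following standalone helper is a minimal sketch of the quantity this regularizer computes, following the formula above; it is illustrative and not the library implementation:

import torch
import torch.nn as nn

def l2_penalty(model: nn.Module) -> torch.Tensor:
    """Return 0.5 * ||w||_2^2 summed over all parameters, as a scalar tensor."""
    return 0.5 * sum(p.pow(2).sum() for p in model.parameters())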
class ftlib.finetune.delta.SPRegularization(source_model, target_model)

The SP (Starting Point) regularization from Explicit inductive bias for transfer learning with convolutional networks (ICML 2018).

The SP regularization of parameters \(w\) can be described as:

\[{\Omega} (w) = \dfrac{1}{2} \Vert w-w^0\Vert_2^2 ,\]

where \(w^0\) is the parameter vector of the model pretrained on the source problem, acting as the starting point (SP) in fine-tuning.
- Parameters
source_model (torch.nn.Module) – The source (starting point) model.
target_model (torch.nn.Module) – The target (fine-tuning) model.
- Shape:
Output: scalar.
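A minimal sketch of the SP penalty, following the formula above. Pairing the source and target parameters by name is an assumption made for illustration, not a description of the library's internal bookkeeping:

import torch
import torch.nn as nn

def sp_penalty(source_model: nn.Module, target_model: nn.Module) -> torch.Tensor:
    """Return 0.5 * sum over shared parameters of ||w - w0||_2^2 as a scalar tensor."""
    source_params = dict(source_model.named_parameters())
    penalty = torch.tensor(0.)
    for name, w in target_model.named_parameters():
        if name in source_params:
            w0 = source_params[name].detach()  # the starting point, never updated
            penalty = penalty + 0.5 * (w - w0).pow(2).sum()
    return penalty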
class ftlib.finetune.delta.BehavioralRegularization

The behavioral regularization from DELTA: DEep Learning Transfer using Feature Map with Attention for convolutional networks (ICLR 2019).

It can be described as:
\[{\Omega} (w) = \sum_{j=1}^{N} \Vert FM_j(w, \boldsymbol x)-FM_j(w^0, \boldsymbol x)\Vert_2^2 ,\]

where \(w^0\) is the parameter vector of the model pretrained on the source problem, acting as the starting point (SP) in fine-tuning, and \(FM_j(w, \boldsymbol x)\) is the feature map generated by the \(j\)-th layer of the model parameterized with \(w\), given the input \(\boldsymbol x\).
- Inputs:
layer_outputs_source (OrderedDict): The dictionary for the source model, where the keys are layer names and the values are the corresponding feature maps.
layer_outputs_target (OrderedDict): The dictionary for the target model, where the keys are layer names and the values are the corresponding feature maps.
- Shape:
Output: scalar.
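A minimal sketch of the behavioral penalty, following the formula above and the documented OrderedDict inputs; it is illustrative, not the library implementation:

from collections import OrderedDict
import torch

def behavioral_penalty(layer_outputs_source: OrderedDict,
                       layer_outputs_target: OrderedDict) -> torch.Tensor:
    """Return sum_j ||FM_j(w, x) - FM_j(w0, x)||_2^2 as a scalar tensor."""
    penalty = torch.tensor(0.)
    for name, fm_target in layer_outputs_target.items():
        fm_source = layer_outputs_source[name].detach()  # source feature maps act as fixed targets
        penalty = penalty + (fm_target - fm_source).pow(2).sum()
    return penalty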
class ftlib.finetune.delta.AttentionBehavioralRegularization(channel_attention)

The behavioral regularization with attention from DELTA: DEep Learning Transfer using Feature Map with Attention for convolutional networks (ICLR 2019).

It can be described as:
\[{\Omega} (w) = \sum_{j=1}^{N} W_j(w) \Vert FM_j(w, \boldsymbol x)-FM_j(w^0, \boldsymbol x)\Vert_2^2 ,\]

where \(w^0\) is the parameter vector of the model pretrained on the source problem, acting as the starting point (SP) in fine-tuning, \(FM_j(w, \boldsymbol x)\) is the feature map generated by the \(j\)-th layer of the model parameterized with \(w\), given the input \(\boldsymbol x\), and \(W_j(w)\) is the channel attention of the \(j\)-th layer of the model parameterized with \(w\).
- Parameters
channel_attention (list) – The channel attention of the feature map generated by each selected layer. For a layer with C channels, the channel attention is a tensor of shape [C].
- Inputs:
layer_outputs_source (OrderedDict): The dictionary for the source model, where the keys are layer names and the values are the corresponding feature maps.
layer_outputs_target (OrderedDict): The dictionary for the target model, where the keys are layer names and the values are the corresponding feature maps.
- Shape:
Output: scalar.
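A minimal sketch of the attention-weighted penalty, following the formula above. Feature maps of shape [B, C, H, W] are an assumption made for illustration; the sketch is not the library implementation:

from collections import OrderedDict
import torch

def attention_behavioral_penalty(channel_attention,
                                 layer_outputs_source: OrderedDict,
                                 layer_outputs_target: OrderedDict) -> torch.Tensor:
    """Return sum_j W_j * ||FM_j(w, x) - FM_j(w0, x)||_2^2 as a scalar tensor."""
    penalty = torch.tensor(0.)
    for attention, (name, fm_target) in zip(channel_attention,
                                            layer_outputs_target.items()):
        fm_source = layer_outputs_source[name].detach()
        squared_diff = (fm_target - fm_source).pow(2)   # [B, C, H, W]
        per_channel = squared_diff.sum(dim=(0, 2, 3))   # squared norm per channel -> [C]
        penalty = penalty + (attention * per_channel).sum()
    return penalty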
class ftlib.finetune.delta.IntermediateLayerGetter(model, return_layers, keep_output=True)

Wraps a model to get the intermediate outputs of selected layers.
- Parameters
model (torch.nn.Module) – The model from which to collect intermediate layer feature maps.
return_layers (list) – The names of the selected modules whose outputs are returned.
keep_output (bool) – If True, the model’s final output is also returned; otherwise the final output is None. Default: True
- Returns
An OrderedDict of intermediate outputs. The keys are the selected layer names in return_layers and the values are the corresponding feature maps, in the same order as return_layers.
The model’s final output, or None if keep_output is False.
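A usage sketch based on the signature and Returns description above. The two-value return (the intermediate OrderedDict plus the final output) and the choice of a torchvision ResNet-50 backbone are assumptions made for illustration:

import torch
from torchvision import models
from ftlib.finetune.delta import IntermediateLayerGetter

backbone = models.resnet50(pretrained=True)     # any torch.nn.Module with named submodules works
getter = IntermediateLayerGetter(backbone, return_layers=['layer3', 'layer4'])

x = torch.randn(2, 3, 224, 224)
intermediate_outputs, final_output = getter(x)  # assumed two-value return, per the Returns section
for name, feature_map in intermediate_outputs.items():
    print(name, feature_map.shape)              # e.g. layer3 -> torch.Size([2, 1024, 14, 14])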