
Stochastic Normalization (StochNorm)

class ftlib.finetune.stochnorm.StochNorm1d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, p=0.5)[source]

Applies Stochastic Normalization over a 2D or 3D input (a mini-batch of 1D inputs with optional additional channel dimension)

Stochastic Normalization was proposed in Stochastic Normalization (NeurIPS 2020)

\[\hat{x}_{i,0} = \frac{x_i - \tilde{\mu}}{\sqrt{\tilde{\sigma} + \epsilon}}, \qquad \hat{x}_{i,1} = \frac{x_i - \mu}{\sqrt{\sigma + \epsilon}}\]

\[\hat{x}_i = (1-s)\cdot \hat{x}_{i,0} + s\cdot \hat{x}_{i,1}, \qquad y_i = \gamma \hat{x}_i + \beta\]

where \(\mu\) and \(\sigma\) are the mean and variance of the current mini-batch.

\(\tilde{\mu}\) and \(\tilde{\sigma}\) are the current moving statistics of the training data.

\(s\) is a branch-selection variable drawn from a Bernoulli distribution with \(P(s=1)=p\).

During training there are two normalization branches: one uses the mean and variance of the current mini-batch, while the other uses the current moving statistics of the training data, as in standard batch normalization.

During evaluation, the moving statistics are used for normalization.

Parameters
  • num_features (int) – \(c\) from an expected input of size \((b, c, l)\) or \(l\) from an expected input of size \((b, l)\).

  • eps (float) – A value added to the denominator for numerical stability. Default: 1e-5

  • momentum (float) – The value used for the running_mean and running_var computation. Default: 0.1

  • affine (bool) – A boolean value that when set to True, gives the layer learnable affine parameters. Default: True

  • track_running_stats (bool) – A boolean value that when set to True, this module tracks the running mean and variance, and when set to False, this module does not track such statistics and initializes the statistics buffers running_mean and running_var as None. When these buffers are None, this module always uses batch statistics in both training and eval modes. Default: True

  • p (float) – The probability of choosing the second branch (standard BN). Default: 0.5

Shape:
  • Input: \((b, l)\) or \((b, c, l)\)

  • Output: \((b, l)\) or \((b, c, l)\) (same shape as input)
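The training-time branch mixing described above can be sketched in plain Python for a single 1D feature. This is an illustrative sketch only, with a hypothetical function name and signature; it is not ftlib's API and omits the affine parameters' learnability and the running-statistics update:

```python
import random

def stochnorm_forward(x, running_mean, running_var, p, gamma=1.0, beta=0.0,
                      eps=1e-5, rng=random.Random(0)):
    # Sketch of the StochNorm training-time forward pass for one feature.
    # Hypothetical helper, not part of ftlib.finetune.stochnorm.
    n = len(x)
    mu = sum(x) / n                          # mini-batch mean
    var = sum((v - mu) ** 2 for v in x) / n  # mini-batch variance
    s = 1 if rng.random() < p else 0         # Bernoulli branch selector, P(s=1)=p
    out = []
    for v in x:
        x0 = (v - running_mean) / (running_var + eps) ** 0.5  # moving-stats branch
        x1 = (v - mu) / (var + eps) ** 0.5                    # mini-batch branch
        xhat = (1 - s) * x0 + s * x1                          # select one branch
        out.append(gamma * xhat + beta)
    return out
```

With p=1.0 the layer always behaves like standard batch normalization (mini-batch branch); with p=0.0 it always normalizes by the moving statistics.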

class ftlib.finetune.stochnorm.StochNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, p=0.5)[source]

Applies Stochastic Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension)

Stochastic Normalization was proposed in Stochastic Normalization (NeurIPS 2020)

\[\hat{x}_{i,0} = \frac{x_i - \tilde{\mu}}{\sqrt{\tilde{\sigma} + \epsilon}}, \qquad \hat{x}_{i,1} = \frac{x_i - \mu}{\sqrt{\sigma + \epsilon}}\]

\[\hat{x}_i = (1-s)\cdot \hat{x}_{i,0} + s\cdot \hat{x}_{i,1}, \qquad y_i = \gamma \hat{x}_i + \beta\]

where \(\mu\) and \(\sigma\) are the mean and variance of the current mini-batch.

\(\tilde{\mu}\) and \(\tilde{\sigma}\) are the current moving statistics of the training data.

\(s\) is a branch-selection variable drawn from a Bernoulli distribution with \(P(s=1)=p\).

During training there are two normalization branches: one uses the mean and variance of the current mini-batch, while the other uses the current moving statistics of the training data, as in standard batch normalization.

During evaluation, the moving statistics are used for normalization.

Parameters
  • num_features (int) – \(c\) from an expected input of size \((b, c, h, w)\).

  • eps (float) – A value added to the denominator for numerical stability. Default: 1e-5

  • momentum (float) – The value used for the running_mean and running_var computation. Default: 0.1

  • affine (bool) – A boolean value that when set to True, gives the layer learnable affine parameters. Default: True

  • track_running_stats (bool) – A boolean value that when set to True, this module tracks the running mean and variance, and when set to False, this module does not track such statistics and initializes the statistics buffers running_mean and running_var as None. When these buffers are None, this module always uses batch statistics in both training and eval modes. Default: True

  • p (float) – The probability of choosing the second branch (standard BN). Default: 0.5

Shape:
  • Input: \((b, c, h, w)\)

  • Output: \((b, c, h, w)\) (same shape as input)
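The momentum parameter controls how the moving statistics \(\tilde{\mu}\) and \(\tilde{\sigma}\) are updated from each mini-batch. A minimal sketch of the conventional exponential-moving-average rule, assuming the same convention as standard batch normalization (the function name is hypothetical):

```python
def update_running(running, batch_stat, momentum=0.1):
    # Conventional BN-style running-statistics update (illustrative sketch):
    # new_running = (1 - momentum) * running + momentum * batch_stat
    return (1.0 - momentum) * running + momentum * batch_stat
```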

class ftlib.finetune.stochnorm.StochNorm3d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, p=0.5)[source]

Applies Stochastic Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension)

Stochastic Normalization was proposed in Stochastic Normalization (NeurIPS 2020)

\[\hat{x}_{i,0} = \frac{x_i - \tilde{\mu}}{\sqrt{\tilde{\sigma} + \epsilon}}, \qquad \hat{x}_{i,1} = \frac{x_i - \mu}{\sqrt{\sigma + \epsilon}}\]

\[\hat{x}_i = (1-s)\cdot \hat{x}_{i,0} + s\cdot \hat{x}_{i,1}, \qquad y_i = \gamma \hat{x}_i + \beta\]

where \(\mu\) and \(\sigma\) are the mean and variance of the current mini-batch.

\(\tilde{\mu}\) and \(\tilde{\sigma}\) are the current moving statistics of the training data.

\(s\) is a branch-selection variable drawn from a Bernoulli distribution with \(P(s=1)=p\).

During training there are two normalization branches: one uses the mean and variance of the current mini-batch, while the other uses the current moving statistics of the training data, as in standard batch normalization.

During evaluation, the moving statistics are used for normalization.

Parameters
  • num_features (int) – \(c\) from an expected input of size \((b, c, d, h, w)\)

  • eps (float) – A value added to the denominator for numerical stability. Default: 1e-5

  • momentum (float) – The value used for the running_mean and running_var computation. Default: 0.1

  • affine (bool) – A boolean value that when set to True, gives the layer learnable affine parameters. Default: True

  • track_running_stats (bool) – A boolean value that when set to True, this module tracks the running mean and variance, and when set to False, this module does not track such statistics and initializes the statistics buffers running_mean and running_var as None. When these buffers are None, this module always uses batch statistics in both training and eval modes. Default: True

  • p (float) – The probability of choosing the second branch (standard BN). Default: 0.5

Shape:
  • Input: \((b, c, d, h, w)\)

  • Output: \((b, c, d, h, w)\) (same shape as input)

ftlib.finetune.stochnorm.convert_model(module, p)[source]

Traverses the input module and its children recursively, replacing every instance of BatchNorm with StochNorm.

Parameters
  • module (torch.nn.Module) – The module to be converted to its StochNorm version.

  • p (float) – The probability hyperparameter for the StochNorm layers.

Returns

The module converted to StochNorm version.
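The recursive-replacement idea behind convert_model can be illustrated without PyTorch using toy stand-in classes. All names below (Module, BatchNorm, StochNorm, and this convert_model) are hypothetical minimal stand-ins for illustration, not ftlib's or torch's actual classes:

```python
class Module:
    # Toy stand-in for a torch.nn container module (illustrative only).
    def __init__(self, **children):
        self.children = children

class BatchNorm(Module):
    # Toy stand-in for a torch.nn BatchNorm layer.
    def __init__(self, num_features):
        super().__init__()
        self.num_features = num_features

class StochNorm(Module):
    # Toy stand-in for a StochNorm layer with branch probability p.
    def __init__(self, num_features, p):
        super().__init__()
        self.num_features = num_features
        self.p = p

def convert_model(module, p):
    # Recursively replace every BatchNorm with a StochNorm of the same width.
    if isinstance(module, BatchNorm):
        return StochNorm(module.num_features, p)
    for name, child in module.children.items():
        module.children[name] = convert_model(child, p)
    return module
```

A typical use of the real function would be converting a pretrained backbone once, before fine-tuning, so that all of its BatchNorm layers are swapped in place.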
