Fourier Domain Adaptation (FDA)

class dalib.translation.fourier_transform.FourierTransform(image_list, amplitude_dir, beta=1, rebuild=False)[source]

Fourier Transform was introduced in FDA: Fourier Domain Adaptation for Semantic Segmentation (CVPR 2020).

Fourier Transform replaces the low-frequency component of the amplitude of the source image with that of the target image. Denote with \(M_{β}\) a mask whose value is zero except for the center region:

\[M_{β}(h,w) = \mathbb{1}_{(h, w)\in [-β, β] \times [-β, β]}\]

Given images \(x^s\) from source domain and \(x^t\) from target domain, the source image in the target style is

\[x^{s→t} = \mathcal{F}^{-1}([ M_{β}\circ\mathcal{F}^A(x^t) + (1-M_{β})\circ\mathcal{F}^A(x^s), \mathcal{F}^P(x^s) ])\]

where \(\mathcal{F}^A\) and \(\mathcal{F}^P\) are the amplitude and phase components of the Fourier Transform \(\mathcal{F}\) of an RGB image.
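
The formula above can be sketched directly in NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: `center_mask` and `translate` are hypothetical names, images are assumed to be float arrays of shape (C, H, W) with identical sizes, and the mask is built in the unshifted FFT layout (zero frequency at index [0, 0]).

```python
import numpy as np


def center_mask(height, width, beta=1):
    """Hypothetical helper: boolean M_beta in unshifted FFT layout."""
    # fftfreq(n) * n yields the signed integer frequency index of each FFT bin
    fh = np.rint(np.fft.fftfreq(height) * height).astype(int)
    fw = np.rint(np.fft.fftfreq(width) * width).astype(int)
    # True where both frequency indices lie in [-beta, beta]
    return (np.abs(fh)[:, None] <= beta) & (np.abs(fw)[None, :] <= beta)


def translate(x_s, x_t, beta=1):
    """Hypothetical helper: source image x_s rendered in the style of x_t."""
    f_s = np.fft.fft2(x_s, axes=(-2, -1))
    f_t = np.fft.fft2(x_t, axes=(-2, -1))
    amp_s, pha_s = np.abs(f_s), np.angle(f_s)
    amp_t = np.abs(f_t)
    # M * F^A(x_t) + (1 - M) * F^A(x_s), keeping the source phase
    m = center_mask(x_s.shape[-2], x_s.shape[-1], beta)
    amp = np.where(m, amp_t, amp_s)
    out = np.fft.ifft2(amp * np.exp(1j * pha_s), axes=(-2, -1))
    return np.real(out)
```

With `beta=0` only the zero-frequency amplitude is swapped, so for non-negative images each channel of the result takes on the mean intensity of the target image.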

Parameters
  • image_list (sequence[str]) – A sequence of image paths from the target domain.

  • amplitude_dir (str) – Directory in which to store the amplitude components of the target images.

  • beta (int, optional) – \(β\). Default: 1.

  • rebuild (bool, optional) – whether to rebuild the amplitude components of the target images in the given directory. Default: False.

Inputs:
  • image (PIL Image): image from the source domain, \(x^s\).

Examples

>>> from PIL import Image
>>> from dalib.translation.fourier_transform import FourierTransform
>>> image_list = ["target_image_path1", "target_image_path2"]
>>> amplitude_dir = "path/to/amplitude_dir"
>>> fourier_transform = FourierTransform(image_list, amplitude_dir, beta=1, rebuild=False)
>>> source_image = Image.open("path/to/source_image")  # image from source domain
>>> source_image_in_target_style = fourier_transform(source_image)

Note

The meaning of \(β\) is different from that of the original paper. Experimentally, we found that the size of the center region in the frequency space should stay constant as the image size increases. Thus we make the size of the center region independent of the image size. A recommended value for \(β\) is 1.
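
As a small illustration of this note, and assuming the center region is the \((2β+1) \times (2β+1)\) square implied by the mask definition above, the share of the spectrum that gets replaced shrinks as the image grows. The helper name `replaced_fraction` is hypothetical.

```python
def replaced_fraction(height, width, beta=1):
    """Hypothetical helper: fraction of frequency bins inside the center region."""
    side = 2 * beta + 1  # the region spans indices -beta..beta on each axis
    return side * side / (height * width)


# The region always covers 9 bins for beta=1, whatever the image size
small = replaced_fraction(256, 512)
large = replaced_fraction(1024, 2048)
```

Here `large` is 16 times smaller than `small`, because the image area grew 16-fold while the region stayed at 9 bins.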

Note

The image structure of the source domain and target domain should be as similar as possible; thus, for segmentation tasks, FourierTransform should be applied before RandomResizedCrop and other transformations.

Note

The image sizes of the source domain and the target domain need to be the same, so use Resize to convert the source image to the target image size before applying FourierTransform.

Examples

>>> from dalib.translation.fourier_transform import FourierTransform
>>> import common.vision.datasets.segmentation.transforms as T
>>> from PIL import Image
>>> target_image_list = ["target_image_path1", "target_image_path2"]
>>> amplitude_dir = "path/to/amplitude_dir"
>>> # build a Fourier transform that translates source images to the target style
>>> fourier_transform = T.wrapper(FourierTransform)(target_image_list, amplitude_dir)
>>> transforms = T.Compose([
...     # resize the source image to the target image size before the Fourier transform
...     T.Resize((2048, 1024)),
...     fourier_transform,
...     T.RandomResizedCrop((1024, 512)),
...     T.RandomHorizontalFlip(),
... ])
>>> source_image = Image.open("path/to/source_image")  # image from source domain
>>> source_image_in_target_style = transforms(source_image)

dalib.translation.fourier_transform.low_freq_mutate(amp_src, amp_trg, beta=1)[source]

Parameters
  • amp_src (numpy.ndarray) – amplitude component of the Fourier transform of source image

  • amp_trg (numpy.ndarray) – amplitude component of the Fourier transform of target image

  • beta (int, optional) – the size of the center region to be replaced. Default: 1

Returns

amplitude component of the Fourier transform of source image whose low-frequency component is replaced by that of the target image.
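
The replacement described above can be sketched with `np.fft.fftshift`, which moves the zero-frequency bin to the center so the low-frequency region becomes a plain slice. This is an illustrative re-implementation, not the library's source: the name `low_freq_mutate_sketch` is hypothetical, and the amplitudes are assumed to have shape (C, H, W) with `beta` smaller than half of each spatial dimension.

```python
import numpy as np


def low_freq_mutate_sketch(amp_src, amp_trg, beta=1):
    """Hypothetical sketch: swap the centered low-frequency block of two amplitudes."""
    # fftshift returns a copy, so the caller's arrays are left untouched
    a_src = np.fft.fftshift(amp_src, axes=(-2, -1))
    a_trg = np.fft.fftshift(amp_trg, axes=(-2, -1))

    # After the shift, the zero-frequency bin sits at index (H // 2, W // 2)
    h, w = a_src.shape[-2:]
    c_h, c_w = h // 2, w // 2
    h1, h2 = c_h - beta, c_h + beta + 1
    w1, w2 = c_w - beta, c_w + beta + 1

    # Replace the (2 * beta + 1) x (2 * beta + 1) center block with the target's
    a_src[..., h1:h2, w1:w2] = a_trg[..., h1:h2, w1:w2]

    # Undo the shift so the result is back in standard FFT layout
    return np.fft.ifftshift(a_src, axes=(-2, -1))
```

Slicing on the shifted spectrum avoids any wrap-around index arithmetic, at the cost of two extra array copies.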

dalib.adaptation.segmentation.fda.robust_entropy(y, ita=1.5, num_classes=19, reduction='mean')[source]

Robust entropy proposed in FDA: Fourier Domain Adaptation for Semantic Segmentation (CVPR 2020)

Parameters
  • y (tensor) – logits output of segmentation model in shape of \((N, C, H, W)\)

  • ita (float, optional) – parameter of the robust entropy. Default: 1.5

  • num_classes (int, optional) – number of classes. Default: 19

  • reduction (string, optional) – Specifies the reduction to apply to the output: 'none' | 'mean'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output. Default: 'mean'

Returns

Scalar by default. If reduction is 'none', then \((N, )\).
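
For intuition, the computation can be sketched in NumPy under several assumptions that this page does not confirm: a Charbonnier-style penalty \((x^2 + ε)^{ita}\) as in the FDA paper, per-pixel entropy normalized by \(\log(\text{num\_classes})\), and \(ε = 10^{-8}\). The name `robust_entropy_sketch` is hypothetical, and only the 'mean' reduction is shown.

```python
import numpy as np


def robust_entropy_sketch(logits, ita=1.5, num_classes=19, eps=1e-8):
    """Hypothetical sketch: mean robust entropy of logits with shape (N, C, H, W)."""
    # Numerically stable log-softmax over the class axis
    z = logits - logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    p = np.exp(log_p)

    # Per-pixel Shannon entropy, scaled to [0, 1] (assumed normalization)
    ent = -(p * log_p).sum(axis=1) / np.log(num_classes)

    # Charbonnier-style penalty (assumed form), then the 'mean' reduction
    return float(((ent ** 2 + eps) ** ita).mean())
```

With `ita` above 0.5 the penalty down-weights pixels whose entropy is already low, so confident predictions contribute less than uncertain ones.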
