Vision Transforms

Classification

class common.vision.transforms.DeNormalizeAndTranspose(mean=(104.00698793, 116.66876762, 122.67891434))[source]

First, convert a tensor image from the shape (C x H x W) to the shape (H x W x C). Then, denormalize it with mean and standard deviation.

class common.vision.transforms.Denormalize(mean, std)[source]

Denormalize a tensor image with mean and standard deviation. Given mean: (mean[1], ..., mean[n]) and std: (std[1], ..., std[n]) for n channels, this transform denormalizes each channel of the input torch.*Tensor, i.e., output[channel] = input[channel] * std[channel] + mean[channel].

Note

This transform acts out of place, i.e., it does not mutate the input tensor.

Parameters
  • mean (sequence) – Sequence of means for each channel.

  • std (sequence) – Sequence of standard deviations for each channel.
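The per-channel formula above can be checked with a minimal pure-Python sketch. The helper name `denormalize` and the nested-list layout (channels x height x width) are assumptions made for illustration; the library class itself operates on torch tensors.

```python
def denormalize(image, mean, std):
    """Return a new C x H x W nested list with output[c] = input[c] * std[c] + mean[c]."""
    if not (len(image) == len(mean) == len(std)):
        raise ValueError("image, mean and std must have the same number of channels")
    # Build new lists so the input is left untouched (out of place, as noted above).
    return [
        [[pixel * s + m for pixel in row] for row in channel]
        for channel, m, s in zip(image, mean, std)
    ]


normalized = [[[0.0, 1.0]], [[-1.0, 0.5]]]  # 2 channels, 1 x 2 pixels each
restored = denormalize(normalized, mean=(0.5, 0.25), std=(0.5, 0.5))
```

Because the function builds new lists, `normalized` is unchanged after the call, mirroring the out-of-place behavior described in the note.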

class common.vision.transforms.MultipleApply(transforms)[source]

Apply a list of transformations to an image and get multiple transformed images.

Parameters

transforms (list or tuple) – list of transformations
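As a rough illustration of the behavior described above, the following sketch applies every transform to the same input and collects the results. The class name `MultipleApplySketch` is hypothetical, and returning a list is an assumption; the library class may return a different container.

```python
class MultipleApplySketch:
    """Apply every transform in `transforms` to the same input and collect the results."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, image):
        # Each transform sees the original input, not the output of the previous one.
        return [transform(image) for transform in self.transforms]


# Strings stand in for images so the sketch runs without any imaging library.
two_views = MultipleApplySketch([str.upper, str.lower])("Image")
```

This differs from composing transforms, where the output of one transform becomes the input of the next.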

Example

>>> import torchvision.transforms as T
>>> from common.vision.transforms import ResizeImage, MultipleApply
>>> transform1 = T.Compose([
...     ResizeImage(256),
...     T.RandomCrop(224)
... ])
>>> transform2 = T.Compose([
...     ResizeImage(256),
...     T.RandomCrop(224),
... ])
>>> multiply_transform = MultipleApply([transform1, transform2])

class common.vision.transforms.NormalizeAndTranspose(mean=(104.00698793, 116.66876762, 122.67891434))[source]

First, normalize a tensor image with mean and standard deviation. Then, convert the shape (H x W x C) to shape (C x H x W).
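The relationship between this transform and DeNormalizeAndTranspose can be sketched in pure Python. This is only an illustration under stated assumptions: since the signature takes a mean and no standard deviation, normalization is taken here to be mean subtraction; the function names and nested-list images are hypothetical, and any additional steps the library performs (such as type conversion) are not modeled.

```python
def normalize_and_transpose(image_hwc, mean):
    """Subtract the per-channel mean, then reorder H x W x C to C x H x W."""
    return [
        [[pixel[c] - mean[c] for pixel in row] for row in image_hwc]
        for c in range(len(mean))
    ]


def denormalize_and_transpose(image_chw, mean):
    """Reorder C x H x W to H x W x C, then add the per-channel mean back."""
    height, width = len(image_chw[0]), len(image_chw[0][0])
    return [
        [[image_chw[c][y][x] + mean[c] for c in range(len(mean))] for x in range(width)]
        for y in range(height)
    ]


mean = (104.0, 117.0, 123.0)  # rounded stand-ins for the default mean
image = [[[110.0, 120.0, 130.0], [100.0, 110.0, 120.0]]]  # 1 x 2 pixels, 3 channels
chw = normalize_and_transpose(image, mean)
round_trip = denormalize_and_transpose(chw, mean)
```

Under these assumptions the two functions are inverses of each other, so `round_trip` equals `image`.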

class common.vision.transforms.RandomErasing(probability=0.5, sl=0.02, sh=0.4, r1=0.3, mean=(0.4914, 0.4822, 0.4465))[source]

Random erasing augmentation from Random Erasing Data Augmentation (CVPR 2017). This augmentation randomly selects a rectangle region in an image and erases its pixels.

Parameters
  • probability (float) – The probability that the Random Erasing operation will be performed.

  • sl (float) – Minimum proportion of erased area against input image.

  • sh (float) – Maximum proportion of erased area against input image.

  • r1 (float) – Minimum aspect ratio of erased area.

  • mean (sequence) – Value to fill the erased area.
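How the parameters above interact can be sketched as follows. This is a minimal pure-Python illustration of the rectangle selection described in the Random Erasing paper, not the library's implementation: the function names, the `max_attempts` and `rng` arguments, the single-channel nested-list image, and the scalar `fill` value (standing in for the per-channel `mean`) are all assumptions.

```python
import math
import random


def sample_erase_box(height, width, sl=0.02, sh=0.4, r1=0.3, max_attempts=100, rng=random):
    """Return (top, left, h, w) of a rectangle to erase, or None if no attempt fits."""
    area = height * width
    for _ in range(max_attempts):
        # Erased area is a fraction in [sl, sh] of the image area.
        target_area = rng.uniform(sl, sh) * area
        # Aspect ratio is drawn from [r1, 1 / r1].
        aspect_ratio = rng.uniform(r1, 1.0 / r1)
        h = int(round(math.sqrt(target_area * aspect_ratio)))
        w = int(round(math.sqrt(target_area / aspect_ratio)))
        if 0 < h < height and 0 < w < width:
            top = rng.randint(0, height - h)
            left = rng.randint(0, width - w)
            return top, left, h, w
    return None


def random_erase(image, probability=0.5, fill=0.0, rng=random, **box_kwargs):
    """Erase a random rectangle in an H x W nested list with the given probability."""
    if rng.random() > probability:
        return image
    box = sample_erase_box(len(image), len(image[0]), rng=rng, **box_kwargs)
    if box is None:
        return image
    top, left, h, w = box
    erased = [row[:] for row in image]  # copy rows so the input is not modified
    for y in range(top, top + h):
        for x in range(left, left + w):
            erased[y][x] = fill
    return erased
```

Passing a seeded `random.Random` instance as `rng` makes the sampled rectangle reproducible, which is convenient when debugging an augmentation pipeline.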

class common.vision.transforms.ResizeImage(size)[source]

Resize the input PIL Image to the given size.

Parameters

size (sequence or int) – Desired output size. If size is a sequence like (h, w), output size will be matched to this. If size is an int, output size will be (size, size)
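The size rule above amounts to the following small helper. The name `to_output_size` is hypothetical and the sketch only normalizes the argument; it does not perform any resizing.

```python
def to_output_size(size):
    """Normalize `size` to an (h, w) tuple following the rule described above."""
    if isinstance(size, int):
        # A single int requests a square output.
        return (size, size)
    h, w = size
    return (int(h), int(w))


square = to_output_size(256)
rectangle = to_output_size((224, 320))
```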

Segmentation

@author: Junguang Jiang @contact: JiangJunguang1123@outlook.com

common.vision.transforms.segmentation.ColorJitter

alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform

class common.vision.transforms.segmentation.Compose(transforms)[source]

Composes several transforms together.

Parameters

transforms (list) – list of transforms to compose.

Example

>>> Compose([
...     Resize((512, 512)),
...     RandomHorizontalFlip()
... ])

common.vision.transforms.segmentation.MultipleApply

alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform

common.vision.transforms.segmentation.Normalize

alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform

common.vision.transforms.segmentation.NormalizeAndTranspose

alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform

class common.vision.transforms.segmentation.RandomApply(transforms, p=0.5)[source]

Apply randomly a list of transformations with a given probability.

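Parameters
  • transforms (list or tuple) – list of transformations.

  • p (float) – probability of applying the transformations. Default value is 0.5.

class common.vision.transforms.segmentation.RandomChoice(transforms)[source]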

Apply single transformation randomly picked from a list.

class common.vision.transforms.segmentation.RandomCrop(size)[source]

Crop the given image at a random location. The image can be a PIL Image.

Parameters

size (sequence) – Desired output size of the crop.

forward(image, label)[source]
Parameters
  • image – (PIL Image): Image to be cropped.

  • label – (PIL Image): Segmentation label to be cropped.

Returns

Cropped image, cropped segmentation label.

class common.vision.transforms.segmentation.RandomHorizontalFlip(p=0.5)[source]

Horizontally flip the given PIL Image randomly with a given probability.

Parameters

p (float) – probability of the image being flipped. Default value is 0.5.

forward(image, label)[source]
Parameters
  • image – (PIL Image): Image to be flipped.

  • label – (PIL Image): Segmentation label to be flipped.

Returns

Randomly flipped image, randomly flipped segmentation label.
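The key property of a paired transform like this one is that a single random draw decides the flip for both the image and its label, so pixels and labels stay aligned. A minimal sketch, assuming nested lists in place of PIL Images and a hypothetical function name:

```python
import random


def random_horizontal_flip(image, label, p=0.5, rng=random):
    """Flip image and label together with probability p (each row is a list of pixels)."""
    if rng.random() < p:
        # One random draw controls both flips, keeping image and label aligned.
        return [row[::-1] for row in image], [row[::-1] for row in label]
    return image, label


image = [[1, 2, 3], [4, 5, 6]]
label = [[0, 0, 1], [1, 1, 0]]
flipped_image, flipped_label = random_horizontal_flip(image, label, p=1.0)
```

Drawing two independent random numbers, one per input, would flip the image and the label independently and corrupt roughly half of the flipped training pairs.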

class common.vision.transforms.segmentation.RandomResizedCrop(size, scale=(0.5, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=3)[source]

Crop the given image to random size and aspect ratio. The image can be a PIL Image.

A crop with a random size (default: 0.5 to 1.0 times the original size) and a random aspect ratio (default: 3/4 to 4/3 times the original aspect ratio) is made. This crop is finally resized to the given size.

Parameters
  • size (int or sequence) – expected output size of each edge. If size is an int instead of sequence like (h, w), a square output size (size, size) is made. If provided a tuple or list of length 1, it will be interpreted as (size[0], size[0]).

  • scale (tuple of float) – range of the crop size relative to the original image size.

  • ratio (tuple of float) – range of the crop aspect ratio relative to the original aspect ratio.

  • interpolation – Interpolation method. Default: PIL.Image.BICUBIC (the value 3 in the signature).

forward(image, label)[source]
Parameters
  • image – (PIL Image): Image to be cropped and resized.

  • label – (PIL Image): Segmentation label to be cropped and resized.

Returns

Randomly cropped and resized image, randomly cropped and resized segmentation label.

static get_params(img, scale, ratio)[source]

Get parameters for crop for a random sized crop.

Parameters
  • img (PIL Image) – Input image.

  • scale (list) – range of the crop size relative to the original image size.

  • ratio (list) – range of the crop aspect ratio relative to the original aspect ratio.

Returns

params (i, j, h, w) to be passed to crop for a random sized crop.
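One common way to sample such parameters is rejection sampling over area and aspect ratio. The sketch below is modeled on that widely used approach and is an assumption about the general technique, not the library's code: the function name is hypothetical, it takes the image width and height instead of a PIL Image, and the fallback simply returns the whole image.

```python
import math
import random


def get_crop_params(width, height, scale=(0.5, 1.0), ratio=(3 / 4, 4 / 3), rng=random):
    """Sample (i, j, h, w): top, left, height and width of a random crop box."""
    area = width * height
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(10):
        target_area = rng.uniform(*scale) * area
        # Sample the aspect ratio log-uniformly so 3/4 and 4/3 are equally likely.
        aspect_ratio = math.exp(rng.uniform(*log_ratio))
        w = int(round(math.sqrt(target_area * aspect_ratio)))
        h = int(round(math.sqrt(target_area / aspect_ratio)))
        if 0 < w <= width and 0 < h <= height:
            i = rng.randint(0, height - h)
            j = rng.randint(0, width - w)
            return i, j, h, w
    # Fallback: use the whole image if no sampled box fits.
    return 0, 0, height, width
```

The returned tuple follows the (i, j, h, w) order described above, so the same box can be applied to both the image and its segmentation label.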

class common.vision.transforms.segmentation.Resize(image_size, label_size=None)[source]

Resize the input image and the corresponding label to the given size. The image should be a PIL Image.

Parameters
  • image_size (sequence) – The requested image size in pixels, as a 2-tuple: (width, height).

  • label_size (sequence, optional) – The requested segmentation label size in pixels, as a 2-tuple: (width, height). The same as image_size if None. Default: None.

forward(image, label)[source]
Parameters
  • image – (PIL Image): Image to be scaled.

  • label – (PIL Image): Segmentation label to be scaled.

Returns

Rescaled image, rescaled segmentation label.

common.vision.transforms.segmentation.ToPILImage

alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform

common.vision.transforms.segmentation.ToTensor

alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform

common.vision.transforms.segmentation.wrapper(transform)[source]

Wrap a transform for classification into a transform for segmentation. Note that the wrapped transform is applied to the image only; the segmentation label is returned unchanged.

Parameters

transform (class, callable) – transform for classification

Returns

transform for segmentation
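The wrapping idea can be sketched as a function that subclasses a classification transform and forwards only the image to it. This is a hedged illustration, not the library's source: `Brighten` and `ComposeSketch` are hypothetical names, and nested lists stand in for PIL Images.

```python
def wrapper(transform_cls):
    """Build a segmentation transform class from a classification transform class."""

    class WrapperTransform(transform_cls):
        def __call__(self, image, label):
            # Only the image goes through the wrapped transform; the label passes through.
            return super().__call__(image), label

    return WrapperTransform


class Brighten:
    """Hypothetical classification transform: add a constant to every pixel."""

    def __init__(self, delta):
        self.delta = delta

    def __call__(self, image):
        return [[pixel + self.delta for pixel in row] for row in image]


class ComposeSketch:
    """Chain paired transforms, threading (image, label) through each one."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, image, label):
        for transform in self.transforms:
            image, label = transform(image, label)
        return image, label


SegBrighten = wrapper(Brighten)
image, label = SegBrighten(10)([[1, 2]], [[0, 1]])
```

This pattern suits photometric transforms such as color jitter or normalization. Geometric transforms (crops, flips, resizes) must change the label as well, which is why they are documented above as separate classes with their own `forward(image, label)`.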

Keypoint Detection

@author: Junguang Jiang @contact: JiangJunguang1123@outlook.com

class common.vision.transforms.keypoint_detection.CenterCrop(size)[source]

Crops the given PIL Image at the center.

common.vision.transforms.keypoint_detection.ColorJitter

alias of common.vision.transforms.keypoint_detection.wrapper.<locals>.WrapperTransform

class common.vision.transforms.keypoint_detection.Compose(transforms)[source]

Composes several transforms together.

Parameters

transforms (list of Transform objects) – list of transforms to compose.

common.vision.transforms.keypoint_detection.Normalize

alias of common.vision.transforms.keypoint_detection.wrapper.<locals>.WrapperTransform

class common.vision.transforms.keypoint_detection.RandomApply(transforms, p=0.5)[source]

Apply randomly a list of transformations with a given probability.

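Parameters
  • transforms (list or tuple) – list of transformations.

  • p (float) – probability of applying the transformations. Default value is 0.5.

class common.vision.transforms.keypoint_detection.RandomResizedCrop(size, scale=(0.6, 1.3), interpolation=2)[source]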

Crop the given PIL Image to random size and aspect ratio.

A crop of random size (default: 0.6 to 1.3 times the original size) is made. This crop is finally resized to the given size. This is popularly used to train the Inception networks.

Parameters
  • size – expected output size of each edge.

  • scale – range of the crop size relative to the original image size.

  • ratio – range of the crop aspect ratio relative to the original aspect ratio. This parameter does not appear in the constructor signature above.

  • interpolation – Default: PIL.Image.BILINEAR

static get_params(img, scale)[source]

Get parameters for crop for a random sized crop.

Parameters
  • img (PIL Image) – Image to be cropped.

  • scale (tuple) – range of the crop size relative to the original image size.

Returns

params (i, j, h, w) to be passed to crop for a random sized crop.

Return type

tuple

class common.vision.transforms.keypoint_detection.RandomRotation(degrees)[source]

Rotate the image by angle.

Parameters

degrees (sequence or float or int) – Range of degrees to select from. If degrees is a number instead of sequence like (min, max), the range of degrees will be (-degrees, +degrees).

static get_params(degrees)[source]

Get parameters for rotate for a random rotation.

Returns

params to be passed to rotate for random rotation.

Return type

sequence
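The rule for interpreting `degrees` can be sketched as below. The helper names are hypothetical, returning a single float angle is an assumption, and rejecting a negative scalar is an assumption borrowed from common practice rather than something this page states.

```python
import numbers
import random


def to_degree_range(degrees):
    """Turn a number d into (-d, +d); pass a (min, max) sequence through."""
    if isinstance(degrees, numbers.Number):
        if degrees < 0:
            # Assumption: a single number must be non-negative to form a valid range.
            raise ValueError("If degrees is a single number, it must be non-negative.")
        return (-degrees, degrees)
    low, high = degrees
    return (low, high)


def sample_angle(degrees, rng=random):
    """Draw a rotation angle uniformly from the configured range."""
    low, high = to_degree_range(degrees)
    return rng.uniform(low, high)


symmetric = to_degree_range(30)
explicit = to_degree_range((10, 20))
```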

class common.vision.transforms.keypoint_detection.Resize(size, interpolation=2)[source]

Resize the input PIL Image to the given size.

class common.vision.transforms.keypoint_detection.ResizePad(size, interpolation=2)[source]

Pad the given image on all sides with the given “pad” value to resize the image to the given size.
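One plausible reading of this description is: scale the image so its longer side equals `size`, then pad the shorter side to obtain a square. The sketch below computes that geometry and the matching keypoint update. It is an assumption about the behavior, with hypothetical function names, and it treats `size` as a single int.

```python
def resize_pad_geometry(width, height, size):
    """Return ((new_w, new_h), (left, top, right, bottom)) for a size x size output."""
    if width >= height:
        new_w, new_h = size, int(size * height / width)
    else:
        new_w, new_h = int(size * width / height), size
    pad_w, pad_h = size - new_w, size - new_h
    # Split the padding as evenly as possible between the two sides.
    padding = (pad_w // 2, pad_h // 2, pad_w - pad_w // 2, pad_h - pad_h // 2)
    return (new_w, new_h), padding


def resize_pad_keypoints(keypoints, width, height, size):
    """Map (x, y) keypoints from the original image into the resized, padded image."""
    (new_w, new_h), (left, top, _, _) = resize_pad_geometry(width, height, size)
    return [(x * new_w / width + left, y * new_h / height + top) for x, y in keypoints]


geometry = resize_pad_geometry(200, 100, 256)
moved = resize_pad_keypoints([(100, 50)], 200, 100, 256)
```

In this example a 200 x 100 image becomes 256 x 128 with 64 rows of padding above and below, and the image center maps to the center of the padded output.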

common.vision.transforms.keypoint_detection.ToTensor

alias of common.vision.transforms.keypoint_detection.wrapper.<locals>.WrapperTransform

common.vision.transforms.keypoint_detection.center_crop(image, output_size, keypoint2d)[source]

Crop the given PIL Image and resize it to desired size.

Parameters
  • image (PIL Image) – Image to be cropped. (0,0) denotes the top left corner of the image.

  • output_size (sequence or int) – (height, width) of the crop box. If int, it is used for both directions

Returns

Cropped image.

Return type

PIL Image
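For keypoint detection, cropping an image also shifts the keypoint coordinates by the crop offset. The following sketch computes a centered crop box and applies that shift. The function names are hypothetical, keypoints are assumed to be (x, y) pairs in pixels, and how the library handles `keypoint2d` internally is not stated on this page.

```python
def center_crop_box(width, height, output_size):
    """Return (top, left, crop_h, crop_w) of a crop box centered in the image."""
    if isinstance(output_size, int):
        output_size = (output_size, output_size)
    crop_h, crop_w = output_size
    top = int(round((height - crop_h) / 2.0))
    left = int(round((width - crop_w) / 2.0))
    return top, left, crop_h, crop_w


def shift_keypoints(keypoints, top, left):
    """Express (x, y) keypoints relative to the top left corner of the crop box."""
    return [(x - left, y - top) for x, y in keypoints]


top, left, crop_h, crop_w = center_crop_box(640, 480, 224)
cropped_keypoints = shift_keypoints([(320, 240)], top, left)
```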

common.vision.transforms.keypoint_detection.resized_crop(img, top, left, height, width, size, interpolation=2, keypoint2d=None, intrinsic_matrix=None)[source]

Crop the given PIL Image and resize it to desired size.

Notably used in RandomResizedCrop.

Parameters
  • img (PIL Image) – Image to be cropped. (0,0) denotes the top left corner of the image.

  • top (int) – Vertical component of the top left corner of the crop box.

  • left (int) – Horizontal component of the top left corner of the crop box.

  • height (int) – Height of the crop box.

  • width (int) – Width of the crop box.

  • size (sequence or int) – Desired output size. Same semantics as resize.

  • interpolation (int, optional) – Desired interpolation. Default is PIL.Image.BILINEAR.

Returns

Cropped image.

Return type

PIL Image
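The signature also accepts `keypoint2d` and `intrinsic_matrix`, which this page does not document. Geometrically, a crop followed by a resize translates and then scales pixel coordinates, and a pinhole intrinsic matrix changes accordingly. The sketch below shows that standard geometry under stated assumptions: the function names are hypothetical, `size` is an (h, w) pair, keypoints are (x, y) pairs, the matrix has the form [[fx, 0, cx], [0, fy, cy], [0, 0, 1]], and half-pixel offsets are ignored.

```python
def resized_crop_keypoints(keypoints, top, left, height, width, size):
    """Map (x, y) keypoints through a crop box and a resize to size = (h, w)."""
    out_h, out_w = size
    sx, sy = out_w / width, out_h / height
    # Translate into the crop box first, then scale to the output resolution.
    return [((x - left) * sx, (y - top) * sy) for x, y in keypoints]


def resized_crop_intrinsics(matrix, top, left, height, width, size):
    """Update a 3 x 3 pinhole intrinsic matrix for the same crop and resize."""
    out_h, out_w = size
    sx, sy = out_w / width, out_h / height
    fx, cx = matrix[0][0], matrix[0][2]
    fy, cy = matrix[1][1], matrix[1][2]
    return [
        [fx * sx, 0.0, (cx - left) * sx],
        [0.0, fy * sy, (cy - top) * sy],
        [0.0, 0.0, 1.0],
    ]


points = resized_crop_keypoints([(150, 100)], top=50, left=100, height=100, width=200, size=(50, 100))
K = resized_crop_intrinsics(
    [[1000.0, 0.0, 320.0], [0.0, 1000.0, 240.0], [0.0, 0.0, 1.0]],
    top=50, left=100, height=100, width=200, size=(50, 100),
)
```

Keeping the keypoints and the intrinsic matrix consistent with the image in this way is what allows 2D annotations and camera geometry to remain valid after augmentation.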

common.vision.transforms.keypoint_detection.wrapper(transform)[source]

Wrap a transform for classification into a transform for keypoint detection. Note that the wrapped transform is applied to the image only; the keypoint detection label is returned unchanged.

Parameters

transform (class, callable) – transform for classification

Returns

transform for keypoint detection
