Vision Transforms
Classification
- class common.vision.transforms.DeNormalizeAndTranspose(mean=(104.00698793, 116.66876762, 122.67891434))
  First, convert a tensor image from shape (C x H x W) to shape (H x W x C). Then, denormalize it with mean and standard deviation.
- class common.vision.transforms.Denormalize(mean, std)
  Denormalize a tensor image with mean and standard deviation. Given mean: (mean[1], ..., mean[n]) and std: (std[1], ..., std[n]) for n channels, this transform denormalizes each channel of the input torch.*Tensor, i.e.,
  output[channel] = input[channel] * std[channel] + mean[channel]

  Note: This transform acts out of place, i.e., it does not mutate the input tensor.

  Parameters:
    mean (sequence) – Sequence of means for each channel.
    std (sequence) – Sequence of standard deviations for each channel.
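  Example (a minimal sketch, not taken from the library docs: it assumes Denormalize is callable like a torchvision transform; the ImageNet statistics below are illustrative values, not defaults of this class)
  >>> import torch
  >>> from common.vision.transforms import Denormalize
  >>> mean, std = (0.485, 0.456, 0.406), (0.229, 0.224, 0.225)
  >>> normalized = torch.rand(3, 224, 224)    # a normalized (C x H x W) image tensor
  >>> denorm = Denormalize(mean=mean, std=std)
  >>> restored = denorm(normalized)           # out of place: `normalized` is unchanged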
- class common.vision.transforms.MultipleApply(transforms)
  Apply a list of transformations to an image and get multiple transformed images.

  Example
  >>> transform1 = T.Compose([
  ...     ResizeImage(256),
  ...     T.RandomCrop(224)
  ... ])
  >>> transform2 = T.Compose([
  ...     ResizeImage(256),
  ...     T.RandomCrop(224),
  ... ])
  >>> multiply_transform = MultipleApply([transform1, transform2])
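  Continuing the example above as a hedged sketch: applying the composed object to an image is assumed to return one output per wrapped transform (here, two independently cropped views); the file name is hypothetical.
  >>> from PIL import Image
  >>> img = Image.open("example.jpg")             # hypothetical input path
  >>> crop_a, crop_b = multiply_transform(img)    # two independently transformed views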
- class common.vision.transforms.NormalizeAndTranspose(mean=(104.00698793, 116.66876762, 122.67891434))
  First, normalize a tensor image with mean and standard deviation. Then, convert its shape from (H x W x C) to (C x H x W).
- class common.vision.transforms.RandomErasing(probability=0.5, sl=0.02, sh=0.4, r1=0.3, mean=(0.4914, 0.4822, 0.4465))
  Random erasing augmentation from Random Erasing Data Augmentation (Zhong et al., AAAI 2020). This augmentation randomly selects a rectangular region in an image and erases its pixels.

  Parameters:
    probability (float) – The probability that the random erasing operation will be performed.
    sl (float) – Minimum proportion of erased area against the input image.
    sh (float) – Maximum proportion of erased area against the input image.
    r1 (float) – Minimum aspect ratio of the erased area.
    mean (sequence) – Value to fill the erased area with.
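  A hedged pipeline sketch: it assumes RandomErasing operates on a tensor image (C x H x W), so it is placed after ToTensor/Normalize; the torchvision transforms and statistics are illustrative choices, not prescribed by this class.
  >>> import torchvision.transforms as T
  >>> from common.vision.transforms import RandomErasing
  >>> train_transform = T.Compose([
  ...     T.RandomResizedCrop(224),
  ...     T.RandomHorizontalFlip(),
  ...     T.ToTensor(),
  ...     T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
  ...     RandomErasing(probability=0.5, sl=0.02, sh=0.4, r1=0.3),
  ... ])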
Segmentation
Author: Junguang Jiang (contact: JiangJunguang1123@outlook.com)
- common.vision.transforms.segmentation.ColorJitter
  Alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform.
- class common.vision.transforms.segmentation.Compose(transforms)
  Composes several transforms together.

  Parameters:
    transforms (list) – List of transforms to compose.

  Example
  >>> Compose([
  >>>     Resize((512, 512)),
  >>>     RandomHorizontalFlip()
  >>> ])
- common.vision.transforms.segmentation.MultipleApply
  Alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform.
- common.vision.transforms.segmentation.Normalize
  Alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform.
- common.vision.transforms.segmentation.NormalizeAndTranspose
  Alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform.
- class common.vision.transforms.segmentation.RandomApply(transforms, p=0.5)
  Randomly apply a list of transformations with a given probability.

  Parameters:
    transforms (list or tuple or torch.nn.Module) – List of transformations.
    p (float) – Probability of applying the transformations.
- class common.vision.transforms.segmentation.RandomChoice(transforms)
  Apply a single transformation randomly picked from a list.
- class common.vision.transforms.segmentation.RandomCrop(size)
  Crop the given image at a random location. The image can be a PIL Image.

  Parameters:
    size (sequence) – Desired output size of the crop.
- class common.vision.transforms.segmentation.RandomHorizontalFlip(p=0.5)
  Horizontally flip the given PIL Image randomly with a given probability.

  Parameters:
    p (float) – Probability of the image being flipped. Default: 0.5.
- class common.vision.transforms.segmentation.RandomResizedCrop(size, scale=(0.5, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=3)
  Crop the given image to a random size and aspect ratio. The image can be a PIL Image.
  A crop of random size (default: 0.5 to 1.0 of the original size) and random aspect ratio (default: 3/4 to 4/3 of the original aspect ratio) is made. This crop is finally resized to the given size.

  Parameters:
    size (int or sequence) – Expected output size of each edge. If size is an int instead of a sequence like (h, w), a square output size (size, size) is made. If a tuple or list of length 1 is provided, it is interpreted as (size[0], size[0]).
    scale (tuple of float) – Range of the size of the cropped area, relative to the original size.
    ratio (tuple of float) – Range of the aspect ratio of the cropped area, relative to the original aspect ratio.
    interpolation – Desired interpolation. Default: 3 (PIL.Image.BICUBIC).

  forward(image, label)
    Parameters:
      image (PIL Image) – Image to be cropped and resized.
      label (PIL Image) – Segmentation label to be cropped and resized.
    Returns:
      Randomly cropped and resized image, and the correspondingly cropped and resized segmentation label.
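  A usage sketch of the joint call, assuming the segmentation transforms take and return (image, label) pairs as the forward signature above suggests; the file names are hypothetical.
  >>> from PIL import Image
  >>> from common.vision.transforms.segmentation import RandomResizedCrop
  >>> image = Image.open("scene.png")           # hypothetical image
  >>> label = Image.open("scene_label.png")     # hypothetical segmentation mask
  >>> transform = RandomResizedCrop(size=512)
  >>> image, label = transform(image, label)    # identically cropped and resized pair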
- class common.vision.transforms.segmentation.Resize(image_size, label_size=None)
  Resize the input image and the corresponding label to the given size. The image should be a PIL Image.

  Parameters:
    image_size (sequence) – The requested image size in pixels, as a 2-tuple: (width, height).
    label_size (sequence, optional) – The requested segmentation label size in pixels, as a 2-tuple: (width, height). The same as image_size if None. Default: None.
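  A short sketch, mainly to highlight the (width, height) ordering of image_size; the joint (image, label) call is assumed to follow the same convention as in the RandomResizedCrop sketch above.
  >>> from common.vision.transforms.segmentation import Resize
  >>> resize = Resize(image_size=(2048, 1024))   # (width, height); label resized to the same size
  >>> image, label = resize(image, label)        # reusing the PIL image/mask pair from above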
- common.vision.transforms.segmentation.ToPILImage
  Alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform.
- common.vision.transforms.segmentation.ToTensor
  Alias of common.vision.transforms.segmentation.wrapper.<locals>.WrapperTransform.
- common.vision.transforms.segmentation.wrapper(transform)
  Wrap a transform for classification into a transform for segmentation. Note that the segmentation label is left unchanged by the wrapped transform.

  Parameters:
    transform (class, callable) – Transform for classification.
  Returns:
    Transform for segmentation.
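  A minimal sketch of wrapping a classification transform, assuming wrapper() returns a WrapperTransform class (as the aliases above suggest) whose instances take (image, label) and modify only the image.
  >>> import torchvision.transforms as T
  >>> from common.vision.transforms.segmentation import wrapper
  >>> ColorJitterForSeg = wrapper(T.ColorJitter)             # wrap the class itself
  >>> jitter = ColorJitterForSeg(brightness=0.3, contrast=0.3)
  >>> image, label = jitter(image, label)                    # a PIL image and its mask; only `image` changes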
Keypoint Detection
Author: Junguang Jiang (contact: JiangJunguang1123@outlook.com)
- class common.vision.transforms.keypoint_detection.CenterCrop(size)
  Crop the given PIL Image at the center.
- common.vision.transforms.keypoint_detection.ColorJitter
  Alias of common.vision.transforms.keypoint_detection.wrapper.<locals>.WrapperTransform.
- class common.vision.transforms.keypoint_detection.Compose(transforms)
  Composes several transforms together.

  Parameters:
    transforms (list of Transform objects) – List of transforms to compose.
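  A hedged sketch of composing the keypoint-detection transforms documented in this section, assuming they share a joint (image, keypoint annotation) calling convention; the parameter values are illustrative only.
  >>> from common.vision.transforms.keypoint_detection import (
  ...     Compose, RandomResizedCrop, RandomRotation, ToTensor, Normalize)
  >>> train_transform = Compose([
  ...     RandomResizedCrop(size=256, scale=(0.6, 1.3)),
  ...     RandomRotation(degrees=30),
  ...     ToTensor(),
  ...     Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
  ... ])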
- common.vision.transforms.keypoint_detection.Normalize
  Alias of common.vision.transforms.keypoint_detection.wrapper.<locals>.WrapperTransform.
- class common.vision.transforms.keypoint_detection.RandomApply(transforms, p=0.5)
  Randomly apply a list of transformations with a given probability.

  Parameters:
    transforms (list or tuple or torch.nn.Module) – List of transformations.
    p (float) – Probability of applying the transformations.
- class common.vision.transforms.keypoint_detection.RandomResizedCrop(size, scale=(0.6, 1.3), interpolation=2)
  Crop the given PIL Image to a random size and aspect ratio.
  A crop of random size (default: 0.6 to 1.3 of the original size, per the scale argument) is made and finally resized to the given size. This kind of crop is popularly used to train Inception-style networks.

  Parameters:
    size – Expected output size of each edge.
    scale – Range of the size of the cropped area, relative to the original size.
    interpolation – Desired interpolation. Default: PIL.Image.BILINEAR.
- class common.vision.transforms.keypoint_detection.RandomRotation(degrees)
  Rotate the image by a random angle.
- class common.vision.transforms.keypoint_detection.Resize(size, interpolation=2)
  Resize the input PIL Image to the given size.
- class common.vision.transforms.keypoint_detection.ResizePad(size, interpolation=2)
  Pad the given image on all sides with the given "pad" value and resize it to the given size.
- common.vision.transforms.keypoint_detection.ToTensor
  Alias of common.vision.transforms.keypoint_detection.wrapper.<locals>.WrapperTransform.
- common.vision.transforms.keypoint_detection.center_crop(image, output_size, keypoint2d)
  Crop the given PIL Image at the center to the desired output size.

  Parameters:
    image (PIL Image) – Image to be cropped. (0, 0) denotes the top left corner of the image.
    output_size (sequence or int) – (height, width) of the crop box. If int, it is used for both dimensions.
  Returns:
    Cropped image.
  Return type:
    PIL Image
- common.vision.transforms.keypoint_detection.resized_crop(img, top, left, height, width, size, interpolation=2, keypoint2d=None, intrinsic_matrix=None)
  Crop the given PIL Image and resize it to the desired size. Notably used in RandomResizedCrop.

  Parameters:
    img (PIL Image) – Image to be cropped. (0, 0) denotes the top left corner of the image.
    top (int) – Vertical component of the top left corner of the crop box.
    left (int) – Horizontal component of the top left corner of the crop box.
    height (int) – Height of the crop box.
    width (int) – Width of the crop box.
    size (sequence or int) – Desired output size. Same semantics as resize.
    interpolation (int, optional) – Desired interpolation. Default: PIL.Image.BILINEAR.
  Returns:
    Cropped image.
  Return type:
    PIL Image
- common.vision.transforms.keypoint_detection.wrapper(transform)
  Wrap a transform for classification into a transform for keypoint detection. Note that the keypoint detection label is left unchanged by the wrapped transform.

  Parameters:
    transform (class, callable) – Transform for classification.
  Returns:
    Transform for keypoint detection.
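  An analogous sketch for keypoint detection, assuming the same wrapping behavior as the segmentation wrapper above; the choice of torchvision transform here is hypothetical, and the keypoint annotation is expected to pass through untouched.
  >>> import torchvision.transforms as T
  >>> from common.vision.transforms.keypoint_detection import wrapper
  >>> GrayscaleForKeypoints = wrapper(T.Grayscale)        # hypothetical choice of transform
  >>> to_gray = GrayscaleForKeypoints(num_output_channels=3)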