Loss
Knowledge Distillation
class common.loss.KnowledgeDistillationLoss(T=1.0, reduction='batchmean')
Knowledge Distillation Loss.
- Parameters:
- T (float) – Temperature. Default: 1.0
- reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum' | 'batchmean'. 'none': no reduction will be applied; 'mean': the sum of the output will be divided by the number of elements in the output; 'sum': the output will be summed; 'batchmean': the sum of the output will be divided by the batch size. Default: 'batchmean'
 
 - Inputs:
- y_student (tensor): logits output of the student 
- y_teacher (tensor): logits output of the teacher 
 
- Shape:
- y_student: (minibatch, num_classes) 
- y_teacher: (minibatch, num_classes) 
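
The entry above does not spell out the formula, so the following is only a sketch under a stated assumption: that the loss is the KL divergence between the teacher and student distributions after both sets of logits are divided by T and passed through a softmax, summed over classes and divided by the batch size (the 'batchmean' reduction). The names softmax and kd_loss_sketch are hypothetical helpers written in plain Python for illustration; they are not part of common.loss.

```python
import math


def softmax(logits, T=1.0):
    """Temperature-scaled softmax over one row of logits."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss_sketch(y_student, y_teacher, T=1.0):
    """Hypothetical reference, NOT the library code.

    Assumes the loss is KL(teacher || student) on temperature-softened
    distributions, summed over classes and divided by the batch size.
    Both inputs are lists of rows with shape (minibatch, num_classes).
    """
    total = 0.0
    for s_logits, t_logits in zip(y_student, y_teacher):
        p_student = softmax(s_logits, T)
        p_teacher = softmax(t_logits, T)
        total += sum(
            pt * (math.log(pt) - math.log(ps))
            for pt, ps in zip(p_teacher, p_student)
            if pt > 0.0
        )
    return total / len(y_student)


y_teacher = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
y_student = [[1.5, 0.7, -0.5], [0.0, 0.5, 2.0]]

loss_same = kd_loss_sketch(y_teacher, y_teacher, T=2.0)  # identical logits
loss_diff = kd_loss_sketch(y_student, y_teacher, T=2.0)  # mismatched logits
print(loss_same, loss_diff)
```

Under this assumption the loss is zero when the student reproduces the teacher's logits and positive otherwise. A larger T flattens both distributions before they are compared.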
 
 
Cross Entropy with Label Smooth
class common.vision.models.reid.loss.CrossEntropyLossWithLabelSmooth(num_classes, epsilon=0.1)
Cross-entropy loss with label smoothing, from Rethinking the Inception Architecture for Computer Vision (CVPR 2016).
Given one-hot labels \(labels \in \mathbb{R}^C\), where \(C\) is the number of classes, the smoothed labels are calculated as
\[smoothed\_labels = (1 - \epsilon) \times labels + \epsilon \times \frac{1}{C}\]
The smoothed labels replace the one-hot labels when computing the cross-entropy loss, which can help prevent overfitting.
- Parameters:
- num_classes (int) – number of classes, \(C\)
- epsilon (float) – smoothing factor, \(\epsilon\). Default: 0.1
 - Inputs:
- y (tensor): unnormalized classifier predictions, \(y\) 
- labels (tensor): ground truth labels, \(labels\) 
 
- Shape:
- y: \((minibatch, C)\), where \(C\) is the number of classes 
- labels: \((minibatch, )\)
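
The smoothing formula above can be sketched in plain Python. This is an illustration, not the library implementation: smooth_labels, log_softmax, and smoothed_cross_entropy are hypothetical helper names, and averaging the per-sample loss over the minibatch is an assumption the entry does not state.

```python
import math


def smooth_labels(label, num_classes, epsilon=0.1):
    """Apply (1 - epsilon) * one_hot + epsilon / C to one class index."""
    one_hot = [1.0 if c == label else 0.0 for c in range(num_classes)]
    return [(1 - epsilon) * v + epsilon / num_classes for v in one_hot]


def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def smoothed_cross_entropy(y, labels, epsilon=0.1):
    """Hypothetical reference, NOT the library code.

    y: unnormalized predictions, shape (minibatch, C), as a list of rows.
    labels: ground-truth class indices, shape (minibatch,).
    Assumption: the per-sample loss is averaged over the minibatch.
    """
    num_classes = len(y[0])
    total = 0.0
    for logits, label in zip(y, labels):
        targets = smooth_labels(label, num_classes, epsilon)
        log_probs = log_softmax(logits)
        total -= sum(t * lp for t, lp in zip(targets, log_probs))
    return total / len(y)


targets = smooth_labels(2, num_classes=4, epsilon=0.1)
y = [[0.0, 0.0, 5.0, 0.0], [1.0, 2.0, 0.5, 0.1]]
labels = [2, 1]
loss = smoothed_cross_entropy(y, labels, epsilon=0.1)
print(targets, loss)
```

For \(C = 4\) and \(\epsilon = 0.1\), the true class receives \(0.9 + 0.1/4 = 0.925\) and every other class receives \(0.1/4 = 0.025\), so the targets still sum to 1. Setting epsilon to 0 recovers ordinary cross entropy against one-hot labels.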