Contents

    • Preface
    • EIoU
      • Paper overview
      • Adding to YOLOv5
    • Alpha-IoU
      • Paper overview
      • Adding to YOLOv5
    • References

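Preface

This article uses YOLOv5 v6.1. Readers who are not yet familiar with the YOLOv5-6.x network structure can start with: [YOLOv5-6.x] Network Model & Source Code Analysis

Readers who want to try modifying YOLOv5-6.1 can refer to the following posts:

[Modifying YOLOv5-6.x (Part 1)] Combining the lightweight networks Shufflenetv2, Mobilenetv3 and Ghostnet

[Modifying YOLOv5-6.x (Part 2)] Adding the ACON activation function, the CBAM and CA attention mechanisms, and the weighted bidirectional feature pyramid BiFPN

[Modifying YOLOv5-6.x (Part 3)] YOLOv5s+Ghostconv+BiFPN+CA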

EIoU

Zhang, Yi-Fan, et al. “Focal and efficient IOU loss for accurate bounding box regression.” arXiv preprint arXiv:2101.08158 (2021).

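Paper link: https://arxiv.org/abs/2101.08158

Paper overview

The CIoU loss extends the DIoU loss with a term $v$ that measures the difference in aspect ratio between the predicted box and the GT box. This speeds up regression of the predicted box to some extent, but two significant problems remain:

  • During regression, once the width and height of the predicted box are in linear proportion to those of the GT box (that is, the two boxes share the same aspect ratio), $v = 0$ and the relative-proportion penalty added by CIoU no longer has any effect, even if the sizes still differ
  • From the gradients of $v$ with respect to the predicted w and h, $\partial v / \partial w = -(h / w)\,\partial v / \partial h$, so the two gradients always have opposite signs: when one of w and h increases, the other must decrease, and they cannot increase or decrease together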
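Both problems can be checked numerically. The following is a minimal sketch in plain Python, not part of YOLOv5; the helper names `ciou_aspect_term` and `ciou_aspect_grads` and the box sizes are made up for illustration, and the gradients are derived by hand from the definition of $v$.

```python
import math


def ciou_aspect_term(w, h, w_gt, h_gt):
    """CIoU aspect-ratio term: v = (4 / pi^2) * (atan(w_gt / h_gt) - atan(w / h))^2."""
    return (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2


def ciou_aspect_grads(w, h, w_gt, h_gt):
    """Analytic partial derivatives (dv/dw, dv/dh) of the term above."""
    delta = math.atan(w_gt / h_gt) - math.atan(w / h)
    scale = (8 / math.pi ** 2) * delta / (w ** 2 + h ** 2)
    return -scale * h, scale * w


# The ground truth is 40x20. A 10x5 prediction is far too small but has the
# same 2:1 ratio, so the aspect-ratio penalty vanishes.
v_same_ratio = ciou_aspect_term(10, 5, 40, 20)

# A 30x20 prediction has a different ratio, so v > 0, and the two partial
# derivatives have opposite signs.
v_diff_ratio = ciou_aspect_term(30, 20, 40, 20)
dv_dw, dv_dh = ciou_aspect_grads(30, 20, 40, 20)

print(v_same_ratio, v_diff_ratio, dv_dw, dv_dh)
```

In the first case the penalty gives no signal at all even though the prediction is four times too small in each dimension; in the second, any gradient step on $v$ that widens the box also shortens it.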

To address these problems, EIoU proposes a loss that penalizes the predicted w and h directly:

$$
\begin{aligned}
\mathcal{L}_{\mathrm{EIoU}} &= \mathcal{L}_{\mathrm{IoU}} + \mathcal{L}_{\mathrm{dis}} + \mathcal{L}_{\mathrm{asp}} \\
&= 1 - IoU + \frac{\rho^{2}\left(\mathbf{b}, \mathbf{b}^{\mathrm{gt}}\right)}{c^{2}} + \frac{\rho^{2}\left(w, w^{\mathrm{gt}}\right)}{C_{w}^{2}} + \frac{\rho^{2}\left(h, h^{\mathrm{gt}}\right)}{C_{h}^{2}}
\end{aligned}
$$

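  • Here $C_w$ and $C_h$ are the width and height of the smallest rectangle enclosing both the predicted box and the GT box
  • EIoU splits the loss into three parts:
    • the overlap loss between the predicted box and the GT box, $\mathcal{L}_{\mathrm{IoU}}$
    • the center-distance loss between the predicted box and the GT box, $\mathcal{L}_{\mathrm{dis}}$
    • the width and height loss between the predicted box and the GT box, $\mathcal{L}_{\mathrm{asp}}$
  • The first two parts of the EIoU loss follow the approach used in CIoU, while the width-height loss directly minimizes the difference between the widths and heights of the predicted box and the GT box, which makes convergence faster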
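The three parts can be computed by hand for a single pair of boxes. The following is a minimal scalar sketch in plain Python, meant only to make the decomposition concrete; the function name `eiou_terms` and the example boxes are illustrative assumptions, and it omits the `eps` terms a tensor implementation would need.

```python
def eiou_terms(box, box_gt):
    """Return (L_IoU, L_dis, L_asp) for two boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = box_gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Overlap: intersection over union
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    iou = inter / (w * h + gw * gh - inter)

    # Width and height of the smallest enclosing box (C_w, C_h)
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared distance between the two box centers
    rho2 = ((x1 + x2 - gx1 - gx2) ** 2 + (y1 + y2 - gy1 - gy2) ** 2) / 4

    l_iou = 1 - iou
    l_dis = rho2 / (cw ** 2 + ch ** 2)
    l_asp = (w - gw) ** 2 / cw ** 2 + (h - gh) ** 2 / ch ** 2
    return l_iou, l_dis, l_asp


# Concentric square boxes: same center and same aspect ratio, different size.
l_iou, l_dis, l_asp = eiou_terms((1.0, 1.0, 3.0, 3.0), (0.0, 0.0, 4.0, 4.0))
print(l_iou, l_dis, l_asp)
```

For these concentric squares the center-distance term is zero and the CIoU aspect-ratio term would also be zero, yet the width-height term is still non-zero, so EIoU keeps pushing w and h toward the GT values.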

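The figure below compares how the predicted box evolves over iterations under the GIoU, CIoU and EIoU losses. The red and green boxes trace the regression of the predicted box, the blue box is the GT box, and the black box is the preset anchor box:

  • The problem with GIoU is that its penalty is the area of the smallest enclosing rectangle minus the area of the union, so GIoU takes a detour: it first enlarges the union area and only then optimizes the IoU
  • The problem with CIoU is that the width and height cannot increase or decrease at the same time, whereas with EIoU they can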

In addition, the paper proposes weighting the EIoU loss in the spirit of Focal Loss:

$$
\mathcal{L}_{\mathrm{Focal\text{-}EIoU}} = IoU^{\gamma} \, \mathcal{L}_{\mathrm{EIoU}}
$$
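The effect of the weighting is easy to see on scalars. This is a minimal sketch rather than the paper's or YOLOv5's implementation; the function name `focal_eiou` is hypothetical, and the default `gamma=0.5` is an assumed value chosen for illustration.

```python
def focal_eiou(iou, eiou_loss, gamma=0.5):
    """Weight an EIoU loss value by IoU**gamma (gamma=0.5 is an assumed default)."""
    return (iou ** gamma) * eiou_loss


# With the same raw loss of 1.0, a poorly overlapping box (IoU = 0.25) keeps
# half of its loss, while a box with IoU = 1.0 keeps all of it.
low_quality = focal_eiou(0.25, 1.0)
high_quality = focal_eiou(1.0, 1.0)
print(low_quality, high_quality)
```

Setting `gamma=0` recovers the plain EIoU loss. In a tensor implementation the IoU used as the weight would usually be detached from the graph so that it acts purely as a weight; that is a common design choice, not something the formula itself requires.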

Adding to YOLOv5

  • In utils/metrics.py, find the bbox_iou function; you can comment out the original and replace it with the code below:
```python
import math

import torch


# Compute the requested IoU variant between two sets of boxes
def bbox_iou(box1, box2, x1y1x2y2=True, GIoU=False, DIoU=False, CIoU=False, EIoU=False, eps=1e-7):
    # Returns the IoU of box1 to box2. box1 is 4, box2 is nx4
    # Transpose so that each coordinate can be indexed along dim 0
    box2 = box2.T

    # Get the coordinates of bounding boxes
    if x1y1x2y2:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
    else:  # transform from xywh to xyxy (the branch used by the loss.py call, which passes x1y1x2y2=False)
        b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
        b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
        b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
        b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2

    # Intersection area
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)

    # Union Area
    w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps
    w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps
    union = w1 * h1 + w2 * h2 - inter + eps

    iou = inter / union

    # Penalty terms of the IoU-based box losses
    if CIoU or DIoU or GIoU or EIoU:
        cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)  # convex (smallest enclosing box) width
        ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)  # convex height
        if CIoU or DIoU or EIoU:  # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1
            c2 = cw ** 2 + ch ** 2 + eps  # convex diagonal squared
            rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2) ** 2 +
                    (b2_y1 + b2_y2 - b1_y1 - b1_y2) ** 2) / 4  # center distance squared
            if DIoU:
                return iou - rho2 / c2  # DIoU
            # CIoU adds an aspect-ratio constraint on top of DIoU: v * alpha
            elif CIoU:  # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47
                v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)
                with torch.no_grad():
                    alpha = v / (v - iou + (1 + eps))
                return iou - (rho2 / c2 + v * alpha)
            # EIoU replaces CIoU's aspect-ratio term with the squared width and height
            # differences between the two boxes, each normalized by the squared width or
            # height of the smallest enclosing box; this speeds up convergence and
            # improves regression accuracy
            elif EIoU:
                rho_w2 = ((b2_x2 - b2_x1) - (b1_x2 - b1_x1)) ** 2
                rho_h2 = ((b2_y2 - b2_y1) - (b1_y2 - b1_y1)) ** 2
                cw2 = cw ** 2 + eps
                ch2 = ch ** 2 + eps
                return iou - (rho2 / c2 + rho_w2 / cw2 + rho_h2 / ch2)
        # GIoU https://arxiv.org/pdf/1902.09630.pdf
        c_area = cw * ch + eps  # convex area
        return iou - (c_area - union) / c_area
    return iou  # IoU
```
  • In utils/loss.py, find the __call__() function of the ComputeLoss class and, in the Regression loss part, replace the line that computes iou with this one:

```python
iou = bbox_iou(pbox.T, tbox[i], x1y1x2y2=False, CIoU=False, EIoU=True)  # iou(prediction, target)
```

Alpha-IoU

He, Jiabo, et al. “$\alpha$-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression.” Advances in Neural Information Processing Systems 34 (2021).

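Paper link: https://arxiv.org/abs/2110.13675

Paper overview

Because the IoU loss is invariant to the scale of the bounding box, it can train better detectors, so object detection commonly uses an IoU loss as the localization regression loss for predicted boxes (YOLOv5 uses the CIoU loss).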
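Scale invariance here means that multiplying all coordinates of both boxes by the same factor leaves the IoU unchanged, unlike a coordinate-wise distance. This is a minimal plain-Python sketch; the helper names (`iou_xyxy`, `l1_distance`, `scale_box`) and the example boxes are illustrative assumptions.

```python
def iou_xyxy(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def l1_distance(a, b):
    """Sum of absolute coordinate differences, for comparison."""
    return sum(abs(p - q) for p, q in zip(a, b))


def scale_box(box, k):
    """Multiply every coordinate of a box by k."""
    return tuple(k * c for c in box)


pred, gt = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
k = 10.0
big_pred, big_gt = scale_box(pred, k), scale_box(gt, k)

# The IoU is identical at both scales, while the L1 distance grows by k.
print(iou_xyxy(pred, gt), iou_xyxy(big_pred, big_gt))
print(l1_distance(pred, gt), l1_distance(big_pred, big_gt))
```

A loss built on the L1 distance would weight the large pair ten times more heavily than the small one, even though the relative misalignment is the same.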

The Alpha-IoU loss proposed in this paper is a unified power generalization of the existing IoU losses: each term of every IoU loss is raised to the power $\alpha$, and when $\alpha = 1$ each loss reduces to its original form:

$$
\begin{aligned}
\mathcal{L}_{\mathrm{IoU}} = 1 - IoU & \Longrightarrow \mathcal{L}_{\alpha\text{-}\mathrm{IoU}} = 1 - IoU^{\alpha} \\
\mathcal{L}_{\mathrm{GIoU}} = 1 - IoU + \frac{\left|C - \left(B \cup B^{\mathrm{gt}}\right)\right|}{|C|} & \Longrightarrow \mathcal{L}_{\alpha\text{-}\mathrm{GIoU}} = 1 - IoU^{\alpha} + \left(\frac{\left|C - \left(B \cup B^{\mathrm{gt}}\right)\right|}{|C|}\right)^{\alpha} \\
\mathcal{L}_{\mathrm{DIoU}} = 1 - IoU + \frac{\rho^{2}\left(\boldsymbol{b}, \boldsymbol{b}^{\mathrm{gt}}\right)}{c^{2}} & \Longrightarrow \mathcal{L}_{\alpha\text{-}\mathrm{DIoU}} = 1 - IoU^{\alpha} + \frac{\rho^{2\alpha}\left(\boldsymbol{b}, \boldsymbol{b}^{\mathrm{gt}}\right)}{c^{2\alpha}} \\
\mathcal{L}_{\mathrm{CIoU}} = 1 - IoU + \frac{\rho^{2}\left(\boldsymbol{b}, \boldsymbol{b}^{\mathrm{gt}}\right)}{c^{2}} + \beta v & \Longrightarrow \mathcal{L}_{\alpha\text{-}\mathrm{CIoU}} = 1 - IoU^{\alpha} + \frac{\rho^{2\alpha}\left(\boldsymbol{b}, \boldsymbol{b}^{\mathrm{gt}}\right)}{c^{2\alpha}} + (\beta v)^{\alpha}
\end{aligned}
$$
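The effect of the power can be seen on scalar values. This is a minimal plain-Python sketch of the first and third rows of the formula; the function names and the sample inputs are illustrative assumptions, and the default `alpha=3.0` is simply the value used for the examples here.

```python
def alpha_iou_loss(iou, alpha=3.0):
    """L_alpha-IoU = 1 - IoU^alpha."""
    return 1.0 - iou ** alpha


def alpha_iou_grad_magnitude(iou, alpha=3.0):
    """|d(L_alpha-IoU) / d(IoU)| = alpha * IoU^(alpha - 1)."""
    return alpha * iou ** (alpha - 1)


def alpha_diou_loss(iou, rho2, c2, alpha=3.0):
    """L_alpha-DIoU = 1 - IoU^alpha + (rho^2 / c^2)^alpha."""
    return 1.0 - iou ** alpha + (rho2 / c2) ** alpha


# alpha = 1 recovers the ordinary losses.
print(alpha_iou_loss(0.5, alpha=1.0), alpha_diou_loss(0.5, 2.0, 8.0, alpha=1.0))

# alpha = 3: the gradient with respect to IoU is larger for a well-aligned
# box (IoU = 0.9) than for a poorly aligned one (IoU = 0.5).
print(alpha_iou_grad_magnitude(0.9), alpha_iou_grad_magnitude(0.5))
```

With $\alpha = 1$ the gradient magnitude with respect to IoU is constant at 1, so raising $\alpha$ above 1 shifts the relative emphasis toward boxes that already overlap the GT box well.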

Adding to YOLOv5

```python
import math

import torch


# Alpha-IoU: https://arxiv.org/abs/2110.13675
# Reference: https://mp.weixin.qq.com/s/l22GJtA7Vd11dpY9QG4k2A
def bbox_alpha_iou(box1, box2, x1y1x2y2=False, GIoU=False, DIoU=False, CIoU=False, EIoU=False, alpha=3, eps=1e-9):
    # Returns the IoU of box1 to box2. box1 is 4, box2 is nx4
    box2 = box2.T

    # Get the coordinates of bounding boxes
    if x1y1x2y2:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
    else:  # transform from xywh to xyxy
        b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
        b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
        b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
        b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2

    # Intersection area
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)

    # Union Area
    w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps
    w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps
    union = w1 * h1 + w2 * h2 - inter + eps

    # Alpha-IoU: raise the IoU to the power alpha (eps keeps the base positive)
    iou = torch.pow(inter / union + eps, alpha)
    beta = 2 * alpha
    if GIoU or DIoU or CIoU or EIoU:
        # Width and height of the smallest enclosing box
        cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)  # convex (smallest enclosing box) width
        ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)  # convex height
        if CIoU or DIoU or EIoU:  # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1
            # Enclosing-box diagonal term with each axis raised to the power beta = 2 * alpha
            # (this equals (cw^2 + ch^2)^alpha only when alpha == 1)
            c2 = cw ** beta + ch ** beta + eps  # convex diagonal
            rho_x = torch.abs(b2_x1 + b2_x2 - b1_x1 - b1_x2)
            rho_y = torch.abs(b2_y1 + b2_y2 - b1_y1 - b1_y2)
            # Center-distance term, built the same way
            rho2 = (rho_x ** beta + rho_y ** beta) / (2 ** beta)  # center distance
            if DIoU:
                return iou - rho2 / c2  # DIoU
            elif CIoU:  # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47
                v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)
                with torch.no_grad():
                    alpha_ciou = v / ((1 + eps) - inter / union + v)
                # return iou - (rho2 / c2 + v * alpha_ciou)  # CIoU
                return iou - (rho2 / c2 + torch.pow(v * alpha_ciou + eps, alpha))  # CIoU
            # EIoU replaces CIoU's aspect-ratio term with width and height difference
            # terms between the two boxes, each normalized by the corresponding side of
            # the smallest enclosing box; this speeds up convergence and improves
            # regression accuracy
            elif EIoU:
                # abs() keeps the power well-defined when beta is not an even integer
                rho_w2 = torch.abs((b2_x2 - b2_x1) - (b1_x2 - b1_x1)) ** beta
                rho_h2 = torch.abs((b2_y2 - b2_y1) - (b1_y2 - b1_y1)) ** beta
                cw2 = cw ** beta + eps
                ch2 = ch ** beta + eps
                return iou - (rho2 / c2 + rho_w2 / cw2 + rho_h2 / ch2)
        # GIoU https://arxiv.org/pdf/1902.09630.pdf
        c_area = torch.max(cw * ch + eps, union)  # convex area
        return iou - torch.pow((c_area - union) / c_area + eps, alpha)  # GIoU
    else:
        return iou  # torch.log(iou+eps) or iou
```

References

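Plug and play | Alpha_IOU loss helps optimize yolov5

Loss functions: Focal-EIoU Loss

Predicted-box regression optimization in object detection: IOU, GIOU, DIOU, CIOU and EIOU

Deep Learning Notes (13): Analysis and PyTorch implementation of the IOU, GIOU, DIOU, CIOU, EIOU, Focal EIOU and alpha IOU loss functions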