The relationship between the original image and the feature-map grid when training YOLO in PyTorch
dataloader = DataLoader(dataset,
                        batch_size=batch_size,          # batch_size = 16
                        num_workers=opt.num_workers,
                        shuffle=False,                  # False: do not shuffle the data order
                        pin_memory=True,
                        collate_fn=dataset.collate_fn)

for i, (imgs, targets, _, _) in enumerate(dataloader):
    imgs = imgs.to(device)
    targets = targets.to(device)
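The image index that appears in column 0 of `targets` (see below) is stamped in by the dataset's `collate_fn`. The actual implementation depends on the repo, but a minimal sketch of such a collate function, assuming each sample yields an (n, 6) targets tensor whose column 0 is reserved for the batch-image index, might look like this:

```python
import torch

def collate_fn(batch):
    # batch: list of (img, targets) pairs; targets is (n, 6) with
    # column 0 reserved for the image's index within the batch
    imgs, targets = zip(*batch)
    for i, t in enumerate(targets):
        t[:, 0] = i  # stamp the batch-image index into column 0
    # stack images into (B, C, H, W); concatenate all boxes into one (N, 6) tensor
    return torch.stack(imgs, 0), torch.cat(targets, 0)
```

Because all ground-truth boxes of the batch are concatenated into one tensor, column 0 is the only way to tell which image each box belongs to.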
(Figure: images 0 and 1 of batch 0)
(Figure: the images of batch 0)
The output of print(targets) is shown below; each row represents one ground-truth box in batch 0:
tensor([[0.00000e+00, 4.50000e+01, 5.26238e-01, 6.83974e-01, 9.47524e-01, 5.16700e-01],
[0.00000e+00, 4.50000e+01, 7.81120e-01, 3.14383e-01, 4.37760e-01, 4.05274e-01],
[0.00000e+00, 5.00000e+01, 7.40093e-01, 7.15348e-01, 5.19815e-01, 4.33102e-01],
[0.00000e+00, 4.50000e+01, 4.21901e-01, 4.67164e-01, 7.79564e-01, 6.60547e-01],
[0.00000e+00, 4.90000e+01, 7.42570e-01, 2.23209e-01, 1.33578e-01, 8.29659e-02],
[0.00000e+00, 4.90000e+01, 8.80173e-01, 2.17102e-01, 1.03825e-01, 8.23721e-02],
[0.00000e+00, 4.90000e+01, 7.70766e-01, 2.99850e-01, 1.50537e-01, 1.24273e-01],
[0.00000e+00, 4.90000e+01, 7.35524e-01, 1.79639e-01, 1.68887e-01, 1.25739e-01],
[1.00000e+00, 2.30000e+01, 7.20384e-01, 5.21023e-01, 3.40454e-01, 4.71519e-01],
[1.00000e+00, 2.30000e+01, 1.40495e-01, 7.60592e-01, 2.04362e-01, 9.47416e-02],
[2.00000e+00, 5.80000e+01, 4.45878e-01, 3.75928e-01, 4.25135e-01, 5.24035e-01],
[2.00000e+00, 7.50000e+01, 4.62076e-01, 4.73467e-01, 2.76887e-01, 3.15688e-01],
[3.00000e+00, 2.20000e+01, 6.34281e-01, 4.49670e-01, 7.31439e-01, 6.27662e-01],
[4.00000e+00, 2.50000e+01, 5.41866e-01, 4.97573e-01, 7.24321e-01, 6.72691e-01],
[4.00000e+00, 0.00000e+00, 4.11375e-01, 6.70091e-01, 5.06122e-01, 6.59819e-01],
[5.00000e+00, 1.70000e+01, 5.03602e-01, 7.13400e-01, 2.62876e-01, 3.62025e-01],
...
[1.50000e+01, 5.50000e+01, 5.12623e-01, 3.35652e-01, 5.91252e-01, 4.89218e-01]], device='cuda:0')
Element 0: the index of the image within the batch that this box belongs to; with batch_size = 16 it ranges from 0 to 15.
Element 1: the class index of the box.
Elements 2, 3: the x, y coordinates of the ground-truth box's center, normalized by the input size (divided by 416).
Elements 4, 5: the width and height of the ground-truth box, also divided by 416.
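Since all four box values are normalized, multiplying by the input size recovers pixel coordinates on the 416×416 input image. A small sketch (the helper name `row_to_pixel_box` is my own, not from the source code):

```python
import torch

def row_to_pixel_box(row, img_size=416):
    """Convert one targets row [img_idx, cls, cx, cy, w, h] (all box
    values normalized to [0, 1]) to pixel-space corners (x1, y1, x2, y2)."""
    _, _, cx, cy, w, h = row.tolist()
    # scale normalized values up to pixel units
    cx, cy, w, h = cx * img_size, cy * img_size, w * img_size, h * img_size
    # convert center/size to top-left and bottom-right corners
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
```

For example, the first row above (class 45, center ≈ (0.526, 0.684), size ≈ (0.948, 0.517)) maps to a box roughly 394 × 215 pixels wide on the 416×416 image.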
gwh = targets[:, 4:6] * layer.ng # layer.ng → (13,13) (26,26) (52,52)
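The same normalization is what makes the line above work: because the targets are fractions of the image, multiplying by a grid size (`layer.ng`, i.e. 13, 26 or 52 for YOLOv3's three detection scales) expresses the boxes in grid-cell units, and flooring the scaled center picks the cell responsible for the box. A standalone sketch with plain tensors (no `layer` object), using the first ground-truth row from above:

```python
import torch

# one targets row: [img_idx, cls, cx, cy, w, h], all box values normalized
targets = torch.tensor([[0.0, 45.0, 0.526238, 0.683974, 0.947524, 0.5167]])

for ng in (13, 26, 52):          # the three YOLOv3 grid sizes
    gxy = targets[:, 2:4] * ng   # box center in grid-cell units
    gwh = targets[:, 4:6] * ng   # box width/height in grid-cell units
    gi, gj = gxy.long().t()      # cell column (gi) and row (gj) of the center
    print(ng, gi.item(), gj.item(), gwh.tolist())
```

So the same box lands in cell (6, 8) on the 13×13 grid, (13, 17) on the 26×26 grid, and (27, 35) on the 52×52 grid: the grids differ only in resolution, not in which part of the image they cover.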