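This tutorial shows how to convert a Labelme dataset to YOLOv8 format and train a model on the DAMODEL platform.
First, prepare an annotated Labelme dataset. You can use one you annotated yourself. Download and extract the dataset; it contains the following classes: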
['c17', 'c5', 'helicopter', 'c130', 'f16', 'b2',
'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10',
'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4',
'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33',
'an-70', 'su-24', 'tu-22', 'il-76']
To train a YOLOv8 model, the Labelme dataset must first be converted to YOLOv8 format. The conversion code is as follows:
import json
import os
import shutil
from glob import glob

import cv2
import numpy as np
from sklearn.model_selection import train_test_split


def convert(size, box):
    """Convert a pixel box (xmin, xmax, ymin, ymax) to normalized YOLO (x_center, y_center, w, h)."""
    dw = 1. / size[0]
    dh = 1. / size[1]
    # Labelme points are 0-based pixel coordinates, so the center needs no "- 1" offset.
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    return (x * dw, y * dh, w * dw, h * dh)


def change_2_yolo5(files, txt_Name):
    """Write a YOLO .txt label next to each Labelme .json and return the image file names.

    txt_Name (the split name) is not used; it is kept so the calls below stay unchanged.
    """
    imag_name = []
    for json_file_ in files:
        json_filename = labelme_path + json_file_ + ".json"
        with open(json_filename, "r", encoding="utf-8") as f:
            json_file = json.load(f)
        image = cv2.imread(labelme_path + json_file_ + ".jpg")
        if image is None:
            raise FileNotFoundError("Cannot read the image for " + json_filename)
        height, width = image.shape[:2]
        imag_name.append(json_file_ + '.jpg')
        with open(labelme_path + json_file_ + ".txt", 'w') as out_file:
            for multi in json_file["shapes"]:
                points = np.array(multi["points"])
                # Bounding box of the shape's points, clamped to the image area.
                xmin = max(points[:, 0].min(), 0)
                xmax = min(points[:, 0].max(), width)
                ymin = max(points[:, 1].min(), 0)
                ymax = min(points[:, 1].max(), height)
                if xmax <= xmin or ymax <= ymin:
                    continue  # skip degenerate boxes
                label = multi["label"].lower()
                cls_id = classes.index(label)  # raises ValueError for an unknown label
                b = (float(xmin), float(xmax), float(ymin), float(ymax))
                bb = convert((width, height), b)
                out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    return imag_name


def image_txt_copy(files, scr_path, dst_img_path, dst_txt_path):
    """Copy each image and its .txt label into the YOLO images/ and labels/ folders."""
    for file in files:
        shutil.copy(scr_path + file, dst_img_path + file)
        # splitext keeps file names that contain extra dots intact.
        txt_name = os.path.splitext(file)[0] + '.txt'
        shutil.copy(scr_path + txt_name, dst_txt_path + txt_name)


if __name__ == '__main__':
    classes = ['c17', 'c5', 'helicopter', 'c130', 'f16', 'b2',
               'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10',
               'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4',
               'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33',
               'an-70', 'su-24', 'tu-22', 'il-76']
    labelme_path = "USA-Labelme/"
    files = glob(labelme_path + "*.json")
    files = [i.replace("\\", "/").split("/")[-1].split(".json")[0] for i in files]
    # 10% test, then 10% of the remainder for validation: about 81% / 9% / 10%.
    trainval_files, test_files = train_test_split(files, test_size=0.1, random_state=55)
    train_files, val_files = train_test_split(trainval_files, test_size=0.1, random_state=55)
    train_name_list = change_2_yolo5(train_files, "train")
    val_name_list = change_2_yolo5(val_files, "val")
    test_name_list = change_2_yolo5(test_files, "test")
    file_List = ["train", "val", "test"]
    for file in file_List:
        os.makedirs('./VOC/images/%s' % file, exist_ok=True)
        os.makedirs('./VOC/labels/%s' % file, exist_ok=True)
    image_txt_copy(train_name_list, labelme_path, './VOC/images/train/', './VOC/labels/train/')
    image_txt_copy(val_name_list, labelme_path, './VOC/images/val/', './VOC/labels/val/')
    image_txt_copy(test_name_list, labelme_path, './VOC/images/test/', './VOC/labels/test/')
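After running the code above, you will have the dataset in the layout YOLOv8 expects: a VOC folder with images and labels subfolders, each split into train, val, and test.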
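To make the label format concrete, here is a minimal sketch of the core arithmetic for a single shape. It assumes the YOLO label convention of a center point plus width and height, each divided by the image size. The function name `labelme_points_to_yolo` and the sample numbers are made up for illustration and are not part of the script.

```python
def labelme_points_to_yolo(points, img_w, img_h):
    """Return normalized (x_center, y_center, w, h) for one Labelme shape, or None.

    points is a list of [x, y] pixel coordinates, as stored under "points"
    in a Labelme JSON shape. Hypothetical helper, for illustration only.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Bounding box of all points, clamped to the image area.
    xmin, xmax = max(min(xs), 0), min(max(xs), img_w)
    ymin, ymax = max(min(ys), 0), min(max(ys), img_h)
    if xmax <= xmin or ymax <= ymin:
        return None  # degenerate box, nothing to write
    return (
        (xmin + xmax) / 2 / img_w,
        (ymin + ymax) / 2 / img_h,
        (xmax - xmin) / img_w,
        (ymax - ymin) / img_h,
    )


# A rectangle from (100, 50) to (300, 250) in a 400 x 500 image.
box = labelme_points_to_yolo([[100, 50], [300, 250]], img_w=400, img_h=500)
# One label line is "<class id> <x_center> <y_center> <w> <h>".
print("4 " + " ".join(str(v) for v in box))
```

Each line of a label file then holds the class index followed by these four values, which is the line the conversion script writes for every shape.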
Next, get YOLOv8 onto your local machine: either download the source from GitHub (ultralytics/ultralytics) or install it with the command below. If you plan to modify the model or build on it, installing with pip is not recommended.
pip install ultralytics
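To avoid potential compatibility issues, it is better to download the source code and debug it locally than to install directly with pip. Place the generated YOLO dataset in a newly created datasets folder. Then install the dependencies required for training: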
pip install opencv-python
pip install numpy==1.23.5
pip install pyyaml
pip install tqdm
pip install matplotlib
Note that numpy should be pinned to version 1.23.5; avoid versions 2.0 and above.
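If you want to confirm which numpy version ended up in the environment, a small check like the following may help. It is only a sketch: `is_pre_2` is a hypothetical helper name, and the check looks only at the major version number.

```python
from importlib import metadata


def is_pre_2(version):
    """Return True when a version string such as '1.23.5' has a major version below 2."""
    return int(version.split(".")[0]) < 2


if __name__ == "__main__":
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed")
    else:
        status = "ok" if is_pre_2(installed) else "2.0 or newer, consider numpy==1.23.5"
        print("numpy", installed, "->", status)
```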
Create a VOC.yaml file in the project root with the following content:
train: ./VOC/images/train
val: ./VOC/images/val
test: ./VOC/images/test
names: ['c17', 'c5', 'helicopter', 'c130', 'f16', 'b2',
'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10',
'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4',
'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33',
'an-70', 'su-24', 'tu-22', 'il-76']
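Because the class order in VOC.yaml has to match the order used by the conversion script, it can be safer to generate the file from the same list than to copy it by hand. The sketch below builds the text with plain string formatting; `build_dataset_yaml` is a hypothetical helper, the class list is shortened for the example, and it assumes no class name contains a single quote.

```python
def build_dataset_yaml(train, val, test, names):
    """Return the text of a dataset YAML file with the same keys as VOC.yaml."""
    # Assumes plain names without single quotes, as in this dataset.
    quoted = ", ".join("'%s'" % name for name in names)
    return "train: %s\nval: %s\ntest: %s\nnames: [%s]\n" % (train, val, test, quoted)


# Shortened list for illustration; use the full `classes` list from the conversion script.
classes = ['c17', 'c5', 'helicopter']
yaml_text = build_dataset_yaml(
    "./VOC/images/train", "./VOC/images/val", "./VOC/images/test", classes
)
print(yaml_text)
# To write the file: open("VOC.yaml", "w", encoding="utf-8").write(yaml_text)
```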
Create a train.py file with the training code:
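from ultralytics import YOLO

if __name__ == '__main__':
    # Build YOLOv8-L from its model definition file (no pretrained weights are loaded).
    model = YOLO("ultralytics/cfg/models/v8/yolov8l.yaml")
    print(model.model)
    # Train on GPU 0; workers=0 loads data in the main process.
    results = model.train(data="VOC.yaml", epochs=100, device='0', batch=16, workers=0)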
Once the training code is ready, run train.py to start training the model.
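Before starting a long run, it may be worth checking that the converted dataset is consistent. The sketch below assumes the ./VOC layout produced earlier, .jpg images, and 32 classes; `check_split` is a hypothetical helper, not part of Ultralytics.

```python
from pathlib import Path


def check_split(images_dir, labels_dir, num_classes):
    """Return a list of human-readable problems found in one dataset split."""
    problems = []
    for image_path in sorted(Path(images_dir).glob("*.jpg")):
        label_path = Path(labels_dir) / (image_path.stem + ".txt")
        if not label_path.exists():
            problems.append("%s: missing label file" % image_path.name)
            continue
        for line_no, line in enumerate(label_path.read_text().splitlines(), start=1):
            where = "%s line %d" % (label_path.name, line_no)
            fields = line.split()
            if len(fields) != 5:
                problems.append("%s: expected 5 fields, got %d" % (where, len(fields)))
                continue
            try:
                cls_id = int(fields[0])
                coords = [float(v) for v in fields[1:]]
            except ValueError:
                problems.append("%s: non-numeric value" % where)
                continue
            if not 0 <= cls_id < num_classes:
                problems.append("%s: class id %d out of range" % (where, cls_id))
            if any(not 0.0 <= v <= 1.0 for v in coords):
                problems.append("%s: coordinate outside [0, 1]" % where)
    return problems


if __name__ == "__main__":
    for split in ("train", "val", "test"):
        found = check_split("./VOC/images/%s" % split, "./VOC/labels/%s" % split, 32)
        for problem in found:
            print(split, problem)
```

An empty label file is accepted here, since an image without any valid shape ends up with no label lines.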
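Create an account on the DAMODEL platform, log in, and click GPU cloud instances. Choose an on-demand instance, select the PyTorch framework, and create the instance. Once the instance has started, upload the YOLO dataset you just generated together with the training code.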
Then open the JupyterLab console, start a terminal, extract the uploaded code, and install the dependencies:
pip install opencv-python
pip install pyyaml
pip install tqdm
pip install matplotlib
pip install pandas
If you run into the error ImportError: libGL.so.1: cannot open shared object, run the following command:
pip install opencv-python-headless
Run train.py in the terminal to train the model in the cloud.
After training finishes, you can test the model with the following test.py file:
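from ultralytics import YOLO

if __name__ == '__main__':
    # Load the best weights saved by the training run.
    model = YOLO('runs/detect/train/weights/best.pt')
    # Run inference on the images in ultralytics/assets using GPU 0.
    results = model.predict(source="ultralytics/assets", device='0')
    print(results)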
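Printing the raw results is verbose, so a compact summary can be easier to read. The sketch below assumes each result exposes `path`, `names`, and `boxes` with `cls`, `conf`, and `xyxy`, as Ultralytics detection results do. `summarize_detections` is a hypothetical helper name.

```python
def summarize_detections(results):
    """Return (image path, class name, confidence, [x1, y1, x2, y2]) for every detection."""
    rows = []
    for result in results:
        boxes = result.boxes
        for cls, conf, xyxy in zip(boxes.cls, boxes.conf, boxes.xyxy):
            name = result.names[int(cls)]  # names maps class index to class name
            corners = [round(float(v), 1) for v in xyxy]
            rows.append((result.path, name, round(float(conf), 3), corners))
    return rows


# Usage inside test.py, after model.predict(...):
#     for row in summarize_detections(results):
#         print(row)
```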