6. General YOLOv8 Model Deployment
6.1. Introduction
This document describes how to deploy a YOLOv8-architecture model on the cv181x development board. The main steps are:

- convert the PyTorch YOLOv8 model to an ONNX model
- convert the ONNX model to the cvimodel format
- write calls to the inference interface to obtain detection results
6.2. Converting the pt Model to ONNX
First, obtain the official YOLOv8 repository code from [ultralytics/ultralytics: NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite (github.com)](https://github.com/ultralytics/ultralytics):

```shell
git clone https://github.com/ultralytics/ultralytics.git
```
Then download the corresponding YOLOv8 model file, using [yolov8n](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt) as an example, and place the downloaded yolov8n.pt in the ultralytics/weights/ directory:

```shell
cd ultralytics && mkdir weights
cd weights
wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt
```
Next, adjust the YOLOv8 output branches: remove the decoding logic from the forward function and keep the box and cls outputs of the three feature maps separate, giving six output branches in total. This step can be done directly with the yolo_export script; a sketch of the resulting head is shown below.
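For illustration only, the following minimal PyTorch sketch shows what the exported head returns. It is an assumption about the effect of yolo_export rather than its actual code (the class and layer names here are made up); the channel counts follow the standard YOLOv8 head, where the box branch carries 4 × reg_max DFL channels:

```python
import torch
import torch.nn as nn

class DetectExport(nn.Module):
    """Sketch of the export-time detection head: decoding removed, box and
    cls kept as separate raw outputs for each of the three scales."""

    def __init__(self, num_classes=80, reg_max=16, channels=(64, 128, 256)):
        super().__init__()
        # one box branch (4 * reg_max DFL channels) and one cls branch per scale
        self.box_convs = nn.ModuleList(nn.Conv2d(c, 4 * reg_max, 1) for c in channels)
        self.cls_convs = nn.ModuleList(nn.Conv2d(c, num_classes, 1) for c in channels)

    def forward(self, feats):
        outputs = []
        for f, box_conv, cls_conv in zip(feats, self.box_convs, self.cls_convs):
            outputs += [box_conv(f), cls_conv(f)]  # raw maps, no decode / NMS
        # six branches: [box_s8, cls_s8, box_s16, cls_s16, box_s32, cls_s32]
        return outputs

# feature maps for a 640x640 input at strides 8 / 16 / 32
feats = [torch.randn(1, c, s, s) for c, s in [(64, 80), (128, 40), (256, 20)]]
print([tuple(o.shape) for o in DetectExport()(feats)])
```

Decoding and NMS are then performed on the board by TDL_SDK's post-processing, which is why the ONNX graph only needs to expose the raw feature maps.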
The yolo_export scripts can be obtained via SFTP. Server: sftp://218.17.249.213, account: cvitek_mlir_2023, password: 7&2Wd%cu5k
Browse the SFTP server to locate the yolo_export folder.
Copy yolo_export/yolov8_export.py into the yolov8 repository, then export the six-branch ONNX model with the following command:

```shell
python yolov8_export.py --weights ./weights/yolov8n.pt
```
After running the command above, the yolov8n.onnx file is generated in the ./weights/ directory; the next step is to convert the ONNX model to a cvimodel.
Tip

If the input is a 1080p video stream, it is recommended to change the model input size to 384 x 640. This removes redundant computation and speeds up inference:

```shell
python yolov8_export.py --weights ./weights/yolov8n.pt --img-size 384 640
```
6.3. Converting the ONNX Model to cvimodel
For the cvimodel conversion procedure, refer to the "Converting the ONNX model to cvimodel" section of the YOLOv5 porting chapter; a condensed sketch of that flow follows.
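The command sequence below is only a sketch of the TPU-MLIR flow described in that chapter, under the assumption that the same toolchain applies to YOLOv8. File names, the calibration dataset path, and the image count are placeholders, and the mean/scale values match the 1/255, zero-mean preprocessing configured in the next section. Consult the YOLOv5 chapter for the authoritative commands and options.

```shell
# ONNX -> MLIR
model_transform.py \
    --model_name yolov8n \
    --model_def yolov8n.onnx \
    --input_shapes [[1,3,640,640]] \
    --mean 0.0,0.0,0.0 \
    --scale 0.0039216,0.0039216,0.0039216 \
    --keep_aspect_ratio \
    --pixel_format rgb \
    --mlir yolov8n.mlir

# generate an INT8 calibration table from a set of representative images
run_calibration.py yolov8n.mlir \
    --dataset ./images \
    --input_num 100 \
    -o yolov8n_cali_table

# MLIR -> INT8 cvimodel for cv181x
model_deploy.py \
    --mlir yolov8n.mlir \
    --quantize INT8 \
    --calibration_table yolov8n_cali_table \
    --chip cv181x \
    --model yolov8n_cv181x_int8.cvimodel
```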
6.4. TDL_SDK Interface Description
The YOLOv8 preprocessing is configured as follows; the scaling factor 0.003922 is 1/255 with a zero mean, i.e. pixel values are normalized to [0, 1]:
```cpp
// Set preprocessing and algorithm parameters for YOLOv8 detection.
// If you use the official model, there is no need to change these parameters.
CVI_S32 init_param(const cvitdl_handle_t tdl_handle) {
  // setup preprocess
  YoloPreParam preprocess_cfg =
      CVI_TDL_Get_YOLO_Preparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION);

  for (int i = 0; i < 3; i++) {
    printf("assign val %d \n", i);
    preprocess_cfg.factor[i] = 0.003922;  // 1/255: scale pixels to [0, 1]
    preprocess_cfg.mean[i] = 0.0;
  }
  preprocess_cfg.format = PIXEL_FORMAT_RGB_888_PLANAR;

  printf("setup yolov8 param \n");
  CVI_S32 ret = CVI_TDL_Set_YOLO_Preparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION,
                                          preprocess_cfg);
  if (ret != CVI_SUCCESS) {
    printf("Cannot set yolov8 preprocess parameters %#x\n", ret);
    return ret;
  }

  // setup yolov8 algorithm parameters
  YoloAlgParam yolov8_param =
      CVI_TDL_Get_YOLO_Algparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION);
  yolov8_param.cls = 80;  // number of detection classes (COCO)

  printf("setup yolov8 algorithm param \n");
  ret = CVI_TDL_Set_YOLO_Algparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION,
                                  yolov8_param);
  if (ret != CVI_SUCCESS) {
    printf("Cannot set yolov8 algorithm parameters %#x\n", ret);
    return ret;
  }

  // set confidence and NMS thresholds
  CVI_TDL_SetModelThreshold(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, 0.5);
  CVI_TDL_SetModelNmsThreshold(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, 0.5);

  printf("yolov8 algorithm parameters setup success!\n");
  return ret;
}
```
Inference test code:
```cpp
ret = CVI_TDL_OpenModel(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, argv[1]);
if (ret != CVI_SUCCESS) {
  printf("open model failed with %#x!\n", ret);
  return ret;
}

printf("---------------------to do detection-----------------------\n");
std::string strf1(argv[2]);  // path to the test image (assumed second argument)
VIDEO_FRAME_INFO_S bg;
ret = CVI_TDL_ReadImage(strf1.c_str(), &bg, PIXEL_FORMAT_RGB_888_PLANAR);
if (ret != CVI_SUCCESS) {
  printf("open img failed with %#x!\n", ret);
  return ret;
} else {
  printf("image read, width:%d\n", bg.stVFrame.u32Width);
  printf("image read, height:%d\n", bg.stVFrame.u32Height);
}

cvtdl_object_t obj_meta = {0};
CVI_TDL_YOLOV8_Detection(tdl_handle, &bg, &obj_meta);
std::cout << "objnum:" << obj_meta.size << std::endl;

// print each detection as [x1, y1, x2, y2, class, score]
std::stringstream ss;
ss << "boxes=[";
for (uint32_t i = 0; i < obj_meta.size; i++) {
  ss << "[" << obj_meta.info[i].bbox.x1 << "," << obj_meta.info[i].bbox.y1 << ","
     << obj_meta.info[i].bbox.x2 << "," << obj_meta.info[i].bbox.y2 << ","
     << obj_meta.info[i].classes << "," << obj_meta.info[i].bbox.score << "],";
}
ss << "]\n";
std::cout << ss.str();

// release detection results and the input frame
CVI_TDL_Free(&obj_meta);
CVI_TDL_ReleaseImage(&bg);
```
6.5. Test Results
The official yolov8n and yolov8s models were converted and evaluated on the COCO2017 dataset with the following thresholds:

- conf: 0.001
- nms_thresh: 0.6

All tests use an input resolution of 640 x 640.
Performance of the yolov8n model with the official export method:

| Test platform | Inference time (ms) | Bandwidth (MB) | ION (MB) | mAP 0.5 | mAP 0.5-0.95 |
|---|---|---|---|---|---|
| pytorch | N/A | N/A | N/A | 53 | 37.3 |
| cv180x | ION allocation failed | ION allocation failed | 13.26 | ION allocation failed | ION allocation failed |
| cv181x | 54.91 | 44.16 | 8.64 | quantization failed | quantization failed |
| cv182x | 40.21 | 44.32 | 8.62 | quantization failed | quantization failed |
| cv183x | 17.81 | 40.46 | 8.3 | quantization failed | quantization failed |
| cv186x | 7.03 | 55.03 | 13.92 | quantization failed | quantization failed |
Performance of the yolov8n model with the TDL_SDK export method:

| Test platform | Inference time (ms) | Bandwidth (MB) | ION (MB) | mAP 0.5 | mAP 0.5-0.95 |
|---|---|---|---|---|---|
| onnx | N/A | N/A | N/A | 51.32 | 36.4577 |
| cv180x | 299 | 78.78 | 12.75 | 45.986 | 31.798 |
| cv181x | 45.62 | 31.56 | 7.54 | 51.2207 | 35.8048 |
| cv182x | 32.8 | 32.8 | 7.72 | 51.2207 | 35.8048 |
| cv183x | 12.61 | 28.64 | 7.53 | 51.2207 | 35.8048 |
| cv186x | 5.20 | 43.06 | 12.02 | 51.03 | 35.61 |
Performance of the yolov8s model with the official export method:

| Test platform | Inference time (ms) | Bandwidth (MB) | ION (MB) | mAP 0.5 | mAP 0.5-0.95 |
|---|---|---|---|---|---|
| pytorch | N/A | N/A | N/A | 61.8 | 44.9 |
| cv180x | model conversion failed | model conversion failed | model conversion failed | model conversion failed | model conversion failed |
| cv181x | 144.72 | 101.75 | 17.99 | quantization failed | quantization failed |
| cv182x | 103 | 101.75 | 17.99 | quantization failed | quantization failed |
| cv183x | 38.04 | 38.04 | 16.99 | quantization failed | quantization failed |
| cv186x | 13.16 | 95.03 | 23.44 | quantization failed | quantization failed |
Performance of the yolov8s model with the TDL_SDK export method:

| Test platform | Inference time (ms) | Bandwidth (MB) | ION (MB) | mAP 0.5 | mAP 0.5-0.95 |
|---|---|---|---|---|---|
| onnx | N/A | N/A | N/A | 60.1534 | 44.034 |
| cv180x | model conversion failed | model conversion failed | model conversion failed | model conversion failed | model conversion failed |
| cv181x | 135.55 | 89.53 | 18.26 | 60.2784 | 43.4908 |
| cv182x | 95.95 | 89.53 | 18.26 | 60.2784 | 43.4908 |
| cv183x | 32.88 | 58.44 | 16.9 | 60.2784 | 43.4908 |
| cv186x | 11.37 | 82.61 | 21.96 | 60.27 | 43.52 |