7. Deployment of YOLOX Model for General Use

7.1. Introduction

This document describes how to deploy a YOLOX model on the CV181x development board. The main steps are:

Convert the YOLOX PyTorch model to an ONNX model
Convert the ONNX model to the cvimodel format
Write a calling interface to obtain the inference results

7.2. Convert pt Model to ONNX

First, download the official YOLOX code from GitHub: Megvii-BaseDetection/YOLOX (https://github.com/Megvii-BaseDetection/YOLOX/tree/main). Documentation: https://yolox.readthedocs.io/

Install YOLOX from source using the following commands:

git clone git@github.com:Megvii-BaseDetection/YOLOX.git

cd YOLOX

pip3 install -v -e .  # or: python3 setup.py develop

##ONNX model export

Switch to the YOLOX repository you just cloned, create a weights directory, and copy the pre-trained .pth file into it:

cd YOLOX && mkdir weights

cp path/to/pth weights/

###Official ONNX export

Switch to the tools directory:

cd tools

Export with decoding performed inside the ONNX model:

python \
export_onnx.py \
--output-name ../weights/yolox_m_official.onnx \
-n yolox-m \
--no-onnxsim \
-c ../weights/yolox_m.pth \
--decode_in_inference

The parameters have the following meanings:

--output-name: path and file name of the exported ONNX model
-n: model name; one of yolox-s, yolox-m, yolox-l, yolox-x, yolox-nano, yolox-tiny, or yolov3
-c: path to the pre-trained .pth model file
--decode_in_inference: whether to perform decoding inside the ONNX model

###TDL_SDK ONNX export

To preserve quantization accuracy, the YOLOX head is exported without decoding and split into separate branch outputs (box regression, objectness, and class scores for each feature level) instead of the single merged output of the official export.
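To make the split concrete, the following minimal sketch lists the output shapes one would expect from such an export. It assumes the usual YOLOX strides of 8, 16, and 32, a 640 x 640 input, and 80 classes; the stride values and the branch names are assumptions for illustration and are not stated in this document.

```python
# Assumptions (not from this document): strides 8/16/32, 640 x 640 input, 80 classes.
def branch_shapes(height=640, width=640, num_classes=80, strides=(8, 16, 32), batch=1):
    """Return (name, shape) pairs, three branches per feature level."""
    shapes = []
    for stride in strides:
        h, w = height // stride, width // stride
        # Channel-last layout: box regression (4), objectness (1), class scores.
        shapes.append((f"reg_stride{stride}", (batch, h, w, 4)))
        shapes.append((f"obj_stride{stride}", (batch, h, w, 1)))
        shapes.append((f"cls_stride{stride}", (batch, h, w, num_classes)))
    return shapes


for name, shape in branch_shapes():
    print(name, shape)
```

Three feature levels with three branches each give nine outputs in total.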

Export the separate branch heads with the following script and command.

Create a new file named export_onnx_tdl_sdk.py in the YOLOX/tools/ directory and paste in the following code:

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
# Copyright (c) Megvii, Inc. and its affiliates.

import argparse
import os
from loguru import logger

import torch
from torch import nn

import sys
sys.path.append("..")

from yolox.exp import get_exp
from yolox.models.network_blocks import SiLU
from yolox.utils import replace_module
import types

def make_parser():
    parser = argparse.ArgumentParser("YOLOX onnx deploy")
    parser.add_argument(
        "--output-name", type=str, default="yolox.onnx", help="output name of models"
    )
    parser.add_argument(
        "--input", default="images", type=str, help="input node name of onnx model"
    )
    parser.add_argument(
        "--output", default="output", type=str, help="output node name of onnx model"
    )
    parser.add_argument(
        "-o", "--opset", default=11, type=int, help="onnx opset version"
    )
    parser.add_argument("--batch-size", type=int, default=1, help="batch size")
    parser.add_argument(
        "--dynamic", action="store_true", help="whether the input shape should be dynamic or not"
    )
    parser.add_argument("--no-onnxsim", action="store_true", help="use onnxsim or not")
    parser.add_argument(
        "-f",
        "--exp_file",
        default=None,
        type=str,
        help="experiment description file",
    )
    parser.add_argument("-expn", "--experiment-name", type=str, default=None)
    parser.add_argument("-n", "--name", type=str, default=None, help="model name")
    parser.add_argument("-c", "--ckpt", default=None, type=str, help="ckpt path")
    parser.add_argument(
        "opts",
        help="Modify config options using the command-line",
        default=None,
        nargs=argparse.REMAINDER,
    )
    parser.add_argument(
        "--decode_in_inference",
        action="store_true",
        help="decode in inference or not"
    )

    return parser

def forward(self, xin, labels=None, imgs=None):
    # Replacement for the official head forward: skip decoding and concatenation,
    # and return the raw box, objectness, and class maps of every feature level.
    outputs = []

    for k, (cls_conv, reg_conv, stride_this_level, x) in enumerate(
        zip(self.cls_convs, self.reg_convs, self.strides, xin)
    ):
        x = self.stems[k](x)
        cls_x = x
        reg_x = x

        cls_feat = cls_conv(cls_x)
        cls_output = self.cls_preds[k](cls_feat)

        reg_feat = reg_conv(reg_x)
        reg_output = self.reg_preds[k](reg_feat)
        obj_output = self.obj_preds[k](reg_feat)

        # NCHW -> NHWC so that the channel dimension comes last
        outputs.append(reg_output.permute(0, 2, 3, 1))
        outputs.append(obj_output.permute(0, 2, 3, 1))
        outputs.append(cls_output.permute(0, 2, 3, 1))

    return outputs

@logger.catch
def main():
    args = make_parser().parse_args()
    logger.info("args value: {}".format(args))
    exp = get_exp(args.exp_file, args.name)
    exp.merge(args.opts)

    if not args.experiment_name:
        args.experiment_name = exp.exp_name

    model = exp.get_model()
    if args.ckpt is None:
        file_name = os.path.join(exp.output_dir, args.experiment_name)
        ckpt_file = os.path.join(file_name, "best_ckpt.pth")
    else:
        ckpt_file = args.ckpt

    # load the model state dict
    ckpt = torch.load(ckpt_file, map_location="cpu")

    model.eval()
    if "model" in ckpt:
        ckpt = ckpt["model"]
    model.load_state_dict(ckpt)
    model = replace_module(model, nn.SiLU, SiLU)

    # replace official head forward function
    if not args.decode_in_inference:
        model.head.forward = types.MethodType(forward, model.head)

    model.head.decode_in_inference = args.decode_in_inference

    logger.info("loading checkpoint done.")
    dummy_input = torch.randn(args.batch_size, 3, exp.test_size[0], exp.test_size[1])

    torch.onnx._export(
        model,
        dummy_input,
        args.output_name,
        input_names=[args.input],
        output_names=[args.output],
        dynamic_axes={args.input: {0: 'batch'},
                      args.output: {0: 'batch'}} if args.dynamic else None,
        opset_version=args.opset,
    )
    logger.info("generated onnx model named {}".format(args.output_name))

    if not args.no_onnxsim:
        import onnx
        from onnxsim import simplify

        # use onnx-simplifier to remove redundant nodes from the model
        onnx_model = onnx.load(args.output_name)
        model_simp, check = simplify(onnx_model)
        assert check, "Simplified ONNX model could not be validated"
        onnx.save(model_simp, args.output_name)
        logger.info("generated simplified onnx model named {}".format(args.output_name))

if __name__ == "__main__":
    main()

Then run the following command:

python \
export_onnx_tdl_sdk.py \
--output-name ../weights/yolox_s_9_branch.onnx \
-n yolox-s \
--no-onnxsim \
-c ../weights/yolox_s.pth
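Because this export leaves decoding out of the model, the raw branch outputs have to be decoded during post-processing. The sketch below shows this for a single grid cell, following the decoding generally used by the official YOLOX head (grid offset times stride for the center, exponential times stride for width and height, sigmoid for the scores). It is an illustration under that assumption, not the TDL SDK implementation; decode_cell and its arguments are hypothetical names.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_cell(reg, obj_logit, cls_logits, grid_x, grid_y, stride):
    """Decode one grid cell into (x1, y1, x2, y2, score, class_id)."""
    # Box center: raw offset plus the cell index, scaled by the stride.
    cx = (reg[0] + grid_x) * stride
    cy = (reg[1] + grid_y) * stride
    # Box size: exponential of the raw value, scaled by the stride.
    w = math.exp(reg[2]) * stride
    h = math.exp(reg[3]) * stride
    # Final score: objectness probability times the best class probability.
    class_id = max(range(len(cls_logits)), key=lambda i: cls_logits[i])
    score = sigmoid(obj_logit) * sigmoid(cls_logits[class_id])
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, score, class_id)


box = decode_cell([0.5, 0.5, 0.0, 0.0], 0.0, [0.0, 2.0], grid_x=10, grid_y=4, stride=8)
print(box)
```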

7.3. Convert ONNX Model to cvimodel

For the cvimodel conversion, refer to the ONNX-to-cvimodel conversion section in the YOLOv5 porting chapter.

7.4. TDL SDK Interface Description

###Preprocessing parameter settings

Preprocessing parameters are passed in through the following structure:

typedef struct {
  float factor[3];
  float mean[3];
  meta_rescale_type_e rescale_type;

  bool use_quantize_scale;
  PIXEL_FORMAT_E format;
} YoloPreParam;

For YOLOX, the following four parameters need to be passed in:

factor: preprocessing scale factor
mean: preprocessing mean
use_quantize_scale: whether to use the quantization scale of the model; the default is true
format: image format, PIXEL_FORMAT_RGB_888_PLANAR
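As a rough illustration of how factor and mean act on a pixel, the sketch below assumes the common "scale, then subtract the mean" convention. This document does not give the exact formula the SDK applies, so treat the formula and the name preprocess_pixel as assumptions. With factor 1.0 and mean 0.0, the values used by the test code in this section, the pixel values pass through unchanged.

```python
def preprocess_pixel(value, factor, mean):
    # Assumed convention (not confirmed by this document): scale first, then shift.
    return value * factor - mean


pixel = (128, 64, 255)       # one RGB pixel
factor = (1.0, 1.0, 1.0)     # per-channel scale
mean = (0.0, 0.0, 0.0)       # per-channel mean

out = tuple(preprocess_pixel(v, f, m) for v, f, m in zip(pixel, factor, mean))
print(out)
```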

###Algorithm parameter settings

typedef struct {
  uint32_t cls;
} YoloAlgParam;

Pass in the number of classes, for example:

YoloAlgParam p_yolo_param;
p_yolo_param.cls = 80;

The confidence threshold and the NMS threshold are set as follows:

CVI_TDL_SetModelThreshold(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOX, conf_threshold);
CVI_TDL_SetModelNmsThreshold(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOX, nms_threshold);

Here, conf_threshold is the confidence threshold and nms_threshold is the NMS threshold.
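To illustrate what the two thresholds control, here is a minimal sketch of confidence filtering followed by greedy, class-agnostic NMS. It shows the general technique only; whether the SDK applies NMS per class, and how it treats boundary cases, is not stated in this document. The helper names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(dets, conf_threshold, nms_threshold):
    """dets: list of (x1, y1, x2, y2, score). Returns the kept detections."""
    # Drop low-confidence boxes, then visit the rest from highest score down.
    candidates = sorted(
        (d for d in dets if d[4] >= conf_threshold), key=lambda d: d[4], reverse=True
    )
    kept = []
    for det in candidates:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(det, k) <= nms_threshold for k in kept):
            kept.append(det)
    return kept


dets = [(0, 0, 10, 10, 0.9), (1, 1, 11, 11, 0.8), (50, 50, 60, 60, 0.7), (0, 0, 10, 10, 0.3)]
result = filter_and_nms(dets, conf_threshold=0.5, nms_threshold=0.5)
print(result)
```

A higher conf_threshold discards more low-score boxes up front, while a lower nms_threshold suppresses overlapping boxes more aggressively.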

###Test Code

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <chrono>
#include <fstream>
#include <functional>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>
#include "core.hpp"
#include "core/cvi_tdl_types_mem_internal.h"
#include "core/utils/vpss_helper.h"
#include "cvi_tdl.h"
#include "evaluation/cvi_tdl_media.h"
#include "sys_utils.hpp"

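int main(int argc, char* argv[]) {
  // Require the model path and the image path before initializing anything.
  if (argc < 3) {
    printf("Usage: %s <cvimodel path> <image path> [conf_threshold] [nms_threshold]\n", argv[0]);
    return -1;
  }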
  int vpssgrp_width = 1920;
  int vpssgrp_height = 1080;
  CVI_S32 ret = MMF_INIT_HELPER2(vpssgrp_width, vpssgrp_height, PIXEL_FORMAT_RGB_888, 1,
                                vpssgrp_width, vpssgrp_height, PIXEL_FORMAT_RGB_888, 1);
  if (ret != CVI_TDL_SUCCESS) {
    printf("Init sys failed with %#x!\n", ret);
    return ret;
  }

  cvitdl_handle_t tdl_handle = NULL;
  ret = CVI_TDL_CreateHandle(&tdl_handle);
  if (ret != CVI_SUCCESS) {
    printf("Create tdl handle failed with %#x!\n", ret);
    return ret;
  }
  printf("start yolox preprocess config \n");
  // setup preprocess
  YoloPreParam p_preprocess_cfg;

  for (int i = 0; i < 3; i++) {
    p_preprocess_cfg.factor[i] = 1.0;
    p_preprocess_cfg.mean[i] = 0.0;
  }
  p_preprocess_cfg.use_quantize_scale = true;
  p_preprocess_cfg.format = PIXEL_FORMAT_RGB_888_PLANAR;

  printf("start yolo algorithm config \n");
  // setup yolo param
  YoloAlgParam p_yolo_param;
  p_yolo_param.cls = 80;

  printf("setup yolox param \n");
  ret = CVI_TDL_Set_YOLOX_Param(tdl_handle, &p_preprocess_cfg, &p_yolo_param);
  if (ret != CVI_SUCCESS) {
    printf("Can not set YoloX parameters %#x\n", ret);
    return ret;
  }
  // Report success only after the return code has been checked.
  printf("yolox set param success!\n");

  std::string model_path = argv[1];
  std::string str_src_dir = argv[2];

  float conf_threshold = 0.5;
  float nms_threshold = 0.5;
  if (argc > 3) {
    conf_threshold = std::stof(argv[3]);
  }

  if (argc > 4) {
    nms_threshold = std::stof(argv[4]);
  }

  printf("start open cvimodel...\n");
  ret = CVI_TDL_OpenModel(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOX, model_path.c_str());
  if (ret != CVI_SUCCESS) {
    printf("open model failed %#x!\n", ret);
    return ret;
  }
  printf("cvimodel open success!\n");
  // set threshold
  CVI_TDL_SetModelThreshold(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOX, conf_threshold);
  CVI_TDL_SetModelNmsThreshold(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOX, nms_threshold);

  std::cout << "model opened:" << model_path << std::endl;

  VIDEO_FRAME_INFO_S fdFrame;
  ret = CVI_TDL_ReadImage(str_src_dir.c_str(), &fdFrame, PIXEL_FORMAT_RGB_888);
  if (ret != CVI_SUCCESS) {
    std::cout << "Convert out video frame failed with :" << ret << ".file:" << str_src_dir
              << std::endl;
    // No frame to run inference on: release the handle and stop.
    CVI_TDL_DestroyHandle(tdl_handle);
    return ret;
  }
  std::cout << "CVI_TDL_ReadImage done!\n";

  cvtdl_object_t obj_meta = {0};

  CVI_TDL_YoloX(tdl_handle, &fdFrame, &obj_meta);

  printf("detect number: %d\n", obj_meta.size);
  for (uint32_t i = 0; i < obj_meta.size; i++) {
    printf("detect res: %f %f %f %f %f %d\n", obj_meta.info[i].bbox.x1, obj_meta.info[i].bbox.y1,
          obj_meta.info[i].bbox.x2, obj_meta.info[i].bbox.y2, obj_meta.info[i].bbox.score,
          obj_meta.info[i].classes);
  }

  CVI_VPSS_ReleaseChnFrame(0, 0, &fdFrame);
  CVI_TDL_Free(&obj_meta);
  CVI_TDL_DestroyHandle(tdl_handle);

  return ret;
}

7.5. Test Result

The YOLOX ONNX models and the converted models were tested on the CV181x, CV182x, and CV183x platforms with the following parameter settings:

Confidence threshold: 0.001
NMS threshold: 0.65
Resolution: 640 x 640

Performance of the YOLOX-S ONNX model exported with the official method:

Platform	Inference time (ms)	Bandwidth (MB)	ION (MB)	mAP 0.5	mAP 0.5-0.95
pytorch	N/A	N/A	N/A	59.3	40.5
cv181x	131.95	104.46	16.43	Quantization failure	Quantization failure
cv182x	95.75	104.85	16.41	Quantization failure	Quantization failure
cv183x	Quantization failure	Quantization failure	Quantization failure	Quantization failure	Quantization failure

Performance of the YOLOX-S ONNX model exported with the TDL_SDK method:

Platform	Inference time (ms)	Bandwidth (MB)	ION (MB)	mAP 0.5	mAP 0.5-0.95
onnx	N/A	N/A	N/A	53.1767	36.4747
cv181x	127.91	95.44	16.24	52.4016	35.4241
cv182x	91.67	95.83	16.22	52.4016	35.4241
cv183x	30.6	65.25	14.93	52.4016	35.4241

Performance of the YOLOX-M ONNX model exported with the official method:

Platform	Inference time (ms)	Bandwidth (MB)	ION (MB)	mAP 0.5	mAP 0.5-0.95
pytorch	N/A	N/A	N/A	65.6	46.9
cv181x	ION allocation failure	ION allocation failure	39.18	Quantization failure	Quantization failure
cv182x	246.1	306.31	39.16	Quantization failure	Quantization failure
cv183x	Quantization failure	Quantization failure	Quantization failure	Quantization failure	Quantization failure

Performance of the YOLOX-M ONNX model exported with the TDL_SDK method:

Platform	Inference time (ms)	Bandwidth (MB)	ION (MB)	mAP 0.5	mAP 0.5-0.95
onnx	N/A	N/A	N/A	59.9411	43.0057
cv181x	ION allocation failure	ION allocation failure	38.95	59.3559	42.1688
cv182x	297.5	242.65	38.93	59.3559	42.1688
cv183x	75.8	144.97	33.5	59.3559	42.1688
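One way to read the TDL_SDK export tables is to compute how much mAP 0.5-0.95 is lost between the ONNX model and the quantized model on the boards. The short sketch below does that arithmetic with the figures from the tables above; map_drop is a hypothetical helper name.

```python
def map_drop(onnx_map, board_map):
    """Return (absolute drop in points, relative drop in percent)."""
    drop = onnx_map - board_map
    return drop, 100.0 * drop / onnx_map


# mAP 0.5-0.95 values copied from the TDL_SDK export tables.
results = {
    "yolox-s": (36.4747, 35.4241),
    "yolox-m": (43.0057, 42.1688),
}

for name, (onnx_map, board_map) in results.items():
    drop, relative = map_drop(onnx_map, board_map)
    print(f"{name}: drop = {drop:.4f} points ({relative:.2f}% relative)")
```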