5.2. Using TPU-MLIR to Generate FLOAT BModel
The SOPHON series of intelligent vision deep learning processor platforms only support BModel model acceleration. Users need to perform model migration first, converting trained models from other frameworks into BModel format that can run on the SOPHON series of intelligent vision deep learning processors.
TPU-MLIR Toolkit
The TPU-MLIR toolkit converts trained models into BModel models that the SOPHON series platforms can accelerate.
Note
MLIR currently supports the PyTorch, ONNX, TFLite, and Caffe frameworks; support for more network layers and models is being added continuously. For specific steps, please refer to the following usage guidance:
| Function | Usage Guidance |
|---|---|
| Compiling PyTorch models | |
| Compiling ONNX models | |
| Compiling TFLite models | |
| Compiling Caffe models | |
This article uses an ONNX model as an example to introduce how to use TPU-MLIR to convert a model into an FP32 BModel and deploy it. For configuring the development environment, please refer to the TPU-MLIR Quick Start Manual.
Note
This chapter uses the yolov5s.onnx model as an example to introduce how to compile and migrate an ONNX model to run on the BM1684X platform. For other models, please refer to the usage guidance above.
The model comes from the official website of yolov5: https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5s.onnx
This section requires the following files (where xxxx corresponds to the actual version information): tpu-mlir_xxxx.tar.gz (tpu-mlir release package)
5.2.1. Loading tpu-mlir
The following operations need to be performed in a Docker container. For the use of Docker, please refer to the TPU-MLIR Quick Start Manual.
$ tar zxf tpu-mlir_xxxx.tar.gz
$ source tpu-mlir_xxxx/envsetup.sh
envsetup.sh will add the following environment variables:
| Variable Name | Value | Description |
|---|---|---|
| TPUC_ROOT | tpu-mlir_xxxx | Location of the unpacked SDK package |
| MODEL_ZOO_PATH | ${TPUC_ROOT}/../model-zoo | Location of the model-zoo folder, at the same level as the SDK |
| REGRESSION_PATH | ${TPUC_ROOT}/regression | Location of the regression folder |
The modifications made by envsetup.sh to the environment variables are as follows:
export PATH=${TPUC_ROOT}/bin:$PATH
export PATH=${TPUC_ROOT}/python/tools:$PATH
export PATH=${TPUC_ROOT}/python/utils:$PATH
export PATH=${TPUC_ROOT}/python/test:$PATH
export PATH=${TPUC_ROOT}/python/samples:$PATH
export PATH=${TPUC_ROOT}/customlayer/python:$PATH
export LD_LIBRARY_PATH=$TPUC_ROOT/lib:$LD_LIBRARY_PATH
export PYTHONPATH=${TPUC_ROOT}/python:$PYTHONPATH
export PYTHONPATH=${TPUC_ROOT}/customlayer/python:$PYTHONPATH
export MODEL_ZOO_PATH=${TPUC_ROOT}/../model-zoo
export REGRESSION_PATH=${TPUC_ROOT}/regression
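Before proceeding, it can help to confirm the environment was set up correctly. The following is an illustrative sketch (not part of the toolkit) that checks for the variables exported by envsetup.sh and verifies that the conversion tools are on PATH:

```python
import os
import shutil

# Variables that envsetup.sh is expected to export (see table above).
required_vars = ["TPUC_ROOT", "MODEL_ZOO_PATH", "REGRESSION_PATH"]

def check_env(env):
    """Return the names of required variables missing from env."""
    return [name for name in required_vars if name not in env]

missing = check_env(os.environ)
if missing:
    print("Missing variables:", ", ".join(missing))
else:
    # model_transform.py should be reachable via the PATH entries above.
    print("Environment looks ready; model_transform.py on PATH:",
          shutil.which("model_transform.py") is not None)
```

If any variable is reported missing, re-run `source tpu-mlir_xxxx/envsetup.sh` in the current shell.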
5.2.2. Preparing the working directory
Create a directory named yolov5s_onnx, at the same level as tpu-mlir, and put the model file and image files into it.
The operations are as follows:
$ mkdir yolov5s_onnx && cd yolov5s_onnx
$ wget https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5s.onnx
$ cp -rf $TPUC_ROOT/regression/dataset/COCO2017 .
$ cp -rf $TPUC_ROOT/regression/image .
$ mkdir workspace && cd workspace
Here, $TPUC_ROOT is an environment variable corresponding to the tpu-mlir_xxxx directory.
5.2.3. ONNX to MLIR
If the model takes image input, it is necessary to understand the model's preprocessing requirements before conversion. If the model uses a preprocessed npz file as input, preprocessing does not need to be considered.
The preprocessing process is expressed by the following formula (where x represents the input and y the preprocessed result):

y = (x − mean) × scale

The official yolov5 uses RGB images, with each value multiplied by 1/255; this corresponds to a mean of 0.0,0.0,0.0 and a scale of 0.0039216,0.0039216,0.0039216.
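The mean/scale normalization can be sketched numerically as follows (an illustrative example, not part of the conversion tool), showing how a scale of 1/255 ≈ 0.0039216 maps pixel values into the [0, 1] range:

```python
import numpy as np

# Preprocessing applied during conversion: y = (x - mean) * scale.
# Values below match the yolov5 example: mean = 0, scale = 1/255.
mean = np.array([0.0, 0.0, 0.0]).reshape(3, 1, 1)
scale = np.array([0.0039216, 0.0039216, 0.0039216]).reshape(3, 1, 1)

def preprocess(x):
    """Apply per-channel normalization to a CHW image array."""
    return (x.astype(np.float32) - mean) * scale

# A white image (all 255) maps to values close to 1.0.
pixels = np.full((3, 2, 2), 255, dtype=np.uint8)
print(preprocess(pixels).max())  # 255 * 0.0039216, i.e. approximately 1.0
```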
The model conversion command is as follows:
$ model_transform.py \
--model_name yolov5s \
--model_def ../yolov5s.onnx \
--input_shapes [[1,3,640,640]] \
--mean 0.0,0.0,0.0 \
--scale 0.0039216,0.0039216,0.0039216 \
--keep_aspect_ratio \
--pixel_format rgb \
--output_names 350,498,646 \
--test_input ../image/dog.jpg \
--test_result yolov5s_top_outputs.npz \
--mlir yolov5s.mlir
The main parameters of model_transform.py are explained as follows (for a complete introduction, please refer to the user interface chapter of the TPU-MLIR Development Reference Manual):
| Parameter name | Required? | Description |
|---|---|---|
| model_name | Yes | Specify the model name |
| model_def | Yes | Specify the model definition file, such as an .onnx, .tflite, or .prototxt file |
| input_shapes | No | Specify the shape of the input, e.g. [[1,3,640,640]]; a two-dimensional array that can support multiple inputs |
| input_types | No | Specify the type of the input, e.g. int32; multiple inputs are separated by commas; defaults to float32 if not specified |
| resize_dims | No | The dimensions to which the original image is resized; if not specified, it is resized to the model's input size |
| keep_aspect_ratio | No | Whether to maintain the aspect ratio when resizing; defaults to false; when set, missing areas are padded with 0 |
| mean | No | The mean of each channel of the image; defaults to 0.0,0.0,0.0 |
| scale | No | The scale of each channel of the image; defaults to 1.0,1.0,1.0 |
| pixel_format | No | Image type; one of rgb, bgr, gray, rgbd; defaults to bgr |
| channel_format | No | Channel layout; nhwc or nchw for image input, none for non-image input; defaults to nchw |
| output_names | No | Specify the names of the outputs; if not specified, the model's own output names are used; when specified, the named tensors become the outputs |
| test_input | No | Specify an input file for validation: an image, .npy, or .npz file; if not specified, no correctness validation is performed |
| test_result | No | Specify the output file for the validation result |
| excepts | No | Names of network layers to exclude from validation, separated by commas |
| mlir | Yes | Specify the name and path of the output mlir file |
After converting to an mlir file, a ${model_name}_in_f32.npz file will be generated, which is the input file of the model.
To convert the mlir file into an F32 BModel, run:
$ model_deploy.py \
--mlir yolov5s.mlir \
--quantize F32 \
--processor bm1684x \
--test_input yolov5s_in_f32.npz \
--test_reference yolov5s_top_outputs.npz \
--tolerance 0.99,0.99 \
--model yolov5s_1684x_f32.bmodel
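The --tolerance pair sets the minimum similarity that the deployed model's outputs must reach against the reference data from the previous step. The sketch below shows a cosine-similarity check of the kind used for such comparisons; it is illustrative only, not the toolkit's actual implementation:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two tensors, flattened to 1-D."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two nearly identical outputs score close to 1.0 and pass a 0.99 threshold.
ref = np.array([1.0, 2.0, 3.0])
out = np.array([1.0, 2.0, 3.001])
print(cosine_similarity(ref, out) >= 0.99)  # True
```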
The main parameters of model_deploy.py are explained as follows (for a complete introduction, please refer to the user interface chapter of the TPU-MLIR Development Reference Manual):
| Parameter name | Required? | Description |
|---|---|---|
| mlir | Yes | Specify the mlir file |
| quantize | Yes | Specify the default quantization type; supports F32/F16/BF16/INT8 |
| processor | Yes | Specify the platform the model will run on; supports bm1684x/bm1684/cv183x/cv182x/cv181x/cv180x |
| calibration_table | No | Path of the calibration table; required when INT8 quantization is used |
| tolerance | No | The minimum similarity allowed between the MLIR quantized results and the MLIR fp32 inference results |
| test_input | No | Specify an input file for validation: an image, .npy, or .npz file; if not specified, no correctness validation is performed |
| test_reference | No | Reference data (in npz format) for validating the model's correctness; the computed result of each operator |
| compare_all | No | Whether to compare all intermediate results during validation; by default intermediate results are not compared |
| excepts | No | Names of network layers to exclude from validation, separated by commas |
| model | Yes | Specify the name and path of the output model file |
After compilation, a file named yolov5s_1684x_f32.bmodel will be generated.
Note
For specific deployment and testing methods, please refer to 5.1.3.7 Model Performance Testing.