6. Compile the Caffe model
This chapter takes mobilenet_v2_deploy.prototxt and mobilenet_v2.caffemodel as examples to show how to compile a Caffe model and run it on the BM1684X TPU platform.
This chapter requires the following files (where xxxx corresponds to the actual version information):
tpu-mlir_xxxx.tar.gz (The release package of tpu-mlir)
6.1. Load tpu-mlir
The following operations must be performed in a Docker container. For how to use Docker, please refer to Setup Docker Container.
$ tar zxf tpu-mlir_xxxx.tar.gz
$ source tpu-mlir_xxxx/envsetup.sh
envsetup.sh adds the following environment variables:
| Name | Value | Explanation |
|---|---|---|
| TPUC_ROOT | tpu-mlir_xxx | The location of the SDK package after decompression |
| MODEL_ZOO_PATH | ${TPUC_ROOT}/../model-zoo | The location of the model-zoo folder, at the same level as the SDK |
envsetup.sh modifies the environment variables as follows:
export PATH=${TPUC_ROOT}/bin:$PATH
export PATH=${TPUC_ROOT}/python/tools:$PATH
export PATH=${TPUC_ROOT}/python/utils:$PATH
export PATH=${TPUC_ROOT}/python/test:$PATH
export PATH=${TPUC_ROOT}/python/samples:$PATH
export LD_LIBRARY_PATH=$TPUC_ROOT/lib:$LD_LIBRARY_PATH
export PYTHONPATH=${TPUC_ROOT}/python:$PYTHONPATH
export MODEL_ZOO_PATH=${TPUC_ROOT}/../model-zoo
export REGRESSION_PATH=${TPUC_ROOT}/regression
6.2. Prepare working directory
Create a mobilenet_v2 directory at the same level as the tpu-mlir directory, and put both the model files and the image files into it.
The operation is as follows:
$ mkdir mobilenet_v2 && cd mobilenet_v2
$ cp $TPUC_ROOT/regression/model/mobilenet_v2_deploy.prototxt .
$ cp $TPUC_ROOT/regression/model/mobilenet_v2.caffemodel .
$ cp -rf $TPUC_ROOT/regression/dataset/ILSVRC2012 .
$ cp -rf $TPUC_ROOT/regression/image .
$ mkdir workspace && cd workspace
$TPUC_ROOT is an environment variable corresponding to the tpu-mlir_xxxx directory.
6.3. Caffe to MLIR
The model in this example has BGR input, with per-channel mean values of 103.94, 116.78, 123.68 and scale values of 0.017, 0.017, 0.017.
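The preprocessing these options describe can be written out explicitly: each BGR pixel is normalized as (x - mean) * scale per channel, and the 256x256 resized image is reduced to the 224x224 network input. A minimal numpy sketch (the dummy image and the center-crop step are illustrative assumptions, not code taken from model_transform.py):

```python
import numpy as np

# BGR channel means and scales from the conversion command below
mean = np.array([103.94, 116.78, 123.68], dtype=np.float32)   # B, G, R
scale = np.array([0.017, 0.017, 0.017], dtype=np.float32)

# A dummy 256x256 BGR image standing in for the resized input
# (--resize_dims 256,256)
img = np.full((256, 256, 3), 128.0, dtype=np.float32)

# Per-channel normalization: (pixel - mean) * scale
normalized = (img - mean) * scale

# Assumed center crop down to the model's 224x224 input
top = (256 - 224) // 2
cropped = normalized[top:top + 224, top:top + 224, :]

# NCHW layout matching --input_shapes [[1,3,224,224]]
nchw = cropped.transpose(2, 0, 1)[np.newaxis, ...]
print(nchw.shape)  # (1, 3, 224, 224)
```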
The model conversion command:
$ model_transform.py \
--model_name mobilenet_v2 \
--model_def ../mobilenet_v2_deploy.prototxt \
--model_data ../mobilenet_v2.caffemodel \
--input_shapes [[1,3,224,224]] \
--resize_dims=256,256 \
--mean 103.94,116.78,123.68 \
--scale 0.017,0.017,0.017 \
--pixel_format bgr \
--test_input ../image/cat.jpg \
--test_result mobilenet_v2_top_outputs.npz \
--mlir mobilenet_v2.mlir
After converting to an mlir file, a ${model_name}_in_f32.npz file is generated; this is the input file of the model.
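Since an .npz file is a standard numpy archive, it can be inspected directly with numpy. A sketch using a synthetic stand-in for mobilenet_v2_in_f32.npz (the tensor name "input" here is hypothetical; the real name follows the model's input layer):

```python
import io
import numpy as np

# Build a stand-in archive with one named float32 array per model input,
# the general shape of the ${model_name}_in_f32.npz file
buf = io.BytesIO()
np.savez(buf, input=np.zeros((1, 3, 224, 224), dtype=np.float32))
buf.seek(0)

# Inspect it the same way you would mobilenet_v2_in_f32.npz
data = np.load(buf)
for name in data.files:
    print(name, data[name].shape, data[name].dtype)
```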
6.4. MLIR to F32 bmodel
Convert the mlir file to an f32 bmodel as follows:
$ model_deploy.py \
--mlir mobilenet_v2.mlir \
--quantize F32 \
--chip bm1684x \
--test_input mobilenet_v2_in_f32.npz \
--test_reference mobilenet_v2_top_outputs.npz \
--tolerance 0.99,0.99 \
--model mobilenet_v2_1684x_f32.bmodel
After compilation, a file named mobilenet_v2_1684x_f32.bmodel is generated.
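The --tolerance values are similarity thresholds that model_deploy.py checks between the compiled model's outputs and the reference outputs. As an illustration of how such a check can work, here is a cosine-similarity sketch on synthetic tensors (a stand-in, not tpu-mlir's internal comparison code):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened tensors."""
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic "reference" vs. "deployed" outputs with tiny numeric drift
rng = np.random.default_rng(0)
ref = rng.standard_normal(1000).astype(np.float32)
out = ref + rng.standard_normal(1000).astype(np.float32) * 1e-3

sim = cosine_similarity(ref, out)
print(f"cosine similarity: {sim:.6f}")
```

With drift this small, the similarity stays well above the 0.99 threshold used in the F32 command above; a miscompiled model would fall below it.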
6.5. MLIR to INT8 bmodel
6.5.1. Calibration table generation
Before converting to an INT8 model, you need to run calibration to obtain a calibration table. The number of input images is typically 100 to 1000, depending on the situation.
The calibration table is then used to generate either a symmetric or an asymmetric bmodel. If the symmetric model already meets the requirements, the asymmetric one is generally not recommended, because its performance is slightly worse than the symmetric model's.
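The difference between the two modes can be sketched with the standard INT8 quantization formulas: symmetric quantization maps [-threshold, threshold] onto [-127, 127] with a zero point of 0, while asymmetric quantization maps [min, max] onto [-128, 127] with a nonzero zero point. A minimal illustration (general formulas, not tpu-mlir's internal code):

```python
import numpy as np

def quant_symmetric(x, threshold):
    # Zero point fixed at 0; scale maps [-threshold, threshold] -> [-127, 127]
    scale = threshold / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def quant_asymmetric(x, xmin, xmax):
    # Nonzero zero point; scale maps [xmin, xmax] -> [-128, 127]
    scale = (xmax - xmin) / 255.0
    zero_point = int(round(-128 - xmin / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

x = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)
q_sym, s_sym = quant_symmetric(x, threshold=2.0)
q_asym, s_asym, zp = quant_asymmetric(x, xmin=-1.0, xmax=2.0)
print(q_sym, q_asym, zp)
```

The asymmetric form uses the full [-128, 127] range for skewed distributions at the cost of carrying a zero point through every computation, which is one reason its runtime performance tends to be slightly worse.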
Here is an example that uses 100 images from ILSVRC2012 for calibration:
$ run_calibration.py mobilenet_v2.mlir \
--dataset ../ILSVRC2012 \
--input_num 100 \
-o mobilenet_v2_cali_table
After running the command above, a file named mobilenet_v2_cali_table is generated; it is used as the input for the subsequent compilation of the INT8 model.
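The calibration table is a plain-text file. Assuming each non-comment line holds an operator name followed by its calibration values (a hypothetical layout, shown here only for illustration), it could be parsed like this:

```python
# Hypothetical calibration-table layout: '#' comment lines, then one
# "op_name value value ..." record per line. Illustrative only.
sample = """\
# comment / header line
input 8.0 -7.9 7.9
conv1 4.2 -3.8 4.1
"""

table = {}
for line in sample.splitlines():
    line = line.strip()
    if not line or line.startswith("#"):
        continue
    name, *values = line.split()
    table[name] = [float(v) for v in values]

print(table["conv1"])  # [4.2, -3.8, 4.1]
```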
6.5.2. Compile to INT8 symmetric quantized model
Execute the following command to convert to the INT8 symmetric quantized model:
$ model_deploy.py \
--mlir mobilenet_v2.mlir \
--quantize INT8 \
--calibration_table mobilenet_v2_cali_table \
--chip bm1684x \
--test_input mobilenet_v2_in_f32.npz \
--test_reference mobilenet_v2_top_outputs.npz \
--tolerance 0.96,0.70 \
--model mobilenet_v2_1684x_int8_sym.bmodel
After compilation, a file named mobilenet_v2_1684x_int8_sym.bmodel is generated.
6.5.3. Compile to INT8 asymmetric quantized model
Execute the following command to convert to the INT8 asymmetric quantized model:
$ model_deploy.py \
--mlir mobilenet_v2.mlir \
--quantize INT8 \
--asymmetric \
--calibration_table mobilenet_v2_cali_table \
--chip bm1684x \
--test_input mobilenet_v2_in_f32.npz \
--test_reference mobilenet_v2_top_outputs.npz \
--tolerance 0.95,0.69 \
--model mobilenet_v2_1684x_int8_asym.bmodel
After compilation, a file named mobilenet_v2_1684x_int8_asym.bmodel is generated.