7. Use TPU for Preprocessing

At present, the two main series of chips supported by TPU-MLIR are BM168x and CV18xx. Both of them support common image preprocessing fusion. The developer can pass the preprocessing arguments during the compilation process, and the compiler will directly insert the corresponding preprocessing operators into the generated model. The generated bmodel or cvimodel can directly use the unpreprocessed image as input and use TPU to do the preprocessing.

Table 7.1 Supported Preprocessing Type
Preprocessing Type	BM168x	CV18xx
Crop	True	True
Normalization	True	True
NHWC to NCHW	True	True
BGR/ RGB Conversion	True	True

The image cropping will first adjust the image to the size specified by the “–resize_dims” argument of the model_transform tool, and then crop it to the size of the model input. The normalization supports directly converting unpreprocessed image data.

To integrate preprocessing into the model, you need to speficy the “–fuse_preprocess” argument when using the model_deploy tool, and the test_input should be an image of the original format (i.e., jpg, jpeg and png format). There will be a preprocessed npz file of input named ${model_name}_in_ori.npz generated. In addition, there is a “–customization_format” argument to specify the original image format input to the model. The supported image formats are described as follows:

Table 7.2 Types of customization_format and Description
customization_format	Description	BM168x	CV18xx
None	same with model format, do nothing, as default	True	True
RGB_PLANAR	rgb color order and nchw tensor format	True	True
RGB_PACKED	rgb color order and nhwc tensor format	True	True
BGR_PLANAR	bgr color order and nchw tensor format	True	True
BGR_PACKED	bgr color order and nhwc tensor format	True	True
GRAYSCALE	one color channel only and nchw tensor format	True	True
YUV420_PLANAR	yuv420 planner format, from vpss input	False	True
YUV_NV21	NV21 format of yuv420, from vpss input	False	True
YUV_NV12	NV12 format of yuv420, from vpss input	False	True
RGBA_PLANAR	rgba format and nchw tensor format	False	True

The “YUV*” type format is the special input format of CV18xx series chips. When the order of the color channels in the customization_format is different from the model input, a channel conversion operation will be performed. If the customization_format argument is not specified, the corresponding customization_format will be automatically set according to the pixel_format and channel_format arguments defined when using the model_transform tool.

7.1. Model Deployment Example

Take the mobilenet_v2 model as an example, use the model_transform tool to generate the original mlir, and the run_calibration tool to generate the calibration table (refer to the chapter “Compiling the Caffe Model” for more details).

7.1.1. Deploy to BM168x

The command to generate the preprocess-fused symmetric INT8 quantized bmodel model is as follows:

$ model_deploy.py \
    --mlir mobilenet_v2.mlir \
    --quantize INT8 \
    --calibration_table mobilenet_v2_cali_table \
    --chip bm1684x \
    --test_input ../image/cat.jpg \
    --test_reference mobilenet_v2_top_outputs.npz \
    --tolerance 0.96,0.70 \
    --fuse_preprocess \
    --model mobilenet_v2_bm1684x_int8_sym_fuse_preprocess.bmodel

7.1.2. Deploy to CV18xx

The command to generate the preprocess-fused symmetric INT8 quantized cvimodel model are as follows:

$ model_deploy.py \
    --mlir mobilenet_v2.mlir \
    --quantize INT8 \
    --calibration_table mobilenet_v2_cali_table \
    --chip cv183x \
    --test_input ../image/cat.jpg \
    --test_reference mobilenet_v2_top_outputs.npz \
    --tolerance 0.96,0.70 \
    --fuse_preprocess \
    --customization_format RGB_PLANAR \
    --model mobilenet_v2_cv183x_int8_sym_fuse_preprocess.cvimodel

When the input data comes from the video post-processing module VPSS provided by CV18xx (for details on how to use VPSS for preprocessing, please refer to “CV18xx Media Software Development Reference”), data alignment is required (e.g., 32-bit aligned width). The command to generate the preprocessed-fused cvimodel model is as follows:

$ model_deploy.py \
    --mlir mobilenet_v2.mlir \
    --quantize INT8 \
    --calibration_table mobilenet_v2_cali_table \
    --chip cv183x \
    --test_input ../image/cat.jpg \
    --test_reference mobilenet_v2_top_outputs.npz \
    --tolerance 0.96,0.70 \
    --fuse_preprocess \
    --customization_format RGB_PLANAR \
    --aligned_input \
    --model mobilenet_v2_cv183x_int8_sym_fuse_preprocess_aligned.cvimodel

In the above command, aligned_input specifies the alignment that the model input needs to do. It should be noted that both fuse_preprocess and aligned_input need to be done for input data in YUV format. The rest formats can do both or only one of the two operations. If you only do aligned_input operation, you need to set test_input to the preprocessed ${model_name}_in_f32.npz format, which is consistent with the setting in the chapter “Compile ONNX model”.