6.3. Python Programming Details

SophonSDK provides a Python programming interface to users through the SAIL library.

SAIL (Sophon Artificial Intelligent Library) is the core module of Sophon Inference. SAIL encapsulates BMLib, BMDecoder, BMCV and BMRuntime from SophonSDK, abstracting the SDK's original capabilities, such as "loading a bmodel and driving the TPU for inference", "driving the TPU+VPP for image processing" and "driving the VPU for image and video decoding", into simpler C++ interfaces. These are then wrapped with pybind11 to provide a concise and easy-to-use Python interface.

This section introduces Python interface programming using the YOLOv5 detection algorithm as an example.

Note

Sample code path: sophon-demo/sample/YOLOV5

At present, all classes, enumerations and functions of the SAIL module live in the "sail" namespace. The core classes include the following (a short usage sketch follows the list):

  • Handle: wrapper class of bm_handle_t in BMLib; the device handle holding the context information used to interact with the kernel driver.

  • Tensor: wrapper class of BMLib in the SDK; encapsulates device memory management and synchronization with system memory.

  • Engine: wrapper class of BMRuntime in the SDK; loads a bmodel and drives the TPU for inference. An Engine instance can load an arbitrary bmodel and automatically manages the memory of its input and output tensors.

  • Decoder: uses the VPU to decode video and the JPU to decode images; both are hardware decoders.

  • Bmcv: wrapper class of BMCV in the SDK; encapsulates a series of image-processing functions and can drive the TPU for image processing.
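A minimal sketch of how these classes typically fit together is shown below. It is illustrative only: the model and video file names and the device id are placeholders, and the sail.Decoder constructor arguments are assumed from the general SAIL API rather than taken from the sample code.

import sophon.sail as sail

# Load a bmodel on device 0; the Engine holds a device Handle internally.
engine = sail.Engine("yolov5s.bmodel", 0, sail.IOMode.SYSIO)   # placeholder model path
handle = engine.get_handle()

# Bmcv reuses the same handle for hardware-accelerated image processing.
bmcv = sail.Bmcv(handle)

# Decoder reads frames as BMImage objects using the VPU/JPU (assumed arguments:
# file path, compressed-output flag, device id).
decoder = sail.Decoder("test.mp4", True, 0)                    # placeholder video path
frame = decoder.read(handle)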

For more information about the interface, please read the SAIL User Development Manual.

This section mainly covers the following three aspects:

  • Loading model

  • Preprocessing

  • Inference

6.3.1. Loading model

import sophon.sail as sail

...

# Create an Engine: load the bmodel onto the given device with the chosen IO mode.
engine = sail.Engine(model_path, device_id, io_mode)

...
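After loading, the Engine exposes getters for graph and tensor metadata; the same calls appear in the inference code in 6.3.3. A brief sketch, assuming engine was created as above:

graph_name = engine.get_graph_names()[0]

for name in engine.get_input_names(graph_name):
    shape = engine.get_input_shape(graph_name, name)
    dtype = engine.get_input_dtype(graph_name, name)
    scale = engine.get_input_scale(graph_name, name)   # quantization scale of the input tensor
    print(name, shape, dtype, scale)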

6.3.2. Preprocessing

import numpy as np
import sophon.sail as sail


class PreProcess:
    def __init__(self, width, height, batch_size, img_dtype, input_scale=None):

        self.std = np.array([255., 255., 255.], dtype=np.float32)
        self.batch_size = batch_size
        self.input_scale = float(1.0) if input_scale is None else input_scale
        self.img_dtype = img_dtype

        self.width = width
        self.height = height
        self.use_resize_padding = True
        self.use_vpp = False
        ...

    def resize(self, img, handle, bmcv):

        if self.use_resize_padding:
            # Letterbox resize: keep the aspect ratio and pad the borders with gray (114).
            img_w = img.width()
            img_h = img.height()
            r_w = self.width / img_w
            r_h = self.height / img_h

            if r_h > r_w:
                tw = self.width
                th = int(r_w * img_h)
                tx1 = tx2 = 0
                ty1 = int((self.height - th) / 2)
                ty2 = self.height - th - ty1
            else:
                tw = int(r_h * img_w)
                th = self.height
                tx1 = int((self.width - tw) / 2)
                tx2 = self.width - tw - tx1
                ty1 = ty2 = 0

            ratio = (min(r_w, r_h), min(r_w, r_h))
            txy = (tx1, ty1)
            attr = sail.PaddingAtrr()
            attr.set_stx(tx1)
            attr.set_sty(ty1)
            attr.set_w(tw)
            attr.set_h(th)
            attr.set_r(114)
            attr.set_g(114)
            attr.set_b(114)

            tmp_planar_img = sail.BMImage(handle, img.height(), img.width(),
                                          sail.Format.FORMAT_RGB_PLANAR, sail.DATA_TYPE_EXT_1N_BYTE)
            bmcv.convert_format(img, tmp_planar_img)
            preprocess_fn = bmcv.vpp_crop_and_resize_padding if self.use_vpp else bmcv.crop_and_resize_padding
            resized_img_rgb = preprocess_fn(tmp_planar_img,
                                            0, 0, img.width(), img.height(),
                                            self.width, self.height, attr)
        else:
            # Plain resize without preserving the aspect ratio.
            r_w = self.width / img.width()
            r_h = self.height / img.height()
            ratio = (r_w, r_h)
            txy = (0, 0)
            tmp_planar_img = sail.BMImage(handle, img.height(), img.width(),
                                          sail.Format.FORMAT_RGB_PLANAR, sail.DATA_TYPE_EXT_1N_BYTE)
            bmcv.convert_format(img, tmp_planar_img)
            preprocess_fn = bmcv.vpp_resize if self.use_vpp else bmcv.resize
            resized_img_rgb = preprocess_fn(tmp_planar_img, self.width, self.height)
        return resized_img_rgb, ratio, txy

    ...

    def norm_batch(self, resized_images, handle, bmcv):

        # Pick the BMImageArray class matching the batch size, e.g. sail.BMImageArray4D.
        bm_array = eval('sail.BMImageArray{}D'.format(self.batch_size))

        preprocessed_imgs = bm_array(handle,
                                     self.height,
                                     self.width,
                                     sail.FORMAT_RGB_PLANAR,
                                     self.img_dtype)

        # Per-channel linear normalization, folded with the input tensor's quantization scale.
        a = 1 / self.std
        b = (0, 0, 0)
        alpha_beta = tuple([(ia * self.input_scale, ib * self.input_scale) for ia, ib in zip(a, b)])

        # do convert_to
        bmcv.convert_to(resized_images, preprocessed_imgs, alpha_beta)
        return preprocessed_imgs
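The convert_to call applies a per-channel linear transform output = alpha * pixel + beta on the TPU; alpha_beta folds the 1/255 normalization together with the input tensor's quantization scale. A small CPU-side sketch of the same computation, assuming an FP32 model whose input scale is 1.0:

import numpy as np

input_scale = 1.0                      # FP32 model; an INT8 model would use engine.get_input_scale(...)
std = np.array([255., 255., 255.], dtype=np.float32)

a = 1 / std
b = (0, 0, 0)
alpha_beta = tuple((ia * input_scale, ib * input_scale) for ia, ib in zip(a, b))
print(alpha_beta)                      # approximately ((1/255, 0.0), (1/255, 0.0), (1/255, 0.0))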

6.3.3. Inference

from collections import OrderedDict

import sophon.sail as sail


class SophonInference:
    def __init__(self, **kwargs):

        ...

        self.io_mode = sail.IOMode.SYSIO
        self.engine = sail.Engine(self.model_path, self.device_id, self.io_mode)
        self.handle = self.engine.get_handle()
        self.graph_name = self.engine.get_graph_names()[0]
        self.bmcv = sail.Bmcv(self.handle)

        ...

        # Create an input Tensor for each input of the graph.
        input_names = self.engine.get_input_names(self.graph_name)
        for input_name in input_names:

            input_shape = self.engine.get_input_shape(self.graph_name, input_name)
            input_dtype = self.engine.get_input_dtype(self.graph_name, input_name)
            input_scale = self.engine.get_input_scale(self.graph_name, input_name)
            ...
            if self.input_mode:
                input = sail.Tensor(self.handle, input_shape, input_dtype, True, True)
            ...
            input_tensors[input_name] = input
            ...

        # Create an output Tensor for each output of the graph.
        output_names = self.engine.get_output_names(self.graph_name)

        for output_name in output_names:

            output_shape = self.engine.get_output_shape(self.graph_name, output_name)
            output_dtype = self.engine.get_output_dtype(self.graph_name, output_name)
            output_scale = self.engine.get_output_scale(self.graph_name, output_name)
            ...
            if self.input_mode:
                output = sail.Tensor(self.handle, output_shape, output_dtype, True, True)
            ...
            output_tensors[output_name] = output
            ...

    def infer_bmimage(self, input_data):
        # Bind the preprocessed image data to the input tensors.
        self.get_input_feed(self.input_names, input_data)

        # Inference: run the graph on the TPU with the prepared input/output tensors.
        self.engine.process(self.graph_name, self.input_tensors, self.output_tensors)

        # Copy the outputs back to system memory and apply the output scales.
        outputs_dict = OrderedDict()
        for name in self.output_names:
            outputs_dict[name] = self.output_tensors[name].asnumpy().copy() * self.output_scales[name]
        return outputs_dict
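The multiplication by output_scales[name] dequantizes the raw output: for an INT8 bmodel the engine returns quantized values and get_output_scale supplies the factor back to floating point, while for an FP32 bmodel the scale is typically 1.0 and the multiplication is a no-op. A small illustrative sketch of that last step (the scale and raw values below are made up):

import numpy as np

output_scale = 0.0625                                   # illustrative value, as returned by get_output_scale
raw = np.array([-12, 0, 37, 120], dtype=np.int8)        # made-up quantized model output
dequantized = raw.astype(np.float32) * output_scale     # what infer_bmimage returns per output tensor
print(dequantized)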