Model compilation

Use of BMNETC

As the model compiler for Caffe, BMNETC compiles a network's caffemodel and prototxt into the files required by BMRuntime. During compilation, the results computed by each layer on the NPU are compared against the CPU results to ensure correctness. The compiler is used as follows.

  1. Requirement

    • python 3.x

    • linux

  2. Usage

    1. Method 1: Form of command

    Command name: bmnetc - BMNet compiler command for Caffe model

    /path/to/bmnetc [--model=<path>] \
                    [--weight=<path>] \
                    [--shapes=<string>] \
                    [--net_name=<name>] \
                    [--opt=<value>] \
                    [--dyn=<bool>] \
                    [--outdir=<path>] \
                    [--target=<name>] \
                    [--cmp=<bool>] \
                    [--mode=<string>] \
                    [--enable_profile=<bool>] \
                    [--show_args] \
                    [--list_ops]
    

    Parameters are introduced as below:

    Specifically, mode=compile means compiling a float model. mode=GenUmodel means generating a unified model (Umodel) in SOPHGO's own format, which can subsequently be quantized into an INT8 model with the SOPHGO quantization tool. mode=check means checking whether the operators in the network graph are supported.

    In GenUmodel mode, the parameters opt, dyn, target and cmp are ignored and do not need to be specified.

    Parameter dyn=false selects static compilation, and dyn=true selects dynamic compilation. With static compilation, only the shapes specified at compile time can be run at runtime. With dynamic compilation, any shapes can be run at runtime, provided each dimension of the actual shape is less than or equal to the corresponding dimension given in shapes at compile time. Generally, the on-chip performance of a network compiled dynamically is at best equal to, and often lower than, that of the same network compiled statically. Dynamic compilation is therefore recommended when the actual network shapes vary widely; static compilation is recommended when the shapes are fixed or only a few shapes are needed. See the bmodel instructions for how to support several shapes under static compilation.
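    The rule above for dynamic compilation (every dimension of a runtime shape must be less than or equal to the corresponding compiled dimension) can be sketched with a small Python check; shape_fits is purely illustrative and not part of the bmnetc toolchain:

```python
def shape_fits(runtime_shape, compiled_shape):
    """Return True if runtime_shape may run on a model that was
    dynamically compiled with compiled_shape."""
    if len(runtime_shape) != len(compiled_shape):
        return False
    return all(r <= c for r, c in zip(runtime_shape, compiled_shape))

# A model dynamically compiled with shapes=[1,3,224,224]:
print(shape_fits([1, 3, 112, 112], [1, 3, 224, 224]))  # True: every dimension fits
print(shape_fits([1, 3, 256, 224], [1, 3, 224, 224]))  # False: 256 > 224
```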

    Table 1 Descriptions of bmnetc Parameters

    args            type    Description
    model           string  Necessary. Caffe prototxt path
    weight          string  Necessary. Caffemodel (weight) path
    shapes          string  Optional. Shapes of all inputs; defaults to the shapes in the prototxt. Format: [x,x,x,x],[x,x],...; entries correspond to the inputs one by one in sequence
    net_name        string  Optional. Name of the network; defaults to the name in the prototxt
    opt             int     Optional. Optimization level. Options: 0, 1, 2; default 2
    dyn             bool    Optional. Use dynamic compilation; default False
    outdir          string  Optional. Output directory; default "compilation"
    target          string  Optional. BM168X or BM1684X; default BM168X
    cmp             bool    Optional. Check results during compilation; default True
    mode            string  Optional. Set bmnetc mode. Options: compile, GenUmodel, check; default compile
    enable_profile  bool    Optional. Enable profile log; default False
    show_args       -       Optional. Display arguments passed to the bmnetc compiler
    list_ops        -       Optional. List ops supported by bmnetc
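    The --shapes string described in the table ([x,x,x,x],[x,x],... in input order) can be built from Python lists; shapes_arg is a hypothetical helper, not part of bmnetc:

```python
def shapes_arg(shapes):
    """Format a list of input shapes as the --shapes string,
    e.g. [[1,3,224,224], [1,10]] -> "[1,3,224,224],[1,10]"."""
    return ",".join("[" + ",".join(str(d) for d in s) + "]" for s in shapes)

print(shapes_arg([[1, 3, 224, 224], [1, 10]]))  # [1,3,224,224],[1,10]
```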

    examples:

    Below shows a command example for compiling a float32 Caffe model.

    /path/to/bmnetc --model=/path/to/prototxt --weight=/path/to/caffemodel --shapes=[1,3,224,224] --net_name=resnet18 --outdir=./resnet18 --target=BM1684
    

    Below shows the example of generating Umodel.

    /path/to/bmnetc --mode=GenUmodel --model=/path/to/prototxt --weight=/path/to/caffemodel --shapes=[1,3,224,224] --net_name=resnet18 --outdir=./resnet18
    
    1. Method 2: python interface

    The python interface of bmnetc is shown below. The wheel must be installed first: pip3 install --user bmnetc-x.x.x-py2.py3-none-any.whl.

    Below shows the python interface for compiling a float32 Caffe model.

    import bmnetc
    ## compile fp32 model
    bmnetc.compile(
      model = "/path/to/prototxt",    ## Necessary
      weight = "/path/to/caffemodel", ## Necessary
      target = "BM1684",              ## Necessary
      outdir = "xxx",                 ## optional, default 'compilation'
      shapes = [[x,x,x,x], [x,x,x]],  ## optional, if not set, default use shape in prototxt
      net_name = "name",        ## optional, if not set, default use the network name in prototxt
      opt = 2,                        ## optional, if not set, default equal to 2
      dyn = False,                    ## optional, if not set, default equal to False
      cmp = True,                     ## optional, if not set, default equal to True
      enable_profile = False          ## optional, if not set, default equal to False
    )
    

    Below shows python interface for generating Umodel.

    import bmnetc
    ## Generate SOPHGO U model
    bmnetc.GenUmodel(
      model = "/path/to/prototxt",    ## Necessary
      weight = "/path/to/caffemodel", ## Necessary
      outdir = "xxx",                 ## optional, default 'compilation'
      shapes = [[x,x,x,x], [x,x,x]],  ## optional, if not set, default use shape in prototxt
      net_name = "name"         ## optional, if not set, default use the network name in prototxt
    )
    

    Below shows an example of using bmnetc python:

    import bmnetc
    
    model = r'../../../nnmodel/caffe_models/lenet/lenet_train_test_thin_4.prototxt'
    weight = r'../../../nnmodel/caffe_models/lenet/lenet_thin_iter_1000.caffemodel'
    target = r"BM1684"
    export_dir = r"./compilation"
    shapes = [[1,1,28,28],[1]]
    
    bmnetc.compile(model = model, weight = weight, target = target, outdir = export_dir, shapes = shapes)
    bmnetc.GenUmodel(weight = weight, model = model, net_name = "lenet")
    

    Below shows python interface of list_ops.

    import bmnetc
    ## List all supported ops
    bmnetc.list_ops()
    

    Below shows python interface of check_graph.

    import bmnetc
    ## Check supported ops in model
    bmnetc.check_graph(MODEL_FILE)
    
    1. bmnetc output and log

    If bmnetc succeeds, the following information can be seen in the output log.

    ######################################
    # Store bmodel of BMCompiler.
    ######################################
    

    After bmnetc succeeds, a compilation.bmodel file is generated in the designated folder. This file is the converted bmodel and can be renamed by the user.

    If the user runs bmnetc with cmp=true, an input_ref_data.dat and an output_ref_data.dat file are also generated in the designated folder, containing the network input and output reference data produced by Caffe. They can be used by bmrt_test to verify that the generated bmodel produces correct results when running on the chip.

    If the information above does not appear, bmnetc has failed. In that case, the user can try a different opt optimization level; another level may succeed, which does not affect deployment. The user should then report the reason of the failure to our customer support specialists.

Use of BMNETD

As the model compiler for Darknet, BMNETD compiles a network's weight and cfg files into the files required by BMRuntime (currently only YOLO networks are supported). During compilation, the results computed by each layer on the NPU are compared against the CPU results to ensure correctness. The compiler is used as follows.

  1. Requirement

    • python 3.x

    • linux

  2. Usage

    1. Method 1: Form of command

    Command name: bmnetd - BMNet compiler command for Darknet model

    /path/to/bmnetd [--model=<path>] \
                    [--weight=<path>] \
                    [--shapes=<string>] \
                    [--net_name=<name>] \
                    [--opt=<value>] \
                    [--dyn=<bool>] \
                    [--outdir=<path>] \
                    [--target=<name>] \
                    [--cmp=<bool>] \
                    [--mode=<string>] \
                    [--enable_profile=<bool>]
    

    Parameters are introduced as below:

    Specifically, mode=compile means compiling a float model. mode=GenUmodel means generating a unified model (Umodel) in SOPHGO's own format, which can subsequently be quantized into an INT8 model with the SOPHGO quantization tool.

    In GenUmodel mode, the parameters opt, dyn, target and cmp are ignored and do not need to be specified.

    Parameter dyn=false selects static compilation, and dyn=true selects dynamic compilation. With static compilation, only the shapes specified at compile time can be run at runtime. With dynamic compilation, any shapes can be run at runtime, provided each dimension of the actual shape is less than or equal to the corresponding dimension given in shapes at compile time. Generally, the on-chip performance of a network compiled dynamically is at best equal to, and often lower than, that of the same network compiled statically. Dynamic compilation is therefore recommended when the actual network shapes vary widely; static compilation is recommended when the shapes are fixed or only a few shapes are needed. See the bmodel instructions for how to support several shapes under static compilation.

    Table 2 Descriptions of bmnetd Parameters

    args            type    Description
    model           string  Necessary. Darknet cfg path
    weight          string  Necessary. Darknet weight path
    target          string  Optional. BM168X or BM1684X; default BM168X
    shapes          string  Optional. Shapes of all inputs; defaults to the shapes in the cfg. Format: [x,x,x,x],[x,x],...; entries correspond to the inputs one by one in sequence
    net_name        string  Optional. Name of the network; default "network"
    opt             int     Optional. Optimization level. Options: 0, 1, 2; default 2
    dyn             bool    Optional. Use dynamic compilation; default False
    outdir          string  Optional. Output directory; default "compilation"
    cmp             bool    Optional. Check results during compilation; default True
    mode            string  Optional. Set bmnetd mode. Options: compile, GenUmodel; default compile
    enable_profile  bool    Optional. Enable profile log; default False
    log_dir         string  Optional. Specify the log directory; default "\"
    v               string  Optional. Set log verbose level; default 0 (0: FATAL, 1: ERROR, 2: WARNING, 3: INFO, 4: DEBUG)
    dump_ref        bool    Optional. Dump input and output reference data when compiling without comparison; default False

    examples:

    Below shows a command example for compiling a float32 Darknet model.

    /path/to/bmnetd --model=/path/to/cfg --weight=/path/to/weight --net_name=net_name --outdir=./net_name --target=BM1684
    

    Below shows the example of generating Umodel.

    /path/to/bmnetd --mode=GenUmodel --model=/path/to/cfg --weight=/path/to/weight --net_name=net_name --outdir=./net_name
    
    1. Method 2: python interface

    The python interface of bmnetd is shown below. The wheel must be installed first: pip3 install --user bmnetd-x.x.x-py2.py3-none-any.whl.

    Below shows the python interface for compiling a float32 Darknet model.

    import bmnetd
    ## compile fp32 model
    bmnetd.compile(
      model = "/path/to/cfg",         ## Necessary
      weight = "/path/to/weight",     ## Necessary
      outdir = "xxx",                 ## Necessary
      target = "BM1684",              ## Necessary
      net_name = "name",              ## optional, if not set, default use the path of cfg
      shapes = [[x,x,x,x], [x,x,x]],  ## optional, if not set, default use shape in weights
      opt = 2,                        ## optional, if not set, default equal to 2
      dyn = False,                    ## optional, if not set, default equal to False
      cmp = True,                     ## optional, if not set, default equal to True
      enable_profile = False          ## optional, if not set, default equal to False
    )
    

    Below shows python interface for generating Umodel.

    import bmnetd
    ## Generate SOPHGO U model
    bmnetd.GenUmodel(
      model = "/path/to/cfg",         ## Necessary
      weight = "/path/to/weight",     ## Necessary
      outdir = "xxx",                 ## Necessary
      net_name = "name",              ## optional, if not set, default use the path of cfg
      shapes = [[x,x,x,x], [x,x,x]],  ## optional, if not set, default use shape in weights
    )
    

    Below shows an example of using bmnetd python:

    import bmnetd
    
    model = r'../../../nnmodel/darknet_models/yolov3-tiny/yolov3-tiny.cfg'
    weight = r'../../../nnmodel/darknet_models/yolov3-tiny/yolov3-tiny.weights'
    target = r"BM1684"
    export_dir = r"./compilation"
    shapes = [[1,3,416,416]]
    
    bmnetd.compile(model = model, weight = weight, target = target, outdir = export_dir, shapes = shapes)
    bmnetd.GenUmodel(weight = weight, model = model, net_name = "yolov3-tiny")
    
    1. bmnetd output and log

    If bmnetd succeeds, the following information can be seen in the output log.

    ######################################
    # Store bmodel of BMCompiler.
    ######################################
    

    After bmnetd succeeds, a compilation.bmodel file is generated in the designated folder. This file is the converted bmodel and can be renamed by the user.

    If the user runs bmnetd with cmp=true, an input_ref_data.dat and an output_ref_data.dat file are also generated in the designated folder, containing the network input and output reference data produced by Darknet. They can be used by bmrt_test to verify that the generated bmodel produces correct results when running on the chip.

    If the information above does not appear, bmnetd has failed. In that case, the user can try a different opt optimization level; another level may succeed, which does not affect deployment. The user should then report the reason of the failure to our customer support specialists.

Common Problems

  1. Parsing an xxx.cfg file fails with a segmentation fault during model compilation. Check whether the file was saved or modified under Windows: Windows line breaks differ from Linux ones, which can cause a parsing error and a segmentation fault. Convert the file with the command dos2unix xxx.cfg.
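    If the dos2unix tool is not available, the same conversion (replacing Windows CRLF line breaks with Unix LF) can be done with a few lines of Python; this sketch simply rewrites the file in place:

```python
def dos2unix(path):
    """Replace Windows CRLF line endings with Unix LF in place."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(data.replace(b"\r\n", b"\n"))
```

    For example, dos2unix("xxx.cfg") fixes the cfg file before compilation.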

Use of BMNETM

As the model compiler for MXNet, BMNETM converts an MXNet model structure file and its parameter file (such as lenet-symbol.json and lenet-0100.params) into the files required by BMRuntime through graph compilation and optimization. During compilation, the result of each operator computed on the NPU is compared with the result of the original model on the MXNet framework to ensure the correctness of the model conversion. The installation requirements, configuration steps, usage and command parameters of the compiler are introduced below.

  1. Requirement

    • python 3.x

    • mxnet>=1.3.0

    • linux

  2. Usage

    1. Method 1: Form of command

    Command name: python3 -m bmnetm - BMNet compiler command for MxNet model

    python3 -m bmnetm [--model=<path>] \
                      [--weight=<path>] \
                      [--shapes=<string>] \
                      [--input_names=<string>] \
                      [--net_name=<name>] \
                      [--opt=<value>] \
                      [--dyn=<bool>] \
                      [--outdir=<path>] \
                      [--target=<name>] \
                      [--cmp=<bool>] \
                      [--mode=<string>] \
                      [--enable_profile=<bool>] \
                      [--input_data=<path>] \
                      [--log_dir=<path>] \
                      [--v=<int>] \
                      [--list_ops]
    

    Parameters are introduced as below:

    Parameter dyn=false selects static compilation, and dyn=true selects dynamic compilation. With static compilation, only the shapes specified at compile time can be run at runtime. With dynamic compilation, any shapes can be run at runtime, provided each dimension of the actual shape is less than or equal to the corresponding dimension given in shapes at compile time. Generally, the on-chip performance of a network compiled dynamically is at best equal to, and often lower than, that of the same network compiled statically. Dynamic compilation is therefore recommended when the actual network shapes vary widely; static compilation is recommended when the shapes are fixed or only a few shapes are needed. See the bmodel instructions for how to support several shapes under static compilation.

    Table 3 Descriptions of bmnetm Parameters

    args            type    Description
    model           string  Necessary. MxNet symbol .json path
    weight          string  Necessary. MxNet weight .params path
    shapes          string  Necessary. Shapes of all inputs. Format: [x,x,x,x],[x,x],...; entries correspond to the inputs one by one in sequence
    target          string  Optional. BM168X or BM1684X; default BM168X
    input_names     string  Optional. Input names as given in the .json; they correspond to shapes one by one. Format: "name1,name2,..."; default "data"
    net_name        string  Optional. Name of the network; default "network"
    opt             int     Optional. Optimization level. Options: 0, 1, 2; default 1
    dyn             bool    Optional. Use dynamic compilation; default False
    outdir          string  Optional. Output directory; default "compilation"
    cmp             bool    Optional. Check results during compilation; default True
    mode            string  Optional. Set bmnetm mode. Options: compile, GenUmodel, check; default compile
    enable_profile  bool    Optional. Enable profile log; default False
    input_data      string  Optional. Path of a data file providing the input data
    log_dir         string  Optional. Log output directory
    v               string  Optional. Set log verbose level
    list_ops        -       Optional. List supported ops

    1. Method 2: python interface

    python interface of bmnetm is shown below:

    import bmnetm
    ## compile fp32 model
    bmnetm.compile(
      model = "/path/to/.json",       ## Necessary
      weight = "/path/to/.params",    ## Necessary
      outdir = "xxx",                 ## Necessary
      target = "BM1684",              ## Necessary
      shapes = [[x,x,x,x], [x,x,x]],  ## Necessary
      net_name = "name",              ## Necessary
      input_names=["name1","name2"],  ## optional, if not set, default is "data"
      opt = 2,                        ## optional, if not set, default equal to 1
      dyn = False,                    ## optional, if not set, default equal to False
      cmp = True,                     ## optional, if not set, default equal to True
      enable_profile = True           ## optional, if not set, default equal to False
    )
    

    Below shows python interface of list_ops.

    import bmnetm
    ## List all supported ops
    bmnetm.list_ops()
    

    Below shows python interface of check_graph.

    import bmnetm
    ## Check supported ops in model
    bmnetm.check_graph(MODEL_FILE)
    
    1. bmnetm output and log

    If bmnetm succeeds, the following information can be seen in the output log.

    ######################################
    # Store bmodel of BMCompiler.
    ######################################
    

    After bmnetm succeeds, a compilation.bmodel file is generated in the designated folder. This file is the converted bmodel and can be renamed by the user.

    If the user runs bmnetm with cmp=true, an input_ref_data.dat and an output_ref_data.dat file are also generated in the designated folder, containing the network input and output reference data produced by MXNet. They can be used by bmrt_test to verify that the generated bmodel produces correct results when running on the chip.

    If the information above does not appear, bmnetm has failed. In that case, the user can try a different opt optimization level; another level may succeed, which does not affect deployment. The user should then report the reason of the failure to our customer support specialists.

Use of BMNETO

As the model compiler for ONNX, BMNETO converts a model in ONNX format into the files required by BMRuntime through graph compilation and optimization. During compilation, the result of each operator computed on the NPU is compared with the CPU result to ensure the correctness of the model conversion. The installation requirements, installation steps and usage of the compiler are introduced below.

  1. Requirement

    • python 3.x

    • linux

    • onnx == 1.7.0 (Opset version == 12)

    • onnxruntime == 1.3.0

    • protobuf >= 3.8.0

  2. Usage

    1. Method 1: Form of command

    Command name: python3 -m bmneto - BMNet compiler command for ONNX model

    python3 -m bmneto [--model=<path>] \
                      [--input_names=<string>] \
                      [--shapes=<string>] \
                      [--outdir=<path>] \
                      [--target=<name>] \
                      [--net_name=<name>] \
                      [--opt=<value>] \
                      [--dyn=<bool>] \
                      [--cmp=<bool>] \
                      [--mode=<string>] \
                      [--descs=<string>] \
                      [--enable_profile=<bool>] \
                      [--output_names=<string>] \
                      [--list_ops]
    

    Parameters are introduced as below:

    Parameter dyn=false selects static compilation, and dyn=true selects dynamic compilation. With static compilation, only the shapes specified at compile time can be run at runtime. With dynamic compilation, any shapes can be run at runtime, provided each dimension of the actual shape is less than or equal to the corresponding dimension given in shapes at compile time. Generally, the on-chip performance of a network compiled dynamically is at best equal to, and often lower than, that of the same network compiled statically. Dynamic compilation is therefore recommended when the actual network shapes vary widely; static compilation is recommended when the shapes are fixed or only a few shapes are needed. See the bmodel instructions for how to support several shapes under static compilation.

    Table 4 Descriptions of bmneto Parameters

    args            type    Description
    model           string  Necessary. ONNX model (.onnx) path
    input_names     string  Optional. Names of all network inputs, one by one in sequence. Format: "name1,name2,name3"
    shapes          string  Necessary. Shapes of all inputs. Format: [x,x,x,x],[x,x],...; entries correspond to the inputs one by one in sequence
    target          string  Optional. BM168X or BM1684X; default BM168X
    outdir          string  Optional. Output directory; default "compilation"
    net_name        string  Optional. Name of the network; default "network"
    opt             int     Optional. Optimization level. Options: 0, 1, 2; default 1
    dyn             bool    Optional. Use dynamic compilation; default False
    cmp             bool    Optional. Check results during compilation; default True
    mode            string  Optional. Set bmneto mode. Options: compile, GenUmodel, check; default compile
    descs           string  Optional. Describe the data type and value range of an input in the format "[index, data format, lower bound, upper bound]", where the data format can be fp32 or int64. For example, [0, int64, 0, 100] means the input at index 0 has type int64 with values in [0, 100). Inputs without a description default to fp32, uniformly distributed in [0, 1)
    enable_profile  bool    Optional. Enable profile log; default False
    output_names    string  Optional. Names of all network outputs, one by one in sequence. Format: "name1,name2,name3"
    list_ops        -       Optional. List supported ops
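    The descs string described in the table can likewise be assembled from Python values; desc_arg is a hypothetical helper, not part of bmneto:

```python
def desc_arg(index, dtype, low, high):
    """Format one input description for the descs parameter,
    e.g. desc_arg(0, "int64", 0, 100) -> "[0, int64, 0, 100]"."""
    return "[%d, %s, %s, %s]" % (index, dtype, low, high)

print(desc_arg(0, "int64", 0, 100))  # [0, int64, 0, 100]
```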

    1. Method 2: python interface

    python interface of bmneto is shown below:

    import bmneto
    ## compile fp32 model
    bmneto.compile(
      model = "/path/to/.onnx",       ## Necessary
      outdir = "xxx",                 ## Necessary
      target = "BM1684",              ## Necessary
      shapes = [[x,x,x,x], [x,x,x]],  ## Necessary
      net_name = "name",              ## optional
      input_names = ['name0','name1'],  ## optional
      opt = 2,                        ## optional, if not set, default equal to 1
      dyn = False,                    ## optional, if not set, default equal to False
      cmp = True,                     ## optional, if not set, default equal to True
      enable_profile = True,          ## optional, if not set, default equal to False
      descs = "[0, int32, 0, 128]",   ## optional, if not set, default equal to [[x, float32, 0, 1]]
      output_names = ['oname0','oname1']  ## optional, if not set, default equal to graph output names
    )
    

    Below shows python interface of list_ops.

    import bmneto
    ## List all supported ops
    bmneto.list_ops()
    

    Below shows python interface of check_graph.

    import bmneto
    ## Check supported ops in model
    bmneto.check_graph(MODEL_FILE)
    
    1. bmneto output and log

    If bmneto succeeds, the following information can be seen in the output log.

    ######################################
    # Store bmodel of BMCompiler.
    ######################################
    

    After bmneto succeeds, a compilation.bmodel file is generated in the designated folder. This file is the converted bmodel and can be renamed by the user.

    If the user runs bmneto with cmp=true, an input_ref_data.dat and an output_ref_data.dat file are also generated in the designated folder, containing the network input and output reference data produced by ONNX Runtime. They can be used by bmrt_test to verify that the generated bmodel produces correct results when running on the chip.

    If the information above does not appear, bmneto has failed. In that case, the user can try a different opt optimization level; another level may succeed, which does not affect deployment. The user should then report the reason of the failure to our customer support specialists.

Common Problems

  1. Example of converting a PyTorch model to ONNX

import torch
from torch.autograd import Variable

model = ExampleModel()  # or torch.load(model_path)
dummy_input = Variable(torch.randn(input_shape))
dummy_output = model(dummy_input)
torch.onnx.export(model,
                  dummy_input,
                  new_name,          # output .onnx file path
                  opset_version=12,  # opset version must be 12
                  example_outputs=dummy_output,
                  )

Use of BMNETP

As the model compiler for PyTorch, BMNETP compiles a PyTorch model directly into the execution commands required by BMRuntime. Before compilation, the model must be traced with torch.jit.trace (see the PyTorch documentation); only a traced model can be compiled. During compilation, the result of each layer computed on the NPU is compared with the CPU result to ensure the correctness of the model conversion. The installation requirements, installation steps and usage of the compiler are introduced below.

  1. Requirement

    • python 3.x

    • linux

  2. Usage

    1. Method 1: Form of command

    Command name: python3 -m bmnetp - BMNet compiler command for PyTorch model

    python3 -m bmnetp [--model=<path>] \
                      [--shapes=<string>] \
                      [--net_name=<name>] \
                      [--opt=<value>] \
                      [--dyn=<bool>] \
                      [--outdir=<path>] \
                      [--target=<name>] \
                      [--cmp=<bool>] \
                      [--enable_profile=<bool>]
    

    Parameters are introduced as below:

    Parameter dyn=false selects static compilation, and dyn=true selects dynamic compilation. With static compilation, only the shapes specified at compile time can be run at runtime. With dynamic compilation, any shapes can be run at runtime, provided each dimension of the actual shape is less than or equal to the corresponding dimension given in shapes at compile time. Generally, the on-chip performance of a network compiled dynamically is at best equal to, and often lower than, that of the same network compiled statically. Dynamic compilation is therefore recommended when the actual network shapes vary widely; static compilation is recommended when the shapes are fixed or only a few shapes are needed. See the bmodel instructions for how to support several shapes under static compilation.

    Table 5 Descriptions of bmnetp Parameters

    args            type    Description
    model           string  Necessary. Traced PyTorch model (.pt) path
    shapes          string  Necessary. Shapes of all inputs. Format: [x,x,x,x],[x,x],...; entries correspond to the inputs one by one in sequence
    target          string  Optional. BM168X or BM1684X; default BM168X
    outdir          string  Optional. Output directory; default "compilation"
    net_name        string  Optional. Name of the network; default "network"
    opt             int     Optional. Optimization level. Options: 0, 1, 2; default 1
    dyn             bool    Optional. Use dynamic compilation; default False
    cmp             bool    Optional. Check results during compilation; default True
    mode            string  Optional. Set bmnetp mode. Options: compile, GenUmodel, dump; default compile
    descs           string  Optional. Describe the data type and value range of an input in the format "[index, data format, lower bound, upper bound]", where the data format can be fp32 or int64. For example, [0, int64, 0, 100] means the input at index 0 has type int64 with values in [0, 100). Inputs without a description default to fp32, uniformly distributed in [0, 1)
    enable_profile  bool    Optional. Enable profile log; default False
    list_ops        -       Optional. List supported ops

    1. Method 2: python interface

    python interface of bmnetp is shown as below:

    import bmnetp
    ## compile fp32 model
    bmnetp.compile(
      model = "/path/to/.pth",        ## Necessary
      outdir = "xxx",                 ## Necessary
      target = "BM1684",              ## Necessary
      shapes = [[x,x,x,x], [x,x,x]],  ## Necessary
      net_name = "name",              ## Necessary
      opt = 2,                        ## optional, if not set, default equal to 1
      dyn = False,                    ## optional, if not set, default equal to False
      cmp = True,                     ## optional, if not set, default equal to True
      enable_profile = True           ## optional, if not set, default equal to False
    )
    

    Below shows python interface of list_ops.

    import bmnetp
    ## List all supported ops
    bmnetp.list_ops()
    

    Below shows python interface of check_graph.

    import bmnetp
    ## Check supported ops in model
    bmnetp.check_graph(MODEL_FILE)
    
    1. bmnetp output and log

    If bmnetp succeeds, the following information can be seen in the output log.

    ######################################
    # Store bmodel of BMCompiler.
    ######################################
    

    After bmnetp succeeds, a compilation.bmodel file will be generated in the designated folder. This file is the converted bmodel and can be renamed by the user.

    If cmp=true is used in bmnetp, an input_ref_data.dat file and an output_ref_data.dat file will also be generated in the designated folder, containing the network input reference data and output reference data generated by PyTorch respectively. They can be used by bmrt_test to verify that the generated bmodel produces correct results when running on the chip.
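For a quick sanity check of such reference files, the sketch below reads a .dat file as raw float32 values using Python's stdlib array module. This assumes the file is a plain native-endian dump of fp32 data, which is an assumption to verify against the SDK documentation, not a documented format guarantee.

```python
import array

def read_fp32_dat(path, count=None):
    """Read a .dat file as raw native-endian float32 values.
    Assumes a plain fp32 dump; verify the actual layout in the SDK docs."""
    vals = array.array("f")
    with open(path, "rb") as f:
        vals.frombytes(f.read())
    return list(vals if count is None else vals[:count])

# Round-trip demo with a small file written locally
demo = array.array("f", [0.5, 1.25, -2.0])
with open("demo_ref_data.dat", "wb") as f:
    f.write(demo.tobytes())
values = read_fp32_dat("demo_ref_data.dat")
```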

    If the information above does not appear, bmnetp has failed. In that case, the user can try a different opt optimization level; another level may compile successfully, which does not affect deployment. The user should also send the reason of the failure to our customer support specialists.
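The retry advice above can be wrapped in a small convenience helper that tries several opt levels in turn. This is our own sketch, not part of the toolchain; compile_fn stands for any compiler entry point such as bmnetp.compile.

```python
def compile_with_fallback(compile_fn, opt_levels=(2, 1, 0), **kwargs):
    """Try compile_fn at several opt levels and return the first level
    that succeeds; raise if every level fails (illustrative helper)."""
    errors = {}
    for opt in opt_levels:
        try:
            compile_fn(opt=opt, **kwargs)
            return opt
        except Exception as exc:  # record the failure and try the next level
            errors[opt] = exc
    raise RuntimeError("compilation failed at all opt levels: %r" % errors)
```

For example, compile_with_fallback(bmnetp.compile, model=..., outdir=..., target=..., shapes=..., net_name=...) would fall back from opt=2 to opt=1 to opt=0.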

    1. View the structure of the original model

    The following command prints the model structure as text, which makes it convenient to inspect the inputs, outputs, and internal connections of the original model.

    python -m bmnetp --mode dump --model model.pt
    

Common Problems

  1. Model format

    A PyTorch pre-trained model generally has a .pt or .pth suffix, but the file may contain only a weight dictionary rather than the complete network. To check whether a file contains a complete traced model, see if an error is reported by import torch; torch.jit.load(pt_file_path). If it is, rebuild the model from the project source code, load the weights, and save a traced model with the following code:

    import torch
    model = ...  # Build the model from project source code and load its weights
    input_data = torch.randn(input_shape)
    model.cpu()  # Put weights on cpu
    model.eval() # Switch to inference mode
    traced_model = torch.jit.trace(model, [input_data])
    traced_model_name = "traced_model.pt"
    traced_model.save(traced_model_name)
    
  2. How to handle a traced model whose weights are on GPU?

    We support saving models with the torch.jit.trace function, but the weights must be on CPU rather than GPU. Transfer them to CPU with the following code:

    model = torch.jit.load(gpu_traced_model_name, map_location="cpu")  # Convert it into a cpu model
    model.eval() # Switch to inference mode
    # Trace the model again
    input_data = torch.randn(input_shape)
    traced_model = torch.jit.trace(model, [input_data])
    traced_model_name = "traced_model.pt"
    traced_model.save(traced_model_name)
    

Use of BMNETT

As the model compiler for TensorFlow, BMNETT compiles a TensorFlow model file (*.pb) into the file required by BMRuntime. During compilation, the calculation result of each layer on the NPU is compared with that on the CPU to ensure the correctness of the model conversion. The requirements and usage of the compiler are introduced below.

  1. Requirement

    • python 3.x

    • tensorflow>=1.10.0

    • linux

  2. Usage

    1. Method 1: Form of command

    Command name: python3 -m bmnett - BMNet compiler command for TensorFlow model

    python3 -m bmnett [--model=<path>] \
                      [--input_names=<string>] \
                      [--shapes=<string>] \
                      [--descs=<string>] \
                      [--output_names=<string>] \
                      [--net_name=<name>] \
                      [--opt=<value>] \
                      [--dyn=<bool>] \
                      [--outdir=<path>] \
                      [--target=<name>] \
                      [--cmp=<bool>] \
                      [--mode=<string>] \
                      [--enable_profile=<bool>] \
                      [--list_ops]
    

    The parameters are introduced below:

    To be specific, mode=compile means compiling a float model. mode=GenUmodel means generating a unified model (U Model) in the SOPHGO definition; an INT8 model can subsequently be generated from it with the SOPHGO fixed-point tool. mode=summary or mode=show displays information about the network graph. mode=check checks whether the operators in the network graph are supported.

    In GenUmodel mode, the parameters opt, dyn, target, and cmp are meaningless and do not need to be specified.

    Parameter dyn=false means static compilation and dyn=true means dynamic compilation. With static compilation, only the shapes set at compile time can be run at runtime after the model is compiled. With dynamic compilation, any shapes can be run at runtime, as long as each value in the actual shapes is smaller than or equal to the corresponding value set in shapes at compile time. Generally speaking, the on-chip performance of a network after dynamic compilation is lower than or equal to that after static compilation, so dynamic compilation is recommended when the shapes of the actual network vary dramatically, while static compilation is recommended when the shapes are fixed or only a few different shapes are needed. See the instructions of bmodel for how to support several shapes under static compilation.
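The dynamic-compilation rule above (each value in the actual shape must be smaller than or equal to the value set at compile time) can be sketched as a simple check. This is illustrative only; the runtime performs its own validation.

```python
def shape_allowed(runtime_shape, compiled_shape):
    """True if a runtime shape may run on a dynamically compiled model:
    same rank, and every dimension <= the one set at compile time."""
    if len(runtime_shape) != len(compiled_shape):
        return False
    return all(r <= c for r, c in zip(runtime_shape, compiled_shape))

# Model compiled with shapes=[[4, 3, 224, 224]] and dyn=True:
ok = shape_allowed([1, 3, 224, 224], [4, 3, 224, 224])       # smaller batch
too_big = shape_allowed([8, 3, 224, 224], [4, 3, 224, 224])  # larger batch
```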

    Table 6 Descriptions of bmnett Parameters

    args | type | Description
    model | string | Necessary. Path of the TensorFlow .pb file
    input_names | string | Necessary. Names of all network inputs, one by one in sequence, in format “name1,name2,name3”
    shapes | string | Necessary. Shapes of all inputs, in format [x,x,x,x],[x,x],…; they correspond to the inputs one by one in sequence
    target | string | Optional. BM168X or BM1684X; default: BM168X
    outdir | string | Optional. Output directory; default “compilation”
    net_name | string | Optional. Name of the network; default “network”
    descs | string | Optional. Descriptions of inputs, in format [serial number, data type, lower bound, upper bound], e.g. [0, uint8, 0, 256]; default [x, float, 0, 1]
    output_names | string | Optional. Names of all network outputs, one by one in sequence, in format “name1,name2,name3”
    opt | int | Optional. Optimization level. Options: 0, 1, 2; default 1
    dyn | bool | Optional. Use dynamic compilation; default False
    cmp | bool | Optional. Check results during compilation; default True
    mode | string | Optional. Set bmnett mode. Options: compile, GenUmodel, summary, show, check; default compile
    enable_profile | bool | Optional. Enable profile log; default False
    list_ops | - | Optional. List supported ops
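When the command line is assembled from Python data, the shapes and name lists must be flattened into the string formats shown in the table above. The helpers below are our own convenience sketch, not part of bmnett.

```python
def format_shapes(shapes):
    """[[1,3,224,224],[1,10]] -> "[1,3,224,224],[1,10]" (shapes string format)."""
    return ",".join("[" + ",".join(str(d) for d in s) + "]" for s in shapes)

def format_names(names):
    """["a","b"] -> "a,b" (input_names/output_names string format)."""
    return ",".join(names)

# Assemble a bmnett command line from Python lists (paths are placeholders)
cmd = (
    "python3 -m bmnett"
    " --model=/path/to/model.pb"
    " --input_names=" + format_names(["input0", "input1"]) +
    " --shapes=" + format_shapes([[1, 3, 224, 224], [1, 10]]) +
    " --net_name=network --target=BM1684"
)
```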

    1. Method 2: Python interface

    The Python interface of bmnett for compiling a float32 TensorFlow model is shown below:

    import bmnett
    ## compile fp32 model
    bmnett.compile(
      model = "/path/to/model(.pb)",     ## Necessary
      outdir = "xxx",                    ## Necessary
      target = "BM1684",                 ## Necessary
      shapes = [[x,x,x,x], [x,x,x]],     ## Necessary
      net_name = "name",                 ## Necessary
      input_names=["name1", "name2"],    ## Necessary, when .h5 use None
      output_names=["out_name1", "out_name2"], ## Necessary, when .h5 use None
      opt = 2,                           ## optional, if not set, default equal to 1
      dyn = False,                       ## optional, if not set, default equal to False
      cmp = True,                        ## optional, if not set, default equal to True
      enable_profile = True              ## optional, if not set, default equal to False
    )
    

    The Python interface of bmnett for converting a float32 TensorFlow model into a SOPHGO U Model is shown below:

    import bmnett
    ## compile fp32 model
    bmnett.GenUmodel(
      model = "/path/to/model(.pb,.h5)", ## Necessary
      outdir = "xxx",                    ## Necessary
      shapes = [[x,x,x,x], [x,x,x]],     ## Necessary
      net_name = "name",                 ## Necessary
      input_names=["name1", "name2"],    ## Necessary
      output_names=["out_name1", "out_name2"] ## Necessary
    )
    

    The Python interface of list_ops is shown below:

    import bmnett
    ## List all supported ops
    bmnett.list_ops()
    
    1. bmnett output and log

    If bmnett succeeds, the following information can be seen in the output log.

    ######################################
    # Store bmodel of BMCompiler.
    ######################################
    

    After bmnett succeeds, a compilation.bmodel file will be generated in the designated folder. This file is the converted bmodel and can be renamed by the user.

    If cmp=true is used in bmnett, an input_ref_data.dat file and an output_ref_data.dat file will also be generated in the designated folder, containing the network input reference data and output reference data generated by TensorFlow respectively. They can be used by bmrt_test to verify that the generated bmodel produces correct results when running on the chip.

    If the information above does not appear, bmnett has failed. In that case, the user can try a different opt optimization level; another level may compile successfully, which does not affect deployment. The user should also send the reason of the failure to our customer support specialists.

Use of BMPADDLE

As the model compiler for PaddlePaddle, BMPADDLE compiles a PaddlePaddle model (the inference.pdmodel and inference.pdiparams files) into the file required by BMRuntime. During compilation, the calculation result of each layer on the NPU is compared with that on the CPU to ensure the correctness of the model conversion. The installation requirements, installation steps, and usage of the compiler are introduced below.

  1. Requirement

    • python 3.x

    • paddlepaddle>=1.8.0

    • linux

  2. Installation steps

Install the compiler package with one of the following commands. Command (1) installs into the user's local directory and does not require root privileges, while command (2) installs into the system directory and requires root privileges.

(1) pip install --user bmpaddle-x.x.x-py2.py3-none-any.whl

(2) pip install bmpaddle-x.x.x-py2.py3-none-any.whl

Set LD_LIBRARY_PATH. Set the library path in the current shell in the following way, or add the setting to the .bashrc file to make it permanent.

export LD_LIBRARY_PATH=path_to_bmcompiler_lib

  1. Usage

    1. Method 1: Form of command

    Command name: python3 -m bmpaddle - BMNet compiler command for paddlepaddle model

    python3 -m bmpaddle [--model=<path>] \
                      [--input_names=<string>] \
                      [--shapes=<string>] \
                      [--descs=<string>] \
                      [--output_names=<string>] \
                      [--net_name=<name>] \
                      [--opt=<value>] \
                      [--dyn=<bool>] \
                      [--outdir=<path>] \
                      [--target=<name>] \
                      [--cmp=<bool>] \
                      [--mode=<string>] \
                      [--enable_profile=<bool>] \
                      [--list_ops]
    

    The parameters are introduced below:

    To be specific, mode=compile means compiling a float model. mode=GenUmodel means generating a unified model (U Model) in the SOPHGO definition; an INT8 model can subsequently be generated from it with the SOPHGO fixed-point tool. mode=summary or mode=show displays information about the network graph. mode=check checks whether the operators in the network graph are supported.

    In GenUmodel mode, the parameters opt, dyn, target, and cmp are meaningless and do not need to be specified.

    Parameter dyn=false means static compilation and dyn=true means dynamic compilation. With static compilation, only the shapes set at compile time can be run at runtime after the model is compiled. With dynamic compilation, any shapes can be run at runtime, as long as each value in the actual shapes is smaller than or equal to the corresponding value set in shapes at compile time. Generally speaking, the on-chip performance of a network after dynamic compilation is lower than or equal to that after static compilation, so dynamic compilation is recommended when the shapes of the actual network vary dramatically, while static compilation is recommended when the shapes are fixed or only a few different shapes are needed. See the instructions of bmodel for how to support several shapes under static compilation.

    Table 7 Descriptions of bmpaddle Parameters

    args | type | Description
    model | string | Necessary. PaddlePaddle model directory
    input_names | string | Necessary. Names of all network inputs, one by one in sequence, in format “name1,name2,name3”
    shapes | string | Necessary. Shapes of all inputs, in format [x,x,x,x],[x,x],…; they correspond to the inputs one by one in sequence
    descs | string | Optional. Descriptions of inputs, in format [serial number, data type, lower bound, upper bound], e.g. [0, uint8, 0, 256]; default [x, float, 0, 1]
    output_names | string | Necessary. Names of all network outputs, one by one in sequence, in format “name1,name2,name3”
    net_name | string | Optional. Name of the network; default “network”
    opt | int | Optional. Optimization level. Options: 0, 1, 2; default 1
    dyn | bool | Optional. Use dynamic compilation; default False
    outdir | string | Optional. Output directory; default “compilation”
    target | string | Optional. BM168X or BM1684X; default: BM168X
    cmp | bool | Optional. Check results during compilation; default True
    mode | string | Optional. Set bmpaddle mode. Options: compile, GenUmodel, summary, show, check; default compile
    enable_profile | bool | Optional. Enable profile log; default False
    list_ops | - | Optional. List supported ops

    1. Method 2: Python interface

    The Python interface of bmpaddle for compiling a float32 PaddlePaddle model is shown below:

    import bmpaddle
    ## compile fp32 model
    bmpaddle.compile(
      model = "/path/to/model(directory)",     ## Necessary
      outdir = "xxx",                    ## Necessary
      target = "BM1684",                 ## Necessary
      shapes = [[x,x,x,x],[x,x,x]],      ## Necessary
      net_name = "name",                 ## Necessary
      input_names=["name1","name2"],     ## Necessary, when .h5 use None
      output_names=["out_name1","out_name2"], ## Necessary, when .h5 use None
      opt = 2,                           ## optional, if not set, default equal to 1
      dyn = False,                       ## optional, if not set, default equal to False
      cmp = True,                        ## optional, if not set, default equal to True
      enable_profile = True              ## optional, if not set, default equal to False
    )
    

    The Python interface of bmpaddle for converting a float32 PaddlePaddle model into a SOPHGO U Model is shown below:

    import bmpaddle
    ## compile fp32 model
    bmpaddle.GenUmodel(
      model = "/path/to/model(directory)",     ## Necessary
      outdir = "xxx",                          ## Necessary
      shapes = [[x,x,x,x],[x,x,x]],            ## Necessary
      net_name = "name",                       ## Necessary
      input_names=["name1","name2"],           ## Necessary
      output_names=["out_name1","out_name2"]   ## Necessary
    )
    

    The Python interface of list_ops is shown below:

    import bmpaddle
    ## List all supported ops
    bmpaddle.list_ops()
    
    1. bmpaddle output and log

    If bmpaddle succeeds, the following information can be seen in the output log.

    ######################################
    # Store bmodel of BMCompiler.
    ######################################
    

    After bmpaddle succeeds, a compilation.bmodel file will be generated in the designated folder. This file is the converted bmodel and can be renamed by the user.

    If cmp=true is used in bmpaddle, an input_ref_data.dat file and an output_ref_data.dat file will also be generated in the designated folder, containing the network input reference data and output reference data generated by PaddlePaddle respectively. They can be used by bmrt_test to verify that the generated bmodel produces correct results when running on the chip.

    If the information above does not appear, bmpaddle has failed. In that case, the user can try a different opt optimization level; another level may compile successfully, which does not affect deployment. The user should also send the reason of the failure to our customer support specialists.