10. Test SDK release package with TPU-PERF
10.1. Configure the system environment
If you are using Docker for the first time, follow the steps in Environment Setup to install and configure it. This chapter also uses git-lfs. If this is your first time using git-lfs, run the following commands to install and configure it (this only needs to be done once, and the configuration lives in your own system, not in the Docker container):
$ curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
$ sudo apt-get install git-lfs
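To confirm the installation succeeded, you can optionally check the installed version (a quick sanity check; the exact version string will differ on your system):

$ git lfs version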
10.2. Get the model-zoo [1]
In the same directory as tpu-mlir_xxxx.tar.gz (tpu-mlir's release package), use the following commands to clone the model-zoo project:
$ git clone --depth=1 https://github.com/sophgo/model-zoo
$ cd model-zoo
$ git lfs pull --include "*.onnx,*.jpg,*.JPEG,*.npz" --exclude=""
$ cd ../
If you have already cloned model-zoo, you can execute the following commands to update the models to the latest state:
$ cd model-zoo
$ git pull
$ git lfs pull --include "*.onnx,*.jpg,*.JPEG" --exclude=""
$ cd ../
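In either case, you can optionally verify that the LFS objects were actually downloaded rather than left as pointer files (a quick check using standard git-lfs commands):

$ cd model-zoo
$ git lfs ls-files | head   # entries marked with '*' have their content downloaded; '-' means pointer only
$ cd ../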
This process downloads a large amount of data from GitHub. Depending on your network environment, it may take a long time.
Footnotes
[1] If you have obtained the model-zoo test package provided by SOPHGO, run the following commands to create and populate model-zoo. After completing this step, go directly to the next section, Get the tpu-perf tool.
$ mkdir -p model-zoo
$ tar -xvf path/to/model-zoo_<date>.tar.bz2 --strip-components=1 -C model-zoo
10.3. Get the tpu-perf tool
Download the latest tpu-perf wheel installation package, for example tpu_perf-x.x.x-py3-none-manylinux2014_x86_64.whl, from https://github.com/sophgo/tpu-perf/releases, and place it in the same directory as model-zoo. The directory structure at this point should look like this:
├── tpu_perf-x.x.x-py3-none-manylinux2014_x86_64.whl
├── tpu-mlir_xxxx.tar.gz
└── model-zoo
10.4. Test process
10.4.1. Unzip the SDK and create a Docker container
Execute the following commands in the directory where tpu-mlir_xxxx.tar.gz is located (note that tpu-mlir_xxxx.tar.gz and model-zoo need to be at the same level):
$ tar zxf tpu-mlir_xxxx.tar.gz
$ docker pull sophgo/tpuc_dev:v2.2
$ docker run --rm --name myname -v $PWD:/workspace -it sophgo/tpuc_dev:v2.2
After running these commands, you will be inside a Docker container.
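Inside the container, the current directory is mounted at /workspace; a quick way to confirm the mount is to list it (illustrative, the exact names depend on your release package):

$ ls /workspace   # should list tpu-mlir_xxxx.tar.gz, the extracted tpu-mlir_xxxx directory, the tpu_perf wheel, and model-zoo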
10.4.2. Set environment variables and install tpu-perf
Set up the environment variables needed to run the tests with the following commands:
$ cd tpu-mlir_xxxx
$ source envsetup.sh
There is no output when this completes. Then install tpu-perf with the following command:
$ pip3 install ../tpu_perf-x.x.x-py3-none-manylinux2014_x86_64.whl
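You can optionally verify that tpu-perf was installed into the container's Python environment (a simple check; the version shown should match the wheel you installed):

$ pip3 show tpu_perf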
10.4.3. Run the test
10.4.3.1. Compile the model
The config.yaml files in model-zoo configure the test content of the SDK. For example, the configuration file for resnet18 is model-zoo/vision/classification/resnet18-v2/config.yaml.
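To see what a test case defines before compiling, you can view its configuration directly (shown here for resnet18; adjust the relative path if your current directory differs, since the previous step left you inside tpu-mlir_xxxx):

$ cat ../model-zoo/vision/classification/resnet18-v2/config.yaml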
Execute the following command to run all test samples:
$ cd ../model-zoo
$ python3 -m tpu_perf.build --mlir -l full_cases.txt
The following models are compiled (since models are continuously being added to the model-zoo, only a partial list is shown here; this process also compiles the models used for precision testing, so the later precision test sections do not require recompiling):
* efficientnet-lite4
* mobilenet_v2
* resnet18
* resnet50_v2
* shufflenet_v2
* squeezenet1.0
* vgg16
* yolov5s
* ...
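If you only want a quick check rather than the full list, tpu_perf can also be pointed at individual case directories; the command below is an illustrative sketch using resnet18 as the case (confirm the exact usage for your tpu-perf version with python3 -m tpu_perf.build --help):

$ python3 -m tpu_perf.build --mlir vision/classification/resnet18-v2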
After the command finishes, you will see a newly generated output folder, which contains the test outputs.
Change the permissions of the output folder so that it is accessible from outside the Docker container:
$ chmod -R a+rw output
10.4.3.2. Test model performance
10.4.4. Configure SOC device
Note: If your device is a PCIE board, you can skip this section directly.
The performance test only depends on the libsophon runtime environment, so after the models have been compiled in the toolchain environment and packaged together with model-zoo, the performance test can be carried out in the SOC environment with tpu_perf. However, the complete model-zoo and the compiled output may not fit on the SOC device because of its limited storage. The following describes how to run tests on SOC devices by mounting the test directory over a Linux NFS remote file system.
First, install the NFS service on the toolchain environment server (the host system):
$ sudo apt install nfs-kernel-server
Add the following content to /etc/exports to configure the shared directory:
/the/absolute/path/of/model-zoo *(rw,sync,no_subtree_check,no_root_squash)
Here * means that everyone can access the shared directory. Alternatively, access can be restricted to a specific network segment or IP, such as:
/the/absolute/path/of/model-zoo 192.168.43.0/24(rw,sync,no_subtree_check,no_root_squash)
Then execute the following command to make the configuration take effect:
$ sudo exportfs -a
$ sudo systemctl restart nfs-kernel-server
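You can optionally verify on the host that the directory is actually exported (showmount is part of the standard NFS tooling; the output should list the path you added to /etc/exports):

$ showmount -e localhost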
In addition, you need to add read permissions to the images in the dataset directory:
$ chmod -R +r path/to/model-zoo/dataset
Install the client on the SOC device and mount the shared directory:
$ mkdir model-zoo
$ sudo apt-get install -y nfs-common
$ sudo mount -t nfs <IP>:/path/to/model-zoo ./model-zoo
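A quick way to confirm that the mount succeeded on the SOC device is to check the mount table and list the directory (illustrative; <IP> is your server's address as used above):

$ mount | grep model-zoo
$ ls ./model-zoo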
In this way, the test directory is accessible in the SOC environment. The rest of the SOC test procedure is basically the same as for PCIE; please refer to the following sections. Differences in where commands are executed and in the operating environment are noted where they occur.
10.4.5. Run the test
Running the test needs to be done in an environment outside Docker (it is assumed that you have installed and configured the 1684X device and driver), so you can exit the Docker environment:
$ exit
On the PCIE board, run the following commands to test the performance of the generated bmodel.
$ pip3 install ./tpu_perf-*-py3-none-manylinux2014_x86_64.whl
$ cd model-zoo
$ python3 -m tpu_perf.run --mlir -l full_cases.txt
Note: If multiple SOPHGO accelerator cards are installed on the host, you can specify the device used by tpu_perf by adding --devices id, for example:
$ python3 -m tpu_perf.run --devices 2 --mlir -l full_cases.txt
On the SOC device, use the following steps to test the performance of the generated bmodel.
Download the latest tpu-perf aarch64 wheel, tpu_perf-x.x.x-py3-none-manylinux2014_aarch64.whl, from https://github.com/sophgo/tpu-perf/releases to the SOC device and execute the following commands:
$ pip3 install ./tpu_perf-x.x.x-py3-none-manylinux2014_aarch64.whl
$ cd model-zoo
$ python3 -m tpu_perf.run --mlir -l full_cases.txt
After that, performance data is available in output/stats.csv, which records the running time, computing resource utilization, and bandwidth utilization of the relevant models.
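For a quick look at the results, you can pretty-print the CSV on the command line (a simple example using standard tools; the exact columns depend on your tpu-perf version):

$ column -s, -t output/stats.csv | less -S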
10.4.6. Precision test
The precision test needs to be carried out in an environment outside Docker, so you can exit the Docker environment:
$ exit
On the PCIE board, run the following commands to test the precision of the generated bmodel.
$ pip3 install ./tpu_perf-*-py3-none-manylinux2014_x86_64.whl
$ cd model-zoo
$ python3 -m tpu_perf.precision_benchmark --mlir -l full_cases.txt
Various types of precision data are available as individual CSV files in the output directory.
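The generated files can be listed as follows (the file names vary with the cases and precision metrics configured in your model-zoo):

$ ls output/*.csv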
Note: If multiple SOPHGO accelerator cards are installed on the host, you can specify the device used by tpu_perf by adding --devices id, for example:
$ python3 -m tpu_perf.precision_benchmark --devices 2 --mlir -l full_cases.txt