Decompress test package
mkdir -p sophon/model-zoo
tar -xvf path/to/model-zoo_<date>.tar.bz2 --strip-components=1 -C sophon/model-zoo
cd sophon/model-zoo
The directory structure of the test package is as follows:
├── config.yaml
├── requirements.txt
├── data
├── dataset
├── harness
├── output
└── ...
config.yaml holds the general configuration: the dataset directory, the model root directory, and some reusable parameters and commands.
requirements.txt lists the Python dependencies of model-zoo.
The dataset directory contains the dataset pre-processing code for each model (e.g. for ImageNet), which tpu_perf invokes as a plugin.
The data directory is used to store the LMDB datasets.
The output directory is used to store the bmodel files produced by compilation and some intermediate data.
The remaining directories hold the information and configuration of the individual models. Each model's directory contains a config.yaml file, which specifies the model's name, path, FLOPs, dataset production parameters, and quantization/compilation command.
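For orientation, the sketch below shows the general shape of a per-model config.yaml. Apart from cali_set (described in the precision chapter below), the field names are illustrative assumptions, not the exact schema; consult a config.yaml shipped in the package for the authoritative layout.
# Illustrative sketch only -- most field names here are assumptions
name: resnet50          # model name
gops: ...               # model FLOPs
dataset:                # dataset production parameters
  cali_set: 200         # e.g. number of calibration images
cali: ...               # quantization/compilation command for the model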
Prepare dataset
ImageNet
Download the following file from the ImageNet 2012 dataset:
ILSVRC2012_img_val.tar(MD5 29b22e2961454d5413ddabcf34fc5622).
Decompress ILSVRC2012_img_val.tar into the dataset/ILSVRC2012/ILSVRC2012_img_val directory:
cd path/to/sophon/model-zoo
mkdir -p dataset/ILSVRC2012/ILSVRC2012_img_val
tar xvf path/to/ILSVRC2012_img_val.tar -C dataset/ILSVRC2012/ILSVRC2012_img_val
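To rule out a corrupted download, the archive can be checked against the MD5 checksum given above:
md5sum path/to/ILSVRC2012_img_val.tar
# Expect: 29b22e2961454d5413ddabcf34fc5622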
COCO (optional)
If the COCO dataset is used for the accuracy test, download and decompress it as follows:
cd path/to/sophon/model-zoo/dataset/COCO2017/
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
unzip annotations_trainval2017.zip
unzip val2017.zip
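After unzipping, the directory layout can be sanity-checked:
ls
# Expect to see: annotations  val2017  (alongside the two downloaded zip archives)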
Run performance and accuracy tests on SoC
Performance and accuracy testing depends only on the libsophon runtime environment, so models compiled in the toolchain compilation environment can be packaged together with model-zoo, and tpu_perf can then run the performance and accuracy tests in the SoC environment. However, because eMMC space is limited, the complete model-zoo plus compiled output may not fit on the SoC. The method below compiles the models in the toolchain environment and runs the tests on the SoC by mounting a Linux NFS remote file system.
First, install the NFS service on the toolchain environment server (host system):
sudo apt install nfs-kernel-server
Add the shared directory to /etc/exports:
/path/to/sophon *(rw,sync,no_subtree_check,no_root_squash)
The * means any host can access the shared directory; it can instead be restricted to a specific network segment or IP, such as 192.168.43.0/24.
Then execute the following commands to make the configuration take effect:
sudo exportfs -a
sudo systemctl restart nfs-kernel-server
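You can confirm that the export is active by querying the server locally:
showmount -e localhost
# The output should list /path/to/sophon and its allowed clients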
In addition, you need to add read permissions to the images in the dataset directory:
chmod -R +r path/to/sophon/model-zoo/dataset
On the SoC, install the NFS client and mount the shared directory:
mkdir sophon
sudo apt-get install -y nfs-common
sudo mount -t nfs <IP>:/path/to/sophon ./sophon
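To confirm the mount succeeded, list the shared contents from the SoC:
ls ./sophon
# model-zoo and the rest of the shared directory should be visible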
This makes the test directory accessible in the SoC environment. The remaining SoC test steps are basically the same as for PCIe; where a command must instead be executed on the SoC, this is noted at the corresponding step below.
Prepare running environment
Using Python 3.7 or above, install the required dependencies for model-zoo with pip:
sudo apt-get install -y libgl1 # For OpenCV
# The following steps are required only for the accuracy test; the performance test can skip them
cd path/to/sophon/model-zoo
pip3 install -r requirements.txt
If running the tests in the SoC environment, execute the above commands on the SoC.
In addition, the runtime environment needs access to the TPU hardware for the performance and accuracy tests. Please install libsophon by following the libsophon user manual.
Install the tpu_perf tool
With the tpu_perf tool you can conveniently verify model performance and accuracy, and test multiple models and batch sizes in batch. You can obtain the latest tpu_perf release for your architecture from the release page, or compile it from source. The source compilation steps follow.
Please compile the tpu_perf source code in the deployment environment. Compilation depends on libsophon-dev; refer to the libsophon manual for installation.
Compilation also depends on the Python packaging tools; install them with pip:
pip3 install setuptools wheel
Compilation also depends on protoc. On x86 machines, install it as follows to maintain compatibility with the tpu-nntc environment:
wget -O /tmp/protoc-3.19.4-linux-x86_64.zip \
https://github.com/protocolbuffers/protobuf/releases/download/v3.19.4/protoc-3.19.4-linux-x86_64.zip
cd path/to/sophon
mkdir protoc
unzip -o -d ./protoc /tmp/protoc-3.19.4-linux-x86_64.zip
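You can check that the expected protoc binary is now in place:
./protoc/bin/protoc --version
# Expect: libprotoc 3.19.4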
In the SoC environment, protoc can be installed directly with the package manager:
sudo apt-get install -y protobuf-compiler
Then fetch the tpu_perf source code, unzip it, and compile. The commands are as follows:
mkdir tpu_perf
tar xvf path/to/tpu-perf-X.Y.Z.tar.gz --strip-components=1 -C tpu_perf
# If the tpu_perf/build directory already exists, it is recommended to delete it first.
# rm -r tpu_perf/build
# Execute compilation.
mkdir -p tpu_perf/build
cd tpu_perf/build
PATH=$PATH:../../protoc/bin cmake ..
make install/strip -j4
The PATH environment variable set in the cmake generation step ensures that the correct protoc program is used; it can be omitted in the SoC environment.
After successful compilation, the whl package is generated under tpu_perf/python/dist. At runtime, tpu_perf also depends on numpy, lmdb, protobuf==3.19.*, psutil, and pyyaml, so make sure you can connect to the Internet, or install these dependencies manually, when installing the whl.
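If you prefer, the runtime dependencies listed above can be installed explicitly before the whl (quote the version spec so the shell does not expand the *):
pip3 install numpy lmdb 'protobuf==3.19.*' psutil pyyaml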
cd ..
pip3 install --upgrade python/dist/tpu_perf-X.Y.Z-py3-none-<arch>.whl
Please note that if you run the tests in the SoC environment, you need to download and compile tpu_perf on both the SoC and the toolchain environment (host system), so that you obtain an aarch64 package and an x86_64 package for the two machines respectively.
Prepare toolchain compilation environment
It is recommended to use the toolchain software inside a Docker environment. The latest version of Docker can be installed by following the official tutorial. After installation, execute the following commands to add the current user to the docker group, so that Docker can be run without root privileges.
sudo usermod -aG docker $USER
newgrp docker
Next, pull the Docker image from Docker Hub:
docker pull sophgo/tpuc_dev:v2.1
Then decompress the toolchain development package in the test package directory. The latest toolchain development package can be obtained from the official website.
cd path/to/sophon
tar xvf tpu-nntc_vx.y.z-<hash>-<date>.tar.gz
# Optional, copy the tpu_perf installation package to the working directory
cp path/to/tpu_perf-X.Y.Z-py3-none-x86_64.whl ./
Next, start a Docker container and map the current directory into it:
docker run -td -v $(pwd):/workspace --name nntc sophgo/tpuc_dev:v2.1 bash
Enter the Docker environment by executing the following command:
docker exec -it nntc bash
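Inside the container, the mapped directory should now be visible under /workspace:
ls /workspace
# model-zoo and the decompressed tpu-nntc directory should be listed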
A complete example
Taking the resnet50 network as an example, this section runs through the complete performance and accuracy test flow once, so that users can gain a full understanding of the test process.
First, compile the model in the Docker toolchain environment:
# Env setup
cd /workspace/tpu-nntc_vx.y.z-<hash>-<date>
source scripts/envsetup.sh
# Prepare working directory
cd /workspace/model-zoo
mkdir -p output/resnet50
cp vision/classification/ResNet50-Caffe/ResNet-50-* output/resnet50
cd output/resnet50
python3 -m ufw.cali.cali_model \
--model ./ResNet-50-deploy.prototxt \
--weights ./ResNet-50-model.caffemodel \
--cali_image_path /workspace/model-zoo/dataset/ILSVRC2012/caliset \
--test_iterations 10 \
--net_name resnet50 \
--postprocess_and_calc_score_class none \
--target BM1684X \
--cali_iterations 100 \
--cali_image_preprocess='
resize_side=256;
crop_w=224,crop_h=224;
mean_value=103.94:116.78:123.68,scale=1' \
--input_shapes=[1,3,224,224]
This command quantizes the model to int8 using the auto_cali quantization tool. A calibration dataset is required, and the pre-processing parameters must be specified.
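If quantization and compilation succeed, the bmodel is placed at the path the test commands below expect; a quick existence check:
# Still inside Docker, from /workspace/model-zoo/output/resnet50
ls resnet50_batch1/compilation.bmodel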
Next, run the performance test and accuracy verification in the runtime environment. First, exit the toolchain Docker environment:
exit
If running the tests in the SoC environment, execute the following model test commands on the SoC.
Verify the model inference time with the bmrt_test tool:
cd path/to/sophon/model-zoo
bmrt_test --bmodel output/resnet50/resnet50_batch1/compilation.bmodel
After execution, the key performance parameters of the model are printed.
Next, run the accuracy test program to verify the bmodel accuracy on the dataset:
python3 harness/topk.py \
--mean 103.94,116.78,123.68 --scale 1 --size 224 \
--bmodel output/resnet50/resnet50_batch1/compilation.bmodel \
--image_path ./dataset/ILSVRC2012/ILSVRC2012_img_val \
--list_file ./dataset/ILSVRC2012/caffe_val.txt \
--devices 10
The program's last line of output reports the top-1 and top-5 accuracy measured for the bmodel.
The following two chapters describe how to verify performance and accuracy with tpu_perf. Since multiple models are involved and the commands take a relatively long time, it is recommended, when working over an SSH session, to run them inside a terminal multiplexer such as screen or tmux so that tasks are not killed when the session ends.
Performance test
First, compile the models in the Docker toolchain environment:
# Env setup
cd /workspace/tpu-nntc_vx.y.z-<hash>-<date>
source scripts/envsetup.sh
# Install tpu_perf tool
pip3 install --upgrade path/to/tpu_perf-X.Y.Z-py3-none-x86_64.whl
cd /workspace/model-zoo
python3 -m tpu_perf.build --list default_cases.txt --time
All tpu_perf commands can take a case list file; here default_cases.txt is used. The full set of cases can be run by specifying full_cases.txt (this may take a long time), or a custom list file. If no list is specified, tpu_perf traverses the current directory looking for config.yaml files and executes them one by one.
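The list-file format is assumed here to match default_cases.txt: plain text naming one case directory per line. A minimal custom list reusing the resnet50 example from earlier might look like:
# my_cases.txt -- assumed format: one case directory per line
vision/classification/ResNet50-Caffe
which would then be run with python3 -m tpu_perf.build --list my_cases.txt --time.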
The tests themselves run in the runtime environment outside Docker; if testing on this machine, exit the Docker environment:
exit
If running the tests in the SoC environment, execute the following model test commands on the SoC.
Since the models are generated by the root user inside Docker by default, change the owner of the output directory to the current user:
cd path/to/sophon/model-zoo
sudo chown -R $(whoami):$(whoami) output
Then install the tpu_perf tool and use it to run the models and generate performance data:
pip3 install --upgrade path/to/tpu_perf-X.Y.Z-py3-none-<arch>.whl
python3 -m tpu_perf.run --list default_cases.txt
Performance data can be found in output/stats.csv.
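Since stats.csv is a plain comma-separated file, one convenient way to skim it in a terminal is:
column -s, -t output/stats.csv | less -S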
Precision verification
First, prepare the datasets and compile the models in the Docker toolchain environment.
Prepare quantization and test LMDB
When models are quantized with auto_cali, either an image set or an LMDB dataset can be used. If an image set is specified, auto_cali automatically generates an LMDB dataset from it, based on the pre-processing parameters, for quantization. Since several models share the same image set and pre-processing during batch quantization, the tpu_perf tool is used here to generate all the LMDB datasets in one batch before calling auto_cali for quantization.
# Install the tpu_perf tool
cd /workspace/tpu_perf
pip3 install --upgrade path/to/tpu_perf-X.Y.Z-py3-none-x86_64.whl
cd /workspace/model-zoo
python3 -m tpu_perf.make_lmdb --list default_cases.txt
Executing this command generates pre-processed LMDB datasets for quantization and testing, based on each model's configuration.
The tool configuration uses 200 calibration images by default. To use a different image set, put the images in the dataset/ILSVRC2012/caliset directory and set the cali_set field in config.yaml to the number of calibration images. For details, see the pre-processing implementation in the dataset directory.
The script may take a long time to run; please be patient.
Model quantization and compilation
# Env setup
cd /workspace/tpu-nntc_vx.y.z-<hash>-<date>
source scripts/envsetup.sh
cd /workspace/model-zoo
python3 -m tpu_perf.build --list default_cases.txt
This command quantizes and compiles the models according to each model's configuration. With many models, the script can take a long time to run; please be patient.
The accuracy test runs in the runtime environment outside Docker; if testing on this machine, exit the Docker environment:
exit
Precision test
If running the tests in the SoC environment, execute the following model test commands on the SoC.
Since the datasets and models are generated by the root user inside Docker by default, change the owner of the generated directories to the current user.
# Install the tpu_perf tool
pip3 install --upgrade path/to/tpu_perf-X.Y.Z-py3-none-<arch>.whl
cd path/to/sophon/model-zoo
sudo chown -R $(whoami):$(whoami) data
sudo chown -R $(whoami):$(whoami) output
The tests can then be run using the tpu_perf tool:
python3 -m tpu_perf.precision_benchmark --list default_cases.txt
The various accuracy results are written to individual CSV files in the output directory.
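The generated report files can be located with:
ls output/*.csv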