3.6. SDK Update Record 


3.0.0

SDK

The examples directory has been removed; the related routines are now open-sourced on GitHub: https://github.com/sophon-ai-algo/examples/

middleware

bm168x

bmcompiler & bmruntime

Add support for BM1684X

Quantization Tool

Sophon-inference

Add an interface to save input and output: set_dump_io_flag (python/c++)
Handle adds get_sn interface (python/c++)
Add multi-card inference engine MultiEngine (python)
Add putText and putText_inference (python/c++)
Add image_add_weighted interface (python/c++)
Add image_copy_to interface (python/c++)
Add image_copy_to_padding interface (python/c++)
Add set_decoder_env interface (python/c++) to customize ffmpeg parameters
PaddingAttr adds a constructor (python/c++)
Engine adds create_input_tensors_map interface (python/c++) to create input tensor map according to bmodel
Engine adds create_output_tensors_map interface (python/c++) to create output tensor map according to bmodel
Engine optimizes handling of numpy input arrays stored in Fortran (column-major) order by speeding up the Fortran-to-contiguous conversion. On an ordinary PC, the conversion time for a 640x640 color image is shortened from 40 ms to 3.5 ms.
Add nms interface (python/c++)
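
For readers unfamiliar with the operation behind the nms interface, the following is a minimal pure-Python sketch of greedy non-maximum suppression. It is not the Sophon-inference implementation or its signature; the names iou and nms, the (x1, y1, x2, y2) box layout, and the default threshold are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout for this sketch: (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed and only the first and third boxes are kept.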

SoC Firmware

2.7.0

middleware

Bmcpu opencv is supported: in pcie mode, the on-card cpu executes the corresponding opencv functions. For the list of supported functions, see the Multimedia User's Manual.
Bmcv jpeg decoder interface automatically switches to turbojpeg soft decoding when it encounters a format that is not supported by hardware.
Add the opencv capture.read_record interface, which decodes and records the stream at the same time.
Improve opencv imread decoding robustness: when the image contains errors, the decodable part of the image is still output as far as possible. This is controlled by the IMREAD_RETRY_SOFTDEC flag.
The sip library in gb28181 is changed from the GPL osip library to a self-developed sip library
Added fourcc support in opencv capture
opencv adds abi0/abi1 libraries on x86: abi0 for systems such as centos whose compiler is older than gcc5, and abi1 for ubuntu systems
Fixed the decoding failure caused by insufficient secondary axi ram in some 4k videos
Enrich the vidmulti example by incorporating command line options for multiplexing
bmcv::toBMI adds a conversion of type 8SC3/8SC1
bmcv::dumpMat adds support for video compression formats
Improve robustness against corrupted streams. When the underlying layer is blocked by a code stream error, the avcodec_decode_video2 or send_packet/receive_frame interface returns -1 so that the upper-layer application can reconnect or handle the error
Bug fixes
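
The error-return behaviour described above suggests a reconnect loop in the application. The following is a minimal sketch of that pattern only; FakeDecoder, its receive_frame method, and decode_all are hypothetical stand-ins written for this example and are not part of ffmpeg or the SDK.

```python
from typing import Callable, Iterable, List, Optional, Tuple


class FakeDecoder:
    """Hypothetical stand-in for a decoder session.

    receive_frame() returns a frame id (>= 0), -1 on a stream error,
    or None when the stream ends.
    """

    def __init__(self, script: Iterable[int]) -> None:
        self._script = iter(script)

    def receive_frame(self) -> Optional[int]:
        return next(self._script, None)


def decode_all(connect: Callable[[], FakeDecoder],
               max_reconnects: int = 3) -> Tuple[List[int], int]:
    """Collect frames, reconnecting whenever the decoder reports an error."""
    frames: List[int] = []
    reconnects = 0
    decoder = connect()
    while True:
        ret = decoder.receive_frame()
        if ret is None:          # normal end of stream
            break
        if ret < 0:              # error reported by the lower layer
            if reconnects >= max_reconnects:
                raise RuntimeError("too many stream errors")
            reconnects += 1
            decoder = connect()  # open a fresh session and continue
            continue
        frames.append(ret)
    return frames, reconnects


# First session fails after two frames, second session finishes cleanly.
sessions = iter([[1, 2, -1], [3, 4]])
frames, reconnects = decode_all(lambda: FakeDecoder(next(sessions)))
```

Bounding the number of reconnects keeps a permanently broken source from looping forever.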

bm168x

A53 is enabled by default when a PCIE driver is loaded.
Supports the PCIE virtual NIC function.
bm-smi
When a53 is enabled, bm-smi recovery reloads a53
Supports PCIE MIX MODE: in pcie mode, complete soc functions are enabled. Inference and video codec tasks are completed in the soc environment, and pcie serves only as a communication channel.

bmcompiler and bmruntime

Added new operator support and optimizations
Improved the bmodel and bmruntime memory allocation mechanism to support running large models such as BasicVSR
Added a data statistics mode. After running export BMCOMPILER_STAT_ERR=1, per-layer data similarity is collected when comparison is enabled, and compilation is not interrupted when some data exceeds the threshold
Code structure optimization: decoupling from bmlib, refactoring of the bmcompiler layers, etc.
bug fix
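
When the compiler is launched from a script rather than a shell, the BMCOMPILER_STAT_ERR variable mentioned above can be passed through the child process environment. This is a sketch under assumptions: run_with_stat_mode is a hypothetical helper, and the probe command below is a placeholder that only prints the variable; the actual compiler command line is not shown here.

```python
import os
import subprocess
import sys
from typing import List


def run_with_stat_mode(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run cmd with BMCOMPILER_STAT_ERR=1 set only for the child process."""
    env = dict(os.environ, BMCOMPILER_STAT_ERR="1")
    return subprocess.run(cmd, env=env, capture_output=True,
                          text=True, check=True)


# Placeholder command: replace it with the real compile command.
probe = [sys.executable, "-c",
         "import os; print(os.environ['BMCOMPILER_STAT_ERR'])"]
result = run_with_stat_mode(probe)
```

Setting the variable per process avoids leaving statistics mode enabled for unrelated later runs in the same shell.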

Quantization Tool

New ubuntu16.04+python3.7 docker is supported
U-FrameWork optimizes the lmdb creation interface and provides the ufwio python installation package, which is easy to embed in the user's development environment to create lmdb.
The convert_imageset binary tool is removed; use the ufwio interface above to create lmdb. Python routines are provided that can be used directly or modified.
Quantization tool adds the MSE and PERCENTILE quantization algorithms.
auto-cali is integrated into the ufw whl package for easy dependency management. The code was refactored to optimize the one-click quantization call interface and add more automatic quantization strategies. One-click quantization is recommended for cv quantization.
Add a new version of the visualization tool; the old version remains available for now. The new version shows the network structure, is more convenient to operate, and is recommended.
bug-fix

Sophon-inference

Refactor headers to hide implementation details and improve compatibility
Fixed crop/resize/convert interface forcing the input BMImage with specified output format to BGR_PLANAR
Added the set_print_flag() switch interface for printing the main time costs during inference
In Python, sail.Decoder removes the read_ interface that returns a bm_image
Added copy_from() and attach_from() interfaces to copy and attach data from BMImageArray.
Changed the initialization of sail.Tensor in Python, adding the own_sys_data parameter to enable a Tensor that consists only of device memory
The imwrite_ interface for bm_image is added to Bmcv
In Python, sail.Bmcv adds rectangle_() to draw rectangles on a bm_image
In Python, sail.Bmcv adds the vpp_convert_format() interface
In Python, sail.Bmcv adds the convert_format() interface
In Python, sail.Bmcv adds the crop_and_resize_padding() interface
Added syntax hints for Python development
Fixed several documentation errors

SE5 firmware

SE5 v2 supported
Decoupled from the gate application (the preset face capture and recognition application); Ubuntu20.04 release only
Port the sophon system interface and restful api to the family bucket box to provide some system information (tpu/vpu/fan, etc.) to customers
Port some basic commands of bm_bin from gate file system to family bucket box to improve user experience
oemconfig.ini is generated
Add the upgrade command (bm_upgrade_runtime) to download the latest upgrade package from the official website and upgrade it
qt5.14 lib is ported to box
SophUI (hdmi display interface): displays information and changes the ip

documents

The module documents previously released with the SDK are now published in the document center on the official website, where they can be viewed online or downloaded as PDF.

2.6.0

bmlib&driver

Added support for the loongarch64 platform
Fixed an overflow issue when the firmware exceeds 1M

bmvid / middleware, bmcv

Add the vpu/jpu usage information in proc
opencv VideoWriter adds direct support for rtsp/rtmp, with examples
The multimedia_guide document has been significantly updated and the multimedia_faq document has been added
Update and improve the bm-scale filter in ffmpeg: add csc conversions for more color formats and support the use of filters in normal mode
pcie opencv enables the bmcpu function and provides basic drawing operations for the A53, such as font, rectangle, and line
bmcv/vpp Adds support for argb colors
ffmpeg bmcodec adds support for loop-decoding
Added support for the loongarch64 platform
Windows, mips64, sw64, and loongarch64 platforms add the sample use cases available on x86 Linux
Added various examples of opencv/ffmpeg to show switching calls between ffmpeg+bmcv
Bug fixes: support jpeg encoding of cropped images, reduced memory footprint of opencv video decoding, removed libbm_x264, removed absolute path from opencvModule.cmake, etc.

Sophon-inference

Fix bugs and improve stability
Added frame rate acquisition interface
Supports int32 model input types
Add a multi-process Demo
Document modification

nntc

Added bmneto to support onnx model compilation
bmnett supports the tensorflow 2.x and saved_model formats
3d operators such as conv3d, deconv3d and related optimizations were added, and 3d models such as slowfast and 3dunet were supported
strideslice, transpose, depth2space and other optimizations were added to improve the performance of yolov5 related models
Optimize the timestep process to reduce compilation time
The python version supported by related tools has been upgraded to support Python 3.7
Fixed several bugs

cali

Document modification
Graph optimization speed-up: implemented in c++ instead of python
Added quantization support for the reverse layer; lstm and batchmatmul layers are set to floating-point inference during quantization
bugfix

ufw

Added support for onnx model to umodel
Supports retention of user-defined input and output names
Several bug fixes

bsp

Update SM750 driver; Added XFS support
Added pcie mixed mode support; fixed an overflow when a single gz file in the flashing package is too large; upgraded the open source sdk and newos packaging scripts; added the lte modem service; fixed ethtool version issues

2.5.0

bmlib&driver

Windows SDK development support
Currently, the driver can be installed in debug mode and supports single-core and three-core cards
Basic functions of bm-smi are supported
Supports basic VPU/JPU codec functions
Support for bm_opencv, ffmpeg library acceleration
Supports the network inference runtime library
Added the statistics and display of the usage of vpu and jpu
Adapted to the Sunway server
Changed the display interface of bm-smi
Added protection against system crashes when executing the bm-smi recovery operation

bmcv

Add two operators: bmcv_image_axpy, bmcv_image_laplacian
vpp padding support with stride.

bmvid / middleware

Supports running opencv on the A53 in SC5 mode: warp, affine, sobel, erode, dilate, morphologyEx, line, calcOpticalFlowPyrLK, OpticalFlowFarneback, calcHist, boxFilter, bilateralFilter, gaussianBlur
Fixed an issue with rtsp mjpeg decoding
Fixed packaging issues with FLVS in hardware coding
ffmpeg supports lame mp3 decoding
ffmpeg supports the hls protocol
ffmpeg command line adds zero_copy option to support hardware-decoded yuv output
Extended platform support to Loongson, Sunway, and windows systems
Add vpu/jpu usage information to the proc output
bug fix

Sophon-inference

bug fix
Document correction

nntc

Added the user-defined tpu layer plug-in, which allows a layer written with bmkernel to be inserted into the network
Added user-defined compilation optimization plug-ins and demo
The nms supports a maximum of 65536 detection boxes
The bmnetu supports gru layer
bmcpu/bmruntime are adapted to windows/mips64/sw64
int8 model dynamic compilation supports conv3d/pooling3d
Performance optimization and bug fix
Document update

cali

Incorporate the leaky relu layer and other optimizations
Added the use of maximum quantization option
Document update
bug fix

ufw

Optimized memory usage, effectively reducing memory usage by 50%
The layer adds the Tag function to classify and describe functions of the Layer
Added semantic analysis of CFG
Fixed some bugs

bsp

Added the a53 soft reboot function to ATF; u-boot fixed the mcu watchdog timeout issue after the CLI is restarted; linux added efuse secure key protection and modified the tpll switching mode
kernel disabled erratum 843419 and fixed the cpufreq bug
kernel5.4 and ubuntu20.04 systems are supported; compatible with both old and new its node names; added a dts for sm5 (incompatible!); debian adds device discovery tools for ethtool, gate, etc.; modified lpddr4 parameters; improved u-boot usb compatibility; deleted duplicate realtek config in kernel4.9; added usb acm support
Added SE5 lite support; added a memory layout correction tool; fixed the low configuration of SM5 mini
Update SM750 driver; Added XFS support

2.4.0

bmlib&driver

Added a card management interface
Added the mips tool chain and function support, reaching customer trial level
Added Windows SDK bmlib and some test case cmakelist files, driver and Bmlib platform adaptation code (not yet at the release level).
Fixed an issue where drivers could not be re-installed after A53 was enabled
Refactored how files are organized under proc on Linux
Refactored the bm-smi code and interface organization to support per-card display
Added soft reset function after hang for vpp module
Added an interface for retrieving memory heap information in bmlib
Fixed a failure to read the smbus temperature because the INTC iic2 interrupt was masked

bmcv

New features: put_text, draw_lines;
In PCIe mode, A53 can be enabled and the on-chip CPU is used to perform some operations.

bmvid / middleware

Added mips64 platform support
The assert judgment in the bmvid API is removed, and an error value is returned
Unify all jpeg decoded output to planar format
Added support for dumpBMImage for float32 int8 and other formats
Update osip library to improve compatibility and stability of gb28181
Adjust the underlying scheduling to make it more even during multi-channel decoding
The soc lifts the 8-channel limit and supports 24-channel video encoding
Fix problems and enhance stability

Sophon-inference

Python interface adds base64 support
Enhanced Bmcv::imwrite to support more formats.
Some optimizations in BMImage processing
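
As background for the base64 support listed above, the sketch below shows the encoding itself using only the Python standard library. It does not use or describe the Sophon-inference API; the payload is an arbitrary placeholder for binary data such as an encoded image.

```python
import base64

# Placeholder binary payload (for example, the bytes of an encoded image).
payload = bytes(range(8))

# Encode to an ASCII string that is safe to embed in JSON or HTTP bodies.
encoded = base64.b64encode(payload).decode("ascii")

# Decode back to the original bytes on the receiving side.
decoded = base64.b64decode(encoded)
```

Base64 output is about one third larger than the binary input, which is worth keeping in mind when transferring images.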

nntc

【BMNETC】Power operator bugfix, support L2Normalize operator
【BMNETC】Fixed a bug where the rpnproposal operator was calculated incorrectly
【BMNETP】pytorch version upgraded to 1.8.1
【BMNETP】support BERT model
【BMNETT】Added support for trigonometric functions such as arcsin, arccos, arcsinh, arccosh, arctanh, cosh, sinh, and tan
【BMNETU】The new layer supports: ShapeRange
【BMNETU】The python interface supports int8 umodel splitting
【BMNETU】daily test has been enhanced
【BMNETU】The Tile layer supports two inputs
【BMNETU】Fixed dynamic network bug containing reorg fix8b layer
【BMCOMPILER】Refactoring the graph filter optimizer
【BMCOMPILER】Reconstructs the layer group optimizer into a pass structure
【BMCOMPILER】The performance of the SILU activation in fix8b has been optimized and the performance of yolov5 has been improved
【BMCOMPILER】Added support for the SIGN activation layer
【BMCOMPILER】Added Timestep merge optimization and fixed 3IC optimization issues. The performance of yolov3 and other detection networks is improved somewhat
【BMCOMPILER】TOPK operators use TPU acceleration; networks containing TOPK have improved performance
【BMLANG】stride class computing performance optimization and coeff input support
【BMLANG】condition_select computations support data broadcast
【BMLANG】Support for Shape Tensor control input, and update bmlang demo rgb2yuv support for dynamic crop regions
【BMLANG】Performance optimization of multiple masked_select operators using the same mask
【BMLANG】The performance of the nms operator is improved by several times
【BMRUNTIME】Removed direct exit calls; a c++ exception is used to return an error value instead
【BMRUNTIME】Support for Loongson mips
【BMRUNTIME】Networks with FC layers also support dynamically variable resolution
【BMPROFILE】Fixed layer and GDMA display errors
【BMKERNEL】Supports mips64
【BMKERNEL】Added BMRT + BMKERNEL running yolov3 backbone + post-processing demo
【BACKEND 1684】Added the bmkernel demo of high-performance group topk
【BACKEND 1684】Fixed an issue where dynamic networks containing reduce would hang after running many rounds
【BACKEND 1684】Support for any multi-operator dynamic network
【BACKEND 1684】Fixed a calculation error bug for 3d max pooling
【BACKEND 1684】Optimized performance of 4N/1N conversion, some network performance will be improved

cali

Fixed bugs in layers such as batchnorm and leakyrelu
Add is_shape_layer tag to better support shape-related operations, and add shapeRange, expandDims, shapeAssign, and shapeCast layers to support int32 data
Reorganized the forward_with_float function to improve stability
Added the ADMM statistics threshold
Improve the auto_calib function

ufw

Networks containing control flows are supported by default
Supports networks with arbitrary dynamic shapes and networks whose shapes are inferred at run time
Some bug fixes
Some layer functions are enhanced

bsp

Increased the ramdisk size in recovery mode; added the usb flashing function
Added support for the SM5 mini wide-temperature version; fixed the temperature point and CPU frequency scaling mechanism of the wide-temperature board; adjusted the pcie rc initialization code to support boards without refclk0; added the spacc secure key function; slimmed down u-boot and BL2; upgraded the flash update tool; added a recovery partition to support OTA; bound pcie interrupts to cpu7; added the fl2000 re-enumeration function
Added the a53 soft reboot function to ATF

2.3.2

bmlib&driver

Added adaptation to mips
Added some libraries for the mips architecture
Fixed several SC5P bugs
Added some cmake code and scripts to support Windows compilation
Added support for SM5-W

bmcv

Fixed several corner case bugs;
Added interfaces: absdiff, threshold, fft, max_min, cal-hist;
Reorganized where the back-end code is stored: only a few apis are placed in itcm, and all new ones are placed in ddr.

bmvid

Supports Loongson platform compilation

middleware

Extend opencv support to 512 channels for video
ffmpeg osip library update
Fix bugs
Supports Loongson platform compilation

Sophon-inference

Fixed an issue where the URL in the document would not connect.

nntc

【BMLANG】Added deconv’s perchannel int8 compute demo.
【BMLANG】Select/Condition select supports int8.
【BMLANG】The performance of the NMS is optimized
【BMNETC】Added CONV3D and POOLING3D support.
【BMNETT】Add user-defined input data and data types.
【BMCOMPILER】Fixed some bugs to support the customer model.
【BMCOMPILER】Optimized the wait time for running INT8 networks.
【BMNETP】Added support for BMM, LAYER NORM, and EINSUM operators.
【BMNETU】Improved unit testing.
【BMPROFILE】Added the CSV export function.
【BMPROFILE】Added the ability to parse Global memory operations.
【BMRUNTIME】Adapted to the Loongson.
【BACKEND_1684】Optimized the time overhead for GDMA GEN CMD.
【BACKEND_1684】Fixed back end bugs.
【BACKEND_1684】Optimized group topk and full library topk demo.

cali

Fixed a bug in the batchnorm layer
Add the auto_calib function

ufw

Fixed several bugs
Enhanced stability of analysis tools
conv per-channel computation is supported
Removed some of the python apis for ufw blob data

bsp

The DDR data in the ATF is separated into a separate bin file
Switch to MCU watchdog
SM5 wide-temperature board support
BSP SDK open source related

2.3.1

bmlib&driver

Added a device management interface and updated the document
Added MIPS architecture engineering
Fixed warnings in compilation
Added adaptation to SC5P
Refactored the mechanism for updating chip and board temperature and tpu utilization

bmcv

sobel, gaussian blur, add-weighted, dct, yuv2hsv, and batch-topk operators are added.
Optimized and extended the nms to support a maximum of 65535 boxes, with improved performance
Extended matmul to support float32 output
bugfix

middleware

The 8UC1→8UC1, BORDER_DEFAULT case of the sobel interface is implemented with hardware acceleration
The 8UC1 input of the gaussian_blur interface is implemented with hardware acceleration
Added support for hikvision smart options
Optimized opencv font rendering in yuv
Update multimedia document
Improve stability and fix bugs

Sophon-inference

Fixed problems and improved stability

nntc

【bmcompiler】Performance optimization for networks containing C axis CONCAT operators.
【bmnetu】【bmnetp】Added support for 3D Conv/Pooling.
【bmnetu】Added support for multi-input networks.
【bmnett】Added the check ops function.
【bmnetp】Added support for the Pytorch GRU/LSTM model.
【bmnetp】Upgrade to support pytorch version 1.7.
【bmlang】The Deconv operator supports 16bit output.
【bmlang】Added performance optimized group topk business program demo.
【bmlang】GEMM supports INT8 input /FP32 output and supports perchannel scale.
【bmlang】Added deconv to do a demo of perchannel quantization calculations.
【backend_1684】When the nms operator is larger than 1024 boxes, the performance is better than 2.3.0.
【backend_1684】Improved performance of 1N and 4N data conversion.
【all】Fixed BUGs to improve stability.

ufw

float computing supports networks containing control flows
UFW python Blobs are compatible with the numpy data type
The UFW supports partially empty set input and computation

bsp

The perfetto tool is pre-installed
The deb packages required for compiling kernel modules are pre-installed on the board
VPP driver code optimization
The PMU was enabled

3.7. Release Note 

22.09.02 release note:

TPU-NNTC: bugfix
TPU-KERNEL: Optimize soc and cmodel testing; bugfix
TPU-MLIR: None
TPU-PERF: None
sophon-mw: bugfix
libsophon: Improve bm1684 soc mode support; bugfix
sophon-img: bugfix
sophon-pipeline: None

22.10.01 release note:

TPU-NNTC: Send packages directly from nntoolchain
TPU-KERNEL: Add some CV operators
TPU-MLIR: Support for compiling caffe model
TPU-PERF: bugfix
sophon-mw: bugfix
libsophon: reorganized the bmvid directory structure; added ARM architecture support; bugfix
sophon-img: updated the kernel version to 5.4.219 and the u-boot version to 2022.10; bugfix
sophon-pipeline: add yolov5 and video_stitch routines
sophon-sail:

The following APIs have been added to Python:

BMImage adds an interface to convert to opencv Mat: asmat()
BMImage adds an interface to get the device it belongs to: get_device_id()
BMImageArray adds an interface to get the device it belongs to: get_device_id()
Bmcv adds an interface to draw points on a BMImage: drawPoint()
Bmcv adds an interface to draw points on a bm_image: drawPoint_()

The following APIs have been added to C++:

BMImage adds an interface to get the device it belongs to: get_device_id()
BMImageArray adds an interface to get the device it belongs to: get_device_id()
Bmcv adds an interface to draw points on a BMImage: drawPoint()
Bmcv adds an interface to draw points on a bm_image: drawPoint_()

Fixed a problem in Decoder where the Python GIL was not released during decoding

sophon-demo: Released for the first time, including 10 routines such as YOLOv5 and ResNet
sophon-rpc: Support starting the on-board A53 in PCIe mode, and automatically enable the A53 after reboot by configuring udev rules
bmmonkey
bmpanda

22.11.01 release note:

TPU-NNTC: bugfix
TPU-KERNEL: Added samples for some cv operators
TPU-MLIR: Support for hybrid quantization
TPU-PERF: bugfix
sophon-mw: bugfix; added memory allocation through bmlib and removed the old ION LIB allocation method
libsophon: centos rpm package support; centos abi0 support
sophon-img: changed the kernel to non-preemptive mode; bugfix; added documentation on how to use custom software packages and how to modify the kernel memory layout
sophon-pipeline: add retinaface routines
sophon-sail:

crop_and_resize API adds support for BMCV_INTER_LINEAR and BMCV_INTER_BICUBIC methods
removed the vpp_crop_padding interface because of a problem in its implementation
resize API adds support for BMCV_INTER_LINEAR and BMCV_INTER_BICUBIC methods
crop_and_resize_padding API adds support for BMCV_INTER_LINEAR and BMCV_INTER_BICUBIC methods
add a common preprocessing interface

sophon-demo: refactored the cpp/lprnet_bmcv routine of LPRNet
sophon-rpc: bugfix
bmmonkey
bmpanda

22.12.01 release note:

TPU-NNTC: bugfix
TPU-KERNEL: bugfix
TPU-MLIR: Tflite supports inception/yolo network
TPU-PERF: fix mlir model run error; bugfix
sophon-mw: bugfix; add A53 coding case; change ffbmcv case
libsophon: bugfix
sophon-img: enable nfsd function; bugfix
sophon-pipeline: add face_recognition and multi routines
sophon-sail:

crop_and_resize interface adds a configurable interpolation algorithm: nearest neighbor, linear, or bicubic
resize interface adds a configurable interpolation algorithm: nearest neighbor, linear, or bicubic
crop_and_resize_padding interface adds a configurable interpolation algorithm: nearest neighbor, linear, or bicubic
bugfix
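
To illustrate how the interpolation choices above differ, here is a minimal pure-Python sketch of nearest neighbor and linear (bilinear) sampling on a small grayscale image. It is not the sophon-sail or BMCV implementation; the function names and the pixel-center coordinate convention used in resize are assumptions for illustration, and bicubic is omitted for brevity.

```python
import math
from typing import Callable, List

Image = List[List[float]]  # rows of grayscale pixel values


def sample_nearest(img: Image, x: float, y: float) -> float:
    """Return the value of the pixel closest to (x, y)."""
    h, w = len(img), len(img[0])
    xi = min(max(math.floor(x + 0.5), 0), w - 1)
    yi = min(max(math.floor(y + 0.5), 0), h - 1)
    return img[yi][xi]


def sample_bilinear(img: Image, x: float, y: float) -> float:
    """Blend the four pixels around (x, y) by their distances."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1)
    y = min(max(y, 0.0), h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def resize(img: Image, out_w: int, out_h: int,
           sampler: Callable[[Image, float, float], float]) -> Image:
    """Resize by sampling the source at each output pixel center."""
    sx, sy = len(img[0]) / out_w, len(img) / out_h
    return [[sampler(img, (j + 0.5) * sx - 0.5, (i + 0.5) * sy - 0.5)
             for j in range(out_w)] for i in range(out_h)]


src = [[0, 10], [20, 30]]
center_linear = sample_bilinear(src, 0.5, 0.5)   # average of all four pixels
near_top_right = sample_nearest(src, 0.9, 0.1)   # snaps to pixel (1, 0)
upscaled = resize(src, 4, 4, sample_bilinear)
```

Nearest neighbor is the cheapest option but produces blocky results when upscaling; linear interpolation gives smoother output at a modest extra cost.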

sophon-demo: bugfix; LPRNet/cpp/lprnet_bmcv replaces opencv decoding with ff_decode; reconstructs SSD-related routines
sophon-rpc: Support for uninstalling libraries; fix some bugs during installation.
bmmonkey
bmpanda

3.8. SDK known problems 

This section lists known but unresolved SDK problems and provides temporary workarounds for users' reference.

None at present.
