Helper Functions

tpu_data_type_size

Returns the storage size of a data type, in bytes.

int tpu_data_type_size(data_type_t dtype)

Note: DT_INT4 and DT_UINT4 are not supported; passing either type triggers an ASSERT.

tpu_data_type_bits

Returns the bit width of a data type.

int tpu_data_type_bits(data_type_t dtype)
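
The two queries above differ in how they treat 4-bit types. The following is a minimal stand-alone sketch, not the library implementation: the `demo_` names and the small type table are hypothetical and exist only for illustration. It derives a byte size from a bit width and asserts when the width is not a whole number of bytes, which is one plausible reason a byte-size query cannot answer for 4-bit types.

```c
#include <assert.h>

/* Hypothetical stand-in for a few data_type_t values; not the real enum. */
typedef enum { DEMO_INT4, DEMO_INT8, DEMO_FP16, DEMO_FP32 } demo_dtype_t;

/* Bit width per element for the demo types above. */
static int demo_type_bits(demo_dtype_t dtype) {
    switch (dtype) {
    case DEMO_INT4: return 4;
    case DEMO_INT8: return 8;
    case DEMO_FP16: return 16;
    case DEMO_FP32: return 32;
    }
    return 0;
}

/* Byte size per element; asserts for sub-byte types such as DEMO_INT4. */
static int demo_type_size(demo_dtype_t dtype) {
    int bits = demo_type_bits(dtype);
    assert(bits % 8 == 0 && "sub-byte types have no whole-byte size");
    return bits / 8;
}

/* Bytes needed for `count` elements, computed from the bit width so that
 * sub-byte types work too (the total is rounded up to a whole byte). */
static int demo_buffer_bytes(demo_dtype_t dtype, int count) {
    return (count * demo_type_bits(dtype) + 7) / 8;
}
```

Under these assumptions, buffer sizes for 4-bit data are best computed from the bit width rather than from the byte size.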

tpu_npu_index

Returns the index of the NPU that a given address belongs to.

int tpu_npu_index(local_addr_t addr)

tpu_bank_index

Returns the index of the bank that a given address belongs to.

int tpu_bank_index(local_addr_t addr)

tpu_channle_num_per_npu

Returns the number of channels of a tensor that are allocated to each NPU.

int tpu_channle_num_per_npu(int start_idx, int num_channels)

Parameters:

start_idx – index of the starting NPU

num_channels – number of channels in the tensor

Returns:

Number of channels allocated per NPU
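
As an illustration of how such a per-NPU count can be derived, here is a minimal sketch under stated assumptions: channels are dealt out one per NPU starting at `start_idx` and wrap around after the last NPU, and the NPU count is 64. Both the distribution rule and `DEMO_NPU_NUM` are assumptions, and `demo_channel_num_per_npu` is a hypothetical stand-in, not the library function.

```c
/* Assumed NPU count; the real value depends on the chip. */
#define DEMO_NPU_NUM 64

/* Channel rows that must be reserved on each NPU. Channel k is assumed to
 * land on NPU (start_idx + k) % DEMO_NPU_NUM, in row
 * (start_idx + k) / DEMO_NPU_NUM. The slots before start_idx in row 0 stay
 * empty but still count, hence the ceiling of
 * (start_idx + num_channels) / DEMO_NPU_NUM. */
static int demo_channel_num_per_npu(int start_idx, int num_channels) {
    return (start_idx + num_channels + DEMO_NPU_NUM - 1) / DEMO_NPU_NUM;
}
```

With these assumptions, 64 channels starting at NPU 0 need one row per NPU, while the same 64 channels starting at NPU 1 spill into a second row.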

tpu_aligned_feature_size

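Computes the number of storage units that must be allocated for a 2-D input array when memory is allocated in 64-byte storage units.

int tpu_aligned_feature_size(int h, int w, data_type_t dtype)

Parameters:

h – height of the 2-D array

w – width of the 2-D array

dtype – data type of the elements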
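
The rounding involved can be sketched as follows. This is one reading of the description, not the library code: it takes the element size in bytes directly instead of a `data_type_t`, and the `demo_` names are hypothetical. Whether the real function reports its result in units, elements, or bytes is not confirmed here.

```c
#define DEMO_UNIT_BYTES 64  /* storage unit size stated in the text */

/* Number of 64-byte units needed for an h x w array whose elements are
 * elem_bytes wide (ceiling division of the total byte size). */
static int demo_feature_units(int h, int w, int elem_bytes) {
    int bytes = h * w * elem_bytes;
    return (bytes + DEMO_UNIT_BYTES - 1) / DEMO_UNIT_BYTES;
}

/* The same footprint expressed in bytes after padding to whole units. */
static int demo_feature_padded_bytes(int h, int w, int elem_bytes) {
    return demo_feature_units(h, w, elem_bytes) * DEMO_UNIT_BYTES;
}
```

For example, a 4 x 5 array of 4-byte elements holds 80 bytes of data, which rounds up to two units (128 bytes) in this model.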

tpu_aligned_stride

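Computes the stride of the input tensor for the 64-byte aligned storage layout.

void tpu_aligned_stride(dim4 *stride, int start_idx, const dim4 *shape, data_type_t dtype)

Parameters:

stride – pointer to the stride, which receives the result

start_idx – index of the starting NPU

shape – pointer to the shape

dtype – data type of the elements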
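
To make the role of each parameter concrete, the sketch below models one possible aligned layout. Everything in it is an assumption made for illustration: the `demo_dim4` field names, the NPU count of 64, strides counted in elements, and the rule that each channel starts on a 64-byte boundary while one batch spans as many channel rows as a single NPU holds. It is not the library implementation.

```c
#define DEMO_NPU_NUM 64      /* assumed NPU count */
#define DEMO_ALIGN_BYTES 64  /* alignment stated in the text */

/* Local stand-in for the library's dim4; field names are assumed. */
typedef struct { int n, c, h, w; } demo_dim4;

/* Round x up to the next multiple of a (a > 0). */
static int demo_align_up(int x, int a) { return (x + a - 1) / a * a; }

/* Strides in elements for the assumed layout: W and H are contiguous,
 * each channel is padded to a 64-byte boundary, and the batch stride
 * covers every channel row that one NPU holds. */
static void demo_aligned_stride(demo_dim4 *stride, int start_idx,
                                const demo_dim4 *shape, int elem_bytes) {
    int elems_per_unit = DEMO_ALIGN_BYTES / elem_bytes;
    int rows = (start_idx + shape->c + DEMO_NPU_NUM - 1) / DEMO_NPU_NUM;
    stride->w = 1;
    stride->h = shape->w;
    stride->c = demo_align_up(shape->h * shape->w, elems_per_unit);
    stride->n = rows * stride->c;
}
```

In this model a 3 x 5 plane of 4-byte elements (15 elements) is padded to 16 elements per channel, and 70 channels starting at NPU 0 occupy two rows, so the batch stride is 32 elements.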

tpu_compact_stride

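Computes the stride of the input tensor for the compact storage layout.

void tpu_compact_stride(dim4 *stride, int start_idx, const dim4 *shape)

Parameters:

stride – pointer to the stride, which receives the result

start_idx – index of the starting NPU

shape – pointer to the shape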

tpu_line_aligned_stride

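Computes the stride of the input tensor for the storage layout in which each row is aligned to 64 bytes.

void tpu_line_aligned_stride(dim4 *stride, int start_idx, const dim4 *shape, data_type_t dtype)

Parameters:

stride – pointer to the stride, which receives the result

start_idx – index of the starting NPU

shape – pointer to the shape

dtype – data type of the elements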

tpu_continuous_stride

Computes the stride of the input tensor for the contiguous storage layout.

void tpu_continuous_stride(dim4 *stride, const dim4 *shape)

Parameters:

stride – pointer to the stride, which receives the result

shape – pointer to the shape
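
For reference, contiguous strides for a 4-D tensor are conventionally the row-major ones. The sketch below shows that convention in a self-contained form. It assumes an (n, c, h, w) dimension order and strides counted in elements; `demo_dim4`, `demo_continuous_stride`, and `demo_offset` are hypothetical names, not the library's.

```c
/* Local stand-in for the library's dim4; field names are assumed. */
typedef struct { int n, c, h, w; } demo_dim4;

/* Row-major strides in elements: w varies fastest, n slowest. */
static void demo_continuous_stride(demo_dim4 *stride, const demo_dim4 *shape) {
    stride->w = 1;
    stride->h = shape->w;
    stride->c = shape->h * shape->w;
    stride->n = shape->c * shape->h * shape->w;
}

/* Linear element offset of coordinate (n, c, h, w) under the given strides. */
static int demo_offset(const demo_dim4 *stride, int n, int c, int h, int w) {
    return n * stride->n + c * stride->c + h * stride->h + w * stride->w;
}
```

For a shape of (2, 3, 4, 5) this yields strides (60, 20, 5, 1), and the last element, at coordinate (1, 2, 3, 4), sits at offset 119.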