Quantization Operations

tpu_bdc_int_requant

Requantizes the elements of a tensor; the result is saturated.

void tpu_bdc_int_requant(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape, int multiplier, char shift, scalar_t offset, data_type_t dst_dtype, data_type_t src_dtype, rounding_mode_t rounding_mode)

\[\begin{split}\mathsf{dst(n, c, h, w)} = {\begin{cases} \mathsf{(src(n, c, h, w)\times multiplier~~\textbf{left-shift}~~shift) + offset}&\mathsf{\text{if}~shift > 0}\\ \mathsf{(src(n, c, h, w)\times multiplier~~\textbf{right-shift}~~-shift) + offset}&{\text{otherwise}}\end{cases}}\end{split}\]

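Parameters

dst_addr – address of dst

src_addr – address of src

shape – pointer to the shape shared by dst and src

multiplier – constant multiplier

shift – shift amount; a positive value shifts left, any other value shifts right by -shift

offset – constant offset

dst_dtype – data type of the elements of dst

src_dtype – data type of the elements of src

rounding_mode – rounding mode applied when shifting right

Notes

dst and src start from the same NPU, and both use the 64-byte aligned layout.

shape->n, shape->c, shape->h, and shape->w must each be in the range [1, 65535].

Valid values of dst_dtype are DT_INT16, DT_UINT16, DT_INT8, and DT_UINT8; valid values of src_dtype are DT_INT32, DT_INT16, and DT_UINT16. If dst_dtype is signed, the data type of offset is DT_INT16; otherwise it is DT_UINT16.

shift must be in the range [-64, 31].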
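The formula above can be checked on the host with a scalar model of a single element. The following is a minimal sketch, not the hardware implementation: the names requant_ref, shift_right_round, and shift_left_clamped are hypothetical, only round-half-away-from-zero is modeled for the right shift, and the intermediate product is assumed to be exact in 64 bits with saturation applied once at the end.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: right shift by r bits, rounding half away from zero.
 * Assumes |v| <= 2^62, which holds for a product of two 32-bit integers. */
static int64_t shift_right_round(int64_t v, int r) {
    if (r == 0) return v;
    if (r >= 64) return 0;                       /* |v| / 2^64 is below 1/2 */
    uint64_t mag = v < 0 ? (uint64_t)(-v) : (uint64_t)v;
    uint64_t q = (mag + ((uint64_t)1 << (r - 1))) >> r;
    return v < 0 ? -(int64_t)q : (int64_t)q;
}

/* Hypothetical helper: v * 2^s for s in [1, 31], clamped to +/-2^40 so the
 * multiplication cannot overflow. 2^40 lies far outside every supported dst
 * range, so the clamp does not change the saturated result. */
static int64_t shift_left_clamped(int64_t v, int s) {
    const int64_t bound = (int64_t)1 << 40;
    const int64_t limit = bound >> s;
    if (v > limit) return bound;
    if (v < -limit) return -bound;
    return v * ((int64_t)1 << s);
}

/* Assumed scalar model of one output element:
 * saturate(((src * multiplier) shifted by shift) + offset) to [dst_min, dst_max]. */
static int64_t requant_ref(int32_t src, int32_t multiplier, int shift,
                           int32_t offset, int64_t dst_min, int64_t dst_max) {
    assert(shift >= -64 && shift <= 31);         /* documented range of shift */
    int64_t prod = (int64_t)src * multiplier;
    int64_t scaled = shift > 0 ? shift_left_clamped(prod, shift)
                               : shift_right_round(prod, -shift);
    int64_t v = scaled + offset;
    if (v < dst_min) return dst_min;
    if (v > dst_max) return dst_max;
    return v;
}
```

For example, with a DT_INT8 destination, requant_ref(100, 3, -4, 5, -128, 127) computes 300 / 16 = 18.75, rounds to 19, and adds 5. Other rounding modes would need their own variant of shift_right_round.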

tpu_bdc_int_pc_requant

Requantizes the elements of a tensor per channel; the result is saturated.

void tpu_bdc_int_pc_requant(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t quant_addr, const dim4 *shape, data_type_t dst_dtype, data_type_t src_dtype, rounding_mode_t rounding_mode)

\[\begin{split}\mathsf{dst(n, c, h, w)} = {\begin{cases} \mathsf{(src(n, c, h, w)\times quant(0, c, 0, 0)~~\textbf{left-shift}~~quant(0, c, 0, 1)) + quant(0, c, 0, 2)}&\mathsf{\text{if}~quant(0, c, 0, 1) > 0}\\ \mathsf{(src(n, c, h, w)\times quant(0, c, 0, 0)~~\textbf{right-shift}~~-quant(0, c, 0, 1)) + quant(0, c, 0, 2)}&{\text{otherwise}}\end{cases}}\end{split}\]

Parameters

dst_addr – address of dst

src_addr – address of src

quant_addr – address of quant

shape – pointer to the shape shared by dst and src

dst_dtype – data type of the elements of dst

src_dtype – data type of the elements of src

rounding_mode – rounding mode applied when shifting right

Notes

dst, src, and quant start from the same NPU, and all use the 64-byte aligned layout.

The shape of quant is [1, shape->c, 1, 3] and its element data type is DT_INT32. quant(0, c, 0, 0) is the multiplier; quant(0, c, 0, 1) is the shift amount, in the range [-64, 31]; quant(0, c, 0, 2) is the offset, in the range [-32768, 32767] if dst_dtype is signed and [0, 65535] otherwise.

shape->n, shape->c, shape->h, and shape->w must each be in the range [1, 65535].

Valid values of dst_dtype are DT_INT16, DT_UINT16, DT_INT8, and DT_UINT8; valid values of src_dtype are DT_INT32, DT_INT16, and DT_UINT16.
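The per-channel parameter layout described above can be illustrated with a host-side packing routine. This is a sketch under assumptions: pack_pc_requant is a hypothetical helper, and it fills a plain row-major buffer holding the logical [1, C, 1, 3] tensor. Moving that buffer into local memory in the 64-byte aligned layout is a separate step that is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: write {multiplier, shift, offset} for each channel into
 * quant[3 * c + 0..2], after checking the documented value ranges.
 * Returns false and leaves quant untouched if any value is out of range. */
static bool pack_pc_requant(int32_t *quant, int channels,
                            const int32_t *multiplier, const int32_t *shift,
                            const int32_t *offset, bool dst_signed) {
    assert(channels >= 1 && channels <= 65535);  /* documented range of shape->c */
    const int32_t off_min = dst_signed ? -32768 : 0;
    const int32_t off_max = dst_signed ? 32767 : 65535;

    /* Validate everything first so a failure never leaves a half-written table. */
    for (int c = 0; c < channels; ++c) {
        if (shift[c] < -64 || shift[c] > 31) return false;
        if (offset[c] < off_min || offset[c] > off_max) return false;
    }
    for (int c = 0; c < channels; ++c) {
        quant[3 * c + 0] = multiplier[c];        /* quant(0, c, 0, 0) */
        quant[3 * c + 1] = shift[c];             /* quant(0, c, 0, 1) */
        quant[3 * c + 2] = offset[c];            /* quant(0, c, 0, 2) */
    }
    return true;
}
```

Rejecting out-of-range values on the host is a design choice of this sketch; the behavior of the hardware for such values is not stated in this section.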

tpu_bdc_fp32_requant

Requantizes the elements of a tensor; the result is saturated.

void tpu_bdc_fp32_requant(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape, float scale, float offset, data_type_t dst_dtype, data_type_t src_dtype, rounding_mode_t src_rounding_mode, rounding_mode_t dst_rounding_mode)

\[\mathsf{dst(n, c, h, w) = \textbf{ INT}(\textbf{FP32}(src(n, c, h, w)) \times scale + offset)}\]

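Parameters

dst_addr – address of dst

src_addr – address of src

shape – pointer to the shape shared by dst and src

scale – constant multiplier

offset – constant offset

dst_dtype – data type of the elements of dst

src_dtype – data type of the elements of src

dst_rounding_mode – rounding mode for converting the floating-point result to the elements of dst

src_rounding_mode – rounding mode for converting the elements of src to floating point

Notes

dst and src start from the same NPU, and both use the 64-byte aligned layout.

shape->n, shape->c, shape->h, and shape->w must each be in the range [1, 65535].

Valid values of dst_dtype are DT_INT16, DT_UINT16, DT_INT8, and DT_UINT8; valid values of src_dtype are DT_INT32, DT_INT16, and DT_UINT16.

Valid values of dst_rounding_mode are RM_HALF_TO_EVEN, RM_HALF_AWAY_FROM_ZERO, RM_TOWARDS_ZERO, RM_DOWN, and RM_UP.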
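To see how the five dst_rounding_mode values differ on tie cases, a scalar host-side model can help. The sketch below is an assumption-laden illustration, not the library's code: ref_round_t, round_ref, and fp32_requant_ref are hypothetical names, the enum is a local stand-in for the RM_* constants, src_rounding_mode is not modeled (the plain C cast is used for the integer-to-float conversion), and the input is assumed not to be NaN.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the rounding modes listed above; not the library's enum. */
typedef enum {
    REF_HALF_TO_EVEN,
    REF_HALF_AWAY_FROM_ZERO,
    REF_TOWARDS_ZERO,
    REF_DOWN,
    REF_UP
} ref_round_t;

/* Hypothetical helper: round x to an integer under the given mode. */
static int32_t round_ref(float x, ref_round_t mode) {
    /* Clamp first so the cast is always defined; values this large
     * saturate in every supported dst type anyway. */
    if (x > 1.0e6f) x = 1.0e6f;
    if (x < -1.0e6f) x = -1.0e6f;
    int32_t t = (int32_t)x;                      /* truncates toward zero */
    float frac = x - (float)t;                   /* same sign as x, |frac| < 1 */
    switch (mode) {
    case REF_TOWARDS_ZERO:
        return t;
    case REF_DOWN:
        return frac < 0.0f ? t - 1 : t;
    case REF_UP:
        return frac > 0.0f ? t + 1 : t;
    case REF_HALF_AWAY_FROM_ZERO:
        if (frac >= 0.5f) return t + 1;
        if (frac <= -0.5f) return t - 1;
        return t;
    case REF_HALF_TO_EVEN:
    default:
        if (frac > 0.5f || (frac == 0.5f && t % 2 != 0)) return t + 1;
        if (frac < -0.5f || (frac == -0.5f && t % 2 != 0)) return t - 1;
        return t;
    }
}

/* Assumed scalar model: saturate(INT(FP32(src) * scale + offset)). */
static int32_t fp32_requant_ref(int32_t src, float scale, float offset,
                                ref_round_t mode,
                                int32_t dst_min, int32_t dst_max) {
    float v = (float)src * scale + offset;
    int32_t r = round_ref(v, mode);
    if (r < dst_min) return dst_min;
    if (r > dst_max) return dst_max;
    return r;
}
```

With src = 5 and scale = 0.5 the intermediate value is exactly 2.5, so REF_HALF_TO_EVEN gives 2 while REF_HALF_AWAY_FROM_ZERO and REF_UP give 3.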

tpu_bdc_fp32_pc_requant

Requantizes the elements of a tensor per channel; the result is saturated.

void tpu_bdc_fp32_pc_requant(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t quant_addr, const dim4 *shape, data_type_t dst_dtype, data_type_t src_dtype, rounding_mode_t dst_rounding_mode, rounding_mode_t src_rounding_mode)

\[\mathsf{dst(n, c, h, w) = \textbf{ INT}(\textbf{FP32}(src(n, c, h, w)) \times quant(0, c, 0, 0) + quant(0, c, 0, 1))}\]

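Parameters

dst_addr – address of dst

src_addr – address of src

quant_addr – address of quant

shape – pointer to the shape shared by dst and src

dst_dtype – data type of the elements of dst

src_dtype – data type of the elements of src

dst_rounding_mode – rounding mode for converting the floating-point result to the elements of dst

src_rounding_mode – rounding mode for converting the elements of src to floating point

Notes

dst, src, and quant start from the same NPU, and all use the 64-byte aligned layout.

The shape of quant is [1, shape->c, 1, 2] and its element data type is DT_FP32. quant(0, c, 0, 0) is the multiplier and quant(0, c, 0, 1) is the offset.

shape->n, shape->c, shape->h, and shape->w must each be in the range [1, 65535].

Valid values of dst_dtype are DT_INT16, DT_UINT16, DT_INT8, and DT_UINT8; valid values of src_dtype are DT_INT32, DT_INT16, and DT_UINT16.

Valid values of dst_rounding_mode are RM_HALF_TO_EVEN, RM_HALF_AWAY_FROM_ZERO, RM_TOWARDS_ZERO, RM_DOWN, and RM_UP.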

tpu_bdc_int_dequant

Dequantizes the elements of a tensor; the result is saturated.

void tpu_bdc_int_dequant(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape, scalar_t offset, int multiplier, char shift, data_type_t dst_dtype, data_type_t src_dtype, rounding_mode_t rounding_mode)

\[\begin{split}\mathsf{dst(n, c, h, w)} = {\begin{cases} \mathsf{(src(n, c, h, w) - offset)\times multiplier~~\textbf{left-shift}~~shift}&\mathsf{\text{if}~shift > 0}\\ \mathsf{(src(n, c, h, w) - offset)\times multiplier~~\textbf{right-shift}~~-shift}&{\text{otherwise}}\end{cases}}\end{split}\]

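Parameters

dst_addr – address of dst

src_addr – address of src

shape – pointer to the shape shared by dst and src

offset – constant offset

multiplier – constant multiplier

shift – shift amount; a positive value shifts left, any other value shifts right by -shift

dst_dtype – data type of the elements of dst

src_dtype – data type of the elements of src

rounding_mode – rounding mode applied when shifting right

Notes

dst and src start from the same NPU, and both use the 64-byte aligned layout.

shape->n, shape->c, shape->h, and shape->w must each be in the range [1, 65535].

Valid values of dst_dtype are DT_INT32, DT_INT16, and DT_UINT16; valid values of src_dtype are DT_INT16, DT_UINT16, DT_INT8, and DT_UINT8. If src_dtype is signed, the data type of offset is DT_INT16; otherwise it is DT_UINT16.

shift must be in the range [-64, 31].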
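The dequantization formula subtracts the offset before multiplying, which is the main difference from requantization. A scalar host-side model makes the order of operations concrete. This is a sketch under assumptions: dequant_ref is a hypothetical name, only round-half-away-from-zero is modeled for the right shift, and the intermediate value is treated as exact with a single saturation at the end.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed scalar model of one output element:
 * saturate(((src - offset) * multiplier) shifted by shift) to [dst_min, dst_max].
 * src and offset are expected to fit the 8-bit or 16-bit types listed above. */
static int64_t dequant_ref(int32_t src, int32_t offset, int32_t multiplier,
                           int shift, int64_t dst_min, int64_t dst_max) {
    assert(shift >= -64 && shift <= 31);         /* documented range of shift */
    assert(src >= -32768 && src <= 65535);
    assert(offset >= -32768 && offset <= 65535);

    /* |src - offset| < 2^17 and |multiplier| <= 2^31, so |prod| < 2^49. */
    int64_t prod = ((int64_t)src - offset) * multiplier;
    int64_t v;
    if (shift > 0) {
        /* Clamp to +/-2^40 before scaling so the multiplication cannot
         * overflow; 2^40 is outside every supported dst range. */
        const int64_t bound = (int64_t)1 << 40;
        const int64_t limit = bound >> shift;
        if (prod > limit) v = bound;
        else if (prod < -limit) v = -bound;
        else v = prod * ((int64_t)1 << shift);
    } else {
        int r = -shift;
        if (r == 0) {
            v = prod;
        } else if (r > 62) {
            v = 0;                               /* |prod| < 2^49 rounds to 0 */
        } else {
            /* Round half away from zero on the magnitude. */
            int64_t mag = prod < 0 ? -prod : prod;
            int64_t q = (mag + ((int64_t)1 << (r - 1))) >> r;
            v = prod < 0 ? -q : q;
        }
    }
    if (v < dst_min) return dst_min;
    if (v > dst_max) return dst_max;
    return v;
}
```

For example, dequant_ref(10, -2, 3, -1, INT32_MIN, INT32_MAX) computes (10 - (-2)) * 3 = 36 and shifts right by one bit.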

tpu_bdc_int_pc_dequant

Dequantizes the elements of a tensor per channel; the result is saturated.

void tpu_bdc_int_pc_dequant(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t quant_addr, const dim4 *shape, data_type_t dst_dtype, data_type_t src_dtype, rounding_mode_t rounding_mode)

\[\begin{split}\mathsf{dst(n, c, h, w)} = {\begin{cases} \mathsf{(src(n, c, h, w) - quant(0, c, 0, 0))\times quant(0, c, 0, 1)~~\textbf{left-shift}~~quant(0, c, 0, 2)}&\mathsf{\text{if}~quant(0, c, 0, 2) > 0}\\ \mathsf{(src(n, c, h, w) - quant(0, c, 0, 0))\times quant(0, c, 0, 1)~~\textbf{right-shift}~~-quant(0, c, 0, 2)}&{\text{otherwise}}\end{cases}}\end{split}\]

Parameters

dst_addr – address of dst

src_addr – address of src

quant_addr – address of quant

shape – pointer to the shape shared by dst and src

dst_dtype – data type of the elements of dst

src_dtype – data type of the elements of src

rounding_mode – rounding mode applied when shifting right

Notes

dst, src, and quant start from the same NPU, and all use the 64-byte aligned layout.

The shape of quant is [1, shape->c, 1, 3] and its element data type is DT_INT32. quant(0, c, 0, 0) is the offset, in the range [-32768, 32767] if src_dtype is signed and [0, 65535] otherwise; quant(0, c, 0, 1) is the multiplier; quant(0, c, 0, 2) is the shift amount, in the range [-64, 31].

shape->n, shape->c, shape->h, and shape->w must each be in the range [1, 65535].

Valid values of dst_dtype are DT_INT32, DT_INT16, and DT_UINT16; valid values of src_dtype are DT_INT16, DT_UINT16, DT_INT8, and DT_UINT8.

tpu_bdc_fp32_dequant

Dequantizes the elements of a tensor.

void tpu_bdc_fp32_dequant(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape, scalar_t offset, float scale, data_type_t src_dtype, rounding_mode_t rounding_mode)

\[\mathsf{dst(n, c, h, w) = \textbf{FP32}(src(n, c, h, w) - offset) \times scale}\]

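Parameters

dst_addr – address of dst

src_addr – address of src

shape – pointer to the shape shared by dst and src

offset – constant offset

scale – constant multiplier

src_dtype – data type of the elements of src and of offset

rounding_mode – rounding mode for converting fixed-point values to floating point

Notes

dst and src start from the same NPU, and both use the 64-byte aligned layout.

shape->n, shape->c, shape->h, and shape->w must each be in the range [1, 65535].

Valid values of src_dtype are DT_INT16, DT_UINT16, DT_INT8, and DT_UINT8.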

tpu_bdc_fp32_pc_dequant

Dequantizes the elements of a tensor per channel.

void tpu_bdc_fp32_pc_dequant(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t quant_addr, const dim4 *shape, data_type_t src_dtype, rounding_mode_t rounding_mode)

\[\mathsf{dst(n, c, h, w) = \textbf{FP32}(src(n, c, h, w) - quant(0, c, 0, 0)) \times quant(0, c, 0, 1)}\]

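Parameters

dst_addr – address of dst

src_addr – address of src

quant_addr – address of quant

shape – pointer to the shape shared by dst and src

src_dtype – data type of the elements of src

rounding_mode – rounding mode for converting fixed-point values to floating point

Notes

dst, src, and quant start from the same NPU, and all use the 64-byte aligned layout.

The shape of quant is [1, shape->c, 1, 2], with mixed element data types DT_INT32 and DT_FP32: quant(0, c, 0, 0) is the offset, of type DT_INT32, with the same value range as src_dtype; quant(0, c, 0, 1) is the multiplier, of type DT_FP32.

shape->n, shape->c, shape->h, and shape->w must each be in the range [1, 65535].

Valid values of src_dtype are DT_INT16, DT_UINT16, DT_INT8, and DT_UINT8.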
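Because quant mixes a DT_INT32 offset and a DT_FP32 multiplier in one tensor, a host program has to store both as raw 32-bit words. The sketch below shows one way to do that with memcpy, plus a scalar model of the formula. The names pack_pc_dequant and pc_dequant_ref are hypothetical, the buffer is a plain row-major copy of the logical [1, C, 1, 2] tensor, and the transfer into local memory is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: store {offset, scale} for each channel as raw 32-bit
 * words, with quant[2 * c] the integer offset and quant[2 * c + 1] the float scale. */
static void pack_pc_dequant(uint32_t *quant, int channels,
                            const int32_t *offset, const float *scale) {
    assert(channels >= 1 && channels <= 65535);  /* documented range of shape->c */
    for (int c = 0; c < channels; ++c) {
        /* memcpy copies the bit pattern without violating aliasing rules. */
        memcpy(&quant[2 * c + 0], &offset[c], sizeof(uint32_t));
        memcpy(&quant[2 * c + 1], &scale[c], sizeof(uint32_t));
    }
}

/* Assumed scalar model: FP32(src - quant(0, c, 0, 0)) * quant(0, c, 0, 1).
 * For 8-bit and 16-bit src and offset the difference is below 2^24 in
 * magnitude, so the conversion to float is exact in this model. */
static float pc_dequant_ref(const uint32_t *quant, int c, int32_t src) {
    int32_t offset;
    float scale;
    memcpy(&offset, &quant[2 * c + 0], sizeof offset);
    memcpy(&scale, &quant[2 * c + 1], sizeof scale);
    return (float)(src - offset) * scale;
}
```

As a usage example, packing offsets {3, -2} with scales {0.5, 0.25} and evaluating channel 0 at src = 7 gives (7 - 3) * 0.5 = 2.0.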