FP32 Binary Functions

okk_bdc_add

void okk_bdc_add(local_addr_t dst_addr, local_addr_t src0_addr, local_addr_t src1_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src0_stride, const dim4 *src1_stride)

Perform addition of the elements of the source_0 and source_1 tensors for fp32 data type.

\[dst(n, c, h, w) = src\_0(n, c, h, w) + src\_1(n, c, h, w)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src0_addr – Address of the source_0 tensor.

  • src1_addr – Address of the source_1 tensor.

  • shape – Pointer to the shape of the destination, source_0 and source_1 tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src0_stride – Pointer to the stride of the source_0 tensor.

  • src1_stride – Pointer to the stride of the source_1 tensor.
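The formula and the stride parameters above can be modelled on the host with a plain loop. The following is a minimal reference sketch, not the device implementation: `dim4_ref`, `ref_offset` and `ref_add` are hypothetical names, strides are assumed to count fp32 elements in flat host memory, and the NPU-local memory layout is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side stand-in for dim4 (fields n, c, h, w). */
typedef struct { int n, c, h, w; } dim4_ref;

/* Element offset of (n, c, h, w) under a stride (assumed to be in elements). */
size_t ref_offset(const dim4_ref *stride, int n, int c, int h, int w) {
    return (size_t)n * (size_t)stride->n + (size_t)c * (size_t)stride->c +
           (size_t)h * (size_t)stride->h + (size_t)w * (size_t)stride->w;
}

/* Reference model: dst(n,c,h,w) = src0(n,c,h,w) + src1(n,c,h,w). */
void ref_add(float *dst, const float *src0, const float *src1,
             const dim4_ref *shape, const dim4_ref *dst_stride,
             const dim4_ref *src0_stride, const dim4_ref *src1_stride) {
    assert(dst && src0 && src1 && shape);
    assert(dst_stride && src0_stride && src1_stride);
    for (int n = 0; n < shape->n; ++n)
        for (int c = 0; c < shape->c; ++c)
            for (int h = 0; h < shape->h; ++h)
                for (int w = 0; w < shape->w; ++w)
                    dst[ref_offset(dst_stride, n, c, h, w)] =
                        src0[ref_offset(src0_stride, n, c, h, w)] +
                        src1[ref_offset(src1_stride, n, c, h, w)];
}
```

In this model, a shape of {1, 1, 2, 2} with contiguous strides {4, 4, 2, 1} visits four elements, and each output is the sum of the two inputs at the same index.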

Remarks

okk_bdc_add_C

void okk_bdc_add_C(local_addr_t dst_addr, local_addr_t src_addr, float C, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Perform addition of the elements of the source tensor and a constant value for fp32 data type.

\[dst(n, c, h, w) = src(n, c, h, w) + C\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to add.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

okk_bdc_sub

void okk_bdc_sub(local_addr_t dst_addr, local_addr_t src0_addr, local_addr_t src1_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src0_stride, const dim4 *src1_stride)

Perform subtraction of the elements of the source_0 and source_1 tensors for fp32 data type.

\[dst(n, c, h, w) = src\_0(n, c, h, w) - src\_1(n, c, h, w)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src0_addr – Address of the source_0 tensor.

  • src1_addr – Address of the source_1 tensor.

  • shape – Pointer to the shape of the destination, source_0 and source_1 tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src0_stride – Pointer to the stride of the source_0 tensor.

  • src1_stride – Pointer to the stride of the source_1 tensor.

Remarks

okk_bdc_sub_C

void okk_bdc_sub_C(local_addr_t dst_addr, local_addr_t src_addr, float C, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Perform subtraction of a constant value from the elements of the source tensor for fp32 data type.

\[dst(n, c, h, w) = src(n, c, h, w) - C\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to subtract.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

okk_bdc_C_sub

void okk_bdc_C_sub(local_addr_t dst_addr, local_addr_t src_addr, float C, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Perform subtraction of the elements of the source tensor from a constant value for fp32 data type.

\[dst(n, c, h, w) = C - src(n, c, h, w)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to subtract from.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.
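okk_bdc_sub_C and okk_bdc_C_sub take the same arguments and differ only in operand order. A minimal host-side sketch of the two formulas on a flat array makes the difference concrete; `ref_sub_C` and `ref_C_sub` are hypothetical names, and shape, stride and local-memory handling are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Models the okk_bdc_sub_C formula: dst[i] = src[i] - C. */
void ref_sub_C(float *dst, const float *src, float C, size_t count) {
    assert(dst && src);
    for (size_t i = 0; i < count; ++i)
        dst[i] = src[i] - C;
}

/* Models the okk_bdc_C_sub formula: dst[i] = C - src[i]. */
void ref_C_sub(float *dst, const float *src, float C, size_t count) {
    assert(dst && src);
    for (size_t i = 0; i < count; ++i)
        dst[i] = C - src[i];
}
```

For the same input and the same C, the two results are negatives of each other.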

Remarks

okk_bdc_mul

void okk_bdc_mul(local_addr_t dst_addr, local_addr_t src0_addr, local_addr_t src1_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src0_stride, const dim4 *src1_stride)

Perform multiplication of the elements of the source_0 and source_1 tensors for fp32 data type.

\[dst(n, c, h, w) = src\_0(n, c, h, w) \times src\_1(n, c, h, w)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src0_addr – Address of the source_0 tensor.

  • src1_addr – Address of the source_1 tensor.

  • shape – Pointer to the shape of the destination, source_0 and source_1 tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src0_stride – Pointer to the stride of the source_0 tensor.

  • src1_stride – Pointer to the stride of the source_1 tensor.

Remarks

okk_bdc_mul_C

void okk_bdc_mul_C(local_addr_t dst_addr, local_addr_t src_addr, float C, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Perform multiplication of the elements of the source tensor and a constant value for fp32 data type.

\[dst(n, c, h, w) = src(n, c, h, w) \times C\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to multiply by.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

okk_bdc_div

void okk_bdc_div(local_addr_t dst_addr, local_addr_t src0_addr, local_addr_t src1_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src0_stride, const dim4 *src1_stride)

Perform division of the elements of the source_0 tensor by the elements of the source_1 tensor for fp32 data type.

\[dst(n, c, h, w) = \frac{src\_0(n, c, h, w)}{src\_1(n, c, h, w)}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src0_addr – Address of the source_0 tensor.

  • src1_addr – Address of the source_1 tensor.

  • shape – Pointer to the shape of the destination, source_0 and source_1 tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src0_stride – Pointer to the stride of the source_0 tensor.

  • src1_stride – Pointer to the stride of the source_1 tensor.

Remarks

okk_bdc_div_C

void okk_bdc_div_C(local_addr_t dst_addr, local_addr_t src_addr, float C, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Perform division of the elements of the source tensor by a constant value for fp32 data type.

\[dst(n, c, h, w) = \frac{src(n, c, h, w)}{C}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to divide by.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

okk_bdc_C_div

void okk_bdc_C_div(local_addr_t dst_addr, local_addr_t src_addr, float C, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Perform division of a constant value by the elements of the source tensor for fp32 data type.

\[dst(n, c, h, w) = \frac{C}{src(n, c, h, w)}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to be divided.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

okk_bdc_mac

void okk_bdc_mac(local_addr_t dst_addr, local_addr_t src0_addr, local_addr_t src1_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src0_stride, const dim4 *src1_stride)

Perform multiply-accumulate of the elements of the source_0 and source_1 tensors into the destination tensor for fp32 data type.

\[dst(n, c, h, w) = dst(n, c, h, w) + src\_0(n, c, h, w) \times src\_1(n, c, h, w)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src0_addr – Address of the source_0 tensor.

  • src1_addr – Address of the source_1 tensor.

  • shape – Pointer to the shape of the destination, source_0 and source_1 tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src0_stride – Pointer to the stride of the source_0 tensor.

  • src1_stride – Pointer to the stride of the source_1 tensor.
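Because the destination appears on both sides of the formula, its previous contents are part of the result. The sketch below models that accumulation on a flat host array; `ref_mac` is a hypothetical name, and strides and the device memory layout are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Reference model: dst[i] = dst[i] + src0[i] * src1[i]. */
void ref_mac(float *dst, const float *src0, const float *src1, size_t count) {
    assert(dst && src0 && src1);
    for (size_t i = 0; i < count; ++i)
        dst[i] = dst[i] + src0[i] * src1[i];
}
```

Calling the function repeatedly keeps adding products to the same destination, so in this model the destination has to hold a meaningful starting value (for example zero) before the first call.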

Remarks

okk_bdc_mac_C

void okk_bdc_mac_C(local_addr_t dst_addr, local_addr_t src_addr, float C, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Perform multiply-accumulate of the elements of the source tensor and a constant value into the destination tensor for fp32 data type.

\[dst(n, c, h, w) = dst(n, c, h, w) + src(n, c, h, w) \times C\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to multiply by.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

okk_bdc_max

void okk_bdc_max(local_addr_t dst_addr, local_addr_t src0_addr, local_addr_t src1_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src0_stride, const dim4 *src1_stride)

Compute the element-wise maximum of the source_0 and source_1 tensors for fp32 data type.

\[dst(n, c, h, w) = \max(src\_0(n, c, h, w), src\_1(n, c, h, w))\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src0_addr – Address of the source_0 tensor.

  • src1_addr – Address of the source_1 tensor.

  • shape – Pointer to the shape of the destination, source_0 and source_1 tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src0_stride – Pointer to the stride of the source_0 tensor.

  • src1_stride – Pointer to the stride of the source_1 tensor.

Remarks

okk_bdc_max_C

void okk_bdc_max_C(local_addr_t dst_addr, local_addr_t src_addr, float C, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Compute the element-wise maximum of the source tensor and a constant value for fp32 data type.

\[dst(n, c, h, w) = \max(src(n, c, h, w), C)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to compare with.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.
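One common use of a maximum against a constant is clamping from below: with C = 0 the formula above is the ReLU function. The following host-side sketch illustrates this on a flat array; `ref_max_C` is a hypothetical name, strides are not modelled, and the handling of NaN inputs on the device is not specified here.

```c
#include <assert.h>
#include <stddef.h>

/* Reference model: dst[i] = max(src[i], C). */
void ref_max_C(float *dst, const float *src, float C, size_t count) {
    assert(dst && src);
    for (size_t i = 0; i < count; ++i)
        dst[i] = (src[i] > C) ? src[i] : C;
}
```

With C = 0.0f, negative inputs become zero and non-negative inputs pass through unchanged.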

Remarks

okk_bdc_min

void okk_bdc_min(local_addr_t dst_addr, local_addr_t src0_addr, local_addr_t src1_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src0_stride, const dim4 *src1_stride)

Compute the element-wise minimum of the source_0 and source_1 tensors for fp32 data type.

\[dst(n, c, h, w) = \min(src\_0(n, c, h, w), src\_1(n, c, h, w))\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src0_addr – Address of the source_0 tensor.

  • src1_addr – Address of the source_1 tensor.

  • shape – Pointer to the shape of the destination, source_0 and source_1 tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src0_stride – Pointer to the stride of the source_0 tensor.

  • src1_stride – Pointer to the stride of the source_1 tensor.

Remarks

okk_bdc_min_C

void okk_bdc_min_C(local_addr_t dst_addr, local_addr_t src_addr, float C, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Compute the element-wise minimum of the source tensor and a constant value for fp32 data type.

\[dst(n, c, h, w) = \min(src(n, c, h, w), C)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to compare with.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

okk_bdc_greater_select_value

void okk_bdc_greater_select_value(local_addr_t dst_addr, local_addr_t src0_addr, local_addr_t src1_addr, x32 select_val, const dim4 *shape, const dim4 *dst_stride, const dim4 *src0_stride, const dim4 *src1_stride)

Perform a greater-than comparison of the elements of the source_0 and source_1 tensors for fp32 data type, then select a constant value or zero as the result.

\[\begin{split}dst(n, c, h, w) = {\begin{cases}select\_val&{\text{if }}src\_0(n, c, h, w)>src\_1(n, c, h, w),\\0&{\text{otherwise}}.\end{cases}}\end{split}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src0_addr – Address of the source_0 tensor.

  • src1_addr – Address of the source_1 tensor.

  • select_val – Constant value to select.

  • shape – Pointer to the shape of the destination, source_0 and source_1 tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src0_stride – Pointer to the stride of the source_0 tensor.

  • src1_stride – Pointer to the stride of the source_1 tensor.
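The compare-and-select formula can be modelled on the host as follows. This is a sketch under stated assumptions: `x32_ref` is a hypothetical stand-in for `x32`, taken here to be a 32-bit slot that can be read as a float or as an integer, and `ref_greater_select` is a hypothetical name. Strides and the device memory layout are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for x32: one 32-bit slot viewed as float or integer. */
typedef union {
    float fp32;
    int32_t int32;
    uint32_t uint32;
} x32_ref;

/* Reference model: dst[i] = select_val if src0[i] > src1[i], else 0. */
void ref_greater_select(x32_ref *dst, const float *src0, const float *src1,
                        x32_ref select_val, size_t count) {
    assert(dst && src0 && src1);
    for (size_t i = 0; i < count; ++i) {
        if (src0[i] > src1[i])
            dst[i] = select_val;
        else
            dst[i].uint32 = 0u; /* all-zero bits: 0 as integer, +0.0f as float */
    }
}
```

The comparison is strict, so equal elements produce zero in this model.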

Remarks

okk_bdc_greater_C_select_value

void okk_bdc_greater_C_select_value(local_addr_t dst_addr, local_addr_t src_addr, float C, x32 select_val, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Perform a greater-than comparison of the elements of the source tensor against a constant value for fp32 data type, then select another constant value or zero as the result.

\[\begin{split}dst(n, c, h, w) = {\begin{cases}select\_val&{\text{if }}src(n, c, h, w)>C,\\0&{\text{otherwise}}.\end{cases}}\end{split}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to compare with.

  • select_val – Constant value to select.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

  • The data type of the source tensor is fp32; the data type of the destination tensor is a 32-bit type.

  • The destination and source tensors start at the same NPU.

  • dst_addr and src_addr must be divisible by 4, and preferably by 128.

  • shape->n, shape->h and shape->w are in [1, 65535]; shape->c is in [1, 4095].

  • If dst_stride or src_stride is NULL, the corresponding tensor is in the 128-Byte Aligned Layout.
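The address and shape constraints in these remarks can be checked on the host before issuing a call. The sketch below is illustrative only: `dim4_ref`, `ref_check_args` and `ref_is_preferred_alignment` are hypothetical names, `local_addr_t` is assumed to fit in a `uint32_t`, and the "same NPU" condition is not checked because it depends on the local-memory size per NPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side stand-in for dim4 (fields n, c, h, w). */
typedef struct { int n, c, h, w; } dim4_ref;

bool ref_in_range(int value, int lo, int hi) {
    return value >= lo && value <= hi;
}

/* True when both addresses are 4-byte aligned and the shape is in range. */
bool ref_check_args(uint32_t dst_addr, uint32_t src_addr, const dim4_ref *shape) {
    assert(shape != NULL);
    if (dst_addr % 4u != 0u || src_addr % 4u != 0u)
        return false;
    return ref_in_range(shape->n, 1, 65535) &&
           ref_in_range(shape->c, 1, 4095) &&
           ref_in_range(shape->h, 1, 65535) &&
           ref_in_range(shape->w, 1, 65535);
}

/* True when the address also meets the preferred 128-byte alignment. */
bool ref_is_preferred_alignment(uint32_t addr) {
    return addr % 128u == 0u;
}
```

Keeping the required check separate from the preferred one lets a caller reject invalid arguments outright while only reporting sub-optimal alignment.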

okk_bdc_C_greater_select_value

void okk_bdc_C_greater_select_value(local_addr_t dst_addr, local_addr_t src_addr, float C, x32 select_val, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Perform a greater-than comparison of a constant value against the elements of the source tensor for fp32 data type, then select another constant value or zero as the result.

\[\begin{split}dst(n, c, h, w) = {\begin{cases}select\_val&{\text{if }}C>src(n, c, h, w),\\0&{\text{otherwise}}.\end{cases}}\end{split}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to compare with.

  • select_val – Constant value to select.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

  • The data type of the source tensor is fp32; the data type of the destination tensor is a 32-bit type.

  • The destination and source tensors start at the same NPU.

  • dst_addr and src_addr must be divisible by 4, and preferably by 128.

  • shape->n, shape->h and shape->w are in [1, 65535]; shape->c is in [1, 4095].

  • If dst_stride or src_stride is NULL, the corresponding tensor is in the 128-Byte Aligned Layout.

okk_bdc_equal_select_value

void okk_bdc_equal_select_value(local_addr_t dst_addr, local_addr_t src0_addr, local_addr_t src1_addr, x32 select_val, const dim4 *shape, const dim4 *dst_stride, const dim4 *src0_stride, const dim4 *src1_stride)

Perform an equality comparison of the elements of the source_0 and source_1 tensors for fp32 data type, then select a constant value or zero as the result.

\[\begin{split}dst(n, c, h, w) = {\begin{cases}select\_val&{\text{if }}src\_0(n, c, h, w)=src\_1(n, c, h, w),\\0&{\text{otherwise}}.\end{cases}}\end{split}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src0_addr – Address of the source_0 tensor.

  • src1_addr – Address of the source_1 tensor.

  • select_val – Constant value to select.

  • shape – Pointer to the shape of the destination, source_0 and source_1 tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src0_stride – Pointer to the stride of the source_0 tensor.

  • src1_stride – Pointer to the stride of the source_1 tensor.

Remarks

okk_bdc_equal_C_select_value

void okk_bdc_equal_C_select_value(local_addr_t dst_addr, local_addr_t src_addr, float C, x32 select_val, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Perform an equality comparison of the elements of the source tensor against a constant value for fp32 data type, then select another constant value or zero as the result.

\[\begin{split}dst(n, c, h, w) = {\begin{cases}select\_val&{\text{if }}src(n, c, h, w)=C,\\0&{\text{otherwise}}.\end{cases}}\end{split}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • C – Constant value to compare with.

  • select_val – Constant value to select.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

  • The data type of the source tensor is fp32; the data type of the destination tensor is a 32-bit type.

  • The destination and source tensors start at the same NPU.

  • dst_addr and src_addr must be divisible by 4, and preferably by 128.

  • shape->n, shape->h and shape->w are in [1, 65535]; shape->c is in [1, 4095].

  • If dst_stride or src_stride is NULL, the corresponding tensor is in the 128-Byte Aligned Layout.