FP32 Unary Functions

okk_bdc_rsqrt

void okk_bdc_rsqrt(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape)

Calculate the reciprocal of the square root of each element of the source tensor.

\[dst(n, c, h, w) = \frac{1}{\sqrt{src(n, c, h, w)}}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • shape – Pointer to the shape of the destination and source tensors.

Remarks

  • The destination and source tensors are in the 128-Byte Aligned Layout.

  • The data type of the destination and source tensors is fp32.

  • The destination and source tensors start at the same NPU.

  • dst_addr and src_addr are divisible by 128.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].
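
The address and shape constraints above can be checked on the host before a call is issued. The following is a minimal sketch: `unary_args_ok` is a hypothetical helper, and the `local_addr_t` and `dim4` definitions are stand-ins for the real types, which come from the SDK headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the SDK types (assumption); the real definitions come from the SDK headers. */
typedef uint32_t local_addr_t;
typedef struct { int n, c, h, w; } dim4;

/* Hypothetical helper: returns true if the arguments satisfy the documented remarks. */
static bool unary_args_ok(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape)
{
    /* dst_addr and src_addr must be divisible by 128. */
    if (dst_addr % 128 != 0 || src_addr % 128 != 0)
        return false;
    /* n, h and w must be in [1, 65535]. */
    if (shape->n < 1 || shape->n > 65535) return false;
    if (shape->h < 1 || shape->h > 65535) return false;
    if (shape->w < 1 || shape->w > 65535) return false;
    /* c must be in [1, 4095]. */
    if (shape->c < 1 || shape->c > 4095) return false;
    return true;
}
```

The same check applies to every two-address function in this section that lists these remarks; it does not verify the layout or the starting NPU.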

okk_bdc_sqrt

void okk_bdc_sqrt(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape)

Calculate the square root of each element of the source tensor.

\[dst(n, c, h, w) = \sqrt{src(n, c, h, w)}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • shape – Pointer to the shape of the destination and source tensors.

Remarks

  • The destination and source tensors are in the 128-Byte Aligned Layout.

  • The data type of the destination and source tensors is fp32.

  • The destination and source tensors start at the same NPU.

  • dst_addr and src_addr are divisible by 128.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].

okk_bdc_taylor_exp

void okk_bdc_taylor_exp(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape, int num_series)

Calculate the exponential of each element of the source tensor by Taylor expansion.

\[dst(n, c, h, w) = e^{src(n, c, h, w)}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • shape – Pointer to the shape of the destination and source tensors.

  • num_series – Number of terms in the Taylor series expansion.

Remarks

  • The destination and source tensors are in the 128-Byte Aligned Layout.

  • The data type of the destination and source tensors is fp32.

  • The destination and source tensors start at the same NPU.

  • dst_addr and src_addr are divisible by 128.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].

  • num_series is in [1, 64]; larger values improve accuracy at the cost of performance.

  • This function is suitable only when the absolute values of the elements of the source tensor are small, and in any case less than one.
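
The effect of num_series and of the input magnitude can be illustrated with a scalar host-side reference. This is a sketch only: `taylor_exp_ref` is a hypothetical name, and it assumes num_series counts the terms of the series including the leading 1, which this reference does not state explicitly.

```c
/* Scalar reference: sum of the first num_series terms of e^x = sum over k of x^k / k!.
 * Assumption: num_series counts terms, including the leading 1. */
static float taylor_exp_ref(float x, int num_series)
{
    float term = 1.0f;  /* current term, starts at x^0 / 0! */
    float sum = 0.0f;
    for (int k = 0; k < num_series; ++k) {
        sum += term;
        term *= x / (float)(k + 1);  /* next term: x^(k+1) / (k+1)! */
    }
    return sum;
}
```

For x = 0.5, ten terms already agree with e^0.5 to within fp32 rounding, while for x = 4 the same ten terms are still off by about 0.44. This is why the function is recommended only for small inputs.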

okk_bdc_lookup_exp

void okk_bdc_lookup_exp(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape)

Calculate the exponential of each element of the source tensor using a lookup table.

\[dst(n, c, h, w) = e^{src(n, c, h, w)}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • shape – Pointer to the shape of the destination and source tensors.

Remarks

  • The destination and source tensors are in the 128-Byte Aligned Layout.

  • The data type of the source tensor is int32, the data type of the destination tensor is fp32.

  • The elements of the source tensor are in [-103, 88].

  • The destination and source tensors start at the same NPU.

  • dst_addr and src_addr are divisible by 128.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].
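
The integer input range matches what fp32 can represent: e^88 is about 1.65e38, still below the largest finite fp32 value (about 3.4e38), while e^89 would overflow; e^-103 is about 1.85e-45, close to the smallest positive fp32 subnormal. A host-side sketch of such a table follows. The names `exp_lut`, `exp_lut_init` and `lookup_exp_ref` are hypothetical, and the way the table is built here is an illustration, not the SDK's method.

```c
#include <stdint.h>

#define EXP_LUT_MIN (-103)
#define EXP_LUT_MAX 88
#define EXP_LUT_SIZE (EXP_LUT_MAX - EXP_LUT_MIN + 1)  /* 192 entries */

static float exp_lut[EXP_LUT_SIZE];

/* Fill the table once: entry (k - EXP_LUT_MIN) holds e^k, built in double precision
 * by repeated multiplication and division starting from e^0 = 1. */
static void exp_lut_init(void)
{
    const double e = 2.718281828459045;
    double up = 1.0, down = 1.0;

    exp_lut[0 - EXP_LUT_MIN] = 1.0f;
    for (int k = 1; k <= EXP_LUT_MAX; ++k) {
        up *= e;
        exp_lut[k - EXP_LUT_MIN] = (float)up;
    }
    for (int k = 1; k <= -EXP_LUT_MIN; ++k) {
        down /= e;
        exp_lut[-k - EXP_LUT_MIN] = (float)down;
    }
}

/* Look up e^x for an integer x; the caller must keep x within [-103, 88]. */
static float lookup_exp_ref(int32_t x)
{
    return exp_lut[x - EXP_LUT_MIN];
}
```

Call `exp_lut_init()` once before the first lookup. Inputs outside the documented range would index past the table, which is one reason a range restriction is needed for a table-based method.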

okk_bdc_exp

void okk_bdc_exp(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t work_addr, const dim4 *shape)

Calculate the exponential of each element of the source tensor.

\[dst(n, c, h, w) = e^{src(n, c, h, w)}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • work_addr – Address of the work tensor.

  • shape – Pointer to the shape of the destination, source and work tensors.

Remarks

  • The destination, source and work tensors are in the 128-Byte Aligned Layout.

  • The data type of the destination, source and work tensors is fp32.

  • The elements of the source tensor are in [-103.0, 88.0].

  • The destination, source and work tensors start at the same NPU.

  • dst_addr, src_addr and work_addr are divisible by 128.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].

  • The work tensor is a workspace that stores a temporary tensor of the same size as the source tensor. Neither dst_addr = work_addr nor src_addr = work_addr is allowed.
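
This reference does not describe how okk_bdc_exp works internally. One plausible way to cover the full range [-103.0, 88.0] is to split each input into an integer part and a fractional part, so that a table lookup handles the former and a short Taylor series handles the latter; the intermediate values of such a split would need temporary storage, which would be consistent with the work tensor. The scalar sketch below illustrates that decomposition under these assumptions. `exp_split_ref` is a hypothetical name and this is not the SDK implementation.

```c
/* Hypothetical scalar sketch, not the SDK implementation:
 * e^x = e^k * e^r with integer k = floor(x) and fractional r = x - k in [0, 1). */
static float exp_split_ref(float x, int num_series)
{
    const double e = 2.718281828459045;

    int k = (int)x;              /* truncates toward zero */
    if ((float)k > x)
        --k;                     /* adjust to floor for negative non-integers */
    double r = (double)x - (double)k;

    /* e^k by repeated multiplication or division; a lookup table would replace this loop. */
    double ek = 1.0;
    for (int i = 0; i < (k < 0 ? -k : k); ++i)
        ek = (k < 0) ? ek / e : ek * e;

    /* e^r from the first num_series terms of the Taylor series. */
    double term = 1.0, sum = 0.0;
    for (int i = 0; i < num_series; ++i) {
        sum += term;
        term *= r / (double)(i + 1);
    }
    return (float)(ek * sum);
}
```

Because r always lies in [0, 1), a small number of series terms is enough regardless of how large the original input is.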

okk_bdc_exp_tunable

void okk_bdc_exp_tunable(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t work_addr, const dim4 *shape, int num_series)

Calculate the exponential of each element of the source tensor with a tunable number of Taylor series terms.

\[dst(n, c, h, w) = e^{src(n, c, h, w)}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • work_addr – Address of the work tensor.

  • shape – Pointer to the shape of the destination, source and work tensors.

  • num_series – Number of terms in the Taylor series expansion.

Remarks

okk_bdc_sigmoid

void okk_bdc_sigmoid(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t work_addr, const dim4 *shape)

Calculate the sigmoid of each element of the source tensor.

\[dst(n, c, h, w) = \text{sigmoid}(src(n, c, h, w))\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • work_addr – Address of the work tensor.

  • shape – Pointer to the shape of the destination, source and work tensors.

Remarks

  • The destination, source and work tensors are in the 128-Byte Aligned Layout.

  • The data type of the destination, source and work tensors is fp32.

  • The elements of the source tensor are in [-103.0, 88.0].

  • The destination, source and work tensors start at the same NPU.

  • dst_addr, src_addr and work_addr are divisible by 128.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].

  • The work tensor is a workspace that stores a temporary tensor of the same size as the source tensor. Neither dst_addr = work_addr nor src_addr = work_addr is allowed.
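
For reference, the sigmoid in the formula above is the standard logistic function, written here for a single element x:

\[\text{sigmoid}(x) = \frac{1}{1 + e^{-x}}\]

This is the mathematical definition only; this reference does not state which algebraic form the function evaluates internally.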

okk_bdc_sigmoid_tunable

void okk_bdc_sigmoid_tunable(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t work_addr, const dim4 *shape, int num_series)

Calculate the sigmoid of each element of the source tensor with a tunable number of Taylor series terms.

\[dst(n, c, h, w) = \text{sigmoid}(src(n, c, h, w))\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • work_addr – Address of the work tensor.

  • shape – Pointer to the shape of the destination, source and work tensors.

  • num_series – Number of terms in the Taylor series expansion.

Remarks

okk_bdc_tanh

void okk_bdc_tanh(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t work_addr, const dim4 *shape)

Calculate the tanh of each element of the source tensor.

\[dst(n, c, h, w) = \text{tanh}(src(n, c, h, w))\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • work_addr – Address of the work tensor.

  • shape – Pointer to the shape of the destination, source and work tensors.

Remarks

  • The destination, source and work tensors are in the 128-Byte Aligned Layout.

  • The data type of the destination, source and work tensors is fp32.

  • The elements of the source tensor are in [-103.0, 88.0].

  • The destination, source and work tensors start at the same NPU.

  • dst_addr, src_addr and work_addr are divisible by 128.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].

  • The work tensor is a workspace that stores a temporary tensor of the same size as the source tensor. Neither dst_addr = work_addr nor src_addr = work_addr is allowed.
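
For reference, tanh can be written in terms of the exponential, and equivalently in terms of the logistic function \(\text{sigmoid}(x) = 1 / (1 + e^{-x})\), for a single element x:

\[\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = 2\,\text{sigmoid}(2x) - 1\]

These identities are standard mathematics; this reference does not state which form the function evaluates internally.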

okk_bdc_tanh_tunable

void okk_bdc_tanh_tunable(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t work_addr, const dim4 *shape, int num_series)

Calculate the tanh of each element of the source tensor with a tunable number of Taylor series terms.

\[dst(n, c, h, w) = \text{tanh}(src(n, c, h, w))\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • work_addr – Address of the work tensor.

  • shape – Pointer to the shape of the destination, source and work tensors.

  • num_series – Number of terms in the Taylor series expansion.

Remarks

okk_bdc_reciprocal

void okk_bdc_reciprocal(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Calculate the reciprocal of each element of the source tensor for the fp32 data type.

\[dst(n, c, h, w) = src(n, c, h, w)^{-1}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.
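
The role of the two stride arguments can be illustrated with a host-side reference that walks a flat float array. This is a simplified sketch under stated assumptions: `reciprocal_strided_ref` is a hypothetical name, the `dim4` definition is a stand-in for the SDK type, strides are assumed to be counted in elements, and the flat addressing ignores how channels are distributed across NPUs on the device.

```c
#include <stddef.h>

/* Stand-in for the SDK type (assumption). */
typedef struct { int n, c, h, w; } dim4;

/* Host-side reference: dst(n, c, h, w) = 1 / src(n, c, h, w), where each tensor
 * is addressed through its own per-dimension strides (assumed to be in elements). */
static void reciprocal_strided_ref(float *dst, const float *src, const dim4 *shape,
                                   const dim4 *dst_stride, const dim4 *src_stride)
{
    for (int n = 0; n < shape->n; ++n)
        for (int c = 0; c < shape->c; ++c)
            for (int h = 0; h < shape->h; ++h)
                for (int w = 0; w < shape->w; ++w) {
                    size_t d = (size_t)n * dst_stride->n + (size_t)c * dst_stride->c
                             + (size_t)h * dst_stride->h + (size_t)w * dst_stride->w;
                    size_t s = (size_t)n * src_stride->n + (size_t)c * src_stride->c
                             + (size_t)h * src_stride->h + (size_t)w * src_stride->w;
                    dst[d] = 1.0f / src[s];
                }
}
```

With a source w-stride of 2 and a destination w-stride of 1, for example, the sketch reads every second source element and writes the results contiguously.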

Remarks

okk_bdc_neg

void okk_bdc_neg(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Calculate the negative of each element of the source tensor for the fp32 data type.

\[dst(n, c, h, w) = -src(n, c, h, w)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks