Data Type Converting Functions


void okk_bdc_fp32_to_int32(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape)

Convert the elements of the source tensor from fp32 to int32.

  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • shape – Pointer to the shape of the destination and source tensors.


  • The destination and source tensors are in the 128-Byte Aligned Layout.

  • The data type of the source tensor is fp32, the data type of the destination tensor is int32.

  • The destination and source tensors start at the same NPU.

  • dst_addr and src_addr are divisible by 128.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].
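
The address and shape constraints above can be checked on the host before the kernel is issued. The following is a minimal sketch, not part of the library: the `local_addr_t` and `dim4` definitions are stand-ins for the SDK types, and `in_range`, `fp32_to_int32_args_ok` and `require_fp32_to_int32_args` are hypothetical helper names. The "same NPU" constraint is not checked because it depends on the local memory size of the target chip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types for illustration only (assumption); the real definitions
   come from the SDK header and may differ. */
typedef uint32_t local_addr_t;
typedef struct { int n, c, h, w; } dim4;

/* True when lo <= v <= hi. */
static inline bool in_range(int v, int lo, int hi)
{
    return v >= lo && v <= hi;
}

/* Hypothetical helper: true when the arguments satisfy the documented
   alignment and shape-range constraints of okk_bdc_fp32_to_int32. */
static inline bool fp32_to_int32_args_ok(local_addr_t dst_addr,
                                         local_addr_t src_addr,
                                         const dim4 *shape)
{
    if (shape == NULL)
        return false;
    /* Both addresses must be divisible by 128. */
    if (dst_addr % 128 != 0 || src_addr % 128 != 0)
        return false;
    /* n, h and w are in [1, 65535]; c is in [1, 4095]. */
    return in_range(shape->n, 1, 65535) && in_range(shape->c, 1, 4095) &&
           in_range(shape->h, 1, 65535) && in_range(shape->w, 1, 65535);
}

/* Hypothetical debug guard: aborts in debug builds on invalid arguments. */
static inline void require_fp32_to_int32_args(local_addr_t dst_addr,
                                              local_addr_t src_addr,
                                              const dim4 *shape)
{
    assert(fp32_to_int32_args_ok(dst_addr, src_addr, shape));
}
```

A caller could invoke `require_fp32_to_int32_args(dst_addr, src_addr, &shape)` immediately before `okk_bdc_fp32_to_int32` so that a misaligned address or an out-of-range dimension is caught on the host.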


void okk_bdc_lookup_int32_to_fp32(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape)

Convert the elements of the source tensor from int32 to fp32 by lookup table.

  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • shape – Pointer to the shape of the destination and source tensors.


  • The destination and source tensors are in the 128-Byte Aligned Layout.

  • The data type of the source tensor is int32, the data type of the destination tensor is fp32.

  • The elements of the source tensor are in [-128, 127].

  • The destination and source tensors start at the same NPU.

  • dst_addr and src_addr are divisible by 128.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].
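
Because the source elements are restricted to [-128, 127], a 256-entry table covers every possible input. The following host-side sketch illustrates the idea of conversion by table lookup; it is not the device implementation. The table layout (index = value + 128) and the names `LUT_SIZE`, `build_int8_range_lut` and `lookup_int32_to_fp32_ref` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LUT_SIZE 256 /* one entry per value in [-128, 127] */

/* Fill the table so that entry (v + 128) holds v as a float.
   The +128 offset is an assumed layout, chosen to keep indices non-negative. */
static inline void build_int8_range_lut(float lut[LUT_SIZE])
{
    for (int v = -128; v <= 127; ++v)
        lut[v + 128] = (float)v;
}

/* Host-side reference: convert count int32 elements to fp32 by table lookup.
   Every source element must lie in [-128, 127]. */
static inline void lookup_int32_to_fp32_ref(float *dst, const int32_t *src,
                                            size_t count,
                                            const float lut[LUT_SIZE])
{
    for (size_t i = 0; i < count; ++i) {
        assert(src[i] >= -128 && src[i] <= 127);
        dst[i] = lut[src[i] + 128];
    }
}
```

A reference like this can be used to compare against the output of `okk_bdc_lookup_int32_to_fp32` copied back from the device.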


void okk_bdc_4N_int8_to_fp32(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t work_addr, const dim4 *shape, bool is_signed, bool is_aligned_layout)

Convert the elements of the source tensor from int8 or uint8 to fp32.

  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • work_addr – Address of the work tensor.

  • shape – Pointer to the shape of the destination, source and work tensors.

  • is_signed – Data type flag for the source tensor: true means int8, false means uint8.

  • is_aligned_layout – Layout flag for the destination, source and work tensors: true means 128-Byte Aligned Layout, false means Compact Layout.


  • The destination, source and work tensors must all use the same layout, either the 128-Byte Aligned Layout or the Compact Layout.

  • The data type of the source and work tensors is int8 or uint8, the data type of the destination tensor is fp32.

  • The source and work tensors are in the 4N-mode.

  • The destination, source and work tensors start at the same NPU.

  • dst_addr, src_addr and work_addr must be divisible by 4; divisibility by 128 is preferred.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].

  • The work tensor is a workspace that holds a temporary tensor of the same size as the source tensor; dst_addr = work_addr is not allowed.
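
The effect of is_signed is easiest to see on individual bytes: the same bit pattern maps to a different fp32 value depending on whether it is read as int8 or uint8. The following host-side sketch models only the element-wise conversion; the layout, the 4N packing and the work tensor are not modelled, and `int8_to_fp32_ref` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side reference for the element-wise conversion only: each source
   byte is read as int8 when is_signed is true and as uint8 otherwise. */
static inline void int8_to_fp32_ref(float *dst, const uint8_t *src,
                                    size_t count, bool is_signed)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; ++i) {
        int v = src[i]; /* 0..255 */
        /* Two's complement reinterpretation: 128..255 become -128..-1. */
        if (is_signed && v >= 128)
            v -= 256;
        dst[i] = (float)v;
    }
}
```

For example, the byte 0xFF converts to -1.0 with is_signed set to true and to 255.0 with is_signed set to false.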


void okk_bdc_int8_to_fp32(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t work_addr, const dim4 *shape, bool is_signed, bool is_aligned_layout)

Convert the elements of the source tensor from int8 or uint8 to fp32.

  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • work_addr – Address of the work tensor.

  • shape – Pointer to the shape of the destination, source and work tensors.

  • is_signed – Data type flag for the source tensor: true means int8, false means uint8.

  • is_aligned_layout – Layout flag for the destination, source and work tensors: true means 128-Byte Aligned Layout, false means Compact Layout.


  • The destination, source and work tensors must all use the same layout, either the 128-Byte Aligned Layout or the Compact Layout.

  • The data type of the source and work tensors is int8 or uint8, the data type of the destination tensor is fp32.

  • The destination, source and work tensors start at the same NPU.

  • dst_addr, src_addr and work_addr must be divisible by 4; divisibility by 128 is preferred.

  • shape->n, shape->h and shape->w are in [1, 65535], shape->c is in [1, 4095].

  • The work tensor is a workspace that holds a temporary tensor of the same size as the source tensor; dst_addr = work_addr is not allowed.

  • If the source and work tensors are in the Compact Layout, one further restriction applies: their C stride must be ALIGN(shape->h * shape->w, 4) rather than shape->h * shape->w, so they are only in an approximate Compact Layout.
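
The C stride required in the Compact Layout case can be computed as below. This is a sketch under stated assumptions: `align_up` is a hypothetical helper assumed to match the round-up-to-multiple behavior of the ALIGN macro, the `dim4` definition is a stand-in for the SDK type, and the stride is taken to be counted in elements.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in type for illustration only (assumption). */
typedef struct { int n, c, h, w; } dim4;

/* Round x up to the next multiple of a (a > 0).
   Assumed to match the semantics of the ALIGN macro. */
static inline int64_t align_up(int64_t x, int64_t a)
{
    assert(a > 0);
    return (x + a - 1) / a * a;
}

/* C stride of the int8/uint8 source and work tensors in the approximate
   Compact Layout: h * w rounded up to a multiple of 4. A 64-bit result is
   used because h * w can exceed the range of a 32-bit int. */
static inline int64_t compact_c_stride_int8(const dim4 *shape)
{
    return align_up((int64_t)shape->h * shape->w, 4);
}
```

For a 3 x 3 feature map, h * w is 9, so the C stride is 12 and the last 3 elements of each channel are padding; for a 4 x 4 feature map the stride stays at 16 and matches the exact Compact Layout.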