FP32 Neural Network Functions

okk_bdc_relu

void okk_bdc_relu(local_addr_t dst_addr, local_addr_t src_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Compute the ReLU of each element of the source tensor, for the fp32 data type.

\[\begin{split}dst(n, c, h, w) = {\begin{cases}src(n, c, h, w)&{\text{if }}src(n, c, h, w)>0,\\0&{\text{otherwise}}.\end{cases}}\end{split}\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.
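
As an illustration of the formula above, the following is a minimal host-side sketch in plain C. It does not call OKKernel: ref_dim4, ref_offset and ref_relu are hypothetical names, ordinary float arrays stand in for local memory, and strides are assumed to be counted in elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side stand-in for dim4 (n, c, h, w); not the OKKernel type. */
typedef struct { int n, c, h, w; } ref_dim4;

/* Flat element offset of (n, c, h, w), assuming strides are counted in elements. */
size_t ref_offset(const ref_dim4 *s, int n, int c, int h, int w)
{
    return (size_t)n * (size_t)s->n + (size_t)c * (size_t)s->c
         + (size_t)h * (size_t)s->h + (size_t)w * (size_t)s->w;
}

/* dst(n, c, h, w) = src(n, c, h, w) if src(n, c, h, w) > 0, otherwise 0. */
void ref_relu(float *dst, const float *src, const ref_dim4 *shape,
              const ref_dim4 *dst_stride, const ref_dim4 *src_stride)
{
    assert(dst && src && shape && dst_stride && src_stride);
    for (int n = 0; n < shape->n; ++n)
        for (int c = 0; c < shape->c; ++c)
            for (int h = 0; h < shape->h; ++h)
                for (int w = 0; w < shape->w; ++w) {
                    float v = src[ref_offset(src_stride, n, c, h, w)];
                    dst[ref_offset(dst_stride, n, c, h, w)] = v > 0.0f ? v : 0.0f;
                }
}
```

A reference of this kind can be useful for checking device results on the host, element by element.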

Remarks

okk_bdc_bias

void okk_bdc_bias(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t bias_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Add a per-channel bias to the elements of the source tensor.

\[dst(n, c, h, w) = src(n, c, h, w) + bias(0, c, 0, 0)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • bias_addr – Address of the bias tensor.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

okk_bdc_scale

void okk_bdc_scale(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t scale_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Scale the elements of the source tensor per channel.

\[dst(n, c, h, w) = src(n, c, h, w)\times scale(0, c, 0, 0)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • scale_addr – Address of the scale tensor.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.

Remarks

okk_bdc_scale_bias

void okk_bdc_scale_bias(local_addr_t dst_addr, local_addr_t src_addr, local_addr_t scale_addr, local_addr_t bias_addr, const dim4 *shape, const dim4 *dst_stride, const dim4 *src_stride)

Scale the elements of the source tensor and add a bias, both per channel.

\[dst(n, c, h, w) = src(n, c, h, w)\times scale(0, c, 0, 0) + bias(0, c, 0, 0)\]
Parameters
  • dst_addr – Address of the destination tensor.

  • src_addr – Address of the source tensor.

  • scale_addr – Address of the scale tensor.

  • bias_addr – Address of the bias tensor.

  • shape – Pointer to the shape of the destination and source tensors.

  • dst_stride – Pointer to the stride of the destination tensor.

  • src_stride – Pointer to the stride of the source tensor.
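
The per-channel broadcast in the formula above can be sketched on the host as follows. This is an assumption-laden reference, not the OKKernel implementation: ref_scale_bias is a hypothetical name, the tensors are taken to be contiguous NCHW float arrays, and scale and bias each hold one value per channel.

```c
#include <assert.h>
#include <stddef.h>

/* dst(n, c, h, w) = src(n, c, h, w) * scale(0, c, 0, 0) + bias(0, c, 0, 0).
 * Assumes contiguous NCHW storage; scale and bias have c entries each. */
void ref_scale_bias(float *dst, const float *src,
                    const float *scale, const float *bias,
                    int n, int c, int h, int w)
{
    assert(dst && src && scale && bias);
    size_t plane = (size_t)h * (size_t)w;          /* elements per (n, c) slice */
    for (int in = 0; in < n; ++in)
        for (int ic = 0; ic < c; ++ic) {
            size_t base = ((size_t)in * (size_t)c + (size_t)ic) * plane;
            for (size_t i = 0; i < plane; ++i)
                dst[base + i] = src[base + i] * scale[ic] + bias[ic];
        }
}
```

Every element of channel c uses the same scale[c] and bias[c], regardless of n, h and w.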

Remarks

okk_bdc_conv2d

void okk_bdc_conv2d(local_addr_t output_addr, local_addr_t input_addr, local_addr_t weight_addr, local_addr_t bias_addr, const dim4 *input_shape, int output_c, int kernel_h, int kernel_w, const dim4 *input_stride, const dim4 *kernel_stride, bool using_bias, bool result_add, const Padding *padding, const dim2 *stride, const dim2 *dilation)

Perform 2D convolution, with or without adding bias and with or without result accumulation by addition.

Parameters
  • output_addr – Address of the output tensor.

  • input_addr – Address of the input tensor.

  • weight_addr – Address of the weight tensor.

  • bias_addr – Address of the bias tensor, only used when using_bias = true.

  • input_shape – Pointer to the shape of the input tensor.

  • output_c – Channel number of the output tensor.

  • kernel_h – Height of the convolution kernel.

  • kernel_w – Width of the convolution kernel.

  • input_stride – Pointer to the stride of the input tensor.

  • kernel_stride – Pointer to the stride of the weight tensor.

  • using_bias – Flag that enables adding bias.

  • result_add – Flag that enables result accumulation by addition.

  • padding – Pointer to the amount of padding applied to the input tensor.

  • stride – Pointer to the strides for the cross-correlation.

  • dilation – Pointer to the spacings between the kernel points.
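
The function takes no output shape, so the caller has to size the output tensor. The sketch below uses the usual convolution output-size convention; whether OKKernel follows exactly this convention is an assumption here. ref_padding and ref_dim2 are hypothetical stand-ins for Padding and dim2 (their field names are assumed), and the helper names are illustrative only.

```c
#include <assert.h>

/* Hypothetical stand-ins for Padding and dim2; field names are assumptions. */
typedef struct { int top, bottom, left, right; } ref_padding;
typedef struct { int h, w; } ref_dim2;

/* Output length along one axis:
 * floor((in + pad_lo + pad_hi - (dilation * (kernel - 1) + 1)) / stride) + 1 */
int ref_conv_out_len(int in_len, int pad_lo, int pad_hi,
                     int kernel, int stride, int dilation)
{
    assert(kernel > 0 && stride > 0 && dilation > 0);
    int kernel_ext = dilation * (kernel - 1) + 1;   /* dilated kernel extent */
    int padded = in_len + pad_lo + pad_hi;
    assert(padded >= kernel_ext);
    return (padded - kernel_ext) / stride + 1;
}

/* Output height and width for a 2D convolution under the convention above. */
void ref_conv2d_out_hw(int in_h, int in_w, int kernel_h, int kernel_w,
                       const ref_padding *padding, const ref_dim2 *stride,
                       const ref_dim2 *dilation, int *out_h, int *out_w)
{
    *out_h = ref_conv_out_len(in_h, padding->top, padding->bottom,
                              kernel_h, stride->h, dilation->h);
    *out_w = ref_conv_out_len(in_w, padding->left, padding->right,
                              kernel_w, stride->w, dilation->w);
}
```

For example, a 224 x 224 input with a 3 x 3 kernel, padding 1 on every side, stride 1 and dilation 1 keeps its 224 x 224 size under this convention.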

Remarks

okk_bdc_depthwise2d

void okk_bdc_depthwise2d(local_addr_t output_addr, local_addr_t input_addr, local_addr_t weight_addr, local_addr_t bias_addr, const dim4 *input_shape, int kernel_h, int kernel_w, bool using_bias, const Padding *padding, const dim2 *stride, const dim2 *dilation)

Perform 2D depthwise convolution with or without adding bias.

Parameters
  • output_addr – Address of the output tensor.

  • input_addr – Address of the input tensor.

  • weight_addr – Address of the weight tensor.

  • bias_addr – Address of the bias tensor, only used when using_bias = true.

  • input_shape – Pointer to the shape of the input tensor.

  • kernel_h – Height of the convolution kernel.

  • kernel_w – Width of the convolution kernel.

  • using_bias – Flag that enables adding bias.

  • padding – Pointer to the amount of padding applied to the input tensor.

  • stride – Pointer to the strides for the cross-correlation.

  • dilation – Pointer to the spacings between the kernel points.
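
In a depthwise convolution each channel is filtered with its own kernel and channels are never mixed. The following host-side sketch shows that idea for a single batch item, simplified to stride 1, no padding and no dilation. ref_depthwise2d is a hypothetical name, the arrays are assumed contiguous, and passing a NULL bias models using_bias = false.

```c
#include <assert.h>
#include <stddef.h>

/* Depthwise cross-correlation for one batch item.
 * in:     c x in_h x in_w, weight: c x kernel_h x kernel_w,
 * out:    c x (in_h - kernel_h + 1) x (in_w - kernel_w + 1).
 * Simplified to stride 1, no padding, no dilation. bias may be NULL. */
void ref_depthwise2d(float *out, const float *in, const float *weight,
                     const float *bias, int c, int in_h, int in_w,
                     int kernel_h, int kernel_w)
{
    assert(out && in && weight);
    assert(kernel_h > 0 && kernel_w > 0 && in_h >= kernel_h && in_w >= kernel_w);
    int out_h = in_h - kernel_h + 1;
    int out_w = in_w - kernel_w + 1;
    for (int ic = 0; ic < c; ++ic)
        for (int oh = 0; oh < out_h; ++oh)
            for (int ow = 0; ow < out_w; ++ow) {
                float acc = bias ? bias[ic] : 0.0f;
                for (int kh = 0; kh < kernel_h; ++kh)
                    for (int kw = 0; kw < kernel_w; ++kw)
                        acc += in[(ic * in_h + oh + kh) * in_w + ow + kw]
                             * weight[(ic * kernel_h + kh) * kernel_w + kw];
                out[(ic * out_h + oh) * out_w + ow] = acc;
            }
}
```

Channel ic of the output depends only on channel ic of the input and of the weight.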

Remarks

okk_bdc_avg_pool2d

void okk_bdc_avg_pool2d(local_addr_t output_addr, local_addr_t input_addr, const dim4 *input_shape, int kernel_h, int kernel_w, const Padding *padding, const dim2 *stride)

Perform 2D average pooling.

Parameters
  • output_addr – Address of the output tensor.

  • input_addr – Address of the input tensor.

  • input_shape – Pointer to the shape of the input tensor.

  • kernel_h – Height of the pooling kernel.

  • kernel_w – Width of the pooling kernel.

  • padding – Pointer to the amount of padding applied to the input tensor.

  • stride – Pointer to the strides of the pooling window.

Remarks

okk_bdc_avg_pool2d_v2

void okk_bdc_avg_pool2d_v2(local_addr_t output_addr, local_addr_t input_addr, const dim4 *input_shape, float scale, int kernel_h, int kernel_w, const Padding *padding, const dim2 *stride)

Perform 2D average pooling, using a custom scale value instead of the default.

Parameters
  • output_addr – Address of the output tensor.

  • input_addr – Address of the input tensor.

  • input_shape – Pointer to the shape of the input tensor.

  • scale – Scale factor of each pooling window.

  • kernel_h – Height of the pooling kernel.

  • kernel_w – Width of the pooling kernel.

  • padding – Pointer to the amount of padding applied to the input tensor.

  • stride – Pointer to the strides of the pooling window.
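
One plausible reading of the scale parameter is that each window's sum is multiplied by scale, so that scale = 1 / (kernel_h * kernel_w) gives an ordinary average. The sketch below models that reading on the host for one channel and without padding. This interpretation is an assumption, and ref_scaled_sum_pool2d is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed semantics: out = scale * (sum over each pooling window).
 * One channel, contiguous in_h x in_w input, no padding. */
void ref_scaled_sum_pool2d(float *out, const float *in, int in_h, int in_w,
                           int kernel_h, int kernel_w,
                           int stride_h, int stride_w, float scale)
{
    assert(out && in);
    assert(kernel_h > 0 && kernel_w > 0 && stride_h > 0 && stride_w > 0);
    assert(in_h >= kernel_h && in_w >= kernel_w);
    int out_h = (in_h - kernel_h) / stride_h + 1;
    int out_w = (in_w - kernel_w) / stride_w + 1;
    for (int oh = 0; oh < out_h; ++oh)
        for (int ow = 0; ow < out_w; ++ow) {
            float sum = 0.0f;
            for (int kh = 0; kh < kernel_h; ++kh)
                for (int kw = 0; kw < kernel_w; ++kw)
                    sum += in[(oh * stride_h + kh) * in_w + ow * stride_w + kw];
            out[oh * out_w + ow] = sum * scale;
        }
}
```

Under this reading, a custom scale makes it possible to divide by something other than the full window size, for instance when windows overlap padding.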

Remarks

okk_bdc_max_pool2d

void okk_bdc_max_pool2d(local_addr_t output_addr, local_addr_t input_addr, const dim4 *input_shape, int kernel_h, int kernel_w, const Padding *padding, const dim2 *stride)

Perform 2D max pooling.

Parameters
  • output_addr – Address of the output tensor.

  • input_addr – Address of the input tensor.

  • input_shape – Pointer to the shape of the input tensor.

  • kernel_h – Height of the pooling kernel.

  • kernel_w – Width of the pooling kernel.

  • padding – Pointer to the amount of padding applied to the input tensor.

  • stride – Pointer to the strides of the pooling window.
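
For comparison with device results, max pooling over one channel can be sketched on the host as below. ref_max_pool2d is a hypothetical name; the sketch assumes a contiguous input and no padding, because how padded positions are treated is not stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Max over each kernel_h x kernel_w window of a contiguous in_h x in_w input.
 * One channel, no padding. */
void ref_max_pool2d(float *out, const float *in, int in_h, int in_w,
                    int kernel_h, int kernel_w, int stride_h, int stride_w)
{
    assert(out && in);
    assert(kernel_h > 0 && kernel_w > 0 && stride_h > 0 && stride_w > 0);
    assert(in_h >= kernel_h && in_w >= kernel_w);
    int out_h = (in_h - kernel_h) / stride_h + 1;
    int out_w = (in_w - kernel_w) / stride_w + 1;
    for (int oh = 0; oh < out_h; ++oh)
        for (int ow = 0; ow < out_w; ++ow) {
            /* Start from the window's top-left element. */
            float best = in[(oh * stride_h) * in_w + ow * stride_w];
            for (int kh = 0; kh < kernel_h; ++kh)
                for (int kw = 0; kw < kernel_w; ++kw) {
                    float v = in[(oh * stride_h + kh) * in_w + ow * stride_w + kw];
                    if (v > best)
                        best = v;
                }
            out[oh * out_w + ow] = best;
        }
}
```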

Remarks

okk_bdc_matmul

void okk_bdc_matmul(local_addr_t output_addr, local_addr_t left_addr, local_addr_t right_addr, local_addr_t bias_addr, int left_rows, int left_cols, int right_cols, int left_cols_per_channel, int right_cols_per_channel, bool using_bias, bool result_add)

Perform matrix multiplication, with or without adding bias and with or without result accumulation by addition.

Parameters
  • output_addr – Address of the output tensor.

  • left_addr – Address of the left matrix tensor.

  • right_addr – Address of the right matrix tensor.

  • bias_addr – Address of the bias tensor, only used when using_bias = true.

  • left_rows – Number of the rows of the left matrix.

  • left_cols – Number of the columns of the left matrix.

  • right_cols – Number of the columns of the right matrix.

  • left_cols_per_channel – Number of the columns of the left matrix per channel.

  • right_cols_per_channel – Number of the columns of the right matrix per channel.

  • using_bias – Flag that enables adding bias.

  • result_add – Flag that enables result accumulation by addition.
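
The arithmetic selected by using_bias and result_add can be sketched on the host as follows. Several points are assumptions: the matrices are plain row-major arrays, the bias holds one value per output column, and result_add adds the new product to the existing contents of the output. The left_cols_per_channel and right_cols_per_channel parameters are not modelled, and ref_matmul is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* out (left_rows x right_cols) = left (left_rows x left_cols)
 *                              * right (left_cols x right_cols)
 * optionally plus bias[col], optionally accumulated onto the existing out.
 * All matrices are row-major; bias has right_cols entries (assumed layout). */
void ref_matmul(float *out, const float *left, const float *right,
                const float *bias, int left_rows, int left_cols,
                int right_cols, bool using_bias, bool result_add)
{
    assert(out && left && right);
    assert(!using_bias || bias);
    for (int r = 0; r < left_rows; ++r)
        for (int c = 0; c < right_cols; ++c) {
            float acc = 0.0f;
            for (int k = 0; k < left_cols; ++k)
                acc += left[r * left_cols + k] * right[k * right_cols + c];
            if (using_bias)
                acc += bias[c];
            if (result_add)
                out[r * right_cols + c] += acc;   /* accumulate onto prior result */
            else
                out[r * right_cols + c] = acc;    /* overwrite */
        }
}
```

Accumulation of this kind is typically what allows a long inner dimension to be split into several calls whose partial products are summed in the output.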

Remarks