bmcv_cmulp
This interface performs complex multiplication. The computation is:
\[\text{outputReal} + \text{outputImag} \times i = (\text{inputReal} + \text{inputImag} \times i) \times (\text{pointReal} + \text{pointImag} \times i)\]
\[\text{outputReal} = \text{inputReal} \times \text{pointReal} - \text{inputImag} \times \text{pointImag}\]
\[\text{outputImag} = \text{inputReal} \times \text{pointImag} + \text{inputImag} \times \text{pointReal}\]
where \(i\) is the imaginary unit, satisfying \(i^2 = -1\).
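As a quick sanity check of the formulas, here is a worked example with values chosen purely for illustration (they do not come from the SDK): take \(\text{inputReal} = 1\), \(\text{inputImag} = 2\), \(\text{pointReal} = 3\), \(\text{pointImag} = 4\).
\[\text{outputReal} = 1 \times 3 - 2 \times 4 = -5\]
\[\text{outputImag} = 1 \times 4 + 2 \times 3 = 10\]
This agrees with expanding the product directly: \((1 + 2i)(3 + 4i) = 3 + 4i + 6i + 8i^2 = -5 + 10i\).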
Supported processors:
This interface supports BM1684/BM1684X.
Interface:
bm_status_t bmcv_cmulp(
    bm_handle_t handle,
    bm_device_mem_t inputReal,
    bm_device_mem_t inputImag,
    bm_device_mem_t pointReal,
    bm_device_mem_t pointImag,
    bm_device_mem_t outputReal,
    bm_device_mem_t outputImag,
    int batch,
    int len);
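Parameters:
bm_handle_t handle
Input parameter. The bm_handle handle.
bm_device_mem_t inputReal
Input parameter. Device address holding the real part of the input.
bm_device_mem_t inputImag
Input parameter. Device address holding the imaginary part of the input.
bm_device_mem_t pointReal
Input parameter. Device address holding the real part of the other input (the second operand).
bm_device_mem_t pointImag
Input parameter. Device address holding the imaginary part of the other input (the second operand).
bm_device_mem_t outputReal
Output parameter. Device address holding the real part of the output.
bm_device_mem_t outputImag
Output parameter. Device address holding the imaginary part of the output.
int batch
Input parameter. The number of batches.
int len
Input parameter. The number of complex numbers in one batch.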
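To make the roles of batch and len concrete, the following is a minimal host-side sketch of the arithmetic, useful for checking device results on the CPU. It is not the SDK implementation. The function name cmulp_reference is hypothetical, and the layout is an assumption inferred from the buffer sizes in the sample code: input and output arrays hold batch * len floats stored batch by batch, while the point arrays hold len floats that are reused for every batch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for the formulas documented for bmcv_cmulp.
// ASSUMPTION: input/output hold batch * len floats (batch-major) and the
// point arrays hold len floats shared by all batches, as the sample code's
// buffer sizes suggest. This is not taken from the SDK sources.
void cmulp_reference(const std::vector<float>& inputReal,
                     const std::vector<float>& inputImag,
                     const std::vector<float>& pointReal,
                     const std::vector<float>& pointImag,
                     std::vector<float>& outputReal,
                     std::vector<float>& outputImag,
                     int batch, int len) {
    const std::size_t total = static_cast<std::size_t>(batch) * len;
    assert(inputReal.size() == total && inputImag.size() == total);
    assert(pointReal.size() == static_cast<std::size_t>(len));
    assert(pointImag.size() == static_cast<std::size_t>(len));

    outputReal.resize(total);
    outputImag.resize(total);
    for (int b = 0; b < batch; ++b) {
        for (int j = 0; j < len; ++j) {
            // k indexes the input/output element; j indexes the shared point element.
            const std::size_t k = static_cast<std::size_t>(b) * len + j;
            outputReal[k] = inputReal[k] * pointReal[j] - inputImag[k] * pointImag[j];
            outputImag[k] = inputReal[k] * pointImag[j] + inputImag[k] * pointReal[j];
        }
    }
}
```

After copying the device outputs back to the host, each element can be compared against the vectors this function fills. If the device layout differs from the assumption above, only the index expressions need to change.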
Return value:
BM_SUCCESS: success
Other: failure
Notes:
Only the float data type is supported.
Sample code
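#include <cstdlib>
#include "bmlib_runtime.h"
#include "bmcv_api_ext.h"  // header names as commonly shipped with the SDK; adjust if your installation differs

int main() {
    int L = 5;      // number of complex numbers per batch
    int batch = 2;  // number of batches

    // Host buffers: the input holds L * batch values, the point arrays hold L values.
    float *XRHost = new float[L * batch];
    float *XIHost = new float[L * batch];
    float *PRHost = new float[L];
    float *PIHost = new float[L];
    float *YRHost = new float[L * batch];
    float *YIHost = new float[L * batch];

    // Fill the inputs with small integers in [-2, 2].
    for (int i = 0; i < L * batch; ++i) {
        XRHost[i] = rand() % 5 - 2;
        XIHost[i] = rand() % 5 - 2;
    }
    for (int i = 0; i < L; ++i) {
        PRHost[i] = rand() % 5 - 2;
        PIHost[i] = rand() % 5 - 2;
    }

    bm_handle_t handle = nullptr;
    bm_dev_request(&handle, 0);

    // Allocate device memory; sizes are in bytes.
    bm_device_mem_t XRDev, XIDev, PRDev, PIDev, YRDev, YIDev;
    bm_malloc_device_byte(handle, &XRDev, L * batch * sizeof(float));
    bm_malloc_device_byte(handle, &XIDev, L * batch * sizeof(float));
    bm_malloc_device_byte(handle, &PRDev, L * sizeof(float));
    bm_malloc_device_byte(handle, &PIDev, L * sizeof(float));
    bm_malloc_device_byte(handle, &YRDev, L * batch * sizeof(float));
    bm_malloc_device_byte(handle, &YIDev, L * batch * sizeof(float));

    // Copy the inputs from host to device.
    bm_memcpy_s2d(handle, XRDev, XRHost);
    bm_memcpy_s2d(handle, XIDev, XIHost);
    bm_memcpy_s2d(handle, PRDev, PRHost);
    bm_memcpy_s2d(handle, PIDev, PIHost);

    // Run the complex multiplication and keep the status for the exit code.
    bm_status_t ret = bmcv_cmulp(handle, XRDev, XIDev, PRDev, PIDev, YRDev, YIDev, batch, L);

    // Copy the results from device back to host.
    bm_memcpy_d2s(handle, YRHost, YRDev);
    bm_memcpy_d2s(handle, YIHost, YIDev);

    // Release host memory, device memory, and the device handle.
    delete[] XRHost;
    delete[] XIHost;
    delete[] PRHost;
    delete[] PIHost;
    delete[] YRHost;
    delete[] YIHost;
    bm_free_device(handle, XRDev);
    bm_free_device(handle, XIDev);
    bm_free_device(handle, YRDev);
    bm_free_device(handle, YIDev);
    bm_free_device(handle, PRDev);
    bm_free_device(handle, PIDev);
    bm_dev_free(handle);

    return ret == BM_SUCCESS ? 0 : 1;
}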