10.4. Data Types¶
10.4.1. Audio Input / Output¶
The definition of data type and data structure related to audio input / output is as follows.
AI_DEV_MAX_NUM: Define the maximum number of audio input devices.
AO_DEV_MAX_NUM: Define the maximum number of audio output devices.
CVI_AUD_MAX_CHANNEL_NUM: Define the maximum number of channels for an audio output device.
AI_TALKVQE_MASK_AEC: Mask of Talk Vqe AEC function.
AI_TALKVQE_MASK_AGC: Mask of Talk Vqe AGC function.
AI_TALKVQE_MASK_ANR: Mask of Talk Vqe ANR function.
AI_RECORDVQE_MASK_AGC: Mask of Record Vqe AGC function.
MAX_AUDIO_FILE_PATH_LEN: Maximum length limitation for the path of the saved audio file.
MAX_AUDIO_FILE_NAME_LEN: Maximum length limitation for the name of the saved audio file.
CVI_MAX_AI_DEVICE_ID_NUM : Defines the maximum limit for the number of AI (Audio Input) device IDs.
CVI_MAX_AI_CARD_ID_NUM : Defines the maximum limit for the number of AI (Audio Input) card IDs.
CVI_MAX_AO_DEVICE_ID_NUM : Defines the maximum limit for the number of AO (Audio Output) device IDs.
CVI_MAX_AO_CARD_ID_NUM : Defines the maximum limit for the number of AO (Audio Output) card IDs.
CVI_MAX_AUDIO_FRAME_NUM : Defines the maximum limit for the number of audio frames.
CVI_AUD_MAX_VOICE_POINT_NUM : Defines the maximum number of samples for each frame of voice encoding.
CVI_AUD_MAX_AUDIO_POINT_NUM : Defines the maximum number of samples for each frame of all audio encoding.
CVI_MAX_AUDIO_STREAM_LEN : Defines the maximum length of the audio stream.
MAX_AUDIO_VQE_CUSTOMIZE_NAME : Defines the maximum length limit for the custom name of audio voice quality enhancement (VQE).
AUDIO_CLKSEL_E: Define the audio clock source.
AUDIO_SAMPLE_RATE_E: Define the audio sampling rate.
AUDIO_BIT_WIDTH_E: Define the audio sampling accuracy.
AIO_MODE_E: Define the audio input / output working mode.
AIO_I2STYPE_E: Define I2S interface device type.
AUDIO_SOUND_MODE_E: Define the audio channel mode.
AUDIO_MOD_PARAM_S: Define the audio module parameter structure.
AIO_ATTR_S: Define audio input / output device property structure.
AI_CHN_PARAM_S: Define channel parameter structure.
AUDIO_FRAME_S: Define audio frame data structure.
AEC_FRAME_S: Define the information structure of echo cancellation reference frame.
AUDIO_AGC_CONFIG_S: Define the audio AGC configuration information structure.
AI_AEC_CONFIG_S: Define the audio echo cancellation configuration information structure.
AUDIO_ANR_CONFIG_S: Define the information structure of audio voice noise reduction function.
VQE_WORKSTATE_E: Define the working mode of voice quality enhancement.
VQE_RECORD_TYPE: Define the recording type.
AI_TALKVQE_CONFIG_S: Define the structure of audio input sound quality enhancement (Talk) configuration information.
AI_RECORDVQE_CONFIG_S: Define the structure of audio input sound quality enhancement (Record) configuration information.
AUDIO_STREAM_S: Define audio stream structure.
AO_CHN_STATE_S: Define Audio Output Channel Data Block Status Structure.
AUDIO_TRACK_MODE_E: Audio device channel mode type.
AUDIO_FADE_RATE_E: The audio device fade in and fade out rate type.
AUDIO_FADE_S: The audio device fades in and out setting structure.
G726_BPS_E: Define the G.726 codec rate.
ADPCM_TYPE_E: Define ADPCM codec type.
AUDIO_SAVE_FILE_INFO_S: Define the configuration information structure for the audio file saving function.
AUDIO_FILE_STATUS_S: Define the audio file save status structure.
VQE_MODULE_CONFIG_S: Define the configuration information structure of voice quality enhancement and resampling module.
AUDIO_VQE_REGISTER_S: Define the register structure of sound quality enhancement and resampling module.
CVI_HPF_CONFIG_S : Defines the configuration parameters for a high-pass filter (HPF).
CVI_EQ_CONFIG_S : Defines the configuration parameters for an equalizer (EQ).
CVI_DRC_LIMITER_PARAM : Defines the configuration parameters for a dynamic range compressor (DRC) limiter.
CVI_DRC_EXPANDER_PARAM : Defines the configuration parameters for a dynamic range compressor (DRC) expander.
CVI_DRC_COMPRESSOR_PARAM : Defines the configuration parameters for a dynamic range compressor (DRC) compressor.
AUDIO_SPK_EQ_CONFIG_S : Defines the configuration parameters for the speaker equalizer (EQ).
AO_VQE_CONFIG_S : Defines the configuration parameters for audio output (AO) voice quality enhancement (VQE).
HPF_FILTER_TYPE : Defines the enumeration for high-pass filter (HPF) types.
AUDIO_SPK_AGC_CONFIG_S : Defines the configuration parameters for the speaker automatic gain control (AGC).
The following features are not currently supported.
AI_TALKVQE_MASK_HPF : Mask of Talk Vqe HPF function.
AI_TALKVQE_MASK_EQ : Mask of Talk Vqe EQ function.
AI_RECORDVQE_MASK_HPF : Mask of Record Vqe HPF function.
AI_RECORDVQE_MASK_RNR : Mask of Record Vqe RNR function.
AI_RECORDVQE_MASK_HDR : Mask of Record Vqe HDR function.
AI_RECORDVQE_MASK_DRC : Mask of Record Vqe DRC function.
AI_RECORDVQE_MASK_EQ : Mask of Record Vqe EQ function.
AO_VQE_MASK_HPF : Mask of AO Vqe HPF function.
10.4.1.1. AI_DEV_MAX_NUM¶
【Description】
Define the maximum number of audio input devices.
【Syntax】
#define AI_DEV_MAX_NUM 1
【Note】
Deprecated.
【Related Data Type and Interface】
None.
10.4.1.2. AO_DEV_MAX_NUM¶
【Description】
Define the maximum number of audio output devices.
【Syntax】
#define AO_DEV_MAX_NUM 1
【Note】
Deprecated.
【Related Data Type and Interface】
None.
10.4.1.3. CVI_AUD_MAX_CHANNEL_NUM¶
【Description】
Define the maximum number of channels for an audio output device.
【Syntax】
#define CVI_AUD_MAX_CHANNEL_NUM 8
【Note】
The number of audio channels is set through the u32ChnCnt member of AIO_ATTR_S.
The value cannot exceed CVI_AUD_MAX_CHANNEL_NUM.
【Related Data Type and Interface】
None.
10.4.1.4. AI_TALKVQE_MASK_AEC¶
【Description】
Mask of Talk Vqe AEC function.
【Syntax】
#define AI_TALKVQE_MASK_AEC 0x3
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.5. AI_TALKVQE_MASK_AGC¶
【Description】
Define the Mask of Talk Vqe AGC function.
【Syntax】
#define AI_TALKVQE_MASK_AGC 0x8
【Note】
None.
【Related Data Type and Interface】
Assign this value to the u32OpenMask member of AI_TALKVQE_CONFIG_S to turn on the AGC function.
For example, u32OpenMask = AI_TALKVQE_MASK_AEC | AI_TALKVQE_MASK_AGC; indicates that the AEC and AGC functions are turned on.
10.4.1.6. AI_TALKVQE_MASK_ANR¶
【Description】
Define the Mask of talk Vqe ANR function.
【Syntax】
#define AI_TALKVQE_MASK_ANR 0x4
【Note】
None.
【Related Data Type and Interface】
Assign this value to the u32OpenMask member of AI_TALKVQE_CONFIG_S to turn on the ANR function.
For example, u32OpenMask = AI_TALKVQE_MASK_AEC | AI_TALKVQE_MASK_ANR; indicates that the AEC and ANR functions are turned on.
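As a minimal sketch of how the Talk VQE masks combine, the helper below builds a u32OpenMask value with a bitwise OR. The macro values are copied from the Syntax entries in this section. `build_talk_vqe_mask` is a hypothetical helper name, not an SDK function, and `uint32_t` stands in for `CVI_U32`.

```c
#include <stdint.h>

/* Values copied from this section; a real build takes them from the SDK headers. */
#define AI_TALKVQE_MASK_AEC 0x3
#define AI_TALKVQE_MASK_ANR 0x4
#define AI_TALKVQE_MASK_AGC 0x8

/* Hypothetical helper (not part of the SDK): OR together the masks of the
 * functions to enable. uint32_t stands in for CVI_U32. */
static uint32_t build_talk_vqe_mask(int aec, int anr, int agc)
{
    uint32_t mask = 0;

    if (aec)
        mask |= AI_TALKVQE_MASK_AEC;
    if (anr)
        mask |= AI_TALKVQE_MASK_ANR;
    if (agc)
        mask |= AI_TALKVQE_MASK_AGC;
    return mask;
}
```

For instance, `build_talk_vqe_mask(1, 1, 0)` yields the same value as `AI_TALKVQE_MASK_AEC | AI_TALKVQE_MASK_ANR`, which would then be stored in u32OpenMask of AI_TALKVQE_CONFIG_S.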
10.4.1.7. AI_RECORDVQE_MASK_AGC¶
【Description】
Define the Mask of record Vqe AGC function.
【Syntax】
#define AI_RECORDVQE_MASK_AGC 0x20
【Note】
None.
【Related Data Type and Interface】
Assign this value to the u32OpenMask member of AI_RECORDVQE_CONFIG_S to turn on the AGC function.
For example, u32OpenMask = AI_RECORDVQE_MASK_HPF | AI_RECORDVQE_MASK_AGC; indicates that the HPF and AGC functions are turned on.
10.4.1.8. MAX_AUDIO_FILE_PATH_LEN¶
【Description】
Define the maximum length limitation for the path of the saved audio file.
【Syntax】
#define MAX_AUDIO_FILE_PATH_LEN 256
【Note】
None.
【Related Data Type and Interface】
AUDIO_SAVE_FILE_INFO_S
10.4.1.9. MAX_AUDIO_FILE_NAME_LEN¶
【Description】
Define the maximum length limitation for the name of the saved audio file.
【Syntax】
#define MAX_AUDIO_FILE_NAME_LEN 256
【Note】
None.
【Related Data Type and Interface】
AUDIO_SAVE_FILE_INFO_S
10.4.1.10. CVI_MAX_AI_DEVICE_ID_NUM¶
【Description】
Define the maximum number of AI (Audio Input) device IDs.
【Syntax】
#define CVI_MAX_AI_DEVICE_ID_NUM 5 /* Maximum number of AI device ID */
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.11. CVI_MAX_AI_CARD_ID_NUM¶
【Description】
Define the maximum number of AI (Audio Input) card IDs.
【Syntax】
#define CVI_MAX_AI_CARD_ID_NUM 5 /* Maximum number of AI card ID */
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.12. CVI_MAX_AO_DEVICE_ID_NUM¶
【Description】
Define the maximum number of AO (Audio Output) device IDs.
【Syntax】
#define CVI_MAX_AO_DEVICE_ID_NUM 5 /* Maximum number of AO device ID */
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.13. CVI_MAX_AO_CARD_ID_NUM¶
【Description】
Define the maximum number of AO (Audio Output) card IDs.
【Syntax】
#define CVI_MAX_AO_CARD_ID_NUM 5 /* Maximum number of AO card ID */
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.14. CVI_MAX_AUDIO_FRAME_NUM¶
【Description】
Defines the maximum limit for the number of audio frames.
【Syntax】
#define CVI_MAX_AUDIO_FRAME_NUM 300 /* max count of audio frame in Buffer */
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.15. CVI_AUD_MAX_VOICE_POINT_NUM¶
【Description】
Defines the maximum number of samples for each frame of voice encoding.
【Syntax】
#define CVI_AUD_MAX_VOICE_POINT_NUM 1280 /* max sample per frame for voice encode */
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.16. CVI_AUD_MAX_AUDIO_POINT_NUM¶
【Description】
Defines the maximum number of samples for each frame of all audio encoding.
【Syntax】
#define CVI_AUD_MAX_AUDIO_POINT_NUM 2048 /* max sample per frame for all encoder */
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.17. CVI_MAX_AUDIO_STREAM_LEN¶
【Description】
Defines the maximum length of the audio stream.
【Syntax】
#define CVI_MAX_AUDIO_STREAM_LEN 8192 /* Maximum length of audio stream */
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.18. MAX_AUDIO_VQE_CUSTOMIZE_NAME¶
【Description】
Defines the maximum length limit for the custom name of audio voice quality enhancement (VQE).
【Syntax】
#define MAX_AUDIO_VQE_CUSTOMIZE_NAME 64 /* Maximum length of VQE customize name */
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.19. AUDIO_CLKSEL_E¶
【Description】
Define the audio clock source.
【Syntax】
typedef enum _AUDIO_CLKSEL_E {
AUDIO_CLKSEL_BASE = 0, /* Audio base clk. */
AUDIO_CLKSEL_SPARE, /* Audio spare clk. */
AUDIO_CLKSEL_BUTT,
} AUDIO_CLKSEL_E;
【Member】
None.
【Note】
Cvitek users do not need to set the clock at this time.
【Related Data Type and Interface】
10.4.1.20. AUDIO_SAMPLE_RATE_E¶
【Description】
Define the audio sampling rate.
【Syntax】
typedef enum _AUDIO_SAMPLE_RATE_E {
AUDIO_SAMPLE_RATE_8000 = 8000, /* 8K samplerate */
AUDIO_SAMPLE_RATE_11025 = 11025, /* 11.025K samplerate */
AUDIO_SAMPLE_RATE_16000 = 16000, /* 16K samplerate */
AUDIO_SAMPLE_RATE_22050 = 22050, /* 22.050K samplerate */
AUDIO_SAMPLE_RATE_24000 = 24000, /* 24K samplerate */
AUDIO_SAMPLE_RATE_32000 = 32000, /* 32K samplerate */
AUDIO_SAMPLE_RATE_44100 = 44100, /* 44.1K samplerate */
AUDIO_SAMPLE_RATE_48000 = 48000, /* 48K samplerate */
AUDIO_SAMPLE_RATE_64000 = 64000, /* 64K samplerate */
AUDIO_SAMPLE_RATE_BUTT,
} AUDIO_SAMPLE_RATE_E;
【Member】
| Member | Description |
|---|---|
| AUDIO_SAMPLE_RATE_8000 | 8 kHz sample rate |
| AUDIO_SAMPLE_RATE_11025 | 11.025 kHz sample rate |
| AUDIO_SAMPLE_RATE_16000 | 16 kHz sample rate |
| AUDIO_SAMPLE_RATE_22050 | 22.050 kHz sample rate |
| AUDIO_SAMPLE_RATE_24000 | 24 kHz sample rate |
| AUDIO_SAMPLE_RATE_32000 | 32 kHz sample rate |
| AUDIO_SAMPLE_RATE_44100 | 44.1 kHz sample rate |
| AUDIO_SAMPLE_RATE_48000 | 48 kHz sample rate |
| AUDIO_SAMPLE_RATE_64000 | 64 kHz sample rate |
【Note】
【Related Data Type and Interface】
10.4.1.21. AUDIO_BIT_WIDTH_E¶
【Description】
Define the audio sampling accuracy.
【Syntax】
typedef enum _AUDIO_BIT_WIDTH_E {
AUDIO_BIT_WIDTH_8 =0, /* 8bit width */
AUDIO_BIT_WIDTH_16 =1, /* 16bit width */
AUDIO_BIT_WIDTH_24 =2, /* 24bit width */
AUDIO_BIT_WIDTH_32 =3, /* 32bit width */
AUDIO_BIT_WIDTH_BUTT, /* boundary check */
} AUDIO_BIT_WIDTH_E;
【Member】
| Member | Description |
|---|---|
| AUDIO_BIT_WIDTH_8 | The sampling accuracy is 8 bits |
| AUDIO_BIT_WIDTH_16 | The sampling accuracy is 16 bits |
| AUDIO_BIT_WIDTH_24 | The sampling accuracy is 24 bits |
| AUDIO_BIT_WIDTH_32 | The sampling accuracy is 32 bits |
【Note】
None.
【Related Data Type and Interface】
10.4.1.22. AIO_MODE_E¶
【Description】
Define the audio input / output working mode.
【Syntax】
typedef enum _AIO_MODE_E {
AIO_MODE_I2S_MASTER = 0, /* AIO I2S master mode */
AIO_MODE_I2S_SLAVE, /* AIO I2S slave mode */
AIO_MODE_PCM_SLAVE_STD, /* AIO PCM slave standard mode */
AIO_MODE_PCM_SLAVE_NSTD, /* AIO PCM slave non-standard mode */
AIO_MODE_PCM_MASTER_STD, /* AIO PCM master standard mode */
AIO_MODE_PCM_MASTER_NSTD, /* AIO PCM master non-standard mode */
AIO_MODE_BUTT /* boundary check */
}AIO_MODE_E;
【Member】
| Member | Description |
|---|---|
| AIO_MODE_I2S_MASTER | I2S master mode |
| AIO_MODE_I2S_SLAVE | I2S slave mode |
| AIO_MODE_PCM_SLAVE_STD | PCM slave standard mode |
| AIO_MODE_PCM_SLAVE_NSTD | PCM slave non-standard mode |
| AIO_MODE_PCM_MASTER_STD | PCM master standard mode |
| AIO_MODE_PCM_MASTER_NSTD | PCM master non-standard mode |
【Note】
Built-in Cvitek only supports I2S master mode.
【Related Data Type and Interface】
10.4.1.23. AIO_I2STYPE_E¶
【Description】
Define I2S interface device type.
【Syntax】
typedef enum {
AIO_I2STYPE_INNERCODEC = 0, /* AIO I2S connect inner audio CODEC */
AIO_I2STYPE_INNERHDMI, /* AIO I2S connect Inner HDMI */
AIO_I2STYPE_EXTERN, /* AIO I2S connect extern hardware */
} AIO_I2STYPE_E;
【Member】
| Member | Description |
|---|---|
| AIO_I2STYPE_INNERCODEC | I2S connects to the inner audio CODEC |
| AIO_I2STYPE_INNERHDMI | I2S connects to the inner HDMI |
| AIO_I2STYPE_EXTERN | I2S connects to external hardware |
【Note】
Cvitek only supports AIO_I2STYPE_INNERCODEC connecting to inner audio CODEC.
【Related Data Type and Interface】
10.4.1.24. AUDIO_SOUND_MODE_E¶
【Description】
Define the audio channel mode.
【Syntax】
typedef enum _AIO_SOUND_MODE_E {
AUDIO_SOUND_MODE_MONO = 0, /*mono*/
AUDIO_SOUND_MODE_STEREO = 1, /*stereo only support interlace mode*/
AUDIO_SOUND_MODE_BUTT /*boundary check*/
} AUDIO_SOUND_MODE_E;
【Member】
| Member | Description |
|---|---|
| AUDIO_SOUND_MODE_MONO | Mono |
| AUDIO_SOUND_MODE_STEREO | Stereo |
【Note】
The left channel corresponds to channel 0 and the right channel corresponds to channel 1.
For AI, mono input is from the left channel by default.
If it needs to be configured as the right channel input, you can consider two methods.
Turn on the right channel only and process.
Turn on the left and right channels, process according to the left channel, and use CVI_AI_SetTrackMode to configure AI channel mode to “AUDIO_TRACK_EXCHANGE”.
For AO, mono input is from the left channel by default.
If it needs to be configured as the right channel input, you can consider two methods.
Turn on the right channel only and process.
Turn on the left and right channels, process according to the left channel, and use CVI_AO_SetTrackMode to configure AO channel mode to “AUDIO_TRACK_EXCHANGE”.
For stereo mode, only the left channel (that is, the channel whose number is less than half of u32ChnCnt in the device attribute) should be operated, and the SDK will automatically operate the right channel.
【Related Data Type and Interface】
10.4.1.25. AUDIO_MOD_PARAM_S¶
【Description】
Define the audio module parameter structure.
【Syntax】
typedef struct _AUDIO_MOD_PARAM_S {
AUDIO_CLKSEL_E enClkSel; /* Audio clock select */
} AUDIO_MOD_PARAM_S;
【Member】
enClkSel: Audio clock source selection. Please see AUDIO_CLKSEL_E.
【Note】
Cvitek does not need special setting for CLK.
【Related Data Type and Interface】
None.
10.4.1.26. AIO_ATTR_S¶
【Description】
Define audio input / output device property structure.
【Syntax】
typedef struct _AIO_ATTR_S {
AUDIO_SAMPLE_RATE_E enSamplerate; /* sample rate */
AUDIO_BIT_WIDTH_E enBitwidth; /* bitwidth */
AIO_MODE_E enWorkmode; /* master or slave mode */
AUDIO_SOUND_MODE_E enSoundmode; /* mono or stereo */
CVI_U32 u32EXFlag;
/* expand 8bit to 16bit,use AI_EXPAND(only valid for AI 8bit),*/
/*use AI_CUT(only valid for extern Codec for 24bit) */
CVI_U32 u32FrmNum;
/* frame num in buf[2,CVI_MAX_AUDIO_FRAME_NUM] */
CVI_U32 u32PtNumPerFrm;
/* point num per frame (80/160/240/320/480/1024/2048) */
/*(ADPCM IMA should add 1 point, AMR only support 160) */
CVI_U32 u32ChnCnt; /* channel number on FS, valid value:1/2/4/8 */
CVI_U32 u32ClkSel; /* 0: AI and AO clock is separate*/
/* 1: AI and AO clock is inseparate, AI use AO's clock*/
AIO_I2STYPE_E enI2sType; /* i2s type */
} AIO_ATTR_S;
【Member】
| Member | Description |
|---|---|
| enSamplerate | Audio sample rate (this parameter has no effect in slave mode). Static property. |
| enBitwidth | Audio sampling accuracy (in slave mode, this parameter must match the sampling accuracy of the audio AD/DA). Static property. |
| enWorkmode | Audio input / output working mode. Static property. |
| enSoundmode | Audio channel mode. Static property. |
| u32EXFlag | Value range: {0, 1, 2}. 0: no extension. 1: 8-bit samples are extended to 16 bits (only valid when the Audio Input sampling accuracy is 8 bits). 2: 24-bit samples are cropped to 16 bits, which may be used in the external codec scenario. Static property; reserved parameter, generally set to 1. |
| u32FrmNum | Number of frames in the buffer. |
| u32PtNumPerFrm | Number of sample points per frame. Value range: G711, G726, and ADPCM_DVI4 use 160, 320, or 480. |
| u32ChnCnt | Number of channels supported. Values: 1, 2, 4, 8, 16 (input and output each support up to 2 channels). |
| u32ClkSel | Whether AI and AO use the same clock source. |
| enI2sType | Configure I2S interface device type; Cvitek only supports master mode. |
【Note】
The number of sampling points per frame u32PtNumPerFrm and sampling rate enSamplerate determine the frequency of hardware interrupt.
If the frequency is too high, it degrades system performance and affects other services.
It is suggested that the values of these two parameters satisfy the formula: (u32PtNumPerFrm * 1000) / enSamplerate >= 10.
For example, when the sampling rate is 16000Hz, it is recommended to set the number of sampling points greater than or equal to 160.
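The guideline above can be checked with a few lines of C. This is a sketch only: `frame_duration_ms` and `frame_size_is_recommended` are hypothetical helper names, not SDK functions, and `uint32_t` stands in for `CVI_U32`. Because the AUDIO_SAMPLE_RATE_E values equal the rate in Hz, an enSamplerate value can be passed directly as the second argument.

```c
#include <stdint.h>

/* Duration of one frame in milliseconds, following the formula in the note:
 * (u32PtNumPerFrm * 1000) / enSamplerate. Integer division, as in the formula. */
static uint32_t frame_duration_ms(uint32_t pt_num_per_frm, uint32_t sample_rate_hz)
{
    if (sample_rate_hz == 0)
        return 0; /* avoid dividing by zero on an unset rate */
    return (pt_num_per_frm * 1000u) / sample_rate_hz;
}

/* Returns 1 when the pair satisfies the suggested ">= 10" guideline, else 0. */
static int frame_size_is_recommended(uint32_t pt_num_per_frm, uint32_t sample_rate_hz)
{
    return frame_duration_ms(pt_num_per_frm, sample_rate_hz) >= 10u;
}
```

With 160 points per frame at 16000 Hz the frame lasts 10 ms and passes the check; 80 points at the same rate lasts 5 ms and does not.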
【Related Data Type and Interface】
CVI_AI_SetPubAttr
CVI_AO_SetPubAttr
10.4.1.27. AI_CHN_PARAM_S¶
【Description】
Define channel parameter structure.
【Syntax】
typedef struct _AI_CHN_PARAM_S {
CVI_U32 u32UsrFrmDepth; /* user frame depth */
} AI_CHN_PARAM_S;
【Member】
u32UsrFrmDepth: Audio frame block depth.
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.28. AUDIO_FRAME_S¶
【Description】
Define audio frame data structure.
【Syntax】
typedef struct _AUDIO_FRAME_S {
AUDIO_BIT_WIDTH_E enBitwidth;/*audio frame bitwidth*/
AUDIO_SOUND_MODE_E enSoundmode;/*audio frame mono or stereo mode*/
CVI_U8 * u64VirAddr[2]; /*audio frame vir addr*/
CVI_U64 u64PhyAddr[2]; /*audio frame phy addr*/
CVI_U64 u64TimeStamp; /*audio frame timestamp*/
CVI_U32 u32Seq; /*audio frame seq*/
CVI_U32 u32Len; /*data length per channel in frame*/
CVI_U32 u32PoolId[2]; /*audio frame pool id*/
} AUDIO_FRAME_S;
【Member】
| Member | Description |
|---|---|
| enBitwidth | Audio sampling accuracy. |
| enSoundmode | Audio channel mode. |
| u64VirAddr[2] | Audio frame data virtual address. |
| u64PhyAddr[2] | Audio frame data physical address. Not supported at present. |
| u64TimeStamp | Audio frame timestamp. The unit is μs. |
| u32Seq | Audio frame sequence. |
| u32Len | Audio frame length: the number of samples in a single channel, where 1 sample = 2 bytes. Example: with AIO_ATTR_S set to u32FrmNum = 320 and u32ChnCnt = 2, u32Len = 320 (samples/channel), and the u64VirAddr[0] buffer holds (u32Len x u32ChnCnt x 2) bytes. |
| u32PoolId[2] | Audio frame block pool ID. |
【Note】
u32Len (audio frame length) refers to the data length of a single channel.
For each channel, u64VirAddr[0] holds (u32Len x bytes_per_sample) bytes.
The default channel mode for mono audio is left channel, and the data is arranged as [Left, Left, Left, Left, Left, …].
Stereo data is arranged as [L, R, L, R, L, R, …] where L stands for the left channel and R stands for the right channel.
(Note: L represents a single sample in the left channel, and R represents a single sample in the right channel.)
u64VirAddr[1] holds no data and is available for custom use.
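The layout described above can be illustrated with a short sketch that computes the byte size of an interleaved frame buffer and splits 16-bit stereo data into separate channels. The function names are hypothetical, not SDK APIs. The sketch assumes 16-bit samples (2 bytes each), and `interleaved` stands for the data at u64VirAddr[0].

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes in an interleaved frame buffer: samples per channel x channel count x
 * bytes per sample (2 for 16-bit audio). */
static size_t frame_buffer_bytes(uint32_t len_samples, uint32_t chn_cnt,
                                 uint32_t bytes_per_sample)
{
    return (size_t)len_samples * chn_cnt * bytes_per_sample;
}

/* Split [L, R, L, R, ...] 16-bit data into separate left/right arrays.
 * len_samples is the per-channel sample count (u32Len); left and right
 * must each have room for len_samples entries. */
static void split_stereo_s16(const int16_t *interleaved, uint32_t len_samples,
                             int16_t *left, int16_t *right)
{
    for (uint32_t i = 0; i < len_samples; i++) {
        left[i] = interleaved[2 * i];      /* even positions: left channel */
        right[i] = interleaved[2 * i + 1]; /* odd positions: right channel */
    }
}
```

For a stereo frame with u32Len = 320, `frame_buffer_bytes(320, 2, 2)` gives 1280 bytes, matching the (u32Len x u32ChnCnt x 2) expression in the member table.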
【Related Data Type and Interface】
None.
10.4.1.29. AEC_FRAME_S¶
【Description】
Define the information structure of echo cancellation reference frame.
【Syntax】
typedef struct _AEC_FRAME_S {
AUDIO_FRAME_S stRefFrame; /* aec reference audio frame */
CVI_BOOL bValid; /* whether frame is valid */
CVI_BOOL bSysBind; /* whether is sysbind */
}AEC_FRAME_S;
【Member】
| Member | Description |
|---|---|
| stRefFrame | Echo cancellation reference frame structure. |
| bValid | Reference frame valid flag. Value range: CVI_TRUE: the reference frame is valid. CVI_FALSE: the reference frame is invalid and cannot be used for echo cancellation. |
| bSysBind | Whether Audio Input and AENC are system bound. |
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.30. AUDIO_AGC_CONFIG_S¶
【Description】
Define the audio AGC configuration information structure.
【Syntax】
typedef struct _AUDIO_AGC_CONFIG_S {
/* the max boost gain for AGC release processing, [0, 3] */
/* para_obj.para_agc_max_gain = 1; */
CVI_S8 para_agc_max_gain;
/* the gain level of target high of AGC, [0, 36] */
/* para_obj.para_agc_target_high = 2; */
CVI_S8 para_agc_target_high;
/* the gain level of target low of AGC, [0, 36] */
/* para_obj.para_agc_target_low = 6; */
CVI_S8 para_agc_target_low;
/* speech-activated AGC functionality, [0, 1] */
/* para_obj.para_agc_vad_enable = 1; */
CVI_BOOL para_agc_vad_ena;
} AUDIO_AGC_CONFIG_S;
【Member】
| Member | Description |
|---|---|
| para_agc_max_gain | The maximum gain by which a signal can be amplified. |
| para_agc_target_high | AGC will reach the “Target High” level. |
| para_agc_target_low | AGC will reach the “Target Low” level. |
| para_agc_vad_ena | Speech-activated AGC uses speech VAD from NR to avoid amplifying background noise. It is recommended to turn on this function in a high SNR environment, and it is better to turn it off in a medium / low SNR environment for better voice quality. |
【Note】
【Related Data Type and Interface】
AI_VQE_CONFIG_S
10.4.1.31. AI_AEC_CONFIG_S¶
【Description】
Define the audio echo cancellation configuration information structure.
【Syntax】
typedef struct _AI_AEC_CONFIG_S {
CVI_U16 para_aec_filter_len; /* the filter length of AEC, [1, 13] */
CVI_U16 para_aes_std_thrd; /* the threshold of STD/DTD, [0, 39] */
CVI_U16 para_aes_supp_coeff; /* the residual echo suppression level in AES, [0, 100] */
} AI_AEC_CONFIG_S;
【Member】
| Member | Description |
|---|---|
| para_aec_filter_len | The length of the adaptive filter |
| para_aes_std_thrd | Residual Echo Suppression Threshold |
| para_aes_supp_coeff | Residual Echo Suppression Level |
【Note】
When user mode is on, the other parameters take effect.
Otherwise, the default values of the working mode given by enWorkstate in AI_VQE_CONFIG_S / AI_TALKVQE_CONFIG_S are used.
Advanced parameters are checked for correctness only when user mode is enabled, and the configuration succeeds only if they are correct.
【Related Data Type and Interface】
AI_VQE_CONFIG_S
10.4.1.32. AUDIO_ANR_CONFIG_S¶
【Description】
Define the information structure of audio voice noise reduction function.
【Syntax】
typedef struct _AUDIO_ANR_CONFIG_S {
/* the coefficient of NR priori SNR tracking, [0, 20] */
/* para_obj.para_nr_snr_coeff = 15; */
CVI_U16 para_nr_snr_coeff;
/* the coefficient of NR noise tracking, [0, 14] */
/* para_obj.para_nr_noise_coeff = 2; */
//CVI_S8 para_nr_noise_coeff;
CVI_U16 para_nr_init_sile_time;
} AUDIO_ANR_CONFIG_S;
【Member】
| Member | Description |
|---|---|
| para_nr_snr_coeff | Signal-to-Noise Ratio (SNR) tracking coefficient. A larger value gives NR a higher noise reduction ability, but the speech signal may be more easily distorted. A smaller value makes NR suppress less noise, but gives better speech quality. |
| para_nr_noise_coeff | Noise tracking coefficient. This parameter determines the tracking speed of stationary noise. Value range: [0, 14]. 0: slowest noise tracking speed. 14: fastest noise tracking speed. |
【Note】
None.
【Related Data Type and Interface】
AI_VQE_CONFIG_S
10.4.1.33. AUDIO_DELAY_CONFIG_S¶
【Description】
Definition of Audio Signal Delay Structure.
【Syntax】
typedef struct _AUDIO_DELAY_CONFIG_S {
/* the initial filter length of linear AEC to support up for echo tail, [1, 13] */
CVI_U16 para_aec_init_filter_len;
/* the digital gain target, [1, 12] */
CVI_U16 para_dg_target;
/* the delay sample for ref signal, [1, 3000] */
CVI_U16 para_delay_sample;
} AUDIO_DELAY_CONFIG_S;
【Member】
| Member | Description |
|---|---|
| para_aec_init_filter_len | The length of the adaptive filter |
| para_dg_target | Digital gain. Value range: [1, 12]. This feature helps reduce residual echo and residual stationary noise. |
| para_delay_sample | Used to delay the reference signal. Value range: [1, 3000]. It enables AEC/AES to converge faster at the beginning of the echo. |
【Note】
None.
【Related Data Type and Interface】
10.4.1.34. VQE_WORKSTATE_E¶
【Description】
Define the working mode of voice quality enhancement.
【Syntax】
typedef enum _VQE_WORKSTATE_E {
VQE_WORKSTATE_COMMON = 0,
/* common environment, Applicable to the family of voice calls. */
VQE_WORKSTATE_MUSIC = 1,
/* music environment , Applicable to the family of music environment. */
VQE_WORKSTATE_NOISY = 2,
/* noisy environment , Applicable to the noisy voice calls. */
} VQE_WORKSTATE_E;
【Member】
| Member | Description |
|---|---|
| VQE_WORKSTATE_COMMON | Common mode. |
| VQE_WORKSTATE_MUSIC | Music mode. |
| VQE_WORKSTATE_NOISY | Noise mode. |
【Note】
None.
【Related Data Type and Interface】
AI_VQE_CONFIG_S
10.4.1.35. VQE_RECORD_TYPE¶
【Description】
Define the recording type.
【Syntax】
typedef enum _VQE_RECORD_TYPE {
VQE_RECORD_NORMAL = 0,
/* double microphone recording. */
VQE_RECORD_BUTT, /* Used for boundary checking */
} VQE_RECORD_TYPE;
【Member】
VQE_RECORD_NORMAL: Standard type.
【Note】
Cvitek only supports talk VQE, and record VQE is not used until it is customized.
【Related Data Type and Interface】
10.4.1.36. AI_TALKVQE_CONFIG_S¶
【Description】
Define the structure of audio input sound quality enhancement (Talk) configuration information.
【Syntax】
typedef struct _AI_TALKVQE_CONFIG_S {
CVI_U16 para_client_config; /* Client-specific configuration parameter */
CVI_U32 u32OpenMask; /* VQE feature enable mask */
CVI_S32 s32WorkSampleRate; /* Sample Rate: 8KHz/16KHz. Default: 8KHz */
// MIC IN VQE settings
AI_AEC_CONFIG_S stAecCfg; /* Acoustic Echo Cancellation configuration */
AUDIO_ANR_CONFIG_S stAnrCfg; /* Automatic Noise Reduction configuration */
AUDIO_AGC_CONFIG_S stAgcCfg; /* Automatic Gain Control configuration */
AUDIO_DELAY_CONFIG_S stAecDelayCfg; /* AEC delay configuration */
CVI_S32 para_notch_freq; /* User can ignore this flag */
CVI_CHAR customize[MAX_AUDIO_VQE_CUSTOMIZE_NAME]; /* Customization name */
} AI_TALKVQE_CONFIG_S;
【Member】
| Member | Description |
|---|---|
| para_client_config | Client parameter configuration. |
| u32OpenMask | Mask value enabled for each Talk Vqe function. |
| s32WorkSampleRate | Operating sampling frequency. This parameter is the working sampling rate of the internal functional algorithm. Value range: 8KHz/16KHz/48KHz. The default value is 8KHz. (48KHz for Hpf only) |
| stAecCfg | Configuration information related to echo cancellation function. |
| stAnrCfg | Configuration information related to voice noise reduction function. |
| stAgcCfg | Automatic gain control configuration information. |
| stAecDelayCfg | Configuration information related to audio signal delay. |
| para_notch_freq | Customized frequency elimination. |
| customize | Customization parameter selection. |
【Note】
Cvitek VQE supports only AGC/ANR/AEC.
For example, if RNR/EQ data is set, it will not take effect.
【Related Data Type and Interface】
None.
10.4.1.37. AI_RECORDVQE_CONFIG_S¶
【Description】
Define the structure of audio input sound quality enhancement (Record) configuration information.
【Syntax】
typedef struct _AI_RECORDVQE_CONFIG_S {
CVI_U32 u32OpenMask; /* Bitmask for enabling/disabling features */
CVI_S32 s32WorkSampleRate; /* Sample Rate: 16KHz/48KHz */
CVI_S32 s32FrameSample; /* Number of samples per frame */
CVI_S32 s32BytesPerSample; /* Number of bytes per sample */
/* VQE frame length:80-4096 */
VQE_WORKSTATE_E enWorkstate; /* Current work state of VQE */
CVI_S32 s32InChNum; /* Number of input channels */
CVI_S32 s32OutChNum; /* Number of output channels */
VQE_RECORD_TYPE enRecordType; /* Type of recording */
AUDIO_AGC_CONFIG_S stAgcCfg; /* Configuration for Automatic Gain Control (AGC) */
} AI_RECORDVQE_CONFIG_S;
【Member】
| Member | Description |
|---|---|
| u32OpenMask | Mask value enabled for each Record Vqe function. |
| s32WorkSampleRate | Operating sampling frequency. This parameter is the working sampling rate of the internal functional algorithm. Value range: 8KHz/16KHz/48KHz. The default value is 8KHz. (48KHz for Hpf only) |
| stAgcCfg | Automatic gain control configuration information. |
| enWorkstate | Working mode |
| s32InChNum | Number of input channels processed by VQE. Value range: [1, 2]. |
| s32OutChNum | Number of output channels processed by VQE. Value range: [1, 2]. |
| enRecordType | Record type |
【Note】
Cvitek VQE supports only AGC/ANR/AEC.
For example, if RNR/EQ data is set, it will not take effect.
【Related Data Type and Interface】
None.
10.4.1.38. AUDIO_STREAM_S¶
【Description】
Define audio stream structure.
【Syntax】
typedef struct _AUDIO_STREAM_S {
CVI_U8 *pStream; /* the virtual address of stream */
CVI_U32 u32PhyAddr; /* the physical address of stream */
CVI_U32 u32Len; /* stream length, in bytes */
CVI_U64 u64TimeStamp; /* frame time stamp*/
CVI_U32 u32Seq; /* frame seq,if stream is not a valid frame, u32Seq is 0*/
} AUDIO_STREAM_S;
【Member】
| Member | Description |
|---|---|
| pStream | The virtual address of the stream |
| u32PhyAddr | The physical address of the stream |
| u32Len | Audio stream length, in bytes. |
| u64TimeStamp | Audio stream timestamp |
| u32Seq | Audio stream sequence |
【Note】
None.
【Related Data Type and Interface】
CVI_AENC_GetStream
10.4.1.39. AO_CHN_STATE_S¶
【Description】
Define Audio Output Channel Data Block Status Structure.
【Syntax】
typedef struct hiAO_CHN_STATE_S {
CVI_U32 u32ChnTotalNum;
CVI_U32 u32ChnFreeNum;
CVI_U32 u32ChnBusyNum;
} AO_CHN_STATE_S;
【Member】
| Member | Description |
|---|---|
| u32ChnTotalNum | Total number of blocks in output channel. |
| u32ChnFreeNum | Available free blocks |
| u32ChnBusyNum | Occupied blocks |
【Note】
None.
【Related Data Type and Interface】
CVI_AO_QueryChnStat
10.4.1.40. AUDIO_TRACK_MODE_E¶
【Description】
Audio device channel mode type.
【Syntax】
typedef enum _AUDIO_TRACK_MODE_E {
AUDIO_TRACK_NORMAL = 0, /* Normal audio track */
AUDIO_TRACK_BOTH_LEFT = 1, /* Both channels play left audio */
AUDIO_TRACK_BOTH_RIGHT = 2, /* Both channels play right audio */
AUDIO_TRACK_EXCHANGE = 3, /* Exchange left and right audio channels */
AUDIO_TRACK_MIX = 4, /* Mix both left and right audio channels */
AUDIO_TRACK_LEFT_MUTE = 5, /* Mute left audio channel */
AUDIO_TRACK_RIGHT_MUTE = 6, /* Mute right audio channel */
AUDIO_TRACK_BOTH_MUTE = 7, /* Mute both audio channels */
AUDIO_TRACK_BUTT /* End of audio track modes */
} AUDIO_TRACK_MODE_E;
【Member】
| Member | Description |
|---|---|
| AUDIO_TRACK_NORMAL | Normal mode, no processing |
| AUDIO_TRACK_BOTH_LEFT | Both channels carry the left channel sound |
| AUDIO_TRACK_BOTH_RIGHT | Both channels carry the right channel sound |
| AUDIO_TRACK_EXCHANGE | Left and right channels are exchanged: the left channel carries the right channel sound, and the right channel carries the left channel sound |
| AUDIO_TRACK_MIX | The output of both channels is the mix of the left and right channels |
| AUDIO_TRACK_LEFT_MUTE | The left channel is muted, and the right channel plays the original right channel sound |
| AUDIO_TRACK_RIGHT_MUTE | The right channel is muted, and the left channel plays the original left channel sound |
| AUDIO_TRACK_BOTH_MUTE | Both left and right channels are muted |
【Note】
None.
【Related Data Type and Interface】
CVI_AI_SetTrackMode
CVI_AO_SetTrackMode
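The effect of each track mode can be illustrated on interleaved 16-bit stereo samples. The following is a minimal sketch, not the SDK implementation: the helper `apply_track_mode` is hypothetical, the enum is mirrored locally so the code compiles without SDK headers, and AUDIO_TRACK_MIX is assumed to average the two channels (the actual mixing rule is not specified here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the documented enum, so the sketch builds without SDK headers. */
typedef enum _AUDIO_TRACK_MODE_E {
    AUDIO_TRACK_NORMAL = 0,
    AUDIO_TRACK_BOTH_LEFT = 1,
    AUDIO_TRACK_BOTH_RIGHT = 2,
    AUDIO_TRACK_EXCHANGE = 3,
    AUDIO_TRACK_MIX = 4,
    AUDIO_TRACK_LEFT_MUTE = 5,
    AUDIO_TRACK_RIGHT_MUTE = 6,
    AUDIO_TRACK_BOTH_MUTE = 7,
    AUDIO_TRACK_BUTT
} AUDIO_TRACK_MODE_E;

/* Hypothetical helper: applies a track mode in place to interleaved stereo
 * samples (L0 R0 L1 R1 ...). Returns 0 on success, -1 on invalid arguments.
 * Assumption: MIX is modelled as the average of both channels. */
static int apply_track_mode(int16_t *pcm, size_t frames, AUDIO_TRACK_MODE_E mode)
{
    if (pcm == NULL || (int)mode < 0 || mode >= AUDIO_TRACK_BUTT)
        return -1;

    for (size_t i = 0; i < frames; i++) {
        int16_t l = pcm[2 * i];
        int16_t r = pcm[2 * i + 1];
        int16_t mix = (int16_t)(((int32_t)l + (int32_t)r) / 2);

        switch (mode) {
        case AUDIO_TRACK_BOTH_LEFT:  r = l; break;
        case AUDIO_TRACK_BOTH_RIGHT: l = r; break;
        case AUDIO_TRACK_EXCHANGE:   { int16_t t = l; l = r; r = t; } break;
        case AUDIO_TRACK_MIX:        l = mix; r = mix; break;
        case AUDIO_TRACK_LEFT_MUTE:  l = 0; break;
        case AUDIO_TRACK_RIGHT_MUTE: r = 0; break;
        case AUDIO_TRACK_BOTH_MUTE:  l = 0; r = 0; break;
        default: break; /* AUDIO_TRACK_NORMAL: no processing */
        }
        pcm[2 * i] = l;
        pcm[2 * i + 1] = r;
    }
    return 0;
}
```

On a real device the mode is applied by the driver through CVI_AI_SetTrackMode or CVI_AO_SetTrackMode; the helper only models the documented behavior.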
10.4.1.41. AUDIO_FADE_RATE_E¶
【Description】
Defines the fade-in and fade-out rate of the audio device.
【Syntax】
typedef enum _AUDIO_FADE_RATE_E {
AUDIO_FADE_RATE_NONE = 0,
AUDIO_FADE_RATE_10 = 10,
AUDIO_FADE_RATE_20 = 20,
AUDIO_FADE_RATE_30 = 30,
AUDIO_FADE_RATE_50 = 50,
AUDIO_FADE_RATE_100 = 100,
AUDIO_FADE_RATE_200 = 200,
AUDIO_FADE_RATE_BUTT = -1
} AUDIO_FADE_RATE_E;
【Member】
Member |
Description |
---|---|
AUDIO_FADE_RATE_NONE |
The volume changes immediately, with no delay between steps. |
AUDIO_FADE_RATE_10 |
The volume is raised or lowered by one step every 10 ms. |
AUDIO_FADE_RATE_20 |
The volume is raised or lowered by one step every 20 ms. |
AUDIO_FADE_RATE_30 |
The volume is raised or lowered by one step every 30 ms. |
AUDIO_FADE_RATE_50 |
The volume is raised or lowered by one step every 50 ms. |
AUDIO_FADE_RATE_100 |
The volume is raised or lowered by one step every 100 ms. |
AUDIO_FADE_RATE_200 |
The volume is raised or lowered by one step every 200 ms. |
【Note】
On Cvitek processors, the AUDIO_FADE_RATE_E parameter takes effect only when bFade in AUDIO_FADE_S is set to CVI_TRUE.
Starting from the current volume, the fade changes the volume step by step at the interval selected by AUDIO_FADE_RATE_E until the output is fully unmuted (fade-in) or fully muted (fade-out).
【Related Data Type and Interface】
None.
10.4.1.42. AUDIO_FADE_S¶
【Description】
Defines the fade-in and fade-out setting structure of the audio device.
【Syntax】
typedef struct hiAUDIO_FADE_S {
CVI_BOOL bFade;
AUDIO_FADE_RATE_E enFadeInRate;
AUDIO_FADE_RATE_E enFadeOutRate;
} AUDIO_FADE_S;
【Member】
Member |
Description |
---|---|
bFade |
Whether to turn on the fade in and fade out function. CVI_TRUE: Turn on the fading function. CVI_FALSE: Turn off the fading function. |
enFadeInRate |
Audio output device volume fade-in speed. |
enFadeOutRate |
Audio output device volume fade-out speed. |
【Note】
On Cvitek processors, enFadeInRate and enFadeOutRate take effect only when bFade in AUDIO_FADE_S is set to CVI_TRUE.
【Related Data Type and Interface】
AUDIO_FADE_RATE_E
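The relationship between bFade and the two rates can be sketched as follows. This is an illustrative sketch only: `fade_duration_ms` is a hypothetical helper, the `CVI_BOOL` definitions are stand-ins for the SDK types, and the number of volume steps left in a fade is an assumed input because the step size is not specified here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for SDK definitions (assumptions, so the sketch compiles alone). */
typedef unsigned char CVI_BOOL;
#define CVI_TRUE  1
#define CVI_FALSE 0

typedef enum _AUDIO_FADE_RATE_E {
    AUDIO_FADE_RATE_NONE = 0,
    AUDIO_FADE_RATE_10 = 10,
    AUDIO_FADE_RATE_20 = 20,
    AUDIO_FADE_RATE_30 = 30,
    AUDIO_FADE_RATE_50 = 50,
    AUDIO_FADE_RATE_100 = 100,
    AUDIO_FADE_RATE_200 = 200,
    AUDIO_FADE_RATE_BUTT = -1
} AUDIO_FADE_RATE_E;

typedef struct {
    CVI_BOOL bFade;
    AUDIO_FADE_RATE_E enFadeInRate;
    AUDIO_FADE_RATE_E enFadeOutRate;
} AUDIO_FADE_S;

/* Hypothetical helper: estimates how long a fade takes, assuming one volume
 * step per rate interval. Returns 0 when fading is disabled or has no delay. */
static int fade_duration_ms(const AUDIO_FADE_S *fade, int volume_steps, int fading_in)
{
    assert(fade != NULL);
    if (!fade->bFade || volume_steps <= 0)
        return 0; /* the rates are ignored unless bFade is CVI_TRUE */

    AUDIO_FADE_RATE_E rate = fading_in ? fade->enFadeInRate : fade->enFadeOutRate;
    if (rate == AUDIO_FADE_RATE_NONE || rate == AUDIO_FADE_RATE_BUTT)
        return 0;
    return volume_steps * (int)rate; /* enum values are the interval in ms */
}
```

With enFadeInRate set to AUDIO_FADE_RATE_20 and ten steps remaining, the helper reports a 200 ms fade-in under these assumptions.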
10.4.1.43. G726_BPS_E¶
【Description】
Defines the G.726 codec bit rate.
【Syntax】
typedef enum _G726_BPS_E {
G726_16K = 0, /* G726 16kbps, see RFC3551.txt 4.5.4 G726-16 */
G726_24K, /* G726 24kbps, see RFC3551.txt 4.5.4 G726-24 */
G726_32K, /* G726 32kbps, see RFC3551.txt 4.5.4 G726-32 */
G726_40K, /* G726 40kbps, see RFC3551.txt 4.5.4 G726-40 */
MEDIA_G726_16K, /* G726 16kbps for ASF ... */
MEDIA_G726_24K, /* G726 24kbps for ASF ... */
MEDIA_G726_32K, /* G726 32kbps for ASF ... */
MEDIA_G726_40K, /* G726 40kbps for ASF ... */
G726_BUTT, /* Used for boundary checking */
} G726_BPS_E;
【Member】
Member |
Description |
---|---|
G726_16K |
16kbps G.726. |
G726_24K |
24kbps G.726. |
G726_32K |
32kbps G.726. |
G726_40K |
40kbps G.726. |
MEDIA_G726_16K |
G726 16kbps for ASF. |
MEDIA_G726_24K |
G726 24kbps for ASF. |
MEDIA_G726_32K |
G726 32kbps for ASF. |
MEDIA_G726_40K |
G726 40kbps for ASF. |
【Note】
None.
【Related Data Type and Interface】
None.
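Because G.726 operates on 8 kHz audio, each bit rate corresponds to a fixed number of bits per sample (16, 24, 32, and 40 kbit/s give 2, 3, 4, and 5 bits). The sketch below uses this to estimate the encoded size of a frame. The helper names are hypothetical, the enum is mirrored locally, and the MEDIA_ variants are assumed to use the same bits per sample as their plain counterparts.

```c
#include <assert.h>

/* Local mirror of the documented enum, so the sketch builds without SDK headers. */
typedef enum _G726_BPS_E {
    G726_16K = 0,
    G726_24K,
    G726_32K,
    G726_40K,
    MEDIA_G726_16K,
    MEDIA_G726_24K,
    MEDIA_G726_32K,
    MEDIA_G726_40K,
    G726_BUTT
} G726_BPS_E;

/* Hypothetical helper: bits per 8 kHz sample for a G.726 rate, or -1 if invalid. */
static int g726_bits_per_sample(G726_BPS_E bps)
{
    switch (bps) {
    case G726_16K: case MEDIA_G726_16K: return 2;
    case G726_24K: case MEDIA_G726_24K: return 3;
    case G726_32K: case MEDIA_G726_32K: return 4;
    case G726_40K: case MEDIA_G726_40K: return 5;
    default: return -1;
    }
}

/* Hypothetical helper: encoded payload size in bytes for `samples` input
 * samples, rounded up to a whole byte. Returns -1 on invalid arguments. */
static int g726_encoded_bytes(G726_BPS_E bps, int samples)
{
    int bits = g726_bits_per_sample(bps);
    if (bits < 0 || samples < 0)
        return -1;
    return (samples * bits + 7) / 8;
}
```

For example, a 20 ms frame of 160 samples encoded at G726_32K occupies 80 bytes.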
10.4.1.44. ADPCM_TYPE_E¶
【Description】
Define ADPCM codec type.
【Syntax】
typedef enum _ADPCM_TYPE_E {
/* DVI4 differs in three respects from IMA ADPCM; see RFC3551.txt 4.5.1 DVI4 */
ADPCM_TYPE_DVI4 = 0, /* 32kbps ADPCM(DVI4) for RTP */
ADPCM_TYPE_IMA, /* 32kbps ADPCM(IMA), NOTICE: point num must be 161/241/321/481 */
ADPCM_TYPE_ORG_DVI4, /* Original DVI4 ADPCM type */
ADPCM_TYPE_BUTT, /* Used for boundary checking */
} ADPCM_TYPE_E;
【Member】
Member |
Description |
---|---|
ADPCM_TYPE_DVI4 |
32kbit/s ADPCM(DVI4). |
ADPCM_TYPE_IMA |
32kbit/s ADPCM(IMA). |
ADPCM_TYPE_ORG_DVI4 |
32kbit/s ADPCM(ORG_DVI4). |
【Note】
None.
【Related Data Type and Interface】
None.
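The syntax comment above restricts the point number (samples per frame) for ADPCM_TYPE_IMA to 161, 241, 321, or 481. A minimal validation sketch follows. The helper `adpcm_point_num_ok` is hypothetical, and it assumes the other ADPCM types carry no such restriction, since none is documented here.

```c
#include <assert.h>

/* Local mirror of the documented enum, so the sketch builds without SDK headers. */
typedef enum _ADPCM_TYPE_E {
    ADPCM_TYPE_DVI4 = 0,
    ADPCM_TYPE_IMA,
    ADPCM_TYPE_ORG_DVI4,
    ADPCM_TYPE_BUTT
} ADPCM_TYPE_E;

/* Hypothetical helper: returns 1 if point_num is acceptable for the given
 * ADPCM type, 0 otherwise. Only the IMA constraint is taken from the source. */
static int adpcm_point_num_ok(ADPCM_TYPE_E type, unsigned point_num)
{
    if ((int)type < 0 || type >= ADPCM_TYPE_BUTT)
        return 0;
    if (type != ADPCM_TYPE_IMA)
        return 1; /* assumption: no constraint documented for other types */
    return point_num == 161 || point_num == 241 ||
           point_num == 321 || point_num == 481;
}
```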
10.4.1.45. AUDIO_SAVE_FILE_INFO_S¶
【Description】
Defines the configuration parameters for saving audio files.
【Syntax】
typedef struct _AUDIO_SAVE_FILE_INFO_S {
CVI_BOOL bCfg; /* Configuration flag (TRUE/FALSE) */
CVI_CHAR aFilePath[MAX_AUDIO_FILE_PATH_LEN]; /* File path where the audio is saved */
CVI_CHAR aFileName[MAX_AUDIO_FILE_NAME_LEN]; /* Name of the saved audio file */
CVI_U32 u32FileSize; /* Size of the file in KB */
} AUDIO_SAVE_FILE_INFO_S;
【Members】
Member Name |
Description |
---|---|
bCfg |
Configuration flag indicating whether file saving is enabled. |
aFilePath |
File path specifying the directory where the audio file will be saved. |
aFileName |
File name specifying the name of the saved audio file. |
u32FileSize |
File size specifying the maximum size of the saved audio file in kilobytes (KB). |
【Notes】
Ensure that the file path and file name do not exceed the defined maximum lengths to avoid buffer overflow issues.
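One way to respect the length limits is to check both strings before copying them into the structure. The sketch below is illustrative: `fill_save_file_info` is a hypothetical helper, the basic types are stand-ins, and both maximum lengths are assumed to be 256 because their real values are not given here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed values and stand-in types; the real ones come from the SDK headers. */
#define MAX_AUDIO_FILE_PATH_LEN 256
#define MAX_AUDIO_FILE_NAME_LEN 256
typedef unsigned char CVI_BOOL;
typedef char CVI_CHAR;
typedef uint32_t CVI_U32;
#define CVI_TRUE 1

typedef struct _AUDIO_SAVE_FILE_INFO_S {
    CVI_BOOL bCfg;
    CVI_CHAR aFilePath[MAX_AUDIO_FILE_PATH_LEN];
    CVI_CHAR aFileName[MAX_AUDIO_FILE_NAME_LEN];
    CVI_U32 u32FileSize; /* in KB */
} AUDIO_SAVE_FILE_INFO_S;

/* Hypothetical helper: fills the structure only if both strings fit,
 * including their terminating NUL. Returns 0 on success, -1 otherwise. */
static int fill_save_file_info(AUDIO_SAVE_FILE_INFO_S *info, const char *path,
                               const char *name, CVI_U32 size_kb)
{
    if (info == NULL || path == NULL || name == NULL)
        return -1;

    size_t path_len = strlen(path);
    size_t name_len = strlen(name);
    if (path_len >= sizeof(info->aFilePath) || name_len >= sizeof(info->aFileName))
        return -1; /* would overflow the fixed-size arrays */

    memset(info, 0, sizeof(*info));
    memcpy(info->aFilePath, path, path_len + 1);
    memcpy(info->aFileName, name, name_len + 1);
    info->bCfg = CVI_TRUE;
    info->u32FileSize = size_kb;
    return 0;
}
```

Rejecting an oversized string, rather than truncating it, avoids silently saving to an unintended path.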
【Related Data Types and Interfaces】
10.4.1.46. CVI_HPF_CONFIG_S¶
【Description】
Defines the configuration parameters for a high-pass filter (HPF).
【Syntax】
typedef struct _CVI_HPF_CONFIG_S {
int type; /* HPF filter type */
float f0; /* cut-off frequency */
float Q; /* Q factor */
float gainDb; /* gain in dB */
} CVI_HPF_CONFIG_S;
【Members】
Member Name |
Description |
---|---|
type |
Filter type identifier, used to distinguish between different high-pass filter designs or implementations. |
f0 |
Cut-off frequency (Hz), defines the point where the high-pass filter begins to attenuate signals below this frequency. |
Q |
Quality factor, the ratio of the filter's center frequency to its bandwidth; it determines the selectivity of the filter. |
gainDb |
Gain (in decibels), specifies the level of gain or attenuation at the pass frequency. |
【Notes】
None.
【Related Data Types and Interfaces】
None.
10.4.1.47. CVI_EQ_CONFIG_S¶
【Description】
Defines the configuration parameters for an equalizer (EQ).
【Syntax】
typedef struct _CVI_EQ_CONFIG_S {
int bandIdx; /* Index of the EQ band */
uint32_t freq; /* Frequency in Hz */
float QValue; /* Quality factor of the EQ band */
float gainDb; /* Gain in decibels for the EQ band */
} CVI_EQ_CONFIG_S;
【Members】
Member Name |
Description |
---|---|
bandIdx |
Index indicating the specific band of the equalizer. |
freq |
Center frequency of the band, measured in Hertz (Hz). |
QValue |
Quality factor, the ratio of the band's center frequency to its bandwidth; it determines the selectivity of the filter. |
gainDb |
Gain, measured in decibels (dB), specifies the level of gain or attenuation for the band. |
【Notes】
None.
【Related Data Types and Interfaces】
None.
10.4.1.48. CVI_DRC_LIMITER_PARAM¶
【Description】
Defines the configuration parameters for the limiter stage of dynamic range compression (DRC).
【Syntax】
typedef struct _CVI_DRC_LIMITER_PARAM {
uint32_t attackTimeMs; /* Attack time in milliseconds */
uint32_t releaseTimeMs; /* Release time in milliseconds */
float thresholdDb; /* Threshold level in decibels */
float postGain; /* Post-gain in decibels */
} CVI_DRC_LIMITER_PARAM;
【Members】
Member Name |
Description |
---|---|
attackTimeMs |
Attack time in milliseconds (ms), indicating the time taken for the limiter to start acting after the signal exceeds the threshold. |
releaseTimeMs |
Release time in milliseconds (ms), indicating the time taken for the limiter to stop acting after the signal falls below the threshold. |
thresholdDb |
Threshold level in decibels (dB), indicating the signal level at which limiting begins. |
postGain |
Post-gain in decibels (dB), indicating the gain adjustment applied to the signal after limiting. |
【Notes】
None.
【Related Data Types and Interfaces】
None.
10.4.1.49. CVI_DRC_EXPANDER_PARAM¶
【Description】
Defines the configuration parameters for the expander stage of dynamic range compression (DRC).
【Syntax】
typedef struct _CVI_DRC_EXPANDER_PARAM {
uint32_t attackTimeMs; /* Attack time in milliseconds */
uint32_t releaseTimeMs; /* Release time in milliseconds */
uint32_t holdTimeMs; /* Hold time in milliseconds */
uint16_t ratio; /* Expansion ratio */
float thresholdDb; /* Threshold level in decibels */
float minDb; /* Minimum level in decibels */
} CVI_DRC_EXPANDER_PARAM;
【Members】
Member Name |
Description |
---|---|
attackTimeMs |
Attack time in milliseconds (ms), indicating the time taken for the expander to start acting after the signal exceeds the threshold. |
releaseTimeMs |
Release time in milliseconds (ms), indicating the time taken for the expander to stop acting after the signal falls below the threshold. |
holdTimeMs |
Hold time in milliseconds (ms), indicating the time the expander remains active before releasing. |
ratio |
Expansion ratio, indicating the ratio between the input signal and the output signal. |
thresholdDb |
Threshold level in decibels (dB), indicating the signal level at which expansion begins. |
minDb |
Minimum level in decibels (dB), indicating the minimum output level of the expander. |
【Notes】
None.
【Related Data Types and Interfaces】
None.
10.4.1.50. CVI_DRC_COMPRESSOR_PARAM¶
【Description】
Defines the configuration parameters for the compressor stage of dynamic range compression (DRC).
【Syntax】
typedef struct _CVI_DRC_COMPRESSOR_PARAM {
uint32_t attackTimeMs; /* Attack time in milliseconds */
uint32_t releaseTimeMs; /* Release time in milliseconds */
uint16_t ratio; /* Compression ratio */
float thresholdDb; /* Threshold level in decibels */
} CVI_DRC_COMPRESSOR_PARAM;
【Members】
Member Name |
Description |
---|---|
attackTimeMs |
Attack time in milliseconds (ms), indicating the time taken for the compressor to start acting after the signal exceeds the threshold. |
releaseTimeMs |
Release time in milliseconds (ms), indicating the time taken for the compressor to stop acting after the signal falls below the threshold. |
ratio |
Compression ratio, indicating the ratio between the input signal and the output signal. |
thresholdDb |
Threshold level in decibels (dB), indicating the signal level at which compression begins. |
【Notes】
None.
【Related Data Types and Interfaces】
None.
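How thresholdDb and ratio interact can be illustrated with the textbook hard-knee static curve of a compressor: below the threshold the level is unchanged, and above it each `ratio` dB of input yields 1 dB of output. This is a sketch of the general technique under that assumption, not the SDK's internal algorithm; `compressor_static_out_db` is a hypothetical helper, and attack and release smoothing are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the documented structure, so the sketch builds alone. */
typedef struct _CVI_DRC_COMPRESSOR_PARAM {
    uint32_t attackTimeMs;
    uint32_t releaseTimeMs;
    uint16_t ratio;
    float thresholdDb;
} CVI_DRC_COMPRESSOR_PARAM;

/* Hypothetical helper: steady-state output level in dB for a given input
 * level in dB, using a hard-knee curve (assumption). */
static float compressor_static_out_db(const CVI_DRC_COMPRESSOR_PARAM *p, float in_db)
{
    assert(p != NULL);
    if (p->ratio == 0 || in_db <= p->thresholdDb)
        return in_db; /* below threshold (or invalid ratio): pass through */
    return p->thresholdDb + (in_db - p->thresholdDb) / (float)p->ratio;
}
```

With a threshold of -20 dB and a ratio of 4, an input 12 dB above the threshold comes out 3 dB above it.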
10.4.1.51. AUDIO_SPK_EQ_CONFIG_S¶
【Description】
Defines the configuration parameters for the speaker equalizer (EQ).
【Syntax】
typedef struct _AUDIO_SPK_EQ_CONFIG_S {
CVI_U16 para_spk_eq_nband; /* Number of EQ bands */
CVI_U16 para_spk_eq_freq[5]; /* EQ band frequencies */
CVI_U16 para_spk_eq_gain[5]; /* EQ band gains */
CVI_U16 para_spk_eq_qfactor[5]; /* EQ band Q factors */
} AUDIO_SPK_EQ_CONFIG_S;
【Members】
Member Name |
Description |
---|---|
para_spk_eq_nband |
Number of EQ bands. |
para_spk_eq_freq |
Array of EQ band center frequencies. |
para_spk_eq_gain |
Array of EQ band gains. |
para_spk_eq_qfactor |
Array of EQ band Q factors. |
【Notes】
None.
【Related Data Types and Interfaces】
None.
10.4.1.52. AO_VQE_CONFIG_S¶
【Description】
Defines the configuration parameters for audio output (AO) voice quality enhancement (VQE).
【Syntax】
typedef struct _AO_VQE_CONFIG_S {
CVI_U32 u32OpenMask; /* Open mask for VQE modules */
CVI_S32 s32WorkSampleRate; /* Working sample rate */
CVI_S32 s32channels; /* Number of channels */
/* Sample Rate: 8KHz/16KHz default: 8KHz*/
AUDIO_SPK_AGC_CONFIG_S stAgcCfg; /* AGC configuration */
AUDIO_SPK_EQ_CONFIG_S stEqCfg; /* EQ configuration */
CVI_HPF_CONFIG_S stHpfParam; /* HPF configuration */
CVI_EQ_CONFIG_S stEqParam; /* EQ configuration */
CVI_DRC_COMPRESSOR_PARAM stDrcCompressor; /* DRC compressor configuration */
CVI_DRC_LIMITER_PARAM stDrcLimiter; /* DRC limiter configuration */
CVI_DRC_EXPANDER_PARAM stDrcExpander; /* DRC expander configuration */
} AO_VQE_CONFIG_S;
【Members】
Member Name |
Description |
---|---|
u32OpenMask |
Bit mask that selects which VQE modules are enabled. |
s32WorkSampleRate |
Working sample rate in Hz. |
s32channels |
Number of channels. |
stAgcCfg |
Speaker automatic gain control (AGC) configuration parameters (AUDIO_SPK_AGC_CONFIG_S). |
stEqCfg |
Speaker equalizer (EQ) configuration parameters (AUDIO_SPK_EQ_CONFIG_S). |
stHpfParam |
High-pass filter (HPF) configuration parameters (CVI_HPF_CONFIG_S). |
stEqParam |
Equalizer (EQ) band configuration parameters (CVI_EQ_CONFIG_S). |
stDrcCompressor |
DRC compressor configuration parameters (CVI_DRC_COMPRESSOR_PARAM). |
stDrcLimiter |
DRC limiter configuration parameters (CVI_DRC_LIMITER_PARAM). |
stDrcExpander |
DRC expander configuration parameters (CVI_DRC_EXPANDER_PARAM). |
【Notes】
None.
【Related Data Types and Interfaces】
None.
10.4.1.53. HPF_FILTER_TYPE¶
【Description】
Defines the enumeration of filter types. Despite the HPF prefix in its name, the enumeration also covers low-pass, shelf, and peak filters.
【Syntax】
typedef enum {
E_FILTER_LPF, /* Low-pass filter */
E_FILTER_HPF, /* High-pass filter */
E_FILTER_LSF, /* Low-shelf filter */
E_FILTER_HSF, /* High-shelf filter */
E_FILTER_PEF, /* Peak filter */
E_FILTER_MAX, /* Maximum filter type */
} HPF_FILTER_TYPE;
【Members】
Enumeration Value |
Description |
---|---|
E_FILTER_LPF |
Low-pass filter. |
E_FILTER_HPF |
High-pass filter. |
E_FILTER_LSF |
Low-shelf filter. |
E_FILTER_HSF |
High-shelf filter. |
E_FILTER_PEF |
Peak filter. |
E_FILTER_MAX |
End marker of the enumeration, used for boundary checking. |
【Notes】
None.
【Related Data Types and Interfaces】
None.
10.4.1.54. AUDIO_SPK_AGC_CONFIG_S¶
【Description】
Defines the configuration parameters for the speaker automatic gain control (AGC).
【Syntax】
typedef struct _AUDIO_SPK_AGC_CONFIG_S {
CVI_S8 para_agc_max_gain; /* the max boost gain for AGC release processing, [0, 3] */
CVI_S8 para_agc_target_high; /* the gain level of target high of AGC, [0, 36] */
CVI_S8 para_agc_target_low; /* the gain level of target low of AGC, [0, 36] */
} AUDIO_SPK_AGC_CONFIG_S;
【Members】
Member Name |
Description |
---|---|
para_agc_max_gain |
The max boost gain for AGC release processing, range [0, 3]. |
para_agc_target_high |
The gain level of target high of AGC, range [0, 36]. |
para_agc_target_low |
The gain level of target low of AGC, range [0, 36]. |
【Notes】
None.
【Related Data Types and Interfaces】
None.
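The documented ranges can be checked before the structure is passed to the SDK. The following is a minimal sketch; `spk_agc_config_valid` is a hypothetical helper, and `CVI_S8` is assumed to be a signed 8-bit integer.

```c
#include <assert.h>
#include <stddef.h>

typedef signed char CVI_S8; /* stand-in for the SDK type (assumption) */

/* Local mirror of the documented structure, so the sketch builds alone. */
typedef struct _AUDIO_SPK_AGC_CONFIG_S {
    CVI_S8 para_agc_max_gain;    /* [0, 3] */
    CVI_S8 para_agc_target_high; /* [0, 36] */
    CVI_S8 para_agc_target_low;  /* [0, 36] */
} AUDIO_SPK_AGC_CONFIG_S;

static int in_range(int value, int lo, int hi)
{
    return value >= lo && value <= hi;
}

/* Hypothetical helper: returns 1 if every member lies in its documented
 * range, 0 otherwise. No relation between high and low targets is assumed. */
static int spk_agc_config_valid(const AUDIO_SPK_AGC_CONFIG_S *cfg)
{
    if (cfg == NULL)
        return 0;
    return in_range(cfg->para_agc_max_gain, 0, 3) &&
           in_range(cfg->para_agc_target_high, 0, 36) &&
           in_range(cfg->para_agc_target_low, 0, 36);
}
```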
10.4.1.55. AUDIO_FILE_STATUS_S¶
【Description】
Define the audio file save status structure.
【Syntax】
typedef struct _AUDIO_FILE_STATUS_S {
CVI_BOOL bSaving; /* Whether the file is saving or not */
} AUDIO_FILE_STATUS_S;
【Member】
Member |
Description |
---|---|
bSaving |
File saving status. CVI_TRUE: a file is being saved; CVI_FALSE: no file is being saved. |
【Note】
None.
【Related Data Type and Interface】
None.
10.4.1.56. VQE_MODULE_CONFIG_S¶
【Description】
Define the configuration information structure of voice quality enhancement and resampling module.
【Syntax】
typedef struct _VQE_MODULE_CONFIG_S {
CVI_VOID *pHandle; /* Handle of the VQE module */
} VQE_MODULE_CONFIG_S;
【Member】
Member |
Description |
---|---|
pHandle |
Registration handle of the module. |
【Note】
The registration handle of each voice quality enhancement or resampling module is obtained by calling that module's handle acquisition interface.
【Related Data Type and Interface】
None.
10.4.1.57. AUDIO_VQE_REGISTER_S¶
【Description】
Define the registration structure of the voice quality enhancement and resampling modules.
【Syntax】
typedef struct _AUDIO_VQE_REGISTER_S {
VQE_MODULE_CONFIG_S stResModCfg; /* Configuration for the Resample module */
VQE_MODULE_CONFIG_S stHpfModCfg; /* Configuration for the High Pass Filter module */
VQE_MODULE_CONFIG_S stHdrModCfg; /* Configuration for the HDR module */
VQE_MODULE_CONFIG_S stGainModCfg; /* Configuration for the Gain module */
// Record VQE
VQE_MODULE_CONFIG_S stRecordModCfg; /* Configuration for the Record VQE module */
// Talk VQE
VQE_MODULE_CONFIG_S stAecModCfg; /* Configuration for the Acoustic Echo Cancellation module */
VQE_MODULE_CONFIG_S stAnrModCfg; /* Configuration for the Automatic Noise Reduction module */
VQE_MODULE_CONFIG_S stAgcModCfg; /* Configuration for the Automatic Gain Control module */
VQE_MODULE_CONFIG_S stEqModCfg; /* Configuration for the Equalizer module */
// CviFi VQE
VQE_MODULE_CONFIG_S stRnrModCfg; /* Configuration for the Residual Noise Reduction module */
VQE_MODULE_CONFIG_S stDrcModCfg; /* Configuration for the Dynamic Range Compression module */
VQE_MODULE_CONFIG_S stPeqModCfg; /* Configuration for the Parametric Equalizer module */
} AUDIO_VQE_REGISTER_S;
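【Member】
Member |
Description |
---|---|
stResModCfg |
Registration configuration of the Resample module. |
stHpfModCfg |
Registration configuration of the High Pass Filter module. |
stHdrModCfg |
Registration configuration of the HDR module. |
stGainModCfg |
Registration configuration of the Gain module. |
stRecordModCfg |
Registration configuration of the Record VQE module. |
stAecModCfg |
Registration configuration of the Acoustic Echo Cancellation module. |
stAnrModCfg |
Registration configuration of the Automatic Noise Reduction module. |
stAgcModCfg |
Registration configuration of the Automatic Gain Control module. |
stEqModCfg |
Registration configuration of the Equalizer module. |
stRnrModCfg |
Registration configuration of the Residual Noise Reduction module. |
stDrcModCfg |
Registration configuration of the Dynamic Range Compression module. |
stPeqModCfg |
Registration configuration of the Parametric Equalizer module. |
【Note】
Each member is a VQE_MODULE_CONFIG_S whose pHandle holds the registration handle of the module.
Currently, only Talk VQE on the audio uplink (after voice recording) is supported.
Other VQE types are not supported.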
【Related Data Type and Interface】
None.
10.4.2. Audio Encoding¶
The data types and data structures related to audio encoding are defined as follows:
AENC_MAX_CHN_NUM: Define the maximum number of audio encoding channels.
AENC_ATTR_G711_S: Define G.711 encoding protocol attribute structure.
AENC_ATTR_G726_S: Define G.726 encoding protocol attribute structure.
AENC_ATTR_ADPCM_S: Define ADPCM encoding protocol attribute structure.
AENC_ATTR_LPCM_S: Define LPCM encoding protocol attribute structure.
AENC_CHN_ATTR_S: Defines the audio encoding channel attribute structure.
AAC_AENC_ENCODER_S: Defines the AAC encoder attribute structure.
10.4.2.1. AENC_MAX_CHN_NUM¶
【Description】
Define the maximum number of audio encoding channels.
【Syntax】
#define AENC_MAX_CHN_NUM 1
【Note】
None.
【Related Data Type and Interface】
ADEC_MAX_CHN_NUM
10.4.2.2. AENC_ATTR_G711_S¶
【Description】
Define G.711 encoding protocol attribute structure.
【Syntax】
typedef struct hiAENC_ATTR_G711_S
{
CVI_U32 resv;
}AENC_ATTR_G711_S;
【Member】
Member |
Description |
---|---|
resv |
not used |
【Note】
None.
【Related Data Type and Interface】
None.
10.4.2.3. AENC_ATTR_G726_S¶
【Description】
Define G.726 encoding protocol attribute structure.
【Syntax】
typedef struct _AENC_ATTR_G726_S {
G726_BPS_E enG726bps;
}AENC_ATTR_G726_S;
【Member】
Member |
Description |
---|---|
enG726bps |
G.726 protocol bitrate |
【Note】
None.
【Related Data Type and Interface】
G726_BPS_E
10.4.2.4. AENC_ATTR_ADPCM_S¶
【Description】
Define ADPCM encoding protocol attribute structure.
【Syntax】
typedef struct _AENC_ATTR_ADPCM_S {
ADPCM_TYPE_E enADPCMType;
}AENC_ATTR_ADPCM_S;
【Member】
Member |
Description |
---|---|
enADPCMType |
ADPCM type |
【Note】
None.
【Related Data Type and Interface】
ADPCM_TYPE_E
10.4.2.5. AENC_ATTR_LPCM_S¶
【Description】
Define LPCM encoding protocol attribute structure.
【Syntax】
typedef struct _AENC_ATTR_LPCM_S {
CVI_U32 resv; /* reserved */
}AENC_ATTR_LPCM_S;
【Member】
The members of this structure are not used.
【Note】
None.
【Related Data Type and Interface】
None.
10.4.2.6. AENC_CHN_ATTR_S¶
【Description】
Defines the audio encoding channel attribute structure. The definition of this structure varies slightly on different processor platforms.
【Syntax】
typedef struct _AENC_CHN_ATTR_S {
PAYLOAD_TYPE_E enType;
CVI_U32 u32PtNumPerFrm;
CVI_U32 u32BufSize;
CVI_VOID *pValue;
CVI_BOOL bFileDbgMode;
}AENC_CHN_ATTR_S;
【Member】
Member |
Description |
---|---|
enType |
Audio encoding protocol type. Static property. |
u32PtNumPerFrm |
Frame length of the audio encoding protocol. |
u32BufSize |
Size of the audio encoding buffer. |
pValue |
Pointer to the attribute structure of the specific protocol. |
bFileDbgMode |
Whether file-save debug mode is enabled. |
【Note】
None.
【Related Data Type and Interface】
None.
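Because pValue is an untyped pointer, code that reads it has to check enType before casting. The sketch below shows this pattern for G.726. It is illustrative only: `aenc_g726_bps` is a hypothetical helper, the basic types are stand-ins, and the `PAYLOAD_TYPE_E` enumerators `DEMO_PT_G726` and `DEMO_PT_ADPCM` are placeholders, not the SDK's real values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for SDK definitions (assumptions, so the sketch compiles alone). */
typedef uint32_t CVI_U32;
typedef void CVI_VOID;
typedef unsigned char CVI_BOOL;
typedef enum { DEMO_PT_G726 = 0, DEMO_PT_ADPCM } PAYLOAD_TYPE_E; /* placeholders */

typedef enum _G726_BPS_E {
    G726_16K = 0, G726_24K, G726_32K, G726_40K,
    MEDIA_G726_16K, MEDIA_G726_24K, MEDIA_G726_32K, MEDIA_G726_40K,
    G726_BUTT
} G726_BPS_E;

typedef struct _AENC_ATTR_G726_S {
    G726_BPS_E enG726bps;
} AENC_ATTR_G726_S;

typedef struct _AENC_CHN_ATTR_S {
    PAYLOAD_TYPE_E enType;
    CVI_U32 u32PtNumPerFrm;
    CVI_U32 u32BufSize;
    CVI_VOID *pValue;
    CVI_BOOL bFileDbgMode;
} AENC_CHN_ATTR_S;

/* Hypothetical helper: returns the G.726 bit rate carried by pValue, or
 * G726_BUTT if the channel is not configured for G.726. */
static G726_BPS_E aenc_g726_bps(const AENC_CHN_ATTR_S *attr)
{
    if (attr == NULL || attr->enType != DEMO_PT_G726 || attr->pValue == NULL)
        return G726_BUTT;
    return ((const AENC_ATTR_G726_S *)attr->pValue)->enG726bps;
}
```

The protocol attribute structure must outlive every use of the channel attributes, since only its address is stored.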
10.4.2.7. AAC_AENC_ENCODER_S¶
【Description】
Defines AAC encoder attribute structure.
【Syntax】
typedef struct _AAC_AENC_ENCODER_S {
PAYLOAD_TYPE_E enType;
CVI_U32 u32MaxFrmLen;
CVI_CHAR aszName[17];
/* encoder type, used to print proc information */
CVI_S32 (*pfnOpenEncoder)(CVI_VOID *pEncoderAttr, CVI_VOID **ppEncoder);
/* pEncoder is the handle to control the encoder */
CVI_S32 (*pfnEncodeFrm)(CVI_VOID *pEncoder, CVI_S16 *inputdata, CVI_U8 *pu8Outbuf,
CVI_S32 s32InputSizeBytes, CVI_U32 *pu32OutLen);
CVI_S32 (*pfnCloseEncoder)(CVI_VOID *pEncoder);
} AAC_AENC_ENCODER_S;
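【Member】
Member |
Description |
---|---|
enType |
Encoding protocol type. |
u32MaxFrmLen |
Maximum frame length. |
aszName |
Encoder name, used when printing proc information. |
pfnOpenEncoder |
Function pointer to open the encoder. |
pfnEncodeFrm |
Function pointer to encode a frame. |
pfnCloseEncoder |
Function pointer to close the encoder. |
【Note】
This structure is used only to link an external AAC library.
In this SDK version, the structure is defined but not supported.
If AAC is required, refer to middleware/sample/audio/aac_sample.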
【Related Data Type and Interface】
None.
10.4.3. Audio Decoding¶
Data types and data structures related to audio decoding are defined as follows:
MAX_AUDIO_FRAME_NUM: Define the maximum number of audio decoding block frames.
ADEC_MAX_CHN_NUM: Define the maximum number of audio decoding channels.
ADEC_ATTR_G711_S: Define G.711 decoding protocol attribute structure.
ADEC_ATTR_G726_S: Define G.726 decoding protocol attribute structure.
ADEC_ATTR_ADPCM_S: Define ADPCM decoding protocol attribute structure.
ADEC_ATTR_LPCM_S: Define LPCM decoding protocol attribute structure.
ADEC_MODE_E: Define the decoding method.
ADEC_CHN_ATTR_S: Define the decoding channel attribute structure.
ADEC_DECODER_S: Define the decoder attribute structure.
10.4.3.1. MAX_AUDIO_FRAME_NUM¶
【Description】
Define the maximum number of audio decoding block frames.
【Syntax】
#define MAX_AUDIO_FRAME_NUM 300
【Note】
Currently, the number of internally cached audio frames is determined by the SDK, so this setting is not exposed to users and has no effect.
【Related Data Type and Interface】
None.
10.4.3.2. ADEC_MAX_CHN_NUM¶
【Description】
Define the maximum number of audio decoding channels.
【Syntax】
#define ADEC_MAX_CHN_NUM 1
【Note】
Currently only supports single channel encoding and decoding.
【Related Data Type and Interface】
None.
10.4.3.3. ADEC_ATTR_G711_S¶
【Description】
Define G.711 decoding protocol attribute structure.
【Syntax】
typedef struct _ADEC_ATTR_G711_S {
CVI_U32 resv;
}ADEC_ATTR_G711_S;
【Member】
The members of this structure are not used on Cvitek processors.
【Note】
None.
【Related Data Type and Interface】
None.
10.4.3.4. ADEC_ATTR_G726_S¶
【Description】
Define G.726 decoding protocol attribute structure.
【Syntax】
typedef struct _ADEC_ATTR_G726_S {
G726_BPS_E enG726bps;
}ADEC_ATTR_G726_S;
【Member】
Member |
Description |
---|---|
enG726bps |
G.726 protocol bitrate |
【Note】
None.
【Related Data Type and Interface】
G726_BPS_E
10.4.3.5. ADEC_ATTR_ADPCM_S¶
【Description】
Define ADPCM decoding protocol attribute structure.
【Syntax】
typedef struct _ADEC_ATTR_ADPCM_S {
ADPCM_TYPE_E enADPCMType;
}ADEC_ATTR_ADPCM_S;
【Member】
Member |
Description |
---|---|
enADPCMType |
ADPCM type |
【Note】
None.
【Related Data Type and Interface】
ADPCM_TYPE_E
10.4.3.6. ADEC_ATTR_LPCM_S¶
【Description】
Define LPCM decoding protocol attribute structure.
【Syntax】
typedef struct _ADEC_ATTR_LPCM_S {
CVI_U32 resv;
}ADEC_ATTR_LPCM_S;
【Member】
resv is reserved for future extension.
【Note】
None.
【Related Data Type and Interface】
None.
10.4.3.7. ADEC_MODE_E¶
【Description】
Define the decoding method.
【Syntax】
typedef enum _ADEC_MODE_E {
ADEC_MODE_PACK = 0,
ADEC_MODE_STREAM,
ADEC_MODE_BUTT
}ADEC_MODE_E;
【Member】
Member |
Description |
---|---|
ADEC_MODE_PACK |
Decode in Pack mode. |
ADEC_MODE_STREAM |
Decode in stream mode. |
【Note】
Pack mode is used when the user is certain that each stream packet holds exactly one encoded frame; the decoder then decodes the packet directly.
If the packet is not exactly one frame, decoding fails.
This mode is relatively efficient.
It can be used whenever the stream packets produced by the AENC module are undamaged.
Stream mode is used when the user cannot be certain that a stream packet holds exactly one frame; the decoder must then locate the frame boundaries and buffer the stream itself.
This mode is less efficient and is generally used when decoding a stream read from a file or when packet boundaries are uncertain.
Because encoded speech frames have a fixed length, frame boundaries are easy to determine in the stream, so pack mode is recommended.
Cvitek only supports pack mode.
If the stream boundaries are uncertain, decoding errors occur on Cvitek processors because the frames are misaligned.
【Related Data Type and Interface】
None.
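Since only pack mode is supported and speech frames have a fixed encoded length, an application that reads a stream from a file can cut it into whole frames itself before decoding. The sketch below shows one way to do this. `next_pack_frame` is a hypothetical helper and the frame length is an input the caller must know; how each frame is then passed to the decoder is outside the scope of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns a pointer to the next whole frame of
 * frame_len bytes and advances *offset, or NULL when fewer than frame_len
 * bytes remain. Bytes left over after the last whole frame indicate a
 * misaligned stream. */
static const uint8_t *next_pack_frame(const uint8_t *stream, size_t stream_len,
                                      size_t frame_len, size_t *offset)
{
    if (stream == NULL || offset == NULL || frame_len == 0)
        return NULL;
    if (*offset > stream_len || stream_len - *offset < frame_len)
        return NULL; /* not enough data for one complete frame */

    const uint8_t *frame = stream + *offset;
    *offset += frame_len;
    return frame;
}
```

Calling the helper in a loop until it returns NULL yields one packet per frame; if `*offset` is then smaller than the stream length, the trailing bytes do not form a complete frame and should not be sent to the decoder.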
10.4.3.8. ADEC_CHN_ATTR_S¶
【Description】
Define the decoding channel attribute structure.
【Syntax】
typedef struct _ADEC_CH_ATTR_S {
PAYLOAD_TYPE_E enType;
CVI_U32 u32BufSize; /*buf size[2~CVI_MAX_AUDIO_FRAME_NUM]*/
ADEC_MODE_E enMode;/*decode mode*/
/* CVI_VOID ATTRIBUTE *pValue;*/
CVI_VOID *pValue;
CVI_BOOL bFileDbgMode;
//if ao not enable
CVI_S32 s32BytesPerSample;
CVI_S32 s32frame_size; //in samples
CVI_S32 s32ChannelNums; // 1 or 2
CVI_S32 s32Sample_rate;
}ADEC_CHN_ATTR_S;
【Member】
Member |
Description |
---|---|
enType |
Audio decoding protocol type, static property. |
u32BufSize |
Audio decoding buffer size. Currently, the number of internally cached audio frames is determined by the SDK, so this setting is not exposed to users and has no effect. |
enMode |
Decoding mode, static property. Only ADEC_MODE_PACK is supported; setting ADEC_MODE_STREAM does not enable automatic frame boundary detection. |
pValue |
Pointer to the attribute structure of the specific protocol. |
bFileDbgMode |
Whether to enable file-save mode. See the cvi_sample_audio.c sample code in the SDK. This value is true by default for debugging. Set it to false in actual use to avoid the performance and memory cost of saving to disk. |
When only the ADEC module is used, without the AO module, the following members inform the ADEC module of the stream parameters. |
|
s32BytesPerSample |
Bytes per sample. For a 16-bit sample width (16 bits = 2 bytes), set this member to 2; the examples in the SDK use 2. |
s32frame_size |
Period sample size: the number of samples sent to the ADEC module each time. |
s32ChannelNums |
The number of channels (mono: 1, stereo: 2). |
s32Sample_rate |
The sampling frequency (Hz) of the stream to be decoded. |
【Note】
None.
【Related Data Type and Interface】
None.
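The four ADEC-only members together determine the size and duration of one decoded PCM frame. The sketch below works through that arithmetic. It is illustrative: the reduced structure `ADEC_PCM_PARAMS_DEMO` and both helpers are hypothetical, and s32frame_size is assumed to count samples per channel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t CVI_S32; /* stand-in for the SDK type (assumption) */

/* Hypothetical reduced structure holding only the four members discussed. */
typedef struct {
    CVI_S32 s32BytesPerSample;
    CVI_S32 s32frame_size;   /* samples per channel (assumption) */
    CVI_S32 s32ChannelNums;  /* 1 or 2 */
    CVI_S32 s32Sample_rate;  /* Hz */
} ADEC_PCM_PARAMS_DEMO;

/* Bytes of PCM in one frame, or -1 if any parameter is not positive. */
static int adec_frame_bytes(const ADEC_PCM_PARAMS_DEMO *p)
{
    if (p == NULL || p->s32BytesPerSample <= 0 || p->s32frame_size <= 0 ||
        p->s32ChannelNums <= 0)
        return -1;
    return p->s32BytesPerSample * p->s32frame_size * p->s32ChannelNums;
}

/* Duration of one frame in milliseconds, or -1 on invalid parameters. */
static int adec_frame_duration_ms(const ADEC_PCM_PARAMS_DEMO *p)
{
    if (p == NULL || p->s32frame_size <= 0 || p->s32Sample_rate <= 0)
        return -1;
    return (int)((int64_t)p->s32frame_size * 1000 / p->s32Sample_rate);
}
```

For 16-bit mono audio at 16000 Hz with 320 samples per frame, one frame is 640 bytes and lasts 20 ms.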
10.4.3.9. AUDIO_FRAME_INFO_S¶
【Description】
Define the audio frame information structure after decoding.
【Syntax】
typedef struct _AUDIO_FRAME_INFO_S {
AUDIO_FRAME_S *pstFrame;
CVI_U32 u32Id;
} AUDIO_FRAME_INFO_S;
【Member】
Member |
Description |
---|---|
pstFrame |
Audio frame pointer. |
u32Id |
Index of audio frame, range [0, 49]. |
【Note】
None.
【Related Data Type and Interface】
CVI_ADEC_GetFrame
CVI_ADEC_ReleaseFrame
10.4.3.10. ADEC_DECODER_S¶
【Description】
Define the decoder attribute structure.
【Syntax】
typedef struct _ADEC_DECODER_S {
PAYLOAD_TYPE_E enType;
CVI_CHAR aszName[17];
CVI_S32 (*pfnOpenDecoder)(CVI_VOID *pDecoderAttr, CVI_VOID **ppDecoder);
CVI_S32 (*pfnDecodeFrm)(CVI_VOID *pDecoder, CVI_U8 **pu8Inbuf, CVI_S32 *ps32LeftByte,
CVI_U16 *pu16Outbuf, CVI_U32 *pu32OutLen, CVI_U32 *pu32Chns);
CVI_S32 (*pfnGetFrmInfo)(CVI_VOID *pDecoder, CVI_VOID *pInfo);
CVI_S32 (*pfnCloseDecoder)(CVI_VOID *pDecoder);
CVI_S32 (*pfnResetDecoder)(CVI_VOID *pDecoder);
} ADEC_DECODER_S;
【Member】
Member |
Description |
---|---|
enType |
Decoding protocol type |
aszName |
Decoder name. |
pfnOpenDecoder |
Function pointer to open decoder. |
pfnDecodeFrm |
Function pointer to decode a frame. |
pfnGetFrmInfo |
Function pointer to get audio frame information. |
pfnCloseDecoder |
Function pointer to close decoder. |
pfnResetDecoder |
Function pointer to clear the buffer and reset the decoder. |
【Note】
This structure is used only to link an external AAC decoding library.
For AAC decoding, refer to middleware/sample/audio/aac_sample.
【Related Data Type and Interface】
None.