10.4. Data Types

10.4.1. Audio Input / Output

The data types and data structures related to audio input / output are defined as follows.

The following features are not currently supported.

  • AI_TALKVQE_MASK_HPF : Mask of Talk Vqe HPF function.

  • AI_TALKVQE_MASK_EQ : Mask of Talk Vqe EQ function.

  • AI_RECORDVQE_MASK_HPF : Mask of Record Vqe HPF function.

  • AI_RECORDVQE_MASK_RNR : Mask of Record Vqe RNR function.

  • AI_RECORDVQE_MASK_HDR : Mask of Record Vqe HDR function.

  • AI_RECORDVQE_MASK_DRC : Mask of Record Vqe DRC function.

  • AI_RECORDVQE_MASK_EQ : Mask of Record Vqe EQ function.

  • AO_VQE_MASK_HPF : Mask of AO Vqe HPF function.

10.4.1.1. AI_DEV_MAX_NUM

【Description】

Define the maximum number of audio input devices.

【Syntax】

#define AI_DEV_MAX_NUM      1

【Note】

Deprecated.

【Related Data Type and Interface】

None.

10.4.1.2. AO_DEV_MAX_NUM

【Description】

Define the maximum number of audio output devices.

【Syntax】

#define AO_DEV_MAX_NUM     1

【Note】

Deprecated.

【Related Data Type and Interface】

None.

10.4.1.3. CVI_AUD_MAX_CHANNEL_NUM

【Description】

Define the maximum number of channels for an audio output device.

【Syntax】

#define CVI_AUD_MAX_CHANNEL_NUM       8

【Note】

The number of audio channels is set by AIO_ATTR_S stAttr.u32ChnCnt.

The value cannot exceed CVI_AUD_MAX_CHANNEL_NUM.

【Related Data Type and Interface】

None.

10.4.1.4. AI_TALKVQE_MASK_AEC

【Description】

Mask of Talk Vqe AEC function.

【Syntax】

#define AI_TALKVQE_MASK_AEC  0x3

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.5. AI_TALKVQE_MASK_AGC

【Description】

Define the Mask of Talk Vqe AGC function.

【Syntax】

#define AI_TALKVQE_MASK_AGC  0x8

【Note】

None.

【Related Data Type and Interface】

Assign values to the member of structure u32OpenMask of AI_TALKVQE_CONFIG_S to indicate that AGC function is turned on.

For example, u32OpenMask = AI_TALKVQE_MASK_AEC | AI_TALKVQE_MASK_AGC; it indicates that AEC and AGC functions are turned on.
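
The mask combination described above can be sketched as follows. This is a minimal illustration, not SDK code: the mask values are copied from the definitions in this section, while the helper names talkvqe_build_mask and talkvqe_is_enabled are hypothetical and only show how the bits combine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask values copied from the definitions in this section. */
#define AI_TALKVQE_MASK_AEC 0x3
#define AI_TALKVQE_MASK_ANR 0x4
#define AI_TALKVQE_MASK_AGC 0x8

/* Hypothetical helper: builds a u32OpenMask value from individual switches. */
static uint32_t talkvqe_build_mask(bool aec, bool anr, bool agc)
{
    uint32_t open_mask = 0;

    if (aec)
        open_mask |= AI_TALKVQE_MASK_AEC;
    if (anr)
        open_mask |= AI_TALKVQE_MASK_ANR;
    if (agc)
        open_mask |= AI_TALKVQE_MASK_AGC;
    return open_mask;
}

/* Hypothetical helper: true when every bit of `mask` is set in `open_mask`.
 * Comparing against `mask` matters because the AEC mask spans two bits. */
static bool talkvqe_is_enabled(uint32_t open_mask, uint32_t mask)
{
    return (open_mask & mask) == mask;
}
```

The resulting value would then be assigned to the u32OpenMask member of AI_TALKVQE_CONFIG_S.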

10.4.1.6. AI_TALKVQE_MASK_ANR

【Description】

Define the Mask of talk Vqe ANR function.

【Syntax】

#define AI_TALKVQE_MASK_ANR  0x4

【Note】

None.

【Related Data Type and Interface】

Assign values to the member of structure u32OpenMask of AI_TALKVQE_CONFIG_S to indicate that ANR function is turned on.

For example, u32OpenMask = AI_TALKVQE_MASK_AEC | AI_TALKVQE_MASK_ANR; it indicates that AEC and ANR functions are turned on.

10.4.1.7. AI_RECORDVQE_MASK_AGC

【Description】

Define the Mask of record Vqe AGC function.

【Syntax】

#define AI_RECORDVQE_MASK_AGC  0x20

【Note】

None.

【Related Data Type and Interface】

Assign values to the member of structure u32OpenMask of AI_RECORDVQE_CONFIG_S to indicate that AGC function is turned on.

For example, u32OpenMask = AI_RECORDVQE_MASK_HPF | AI_RECORDVQE_MASK_AGC; it indicates that HPF and AGC functions are turned on.

10.4.1.8. MAX_AUDIO_FILE_PATH_LEN

【Description】

Define the maximum length limitation for the path of the saved audio file.

【Syntax】

#define MAX_AUDIO_FILE_PATH_LEN 256

【Note】

None.

【Related Data Type and Interface】

  • AUDIO_SAVE_FILE_INFO_S

10.4.1.9. MAX_AUDIO_FILE_NAME_LEN

【Description】

Define the maximum length limitation for the name of the saved audio file.

【Syntax】

#define MAX_AUDIO_FILE_NAME_LEN 256

【Note】

None.

【Related Data Type and Interface】

  • AUDIO_SAVE_FILE_INFO_S

10.4.1.10. CVI_MAX_AI_DEVICE_ID_NUM

【Description】

Define the maximum number of AI (Audio Input) device IDs.

【Syntax】

#define CVI_MAX_AI_DEVICE_ID_NUM 5   /* Maximum number of AI device ID */

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.11. CVI_MAX_AI_CARD_ID_NUM

【Description】

Define the maximum number of AI (Audio Input) card IDs.

【Syntax】

#define CVI_MAX_AI_CARD_ID_NUM 5   /* Maximum number of AI card ID */

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.12. CVI_MAX_AO_DEVICE_ID_NUM

【Description】

Define the maximum number of AO (Audio Output) device IDs.

【Syntax】

#define CVI_MAX_AO_DEVICE_ID_NUM 5   /* Maximum number of AO device ID */

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.13. CVI_MAX_AO_CARD_ID_NUM

【Description】

Define the maximum number of AO (Audio Output) card IDs.

【Syntax】

#define CVI_MAX_AO_CARD_ID_NUM 5   /* Maximum number of AO card ID */

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.14. CVI_MAX_AUDIO_FRAME_NUM

【Description】

Defines the maximum limit for the number of audio frames.

【Syntax】

#define CVI_MAX_AUDIO_FRAME_NUM    300       /* max count of audio frame in Buffer */

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.15. CVI_AUD_MAX_VOICE_POINT_NUM

【Description】

Defines the maximum number of samples for each frame of voice encoding.

【Syntax】

#define CVI_AUD_MAX_VOICE_POINT_NUM    1280      /* max sample per frame for voice encode */

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.16. CVI_AUD_MAX_AUDIO_POINT_NUM

【Description】

Defines the maximum number of samples for each frame of all audio encoding.

【Syntax】

#define CVI_AUD_MAX_AUDIO_POINT_NUM    2048     /* max sample per frame for all encoder */

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.17. CVI_MAX_AUDIO_STREAM_LEN

【Description】

Defines the maximum length of the audio stream.

【Syntax】

#define CVI_MAX_AUDIO_STREAM_LEN 8192   /* Maximum length of audio stream */

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.18. MAX_AUDIO_VQE_CUSTOMIZE_NAME

【Description】

Defines the maximum length limit for the custom name of audio voice quality enhancement (VQE).

【Syntax】

#define MAX_AUDIO_VQE_CUSTOMIZE_NAME 64 /* Maximum length of VQE customize name */

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.19. AUDIO_CLKSEL_E

【Description】

Define the audio clock source.

【Syntax】

typedef enum _AUDIO_CLKSEL_E {
   AUDIO_CLKSEL_BASE       = 0,  /* Audio base clk. */
   AUDIO_CLKSEL_SPARE,           /* Audio spare clk. */
   AUDIO_CLKSEL_BUTT,
} AUDIO_CLKSEL_E;

【Member】

None.

【Note】

Cvitek users do not need to set the clock at this time.

【Related Data Type and Interface】

10.4.1.20. AUDIO_SAMPLE_RATE_E

【Description】

Define the audio sampling rate.

【Syntax】

typedef enum _AUDIO_SAMPLE_RATE_E {
   AUDIO_SAMPLE_RATE_8000  = 8000,   /* 8K samplerate */
   AUDIO_SAMPLE_RATE_11025 = 11025,  /* 11.025K samplerate */
   AUDIO_SAMPLE_RATE_16000 = 16000,  /* 16K samplerate */
   AUDIO_SAMPLE_RATE_22050 = 22050,  /* 22.050K samplerate */
   AUDIO_SAMPLE_RATE_24000 = 24000,  /* 24K samplerate */
   AUDIO_SAMPLE_RATE_32000 = 32000,  /* 32K samplerate */
   AUDIO_SAMPLE_RATE_44100 = 44100,  /* 44.1K samplerate */
   AUDIO_SAMPLE_RATE_48000 = 48000,  /* 48K samplerate */
   AUDIO_SAMPLE_RATE_64000 = 64000,  /* 64K samplerate */
   AUDIO_SAMPLE_RATE_BUTT,
} AUDIO_SAMPLE_RATE_E;

【Member】

AUDIO_SAMPLE_RATE_8000: 8 kHz sample rate.

AUDIO_SAMPLE_RATE_11025: 11.025 kHz sample rate.

AUDIO_SAMPLE_RATE_16000: 16 kHz sample rate.

AUDIO_SAMPLE_RATE_22050: 22.050 kHz sample rate.

AUDIO_SAMPLE_RATE_24000: 24 kHz sample rate.

AUDIO_SAMPLE_RATE_32000: 32 kHz sample rate.

AUDIO_SAMPLE_RATE_44100: 44.1 kHz sample rate.

AUDIO_SAMPLE_RATE_48000: 48 kHz sample rate.

AUDIO_SAMPLE_RATE_64000: 64 kHz sample rate.

【Note】

【Related Data Type and Interface】

10.4.1.21. AUDIO_BIT_WIDTH_E

【Description】

Define the audio sampling accuracy.

【Syntax】

typedef enum _AUDIO_BIT_WIDTH_E {
   AUDIO_BIT_WIDTH_8 =0,  /* 8bit width */
   AUDIO_BIT_WIDTH_16 =1,  /* 16bit width */
   AUDIO_BIT_WIDTH_24 =2,  /* 24bit width */
   AUDIO_BIT_WIDTH_32 =3,  /* 32bit width */
   AUDIO_BIT_WIDTH_BUTT, /* boundary check */
 } AUDIO_BIT_WIDTH_E;

【Member】

AUDIO_BIT_WIDTH_8: The sampling accuracy is 8 bits.

AUDIO_BIT_WIDTH_16: The sampling accuracy is 16 bits.

AUDIO_BIT_WIDTH_24: The sampling accuracy is 24 bits.

AUDIO_BIT_WIDTH_32: The sampling accuracy is 32 bits.
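
Note that the enumerator values are indices (0 to 3), not bit counts, so they need to be mapped before being used in size calculations. The following is a minimal sketch of such a mapping. The helper name audio_bit_width_to_bytes is hypothetical, and the sketch assumes samples are stored packed with no padding (in particular, that a 24-bit sample occupies 3 bytes), which this section does not confirm.

```c
#include <stddef.h>

/* Enumeration copied from the syntax above. */
typedef enum _AUDIO_BIT_WIDTH_E {
    AUDIO_BIT_WIDTH_8  = 0, /* 8bit width */
    AUDIO_BIT_WIDTH_16 = 1, /* 16bit width */
    AUDIO_BIT_WIDTH_24 = 2, /* 24bit width */
    AUDIO_BIT_WIDTH_32 = 3, /* 32bit width */
    AUDIO_BIT_WIDTH_BUTT,   /* boundary check */
} AUDIO_BIT_WIDTH_E;

/* Hypothetical helper: bytes per sample for a bit width, or 0 when the
 * value is out of range. Assumes packed storage (no padding bytes). */
static size_t audio_bit_width_to_bytes(AUDIO_BIT_WIDTH_E width)
{
    switch (width) {
    case AUDIO_BIT_WIDTH_8:
        return 1;
    case AUDIO_BIT_WIDTH_16:
        return 2;
    case AUDIO_BIT_WIDTH_24:
        return 3;
    case AUDIO_BIT_WIDTH_32:
        return 4;
    default:
        return 0; /* AUDIO_BIT_WIDTH_BUTT or an invalid value */
    }
}
```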

【Note】

None.

【Related Data Type and Interface】

10.4.1.22. AIO_MODE_E

【Description】

Define the audio input / output working mode.

【Syntax】

typedef enum _AIO_MODE_E {
   AIO_MODE_I2S_MASTER = 0,  /* AIO I2S master mode */
   AIO_MODE_I2S_SLAVE,     /* AIO I2S slave mode */
   AIO_MODE_PCM_SLAVE_STD, /* AIO PCM slave standard mode */
   AIO_MODE_PCM_SLAVE_NSTD, /* AIO PCM slave non-standard mode */
   AIO_MODE_PCM_MASTER_STD,  /* AIO PCM master standard mode */
   AIO_MODE_PCM_MASTER_NSTD, /* AIO PCM master non-standard mode */
   AIO_MODE_BUTT             /* boundary check */
}AIO_MODE_E;

【Member】

AIO_MODE_I2S_MASTER: I2S master mode.

AIO_MODE_I2S_SLAVE: I2S slave mode.

AIO_MODE_PCM_SLAVE_STD: PCM slave standard mode.

AIO_MODE_PCM_SLAVE_NSTD: PCM slave non-standard mode.

AIO_MODE_PCM_MASTER_STD: PCM master standard mode.

AIO_MODE_PCM_MASTER_NSTD: PCM master non-standard mode.

【Note】

Built-in Cvitek only supports I2S master mode.

【Related Data Type and Interface】

10.4.1.23. AIO_I2STYPE_E

【Description】

Define I2S interface device type.

【Syntax】

typedef enum {
   AIO_I2STYPE_INNERCODEC = 0, /* AIO I2S connect inner audio CODEC */
   AIO_I2STYPE_INNERHDMI,      /* AIO I2S connect Inner HDMI */
   AIO_I2STYPE_EXTERN,         /* AIO I2S connect extern hardware */
} AIO_I2STYPE_E;

【Member】

AIO_I2STYPE_INNERCODEC: I2S connects to the inner audio CODEC.

AIO_I2STYPE_INNERHDMI: I2S connects to the inner HDMI.

AIO_I2STYPE_EXTERN: I2S connects to external hardware.

【Note】

Cvitek only supports AIO_I2STYPE_INNERCODEC (connection to the inner audio CODEC).

【Related Data Type and Interface】

10.4.1.24. AUDIO_SOUND_MODE_E

【Description】

Define the audio channel mode.

【Syntax】

typedef enum _AIO_SOUND_MODE_E {
  AUDIO_SOUND_MODE_MONO   = 0, /*mono*/
  AUDIO_SOUND_MODE_STEREO = 1, /*stereo only support interlace mode*/
  AUDIO_SOUND_MODE_BUTT        /*boundary check*/
} AUDIO_SOUND_MODE_E;

【Member】

AUDIO_SOUND_MODE_MONO: Mono.

AUDIO_SOUND_MODE_STEREO: Stereo.

【Note】

The left channel corresponds to channel 0 and the right channel corresponds to channel 1.

For AI, mono input is taken from the left channel by default.

To take the input from the right channel instead, use one of the following two methods.

  • Turn on the right channel only and process it.

  • Turn on the left and right channels, process according to the left channel, and use CVI_AI_SetTrackMode to configure the Audio Input channel mode to “AUDIO_TRACK_EXCHANGE”.

For AO, mono input is from the left channel by default.

To use the right channel instead, use one of the following two methods.

  • Turn on the right channel only and process it.

  • Turn on the left and right channels, process according to the left channel, and use CVI_AO_SetTrackMode to configure the AO channel mode to “AUDIO_TRACK_EXCHANGE”.

For stereo mode, only the left channels (that is, the channels whose number is less than half of u32ChnCnt in the device attribute) should be operated; the SDK operates the right channels automatically.

【Related Data Type and Interface】

10.4.1.25. AUDIO_MOD_PARAM_S

【Description】

Define the audio module parameter structure.

【Syntax】

typedef struct _AUDIO_MOD_PARAM_S {
        AUDIO_CLKSEL_E enClkSel; /* Audio clock select */
} AUDIO_MOD_PARAM_S;

【Member】

enClkSel audio clock source selection. Please see AUDIO_CLKSEL_E.

【Note】

Cvitek does not need special setting for CLK.

【Related Data Type and Interface】

None.

10.4.1.26. AIO_ATTR_S

【Description】

Define audio input / output device property structure.

【Syntax】

typedef struct _AIO_ATTR_S {
        AUDIO_SAMPLE_RATE_E enSamplerate;   /* sample rate */
        AUDIO_BIT_WIDTH_E   enBitwidth;     /* bitwidth */
        AIO_MODE_E          enWorkmode;     /* master or slave mode */
        AUDIO_SOUND_MODE_E  enSoundmode;    /* mono or stereo */
        CVI_U32  u32EXFlag;
        /* expand 8bit to 16bit,use AI_EXPAND(only valid for AI 8bit),*/
        /*use AI_CUT(only valid for extern Codec for 24bit) */
        CVI_U32 u32FrmNum;
        /* frame num in buf[2,CVI_MAX_AUDIO_FRAME_NUM] */
        CVI_U32 u32PtNumPerFrm;
        /* point num per frame (80/160/240/320/480/1024/2048) */
        /*(ADPCM IMA should add 1 point, AMR only support 160) */
        CVI_U32 u32ChnCnt;  /* channel number on FS, valid value:1/2/4/8 */
        CVI_U32 u32ClkSel;  /* 0: AI and AO clock is separate*/
        /* 1: AI and AO clock is inseparate, AI use AO's clock*/
        AIO_I2STYPE_E enI2sType;    /* i2s type */
} AIO_ATTR_S;

【Member】

enSamplerate: Audio sample rate (this parameter does not take effect in slave mode). Static property.

enBitwidth: Audio sampling accuracy (in slave mode, this parameter must match the sampling accuracy of the audio AD/DA). Static property.

enWorkmode: Audio input / output working mode. Static property.

enSoundmode: Audio channel mode. Static property.

u32EXFlag: Extension flag. Value range: {0, 1, 2}. Static property, reserved parameter, generally set to 1.

  • 0: No extension.

  • 1: 8-bit samples are expanded to 16 bits (only valid when the Audio Input sampling accuracy is 8 bits).

  • 2: 24-bit samples are cropped to 16 bits, which may be used in the external codec scenario.

u32FrmNum: Number of frames in the buffer. Value range: [2, CVI_MAX_AUDIO_FRAME_NUM].

u32PtNumPerFrm: Number of sample points per frame. Value range for G711, G726 and ADPCM_DVI4: 160, 320, 480.

u32ChnCnt: Number of channels supported. Values: 1, 2, 4, 8, 16 (input and output each support up to 2 channels).

u32ClkSel: Whether AI and AO use the same clock source.

enI2sType: I2S interface device type. Cvitek only supports master mode.

【Note】

The number of sampling points per frame (u32PtNumPerFrm) and the sampling rate (enSamplerate) determine the frequency of hardware interrupts.

If the frequency is too high, it degrades system performance and interferes with other services.

It is suggested that the values of these two parameters satisfy the formula: (u32PtNumPerFrm * 1000) / enSamplerate >= 10.

For example, when the sampling rate is 16000 Hz, it is recommended to set the number of sampling points per frame to 160 or more.
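
The recommendation above amounts to keeping each frame at least 10 ms long. A minimal sketch of that check follows; the helper names aio_frame_interval_ms and aio_frame_size_is_recommended are hypothetical and not part of the SDK, and the arithmetic uses integer division exactly as the formula is written.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: frame duration in milliseconds, computed as
 * (u32PtNumPerFrm * 1000) / enSamplerate with integer division.
 * Returns 0 for a zero sample rate to avoid dividing by zero. */
static uint32_t aio_frame_interval_ms(uint32_t pt_num_per_frm,
                                      uint32_t sample_rate_hz)
{
    if (sample_rate_hz == 0)
        return 0;
    return (uint32_t)(((uint64_t)pt_num_per_frm * 1000u) / sample_rate_hz);
}

/* Hypothetical helper: true when the pair satisfies the suggested
 * (u32PtNumPerFrm * 1000) / enSamplerate >= 10 rule. */
static bool aio_frame_size_is_recommended(uint32_t pt_num_per_frm,
                                          uint32_t sample_rate_hz)
{
    return aio_frame_interval_ms(pt_num_per_frm, sample_rate_hz) >= 10;
}
```

With 160 points at 16000 Hz the interval is exactly 10 ms, which meets the rule; 80 points at the same rate gives 5 ms and does not.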

【Related Data Type and Interface】

  • CVI_AI_SetPubAttr

  • CVI_AO_SetPubAttr

10.4.1.27. AI_CHN_PARAM_S

【Description】

Define channel parameter structure.

【Syntax】

typedef struct _AI_CHN_PARAM_S {
  CVI_U32 u32UsrFrmDepth; /* user frame depth */
} AI_CHN_PARAM_S;

【Member】

u32UsrFrmDepth: Audio frame block depth.

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.28. AUDIO_FRAME_S

【Description】

Define audio frame data structure.

【Syntax】

typedef struct _AUDIO_FRAME_S {
  AUDIO_BIT_WIDTH_E   enBitwidth;/*audio frame bitwidth*/
  AUDIO_SOUND_MODE_E  enSoundmode;/*audio frame mono or stereo mode*/
  CVI_U8  * u64VirAddr[2];                      /*audio frame vir addr*/
  CVI_U64  u64PhyAddr[2];               /*audio frame phy addr*/
  CVI_U64  u64TimeStamp;                /*audio frame timestamp*/
  CVI_U32  u32Seq;                      /*audio frame seq*/
  CVI_U32  u32Len;                      /*data length per channel in frame*/
  CVI_U32  u32PoolId[2];                /*audio frame pool id*/
} AUDIO_FRAME_S;

【Member】

enBitwidth: Audio sampling accuracy.

enSoundmode: Audio channel mode.

u64VirAddr[2]: Audio frame data virtual address.

u64PhyAddr[2]: Audio frame data physical address. Not supported at present.

u64TimeStamp: Audio frame timestamp. The unit is μs.

u32Seq: Audio frame sequence.

u32Len: Audio frame length: the number of samples in a single channel, in units of samples (1 sample = 2 bytes). For example, with the AIO_ATTR_S settings u32PtNumPerFrm = 320 and u32ChnCnt = 2, u32Len = 320 (samples per channel) and the u64VirAddr[0] buffer holds (u32Len x u32ChnCnt x 2) bytes.

u32PoolId[2]: Audio frame block pool ID.

【Note】

u32Len (audio frame length) refers to the data length of a single channel.

For u64VirAddr[0], the length in bytes is (u32Len x bytes_per_sample) per channel.

The default channel mode for mono audio is left channel, and the data is arranged as [Left, Left, Left, Left, Left, …].

Stereo data is arranged as [L, R, L, R, L, R, …] where L stands for the left channel and R stands for the right channel.

(Note: L represents a single sample in the left channel, and R represents a single sample in the right channel.)

u64VirAddr[1] holds no data and can be used for custom purposes.
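
The buffer layout described above can be sketched as follows. This is an illustration only: it assumes 16-bit samples (1 sample = 2 bytes, as stated for u32Len), and the helper names audio_frame_bytes and deinterleave_stereo_s16 are hypothetical, not SDK functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: total bytes held by the frame buffer, following
 * (u32Len x u32ChnCnt x bytes_per_sample). */
static size_t audio_frame_bytes(uint32_t len_per_chn, uint32_t chn_cnt,
                                uint32_t bytes_per_sample)
{
    return (size_t)len_per_chn * chn_cnt * bytes_per_sample;
}

/* Hypothetical helper: splits interleaved 16-bit stereo data arranged as
 * [L, R, L, R, ...] into separate left and right buffers. Each output
 * buffer must have room for samples_per_chn samples. */
static void deinterleave_stereo_s16(const int16_t *interleaved,
                                    size_t samples_per_chn,
                                    int16_t *left, int16_t *right)
{
    for (size_t i = 0; i < samples_per_chn; i++) {
        left[i]  = interleaved[2 * i];     /* even positions: left channel */
        right[i] = interleaved[2 * i + 1]; /* odd positions: right channel */
    }
}
```

For a stereo frame with u32Len = 320, the buffer would hold 320 x 2 x 2 = 1280 bytes under these assumptions.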

【Related Data Type and Interface】

None.

10.4.1.29. AEC_FRAME_S

【Description】

Define the information structure of the echo cancellation reference frame.

【Syntax】

typedef struct _AEC_FRAME_S {
   AUDIO_FRAME_S stRefFrame;  /* aec reference audio frame */
   CVI_BOOL   bValid;   /* whether frame is valid */
   CVI_BOOL   bSysBind; /* whether is sysbind */
}AEC_FRAME_S;

【Member】

stRefFrame: Echo cancellation reference frame structure.

bValid: Reference frame valid flag. Value range:

  • CVI_TRUE: the reference frame is valid.

  • CVI_FALSE: the reference frame is invalid and cannot be used for echo cancellation.

bSysBind: Whether Audio Input and AENC are system bound.

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.30. AUDIO_AGC_CONFIG_S

【Description】

Define the audio AGC configuration information structure.

【Syntax】

typedef struct _AUDIO_AGC_CONFIG_S {
  /* the max boost gain for AGC release processing, [0, 3] */
  /* para_obj.para_agc_max_gain = 1; */
  CVI_S8 para_agc_max_gain;
  /* the gain level of target high of AGC, [0, 36] */
  /* para_obj.para_agc_target_high = 2; */
  CVI_S8 para_agc_target_high;
  /* the gain  level of target low of AGC, [0, 36] */
  /* para_obj.para_agc_target_low = 6; */
  CVI_S8 para_agc_target_low;
  /* speech-activated AGC functionality, [0, 1] */
  /* para_obj.para_agc_vad_enable = 1; */
  CVI_BOOL para_agc_vad_ena;
} AUDIO_AGC_CONFIG_S;

【Member】

para_agc_max_gain: The maximum gain at which a signal can be amplified.

para_agc_target_high: AGC will reach the “Target High” level.

para_agc_target_low: AGC will reach the “Target Low” level.

para_agc_vad_ena: Speech-activated AGC uses speech VAD from NR to avoid amplifying background noise. It is recommended to turn on this function in a high SNR environment and to turn it off in a medium / low SNR environment for better voice quality.

【Note】

【Related Data Type and Interface】

10.4.1.31. AI_AEC_CONFIG_S

【Description】

Define the audio echo cancellation configuration information structure.

【Syntax】

typedef struct _AI_AEC_CONFIG_S {
  CVI_U16 para_aec_filter_len;           /* the filter length of AEC, [1, 13] */
  CVI_U16 para_aes_std_thrd;                 /* the threshold of STD/DTD, [0, 39] */
  CVI_U16 para_aes_supp_coeff;           /* the residual echo suppression level in AES, [0, 100] */
} AI_AEC_CONFIG_S;

【Member】

para_aec_filter_len: The length of the adaptive filter.

para_aes_std_thrd: Residual echo suppression threshold.

para_aes_supp_coeff: Residual echo suppression level.

【Note】

When user mode is on, the other parameters take effect.

Otherwise, the parameters are configured with the default values of the working mode enWorkstate in AI_VQE_CONFIG_S / AI_TALKVQE_CONFIG_S.

When configuring parameters, correctness checks for the advanced parameters are performed only when user mode is enabled. The configuration succeeds only when the advanced parameters are correct.
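
Since out-of-range advanced parameters cause the configuration to fail, an application may want to check them before calling into the SDK. The sketch below uses the value ranges given in the comments of the syntax above ([1, 13], [0, 39], [0, 100]). The helper aec_config_in_range is hypothetical, and the CVI_U16 typedef is a stand-in (assumed to be a 16-bit unsigned integer) so that the example compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t CVI_U16; /* stand-in for the SDK typedef (assumption) */

/* Structure copied from the syntax above. */
typedef struct _AI_AEC_CONFIG_S {
    CVI_U16 para_aec_filter_len; /* the filter length of AEC, [1, 13] */
    CVI_U16 para_aes_std_thrd;   /* the threshold of STD/DTD, [0, 39] */
    CVI_U16 para_aes_supp_coeff; /* residual echo suppression level, [0, 100] */
} AI_AEC_CONFIG_S;

/* Hypothetical helper: true when every member lies inside its documented
 * range. The lower bound of 0 needs no check for unsigned members. */
static bool aec_config_in_range(const AI_AEC_CONFIG_S *cfg)
{
    if (cfg == NULL)
        return false;
    if (cfg->para_aec_filter_len < 1 || cfg->para_aec_filter_len > 13)
        return false;
    if (cfg->para_aes_std_thrd > 39)
        return false;
    if (cfg->para_aes_supp_coeff > 100)
        return false;
    return true;
}
```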

【Related Data Type and Interface】

  • AI_VQE_CONFIG_S

10.4.1.32. AUDIO_ANR_CONFIG_S

【Description】

Define the information structure of audio voice noise reduction function.

【Syntax】

typedef struct _AUDIO_ANR_CONFIG_S {
        /* the coefficient of NR priori SNR tracking, [0, 20] */
        /* para_obj.para_nr_snr_coeff = 15; */
        CVI_U16 para_nr_snr_coeff;
        /* the coefficient of NR noise tracking, [0, 14] */
        /* para_obj.para_nr_noise_coeff = 2; */
        //CVI_S8 para_nr_noise_coeff;
        CVI_U16 para_nr_init_sile_time;
} AUDIO_ANR_CONFIG_S;

【Member】

para_nr_snr_coeff: Signal-to-Noise Ratio (SNR) tracking coefficient. With a larger value, NR reduces more noise, but the speech signal is more easily distorted. With a smaller value, NR suppresses less noise but gives better speech quality.

para_nr_noise_coeff: Noise tracking coefficient. This parameter determines the tracking speed of stationary noise. Value range: [0, 14]. 0 is the slowest noise tracking speed and 14 is the fastest.

【Note】

None.

【Related Data Type and Interface】

10.4.1.33. AUDIO_DELAY_CONFIG_S

【Description】

Define the audio signal delay structure.

【Syntax】

typedef struct _AUDIO_DELAY_CONFIG_S {
   /* the initial filter length of linear AEC to support up for echo tail, [1, 13] */
   CVI_U16 para_aec_init_filter_len;
   /* the digital gain target, [1, 12] */
   CVI_U16 para_dg_target;
   /* the delay sample for ref signal, [1, 3000] */
   CVI_U16 para_delay_sample;
} AUDIO_DELAY_CONFIG_S;

【Member】

para_aec_init_filter_len: The length of the adaptive filter.

para_dg_target: Digital gain. Value range: [1, 12]. This feature helps reduce residual echo and residual stationary noise.

para_delay_sample: Used to delay the reference signal. Value range: [1, 3000]. It enables AEC/AES to accelerate convergence at the beginning of the echo.

【Note】

None.

【Related Data Type and Interface】

10.4.1.34. VQE_WORKSTATE_E

【Description】

Define the working mode of voice quality enhancement.

【Syntax】

typedef enum _VQE_WORKSTATE_E {
  VQE_WORKSTATE_COMMON  = 0,
  /* common environment, Applicable to the family of voice calls. */
  VQE_WORKSTATE_MUSIC   = 1,
  /* music environment , Applicable to the family of music environment. */
  VQE_WORKSTATE_NOISY   = 2,
  /* noisy environment , Applicable to the noisy voice calls.  */
} VQE_WORKSTATE_E;

【Member】

VQE_WORKSTATE_COMMON: Common mode.

VQE_WORKSTATE_MUSIC: Music mode.

VQE_WORKSTATE_NOISY: Noise mode.

【Note】

None.

【Related Data Type and Interface】

10.4.1.35. VQE_RECORD_TYPE

【Description】

Define the recording type.

【Syntax】

typedef enum _VQE_RECORD_TYPE {
  VQE_RECORD_NORMAL        = 0,
  /* double microphone recording. */
  VQE_RECORD_BUTT,  /* Used for boundary checking */
} VQE_RECORD_TYPE;

【Member】

VQE_RECORD_NORMAL: Standard type.

【Note】

Cvitek only supports talk VQE; record VQE is not used unless it is customized.

【Related Data Type and Interface】

10.4.1.36. AI_TALKVQE_CONFIG_S

【Description】

Define the structure of audio input sound quality enhancement (Talk) configuration information.

【Syntax】

typedef struct _AI_TALKVQE_CONFIG_S {
    CVI_U16 para_client_config;              /* Client-specific configuration parameter */
    CVI_U32 u32OpenMask;                     /* VQE feature enable mask */
    CVI_S32 s32WorkSampleRate;               /* Sample Rate: 8KHz/16KHz. Default: 8KHz */
    // MIC IN VQE settings
    AI_AEC_CONFIG_S     stAecCfg;             /* Acoustic Echo Cancellation configuration */
    AUDIO_ANR_CONFIG_S  stAnrCfg;             /* Automatic Noise Reduction configuration */
    AUDIO_AGC_CONFIG_S  stAgcCfg;             /* Automatic Gain Control configuration */
    AUDIO_DELAY_CONFIG_S stAecDelayCfg;       /* AEC delay configuration */
    CVI_S32 para_notch_freq;                 /* User can ignore this flag */
    CVI_CHAR customize[MAX_AUDIO_VQE_CUSTOMIZE_NAME]; /* Customization name */
} AI_TALKVQE_CONFIG_S;

【Member】

para_client_config: Client parameter configuration.

u32OpenMask: Mask value enabled for each Talk Vqe function.

s32WorkSampleRate: Operating sampling frequency. This parameter is the working sampling rate of the internal functional algorithm. Value range: 8KHz/16KHz/48KHz. The default value is 8KHz. (48KHz for Hpf only)

stAecCfg: Configuration information related to echo cancellation function.

stAnrCfg: Configuration information related to voice noise reduction function.

stAgcCfg: Automatic gain control configuration information.

stAecDelayCfg: Configuration information related to audio signal delay.

para_notch_freq: Customized frequency elimination.

customize: Customization parameter selection.

【Note】

Cvitek VQE supports only AGC/ANR/AEC.

For example, if RNR/EQ data is set, it has no effect.

【Related Data Type and Interface】

None.

10.4.1.37. AI_RECORDVQE_CONFIG_S

【Description】

Define the structure of audio input sound quality enhancement (Record) configuration information.

【Syntax】

typedef struct _AI_RECORDVQE_CONFIG_S {
    CVI_U32             u32OpenMask;       /* Bitmask for enabling/disabling features */
    CVI_S32             s32WorkSampleRate; /* Sample Rate: 16KHz/48KHz */
    CVI_S32             s32FrameSample;    /* Number of samples per frame (VQE frame length: 80-4096) */
    CVI_S32             s32BytesPerSample; /* Number of bytes per sample */
    VQE_WORKSTATE_E     enWorkstate;       /* Current work state of VQE */
    CVI_S32             s32InChNum;        /* Number of input channels */
    CVI_S32             s32OutChNum;       /* Number of output channels */
    VQE_RECORD_TYPE     enRecordType;      /* Type of recording */
    AUDIO_AGC_CONFIG_S  stAgcCfg;          /* Configuration for Automatic Gain Control (AGC) */
} AI_RECORDVQE_CONFIG_S;

【Member】

u32OpenMask: Mask value enabled for each Record Vqe function.

s32WorkSampleRate: Operating sampling frequency. This parameter is the working sampling rate of the internal functional algorithm. Value range: 8KHz/16KHz/48KHz. The default value is 8KHz. (48KHz for Hpf only)

stAgcCfg: Automatic gain control configuration information.

enWorkstate: Working mode.

s32InChNum: Number of input channels processed by VQE. Value range: [1, 2].

s32OutChNum: Number of output channels processed by VQE. Value range: [1, 2].

enRecordType: Record type.

【Note】

Cvitek VQE supports only AGC/ANR/AEC.

For example, if RNR/EQ data is set, it has no effect.

【Related Data Type and Interface】

None.

10.4.1.38. AUDIO_STREAM_S

【Description】

Define audio stream structure.

【Syntax】

typedef struct _AUDIO_STREAM_S {
   CVI_U8 *pStream;         /* the virtual address of stream */
   CVI_U32 u32PhyAddr;      /* the physical address of stream */
   CVI_U32 u32Len;          /* stream length, in bytes */
   CVI_U64 u64TimeStamp;    /* frame time stamp*/
   CVI_U32 u32Seq;      /* frame seq,if stream is not a valid frame, u32Seq is 0*/
} AUDIO_STREAM_S;

【Member】

pStream: The virtual address of the stream.

u32PhyAddr: The physical address of the stream.

u32Len: Audio stream length, in bytes.

u64TimeStamp: Audio stream timestamp.

u32Seq: Audio stream sequence.

【Note】

None.

【Related Data Type and Interface】

  • CVI_AENC_GetStream

10.4.1.39. AO_CHN_STATE_S

【Description】

Define Audio Output Channel Data Block Status Structure.

【Syntax】

typedef struct hiAO_CHN_STATE_S {
   CVI_U32                  u32ChnTotalNum;
   CVI_U32                  u32ChnFreeNum;
   CVI_U32                  u32ChnBusyNum;
} AO_CHN_STATE_S;

【Member】

u32ChnTotalNum: Total number of blocks in the output channel.

u32ChnFreeNum: Number of free blocks available.

u32ChnBusyNum: Number of occupied blocks.

【Note】

None.

【Related Data Type and Interface】

  • CVI_AO_QueryChnStat

10.4.1.40. AUDIO_TRACK_MODE_E

【Description】

Audio device channel mode type.

【Syntax】

typedef enum _AUDIO_TRACK_MODE_E {
    AUDIO_TRACK_NORMAL      = 0,  /* Normal audio track */
    AUDIO_TRACK_BOTH_LEFT   = 1,  /* Both channels play left audio */
    AUDIO_TRACK_BOTH_RIGHT  = 2,  /* Both channels play right audio */
    AUDIO_TRACK_EXCHANGE    = 3,  /* Exchange left and right audio channels */
    AUDIO_TRACK_MIX         = 4,  /* Mix both left and right audio channels */
    AUDIO_TRACK_LEFT_MUTE   = 5,  /* Mute left audio channel */
    AUDIO_TRACK_RIGHT_MUTE  = 6,  /* Mute right audio channel */
    AUDIO_TRACK_BOTH_MUTE   = 7,  /* Mute both audio channels */

    AUDIO_TRACK_BUTT        /* End of audio track modes */
} AUDIO_TRACK_MODE_E;

【Member】

AUDIO_TRACK_NORMAL: Normal mode, no processing.

AUDIO_TRACK_BOTH_LEFT: Both channels play the left channel sound.

AUDIO_TRACK_BOTH_RIGHT: Both channels play the right channel sound.

AUDIO_TRACK_EXCHANGE: The left and right channels are exchanged: the left channel plays the right channel sound, and the right channel plays the left channel sound.

AUDIO_TRACK_MIX: Both the left and right channels output the mix of the left and right channels.

AUDIO_TRACK_LEFT_MUTE: The left channel is muted, and the right channel plays the original right channel sound.

AUDIO_TRACK_RIGHT_MUTE: The right channel is muted, and the left channel plays the original left channel sound.

AUDIO_TRACK_BOTH_MUTE: Both left and right channels are muted.
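
The effect of each mode on a stereo sample pair can be modeled as below. This is only an illustrative model of the member descriptions, not the SDK implementation: the function track_mode_apply_s16 is hypothetical, 16-bit samples are assumed, and AUDIO_TRACK_MIX is modeled as the average of both channels, which is an assumption since the exact mixing rule is not specified here.

```c
#include <stdint.h>

/* Enumeration copied from the syntax above. */
typedef enum _AUDIO_TRACK_MODE_E {
    AUDIO_TRACK_NORMAL      = 0,
    AUDIO_TRACK_BOTH_LEFT   = 1,
    AUDIO_TRACK_BOTH_RIGHT  = 2,
    AUDIO_TRACK_EXCHANGE    = 3,
    AUDIO_TRACK_MIX         = 4,
    AUDIO_TRACK_LEFT_MUTE   = 5,
    AUDIO_TRACK_RIGHT_MUTE  = 6,
    AUDIO_TRACK_BOTH_MUTE   = 7,
    AUDIO_TRACK_BUTT
} AUDIO_TRACK_MODE_E;

/* Hypothetical model: applies a track mode in place to one stereo pair. */
static void track_mode_apply_s16(AUDIO_TRACK_MODE_E mode,
                                 int16_t *left, int16_t *right)
{
    const int16_t l = *left;
    const int16_t r = *right;

    switch (mode) {
    case AUDIO_TRACK_BOTH_LEFT:
        *right = l;
        break;
    case AUDIO_TRACK_BOTH_RIGHT:
        *left = r;
        break;
    case AUDIO_TRACK_EXCHANGE:
        *left = r;
        *right = l;
        break;
    case AUDIO_TRACK_MIX:
        /* Assumption: mix = average, computed in 32 bits to avoid overflow. */
        *left = *right = (int16_t)(((int32_t)l + (int32_t)r) / 2);
        break;
    case AUDIO_TRACK_LEFT_MUTE:
        *left = 0;
        break;
    case AUDIO_TRACK_RIGHT_MUTE:
        *right = 0;
        break;
    case AUDIO_TRACK_BOTH_MUTE:
        *left = 0;
        *right = 0;
        break;
    case AUDIO_TRACK_NORMAL:
    default:
        break; /* leave the samples untouched */
    }
}
```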

【Note】

None.

【Related Data Type and Interface】

  • CVI_AI_SetTrackMode

  • CVI_AO_SetTrackMode

10.4.1.41. AUDIO_FADE_RATE_E

【Description】

The audio device fade in and fade out rate type.

【Syntax】

typedef enum _AUDIO_FADE_RATE_E {
   AUDIO_FADE_RATE_NONE   = 0,
   AUDIO_FADE_RATE_10 = 10,
   AUDIO_FADE_RATE_20 = 20,
   AUDIO_FADE_RATE_30 = 30,
   AUDIO_FADE_RATE_50 = 50,
   AUDIO_FADE_RATE_100 = 100,
   AUDIO_FADE_RATE_200 = 200,
   AUDIO_FADE_RATE_BUTT = -1
} AUDIO_FADE_RATE_E;

【Member】

AUDIO_FADE_RATE_NONE: No delay between volume increases or decreases.

AUDIO_FADE_RATE_10: The volume increments or decrements once every 10 ms.

AUDIO_FADE_RATE_20: The volume increments or decrements once every 20 ms.

AUDIO_FADE_RATE_30: The volume increments or decrements once every 30 ms.

AUDIO_FADE_RATE_50: The volume increments or decrements once every 50 ms.

AUDIO_FADE_RATE_100: The volume increments or decrements once every 100 ms.

AUDIO_FADE_RATE_200: The volume increments or decrements once every 200 ms.

【Note】

When using the AUDIO_FADE_RATE_E parameter, confirm that bFade in AUDIO_FADE_S has been set to CVI_TRUE.

Fade-in or fade-out is applied gradually, starting from the current volume and stepping at the interval set by AUDIO_FADE_RATE_E, until the fade-in reaches unmute or the fade-out reaches mute.

【Related Data Type and Interface】

None.

10.4.1.42. AUDIO_FADE_S

【Description】

Define the audio device fade-in and fade-out setting structure.

【Syntax】

typedef struct hiAUDIO_FADE_S {
   CVI_BOOL             bFade;
   AUDIO_FADE_RATE_E enFadeInRate;
   AUDIO_FADE_RATE_E enFadeOutRate;
} AUDIO_FADE_S;

【Member】

bFade: Whether to turn on the fade-in and fade-out function.

  • CVI_TRUE: Turn on the fading function.

  • CVI_FALSE: Turn off the fading function.

enFadeInRate: Audio output device volume fade-in speed.

enFadeOutRate: Audio output device volume fade-out speed.

【Note】

Confirm that bFade in AUDIO_FADE_S has been set to CVI_TRUE; only then do the enFadeInRate / enFadeOutRate values take effect.

【Related Data Type and Interface】

10.4.1.43. G726_BPS_E

【Description】

Defines the G.726 codec rate.

【Syntax】

typedef enum _G726_BPS_E {
   G726_16K = 0,        /* G726 16kbps, see RFC3551.txt  4.5.4 G726-16 */
   G726_24K,             /* G726 24kbps, see RFC3551.txt  4.5.4 G726-24 */
   G726_32K,             /* G726 32kbps, see RFC3551.txt  4.5.4 G726-32 */
   G726_40K,             /* G726 40kbps, see RFC3551.txt  4.5.4 G726-40 */
   MEDIA_G726_16K,  /* G726 16kbps for ASF ... */
   MEDIA_G726_24K,  /* G726 24kbps for ASF ... */
   MEDIA_G726_32K,  /* G726 32kbps for ASF ... */
   MEDIA_G726_40K,  /* G726 40kbps for ASF ... */
   G726_BUTT,       /* Used for boundary checking */
} G726_BPS_E;

【Member】

G726_16K: 16kbps G.726.

G726_24K: 24kbps G.726.

G726_32K: 32kbps G.726.

G726_40K: 40kbps G.726.

MEDIA_G726_16K: G726 16kbps for ASF.

MEDIA_G726_24K: G726 24kbps for ASF.

MEDIA_G726_32K: G726 32kbps for ASF.

MEDIA_G726_40K: G726 40kbps for ASF.

【Note】

None.

【Related Data Type and Interface】

None.
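
The sketch below relates the enumeration to payload sizes. It relies on the fact that G.726 at 16, 24, 32 and 40 kbit/s uses 2, 3, 4 and 5 code bits per 8 kHz sample. The enum is copied from the Syntax block; the helper names are hypothetical, and treating the MEDIA_ variants as having the same bits per sample as their plain counterparts is an assumption based on their identical bit rates.

```c
#include <assert.h>

/* Enum copied from the Syntax block above. */
typedef enum _G726_BPS_E {
    G726_16K = 0, G726_24K, G726_32K, G726_40K,
    MEDIA_G726_16K, MEDIA_G726_24K, MEDIA_G726_32K, MEDIA_G726_40K,
    G726_BUTT
} G726_BPS_E;

/* Hypothetical helper: code bits per input sample, or 0 for an invalid value. */
static unsigned g726_bits_per_sample(G726_BPS_E bps)
{
    switch (bps) {
    case G726_16K: case MEDIA_G726_16K: return 2;
    case G726_24K: case MEDIA_G726_24K: return 3;
    case G726_32K: case MEDIA_G726_32K: return 4;
    case G726_40K: case MEDIA_G726_40K: return 5;
    default: return 0;
    }
}

/* Hypothetical helper: encoded size in bytes of `samples` input samples, rounded up. */
static unsigned g726_frame_bytes(G726_BPS_E bps, unsigned samples)
{
    assert(bps < G726_BUTT);
    return (g726_bits_per_sample(bps) * samples + 7u) / 8u;
}
```

For instance, a 160-sample frame (20 ms at 8 kHz) encoded with G726_32K occupies 80 bytes.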

10.4.1.44. ADPCM_TYPE_E

【Description】

Define ADPCM codec type.

【Syntax】

typedef enum _ADPCM_TYPE_E {
        /* See RFC3551.txt 4.5.1 DVI4: DVI4 differs in three respects from IMA ADPCM */

        ADPCM_TYPE_DVI4 = 0, /* 32kbps ADPCM(DVI4) for RTP */
        ADPCM_TYPE_IMA, /* 32kbps ADPCM(IMA),NOTICE:point num must be 161/241/321/481 */
        ADPCM_TYPE_ORG_DVI4, /* Original DVI4 ADPCM type */
        ADPCM_TYPE_BUTT, /* Used for boundary checking */
} ADPCM_TYPE_E;

【Member】

Member

Description

ADPCM_TYPE_DVI4

32kbit/s ADPCM(DVI4).

ADPCM_TYPE_IMA

32kbit/s ADPCM(IMA).

ADPCM_TYPE_ORG_DVI4

32kbit/s ADPCM(ORG_DVI4).

【Note】

None.

【Related Data Type and Interface】

None.
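
The comment on ADPCM_TYPE_IMA in the Syntax block restricts the number of points per frame to 161, 241, 321 or 481. A minimal sketch of a pre-check follows. The helper name is hypothetical, and accepting any count for the other types is an assumption, since this document states no restriction for them.

```c
#include <assert.h>
#include <stdbool.h>

/* Enum copied from the Syntax block above. */
typedef enum _ADPCM_TYPE_E {
    ADPCM_TYPE_DVI4 = 0,
    ADPCM_TYPE_IMA,
    ADPCM_TYPE_ORG_DVI4,
    ADPCM_TYPE_BUTT
} ADPCM_TYPE_E;

/* Hypothetical helper: true if the per-frame point count is acceptable for the type.
 * Only the IMA restriction is documented; other types are not checked here. */
static bool adpcm_point_num_ok(ADPCM_TYPE_E type, unsigned points)
{
    assert(type < ADPCM_TYPE_BUTT);
    if (type != ADPCM_TYPE_IMA)
        return true;
    return points == 161 || points == 241 || points == 321 || points == 481;
}
```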

10.4.1.45. AUDIO_SAVE_FILE_INFO_S

【Description】

Defines the configuration parameters for saving audio files.

【Syntax】

typedef struct _AUDIO_SAVE_FILE_INFO_S {
    CVI_BOOL    bCfg;                  /* Configuration flag (TRUE/FALSE) */
    CVI_CHAR    aFilePath[MAX_AUDIO_FILE_PATH_LEN]; /* File path where the audio is saved */
    CVI_CHAR    aFileName[MAX_AUDIO_FILE_NAME_LEN]; /* Name of the saved audio file */
    CVI_U32     u32FileSize;          /* Size of the file in KB */
} AUDIO_SAVE_FILE_INFO_S;

【Members】

Member Name

Description

bCfg

Configuration flag indicating whether file saving is enabled.

aFilePath

File path specifying the directory where the audio file will be saved.

aFileName

File name specifying the name of the saved audio file.

u32FileSize

File size specifying the maximum size of the saved audio file in kilobytes (KB).

【Notes】

Ensure that the file path and file name do not exceed the defined maximum lengths to avoid buffer overflow issues.
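
One way to honor this note is to check the string lengths before copying, as in the sketch below. The typedefs are local stand-ins, the two length macros are given assumed values (the real values come from the SDK headers), and fill_save_info is a hypothetical helper, not an SDK function.

```c
#include <assert.h>
#include <string.h>

/* Local stand-ins; the two lengths are assumed values, not taken from the SDK. */
typedef int CVI_BOOL;
typedef char CVI_CHAR;
typedef unsigned int CVI_U32;
#define CVI_TRUE 1
#define MAX_AUDIO_FILE_PATH_LEN 256
#define MAX_AUDIO_FILE_NAME_LEN 256

typedef struct _AUDIO_SAVE_FILE_INFO_S {
    CVI_BOOL bCfg;
    CVI_CHAR aFilePath[MAX_AUDIO_FILE_PATH_LEN];
    CVI_CHAR aFileName[MAX_AUDIO_FILE_NAME_LEN];
    CVI_U32  u32FileSize; /* in KB */
} AUDIO_SAVE_FILE_INFO_S;

/* Hypothetical helper: returns 0 on success, -1 if the path or name does not fit. */
static int fill_save_info(AUDIO_SAVE_FILE_INFO_S *info, const char *path,
                          const char *name, CVI_U32 size_kb)
{
    assert(info != NULL && path != NULL && name != NULL);
    if (strlen(path) >= MAX_AUDIO_FILE_PATH_LEN ||
        strlen(name) >= MAX_AUDIO_FILE_NAME_LEN)
        return -1;                    /* reject instead of truncating silently */
    memset(info, 0, sizeof(*info));
    info->bCfg = CVI_TRUE;
    strcpy(info->aFilePath, path);    /* safe: length checked above */
    strcpy(info->aFileName, name);
    info->u32FileSize = size_kb;
    return 0;
}
```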

【Related Data Types and Interfaces】

MAX_AUDIO_FILE_PATH_LEN MAX_AUDIO_FILE_NAME_LEN

10.4.1.46. CVI_HPF_CONFIG_S

【Description】

Defines the configuration parameters for a high-pass filter (HPF).

【Syntax】

typedef struct _CVI_HPF_CONFIG_S {
    int type; /* HPF filter type */
    float f0; /* cut-off frequency */
    float Q; /* Q factor */
    float gainDb; /* gain in dB */
} CVI_HPF_CONFIG_S;

【Members】

Member Name

Description

type

Filter type identifier, used to distinguish between different high-pass filter designs or implementations.

f0

Cut-off frequency (Hz), defines the point where the high-pass filter begins to attenuate signals below this frequency.

Q

Quality factor, the ratio of the center frequency to the filter bandwidth; a higher Q gives a more selective filter.

gainDb

Gain (in decibels), specifies the level of gain or attenuation at the pass frequency.

【Notes】

None.

【Related Data Types and Interfaces】

None.

10.4.1.47. CVI_EQ_CONFIG_S

【Description】

Defines the configuration parameters for an equalizer (EQ).

【Syntax】

typedef struct _CVI_EQ_CONFIG_S {
    int bandIdx;      /* Index of the EQ band */
    uint32_t freq;    /* Frequency in Hz */
    float QValue;     /* Quality factor of the EQ band */
    float gainDb;     /* Gain in decibels for the EQ band */
} CVI_EQ_CONFIG_S;

【Members】

Member Name

Description

bandIdx

Index indicating the specific band of the equalizer.

freq

Center frequency of the band, measured in Hertz (Hz).

QValue

Quality factor, the ratio of the center frequency of the band to its bandwidth; a higher value gives a narrower, more selective band.

gainDb

Gain, measured in decibels (dB), specifies the level of gain or attenuation for the band.

【Notes】

None.

【Related Data Types and Interfaces】

None.
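
To give a feel for how freq and QValue interact, the sketch below derives the approximate bandwidth of a band from the conventional relation Q = center frequency / bandwidth. The struct is copied from the Syntax block; the helper name is hypothetical, and whether the SDK applies exactly this definition of Q is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Struct copied from the Syntax block above. */
typedef struct _CVI_EQ_CONFIG_S {
    int bandIdx;
    uint32_t freq;
    float QValue;
    float gainDb;
} CVI_EQ_CONFIG_S;

/* Hypothetical helper: bandwidth in Hz implied by freq and QValue (Q = freq / bandwidth). */
static float eq_band_bandwidth_hz(const CVI_EQ_CONFIG_S *band)
{
    assert(band != NULL && band->QValue > 0.0f);
    return (float)band->freq / band->QValue;
}
```

Under this relation, a band centered at 1000 Hz with QValue 2.0 is roughly 500 Hz wide, and doubling QValue halves the width.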

10.4.1.48. CVI_DRC_LIMITER_PARAM

【Description】

Defines the configuration parameters for the limiter stage of dynamic range control (DRC).

【Syntax】

typedef struct _CVI_DRC_LIMITER_PARAM {
    uint32_t attackTimeMs; /* Attack time in milliseconds */
    uint32_t releaseTimeMs; /* Release time in milliseconds */
    float thresholdDb; /* Threshold level in decibels */
    float postGain; /* Post-gain in decibels */
} CVI_DRC_LIMITER_PARAM;

【Members】

Member Name

Description

attackTimeMs

Attack time in milliseconds (ms), indicating the time taken for the limiter to start acting after the signal exceeds the threshold.

releaseTimeMs

Release time in milliseconds (ms), indicating the time taken for the limiter to stop acting after the signal falls below the threshold.

thresholdDb

Threshold level in decibels (dB), indicating the signal level at which limiting begins.

postGain

Post-gain in decibels (dB), indicating the gain adjustment applied to the signal after limiting.

【Notes】

None.

【Related Data Types and Interfaces】

None.
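
The following sketch shows how thresholdDb and postGain interact in a textbook hard limiter: the level is capped at the threshold, then the post-gain is added. It models only the steady-state level; attack and release times are ignored. The helper name is hypothetical and the curve is an illustration, not the SDK's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Struct copied from the Syntax block above. */
typedef struct _CVI_DRC_LIMITER_PARAM {
    uint32_t attackTimeMs;
    uint32_t releaseTimeMs;
    float thresholdDb;
    float postGain;
} CVI_DRC_LIMITER_PARAM;

/* Hypothetical helper: steady-state output level in dB for a given input level. */
static float limiter_static_out_db(const CVI_DRC_LIMITER_PARAM *p, float in_db)
{
    assert(p != NULL);
    float limited = (in_db > p->thresholdDb) ? p->thresholdDb : in_db;
    return limited + p->postGain;   /* post-gain is applied after limiting */
}
```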

10.4.1.49. CVI_DRC_EXPANDER_PARAM

【Description】

Defines the configuration parameters for the expander stage of dynamic range control (DRC).

【Syntax】

typedef struct _CVI_DRC_EXPANDER_PARAM {
    uint32_t attackTimeMs; /* Attack time in milliseconds */
    uint32_t releaseTimeMs; /* Release time in milliseconds */
    uint32_t holdTimeMs; /* Hold time in milliseconds */
    uint16_t ratio; /* Expansion ratio */
    float thresholdDb; /* Threshold level in decibels */
    float minDb; /* Minimum level in decibels */
} CVI_DRC_EXPANDER_PARAM;

【Members】

Member Name

Description

attackTimeMs

Attack time in milliseconds (ms), indicating the time taken for the expander to start acting after the signal exceeds the threshold.

releaseTimeMs

Release time in milliseconds (ms), indicating the time taken for the expander to stop acting after the signal falls below the threshold.

holdTimeMs

Hold time in milliseconds (ms), indicating the time the expander remains active before releasing.

ratio

Expansion ratio, indicating the ratio between the input signal and the output signal.

thresholdDb

Threshold level in decibels (dB), indicating the signal level at which expansion begins.

minDb

Minimum level in decibels (dB), indicating the minimum output level of the expander.

【Notes】

None.

【Related Data Types and Interfaces】

None.
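
As an illustration of ratio, thresholdDb and minDb, the sketch below evaluates the steady-state curve of a textbook downward expander: levels at or above the threshold pass unchanged, levels below it are pushed further down by the ratio, and the result is floored at minDb. Treating this structure as a downward expander is an assumption, the helper name is hypothetical, and the time parameters are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Struct copied from the Syntax block above. */
typedef struct _CVI_DRC_EXPANDER_PARAM {
    uint32_t attackTimeMs;
    uint32_t releaseTimeMs;
    uint32_t holdTimeMs;
    uint16_t ratio;
    float thresholdDb;
    float minDb;
} CVI_DRC_EXPANDER_PARAM;

/* Hypothetical helper: steady-state output level in dB (downward expansion assumed). */
static float expander_static_out_db(const CVI_DRC_EXPANDER_PARAM *p, float in_db)
{
    assert(p != NULL && p->ratio >= 1);
    if (in_db >= p->thresholdDb)
        return in_db;                                   /* above threshold: unchanged */
    float out = p->thresholdDb + (in_db - p->thresholdDb) * (float)p->ratio;
    return (out < p->minDb) ? p->minDb : out;           /* never below the floor */
}
```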

10.4.1.50. CVI_DRC_COMPRESSOR_PARAM

【Description】

Defines the configuration parameters for the compressor stage of dynamic range control (DRC).

【Syntax】

typedef struct _CVI_DRC_COMPRESSOR_PARAM {
    uint32_t attackTimeMs; /* Attack time in milliseconds */
    uint32_t releaseTimeMs; /* Release time in milliseconds */
    uint16_t ratio; /* Compression ratio */
    float thresholdDb; /* Threshold level in decibels */
} CVI_DRC_COMPRESSOR_PARAM;

【Members】

Member Name

Description

attackTimeMs

Attack time in milliseconds (ms), indicating the time taken for the compressor to start acting after the signal exceeds the threshold.

releaseTimeMs

Release time in milliseconds (ms), indicating the time taken for the compressor to stop acting after the signal falls below the threshold.

ratio

Compression ratio, indicating the ratio between the input signal and the output signal.

thresholdDb

Threshold level in decibels (dB), indicating the signal level at which compression begins.

【Notes】

None.

【Related Data Types and Interfaces】

None.
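
The relationship between ratio and thresholdDb can be sketched with the textbook static compression curve: below the threshold the level is unchanged, above it every `ratio` dB of input overshoot yields 1 dB of output overshoot. The helper name is hypothetical, attack and release are ignored, and the curve is an illustration rather than the SDK's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Struct copied from the Syntax block above. */
typedef struct _CVI_DRC_COMPRESSOR_PARAM {
    uint32_t attackTimeMs;
    uint32_t releaseTimeMs;
    uint16_t ratio;
    float thresholdDb;
} CVI_DRC_COMPRESSOR_PARAM;

/* Hypothetical helper: steady-state output level in dB for a given input level. */
static float compressor_static_out_db(const CVI_DRC_COMPRESSOR_PARAM *p, float in_db)
{
    assert(p != NULL && p->ratio >= 1);
    if (in_db <= p->thresholdDb)
        return in_db;                                   /* below threshold: unchanged */
    return p->thresholdDb + (in_db - p->thresholdDb) / (float)p->ratio;
}
```

With a threshold of -20 dB and a ratio of 4, an input at -8 dB (12 dB over the threshold) comes out at -17 dB.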

10.4.1.51. AUDIO_SPK_EQ_CONFIG_S

【Description】

Defines the configuration parameters for the speaker equalizer (EQ).

【Syntax】

typedef struct _AUDIO_SPK_EQ_CONFIG_S {
    CVI_U16 para_spk_eq_nband; /* Number of EQ bands */
    CVI_U16 para_spk_eq_freq[5]; /* EQ band frequencies */
    CVI_U16 para_spk_eq_gain[5]; /* EQ band gains */
    CVI_U16 para_spk_eq_qfactor[5]; /* EQ band Q factors */
} AUDIO_SPK_EQ_CONFIG_S;

【Members】

Member Name

Description

para_spk_eq_nband

Number of EQ bands.

para_spk_eq_freq

Array of EQ band center frequencies.

para_spk_eq_gain

Array of EQ band gains.

para_spk_eq_qfactor

Array of EQ band Q factors.

【Notes】

None.

【Related Data Types and Interfaces】

None.

10.4.1.52. AO_VQE_CONFIG_S

【Description】

Defines the configuration parameters for audio output (AO) voice quality enhancement (VQE).

【Syntax】

typedef struct _AO_VQE_CONFIG_S {
    CVI_U32 u32OpenMask; /* Open mask for VQE modules */
    /* Sample Rate: 8KHz/16KHz default: 8KHz */
    CVI_S32 s32WorkSampleRate; /* Working sample rate */
    CVI_S32 s32channels; /* Number of channels */
    AUDIO_SPK_AGC_CONFIG_S stAgcCfg; /* AGC configuration */
    AUDIO_SPK_EQ_CONFIG_S stEqCfg; /* EQ configuration */
    CVI_HPF_CONFIG_S stHpfParam; /* HPF configuration */
    CVI_EQ_CONFIG_S stEqParam; /* EQ configuration */
    CVI_DRC_COMPRESSOR_PARAM stDrcCompressor; /* DRC compressor configuration */
    CVI_DRC_LIMITER_PARAM stDrcLimiter; /* DRC limiter configuration */
    CVI_DRC_EXPANDER_PARAM stDrcExpander; /* DRC expander configuration */
} AO_VQE_CONFIG_S;

【Members】

Member Name

Description

u32OpenMask

Open mask for VQE modules.

s32WorkSampleRate

Working sample rate in Hz. 8000 and 16000 are supported; the default is 8000.

s32channels

Number of channels.

stAgcCfg

Automatic Gain Control (AGC) configuration parameters.

stEqCfg

Speaker equalizer (EQ) configuration parameters (AUDIO_SPK_EQ_CONFIG_S).

stHpfParam

High-Pass Filter (HPF) configuration parameters.

stEqParam

Equalizer (EQ) band configuration parameters (CVI_EQ_CONFIG_S).

stDrcCompressor

Dynamic Range Control (DRC) compressor configuration parameters.

stDrcLimiter

Dynamic Range Control (DRC) limiter configuration parameters.

stDrcExpander

Dynamic Range Control (DRC) expander configuration parameters.

【Notes】

None.

【Related Data Types and Interfaces】

None.

10.4.1.53. HPF_FILTER_TYPE

【Description】

Defines the enumeration of filter types (low-pass, high-pass, low-shelf, high-shelf and peak).

【Syntax】

typedef enum {
    E_FILTER_LPF, /* Low-pass filter */
    E_FILTER_HPF, /* High-pass filter */
    E_FILTER_LSF, /* Low-shelf filter */
    E_FILTER_HSF, /* High-shelf filter */
    E_FILTER_PEF, /* Peak filter */
    E_FILTER_MAX, /* Maximum filter type */
} HPF_FILTER_TYPE;

【Members】

Enumeration Value

Description

E_FILTER_LPF

Low-pass filter.

E_FILTER_HPF

High-pass filter.

E_FILTER_LSF

Low-shelf filter.

E_FILTER_HSF

High-shelf filter.

E_FILTER_PEF

Peak filter.

E_FILTER_MAX

Upper bound of the enumeration; not a valid filter type.

【Notes】

None.

【Related Data Types and Interfaces】

None.

10.4.1.54. AUDIO_SPK_AGC_CONFIG_S

【Description】

Defines the configuration parameters for the speaker automatic gain control (AGC).

【Syntax】

typedef struct _AUDIO_SPK_AGC_CONFIG_S {
    CVI_S8 para_agc_max_gain; /* the max boost gain for AGC release processing, [0, 3] */
    CVI_S8 para_agc_target_high; /* the gain level of target high of AGC, [0, 36] */
    CVI_S8 para_agc_target_low; /* the gain level of target low of AGC, [0, 36] */
} AUDIO_SPK_AGC_CONFIG_S;

【Members】

Member Name

Description

para_agc_max_gain

The max boost gain for AGC release processing, range [0, 3].

para_agc_target_high

The gain level of target high of AGC, range [0, 36].

para_agc_target_low

The gain level of target low of AGC, range [0, 36].

【Notes】

None.

【Related Data Types and Interfaces】

None.
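
The documented ranges can be checked before the structure is handed to the SDK, as in the sketch below. CVI_S8 is a local stand-in typedef and spk_agc_cfg_in_range is a hypothetical helper; it checks only the ranges listed above and assumes no further relation between the members.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef signed char CVI_S8; /* local stand-in for the SDK typedef */

/* Struct copied from the Syntax block above. */
typedef struct _AUDIO_SPK_AGC_CONFIG_S {
    CVI_S8 para_agc_max_gain;    /* [0, 3]  */
    CVI_S8 para_agc_target_high; /* [0, 36] */
    CVI_S8 para_agc_target_low;  /* [0, 36] */
} AUDIO_SPK_AGC_CONFIG_S;

/* Hypothetical helper: true if every member lies inside its documented range. */
static bool spk_agc_cfg_in_range(const AUDIO_SPK_AGC_CONFIG_S *cfg)
{
    assert(cfg != NULL);
    return cfg->para_agc_max_gain >= 0 && cfg->para_agc_max_gain <= 3 &&
           cfg->para_agc_target_high >= 0 && cfg->para_agc_target_high <= 36 &&
           cfg->para_agc_target_low >= 0 && cfg->para_agc_target_low <= 36;
}
```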

10.4.1.55. AUDIO_FILE_STATUS_S

【Description】

Define the audio file save status structure.

【Syntax】

typedef struct _AUDIO_FILE_STATUS_S {
        CVI_BOOL     bSaving; /* Whether the file is saving or not */
} AUDIO_FILE_STATUS_S;

【Member】

Member

Description

bSaving

File saving status.

CVI_TRUE: A file is being saved.

CVI_FALSE: No file is being saved.

【Note】

None.

【Related Data Type and Interface】

None.

10.4.1.56. VQE_MODULE_CONFIG_S

【Description】

Define the configuration information structure of voice quality enhancement and resampling module.

【Syntax】

typedef struct _VQE_MODULE_CONFIG_S {
        CVI_VOID *pHandle; /* Handle of the VQE module */
} VQE_MODULE_CONFIG_S;

【Member】

Member

Description

pHandle

Register handle.

【Note】

The registration handle of each sound quality enhancement and resampling module can be obtained by calling the handle acquisition interface.

【Related Data Type and Interface】

None.

10.4.1.57. AUDIO_VQE_REGISTER_S

【Description】

Define the registration structure of the sound quality enhancement and resampling modules.

【Syntax】

typedef struct _AUDIO_VQE_REGISTER_S {
    VQE_MODULE_CONFIG_S stResModCfg;     /* Configuration for the Resample module */
    VQE_MODULE_CONFIG_S stHpfModCfg;     /* Configuration for the High Pass Filter module */
    VQE_MODULE_CONFIG_S stHdrModCfg;     /* Configuration for the HDR module */
    VQE_MODULE_CONFIG_S stGainModCfg;    /* Configuration for the Gain module */

    // Record VQE
    VQE_MODULE_CONFIG_S stRecordModCfg;  /* Configuration for the Record VQE module */

    // Talk VQE
    VQE_MODULE_CONFIG_S stAecModCfg;     /* Configuration for the Acoustic Echo Cancellation module */
    VQE_MODULE_CONFIG_S stAnrModCfg;     /* Configuration for the Automatic Noise Reduction module */
    VQE_MODULE_CONFIG_S stAgcModCfg;     /* Configuration for the Automatic Gain Control module */
    VQE_MODULE_CONFIG_S stEqModCfg;      /* Configuration for the Equalizer module */

    // CviFi VQE
    VQE_MODULE_CONFIG_S stRnrModCfg;     /* Configuration for the Residual Noise Reduction module */
    VQE_MODULE_CONFIG_S stDrcModCfg;     /* Configuration for the Dynamic Range Compression module */
    VQE_MODULE_CONFIG_S stPeqModCfg;     /* Configuration for the Parametric Equalizer module */
} AUDIO_VQE_REGISTER_S;

【Member】

Member

Description

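stResModCfg

Registration handle configuration of the resample module.

stHpfModCfg

Registration handle configuration of the high-pass filter (HPF) module.

stHdrModCfg

Registration handle configuration of the HDR module.

stGainModCfg

Registration handle configuration of the gain module.

stRecordModCfg

Registration handle configuration of the Record VQE module.

stAecModCfg

Registration handle configuration of the acoustic echo cancellation (AEC) module.

stAnrModCfg

Registration handle configuration of the noise reduction (ANR) module.

stAgcModCfg

Registration handle configuration of the automatic gain control (AGC) module.

stEqModCfg

Registration handle configuration of the equalizer (EQ) module.

stRnrModCfg

Registration handle configuration of the residual noise reduction (RNR) module.

stDrcModCfg

Registration handle configuration of the dynamic range compression (DRC) module.

stPeqModCfg

Registration handle configuration of the parametric equalizer (PEQ) module.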

【Note】

Currently, only Talk VQE on the uplink path (after audio recording) is supported.

Other VQE modules are not supported.

【Related Data Type and Interface】

None.

10.4.2. Audio Encoding

The data types and data structures related to audio encoding are defined as follows:

10.4.2.1. AENC_MAX_CHN_NUM

【Description】

Define the maximum number of audio encoding channels.

【Syntax】

#define AENC_MAX_CHN_NUM         1

【Note】

None.

【Related Data Type and Interface】

ADEC_MAX_CHN_NUM

10.4.2.2. AENC_ATTR_G711_S

【Description】

Define G.711 encoding protocol attribute structure.

【Syntax】

typedef struct hiAENC_ATTR_G711_S
{
   CVI_U32 resv;
}AENC_ATTR_G711_S;

【Member】

Member

Description

resv

Reserved; not used.

【Note】

None.

【Related Data Type and Interface】

None.

10.4.2.3. AENC_ATTR_G726_S

【Description】

Define G.726 encoding protocol attribute structure.

【Syntax】

typedef struct _AENC_ATTR_G726_S {
   G726_BPS_E enG726bps;
}AENC_ATTR_G726_S;

【Member】

Member

Description

enG726bps

G.726 protocol bitrate

【Note】

None.

【Related Data Type and Interface】

G726_BPS_E

10.4.2.4. AENC_ATTR_ADPCM_S

【Description】

Define ADPCM encoding protocol attribute structure.

【Syntax】

typedef struct _AENC_ATTR_ADPCM_S {
   ADPCM_TYPE_E enADPCMType;
}AENC_ATTR_ADPCM_S;

【Member】

Member

Description

enADPCMType

ADPCM type

【Note】

None.

【Related Data Type and Interface】

ADPCM_TYPE_E

10.4.2.5. AENC_ATTR_LPCM_S

【Description】

Define LPCM encoding protocol attribute structure.

【Syntax】

typedef struct _AENC_ATTR_LPCM_S {
   CVI_U32 resv;            /*reserve item*/
}AENC_ATTR_LPCM_S;

【Member】

The internal variables of this structure are not used.

【Note】

None.

【Related Data Type and Interface】

None.

10.4.2.6. AENC_CHN_ATTR_S

【Description】

Defines the audio encoding channel attribute structure. The definition of this structure varies slightly on different processor platforms.

【Syntax】

typedef struct _AENC_CHN_ATTR_S  {
   PAYLOAD_TYPE_E enType;
   CVI_U32  u32PtNumPerFrm;
   CVI_U32 u32BufSize;
   CVI_VOID *pValue;
   CVI_BOOL bFileDbgMode;
}AENC_CHN_ATTR_S;

【Member】

Member

Description

enType

Audio encoding protocol type. Static property.

u32PtNumPerFrm

Frame length of the audio encoding protocol, in sample points per frame.

u32BufSize

Audio encoding buffer size.

pValue

Pointer to the protocol-specific attribute structure.

bFileDbgMode

Whether to turn on save file (debug) mode.

【Note】

None.

【Related Data Type and Interface】

None.
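
The sketch below illustrates how pValue ties the channel attributes to a protocol-specific structure, using G.726 as the example. All typedefs are local stand-ins so the fragment compiles on its own. The enumerator PT_G726 and its value are assumptions about PAYLOAD_TYPE_E, and init_g726_aenc_attr is a hypothetical helper, not an SDK function.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for SDK types (assumptions; real definitions come from the SDK headers). */
typedef unsigned int CVI_U32;
typedef int CVI_BOOL;
typedef void CVI_VOID;
#define CVI_FALSE 0
typedef enum { PT_G726 = 0 /* placeholder name and value */ } PAYLOAD_TYPE_E;
typedef enum { G726_16K = 0, G726_24K, G726_32K, G726_40K } G726_BPS_E;

typedef struct { G726_BPS_E enG726bps; } AENC_ATTR_G726_S;

typedef struct {
    PAYLOAD_TYPE_E enType;
    CVI_U32 u32PtNumPerFrm;
    CVI_U32 u32BufSize;
    CVI_VOID *pValue;
    CVI_BOOL bFileDbgMode;
} AENC_CHN_ATTR_S;

/* Hypothetical helper: describe a 32 kbps G.726 encoding channel.
 * `g726` must stay valid for as long as `attr` is in use. */
static void init_g726_aenc_attr(AENC_CHN_ATTR_S *attr, AENC_ATTR_G726_S *g726,
                                CVI_U32 points_per_frame, CVI_U32 buf_size)
{
    assert(attr != NULL && g726 != NULL);
    g726->enG726bps = G726_32K;
    attr->enType = PT_G726;          /* must match the structure behind pValue */
    attr->u32PtNumPerFrm = points_per_frame;
    attr->u32BufSize = buf_size;
    attr->pValue = g726;
    attr->bFileDbgMode = CVI_FALSE;  /* no debug file dump */
}
```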

10.4.2.7. AAC_AENC_ENCODER_S

【Description】

Defines AAC encoder attribute structure.

【Syntax】

typedef struct _AAC_AENC_ENCODER_S {
   PAYLOAD_TYPE_E  enType;
   CVI_U32      u32MaxFrmLen;
   CVI_CHAR   aszName[17];
   /* encoder type,be used to print proc information */
   CVI_S32 (*pfnOpenEncoder)(CVI_VOID *pEncoderAttr, CVI_VOID **ppEncoder);
   /* pEncoder is the handle to control the encoder */
   CVI_S32 (*pfnEncodeFrm)(CVI_VOID *pEncoder, CVI_S16 *inputdata, CVI_U8 *pu8Outbuf,
                           CVI_S32 s32InputSizeBytes, CVI_U32 *pu32OutLen);
   CVI_S32 (*pfnCloseEncoder)(CVI_VOID *pEncoder);
} AAC_AENC_ENCODER_S;

【Member】

This structure is used only for linking an external AAC library.

It is defined in this version of the SDK but is not supported.

【Note】

If AAC is required, please refer to middleware/sample/audio/aac_sample.

【Related Data Type and Interface】

None.

10.4.3. Audio Decoding

Data types and data structures related to audio decoding are defined as follows:

10.4.3.1. MAX_AUDIO_FRAME_NUM

【Description】

Define the maximum number of audio decoding block frames.

【Syntax】

#define MAX_AUDIO_FRAME_NUM        300

【Note】

Currently the number of audio internal cache frames is determined by the SDK, so this setting is not open to users and will not have any effect.

【Related Data Type and Interface】

None.

10.4.3.2. ADEC_MAX_CHN_NUM

【Description】

Define the maximum number of audio decoding channels.

【Syntax】

#define ADEC_MAX_CHN_NUM       1

【Note】

Currently only supports single channel encoding and decoding.

【Related Data Type and Interface】

None.

10.4.3.3. ADEC_ATTR_G711_S

【Description】

Define G.711 decoding protocol attribute structure.

【Syntax】

typedef struct _ADEC_ATTR_G711_S {
   CVI_U32 resv;
}ADEC_ATTR_G711_S;

【Member】

The members of this structure are not used on Cvitek processors.

【Note】

None.

【Related Data Type and Interface】

None.

10.4.3.4. ADEC_ATTR_G726_S

【Description】

Define G.726 decoding protocol attribute structure.

【Syntax】

typedef struct _ADEC_ATTR_G726_S {
   G726_BPS_E enG726bps;
}ADEC_ATTR_G726_S;

【Member】

Member

Description

enG726bps

G.726 protocol bitrate

【Note】

None.

【Related Data Type and Interface】

G726_BPS_E

10.4.3.5. ADEC_ATTR_ADPCM_S

【Description】

Define ADPCM decoding protocol attribute structure.

【Syntax】

typedef struct _ADEC_ATTR_ADPCM_S {
   ADPCM_TYPE_E enADPCMType;
}ADEC_ATTR_ADPCM_S;

【Member】

Member

Description

enADPCMType

ADPCM type

【Note】

None.

【Related Data Type and Interface】

ADPCM_TYPE_E

10.4.3.6. ADEC_ATTR_LPCM_S

【Description】

Define LPCM decoding protocol attribute structure.

【Syntax】

typedef struct _ADEC_ATTR_LPCM_S {
   CVI_U32 resv;
}ADEC_ATTR_LPCM_S;

【Member】

resv is reserved for future extension.

【Note】

None.

【Related Data Type and Interface】

None.

10.4.3.7. ADEC_MODE_E

【Description】

Define the decoding method.

【Syntax】

typedef enum _ADEC_MODE_E {
   ADEC_MODE_PACK = 0,
   ADEC_MODE_STREAM ,
   ADEC_MODE_BUTT
}ADEC_MODE_E;

【Member】

Member

Description

ADEC_MODE_PACK

Decode in Pack mode.

ADEC_MODE_STREAM

Decode in stream mode.

【Note】

Pack mode is used when the user is certain that each stream packet contains exactly one encoded frame; the decoder then decodes the packet directly.

If a packet does not contain exactly one frame, decoding fails.

This mode is relatively efficient.

As long as the stream packets produced by the AENC module are not damaged, this mode can be used to decode them.

Stream mode is used when the user cannot be sure that each stream packet contains exactly one frame; the decoder must then locate the frame boundaries and buffer the stream itself.

This mode is less efficient and is generally used when decoding a stream read from a file or when the packet boundaries are uncertain.

Because speech codecs produce fixed-length frames, frame boundaries in the stream are easy to determine, so pack mode is recommended.

Cvitek only supports pack mode.

If the stream boundaries are uncertain, decoding errors occur on Cvitek because the frames are misaligned.
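
Since only pack mode is supported, an application that reads a fixed-frame-length stream from a file has to cut it into one-frame packets before handing them to the decoder. A minimal sketch of that bookkeeping follows; count_whole_packs is a hypothetical helper, and the frame length in bytes is assumed to be known from the codec settings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of complete fixed-length frames in a buffer.
 * *leftover receives the trailing bytes that do not form a whole frame
 * and should be kept until more data arrives. */
static size_t count_whole_packs(size_t total_bytes, size_t frame_bytes, size_t *leftover)
{
    assert(frame_bytes > 0 && leftover != NULL);
    *leftover = total_bytes % frame_bytes;
    return total_bytes / frame_bytes;
}
```

For example, 250 bytes read from a file with 80-byte frames yield three packets to send, with 10 bytes held back for the next read.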

【Related Data Type and Interface】

None.

10.4.3.8. ADEC_CHN_ATTR_S

【Description】

Define the decoding channel attribute structure.

【Syntax】

typedef struct _ADEC_CH_ATTR_S {
   PAYLOAD_TYPE_E enType;
   CVI_U32         u32BufSize; /*buf size[2~CVI_MAX_AUDIO_FRAME_NUM]*/
   ADEC_MODE_E   enMode;/*decode mode*/
   /* CVI_VOID ATTRIBUTE      *pValue;*/
   CVI_VOID *pValue;
   CVI_BOOL bFileDbgMode;
   //if ao not enable
   CVI_S32 s32BytesPerSample;
   CVI_S32 s32frame_size; //in samples
   CVI_S32 s32ChannelNums; // 1 or 2
   CVI_S32 s32Sample_rate;
}ADEC_CHN_ATTR_S;

【Member】

Member

Description

enType

Audio decoding protocol type, static property.

u32BufSize

Audio decoding block buffer size. At present, the number of audio internal cache frames is determined by the SDK, so this setting is not open to users and will not have any effect.

enMode

Decoding mode, static property. Only ADEC_MODE_PACK is supported; setting ADEC_MODE_STREAM does not enable automatic frame-boundary detection.

pValue

Specific protocol attribute pointer.

bFileDbgMode

Whether to turn on save file mode. See the cvi_sample_audio.c sample code in the SDK, where this value is true by default for debugging. Set it to false in actual use to avoid the performance and memory cost of saving to disk.

When only the ADEC module is used, without the AO module, the user must describe the stream to the ADEC module through the following members.

s32BytesPerSample

Number of bytes per sample. For a 16-bit sample width, 16 bits = 2 bytes, so the value should be set to 2 (the examples in the SDK use 2).

s32frame_size

Period sample size: the number of samples sent to the ADEC module each time.

s32ChannelNums

The number of channels (mono: 1, stereo: 2).

s32Sample_rate

The sampling frequency (Hz) of the stream to be decoded.
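
The four members above jointly determine the size and duration of one decoded frame, which the sketch below computes. ADEC_PCM_PARAMS is a hypothetical subset of ADEC_CHN_ATTR_S holding only these members, the helper names are hypothetical, and s32frame_size is assumed to count samples per channel.

```c
#include <assert.h>
#include <stddef.h>

typedef int CVI_S32; /* local stand-in for the SDK typedef */

/* Hypothetical subset of ADEC_CHN_ATTR_S with only the PCM-related members. */
typedef struct {
    CVI_S32 s32BytesPerSample;
    CVI_S32 s32frame_size;   /* in samples; assumed to be per channel */
    CVI_S32 s32ChannelNums;  /* 1 or 2 */
    CVI_S32 s32Sample_rate;  /* in Hz */
} ADEC_PCM_PARAMS;

/* Hypothetical helper: bytes of PCM produced by one decoded frame. */
static CVI_S32 pcm_frame_bytes(const ADEC_PCM_PARAMS *p)
{
    assert(p != NULL);
    return p->s32BytesPerSample * p->s32frame_size * p->s32ChannelNums;
}

/* Hypothetical helper: duration of one frame in milliseconds. */
static CVI_S32 pcm_frame_duration_ms(const ADEC_PCM_PARAMS *p)
{
    assert(p != NULL && p->s32Sample_rate > 0);
    return p->s32frame_size * 1000 / p->s32Sample_rate;
}
```

With 2 bytes per sample, 320 samples per frame, one channel and 16000 Hz, one frame is 640 bytes and lasts 20 ms.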

【Note】

None.

【Related Data Type and Interface】

None.

10.4.3.9. AUDIO_FRAME_INFO_S

【Description】

Define the audio frame information structure after decoding.

【Syntax】

typedef struct _AUDIO_FRAME_INFO_S {
   AUDIO_FRAME_S *pstFrame;
   CVI_U32         u32Id;
} AUDIO_FRAME_INFO_S;

【Member】

Member

Description

pstFrame

Audio frame pointer.

u32Id

Index of audio frame, range [0, 49].

【Note】

None.

【Related Data Type and Interface】

CVI_ADEC_GetFrame CVI_ADEC_ReleaseFrame

10.4.3.10. ADEC_DECODER_S

【Description】

Define the decoder attribute structure.

【Syntax】

typedef struct _ADEC_DECODER_S {
   PAYLOAD_TYPE_E  enType;
   CVI_CHAR   aszName[17];

   CVI_S32 (*pfnOpenDecoder)(CVI_VOID *pDecoderAttr, CVI_VOID **ppDecoder);
   CVI_S32 (*pfnDecodeFrm)(CVI_VOID *pDecoder, CVI_U8 **pu8Inbuf, CVI_S32 *ps32LeftByte,
                           CVI_U16 *pu16Outbuf, CVI_U32 *pu32OutLen, CVI_U32 *pu32Chns);
   CVI_S32 (*pfnGetFrmInfo)(CVI_VOID *pDecoder, CVI_VOID *pInfo);
   CVI_S32 (*pfnCloseDecoder)(CVI_VOID *pDecoder);
   CVI_S32 (*pfnResetDecoder)(CVI_VOID *pDecoder);
} ADEC_DECODER_S;

【Member】

Member

Description

enType

Decoding protocol type.

aszName

Decoder name.

pfnOpenDecoder

Function pointer to open decoder.

pfnDecodeFrm

Function pointer to decode a frame.

pfnGetFrmInfo

Function pointer to get audio frame information.

pfnCloseDecoder

Function pointer to close decoder.

pfnResetDecoder

Clear the buffer and reset the decoder.

【Note】

This structure is specially used to link external AAC decoding lib.

Currently, please refer to middleware/sample/audio/aac_sample for AAC decoding.

【Related Data Type and Interface】

None.
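
To show how the callback table fits together, the sketch below fills an ADEC_DECODER_S with a pass-through "decoder" that simply copies 16-bit PCM. The typedefs are local stand-ins, the enumerator PT_AAC and its value are assumptions about PAYLOAD_TYPE_E, all dummy_ functions are hypothetical, and reporting pu32OutLen in bytes is an assumption. How the filled structure is handed to the SDK is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins for SDK types (assumptions; real definitions come from the SDK headers). */
typedef int CVI_S32;
typedef unsigned int CVI_U32;
typedef unsigned short CVI_U16;
typedef unsigned char CVI_U8;
typedef char CVI_CHAR;
typedef void CVI_VOID;
typedef enum { PT_AAC = 0 /* placeholder name and value */ } PAYLOAD_TYPE_E;

typedef struct _ADEC_DECODER_S {
    PAYLOAD_TYPE_E enType;
    CVI_CHAR aszName[17];
    CVI_S32 (*pfnOpenDecoder)(CVI_VOID *pDecoderAttr, CVI_VOID **ppDecoder);
    CVI_S32 (*pfnDecodeFrm)(CVI_VOID *pDecoder, CVI_U8 **pu8Inbuf, CVI_S32 *ps32LeftByte,
                            CVI_U16 *pu16Outbuf, CVI_U32 *pu32OutLen, CVI_U32 *pu32Chns);
    CVI_S32 (*pfnGetFrmInfo)(CVI_VOID *pDecoder, CVI_VOID *pInfo);
    CVI_S32 (*pfnCloseDecoder)(CVI_VOID *pDecoder);
    CVI_S32 (*pfnResetDecoder)(CVI_VOID *pDecoder);
} ADEC_DECODER_S;

static int g_dummy_state; /* stands in for a real decoder context */

static CVI_S32 dummy_open(CVI_VOID *attr, CVI_VOID **ppDecoder)
{
    (void)attr;
    *ppDecoder = &g_dummy_state;
    return 0;
}

/* Pass-through "decode": copies whole 16-bit samples and advances the input. */
static CVI_S32 dummy_decode(CVI_VOID *dec, CVI_U8 **in, CVI_S32 *left,
                            CVI_U16 *out, CVI_U32 *out_len, CVI_U32 *chns)
{
    (void)dec;
    CVI_S32 n = *left - (*left % 2);  /* bytes forming whole samples */
    memcpy(out, *in, (size_t)n);
    *in += n;
    *left -= n;
    *out_len = (CVI_U32)n;            /* assumption: length reported in bytes */
    *chns = 1;
    return 0;
}

static CVI_S32 dummy_get_info(CVI_VOID *dec, CVI_VOID *info) { (void)dec; (void)info; return 0; }
static CVI_S32 dummy_close(CVI_VOID *dec) { (void)dec; return 0; }
static CVI_S32 dummy_reset(CVI_VOID *dec) { (void)dec; return 0; }

/* Hypothetical helper: build a fully populated callback table. */
static ADEC_DECODER_S make_dummy_decoder(void)
{
    ADEC_DECODER_S d;
    memset(&d, 0, sizeof(d));
    d.enType = PT_AAC;
    strncpy(d.aszName, "dummy", sizeof(d.aszName) - 1); /* stays NUL-terminated */
    d.pfnOpenDecoder = dummy_open;
    d.pfnDecodeFrm = dummy_decode;
    d.pfnGetFrmInfo = dummy_get_info;
    d.pfnCloseDecoder = dummy_close;
    d.pfnResetDecoder = dummy_reset;
    return d;
}
```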