# Application Note **How to use the Voice Activity Detection in the TAx511x and TAx521x**



Jeff McPherson, Abin Mathew

#### ABSTRACT

The TAx511x and TAx521x devices are a family of high-performance stereo codecs suitable for Land Mobile Radio, IP Network and Telephone, Video Conferencing, and Professional Audio Equipment. This family of devices has an extensive set of features that includes the following:

- Real-time DSP-based digital mixer
- Digital volume control
- Programmable filters (biquads, HPF)
- Automatic gain control (AGC) and dynamic range compression (DRC)
- Linear-phase or low-latency filter modes
- Ultrasound activity detection and generation
- Voice activity detection (VAD)

This application note describes how to configure the Voice Activity Detection (VAD) feature in TAx511x and TAx521x devices (TAA5111, TAA5112, TAC5111, TAC5112, TAA5211, TAA5212, TAC5211, TAC5212).

# **Table of Contents**

- 1 Introduction
- 2 Voice Activity Detection
  - 2.1 VAD Configurations
  - 2.2 VAD Parameters
- 3 VAD Performance Results
- 4 Examples
- 5 Summary
- 6 References

# **List of Figures**

- Figure 1-1. VAD Example
- Figure 3-1. Non-Speech Hit Rate vs Speech Hit Rate for Car Noise
- Figure 3-2. Non-Speech Hit Rate vs Speech Hit Rate for Restaurant Noise
- Figure 3-3. Non-Speech Hit Rate vs Speech Hit Rate for Train Noise
- Figure 3-4. Non-Speech Hit Rate vs Speech Hit Rate at –5dB Threshold for 6dB SNR
- Figure 3-5. Non-Speech Hit Rate vs Speech Hit Rate at –5dB Threshold for 12dB SNR
- Figure 3-6. Non-Speech Hit Rate vs Speech Hit Rate at –5dB Threshold for 18dB SNR
- Figure 3-7. Non-Speech Hit Rate vs Speech Hit Rate at –5dB Threshold for 24dB SNR

# **List of Tables**

- Table 2-1. List of VAD Configuration
- Table 2-2. VAD Current Consumption
- Table 2-3. VAD Mode Selection Using LPAD\_CFG1 Register
- Table 2-4. VAD ON During Recording Selection Using LPAD\_CFG1 Register
- Table 2-5. VAD Channel Selection Using LPAD\_CFG1 Register
- Table 2-6. SDOUT as Interrupt Selection Using LPAD\_CFG1 Register
- Table 2-7. MICBIAS Control During VAD Processing on PDM Channels
- Table 2-8. VAD Clock Selection Using LPAD\_LPSG\_CFG1 Register
- Table 2-9. VAD Clock Frequency Selection Using LPAD\_LPSG\_CFG1 Register
- Table 2-10. List of VAD Parameters
- Table 2-11. Programmable Coefficient Registers for Initial Learning Period
- Table 2-12. Programmable Coefficient Registers for Hold Over Counter
- Table 2-13. Programmable Coefficient Registers for Wakeup Wait
- Table 2-14. Programmable Coefficient Registers for Threshold

# Trademarks

All trademarks are the property of their respective owners.

# 1 Introduction

The Voice Activity Detection (VAD) algorithm is a voice-triggered system wake-up mechanism. The VAD feature lets the device or system (depending on the application) stay in sleep mode, consuming minimal power, while no voice activity is present. The TAx511x and TAx521x devices can generate an interrupt upon the detection of voice activity. Figure 1-1 shows how the VAD responds to voice activity.



Figure 1-1. VAD Example

The VAD feature is supported on all analog-to-digital converter (ADC) channels of the TAx511x and TAx521x device family, including digital microphone channels, with one channel being monitored at a time. The digital microphone channel is preferred for low-power applications. TI recommends an audio sampling rate of 8kHz for best performance; however, 48kHz is supported if higher fidelity is needed. This application note describes the operation of the VAD, the tunable parameters, and the device configurations required to support VAD.

# **2 Voice Activity Detection**

The VAD algorithm uses decision tree classification for voice activity detection. The VAD block monitors the input signal from the microphone channels for a voice-like profile and, upon detecting a voice activity pattern, triggers an interrupt. The VAD monitors for both the onset and the end of voice activity. Both of these events can be mapped to interrupts.

The TAx511x or TAx521x device also has the capability to automatically power-on and power-off based on the VAD interrupts. As an example, the TAx511x or TAx521x system can be set up to monitor VAD activity on a digital microphone channel and then power on the analog microphone channels based on the VAD trigger. By using the VAD to control the ADC power-on and power-off behavior, power savings can be achieved, compared to leaving the ADC powered on at all times.

The VAD has three modes that determine the behavior of the ADC record path:

1. Auto Mode: Voice triggered VAD interrupt based ADC power on and power off. The ADC record path is active only during voice activity.
2. User Mode: Voice triggered VAD interrupts have to be monitored by the host and the ADC record path is powered on or off through I2C commands.
3. Intermediate Mode: Voice triggered VAD interrupt powers on the ADC record path and the ADC continues recording until powered off through host I2C commands.

Note that in all modes, the device generates an interrupt on the configured pin which can be sent to an external DSP or SOC.

The salient features of VAD are as follows:

- Clock configuration for VAD includes:
  - VAD processing using internal oscillator (Target mode)
  - VAD processing using external clock on BCLK input (Target mode)
  - VAD processing using external clock on MCLK input (Controller mode)
- In external clock configuration, the VAD processing clock frequency can be adjusted to meet the system power demands.
- Automatic switching between VAD mode and record mode: in Auto mode, the system switches from VAD mode to record mode upon voice activity and switches back to VAD mode when voice activity ends.

# 2.1 VAD Configurations

Table 2-1 shows the different modes in which VAD can be operated.

| VAD Configuration               | Function, Description                                                                                                                                                                                                                                 |
|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Modes: User, Auto, Intermediate | Auto Mode: VAD interrupt-based ADC power on and power off.<br>Intermediate Mode: VAD interrupt-based ADC power on and user-initiated ADC power off.<br>User Mode: VAD monitoring is active. ADC record path power on and power off is user initiated. |
| VAD monitoring channel          | Channel assigned for VAD monitoring is configurable                                                                                                                                                                                                   |
| VAD Clock Configurable          | Different clocks can be selected for VAD processing, including external and internal clocks |
| VAD with ADC recording          | This feature decides if voice detection needs to be active when recording is in progress.                                                                                                                                                             |
| VAD Interrupt Pin               | Device can be configured to monitor the VAD interrupt on any general purpose I/O pin or on SDOUT                                                                                                                                                      |

Table 2-1. List of VAD Configuration

#### 2.1.1 User, Auto, Intermediate

VAD can be programmed by the user to be either in user mode, auto mode, or intermediate mode. Note that all VAD modes are only supported with an audio sample rate of 8kHz or 48kHz.

0d = User-initiated ADC power up and ADC power down: This is the user mode in which VAD monitoring is active and ADC power up and power down are initiated by the user.

1d = VAD interrupt-based ADC power up and ADC power down: This is the auto mode in which the ADC is turned ON or OFF automatically based on the interrupt generated by the VAD algorithm.

2d = VAD interrupt-based ADC power up with user-initiated ADC power down: This is an intermediate mode between user and auto modes. A voice triggered VAD interrupt powers on the ADC record path and the ADC continues recording until powered off through register write commands from the host. In this mode, the ADC must receive a register write to power on at the same time the VAD is powered on, but the VAD can keep the ADC powered down until voice is detected.

Table 2-2 shows the power benefits of using the VAD by comparing the current consumption on AVDD across the 3 modes.



#### Table 2-2. VAD Current Consumption

| VAD Mode          | Voice Activity Detected?     | Typical Current (mA) |
|-------------------|------------------------------|----------------------|
| User Mode         | YES - ADC powered ON by host | 6.899                |
| User Mode         | NO - ADC powered OFF by host | 4.744                |
| Auto Mode         | YES - ADC powered ON by VAD  | 6.799                |
| Auto Mode         | NO - ADC powered OFF by VAD  | 4.571                |
| Intermediate Mode | YES - ADC powered ON by VAD  | 6.896                |
| Intermediate Mode | NO - ADC powered OFF by host | 4.575                |

Other conditions: AVDD = 3.3V; Fs = 8kHz; BCLK = 2.048MHz; TDM format; CH1 enabled and monitored; MICBIAS enabled.
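
To put the Table 2-2 figures in context, the following sketch estimates the average AVDD current in Auto mode for a given share of time with voice activity. It assumes that the average is a simple time-weighted mix of the two Auto mode values and ignores any switching overhead; `voice_fraction` is a hypothetical input that does not come from this application note.

```python
# Typical AVDD currents for Auto mode, taken from Table 2-2 (mA).
AUTO_ACTIVE_MA = 6.799  # voice detected, ADC powered on by VAD
AUTO_IDLE_MA = 4.571    # no voice, ADC powered off by VAD


def average_current_ma(voice_fraction: float) -> float:
    """Time-weighted average current (assumes instantaneous mode switching)."""
    if not 0.0 <= voice_fraction <= 1.0:
        raise ValueError("voice_fraction must be between 0 and 1")
    return voice_fraction * AUTO_ACTIVE_MA + (1.0 - voice_fraction) * AUTO_IDLE_MA


if __name__ == "__main__":
    for frac in (0.0, 0.1, 0.5, 1.0):
        print(f"{frac:4.0%} voice -> {average_current_ma(frac):.3f} mA")
```

Under these assumptions, the less often voice is present, the closer the average gets to the ADC-off figure.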

As Table 2-3 shows, VAD mode selection is done using the LPAD\_MODE[1:0] bits (bits 7-6) of the LPAD\_CFG1 register (page = 0x01, address = 0x1E).

Table 2-3. VAD Mode Selection Using LPAD\_CFG1 Register

| Bit | Field          | Type | Reset | Description |
|-----|----------------|------|-------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 7-6 | LPAD_MODE[1:0] | R/W  | 00b   | Auto ADC power up and power down configuration selection.<br>0d = User initiated ADC power-up and ADC power down<br>1d = VAD interrupt based ADC power up and ADC power down<br>2d = VAD interrupt based ADC power up but user initiated ADC power down |

#### 2.1.2 VAD With ADC Recording

This parameter decides whether voice activity is detected while ADC recording is in progress. If this bit is enabled, the VAD algorithm continues running during ADC recording to detect any voice activity. Running the VAD while the ADC is recording is considered High Power Mode.

As Table 2-4 shows, VAD ON during recording selection is done using the LPAD\_PD\_DET\_EN bit (bit 1) of the LPAD\_CFG1 register (page = 0x01, address = 0x1E).

#### Table 2-4. VAD ON During Recording Selection Using LPAD\_CFG1 Register

| Bit | Field         | Type | Reset | Description |
|-----|---------------|------|-------|-------------|
| 1   | VAD_PD_DET_EN | R/W  | 1b    | Enable ASI output data during VAD activity.<br>0d = VAD processing is not enabled during ADC recording<br>1d = VAD processing is enabled during ADC recording and VAD interrupts are generated as configured |

#### 2.1.3 VAD Monitoring Channel


This parameter decides which channel is to be monitored for VAD activity. Only one of the channels can be monitored for VAD activity at a time.

As Table 2-5 shows, VAD channel selection is done using the LPAD\_CH\_SEL[1:0] bits (bits 5-4) of the LPAD\_CFG1 register (page = 0x01, address = 0x1E).

| Bit | Field            | Type | Reset | Description |
|-----|------------------|------|-------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 5-4 | LPAD_CH_SEL[1:0] | R/W  | 10b   | VAD channel select.<br>0d = Channel 1 is monitored for VAD activity<br>1d = Channel 2 is monitored for VAD activity<br>2d = Channel 3 is monitored for VAD activity<br>3d = Channel 4 is monitored for VAD activity |

Table 2-5. VAD Channel Selection Using LPAD\_CFG1 Register



#### 2.1.4 VAD Interrupt Pin

The SDOUT pin can be used as the VAD interrupt output. In that case, a GPOx or GPIOx pin can be configured as the primary ASI output so that ADC recording can run simultaneously with VAD monitoring.

When the SDOUT pin is configured as the VAD interrupt, the SDOUT pin follows the polarity set by the INT\_POL bit (bit 7) of the INT\_CFG register (page = 0x00, address = 0x42).

As Table 2-6 shows, SDOUT as interrupt selection is done using the LPAD\_SDOUT\_INT\_CFG bit (bit 3) of the LPAD\_CFG1 register (page = 0x01, address = 0x1E).

Table 2-6. SDOUT as Interrupt Selection Using LPAD\_CFG1 Register

| Bit | Field         | Type | Reset | Description |
|-----|---------------|------|-------|-------------|
| 3   | SDOUT_INT_CFG | R/W  | 0b    | SDOUT interrupt configuration.<br>0d = SDOUT pin is not enabled for interrupt function<br>1d = SDOUT pin is enabled to support interrupt output when channel data is not being recorded |

#### 2.1.5 MICBIAS Enable During PDM Monitoring

When the channel selected by LPAD\_CH\_SEL[1:0] is a PDM channel, the MICBIAS output can be enabled or disabled (default) during VAD monitoring.

As Table 2-7 shows, the MICBIAS enable during PDM monitoring is controlled using the LPAD\_EN\_MICBIAS bit (bit 2) of the LPAD\_CFG1 register (page = 0x01, address = 0x1E).

#### Table 2-7. MICBIAS Control During VAD Processing on PDM Channels

| Bit | Field           | Type | Reset | Description |
|-----|-----------------|------|-------|-------------|
| 2   | LPAD_EN_MICBIAS | R/W  | 0b    | MICBIAS control during VAD processing on a PDM channel.<br>0d = MICBIAS is disabled during VAD processing on a PDM channel<br>1d = MICBIAS is enabled during VAD processing on a PDM channel |
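
The LPAD\_CFG1 fields described in Sections 2.1.1 through 2.1.5 share a single register byte. As a minimal sketch, the following Python helper packs them into one value. The function and enum names are hypothetical, and bit 0 is assumed to stay at 0 because this application note does not describe it.

```python
from enum import IntEnum


class VadMode(IntEnum):
    USER = 0          # host powers the ADC up and down
    AUTO = 1          # VAD interrupt powers the ADC up and down
    INTERMEDIATE = 2  # VAD powers the ADC up, host powers it down


def lpad_cfg1(mode: VadMode, channel: int, sdout_int: bool = False,
              micbias_on_pdm: bool = False, vad_during_record: bool = True) -> int:
    """Pack the LPAD_CFG1 fields (page 0x01, address 0x1E) into one byte.

    Bit 0 is not covered by this note and is left at 0 (assumption).
    """
    if not 1 <= channel <= 4:
        raise ValueError("channel must be 1..4")
    return (((int(mode) & 0x3) << 6)          # mode select, bits 7-6
            | (((channel - 1) & 0x3) << 4)    # channel select, bits 5-4
            | (int(sdout_int) << 3)           # SDOUT as interrupt, bit 3
            | (int(micbias_on_pdm) << 2)      # MICBIAS on PDM channel, bit 2
            | (int(vad_during_record) << 1))  # VAD during recording, bit 1


if __name__ == "__main__":
    for mode in VadMode:
        print(f"{mode.name:<12} CH1 -> 0x{lpad_cfg1(mode, 1):02X}")
```

With channel 1 monitored and the VAD left running during recording, this yields 0x02, 0x42, and 0x82 for the user, auto, and intermediate modes, which are the values written to register 0x1E in the examples in Section 4.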

#### 2.1.6 VAD Clock Configurability

VAD can run on either the internal oscillator clock or an external clock provided by the user. The external clock can be supplied on either the BCLK pin or the MCLK pin.

As Table 2-8 shows, VAD clock selection is done using the LPAD\_LPSG\_CLK\_CFG[1:0] bits (bits 7-6) of the LPAD\_LPSG\_CFG1 register (page = 0x01, address = 0x20).

If the user selects either 1d or 2d, then the frequency of the external clock is selected using the LPAD\_LPSG\_EXT\_CLK\_CFG[1:0] bits (bits 5-4) of the LPAD\_LPSG\_CFG1 register (page = 0x01, address = 0x20), as shown in Table 2-9.

Table 2-8. VAD Clock Selection Using LPAD\_LPSG\_CFG1 Register

| Bit | Field            | Type | Reset | Description |
|-----|------------------|------|-------|-------------|
| 7-6 | VAD_CLK_CFG[1:0] | R/W  | 00b   | Clock select for VAD<br>0d = VAD processing using internal oscillator clock<br>1d = VAD processing using external clock on BCLK input<br>2d = VAD processing using external clock on MCLK input<br>3d = Custom clock configuration based on MST_CFG, CLK_SRC and CLKGEN_CFG registers in page 0 |

#### Table 2-9. VAD Clock Frequency Selection Using LPAD\_LPSG\_CFG1 Register

| Bit | Field                | Type | Reset | Description |
|-----|----------------------|------|-------|-------------|
| 5-4 | VAD_EXT_CLK_CFG[1:0] | R/W  | 00b   | Clock configuration using external clock for VAD.<br>0d = External clock is 3.072MHz<br>1d = External clock is 6.144MHz<br>2d = External clock is 12.288MHz<br>3d = External clock is 18.432MHz |
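
As an illustration of how the clock source and external clock frequency fields combine, the following Python sketch computes bits 7-4 of LPAD\_LPSG\_CFG1 from the codes listed in Table 2-8 and Table 2-9. The helper name and the string keys are hypothetical. Bits 3-0 are not described in this application note, so the sketch leaves them at 0; a real driver would likely need to preserve them with a read-modify-write.

```python
# External clock frequency codes from Table 2-9 (Hz -> 2-bit code).
EXT_CLK_CODES = {3_072_000: 0, 6_144_000: 1, 12_288_000: 2, 18_432_000: 3}
# Clock source codes from Table 2-8.
CLK_SOURCES = {"internal": 0, "bclk": 1, "mclk": 2, "custom": 3}


def lpad_lpsg_cfg1_clock_bits(source: str, ext_clk_hz: int = 3_072_000) -> int:
    """Return bits 7-4 of LPAD_LPSG_CFG1 (page 0x01, address 0x20)."""
    if source not in CLK_SOURCES:
        raise ValueError(f"unknown clock source: {source!r}")
    value = CLK_SOURCES[source] << 6
    # The frequency field only applies when an external clock is selected.
    if source in ("bclk", "mclk"):
        if ext_clk_hz not in EXT_CLK_CODES:
            raise ValueError(f"unsupported external clock: {ext_clk_hz} Hz")
        value |= EXT_CLK_CODES[ext_clk_hz] << 4
    return value


if __name__ == "__main__":
    print(f"0x{lpad_lpsg_cfg1_clock_bits('bclk', 12_288_000):02X}")
```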




# 2.2 VAD Parameters

Table 2-10 shows the parameters of the VAD algorithm. These parameters reside in the 32-bit wide coefficient memory (pages 0x0D and 0x0E) of the device.

| VAD Parameter                 | Function, Description |
|-------------------------------|-----------------------|
| Initial learning period (ILP) | This is the amount of time the VAD algorithm takes to adjust to the background noise environment, from the time instant VAD is turned on. |
| Hold over counter (HOC)       | The HOC determines how long the VAD interrupt can stay active after voice activity is determined to have ended. This can be lowered so that the interrupt reacts to the space in between words, or increased to make sure an entire sentence is recorded. |
| Wakeup wait (WW)              | If VAD is programmed to be in Auto mode, on detecting voice, VAD automatically turns on the ADC and starts recording, simultaneously also checking for voice activity. Wakeup wait is the amount of time for which VAD is suspended after going to recording mode and thereafter resumed. |
| Threshold (TH)                | Threshold controls the decision boundary of the nodes of the decision tree. A higher value can increase the node thresholds of all the nodes of the decision tree, thus reducing the likelihood of false positives. Similarly, a lower value for the threshold parameter decreases the node thresholds, which reduces the likelihood of false negatives. |

#### Table 2-10. List of VAD Parameters

#### 2.2.1 Initial Learning Period

Initial Learning Period (ILP) is the amount of time the VAD algorithm takes to adjust to the background noise environment, measured from the instant the VAD is turned on. In practical applications, the device is immersed in background noise by default; the VAD interrupt can stay low during the ILP, after which the detector starts voice detection. After the ILP is completed, the device calculates the coefficients required for the algorithm and the VAD starts to work within its designed accuracy limits. The device can continue to calculate coefficients after the ILP is over and improve accuracy as long as the VAD remains powered up in a given environment. The best ILP for a given use case is determined by how dynamic the noise environment is. More dynamic environments can require a longer ILP so that the VAD has enough time to characterize the background noise without triggering false positives. Users and designers are encouraged to test the VAD in the expected noise environment of the system to determine the most appropriate ILP. Equation 1 shows the computation of the VAD\_ILP parameter.

$$\text{Initial learning period (s)} = \frac{ILP_{10}}{256 \times 8000} \tag{1}$$

where

- ILP<sub>10</sub> is the ILP register value in decimal form, interpreted as a signed integer

The default value (0x001F4000) corresponds to 1s. Table 2-11 shows the registers that control the VAD\_ILP parameter.

| Coefficient | Page | Register | Reset Value | Description     |
|-------------|------|----------|-------------|-----------------|
| VAD_ILP     | 0x0D | 0x7C     | 0x00        | ILP Byte[31:24] |
| VAD_ILP     | 0x0D | 0x7D     | 0x1F        | ILP Byte[23:16] |
| VAD_ILP     | 0x0D | 0x7E     | 0x40        | ILP Byte[15:8]  |
| VAD_ILP     | 0x0D | 0x7F     | 0x00        | ILP Byte[7:0]   |

Table 2-11. Programmable Coefficient Registers for Initial Learning Period

#### 2.2.2 Hold Over Counter

On detecting voice activity, the VAD algorithm generates an interrupt. If the interrupt is programmed to be active high, then the interrupt goes high (logic 1) on detecting voice and goes low (logic 0) when there is no voice. The hold over counter sets how long the interrupt stays high after voice activity is determined to have ended, before the interrupt goes low. Equation 2 shows the computation of the VAD\_HOC parameter.

$$\text{Hold over counter (s)} = \frac{HOC_{10}}{256 \times 8000} \tag{2}$$

where

- HOC<sub>10</sub> is the HOC register value in decimal form, interpreted as a signed integer

The default value (0x00032000) corresponds to 100ms. Table 2-12 shows the registers that control the VAD\_HOC parameter.

| Coefficient | Page | Register | Reset Value | Description     |
|-------------|------|----------|-------------|-----------------|
| VAD_HOC     | 0x0E | 0x0C     | 0x00        | HOC Byte[31:24] |
| VAD_HOC     | 0x0E | 0x0D     | 0x03        | HOC Byte[23:16] |
| VAD_HOC     | 0x0E | 0x0E     | 0x20        | HOC Byte[15:8]  |
| VAD_HOC     | 0x0E | 0x0F     | 0x00        | HOC Byte[7:0]   |

#### Table 2-12. Programmable Coefficient Registers for Hold Over Counter

#### 2.2.3 Wakeup Wait

If VAD is programmed to be in auto mode, then on detecting voice the VAD can automatically turn on the ADC and start recording while also continuing to check for voice activity. Wakeup wait is the amount of time for which VAD processing is suspended after the device enters recording mode; VAD processing resumes once this time has elapsed. Equation 3 shows the computation of the VAD\_WW parameter.

$$\text{Wakeup wait (s)} = \frac{WW_{10}}{256 \times 8000} \tag{3}$$

where

- WW<sub>10</sub> is the Wakeup wait register value in decimal form, interpreted as a signed integer

The default value (0x01388000) corresponds to 10s. Table 2-13 shows the registers that control the VAD\_WW parameter.

| Coefficient | Page | Register | Reset Value | Description    |
|-------------|------|----------|-------------|----------------|
| VAD_WW      | 0x0E | 0x08     | 0x01        | WW Byte[31:24] |
| VAD_WW      | 0x0E | 0x09     | 0x38        | WW Byte[23:16] |
| VAD_WW      | 0x0E | 0x0A     | 0x80        | WW Byte[15:8]  |
| VAD_WW      | 0x0E | 0x0B     | 0x00        | WW Byte[7:0]   |

Table 2-13. Programmable Coefficient Registers for Wakeup Wait
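
Equations 1 through 3 share the same scaling, so one conversion routine can serve the ILP, HOC, and WW parameters. The following Python sketch converts a time in seconds to the four coefficient bytes (most significant byte first, matching the byte order of Tables 2-11 through 2-13) and back. The function names are hypothetical, and rounding to the nearest integer is an assumption, since this application note does not specify a rounding rule.

```python
COUNTS_PER_SECOND = 256 * 8000  # denominator of Equations 1 to 3


def seconds_to_coeff_bytes(seconds: float) -> list[int]:
    """Convert a time in seconds to four bytes, Byte[31:24] first."""
    value = round(seconds * COUNTS_PER_SECOND)
    if not 0 <= value <= 0x7FFFFFFF:  # register is a signed 32-bit integer
        raise ValueError("time is out of range for a signed 32-bit coefficient")
    return list(value.to_bytes(4, "big"))


def coeff_bytes_to_seconds(data: list[int]) -> float:
    """Convert four coefficient bytes (Byte[31:24] first) back to seconds."""
    if len(data) != 4:
        raise ValueError("expected exactly four bytes")
    return int.from_bytes(bytes(data), "big", signed=True) / COUNTS_PER_SECOND


if __name__ == "__main__":
    for label, seconds in (("ILP", 2.5), ("HOC", 0.1), ("WW", 5.0)):
        data = " ".join(f"{b:02X}" for b in seconds_to_coeff_bytes(seconds))
        print(f"{label} {seconds} s -> {data}")
```

The reset values in the tables decode to 1s (ILP), 100ms (HOC), and 10s (WW), and the 2.5s and 5s settings encode to 00 4E 20 00 and 00 9C 40 00, the data bytes used in the examples in Section 4.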

#### 2.2.4 Threshold

Threshold (TH) controls the decision boundary of the nodes of the decision tree. A higher value can increase the node thresholds of all the nodes of the decision tree, thus reducing the likelihood of false positives. Similarly, a lower value for the threshold parameter decreases the node thresholds, which reduces the likelihood of false negatives. Thus the threshold parameter can be adjusted to settle at the appropriate balance between false negatives and false positives.

Equation 4 shows the computation of the VAD\_TH parameter.

$$\text{Threshold}_{new} = \text{Threshold}_{default} \times 10^{\frac{thr + 12}{20}} \tag{4}$$

where

- thr is the threshold value in dB (-20dB to 0dB)
- Threshold<sub>default</sub> is the default value in the Threshold register in decimal (4194304)

The default value (4194304) corresponds to -12dB. Table 2-14 shows the registers that control the VAD\_TH parameter.



| Coefficient | Page | Register | Reset Value | Description    |
|-------------|------|----------|-------------|----------------|
| VAD_TH      | 0x0D | 0x5C     | 0x00        | TH Byte[31:24] |
| VAD_TH      | 0x0D | 0x5D     | 0x40        | TH Byte[23:16] |
| VAD_TH      | 0x0D | 0x5E     | 0x00        | TH Byte[15:8]  |
| VAD_TH      | 0x0D | 0x5F     | 0x00        | TH Byte[7:0]   |

#### Table 2-14. Programmable Coefficient Registers for Threshold
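
As a worked instance of Equation 4, the following Python sketch converts a threshold in dB to the four VAD\_TH coefficient bytes. The function name is hypothetical, and rounding to the nearest integer is an assumption.

```python
THRESHOLD_DEFAULT = 4194304  # 0x00400000, corresponds to -12 dB


def threshold_bytes(thr_db: float) -> list[int]:
    """Apply Equation 4 and return four bytes, Byte[31:24] first."""
    if not -20.0 <= thr_db <= 0.0:
        raise ValueError("threshold must be between -20 dB and 0 dB")
    value = round(THRESHOLD_DEFAULT * 10 ** ((thr_db + 12) / 20))
    return list(value.to_bytes(4, "big"))


if __name__ == "__main__":
    for thr in (-12.0, -5.0):
        data = " ".join(f"{b:02X}" for b in threshold_bytes(thr))
        print(f"{thr} dB -> {data}")
```

A threshold of -5dB gives 00 8F 47 35, which matches the threshold data bytes used in the examples in Section 4.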

# **3 VAD Performance Results**

This section discusses the VAD performance. The algorithm performance is given by a Receiver Operating Characteristic (ROC) curve, which describes the detection performance across different operating thresholds (–12dB to –3dB). ROC plots are included for noise scenarios from the Aurora Noise database (Figure 3-1 car, Figure 3-2 restaurant, and Figure 3-3 train) combined with speech signals from the NOIZEUS Speech database. Test vectors are generated by mixing noise and speech signals at the desired signal-to-noise ratio (SNR), which is the separation between the power levels of the speech and noise signals, of 12, 18, and 24dB. For example, 12dB SNR means the noise power level is 12dB below the speech power level. These SNR values were chosen based on common output values of microphones. The data was taken with an 8kHz sampling rate for the best expected performance.
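
For readers who want to build similar test vectors, the following Python sketch scales a noise signal so that the speech power sits a chosen number of dB above the noise power, then sums the two. This only illustrates the SNR definition given above. The exact procedure used to produce the data in this section (for example, how power was measured) is not described here, and the function names are hypothetical.

```python
import math


def mean_power(samples):
    """Average of the squared sample values."""
    return sum(x * x for x in samples) / len(samples)


def noise_gain_for_snr(speech, noise, snr_db):
    """Gain for the noise so that speech power is snr_db above noise power."""
    target_noise_power = mean_power(speech) / (10 ** (snr_db / 10))
    return math.sqrt(target_noise_power / mean_power(noise))


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus scaled noise, sample by sample."""
    gain = noise_gain_for_snr(speech, noise, snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    speech = [1.0, -1.0, 1.0, -1.0]
    noise = [2.0, 2.0, -2.0, -2.0]
    print(mix_at_snr(speech, noise, 12.0))
```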

The ROC plots start with the –12dB threshold at the extreme top left and move towards the right as the threshold is increased. Speech hit rate is the accuracy with which the VAD correctly detects voice when voice is present in the input signal. Non-speech hit rate is the accuracy with which the VAD correctly ignores dynamic movements in the noise signal. A high hit rate for both speech and non-speech indicates the algorithm's ability to correctly detect voice when present and prevent false positives when voice is not present.
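
The two hit rates can be computed from labeled data as in the following Python sketch. It assumes per-frame scoring, with one ground-truth label and one VAD decision per frame; this application note does not state the scoring granularity that was used, and the function name is hypothetical.

```python
def hit_rates(truth, decisions):
    """Return (speech_hit_rate, non_speech_hit_rate).

    truth and decisions are equal-length sequences of booleans,
    where True means speech for that frame.
    """
    if len(truth) != len(decisions):
        raise ValueError("truth and decisions must have the same length")
    speech_frames = sum(1 for t in truth if t)
    non_speech_frames = len(truth) - speech_frames
    if speech_frames == 0 or non_speech_frames == 0:
        raise ValueError("need at least one speech and one non-speech frame")
    # Speech hit: speech present and VAD reported speech.
    speech_hits = sum(1 for t, d in zip(truth, decisions) if t and d)
    # Non-speech hit: no speech present and VAD stayed quiet.
    non_speech_hits = sum(1 for t, d in zip(truth, decisions) if not t and not d)
    return speech_hits / speech_frames, non_speech_hits / non_speech_frames


if __name__ == "__main__":
    truth = [True, True, True, True, False, False, False, False]
    decisions = [True, True, True, False, False, False, True, False]
    print(hit_rates(truth, decisions))
```

Each threshold setting produces one such pair of rates, which is one point on the ROC plot.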



Figure 3-1. Non-Speech Hit Rate vs Speech Hit Rate for Car Noise

Figure 3-2. Non-Speech Hit Rate vs Speech Hit Rate for Restaurant Noise

Figure 3-3. Non-Speech Hit Rate vs Speech Hit Rate for Train Noise

After analyzing the collected data, the –5dB threshold was chosen to give the best speech hit rate and non-speech hit rate across different noise types. Figure 3-4 through Figure 3-7 show the ROC results at the –5dB threshold for the different noise types at 6, 12, 18, and 24dB SNR.



Figure 3-4. Non-Speech Hit Rate vs Speech Hit Rate at –5dB Threshold for 6dB SNR

Figure 3-5. Non-Speech Hit Rate vs Speech Hit Rate at –5dB Threshold for 12dB SNR

Figure 3-6. Non-Speech Hit Rate vs Speech Hit Rate at –5dB Threshold for 18dB SNR

Figure 3-7. Non-Speech Hit Rate vs Speech Hit Rate at –5dB Threshold for 24dB SNR



# 4 Examples

This section presents six examples for configuring the VAD.

**Example 1:** The following example code shows configuration for using the VAD in User Initiated Mode with analog microphone on CH1.

```
# Key: w A0 XX YY ==> write to I2C address 0xA0, to register 0xXX, data 0xYY
#      # ==> comment delimiter
#
# See the corresponding EVM user guide for jumper settings and audio connections.
#
# Power up IOVDD and AVDD power supplies
# Wait for 1ms.
#
w A0 00 00 # Goto Page 0
w A0 01 01 # Software reset
d 10       # Wait for 16 ms
w A0 02 09 # Device wake up, Power up VREF and DREG
w A0 42 A0 # Interrupt asserts on live events, active high
w A0 0A 31 # GPIO is IRQ with Drive Strength Active Low and Active High
w A0 00 01 # Goto Page 1
w A0 33 DF # Unmask VAD Power Up Detect
w A0 1E 02 # User Initiated power up, VAD on CH1
w A0 00 0D # Goto Page D
w A0 0D 00 8F 47 35 # Threshold Value of -5dB
w A0 70 00 4E 20 00 # ILP of 2.5s
w A0 00 00 # Goto Page 0
w A0 76 80 # Enable CH1
w A0 78 A4 # ADC, MICBIAS, and VAD Power On
```
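
The script format used in these examples is simple enough to process on a host. The following Python sketch parses it into a list of operations that a host driver could replay. It rests on two inferences from the listings rather than on a documented format: the argument of `d` is a hexadecimal delay in milliseconds (`d 10` is annotated as a 16 ms wait), and a `w` line with more than one data byte writes consecutive registers starting at 0xXX. The function name is hypothetical.

```python
def parse_script(text):
    """Parse script text into ("write", addr, reg, data) and ("delay_ms", ms) tuples."""
    ops = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        cmd = fields[0]
        args = [int(f, 16) for f in fields[1:]]  # all values are hexadecimal
        if cmd == "w" and len(args) >= 3:
            ops.append(("write", args[0], args[1], args[2:]))
        elif cmd == "d" and len(args) == 1:
            ops.append(("delay_ms", args[0]))
        else:
            raise ValueError(f"unrecognized line: {raw!r}")
    return ops


if __name__ == "__main__":
    script = """
    w A0 00 00 # Goto Page 0
    w A0 01 01 # Software reset
    d 10       # Wait for 16 ms
    """
    for op in parse_script(script):
        print(op)
```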

**Example 2:** The following example code shows configuration for using the VAD in Automatic Power Up and Power Down Mode with analog microphone on CH1.

```
# Key: w A0 XX YY ==> write to I2C address 0xA0, to register 0xXX, data 0xYY
#      # ==> comment delimiter
#
# See the corresponding EVM user guide for jumper settings and audio connections.
#
# Power up IOVDD and AVDD power supplies
# Wait for 1ms.
w A0 00 00 # Goto Page 0
w A0 01 01 # Software reset
d 10       # Wait for 16 ms
w A0 02 01 # Device wake up, Power up only DREG, VAD will power up VREF
w A0 42 A0 # Interrupt asserts on live events, active high
w A0 0A 31 # GPIO is IRQ with Drive Strength Active Low and Active High
w A0 00 01 # Goto Page 1
w A0 33 DF # Unmask VAD Power Up Detect
w A0 1E 42 # Automatic Powerup/down, VAD on CH1
w A0 00 0D # Goto Page D
w A0 0D 00 8F 47 35 # Threshold Value of -5dB
w A0 70 00 4E 20 00 # ILP of 2.5s
w A0 00 0E # Goto Page E
w A0 08 00 9C 40 00 # Set wakeup wait to 5s
w A0 00 00 # Goto Page 0
w A0 76 80 # Enable CH1
w A0 78 24 # VAD, MICBIAS Power On
```

**Example 3:** The following example code shows configuration for using the VAD in Automatic Power Up with User-Initiated Power Down Mode with analog microphone on CH1.

```
# Key: w A0 XX YY ==> write to I2C address 0xA0, to register 0xXX, data 0xYY
                # ==> comment delimiter
# See the corresponding EVM user guide for jumper settings and audio connections.
**********************
#
#
# Power up IOVDD and AVDD power supplies
# Wait for 1ms.
w A0 00 00 # Goto Page 0
w A0 01 01 # Software reset
           # Wait for 16 ms
d 10
w A0 02 01 # Device wake up, Power up only DREG, VAD will power up VREF
w A0 42 A0 # Interrupt asserts on live events, active high
w A0 0A 31 # GPIO is IRQ w Drive Strength Active Low and Active High
w A0 00 01 # Goto Page 1
w A0 33 DF # Unmask VAD Power Up Detect
w A0 1E 82 # VAD Automatic Powerup/User Initiated Powerdown, VAD on CH1
w A0 00 0D # Goto Page D
w A0 0D 00 8F 47 35 # Threshold Value of -5dB
w A0 70 00 4E 20 00 # ILP of 2.5s
w A0 00 00 # Goto Page 0
w A0 76 80 # Enable CH1
w A0 78 A4 # ADC, VAD, MICBIAS Power On
```

**Example 4:** The following example code shows the configuration for using the VAD in User Initiated Mode with digital microphone on CH1.

```
# Key: w A0 XX YY ==> write to I2C address 0xA0, to register 0xXX, data 0xYY
               # ==> comment delimiter
# Power up IOVDD and AVDD power supplies
# Wait for 1ms.
w A0 00 00 # Goto Page 0
w A0 01 01 # Software reset
d 10       # Wait for 16 ms
w A0 02 09 # Device wake up, VREF and DREG Powered Up
w A0 0C 41 # GPO1 is PDM Clock output, Active Low and Active High
w A0 0D 02 # GPI1 is enabled
w A0 13 EC # GPI1 is PDM Data Input CH1 and 2
w A0 42 A0 # Interrupt asserts on live events, active high
w A0 0A 31 # GPIO is IRQ w Drive Strength Active Low and Active High
w A0 00 01 # Goto Page 1
w A0 33 DF # Unmask VAD Power Up Detect
w A0 1E 02 # User Initiated power up, VAD on CH1
#
w A0 00 0D # Goto Page D
w A0 0D 00 8F 47 35 # Threshold Value of -5dB
w A0 70 00 4E 20 00 # ILP of 2.5s
w A0 00 00 # Goto Page 0
w A0 76 80 # Enable CH1
w A0 78 A4 # ADC, MICBIAS, and VAD Power On
```



**Example 5:** The following example code shows the configuration for using the VAD in Automatic Power Up and Power Down mode with digital microphone on CH1.

```
# Key: w A0 XX YY ==> write to I2C address 0xA0, to register 0xXX, data 0xYY
#      # ==> comment delimiter
# See the corresponding EVM user guide for jumper settings and audio connections.
# *****************
#
#
# Power up IOVDD and AVDD power supplies
# Wait for 1ms.
w A0 00 00 # Goto Page 0
w A0 01 01 # Software reset
d 10 # Wait for 16 ms
w A0 02 09 # Device wake up, VREF and DREG Powered Up
w A0 0C 41 # GPO1 is PDM Clock output, Active Low and Active High
w A0 0D 02 # GPI1 is enabled
w A0 13 EC # GPI1 is PDM Data Input CH1 and 2
w A0 42 A0 # Interrupt asserts on live events, active high
w A0 0A 31 # GPIO is IRQ w Drive Strength Active Low and Active High
w A0 00 01 # Goto Page 1
w A0 33 DF # Unmask VAD Power Up Detect
w A0 1E 42 # VAD Automatic Power up/down, VAD on CH1
w A0 00 0D # Goto Page D
w A0 0D 00 8F 47 35 # Threshold Value of -5dB
w A0 70 00 4E 20 00 # ILP of 2.5s
w A0 00 0E # Goto Page E
w A0 08 00 9C 40 00 # Set wakeup wait to 5s
w A0 00 00 # Goto Page 0
w A0 76 80 # Enable CH1
w A0 78 24 # VAD, MICBIAS Power On
```

**Example 6:** The following example code shows the configuration for using the VAD in Automatic Power Up with User-Initiated Power Down Mode with digital microphone on CH1.

```
# Key: w A0 XX YY ==> write to I2C address 0xA0, to register 0xXX, data 0xYY
#      # ==> comment delimiter
# Power up IOVDD and AVDD power supplies
# Wait for 1ms.
w A0 00 00 # Goto Page 0
w A0 01 01 # Software reset
d 10 # Wait for 16 ms
w A0 02 09 # Device wake up, VREF and DREG Powered Up
w A0 0C 41 # GPO1 is PDM Clock output, Active Low and Active High
w A0 0D 02 # GPI1 is enabled
w A0 13 EC # GPI1 is PDM Data Input CH1 and 2
w A0 42 A0 # Interrupt asserts on live events, active high
w A0 0A 31 # GPIO is IRQ w Drive Strength Active Low and Active High
w A0 00 01 # Goto Page 1
w A0 33 DF # Unmask VAD Power Up Detect
w A0 1E 82 # VAD Automatic Powerup/User Initiated Powerdown, VAD on CH1
#
w A0 00 0D # Goto Page D
w A0 0D 00 8F 47 35 # Threshold Value of -5dB
w A0 70 00 4E 20 00 # ILP of 2.5s
#
w A0 00 00 # Goto Page 0
w A0 76 80 # Enable CH1
w A0 78 A4 # ADC, VAD, MICBIAS Power On
```
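
The six examples share most of their writes and differ in only a few places: the wake-up value written to register 0x02, the three PDM-related GPIO writes for a digital microphone, the mode value written to register 0x1E, the extra wake-up wait on page E, and the final power-up value written to register 0x78. The Python sketch below makes those differences explicit by rebuilding each sequence from two choices. It is only a summary of the values that appear in Examples 1 to 6, for CH1, with the -5dB threshold and 2.5s ILP used there. The function name `build_vad_script` and the mode labels are hypothetical, and the device data sheet remains the authority for any other setting.

```python
from typing import List

# Values written to register 0x1E on page 1 in Examples 1 to 6.
MODE_VALUE = {"user": 0x02, "auto": 0x42, "intermediate": 0x82}


def build_vad_script(mode: str, mic: str) -> List[str]:
    """Return the write sequence of the matching example (CH1 only)."""
    if mode not in MODE_VALUE or mic not in ("analog", "digital"):
        raise ValueError("unsupported mode or microphone type")
    # Examples 2 and 3 (analog microphone, automatic power up) write 0x01 so
    # that the VAD powers up VREF; every other example writes 0x09.
    wake = 0x01 if mic == "analog" and mode != "user" else 0x09
    # Examples 2 and 5 (automatic power up and down) write 0x24 to register
    # 0x78; the other examples write 0xA4, which also powers on the ADC.
    power = 0x24 if mode == "auto" else 0xA4

    lines = ["w A0 00 00", "w A0 01 01", "d 10", f"w A0 02 {wake:02X}"]
    if mic == "digital":
        # PDM clock output and PDM data input, as in Examples 4 to 6.
        lines += ["w A0 0C 41", "w A0 0D 02", "w A0 13 EC"]
    lines += [
        "w A0 42 A0", "w A0 0A 31", "w A0 00 01", "w A0 33 DF",
        f"w A0 1E {MODE_VALUE[mode]:02X}",
        "w A0 00 0D", "w A0 0D 00 8F 47 35", "w A0 70 00 4E 20 00",
    ]
    if mode == "auto":
        # Wake-up wait of 5s on page E, present only in Examples 2 and 5.
        lines += ["w A0 00 0E", "w A0 08 00 9C 40 00"]
    lines += ["w A0 00 00", "w A0 76 80", f"w A0 78 {power:02X}"]
    return lines


if __name__ == "__main__":
    print("\n".join(build_vad_script("auto", "digital")))
```

Calling `build_vad_script("auto", "digital")` reproduces the writes of Example 5, and `build_vad_script("intermediate", "analog")` reproduces those of Example 3.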



# 5 Summary

The Voice Activity Detection (VAD) algorithm is a voice-triggered system wake-up mechanism. It lets a device or system remain in sleep mode, consuming minimal power, until voice is detected and recording can begin.

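The VAD has three modes of operation, chosen according to the needs of the system: user-initiated mode, automatic mode (automatic power up and power down), and intermediate mode (automatic power up with user-initiated power down). These modes control how the ADC recording path powers up and powers down in response to voice activity.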
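
In the examples of this application note, the three modes correspond to three values written to register 0x1E on page 1: 0x02, 0x42, and 0x82. The short Python sketch below splits each value into its upper two bits and lower six bits to show that the values differ only in bits 7:6. This field split is an inference from the example values alone, not a statement of the register map, and the helper name `split_fields` is hypothetical.

```python
# Register 0x1E (page 1) values taken from Examples 1 to 6 of this note.
EXAMPLE_VALUES = {
    "user-initiated": 0x02,
    "automatic power up and power down": 0x42,
    "automatic power up, user-initiated power down": 0x82,
}


def split_fields(value: int):
    """Split a byte into bits 7:6 and bits 5:0 (assumed field boundary)."""
    return value >> 6, value & 0x3F


if __name__ == "__main__":
    for name, value in EXAMPLE_VALUES.items():
        upper, lower = split_fields(value)
        print(f"{name}: 0x{value:02X} -> bits[7:6]={upper}, bits[5:0]=0x{lower:02X}")
```

The lower six bits are 0x02 in all three cases, which matches the "VAD on CH1" comment that accompanies each of these writes in the examples.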

The VAD feature is supported on all analog-to-digital converter (ADC) channels of the TAx511x and TAx521x device family, including digital microphone channels, with one channel being monitored at a time.

# 6 References

- Texas Instruments, *TLV320ADC6120 2-Channel, 768-kHz, Burr-Brown™ Audio ADC*, data sheet.
- Texas Instruments, *TLV320ADC6120 stereo-channel, 768-kHz, Burr-Brown™ audio ADC with 106-dB SNR*, evaluation module.
- Texas Instruments, *TLV320ADC5120 2-Channel, 768-kHz, Burr-Brown™ Audio ADC*, data sheet.
- Texas Instruments, *TLV320ADC5120 stereo-channel, 768-kHz, Burr-Brown™ audio ADC with 106-dB SNR*, evaluation module.
- Texas Instruments, *TLV320ADC3120 2-Channel, 768-kHz, Burr-Brown™ Audio ADC*, data sheet.
- Texas Instruments, *TLV320ADC3120 stereo-channel, 768-kHz, Burr-Brown™ audio ADC with 106-dB SNR*, evaluation module.
- Texas Instruments, *PCM6120-Q1 2-Channel, 768-kHz, Burr-Brown™ Audio ADC*, data sheet.
- Texas Instruments, *PCM5120-Q1 2-Channel, 768-kHz, Burr-Brown™ Audio ADC*, data sheet.
- Texas Instruments, *ADCx120EVM-PDK*, user's guide.
- Texas Instruments, *PurePath™ Console*.

Mailing Address: Texas Instruments, Post Office Box 655303, Dallas, Texas 75265 Copyright © 2024, Texas Instruments Incorporated