Voice Processing Tools and Software for K2G and C5517 Designs
This training addresses some of the basic concepts associated with the voice recognition processing technique known as beamforming, as well as an introduction to the design tools, hardware, and software that are deployed in the real-time beamforming demonstrations available from TI for the EVMK2G and TMDSEVM5517 development platforms.
[MUSIC PLAYING]
Hi. In this section, we will cover some of the beamforming concepts, design tools, hardware, and software. All of these topics are relevant to the real-time beamforming demos available from TI for the K2G and C5517 platforms.
Acoustic beamforming is a technique employed in audio systems to weed out unwanted sound sources and listen to the source that is of importance to the system. The concepts of beamforming have their origins in military applications, such as radar. Today we see beamforming in audio applications, such as the famous Amazon Echo and conference call systems.
The picture you see on the far right is that of an array that can be mounted on the ceiling of a conference room to help capture audio from anywhere in the room during meetings. Beamforming is also creeping into other consumer appliances, such as thermostats. The physics of beamforming aren't all that complicated to understand. If you have some basic understanding of how audio wave fronts behave, it should make a lot of sense.
A typical beamformer consists of a mic array, a steering delay stage, and a summation stage. Wave fronts that approach the array will hit each mic at slightly different times. The trick in the beamformer is to take these minuscule time deltas and phase shifts into consideration and align the signals to produce an output with the highest energy.
By focusing on one specific angle, you can ignore sound from other angles. The key here is the set of beamforming coefficients that are configured in the code for each virtual mic angle. Depending on the number of physical mics and the array geometry of the design, these values can be manipulated as desired. The output of the beamformer can be fed into the multi-source selection (MSS) module, which will choose the virtual microphone with the highest energy. This chosen microphone can then be fed into a voice recognition engine, piped to the internet, or fed into some other audio-processing algorithm.
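To make the delay-and-sum idea concrete, here is a minimal sketch in C. This is not the TI demo source: the names, frame size, and integer sample delays are illustrative assumptions, and the real demos use the GUI-generated filter coefficients rather than raw sample delays.

```c
#define NUM_MICS    8    /* physical mics in the array (illustrative)   */
#define NUM_ANGLES  12   /* virtual mic angles (illustrative)           */
#define FRAME_LEN   256  /* samples per processing frame (illustrative) */

/* Delay-and-sum beamformer: for each virtual angle, shift every mic's
 * signal by its precomputed steering delay and sum the aligned samples.
 * Samples falling before the start of the frame are simply skipped here. */
static void beamform(const short mic[NUM_MICS][FRAME_LEN],
                     const int delay[NUM_ANGLES][NUM_MICS],
                     float out[NUM_ANGLES][FRAME_LEN])
{
    for (int a = 0; a < NUM_ANGLES; a++) {
        for (int n = 0; n < FRAME_LEN; n++) {
            float sum = 0.0f;
            for (int m = 0; m < NUM_MICS; m++) {
                int idx = n - delay[a][m];   /* apply the steering delay */
                if (idx >= 0)
                    sum += (float)mic[m][idx];
            }
            out[a][n] = sum / NUM_MICS;      /* normalize the summation  */
        }
    }
}

/* Multi-source selection: pick the virtual mic with the highest energy. */
static int select_virtual_mic(const float out[NUM_ANGLES][FRAME_LEN])
{
    int best = 0;
    float best_energy = 0.0f;
    for (int a = 0; a < NUM_ANGLES; a++) {
        float e = 0.0f;
        for (int n = 0; n < FRAME_LEN; n++)
            e += out[a][n] * out[a][n];
        if (e > best_energy) { best_energy = e; best = a; }
    }
    return best;
}
```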
In order to help get started with the design and implementation of a microphone array, let's take a closer look at some of the design tools available. TI provides a design GUI to help generate the beamformer coefficients based on the array design. The GUI is part of the AER package and can be found at the path in the slide. A link to the AER package download is also provided at the end of this presentation.
The tool currently only supports 8 or 16 kilohertz audio sampling rates. You can specify the number of microphones, with the default being four. There must be at least two microphones in the array. And for rectangular geometry, the mic number must be four.
The supported geometries are linear, cross, rectangular, or circular. The contour and polar frequency fields are parameters you can specify for the frequency response graph that will be generated when you hit the Design button on the tool. If you hit the Design and Save button, a text file containing the coefficients will be saved locally, which can then be copied into your source code. We will look at how to use these coefficients next.
Once the beamforming GUI has generated the needed coefficients, what do you do with them? They will need to be inserted into the sysbfflt.c file in the example code. There are macros in the code that depend on the number of microphones being used, so ensure that the coefficients are inserted into the right area.
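As a rough illustration of what that insertion looks like, the sketch below uses macro guards keyed to the mic count. The actual macro and array names in sysbfflt.c may differ, and the values shown are placeholders, not real coefficients.

```c
/* Illustrative sketch only: paste the GUI-generated values into the
 * block that matches your mic count. Names and layout are assumptions. */
#define NUM_OF_MICS 8

#if (NUM_OF_MICS == 8)
const int bf_coeff[] = {
    0, 0, 0, 0   /* placeholders: replace with the Design and Save output */
};
#elif (NUM_OF_MICS == 4)
const int bf_coeff[] = {
    0, 0, 0, 0   /* placeholders for the 4-mic coefficient set */
};
#endif
```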
Let's now switch gears to discuss the hardware used in the voice processing collateral. The Circular Mic Board, or CMB, is an eight-mic circular array with seven mics on the periphery and one center reference mic. It is designed with Knowles analog mics, although digital mics can also be used in a design if desired.
The mics feed into two PCM1864 ADCs, which are mounted on the underside of the array. They output audio using the I2S protocol. These I2S lines can be interfaced to a DSP as seen fit.
There is a four-mic linear version, called the Linear Mic Board, or LMB, in the works, and it should be available shortly. This would be ideal for solutions that require fewer mics, especially on the C5517. We also have an .STL 3D printer file so that anyone can 3D print a small stand-off for the CMB PCB. This provides a nice, stable platform for the CMB when running the audio preprocessing demos.
The three-pronged approach to the voice preprocessing collateral development has been to enable this on K2G, C5517, and C674X. The TI designs for K2G and C5517 are due to be released soon, with C674X to follow. The C5517 is limited to only six mics, as there are only three I2S lines available on this device. This would still suffice for a lot of applications, since C55X devices are targeted for smaller solutions.
The TI designs being developed are created around a demo, as illustrated here. The audio inputs are taken into either a K2G or C5517 EVM from the CMB or LMB. The audio is then processed using various algorithms, such as beamforming, ASNR, and MSS.
The output of the MSS, which contains the signal with the highest energy, is sent out through the headphone jack on the EVM. Clean audio which has gone through processing can be heard in the left ear. Unclean raw audio from the center reference mic on the CMB is fed into the right headphone. This provides a great way for customers to distinguish the impact of processing on the audio signal.
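A minimal sketch of that left/right split, assuming interleaved 16-bit stereo output; the function and buffer names are illustrative, not the demo's actual code. Swapping the two assignments is all it takes to flip the channels, which the software allows as noted later.

```c
/* Pack processed audio into the left channel and the raw center-mic
 * signal into the right channel of an interleaved stereo buffer. */
static void pack_stereo(const short *processed, const short *raw,
                        short *stereo, int frame_len)
{
    for (int n = 0; n < frame_len; n++) {
        stereo[2 * n]     = processed[n];  /* left  = cleaned signal  */
        stereo[2 * n + 1] = raw[n];        /* right = unprocessed mic */
    }
}
```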
The best way to analyze the audio is to connect the headphone output via line-in to a PC. Software such as Adobe Audition or Audacity can then be used to better analyze the audio quality.
Here are the connection details between the K2G EVM and the Circular Mic Board. Please pay close attention to the jumper settings underneath the CMB. It is a tight fit to squeeze in the cables, so also take care to ensure wires aren't crossed.
Similar to the K2G setup, here are the connection details between the C5517 EVM and the CMB. Ensure that the SW4 DIP switch and jumper settings are correctly set on the C5517 EVM. Please also note that this setup only uses six mics on the CMB due to the limitation on the number of I2S lines on the C5517 EVM.
We will now take a closer look at the software environment for the audio preprocessing demos, starting with the software packages that contain them. The K2G Processor SDK RTOS package includes the K2G audio preprocessing demo, starting with version 3.03.
The same applies to the C55x CSL package, where the C5517 audio preprocessing demo is included starting with CSL version 3.07. Both these demos can be found at the path specified here. Note that there are various dependent components that are part of the AER and VOLIB packages which would need to be included in the build. We'll cover the setup of the build environment later.
Both the K2G and C5517 packages contain a real-time demo that takes in audio from the CMB, runs beamforming, ASNR, MSS, and DRC, and then outputs the processed audio via the headphone jack. There is also a File IO demo that takes in audio data from a file and writes the processed data back out to a file. This demo is ideal for simulating the audio preprocessing without external hardware. Please note that the File IO demo only applies to the K2G platform.
The loopback demos in both packages are typically used to check the function of the individual microphones, two at a time. The selected microphones for the loopback can be manipulated in the code.
The audio preprocessing software contains a few software knobs that enable the user to manipulate various tuning parameters depending on the end application. ASNR parameters, such as signal delay, can be changed in the code. The filter coefficients, as discussed earlier, can also be modified.
An important feature in the audio preprocessing software is the ability to manipulate the number of microphones in the setup. This feature is especially important for customers wanting to quickly scale their end application depending on the number of microphones. The processed and unprocessed audio can also be analyzed individually by listening to either the left or right channels. The channels can also be flipped around as desired in the software.
Loopback mode in the demos is a great way to test the individual microphones in the array. This, coupled with the ability to manipulate the codec gain settings, provides a great way to adjust microphone settings as needed while bypassing the preprocessing algorithms. As discussed earlier, the number of microphones in the system can be easily changed in the codec_pcm186x.h file by manipulating the NUM_OF_MICS macro. Note that the C5517 only supports six mics in total due to the limitation in I2S lines.
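For reference, that scaling is a one-line macro edit. The value shown here is just an example; check the header in your installed package for the exact surrounding code.

```c
/* In codec_pcm186x.h: set the number of active mics for the demo.
 * Six is the maximum on the C5517; the CMB supports up to eight on K2G. */
#define NUM_OF_MICS  6
```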
We will now take a look at how to build the real-time demos for the K2G and C5517 on a Windows machine. The K2G demo is built using gmake on the command line. The C5517 demo is built in CCS by importing projects into the CCS workspace. We will cover what this looks like later.
For the K2G demo, ensure that the Processor SDK RTOS package components, such as the PDK, are registered in CCS. We'll not cover Processor SDK-related environment setup, since it is not in the scope of this segment. However, please see the link at the end of this presentation on setting up and building the Processor SDK RTOS package. The pdksetupenv.bat file would typically need to be modified if CCS and the Processor SDK are not installed at C:\ti.
In order to successfully build the demos on both platforms, it's essential to have the dependent packages installed at the correct locations. Here we are looking at the directory structure when AER and VOLIB are installed at the correct location alongside the Processor SDK RTOS packages. The same applies to the C5517 CSL package. Ensure that the AER, VOLIB, BIOS, XDAIS, and XDC tools are installed alongside the CSL root directory.
Let's take a look at how to build and run the demos on both K2G and C5517. In this segment, we will take a closer look at the K2G real-time audio preprocessing demo. We will assume that the K2G Processor SDK RTOS package and CCS Version 6.1.3 are installed at C:ti. Also ensure the AER and VOLIB packages are installed at the correct locations, as covered previously.
Navigate to the Processor SDK RTOS package, as seen here, and run the setenv.bat file. This will set up the build tools environment. Next, navigate to the build directory of the real-time demo. Do a gmake clean, followed by a gmake all.
At this point, the demo build process will begin. Please note that this build can take several minutes. So let's cut to the chase for the sake of time. Once the build completes, the K2G beamforming real-time binary will be populated in the build directory, as seen here. This binary can now be run on the C66X core on the K2G EVM.
Assuming that the CMB is connected, let's see how you would load and run the demo. We will not cover CCS setup content in this segment. However, please see the Getting Started with CCS link at the end of this video for more information.
In CCS, launch a target configuration for the K2G EVM. Connect to the C66x core, ensuring that the GEL file is loaded on the core. Navigate to the location of the K2G BFRT.OUT binary we built previously, and load it on the core. CCS symbols will also get loaded, which will allow for setting breakpoints in the code to further examine the behavior of the demo.
Hit Resume, and at this point the demo will start running. In this case, we have the line out from the EVM connected to the PC line in. We will launch Audacity and see the audio output from the demo.
Here we can see the left channel with the clean audio and the right channel with the unclean audio. This can be recorded and split into left and right channels in order to listen to each one separately. This provides a great way to analyze the processed and unprocessed audio quality.
We will now take a look at building the real-time beamforming demo for the C5517 platform. Download and install the C55X CSL package from the link mentioned at the end of this video. We will assume that the CSL package is installed at the default C55X-LP location. Prior to building the demo, ensure that the AER and VOLIB packages are installed at the correct locations as covered in the previous section.
In the CCS workspace, go to Project > Import CCS Projects. Now navigate to the CSL root folder and select the real-time demo and the dependent projects, ATFS BIOS DRV LIB and C55X_CSL_LP. Ensure that the CSL_GENERAL.H file in the C55X_CSL_LP INC folder has the correct platform macro defined for the C5517 EVM. Prior to building the demo, ensure that the C55X_CSL_LP libraries are rebuilt with the correct platform macro.
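For example, the platform selection in CSL_GENERAL.H looks roughly like the excerpt below. Treat the macro names as hypothetical; the exact names vary by CSL version, so match whatever your copy of the header defines.

```c
/* Hypothetical excerpt of CSL_GENERAL.H: exactly one platform macro
 * should be defined; the real names may differ in your CSL version. */
/* #define CHIP_C5505_C5515 */
#define CHIP_C5517           /* select the C5517 EVM */
```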
That's it. You're now all set to build the demo. Right-click on the BFRT BIOS project, and hit Build Project.
Uh-oh. Why are we seeing this compilation error? Ah, I see. Silly me. I did not install the dependent packages, such as AER and the others covered previously, alongside the CSL installation.
Let's fix that now by copying over the relevant packages. That should do the trick. Let's rebuild the project. OK. There we go. It built successfully.
Let's launch a C5517 target configuration and connect to the core with the GEL file loaded. Next, load the BFRT BIOS.OUT binary on the core and hit Resume. At this point, the demo will start running and will be putting out the processed and unprocessed audio via the P9 headphone jack on the C5517 EVM. The audio can be analyzed in the same manner as seen for the K2G demo.
This completes the segment which covered the basics of beamforming, design tools, hardware, and software used to demonstrate the real-time audio preprocessing on the K2G and C5517 platforms. If you have any questions on running these demos, please use TI's E2E forums for assistance. Thank you for watching.