
Custom Build VUI On-device Using VoiceHub

用户6026865 · Published 2023-03-03

Silicon Labs just announced the availability of Sensory’s TrulyHandsfree solution working on the EFR32 Series 1 and Series 2 families.

Along with that announcement comes an example that lets you experience the power of their technology firsthand.

Sensory’s machine learning technology is pretty impressive, and they have a long list of OEMs that agree.

In this blog I’m going to discuss that Sensory example. There’s a lot you can do with it, but what’s really cool is that it lets you test your own wake word just by typing it in!

Typically, with machine learning, you must create a data set, train your model, make sure it fits on the target device, and convert it.

Sensory’s VoiceHub is an ML Solutions tool that doesn’t require the developer to do any of these steps. Instead, you just type in the wake word(s) you’d like to try, and Sensory creates a machine learning library for you to integrate directly into this example.

The example is available in the Silicon Labs GitHub repository: https://github.com/SiliconLabs/machine_learning_applications/tree/main/voice/sensory_wakeupword

The README explains three use cases supported by this example:

  • Directly downloading the prebuilt binary.
  • Rebuilding the example to use one of the other pre-defined wake-word libraries.
  • Creating your own wake-word using Sensory’s VoiceHub and integrating the resulting library into the example.

The first two use cases are clearly described in the README, so I won’t spend much time on those – other than suggesting you try them out in your noisy environments. The performance is pretty impressive!

What’s more interesting is that we have worked with Sensory so you can go to VoiceHub and create your own wake-word library to integrate and test with this example.

To create the library, all you need to do is type in the wake word! And you can choose any one of 23 different languages to recognize!

(The example does not support voice commands, but any good embedded programmer has enough information in the example and library to make voice commands work.)

Being able to create your own wake-word library allows you to test the Sensory solution for your specific application.

As a developer, one of your key challenges is finding the right balance of accuracy and memory footprint once you include all the other parts of your application, like the wireless communications stack. This example allows you to check whether your voice-activated wireless application is viable on the Silicon Labs part.

Create Your Own Sensory THF Wake Word Example Library

This is how easy it is to create your own wake-word library. Go to sensory.com/voicehub and request login credentials. Sensory has a EULA you must agree to before they grant you access to VoiceHub.

Once you are logged into VoiceHub, create a WakeWord project.

You will be presented with a project page with some choices to make:

Initial Size

The amount of RAM on the chip you are using determines what you should choose. The xG24 Dev Kit and the Thunderboard Sense 2 both have chips (MG24 and MG12, respectively) with 256KB of RAM, so you should choose an Initial Size under 200KB. There are three choices: 80KB, 147KB, and 191KB.

Which value you should choose depends primarily on what wireless stacks your application uses.

For a rough estimate, you can assume a single stack like BLE or Zigbee requires about 30KB of RAM, which would allow you to try an initial size of up to 191KB. If you are using multiprotocol (like BLE and Zigbee), the RAM requirements are closer to 60KB, which suggests an initial size of up to 147KB. At the time of this writing the Matter stack is quite a heavy user of RAM, around 128KB, so with Matter you would choose an Initial Size of 80KB. A rough budgeting sketch follows below.
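
To make the budgeting concrete, here is a minimal sketch (my own illustration, not part of the example) of picking the largest Initial Size that fits a given budget. The headroom figures are assumptions; measure your own application’s usage before committing.

```c
#include <stdio.h>

/* Pick the largest VoiceHub "Initial Size" option that still fits after
 * reserving RAM for the wireless stack and some headroom for your own
 * application data. Stack figures follow the rough estimates above. */
static int pick_initial_size_kb(int total_ram_kb, int stack_ram_kb,
                                int app_headroom_kb)
{
  static const int options_kb[] = { 191, 147, 80 }; /* VoiceHub choices */
  int budget_kb = total_ram_kb - stack_ram_kb - app_headroom_kb;

  for (unsigned i = 0; i < sizeof options_kb / sizeof options_kb[0]; i++) {
    if (options_kb[i] <= budget_kb) {
      return options_kb[i];
    }
  }
  return -1; /* no option fits */
}

int main(void)
{
  /* 256KB part, single BLE stack (~30KB), ~35KB assumed app headroom */
  printf("Single protocol: %dKB\n", pick_initial_size_kb(256, 30, 35));
  /* Multiprotocol (~60KB) with ~49KB headroom lands on the 147KB option */
  printf("Multiprotocol:   %dKB\n", pick_initial_size_kb(256, 60, 49));
  /* Matter (~128KB) pushes you down to the 80KB option */
  printf("Matter:          %dKB\n", pick_initial_size_kb(256, 128, 40));
  return 0;
}
```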

A smaller size can translate to less accuracy, which could mean worse performance in noisy environments or more false positives.

Even if you have the space, it’s best to be memory efficient and find the smallest size that works for your application in its real-world environment.

Output Format

Choose “DSP Silicon Labs” and leave Operating Points set to the default “Single Point”.

Language

You can pick any one of the 23 languages and dialects. Remember to click the play button to listen to your typed wake word(s).

Type in 1 to 4 wake words.

These wake words map directly to control of the LEDs, as described in the README section Model behaviour; a rough sketch of the idea follows below.
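
The README defines the actual mapping; purely as an illustration of the idea, here is a hypothetical handler using the GSDK’s simple LED driver. The callback name and word IDs are my assumptions, not the example’s real code.

```c
#include "sl_simple_led_instances.h" /* GSDK simple LED driver instances */

/* Hypothetical callback: the example's real callback and the exact
 * word-ID-to-LED mapping live in the code referenced by the README's
 * "Model behaviour" section. IDs here follow the order the words were
 * typed into VoiceHub. */
static void on_wake_word_detected(int word_id)
{
  switch (word_id) {
    case 1: sl_led_toggle(&sl_led_led0); break; /* first typed wake word */
    case 2: sl_led_toggle(&sl_led_led1); break; /* second typed wake word */
    default: break;                             /* other IDs: no LED action */
  }
}
```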

Build your new wake word library by clicking on the “Build” button.

Building your wake word library can take time (several hours, or longer depending on the queue). Once the library is built, you will get an email indicating it is ready for download.

Return to the VoiceHub project and download your model:

The downloaded model takes the form of a small set of source files. You will need to move these files into the Studio project.

The README section Manually adding your model explains how to copy these files into the project; I’ll walk through that step below, once the project exists.

Oh yeah, you need the Studio project first. Here’s how to set it up from the example’s Git repo before you follow the steps in the previous paragraph.

Setting up the GitHub Sensory Example in Studio

Setting up the example as an SDK extension will allow you to perform any of the three use cases described at the beginning of this blog: run the pre-built binary, modify the example to use another pre-defined wake word library, or integrate a wake word library you created on VoiceHub into the example application.

The GitHub example was designed to behave like an SDK extension so the pre-built binary shows up as a demo in Studio.

If you set it up as an external repo instead, you would still have access to the example project, but you would not see the pre-built demo in the Studio example interface.

I’ll spend my time discussing the steps to integrate the wake word library you built in the earlier steps. All these instructions are in the README as well, but I’ll expand on those notes with a few pictures to help.

Follow the README instructions to clone the git repo into your local machine. Then add the example as an SDK extension by following these steps:

Select Studio->Preferences:

In Preferences choose SDK, then choose Add Extension:

Browse to the directory that was cloned locally. Be sure to select the parent directory, which is called “machine_learning_applications”.

After clicking OK you will be asked if you trust this directory – click on Trust.

You will see the machine learning extension added to the chosen version of the GSDK (use GSDK version 4.1.1 or greater):

Now you should be able to pick the Sensory Wake Word example in the Studio Launcher. After picking your target device (xG24 Dev Kit), go to the Example Projects tab and choose Machine Learning in the Capability filter; you will see the Sensory example and demo.

Choose Create on the Sensory Wakeup Word example to set up the project into which you’ll integrate the wake word library you created in the steps above.

In the Simplicity IDE Project Explorer you will see the model include directory. Create a subdirectory for the new model you created in VoiceHub.

In this example I created a subdirectory called my_wakewords:

Follow the README section Manually adding your model: copy & paste the …net.c from the VoiceHub model into net.h in your Studio project’s new include/model subdirectory, and similarly the …search.h and …search.c from the VoiceHub model into search.h in the same subdirectory.

Make sure you change const unsigned short to const unsigned short __ALIGNED(4) in your new search.h and net.h files. This ensures the models are aligned on 4-byte boundaries in memory, as shown in the sketch below.
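
For example (with a made-up array name; your VoiceHub-generated files define their own identifiers, and __ALIGNED() comes from CMSIS):

```c
#include "cmsis_compiler.h" /* defines the __ALIGNED() macro */

/* Originally: const unsigned short gs_example_net[] = { ... };
 * After the edit, the model data starts on a 4-byte boundary: */
const unsigned short __ALIGNED(4) gs_example_net[] = {
  0x0001, 0x0002 /* ...model data continues... */
};
```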

The final step to integrate your model into the example is described in the README section Model selection: add the paths to the net.h and search.h in your new include/model subdirectory.
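
Assuming the my_wakewords subdirectory from earlier, the result looks something like this; check the README’s Model selection section for the exact file to edit, since the include location below is my guess:

```c
/* Point the example at the new model files ("my_wakewords" is the
 * subdirectory created earlier; the enclosing file is per the README). */
#include "model/my_wakewords/net.h"
#include "model/my_wakewords/search.h"
```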

The application is now ready to be recompiled and flashed onto the xG24 Dev Kit. I was able to run the example and use my wake words to control the LEDs.

If you’ve completed all these steps successfully, you have experienced the power of machine learning without needing to understand how to implement machine learning – that’s the intention of our ML Solutions partners like Sensory.

If you’d like to learn about other tools we offer for the machine learning developer, please start at silabs.com/ai-ml.
