Hi @zenyagami, and thanks for posting!
There are quite a few topics to cover here, so I’ll separate them into sections:
Changing ASR providers
The document you linked explains the credentials required for each provider, but setting the credentials themselves doesn’t change your provider from the default. This could be clearer in the docs, so thanks for bringing it to our attention.
The easiest way to change providers is to use one of our preconfigured profiles: in this case, TFWakewordGoogleASR if you're using a wake word, or a different *GoogleASR profile if you're not. Profiles are explained a bit in the speech pipeline documentation.
To use a profile in the Spokestack builder, you'll want to keep your existing credentials/locale config and add the profile to your builder's call chain.
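As a rough sketch, adding a profile to the pipeline builder might look like the following. This assumes Spokestack Android's `SpeechPipeline.Builder` and its `useProfile()` method; keep whatever credential/locale properties you already have in the chain:

```java
// Sketch only, not a drop-in snippet: useProfile() takes the fully qualified
// class name of the profile, and your existing setProperty() calls stay as-is.
SpeechPipeline pipeline = new SpeechPipeline.Builder()
        .useProfile("io.spokestack.spokestack.profile.TFWakewordGoogleASR")
        // .setProperty("google-credentials", ...) etc., as you have them now
        .setAndroidContext(getApplicationContext())
        .build();
```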
I’m not sure which models you’re talking about in this section, but the same advice applies to both wake word and NLU: you can either distribute the models directly with your app, in which case the app has to decompress them to an external directory the first time it launches, or have the app download the models on first launch. Either choice is a one-time operation; once the models are in place, they can be accessed from the same filesystem path on every startup.
Note, however, that if you store them in the app’s cache directory as we mention in some of our guides, the user can clear the cache at will, forcing you to re-download/re-decompress, so you’ll need to check for the files’ existence before you use them.
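That existence check can be as simple as the following hypothetical helper (the file names and class are placeholders, not part of Spokestack):

```java
import java.io.File;
import java.util.List;

// Hypothetical helper (not part of Spokestack): returns true only if every
// required model file already exists under modelDir, so the app knows whether
// it still needs to download or decompress anything before starting the pipeline.
public class ModelFiles {
    public static boolean modelsReady(File modelDir, List<String> requiredFiles) {
        for (String name : requiredFiles) {
            if (!new File(modelDir, name).exists()) {
                return false;
            }
        }
        return true;
    }
}
```

On startup, you'd only kick off the download/decompress step when this returns false.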
Cloud NLU and changing models
Spokestack does have a cloud NLU service that runs inference without storing models locally, but we don’t currently include a component for it in our mobile libraries; models run entirely on-device, as you’ve mentioned.
If you change your model, you’ll need your app to re-download it. How this is managed on the app side is up to you, though; it doesn’t necessarily require releasing a new version if you have another mechanism for periodically checking for new models and downloading them automatically.
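One simple mechanism for that is to store a version string alongside the local model and compare it against one reported by a server you control. This is a hypothetical sketch, not a Spokestack API; fetching the remote version and the model itself is left to your own networking code:

```java
// Hypothetical update check: re-download the model only when the version
// saved with the local copy differs from what the server reports.
public class ModelUpdates {
    public static boolean needsUpdate(String localVersion, String remoteVersion) {
        return localVersion == null || !localVersion.equals(remoteVersion);
    }
}
```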
> and what is custom dictionary and pronunciation?
I’m not sure what this means in the context of NLU; could you give a little more information? Are you talking about TTS here? If so, see our TTS documentation—our mobile libraries don’t currently expose a custom dictionary feature, but you can customize pronunciation by using SSML or Speech Markdown as described there.
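For reference, a pronunciation override in SSML typically uses the standard phoneme element; this assumes the voice you're using accepts IPA input, as described in our TTS docs:

```xml
<speak>
  You say <phoneme alphabet="ipa" ph="təˈmeɪtoʊ">tomato</phoneme>.
</speak>
```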
The way NLU works in general makes configuring it almost as much art as science. It’s hard to diagnose without a little more information about your configuration and the sample utterances you’re trying, but a phrase starting with “what is” does seem likely to resemble one or two intents in a lot of common configurations, so the confidence scores, while perhaps disappointing, don’t surprise me much.
It sounds like you want to include what’s often known as a “fallback intent” in your configuration, something that I don’t believe Rasa supports by default, but which you can create on your own and include in your YAML. A fallback intent contains phrases that have nothing to do with your application’s domain, so creating a good one really depends on the other intents you have in your configuration and how strict you want to be with matching them.
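As a sketch of what such an intent might look like (assuming Rasa 2.x-style training-data YAML; the intent name and phrases are just illustrations, and good out-of-domain phrases depend on your other intents):

```yaml
nlu:
- intent: out_of_scope
  examples: |
    - what's the meaning of life
    - order me a pizza
    - tell me a joke
```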
Other strategies for deciding that you don’t have a good match and should return an error response include rejecting any result whose confidence score is under a certain threshold, rejecting results where the top two intents have very similar confidence scores (though using the confidence score doesn’t sound like an option for your case above), or rejecting NLU results that are missing a value for an important slot.
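Those strategies combine into a simple post-processing check like the following. Everything here is illustrative (the class, parameter names, and thresholds), not a Spokestack API:

```java
import java.util.List;
import java.util.Map;

// Hypothetical post-processing of an NLU result: fall back when confidence is
// low, when the top two intents are nearly tied, or when a required slot has
// no value. Thresholds would be tuned per application.
public class FallbackCheck {
    public static boolean shouldFallBack(
            double topConfidence,
            double runnerUpConfidence,
            Map<String, String> slots,
            List<String> requiredSlots,
            double minConfidence,
            double minMargin) {
        if (topConfidence < minConfidence) return true;
        if (topConfidence - runnerUpConfidence < minMargin) return true;
        for (String slot : requiredSlots) {
            if (slots.get(slot) == null) return true;
        }
        return false;
    }
}
```

You'd run this on each classification result and route to your error response whenever it returns true.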
Hope some of that helps! Feel free to follow up if you have any other questions.