Legacy MOVI User Community Forum (readonly) > MOVI Question & Answers

Nested voice commands ("Do something" -> Movi: Are you sure? -> "Yes/No")

posted Apr 03, 2016 08:07:30 by dmworking247
Hi all,

As per the title, I'm trying to find a neater way to nest multiple voice prompts within a sketch. I've tried using the password sketch but I've ended up totally confusing myself over how Movi works during a loop, and when (if ever) it 'waits' for a response before proceeding.

I'm hoping to achieve something like the pseudo-code below. I think it would make a good future example sketch in the Movi library, as it would also help improve accuracy when certain commands are nested within each other.

Example:
recognizer.callSign("Movi"); // Train callsign Arduino (may take 20 seconds)
recognizer.addSentence("Make a cup of tea"); // Add sentence 1
recognizer.addSentence("make a sound"); // Add sentence 2

if (res==1)
{ // Sentence 1.
// pseudo code
Movi now asks "Are you sure"?
We poll for a new response
if (2nd response == Yes)
{
Movi asks "How many lumps of sugar"?
We poll for a new response
if (3rd response == "one")
{
say "one lump"
}
if (3rd response == "two")
{
say "two lumps"
}
}
else
{
say ("Maybe next time")
}
}

if (res==2)
{
Movi asks "How many times should I beep"?
We poll for a new response
if (2nd response == "one")
{
say "beep"
}
if (2nd response == "two")
{
say "beep beep"
}
}



As you can see I'm trying to improve the context of subsequent responses to initial commands. In this example, simple confirmations like "are you sure" Y/N and commands like "one two three" can be re-used within different primary commands.
1 reply
GeraldFriedland said Apr 03, 2016 21:43:31
Let's not discuss the password example for now. It is extra tricky because the password command is special: it is designed to keep the provided password from leaking to other shields. In any case, you are definitely right that context helps make MOVI's recognition more robust. See also this post.

Let's work from your pseudo code. First, you need to add every single word and sentence you want MOVI to recognize with addSentence(). Also, I like to use the F() macro whenever I build a system that has a lot of question/answer strings, so that the strings stay in flash memory instead of using up RAM. So you would do this:

recognizer.addSentence(F("make a cup of tea")); // Add sentence 1 
recognizer.addSentence(F("make a sound")); // Add sentence 2 
recognizer.addSentence(F("yes")); // Add sentence 3 
recognizer.addSentence(F("no")); // Add sentence 4 
recognizer.addSentence(F("one")); // Add sentence 5 
recognizer.addSentence(F("two")); // Add sentence 6 


If this becomes a lot of sentences, you can make the individual words (such as yes, no and the numbers) one sentence and work with MOVI's results on a word level. This is done in our 'Hunt the Wumpus' example. However, for simplicity, I don't do this in this forum post.
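
A minimal sketch of the word-level idea, written in plain C++ so it can be tried off the board. It assumes the short answers were trained as one sentence (for example "yes no one two") and that the recognized words arrive as one space-separated string. On the shield that string would come from the MOVI library (getResult() is the likely call, but treat that as an assumption and check the API reference). The names VOCAB, wordToId and wordsToIds are made up for this illustration and are not part of the library.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical vocabulary: the short answers that were trained together
// as a single sentence and are told apart word by word afterwards.
static const char* const VOCAB[] = {"YES", "NO", "ONE", "TWO"};
static const int VOCAB_SIZE = sizeof(VOCAB) / sizeof(VOCAB[0]);

// Returns the index of `word` in VOCAB (case-insensitive), or -1 if unknown.
int wordToId(std::string word) {
    for (char& ch : word) {
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
    for (int i = 0; i < VOCAB_SIZE; ++i) {
        if (word == VOCAB[i]) return i;
    }
    return -1;
}

// Splits a space-separated result string and maps every word to its id.
std::vector<int> wordsToIds(const std::string& result) {
    std::vector<int> ids;
    std::istringstream in(result);
    std::string word;
    while (in >> word) {
        ids.push_back(wordToId(word));
    }
    return ids;
}
```

With this, a dialog step can compare word ids (0 for yes, 1 for no, and so on) instead of needing one sentence number per answer. On the Arduino itself you would use the String class rather than std::string.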

So now, there are several ways to implement the nested commands. Some of them nicer, some of them uglier. I encourage the community to come up with better ways but here is what definitely works.

What I do is introduce a 'context' variable that keeps track of where we are in the dialog. The good news about this method is that it 'unnests' the questions, so each chunk of code only deals with one specific response. Also, if the dialog system becomes more complex, you can easily create methods that each deal with one question/answer pair. If it gets more and more complex, you can generalize these methods... in the end you end up with something like AIML.

Well, here we go for now:

 
// The context names carry a CTX_ prefix because the Arduino core already
// defines DEFAULT as a macro (it is used with analogReference()).
const int CTX_DEFAULT=0;
const int CTX_AREYOUSURE=1;
const int CTX_HOWMANYBEEPS=2;
const int CTX_HOWMANYSUGAR=3;
int context=CTX_DEFAULT;   // stores the context
bool okresponse=false; // this one makes things easier, see below

void loop()
{
    int result=recognizer.poll(); // See if the recognizer got something
    // The else-if chain makes sure each result is handled by exactly one
    // context, even when a branch changes the context.
    if (context==CTX_DEFAULT) { // MOVI just started
        if (result==1) {
            recognizer.ask(F("Are you sure?"));
            context=CTX_AREYOUSURE;
        }
        if (result==2) {
            recognizer.ask(F("How many times should I beep?"));
            context=CTX_HOWMANYBEEPS;
        }
        if (result>2) {
            recognizer.say(F("Commands are: Make a cup of tea and Make a sound."));
        }
    }
    else if (context==CTX_AREYOUSURE) {
        okresponse=false;
        if (result==3) { // yes
           recognizer.ask(F("How many lumps of sugar?"));
           context=CTX_HOWMANYSUGAR;
           okresponse=true;
        }
        if (result==4) { // no
           recognizer.say(F("Maybe next time"));
           okresponse=true;
           context=CTX_DEFAULT;
        }
        if ((result>0) && (okresponse==false)) { // Any other trained sentence.
           recognizer.say(F("Please respond yes or no"));
           recognizer.ask(F("Are you sure?"));
        }
    }
    else if (context==CTX_HOWMANYSUGAR) {
        okresponse=false;
        if (result==5) { // one
           recognizer.say(F("one lump"));
           context=CTX_DEFAULT;
           okresponse=true;
        }
        if (result==6) { // two
           recognizer.say(F("two lumps"));
           okresponse=true;
           context=CTX_DEFAULT;
        }
        if ((result>0) && (okresponse==false)) { // Any other trained sentence.
           recognizer.say(F("Please respond one or two"));
           recognizer.ask(F("How many lumps of sugar?"));
        }
    }
    else if (context==CTX_HOWMANYBEEPS) {
        okresponse=false;
        if (result==5) { // one
           recognizer.say(F("beep"));
           context=CTX_DEFAULT;
           okresponse=true;
        }
        if (result==6) { // two
           recognizer.say(F("beep beep"));
           okresponse=true;
           context=CTX_DEFAULT;
        }
        if ((result>0) && (okresponse==false)) { // Any other trained sentence.
           recognizer.say(F("Please respond one or two"));
           recognizer.ask(F("How many beeps?"));
        }
    }
}


Hope that helps.
[Last edited Apr 04, 2016 00:36:13]