
SentenceSets - some observations

posted Jan 10, 2017 03:24:11 by scott
I've been having fun with MOVI for a couple of weeks and it looks like the SentenceSets sketch is the best one to use as a starting point for my project (MOVI to MQTT to OpenHAB).

I discovered a couple of things while adapting the code and thought I'd share.

1) say("ONE") and say("one") seem to be different. The all caps case is spelled out ("o-n-e") while the lower case is "one", as expected. For most words, it doesn't make a difference but ONE and TWO are spelled out. Hmmm.

2) the String returned by getResult() has a carriage return at the end of it (might be obvious to some). This is fine except if you're using the getWord() function in the sketch as the last word always includes the <CR>. It's important if you're trying to match that last word to your known words. Maybe I'm thick but it took me an hour to notice - easy enough to trim off the <CR> for sentence->word deconstruction.

3) I'm not sure how useful the 'levenshtein' routine is for any type of word correction (for me). I've found that MOVI either nails it (returns exactly the spoken words) or it returns something that is a long way off (usually extra words but sometimes just plain wrong). So the 'levenshtein' distance is either zero (match) or some other number that doesn't really help make a correction. Most of my commands will be three words ("open blind three" or "main light on") so I'll need to build in some sort of routine where MOVI will ask the user things like "open which blind?" or "what would you like switched on?" when it only understands part of what is said.
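
To illustrate point 2, here is a minimal sketch of stripping the trailing <CR> before splitting a result into words. It is written in standard C++ with std::string so it can be tried on a desktop; the function names trimTrailing and splitWords are made up for this example and are not part of the MOVI library or the SentenceSets sketch. In an Arduino sketch, calling trim() on the String returned by getResult() should do the same job as trimTrailing here.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: drop trailing carriage returns, newlines,
// spaces and tabs from a recognition result.
std::string trimTrailing(std::string s) {
    while (!s.empty() && (s.back() == '\r' || s.back() == '\n' ||
                          s.back() == ' '  || s.back() == '\t')) {
        s.pop_back();
    }
    return s;
}

// Hypothetical helper: split a sentence on spaces, skipping empty tokens.
std::vector<std::string> splitWords(const std::string& sentence) {
    std::vector<std::string> words;
    std::string current;
    for (char c : sentence) {
        if (c == ' ') {
            if (!current.empty()) {
                words.push_back(current);
                current.clear();
            }
        } else {
            current += c;
        }
    }
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}
```

With this, splitWords(trimTrailing("OPEN BLIND THREE\r")) gives three words and the last one compares equal to "THREE", which is what the word matching needs.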

I'm really enjoying the MOVI shield. I'm amazed you guys could get all this onto a single board.

GeraldFriedland said Jan 12, 2017 17:13:30

Thanks so much for your feedback!

1) This behavior is built into espeak/SVOX pico. We just delegate to them. ;-) So it might be interesting for you to ask this question on their forum.

2) The carriage return is not supposed to be there. I am very surprised. I wonder if your Arduino adds it in the serial communication because you either configured your Arduino that way or your SoftwareSerial. Also, which board are you using?

3) Levenshtein should be extremely helpful for you, especially when your commands all have the same length. Taking the trained sentence with the minimum distance to the recognized result should correct many errors. Having said that: if it's only very few sentences, it might be overkill or not do anything, because you will get either a match or no match due to the sparsity of the error space. My recommendation is to keep it in there for now. If you want something better, you might have to build a language model with a real grammar based on HMMs.
Here is a book chapter that provides the concepts of what you would have to do:

Hope that helps,
[Last edited Jan 12, 2017 17:13:47]
scott said Jan 13, 2017 05:03:41
Thanks for the link Gerald - I wanted to learn more about HMMs.

1) I'm using new firmware and new SVOX pico voice. My comment was just something people should be aware of when using all Caps vs lower case. Not sure if it applies to all three letter words that are upper case.
2) Again, just something to be aware of. I'm using an Uno and standard SoftwareSerial library with no mods. However, I am using the MICDEBUG for monitoring what comes back from the board....
3) I'll keep the Levenshtein in my toolbox. I'm still figuring out how it all works under the hood so I can follow the most robust methods. It's a very interesting research area.

GeraldFriedland said Jan 14, 2017 18:29:22
Actually now that you are mentioning it: The Levenshtein distance as implemented in the example is character by character. It might be useful for your application to change it to be word-based.

scott said Jan 15, 2017 12:19:46
Implement it with each word-to-word distance as a sum of character-to-character distances? Then sum over all words in the sentence?

I was pulling each sentence apart and comparing each word against individual training words using the Levenshtein method as per the example. So each word either matched or didn't.

I'm unsure whether using .getResult() plus enforcing my own 'rules' (such as all commands are three words) then trying to do some error correction is any better (or worse) than using .poll() with an optimized training set. Try both and see, I assume. Perhaps a bit of both.

GeraldFriedland said Jan 16, 2017 03:55:43
As you probably know, the Levenshtein Distance is also called Edit Distance. The distance is the sum of the number of insertions, the number of deletions, and the number of replacements of characters that one has to do to convert one sentence to another. What I am suggesting is: Instead of counting the insertions, deletions, and replacements of characters, you could count the insertions, deletions, and replacements of words.
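
A word-based version could look roughly like the sketch below. This is only an illustration in standard C++ (std::string and std::vector), not the code from the MOVI example; the names tokenize and wordLevenshtein are made up here. It is the usual dynamic-programming edit distance, except that the units being compared are whole words instead of characters.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split a sentence into whitespace-separated words.
// Trailing carriage returns count as whitespace and are dropped.
std::vector<std::string> tokenize(const std::string& sentence) {
    std::istringstream in(sentence);
    std::vector<std::string> words;
    std::string w;
    while (in >> w) {
        words.push_back(w);
    }
    return words;
}

// Word-level edit distance: the minimum number of word insertions,
// deletions and replacements needed to turn sentence a into sentence b.
int wordLevenshtein(const std::string& a, const std::string& b) {
    const std::vector<std::string> x = tokenize(a);
    const std::vector<std::string> y = tokenize(b);

    // prev holds the previous row of the distance table, cur the current one.
    std::vector<int> prev(y.size() + 1);
    std::vector<int> cur(y.size() + 1);
    for (std::size_t j = 0; j <= y.size(); ++j) {
        prev[j] = static_cast<int>(j);  // build y's first j words by insertion
    }

    for (std::size_t i = 1; i <= x.size(); ++i) {
        cur[0] = static_cast<int>(i);   // delete x's first i words
        for (std::size_t j = 1; j <= y.size(); ++j) {
            const int cost = (x[i - 1] == y[j - 1]) ? 0 : 1;
            cur[j] = std::min({prev[j] + 1,          // deletion
                               cur[j - 1] + 1,       // insertion
                               prev[j - 1] + cost}); // replacement or match
        }
        std::swap(prev, cur);
    }
    return prev[y.size()];
}
```

For example, wordLevenshtein("OPEN BLIND THREE", "OPEN BLIND TWO") is 1 (one replaced word), and an extra leading word as in "PLEASE OPEN BLIND THREE" also costs 1. To correct a result you would compute this against each trained sentence and take the one with the smallest distance. On a small Arduino the vectors and string streams may be too heavy, so a real sketch would likely need fixed-size arrays instead.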

So .poll() should be better and more efficient. But, like in the example, sometimes it can make sense to implement your own matcher.

Good luck and keep us posted!
[Last edited Jan 16, 2017 03:56:14]