Friday, March 22, 2013

Voice Recognition VR Module Review

A Voice Recognition (VR) kit is one product we have always wanted to have in our product line. Only a handful of specialty manufacturers build this kind of board, which makes them incredibly hard to find, not to mention the high price tag that usually comes with them.

That’s why, when our first VR module sample arrived a couple of days ago, I could not contain my excitement: I simply dropped everything else and immediately set about examining and evaluating the VR module. After a brief struggle with the manual, which tested my reading comprehension, I hooked up the VR to a gizDuino and wrote an Arduino sketch to test and demonstrate its core VR functions.

Following is a short summary of the VR module's features:

Up to 15 speaker-dependent (SD) voice commands.
These are organized as 3 groups of 5 voice commands each. Due to the limited capacity of the on-board voice controller, it can only process one group, or 5 voice commands, at a time. The host controller is responsible for loading the appropriate voice command group. Each SD voice command can be up to 1.3 seconds long.

Non-volatile voice command storage.
The VR module retains all voice command prints even after power cycling. Voice commands are replaced or erased only at the instruction of the host controller.

UART TTL interface.
This allows it to connect to any 5V MCU host that is equipped with a UART port. It can even interface directly with a 3.3V MCU host, as long as the host's input is 5V tolerant.
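To make the 3-groups-of-5 organization concrete, here is a minimal host-side sketch in Python. It assumes the host numbers its commands 0 to 14 and that groups and slots are counted from 0; the function names and this numbering are my own illustration, and the module's manual may number things differently.

```python
GROUPS = 3            # number of command groups, per the feature summary
SLOTS_PER_GROUP = 5   # commands the module can match at one time

def locate(command_id):
    """Map a host-side command number (0..14) to a (group, slot) pair.

    Both group and slot are 0-based. This numbering is an assumption
    made for illustration, not taken from the module's manual.
    """
    if not 0 <= command_id < GROUPS * SLOTS_PER_GROUP:
        raise ValueError("command_id must be in 0..14")
    return divmod(command_id, SLOTS_PER_GROUP)

def command_id_of(group, slot):
    """Inverse of locate(): turn a (group, slot) pair back into 0..14."""
    if not (0 <= group < GROUPS and 0 <= slot < SLOTS_PER_GROUP):
        raise ValueError("group must be in 0..2 and slot in 0..4")
    return group * SLOTS_PER_GROUP + slot
```

With this scheme, the host knows which group it must load before a given command can be recognized: command 7, for instance, lives in the second group (index 1), third slot (index 2).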



THE MICROPHONE

Construction. Strangely, the microphone is built with its pickup port facing sideways instead of toward the top, as one would normally expect. Making things worse, the microphone does not have enough gain to compensate for voice input coming from the wrong direction. If you fail to take this into account, you will have a mighty hard time training the VR module. Fortunately, the microphone has a marking that can be used as a guide, as shown in the photo. Always speak your commands toward this side of the microphone.

    
Sensitivity. The microphone, as already mentioned, has low sensitivity. You have to keep your mouth within an inch or two of the microphone pickup port as you speak for the VR to recognize your command. It is conceivable that the manufacturer opted for low sensitivity to keep the VR kit usable even in environments with high ambient noise. So, if you are thinking of using it in applications where you voice-command something from a distance, this might not be it. It is possible to use an electret microphone other than the one supplied with the kit. A headset with a good-quality microphone (e.g. a gaming headset) may actually work wonders for this kit.

Training Experience
One thing I immediately noticed while teaching it voice commands is that the VR is not as tolerant during training as it is during recognition mode. During training, the VR requires you to say each voice command twice. You have to pronounce both in almost exactly the same manner: same loudness, same inflection, same accent and all. Your voice also has to be at the right loudness; it will reject a voice input it perceives as too loud or too soft. In other words, to train it with your voice, you need to train yourself as well. It was kind of frustrating the first time, but once you get the hang of it, it becomes pretty easy.

During training, the on-board LED indicators guide you with visual cues that show when you can start speaking and whether or not the VR successfully processed your voice command. This, too, takes some getting used to. But what I think is severely lacking is an indication of which segment (of the 5 segments) you are currently in.

Voice Recognition Mode

The VR is more tolerant of changes in voice input during voice recognition mode. It can detect, with reasonable accuracy, voice commands of varying loudness and even varying pitch. For example, I trained the VR to recognize the word “singko”. In recognition mode, the VR correctly identified it even when I stretched it to about twice its length, varied the pitch a tad higher and lower, and shouted it (with the microphone held about 30 cm away from my mouth).

But it has trouble telling apart words that sound alike. For example, it ran into trouble when I tried training it with the letters “B” and “D”. Hence, it is best to train it with a set of words that sound very different from each other.

Conclusion

The VR board is a Speaker Dependent voice recognition board, meaning it will recognize and accept only the voice commands of the person who trained it. A Speaker Independent feature would have been nice, but that might push the price of the module beyond the affordability zone. In addition, the inability of the VR to process all 15 voice commands at the same time may be seen as somewhat of a limiting factor. But I think these will not diminish the usefulness of the VR kit by much. For example, using the VR's full 15-command capacity may need extra coding work, but it can be done and is not terribly hard.

That being said, if your intended application will not be severely handicapped by the limitations mentioned, the VR module will serve well as a cost-effective Voice Recognition add-on for your DIY electronics and robotics projects.
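As a rough idea of what that extra coding work for reaching all three groups could look like, here is a small Python sketch of one possible approach: reserve one voice command in each group as a "switch" word that tells the host to load a different group. This is only an illustration under my own assumptions, not the method from my test sketch or from the manual. The `GroupSwitcher` class and the `load_group` callback are hypothetical names; on real hardware `load_group` would send whatever command the module's manual specifies for loading a group.

```python
class GroupSwitcher:
    """Host-side bookkeeping for reaching commands in all three groups.

    load_group   -- hypothetical callback that tells the VR module to load
                    a group (0, 1 or 2); stands in for the real UART command.
    switch_slots -- dict mapping (group, slot) to the group to load when
                    that reserved "switch" command is recognized.
    """

    def __init__(self, load_group, switch_slots):
        self.load_group = load_group
        self.switch_slots = dict(switch_slots)
        self.active = 0                 # start with the first group loaded
        self.load_group(self.active)

    def on_recognized(self, slot):
        """Handle a recognized slot (0..4) of the currently active group.

        Returns (group, slot) for an ordinary command, or None when the
        slot was a reserved switch word and a new group was loaded.
        """
        target = self.switch_slots.get((self.active, slot))
        if target is None:
            return (self.active, slot)
        self.active = target
        self.load_group(target)
        return None


# Example wiring: the last slot of each group moves on to the next group.
loaded = []  # records every group the host asked the module to load
switcher = GroupSwitcher(loaded.append, {(0, 4): 1, (1, 4): 2, (2, 4): 0})
```

The trade-off is that each reserved switch word uses up one slot, so this particular wiring leaves 12 ordinary commands instead of 15. A push button or a timeout on the host could switch groups instead and keep all 15 slots for commands.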

   

1 comment:

  1. Wow another great product from E-gizmo, can't wait to use that one on my projects
