The KLearnNotes2 Handbook
Voice input - talk to your computer! :)
General suggestions
If you notice that recorded sounds are unnaturally quiet, try repeating the mic configuration, but this time speak a little quieter at the "Mic level" step (step 2). In general: if the mic input is too low (and recordings are too quiet), speak quieter at this step; if, on the other hand, the mic level is too high (and recordings "go over the range" and are distorted), speak louder at this step.
For me it worked best when I kept the microphone a bit farther from my mouth during mic calibration than during actual recording/recognition.
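If you want to check a recording yourself, a rough sketch like the one below may help decide which of the two cases above applies. It assumes the recording is available as a list of signed 16-bit sample values; the function name `level_report` and both thresholds are made up for this illustration and are not taken from KLearnNotes2.

```python
def level_report(samples, full_scale=32767, quiet_ratio=0.1, clip_ratio=0.99):
    """Classify a recording by its peak level (all thresholds are illustrative guesses)."""
    if not samples:
        return "too quiet"          # nothing was recorded at all
    peak = max(abs(s) for s in samples)
    if peak >= clip_ratio * full_scale:
        return "clipped"            # "goes over the range": speak louder at the "Mic level" step
    if peak <= quiet_ratio * full_scale:
        return "too quiet"          # speak quieter at the "Mic level" step
    return "ok"


print(level_report([0, 1200, -900]))      # peak far below full scale
print(level_report([0, 32767, -20000]))   # peak at full scale
```

Looking only at the peak is the simplest possible check; a real level meter would probably average over time as well.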
The first vowel (a, e, u, i/ee) in the phrase is the most important: it triggers (starts) voice recognition. It has to be pronounced very clearly (strongly and a bit longer). The rest of the phrase can be pronounced quite naturally.
Most important: if you had problems recording a sample (i.e. auto-recording didn't start, or frequently cut off part of the phrase), you will most probably have problems with this phrase during recognition. If recording works fine for other phrases, I suggest that you change what you say (i.e. how you are going to "click" the command with your voice). "QUIT" doesn't record? Say "quit program" instead. Don't overdo it, though. You will have to remember these expressions in order to use them, so recording "Please, quit this program immediately" is not advised: in a week you simply will not remember how to activate "quit" with your voice.
Anyway - experiment. You may edit an old voice model at any time - it's very easy to re-record a better phrase for the most error-prone command.
With all the suggestions below: don't overdo it! Sure, make it easier for your computer to recognize your speech, but don't make it too artificial or too hard/inconvenient for you!
Generally, the shorter the sound, the worse it will be recognized. Therefore note names are usually the worst. :( In particular, some pairs of letter names are easily mistaken for each other.
In English notation and the English language these are B and D. They differ only in their very short beginnings. Therefore you should be very careful and pronounce these beginnings over-explicitly ("BBBBee" with lips closed for a long time before actually making the sound, "DDDDee" with your tongue on your palate/upper teeth).
In German notation there may be a problem with the A-H pair. There is certainly no problem if you are an English speaker ("ai" and "aidg" are different enough), but in Polish, for example, these are pronounced "aa" - "haa". The first time I recorded them, they were recognized awfully wrong (see the hissing section below). I deleted all the samples, carefully recorded them again, and surprisingly both H and A are recognized perfectly now (without even using any of the brutal solutions listed below). :)
There is a special problem with hissing sounds. They are recorded/recognized fine once recognition starts. But they don't cause recognition to start (they don't "carry enough sound").
A good example is the note "F". Even if you say "F" ("aephphph") with a long hissing sound at the end, it will not be recognized as a human sound at all (and recording will not start) if the "ae" at the beginning is too short. Therefore, you should say "F" with a long "ae" in front (i.e. "aaaff", rather than "af" or "affff").
Unfortunately the same may happen to one-vowel words like "STOP" :(. Just try making the "O" a bit longer if your recording doesn't start. Anyway, the "ST" at the beginning may not be recognized well either. It is better to record "test-stop" rather than "stop-test" or "stop".
Record "yes button" rather than "yes" ("ae" may be too short to trigger recording) or "button yes" ("but" may not be noticed if pronounced too briefly; you may hear "tonyes" :().
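To see why a long hiss does not help but a longer vowel does, here is a toy model of a loudness-based recording trigger. This is only an assumption about how such a trigger could work, not a description of what KLearnNotes2 actually does: recording starts once a few consecutive frames are loud enough. The names `onset_frame` and `synth`, the frame length and the threshold are all invented for the example.

```python
import math


def rms(frame):
    """Root-mean-square loudness of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def onset_frame(samples, frame_len=100, threshold=0.2, min_frames=3):
    """Index of the first frame in a run of `min_frames` loud frames, or None."""
    run = 0
    for i in range(len(samples) // frame_len):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        if rms(frame) >= threshold:
            run += 1
            if run == min_frames:
                return i - min_frames + 1
        else:
            run = 0                 # a quiet frame breaks the run
    return None


def synth(vowel_frames, hiss_frames, frame_len=100):
    """Toy signal: a loud 'vowel' (amplitude 0.5) followed by a quiet 'hiss' (0.05)."""
    vowel = [0.5 if k % 2 == 0 else -0.5 for k in range(vowel_frames * frame_len)]
    hiss = [0.05 if k % 2 == 0 else -0.05 for k in range(hiss_frames * frame_len)]
    return vowel + hiss


print(onset_frame(synth(1, 8)))   # "affff": short vowel, long hiss
print(onset_frame(synth(4, 2)))   # "aaaff": long vowel, short hiss
```

In this model the hiss never reaches the threshold however long it lasts, so only the length of the loud vowel decides whether recording starts.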
Note that if the beginning or ending is cut off in all the samples, it will probably be cut off during recognition too. The phrase should therefore still be recognized fine, provided the cut-off phrase is not too similar to some other phrase (e.g. cutting off the beginning of the sound "B" ("beee") is not acceptable, because the result will probably be mistaken for "E").
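The "B" versus "E" case can be checked mechanically on the written-out pronunciations. The sketch below is only an illustration on spellings (real recordings are compared as sound, not as letters), and the helper name `cutoff_collisions` is hypothetical.

```python
def cutoff_collisions(phrases, cut=1):
    """Pairs (name, other) where dropping the first `cut` letters of name's
    spelling gives exactly the spelling of other."""
    collisions = []
    for name, spelling in phrases.items():
        clipped = spelling[cut:]
        for other, other_spelling in phrases.items():
            if other != name and clipped == other_spelling:
                collisions.append((name, other))
    return collisions


# Spellings as used in this handbook; a cut-off "beee" is just "eee".
print(cutoff_collisions({"B": "beee", "C": "see", "E": "eee"}))
```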
If the main problem is that recording does not start, in most cases it is enough just to pronounce the note names longer or to make the first vowel of a command longer. Try this first before following the suggestions below.
But if your language is full of hissing sounds, or your mic/sound card is of poor quality, you may need one of the not-so-convenient solutions listed below.
Pronounce each note name twice during recording. You then click on note C with the sound ->"see-see" ;), on B with the sound ->"bee-bee" etc.
Saying it three times may work even better, but is less convenient.
If the program doesn't tell the difference between vowels and consonants, you may try to pronounce vowels naturally and consonants as in BW1. You would then say:
E -> eee
A -> aei
but
C -> see-see
B -> bee-bee
D -> dee-dee
F -> aeff-aeff
G -> gee-gee
H -> aidg-aidg
If recording simply doesn't start (in English this most frequently happens with a quickly pronounced note "F"), you may add some phrase in front of each name, e.g. "note":
E -> "note eee"
C -> "note see"
D -> "note dee"
F -> "note aefff"
You may use Greek names for the letters. These are much more diverse than the English note names:
A -> "alpha"
B -> "beta"
C -> "chi" (I know, it should be "gamma", but "chi" sounds more like "C", and I prefer using "gamma" for "G")
D -> "delta"
E -> "epsilon"
F -> "phi"
G -> "gamma" (hmm, is there any other natural suggestion?)