Voice input - (un?)Common problems

General suggestions

With all the suggestions below: don't overdo it! Sure, make it easier for your computer to recognize your speech, but don't make it too artificial or too hard/inconvenient for you!

dangerous pairs

Generally, the shorter the sound, the worse it is recognized. Note names are therefore usually the worst. :( On top of that, certain pairs of letter names are especially likely to be confused with each other.

In English notation, spoken in English, this pair is B and D. They differ only in a very short beginning. You should therefore pronounce these beginnings carefully and over-explicitly ("BBBBee" with your lips held closed for a while before actually making the sound, "DDDDee" with your tongue on your palate/upper teeth).

In German notation there may be a problem with the A-H pair. There is certainly no problem if you are an English speaker ("ai" and "aidg" are different enough), but in Polish, for example, these are pronounced "aa" and "haa". The first time I recorded them, they were recognized awfully badly (see the hissing sounds section below). I deleted all the samples, carefully recorded them again, and surprisingly both H and A are now recognized perfectly (without even using any of the brutal solutions listed below). :)

hissing sounds

There is a special problem with hissing sounds. They are recorded and recognized fine once recognition starts, but they don't cause recognition to start (they don't "carry enough sound").

A good example is the note "F". Even if you say "F" ("aephphph") with a long hissing sound at the end, it will not be recognized as a human sound at all (and recording will not start) if the "ae" at the beginning is too short. Therefore, you should say "F" with a long "ae" in front (i.e. "aaaff" rather than "af" or "affff").
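One way to picture why a hiss does not start recording: assume (this is a guess, not the program's documented behaviour) that recording is triggered only after a few consecutive audio frames are louder than some threshold. The Python sketch below uses made-up per-frame loudness values; THRESHOLD, MIN_LOUD_FRAMES, VOWEL and HISS are all hypothetical numbers chosen for illustration.

```python
# Hypothetical model of a loudness-triggered recorder.
# This is an assumption for illustration, not the program's real algorithm.
THRESHOLD = 0.5        # assumed minimum loudness of a "voiced" frame
MIN_LOUD_FRAMES = 3    # assumed number of consecutive loud frames needed


def recording_starts(frames):
    """Return True if enough consecutive frames are louder than THRESHOLD."""
    run = 0
    for loudness in frames:
        run = run + 1 if loudness > THRESHOLD else 0
        if run >= MIN_LOUD_FRAMES:
            return True
    return False


VOWEL, HISS = 0.8, 0.2  # made-up loudness of an "ae" frame and an "f" frame

short_ae_long_hiss = [VOWEL] * 1 + [HISS] * 8   # roughly "affff"
long_ae_short_hiss = [VOWEL] * 4 + [HISS] * 2   # roughly "aaaff"

print(recording_starts(short_ae_long_hiss))  # False
print(recording_starts(long_ae_short_hiss))  # True
```

Under these assumptions, making the hiss longer never helps, because no hiss frame ever crosses the threshold; only a longer vowel does.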

Unfortunately the same may happen to one-vowel words like "STOP" :(. Just try making the "O" a bit longer if your recording doesn't start. The "ST" at the beginning may not be recognized well either, so it is better to record "test-stop" rather than "stop-test" or just "stop".

Record "yes button" rather than "yes" (the "ae" may be too short to trigger recording) or "button yes" (the "but" may go unnoticed if pronounced too briefly, and you may hear "tonyes" :().

Note that if the beginning or ending is cut off in all the samples, it will probably be cut off during recognition too. The phrase should therefore still be recognized fine, provided the cut-off version is not too similar to some other phrase (e.g. cutting off the beginning of the sound "B" ("beee") is not acceptable, because the result will probably be mistaken for "E").
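To see which note names become risky when their beginning is lost, here is a small Python sketch. The spellings in SPOKEN are rough letter-based stand-ins for sounds (an assumption, real recognition works on audio, not letters), and clipped_collisions is a hypothetical helper written only for this illustration.

```python
# Rough letter spellings standing in for sounds (illustrative only).
SPOKEN = {"A": "ay", "B": "bee", "C": "see", "D": "dee",
          "E": "ee", "F": "eff", "G": "gee"}


def clipped_collisions(spoken, cut=1):
    """List (clipped_note, mistaken_for) pairs after dropping `cut` leading letters."""
    collisions = []
    for name, sound in spoken.items():
        clipped = sound[cut:]
        for other, other_sound in spoken.items():
            if other != name and clipped == other_sound:
                collisions.append((name, other))
    return collisions


print(clipped_collisions(SPOKEN))
# [('B', 'E'), ('C', 'E'), ('D', 'E'), ('G', 'E')]
```

In this toy model every "-ee" note collapses into "E" once its first sound is lost, while "A" and "F" stay distinct.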

brutal solutions

If the main problem is that recording does not start, in most cases it is enough to pronounce the note names longer or to make the first vowel of a command longer. Try this first before following the suggestions below.

But if your language is full of hissing sounds, or your mic/sound card is of poor quality, you may need one of the not-so-convenient solutions listed below.

brutal way 1: pronounce multiple times

Pronounce each note name twice during recording. You then click on note C with the sound ->"see-see" ;), on B with the sound ->"bee-bee", etc.

Saying it three times may work even better, but is less convenient.

brutal way 2: pronounce consonants multiple times

If the program doesn't tell the difference between vowels and consonants, you may try pronouncing the vowels naturally and the consonants as in brutal way 1. You would then say:


E -> eee
A -> aei
but
C -> see-see
B -> bee-bee
D -> dee-dee
F -> aeff-aeff
G -> gee-gee
H -> aidg-aidg

Since there are only 7 note names, this is not as inconvenient as it sounds. F and H are very different from the others, so it would be OK to pronounce them once, but that makes the rule more complicated, and in practice I found it simpler to pronounce all the consonants twice (it doesn't hurt).
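
The rule from the table above fits in a few lines of Python. The spellings in NATURAL are copied from that table; VOWEL_NOTES and phrase_to_record are hypothetical names used only for this sketch.

```python
# Spellings copied from the table above; the helper itself is only a sketch.
NATURAL = {"A": "aei", "B": "bee", "C": "see", "D": "dee",
           "E": "eee", "F": "aeff", "G": "gee", "H": "aidg"}
VOWEL_NOTES = {"A", "E"}  # note letters that are vowels: say them once


def phrase_to_record(note):
    """Vowel notes are said once, consonant notes twice (brutal way 2)."""
    sound = NATURAL[note]
    return sound if note in VOWEL_NOTES else f"{sound}-{sound}"


for note in "EACBDFGH":
    print(note, "->", phrase_to_record(note))
```

Keeping the rule as "every consonant twice" means there are no per-note exceptions to remember, which matches the simplification described above.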

brutal way 3: extra words

If recording simply doesn't start (in English this happens most frequently with a quickly pronounced note "F"), you may add some word or phrase in front of each name, e.g. "note":


E -> "note eee"
C -> "note see"
D -> "note dee"
F -> "note aefff"
etc.

Just be careful to say the whole thing without a long pause, so that it is treated as one phrase.
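
Why the pause matters can be sketched as follows, assuming (again, this is a guess, not the program's documented behaviour) that a phrase ends once silence lasts longer than some limit. The function count_phrases, its parameters silence_level and max_pause, and the frame values are all hypothetical.

```python
# Hypothetical model: a phrase ends after more than `max_pause` quiet frames.
# An assumption for illustration, not the program's real algorithm.
def count_phrases(frames, silence_level=0.1, max_pause=3):
    """Count how many separate phrases a frame sequence would be split into."""
    phrases = 0
    in_phrase = False
    quiet = 0
    for loudness in frames:
        if loudness > silence_level:
            if not in_phrase:
                phrases += 1
                in_phrase = True
            quiet = 0
        elif in_phrase:
            quiet += 1
            if quiet > max_pause:
                in_phrase = False
    return phrases


WORD = [0.8] * 4                        # made-up loudness of one spoken word
short_pause = WORD + [0.0] * 2 + WORD   # "note eee" said fluently
long_pause = WORD + [0.0] * 6 + WORD    # "note ... eee" with a long gap

print(count_phrases(short_pause))  # 1
print(count_phrases(long_pause))   # 2
```

With the long gap, the prefix and the note name arrive as two phrases, so the prefix no longer helps the note name get recorded.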

brutal way 4: Greek letters

You may use the Greek names of the letters. These are much more diverse than the English note names:


A -> "alpha"
B -> "beta"
C -> "chi" (I know, it should be "gamma", but "chi" sounds more like "C",
                and I prefer using "gamma" for "G")
D -> "delta"
E -> "epsilon"
F -> "phi"
G -> "gamma" (hmm, is there any other natural suggestion?)

Sounds strange? REMEMBER: it is entirely up to you how you "click with your voice"!