Simple, synchronous speech recognizer, using the Nuance SREC package. Usage proceeds as follows:
Create a Recognizer.
Create a Recognizer.Grammar.
Set up the Recognizer.Grammar.
Reset the Recognizer.Grammar slots, if needed.
Fill the Recognizer.Grammar slots, if needed.
Compile the Recognizer.Grammar, if needed.
Save the filled Recognizer.Grammar, if needed.
Start the Recognizer.
Loop over advance and putAudio until recognition is complete.
Fetch and process results, or notify of failure.
Stop the Recognizer.
Destroy the Recognizer.
Below is example code:
// create and start audio input
InputStream audio = new MicrophoneInputStream(11025, 11025*5);
// create a Recognizer
String cdir = Recognizer.getConfigDir(null);
Recognizer recognizer = new Recognizer(cdir + "/baseline11k.par");
// create and load a Grammar
Recognizer.Grammar grammar = recognizer.new Grammar(cdir + "/grammars/VoiceDialer.g2g");
// setup the Grammar to work with the Recognizer
grammar.setupRecognizer();
// fill the Grammar slots with names and save, if required
grammar.resetAllSlots();
for (String name : names) grammar.addWordToSlot("@Names", name, null, 1, "V=1");
grammar.compile();
grammar.save(".../foo.g2g");
// start the Recognizer
recognizer.start();
// loop over Recognizer events
while (true) {
    switch (recognizer.advance()) {
        case Recognizer.EVENT_INCOMPLETE:
        case Recognizer.EVENT_STARTED:
        case Recognizer.EVENT_START_OF_VOICING:
        case Recognizer.EVENT_END_OF_VOICING:
            // let the Recognizer continue to run
            continue;
        case Recognizer.EVENT_RECOGNITION_RESULT:
            // success, so fetch results here!
            for (int i = 0; i < recognizer.getResultCount(); i++) {
                String result = recognizer.getResult(i, Recognizer.KEY_LITERAL);
            }
            break;
        case Recognizer.EVENT_NEED_MORE_AUDIO:
            // put more audio in the Recognizer
            recognizer.putAudio(audio);
            continue;
        default:
            notifyFailure();
            break;
    }
    break;
}
// stop the Recognizer
recognizer.stop();
// destroy the Recognizer
recognizer.destroy();
// stop the audio device
audio.close();
1. How do I build my own speech library?
2. What are the files baseline11k.par and VoiceDialer.g2g used for?
# this is the telematics grammar test, grammar is fixed
# default models
cmdline.modelfiles = models/generic11_f.swimdl models/generic11_m.swimdl
cmdline.arbfile = models/generic.swiarb
cmdline.tcp = CMDLINE.TCPFILE
cmdline.lda = models/generic11.lda
#
cmdline.modelfiles11 = models/generic11_f.swimdl models/generic11_m.swimdl
cmdline.modelfiles8 = models/generic8_f.swimdl models/generic8_m.swimdl
cmdline.lda11 = models/generic11.lda
cmdline.lda8 = models/generic8.lda
#
cmdline.vocabulary = dictionary/cmu6plus.ok.zip
#cmdline.vocabulary = dictionary/large.ok
...
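To see what the .par excerpt above actually sets, here is a minimal sketch that reads its visible "key = value" lines into a map. ParFileSketch and its parse method are hypothetical helpers written for this post, not part of SREC. The sketch only assumes what the excerpt shows: "#" starts a comment and "=" separates a key from its value. The real SREC parser may follow additional rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of SREC: reads "key = value" lines like the
// .par excerpt above and skips blank lines and "#" comments.
class ParFileSketch {

    static Map<String, String> parse(String text) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String rawLine : text.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // comment or blank line
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // not a key = value line (for example the "..." elision)
            }
            settings.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        return settings;
    }

    public static void main(String[] args) {
        String excerpt = String.join("\n",
                "# default models",
                "cmdline.modelfiles = models/generic11_f.swimdl models/generic11_m.swimdl",
                "cmdline.lda = models/generic11.lda",
                "#cmdline.vocabulary = dictionary/large.ok",
                "cmdline.vocabulary = dictionary/cmu6plus.ok.zip");
        Map<String, String> settings = parse(excerpt);
        // The model list is space-separated, so split it into individual files.
        for (String model : settings.get("cmdline.modelfiles").split("\\s+")) {
            System.out.println("model: " + model);
        }
        System.out.println("vocabulary: " + settings.get("cmdline.vocabulary"));
    }
}
```

Read this way, the excerpt shows that the .par file mainly points the recognizer at its acoustic model files (11 kHz and 8 kHz variants) and at a pronunciation dictionary.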
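The g2g file is generated with the grxmlcompile and make_g2g tools. It defines the grammar of the utterances to be recognized. Its source is a grxml file; for the format, see the SREC documentation.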
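As a rough idea of what a grxml source for a small vocabulary could look like, the sketch below generates a grammar in the generic W3C SRGS XML form with a single rule that lists the letters A to Z and the digit words zero to nine. GrxmlSketch, buildGrammar, lettersAndDigits, and the rule name "Symbol" are all hypothetical names made up for this example. Whether grxmlcompile accepts exactly this form, and whether every word is present in the pronunciation dictionary, are assumptions to check against the SREC documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator, not part of SREC: emits a generic W3C SRGS XML grammar
// whose single public rule accepts exactly one word from the given list.
class GrxmlSketch {

    static String buildGrammar(String ruleId, List<String> words) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<grammar xmlns=\"http://www.w3.org/2001/06/grammar\"")
           .append(" xml:lang=\"en-US\" version=\"1.0\" root=\"").append(ruleId).append("\">\n");
        xml.append("  <rule id=\"").append(ruleId).append("\" scope=\"public\">\n");
        xml.append("    <one-of>\n");
        for (String word : words) {
            // Words here are plain letters and digit names, so no XML escaping is done.
            xml.append("      <item>").append(word).append("</item>\n");
        }
        xml.append("    </one-of>\n");
        xml.append("  </rule>\n");
        xml.append("</grammar>\n");
        return xml.toString();
    }

    static List<String> lettersAndDigits() {
        List<String> words = new ArrayList<>();
        for (char c = 'A'; c <= 'Z'; c++) {
            words.add(String.valueOf(c));
        }
        // Digits are spelled out because the grammar lists spoken words.
        String[] digits = {"zero", "one", "two", "three", "four",
                           "five", "six", "seven", "eight", "nine"};
        for (String digit : digits) {
            words.add(digit);
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.print(buildGrammar("Symbol", lettersAndDigits()));
    }
}
```

The printed text would be saved as a grxml file and then passed through the tools named above to produce the g2g file that the Recognizer.Grammar constructor loads.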
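Thanks. The code I posted above just uses ready-made pieces. I want to recognize a few simple English letters and digits, such as "A, B, C, D ... 1, 2, 3, 4, ...". How should I do that, and what are the general steps? I would also like to build my own speech library (I am not sure that is the right term). How should I go about it, and what are the general steps?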