Last updated: 2024-12-01 (Sun) 08:00:54
Julius
julius

speech recognition engine
Notes

-smpFreq
help (4.6)

Julius rev.4.6 - based on JuliusLib rev.4.6 (fast)

Engine specification:
 -  Base setup   : fast
 -  Supported LM : DFA, N-gram, Word
 -  Extension    : LibSndFile
 -  Compiled by  : gcc -g -O2 -fPIC

Options:

--- Global Options -----------------------------------------------

 Feature Vector Input:
    [-input devname]       input source  (default = htkparam)
         htkparam/mfcfile  feature vectors in HTK parameter file format
         outprob           outprob vectors in HTK parameter file format
         vecnet            receive vectors from client (TCP/IP)
    [-filelist file]    filename of input file list

 Speech Input:
    (Can extract MFCC/FBANK/MELSPEC features from waveform)
    [-input devname]    input source  (default = htkparam)
         file/rawfile      waveform file (RAW(BE),WAV,AU,SND,NIST,ADPCM and more)
         mic               default microphone device
         alsa              use ALSA interface
         adinnet           adinnet client (TCP/IP)
         stdin             standard input
    [-filelist file]    filename of input file list
    [-adport portnum]   adinnet port number to listen         (5530)
    [-48]               enable 48kHz sampling with internal down sampler (OFF)
    [-zmean/-nozmean]   enable/disable DC offset removal      (OFF)
    [-lvscale]          input level scaling factor (1.0: OFF) (1.0)
    [-nostrip]          disable stripping off zero samples
    [-record dir]       record triggered speech data to dir
    [-rejectshort msec] reject an input shorter than specified
    [-rejectlong msec]  reject an input longer than specified

 Speech Detection: (default: on=mic/net off=files)
    [-cutsilence]       turn on (force) skipping long silence
    [-nocutsilence]     turn off (force) skipping long silence
    [-lv unsignedshort] input level threshold (0-32767)       (2000)
    [-zc zerocrossnum]  zerocross num threshold per sec.      (60)
    [-headmargin msec]  header margin length in msec.         (300)
    [-tailmargin msec]  tail margin length in msec.           (400)
    [-chunksize sample] unit length for processing            (1000)
    [-fvad]             FVAD sw (-1=off, 0-3=on / degree      (-1)
    [-fvad_param i f]   FVAD parameter (dur/thres)            (5 0.50)

 GMM utterance verification:
    -gmm filename       GMM definition file
    -gmmnum num         GMM Gaussian pruning num              (10)
    -gmmreject string   comma-separated list of noise model name to reject

 On-the-fly Decoding: (default: on=mic/net off=files)
    [-realtime]         turn on, input streamed with MAP-CMN
    [-norealtime]       turn off, input buffered with sentence CMN

 Others:
    [-C jconffile]      load options from jconf file
    [-quiet]            reduce output to only word string
    [-demo]             equal to "-quiet -progout"
    [-debug]            (for debug) dump numerous log
    [-callbackdebug]    (for debug) output message per callback
    [-check (wchmm|trellis)] (for debug) check internal structure
    [-check triphone]   triphone mapping check
    [-outprobout file]  Output state probabilities to file
    [-setting]          print engine configuration and exit
    [-help]             print this message and exit

--- Instance Declarations ----------------------------------------

    [-AM]               start a new acoustic model instance
    [-LM]               start a new language model instance
    [-SR]               start a new recognizer (search) instance
    [-AM_GMM]           start an AM feature instance for GMM
    [-GLOBAL]           start a global section
    [-nosectioncheck]   disable option location check

--- Acoustic Model Options (-AM) ---------------------------------

 Acoustic analysis:
    [-htkconf file]     load parameters from the HTK Config file
    [-smpFreq freq]     sample period (Hz)                    (16000)
    [-smpPeriod period] sample period (100ns)                 (625)
    [-fsize sample]     window size (sample)                  (400)
    [-fshift sample]    frame shift (sample)                  (160)
    [-preemph]          pre-emphasis coef.                    (0.97)
    [-fbank]            number of filterbank channels         (24)
    [-ceplif]           cepstral liftering coef.              (22)
    [-rawe] [-norawe]   toggle using raw energy               (no)
    [-enormal] [-noenormal] toggle normalizing log energy     (no)
    [-escale]           scaling log energy for enormal        (1.0)
    [-silfloor]         energy silence floor in dB            (50.0)
    [-delwin frame]     delta windows length (frame)          (2)
    [-accwin frame]     accel windows length (frame)          (2)
    [-hifreq freq]      freq. of upper band limit, off if <0  (-1)
    [-lofreq freq]      freq. of lower band limit, off if <0  (-1)
    [-sscalc]           do spectral subtraction (file input only)
    [-sscalclen msec]   length of head silence for SS (msec)  (300)
    [-ssload filename]  load constant noise spectrum from file for SS
    [-ssalpha value]    alpha coef. for SS                    (2.000000)
    [-ssfloor value]    spectral floor for SS                 (0.500000)
    [-zmeanframe/-nozmeanframe] frame-wise DC removal like HTK(OFF)
    [-usepower/-nousepower] use power in fbank analysis       (OFF)
    [-cmnload file]     load initial CMN/CVN param from file on startup
    [-cmnsave file]     save CMN/CVN param to file after each input
    [-cmnstatic]        no MAP, use static CMN/CVN (use with -cmnload)
    [-cvnstatic]        use static CVN only (use with -cmnload)
    [-cmnnoupdate]      not update initial param while recog. (use with -cmnload)
    [-cmnmapweight]     weight value of initial cm for MAP-CMN (100.00)
    [-cvn]              cepstral variance normalisation       (on)
    [-vtln alpha lowcut hicut] enable VTLN (1.0 to disable)   (1.000000)

 Acoustic Model:
    -h hmmdefsfile      HMM definition file name
    [-hlist HMMlistfile] HMMlist filename (must for triphone model)
    [-dnnconf file]     DNN configuration file
    [-iwcd1 methodname] switch IWCD triphone handling on 1st pass
             best N     use N best score (default of n-gram, N=3)
             max        use maximum score
             avg        use average score (default of dfa)
    [-force_ccd]        force to handle IWCD
    [-no_ccd]           don't handle IWCD
    [-notypecheck]      don't check input parameter type
    [-spmodel HMMname]  name of short pause model             ("sp")
    [-multipath]        switch decoding for multi-path HMM    (auto)

 Acoustic Model Computation Method:
    [-gprune methodname] select Gaussian pruning method:
             safe          safe pruning
             heuristic     heuristic pruning
             beam          beam pruning (default for TM/PTM)
             none          no pruning (default for non tmix models)
    [-tmix gaussnum]    Gaussian num threshold per mixture for pruning (2)
    [-gshmm hmmdefs]    monophone hmmdefs for GS
    [-gsnum N]          N-best state will be selected        (24)

--- Language Model Options (-LM) ---------------------------------

 N-gram:
    -d file.bingram     n-gram file in Julius binary format
    -nlr file.arpa      forward n-gram file in ARPA format
    -nrl file.arpa      backward n-gram file in ARPA format
    [-lmp float float]  weight and penalty (tri: 8.0 -2.0 mono: 5.0 -1)
    [-lmp2 float float]       for 2nd pass (tri: 8.0 -2.0 mono: 6.0 0)
    [-transp float]     penalty for transparent word (+0.0)

 DFA Grammar:
    -dfa file.dfa       DFA grammar file
    -gram file[,file2...] (list of) grammar prefix(es)
    -gramlist filename  filename of grammar list
    [-penalty1 float]   word insertion penalty (1st pass)     (0.0)
    [-penalty2 float]   word insertion penalty (2nd pass)     (0.0)

 Word Dictionary for N-gram and DFA:
    -v dictfile         dictionary file name
    [-silhead wordname] (n-gram) beginning-of-sentence word   (<s>)
    [-siltail wordname] (n-gram) end-of-sentence word         (</s>)
    [-mapunk wordname]  (n-gram) map unknown words to this    (<unk>)
    [-forcedict]        ignore error entry and keep running
    [-iwspword]         (n-gram) add short-pause word for inter-word CD sp
    [-iwspentry entry]  (n-gram) word entry for "-iwspword" (<UNK> [sp] sp sp)
    [-adddict dictfile] (n-gram) load extra dictionary
    [-addentry entry]   (n-gram) load extra word entry

 Isolated Word Recognition:
    -w file[,file2...]  (list of) wordlist file name(s)
    -wlist filename     file that contains list of wordlists
    -wsil head tail sp  name of silence/pause model
                          head - BOS silence model name       (silB)
                          tail - EOS silence model name       (silE)
                           sp  - their name as context or "NULL" (NULL)

--- Recognizer / Search Options (-SR) ----------------------------

 Search Parameters for the First Pass:
    [-b beamwidth]      beam width (by state num)             (guessed)
                        (0: full search, -1: force guess)
    [-bs score_width]   beam width (by score offset)          (disabled)
                        (-1: disable)
    [-sepnum wordnum]   (n-gram) # of hi-freq word isolated from tree (150)
    [-1pass]            do 1st pass only, omit 2nd pass
    [-inactive]         recognition process not active on startup

 Search Parameters for the Second Pass:
    [-b2 hyponum]       word envelope beam width (by hypo num) (30)
    [-n N]              # of sentence to find                 (1)
    [-output N]         # of sentence to output               (1)
    [-sb score]         score beam threshold (by score)       (80.0)
    [-s hyponum]        global stack size of hypotheses       (500)
    [-m hyponum]        hypotheses overflow threshold num     (2000)
    [-lookuprange N]    frame lookup range in word expansion  (5)
    [-looktrellis]      (dfa) expand only backtrellis words
    [-[no]multigramout] (dfa) output per-grammar results
    [-oldtree]          (dfa) use old build_wchmm()
    [-oldiwcd]          (dfa) use full lcdset
    [-iwsp]             insert sp for all word end (multipath)(off)
    [-iwsppenalty]      trans. penalty for iwsp (multipath)   (-1.0)

 Short-pause Segmentation:
    [-spsegment]        enable short-pause segmentation
    [-spdur]            length threshold of sp frames         (10)
    [-pausemodels str]  comma-delimited list of pause models for segment

 Graph Output with graph-oriented search:
    [-lattice]          enable word graph (lattice) output
    [-confnet]          enable confusion network output
    [-nolattice]][-noconfnet] disable lattice / confnet output
    [-graphrange N]     merge same words in graph (0)
                        -1: not merge, leave same loc. with diff. score
                         0: merge same words at same location
                        >0: merge same words around the margin
    [-graphcut num]     graph cut depth at postprocess (-1: disable)(80)
    [-graphboundloop num] max. num of boundary adjustment loop (20)
    [-graphsearchdelay] inhibit search termination until 1st sent. found
    [-nographsearchdelay] disable it (default)

 Forced Alignment:
    [-walign]           optionally output word alignments
    [-palign]           optionally output phoneme alignments
    [-salign]           optionally output state alignments

 Minimum Bayes Risk Decoding:
    [-mbr]              enable rescoring sentence on MBR(WER)
    [-mbr_wwer]         enable rescoring sentence on MBR(WWER)
    [-nombr]            disable rescoring sentence on MBR
    [-mbr_weight float float] score and loss func. weight on MBR (0.1 1.0)

 Confidence Score:
    [-cmalpha value]    CM smoothing factor                    (0.050000)

 Message Output:
    [-fallback1pass]    use 1st pass result when search failed
    [-progout]          progressive output in 1st pass
    [-proginterval]     interval of progout in msec           (300)

-------------------------------------------------

 Additional options for application:
    [--help]    display this help
    [-help]     display this help
    [-outfile]  save result in separate .out file
    [-nolog]    not output any log
    [-logfile arg]      output log to file
    [-noxmlescape]      disable XML escape
    [-separatescore]    output AM and LM scores separately
    [-kanji arg]        convert character set for output
    [-nocharconv]       disable charconv
    [-charconv arg arg] convert character set for output
    [-outcode arg]      select info to output to the module: WLPSCwlps
    [-module (arg)]     run as a server module
    [-record arg]       record input waveform to file in dir
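
Note on the acoustic analysis defaults: the help text labels -smpFreq as "sample period (Hz)", but the value is a sampling frequency, while -smpPeriod is the corresponding period in 100 ns units. The sketch below only checks the arithmetic that links the defaults shown above (16000 Hz, 625, 400 samples, 160 samples). The function names are made up for this memo and are not part of Julius.

```python
# Relate the default acoustic-analysis values printed by "julius -help".
# Helper names here are hypothetical; they are not part of Julius or JuliusLib.

UNITS_100NS_PER_SECOND = 10_000_000  # -smpPeriod is expressed in 100 ns units


def smp_period_from_freq(freq_hz: int) -> float:
    """Convert a sampling frequency in Hz to a sample period in 100 ns units."""
    return UNITS_100NS_PER_SECOND / freq_hz


def samples_to_msec(samples: int, freq_hz: int) -> float:
    """Convert a length given in samples to milliseconds."""
    return samples * 1000.0 / freq_hz


smp_freq = 16000  # default of -smpFreq

print(smp_period_from_freq(smp_freq))  # 625.0 (default of -smpPeriod)
print(samples_to_msec(400, smp_freq))  # 25.0 ms window (default of -fsize)
print(samples_to_msec(160, smp_freq))  # 10.0 ms shift (default of -fshift)
```

So the defaults describe a 25 ms analysis window shifted every 10 ms at 16 kHz. If the input is sampled at another rate, -fsize and -fshift would presumably need to be rescaled to keep the same durations, since both are given in samples.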
Julius 4.2.2 (Ubuntu)

accept_check
adinrec
adintool
dfa_determinize
dfa_minimize
jclient
jcontrol
julius
julius-generate
julius-generate-ngram
mkbingram
mkbinhmm
mkbinhmmlist
mkdfa
mkfa
mkgshmm
mkss
nextword
yomi2voca