Last updated: 2014-05-06 (Tue) 04:30:23

SPTK

Speech Signal Processing Toolkit (SPTK)

Provides tools for speech analysis, speech synthesis, vector quantization, data processing, plotting, and more.

http://sp-tk.sourceforge.net/

Overview

  • The Speech Signal Processing Toolkit (SPTK) is a suite of speech signal processing tools for UNIX environments, e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, vector quantization techniques, and other extended versions of them. This software is released under the New and Simplified BSD license.
  • SPTK was developed and has been used in the research group of Prof. Satoshi Imai (now retired) and Prof. Takao Kobayashi (currently with the Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology) at the P&I laboratory, Tokyo Institute of Technology. A subset of the tools was chosen and arranged for distribution by Prof. Keiichi Tokuda (currently with the Department of Computer Science and Engineering, Nagoya Institute of Technology) as coordinator, in cooperation with other collaborators (see "Acknowledgments" and "Who we are" in README).
  • The original source code was written by many people who took part in the activities of the research group. Most of the original source code in this distribution was written by Takao Kobayashi (graph, data processing, FFT, sampling rate conversion, etc.), Keiichi Tokuda (speech analysis, speech synthesis, etc.), and Kazuhito Koishida (LSP, vector quantization, etc.).

This version is accompanied by a Reference Manual. A small User's Manual "Examples for using SPTK" is also attached.

Manual


Link to this PDF

Commands

  • acep - adaptive cepstral analysis
  • acorr - obtain autocorrelation sequence
  • agcep - adaptive generalized cepstral analysis
  • amcep - adaptive mel-cepstral analysis
  • average - calculate mean for each block
  • b2mc - transform MLSA digital filter coefficients to mel-cepstrum
  • bcp - block copy
  • bcut - binary file cut
  • bell - ring a bell
  • c2acr - transform cepstrum to autocorrelation
  • c2ir - cepstrum to minimum phase impulse response
  • c2sp - transform cepstrum to spectrum
  • cdist - calculation of cepstral distance
  • clip - data clipping
  • da - play 16-bit linear PCM data
  • dct - DCT-II
  • decimate - decimation (data skipping)
  • delay - delay sequence
  • delta - delta calculation
  • df2 - second order standard form digital filter
  • dfs - digital filter in standard form
  • dmp - binary file dump
  • dtw - dynamic time warping
  • ds - down-sampling
  • echo2 - echo arguments to the standard error
  • excite - generate excitation
  • extract - extract vector
  • fd - file dump
  • fdrw - draw a graph
  • fft - FFT for complex sequence
  • fft2 - 2-dimensional FFT for complex sequence
  • fftcep - FFT cepstral analysis
  • fftr - FFT for real sequence
  • fftr2 - 2-dimensional FFT for real sequence
  • fig - plot a graph
  • frame - extract frame from data sequence
  • freqt - frequency transformation
  • gc2gc - generalized cepstral transformation
  • gcep - generalized cepstral analysis
  • glogsp - draw a log spectrum graph
  • glsadf - GLSA digital filter for speech synthesis
  • gmm - GMM parameter estimation
  • gmmp - calculation of GMM log-probability
  • gnorm - gain normalization
  • grlogsp - draw a running log spectrum graph
  • grpdelay - group delay of digital filter
  • gseries - draw a discrete series
  • gwave - draw a waveform
  • histogram - histogram
  • idct - inverse DCT-II
  • ifft - inverse FFT for complex sequence
  • ifft2 - 2-dimensional inverse FFT for complex sequence
  • ifftr - inverse FFT for real sequence
  • ignorm - inverse gain normalization
  • impulse - generate impulse sequence
  • imsvq - decoder of multi stage vector quantization
  • interpolate - interpolation of data sequence
  • ivq - decoder of vector quantization
  • lbg - LBG algorithm for vector quantizer design
  • levdur - solve an autocorrelation normal equation using Levinson-Durbin method
  • linear_intpl - linear interpolation of data
  • lmadf - LMA digital filter for speech synthesis
  • lpc - LPC analysis using Levinson-Durbin method
  • lpc2c - transform LPC to cepstrum
  • lpc2lsp - transform LPC to LSP
  • lpc2par - transform LPC to PARCOR
  • lsp2lpc - transform LSP to LPC
  • lsp2sp - transform LSP to spectrum
  • lspcheck - check stability and rearrange LSP
  • lspdf - LSP speech synthesis digital filter
  • ltcdf - all-pole lattice digital filter for speech synthesis
  • mc2b - transform mel-cepstrum to MLSA digital filter coefficients
  • mcep - mel-cepstral analysis
  • merge - data merge
  • mfcc - mel-frequency cepstral analysis
  • mgc2mgc - frequency and generalized cepstral transformation
  • mgc2mgclsp - transform MGC to MGC-LSP
  • mgc2sp - transform mel-generalized cepstrum to spectrum
  • mgcep - mel-generalized cepstral analysis
  • mgclsp2mgc - transform MGC-LSP to MGC
  • mglsadf - MGLSA digital filter for speech synthesis
  • minmax - find minimum and maximum values
  • mlpg - obtains parameter sequence from PDF sequence
  • mlsacheck - check stability of MLSA filter
  • mlsadf - MLSA digital filter for speech synthesis
  • msvq - multi stage vector quantization
  • nan - data check
  • norm0 - normalize coefficients
  • nrand - generate normally distributed random value
  • par2lpc - transform PARCOR to LPC
  • pca - principal component analysis
  • pcas - calculate principal component scores
  • phase - transform real sequence to phase
  • pitch - pitch extraction
  • poledf - all pole digital filter for speech synthesis
  • psgr - XY-plotter simulator for EPSF
  • ramp - generate ramp sequence
  • raw2wav - raw to wav (RIFF)
  • reverse - reverse the order of data in each block
  • rmse - calculation of root mean squared error
  • root_pol - calculate roots of a polynomial equation
  • sin - generate sinusoidal sequence
  • smcep - mel-cepstral analysis using 2nd order all-pass filter
  • snr - evaluate SNR and segmental SNR
  • sopr - execute scalar operations
  • spec - transform real sequence to log spectrum
  • step - generate step sequence
  • swab - swap bytes
  • symmetrize - symmetrize the sequence of data
  • train - generate pulse sequence
  • transpose - transpose a matrix
  • uels - unbiased estimation of log spectrum
  • ulaw - μ-law compress/decompress
  • us - up-sampling
  • us16 - up-sampling from 10 or 12 kHz to 16 kHz
  • uscd - up/down-sampling from 8, 10, 12, or 16 kHz to 11.025, 22.05, or 44.1 kHz
  • vc - GMM-based voice conversion
  • vopr - execute vector operations
  • vq - vector quantization
  • vstat - vector statistics calculation
  • vsum - summation of vector
  • wav2raw - wav (RIFF) to raw
  • window - data windowing
  • x2x - data type transformation
  • xgr - XY-plotter simulator for X-window system
  • zcross - zero cross
  • zerodf - all zero digital filter for speech synthesis
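
Several of the commands above are typically chained: a signal is cut into frames, each frame is windowed, and an analysis step is applied per frame. As a rough illustration of what a frame, window, acorr, levdur chain computes, here is a minimal pure-Python sketch. It is not SPTK's implementation: the function names, the choice of a Hamming window, the framing rule (frames start at sample 0, trailing partial frame dropped), and the output layout are all assumptions made for this example, and SPTK's own defaults and binary formats may differ.

```python
import math


def frame(signal, length, period):
    """Cut a sequence into overlapping frames of `length` samples,
    one frame every `period` samples (hypothetical helper, simplified:
    frames start at sample 0 and a trailing partial frame is dropped)."""
    return [signal[s:s + length]
            for s in range(0, len(signal) - length + 1, period)]


def hamming(length):
    """Hamming window coefficients (assumed window shape for this sketch)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def autocorrelation(x, order):
    """Autocorrelation sequence r[0..order] of one frame."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations by the Levinson-Durbin
    recursion. Returns (a, err) with a[0] = 1, where the prediction error
    filter is A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order and err is
    the final prediction error energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update coefficients 1..i-1 from the previous stage, then set a[i]
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


if __name__ == "__main__":
    # Synthetic decaying sinusoid, used only as illustrative input data
    signal = [math.exp(-0.01 * n) * math.sin(0.3 * n) for n in range(400)]
    win = hamming(200)
    for frm in frame(signal, 200, 100):
        windowed = [s * c for s, c in zip(frm, win)]
        r = autocorrelation(windowed, 2)
        a, err = levinson_durbin(r, 2)
        print(a, err)
```

For the autocorrelation sequence of a first-order process, r = [1, 0.5, 0.25], the recursion returns a[1] = -0.5 with a second coefficient of zero, which is a quick sanity check on the update equations.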

Related