by Lawrence R. Rabiner, Ronald W. Schafer
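This is the textbook for my university speech recognition class... The midterm seems to cover up through Chapter 8, but will I be able to manage it...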
CHAPTER 1 Introduction to Digital Speech Processing 1
1.1 The Speech Signal 3
1.2 The Speech Stack 8
1.3 Applications of Digital Speech Processing 10
1.4 Comment on the References 15
1.5 Summary
CHAPTER 2 Review of Fundamentals of Digital Signal Processing
2.1 Introduction 18
2.2 Discrete-Time Signals and Systems 18
2.3 Transform Representation of Signals and Systems 22
2.4 Fundamentals of Digital Filters 33
2.5 Sampling 44
2.6 Summary 56
CHAPTER 3 Fundamentals of Human Speech Production 67
3.1 Introduction 67
3.2 The Process of Speech Production 68
3.3 Short-Time Fourier Representation of Speech 81
3.4 Acoustic Phonetics 86
3.5 Distinctive Features of the Phonemes of American English
3.6 Summary 110
CHAPTER 4 Hearing, Auditory Models, and Speech Perception
4.1 Introduction 124
4.2 The Speech Chain 125
4.3 Anatomy and Function of the Ear 127
4.4 The Perception of Sound 133
4.5 Auditory Models 150
4.6 Human Speech Perception Experiments 158
4.7 Measurement of Speech Quality and Intelligibility 162
4.8 Summary 166
CHAPTER 5 Sound Propagation in the Human Vocal Tract 170
5.1 The Acoustic Theory of Speech Production 170
5.2 Lossless Tube Models 200
5.3 Digital Models for Sampled Speech Signals 219
5.4 Summary 228
CHAPTER 6 Time-Domain Methods for Speech Processing 239
6.1 Introduction 239
6.2 Short-Time Analysis of Speech 242
6.3 Short-Time Energy and Short-Time Magnitude 248
6.4 Short-Time Zero-Crossing Rate 257
6.5 The Short-Time Autocorrelation Function 265
6.6 The Modified Short-Time Autocorrelation Function 273
6.7 The Short-Time Average Magnitude Difference Function
6.8 Summary 277
CHAPTER 7 Frequency-Domain Representations 287
7.1 Introduction 287
7.2 Discrete-Time Fourier Analysis 289
7.3 Short-Time Fourier Analysis 292
7.4 Spectrographic Displays 312
7.5 Overlap Addition Method of Synthesis 319
7.6 Filter Bank Summation Method of Synthesis 331
7.7 Time-Decimated Filter Banks 340
7.8 Two-Channel Filter Banks 348
7.9 Implementation of the FBS Method Using the FFT 358
7.10 OLA Revisited 365
7.11 Modifications of the STFT 367
7.12 Summary 379
CHAPTER 8 The Cepstrum and Homomorphic Speech Processing
8.1 Introduction 399
8.2 Homomorphic Systems for Convolution 401
8.3 Homomorphic Analysis of the Speech Model 417
8.4 Computing the Short-Time Cepstrum and Complex Cepstrum of Speech 429
8.5 Homomorphic Filtering of Natural Speech 440
8.6 Cepstrum Analysis of All-Pole Models 456
8.7 Cepstrum Distance Measures 459
8.8 Summary 466
CHAPTER 9 Linear Predictive Analysis of Speech Signals 473
9.1 Introduction 473
9.2 Basic Principles of Linear Predictive Analysis 474
9.3 Computation of the Gain for the Model 486
9.4 Frequency Domain Interpretations of Linear Predictive Analysis 490
9.5 Solution of the LPC Equations 505
9.6 The Prediction Error Signal 527
9.7 Some Properties of the LPC Polynomial A(z) 538
9.8 Relation of Linear Predictive Analysis to Lossless Tube Models 546
9.9 Alternative Representations of the LP Parameters 551
9.10 Summary 560
CHAPTER 10 Algorithms for Estimating Speech Parameters 578
10.1 Introduction 578
10.2 Median Smoothing and Speech Processing 580
10.3 Speech-Background/Silence Discrimination 586
10.4 A Bayesian Approach to Voiced/Unvoiced/Silence Detection 595
10.5 Pitch Period Estimation (Pitch Detection) 603
10.6 Formant Estimation 635
10.7 Summary 645
CHAPTER 11 Digital Coding of Speech Signals 663
11.1 Introduction 663
11.2 Sampling Speech Signals 667
11.3 A Statistical Model for Speech 669
11.4 Instantaneous Quantization 676
11.5 Adaptive Quantization 706
11.6 Quantizing of Speech Model Parameters 718
11.7 General Theory of Differential Quantization 732
11.8 Delta Modulation 743
11.9 Differential PCM (DPCM) 759
11.10 Enhancements for ADPCM Coders 768
11.11 Analysis-by-Synthesis Speech Coders 783
11.12 Open-Loop Speech Coders 806
11.13 Applications of Speech Coders 814
11.14 Summary 819
CHAPTER 12 Frequency-Domain Coding of Speech and Audio 842
12.1 Introduction 842
12.2 Historical Perspective 844
12.3 Subband Coding 850
12.4 Adaptive Transform Coding 861
12.5 A Perception Model for Audio Coding 866
12.6 MPEG-1 Audio Coding Standard 881
12.7 Other Audio Coding Standards 894
12.8 Summary 894
CHAPTER 13 Text-to-Speech Synthesis Methods 907
13.1 Introduction 907
13.2 Text Analysis 908
13.3 Evolution of Speech Synthesis Methods 914
13.4 Early Speech Synthesis Approaches 916
13.5 Unit Selection Methods 926
13.6 TTS Future Needs 942
13.7 Visual TTS 943
13.8 Summary 947
CHAPTER 14 Automatic Speech Recognition and Natural Language Understanding 950
14.1 Introduction 950
14.2 Basic ASR Formulation 952
14.3 Overall Speech Recognition Process 953
14.4 Building a Speech Recognition System 954
14.5 The Decision Processes in ASR 957
14.6 Step 3: The Search Problem 971
14.7 Simple ASR System: Isolated Digit Recognition 972
14.8 Performance Evaluation of Speech Recognizers 974
14.9 Spoken Language Understanding 977
14.10 Dialog Management and Spoken Language Generation
14.11 User Interfaces 983
14.12 Multimodal User Interfaces 984
14.13 Summary 984