
Abstract

Segmental Sinusoidal Model for Speech Signal Coding. A periodic signal can be decomposed into sinusoidal components using the Fourier series, so a speech signal with this characteristic can be modeled in sinusoidal form. With a sinusoidal model, the signal can be quantized to encode speech at a lower rate. Existing sinusoidal methods are already used in speech coding: a block of the speech signal 20 ms to 30 ms wide is coded using Fourier series coefficients. The method proposed here quantizes and reconstructs the speech signal with a segmental sinusoidal model. A segment is defined as the portion of the speech signal from one peak to the next consecutive peak, so the segment length is variable rather than fixed as in existing sinusoidal methods. The coder consists of an encoder and a decoder. The encoder codes the speech signal at a variable rate, and the coded signal is then transmitted to the receiver. At the receiver, the decoder reconstructs the signal so that its quality is close to that of the original. Experimental results show an average segmental SNR of more than 20 dB.
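The peak-to-peak segmentation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' coder: it assumes that a "peak" is any local maximum or minimum, that each segment is reconstructed as a half-period cosine joining its two end amplitudes, and that no quantization is applied. The function names `find_extrema`, `encode`, and `decode` are hypothetical.

```python
import numpy as np

def find_extrema(x):
    """Indices of local maxima/minima, plus both endpoints of the signal."""
    d = np.diff(x)
    # A sign change in the first difference marks an interior extremum.
    # Flat plateaus (d == 0) are not detected by this simple test.
    interior = np.where(d[:-1] * d[1:] < 0)[0] + 1
    return np.unique(np.concatenate(([0], interior, [len(x) - 1])))

def encode(x):
    """Describe the signal by segment lengths and extremum amplitudes."""
    idx = find_extrema(x)
    lengths = np.diff(idx)   # variable segment lengths, in samples
    amps = x[idx]            # amplitude at each segment boundary
    return lengths, amps

def decode(lengths, amps):
    """Rebuild each segment as a half-period cosine (an assumption)."""
    out = []
    for seg_len, a0, a1 in zip(lengths, amps[:-1], amps[1:]):
        k = np.arange(seg_len)
        mean, half_range = (a0 + a1) / 2, (a0 - a1) / 2
        out.append(mean + half_range * np.cos(np.pi * k / seg_len))
    out.append(amps[-1:])    # final boundary sample
    return np.concatenate(out)

# Usage: a pure cosine whose peaks fall exactly on sample instants.
n = np.arange(201)
x = np.cos(2 * np.pi * n / 40)
lengths, amps = encode(x)
y = decode(lengths, amps)
```

For this idealized input the half-cosine pieces coincide with the original up to rounding. Real speech would leave a residual error, and an actual coder would also quantize the lengths and amplitudes before transmission.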

Bahasa Abstract (English translation)

A periodic signal can be decomposed into sinusoidal form with the help of the Fourier series. Because speech signals have this characteristic, they can be modeled with reference to sinusoidal forms. Using a sinusoidal model, quantization can be carried out to encode the speech signal at a low rate. Sinusoidal methods have been widely used to encode speech signals. With such methods, a block of the speech signal 20 to 30 milliseconds wide can be coded using Fourier series coefficients. The newly proposed method is quantization and reconstruction of the speech signal based on a segmental sinusoidal model. A segment is taken from one particular peak value to the next, opposite peak value, rather than being a signal block of fixed length as in existing sinusoidal methods. The designed coder consists of an encoder and a decoder. The encoder codes the speech signal at a variable rate. The coded signal is then sent to the receiver. At the receiver, a decoder restores the signal to its original form with quality that does not differ much from the original. The experiments yielded an average segmental SNR of more than 20 dB.
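The segmental SNR quoted in the abstract is commonly computed as the mean of per-frame SNR values in decibels. The sketch below shows one such computation under stated assumptions: fixed frames of 160 samples (20 ms at 8 kHz), silent frames skipped, and a small `eps` guarding against division by zero. The abstract does not say how the authors framed the signal, so `segmental_snr`, `frame_len`, and `eps` are hypothetical choices.

```python
import numpy as np

def segmental_snr(x, y, frame_len=160, eps=1e-12):
    """Mean per-frame SNR in dB between original x and reconstruction y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_frames = len(x) // frame_len          # trailing partial frame is ignored
    snrs = []
    for i in range(n_frames):
        s = slice(i * frame_len, (i + 1) * frame_len)
        signal_energy = np.sum(x[s] ** 2)
        error_energy = np.sum((x[s] - y[s]) ** 2)
        if signal_energy > eps:             # skip silent frames (assumption)
            snrs.append(10 * np.log10(signal_energy / (error_energy + eps)))
    return float(np.mean(snrs)) if snrs else float("nan")

# Usage: a reconstruction with a uniform 10 % amplitude error.
t = np.arange(1600)
x = np.sin(2 * np.pi * t / 50)
y = 1.1 * x
snr_db = segmental_snr(x, y)
```

Averaging in the dB domain gives quiet frames the same weight as loud ones, which is the usual reason for reporting segmental SNR instead of a single global SNR for speech.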
