CPL - Chalmers Publication Library

Coding Speech for Packet Networks

Jonas Lindblom (Department of Electromagnetics)
Göteborg : Chalmers University of Technology, 2003. ISBN: 91-7291-374-6. 35 pp.
[Doctoral thesis]

This thesis addresses speech coding for packet networks and the problems that arise when such networks carry voice. Real-time voice communication is highly delay sensitive: if the total end-to-end delay in a telephone session grows large, users perceive it as annoying. The Internet is currently a "best-effort" network and, in contrast to a traditional telephone channel, delays may vary throughout a conversation. Packets containing speech data that are delayed so long that they miss their scheduled playout time at the receiver are treated as lost. Receivers must handle packet loss in some way, or the subjective quality will be severely degraded.
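The playout-deadline rule described above can be illustrated with a minimal sketch. It assumes a fixed playout delay and made-up timestamps. The `Packet` type, the `classify_packets` function and all numbers are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int              # sequence number (hypothetical field)
    send_time_ms: float   # time the sender emitted the packet
    arrival_ms: float     # time the packet reached the receiver


def classify_packets(packets, playout_delay_ms):
    """Split packets into played and lost, assuming a fixed playout delay.

    A packet is scheduled for playout at send_time_ms + playout_delay_ms.
    If it arrives after that instant it is useless to the receiver and
    counts as lost, even though the network eventually delivered it.
    """
    played, lost = [], []
    for p in packets:
        deadline = p.send_time_ms + playout_delay_ms
        if p.arrival_ms <= deadline:
            played.append(p.seq)
        else:
            lost.append(p.seq)
    return played, lost


# Illustrative 20 ms frames; packet 2 suffers an unusually long network delay.
packets = [
    Packet(0, 0.0, 40.0),
    Packet(1, 20.0, 55.0),
    Packet(2, 40.0, 150.0),
    Packet(3, 60.0, 95.0),
]
played, lost = classify_packets(packets, playout_delay_ms=100.0)
print("played:", played, "lost:", lost)
```

Raising `playout_delay_ms` would rescue packet 2 but adds to the end-to-end delay, which is the trade-off the abstract points to.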

The packet loss problem is central to this thesis and is approached from several directions. The thesis consists of seven articles (papers A-G). Three of them (B-D) propose receiver-based packet loss concealment (PLC) methods, which can in principle be employed in any existing system by modifying only the receivers. Paper E proposes a forward error correction system based on a secondary sub-coder, which is found to yield good results; compared to receiver-based PLC, however, it requires more bandwidth and introduces additional delay. Instead of using PLC add-ons, as in papers B-E, paper G aims to design a complete speech coder from scratch with the packet channel in mind. Many of today's coders use inter-frame coding techniques for compression efficiency. Under frame-erasure conditions such coders perform poorly, because lost internal coder states cause errors to propagate over several frames. The coder proposed in paper G avoids this by using new variable-dimension coding techniques based on Gaussian mixture (GM) models. These GM-based coding schemes are treated more generally in paper F. Gaussian mixture modeling is frequently employed throughout the thesis (papers A, B, F, G) and is the sole topic of paper A, which investigates a modified GM model with a corresponding model estimation algorithm.
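The error propagation attributed to inter-frame coding can be shown with a toy example. It assumes a scalar "frame" value, a first-order differential coder, and simple repetition when a frame is lost. This is only an illustration of the state-loss effect. It is not the coder from paper G, and the GM-based variable-dimension techniques are not shown.

```python
def decode_predictive(residuals, lost):
    """Toy inter-frame decoder: each frame is previous state + residual.

    When a frame is lost its residual is unknown, so the decoder keeps
    the old state. The state is then wrong for every later frame.
    """
    state, out = 0.0, []
    for i, r in enumerate(residuals):
        if i not in lost:
            state += r
        out.append(state)
    return out


def decode_independent(values, lost):
    """Toy frame-independent decoder: a lost frame repeats the last good one."""
    prev, out = 0.0, []
    for i, v in enumerate(values):
        if i not in lost:
            prev = v
        out.append(prev)
    return out


# Hypothetical per-frame parameter values and their frame-to-frame differences.
frames = [1.0, 3.0, 2.0, 5.0, 4.0]
residuals = [frames[0]] + [frames[i] - frames[i - 1] for i in range(1, len(frames))]

lost = {1}  # frame 1 is erased by the channel
pred = decode_predictive(residuals, lost)
indep = decode_independent(frames, lost)

pred_errors = [abs(a - b) for a, b in zip(pred, frames)]
indep_errors = [abs(a - b) for a, b in zip(indep, frames)]
print("predictive errors: ", pred_errors)
print("independent errors:", indep_errors)
```

In this sketch the predictive decoder stays wrong on every frame after the erasure, while the frame-independent decoder is wrong only on the erased frame.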

Keywords: speech coding, voice over IP, frame erasure, packet loss concealment, Gaussian mixture modeling, bounded support, sinusoidal modeling, harmonic modeling, vector quantization, variable dimension



This record was created 2006-08-28. Last modified 2013-09-25.
CPL Pubid: 157

 

Departments (Chalmers)

Department of Electromagnetics (1900-2004)

Subject areas

Electrical engineering and electronics

Chalmers infrastructure

Part of series

Technical report - School of Electrical Engineering, Chalmers University of Technology, Göteborg, Sweden 468


Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie 2056