Tech Talk Featured Article

Tech Talk with SIP Print's Jonathan Fuld: What is a Codec?

February 01, 2010

By TMCnet Special Guest
Jonathan Fuld, Chief Technology Officer, SIP Print

A codec is an algorithm that a telephone uses to break down, or encode, speech from its handset. The telephone at the other end of the conversation uses the same algorithm to recreate, or decode, that speech.
The process also runs in the opposite direction, from the second phone back to the first. So the codec is used four times in a two-way call: once to encode and once to decode from caller A to caller B, and once each again from caller B to caller A. In a conference call, this process happens between each caller and the conferencing soft switch. Codecs can also be used by devices other than telephones, such as call recording devices and video call recording devices.
The word “codec” is itself a compression of the phrase “coder-decoder,” or the longer phrase, “to code and then to decode.”
A codec exists to compress raw data, such as voice, music, video, or other high-bandwidth data streams, for transmission. Each codec has a specific purpose: some are used for voice and others for video.
The two most common codecs used in VoIP systems are G.711 and G.729. Both are standards of the International Telecommunication Union, or “ITU,” an organization based in Geneva, Switzerland.
G.711 is commonly known as the Pulse Code Modulation, or “PCM,” codec. PCM means the software samples the audio signal 8000 times a second, with each sample represented by 8 bits, for a total of 64 kbit/s. That is sixty-four thousand bits per second. There are two versions of the codec. The µ-law version is generally used in North America and Japan. The A-law version is used in Europe and most other regions. The two differ in the curve used to map the loudness of each sample onto its 8 bits. This codec has no patent or licensing costs, and it is found in many premises-based PBX systems.
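The arithmetic and the idea behind µ-law can be sketched in a few lines of Python. This is only an illustration: it uses the continuous textbook µ-law formula with µ = 255 and a plain 256-level quantizer, not the exact segmented encoding table from the G.711 standard, and the function names are made up for this example.

```python
import math

SAMPLE_RATE_HZ = 8000   # samples per second, as described above
BITS_PER_SAMPLE = 8     # bits per sample, as described above
MU = 255                # mu value assumed for the mu-law variant

def bit_rate(sample_rate_hz, bits_per_sample):
    """Bits per second produced by uncompressed sampling."""
    return sample_rate_hz * bits_per_sample

def mu_law_compress(x):
    """Map a sample in [-1.0, 1.0] onto a logarithmic scale in [-1.0, 1.0]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def encode(x):
    """Compress a sample, then quantize it to one of 256 levels (8 bits)."""
    y = mu_law_compress(x)
    return round((y + 1.0) / 2.0 * 255)

def decode(code):
    """Turn an 8-bit code back into an approximate sample value."""
    y = code / 255 * 2.0 - 1.0
    return mu_law_expand(y)

print(bit_rate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE))  # 64000
print(encode(0.5), decode(encode(0.5)))
```

Because the curve is logarithmic, quiet samples are stored with finer steps than loud ones, which suits the way people hear speech.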
The second codec, G.729, is a patented algorithm with associated license fees paid to the inventors of multiple sections of the algorithm. It is used primarily to transmit data internally on Telco networks, between SIP trunk providers and premises-based PBX systems, and to the telephones connected to hosted SIP PBX systems.
This codec is complex, but it has a high compression rate that results in a low bandwidth cost. It operates at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction, or “CS-ACELP.” Rather than transmitting each sample, CS-ACELP analyzes the audio signal in 10 millisecond frames and describes each frame with a small set of parameters. The encoder also carries information forward from the frames it has already processed, so each frame is encoded in the context of the audio that preceded it.
Each frame travels across the digital network until it reaches the algorithm at the other end of the telephone conversation, or a call recording device. Decoding runs the process in reverse: the algorithm combines the most recently received frame with the state it has built up from the previous frames to produce a new audio signal.
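To put the frame size and the bandwidth saving into numbers, here is a short Python sketch. It uses only the figures quoted in this article (8000 samples per second, 10 millisecond frames, 64 kbit/s and 8 kbit/s) and counts the encoded audio only, not the network packet headers that carry it. The function names are illustrative.

```python
SAMPLE_RATE_HZ = 8000    # samples per second
FRAME_MS = 10            # G.729 frame length in milliseconds
G711_BIT_RATE = 64_000   # bits per second
G729_BIT_RATE = 8_000    # bits per second

def samples_per_frame(sample_rate_hz, frame_ms):
    """Number of audio samples covered by one frame."""
    return sample_rate_hz * frame_ms // 1000

def bits_per_frame(bit_rate, frame_ms):
    """Number of encoded bits needed for one frame at a given bit rate."""
    return bit_rate * frame_ms // 1000

def compression_ratio(reference_bit_rate, codec_bit_rate):
    """How many times smaller the codec's stream is than the reference."""
    return reference_bit_rate / codec_bit_rate

print(samples_per_frame(SAMPLE_RATE_HZ, FRAME_MS))       # 80 samples
print(bits_per_frame(G711_BIT_RATE, FRAME_MS))           # 640 bits
print(bits_per_frame(G729_BIT_RATE, FRAME_MS))           # 80 bits
print(compression_ratio(G711_BIT_RATE, G729_BIT_RATE))   # 8.0
```

In other words, G.729 describes 80 samples of speech in 80 bits, where G.711 would use 640 bits for the same 10 milliseconds.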
While complex and somewhat processor intensive, this codec presents the user with three additional features:
1)      Discontinuous Transmission, or “DTX,” capabilities: what to send when no one is talking but the call is still connected; sending little or nothing during silence is used in cellular systems to save battery power
2)      Voice Activity Detection, or “VAD,” capabilities: how to decide whether a frame of the audio signal contains voice or only background noise
3)      Comfort Noise Generation, or “CNG,” capabilities: when no one is talking but the call is still connected, the listener hears a low-level hum or noise, which tells the human brain that the call has not dropped
These extra features save bandwidth and make the reproduced signal sound more natural to the listener.
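The way the three features work together can be illustrated with a toy Python example. This is not the algorithm G.729 uses: the voice detector here is a simple energy threshold, the threshold and noise level are arbitrary values chosen for the example, and a real system would also send occasional small updates describing the background noise. It is meant only to show the division of labor between VAD, DTX and CNG.

```python
import random

FRAME_SIZE = 80            # samples in a 10 ms frame at 8000 samples/s
ENERGY_THRESHOLD = 1e-4    # hypothetical cutoff between noise and voice

def frame_energy(frame):
    """Average power of the samples in one frame."""
    return sum(s * s for s in frame) / len(frame)

def is_voice(frame, threshold=ENERGY_THRESHOLD):
    """VAD: treat a frame as voice if its energy exceeds the threshold."""
    return frame_energy(frame) > threshold

def transmit(frames):
    """DTX: send voice frames, and send nothing (None) for silent ones."""
    return [frame if is_voice(frame) else None for frame in frames]

def receive(sent, noise_level=0.001, rng=None):
    """CNG: fill every gap in the stream with low-level random noise."""
    rng = rng or random.Random(0)
    output = []
    for frame in sent:
        if frame is None:
            frame = [rng.uniform(-noise_level, noise_level)
                     for _ in range(FRAME_SIZE)]
        output.append(frame)
    return output

speech = [0.5] * FRAME_SIZE    # stand-in for a frame with someone talking
silence = [0.0] * FRAME_SIZE   # stand-in for a frame with no one talking

sent = transmit([speech, silence, speech])
heard = receive(sent)
print(sum(frame is not None for frame in sent), "of", len(sent), "frames sent")
```

Only the frames with voice cross the network, yet the listener still hears something in every frame.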
There are additional voice codecs, including G.722 (high definition), G.723 and others. Video codecs include H.261, H.263 and H.264.
In summary, a codec is a method to reduce a signal, whether audio, video, or other, into discrete digital 1s and 0s for transmission across the Internet, and to rebuild that signal from the digital IP packets.


Edited by Kelly McGuire
