[This post is based on a lecture given by Ariel David and me on the effects of IP networks on visual quality in video over IP applications.]
One of the biggest obstacles for the video conferencing industry has long been the user experience. Early video conferencing systems suffered from poor quality video and low fidelity audio, as well as great difficulty in establishing a connection and maintaining it for the duration of the conference. Most of these issues were solved by advances in processor speed, audio and video coding algorithms and network infrastructure. But even today’s video conferencing experience – with high definition video, wide-band audio and always-on networks – is not flawless, and the Achilles heel is, as always, the network.
The Achilles heel of High Definition – available bandwidth
This is true for video over IP in general – IPTV, video streaming, web casting, you name it. Whether it’s the network one is connected to at the office, at home or on a mobile handset – bandwidth is expensive, and therefore bandwidth is limited. Transmitting video at reasonable bit rates over that bandwidth is not trivial, and the effect the bandwidth has on visual quality is significant.
It can be said that video conferencing is much “easier”, in terms of content complexity, than the other video over IP domains, as it usually features a bunch of “Talking Heads” (also known as CP). Still, as can be seen in the table below, even at 480p an endpoint requires around 1 Mbps for good visual quality, and almost twice that at 720p (HD).
A 1 Mbps network connection is not a problem in a modern enterprise. But if video conferencing is deployed all over the organization (as it should be), the existing infrastructure very soon becomes a bottleneck and the bandwidth available to each call drops, causing an array of nasty artifacts, those annoyances that make your video terribly unpleasant. And while other aspects, such as video codec features, scene type and source type, also influence the visual quality, it seems that network-related artifacts are the most frequent and the most annoying.
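To get a feel for how quickly a shared link fills up, here is a rough back-of-the-envelope sketch. The per-stream rates are the ones quoted above (about 1 Mbps at 480p, rounded up to 2 Mbps at 720p); the link size and the share of it set aside for video are purely hypothetical numbers picked for illustration.

```python
# Rough capacity estimate. Per-stream rates follow the figures in the text;
# the link size and the video share below are hypothetical.
PER_STREAM_KBPS = {"480p": 1000, "720p": 2000}  # 720p rounded up to 2 Mbps


def max_concurrent_streams(link_kbps, video_share_percent, resolution):
    """Number of simultaneous streams that fit in the video share of a link."""
    budget_kbps = link_kbps * video_share_percent // 100
    return budget_kbps // PER_STREAM_KBPS[resolution]


# Hypothetical office uplink: 100 Mbps, 30% of it reserved for video.
for resolution in ("480p", "720p"):
    streams = max_concurrent_streams(100_000, 30, resolution)
    print(resolution, streams)
```

Under these assumptions the link carries 30 streams at 480p and only 15 at 720p, before any other traffic spikes are taken into account.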
Insufficient bandwidth will result in packet loss, which leads to lousy video. There are basically two ways for an endpoint to deal with that – reduce the bit rate and/or increase the compression. Reducing the bit rate causes a drop in visual quality, as can be seen above. Modern endpoints reduce the resolution (picture size) together with the bit rate, but the experience suffers either way. A higher compression rate decreases the visual quality even further.
And so we are left with the following seven deadly sins – I mean artifacts – of video over IP:
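The step-down behavior described above can be sketched as a simple ladder that an endpoint walks in response to measured packet loss. This is only an illustration, not how any particular endpoint works: the loss thresholds, the two lower rungs and their bit rates, and the function names are all assumptions.

```python
# (resolution, bit rate in kbps), best quality first.
LADDER = [
    ("720p", 2000),  # "almost twice" 1 Mbps, rounded up
    ("480p", 1000),  # figure quoted in the text
    ("CIF", 384),    # hypothetical lower rung
    ("QCIF", 128),   # hypothetical lowest rung
]


def next_rung(current, loss_percent, step_down_at=2.0, step_up_at=0.5):
    """Return the new ladder index for the measured packet loss (thresholds are assumptions)."""
    if loss_percent >= step_down_at and current < len(LADDER) - 1:
        return current + 1  # too much loss: lower resolution and bit rate
    if loss_percent <= step_up_at and current > 0:
        return current - 1  # network looks healthy: try the next rung up
    return current


def simulate(loss_reports, start=0):
    """Walk the ladder over a series of loss reports and record the resolutions."""
    rung = start
    history = []
    for loss in loss_reports:
        rung = next_rung(rung, loss)
        history.append(LADDER[rung][0])
    return history


print(simulate([0.0, 3.0, 5.0, 1.0, 0.2]))
```

The gap between the two thresholds keeps the endpoint from bouncing between rungs when loss hovers around a single value.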
The Seven Deadly Sins
1. Packet Loss
If packets are missing, whole areas of the video frame are displayed incorrectly. This causes ugly artifacts to appear in various ways, all of them very unpleasant.
Packet loss in a multipoint video conference.
2. Trails
Trails are vertical lines or colored areas that usually follow an object while it’s moving. They tend to appear when the bandwidth is insufficient, as movement requires more bits.
Left: A scene featuring trails on the person on the left. Right: zoom in on the trails.
3. Blockiness
Blockiness is usually visible on moving objects and backgrounds (walls, furniture). This is caused by a high compression rate.
Left: original. Right: blockiness effect.
4. Posterization
Posterization is the loss of color depth. This means that backgrounds and other evenly colored areas look blurred, and lose the sharp transitions between them. Again, this is caused by a high compression rate.
Left: source. Right: posterization effect.
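Loss of color depth is easy to imitate on a toy example. The sketch below is not what an encoder literally does; it simply drops the low bits of 8-bit samples (the function name and the 3-bit depth are assumptions) to show how a smooth ramp collapses into a few flat bands.

```python
def posterize(value, bits):
    """Keep only the top `bits` bits of an 8-bit sample."""
    shift = 8 - bits
    return (value >> shift) << shift


# A smooth gray ramp: 32 distinct levels from 0 to 248.
gradient = list(range(0, 256, 8))

# The same ramp at a hypothetical 3-bit depth.
coarse = [posterize(v, 3) for v in gradient]

print(len(set(gradient)), "levels before")
print(len(set(coarse)), "levels after")
```

The 32 distinct levels of the ramp end up as 8 bands, each four samples wide, which is the stepped look of a posterized wall or sky.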
5. Noise
Noise is a general name for any “weird” looking squares and spots in the video. These are usually caused by packet loss or erroneous packets.
Random noisy artifacts on video
6. Mosquitoes (Gibbs effect)
Mosquitoes, or by their formal name the “Gibbs effect”, are dots of varying intensity that surround edges. They are caused by too high a compression rate.
Left: visible mosquitoes on the face. Right: zoom in on the face.
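The classic textbook picture of the Gibbs effect is a square wave rebuilt from a limited number of sine terms: the rebuilt signal rings and overshoots next to the edge. The sketch below is a one-dimensional analogy rather than a video codec, and the function names and sample count are my own choices.

```python
import math


def square_wave_partial_sum(x, harmonics):
    """Sum of the first `harmonics` odd sine terms of a unit square wave."""
    total = 0.0
    for j in range(harmonics):
        k = 2 * j + 1
        total += math.sin(k * x) / k
    return 4.0 / math.pi * total


def peak_value(harmonics, samples=4000):
    """Largest value of the partial sum on (0, pi); the ideal wave equals 1 there."""
    return max(
        square_wave_partial_sum(math.pi * i / samples, harmonics)
        for i in range(1, samples)
    )


for harmonics in (5, 50):
    print(harmonics, round(peak_value(harmonics), 3))
```

Adding more terms squeezes the ringing closer to the edge, but the overshoot stays at roughly 18% above the flat level. That stubborn ripple hugging an edge is what shows up in video as mosquitoes.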
7. Scaling
As endpoints lower the resolution when the bit rate drops, the picture on the other end is smaller and has to be scaled up to display on a large screen. Scaling up a low resolution picture is very hard, and often the visual quality suffers.
Left: QCIF image (source). Right: scaled up image.
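To see why scaling up is hard, consider the simplest possible scaler, which just repeats every pixel. Nearest-neighbor is my choice for illustration here, not necessarily what an endpoint uses, and the function name is made up.

```python
def upscale_nearest(image, factor):
    """Enlarge a 2-D list of pixels by repeating each pixel `factor` times per axis."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


tiny = [
    [10, 200],
    [200, 10],
]

for row in upscale_nearest(tiny, 2):
    print(row)
```

No new detail appears; every source pixel just becomes a bigger square. Going from QCIF (176x144) to a 720p display means each source pixel has to cover roughly 7 by 5 screen pixels, and smarter filters can only trade those squares for blur.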
When it comes to visual quality, everyone is an expert. Whether you’re watching a video streamed over the internet, attending a video conference with peers around the world or watching the news on your mobile handset – you don’t want anything interfering with your experience, and definitely not those pesky artifacts.
There are ways to remove, or at least reduce, those annoyances. Those methods mostly involve fancy post-processing algorithms. But this, I fear, is a matter for another post.