[This is the second part of a series of posts about bandwidth estimation for IP-based video calling.]
- Part 1: NetSense 101: Why do we need bandwidth estimation?
- Part 2: NetSense 101: Packet-loss-based Bandwidth Estimation (this post)
- Part 3: NetSense 101: Delay-based Bandwidth Estimation
- Part 4: NetSense 101: Q&A
Bandwidth. We don’t have enough of it. And video calling devices need to know how much of it is available. That, at least, is the gist of my previous post on why we need bandwidth estimation.
How do you estimate it today in most cases? Using packet loss information.
When two endpoints are engaged in a video call, they send RTP packets carrying the actual compressed media. In parallel to RTP, an additional protocol called RTCP is used. RTCP carries control information between the endpoints about the RTP media: how many packets were sent, received, lost, and so on.
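To make that concrete, here's a minimal sketch (in Python) of the bookkeeping behind the loss fields of an RTCP Receiver Report, driven by the RTP sequence numbers the receiver observes. The class and method names are mine, and real implementations follow RFC 3550 more carefully (sequence-number wraparound and reordering are ignored here for brevity):

```python
# Sketch of the loss bookkeeping behind an RTCP Receiver Report.
# Names are illustrative; real stacks follow RFC 3550 Appendix A.

class LossStats:
    def __init__(self, first_seq):
        self.base_seq = first_seq   # first sequence number seen
        self.max_seq = first_seq    # highest sequence number seen
        self.received = 0           # total packets received
        self.expected_prior = 0     # snapshots taken at the last report
        self.received_prior = 0

    def on_packet(self, seq):
        self.received += 1
        if seq > self.max_seq:
            self.max_seq = seq

    def build_report(self):
        expected = self.max_seq - self.base_seq + 1
        cumulative_lost = expected - self.received

        # Fraction lost since the previous report, as an 8-bit fixed-point
        # number (lost / expected * 256), which is how RTCP encodes it.
        expected_interval = expected - self.expected_prior
        received_interval = self.received - self.received_prior
        lost_interval = expected_interval - received_interval
        self.expected_prior = expected
        self.received_prior = self.received
        if expected_interval == 0 or lost_interval <= 0:
            fraction_lost = 0
        else:
            fraction_lost = (lost_interval << 8) // expected_interval
        return fraction_lost, cumulative_lost
```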
This information can then be used by the sender or the receiver of the media to change its behavior in real time.
In most systems, the decision is left to the receiver, since it is the one receiving the data and has some awareness of how the link is behaving.
The receiver monitors the incoming media, and if it sees too many packet losses (a heuristic that differs between vendors), it estimates how much effective bandwidth it has and, based on that, asks the sender to reduce the bitrate on one of the media channels. The sender, in turn, reduces the quality of the encoded data, the frame rate, or the resolution, again based on its own internal policies.
The diagram below illustrates the flow the receiver follows in this bandwidth estimation algorithm.
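There is no single standard for this loop, but loss-based controllers commonly share the same shape: probe upward when loss is low, hold in an ambiguous middle band, and cut multiplicatively when loss is high. The thresholds and factors in the sketch below are illustrative assumptions, not anyone's canonical values:

```python
# Illustrative loss-based bitrate controller, run by the receiver each
# time an RTCP report interval elapses. Thresholds and factors are
# assumptions chosen for the example, not canonical values.

LOW_LOSS = 0.02    # below this, the link seems fine: probe upward
HIGH_LOSS = 0.10   # above this, assume congestion: back off

def update_estimate(current_bps, fraction_lost):
    if fraction_lost < LOW_LOSS:
        # No meaningful loss: cautiously increase to discover headroom.
        return current_bps * 1.05
    if fraction_lost > HIGH_LOSS:
        # Heavy loss: cut proportionally to how bad it is.
        return current_bps * (1.0 - 0.5 * fraction_lost)
    # In between: hold; it's ambiguous whether this is congestion.
    return current_bps
```

The resulting estimate is then fed back to the sender, typically via RTCP feedback messages such as TMMBR, so it can reconfigure its encoder accordingly.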
While this is the “industry standard”, there are a few soft spots with this solution that I want to point out:
- The amount of packet loss that triggers re-estimation of the available bandwidth is based on a heuristic. It might not be the best one, and it might not be able to distinguish between congestion-type and corruption-type packet losses (bandwidth estimation should react to congestion).
- This solution is aggressive. As long as there is no packet loss (or very little of it), we won’t reduce the bitrate. In a way, this resembles the bufferbloat problem: we try to send as much data as possible without taking into consideration the internal queues of the switches and routers along the way, which in turn can increase the latency of the media.
- We reduce bandwidth only after congestion has already occurred and affected the video quality. This is too late, especially since we will be requesting I-frames in parallel (frames that take a lot of bandwidth to encode).
- The estimated bandwidth we converge on isn’t an accurate one. Estimating it from packet losses means we either reduce the bitrate aggressively, losing some of the bandwidth the network actually has available for us, or we need to reduce it more than once to converge on the available bandwidth, which lengthens the time until we get the video quality back to a reasonable level. The toy calculation below shows both effects.
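To put numbers on that last point, suppose the real available bandwidth is 1 Mbps, we are sending at 2.5 Mbps, and we cut the rate by 30% on each lossy report interval (all three values are assumptions picked for the example):

```python
# Toy illustration of slow convergence with multiplicative decrease.
# Assumed values: 1 Mbps of real capacity, a 2.5 Mbps send rate, and
# a fixed 30% cut per lossy RTCP report interval.

available = 1_000_000
rate = 2_500_000
intervals = 0
while rate > available:
    rate *= 0.7          # back off by 30%
    intervals += 1
print(intervals, rate)   # -> 3 intervals, ~857 kbps
```

It takes three report intervals of visible loss before we drop under the limit, and we end up roughly 14% below the capacity we could have used; pick a gentler back-off and the convergence takes even longer.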
So you see: there are things we can improve. The main one is being able to “know” the available bandwidth before congestion occurs. While we haven’t yet developed our prophetic module here at RADVISION, we did come pretty close when it comes to bandwidth estimation; but that will be the topic of my next post: how we estimate bandwidth based on network delays.