[This is part one of a series of posts about bandwidth estimation for IP-based video calling.]
- Part 1: NetSense 101: Why do we need bandwidth estimation? (this post)
- Part 2: NetSense 101: Packet-loss-based Bandwidth Estimation
- Part 3: NetSense 101: Delay-based Bandwidth Estimation
- Part 4: NetSense 101: Q&A
We’ve written in the past about bandwidth estimation, in the post I just linked to and elsewhere. But I think it’s time for a few more posts to explain a bit more about the rationale behind our bandwidth estimation mechanism, and this time I plan to start from the beginning: with the WHY question.
Video is a bandwidth hog. To send video in good quality you need a lot of bandwidth, usually upwards of 1 Mbps. And when video is encoded before being sent, a decision is made about how many bits to invest in each and every outgoing video frame. While we can always invest a lot, the question then becomes: will the bits I am investing in my encoded video make it safely to their destination?
You might say that having a broadband connection to the internet should be enough for me not to dwell on the issue. But that isn’t true: when I am working at home, there are multiple machines connected to the internet on that same connection, each doing its own thing, passively syncing my email inbox, Dropbox, Evernote and other useful cloud accounts, while my wife browses Facebook and my daughter watches YouTube songs. This means that effectively there’s less bandwidth available for my video call. And as if that weren’t enough, there is a lot of infrastructure (switches and routers) between me and my destination, infrastructure that also caters to other bandwidth-hungry users out there.
This boils down to a simple truth: while video calling requires both low latency and high bandwidth, both will fluctuate throughout my video call. I’ll have varying bandwidth and varying delays over that 30-minute call I am having.
This means that my video encoder, the part that takes the raw video from the camera and compresses it before sending it to the network, needs to be flexible enough to change the way it works: increasing or reducing the amount of data it generates based on the current network conditions.
What will happen if my video encoder ignores the network conditions and just keeps generating a 1 Mbps bitstream for a network that has only 512 Kbps available? The excess data piles up in router queues along the path until they overflow; the network gets congested, packets are lost, and that in turn kills any chance of reasonable video quality until the network conditions improve.
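To get a feel for how fast this goes wrong, here is a quick back-of-the-envelope calculation. The buffer size is an assumption I picked purely for illustration, not a measured value:

```python
# A back-of-the-envelope illustration of sending 1 Mbps into a 512 Kbps link.
# The buffer size below is an assumed value, just for illustration.

send_rate = 1_000_000    # bits per second the encoder generates
link_rate = 512_000      # bits per second the network can actually deliver
buffer_size = 500_000    # bits of queueing the bottleneck router can absorb (assumed)

excess_per_second = send_rate - link_rate           # 488,000 extra bits every second
seconds_until_overflow = buffer_size / excess_per_second

print(f"Queue overflows after ~{seconds_until_overflow:.1f} seconds; packets are dropped from then on")
# And every bit sitting in that queue also adds delay, so latency climbs even before the losses start.
```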
My encoder needs a way to estimate the current network conditions (that is, to estimate the available bandwidth), and from that decide how much bitrate it can use and encode accordingly, reducing or increasing quality, resolution, or frame rate to hit that target.
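To make that loop concrete, here is a minimal sketch in Python. None of the names here (EncoderSettings, adapt, the headroom and bitrate constants) come from a real encoder API; they are placeholders showing how a bandwidth estimate could drive the encoder’s target bitrate, resolution and frame rate:

```python
from dataclasses import dataclass

HEADROOM = 0.9               # use only ~90% of the estimate, leaving room for audio and overhead
MIN_VIDEO_BITRATE = 150_000  # bps; below this, reduce resolution/frame rate rather than quality

@dataclass
class EncoderSettings:
    target_bitrate: int      # bits per second the encoder may generate
    width: int
    height: int
    frame_rate: int

def adapt(settings: EncoderSettings, estimated_bandwidth: int) -> EncoderSettings:
    """Derive new encoder settings from the latest bandwidth estimate (bps)."""
    target = int(estimated_bandwidth * HEADROOM)

    # If the network can't sustain even a minimal bitrate at the current
    # resolution and frame rate, scale those down rather than starving every frame.
    if target < MIN_VIDEO_BITRATE:
        settings.width //= 2
        settings.height //= 2
        settings.frame_rate = max(10, settings.frame_rate // 2)

    settings.target_bitrate = max(target, 50_000)  # keep some floor so video doesn't stop entirely
    return settings

# Example: the estimator (the subject of the next posts) reports 512 Kbps available,
# so the encoder is told to generate roughly 460 Kbps instead of 1 Mbps.
settings = EncoderSettings(target_bitrate=1_000_000, width=1280, height=720, frame_rate=30)
print(adapt(settings, estimated_bandwidth=512_000).target_bitrate)  # 460800
```

The headroom and the minimum bitrate here are design choices: leaving a margin below the estimate reduces the chance of re-congesting the link, and dropping resolution or frame rate at very low bitrates usually looks better than smearing too few bits over a full-resolution frame.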
There are several ways of estimating that bandwidth; the most common ones are based on packet loss. I’ll touch on these techniques in my next post.