[Ever since we started discussing desktop video conferencing as a valid option for providing the entire work force with video conferencing capabilities, including telecommuters and employees that are out of the office, I've been getting a lot of questions regarding this new and ground-breaking concept.
There's lots of confusion around this, and people are pretty much preaching what they're selling: complicated SVC codecs, fancy HD cameras, state-of-the-art next generation CPUs.
Just recently Scott Wharton, CEO of Vidtel Inc., suggested we recommend the "perfect machine" for a "soft" client implementation - the right processor, graphics card, external webcam, etc.
And so I've asked Vincent Chavy, a guy that needs no introduction here, to use his vast experience, as a long-time veteran in the desktop conferencing industry, and give the expert advice on this one.]
It Is All About The Processor…
Well, sure thing, having a decent processor is very important, but why? And at what point does it stop being important? Is a 16-core processor (as shown in the image below) going to give you 1080p at 120 fps?
First of all, the process of encoding video costs more than the process of decoding video. A fair estimation is that it takes 4 times more CPU to encode a video than to decode it, at equivalent resolution, of course.
Multi-core processor machines are interesting, because the same cache is shared between multiple cores. The encoding process can therefore be shared between multiple cores in a more optimized fashion. Read: you can encode more! So, in theory, the more cores the better, but there is of course a practical limit.
If you want a machine to encode and decode High Definition 720p at 30 fps, a machine with an Intel Core i5 or i7 processor will do fine. If you are more reasonable, and can settle for sending VGA at 30 fps and receiving High Definition 720p at 30 fps, a machine with an Intel Core 2 Duo 2.4 GHz will be just perfect (note that I said Core 2 Duo and not Duo Core). In these recommendations, I of course take into account the fact that users want to keep some engine power for those other minor things, like browsing the web, checking email, and instant messaging with others while participating in a meeting.
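To get a feel for why sending VGA instead of 720p frees up so much CPU, here is a back-of-the-envelope sketch. It uses the 4:1 encode-to-decode estimate from above and assumes, purely for illustration, that codec cost grows linearly with the number of pixels processed per second. That linear model, and the function and constant names, are my assumptions and not measurements from any real codec.

```python
# Rough relative CPU cost of a call, in units of "decoding one 720p30 stream".
# ASSUMPTION: cost scales linearly with pixels per second (a simplification).
ENCODE_TO_DECODE_RATIO = 4  # the rough estimate quoted in the text


def pixel_rate(width, height, fps):
    """Pixels processed per second for a given format."""
    return width * height * fps


def relative_cpu_cost(send, receive, reference=(1280, 720, 30)):
    """Encode cost of `send` plus decode cost of `receive`, relative to
    decoding one stream in the `reference` format."""
    ref = pixel_rate(*reference)
    encode = ENCODE_TO_DECODE_RATIO * pixel_rate(*send) / ref
    decode = pixel_rate(*receive) / ref
    return encode + decode


HD_720P30 = (1280, 720, 30)
VGA_30 = (640, 480, 30)

full_hd_call = relative_cpu_cost(send=HD_720P30, receive=HD_720P30)
vga_up_call = relative_cpu_cost(send=VGA_30, receive=HD_720P30)
print(f"send 720p, receive 720p: {full_hd_call:.2f} units")
print(f"send VGA,  receive 720p: {vga_up_call:.2f} units")
```

Under these assumptions the VGA-up call costs less than half of the full 720p call, which is in line with the more modest processor recommendation for that scenario.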
You may notice that I recommend Intel’s processors here. That’s because our video codecs are highly optimized for the Intel platform, leveraging the low-level Intel libraries which provide the best performance for those processor-intensive operations.
Some stats, including CPU Usage, from our SCOPIA Desktop client in a 720p@30 call.
Come On, It’s all about the camera…
You can now get an extremely good USB camera, with amazing specifications, for a very decent price. The good old days of 160×120 at 5 fps are happily gone (see below for an amazing blast from the past). Today is the era of 720p at 30 fps… and more!
So yes, the camera is very important. And the driver – very important too. And the settings of the camera are even more important.
A blast from the good old CU-SeeMe days.
There are 2 variables that are critical when testing and selecting a camera: capture size (and aspect ratio) and frames per second (fps).
Most of the cameras on the market will capture up to VGA at 30 fps. This is the case for almost all the cameras embedded in laptops and desktop screens. There are a few USB cameras that “claim” to capture 720p at 30 fps.
So, why is it so hard to do 720p at 30 fps when you can now get a cheap camcorder doing 1080p? Well, the big limiting factor for USB camera vendors is the USB Bus speed, which is limited.
Classic mistakes (user errors, if you prefer) include using a low-speed USB hub or plugging your camera into an old USB 1.0 port. If you want to get the very best out of your high-performance camera, make sure it is plugged into a high-speed USB 2.0 port, directly on your computer. Oh, and of course, all USB devices usually share the same bus. So if you wondered why your camera is slow when you are doing your weekly backup to your USB drive… Well, you now know why.
Let’s do the math: an image of 720p (that is, 1280 by 720 pixels) is composed of 921,600 pixels. A pixel is usually coded using 24 bits, so capturing 720p at 30 frames per second means 921,600 * 30 * 24 = 663,552,000 bits per second, or roughly 663.6 Mbps. And guess what the maximum speed of a USB 2.0 bus is? 480 Mbps. :(
To overcome this limitation most camera vendors are now compressing the image before transmitting it from the camera to the PC. So the video stream acquired from the camera is now a compressed stream instead of a raw stream as it was before, and compression introduces some loss in quality.
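The arithmetic is easy to replay for other formats. The sketch below recomputes the raw bitrate for 720p and VGA and the minimum compression ratio a camera would need to squeeze 720p at 30 fps through USB 2.0. The function names are mine, and the 480 Mbps figure is the nominal bus speed quoted above; the throughput a camera actually gets on a shared bus is lower, so treat the ratio as a floor rather than a spec.

```python
# Raw (uncompressed) video bitrate versus the nominal USB 2.0 bus speed.
USB2_NOMINAL_BPS = 480_000_000  # nominal figure from the text, not real-world throughput


def raw_bitrate_bps(width, height, fps, bits_per_pixel=24):
    """Bits per second needed to move uncompressed frames."""
    return width * height * fps * bits_per_pixel


def min_compression_ratio(bitrate_bps, link_bps=USB2_NOMINAL_BPS):
    """Smallest compression ratio that makes the stream fit the link."""
    return bitrate_bps / link_bps


hd_bps = raw_bitrate_bps(1280, 720, 30)
vga_bps = raw_bitrate_bps(640, 480, 30)

print(f"720p at 30 fps, raw: {hd_bps / 1e6:.1f} Mbps")
print(f"VGA at 30 fps, raw:  {vga_bps / 1e6:.1f} Mbps")
print(f"720p needs at least {min_compression_ratio(hd_bps):.2f}:1 compression")
```

Raw VGA fits comfortably under the nominal limit, which is consistent with most cameras topping out at VGA 30 fps, while raw 720p does not fit at all.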
Fancy things on the camera also affect the frame rate. This includes digital zoom, automatic light correction, as well as all the funky moustaches and hats you can digitally add to your image. Capturing and sending 720p at 30 frames per second is a serious business. One parameter off, and the punishment (either on image size or frames per second) will be immediate. So we recommend turning off all settings like face tracking, automatic adjustments in low light environments as well as any automatic configuration (auto focus, auto white balance) that makes the video pulsate. In short, any configuration with the word “automatic” in it is suspicious.
One last warning – as mentioned above, the camera driver is equally important. So make sure you always have the latest and greatest from your camera vendor.
To be complete, I should mention that there is an alternative to USB Cameras. You can buy a video acquisition card, and use your camcorder or any other compatible PTZ camera. Although this is something that has some value, I believe that for the cost it implies, you will be better served by a solution like the SCOPIA VC240.
Hey, Don’t Forget The Graphic Card…
The graphics card and the graphics card driver are key to the video rendering (display of the video) process. A weak graphics card or an outdated graphics card driver will result in high CPU usage when rendering video – and well, these are precious CPU cycles that will not be available for the video codec.
Since it is now rare to find a bad graphics card, the one you have in your computer should be fine. It is however a good idea to verify that your drivers are up-to-date.
What is going to be very interesting is to see the role of graphics cards in the future of personal computing, especially their GPUs (Graphics Processing Units). With technologies, libraries and frameworks like OpenCL, Microsoft Media Foundation, ATI Stream or NVIDIA CUDA, a lot of the video processing or pre-processing may be handled by the graphics card, meaning that the GPU will be helping the CPU a lot more than it does today.
OK, Then – What IS The Perfect Desktop Video-Conferencing Machine?
Well, if you are ready to settle for sending VGA at 30 fps and receiving 720p at 30 fps, a good laptop (Core 2 Duo 2.4 GHz or better) will do just fine. The camera embedded in your laptop is most likely good enough as well. A lot of my colleagues are using their Dell XPS laptops, with the embedded camera, and have a great videoconferencing experience.
If you want more, then you need to aim for an i5 or i7 processor machine with Windows 7 and a best-of-breed camera (Logitech 9000 or Microsoft LifeCam Cinema HD).
To end with a personal note, I am attending almost all my meetings from my MacBook Pro, and I would not trade it for any other machine. I am more than happy with sending VGA, and I have to say, that – sometimes – I would even prefer sending less… like CIF, god forgive!
I was never told “you need to shave” until high definition arrived. I was never told “gosh you look tired” in the CIF days. Remember, with desktop video conferencing, you are right in front of your camera, so your audience WILL see that you were partying last night. But that’s an issue for a different kind of post…
More than a year ago I wrote a post on “affordable telepresence” and the failing economy. You see, as much as the economy is regarded as a driving force towards visual communications adoption, it is quite a struggle to believe that a $300K system is a viable solution in the current economic state (and in general).
At around the same time as that post, I wrote about what telepresence is and what it is not. In a nutshell, ever since Cisco announced their Telepresence system (end of 2006), Telepresence has been distanced from “plain old” video conferencing as much as possible. The PR basically argued that video conferencing failed, and telepresence is the next big thing.
And the PR worked. By the end of 2007 most video conferencing vendors re-branded their high-end systems as “telepresence”, totally blurring the definition of telepresence. Then came “affordable telepresence”, “personal telepresence” (previously known as “executive systems”), “home telepresence”, whatever.
Personal telepresence with granddad?!
Telepresence – Out, Personal Telepresence – In
Wainhouse Research recently released a research note titled “The Telepresence Vanishing Act”, arguing that telepresence as a market segment is disappearing, as it is not a real product or a product category, but a different “experience”. As the “standard” video conferencing systems, the room systems, are evolving rapidly towards 1080p video and big displays, as well as paying much more attention to lighting, acoustics and camera issues, the room system experience is converging with telepresence.
And if telepresence is no longer a big hype, and if the price ain’t right, there’s no surprise that at the end of 2009 and the beginning of 2010 most vendors have rolled out low-cost telepresence-like systems. Cisco and LifeSize were probably the pioneers, back in 2008, with the LifeSize Room 200 and the Cisco Telepresence 500, delivering an impressive feature set with a reasonable price tag (The Room 200 was priced at $16,999). It was/is not telepresence, but a rather nice-looking HD video conferencing system. But go argue with marketers.
And with that “market it now, worry about the results later” spirit, and with price becoming an issue and telepresence losing its clear definition in the market, came the latest hype: “software telepresence”. Yes, there’s no mistake here – I’m talking about a software client, running on some kind of personal computer, marketed as “telepresence”.
You know by now I’m a big fan of desktop conferencing and collaboration infrastructure. After all, we are very proud of our software client. But sorry – desktop conferencing is NOT telepresence. And while it’s great to take a shot at Cisco and compare the low price tag of such systems to that of Telepresence, it is like comparing cherries to watermelons…
The Real Deal: Compatibility and Interoperability
Take, for instance, the latest release from Vidyo, the software video conferencing start-up. They recently announced the VidyoRoom HD-220, with a price tag of $6,995, claiming it can replace the $250,000 system that Cisco sells.
Dave Greenfield from ZDNet’s Team Think points to the real deal here – it’s not the 1080p or the 60 fps that counts; it’s compatibility:
“What’s still needed is a way to coordinate all of the different high-end video system. It’s not just a matter of supporting the H.323 video either…
Then there’s having to coordinate all of the other components that are possible in a video conference – screen display, acoustical mapping, screen display and the like. Vendors have different ways of implementing and then managing these exchanges.”
And, of course, there’s interoperability.
“All very nice, but the big issue here is compatibility. A video system that connects with one or two other offices is far less useful than one that interoperates with every webcam on every desk…”
And if I may be blunt here, there’s also the issue of the overall design and the logic behind the whole thing. These “software-based” solutions look just like they sound: a weird-looking server, with external components that don’t really fit the over-all design.
So while the server in the picture above looks ok, and has an appealing price tag, you may notice it misses some basic components. You know – the microphone, the camera, the data sharing cable. Yep, you’re expected to shop for the separate external components on your own. It’s just what Tsahi warned us about in a recent post – peripherals are a big headache.
Why Cheap Costs More
Just imagine going to a car dealership, and being offered a car with a great price tag, only it misses a few simple external components – wheels, gear, engine. But you know – any wheel will work here, and we support a variety of gears, and everyone sells engines these days. Would you buy such a car at that dealership?
Impressive as the price tag is, when you add the few necessities, such as an HD camera, a high quality microphone system, high quality speakers and a VGA connector (at least), such a video solution can cost more than $30,000. Or at least that’s what Dave Greenfield says.
And the bottom line: it doesn’t just feel like a mess, it also looks like one. Just compare the “complete” server-based solution with the slick Lifesize Express 220, that has everything built-in for less than $7000, and you’ll get my drift:
There’s an old Jewish saying claiming that “cheap costs more most of the time”. And while I strongly believe in affordable solutions, when it comes to a room system you better choose a room system grade solution, and not something that comes near and is marketed as such.
A software client is a software client, a room system is a room system, a telepresence system is a telepresence system – they each have their own characteristics, their own benefits, their own drawbacks. So don’t let the marketing people confuse you. Understand what you need, and choose appropriately.
“There are many methods for predicting the future. For example, you can read horoscopes, tea leaves, tarot cards or crystal balls. Collectively these methods are known as “nutty”. Or you can put well-researched facts into sophisticated computer models, more commonly referred to as “a complete waste of time”” - Scott Adams
As 2010 begins, it’s predictions time again, when everyone who’s writing anywhere must give their predictions for the up-coming year. I will not disclose here my methods for predicting the future, but will share with you what I commonly refer to as “my educated guess”:
1. Video Calling and Video Conferencing Will Merge Into Visual Communications
Video calling is getting more widespread and generating more buzz, especially now that Skype is making a lot of noise at CES. On the other hand, on the enterprise front, the video conferencing market is still quite secluded from the IM/video chat hype. In 2010 I suspect that these two islands will finally meet and merge, and the result will be a better, more complete experience – the one we like to call visual communications.
2. Video Calling Will Be In Everybody’s Homes (And Not Just Their Desktops)
To continue from the last prediction, video calling on the desktop is becoming something everyone is using, thanks to Skype, Google and their likes. But 2010 seems to be the turning-point where video will leave the desktop and move into the living room, the bedroom, the kitchen, the restroom, you name it. Calling someone using video would be just like calling him on the phone, only better.
3. IT Will Have Their Hands Full With Bandwidth Issues
As video calling becomes popular and visual communications becomes a reality, IT managers will have to deal with bandwidth issues in their enterprise networks. Bandwidth management – at the endpoint level, at the bridge level and at the management level – will become a must, if visual communications is to be successfully deployed across the entire organization.
4. Everyone Will Have Their iPhone App
Video conferencing? Video calling? Video anything? – there’s an app for that. 2010 will be the year where every player in the market will come out with their own iPhone app. What will these apps do? We’ll have to wait and see. But in 2010 the business reality will be: you have an iPhone app, therefore you are.
5. Scalable Video Coding (SVC) Will Not Become The De-Facto Standard
SVC is great. SVC tools do wonders to video conferencing. But 2010 will not be the year that the market will switch to SVC-based solutions. SVC-based endpoints and SVC silos will be deployed, but the majority of enterprises will use either non-SVC solutions or a hybrid of SVC and non-SVC.
Bonus: A Bunch Of Stuff That Won’t Happen This Year
B2B video conferencing will not be solved. 1080p will not become the de-facto resolution. Cloud-based video conferencing will not become popular. Social media will not penetrate the enterprise. Mobile video conferencing will not happen. The iPhone will not have a front-facing camera (yet…). Software-only MCUs will not become popular. People will still not reach an understanding about what telepresence is.
[I've been asked often, during the past two years, what's next on the video coding front. Some people are asking about H.265, the natural heir of the current king H.264; others are just wondering where we're going. To be honest, even for a video coding guy keeping up with the latest trends and turns in the video coding world is a complicated task.
But the "next generation" of video coding is a very interesting topic, not to mention very relevant for this blog. So I asked Christian Timmerer, who I have been following on-line via his excellent blog and twitter, to try and explain where we're going and what's next in the video coding world.]
The xVC Era
VC stands for many things these days – Venture Capital and Video Conferencing to name a few. But VC in the multimedia community means Video Coding, the science behind the many uses of video in our everyday lives.
In a keynote speech at ICUIMC’08 Fernando Pereira asked “is there a xVC virus?”. He referred to the “era of xVC” that conquered the mainstream multimedia community, where “x” stands for A, S, M and many other letters, as I will briefly discuss here.
First there was AVC, the Advanced Video Coding standard, also known as ITU-T H.264 and MPEG-4 Part 10. Its first version was published in 2003, providing 50% more compression efficiency compared to previous video coding standards (or in other words: providing the same quality using roughly half the bits).
In 2007 came SVC, the Scalable Video Coding extension to AVC, enabling one to code multiple versions of the same video (in terms of resolution, frame rate and bit rate) using one bitstream, while keeping the overhead at a reasonable level.
Next we have MVC, a Multi-view Video Coding extension, which enables efficient coding of 3D video using multiple viewpoints and directions to create a depth impression of a scene and an interactive selection of views within a certain range.
There are actually more xVC standards, such as Reconfigurable Video Coding (RVC) and Distributed Video Coding (DVC), but these are quite hardcore… I would like to focus on the efforts towards a “mainstream” next generation coding standard, one that would enable higher resolutions, higher frame rates and higher quality, without losing coding efficiency (and possibly keeping the 50% improvement compared to previous standards). These efforts are currently called HVC – High-performance Video Coding.
An introduction to HVC
The main purpose of HVC is to come up with a new generation of video compression technology, that enables substantially higher compression capability than the existing AVC standard. The activity started with a vision to address the next generation of ultra-HD (UHD) devices (displays and cameras) already appearing on the horizon, while providing better support for mobile terminals, where the video quality at low resolutions, frame rates, and bit rates today is largely unacceptable.
Ultra HD vs. High Definition resolution comparison
Therefore, ISO/IEC MPEG and ITU-T SG16 Q6/16, the leading standardization bodies in this domain (and yes – they’ve renewed their collaboration), are currently pursuing a Call for Proposals, which will be finalized in January 2010, with responses evaluated in April 2010. The HVC requirements call for better compression performance over:
Higher picture formats (potentially from QVGA to 8Kx4K, or UHD)
while maintaining low delay, error resiliency, scalability and more.
The competitive phase of the standardization process (i.e., from the starting point until a Committee Draft is reached) has a more or less detailed timeline defining how/when to register/submit responses to the call and how they are going to be evaluated.
For evaluation purposes, a set of video sequences has been defined, covering a range of specifications (and uses), from 2560×1600 sequences of street cameras (Class A) to various 416×240 clips (Class D) to 720p@60fps streams (Class E).
It is probably worth noting that submissions to Class A will be evaluated based on PSNR and rate only, whereas submissions for other classes will be evaluated by means of a formal subjective assessment as well. The reason for doing so is that subjective tests are quite expensive. The tests will be conducted in different institutes, such as FUB, EBU and EPFL.
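For readers who have not met PSNR before: it is an objective measure derived from the mean squared error between the original and the decoded picture, expressed in decibels. The minimal sketch below computes it over a flat list of 8-bit sample values. The sample values are made up for illustration, and the evaluation procedure actually used for the call (which planes, which averaging over frames) is not described here, so this is only the textbook definition.

```python
import math


def psnr(reference, decoded, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized
    sequences of sample values (max_value=255 assumes 8-bit samples)."""
    if not reference or len(reference) != len(decoded):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion at all
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit luma samples, invented for this example only.
original = [52, 55, 61, 66, 70, 61, 64, 73]
decoded = [54, 55, 60, 66, 69, 62, 64, 75]

print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

A higher PSNR at the same bit rate means less distortion, which is why PSNR is usually reported together with rate. It is cheap to compute but does not always match what viewers perceive, hence the subjective tests for the other classes.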
A New Standard Is Coming… Stay Tuned!
The HVC efforts have already begun. Based on the current timeline, one can expect the new standard to be available around the end of 2012 or the beginning of 2013. This may seem far away, but as many video infrastructure products have a two-year design process, this is very relevant to today’s design efforts.
And so we are looking forward to a new and very exciting xVC episode, and it will be very interesting to see how the new standard evolves and whether it will fit today’s expectations. Thus, stay tuned!
[Christian Timmerer is an Assistant Professor at the Department of Information Technology (ITEC) - Multimedia Communication Group. His research topics include multimedia content transport, multimedia adaptation in constrained and streaming environments and Quality of Experience. He has published more than 50 papers in these areas, and is an editorial board member of the Encyclopedia of Multimedia, the ACM/Springer International Journal on Multimedia Tools and Applications, and associate editor for IEEE Computer Society Computing Now.]
A few weeks ago we held our annual conference in Tel-Aviv. This year the headline was – surprise, surprise! – Unified Communications. During the day my division, the Networking Business Unit (NBU), focused on exposing the local crowd, executives and IT managers from leading enterprises and organizations, to the latest trends in IP communication and collaboration.
Other than some very interesting presentations given by my colleagues and myself about unified communications, desktop collaboration, video conferencing and video technologies, there were a few guest speakers who I found very interesting as well as thought provoking.
The conference kicked off with two success stories of video conference implementations: Mr. Guilaume Boudin, VP Advanced Services at Orange (France), discussed Open Videopresence, an end-to-end, fully managed video meeting service “as simple as a phone call”; Mr. Roni Shlovsky, Head of Communication Infrastructure at Bank Leumi, the biggest bank in Israel, discussed Leumi Digital, a project I already wrote about here. Both, of course, powered by RADVISION.
But the most interesting discussion for me was a panel (see above) hosted by Moshe Machline, our VP Corporate Marketing, featuring IT managers from different organizations that implemented video conferencing infrastructure in the last few years. Panel members discussed freely the ins and outs of video conferencing deployment, including things they struggled with, things they are proud of and the big value it brings to their organizations.
The interesting thing about the panel was that, during the conference breaks, while mingling with the crowd, explaining about our technology and hearing the conversation around the demo floor, I basically heard the same things that were discussed in the panel.
To sum things up, I think that as the business arena gets more and more complex, with and without regard to the economic crisis, organizations and employees struggle with a few basic problems:
How can you achieve “more” with “less”?
How can you move faster?
How can you handle the growing amounts of information?
How can you balance work load and personal life?
How can you work better with your customers?
How can you work better with your partners and suppliers?
How can you integrate new technologies and tools into the work place?
Of course, all of these have business and financial implications on the organizations and the economy in general:
Budgets are being cut – we need to stretch them to the max
Markets are moving fast – we need to keep up with them
There’s too much information – we can miss out on opportunities
Employees worry about their personal life – we need to allow them to work more flexibly
Customer satisfaction is extremely important – we need to care for our customers
Partners and suppliers are important – we need to keep close relationships with them
New technologies can offer great benefits – we need to utilize them as much as possible
The Collaboration Quadrant
There’s a quote I like, ever since I saw it in some Cisco brochure, taken from The McKinsey Quarterly Review, 2005 #4, titled “The Next Revolution in Interactions“:
“Raising the productivity of employees whose jobs can’t be automated is the next great performance challenge – and the stakes are high.”
In my view, this sums up beautifully all of the questions above and their implications. However, it seems that not too much has changed since 2005. In fact, while Tsahi Levent-Levi gave his keynote presentation on the communication continuum, which I briefly discussed here, I tried to think about the communication means that are common in the workplace vs. what is available, according to a similar paradigm:
As you can see, I’ve put on the horizontal axis the level of experience that these means offer – text-only on the left side, multimedia (voice, video) on the right. On the vertical axis I’ve put the reach of these means – from near (individuals, mostly fellow employees) to far (large target audience, both inside and outside the organization).
It is quite clear that we moved from a text-only, one-to-one, near communication – “call”, to a multi-media, many-to-many, far communication – “collaboration”, across what I can now call the Collaboration Quadrant.
The differences between “call” and “collaboration” are quite clear:
A call has a single source and is usually synchronous. Collaboration can have multiple sources, and be both real-time and non real-time.
A call usually connects parties from within the organization. Collaboration connects people from dispersed locations, but also organizations.
A call is done within a static, pre-defined network. Collaboration is dynamic.
A call allows you the option to find the information you need (via the people you need). Collaboration makes certain that the people and resources will be available.
Moving to a “Collaboration” Way of Thinking
Moving from a “call” way of thinking to a “collaboration” way of thinking is key for the success of the modern organization. To do so, one must go past the traditional means of communication, such as e-mail, telephone and IM, and open up to the means that occupy the top 2 quadrants: multimedia-based, social means, such as video conferencing, blogs, unified communication, etc.
These “new” technologies are “unpopular”. The reason for it is that as they are new, both to users and IT managers, it is unclear to both what their contribution is (ROI) and what it takes to deploy them properly in the organization. While IM and e-mail have become popular both inside and outside the organization, video-based and social-based services have yet to win both, and so are left out.
Seeing is Believing – a global conference during the Summit.
An example of that is a question asked by the audience during the panel: “why use video for calls that connect, for instance, two meeting rooms in different geographical locations?” For people from the industry, this is a trivial question: video makes these kinds of meetings work. For people in the audience, not using video, this is indeed a valid question – they don’t see the benefit, and so they don’t see why they should make the effort.
But what I liked was the answers given by the panel members themselves – users, IT managers, people who have deployed video conferencing within their organizations and can testify to the benefits it brings. Some of the answers given were:
“Seeing is believing. You build trust much quicker and much easier using video”.
“With video you know who’s talking” (as meeting rooms hold more than one participant)
“Video makes the meeting effective. People stay focused”
“Video makes everyone feel connected”
And there you have it – with a few real answers to a serious question, you can easily see how a “new” technology can make an existing work process much more productive, if you just jump into the water and try.
To take from McKinsey, the next revolution in communications is collaboration. The stakes are high, and you better be ready.
[This post is based on a post written by Romi Mikulinsky and me, published - in Hebrew - in the popular HolesInTheNet blog]
We live in crazy times, I tell you. Everywhere you go, someone tells you that you have to be more “social” – use social media, connect via social networks, have a social strategy, yada social yada. And it seems that the more “social” we get, the less social we actually are, as we spend most of our time in front of a screen (and I don’t care which of the four it is…).
But you can’t escape it – social networks have won us over. If you’re not there, you might as well not exist. And in case you’re a late adopter to all of this, here’s a short recap: early social networks, known then as “online communities”, started forming around 1994. Geocities (RIP) and Tripod were probably the best known community websites. Then, between 2002 and 2004, the “social networks” emerged, with Friendster leading the way, and MySpace, Bebo and Facebook following behind.
As social networks became popular, “specific” social networks started appearing, catering for specific, specialized “common grounds”: LinkedIn connects you to your professional “friends” (colleagues, fellow and former co-workers); Classmates connects you to your old classmates; Musicians and artists can be found on MySpace, etc. Even large organizations started creating their own social networks, for instance IBM.
The Pros and Cons of “Specific” Social Networks
The “specific” social networks offer us defined, bordered content. As Gal Mor, chief editor of HolesInTheNet, wisely notes [Hebrew. Sorry!], “specific” social networks are actually not competing, and shouldn’t compete, with other networks, “specific” and non-“specific”. They offer us clear and simple pros:
It is clear what data they hold and which people are members
It is therefore simple to find information, as the content and borders are well defined
On the other hand, as the content in each network is limited, we find ourselves using more and more social networks, and their numbers are increasing on a daily basis. Being a member of several social networks raises a whole new set of issues:
How does one connect to people in each of the different networks?
How does one manage their “split personalities” over different networks?
If I want to upload pictures, will I use Flickr (a “specific” service), Facebook or Twitter?
If I am updating my status, should I update it across networks?
When I am looking for information, where do I begin?
a visualization of my personal Facebook network (via TouchGraph)
Social Networks, Meet Unified Communication
All this really reminds me of the “more is less” debate regarding our communication means, especially in the enterprise. We have many choices, each with its own set of characteristics and its place on the communication continuum. But after we learned to master each one, with its pros and cons, we realized that indeed more is less, unless they are unified.
Unified communications is aimed at reducing the “communication latency“, that negative effect on our effectiveness that is caused by having to deal with too many means of communications. By using one platform, with one interface, to access all those means, either explicitly or implicitly, that latency is reduced if not eliminated.
When I receive my incoming calls – video, voice, IM – using one application; When I can check my voice and video mail, chat history, e-mails from there; When I can reach someone – by e-mail, phone, whatever – from that same application, I can spend my time on real work instead of switching between applications and playing that old “cat and mouse” game.
And the same goes for social networks. The “secret” of connecting the interfaces, even if the connection is not “unified” but limited, is slowly but surely spreading around the social arena. You can update your status in Twitter, and automatically update your facebook and LinkedIn status as well. You can upload an image to flickr, and share it on facebook automatically. I am using the blip.fm integration with Twitter to “dedicate” songs to my friends using YouTube and other streaming services. This is not only cool, but also effective, and it increases productivity. Not to mention that it helps you handle your social network fatigue.
Social Networks, meet Unified Communication
And it’s no coincidence I have mentioned Twitter in all of my examples above. It seems that in the present battlefield around the “one platform”, the “one interface” that will unite all those social networks, Twitter is winning on many fronts. Almost without trying Twitter has become, in a weird evolution that I think its creators didn’t predict, the center of information for many of us.
We have dumped our RSS readers, stopped forwarding e-mails, quit the forums and chat rooms, and are focusing on Twitter more and more for sharing information and links. Why follow a bunch of blogs, when you can follow the bloggers themselves? Why spend time in various social networks, when Twitter has the interesting links to them as well? You just follow your friends and interests on Twitter, and turn your timeline into that ultimate unified social network.
Ultimate? Well, not really. Information on Twitter gets lost too quickly. “Walled gardens” are still impenetrable, even with links from Twitter into them. The massive amounts of information make us miss out on important things too often. And yet, until there’s a better service, or social network, that will offer a better integration – one network to rule them all – Twitter is the only sane option to stay social, or “social”, and still have a life.
2009 has been quite an interesting year – to the world in general, to the tech community, to the video conferencing market and to this blog specifically.
I thought that celebrating a new year is a great excuse to visit the posts I liked best in 2009, so here’s a recap:
January
If iPhone wants to be the future, I argued in January, it sure is lagging behind. And I was referring to the lack of a front facing camera in the iPhone 2.0.
A year later, nothing’s really changed, and although some creative minds are trying to bridge the gap, the result is still far from pleasing, and you can only blame Apple for that.
This year, with the recent acquisitions of LifeSize and Tandberg, proved that video conferencing endpoints are no longer nice to have, but here to stay.
March
In March I was trying to compare 2009 and 2003 in terms of the state of the economy and video conferencing. As I joined RADVISION in 2003, and as the Dot-com bubble burst just before, it was amazing to see how close 2009 was to 2003 in terms of the current status and the predictions we all make.
This led to a post called “2009 Reality Check“, which I personally really liked, as it gave everyone some stuff to think about.
The bottom line was that although we can’t beat Moore’s law, we can still provide customers with a better experience, and the way to do it is to harvest the power given to us by the new and exciting multi-core platforms.
May
I love Seth Godin. And in May I got my chance to write about one of Seth’s posts in my blog. The post “Video Conferencing – The Kind of Meeting That Works” used Seth’s definition of the “three kinds of meetings” in corporate culture and explained why video conferencing can make most work better.
June
In June RADVISION announced many new exciting versions of its video conferencing products at InfoComm09. In the “InfoComm 09 Round-Up” Bob Romano, the NBU VP Marketing, reviewed them, alongside videos from the show.
I especially liked the quote from one of the senior analysts at the show: “it’s nice to see RADVISION regaining its technology leadership, where they belong”.
July
In July an Air Tran commercial really made me laugh. Bottom line was that even if video conferencing is not going to replace face-to-face meetings, air travel will change and everybody knows it.
So I wrote my post about “Business Travel Without Moving” to say, again, that I see no logic whatsoever in the excessive amount of business travel done today. Not that it really helped.
August
In August a tweet by Roger Farnsworth got me travelling back through time to the mid 90s, when video conferencing started and the user experience was, well, crappy (as Tsahi likes to say).
In my post “The Curious Incident With The Post I Read In The Night Time” I tried to explain how many myths regarding video conferencing are no longer true, and how the experience today can definitely replace any other means of personal communication.
My answer: who cares?! Just as long as it’s affordable, and can be massively adopted.
October
It’s been a year filled with telepresence, but in RADVISION we are already taking a glimpse towards the future. “If Telepresence is the Present, 3DPresence is the Future” I wrote in October, giving some details about an exciting project that RADVISION takes part in – 3D Video Conferencing.
November
2009 was the year of the cloud. Or at least the year of the “talk about the cloud”. I really liked Dave Michels’ “Cloud Series” and decided to discuss cloud technology in general and cloud-based video conferencing in particular.
2009 was also the year everything went multi-touch. Suddenly everyone understands that UI is very important. But are touch-based interfaces the only way to go?
Coming Soon: Free Video Conferencing From Google. This was the headline of a recent ZDNet story by Garrett Rogers. Garrett based his prediction on an interview with Rishi Chandra, a Google Apps product manager, on SFGate. There, Mr. Chandra said that “launching a voice or video chat session should flow seamlessly within Gmail and mesh organically with the other Apps” and “should be embedded in the core experience across the application set”.
Google’s voice and video communication capabilities are limited to peer-to-peer communication, but Mr. Chandra says:
“This [current Gmail capability] is the first step in a much broader set of features we hope to roll out over the next six to 12 months around video [and voice] capabilities. It’s a great opportunity for us to push the space along.”
Google’s current video calling interface. Source: Google.
Today the Google video calling product is part of Gmail, meaning you need to have your browser open, pointing to the Gmail page, to make your presence available and receive incoming calls. You also have to specifically install the video chat capabilities. Furthermore, Google has not launched the multi-party conferencing ability yet. The result is that although Google holds the promise to become a potential market disrupter, we will have to wait for that promise to be fulfilled.
Google Video Conferencing Is Not Evil
But why waste good posts on negativity? This is Tsahi’s job. Instead, I would rather discuss how Google’s plans for further investment in video conferencing are an opportunity for the whole market, a much-needed step even. And I’ll explain.
I’ve been writing here a lot about a need for a change of mind. For video conferencing to become a relevant, viable means of communication, it has to be regarded as such by the public. The public, not just the video conferencing industry, IT managers or early adopters and tech savvy geeks. Cisco has been doing the industry a lot of good by intensely marketing the video conferencing concept, but its focus is on the high-end Telepresence.
Like it or not, technology adoption is not rational. SMS messages (texting), intended for Service Providers’ personnel, were adopted by young teens and became the next big thing, both in terms of participation and monetization. Instant messaging was regarded as a pastime a few years ago, but became so popular in our homes that it was adopted by every organization as a legitimate communication means. And I can go on with such examples for hours.
Companies like Skype and Microsoft, with their video-enabled IM clients, Nokia or Samsung, with their video-enabled handsets, have brought video to the masses, but not the masses to video. If Google is successful in changing that, making everyone use video calling as a natural means of engagement, like chatting or e-mail, the adoption of video conferencing elsewhere, especially in the corporate world, is just a matter of time.
Just take a look at the huge BANG Google has pulled off with its Google Wave introduction (much ado about nothing?). Now replace “Wave” with “Video Conferencing” and everyone who is anybody in our industry will have his mouth running…
Conferencing Gadget in Google Wave. Source: Technorati.
And Technical Opportunities Too…
For the peer-to-peer video calling Google is said to be using the technology licensed from Vidyo. Two years ago Google acquired Marratech to “enable from-the-desktop participation… in videoconference meetings”.
When Google says “free”, they actually mean “free as long as we can monetize it alternatively”, and Google is the expert in analyzing our personal information and redirecting it to advertisers. So how would Google be using Video Conferencing for this purpose? One can only speculate.
A simple, yet not trivial, solution would be to transcribe the conference, and use this information to display relevant ads during or after the conference. Speech recognition technologies have really advanced in recent years, and this is not as fictional as it may seem – they have been doing it for Google Video for a long time.
And what about displaying those ads? Will Google invest in new technologies of embedding text and images into the video, or will they use the “simpler” ply model? They have been investing in this front with online non-real-time video, with no real success up until this day as far as I can see, but it would be very interesting to see if this will change with real-time visual communications.
And what about the traits that brought Google fame and fortune?
Archiving and Retrieval – Will they store our conferences like they do with our e-mails and chats? Will they revolutionize the way we use video conferencing in that sense?
Search – Will they use their search technologies for audio and video? Will they have to come up with new stuff?
UI Design – Google is known for their slick-yet-simple product design. Will that change the way users use video conferencing?
And so you see – I fear not Google’s play in “my” space or any Google death ray. On the contrary – If Sergey and Eric join John as leading promoters of visual communications, the industry as a whole will benefit tremendously. Sure, the competition may be more intense, but I prefer more competition over a giant market than a limited one over a niche domain.
So I will await the Google Multi-party Video Conferencing. I am sure it will do no evil.
In a previous post I argued against the “One Internet”. I claimed that for video conferencing to work all access options to the network must be supported and the network itself has to be media-aware.
That being said, one might get the impression that I believe that only a dedicated video conferencing network, one that connects (at least) the entire enterprise and offers a “clean” environment for our precious means of communication, can provide a worthy quality of experience.
And it’s not that I’m against that type of solution. Whether it is ISDN, IP or MPLS based, the benefits are clear: a video network is permanent and always ready, the quality of service (QoS) is guaranteed, it is easy to handle and maintain, etc.
But if we are heading towards mass deployment of video conferencing infrastructure, dedicated networks are going to give IT managers and corporations a great big headache, as a dedicated network:
is hard to scale
is another network to administrate (on top of the existing IP networks)
is expensive (double the network – at least double the expenses)
“The clear advantages of converged networks are improved costs and IT resource productivity”.
But can a video conference network reside alongside other IP traffic, without risking quality and experience? Is the “converged network” a futuristic dream, or can you base your video conferencing network on your existing IP infrastructure? I decided to consult the local expert – Yossi Bronstein, AVP Corporate IT & IS at RADVISION, to learn how it is done in a global organization such as ours, with extensive IP-based communication deployment, including video conferencing (as the shoe maker doesn’t go bare-footed…)
The 1st thing Yossi did was show me the corporate network topology (see above). The 2nd thing Yossi did was emphasize that this is the corporate video network as well. RADVISION, says Yossi, has chosen to base its IT infrastructures on a converged network, due to the obvious reasons: costs and IT resources. Therefore, although RADVISION employees are using video conferencing much more than the average worker, all the IP traffic in RADVISION, including video conferencing, runs on the same network.
Dos and Don’ts of a Converged Network
This, of course, does not mean that the video is treated like any IP data over the network. On the contrary – the secret to properly converge video with “other” IP data, says Yossi, is priorities. Priority should be given to any “important”, “sensitive” data that goes over the network, but with video this is crucial.
In RADVISION, for instance, as can be seen from the network topology, every branch is connected through both the public Internet and dedicated MPLS lines. In general, the IP traffic uses the Internet, but at any given time, for any reason, it can be shifted to MPLS to guarantee proper transfer.
The same prioritization takes place in every aspect of the network. This can be achieved technically in many ways, such as:
The IP Type of Service (ToS) field can be used to request low delay, high throughput or high reliability from routers, with IP precedence marking each packet’s priority.
Differentiated services (diffserv) can be used to manage the traffic and provide different levels of QoS.
Traffic shaping can be used to optimize or guarantee performance, latency and bandwidth in the network.
These priorities are decided according to the traffic characteristics: port allocation, source and/or destination, data type, etc. As mail server synchronization is prioritized for bandwidth and reliability, so can – for instance – a video call originating from the CEO’s office. And video should get more bandwidth than audio. And video between branches (in a distributed MCU architecture) should get enough bandwidth for proper quality.
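To make the idea of "priorities according to traffic characteristics" a bit more concrete, here is a minimal Python sketch. It is not how RADVISION's network is configured; the `Flow` class, the `classify` rules and the choice of code points are my own illustrative assumptions. The DSCP values themselves (EF = 46, AF41 = 34, AF21 = 18) are the standard diffserv code points, and the DSCP occupies the upper six bits of the old ToS byte. Keep in mind that marking a packet only labels it: the routers along the path still have to be configured to honor the label.

```python
import socket
from dataclasses import dataclass

# Standard diffserv code points (6-bit DSCP values).
DSCP_EF = 46    # Expedited Forwarding, commonly used for voice
DSCP_AF41 = 34  # Assured Forwarding 41, commonly used for interactive video
DSCP_AF21 = 18  # Assured Forwarding 21, commonly used for transactional data
DSCP_BE = 0     # best effort (default)


@dataclass(frozen=True)
class Flow:
    """Hypothetical description of a traffic flow, for illustration only."""
    kind: str  # e.g. "audio", "video", "mail-sync"; anything else is best effort


def classify(flow: Flow) -> int:
    """Map a flow to a DSCP value. The rules are an assumed example policy."""
    if flow.kind == "audio":
        return DSCP_EF
    if flow.kind == "video":
        return DSCP_AF41
    if flow.kind == "mail-sync":
        return DSCP_AF21
    return DSCP_BE


def dscp_to_tos(dscp: int) -> int:
    """Place the 6-bit DSCP in the upper bits of the 8-bit ToS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def open_marked_udp_socket(flow: Flow) -> socket.socket:
    """Open a UDP socket whose outgoing packets carry the flow's DSCP mark.

    Relies on socket.IP_TOS, which is available on Linux and macOS but may
    be missing or ignored on other platforms.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(classify(flow)))
    return sock
```

A real policy would also look at ports, source and destination, as described above; the sketch keys on the data type only to stay short.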
And so, while every employee in RADVISION is capable of making a video call (via their personal IP Phone or SCOPIA Desktop) or a video conference (using their virtual meeting room), and although video traffic is substantial during work hours, Yossi says that on average external intervention is required only 3% of the time. Maybe this can explain how, for a big organization such as RADVISION, one system administrator deals with the entire global network.
And there you have it – you can have one network, accessible to all, via different means, protocols and infrastructures, and still enjoy a great experience for your visual communications, if you know how to set your priorities and not treat every packet as equal. And that is true, I thought to myself after my meeting with Yossi was over, not just for the RADVISION network or any corporate network, but for the Internet as a whole.
In a recent post on his blog, Tsahi discussed how the iPhone “changed the game” when it comes to product design. The best and most obvious example he gives is touch technology, which has become “the most coveted input technology”. In the recently held World Innovation Summit, Amichai Ben-David, CEO of N-trig, Israeli manufacturer of revolutionary multi-point touch screens, admitted that the unprecedented success of the iPhone drove everyone to understand that touch is the most intuitive form of input for users.
Demo clip of N-trig’s Duo-Sense Technology
But if touch is so intuitive, what about the rest of the senses? Can they be used as user interfaces?! I decided to consult the expert, in this case Dr. Romi Mikulinsky, whose dissertation, at the University of Toronto’s English Department, dealt with the way memory is affected by the transition of photographic images from medium to medium.
The Future Interface: A Sensory Experience
If you look at the way user interfaces (UI) have evolved, Romi says, from the first GUI all the way to the Minority Report-like interfaces that totally dominate today’s UI, it’s easy to imagine how future interfaces will look: haptic-based (touch screens, multi-touch interface), visual-based (controlled by eye movement), sound-based, brain implanted, using contact lenses, controlled by gestures, mind-controlled – the sky is the limit. The future interface will involve all of our senses, and will facilitate the way we engage with computers by creating a more intuitive, sensory experience.
These future interfaces will position us inside the data and will expand the way sensory experience is thought of today, as an inner, private experience. These interfaces will enable us to step into the data processing process/experience. Will they eventually help the digital world, or at least the Internet, become a prosthetic organ, an external device that alters, affects or supports our experiences of reality?
Sensory experiences allow us to create seamless interfaces that eliminate the distance between the “inside” and the “outside”, and close the gap between man and machine. Seamless interfaces bring us closer to artificial intelligence (AI) by using our intelligence and our senses to directly interact with machines. Nevertheless, these interfaces can become a mechanism that will enable us to learn more about ourselves and about other things in the world.
Playing Solitaire seamlessly
Sceptic about Haptic? Try To Listen!
Haptic interfaces are no longer science fiction, and it seems that users are willingly adopting them, possibly because they are very natural. In fact, operating a system by waving your hands, for instance, makes you forget you are using any interface, as it is almost totally seamless. No input/output, no clicking and typing, no necessary hardware or dedicated devices. After all, wouldn’t it be great if the digital post-it acts like a paper post-it?
But, as you know, we have more than just 2 senses. Our interaction with computers is already based on our eyes and our hands, seeing and touching. How about using other senses, not just sight and touch, as interfaces?
This will be especially beneficial for people with certain handicaps, who can’t use the existing interfaces. For instance, for a blind person, a hearing-based interface can make an otherwise unusable system easily accessible. This may sound complicated, but projects like Michal Rinott’s SonicTexting or the Tactile Explorer from Tactile World allow the visually impaired to easily access computer-based applications, which usually require the use of sight.
Can You Smell the Interface? Taste It!
If I were to suggest using scent or taste as an interface, you may say that I, well, lost my senses. But Romi believes they may be the next new seamless interfaces. Last year SHOWstudio launched a groundbreaking initiative for fragrance over the Internet. I, personally, have seen a demonstration of a motion picture with “fragrance support”, where you not only see and hear, but also smell the movie scenes.
And the same goes for taste. Or, at least, our tongue. And if you think I went too far, take a look at The Brain Port, a neural tongue interface which uses 144 micro-electrodes to transmit information through sensitive nerve fibres in our lingua:
Combine all of those senses together, and you can see what a real seamless interface might look like in the near future – a total sensory experience. One that would be able to transmit and/or replicate an entire experience. Just imagine what a website like synesthecity would be like, if the sight, sound, smell, taste and touch were transmitted over the Internet.
And, of course, don’t forget that elusive, most promising “Sixth sense“, which some refer to as extra-sensory perception, and some as our future personal connection to “the cloud”. If you haven’t seen THAT TED demo, I strongly urge you to do so now. After seeing what can be done with natural, seamless, sensory-based interfaces, there’s really no need for additional words.
And Meanwhile In The Real World…
I started off with touch technology, and I want to end with touch technology, but this time with its application for Video Conferencing. I’ve already written here about the Teliris TouchTable, which is very impressive. But as touch technology is no longer expensive and can be found everywhere, it can be utilized to upgrade your video conferencing experience as well.
Here’s a short video showing the latest version of RADVISION’s SCOPIA Desktop, utilizing touch technology for a most intuitive and efficient user interface:
Seamless, intuitive, productive – I have a sense we will be seeing everyone and everything following suit in the near future.