Saturday, January 21, 2012

Divergent Views on Communicating with Machines

Steve Ballmer, CEO, Microsoft and Dr. Dieter Zetsche, Chairman, Mercedes-Benz
Much of what I saw at CES 2012 was about products being upgraded to “smart” under the premise that smart connectivity enables consumer convenience. It was definitely on the minds of most of those attending. That's why the CES keynote speeches were so well attended: they were slated to offer insight into the near-term future. But this year there were competing visions of that future. The industry leaders seemed to have divergent approaches to the development and marketing of "smart."
  • CES is a show focused on near-term product releases: those that will be launched later this year, in time for Christmas, and into next year. Throughout CES, almost all of the consumer products on display were billed as smarter, in the sense that they are connected to the Internet or a local network and have sensors or artificial intelligence that gather and process data and make decisions based on that information.
  • The Apple iPad, with its multi-touch capability, has already changed our expectations as to how we interact with our computers, tablets and phones. And Apple's Siri and Microsoft's Kinect are leading us on to even newer, smarter ways of interaction between users and their devices, all to add value to the product by providing convenience or entertainment to the buyer.
  • One of the most interesting areas of CES was focused on Digital Health - where one could easily see benefits from sensors and smart apps providing data that affects consumers. Digital Health was an area packed with healthcare inventions and eager young inventors, and the many new products and apps epitomized our "nurse in your purse" future.
To make my point I must first describe three seemingly disparate events:

Ford and NPR Press Conference:
Ford and NPR held a joint press conference to launch NPR’s new app, which runs within the infotainment system (called Sync AppLink) on new Ford cars. NPR’s Morning Edition and All Things Considered rank first and second among U.S. news radio programs. The new app gives Ford drivers voice control over their NPR programming. In a menu-driven series of commands, a driver can call up the latest news of the hour, select a live stream of his or her favorite station, or access programs or topics from NPR’s large library of podcasts by using a set of simple commands like “hourly news” or “stations” or “programs” followed by the name of the program. The resulting selection may be playing on the FM band, streaming live, or streaming from the archives of NPR over the Internet. It could take as many as five commands to get the desired program. Underneath the Ford Sync system is Microsoft’s operating system. Executives from Ford and NPR, when asked about future improvements to the system, said that a more free-form natural language voice recognition system would be ideal, but that the technology is not yet capable and reliable enough to work safely and conveniently in a car. But think how Siri would get to the same program in just one short sentence: “Find and play today’s ‘All Things Considered.’”
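To make the contrast concrete, here is a minimal Python sketch of the two interaction styles. It is purely illustrative: the menu tree, the function names and the naive substring matching are my own assumptions, not the actual Sync AppLink command set or anything NPR, Ford or Apple has published.

```python
# Hypothetical menu tree for illustration only; not the real app's commands.
MENU = {
    "hourly news": "Hourly newscast",
    "stations": {"my station": "Live stream of a favorite station"},
    "programs": {
        "all things considered": "All Things Considered",
        "morning edition": "Morning Edition",
    },
}


def menu_select(commands):
    """Walk the menu one spoken command at a time; return (selection, steps)."""
    node = MENU
    for steps, command in enumerate(commands, start=1):
        node = node[command.lower()]
        if isinstance(node, str):  # reached something playable
            return node, steps
    raise ValueError("commands ended before reaching a program")


def free_form_select(utterance):
    """Naive one-shot matching: look for a known program name in the sentence."""
    text = utterance.lower()
    for name, title in MENU["programs"].items():
        if name in text:
            return title, 1
    raise ValueError("no known program mentioned")


# Menu-driven: one command per level of the tree.
print(menu_select(["programs", "all things considered"]))
# Free-form: the whole request in a single sentence.
print(free_form_select("Find and play today's 'All Things Considered.'"))
```

Even in this toy version the menu-driven path needs one command per level, while the free-form path takes a single sentence. A real free-form system would of course need far more than substring matching to be safe and reliable in a car, which is exactly the executives' point.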

Keynote Speech by Microsoft CEO Steve Ballmer:
Shortly after this presentation I went to the Steve Ballmer Microsoft Keynote Speech. Bizarre is a charitable word to describe this off-putting, fever-pitched yet unexciting sales pitch for everything Microsoft. There was very little news, even less information about new product introductions, and much puffery about the new Windows 8 Operating System coming sometime this year. There was not a word about robotics even though Microsoft supports and sells a robotic operating system. Ballmer presented Windows 8 and Metro – the same systems that are limiting NPR’s app by not having a capable and reliable free-form voice recognition system similar to Apple’s Siri – as the cat’s meow: the very highest tech and the best you can buy anywhere. I actually felt bad watching the presentation: such an unappealing sales pitch, with no mention of Microsoft's vision for the future. [MS announced that this would be Ballmer's and their last keynote at CES, a fact which underscores how the shift toward mobile devices has kept MS re-allocating talent and resources to adapt.] As an aside, Bloomberg Businessweek Magazine just did a cover story about Ballmer turning the company into a more relevant powerhouse with cooler technology and also a serious player in cloud computing. In that article, Businessweek describes what I saw: “For many, the lasting impression of Ballmer is the sweaty, breathless, booming clown seen in countless YouTube clips [or in my case, in person at CES]. He plays the cheerleader in an apparent effort to prove that no one can top his love of Microsoft – and he succeeds cringingly well.” The article goes on to describe Ballmer as pretty normal except in public presentations. Still, I left that night without any new information and with a headache and a bad feeling.

Keynote Speech by Mercedes Chairman Dieter Zetsche:
The next morning I went to see Dr. Dieter Zetsche present Mercedes' first-ever CES keynote speech, an inspiring, informative and well-thought-out “big picture” look at the next generation of connected cars. When asked whether cars were going to become autonomously driven commodities built to carry around consumer products, he responded that Mercedes builds cars that people want to drive and that this will continue, but when the traffic or the road is boring, there will be a switch to turn on a temporary autopilot. Zetsche, in his interesting and responsible presentation, described the auto industry and Mercedes cars in terms of freedom and included new offerings within each of five “freedoms.”
  1. Freedom not only from the horses, buses and trains of the past, but also from the limits of distance and the tethers of things local, including the freedom to distance yourself from your parents.
  2. Freedom of time via connectivity, so that seamless updates are pushed to in-vehicle communication systems, negating the need to bring your car in for system updates. Their new MBrace2 system not only regularly updates and monitors their cars but also connects today’s digital lifestyle into a digital drive style.
  3. Freedom of speech to communicate with your car in the safest and most expeditious manner. The current iteration of MBrace2 has a much enhanced (but not yet free-form) voice recognition system, and in many instances the system will be proactive, e.g., choosing not to answer phone calls or read messages at times when the driver is fully occupied with hazardous driving situations.
  4. Freedom of energy – where Zetsche described new hydrogen-based fuel packs just waiting for the national (political) infrastructure to support them.
  5. Freedom of information where car-to-car communication can provide alerts about road hazards and conditions by taking advantage of the already present in-car virtual private network system and link.
All three of these presentations occurred before the doors for CES opened, and when I walked the massive exhibition space, those visions shaped my view of what I saw and of what I believe to be the immediate future in mobility, communication and apps. It is clear to me that despite Mr. Ballmer's sales pitch to buy today's systems and products because they are great, free-form voice recognition (à la Siri) is the future of communication with our machines and "smart" is the pathway we are following to that end goal.

The most advanced manner of communicating with smart products is by voice and gesture. Today’s technology is menu-driven (like the NPR example) but the future is free-form and natural (think IBM’s Watson or Apple’s Siri). Hence the flurry of acquisitions in the language processing space: Apple acquired Siri; Google just bought CleverSense; Aldebaran recently purchased Karotz. New startups of note in this arena include True Knowledge, with its Siri-like product Evi. Nuance (of Dragon Dictate fame) is already established in this arena. Nuance voice processing is repackaged and used by many car companies, including both Ford and Mercedes, for their in-car systems.

It appeared to be an afterthought in Ballmer’s presentation (Microsoft has been slow to react to its popularity and multiple uses), but Microsoft’s Kinect voice and gesture recognition device was the wonder of 2011 and was seen in many non-Microsoft booths at the 2012 show. Hacked from its Xbox gaming origins, it provides a low-cost alternative to expensive LIDAR and collision avoidance systems and serves all sorts of other applications. It is a wonderful invention that other companies are hacking and incorporating into their products. PrimeSense, the Israeli inventor of the Kinect device and its software, has been doing a booming business selling the device for non-gaming applications, research and who knows what else.

Consequently, it was easy to see at CES 2012 that the path to the next level of "smart" products is through better communication with those products: gesture recognition, voice recognition and natural language to command and control them, just like Tom Cruise in Minority Report!