Siri downplayed by Google and Microsoft

One thing I’ve learned is that when competitors spin a feature like this by downplaying it, it usually means a nerve has been struck. Sure, Siri in its current beta form is not usable everywhere. Apple, however, is not sitting still and is working to change that by adding more languages and service partners. True, there will always be a subset of people who balk at the idea of barking into their phones in this manner, especially given how most voice recognition systems work, where you have to speak unnaturally (slowly or in a specific order). The difference with Siri, though, is that the way you speak to it isn’t much different from how you would speak to another person. Furthermore, these are smartphones we’re talking about (pun intended), and people unsurprisingly speak into them. So for both Rubin and Lee to dismiss the personal assistant angle means they are either missing the point about Siri or, more likely, know that they are several steps behind what Apple has managed to bring to the consumer market.
The key difference between Siri and current voice recognition systems is its natural language processing, or NLP (aka what many of us have erroneously referred to as AI, or artificial intelligence). Voice recognition (based on the capabilities of Nuance’s Dragon) is just one piece of the puzzle. Siri’s NLP is what sets it apart: it allows the user to make these inquiries in a natural and continuous way, because Siri goes beyond current systems in its ability to understand the context and flow of a conversation. Siri also further optimizes itself as time passes by learning more about the user. Where have we heard about this before? This is why the likes of Google and Facebook want to learn everything about their users: the more information they have, the better they can target their key revenue generator (advertising) at you. Apple’s purpose is different, though: it wants to know more about you not to sell you ads (or to sell that personal information to the highest bidder) but to make your user experience that much better. And most of the processing for this happens on Apple’s servers (which is why, unlike VoiceOver, Siri requires a network connection via either WiFi or cellular).
As the former lead developer for Siri, Ed Wrenbeck, noted, the brains of Siri run on a bunch of servers in Apple’s data center, where it is able to take a sentence and dissect it naturally while also maintaining context. This comprehension of context is difficult and requires a great amount of logic and processing. And this is where it differs from the ELIZA form of AI, which normally took your input and repurposed it back into questions, eventually spitting out a response based on your answers. The NLP behind Siri makes it feel more human. The real kicker is that Wrenbeck says Siri is basically a contextual, semantic, personalized search engine. I stress search engine because, if you think about it, Siri is taking search to another level.
And that has to make Google especially uncomfortable at some level. Search in its current incarnation is wholly unnatural, and the results it spits out are often unusable (except for popular terms) unless you dig down or really refine your search (at which point you end up typing something really long, inane and nonsensical). Sure, Apple isn’t developing its own search engine à la Google or Bing... yet. But with all of these questions being posed to Siri, Apple is building up a huge database of natural language queries whose results can be optimized and fine-tuned. It’s sort of like metadata (data about data), but in this case more like meta search: Apple is building up a huge repository of search questions and possible answers while letting others do the literal dirty work of data mining and optimizing that data.

Apple is naturally using Siri as a carrot for the iPhone 4S by making it exclusive to that phone. As Wrenbeck noted, though, intelligence needs to come from many sources, and that poses a key challenge for Apple in terms of opening Siri up while also maintaining the high quality of the data (i.e. the old adage of garbage in, garbage out applies here). Wrenbeck also noted that Siri’s original vision was to be a personal assistant available from any device, anywhere, anytime, something Apple has pulled back from in making it available only on the 4S (and not the iPhone 4 or iPad 2). This could change as Siri’s capabilities grow, since it is clearly labeled as beta in its current shipping incarnation. Given that Apple rarely ships software in beta form (and yes, I do realize, and have written about, how it has at times shipped half-baked software and services, but Siri is a bit different), it obviously sees that this represents a game changer, not in the sense of voice recognition capabilities but more so in how something as mundane as search can be made better. And that small proposition is what should make Google (and, to a lesser extent, Microsoft with Bing) squirm, as it ups the ante on what users will expect. In some ways, Siri could represent Apple’s foray (in minor toe-dipping fashion) into search. The desktop search environment is naturally different because voice is not always a good fit there. But Siri can still understand text-based input, which means that the premise of natural language “text” queries can also come to fruition (though there have been great strides in this area, it’s one that has been hit or miss). Finally, one has to wonder whether Apple will take voice biometrics, something it dabbled in a while back in Mac OS 9 with Voiceprint (which let you use your voice to log in and never made the transition to Mac OS X), and meld it with Siri.
That might take recognition and personalization to a whole new level, or it could be perceived as too much by those uncomfortable with machines taking on more human characteristics (but that’s an entirely different subject altogether).
