Maybe Siri should have multiple personality disorder


For some years now, there has been a race between Google, Amazon, Microsoft, Apple and others to create the best voice assistant. The interaction model for each of them is that the user asks a question of the One Assistant, which either answers the question directly or dispatches it to one of a number of applications provided by 3rd parties.

The main contenders are each gradually increasing the number of domains their assistants understand, expanding their keywords and ontologies until, at some point in the distant future, they are supposed to cover everything we do in daily life. In doing so, they allow 3rd party developers various degrees of integration to include their own functionality in the One Assistant - but at the end of the day, the user still interacts with the assistant as a single "personality".

Lately I've been wondering whether that interaction model is the best approach technically, or for us as users, or for Apple as a business strategy given where it stands in the race compared to others.

The alternative I would like to consider is for the users to address their questions directly to specific 3rd parties, acting as multiple personalities available for us to converse with, rather than to a single assistant whose knowledge base includes the 3rd parties.

"Hey Accuweather, is it going to rain tomorrow?"
"Hey MLB Bot, how did the Giants do?"
"Hey Rotten Tomatoes, is the new Spider-Man any good?"

With Microsoft and Amazon recently announcing a (very limited) degree of interaction between Alexa and Cortana, I wonder if maybe they are thinking in the same direction.

There are 4 reasons why I think this might work better, and why I think Apple should be the one to take this step.

First of all, technically. Expecting the One Assistant to understand 1 or 2 domains is relatively easy, but as the number of domains branches out and fragments into ever more specific sub-domains, the complexity grows far faster than the number of domains, because each new domain has to be disambiguated against all the existing ones. At some point it becomes impossible for a central team developing a single assistant to keep up, even with 3rd party extensions being added in. This is the issue that the Cyc project ran into as it tried for over 30 years to model all common sense knowledge of the world. (Siri, incidentally, came out of a similarly ambitious effort, SRI's CALO project.)

The second (though related) reason is that the "single point of contact" model may slow down development. As the One Assistants expand into ever more domains, it becomes harder to add new functionality since there are so many existing queries to take into account. Every new keyword that is added needs to be weighed against the possibility of overlapping with existing keywords (or being phonetically too close to them). The more is added, the worse it becomes. This is a classic feature interaction problem which many software platforms suffer from, and the solution is almost always to make the system more modular. Separating the knowledge domains by allowing users to explicitly specify which subsystem they want to query greatly reduces the risk of feature interaction.

Third, for us as users, it can be instructive to compare our interactions with Siri and Alexa to our interactions with humans. Having One Assistant is modeled on the role of executive assistants as they have existed for the past hundred years. But in social interactions, nobody expects a single person to be knowledgeable on every domain from baseball and television to medicine and history. We are used to speaking to multiple people and we know who is knowledgeable on which topic. Asking users to specify whom they are addressing a question to is relatively low effort, since our brains are hardwired to do this. We can always fall back on the One Assistant as a default in case we don't know who might answer a certain question, but if users know exactly who they want to speak to, they should be allowed to specify it.
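To make the idea a bit more concrete, here is a minimal sketch in Python of what "address a personality by name, fall back to the One Assistant otherwise" could look like once speech has already been turned into text. Everything in it is hypothetical: the `PersonalityRouter` class, the "Hey <name>, <question>" convention and the stand-in handlers are my own illustration, not how Siri, Alexa or any real assistant works.

```python
from typing import Callable, Dict

# A "personality" is just a function from a question to an answer here.
Handler = Callable[[str], str]


class PersonalityRouter:
    """Routes "Hey <name>, <question>" to the named personality.

    Hypothetical sketch: anything that does not name a registered
    personality is handed, unchanged, to the fallback assistant.
    """

    def __init__(self, fallback: Handler) -> None:
        self._handlers: Dict[str, Handler] = {}
        self._fallback = fallback

    def register(self, name: str, handler: Handler) -> None:
        # Names are matched case-insensitively.
        self._handlers[name.lower()] = handler

    def route(self, utterance: str) -> str:
        # Split "Hey Accuweather, is it going to rain?" at the first comma.
        prefix, comma, question = utterance.partition(",")
        words = prefix.strip().split(maxsplit=1)
        if comma and len(words) == 2 and words[0].lower() == "hey":
            handler = self._handlers.get(words[1].lower())
            if handler is not None:
                return handler(question.strip())
        # Unknown or missing name: let the One Assistant deal with it.
        return self._fallback(utterance)


# Stand-in personalities that only echo what they were asked.
router = PersonalityRouter(fallback=lambda q: f"[One Assistant] {q}")
router.register("Accuweather", lambda q: f"[Accuweather] {q}")
router.register("MLB Bot", lambda q: f"[MLB Bot] {q}")
```

The point of the sketch is that each personality only ever sees the questions addressed to it, so adding a new one cannot collide with the keywords of the others. Only the fallback has to cope with everything.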

Assuming the above has some merit (and people familiar with AI and voice assistants may feel free to correct me), my final argument is more strategic. I think Apple is the party that would benefit most from stepping away from the One Assistant model and towards the Multiple Personalities model, since it would tilt the playing field to their advantage. Apple's privacy-driven take on AI is the one that least benefits from centralising all info into a single assistant. Google currently has Siri beat, and it does not seem very likely that Siri will leapfrog Google in response quality in the years to come. But splitting up the interaction model by presenting multiple personalities to the user would allow Apple to re-define the game in a way that Google may have a hard time adapting to.

Just my two cents.


