
Voice Interaction Conference: A Designer Goes to SpeechTEK

Our Creative Director looks for takeaways from the voice interaction event

SpeechTEK 2018

I recently attended the SpeechTEK conference in Washington, D.C. for the first time; the event focuses on deploying speech-based solutions for business applications. I went with Stu Gavurin, Mission Data's CEO, who had attended previously, but for me SpeechTEK stood out from other conferences in several ways. I attended because I wanted a broader understanding of the voice world beyond building consumer-facing skills for Amazon's Alexa. I do think I achieved my goal, but it took some work.

Hyper-focused

Unlike other industry conferences such as An Event Apart, Breaking Development, or Circles, which allow you to take away a lot of information around a particular topic (like design) while still covering a number of different areas (research, user experience, visual design, etc.), SpeechTEK is hyper-focused on voice. The sheer amount of information on this one particular topic did leave me a bit weary by the end.

In particular, the conference focuses on customer service and how voice interaction fits into it, which required a bit of a mental leap for me to draw value out of every discussion. Company case studies, which made up a reasonable share of the talks, covered areas that weren't exactly relevant to how I might use voice interaction in a potential project. For instance, capturing notes and action items from a conference call is interesting, but I would have preferred a broader overview of best practices.

The customer service focus also meant that the speakers assumed I knew a bit more than I did. They used terms and discussed technologies or protocols that were foreign to me, though I was likely the only one in the room for whom that was the case. Even so, I was able to pick up some interesting insights that, when stitched together, began to paint a decent picture.

High-level learning

For me, the most interesting part of attending SpeechTEK was learning about some high-level things that I hadn't really thought about or even known to consider. I suppose it's fair to say that I thought I knew a lot more than I did. I thought to myself, "how complicated can this be?" The truth is there are opportunities to use voice to solve problems in interesting ways if you go beyond the typical Alexa/Google smart assistant use cases and explore things like advanced natural language processing and artificial intelligence.

For example:

  • Understanding the basics of how voice platforms work. It begins with far-field voice detection listening for speech; the audio then runs through speech recognition and on to a natural language processor. Once the intent is understood, it can be passed along to third-party services, or a proper response based on that intent can be read back to the user (a simplified sketch of this flow follows the list). Previously, to me it was: "talk to device, it recognizes your words, does magic, answers you."
  • Discovering some of the documented differences between screen and voice interfaces, one of the big ones being that voice interfaces are generally terrible at error handling, which is particularly frustrating for users.
  • Letting users know where they are within a system and what they are talking to (the general OS or a particular application) is a hard but important problem to solve. With smart assistants it can be confusing to know whether you are talking to Alexa or to a skill, and sometimes you move in and out of one without knowing it.
  • The difference between action-centered and conversation-centric queries, when each is appropriate, and what each is best suited for.
  • Basic concepts and approaches from the voice world, like IVR (Interactive Voice Response) systems, that presenters brushed past with "IVRs, you all know what those are, so I won't spend time on those," and that I had to go look up.
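
To make that first point more concrete, here is a minimal sketch of the voice pipeline in Python. Every name in it (capture_audio, speech_to_text, understand, fulfill, speak) is a hypothetical stand-in rather than a real platform API; the point is simply the flow from far-field audio capture through recognition and natural language understanding to fulfillment and a spoken response.

```python
# A minimal, conceptual sketch of the voice pipeline described above.
# All function and class names are hypothetical stand-ins, not a real
# platform API; each stage is stubbed so the flow itself is the point.

from dataclasses import dataclass


@dataclass
class Intent:
    name: str    # e.g. "GetWeather"
    slots: dict  # e.g. {"city": "Washington"}


def capture_audio() -> bytes:
    """Far-field microphones listen for the wake word and record the utterance."""
    return b"...raw audio..."  # placeholder audio buffer


def speech_to_text(audio: bytes) -> str:
    """Speech recognition turns the audio into a text transcript."""
    return "what's the weather in washington"


def understand(transcript: str) -> Intent:
    """Natural language processing maps the transcript to an intent and slots."""
    return Intent(name="GetWeather", slots={"city": "Washington"})


def fulfill(intent: Intent) -> str:
    """Hand the intent to a skill or third-party service and build a response."""
    if intent.name == "GetWeather":
        return f"It's 72 and sunny in {intent.slots['city']}."
    # This fallback path is where voice interfaces tend to handle errors poorly.
    return "Sorry, I didn't catch that."


def speak(response: str) -> None:
    """Text-to-speech reads the response back to the user."""
    print(response)


if __name__ == "__main__":
    speak(fulfill(understand(speech_to_text(capture_audio()))))
```

It is not much, but even this toy version shows why each stage matters: a misrecognized transcript or an unmatched intent falls straight through to the generic fallback, which is exactly the error-handling weakness mentioned above.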

Takeaways

Voice interaction has been around for a while now in various forms, but consumers are becoming more familiar with how it might benefit their lives at home or at work. Advances in natural language processing and recognition are giving strategists, designers, and developers new ways to think about and implement voice-based solutions instead of screen-based ones. While there is talk of AI and voice replacing screens, I don't know that I really believe that. There are too many things that people just don't want to do with their voice, depending on the task, the scenario, the level of privacy needed, and so on. I do believe screens are on the decline and that things like voice can augment or replace current experiences with better ones, but most likely we are going to be left with a world where we use different technologies and hardware for different purposes, just like today. How we use those things, and where they sit in proximity to us, is likely going to make our connected world a whole lot more interesting.