Conversational UX Design - Process


Just as with graphical and mobile user interfaces, conversational interfaces should be designed before you start building. Static mock-ups are much cheaper and faster to change than a working conversation flow. We use the following three-step design process.

1. Mock-up
2. Design
3. Build

Mock-up: While sketches and low-fidelity prototypes are useful for visualizing graphical interfaces, they will not help in visualizing a conversation design. Instead, a simple transcript is better suited. Transcripts can represent sequences of utterances, and who is speaking each one, for either text- or voice-based conversations. Transcripts are easy to read, easy to create and easy to share. Designers and stakeholders can quickly iterate on a mocked-up transcript before building any parts of the conversation flow. In the case of text-based interfaces, simple transcripts also leave out any representation of the visual interface, which could otherwise distract from the design of the conversation itself.

The field of Conversation Analysis provides some transcription conventions that are useful for mocking up user interactions. Conversation analysts mark the speaker of each turn with short labels, such as U for user and A for the agent, or chatbot, in order to keep track of who is saying what at all times. We also put the user's utterances in boldface to make them further stand out. In addition, conversation analysts number each line of a transcript, somewhat like programmers, which makes it easy to point to particular utterances, or parts of utterances.

Simple Mock-up
01 U:  who invented the hard disk?
02 A:  It was invented by IBM in 1956.
03 U:  can you give an example?
04 A:  The IBM 305 RAMAC was the first  
05     computer to use a hard disk drive.

This simple mock-up demonstrates a design for how a user can elicit an example of the agent's prior utterance. The user can ask a question (line 1), receive an answer (line 2) and then elicit a particular kind of paraphrase, or understanding repair: an example (line 3). The agent will recognize this kind of user action and respond appropriately (lines 4-5). Whether or not this design will work with the eventual architecture of the application is not the concern at this point. The goal is to represent the target user experience that the system will produce.
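The transcription conventions above (a number for each line, a short speaker label at the start of each turn) are simple enough to generate automatically. The following is a minimal Python sketch, not part of the original process; the function name format_transcript and the wrap width of 36 characters are assumptions chosen for illustration.

```python
import textwrap
from typing import List, Tuple


def format_transcript(turns: List[Tuple[str, str]], width: int = 36) -> str:
    """Number every line and label the speaker at the start of each turn.

    `turns` is a list of (speaker_label, utterance) pairs, e.g. ("U", "hello").
    Long utterances wrap onto continuation lines that are numbered but carry
    no speaker label, so each line can still be pointed to individually.
    """
    lines = []
    line_no = 1
    for speaker, utterance in turns:
        chunks = textwrap.wrap(utterance, width=width) or [""]
        for i, chunk in enumerate(chunks):
            # Only the first line of a turn shows the speaker label.
            label = f"{speaker}:" if i == 0 else "  "
            lines.append(f"{line_no:02d} {label}  {chunk}")
            line_no += 1
    return "\n".join(lines)


# The Simple Mock-up, expressed as data.
mockup = [
    ("U", "who invented the hard disk?"),
    ("A", "It was invented by IBM in 1956."),
    ("U", "can you give an example?"),
    ("A", "The IBM 305 RAMAC was the first computer to use a hard disk drive."),
]

print(format_transcript(mockup))
```

Keeping the mock-up as a plain list of turns makes it easy to reorder or reword utterances during iteration while the line numbers stay consistent.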

Design: While mock-ups represent the desired conversational user experience, they do not represent the underlying system itself. Many sequences, or parts of sequences, that can be designed through transcripts may not be feasible to build. Therefore, the next step is to design the conversation code itself. "Pseudocode" is a notation that resembles a programming language but is simplified. Pseudocode is commonly used by programmers to design programs before building them. Because it is simplified, designers can use pseudocode to focus on the design itself without having to worry about exact commands and syntax.

We have developed a form of pseudocode specifically for representing dialog flows using the Intent-Entity-Context-Response paradigm, used in most of today's chatbot and conversation platforms. So far we have used it mostly with Watson Conversation. In this paradigm, there are only 5 actions the designer can take...

  1. Create condition – Conditions are compared to input utterances, represented with "if"

  2. Assign default – Assign action if no conditions are met, represented with "else"

  3. Set variable – Capture the context of the current input for future turns, represented with "set"

  4. Route to node – Route to another dialog node, represented with "goto"

  5. Respond to user – Output text to the user, represented with "say"

In creating the first type of basic action, "create condition," designers can typically combine the following elements...

  1. Intents – Linguistic classes against which the similarity of a text input can be scored, prefaced with "#"

  2. Entities – Keywords or phrases to be matched exactly, prefaced with "@"

  3. Context – Variables for capturing events in the conversation, prefaced with "$"

  4. Pattern – Text string analysis with Regular Expressions, JavaScript, etc., prefaced with "^"
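To make the four condition types concrete, here is a minimal Python sketch of how a single condition might be checked against one user turn. It is an illustration only and does not reflect how any particular platform evaluates conditions. The Turn class, the matches function and the 0.5 intent-confidence threshold are all hypothetical, and the intent scores and entities are assumed to come from some upstream classifier.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Set

# Hypothetical cut-off: an intent "matches" if its score reaches this value.
INTENT_THRESHOLD = 0.5


@dataclass
class Turn:
    """One user input plus what is assumed to be known about it."""
    text: str
    intents: Dict[str, float] = field(default_factory=dict)  # name -> score
    entities: Set[str] = field(default_factory=set)          # matched keywords
    context: Dict[str, str] = field(default_factory=dict)    # stored variables


def matches(condition: str, turn: Turn) -> bool:
    """Evaluate one condition written in the #, @, $, ^ shorthand."""
    if not condition:
        raise ValueError("empty condition")
    prefix, name = condition[0], condition[1:]
    if prefix == "#":   # intent: similarity score is high enough
        return turn.intents.get(name, 0.0) >= INTENT_THRESHOLD
    if prefix == "@":   # entity: keyword or phrase was matched exactly
        return name in turn.entities
    if prefix == "$":   # context: variable has a non-empty value
        return bool(turn.context.get(name))
    if prefix == "^":   # pattern: regular expression found in the text
        return re.search(name, turn.text) is not None
    raise ValueError(f"unknown condition prefix: {prefix!r}")


turn = Turn(
    text="can you give an example?",
    intents={"EXAMPLE_REQUEST": 0.9},
    context={"example": "The IBM 305 RAMAC was the first computer to use a hard disk drive."},
)
print(matches("#EXAMPLE_REQUEST", turn), matches("@drive", turn))
```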

By using this shorthand for the 5 basic commands (if, else, set, goto, say) and the 4 condition types (#, @, $, ^), we can quickly represent most features of a dialog tree. For example, the following pseudocode represents the underlying node structure of the Simple Mock-up above (lines 3-5 only)...

01   if #EXAMPLE_REQUEST
02        if $example has value say $example
03        else say "I'm afraid I can't think of an example."

This pseudocode instructs the designer to create one parent node and two child nodes, as indicated by the indents. The parent node has a single condition, an intent, #EXAMPLE_REQUEST (line 1). The first child node has a single condition, variable $example has a value, and a response, which is the value of $example (line 2). The second child node assigns a default action, respond with "I'm afraid I can't think of an example." (line 3). In practice, we usually do not represent an entire dialog tree in pseudocode, but only examples of the more complicated node branches.
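For readers who find running code easier to follow than pseudocode, the same parent-and-children structure can be sketched in Python. This is an assumed, simplified model of a dialog tree, not the implementation of any conversation platform. The names State, Node and respond are hypothetical, and the intent is assumed to have been recognized already.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class State:
    """What is assumed to be known when a user turn arrives."""
    intent: str
    context: Dict[str, str] = field(default_factory=dict)


@dataclass
class Node:
    """A dialog node; condition=None plays the role of "else"."""
    condition: Optional[Callable[[State], bool]] = None
    say: Optional[Callable[[State], str]] = None
    children: List["Node"] = field(default_factory=list)


def respond(nodes: List[Node], state: State) -> Optional[str]:
    """Return the response of the first matching node, descending into children."""
    for node in nodes:
        if node.condition is None or node.condition(state):
            if node.say is not None:
                return node.say(state)
            return respond(node.children, state)
    return None  # no node matched this turn


# One parent node (the intent) with two child nodes, as in the pseudocode.
example_tree = [
    Node(
        condition=lambda s: s.intent == "EXAMPLE_REQUEST",
        children=[
            Node(
                condition=lambda s: bool(s.context.get("example")),
                say=lambda s: s.context["example"],
            ),
            Node(say=lambda s: "I'm afraid I can't think of an example."),
        ],
    )
]

with_example = State(
    intent="EXAMPLE_REQUEST",
    context={"example": "The IBM 305 RAMAC was the first computer to use a hard disk drive."},
)
print(respond(example_tree, with_example))
print(respond(example_tree, State(intent="EXAMPLE_REQUEST")))
```

The first call takes the branch where $example has a value, and the second falls through to the default response, mirroring the two child nodes described above.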

This pseudocode design enables designers to represent how the underlying conversation code works and to share the design with others. While transcripts are easy for any stakeholder to read, pseudocode may be difficult to read for anyone who is unfamiliar with building on a conversation platform, such as Watson Conversation. Therefore, we share it mostly among conversation designers and builders.

Build: Once the designer has a firm grasp on what to build, they should start building. The transcript mock-ups and the pseudocode design provide a detailed blueprint for what to create. In addition, the Common Activity modules of the Natural Conversation Framework provide reusable examples that designers can configure and modify, rather than starting from scratch every time.

Because they are simple and text-based, both transcripts and pseudocode are easier to share with collaborators in emails, presentations or documents than screenshots of an authoring tool or the full conversation code itself. For this reason, they are critical for a joint development process.

Show me your transcripts!

Despite the numerous articles on "chatbots" and "conversational agents" these days, there is a noticeable absence of examples of the chat or the conversation. For every article about a new conversational agent, ask "what is the conversational experience like?" "How conversational is it?" "What is the agent doing that is new and innovative?" None of these questions can be answered without seeing examples of the interaction itself.

Conversation analysts trade in excerpts of detailed transcriptions of naturally occurring human conversation in order to share and demonstrate their analytic discoveries. Conversational UX designers should likewise trade in excerpts of chat logs in order to demonstrate the form and beauty of their designs. The practice of sharing transcript excerpts will enable conversation designers to share tips and tricks, learn from each other and collaborate: "Here's a design I've got!" "Check out this working script!" "How did you do that?"

Transcripts are the currency of conversation analysts and should be for conversational UX designers too!




Project Members

Dr. Robert J. Moore
Conversation Analyst, Lead

Raphael Arar
UX Designer & Researcher

Dr. Margaret H. Szymanski
Conversation Analyst

Dr. Guang-Jie Ren
Manager