Burrito

What is a relationship? And how do we architect something ambiguous, emergent, and invisible?

Burrito, along with Phygital Party Mode, is a personal project in a practice I playfully call self-centered design, in which I take a hypercritical view of my own mundane interactions to gain insights into embodied interaction and our everyday relationships with technology.

Burrito is a bot intended to be a manifestation of my marriage. Every time my husband and I text each other, the emojis we send are harvested into a database. By means of a mutually defined algorithm of ranked emojis, Burrito then makes a judgement and informs us weekly who is the better spouse. Bonkers? Even I agree, but bear with me.

Inspiration

Three sources of inspiration led to Burrito:

[1] While reading David Brooks’ The Social Animal I was provoked by the assertion that there are three entities in any two-body relationship – the two individuals and the relationship itself – the latter of which is equally dynamic and influential.

[2] At an IxDA London event in 2015 there was a great discussion of the theory that individual algorithms (e.g. Facebook’s) occupy one of Dunbar’s 150 – the proposed number of relationships we can cognitively maintain.

[3] The unavoidable, hyped revival of conversational UIs and artificially intelligent bots.

The culmination of these, in addition to my interest in embodied interaction, prompted me to make a relationship explicit.
Naturally, I started with my marriage.

Prototype

When this project began in the summer of 2015, my husband and I used Whatsapp as our primary means of digital communication and emojis as the primary subtext. Emojis were our digital tone of voice or body language – unspoken clues to a private mutual understanding enabled by shared experiences. Through a convoluted process of switching to SMS in order to automatically save our conversations to a database, I was able to successfully harvest our emojis. Then, after we each individually ranked emojis for a simple scoring system that also accounted for frequency and proximity, Burrito was able to determine who was the better spouse.
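For illustration only, here is a minimal sketch of that kind of scoring. It assumes a sender earns points when the emojis they send are ones the receiving spouse ranked highly, with a small per-message bonus for clustered emojis standing in for proximity; the rankings, weights, and message format below are placeholders, not our actual algorithm.

```python
import re
from collections import Counter

# Hypothetical rankings: each spouse assigns a value to the emojis they most
# appreciate receiving. These numbers are placeholders, not our real lists.
MY_RANKS = {"😘": 9, "🌯": 8, "👍": 3}
HUSBAND_RANKS = {"🌯": 10, "😂": 8, "🙄": 1}

# Rough single-codepoint emoji range; a dedicated emoji library would handle
# flags, skin tones, and multi-codepoint sequences more reliably.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF]")


def extract_emojis(text: str) -> list[str]:
    """Pull individual emoji characters out of one message body."""
    return EMOJI_PATTERN.findall(text)


def score_week(messages: list[dict]) -> dict[str, float]:
    """Score one week of harvested messages.

    Each message is {"sender": "me" | "husband", "text": str}. A sender earns
    points for emojis the receiving spouse ranked highly (frequency), plus a
    small bonus when several emojis cluster in one message (proximity).
    """
    totals = {"me": 0.0, "husband": 0.0}
    for msg in messages:
        receiver_ranks = HUSBAND_RANKS if msg["sender"] == "me" else MY_RANKS
        emojis = extract_emojis(msg["text"])
        counts = Counter(emojis)
        base = sum(receiver_ranks.get(e, 0) * n for e, n in counts.items())
        proximity_bonus = 0.5 * max(len(emojis) - 1, 0)
        totals[msg["sender"]] += base + proximity_bonus
    return totals


weekly = score_week([
    {"sender": "me", "text": "picked up burritos 🌯🌯😘"},
    {"sender": "husband", "text": "best spouse ever 😂👍"},
])
print("better spouse this week:", max(weekly, key=weekly.get))
```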

The prototype, in turn, ignited a lot of questions. For example – how does Burrito know the difference between an emoji directed at my husband versus one sent to him about someone or something else? How do I program Burrito to account for context? What counts as context? Should Burrito also supply an explanation of why one spouse is better than the other? Or does that matter? If Burrito informs us more or less frequently, is that better or worse? What is the frequency threshold that will affect our behavior and ultimately redefine our communication?

Reflections

Through the prototype and the resulting questions, I have learned a lot about algorithms and their broader implications. But more importantly, I’ve turned a more critical eye towards conversational communication and implicit interaction. The two screenshots below encapsulate these new perspectives.

The screenshot on the left follows a recent switch to Telegram, an open messaging platform on which Burrito is being refactored into an official bot. Telegram also supports stickers and GIFs, the latter being a particularly prominent substitute for our emojis and a new technical challenge because of the image recognition it requires. It is also worth pointing out the orientation of the sticker character, who faces left, towards my husband. Since my husband intended this sticker for me, I would expect the sticker within my interface to flip along the vertical axis and face right, towards me. While this subtle detail does not impede the flow or meaning of our conversation, it is nevertheless a break in the conversational UI foundation. Since an important element of conversational UIs is the consistency of always having yourself on the right and others on the left, why doesn’t the content also reflect (pun intended) the relationship between sender and receiver? And therefore, could a subtle yet simple play on content orientation embed implicit meaning?
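As a rough sketch of the refactor, the loop below long-polls the Telegram Bot API and shows where a text message, a sticker’s associated emoji, and a GIF (an animation, in Bot API terms) surface in each update. The token, print statements, and handler logic are placeholders for illustration, not Burrito’s actual implementation.

```python
import requests  # third-party HTTP client, assumed installed

# Placeholder token; a real bot token comes from Telegram's @BotFather.
API = "https://api.telegram.org/bot<YOUR_BOT_TOKEN>"


def harvest(message: dict) -> None:
    """Route one incoming message to whatever the scoring pipeline needs."""
    sender = message.get("from", {}).get("first_name", "unknown")
    if "sticker" in message:
        # Stickers carry an associated emoji, so they can be scored directly.
        print(sender, "sticker ->", message["sticker"].get("emoji"))
    elif "animation" in message:
        # GIFs arrive as animations; scoring them is where image recognition comes in.
        print(sender, "gif ->", message["animation"].get("file_id"))
    elif "text" in message:
        print(sender, "text ->", message["text"])


def poll() -> None:
    """Long-poll getUpdates and hand each new message to harvest()."""
    offset = None
    while True:
        resp = requests.get(
            f"{API}/getUpdates", params={"timeout": 30, "offset": offset}, timeout=60
        ).json()
        for update in resp.get("result", []):
            offset = update["update_id"] + 1
            if "message" in update:
                harvest(update["message"])


if __name__ == "__main__":
    poll()
```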

The screenshot on the right is representative of my regular conversations with my sister. She works from home, often with her 9-month-old in one hand, which means that while I silently message her from a crowded office, she responds by voice, her preferred modality of interaction. In my ideal world, my sister’s voice recordings would be accurately translated into words and GIFs, and my text and GIFs would be appropriately converted into an intonated voice memo. By separating the input modality from the delivery modality, the interface would be independently relevant to each user’s environment. Why doesn’t the content reflect each user’s own relationship with their immediate physical environment? And if it did, how would this affect implicit meaning?

In summary, there is a lot of focus on making artificial intelligence appropriately embedded in conversations based on the substance of content alone, but little attention paid to intersubjectivity.

Case Study

In the enterprise context I design within at Zebra Technologies, our users are what we refer to as situationally disabled: they may be under extra cognitive stress or physically constrained in how they receive and communicate vital information while simultaneously performing other tasks. They are also often in complex, data-rich environments in which they are expected to act quickly and accurately in both identification and input.

How should a conversational UI in an enterprise environment appropriately adapt to situationally disabled users’ context?

In a particular project dedicated to improving the task-based processes of warehouse workers operating heavy machinery, we have been investigating how to seamlessly interweave conversations into exception handling with minimal disruption to the prescribed workflow. One such concept below – with specifics intentionally removed – explores the idea of a conversational UI.

The first screen is a standard task list with a FAB (floating action button) that collapses the tasks in the second screen to make room for a conversational interface. Canned responses restrict the flow to context-aware options. More importantly, the receiver and the system are decoupled from the usual left-hand side, and system messages are centered. This deliberate separation is important for how we establish enterprise-specific relationships between senders, receivers, and systems within multimodal ecosystems.
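That decoupling can be expressed as a small message model in which alignment is driven by role rather than by “me versus everyone else.” The role names and alignment values below are illustrative, not our production data model.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    SENDER = "sender"      # the worker holding the device
    RECEIVER = "receiver"  # the person on the other end of the conversation
    SYSTEM = "system"      # the warehouse system raising the exception


# Alignment keyed by role instead of "me vs. everyone else": the deliberate
# separation described above. Values are illustrative.
ALIGNMENT = {
    Role.SENDER: "right",
    Role.RECEIVER: "left",
    Role.SYSTEM: "center",
}


@dataclass
class Message:
    role: Role
    body: str

    @property
    def alignment(self) -> str:
        return ALIGNMENT[self.role]


print(Message(Role.SYSTEM, "Exception raised on current task").alignment)  # center
```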

Enterprise case study in collaboration with Noël Bankston