Building our own bot

When I first met Molly, nearly a decade ago, we didn’t live in the same place. She was on the east coast, I was in Chicago, and though she did move closer we still spent our first 3 years in separate cities. In an earlier era we’d probably have a box full of letters representing that period of our relationship. But thankfully, the medium of our time allowed for more continual connection, and instead we have a phone full of text messages.

I really didn’t text that much before we met, just here and there to coordinate plans. But with Molly it’s always been something different: a steady stream of communication, a connective thread, a heartbeat. Even after moving to Pittsburgh together it was still important, with both of us traveling so much. I’ve probably exchanged 1000X more messages with her than everyone else combined.

A few years ago, having accumulated 7 years’ worth of messages, I wanted to find a way to look at this trove of texts. On the iPhone, Apple makes it hard to look back more than a few days, requiring screen-by-screen scrolling and offering no good way to search. The Messages app has a clear bias towards recency, but I wanted to zoom out, to sift through this virtual letter box that we’d built up bit-by-bit. Luckily, after some digging, I learned that deep within the obscured file structure of an iPhone backup was a SQLite database containing all of our messages. With the proprietary interface stripped away, I could get my hands on a searchable version of our distanced ephemera: every sleepy “goodnight,” anticipatory “boarding now,” and phatic “&&&” that meant nothing and everything all at once.

A database is a funny place to find your memories. With this tiny SQLite file, I could effortlessly recall exactly what we talked about 100 days after we met, analyze our most commonly used words, and map the frequency and times we were in touch. That was sort of interesting, but also not particularly surprising or revealing. After playing with it for an hour or so I discarded my initial ideas for potentially visualizing this dataset. There was certainly a lot to work with, but it seemed like it would result in the kind of vapid navel gazing found in most quantified self projects.

Instead, I decided to use the database not as an archive, to be cataloged and analyzed, but as a seed, to train an AI that would make new text messages based upon our history. This plan seemed like more fun, and was a chance to learn about new technology that had only recently become more accessible. The idea was to create our own private bot, trained on all the text messages we’ve ever sent each other. I wanted it to send us one text a day, not a verbatim Timehop-like reminder of something we’d actually said in the past, but an original quip, conjured from the mind of a weird little AI whose only knowledge of the world was the text messages we’d sent each other.

Creating mollysimon bot

I extracted the 123MB SQLite database from my iPhone backup on Christmas day, 2018. The first thing I did was play around with a tool called iMessageAnalyzer, a Mac app that provides a quick-and-dirty UI to search and make basic charts out of your iMessage history. To begin training our bot I needed to get the messages out of the database, and into a basic text file. A simple query let me filter and export only the texts exchanged with Molly, resulting in a svelte 7MB CSV file.
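The exact query isn’t reproduced here, but a minimal sketch of the idea in Python might look like the following. Everything specific in it is an assumption made for illustration: the table and column names (`handle`, `message`, `address`, `is_from_me`, `text`) are a simplified, hypothetical stand-in for the real backup schema, and the phone numbers are placeholders. It builds a tiny in-memory database so it can run on its own.

```python
import csv
import io
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a HYPOTHETICAL, simplified schema.

    The real backup database has many more tables and columns, and its
    names may differ from the ones used here.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE handle (handle_id INTEGER PRIMARY KEY, address TEXT);
        CREATE TABLE message (
            message_id INTEGER PRIMARY KEY,
            handle_id INTEGER,
            is_from_me INTEGER,
            text TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO handle (handle_id, address) VALUES (?, ?)",
        [(1, "+15550100"), (2, "+15550199")],  # placeholder numbers
    )
    conn.executemany(
        "INSERT INTO message (handle_id, is_from_me, text) VALUES (?, ?, ?)",
        [
            (1, 1, "boarding now"),
            (1, 0, "goodnight"),
            (2, 0, "lunch?"),  # a different contact, should be filtered out
        ],
    )
    return conn


def export_conversation(conn, address):
    """Return one contact's messages as CSV text, oldest first."""
    rows = conn.execute(
        """
        SELECT m.is_from_me, m.text
        FROM message AS m
        JOIN handle AS h ON h.handle_id = m.handle_id
        WHERE h.address = ? AND m.text IS NOT NULL
        ORDER BY m.message_id
        """,
        (address,),
    ).fetchall()
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["is_from_me", "text"])  # header row
    writer.writerows(rows)
    return buffer.getvalue()


demo_conn = build_demo_db()
print(export_conversation(demo_conn, "+15550100"))
```

The filter on a single contact is what shrinks the export: only rows joined to that one address survive, which is the same idea as going from the full database to a much smaller CSV.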

To train the AI model I used an open source project called textgenrnn, a Python module that utilizes the TensorFlow ML platform. What’s going on technically here is called a “recurrent neural network,” which I had a conceptual understanding of, but I was thankful that textgenrnn abstracted away nearly all the complexity. Shoutout to this Lifehacker article that made it seem so easy. Once all the dependencies were installed the program was easy enough to use: (1) feed it the seed text, (2) tell it how long you want it to train, and (3) set the “temperature,” which is sometimes called “creativity” in other ML tools.

The further you dial up the temperature, the more divergent the AI will be. At a low temperature, it would write things that were extremely similar to phrases we’d actually said. At a high temperature it could devolve into complete nonsense. I wanted our bot to be “inspired” by our texts, not just a copycat, so after some trial and error I landed on a temperature of 8 out of 10: very creative, but not raving mad. The other parameter was how many “epochs” of training the program should undertake. Generally, the larger the number, the better the results, and the longer it takes to process. I cranked up the epochs, started it running, and went to bed.
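To make the temperature idea a little more concrete, here is a small sketch of how temperature is commonly applied when sampling from a model’s predictions. It is not textgenrnn’s actual code, just the general technique: divide the log-probabilities by the temperature and renormalize. The three-character “model output” is made up, and treating “8 out of 10” as 0.8 is an assumption for the sake of the example.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution by a sampling temperature.

    Below 1.0 the distribution gets sharper (the likeliest option wins more
    often); above 1.0 it gets flatter (unlikely options show up more).
    Assumes every probability is greater than zero.
    """
    # Divide log-probabilities by the temperature, then renormalize (softmax).
    scaled = [math.log(p) / temperature for p in probs]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs, temperature, rng):
    """Pick one index at random according to the temperature-adjusted odds."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]


# HYPOTHETICAL model output: odds for the next character after "goodnight lov".
chars = ["e", "a", "q"]
probs = [0.70, 0.20, 0.10]

cautious = apply_temperature(probs, 0.2)      # sticks close to the script
adventurous = apply_temperature(probs, 0.8)   # assumed stand-in for "8 out of 10"

for name, dist in [("low", cautious), ("high", adventurous)]:
    print(name, [f"{c}: {p:.3f}" for c, p in zip(chars, dist)])

rng = random.Random()
print("sampled:", chars[sample_index(probs, 0.8, rng)])
```

At the low setting nearly all of the weight lands on the most likely character, which is why the output parrots real phrases. At the higher setting the rarer characters get a real chance, and that is where the invented words come from.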

The next step of the process was more straightforward. After generating nearly 40K texts, I imported them into a MySQL database running on my web server. I wrote a small PHP script to pull one out randomly, and configured an account at Twilio that would allow me to send it as a text message. Finally, I set up a cron job (a UNIX-based task scheduler) to run my script once a day, which sends the bot’s message to both Molly and myself. I also set it up so that we can request a new message from the bot at any time, by simply texting the Twilio number.
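The real thing is a PHP script talking to MySQL and Twilio, but the shape of it can be sketched in a few lines of Python. To keep the sketch self-contained it swaps in SQLite for MySQL and a stand-in `send_sms` function for the Twilio call; the table name `bot_messages`, the function names, and the phone numbers are all hypothetical.

```python
import sqlite3


def load_generated_texts(conn, texts):
    """Store the generated texts in a (hypothetical) bot_messages table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bot_messages "
        "(id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO bot_messages (body) VALUES (?)", [(t,) for t in texts]
    )
    conn.commit()


def pick_random_message(conn):
    """Return one stored message chosen at random, or None if the table is empty."""
    row = conn.execute(
        "SELECT body FROM bot_messages ORDER BY RANDOM() LIMIT 1"
    ).fetchone()
    return row[0] if row else None


def send_daily_message(conn, recipients, send_sms):
    """Pick one message and send the same text to every recipient."""
    body = pick_random_message(conn)
    if body is None:
        return None
    for number in recipients:
        send_sms(number, body)  # in the real setup, this is the SMS provider call
    return body


# Demo with an in-memory database and a fake sender that only records calls.
conn = sqlite3.connect(":memory:")
load_generated_texts(
    conn, ["They was awesome!", "I am still so bad with data right now."]
)

outbox = []


def fake_send_sms(number, body):
    outbox.append((number, body))


sent = send_daily_message(conn, ["+15550100", "+15550101"], fake_send_sms)
print(sent, outbox)

# A cron entry along the lines of `0 9 * * * python3 send_bot_text.py`
# (time and filename are hypothetical) would run a script like this once a day.
```

Passing the sender in as a function keeps the random pick and the delivery separate, so the same pick could serve both the daily cron run and an on-demand reply.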

Hello, from bot

Of all my personal projects, I think mollysimon bot is the one that gives me the most daily enjoyment. It’s pretty common that we’ll both burst out laughing when the bot arrives, or ask each other later in the evening if we’ve seen the bot that day. There’s something uncanny in the language, even if the messages are often nonsense. The only words and phrases it knows are the ones we’ve said to each other. The idiosyncrasies contained therein would likely be lost on anyone else, but for us it’s like a series of inside jokes. Not all of the 1600 text messages our bot has sent us are coherent, or funny, but more often than not they’re delightful.

A while ago I posted some good ones on Instagram:

One of our favorites, which has become part of our everyday lexicon, is below. It’s such a perfect example of the bot being influenced by our texts (understanding Emoji as a proper noun) but adding its own oddity. The best ones make just enough sense to be funny.

Emoji is so short. Enjoy him in 8 minutes.

Even when the bot sends complete nonsense it’s still sort of interesting. As a result of the “temperature” being turned up during text generation it has the ability to make up new words, which can leave you wondering: what would “poggling to the hotel” really be like?

heh, and he sent them on emoji with Amazon Brewery till Labure Mersec. I can go to the shower which he seems a house and his song shitty and feeling awesome of the view non-different lists. They're poggling to the hotel, hopefully somewhere shitty going to the CENST.

Sometimes it’s fun to try and imagine the events that could have led to a message, were it a real one, being sent between the two of us:

I am still so bad with data right now.

Other times you can imagine that maybe it could have been a real message, assuming some liberal autocorrect mistakes mixed in.

I don’t want to push the car. Come to IDEO chicken thing around it. It's away you were always great.
I think so. One friend card called $120. Breakfast night island progress in course. It's awesome.

Rarely, although I wish it happened more often because I really enjoy it, the bot will generate a URL. Where possible, I’m tempted to buy up the domains and turn these creations into actual web pages. The example below still cracks me up every time I see it:

https://m.grroip-study?sster/tails/bc/star-old-oattward/oths-thre-editically.com/news/down.com/uncaller-terrible-live-everything-weird.com/130/2019ff./1/102//98676713916696278

A huge part of what makes the bot funny is receiving these snippets as actual text messages. If you just read them in a list (as you are here, I suppose) they don’t have the same effect as receiving them as a text, out of the blue. Although they arrive at the same time every day, it’s easy to forget and be momentarily startled by an incoming message, wondering “what is this?”

Sometimes, when the bot adheres too closely to the script, it can actually get confusing. The message below is one where I mistakenly thought it was actually from Molly:

On my way home!

Bot Dreams

Last year, I experimented with taking our bot to the next level by having an AI generate images based on bot messages. Similar to how textgenrnn made text generation accessible to me, this Google Colaboratory notebook made image generation something I could try without deep technical knowledge or a high-powered GPU. The included code utilizes something called a “generative adversarial network” (GAN) to create images based on simple text prompts. The output is weird and surreal in much the same way as the text bot, so I figured that combining the two was only natural.

I created an Instagram account called @botdreams, imagining what it might be like if an image bot fell asleep while reading the ramblings of our text bot. Some of the results are pretty amazing:

“Alex smells cheap”
“And are you addressing the doggies?”
“Here is everything in the morning”
“Kind of so fast”

Generating these images was really fun, but also very time-consuming. Each one took over an hour to create and couldn’t be easily automated, at least with the free version of Google Colab I was using. Ultimately, it stopped working entirely because Colab wasn’t providing me with the right GPU. I can see myself returning to botdreams again, either with a paid version of Colab or another approach. It feels like a natural extension of the project, and a way to share our bot’s weirdness more broadly.

For now though, I’m happy to be getting the daily text message from our weird little AI progeny. I like how it injects some surprise into the day, or as bot once said:

They was awesome!