Note: this project was completed in the first half of 2022, before the widespread public availability of the newest generation of Large Language Models (LLMs), like ChatGPT. While the newer models are much more powerful, two things should be kept in mind:
And now, back to our originally scheduled programming:
Before we delve into Conversational AI, I need to talk to you about something near and dear to my heart.
Me.
Well, more specifically, my work history.
I’ve spent the majority of my adult life working in sales in some capacity.
I’ve worked in retail sales, corporate sales, and, most recently, in online sales as a copywriter and digital marketer.
While each sales type has its positives and negatives, there is one theme common to all of them: burnout and the turnover it drives.
Granted, attrition is a fact of life in all industries, but it's especially worrying in sales.
The turnover rate in sales is over 30%, almost 270% greater than the average for all other industries.
And 67% of sales employees are on the verge of burnout at this moment.
If you have worked in sales for more than a couple of years, odds are you've experienced burnout or know someone who has.
You can find a whole host of suggestions online as to how companies can tackle this challenge. Without fail, though, every single one of them misses what, in my experience, has been the most obvious and best solution to motivating salespeople and preventing burnout: giving them better leads.
Sales, with its constant rejection, is a profession that’s bruising on the ego.
As rejections pile up, salespeople become less passionate about their jobs.
If you give them higher quality leads, they will experience fewer rejections.
Better leads mean more money for sales reps (which creates even happier salespeople) and help a company's bottom line.
Okay, but how do you give them better leads?
You do that through lead nurturing: the process of engaging prospective customers with appropriate content and information at each stage of the sales funnel, with the end goal of earning their business.
Currently, there is a range of lead nurturing strategies companies employ:
And while these strategies have borne some fruit, there is still a lot of potential business left on the table.
And this is where Conversational AI, aka chatbots, comes into play.
The more engaged a prospect is, the warmer they become as a lead when they eventually get passed on to a sales rep.
Current lead engagement channels like email, content marketing, or even web copy are mostly passive: you’re left hoping prospects actually see what you want them to see rather than skimming past it.
Others, like one-to-one interactions, are not easily scalable.
Conversation is more engaging, and chatbots are more easily scalable.
What’s more, everyone who comes to your site is already interested in your product, to some degree. I mean, why else would they be there? Since customers aren’t easy to come by, it’s incumbent upon you to take advantage of that interest.
Unfortunately, when you mention chatbots to the average person, they usually feel like this:
That’s because most of them adhere to a very rigid if-this-then-that structure that consequently leaves you feeling like you’d be much better off throwing your computer out of the nearest window.
(For those curious about the tooling, the Python libraries used in this project were: gensim, glob, logging, matplotlib, nltk, numpy, os, pandas, pickle, pyLDAvis, pytorch, random, re, seaborn, shutil, sklearn, spacy, sys, tensorflow, textblob, time, tqdm.notebook, transformers, typing, and wordcloud.)
In other words, can I make people go from feeling like this:
To this:
Initially, I planned to use the Cornell Movie-Dialogs Corpus, a collection of fictional conversations extracted from movie scripts.
Unfortunately, for reasons which will become apparent, I had to change course.
I ended up using a modified topical chat dataset instead. Created by Arnav Sharma, it's a streamlined version of this original Amazon Alexa dataset.
The dataset consists of 8,000+ conversations and 184,000+ messages. It contains a conversation id, a message, and the sentiment of each message, and spans 8 topics. Its intended purpose is to aid in the effort to build a socialbot that can have engaging open-domain conversations with humans.
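To make that structure concrete, here is a minimal sketch of how messages like these can be grouped back into conversations. The field names (`conversation_id`, `message`, `sentiment`) and the rows themselves are hypothetical, stand-ins for the real CSV:

```python
from collections import defaultdict

# Hypothetical rows mirroring the dataset's structure: each message
# carries a conversation id and a sentiment label
rows = [
    {"conversation_id": "c1", "message": "Do you like football?", "sentiment": "Curious"},
    {"conversation_id": "c1", "message": "I do! I watch it every week.", "sentiment": "Happy"},
    {"conversation_id": "c2", "message": "Books are my escape.", "sentiment": "Happy"},
]

# Group messages by conversation so each chat can be handled in order
conversations = defaultdict(list)
for row in rows:
    conversations[row["conversation_id"]].append(row["message"])

for cid, messages in conversations.items():
    print(cid, len(messages))
```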
The topics included are:
In terms of models, I focused on attention-based ones because of their ability to mimic the way humans understand language.
For example, if a sentence is ten words long, you and I don’t give 10% of our attention to each word in that sentence. We focus our attention on the most important contextual words. Then, we respond accordingly.
Attention-based models do a similar thing.
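The intuition above can be sketched in a few lines. This is a toy scaled dot-product attention step for a single query, the core operation inside these models; the word vectors here are made up purely for illustration:

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Scores each key against the query, turns the scores into
    weights with softmax, and returns the weighted sum of values.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
    return output, weights

# Toy "sentence" of three word vectors; the query resembles the first word most,
# so most of the attention weight lands on it
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = keys
output, weights = attention([1.0, 0.0], keys, values)
print(weights)  # the first word gets the largest weight
```

The weights always sum to 1, so they behave like the "percentage of attention" in the sentence analogy above.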
Unfortunately, my initial attempts at creating an open-ended chatbot were a little murdery.
How, you ask?
Well, I asked my chatbot the following:
And it responded with:
Now, I don’t know about you, but if I ask someone a question that has nothing to do with murder, and they then proceed to respond by highlighting the fact that they are not a murderer, I am 100% going to think that they are, in fact, a murderer.
While the other answers my initial chatbot gave me weren’t as worrying, they were weirdly existential and completely useless in terms of my goal.
(If you scroll to the bottom of this notebook, you can take a look at some of them.)
It makes sense when you think about it. While movie conversations are dialogue, they’re dialogue with a specific end goal in mind: moving the plot. While it works for a 90-minute film, it’s going to struggle outside of that context. It’s just too inflexible.
How long do you think you’d be able to handle talking to someone who spoke like a movie?
So, I changed the dataset.
In addition to changing the dataset, I changed the model type.
Initially, I was trying to create a transformer model with the aid of this TensorFlow tutorial, but thanks to some outside advice, I realized I would be far better off fine-tuning a pre-trained GPT-2 model.
Simply put, a pre-trained model is a model created by someone else to solve a similar problem. Instead of starting from scratch, you can use their model as a starting point to build on.
The results were significantly better.
For example:
And:
While you may not agree with all of the bot’s opinions, it’s hard to argue that it isn’t more dynamic than my previous attempt, or than many of the bots you interact with online.
My best small GPT-2 model had a final perplexity of 1.1191, while my medium model had a final perplexity of 1.0628.
(Perplexity, for those not aware, is a metric that measures how well a model predicts its data: roughly, how “surprised” it is by each word it sees. The lower the perplexity, the more confident the model is in its predictions, and lower perplexity tends to go hand in hand with more human-like conversation.)
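Concretely, perplexity is just the exponential of the average negative log-probability the model assigned to each token it predicted. A minimal sketch, with made-up token probabilities:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-likelihood
    the model assigned to each token it predicted."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that is perfectly confident (probability 1.0 for every token)
# has the minimum possible perplexity of 1.0
print(perplexity([1.0, 1.0, 1.0]))  # 1.0
# Less confident predictions push perplexity up
print(round(perplexity([0.9, 0.8, 0.95]), 4))
```

A perplexity of 1.06, like my medium model’s, therefore means the model was assigning very high probability to the words it was asked to predict.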
Due to the time constraints, I had to limit the dataset to the first 50,000 entries.
Unfortunately, limiting my dataset to 50,000 entries led to less than optimal performance on some subjects.
Probably not the sort of answers you want a representative of your company giving prospective customers.
With more time, I might have been able to deal with issues like the one above, but they speak to some of the broader challenges of Conversational AI.
But still, think of the possibilities Conversational AI presents:
If you are interested in interacting with my chatbots, you can do so here: