Conversational AI

Background

Note: this project was completed in the first half of 2022, before the widespread public availability of the newest generation of Large Language Models (LLMs), like ChatGPT. While the newer models are much more powerful, two things should be kept in mind:

  1. The general idea and process behind it are still very much applicable (only now we’re closer to realizing its potential).
  2. Though current LLMs are impressive, they’re still not full-fledged replacements for humans.
Image courtesy of @hal_jpeg

And now, back to our originally scheduled programming:

Before we delve into Conversational AI, I need to talk to you about something near and dear to my heart.

Me.

Well, more specifically, my work history.

Notice a trend?

I’ve spent the majority of my adult life working in sales in some capacity.

I’ve worked in retail sales, corporate sales, and, most recently, in online sales as a copywriter and digital marketer.

While each sales type has its positives and negatives, there is one common theme to all of them.

Free burnout for everyone!

Granted, attrition is a fact of life in all industries, but it's especially worrying in sales. 

The turnover rate in sales is over 30%, almost 270% greater than the average for all other industries.

And 67% of sales employees are on the verge of burnout at this moment. 

If you have worked in sales for more than a couple of years, odds are you've experienced burnout or know someone who has.

You can find a whole host of suggestions online as to how companies can tackle this challenge. Without fail, though, every single one of them misses what, in my experience, has been the most obvious and best solution to motivating salespeople and preventing burnout: giving them better leads.

Sales, with its constant rejection, is a profession that’s bruising on the ego.

As rejections pile up, salespeople become less passionate about their jobs.

If you give them higher quality leads, they will experience fewer rejections.

Better leads mean more money for sales reps (which creates even happier salespeople) and help a company's bottom line.

Okay, but how do you give them better leads?

You do that through lead nurturing, the process of engaging prospective customers by providing them with appropriate content and information at each stage of the sales funnel with the end goal of earning their business.

Currently, there is a range of lead nurturing strategies companies employ:

  • Email marketing/nurturing
  • Retargeting
  • Personalization
  • Face-to-face interaction
  • Content marketing

And while these strategies have borne some fruit, there is still a lot of potential business left on the table.

And this is where Conversational AI, aka chatbots, comes into play.

Hello, friend.

"Companies are starting to implement chatbots as a customer engagement channel, thus reflecting a potential shift in the way companies interact with their customers, exchange data, and provide services."

The more engaged a prospect is, the warmer they become as a lead when they eventually get passed on to a sales rep.

Current lead engagement channels like email, content marketing, or even web copy are passive for the most part and leave you hoping prospects see what you want them to see as opposed to simply skimming by it.

Others, like one-to-one interactions, are not easily scalable. 

Conversation is more engaging, and chatbots are more easily scalable. 

What’s more, everyone who comes to your site is already interested in your product, to some degree. I mean, why else would they be there? Since customers aren't easy to attain, it’s incumbent upon you to take advantage of that.

Unfortunately, when you mention chatbots to the average person, they usually feel like this:

Please don't touch me.

That’s because most of them adhere to a very rigid if-this-then-that structure that consequently leaves you feeling like you’d be much better off throwing your computer out of the nearest window.


Libraries:

gensim, glob, logging, matplotlib, nltk, numpy, os, pandas, pickle, pyLDAvis, pytorch, random, re, seaborn, shutil, sklearn, spacy, sys, tensorflow, textblob, time, tqdm.notebook, transformers, typing, and wordcloud.

Challenge

Can I use my knowledge of data science to create a chatbot capable of open and dynamic conversation?

In other words, can I make people go from feeling like this:

Still don't want you to touch me.

To this:

Friends?

Data Selection & EDA

Initially, I planned to use the Cornell Movie-Dialogs Corpus, a collection of fictional conversations extracted from movie scripts. 

Unfortunately, for reasons which will become apparent, I had to change course. 

I ended up using a modified topical chat dataset instead. Created by Arnav Sharma, it's a streamlined version of this original Amazon Alexa dataset.

The dataset consists of 8,000+ conversations and 184,000+ messages. It contains a conversation id, a message, and the sentiment of each message, and spans 8 topics. Its intended purpose is to aid in the effort to build a socialbot that can have engaging open-domain conversations with humans.

The topics included are:

  • fashion
  • politics
  • books
  • sports
  • general entertainment
  • music
  • science and technology
  • movies
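
For a sense of the dataset's shape, here's a small sketch that groups messages by conversation and counts them. The three sample rows and the column names (`conversation_id`, `message`, `sentiment`) are stand-ins I made up to match the description above; the real file may label things differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows in the shape described above (not real data).
SAMPLE_CSV = """conversation_id,message,sentiment
1,Do you follow any sports?,curious
1,Mostly football on weekends.,happy
2,Read any good books lately?,curious
"""


def group_by_conversation(csv_file, limit=None):
    """Return {conversation_id: [messages]}, reading at most `limit` rows."""
    conversations = defaultdict(list)
    for i, row in enumerate(csv.DictReader(csv_file)):
        if limit is not None and i >= limit:
            break
        conversations[row["conversation_id"]].append(row["message"])
    return dict(conversations)


conversations = group_by_conversation(io.StringIO(SAMPLE_CSV))
total_messages = sum(len(msgs) for msgs in conversations.values())
print(f"{len(conversations)} conversations, {total_messages} messages")
```

The `limit` argument is one simple way to cap how many rows get read when training time is tight.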

Models

In terms of models, I focused on attention-based ones because of their ability to mimic the way humans understand language.

For example, if a sentence is ten words long, you and I don’t give 10% of our attention to each word in that sentence. We focus our attention on the most important contextual words. Then, we respond accordingly.

Attention-based models do a similar thing.
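
To make the idea concrete, here's a minimal sketch of how attention turns relevance scores into uneven weights. The words and scores are made up for illustration (a real model learns its scores from data), and the function name is my own.

```python
import math


def attention_weights(scores):
    """Turn raw relevance scores into weights that sum to 1 (a softmax)."""
    # Subtract the max score first so exp() can't overflow.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores: higher means "more relevant to the response".
words = ["what", "is", "your", "favorite", "movie"]
scores = [0.5, 0.1, 0.3, 2.0, 2.5]

weights = attention_weights(scores)
for word, weight in zip(words, weights):
    print(f"{word:>8}: {weight:.2f}")

# The word with the highest score gets the largest share of attention.
most_attended = words[weights.index(max(weights))]
```

Instead of every word getting an equal 20%, most of the weight lands on the words with the highest scores.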

Unfortunately, my initial attempts at creating an open-ended chatbot were a little murdery.

How, you ask?

Well, I asked my chatbot the following:

Hey, I was curious as to what my bot's response would be.

And it responded with:

Uh...................

Now, I don’t know about you, but if I ask someone a question that has nothing to do with murder, and they then proceed to respond by highlighting the fact that they are not a murderer, I am 100% going to think that they are, in fact, a murderer. 

While the other answers my initial chatbot gave me weren’t as worrying, they were weirdly existential and completely useless in terms of my goal. 

(If you scroll to the bottom of this notebook, you can take a look at some of them.)

It makes sense when you think about it. While movie conversations are dialogue, they’re dialogue with a specific end goal in mind: moving the plot. While it works for a 90-minute film, it’s going to struggle outside of that context. It’s just too inflexible.

How long do you think you’d be able to handle talking to someone who spoke like a movie?

So, I changed the dataset.

In addition to changing the dataset, I changed the model type.

Initially, I was trying to create a transformer model with the aid of this TensorFlow tutorial, but thanks to some outside advice, I realized I would be far better off fine-tuning a pre-trained GPT-2 model.

Simply put, a pre-trained model is a model created by someone else to solve a similar problem. Instead of starting from scratch, you can use their model as a starting point to build on.
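
As a rough sketch of what that starting point can look like in code, the snippet below loads a pre-trained GPT-2 through the Hugging Face transformers library and formats a conversation into a single training string. The `"gpt2"` checkpoint name, the helper names, and the convention of ending each turn with the end-of-text token are assumptions for illustration, not necessarily what this project used.

```python
def build_training_text(turns, eos_token="<|endoftext|>"):
    """Join conversation turns into one string, ending each turn with EOS."""
    return "".join(turn.strip() + eos_token for turn in turns)


def load_pretrained(checkpoint="gpt2"):
    """Download a pre-trained GPT-2 and its tokenizer (needs `transformers`)."""
    # Imported here so the formatting helper works without the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)
    return tokenizer, model


# Hypothetical two-turn exchange.
example = build_training_text(["Do you like movies?", "I love them!"])
print(example)
```

Fine-tuning then means continuing to train the loaded model on strings like this, rather than starting from random weights.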

The results were significantly better.

For example:

And: 

While you may not agree with all of the bot’s opinions, it’s hard to argue that it isn’t more dynamic than my previous attempt at a bot, or than many of the bots you interact with online.

My best small GPT-2 model had a final perplexity of 1.1191, while my medium model had a final perplexity of 1.0628.

(Perplexity, for those not aware, is a metric that measures how well a model predicts the text it’s shown. The lower the perplexity, the less “surprised” the model is by the correct next word, and the more confident it is in its predictions. Lower perplexity scores tend to correlate with more human-like conversation.)
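
For the curious, here's a minimal sketch of the standard perplexity calculation: the exponential of the average negative log-probability the model gave to each correct token. The probabilities below are invented for illustration and aren't taken from my models.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


# Hypothetical probabilities a model assigned to each correct next token.
confident = [0.9, 0.95, 0.85, 0.9]
unsure = [0.2, 0.1, 0.3, 0.25]

print(f"confident model: {perplexity(confident):.2f}")
print(f"unsure model:    {perplexity(unsure):.2f}")
```

Under this definition, a perplexity of exactly 1 would mean the model gave every correct token a probability of 100%, so 1 is the floor.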

Due to time constraints, I had to limit the dataset to the first 50,000 entries.

Unfortunately, that limit led to less-than-optimal performance on some subjects.

Probably not the sort of answers you want a representative of your company giving prospective customers.

With more time, I might have been able to deal with issues like the one above, but it does speak to some of the challenges of Conversational AI.

  1. AI isn’t a know-it-all. Unless extensively trained on a topic, an AI’s responses can be nonsensical, bizarre, or poorly suited to its use case.
  2. Time and resources. The “basic” version of my model took 10+ hours to train using Google Colab. The other one took twice as long.
  3. Unreasonable expectations. It’s important to remember that AI is AI, not human. Many view AI as a panacea, but as seen with the shortcomings above, it’s far from that. However, if expectations are reasonable, it can be a tremendous tool.

But still, think of the possibilities Conversational AI presents:

  1. Lead nurturing. In this example, I used a conversational dataset from Amazon, but what if a company took the call recordings of its 10 best sales reps, transcribed them, and then trained a Conversational AI model on them? Can you see the possibilities for on-page customer engagement and the quality of your leads?
  2. Better online shopping experiences. Wouldn’t it be nice to have a chatbot give you a contextually appropriate response to your question instead of fighting the urge to throw your computer out the window because you would be better off talking to a brick wall?
  3. Interactive brand messaging. Instead of hoping customers come across your key selling points while skimming your page, you can actively engage them in a way that speaks not only to their needs but also to the values of your brand and how those values align with theirs.

And, Most Importantly...

Incidentally, this guy looks like my brother.

If you are interested in interacting with my chatbots, you can do so here:

Best friends!