Creating Conversational AI Chatbots with Huggingface Transformers in Python

Learn how to use the Huggingface Transformers library to generate conversational responses with pre-trained DialoGPT models in Python.

Chatbots have become very popular in recent years, and as interest in using them for business grows, researchers have made steady progress in advancing conversational AI.

In this tutorial, we will use the Huggingface Transformers library to generate conversational responses with a pre-trained DialoGPT model.

DialoGPT is a large-scale, tunable neural conversational response generation model trained on 147 million conversations extracted from Reddit. The benefit is that you can fine-tune it on your own dataset and get better performance than training from scratch.

First, let’s install transformers (the code below also imports PyTorch, so install it as well if you don’t have it):

$ pip3 install transformers torch

Open a new Python file or notebook and do the following:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# model_name = "microsoft/DialoGPT-large"
model_name = "microsoft/DialoGPT-medium"
# model_name = "microsoft/DialoGPT-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

There are three versions of DialoGPT: small, medium, and large. Larger models generally produce better responses, but if you’re running this on your own machine, the small or medium model should fit in memory without problems. You can also use Google Colab to try the large one.

Use greedy search to generate responses

In this section, we’ll use the greedy search algorithm to generate responses. That is, at each time step we pick the token with the highest probability.
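
To make the idea concrete before using the real model, here is a minimal sketch of greedy decoding over a toy next-token table. The table, its probabilities, and the greedy_decode helper are hypothetical and only illustrate the "pick the most probable token at every step" rule; they are not taken from DialoGPT.

```python
# Toy next-token table (hypothetical numbers, not from DialoGPT):
# maps the previous token to a probability distribution over next tokens.
NEXT_TOKEN_PROBS = {
    "<start>": {"I'm": 0.6, "You": 0.4},
    "I'm": {"not": 0.7, "rich": 0.3},
    "not": {"rich": 0.8, "happy": 0.2},
    "rich": {"<eos>": 1.0},
    "happy": {"<eos>": 1.0},
    "You": {"can't": 1.0},
    "can't": {"<eos>": 1.0},
}


def greedy_decode(start="<start>", max_steps=10):
    """Pick the single most probable next token at every step."""
    tokens = []
    current = start
    for _ in range(max_steps):
        probs = NEXT_TOKEN_PROBS[current]
        # argmax over the candidate next tokens
        current = max(probs, key=probs.get)
        if current == "<eos>":
            break
        tokens.append(current)
    return tokens


print(" ".join(greedy_decode()))  # I'm not rich
```

Because the choice at every step is deterministic, the same context always produces the same continuation.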

Let’s write code to chat with our AI using greedy search:

# chatting 5 times with greedy search
for step in range(5):
    # take user input
    text = input(">> You:")
    # encode the input and add end of string token
    input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")
    # concatenate new user input with chat history (if there is)
    bot_input_ids = torch.cat([chat_history_ids, input_ids], dim=-1) if step > 0 else input_ids
    # generate a bot response
    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=1000,
        pad_token_id=tokenizer.eos_token_id,
    )
    #print the output
    output = tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    print(f"DialoGPT: {output}")

Let’s explain the core of this code:

  • We start by taking input from the user for the chat.
  • We encode the text into input_ids using the DialoGPT tokenizer; we also append the end-of-string token and return the result as a PyTorch tensor.
  • If this is the first turn of the chat, we feed input_ids directly to the model for generation. Otherwise, we append it to the chat history using the torch.cat() method.
  • After that, we use the model.generate() method to generate the chatbot response.
  • Finally, since the returned output is also a tokenized sequence, we decode it using tokenizer.decode() and set skip_special_tokens to True so that we don’t see any annoying special tokens such as <|endoftext|>. Also, since the model returns the whole sequence, we skip the previous chat history and print only the newly generated response.
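
The history bookkeeping in the last step can be sketched with plain Python lists standing in for tensors. The token ids below are made up for illustration (only 50256, the id of <|endoftext|> in the GPT-2 vocabulary used by DialoGPT, is a real value), and the generated list simulates what model.generate() would return.

```python
# Hypothetical token ids standing in for real tokenizer output.
EOS = 50256  # id of <|endoftext|> in the GPT-2 vocabulary that DialoGPT uses

history = [[10, 11, EOS, 20, 21, EOS]]       # previous turns, shape (1, 6)
new_user_input = [[30, 31, 32, EOS]]         # current user turn, shape (1, 4)

# equivalent of torch.cat([...], dim=-1): join along the last dimension
bot_input = [history[0] + new_user_input[0]]  # shape (1, 10)

# pretend generate() returned the prompt followed by three new tokens and EOS
generated = [bot_input[0] + [40, 41, 42, EOS]]

# equivalent of chat_history_ids[:, bot_input_ids.shape[-1]:][0]
prompt_length = len(bot_input[0])
reply_ids = generated[0][prompt_length:]
print(reply_ids)  # [40, 41, 42, 50256]
```

The full generated sequence becomes the chat history for the next turn, while only the sliced-off tail is decoded and shown to the user.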

Here’s an example conversation with the bot:

>> You:How can you be rich so quickly?
DialoGPT: I'm not rich, I'm just a rich man.
>> You:Ok, you don't make sense, how old are you ?
DialoGPT: I'm not a rich man, I'm just a rich man.
>> You:How can you be happy ?
DialoGPT: I'm not happy, I'm just a rich man.
>> You:But they say money buys happiness!
DialoGPT: Money buys happiness, but not happiness itself.
>> You:You don't make sense again!
DialoGPT: I'm not happy, I'm just a rich man.

You’ll see that the model repeats a lot of its responses, because they have the highest probability and greedy search selects them every time.

By default, model.generate() uses greedy search when no other parameters are set. In the next sections, we’ll add some arguments to this method to see if we can improve the generation.


Use beam search to generate responses

Beam search reduces the risk of missing a high-probability sequence by keeping the num_beams most likely hypotheses at each time step and then choosing the sequence with the highest overall probability. The following code uses beam search to generate chatbot responses:

# chatting 5 times with beam search
for step in range(5):
    # take user input
    text = input(">> You:")
    # encode the input and add end of string token
    input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")
    # concatenate new user input with chat history (if there is)
    bot_input_ids = torch.cat([chat_history_ids, input_ids], dim=-1) if step > 0 else input_ids
    # generate a bot response
    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=1000,
        num_beams=3,
        early_stopping=True,
        pad_token_id=tokenizer.eos_token_id
    )
    #print the output
    output = tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    print(f"DialoGPT: {output}")

When num_beams is set to 3 in the model.generate() method, we keep the 3 most likely hypotheses at each time step and expand them to find the sequence with the highest overall probability. Setting num_beams to 1 is the same as greedy search.

Here’s an example conversation with the chatbot using beam search:
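
The difference between the two strategies can be seen on a toy example. The sketch below is a simplified beam search over a hypothetical two-step probability table; it ignores details such as end-of-sequence handling and length normalization that a real implementation has to deal with, and the STEP_PROBS table and beam_search helper are illustrative names, not part of the Transformers API.

```python
import math

# Hypothetical probabilities: the best first token ("A") does not lead to
# the best overall sequence.
STEP_PROBS = {
    "<start>": {"A": 0.5, "B": 0.4, "C": 0.1},
    "A": {"x": 0.4, "y": 0.3, "z": 0.3},
    "B": {"x": 0.9, "y": 0.1},
    "C": {"x": 1.0},
}


def beam_search(num_beams, num_steps=2):
    """Keep the `num_beams` best partial sequences at every step."""
    beams = [(["<start>"], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(num_steps):
        candidates = []
        for tokens, score in beams:
            for token, prob in STEP_PROBS[tokens[-1]].items():
                candidates.append((tokens + [token], score + math.log(prob)))
        # keep only the highest-scoring hypotheses
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:num_beams]
    best_tokens, best_score = beams[0]
    return best_tokens[1:], math.exp(best_score)


greedy_tokens, greedy_prob = beam_search(num_beams=1)
beam_tokens, beam_prob = beam_search(num_beams=3)
print(greedy_tokens, round(greedy_prob, 2))  # ['A', 'x'] 0.2
print(beam_tokens, round(beam_prob, 2))      # ['B', 'x'] 0.36
```

With a single beam the search commits to "A" and ends up with a sequence of probability 0.2, while three beams keep "B" alive long enough to find the sequence with probability 0.36.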

>> You:How can I be rich?
DialoGPT: You can't.
>> You:Why not?
DialoGPT: Because you can't.
>> You:Then how am I supposed to live well?
DialoGPT: You can't.
>> You:So basically, you want me to be a poor guy?
DialoGPT: No, I want you to be a rich guy.
>> You:Ok how to be rich, then?
DialoGPT: I don't know.

Generate responses using sampling

In the previous sections, we used beam search and greedy search to generate the highest-probability sequence. That is ideal for tasks with predictable outputs, such as machine translation or text summarization. However, it is not the best option for open-ended generation, as in chatbots.

For better generation, we need to introduce some randomness by sampling from a wide range of candidate sequences according to their probabilities:

# chatting 5 times with sampling
for step in range(5):
    # take user input
    text = input(">> You:")
    # encode the input and add end of string token
    input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")
    # concatenate new user input with chat history (if there is)
    bot_input_ids = torch.cat([chat_history_ids, input_ids], dim=-1) if step > 0 else input_ids
    # generate a bot response
    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=1000,
        do_sample=True,
        top_k=0,
        pad_token_id=tokenizer.eos_token_id
    )
    #print the output
    output = tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    print(f"DialoGPT: {output}")

This time, we set do_sample to True to enable sampling, and we set top_k to 0 to indicate that we sample from all possible tokens; we’ll discuss top_k later.
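
As a rough illustration of what sampling changes, the following sketch compares the greedy choice with repeated random draws from a single next-token distribution. The tokens and probabilities are hypothetical, and the fixed seed is only there to make the run repeatable.

```python
import random
from collections import Counter

# Hypothetical next-token distribution (illustrative numbers only).
probs = {"You": 0.5, "I": 0.3, "They": 0.15, "Bananas": 0.05}

# Greedy search always returns the same token.
greedy_token = max(probs, key=probs.get)

# Sampling draws a token in proportion to its probability, so less likely
# tokens are still picked some of the time.
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = rng.choices(list(probs), weights=list(probs.values()), k=1000)
counts = Counter(samples)

print(greedy_token)  # You
print(counts.most_common())
```

The counts follow the probabilities roughly, which is where the variety in sampled responses comes from, and also why an unlikely token occasionally slips in.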

Here’s a chat with these parameters:

>> You:Hello, how can I be rich?
DialoGPT: You can't
>> You:Really? then how others get rich?
DialoGPT: They work as money artists
>> You:How to be a money artist?
DialoGPT: I don't know, I don't have enough knowledge on artists
>> You:Alright, thank you!
DialoGPT: No problem, if you need tips pm me.
>> You:Great, you're my savior!
DialoGPT: Thank you for helping a stranger out like me!

There are clearly some improvements. However, sampling over a wide range of low-probability tokens can lead to random generation (as you can see in the last response).

To further improve it, we can:

  • Lower the sampling temperature, which reduces the likelihood of choosing low-probability words and increases the likelihood of choosing high-probability ones.
  • Use top-k sampling instead of sampling from all possible tokens, which discards low-probability words before one is chosen.

# chatting 5 times with Top K sampling & tweaking temperature
for step in range(5):
    # take user input
    text = input(">> You:")
    # encode the input and add end of string token
    input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")
    # concatenate new user input with chat history (if there is)
    bot_input_ids = torch.cat([chat_history_ids, input_ids], dim=-1) if step > 0 else input_ids
    # generate a bot response
    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=1000,
        do_sample=True,
        top_k=100,
        temperature=0.75,
        pad_token_id=tokenizer.eos_token_id
    )
    #print the output
    output = tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    print(f"DialoGPT: {output}")

Now, we set top_k to 100 to sample from the 100 most likely tokens, sorted in descending order of probability. We also set temperature to 0.75 (the default is 1.0) to give high-probability tokens a better chance of being picked. A temperature approaching 0.0 behaves like greedy search, while a temperature approaching infinity behaves like completely random sampling. Here’s a conversation with these parameters:

>> You:How can I get rich?
DialoGPT: I'd also like to know this.
>> You:So you don't know it?
DialoGPT: I know it, but it doesn't matter.
>> You:Really? money matters, man!
DialoGPT: Well, money is money and I gotta get my money.
>> You:Yeah how to get them?
DialoGPT: Well, that's a question. I'd like to know more, just for curiosity's sake, to know what other people have said about how I should get rich.
>> You:Some robots like you told me you have to buy gold, silver or stocks.
DialoGPT: I think my wallet is empty.

As you can see, it’s much better now. Feel free to tweak the temperature and top_k parameters and see if you can improve it.
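
To see what these two parameters do to a distribution, here is a small sketch in plain Python. The logits are hypothetical scores for four tokens, and the helper names softmax_with_temperature and top_k_filter are illustrative; this mirrors the idea behind the parameters, not the library’s internal code.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def top_k_filter(probs, k):
    """Keep the k most probable entries, zero the rest, and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    kept_mass = sum(probs[i] for i in keep)
    return [probs[i] / kept_mass if i in keep else 0.0 for i in range(len(probs))]


logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for four tokens
default = softmax_with_temperature(logits, temperature=1.0)
sharper = softmax_with_temperature(logits, temperature=0.75)
filtered = top_k_filter(sharper, k=2)

print([round(p, 3) for p in default])
print([round(p, 3) for p in sharper])
print([round(p, 3) for p in filtered])
```

Lowering the temperature moves probability mass toward the top token, and the top-k filter removes the tail entirely so that only the k best candidates can be sampled.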

Nucleus sampling

Nucleus sampling, or top-p sampling, samples from the smallest possible set of tokens whose cumulative probability exceeds the parameter p that we set.
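
A minimal sketch of that selection rule, assuming a hypothetical next-token distribution: tokens are ranked by probability and added to the nucleus until their cumulative probability reaches p. The top_p_filter helper is an illustrative name, and the exact handling of the boundary token may differ from the library’s implementation.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    # renormalize so the kept probabilities sum to 1
    return {token: prob / cumulative for token, prob in kept.items()}


# Hypothetical distribution (illustrative numbers only).
probs = {"gold": 0.5, "stocks": 0.3, "silver": 0.15, "bananas": 0.05}
nucleus = top_p_filter(probs, p=0.9)
print(sorted(nucleus))  # ['gold', 'silver', 'stocks']
```

Unlike top-k, the size of the nucleus adapts to the distribution: a confident prediction keeps only a few tokens, while a flat distribution keeps many.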

Here’s an example of using top-p sampling:

# chatting 5 times with nucleus sampling & tweaking temperature
for step in range(5):
    # take user input
    text = input(">> You:")
    # encode the input and add end of string token
    input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")
    # concatenate new user input with chat history (if there is)
    bot_input_ids = torch.cat([chat_history_ids, input_ids], dim=-1) if step > 0 else input_ids
    # generate a bot response
    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=1000,
        do_sample=True,
        top_p=0.95,
        top_k=0,
        temperature=0.75,
        pad_token_id=tokenizer.eos_token_id
    )
    #print the output
    output = tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    print(f"DialoGPT: {output}")

We set top_k to 0 to disable top-k sampling, but you can also use both methods together, which tends to work better. Here’s a chat:

>> You:How can I be rich fast?
DialoGPT: Inflation and free money
>> You:lol, but that takes time, I need fast money.
DialoGPT: Your post makes no sense. You just need a lot of money to be rich fast
>> You:Ok how to get a lot of money?
DialoGPT: Buy gold and then trade it for money.
>> You:Hmm, I got your point, thanks.
DialoGPT: No problem. You have to have money to buy gold, but its only good for trading it.
>> You:Yeah, I understand, thank you so much.
DialoGPT: Ok, let me know if you need help

Now the chatbot clearly makes sense in many cases.

Now let’s add some code to generate several chatbot responses at once, and then choose which response to include in the next input:

# chatting 5 times with nucleus & top-k sampling & tweaking temperature & multiple
# sentences
for step in range(5):
    # take user input
    text = input(">> You:")
    # encode the input and add end of string token
    input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")
    # concatenate new user input with chat history (if there is)
    bot_input_ids = torch.cat([chat_history_ids, input_ids], dim=-1) if step > 0 else input_ids
    # generate a bot response
    chat_history_ids_list = model.generate(
        bot_input_ids,
        max_length=1000,
        do_sample=True,
        top_p=0.95,
        top_k=50,
        temperature=0.75,
        num_return_sequences=5,
        pad_token_id=tokenizer.eos_token_id
    )
    # print the outputs
    for i in range(len(chat_history_ids_list)):
        output = tokenizer.decode(chat_history_ids_list[i][bot_input_ids.shape[-1]:], skip_special_tokens=True)
        print(f"DialoGPT {i}: {output}")
    choice_index = int(input("Choose the response you want for the next input: "))
    chat_history_ids = torch.unsqueeze(chat_history_ids_list[choice_index], dim=0)

I’ve set num_return_sequences to 5 to return 5 sentences at a time, and we have to choose the one that will be included in the next input. Here’s how it goes:

>> You:Hello, friend.
DialoGPT 0: Hey there!
DialoGPT 1: I am your friend, and I am also your friend.
DialoGPT 2: Hello. :
DialoGPT 3: Hiya! How are you?
DialoGPT 4: Hello. I like your username.
Choose the response you want for the next input: 3
>> You:Fine, thanks, what about you?
DialoGPT 0: Good. Not much to do.
DialoGPT 1: Not much, you?
DialoGPT 2: Fine, thanks.
DialoGPT 3: I'm okay, thank you for asking.
DialoGPT 4: I'm fine too.
Choose the response you want for the next input: 2
>> You:Can you tell me how can I get rich quickly?
DialoGPT 0: Yes, I can!
DialoGPT 1: Yes, just look up how to get rich on Google.
DialoGPT 2: Just keep doing your thing. You'll get rich.
DialoGPT 3: No, I can't.
DialoGPT 4: no, i have no money
Choose the response you want for the next input: 1
>> You:Yeah, but I thought you have an answer...
DialoGPT 0: I'm getting married in the morning.
DialoGPT 1: How do you get rich?
DialoGPT 2: I think there are many good answers, but I could send you one, if you want.
DialoGPT 3: I will, thanks!
DialoGPT 4: I thought you have an answer.
Choose the response you want for the next input: 2
>> You:Yes, please send me.
DialoGPT 0: You can send me link.
DialoGPT 1: I am very interested
DialoGPT 2: I sent you a PM
DialoGPT 3: I'll send you a PM
DialoGPT 4: I am always interested in new ideas.
Choose the response you want for the next input: 2

Summary

Hopefully, this tutorial has helped you understand how to generate text with DialoGPT and similar models. For more information on text generation, I highly recommend reading the How to generate text with Transformers guide.

I’ll let you tweak the parameters and see if you can make the bot perform better.

In addition, you can combine it with text-to-speech and speech-to-text tutorials to build a virtual assistant like Alexa, Siri, Cortana, and more.