Conceptual cheat sheet: How to create a chatbot with OpenAI's API

Farez Rahman
I've spent the last couple of weeks diving into AI and learning how to build a chatbot that uses my own content. I outlined the general steps as a "cheat sheet" for myself, so I'm sharing it here.

If you want to build a chatbot with your own content, here are the general steps you would take.

1. Setup
- Create an OpenAI account and get your API key.
- Create a vector database, e.g. PostgreSQL with pgvector, or Redis.
- Create a table to hold all your embeddings. The vector column has to match the embedding model's output size (e.g. OpenAI's text-embedding-ada-002 produces 1,536 dimensions).

2. Process your documents
- Gather all your documents and convert them into text. Convert line breaks and tabs to spaces (OpenAI's recommendation).
- Break the document content into chunks. Each chunk should be small-ish, e.g. 2,000 words (OpenAI's limits are measured in "tokens" rather than words, so look those up for more specific settings).
- Use OpenAI's API to create an embedding for each chunk.
- Save each chunk, together with its embedding, into the vector database.

3. Create a similarity search function
- Use a similarity search function to retrieve the most relevant chunk for a given question. OpenAI recommends cosine similarity.
- Test it out and tweak the parameters.

4. Create your user-facing app
- Create a form to ask questions and display responses.

5. Write your prompts
There are 3 types of prompts:
- Information gathering prompt: A prompt for OpenAI to keep asking the user questions until it has all the information it needs.
- Summarise question prompt: Another prompt to summarise the conversation into a single question.
- Answer prompt: A prompt that combines the final form of the question with the relevant chunk of information.

6. Get chatting
- Host your code and database somewhere.
- When someone asks a question, send the question and the Information gathering prompt to the OpenAI Chat API.
- Repeat until OpenAI responds that it has enough info.
- Send the chat transcript to OpenAI with the Summarise question prompt.

7. Answer the question
- Use the OpenAI API to convert the summarised question into an embedding.
- Search the vector database for the chunk most likely to contain the answer, using the similarity search function you created earlier.
- Use the Answer prompt to send the summarised question and the chunk to the OpenAI Completion API to get the answer.
- Wait for OpenAI to respond and display the answer to the user.

That's generally it!

Would love your feedback, especially from those who have already been building chatbots, and especially if you have a different method or process for building one.

Thanks!

---

PS There's a lot more to it, of course, and if you want more of this, my newsletter is the best place to find me: https://farez.substack.com.
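
As a rough illustration of steps 2, 3 and 7 above, here is a minimal Python sketch of chunking, cosine similarity search and answer prompt assembly. It is a sketch under assumptions, not a definitive implementation: every function name is made up for this example, `toy_embed` is a hypothetical stand-in for the OpenAI embeddings call (it just hashes words into a small vector so the code runs offline), the search is done in memory rather than in a vector database, and the prompt wording is only a placeholder.

```python
import math
import re
import zlib


def normalise_whitespace(text):
    # Step 2: turn line breaks, tabs and repeated spaces into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def chunk_words(text, max_words=2000):
    # Step 2: split the cleaned text into chunks of at most max_words words.
    words = normalise_whitespace(text).split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def toy_embed(text, dims=64):
    # HYPOTHETICAL stand-in for a real embeddings API call.
    # Counts words into hashed buckets; a real embedding is a dense vector
    # returned by the model (e.g. 1,536 floats).
    vec = [0.0] * dims
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(word.encode("utf-8")) % dims] += 1.0
    return vec


def cosine_similarity(a, b):
    # Step 3: cosine of the angle between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def most_similar_chunk(question, chunks, embed=toy_embed):
    # Step 7: embed the question and pick the closest chunk.
    # A real setup would store chunk embeddings once and let the
    # database do this search instead of re-embedding every chunk.
    question_vec = embed(question)
    return max(chunks, key=lambda c: cosine_similarity(question_vec, embed(c)))


def build_answer_prompt(question, chunk):
    # Step 7: combine the summarised question with the retrieved chunk.
    # The wording here is an assumption, not a recommended prompt.
    return (
        "Answer the question using only the context below.\n\n"
        f"Context: {chunk}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

To use it, call `chunk_words` on each document, pass a question and the resulting chunks to `most_similar_chunk`, then send the output of `build_answer_prompt` to the API. Swapping `toy_embed` for a function that calls a real embedding model is the only change the search itself needs.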