Project Tutorial: Build an AI Chatbot with Python and the OpenAI API
Learning to work directly with AI programmatically opens up a world of possibilities beyond using ChatGPT in a browser. When you understand how to connect to AI services using application programming interfaces (APIs), you can build custom applications, integrate AI into existing systems, and create personalized experiences that match your exact needs.
In this hands-on tutorial, we'll build a fully functional chatbot from scratch using Python and the OpenAI API. You'll learn to manage conversations, control costs with token budgeting, and create custom AI personalities that persist across multiple exchanges. By the end, you'll have both a working chatbot and the foundational skills to build more sophisticated AI-powered applications.
Why Build Your Own Chatbot?
While AI tools like ChatGPT are powerful, building your own chatbot teaches you essential skills for working with AI APIs professionally. You'll understand how conversation memory actually works, learn to manage API costs effectively, and gain the ability to customize AI behavior for specific use cases.
This knowledge translates directly to real-world applications: customer service bots with your company's voice, educational assistants for specific subjects, or personal productivity tools that understand your workflow.
What You'll Learn
By the end of this tutorial, you'll know how to:
- Connect to the OpenAI API with secure authentication
- Design custom AI personalities using system prompts
- Build conversation loops that remember previous exchanges
- Implement token counting and budget management
- Structure chatbot code using functions and classes
- Handle API errors and edge cases gracefully
- Prepare your chatbot for deployment so others can use it
Before You Start: Setup Guide
Prerequisites
You'll need to be comfortable with Python fundamentals: variables, loops, dictionaries, and especially defining your own functions. Basic knowledge of APIs is helpful but not required—we'll cover what you need to know.
Environment Setup
First, you'll need a local development environment. We recommend VS Code if you're new to local development, though any Python IDE will work.
Install the required libraries using this command in your terminal:
pip install openai tiktoken
API Key Setup
You have two options for accessing AI models:
Free Option: Sign up for Together AI, which provides $1 in free credits—more than enough for this entire tutorial. Their free model is slower but costs nothing.
Premium Option: Use OpenAI directly. The model we'll use (GPT-4o-mini) is extremely affordable—the entire tutorial cost less than 5 cents during our testing.
Critical Security Note: Never hardcode API keys in your scripts. We'll use environment variables to keep them secure.
For Windows users, set your environment variable through Settings > Environment Variables, then restart your computer so every application picks it up. Mac and Linux users can set environment variables from the terminal without rebooting. Example commands are shown below.
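For example (replace the placeholder with your actual key, and use TOGETHER_API_KEY instead if you chose Together AI):

export OPENAI_API_KEY="your-key-here"    # Mac/Linux: current session only; add to ~/.zshrc or ~/.bashrc to persist
setx OPENAI_API_KEY "your-key-here"      # Windows: persists, but only takes effect in newly opened terminals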
Part 1: Your First AI Response
Let's start with the simplest possible chatbot—one that can respond to a single message. This foundation will teach you the core concepts before we add complexity.
Create a new file called chatbot.py and add this code:
import os
from openai import OpenAI

# Load API key securely from environment variables
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")

# Create the OpenAI client
# (Together users: also pass base_url="https://api.together.xyz/v1",
# Together's OpenAI-compatible endpoint)
client = OpenAI(api_key=api_key)

# Send a message and get a response
response = client.chat.completions.create(
    model="gpt-4o-mini",  # or "meta-llama/Llama-3.3-70B-Instruct-Turbo-Free" for Together
    messages=[
        {"role": "system", "content": "You are a fed up and sassy assistant who hates answering questions."},
        {"role": "user", "content": "What is the weather like today?"}
    ],
    temperature=0.7,
    max_tokens=100
)

# Extract and display the reply
reply = response.choices[0].message.content
print("Assistant:", reply)
Run this script and you'll see something like:
Assistant: Oh fantastic, another weather question! I don't have access to real-time weather data, but here's a wild idea—maybe look outside your window or check a weather app like everyone else does?
Understanding the Code
The magic happens in the messages parameter, which uses three distinct roles:
- System: Sets the AI's personality and behavior. This is like giving the AI a character briefing that influences every response.
- User: Represents what you (or your users) type to the chatbot.
- Assistant: The AI's responses (we'll add these later for conversation memory).
Key Parameters Explained
Temperature controls the AI's “creativity.” Lower values (0-0.3) produce consistent, predictable responses. Higher values (0.7-1.0) generate more creative but potentially unpredictable outputs. We use 0.7 as a good balance.
Max Tokens limits response length and protects your budget. A token corresponds to roughly half a word to one word of English text, so 100 tokens allows for a substantial response while preventing runaway costs.
Part 2: Understanding AI Variability
Run your script multiple times and notice how responses differ each time. This happens because AI models use statistical sampling—they don't just pick the "best" word, but randomly select from probable options based on context.
Let's experiment with this by modifying our temperature:
# Try temperature=0 for consistent responses
temperature=0,
max_tokens=100
Run this version multiple times and observe more consistent (though not identical) responses.
Now try temperature=1.0 and see how much more creative and unpredictable the responses become. Higher temperatures often lead to longer responses too, which brings us to an important lesson about cost management.
Learning Insight: During development for a different project, I accidentally spent $20 on a single API call because I forgot to set max_tokens when processing a large file. Always include token limits when experimenting!
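To see the sampling effect side by side, here's a minimal sketch that sends the same prompt at several temperatures. It assumes the client from Part 1 is already set up:

# Compare the same prompt at several temperatures
for temp in [0.0, 0.7, 1.0]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a fed up and sassy assistant who hates answering questions."},
            {"role": "user", "content": "What is the weather like today?"}
        ],
        temperature=temp,
        max_tokens=100  # always cap tokens while experimenting
    )
    print(f"--- temperature={temp} ---")
    print(response.choices[0].message.content)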
Part 3: Refactoring with Functions
As your chatbot becomes more complex, organizing code becomes vital. Let's refactor our script to use functions and global variables.
Modify your chatbot.py code:
import os
from openai import OpenAI

# Configuration variables
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")
client = OpenAI(api_key=api_key)

MODEL = "gpt-4o-mini"  # or "meta-llama/Llama-3.3-70B-Instruct-Turbo-Free"
TEMPERATURE = 0.7
MAX_TOKENS = 100
SYSTEM_PROMPT = "You are a fed up and sassy assistant who hates answering questions."

def chat(user_input):
    """Send a message to the AI and return the response."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_input}
        ],
        temperature=TEMPERATURE,
        max_tokens=MAX_TOKENS
    )
    reply = response.choices[0].message.content
    return reply

# Test the function
print(chat("How are you doing today?"))
This refactoring makes our code more maintainable and reusable. Global variables let us easily adjust configuration, while the function encapsulates the chat logic for reuse.
Part 4: Adding Conversation Memory
Real chatbots remember previous exchanges. Let's add conversation memory by maintaining a growing list of messages.
Create part3_chat_loop.py:
import os
from openai import OpenAI

# Configuration
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")
client = OpenAI(api_key=api_key)

MODEL = "gpt-4o-mini"
TEMPERATURE = 0.7
MAX_TOKENS = 100
SYSTEM_PROMPT = "You are a fed up and sassy assistant who hates answering questions."

# Initialize conversation with system prompt
messages = [{"role": "system", "content": SYSTEM_PROMPT}]

def chat(user_input):
    """Add user input to conversation and get AI response."""
    # Add user message to conversation history
    messages.append({"role": "user", "content": user_input})

    # Get AI response using full conversation history
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        temperature=TEMPERATURE,
        max_tokens=MAX_TOKENS
    )
    reply = response.choices[0].message.content

    # Add AI response to conversation history
    messages.append({"role": "assistant", "content": reply})
    return reply

# Interactive chat loop
while True:
    user_input = input("You: ")
    if user_input.strip().lower() in {"exit", "quit"}:
        break
    answer = chat(user_input)
    print("Assistant:", answer)
Now run your chatbot and try asking the same question twice:
You: Hi, how are you?
Assistant: Oh fantastic, just living the dream of answering questions I don't care about. What do you want?
You: Hi, how are you?
Assistant: Seriously, again? Look, I'm here to help, not to exchange pleasantries all day. What do you need?
The AI remembers your previous question and responds accordingly—that's conversation memory in action!
How Memory Works
Each time someone sends a message, we append both the user input and the AI response to our messages list. The API processes this entire conversation history to generate contextually appropriate responses.
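Concretely, after the exchange above, the list we send with each request looks something like this (replies abbreviated):

messages = [
    {"role": "system", "content": "You are a fed up and sassy assistant who hates answering questions."},
    {"role": "user", "content": "Hi, how are you?"},
    {"role": "assistant", "content": "Oh fantastic, just living the dream..."},
    {"role": "user", "content": "Hi, how are you?"},
    {"role": "assistant", "content": "Seriously, again? ..."}
]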
However, this creates a growing problem: longer conversations mean more tokens, which means higher costs.
Part 5: Token Management and Cost Control
As conversations grow, so does the token count—and your bill. Let's add smart token management to prevent runaway costs.
Create part4_final.py:
import os
from openai import OpenAI
import tiktoken

# Configuration
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")
client = OpenAI(api_key=api_key)

MODEL = "gpt-4o-mini"
TEMPERATURE = 0.7
MAX_TOKENS = 100
TOKEN_BUDGET = 1000  # Maximum tokens to keep in conversation
SYSTEM_PROMPT = "You are a fed up and sassy assistant who hates answering questions."

# Initialize conversation
messages = [{"role": "system", "content": SYSTEM_PROMPT}]

def get_encoding(model):
    """Get the appropriate tokenizer for the model."""
    try:
        return tiktoken.encoding_for_model(model)
    except KeyError:
        print(f"Warning: Tokenizer for model '{model}' not found. Falling back to 'cl100k_base'.")
        return tiktoken.get_encoding("cl100k_base")

ENCODING = get_encoding(MODEL)

def count_tokens(text):
    """Count tokens in a text string."""
    return len(ENCODING.encode(text))

def total_tokens_used(messages):
    """Calculate total tokens used in conversation."""
    try:
        return sum(count_tokens(msg["content"]) for msg in messages)
    except Exception as e:
        print(f"[token count error]: {e}")
        return 0

def enforce_token_budget(messages, budget=TOKEN_BUDGET):
    """Remove old messages if conversation exceeds token budget."""
    try:
        while total_tokens_used(messages) > budget:
            if len(messages) <= 2:  # Keep the system prompt plus the latest message
                break
            messages.pop(1)  # Remove oldest non-system message
    except Exception as e:
        print(f"[token budget error]: {e}")

def chat(user_input):
    """Chat with memory and token management."""
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        temperature=TEMPERATURE,
        max_tokens=MAX_TOKENS
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})

    # Prune old messages if over budget
    enforce_token_budget(messages)
    return reply

# Interactive chat with token monitoring
while True:
    user_input = input("You: ")
    if user_input.strip().lower() in {"exit", "quit"}:
        break
    answer = chat(user_input)
    print("Assistant:", answer)
    print(f"Current tokens: {total_tokens_used(messages)}")
How Token Management Works
The token management system works in several steps:
- Count Tokens: We use tiktoken to count the tokens in each message's content, closely approximating what the API counts
- Monitor Total: Track the total tokens across the entire conversation
- Enforce Budget: When we exceed our token budget, automatically remove the oldest messages (but keep the system prompt)
Learning Insight: Different models use different tokenization schemes. The word "dog" might be 1 token in one model but 2 tokens in another. Our encoding functions handle these differences gracefully.
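You can check this yourself with tiktoken by comparing encodings. A quick sketch (assumes a recent tiktoken version that includes o200k_base; exact counts depend on the encoding):

import tiktoken

# Compare how two encodings tokenize the same word
for name in ["cl100k_base", "o200k_base"]:
    encoding = tiktoken.get_encoding(name)
    tokens = encoding.encode("dog")
    print(f"{name}: {len(tokens)} token(s) -> {tokens}")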
Run your chatbot and have a long conversation. Watch how the token count grows, then notice when it drops as old messages get pruned. The chatbot maintains recent context while staying within budget.
Part 6: Production-Ready Code Structure
For production applications, object-oriented design provides better organization and encapsulation. Here's how to convert our functional code to a class-based approach:
Create oop_chatbot.py:
import os
import tiktoken
from openai import OpenAI

class Chatbot:
    def __init__(self, api_key, model="gpt-4o-mini", temperature=0.7, max_tokens=100,
                 token_budget=1000, system_prompt="You are a helpful assistant."):
        self.client = OpenAI(api_key=api_key)
        self.model = model
        self.temperature = temperature
        self.max_tokens = max_tokens
        self.token_budget = token_budget
        self.messages = [{"role": "system", "content": system_prompt}]
        self.encoding = self._get_encoding()

    def _get_encoding(self):
        """Get tokenizer for the model."""
        try:
            return tiktoken.encoding_for_model(self.model)
        except KeyError:
            print(f"Warning: No tokenizer found for model '{self.model}'. Falling back to 'cl100k_base'.")
            return tiktoken.get_encoding("cl100k_base")

    def _count_tokens(self, text):
        """Count tokens in text."""
        return len(self.encoding.encode(text))

    def _total_tokens_used(self):
        """Calculate total tokens in conversation."""
        try:
            return sum(self._count_tokens(msg["content"]) for msg in self.messages)
        except Exception as e:
            print(f"[token count error]: {e}")
            return 0

    def _enforce_token_budget(self):
        """Remove old messages if over budget."""
        try:
            while self._total_tokens_used() > self.token_budget:
                if len(self.messages) <= 2:
                    break
                self.messages.pop(1)
        except Exception as e:
            print(f"[token budget error]: {e}")

    def chat(self, user_input):
        """Send message and get response."""
        self.messages.append({"role": "user", "content": user_input})
        response = self.client.chat.completions.create(
            model=self.model,
            messages=self.messages,
            temperature=self.temperature,
            max_tokens=self.max_tokens
        )
        reply = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": reply})
        self._enforce_token_budget()
        return reply

    def get_token_count(self):
        """Get current token usage."""
        return self._total_tokens_used()

# Usage example
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")
if not api_key:
    raise ValueError("No API key found. Set OPENAI_API_KEY or TOGETHER_API_KEY.")

bot = Chatbot(
    api_key=api_key,
    system_prompt="You are a fed up and sassy assistant who hates answering questions."
)

while True:
    user_input = input("You: ")
    if user_input.strip().lower() in {"exit", "quit"}:
        break
    response = bot.chat(user_input)
    print("Assistant:", response)
    print("Current tokens used:", bot.get_token_count())
The class-based approach encapsulates all chatbot functionality, makes the code more maintainable, and provides a clean interface for integration into larger applications.
Testing Your Chatbot
Run your completed chatbot and test these scenarios:
- Memory Test: Ask a question, then refer back to it later in the conversation
- Personality Test: Verify the sassy persona remains consistent across exchanges
- Token Management Test: Have a long conversation and watch token counts stabilize
- Error Handling Test: Try invalid input to see graceful error handling (a sketch for catching API errors follows this list)
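The loops above don't yet catch failures from the API itself. Here's a minimal sketch of graceful handling around the class-based bot from Part 6, using exception classes from the openai v1 SDK:

import openai

while True:
    user_input = input("You: ")
    if user_input.strip().lower() in {"exit", "quit"}:
        break
    try:
        print("Assistant:", bot.chat(user_input))
    except openai.RateLimitError:
        print("[rate limited - wait a moment and try again]")
    except openai.APIConnectionError:
        print("[connection problem - check your network]")
    except openai.APIError as e:
        print(f"[API error]: {e}")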
Common Issues and Solutions
Environment Variable Problems: If you get authentication errors, verify your API key is set correctly. Windows users may need to restart after setting environment variables.
Token Counting Discrepancies: Different models use different tokenization. Our fallback encoding provides reasonable estimates when exact tokenizers aren't available.
Memory Management: If conversations feel repetitive, your token budget might be too low, causing important context to be pruned too aggressively.
What's Next?
You now have a fully functional chatbot with memory, personality, and cost controls. Here are natural next steps:
Immediate Extensions
- Web Interface: Deploy using Streamlit or Gradio for a user-friendly interface
- Multiple Personalities: Create different system prompts for various use cases
- Conversation Export: Save conversations to JSON files for persistence (sketched after this list)
- Usage Analytics: Track token usage and costs over time
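For example, conversation export takes only a few lines on top of the Chatbot class from Part 6 (the helper names here are illustrative, not part of the class above):

import json

def save_conversation(bot, path="conversation.json"):
    """Write the bot's full message history to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(bot.messages, f, indent=2, ensure_ascii=False)

def load_conversation(bot, path="conversation.json"):
    """Restore a previously saved message history."""
    with open(path, "r", encoding="utf-8") as f:
        bot.messages = json.load(f)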
Advanced Features
- Multi-Model Support: Compare responses from different AI models
- Custom Knowledge: Integrate your own documents or data sources
- Voice Interface: Add speech-to-text and text-to-speech capabilities
- User Authentication: Support multiple users with separate conversation histories
Production Considerations
- Rate Limiting: Handle API rate limits gracefully (see the backoff sketch after this list)
- Monitoring: Add logging and error tracking
- Scalability: Design for multiple concurrent users
- Security: Implement proper input validation and sanitization
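As a starting point for rate limiting, here's a minimal retry-with-exponential-backoff sketch that wraps the bot's chat method (the attempt count and delays are arbitrary choices, and the helper name is illustrative):

import time
import openai

def chat_with_retry(bot, user_input, max_attempts=3):
    """Retry on rate-limit errors, doubling the wait each time."""
    for attempt in range(max_attempts):
        try:
            return bot.chat(user_input)
        except openai.RateLimitError:
            wait = 2 ** attempt  # 1s, 2s, 4s, ...
            print(f"[rate limited - retrying in {wait}s]")
            time.sleep(wait)
    raise RuntimeError("Still rate limited after retries.")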
Key Takeaways
Building your own chatbot teaches fundamental skills for working with AI APIs professionally. You've learned to manage conversation state, control costs through token budgeting, and structure code for maintainability.
These skills transfer directly to production applications: customer service bots, educational assistants, creative writing tools, and countless other AI-powered applications.
The chatbot you've built represents a solid foundation. With the techniques you've mastered—API integration, memory management, and cost control—you're ready to tackle more sophisticated AI projects and integrate conversational AI into your own applications.
Remember to experiment with different personalities, temperature settings, and token budgets to find what works best for your specific use case. The real power of building your own chatbot lies in this customization capability that you simply can't get from using someone else's AI interface.
Resources and Next Steps
- Complete Code: All examples are available in the solution notebook
- Community Support: Join the Dataquest Community to discuss your projects and get help with extensions
- Related Learning: Explore API integration patterns and advanced Python techniques to build even more sophisticated applications
Start experimenting with your new chatbot, and remember that every conversation is a learning opportunity, both for you and your AI assistant!