How to Build a Multilingual Customer Support Chatbot with Gemini 2.5 Pro and Firebase
Build a multilingual customer support chatbot with Gemini 2.5 Pro and Firebase — one model, automatic language detection, persistent sessions, zero translation APIs.
Your customer base doesn’t speak one language. Your support chatbot probably does. That gap costs you — in frustrated users, in ticket volume, in customers who just leave. The traditional fix was ugly: spin up separate models per locale, chain translation APIs, watch latency climb, and pray the meaning didn’t get mangled somewhere between English and Thai. There’s a better way now.
Gemini 2.5 Pro, Google’s multimodal model released in 2025, handles multilingual conversations natively, meaning the model understands and responds in the user’s language without a separate translation step in the middle. Pair that with Firebase for session management and a bit of structured prompting, and you can have a production-ready multilingual support chatbot that doesn’t make you want to cry at 2am. This tutorial walks you through the whole thing, from API setup to deployment.
Before we start: this guide assumes you’re comfortable with JavaScript/Node.js and have a Firebase project. The Gemini API calls work the same in Python if that’s your stack — swap the syntax, keep the logic.
What You’ll Build
By the end of this tutorial, you’ll have a customer support chatbot that detects the user’s language automatically, maintains conversation context across a full support session in that language, responds naturally without the robotic feel of machine-translated text, and stores conversation history in Firestore so sessions persist across page reloads. The whole thing runs on a single Gemini 2.5 Pro model — no language-specific routing, no translation middleware eating your budget.
What You Need Before Starting
You’ll need a Google Cloud account with the Gemini API enabled, a Firebase project with Firestore enabled, Node.js 18 or higher, and the @google/generative-ai SDK plus the Firebase Admin SDK. Your Gemini API key lives in Google AI Studio or Google Cloud Console. For Firebase, grab your service account credentials JSON from Project Settings. Keep both out of your repository — environment variables, always.
Step 1 — Project Setup and Dependencies
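Create your project directory and install the essentials. The code in this guide uses ES module import syntax, so set the package type to module as well. The uuid package is included because Step 6 uses it to generate session IDs:
npm init -y
npm pkg set type=module
npm install @google/generative-ai firebase-admin express dotenv uuid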
Create a .env file at your project root with your credentials. Never hardcode these — if you push an API key to GitHub, you will have a bad day within hours, not days.
GEMINI_API_KEY=your_gemini_api_key_here
FIREBASE_PROJECT_ID=your_firebase_project_id
GOOGLE_APPLICATION_CREDENTIALS=./serviceAccountKey.json
Pro tip ✅
Add serviceAccountKey.json and .env to your .gitignore immediately, before you write a single line of application code. Future you will thank present you.
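Since everything now depends on three environment variables, it can help to fail fast at startup when one is missing rather than discover it on the first customer message. Here is a minimal sketch of that check. The assertEnv helper is a hypothetical name, not part of dotenv or any SDK.

```javascript
// Hypothetical helper: throws if any required variable is missing or blank.
// `env` defaults to process.env and is a parameter only to make testing easy.
function assertEnv(requiredNames, env = process.env) {
  const missing = requiredNames.filter(
    (name) => typeof env[name] !== 'string' || env[name].trim() === ''
  );
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
}

// Call this right after dotenv.config() in your startup code, for example:
// assertEnv(['GEMINI_API_KEY', 'FIREBASE_PROJECT_ID', 'GOOGLE_APPLICATION_CREDENTIALS']);
```

The error message lists variable names only, never values, so it is safe to log.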
Step 2 — Initialize Gemini and Firebase
Create src/config.js to handle both initializations cleanly:
import { GoogleGenerativeAI } from '@google/generative-ai';
import admin from 'firebase-admin';
import dotenv from 'dotenv';

dotenv.config();

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);

admin.initializeApp({
  credential: admin.credential.applicationDefault(),
  projectId: process.env.FIREBASE_PROJECT_ID,
});

const db = admin.firestore();

export const model = genAI.getGenerativeModel({
  model: 'gemini-2.5-pro',
  generationConfig: {
    temperature: 0.7,
    topP: 0.95,
    maxOutputTokens: 1024,
  },
});

export { db };
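The temperature: 0.7 setting gives you responses that feel natural without going off-script. For customer support, you don’t want the model getting creative; if you’re seeing hallucinated return policies, drop this to 0.4. The maxOutputTokens: 1024 cap keeps responses concise, since support answers shouldn’t be essays. One caveat: Gemini 2.5 Pro is a reasoning model, and its internal thinking tokens appear to count against the same output budget, so if replies come back truncated or empty, raise the cap before changing anything else.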
Note 💡
Gemini 2.5 Pro’s context window is substantially larger than previous generations, but for customer support you rarely need more than the last 10-15 exchanges. Cap your conversation history in Firestore retrieval to keep costs predictable.
Step 3 — Build the Multilingual System Prompt
This is where most tutorials go wrong. They write a system prompt in English and expect multilingual magic to happen. The model can handle it, but you get much better results when you explicitly instruct language behavior. Here’s the system prompt that does the heavy lifting:
You are a helpful customer support assistant for [Company Name].
LANGUAGE RULES:
- Detect the language of the user's first message automatically.
- Respond ALWAYS in the same language the user writes in.
- If the user switches languages mid-conversation, switch with them immediately.
- Never translate or explain what language you are using — just use it naturally.
- Maintain the same tone and formality level appropriate for that language's cultural context.
SUPPORT RULES:
- Help users with order tracking, returns, product questions, and account issues.
- If you cannot resolve an issue, collect the user's contact details and tell them a human agent will follow up within 24 hours.
- Never invent policies, prices, or product details you are not certain of.
- Keep responses concise — two to four sentences unless more detail is genuinely needed.
CONTEXT: You have access to the full conversation history in this session.
The language rules section is doing serious work here. The instruction to match formality for cultural context matters — Japanese support interactions follow different conventions than Brazilian Portuguese ones, and a flat “be polite” instruction misses that entirely.
Pro tip ✅
Add your actual company policies to the system prompt as a reference block, something like: RETURN POLICY: 30 days, receipt required, no exceptions for digital goods. This dramatically reduces hallucinated policies and gives the model something concrete to cite.
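If your policies live in a config file or database rather than in the prompt text itself, you can assemble the reference block at startup. The sketch below assumes a plain object of policy names and policy text. The buildSystemPrompt name and the POLICY REFERENCE heading are illustrative choices, not anything Gemini requires.

```javascript
// Appends a policy reference block to a base system prompt.
// `policies` is assumed to be a plain object: { 'Return policy': '30 days, ...' }
function buildSystemPrompt(basePrompt, policies) {
  const entries = Object.entries(policies);
  if (entries.length === 0) return basePrompt;

  const block = entries
    .map(([name, text]) => `${name.toUpperCase()}: ${text}`)
    .join('\n');

  return `${basePrompt}\n\nPOLICY REFERENCE (the only source of truth for policies):\n${block}`;
}
```

Keeping the policies as data means a policy change is a config update instead of a prompt rewrite.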
Step 4 — Firestore Session Management
Persistent context is what separates a real support chatbot from a parlor trick. Create src/session.js:
import { db } from './config.js';

const SESSION_HISTORY_LIMIT = 12; // keep last 12 exchanges

export async function getSessionHistory(sessionId) {
  const doc = await db.collection('chat_sessions').doc(sessionId).get();
  if (!doc.exists) return [];
  return doc.data().history || [];
}

export async function saveSessionHistory(sessionId, history) {
  const trimmedHistory = history.slice(-SESSION_HISTORY_LIMIT * 2);
  await db.collection('chat_sessions').doc(sessionId).set({
    history: trimmedHistory,
    updatedAt: new Date().toISOString(),
  }, { merge: true });
}

export async function clearSession(sessionId) {
  await db.collection('chat_sessions').doc(sessionId).delete();
}
The slice(-SESSION_HISTORY_LIMIT * 2) line trims to the last 24 messages (12 user + 12 model turns). The limit is multiplied by 2 because each exchange is stored as two entries, one user turn and one model turn. Adjust this number based on your typical support conversation length: complex technical support might need 20 exchanges, while simple FAQ chatbots can get away with 6.
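One edge case with trimming by message count: if the stored array ever has an odd length, the trimmed window can begin on a model turn, and a chat history is generally expected to open with a user turn. Here is a small sketch of a trim that guards against that. The trimHistory name is hypothetical, and it assumes each entry has a role field as in the code above.

```javascript
// Keeps at most `maxExchanges` user/model pairs and makes sure the
// result starts with a user turn.
function trimHistory(history, maxExchanges = 12) {
  if (maxExchanges <= 0) return []; // slice(-0) would return the whole array
  const trimmed = history.slice(-maxExchanges * 2);
  // Drop leading model turns left over from an uneven cut
  while (trimmed.length > 0 && trimmed[0].role !== 'user') {
    trimmed.shift();
  }
  return trimmed;
}
```

You could call this in place of the bare slice inside saveSessionHistory.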
Step 5 — The Core Chat Handler
Create src/chatbot.js — this is the main piece that ties everything together:
import { model } from './config.js';
import { getSessionHistory, saveSessionHistory } from './session.js';

const SYSTEM_PROMPT = `[paste your system prompt from Step 3 here]`;

export async function handleSupportMessage(sessionId, userMessage) {
  const history = await getSessionHistory(sessionId);

  const chat = model.startChat({
    history: history,
    systemInstruction: SYSTEM_PROMPT,
  });

  const result = await chat.sendMessage(userMessage);
  const responseText = result.response.text();

  const updatedHistory = [
    ...history,
    { role: 'user', parts: [{ text: userMessage }] },
    { role: 'model', parts: [{ text: responseText }] },
  ];

  await saveSessionHistory(sessionId, updatedHistory);
  return responseText;
}
Notice that history gets passed directly into startChat. Gemini’s chat API maintains the conversation context within the call, but you need to persist it yourself between calls — that’s what Firestore is handling. The model doesn’t remember previous sessions on its own; your session management layer is what makes it feel like it does.
Warning ⚠️
Don’t pass the entire Firestore history object directly to startChat without validating the format first. Gemini’s history expects a strict alternating user/model turn structure. If your history array somehow has two consecutive user messages (which can happen if a previous API call failed mid-save), you’ll get a cryptic error. Add a validation step that checks turn order before passing to the model.
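One way to do that validation is to repair the history rather than reject it. The sketch below is one possible policy, not the only reasonable one: it skips malformed entries, keeps the most recent message when a role repeats, and drops a trailing user turn that never got a reply. The normalizeHistory name is hypothetical.

```javascript
// True if the entry looks like { role: 'user' | 'model', parts: [{ text: '...' }] }
function isValidTurn(turn) {
  return (
    turn != null &&
    (turn.role === 'user' || turn.role === 'model') &&
    Array.isArray(turn.parts) &&
    turn.parts.some((p) => p != null && typeof p.text === 'string' && p.text.length > 0)
  );
}

// Returns a new array that starts with a user turn and strictly alternates.
function normalizeHistory(history) {
  const cleaned = [];
  for (const turn of history) {
    if (!isValidTurn(turn)) continue;
    const last = cleaned[cleaned.length - 1];
    if (!last) {
      if (turn.role === 'user') cleaned.push(turn); // history must open with a user turn
    } else if (turn.role === last.role) {
      cleaned[cleaned.length - 1] = turn; // repeated role: keep the most recent one
    } else {
      cleaned.push(turn);
    }
  }
  // A trailing user turn has no reply; drop it so the next message
  // does not create two user turns in a row.
  if (cleaned.length > 0 && cleaned[cleaned.length - 1].role === 'user') {
    cleaned.pop();
  }
  return cleaned;
}
```

In handleSupportMessage you would run the Firestore result through normalizeHistory before handing it to startChat. Repairing silently hides the underlying save failure, so consider logging whenever the cleaned length differs from the original.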
Step 6 — Express API Layer
Wrap everything in a simple API that your frontend can hit:
import express from 'express';
import { handleSupportMessage } from './chatbot.js';
import { clearSession } from './session.js';
import { v4 as uuidv4 } from 'uuid';

const app = express();
app.use(express.json());

app.post('/api/chat', async (req, res) => {
  try {
    const { message, sessionId } = req.body;
    if (!message || typeof message !== 'string') {
      return res.status(400).json({ error: 'Message is required' });
    }
    // Reuse the client's session if it sent one; otherwise start a new one
    const activeSessionId = sessionId || uuidv4();
    const response = await handleSupportMessage(activeSessionId, message);
    res.json({
      response,
      sessionId: activeSessionId,
    });
  } catch (error) {
    console.error('Chat error:', error);
    res.status(500).json({ error: 'Something went wrong. Please try again.' });
  }
});

app.delete('/api/chat/:sessionId', async (req, res) => {
  // Express 4 does not catch rejections from async handlers, so handle them here
  try {
    await clearSession(req.params.sessionId);
    res.json({ success: true });
  } catch (error) {
    console.error('Clear session error:', error);
    res.status(500).json({ error: 'Could not clear the session.' });
  }
});

app.listen(3000, () => console.log('Support chatbot running on port 3000'));
The frontend keeps the sessionId in localStorage and passes it with every message. On the first message there is nothing stored yet, so no sessionId is sent and the server creates one and returns it in the response. The client stores that value and sends it back with subsequent messages. That is your persistent session in a couple of lines of frontend logic.
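Here is roughly what that client logic could look like. This is a sketch under a few assumptions: the supportSessionId storage key and the sendSupportMessage name are made up for illustration, and the storage and fetch implementations are parameters only so the function can be exercised outside a browser.

```javascript
const SESSION_KEY = 'supportSessionId'; // hypothetical localStorage key

// Sends one message to /api/chat, reusing the stored sessionId if there is one.
async function sendSupportMessage(
  message,
  storage = globalThis.localStorage,
  fetchImpl = globalThis.fetch
) {
  const sessionId = storage.getItem(SESSION_KEY); // null on the first message
  const body = sessionId ? { message, sessionId } : { message };

  const res = await fetchImpl('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`Chat request failed with status ${res.status}`);
  }

  const data = await res.json();
  storage.setItem(SESSION_KEY, data.sessionId); // remember it for the next message
  return data.response;
}
```

In the browser you would simply call sendSupportMessage(text) and render the returned string. To start a fresh conversation, remove the stored key and call the DELETE endpoint.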
Step 7 — Test Your Multilingual Setup
Before deploying, test the language detection with a few curl calls. The model should switch languages naturally without any explicit language parameter from you:
# Test Spanish
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hola, quiero saber el estado de mi pedido #12345"}'

# Test Japanese (same session - paste sessionId from first response)
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "注文の状態を確認したいのですが", "sessionId": "your-session-id-here"}'

# Test French with a language switch mid-session
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Bonjour, je voudrais retourner un article", "sessionId": "your-session-id-here"}'
If the model responds in the correct language each time, your system prompt is working. If you’re getting English responses to non-English inputs, the system instruction isn’t being applied — double-check that you’re passing systemInstruction to startChat and not burying it in the history array.
Pro tip ✅
Test with languages that use different scripts — Arabic (right-to-left), Japanese (no spaces between words), Hindi (Devanagari script). If those work cleanly, your token handling and response formatting are solid. If Arabic comes back scrambled, you likely have a text encoding issue somewhere in your middleware layer, not a model problem.
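If you want to automate part of that check, a rough smoke test is to confirm the reply uses the same writing system as the input. The sketch below uses Unicode script properties in regular expressions. It is a script check, not language detection, so it cannot tell Spanish from French (both Latin), but it will catch an English reply to an Arabic or Hindi message. The helper names are hypothetical.

```javascript
// Writing systems to look for; extend this list for the markets you serve.
const SCRIPTS = {
  Latin: /\p{Script=Latin}/u,
  Arabic: /\p{Script=Arabic}/u,
  Devanagari: /\p{Script=Devanagari}/u,
  Kana: /[\p{Script=Hiragana}\p{Script=Katakana}]/u,
  Han: /\p{Script=Han}/u,
};

// Returns the set of script names that appear at least once in the text.
function scriptsIn(text) {
  return new Set(Object.keys(SCRIPTS).filter((name) => SCRIPTS[name].test(text)));
}

// True if the reply contains at least one script that the user's message used.
function replySharesScript(userMessage, reply) {
  const replyScripts = scriptsIn(reply);
  return [...scriptsIn(userMessage)].some((name) => replyScripts.has(name));
}
```

A failed check is a prompt to go read the transcript, not proof of a bug, since a correct reply can legitimately mix in Latin-script product names or order numbers.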
Step 8 — Prompt Variants for Different Support Scenarios
The base setup works, but you’ll want specialized prompt configurations for different ticket types. Here are three you can drop in as system prompt variants:
# Variant 1: E-commerce returns focus
You are a returns specialist for [Store Name]. Your only job is to guide users through the return process. You have a 30-day return window policy. Digital downloads are non-refundable. Physical items need original packaging. Respond in the user's language. Collect order number and reason for return. Once collected, tell the user a prepaid label will be emailed within 2 hours.
# Variant 2: Technical support with escalation logic
You are a Level 1 technical support agent for [Product Name]. Respond in the user's language. Troubleshoot in this order: (1) ask user to restart/refresh, (2) check if the issue is account-specific by asking them to try a different browser, (3) ask them to clear cache. If the issue persists after these three steps, collect their email and device/OS details, then tell them a Level 2 engineer will contact them. Do not attempt further diagnosis beyond these steps.
# Variant 3: High-volume FAQ deflection
You are a support assistant for [Company]. Respond in the user's language. Your goal is to answer common questions using only this information: [paste your FAQ content here]. If the user's question is not covered by this information, say clearly that you cannot answer that specific question and offer to connect them with a human agent. Never guess or infer answers not explicitly in the FAQ content.
The escalation logic in Variant 2 is particularly useful because it prevents the model from going too deep into technical rabbit holes. Without that structure, support bots tend to generate increasingly elaborate (and sometimes wrong) troubleshooting steps. Constraining the diagnostic tree keeps the bot useful without making it a liability.
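To switch between these variants at runtime, you need some signal for which one applies. The sketch below assumes your frontend or ticketing system supplies a ticketType string, which is a hypothetical field not present in the API from Step 6, and falls back to the general prompt for anything unrecognized. The prompt bodies are placeholders for the full text above.

```javascript
// Placeholders: paste the full variant text from this step into each entry.
const PROMPT_VARIANTS = {
  returns: '[Variant 1: e-commerce returns prompt]',
  technical: '[Variant 2: technical support prompt]',
  faq: '[Variant 3: FAQ deflection prompt]',
};

// Picks a variant by ticket type, or the fallback when the type is unknown.
function selectSystemPrompt(ticketType, fallbackPrompt) {
  return Object.hasOwn(PROMPT_VARIANTS, ticketType)
    ? PROMPT_VARIANTS[ticketType]
    : fallbackPrompt;
}
```

The Object.hasOwn check matters because the ticket type comes from the client: a plain property lookup would happily return inherited properties for inputs like "constructor".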
Avoid 🚫
Don’t use vague instructions like “be helpful” or “assist customers professionally” as your primary guidance. The model will comply — by doing whatever seems plausible. Specific constraints and defined workflows consistently outperform general instructions for support use cases.
Deploying to Production
For production, a few things need attention beyond the basic setup. Rate limiting is non-negotiable — without it, one angry customer with a script can eat your entire API budget in an afternoon. Add express-rate-limit and set a sensible cap per IP, something like 30 requests per minute. Add input sanitization to strip anything that looks like a prompt injection attempt before it reaches the model. Keep your system prompt out of client-side code entirely — it should only ever live server-side.
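To make the per-IP cap concrete, here is a dependency-free sketch of a fixed-window limiter written as Express-style middleware. It is a stand-in to illustrate the idea, not the express-rate-limit API. In production the library is the better choice, since this version keeps counters in process memory, never evicts old entries, and does not share state across instances.

```javascript
// Fixed-window limiter: at most `max` requests per `windowMs` per client IP.
// `now` is injectable only so the window can be tested without waiting.
function createRateLimiter({ windowMs = 60_000, max = 30, now = Date.now } = {}) {
  const hits = new Map(); // ip -> { count, windowStart }

  return function rateLimit(req, res, next) {
    const time = now();
    const entry = hits.get(req.ip);

    // First request from this IP, or the previous window has expired
    if (!entry || time - entry.windowStart >= windowMs) {
      hits.set(req.ip, { count: 1, windowStart: time });
      return next();
    }

    entry.count += 1;
    if (entry.count > max) {
      return res.status(429).json({ error: 'Too many requests. Please slow down.' });
    }
    return next();
  };
}

// Usage with the app from Step 6:
// app.use('/api/chat', createRateLimiter({ windowMs: 60_000, max: 30 }));
```

If the app runs behind a proxy or load balancer, req.ip will be the proxy address unless Express's trust proxy setting is configured, which would put every customer in the same bucket.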
Firebase Functions is the natural deployment target if you’re already in the Google ecosystem. The cold start latency on Functions can be noticeable for chat applications, so either configure minimum instances or use Cloud Run if you need consistent response times. Firestore’s real-time listeners also let you build a conversation dashboard for your support team to monitor live sessions — useful for catching edge cases your prompts haven’t handled yet.
Pro tip ✅
Log every conversation to Firestore with a resolved: false flag by default. Build a simple dashboard that shows unresolved sessions older than 10 minutes, since those are your escalation candidates that the bot couldn’t close. Reviewing a week’s worth of those conversations will tell you exactly what to add to your system prompt next.
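The filtering behind such a dashboard is simple once the session documents are loaded. The sketch below assumes each session object carries the resolved flag described above plus the ISO-string updatedAt field written in Step 4. The findEscalationCandidates name is hypothetical.

```javascript
const ESCALATION_AGE_MS = 10 * 60 * 1000; // 10 minutes

// Returns unresolved sessions whose last update is older than the threshold.
// `now` is a millisecond timestamp, injectable for testing.
function findEscalationCandidates(sessions, now = Date.now()) {
  return sessions.filter(
    (session) =>
      session.resolved === false &&
      now - Date.parse(session.updatedAt) > ESCALATION_AGE_MS
  );
}
```

For large collections you would push this filter into a Firestore query instead of loading everything, but the in-memory version is enough to prototype the dashboard.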
What to Build Next
The setup above gets you a working multilingual support chatbot. What makes it a good one over time is iteration on the system prompt based on real conversation data, and integration with your actual backend — order management systems, account databases, ticket systems. Gemini’s function calling lets you give the model tools to look up real order status instead of asking users to hold while a human checks. That’s the next logical step: a bot that can actually answer “where’s my order” with a real answer, in whatever language the customer asked.
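The server side of function calling mostly comes down to a dispatcher: the model asks for a function by name with arguments, your code runs it and hands the result back. The sketch below shows only that dispatch step, with a made-up getOrderStatus tool and fake order data. It assumes the model's call arrives as an object with name and args fields and that the result goes back as a functionResponse part, so check both shapes against the SDK version you are using.

```javascript
// Stand-in for a real order management system (made-up data).
const FAKE_ORDERS = new Map([
  ['12345', { status: 'shipped', carrier: 'ExampleShip' }],
]);

// Hypothetical tool implementations, keyed by the function name the model calls.
const toolHandlers = new Map([
  [
    'getOrderStatus',
    ({ orderId }) => FAKE_ORDERS.get(String(orderId)) ?? { error: 'Order not found' },
  ],
]);

// Runs one function call ({ name, args }) and wraps the result so it can be
// sent back to the model as a function response.
function runFunctionCall(call) {
  const handler = toolHandlers.get(call.name);
  const response = handler
    ? handler(call.args ?? {})
    : { error: `Unknown function: ${call.name}` };
  return { functionResponse: { name: call.name, response } };
}
```

Returning an error object instead of throwing lets the model explain the failure to the customer in their own language.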
The multilingual piece, once set up this way, largely takes care of itself. The model handles the language detection and switching; your job is to make sure the underlying business logic it’s working with is accurate. A chatbot that confidently gives wrong information in six languages is worse than one that only speaks English. Get the facts right first, then let the model handle the translation.


