When your AI API needs authentication, rate limiting, and logging, the first instinct is to add Express middleware or Next.js route handlers. This works — until you need per-customer rate limits, API key rotation, automatic documentation, or multi-region deployment. At that point, maintaining custom middleware becomes a distraction from your core product. This article compares building API gateway features yourself vs using Zuplo.
What API Gateway Features Do AI APIs Need?
| Feature | DIY Complexity | Zuplo |
|---|---|---|
| Rate limiting | Redis + custom logic, edge cases galore | Built-in policy, 1 config block |
| API key management | Database table, hashing, rotation, revocation | Built-in with developer portal |
| JWT validation | Library + JWKS fetching + error handling | Built-in policy, issuer URL config |
| Request logging | Middleware + log aggregation setup | Built-in analytics |
| API documentation | Swagger UI + OpenAPI spec maintenance | Auto-generated from routes |
| Multi-region | Separate deployments + load balancer | Automatic — 200+ PoPs |
| Caching | Redis + cache-control logic | Built-in response caching policy |
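To make the "API key management" row concrete, here is a minimal sketch of what the DIY side tends to involve: issuing a key, storing only its hash, and handling rotation and revocation. The `ApiKeyRecord` shape, the `sk_` prefix, and the in-memory `keyStore` (standing in for a database table) are assumptions for illustration, not a description of how Zuplo or any particular product implements this.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical record shape; a real system would persist this in a database table.
interface ApiKeyRecord {
  customerId: string;
  keyHash: string; // hex-encoded SHA-256 of the raw key
  revoked: boolean;
}

// In-memory stand-in for the database table, indexed by key hash.
const keyStore = new Map<string, ApiKeyRecord>();

function hashKey(rawKey: string): string {
  return createHash("sha256").update(rawKey).digest("hex");
}

// Issue a new key. The raw key is returned once; only its hash is stored.
function issueKey(customerId: string): string {
  const rawKey = `sk_${randomBytes(24).toString("hex")}`;
  const keyHash = hashKey(rawKey);
  keyStore.set(keyHash, { customerId, keyHash, revoked: false });
  return rawKey;
}

// Resolve a presented key to a customer ID, or null if unknown or revoked.
function verifyKey(rawKey: string): string | null {
  const record = keyStore.get(hashKey(rawKey));
  if (!record || record.revoked) return null;
  return record.customerId;
}

// Mark a key as revoked. Returns false if the key was never issued.
function revokeKey(rawKey: string): boolean {
  const record = keyStore.get(hashKey(rawKey));
  if (!record) return false;
  record.revoked = true;
  return true;
}

// Rotation: issue a replacement for a valid key, then revoke the old one.
function rotateKey(oldRawKey: string): string | null {
  const customerId = verifyKey(oldRawKey);
  if (customerId === null) return null;
  const newRawKey = issueKey(customerId);
  revokeKey(oldRawKey);
  return newRawKey;
}
```

Even this small version leaves out expiry dates, per-key scopes, audit logging, and a grace period during which both old and new keys work, which is where much of the maintenance cost usually shows up.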
The DIY Rate Limiter Problem
Here's a typical custom rate limiter in Express using Redis:
```typescript
// Custom rate limiter: looks simple, gets complicated fast
import type { NextFunction, Request, Response } from "express";
import { createClient } from "redis";

const redis = createClient({ url: process.env.REDIS_URL });
// node-redis v4+ requires an explicit connect before issuing commands
redis.connect().catch(console.error);

export function rateLimiter(limit: number, windowSeconds: number) {
  return async (req: Request, res: Response, next: NextFunction) => {
    const key = `ratelimit:${req.headers["x-api-key"]}`;
    // INCR and EXPIRE are queued in a single MULTI/EXEC transaction.
    // Note: EXPIRE runs on every request, so the TTL is refreshed each time.
    const [count] = await redis
      .multi()
      .incr(key)
      .expire(key, windowSeconds)
      .exec();
    const hits = Number(count);

    res.setHeader("X-RateLimit-Limit", limit);
    res.setHeader("X-RateLimit-Remaining", Math.max(0, limit - hits));

    if (hits > limit) {
      return res.status(429).json({ error: "Rate limit exceeded" });
    }
    next();
  };
}

// Problems: race conditions at high concurrency, no sliding window,
// no per-plan limits, no analytics, no dashboard visibility
```
At first glance this works. But then you need:
- Different limits per plan (free: 10/min, pro: 100/min, enterprise: 1000/min)
- A sliding window instead of a fixed window (a fixed window lets a client send up to double the limit in a short burst that straddles a window boundary)
- A Retry-After header on 429 responses, formatted as defined in RFC 7231
- Bypass for internal service-to-service calls
- A dashboard to see who's hitting limits
- Alert when a customer is consistently near the limit (upsell signal)
Each of these adds 50-200 lines of infrastructure code that doesn't move your product forward.
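As a rough illustration of the first three items, the sketch below combines per-plan limits, a sliding-window log, and a Retry-After value expressed in seconds. It keeps state in process memory, so it only fits a single instance; the names `PLAN_LIMITS`, `checkRateLimit`, and `Decision` are hypothetical, and a production version would more likely keep the timestamps in Redis.

```typescript
type Plan = "free" | "pro" | "enterprise";

// Per-minute limits taken from the plan tiers listed above.
const PLAN_LIMITS: Record<Plan, number> = { free: 10, pro: 100, enterprise: 1000 };

const WINDOW_MS = 60_000;

interface Decision {
  allowed: boolean;
  limit: number;
  remaining: number;
  retryAfterSeconds: number; // 0 when the request is allowed
}

// Sliding-window log: one array of request timestamps (ms) per API key.
const requestLog = new Map<string, number[]>();

function checkRateLimit(apiKey: string, plan: Plan, nowMs: number = Date.now()): Decision {
  const limit = PLAN_LIMITS[plan];
  const windowStart = nowMs - WINDOW_MS;

  // Keep only the timestamps that fall inside the trailing 60-second window.
  const recent = (requestLog.get(apiKey) ?? []).filter((t) => t > windowStart);

  if (recent.length >= limit) {
    requestLog.set(apiKey, recent);
    // A slot frees up once the oldest logged request leaves the window.
    const oldest = recent[0] ?? nowMs;
    const retryAfterMs = oldest + WINDOW_MS - nowMs;
    return {
      allowed: false,
      limit,
      remaining: 0,
      retryAfterSeconds: Math.ceil(retryAfterMs / 1000),
    };
  }

  recent.push(nowMs);
  requestLog.set(apiKey, recent);
  return { allowed: true, limit, remaining: limit - recent.length, retryAfterSeconds: 0 };
}
```

In an Express handler, a denied decision would translate to a 429 response with `res.setHeader("Retry-After", String(decision.retryAfterSeconds))`. Storing every timestamp costs memory proportional to the limit per key, which is the usual trade-off of the log-based sliding window against cheaper counter-based approximations.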
When to Stay DIY
Your API is internal only
If your API only serves your own frontend, rate limiting by IP is sufficient and Upstash Ratelimit + a few lines of code is perfectly adequate. You don't need API key management.
You have a single customer tier
If every customer gets the same limits and there's no external developer access, the complexity of an API gateway is overkill.
You need deeply custom request logic
Zuplo's policies are powerful, but if your rate limiting logic depends on complex reads against your own database schema, custom middleware that talks directly to your database is more flexible.
When Zuplo Wins
- You're building a developer-facing API that external customers integrate with
- You need usage-based billing tied to API consumption
- You want automatic OpenAPI documentation generated from your routes
- You need different rate limits and feature flags per customer tier
- You want multi-region API deployment without managing separate infrastructure
- Your team lacks backend infrastructure experience and wants to ship auth fast
Migration Path
You don't have to choose upfront. The practical approach:
| Stage | Approach |
|---|---|
| MVP (0-100 users) | Upstash Ratelimit middleware, API key in DB, simple middleware |
| Growth (100-1000 users) | Add Zuplo in front of existing backend — zero code changes needed |
| Scale (1000+ users) | Zuplo handles auth, rate limiting, docs; backend focuses on business logic |
Zuplo proxies to your existing backend URL. Adding it is a DNS/proxy change, not a code change. You can adopt it incrementally — route just your public API through Zuplo while keeping internal routes direct.
Zuplo's free tier supports up to 50k requests/month with full API key management and rate limiting. This covers most early-stage AI APIs. Start free and add billing features when customers start asking for usage-based pricing.
Cost Comparison
| Approach | Monthly Cost | Engineering Time |
|---|---|---|
| DIY (Redis + custom auth) | $3-15/month (Redis) | 2-5 days to build, ongoing maintenance |
| Zuplo free tier | $0 | 30 minutes to configure |
| Zuplo Pro | $150/month | 30 minutes to configure |
| Zuplo Enterprise | Custom | Dedicated support |
For most AI startups, the engineering time saved by using Zuplo in the growth phase more than justifies the cost — especially when factoring in edge cases that inevitably appear in DIY rate limiting.