System Design 101 — Part 2: High-Level Architecture of a URL Shortener

Aug 15, 2025

In the last part, we set the stage for our URL shortener journey. We talked about latency vs. throughput, load balancing, caching, the CAP theorem, and how scalability isn’t just about throwing more servers at a problem.

Now it’s time to roll up our sleeves and move from theory to architecture.
Before we write a single line of code, we need a blueprint; otherwise, we’re just building castles in the sand.

Step 1 — The Requirements

Every good system starts with clear requirements. Let’s define ours.

Functional Requirements

Shorten a given URL.
Redirect to the original URL when given the short link.
Track click metrics (optional for now).

Non-Functional Requirements

Low latency — nobody wants to wait half a second for a redirect.
High availability — the service should be up all the time.
Scalable — should handle millions of requests/day.
Fault tolerant — system should gracefully recover from failures.
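
To make "millions of requests/day" concrete, here is a rough back-of-the-envelope sketch. The figure of 10 million requests per day and the 3x peak factor are assumptions picked for illustration, not numbers from this series.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rps(requests_per_day):
    """Average requests per second for a given daily volume."""
    return requests_per_day / SECONDS_PER_DAY


def peak_rps(requests_per_day, peak_factor=3):
    """Rough peak estimate; peak_factor is an assumed multiplier, not a measurement."""
    return average_rps(requests_per_day) * peak_factor


daily = 10_000_000  # assumed traffic level
print(f"average: {average_rps(daily):.0f} req/s, peak: {peak_rps(daily):.0f} req/s")
```

At that assumed volume the average is on the order of a hundred requests per second, which helps put the later scaling steps into perspective.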

Step 2 — The Big Picture (Architecture)

Here’s what our high-level architecture looks like:

Clients → API Gateway / Load Balancer → Application Layer → Cache (Redis) → Database
↓ (optional) Analytics Service
↓ (optional) CDN

Flow Example:

Shortening a URL
- Client sends a POST request with the original URL.
- App generates a short code and stores it in the DB.
- Response returns short.ly/abc123.
Redirecting
- Client hits short.ly/abc123.
- Check Redis cache. If found → return redirect immediately.
- If not found → fetch from DB, store in cache, return redirect.
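
The two flows above can be sketched in a few lines. This is a minimal illustration, not a production design: the dicts `db` and `cache` stand in for the database and Redis, and the names `shorten`, `resolve`, and the `u<number>` code scheme are placeholders made up for this example.

```python
import itertools

# Hypothetical in-memory stand-ins: one dict for the database, one for Redis.
db = {}
cache = {}
_next_id = itertools.count(1)


def shorten(original_url):
    """Store the URL under a new short code and return the short link."""
    code = f"u{next(_next_id)}"  # placeholder code scheme, for illustration only
    db[code] = original_url
    return f"short.ly/{code}"


def resolve(code):
    """Return the original URL for a code, checking the cache before the DB."""
    if code in cache:            # cache hit: answer immediately
        return cache[code]
    url = db.get(code)           # cache miss: fall back to the database
    if url is not None:
        cache[code] = url        # populate the cache for the next request
    return url
```

The first `resolve` call for a code reads from `db` and fills `cache`; later calls for the same code never touch `db`.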

Step 3 — Choosing the Tech

Here’s where trade-offs matter.

Backend Language:
Pick for raw speed (Go, Rust) or for developer familiarity (Node.js, Python, Java).
Database:
- Relational (PostgreSQL/MySQL) — easy to start, strong consistency.
- NoSQL (Cassandra, DynamoDB) — great for huge scale, flexible schema.
Cache:
Redis — blazingly fast for lookups of popular URLs.
Hosting:
Cloud (AWS, GCP, Azure) with auto-scaling and load balancing.

Step 4 — Short Code Generation

Three common strategies:

Base62 Encoding
Convert an auto-increment ID to a string using 62 characters (A–Z, a–z, 0–9).
Hashing
Hash the URL with MD5 or SHA-256, then take the first few characters of the digest.
Handle collisions by rehashing with a salt or by appending a random character.
Random String Generation
Generate a random string of fixed length and check for duplicates before saving.
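
Here is a minimal sketch of all three strategies side by side. The function names, the 7-character default length, and the alphabet ordering (digits, then lowercase, then uppercase) are assumptions made for this example; any fixed ordering of the 62 characters works as long as it stays consistent.

```python
import hashlib
import secrets
import string

# 62 characters: 0-9, a-z, A-Z (ordering is an arbitrary choice for this sketch)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(n):
    """Encode a non-negative integer ID (e.g. an auto-increment key) as Base62."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def hash_code(url, length=7, salt=""):
    """Take the first `length` hex characters of SHA-256(salt + url).

    On a collision, call again with a different salt.
    """
    digest = hashlib.sha256((salt + url).encode("utf-8")).hexdigest()
    return digest[:length]


def random_code(existing, length=7):
    """Draw random Base62 strings until one is not already in `existing`."""
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if code not in existing:
            return code
```

For a sense of scale, 7 Base62 characters give 62^7 possible codes, roughly 3.5 trillion. In a real service the duplicate check in `random_code` would be a database lookup or a unique-constraint insert rather than a set membership test.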

Pro Tip: Avoid predictable IDs in public URLs. Sequential codes, such as a plain Base62-encoded counter, let anyone enumerate and scrape your links.

Step 5 — Redirects Done Right

Check the Cache First
Cache misses go to the DB. After fetching, store in Redis for next time.
HTTP 301 vs 302
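- 301 (Permanent) → Good for stable URLs. Browsers may cache it, so repeat clicks can skip our servers and go uncounted.
- 302 (Temporary) → Use if the target might change or if every click should reach our servers.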
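
As a small sketch of how that choice could surface in code, the helper below builds the status code and `Location` header for either kind of redirect. The name `build_redirect` and the `permanent` flag are hypothetical, and wiring the result into a web framework is left out.

```python
from http import HTTPStatus


def build_redirect(target_url, permanent=False):
    """Return (status_code, headers) for a redirect to target_url."""
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return int(status), {"Location": target_url}
```

Defaulting to the temporary redirect is a design choice for this sketch: it is the easier one to change later, since clients are less likely to have cached it.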

Step 6 — Scaling from Day One

If our shortener takes off, here’s how we scale:

Horizontal Scaling — More app servers behind a load balancer.
Database Sharding — Split data based on short code ranges.
Geo-Distributed Caching — Keep data close to users worldwide.
CDN Edge Redirects — Let a CDN like Cloudflare handle redirects for ultra-low latency.
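
To illustrate range-based sharding, here is a minimal sketch that routes a short code to a shard by its first character. The three-shard layout and the shard names are assumptions for this example. Note that this particular split is uneven (10 digits versus 26 letters per shard), so a real layout would pick its boundaries more carefully.

```python
import bisect

# Hypothetical layout: each shard owns the codes whose first character is at or
# after its start character (ASCII order: digits < uppercase < lowercase).
SHARD_STARTS = ["0", "A", "a"]
SHARD_NAMES = ["shard-0", "shard-1", "shard-2"]


def shard_for(code):
    """Return the name of the shard responsible for a Base62 short code."""
    if not code:
        raise ValueError("empty short code")
    idx = bisect.bisect_right(SHARD_STARTS, code[0]) - 1
    if idx < 0:
        raise ValueError(f"unexpected first character: {code[0]!r}")
    return SHARD_NAMES[idx]
```

With this layout, `shard_for("9k2")` lands on `shard-0`, `shard_for("Zx1")` on `shard-1`, and `shard_for("abc123")` on `shard-2`.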

Step 7 — Fault Tolerance

Replication — Keep DB replicas in multiple regions.
Circuit Breakers — Stop cascading failures.
Graceful Degradation — If cache is down, redirect still works via DB.
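
The last two ideas combine naturally. Below is a simplified sketch of a circuit breaker guarding the cache, with the database as the fallback path. The class, its thresholds, and the `cache_get` / `db_get` callables are all hypothetical, and the cache is assumed to signal an outage by raising `ConnectionError`.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; allows a retry after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow(self):
        if self.opened_at is None:
            return True
        # After the cool-down, let an attempt through to probe the dependency.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def resolve_with_fallback(code, cache_get, db_get, breaker):
    """Try the cache when the breaker allows it; otherwise, or on error, use the DB."""
    if breaker.allow():
        try:
            url = cache_get(code)
            breaker.record_success()
            if url is not None:
                return url
        except ConnectionError:
            breaker.record_failure()
    return db_get(code)
```

While the breaker is open, requests skip the failing cache entirely and go straight to the database, so redirects keep working without each one waiting on a dead connection.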

What’s Next

In Part 3, we’ll go deep into low-level design:

Database schema
API endpoints
How to track analytics
Security considerations (rate limiting, abuse prevention)

The goal? By the end of this series, you’ll have the mental toolkit to design any scalable service — not just a URL shortener.

If you enjoyed this, share it with your fellow devs — and stay tuned for the next part. This is just the beginning. 🚀

CodStak
