In today’s world, almost every online interaction — from browsing a website to using a mobile app — is powered by the client-server architecture. The client acts as a consumer, while the server functions as a provider — the client requests and consumes services offered by the server. Whether you’re an aspiring developer or preparing for a system design interview, it’s important to grasp how this model works and what challenges come with it.
Let’s dive in.
What Is Client-Server Architecture?
At its core, client-server architecture is a communication model where:
- The client (e.g., a web browser, mobile app, or desktop software) makes requests for data or services.
- The server processes those requests and responds with the appropriate information.
Client and server are connected via the internet or a local network. For example, when you open Instagram on your phone, the app (client) sends a request to Instagram’s server, which responds with the posts for your feed.
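This request-response loop can be sketched end to end with Python’s standard library. The snippet below is a minimal illustration, not production code: it spins up a tiny local HTTP server (standing in for Instagram’s server) on a free port, then acts as the client, requesting data and consuming the response. The `/feed` path and JSON payload are invented for the example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Server side: answers every GET request with a small JSON payload.
class PostsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'{"posts": ["photo1", "photo2"]}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *_):
        pass  # keep the demo quiet

# Bind to port 0 so the OS picks any free port, then serve in the background.
server = HTTPServer(("127.0.0.1", 0), PostsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: request the feed and consume the response.
with urlopen(f"http://127.0.0.1:{server.server_port}/feed") as resp:
    status, payload = resp.status, resp.read().decode()
print(status, payload)  # 200 {"posts": ["photo1", "photo2"]}

server.shutdown()
```

The roles are exactly as described above: the client initiates and consumes, the server waits, processes, and responds.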
How Does the Client Know Which Server to Connect To?
Here’s an important question: How does the client know the exact server to connect to?
Technically, every server on the internet has a unique IP address (like 142.250.190.14). You can connect to a server using its IP address directly, but remembering IP addresses is not easy for humans. That’s where DNS (Domain Name System) steps in.
Role of DNS
DNS is like the internet’s phonebook. It maps human-readable domain names (like google.com) to their corresponding IP addresses.
So when you enter www.google.com into your browser:
- Your computer contacts the DNS server.
- The DNS server returns the IP address of Google’s server.
- Your browser caches the IP address alongside the domain name, honoring a TTL (time to live).
- Your browser then makes a connection to that IP to fetch the website.
If DNS cannot find an IP mapping for the domain, you’ll see an error such as DNS_PROBE_FINISHED_NXDOMAIN or “DNS name could not be resolved.”
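Both outcomes, a successful lookup and an NXDOMAIN-style failure, are easy to observe from code. This sketch uses Python’s `socket.gethostbyname`; `localhost` is used so the success case works even without internet access, and the `.invalid` TLD is reserved precisely so it can never resolve.

```python
import socket

# Successful lookup: the resolver maps a hostname to an IP address.
ip = socket.gethostbyname("localhost")
print("localhost ->", ip)  # typically 127.0.0.1

# Failed lookup: an unknown name raises socket.gaierror -- the programmatic
# counterpart of the browser's DNS_PROBE_FINISHED_NXDOMAIN error page.
try:
    socket.gethostbyname("no-such-host.invalid")  # .invalid never resolves
    resolved = True
except socket.gaierror as err:
    print("resolution failed:", err)
    resolved = False
```

A browser goes through the same resolve-then-connect sequence, just with caching layers in between.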
What Happens When the Server Can’t Handle More Requests?
Imagine a situation where your app becomes really popular, and thousands of users start using it simultaneously — such as during festival seasons or a promotion by a celebrity — resulting in a massive influx of requests. Your server has limited resources like CPU, RAM, and storage.
So what happens if it gets overwhelmed?
Vertical Scaling
One common solution is to scale vertically — i.e., upgrade the server with more CPU power, more RAM, or a bigger disk.
But there’s a catch:
- Vertical scaling has limits.
- It often requires downtime to upgrade hardware or restart the system.
- You can’t keep increasing the server size indefinitely — there’s a physical and architectural limit to how much a single server can handle.
- Scaling up server capacity also incurs higher costs, making it less efficient beyond a certain point.
- It doesn’t solve the problem of a single point of failure.
Boosting Performance with Caching
To reduce load on the server and improve response time, we often use caching.
Caching involves temporarily storing frequently requested data — like product listings or user profiles — in fast-access memory (e.g., Redis, Memcached). This means:
- Less frequent calls to the database.
- Faster response times for clients.
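The caching idea above can be sketched with a tiny in-memory, TTL-based cache in the spirit of Redis or Memcached. Here `fetch_profile` is a hypothetical stand-in for a slow database query; the counter shows how the cache absorbs repeat requests.

```python
import time

cache = {}          # user_id -> (value, expiry timestamp)
TTL_SECONDS = 60    # how long a cached entry stays fresh
db_calls = 0        # counts actual "database" hits

def fetch_profile(user_id):
    """Stand-in for a slow database query."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}

def get_profile(user_id):
    entry = cache.get(user_id)
    if entry is not None:
        value, expires_at = entry
        if time.monotonic() < expires_at:
            return value          # cache hit: served from memory
        del cache[user_id]        # stale entry: evict and refetch
    value = fetch_profile(user_id)  # cache miss: go to the database
    cache[user_id] = (value, time.monotonic() + TTL_SECONDS)
    return value

get_profile(42)  # miss -> one database call
get_profile(42)  # hit  -> no database call
print("database calls:", db_calls)  # 1
```

Real caches add eviction policies (LRU, LFU) and run as separate services so multiple app servers can share them, but the hit/miss/TTL logic is the same.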
However, even with caching, a single server can still become a bottleneck as traffic grows.
The Problem with a Single Server
Using only one server in production introduces multiple risks:
- Scalability limits: You can only scale vertically to a certain point.
- Downtime during upgrades.
- Single Point of Failure (SPOF): If the server crashes, your entire app goes down.
- Unpredictable traffic: Real-world traffic is spiky. A single machine cannot always handle surges.
The solution to these issues lies in horizontal scaling and load balancing, which we’ll explore in the next blog.
Conclusion
Client-server architecture is the backbone of modern web systems. While it starts with one client connecting to one server, you’ll quickly run into challenges as usage scales. Understanding DNS, server limits, and the need for scalability is the first step toward designing resilient, distributed systems.