ClusterIP Services in Practice

Introduction: ClusterIP Services for Multi-Tier Applications

In the previous lesson, you learned how Services use labels and selectors to find and route traffic to Pods. You created a simple single-tier application with one Deployment and one Service, and you verified they were properly wired together using endpoints. Now we're ready to tackle something more realistic: a multi-tier application where different components need to communicate with each other internally. In this lesson, you'll build a three-tier shop application with a frontend, backend, and database. You'll create ClusterIP Services for each tier and learn how they enable secure internal communication. You'll also master advanced ClusterIP patterns including headless Services, session affinity, multi-port exposure, and topology-aware routing. By the end, you'll understand why ClusterIP is the foundation of microservices architectures in Kubernetes.

ClusterIP Fundamentals: Internal-Only Communication

ClusterIP is the default Service type in Kubernetes, and it's designed for one specific purpose: enabling communication between Pods inside the cluster. When you create a ClusterIP Service, Kubernetes assigns it a stable virtual IP address that stays the same for as long as the Service exists, no matter how often the Pods behind it are replaced. This IP is only reachable from within the cluster; external clients cannot connect to it directly. Think of it like an internal phone extension in an office building: employees can call each other using extensions, but people outside the building can't dial those numbers directly.

Why does internal-only access matter? First, it's a security best practice. Not every component of your application should be exposed to the internet. Your database, for example, should only be accessible to your backend services, not to the outside world. Second, it simplifies your architecture. When services communicate internally, you don't need to worry about firewalls, load balancers, or public IP addresses. Everything stays within the cluster's private network.

ClusterIP is different from other Service types you'll learn about later. NodePort Services expose your application on a specific port on every cluster node, making it accessible from outside the cluster. LoadBalancer Services create an external load balancer (usually in a cloud provider) that routes internet traffic to your Pods. These types are useful when you need external access, but for internal communication between application tiers, ClusterIP is the right choice. It's simpler, more secure, and exactly what you need for microservices talking to each other.

Here's a key insight: even if you eventually expose your application to the internet using a LoadBalancer or Ingress, you'll still use ClusterIP Services for the internal connections between your application's components. For example, your frontend might be exposed via a LoadBalancer, but it will communicate with your backend through a ClusterIP Service. This layered approach keeps your internal architecture secure while still allowing controlled external access where needed.

Multi-Tier Architecture: The Shop Application Pattern

Let's talk about the application we're going to build. We're creating a simplified online shop with three distinct tiers, each with a specific responsibility. This pattern mirrors how real-world applications are structured, where different concerns are separated into different services. Understanding this architecture will help you see why ClusterIP Services are so important.

The frontend tier is what users interact with — the web UI. In our example, we're using nginx to serve static web pages, but in a real application, this might be a React or Vue.js application. The frontend needs to communicate with the backend to fetch data and perform actions, but it should never talk directly to the database. This separation is crucial for security and maintainability.

The backend tier contains your business logic and API endpoints. It receives requests from the frontend, processes them, and interacts with the database as needed. In our example, we're also using nginx as a placeholder, but in production, this would be something like a Node.js, Python, or Java application. The backend is the only tier that should have direct access to the database. This creates a security boundary — even if someone compromises the frontend, they shouldn't be able to directly access your data.

The database tier stores your application's data. We're using PostgreSQL, a popular relational database. Ideally, the database should only accept connections from the backend tier, never from the frontend or from outside the cluster. ClusterIP Services are a first step in this direction: they keep the database internal to the cluster, preventing any external access from the internet. However, it's important to understand that ClusterIP alone doesn't restrict which Pods within the cluster can access the database Service — by default, any Pod in the cluster can reach any ClusterIP Service. If you need to enforce that only the backend can talk to the database, you would use Kubernetes NetworkPolicies, which we'll cover in a future lesson.

The communication flow looks like this: in a complete system, a user's browser would talk to the frontend (though this external access requires LoadBalancer, NodePort, or Ingress — topics we'll cover in future lessons, and not something we're implementing here). What we will implement is the internal communication: the frontend makes API calls to the backend (using the backend's ClusterIP Service), and the backend queries the database (using the database's ClusterIP Service). Each arrow in this internal chain is a ClusterIP Service providing stable, internal connectivity. This pattern scales beautifully — you can add more tiers (like a caching layer or a message queue) and connect them the same way.

Creating the Multi-Tier Deployments

Now let's build our three-tier application step by step. We'll start by creating the Deployments for each tier. Remember from the previous lesson: the key is making sure we use consistent labels so our Services can find the Pods later. Let's begin with the frontend tier.

Here's the frontend Deployment. Save this as deployment-frontend.yaml:

This first section defines the Deployment's metadata. We're naming it frontend-deployment and giving it labels app: shop and tier: frontend. These labels on the Deployment itself are mainly for organization — the important labels are coming next in the Pod template.

The replicas: 2 means we want two frontend Pods for redundancy. The selector.matchLabels tells the Deployment which Pods it should manage. This must match the labels we're about to define in the Pod template.

Here's where we define the actual Pod specification. The template.metadata.labels section is crucial: these are the labels that will be stamped on every Pod created by this Deployment. We're using app: shop to indicate this is part of our shop application, and tier: frontend to show it's the frontend tier. The container listens on port 80.
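Putting the three parts described above together, the manifest might look like the following sketch. The names and labels come from the description; the container name and the image tag are assumptions, since the lesson only says the frontend uses nginx.

```yaml
# deployment-frontend.yaml (a sketch assembled from the description above)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-deployment
  labels:              # labels on the Deployment object itself (organizational)
    app: shop
    tier: frontend
spec:
  replicas: 2          # two frontend Pods for redundancy
  selector:
    matchLabels:       # must match template.metadata.labels below
      app: shop
      tier: frontend
  template:
    metadata:
      labels:          # stamped on every Pod; Services select on these
        app: shop
        tier: frontend
    spec:
      containers:
        - name: nginx            # container name is an assumption
          image: nginx:1.27      # tag is an assumption; any recent nginx image should do
          ports:
            - containerPort: 80  # nginx listens on port 80
```

The backend and database Deployments would presumably follow the same shape, with tier: backend and tier: database labels and, for the database, a PostgreSQL image instead of nginx.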

Creating ClusterIP Services for Each Tier

With our Deployments running, we can now create the ClusterIP Services that will enable communication between the tiers. Each Service will use label selectors to find and route traffic to the appropriate Pods.

Let's start with the frontend Service. Save this as service-frontend.yaml:

We're naming the Service frontend-svc. The naming convention with the -svc suffix is common but not required — you can name Services anything you want. The labels here are on the Service object itself, mainly for organization.

Here's the critical part: type: ClusterIP explicitly sets the Service type (though ClusterIP is the default, it's good practice to be explicit). The selector section defines which Pods this Service will route traffic to. It's looking for Pods with both app: shop AND tier: frontend, which matches our frontend Deployment's Pod labels.

The ports section defines how traffic flows. The name: http is a descriptive label for this port mapping. The port: 80 is the port the Service listens on, so other Pods connect to the Service on this port. The targetPort: 80 is the port on the selected frontend Pods to which traffic is forwarded. In this case, they're the same (80), but they don't have to be.
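Assembled from the metadata, spec, and ports descriptions above, the frontend Service might look like this sketch. Every value shown is taken from the text; only the explicit protocol line is added, and TCP is the default anyway.

```yaml
# service-frontend.yaml (a sketch assembled from the description above)
apiVersion: v1
kind: Service
metadata:
  name: frontend-svc
  labels:              # labels on the Service object itself (organizational)
    app: shop
    tier: frontend
spec:
  type: ClusterIP      # the default, stated explicitly
  selector:            # a Pod must carry BOTH labels to be selected
    app: shop
    tier: frontend
  ports:
    - name: http
      protocol: TCP    # default protocol, shown for clarity
      port: 80         # port the Service listens on
      targetPort: 80   # port on the selected Pods
```

The backend and database Services would follow the same pattern under the names backend-svc and database-svc, selecting tier: backend and tier: database respectively.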

Service Discovery and DNS Names

Now that we have Services created, let's talk about how Pods actually use them to communicate. You might think Pods would need to know the Cluster IP addresses (like 10.96.123.45), but that would be inconvenient and error-prone. Instead, Kubernetes provides automatic DNS-based service discovery. Every Service gets a DNS name that Pods can use to connect to it, and Kubernetes automatically resolves that name to the Service's Cluster IP.

The DNS name for a Service follows a simple pattern: <service-name>.<namespace>.svc.cluster.local, where cluster.local is the default cluster domain. Since we're working in the default namespace, our Services have these DNS names: frontend-svc.default.svc.cluster.local, backend-svc.default.svc.cluster.local, and database-svc.default.svc.cluster.local. However, when the client Pod is in the same namespace as the Service, you can use the short form: just frontend-svc, backend-svc, or database-svc. The DNS search path that Kubernetes configures in each Pod expands these short names to the full DNS names. From a different namespace, include at least the namespace, for example backend-svc.default.

This DNS-based approach is incredibly powerful. Your application code doesn't need to know IP addresses or worry about Pods being replaced. You simply connect to backend-svc or database-svc, and Kubernetes handles the rest. If the backend Pods are replaced, the Service's Cluster IP stays the same, and the DNS name continues to work. This is what makes Kubernetes services so reliable.
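One way to watch the name resolution described above is a throwaway Pod that prints its DNS configuration and then requests the backend by short and full name. This is only a sketch: the Pod name dns-check and the busybox image tag are assumptions, and it presumes backend-svc exists in the default namespace and serves HTTP on port 80.

```yaml
# pod-dns-check.yaml (hypothetical helper, not part of the shop application)
apiVersion: v1
kind: Pod
metadata:
  name: dns-check
spec:
  restartPolicy: Never           # run once and stop
  containers:
    - name: dns-check
      image: busybox:1.36        # tag is an assumption
      command: ["sh", "-c"]
      args:
        - |
          # Show the search domains that expand short Service names
          cat /etc/resolv.conf
          # Short name: works because this Pod is in the same namespace
          wget -qO- -T 5 http://backend-svc
          # Fully qualified name: works from any namespace
          wget -qO- -T 5 http://backend-svc.default.svc.cluster.local
```

After applying it, kubectl logs dns-check should show the resolver configuration followed by whatever the backend returns for each request.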

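Let's test this connectivity. We'll use kubectl exec to run commands inside our Pods and verify they can reach each other using Service names. First, get the name of one of the frontend Pods, for example by listing them with kubectl get pods -l app=shop,tier=frontend. Then run a request against a Service name from inside that Pod with kubectl exec <pod-name> -- <command>, using whatever HTTP client or DNS lookup tool the container image provides.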

Headless Services: Direct Pod Access

So far, every Service we've created has been assigned a cluster IP, a single virtual address that load-balances across the matching Pods. But what if you need to communicate directly with specific Pods instead of going through a load balancer? This is where headless Services come in.

A headless Service is created by setting clusterIP: None in the Service spec. Instead of getting a single Cluster IP that load-balances across Pods, a headless Service returns the IP addresses of all matching Pods directly through DNS. This is essential for stateful applications where each Pod has a unique identity, such as databases, message queues, or distributed systems like Cassandra or Elasticsearch.

Let's create a headless Service to see how it differs from a regular ClusterIP Service. We'll create a StatefulSet first, since headless Services are most commonly used with StatefulSets. Save this as statefulset-database.yaml:

Notice the serviceName: database-headless-svc field — this links the StatefulSet to a headless Service, enabling stable network identities for each Pod. StatefulSets create Pods with predictable names like database-statefulset-0, database-statefulset-1, database-statefulset-2.
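A StatefulSet matching this description might look like the sketch below. The StatefulSet name and serviceName follow the text; the replica count is inferred from the three Pod names, while the tier: database-stateful label, the image tag, and the password value are assumptions made for illustration.

```yaml
# statefulset-database.yaml (a sketch; labels, tag, and password are assumptions)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database-statefulset
spec:
  serviceName: database-headless-svc   # links the Pods to the headless Service
  replicas: 3                          # yields Pods ending in -0, -1, -2
  selector:
    matchLabels:
      app: shop
      tier: database-stateful          # assumed label, distinct from the Deployment's Pods
  template:
    metadata:
      labels:
        app: shop
        tier: database-stateful
    spec:
      containers:
        - name: postgres
          image: postgres:16           # tag is an assumption
          ports:
            - containerPort: 5432      # PostgreSQL's default port
          env:
            - name: POSTGRES_PASSWORD
              value: example-password  # placeholder; demo only, use a Secret in production
```

This sketch omits volumeClaimTemplates to stay focused on networking, so any data written by these Pods is lost when they are deleted.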

Now let's create the headless Service. Save this as service-database-headless.yaml:
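A headless Service for the StatefulSet could be sketched as follows. The Service name and clusterIP: None come from the text; the selector labels and port name are assumptions and must match whatever labels the StatefulSet's Pod template actually uses.

```yaml
# service-database-headless.yaml (a sketch; selector labels are assumptions)
apiVersion: v1
kind: Service
metadata:
  name: database-headless-svc
spec:
  clusterIP: None              # this is what makes the Service headless
  selector:
    app: shop
    tier: database-stateful    # assumed; must match the StatefulSet's Pod labels
  ports:
    - name: postgres
      port: 5432
      targetPort: 5432
```

With this in place, a DNS lookup of database-headless-svc returns the individual Pod IPs, and each StatefulSet Pod also gets its own name of the form database-statefulset-0.database-headless-svc.default.svc.cluster.local.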

Session Affinity: Sticky Sessions

By default, a ClusterIP Service spreads new connections across all ready backend Pods. The exact algorithm depends on the kube-proxy mode (random selection in the common iptables mode, round-robin by default in IPVS mode), so consecutive requests from one client might land on different Pods. But sometimes you need session affinity (also called sticky sessions) so that requests from the same client keep going to the same Pod.

Session affinity is useful when your application stores session data in memory rather than in a shared cache or database. For example, a shopping cart stored in a Pod's memory would be lost if the next request goes to a different Pod. While the best practice is to use external session storage (like Redis), session affinity provides a simpler solution for some use cases.

Let's create a Service with session affinity enabled. First, create a backend Deployment for testing. Save this as deployment-backend-affinity.yaml:

This uses the http-echo image which returns a message including the Pod's hostname, making it easy to see which Pod handled each request.
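One possible shape for this Deployment is sketched below. It assumes the hashicorp/http-echo image, which replies with whatever text it is given, and injects the Pod name through the Downward API so each reply identifies its Pod. The Deployment name, labels, replica count, and image tag are all assumptions.

```yaml
# deployment-backend-affinity.yaml (a sketch; names, labels, and tag are assumptions)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-affinity
spec:
  replicas: 3
  selector:
    matchLabels:
      app: shop
      tier: backend-affinity
  template:
    metadata:
      labels:
        app: shop
        tier: backend-affinity
    spec:
      containers:
        - name: http-echo
          image: hashicorp/http-echo:0.2.3
          env:
            - name: POD_NAME               # Pod name from the Downward API
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          args:
            - "-text=Handled by $(POD_NAME)"  # Kubernetes expands $(POD_NAME) from env
            - "-listen=:5678"                 # http-echo's default listen address
          ports:
            - containerPort: 5678
```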

Now create a Service with session affinity. Save this as service-backend-affinity.yaml:
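A Service with client-IP affinity might look like the sketch below. The Service name, selector labels, and target port are assumptions that need to match your test Deployment; the timeout shown is the Kubernetes default, written out only to make it visible.

```yaml
# service-backend-affinity.yaml (a sketch; name, labels, and ports are assumptions)
apiVersion: v1
kind: Service
metadata:
  name: backend-affinity-svc
spec:
  type: ClusterIP
  sessionAffinity: ClientIP      # pin each client IP to one Pod
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800      # default (3 hours), shown explicitly
  selector:
    app: shop
    tier: backend-affinity       # assumed; must match the Deployment's Pod labels
  ports:
    - name: http
      port: 80
      targetPort: 5678           # assumed container port of the echo server
```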

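The key configuration is sessionAffinity: ClientIP. This tells Kubernetes to route all requests from the same source IP address to the same backend Pod. The stickiness lasts up to 10800 seconds (three hours) by default, which you can change with sessionAffinityConfig.clientIP.timeoutSeconds. Apply both resources, for example with kubectl apply -f deployment-backend-affinity.yaml -f service-backend-affinity.yaml.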

Multi-Port Services: Exposing Multiple Ports

Sometimes a single application needs to expose multiple ports. For example, your application might serve HTTP traffic on port 80 and HTTPS on port 443, or it might expose a web interface on one port and a metrics endpoint on another. Instead of creating separate Services for each port, you can define multiple port mappings in a single Service.

Let's create a Deployment that exposes multiple ports. Save this as deployment-multiport.yaml:

This Pod has two containers: nginx serving on port 80 and a Prometheus node exporter serving metrics on port 9100.
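A two-container Deployment along these lines is sketched below. The container images and ports follow the description; the Deployment name, the app: multiport label, and the image tags are assumptions.

```yaml
# deployment-multiport.yaml (a sketch; name, label, and tags are assumptions)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: multiport-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: multiport
  template:
    metadata:
      labels:
        app: multiport
    spec:
      containers:
        - name: nginx
          image: nginx:1.27
          ports:
            - name: http
              containerPort: 80      # web traffic
        - name: node-exporter
          image: prom/node-exporter:v1.8.2
          ports:
            - name: metrics
              containerPort: 9100    # Prometheus metrics endpoint
```

Both containers share the Pod's network namespace, which is why a single Service can reach them on different ports of the same Pod IP.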

Now create a multi-port Service. Save this as service-multiport.yaml:
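The multi-port Service might be sketched as follows. The port numbers are the ones this lesson uses; the Service name and the app: multiport selector are assumptions and must match the Deployment's Pod labels.

```yaml
# service-multiport.yaml (a sketch; name and selector are assumptions)
apiVersion: v1
kind: Service
metadata:
  name: multiport-svc
spec:
  type: ClusterIP
  selector:
    app: multiport
  ports:
    - name: http           # names are required once there is more than one port
      port: 8080           # Service port for web traffic
      targetPort: 80       # nginx container port
    - name: metrics
      port: 9090           # Service port for metrics
      targetPort: 9100     # node exporter container port
```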

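When a Service has multiple ports, each port must have a unique name. The Service exposes port 8080 for HTTP (routing to container port 80) and port 9090 for metrics (routing to container port 9100). Apply both, for example with kubectl apply -f deployment-multiport.yaml -f service-multiport.yaml.

Verify the Service configuration, for example with kubectl describe service <service-name>, and check that both ports and their target ports are listed.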

Topology-Aware Routing: Node-Local Traffic

In multi-node clusters, a Service might route traffic to a Pod on a different node, which adds a network hop and some latency. With internalTrafficPolicy, you can keep in-cluster traffic on the node where the client Pod runs. Note that this setting works at the node level. Preferring endpoints in the same zone, which matters in multi-zone clusters where cross-zone traffic adds latency and costs money, is handled by a separate Kubernetes feature called Topology Aware Routing.

You can enable node-local routing by setting internalTrafficPolicy: Local in the Service spec. This tells Kubernetes to route in-cluster traffic only to Pods on the same node as the client. It is a restriction, not a preference: if no local Pods are available, the traffic fails rather than being routed to a remote Pod.

Let's create a Service with topology-aware routing. First, ensure you have a Deployment with multiple replicas spread across nodes. Then create the Service. Save this as service-local-traffic.yaml:
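A Service restricted to node-local endpoints could look like this sketch. The Service name is hypothetical, and the selector assumes the backend Pods carry the app: shop and tier: backend labels used elsewhere in this lesson.

```yaml
# service-local-traffic.yaml (a sketch; name and selector are assumptions)
apiVersion: v1
kind: Service
metadata:
  name: backend-local-svc
spec:
  type: ClusterIP
  internalTrafficPolicy: Local   # only route to Pods on the client's node
  selector:
    app: shop
    tier: backend
  ports:
    - name: http
      port: 80
      targetPort: 80
```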

The key configuration is internalTrafficPolicy: Local. With this setting:

Traffic from a Pod is routed only to backend Pods on the same node
If no backend Pods exist on the client's node, the connection fails
This reduces network hops and latency for local traffic

Apply the Service, for example with kubectl apply -f service-local-traffic.yaml.

To test this, you need a multi-node cluster. You can observe which Pod responds by checking if requests from Pods on different nodes reach different backend Pods. Be aware that internalTrafficPolicy: Local can lead to uneven load distribution if Pods aren't evenly spread across nodes, and it can cause connection failures if no local Pods are available.

For most use cases, the default internalTrafficPolicy: Cluster (which load-balances across all Pods regardless of location) is appropriate. Use Local only when you have specific latency requirements and understand the trade-offs.

Summary: The Foundation of Microservices Architecture

You've now built a complete multi-tier application using ClusterIP Services and mastered advanced patterns for production environments. You created three Deployments (frontend, backend, database), three ClusterIP Services to connect them, and verified that they can communicate using DNS-based service discovery. Beyond the basics, you learned four advanced ClusterIP patterns:

Headless Services (clusterIP: None) for direct Pod access, essential for stateful applications where each Pod has a unique identity
Session Affinity (sessionAffinity: ClientIP) for sticky sessions, keeping requests from the same client IP on the same Pod while the affinity is active
Multi-Port Services for exposing multiple ports through a single Service, useful for applications with web and metrics endpoints
Topology-Aware Routing (internalTrafficPolicy: Local) for reducing network latency by restricting traffic to Pods on the client's own node

These patterns form the foundation of microservices architectures in Kubernetes. The key takeaways are: ClusterIP Services provide stable, internal-only endpoints for Pod-to-Pod communication. They use the same label-selector mechanism you learned in the previous lesson, but now you've seen how it scales to multi-tier applications and advanced use cases. Services automatically get DNS names, so your application code can use simple names like backend-svc instead of IP addresses. This makes your applications portable and resilient to Pod changes.

Security Note: We used hardcoded credentials (like POSTGRES_PASSWORD) to keep the focus on networking. In production, always use Kubernetes Secrets instead of plaintext values to manage sensitive data.

In the upcoming practice exercises, you'll apply these concepts hands-on: wiring multi-tier applications, implementing headless Services for StatefulSets, configuring session affinity, exposing multiple ports, and setting up topology-aware routing. This hands-on experience will solidify your understanding of ClusterIP Services and prepare you for more advanced Service types in future lessons.
