Introduction to Kubernetes Volumes

When you run applications in Kubernetes, you're working with containers that are designed to be lightweight and portable. By default, these containers treat storage as temporary — any files you create or modify inside a container exist only as long as that specific container instance is running. This works fine for stateless applications, but what happens when you need to save logs, store user uploads, or persist database files?

That's where Kubernetes volumes come in. In this lesson, we'll explore why containers need volumes, how to define them in your Pod specifications, and the crucial difference between declaring a volume and actually using it.

The Container Storage Problem

Containers use what's called an ephemeral filesystem. This means that when you write a file inside a running container, that file is stored in a temporary writable layer that's tied to that specific container instance. Everything works fine until something causes the container to restart, for example a crash. When the container restarts, Kubernetes creates a fresh container instance from the image, and all the data written by the previous instance disappears.

Let's make this concrete with a real-world scenario. Imagine you're running a web application that allows users to upload profile pictures. Without volumes, here's what happens: a user uploads their photo, your application saves it to /var/www/uploads/profile.jpg inside the container, and everything looks great. An hour later, the container restarts due to a memory issue. When the new container starts up, the /var/www/uploads directory is empty — the user's profile picture is gone. This same problem affects application logs, cached data, and any other files your application creates.

This ephemeral behavior is actually intentional in container design. It keeps containers lightweight and ensures they start from a clean, predictable state. But for many real applications, we need a way to preserve data across container restarts. That's the fundamental problem that volumes solve.

Kubernetes Volumes: The Two-Part Solution

Kubernetes solves the storage problem through volumes, which are storage resources that exist independently of any single container. Unlike the container's ephemeral filesystem, a volume persists even when containers restart. The key insight is that volumes are attached to the Pod (the wrapper around your containers), not to individual containers. This means the volume's lifecycle is tied to the Pod, giving your data a longer lifespan than any individual container. How long data persists depends on the volume type — for emptyDir volumes (which we'll use in this lesson), data survives container restarts but is lost when the Pod itself is deleted or replaced.

Setting up a volume in Kubernetes requires two distinct steps, and understanding this two-part structure is crucial. First, you define the volume in the Pod's spec.volumes section — this tells Kubernetes, "I want a volume to exist, here's what type it is, and here's what I'm calling it." Second, you mount that volume into your container using the volumeMounts section — this tells Kubernetes, "Take that volume I defined and make it available at this specific path inside my container."

Here's how these two parts connect in a Pod specification:
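A minimal sketch of that structure follows. The Pod name, container name, image, and mount path are hypothetical placeholders chosen for illustration; only the overall shape (a volumes entry plus a matching volumeMounts entry) is the point.

```yaml
# Sketch only: names and paths below are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod            # hypothetical Pod name
spec:
  containers:
    - name: app                # hypothetical container name
      image: nginx:1.25        # any image would do for this sketch
      volumeMounts:
        - name: my-volume      # part 2: mount the volume, referenced by name
          mountPath: /data     # hypothetical path inside the container
  volumes:
    - name: my-volume          # part 1: define the volume and give it a name
      emptyDir: {}
```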

The volume name is what ties these two parts together — the name field in the volume definition must match exactly with the name field in the volume mount. This connection tells Kubernetes which defined volume to make available at which path inside your container.

In this lesson, we'll focus on the simplest type of volume: emptyDir. An emptyDir volume starts as an empty directory when the Pod is assigned to a node and exists for the lifetime of the Pod. It's perfect for scratch space, temporary caches, or any data that needs to survive container restarts but doesn't need to persist after the Pod is deleted. We'll explore other volume types in later lessons, but emptyDir gives us a clear way to understand the basic volume syntax.

Example: Pod Without a Volume

Let's start by looking at a Pod that doesn't use any volumes. This will help us understand the default behavior before we introduce the solution. Here's the complete specification:
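A manifest consistent with the description in this section might look like the sketch below. The Pod name, the nginx 1.25 image, and the empty volumes array come from the text; the container name and the exact image tag spelling are assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-without-volume
spec:
  containers:
    - name: web                # assumed container name
      image: nginx:1.25        # tag spelling assumed from "version 1.25"
      # no volumeMounts section: nothing is mounted into this container
  volumes: []                  # explicitly empty: this Pod has no volumes
```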

This is a straightforward Pod definition. The metadata.name field gives our Pod the name app-without-volume, which we'll use to reference it in kubectl commands. The spec.containers section defines a single container running the nginx web server (version 1.25). Notice what's missing: there's no volumeMounts section under the container, and the volumes array at the bottom is empty.

The empty volumes: [] array is worth highlighting. While you could omit this line entirely (Kubernetes would assume an empty array), including it explicitly makes your intent clear: this Pod deliberately has no volumes. Any data that the nginx container writes — whether it's access logs, error logs, or cached files — will be stored in the container's ephemeral filesystem. If this container restarts for any reason, all that data vanishes.

When you run this Pod, the nginx process can still write files to paths like /var/log/nginx/access.log or /tmp/cache. These writes succeed because the container has a writable filesystem layer. The problem isn't that writes fail — it's that the data doesn't survive a restart. This Pod demonstrates the default container behavior that volumes are designed to improve.

Example: Pod With an emptyDir Volume

Now let's look at a Pod that uses a volume to provide persistent storage across container restarts. Here's the complete specification:
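A sketch of such a manifest, under stated assumptions: the Pod name app-with-volume and the volume name scratch appear elsewhere in this lesson, while the container name and the mount path /tmp/data are assumptions made for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-volume
spec:
  containers:
    - name: web                # assumed container name
      image: nginx:1.25
      volumeMounts:
        - name: scratch        # must match the volume name below
          mountPath: /tmp/data # assumed mount path
  volumes:
    - name: scratch
      emptyDir: {}             # default emptyDir settings
```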

This Pod has the same basic structure as our previous example, but now we've added the two-part volume configuration. Let's break down each part to understand how they work together.

First, look at the spec.volumes section at the bottom of the file:
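Based on the description that follows, the volume definition would look roughly like this fragment, which sits under the Pod's spec:

```yaml
# Fragment of a Pod spec, not a complete manifest.
volumes:
  - name: scratch    # arbitrary name, unique within the Pod
    emptyDir: {}     # empty braces mean "use the defaults"
```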

This section defines a volume. We're telling Kubernetes, "Create a volume, call it scratch, and make it an emptyDir type." The emptyDir: {} syntax means we're using all the default settings for this volume type (Kubernetes will create an empty directory on the node's disk). The name scratch is arbitrary — we could call it data, storage, or my-volume — but it needs to be unique within this Pod because we'll use this name to reference the volume later.

Now look at the volumeMounts section inside the container definition:
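As a rough sketch, the mount entry sits under the container definition and looks like the fragment below. The name scratch comes from the volume definition; the mountPath value shown is an assumption.

```yaml
# Fragment of a container definition, not a complete manifest.
volumeMounts:
  - name: scratch          # refers to the volume defined in spec.volumes
    mountPath: /tmp/data   # assumed path where the volume appears
```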

This section mounts the volume into the container's filesystem. The name: scratch field connects this mount to the volume we defined earlier, so the names must match exactly. The mountPath field specifies where the volume appears inside the container. After this Pod starts, the container will see a directory at that path, and anything written to that directory will actually be stored in the volume.

Mounting the Same Volume at Multiple Paths

Once a volume is defined, you aren't limited to mounting it in only one place. A single volume can appear in the volumeMounts list more than once, each time at a different mountPath. Because all of those paths point to the same underlying volume, data written through any of them is immediately visible through all of the others.

Here's what that looks like:
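A sketch of the relevant fragments, using the volume name and the two paths discussed in this section:

```yaml
# Fragments of a Pod spec, not a complete manifest.
# Under the container definition:
volumeMounts:
  - name: scratch
    mountPath: /tmp/data     # first view of the volume
  - name: scratch
    mountPath: /tmp/backup   # second view of the same volume

# Under the Pod's spec (unchanged from the single-mount case):
volumes:
  - name: scratch
    emptyDir: {}
```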

Both entries reference the same name: scratch volume. Inside the container, /tmp/data and /tmp/backup are two windows into the same storage — a file created at /tmp/data/report.txt will also be readable at /tmp/backup/report.txt. The volumes section stays exactly the same; no second volume definition is needed.

This pattern is useful any time you want two parts of an application to reach shared data through different path conventions without duplicating the underlying storage.

Working with Volume-Based Pods

Now that we understand the YAML structure, let's look at the commands you'll use to work with these Pods. On CodeSignal, these commands are ready to run in the terminal, and all the necessary tools are pre-installed.

To create both Pods, run kubectl apply -f once for each manifest file, passing the path to that file.

When you run these commands, you should see a confirmation line for each Pod, such as pod/app-without-volume created and pod/app-with-volume created.

This confirms that Kubernetes has accepted your Pod specifications and is working to create the Pods. The Pods will go through a brief initialization phase during which Kubernetes pulls the nginx image (if it's not already cached) and starts the containers.

To inspect the Pods and verify the volume configuration, run kubectl describe pod followed by the Pod name, for example kubectl describe pod app-with-volume.

The describe output is detailed, but focus on the Mounts entry under each container. For app-without-volume, the Containers section lists the nginx container, and Mounts shows nothing beyond the default Kubernetes service account token. For app-with-volume, the Mounts entry additionally lists the scratch volume at its mount path.

Summary and Practice Preview

Kubernetes volumes solve the container storage problem by providing storage that persists across container restarts. The key to using volumes is understanding the two-part configuration: you define volumes in the spec.volumes section of your Pod, giving each volume a name and a type, and then you mount those volumes into your containers using the volumeMounts section, specifying where each volume should appear in the container's filesystem. The volume name is what connects these two pieces — it must match exactly.

The difference between a Pod with volumes and one without is fundamental. Without volumes, all data written by your containers lives in the ephemeral container filesystem and disappears on restart. With volumes, data written to mounted paths persists in the volume and survives container restarts. In the upcoming practice exercises, you'll create your own Pod specifications with volumes, experiment with writing data to mounted paths, and observe how that data behaves when containers restart. This hands-on practice will solidify your understanding of volume syntax and behavior.
