Understanding Docker Volumes: A Comprehensive Tutorial

Containers are designed to be short-lived and stateless, which creates interesting challenges when applications need to store and access data that should survive container restarts or replacements.

Docker volumes provide the solution to this fundamental problem. They allow data to persist beyond the lifecycle of individual containers and provide a mechanism for containers to share data with one another.

This tutorial will guide you through the concepts, usage patterns, and best practices for working with Docker volumes effectively.

The challenge of data persistence in containers

To understand why volumes are necessary, let's first explore the nature of containers. When Docker runs a container, it creates a writable layer on top of the immutable image layers. Any files written by the application inside the container get stored in this writable layer. However, when the container is removed, all data in this layer is permanently deleted.

This behavior aligns with the container philosophy of being ephemeral and replaceable, but it presents a clear problem for applications that need to maintain state, such as databases, content management systems, or any application that generates user data. Without a persistence mechanism, restarting a container would mean losing all accumulated data.

Consider a simple example: if you run a MySQL database in a container and store the database files within the container's filesystem, stopping and removing that container would destroy all your data. Clearly, this is not acceptable for production environments.

Docker storage options: An overview

Docker provides three main options for managing data persistence:

Bind mounts create a direct link between a directory on the host machine and a directory in the container. The container can read and write to this directory as if it were part of its own filesystem. Any changes made by the container are immediately visible on the host and vice versa.

tmpfs mounts store data in the host system's memory rather than on disk. This approach is useful for sensitive information that should not be persisted to disk but needs to be available to the container.

Volumes are the most flexible and recommended way to persist data. Unlike bind mounts, volumes are completely managed by Docker. They can be named, backed up, and transferred between containers more easily. They also work seamlessly across different operating systems, unlike bind mounts which can have permission issues.

The key difference between these options lies in how and where the data is stored, and how that storage is managed. Volumes provide the most abstraction and are generally considered the best practice for most persistence needs in Docker.

We call when your
website goes down

Get notified with a radically better infrastructure monitoring platform.

Docker volumes in depth

Docker volumes are managed by Docker itself and stored in a part of the host filesystem that's managed by Docker (typically /var/lib/docker/volumes/ on Linux systems). This location is important because it means:

The volumes are isolated from the core functionality of the host system.
Docker can provide guarantees about the behavior of the volumes.
Non-Docker processes should not modify this data.

Volumes have their own lifecycle independent of containers. You can create a volume, attach it to one or more containers, detach it, and reattach it to different containers - all without losing any data. This separation of concerns allows for much greater flexibility in how you manage persistent data.

Volume drivers further extend this capability by allowing volumes to be stored on remote hosts or cloud providers, encrypted, or given other special capabilities.

Creating and managing volumes

Let's start with the basics of creating and managing Docker volumes. The simplest way to create a volume is with the docker volume create command:

Copied!

docker volume create my_data

Output

my_data

This creates a named volume called "my_data" that you can reference in other Docker commands. By default, Docker uses the local driver, which stores the volume on the local host filesystem.

You can also create a volume with specific options using the --opt flag:

Copied!

docker volume create --driver local \
  --opt type=none \
  --opt device=/home/user/data \
  --opt o=bind \
  my_custom_volume

This creates a volume that binds to a specific directory on the host system.

Inspecting volumes

To see detailed information about a volume, use the inspect command:

Copied!

docker volume inspect my_data

Output

[
    {
        "CreatedAt": "2023-03-17T14:05:22Z",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/my_data/_data",
        "Name": "my_data",
        "Options": {},
        "Scope": "local"
    }
]

The output shows important information like where the volume is stored on the host system (the Mountpoint) and which driver it uses.

Listing and removing volumes

To see all volumes on your system:

Copied!

docker volume ls

Output

DRIVER    VOLUME NAME
local     my_data
local     my_custom_volume

To remove a volume when you no longer need it:

Copied!

docker volume rm my_data

If you have unused volumes (not attached to any container), you can clean them all up at once:

Copied!

docker volume prune -f

Output

Deleted Volumes:
my_custom_volume

Total reclaimed space: 10B

The -f flag force-removes volumes without asking for confirmation, so use it cautiously.

Attaching volumes to containers

Now that we understand how to create and manage volumes, let's see how to use them with containers.

The traditional way to attach a volume to a container is with the -v flag:

Copied!

docker run -d \
  --name postgres_db \
  -v my_postgres_data:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=mysecretpassword \
  postgres:14

In this command:

-d runs the container in detached mode
--name postgres_db gives our container a name
-v my_postgres_data:/var/lib/postgresql/data mounts the volume "mypostgresdata" to the directory where PostgreSQL stores its data
-e POSTGRES_PASSWORD=mysecretpassword sets an environment variable
postgres:14 is the image we're using

The syntax for the -v flag is volume_name:container_path[:options]. If the volume doesn't exist, Docker creates it automatically.

A more explicit and verbose alternative is the --mount flag:

Copied!

docker run -d \
  --name postgres_db2 \
  --mount source=my_postgres_data2,target=/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=mysecretpassword \
  postgres:14

The --mount flag is more explicit about its parameters, making it clearer what each part does. It's especially useful for more complex mounting scenarios.

If you want to ensure a container can't modify the data in a volume, you can mount it as read-only:

Copied!

docker run -d \
  --name web_server \
  --mount source=web_content,target=/usr/share/nginx/html,readonly \
  nginx:latest

With the -v syntax, you would use:

Copied!

docker run -d \
  --name web_server \
  -v web_content:/usr/share/nginx/html:ro \
  nginx:latest

Volume use cases

Let's explore some common use cases for Docker volumes with practical examples.

The most common use case for volumes is database persistence. Here's a complete example setting up a MySQL database with volume persistence:

Copied!

docker volume create mysql_data

Copied!

docker run -d \
  --name mysql_db \
  -v mysql_data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  -e MYSQL_DATABASE=myapp \
  -e MYSQL_USER=myapp_user \
  -e MYSQL_PASSWORD=myapp_pass \
  -p 3306:3306 \
  mysql:8.0

Now, even if you stop and remove the container, your data will persist:

Copied!

docker rm -f mysql_db

Copied!

docker run -d \
  --name mysql_db_new \
  -v mysql_data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  -p 3306:3306 \
  mysql:8.0

The new container will have access to all the data created by the original container.

Volumes can be used to share configuration files between containers:

Copied!

docker volume create app_config

Copied!

docker run --rm \
  -v app_config:/config \
  alpine:latest \
  sh -c 'echo "server_url=https://example.com" > /config/app.conf'

Copied!

docker run -d \
  --name app1 \
  -v app_config:/etc/myapp \
  myapp:latest

Copied!

docker run -d \
  --name app2 \
  -v app_config:/etc/myapp \
  myapp:latest

This pattern allows you to centralize configuration and ensure consistency across multiple containers.

In a continuous integration setup, you might want to share build artifacts between build and runtime containers:

Copied!

# Create a volume for build artifacts
docker volume create build_artifacts

# Run a build container
docker run --rm \
  -v build_artifacts:/build \
  -v $(pwd)/src:/source \
  node:14 \
  sh -c 'cd /source && npm install && npm run build && cp -r dist/* /build/'

# Run a web server to serve the built artifacts
docker run -d \
  --name web \
  -v build_artifacts:/usr/share/nginx/html \
  -p 8080:80 \
  nginx:latest

This separates the build environment from the runtime environment while allowing the runtime to access the build output.

Advanced volume concepts

So far, we've focused on named volumes, which are explicitly created and given a name. Docker also supports anonymous volumes, which are created automatically when you use the -v flag with just a container path:

Copied!

docker run -d \
  --name postgres_anon \
  -v /var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=mysecretpassword \
  postgres:14

Docker assigns a random ID to this volume. You can see it with:

Copied!

docker inspect postgres_anon -f '{{ .Mounts }}'

Output

[{volume 8a7f02923b9a13a3c991ebf119bf1a2c4d968a91ef9730e0c4762a57d17a1234 /var/lib/docker/volumes/8a7f02923b9a13a3c991ebf119bf1a2c4d968a91ef9730e0c4762a57d17a1234/_data /var/lib/postgresql/data local  true }]

Anonymous volumes are harder to manage and reuse, so named volumes are generally preferred for persistent data.

Using volumes with Docker Compose

Docker Compose makes it easy to define and use volumes for multi-container applications:

compose.yaml

Copied!

services:
  db:
    image: postgres:16
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: example
      POSTGRES_USER: myapp
      POSTGRES_DB: myapp_db

  web:
    build: ./app
    volumes:
      - ./app:/code
      - ./static:/app/static
    ports:
      - "8000:8000"
    depends_on:
      - db

volumes:
  postgres_data:

In this example, we define a named volume postgres_data for the database service and use both a bind mount (./app:/code) and a host directory mount (./static:/app/static) for the web service.

To start the services with their volumes:

Copied!

docker-compose up -d

Output

[+] Running 4/4
 ✔ Network app_default           Created                  0.5s
 ✔ Volume "postgres_data"        Created                  0.0s
. . .

Security considerations

When working with volumes, consider these security best practices:

Manage permissions carefully: Docker volumes inherit the permissions of the container. Ensure that the container runs with appropriate user permissions.

Copied!

docker run -d \
  --name secure_nginx \
  --user $(id -u):$(id -g) \
  -v web_content:/usr/share/nginx/html \
  nginx:latest

Use read-only volumes when containers only need to read data.
Don't expose sensitive host directories through bind mounts.
Consider volume encryption for sensitive data, using third-party volume drivers that support encryption.

Final thoughts

Docker volumes provide the essential bridge between the ephemeral nature of containers and the need for persistent, shared data. By understanding how to create, manage, and utilize volumes effectively, you can build robust containerized applications that maintain state across container lifecycles.

Thanks for reading!

Got an article suggestion? Let us know

Understanding Docker Networks: A Comprehensive Guide

A comprehensive guide to Docker networking that explains network types, container communication, service discovery, and security best practices for containerized applications.

→

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Understanding Docker Volumes: A Comprehensive Tutorial

Contents

The challenge of data persistence in containers

Docker storage options: An overview

Docker volumes in depth

Creating and managing volumes

Inspecting volumes

Listing and removing volumes

Attaching volumes to containers

Volume use cases

Configuration sharing

Sharing build artifacts

Advanced volume concepts

Using volumes with Docker Compose

Security considerations

Final thoughts