Ansible has become one of the most popular tools for automation, configuration management, and infrastructure as code.
Its agentless architecture and YAML-based playbooks make it relatively easy to learn, but that doesn't mean you won't encounter errors.
In fact, as you progress from simple playbooks to more complex automation scenarios, troubleshooting becomes an increasingly important skill.
This article covers the most common errors you'll encounter when working with Ansible, why they occur, and how to fix them.
Understanding these common pitfalls will help you build more robust automation and save time when issues arise.
Syntax and YAML errors
YAML forms the foundation of Ansible playbooks. Its human-readable format makes playbooks easy to write, but its strict syntax rules can also lead to frustrating errors.
YAML indentation errors
Perhaps the most common error in Ansible is related to YAML indentation. YAML uses indentation to establish the structure and hierarchy of data. Unlike some languages where indentation is merely a matter of style, in YAML, indentation is syntactically significant.
Common error message:
...
Syntax Error while loading YAML.
mapping values are not allowed in this context
The error appears to be in '/home/ayo/dev/betterstack/demo/ansible-errors/playbook.yml': line 5, column 10, but may
be elsewhere in the file depending on the exact syntax problem.
...
Why it happens:
This error typically occurs when you mix tabs and spaces or use inconsistent indentation levels. YAML forbids tabs for indentation and requires sibling keys to line up at the same level.
How to fix it:
- Use spaces instead of tabs for indentation.
- Maintain consistent indentation (2 spaces is the common standard).
- Use a YAML validator or linter to check your files.
Here's an example of incorrect indentation:
- name: Install web server
  hosts: webservers
  tasks:
  - name: Install Apache
      apt:
        name: apache2
       state: present
  - name: Start Apache service
     service:
       name: apache2
        state: started
And here's the corrected version:
- name: Install web server
  hosts: webservers
  tasks:
    - name: Install Apache
      apt:
        name: apache2
        state: present
    - name: Start Apache service
      service:
        name: apache2
        state: started
To validate your YAML files, you can use the yamllint tool:
yamllint playbook.yml
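Ansible itself can also catch many of these problems before a run with its built-in syntax check (this validates playbook syntax only, not whether the tasks will succeed):
ansible-playbook playbook.yml --syntax-check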
Missing or invalid quotes
Another common syntax error involves string quoting, especially when your strings contain special characters.
Common error message:
ERROR! Syntax Error while loading YAML.
found unacceptable character #: mapping values are not allowed in this context
The error appears to be in '/path/to/playbook.yml': line 12, column 15, but may
be elsewhere in the file depending on the exact syntax problem.
Why it happens:
YAML requires quoting for strings that contain characters with special meaning, such as a colon followed by a space, a hash symbol preceded by a space (which starts a comment), or values that begin with characters like asterisks, brackets, or braces. Additionally, strings containing Jinja2 expressions often need quoting so YAML doesn't misinterpret the braces.
How to fix it:
- Quote strings containing special characters.
- Remember that single quotes keep the string literal, while double quotes also process escape sequences such as \n.
- Quote any value that begins with {{ so YAML doesn't parse it as an inline dictionary.
Here's an example that would cause errors:
- name: Configure application
  hosts: app_servers
  vars:
    app_config:
      url: http://example.com:8080  # Risky: colons in values are safer quoted
      comment: This server handles #1 priority tasks  # Problem: ' #' starts a YAML comment and truncates the value
      command: ls -la  # Safer to quote commands containing spaces and flags
  tasks:
    - name: Create configuration file
      template:
        src: config.j2
        dest: /etc/app/config.yaml
Corrected version:
- name: Configure application
  hosts: app_servers
  vars:
    app_config:
      url: "http://example.com:8080"
      comment: "This server handles #1 priority tasks"
      command: "ls -la"
  tasks:
    - name: Create configuration file
      template:
        src: config.j2
        dest: /etc/app/config.yaml
Jinja2 template syntax errors
Ansible uses Jinja2 templating extensively, which provides powerful capabilities but can also introduce errors, especially when mixing it with YAML syntax.
Common error message:
ERROR! template error while templating string: unexpected '{'. String: {{ item }}{{ ansible_facts['hostname'] }}
Why it happens:
These errors often occur due to missing spaces in Jinja2 expressions, incorrect filter syntax, or confusing Jinja2 with YAML syntax.
How to fix it:
- Ensure proper spacing in Jinja2 expressions ({{ variable }}, not {{variable}}).
- Use proper syntax for filters ({{ variable | filter }}).
- Properly quote templated strings in YAML.
Incorrect example:
- name: Configure hosts
  hosts: all
  tasks:
    - name: Create file with hostname
      file:
        path: /tmp/{{item}}{{ansible_facts['hostname']}}.txt
        state: touch
      loop:
        - server_
        - host_
Corrected version:
- name: Configure hosts
  hosts: all
  tasks:
    - name: Create file with hostname
      file:
        path: "/tmp/{{ item }}{{ ansible_facts['hostname'] }}.txt"
        state: touch
      loop:
        - server_
        - host_
For complex Jinja2 expressions, you can use the debug module to test your syntax:
- name: Debug Jinja2 expressions
  hosts: localhost
  vars:
    my_string: "Hello World"
    my_list: [1, 2, 3, 4, 5]
  tasks:
    - name: Test Jinja2 expression
      debug:
        msg: "{{ my_string | upper }} {{ my_list | sum }}"
Connection errors
Since Ansible operates by connecting to remote hosts, connection problems are a common source of errors. Understanding these issues is crucial for effective troubleshooting.
SSH connection failures
SSH is Ansible's primary method for connecting to managed nodes, and SSH-related issues are among the most common errors.
Common error message:
UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.1.100 port 22: Connection timed out", "unreachable": true}
Why it happens: SSH connection failures can result from network connectivity issues, firewall configurations, incorrect credentials, SSH service not running, or incorrect SSH configuration.
How to fix it:
- Verify network connectivity with a ping test.
- Check that the SSH service is running on the target.
- Verify firewall rules allow SSH connections.
- Ensure SSH credentials are correct.
- Configure SSH options in ansible.cfg.
To test basic connectivity:
ping <ip_address>
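Beyond a raw ICMP check, you can confirm that Ansible can actually reach and authenticate to the hosts with its ping module. A quick sketch, assuming your inventory file is named inventory:
ansible all -i inventory -m ping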
You can configure SSH options in your ansible.cfg file:
[defaults]
inventory = ./inventory
remote_user = deploy
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o ConnectTimeout=10
pipelining = True
For more verbose SSH debugging, increase Ansible's verbosity:
ansible-playbook playbook.yml -vvv
Privilege escalation errors
Ansible often needs to run commands with elevated privileges, which can lead to permission-related errors.
Common error message:
FAILED! => {"msg": "Missing sudo password"}
Or:
FAILED! => {"changed": false, "module_stderr": "sudo: a password is required\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
Why it happens: These errors occur when Ansible attempts to execute tasks that require elevated privileges without providing the necessary sudo password or having passwordless sudo configured.
How to fix it:
- Use the --ask-become-pass option to prompt for the sudo password.
- Configure passwordless sudo on the target hosts.
- Specify the become method in your playbook.
- Store the sudo password securely using Ansible Vault (a sketch follows below).
Example playbook with become (sudo) configured:
- name: Configure system
  hosts: webservers
  become: true
  become_method: sudo
  become_user: root
  tasks:
    - name: Install required packages
      apt:
        name:
          - nginx
          - curl
          - python3
        state: present
To run the playbook with a sudo password prompt:
ansible-playbook playbook.yml --ask-become-pass
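If you'd rather not be prompted on every run, another option from the list above is keeping the password in an encrypted variables file with Ansible Vault. A minimal sketch, assuming a group_vars layout for the webservers group (the file path is just one possible location):
ansible-vault create group_vars/webservers/vault.yml
Inside the encrypted file, define the become password variable:
ansible_become_password: "your-sudo-password"
Then run the playbook with --ask-vault-pass (or --vault-password-file) so Ansible can decrypt it at runtime.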
Host key verification failures
SSH relies on host key verification for security, which can sometimes cause connection issues with Ansible.
Common error message:
UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Host key verification failed.", "unreachable": true}
Why it happens: This error occurs when the SSH host key of the target system isn't in your known_hosts file, or when the host key has changed (which could indicate a potential security issue).
How to fix it:
- Manually connect to the host via SSH to add it to known_hosts.
- Disable host key checking in ansible.cfg (less secure but convenient for testing).
- Use ssh-keyscan to add the host key to known_hosts programmatically.
To disable host key checking for testing:
[defaults]
host_key_checking = False
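You can also disable it for a single run through an environment variable instead of editing ansible.cfg:
ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook playbook.yml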
For a more secure approach, add the host key programmatically:
ssh-keyscan 192.168.1.100 >> ~/.ssh/known_hosts
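If you manage many hosts, you can also let Ansible record the keys itself with the known_hosts module. A sketch run against the control node, reusing the example address above (the ed25519 key type is an assumption; adjust it to whatever your hosts offer):
- name: Record host keys on the control node
  hosts: localhost
  tasks:
    - name: Add the target's host key to known_hosts
      known_hosts:
        name: 192.168.1.100
        key: "{{ lookup('pipe', 'ssh-keyscan -t ed25519 192.168.1.100') }}"
        state: present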
Inventory and variable errors
Properly managing inventory and variables is crucial for Ansible. Errors in these areas can be particularly confusing because they may not manifest until specific tasks are executed.
Undefined variables
Using a variable that hasn't been defined is a common error, especially in complex playbooks with multiple variable sources.
Common error message:
FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'app_version' is undefined"}
Why it happens: This error occurs when you reference a variable that hasn't been defined in any of Ansible's variable sources (playbook vars, inventory, group_vars, etc.) or when you misspell a variable name.
How to fix it:
- Check variable definitions across all relevant files.
- Use the default filter to provide fallback values.
- Use debug tasks to inspect variable content.
- Use ansible-inventory --list to see all inventory variables.
Example using the default filter:
- name: Deploy application
  hosts: app_servers
  tasks:
    - name: Create app directory
      file:
        path: "/opt/app/{{ app_version | default('latest') }}"
        state: directory
Adding debug tasks to inspect variables:
- name: Debug variables
  hosts: app_servers
  tasks:
    - name: Display all variables
      debug:
        var: hostvars[inventory_hostname]
    - name: Display specific variable
      debug:
        var: app_version
      ignore_errors: yes
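To fail fast with a clearer message than a generic undefined-variable error, you can also validate required variables up front with the assert module. A sketch, reusing app_version from the earlier example:
- name: Validate required variables
  hosts: app_servers
  tasks:
    - name: Ensure app_version is defined
      assert:
        that:
          - app_version is defined
        fail_msg: "app_version must be set in inventory, group_vars, or via --extra-vars"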
Inventory parsing issues
Problems with inventory file syntax can prevent Ansible from properly recognizing and connecting to hosts.
Common error message:
ERROR! Attempted to read "/path/to/inventory" as YAML: Syntax Error while loading YAML.
Or:
[WARNING]: Could not match supplied host pattern, ignoring: webservers
Why it happens: These errors occur due to syntax errors in inventory files, incorrect group definitions, or typos in host patterns.
How to fix it:
- Validate inventory syntax.
- Check group names and hierarchy.
- Use ansible-inventory --graph to visualize your inventory structure.
Example of a problematic inventory file:
[webservers]
web1.example.com
web2.example.com
[dbservers]
db1.example.com
db2.example.com
[production:children]
webservers
database # This should be dbservers
Corrected version:
[webservers]
web1.example.com
web2.example.com
[dbservers]
db1.example.com
db2.example.com
[production:children]
webservers
dbservers
To validate your inventory structure:
ansible-inventory --graph
You should see output like:
@all:
|--@dbservers:
| |--db1.example.com
| |--db2.example.com
|--@production:
| |--@dbservers:
| | |--db1.example.com
| | |--db2.example.com
| |--@webservers:
| | |--web1.example.com
| | |--web2.example.com
|--@ungrouped:
|--@webservers:
| |--web1.example.com
| |--web2.example.com
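To inspect the variables Ansible resolves for a single host, including those inherited from its groups, you can also run (web1.example.com is the example host from the inventory above):
ansible-inventory --host web1.example.com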
Variable precedence confusion
Ansible has a complex variable precedence system that can lead to unexpected values being used.
Common error message: There's rarely a specific error message for precedence issues. Instead, you'll notice variables having unexpected values.
Why it happens: Ansible has a specific order in which it processes variables from different sources. When the same variable is defined in multiple places, the value from the highest-precedence source wins.
How to fix it:
- Review Ansible's variable precedence documentation.
- Use debug tasks to find where variables are coming from.
- Use ansible-inventory --list to see all inventory variables.
- Consider where you define variables based on their scope and purpose.
Example debug task to help trace variable sources:
- name: Trace variable precedence
  hosts: app_servers
  vars:
    app_port: 8080
  tasks:
    - name: Show app_port from different sources
      debug:
        msg: |
          Playbook vars: {{ app_port }}
          Group vars: {{ hostvars[inventory_hostname].app_port | default('undefined') }}
          Host vars: {{ hostvars[inventory_hostname].app_port | default('undefined') }}
          Extra vars: {{ app_port }}
Run this with:
ansible-playbook trace_vars.yml --extra-vars "app_port=9000"
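In this run, the --extra-vars value of 9000 wins because extra vars sit at the top of the precedence order. If the same variable also lived in a group_vars file (a hypothetical path shown below), it would be overridden by both the playbook vars and the extra vars:
# group_vars/app_servers.yml (hypothetical)
app_port: 8081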
Module-specific errors
Ansible modules are the workhorses that perform actual tasks on managed nodes. Each module has its own set of potential errors.
Command/shell module failures
The command and shell modules are among the most commonly used, but they can fail for various reasons.
Common error message:
FAILED! => {"changed": true, "cmd": "service apache2 status", "delta": "0:00:00.005625", "end": "2023-04-11 14:30:12.125799", "msg": "non-zero return code", "rc": 3, "start": "2023-04-11 14:30:12.120174", "stderr": "", "stderr_lines": [], "stdout": "apache2 is not running", "stdout_lines": ["apache2 is not running"]}
Why it happens: Command module failures typically occur when the executed command exits with a non-zero status. This could be due to the command not existing, insufficient permissions, or the command itself failing.
How to fix it:
- Add ignore_errors: true for commands that may legitimately fail
- Use become: true for commands requiring elevated privileges
- Use the failed_when directive to customize failure conditions
- Consider using specialized modules instead of raw commands
Example with improved error handling:
- name: Check and restart services
  hosts: webservers
  become: true
  tasks:
    - name: Check if Apache is running
      command: systemctl status apache2
      register: apache_status
      ignore_errors: true
      changed_when: false  # Status check doesn't change anything
    - name: Restart Apache if not running
      service:
        name: apache2
        state: restarted
      when: apache_status.rc != 0
Using failed_when for custom failure conditions:
- name: Check disk space
  hosts: all
  tasks:
    - name: Get disk usage
      command: df -h /
      register: df_output
      changed_when: false
    - name: Parse disk usage
      set_fact:
        disk_usage_pct: "{{ df_output.stdout_lines[1].split()[4] | replace('%', '') }}"
    - name: Check if disk space is critical
      debug:
        msg: "Disk usage is {{ disk_usage_pct }}%"
      failed_when: disk_usage_pct | int > 90
Package management errors
Package installation issues are common, especially when dealing with different distributions or repository configurations.
Common error message:
FAILED! => {"changed": false, "msg": "No package matching 'apache2' available."}
Or:
FAILED! => {"changed": false, "msg": "Failed to update apt cache: E: Could not get lock /var/lib/apt/lists/lock"}
Why it happens: These errors occur when packages are not available in the configured repositories, repositories are not accessible, or there are locking issues with the package manager.
How to fix it:
- Verify package name and availability for the target distribution
- Ensure repositories are properly configured
- Update package cache before installation
- Handle lock files appropriately
Example with improved package handling:
- name: Install packages robustly
  hosts: webservers
  become: true
  tasks:
    - name: Update apt cache
      apt:
        update_cache: yes
        cache_valid_time: 3600  # Use cached results if updated within the last hour
      when: ansible_os_family == "Debian"
    - name: Install Apache (Debian/Ubuntu)
      apt:
        name: apache2
        state: present
      when: ansible_os_family == "Debian"
    - name: Install Apache (RedHat/CentOS)
      yum:
        name: httpd
        state: present
      when: ansible_os_family == "RedHat"
For lock file issues, you can add retry logic:
- name: Install packages with retry
  hosts: all
  become: true
  tasks:
    - name: Install required packages
      apt:
        name: nginx
        state: present
        update_cache: yes
      register: apt_result
      retries: 5
      delay: 10
      until: apt_result is success
File operation errors
File operations can fail due to permissions, path issues, or disk space constraints.
Common error message:
FAILED! => {"changed": false, "msg": "Error creating file /etc/app/config.ini: [Errno 13] Permission denied: '/etc/app/config.ini'"}
Or:
FAILED! => {"changed": false, "msg": "Error creating directory /var/www/html/uploads: [Errno 2] No such file or directory: '/var/www/html/uploads'"}
Why it happens: File operation errors typically occur due to insufficient permissions, non-existent parent directories, or full filesystems.
How to fix it:
- Use become: true to get elevated privileges.
- Ensure parent directories exist (use state: directory with the file module).
- Check file ownership and permissions.
- Verify available disk space.
Example with robust file operations:
- name: Create application directories
  hosts: webservers
  become: true
  tasks:
    - name: Check available disk space
      command: df -h /var
      register: df_output
      changed_when: false
    - name: Ensure parent directory exists
      file:
        path: /var/www/myapp
        state: directory
        mode: '0755'
        owner: www-data
        group: www-data
    - name: Create nested directories
      file:
        path: /var/www/myapp/uploads/images
        state: directory
        mode: '0775'
        owner: www-data
        group: www-data
        recurse: yes  # Apply ownership/permissions recursively; missing parents are created by state: directory
For copying files with proper permissions:
- name: Copy configuration files
  hosts: app_servers
  become: true
  tasks:
    - name: Copy app configuration
      copy:
        src: files/app.conf
        dest: /etc/app/app.conf
        owner: app_user
        group: app_group
        mode: '0640'
        backup: yes  # Create backup of existing file
Performance and scalability issues
As your Ansible deployments grow, you may encounter performance issues that aren't strictly errors but can significantly impact usability.
Playbook execution timeouts
Long-running tasks can timeout, especially when Ansible's default timeouts are insufficient.
Common error message:
FAILED! => {"msg": "The async task did not complete within the requested time (300s)."}
Why it happens: Ansible has various timeout settings that can cause tasks to fail when they take too long to complete, such as large file transfers, database migrations, or package installations.
How to fix it:
- Use async/poll for long-running tasks.
- Adjust timeout settings in ansible.cfg.
- Break large tasks into smaller components.
Example using async for long-running tasks:
- name: Run long operations
  hosts: webservers
  become: true
  tasks:
    - name: Update all packages
      apt:
        upgrade: dist
        update_cache: yes
      async: 3600  # Allow this task to run for up to 1 hour
      poll: 30  # Check status every 30 seconds
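For jobs that run even longer, you can fire the task off without blocking and poll it separately with the async_status module. A sketch, where the migration script path is only a placeholder:
- name: Run a long migration without blocking
  hosts: webservers
  become: true
  tasks:
    - name: Start the migration in the background
      command: /usr/local/bin/run_migration.sh  # Placeholder script
      async: 3600  # Allow up to 1 hour
      poll: 0      # Don't wait; return immediately
      register: migration_job
    - name: Wait for the migration to finish
      async_status:
        jid: "{{ migration_job.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 120  # Check up to 120 times
      delay: 30     # 30 seconds apart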
Configure timeouts in ansible.cfg:
[defaults]
timeout = 60 # Default SSH timeout in seconds
[ssh_connection]
ssh_args = -o ConnectTimeout=60 -o ServerAliveInterval=30
Memory issues with large inventories
When working with large inventories, Ansible can consume significant memory, potentially causing performance problems or failures.
Common error message: Memory issues typically manifest as the ansible process being killed by the OS or extremely slow performance.
Why it happens: Ansible loads the entire inventory into memory and collects facts from all hosts by default. With large inventories, this can consume substantial memory resources.
How to fix it:
- Use fact caching to reduce repeated fact gathering
- Limit fact gathering when possible
- Use the --limit option to target specific hosts
- Break large playbooks into smaller components
Configure fact caching in ansible.cfg:
[defaults]
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_fact_cache
fact_caching_timeout = 86400 # 24 hours in seconds
Limit fact gathering in playbooks:
- name: Minimal facts playbook
  hosts: all
  gather_facts: no  # Don't gather facts at all
  tasks:
    - name: Gather only the facts we need
      setup:
        gather_subset:
          - '!all'
          - '!min'
          - 'network'
          - 'hardware'
    - name: Display only network facts
      debug:
        var: ansible_default_ipv4
Parallelism problems
Ansible's parallel execution can sometimes lead to resource contention and unpredictable behavior.
Common error message: There's usually no specific error message, but you might see inconsistent results or timeouts when running against many hosts simultaneously.
Why it happens: By default, Ansible runs tasks in parallel across hosts. This can cause issues when tasks compete for resources or when order matters between different host groups.
How to fix it:
- Adjust the forks parameter to control parallelism
- Use the serial directive to limit simultaneous execution
- Apply throttling to resource-intensive tasks
Configure lower parallelism in ansible.cfg:
[defaults]
forks = 10 # Default is 5
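The same setting can be overridden for a single run from the command line:
ansible-playbook playbook.yml --forks 20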
Use serial execution for critical tasks:
- name: Update database servers in sequence
  hosts: db_servers
  serial: 1  # Run on one host at a time
  become: true
  tasks:
    - name: Stop database service
      service:
        name: postgresql
        state: stopped
    - name: Update database packages
      apt:
        name: postgresql
        state: latest
    - name: Start database service
      service:
        name: postgresql
        state: started
Using throttle for specific tasks:
- name: Run resource-intensive operations
  hosts: all
  become: true
  tasks:
    - name: Rebuild search index
      command: rebuild_search_index
      throttle: 3  # Only run on 3 hosts at a time
Final thoughts
Understanding common Ansible errors and their solutions is crucial for effective automation. By recognizing patterns in YAML syntax issues, connection failures, variable handling, and module-specific errors, you can quickly diagnose and resolve problems.
Thanks for reading!