Ansible
Ansible is an open source configuration management tool, written in Python, that helps automate software configuration and deployment. It manages remote hosts via SSH or PowerShell and requires no agents on the client side. Changes can be executed on one or more hosts, which can be on different networks so long as they are accessible by the Ansible host.
Introduction
Ansible provides a collection of modules, each of which performs a specific action. For example, the package module installs a package on the host machine. Each module can take any number of parameters, which are specified by a task definition. We could define a new task that passes 'vim' as a parameter to the package module so that vim gets installed. A set of tasks that together implement a specific feature or set up the 'role' of a machine can be bundled together as an Ansible role. A playbook specifies which hosts or host groups implement which roles.
In summary:
- Inventory: file or script that tells Ansible what the hosts are. Hosts can be bundled together into groups. Groups can inherit other groups.
- Playbook: A playbook tells Ansible which tasks or roles a host or host group needs to run.
- Role: A bundle of tasks, files, and templates that configures a specific feature or role.
- Task: A call to a specific module, with a set of parameters, in order to do a specific action.
- Module: A piece of code which does an action
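To make the relationship between these pieces concrete, here is a minimal sketch of a playbook containing a single task that passes vim to the package module, as described above. The play name and the use of become are illustrative assumptions rather than requirements.

```yaml
---
# A play maps a host pattern to a list of tasks.
- name: Install an editor
  hosts: all          # built-in group containing every host in the inventory
  become: yes         # assumption: installing packages needs root
  tasks:
    # A task: the package module plus its parameters.
    - name: Install vim
      package:
        name: vim
        state: present
```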
Modules & Tasks
Modules can take in any number of parameters as variables. Modules can also return data back to Ansible which can be used later on by other modules. A module by itself isn't able to do anything and can only 'run' when used as a task. A Task is simply a list item that specifies a module and its parameters.
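As a sketch of how returned data can be reused, the tasks below capture the result of one module with register and read it from a later task. The variable name kernel_result is made up for this example.

```yaml
- name: Read the running kernel version
  command: uname -r
  register: kernel_result     # hypothetical variable name
  changed_when: false         # read-only command, so never report a change
- name: Show what the previous module returned
  debug:
    msg: "Kernel: {{ kernel_result.stdout }}"
```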
A single module can be executed via the ansible command while multiple modules can be executed through the use of a playbook which describes the list of tasks or a bundle of tasks (called roles) it needs to perform.
Ansible uses the Jinja2 template engine, and the parameters that are passed to modules can be templated. Uses of this are found throughout the examples below; templating allows for loops (by iterating over lists) and for embedding variables within strings. A value that begins with a template expression must be enclosed in quotes, otherwise YAML tries to parse the opening brace as the start of a dictionary.
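A small sketch of templated parameters follows. pkg_name is a hypothetical variable that would have to be defined elsewhere (for example in the inventory or the play).

```yaml
- name: Embed a variable within a string
  debug:
    msg: "Installing {{ pkg_name }}"
- name: Pass a whole variable as a parameter
  package:
    name: "{{ pkg_name }}"    # quoted so YAML does not read the braces as a mapping
    state: present
```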
Commonly used modules, along with example tasks showing their commonly used parameters, are listed below:
Module | Example | Description |
---|---|---|
yum/dnf |
- name: Install NTP
  yum:
    name: ntp
    state: present
- name: Install common packages
  dnf:
    name:
      - git
      - vim
      - zsh
|
Installs a package.
You may also use loops to call the module multiple times, once for each item. Eg:
- name: Install packages
  dnf:
    name: "{{ item }}"
  with_items:
    - git
    - vim
This is something which isn't required for the yum/dnf modules since version 2.0. |
package |
- name: Ensure NFS utils are installed
  package:
    name: "{{ __packages.client }}"
    state: "{{ 'latest' if upgrade else 'present' }}"
|
This module will use the proper underlying package manager suitable on the system, making cross-platform tasks easier to write.
The name parameter also accepts a list, like yum/dnf. |
template |
- name: Configure NTP
  notify: restart ntp
  template:
    src: ntp.conf.j2
    dest: /etc/ntp.conf
|
Copies a templated file.
Notice that we also notify 'restart ntp' when the file is changed. 'restart ntp' is a handler defined under handlers. The template may include variables passed to this playbook. |
service |
- name: Start the ntp service
  service:
    name: ntpd
    state: started
    enabled: yes
- name: restart ntp
  service:
    name: ntpd
    state: restarted
|
Ensures something is started and enabled on startup |
debug |
- name: "testing something"
  debug:
    msg: "item: {{ item }}"
  with_items:
    - 1
    - 2
|
Debug messages |
command |
- name: Extract archive
  command: /bin/tar xvf wordpress.tar.gz
  args:
    chdir: /srv/
    creates: /srv/wordpress
- name: safely use templated variable to run command
  command: cat {{ myfile|quote }}
  register: myoutput
|
Runs a command |
shell |
- name: Shell command
  shell: "echo $HOME | grep something"
|
Similar to command, but the task is executed inside a shell. Environment variables and piping will work as a result.
The |
copy |
- name: Copy using inline content
  copy:
    content: |
      this is some file content
      of mine.
    dest: /tmp/file
- name: Copy a "sudoers" file on the remote machine for editing
  copy:
    src: /etc/sudoers
    dest: /etc/sudoers.edit
    remote_src: yes
    validate: /usr/sbin/visudo -csf %s
|
Copies files or sets file contents |
file |
- name: Create MariaDB log file
  file:
    path: /var/log/mysqld.log
    state: touch
    owner: mysql
    group: mysql
    mode: 0775
- name: Create MariaDB PID directory
  file:
    path: /var/run/mysqld
    state: directory
    owner: mysql
    group: mysql
    mode: 0775
|
Creates a file or directory |
stat |
- name: Check whether /etc/file.txt exists
  stat:
    path: /etc/file.txt
  register: stat_result
  changed_when: False
- name: Create the file, if it doesn't exist already
  file:
    path: /etc/file.txt
    state: touch
  when: not stat_result.stat.exists
|
Stats a file, which can be used to only run something based on a file's existence. |
find |
- name: Find files with globbing
  find:
    paths: /tmp
    patterns: "*.swp"
  register: list_of_files
## Loop over list_of_files using with_items.
- name: Do something with file list
  file:
    path: "{{ item.path }}"
    state: absent
  with_items: "{{ list_of_files.files }}"
|
finds files by name or by pattern. |
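mount |
- name: Mount /home
  mount:
    fstype: nfs
    opts: "vers=3,intr,hard"
    path: /home
    src: "netapp{{ fileserver_index }}:/ArcHome"
    state: present
|
Manages mount points. Note that state: present only configures the entry in /etc/fstab; use state: mounted to also mount the filesystem. |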
firewalld |
- name: insert firewalld rule
  firewalld:
    port: "{{ mysql_port }}/tcp"
    permanent: true
    state: enabled
    immediate: yes
|
Manages the firewall |
git |
- name: Copy the code from repository
  git:
    repo: "{{ repository }}"
    dest: /var/www/html/
|
You can clone an entire git repository to some location. |
systemd |
- name: Make sure a service is running
  systemd:
    state: started
    name: httpd
- name: Enable a timer for dnf-automatic
  systemd:
    name: dnf-automatic.timer
    state: started
    enabled: yes
|
Manages systemd services and timers.
State must be one of:
|
user |
- name: Create a new user
  user:
    name: remote
    shell: /bin/bash
    groups: admin,wheel
    append: yes
|
|
group |
- name: Create a new group
  group:
    name: remote
    state: present
|
|
authorized_key |
- name: Set an authorized key
  authorized_key:
    user: remote
    state: present
    key: "{{ lookup('file', '/home/remote/.ssh/id_rsa.pub') }}"
|
The key can also be specified as just a string. |
Roles
Roles define a set of tasks that should be done. I like to think of roles as a bundle or grouping of tasks. Ansible Galaxy hosts many ready-made roles that can be reused. GitHub also has lots of roles that users have made public which may do what you want.
Roles help organize tasks into a single unit. For example, you may want to create a webserver role that contains instructions on setting up Apache and another database role with instructions on setting up MySQL. Creating a new role involves creating the following directory structure:
roles/
  installapache/
    tasks/
    handlers/
    templates/
    vars/
    defaults/
  installtomcat/
    tasks/
    meta/
Each role should be its own subdirectory in the roles directory. Each role should also have folders for each of: tasks for all Ansible tasks the role should perform, handlers containing all handlers in this role, files for files needed by the role (such as configuration files), templates containing templating data, vars containing values that override defaults set in the defaults directory, and meta for any metadata needed for this role, such as dependency data. In each of these directories (apart from files and templates), the main.yml file will be read and used if it exists.
In the example above, we have an installapache role and an installtomcat role.
Inside installapache/tasks/main.yml, we can load the appropriate set of tasks depending on the distribution:
- name: import tasks based on OS platform
  import_tasks: centos.yml
  when: ansible_distribution == 'CentOS'
- import_tasks: ubuntu.yml
  when: ansible_distribution == 'Ubuntu'
We would then create a centos.yml and ubuntu.yml within this same tasks directory with all the tasks that should be done in this role. Eg.
---
- name: Install Apache using yum
  yum:
    name: "httpd"
    state: latest
- name: Start the Apache server
  service:
    name: httpd
    state: started
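A play can then apply these roles with the roles keyword. The following is a minimal sketch; the webservers group is borrowed from the inventory example later on this page, and become is an assumption about how privileges are obtained.

```yaml
---
- name: Set up web servers
  hosts: webservers
  become: yes             # assumption: the role's tasks need root
  roles:
    - installapache       # runs roles/installapache/tasks/main.yml
    - installtomcat
```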
As Dependencies
Roles can be included as dependencies. Under meta (in meta/main.yml), define something like:
dependencies:
  - role: common
    vars:
      some_parameter: 3
  - role: apache
    vars:
      apache_port: 80
  - role: postgres
    vars:
      dbname: blarg
      other_parameter: 12
Inventory
An inventory contains a list of hosts managed by Ansible. It outlines any groups and their relationship with hosts. This file can be written in either INI or YAML. By default, the inventory file is loaded from /etc/ansible/hosts, but another file can be specified with -i when calling ansible or ansible-playbook.
An example INI inventory file is shown below. Note that groups are defined as named INI sections. Group inheritance and group variables are specified with the :children and :vars suffixes in the INI section name.
[webservers]
web1.example.com ansible_user=bob
web2.example.com ansible_host=192.168.81.142 ansible_port=3333
; variables that apply only to the webservers group
[webservers:vars]
listen_port=8080
[apservers]
ap1.example.com
ap2.example.com
[dbservers]
; db01, db02, ..., db10
db[01:10].example.com
; servers group contains the following sub-groups
[servers:children]
webservers
apservers
dbservers
Here, we have four distinct groups: webservers, apservers, dbservers, and servers. The servers group includes every host that's defined here since it includes all the other groups as its children. Additionally, there is a built-in group named all which can be used.
Host ranges can be specified with brackets, such as [01:10] to denote nodes from 01 (with a leading 0) through 10.
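Since the inventory can also be written in YAML, here is a sketch of how the INI example above might look in that format. It is a hand translation intended to be equivalent, so treat it as illustrative rather than authoritative.

```yaml
all:
  children:
    servers:                      # parent group, like [servers:children]
      children:
        webservers:
          hosts:
            web1.example.com:
              ansible_user: bob
            web2.example.com:
              ansible_host: 192.168.81.142
              ansible_port: 3333
          vars:                   # like [webservers:vars]
            listen_port: 8080
        apservers:
          hosts:
            ap1.example.com:
            ap2.example.com:
        dbservers:
          hosts:
            db[01:10].example.com:   # host range: db01 through db10
```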
Variable overrides
You can override specific variables in the inventory file. In the example above, the web servers have their user, host, and ports overridden so that Ansible will connect to these hosts using the specified user, host, and ports. Other variables that you might be interested in are:
Description | Variables |
---|---|
Override the host's address. Connect to this address rather than relying on the hostname. | ansible_host=10.1.2.6 |
Override the default SSH port | ansible_port=2222 |
SSH as this user when connecting to the host (rather than the user running Ansible) | ansible_user=opc |
Use this identity file (rather than the default in ~/.ssh/id_rsa) | ansible_ssh_private_key_file=/path/to/your/.ssh/id_rsa |
Become root using sudo (used if your ansible_user isn't root) | ansible_become=yes |
Override the python interpreter on the host. | ansible_python_interpreter=/usr/local/bin/python |
Variables
Both group and host specific variables can also be defined outside of the inventory file by placing them within group_vars and host_vars directories, relative to the inventory file. This might be desirable if there are many variables that need to be defined, since it allows for easier organization. Regardless of what format your inventory file is written in (INI or YAML), the variables defined in these directories must be written in YAML or JSON.
For example, we can have the following directory structure:
ansible_root
├── group_vars
│ ├── all
│ │ └── general.yaml
│ ├── servers
│ │ └── only-servers.yaml
│ └── webservers
│ └── web-servers.yaml
├── host_vars
│ └── ap1.example.com
└── inventory.ini
Hosts will load inventory variables in the following order:
- the inventory file,
- the group_vars directories (from the most general group, all, through each of the intermediate parent groups to reach the most specific child group), and
- the host_vars directory.
The values that are loaded are then merged or flattened for each host before the playbook is executed. The rule of thumb is that the most specific takes precedence. In the case of host ap1.example.com, it would load its values from: the inventory file, group_vars/all, group_vars/servers, and host_vars/ap1.example.com. Each subsequent source overrides any previously set variables, allowing generic values to be applied by placing them in the more generic locations. Variables set in the playbook itself (for example under vars) take precedence over these inventory variables.
Read more on this at: https://docs.ansible.com/ansible/latest/user_guide/intro_inventory.html
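To illustrate the override behaviour, suppose two of the files in the tree above had the contents sketched below. The variable names and values (ntp_server, listen_port) are invented for this example.

```yaml
# group_vars/all/general.yaml (hypothetical contents)
ntp_server: pool.ntp.org
listen_port: 80
---
# host_vars/ap1.example.com (hypothetical contents)
listen_port: 8081
```

With these files, ap1.example.com would end up with ntp_server set to pool.ntp.org from the all group and listen_port set to 8081, because the host-specific value overrides the group-wide one. Every other host would keep listen_port at 80 unless a more specific group overrides it.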
Playbook
Playbooks define which tasks (or roles, which are just bundles of tasks) a host or host group needs to perform. When running a playbook with ansible-playbook, tasks are executed in the order they appear and in the order of the host pattern that is given, ensuring that runs are repeatable and reproducible.
Playbooks can be applied with the ansible-playbook command. After a playbook run, a recap will output the following counts for each host:
- ok - task ran successfully with no changes being made
- changed - task ran successfully with changes being made
- failed - task failed to run
- unreachable - host was unreachable
- skipped - task was skipped
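- ignored - task failed, but the failure was ignored (for example with ignore_errors)
- rescued - task failed, but the failure was handled by a rescue block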
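The rescued and ignored counters come from Ansible's error handling. As a minimal sketch: a task that fails inside a block but is handled by the rescue section counts towards rescued, while a failing task with ignore_errors counts towards ignored. The commands here are placeholders chosen only because they always fail.

```yaml
- name: Attempt something that may fail
  block:
    - name: This command always fails
      command: /bin/false
  rescue:
    - name: Runs only if a task in the block failed
      debug:
        msg: "Recovered from the failure"
- name: Failure is reported but does not stop the play
  command: /bin/false
  ignore_errors: yes
```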
An example playbook looks like this:
---
- name: Handler demo 1
  hosts: frt01.example.com
  gather_facts: no
  become: yes
  tasks:
    - name: Update Apache configuration
      template:
        src: template.j2
        dest: /etc/httpd/httpd.conf
      notify: Restart Apache
    - name: restart all services
      command: echo "this task will restart all services"
      notify: "restart all services"
  handlers:
    - name: Restart Apache
      service:
        name: httpd
        state: restarted
      listen: "restart all services"
Handlers can be called with notify. Handlers can also listen for a specific event with the listen directive.
Facts
You can get all facts by running:
# ansible all -m setup -i inventory.ini
Here are some facts that I commonly use:
# Operating system
"ansible_distribution": "Rocky"
"ansible_distribution_major_version": "8"
"ansible_distribution_release": "Green Obsidian"
"ansible_distribution_version": "8.4"
"ansible_os_family": "Rocky"
# Networking
"ansible_fqdn": "mc44"
"ansible_hostname": "mc44"
"ansible_nodename": "mc44"
"ansible_interfaces": [
"ib0",
"lo",
"eno16"
]
# Machine info
"ansible_memtotal_mb": 192037
"ansible_mounts": []
"ansible_machine": "x86_64"
"ansible_virtualization_type": "NA"
"ansible_memfree_mb": 190562
"ansible_swapfree_mb": 0
"ansible_swaptotal_mb": 0
You can determine if the environment is a VM or container by reading the value set under ansible_virtualization_type.
When defining tasks, you can conditionally run something with the when parameter followed by a truth statement. For example, to only run on Red Hat based systems, add when: "ansible_facts['os_family'] == 'RedHat'"
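As a sketch, the tasks below use facts in when conditions. They assume fact gathering is enabled for the play; the 'docker' value is one example of what ansible_virtualization_type may report and is used here purely for illustration.

```yaml
- name: Install Apache on Red Hat based systems only
  yum:
    name: httpd
    state: present
  when: ansible_facts['os_family'] == 'RedHat'
- name: Report when running inside a Docker container
  debug:
    msg: "Virtualization type: {{ ansible_virtualization_type }}"
  when: ansible_virtualization_type == 'docker'
```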
Loops
More about loops: https://docs.ansible.com/ansible/latest/user_guide/playbooks_loops.html#playbooks-loops
Loops can be done by specifying a list of items and passing it to loop, or with_items.
---
- name: Install Apache
  hosts: frt01.example.com
  gather_facts: no
  become: yes
  tasks:
    - name: Install Apache package
      yum:
        name: httpd
        state: latest
    - name: Open firewall for Apache
      firewalld:
        service: "{{ item }}"
        permanent: yes
        state: enabled
        immediate: yes
      loop:
        - "http"
        - "https"
    - name: Restart and enable the service
      service:
        name: httpd
        state: restarted
        enabled: yes
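Loop items do not have to be plain strings. As a sketch, each item below is a small dictionary whose fields are read with item.name and item.groups; the user names and groups are made up for the example.

```yaml
- name: Create several users with different groups
  user:
    name: "{{ item.name }}"
    groups: "{{ item.groups }}"
    append: yes
  loop:
    - { name: alice, groups: wheel }    # hypothetical users
    - { name: bob, groups: admin }
```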
Quick Demo
The quickest way to get your toes wet with Ansible is to try it out with a small set of Docker containers. Using the two docker images (TODO: link to docker files) and docker-compose, we can bring up a small lab to test Ansible with (TODO: link to docker-compose.yml).
Each container has an ansible account with ansible as the password. Using ansible, you can reach all of them using password authentication with the --ask-pass flag. To make the containers use SSH keys going forward, generate a new SSH key on the head node, then run this (TODO: link to playbook) playbook.
You may want to turn off SSH host key checking by creating an ansible.cfg file:
[defaults]
host_key_checking = false
You can then check that all the nodes are reachable by running a command on each of them:
# ansible -i inventory.ini all -m shell -a 'hostname'
Precedence
See: https://docs.ansible.com/ansible/latest/reference_appendices/general_precedence.html
Tips & Tricks
- An entire task can be written in one line as a key=value space delimited string. Ansible will parse the string out into a map before passing it to its module.
- When defining tasks, you can also call the import_tasks or import_role modules to load a task file or role directory.
- Ansible will load all group_vars/group_name/*.yml files as variables. The group_vars directory is relative to the inventory file.
- ansible-playbook can be executed against a specific node with the -l <host> limit option.
- ansible-playbook can start at a specific task, useful when debugging a specific task, using --start-at-task <task-name>.
- You can have ansible-playbook run only a specific set of tasks using the tags feature. Tag roles or tasks you want to run and then call ansible-playbook with --tags <tag>.
- name: do something
  tags: do-something
  debug:
    msg: "hello"
$ ansible-playbook example.yml --tags "do-something"
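The first two tips can be sketched as follows. The file name extra.yml is hypothetical, and installapache refers to the role used as an example in the Roles section.

```yaml
# One-line key=value form; equivalent to spelling out name and state as a map.
- name: Install vim
  package: name=vim state=present
# Load another task file from the same directory.
- name: Load extra tasks
  import_tasks: extra.yml        # hypothetical file
# Pull in a whole role from within a task list.
- name: Apply the Apache role
  import_role:
    name: installapache
```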
Distributed shell command
You can run commands on all your hosts with ansible, similar to what you can do with salt-call * cmd.run.
# ansible all -m shell -a "yum -y update sudo" -i clusters/x/inventory.ini -f 10
- Using the shell module, we run yum -y update sudo
- We use 10 jobs in parallel (up from the default of 5)
Formatting
Tasks with long conditions can be written out over multiple lines. Just use the YAML > (folded block scalar) operator.
- name: install some packages
  package:
    name: "{{ packages }}"
    update_cache: yes
    state: latest
  when: >
    ansible_distribution == 'Debian' and
    ansible_distribution_release == my_release and
    some_other_condition != true
Remember that you can specify arrays as inline lists in Ansible. For example:
- name: some task
tags: ["installation", "task"]
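Another way to lay out a long condition is to give when a list, in which case all entries must be true (an implicit and). This is a sketch; the package name and the my_release variable are placeholders.

```yaml
- name: install a package on matching Debian hosts
  package:
    name: vim
    state: latest
  when:
    - ansible_distribution == 'Debian'
    - ansible_distribution_release == my_release    # my_release is a hypothetical variable
```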
Jinja
Test Conditions
All test conditions can make use of Jinja templating. This allows for some interesting when conditions when determining when a task or role is to be executed. Some examples below.
Description | Example Condition |
---|---|
if the host is in the server group. | when: "'server' in group_names" |
if the host is not in the server group. | when: "'server' not in group_names" |
if some_value is true, or undefined. | when: "some_value | default(true)" |
if some_value is undefined | when: some_value is undefined |
if host is Red Hat | when: ansible_os_family == "RedHat" |
combining conditionals | when: ansible_os_family == "RedHat" |
checking whether a variable contains 'error' anywhere | when: some_value.find('error') != -1 |
Check if an array has items. | when: some_list | length > 0 |
See more on conditionals at: https://ansible-docs.readthedocs.io/zh/stable-2.0/rst/playbooks_conditionals.html
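Conditions can be combined with and, or, and parentheses. The sketch below is one way to do that; force_run is a hypothetical variable, defaulted to false so the condition still evaluates when it is not defined.

```yaml
- name: Run on Red Hat 8 hosts, or whenever forced
  debug:
    msg: "condition matched"
  when: >
    (ansible_os_family == "RedHat" and
    ansible_distribution_major_version == "8")
    or (force_run | default(false) | bool)
```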
Align Text
Useful when trying to make config files look nice.
Alignment | Code | Example |
---|---|---|
Left Align | {{ '%-20s' | format(my_text) }} | some value |
Center | {{ my_text | center(80) }} | some value |
Right Align | {{ '%20s' | format(my_text) }} | some value |
Comma Delimited List
Use the loop.last variable to determine whether this is the last item in the list. You may use either the if-expression or an actual if block. Both forms are shown below; use only one of them.
{% for item in my_list %}
{{ item }}
{# Option 1: inline if-expression #}
{{ "," if not loop.last else "" }}
{# Option 2: if block #}
{% if not loop.last %},{% endif %}
{% endfor %}
Idempotent Random Number
Random numbers are useful when configuring many hosts since it lets you spread out tasks over time (in the case of cronjobs) or over different resources (in the case of mounting network filesystems from different endpoints). However, we want the same random number for each host so that each run through of the playbook remains the same. To accomplish this, we can make use of Jinja's random function seeded with the host's inventory name.
vars:
  fileserver_index: "{{ 10 | random(seed=inventory_hostname) }}"
Doing this, we can use fileserver_index, which contains a value from 0 through 9.
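The same seeded filter can spread cron jobs over time. As a sketch, the task below picks a per-host minute that stays the same on every run; the job name and script path are hypothetical.

```yaml
- name: Schedule a nightly job at a stable, per-host minute
  cron:
    name: nightly-report                                      # hypothetical job name
    minute: "{{ 60 | random(seed=inventory_hostname) }}"      # 0 through 59
    hour: "2"
    job: /usr/local/bin/nightly-report                        # hypothetical script
```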
Lookup a DNS name
You can use lookup to call dig and look up an A record. For example:
# ansible localhost -m debug -a "msg={{ lookup('dig', 'nas') }}"
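The same lookup can be used from a task. This is a sketch; note that lookups run on the Ansible host rather than on the managed node, and the dig lookup is expected to need the dnspython library installed there.

```yaml
- name: Resolve the address of the NAS
  debug:
    msg: "nas resolves to {{ lookup('dig', 'nas') }}"
```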
Tasks
Disable SELinux
- name: Disable SELinux
  selinux: state=disabled
Troubleshooting
Import tasks not honoring when condition
Given the following task, the when condition appears not to be honored: the tasks in tasks.yml are still loaded and evaluated.
- name: load tasks if condition is true
  import_tasks: tasks.yml
  when: some_condition_var is defined and some_condition_var is sameas true
The problem here is that import_tasks will load all tasks in tasks.yml at parse time rather than at runtime. The when condition is not applied to the import itself; instead, all tasks within tasks.yml are loaded with the when condition applied to each task. With this example, if the condition is false, every imported task is still evaluated and then skipped.
To not load any of the tasks in tasks.yml when the condition is false, use include_tasks instead, which is processed at runtime.
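A sketch of the same task rewritten with include_tasks follows. Here the condition is evaluated once, on the include itself, and tasks.yml is only read when it is true.

```yaml
- name: load tasks only if condition is true
  include_tasks: tasks.yml
  when: some_condition_var is defined and some_condition_var is sameas true
```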
See Also
There are lots of resources online to help you get started. Some guides I've stumbled across which I found helpful were:
- https://coyote.systems/blog/devops/learning-ansible-with-docker
- https://www.digitalocean.com/community/cheatsheets/how-to-use-ansible-cheat-sheet-guide
To set up the initial user, use something like the following. Use --ask-pass when running the playbook so that you log in using a password first, then this playbook will set up the SSH keys.
HPC example