Ansible #2 How to create an Ansible Playbook

Last Updated on July 6, 2024 by David Both

Image by Gerd Altmann from Pixabay

In this introduction to Playbook creation, we examine a play that manages updates for a local Ansible controller machine.

A few years ago, I started using Ansible to simplify the administrative tasks for the multiple organizations I support. I especially wanted to simplify the management of my own fleet of eight or nine hosts (the exact count changes frequently, and I now have twelve). I have done all of my automation with Bash scripts since I started using Linux. I used scripts to automate the distribution of other update-automation scripts and of system and user configuration files, the installation of updates, and upgrades to newer releases of Fedora. And that does not include all those little ad hoc tasks that arise on what seems to be an hourly basis.

In this article, I create a playbook that can perform a system update while accounting for system differences.

I recently wrote about My First Day of Using Ansible. If you have not read that article, you might want to do so before continuing with this one. This article does assume that you have some very basic familiarity with using Ansible. If you have read my previous article you should be able to follow this one with no difficulty. In this article, we’ll define some important Ansible concepts and terms, discuss comments, and implement the first play of the playbook. The next article will continue with the two remaining plays, some follow-up information, and my resource list.

This article’s playbook does not require installing any Ansible modules or collections beyond the default built-in ones. This playbook has been tested with Ansible 2.9.14.

What is a playbook?

As I was writing this article, I was also watching some sessions of this year’s virtual AnsibleFest. As I watched, I developed my own description of a playbook—one which makes sense to me. This description contains bits of playbook wisdom from several of the presenters, as well as some books and blog posts I have read.

Here is my definition:

An Ansible playbook is a human-created description of the desired state of one or more computers. Ansible reads the playbook, compares the actual state of the computers with the state specified in the playbook, and performs the tasks required to bring those computers to conform to the state described in the playbook.

Updates

Installing updates on modern Linux hosts is a near-constant task, made important by the number of security fixes and feature improvements that come out continuously. If certain software, such as the kernel, glibc, or systemd, is updated, the host also needs to be rebooted. Some additional tasks also need to be performed, such as updating the man page database.

Although I have automated update installations with a cool script, some of my hosts need to be treated a little differently than others. For example, I do not want to update my firewall and email/web/DHCP/DNS server simultaneously. If the server reboots, the firewall cannot obtain the information it needs to proceed with its own updates, and neither can the other hosts in my network. If the firewall goes down, access to the external Fedora repositories is lost to my server and all other internal hosts. That means waiting for these two hosts to reboot before updates can begin on the other hosts.

So I used to do those updates by starting the scripts on the firewall and waiting for it to finish, then moving on to the server and waiting for it to finish, and only then running a small command-line program to execute the update script in sequence on the rest of my hosts. I could have written these dependencies into my scripts, but my first few days with Ansible showed me that it already has those capabilities and many more that would make my task much more straightforward. Really, MUCH simpler. And faster overall, with no intervention from me.

The Ansible strategy

Ansible uses a hub strategy for managing hosts. The Ansible software is installed on the host that acts as the controller. Client-side Ansible software isn’t required, so none needs to be installed on the remote hosts.

Ansible uses SSH, which is already installed by nearly every Linux distribution, to communicate with remote hosts. Although sysadmins can choose to use passwords for remote access to hosts, that certainly reduces the efficiency and hands-off nature of a tool like Ansible. So, like most other admins, I use public/private key pairs (PPKP), which are considered safer than passwords and allow automation of tasks on anywhere from one to thousands of remote hosts with no intervention from the administrator.

Ansible sends commands to the remote hosts via SSH and uses the results to determine the command’s success. It can also use those results to determine the next course of action using conditional when statements.

Defining the requirements

As with any program, whether written in C, Python, Bash, or any other language, I always start with a set of requirements. You have read my book, The Linux Philosophy for SysAdmins, haven’t you? The same is true for tools like Ansible. I need to define where I am going so I can identify when I have arrived.

These are the requirements for my Ansible playbook:

Play 1: Ansible controller

  1. First, install updates on the Ansible control node.
  2. Update the man page database.
  3. Reboot if necessary.
  4. Log in after the reboot and rerun the playbook. Because the control node is already in the desired state, no further actions will be taken, and Play 2 will begin.

Play 2: Servers

  1. Install updates on the firewalls and servers in series, i.e., one at a time.
  2. Update the man page database.
  3. Reboot if necessary.
  4. Wait for the first host to reboot, if necessary, before starting on the next.

Play 3: Workstations

  1. Do not start updating workstations until servers have been completed.
  2. Install updates on all workstations in parallel.
  3. Update the man page database.
  4. Reboot if necessary.

Yes, I know that there are other ways of doing this. I did think about doing the Ansible controller last so that I do not need to deal with restarting the playbook after the controller reboots. But the way I do things is to do updates on my primary workstation first—which is also my Ansible control node—and then do some testing before updating the rest of the systems, so this strategy works perfectly for that. Your needs will probably differ from mine, so you should use whatever method works best for you. The thing about Ansible is that it is flexible enough to accommodate different needs.

Now that I have the requirements for a task, I can begin the playbook.
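
As a rough outline, the three groups of requirements might map onto three plays like the sketch below. This is only a skeleton under stated assumptions: the group names servers and workstations are hypothetical inventory groups, and using the serial keyword for the one-at-a-time requirement is my suggestion rather than something this article has shown yet.

```yaml
---
# Hypothetical outline only. The group names "servers" and "workstations"
# are assumed inventory groups, not names taken from the article.
- name: Play 1 - Ansible controller
  hosts: david
  tasks: []     # install updates, refresh the man database, reboot if needed

- name: Play 2 - Firewall and servers, one host at a time
  hosts: servers
  serial: 1     # finish each host completely before starting the next
  tasks: []

- name: Play 3 - Workstations, several hosts at once
  hosts: workstations
  tasks: []     # Ansible works on up to "forks" hosts in parallel
```

Plays run in the order in which they appear, so Play 3 cannot begin until Play 2 has finished on every server.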

Syntax

Ansible playbooks must conform to standard YAML syntax and formatting. The most frequent errors I have encountered are my own formatting mistakes. This is usually because I have used or not used leading dashes as required, or I have used the wrong indentation.

The play names start in the first column of the playbook, and each following task is indented by two spaces. Two spaces indent each action in a task, and sub-tasks are further indented by two spaces. YAML does not mandate two spaces, but it does require that items at the same level be indented consistently, and two spaces is the convention I use throughout. Tabs are not allowed for indentation and will cause the playbook to fail with a syntax error. Extra white space at the end of a line is best avoided as well, because it is easy to miss and is flagged by YAML linters.

You will make formatting errors and will quickly learn to see the problems. There are some tools we can use to assist us in locating these errors before attempting to run playbooks, and they can save a great deal of time in the long run.
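
The article does not name these tools, so the following is an assumption on my part: one commonly used option is the yamllint program, and Ansible itself offers a syntax check through ansible-playbook --syntax-check doUpdates.yml. A minimal yamllint configuration that enforces the two-space convention described above might look like this sketch.

```yaml
# .yamllint - a hypothetical project-level configuration file for yamllint.
# It starts from yamllint's default rule set and adjusts three rules.
extends: default
rules:
  indentation:
    spaces: 2              # enforce the two-space convention
  trailing-spaces: enable  # report white space at the end of a line
  line-length: disable     # long task names are acceptable here
```

Running the linter before every playbook run catches most indentation slips while they are still cheap to fix.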

Starting the playbook

So let’s start a playbook that will perform those tasks in the required sequence. Playbooks are simply collections of tasks that define the desired state of a host. A hostname or inventory group is specified at the beginning of each play and defines the hosts on which Ansible will run that play.
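
To make the idea of an inventory group concrete, here is a small sketch of an inventory written in YAML. Everything in it except the host david is invented for illustration: the file name inventory.yml, the group names, and the example.com hostnames are all hypothetical.

```yaml
# inventory.yml - a hypothetical YAML inventory. Every hostname except
# david, and both group names, are invented for illustration.
all:
  hosts:
    david:
  children:
    servers:
      hosts:
        firewall.example.com:
        server.example.com:
    workstations:
      hosts:
        ws1.example.com:
        ws2.example.com:
```

A play could then name a single host (hosts: david) or a whole group (hosts: servers), and a file like this one would be selected with the -i option of ansible-playbook.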

Our playbook will contain three plays, one for each type of host I identified in the requirements statement. Each play will have slightly different logic but will produce the same result: one or more hosts with all updates installed.

My playbook is named doUpdates.yml and is located in the /root/ansible/Updates directory, which I created for this project. The Bash program installed by the plays in this playbook is located in the /root/ansible/Updates/files directory.

Let’s explore this playbook one section at a time.

Defining this as a YAML file

I start all of my code with well-structured comments so that the file name and a short description of this playbook exist for myself or some other sysadmin in the future. Playbooks can contain comments, although I have seen few articles or books that mention this.

As a Sysadmin who believes in documenting everything, I find that comments can be very helpful. This is not so much about saying the same things in the comments as I do in the task name, but instead, it is about identifying the purpose of groups of tasks and ensuring that I record my reasons for doing certain things in a certain way or order. This can help with debugging problems at a later date when I might have forgotten my original thinking. As in Bash, comments begin with a #.

The primary function of this first section is the three dashes (---) that mark the start of a YAML document. The yml extension on the file name stands for YAML. I have seen a couple of meanings for that, but my bet is on “Yet Another Markup Language,” despite the fact that I have seen some claims that YAML is not one. (The YAML specification itself expands the name as the recursive “YAML Ain’t Markup Language.”)

########################################################################
#                             doUpdates.yml
#------------------------------------------------------------------
# This playbook installs all available RPM updates on the inventory hosts.
#
#    
#------------------------------------------------------------------
#
# Change History              
# Date        Name         Version   Description
# 2020/10/01  David Both   00.00     Started new code
# 2020/10/10  David Both   01.00     First release code finished
# 2020/10/18  David Both   01.01     Minor changes to sequence and
#                                    fix a couple minor problems.
#
########################################################################
---

The first play

This next section defines the first play in the playbook. Playbooks can have one or more plays, and ours has three: one for the control host on which Ansible is run, one for the two servers on my network, and one for the rest of the workstations. I define the first play here; after all, this is a playbook.

Notice that the play begins in the first column, and then there is strict indentation of the remaining lines in the play. No statement defines the beginning of the play. Ansible uses the rigid YAML structure to determine where each play and task begins.

########################################################################
########################################################################
# Play 1 - Do updates for host david
########################################################################
########################################################################
- name: Play 1 - Install updates on david - the Ansible controller
  hosts: david
  remote_user: root
  vars:
    run: false
    reboot: false

We will run into various keywords frequently, so here are some explanations that I wish I had when I started working with Ansible.

name: This line is the play’s name, and the play name is displayed in the STDOUT data stream. This makes it easy to identify each play as it runs, whether I watch in real time or view the redirected stream later. Ansible does not strictly require the name keyword for plays or tasks, but I recommend supplying one with meaningful text for each.

hosts: This defines the hostnames on which the play will be executed. It can contain a comma-separated list of hostnames or the name of a group of hosts. The host group and the names of all listed hosts must appear in the inventory file. By default, that is /etc/ansible/hosts, but it can be another file so long as you use the -i (--inventory) option to specify the alternative file.

remote_user: This line is not a requirement, but it does specify the user that Ansible will act as on the remote host. If the user on the remote host is the same as the user on the localhost, this line is not needed. By default, Ansible uses the same user ID on the remote host as the user that runs the Ansible playbook. I use it here simply for informational purposes. I run most playbooks as root on the localhost, so Ansible logs in to the remote host as root.

vars: This section can be used to define one or more variables, which can be used as in any programming language. In this case, I use them in conditional “when” statements later in the playbook to control the execution path.

The scope of variables is limited to the section in which they are defined. In this case, they are defined in Play 1, so they are limited to that play. If I want to use them in later plays, I will need to set them again in each play in which they are required. If a variable is set in the vars: section of a task, it is available only within that task and not in the rest of that play.

Variable values can be overridden at the command line by using the -e (--extra-vars) option to specify a different value. We will see that when it is time to run the playbook.
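
The small stand-alone play below is a sketch for experimenting with this override; it is not part of doUpdates.yml, and the file name show-vars.yml is hypothetical. It prints the value and type of run, so you can compare a plain run with one that passes -e "run=true". Values passed in key=value form arrive as strings, which is worth keeping in mind when a conditional compares a variable with the string "true".

```yaml
---
# show-vars.yml - hypothetical file name for this experiment.
# Try it both ways:
#   ansible-playbook show-vars.yml
#   ansible-playbook show-vars.yml -e "run=true"
- name: Show how an extra variable overrides a play variable
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    run: false            # default value, as in Play 1
  tasks:
    - name: Report the value and type of run
      debug:
        msg: "run is {{ run }} ({{ run | type_debug }})"
```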

The tasks

This is the beginning of the tasks section for Play 1. The tasks: keyword is indented by exactly two spaces. I give each task a name statement; Ansible does not strictly require one, but the text makes it easier to follow the playbook logic and is displayed on the screen during execution to assist as I follow the progress in real time.

  tasks:
########################################################################
# Do some preliminary checking
########################################################################
    - name: Install the latest version of the doUpdates script
      copy:
        src: /root/ansible/Updates/files/doUpdates
        dest: /usr/local/bin
        mode: "0774"
        owner: root
        group: root

    - name: Check for currently available updates
      command: doUpdates -c
      register: check
    - debug: var=check.stdout_lines

This first section contains three tasks. The first task copies a Bash program I wrote to the target host. The second runs the program just installed and registers its result, including the STDOUT data stream from the doUpdates program, in the variable “check.” The third task prints all of the STDOUT lines in the check variable to the screen.

Let’s look at the new keywords in a bit more detail:

copy: The copy keyword defines the beginning of a stanza that can copy one or more files from a specified source location (src) to a specified target location (dest). The keywords in this section define various aspects of the copy operation and the final state of the copied file.

src: This is the fully qualified path and name of the file to be copied. In this case, I am only going to copy a single file, but it is easy to copy all files in a directory or only those that match a file glob pattern. The source file is usually stored in a location in the Ansible hub directory tree. In this case, the fully qualified path to my source file is /root/ansible/Updates/files/doUpdates.

dest: This is the destination path on the target host(s) into which the source file will be copied.

mode: The mode keyword defines the file mode that will be applied to the copied file. Regardless of the file mode of the source file, Ansible will set the file mode to that specified in this statement, for example 0754 (rwxr-xr--). Be sure to use all four digits when using the octal format, and consider quoting the value so that YAML passes it to Ansible as a string rather than a number.

owner: This is the owner account that will be applied to the file.

group: This is the group account that will be applied to the file.
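
Putting the copy keywords together, a task might look like the following sketch. It uses the paths from this article, but the explicit destination file name, the 0754 mode, and the symbolic alternative in the comment are illustrative choices of mine, not the author's exact task.

```yaml
---
- name: Sketch of a copy task with an explicit file mode
  hosts: david
  remote_user: root
  tasks:
    - name: Install the doUpdates script
      copy:
        src: /root/ansible/Updates/files/doUpdates
        dest: /usr/local/bin/doUpdates
        owner: root
        group: root
        mode: "0754"            # quoted so YAML keeps it as a string
        # mode: u=rwx,g=rx,o=r  # symbolic form of the same permissions
```

Quoting the octal value avoids any ambiguity about whether the YAML parser treats it as an octal or a decimal number.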

command: This keyword runs a command-line program, along with its options and arguments, on the target host. The command is not processed by a shell, so shell features such as pipes and redirection are not available; the shell module exists for those cases. I have used the Bash program that was just installed to obtain some information that is not easily obtainable using the Ansible built-ins, such as dnf.

register: This keyword stores the result of the command specified above, including its STDOUT, in a variable named “check.” The when: keyword can then query the content of this variable as a conditional to determine whether the task of which it is a part will be performed. We will see this in the next section.

debug: Prints the content of the specified variable to the screen. I frequently use this as a debug tool. I find this helpful in debugging. Hint, hint.
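
A registered variable holds more than the STDOUT text. The stand-alone sketch below shows a few of the fields that are commonly useful; to keep it runnable anywhere, an echo command stands in for doUpdates -c, and the changed_when line is my addition rather than part of the original playbook.

```yaml
---
- name: Sketch of inspecting a registered result
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Run a stand-in command (echo replaces doUpdates -c here)
      command: echo "updates ARE available"
      register: check
      changed_when: false   # a read-only check should not report "changed"

    - name: Show the fields most often used in conditionals
      debug:
        msg:
          - "return code: {{ check.rc }}"
          - "stdout: {{ check.stdout }}"
          - "number of lines: {{ check.stdout_lines | length }}"
```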

Now a little about my doUpdates Bash program.

I originally wrote this Bash program to actually do the updates that I have now begun doing with Ansible. It contains some code that determines whether updates are available. It also determines whether the kernel, systemd, or glibc have been updated, any one of which should require a reboot to take full effect. My program emits a couple of lines to STDOUT that I can use in Ansible as a conditional to decide whether to reboot the target host. I use that in this next section, which does the actual updates, and the following one performs a power off for my primary workstation. Similar code performs a reboot on all other hosts, as you will see.

The STDOUT from this program used with the -c option looks like this when updates are available, but a reboot is not required. I can use a regular expression to search any of the text in this data stream for key strings, which can be used in a when: conditional to determine whether a specific task is performed.

TASK [debug] ***********************************************************************************************
ok: [wally1] => {
    "check.stdout_lines": [
        "########## 48 updates ARE available for host wally1.both.org. ##########",
        "########## Including: ##########",
        "Last metadata expiration check: 1:47:12 ago on Tue 20 Oct 2020 01:50:07 PM EDT.",
        "Updates Information Summary: available",
        "    3 Security notice(s)",
        "        2 Moderate Security notice(s)",
        "    3 Bugfix notice(s)",
        "    2 Enhancement notice(s)",
        "    2 other notice(s)",
        "########## A reboot will NOT be required after these updates are installed. ##########",
        "Program terminated normally"
    ]
}

This next section performs the actual updates if all of the conditionals in the when: statement are true. This section uses the built-in Ansible dnf module.

########################################################################
# Do the updates.
########################################################################
# Install all available updates
    - name: Install all current updates
      dnf:
        name: "*"
        state: latest
      when: (check.stdout | regex_search('updates ARE available')) and run == "true"

dnf: Calls the Ansible built-in that interfaces with the DNF package manager. Although a bit limited in its capabilities, it can install, remove, and update packages. One of the limitations of the DNF module is that it does not have the check-update function. Therefore, I continue to use my Bash program to discover the list of packages to be updated and from that determine whether a reboot (or power off) needs to be performed. Ansible also has YUM and APT built-ins.

name: Provides the name of the package on which to operate. In this case, the file glob character * denotes all installed packages.

state: The “latest” value for this keyword indicates that all installed packages are to be brought up to the most recent version. Some of the other state options are “present,” which means that the package is installed but not necessarily the latest version, and “absent,” which means to remove the package if it is installed.

when: This conditional phrase specifies the conditions which must be met for this task to run. In this instance, the updates will only be installed when the string of text defined in the regular expression is present in the variable “check” that was previously registered, and the “run” variable is set to “true.”
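
For comparison, here is a sketch of another way the same kind of conditional could be written. It is an alternative of my own, not the form used in doUpdates.yml: a list of conditions under when: is combined with a logical AND, the in operator does a plain substring test, and the bool filter accepts both a real boolean and the string "true" that arrives from the command line. An echo command and a debug task stand in for the real check and the real update.

```yaml
---
- name: Sketch of an alternative way to write the conditional
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    run: false              # override with: -e "run=true"
  tasks:
    - name: Stand-in for the doUpdates -c check
      command: echo "48 updates ARE available"
      register: check
      changed_when: false

    - name: Pretend to install updates
      debug:
        msg: "Updates would be installed now"
      when:
        - "'updates ARE available' in check.stdout"
        - run | bool
```

Run without extra variables, the second task is skipped because run is false; run with -e "run=true", both conditions hold and the task executes.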

Now that the updates have been done, I may need to reboot, so let’s see how I can deal with that. The next task does that for us.

First, I have a bit of documentation via comments, which explains that I perform a power off rather than a reboot for this host due to a potential hardware issue. This is a perfect example of why we should include comments in our code and playbooks: they explain why this particular task is needed and why it is different from the others. It is also a great example of how I can treat one host differently from others.

########################################################################
# Now poweroff host david because of MB problems that won't let it
# do a reboot without manual intervention. Need to see if I
# can figure out this problem and fix it but it is a hardware issue
# and this is just a temporary circumvention. 
########################################################################
    - name: Poweroff this host if necessary and reboot extra variable is true
      command: poweroff
      when: (check.stdout | regex_search('reboot will be required')) and reboot == "true" and run == "true"

In this task, I send the poweroff command instead of the preferred action of rebooting the computer. I do this for the reason stated in the comments: Because the motherboard on my primary workstation—which is also my Ansible hub—does not properly reboot. This is probably due to a misconfiguration on my part rather than any type of malfunction. I have not yet discovered the reason for it because I find the time I have already spent in searches and making BIOS configuration changes has exceeded my tolerance limit, and I need to get the work done.

Author’s note: I eventually did fix this problem. As I suspected, it was a UEFI motherboard configuration problem. Specifically, I had misconfigured overclocking.

Execution of the playbook stops after that power off (or reboot, if your Ansible hub reboots properly), so I need to restart the playbook once the host is up and running again. This time, because the updates have been installed, the power off or reboot does not occur, and the next play is run.

And this concludes the first play.

Wrapping up

We’ve covered a lot of information here and I hope that a solid explanation of the keywords and tasks was helpful. Scripts will continue to play an essential part in my administrative automation. However, Ansible looks like it can take over many tasks and do them far better than even complex scripts. It is all about the playbook.

It’s also clear what my opinion is of comments: They are critical. Play 1 gets our Ansible playbook rolling. In the next article, we’ll conclude the playbook with two additional plays to manage the firewall and servers, and then the rest of the hosts on the network. I’ll also provide some follow-up details and concepts.