Creating a Dynamic Inventory Script for Ansible

It seems no one has written a blog post on creating dynamic inventory scripts for Ansible in a while. I feel this topic could use an update as some of the information I found was incomplete or out of date.

My goal was to convert Terraform’s tfstate data for DigitalOcean into a usable inventory script. Keep that in mind, as it drove many specifics of how the script works. I also want to note that the script I reference is a first pass at getting a working inventory script.

So first, the script (in its current state):

#!/usr/bin/python3

import subprocess
import argparse
import json

relevant_tf_state_values = {
    'digitalocean_droplet': ['name', 'ipv4_address', 'ipv4_address_private', 'tags'],
    'digitalocean_database_cluster': ['name', 'host', 'private_host', 'port'],
    'digitalocean_database_user': ['name', 'password'],
    'digitalocean_database': ['name'],
    'digitalocean_domain': ['id'],
    'digitalocean_volume': ['name', 'size', 'initial_filesystem_type'],
    'digitalocean_ssh_key': ['name', 'fingerprint']
}

extra_vars = {
    'ansible_ssh_user': 'root',
    'web_mount_point': '/mnt/nfs/data',
    'web_mount_point_type': 'nfs',
    'ansible_ssh_common_args': '-o StrictHostKeyChecking=no -o userknownhostsfile=/dev/null'
}

class DigitalOceanInventory(object):

    def __init__(self):
        self.tags = []
        self.droplets = []
        self.vars = {}
        self.inventory_json = json.loads(self._get_terraform_output())
        self._generate_groups()
        self._generate_vars()
        self.ansible_inventory = self._generate_ansible_inventory()
    
    def _get_terraform_output(self):
        process = subprocess.Popen(['terraform', 'show', '-json'],
                                   stdout=subprocess.PIPE,
                                   stderr=subprocess.PIPE,
                                   universal_newlines=True)
        stdout, stderr = process.communicate()
        return stdout

    def _parse_resource(self, resource, resource_type, relevant_objects):
        data = {}
        for key, value in resource['values'].items():
            if key in relevant_objects:
                data[f'{resource_type}_{key}'] = value
        return data

    def _generate_groups(self):
        tags = 'digitalocean_tag'
        droplets = 'digitalocean_droplet'
        for resource in self.inventory_json['values']['root_module']['resources']:
            if resource['type'] == tags:
                self.tags.append(resource['values']['name'])
            elif resource['type'] == droplets:
                self.droplets.append(self._parse_resource(resource, droplets, relevant_tf_state_values[droplets]))

    def _generate_vars(self):
        for resource in self.inventory_json['values']['root_module']['resources']:
            # Tags and droplets become groups and hosts, so keep them out of the shared vars.
            if resource['type'] in relevant_tf_state_values.keys() and resource['type'] not in \
                    ['digitalocean_tag', 'digitalocean_droplet']:
                for key, value in resource['values'].items():
                    if key in relevant_tf_state_values[resource['type']] and key not in ['ip', 'tags']:
                        resource_id = resource['type']
                        self.vars[f'{resource_id}_{key}'] = value
                for key, value in extra_vars.items():
                    self.vars[key] = value

    def _generate_ansible_inventory(self):
        inventory = {}
        for tag in self.tags:
            hosts = []
            public_ips = []
            private_ips = []
            inventory[tag] = {}
            for droplet in self.droplets:
                if tag in droplet['digitalocean_droplet_tags']:
                    hosts.append(droplet['digitalocean_droplet_ipv4_address'])
                    public_ips.append(droplet['digitalocean_droplet_ipv4_address'])
                    private_ips.append(droplet['digitalocean_droplet_ipv4_address_private'])
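            # Assign once per tag, outside the droplet loop, so a tag still gets both keys when there are no droplets.
            inventory[tag]['hosts'] = hosts
            inventory[tag]['vars'] = self.vars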
            ansible_tag = tag.replace('-', '_')
            inventory[tag]['vars'][f'{ansible_tag}_public_ips'] = public_ips
            inventory[tag]['vars'][f'{ansible_tag}_private_ips'] = private_ips
            if 'digitalocean_volume_name' in inventory[tag]['vars']:
                nfs_mount_point = str('/mnt/' + inventory[tag]['vars']['digitalocean_volume_name'].replace('-', '_'))
                inventory[tag]['vars']['nfs_mount_point'] = nfs_mount_point
        inventory['_meta'] = {}
        inventory['_meta']['hostvars'] = {}
        return inventory

    def get_inventory(self):
        return json.dumps(self.ansible_inventory, indent=2)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--save', '-s', help='Generates Ansible inventory and stores to disk as inventory.json.',
                        action='store_true')
    parser.add_argument('--list', action='store_true')
    args = parser.parse_args()
    do = DigitalOceanInventory()
    if args.list:
        print(do.get_inventory())
    elif args.save:
        with open('inventory.json', 'w') as inventory:
            inventory.write(do.get_inventory())


if __name__ == '__main__':
    main()

At a high level, we’re getting the tfstate from Terraform by running terraform show -json. Next, we generate host groups by piggybacking on the tags added to host resources during creation. Then we parse the other resources to get the subset of information that we’re interested in. Finally, we build a Python object with all the data in the desired format, dump it as JSON, and either print it to stdout or write it to inventory.json.
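As a rough illustration of the grouping step, here is a minimal sketch that applies the same tag-to-host-group idea to a hand-written sample instead of real terraform show -json output. The sample data and the group_hosts_by_tag helper are made up for illustration; only the values / root_module / resources layout is taken from the script above.

```python
# Hypothetical, trimmed-down stand-in for `terraform show -json` output.
sample_state = {
    'values': {'root_module': {'resources': [
        {'type': 'digitalocean_tag', 'values': {'name': 'web-node'}},
        {'type': 'digitalocean_tag', 'values': {'name': 'nfs-node'}},
        {'type': 'digitalocean_droplet',
         'values': {'name': 'web-1', 'ipv4_address': '203.0.113.10', 'tags': ['web-node']}},
        {'type': 'digitalocean_droplet',
         'values': {'name': 'nfs-1', 'ipv4_address': '203.0.113.20', 'tags': ['nfs-node']}},
    ]}}
}


def group_hosts_by_tag(state):
    """Return {tag: [public IPs of the droplets carrying that tag]}."""
    resources = state['values']['root_module']['resources']
    # Every tag resource becomes a (possibly empty) host group.
    groups = {r['values']['name']: [] for r in resources if r['type'] == 'digitalocean_tag'}
    for r in resources:
        if r['type'] != 'digitalocean_droplet':
            continue
        for tag in r['values'].get('tags', []):
            if tag in groups:
                groups[tag].append(r['values']['ipv4_address'])
    return groups


print(group_hosts_by_tag(sample_state))
```

The real script does more than this (it also collects vars from the other resource types), but the core idea is the same: tags define the groups, and droplets are sorted into them.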

The inventory output looks something like this:

{
  "tag-name-node": {
    "hosts": [
      "10.0.0.1"
    ],
    "vars": {
      "digitalocean_ssh_key_fingerprint": "00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD:EE:FF",
      "digitalocean_ssh_key_name": "sshkeyname",
      "ansible_ssh_user": "root",
      "web_mount_point": "/mnt/nfs/data",
      "web_mount_point_type": "nfs",
      "ansible_ssh_common_args": "-o StrictHostKeyChecking=no -o userknownhostsfile=/dev/null",
      "digitalocean_database_cluster_host": "something.ondigitalocean.com",
      "digitalocean_database_cluster_name": "db-name",
      "digitalocean_database_cluster_port": 25060,
      "digitalocean_database_cluster_private_host": "private.something.ondigitalocean.com",
      "digitalocean_database_user_name": "wordpress",
      "digitalocean_database_user_password": "password",
      "digitalocean_domain_id": "something.com",
      "digitalocean_volume_initial_filesystem_type": "ext4",
      "digitalocean_volume_name": "volume-name",
      "digitalocean_volume_size": 5,
      "nfs_node_public_ips": [
        "10.0.0.1"
      ],
      "nfs_node_private_ips": [
        "10.0.0.1"
      ],
      "nfs_mount_point": "/mnt/barista_cloud_volume"
    }
  },
  "_meta": {
    "hostvars": {}
  }
}

Now, if you try to feed this to Ansible as a static inventory file, it will not be parsed correctly. The dynamic inventory JSON format is not the same as the JSON inventory format. This took me a while to figure out and is honestly kind of frustrating, as it makes creating a working JSON template for quick iteration and testing much more difficult than it needs to be. On the topic of gotchas, here are a few more to be aware of.

  1. Your inventory script does not have to be written in Python, but it must include a shebang at the top so it can be executed. It must also be executable, so chmod +x your script.
  2. The inventory script must accept the --list flag. It’s also supposed to accept --host and return details on a single host, but I have neither needed nor implemented it.
  3. Even if you are not adding vars for specific hosts, you MUST include the _meta section in your inventory. Without it, Ansible calls the script with --host once per host, which this script does not implement.
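To tie those three points together, here is a minimal sketch of an inventory script that has a shebang, handles both --list and --host, and always returns a _meta section. The INVENTORY data and the build_output helper are placeholders I made up for illustration; a real script would build the data from an external source, the way the Terraform script above does.

```python
#!/usr/bin/env python3
import argparse
import json

# Hypothetical static data standing in for whatever a real script would look up.
INVENTORY = {
    'web_node': {'hosts': ['203.0.113.10'], 'vars': {'ansible_ssh_user': 'root'}},
    '_meta': {'hostvars': {'203.0.113.10': {'role': 'web'}}},
}


def build_output(list_all=False, host=None):
    """Return the JSON string for --list or --host."""
    if list_all:
        return json.dumps(INVENTORY, indent=2)
    if host is not None:
        # --host returns that host's variables, or an empty object if it has none.
        return json.dumps(INVENTORY['_meta']['hostvars'].get(host, {}), indent=2)
    return json.dumps({})


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('--list', action='store_true')
    parser.add_argument('--host')
    args = parser.parse_args(argv)
    print(build_output(list_all=args.list, host=args.host))


if __name__ == '__main__':
    main()
```

After a chmod +x, running the script with --list prints the whole inventory, and running it with --host followed by a host name prints only that host's variables.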

That’s about it. I will probably come back around and clean this script up and make it more reusable. Heck, I might put together a boilerplate script that can make creating custom dynamic inventory scripts quicker. As mentioned before, this is a first pass attempt to get something that works for my use case.
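Coming back to the format mismatch for a moment, here is a minimal sketch of turning script-style output into the layout a static inventory file uses. It is based on my understanding of the two formats: a script returns each group's hosts as a list and puts host variables under _meta.hostvars, while a static YAML or JSON inventory lists hosts as a mapping of host name to host variables. The to_static_inventory helper is hypothetical, ignores child groups, and is not part of the script above.

```python
import json


def to_static_inventory(dynamic):
    """Convert dynamic-inventory-style data into a static-inventory-style dict."""
    hostvars = dynamic.get('_meta', {}).get('hostvars', {})
    children = {}
    for group, data in dynamic.items():
        if group == '_meta':
            continue
        children[group] = {
            # Static inventories key hosts by name; the value holds per-host vars.
            'hosts': {host: dict(hostvars.get(host, {})) for host in data.get('hosts', [])},
            'vars': dict(data.get('vars', {})),
        }
    # Assumption: nesting every group under all.children is enough for simple cases.
    return {'all': {'children': children}}


dynamic = {
    'web_node': {'hosts': ['203.0.113.10'], 'vars': {'ansible_ssh_user': 'root'}},
    '_meta': {'hostvars': {}},
}
print(json.dumps(to_static_inventory(dynamic), indent=2))
```

Something like this could make it easier to save a snapshot of the script's output and iterate on playbooks against it, but treat it as a starting point and check the result with ansible-inventory before relying on it.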

Finally, I feel I would be remiss if I did not include the tidbits of info I found scattered around the web that helped me figure this out.

https://www.jeffgeerling.com/blog/creating-custom-dynamic-inventories-ansible (Jeff, as always, is an invaluable resource on all things Ansible.)

https://docs.ansible.com/ansible/2.9/dev_guide/developing_inventory.html

https://adamj.eu/tech/2016/12/04/writing-a-custom-ansible-dynamic-inventory-script/

Thanks all folks. Have a good weekend!

 

Using Terraform to Manage DigitalOcean Resources

I am a fan of DigitalOcean. What they lack in breadth of services they more than make up for with ease of use, documentation, and tutorials. Last year, I overhauled this website to be driven by Ansible. This year, I want to take this automation to the next level. There are capability gaps when using Ansible to create infrastructure that I’ve had to work around by doing some tasks manually or by writing custom scripts.

An example of this comes when trying to create a managed database cluster. Ansible cannot do this so I wrote a Python script to handle database management.

https://github.com/seaburr/WordPressOnDigitalOcean/tree/master/roles/database-server

I do not feel DigitalOcean should fill the gaps either. Why? Because Ansible is a configuration management tool that ensures resources are configured in a desired state. Infrastructure creation is not Ansible’s job. There are specific tools for infrastructure creation… Enter Terraform.

Terraform is a tool for defining providers (like DigitalOcean or AWS) and the resources (like droplets, load balancers, etc.) that your environment requires. Terraform’s intent is to compare your infrastructure to your desired state and make corrections to bring your resources into compliance. It is a different concern from HOW the infrastructure is configured.

Over the next few months, I’m going to migrate infrastructure concerns out of Ansible and into Terraform. In fact, I’ve already got a POC to share.

https://github.com/seaburr/Terraform-On-DO

This repository defines the new standard for infrastructure that I am aiming for.

Here’s a simple mockup of the goal:

I did try to use the built-in graph functionality of Terraform to show this but it came out looking like this:

I’ve got boxes full of Pepe!

Anyways, it’s a work in progress. I’ve run into what I believe is a bug with the DigitalOcean Terraform provider, and I’ve already raised a ticket with them to get it resolved.

Next time, let’s actually learn something and dig into a resource and the provider configuration.

 

Borked Website: A Short Story

Today, I broke this website while testing some minor changes to the deployment scripts. I tried to figure out what went wrong (I messed up something with Apache while trying to renew the SSL cert). I couldn’t get it sorted out, so I blew up the droplet (VM, EC2 instance, whatever) and re-executed the existing playbooks. What enabled me to do this?

Things that allowed me to recover:

  1. These scripts include daily backups so even if the entire WordPress deployment needs to be re-created from scratch, the data is tarred up and ready to go.
  2. The website data is not stored on the VM but on an attached persistent volume.
  3. The database is a standalone, managed MySQL instance.
  4. Playbooks and roles are designed to be idempotent so re-running them is safe. They aim for desired state meaning no change if it’s not needed.

How I recovered:

So I simply destroyed the droplet, recreated it, and re-provisioned it. I had to perform a few tasks manually in DigitalOcean (whitelisting the droplet IP to the MySQL instance and pointing the floating IP to the new droplet) but even these tasks can be automated in the future (and will be).

All in all, I spent about an hour trying to figure out what I broke and another fifteen minutes to blow away and recreate the host. This is the way… Or at least, this is the way towards the way.

 

WordPress on DigitalOcean Updates

This project hasn’t been touched in a few months, so last night I set out to give it a spin. It failed. Miserably. So I went bug-hunting and got it working again.

Resolved Issues

  • centos-base
    • Removed packages no longer available that were causing role to fail.
    • Added package Glances to replace htop.
    • Fixed an issue with fail2ban configuration tasks.
    • Enhanced fail2ban configuration.
  • create-swap
    • Resolved a typo in a task that prevented swap file from being created.
  • install-apache
    • Enabled gzip compression to reduce page load times.
    • Enabled caching to reduce page load times.
    • Removed hardcoded values in vhost.conf.j2 that would have resulted in a misconfigured HTTP to HTTPS redirect.
  • install-certbot
    • Fixed issues that would have prevented automatic renewal cron job from being created.
  • create-droplet
    • Changed default droplet size from 1 GB to 2 GB.
  • destroy-droplet
    • Removed hardcoded region that would have prevented deleting droplets not deployed in region NYC1.
  • install-wordpress
    • Fixed an issue where MySQL port was not being added to wp-config.php, preventing WordPress from starting.
    • Fixed an issue where Apache could not access document root.
    • Fixed an issue where wp-config.php was getting incorrect database connection details.
  • database-server
    • Simplified data returned from script used to create database servers to resolve an issue in install-wordpress.

See commit: https://github.com/seaburr/WordPressOnDigitalOcean/commit/f6226a2ac92a71f891dc23a6fc04a3d521fb227a

Next steps will be focusing on adding an automatic build job to help ensure that this code is always in good, working order.

Have a good weekend and take care of yourselves.

 

Ansible and Text Encoding and Line Endings and Git and Windows and Frustration

I ran into an issue the other day with Ansible while provisioning a Windows machine. After installing InstallShield 2015 SAB, Ansible copies a small license configuration file into the installation directory.

[FlexNet Publisher Server]
Server=Port@ServerName

That’s the configuration. Should be easy right?

Not in this case. I was confounded by errors while testing the installation. If I opened the file in Notepad and saved it, InstallShield would start working.

I had tried all of the usual suspects once I realized there was something wrong with the file. Change the text encoding from UTF8 to Windows ANSI. No change. Change the line endings from LF to CRLF. No change.

So what was going on?

As it turns out, my text editor (Atom) was adding an extra LF to the end of the file. Why would it do that? Well, this is part of the POSIX standard.

See: https://gcc.gnu.org/legacy-ml/gcc/2003-11/msg01568.html

Before saving.
After saving.

To get around this issue, I uploaded this file to Artifactory and I treat the configuration file as an artifact, like the InstallShield installer, and just download and copy it to the installation directory.
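A quick way to confirm a stray trailing newline like the one described above is to look at the file's raw bytes rather than trusting what the editor shows. The sketch below is only an illustration: the ends_with_newline and write_exact helpers and the license.ini file name are placeholders, not something from my playbooks.

```python
import os
import tempfile


def ends_with_newline(path):
    """Return True if the file's last byte is LF (this also covers CRLF)."""
    with open(path, 'rb') as handle:
        data = handle.read()
    return data.endswith(b'\n')


def write_exact(path, text):
    """Write text as bytes, with no newline translation and nothing appended."""
    with open(path, 'wb') as handle:
        handle.write(text.encode('utf-8'))


# Demo against a temporary file; the file name is a placeholder.
demo_path = os.path.join(tempfile.mkdtemp(), 'license.ini')

# First without a trailing newline, then with the extra LF an editor might add.
write_exact(demo_path, '[FlexNet Publisher Server]\r\nServer=Port@ServerName')
print(ends_with_newline(demo_path))

write_exact(demo_path, '[FlexNet Publisher Server]\r\nServer=Port@ServerName\n')
print(ends_with_newline(demo_path))
```

Opening the file in binary mode matters here, because text mode can translate line endings and hide exactly the difference you are trying to find.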

This is a reminder to be conscious of how Git and your editor treat files when you’re provisioning Windows machines. Occasionally, you will still run into maddening little issues like this. I don’t want to admit how long this thing left me stumped.

 

Re-Architecting This Website VIII

“Done. For now…”

All of my goals for the first re-architecture of this website are now complete. This evening I fixed the backup script, added a role that installs EFF’s certbot, and updated the README to reflect the current status.

You can see what’s new here: https://github.com/seaburr/WordPressOnDigitalOcean

Have a good week. In the coming weeks, there will not be much activity as I’ll be changing gears to Kubernetes.

 

Re-Architecting This Website VII

“Baby steps.”

I’ve added a single change today. There’s a new role that will install and configure the DigitalOcean monitoring agent.

In the coming days, I’ve got a few more items to wrap up. Once those (minor) missing pieces are in place, I will call this project done and move on to something else, like re-re-architecting this website using Kubernetes.

You can see what’s new here: https://github.com/seaburr/WordPressOnDigitalOcean

Have a good weekend.

 

Re-Architecting This Website VI

I can’t believe it’s been over two weeks since I touched this project. So far, things have been good. I took a look at AWS with the intent of redoing this project on Lightsail. I assumed it would be cheaper. Unfortunately, it won’t be. That doesn’t mean it’s not worth the effort, but it certainly changes the timetable for migrating because it will actually cost MORE per month than hosting on DigitalOcean. Instead of focusing on migrating, I’ve decided to focus on general work that needs to be done and that carries over to any other cloud vendor.

Here’s what’s new:

  • Extended centos-base role to install and configure fail2ban for SSH and Apache. It also now includes two packages (htop and screen) that were previously missing.
  • New role called create-swap that will create and configure a swapfile.
  • Reconfigured install-apache to include mod_security and a more robust 80 -> 443 redirect.

You can see what’s new here: https://github.com/seaburr/WordPressOnDigitalOcean

Have a good weekend.

 

Re-Architecting This Website V

“Bootstrapped.”

Last night, I migrated this website onto its new infrastructure! The tooling I’ve been working on for the last week has become feature-complete enough that it was used to create the database, droplet, and storage volume this site now uses.

There was some manual work (SSL configuration, import from the previous website, updating the DB connection string). However, it was relatively minor, and these things are likely to be the next features added.

Here’s where we’re at:

I’m really excited about the progress made in a week.