Dead Air

It’s been several years since I last posted anything, though it’s not because I haven’t been busy. My goal is to start writing posts regularly, but before I do, I feel I need to recap the last three and a half years.

I would recommend that everyone occasionally take time to document their accomplishments (professionally and otherwise).

The single most important change in my life is that I am now a father and my son is an absolutely incredible little kid. I hope I can continue to help him grow into a kind, caring, smart, driven person who can accomplish more than I could ever dream of in whatever realm brings him fulfillment.

Now, as for work…

In July of 2024, I left Broadcom (post-acquisition of VMware) to join a startup in the Atlanta area. Between 2021 and 2024 at VMware and the last several months at Airia, I’ve accomplished quite a lot.

  • Built a Prometheus exporter to run performance tests and detect issues within the VMware build environment. During one troubleshooting event, it confirmed nearly instantly that an NFS performance issue was solved, whereas the previous test involved running a large build that took several hours.
  • Built a tool to migrate repositories between Artifactory instances including recreating virtual repositories and their relationships with the underlying local and remote repos. It was used successfully to migrate several terabytes of data to a new instance of Artifactory hosted in AWS.
  • Built a SCIM provisioner for GitHub as a stopgap to onboard hundreds of users into GitHub while we selected and tested an IdP solution.
  • Built countless tools/processes using GitHub’s REST and GraphQL APIs to assist with configuring GitHub as teams migrated from Perforce and Bitbucket.
  • Selected, tested, deployed, documented, and tuned Loki as the logging system of choice for troubleshooting issues and getting statistics out of log data.
  • Built out several of our incident-management systems, including a customer-facing status page, PagerDuty for alerting, and an RCCA process using the 5 Whys.
  • Combed through every component of our system, ensured we had telemetry and log data, and either added a community dashboard or built a custom one. I want to ensure we have observability into everything.

This list will continue to grow rapidly, and there are several things I’d like to add; however, they’re things I’ve only just begun to look into. In short, I am busy as hell.

 

Creating a Dynamic Inventory Script for Ansible

It seems no one has written a blog post on creating dynamic inventory scripts for Ansible in a while. I feel this topic could use an update as some of the information I found was incomplete or out of date.

My goal was to convert Terraform’s tfstate data from DigitalOcean into a usable inventory. Keep that in mind, as it drove many specifics of how the script works. I also want to note that the script I reference is a first pass at getting a working inventory script.

So first, the script (in its current state):

#!/usr/bin/python3

import subprocess
import argparse
import json

relevant_tf_state_values = {
    'digitalocean_droplet': ['name', 'ipv4_address', 'ipv4_address_private', 'tags'],
    'digitalocean_database_cluster': ['name', 'host', 'private_host', 'port'],
    'digitalocean_database_user': ['name', 'password'],
    'digitalocean_database': ['name'],
    'digitalocean_domain': ['id'],
    'digitalocean_volume': ['name', 'size', 'initial_filesystem_type'],
    'digitalocean_ssh_key': ['name', 'fingerprint']
}

extra_vars = {
    'ansible_ssh_user': 'root',
    'web_mount_point': '/mnt/nfs/data',
    'web_mount_point_type': 'nfs',
    'ansible_ssh_common_args': '-o StrictHostKeyChecking=no -o userknownhostsfile=/dev/null'
}

class DigitalOceanInventory(object):

    def __init__(self):
        self.tags = []
        self.droplets = []
        self.vars = {}
        self.inventory_json = json.loads(self._get_terraform_output())
        self._generate_groups()
        self._generate_vars()
        self.ansible_inventory = self._generate_ansible_inventory()
    
    def _get_terraform_output(self):
        # Shell out to Terraform and return the JSON view of the current state.
        process = subprocess.Popen(['terraform', 'show', '-json'],
                                   stdout=subprocess.PIPE,
                                   stderr=subprocess.PIPE,
                                   universal_newlines=True)
        stdout, stderr = process.communicate()
        return stdout

    def _parse_resource(self, resource, resource_type, relevant_objects):
        # Flatten the interesting values into keys like 'digitalocean_droplet_name'.
        data = {}
        for key, value in resource['values'].items():
            if key in relevant_objects:
                data[f'{resource_type}_{key}'] = value
        return data

    def _generate_groups(self):
        # Tags become Ansible groups; droplets become the hosts within them.
        tags = 'digitalocean_tag'
        droplets = 'digitalocean_droplet'
        for resource in self.inventory_json['values']['root_module']['resources']:
            if resource['type'] == tags:
                self.tags.append(resource['values']['name'])
            elif resource['type'] == droplets:
                self.droplets.append(self._parse_resource(resource, droplets, relevant_tf_state_values[droplets]))

    def _generate_vars(self):
        # Everything that isn't a tag (group) or droplet (host) becomes a group var.
        skipped_types = ['digitalocean_tag', 'digitalocean_droplet']
        for resource in self.inventory_json['values']['root_module']['resources']:
            if resource['type'] in relevant_tf_state_values and resource['type'] not in skipped_types:
                for key, value in resource['values'].items():
                    if key in relevant_tf_state_values[resource['type']] and key not in ['ip', 'tags']:
                        self.vars[f"{resource['type']}_{key}"] = value
        # Static extras only need to be merged in once, not per resource.
        self.vars.update(extra_vars)

    def _generate_ansible_inventory(self):
        inventory = {}
        for tag in self.tags:
            hosts = []
            public_ips = []
            private_ips = []
            inventory[tag] = {}
            for droplet in self.droplets:
                if tag in droplet['digitalocean_droplet_tags']:
                    hosts.append(droplet['digitalocean_droplet_ipv4_address'])
                    public_ips.append(droplet['digitalocean_droplet_ipv4_address'])
                    private_ips.append(droplet['digitalocean_droplet_ipv4_address_private'])
            inventory[tag]['hosts'] = hosts
            # All groups share the same vars dict, so the tag-prefixed IP lists
            # added below are visible from every group.
            inventory[tag]['vars'] = self.vars
            ansible_tag = tag.replace('-', '_')
            inventory[tag]['vars'][f'{ansible_tag}_public_ips'] = public_ips
            inventory[tag]['vars'][f'{ansible_tag}_private_ips'] = private_ips
            if 'digitalocean_volume_name' in inventory[tag]['vars']:
                nfs_mount_point = '/mnt/' + inventory[tag]['vars']['digitalocean_volume_name'].replace('-', '_')
                inventory[tag]['vars']['nfs_mount_point'] = nfs_mount_point
        # Ansible requires _meta even when there are no per-host vars.
        inventory['_meta'] = {'hostvars': {}}
        return inventory

    def get_inventory(self):
        return json.dumps(self.ansible_inventory, indent=2)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--save', '-s', help='Generates Ansible inventory and stores to disk as inventory.json.',
                        action='store_true')
    parser.add_argument('--list', action='store_true')
    args = parser.parse_args()
    do = DigitalOceanInventory()
    if args.list:
        print(do.get_inventory())
    elif args.save:
        with open('inventory.json', 'w') as inventory:
            inventory.write(do.get_inventory())


if __name__ == '__main__':
    main()

At a high level, we get the tfstate from Terraform by running terraform show -json. Next, we generate host groups by piggybacking on the tags added to host resources during creation. Then we parse through the other resources for the subset of information we’re interested in and assemble a Python object with all the data in the desired format. Finally, we dump it as a JSON object and either print it to stdout or write it to inventory.json.
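
For reference, here’s roughly the shape of the terraform show -json output the script walks. This is a trimmed, hypothetical example (the values are made up, and the real output carries far more per-resource data), but it shows the values -> root_module -> resources path the script relies on:

{
  "values": {
    "root_module": {
      "resources": [
        {
          "type": "digitalocean_tag",
          "values": { "name": "tag-name-node" }
        },
        {
          "type": "digitalocean_droplet",
          "values": {
            "name": "node-1",
            "ipv4_address": "10.0.0.1",
            "ipv4_address_private": "10.0.0.1",
            "tags": ["tag-name-node"]
          }
        }
      ]
    }
  }
}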

The inventory output looks something like this:

{
  "tag-name-node": {
    "hosts": [
      "10.0.0.1"
    ],
    "vars": {
      "digitalocean_ssh_key_fingerprint": "00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD:EE:FF",
      "digitalocean_ssh_key_name": "sshkeyname",
      "ansible_ssh_user": "root",
      "web_mount_point": "/mnt/nfs/data",
      "web_mount_point_type": "nfs",
      "ansible_ssh_common_args": "-o StrictHostKeyChecking=no -o userknownhostsfile=/dev/null",
      "digitalocean_database_cluster_host": "something.ondigitalocean.com",
      "digitalocean_database_cluster_name": "db-name",
      "digitalocean_database_cluster_port": 25060,
      "digitalocean_database_cluster_private_host": "private.something.ondigitalocean.com",
      "digitalocean_database_user_name": "wordpress",
      "digitalocean_database_user_password": "password",
      "digitalocean_domain_id": "something.com",
      "digitalocean_volume_initial_filesystem_type": "ext4",
      "digitalocean_volume_name": "volume-name",
      "digitalocean_volume_size": 5,
      "nfs_node_public_ips": [
        "10.0.0.1"
      ],
      "nfs_node_private_ips": [
        "10.0.0.1"
      ],
      "nfs_mount_point": "/mnt/barista_cloud_volume"
    }
  },
  "_meta": {
    "hostvars": {}
  }
}

Now, if you try to feed this to Ansible as a static inventory file, it will not be parsed correctly. The dynamic inventory JSON format is not the same as the JSON inventory file format. This took me a while to figure out and is honestly kind of frustrating, as it makes creating a working JSON template you can iterate and test against quickly much more difficult than it needs to be. On the topic of gotchas, here are a few more to be aware of.

  1. Your inventory script does not have to be written in Python, but it must include a shebang at the top so it can be executed (it must also be executable, so chmod +x your script).
  2. The inventory script must accept the --list flag. It’s also supposed to accept --host and return details on a single host, but I have not needed or implemented it.
  3. Even if you are not adding vars for specific hosts, you MUST include the _meta section in your inventory.
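
Putting those gotchas together, the minimal skeleton of a working dynamic inventory script looks something like this (a bare-bones sketch with hardcoded data, just to show the required plumbing):

#!/usr/bin/python3

import argparse
import json


def get_inventory():
    # Build groups however you like; hardcoded here for illustration.
    return {
        'web': {
            'hosts': ['10.0.0.1'],
            'vars': {'ansible_ssh_user': 'root'}
        },
        # Required even when empty (gotcha #3).
        '_meta': {'hostvars': {}}
    }


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--list', action='store_true')
    parser.add_argument('--host')  # Ansible may call this (gotcha #2).
    args = parser.parse_args()
    if args.list:
        print(json.dumps(get_inventory(), indent=2))
    elif args.host:
        # Per-host vars live under _meta, so an empty dict is fine here.
        print(json.dumps({}))


if __name__ == '__main__':
    main()

Remember to chmod +x it (gotcha #1), then point Ansible at it with -i.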

That’s about it. I will probably come back around to clean this script up and make it more reusable. Heck, I might put together a boilerplate script that can make creating custom dynamic inventory scripts quicker. As mentioned before, this is a first-pass attempt at getting something that works for my use case.

Finally, I feel I would be remiss if I did not include the tidbits of info I found scattered around the web that helped me figure this out.

https://www.jeffgeerling.com/blog/creating-custom-dynamic-inventories-ansible (Jeff, as always, is an invaluable resource on all things Ansible.)

https://docs.ansible.com/ansible/2.9/dev_guide/developing_inventory.html

https://adamj.eu/tech/2016/12/04/writing-a-custom-ansible-dynamic-inventory-script/

Thanks, folks. Have a good weekend!

 

Borked Website: A Short Story

Today, I broke this website while testing some minor changes to the deployment scripts. I tried to figure out what went wrong (I messed up something with Apache while trying to renew the SSL cert), but I couldn’t get it sorted out, so I blew away the droplet (VM, EC2 instance, whatever) and re-executed the existing playbooks. How was I able to do this?

Things that allowed me to recover:

  1. These scripts include daily backups so even if the entire WordPress deployment needs to be re-created from scratch, the data is tarred up and ready to go.
  2. The website data is not stored on the VM but an attached persistent volume.
  3. The database is a standalone, managed MySQL instance.
  4. Playbooks and roles are designed to be idempotent, so re-running them is safe. They aim for desired state, meaning no change is made if none is needed.

How I recovered:

So I simply destroyed the droplet, recreated it, and re-provisioned it. I had to perform a few tasks manually in DigitalOcean (whitelisting the new droplet’s IP on the MySQL instance and pointing the floating IP at the new droplet), but even these tasks can be automated in the future (and will be).

All in all, I spent about an hour trying to figure out what I broke and another fifteen minutes to blow away and recreate the host. This is the way… Or at least, this is the way towards the way.

 

Re-Architecting This Website VIII

“Done. For now…”

All of my goals for the first re-architecture of this website are now complete. This evening I fixed the backup script, added a role that installs EFF’s certbot, and updated the README to reflect the current status.

You can see what’s new here: https://github.com/seaburr/WordPressOnDigitalOcean

Have a good week. In the coming weeks, there will not be much activity as I’ll be changing gears to Kubernetes.

 

Re-Architecting This Website VII

“Baby steps.”

I’ve added a single change today: a new role that installs and configures the DigitalOcean monitoring agent.

In the coming days, I’ve got a few more items to wrap up. Once those (minor) missing pieces are in place, I will call this project done and move on to something else, like re-re-architecting this website using Kubernetes.

You can see what’s new here: https://github.com/seaburr/WordPressOnDigitalOcean

Have a good weekend.

 

Re-Architecting This Website VI

I can’t believe it’s been over two weeks since I touched this project. So far, things have been good. I took a look at AWS with the intent of redoing this project on Lightsail. I assumed it would be cheaper. Unfortunately, it won’t be. That doesn’t mean it’s not worth the effort, but it certainly changes the timetable to migrate because it will actually cost MORE per month than hosting on DigitalOcean. Instead of focusing on migrating, I’ve decided to focus on general work that needs to be done, which translates to any cloud vendor.

Here’s what’s new:

  • Extended the centos-base role to install and configure fail2ban for SSH and Apache. It also now includes two packages (htop and screen) that were previously missing.
  • Added a new role called create-swap that creates and configures a swapfile.
  • Reconfigured install-apache to include mod_security and a more robust 80 -> 443 redirect.

You can see what’s new here: https://github.com/seaburr/WordPressOnDigitalOcean

Have a good weekend.

 

Re-Architecting This Website V

“Bootstrapped.”

Last night, I migrated this website onto its new infrastructure! The tooling I’ve been working on for the last week has become feature-complete enough that it was used to create the database, droplet, and storage volume this site now uses.

There was some manual work (SSL configuration, import from the previous website, updating the DB connection string); however, it was relatively minor, and these things are likely to be the next features added.

Here’s where we’re at:

I’m really excited about the progress made in a week.

 

Re-Architecting This Website IV

“Biting off more than I can chew.”

As of Sunday, I had gotten droplet creation, volume storage, and a good bit of the base OS setup work handled. The major missing piece was creating the database. I am happy to say that work is now complete, but it’s not without bumps, warts, and bruises.

Creating the database server proved daunting because Ansible modules do not exist for DigitalOcean’s managed databases. I thought I’d just write a module or two; however, that proved to be a much bigger task than I first realized.

I’ve never written a module for Ansible, I’m no DBA, and the number of APIs exposed for database actions left me feeling deflated. Still, I soldiered through and created a Python script I can use to create, configure, or destroy a managed MySQL server. There are a ton of assumptions made to keep the scope of work reduced, but it’s sufficient for my needs at this point. It’s wrapped with argparse so I can run it from an Ansible command task and then parse the JSON output.
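
I won’t walk through the whole script here (it’s linked just below), but the general shape is something like this sketch: argparse for the CLI, DigitalOcean’s REST API for the heavy lifting, and JSON on stdout so an Ansible command task can register and parse the result. Everything in the sketch is illustrative (the payload fields, the DO_API_TOKEN environment variable, and the flag names are my assumptions, not the actual script):

#!/usr/bin/python3

# Illustrative sketch only. Assumes a DO_API_TOKEN environment variable and
# the third-party 'requests' library; consult DigitalOcean's API docs for the
# authoritative payload fields.

import argparse
import json
import os

import requests

API_URL = 'https://api.digitalocean.com/v2/databases'


def create_cluster(name):
    headers = {'Authorization': f"Bearer {os.environ['DO_API_TOKEN']}"}
    payload = {'name': name, 'engine': 'mysql', 'version': '8',
               'region': 'nyc1', 'size': 'db-s-1vcpu-1gb', 'num_nodes': 1}
    response = requests.post(API_URL, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--create', metavar='NAME',
                        help='Create a managed MySQL cluster and print its JSON.')
    args = parser.parse_args()
    if args.create:
        # JSON on stdout is what lets an Ansible command task parse the result.
        print(json.dumps(create_cluster(args.create), indent=2))


if __name__ == '__main__':
    main()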

Check it out here: https://github.com/seaburr/WordPressOnDigitalOcean/blob/master/roles/database-server/files/digital_ocean_database.py

As I said, I believe Ansible modules should exist for DigitalOcean’s managed databases, as they do for AWS and GCP; however, I think that’s a larger project than one individual can tackle on their first attempt at writing modules. I’d be happy to collaborate on that if it ever comes up.

In related news, I’ve extended the install-wordpress role to create wp-config.php from a template, including fetching unique salts from wordpress.org. I’ve also fixed a few small issues with the Apache installation and added missing packages.
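
As an aside, fetching those salts is just an HTTP GET against wordpress.org’s secret-key generator, which returns ready-made define() lines for wp-config.php. Here’s the idea in a few lines of Python (a sketch; in the role itself this would be an Ansible task rather than a standalone script):

#!/usr/bin/python3

# Minimal sketch: each request to this public endpoint returns a fresh,
# unique set of WordPress salts as define() lines for wp-config.php.
from urllib.request import urlopen

SALT_URL = 'https://api.wordpress.org/secret-key/1.1/salt/'

with urlopen(SALT_URL) as response:
    print(response.read().decode('utf-8'))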

At this point, the only thing missing to go from nothing to a functioning WordPress site is adding database connection information into wp-config.php and handling SSL.

 

Re-Architecting This Website II

We’re onto part two! Luckily, this is a simple change.

The VM that runs this website is on DigitalOcean. It’s so old (3 years) that it was only provisioned with 1CPU/512MB of memory. This has been sufficient, but it’s a tight squeeze with the reverse proxy and MySQL shoehorned onto the same VM. Tonight when I ran OS updates, I noticed this website went down. Looking further, I found the Linux OOM killer had stepped in and killed MySQL to keep the machine up and running. After starting MySQL, I scaled the droplet to a larger size.

DigitalOcean allows you to scale your droplet, but they cannot perform a hot add, so the droplet needed to be powered off. Luckily, the $5/mo droplet size has increased to 1CPU/1GB, so I was able to double this VM’s memory at no cost.

Let’s call this our “get well plan” so the next time updates are run, this site doesn’t go down.

 

Re-Architecting This Website I

Let’s embark on a journey to re-architect this website to something that is more resilient. Currently, the entire website runs on a single DigitalOcean droplet.

Backups are handled through a Bash script that zips up the application directory, dumps the DB, and tars up the output.

This could be better. Let’s make this better. As the website gets re-architected, I’ll provide diagrams and links to the code & documentation leveraged.