Borked Website: A Short Story

Today, I broke this website while testing some minor changes to the deployment scripts. I tried to figure out what went wrong (I messed up something with Apache while trying to renew the SSL cert). I couldn’t get it sorted out so I blew up the droplet (VM, EC2 instance, whatever) and re-executed the existing playbooks. What enabled me to do this? How was I able to do this?

Things that allowed me to recover:

  1. These scripts include daily backups so even if the entire WordPress deployment needs to be re-created from scratch, the data is tarred up and ready to go.
  2. The website data is not stored on the VM but an attached persistent volume.
  3. The database is a standalone, managed MySQL instance.
  4. Playbooks and roles are designed to be idempotent so re-running them is safe. They aim for desired state meaning no change if it’s not needed.

How I recovered:

So I simply destroyed the droplet, recreated it, and re-provisioned it. I had to perform a few tasks manually in DigitalOcean (whitelisting the droplet IP to the MySQL instance and pointing the floating IP to the new droplet) but even these tasks can be automated in the future (and will be).

All in all, I spent about an hour trying to figure out what I broke and another fifteen minutes to blow away and recreate the host. This is the way… Or at least, this is the way towards the way.

 

Leave a Reply

Your email address will not be published. Required fields are marked *