My Ansible Holy Grail: Bootstrapping a VPS

tl;dr The completed playbook is available at the bottom of this post.

Alright, "Holy Grail" is an exaggeration. If nothing else, it's a thing I've wanted for ages, but couldn't do until recently.

I'm on a years-long quest to document my server setup[1]: assorted scripts and config files first found their way into a Git repository around 2010. More recently, I've focused on Ansible playbooks that can deploy to new servers on demand.

One weak spot I identified while doing similar Ansible work for MyTransHealth was the initial "bootstrapping" of a VPS. AWS's Ubuntu 16.04 AMI, for example, doesn't even come with Python, and Python is a pretty core requirement for Ansible. I finally connected the dots last week, resulting in a playbook that will configure a fresh Ubuntu 16.04 AMI on EC2 with no prior steps required.

Spoiler Alert

A combination of Ansible features makes bootstrapping possible:

  • The raw module, which lets you run commands more directly via SSH while bypassing the normal Python module subsystem.
  • gather_facts: no, which disables the initial Python-powered fact gathering.
  • Multiple plays per playbook, which let you separate low-level bootstrap operations from more traditional Ansible-powered setup operations.
  • pre_tasks, post_tasks, and include_role, which let you mix and match operations with more flexibility.
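
As a rough sketch of how these four features can sit together in a single play (the role name "common" is hypothetical, and the task layout is an illustration rather than the finished playbook):

```yaml
---
# A sketch only: "common" is a hypothetical role name.
- hosts: all
  remote_user: ubuntu
  become: yes
  gather_facts: no        # skip the Python-powered setup step
  pre_tasks:
    # raw runs over plain SSH, so it works before Python is installed
    - name: install python
      raw: test -e /usr/bin/python || (apt update -y && apt install -y python-minimal)
  tasks:
    # include_role runs a role inline, in task order
    - include_role:
        name: common
  post_tasks:
    # facts can be gathered explicitly once Python exists
    - name: gather facts
      setup:
```

Within a play, Ansible runs pre_tasks first, then roles, then tasks, then post_tasks, which is what makes this ordering work.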

Creating our playbook

First, we start with a one-host inventory file:

$ cat hosts
dingo.example.com

Next, let's create a simple playbook, bootstrap.yml:

$ cat bootstrap.yml
---
- hosts: all
  remote_user: ubuntu
  tasks:
  - name: install vim
    apt: name=vim

Let's see how this goes with no additional configuration:

$ ansible-playbook -i hosts bootstrap.yml
PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
fatal: [dingo.example.com]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to dingo.example.com closed.\r\n", "module_stdout": "/bin/sh: 1: /usr/bin/python: not found\r\n", "msg": "MODULE FAILURE"}
        to retry, use: --limit @/home/annika/ansible/bootstrap.retry

PLAY RECAP *********************************************************************
dingo.example.com : ok=0    changed=0    unreachable=0    failed=1

(If you see an SSH error about "too long for Unix domain socket", try updating your control path.)

As expected, our remote host lacked Python, so Ansible wasn't able to run its command. But look closely! We didn't even get past the initial setup phase: Ansible didn't fail at 'install vim', it failed gathering facts about the remote host (which requires Python).

Disabling fact gathering

Let's amend bootstrap.yml and disable this initial fact gathering:

---
- hosts: all
  remote_user: ubuntu
  gather_facts: no
  tasks:
  - name: install vim
    apt: name=vim

Run the playbook:

$ ansible-playbook -i hosts bootstrap.yml
PLAY [all] *********************************************************************

TASK [install vim] *************************************************************
fatal: [dingo.example.com]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to dingo.example.com closed.\r\n", "module_stdout": "/bin/sh: 1: /usr/bin/python: not found\r\n", "msg": "MODULE FAILURE"}
    to retry, use: --limit @/home/annika/ansible/bootstrap.retry

PLAY RECAP *********************************************************************
dingo.example.com          : ok=0    changed=0    unreachable=0    failed=1

We got a little further: Ansible is at least trying to run our command, but lack of Python means it doesn't get far. Let's give it what it wants.

Raw commands

Ansible's raw module lets us bypass the normal Python requirement. We'll add a pre-task that installs Python for future use. Note: this doesn't have to be a pre-task, but pre-tasks are handy if you want to use roles in your bootstrap, since pre_tasks run before any roles in the same play.

---
- hosts: all
  remote_user: ubuntu
  become: yes
  gather_facts: no
  pre_tasks:
    - name: install python
      raw: test -e /usr/bin/python || (apt update -y && apt install -y python-minimal)
  tasks:
    - name: install vim
      apt: name=vim

And the output:

$ ansible-playbook -i hosts bootstrap.yml
PLAY [all] *********************************************************************

TASK [install python] **********************************************************
changed: [dingo.example.com]

TASK [install vim] *************************************************************
ok: [dingo.example.com]

PLAY RECAP *********************************************************************
dingo.example.com          : ok=2    changed=1    unreachable=0    failed=0

Our apt task runs as expected, now that the Python requirement is satisfied.
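
One wrinkle: raw tasks report "changed" on every run, even when Python is already present. A possible refinement, assuming you're happy to echo a marker string (BOOTSTRAP_PYTHON_INSTALLED is an invented name) and inspect the registered output:

```yaml
pre_tasks:
  - name: install python
    # The marker is only echoed when apt actually ran and succeeded.
    raw: test -e /usr/bin/python || (apt update -y && apt install -y python-minimal && echo BOOTSTRAP_PYTHON_INSTALLED)
    register: python_install
    # Report "changed" only when the marker shows up in the output.
    changed_when: "'BOOTSTRAP_PYTHON_INSTALLED' in python_install.stdout"
```

With this in place, a second run against the same host should report the task as "ok" rather than "changed".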

Running additional plays

Your bootstrapping play might be very different from the rest of your initial provisioning: you might set up an alternate user and delete the default user, you might want access to host facts[2], etc. If this is the case, you can add additional plays into the same playbook:

---
- hosts: all
  remote_user: ubuntu
  become: yes
  gather_facts: no
  pre_tasks:
    - name: install python
      raw: test -e /usr/bin/python || (apt update -y && apt install -y python-minimal)
  roles:
    - annika-user

# our second play connects as a different SSH user, and runs `setup` to
# gather facts.
- hosts: all
  remote_user: annika
  become: yes
  gather_facts: yes
  tasks:
    # we can't delete a user while we're ssh'd in as that user!
    - name: remove default user
      user: name=ubuntu state=absent
    - debug:
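
The annika-user role itself isn't shown here. A minimal sketch of what such a role's tasks file might contain, assuming a public key at ~/.ssh/id_rsa.pub on the control machine and passwordless sudo for the new user (both are assumptions, not the actual role):

```yaml
# roles/annika-user/tasks/main.yml (hypothetical contents)
- name: create the annika user
  user:
    name: annika
    groups: sudo
    append: yes
    shell: /bin/bash

- name: authorize an SSH key for annika
  authorized_key:
    user: annika
    # assumed key location on the control machine
    key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"

- name: allow passwordless sudo so the second play can use become
  copy:
    dest: /etc/sudoers.d/annika
    content: "annika ALL=(ALL) NOPASSWD:ALL\n"
    mode: "0440"
    validate: "visudo -cf %s"
```

None of these modules need gathered facts, so they can run in the bootstrap play with gather_facts: no.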

Ideally, my bootstrapping playbook preps any new VPS for future use: my default user is available, the timezone is set to UTC, vim is the default text editor, and so on. With this playbook, I use the same system to manage the baseline configuration and the role-specific setup, with no manual steps to document and execute.
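
For illustration, the baseline items mentioned above could look something like the following play. The module arguments are assumptions, in particular the /usr/bin/vim.basic path, which is where Ubuntu's vim package usually installs the binary:

```yaml
# Hypothetical baseline play; values are illustrative.
- hosts: all
  remote_user: annika
  become: yes
  tasks:
    - name: set the timezone to UTC
      timezone:
        name: UTC

    - name: install vim
      apt:
        name: vim

    - name: make vim the default editor
      alternatives:
        name: editor
        path: /usr/bin/vim.basic   # assumed path on Ubuntu
```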

Credits

  1. Aside from the general benefits of a reproducible server setup, there's a practical motive: my provider has several times increased the stats on existing VPSes. I've been hesitant to take advantage of the upgrades from the past two years, because they require a more involved migration (Xen to KVM) than previous upgrades.
  2. It's worth mentioning that you can kick off a setup task at any time to (re)gather facts.