My Ansible Holy Grail: Bootstrapping a VPS
tl;dr The completed playbook is available at the bottom of this post.
Alright, "Holy Grail" is an exaggeration. If nothing else, it's a thing I've wanted for ages, but couldn't do until recently.
I'm on a years-long quest to document my server setup[1]: assorted scripts and config files first found their way into a Git repository around 2010. More recently, I've focused on Ansible playbooks that can deploy to new servers on demand.
One weak spot I identified while doing similar Ansible work for MyTransHealth was the initial "bootstrapping" of a VPS. AWS's Ubuntu 16.04 AMI, for example, doesn't even come with Python, and Python is a pretty core requirement for Ansible. I finally connected the dots last week, resulting in a playbook that will configure a fresh Ubuntu 16.04 AMI on EC2 with no prior steps required.
Spoiler Alert
A combination of Ansible features makes bootstrapping possible:
- The raw module, which lets you run commands more directly via SSH while bypassing the normal Python module subsystem.
- gather_facts: no, which disables the initial Python-powered fact gathering.
- Multiple plays per playbook, which let you separate low-level bootstrap operations from more traditional Ansible-powered setup operations.
- pre_tasks, post_tasks, and include_role, which let you mix and match operations with more flexibility.
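To make the last bullet concrete, here is a rough sketch of how those pieces can sit together in a single play. The role names (common, firewall) and the final debug message are made up for illustration; only the ordering is the point. Ansible runs pre_tasks first, then roles, then tasks, then post_tasks, and include_role pulls a role in at a specific point in the task list.

```yaml
---
# Sketch only: "common" and "firewall" are hypothetical role names.
- hosts: all
  remote_user: ubuntu
  become: yes
  gather_facts: no

  pre_tasks:
    # Runs before anything listed under roles.
    - name: install python
      raw: test -e /usr/bin/python || (apt update -y && apt install -y python-minimal)

  roles:
    - common

  tasks:
    # include_role runs a role inline, at this position among the tasks.
    - name: apply the firewall role
      include_role:
        name: firewall

  post_tasks:
    # Runs after roles and tasks have finished.
    - name: confirm the bootstrap finished
      debug:
        msg: "bootstrap complete"
```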
Creating our playbook
First, we start with a one-host inventory file:
$ cat hosts
dingo.example.com
Next, let's create a simple playbook, bootstrap.yml:
$ cat bootstrap.yml
---
- hosts: all
  remote_user: ubuntu
  tasks:
    - name: install vim
      apt: name=vim
Let's see how this goes with no additional configuration:
$ ansible-playbook -i hosts bootstrap.yml
PLAY [all] *********************************************************************
TASK [setup] *******************************************************************
fatal: [dingo.example.com]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to dingo.example.com closed.\r\n", "module_stdout": "/bin/sh: 1: /usr/bin/python: not found\r\n", "msg": "MODULE FAILURE"}
to retry, use: --limit @/home/annika/ansible/bootstrap.retry
PLAY RECAP *********************************************************************
dingo.example.com : ok=0 changed=0 unreachable=0 failed=1
(If you see an SSH error about "too long for Unix domain socket", try updating your control path.)
As expected, our remote host lacked Python, so Ansible wasn't able to run its command. But look closely! We didn't even get past the initial setup phase: Ansible didn't fail at 'install vim'; it failed while gathering facts about the remote host (which requires Python).
Disabling fact gathering
Let's amend bootstrap.yml and disable this initial fact gathering:
---
- hosts: all
  remote_user: ubuntu
  gather_facts: no
  tasks:
    - name: install vim
      apt: name=vim
Run the playbook:
$ ansible-playbook -i hosts bootstrap.yml
PLAY [all] *********************************************************************
TASK [install vim] *************************************************************
fatal: [dingo.example.com]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to dingo.example.com closed.\r\n", "module_stdout": "/bin/sh: 1: /usr/bin/python: not found\r\n", "msg": "MODULE FAILURE"}
to retry, use: --limit @/home/annika/ansible/bootstrap.retry
PLAY RECAP *********************************************************************
dingo.example.com : ok=0 changed=0 unreachable=0 failed=1
We got a little further: Ansible is at least trying to run our command, but lack of Python means it doesn't get far. Let's give it what it wants.
Raw commands
Ansible's raw module lets us bypass the normal Python requirement. We'll add a pre-task that installs Python for future use. Note: this doesn't have to be a pre-task, but pre-tasks are handy if you want to use roles in your bootstrap, because they run before any roles in the play.
---
- hosts: all
  remote_user: ubuntu
  become: yes
  gather_facts: no
  pre_tasks:
    - name: install python
      raw: test -e /usr/bin/python || (apt update -y && apt install -y python-minimal)
  tasks:
    - name: install vim
      apt: name=vim
And the output:
$ ansible-playbook -i hosts bootstrap.yml
PLAY [all] *********************************************************************
TASK [install python] **********************************************************
changed: [dingo.example.com]
TASK [install vim] *************************************************************
ok: [dingo.example.com]
PLAY RECAP *********************************************************************
dingo.example.com : ok=2 changed=1 unreachable=0 failed=0
Our apt task runs as expected, now that the Python requirement is satisfied.
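One wrinkle worth knowing: raw can't tell whether the command changed anything, so the task reports "changed" on every run, even when Python is already there. A possible refinement, sketched below, is to have the command print a marker only when it actually installs something, then key changed_when off that marker. The marker string PYTHON_INSTALLED and the variable name python_install are my own inventions here, not anything Ansible defines.

```yaml
---
- hosts: all
  remote_user: ubuntu
  become: yes
  gather_facts: no
  pre_tasks:
    - name: install python
      # Echo a marker only on the branch that performs the install.
      raw: test -e /usr/bin/python || (apt update -y && apt install -y python-minimal && echo PYTHON_INSTALLED)
      register: python_install
      # Report "changed" only when the marker shows up in the output.
      changed_when: "'PYTHON_INSTALLED' in python_install.stdout"
```

With this in place, a second run against the same host should show the task as "ok" rather than "changed".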
Running additional plays
Your bootstrapping play might be very different from the rest of your initial provisioning: you might set up an alternate user and delete the default user, you might want access to host facts[2], etc. If this is the case, you can add more plays to the same playbook:
---
- hosts: all
  remote_user: ubuntu
  become: yes
  gather_facts: no
  pre_tasks:
    - name: install python
      raw: test -e /usr/bin/python || (apt update -y && apt install -y python-minimal)
  roles:
    - annika-user
# our second play connects as a different SSH user, and runs `setup` to
# gather facts.
- hosts: all
  remote_user: annika
  become: yes
  gather_facts: yes
  tasks:
    # we can't delete a user while we're ssh'd in as that user!
    - name: remove default user
      user: name=ubuntu state=absent
    - debug:
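If a second play isn't otherwise needed, facts can also be gathered partway through a play by calling the setup module as an ordinary task once Python is installed. A minimal sketch, assuming the same ubuntu user and Python bootstrap as above; the closing debug task exists only to show a fact being read:

```yaml
---
- hosts: all
  remote_user: ubuntu
  become: yes
  gather_facts: no
  pre_tasks:
    - name: install python
      raw: test -e /usr/bin/python || (apt update -y && apt install -y python-minimal)
    # Python is present now, so the fact-gathering module can run.
    - name: gather facts
      setup:
  tasks:
    - name: show a gathered fact (illustrative only)
      debug:
        var: ansible_distribution
```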
Ideally, my bootstrapping playbook preps any new VPS for future use: my default user is available, the timezone is set to UTC, vim is the default text editor, and so on. With this playbook, I use the same system to manage the baseline configuration and the role-specific setup, with no manual steps to document and execute.
Credits
1. Aside from the benefits of having a reproducible server setup, there's a practical motive: my provider has several times increased the stats on existing VPSes. I've been hesitant to take advantage of upgrades from the past two years, because they require a more involved migration (Xen to KVM) than previous upgrades.
2. It's worth mentioning that you can kick off a setup task at any time to (re)gather facts.