Since the summer of 2012 I've dreamt of having an in-house Heroku service. Ideally it would take systems administration work away from developers and run a large number of ad-hoc staging environments inexpensively. Our live services run on bare metal for performance reasons, but our staging environments don't have such speed requirements.
At any time we can have several pull requests open with features under development. As a geographically dispersed team, we find it handy to be able to access these services from a remotely visible domain, interact with them and give one another feedback.
My ideal scenario would be an automated tool which would, on command, check out a pull request, push it into a VM, install it and configure nginx to point traffic to it via a unique hostname. Job queues and databases would be provided by external services, with their connection details exposed via environment variables inside each container.
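To make the "unique hostname" idea concrete, here is a minimal sketch of how such a tool might generate the nginx configuration for one pull request. The `pr-<number>` naming scheme, the `staging.example.com` domain and the function name are all assumptions made for illustration; only the nginx directives themselves (`listen`, `server_name`, `proxy_pass`, `proxy_set_header`) are standard.

```python
def render_nginx_server_block(pr_number, upstream_port,
                              base_domain="staging.example.com"):
    """Return an nginx server block proxying a per-PR hostname to a local port.

    The hostname scheme and base_domain are hypothetical placeholders.
    """
    hostname = "pr-{0}.{1}".format(pr_number, base_domain)
    lines = [
        "server {",
        "    listen 80;",
        "    server_name {0};".format(hostname),
        "",
        "    location / {",
        # Forward requests to the port the staging app is published on.
        "        proxy_pass http://127.0.0.1:{0};".format(upstream_port),
        # Preserve the original Host header for the application.
        "        proxy_set_header Host $host;",
        "    }",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_nginx_server_block(42, 8001))
```

The generated text could be written to a file under nginx's configuration directory, followed by an nginx reload; where exactly that file lives depends on the distribution.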
An SSH-receiver script is something we can put together with just about no code at all. Gitreceive is one such project that does just that. With our SSH public keys installed we could speak to the stage deployment machine via git, just as we would with Heroku.
The key characteristics of these staging environments are that they're elastic, short-lived and very simple to create and destroy. As of this writing I could provision a server with 24GB of DDR3 RAM for $53.28 per month from Hetzner's online auction, but I could also get an SSD-based, 512 MB RAM virtual machine on Digital Ocean for $0.0069 an hour.
The SSD should be quick enough for us to allocate a large swap partition. The staging site shouldn't come under heavy load, so memory performance isn't the biggest concern. Being able to launch a staging environment in minutes and only pay for the hours we need it live would be a huge cost saving when multiplied across all of our projects. Keeping fixed costs low is the key to a healthy company.
Hetzner works out to $2.22 per GB of RAM per month versus roughly $10 per GB per month from Digital Ocean, but Digital Ocean bills by the hour whereas Hetzner only bills monthly. If we switch off our staging environments when they aren't in use we should save money. Technically we could turn them off automatically after 4 hours, since the database service would be external to the container itself.
With most virtualisation solutions you need to allocate a static amount of memory to each guest. With LXC, only the memory you're actually using is allocated, which allows for much denser hosting. With this in mind I set out to build a prototype using Docker that would spawn a container, install a Python application that speaks HTTP over IPv4 and proxy that to the outside world.
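The arithmetic behind that comparison can be sketched in a few lines of Python. The prices are the ones quoted above; the 30-day month is my own simplifying assumption, and the function names are purely illustrative.

```python
HETZNER_MONTHLY_USD = 53.28   # 24 GB server, billed per month
HETZNER_RAM_GB = 24
DO_HOURLY_USD = 0.0069        # 512 MB droplet, billed per hour
DO_RAM_GB = 0.5
HOURS_PER_MONTH = 24 * 30     # assumption: a 30-day month


def hetzner_cost_per_gb_month():
    """Monthly price of one GB of RAM on the Hetzner server."""
    return HETZNER_MONTHLY_USD / HETZNER_RAM_GB


def do_cost_per_gb_month():
    """Monthly price of one GB of RAM on an always-on droplet."""
    return DO_HOURLY_USD * HOURS_PER_MONTH / DO_RAM_GB


def droplet_cost(hours):
    """Price of running a single droplet for the given number of hours."""
    return DO_HOURLY_USD * hours


def break_even_droplet_hours():
    """Droplet-hours per month that cost as much as the Hetzner server."""
    return HETZNER_MONTHLY_USD / DO_HOURLY_USD


if __name__ == "__main__":
    print("Hetzner $/GB/month: %.2f" % hetzner_cost_per_gb_month())
    print("Digital Ocean $/GB/month: %.2f" % do_cost_per_gb_month())
    print("Four-hour staging session: $%.4f" % droplet_cost(4))
    print("Break-even droplet-hours: %.0f" % break_even_droplet_hours())
```

Under these numbers a four-hour staging session costs under three cents, and hourly billing only becomes more expensive than the monthly server once usage passes several thousand droplet-hours a month.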
ID     | Description
-------------------------------
361740 | Arch Linux 2013.05 x32
350424 | Arch Linux 2013.05 x64
1602   | CentOS 5.8 x32
1601   | CentOS 5.8 x64
376568 | CentOS 6.4 x32
562354 | CentOS 6.4 x64
12575  | Debian 6.0 x32
12573  | Debian 6.0 x64
303619 | Debian 7.0 x32
308287 | Debian 7.0 x64
894856 | Docker-Ubuntu-13.04-x64 (10/4)
32387  | Fedora 17 x32
32399  | Fedora 17 x32 Desktop
32428  | Fedora 17 x64
32419  | Fedora 17 x64 Desktop
697056 | Fedora 19 x86
696598 | Fedora 19 x86-64
459444 | LAMP on Ubuntu 12.04
483575 | Redmine on Ubuntu 12.04
464235 | Ruby on Rails on Ubuntu 12.10 (Nginx + Unicorn)
14098  | Ubuntu 10.04 x32
14097  | Ubuntu 10.04 x64
284211 | Ubuntu 12.04 x32
284203 | Ubuntu 12.04 x64
433240 | Ubuntu 12.10 x32
473123 | Ubuntu 12.10 x64
473136 | Ubuntu 12.10 x64 Desktop
345791 | Ubuntu 13.04 x32
350076 | Ubuntu 13.04 x64
682275 | Wordpress on Ubuntu 12.10
I opted to spawn VMs using the Ubuntu 13.04 x64 (350076) golden image. It's more efficient to use 32-bit memory pointers when you have less than 4GB of memory but I wanted to replicate our live environment as much as I could.
Once the command was given to spawn the VM I waited for a minute before the status switched to active:
    $ watch -n5 --color 'tugboat info docker-test'
    Droplet fuzzy name provided. Finding droplet ID...done, ###### (docker-test)
    Name: docker-test
    ID: ######
    Status: new
    IP:
    Region ID: 1
    Image ID: 350076
    Size ID: 66
    Backups Active: false
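Rather than watching the output by eye, an automated tool could parse it. The following is a rough sketch that assumes the `Key: value` layout shown above stays stable and that the status eventually reads `active`, as described; the helper names are hypothetical and not part of tugboat.

```python
def parse_droplet_info(output):
    """Parse 'Key: value' lines from `tugboat info` output into a dict.

    Lines without a colon (such as the fuzzy-name banner) are skipped.
    """
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        info[key.strip()] = value.strip()
    return info


def is_active(output):
    """Return True once the droplet reports an 'active' status."""
    return parse_droplet_info(output).get("Status") == "active"


if __name__ == "__main__":
    sample = "\n".join([
        "Name: docker-test",
        "Status: new",
        "IP:",
        "Region ID: 1",
    ])
    print(parse_droplet_info(sample))
    print(is_active(sample))
```

A provisioning script could call `tugboat info` in a loop with a short sleep and proceed to the SSH step once `is_active` returns true.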
Once the VM was up I could SSH into it and disable IPv6:
    $ ssh root@###.###.###.###
    $ echo '1' > /proc/sys/net/ipv6/conf/lo/disable_ipv6
    $ echo '1' > /proc/sys/net/ipv6/conf/all/disable_ipv6
    $ echo '1' > /proc/sys/net/ipv6/conf/default/disable_ipv6
    $ echo 'blacklist ipv6' >> /etc/modprobe.d/blacklist.conf
    $ echo 'GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 console=ttyS0"' >> /etc/default/grub
    $ update-grub
    $ /etc/init.d/networking restart
Then I downloaded and installed Docker and launched a container:
    $ curl http://get.docker.io/gpg | apt-key add -
    $ echo deb http://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list
    $ apt-get -y update
    $ apt-get -y install linux-image-extra-`uname -r` lxc-docker git-core
    $ docker pull colinsurprenant/ubuntu-raring-amd64
    $ docker run -p 8000:8000 -dns 18.104.22.168 -i -t colinsurprenant/ubuntu-raring-amd64 /bin/bash
Once in the container I launched a simple Python-based application server:
    root@3bdb529362ae:/tmp$ wget --no-check-certificate https://raw.github.com/pypa/pip/master/contrib/get-pip.py
    $ md5sum get-pip.py
    60a3d165e93999895e26b96681b65090  get-pip.py
    $ python get-pip.py
    $ pip install Flask gunicorn
    $ cat > hello.py <<EOL
    import os
    from flask import Flask

    app = Flask(__name__)

    @app.route('/')
    def hello():
        return 'Hello World!'
    EOL
    $ gunicorn hello:app -b 0.0.0.0:8000
Inside this container I could see port 8000 on the internal IPv4 interface was open:
    root@3bdb529362ae:/tmp$ lsof -OnP | grep LISTEN
    gunicorn 318 root 5u IPv4 18819 0t0 TCP 0.0.0.0:8000 (LISTEN)
    gunicorn 323 root 5u IPv4 18819 0t0 TCP 0.0.0.0:8000 (LISTEN)
Unfortunately when I looked at the container's host's open ports, port 8000 was only binding on IPv6 interfaces:
    $ lsof -OnP | grep LISTEN | grep 8000
    docker 8650 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8653 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8654 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8655 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8656 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8657 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8658 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8701 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8704 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8706 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8707 root .. IPv6 .. TCP *:8000 (LISTEN)
    docker 8650 8710 root .. IPv6 .. TCP *:8000 (LISTEN)
The gunicorn process only speaks IPv4 as far as I can tell, so I needed a way to stop IPv6 from being available, in the hope that this would force Docker to bind on IPv4 interfaces. Disabling IPv6 as shown above didn't work, so I tried to set up IP forwarding so that IPv4 and IPv6 traffic could pass between one another:
    $ sysctl -w net.ipv4.ip_forward=1
    $ echo 1 > /proc/sys/net/ipv4/ip_forward
    $ vi /etc/sysctl.conf # Changed to: net.ipv4.ip_forward=1
    $ sysctl -p /etc/sysctl.conf
    $ service network restart
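Reading lsof output shows what is bound, but not whether a client can actually connect. A small check like the one below, run from the host or from a remote machine, would confirm whether port 8000 accepts connections over IPv4. This is only an illustrative sketch using Python's standard `socket` module; the function name is my own.

```python
import socket


def ipv4_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection over IPv4 to host:port succeeds."""
    # AF_INET forces an IPv4 socket, so an IPv6-only listener will not match.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return True
    except (socket.error, socket.timeout):
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    print(ipv4_port_open("127.0.0.1", 8000))
```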
This didn't seem to help either so I raised a bug report with dotcloud.
As of this writing the bug is still outstanding, but I did find a workaround: using LXC directly and proxying traffic with iptables. There was a fantastic write-up in Digital Ocean's help section on how to do this.
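I haven't reproduced that write-up here, but the general shape of an iptables port forward to a container can be sketched as follows. This assumes a DNAT rule in the `nat` table's `PREROUTING` chain, and the container address `10.0.3.10` is a made-up example; the write-up itself may well take a different approach. The helper only builds the command and does not run it.

```python
def dnat_rule(host_port, container_ip, container_port):
    """Build an iptables command forwarding a host TCP port to a container.

    Returns the argument list only; running it requires root on the host.
    """
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-p", "tcp", "--dport", str(host_port),
        "-j", "DNAT",
        "--to-destination", "{0}:{1}".format(container_ip, container_port),
    ]


if __name__ == "__main__":
    # 10.0.3.10 is a hypothetical container address used for illustration.
    print(" ".join(dnat_rule(8000, "10.0.3.10", 8000)))
```

The resulting list could be handed to `subprocess.check_call` by a deployment script running as root, with IPv4 forwarding enabled on the host.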
There are still a number of pieces to this system to sort out but the bottom line is that we're on the way to having our own staging cloud which could cost as little as a few dollars a month. If any of the above chimes with you and you're interested in working with these sorts of systems, please drop me a line.