This post is part of a series detailing how this site runs. You can view the previous entry here.
With the infrastructure in place for me to receive and handle secure web requests for dwgill.com on my DigitalOcean droplet, theoretically all that's next is for me to actually run Ghost on my machine on the correct port, and Nginx can start piping over web requests to it.
There seem to be at least a couple of ways to go about setting up Ghost on a machine, the most popular of which is probably the official Ghost command line interface. This is supposed to be a turn-key solution which sets up almost all the aspects of a Ghost install for you. I can't quite explain it, but there's a certain discomfort or unease I have with installing system-wide tools for managing the installation of a particular piece of software. If all I wanted to do on that droplet was run Ghost, that would be one thing, but what if I later want to run something that requires a different version of Node? I would like to avoid cluttering my machine's environment of programs and utilities, if possible.
At bottom, Ghost is a full-fledged application, and in 2017, in my view, deploying an app responsibly generally means involving Docker in some capacity.
The Briefest of Overviews of Docker
Docker is a technology which enables you to run a process in an entirely isolated environment. When you run a program with Docker, which is to say you run it in a Docker container, it runs with its own unique filesystem and set of TCP/IP ports. The filesystem and ports are by themselves entirely isolated from actual system resources. Containers can be packaged up in images and freely moved from machine to machine, essentially enabling one to bundle an application along with its entire operating system context.
It's an effect not dissimilar to virtual machines, but whereas running a VM requires simulating an entire hardware context in addition to the operating system that runs on that simulated hardware, containerization is basically accomplished by having the operating system systematically lie to a process about the system resources available to it. This turns out to be profoundly more efficient: whereas software run in a VM frequently struggles to match the performance of code executed outside it, the runtime cost of containerization is negligible enough that large swaths of industry are now utilizing it to deploy their applications.
It's not hard to see why. By packaging applications together with all the resources they need to run, you avoid a tremendous number of potential headaches in misconfiguring the environment an application relies on. It turns out the easiest way to avoid having the wrong version of Node installed is to bundle the right version of Node with your app itself, and the simplest way to ensure all the right files are in all the right locations to execute your code is to ship the whole filesystem itself with your software. If this all sounds either too good to be true or prohibitively resource-expensive, then rest assured there's a lot of really cool engineering behind all this to make it happen, all of which is beyond the scope of this post.
Running Ghost in a Docker Container
Now if I were starting from scratch and I wanted to run Ghost in a Docker container, I would first need to build a Docker image with Ghost installed on it. I would look at the official Ghost documentation, and dutifully replicate all the steps in a Dockerfile. Then I would run docker build and Docker would execute all those commands in an isolated filesystem context, and at the end of it all I would have a snapshot of a filesystem, a Docker image, that has Ghost installed and ready to run on it.
Fortunately enough, however, the folks behind Docker have already built a ready-to-use Ghost image. This reduces the whole process to not much more than these few lines:
# Download a script to install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
# Install Docker
sudo sh get-docker.sh
# Download my Ghost image
sudo docker pull ghost:1.5-alpine
1.5-alpine is the "tag" of the Ghost image. In effect, it is both the version of Ghost (1.5.x) and the "kind" of filesystem it's installed on (Alpine Linux).
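To make the anatomy of that tag concrete, here is a minimal sketch that pulls the pieces apart with plain shell parameter expansion. The variable names are made up for illustration, and the "version-variant" layout is only a naming convention this image happens to follow, not something Docker enforces.

```shell
#!/bin/sh
# Illustrative only: split a simple image reference into its parts.
# Assumes the reference has no registry host or port in front of it.
image="ghost:1.5-alpine"

repo="${image%%:*}"      # everything before the first colon
tag="${image#*:}"        # everything after the first colon
version="${tag%%-*}"     # the tag up to the first hyphen
variant="${tag#*-}"      # the tag after the first hyphen

echo "repository=$repo version=$version variant=$variant"
```

Run as-is, this prints the repository (ghost), the Ghost version (1.5), and the base filesystem variant (alpine) on one line.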
Theoretically, all that's next is to run the following:
docker run --name my-ghost-container-name ghost:1.5-alpine
And here I am, "running" Ghost, and just as ignorant of how to install it as when I began. But this is disingenuous, as there are still more steps involved in running a container that is both serving content to the public internet and fault-tolerant.
Configuring a Ghost Container
There are three problems with running the container as it's run in the previous code block. The first is that this container doesn't just contain the Ghost application, but all the content of the Ghost site. What if the container crashes? What if I want to upgrade to the next version of Ghost when that's released? I need the content of the Ghost site to be able to outlast the container that runs the site itself, and the easiest way to do that is through Docker "volumes", which are a feature of Docker where you can effectively map a filesystem location in a container to a location on the host machine. So, that entails a new option in the run container command:
docker run \
  --name my-ghost-container-name \
  -v /home/my-user/my_ghost_content:/var/lib/ghost/content \
  ghost:1.5-alpine
This maps the directory /var/lib/ghost/content in the container to the /home/my-user/my_ghost_content directory on the host machine. I can start and stop as many different Ghost containers as I like, and so long as they all mount that same content directory, I shouldn't need to worry about losing any actual blog content. In the event a Ghost update introduces breaking changes, however, I will still need to manually export/import my content across two containers running the versions before and after the update. Deal with that when we get to it, I suppose.
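As a rough sketch of what "start and stop as many containers as I like" can look like in practice, the helper below removes an old container and starts a new one against the same host directory. The function name recreate_ghost and the DOCKER override are hypothetical conveniences for illustration, not part of Docker or of the actual setup described here; setting DOCKER=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Set DOCKER=echo to print the commands instead of running them (dry run).
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: throw away the old container and start a fresh one
# that mounts the same host directory, so the site content carries over.
recreate_ghost() {
  name="$1"         # container name, e.g. my-ghost-container-name
  content_dir="$2"  # host directory holding the Ghost content
  image="$3"        # image to run, e.g. ghost:1.5-alpine

  # Remove the old container if it exists; ignore the error if it does not.
  $DOCKER rm -f "$name" 2>/dev/null || true

  # Start the replacement against the same content directory.
  # -d runs the container in the background.
  $DOCKER run -d \
    --name "$name" \
    -v "$content_dir:/var/lib/ghost/content" \
    "$image"
}
```

Calling recreate_ghost my-ghost-container-name /home/my-user/my_ghost_content ghost:1.5-alpine would swap the container while leaving the content directory on the host untouched. Moving to a newer image would just mean passing a different third argument, subject to the breaking-changes caveat above.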
Exposing the Container to the Public Internet
Anyhow, the second problem is that this container isn't visible to the outside world. I meant what I said when I told you the filesystem and ports are entirely isolated. The Ghost app may think it's listening on a port in its container, but right now that port isn't connected to anything on the other end. But, just as we mapped a local directory into the container, we can also map a port on the host machine into it.
docker run \
  --name my-ghost-container-name \
  -v /home/my-user/my_ghost_content:/var/lib/ghost/content \
  -p 80:2368 \
  ghost:1.5-alpine
This maps port 80 on the machine to port 2368 (the default Ghost port) in the container. That's fine so far as it goes, but didn't I want to serve this blog over https? For that, Nginx needs to be the front-facing web server, listening on ports 80 and 443. We can map any host port we like to the container's interior port of 2368, so long as it isn't already being used by something else. As it happens, port 2368 on the host is currently unused.
docker run \
  --name my-ghost-container-name \
  -v /home/my-user/my_ghost_content:/var/lib/ghost/content \
  -p 2368:2368 \
  ghost:1.5-alpine
Now Ghost is serving on port 2368, which means we can now configure a reverse proxy to it with Nginx, but there's one hitch, and this is an important one: by default, Docker -p directives publish the port on every network interface of the host, which exposes the container to the public internet. What difference does it make that http://dwgill.com forcefully redirects to https://dwgill.com and serves the site securely over https, if I can still access the Ghost container directly, unencrypted, at http://dwgill.com:2368? That's a huge hole if I want to guarantee that all my traffic is safely encrypted.
The solution is a not-frequently-publicized third component to the -p option, which lets you bind the published port to a specific IP address on the host as well.
docker run \
  --name my-ghost-container-name \
  -v /home/my-user/my_ghost_content:/var/lib/ghost/content \
  -p 127.0.0.1:2368:2368 \
  ghost:1.5-alpine
This ensures that only connections to 127.0.0.1:2368 are received by the container. 127.0.0.1 is the loopback IP address, which means it's only reachable from the very machine that's listening on that IP address. If my Ghost instance is listening on 127.0.0.1, then the only things that can reach it are other processes on that same machine, which in my case means Nginx.
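One way to double-check the binding is to look at what docker port reports for the container, which lists each published port in the form "2368/tcp -> 127.0.0.1:2368". The sketch below is a small filter over that output; the function name loopback_only is invented for this example, and it assumes the output format just described.

```shell
#!/bin/sh
# Reads `docker port <container>` output on stdin, one mapping per line, e.g.
#   2368/tcp -> 127.0.0.1:2368
# Succeeds only if at least one port is published and every published port
# is bound to the loopback address.
loopback_only() {
  bound=0
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    bound=1
    case "$line" in
      *"-> 127.0.0.1:"*) ;;   # loopback binding: fine
      *) return 1 ;;          # anything else (such as 0.0.0.0) is public
    esac
  done
  [ "$bound" -eq 1 ]
}
```

On the droplet this could be used as docker port my-ghost-container-name | loopback_only && echo "only reachable locally". It checks the port mapping only; it says nothing about firewall rules or what Nginx exposes.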
The container is now truly ready to be hooked up to Nginx.
Letting Ghost Know its own URL
With the problems of preserving the site's content long-term and actually connecting it to the outside world addressed, there's just one remaining problem with the container itself: Ghost has no idea it's being served from dwgill.com. This doesn't stop requests for dwgill.com from reaching it; Nginx and the previous section make sure of that. But it does mean that Ghost won't know where to point the home button on the side or bottom of the page, for example. By default, Ghost assumes it's being served from http://localhost:2368/, which means that unless I correct it, things like the home button on every page are going to link to http://localhost:2368/ rather than to dwgill.com.
Fortunately enough, this is easy to fix. According to the Ghost documentation, you can set configuration options by setting environment variables that correspond to the relevant entry in the config file. We just need to set url. That just means another line on our docker run command:
docker run \
  --name my-ghost-container-name \
  -v /home/my-user/my_ghost_content:/var/lib/ghost/content \
  -p 127.0.0.1:2368:2368 \
  -e url=https://mywebsite.com \
  ghost:1.5-alpine
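The mechanism behind -e is ordinary environment-variable passing, and it can be seen without Docker at all. The sketch below uses a plain sh child process as a stand-in for the Ghost process; nothing in it is specific to Ghost, and the variable name child_sees is just for illustration.

```shell
#!/bin/sh
# A variable assigned in front of a command is placed in the environment of
# that one child process only. The child here is a throwaway shell standing
# in for the process that would run inside the container.
child_sees="$(url=https://mywebsite.com sh -c 'printf "%s" "$url"')"
echo "child process sees: $child_sees"

# The parent shell's own environment is left as it was.
echo "parent shell sees: ${url:-<unset>}"
```

With a real container running, docker exec my-ghost-container-name printenv url should show the value that was passed with -e, assuming the image ships a printenv utility.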
Keeping Ghost Running with Systemd
With all this configured, the only thing that remains is making sure the site stays running while I'm away. As it stands, this docker run command only runs as long as the terminal window I typed it in stays open. The easiest way to keep it running indefinitely would be to just add a -d flag to the command, but I would prefer to add this blog as a system-level service under Systemd. Nginx is a Systemd service, and if I want to start Nginx, that's a simple systemctl start nginx.service; if I want to restart it, systemctl restart nginx.service; and if there's any issue, I can check the status of it with systemctl status nginx.service. That's a convenience I would like to have with my blog.
The way you get from the fancy docker run command I've developed over the course of this post to a Systemd service is a unit file. This specifies the details Systemd needs to know about how to properly handle an application: How do I start the program? How should I stop it? What do I need to do before starting it? What other services need to be running first? There's a lot you can do with these, but as it turns out the needs of my Ghost instance are relatively simple, and make for a comparatively straightforward config file, which I call docker.ghost.service:
# [Unit] details metadata about the service
[Unit]
# Describe docker.ghost.service
Description=Ghost Container
# Requires= ensures that starting docker.ghost.service
# will also start docker.service
Requires=docker.service
# After= ensures that docker.service will be fully up and
# running before Systemd starts docker.ghost.service
After=docker.service
# Likewise for Nginx
Requires=nginx.service
After=nginx.service

# [Service] gives details on managing the service itself
[Service]
# If Ghost exits for any reason, always restart it
Restart=always
# %n will be replaced with the name of this unit file
# When starting Ghost, stop & remove the container if it's
# already running/exists
ExecStartPre=-/usr/bin/docker stop %n
ExecStartPre=-/usr/bin/docker rm %n
# Pull the image if we do not already have it
ExecStartPre=/usr/bin/docker pull ghost:1.5-alpine
# Run the command we've come up with
ExecStart=/usr/bin/docker run \
  --name %n \
  -v /var/lib/ghost/content:/var/lib/ghost/content \
  -e url=https://dwgill.com \
  -p 127.0.0.1:2368:2368 \
  ghost:1.5-alpine

[Install]
WantedBy=multi-user.target
With this file ready, I just put it in /etc/systemd/system, and then all that truly remains is to, in order, reload the Systemd configuration, enable the service to start at boot, and start the service itself. Because of the lines concerning nginx.service, this will automatically ensure Nginx is up and running before even trying to start Ghost.
sudo systemctl daemon-reload
sudo systemctl enable docker.ghost.service
sudo systemctl start docker.ghost.service
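After those three commands, it can be reassuring to confirm that the service is both enabled at boot and currently running. The following is a minimal sketch of such a check. The function name ghost_service_report is hypothetical, and it degrades to a plain message on machines without systemctl.

```shell
#!/bin/sh
unit="docker.ghost.service"

# Hypothetical helper: summarize the unit's state, falling back to a message
# when run somewhere without Systemd.
ghost_service_report() {
  if ! command -v systemctl >/dev/null 2>&1; then
    echo "systemctl not available on this machine"
    return 0
  fi
  # is-enabled and is-active each print a one-word state; they exit non-zero
  # for states like "disabled" or "inactive", hence the "|| true".
  echo "boot:    $(systemctl is-enabled "$unit" 2>/dev/null || true)"
  echo "running: $(systemctl is-active "$unit" 2>/dev/null || true)"
}

ghost_service_report
```

If something looks off, journalctl -u docker.ghost.service shows what the unit has logged, including the output of the container itself.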
I don't think when I started that I expected I would take well north of 3,000 words detailing this website's technical operation, but here's hoping somebody finds it interesting and/or useful. I know I probably will, six or so months from now when something breaks and I'm left trying to remember just how in the world I put this all together.
You can find all the config files / scripts mentioned in this series at this GitHub repository.
It's actually more accurate to say the reverse; you "build" images, which are in themselves effectively "snapshots" of filesystems, that then run as containers once you execute a process with that image. ↩︎
"A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image." ↩︎
It's both possible and very common to run Docker containers whose filesystems resemble operating systems entirely different from that of the machine you're actually running them on. Alpine Linux, a very minimal and security-focused Linux distribution, consequently ends up being a popular choice in order to minimize disk space. ↩︎
The Ghost Foundation recommends using Ghost together with a MySQL database, but the Ghost Docker image by default uses SQLite, which means the database is stored in the Ghost container. I could thus avoid this issue by configuring Ghost to use MySQL (with MySQL running in another container), but I figure SQLite covers my bases fine for the almost-zero users I'm serving right now. ↩︎
Docker volumes can also do a number of other cool things, but that's all irrelevant to this post. ↩︎
Ports 80 and 443 are the standard http and https ports, respectively. ↩︎
In retrospect, I suppose this means that forgoing an IP address parameter with a -p directive is equivalent to binding it to every network interface on the machine (0.0.0.0). ↩︎
Environment variables are an easy means by which a process can pass information and data to a subprocess—usually configuration stuff. ↩︎
Systemd is an init system, which basically means it's the first process that runs on a Linux system and it's responsible for managing system resources and all the other processes on the machine. ↩︎