Horizontally scaling your Ghost blog on DigitalOcean
Preface
I had intended to submit this to DigitalOcean as a post, but it doesn't fit the topics they require for their community posts. I do not feel the need to hold on to this content until DO are ready for it. I'd rather get it out there and help people. Sure, getting paid $300 for it would be good, but helping others is a reward in itself.
I currently host this blog on Ghost itself, but have worked through this as a fun exercise in case the traffic increases on this blog and I can't justify the increased cost of Ghost hosting for free content.
Incidentally, you can sign up to DigitalOcean and get $100 credit (referral link which benefits me if you use it).
Introduction
Ghost is an open source, headless, Node.js CMS, and a great alternative to WordPress. The image in the marketplace can get you up and running with a high-performance blog in minutes. But what happens when you build such an audience that your droplet isn't enough? You can scale vertically by adding more power to the one droplet from within your DigitalOcean dashboard, or you can scale horizontally by adding more droplets.
In this tutorial, you will:
- Migrate the Ghost database from the same droplet as the blog to a separate droplet specifically for the database
- Change your Ghost blog infrastructure from a single droplet to multiple droplets behind a load balancer
- Create a GlusterFS network filesystem to share files between servers on volumes
- Move the uploads for the Ghost blog single node to the GlusterFS file system
Prerequisites
Before following this tutorial make sure you have:
- A VPC configured. You may find An Overview of DigitalOcean VPC and Networking helpful if you have never worked with a VPC before
- One droplet with a running Ghost blog install
- Access to update DNS settings for your domain
- An Ubuntu 18.04 Droplet with MySQL installed by following the guide How To Install MySQL on Ubuntu 18.04
You may also need to refer to the following throughout this process:
- Tutorial: How To Create a Redundant Storage Pool Using GlusterFS on Ubuntu 18.04
- Docs: How to create Load Balancers
During this tutorial it is important that you or other authors do not create new content within Ghost. Doing so may result in the content being missed as part of the migration.
Step 1 — Migrating the database
Because the Ghost image in the marketplace runs MySQL 5.7, it is not advised to use a managed database instance within DigitalOcean for the Ghost database. Managed databases use MySQL 8, which causes problems when migrating the database. Specifically, the managed database instance has the sql_require_primary_key variable set to ON. Not all of the tables within Ghost have primary keys, which causes a database restore to fail.
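To see which tables would trip that setting, you can ask MySQL on the Ghost droplet for tables that have no primary key. This is a minimal sketch rather than part of the migration itself: the function name list_tables_without_pk is made up for illustration, and it assumes root can connect locally without a password, as the mysqldump command in this tutorial does.

```shell
# Hypothetical helper: list base tables in a schema that have no primary key.
# Assumes `mysql -u root` connects without a password on this Droplet.
list_tables_without_pk() {
  local schema="$1"
  # -N suppresses the column-name header so only table names are printed.
  mysql -u root -N -e "
    SELECT t.table_name
    FROM information_schema.tables t
    LEFT JOIN information_schema.table_constraints c
      ON c.table_schema = t.table_schema
     AND c.table_name = t.table_name
     AND c.constraint_type = 'PRIMARY KEY'
    WHERE t.table_schema = '${schema}'
      AND t.table_type = 'BASE TABLE'
      AND c.constraint_name IS NULL;"
}
```

Running list_tables_without_pk ghost_production should print one table name per line, or nothing if every table has a primary key.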
Step 1.1 — Backing up the database
SSH into the Droplet hosting Ghost and run mysqldump on the database. The database name, by default, is ghost_production, but you can check what it is in the configuration file /var/www/ghost/config.production.json. Using the details from the config.production.json file, make a note of the username (likely ghost) and password for the connection. The database connection settings may look like:
{
  "database": {
    "client": "mysql",
    "connection": {
      "host": "localhost",
      "user": "ghost",
      "password": "07fb430a68382862c3195653c64583d608d8a234627b9e33",
      "database": "ghost_production"
    }
  }
}
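If you would rather not read the values by eye, something like the following can pull them out of the file. It is a rough sketch with a hypothetical function name (ghost_config_value), and it assumes the file is laid out with one "key": "value" pair per line as in the sample above. It is not a general JSON parser.

```shell
# Hypothetical helper: print the string value of the first line that looks
# like  "key": "value"  in a Ghost config file.
# Assumes one key per line; nested objects such as "database": { are skipped
# because their value does not start with a double quote.
ghost_config_value() {
  local key="$1" file="$2"
  sed -n "s/^[[:space:]]*\"${key}\"[[:space:]]*:[[:space:]]*\"\(.*\)\"[[:space:]]*,\{0,1\}[[:space:]]*$/\1/p" "$file" | head -n 1
}
```

For example, ghost_config_value database /var/www/ghost/config.production.json would print the schema name, and ghost_config_value user would print the MySQL username.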
Run the mysqldump command in the form mysqldump -u root [database] > ~/ghost.sql, i.e.
mysqldump -u root ghost_production > ~/ghost.sql
After this completes you should see a file called ghost.sql with a non-zero length if you run ls -la ~/
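A non-zero size only tells you that something was written. As a slightly stronger check, mysqldump normally ends its output with a "-- Dump completed" comment (unless comments were disabled), so you can look for that too. The sketch below uses a hypothetical function name, dump_looks_complete.

```shell
# Hypothetical helper: succeed only if the dump file is non-empty and ends
# with the trailer comment that mysqldump writes when it finishes normally.
# Assumes the dump was taken with default options (comments enabled).
dump_looks_complete() {
  local file="$1"
  [ -s "$file" ] && tail -n 5 "$file" | grep -q '^-- Dump completed'
}
```

You could use it as dump_looks_complete ~/ghost.sql && echo "dump looks complete" before moving on to the restore.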
Step 1.2 — Configuring the new database
SSH into the droplet which has MySQL installed (from the prerequisites) and run mysql to start the MySQL CLI. Create a new schema for the Ghost database
CREATE SCHEMA ghost_production;
Then create the user for ghost to connect with, using the details from the configuration file
CREATE USER 'ghost'@'%' IDENTIFIED BY '07fb430a68382862c3195653c64583d608d8a234627b9e33';
You then need to grant the ghost user permission to perform actions on the database
GRANT create,delete,insert,select,update,alter ON ghost_production.* TO 'ghost'@'%';
Step 1.3 — Restoring the database
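Once the user has been created you can restore the database from the Ghost droplet to the new database droplet. A dump of a single database does not contain a statement that selects the schema, so name the schema on the command line. SSH into the Ghost droplet and run
mysql -u ghost -h [mysql_droplet_ip] -p ghost_production < ~/ghost.sql
Enter the password for the ghost user when prompted.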
Once the database has been restored, edit the database settings within /var/www/ghost/config.production.json and change the host from 'localhost' to the IP address of the MySQL Droplet. If your Ghost Droplet is on the same private network as the MySQL Droplet, use the private network address.
When the configuration file has been changed, restart Ghost so that it picks up the new settings, then stop the MySQL service on the Ghost droplet using service mysql stop. The blog should still load with all the content as before. It is now communicating with the separate database droplet.
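If you prefer to script the host change rather than edit the file by hand, a sed one-liner along these lines would do it. This is only a sketch: set_ghost_db_host is a hypothetical name, 10.106.0.5 in the usage note is a placeholder address, and the function assumes the database host is currently written as "localhost". It deliberately matches only that value so that any other "host" entry in the file is left alone.

```shell
# Hypothetical helper: point the Ghost database connection at a new host.
# Only a "host" entry whose value is exactly "localhost" is rewritten.
# A copy of the original file is kept next to it with a .bak suffix.
# Assumes new_host is a plain IP address or hostname (no / or & characters).
set_ghost_db_host() {
  local file="$1" new_host="$2"
  sed -i.bak "s/\(\"host\"[[:space:]]*:[[:space:]]*\)\"localhost\"/\1\"${new_host}\"/" "$file"
}
```

For example, set_ghost_db_host /var/www/ghost/config.production.json 10.106.0.5 would update the file and leave config.production.json.bak behind in case you need to roll back.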
Step 2 — Creating the load balancer and additional Droplets
The second Ghost droplet should be created from a snapshot of the existing droplet. It is recommended that Droplets are powered off during the snapshot process. This provides you with the perfect opportunity to create and introduce the load balancer, and re-point the DNS from the Droplet to the load balancer. Assign the first droplet to the load balancer and start the Droplet.
Once it is online and traffic is being routed correctly, bring the second Droplet online. When the second droplet is showing as online within the load balancer, remove the first one from the load balancer again. All traffic should then be routed directly to the second Droplet, and the blog should be no different from when it was being served by the first Droplet. When you have verified the site is working the same on both Droplets, add the first Droplet back to the load balancer.
Repeat this step to add a third droplet.
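As an alternative to taking Droplets in and out of the load balancer to test them, you can request the site from each Droplet directly by sending the blog's hostname in the Host header. The following is a sketch under a few assumptions: check_droplet is a hypothetical name, the Droplets answer on port 80, and any 2xx or 3xx response (including a redirect to HTTPS) is treated as healthy.

```shell
# Hypothetical helper: request the blog from one Droplet by IP, sending the
# blog's domain in the Host header, and report the HTTP status code.
# Assumes the web server on the Droplet listens on port 80.
check_droplet() {
  local domain="$1" ip="$2" code
  code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' \
    -H "Host: ${domain}" "http://${ip}/")
  case "$code" in
    2??|3??) echo "${ip} OK (${code})" ;;
    *) echo "${ip} FAILED (${code})"; return 1 ;;
  esac
}
```

You would call it once per Droplet, for example check_droplet example.com 10.106.0.2, where both the domain and the address are placeholders for your own values.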
Step 3 — Adding Volume Storage and using GlusterFS
At this point, there are three separate instances of the blog which share the same database. With there being multiple instances, if you upload an image to the blog, it will only be uploaded to one of the servers. This will lead to 404 errors for images if the site visitor loads the page on a server which doesn't have the file. To get around this issue you can use clustered volume storage.
Step 3.1 — Creating block storage volumes
First, establish how much space is required to store the images folder for Ghost by running du -sh /var/www/ghost/content/images. This provides a human-readable total for the folder and its subfolders as a single line.
Once that has been determined, create a block storage volume with enough space to hold all of that data, and with space to keep adding more. Make sure the option 'Automatically Format & Mount' is selected, and the filesystem is set to XFS. Attach this volume to the first Droplet, then create another volume exactly the same and attach it to the second droplet. Repeat for the third Droplet.
Step 3.2 — Configuring Gluster
First you need to install Gluster on the Droplets. On Ubuntu 18.04, on which the Ghost image is built, this is done with apt install glusterfs-server. When that has been done on all three Droplets, start the Gluster service with service glusterd start.
Once the service is started, iptables needs updating on every Droplet to allow communication between them: iptables -I INPUT -p all -s <ip-address> -j ACCEPT. On each Droplet, <ip-address> is the IP address of one of the other Droplets, so run the command once for each peer. It is preferable for the ip-address values to be private IP addresses; however, the public IP addresses will work.
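Because the same rule has to be added once per peer on every Droplet, a small loop saves some typing. The sketch below uses a hypothetical function name, allow_gluster_peers, and first checks with iptables -C whether the rule already exists so that running it twice does not add duplicates.

```shell
# Hypothetical helper: allow all traffic from each peer IP given as an argument.
# Uses `iptables -C` to check for an existing rule before inserting a new one.
# Must be run as root on each Droplet with the IPs of the OTHER Droplets.
allow_gluster_peers() {
  local ip
  for ip in "$@"; do
    iptables -C INPUT -p all -s "$ip" -j ACCEPT 2>/dev/null ||
      iptables -I INPUT -p all -s "$ip" -j ACCEPT
  done
}
```

On the first Droplet in the example later in this tutorial you would run allow_gluster_peers 10.106.0.3 10.106.0.4. Rules added this way are not persisted across reboots unless you save them separately.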
With the ports open between the servers within iptables, Gluster needs to be made aware of its peer nodes. On the first Droplet run gluster peer probe droplet2-ip, where droplet2-ip is the IP address added to the iptables rule. Repeat for droplet3-ip.
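The probes can be run in one go and followed by gluster peer status, which lists the peers the current node knows about. This is a minimal sketch with a hypothetical function name, probe_gluster_peers, and it stops at the first probe that fails.

```shell
# Hypothetical helper: probe each peer IP in turn, then show the peer list.
# Stops and returns non-zero as soon as one probe fails.
probe_gluster_peers() {
  local ip
  for ip in "$@"; do
    gluster peer probe "$ip" || return 1
  done
  gluster peer status
}
```

Run on the first Droplet, probe_gluster_peers 10.106.0.3 10.106.0.4 would probe both of the other Droplets in the example addressing used later in this tutorial.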
Gluster doesn't always recognise the links in the underlying file system for /dev/sdaX and where they actually point to. As such, you will need to mount the volume to a new location. Assuming that the volume is attached as /dev/sda on each Droplet, run
echo "/dev/sda /export/sda xfs defaults 0 0" >> /etc/fstab
mkdir -p /export/sda && mount -a && mkdir /export/sda/brick
This will create the folder /export/sda, which will be a mount of /dev/sda, and then create a folder called brick within the volume. You will use this brick folder when creating the glusterfs volume, next.
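Appending to /etc/fstab with echo adds a duplicate entry every time the command is repeated. If you expect to re-run these steps, a guard like the following only appends the line when it is not already there. The function name ensure_fstab_line is hypothetical, and the second argument exists only so the helper can be tried against a scratch file first.

```shell
# Hypothetical helper: append a line to an fstab-style file only if an
# identical line is not already present. Defaults to /etc/fstab.
ensure_fstab_line() {
  local line="$1" file="${2:-/etc/fstab}"
  grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}
```

In place of the echo command you could run ensure_fstab_line "/dev/sda /export/sda xfs defaults 0 0", then continue with mkdir -p /export/sda and mount -a as before.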
To share the files across the volumes so they are available at all times on all Droplets you need to create a replica volume within Gluster. It is preferable to create this with 3 or more volumes, which is why this tutorial uses 3 droplets behind the load balancer:
gluster volume create <volumename> replica 3 <droplet1-ip>:/export/sda/brick <droplet2-ip>:/export/sda/brick <droplet3-ip>:/export/sda/brick
The IP addresses used should be the private ones, but must be the ones set in iptables. Once the volume has been created, the console will tell you to start the volume to access data, e.g.
root@ghost-blog-web-01:/export/sda# gluster volume create ghostshare replica 3 10.106.0.2:/export/sda/brick 10.106.0.3:/export/sda/brick 10.106.0.4:/export/sda/brick
volume create: ghostshare: success: please start the volume to access data
Start the volume with gluster volume start <volumename>.
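To confirm the volume really is running, gluster volume info prints a Status line that reads Started once the volume is up. The sketch below wraps that check in a hypothetical function, gluster_volume_started, so it can be used in a script.

```shell
# Hypothetical helper: succeed only if `gluster volume info` reports the
# named volume with "Status: Started".
gluster_volume_started() {
  local volume="$1"
  gluster volume info "$volume" | grep -q '^Status: Started'
}
```

With the volume name from the example above, gluster_volume_started ghostshare && echo "ghostshare is running" gives a quick yes or no.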
To test the cluster run touch /export/sda/brick/test-file on one of the Droplets. The file should then be replicated on the other bricks in the volume.
Step 3.3 — Copying the images from Ghost
Now the shared volume is up and running, the images from Ghost can be copied from the location within the Ghost install to the Gluster volume using cp -r /var/www/ghost/content/images /export/sda/brick on one of the Droplets.
This will copy all files and folders from the images folder into the brick. It should be replicated across the other volumes in the cluster.
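A quick way to check that nothing was missed is to compare the number of files in the source and destination folders, and to repeat the count on the other Droplets. This is a rough sketch with hypothetical function names (count_files and same_file_count). It compares counts only, not file contents.

```shell
# Hypothetical helpers: count regular files under a directory, and compare
# the counts of two directories. This checks counts only, not contents.
count_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

same_file_count() {
  [ "$(count_files "$1")" = "$(count_files "$2")" ]
}
```

After the copy, same_file_count /var/www/ghost/content/images /export/sda/brick/images should succeed, and count_files /export/sda/brick/images can be run on each Droplet to compare the totals.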
Step 3.4 — Serving images from the volume within Ghost
To serve the images from the Gluster volume, you need to replace the Ghost images folder with a link to the Gluster volume. On each of the Droplets perform the following:
mv /var/www/ghost/content/images /var/www/ghost/content/images-backup
cd /var/www/ghost/content
ln -s /export/sda/brick/images ./images
chown -R ghost:ghost /export/sda/brick/images
This will rename the images folder to images-backup, and then create images within the Ghost content folder as a symbolic link to the images folder within the Gluster brick. It will then change the ownership of the files within the brick to the ghost user and group, to allow ghost to access the files and folders for the normal running of the site.
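Since these commands are repeated on every Droplet, it can help to wrap the rename and link in a function that refuses to run twice and checks that the target exists first. The sketch below uses a hypothetical name, link_images, and leaves the chown step out so that it can be tried on a scratch directory without a ghost user.

```shell
# Hypothetical helper: replace <content_dir>/images with a symlink to target.
# Does nothing if images is already a symlink, and refuses to run if the
# target directory is missing. Ownership (chown) is left to the caller.
link_images() {
  local content_dir="$1" target="$2"
  if [ -L "${content_dir}/images" ]; then
    echo "images is already a symlink, nothing to do"
    return 0
  fi
  if [ ! -d "$target" ]; then
    echo "target ${target} does not exist" >&2
    return 1
  fi
  mv "${content_dir}/images" "${content_dir}/images-backup" &&
    ln -s "$target" "${content_dir}/images"
}
```

On a Droplet this would be called as link_images /var/www/ghost/content /export/sda/brick/images, followed by the chown command shown above.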
Conclusion
In this tutorial you've started with a single Droplet with Ghost as a blogging platform. You've scaled the infrastructure to 3 web-server Droplets behind a load balancer and set up a separate Droplet for the MySQL database, which the 3 instances of the blog communicate with. You've attached storage volumes to each of the droplets and set up a clustered file system to hold the images so they are available to each of the web servers.
At this point, the blog is ready to use again. You can upload images on any of the servers behind the load balancer and they will be available to all of the Droplets you have configured.