Published: 2022-06-01

For five years my day job was managing a production, public Ceph cluster. I’m very fond of it and know how good it is at not losing data.

Now I want to start scanning my book collection. That will require a significant amount of durable onsite storage, so I’m turning to Ceph.

This document details my efforts to fold setup and management of the Ceph distributed storage system into Homefarm.

2022-06-01 First steps

I always start a process with manual proving-out, then move to automation.

The first thing I needed was a machine to test with. I didn’t have any spare machines on hand, and I didn’t want to build standard PCs from loose parts: I want these machines to be low-power and physically small. I decided to start with a refurbished USFF business desktop: the HP 600 G2.

They’re $190 and in plentiful supply. They have an i3-6100T CPU, which is efficient and still has more than enough power for the cold-storage job it’s going to be asked to do. Most importantly, while they come with a 2.5" laptop-format SSD, they also have an on-board NVMe slot, which can be used for the OSD.

Since I last worked with it professionally, Ceph has transitioned to something of a Homefarm-like approach itself, thanks to a new tool called cephadm. It bootstraps a cluster with “reasonable” defaults in very little time and with not much effort.
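The bootstrap itself is only a couple of commands. This is a sketch from memory rather than a transcript: the monitor IP is a placeholder for whatever address your node has, and the download path shown is for the Quincy branch and may differ by release.

```shell
# Fetch the standalone cephadm script and mark it executable.
curl --silent --remote-name --location \
    https://github.com/ceph/ceph/raw/quincy/src/cephadm/cephadm
chmod +x cephadm

# Bootstrap a one-node cluster. 10.0.0.10 is a placeholder mon IP;
# use the address of the machine you're running this on.
sudo ./cephadm bootstrap --mon-ip 10.0.0.10
```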

Also, deployment via containers has become the standard for Ceph, which works around one of the problems I hit the last time I tried something like this: only deb and rpm packages being available.

However, irritatingly, cephadm itself expects to be able to find Debian or Red Hat packages so it can install anything it considers to be missing. Luckily, I found a workaround for this on the AUR and was able to… sort of… bring up a single Ceph node.

Remember those reasonable defaults? They include triple replication with hosts as the failure domain, which means a single-node install can never satisfy them: none of the placement groups would go active, and I couldn’t get an RGW running. It was a really good first effort, though.
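For anyone who wants to force a one-node cluster to go healthy anyway, the usual knobs are the default pool size and the CRUSH failure domain. A sketch, not something to run in production; these are real Ceph options, but the single-node values are my assumption of what a lab setup would need:

```ini
# ceph.conf overrides for a single-node lab (not production values)
[global]
osd_pool_default_size = 1        ; one replica instead of three
osd_pool_default_min_size = 1    ; allow I/O with that single replica
osd_crush_chooseleaf_type = 0    ; place replicas across OSDs, not hosts
```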

Next up is buying two more machines, then doing some upgrades. This is what the cluster will look like initially:

Part                    Qty   Cost (USD)
HP ProDesk 600-G2        3        189.99
WD Green 1TB NVMe SSD    3         79.99
2x8GB DDR4 SODIMM kit    3         54.99
Total                             974.91
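As a sanity check on the table, the per-unit prices multiply out like so (a quick sketch; prices are the ones listed above):

```python
# Rough bill-of-materials check for the three-node cluster.
parts = {
    "HP ProDesk 600-G2":     189.99,
    "WD Green 1TB NVMe SSD":  79.99,
    "2x8GB DDR4 SODIMM kit":  54.99,
}
qty = 3  # one of each part per node
total = round(sum(price * qty for price in parts.values()), 2)
print(total)  # 974.91
```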

3 nodes, each with 1TB of storage and 16GB of RAM. There will also be three 8GB SODIMMs left over for future use. Running at triple replication (the default), this will result in a total cluster storage size of… 1TB. But that’s fine for a starting point.
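The capacity math above can be sketched as a toy calculation; "TB" here is the drive's marketed size, ignoring filesystem and BlueStore overhead:

```python
# Usable capacity of a replicated Ceph pool: raw space divided by
# the replication factor (pool "size"). Numbers from the build above.
nodes = 3
osd_tb_per_node = 1.0   # one 1TB NVMe OSD per node
replication = 3         # Ceph's default pool size

raw_tb = nodes * osd_tb_per_node
usable_tb = raw_tb / replication
print(usable_tb)  # 1.0
```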

2022-06-17 Working cluster