This post first appeared on the QueryCal technical blog.

A/B testing is a lifesaver for a solo SaaS developer. While it’s hard to predict whether a 14-day or a 1-month trial will get a better signup rate, it’s really simple to test: show half of your visitors the 14-day button and the other half the 1-month button, then measure the click-through rate of each.

There are loads of tools out there to help you do this, but rule #1 of running a micro-SaaS like QueryCal is to keep your technology stack simple—it’s better to focus your time on features, not integrating a bunch of different technologies.

Here’s how QueryCal does A/B testing using only Caddy, a static site generator, and Plausible Analytics.

QueryCal’s web architecture

A boxes and arrows diagram showing a Caddy server on the edge of the network proxying requests to either web.landing or web.docs depending on the request

QueryCal’s web architecture is very simple. There’s a public-facing Caddy server that forwards requests to either the landing page (web.landing) or the docs site (web.docs). These two sites are themselves just simple static file servers.

The Caddy config looks like this (I’m using docker-compose so web.landing resolves to the IP of the container serving the landing page):

querycal.com/docs {
    reverse_proxy web.docs:80
}

querycal.com {
    reverse_proxy web.landing:80
}
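
Since this all runs under docker-compose, the compose file looks roughly like the sketch below (the images and paths here are illustrative rather than QueryCal’s exact config; any static file server image will do):

services:
  caddy:
    image: caddy:2
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile

  web.landing:
    # static file server for the landing page bundle
    image: nginx:alpine
    volumes:
      - ./landing/dist:/usr/share/nginx/html:ro

  web.docs:
    # static file server for the docs site
    image: nginx:alpine
    volumes:
      - ./docs/dist:/usr/share/nginx/html:ro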

So, how is this simple setup going to do A/B testing? It needs to:

  1. Be able to serve two different experimental variants of the site
  2. Split visitors evenly into the two experiments
  3. Track which experiment each visitor was seeing when they signed up

Most A/B testing solutions would need some backend code to accomplish this, but with some clever use of Caddy it’s possible to do all of this while keeping the architecture simple and only using static file servers:

The same boxes and arrows diagram as before but now there are two copies of web.landing: web.landing.a and web.landing.b

Challenge 1: Serving A/B versions of a static site

The QueryCal landing page is just some plain HTML, CSS, and JS that’s bundled up using a tool called Parcel.

Normally Parcel takes the HTML/CSS/JS and generates a single bundle of the static site, but by using a plugin called posthtml-expressions we can get it to generate the two variants.

posthtml is an HTML post-processor, and the posthtml-expressions plugin adds support for some basic conditionals, so we can write HTML like this:

<if condition="variantA">
    <a href="/register?trial=14">14-day free trial</a>
</if><else>
    <a href="/register?trial=28">1-month free trial</a>
</else>

At build time this gets processed and converted into a single button that either says “14-day free trial” or “1-month free trial” depending on whether the condition “variantA” is true.

This condition gets set based on an environment variable using this short posthtml.config.js file:

module.exports = {
  plugins: [
    require("posthtml-expressions")({
      locals: { variantA: process.env.VARIANT === "A" },
    }),
  ],
};

Now when we run VARIANT=A parcel build we’ll get one version of our site and when we run VARIANT=B parcel build we’ll get the other!
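
In practice that’s just two builds with two output directories. A rough sketch of the build commands (assuming Parcel 2 and its --dist-dir flag; the entry point and paths are placeholders):

# build each variant into its own directory
VARIANT=A npx parcel build src/index.html --dist-dir dist-a --no-cache
VARIANT=B npx parcel build src/index.html --dist-dir dist-b --no-cache

The --no-cache flag is there because Parcel’s cache may not notice that only an environment variable changed between the two builds.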

Challenge 2: Splitting visitors into A/B variants with Caddy

Now that we can easily generate two different variants of a static site, we need a way to randomly show each to half of the visitors.

Ideally, the solution needs to:

  • Split visitors as close to 50:50 as possible (otherwise it’s harder to work out which experiment performed best)
  • Consistently show the same variant of the site to each individual (it’d be confusing if the page kept changing when you reloaded it)
  • Not require cookies (that’d defeat the point of using cookie-free analytics like Plausible)

Luckily Caddy has a directive that can do exactly this: a reverse_proxy that uses the ip_hash load balancing policy.

querycal.com {
    reverse_proxy web.landing.a:80 web.landing.b:80 {
        lb_policy ip_hash
    }
}

This Caddyfile means, for requests to querycal.com:

  1. First, Caddy takes the visitor’s IP and hashes it
  2. If the hash is “even”, Caddy reverse-proxies the request to web.landing.a
  3. If the hash is “odd”, Caddy reverse-proxies the request to web.landing.b

This is great because it means:

  • Visitors will be very evenly split between the two variants
  • As long as a visitor’s IP remains the same, they’ll always see the same variant
  • All without requiring any cookies!

The only complexity this adds to our architecture is having to run one static file server per variant of the QueryCal landing page.
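
In docker-compose terms, that just means replacing the single web.landing service with two, each serving one of the two build outputs (image and paths illustrative, as before):

services:
  # ...caddy and web.docs stay as they were...
  web.landing.a:
    image: nginx:alpine
    volumes:
      - ./landing/dist-a:/usr/share/nginx/html:ro
  web.landing.b:
    image: nginx:alpine
    volumes:
      - ./landing/dist-b:/usr/share/nginx/html:ro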

Challenge 3: Tracking A/B conversion rate with Plausible Analytics

Now that we’ve got visitors being shown the two variants of our site, all that’s left is to track the success rate of each. This is easily done by extending the if-else to use Plausible’s custom props feature.

<if condition="variantA">
    <a href="/register?trial=14" onclick="plausible('signup', {props: {variant: '14-day'}}">14-day free trial</a>
</if><else>
    <a href="/register?trial=28" onclick="plausible('signup', {props: {variant: '1-month'}}">1-month free trial</a>
</else>
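
For those onclick handlers to actually send anything, the Plausible script needs to be on the page, plus the small stub from Plausible’s custom events docs that queues any calls made before the script has loaded. Something like this (the exact script file depends on your Plausible setup):

<script defer data-domain="querycal.com" src="https://plausible.io/js/script.js"></script>
<script>
    window.plausible = window.plausible || function () {
        (window.plausible.q = window.plausible.q || []).push(arguments);
    };
</script>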

Now when someone signs up:

  • If they saw the 14-day trial button, a conversion gets logged with the tag variant: 14-day
  • If they saw the 1-month trial button, a conversion gets logged with the tag variant: 1-month

This gets nicely broken down in the Plausible UI:

Screenshot of the Plausible Analytics UI showing the goal ‘signup’ broken down into the results for each individual variant

Limitations of this method

While this is a nice and simple way to implement A/B testing, it does have some limitations compared to a more full-featured A/B testing solution. But these limitations are fairly minor and, for small sites like QueryCal, definitely worth the tradeoff.

The A/B split is only done on IP, not per-visitor

Because we’re using the visitor’s IP address to decide which variant to show, an even split isn’t guaranteed. Universities and mobile carriers commonly put huge numbers of users behind a small pool of shared IP addresses (carrier-grade NAT) rather than assigning everyone a unique one. This can skew the data, because everyone behind a shared IP will be shown the same variant.

It only supports a fixed number of variants

Full-featured A/B testing tools will let you create many different variants of a button and test them all at the same time. Because we have to statically generate each version of the site, this technique only really works for a small number of variants (I’m only ever using two at a time).

Simultaneous experiments get bundled together

Say you’re running two simultaneous experiments: the 14-day vs 1-month trial button test and another test of whether the pricing information should come before or after a demo.

A proper A/B testing solution would show 4 different versions of the site:

  • 14-day trial, pricing first
  • 14-day trial, demo first
  • 1-month trial, pricing first
  • 1-month trial, demo first

But because we’re statically generating the versions of our site we get just two:

  • 14-day trial, pricing first
  • 1-month trial, demo first

If the second variant shows a much higher signup rate, it’ll be hard to tell whether it was the 1-month trial or showing the demo before the pricing that was actually responsible. This isn’t a killer problem, but it’s worth bearing in mind: where possible, don’t run multiple experiments that are trying to improve the same metric.

What is QueryCal?

QueryCal takes your calendars and lets you query them as if they were an SQL database. It uses the real SQLite query engine so you can write queries to calculate metrics based on your calendar events and correlate events between your different calendars.

Sound interesting? Sign up for a free trial.