Deploying a Rust service with Nix, GitHub Actions, and SCP
Fake Email runs a Rust SMTP + HTTP server on a single EC2 box. The deploy pipeline is deliberately boring: Nix flake for reproducible local builds, plain cargo on stock Ubuntu in GitHub Actions for the production binary, scp + ssh to ship it, systemd to run it, Caddy to terminate TLS. No Kubernetes, no Docker, no fly.io, no Pulumi.
This post walks through every file in the deploy path and explains why each piece exists — including the one decision that trips up most people the first time: we use Nix locally but do not ship the Nix-built binary.
1. The shape of the system
Two components live on a single Ubuntu 22.04 EC2 instance, and a third sits beside it:
- fake-email: our Rust binary, listens on 127.0.0.1:3001 (HTTP) and 0.0.0.0:25 (SMTP).
- caddy: terminates TLS on :443, reverse-proxies to :3001.
- postgres: on a separate managed instance, reached over the network.
Inbound mail arrives on port 25 directly to the Rust SMTP server. The browser talks to https://api.fake-email.site → Caddy → 127.0.0.1:3001. The Next.js UI is on Vercel, so the EC2 box never serves HTML. That separation is the entire architecture.
                     Internet
        ┌────────────────┼────────────────┐
        │                │                │
   :443 HTTPS        :80 ACME         :25 SMTP
        │                │                │
        └────► Caddy ◄───┘                │
                 │                        │
                 ▼                        ▼
          127.0.0.1:3001 ◄─────── fake-email (Rust)
                                          │
                                          ▼
                                 Postgres (managed)
2. Two build tracks: Nix for dev, cargo for deploy
This is the one thing that confuses every new contributor, so it comes before the file-by-file walkthrough.
The repo ships a Nix flake (flake.nix) that uses Crane to build the Rust workspace reproducibly. On any machine that has Nix installed:
nix build .#http-server   # produces ./result/bin/http-server
nix run .#backend         # builds and runs with env validation
nix develop               # drops into a shell with cargo, sqlx-cli, node22
That binary is perfect for local development and for anyone who also runs Nix. The catch:
A Nix-built binary has its interpreter and shared library paths hard-coded to /nix/store/<hash>/.... On a stock Ubuntu host with no Nix, that interpreter does not exist, so the kernel returns ENOENT when systemd tries to exec the binary. systemd reports it as status=203/EXEC and the service never starts.
Two ways out:
- Install Nix on the EC2 host and copy the Nix store closure too, which adds a moving part.
- Build the production binary with plain cargo build --release on an Ubuntu runner, so its dynamic loader is /lib64/ld-linux-x86-64.so.2 and its libssl is the system libssl3.
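We picked the second option. The GitHub Actions runner is ubuntu-latest, which currently matches the deploy host closely enough that the binary just works. Because ubuntu-latest moves to a newer release over time, pinning the runner to the host's release (ubuntu-22.04) is the safer setting if glibc or libssl versions ever drift apart. We treat the Nix flake as a developer convenience and the cargo build as the release artifact.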
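To make the failure mode concrete, here is a minimal sketch of how one might inspect a binary's ELF interpreter before shipping it. The helper names (elf_interpreter, is_nix_linked, check_binary) are hypothetical and not part of the repo, and the parsing assumes the usual `file` output in which the interpreter path follows the word "interpreter" and ends at a comma.

```shell
# Hypothetical helpers, not part of the repo.

# Read `file` output on stdin and print the ELF interpreter path, if any.
# Assumes the common format: "... dynamically linked, interpreter <path>, ..."
elf_interpreter() {
  sed -n 's/.*interpreter \([^,]*\),.*/\1/p'
}

# Succeed when the given interpreter path points into the Nix store.
is_nix_linked() {
  case $1 in
    /nix/store/*) return 0 ;;
    *)            return 1 ;;
  esac
}

# Print a one-line verdict for a binary on disk (requires `file`).
check_binary() {
  interp=$(file "$1" | elf_interpreter)
  if is_nix_linked "$interp"; then
    echo "nix-linked: $interp"
    return 1
  fi
  echo "ok: ${interp:-no interpreter reported}"
}
```

Running check_binary ./result/bin/http-server against a Nix build should report a /nix/store path, while the cargo build on Ubuntu should report the system loader.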
3. The Nix flake, line by line
flake.nix in full:
{
  description = "fake-email backend (HTTP + SMTP + Postgres)";
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    flake-utils.url = "github:numtide/flake-utils";
    crane.url = "github:ipetkov/crane";
  };
  outputs = { self, nixpkgs, flake-utils, crane, ... }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = nixpkgs.legacyPackages.${system};
        inherit (pkgs) lib;
        craneLib = crane.mkLib pkgs;
        sqlFilter = path: _type: builtins.match ".*\\.sql$" path != null;
        rustSrc = lib.cleanSourceWith {
          src = lib.cleanSource ./.;
          filter = path: type:
            (sqlFilter path type) || (craneLib.filterCargoSources path type);
        };
        commonArgs = {
          src = rustSrc;
          strictDeps = true;
          nativeBuildInputs = with pkgs; [ pkg-config ];
          buildInputs = with pkgs; [ openssl ]
            ++ lib.optionals stdenv.isDarwin [ libiconv ];
          doCheck = false;
        };
        cargoArtifacts = craneLib.buildDepsOnly (commonArgs // {
          pname = "fake-email-workspace-deps";
          version = "0.1.0";
          cargoExtraArgs = "--workspace";
        });
        http-server = craneLib.buildPackage (commonArgs // {
          pname = "http-server";
          version = "0.1.0";
          inherit cargoArtifacts;
          cargoExtraArgs = "-p http-server";
        });
        ...
      in { ... });
}
What is doing real work here
- Crane (crane.mkLib): a Nix library for building Rust workspaces. It splits dependency builds from first-party builds so the dependency build can be cached aggressively.
- rustSrc filter: only include files that affect the build. SQL migrations get a custom matcher because Crane's default filter does not see .sql files and the build would otherwise miss migrations/*.
- cargoArtifacts = buildDepsOnly { ... }: build the workspace dependency graph in one derivation. When a first-party source file changes, Nix reuses this derivation; when Cargo.lock changes, it rebuilds.
- http-server = buildPackage { inherit cargoArtifacts; }: build only -p http-server, importing the cached dependency build.
- strictDeps = true: refuse to let build-time and runtime dependencies leak into each other. Catches a class of accidental dynamic-linker bugs.
- doCheck = false: tests run in CI, not at build time, so the package output is decoupled from test flakiness.
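The flake also exposes a devShell with cargo, rustc, rustfmt, clippy, sqlx-cli, and nodejs_22. A new contributor on macOS or Linux runs nix develop and gets the same set of tools as everyone else on the team. CI itself builds with the stable toolchain from dtolnay/rust-toolchain, so the compiler version there can differ slightly from the one nixpkgs pins.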
4. The GitHub Actions workflow
.github/workflows/nix-backend.yml has two jobs: build (on every push and PR) and deploy (only on a push to main). UI changes are ignored — Vercel handles those.
The build job
build:
  runs-on: ubuntu-latest
  timeout-minutes: 45
  steps:
    - uses: actions/checkout@v4
    - run: sudo apt-get update -qq && sudo apt-get install -y -qq pkg-config libssl-dev
    - uses: dtolnay/rust-toolchain@stable
    - uses: Swatinem/rust-cache@v2
      with:
        workspaces: "."
    - run: cargo build --release -p http-server --locked
    - name: Verify ELF (no Nix interpreter)
      run: |
        set -euo pipefail
        file ./target/release/http-server
        if file ./target/release/http-server | grep -q /nix/store; then exit 1; fi
        ldd ./target/release/http-server | head -20
    - uses: actions/upload-artifact@v4
      with:
        name: http-server
        path: ./target/release/http-server
        retention-days: 7
Notes worth stealing:
- --locked on cargo build ensures the build uses the committed Cargo.lock. No silent dependency upgrades on the deploy path.
- Swatinem/rust-cache@v2 caches the target/ directory and registry, cutting CI from ~25 min to ~3 min on warm builds.
- The ELF verify step is the safety belt against the Nix mistake. If anyone ever swaps cargo build for nix build and uploads the result, the grep /nix/store trips and the deploy fails before the binary ever leaves CI.
- Artifact upload hands the binary to the deploy job. Keeping build and deploy separate means a hot fix can be re-deployed by re-running the deploy job without rebuilding.
The deploy job
deploy:
  needs: build
  if: github.ref == 'refs/heads/main' && github.event_name == 'push'
  runs-on: ubuntu-latest
  timeout-minutes: 15
  concurrency:
    group: deploy-production
    cancel-in-progress: false
  steps:
    - uses: actions/download-artifact@v4
      with: { name: http-server, path: artifact }
    - run: |
        set -euxo pipefail
        SRC="$(find artifact -type f -name http-server | head -1)"
        test -n "$SRC"
        cp "$SRC" ./http-server && chmod +x ./http-server
        file ./http-server
    - uses: appleboy/scp-action@v0.1.7
      with:
        host: ${{ secrets.EC2_HOST }}
        username: ${{ secrets.EC2_USER }}
        key: ${{ secrets.EC2_SSH_KEY }}
        source: http-server
        target: /tmp/
    - uses: appleboy/ssh-action@v1
      with:
        host: ${{ secrets.EC2_HOST }}
        username: ${{ secrets.EC2_USER }}
        key: ${{ secrets.EC2_SSH_KEY }}
        script: |
          set -euo pipefail
          ...
Three things to call out:
- concurrency.group: deploy-production with cancel-in-progress: false means only one deploy runs at a time, and a new push waits rather than cancelling the in-flight deploy. (GitHub keeps at most one pending run per group, so a third push replaces the one that is waiting.) This avoids the worst case where a half-shipped binary gets replaced mid-restart.
- if: github.ref == 'refs/heads/main' means PRs build but do not deploy. Manual force-deploys go through "Re-run failed jobs" in the Actions UI or a tiny shell command.
- Two separate actions for scp and ssh. appleboy/scp-action ships the file; appleboy/ssh-action runs the install script. A single all-in-one action would mix concerns and is harder to debug when one half fails.
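For the manual path, a sketch of what such a tiny shell command could look like is below. It mirrors the two workflow steps (copy, then run the install script). The function names and the install script path deploy/remote-install.sh are assumptions made for illustration; the post does not show the real command. Setting DRY_RUN=1 only prints what would run.

```shell
# Hypothetical manual-deploy helper; names and the script path are assumptions.

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: manual_deploy BINARY USER HOST
manual_deploy() {
  binary=$1
  user=$2
  host=$3
  script=${INSTALL_SCRIPT:-deploy/remote-install.sh}   # assumed location
  # Step 1: copy the binary and the install script to /tmp on the host.
  # Step 2: run the install script there over ssh.
  run scp "$binary" "$script" "$user@$host:/tmp/" &&
    run ssh "$user@$host" "bash /tmp/$(basename "$script")"
}
```

A dry run such as DRY_RUN=1 manual_deploy ./http-server ubuntu 203.0.113.10 prints the scp and ssh invocations without touching the host.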
5. SCP + SSH: the actual ship
The remote script on EC2 does the careful work:
set -euo pipefail
test -s /tmp/http-server
if file /tmp/http-server | grep -q '/nix/store'; then
  echo "Refusing to install: Nix-linked binary." >&2
  exit 1
fi
sudo mkdir -p /opt/fake-email/bin
sudo systemctl stop fake-email 2>/dev/null || true
sudo install -m 0755 -o root -g root /tmp/http-server \
  /opt/fake-email/bin/http-server.new
sudo mv /opt/fake-email/bin/http-server.new \
  /opt/fake-email/bin/http-server
sudo systemctl daemon-reload || true
sudo systemctl start fake-email
# Poll for roughly 30 s. -i because the endpoint body is {"status":"ok"} (lowercase).
for i in {1..30}; do
  curl -sf --max-time 2 http://127.0.0.1:3001/api/health \
    | grep -qi ok && break || sleep 1
done
# Check the unit and the endpoint together so diagnostics print in both cases
# (a bare is-active under `set -e` would exit before the logs are shown).
if ! sudo systemctl is-active --quiet fake-email \
  || ! curl -sf --max-time 5 http://127.0.0.1:3001/api/health | grep -qi ok; then
  sudo systemctl status fake-email --no-pager -l || true
  sudo journalctl -u fake-email -n 80 --no-pager || true
  file /opt/fake-email/bin/http-server || true
  sudo ldd /opt/fake-email/bin/http-server || true
  exit 1
fi
rm -f /tmp/http-server
Tactics worth highlighting:
- Belt-and-suspenders Nix check on the remote host: even if someone bypasses CI and copies a binary over by hand, the install step refuses Nix-linked output.
- Atomic swap with install + mv. The new binary lands at http-server.new, then the filesystem rename atomically replaces the running path. systemd has already stopped the service, so there is no window where two processes contend for port 25.
- 30-second health-check loop against /api/health. The deploy fails loudly with journalctl output if the binary will not come up. That output lands directly in the GitHub Actions log.
- Final cleanup of /tmp/http-server so the next deploy starts from a clean slate.
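The polling loop in the script can also be written as a small reusable function. The sketch below is one way to do it, not the repo's code: wait_until and health_ok are hypothetical names, and health_ok only checks that curl succeeds (with -f it fails on HTTP 400 and above) rather than inspecting the body.

```shell
# Hypothetical helper: rerun a probe until it succeeds or attempts run out.
# Usage: wait_until MAX_TRIES DELAY_SECONDS CMD [ARGS...]
wait_until() {
  max_tries=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_tries" ]; do
    if "$@"; then
      return 0              # probe passed; stop waiting
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1                  # probe never passed within MAX_TRIES attempts
}

# Probe for the local health endpoint described in this post.
health_ok() {
  curl -sf --max-time 2 http://127.0.0.1:3001/api/health >/dev/null
}
```

On the host this would be called as wait_until 30 1 health_ok, with the diagnostics block running when it returns non-zero.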
6. systemd unit and hardening
deploy/fake-email.service is the supervisor. It is tiny but hardened:
[Unit]
Description=fake-email backend (HTTP + SMTP)
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
Type=simple
User=fake-email
Group=fake-email
WorkingDirectory=/opt/fake-email
EnvironmentFile=/etc/fake-email/env
ExecStart=/opt/fake-email/bin/http-server
Restart=always
RestartSec=2
TimeoutStartSec=30
AmbientCapabilities=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
NoNewPrivileges=true
ProtectSystem=true
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictSUIDSGID=true
RestrictNamespaces=true
LockPersonality=true
RestrictRealtime=true

[Install]
WantedBy=multi-user.target
Why each block matters
- AmbientCapabilities=CAP_NET_BIND_SERVICE: required to bind port 25 as the non-root fake-email user. Without this you would have to run the binary as root or push the SMTP port to a high number behind an iptables redirect.
- ProtectSystem=true: mounts /usr, /boot, and /efi read-only. We deliberately chose true over strict because strict can break the dynamic linker setup on some minimal AMIs.
- ProtectHome=true, PrivateTmp=true, PrivateDevices=true: the service cannot see users' home directories, gets its own /tmp namespace, and sees only pseudo-devices such as /dev/null and /dev/random rather than physical device nodes.
- NoNewPrivileges=true + RestrictSUIDSGID=true + LockPersonality=true: the process cannot gain privileges through setuid binaries, cannot create setuid or setgid files, and cannot change its personality (execution domain) flags.
- Restart=always with StartLimitBurst=5: recover from transient crashes but stop after 5 failed starts within 60 seconds so we are not in a hot loop.
The environment file /etc/fake-email/env contains the secrets (DATABASE_URL) and runtime config (DOMAIN, HTTP_PORT, SMTP_PORT, CORS_ALLOWED_ORIGINS). It is chmod 600 and owned by the fake-email user, so only systemd (running as root) and the service itself can read it.
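Since a missing variable only shows up when the service starts, it can help to check the env file first. The sketch below assumes the file uses plain KEY=VALUE lines, as systemd's EnvironmentFile expects; missing_env_keys is a hypothetical helper, not something the repo ships.

```shell
# Hypothetical check: print every required key that the env file does not define.
# Usage: missing_env_keys FILE KEY [KEY...]
missing_env_keys() {
  file=$1
  shift
  for key in "$@"; do
    # A key counts as defined when a line starts with KEY= and has a non-empty value.
    grep -Eq "^${key}=.+" "$file" || echo "$key"
  done
}
```

For example, sudo sh -c with missing_env_keys /etc/fake-email/env DATABASE_URL DOMAIN HTTP_PORT SMTP_PORT CORS_ALLOWED_ORIGINS prints nothing when the file is complete. The permissions can be checked with sudo stat -c '%a %U' /etc/fake-email/env, which should print 600 fake-email for the setup described above.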
7. setup.sh: bootstrap an EC2 host
deploy/setup.sh turns a fresh Ubuntu 22.04 instance into a working host with one command:
DATABASE_URL='postgres://user:pass@host/db' \
VERCEL_ORIGIN='https://your-app.vercel.app' \
./deploy/setup.sh
The script is a series of checks-then-actions:
- Preflight: refuse to run as root, refuse to run on non-Ubuntu, refuse without passwordless sudo. Loud, early.
- Packages: install curl, jq, netcat, ca-certificates, and crucially libssl3 (the dynamic OpenSSL the cargo binary expects).
- DB reachability: TCP-poke the Postgres host on 5432 so we know networking is correct before continuing.
- Bin dir, user, env file: create /opt/fake-email/bin, a system fake-email user with a nologin shell, and write /etc/fake-email/env with mode 600.
- systemd install: copy the unit file, daemon-reload, enable.
- Caddy install: add the Cloudsmith repo, install Caddy, write a minimal Caddyfile (api.fake-email.site { reverse_proxy localhost:3001 }). Caddy obtains its own Let's Encrypt certificate on first start.
- Firewall + start: UFW opens 22, 25, 80, 443; restart fake-email if a binary is already present; restart Caddy.
- Health summary: poll /api/health, check the SMTP banner with nc 127.0.0.1 25, print pass/fail counts.
One file, no Ansible, no Terraform. For a one-box service this is the right scale. If we ever need a second box, this script becomes an Ansible playbook within an afternoon.
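As an illustration of the DB reachability step, here is a rough sketch of how a script could pull the host and port out of DATABASE_URL and poke the port. It is not the code from setup.sh: the function names are hypothetical, the default port of 5432 follows the step described above, and the parser assumes a simple postgres://user:pass@host[:port]/db URL without IPv6 literals.

```shell
# Hypothetical helpers for a TCP reachability check.

# Print "HOST PORT" for a postgres:// URL, defaulting the port to 5432.
db_host_port() {
  url=$1
  rest=${url#*://}            # strip the scheme
  rest=${rest#*@}             # strip user:pass@ when present
  hostport=${rest%%/*}        # drop /dbname
  hostport=${hostport%%\?*}   # drop ?query when there is no path
  host=${hostport%%:*}
  case $hostport in
    *:*) port=${hostport##*:} ;;
    *)   port=5432 ;;
  esac
  printf '%s %s\n' "$host" "$port"
}

# Succeed when a TCP connection to the database opens within 5 seconds.
db_reachable() {
  set -- $(db_host_port "$1")   # split "HOST PORT" into $1 and $2
  nc -z -w 5 "$1" "$2"
}
```

A preflight could then call db_reachable "$DATABASE_URL" and abort with a clear message before touching systemd or Caddy.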
8. Caddy, DNS, and port 25
The DNS picture:
A   api.fake-email.site   → <EC2 elastic IP>
A   mail.fake-email.site  → <EC2 elastic IP>
MX  fake-email.site       → mail.fake-email.site (priority 10)
Caddy on :443 auto-provisions a TLS cert for api.fake-email.site via the ACME HTTP-01 challenge (Caddy serves the challenge on :80 itself), then reverse-proxies HTTP to 127.0.0.1:3001. The whole Caddyfile is a single site block with one reverse_proxy directive.
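To show how small that configuration is, the sketch below renders it from a domain and a port. render_caddyfile is a hypothetical helper rather than the code in setup.sh; the output uses the standard Caddyfile site-block layout.

```shell
# Hypothetical helper: print a one-site Caddyfile that proxies to a local port.
# Usage: render_caddyfile DOMAIN UPSTREAM_PORT
render_caddyfile() {
  domain=$1
  upstream_port=$2
  printf '%s {\n\treverse_proxy localhost:%s\n}\n' "$domain" "$upstream_port"
}
```

Piping render_caddyfile api.fake-email.site 3001 into sudo tee /etc/caddy/Caddyfile and then running caddy validate --config /etc/caddy/Caddyfile is one way to write and check the file before restarting Caddy.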
Port 25 is the part everybody trips on. Two things to know:
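- AWS blocks port 25 outbound on new accounts. You have to file a request with AWS to lift the restriction. (Inbound 25 is fine.)
- Reverse DNS (PTR) on the EC2 Elastic IP should point to mail.fake-email.site. This matters mainly when the box sends mail, because receiving servers check the PTR of the connecting IP; for a receive-only host it is good hygiene rather than a requirement. AWS sets this via a support request too, and the Elastic IP settings may also let you set it yourself.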
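A quick way to confirm the PTR record is to compare the reverse lookup with the expected hostname. The sketch below is an illustration under assumptions: ptr_matches and check_ptr are hypothetical names, the comparison is case-sensitive, and only the first PTR answer is considered.

```shell
# Hypothetical helpers for checking reverse DNS.

# Compare a PTR answer with the expected hostname.
# DNS tools print fully qualified names with a trailing dot, so strip it first.
ptr_matches() {
  answer=${1%.}
  want=${2%.}
  [ "$answer" = "$want" ]
}

# Look up the PTR record for an IP and compare it (requires `dig`).
# Usage: check_ptr IP EXPECTED_HOSTNAME
check_ptr() {
  ptr=$(dig +short -x "$1" | head -n 1)
  ptr_matches "$ptr" "$2"
}
```

For this setup the call would be check_ptr with the Elastic IP and mail.fake-email.site; a default EC2 name in the answer means the PTR has not been changed yet.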
9. Health check loop and rollback
The CI deploy script polls http://127.0.0.1:3001/api/health for up to 30 seconds after starting systemd. The endpoint is a one-liner in the Rust binary: it returns {"status":"ok"} if the DB pool is healthy and a 5xx otherwise. Three failure shapes get distinct output in the deploy log:
- Binary fails to exec: systemctl status shows 203/EXEC, ldd reveals the missing library or /nix/store reference.
- Binary starts, can't reach Postgres: journalctl shows the connection error, the health endpoint keeps returning a 5xx.
- Binary starts and connects, but bind fails on :25: usually because we removed the CAP_NET_BIND_SERVICE capability or the previous process is still holding the socket.
Rollback is currently manual: redeploy the previous green commit. If we needed faster rollback we would keep the previous binary at /opt/fake-email/bin/http-server.prev and add a two-line revert script. We have not needed it.
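The faster rollback mentioned above could look roughly like the sketch below. It is hypothetical, since the post says this has not been built: the function names are made up, BIN_DIR defaults to the path used in this post, and on the real host the file operations would need sudo plus a service restart afterwards.

```shell
# Hypothetical rollback sketch; not part of the repo.
BIN_DIR=${BIN_DIR:-/opt/fake-email/bin}

# Install a new binary, keeping the current one as http-server.prev.
install_with_backup() {
  new_binary=$1
  if [ -f "$BIN_DIR/http-server" ]; then
    cp -p "$BIN_DIR/http-server" "$BIN_DIR/http-server.prev"
  fi
  install -m 0755 "$new_binary" "$BIN_DIR/http-server.new"
  mv "$BIN_DIR/http-server.new" "$BIN_DIR/http-server"
}

# Put the previous binary back. The caller restarts the service afterwards.
rollback() {
  [ -f "$BIN_DIR/http-server.prev" ] || return 1
  mv "$BIN_DIR/http-server.prev" "$BIN_DIR/http-server"
}
```

With this in place, a failed health check could call rollback and then systemctl restart fake-email instead of waiting for a full redeploy of the previous commit.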
10. Secrets and ssh keys
Three repo secrets live in GitHub:
- EC2_HOST: the public hostname or IP.
- EC2_USER: typically ubuntu.
- EC2_SSH_KEY: a deploy-only private key. Not your personal key.
Generate the deploy key locally:
ssh-keygen -t ed25519 -f deploy_ed25519 -C 'github-actions-deploy'
# add deploy_ed25519.pub to ~/.ssh/authorized_keys on EC2
# copy deploy_ed25519 contents into GitHub secret EC2_SSH_KEY
# (and delete the local files)
shred -u deploy_ed25519 deploy_ed25519.pub
Lock the EC2 sshd to key-only authentication (PasswordAuthentication no in sshd_config) and IPv4-only if possible. Rotate the deploy key annually, or immediately if you ever pasted it into a tool you don't trust.
11. Pitfalls we already hit
Nix binary in production (status=203/EXEC)
First deploy. Forgot, ran nix build .#http-server locally and scp'd ./result/bin/http-server. systemd printed 203/EXEC with no other info because the kernel could not find /nix/store/<hash>/lib/ld-linux-x86-64.so.2. Fix: build with cargo build --release on Ubuntu and ship that. The ELF verify step in CI now prevents a repeat.
libssl version skew
Built on Ubuntu 22.04 runner (libssl3); the EC2 was on Ubuntu 20.04 (libssl1.1). The binary failed with libssl.so.3 not found. Fix: pin the EC2 to 22.04 and install libssl3 in setup.sh so the runtime always matches the runner.
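This kind of skew is easy to spot from ldd output, which marks unresolved libraries with "not found". The helper below is a small sketch with a hypothetical name (missing_libs); it only filters the text that ldd prints.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the libraries
# that the dynamic loader could not resolve.
missing_libs() {
  awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

On the host, ldd /opt/fake-email/bin/http-server | missing_libs prints nothing when every shared library resolves, and would have listed libssl.so.3 in the situation described above.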
Health probe was too short
First version checked once after 1 second. Cold-start of the Rust binary + sqlx pool prime can take 3–5 seconds, so the deploy falsely failed on the first push of the day. Fix: 30-iteration loop with 1s sleeps.
Concurrent deploys clobbering each other
Two quick pushes in a row started two deploys; the second stopped systemd while the first was still polling. Fix: concurrency.group: deploy-production with cancel-in-progress: false.
AWS port 25 throttle
Brand-new AWS accounts cap port 25 outbound. We are receive-only, so this did not bite us directly — but if you ever want to send (forwarding, bounces), file the request the day you spin up the account, not the day you ship.
12. Run it yourself
Everything described here is in the public repo. To replicate the stack on your own EC2 box:
# on EC2 (Ubuntu 22.04)
git clone https://github.com/Shivrajsoni/fake-email
cd fake-email
DATABASE_URL='postgres://...' \
VERCEL_ORIGIN='https://your-app.vercel.app' \
./deploy/setup.sh
# in GitHub repo settings
# add secrets: EC2_HOST, EC2_USER, EC2_SSH_KEY
# locally
git push origin main
# CI builds + deploys; you watch the green check on GitHub