GoDoxy

Benchmarks

Benchmark and resource usage of GoDoxy

TL;DR

All rows below come from one full make benchmark run (four proxies) on 2026-05-03 (Profile: smoke). They are useful for relative comparison on this machine and profile, not for universal rankings.

  • Reused HTTP/1.1: Nginx ~72k req/s; Traefik ~42.5k; GoDoxy ~41.4k; Caddy ~34.6k.
  • Reused HTTP/2 (h2load -m16): Nginx ~85k req/s; GoDoxy ~39.4k; Caddy ~31.3k; Traefik ~19.6k — this run reported 0 h2load failures and 0 HTTP 5xx for every proxy in the summary lines.
  • Reused HTTP/3 (h3bench): Nginx ~31.5k req/s; GoDoxy ~21.0k; Traefik ~17.5k; Caddy ~15.9k (all Failed: 0 in the h3bench summaries).
  • Fresh connection tests used 2000 requests with 32 concurrent connections (h2load --h1 -m1 / h2load -m1 / h3bench -m1). Every proxy completed 2000 / 2000 for HTTP/1.1, HTTP/2, and HTTP/3 in this run.
  • HTTP/1.1 cleartext on ports 8080–8083 was disabled (H1C=0); enable with --h1c or H1C=1 for plaintext H1 comparable to TLS H1.

Benchmark setup

Source code:

Run all benchmark modes:

make benchmark

From the repo root, ./scripts/benchmark.sh --help prints the full option list (same as the usage block in scripts/benchmark.sh). Common invocations:

# Profiles: smoke (default), stable, stress — CLI or BENCH_PROFILE=…
./scripts/benchmark.sh --profile stable
./scripts/benchmark.sh --smoke    # alias: --profile smoke
./scripts/benchmark.sh --stable   # alias: --profile stable
./scripts/benchmark.sh --stress   # alias: --profile stress

# Protocol toggles (defaults: H1/H2/H3 on, H1C off)
./scripts/benchmark.sh --no-h1
./scripts/benchmark.sh --no-h2
./scripts/benchmark.sh --no-h3
./scripts/benchmark.sh --h1c            # cleartext HTTP/1.1 on :8080–:8083
./scripts/benchmark.sh --no-h1c

# Connection mode (default both; env CONNECTION_MODE=reused|fresh|both)
./scripts/benchmark.sh --reused       # duration-based reused connections only
./scripts/benchmark.sh --fresh        # fixed-request fresh connections only
./scripts/benchmark.sh --both

# Make forwards extra CLI args verbatim
make benchmark args="--stable --no-h1 --no-h3"

Environment variables (defaults come from the active profile unless you override):

BENCH_PROFILE=smoke|stable|stress   # CLI --profile/--smoke/--stable/--stress overrides
H1=0|1 H2=0|1 H3=0|1 H1C=0|1
CONNECTION_MODE=both|reused|fresh
TARGET=GoDoxy|Traefik|Caddy|Nginx    # case-insensitive; omit for all four

DURATION=10s
THREADS=4
CONNECTIONS=32
STREAMS=16                           # concurrent streams per H2/H3 connection
REQUESTS=2000                       # requests per protocol in fresh mode
FRESH_CONNECTIONS=32                 # concurrent connections for fresh runs (default: CONNECTIONS)

H2LOAD_WARM_UP_TIME=3s              # reused-mode warm-up; 0 disables
H2LOAD_DURATION=                   # optional h2load duration token (derived from DURATION if unset)
H3_TOOL=auto|h2load|h3bench
LATENCY_SAMPLES=5 RUNS=1 REPEAT_DELAY=1

UPLOAD_BODY_BYTES=262144

BENCH_COMPOSE_FILE=dev.compose.yml
BENCH_COMPOSE_MANAGE=0|1 BENCH_COMPOSE_CLEANUP=0|1 BENCH_COMPOSE_RECREATE=0|1

HOST=bench.domain.com               # hostname for SNI and bench URLs

Built-in profile presets (from the script):

Profile   Duration   Connections   Streams   Fresh requests   Fresh connections   Throughput runs   Latency samples
smoke     10s        32            16        2000             32                  1                 5
stable    30s        64            16        20000            64                  5                 25
stress    30s        100           100       50000            100                 3                 10

All three profiles use 4 h2load threads and set a non-zero warm-up before reused throughput (3s for smoke, 5s for stable and stress). When RUNS is greater than 1 (the default for stable and stress, or any RUNS value you set yourself), the script repeats each throughput benchmark and prints a median and coefficient-of-variation summary. Override profile defaults with the env vars above.
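To make the median and coefficient-of-variation summary concrete, here is a minimal sketch of how such a summary can be computed from repeated req/s samples. It is not taken from scripts/benchmark.sh: the function name summarize_runs and the sample figures are hypothetical, and it assumes the population standard deviation, which may differ from what the script uses.

```shell
#!/bin/sh
# summarize_runs: print the median and coefficient of variation
# (CV = stddev / mean, as a percentage) of the req/s samples given as arguments.
# Hypothetical helper for illustration; not part of scripts/benchmark.sh.
summarize_runs() {
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" | sort -n | awk '
    { v[NR] = $1; sum += $1 }
    END {
      mean = sum / NR
      # Median: the middle value, or the mean of the two middle values.
      if (NR % 2) median = v[(NR + 1) / 2]
      else        median = (v[NR / 2] + v[NR / 2 + 1]) / 2
      # Population standard deviation (an assumption made for this sketch).
      for (i = 1; i <= NR; i++) ss += (v[i] - mean) ^ 2
      printf "median=%.0f cv=%.2f%%\n", median, sqrt(ss / NR) / mean * 100
    }'
}

# Made-up samples from three throughput runs:
summarize_runs 41000 39000 40000   # prints: median=40000 cv=2.04%
```

A low CV means the repeated runs agree closely, so the median is a reasonable single number to quote. A high CV suggests the runs are too noisy to rank proxies from.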

What gets tested

The benchmark uses a Go upstream server that returns a 4096-byte response body. The script starts four reverse proxies and tests each through TLS listeners:

Proxy     HTTPS URL
GoDoxy    https://bench.domain.com:8440/
Traefik   https://bench.domain.com:8441/
Caddy     https://bench.domain.com:8442/
Nginx     https://bench.domain.com:8443/

The benchmark preserves bench.domain.com as SNI and :authority, then connects to loopback with:

  • h2load --connect-to=127.0.0.1:<port> for throughput tests
  • curl --resolve for probe-style checks
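The loopback pinning can be tried by hand with curl. The sketch below only prints the command instead of running it. The helper names proxy_port and probe_cmd and the BENCH_HOST variable are hypothetical, the ports come from the table above, and -k assumes the benchmark certificate is not trusted by the local store.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of scripts/benchmark.sh.
BENCH_HOST=bench.domain.com

# proxy_port: map a proxy name (case-insensitive) to its TLS port.
proxy_port() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    godoxy)  echo 8440 ;;
    traefik) echo 8441 ;;
    caddy)   echo 8442 ;;
    nginx)   echo 8443 ;;
    *) echo "unknown proxy: $1" >&2; return 1 ;;
  esac
}

# probe_cmd: print (do not run) a curl command that keeps BENCH_HOST as the
# SNI and Host/:authority value while connecting to 127.0.0.1.
# -k skips certificate verification (assumption: the bench cert is untrusted).
probe_cmd() {
  port=$(proxy_port "$1") || return 1
  printf 'curl -sk -o /dev/null -w "%%{time_starttransfer}\\n" --resolve %s:%s:127.0.0.1 https://%s:%s/\n' \
    "$BENCH_HOST" "$port" "$BENCH_HOST" "$port"
}

probe_cmd GoDoxy
```

curl reports time_starttransfer in seconds, so multiply by 1000 to compare with millisecond TTFB figures.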

HTTP/1.1 and HTTP/2 run over HTTPS. HTTP/2 is real TLS/ALPN H2, not h2c. The compose stack also exposes cleartext HTTP on 8080–8083 (http://bench.domain.com:<port>/) for optional HTTP/1.1 plaintext baseline runs (--h1c / H1C=1); when enabled, the script exercises the same h2load --h1 -m1 scenarios without TLS. The stack also exposes UDP listeners for HTTP/3.

HTTP/3 defaults to H3_TOOL=auto: use h2load --h3 when available, otherwise build and run the bundled cmd/h3bench client. With h3bench, reused throughput uses a fixed-duration run; fresh one-request-per-connection uses h3bench -m1 (as in this capture). Force h3bench explicitly with:

H3_TOOL=h3bench make benchmark

Disable HTTP/3 with either:

H3=0 make benchmark
./scripts/benchmark.sh --no-h3

Run details

Setting                 Value
Date                    2026-05-03
Profile                 smoke
Target hostname         bench.domain.com
Duration                10s
h2load warm-up          3s
h2load threads          4
TLS connections         32
HTTP/2 streams          16 (h2load -m16)
HTTP/3 streams/conn     16
Fresh requests          2000
Fresh concurrency       32
Upload probe size       256 KiB
Latency probe samples   5 per scenario
HTTP/3 tool             h3bench

HTTP/1.1 cleartext baseline

Cleartext HTTP/1.1 reuses the same h2load reused- and fresh-connection profiles as TLS H1, but targets http://bench.domain.com:808x/ with --connect-to to loopback. This run did not enable it, so there are no plaintext req/s or latency rows to publish here. Re-run with:

./scripts/benchmark.sh --h1c
# or
H1C=1 make benchmark

Console markers look like [HTTP/1.1 cleartext reused] h2load --h1 -m1 and [HTTP/1.1 cleartext fresh].

Reused TLS throughput

Persistent/reused connection tests show steady-state proxy throughput after connection setup.

  • H1/H2: h2load; throughput in req/s; transfer in MB/s; latency is mean time for request.
  • H3: h3bench; transfer in MiB/s; latency columns are percentiles plus average.
  • Done: completed requests in the timed window (includes warm-up / elapsed semantics from the script).
  • Fail / 5xx: h2load failures and HTTP 5xx responses.

Proxy    H1 req/s  H1 done  H1 MB/s  H1 latency  H2 req/s  H2 done  H2 fail  H2 5xx  H2 MB/s  H2 latency  H3 req/s  H3 done  H3 MiB/s  H3 p50   H3 p90   H3 avg   H3 p99
Nginx    71,969    719,689  294.4    0.44 ms     84,965    849,646  0        0       340.9    5.72 ms     31,504    315,708  123.1     12.1 ms  31.3 ms  16.2 ms  77.8 ms
GoDoxy   41,356    413,557  167.5    0.77 ms     39,377    393,768  0        0       154.7    12.55 ms    20,961    209,965  81.9      21.4 ms  42.6 ms  24.3 ms  76.1 ms
Traefik  42,481    424,806  172.1    0.75 ms     19,649    196,493  0        0       77.2     25.78 ms    17,504    175,546  68.4      25.4 ms  51.6 ms  29.1 ms  87.4 ms
Caddy    34,598    345,977  140.6    0.92 ms     31,310    313,104  0        0       123.0    15.99 ms    15,852    159,011  61.9      27.5 ms  57.0 ms  32.1 ms  108 ms

Notes:

  • Compare HTTP/3 only across HTTP/3 rows. h3bench and h2load report different latency and transfer metrics.
  • HTTP/2 mean latency in the H1/H2 columns is h2load mean time for request; under multiplexed load it reflects concurrent scheduling, not single-request latency.
  • h2load sometimes prints slightly different totals in its “done/succeeded” counts and in its "status codes:" summary line; the Fail and 5xx columns follow the explicit failed and 5xx fields from the tool output in this capture.
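Since these figures are meant for relative comparison on one machine, it can help to express each proxy as a share of the fastest one in the same column. A minimal sketch, with relative_rps as a hypothetical helper name and the reused HTTP/2 req/s values copied from the table above:

```shell
#!/bin/sh
# relative_rps: read "<proxy> <req/s>" lines on stdin and print each proxy's
# req/s as a percentage of the highest value in the input.
# Hypothetical helper for illustration; not part of scripts/benchmark.sh.
relative_rps() {
  awk '
    { name[NR] = $1; rps[NR] = $2; if ($2 > max) max = $2 }
    END {
      for (i = 1; i <= NR; i++)
        printf "%s %.1f%%\n", name[i], rps[i] / max * 100
    }'
}

# Reused HTTP/2 req/s from this run.
relative_rps <<'EOF'
Nginx 84965
GoDoxy 39377
Traefik 19649
Caddy 31310
EOF
```

Ratios like these only hold within one column of one run. Comparing across protocols mixes different tools and metrics.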

Fresh TLS throughput

Fresh connection tests use 2000 new connections with one request each (h2load --h1 -m1 for HTTP/1.1 and h2load -m1 for HTTP/2). HTTP/3 uses h3bench -m1 with the same 2000 requests / 32 connections profile. These numbers mostly reflect handshake, accept-loop, and short-lived QUIC or TCP connection behavior.

ok/fail shows succeeded requests versus failed attempts out of 2000 (all 2000 / 0 in this run).

Proxy    H1 req/s  H1 MB/s  H1 ok/fail  H1 done  H2 req/s  H2 MB/s  H2 ok/fail  H2 done  H3 req/s  H3 MiB/s  H3 ok/fail  H3 done
Nginx    32,789    134.15   2000 / 0    2,000    34,973    140.34   2000 / 0    2,000    17,068    66.67     2000 / 0    2,000
GoDoxy   23,358    94.58    2000 / 0    2,000    24,655    96.90    2000 / 0    2,000    15,353    59.97     2000 / 0    2,000
Traefik  25,014    101.31   2000 / 0    2,000    25,563    100.47   2000 / 0    2,000    13,344    52.13     2000 / 0    2,000
Caddy    25,259    102.67   2000 / 0    2,000    17,208    67.65    2000 / 0    2,000    10,665    41.66     2000 / 0    2,000
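
Because fresh mode issues a fixed number of requests, each req/s figure implies how long the measurement lasted. The sketch below assumes req/s is simply completed requests divided by elapsed time, which is an assumption about how the tools report it. fresh_elapsed_ms is a hypothetical helper name.

```shell
#!/bin/sh
# fresh_elapsed_ms: approximate wall time in milliseconds for N requests
# completed at a given req/s rate (assumes req/s = requests / elapsed).
# Hypothetical helper for illustration; not part of scripts/benchmark.sh.
fresh_elapsed_ms() {
  awk -v n="$1" -v rps="$2" 'BEGIN { printf "%.1f\n", n / rps * 1000 }'
}

fresh_elapsed_ms 2000 32789   # Nginx HTTP/1.1 fresh: prints 61.0
fresh_elapsed_ms 2000 10665   # Caddy HTTP/3 fresh:   prints 187.5
```

Under that assumption every fresh measurement in the smoke profile spans well under a second, so small differences between proxies here are likely within noise. The stable and stress profiles raise the request count.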

Latency probes

Probe tests send five samples per route and protocol. Values below are mean TTFB in milliseconds. Bodies or first SSE chunks are about 4 KiB.

Proxy    /json H1  /json H2  /json H3  /upload H1  /upload H2  /upload H3  /stream H1  /stream H2  /stream H3  /sse H1  /sse H2  /sse H3
Nginx    2.71      4.51      4.23      3.50        4.55        5.17        3.08        2.92        3.47        2.66     2.54     3.22
GoDoxy   2.30      3.33      4.88      2.14        3.19        4.77        2.58        2.51        2.29        3.84     2.70     2.26
Traefik  2.74      3.69      3.78      3.50        3.37        4.28        3.46        2.76        3.56        3.05     2.83     3.46
Caddy    3.38      3.44      4.89      4.75        2.98        5.56        3.55        3.03        2.95        2.98     2.62     3.22

HTTP/3 probe latency includes QUIC setup behavior from the probe client. Treat those values as probe behavior, not pure steady-state HTTP/3 request latency.
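With only five samples per cell, one slow sample moves the mean noticeably. The sketch below shows the effect with made-up TTFB values; mean_ttfb is a hypothetical helper and the samples are not taken from this run.

```shell
#!/bin/sh
# mean_ttfb: mean of the TTFB samples (ms) given as arguments, two decimals.
# Hypothetical helper for illustration; not part of scripts/benchmark.sh.
mean_ttfb() {
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

mean_ttfb 2.9 3.1 3.0 2.8 3.2    # five similar samples: prints 3.00
mean_ttfb 2.9 3.1 3.0 2.8 12.7   # one slow first dial:  prints 4.90
```

Four of the five samples are identical in both calls, yet the mean rises by almost 2 ms. Differences of that size between HTTP/3 probe cells may therefore come from a single sample.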

SSE probe

SSE probes use /sse?count=3&interval_ms=150. Each result is five samples. The table shows ok/failed plus mean TTFB in milliseconds.

Proxy    H1 ok/fail  H1 TTFB  H2 ok/fail  H2 TTFB  H3 ok/fail  H3 TTFB
Nginx    5 / 0       2.66     5 / 0       2.54     5 / 0       3.22
GoDoxy   5 / 0       3.84     5 / 0       2.70     5 / 0       2.26
Traefik  5 / 0       3.05     5 / 0       2.83     5 / 0       3.46
Caddy    5 / 0       2.98     5 / 0       2.62     5 / 0       3.22

All SSE samples succeeded. Nginx had the lowest mean HTTP/1.1 SSE TTFB among the four in this run; GoDoxy was lowest on HTTP/3 SSE means in this snapshot.

WebSocket probe

Proxy    OK / 5  Status  Mean TTFB
Nginx    5 / 5   101     2.25 ms
Traefik  5 / 5   101     4.85 ms
GoDoxy   5 / 5   101     3.88 ms
Caddy    5 / 5   101     6.64 ms

Interpretation

  • Reused HTTP/1.1: Nginx leads on req/s and mean time for request; Traefik edges GoDoxy slightly on this profile, with Caddy fourth.
  • Reused HTTP/2: Nginx leads by a wide margin on aggregate req/s with low reported mean time for request; GoDoxy and Caddy follow, then Traefik. This smoke profile did not show the high failure rates sometimes seen at much higher client/stream counts—re-run with heavier settings before inferring stability limits.
  • Reused HTTP/3: Nginx leads on req/s and transfer; GoDoxy, Traefik, and Caddy follow in that order here, with Caddy showing the highest p99 among the four.
  • Fresh connections: Nginx leads on H1/H2/H3 instantaneous throughput in this harness; GoDoxy is second on HTTP/3 fresh, then Traefik and Caddy.
  • Latency probes: H1/H2 stay in the low-millisecond band on most routes; a single slow QUIC dial or first sample can pull HTTP/3 /json averages up (visible on Caddy and GoDoxy in this run).
  • Plaintext HTTP/1.1 was not exercised in this capture; set H1C=1 to compare TLS vs cleartext H1 under the same script.

Resource usage

CONTAINER ID   NAME           CPU %     MEM USAGE / LIMIT     MEM %     NET I/O         BLOCK I/O   PIDS
c2437c342388   godoxy-proxy   0.89%     38.02MiB / 853.7MiB   4.45%     0B / 0B         0B / 0B     12
80ad84f07d31   socket-proxy   0.00%     2.156MiB / 853.7MiB   0.25%     15.7kB / 48kB   0B / 0B     5
