Simplifyd Cloud

Healthchecks

Configure startup, readiness, and liveness healthchecks to enable zero-downtime deployments and automatic recovery for Docker services.

Healthchecks tell Simplifyd Cloud how to determine whether your service is alive, ready to serve traffic, or still starting up. Configuring them is the primary way to achieve zero-downtime deployments and automatic restart on failure.

Healthchecks are only available on Docker services.

Check types

Simplifyd Cloud supports three healthcheck types, each serving a distinct purpose:

| Type | Purpose |
| --- | --- |
| Startup | Signals that the service has finished initialising. Liveness and readiness checks are disabled until this check succeeds. Use it for slow-starting services. |
| Readiness | Signals that the service is ready to receive traffic. Services that fail this check are temporarily removed from the load balancer; they are not restarted. |
| Liveness | Signals that the service is still running correctly. Services that repeatedly fail this check are restarted automatically. |

When to use each type

  • Startup — use when your service takes longer than a few seconds to become operational (e.g. JVM warmup, database migration on boot, loading a large ML model). Without a startup check, a slow-starting service may be killed by the liveness check before it finishes initialising.
  • Readiness — use to gate traffic. If your service needs a moment after startup before it can handle requests (e.g. connecting to a database, warming an in-memory cache), a readiness check ensures no requests are routed to it prematurely.
  • Liveness — use to detect deadlocks or infinite loops. A service that is running but stuck and not accepting connections will be detected and restarted.

You can configure all three checks on the same service. A typical pattern is: startup → readiness → liveness. The startup check gates the other two; once it has passed, the readiness and liveness checks run independently of each other.
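A service that uses all three checks can expose one endpoint per check. The sketch below uses Node's built-in http module; the paths /startup, /ready, and /livez, the `state` flags, and the function names are illustrative assumptions, not names required by Simplifyd Cloud.

```javascript
const http = require('node:http');

// Map a request path to the status code each check should see.
// `state` is a hypothetical object the service updates as it boots.
function statusFor(path, state) {
  switch (path) {
    case '/startup': return state.started ? 200 : 503; // startup check
    case '/ready':   return state.ready ? 200 : 503;   // readiness check
    case '/livez':   return 200;                       // liveness: the process can still answer
    default:         return 404;
  }
}

// Build (but do not start) an HTTP server that answers the three checks.
// Note: req.url is compared as-is, so query strings are not handled here.
function createHealthServer(state) {
  return http.createServer((req, res) => {
    res.statusCode = statusFor(req.url, state);
    res.end();
  });
}

// Example usage (commented out so the snippet does not keep a port open):
// const state = { started: false, ready: false };
// createHealthServer(state).listen(8080, () => {
//   state.started = true; // initialisation finished
//   state.ready = true;   // dependencies connected, caches warm
// });
```

Keeping the liveness endpoint free of dependency checks means a database outage takes the service out of rotation through readiness without triggering a restart.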

How healthchecks work

All healthchecks use an HTTP GET request. Simplifyd Cloud sends a GET to the configured path and port on the service. A response with an HTTP status code between 200 and 399 is a success; anything else — including no response or a refused connection — is a failure.
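The success rule above can be written as a small predicate. This is only a sketch for reasoning about which responses count as healthy, not the platform's implementation; the function name `isSuccessfulProbe` is hypothetical, and a missing response is modelled as `null`.

```javascript
// Hypothetical helper mirroring the rule described above:
// a status code from 200 to 399 is a success; anything else,
// including no response at all (modelled here as null), is a failure.
function isSuccessfulProbe(statusCode) {
  return Number.isInteger(statusCode) && statusCode >= 200 && statusCode <= 399;
}

console.log(isSuccessfulProbe(204));  // true
console.log(isSuccessfulProbe(302));  // true (redirects count as success)
console.log(isSuccessfulProbe(503));  // false
console.log(isSuccessfulProbe(null)); // false (timeout or refused connection)
```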

The liveness and readiness checks follow this evaluation cycle:

  1. Wait Initial Delay seconds after service start before the first check.
  2. Send an HTTP GET request every Period seconds.
  3. Allow each request Timeout seconds to respond.
  4. After Failure Threshold consecutive failures → take action (restart for liveness; remove from load balancer for readiness).
  5. After Success Threshold consecutive successes → mark the check as passing again.

The startup check runs through the same cycle; once it passes for the first time, it stops running and the liveness and readiness checks begin.
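To make the threshold logic concrete, the following sketch models the cycle as a counter over a sequence of probe outcomes. It is a simplified illustration under stated assumptions: the check is assumed to start in the passing state, timing (Initial Delay, Period, Timeout) is left out, and the name `evaluateProbe` is hypothetical.

```javascript
// Simplified model of the evaluation cycle described above (an illustration,
// not the platform's implementation). `results` lists probe outcomes in
// order: true = success, false = failure.
function evaluateProbe(results, { failureThreshold = 3, successThreshold = 1 } = {}) {
  let passing = true;      // assumption: the check starts out passing
  let failures = 0;        // consecutive failures so far
  let successes = 0;       // consecutive successes so far
  const transitions = [];  // state changes as { index, passing }

  results.forEach((ok, index) => {
    if (ok) {
      successes += 1;
      failures = 0;
      if (!passing && successes >= successThreshold) {
        passing = true;
        transitions.push({ index, passing });
      }
    } else {
      failures += 1;
      successes = 0;
      if (passing && failures >= failureThreshold) {
        passing = false;   // the point at which action would be taken
        transitions.push({ index, passing });
      }
    }
  });

  return { passing, transitions };
}

// Two failures are tolerated; the third consecutive failure flips the state,
// and a single success (Success Threshold = 1) restores it.
const outcome = evaluateProbe([true, false, false, true, false, false, false, true]);
console.log(outcome.transitions);
// [ { index: 6, passing: false }, { index: 7, passing: true } ]
```

Because only consecutive failures count, an isolated slow response does not by itself trigger a restart or remove the service from the load balancer.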

Configuring healthchecks

  1. Open the service panel → Settings tab → Deploy section.
  2. Under Health Probes, find the check type you want to configure.
  3. Click + Add next to the check type.
  4. Fill in the dialog:
    • Path — the HTTP path your service exposes for health (e.g. /health, /ready, /livez).
    • Port — the port your service is listening on (e.g. 8080).
  5. Optionally expand Advanced settings to configure timing and thresholds (see below).
  6. Click Save Probe.
  7. The check is staged as a pending change. Click Apply in the Apply Changes bar to deploy.

Like resource changes and start commands, healthcheck configuration is tracked in the changeset and only takes effect on the next deployment. You can discard the change before deploying.

Advanced settings

| Field | Default | Description |
| --- | --- | --- |
| Initial Delay (s) | 0 | Seconds to wait after the service starts before running the first check. |
| Period (s) | 10 | How often (in seconds) to perform the check. |
| Timeout (s) | 1 | Seconds to wait for a response before counting it as a failure. |
| Failure Threshold | 3 | Number of consecutive failures before the check is considered failed. |
| Success Threshold | 1 | Number of consecutive successes required to mark the check as healthy again after a failure. Must be 1 for liveness and startup checks. |
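If you keep a record of your probe settings alongside your code, the defaults and the Success Threshold rule from the table above can be captured in a few lines. This is a sketch only: healthchecks are configured in the dialog, and the camelCase field names and the `resolveSettings` function are hypothetical.

```javascript
// Defaults from the table above; the camelCase field names are hypothetical.
const DEFAULTS = Object.freeze({
  initialDelay: 0,      // seconds
  period: 10,           // seconds
  timeout: 1,           // seconds
  failureThreshold: 3,
  successThreshold: 1,
});

// Merge overrides onto the defaults and enforce the one documented rule:
// Success Threshold must be 1 for liveness and startup checks.
function resolveSettings(type, overrides = {}) {
  const settings = { ...DEFAULTS, ...overrides };
  if ((type === 'liveness' || type === 'startup') && settings.successThreshold !== 1) {
    throw new RangeError(`Success Threshold must be 1 for ${type} checks`);
  }
  return settings;
}

console.log(resolveSettings('readiness', { successThreshold: 2 }).successThreshold);  // 2
console.log(resolveSettings('startup', { period: 5, failureThreshold: 12 }).timeout); // 1
```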

Editing and removing healthchecks

  • Edit — click the pencil icon next to the check to reopen the dialog with the current values pre-filled.
  • Remove — click the trash icon next to the check. The removal is staged and applied on the next deployment.

Writing a healthcheck endpoint

Your service should expose a lightweight endpoint (e.g. GET /health) that:

  • Returns HTTP 200 when the service is ready.
  • Does not make external calls (database, third-party APIs) unless you specifically want those to gate readiness.
  • Responds quickly — within the configured timeout (typically 1 second).

A minimal example in Node.js:

app.get('/health', (req, res) => res.sendStatus(200));

For a readiness check that also verifies the database is reachable:

app.get('/ready', async (req, res) => {
  try {
    await db.query('SELECT 1');
    res.sendStatus(200);
  } catch {
    res.sendStatus(503);
  }
});
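A dependency check that hangs will exceed the probe timeout and count as a failure anyway, so it can help to bound it explicitly and return 503 promptly. The sketch below shows one way to do that with Promise.race; `withTimeout`, `readinessStatus`, `checkDependency`, and the 500 ms budget are illustrative assumptions.

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it does not keep the process busy.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Resolve to the status code a readiness endpoint should return.
// `checkDependency` is a hypothetical async function,
// e.g. () => db.query('SELECT 1').
async function readinessStatus(checkDependency, budgetMs = 500) {
  try {
    await withTimeout(checkDependency(), budgetMs);
    return 200;
  } catch {
    return 503; // dependency failed or took longer than the budget
  }
}
```

With the Express handler above, this could be used as `res.sendStatus(await readinessStatus(() => db.query('SELECT 1')))`, keeping the budget below the configured Timeout.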

Common patterns

Web app with a slow startup

Use a startup check with a generous window, then readiness and liveness for ongoing health:

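| Type | Path | Port | Initial Delay | Period | Failure Threshold |
| --- | --- | --- | --- | --- | --- |
| Startup | /health | 8080 | 0 | 5 | 12 (= 60 s total) |
| Readiness | /health | 8080 | 0 | 10 | 3 |
| Liveness | /health | 8080 | 0 | 30 | 3 |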
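The "60 s total" in the table comes from multiplying Period by Failure Threshold. The sketch below applies the same arithmetic to all three rows; including the Initial Delay in the sum is an assumption of this example, the result is approximate because it ignores the per-request Timeout, and the function name is hypothetical.

```javascript
// Approximate time before a check's Failure Threshold is reached, following
// the "period × failure threshold" arithmetic used in the table above.
// Adding the initial delay is an assumption of this sketch.
function secondsUntilAction({ initialDelay = 0, period = 10, failureThreshold = 3 }) {
  return initialDelay + period * failureThreshold;
}

const checks = {
  startup:   { initialDelay: 0, period: 5,  failureThreshold: 12 },
  readiness: { initialDelay: 0, period: 10, failureThreshold: 3 },
  liveness:  { initialDelay: 0, period: 30, failureThreshold: 3 },
};

for (const [name, settings] of Object.entries(checks)) {
  console.log(`${name}: ~${secondsUntilAction(settings)} s`);
}
// startup: ~60 s
// readiness: ~30 s
// liveness: ~90 s
```

Size the startup window to your slowest realistic boot; a longer liveness window trades slower detection of a stuck service for fewer unnecessary restarts.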

Minimal setup — traffic gating only

If you only need to prevent traffic from reaching services before they are ready, a readiness check alone is sufficient:

| Type | Path | Port |
| --- | --- | --- |
| Readiness | /health | 8080 |