Kubernetes CronJob Monitoring

Start monitoring your Kubernetes CronJobs free — 10 checks, no credit card.

The problem

A Kubernetes CronJob can stop doing its job long before anything in the cluster looks wrong. `kubectl get cronjob` shows a recent LAST SCHEDULE, the Deployment is healthy, and the dashboard stays green — yet the nightly backup silently skipped a run because `startingDeadlineSeconds` lapsed, a suspended CronJob was never re-enabled, or a controller restart missed the window. Worse, a Job that exits non-zero gets retried under `restartPolicy: OnFailure` and may eventually succeed, so the failed run never surfaces — or it exhausts `backoffLimit` and is marked failed in a namespace no one is watching. Kubernetes tells you a pod ran; it does not tell you the work inside it actually completed. The result is the most dangerous failure mode in scheduled work: silence that looks exactly like success.

How Pakyas helps

Pakyas is execution-signal monitoring: each CronJob proves it ran by sending a signal from inside the pod, and Pakyas compares those signals against the schedule you expect. Because the signal originates in the container itself, you find out even when the cluster, the scheduler, or the node is the thing that broke. Pakyas distinguishes the states Kubernetes blurs together — Missing (no signal arrived, so the run never happened or never reported), Late (it arrived outside the expected window), Overrunning (the Job is taking longer than its normal duration), and Error (the wrapped command explicitly reported a non-zero exit). A retried-then-recovered Job and a Job that quietly stopped firing are no longer indistinguishable. You get one clear alert that means action, not a wall of green that hides a skipped backup.

Set it up

  1. Create a check and grab its ping URL

    # In the Pakyas dashboard, create a check for your CronJob
    # (set the expected schedule, e.g. 0 2 * * *, and a grace period).
    # Copy the check's ping URL — it looks like:
    #   https://ping.pakyas.com/<public_id>
    # where <public_id> is the check's UUID.
    #
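    # Store it as a Kubernetes secret so it isn't baked into the manifest
    # (create it in the same namespace as the CronJob). Quote the URL so the
    # shell passes it through unchanged:
    kubectl create secret generic pakyas-ping \
      --from-literal=ping-url='https://ping.pakyas.com/<public_id>'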

    The ping URL has no /ping/ path segment — it is https://ping.pakyas.com/<public_id> directly. Append /start and /fail for start and failure signals.
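
Before touching the manifest, it can help to derive the three signal URLs from the check's UUID and send them by hand. The sketch below assumes a POSIX shell with curl installed; `signal_url` and `send_signal` are hypothetical helper names, not part of Pakyas.

```shell
# Hypothetical helpers (not part of Pakyas) for exercising a check by hand.

# signal_url PUBLIC_ID [success|start|fail]
# Prints the URL for the given signal; the bare URL is the success signal.
signal_url() {
  base="https://ping.pakyas.com/$1"
  case "${2:-success}" in
    success)    printf '%s\n' "$base" ;;
    start|fail) printf '%s/%s\n' "$base" "$2" ;;
    *)          echo "unknown signal: $2" >&2; return 1 ;;
  esac
}

# send_signal PUBLIC_ID [success|start|fail]
# Sends one signal with the same curl flags the CronJob manifest uses.
send_signal() {
  url=$(signal_url "$@") || return 1
  curl -fsS -m 10 --retry 3 "$url"
}
```

With your check's UUID in place of the placeholder, running `send_signal <public_id> start` and then `send_signal <public_id>` mimics one successful run.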

  2. Signal start, success, and failure from the CronJob

    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: nightly-backup
    spec:
      schedule: "0 2 * * *"   # must match the schedule on your Pakyas check
      jobTemplate:
        spec:
          backoffLimit: 0       # avoid silent retries masking a failed run
          template:
            spec:
              restartPolicy: Never
              containers:
                - name: backup
                  image: your-org/backup:latest
                  env:
                    - name: PING_URL
                      valueFrom:
                        secretKeyRef:
                          name: pakyas-ping
                          key: ping-url
                  command: ["/bin/sh", "-c"]
                  args:
                    - |
                      # Start ping; a ping outage must never block the job.
                      curl -fsS -m 10 --retry 3 "$PING_URL/start" || true
                      if /app/run-backup.sh; then
                        # Success ping; a failed ping must not fail a good backup.
                        curl -fsS -m 10 --retry 3 "$PING_URL" || true
                      else
                        curl -fsS -m 10 --retry 3 "$PING_URL/fail" || true
                        exit 1
                      fi

    Requires curl in the image. The trailing /start and /fail are real Pakyas signals; the bare URL is the success signal. `|| true` on each ping keeps a ping outage from blocking the job or changing its exit status.
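
The inline script above can also live in a small wrapper script baked into the image, which keeps the manifest short and returns the job's real exit code instead of a fixed `exit 1`. This is a minimal sketch, assuming a POSIX shell, curl, and `PING_URL` set in the environment; `ping_pakyas` and `run_with_signals` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical wrapper (names are illustrative, not part of Pakyas).
# Expects PING_URL in the environment, as set by the manifest's secretKeyRef.

# ping_pakyas SUFFIX
# SUFFIX is "" for success, "/start", or "/fail". A failed ping is ignored
# so that a monitoring outage never changes the job's outcome.
ping_pakyas() {
  curl -fsS -m 10 --retry 3 "${PING_URL}$1" >/dev/null 2>&1 || true
}

# run_with_signals COMMAND [ARGS...]
# Sends start, runs the command, sends success or fail, and returns the
# command's own exit status.
run_with_signals() {
  ping_pakyas /start
  status=0
  "$@" || status=$?
  if [ "$status" -eq 0 ]; then
    ping_pakyas ""
  else
    ping_pakyas /fail
  fi
  return "$status"
}
```

If the script ends with `run_with_signals "$@"`, the container command could shrink to something like `/app/with-signals.sh /app/run-backup.sh`, where the wrapper path is a made-up example.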

  3. Prefer the wrapper if you can change the entrypoint

    # Alternative to step 2: let the pakyas CLI handle start/success/fail automatically.
    # The CLI sends a start ping, runs the command, then sends success or fail
    # based on the real exit code, and exits with that same code.
    # (Fragment only: the `spec` below is jobTemplate.spec.template.spec.)
    spec:
      containers:
        - name: backup
          image: your-org/backup:latest   # image must include the `pakyas` binary
          # `nightly-backup` is the check slug; everything after -- is your job
          command: ["pakyas", "monitor", "nightly-backup", "--", "/app/run-backup.sh"]
          env:
            - name: PAKYAS_API_KEY
              valueFrom:
                secretKeyRef:
                  name: pakyas-secret
                  key: api-key

    `pakyas monitor <slug> -- <command>` is the project's recommended K8s pattern: it sends the failure ping before the pod exits non-zero, so you are alerted on the failed run even when `restartPolicy: OnFailure` or `backoffLimit` retries later succeed.
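
The fragment above reads the API key from a Secret named `pakyas-secret` under the key `api-key`, which has to exist in the CronJob's namespace. One way to create it is sketched below. The helper name `pakyas_secret_manifest` is hypothetical, and the sketch assumes the key contains no double quotes or backslashes, which would need YAML escaping.

```shell
# Hypothetical helper: print a Secret manifest matching the secretKeyRef
# used by the wrapper example (name: pakyas-secret, key: api-key).
# Assumes the API key has no double quotes or backslashes.
pakyas_secret_manifest() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: pakyas-secret
type: Opaque
stringData:
  api-key: "$1"
EOF
}
```

With the key in an environment variable, `pakyas_secret_manifest "$PAKYAS_API_KEY" | kubectl apply -n <namespace> -f -` would apply it to the chosen namespace.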

A worked example

Say you run a nightly Postgres backup as a CronJob scheduled at `0 2 * * *`, dumping to object storage. Create a Pakyas check with that same `0 2 * * *` schedule and a grace period a little longer than a normal backup takes (e.g. 30 minutes). Wire the manifest as in the steps above: the container sends a start signal, runs `run-backup.sh`, then sends a success signal on exit 0 or a fail signal on any non-zero exit. Now the failure modes become distinct and actionable. If the dump command errors out (bad credentials, disk full), Pakyas marks the check Error and alerts you immediately. If the backup pod never gets scheduled (a missed window, a CronJob left suspended, a node that never came back), no signal arrives and Pakyas marks the check Missing once the grace period expires, so you do not discover the gap during a restore six weeks later. If the dump hangs and runs well past its usual duration, Pakyas flags it Overrunning before it collides with the next night's run. One green check now genuinely means "last night's backup ran and finished."
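
To make the role of the grace period concrete, here is a deliberately simplified model of how a schedule, a grace period, and a signal time could map to a state. It is not Pakyas's actual logic. The function name and the "OK" and "Pending" labels are assumptions made for illustration, and it ignores Overrunning and Error entirely.

```shell
# Simplified model of the Missing/Late decision; NOT Pakyas's actual logic.
# All times are integers on one consistent clock (e.g. Unix epoch seconds).
#
# classify_run EXPECTED GRACE NOW [SIGNAL_TIME]
#   EXPECTED     when the schedule says the run should fire
#   GRACE        grace period in seconds
#   NOW          current time
#   SIGNAL_TIME  when the success signal arrived (omit if none yet)
classify_run() {
  expected=$1; grace=$2; now=$3; signal=${4:-}
  deadline=$((expected + grace))
  if [ -n "$signal" ]; then
    # A signal arrived: on time if it landed within the grace window.
    if [ "$signal" -le "$deadline" ]; then echo "OK"; else echo "Late"; fi
  elif [ "$now" -gt "$deadline" ]; then
    # No signal and the window has closed.
    echo "Missing"
  else
    # No signal yet, but the window is still open.
    echo "Pending"
  fi
}
```

Counting in seconds after midnight, 02:00 is 7200 and a 30-minute grace period is 1800. Then `classify_run 7200 1800 9060` prints `Missing` (no signal by 02:31), `classify_run 7200 1800 9060 8400` prints `OK` (signal at 02:20), and `classify_run 7200 1800 10000 9900` prints `Late` (signal at 02:45).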

Pricing

Pakyas has four tiers: Free ($0), Developer ($9/mo), Pro ($29/mo), and Business ($99/mo). The Free tier includes 10 checks — enough to cover your most critical CronJobs (backups, migrations, syncs) at no cost, with paid tiers adding more checks and capabilities as your cluster grows.

See the full breakdown on the pricing page.

New to the terminology? See the cron monitoring glossary for plain-language definitions of every job state, or explore everything Pakyas tracks on the features page.

Execution-signal precision: know when a job is Missing, Late, Overrunning, or reports an Error — not just up or down.
