no vibes. no opinions. just cold, hard latency.
Every 2 minutes, we send a unique prompt to every AI provider. Each one is different. Each one has a right answer. If the model gets it wrong, we notice. No special treatment, no warm-up calls. Just one honest kick per minute.
We measure wall clock time — from the moment the request leaves our server to the moment the last byte comes back. No cheating with TTFB. If it takes 4 seconds to think about one sentence, that's 4 seconds of your life gone.
We look at the last 3 probes (~6 minutes of history). If 2 out of 3 failed → DOWN. If 2 out of 3 were slow → WEARY. Otherwise → UP. One bad request doesn't trigger anything — we're not drama queens. Two in a row? Now we're talking.
Most models get probed every 2 minutes. Some models run on a 6-minute cadence instead — they have tighter rate limits that would choke on the standard interval. Same logic, same thresholds, just slower data. Their staleness window scales to match so they don't get falsely marked STALE between normal probe cycles.
All probes are sent from US West (California) on Railway's us-west2 metal infrastructure. One region, full transparency.
questions? complaints? existential dread about AI reliability?
same tbh