DASHBOARD

Results over the last 7 days: who's slacking?

SPEED LEADERBOARD

MODEL                                  SPEED    TIER
google/gemini-3.1-flash-lite           452ms    CHEAP
mistral/mistral-medium-latest          533ms    MID
google/gemini-3.1-flash-lite-preview   569ms
mistral/mistral-small-latest           607ms    CHEAP
anthropic/claude-haiku-4-5             700ms    CHEAP
openai/gpt-5.4-mini                    740ms    CHEAP
mistral/mistral-large-latest           749ms    FLAGSHIP
google/gemini-3-flash-preview          1205ms   MID
deepseek/deepseek-v4-flash             1407ms   MID
anthropic/claude-sonnet-4-6            1819ms   MID
anthropic/claude-opus-4-7              1845ms   FLAGSHIP
deepseek/deepseek-v4-pro               1959ms   FLAGSHIP
openai/gpt-5.4                         2030ms   MID
openai/gpt-5.5                         2379ms   FLAGSHIP
google/gemini-3.1-pro-preview          2990ms   FLAGSHIP

7-DAY UPTIME

MODEL                                  UPTIME   TIER       PROBES
mistral/mistral-large-latest           100%     FLAGSHIP   5036/5040
mistral/mistral-medium-latest          100%     MID        5040/5040
mistral/mistral-small-latest           100%     CHEAP      5040/5040
google/gemini-3.1-flash-lite           98%      CHEAP      4964/5040
google/gemini-3.1-pro-preview          98%      FLAGSHIP   1650/1680
google/gemini-3-flash-preview          98%      MID        4964/5040
anthropic/claude-haiku-4-5             -        CHEAP
anthropic/claude-opus-4-7              -        FLAGSHIP
anthropic/claude-sonnet-4-6            -        MID
deepseek/deepseek-v4-flash             -        MID
deepseek/deepseek-v4-pro               -        FLAGSHIP
google/gemini-3.1-flash-lite-preview   -
openai/gpt-5.4                         -        MID
openai/gpt-5.4-mini                    -        CHEAP
openai/gpt-5.5                         -        FLAGSHIP
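
The uptime percentages above appear to be the share of successful probes rounded to a whole number, which would explain why 5036/5040 shows as 100%. A minimal sketch of that arithmetic follows. The function name is hypothetical. Rounding to the nearest integer and rendering "-" when no probes were recorded are both assumptions that happen to match the displayed values, not confirmed dashboard behavior.

```python
from typing import Optional


def uptime_percent(ok: int, total: int) -> Optional[int]:
    """Return uptime as a whole-number percentage, or None when there is no data."""
    if total == 0:
        # Assumption: a model with no recorded probes is shown as "-".
        return None
    # Assumption: the dashboard rounds to the nearest whole percent.
    return round(100 * ok / total)


# Probe counts copied from the 7-day uptime figures above.
probes = {
    "mistral/mistral-large-latest": (5036, 5040),   # 99.92% exactly
    "google/gemini-3.1-flash-lite": (4964, 5040),   # 98.49% exactly
    "google/gemini-3.1-pro-preview": (1650, 1680),  # 98.21% exactly
}

for model, (ok, total) in probes.items():
    print(f"{model}: {uptime_percent(ok, total)}%")
```

Under these assumptions, four failed probes out of 5040 still display as 100%, so the probe counts are the more precise figure when comparing models.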

WRONG ANSWERS

% of successful probes where the model did NOT return the expected response · lower is better · daily averages over 7 days, in your local timezone
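
As a rough illustration of the metric described above, the sketch below groups probes by local calendar day and computes the share of successful probes whose response did not match the expected one. The `Probe` record, its field names, and the `daily_wrong_rate` function are hypothetical; the dashboard's actual data model is not shown here.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone, tzinfo


@dataclass
class Probe:
    """Hypothetical probe record; field names are assumptions."""
    at: datetime    # timezone-aware timestamp of the probe
    ok: bool        # True if the probe succeeded (a response was received)
    correct: bool   # True if the response matched the expected one


def daily_wrong_rate(probes: list[Probe], tz: tzinfo) -> dict:
    """Map each local date to the % of successful probes with a wrong answer."""
    counts = defaultdict(lambda: [0, 0])  # date -> [wrong, successful]
    for p in probes:
        if not p.ok:
            continue  # failed probes are excluded from this metric
        day = p.at.astimezone(tz).date()  # bucket by the viewer's local day
        counts[day][1] += 1
        if not p.correct:
            counts[day][0] += 1
    return {day: 100 * wrong / total
            for day, (wrong, total) in sorted(counts.items())}


# Example with a UTC+2 viewer: the 23:30 UTC probe falls on the next local day.
local = timezone(timedelta(hours=2))
sample = [
    Probe(datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc), True, True),
    Probe(datetime(2025, 1, 1, 13, 0, tzinfo=timezone.utc), True, False),
    Probe(datetime(2025, 1, 1, 14, 0, tzinfo=timezone.utc), False, False),
    Probe(datetime(2025, 1, 1, 23, 30, tzinfo=timezone.utc), True, False),
]
rates = daily_wrong_rate(sample, local)
```

Because days are bucketed in the viewer's timezone, two viewers in different timezones could see slightly different daily values for the same underlying probes.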