Why Direct LLM API Calls Break In Production
A category-defining guide to the boring failure modes behind LLM calls: timeouts, rate limits, worker restarts, duplicate retries, and unknown outcomes.
Blog
Practical writing on durable LLM requests, retries, idempotency, wait mode, and the failure modes that show up in production.
Implementation Notes
A look at the small worker loop behind ReqRun: durable queue rows, lock tokens, attempts, retryable failures, terminal failures, and backoff with jitter.
Production Patterns
Retries are dangerous without dedupe. This post explains how to choose idempotency keys for LLM tasks and how ReqRun deduplicates by project.
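One common way to build such a key is to hash the canonicalized task inputs and scope the result to the project. A minimal sketch under that assumption, with hypothetical names (`idempotency_key`, `Deduper`) and an in-memory table standing in for durable storage:

```python
import hashlib
import json

def idempotency_key(project: str, task: dict) -> str:
    """Deterministic key: the same project plus the same canonicalized
    inputs always map to the same key, so duplicate retries collapse."""
    canonical = json.dumps(task, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]
    return f"{project}:{digest}"

class Deduper:
    """Per-project dedupe table: the first submit wins, and repeats
    return the original request id instead of creating new work."""
    def __init__(self) -> None:
        self._seen: dict[str, str] = {}

    def submit(self, project: str, task: dict, request_id: str) -> str:
        key = idempotency_key(project, task)
        return self._seen.setdefault(key, request_id)
```

Scoping the key by project matters: two tenants sending identical inputs should get separate requests, not share a cached result.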
Some LLM requests finish quickly. Some do not. wait=true gives developers a synchronous happy path without giving up durable async recovery.
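The key property of wait mode is that the synchronous path blocks on the same durable record the async path uses, so a timeout loses nothing. A toy sketch of that shape, with hypothetical names (`RequestStore`, `wait`) and a dict in place of real storage:

```python
import time
import uuid

class RequestStore:
    """Toy model of wait mode: every request is durably recorded first,
    and wait merely blocks on the same record the async path reads."""
    def __init__(self) -> None:
        self._rows: dict[str, dict] = {}

    def submit(self, prompt: str) -> str:
        rid = uuid.uuid4().hex
        self._rows[rid] = {"prompt": prompt, "status": "pending", "result": None}
        return rid

    def finish(self, rid: str, result: str) -> None:
        self._rows[rid].update(status="done", result=result)

    def wait(self, rid: str, timeout: float, poll: float = 0.01) -> dict:
        """Synchronous happy path: block until done or until timeout.
        On timeout the caller gets the still-pending durable record
        and falls back to polling or webhooks; no work is lost."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            row = self._rows[rid]
            if row["status"] == "done":
                return row
            time.sleep(poll)
        return self._rows[rid]
```

Because `submit` writes the record before anyone waits on it, a client that times out (or disconnects) can recover the outcome later from the same row.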
Webhook senders retry delivery. If a webhook triggers LLM work, your handler needs idempotency and a durable model request record.
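The handler-side pattern can be sketched as keying on the sender's delivery id and recording the model request before doing any work. A minimal illustration, assuming a hypothetical `WebhookHandler` with an in-memory table in place of durable storage:

```python
class WebhookHandler:
    """Webhook senders redeliver on any non-2xx or timeout, so the
    handler keys on the delivery id: a redelivered event maps back to
    the same model request record instead of triggering new LLM work."""
    def __init__(self) -> None:
        self._requests: dict[str, str] = {}

    def handle(self, delivery_id: str, payload: dict) -> str:
        if delivery_id in self._requests:
            # Duplicate delivery: acknowledge with the existing record.
            return self._requests[delivery_id]
        request_id = f"req-{len(self._requests) + 1}"
        # Record durably BEFORE kicking off the LLM call, so a crash
        # between enqueue and response still leaves a traceable row.
        self._requests[delivery_id] = request_id
        return request_id
```

Most webhook senders include a stable delivery id in a header, which makes it a natural idempotency key for the downstream LLM request.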
ReqRun is intentionally narrow. It does not coordinate every step in your system; it makes the LLM request step durable and visible.