Learn how to monitor LLM inference in production using Prometheus and Grafana. Track p95 latency, tokens/sec, queue duration, and KV cache usage across vLLM, TGI, and llama.cpp. Includes PromQL examples, dashboards, alerts, and Docker and Kubernetes setups.
An actual overnight Morning Stack run, unedited: the real email as delivered, the three tailored application packages, and the full ledger of everything the agent filtered out to find them.
Show off what you're playing on itch.io natively in Discord.