Time-to-load and error metrics for HTTP requests

For our QoS metrics, we need to track the time taken to generate responses, as well as HTTP errors. The consensus in that issue is that this should be Debusine's responsibility.

I think the simplest approach would be to add django-prometheus. It's already packaged in Debian, and it seems to have pretty much what we need. Its metrics include histograms of request processing time labelled by view and responses by status, which would already be a significant visibility improvement. We'll need to add a few additional metrics to divide things up by logged-in users vs. anonymous requests, and to track SSO denials.
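As a rough sketch of what enabling it could look like: django-prometheus is added to `INSTALLED_APPS`, and its two middleware classes wrap the rest of the middleware stack so that the latency histograms cover the whole request. The other app and middleware entries below are placeholders for illustration, not Debusine's actual settings.

```python
# settings.py (fragment). Only the django_prometheus entries matter here;
# every other entry is a placeholder, not Debusine's real configuration.
INSTALLED_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django_prometheus",
]

MIDDLEWARE = [
    # Must come first so that timing starts before any other middleware runs.
    "django_prometheus.middleware.PrometheusBeforeMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    # Must come last so that it sees the resolved view and final response.
    "django_prometheus.middleware.PrometheusAfterMiddleware",
]
```

The additional metrics (logged-in vs. anonymous, SSO denials) are not covered by this fragment and would need our own counters on top.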

django-prometheus has various approaches for exporting metrics. We're using multiple threads in a single process, so I think we can probably use the dedicated thread approach. If and when we need to monitor multiple Django processes, we can just configure Prometheus to scrape each of them separately. Our worker processes aren't so short-lived that Prometheus would miss scraping them, so we don't need the added complexity of storing information somewhere like Redis.
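For reference, the dedicated thread approach is driven by two settings, which make django-prometheus start a separate HTTP server thread for metrics alongside the application. This is only a sketch: the port number is an arbitrary example, and which address we'd want to bind to is still an open question.

```python
# settings.py (fragment). Port and address are illustrative assumptions,
# not values that have been agreed for Debusine.
PROMETHEUS_METRICS_EXPORT_PORT = 8001
# An empty string binds to all addresses; we would likely want to restrict
# this to whatever interface Prometheus scrapes from.
PROMETHEUS_METRICS_EXPORT_ADDRESS = ""
```

Prometheus would then scrape that port directly, separately from the main application port.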