Skip to content

Optimize SnapshotModel calls

Baptiste Beauplat requested to merge lyknode/snapshot:feat/sql-opti into master

While profiling snapshot to understand what was causing 504, as reported in #1023540, I discovered that snapshot was doing, on packages with a lot of sources and binary files, massive amount of SQL queries.

To illustrate, this is the profile for fetching https://snapshot.debian.org/package/gcc-6/6.0.1-2/

Screenshot_20230118_192859

We end up with more than 2000 call to DictCursor.execute! Moreover, the time took my develop env to respond, using the real DB was more than 2 minuites... I did the same test directly calling apache on lw07 and the result that about 1 minute. I blame the SSH tunnel latency for that difference.

Most of the queries can be traced back to SnapshotModel.packages_get_file_info() and sure enough, that method is called from inside a for loop, was accounting for about ~1800 single call.

I've modified the method to return a indexed result while taking a list of hashes as its input. This way, the method can be called once with the complete list and do a single query against the database.

This modification drastically improved the performance the request. This is the second profile, with the single call for SnapshotModel.packages_get_file_info():

Screenshot_20230118_192915

Down to ~200 SQL queries and around ~25 seconds.

I applied the same principle to SnapshotModel.packages_get_binary_files_from_id() and re-did the profiling:

Screenshot_20230118_192926

4 queries and down to 1.5 second. This is a very promising result.

I could not find anymore queries called multiple times for a single request. Since the time gained from this optimization should free-up a lot of resources from apache (having to wait for the DB) and the DB itself, I expect a visible improvement over snapshot DB and frontend performance.

This MR includes a couple of additional changes, namely:

  • Fix the CI for unstable by:
    • Updating gitlab-ci dependencies
    • Fixing rewriting of the Location header on redirect
  • Add missing dependencies when deploying a development version of snapshot using a venv (especially when running the testsuite)
  • Fixing the coverage report and converage html command
  • Adding support for passing args to pytest when running tox

Merge request reports

Loading