Add initial design for collections

Colin Watson requested to merge cjwatson/debusine:design-collections into devel

This is extremely preliminary, but I wanted to get something up for people to tear apart that roughly matches what we discussed in Cambridge (https://docs.google.com/document/d/11KE3MlCtzLIakjLOoFpRPUMX5jLwJ-pX4ubbWCQD_4I).

I continue to struggle with modelling this sort of thing as a generic collection with specializations for Debian archives, rather than as something like an Archive table, but I admit it does fit well with the design approach used by artifacts. The place where it feels strangest is for the data attached to collection memberships, which has to use a structure depending on the type of the collection. I've done my best to make it all fit.
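To make the awkward part concrete, here is a minimal sketch of membership data whose required shape depends on the collection's type. Everything in it is hypothetical and for illustration only: the `example:archive` category, the `suite` and `component` keys, and the class and function names are assumptions, not the schema proposed in this MR.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


def validate_archive_membership(data: Dict[str, Any]) -> None:
    """Check the (hypothetical) keys an archive-like collection requires."""
    missing = {"suite", "component"} - data.keys()
    if missing:
        raise ValueError(f"missing membership keys: {sorted(missing)}")


# Hypothetical registry: collection category -> validator for its membership data.
MEMBERSHIP_VALIDATORS: Dict[str, Callable[[Dict[str, Any]], None]] = {
    "example:archive": validate_archive_membership,
}


@dataclass
class Collection:
    name: str
    category: str
    # Each membership pairs an artifact ID with category-specific data.
    memberships: List[Tuple[int, Dict[str, Any]]] = field(default_factory=list)

    def add(self, artifact_id: int, data: Dict[str, Any]) -> None:
        """Validate the data against the collection's category, then record it."""
        validator = MEMBERSHIP_VALIDATORS.get(self.category)
        if validator is None:
            raise ValueError(f"unknown collection category: {self.category}")
        validator(data)
        self.memberships.append((artifact_id, data))
```

The generic table stays the same for every collection type; only the validator looked up by category knows what the attached data should look like.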

Although I think it was a term I suggested to begin with, I'd love to have a better term than "membership"; to me that's a binary concept, and this is actually a one-to-many relationship.

I haven't yet really considered deeply how this will interact with workflows. Coming up with concrete designs for collections and workflows at the same time was a bit too much for me, and collections seem more fundamental, so I started here. Our design notes talk about having collection membership audit log entries link to workflows, and about workflows having server-side actions to update collections. IIRC the link is to a workflow rather than a task because tasks are always executed on workers, which isn't something we'd want here.

Something like expiration_delay will be needed if we ever want to do snapshots, and I think it's good to have that in place up-front since collections clearly need to participate in the rules for artifact expiry anyway. There's an unfortunate confusion with workspaces: their expiration delay is from artifact creation, whereas the one that's needed here is from the time when the artifact is removed from the collection.
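The difference between the two delays can be sketched as follows. This is only an illustration of the two starting points described above; the function names and the convention that `None` means "never expires" are assumptions, not part of the proposed design.

```python
from datetime import datetime, timedelta
from typing import Optional


def workspace_expiry(
    created_at: datetime, delay: Optional[timedelta]
) -> Optional[datetime]:
    """Workspace-style expiry: the clock starts when the artifact is created."""
    if delay is None:
        return None  # assumption: no delay means the artifact never expires
    return created_at + delay


def collection_expiry(
    removed_at: Optional[datetime], delay: Optional[timedelta]
) -> Optional[datetime]:
    """Collection-style expiry: the clock starts when the artifact is removed."""
    if removed_at is None or delay is None:
        return None  # still in the collection, or no delay configured
    return removed_at + delay
```

An artifact that is still a member of the collection has no removal time, so under this sketch the collection never causes it to expire, however old it is.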

base_archives may not be quite right yet. Presumably we'd be using that to generate sources.list or equivalent for builds, in which case you might need to be able to specify things like suites as well? I also expect more policy rules to be needed in general, though that's the sort of thing that can be added later.
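As a rough sketch of why suites might be needed, here is one way base archive entries could be rendered into one-line-style `sources.list` entries. The `BaseArchive` type and its fields are hypothetical and not what this MR defines; the output follows the standard `deb URI suite component...` format.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class BaseArchive:
    """Hypothetical shape for one base archive entry."""

    url: str
    suites: Tuple[str, ...]
    components: Tuple[str, ...] = ("main",)


def render_sources_list(archives: Iterable[BaseArchive]) -> str:
    """Emit one 'deb' line per (archive, suite) pair."""
    lines = []
    for archive in archives:
        for suite in archive.suites:
            components = " ".join(archive.components)
            lines.append(f"deb {archive.url} {suite} {components}")
    return "".join(line + "\n" for line in lines)
```

Without a suite there is nothing to put in the second field of each line, which suggests a bare archive reference would not be enough on its own.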

Sharing of collections across workspaces seems likely to need more work. Our design notes say "One master owner and others have read-only views", but I don't remember exactly what use cases we were talking about there.

I haven't mentioned generation or importing of archive indexes yet. That might want to be in a separate MR. If we're generating our own indexes, then we might need to have additional information on collection memberships to express the notion of pending changes that haven't yet been fully processed, or maybe we just mark the collection dirty in some way when we make a change to its memberships. In general we'll need to think through the workflow here and make sure that this data model can support it.
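The simpler "mark the collection dirty" option could look roughly like this. It is a sketch with hypothetical names (`IndexedCollection`, `dirty`, `regenerate_indexes`), and the sorted list of artifact IDs is only a stand-in for real index generation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class IndexedCollection:
    members: Set[int] = field(default_factory=set)
    dirty: bool = False  # True when memberships changed since the last index run

    def add(self, artifact_id: int) -> None:
        """Add a member; only an actual change marks the collection dirty."""
        if artifact_id not in self.members:
            self.members.add(artifact_id)
            self.dirty = True

    def remove(self, artifact_id: int) -> None:
        """Remove a member; only an actual change marks the collection dirty."""
        if artifact_id in self.members:
            self.members.discard(artifact_id)
            self.dirty = True

    def regenerate_indexes(self) -> List[int]:
        """Stand-in for index generation: snapshot the members, clear the flag."""
        snapshot = sorted(self.members)
        self.dirty = False
        return snapshot
```

A single flag cannot say which memberships are pending, so it only works if index generation always rebuilds from the full current state.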
