
Model collection data using Pydantic

Collection.data and CollectionItem.data are JSON fields, but there are no associated Pydantic models. This causes the same sorts of problems that we had before modelling artifact and task data: validation has to be handled in every place where we read from or write to those data fields, it's easy to make mistakes, and there is nothing we can introspect to build interfaces that know about the available fields and their types.

I'd like this to work in a somewhat similar way to artifacts: there should be create_data methods on Collection and CollectionItem, called from clean, that instantiate the appropriate Pydantic models. CollectionManagerInterface should take the collection and collection item data models as type parameters. In places where we take collection or collection item data as a plain dictionary, we should be able to pass the appropriate Pydantic model as an alternative, and most call sites should do so.
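As a rough illustration of the shape this could take, here is a minimal sketch. It is not the real implementation: the `SuiteCollectionData` and `SuiteItemData` models and their fields, the `SuiteManager` subclass, the `_coerce` helper, and the `create_collection_data`/`create_item_data` method names are all hypothetical, and `CollectionItem` is a plain-Python stand-in for the Django model so that the example runs on its own.

```python
from typing import Any, Dict, Generic, Type, TypeVar, Union

from pydantic import BaseModel, ValidationError


class SuiteCollectionData(BaseModel):
    """Hypothetical data model for one collection category."""

    may_reuse_versions: bool = False


class SuiteItemData(BaseModel):
    """Hypothetical data model for items in that collection."""

    package: str
    version: str


CD = TypeVar("CD", bound=BaseModel)
ID = TypeVar("ID", bound=BaseModel)
M = TypeVar("M", bound=BaseModel)


def _coerce(model: Type[M], data: Union[M, Dict[str, Any]]) -> M:
    """Accept either an instance of the model or a plain dictionary."""
    if isinstance(data, model):
        return data
    return model(**data)  # raises pydantic.ValidationError on bad input


class CollectionManagerInterface(Generic[CD, ID]):
    """Stand-in manager, parameterised by both data models."""

    collection_data_model: Type[CD]
    item_data_model: Type[ID]

    def create_collection_data(self, data: Union[CD, Dict[str, Any]]) -> CD:
        return _coerce(self.collection_data_model, data)

    def create_item_data(self, data: Union[ID, Dict[str, Any]]) -> ID:
        return _coerce(self.item_data_model, data)


class SuiteManager(CollectionManagerInterface[SuiteCollectionData, SuiteItemData]):
    collection_data_model = SuiteCollectionData
    item_data_model = SuiteItemData


class CollectionItem:
    """Plain-Python stand-in for the Django model; only `data` is modelled."""

    def __init__(self, manager: CollectionManagerInterface, data: Any) -> None:
        self.manager = manager
        self.data = data

    def create_data(self) -> BaseModel:
        """Instantiate the Pydantic model that matches this item's data."""
        return self.manager.create_item_data(self.data)

    def clean(self) -> None:
        # A real Django model would raise django.core.exceptions.ValidationError;
        # ValueError is used here only to keep the sketch free of Django.
        try:
            self.create_data()
        except ValidationError as exc:
            raise ValueError(f"invalid collection item data: {exc}") from exc
```

With something like this, a call site could pass `SuiteItemData(package="hello", version="1.0")` directly instead of a dictionary, and a dictionary missing `version` would be rejected by `clean` rather than by ad-hoc checks at each point of use.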

I haven't worked out all the details of this, so some parts of the above sketch may not quite work, but something along these lines should be feasible.
