
PyCDDL: Deserialize CBOR and/or do CDDL schema validation

CDDL is a schema language for the CBOR serialization format. pycddl allows you to:

  • Validate that CBOR documents match a particular CDDL schema, using the Rust cddl library.
  • Optionally, decode CBOR documents.

Usage

Validation

Here we use the cbor2 library to serialize a dictionary to CBOR, and then validate it:

from pycddl import Schema
import cbor2

uint_schema = Schema("""
    object = {
        xint: uint
    }
"""
)
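# Serialize a dict (note the colon). -2 is not a uint, so this call
# raises pycddl.ValidationError.
uint_schema.validate_cbor(cbor2.dumps({"xint": -2}))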

If validation fails, a pycddl.ValidationError is raised.
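
When a failure should be reported rather than propagated, the exception can be caught. The following is a minimal sketch, not part of pycddl: check_cbor is a hypothetical helper name, and it assumes only that the schema object has the validate_cbor() method shown above and that pycddl exposes ValidationError at the top level.

```python
def check_cbor(schema, payload):
    """Return (True, None) if payload matches schema, else (False, message)."""
    # Deferred import (an illustrative choice, not a pycddl requirement):
    # it lets this helper be imported where pycddl is not installed.
    from pycddl import ValidationError

    try:
        schema.validate_cbor(payload)
    except ValidationError as exc:
        # Validation failed; hand the error text back to the caller.
        return False, str(exc)
    return True, None
```

With the uint schema above, check_cbor(uint_schema, cbor2.dumps({"xint": -2})) would be expected to return False plus an error message, since -2 is not a uint.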

Validation + deserialization

You can deserialize CBOR to Python objects using cbor2.loads(). However:

  • cbor2 uses C code by default, and the C programming language is prone to memory safety issues. If you are reading untrusted CBOR, it is safer to decode the data with a Rust library.
  • You will need to parse the CBOR twice, once for validation and once for decoding, adding performance overhead.

By deserializing with pycddl, you solve the first problem, and a future version of pycddl will solve the second problem (see https://gitlab.com/tahoe-lafs/pycddl/-/issues/37).

from pycddl import Schema
import cbor2

uint_schema = Schema("""
    object = {
        xint: uint
    }
"""
)
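# Passing True as the second argument returns the decoded Python object.
# The value must be a valid uint, otherwise validation fails.
deserialized = uint_schema.validate_cbor(cbor2.dumps({"xint": 2}), True)
assert deserialized == {"xint": 2}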

Deserializing without schema validation

If you don't care about schemas, you can just deserialize the CBOR like so:

from pycddl import Schema

ACCEPT_ANYTHING = Schema("main = any")

def loads(encoded_cbor_bytes):
    return ACCEPT_ANYTHING.validate_cbor(encoded_cbor_bytes, True)

In a future release this will become a standalone, more efficient API; see https://gitlab.com/tahoe-lafs/pycddl/-/issues/36

Reducing memory usage and safety constraints

In order to reduce memory usage, you can pass in any Python object that implements the buffer API and stores bytes, e.g. a memoryview() or a mmap object.

The passed-in object must be read-only, and the data must not change during validation! If you mutate the data while validation is happening the result can be memory corruption or other undefined behavior.
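
As an illustration of the above, a file can be validated through a read-only memory map instead of being read into a bytes object first. This is a minimal sketch under stated assumptions: validate_file is a hypothetical helper name, and schema is any object with the validate_cbor() method shown earlier (for example a pycddl Schema).

```python
import mmap


def validate_file(schema, path):
    """Validate the CBOR document stored at path via a read-only mmap."""
    with open(path, "rb") as f:
        # ACCESS_READ makes the mapping read-only, as required above.
        # Note: mmap cannot map an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return schema.validate_cbor(mapped)
```

The mapping being read-only does not stop another process from rewriting the file, so the caller still has to ensure the file is not modified while validation is running.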

Supported CBOR types for deserialization

If you are deserializing a CBOR document into Python objects, you can deserialize:

  • Null/None.
  • Booleans.
  • Floats.
  • Integers up to 64-bit size. Larger integers aren't supported yet.
  • Bytes.
  • Strings.
  • Lists.
  • Maps/dictionaries.
  • Sets.

Other types will be added in the future if there is user demand.

Schema validation is not restricted to this list, but rather is limited by the functionality of the cddl Rust crate.

Release notes

0.6.3

Features:

  • Support the final release of Python 3.13.

0.6.2