v1.1.0~rc1 -- "He who controls the spice controls the universe."

This release is the first release candidate for the next minor release
following runc 1.0. It contains all of the bugfixes included in runc 1.0
patch releases (up to and including 1.0.3).

A fair few new features have been added, and several features have been
deprecated (with plans for removal in runc 1.2).

At the moment we only plan to do a single release candidate for runc
1.1, and once 1.1.0 is released we will not continue updating the 1.0.z
runc branch.

Deprecated:

* runc run/start now warns if a new container cgroup is non-empty or
  frozen; this warning will become an error in runc 1.2. (#3132, #3223)
* runc can only be built with Go 1.16 or later from this release
  onwards. (#3100, #3245)

Removed:

* `cgroup.GetHugePageSizes` has been removed entirely, and has been
  replaced with `cgroup.HugePageSizes`, which is more efficient. (#3234)
* `intelrdt.GetIntelRdtPath` has been removed. Users who were using this
  function to get the intelrdt root should use the new `intelrdt.Root`
  instead. (#2920, #3239)

Added:

* Add support for the RDMA cgroup added in Linux 4.11. (#2883)
* runc exec now produces an exit code of 255 when the exec failed. This
  may help in distinguishing between runc exec failures (such as invalid
  options, a non-running container, or a non-existent binary) and
  failures of the command being executed. (#3073)
* runc run: new `--keep` option to skip removal of exited containers'
  artefacts. This might be useful to check the state (e.g. of cgroup
  controllers) after the container has exited. (#2817, #2825)
* seccomp: add support for `SCMP_ACT_KILL_PROCESS` and
  `SCMP_ACT_KILL_THREAD` (the latter is just an alias for
  `SCMP_ACT_KILL`). (#3204)
* seccomp: add support for `SCMP_ACT_NOTIFY` (seccomp actions). This
  allows users to create sophisticated seccomp filters where syscalls
  can be efficiently emulated by privileged processes on the host.
  (#2682)
* checkpoint/restore: add an option (`--lsm-mount-context`) to set a
  different LSM mount context on restore. (#3068)
* runc releases are now cross-compiled for several architectures. Static
  builds for said architectures will be available for all future
  releases. (#3197)
* intelrdt: support ClosID parameter. (#2920)
* runc exec --cgroup: an option to specify a (non-top) in-container
  cgroup to use for the process being executed. (#3040, #3059)
* cgroup v1 controllers now support hybrid hierarchy (i.e. when on a
  cgroup v1 machine a cgroup2 filesystem is mounted to
  /sys/fs/cgroup/unified, runc run/exec now adds the container to the
  appropriate cgroup under it). (#2087, #3059)
* sysctl: allow slashes in sysctl names, to better match `sysctl(8)`'s
  behaviour. (#3254, #3257)
* mounts: add support for bind-mounts which are inaccessible after
  switching the user namespace. Note that this does not permit the
  container any additional access to the host filesystem; it simply
  allows containers to have bind-mounts configured for paths the user
  can access but which have restrictive access control settings for
  other users. (#2576)
* Add support for recursive mount attributes using `mount_setattr(2)`.
  These have the same names as the proposed `mount(8)` options -- just
  prepend `r` to the option name (such as `rro`). (#3272)
* Add `runc features` subcommand to allow runc users to detect what
  features runc has been built with. This includes critical information
  such as supported mount flags, hook names, and so on. Note that the
  output of this command is subject to change and will not be considered
  stable until runc 1.2 at the earliest. The runtime-spec specification
  for this feature is being developed in
  opencontainers/runtime-spec#1130. (#3296)

Changed:

* system: improve performance of `/proc/$pid/stat` parsing.
  (#2696)
* cgroup2: when `/sys/fs/cgroup` is configured as a read-write mount,
  change the ownership of certain cgroup control files (as per
  `/sys/kernel/cgroup/delegate`) to allow for proper delegation to the
  container process. (#3057)
* docs: series of improvements to man pages to make them easier to read
  and use. (#3032)

Libcontainer API:

* internal api: remove internal error types and handling system, switch
  to Go wrapped errors. (#3033)
* New configs.Cgroup structure fields (#3177):
  * Systemd (whether to use systemd cgroup manager); and
  * Rootless (whether to use rootless cgroups).
* New cgroups/manager package aiming to simplify cgroup manager
  instantiation. (#3177)
* All cgroup managers' instantiation methods now initialize cgroup paths
  and can return errors. This allows any cgroup manager method (e.g.
  Exists, Destroy, Set, GetStats) to be used right after instantiation,
  which was not possible before (as paths were initialized in Apply
  only). (#3178)

Fixed:

* nsenter: do not try to close already-closed fds during container setup
  and bail on close(2) failures. (#3058)
* runc checkpoint/restore: fixed for containers with an external bind
  mount whose destination is a symlink. (#3047)
* cgroup: improve openat2 handling for cgroup directory handle
  hardening. (#3030)
* `runc delete -f` now succeeds (rather than timing out) on a paused
  container. (#3134)
* runc run/start/exec now refuses a frozen cgroup (paused container in
  case of exec). Users can disable this using `--ignore-paused`.
  (#3132, #3223)
* config: do not permit null bytes in mount fields.
  (#3287)

Thanks to the following people who made this release possible:

* Adrian Reber <areber@redhat.com>
* Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
* Alban Crequy <alban@kinvolk.io>
* Aleksa Sarai <cyphar@cyphar.com>
* Dave Chen <dave.chen@arm.com>
* flouthoc <flouthoc.git@gmail.com>
* Fraser Tweedale <ftweedal@redhat.com>
* Itamar Holder <iholder@redhat.com>
* Kailun Qin <kailun.qin@intel.com>
* Kang Chen <kongchen28@gmail.com>
* Kir Kolyshkin <kolyshkin@gmail.com>
* lifubang <lifubang@acmcoder.com>
* Liu Hua <weldonliu@tencent.com>
* Maksim An <maksiman@microsoft.com>
* Markus Lehtonen <markus.lehtonen@intel.com>
* Mauricio Vásquez <mauricio@kinvolk.io>
* Mengjiao Liu <mengjiao.liu@daocloud.io>
* Mrunal Patel <mrunal@me.com>
* Neil Johnson <najohnsn@gmail.com>
* Odin Ugedal <odin@uged.al>
* Piotr Resztak <piotr.resztak@gmail.com>
* Qiang Huang <h.huangqiang@huawei.com>
* Rodrigo Campos <rodrigo@kinvolk.io>
* Sascha Grunert <sgrunert@redhat.com>
* Sebastiaan van Stijn <github@gone.nl>
* Shengjing Zhu <zhsj@debian.org>
* xiadanni <xiadanni1@huawei.com>

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
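
Illustration for the sysctl entry under "Added" (#3254, #3257): the sketch below shows one way a slash-separated sysctl name could be normalised to the dotted form, assuming the `sysctl(8)` convention that the first separator in the name decides which style is in use. The helper name `normalizeSysctlKey` is hypothetical and is not claimed to be runc's actual API or implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeSysctlKey is a hypothetical helper (not part of runc's API).
// Assumption: as with sysctl(8), the first separator found in the name
// decides the style. If it is '/', every '/' becomes '.' and every '.'
// becomes '/', so that names such as interface "eth0.100" survive intact.
func normalizeSysctlKey(key string) string {
	first := strings.IndexAny(key, "./")
	if first == -1 || key[first] == '.' {
		// No separator at all, or already dot-separated: leave unchanged.
		return key
	}
	swap := func(r rune) rune {
		switch r {
		case '/':
			return '.'
		case '.':
			return '/'
		}
		return r
	}
	return strings.Map(swap, key)
}

func main() {
	keys := []string{
		"net.ipv4.ip_forward",
		"net/ipv4/ip_forward",
		"net/ipv4/conf/eth0.100/forwarding",
	}
	for _, k := range keys {
		fmt.Printf("%s -> %s\n", k, normalizeSysctlKey(k))
	}
}
```

Swapping both separators, rather than only replacing slashes, keeps names with an embedded dot (such as a VLAN interface) unambiguous in either notation.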