These comments were added to enforce using the correct import path for
our packages ("github.com/docker/docker", not "github.com/moby/moby").
However, when working in go module mode (not GOPATH / vendor), they have
no effect, so their impact is limited.
Remove these imports in preparation of migrating our code to become an
actual go module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The runc libcontainer/cgroups package was moved to a separate
module; switch our use of the runc module to use the new
location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes `WritableCgroups` a pointer so we can error when it's specified in invalid configurations (both rootless and user namespaces).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
Fixes#42040Closes#42043
Rather than making cgroups read-write by default, instead have a flag
for making it possible.
Since these security options are passed through the cli to daemon API,
no changes are needed to docker-cli.
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
Second attempt to stop using the OCI prestart hook to call SetKey
to set up the OS Sandbox's key and perform network config in the
new network namespace.
The first attempt was reverted because it made it impossible to
use --sysctl to set per-interface sysctls on an interface that had
not yet been moved into the new network namespace.
Now, per-interface sysctls can be used to do that (with less
ambiguity because the setting is not tied to the interface using
an unpredictably assigned name).
Signed-off-by: Rob Murray <rob.murray@docker.com>
Commit 2ce811e632 migrated the use of the
userns package to the github.com/moby/sys/user module.
After further discussion with maintainers, it was decided to move the
userns package to a separate module, as it has no direct relation with
"user" operations (other than having "user" in its name).
This patch migrates our code to use the new module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use a more distinct name, so that local variables can use it. While
at it, also added GoDoc to describe its functionality.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The same code was used both on Linux and Windows; move it to a platform-
agnostic file so that both can use this function, which contains GoDoc
describing the functionality.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The userns package in libcontainer was integrated into the moby/sys/user
module at commit [3778ae603c706494fd1e2c2faf83b406e38d687d][1].
The userns package is used in many places, and currently either depends
on runc/libcontainer, or on containerd, both of which have a complex
dependency tree. This patch is part of a series of patches to unify the
implementations, and to migrate toward that implementation to simplify
the dependency tree.
[1]: 3778ae603c
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This package is only used by the daemon, so move it to the internal
rootless package instead.
Note that technically this could be in daemon/internal, but as there's
already an existing internal/rootless package (which needs to be in the
top-level internal package because it's also used by /plugin), I'm moving
it there.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `Sandbox.SetKey()` method is called through an OCI prestart hook
which then calls back the daemon through a UNIX socket. This method is
responsible for provisioning interfaces, etc... into the sandbox.
A new EnvironCarrier is used to propagate the trace context to the
prestart hook, which then marhsals an OTel MapCarrier into the JSON
payload sent back to the daemon. That way, every spans created from
`SetKey()` are correctly parented to the original `ContainerStart` API
call.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This adds some nolint-comments for the deprecated kernel-memory options; we
deprecated these, but they could technically still be accepted by alternative
runtimes.
daemon/daemon_unix.go:108:3: SA1019: memory.Kernel is deprecated: kernel-memory limits are not supported in cgroups v2, and were obsoleted in [kernel v5.4]. This field should no longer be used, as it may be ignored by runtimes. (staticcheck)
memory.Kernel = &config.KernelMemory
^
daemon/update_linux.go:63:3: SA1019: memory.Kernel is deprecated: kernel-memory limits are not supported in cgroups v2, and were obsoleted in [kernel v5.4]. This field should no longer be used, as it may be ignored by runtimes. (staticcheck)
memory.Kernel = &resources.KernelMemory
^
Prestart hooks are deprecated, and more granular hooks should be used instead.
CreateRuntime are the closest equivalent, and executed in the same locations
as Prestart-hooks, but depending on what these hooks do, possibly one of the
other hooks could be used instead (such as CreateContainer or StartContainer).
As these hooks are still supported, this patch adds nolint comments, but adds
some TODOs to consider migrating to something else;
daemon/nvidia_linux.go:86:2: SA1019: s.Hooks.Prestart is deprecated: use [Hooks.CreateRuntime], [Hooks.CreateContainer], and [Hooks.StartContainer] instead, which allow more granular hook control during the create and start phase. (staticcheck)
s.Hooks.Prestart = append(s.Hooks.Prestart, specs.Hook{
^
daemon/oci_linux.go:76:5: SA1019: s.Hooks.Prestart is deprecated: use [Hooks.CreateRuntime], [Hooks.CreateContainer], and [Hooks.StartContainer] instead, which allow more granular hook control during the create and start phase. (staticcheck)
s.Hooks.Prestart = append(s.Hooks.Prestart, specs.Hook{
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Partially reverts 0046b16 "daemon: set libnetwork sandbox key w/o OCI hook"
Running SetKey to store the OCI Sandbox key after task creation, rather
than from the OCI prestart hook, meant it happened after sysctl settings
were applied by the runtime - which was the intention, we wanted to
complete Sandbox configuration after IPv6 had been disabled by a sysctl
if that was going to happen.
But, it meant '--sysctl' options for a specfic network interface caused
container task creation to fail, because the interface is only moved into
the network namespace during SetKey.
This change restores the SetKey prestart hook, and regenerates config
files that depend on the container's support for IPv6 after the task has
been created. It also adds a regression test that makes sure it's possible
to set an interface-specfic sysctl.
Signed-off-by: Rob Murray <rob.murray@docker.com>
This code is currently only used in the daemon, but is also needed in other
places. We should consider moving this code to github.com/moby/sys, so that
BuildKit can also use the same implementation instead of maintaining a fork;
moving it to internal allows us to reuse this code inside the repository, but
does not allow external consumers to depend on it (which we don't want as
it's not a permanent location).
As our code only uses this in linux files, I did not add a stub for other
platforms (but we may decide to do that in the moby/sys repository).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prior to this commit, a container running with `--net=host` had
`{"type":"network","path":"/var/run/docker/netns/default"}` in
the ``.linux.namespaces` field of the OCI Runtime Config,
but this wasn't needed.
Close issue 47100
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
The github.com/opencontainers/runc/libcontainer/user package was moved
to a separate module. While there's still uses of the old module in
our code-base, runc itself is migrating to the new module, and deprecated
the old package (for runc 1.2).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.
This patch moves our own uses of the package to use the new module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fix issue 46563 "Rootful-in-Rootless dind doesn't work since systemd v250 (due to oom score adj)"
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
While working on this code, I noticed that there's currently an issue
with userns enabled. When userns is enabled, joining another container's
namespace must also join its user-namespace.
However, a container can only be in a single user namespace, so if a
container joins namespaces from multiple containers, latter user-namespaces
overwrite former ones.
We must add validation for this, but in the meantime, add notes / todo's.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Most error-message returned would already include "container" and the
container ID in the error-message (e.g. "container %s is not running"),
so there's no need to add a custom prefix for that.
- os.Stat returns a PathError, which already includes the operation ("stat"),
the path, and the underlying error that occurred.
And while updating, let's also fix the name to be proper camelCase :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function didn't need the whole container, only its ID, so let's
use that as argument. This also makes it consistent with getIpcContainer.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`Daemon.getPidContainer()` was wrapping the error-message with a message
("cannot join PID of a non running container") that did not reflect the
actual reason for the error; `Daemon.GetContainer()` could either return
an invalid parameter (invalid / empty identifier), or a "not found" error
if the specified container-ID could not be found.
In the latter case, we don't want to return a "not found" error through
the API, as this would indicate that the container we're _starting_ was
not found (which is not the case), so we need to convert the error into
an `errdefs.ErrInvalidParameter` (the container-ID specified for the PID
namespace is invalid if the container doesn't exist).
This logic is similar to what we do for IPC namespaces. which received
a similar fix in c3d7a0c603.
This patch updates the error-types, and moves them into the getIpcContainer
and getPidContainer container functions, both of which should return
an "invalid parameter" if the container was not found.
It's worth noting that, while `WithNamespaces()` may return an "invalid
parameter" error, the `start` endpoint itself may _not_ be. as outlined
in commit bf1fb97575, starting a container
that has an invalid configuration should be considered an internal server
error, and is not an invalid _request_. However, for uses other than
container "start", `WithNamespaces()` should return the correct error
to allow code to handle it accordingly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We were using a mixture of approaches for these; aligning them a bit
to all use switch statements.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The function was checking in a loop if networking for the container was
disabled. Change the function to return early, and to only set hooks
if one needs to be set.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>