182 Commits

Author SHA1 Message Date
Sebastiaan van Stijn
0df791cb72 explicitly access Container.State instead of through embedded struct
The Container.State struct holds the container's state, and most of
its fields are expected to change dynamically. Some o these state-changes
are explicit, for example, setting the container to be "stopped". Other
state changes can be more explicit, for example due to the containers'
process exiting or being "OOM" killed by the kernel.

The distinction between explicit ("desired") state changes and "state"
("actual state") is sometimes vague; for some properties, we clearly
separated them, for example if a user requested the container to be
stopped or restarted, we store state in the Container object itself;

    HasBeenManuallyStopped   bool // used for unless-stopped restart policy
    HasBeenManuallyRestarted bool `json:"-"` // used to distinguish restart caused by restart policy from the manual one

Other properties are more ambiguous. such as "HasBeenStartedBefore" and
"RestartCount", which are stored on the Container (and persisted to
disk), but may be more related to "actual" state, and likely should
not be persisted;

    RestartCount             int
    HasBeenStartedBefore     bool

Given that (per the above) concurrency must be taken into account, most
changes to the `container.State` struct should be protected; here's where
things get blurry. While the `State` type provides various accessor methods,
only some of them take concurrency into account; for example, [State.IsRunning]
and [State.GetPID] acquire a lock, whereas [State.ExitCodeValue] does not.
Even the (commonly used) [State.StateString] has no locking at all.

The way to handle this is error-prone; [container.State] contains a mutex,
and it's exported. Given that its embedded in the [container.Container]
struct, it's also exposed as an exported mutex for the container. The
assumption here is that by "merging" the two, the caller to acquire a lock
when either the container _or_ its state must be mutated. However, because
some methods on `container.State` handle their own locking, consumers must
be deeply familiar with the internals; if both changes to the `Container`
AND `Container.State` must be made. This gets amplified more as some
(exported!) methods, such as [container.SetRunning] mutate multiple fields,
but don't acquire a lock (so expect the caller to hold one), but their
(also exported) counterpart (e.g. [State.IsRunning]) do.

It should be clear from the above, that this needs some architectural
changes; a clearer separation between "desired" and "actual" state (opening
the potential to update the container's config without manually touching
its `State`), possibly a method to obtain a read-only copy of the current
state (for those querying state), and reviewing which fields belong where
(and should be persisted to disk, or only remain in memory).

This PR preserves the status quo; it makes no structural changes, other
than exposing where we access the container's state. Where previously the
State fields and methods were referred to as "part of the container"
(e.g. `ctr.IsRunning()` or `ctr.Running`), we now explicitly reference
the embedded `State` (`ctr.State.IsRunning`, `ctr.State.Running`).

The exception (for now) is the mutex, which is still referenced through
the embedded struct (`ctr.Lock()` instead of `ctr.State.Lock()`), as this
is (mostly) by design to protect the container, and what's in it (including
its `State`).

[State.IsRunning]: c4afa77157/daemon/container/state.go (L205-L209)
[State.GetPID]: c4afa77157/daemon/container/state.go (L211-L216)
[State.ExitCodeValue]: c4afa77157/daemon/container/state.go (L218-L228)
[State.StateString]: c4afa77157/daemon/container/state.go (L102-L131)
[container.State]: c4afa77157/daemon/container/state.go (L15-L23)
[container.Container]: c4afa77157/daemon/container/container.go (L67-L75)
[container.SetRunning]: c4afa77157/daemon/container/state.go (L230-L277)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-09-19 16:02:14 +02:00
Sebastiaan van Stijn
082b4e8d77 client: move ExecOptions to client
- move api/types/container.ExecOptions to the client
- rename api/types/container.ExecOptions to ExecCreateRequest

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-09-15 17:37:47 +02:00
Sebastiaan van Stijn
e099f1e409 daemon: Daemon.ContainerExecStart: fix typo in log field
Changing it to `execID`, which is what's used in most/all other places.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-09-03 11:34:21 +02:00
Derek McGowan
f74e5d48b3 Create github.com/moby/moby/v2 module
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-31 10:13:29 -07:00
Sebastiaan van Stijn
83510a26b3 api/types: move backend types to daemon/server
The "backend" types in API were designed to decouple the API server
implementation from the daemon, or other parts of the code that
back the API server. This would allow the daemon to evolve (e.g.
functionality moved to different subsystems) without that impacting
the API server's implementation.

Now that the API server is no longer part of the API package (module),
there is no benefit to having it in the API module. The API server
may evolve (and require changes in the backend), which has no direct
relation with the API module (types, responses); the backend definition
is, however, coupled to the API server implementation.

It's worth noting that, while "technically" possible to use the API
server package, and implement an alternative backend implementation,
this has never been a prime objective. The backend definition was
never considered "stable", and we don't expect external users to
(attempt) to use it as such.

This patch moves the backend types to the daemon/server package,
so that they can evolve with the daemon and API server implementation
without that impacting the API module (which we intend to be stable,
following SemVer).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-07-28 00:03:04 +02:00
Derek McGowan
afd6487b2e Create github.com/moby/moby/api module
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-21 09:30:05 -07:00
Derek McGowan
5419eb1efc Move container to daemon/container
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-06-27 14:27:21 -07:00
Derek McGowan
a02ba3c7df Move container/stream to daemon/internal/stream
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-06-27 14:27:05 -07:00
Matthieu MOREL
bc9ec5fc02 fix emptyStringTest from go-critic
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-06-07 09:57:59 +02:00
Sebastiaan van Stijn
5318877858 daemon: remove // import comments
These comments were added to enforce using the correct import path for
our packages ("github.com/docker/docker", not "github.com/moby/moby").
However, when working in go module mode (not GOPATH / vendor), they have
no effect, so their impact is limited.

Remove these imports in preparation of migrating our code to become an
actual go module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-05-30 15:59:13 +02:00
Sebastiaan van Stijn
07466d2e9b daemon: Daemon.ContainerExecStart: rename err-return, and minor refactor
- rename the error-return to prevent accidental shadowing
- remove some intermediate variables
- usee a struct-literal for specs.Process
- optimize logging-code to not use chained "WithField"
- remove punctuation from error-message

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-05-15 22:09:56 +02:00
Sebastiaan van Stijn
b54a038bec docker exec: fail early on exec create if specified user doesn't exist
Before this patch, and error would be produced when starting the exec,
but the CLI would wait for the exec to complete, timing out after 10
seconds (default). With this change, an error is returned immediately
when creating the exec.

Note that "technically" this check may have some TOCTOU issues, because
'/etc/passwd' and '/etc/groups' may be mutated by the container in between
creating the exec and starting it.

This is very likely a corner-case, but something we can consider changing
in future (either allow creating an invalid exec, and checking before
starting, or checking both before create and before start).

With this patch:

    printf 'FROM alpine\nRUN rm -f /etc/group' | docker build -t nogroup -
    ID=$(docker run -dit nogroup)

    time docker exec -u 0:root $ID echo hello
    Error response from daemon: unable to find group root: no matching entries in group file

    real	0m0.014s
    user	0m0.010s
    sys	0m0.003s

    # numericc uid/gid (should not require lookup);
    time docker exec -u 0:0 $ID echo hello
    hello

    real	0m0.059s
    user	0m0.007s
    sys	0m0.008s

    # no user specified (should not require lookup);
    time docker exec $ID echo hello
    hello

    real	0m0.057s
    user	0m0.013s
    sys	0m0.008s

    docker rm -fv $ID

    # container that does have a valid /etc/groups

    ID=$(docker run -dit alpine)
    time docker exec -u 0:root $ID echo hello
    hello

    real	0m0.063s
    user	0m0.010s
    sys	0m0.009s

    # non-existing user or group
    time docker exec -u 0:blabla $ID echo hello
    Error response from daemon: unable to find group blabla: no matching entries in group file

    real	0m0.013s
    user	0m0.004s
    sys	0m0.009s

    docker rm -fv $ID

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-04-25 15:24:00 +02:00
Sebastiaan van Stijn
d3c0825439 daemon: make daemon.getEntrypointAndArgs a regular function
It was not using the daemon, so can be a regular function. While at it,
also changed the parameter type to accept a regular string-slice, as
we don't need strslice.StrSlice's json.Unmarshaler implementation, and
reversed the logic for the early return.

Finally, for uses where the entrypoint was always nil, this patch removes
the use of this utility altogether.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-01-26 14:37:08 +01:00
Derek McGowan
0aa8fe0bf9 Update to containerd v2.0.2, buildkit v0.19.0-rc2
Update buildkit version to commit which uses 2.0

Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-01-15 14:09:30 +01:00
Sebastiaan van Stijn
5c48736863 remove redundant alias for runtime-spec
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-10-26 18:31:39 +02:00
Laura Brehm
c866a7e5f8 daemon/exec: don't overwrite exit code if set
If we fail to start an exec, the deferred error-handling block in [L181-L193](c7e42d855e/daemon/exec.go (L181-L193))
would set the exit code to `126` (`EACCES`). However, if we get far enough along
attempting to start the exec, we set the exit code according to the error returned
from starting the task [L288-L291](c7e42d855e/daemon/exec.go (L288-L291)).

For some situations (such as `docker exec [some-container]
missing-binary`), the 2nd block returns the correct exit code (`127`)
but that then gets overwritten by the 1st block.

This commit changes that logic to only set the default exit code `126`
if the exit code has not been set yet.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
2024-09-30 15:49:04 +01:00
Sebastiaan van Stijn
452e134001 api/types: move ExecStartOptions to api/types/backend
It's a type used by the backend, so moving it there.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-06-10 10:19:46 +02:00
Sebastiaan van Stijn
cd76e3e7f8 api/types: move ExecConfig to api/types/container
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-06-10 10:19:46 +02:00
Sebastiaan van Stijn
cff4f20c44 migrate to github.com/containerd/log v0.1.0
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.

This patch moves our own uses of the package to use the new module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-11 17:52:23 +02:00
Sebastiaan van Stijn
0f871f8cb7 api/types/events: define "Action" type and consts
Define consts for the Actions we use for events, instead of "ad-hoc" strings.
Having these consts makes it easier to find where specific events are triggered,
makes the events less error-prone, and allows documenting each Action (if needed).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-29 00:38:08 +02:00
Sebastiaan van Stijn
10a3a3bc49 daemon: inline some variables when emitting events
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-29 00:38:08 +02:00
Sebastiaan van Stijn
034e7e153f daemon: rename containerdCli to containerdClient
The containerdCli was somewhat confusing (is it the CLI?); let's rename
to make it match what it is :)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-18 13:57:27 +02:00
Brian Goff
74da6a6363 Switch all logging to use containerd log pkg
This unifies our logging and allows us to propagate logging and trace
contexts together.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-06-24 00:23:44 +00:00
Cory Snider
d222bf097c daemon: reload runtimes w/o breaking containers
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Sebastiaan van Stijn
42f1be8030 daemon: translateContainerdStartErr(): rename to setExitCodeFromError()
This should hopefully make it slightly clearer what it does.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-12-28 09:27:42 +01:00
Sebastiaan van Stijn
2cf09c5446 daemon: translateContainerdStartErr(): remove unused cmd argument
This argument was no longer used since commit 225e046d9d

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-12-28 09:27:41 +01:00
Cory Snider
a09f8dbe6e daemon: Maintain container exec-inspect invariant
We have integration tests which assert the invariant that a
GET /containers/{id}/json response lists only IDs of execs which are in
the Running state, according to GET /exec/{id}/json. The invariant could
be violated if those requests were to race the handling of the exec's
task-exit event. The coarse-grained locking of the container ExecStore
when starting an exec task was accidentally synchronizing
(*Daemon).ProcessEvent and (*Daemon).ContainerExecInspect to it just
enough to make it improbable for the integration tests to catch the
invariant violation on execs which exit immediately. Removing the
unnecessary locking made the underlying race condition more likely for
the tests to hit.

Maintain the invariant by deleting the exec from its container's
ExecCommands before clearing its Running flag. Additionally, fix other
potential data races with execs by ensuring that the ExecConfig lock is
held whenever a mutable field is read from or written to.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-08-24 19:35:07 -04:00
Cory Snider
4bafaa00aa Refactor libcontainerd to minimize c8d RPCs
The containerd client is very chatty at the best of times. Because the
libcontained API is stateless and references containers and processes by
string ID for every method call, the implementation is essentially
forced to use the containerd client in a way which amplifies the number
of redundant RPCs invoked to perform any operation. The libcontainerd
remote implementation has to reload the containerd container, task
and/or process metadata for nearly every operation. This in turn
amplifies the number of context switches between dockerd and containerd
to perform any container operation or handle a containerd event,
increasing the load on the system which could otherwise be allocated to
workloads.

Overhaul the libcontainerd interface to reduce the impedance mismatch
with the containerd client so that the containerd client can be used
more efficiently. Split the API out into container, task and process
interfaces which the consumer is expected to retain so that
libcontainerd can retain state---especially the analogous containerd
client objects---without having to manage any state-store inside the
libcontainerd client.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-08-24 14:59:08 -04:00
Cory Snider
b75246202a Stop locking container exec store while starting
The daemon.containerd.Exec call does not access or mutate the
container's ExecCommands store in any way, and locking the exec config
is sufficient to synchronize with the event-processing loop. Locking
the ExecCommands store while starting the exec process only serves to
block unrelated operations on the container for an extended period of
time.

Convert the Store struct's mutex to an unexported field to prevent this
from regressing in the future.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-08-24 14:59:07 -04:00
Cory Snider
4b84a33217 daemon: kill exec process on ctx cancel
Terminating the exec process when the context is canceled has been
broken since Docker v17.11 so nobody has been able to depend upon that
behaviour in five years of releases. We are thus free from backwards-
compatibility constraints.

Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-08-23 15:35:30 +02:00
Paweł Gronowski
56a20dbc19 container/exec: Support ConsoleSize
Now client have the possibility to set the console size of the executed
process immediately at the creation. This makes a difference for example
when executing commands that output some kind of text user interface
which is bounded by the console dimensions.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2022-06-24 11:54:25 +02:00
Sebastiaan van Stijn
2ec2b65e45 libcontainerd: SignalProcess(): accept syscall.Signal
This helps reducing some type-juggling / conversions further up
the stack.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-05 00:53:49 +02:00
Sebastiaan van Stijn
28409ca6c7 replace pkg/signal with moby/sys/signal v0.5.0
This code was moved to the moby/sys repository

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-23 09:32:54 +02:00
Sebastiaan van Stijn
a432eb4b3a ContainerExecStart(): don't wrap getExecConfig() errors, and prevent panic
daemon.getExecConfig() already returns typed errors; by wrapping those errors
we may loose the actual reason for failures. Changing the error-type was
originally added in 2d43d93410, but I think
it was not intentional to ignore already-typed errors. It was later refactored
in a793564b25, which added helper functions
to create these errors, but kept the same behavior.

Also adds error-handling to prevent a panic in situations where (although
unlikely) `daemon.containers.Get()` would not return a container.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-22 13:37:05 +01:00
Sebastiaan van Stijn
6eb5720233 Fix daemon.getExecConfig(): not using typed errNotRunning() error
This makes daemon.getExecConfig return a errdefs.Conflict() error if the
container is not running.

This was originally the case, but a refactor of this code changed the typed
error (`derr.ErrorCodeContainerNotRunning`) to a non-typed error;
a793564b25

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-22 13:37:03 +01:00
Sebastiaan van Stijn
8312004f41 remove uses of deprecated pkg/term
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-21 16:29:27 +02:00
Sebastiaan van Stijn
eb14d936bf daemon: rename variables that collide with imported package names
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:23 +02:00
Sebastiaan van Stijn
797ec8e913 daemon: rename all receivers to "daemon"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:21 +02:00
Sebastiaan van Stijn
07ff4f1de8 goimports: fix imports
Format the source according to latest goimports.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-18 12:56:54 +02:00
Sebastiaan van Stijn
c85fe2d224 Merge pull request #38522 from cpuguy83/fix_timers
Make sure timers are stopped after use.
2019-06-07 13:16:46 +02:00
Michael Crosby
7603c22c73 Use original process spec for execs
Fixes #38865

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-03-21 15:41:53 -04:00
Brian Goff
eaad3ee3cf Make sure timers are stopped after use.
`time.After` keeps a timer running until the specified duration is
completed. It also allocates a new timer on each call. This can wind up
leaving lots of uneccessary timers running in the background that are
not needed and consume resources.

Instead of `time.After`, use `time.NewTimer` so the timer can actually
be stopped.
In some of these cases it's not a big deal since the duraiton is really
short, but in others it is much worse.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2019-01-16 14:32:53 -08:00
David Wang
e6783656f9 Fix race condition between exec start and resize
Signed-off-by: David Wang <00107082@163.com>
2018-06-08 11:07:48 +08:00
Sebastiaan van Stijn
f23c00d870 Various code-cleanup
remove unnescessary import aliases, brackets, and so on.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-05-23 17:50:54 +02:00
Brian Goff
6433683887 Merge pull request #36815 from allencloud/simplify-ode
refactor: simplify code to make function getExecConfig  more readable
2018-05-11 10:06:33 -04:00
Kir Kolyshkin
7d62e40f7e Switch from x/net/context -> context
Since Go 1.7, context is a standard package. Since Go 1.9, everything
that is provided by "x/net/context" is a couple of type aliases to
types in "context".

Many vendored packages still use x/net/context, so vendor entry remains
for now.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-04-23 13:52:44 -07:00
Allen Sun
a637939981 refactor: simplify code to make function more readable
Signed-off-by: Allen Sun <shlallen1990@gmail.com>
2018-04-09 15:50:20 +08:00
Stephen J Day
d84da75f01 daemon: use context error rather than inventing new one
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2018-03-22 09:38:59 -07:00
Daniel Nephin
4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00