Commit Graph

69 Commits

Author SHA1 Message Date
Sebastiaan van Stijn
0df791cb72 explicitly access Container.State instead of through embedded struct
The Container.State struct holds the container's state, and most of
its fields are expected to change dynamically. Some o these state-changes
are explicit, for example, setting the container to be "stopped". Other
state changes can be more explicit, for example due to the containers'
process exiting or being "OOM" killed by the kernel.

The distinction between explicit ("desired") state changes and "state"
("actual state") is sometimes vague; for some properties, we clearly
separated them, for example if a user requested the container to be
stopped or restarted, we store state in the Container object itself;

    HasBeenManuallyStopped   bool // used for unless-stopped restart policy
    HasBeenManuallyRestarted bool `json:"-"` // used to distinguish restart caused by restart policy from the manual one

Other properties are more ambiguous. such as "HasBeenStartedBefore" and
"RestartCount", which are stored on the Container (and persisted to
disk), but may be more related to "actual" state, and likely should
not be persisted;

    RestartCount             int
    HasBeenStartedBefore     bool

Given that (per the above) concurrency must be taken into account, most
changes to the `container.State` struct should be protected; here's where
things get blurry. While the `State` type provides various accessor methods,
only some of them take concurrency into account; for example, [State.IsRunning]
and [State.GetPID] acquire a lock, whereas [State.ExitCodeValue] does not.
Even the (commonly used) [State.StateString] has no locking at all.

The way to handle this is error-prone; [container.State] contains a mutex,
and it's exported. Given that its embedded in the [container.Container]
struct, it's also exposed as an exported mutex for the container. The
assumption here is that by "merging" the two, the caller to acquire a lock
when either the container _or_ its state must be mutated. However, because
some methods on `container.State` handle their own locking, consumers must
be deeply familiar with the internals; if both changes to the `Container`
AND `Container.State` must be made. This gets amplified more as some
(exported!) methods, such as [container.SetRunning] mutate multiple fields,
but don't acquire a lock (so expect the caller to hold one), but their
(also exported) counterpart (e.g. [State.IsRunning]) do.

It should be clear from the above, that this needs some architectural
changes; a clearer separation between "desired" and "actual" state (opening
the potential to update the container's config without manually touching
its `State`), possibly a method to obtain a read-only copy of the current
state (for those querying state), and reviewing which fields belong where
(and should be persisted to disk, or only remain in memory).

This PR preserves the status quo; it makes no structural changes, other
than exposing where we access the container's state. Where previously the
State fields and methods were referred to as "part of the container"
(e.g. `ctr.IsRunning()` or `ctr.Running`), we now explicitly reference
the embedded `State` (`ctr.State.IsRunning`, `ctr.State.Running`).

The exception (for now) is the mutex, which is still referenced through
the embedded struct (`ctr.Lock()` instead of `ctr.State.Lock()`), as this
is (mostly) by design to protect the container, and what's in it (including
its `State`).

[State.IsRunning]: c4afa77157/daemon/container/state.go (L205-L209)
[State.GetPID]: c4afa77157/daemon/container/state.go (L211-L216)
[State.ExitCodeValue]: c4afa77157/daemon/container/state.go (L218-L228)
[State.StateString]: c4afa77157/daemon/container/state.go (L102-L131)
[container.State]: c4afa77157/daemon/container/state.go (L15-L23)
[container.Container]: c4afa77157/daemon/container/container.go (L67-L75)
[container.SetRunning]: c4afa77157/daemon/container/state.go (L230-L277)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-09-19 16:02:14 +02:00
Derek McGowan
f74e5d48b3 Create github.com/moby/moby/v2 module
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-31 10:13:29 -07:00
Sebastiaan van Stijn
83510a26b3 api/types: move backend types to daemon/server
The "backend" types in API were designed to decouple the API server
implementation from the daemon, or other parts of the code that
back the API server. This would allow the daemon to evolve (e.g.
functionality moved to different subsystems) without that impacting
the API server's implementation.

Now that the API server is no longer part of the API package (module),
there is no benefit to having it in the API module. The API server
may evolve (and require changes in the backend), which has no direct
relation with the API module (types, responses); the backend definition
is, however, coupled to the API server implementation.

It's worth noting that, while "technically" possible to use the API
server package, and implement an alternative backend implementation,
this has never been a prime objective. The backend definition was
never considered "stable", and we don't expect external users to
(attempt) to use it as such.

This patch moves the backend types to the daemon/server package,
so that they can evolve with the daemon and API server implementation
without that impacting the API module (which we intend to be stable,
following SemVer).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-07-28 00:03:04 +02:00
Derek McGowan
afd6487b2e Create github.com/moby/moby/api module
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-21 09:30:05 -07:00
Derek McGowan
5419eb1efc Move container to daemon/container
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-06-27 14:27:21 -07:00
Derek McGowan
0b2582dc8f Move internal/metrics to daemon/internal/metrics
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-06-27 14:25:45 -07:00
Matthieu MOREL
bc9ec5fc02 fix emptyStringTest from go-critic
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-06-07 09:57:59 +02:00
Sebastiaan van Stijn
5318877858 daemon: remove // import comments
These comments were added to enforce using the correct import path for
our packages ("github.com/docker/docker", not "github.com/moby/moby").
However, when working in go module mode (not GOPATH / vendor), they have
no effect, so their impact is limited.

Remove these imports in preparation of migrating our code to become an
actual go module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-05-30 15:59:13 +02:00
Rob Murray
4f2e128378 Merge pull request #49337 from thaJeztah/simplify_health_getshell
daemon: health: getShell: simplify logic (LCOW remnants)
2025-01-27 15:55:56 +00:00
Sebastiaan van Stijn
2197549e4f daemon: health: getShell: simplify logic (LCOW remnants)
This function had some LCOW remnants, where the assumption was made
the only on Windows, the image's OS could potentially not match the
host's OS (see 3e6a13ccb8).

While we currently are not able to run a Windows image on Linux (or
vice versa), this function doesn't have to take into account;

- If a shell is configured; use whatever is configured
- otherwise, use "cmd.exe" for Windows images, and "/bin/sh" otherwise
  (likely Linux, but the existing code did not account for other platforms).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-01-26 15:27:46 +01:00
Sebastiaan van Stijn
d3c0825439 daemon: make daemon.getEntrypointAndArgs a regular function
It was not using the daemon, so can be a regular function. While at it,
also changed the parameter type to accept a regular string-slice, as
we don't need strslice.StrSlice's json.Unmarshaler implementation, and
reversed the logic for the early return.

Finally, for uses where the entrypoint was always nil, this patch removes
the use of this utility altogether.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-01-26 14:37:08 +01:00
Paweł Gronowski
51c2689427 daemon/metrics: Move out to internal/metrics
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2025-01-07 14:13:06 +01:00
Paweł Gronowski
638172417c container: Add ImagePlatform field and deprecate OS
Change the persistent container metadata to store the whole platform
(as defined by OCI) instead of only the operating system.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-11-19 13:55:54 +01:00
Sebastiaan van Stijn
c130ce1f5d api/types: move container Health types to api/types/container
This moves the `Health` and `HealthcheckResult` types to the container package,
as well as the related `NoHealthcheck`, `Starting`, `Healthy`, and `Unhealthy`
consts.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-07-02 12:46:47 +02:00
Sebastiaan van Stijn
a736d0701c Merge pull request #47936 from thaJeztah/api_types_container_types
api/types: move more types to sub-packages
2024-06-10 16:51:49 +02:00
Jack Walker
c514952774 Changed default value of the startInterval to 5s
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Jack Walker <90711509+j2walker@users.noreply.github.com>
2024-06-10 13:23:26 +02:00
Sebastiaan van Stijn
452e134001 api/types: move ExecStartOptions to api/types/backend
It's a type used by the backend, so moving it there.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-06-10 10:19:46 +02:00
Cory Snider
97d32bb7d7 daemon: stop checkpointing health probes to disk
The health status and probe log of containers are not mission-criticial
data which must survive a crash. It is not worth prematrely wearing out
consumer-grade flash storage by overwriting and fsync()ing the container
config on after every probe. Update only the live Container object and
the ViewDB replica on every container health probe instead. It will
eventually get checkpointed along with some other state (or config)
change. Running containers will not be checkpointed on daemon shutdown
when live-restore is enabled, but it does not matter: the health status
and probe log will be zeroed out when the daemon starts back up.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-01-16 14:09:40 -05:00
Brian Goff
02a932d63f Fix case where health start interval is 0 uses default
When the start interval is 0 we should treat that as unset.
This is especially important for older API versions where we reset the
value to 0.

Instead of using the default probe value we should be using the
configured `interval` value (which may be a default as well) which gives
us back the old behavior before support for start interval was added.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-11-02 20:02:16 +00:00
Sebastiaan van Stijn
cff4f20c44 migrate to github.com/containerd/log v0.1.0
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.

This patch moves our own uses of the package to use the new module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-11 17:52:23 +02:00
Sebastiaan van Stijn
0f871f8cb7 api/types/events: define "Action" type and consts
Define consts for the Actions we use for events, instead of "ad-hoc" strings.
Having these consts makes it easier to find where specific events are triggered,
makes the events less error-prone, and allows documenting each Action (if needed).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-29 00:38:08 +02:00
Sebastiaan van Stijn
10a3a3bc49 daemon: inline some variables when emitting events
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-29 00:38:08 +02:00
Sebastiaan van Stijn
a3867992b7 daemon: rename max/min as it collides with go1.21 builtin
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-26 22:02:21 +02:00
Brian Goff
2216d3ca8d Add health start interval
This adds an additional interval to be used by healthchecks during the
start period.
Typically when a container is just starting you want to check if it is
ready more quickly than a typical healthcheck might run. Without this
users have to balance between running healthchecks to frequently vs
taking a very long time to mark a container as healthy for the first
time.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-05 23:44:17 +00:00
Brian Goff
74da6a6363 Switch all logging to use containerd log pkg
This unifies our logging and allows us to propagate logging and trace
contexts together.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-06-24 00:23:44 +00:00
Cory Snider
786c9adaa2 daemon: fix double-unlock in health check probe
Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-22 17:48:21 -04:00
Sebastiaan van Stijn
0670621291 Merge pull request #43997 from thaJeztah/healthcheck_capture_logs
daemon: capture output of killed health checks
2022-09-02 10:48:22 +02:00
Cory Snider
a09f8dbe6e daemon: Maintain container exec-inspect invariant
We have integration tests which assert the invariant that a
GET /containers/{id}/json response lists only IDs of execs which are in
the Running state, according to GET /exec/{id}/json. The invariant could
be violated if those requests were to race the handling of the exec's
task-exit event. The coarse-grained locking of the container ExecStore
when starting an exec task was accidentally synchronizing
(*Daemon).ProcessEvent and (*Daemon).ContainerExecInspect to it just
enough to make it improbable for the integration tests to catch the
invariant violation on execs which exit immediately. Removing the
unnecessary locking made the underlying race condition more likely for
the tests to hit.

Maintain the invariant by deleting the exec from its container's
ExecCommands before clearing its Running flag. Additionally, fix other
potential data races with execs by ensuring that the ExecConfig lock is
held whenever a mutable field is read from or written to.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-08-24 19:35:07 -04:00
Cory Snider
4bafaa00aa Refactor libcontainerd to minimize c8d RPCs
The containerd client is very chatty at the best of times. Because the
libcontained API is stateless and references containers and processes by
string ID for every method call, the implementation is essentially
forced to use the containerd client in a way which amplifies the number
of redundant RPCs invoked to perform any operation. The libcontainerd
remote implementation has to reload the containerd container, task
and/or process metadata for nearly every operation. This in turn
amplifies the number of context switches between dockerd and containerd
to perform any container operation or handle a containerd event,
increasing the load on the system which could otherwise be allocated to
workloads.

Overhaul the libcontainerd interface to reduce the impedance mismatch
with the containerd client so that the containerd client can be used
more efficiently. Split the API out into container, task and process
interfaces which the consumer is expected to retain so that
libcontainerd can retain state---especially the analogous containerd
client objects---without having to manage any state-store inside the
libcontainerd client.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-08-24 14:59:08 -04:00
Cory Snider
0cbb92bcc5 daemon: capture output of killed health checks
Add an integration test to verify that health checks are killed on
timeout and that the output is captured.

Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-08-24 13:59:34 +02:00
Cory Snider
4b84a33217 daemon: kill exec process on ctx cancel
Terminating the exec process when the context is canceled has been
broken since Docker v17.11 so nobody has been able to depend upon that
behaviour in five years of releases. We are thus free from backwards-
compatibility constraints.

Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-08-23 15:35:30 +02:00
Paweł Gronowski
56a20dbc19 container/exec: Support ConsoleSize
Now client have the possibility to set the console size of the executed
process immediately at the creation. This makes a difference for example
when executing commands that output some kind of text user interface
which is bounded by the console dimensions.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2022-06-24 11:54:25 +02:00
Cory Snider
bdc6473d2d health: Start probe timeout after exec starts
Starting an exec can take a significant amount of time while under heavy
container operation load. In extreme cases the time to start the process
can take upwards of a second, which is a significant fraction of the
default health probe timeout (30s). With a shorter timeout, the exec
start delay could make the difference between a successful probe and a
probe timeout! Mitigate the impact of excessive exec start latencies by
only starting the probe timeout timer after the exec'ed process has
started.

Add a metric to sample the latency of starting health-check exec probes.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-04-28 17:21:03 -04:00
Sebastiaan van Stijn
797ec8e913 daemon: rename all receivers to "daemon"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:21 +02:00
Sebastiaan van Stijn
3e6a13ccb8 LCOW: fix using wrong shell for healthchecks
As reported in docker/compose#6445, when deploying a Linux
container on Windows (LCOW), the daemon made the wrong assumption
when deciding which shell to use to execute the healthcheck, looking
at the host's platform instead of the container's platform.

This patch adds a check for the container's platform when deploying
on Windows, and sets the correct shell.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-06-21 13:58:25 +02:00
Brian Goff
eaad3ee3cf Make sure timers are stopped after use.
`time.After` keeps a timer running until the specified duration is
completed. It also allocates a new timer on each call. This can wind up
leaving lots of uneccessary timers running in the background that are
not needed and consume resources.

Instead of `time.After`, use `time.NewTimer` so the timer can actually
be stopped.
In some of these cases it's not a big deal since the duraiton is really
short, but in others it is much worse.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2019-01-16 14:32:53 -08:00
Kir Kolyshkin
7d62e40f7e Switch from x/net/context -> context
Since Go 1.7, context is a standard package. Since Go 1.9, everything
that is provided by "x/net/context" is a couple of type aliases to
types in "context".

Many vendored packages still use x/net/context, so vendor entry remains
for now.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-04-23 13:52:44 -07:00
Daniel Nephin
4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00
Nicolas De Loof
aa6bb5cb69 introduce « exec_die » event
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
2018-01-08 11:42:25 +01:00
Nicolas De Loof
852a943c77 fix #35843 regression on health check workingdir
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
2017-12-20 14:04:51 +01:00
Yong Tang
29d6aef393 Merge pull request #35533 from AliyunContainerService/supress-warning-healthcheck-none
Suppress warning when NONE was set for healthcheck
2017-11-30 11:06:05 -08:00
Li Yi
e987c554c9 Supress warning when NONE was set for healthcheck
Change-Id: I9ebcf49e9e8ac76beb037779ad02ac6020169849
Signed-off-by: Li Yi <denverdino@gmail.com>
2017-11-17 19:43:59 +08:00
Stephen J Day
7db30ab0cd container: protect the health status with mutex
Adds a mutex to protect the status, as well. When running the race
detector with the unit test, we can see that the Status field is written
without holding this lock. Adding a mutex to read and set status
addresses the issue.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-11-16 15:04:01 -08:00
Daniel Nephin
62c1f0ef41 Add deadcode linter
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2017-08-21 18:18:50 -04:00
Daniel Nephin
9b47b7b151 Fix golint errors.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2017-08-18 14:23:44 -04:00
Derek McGowan
1009e6a40b Update logrus to v1.0.1
Fixes case sensitivity issue

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2017-07-31 13:16:46 -07:00
Aaron Lehmann
da28210a15 Merge pull request #33781 from mlaventure/fix-healhcheck-goroutine-leak
Prevent a goroutine leak when healthcheck gets stopped
2017-06-26 15:34:43 -07:00
Kenfe-Mickael Laventure
67297ba005 Prevent a goroutine leak when healthcheck gets stopped
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-06-23 08:06:49 -07:00
Fabio Kung
aacddda89d Move checkpointing to the Container object
Also hide ViewDB behind an inteface.

Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
2017-06-23 07:52:32 -07:00
Fabio Kung
eed4c7b73f keep a consistent view of containers rendered
Replicate relevant mutations to the in-memory ACID store. Readers will
then be able to query container state without locking.

Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
2017-06-23 07:52:31 -07:00