The Container.State struct holds the container's state, and most of
its fields are expected to change dynamically. Some o these state-changes
are explicit, for example, setting the container to be "stopped". Other
state changes can be more explicit, for example due to the containers'
process exiting or being "OOM" killed by the kernel.
The distinction between explicit ("desired") state changes and "state"
("actual state") is sometimes vague; for some properties, we clearly
separated them, for example if a user requested the container to be
stopped or restarted, we store state in the Container object itself;
HasBeenManuallyStopped bool // used for unless-stopped restart policy
HasBeenManuallyRestarted bool `json:"-"` // used to distinguish restart caused by restart policy from the manual one
Other properties are more ambiguous. such as "HasBeenStartedBefore" and
"RestartCount", which are stored on the Container (and persisted to
disk), but may be more related to "actual" state, and likely should
not be persisted;
RestartCount int
HasBeenStartedBefore bool
Given that (per the above) concurrency must be taken into account, most
changes to the `container.State` struct should be protected; here's where
things get blurry. While the `State` type provides various accessor methods,
only some of them take concurrency into account; for example, [State.IsRunning]
and [State.GetPID] acquire a lock, whereas [State.ExitCodeValue] does not.
Even the (commonly used) [State.StateString] has no locking at all.
The way to handle this is error-prone; [container.State] contains a mutex,
and it's exported. Given that its embedded in the [container.Container]
struct, it's also exposed as an exported mutex for the container. The
assumption here is that by "merging" the two, the caller to acquire a lock
when either the container _or_ its state must be mutated. However, because
some methods on `container.State` handle their own locking, consumers must
be deeply familiar with the internals; if both changes to the `Container`
AND `Container.State` must be made. This gets amplified more as some
(exported!) methods, such as [container.SetRunning] mutate multiple fields,
but don't acquire a lock (so expect the caller to hold one), but their
(also exported) counterpart (e.g. [State.IsRunning]) do.
It should be clear from the above, that this needs some architectural
changes; a clearer separation between "desired" and "actual" state (opening
the potential to update the container's config without manually touching
its `State`), possibly a method to obtain a read-only copy of the current
state (for those querying state), and reviewing which fields belong where
(and should be persisted to disk, or only remain in memory).
This PR preserves the status quo; it makes no structural changes, other
than exposing where we access the container's state. Where previously the
State fields and methods were referred to as "part of the container"
(e.g. `ctr.IsRunning()` or `ctr.Running`), we now explicitly reference
the embedded `State` (`ctr.State.IsRunning`, `ctr.State.Running`).
The exception (for now) is the mutex, which is still referenced through
the embedded struct (`ctr.Lock()` instead of `ctr.State.Lock()`), as this
is (mostly) by design to protect the container, and what's in it (including
its `State`).
[State.IsRunning]: c4afa77157/daemon/container/state.go (L205-L209)
[State.GetPID]: c4afa77157/daemon/container/state.go (L211-L216)
[State.ExitCodeValue]: c4afa77157/daemon/container/state.go (L218-L228)
[State.StateString]: c4afa77157/daemon/container/state.go (L102-L131)
[container.State]: c4afa77157/daemon/container/state.go (L15-L23)
[container.Container]: c4afa77157/daemon/container/container.go (L67-L75)
[container.SetRunning]: c4afa77157/daemon/container/state.go (L230-L277)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move the option-types to the client and in some cases create a
copy for the backend. These types are used to construct query-
args, and not marshaled to JSON, and can be replaced with functional
options in the client.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "backend" types in API were designed to decouple the API server
implementation from the daemon, or other parts of the code that
back the API server. This would allow the daemon to evolve (e.g.
functionality moved to different subsystems) without that impacting
the API server's implementation.
Now that the API server is no longer part of the API package (module),
there is no benefit to having it in the API module. The API server
may evolve (and require changes in the backend), which has no direct
relation with the API module (types, responses); the backend definition
is, however, coupled to the API server implementation.
It's worth noting that, while "technically" possible to use the API
server package, and implement an alternative backend implementation,
this has never been a prime objective. The backend definition was
never considered "stable", and we don't expect external users to
(attempt) to use it as such.
This patch moves the backend types to the daemon/server package,
so that they can evolve with the daemon and API server implementation
without that impacting the API module (which we intend to be stable,
following SemVer).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These comments were added to enforce using the correct import path for
our packages ("github.com/docker/docker", not "github.com/moby/moby").
However, when working in go module mode (not GOPATH / vendor), they have
no effect, so their impact is limited.
Remove these imports in preparation of migrating our code to become an
actual go module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was decorating errors with the container name, but within its
own context wouldn't be aware how the delete was referenced. This could
result in a container deleted by "ID" to produce an error with the container
Name. Some errors were also decorated before storing as "removalError" on
the container object itself.
The removalError was originally added in f963500c54,
before which the error was returned. Now that it's part of the container's
state itself, adding the container's ID is probably not very useful.
This patch reduces the scope of decorating the errors to the error-condition
itself, leaving it to the caller to decorate them further with the container
ID or Name (if any).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code only needed to know whether the container was paused; for other
states ("restarting", "running"), it's still used to be included in the
error string.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
rmLink already looked up the parent container's ID, so we should not use
daemon.GetContainer to resolve the container, as that performs fuzzy
matching (name, ID-prefix, or ID).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implement containerd image store backed `RWLayer` and remove the
containerd-specific `PrepareSnapshot` method from the ImageService
interface.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Previously the RWLayer reference was cleared without holding the
container lock. This could lead to goroutine panics in various places
that use the container.RWLayer because nil checks introduced in #36242
where not sufficient as the reference could change right before the use.
Fixes#49227
Signed-off-by: Tadeusz Dudkiewicz <tadeusz.dudkiewicz@rtbhouse.com>
The only external consumer are the `graphdriver` and `graphdriver/shim`
packages in github.com/docker/go-plugins-helpers, which depended on
[ContainerFS][1], which was removed in 9ce2b30b81.
graphdriver-plugins were deprecated in 6da604aa6a,
and support for them removed in 555dac5e14,
so removing this should not be an issue.
Ideally this package would've been moved inside `daemon/internal`, but it's used
by the `daemon` (cleanupContainer), `plugin` package, and by `graphdrivers`,
so needs to be in the top-level `internal/` package.
[1]: 6eecb7beb6/graphdriver/api.go (L218)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update to containerd 1.7.18, which now migrated to the errdefs module. The
existing errdefs package is now an alias for the module, and should no longer
be used directly.
This patch:
- updates the containerd dependency: https://github.com/containerd/containerd/compare/v1.7.17...v1.7.18
- replaces uses of the old package in favor of the new module
- adds a linter check to prevent accidental re-introduction of the old package
- adds a linter check to prevent using the "log" package, which was also
migrated to a separate module.
There are still some uses of the old package in (indirect) dependencies,
which should go away over time.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `ContainerCreateConfig` and `ContainerRmConfig` structs are used for
options to be passed to the backend, and are not used in client code.
Thess struct currently is intended for internal use only (for example, the
`AdjustCPUShares` is an internal implementation details to adjust the container's
config when older API versions are used).
Somewhat ironically, the signature of the Backend has a nicer UX than that
of the client's `ContainerCreate` signature (which expects all options to
be passed as separate arguments), so we may want to update that signature
to be closer to what the backend is using, but that can be left as a future
exercise.
This patch moves the `ContainerCreateConfig` and `ContainerRmConfig` structs
to the backend package to prevent it being imported in the client, and to make
it more clear that this is part of internal APIs, and not public-facing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.
This patch moves our own uses of the package to use the new module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Define consts for the Actions we use for events, instead of "ad-hoc" strings.
Having these consts makes it easier to find where specific events are triggered,
makes the events less error-prone, and allows documenting each Action (if needed).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also remove integration-cli: `DockerAPISuite.TestContainerAPIDeleteConflict`,
which was testing the same conditions as `TestRemoveContainerRunning` in
integration/container.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Saw this failure in a flaky test, and I wondered why we consider this an
error condition;
=== RUN TestKillWithStopSignalAndRestartPolicies
main_test.go:32: assertion failed: error is not nil: Error response from daemon: Could not kill running container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7, cannot remove - Container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7 is not running: failed to remove 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7
--- FAIL: TestKillWithStopSignalAndRestartPolicies (0.84s)
=== RUN TestKillWithStopSignalAndRestartPolicies/same-signal-disables-restart-policy
--- PASS: TestKillWithStopSignalAndRestartPolicies/same-signal-disables-restart-policy (0.42s)
=== RUN TestKillWithStopSignalAndRestartPolicies/different-signal-keep-restart-policy
--- PASS: TestKillWithStopSignalAndRestartPolicies/different-signal-keep-restart-policy (0.23s)
In the above;
1. `Error response from daemon: Could not kill running container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7`
2. `cannot remove - Container 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7 is not running`
3. `failed to remove 668f62511f4aa62357269cd405cff1fbe295b7f6d5011e7cfed434e3072330b7`
So it looks like the removal fails because we couldn't kill the container
because it was already stopped, which may be a race condition where the first
check shows the container to be running (but may already be in process to be
removed or killed. In either case, we probably shouldn't fail the removal if
the container is already stopped.
This patch adds a `isNotRunning()` utility, so that we can ignore this case,
and proceed with the removal.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the lease doesn't exit (for example when creating the container
failed), just ignore the not found error.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The containerdCli was somewhat confusing (is it the CLI?); let's rename
to make it match what it is :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.
Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.
Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.
Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
c8d/daemon: Mount root and fill BaseFS
This fixes things that were broken due to nil BaseFS like `docker cp`
and running a container with workdir override.
This is more of a temporary hack than a real solution.
The correct fix would be to refactor the code to make BaseFS and LayerRW
an implementation detail of the old image store implementation and use
the temporary mounts for the c8d implementation instead.
That requires more work though.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
daemon/images: Don't unset BaseFS
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Attempting to delete the directory while another goroutine is
concurrently executing a CheckpointTo() can fail on Windows due to file
locking. As all callers of CheckpointTo() are required to hold the
container lock, holding the lock while deleting the directory ensures
that there will be no interference.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This avoids having to determine what the default is in various
parts of the code. If no custom timeout is passed (nil), the
default will be used.
Also remove the named return variable from cleanupContainer(),
as it wasn't used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We already have this config, so might as well pass it, instead of passing
each option as a separate argument.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- daemon/delete: rename var that collided with import, remove output var
- daemon: fix inconsistent receiver name and package aliases
- daemon/stop: rename imports and variables to standard naming
This is in preparation of some changes, but keeping it in a
separate commit to make review of other changes easier.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/system historically has been a bit of a kitchen-sink of things that were
somewhat "system" related, but didn't have a good place for. EnsureRemoveAll()
is one of those utilities. EnsureRemoveAll() is used to both unmount and remove
a path, for which it depends on both github.com/moby/sys/mount, which in turn
depends on github.com/moby/sys/mountinfo.
pkg/system is imported in the CLI, but neither EnsureRemoveAll(), nor any of its
moby/sys dependencies are used on the client side, so let's move this function
somewhere else, to remove those dependencies from the CLI.
I looked for plausible locations that were related; it's used in:
- daemon
- daemon/graphdriver/XXX/
- plugin
I considered moving it into a (e.g.) "utils" package within graphdriver (but not
a huge fan of "utils" packages), and given that it felt (mostly) related to
cleaning up container filesystems, I decided to move it there.
Some things to follow-up on after this:
- Verify if this function is still needed (it feels a bit like a big hammer in
a "YOLO, let's try some things just in case it fails")
- Perhaps it should be integrated in `containerfs.Remove()` (so that it's used
automatically)
- Look if there's other implementations (and if they should be consolidated),
although (e.g.) the one in containerd is a copy of ours:
https://github.com/containerd/containerd/blob/v1.5.9/pkg/cri/server/helpers_linux.go#L200
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes some of the checks that were added in 0cba7740d4,
but should no longer be needed.
- `Daemon.create()`: fix the error message, which assumed it could only occur on Windows.
- `Daemon.cleanupContainer()`: no need to validate container platform to delete it.
- `Daemon.containerExport`: if a container was created, we should be able to
export it; no need to validate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>