Add nil check before calling EventsService.Close() to prevent panic
when daemon.EventsService is not initialized during shutdown.
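A minimal sketch of the guard (the surrounding shutdown code is assumed):

    // Guard against an uninitialized events service during shutdown.
    if daemon.EventsService != nil {
        daemon.EventsService.Close()
    }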
Signed-off-by: jinda.ljd <jinda.ljd@alibaba-inc.com>
Stopping the Engine while a container with autoremove set is running may
leave behind dead containers on disk. These containers aren't reclaimed
on next start, appear as "dead" in `docker ps -a` and can't be
inspected or removed by the user.
This bug has existed for a long time but became user visible with
9f5f4f5a42. Prior to that commit,
containers with no rwlayer weren't added to the in-memory viewdb, so
they weren't visible in `docker ps -a`. However, some dangling files
would still live on disk (e.g. folder in /var/lib/docker/containers,
mount points, etc).
The underlying issue is that when the daemon stops, it tries to stop all
running containers and then closes the containerd client. This leaves a
small window of time where the Engine might receive 'task stop' events
from containerd, and trigger autoremove. If the containerd client is
closed in parallel, the Engine is unable to complete the removal,
leaving the container in 'dead' state. In such a case, the Engine logs the
following error:
cannot remove container "bcbc98b4f5c2b072eb3c4ca673fa1c222d2a8af00bf58eae0f37085b9724ea46": Canceled: grpc: the client connection is closing: context canceled
Solving the underlying issue would require complex changes to the
shutdown sequence. Moreover, the same issue could also happen if the
daemon crashes while it deletes a container. Thus, add a cleanup step
on daemon startup to remove these dead containers.
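A rough sketch of that cleanup step (helper names are illustrative, not the actual daemon API):

    // On startup, remove containers left "dead" by an interrupted
    // autoremove so they don't linger on disk.
    for _, ctr := range containers {
        if ctr.Dead && ctr.HostConfig.AutoRemove {
            if err := removeContainer(ctr.ID); err != nil {
                log.Warnf("failed to clean up dead container %s: %v", ctr.ID, err)
            }
        }
    }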
Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
Replace WithDialOpts with WithExtraDialOpts when creating containerd
clients to preserve the containerd client's default dial options while
adding our custom options.
Previously, using WithDialOpts would overwrite all of containerd's
default dial options, requiring us to sync them.
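A before/after sketch; customOpts stands in for our []grpc.DialOption slice, and WithExtraDialOpts is assumed to take the same arguments as WithDialOpts:

    // Before: replaces all of containerd's default dial options.
    client, err := containerd.New(addr, containerd.WithDialOpts(customOpts))

    // After: appends to containerd's defaults instead of replacing them.
    client, err := containerd.New(addr, containerd.WithExtraDialOpts(customOpts))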
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Return an error if the containerd snapshotter is enabled on Windows but
containerd has not been configured. This fixes a panic in this case when
trying to use an uninitialized client.
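A sketch of the guard (config field and feature names are illustrative):

    // Fail fast at startup instead of panicking later on a nil client.
    if conf.Features["containerd-snapshotter"] && conf.ContainerdAddr == "" {
        return errors.New("containerd image store requires containerd to be configured")
    }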
Signed-off-by: Derek McGowan <derek@mcg.dev>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The call from Daemon.create -> Daemon.setHostConfig acquired
container.Lock, but didn't need to because the container is
newly created and solely owned by the caller. The call from
Daemon.restore did not acquire the lock.
Signed-off-by: Rob Murray <rob.murray@docker.com>
The priority order for determining image store choice was incorrect when
a prior graphdriver existed.
The issue occurred because the prior graphdriver check happened after
processing explicit driver configuration, effectively ignoring user
intent when prior state existed.
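The corrected precedence, sketched with illustrative variable names:

    // Explicit configuration must win over prior on-disk state.
    switch {
    case explicitDriver != "": // user intent, even when prior state exists
        driver = explicitDriver
    case priorDriver != "": // reuse what's already on disk
        driver = priorDriver
    default:
        driver = defaultDriver
    }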
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This workaround was added in df519e9e1a, pending
a fix in containerd;
> daemon: Fix giving up too early while connecting to containerd socket
>
> Explicitly set the gRPC connection params to take the timeout into
> account to workaround the containerd v2 client not passing down the
> stack.
>
> containerd v2 replaced usages of deprecated gRPC functions but didn't
> pass the timeout to the actual dial connection options.
A fix for this was merged in [containerd@ee574e7], which is part of containerd
v2.1.0-beta.0, and backported to containerd v2.0.4 through [containerd@6b5efba].
[containerd@ee574e7]: ee574e76e7
[containerd@6b5efba]: 6b5efba83b
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Introduced a DefaultIsolation method in the Daemon to return the daemon-configured isolation mode for Windows.
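A sketch of what such an accessor looks like (the field name is an assumption):

    // DefaultIsolation returns the daemon-wide default isolation mode
    // configured for Windows containers.
    func (daemon *Daemon) DefaultIsolation() containertypes.Isolation {
        return daemon.defaultIsolation
    }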
Signed-off-by: Vigilans <vigilans@foxmail.com>
On daemon shutdown, the HTTP server tries to gracefully shutdown for 5
seconds. If there's an open API connection to the '/events' endpoint, it
fails to do so as nothing interrupts that connection, thus forcing the
daemon to wait until that timeout is reached.
Add a Close method to the EventsService, and call it during daemon
shutdown. It closes all open event channels, signaling to the '/events'
handler to return and close the connection.
It now takes ~1s (or less) to shut down the daemon when there's an active
'/events' connection, instead of 5s.
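A sketch of such a Close method (field names are assumptions):

    // Close closes every subscriber channel, causing the '/events'
    // handler to return and the HTTP connection to be closed.
    func (e *Events) Close() {
        e.mu.Lock()
        defer e.mu.Unlock()
        for _, ch := range e.subscribers {
            close(ch)
        }
        e.subscribers = nil
    }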
Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
commit a69abdd90d introduced a "verbose"
option for the disk-usage endpoint, which allowed the detailed items
to be omitted from the results.
However, it did not take into account that a singleflight is used to
allow sharing the results between requests; this means that a request
made while another request is already in flight could share the wrong
results, receiving either the "verbose" or "non-verbose" variant
depending on the request already in flight.
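One way to avoid that, sketched here (the key scheme is illustrative): make the singleflight key depend on the requested options, so differing requests are never coalesced.

    key := fmt.Sprintf("diskusage-verbose=%t", verbose)
    res, err, _ := group.Do(key, func() (interface{}, error) {
        return daemon.diskUsage(ctx, verbose)
    })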
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This change adds type-specific fields to the `GET /system/df` endpoint with
high-level disk-usage information. It also introduces a `verbose` query
parameter, so that detailed information is excluded by default unless
explicitly requested, reducing memory consumption. The previous top-level
`DiskUsage` fields (`Images`, `Containers`, `Volumes`, and `BuildCache`) are
now deprecated and kept for backwards compatibility.
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Austin Vazquez <austin.vazquez@docker.com>
Defer the logic to fill in the container platform information from the
image service until container restore. During container restore the
image backend is fully initialized and can be used to fill in the
missing platform fields for older containers.
Signed-off-by: Derek McGowan <derek@mcg.dev>
The Container.State struct holds the container's state, and most of
its fields are expected to change dynamically. Some of these state changes
are explicit, for example, setting the container to be "stopped". Other
state changes are more implicit, for example due to the container's
process exiting or being "OOM"-killed by the kernel.
The distinction between explicit ("desired") state changes and "actual"
state is sometimes vague; for some properties, we clearly
separated them, for example if a user requested the container to be
stopped or restarted, we store state in the Container object itself;
HasBeenManuallyStopped bool // used for unless-stopped restart policy
HasBeenManuallyRestarted bool `json:"-"` // used to distinguish restart caused by restart policy from the manual one
Other properties are more ambiguous, such as "HasBeenStartedBefore" and
"RestartCount", which are stored on the Container (and persisted to
disk), but may be more related to "actual" state, and likely should
not be persisted;
RestartCount int
HasBeenStartedBefore bool
Given that (per the above) concurrency must be taken into account, most
changes to the `container.State` struct should be protected; here's where
things get blurry. While the `State` type provides various accessor methods,
only some of them take concurrency into account; for example, [State.IsRunning]
and [State.GetPID] acquire a lock, whereas [State.ExitCodeValue] does not.
Even the (commonly used) [State.StateString] has no locking at all.
The way this is handled is error-prone; [container.State] contains a mutex,
and it's exported. Given that it's embedded in the [container.Container]
struct, it's also exposed as an exported mutex for the container. The
assumption here is that by "merging" the two, the caller acquires a lock
when either the container _or_ its state must be mutated. However, because
some methods on `container.State` handle their own locking, consumers must
be deeply familiar with the internals when changes to both the `Container`
AND `Container.State` must be made. This is amplified further as some
(exported!) methods, such as [container.SetRunning], mutate multiple fields
but don't acquire a lock (expecting the caller to hold one), whereas their
(also exported) counterparts (e.g. [State.IsRunning]) do.
It should be clear from the above, that this needs some architectural
changes; a clearer separation between "desired" and "actual" state (opening
the potential to update the container's config without manually touching
its `State`), possibly a method to obtain a read-only copy of the current
state (for those querying state), and reviewing which fields belong where
(and should be persisted to disk, or only remain in memory).
This PR preserves the status quo; it makes no structural changes, other
than exposing where we access the container's state. Where previously the
State fields and methods were referred to as "part of the container"
(e.g. `ctr.IsRunning()` or `ctr.Running`), we now explicitly reference
the embedded `State` (`ctr.State.IsRunning`, `ctr.State.Running`).
The exception (for now) is the mutex, which is still referenced through
the embedded struct (`ctr.Lock()` instead of `ctr.State.Lock()`), as this
is (mostly) by design to protect the container, and what's in it (including
its `State`).
[State.IsRunning]: c4afa77157/daemon/container/state.go (L205-L209)
[State.GetPID]: c4afa77157/daemon/container/state.go (L211-L216)
[State.ExitCodeValue]: c4afa77157/daemon/container/state.go (L218-L228)
[State.StateString]: c4afa77157/daemon/container/state.go (L102-L131)
[container.State]: c4afa77157/daemon/container/state.go (L15-L23)
[container.Container]: c4afa77157/daemon/container/container.go (L67-L75)
[container.SetRunning]: c4afa77157/daemon/container/state.go (L230-L277)
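A sketch of the convention described above (signatures simplified):

    // Mutators like SetRunning expect the caller to hold the shared lock:
    ctr.Lock()
    ctr.State.SetRunning(task, startedAt) // mutates several fields
    ctr.Unlock()

    // Accessors like IsRunning lock internally; don't hold ctr.Lock() here:
    running := ctr.State.IsRunning()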
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move the option-types to the client and, in some cases, create a
copy for the backend. These types are used to construct query args,
are not marshaled to JSON, and can be replaced with functional
options in the client.
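A generic sketch of the functional-option shape (names are illustrative, not the actual client API):

    type listOptions struct {
        all bool
    }

    // ListOption mutates listOptions before the request is built.
    type ListOption func(*listOptions)

    func WithAll() ListOption {
        return func(o *listOptions) { o.all = true }
    }

    func (c *Client) List(ctx context.Context, opts ...ListOption) error {
        var o listOptions
        for _, opt := range opts {
            opt(&o)
        }
        query := url.Values{}
        if o.all {
            query.Set("all", "1")
        }
        // ... perform the request using query ...
        return nil
    }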
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
makeDriverConfig is written in such a way that it seems to support
label-based driver configuration. That is, you could hypothetically use
labels starting with `com.docker.network.driver.<driver-name>.` to
define the configuration of a driver.
These labels come from the Controller's `cfg.Labels` which are set by
the daemon through libnet's OptionLabels which takes the list of labels
set on the daemon through dockerd's --label flag, or the equivalent
daemon.json field.
However, the daemon forbids setting labels that start with
`com.docker.*`. For instance:
label com.docker.network.driver.bridge.EnableProxy=false is not allowed: the namespaces com.docker.*, io.docker.*, and org.dockerproject.* are reserved for internal use
Hence, this is dead code — remove it.
Also, makeDriverConfig checks whether the Controller's cfg field is
nil... But the Controller struct is instantiated in a single place (i.e.
NewController), which always sets that field. Drop that nil check too.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Fix a bug causing containers not to be loaded when the storage driver
wasn't chosen explicitly.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When no explicit driver was specified, the containerd store was also
applied by default to existing graphdriver setups.
Fix this and add a test.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This change restores support for configuring the daemon storage driver through the DOCKER_DRIVER environment variable.
Signed-off-by: Austin Vazquez <austin.vazquez@docker.com>
Add layer migration on startup
Use image size threshold rather than image count
Add daemon integration test
Add test for migrating to containerd snapshotters
Add vfs migration
Add tar export for containerd migration
Add containerd migration test with save and load
Signed-off-by: Derek McGowan <derek@mcg.dev>
It has no external consumers and is written with specific behavior in
mind, making it a poor candidate to carry in the module.
This moves it to the daemon as a non-exported `resolveSymlinkedDirectory`
utility, so that it's only accessible where it's currently used.
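A sketch of what such a helper can look like (the actual implementation may differ):

    // resolveSymlinkedDirectory resolves symlinks in path and verifies
    // that the result is a directory.
    func resolveSymlinkedDirectory(path string) (string, error) {
        resolved, err := filepath.EvalSymlinks(path)
        if err != nil {
            return "", err
        }
        fi, err := os.Stat(resolved)
        if err != nil {
            return "", err
        }
        if !fi.IsDir() {
            return "", fmt.Errorf("%s is not a directory", resolved)
        }
        return resolved, nil
    }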
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "backend" types in API were designed to decouple the API server
implementation from the daemon, or other parts of the code that
back the API server. This would allow the daemon to evolve (e.g.
functionality moved to different subsystems) without that impacting
the API server's implementation.
Now that the API server is no longer part of the API package (module),
there is no benefit to keeping the backend types there. The API server
may evolve (and require changes in the backend), which has no direct
relation with the API module (types, responses); the backend definition
is, however, coupled to the API server implementation.
It's worth noting that, while it is "technically" possible to use the API
server package and implement an alternative backend, this has never
been a prime objective. The backend definition was never considered
"stable", and we don't expect external users to (attempt to) use it
as such.
This patch moves the backend types to the daemon/server package,
so that they can evolve with the daemon and API server implementation
without that impacting the API module (which we intend to be stable,
following SemVer).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>