Commit Graph

74 Commits

Author SHA1 Message Date
Sebastiaan van Stijn
0df791cb72 explicitly access Container.State instead of through embedded struct
The Container.State struct holds the container's state, and most of
its fields are expected to change dynamically. Some o these state-changes
are explicit, for example, setting the container to be "stopped". Other
state changes can be more explicit, for example due to the containers'
process exiting or being "OOM" killed by the kernel.

The distinction between explicit ("desired") state changes and "state"
("actual state") is sometimes vague; for some properties, we clearly
separated them, for example if a user requested the container to be
stopped or restarted, we store state in the Container object itself;

    HasBeenManuallyStopped   bool // used for unless-stopped restart policy
    HasBeenManuallyRestarted bool `json:"-"` // used to distinguish restart caused by restart policy from the manual one

Other properties are more ambiguous. such as "HasBeenStartedBefore" and
"RestartCount", which are stored on the Container (and persisted to
disk), but may be more related to "actual" state, and likely should
not be persisted;

    RestartCount             int
    HasBeenStartedBefore     bool

Given that (per the above) concurrency must be taken into account, most
changes to the `container.State` struct should be protected; here's where
things get blurry. While the `State` type provides various accessor methods,
only some of them take concurrency into account; for example, [State.IsRunning]
and [State.GetPID] acquire a lock, whereas [State.ExitCodeValue] does not.
Even the (commonly used) [State.StateString] has no locking at all.

The way to handle this is error-prone; [container.State] contains a mutex,
and it's exported. Given that its embedded in the [container.Container]
struct, it's also exposed as an exported mutex for the container. The
assumption here is that by "merging" the two, the caller to acquire a lock
when either the container _or_ its state must be mutated. However, because
some methods on `container.State` handle their own locking, consumers must
be deeply familiar with the internals; if both changes to the `Container`
AND `Container.State` must be made. This gets amplified more as some
(exported!) methods, such as [container.SetRunning] mutate multiple fields,
but don't acquire a lock (so expect the caller to hold one), but their
(also exported) counterpart (e.g. [State.IsRunning]) do.

It should be clear from the above, that this needs some architectural
changes; a clearer separation between "desired" and "actual" state (opening
the potential to update the container's config without manually touching
its `State`), possibly a method to obtain a read-only copy of the current
state (for those querying state), and reviewing which fields belong where
(and should be persisted to disk, or only remain in memory).

This PR preserves the status quo; it makes no structural changes, other
than exposing where we access the container's state. Where previously the
State fields and methods were referred to as "part of the container"
(e.g. `ctr.IsRunning()` or `ctr.Running`), we now explicitly reference
the embedded `State` (`ctr.State.IsRunning`, `ctr.State.Running`).

The exception (for now) is the mutex, which is still referenced through
the embedded struct (`ctr.Lock()` instead of `ctr.State.Lock()`), as this
is (mostly) by design to protect the container, and what's in it (including
its `State`).

[State.IsRunning]: c4afa77157/daemon/container/state.go (L205-L209)
[State.GetPID]: c4afa77157/daemon/container/state.go (L211-L216)
[State.ExitCodeValue]: c4afa77157/daemon/container/state.go (L218-L228)
[State.StateString]: c4afa77157/daemon/container/state.go (L102-L131)
[container.State]: c4afa77157/daemon/container/state.go (L15-L23)
[container.Container]: c4afa77157/daemon/container/container.go (L67-L75)
[container.SetRunning]: c4afa77157/daemon/container/state.go (L230-L277)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-09-19 16:02:14 +02:00
Cory Snider
c2c2b80e90 daemon: report IPAM status for Swarm networks
As the Engine API requests may be directed at a non-leader Swarm
manager, the information needs to be tunneled through the Swarm API.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2025-09-11 15:25:14 -04:00
Cory Snider
f8bd170b2a daemon: validate args in network.New*Filter
Filter-term validation does not belong in the API module. Clients should
not be making any assumptions about which terms the daemon understands.
Users should not need to upgrade their clients to use filter terms
introduced in a newer daemon. Move the network filter validation from
the api module into the daemon.

Split network.NewFilter into network.NewListFilter and
network.NewPruneFilter constructors which validate the filter terms,
enforcing the invariant that any network.Filter is a well-formed filter
for networks.

The network route handlers have been leveraging a hidden 'idOrName'
filter term that is not listed in the set of accepted filters and
therefore not accepted in API client requests. And it's a good thing
that it was never part of the API: it is completely broken and not fit
for purpose! When a filter contains an idOrName term, the term values
are ignored and instead the filter tests whether either the 'id' or
'name' terms match the Name of the network. Unless the filter contains
both 'id' and 'name' terms, the match will evaluate to true for all
networks! None of the daemon-internal users of 'idOrName' set either
of those terms, therefore it has the same effect as if the filter did
not contain the 'idOrName' term in the first place.

Filtering networks by id-or-name is a quirky thing that the daemon needs
to do to uphold its end of the Engine API contract, not something that
would be of use to clients. Fixing up the idOrName filter would
necessitate adding it to the list of accepted terms so the filter passes
validaton, which would have the side effect of also making the filter
available to API clients. Instead, add an exported field to the Filter
struct so that daemon code can opt into the internal-only behaviour of
having the 'id' term match on either the network Name or ID.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2025-09-04 12:49:31 -04:00
Cory Snider
ea1dfbda9e daemon: prune networks using network.Filter
Construct a network.Filter from the filters.Args only once per API
request so we don't waste cycles re-validating an already validated
filter. Since (*Daemon).NetworksPrune is implemented in terms of
(Cluster).GetNetworks, that method now accepts a network.Filter instead
of a filter.Args. Change the signature of (*Daemon).GetNetworks for
consistency as both of the GetNetworks methods are used by network API
route handlers.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2025-09-04 12:49:31 -04:00
Austin Vazquez
812aa46d81 Move the api/types/time package to internal daemon package
Signed-off-by: Austin Vazquez <austin.vazquez@docker.com>
2025-08-14 07:56:59 -05:00
Sebastiaan van Stijn
c98e5cb60b update github links to moby/moby
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-08-01 01:48:55 +02:00
Derek McGowan
f74e5d48b3 Create github.com/moby/moby/v2 module
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-31 10:13:29 -07:00
Sebastiaan van Stijn
83510a26b3 api/types: move backend types to daemon/server
The "backend" types in API were designed to decouple the API server
implementation from the daemon, or other parts of the code that
back the API server. This would allow the daemon to evolve (e.g.
functionality moved to different subsystems) without that impacting
the API server's implementation.

Now that the API server is no longer part of the API package (module),
there is no benefit to having it in the API module. The API server
may evolve (and require changes in the backend), which has no direct
relation with the API module (types, responses); the backend definition
is, however, coupled to the API server implementation.

It's worth noting that, while "technically" possible to use the API
server package, and implement an alternative backend implementation,
this has never been a prime objective. The backend definition was
never considered "stable", and we don't expect external users to
(attempt) to use it as such.

This patch moves the backend types to the daemon/server package,
so that they can evolve with the daemon and API server implementation
without that impacting the API module (which we intend to be stable,
following SemVer).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-07-28 00:03:04 +02:00
Derek McGowan
222b2b8b2f Move internal/lazyregexp to daemon/internal/lazyregexp
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-24 12:12:38 -07:00
Derek McGowan
afd6487b2e Create github.com/moby/moby/api module
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-21 09:30:05 -07:00
Derek McGowan
7a720df61f Move libnetwork to daemon/libnetwork
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-14 09:25:23 -07:00
Rob Murray
793dd8385a Only "prune" Windows networks created by Docker
Signed-off-by: Rob Murray <rob.murray@docker.com>
2025-06-06 20:24:04 +01:00
Sebastiaan van Stijn
5318877858 daemon: remove // import comments
These comments were added to enforce using the correct import path for
our packages ("github.com/docker/docker", not "github.com/moby/moby").
However, when working in go module mode (not GOPATH / vendor), they have
no effect, so their impact is limited.

Remove these imports in preparation of migrating our code to become an
actual go module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-05-30 15:59:13 +02:00
Sebastiaan van Stijn
bc1dbd9ea6 daemon: use lazyregexp to compile regexes on first use
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-01-02 21:37:33 +01:00
Cory Snider
71a299ff6a daemon: switch to Go 1.19 atomics
Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-07-05 19:05:15 -04:00
Sebastiaan van Stijn
d22d8a78f1 runconfig: deprecate IsPreDefinedNetwork
Move the function internal to the daemon, where it's used. Deliberately
not mentioning the new location, as this function should not be used
externally.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-06-20 11:10:54 +02:00
Sebastiaan van Stijn
db2f1acd5d api/types: move ContainersPruneReport to api/types/container
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-06-10 10:19:47 +02:00
Sebastiaan van Stijn
e5f9484ab6 api/types: move NetworksPruneReport to api/types/network
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-06-07 11:14:52 +02:00
Sebastiaan van Stijn
484e6b784c api/types: move ContainerCreateConfig, ContainerRmConfig to api/types/backend
The `ContainerCreateConfig` and `ContainerRmConfig` structs are used for
options to be passed to the backend, and are not used in client code.

Thess struct currently is intended for internal use only (for example, the
`AdjustCPUShares` is an internal implementation details to adjust the container's
config when older API versions are used).

Somewhat ironically, the signature of the Backend has a nicer UX than that
of the client's `ContainerCreate` signature (which expects all options to
be passed as separate arguments), so we may want to update that signature
to be closer to what the backend is using, but that can be left as a future
exercise.

This patch moves the `ContainerCreateConfig` and `ContainerRmConfig` structs
to the backend package to prevent it being imported in the client, and to make
it more clear that this is part of internal APIs, and not public-facing.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-05 16:41:36 +01:00
Sebastiaan van Stijn
cff4f20c44 migrate to github.com/containerd/log v0.1.0
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.

This patch moves our own uses of the package to use the new module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-11 17:52:23 +02:00
Sebastiaan van Stijn
0f871f8cb7 api/types/events: define "Action" type and consts
Define consts for the Actions we use for events, instead of "ad-hoc" strings.
Having these consts makes it easier to find where specific events are triggered,
makes the events less error-prone, and allows documenting each Action (if needed).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-29 00:38:08 +02:00
Sebastiaan van Stijn
74354043ff remove uses of libnetwork/Network.Info()
Now that we removed the interface, there's no need to cast the Network
to a NetworkInfo interface, so we can remove uses of the `Info()` method.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-08 22:05:30 +02:00
Sebastiaan van Stijn
64c6f72988 libnetwork: remove Network interface
There's only one implementation; drop the interface and use the
concrete type instead.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-22 11:56:41 +02:00
Brian Goff
74da6a6363 Switch all logging to use containerd log pkg
This unifies our logging and allows us to propagate logging and trace
contexts together.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-06-24 00:23:44 +00:00
Cory Snider
d222bf097c daemon: reload runtimes w/o breaking containers
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Paweł Gronowski
117ceac82b daemon/prune: Use errdefs for invalid "until" value
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2023-04-21 10:25:57 +02:00
Laura Brehm
45ee4d7c78 c8d: Compute container's layer size
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
2023-03-08 00:58:02 +01:00
Brian Goff
4b981436fe Fixup libnetwork lint errors
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 23:48:32 +00:00
Brian Goff
a0a473125b Fix libnetwork imports
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 21:51:23 +00:00
Sebastiaan van Stijn
51c7992928 API: add "prune" events
This patch adds a new "prune" event type to indicate that pruning of a resource
type completed.

This event-type can be used on systems that want to perform actions after
resources have been cleaned up. For example, Docker Desktop performs an fstrim
after resources are deleted (https://github.com/linuxkit/linuxkit/tree/v0.7/pkg/trim-after-delete).

While the current (remove, destroy) events can provide information on _most_
resources, there is currently no event triggered after the BuildKit build-cache
is cleaned.

Prune events have a `reclaimed` attribute, indicating the amount of space that
was reclaimed (in bytes). The attribute can be used, for example, to use as a
threshold for performing fstrim actions. Reclaimed space for `network` events
will always be 0, but the field is added to be consistent with prune events for
other resources.

To test this patch:

Create some resources:

    for i in foo bar baz; do \
        docker network create network_$i \
        && docker volume create volume_$i \
        && docker run -d --name container_$i -v volume_$i:/volume busybox sh -c 'truncate -s 5M somefile; truncate -s 5M /volume/file' \
        && docker tag busybox:latest image_$i; \
    done;

    docker pull alpine
    docker pull nginx:alpine

    echo -e "FROM busybox\nRUN truncate -s 50M bigfile" | DOCKER_BUILDKIT=1 docker build -

Start listening for "prune" events in another shell:

    docker events --filter event=prune

Prune containers, networks, volumes, and build-cache:

    docker system prune -af --volumes

See the events that are returned:

    docker events --filter event=prune
    2020-07-25T12:12:09.268491000Z container prune  (reclaimed=15728640)
    2020-07-25T12:12:09.447890400Z network prune  (reclaimed=0)
    2020-07-25T12:12:09.452323000Z volume prune  (reclaimed=15728640)
    2020-07-25T12:12:09.517236200Z image prune  (reclaimed=21568540)
    2020-07-25T12:12:09.566662600Z builder prune  (reclaimed=52428841)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-07-28 12:41:14 +02:00
Brian Goff
c0bc14e8dd Move network conversions out of API router
This stuff doesn't belong here and is causing imports of libnetwork into
the router, which is not what we want.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-06-27 17:11:29 -07:00
Brian Goff
e4b6adc88e Extract volume interaction to a volumes service
This cleans up some of the package API's used for interacting with
volumes, and simplifies management.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-05-25 14:21:07 -04:00
Kir Kolyshkin
7d62e40f7e Switch from x/net/context -> context
Since Go 1.7, context is a standard package. Since Go 1.9, everything
that is provided by "x/net/context" is a couple of type aliases to
types in "context".

Many vendored packages still use x/net/context, so vendor entry remains
for now.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-04-23 13:52:44 -07:00
Brian Goff
63826e291b Move direct volume driver interaction to store
Since the volume store already provides this functionality, we should
just use it rather than duplicating it.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-04-17 14:06:53 -04:00
Brian Goff
9d46c4c138 Support cancellation in directory.Size()
Makes sure that if the user cancels a request that the daemon stops
trying to traverse a directory.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-03-29 15:49:15 -04:00
Daniel Nephin
0dab53ff3c Move all daemon image methods into imageService
imageService provides the backend for the image API and handles the
imageStore, and referenceStore.

Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-26 16:48:29 -05:00
Daniel Nephin
9c25df0fa2 Move ImagePrune
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-21 18:26:16 -05:00
Daniel Nephin
3aa4f7f0d7 Remove broken container check from image prune
The imageRefs map was being popualted with containerID, and accessed
with an imageID which would never match.

Remove this broken code because: 1) it hasn't ever worked so isn't
necessary, and 2) because at best it would be racy

ImageDelete() should already handle preventing of removal of used
images.

Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-08 18:10:46 -05:00
Daniel Nephin
4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00
John Howard
afd305c4b5 LCOW: Refactor to multiple layer-stores based on feedback
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-01-18 08:31:05 -08:00
John Howard
ce8e529e18 LCOW: Re-coalesce stores
Signed-off-by: John Howard <jhoward@microsoft.com>

The re-coalesces the daemon stores which were split as part of the
original LCOW implementation.

This is part of the work discussed in https://github.com/moby/moby/issues/34617,
in particular see the document linked to in that issue.
2018-01-18 08:29:19 -08:00
Yong Tang
2a7388a6c4 Merge pull request #34960 from sterchelen/34953-Prune-Volume-lack-event-entry
Fix #34953 how volumes are pruned from daemon
2017-10-12 09:24:26 -07:00
Nicolas Sterchele
63864ad8c1 Fix #34953 how volumes are pruned from daemon
- Call the function that create an event entry while volumes are
pruning.
- Pass volume.Volume type on volumeRm instead of a name. Volume lookup is done
on the exported VolumeRm function.
- Skip volume deletion when force option used and it does not exists.

Signed-off-by: Nicolas Sterchele <sterchele.nicolas@gmail.com>
2017-10-09 21:15:26 +02:00
Sebastiaan van Stijn
97c5ae25c4 Replace uses of filters.Include() with filters.Contains()
The `filters.Include()` method was deprecated in favor of `filters.Contains()`
in 065118390a, but still used in various
locations.

This patch replaces uses of `filters.Include()` with `filters.Contains()`.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2017-09-26 13:39:56 +02:00
John Howard
7b9a8f460b Move to a single tag-store
Signed-off-by: John Howard <jhoward@microsoft.com>
2017-08-18 17:09:27 -07:00
Brian Goff
ebcb7d6b40 Remove string checking in API error handling
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.

Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2017-08-15 16:01:11 -04:00
Derek McGowan
1009e6a40b Update logrus to v1.0.1
Fixes case sensitivity issue

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2017-07-31 13:16:46 -07:00
allencloud
87b4dc2002 return prune data when context canceled
Signed-off-by: allencloud <allen.sun@daocloud.io>
2017-07-10 10:06:24 +08:00
John Howard
4ec9766a27 LCOW: Fix nits from 33241
Signed-off-by: John Howard <jhoward@microsoft.com>
2017-06-27 11:59:49 -07:00