51 Commits

Author SHA1 Message Date
Sebastiaan van Stijn
19edf44896 daemon/config: remove deprecated Config.
This function was deprecated in 83f8f4efd7,
and the package is internal to the daemon, so we can remove it.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-09-12 09:59:12 +02:00
Derek McGowan
f74e5d48b3 Create github.com/moby/moby/v2 module
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-31 10:13:29 -07:00
Sebastiaan van Stijn
d761d9d358 pkg/rootless: move to daemon/internal
This package is used internally by the daemon, and was only used out
side of the daemon by pkg/plugins (for which we still need to look
where it should be kept).

Making it internal because it's trivial to implement if needed by
anyone. The only reason it's a package is to keep it central, and
to make it easier to discover where we have rootlesskit-specific
codepaths in our codebase.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-07-28 22:04:39 +02:00
Rob Murray
cf1695bef1 Add option --bridge-accept-fwmark
Packets with the given firewall mark are accepted by the bridge
driver's filter-FORWARD rules.

The value can either be an integer mark, or it can include a
mask in the format "<mark>/<mask>".

Signed-off-by: Rob Murray <rob.murray@docker.com>
2025-07-22 19:15:02 +01:00
Rob Murray
090c319f2e Don't allow the daemon to start with nftables and Swarm enabled
Signed-off-by: Rob Murray <rob.murray@docker.com>
2025-07-22 09:13:45 +01:00
Derek McGowan
afd6487b2e Create github.com/moby/moby/api module
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-21 09:30:05 -07:00
Rob Murray
39ab393274 Add daemon option --firewall-backend
Signed-off-by: Rob Murray <rob.murray@docker.com>
2025-07-17 15:12:01 +01:00
Derek McGowan
7a720df61f Move libnetwork to daemon/libnetwork
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-14 09:25:23 -07:00
Derek McGowan
f05652867d Move opts to daemon/pkg/opts
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-14 09:25:05 -07:00
Matthieu MOREL
381d9d0723 fix use-errors-new from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-06-26 12:07:38 +00:00
Sebastiaan van Stijn
27956106d5 daemon/config: remove // import comments
These comments were added to enforce using the correct import path for
our packages ("github.com/docker/docker", not "github.com/moby/moby").
However, when working in go module mode (not GOPATH / vendor), they have
no effect, so their impact is limited.

Remove these imports in preparation of migrating our code to become an
actual go module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-05-30 15:59:12 +02:00
Rob Murray
44a3453d73 Add daemon option --allow-direct-routing
Per-network option com.docker.network.bridge.trusted-host-interfaces
accepts a list of interfaces that are allowed to route
directly to a container's published ports in a bridge
network with nat enabled.

This daemon level option disables direct access filtering,
enabling direct access to published ports on container
addresses in all bridge networks, via all host interfaces.

It overlaps with short-term env-var workaround:
  DOCKER_INSECURE_NO_IPTABLES_RAW=1
- it does not allow packets sent from outside the host to reach
  ports published only to 127.0.0.1
- it will outlive iptables (the workaround was initially intended
  for hosts that do not have kernel support for the "raw" iptables
  table).

Signed-off-by: Rob Murray <rob.murray@docker.com>
2025-04-30 20:59:28 +01:00
Sebastiaan van Stijn
d7f59cec05 daemon/config: add basic validation of exec-opt options
Validate if options are passed in the right format and if the given option
is supported on the current platform.

Before this patch, no validation would happen until the daemon was started,
and unknown options as well as incorrectly formatted options would be silently
ignored on Linux;

    dockerd --exec-opt =value-only --validate
    configuration OK

    dockerd --exec-opt unknown-opt=unknown-value --validate
    configuration OK

    dockerd --exec-opt unknown-opt=unknown-value --validate
    ...
    INFO[2024-11-28T12:07:44.255942174Z] Daemon has completed initialization
    INFO[2024-11-28T12:07:44.361412049Z] API listen on /var/run/docker.sock

With this patch, exec-opts are included in the validation before the daemon
is started/created, and errors are produced when trying to use an option
that's either unknown or not supported by the platform;

    dockerd --exec-opt =value-only --validate
    unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid exec-opt (=value-only): must be formatted 'opt=value'

    dockerd --exec-opt isolation=default --validate
    unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid exec-opt (isolation=default): 'isolation' option is only supported on windows

    dockerd --exec-opt unknown-opt=unknown-value --validate
    unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid exec-opt (unknown-opt=unknown-value): unknown option: 'unknown-opt'

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-01-03 13:05:35 +01:00
Sebastiaan van Stijn
c9f17bedc7 daemon/config: extract validation of userland-proxy config
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-11-28 16:41:19 +01:00
Sebastiaan van Stijn
83f8f4efd7 daemon/config: deprecate Config.ValidatePlatformConfig
This method was only used internally as part of config.Validate; deprecate
it in favor of config.Validate and make it a non-exported function.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-11-28 16:41:18 +01:00
Sebastiaan van Stijn
74a00f183b daemon/config: move utility-functions separate from Config methods
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-11-28 16:41:18 +01:00
Sebastiaan van Stijn
a4714fa04d daemon/config: verifyDefaultCgroupNsMode: update error message for consistency
Most validation errors have the "invalid xxxxx" prefix; format this error
to be consitent with other errors.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-11-28 16:41:18 +01:00
Rob Murray
0b5b1db1c1 Use default ULA prefix if fixed-cidr-v6 is not specified
Use the same logic to generate IPAMConf for IPv6 as for IPv4.

- When no fixed-cidr-v6 is specified, rather than error out, use
  the default address pools (as for an IPv4 default bridge with no
  fixed-cidr, and as for user-defined networks).
- Add daemon option --bip6, similar to --bip.
  - Necessary because it's the only way to override an old address
    on docker0 (daemon-managed default bridge), as illustrated by
    test cases.
- For a user-managed default bridge (--bridge), use IPv6 addresses
  on the user's bridge to determine the pool, sub-pool and gateway.
  Following the same rules as IPv4.
- Don't set up IPv6 IPAMConf if IPv6 is not enabled.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-11-25 18:29:25 +00:00
Rob Murray
3cadadb4eb Add daemon option --ip-forward-no-drop
The daemon no longer depends on the iptables/ip6tables filter-FORWARD
chain's policy being DROP in order to implement its port filtering
rules.

However, if the daemon enables IP forwarding in the host's system
config, by default it will set the policy to DROP to avoid potential
security issues for other applications/networks.

If docker does need to enable IP forwarding, but other applications
on the host require filter-FORWARD's policies to be ACCEPT, this
option can be used to tell the daemon to leave the policy unchanged.
(Equivalent to enabling IP forwarding before starting the daemon,
but without needing to do that.)

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-11-11 12:12:57 +00:00
Rob Murray
b79bba6b68 Remove feature flag "windows-dns-proxy"
Added in 26.1.0, commit 6c68be24a2
Default changed to true in 27.0.0, commit 33f9a5329a

No sign of problems so, remove.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-10-24 11:19:42 +01:00
Rob Murray
f1e0746c08 Tell RootlessKit about docker-proxy port mappings
Before this change, when running rootless, instead of running
docker-proxy the daemon would run rootlesskit-docker-proxy.

The job of rootlesskit-docker-proxy was to tell RootlessKit
about mapped host ports before starting docker-proxy, and then
to remove the mapping when it was stopped.

So, rootlesskit-docker-proxy would need to be kept in-step
with changes to docker-proxy (particuarly the upcoming change
to bind TCP/UDP ports in the daemon and pass them to the proxy,
but also possible-future changes like running proxy per-container
rather than per-port-mapping).

This change runs the docker-proxy in rootless mode, instead of
rootlesskit-docker-proxy, and the daemon itself tells RootlessKit
about changes in host port mappings.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-08-05 14:04:05 +01:00
Brian Goff
f8c088be05 Lookup docker-proxy in libexec paths
This allows distros to put docker-proxy under libexec paths as is done
for docker-init.

Also expands the lookup to to not require a `docker/` subdir in libexec
subdir.
Since it is a generic helper that may be used for something else in the
future, this is only done for binaries with a `docker-`.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2024-06-20 19:26:54 +00:00
Sebastiaan van Stijn
517fb0991e api/types/container: provide alias for github.com/docker/go-units.Ulimit
This type is included in various types used in the API, but comes from
a separate module. The go-units module may be moving to the moby org,
and it is yet to be decided if the Ulimit type is a good fit for that
module (which deals with more generic units, such as "size" and "duration"
otherwise).

This patch introduces an alias to help during the transition of this type
to it's new location. The alias makes sure that existing code continues
to work (at least for now), but we need to start updating such code after
this PR is merged.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-06-18 13:18:20 +02:00
Rob Murray
74d77d8811 Revert "Internal resolver for default bridge network"
This reverts commit 18f4f775ed.

Because buildkit doesn't run an internal resolver, and it bases its
/etc/resolv.conf on the host's ... when buildkit is run in a container
that has 'nameserver 127.0.0.11', its build containers will use Google's
DNS servers as a fallback (unless the build container uses host
networking).

Before, when the 127.0.0.11 resolver was not used for the default network,
the buildkit container would have inherited a site-local nameserver. So,
the build containers it created would also have inherited that DNS
server - and they'd be able to resolve site-local hostnames.

By replacing the site-local nameserver with Google's, we broke access
to local DNS and its hostnames.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-06-17 20:19:20 +01:00
Rob Murray
18f4f775ed Internal resolver for default bridge network
Until now, containers on the default bridge network have been configured
to talk directly to external DNS servers - their resolv.conf files have
either been populated with nameservers from the host's resolv.conf, or
with servers from '--dns' (or with Google's nameservers as a fallback).

This change makes the internal bridge more like other networks by using
the internal resolver.  But, the internal resolver is not populated with
container names or aliases - it's only for external DNS lookups.

Containers on the default network, on a host that has a loopback
resolver (like systemd's on 127.0.0.53) will now use that resolver
via the internal resolver. So, the logic used to find systemd's current
set of resolvers is no longer needed by the daemon.

Legacy links work just as they did before, using '/etc/hosts' and magic.

(Buildkit does not use libnetwork, so it can't use the internal resolver.
But it does use libnetwork/resolvconf's logic to configure resolv.conf.
So, code to set up resolv.conf for a legacy networking without an internal
resolver can't be removed yet.)

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-06-05 20:27:24 +01:00
Rob Murray
6c68be24a2 Windows DNS resolver forwarding
Make the internal DNS resolver for Windows containers forward requests
to upsteam DNS servers when it cannot respond itself, rather than
returning SERVFAIL.

Windows containers are normally configured with the internal resolver
first for service discovery (container name lookup), then external
resolvers from '--dns' or the host's networking configuration.

When a tool like ping gets a SERVFAIL from the internal resolver, it
tries the other nameservers. But, nslookup does not, and with this
change it does not need to.

The internal resolver learns external server addresses from the
container's HNSEndpoint configuration, so it will use the same DNS
servers as processes in the container.

The internal resolver for Windows containers listens on the network's
gateway address, and each container may have a different set of external
DNS servers. So, the resolver uses the source address of the DNS request
to select external resolvers.

On Windows, daemon.json feature option 'windows-no-dns-proxy' can be used
to prevent the internal resolver from forwarding requests (restoring the
old behaviour).

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-04-16 18:57:28 +01:00
Sebastiaan van Stijn
19a04efa2f api: remove API < v1.24
Commit 08e4e88482 (Docker Engine v25.0.0)
deprecated API version v1.23 and lower, but older API versions could be
enabled through the DOCKER_MIN_API_VERSION environment variable.

This patch removes all support for API versions < v1.24.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-02-06 18:44:44 +01:00
Albin Kerouanton
f07c45e4f2 daemon: remove --oom-score-adjust flag
This flag was marked deprecated in commit 5a922dc16 (released in v24.0)
and to be removed in the next release.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2024-01-20 00:40:28 +01:00
Sebastiaan van Stijn
e8e20c0897 daemon/config: setPlatformDefaults: use debug for missing userland-proxy
commit 4f9db655ed moved looking up the
userland-proxy binary to early in the startup process, and introduced
a log-message if the binary was missing.

However, a side-effect of this was this message would also be printed
when running "--version";

    dockerd --version
    time="2024-01-09T09:18:53.705271292Z" level=warning msg="failed to lookup default userland-proxy binary" error="exec: \"docker-proxy\": executable file not found in $PATH"
    Docker version v25.0.0-rc.1, build 9cebefa717

We should look if we can avoid this, but let's change the message to be
a debug message as a short-term workaround.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-01-09 13:18:04 +01:00
Sebastiaan van Stijn
4f9db655ed portmapper: move userland-proxy lookup to daemon config
When mapping a port with the userland-proxy enabled, the daemon would
perform an "exec.LookPath" for every mapped port (which, in case of
a range of ports, would be for every port in the range).

This was both inefficient (looking up the binary for each port), inconsistent
(when running in rootless-mode, the binary was looked-up once), as well as
inconvenient, because a missing binary, or a mis-configureed userland-proxy-path
would not be detected daeemon startup, and not produce an error until starting
the container;

    docker run -d -P nginx:alpine
    4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
    docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.

However, the container would still be created (but invalid);

    docker ps -a
    CONTAINER ID   IMAGE          COMMAND                  CREATED          STATUS    PORTS     NAMES
    869f41d7e94f   nginx:alpine   "/docker-entrypoint.…"   10 seconds ago   Created             romantic_wiles

This patch changes how the userland-proxy is configured;

- The path of the userland-proxy is now looked up / configured at daemon
  startup; this is similar to how the proxy is configured in rootless-mode.
- A warning is logged when failing to lookup the binary.
- If the daemon is configured with "userland-proxy" enabled, an error is
  produced, and the daemon will refuse to start.
- The "proxyPath" argument for newProxyCommand() (in libnetwork/portmapper)
  is now required to be set. It no longer looks up the executable, and
  produces an error if no path was provided. While this change was not
  required, it makes the daemon config the canonical source of truth, instead
  of logic spread accross multiplee locations.

Some of this logic is a change of behavior, but these changes were made with
the assumption that we don't want to support;

- installing the userland proxy _after_ the daemon was started
- moving the userland proxy (or installing a proxy with a higher
  preference in PATH)

With this patch:

Validating the config produces an error if the binary is not found:

    dockerd --validate
    WARN[2023-12-29T11:36:39.748699591Z] failed to lookup default userland-proxy binary       error="exec: \"docker-proxy\": executable file not found in $PATH"
    userland-proxy is enabled, but userland-proxy-path is not set

Disabling userland-proxy prints a warning, but validates as "OK":

    dockerd --userland-proxy=false --validate
    WARN[2023-12-29T11:38:30.752523879Z] ffailed to lookup default userland-proxy binary       error="exec: \"docker-proxy\": executable file not found in $PATH"
    configuration OK

Speficying a non-absolute path produces an error:

    dockerd --userland-proxy-path=docker-proxy --validate
    invalid userland-proxy-path: must be an absolute path: docker-proxy

Befor this patch, we would not validate this path, which would allow the daemon
to start, but fail to map a port;

    docker run -d -P nginx:alpine
    4f7b6589a1680f883d98d03db12203973387f9061e7a963331776170e4414194
    docker: Error response from daemon: driver failed programming external connectivity on endpoint romantic_wiles (7cfdc361821f75cbc665564cf49856cf216a5b09046d3c22d5b9988836ee088d): fork/exec docker-proxy: no such file or directory.

Specifying an invalid userland-proxy-path produces an error as well:

    dockerd --userland-proxy-path=/usr/local/bin/no-such-binary --validate
    userland-proxy-path is invalid: stat /usr/local/bin/no-such-binary: no such file or directory

    mkdir -p /usr/local/bin/not-a-file
    dockerd --userland-proxy-path=/usr/local/bin/not-a-file --validate
    userland-proxy-path is invalid: exec: "/usr/local/bin/not-a-file": is a directory

    touch /usr/local/bin/not-an-executable
    dockerd --userland-proxy-path=/usr/local/bin/not-an-executable --validate
    userland-proxy-path is invalid: exec: "/usr/local/bin/not-an-executable": permission denied

Same when using the daemon.json config-file;

    echo '{"userland-proxy-path":"no-such-binary"}' > /etc/docker/daemon.json
    dockerd --validate
    unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: invalid userland-proxy-path: must be an absolute path: no-such-binary

    dockerd --userland-proxy-path=hello --validate
    unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: hello, from file: /usr/local/bin/docker-proxy)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-29 16:23:18 +01:00
Sebastiaan van Stijn
388216fc45 Merge pull request #46850 from robmry/46829-allow_ipv6_subnet_change
Allow overlapping change in bridge's IPv6 network.
2023-12-19 18:35:13 +01:00
Rob Murray
27f3abd893 Allow overlapping change in bridge's IPv6 network.
Calculate the IPv6 addreesses needed on a bridge, then reconcile them
with the addresses on an existing bridge by deleting then adding as
required.

(Previously, required addresses were added one-by-one, then unwanted
addresses were removed. This meant the daemon failed to start if, for
example, an existing bridge had address '2000:db8::/64' and the config
was changed to '2000:db8::/80'.)

IPv6 addresses are now calculated and applied in one go, so there's no
need for setupVerifyAndReconcile() to check the set of IPv6 addresses on
the bridge. And, it was guarded by !config.InhibitIPv4, which can't have
been right. So, removed its IPv6 parts, and added IPv4 to its name.

Link local addresses, the example given in the original ticket, are now
released when containers are stopped. Not releasing them meant that
when using an LL subnet on the default bridge, no container could be
started after a container was stopped (because the calculated address
could not be re-allocated). In non-default bridge networks using an
LL subnet, addresses leaked.

Linux always uses the standard 'fe80::/64' LL network. So, if a bridge
is configured with an LL subnet prefix that overlaps with it, a config
error is reported. Non-overlapping LL subnet prefixes are allowed.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2023-12-18 16:10:41 +00:00
Sebastiaan van Stijn
08e4e88482 daemon: raise default minimum API version to v1.24
The daemon currently provides support for API versions all the way back
to v1.12, which is the version of the API that shipped with docker 1.0. On
Windows, the minimum supported version is v1.24.

Such old versions of the client are rare, and supporting older API versions
has accumulated significant amounts of code to remain backward-compatible
(which is largely untested, and a "best-effort" at most).

This patch updates the minimum API version to v1.24, which is the fallback
API version used when API-version negotiation fails. The intent is to start
deprecating older API versions, but no code is removed yet as part of this
patch, and a DOCKER_MIN_API_VERSION environment variable is added, which
allows overriding the minimum version (to allow restoring the behavior from
before this patch).

With this patch the daemon defaults to API v1.24 as minimum:

    docker version
    Client:
     Version:           24.0.2
     API version:       1.43
     Go version:        go1.20.4
     Git commit:        cb74dfc
     Built:             Thu May 25 21:50:49 2023
     OS/Arch:           linux/arm64
     Context:           default

    Server:
     Engine:
      Version:          dev
      API version:      1.44 (minimum version 1.24)
      Go version:       go1.21.3
      Git commit:       0322a29b9ef8806aaa4b45dc9d9a2ebcf0244bf4
      Built:            Mon Dec  4 15:22:17 2023
      OS/Arch:          linux/arm64
      Experimental:     false
     containerd:
      Version:          v1.7.9
      GitCommit:        4f03e100cb967922bec7459a78d16ccbac9bb81d
     runc:
      Version:          1.1.10
      GitCommit:        v1.1.10-0-g18a0cb0
     docker-init:
      Version:          0.19.0
      GitCommit:        de40ad0

Trying to use an older version of the API produces an error:

    DOCKER_API_VERSION=1.23 docker version
    Client:
     Version:           24.0.2
     API version:       1.23 (downgraded from 1.43)
     Go version:        go1.20.4
     Git commit:        cb74dfc
     Built:             Thu May 25 21:50:49 2023
     OS/Arch:           linux/arm64
     Context:           default
    Error response from daemon: client version 1.23 is too old. Minimum supported API version is 1.24, please upgrade your client to a newer version

To restore the previous minimum, users can start the daemon with the
DOCKER_MIN_API_VERSION environment variable set:

    DOCKER_MIN_API_VERSION=1.12 dockerd

API 1.12 is the oldest supported API version on Linux;

    docker version
    Client:
     Version:           24.0.2
     API version:       1.43
     Go version:        go1.20.4
     Git commit:        cb74dfc
     Built:             Thu May 25 21:50:49 2023
     OS/Arch:           linux/arm64
     Context:           default

    Server:
     Engine:
      Version:          dev
      API version:      1.44 (minimum version 1.12)
      Go version:       go1.21.3
      Git commit:       0322a29b9ef8806aaa4b45dc9d9a2ebcf0244bf4
      Built:            Mon Dec  4 15:22:17 2023
      OS/Arch:          linux/arm64
      Experimental:     false
     containerd:
      Version:          v1.7.9
      GitCommit:        4f03e100cb967922bec7459a78d16ccbac9bb81d
     runc:
      Version:          1.1.10
      GitCommit:        v1.1.10-0-g18a0cb0
     docker-init:
      Version:          0.19.0
      GitCommit:        de40ad0

When using the `DOCKER_MIN_API_VERSION` with a version of the API that
is not supported, an error is produced when starting the daemon;

    DOCKER_MIN_API_VERSION=1.11 dockerd --validate
    invalid DOCKER_MIN_API_VERSION: minimum supported API version is 1.12: 1.11

    DOCKER_MIN_API_VERSION=1.45 dockerd --validate
    invalid DOCKER_MIN_API_VERSION: maximum supported API version is 1.44: 1.45

Specifying a malformed API version also produces the same error;

    DOCKER_MIN_API_VERSION=hello dockerd --validate
    invalid DOCKER_MIN_API_VERSION: minimum supported API version is 1.12: hello

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-12-05 23:11:02 +01:00
Albin Kerouanton
d5d41c2849 daemon/config: Put params for the default network into a dedicated struct
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-11-03 14:10:41 +01:00
Sebastiaan van Stijn
c90229ed9a api/types: move system info types to api/types/system
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-07 13:01:36 +02:00
Sebastiaan van Stijn
b8220f5d0d daemon/config: move MTU to BridgeConfig
This option is only used for the default bridge network; let's move the
field to that struct to make it clearer what it's used for.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-05 14:43:35 +02:00
Cory Snider
0f6eeecac0 daemon: consolidate runtimes config validation
The daemon has made a habit of mutating the DefaultRuntime and Runtimes
values in the Config struct to merge defaults. This would be fine if it
was a part of the regular configuration loading and merging process,
as is done with other config options. The trouble is it does so in
surprising places, such as in functions with 'verify' or 'validate' in
their name. It has been necessary in order to validate that the user has
not defined a custom runtime named "runc" which would shadow the
built-in runtime of the same name. Other daemon code depends on the
runtime named "runc" always being defined in the config, but merging it
with the user config at the same time as the other defaults are merged
would trip the validation. The root of the issue is that the daemon has
used the same config values for both validating the daemon runtime
configuration as supplied by the user and for keeping track of which
runtimes have been set up by the daemon. Now that a completely separate
value is used for the latter purpose, surprising contortions are no
longer required to make the validation work as intended.

Consolidate the validation of the runtimes config and merging of the
built-in runtimes into the daemon.setupRuntimes() function. Set the
result of merging the built-in runtimes config and default default
runtime on the returned runtimes struct, without back-propagating it
onto the config.Config argument.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
d222bf097c daemon: reload runtimes w/o breaking containers
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Sebastiaan van Stijn
fb96b94ed0 daemon: remove handling for deprecated "oom-score-adjust", and produce error
This option was deprecated in 5a922dc162, which
is part of the v24.0.0 release, so we can remove it from master.

This patch;

- adds a check to ValidatePlatformConfig, and produces a fatal error
  if oom-score-adjust is set
- removes the deprecated libcontainerd/supervisor.WithOOMScore
- removes the warning from docker info

With this patch:

    dockerd --oom-score-adjust=-500 --validate
    Flag --oom-score-adjust has been deprecated, and will be removed in the next release.
    unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.

And when using `daemon.json`:

    dockerd --validate
    unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" options have been removed.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-05-06 16:36:17 +02:00
Sebastiaan van Stijn
5a922dc162 daemon: deprecate --oom-score-adjust for the daemon
The `oom-score-adjust` option was added in a894aec8d8,
to prevent the daemon from being OOM-killed before other processes. This
option was mostly added as a "convenience", as running the daemon as a
systemd unit was not yet common.

Having the daemon set its own limits is not best-practice, and something
better handled by the process-manager starting the daemon.

Commit cf7a5be0f2 fixed this option to allow
disabling it, and 2b8e68ef06 removed the default
score adjust.

This patch deprecates the option altogether, recommending users to set these
limits through the process manager used, such as the "OOMScoreAdjust" option
in systemd units.

With this patch:

    dockerd --oom-score-adjust=-500 --validate
    Flag --oom-score-adjust has been deprecated, and will be removed in the next release.
    configuration OK

    echo '{"oom-score-adjust":-500}' > /etc/docker/daemon.json
    dockerd
    INFO[2023-04-12T21:34:51.133389627Z] Starting up
    INFO[2023-04-12T21:34:51.135607544Z] containerd not running, starting managed containerd
    WARN[2023-04-12T21:34:51.135629086Z] DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" option will be removed in the next release.

    docker info
    Client:
      Context:    default
      Debug Mode: false
    ...
    DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" option will be removed in the next release

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-04-13 00:02:39 +02:00
Tianon Gravi
6caaa8cadc Prefer loading docker-init from an appropriate "libexec" directory
The `docker-init` binary is not intended to be a user-facing command, and as such it is more appropriate for it to be found in `/usr/libexec` (or similar) than in `PATH` (see the FHS, especially https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s07.html and https://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html#USRLIBLIBRARIESFORPROGRAMMINGANDPA).

This adjusts the logic for using that configuration option to take this into account and appropriately search for `docker-init` (or the user's configured alternative) in these directories before falling back to the existing `PATH` lookup behavior.

This behavior _used_ to exist for the old `dockerinit` binary (of a similar name and used in a similar way but for an alternative purpose), but that behavior was removed in 4357ed4a73 when that older `dockerinit` was also removed.

Most of this reasoning _also_ applies to `docker-proxy` (and various `containerd-xxx` binaries such as the shims), but this change does not affect those.  It would be relatively straightforward to adapt `LookupInitPath` to be a more generic function such as `libexecLookupPath` or similar if we wanted to explore that.

See 14482589df/cli-plugins/manager/manager_unix.go for the related path list in the CLI which loads CLI plugins from a similar set of paths (with a similar rationale - plugin binaries are not typically intended to be run directly by users but rather invoked _via_ the CLI binary).

Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
2023-03-24 14:25:12 -07:00
Akihiro Suda
e807ae4f2e vendor: github.com/containerd/cgroups/v3 v3.0.1
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-03-08 20:15:17 +09:00
Jan Garcia
6ab12ec8f4 rootless: move ./rootless to ./pkg/rootless
Signed-off-by: Jan Garcia <github-public@n-garcia.com>
2023-01-09 16:26:06 +01:00
Sebastiaan van Stijn
b28e66cf4f daemon/config: New(): initialize config with platform-specific defaults
This centralizes more defaults, to be part of the config struct that's
created, instead of interweaving the defaults with other code in various
places.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-08-17 08:54:32 +02:00
Sebastiaan van Stijn
881e326f7a daemon/config: remove unneeded alias
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-17 13:08:34 +02:00
Sebastiaan van Stijn
ff2a5301b8 daemon: remove discovery-related config handling
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-06 18:28:17 +01:00
Brian Goff
7ccf750daa Allow switching Windows runtimes.
This adds support for 2 runtimes on Windows, one that uses the built-in
HCSv1 integration and another which uses containerd with the runhcs
shim.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-09-23 17:44:04 +00:00
Sebastiaan van Stijn
09cf117b31 api/types: hostconfig: create enum for CgroupnsMode
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-06 19:05:54 +02:00
Sebastiaan van Stijn
98f0f0dd87 api/types: hostconfig: define consts for IpcMode
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-06 19:05:51 +02:00