202 Commits

Author SHA1 Message Date
Cory Snider
a90adb6dc1 api/types/network: use netip types as appropriate
And generate the ServiceInfo struct from the Swagger spec.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2025-10-03 21:39:14 +02:00
Sebastiaan van Stijn
0df791cb72 explicitly access Container.State instead of through embedded struct
The Container.State struct holds the container's state, and most of
its fields are expected to change dynamically. Some of these state changes
are explicit, for example, setting the container to be "stopped". Other
state changes can be more implicit, for example due to the container's
process exiting or being "OOM" killed by the kernel.

The distinction between explicit ("desired") state changes and "state"
("actual state") is sometimes vague; for some properties, we clearly
separated them, for example if a user requested the container to be
stopped or restarted, we store state in the Container object itself;

    HasBeenManuallyStopped   bool // used for unless-stopped restart policy
    HasBeenManuallyRestarted bool `json:"-"` // used to distinguish restart caused by restart policy from the manual one

Other properties are more ambiguous, such as "HasBeenStartedBefore" and
"RestartCount", which are stored on the Container (and persisted to
disk), but may be more related to "actual" state, and likely should
not be persisted;

    RestartCount             int
    HasBeenStartedBefore     bool

Given that (per the above) concurrency must be taken into account, most
changes to the `container.State` struct should be protected; here's where
things get blurry. While the `State` type provides various accessor methods,
only some of them take concurrency into account; for example, [State.IsRunning]
and [State.GetPID] acquire a lock, whereas [State.ExitCodeValue] does not.
Even the (commonly used) [State.StateString] has no locking at all.

The way to handle this is error-prone; [container.State] contains a mutex,
and it's exported. Given that it's embedded in the [container.Container]
struct, it's also exposed as an exported mutex for the container. The
assumption here is that by "merging" the two, the caller acquires a lock
when either the container _or_ its state must be mutated. However, because
some methods on `container.State` handle their own locking, consumers must
be deeply familiar with the internals if changes to both the `Container`
AND `Container.State` must be made. This gets amplified further as some
(exported!) methods, such as [container.SetRunning], mutate multiple fields
but don't acquire a lock (so expect the caller to hold one), whereas their
(also exported) counterparts (e.g. [State.IsRunning]) do.

It should be clear from the above, that this needs some architectural
changes; a clearer separation between "desired" and "actual" state (opening
the potential to update the container's config without manually touching
its `State`), possibly a method to obtain a read-only copy of the current
state (for those querying state), and reviewing which fields belong where
(and should be persisted to disk, or only remain in memory).

This PR preserves the status quo; it makes no structural changes, other
than exposing where we access the container's state. Where previously the
State fields and methods were referred to as "part of the container"
(e.g. `ctr.IsRunning()` or `ctr.Running`), we now explicitly reference
the embedded `State` (`ctr.State.IsRunning`, `ctr.State.Running`).

The exception (for now) is the mutex, which is still referenced through
the embedded struct (`ctr.Lock()` instead of `ctr.State.Lock()`), as this
is (mostly) by design to protect the container, and what's in it (including
its `State`).

[State.IsRunning]: c4afa77157/daemon/container/state.go (L205-L209)
[State.GetPID]: c4afa77157/daemon/container/state.go (L211-L216)
[State.ExitCodeValue]: c4afa77157/daemon/container/state.go (L218-L228)
[State.StateString]: c4afa77157/daemon/container/state.go (L102-L131)
[container.State]: c4afa77157/daemon/container/state.go (L15-L23)
[container.Container]: c4afa77157/daemon/container/container.go (L67-L75)
[container.SetRunning]: c4afa77157/daemon/container/state.go (L230-L277)
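
As a rough illustration of the locking pitfall described above, here is a
minimal sketch. The `State` and `Container` types below are simplified,
hypothetical stand-ins (not the daemon's actual definitions); they only show
how an embedded, exported mutex ends up shared, and what explicit access
through `ctr.State` looks like.

```go
package main

import (
	"fmt"
	"sync"
)

// State is a simplified stand-in for the container state.
// Embedding sync.Mutex exports Lock/Unlock on State itself.
type State struct {
	sync.Mutex
	Running bool
	Pid     int
}

// IsRunning acquires the lock itself, so callers must not already hold it.
func (s *State) IsRunning() bool {
	s.Lock()
	defer s.Unlock()
	return s.Running
}

// SetRunning mutates several fields without locking;
// it expects the caller to hold the lock.
func (s *State) SetRunning(pid int) {
	s.Running = true
	s.Pid = pid
}

// Container embeds State, so State's fields, methods, and mutex are
// promoted and can be reached as if they belonged to the container.
type Container struct {
	*State
	ID string
}

func main() {
	ctr := &Container{State: &State{}, ID: "abc123"}

	ctr.Lock()               // the mutex is still reached through the embedded struct
	ctr.State.SetRunning(42) // state access is spelled out explicitly
	ctr.Unlock()

	fmt.Println(ctr.State.IsRunning(), ctr.State.Pid)
}
```

Calling `ctr.State.IsRunning()` while still holding `ctr.Lock()` would
deadlock in this sketch, which is the kind of mistake that explicit
`ctr.State` access makes easier to spot in review.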

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-09-19 16:02:14 +02:00
Albin Kerouanton
d2e0895b9b daemon: deprecate env vars set by legacy links
The environment variables set by legacy links are not particularly
useful because you need to know the name of the linked container to use
them, or you need to scan all environment variables to find them.

Legacy links have been deprecated / marked "legacy" for a long time, and we
want to replace them with non-legacy links. This will help make the
default bridge work like custom networks.

For now, stop setting these environment variables inside of linking
containers by default, but provide an escape hatch to allow users who
still rely on these to re-enable them.

The integration-cli tests `TestExecEnvLinksHost` and `TestLinksEnvs` are
removed as they need to run against a daemon with legacy links env vars
enabled, and a new integration test `TestLegacyLinksEnvVars` is added to
fill the gap. Similarly, the docker-py test `test_create_with_links` is
skipped.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2025-08-14 11:32:54 +02:00
Sebastiaan van Stijn
15f78b752c daemon: make buildSandboxOptions, buildSandboxPlatformOptions more atomic
The buildSandboxPlatformOptions function was given a pointer to the
sboxOptions and modified it in-place.

Similarly, a pointer to the container was passed and `container.HostsPath`
and `container.ResolvConfPath` mutated. In cases where either of those
failed, we would return an error, but the container (and sboxOptions)
would already be modified.

This patch;

- updates the signature of buildSandboxPlatformOptions to return a fresh
  slice of sandbox options, which can be appended to the sboxOptions by
  the caller.
- uses intermediate variables for `hostsPath` and `resolvConfPath`, and
  only mutates the container if both were obtained successfully.
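
The pattern described in the two bullets above can be sketched as follows.
All names here (`sandboxOption`, `container`, `lookupPath`,
`buildPlatformOptions`) are hypothetical and simplified; this is not the
daemon's actual code, only an illustration of "return a fresh slice, and
mutate only after every step succeeded".

```go
package main

import (
	"errors"
	"fmt"
)

// sandboxOption and container are hypothetical, simplified types.
type sandboxOption string

type container struct {
	HostsPath      string
	ResolvConfPath string
}

// lookupPath stands in for whatever resolves a per-container file path.
func lookupPath(name string) (string, error) {
	if name == "" {
		return "", errors.New("empty file name")
	}
	return "/var/lib/example/" + name, nil
}

// buildPlatformOptions returns a fresh slice of options and only mutates
// ctr once both lookups have succeeded.
func buildPlatformOptions(ctr *container, hostsFile, resolvFile string) ([]sandboxOption, error) {
	hostsPath, err := lookupPath(hostsFile)
	if err != nil {
		return nil, err
	}
	resolvConfPath, err := lookupPath(resolvFile)
	if err != nil {
		return nil, err
	}
	ctr.HostsPath, ctr.ResolvConfPath = hostsPath, resolvConfPath
	return []sandboxOption{
		sandboxOption("hosts=" + hostsPath),
		sandboxOption("resolvconf=" + resolvConfPath),
	}, nil
}

func main() {
	ctr := &container{}
	sboxOptions := []sandboxOption{"hostname=example"}

	// A failing call leaves both the container and the caller's slice untouched.
	if _, err := buildPlatformOptions(ctr, "hosts", ""); err != nil {
		fmt.Println("error:", err, "HostsPath set:", ctr.HostsPath != "")
	}

	// On success, the caller decides to append the returned options.
	opts, err := buildPlatformOptions(ctr, "hosts", "resolv.conf")
	if err == nil {
		sboxOptions = append(sboxOptions, opts...)
	}
	fmt.Println(len(sboxOptions), ctr.HostsPath, ctr.ResolvConfPath)
}
```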

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-08-05 11:59:46 +02:00
Derek McGowan
f74e5d48b3 Create github.com/moby/moby/v2 module
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-31 10:13:29 -07:00
Sebastiaan van Stijn
ca1c5ee08f pkg/stringid: move to daemon, and provide copy in client
The stringid package is used in many places; while it's trivial
to implement a similar utility, let's just provide it as a utility
package in the client, removing the daemon-specific logic.

For integration tests, I opted to use the implementation in the
client, as those should ideally not make assumptions about
the daemon implementation.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-07-25 13:39:32 +02:00
Sebastiaan van Stijn
0c3185a835 daemon: killProcessDirectly: use "WithFields" for logging
Don't chain "WithError" and "WithFields"

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-07-20 15:09:07 +02:00
Derek McGowan
7a720df61f Move libnetwork to daemon/libnetwork
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-07-14 09:25:23 -07:00
Derek McGowan
5419eb1efc Move container to daemon/container
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-06-27 14:27:21 -07:00
Matthieu MOREL
381d9d0723 fix use-errors-new from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-06-26 12:07:38 +00:00
Matthieu MOREL
6d737371b8 fix comparison rule from errorlint
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-06-13 08:26:56 +00:00
Sebastiaan van Stijn
5318877858 daemon: remove // import comments
These comments were added to enforce using the correct import path for
our packages ("github.com/docker/docker", not "github.com/moby/moby").
However, when working in go module mode (not GOPATH / vendor), they have
no effect, so their impact is limited.

Remove these comments in preparation of migrating our code to become an
actual go module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-05-30 15:59:13 +02:00
Derek McGowan
d0154d3e59 Update to use github.com/moby/go-archive
Update use of idtools to moby/user for archive and other deprecated uses

Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-04-08 17:35:05 -07:00
Derek McGowan
3fc36bcac4 Update daemon to use moby sys/user identity mapping
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-04-04 08:24:09 -07:00
Sebastiaan van Stijn
9a69161992 daemon: remove Daemon.children(), Daemon.parents() wrappers
Remove the wrappers to make it more explicit that these are related to
the legacy links feature.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-01-30 14:23:10 +01:00
Sebastiaan van Stijn
3b27e36d67 daemon/links: add EnvVars function
Encapsulate the "create link -> link.ToEnv" process.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-01-20 11:49:59 +01:00
Sebastiaan van Stijn
53fec9813f daemon: Daemon.setupLinkedContainers: don't fetch linked containers if not used
This function was unconditionally trying to fetch linked containers, even
if the container was not using the default bridge (the only network that
supports legacy links).

Also removing the intermediate variable, as daemon.children, through
daemon.linkindex.children already returns a variable with a copy of these
links.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-01-20 11:49:59 +01:00
Rob Murray
e601e71681 Remove function isLinkable
Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-11-05 14:22:53 +00:00
Rob Murray
4c553defce Separate Sandbox/Endpoint construction
If config for legacy links needs to be added to a libnetwork.Sandbox,
add it when constructing the Endpoint that needs it - removing the
constraint on ordering of Endpoint construction, and the dependency
between Endpoint and Sandbox construction.

So, now a Sandbox can be constructed in one place, before the first
Endpoint.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-11-05 10:00:10 +00:00
Sebastiaan van Stijn
5a4595466b Merge pull request #48008 from thaJeztah/deprecate_runconfig_DefaultDaemonNetworkMode
runconfig: deprecate DefaultDaemonNetworkMode, move to daemon/network
2024-06-18 14:13:07 +02:00
Rob Murray
74d77d8811 Revert "Internal resolver for default bridge network"
This reverts commit 18f4f775ed.

Because buildkit doesn't run an internal resolver, and it bases its
/etc/resolv.conf on the host's ... when buildkit is run in a container
that has 'nameserver 127.0.0.11', its build containers will use Google's
DNS servers as a fallback (unless the build container uses host
networking).

Before, when the 127.0.0.11 resolver was not used for the default network,
the buildkit container would have inherited a site-local nameserver. So,
the build containers it created would also have inherited that DNS
server - and they'd be able to resolve site-local hostnames.

By replacing the site-local nameserver with Google's, we broke access
to local DNS and its hostnames.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-06-17 20:19:20 +01:00
Sebastiaan van Stijn
8e91b64e07 runconfig: deprecate DefaultDaemonNetworkMode, move to daemon/network
This function returns the default network to use for the daemon platform;
moving this to a location separate from runconfig, which is planned to
be dismantled and moved to the API.

While it might be convenient to move this utility inside api/types/container,
we don't want to advertise this function too widely, as the default returned
can ONLY be considered correct when run on the daemon-side. An alternative
would be to introduce an argument (daemonPlatform), which isn't very convenient
to use.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-06-17 17:32:56 +02:00
Sebastiaan van Stijn
7b438c5c31 daemon: rename variables that shadowed imports
Not a full list yet, but renaming to prevent shadowing, and to use a more
consistent short form (ctr for container).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-06-17 11:06:06 +02:00
Rob Murray
18f4f775ed Internal resolver for default bridge network
Until now, containers on the default bridge network have been configured
to talk directly to external DNS servers - their resolv.conf files have
either been populated with nameservers from the host's resolv.conf, or
with servers from '--dns' (or with Google's nameservers as a fallback).

This change makes the internal bridge more like other networks by using
the internal resolver.  But, the internal resolver is not populated with
container names or aliases - it's only for external DNS lookups.

Containers on the default network, on a host that has a loopback
resolver (like systemd's on 127.0.0.53) will now use that resolver
via the internal resolver. So, the logic used to find systemd's current
set of resolvers is no longer needed by the daemon.

Legacy links work just as they did before, using '/etc/hosts' and magic.

(Buildkit does not use libnetwork, so it can't use the internal resolver.
But it does use libnetwork/resolvconf's logic to configure resolv.conf.
So, code to set up resolv.conf for legacy networking without an internal
resolver can't be removed yet.)

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-06-05 20:27:24 +01:00
Rob Murray
6c68be24a2 Windows DNS resolver forwarding
Make the internal DNS resolver for Windows containers forward requests
to upstream DNS servers when it cannot respond itself, rather than
returning SERVFAIL.

Windows containers are normally configured with the internal resolver
first for service discovery (container name lookup), then external
resolvers from '--dns' or the host's networking configuration.

When a tool like ping gets a SERVFAIL from the internal resolver, it
tries the other nameservers. But, nslookup does not, and with this
change it does not need to.

The internal resolver learns external server addresses from the
container's HNSEndpoint configuration, so it will use the same DNS
servers as processes in the container.

The internal resolver for Windows containers listens on the network's
gateway address, and each container may have a different set of external
DNS servers. So, the resolver uses the source address of the DNS request
to select external resolvers.

On Windows, daemon.json feature option 'windows-no-dns-proxy' can be used
to prevent the internal resolver from forwarding requests (restoring the
old behaviour).

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-04-16 18:57:28 +01:00
Rob Murray
beb97f7fdf Refactor 'resolv.conf' generation.
Replace regex matching/replacement and re-reading of generated files
with a simple parser, and struct to remember and manipulate the file
content.

Annotate the generated file with a header comment saying the file is
generated, but can be modified, and a trailing comment describing how
the file was generated and listing external nameservers.

Always start with the host's resolv.conf file, whether generating config
for host networking, or with/without an internal resolver - rather than
editing a file previously generated for a different use-case.

Resolves an issue where rewrites of the generated file resulted in
default IPv6 nameservers being unnecessarily added to the config.

Signed-off-by: Rob Murray <rob.murray@docker.com>
2024-02-06 22:26:12 +00:00
Paweł Gronowski
f07387466a daemon/oci: Extract side effects from withMounts
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-19 17:27:16 +01:00
Sebastiaan van Stijn
cff4f20c44 migrate to github.com/containerd/log v0.1.0
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.

This patch moves our own uses of the package to use the new module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-11 17:52:23 +02:00
Albin Kerouanton
ff503882f7 daemon: Improve NetworkingConfig & EndpointSettings validation
So far, only a subset of NetworkingConfig was validated when calling
ContainerCreate. Other parameters would be validated when the container
was started. And the same goes for EndpointSettings on NetworkConnect.

This commit adds two validation steps:

1. Check if the IP addresses set in endpoint's IPAMConfig are valid,
   when ContainerCreate and ConnectToNetwork are called;
2. Check if the network allows static IP addresses, only on
   ConnectToNetwork as we need the libnetwork's Network for that and it
   might not exist until NetworkAttachment requests are sent to the
   Swarm leader (which happens only when starting the container).
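
The first validation step could look roughly like the sketch below. The
`ipamConfig` type and `validateIPAM` function are hypothetical
simplifications, and the use of `net/netip` and `errors.Join` is a choice
made for this example, not necessarily what the commit does.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// ipamConfig is a hypothetical stand-in for an endpoint's IPAM settings.
type ipamConfig struct {
	IPv4Address string
	IPv6Address string
}

// validateIPAM checks that any address that is set parses and belongs to
// the expected address family, collecting all problems in one error.
func validateIPAM(cfg ipamConfig) error {
	var errs []error
	if cfg.IPv4Address != "" {
		if addr, err := netip.ParseAddr(cfg.IPv4Address); err != nil || !addr.Is4() {
			errs = append(errs, fmt.Errorf("invalid IPv4 address: %s", cfg.IPv4Address))
		}
	}
	if cfg.IPv6Address != "" {
		if addr, err := netip.ParseAddr(cfg.IPv6Address); err != nil || !addr.Is6() {
			errs = append(errs, fmt.Errorf("invalid IPv6 address: %s", cfg.IPv6Address))
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	fmt.Println(validateIPAM(ipamConfig{IPv4Address: "10.0.0.5", IPv6Address: "2001:db8::5"}))
	fmt.Println(validateIPAM(ipamConfig{IPv4Address: "2001:db8::5", IPv6Address: "not-an-ip"}))
}
```

Running the check at create time means a caller learns about a malformed
address immediately, rather than when the container is started.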

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-09-18 17:21:06 +02:00
Sebastiaan van Stijn
dd26e6b15e daemon: Daemon.getIpcContainer: make errors less repetitive
- Most error messages returned would already include "container" and the
  container ID in the error-message (e.g. "container %s is not running"),
  so there's no need to add a custom prefix for that.
- os.Stat returns a PathError, which already includes the operation ("stat"),
  the path, and the underlying error that occurred.

And while updating, let's also fix the name to be proper camelCase :)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-24 16:20:42 +02:00
Sebastiaan van Stijn
3d94eb9bcd daemon: Daemon.getPidContainer: change to accept "id" argument
This function didn't need the whole container, only its ID, so let's
use that as argument. This also makes it consistent with getIpcContainer.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-24 16:20:42 +02:00
Sebastiaan van Stijn
bc7f341f29 daemon: WithNamespaces(): fix incorrect error for PID, IPC namespace
`Daemon.getPidContainer()` was wrapping the error-message with a message
("cannot join PID of a non running container") that did not reflect the
actual reason for the error; `Daemon.GetContainer()` could either return
an invalid parameter (invalid / empty identifier), or a "not found" error
if the specified container-ID could not be found.

In the latter case, we don't want to return a "not found" error through
the API, as this would indicate that the container we're _starting_ was
not found (which is not the case), so we need to convert the error into
an `errdefs.ErrInvalidParameter` (the container-ID specified for the PID
namespace is invalid if the container doesn't exist).

This logic is similar to what we do for IPC namespaces, which received
a similar fix in c3d7a0c603.

This patch updates the error-types, and moves them into the getIpcContainer
and getPidContainer container functions, both of which should return
an "invalid parameter" if the container was not found.

It's worth noting that, while `WithNamespaces()` may return an "invalid
parameter" error, the `start` endpoint itself may _not_; as outlined
in commit bf1fb97575, starting a container
that has an invalid configuration should be considered an internal server
error, and is not an invalid _request_. However, for uses other than
container "start", `WithNamespaces()` should return the correct error
to allow code to handle it accordingly.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-24 16:19:07 +02:00
Sebastiaan van Stijn
13648a0e21 daemon: remove Daemon.checkContainer and related utils
This was added in 12485d62ee to save some
duplication, but was really over-engineered to save a few lines of code,
at the cost of hiding away what it does and also potentially returning
inconsistent errors (not addressed in this patch). Let's start with
inlining these.

This removes;

- Daemon.checkContainer
- daemon.containerIsRunning
- daemon.containerIsNotRestarting

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-24 16:12:18 +02:00
Sebastiaan van Stijn
603547fa19 daemon: change Daemon.setupPathsAndSandboxOptions to a regular func
It's not using the daemon in any way, so let's change it to a regular
function.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-02 16:14:15 +02:00
Sebastiaan van Stijn
5e2a1195d7 swap logrus types for their containerd/logs aliases
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-01 13:02:55 +02:00
Sebastiaan van Stijn
210932b3bf daemon: format code with gofumpt
Formatting the code with https://github.com/mvdan/gofumpt

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-29 00:33:03 +02:00
Brian Goff
74da6a6363 Switch all logging to use containerd log pkg
This unifies our logging and allows us to propagate logging and trace
contexts together.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-06-24 00:23:44 +00:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.
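
A minimal read-copy-update sketch, assuming Go 1.19+ for `atomic.Pointer`.
The `config` and `daemon` types are hypothetical and tiny; the sketch also
assumes reloads are serialized by the caller (a single writer at a time).

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// config is a hypothetical, simplified daemon configuration.
type config struct {
	Debug  bool
	Labels []string
}

// clone makes a deep copy so that mutations never touch a published config.
func (c *config) clone() *config {
	cp := *c
	cp.Labels = append([]string(nil), c.Labels...)
	return &cp
}

type daemon struct {
	configStore atomic.Pointer[config]
}

// reload mutates a deep copy and atomically publishes it.
func (d *daemon) reload(mutate func(*config)) {
	next := d.configStore.Load().clone()
	mutate(next)
	d.configStore.Store(next)
}

// describe takes the captured config as an argument instead of loading it
// again, so one operation sees one consistent view.
func describe(cfg *config) string {
	return fmt.Sprintf("debug=%t labels=%v", cfg.Debug, cfg.Labels)
}

func main() {
	d := &daemon{}
	d.configStore.Store(&config{Labels: []string{"a=b"}})

	cfg := d.configStore.Load() // capture once for this operation
	d.reload(func(c *config) {
		c.Debug = true
		c.Labels = append(c.Labels, "c=d")
	})

	fmt.Println(describe(cfg))                  // the view captured before the reload
	fmt.Println(describe(d.configStore.Load())) // the newly published view
}
```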

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Sebastiaan van Stijn
ab35df454d remove pre-go1.17 build-tags
Removed pre-go1.17 build-tags with go fix;

    go mod init
    go fix -mod=readonly ./...
    rm go.mod

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-05-19 20:38:51 +02:00
Sebastiaan van Stijn
9d5e754caa move pkg/system: process to a separate package
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-11-04 01:50:23 +01:00
Sebastiaan van Stijn
970ad4e3c7 pkg/system: IsProcessZombie() ignore "os.ErrNotExist" errors
If the file doesn't exist, the process isn't running, so we should be able
to ignore that.
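
As an illustration, here is a minimal sketch of a zombie check that treats
a missing file as "not a zombie". It assumes a Linux-style
`/proc/<pid>/stat` file; the function below is a simplified stand-in, not
the package's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// isProcessZombie reports whether /proc/<pid>/stat lists the process in
// state "Z". A missing file means the process is gone, which is not an error.
func isProcessZombie(pid int) (bool, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid))
	if err != nil {
		if errors.Is(err, os.ErrNotExist) {
			return false, nil
		}
		return false, err
	}
	// The state field follows the parenthesised command name, which may
	// itself contain spaces or parentheses, so search from the end.
	s := string(data)
	i := strings.LastIndexByte(s, ')')
	if i < 0 {
		return false, fmt.Errorf("unexpected stat format for pid %d", pid)
	}
	fields := strings.Fields(s[i+1:])
	if len(fields) == 0 {
		return false, fmt.Errorf("unexpected stat format for pid %d", pid)
	}
	return fields[0] == "Z", nil
}

func main() {
	zombie, err := isProcessZombie(os.Getpid())
	fmt.Println(zombie, err)
}
```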

Also remove an intermediate variable.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-11-04 01:49:49 +01:00
Sebastiaan van Stijn
ea1eb449b7 daemon: killWithSignal, killPossiblyDeadProcess: accept syscall.Signal
This helps reduce some type-juggling / conversions further up
the stack.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-05 00:53:52 +02:00
Brian Goff
03f1c3d78f Lock down docker root dir perms.
Do not use 0701 perms.
0701 dir perms allow anyone to traverse the docker dir.
This happens to allow any user to execute, as an example, suid binaries
from image rootfs dirs, because it allows traversal AND, critically,
container users need to be able to execute things.

0701 on lower directories also happens to allow any user to modify
things in, for instance, the overlay upper dir, which necessarily
has 0755 permissions.

This changes to use 0710 which allows users in the group to traverse.
In userns mode the UID owner is (real) root and the GID is the remapped
root's GID.

This prevents anyone but the remapped root from traversing our directories
(which is required for userns with runc).
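
The difference between the two modes comes down to which class of user
holds the execute (search) bit on the directory. A small sketch, with
hypothetical helper names, that only inspects the permission bits:

```go
package main

import (
	"fmt"
	"io/fs"
)

// othersCanTraverse reports whether users outside the owner and group
// hold the execute (search) bit for a directory with the given mode.
func othersCanTraverse(mode fs.FileMode) bool {
	return mode.Perm()&0o001 != 0
}

// groupCanTraverse reports whether members of the owning group hold it.
func groupCanTraverse(mode fs.FileMode) bool {
	return mode.Perm()&0o010 != 0
}

func main() {
	for _, m := range []fs.FileMode{0o701, 0o710} {
		fmt.Printf("%04o others=%t group=%t\n",
			uint32(m.Perm()), othersCanTraverse(m), groupCanTraverse(m))
	}
}
```

With 0710, traversal is limited to the owner and the owning group, which
in userns mode is the remapped root's GID as described above.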

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit ef7237442147441a7cadcda0600be1186d81ac73)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 93ac040bf0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-10-05 09:57:00 +02:00
Eng Zer Jun
c55a4ac779 refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated in Go 1.16. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.
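
For reference, a short sketch of the most common replacements (Go 1.16+).
The `roundTrip` helper is hypothetical and exists only to exercise each
call; the comments name the `io/ioutil` function being replaced.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes content to a temporary file and reads it back using the
// io and os replacements for the deprecated io/ioutil functions.
func roundTrip(content string) (string, error) {
	dir, err := os.MkdirTemp("", "example-") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "data.txt")
	if err := os.WriteFile(path, []byte(content), 0o600); err != nil { // was ioutil.WriteFile
		return "", err
	}
	fromFile, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}
	fromReader, err := io.ReadAll(strings.NewReader(string(fromFile))) // was ioutil.ReadAll
	if err != nil {
		return "", err
	}
	return string(fromReader), nil
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```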

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-08-27 14:56:57 +08:00
Sebastiaan van Stijn
686be57d0a Update to Go 1.17.0, and gofmt with Go 1.17
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-24 23:33:27 +02:00
Brian Goff
4b981436fe Fixup libnetwork lint errors
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 23:48:32 +00:00
Brian Goff
a0a473125b Fix libnetwork imports
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-06-01 21:51:23 +00:00
Cam
e57a365ab1 docker kill: fix bug where failed kills didn't fall back to unix kill
1. fixes #41587
2. removes potential infinite Wait and goroutine leak at end of kill
function

fixes #41587

Signed-off-by: Cam <gh@sparr.email>
2021-04-14 15:43:44 -07:00
Brian Goff
7f5e39bd4f Use real root with 0701 perms
Various dirs in /var/lib/docker contain data that needs to be mounted
into a container. For this reason, these dirs are set to be owned by the
remapped root user, otherwise there can be permissions issues.
However, this unnecessarily exposes these dirs to an unprivileged user
on the host.

Instead, set the ownership of these dirs to the real root (or rather the
UID/GID of dockerd) with 0701 permissions, which allows the remapped
root to enter the directories but not read/write to them.
The remapped root needs to enter these dirs so the container's rootfs
can be configured... e.g. to mount /etc/resolv.conf.

This prevents an unprivileged user from having read/write access to
these dirs on the host.
The flip side of this is now any user can enter these directories.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit e908cc3901)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-02 13:01:25 +01:00
Sebastiaan van Stijn
7335167340 Remove redundant "os.IsNotExist" checks on os.RemoveAll()
`os.RemoveAll()` should never return this error. From the docs:

> If the path does not exist, RemoveAll returns nil (no error).
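
A quick way to see the documented behaviour, using a hypothetical helper
that removes a path which was never created:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// removeMissing calls os.RemoveAll on a path that does not exist and
// returns whatever error RemoveAll reports.
func removeMissing() error {
	dir, err := os.MkdirTemp("", "example-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	return os.RemoveAll(filepath.Join(dir, "does-not-exist"))
}

func main() {
	// No os.IsNotExist check is needed on the result.
	fmt.Println(removeMissing())
}
```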

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-09-23 10:30:53 +02:00