This makes the `ImageList` function to add `shared-size=1` to the url
query when user caller sets the SharedSize.
SharedSize support was introduced in API version 1.42. This field was
added to the options struct, but client wasn't adjusted.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 3d97f1e22d)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The replace was removed in 64f9ea1cf5, but I
forgot to remove the comment.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6326ad1729)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes the dependency on github.com/docker/docker/pkg/stringid
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a44f547343)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Conflicts:
vendor.mod
Conflict because code.cloudfoundry.org/clock moved to a direct dependency in
vendor.mod on master branch since 342b44bf20
full diff: 6341884e5f...b17f02f0a0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 64f9ea1cf5)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Current Dockerfile downloads vpnkit for both linux/amd64
and linux/arm64 platforms even if target platform does not
match. This change will download vpnkit only if target
platform matches, otherwise it will just use a dummy scratch
stage.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit 8a46a2a364)
no code changes in vendored files
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 341c9e77a8)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
no changes in vendored files
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 9a8b46518b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
no significant changes in vendored code, other than updating build-tags
for go1.17, but removes some dependencies from the module, which can
help with future updates;
full diff: 3f7ff695ad...abb19827d3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 61f266f660)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
updates the "logentries" dependency;
- checking error when calling output
- Support Go Modules
full diff: 7a984a84b5...fc06dab2ca
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 8d5eebcc6e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While this replace was needed in swarmkit itself, it looks like
it doesn't cause issues when removed in this repository, so
let's remove it here.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 62a4a45a72)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously we had to use a replace rule, as later versions of this
module resulted in a panic. This issue was fixed in:
f30034d788
Which means we can remove the replace rule, and update the dependency.
No new release was tagged yet, so sticking to a "commit" for now.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a2d758acc9)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
using TARGETVARIANT in frozen-images stage implies changes in
`download-frozen-image-v2.sh` script to add support for variants
so we are able to build against more platforms.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit 25dc760162)
The file will still be available in Git history; we should drop it
however as it is misleading and obsolete.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit ec1bb21649)
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Also cleaning up some errors
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 56e64270f3)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The golang.org/x/ projects are now doing tagged releases.
Some notable changes:
- authhandler: Add support for PKCE
- Introduce new AuthenticationError type returned by errWrappingTokenSource.Token
- Add support to set JWT Audience in JWTConfigFromJSON()
- google/internal: Add AWS Session Token to Metadata Requests
- go.mod: update vulnerable net library
- google: add support for "impersonated_service_account" credential type.
- google/externalaccount: add support for workforce pool credentials
full diff: https://github.com/golang/oauth2/compare/2bc19b11175f...v0.1.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a6cb8efd81)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the IoctlRetInt, IoctlSetInt and IoctlLoopSetStatus64 helper
functions defined in the golang.org/x/sys/unix package instead of
manually wrapping these using a locally defined function.
Inspired by 3cc3d8a560
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit c7c02eea81)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The 1.16 `io/fs` compatibility code was being built on 1.18 and 1.19.
Drop it completely as 1.16 is long EOL, and additionally drop 1.17 as it
has been EOL for a month and 1.18 is both the minimum Go supported by
the 20.10 branch, as well as a very easy jump from 1.17.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit 85fa72c599)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
opencontainers/go-digest is a 1:1 copy of the one in distribution. It's no
longer used in distribution itself, so may be removed there at some point.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6174d00c03)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was added in 442b45628e as part of
user-namespaces, and first used in 44e1023a93 to
set up the daemon root, and move the existing content;
44e1023a93/daemon/daemon_experimental.go (L68-L71)
A later iteration no longer _moved_ the existing root directory, and removed the
use of `directory.MoveToSubdir()` e8532023f2
It looks like there's no external consumers of this utility, so we should be
save to remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 26659d5eb8)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- separate exported function from implementation, to allow for GoDoc to be
maintained in a single location.
- don't use named return variables (no "bare" return, and potentially shadowing
variables)
- reverse the `os.IsNotExist(err) && d != dir` condition, putting the "lighter"
`d != dir` first.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit bd6217bb74)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was only used in a single place, and had no external consumers.
Move it to where it's used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 07b1aa822c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a couple of places, and in most places shouldn't be used
as those locations were in unix/linux-only files, so didn't need the wrapper.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4347080b46)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I noticed the comment above this code, but didn't see a corresponding type-cast.
Looking at this file's history, I found that these were removed as part of
2f5f0af3fd, which looks to have overlooked some
deliberate type-casts.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 0a861e68df)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This fallback is used when we filter the manifest list by the user-provided platform and find no matches such that we match the previous Docker behavior (before it supported variant matching). This has been deprecated long enough that I think it's time we finally stop supporting this weird fallback, especially since it makes for buggy behavior like `docker pull --platform linux/arm/v5 alpine:3.16` leading to a `linux/arm/v6` image being pulled (I specified a variant, every manifest list entry specifies a variant, so clearly the only behavior I as a user could reasonably expect is an error that `linux/arm/v5` is not supported, but instead I get an explicitly incompatible image despite doing everything I as a user can to prevent that situation).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
(cherry picked from commit 5bc17c3e54)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On Windows, syscall.StartProcess and os/exec.Cmd did not properly
check for invalid environment variable values. A malicious
environment variable value could exploit this behavior to set a
value for a different environment variable. For example, the
environment variable string "A=B\x00C=D" set the variables "A=B" and
"C=D".
Thanks to RyotaK (https://twitter.com/ryotkak) for reporting this
issue.
This is CVE-2022-41716 and Go issue https://go.dev/issue/56284.
This Go release also fixes https://github.com/golang/go/issues/56309, a
runtime bug which can cause random memory corruption when a goroutine
exits with runtime.LockOSThread() set. This fix is necessary to unblock
work to replace certain uses of pkg/reexec with unshared OS threads.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit f9d4589976)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Currently an attempt to pull a reference which resolves to an OCI
artifact (Helm chart for example), results in a bit unrelated error
message `invalid rootfs in image configuration`.
This provides a more meaningful error in case a user attempts to
download a media type which isn't image related.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Deleting a containerd task whose status is Created fails with a
"precondition failed" error. This is because (aside from Windows)
a process is spawned when the task is created, and deleting the task
while the process is running would leak the process if it was allowed.
libcontainerd mistakenly tries to clean up from a failed start by
deleting the created task, which will always fail with the
aforementioned error. Change it to pass the `WithProcessKill` delete
option so the cleanup has a chance to succeed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 1bef9e3fbf)
Signed-off-by: Cory Snider <csnider@mirantis.com>
This fix tries to address issues raised in #44346.
The max-concurrent-downloads and max-concurrent-uploads limits are applied for the whole engine and not for each pull/push command.
Signed-off-by: Luis Henrique Mulinari <luis.mulinari@gmail.com>
(cherry picked from commit 6c0aa5b00a)
Signed-off-by: Cory Snider <csnider@mirantis.com>
cmd.Environ() is new in go1.19, and not needed for this specific case.
Without this, trying to use this package in code that uses go1.18 will fail;
builder/remotecontext/git/gitutils.go:216:23: cmd.Environ undefined (type *exec.Cmd has no field or method Environ)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4fdc1bb1fb)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
GitHub uses these parameters to construct a name; removing the ./ prefix
to make them more readable (and add them back where it's used)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 0760c6f4e1)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- On Windows, we don't build and run a local test registry (we're not running
docker-in-docker), so we need to skip this test.
- On rootless, networking doesn't support this (currently)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4f43cb660a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is accomplished by storing the distribution source in the content
labels. If the distribution source is not found then we check to the
registry to see if the digest exists in the repo, if it does exist then
the puller will use it.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 27530efedb)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Setting cmd.Env overrides the default of passing through the parent
process' environment, which works out fine most of the time, except when
it doesn't. For whatever reason, leaving out all the environment causes
git-for-windows sh.exe subprocesses to enter an infinite loop of
access violations during Cygwin initialization in certain environments
(specifically, our very own dev container image).
Signed-off-by: Cory Snider <csnider@mirantis.com>
While it is undesirable for the system or user git config to be used
when the daemon clones a Git repo, it could break workflows if it was
unconditionally applied to docker/cli as well.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Prevent git commands we run from reading the user or system
configuration, or cloning submodules from the local filesystem.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Keep It Simple! Set the working directory for git commands by...setting
the git process's working directory. Git commands can be run in the
parent process's working directory by passing the empty string.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Make the test more debuggable by logging all git command output and
running each table-driven test case as a subtest.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Previously, Docker Hub was excluded when configuring "allow-nondistributable-artifacts".
With the updated policy announced by Microsoft, we can remove this restriction;
https://techcommunity.microsoft.com/t5/containers/announcing-windows-container-base-image-redistribution-rights/ba-p/3645201
There are plans to deprecated support for foreign layers altogether in the OCI,
and we should consider to make this option the default, but as that requires
deprecating the option (and possibly keeping an "opt-out" option), we can look
at that separately.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 30e5333ce3)
Previously we waited for 60 seconds after the service faults to restart
it. However, there isn't much benefit to waiting this long. We expect
15 seconds to be a more reasonable delay.
Co-Authored-by: Kevin Parsons <kevpar@microsoft.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 624daf8d9e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is the equivalent of the local implementation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 3c585e6567)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
golang.org/x/sys/windows now implements this, so we can use that
instead of a local implementation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6176ab5901)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `IsAnInteractiveSession` was deprecated, and `IsWindowsService` is marked
as the recommended replacement.
For details, see 280f808b4a
> CL 244958 includes isWindowsService function that determines if a
> process is running as a service. The code of the function is based on
> public .Net implementation.
>
> IsAnInteractiveSession function implements similar functionality, but
> is based on an old Stackoverflow post., which is not as authoritative
> as code written by Microsoft for their official product.
>
> This change copies CL 244958 isWindowsService function into svc package
> and makes it public. The intention is that future users will prefer
> IsWindowsService to IsAnInteractiveSession.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit ffcddc908e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go-winio now defines this function, so we can consume that.
Note that there's a difference between the old implementation and the original
one (added in 1cb9e9b44e). The old implementation
had special handling for win32 error codes, which was removed in the go-winio
implementation in 0966e1ad56
As `go-winio.GetFileSystemType()` calls `filepath.VolumeName(path)` internally,
this patch also removes the `string(home[0])`, which is redundant, and could
potentially panic if an empty string would be passed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 90431d1857)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 955c1f881a (Docker v17.12.0) replaced
detection of support for multiple lowerdirs (as required by overlay2) to not
depend on the kernel version. The `overlay2.override_kernel_check` was still
used to print a warning that older kernel versions may not have full support.
After this, commit e226aea280 (Docker v20.10.0,
backported to v19.03.7) removed uses of the `overlay2.override_kernel_check`
option altogether, but we were still parsing it.
This patch changes the `parseOptions()` function to not parse the option,
printing a deprecation warning instead. We should change this to be an error,
but the `overlay2.override_kernel_check` option was not deprecated in the
documentation, so keeping it around for one more release.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit e35700eb50)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The pkg/fsutils package was forked in containerd, and later moved to
containerd/continuity/fs. As we're moving more bits to containerd, let's also
use the same implementation to reduce code-duplication and to prevent them from
diverging.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 5b6b42162b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds a new filter argument to the volume prune endpoint "all".
When this is not set, or it is a false-y value, then only anonymous
volumes are considered for pruning.
When `all` is set to a truth-y value, you get the old behavior.
This is an API change, but I think one that is what most people would
want.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 618f26ccbc)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
From the mailing list:
We have just released Go versions 1.19.2 and 1.18.7, minor point releases.
These minor releases include 3 security fixes following the security policy:
- archive/tar: unbounded memory consumption when reading headers
Reader.Read did not set a limit on the maximum size of file headers.
A maliciously crafted archive could cause Read to allocate unbounded
amounts of memory, potentially causing resource exhaustion or panics.
Reader.Read now limits the maximum size of header blocks to 1 MiB.
Thanks to Adam Korczynski (ADA Logics) and OSS-Fuzz for reporting this issue.
This is CVE-2022-2879 and Go issue https://go.dev/issue/54853.
- net/http/httputil: ReverseProxy should not forward unparseable query parameters
Requests forwarded by ReverseProxy included the raw query parameters from the
inbound request, including unparseable parameters rejected by net/http. This
could permit query parameter smuggling when a Go proxy forwards a parameter
with an unparseable value.
ReverseProxy will now sanitize the query parameters in the forwarded query
when the outbound request's Form field is set after the ReverseProxy.Director
function returns, indicating that the proxy has parsed the query parameters.
Proxies which do not parse query parameters continue to forward the original
query parameters unchanged.
Thanks to Gal Goldstein (Security Researcher, Oxeye) and
Daniel Abeles (Head of Research, Oxeye) for reporting this issue.
This is CVE-2022-2880 and Go issue https://go.dev/issue/54663.
- regexp/syntax: limit memory used by parsing regexps
The parsed regexp representation is linear in the size of the input,
but in some cases the constant factor can be as high as 40,000,
making relatively small regexps consume much larger amounts of memory.
Each regexp being parsed is now limited to a 256 MB memory footprint.
Regular expressions whose representation would use more space than that
are now rejected. Normal use of regular expressions is unaffected.
Thanks to Adam Korczynski (ADA Logics) and OSS-Fuzz for reporting this issue.
This is CVE-2022-41715 and Go issue https://go.dev/issue/55949.
View the release notes for more information: https://go.dev/doc/devel/release#go1.19.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 7b4e4c08b5)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7b153b9e28 updated the main
swagger file, but didn't update the v1.42 version used for the
documentation as it wasn't created yet at the time.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 271243d382)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this change restarting the daemon in live-restore with running
containers + a restart policy meant that volume refs were not restored.
This specifically happens when the container is still running *and*
there is a restart policy that would make sure the container was running
again on restart.
The bug allows volumes to be removed even though containers are
referencing them. 😱
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 4c0e0979b4)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
These functions were used in 63a7ccdd23, which was
part of Docker v1.5.0 and v1.6.0, but removed in Docker v1.7.0 when the network
stack was replaced with libnetwork in d18919e304.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 49de15cdcc)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
runconfig/config_test.go:23:46: empty-lines: extra empty line at the start of a block (revive)
runconfig/config_test.go:75:55: empty-lines: extra empty line at the start of a block (revive)
oci/devices_linux.go:57:34: empty-lines: extra empty line at the start of a block (revive)
oci/devices_linux.go:60:69: empty-lines: extra empty line at the start of a block (revive)
image/fs_test.go:53:38: empty-lines: extra empty line at the end of a block (revive)
image/tarexport/save.go:88:29: empty-lines: extra empty line at the end of a block (revive)
layer/layer_unix_test.go:21:34: empty-lines: extra empty line at the end of a block (revive)
distribution/xfer/download.go:302:9: empty-lines: extra empty line at the end of a block (revive)
distribution/manifest_test.go:154:99: empty-lines: extra empty line at the end of a block (revive)
distribution/manifest_test.go:329:52: empty-lines: extra empty line at the end of a block (revive)
distribution/manifest_test.go:354:59: empty-lines: extra empty line at the end of a block (revive)
registry/config_test.go:323:42: empty-lines: extra empty line at the end of a block (revive)
registry/config_test.go:350:33: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 8a2e1245d4)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
cmd/dockerd/trap/trap_linux_test.go:29:29: empty-lines: extra empty line at the end of a block (revive)
cmd/dockerd/daemon.go:327:35: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit f63dea4337)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
client/events.go:19:115: empty-lines: extra empty line at the start of a block (revive)
client/events_test.go:60:31: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit cd51c9fafb)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
api/server/router/build/build_routes.go:239:32: empty-lines: extra empty line at the start of a block (revive)
api/server/middleware/version.go:45:241: empty-lines: extra empty line at the end of a block (revive)
api/server/router/swarm/helpers_test.go:11:44: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit f71fe8476a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
opts/address_pools_test.go:7:39: empty-lines: extra empty line at the end of a block (revive)
opts/opts_test.go:12:42: empty-lines: extra empty line at the end of a block (revive)
opts/opts_test.go:60:49: empty-lines: extra empty line at the end of a block (revive)
opts/opts_test.go:253:37: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit b04f1416f6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/network/filter_test.go:174:19: empty-lines: extra empty line at the end of a block (revive)
daemon/restart.go:17:116: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_linux_test.go:255:41: empty-lines: extra empty line at the end of a block (revive)
daemon/reload_test.go:340:58: empty-lines: extra empty line at the end of a block (revive)
daemon/oci_linux.go:495:101: empty-lines: extra empty line at the end of a block (revive)
daemon/seccomp_linux_test.go:17:36: empty-lines: extra empty line at the start of a block (revive)
daemon/container_operations.go:560:73: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_unix.go:558:76: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_unix.go:1092:64: empty-lines: extra empty line at the start of a block (revive)
daemon/container_operations.go:587:24: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:807:18: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:813:42: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:872:72: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit ddb42f3ad2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/images/image_squash.go:17:71: empty-lines: extra empty line at the start of a block (revive)
daemon/images/store.go:128:27: empty-lines: extra empty line at the end of a block (revive)
daemon/images/image_list.go:154:55: empty-lines: extra empty line at the start of a block (revive)
daemon/images/image_delete.go:135:13: empty-lines: extra empty line at the end of a block (revive)
daemon/images/image_search.go:25:64: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 05042ce472)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/logger/loggertest/logreader.go:58:43: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/ring_test.go:119:34: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:37:12: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:41:44: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:170:9: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/loggerutils/sharedtemp_test.go:152:43: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/loggerutils/sharedtemp.go:124:117: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/syslog/syslog.go:249:87: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 0695a910c6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/graphdriver/aufs/aufs.go:239:80: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/graphtest/graphbench_unix.go:249:27: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/graphtest/testutil.go:271:30: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/graphtest/graphbench_unix.go:179:32: empty-block: this block is empty, you can remove it (revive)
daemon/graphdriver/zfs/zfs.go:375:48: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/overlay/overlay.go:248:89: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:636:21: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1150:70: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1613:30: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1645:65: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/btrfs/btrfs.go:53:101: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1944:89: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 9d9cca49b4)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/cluster/convert/service.go:96:34: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:169:44: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:470:30: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/container.go:224:23: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/convert/network.go:109:14: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:537:27: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:247:19: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:252:41: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:256:12: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:289:80: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/executor/container/health_test.go:18:37: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/executor/container/adapter.go:437:68: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 0c7b930952)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
plugin/v2/settable_test.go:24:29: empty-lines: extra empty line at the end of a block (revive)
plugin/manager_linux.go:96:6: empty-lines: extra empty line at the end of a block (revive)
plugin/backend_linux.go:373:16: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4eb9b5f20e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
volume/mounts/parser_test.go:42:39: empty-lines: extra empty line at the end of a block (revive)
volume/mounts/windows_parser.go:129:24: empty-lines: extra empty line at the end of a block (revive)
volume/local/local_test.go:16:35: empty-lines: extra empty line at the end of a block (revive)
volume/local/local_unix.go:145:3: early-return: if c {...} else {... return } can be simplified to if !c { ... return } ... (revive)
volume/service/service_test.go:18:38: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 188724a597)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
testutil/fixtures/load/frozen.go:141:99: empty-lines: extra empty line at the end of a block (revive)
testutil/daemon/plugin.go:56:129: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit e9f1b83a4a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
integration/config/config_test.go:106:31: empty-lines: extra empty line at the end of a block (revive)
integration/secret/secret_test.go:106:31: empty-lines: extra empty line at the end of a block (revive)
integration/network/service_test.go:58:50: empty-lines: extra empty line at the end of a block (revive)
integration/network/service_test.go:401:58: empty-lines: extra empty line at the end of a block (revive)
integration/system/event_test.go:30:38: empty-lines: extra empty line at the end of a block (revive)
integration/plugin/logging/read_test.go:19:41: empty-lines: extra empty line at the end of a block (revive)
integration/service/list_test.go:30:48: empty-lines: extra empty line at the end of a block (revive)
integration/service/create_test.go:400:46: empty-lines: extra empty line at the start of a block (revive)
integration/container/logs_test.go:156:42: empty-lines: extra empty line at the end of a block (revive)
integration/container/daemon_linux_test.go:135:44: empty-lines: extra empty line at the end of a block (revive)
integration/container/restart_test.go:160:62: empty-lines: extra empty line at the end of a block (revive)
integration/container/wait_test.go:181:47: empty-lines: extra empty line at the end of a block (revive)
integration/container/restart_test.go:116:30: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 786e6d80ba)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
builder/remotecontext/detect_test.go:64:66: empty-lines: extra empty line at the end of a block (revive)
builder/remotecontext/detect_test.go:78:46: empty-lines: extra empty line at the end of a block (revive)
builder/remotecontext/detect_test.go:91:51: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/internals_test.go:95:38: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/copy.go:86:112: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/dispatchers_test.go:286:39: empty-lines: extra empty line at the start of a block (revive)
builder/dockerfile/builder.go:280:38: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/dispatchers.go:66:85: empty-lines: extra empty line at the start of a block (revive)
builder/dockerfile/dispatchers.go:559:85: empty-lines: extra empty line at the start of a block (revive)
builder/builder-next/adapters/localinlinecache/inlinecache.go:26:183: empty-lines: extra empty line at the start of a block (revive)
builder/builder-next/adapters/containerimage/pull.go:441:9: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit ecb4ed172b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
integration-cli/docker_cli_pull_test.go:55:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_exec_test.go:46:64: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_service_health_test.go:86:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_images_test.go:128:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_node_test.go:79:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_health_test.go:51:57: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_health_test.go:159:73: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_swarm_unix_test.go:60:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_inspect_test.go:30:33: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_build_test.go:429:71: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_attach_unix_test.go:19:78: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_build_test.go:470:70: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_history_test.go:29:64: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_links_test.go:93:86: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:33:61: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_links_test.go:145:78: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:114:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_attach_test.go:226:153: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_by_digest_test.go:239:71: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_create_test.go:135:49: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:143:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:181:71: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_inspect_test.go:72:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:98:77: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:144:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_rmi_test.go:63:2: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:199:79: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_rmi_test.go:69:2: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:300:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_prune_unix_test.go:35:25: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_events_unix_test.go:393:60: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_events_unix_test.go:441:71: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_ps_test.go:33:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_ps_test.go:559:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_events_test.go:117:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:547:74: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_api_containers_test.go:1054:84: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:1076:87: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:1232:72: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_api_containers_test.go:1801:21: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_network_unix_test.go:58:95: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_network_unix_test.go:750:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_network_unix_test.go:765:76: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_swarm_test.go:617:100: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_swarm_test.go:892:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_daemon_test.go:119:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_daemon_test.go:981:68: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_daemon_test.go:1951:87: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_run_test.go:83:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_run_test.go:357:72: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_build_test.go:89:83: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:114:83: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:183:80: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:290:71: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:314:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:331:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:366:76: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:403:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:648:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:708:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:938:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1018:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1097:2: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1182:62: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1244:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1524:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1546:80: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1716:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1730:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:2162:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:2270:71: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:2288:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3206:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3392:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3433:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3678:76: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3732:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3759:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3802:61: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3898:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4107:9: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4791:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4821:73: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4854:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:5341:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:5593:81: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:2145:11: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit dc0c2340b8)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also renamed variables that collided with import
api/types/strslice/strslice_test.go:36:41: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 31441778fa)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/directory/directory.go:9:49: empty-lines: extra empty line at the start of a block (revive)
pkg/pubsub/publisher.go:8:48: empty-lines: extra empty line at the start of a block (revive)
pkg/loopback/attach_loopback.go:96:69: empty-lines: extra empty line at the start of a block (revive)
pkg/devicemapper/devmapper_wrapper.go:136:48: empty-lines: extra empty line at the start of a block (revive)
pkg/devicemapper/devmapper.go:391:35: empty-lines: extra empty line at the end of a block (revive)
pkg/devicemapper/devmapper.go:676:35: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/changes_posix_test.go:15:38: empty-lines: extra empty line at the end of a block (revive)
pkg/devicemapper/devmapper.go:241:51: empty-lines: extra empty line at the start of a block (revive)
pkg/fileutils/fileutils_test.go:17:47: empty-lines: extra empty line at the end of a block (revive)
pkg/fileutils/fileutils_test.go:34:48: empty-lines: extra empty line at the end of a block (revive)
pkg/fileutils/fileutils_test.go:318:32: empty-lines: extra empty line at the end of a block (revive)
pkg/tailfile/tailfile.go:171:6: empty-lines: extra empty line at the end of a block (revive)
pkg/tarsum/fileinfosums_test.go:16:41: empty-lines: extra empty line at the end of a block (revive)
pkg/tarsum/tarsum_test.go:198:42: empty-lines: extra empty line at the start of a block (revive)
pkg/tarsum/tarsum_test.go:294:25: empty-lines: extra empty line at the start of a block (revive)
pkg/tarsum/tarsum_test.go:407:34: empty-lines: extra empty line at the end of a block (revive)
pkg/ioutils/fswriters_test.go:52:45: empty-lines: extra empty line at the end of a block (revive)
pkg/ioutils/writers_test.go:24:39: empty-lines: extra empty line at the end of a block (revive)
pkg/ioutils/bytespipe_test.go:78:26: empty-lines: extra empty line at the end of a block (revive)
pkg/sysinfo/sysinfo_linux_test.go:13:37: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive_linux_test.go:57:64: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/changes.go:248:72: empty-lines: extra empty line at the start of a block (revive)
pkg/archive/changes_posix_test.go:15:38: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/copy.go:248:124: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/diff_test.go:198:44: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive.go:304:12: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive.go:749:37: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive.go:812:81: empty-lines: extra empty line at the start of a block (revive)
pkg/archive/copy_unix_test.go:347:34: empty-lines: extra empty line at the end of a block (revive)
pkg/system/path.go:11:39: empty-lines: extra empty line at the end of a block (revive)
pkg/system/meminfo_linux.go:29:21: empty-lines: extra empty line at the end of a block (revive)
pkg/plugins/plugins.go:135:32: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/response.go:71:48: empty-lines: extra empty line at the start of a block (revive)
pkg/authorization/api_test.go:18:51: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/middleware_test.go:23:44: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/middleware_unix_test.go:17:46: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/api_test.go:57:45: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/response.go:83:50: empty-lines: extra empty line at the start of a block (revive)
pkg/authorization/api_test.go:66:47: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/middleware_unix_test.go:45:48: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/response.go:145:75: empty-lines: extra empty line at the start of a block (revive)
pkg/authorization/middleware_unix_test.go:56:51: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 412c650e05)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This package was moved to a separate repository, using the steps below:
# install filter-repo (https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
brew install git-filter-repo
cd ~/projects
# create a temporary clone of docker
git clone https://github.com/docker/docker.git moby_pubsub_temp
cd moby_pubsub_temp
# for reference
git rev-parse HEAD
# --> 572ca799db
# remove all code, except for pkg/pubsub, license, and notice, and rename pkg/pubsub to /
git filter-repo --path pkg/pubsub/ --path LICENSE --path NOTICE --path-rename pkg/pubsub/:
# remove canonical imports
git revert -s -S 585ff0ebbe6bc25b801a0e0087dd5353099cb72e
# initialize module
go mod init github.com/moby/pubsub
go mod tidy
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 0249afc523)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In preparation of moving this package separate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 0440ca07ba)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `docker` CLI currently doesn't handle situations where the current context
(as defined in `~/.docker/config.json`) is invalid or doesn't exist. As loading
(and checking) the context happens during initialization of the CLI, this
prevents `docker context` commands from being used, which makes it complicated
to fix the situation. For example, running `docker context use <correct context>`
would fail, which makes it not possible to update the `~/.docker/config.json`,
unless doing so manually.
For example, given the following `~/.docker/config.json`:
```json
{
"currentContext": "nosuchcontext"
}
```
All of the commands below fail:
```bash
docker context inspect rootless
Current context "nosuchcontext" is not found on the file system, please check your config file at /Users/thajeztah/.docker/config.json
docker context rm --force rootless
Current context "nosuchcontext" is not found on the file system, please check your config file at /Users/thajeztah/.docker/config.json
docker context use default
Current context "nosuchcontext" is not found on the file system, please check your config file at /Users/thajeztah/.docker/config.json
```
While these things should be fixed, this patch updates the script to switch
the context using the `--context` flag; this flag is taken into account when
initializing the CLI, so that having an invalid context configured won't
block `docker context` commands from being executed. Given that all `context`
commands are local operations, "any" context can be used (it doesn't need to
make a connection with the daemon).
With this patch, those commands can now be run (and won't fail for the wrong
reason);
```bash
docker --context=default context inspect -f "{{.Name}}" rootless
rootless
docker --context=default context inspect -f "{{.Name}}" rootless-doesnt-exist
context "rootless-doesnt-exist" does not exist
```
One other issue may also cause things to fail during uninstall; trying to remove
a context that doesn't exist will fail (even with the `-f` / `--force` option
set);
```bash
docker --context=default context rm blablabla
Error: context "blablabla": not found
```
While this is "ok" in most circumstances, it also means that (potentially) the
current context is not reset to "default", so this patch adds an explicit
`docker context use`, as well as unsetting the `DOCKER_HOST` and `DOCKER_CONTEXT`
environment variables.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit e2114731e7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was duplicated in two places -- factor it out, add
documentation, and move magic numbers into a constant.
Additionally, use the same permissions (0755) in both code paths, and
ensure that the ID map is used in both code paths.
Co-authored-by: Vasiliy Ulyanov <vulyanov@suse.de>
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Signed-off-by: Vasiliy Ulyanov <vulyanov@suse.de>
(cherry picked from commit 4831ff9f27)
This fixes an inifinite loop in mkdirAs(), used by `MkdirAllAndChown`,
`MkdirAndChown`, and `MkdirAllAndChownNew`, as well as directories being
chown'd multiple times when relative paths are used.
The for loop in this function was incorrectly assuming that;
1. `filepath.Dir()` would always return the parent directory of any given path
2. traversing any given path to ultimately result in "/"
While this is correct for absolute and "cleaned" paths, both assumptions are
incorrect in some variations of "path";
1. for paths with a trailing path-separator ("some/path/"), or dot ("."),
`filepath.Dir()` considers the (implicit) "." to be a location _within_ the
directory, and returns "some/path" as ("parent") directory. This resulted
in the path itself to be included _twice_ in the list of paths to chown.
2. for relative paths ("./some-path", "../some-path"), "traversing" the path
would never end in "/", causing the for loop to run indefinitely:
```go
// walk back to "/" looking for directories which do not exist
// and add them to the paths array for chown after creation
dirPath := path
for {
dirPath = filepath.Dir(dirPath)
if dirPath == "/" {
break
}
if _, err := os.Stat(dirPath); err != nil && os.IsNotExist(err) {
paths = append(paths, dirPath)
}
}
```
A _partial_ mitigation for this would be to use `filepath.Clean()` before using
the path (while `filepath.Dir()` _does_ call `filepath.Clean()`, it only does so
_after_ some processing, so only cleans the result). Doing so would prevent the
double chown from happening, but would not prevent the "final" path to be "."
or ".." (in the relative path case), still causing an infinite loop, or
additional checks for "." / ".." to be needed.
| path | filepath.Dir(path) | filepath.Dir(filepath.Clean(path)) |
|----------------|--------------------|------------------------------------|
| some-path | . | . |
| ./some-path | . | . |
| ../some-path | .. | .. |
| some/path/ | some/path | some |
| ./some/path/ | some/path | some |
| ../some/path/ | ../some/path | ../some |
| some/path/. | some/path | some |
| ./some/path/. | some/path | some |
| ../some/path/. | ../some/path | ../some |
| /some/path/ | /some/path | /some |
| /some/path/. | /some/path | /some |
Instead, this patch adds a `filepath.Abs()` to the function, so make sure that
paths are both cleaned, and not resulting in an infinite loop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1e13247d6d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also switching to use arm64, as all amd64 stages have moved to GitHub actions,
so using arm64 allows the same machine to be used for tests after the DCO check
completed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 419c47a80a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the "deadcode", "structcheck", and "varcheck" linters, as they are
deprecated:
WARN [runner] The linter 'deadcode' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [runner] The linter 'structcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [runner] The linter 'varcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [linters context] structcheck is disabled because of generics. You can track the evolution of the generics support by following the https://github.com/golangci/golangci-lint/issues/2649.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 2f1c382a6d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- dbus: add Connected methods to check connections status
- dbus: add support for querying unit by PID
- dbus: implement support for cgroup freezer APIs
- journal: remove implicit initialization
- login1: add methods to get session/user properties
- login1: add context-aware ListSessions and ListUsers methods
full diff: https://github.com/github.com/coreos/go-systemd/compare/v22.3.2...v22.4.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 323ab8ef97)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that we can pass any custom containerd shim to dockerd there is need
for this check. Without this it becomes possible to use wasm shims for
example with images that have "wasi" as the OS.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
(cherry picked from commit 1a3d8019d1)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After discussing in the maintainers meeting, we concluded that Slowloris attacks
are not a real risk other than potentially having some additional goroutines
lingering around, so setting a long timeout to satisfy the linter, and to at
least have "some" timeout.
libnetwork/diagnostic/server.go:96:10: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
srv := &http.Server{
Addr: net.JoinHostPort(ip, strconv.Itoa(port)),
Handler: s,
}
api/server/server.go:60:10: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
srv: &http.Server{
Addr: addr,
},
daemon/metrics_unix.go:34:13: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
if err := http.Serve(l, mux); err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
^
cmd/dockerd/metrics.go:27:13: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
if err := http.Serve(l, mux); err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 55fd77f724)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was missing to be removed from Jenkinsfile when we moved
to GHA for unit and integration tests.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit cd54f31984)
The OCI image spec is considering to change the Image struct and embedding the
Platform type (see opencontainers/image-spec#959) in the go implementation.
Moby currently uses some struct-literals to propagate the platform fields,
which will break once those changes in the OCI spec are merged.
Ideally (once that change arrives) we would update the code to set the Platform
information as a whole, instead of assigning related fields individually, but
in some cases in the code, image platform information is only partially set
(for example, OSVersion and OSFeatures are not preserved in all cases). This
may be on purpose, so needs to be reviewed.
This patch keeps the current behavior (assigning only specific fields), but
removes the use of struct-literals to make the code compatible with the
upcoming changes in the image-spec module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 3cb933db9d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
otherwise this one won't be considered for permission checks
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
(cherry picked from commit 25345f2c04)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
From the mailing list:
We have just released Go versions 1.19.1 and 1.18.6, minor point releases.
These minor releases include 2 security fixes following the security policy:
- net/http: handle server errors after sending GOAWAY
A closing HTTP/2 server connection could hang forever waiting for a clean
shutdown that was preempted by a subsequent fatal error. This failure mode
could be exploited to cause a denial of service.
Thanks to Bahruz Jabiyev, Tommaso Innocenti, Anthony Gavazzi, Steven Sprecher,
and Kaan Onarlioglu for reporting this.
This is CVE-2022-27664 and Go issue https://go.dev/issue/54658.
- net/url: JoinPath does not strip relative path components in all circumstances
JoinPath and URL.JoinPath would not remove `../` path components appended to a
relative path. For example, `JoinPath("https://go.dev", "../go")` returned the
URL `https://go.dev/../go`, despite the JoinPath documentation stating that
`../` path elements are cleaned from the result.
Thanks to q0jt for reporting this issue.
This is CVE-2022-32190 and Go issue https://go.dev/issue/54385.
Release notes:
go1.19.1 (released 2022-09-06) includes security fixes to the net/http and
net/url packages, as well as bug fixes to the compiler, the go command, the pprof
command, the linker, the runtime, and the crypto/tls and crypto/x509 packages.
See the Go 1.19.1 milestone on the issue tracker for details.
https://github.com/golang/go/issues?q=milestone%3AGo1.19.1+label%3ACherryPickApproved
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1eadbdd9fa)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
also ran gofmt with go1.19
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 58413c15cb)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The wrapper sets the default namespace in the context if none is
provided, this is needed because we are calling these services directly
and not trough GRPC that has an interceptor to set the default namespace
to all calls.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
(cherry picked from commit 878906630b)
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Update to the latest version that contains a fix for CVE-2022-27664;
f3363e06e7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 518179f63e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
From the mailing list:
We have just released Go versions 1.19.1 and 1.18.6, minor point releases.
These minor releases include 2 security fixes following the security policy:
- net/http: handle server errors after sending GOAWAY
A closing HTTP/2 server connection could hang forever waiting for a clean
shutdown that was preempted by a subsequent fatal error. This failure mode
could be exploited to cause a denial of service.
Thanks to Bahruz Jabiyev, Tommaso Innocenti, Anthony Gavazzi, Steven Sprecher,
and Kaan Onarlioglu for reporting this.
This is CVE-2022-27664 and Go issue https://go.dev/issue/54658.
- net/url: JoinPath does not strip relative path components in all circumstances
JoinPath and URL.JoinPath would not remove `../` path components appended to a
relative path. For example, `JoinPath("https://go.dev", "../go")` returned the
URL `https://go.dev/../go`, despite the JoinPath documentation stating that
`../` path elements are cleaned from the result.
Thanks to q0jt for reporting this issue.
This is CVE-2022-32190 and Go issue https://go.dev/issue/54385.
Release notes:
go1.18.6 (released 2022-09-06) includes security fixes to the net/http package,
as well as bug fixes to the compiler, the go command, the pprof command, the
runtime, and the crypto/tls, encoding/xml, and net packages. See the Go 1.18.6
milestone on the issue tracker for details;
https://github.com/golang/go/issues?q=milestone%3AGo1.18.6+label%3ACherryPickApproved
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit cba36a064d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
integration-cli/docker_cli_daemon_test.go:545:54: host:port in url should be constructed with net.JoinHostPort and not directly with fmt.Sprintf (nosprintfhostport)
cmdArgs = append(cmdArgs, "--tls=false", "--host", fmt.Sprintf("tcp://%s:%s", l.daemon, l.port))
^
opts/hosts_test.go:35:31: host:port in url should be constructed with net.JoinHostPort and not directly with fmt.Sprintf (nosprintfhostport)
"tcp://:5555": fmt.Sprintf("tcp://%s:5555", DefaultHTTPHost),
^
opts/hosts_test.go:91:30: host:port in url should be constructed with net.JoinHostPort and not directly with fmt.Sprintf (nosprintfhostport)
":5555": fmt.Sprintf("tcp://%s:5555", DefaultHTTPHost),
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 306b8c89e8)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Updating test-code only; set ReadHeaderTimeout for some, or suppress the linter
error for others.
contrib/httpserver/server.go:11:12: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
log.Panic(http.ListenAndServe(":80", nil))
^
integration/plugin/logging/cmd/close_on_start/main.go:42:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: mux,
}
integration/plugin/logging/cmd/discard/main.go:17:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: mux,
}
integration/plugin/logging/cmd/dummy/main.go:14:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: http.NewServeMux(),
}
integration/plugin/volumes/cmd/dummy/main.go:14:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: http.NewServeMux(),
}
testutil/fixtures/plugin/basic/basic.go:25:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: http.NewServeMux(),
}
volume/testutils/testutils.go:170:5: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
go http.Serve(l, mux)
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 31fb92c609)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The linter falsely detects this as using "math/rand":
libnetwork/networkdb/cluster.go:721:14: G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec)
val, err := rand.Int(rand.Reader, big.NewInt(int64(n)))
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 561a010161)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use net.JoinHostPort to account for IPv6 addresses.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a33d1f9a7c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While the name generator has been frozen for new additions in 624b3cfbe8,
this person has become controversial. Our intent is for this list to be inclusive
and non-controversial.
This patch removes the name from the list.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 0f052eb4f5)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Migrating these functions to allow them being shared between moby, docker/cli,
and containerd, and to allow using them without importing all of sys / system,
which (in containerd) also depends on hcsshim and more.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 509f19f611)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
filepath.IsAbs() will short-circuit on Linux/Unix, so having a single
implementation should not affect those platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 2640aec0d7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
validate other YAML files, such as the ones used in the documentation,
and GitHub actions workflows, to prevent issues such as;
- 30295c1750
- 8e8d9a3650
With this patch:
hack/validate/yamllint
Congratulations! yamllint config file formatted correctly
Congratulations! YAML files are formatted correctly
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6cef06b940)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Suppresses warnings like:
LANG=C.UTF-8 yamllint -c hack/validate/yamllint.yaml -f parsable .github/workflows/*.yml
.github/workflows/ci.yml:7:1: [warning] truthy value should be one of [false, true] (truthy)
.github/workflows/windows.yml:7:1: [warning] truthy value should be one of [false, true] (truthy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 91bb776bb8)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before:
10030:81 error line too long (89 > 80 characters) (line-length)
After:
api/swagger.yaml:10030:81: [error] line too long (89 > 80 characters) (line-length)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit f679d8c821)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't make the file hidden, and add .yaml extension, so that editors
pick up the right formatting :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 5f114b65b4)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In network node change test, the expected behavior is focused on how many nodes
left in networkDB, besides timing issues, things would also go tricky for a
leave-then-join sequence, if the check (counting the nodes) happened before the
first "leave" event, then the testcase actually miss its target and report PASS
without verifying its final result; if the check happened after the 'leave' event,
but before the 'join' event, the test would report FAIL unnecessary;
This code change would check both the db changes and the node count, it would
report PASS only when networkdb has indeed changed and the node count is expected.
Signed-off-by: David Wang <00107082@163.com>
(cherry picked from commit f499c6b9ec)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was deprecated in edac92409a, which
was part of 18.09 and up, so should be safe by now to remove this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit e14924570c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Terminating the exec process when the context is canceled has been
broken since Docker v17.11 so nobody has been able to depend upon that
behaviour in five years of releases. We are thus free from backwards-
compatibility constraints.
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4b84a33217)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since runtimes can now just be containerd shims, we need to check if the
reference is possibly a containerd shim.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit e6ee27a541)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Starting with the 22.06 release, buildx is the default client for
docker build, which uses BuildKit as builder.
This patch changes the default builder version as advertised by
the daemon to "2" (BuildKit), so that pre-22.06 CLIs with BuildKit
support (but no buildx installed) also default to using BuildKit
when interacting with a 22.06 (or up) daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `-g` / `--graph` options were soft deprecated in favor of `--data-root` in
261ef1fa27 (v17.05.0) and at the time considered
to not be removed. However, with the move towards containerd snapshotters, having
these options around adds additional complexity to handle fallbacks for deprecated
(and hidden) flags, so completing the deprecation.
With this patch:
dockerd --graph=/var/lib/docker --validate
Flag --graph has been deprecated, Use --data-root instead
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: the "graph" config file option is deprecated; use "data-root" instead
mkdir -p /etc/docker
echo '{"graph":"/var/lib/docker"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: the "graph" config file option is deprecated; use "data-root" instead
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit b58de39ca7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update the profile to make use of CAP_BPF and CAP_PERFMON capabilities. Prior to
kernel 5.8, bpf and perf_event_open required CAP_SYS_ADMIN. This change enables
finer control of the privilege setting, thus allowing us to run certain system
tracing tools with minimal privileges.
Based on the original patch from Henry Wang in the containerd repository.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 7b7d1132e8)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 6068d1894d...48dd89375d
Finishes off the work to change references to cluster volumes in the API
from using "csi" as the magic word to "cluster". This reflects that the
volumes are "cluster volumes", not "csi volumes".
Notably, there is no change to the plugin definitions being "csinode"
and "csicontroller". This terminology is appropriate with regards to
plugins because it accurates reflects what the plugin is.
Signed-off-by: Drew Erny <derny@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 9861dd069b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/containerd/v1.6.6...v1.6.7
Welcome to the v1.6.7 release of containerd!
The seventh patch release for containerd 1.6 contains various fixes,
includes a new version of runc and adds support for ppc64le and riscv64
(requires unreleased runc 1.2) builds.
Notable Updates
- Update runc to v1.1.3
- Seccomp: Allow clock_settime64 with CAP_SYS_TIME
- Fix WWW-Authenticate parsing
- Support RISC-V 64 and ppc64le builds
- Windows: Update hcsshim to v0.9.4 to fix regression with HostProcess stats
- Windows: Fix shim logs going to panic.log file
- Allow ptrace(2) by default for kernels >= 4.8
See the changelog for complete list of changes
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 7376bf948b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/containerd/v1.6.6...v1.6.7
Welcome to the v1.6.7 release of containerd!
The seventh patch release for containerd 1.6 contains various fixes,
includes a new version of runc and adds support for ppc64le and riscv64
(requires unreleased runc 1.2) builds.
Notable Updates
- Update runc to v1.1.3
- Seccomp: Allow clock_settime64 with CAP_SYS_TIME
- Fix WWW-Authenticate parsing
- Support RISC-V 64 and ppc64le builds
- Windows: Update hcsshim to v0.9.4 to fix regression with HostProcess stats
- Windows: Fix shim logs going to panic.log file
- Allow ptrace(2) by default for kernels >= 4.8
See the changelog for complete list of changes
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4e46d9f963)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/opencontainers/runc/compare/v1.1.2...v1.1.3
This is the third release of the 1.1.z series of runc, and contains
various minor improvements and bugfixes.
- Our seccomp `-ENOSYS` stub now correctly handles multiplexed syscalls on
s390 and s390x. This solves the issue where syscalls the host kernel did not
support would return `-EPERM` despite the existence of the `-ENOSYS` stub
code (this was due to how s390x does syscall multiplexing).
- Retry on dbus disconnect logic in libcontainer/cgroups/systemd now works as
intended; this fix does not affect runc binary itself but is important for
libcontainer users such as Kubernetes.
- Inability to compile with recent clang due to an issue with duplicate
constants in libseccomp-golang.
- When using systemd cgroup driver, skip adding device paths that don't exist,
to stop systemd from emitting warnings about those paths.
- Socket activation was failing when more than 3 sockets were used.
- Various CI fixes.
- Allow to bind mount `/proc/sys/kernel/ns_last_pid` to inside container.
- runc static binaries are now linked against libseccomp v2.5.4.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 2293de1c82)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was added to replace the deprecated "Parent" field.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit e0db8207f3)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field has been deprecated in BuildKit, so this follows the deprecation
in the Engine API.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit ebf339628a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7a9cb29fb9 added a new "platform" query-
parameter to the `POST /containers/create` endpoint, but did not update the
swagger file and documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 982f09f837)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7a9cb29fb9 added a new "platform" query-
parameter to the `POST /containers/create` endpoint, but did not update the
swagger file and documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1000e4ee7d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7a9cb29fb9 added a new "platform" query-
parameter to the `POST /containers/create` endpoint, but did not update the
swagger file and documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 3dae8e9fc2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Contrary to popular belief, the OCI Runtime specification does not
specify the command-line API for runtimes. Looking at containerd's
architecture from the lens of the OCI Runtime spec, the _shim_ is the
OCI Runtime and runC is "just" an implementation detail of the
io.containerd.runc.v2 runtime. When one configures a non-default runtime
in Docker, what they're really doing is instructing Docker to create
containers using the io.containerd.runc.v2 runtime with a configuration
option telling the runtime that the runC binary is at some non-default
path. Consequently, only OCI runtimes which are compatible with the
io.containerd.runc.v2 shim, such as crun, can be used in this manner.
Other OCI runtimes, including kata-containers v2, come with their own
containerd shim and are not compatible with io.containerd.runc.v2.
As Docker has not historically provided a way to select a non-default
runtime which requires its own shim, runtimes such as kata-containers v2
could not be used with Docker.
Allow other containerd shims to be used with Docker; no daemon
configuration required. If the daemon is instructed to create a
container with a runtime name which does not match any of the configured
or stock runtimes, it passes the name along to containerd verbatim. A
user can start a container with the kata-containers runtime, for
example, simply by calling
docker run --runtime io.containerd.kata.v2
Runtime names which containerd would interpret as a path to an arbitrary
binary are disallowed. While handy for development and testing it is not
strictly necessary and would allow anyone with Engine API access to
trivially execute any binary on the host as root, so we have decided it
would be safest for our users if it was not allowed.
It is not yet possible to set an alternative containerd shim as the
default runtime; it can only be configured per-container.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 547da0d575)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this change there was a race condition between State.Wait reading
the exit code from State and the State being changed instantly after the
change which ended the State.Wait.
Now, each State.Wait has its own channel which is used to transmit the
desired StateStatus at the time the state transitions to the awaited
one. Wait no longer reads the status by itself so there is no race.
The issue caused the `docker run --restart=always ...' to sometimes exit
with 0 exit code, because the process was already restarted by the time
State.Wait got the chance to read the exit code.
Test run
--------
Before:
```
$ go test -count 1 -run TestCorrectStateWaitResultAfterRestart .
--- FAIL: TestCorrectStateWaitResultAfterRestart (0.00s)
state_test.go:198: expected exit code 10, got 0
FAIL
FAIL github.com/docker/docker/container 0.011s
FAIL
```
After:
```
$ go test -count 1 -run TestCorrectStateWaitResultAfterRestart .
ok github.com/docker/docker/container 0.011s
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This caused a race condition where AutoRemove could be restored before
container was considered for restart and made autoremove containers
impossible to restart.
```
$ make DOCKER_GRAPHDRIVER=vfs BIND_DIR=. TEST_FILTER='TestContainerWithAutoRemoveCanBeRestarted' TESTFLAGS='-test.count 1' test-integration
...
=== RUN TestContainerWithAutoRemoveCanBeRestarted
=== RUN TestContainerWithAutoRemoveCanBeRestarted/kill
=== RUN TestContainerWithAutoRemoveCanBeRestarted/stop
--- PASS: TestContainerWithAutoRemoveCanBeRestarted (1.61s)
--- PASS: TestContainerWithAutoRemoveCanBeRestarted/kill (0.70s)
--- PASS: TestContainerWithAutoRemoveCanBeRestarted/stop (0.86s)
PASS
DONE 3 tests in 3.062s
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
doCopyXattrs() never reached due to copyXattrs boolean being false, as
a result file capabilities not being copied.
moved copyXattr() out of doCopyXattrs()
Signed-off-by: Illo Abdulrahim <abdulrahim.illo@nokia.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 31f654a704)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was introduced in 906b979b88, which changed
a `goto` to a `break`, but afaics, the intent was still to break out of the loop.
(linter didn't catch this before because it didn't have the right build-tag set)
daemon/logger/journald/read.go:238:4: SA4011: ineffective break statement. Did you mean to break out of the outer loop? (staticcheck)
break // won't be able to write anything anymore
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 75577fe7a8)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The correct formatting for machine-readable comments is;
//<some alphanumeric identifier>:<options>[,<option>...][ // comment]
Which basically means:
- MUST NOT have a space before `<identifier>` (e.g. `nolint`)
- Identified MUST be alphanumeric
- MUST be followed by a colon
- MUST be followed by at least one `<option>`
- Optionally additional `<options>` (comma-separated)
- Optionally followed by a comment
Any other format will not be considered a machine-readable comment by `gofmt`,
and thus formatted as a regular comment. Note that this also means that a
`//nolint` (without anything after it) is considered invalid, same for `//#nosec`
(starts with a `#`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4f08346686)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add pkey_alloc(2), pkey_free(2) and pkey_mprotect(2) in seccomp default profile.
pkey_alloc(2), pkey_free(2) and pkey_mprotect(2) can only configure
the calling process's own memory, so they are existing "safe for everyone" syscalls.
close issue: #43481
Signed-off-by: zhubojun <bojun.zhu@foxmail.com>
(cherry picked from commit e258d66f17)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Update IsErrNotFound() to check for the current type before falling back to
detecting the deprecated type.
- Remove unauthorizedError and notImplementedError types, which were not used.
- IsErrPluginPermissionDenied() was added in 7c36a1af03,
but not used at the time, and still appears to be unused.
- Deprecate IsErrUnauthorized in favor of errdefs.IsUnauthorized()
- Deprecate IsErrNotImplemented in favor of errdefs,IsNotImplemented()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit ee230d8fdd)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Older versions of Go don't format comments, so committing this as
a separate commit, so that we can already make these changes before
we upgrade to Go 1.19.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 52c1a2fae8)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was caught by goimports;
goimports -w $(find . -type f -name '*.go'| grep -v "/vendor/")
CI doesn't run on these platforms, so didn't catch it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit e4e819b49c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.18.4 (released 2022-07-12) includes security fixes to the compress/gzip,
encoding/gob, encoding/xml, go/parser, io/fs, net/http, and path/filepath
packages, as well as bug fixes to the compiler, the go command, the linker,
the runtime, and the runtime/metrics package. See the Go 1.18.4 milestone on the
issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.18.4+label%3ACherryPickApproved
This update addresses:
CVE-2022-1705, CVE-2022-1962, CVE-2022-28131, CVE-2022-30630, CVE-2022-30631,
CVE-2022-30632, CVE-2022-30633, CVE-2022-30635, and CVE-2022-32148.
Full diff: https://github.com/golang/go/compare/go1.18.3...go1.18.4
From the security announcement;
https://groups.google.com/g/golang-announce/c/nqrv9fbR0zE
We have just released Go versions 1.18.4 and 1.17.12, minor point releases. These
minor releases include 9 security fixes following the security policy:
- net/http: improper sanitization of Transfer-Encoding header
The HTTP/1 client accepted some invalid Transfer-Encoding headers as indicating
a "chunked" encoding. This could potentially allow for request smuggling, but
only if combined with an intermediate server that also improperly failed to
reject the header as invalid.
This is CVE-2022-1705 and https://go.dev/issue/53188.
- When `httputil.ReverseProxy.ServeHTTP` was called with a `Request.Header` map
containing a nil value for the X-Forwarded-For header, ReverseProxy would set
the client IP as the value of the X-Forwarded-For header, contrary to its
documentation. In the more usual case where a Director function set the
X-Forwarded-For header value to nil, ReverseProxy would leave the header
unmodified as expected.
This is https://go.dev/issue/53423 and CVE-2022-32148.
Thanks to Christian Mehlmauer for reporting this issue.
- compress/gzip: stack exhaustion in Reader.Read
Calling Reader.Read on an archive containing a large number of concatenated
0-length compressed files can cause a panic due to stack exhaustion.
This is CVE-2022-30631 and Go issue https://go.dev/issue/53168.
- encoding/xml: stack exhaustion in Unmarshal
Calling Unmarshal on a XML document into a Go struct which has a nested field
that uses the any field tag can cause a panic due to stack exhaustion.
This is CVE-2022-30633 and Go issue https://go.dev/issue/53611.
- encoding/xml: stack exhaustion in Decoder.Skip
Calling Decoder.Skip when parsing a deeply nested XML document can cause a
panic due to stack exhaustion. The Go Security team discovered this issue, and
it was independently reported by Juho Nurminen of Mattermost.
This is CVE-2022-28131 and Go issue https://go.dev/issue/53614.
- encoding/gob: stack exhaustion in Decoder.Decode
Calling Decoder.Decode on a message which contains deeply nested structures
can cause a panic due to stack exhaustion.
This is CVE-2022-30635 and Go issue https://go.dev/issue/53615.
- path/filepath: stack exhaustion in Glob
Calling Glob on a path which contains a large number of path separators can
cause a panic due to stack exhaustion.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2022-30632 and Go issue https://go.dev/issue/53416.
- io/fs: stack exhaustion in Glob
Calling Glob on a path which contains a large number of path separators can
cause a panic due to stack exhaustion.
This is CVE-2022-30630 and Go issue https://go.dev/issue/53415.
- go/parser: stack exhaustion in all Parse* functions
Calling any of the Parse functions on Go source code which contains deeply
nested types or declarations can cause a panic due to stack exhaustion.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2022-1962 and Go issue https://go.dev/issue/53616.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 34b8670b1a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
WARN [runner] The linter 'golint' is deprecated (since v1.41.0) due to: The repository of the linter has been archived by the owner. Replaced by revive.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
libnetwork/firewall_linux.go:11:21: var-declaration: should drop = nil from declaration of var ctrl; it is the zero value (revive)
ctrl *controller = nil
^
distribution/pull_v2_test.go:213:4: S1038: should use t.Fatalf(...) instead of t.Fatal(fmt.Sprintf(...)) (gosimple)
t.Fatal(fmt.Sprintf("expected formatPlatform to show windows platform with a version, but got '%s'", result))
^
integration-cli/docker_cli_build_test.go:5951:3: S1038: should use c.Skipf(...) instead of c.Skip(fmt.Sprintf(...)) (gosimple)
c.Skip(fmt.Sprintf("Bug fixed in 18.06 or higher.Skipping it for %s", testEnv.DaemonInfo.ServerVersion))
^
integration-cli/docker_cli_daemon_test.go:240:3: S1038: should use c.Skipf(...) instead of c.Skip(fmt.Sprintf(...)) (gosimple)
c.Skip(fmt.Sprintf("New base device size (%v) must be greater than (%s)", units.HumanSize(float64(newBasesizeBytes)), units.HumanSize(float64(oldBasesizeBytes))))
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
client/request.go:183:28: error-strings: error strings should not be capitalized or end with punctuation or a newline (revive)
err = errors.Wrap(err, "In the default daemon configuration on Windows, the docker client must be run with elevated privileges to connect.")
^
client/request.go:186:28: error-strings: error strings should not be capitalized or end with punctuation or a newline (revive)
err = errors.Wrap(err, "This error may indicate that the docker daemon is not running.")
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was pinned to the 1.3 version; removing the minor version to
make sure we're on the latest 1.x stable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Delve on Linux is currently only supported on amd64 and arm64;
https://github.com/go-delve/delve/blob/v1.8.1/pkg/proc/native/support_sentinel.go#L1-L6
On ppc64le and s390x, trying to install and run it, caused the
build to fail:
RUN --mount=type=cache,target=/root/.cache/go-build --mount=type=cache,target=/go/pkg/mod GOBIN=/build/ GO111MODULE=on go install "github.com/go-delve/delve/cmd/dlv@v1.8.1" && /build/dlv --help:
pkg/mod/github.com/go-delve/delve@v1.8.1/service/debugger/debugger.go:28:2: found packages native (dump_linux.go) and your_operating_system_and_architecture_combination_is_not_supported_by_delve (support_sentinel.go) in /go/pkg/mod/github.com/go-delve/delve@v1.8.1/pkg/proc/native
Error: failed to solve: executor failed running [/bin/sh -c GOBIN=/build/ GO111MODULE=on go install "github.com/go-delve/delve/cmd/dlv@${DELVE_VERSION}" && /build/dlv --help]: exit code: 1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The current docker-default AppArmor profile intends to block write
access to everything in `/proc`, except for `/proc/<pid>` and
`/proc/sys/kernel/shm*`.
Currently the rules block access to everything in `/proc/sys`, and do
not successfully allow access to `/proc/sys/kernel/shm*`. Specifically,
a path like /proc/sys/kernel/shmmax matches this part of the pattern:
deny @{PROC}/{[^1-9][^0-9][^0-9][^0-9]* }/** w,
/proc / s y s / kernel /shmmax
This patch updates the rule so that it works as intended.
Closes#39791
Signed-off-by: Phil Sphicas <phil.sphicas@att.com>
The Windows Dockerfile did not use a "v" prefix, whereas the
hack/dockerfile/install/containerd.installer did. While we're
not overriding these versions currently through build-args, doing
so would result in one of them being incorrect.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Removes some custom handling, some of which were giving the wrong
error on failure ("expected no error" when we were checking for an
error).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Set the defaults when constructing the config, instead of setting them
indirectly through the command-line flags.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was validating that the config file would not overwrite the
log-opt, but the test did not set up the flags correctly; as the flags
were not marked as "changed", it would not detect a conflict between
the config-file and daemon-flags.
This patch:
- removes the incorrect fields from the JSON file
- initializes the Config using config.New(), so that any defaults are also set
- sets flag values by actually setting them through the flags
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- TestReloadWithDuplicateLabels() also check value
- TestReloadDefaultConfigNotExist, TestReloadBadDefaultConfig,
TestReloadWithConflictingLabels: verify that config is not
reloaded.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function depends on flags having been parsed before it's used;
add a safety-net in case this function would be called before that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This prevents creating a socket and touching the filesystem before
trying to use a port that was already in use by a container.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
updating to the latest v0.5.x patch release:
full diff: https://github.com/hashicorp/go-msgpack/compare/v0.5.3...v0.5.5
- Fix an issue where struct pointer fields tagged with omitempty will be omitted
if referenced value is empty, so a field of type *bool, then field would be
omitted pointer is nil or &false.
- Fixed a decoding issue when decoding a string value in a map where the value
already existed would panic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's an indirect dependency, and we were pinning it to use the latest tagged
release (which didn't have a go.mod yet). No code changes in the vendored files,
so let's skip the replace rule.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
un-pinning the dependency
full diff: https://github.com/census-instrumentation/opencensus-go/compare/v0.22.3...v0.23.0
- replace gofmt with goimports
- Allow creating additional View universes
- Safely reject invalid-length span and trace ids
- fix Panic when x-b3-spanid exceeds 16 characters
- Reduce allocations
- Remove call to time.Now() on worker thread when handling record reqs
- Delete views from measure ref when unregistering
- Allow custom view.Meters to export metrics for other Resources
- Initialize View Start Time During View Registration
- Record a Start Time Per Time Series within a View
- Made public traceparent/tracestate marshal/unmarshal
- Fix const labels with derived metrics
- Defer IDGenerator initialization until first use
- Allow replacing trace SDK
- Provide accessor to the span implementation
- Lock only when needed, remove duplicate code
- Update dependencies
- fix memory leak cause by the spanStore.(census-instrumentation/opencensus-go)
- Adds an exported function to flush interval reader
- Adding GC stats to runmetrics plugin
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 737e8c6ab8 added validation for the wait
condition parameter, however, the default ("not-running") option was not part
of the list of valid options, resulting in a regression if the default value
was explicitly passed;
docker scan --accept-license --version
Error response from daemon: invalid condition: "not-running"
This patch adds the missing option, and adds a test to verify.
With this patch;
make BIND_DIR=. DOCKER_GRAPHDRIVER=vfs TEST_FILTER=TestWaitConditions test-integration
...
--- PASS: TestWaitConditions (0.04s)
--- PASS: TestWaitConditions/removed (1.79s)
--- PASS: TestWaitConditions/default (1.91s)
--- PASS: TestWaitConditions/next-exit (1.97s)
--- PASS: TestWaitConditions/not-running (1.99s)
PASS
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now client have the possibility to set the console size of the executed
process immediately at the creation. This makes a difference for example
when executing commands that output some kind of text user interface
which is bounded by the console dimensions.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This adds Bjorn, Cory, Nicolas and Djordje to the list of curators
to enable them to help out with triage and other tasks.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The LoopkupImage method is only used by the inspect image route and
returns an api/type struct. The depenency to api/types of the
daemon/images package is wrong, the daemon doesn't need to know about
the api types.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
This is a new attempt on making these tests less flaky. The previous attempt in
commit 585c147b7a assumed that the test was failing
if the test-daemon still had unrelated containers present from other tests, but
it appears that the actual reason for the tests to be flaky may be that the `--rm`
option was moved to the daemon side and an asynchronous operation. As a result,
the container may not yet be removed once the `docker run` completes, which happens
frequently on Windows (likely be- cause removing containers is somewhat slower
on Windows).
This patch adds a retry-loop (using `poll.WaitOn()`) to wait for the container
to be removed.
make DOCKER_GRAPHDRIVER=vfs TEST_FILTER='TestRunContainerWithRmFlag' test-integration
INFO: Testing against a local daemon
=== RUN TestDockerSuite
=== RUN TestDockerSuite/TestRunContainerWithRmFlagCannotStartContainer
=== RUN TestDockerSuite/TestRunContainerWithRmFlagExitCodeNotEqualToZero
--- PASS: TestDockerSuite (1.00s)
--- PASS: TestDockerSuite/TestRunContainerWithRmFlagCannotStartContainer (0.50s)
--- PASS: TestDockerSuite/TestRunContainerWithRmFlagExitCodeNotEqualToZero (0.49s)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Improve consistency for the logs, and remove a redundant log:
time="2022-06-07T15:37:24.418470152Z" level=debug msg="found 0 orphan layers"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
bump netlink to 1.2.1
change usages of netlink handle .Delete() to Close()
remove superfluous replace in vendor.mod
make requires of github.com/Azure/go-ansiterm direct
Signed-off-by: Martin Braun <braun@neuroforge.de>
1. Add integration tests for the ContainerLogs API call
Each test handle a distinct case of ContainerLogs output.
- Muxed stream, when container is started without tty
- Single stream, when container is started with tty
2. Add unit test for LogReader suite that tests concurrent logging
It checks that there are no race conditions when logging concurrently
from multiple goroutines.
Co-authored-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This function was calling SystemInfo() only to get the daemon's name
to add to the event that's generated.
SystemInfo() is quite heavy, and no info other than the Name was used.
The name returned is just looking up the hostname, so instead, call
`hostName()` directly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
CodeCov has been very hit-and-miss recently; it looks like we
may need some additional settings to make it compare with the
correct parent commit (perhaps it doesn't work well with rebasing),
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Welcome to the v1.6.5 release of containerd!
The fifth patch release for containerd 1.6 includes a few fixes and updated
version of runc.
Notable Updates
- Fix for older CNI plugins not reporting version
- Fix mount path handling for CRI plugin on Windows
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 483aa6294b introduced a regression, causing
spurious warnings to be shown when starting a daemon for the first time after
a fresh install:
docker info
...
WARNING: IPv4 forwarding is disabled
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
The information shown is incorrect, as checking the corresponding options on
the system, shows that these options are available:
cat /proc/sys/net/ipv4/ip_forward
1
cat /proc/sys/net/bridge/bridge-nf-call-iptables
1
cat /proc/sys/net/bridge/bridge-nf-call-ip6tables
1
The reason this is failing is because the daemon itself reconfigures those
options during networking initialization in `configureIPForwarding()`;
cf4595265e/libnetwork/drivers/bridge/setup_ip_forwarding.go (L14-L25)
Network initialization happens in the `daemon.restore()` function within `daemon.NewDaemon()`:
cf4595265e/daemon/daemon.go (L475-L478)
However, 483aa6294b moved detection of features
earlier in the `daemon.NewDaemon()` function, and collects the system information
(`d.RawSysInfo()`) before we enter `daemon.restore()`;
cf4595265e/daemon/daemon.go (L1008-L1011)
For optimization (collecting the system information comes at a cost), those
results are cached on the daemon, and will only be performed once (using a
`sync.Once`).
This patch:
- introduces a `getSysInfo()` utility, which collects system information without
caching the results
- uses `getSysInfo()` to collect the preliminary information needed at that
point in the daemon's lifecycle.
- moves printing warnings to the end of `daemon.NewDaemon()`, after all information
can be read correctly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Note that Windows does not support options, so strictly doesn't need
to have this code, but keeping it in case we're adding support.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Unmounting does not depend on wether or not loading options failed.
This code-path seemed to be used as a "hack" to prevent hitting the
unmount on Windows (which does not support unmounting).
Moving it outside of the "if" to make more clear that it's independent
of loading the options.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Differentiate between Windows and Linux, as Windows doesn't support
options, so there's no need to save options to disk,
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Instead of evaluating these paths each time (appending `_data`, or using
`filepath.Dir()` to find the root path from the `_data_` path).
This also removes the `root.DataPath()` utility, which is now no longer needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This way we can validate if Root supports quotaCtl, allowing us to
fail early, before creating any of the directories.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This moves validation of options to the start of the Create function
to prevent hitting the filesystem and having to remove the volume
from disk.
Also addressing some minor nits w.r.t. errors.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While the current code is correct (as errors.Wrap() returns nil if
err is nil), relying on this behavior has caused some confusion in
the past, resulting in regressions.
This patch makes the error-handling code slightly more idiomatic and
defensive against such regressions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since commit 2c4a868f64, Docker doesn't
use the value of net.ipv4.ip_local_port_range when choosing an ephemeral
port. This change reverts back to the previous behavior.
Fixes#43054.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Make finding the correct runtime image from image index
more compliant with OCI spec and match containerd implementation.
Changes:
- Manifest list is allowed to contain manifest lists
- Unknown mediatype inside manifest list is skipped instead of causing an error
- Platform in descriptor is optional
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
This attempts to fix CI flakiness on the TestRunContainerWithRmFlagCannotStartContainer
and TestRunContainerWithRmFlagExitCodeNotEqualToZero tests.
These tests;
- get a list of all container ID's
- run a container with `--rm`
- wait for it to exit
- checks that the list of all container IDs is empty
The last step assumes that no other tests are running on the same daemon; if
another test is running, there may be other containers present (unrelated to
the test).
This patch updates the tests to use a `docker inspect` to verify the container
no longer exists afterwards.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.18.3 (released 2022-06-01) includes security fixes to the crypto/rand,
crypto/tls, os/exec, and path/filepath packages, as well as bug fixes to the
compiler, and the crypto/tls and text/template/parse packages. See the Go
1.18.3 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.18.3+label%3ACherryPickApproved
Hello gophers,
We have just released Go versions 1.18.3 and 1.17.11, minor point releases.
These minor releases include 4 security fixes following the security policy:
- crypto/rand: rand.Read hangs with extremely large buffers
On Windows, rand.Read will hang indefinitely if passed a buffer larger than
1 << 32 - 1 bytes.
Thanks to Davis Goodin and Quim Muntal, working at Microsoft on the Go toolset,
for reporting this issue.
This is [CVE-2022-30634][CVE-2022-30634] and Go issue https://go.dev/issue/52561.
- crypto/tls: session tickets lack random ticket_age_add
Session tickets generated by crypto/tls did not contain a randomly generated
ticket_age_add. This allows an attacker that can observe TLS handshakes to
correlate successive connections by comparing ticket ages during session
resumption.
Thanks to GitHub user nervuri for reporting this.
This is [CVE-2022-30629][CVE-2022-30629] and Go issue https://go.dev/issue/52814.
- `os/exec`: empty `Cmd.Path` can result in running unintended binary on Windows
If, on Windows, `Cmd.Run`, `cmd.Start`, `cmd.Output`, or `cmd.CombinedOutput`
are executed when Cmd.Path is unset and, in the working directory, there are
binaries named either "..com" or "..exe", they will be executed.
Thanks to Chris Darroch, brian m. carlson, and Mikhail Shcherbakov for reporting
this.
This is [CVE-2022-30580][CVE-2022-30580] and Go issue https://go.dev/issue/52574.
- `path/filepath`: Clean(`.\c:`) returns `c:` on Windows
On Windows, the `filepath.Clean` function could convert an invalid path to a
valid, absolute path. For example, Clean(`.\c:`) returned `c:`.
Thanks to Unrud for reporting this issue.
This is [CVE-2022-29804][CVE-2022-29804] and Go issue https://go.dev/issue/52476.
[CVE-2022-30634]: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-30634
[CVE-2022-30629]: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-30629
[CVE-2022-30580]: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-30580
[CVE-2022-29804]: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-29804
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These seemed to prevent cleaning up directories;
On arm64:
=== RUN TestSysctlOverride
testing.go:1090: TempDir RemoveAll cleanup: unlinkat /tmp/TestSysctlOverride2860094781/001/mounts/shm: device or resource busy
--- FAIL: TestSysctlOverride (0.00s)
On Windows:
=== Failed
=== FAIL: github.com/docker/docker/daemon TestLoadOrCreateTrustKeyInvalidKeyFile (0.00s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestLoadOrCreateTrustKeyInvalidKeyFile2014634395\001\keyfile4156691647: The process cannot access the file because it is being used by another process.
=== FAIL: github.com/docker/docker/daemon/graphdriver TestIsEmptyDir (0.01s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestIsEmptyDir1962964337\001\dir-with-empty-file\file2523853824: The process cannot access the file because it is being used by another process.
=== FAIL: github.com/docker/docker/pkg/directory TestSizeEmptyFile (0.00s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSizeEmptyFile1562416712\001\file16507846: The process cannot access the file because it is being used by another process.
=== FAIL: github.com/docker/docker/pkg/directory TestSizeNonemptyFile (0.00s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSizeNonemptyFile1240832785\001\file3265662846: The process cannot access the file because it is being used by another process.
=== FAIL: github.com/docker/docker/pkg/directory TestSizeFileAndNestedDirectoryEmpty (0.00s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSizeFileAndNestedDirectoryEmpty2163416550\001\file3715413181: The process cannot access the file because it is being used by another process.
=== FAIL: github.com/docker/docker/pkg/directory TestSizeFileAndNestedDirectoryNonempty (0.00s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSizeFileAndNestedDirectoryNonempty878205470\001\file3280422273: The process cannot access the file because it is being used by another process.
=== FAIL: github.com/docker/docker/volume/service TestSetGetMeta (0.01s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSetGetMeta3332268057\001\db: The process cannot access the file because it is being used by another process.
=== FAIL: github.com/docker/docker/volume/service TestList (0.03s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestList2846947953\001\volumes\metadata.db: The process cannot access the file because it is being used by another process.
=== FAIL: github.com/docker/docker/volume/service TestRestore (0.02s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestRestore3368254142\001\volumes\metadata.db: The process cannot access the file because it is being used by another process.
=== FAIL: github.com/docker/docker/daemon/graphdriver TestIsEmptyDir (0.00s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestIsEmptyDir2823795693\001\dir-with-empty-file\file2625561089: The process cannot access the file because it is being used by another process.
=== FAIL: github.com/docker/docker/pkg/directory TestSizeFileAndNestedDirectoryNonempty (0.00s)
testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\CONTAI~1\AppData\Local\Temp\TestSizeFileAndNestedDirectoryNonempty4246252950\001\nested3442260313\file21164327: The process cannot access the file because it is being used by another process.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, with the patch from #43146, it was possible for a
network configured with a single ingress or load balancer on a
distribution which does not have the `ip_vs` kernel module loaded
by default to try to apply sysctls which did not exist yet, and
subsequently dynamically load the module as part of ipvs/netlink.go.
This module is vendored, and not a great place to try to tie back
into core libnetwork functionality, so also ensure that the sysctls
(which are idempotent) are called after ingress/lb creation once
`ipvs` has been initialized.
Signed-off-by: Ryan Barry <rbarry@mirantis.com>
The recent fix for log corruption changed the signature of the
NewLogFile and WriteLogEntry functions and the test wasn't adjusted to
this change.
Fix the test by adjusting to the new LogFile API.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Make the allocated buffers bigger to allow better reusability and avoid
frequent reallocations.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The Message is not needed after it is marshalled, so no need to hold it
for the entire function scope.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Moved the buffer pools in json-file and local logging drivers to the
whole driver scope. It is more efficient to have a pool for the whole
driver rather than for each logger instance.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
While working on deprecation of the `aufs` and `overlay` storage-drivers, the
`TestCleanupMounts` had to be updated, as it was currently using `aufs` for
testing. When rewriting the test to use `overlay2` instead (using an updated
`mountsFixture`), I found out that the test was failing, and it appears that
only `overlay`, but not `overlay2` was taken into account.
These cleanup functions were added in 05cc737f54,
but at the time the `overlay2` storage driver was not yet implemented;
05cc737f54/daemon/graphdriver
This omission was likely missed in 23e5c94cfb,
because the original implementation re-used the `overlay` storage driver, but
later on it was decided to make `overlay2` a separate storage driver.
As a result of the above, `daemon.cleanupMountsByID()` would ignore any `overlay2`
mounts during `daemon.Shutdown()` and `daemon.Cleanup()`.
This patch:
- Adds a new `mountsFixtureOverlay2` with example mounts for `overlay2`
- Rewrites the tests to use `gotest.tools` for more informative output on failures.
- Adds the missing regex patterns to `daemon/getCleanPatterns()`. The patterns
are added at the start of the list to allow for the fasted match (`overlay2`
is the default for most setups, and the code is iterating over possible
options).
As a follow-up, we could consider adding additional fixtures for different
storage drivers.
Before the fix is applied:
go test -v -run TestCleanupMounts ./daemon/
=== RUN TestCleanupMounts
=== RUN TestCleanupMounts/aufs
=== RUN TestCleanupMounts/overlay2
daemon_linux_test.go:135: assertion failed: 0 (unmounted int) != 1 (int): Expected to unmount the shm (and the shm only)
--- FAIL: TestCleanupMounts (0.01s)
--- PASS: TestCleanupMounts/aufs (0.00s)
--- FAIL: TestCleanupMounts/overlay2 (0.01s)
=== RUN TestCleanupMountsByID
=== RUN TestCleanupMountsByID/aufs
=== RUN TestCleanupMountsByID/overlay2
daemon_linux_test.go:171: assertion failed: 0 (unmounted int) != 1 (int): Expected to unmount the root (and that only)
--- FAIL: TestCleanupMountsByID (0.00s)
--- PASS: TestCleanupMountsByID/aufs (0.00s)
--- FAIL: TestCleanupMountsByID/overlay2 (0.00s)
FAIL
FAIL github.com/docker/docker/daemon 0.054s
FAIL
With the fix applied:
go test -v -run TestCleanupMounts ./daemon/
=== RUN TestCleanupMounts
=== RUN TestCleanupMounts/aufs
=== RUN TestCleanupMounts/overlay2
--- PASS: TestCleanupMounts (0.00s)
--- PASS: TestCleanupMounts/aufs (0.00s)
--- PASS: TestCleanupMounts/overlay2 (0.00s)
=== RUN TestCleanupMountsByID
=== RUN TestCleanupMountsByID/aufs
=== RUN TestCleanupMountsByID/overlay2
--- PASS: TestCleanupMountsByID (0.00s)
--- PASS: TestCleanupMountsByID/aufs (0.00s)
--- PASS: TestCleanupMountsByID/overlay2 (0.00s)
PASS
ok github.com/docker/docker/daemon 0.042s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I noticed I made a mistake in the first ping ("before swarm init"), which
was not specifying the daemon's socket path and because of that testing
against the main integration daemon (not the locally spun up daemon).
While fixing that, I wondered why the test didn't actually use the client
for the requests (to also verify the client converted the response), so
I rewrote the test to use `client.Ping()` and to verify the ping response
has the expected values set.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Marshalling log messages by json-file and local drivers involved
serializing the message into a shared buffer. This caused a regression
resulting in log corruption with recent changes where Log may be called
from multiple goroutines at the same time.
Solution is to use a sync.Pool to manage the buffers used for the
serialization. Also removed the MarshalFunc, which the driver had to
expose to the LogFile so that it can marshal the message. This is now
moved entirely to the driver.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
These HostConfig properties were not validated until the OCI spec for the container
was created, which meant that `container run` and `docker create` would accept
invalid values, and the invalid value would not be detected until `start` was
called, returning a 500 "internal server error", as well as errors from containerd
("cleanup: failed to delete container from containerd: no such container") in the
daemon logs.
As a result, a faulty container was created, and the container state remained
in the `created` state.
This patch:
- Updates `oci.WithNamespaces()` to return the correct `errdefs.InvalidParameter`
- Updates `verifyPlatformContainerSettings()` to validate these settings, so that
an error is returned when _creating_ the container.
Before this patch:
docker run -dit --ipc=shared --name foo busybox
2a00d74e9fbb7960c4718def8f6c74fa8ee754030eeb93ee26a516e27d4d029f
docker: Error response from daemon: Invalid IPC mode: shared.
docker ps -a --filter name=foo
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2a00d74e9fbb busybox "sh" About a minute ago Created foo
After this patch:
docker run -dit --ipc=shared --name foo busybox
docker: Error response from daemon: invalid IPC mode: shared.
docker ps -a --filter name=foo
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
An integration test was added to verify the new validation, which can be run with:
make BIND_DIR=. TEST_FILTER=TestCreateInvalidHostConfig DOCKER_GRAPHDRIVER=vfs test-integration
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Using the swagger.yaml to generate api models will create incompatible field types. Some inconsistencies had already been mentioned at #39131. I've added more fixes from real life experience, some only occurring on Windows.
Closes#39131
Signed-off-by: Tobias Gesellchen <tobias@gesellix.de>
Reduce the amount of time ReadLogs holds the LogFile fsop lock by
releasing it as soon as all the files are opened, before parsing the
compressed file headers.
Signed-off-by: Cory Snider <csnider@mirantis.com>
File watches have been a source of complexity and unreliability in the
LogFile follow implementation, especially when combined with file
rotation. File change events can be unreliably delivered, especially on
Windows, and the polling fallback adds latency. Following across
rotations has never worked reliably on Windows. Without synchronization
between the log writer and readers, race conditions abound: readers can
read from the file while a log entry is only partially written, leading
to decode errors and necessitating retries.
In addition to the complexities stemming from file watches, the LogFile
follow implementation had complexity from needing to handle file
truncations, and (due to a now-fixed bug in the polling file watcher
implementation) evictions to unlock the log file so it could be rotated.
Log files are now always rotated, never truncated, so these situations
no longer need to be handled by the follow code.
Rewrite the LogFile follow implementation in terms of waiting until
LogFile notifies it that a new message has been written to the log file.
The LogFile informs the follower of the file offset of the last complete
write so that the follower knows not to read past that, preventing it
from attempting to decode partial messages and making retries
unnecessary. Synchronization between LogFile and its followers is used
at critical points to prevent missed notifications of writes and races
between file rotations and the follower opening files for read.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The refCounter used for sharing temporary decompressed log files and
tracking when the files can be deleted is keyed off the source file's
path. But the path of a log file is not stable: it is renamed on each
rotation. Consequently, when logging is configured with both rotation
and compression, multiple concurrent readers of a container's logs could
read logs out of order, see duplicates or decompress a log file which
has already been decompressed.
Replace refCounter with a new implementation, sharedTempFileConverter,
which is agnostic to the file path, keying off the source file's
identity instead. Additionally, sharedTempFileConverter handles the full
lifecycle of the temporary file, from creation to deletion. This is all
abstracted from the consumer: all the bookkeeping and cleanup is handled
behind the scenes when Close() is called on the returned reader value.
Only one file descriptor is used per temporary file, which is shared by
all readers.
A channel is used for concurrency control so that the lock can be
acquired inside a select statement. While not currently utilized, this
makes it possible to add support for cancellation to
sharedTempFileConverter in the future.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Truncating the current log file while a reader is still reading through
it results in log lines getting missed. In contrast, rotating the file
allows readers who have the file open can continue to read from it
undisturbed. Rotating frees up the file name for the logger to create a
new file in its place. This remains true even when max-file=1; the
current log file is "rotated" from its name without giving it a new one.
On POSIXy filesystem APIs, rotating the last file is straightforward:
unlink()ing a file name immediately deletes the name from the filesystem
and makes it available for reuse, even if processes have the file open
at the time. Windows on the other hand only makes the name available
for reuse once the file itself is deleted, which only happens when no
processes have it open. To reuse the file name while the file is still
in use, the file needs to be renamed. So that's what we have to do:
rotate the file to a temporary name before marking it for deletion.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The json-file driver appends a newline character to log messages with
PLogMetaData.Last set, but the local driver did not. Alter the behavior
of the local driver to match that of the json-file driver.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The LogFile follower would stop immediately upon the producer closing.
The close signal would race the file watcher; if a message were to be
logged and the logger immediately closed, the follower could miss that
last message if the close signal (formerly ProducerGone) was to win the
race. Add logic to perform one more round of reading when the producer
is closed to catch up on any final logs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Whether or not the logger has been closed is a property of the logger,
and only of concern to its log reading implementation, not log watchers.
The loggers and their reader implementations can communicate as they see
fit. A single channel per logger which is closed when the logger is
closed is plenty sufficient to broadcast the state to log readers, with
no extra bookeeping or synchronization required.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The asynchronous startup of the log-reading goroutine made the
follow-tail tests nondeterministic. The Log calls in the tests which
were supposed to happen after the reader started reading would sometimes
execute before the reader, throwing off the counts. Tweak the ReadLogs
implementation so that the order of operations is deterministic.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Add an extensive test suite for validating the behavior of any
LogReader. Test the current LogFile-based implementations against it.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The jsonfilelog read benchmark was incorrectly reusing the same message
pointer in the producer loop. The message value would be reset after the
first call to jsonlogger.Log, resulting in all subsequent calls logging
a zero-valued message. This is not a representative workload for
benchmarking and throws off the throughput metric.
Reduce variation between benchmark runs by using a constant timestamp.
Write to the producer goroutine's error channel only on a non-nil error
to eliminate spurious synchronization between producer and consumer
goroutines external to the logger being benchmarked.
Signed-off-by: Cory Snider <csnider@mirantis.com>
On Linux the daemon was not respecting the HostConfig.ConsoleSize
property and relied on cli initializing the tty size after the container
was created. This caused a delay between container creation and
the tty actually being resized.
This is also a small change to the api description, because
HostConfig.ConsoleSize is no longer Windows-only.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This also fixes the GetOperatingSystem function in
pkg/parsers/operatingsystem which mistakenly truncated utsname.Machine
to the index of \0 in utsname.Sysname.
Fixes: 7aeb3efcb4
Cc: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
These query-args were documented, but not actually supported until
ea6760138c (API v1.42).
This removes them from the documentation, as these arguments were ignored
(and defaulted to `true` (enabled))
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Slightly make the change in API v1.42 more visible, and add a snippet
about what users should do to preserve the pre-v1.41 behavior.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
relates to #35082, moby/libnetwork#2491
Previously, values for expire_quiescent_template, conn_reuse_mode,
and expire_nodest_conn were set only system-wide. Also apply them
for new lb_* and ingress_sbox sandboxes, so they are appropriately
propagated
Signed-off-by: Ryan Barry <rbarry@mirantis.com>
Now that there's no differentiation between Linux and Windows
for this check, we can remove the two implementations and move
the code inline as it's only used in a single location and moving
it inline makes it more transparent on what's being checked.
As part of this change, the now unused "scope" field is also removed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in bd9814f0db to support downgrading
docker 1.7 to 1.6.
The related migration code was removed in 0023abbad3
(Docker 18.05), which was also the last consumer of VolumeDataPathName outside
of the package, so that const can be un-exported.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Similar to the (now removed) `apparmor` build tag, this build-time toggle existed for users who needed to build without the `libseccomp` library. That's no longer necessary, and given the importance of seccomp to the overall default security profile of Docker containers, it makes sense that any binary built for Linux should support (and use by default) seccomp if the underlying host does.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
This is the second patch release of the runc 1.1 release branch. It
fixes CVE-2022-29162, a minor security issue (which appears to not be
exploitable) related to process capabilities.
This is a similar bug to the ones found and fixed in Docker and
containerd recently (CVE-2022-24769).
- A bug was found in runc where runc exec --cap executed processes with
non-empty inheritable Linux process capabilities, creating an atypical Linux
environment. For more information, see GHSA-f3fp-gc8g-vw66 and CVE-2022-29162.
- runc spec no longer sets any inheritable capabilities in the created
example OCI spec (config.json) file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were added as part of d7ba1f85ef,
but weren't used at the time, and still aren't used, so let's remove
them.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
strings.ReplaceAll(s, old, new) is a wrapper function for
strings.Replace(s, old, new, -1). But strings.ReplaceAll is more
readable and removes the hardcoded -1.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
This import was left behind due to some PR's being merged, both
affecting the imports that were used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove isErrNoSuchProcess() in favor of a plain errors.As()
- errNoSuchProcess.Error(): remove punctuation
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows the postContainersKill() handler to pass values as-is. As part of
the rewrite, I also moved the daemon.GetContainer(name) call later in the
function, so that we can fail early if an invalid signal is passed, before
doing the (heavier) fetching of the container.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The libtrust trust-key is only used for pushing legacy image manifests;
pushing these images has been deprecated, and we only need to be able
to push them in our CI.
This patch disables generating the trust-key (and related paths) unless
the DOCKER_ALLOW_SCHEMA1_PUSH_DONOTUSE env-var is set (which we do in
our CI).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This change is in preparation of deprecating support for old manifests.
Currently the daemon's ID is based on the trust-key ID, which will be
removed once we fully deprecate support for old manifests (the trust
key is currently only used in tests).
This patch:
- looks if a trust-key is present; if so, it migrates the trust-key
ID to the new "engine-id" file within the daemon's root.
- if no trust-key is present (so in case it's a "fresh" install), we
generate a UUID instead and use that as ID.
The migration is to prevent engines from getting a new ID on upgrades;
while we don't provide any guarantees on the engine's ID, users may
expect the ID to be "stable" (not change) between upgrades.
A test has been added, which can be ran with;
make DOCKER_GRAPHDRIVER=vfs TEST_FILTER='TestConfigDaemonID' test-integration
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure we validate the default address given before using it, and
combine the parsing/validation logic so that it can be reused.
This patch also makes the errors more consistent, and uses pkg/errors
for generating them.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the default (0) value to indicate "not set", which simplifies
working with these configuration options, preventing the need to
use intermediate variables etc.
While changing this code, also making some small cleanups, such
as replacing "fmt.Sprintf()" for "strconv" variants.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This method returned the network controller, only to set it on the daemon.
While making this change, also;
- update some error messages to be in the correct format
- use errors.Wrap() where possible
- extract configuring networks into a separate function to make the flow
slightly easier to follow.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Starting an exec can take a significant amount of time while under heavy
container operation load. In extreme cases the time to start the process
can take upwards of a second, which is a significant fraction of the
default health probe timeout (30s). With a shorter timeout, the exec
start delay could make the difference between a successful probe and a
probe timeout! Mitigate the impact of excessive exec start latencies by
only starting the probe timeout timer after the exec'ed process has
started.
Add a metric to sample the latency of starting health-check exec probes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
reloadMaxDownloadAttempts() is used to reload the configuration,
but validation happened before merging the config with the defaults.
This removes the validation from this function, instead centralizing
validation in config.Validate().
NOTE:
Currently this validation is "ok", as it checks for "nil" values;
I am working on changes to reduce the use of pointers in the config,
and instead provide a mechanism to fill in defaults. This change is
in preparation of that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Reload logic is problematic and needs a rewrite.
Currently, config.Reload() is validating newConfig before the reload callback
is executed. At that point, newConfig may be a partial configuration, yet to be
merged with the existing configuration (in the "reload()" callback). Validating
this config before it's merged can result in incorrect validation errors.
However, the current "reload()" callback we use is DaemonCli.reloadConfig(),
which includes a call to Daemon.Reload(), which both performs "merging" and
validation, as well as actually updating the daemon configuration. Calling
DaemonCli.reloadConfig() *before* validation, could thus lead to a failure in
that function (making the reload non-atomic).
While *some* errors could always occur when applying/updating the config, we
should make it more atomic, and;
1. get (a copy of) the active configuration
2. get the new configuration
3. apply the (reloadable) options from the new configuration
4. validate the merged results
5. apply the new configuration.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
MergeDaemonConfigurations was validating the configs before and after
merging. However, the "fileConfig" configuration may contain only a
"partial" configuration (options to apply to / override the existing
config). This means that some options may not be set and contain default
or empty values.
Validating such partial configurations can produce validation failures,
so to prevent those, we should validate the configuration _after_
merging, to validate the "final" state.
There's more cleaning up / improvements to be made in this area; for
example, we currently use our "self crafted" `getConflictFreeConfiguration()`
function, which is used to detect options that are not allowed to
be overridden, and which could potentially be handled by mergo.Merge(),
but leaving those changes for a future exercise.
This patch removes the first validation step, changing the function
to only validate the resulting configuration after merging.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
spf13/pflag now provides this out of the box, so no need to implement
and use our own value-type for this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This file was originally part of the work to support Solaris, and
there's nothing "not common unix" anymmore, so merging the files.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
installConfigFlags already has separate implementations for Linux and
Windows, so no need to further differentiate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The installCommonConfigFlags() function is meant for flags that are
supported by all platforms, so removing it from that function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Release notes:
Welcome to the v1.6.3 release of containerd!
The third patch release for containerd 1.6 includes various fixes and updates.
Notable Updates
- Fix panic when configuring tracing plugin
- Improve image pull performance in CRI plugin
- Check for duplicate nspath
- Fix deadlock in cgroup metrics collector
- Mount devmapper xfs file system with "nouuid" option
- Make the temp mount as ready only in container WithVolumes
- Fix deadlock from leaving transaction open in native snapshotter
- Monitor OOMKill events to prevent missing container events
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, the API server configuration would be initialized and
validated when starting the API. Because of this, invalid configuration
(e.g. missing or invalid TLS certificates) would not be detected
when using `dockerd --validate`.
This patch moves creation of the validation earlier, so that it's
validated as part of `dockerd --validate`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, hosts were de-duplicated and normalized when starting
the API server (in `loadListeners()`), which meant that errors could
occur in that step (but not detected when using `dockerd --validate`),
as well as the list of hosts in the config not matching what would
actually be used (i.e., if duplicates were present).
This patch extracts the de-duplicating to a separate function, and
executes it as part of loading the daemon configuration, so that we
can fail early.
Moving this code also showed that some of this validation depended
on `newAPIServerConfig()` modifying the configuration (adding an
empty host if none was set) in order to have the parsing set a
default. This code was moved elsewhere, but a TODO comment added
as this logic is somewhat sketchy.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- un-export `daemonOptions.InstallFlags()`; `daemonOptions` itself isn't exported,
not exported, and `InstallFlags()` isn't matching any interface and only used
internally.
- un-export `daemonOptions.SetDefaultOptions()` and remove the `flags` argument
as we were passing `daemonOptions.flags` as argument on a method attached to
`daemonOptions`, which was somewhat backwards. While at it, also removing an
intermediate variable that wasn't needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Log-level validation was previously performed when configuring the daemon-logs;
this moves the validation to config.Validate() so that we can catch invalid
settings when running dockerd --validate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was introduced in 85572cac14, where I
probably forgot to remove this code from an earlier iteration (I decided
that having an explicit `configureCertsDir()` function call for this would
make it more transparent that we're re-configuring a default).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in ec87479b7e, but it's unclear
why a sync.Once was used just for reading an environment-variable. The
related PR had a lot of review comments, so perhaps an earlier implementation
used something more heavy-weight, or it was just overlooked.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Logging boolean was unconditionally set to true and ignored in all locations,
except for enabling the debugging middleware, which was also gated by the active
logrus logging level.
While it could make sense to have a Loglevel option configured on the API server,
we don't have this currently, and to make that actually useful, that config would
need to be tollerated by all locations that produce logs (which isn't the case
either).
Looking at the history of this option; a boolean to disable logging was originally
added in commit c423a790d6, which hard-coded it to
"disabled" in a test, and "enabled" for the API server outside of tests (before
that commit, logging was always enabled).
02ddaad5d9 and 5c42b2b512
changed the hard-coded values to be configurable through a `Logging` env-var (env-
vars were used _internally_ at the time to pass on options), which later became
a configuration struct in a0bf80fe03.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a method on the daemon, which itself holds the Config, so
there's no need to pass the same configuration as an argument.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit replaces `os.Setenv` with `t.Setenv` in tests. The
environment variable is automatically restored to its original value
when the test and all its subtests complete.
Reference: https://pkg.go.dev/testing#T.Setenv
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Also moving writeStatus() to the puller, which is where it's used, and makes
it slightly easier to consume.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were only exported to facilitate ImageService.GetRepository() (used for
the `GET /distribution/{name:.*}/json` endpoint.
Moving the core functionality of that to the distribution package makes it
more consistent with (e.g.) "pull" operations, and allows us to keep more things
internal.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's only used internally, so we can refer to the implementation itself. Given
that RegistryService.LookupPushEndpoints now only returns V2 endpoints, we
no longer need to check if an endpoint is possibly V1.
Also rename some types that had "v2" in their name, now that we only support v2.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's only used internally, so we can refer to the implementation itself. Given
that RegistryService.LookupPullEndpoints now only returns V2 endpoints, we
no longer need to check if an endpoint is possibly V1.
Also rename some types that had "v2" in their name, now that we only support v2.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
un-exports errors that were only used internally:
- Remove ErrNoSupport as it was not emitted anywhere
- ImageConfigPullError -> imageConfigPullError
- TranslatePullError() -> translatePullError()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some packages were using `logrus.Fatal()` in init functions (which logs the error,
and (by default) calls `os.Exit(1)` after logging).
Given that logrus formatting and outputs have not yet been configured during the
initialization stage, it does not provide much benefits over a plain `panic()`.
This patch replaces some instances of `logrus.Fatal()` with `panic()`, which has
the added benefits of not introducing logrus as a dependency in some of these
packages, and also produces a stacktrace, which could help locating the problem
in the unlikely event an `init()` fails.
Before this change, an error would look like:
$ dockerd
FATA[0000] something bad happened
After this change, the same error looks like:
$ dockerd
panic: something bad happened
goroutine 1 [running]:
github.com/docker/docker/daemon/logger/awslogs.init.0()
/go/src/github.com/docker/docker/daemon/logger/awslogs/cloudwatchlogs.go:128 +0x89
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was only validating that "an" error occurred, but failed
to check if the error was for the expected reason.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some checks combined all possible comparisons in a single "assert",
making it hard to see in the output what failed (output, error?)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Re-order some test-cases to make it easier to find if we cover all variants,
and add some missing variants.
Also change tests to not use default ports where needed, so that we are sure
the code is taking the provided value, and didn't fall back to use the defaults.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Current versions of Go no longer have a problem with the trailing
colon when using url.Parse() or net.SplitHostPort(), so we can remove
this workaround.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 683766613a, to workaround
changes in error between go 1.12.8 / go 1.11.13, causing the test to fail.
We no longer test against those versions, so we can remove this workaround.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Containers can have a default stop-signal (`--stop-signal` / `STOPSIGNAL`) and
timeout (`--stop-timeout`). It is currently not possible to update either of
these after the container is created (`docker update` does not allow updating
them), and while either of these can be overridden through some commands, we
currently do not have a command that can override *both*:
command | stop-signal | stop-timeout | notes
----------------|-------------|--------------|----------------------------
docker kill | yes | DNA | only sends a single signal
docker restart | no | yes |
docker stop | no | yes |
As a result, if a user wants to stop a container with a custom signal and
timeout, the only option is to do this manually:
docker kill -s <custom signal> mycontainer
# wait <desired timeout>
# press ^C to cancel the graceful stop
# forcibly kill the container
docker kill mycontainer
This patch adds a new `signal` query parameter to the container "stop" and
"restart" endpoints. This parameter can be added as a new flag on the CLI,
which would allow stopping and restarting with a custom timeout and signal,
for example:
docker stop --signal=SIGWINCH --time=120 mycontainer
docker restart --signal=SIGWINCH --time=120 mycontainer
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This avoids having to determine what the default is in various
parts of the code. If no custom timeout is passed (nil), the
default will be used.
Also remove the named return variable from cleanupContainer(),
as it wasn't used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We already have this config, so might as well pass it, instead of passing
each option as a separate argument.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- daemon/delete: rename var that collided with import, remove output var
- daemon: fix inconsistent receiver name and package aliases
- daemon/stop: rename imports and variables to standard naming
This is in preparation of some changes, but keeping it in a
separate commit to make review of other changes easier.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While looking at this code, I noticed that we were wasting quite some resources
by first constructing a string, only to split it again (with `strings.Fields()`)
into a string slice.
Some conversions were also happening multiple times (int to string, IP-address to
string, etc.)
Setting up networking is known to be costing a considerable amount of time when
starting containers, and while this may only be a small part of that, it doesn't
hurt to save some resources (and readability of the code isn't significantly
impacted).
For example, benchmarking the `redirector()` code before/after:
BenchmarkParseOld-4 137646 8398 ns/op 4192 B/op 75 allocs/op
BenchmarkParseNew-4 629395 1762 ns/op 2362 B/op 24 allocs/op
Average over 10 runs:
benchstat old.txt new.txt
name old time/op new time/op delta
Parse-4 8.43µs ± 2% 1.79µs ± 3% -78.76% (p=0.000 n=9+8)
name old alloc/op new alloc/op delta
Parse-4 4.19kB ± 0% 2.36kB ± 0% -43.65% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
Parse-4 75.0 ± 0% 24.0 ± 0% -68.00% (p=0.000 n=10+10)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Just a small clean-up (there's more endpoints to do this for, but
I was working on changes in this area on the CLI when I noticed we
were setting this query-parameter unconditionally.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
ContainerConfig is used in multiple locations (for example, both for
Image.Config and Image.ContainerConfig). Unfortunately, swagger does
not allow documenting individual uses if a type is used; for this type,
the content is _optional_ when used as Image.ContainerConfig (which is
set by the classic builder, which does a "commit" of a container, but
not used when building an image with BuildKit).
This patch attempts to address this confusion by documenting that
"it may be empty (or fields not propagated) if it's used for the
Image.ContainerConfig field".
Perhaps alternatives are possible (aliasing the type?) but we can
look at those in a follow-up.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the call to cleanupServiceBindings during network deletion
conditional on Windows (where it is required), thereby providing a
performance improvement to network cleanup on Linux.
Signed-off-by: Trapier Marshall <tmarshall@mirantis.com>
Removal of PolicyLists from Windows VFP must be performed prior to
removing the HNS network. Otherwise PolicyList removal fails with
HNS error "network not found".
Signed-off-by: Trapier Marshall <tmarshall@mirantis.com>
ContainerConfig is used in multiple locations (for example, both for
Image.Config and Image.ContainerConfig). Unfortunately, swagger does
not allow documenting individual uses if a type is used; for this type,
the content is _optional_ when used as Image.ContainerConfig (which is
set by the classic builder, which does a "commit" of a container, but
not used when building an image with BuildKit).
This patch attempts to address this confusion by documenting that
"it may be empty (or fields not propagated) if it's used for the
Image.ContainerConfig field".
Perhaps alternatives are possible (aliasing the type?) but we can
look at those in a follow-up.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 93c3e6c91e, at which time only
some basic handling of non-succesful status codes was present;
93c3e6c91e/api/client/utils.go (L112-L121)
Given that since 38e6d474af non-successful status-
codes are already handled, and a 204 ("no content") status should not be an error,
this special case should no longer be needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Having to declare a package-scope variable and separately initialize it
is repetitive and error-prone. Refactor so that each metric is defined
and initialized in the same statement.
Signed-off-by: Cory Snider <csnider@mirantis.com>
git published an advisory Yesterday, which (as a counter-measure)
requires the git repository's directory to be owned by the current
user, and otherwise produce an error:
fatal: unsafe repository ('/workspace' is owned by someone else)
To add an exception for this directory, call:
git config --global --add safe.directory /workspace
The DCO check is run within a container, which is running as `root`
(to allow packages to be installed), but because of this, the user
does not match the files that are bind-mounted from the host (as they
are checked out by Jenkins, using a different user).
To work around this issue, this patch configures git to consider the
`/workspace` directory as "safe". We configure it in the `--system`
configuration so that it takes effect for "all users" inside the
container.
More details on the advisory can be found on GitHub's blog:
https://github.blog/2022-04-12-git-security-vulnerability-announced/
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Simplify some of the logic, and add documentation about the package,
as well as warnings that this package should not be used as a general-
purpose utility.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/urlutil (despite its poorly chosen name) is not really intended as a generic
utility to handle URLs, and should only be used by the builder to handle (remote)
build contexts.
- IsURL() only does a very rudimentary check for http(s):// prefixes, without any
other validation, but due to its name may give incorrect expectations.
- IsGitURL() is written specifically with docker build remote git contexts in
mind, and has handling for backward-compatibility, where strings that are
not URLs, but start with "github.com/" are accepted.
Because of the above, this patch:
- moves the package inside builder/remotecontext, close to where it's intended
to be used (ideally this would be part of build/remotecontext itself, but this
package imports many other dependencies, which would introduce those as extra
dependencies in the CLI).
- deprecates pkg/urlutil, but adds aliases as there are some external consumers.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implement similar logic as is used in httputils.ReadJSON(). Before
this patch, endpoints using the ContainerDecoder would incorrectly
return a 500 (internal server error) status.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implement a ReadJSON() utility to help reduce some code-duplication,
and to make sure we handle JSON requests consistently (e.g. always
check for the content-type).
Differences compared to current handling:
- prevent possible panic if request.Body is nil ("should never happen")
- always require Content-Type to be "application/json"
- be stricter about additional content after JSON (previously ignored)
- but, allow the body to be empty (an empty body is not invalid);
update TestContainerInvalidJSON accordingly, which was testing the
wrong expectation.
- close body after reading (some code did this)
We should consider to add a "max body size" on this function, similar to
7b9275c0da/api/server/middleware/debug.go (L27-L40)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/urlutil (despite its poorly chosen name) is not really intended as a generic
utility to handle URLs, and should only be used by the builder to handle (remote)
build contexts.
This patch:
- fix some cases where the host was ignored for valid addresses.
- removes a redundant use of urlutil.IsTransportURL(); instead adding code to
check if the given scheme (protocol) is supported.
- improve port validation for out of range ports.
- fix some missing validation: the driver was silently ignoring path elements,
but expected a host (not an URL)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
fix some missing validation: the driver was silently ignoring path elements
in some cases, and expecting a host (not an URL), and for unix sockets did
not validate if a path was specified.
For the latter case, we should have a fix in the upstream driver, as it
uses an empty path as default path for the socket (`defaultSocketPath`),
and performs no validation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/urlutil (despite its poorly chosen name) is not really intended as a generic
utility to handle URLs, and should only be used by the builder to handle (remote)
build contexts.
This patch:
- removes a redundant use of urlutil.IsTransportURL(); instead adding some code
to check if the given scheme (protocol) is supported.
- define a `defaultPort` const for the default port.
- use `net.JoinHostPort()` instead of string concatenating, to account for possible
issues with IPv6 addresses.
- renames a variable that collided with an imported package.
- improves test coverage, and moves an integration test.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/urlutil (despite its poorly chosen name) is not really intended as a generic
utility to handle URLs, and should only be used by the builder to handle (remote)
build contexts.
This patch removes the use of urlutil.IsURL(), in favor of just checking if the
provided scheme (protocol) is supported.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/urlutil (despite its poorly chosen name) is not really intended as a generic
utility to handle URLs, and should only be used by the builder to handle (remote)
build contexts.
This patch:
- removes a redundant use of urlutil.IsTransportURL(); code further below already
checked if the given scheme (protocol) was supported.
- renames some variables that collided with imported packages.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a follow-up to 427c7cc5f8, which added
proxy-configuration options ("http-proxy", "https-proxy", "no-proxy") to the
dockerd cli and in `daemon.json`.
While working on documentation changes for this feature, I realised that those
options won't be "next" to each-other when formatting the daemon.json JSON, for
example using `jq` (which sorts the fields alphabetically). As it's possible that
additional proxy configuration options are added in future, I considered that
grouping these options in a struct within the JSON may help setting these options,
as well as discovering related options.
This patch introduces a "proxies" field in the JSON, which includes the
"http-proxy", "https-proxy", "no-proxy" options.
Conflict detection continues to work as before; with this patch applied:
mkdir -p /etc/docker/
echo '{"proxies":{"http-proxy":"http-config", "https-proxy":"https-config", "no-proxy": "no-proxy-config"}}' > /etc/docker/daemon.json
dockerd --http-proxy=http-flag --https-proxy=https-flag --no-proxy=no-proxy-flag --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json:
the following directives are specified both as a flag and in the configuration file:
http-proxy: (from flag: http-flag, from file: http-config),
https-proxy: (from flag: https-flag, from file: https-config),
no-proxy: (from flag: no-proxy-flag, from file: no-proxy-config)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Support for overlay on a backing filesystem without d_type was deprecated in
0abb8dec3f (Docker 17.12), with an exception
for existing installations (0a4e793a3d).
That deprecation was nearly 5 years ago, and running without d_type is known to
cause serious issues (so users will likely already have run into other problems).
This patch removes support for running overlay and overlay2 on these filesystems,
returning the error instead of logging it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were changes I drafted when reviewing 7c731e02a9,
and had these stashed in my local git;
- rename receiver to prevent "unconsistent receiver name" warnings
- make NewRouter() slightly more idiomatic, and wrap the options,
to make them easier to read.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, the ppc64ls and s390x stages only ran on non-PR commits,
but the unit-tests and integration/xx tests could be enabled with
a checkbox.
This patch changes the Jenkinsfile to also allow the integration-cli
tests to be run on pull requests if the checkbox is enabled.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Updated the list of AUTHORS using the generate-authors.sh script.
Also updating the .mailmap file to prevent some duplicates, and
to include some updates from containerd, which had a more up-to-date
list of author's preferred e-mail addresses.
Signed-off-by: Gabriel Goller <gabrielgoller123@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The config.Validate() function did not validate hosts that were configured in
the daemon.json configuration file, resulting in `--validate` to pass, but the
daemon failing to start.
before this patch:
echo '{"hosts":["127.0.0.1:2375/path"]}' > /etc/docker/daemon.json
dockerd --validate
configuration OK
dockerd
INFO[2022-04-03T11:42:22.162366200Z] Starting up
failed to load listeners: error parsing -H 127.0.0.1:2375/path: invalid bind address (127.0.0.1:2375/path): should not contain a path element
with this patch:
echo '{"hosts":["127.0.0.1:2375/path"]}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: configuration validation from file failed: invalid bind address (127.0.0.1:2375/path): should not contain a path element
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The TestReloadDefaultConfigNotExist() test assumed it was running in a clean
environment, in which the `/etc/docker/daemon.json` file doesn't exist, and
would fail if that was not the case.
This patch updates the test to override the default location to a a non-existing
path, to allow running the test in an environment where `/etc/docker/daemon.json`
is present.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There was a discrepancy between what `ParseTCPAddr()` accepted, and what the
daemon was able to use, resulting in the daemon to start, but fail to create
listeners for the specified host.
Before this patch:
dockerd -H tcp://127.0.0.1:2375/
INFO[2022-04-03T10:18:06.417502600Z] Starting up
...
failed to load listeners: listen tcp: address tcp/2375/: unknown port
dockerd -H 127.0.0.1:2375/path
INFO[2022-04-03T10:18:06.417502600Z] Starting up
...
failed to load listeners: listen tcp: address tcp/5555/path: unknown port
After this patch:
dockerd -H tcp://127.0.0.1:2375/
Status: invalid argument "tcp://127.0.0.1:2375/" for "-H, --host" flag: invalid bind address (127.0.0.1:2375/): should not contain a path element
See 'dockerd --help'., Code: 125
dockerd -H 127.0.0.1:2375/path
Status: invalid argument "127.0.0.1:2375/path" for "-H, --host" flag: invalid bind address (127.0.0.1:2375/path): should not contain a path element
See 'dockerd --help'., Code: 125
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This feature requires experimental mode to be enabled, so mentioning that
in the flag description.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/docker/buildx/releases/tag/v0.8.2
Notable changes:
- Update Compose spec used by buildx bake to v1.2.1 to fix parsing ports definition
- Fix possible crash on handling progress streams from BuildKit v0.10
- Fix parsing groups in buildx bake when already loaded by a parent group
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
both -1 and 0 are accepted as "no limit", so don't send the
limit option if no limit was set. For simplicity, we're ignoring
values <= 0.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This uses the correct comparison with compatibility
checks for variants.
The deprecated arm variant matcher is left as is.
Although it is not needed for valid cases it is not
fully compatible as also matches some invalid
combinations, so should be removed separately.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
The test was dependent on its container being _first_ in the response,
but anywhere on the line should be fine.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Arbitrary here does not include '', best to catch that one early as it's
almost certainly a mistake (possibly an attempt to pass a POSIX path
through this API)
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Since this function is about to get more complicated, and change
behaviour, this establishes tests for the existing implementation.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
This adds an additional "Swarm" header to the _ping endpoint response,
which allows a client to detect if Swarm is enabled on the daemon, without
having to call additional endpoints.
This change is not versioned in the API, and will be returned irregardless
of the API version that is used. Clients should fall back to using other
endpoints to get this information if the header is not present.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The registry package contained code to automatically set the CertsDir() path,
based on wether or not the daemon was running in rootlessmode. In doing so,
it made use of the `pkg/rootless.RunningWithRootlessKit()` utility.
A recent change in de6732a403 added additional
functionality in the `pkg/rootless` package, introducing a dependency on
`github.com/rootless-containers/rootlesskit`. Unfortunately, the extra
dependency also made its way into the docker cli, which also uses the
registry package.
This patch introduces a new `SetCertsDir()` function, which allows
the default certs-directory to be overridden, and updates the daemon
to configure this location during startup.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, we only printed a warning if a storage driver was deprecated. The
intent was to continue supporting these drivers, to allow users to migrate
to a different storage driver.
This patch changes the behavior; if the user has no storage driver specified
in the daemon configuration (so if we try to detect the previous storage
driver based on what's present in /var/lib/docker), we now produce an error,
informing the user that the storage driver is deprecated (and to be removed),
as well as instructing them to change the daemon configuration to explicitly
select the storage driver (to allow them to migrate).
This should make the deprecation more visible; this will be disruptive, but
it's better to have the failure happening *now* (while the drivers are still
there), than for users to discover the storage driver is no longer there
(which would require them to *downgrade* the daemon in order to migrate
to a different driver).
With this change, `docker info` includes a link in the warnings that:
/ # docker info
Client:
Context: default
Debug Mode: false
Server:
...
Live Restore Enabled: false
WARNING: The overlay storage-driver is deprecated, and will be removed in a future release.
Refer to the documentation for more information: https://docs.docker.com/go/storage-driver/
When starting the daemon without a storage driver configured explicitly, but
previous state was using a deprecated driver, the error is both logged and
printed:
...
ERRO[2022-03-25T14:14:06.032014013Z] [graphdriver] prior storage driver overlay is deprecated and will be removed in a future release; update the the daemon configuration and explicitly choose this storage driver to continue using it; visit https://docs.docker.com/go/storage-driver/ for more information
...
failed to start daemon: error initializing graphdriver: prior storage driver overlay is deprecated and will be removed in a future release; update the the daemon configuration and explicitly choose this storage driver to continue using it; visit https://docs.docker.com/go/storage-driver/ for more information
When starting the daemon and explicitly configuring it with a deprecated storage
driver:
WARN[2022-03-25T14:15:59.042335412Z] [graphdriver] WARNING: the overlay storage-driver is deprecated and will be removed in a future release; visit https://docs.docker.com/go/storage-driver/ for more information
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use pkg/errors for errors and fix error-capitalisation
- remove one redundant call to logDeprecatedWarning() (we're already skipping
deprecated drivers in that loop).
- rename `list` to `priorityList` for readability.
- remove redundant "skip" for the vfs storage driver, as it's already
excluded by `scanPriorDrivers()`
- change one debug log to an "info", so that the daemon logs contain the driver
that was configured, and include "multiple prior states found" error in the
daemon logs, to assist in debugging failed daemon starts.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There is a race condition between the local proxy and iptables rule
setting. When we have a lot of UDP traffic, the kernel will create
conntrack entries to the local proxy and will ignore the iptables
rules set after that.
Related to PR #32505. Fix#8795.
Signed-off-by: Vincent Bernat <vincent@bernat.ch>
Operations performed on overlay network sandboxes are handled by
dispatching operations send through a channel. This allows for
asynchronous operations to be performed which, since they are
not called from within another function, are able to operate in
an idempotent manner with a known/measurable starting state from
which an identical series of iterative actions can be performed.
However, it was possible in some cases for an operation dispatched
from this channel to write a message back to the channel in the
case of joining a network when a sufficient volume of sandboxes
were operated on.
A goroutine which is simultaneously reading and writing to an
unbuffered channel can deadlock if it sends a message to a channel
then waits for it to be consumed and completed, since the only
available goroutine is more or less "talking to itself". In order
to break this deadlock, in the observed race, a goroutine is now
created to send the message to the channel.
Signed-off-by: Martin Dojcak <martin.dojcak@lablabs.io>
Signed-off-by: Ryan Barry <rbarry@mirantis.com>
This reverts the changes made in 2a9c987e5a, which
moved the GetHTTPErrorStatusCode() utility to the errdefs package.
While it seemed to make sense at the time to have the errdefs package provide
conversion both from HTTP status codes errdefs and the reverse, a side-effect
of the move was that the errdefs package now had a dependency on various external
modules, to handle conversio of errors coming from those sub-systems, such as;
- github.com/containerd/containerd
- github.com/docker/distribution
- google.golang.org/grpc
This patch moves the conversion from (errdef-) errors to HTTP status-codes to a
api/server/httpstatus package, which is only used by the API server, and should
not be needed by client-code using the errdefs package.
The MakeErrorHandler() utility was moved to the API server itself, as that's the
only place it's used. While the same applies to the GetHTTPErrorStatusCode func,
I opted for keeping that in its own package for a slightly cleaner interface.
Why not move it into the api/server/httputils package?
The api/server/httputils package is also imported in the client package, which
uses the httputils.ParseForm() and httputils.HijackConnection() functions as
part of the TestTLSCloseWriter() test. While this is only used in tests, I
wanted to avoid introducing the indirect depdencencies outside of the api/server
code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The wrapResponseError() utility converted some specific errors, but in
doing so, could hide the actual error message returned by the daemon.
In addition, starting with 38e6d474af,
HTTP status codes were already mapped to their corresponding errdefs
types on the client-side, making this conversion redundant.
This patch removes the wrapResponseError() utility; it's worth noting
that some error-messages will change slightly (as they now return the
error as returned by the daemon), but may cointain more details as
before, and in some cases prevents hiding the actual error.
Before this change:
docker container rm nosuchcontainer
Error: No such container: nosuchcontainer
docker container cp mycontainer:/no/such/path .
Error: No such container:path: mycontainer:/no/such/path
docker container cp ./Dockerfile mycontainer:/no/such/path
Error: No such container:path: mycontainer:/no/such
docker image rm nosuchimage
Error: No such image: nosuchimage
docker network rm nosuchnetwork
Error: No such network: nosuchnetwork
docker volume rm nosuchvolume
Error: No such volume: nosuchvolume
docker plugin rm nosuchplugin
Error: No such plugin: nosuchplugin
docker checkpoint rm nosuchcontainer nosuchcheckpoint
Error response from daemon: No such container: nosuchcontainer
docker checkpoint rm mycontainer nosuchcheckpoint
Error response from daemon: checkpoint nosuchcheckpoint does not exist for container mycontainer
docker service rm nosuchservice
Error: No such service: nosuchservice
docker node rm nosuchnode
Error: No such node: nosuchnode
docker config rm nosuschconfig
Error: No such config: nosuschconfig
docker secret rm nosuchsecret
Error: No such secret: nosuchsecret
After this change:
docker container rm nosuchcontainer
Error response from daemon: No such container: nosuchcontainer
docker container cp mycontainer:/no/such/path .
Error response from daemon: Could not find the file /no/such/path in container mycontainer
docker container cp ./Dockerfile mycontainer:/no/such/path
Error response from daemon: Could not find the file /no/such in container mycontainer
docker image rm nosuchimage
Error response from daemon: No such image: nosuchimage:latest
docker network rm nosuchnetwork
Error response from daemon: network nosuchnetwork not found
docker volume rm nosuchvolume
Error response from daemon: get nosuchvolume: no such volume
docker plugin rm nosuchplugin
Error response from daemon: plugin "nosuchplugin" not found
docker checkpoint rm nosuchcontainer nosuchcheckpoint
Error response from daemon: No such container: nosuchcontainer
docker checkpoint rm mycontainer nosuchcheckpoint
Error response from daemon: checkpoint nosuchcheckpoint does not exist for container mycontainer
docker service rm nosuchservice
Error response from daemon: service nosuchservice not found
docker node rm nosuchnode
Error response from daemon: node nosuchnode not found
docker config rm nosuchconfig
Error response from daemon: config nosuchconfig not found
docker secret rm nosuchsecret
Error response from daemon: secret nosuchsecret not found
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
All other endpoints handle this in the API; given that the JSON format for
filters is part of the API, it makes sense to handle it there, and not have
that concept leak into further down the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move the default to the service itself, and produce the correct status code
if an invalid limit was specified. The default is currently set both on the
cli and on the daemon side, and it should be only set on one of them.
There is a slight change in behavior; previously, searching with `--limit=0`
would produce an error, but with this change, it's considered the equivalent
of "no limit set" (and using the default).
We could keep the old behavior by passing a pointer (`nil` means "not set"),
but I left that for a follow-up exercise (we may want to pass an actual
config instead of separate arguments, as well as some other things that need
cleaning up).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The intent of this function is to return a copy of the service's configuration,
and to copy / dereference the options in its configuration.
The code was doing this in slightly complicated fashion. This patch;
- adds a `copy()` function to serviceConfig
- rewrites the code to use a slightly more idiomatic approach, using one of
the approaches described in "golang SliceTricks" https://github.com/golang/go/wiki/SliceTricks#copy
- changes defaultService.ServiceConfig() to use this function, and updates
its godoc to better describe that it returns a copy.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes the ugly hack where we stored the current config, tried to
reconfigure the service, and rolled back to the stored copy on failures.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Most operations only require read access, so change this to use an RWMutex,
and some minor refactoring in lookupV2Endpoints() so that we are not
constructing tlsconfig multiple times in some cases.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- registry: newIndexInfo(): minor refactor
- registry: loadAllowNondistributableArtifacts() minor refactor
initialise the slices with a length.
- registry: defaultService.Search(): minor refactor
Perform all manipulation earlier, so that it's not needed to scroll up
to learn what's done.
- various other minor cleanups
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This error was only returned in a single location, and not used anywhere
as a specific type.
The error returned by `validateNoScheme()` also appeared to only be used in
one case; in all other cases, the error itself was ignored, and replaced with
a custom error. Because of this, this patch also replace `validateNoScheme()`
with a `hasScheme()` function that returns a boolean, to better match how it's
used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Un-export:
- config.LoadAllowNondistributableArtifacts()
- config.LoadInsecureRegistries()
- config.LoadMirrors()
The config type is already un-exported; this also un-exports these functions
to be explicit they're internal only.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These are only used internally, and the v1Endpoint.Path() function was only
used to get the `_ping` URL, so let's inline that code instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The DefaultService was not really meant to be used outside of the package, so
un-export it, and change NewService()'s signature to return a Service interface.
To un-export this type, a test in daemon/images was updated to not use DefaultService,
but now using the registry.Service interface itself.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While this was intended t be a stop-gap solution, it's been there for years and
users depend on this. It's also still complicated to secure _localhost_, so
by now, we'd probably have to be realistic, and consider this to be "permanent".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 5770296d90...3147a52a75
This version contains a fix for CVE-2022-27191 (not sure if it affects us).
From the golang mailing list:
Hello gophers,
Version v0.0.0-20220315160706-3147a52a75dd of golang.org/x/crypto/ssh implements
client authentication support for signature algorithms based on SHA-2 for use with
existing RSA keys.
Previously, a client would fail to authenticate with RSA keys to servers that
reject signature algorithms based on SHA-1. This includes OpenSSH 8.8 by default
and—starting today March 15, 2022 for recently uploaded keys.
We are providing this announcement as the error (“ssh: unable to authenticate”)
might otherwise be difficult to troubleshoot.
Version v0.0.0-20220314234659-1baeb1ce4c0b (included in the version above) also
fixes a potential security issue where an attacker could cause a crash in a
golang.org/x/crypto/ssh server under these conditions:
- The server has been configured by passing a Signer to ServerConfig.AddHostKey.
- The Signer passed to AddHostKey does not also implement AlgorithmSigner.
- The Signer passed to AddHostKey does return a key of type “ssh-rsa” from its PublicKey method.
Servers that only use Signer implementations provided by the ssh package are
unaffected. This is CVE-2022-27191.
Alla prossima,
Filippo for the Go Security team
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was only used in a single place, and pkg/parsers/operatingsystem
already copied the `verNTWorkstation` const, so we might as well move this function
there as well to "unclutter" pkg/system.
The function had no external users, so not adding an alias / stub.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This fixes the "deprecated" comment to have the correct format to be picked
up by editors, and adds `omitempty` labels for KernelMemory and KernelMemoryTCP.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Omit `KernelMemory` and `KernelMemoryTCP` fields in `/info` response if they're
not supported, or when using API v1.42 or up.
- Re-enable detection of `KernelMemory` (as it's still needed for older API versions)
- Remove warning about kernel memory TCP in daemon logs (a warning is still returned
by the `/info` endpoint, but we can consider removing that).
- Prevent incorrect "Minimum kernel memory limit allowed" error if the value was
reset because it's not supported by the host.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove KernelMemory option from `v1.42` api docs
- remove KernelMemory warning on `/info`
- update changes for `v1.42`
- remove `KernelMemory` field from endpoints docs
Signed-off-by: aiordache <anca.iordache@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These consts were deprecated in 46c591b045, and
although that has not been in a release yet (we usually deprecate for at least
one release before removing), doing a search showed that there were no external
consumers of these consts, so it should be fine to remove them.
This patch removes the consts that were moded to pkg/idtools;
- SeTakeOwnershipPrivilege
- ContainerAdministratorSidString
- ContainerUserSidString
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function is marked deprecated in Go 1.18; however, the suggested replacement
brings in a large amount of new code, and most strings we generate will be ASCII,
so this would only be in case it's used for some user-provided string. We also
don't have a language to use, so would be using the "default".
Adding a `//nolint` comment to suppress the linting failure instead.
daemon/logger/templates/templates.go:23:14: SA1019: strings.Title is deprecated: The rule Title uses for word boundaries does not handle Unicode punctuation properly. Use golang.org/x/text/cases instead. (staticcheck)
"title": strings.Title,
^
pkg/plugins/pluginrpc-gen/template.go:67:9: SA1019: strings.Title is deprecated: The rule Title uses for word boundaries does not handle Unicode punctuation properly. Use golang.org/x/text/cases instead. (staticcheck)
return strings.Title(s)
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The API defines a default limit for searches, but when removing the
default from the cli, the client still sends "0" as a limit, which
is not allowed by existing versions of the API:
docker search --limit=0 busybox
Error response from daemon: Limit 0 is outside the range of [1, 100]
This patch changes the client so that no limit is sent if none was set ("0"),
allowing the daemon to use its (or the registry's) default.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Finish the refactor which was partially completed with commit
34536c498d, passing around IdentityMapping structs instead of pairs of
[]IDMap slices.
Existing code which uses []IDMap relies on zero-valued fields to be
valid, empty mappings. So in order to successfully finish the
refactoring without introducing bugs, their replacement therefore also
needs to have a useful zero value which represents an empty mapping.
Change IdentityMapping to be a pass-by-value type so that there are no
nil pointers to worry about.
The functionality provided by the deprecated NewIDMappingsFromMaps
function is required by unit tests to to construct arbitrary
IdentityMapping values. And the daemon will always need to access the
mappings to pass them to the Linux kernel. Accommodate these use cases
by exporting the struct fields instead. BuildKit currently depends on
the UIDs and GIDs methods so we cannot get rid of them yet.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This removes the `setOS()` / `getOS()` functions from the layer store, which were
added in fc21bf280b and 0380fbff37
in support of LCOW.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This defines the interface that the package expects in order to lookup
pull endpoints, instead of requiring the whole registry.Service interface.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It wrapped the regular registry service, but the ResolveRepository() function
was not called anywhere.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Reimplement GetCgroupMounts using the github.com/containerd/cgroups and
github.com/moby/sys/mountinfo packages.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This removes the plugin section from the containerd configuration file
(`/var/run/docker/containerd/containerd.toml`) that is generated when
starting containerd as child process;
```toml
[plugins]
[plugins.linux]
shim = "containerd-shim"
runtime = "runc"
runtime_root = "/var/lib/docker/runc"
no_shim = false
shim_debug = true
```
This configuration doesn't appear to be used since commit:
0b14c2b67a, which switched the default runtime
to to io.containerd.runc.v2.
Note that containerd itself uses `containerd-shim` and `runc` as default
for `shim` and `runtime` v1, so omitting that configuration doesn't seem
to make a difference.
I'm slightly confused if any of the other options in this configuration were
actually used: for example, even though `runtime_root` was configured to be
`/var/lib/docker/runc`, when starting a container with that coniguration set
on docker 19.03, `/var/lib/docker/runc` doesn't appear to exist:
```console
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
098baa4cb0e7 nginx:alpine "/docker-entrypoint.…" 59 minutes ago Up 59 minutes 80/tcp foo
$ ls /var/lib/docker/runc
ls: /var/lib/docker/runc: No such file or directory
$ ps auxf
PID USER TIME COMMAND
1 root 0:00 sh
16 root 0:11 dockerd --debug
26 root 0:09 containerd --config /var/run/docker/containerd/containerd.toml --log-level debug
234 root 0:00 containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/09
251 root 0:00 nginx: master process nginx -g daemon off;
304 101 0:00 nginx: worker process
...
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds consts for the environment variables that are supported
by the client. These environment variables are unlikely to change,
or at least, unlikely to be removed, but having consts allows for
them to be documented.
I did not change all occurrences of these variables to use the const,
as they're used in various tests, and it's ok to use a fixture for
those, but it's nice to have a const available for (external) consumers
of the client package, and to have their purpose (and caveats)
documented in the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Improve documentation of various functions to better describe their behavior.
- Rename some variables to be more descriptive (as this is client code, used
by external consumers, it's nice to be a bit more explicit).
- Remove a redundant check in `WithVersionFromEnv()`, as `WithVersion()`
already checks for empty values.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test was setting the non-exported `Client.basePath` directly, however,
it was setting it to a value that would never realistically happen, because
`NewClientWithOpts()` initializes the Client with the default API version:
ea5b4765d9/client/client.go (L119-L130)
Which is used by `getAPIPath()` to construct the URL/path:
ea5b4765d9/client/client.go (L176-L190)
While this didn't render the test "invalid", using a Client that's constructed
in the usual way, makes it more representative.
Given that we deprecated (but still support) the non-versioned API paths, with
the exception of the `/_ping` API endpoint, we should probably change `getAPIPath()`
to default to the "current version", instead of allowing it to use an empty string.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- avoid accessing non-exported fields where possible, and test using accessors
instead, so that we're closer to how it's actually used.
- use a variable or const for "expected" in some tests, so that "expected" is
printed as part of the test-failure output (instead of just a "value").
- swap the order of "actual" and "expected" for consistency, and to make it
easier to see what the "expected" value is in some cases ("expected" on the
right, so that it reads `val (actual) != val (expected)`).
- don't set fields in the Ping response that are not relevant to the test.
- rename some variables for consistency.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Trying to build Docker images with buildkit using a ZFS-backed storage
was unreliable due to apparent race condition between adding and
removing layers to the storage (see: https://github.com/moby/buildkit/issues/1758).
The issue describes a similar problem with the BTRFS driver that was
resolved by adding additional locking based on the scheme used in the
OverlayFS driver. This commit replicates the scheme to the ZFS driver
which makes the problem as reported in the issue stop happening.
Signed-off-by: Tomasz Mańko <hi@jaen.me>
This package was deprecated in de56a90929, which
was part of the 20.10 release, so consumers of this package should've been
able to migrate to the new location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This package was deprecated in dc3c382b34, which
was part of the 20.10 release, so consumers of this package should've been
able to migrate to the new location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This package was deprecated in 5ca758199d, which
was part of the 20.10 release, so consumers of this package should've been
able to migrate to the new location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This package was deprecated in 41d4112e89, which
was part of the 20.10 release, so consumers of this package should've been
able to migrate to the new location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This package was deprecated in 99beb2ca02, which
was part of the 20.10 release, so consumers of this package should've been
able to migrate to the new location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This should help with CI being unstable when generating the types (due
to Go randomizing order). Unfortunately, the (file) names are a bit ugly,
but addressing that in a follow-up.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was used when Windows did not yet support regular images, and required
the base-image to pre-exist on the Windows machine (as those layers were not yet
allowed to be distributed).
Commit f342b27145 (docker 1.13.0, API v1.25) removed
usage of the field. The field was not documented in the API, but because it was not
removed from the Golang structs in the API, ended up in the API documentation when
we switched to using Swagger instead of plain MarkDown for the API docs.
Given that the field was never set in any of these API versions, and had an "omitempty",
it was never actually returned in a response, so should be fine to remove from these
API docs.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was used when Windows did not yet support regular images, and required
the base-image to pre-exist on the Windows machine (as those layers were not yet
allowed to be distributed).
Commit f342b27145 (docker 1.13.0, API v1.25) removed
usage of the field. The field was not documented in the API, but because it was not
removed from the Golang structs in the API, ended up in the API documentation when
we switched to using Swagger instead of plain MarkDown for the API docs.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field was used when Windows did not yet support regular images, and required
the base-image to pre-exist on the Windows machine (as those layers were not yet
allowed to be distributed).
Commit f342b27145 (docker 1.13.0, API v1.25) removed
usage of the field.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Microsoft has stopped updating the ProductName registry value in Windows
11; it reads as Windows 10. And Microsoft has made it very difficult to
look up the real product name programmatically so that applications do
not attempt to parse it. (Ever wonder why they skipped Windows 9?) The
only documented and supported mechanisms require WMI or WinRT. The
product name has no bearing on application compatibility so it is not
worth doing any heroics to display the correct name. The build number
and Update Build Revision is sufficient information to identify a
specific build of Windows. Stop displaying the ProductName so as not to
confuse users with incorrect information.
Microsoft has frozen the ReleaseId registry value at 2009 when they
switched to semi-annual releases and alpha-numeric versions. The release
version as displayed by winver.exe and Settings -> System -> About on
Windows 20H2 and newer can be found in the new DisplayVersion registry
value. Replicate the way winver.exe displays the version by
preferentially reporting the DisplayVersion if present and reporting if
it is a Windows Server edition.
Signed-off-by: Cory Snider <csnider@mirantis.com>
I think this was there for historic reasons (may have been goimports expected
this, and we used to have a linter that wanted it), but it's not needed, so
let's remove it (to make my IDE less complaining about unneeded aliases).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- rename definition in swagger from `Image` to `ImageInspect` to match the go type
- improve (or add) documentation for various fields
- move example values in-line in the "definitions" section
- remove the `required` fields from `ImageInspect`, as the type is only used as
response type (not to make requests).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- rename definition in swagger from `Image` to `ImageInspect` to match the go type
- improve (or add) documentation for various fields
- move example values in-line in the "definitions" section
- remove the `required` fields from `ImageInspect`, as the type is only used as
response type (not to make requests).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The log message's timestamp was being read after it was returned to the
pool. By coincidence the timestamp field happened to not be zeroed on
reset so much of the time things would work as expected. But if the
message value was to be taken back out of the pool before WriteLogEntry
returned, the timestamp recorded in the gzip header of compressed
rotated log files would be incorrect.
Make future use-after-put bugs fail fast by zeroing all fields of the
Message value, including the timestamp, when it is put into the pool.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Perform the validation when the daemon starts instead of performing these
validations for each individual container, so that we can fail early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is not "very" important, but this dependency was only used
for a single const, which could be satisfied with a comment.
Not very urgent, as github.com/docker/go-units is likely imported
through other ways already (but it's nice to have the package be
more isolated).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/system historically has been a bit of a kitchen-sink of things that were
somewhat "system" related, but didn't have a good place for. EnsureRemoveAll()
is one of those utilities. EnsureRemoveAll() is used to both unmount and remove
a path, for which it depends on both github.com/moby/sys/mount, which in turn
depends on github.com/moby/sys/mountinfo.
pkg/system is imported in the CLI, but neither EnsureRemoveAll(), nor any of its
moby/sys dependencies are used on the client side, so let's move this function
somewhere else, to remove those dependencies from the CLI.
I looked for plausible locations that were related; it's used in:
- daemon
- daemon/graphdriver/XXX/
- plugin
I considered moving it into a (e.g.) "utils" package within graphdriver (but not
a huge fan of "utils" packages), and given that it felt (mostly) related to
cleaning up container filesystems, I decided to move it there.
Some things to follow-up on after this:
- Verify if this function is still needed (it feels a bit like a big hammer in
a "YOLO, let's try some things just in case it fails")
- Perhaps it should be integrated in `containerfs.Remove()` (so that it's used
automatically)
- Look if there's other implementations (and if they should be consolidated),
although (e.g.) the one in containerd is a copy of ours:
https://github.com/containerd/containerd/blob/v1.5.9/pkg/cri/server/helpers_linux.go#L200
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it more inline with other data we collect, and can be used to make
some info optional at some point.
fillDebugInfo sets the current debugging state of the daemon, and additional
debugging information, such as the number of Go-routines, and file descriptors.
Note that this currently always collects the information, but the CLI only
prints it if the daemon has debug enabled. We should consider to either make
this information optional (cli to request "with debugging information"), or
only collect it if the daemon has debug enabled. For the CLI code, see
https://github.com/docker/cli/blob/v20.10.12/cli/command/system/info.go#L239-L244
Additional note: the CLI considers info.SystemTime debugging information. This
felt a bit "odd" (daemon time could be useful for standard use), so I left this
out of this function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These are used internally only, and set by daemon.NewDaemon(). If they're
used externally, we should add an accessor added (which may be something
we want to do for daemon.registryService (which should be its own backend)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it more inline with other data we collect, and can be used to
make some info optional at some point.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: db3c7e526a...2eb08e3e57
- Add support for detecting netns for all possible QoS in Kubernetes
- Add go1.10 build constraint
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/gotestyourself/gotest.tools/compare/v3.0.3...v3.1.0
noteworthy changes:
- ci: add go1.16
- ci: add go1.17, remove go1.13
- golden: only create dir if update flag is set
- icmd: replace all usages of os/exec with golang.org/x/sys/execabs
- assert: ErrorIs
- fs: add DirFromPath
- Stop creating directory outside of testdata
- fs: Fix comparing symlink permissions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was abstracting things a bit too much; the layerStore had a
exported `.Get()` which called `.getWithoutLock()`, but also a non-exported
`.get()`, which also called `.getWithoutLock()`.
While it's common to have a non-exported variant (without locking), the naming
of `.get()` could easily be confused for that variant (which it wasn't).
All locations where `.get()` was called were already handling locks for
`releaseLayer()`, so moving the actual locking inline for `.get()` makes it
more visible where locking happens.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is more in line with other consts that are used for defaults, and makes it
slightly easier to consume than DefaultV2Registry, e.g. see:
https://github.com/oras-project/oras-go/blob/v1.1.0/pkg/auth/docker/resolver.go#L81-L84
Note that both the "index.docker.io" and "registry-1.docker.io" domains
are here for historic reasons and backward-compatibility. These domains
are still supported by Docker Hub (and will continue to be supported), but
there are new domains already in use, and plans to consolidate all legacy
domains to new "canonical" domains. Once those domains are decided on, we
should update these consts (but making sure to preserve compatibility with
existing installs, clients, and user configuration).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/continuity/compare/v0.1.0...v0.2.2
- fs/stat: add FreeBSD, and cleanup some nolint-comments
- go.mod: bazil.org/fuse v0.0.0-20200407214033-5883e5a4b5125
- Fix darwin issues
- Remove direct dependency on github.com/pkg/errors
- Do not log errors before returning them
- Build containerd/continuity on multiple Unix OSes
- Update CI Go version to 1.17
- fs: use syscall.Timespec.Unix
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
From the field's description [1]:
DualStack previously enabled RFC 6555 Fast Fallback
support, also known as "Happy Eyeballs", in which IPv4 is
tried soon if IPv6 appears to be misconfigured and
hanging.
Deprecated: Fast Fallback is enabled by default. To
disable, set FallbackDelay to a negative value.
This field was deprecated in efc185029b,
which is included in Go 1.12beta1 and up.
[1]: 2ebe77a2fd/src/net/dial.go (L54-L61)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While this feature is deprecated / unsupported on cgroups v2, it's
part of the API, so let's at least document it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use sub-tests so that the iterations can run in parallel (instead of
sequential), and to make failures show up for the iteration that they're
part of.
Note that changing to subtests means that we'll always run 3 iterations of
the test, and no longer fail early (but the test still fails if any of
those iterations fails.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test has become quite flaky on Windows / Windows with Containerd.
Looking at the test, I noticed that it's running a test three times (according
to the comment "as it failed ~ 50% of the time"). However;
- it uses the `--rm` option to clean up the container after it terminated
- it uses a fixed name for the containers that are started
I had a quick look at the issue that it was created for, and neither of those
options were mentioned in the reported bug (so are just part of the test setup).
I think the test was written when the `--rm` option was still client-side, in which
case the cli would not terminate until it removed the container (making the
remove synchronous). Current versions of docker have moved the `--rm` to the
daemon side, and (if I'm not mistaken) performed asynchronous, and therefore could
potentially cause a conflicting name.
This patch:
- removes the fixed name (the test doesn't require the container to have a
specific name, so we can just use a random name)
- adds logs to capture the stderr and stdout output of the run (so that we're
able to capture failure messages).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I had to check what the actual size was, so added it to the const's documentation.
While at it, also made use of it in a test, so that we're testing against the expected
value, and changed one alias to be consistent with other places where we alias this
import.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Windows Server 2016 (RS1) reached end of support, and Docker Desktop requires
Windows 10 V19H2 (version 1909, build 18363) as a minimum.
This patch makes Windows Server RS5 / ltsc2019 (build 17763) the minimum version
to run the daemon, and removes some hacks for older versions of Windows.
There is one check remaining that checks for Windows RS3 for a workaround
on older versions, but recent changes in Windows seemed to have regressed
on the same issue, so I kept that code for now to check if we may need that
workaround (again);
085c6a98d5/daemon/graphdriver/windows/windows.go (L319-L341)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The recently-upgraded gosec linter has a rule for archive extraction
code which may be vulnerable to directory traversal attacks, a.k.a. Zip
Slip. Gosec's detection is unfortunately prone to false positives,
however: it flags any filepath.Join call with an argument derived from a
tar.Header value, irrespective of whether the resultant path is used for
filesystem operations or if directory traversal attacks are guarded
against.
All of the lint errors reported by gosec appear to be false positives.
Signed-off-by: Cory Snider <csnider@mirantis.com>
A copy of Go's archive/tar packge was vendored with a patch applied to
mitigate CVE-2019-14271. Vendoring standard library packages is not
supported by Go in module-aware mode, which is getting in the way of
maintenance. A different approach to mitigate the vulnerability is
needed which does not involve vendoring parts of the standard library.
glibc implements name service lookups such as users, groups and DNS
using a scheme known as Name Service Switch. The services are
implemented as modules, shared libraries which glibc dynamically links
into the process the first time a function requiring the module is
called. This is the crux of the vulnerability: if a process linked
against glibc chroots, then calls one of the functions implemented with
NSS for the first time, glibc may load NSS modules out of the chrooted
filesystem.
The API underlying the `docker cp` command is implemented by forking a
new process which chroots into the container's rootfs and writes a tar
stream of files from the container over standard output. It utilizes the
Go standard library's archive/tar package to write the tar stream. It
makes use of the tar.FileInfoHeader function to construct a tar.Header
value from an fs.FileInfo value. In modern versions of Go on *nix
platforms, FileInfoHeader will attempt to resolve the file's UID and GID
to their respective user and group names by calling the os/user
functions LookupId and LookupGroupId. The cgo implementation of os/user
on *nix performs lookups by calling the corresponding libc functions. So
when linked against glibc, calls to tar.FileInfoHeader after the
process has chrooted into the container's rootfs can have the side
effect of loading NSS modules from the container! Without any
mitigations, a malicious container image author can trivially get
arbitrary code execution by leveraging this vulnerability and escape the
chroot (which is not a sandbox) into the host.
Mitigate the vulnerability without patching or forking archive/tar by
hiding the OS-dependent file info from tar.FileInfoHeader which it needs
to perform the lookups. Without that information available it falls back
to populating the tar.Header with only the information obtainable
directly from the FileInfo value without making any calls into os/user.
Fixes#42402
Signed-off-by: Cory Snider <csnider@mirantis.com>
All uses of this interface already accept a DownloadDescriptor; keeping the
interface small to allow this functionality to be used by other download-descriptors,
while still being able to check for the actual functionality (to be able to register
a digest).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is mostly for documentation purposes; defining a type makes
the option(s) show up grouped on pkg.go.dev (and in godoc).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `TransferManager` interface only had a single implementation, and neither
`LayerDownloadManager`, nor `LayerUploadManager` currently had an option to
provide a custom implementation, so we can un-export this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes the function a bit more idiomatic, and leaves it to the caller to
decide wether or not the error can be ignored.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
All regular, non-EOL Linux distros now come with more recent kernels
out of the box. There may still be users trying to run on kernel 3.10
or older (some embedded systems, e.g.), but those should be a rare
exception, which we don't have to take into account.
This patch removes the kernel version check on Linux, and the corresponding
DOCKER_NOWARN_KERNEL_VERSION environment that was there to skip this
check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was originally added to solve a CLI usability issue related to `docker-machine` and `docker` (explicitly *not* `dockerd`) -- the "docker/cli" repository has a separate copy of this entire `opts` package, so isn't even using this implementation.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
Includes security fixes for crypto/elliptic (CVE-2022-23806), math/big (CVE-2022-23772),
and cmd/go (CVE-2022-23773).
go1.17.7 (released 2022-02-10) includes security fixes to the crypto/elliptic,
math/big packages and to the go command, as well as bug fixes to the compiler,
linker, runtime, the go command, and the debug/macho, debug/pe, and net/http/httptest
packages. See the Go 1.17.7 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.17.7+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.17.6...go1.17.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/fsnotify/fsnotify/compare/v1.4.9...v1.5.1
Relevant changes:
- Fix unsafe pointer conversion
- Drop support/testing for Go 1.11 and earlier
- Update x/sys to latest
- add //go:build lines
- add go 1.17 to test matrix
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the replace rule, and use the version as specified by (indirect) dependencies:
full diff: bf48bf16ab...f6687ab280
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The endpoint was silently ignoring invalid values for the "condition" parameter.
This patch now returns a 400 status if an unknown, non-empty "condition" is passed.
With this patch:
curl --unix-socket /var/run/docker.sock -XPOST 'http://localhost/v1.41/containers/foo/wait?condition=foobar'
{"message":"invalid condition: \"foobar\""}
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The client would always send a value, even if no `condition` was set;
Calling POST /v1.41/containers/foo/wait?condition=
This patch changes the client to not send the parameter if it's empty (and the
API default value should be used).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This patch updates the swagger, and:
- adds an enum definition to document valid values (instead of describing them)
- updates the description to mention both "omitted" and "empty" values (although
the former is already implicitly covered by the field being "optional" and
having a default value).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The /containers/{id}/wait can return a 400 (invalid argument) error if
httputils.ParseForm() fails.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This patch updates the swagger, and:
- adds an enum definition to document valid values (instead of describing them)
- updates the description to mention both "omitted" and "empty" values (although
the former is already implicitly covered by the field being "optional" and
having a default value).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Looks like the `replace` rule was also matching what we're already vendoring,
so we can remove it:
github.com/containerd/containerd v1.5.8 => github.com/containerd/containerd v1.5.8
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the replace rule, and use the version as specified by (indirect) dependencies:
full diff: e18ecbb051...69e39bad7d
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the replace rule, and use the version as specified by (indirect) dependencies:
full diff: 3af7569d3a...f0f3c7e86c
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Linux kernel never sets the Inheritable capability flag to anything
other than empty. Moby should have the same behavior, and leave it to
userspace code within the container to set a non-empty value if desired.
Reported-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: Samuel Karp <skarp@amazon.com>
Looks like this may be needed for Go 1.18
Also updating the golangci-lint configuration to account for updated
exclusion rules.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I think the original intent here was to make passing t optional (62a856e912),
but it looks like that's not done anywhere, so let's remove it.
integration-cli/docker_utils_test.go:81:2: SA5011: possible nil pointer dereference (staticcheck)
c.Helper()
^
integration-cli/docker_utils_test.go:84:5: SA5011(related information): this check suggests that the pointer can be nil (staticcheck)
if c != nil {
^
integration-cli/docker_utils_test.go:106:2: SA5011: possible nil pointer dereference (staticcheck)
c.Helper()
^
integration-cli/docker_utils_test.go:108:5: SA5011(related information): this check suggests that the pointer can be nil (staticcheck)
if c != nil {
^
integration-cli/docker_utils_test.go:116:2: SA5011: possible nil pointer dereference (staticcheck)
c.Helper()
^
integration-cli/docker_utils_test.go:118:5: SA5011(related information): this check suggests that the pointer can be nil (staticcheck)
if c != nil {
^
integration-cli/docker_utils_test.go:126:2: SA5011: possible nil pointer dereference (staticcheck)
c.Helper()
^
integration-cli/docker_utils_test.go:128:5: SA5011(related information): this check suggests that the pointer can be nil (staticcheck)
if c != nil {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As caught by gosimple:
client/client.go:138:14: S1040: type assertion to the same type: c.client.Transport already has type http.RoundTripper (gosimple)
if _, ok := c.client.Transport.(http.RoundTripper); !ok {
^
This check was originally added in dc9f5c2ca3, to
check if the passed option was a `http.Transport`, and later changed in
e345cd12f9 to check for `http.RoundTripper` instead.
Client.client is a http.Client, for which the Transport field is a RoundTripper,
so this check is redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The hack/vendor.sh script is used to (re)vendor dependencies. However, it did
not run `go mod tidy` before doing so, wheras the vendor _validation_ script
did.
This could result in vendor validation failing if go mod tidy resulted in
changes (which could be in `vendor.sum`).
In "usual" situations, this could be easily done by the user (`go mod tidy`
before running `go mod vendor`), but due to our (curent) uses of `vendor.mod`,
and having to first set up a (dummy) `go.mod`, this is more complicated.
Instead, just make the script do this, so that `hack/vendor.sh` will always
produce the expected result.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's deprecated in Go 1.18:
client/request.go:157:8: SA1019: err.Temporary is deprecated: Temporary errors are not well-defined. Most "temporary" errors are timeouts, and the few exceptions are surprising. Do not use this method. (staticcheck)
if !err.Temporary() {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit allows the Landlock[0] system calls in the default seccomp
policy.
Landlock was introduced in kernel 5.13, to fill the gap that inspecting
filepaths passed as arguments to filesystem system calls is not really
possible with pure `seccomp` (unless involving `ptrace`).
Allowing Landlock by default fits in with allowing `seccomp` for
containerized applications to voluntarily restrict their access rights
to files within the container.
[0]: https://www.kernel.org/doc/html/latest/userspace-api/landlock.html
Signed-off-by: Tudor Brindus <me@tbrindus.ca>
daemon/graphdriver/fuse-overlayfs/fuseoverlayfs.go:101:63: SA9002: file mode '700' evaluates to 01274; did you mean '0700'? (staticcheck)
if err := idtools.MkdirAllAndChown(path.Join(home, linkDir), 700, currentID); err != nil {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 194eaa5c0f to check image
compatibility based on Platform.Features;
// For now, hard code that all base images except nanoserver depend on win32k support
if imageData.Name != "nanoserver" {
imageData.OSFeatures = append(imageData.OSFeatures, "win32k")
}
But no longer used since 1f59bc8c03 and
d231260868
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in commits fc21bf280b and
0380fbff37 in support of LCOW, but was
now always set to runtime.GOOS.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes some of the checks that were added in 0cba7740d4,
but should no longer be needed.
- `Daemon.create()`: fix the error message, which assumed it could only occur on Windows.
- `Daemon.cleanupContainer()`: no need to validate container platform to delete it.
- `Daemon.containerExport`: if a container was created, we should be able to
export it; no need to validate.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes some of the checks that were added in 0cba7740d4,
but should no longer be needed.
- `dockerfile.BuildFromConfig()` is used for `docker (container) commmit` and
`docker (image) import`. For `docker import`, we're failing early already.
For `commit`, it won't be possible to have a container that doesn't have the
right operating-system, so there's no need to validate.
- `dispatchRequest.getImageOrStage()`: simplify the check; all checks resulted
in an error on Windows, so it came down to "Windows does not support FROM scratch".
- `dispatchState.beginStage()`: `image.OperatingSystem()` already defaults to the
`runtime.GOOS` if unset, so remove the local default fallback.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes some of the checks that were added in 0cba7740d4,
but should no longer be needed.
- `image/store.Delete()`: no need to validate image platform to delete it.
- `image/tarexporter/takeLayerReference()`: use `image.OperatingSystem()` and
fail early to prevent constructing the `ChainID()`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes some of the checks that were added in 0cba7740d4,
but should no longer be needed.
- `ImageService.ImageDelete()`: no need to validate image platform to delete it.
- `ImageService.ImageHistory()`: no need to validate image platform to show its
history; if it made it into the local image cache, it should be valid.
- `ImageService.ImportImage()`: `dockerfile.BuildFromConfig()` is used for
`docker (container) commmit` and `docker (image) import`. For `docker import`,
it's more transparent to perform validation early.
- `ImageService.LookupImage()`: no need to validate image platform to inspect it;
if it made it into the local image cache, it should be valid.
- `ImageService.SquashImage()`: same. This code was actually broken, because it
wrapped an `err` that was always `nil`, so would never return an error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
None of the implementations used return an error, so removing the error
return can simplify using these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 0380fbff37 added the ability to pass a
--platform flag on `docker import` when importing an archive. The intent
of that commit was to allow importing a Linux rootfs on a Windows daemon
(as part of the experimental LCOW feature).
A later commit (337ba71fc1) changed some
of this code to take both OS and Architecture into account (for `docker build`
and `docker pull`), but did not yet update the `docker image import`.
This patch updates the import endpoitn to allow passing both OS and
Architecture. Note that currently only matching OSes are accepted,
and an error will be produced when (e.g.) specifying `linux` on Windows
and vice-versa.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This error is meant to be used in the output stream, and some comments
were added to prevent accidentally using local variables.
Renaming the variable instead to make it less ambiguous.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These logs were meant to be logged when starting the daemon. Moving the logs
to the daemon startup code (which also prints similar messages) instead of
having the images service log them.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This argument was added for LCOW support, but it was only used to verify if
the passed platform (OS) matched the host. Given that all uses of this function
(except for one) passed runtime.GOOS, we may as well move the check to that
location.
We should do more cleaning up after this, and perform such validations early,
instead of passing platform around in too many places where it's only used for
similar validations. This is a first step in that direction.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This interface only had a single implementation (xfer.LayerDownloadManager),
and all places where it was used already imported the xfer package.
Removing the interface, also makes it a closer match to the "upload" part,
as `xfer.LayerUploadManager()` did not use an interface.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 3b5fac462d / docker 1.10 removed support
for the LXC runtime, and removed the corresponding fields from the API (v1.22).
This patch removes the `HostConfig.LxcConf` field from the API documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 3b5fac462d / docker 1.10 removed support
for the LXC runtime, and removed the corresponding fields from the API (v1.22).
This patch removes the `HostConfig.LxcConf` field from the swagger definition.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test is verifying that the container has the right options set (through
`docker inspect`), but also checks if the cgroup-rules are set within the container
by reading `/sys/fs/cgroup/devices/devices`
Unlike cgroups v1, on cgroups v2, there is no file interface, and rules are handled
through ebpf, which means that the test will fail because this file is not present.
From the Linux documentation for cgroups v2: https://github.com/torvalds/linux/blob/v5.16/Documentation/admin-guide/cgroup-v2.rst#device-controller
> (...)
> Device controller manages access to device files. It includes both creation of
> new device files (using mknod), and access to the existing device files.
>
> Cgroup v2 device controller has no interface files and is implemented on top of
> cgroup BPF. To control access to device files, a user may create bpf programs
> of type BPF_PROG_TYPE_CGROUP_DEVICE and att>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- all changes here are attributed to difference in behaviour between,
namely:
- resolution of secondary test dependencies
- prunning of non-Go files
Signed-off-by: Ilya Dmitrichenko <errordeveloper@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use `vendor.mod` instead of `go.mod` to avoid issues to do with
use of CalVer, not SemVer
- ensure most of the dependency versions do not change
- only zookeeper client has to change (via docker/libkv#218) as
previously used version is no longer maintained and has missing
dependencies
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `daemon.RawSysInfo()` function can be a heavy operation, as it collects
information about all cgroups on the host, networking, AppArmor, Seccomp, etc.
While looking at our code, I noticed that various parts in the code call this
function, potentially even _multiple times_ per container, for example, it is
called from:
- `verifyPlatformContainerSettings()`
- `oci.WithCgroups()` if the daemon has `cpu-rt-period` or `cpu-rt-runtime` configured
- in `ContainerDecoder.DecodeConfig()`, which is called on boith `container create` and `container commit`
Given that this information is not expected to change during the daemon's
lifecycle, and various information coming from this (such as seccomp and
apparmor status) was already cached, we may as well load it once, and cache
the results in the daemon instance.
This patch updates `daemon.RawSysInfo()` to use a `sync.Once()` so that
it's only executed once for the daemon's lifecycle.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This takes the changes from 1a933e113d and
834272f978, and applies them to older API
versions in the docs directory (which are used for the actual documentation).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use unique names to prevent tests from interfering, using a shorter
name, as there's a maximum length for these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the syscall method instead of repeating the type conversions for
the syscall.Stat_t Atim/Mtim members. This also allows to drop the
//nolint: unconvert comments.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
followLogs() is getting really long (170+ lines) and complex.
The function has multiple inner functions that mutate its variables.
To refactor the function, this change introduces follow{} struct.
The inner functions are now defined as ordinal methods, which are
accessible from tests.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
* When async is enabled, this option defines the interval (ms) at which the connection
to the fluentd-address is re-established. This option is useful if the address
may resolve to one or more IP addresses, e.g. a Consul service address.
While the change in #42979 resolves the issue where a Docker container can be stuck
if the fluentd-address is unavailable, this functionality adds an additional benefit
in that a new and healthy fluentd-address can be resolved, allowing logs to flow once again.
This adds a `fluentd-async-reconnect-interval` log-opt for the fluentd logging driver.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Conor Evans <coevans@tcd.ie>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Co-authored-by: Conor Evans <coevans@tcd.ie>
A number of tests in the TestDockerDaemonSuite create a custom bridge as part
of the test. In some cases, an existing `docker0` bridge could interfere with
those tests. For example, the `TestDaemonICCLinkExpose` and `TestDaemonICCPing`
verify that no "icc" communication is possible, and for this create a new
bridge with a custom IP-range.
However, depending on which tests ran before the test, a default `docker0` bridge
may exist (e.g., if the`TestDefaultGatewayIPv4Implicit`) with the same IP-range,
in which iptables rules may have been set up that allow communication, and thus
make the "icc" tests fail.
This patch removes the `docker0` interface at the start of tests that create
their own bridge to prevent it from interfering.
Note that alternatively, we could update those tests to use an IP-range that's
less likely to overlap, but this may be more brittle (but could still be done
in addition to this change as a follow-up).
To verify these changes;
make DOCKER_GRAPHDRIVER=vfs TEST_SKIP_INTEGRATION=1 TESTFLAGS='-test.run TestDockerDaemonSuite/TestDaemon(DefaultGatewayIPv4|ICC)' test-integration-cli
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If we detect that a pattern is either an exact match, prefix match, or
suffix match, use an optimized code path instead of compiling a regexp.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
This system call is only available on the 32- and 64-bit PowerPC, it is
used by modern programming language implementations (such as gcc-go) to
implement coroutine features through userspace context switches.
Other container environment, such as Systemd nspawn already whitelist
this system call in their seccomp profile [1] [2]. As such, it would be
nice to also whitelist it in moby.
This issue was encountered on Alpine Linux GitLab CI system, which uses
moby, when attempting to execute gcc-go compiled software on ppc64le.
[1]: https://github.com/systemd/systemd/pull/9487
[2]: https://github.com/systemd/systemd/issues/9485
Signed-off-by: Sören Tempel <soeren+git@soeren-tempel.net>
Before this change, if Decode() couldn't read a log record fully,
the subsequent invocation of Decode() would read the record's non-header part
as a header and cause a huge heap allocation.
This change prevents such a case by having the intermediate buffer in
the decoder struct.
Fixes#42125.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
The flag ForceStopAsyncSend was added to fluent logger lib in v1.5.0 (at
this time named AsyncStop) to tell fluentd to abort sending logs
asynchronously as soon as possible, when its Close() method is called.
However this flag was broken because of the way the lib was handling it
(basically, the lib could be stucked in retry-connect loop without
checking this flag).
Since fluent logger lib v1.7.0, calling Close() (when ForceStopAsyncSend
is true) will really stop all ongoing send/connect procedure,
wherever it's stucked.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Trying to reduce the use of libcontainer/devices, as it's considered
to be an "internal" package by runc.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Unfortunately, this check was missing in the original version. It could
cause a positive match to be overwritten by checking parent dirs.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
The existing code does not correctly handle the case where a file
matches one of the patterns, but should not match overall because of an
exclude pattern that applied to a parent directory (see
https://github.com/docker/buildx/issues/850).
Fix this by independently tracking the results of matching against each
pattern. A file should be considered to match any pattern that matched a
parent dir.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
full diff: https://github.com/containerd/ttrpc/compare/v1.0.2...v1.1.0
- client: Handle sending/receiving in separate goroutines
- Return Unimplemented when services or methods are not implemented
- go.mod: sirupsen/logrus v1.7.0
- go.mod: update dependencies
- go.mod: github.com/gogo/protobuf v1.3.2
- go.mod: google.golang.org/grpc v1.27.1
- go.mod: google.golang.org/genproto v0.0.0-20200224152610-e50cd9704f63
- go.mod: github.com/prometheus/procfs v0.6.0
- replace pkg/errors
- Rename branch from master to main
- Use GitHub Actions for CI
- Make "go test" and "go build" work on macOS
- Add protoc-gen-go-ttrpc
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The appropriate/nc image was last built over 6 years ago, and uses the
deprecated v2 schema 1 format.
https://github.com/appropriate/docker-nc/tree/master/latest
The image is just a plain "apk install" of netbsd-netcat, but was added
in 1c4286bcff, because at the time the
busybox nc had some bugs.
These appear to be resolved, so we can use the busybox nc, from the
frozen images.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Pushing manifest v2, schema 1 images has been deprecated in commit
6302dbbf46 (docker 20.10). It's still used in
some tests to provision a legacy registry to test _pulling_ legacy images
(which is still "supported"), but we should no longer have to validate pushing
for other scenarios.
This patch removes the schema 1 push tests, and inlines the code that was
extracted in non-exported functions (for them to be shared between schema 2 and
schema 1 tests).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The eighth patch release for containerd 1.5 contains a mitigation for CVE-2021-41190
as well as several fixes and updates.
Notable Updates
* Handle ambiguous OCI manifest parsing
* Filter selinux xattr for image volumes in CRI plugin
* Use DeactiveLayer to unlock layers that cannot be renamed in Windows snapshotter
* Fix pull failure on unexpected EOF
* Close task IO before waiting on delete
* Log a warning for ignored invalid image labels rather than erroring
* Update pull to handle of non-https urls in descriptors
See the changelog for complete list of changes
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since moby/libnetwork#2635 has been merged, allocatePortsInternal()
checks if IPv6 is enabled by calling IsV6Listenable(). This function
calls `net.Listen("tcp6", "[::1]:0")` and returns false when
net.Listen() fails.
TestPortMappingV6Config() starts by setting up a new net ns to run into
it. The loopback interface is not bring up in this net ns, thus
net.Listen() fails and IsV6Listenable() returns false. This change takes
care of bringing loopback iface up right after moving to the new net ns.
This test has been reported has flaky on s390x in #42468. For some
reason, this test seems to be consistently green on the CI (on amd64
arch) and when running `hack/test/unit` locally. However it consistently
fails when running `TESTFLAGS='-shuffle on' hack/test/unit` locally.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This test is failing frequently once nodes have less disk space
available. Skipping the test for now, but we can continue looking
for a good solution.
Tracked through https://github.com/moby/moby/issues/42743
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This should help with Jenkins failing to clean up the Workspace:
- make sure "cleanup" is also called in the defer for all daemons. keeping
the daemon's storage around prevented Jenkins from cleaning up.
- close client connections and some readers (just to be sure)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The storage-driver directory caused Jenkins cleanup to fail. While at it, also
removing other directories that we do not include in the "bundles" that are
stored as Jenkins artifacts.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.17.3 (released 2021-11-04) includes security fixes to the archive/zip and
debug/macho packages, as well as bug fixes to the compiler, linker, runtime, the
go command, the misc/wasm directory, and to the net/http and syscall packages.
See the Go 1.17.3 milestone on our issue tracker for details.
From the announcement e-mail:
[security] Go 1.17.3 and Go 1.16.10 are released
We have just released Go versions 1.17.3 and 1.16.10, minor point releases.
These minor releases include two security fixes following the security policy:
- archive/zip: don't panic on (*Reader).Open
Reader.Open (the API implementing io/fs.FS introduced in Go 1.16) can be made
to panic by an attacker providing either a crafted ZIP archive containing
completely invalid names or an empty filename argument.
Thank you to Colin Arnott, SiteHost and Noah Santschi-Cooney, Sourcegraph Code
Intelligence Team for reporting this issue. This is CVE-2021-41772 and Go issue
golang.org/issue/48085.
- debug/macho: invalid dynamic symbol table command can cause panic
Malformed binaries parsed using Open or OpenFat can cause a panic when calling
ImportedSymbols, due to an out-of-bounds slice operation.
Thanks to Burak Çarıkçı - Yunus Yıldırım (CT-Zer0 Crypttech) for reporting this
issue. This is CVE-2021-41771 and Go issue golang.org/issue/48990.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was originally in docker/libnetwork#2624, which has been closed since the
code was moved here.
When creating a new network, IPAM's address allocator attempts to reserve the
network and broadcast addresses on IPv4 networks of all sizes. For RFC 3021
point-to-point networks (IPv4 /31s), this consumes both available addresses and
renders any attempt to allocate an address from the block unsuccessful.
This change prevents those reservations from taking place on IPv4 networks having
two or fewer addresses (i.e., /31s and /32s) while retaining the existing behavior
for larger IPv4 blocks and all IPv6 blocks.
In case you're wondering why anyone would allocate /31s: I work for a network
service provider. We use a lot of point-to-point networks. This cuts our
address space utilization for those by 50%, which makes ARIN happy.
This patch modifies the network allocator to recognize when an network is too
small for network and broadcast addresses and skip those reservations.
There are additional unit tests to make sure the functions involved behave as expected.
Try these out:
* `docker network create --driver bridge --subnet 10.200.1.0/31 --ip-range 10.200.1.0/31 test-31`
* `docker network create --driver bridge --subnet 10.200.1.0/32 --ip-range 10.200.1.0/32 test-32`
My installation has been running this patch in production with /31s since March.
Signed-off-by: Mark Feit <mfeit@internet2.edu>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also includes review suggestions in daemon.initNetworkController():
- update godoc for setHostGatewayIP()
- change setHostGatewayIP() to get config, instead of daemon
- remove redundant nil check for controller
Signed-off-by: sanchayanghosh <sanchayanghosh@outlook.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The daemon can print the proxy configuration as part of error-messages,
and when reloading the daemon configuration (SIGHUP). Make sure that
the configuration is sanitized before printing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The proxy configuration works, but looks like we're unable to connect to the
test proxy server as part of our test;
level=debug msg="Trying to pull example.org:5000/some/image from https://example.org:5000 v2"
level=warning msg="Error getting v2 registry: Get \"https://example.org:5000/v2/\": proxyconnect tcp: dial tcp 127.0.0.1:45999: connect: connection refused"
level=info msg="Attempting next endpoint for pull after error: Get \"https://example.org:5000/v2/\": proxyconnect tcp: dial tcp 127.0.0.1:45999: connect: connection refused"
level=error msg="Handler for POST /v1.42/images/create returned error: Get \"https://example.org:5000/v2/\": proxyconnect tcp: dial tcp 127.0.0.1:45999: connect: connection refused"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows configuring the daemon's proxy server through the daemon.json con-
figuration file or command-line flags configuration file, in addition to the
existing option (through environment variables).
Configuring environment variables on Windows to configure a service is more
complicated than on Linux, and adding alternatives for this to the daemon con-
figuration makes the configuration more transparent and easier to use.
The configuration as set through command-line flags or through the daemon.json
configuration file takes precedence over env-vars in the daemon's environment,
which allows the daemon to use a different proxy. If both command-line flags
and a daemon.json configuration option is set, an error is produced when starting
the daemon.
Note that this configuration is not "live reloadable" due to Golang's use of
`sync.Once()` for proxy configuration, which means that changing the proxy
configuration requires a restart of the daemon (reload / SIGHUP will not update
the configuration.
With this patch:
cat /etc/docker/daemon.json
{
"http-proxy": "http://proxytest.example.com:80",
"https-proxy": "https://proxytest.example.com:443"
}
docker pull busybox
Using default tag: latest
Error response from daemon: Get "https://registry-1.docker.io/v2/": proxyconnect tcp: dial tcp: lookup proxytest.example.com on 127.0.0.11:53: no such host
docker build .
Sending build context to Docker daemon 89.28MB
Step 1/3 : FROM golang:1.16-alpine AS base
Get "https://registry-1.docker.io/v2/": proxyconnect tcp: dial tcp: lookup proxytest.example.com on 127.0.0.11:53: no such host
Integration tests were added to test the behavior:
- verify that the configuration through all means are used (env-var,
command-line flags, damon.json), and used in the expected order of
preference.
- verify that conflicting options produce an error.
Signed-off-by: Anca Iordache <anca.iordache@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Looks like this test was broken from the start, and fully relied on a race
condition. (Test was added in 65ee7fff02)
The problem is in the service's command: `ls -l /etc/config || /bin/top`, which
will either:
- exit immediately if the secret is mounted correctly at `/etc/config` (which it should)
- keep running with `/bin/top` if the above failed
After the service is created, the test enters a race-condition, checking for 1
task to be running (which it ocassionally is), after which it proceeds, and looks
up the list of tasks of the service, to get the log output of `ls -l /etc/config`.
This is another race: first of all, the original filter for that task lookup did
not filter by `running`, so it would pick "any" task of the service (either failed,
running, or "completed" (successfully exited) tasks).
In the meantime though, SwarmKit kept reconciling the service, and creating new
tasks, so even if the test was able to get the ID of the correct task, that task
may already have been exited, and removed (task-limit is 5 by default), so only
if the test was "lucky", it would be able to get the logs, but of course, chances
were likely that it would be "too late", and the task already gone.
The problem can be easily reproduced when running the steps manually:
echo 'CONFIG' | docker config create myconfig -
docker service create --config source=myconfig,target=/etc/config,mode=0777 --name myservice busybox sh -c 'ls -l /etc/config || /bin/top'
The above creates the service, but it keeps retrying, because each task exits
immediately (followed by SwarmKit reconciling and starting a new task);
mjntpfkkyuuc1dpay4h00c4oo
overall progress: 0 out of 1 tasks
1/1: ready [======================================> ]
verify: Detected task failure
^COperation continuing in background.
Use `docker service ps mjntpfkkyuuc1dpay4h00c4oo` to check progress.
And checking the tasks for the service reveals that tasks exit cleanly (no error),
but _do exit_, so swarm just keeps up reconciling, and spinning up new tasks;
docker service ps myservice --no-trunc
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
2wmcuv4vffnet8nybg3he4v9n myservice.1 busybox:latest@sha256:f7ca5a32c10d51aeda3b4d01c61c6061f497893d7f6628b92f822f7117182a57 docker-desktop Ready Ready less than a second ago
5p8b006uec125iq2892lxay64 \_ myservice.1 busybox:latest@sha256:f7ca5a32c10d51aeda3b4d01c61c6061f497893d7f6628b92f822f7117182a57 docker-desktop Shutdown Complete less than a second ago
k8lpsvlak4b3nil0zfkexw61p \_ myservice.1 busybox:latest@sha256:f7ca5a32c10d51aeda3b4d01c61c6061f497893d7f6628b92f822f7117182a57 docker-desktop Shutdown Complete 6 seconds ago
vsunl5pi7e2n9ol3p89kvj6pn \_ myservice.1 busybox:latest@sha256:f7ca5a32c10d51aeda3b4d01c61c6061f497893d7f6628b92f822f7117182a57 docker-desktop Shutdown Complete 11 seconds ago
orxl8b6kt2l6dfznzzd4lij4s \_ myservice.1 busybox:latest@sha256:f7ca5a32c10d51aeda3b4d01c61c6061f497893d7f6628b92f822f7117182a57 docker-desktop Shutdown Complete 17 seconds ago
This patch changes the service's command to `sleep`, so that a successful task
(after successfully performing `ls -l /etc/config`) continues to be running until
the service is deleted. With that change, the service should (usually) reconcile
immediately, which removes the race condition, and should also make it faster :)
This patch changes the tests to use client.ServiceLogs() instead of using the
service's tasklist to directly access container logs. This should also fix some
failures that happened if some tasks failed to start before reconciling, in which
case client.TaskList() (with the current filters), could return more tasks than
anticipated (as it also contained the exited tasks);
=== RUN TestCreateServiceSecretFileMode
create_test.go:291: assertion failed: 2 (int) != 1 (int)
--- FAIL: TestCreateServiceSecretFileMode (7.88s)
=== RUN TestCreateServiceConfigFileMode
create_test.go:355: assertion failed: 2 (int) != 1 (int)
--- FAIL: TestCreateServiceConfigFileMode (7.87s)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit b1a3fe4934 changed how the error was
returned (which is now wrapped), causing the test to fail:
=== RUN TestInvalidRemoteDriver
libnetwork_test.go:1289: Did not fail with expected error. Actual error: Plugin does not implement the requested driver: plugin="invalid-network-driver", requested implementation="NetworkDriver"
--- FAIL: TestInvalidRemoteDriver (0.01s)
Changing the test to use errors.Is()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There were a couple characters being explicitly escaped, but it
wasn't comprehensive.
This is now the set difference between the Golang regex meta
characters and the `filepath` match meta characters with the
exception of `\`, which already has special logic due to being
the path separator on Windows.
Signed-off-by: Milas Bowman <milasb@gmail.com>
Before this change if you assume that things work the way the test
expects them to (it does not, but lets assume for now) we aren't really
testing anything because we are testing that a container is healthy
before and after we send a signal. This will give false positives even
if there is a bug in the underlying code. Sending a signal can take any
amount of time to cause a container to exit or to trigger healthchecks
to stop or whatever.
Now lets remove the assumption that things are working as expected,
because they are not.
In this case, `top` (which is what is running in the container) is
actually exiting when it receives `USR1`.
This totally invalidates the test.
We need more control and knowledge as to what is happening in the
container to properly test this.
This change introduces a custom script which traps `USR1` and flips the
health status each time the signal is received.
We then send the signal twice so that we know the change has occurred
and check that the value has flipped so that we know the change has
actually occurred.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This test runs with t.Parallel() _and_ uses subtests, but didn't capture
the `tc` variable, which potentialy (likely) makes it test the same testcase
multiple times.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
testutil/daemon uses a generic unix implementation that assumes that
the host OS supports cgroups & network namespaces, which is not the
case for FreeBSD.
This change adds a FreeBSD-specific implementation for
`testutil/daemon`, namely for `cleanupNetworkNamespace` and
`CgroupNamespace` functions.
Signed-off-by: Artem Khramov <akhramov@pm.me>
Updates go-winio to the latest version. The main important fix here is
to go-winio's backuptar package. This is needed to fix a bug in sparse
file handling in container layers, which was exposed by a recent change
in Windows.
go-winio v0.5.1: https://github.com/microsoft/go-winio/releases/tag/v0.5.1
Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
In situations where docker runs in an environment where capabilities are limited,
sucn as docker-in-docker in a container created by older versions of docker, or
in a container where some capabilities have been disabled, starting a privileged
container may fail, because even though the _kernel_ supports a capability, the
capability is not available.
This patch attempts to address this problem by limiting the list of "known" capa-
bilities on the set of effective capabilties for the current process. This code
is based on the code in containerd's "caps" package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- don't pass the query's quetion.name separately, as we're already
passing the query itself.
- remove a "fallthrough" in favor of combining the cases in the switch
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Make sure all log messages have the `[resolver]` prefix
- Use `logrus.WithError()` consistently
- Improve information included in some logs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The seventh patch release for containerd 1.5 is a security release to fix CVE-2021-41103.
Notable Updates:
- Fix insufficiently restricted permissions on container root and plugin directories
GHSA-c2h3-6mxw-7mvq
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Install apparmor parser for arm64 and update seccomp to 2.5.1
- Update runc binary to 1.0.2
- Update hcsshim to v0.8.21 to fix layer issue on Windows Server 2019
- Add support for 'clone3' syscall to fix issue with certain images when seccomp is enabled
- Add image config labels in CRI container creation
- Fix panic in metadata content writer on copy error
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in b3c883bb2f, but resulted
in a panic if the embedded DNS had to handle an unsupported query-type,
such as ANY.
This patch adds a debug log for this case (to better describe how it's
handled.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Added an option 'awslogs-format' to allow specifying
a log format for the logs sent CloudWatch from the aws log driver.
For now, only the 'json/emf' format is supported.
If no option is provided, the log format header in the
request to CloudWatch will be omitted as before.
Signed-off-by: James Sanders <james3sanders@gmail.com>
go1.17.2 (released 2021-10-07) includes a security fix to the linker and misc/wasm
directory, as well as bug fixes to the compiler, the runtime, the go command, and
to the time and text/template packages. See the Go 1.17.2 milestone on our issue
tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.17.2+label%3ACherryPickApproved
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Standalone is a boolean, so false by default; also cleanup some debug logs
(probably more logs can be removed)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I think it's a bit more readable to just use a literal value
for these; this also prevents having to use `_` to skip zero.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If chroot is used with a special root directory then create
destination directory within chroot. This works automatically
already due to extractor creating parent paths and is only
used currently with cp where parent paths are actually required
and error will be shown to user before reaching this point.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(cherry picked from commit 52d285184068998c22632bfb869f6294b5613a58)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 80f1169eca)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Do not use 0701 perms.
0701 dir perms allows anyone to traverse the docker dir.
It happens to allow any user to execute, as an example, suid binaries
from image rootfs dirs because it allows traversal AND critically
container users need to be able to do execute things.
0701 on lower directories also happens to allow any user to modify
things in, for instance, the overlay upper dir which neccessarily
has 0755 permissions.
This changes to use 0710 which allows users in the group to traverse.
In userns mode the UID owner is (real) root and the GID is the remapped
root's GID.
This prevents anyone but the remapped root to traverse our directories
(which is required for userns with runc).
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit ef7237442147441a7cadcda0600be1186d81ac73)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 93ac040bf0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- using v2.1.0 for the "v1" registry (last release with only v1)
- using v2.3.0 as "current" version (was v2.3.0-rc.0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This moves installers that are only used during CI into the Dockerfile. Some
installers are still used in the release-pipeline, so keeping thos for now.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Docker 17.07 and up allow the CLI to be configured to set default proxy
env-vars to be used (both as build-arg and as env for docker run), see
docker/cli#93, so setting these here should be redundant. If someone
needs these env-vars set, they should be configured in the cli's
`~/.docker/config.json` instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds support for 2 runtimes on Windows, one that uses the built-in
HCSv1 integration and another which uses containerd with the runhcs
shim.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Because FreeBSD uses 64-bit device nodes (see
https://reviews.freebsd.org/rS318736), Linux implementation of
`system.Mknod` & `system.Mkdev` is not sufficient.
This change adds freebsd-specific implementations for `Mknod` and
Mkdev`.
Signed-off-by: Artem Khramov <akhramov@pm.me>
zstd is a compression algorithm that has a very fast decoder, while
providing also good compression ratios. The fast decoder makes it
suitable for container images, as decompressing the tarballs is a very
expensive operation.
https://github.com/opencontainers/image-spec/pull/788 added support
for zstd to the OCI image specs.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This includes additional fixes for CVE-2021-39293.
go1.17.1 (released 2021-09-09) includes a security fix to the archive/zip package,
as well as bug fixes to the compiler, linker, the go command, and to the crypto/rand,
embed, go/types, html/template, and net/http packages. See the Go 1.17.1 milestone
on the issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.17.1+label%3ACherryPickApproved
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Looks like we don't need sprintf for how it's used. Replacing sprintf makes it
more performant (~2.4x as fast), and less memory, allocations:
BenchmarkGetRandomName-8 8203230 142.4 ns/op 37 B/op 2 allocs/op
BenchmarkGetRandomNameOld-8 3499509 342.9 ns/op 85 B/op 5 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The golangci-lint, gotestsum, shfmt, and vndr utilities should generally
be ok to be pinned by version instead of a specific sha. Also rename
the corresponding env-vars / build-args accordingly:
- GOLANGCI_LINT_COMMIT -> GOLANGCI_LINT_VERSION
- GOTESTSUM_COMMIT -> GOTESTSUM_VERSION
- SHFMT_COMMIT -> SHFMT_VERSION
- VNDR_COMMIT -> VNDR_VERSION
- CONTAINERD_COMMIT -> CONTAINERD_VERSION
- RUNC_COMMIT -> RUNC_VERSION
- ROOTLESS_COMMIT -> ROOTLESS_VERSION
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This build-tag was removed in 52390d6804,
which is part of runc v1.0.0-rc94 and up, so no longer relevant.
the kmem options are now always disabled in runc.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit c55a4ac779 changed the ioutil utilities
to use the new os variants, per recommendation from the go 1.16 release notes:
https://golang.org/doc/go1.16#ioutil
> we encourage new code to use the new definitions in the io and os packages.
> Here is a list of the new locations of the names exported by io/ioutil:
However, the devil is in the detail, and io.ReadDir() is not a direct
replacement for ioutil.ReadDir();
> ReadDir => os.ReadDir (note: returns a slice of os.DirEntry rather than a slice of fs.FileInfo)
go1.16 added a io.FileInfoToDirEntry() utility to concert a DirEntry to
a FileInfo, but it's not available in go1.16
This patch copies the FileInfoToDirEntry code, and uses it for go1.16.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit dae652e2e5 added support for non-privileged
containers to use ICMP_PROTO (used for `ping`). This option cannot be set for
containers that have user-namespaces enabled.
However, the detection looks to be incorrect; HostConfig.UsernsMode was added
in 6993e891d1 / ee2183881b,
and the property only has meaning if the daemon is running with user namespaces
enabled. In other situations, the property has no meaning.
As a result of the above, the sysctl would only be set for containers running
with UsernsMode=host on a daemon running with user-namespaces enabled.
This patch adds a check if the daemon has user-namespaces enabled (RemappedRoot
having a non-empty value), or if the daemon is running inside a user namespace
(e.g. rootless mode) to fix the detection.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Unify the NetworkInspect tests to remove some boilerplating
Before this change:
go test -v -run TestNetworkInspect ./client/
=== RUN TestNetworkInspectError
--- PASS: TestNetworkInspectError (0.00s)
=== RUN TestNetworkInspectNotFoundError
--- PASS: TestNetworkInspectNotFoundError (0.00s)
=== RUN TestNetworkInspectWithEmptyID
--- PASS: TestNetworkInspectWithEmptyID (0.00s)
=== RUN TestNetworkInspect
--- PASS: TestNetworkInspect (0.00s)
PASS
ok github.com/docker/docker/client 0.010s
With this change:
go test -v -run TestNetworkInspect ./client/
=== RUN TestNetworkInspect
=== RUN TestNetworkInspect/empty_ID
=== RUN TestNetworkInspect/no_options
=== RUN TestNetworkInspect/verbose
=== RUN TestNetworkInspect/global_scope
=== RUN TestNetworkInspect/unknown_network
=== RUN TestNetworkInspect/server_error
--- PASS: TestNetworkInspect (0.00s)
--- PASS: TestNetworkInspect/empty_ID (0.00s)
--- PASS: TestNetworkInspect/no_options (0.00s)
--- PASS: TestNetworkInspect/verbose (0.00s)
--- PASS: TestNetworkInspect/global_scope (0.00s)
--- PASS: TestNetworkInspect/unknown_network (0.00s)
--- PASS: TestNetworkInspect/server_error (0.00s)
PASS
ok github.com/docker/docker/client 0.012s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- fix incorrectly formatted GoDoc and comments
- rename a variable that collided with the `cap` built-in
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This replaces the local SeccompSupported() utility for the implementation in containerd,
which performs the same check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The io/ioutil package has been deprecated in Go 1.16. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Golang '.0' releases are released without a trailing .0 (i.e. go1.17
is equal to go1.17.0). For the base image, we want to specify the go
version including their patch release (golang:1.17 is equivalent to
go1.17.x), so adjust the script to also accept the trailing .0, because
otherwise the download-URL is not found:
hack/vendor.sh archive/tar
update vendored copy of archive/tar
downloading: https://golang.org/dl/go1.17.0.src.tar.gz
curl: (22) The requested URL returned error: 404
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These checks were added when we required a specific version of containerd
and runc (different versions were known to be incompatible). I don't think
we had a similar requirement for tini, so this check was redundant. Let's
remove the check altogether.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update the frozen images to also be based on Debian bullseye. Using the "slim"
variant (which looks to have all we're currently using), and remove the
buildpack-dep frozen image.
The buildpack-dep image is quite large, and it looks like we only use it to
compile some C binaries, which should work fine on a regular debian image;
docker build -t debian:bullseye-slim-gcc -<<EOF
FROM debian:bullseye-slim
RUN apt-get update && apt-get install -y gcc libc6-dev --no-install-recommends
EOF
docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
debian bullseye-slim-gcc 1851750242af About a minute ago 255MB
buildpack-deps bullseye fe8fece98de2 2 days ago 834MB
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 7168d98c43 removed these, but
we overlooked that the same stage is used to build runc as well, so
we likely need these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Otherwise errors within this function will all show to be at the line
number of the utility, instead of where it failed in the test:
=== RUN TestDaemonDefaultNetworkPools
service_test.go:23: assertion failed:
Command: ip link delete docker0
ExitCode: 127
Error: exec: "ip": executable file not found in $PATH
Stdout:
Stderr:
Failures:
ExitCode was 127 expected 0
Expected no error
=== RUN TestDaemonRestartWithExistingNetwork
service_test.go:23: assertion failed:
Command: ip link delete docker0
ExitCode: 127
Error: exec: "ip": executable file not found in $PATH
Stdout:
Stderr:
Failures:
ExitCode was 127 expected 0
Expected no error
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, ioutils imported the crypty/sha256 package, because it was
used by the HashData() utility. As a side-effect of that import, the
sha256 algorithm was registered through its `init()` function.
Now that the HashData() utility is removed, the import is no longer needed
in this package, but some parts of our code depended on the side-effect, and
without this, it fail to recognise the algorithms, unless something else
happens to import crypto/sha256 / crypto/sha512, which made our
tests fail:
```
=== Failed
=== FAIL: reference TestLoad (0.00s)
store_test.go:53: failed to parse reference: unsupported digest algorithm
=== FAIL: reference TestSave (0.00s)
store_test.go:82: failed to parse reference: unsupported digest algorithm
=== FAIL: reference TestAddDeleteGet (0.00s)
store_test.go:174: could not parse reference: unsupported digest algorithm
=== FAIL: reference TestInvalidTags (0.00s)
store_test.go:355: assertion failed: error is not nil: unsupported digest algorithm
```
While it would be better to do the import in the actual locations where it's
expected, there may be code-paths we overlook, so instead adding the import
here temporarily. Until the PR in go-digest has been merged and released.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The IsLocalhost utility was not used, which only leaves the IsIPv4Localhost
utility.
Go's "net" package provides a `IsLoopBack()` check, but it checks for both
IPv4 and IPv6 loopback interfaces. We likely should also do IPv6 here, but
that's better left for a separate change, so instead, I replicated the IPv4
bits from Go's net.IP.IsLoopback().
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows using the package without having to import the "types" package,
and without having to consume github.com/ishidawataru/sctp.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used for two consts, which are unlikely to change, but the
"opts" package currently also imports libnetwork/ipamutils, which has
an `init()` function that does some heavy lifting, and not needed for
this utility's purpose.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Perhaps the testutils package in the past had an `init()` function to set up
specific things, but it no longer has. so these imports were doing nothing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This const was previously living in pkg/signal, but with that package
being moved to its own module, it didn't make much sense to put docker's
defaults in a generic module.
The const from the "signal" package is currenlty used *both* by the CLI
and the daemon as a default value when creating containers. This put up
some questions:
a. should the default be non-exported, and private to the container
package? After all, it's a _default_ (so should be used if _NOT_ set).
b. should the client actually setting a default, or instead just omit
the value, unless specified by the user? having the client set a
default also means that the daemon cannot change the default value
because the client (or older clients) will override it.
c. consider defaults from the client and defaults of the daemon to be
separate things, and create a default const in the CLI.
This patch implements option "a" (option "b" will be done separately,
as it involves the CLI code). This still leaves "c" open as an option,
if the CLI wants to set its own default.
Unfortunately, this change means we'll have to drop the alias for the
deprecated pkg/signal.DefaultStopSignal const, but a comment was left
instead, which can assist consumers of the const to find why it's no
longer there (a search showed the Docker CLI as the only consumer though).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 135cec5d4d added support for
calling the /system/df endpoint concurrently.
This patch adds a note about this enhancement to the API changes.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Each of the validation functions depended on HostConfig being not `nil`. Use an
early return, instead of continuing, and checking if it's `nil` in each of the
validate functions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes sure that the value set in the daemon can be used as-is,
without having to replicate the normalization logic elsewhere.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows containers to use the embedded default profile if a different
default is set (e.g. "unconfined") in the daemon configuration. Without this
option, users would have to copy the default profile to a file in order to
use the default.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit b237189e6c implemented an option to
set the default seccomp profile in the daemon configuration. When that PR
was reviewed, it was discussed to have the option accept the path to a custom
profile JSON file; https://github.com/moby/moby/pull/26276#issuecomment-253546966
However, in the implementation, the special "unconfined" value was not taken into
account. The "unconfined" value is meant to disable seccomp (more factually:
run with an empty profile).
While it's likely possible to achieve this by creating a file with an an empty
(`{}`) profile, and passing the path to that file, it's inconsistent with the
`--security-opt seccomp=unconfined` option on `docker run` and `docker create`,
which is both confusing, and makes it harder to use (especially on Docker Desktop,
where there's no direct access to the VM's filesystem).
This patch adds the missing check for the special "unconfined" value.
Co-authored-by: Tianon Gravi <admwiggin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Using "default" as a name is a bit ambiguous, because the _daemon_ default
can be changed using the '--seccomp-profile' daemon flag.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in a754d89b40, but not
used. Currently, the only consumer of this function I could find was docker/cli,
which used it in a unit-test (this test has already been updated to not depend
on this function); https://grep.app/search?q=.CustomHTTPHeaders%28%29&filter[lang][0]=Go
Given that commit a68ae4a2d9 deprecated the
corresponding client.SetCustomHTTPHeaders() function, and because there is no
active use for this function, it should be ok to deprecate.
We can include this in a patch-release (to be sure nobody else is depending on
it, and (if someone is) to notify them of the deprecation.
As a follow-up to this commit, I'll remove both functions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `CapabilityMapping` and `Capabilities` types appeared to be only
used locally, and added unneeded complexity.
This patch removes those types, and simplifies the logic to use a
map that maps names to `capability.Cap`s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
A capability can either be invalid, or not supported by the kernel
on which we're running. This patch changes the error message produced
to reflect if the capability is invalid/unknown, or a known capability,
but not supported by the kernel version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that runc v1.0.0-rc93 is used, we can revert this temporary workaround
This reverts commit a38b96b8cd.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Run all tests within `libnetwork` namespace with `-p=1`
in a separate `gotestsum` invocation.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
Since commit "seccomp: Sync fields with runtime-spec fields"
(5d244675bd) we support to specify the
DefaultErrnoRet to be used.
Before that commit it was not specified and EPERM was used by default.
This commit keeps the same behaviour but just makes it explicit that the
default is EPERM.
Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>
full diff: https://github.com/containerd/containerd/compare/v1.5.4...v1.5.5
Welcome to the v1.5.5 release of containerd!
The fifth patch release for containerd 1.5 updates runc to 1.0.1 and contains
other minor updates.
Notable Updates
- Update runc binary to 1.0.1
- Update pull logic to try next mirror on non-404 response
- Update pull authorization logic on redirect
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Welcome to the v1.5.5 release of containerd!
The fifth patch release for containerd 1.5 updates runc to 1.0.1 and contains
other minor updates.
Notable Updates
- Update runc binary to 1.0.1
- Update pull logic to try next mirror on non-404 response
- Update pull authorization logic on redirect
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The existing code was the exact equivalent of bytes.HasPrefix();
// HasPrefix tests whether the byte slice s begins with prefix.
func HasPrefix(s, prefix []byte) bool {
return len(s) >= len(prefix) && Equal(s[0:len(prefix)], prefix)
}
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Let clients choose object types to compute disk usage of.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
If no seccomp policy is requested, then the built-in default policy in
dockerd applies. This has no rule for "clone3" defined, nor any default
errno defined. So when runc receives the config it attempts to determine
a default errno, using logic defined in its commit:
7a8d7162f9
As explained in the above commit message, runc uses a heuristic to
decide which errno to return by default:
[quote]
The solution applied here is to prepend a "stub" filter which returns
-ENOSYS if the requested syscall has a larger syscall number than any
syscall mentioned in the filter. The reason for this specific rule is
that syscall numbers are (roughly) allocated sequentially and thus newer
syscalls will (usually) have a larger syscall number -- thus causing our
filters to produce -ENOSYS if the filter was written before the syscall
existed.
[/quote]
Unfortunately clone3 appears to one of the edge cases that does not
result in use of ENOSYS, instead ending up with the historical EPERM
errno.
Latest glibc (2.33.9000, in Fedora 35 rawhide) will attempt to use
clone3 by default. If it sees ENOSYS then it will automatically
fallback to using clone. Any other errno is treated as a fatal
error. Thus when docker seccomp policy triggers EPERM from clone3,
no fallback occurs and programs are thus unable to spawn threads.
The clone3 syscall is much more complicated than clone, most notably its
flags are not exposed as a directly argument any more. Instead they are
hidden inside a struct. This means that seccomp filters are unable to
apply policy based on values seen in flags. Thus we can't directly
replicate the current "clone" filtering for "clone3". We can at least
ensure "clone3" returns ENOSYS errno, to trigger fallback to "clone"
at which point we can filter on flags.
Fixes: https://github.com/moby/moby/issues/42680
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This utility was just a shallow wrapper around executing the regular
expression, and in some cases, we didn't even use the error it returned,
so better to inline the code instead of abstracting it away.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This way, there's no need to pass down the regular expression, and the
validation is "just another" validator (which we already pass).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Compile the regular expression, instead of 'ad-hoc'. For this to work, I moved
the splitting was moved out of parseMountRaw() into ParseMountRaw(), and the
former was renamed to parseMount(). This function still receives the 'raw' string,
as it's used to include the "raw" spec for inclusion in error messages.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add hints for "Failed to destroy btrfs snapshot <DIR> for <ID>: operation not permitted" on rootless
Related to issue 41762
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(*PatternMatcher).Matches includes a special case for when the pattern
matches a parent dir, even though it doesn't match the current path.
However, it assumes that the parent dir which would match the pattern
must have the same number of separators as the pattern itself. This
doesn't hold true with a patern like "**/foo". A file foo/bar would have
len(parentPathDirs) == 1, which is less than the number of path
len(pattern.dirs) == 2... therefore this check would be skipped.
Given that "**/foo" matches "foo", I think it's a bug that the "parent
subdir matches" check is being skipped in this case.
It seems safer to loop over the parent subdirs and check each against
the pattern. It's possible there is a safe optimization to check only a
certain subset, but the existing logic seems unsafe.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
Discovered a few instances, where loop variable is incorrectly used
within a test closure, which is marked as parallel.
Few of these were actually loops over singleton slices, therefore the issue
might not have surfaced there (yet), but it is good to fix there as
well, as this is an incorrect pattern used across different tests.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
The daemon uses a priority list to automatically select the best-matching storage
driver for the backing filesystem that is used.
Historically, overlay2 was not supported on Btrfs and ZFS, and the daemon would
automatically pick the `btrfs` or `zfs` storage driver if that was the Backing
File System.
Commits 649e4c8889 and e226aea280
improved our detection to check if overlay2 was supported on the backing file-
system, allowing overlay2 to be used on top of Btrfs or ZFS, but did not change
the priority list.
While both Btrfs and ZFS have advantages for certain use-cases, and provide
advanced features that are not available to overlay2, they also are known
to require more "handholding", and are generally considered to be mostly
useful for "advanced" users.
This patch changes the storage-driver priority list, to prefer overlay2 (if
supported by the backing filesystem), and effectively makes btrfs and zfs
opt-in storage drivers.
This change does not affect existing installations; the daemon will detect
the storage driver that was previously in use (based on the presence of
storage directories in `/var/lib/docker`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When building the dev image, the Makefile generates a tag-name for the image,
based on the current git branch. As a result of this naming, old images will
collect on a developer's machine (especially when building from different
branches, for example when reviewing pull requests):
REPOSITORY TAG IMAGE ID CREATED SIZE
docker-dev HEAD 9785a8fb82f5 30 hours ago 2.13GB
docker-dev master 9785a8fb82f5 30 hours ago 2.13GB
docker-dev seccomp-closer-to-oci 9785a8fb82f5 30 hours ago 2.13GB
docker-dev move-stackdump 06882c142bfd 2 days ago 2.13GB
docker-dev add-dns-to-docker-info 2961ed1b99bd 10 days ago 2.13GB
docker-dev add-platform-info 2961ed1b99bd 10 days ago 2.13GB
docker-dev rata-seccomp-new-fields 2961ed1b99bd 10 days ago 2.13GB
docker-dev swagger-wip 2961ed1b99bd 10 days ago 2.13GB
docker-dev system-df-types 2961ed1b99bd 10 days ago 2.13GB
docker-dev use-oci-platform 2961ed1b99bd 10 days ago 2.13GB
docker-dev update-swagger-fork 3eeedecca85a 2 weeks ago 2.13GB
docker-dev remove-lcow-step5-alternative 51f9720bbc19 2 weeks ago 2.13GB
docker-dev update-s390x-ubuntu-2004 51f9720bbc19 2 weeks ago 2.13GB
docker-dev fix-image-shared-size 09e9aa46694a 2 weeks ago 2.13GB
docker-dev remove-discovery 11823223ae83 3 weeks ago 2.13GB
docker-dev daemon-config 355643e371b0 4 weeks ago 2.12GB
docker-dev jenkins-windows-containerd 68199214b860 4 weeks ago 2.11GB
docker-dev unfork-buildkit 68199214b860 4 weeks ago 2.11GB
docker-dev warn-on-non-matching-platform bc014b94017f 5 weeks ago 2.11GB
docker-dev remove-lcow 3a43c0900282 6 weeks ago 2.11GB
docker-dev remove-lcow-part5 3a43c0900282 6 weeks ago 2.11GB
docker-dev remove-lcow-step3 3a43c0900282 6 weeks ago 2.11GB
docker-dev remove-lcow-step4 3a43c0900282 6 weeks ago 2.11GB
docker-dev seccomp-unconfined-daemon 3a43c0900282 6 weeks ago 2.11GB
docker-dev update-authors 3a43c0900282 6 weeks ago 2.11GB
docker-dev payall4u-fix-creating-sandbox-when-disable-bridge 114c0f2ceb17 6 weeks ago 2.12GB
docker-dev catch-almost-all f437d2bc512b 8 weeks ago 2.12GB
docker-dev bin-criu c72894ae66f3 2 months ago 2.12GB
docker-dev bump-golang-1-14 395932141809 2 months ago 2.14GB
docker-dev upstream-systemd-units d0cb07f9473c 2 months ago 2.12GB
docker-dev bump-criu 6ed9e8fcf59f 2 months ago 2.12GB
This images are a bit of a pain to clean up, and because they are tagged,
`docker image prune` or `docker system prune` doesn't help (unless `--all` is
used).
Looking at the background of this naming, a found that it was originally added
in a95712899e, after a discussion on PR 3471.
At the time, the image name was used to check if the image needed building, and
otherwise building was skipped in the makefile.
This is no longer the case; the image is built unconditionally, and the build-
cache helps (where possible) speed up rebuilding the image.
In _theory_ having unique names would allow for multiple dev containers (from
different branches) to be started in parallel, but in most situations, the
source-code will be mounted (`BIND_MOUNT=.`), so I'm not sure if that should
be a compelling reason to keep the current naming.
This patch removes the unique tag, and will always tag the image locally as
`docker-dev:latest`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This patch, similar to d92739713c, embeds the
`LinuxSeccomp` type of the runtime-spec, so that we can support all options
provided by the spec, and decorates it with our own fields.
With this, profiles can make use of the recently added "Flags" field, to
specify flags that must be passed to seccomp(2) when installing the filter.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the error message slightly more informative, and remove the redundant
`len(config.ArchMap) != 0` check, as iterating over an empty, or 'nil' slice
is a no-op already. This allows to use a slightly more idiomatic "if ok := xx; ok"
condition.
Also move validation to the start of the loop (early return), and explicitly create
a new slice for "names" if the legacy "Name" field is used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add test to verify profile validation, and to verify that the legacy
format actually loads the profile as expected (instead of only verifying
it doesn't produce an error).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's the only location where this is used, and it's quite specific
to dockerd (not really a reusable function for external use), so
moving it into that package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It is not directly related to signal-handling, so can well live
in its own package.
Also added a variant that doesn't take a directory to write files
to, for easier consumption / better match to how it's used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "quiet" argument was only used in a single place (at daemon startup), and
every other use had to pass "false" to prevent this function from logging
warnings.
Now that SysInfo contains the warnings that occurred when collecting the
system information, we can make leave it up to the caller to use those
warnings (and log them if wanted).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We pass the SysInfo struct to all functions. Adding cg2Controllers as a
(non-exported) field makes passing around this information easier.
Now that infoCollector and infoCollectorV2 have the same signature, we can
simplify some bits and use a single slice for all "collectors".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We pass the SysInfo struct to all functions. Adding cg2GroupPath as a
(non-exported) field makes passing around this information easier.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
path.Join() already does path.Clean(), and the opts.cg2GroupPath
field is already cleaned as part of WithCgroup2GroupPath()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We pass the SysInfo struct to all functions. Adding cgMounts as a
(non-exported) field makes passing around this information easier.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it clearer that this code is the cgroups v1 equivalent of newV2().
Also moves the "options" handling to newV2() because it's currently only used
for cgroupsv2.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/containerd/compare/v1.5.2...v1.5.3
- Fix User Agent sent to registry authentication server (changes default user-
agent from "Go-http-client/1.1" to "containerd/v1.5.3")
- Fix missing Body.Close() calls on push to docker remote
- Change Wrapf of non-error to an actual error
- fixes Failed to pull image (unexpected commit digest)
- fix invalid validation error checking
- Update hcsshim to 0.8.18
- Update Go to 1.16.6
- content/local: inline sys.StatATimeAsTime()
- windows: Use GetFinalPathNameByHandle for ResolveSymbolicLink
- Fix cleanup context of teardownPodNetwork
- fixes CRI fails to invoke CNI plugin to teardown network when RunPodSandbox times out
- sandbox: send pod UID to CNI plugins as K8S_POD_UID
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/containerd/compare/v1.5.2...v1.5.3
Welcome to the v1.5.3 release of containerd!
The third patch release for containerd 1.5 updates runc to 1.0.0 and contains
various other fixes.
Notable Updates
- Update runc binary to 1.0.0
- Send pod UID to CNI plugins as K8S_POD_UID
- Fix invalid validation error checking
- Fix error on image pull resume
- Fix User Agent sent to registry authentication server
- Fix symlink resolution for disk mounts on Windows
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Fix the error message in hack/validate/vendor to specify that
hack/vendor.sh should be run instead of vndr.
- Fix hack/vendor.sh to also match on Windows paths for the whitelist.
This allows the script to be run on Windows via Git Bash.
Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
The reasoning for this change is to be able to query image shared size without having to rely on the more heavyweight `/system/df` endpoint.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
This makes it easier to add more options to the backend without having to change
the signature.
While we're changing the signature, also adding a context.Context, which is not
currently used, but probably should be at some point.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
`github.com/hashicorp/memberlist` update caused `TestNetworkDBCRUDTableEntries`
to occasionally fail, because the test would try to check whether an entry
write is propagated to all nodes, but it would not wait for all nodes to
be available before performing the write.
It could be that the failure is caused simply by improved performance of
the dependency - it could also be that some connectivity guarantee the
test depended on is not provided by the dependency anymore.
The same fix is applied to `TestNetworkDBNodeJoinLeaveIteration` due to
same issue.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
Upstream update fixes the issue where left node would be marked as
failed, which caused `TestNetworkDBIslands` to occasionally fail.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
This allows the rejoin intervals to be chosen according to the context
within which the component is used, and, in particular, this allows
lower intervals to be used within TestNetworkDBIslands test.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
Per the systemd.unit documentation:
> If this unit gets activated, the units listed will be activated as well. If one of the other units fails to activate, and an ordering dependency After= on the failing unit is set, this unit will not be started. Besides, with or without specifying After=, this unit will be stopped if one of the other units is explicitly stopped.
>
> Often, it is a better choice to use Wants= instead of Requires= in order to achieve a system that is more robust when dealing with failing services.
This should also be generally "safe" given we added `--containerd=/run/containerd/containerd.sock` to the flags we pass to `dockerd`.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
Signed-off-by: Anca Iordache <anca.iordache@docker.com>
This code is not generically useful on "unix", and contains linux-
specific code, so make it only compile on linux.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was added to support Solaris (which didn't support these
options). Solaris support was removed, so we can integrate this type
back into the "unix" type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was added to support Solaris (which didn't support these
options). Solaris support was removed, so we can integrate this type
back into the "unix" type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Put variables and functions in the same owrder between both,
to allow for easier comparing between platforms.
Also synchronised some comments/godoc between both.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The runtime spec we are using has support for these 3 fields[1], but
moby doesn't have them in its seccomp struct. This patch just adds and
copies them when they are in the profile.
DefaultErrnoRet is implemented in the runc version moby is using (it is
implemented since runc-rc95[2]) but if we create a container without
this moby patch, we don't see an error nor the expected behavior. This
is not clear for the user (the profile they specify is valid, the syntax
is ok, but the wrong behavior is seen).
This is because the DefaultErrnoRet field is not copied to the config
passed ultimately to runc (i.e. is like the field was not specified).
With this patch, we see the expected behavior.
The other two fileds are in the runtime-spec but not yet in runc (a PR
is open and targets 1.1.0 milestone). However, I took the liberty to
copy them now too for two reasons:
1. If we don't add them now and end up using a runc version that
supports them, then the error that the user will see is not clear at
all:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: listenerPath is not set: unknown.
And it is not obvious to debug for the user, as the field _is_ set in
the profile they specify (just not copied by moby to the profile moby
specifies ultimately to runc).
2. When using a runc without seccomp notify support (like today), the
error we see is the same with and without this moby patch (when using a
seccomp profile with the new fields):
docker: Error response from daemon: OCI runtime create failed: string SCMP_ACT_NOTIFY is not a valid action for seccomp: unknown.
Then, it seems like a clear win to add them now: we don't have to do it
later (that implies not clear errors to the user if we forget, like we
did with DefaultErrnoRet) and the user sees the exact same error when
using a runc version that doesn't support these fields.
[1]: Note we are vendoring version 1c3f411f041711bbeecf35ff7e93461ea6789220 and this version has these 3 fields 1c3f411f04/config-linux.md (seccomp)
[2]: https://github.com/opencontainers/runc/pull/2954/
[3]: https://github.com/opencontainers/runc/pull/2682
Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>
Ensure empty `BuildCache` field is represented as empty JSON array(`[]`)
instead of `null` to be consistent with `Images`, `Containers` etc.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
This allows stubbing the provider for a test without side effects for
other tests, making it no longer needed to reset it to its original
value in a defer()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This changes mounts.NewParser() to create a parser for the current operatingsystem,
instead of one specific to a (possibly non-matching, in case of LCOW) OS.
With the OS-specific handling being removed, the "OS" parameter is also removed
from `daemon.verifyContainerSettings()`, and various other container-related
functions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Remove the windowsparser.HasResource() override, as it was the same on both
Windows and Linux
- Move the rxLCOWDestination to the lcowParser code
- Move the rwModes variable to a generic (non-platform-specific) file, as it's
used both for the windowsParser and the linuxParser
- Some minor formatting and linting changes
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Rename image summary constructor
- Rename `newImage` into `newImageSummary`, since the returned type is
`*types.ImageSummary`
- Rename variables for clarity
- Rename `newImage` into `summary`, since the variable type is
`*types.ImageSummary`
- Rename `imagesMap` into `summaryMap`, since the value type
contained is `*types.ImageSummary`
- Only compute `DiffSize` when more than 1 reference to the layer
exists, since it is not used otherwise
- Move variable declarations closer to where they are used
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
The test sometimes failed because no error was returned:
=== Failed
=== FAIL: pkg/plugins TestClientWithRequestTimeout (0.00s)
client_test.go:254: assertion failed: expected an error, got nil: expected error
Possibly caused by a race condition, as the sleep was just 1 ms longer than the timeout;
this patch is increasing the sleep in the response to try to reduce flakiness.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were purposefully ignored before but this goes ahead and "fixes"
most of them.
Note that none of the things gosec flagged are problematic, just
quieting the linter here.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This is another one of those tools to mimic the docker network cli.
It is not needed anymore, along with an old fork of the docker flag
packages which was a fork of the go flag package.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Since this command is part of the official distribution and even
required for tests, let's move this up to the main cmd's.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This was used for testing purposes when libnetwork was in a separate
repo.
Now that it is integrated we no longer need it since dockerd and docker
cli provide the same function.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This header was used for fallbacks to v1 registries, but it's no longer
used, and marked optional / legacy in the OCI distribution-spec:
https://github.com/opencontainers/distribution-spec/blob/v1.0.0/spec.md#legacy-docker-support-http-headers
> Because of the origins this specification, the client MAY encounter
> Docker-specific headers, such as `Docker-Content-Digest`, or
> `Docker-Distribution-API-Version`. These headers are OPTIONAL and
> clients SHOULD NOT depend on them.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fixes#36911
If config file is invalid we'll exit anyhow, so this just prevents
the daemon from starting if the configuration is fine.
Mainly useful for making config changes and restarting the daemon
iff the config is valid.
Signed-off-by: Rich Horwood <rjhorwood@apple.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Anca Iordache <anca.iordache@docker.com>
I don't think this script was really used, and now that GitHub has
issue templates, it will diverge from the template we have configured,
so better to remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It turns out libseccomp is not used for building docker at all.
It is only used for building runc (and needs libseccomp > 2.4)
Signed-off-by: Frédéric Dalleau <frederic.dalleau@docker.com>
We had some hosts with quite a bit of cycling containers that ocassionally causes docker daemons to lock up.
Most prominently `docker run` commands do not respond and nothing happens anymore.
Looking at the stack trace the following is at least likely sometimes a cause to that:
Two goroutines g0 and g1 can race against each other:
* (g0) 1. getSvcRecords is called and calls (*network).Lock()
--> Network is locked.
* (g1) 2. processEndpointDelete is called, and calls (*controller).Lock()
--> Controller is locked
* (g1) 3. processEndpointDelete tries (*network).ID() which calls (*network).Lock().
* (g0) 4. getSvcRecords calls (*controller).Lock().
3./4. are deadlocked against each other since the other goroutine holds the lock they need.
References b5dc370370/network.go
Signed-off-by: Steffen Butzer <steffen.butzer@outlook.com>
- winterm: GetStdFile(): Added compatibility with "golang.org/x/sys/windows"
- winterm: fix GetStdFile() falltrough
- update deprecation message to refer to the correct replacement
- add go.mod
- Fix int overflow
- Convert int to string using rune()
full diff:
- bea5bbe245...3f7ff695ad
- d6e3b3328b...d185dfc1b5
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
openSUSE Tumbleweed was facing "x509: certificate signed by unknown authority" error,
as `/etc/ssl/ca-bundle.pem` is provided as a symlink to `../../var/lib/ca-certificates/ca-bundle.pem`,
which was not supported by `rootlesskit --copy-up=/etc` .
See rootless-containers/rootlesskit issues 225
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
These consts were used in combination with idtools utilities, which
makes it a more logical location for these consts to live.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On Linux/Unix it was just a thin wrapper for unix.Unmount(), and a no-op on Windows.
This function was not used anywhere (also not externally), so removing this without
deprecating first.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes the deprecated wrappers, so that the package no longer has
hcsshim as a dependency. These wrappers were no longer used in our code,
and were deprecated in the 20.10 release (giving external consumers to
replace the deprecated ones).
Note that there are two consts which were unused, but for which there is
no replacement in golang.org/x/sys;
const (
PROCESS_TRUST_LABEL_SECURITY_INFORMATION = 0x00000080
ACCESS_FILTER_SECURITY_INFORMATION = 0x00000100
)
PROCESS_TRUST_LABEL_SECURITY_INFORMATION is documented as "reserved", and I could
not find clear documentation about ACCESS_FILTER_SECURITY_INFORMATION, so not sure
if they must be included in golang.org/x/sys: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/23e75ca3-98fd-4396-84e5-86cd9d40d343
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes the type better reflect the difference with the "runtime" profile;
our local type is used to generate a runtime-spec seccomp profile and extends
the runtime-spec type with additional fields; adding a "Name" field for backward
compatibility with older JSON representations, additional "Comment" metadata,
and conditional rules ("Includes", "Excludes") used during generation to adjust
the profile based on the container (capabilities) and host's (architecture, kernel)
configuration.
This change introduces one change in the type; the "runtime-spec" type uses a
`[]LinuxSeccompArg` for the `Args` field, whereas the local type used pointers;
`[]*LinuxSeccompArg`.
In addition, the runtime-spec Syscall type brings a new `ErrnoRet` field, allowing
the profile to specify the errno code returned for the syscall, which allows
changing the default EPERM for specific syscalls.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is to allow quota package (without tests) to be built without cgo.
makeBackingFsDev was used in helpers but not defined in projectquota_unsupported.go
Also adjust some GoDoc to follow the standard format.
Signed-off-by: Tibor Vass <tibor@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This should make sure the link is still meaningful if that file drastically changes (which should make it easier to trace where the interesting block of code moved to and how it changes over time).
Also, add TODO items for Go 1.15+ and 1.16+ where we can "pie" more builds.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
It's already omitted for ppc64 in
hack/dockerfile/install/install.sh
not using wildcard, because GOARCH=ppc64le supports pie
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
The message changed from "is deprecated" to "has been deprecated":
client/hijack.go:85:16: SA1019: httputil.NewClientConn has been deprecated since Go 1.0: Use the Client or Transport in package net/http instead. (staticcheck)
clientconn := httputil.NewClientConn(conn, nil)
^
integration/plugin/authz/authz_plugin_test.go:180:7: SA1019: httputil.NewClientConn has been deprecated since Go 1.0: Use the Client or Transport in package net/http instead. (staticcheck)
c := httputil.NewClientConn(conn, nil)
^
integration/plugin/authz/authz_plugin_test.go:479:12: SA1019: httputil.NewClientConn has been deprecated since Go 1.0: Use the Client or Transport in package net/http instead. (staticcheck)
client := httputil.NewClientConn(conn, nil)
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Probably needs a similar change as c208f03fbd,
but this code makes my head spin, so for now suppressing, and created a
tracking issue:
daemon/graphdriver/graphtest/graphtest_unix.go:305:12: unsafeptr: possible misuse of reflect.SliceHeader (govet)
header := *(*reflect.SliceHeader)(unsafe.Pointer(&buf))
^
daemon/graphdriver/graphtest/graphtest_unix.go:308:36: unsafeptr: possible misuse of reflect.SliceHeader (govet)
data := *(*[]byte)(unsafe.Pointer(&header))
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/list.go:556:18: var-declaration: should omit type bool from declaration of var shouldSkip; it will be inferred from the right-hand side (revive)
shouldSkip bool = true
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/config/config_unix.go:92:21: error-strings: error strings should not be capitalized or end with punctuation or a newline (revive)
return fmt.Errorf("Default cgroup namespace mode (%v) is invalid. Use \"host\" or \"private\".", cm) // nolint: golint
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/archive/copy.go:357:16: G110: Potential DoS vulnerability via decompression bomb (gosec)
if _, err = io.Copy(rebasedTar, srcTar); err != nil {
^
Ignoring GoSec G110. See https://github.com/securego/gosec/pull/433
and https://cure53.de/pentest-report_opa.pdf, which recommends to
replace io.Copy with io.CopyN7. The latter allows to specify the
maximum number of bytes that should be read. By properly defining
the limit, it can be assured that a GZip compression bomb cannot
easily cause a Denial-of-Service.
After reviewing, this should not affect us, because here we do not
read into memory.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also looks like a false positive, but given that these were basically
testing for the `errdefs.Conflict` and `errdefs.NotFound` interfaces, I
replaced these with those;
daemon/stats/collector.go:154:6: type `notRunningErr` is unused (unused)
type notRunningErr interface {
^
daemon/stats/collector.go:159:6: type `notFoundErr` is unused (unused)
type notFoundErr interface {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/devicemapper/devmapper.go:383:28: S1039: unnecessary use of fmt.Sprintf (gosimple)
if err := task.setMessage(fmt.Sprintf("@cancel_deferred_remove")); err != nil {
^
integration/plugin/graphdriver/external_test.go:321:18: S1039: unnecessary use of fmt.Sprintf (gosimple)
http.Error(w, fmt.Sprintf("missing id"), 409)
^
integration-cli/docker_api_stats_test.go:70:31: S1039: unnecessary use of fmt.Sprintf (gosimple)
_, body, err := request.Get(fmt.Sprintf("/info"))
^
integration-cli/docker_cli_build_test.go:4547:19: S1039: unnecessary use of fmt.Sprintf (gosimple)
"--build-arg", fmt.Sprintf("FOO1=fromcmd"),
^
integration-cli/docker_cli_build_test.go:4548:19: S1039: unnecessary use of fmt.Sprintf (gosimple)
"--build-arg", fmt.Sprintf("FOO2="),
^
integration-cli/docker_cli_build_test.go:4549:19: S1039: unnecessary use of fmt.Sprintf (gosimple)
"--build-arg", fmt.Sprintf("FOO3"), // set in env
^
integration-cli/docker_cli_build_test.go:4668:32: S1039: unnecessary use of fmt.Sprintf (gosimple)
cli.WithFlags("--build-arg", fmt.Sprintf("tag=latest")))
^
integration-cli/docker_cli_build_test.go:4690:32: S1039: unnecessary use of fmt.Sprintf (gosimple)
cli.WithFlags("--build-arg", fmt.Sprintf("baz=abc")))
^
pkg/jsonmessage/jsonmessage_test.go:255:4: S1039: unnecessary use of fmt.Sprintf (gosimple)
fmt.Sprintf("ID: status\n"),
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/volumes_unix_test.go:228:13: SA4001: &*x will be simplified to x. It will not copy x. (staticcheck)
mp: &(*c.MountPoints["/jambolan"]), // copy the mountpoint, expect no changes
^
daemon/logger/local/local_test.go:214:22: SA4001: &*x will be simplified to x. It will not copy x. (staticcheck)
dst.PLogMetaData = &(*src.PLogMetaData)
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
client/request.go:245:2: S1031: unnecessary nil check around range (gosimple)
if headers != nil {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/logger/journald/read.go:128:3 comment on exported function `CErr` should be of the form `CErr ...`
daemon/logger/journald/read.go:131:36: unnecessary conversion (unconvert)
return C.GoString(C.strerror(C.int(-ret)))
^
daemon/logger/journald/read.go:380:2: S1023: redundant `return` statement (gosimple)
return
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These should be ok to ignore for the purpose they're used
pkg/namesgenerator/names-generator.go:843:36: G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec)
name := fmt.Sprintf("%s_%s", left[rand.Intn(len(left))], right[rand.Intn(len(right))])
^
pkg/namesgenerator/names-generator.go:849:36: G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec)
name = fmt.Sprintf("%s%d", name, rand.Intn(10))
^
testutil/stringutils.go:11:18: G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec)
b[i] = letters[rand.Intn(len(letters))]
^
pkg/namesgenerator/names-generator.go:849:36: G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec)
name = fmt.Sprintf("%s%d", name, rand.Intn(10))
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Both getDynamicPortRange() and sanitizePortRange() could produce
and error, and the error message was currently discarded, silently
falling back to using the default port range.
This patch:
- Moves the fallback message from getDynamicPortRange() to getDefaultPortRange(),
which is where the actual fallback occurs.
- Logs the fallback message and the error that causes the fallback.
The message/error is currently printed at the INFO level, but could be raised
to a WARN, depending on what kind of situations can cause the error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The second (sandbox) argument was unused, and it was only
used in a single location, so we may as well inline the
check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This argument was used to detect conflicts, but was later removed in
1c73b1c99c14d7f048a2318a3caf589865c76fad.
However, it was never removed, and we were still getting a list
of all networks, without using the results.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
A node is no longer using its load balancer IP address when it no longer
has tasks that use the network that requires that load balancer. When
this occurs, the swarmkit manager will free that IP in IPAM, and may
reaassign it.
When a task shuts down cleanly, it attempts removal of the networks it
uses, and if it is the last task using those networks, this removal
succeeds, and the load balancer IP is freed.
However, this behavior is absent if the container fails. Removal of the
networks is never attempted.
To address this issue, I amend the executor. Whenever a node load
balancer IP is removed or changed, that information is passedd to the
executor by way of the Configure method. By keeping track of the set of
node NetworkAttachments from the previous call to Configure, we can
determine which, if any, have been removed or changed.
At first, this seems to create a race, by which a task can be attempting
to start and the network is removed right out from under it. However,
this is already addressed in the controller. The controller will attempt
to recreate missing networks before starting a task.
Signed-off-by: Drew Erny <derny@mirantis.com>
Trying to avoid logging code in "libraries" used elsewhere.
If this debug log is important, it should be easy to add in code
that's calling it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is just a first pass to axe some low hanging fruit. At this point, the project is generally mature enough that I do not believe most of this is necessary.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
log statement should reflect how long it actually waited, not how long
it theoretically could wait based on the 'seconds' integer passed in.
Signed-off-by: Cam <gh@sparr.email>
This takes the same approach as was implemented on `docker build`, where a warning
is printed if `FROM --platform=...` is used (added in 399695305c)
Before:
docker rmi armhf/busybox
docker pull --platform=linux/s390x armhf/busybox
Using default tag: latest
latest: Pulling from armhf/busybox
d34a655120f5: Pull complete
Digest: sha256:8e51389cdda2158935f2b231cd158790c33ae13288c3106909324b061d24d6d1
Status: Downloaded newer image for armhf/busybox:latest
docker.io/armhf/busybox:latest
With this change:
docker rmi armhf/busybox
docker pull --platform=linux/s390x armhf/busybox
Using default tag: latest
latest: Pulling from armhf/busybox
d34a655120f5: Pull complete
Digest: sha256:8e51389cdda2158935f2b231cd158790c33ae13288c3106909324b061d24d6d1
Status: Downloaded newer image for armhf/busybox:latest
WARNING: image with reference armhf/busybox was found but does not match the specified platform: wanted linux/s390x, actual: linux/arm64
docker.io/armhf/busybox:latest
And daemon logs print:
WARN[2021-04-26T11:19:37.153572667Z] ignoring platform mismatch on single-arch image error="image with reference armhf/busybox was found but does not match the specified platform: wanted linux/s390x, actual: linux/arm64" image=armhf/busybox
When pulling without specifying `--platform, no warning is currently printed (but we can add a warning in future);
docker rmi armhf/busybox
docker pull armhf/busybox
Using default tag: latest
latest: Pulling from armhf/busybox
d34a655120f5: Pull complete
Digest: sha256:8e51389cdda2158935f2b231cd158790c33ae13288c3106909324b061d24d6d1
Status: Downloaded newer image for armhf/busybox:latest
docker.io/armhf/busybox:latest
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- daemon.WithRootless(): make sure ROOTLESSKIT_PARENT_EUID is valid int
- daemon.RawSysInfo(): minor simplification, and rename variable that
clashed with imported package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The BurntSushi project is no longer maintained, and the container ecosystem
is moving to use the pelletier/go-toml project instead.
This patch moves libnetwork to use the pelletier/go-toml library, to reduce
our dependency tree and use the same library in all places.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The awk dependency is an issue when running check-config.sh on systems
without awk. The use of awk can be replaced with sed, which improves
portability.
The PR code review discussion iterated via grep to this final sed
version that is all Tianon Gravi's art.
Co-authored-by: Tianon Gravi <admwiggin@gmail.com>
Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
The LCOW implementation in dockerd has been deprecated in favor of re-implementation
in containerd (in progress). Microsoft started removing the LCOW V1 code from the
build dependencies we use in Microsoft/opengcs (soon to be part of Microsoft/hcshhim),
which means that we need to start removing this code.
This first step removes the lcow graphdriver, the LCOW initialization code, and
some LCOW-related utilities.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: ab34263943...6772e930b6
- http/httpproxy: match http scheme when selecting http_proxy
- drop support for pre-1.12 direct syscalls on darwin
- x/net/http2: reject HTTP/2 Content-Length headers containing a sign
- http2/h2i: use x/term instead of x/crypto/ssh/terminal
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These are failing in CI because something is not enabled.
Its not clear that these tests ever worked because they were not
actually running while in the libnetwork repo, which was only testing
Linux.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
libnetwork does different stuff depending on if you are running the
tests in a container or not... without telling it we are in a container
a bunch of the tests actually fail.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
After moving libnetwork we have a few extra cmd's.
Some of these are using urfave/cli so we need to vendor that in.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This utility was added after 19.03, and is only used in the daemon code
itself, so we can un-export it, until there's an external use for it.
Also updated the description, because the runc code already copied it
from coreos/go-systemd, so better to describe the actual source.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The correct name is `com.docker.network.container_iface_prefix`, but
the changelog accidentally used `interface` instead of `iface`, because
the libnetwork pull request used that as a title.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/moby/buildkit/compare/v0.8.2...v0.8.3
- vendor containerd (required for rootless overlayfs on kernel 5.11)
- not included to avoid depending on a fork
- Add retry on image push 5xx errors
- contenthash: include basename in content checksum for wildcards
- Fix missing mounts in execOp cache map
- Add regression test for run cache not considering mounts
- Add hack to preserve Dockerfile RUN cache compatibility after mount cache bugfix
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On systems that do not have bash, the current bash-based
check-config.sh won't run. Making check-config.sh a POSIX shell script
instead makes it more portable.
Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
Other Unix platforms (e.g. Darwin) are also affected by the Go
runtime sending SIGURG.
This patch changes how we match the signal by just looking for the
"URG" name, which should handle any platform that has this signal
defined in the SignalMap.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 6e2cb13661...f2269e66cd
- support SO_SNDBUF/SO_RCVBUF handling
- Support Go Modules
- license clarificaton
- ci: drop 1.6, 1.7, 1.8 support
- Add support for SocketConfig
- support goarch mips64le architecture.
- fix possible socket leak when bind fails
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update libnetwork to make `docker run -p 80:80` functional again on environments
with kernel boot parameter `ipv6.disable=1`.
full diff: b3507428be...64b7a4574d
- fix port forwarding with ipv6.disable=1
- fixes moby/moby/42288 Docker 20.10.6: all containers stopped and cannot start if ipv6 is disabled on host
- fixes docker/libnetwork/2629 Network issue with IPv6 following update to version 20.10.6
- fixesdocker/for-linux/1233 Since 20.10.6 it's not possible to run docker on a machine with disabled IPv6 interfaces
- vendor: github.com/ishidawataru/sctp f2269e66cdee387bd321445d5d300893449805be
- Enforce order of lock acquisitions on network/controller, fixes#2632
- fixes docker/libnetwork/2632 Name resolution stuck due to deadlock between different network struct methods
- fixes moby/moby/42032 Docker deamon get's stuck, can't serve DNS requests
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Logging to daemon logs every time there's an error with a log driver can be
problematic since daemon logs can grow rapidly, potentially exhausting disk
space.
Instead, it's preferable to limit the rate at which log driver errors are allowed
to be written. By default, this limit is 333 entries per second max.
Signed-off-by: Angel Velazquez <angelcar@amazon.com>
Samuel has been a long-time supporter of the container ecosystem. He is an active security advisor for the containerd project, has spent a lot of time with the awslogs logging driver, and has been helping with review and triage of pull requests here for a while, and I believe his experience with both Go and containers at large would be a very valuable addition to the maintainer team.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
Do not handle SIGURG on Linux, as in go1.14+, the go runtime issues
SIGURG as an interrupt to support preemptable system calls on Linux.
This issue was caught in TestCatchAll, which could fail when updating to Go 1.14 or above;
=== Failed
=== FAIL: pkg/signal TestCatchAll (0.01s)
signal_linux_test.go:32: assertion failed: urgent I/O condition (string) != continued (string)
signal_linux_test.go:32: assertion failed: continued (string) != hangup (string)
signal_linux_test.go:32: assertion failed: hangup (string) != child exited (string)
signal_linux_test.go:32: assertion failed: child exited (string) != illegal instruction (string)
signal_linux_test.go:32: assertion failed: illegal instruction (string) != floating point exception (string)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/opencontainers/runc/compare/v1.0.0-rc94...v1.0.0-rc95
Release notes:
This release of runc contains a fix for CVE-2021-30465, and users are
strongly recommended to update (especially if you are providing
semi-limited access to spawn containers to untrusted users).
Aside from this security fix, only a few other changes were made since
v1.0.0-rc94 (the only user-visible change was the addition of support
for defaultErrnoRet in seccomp profiles).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was originally added in 833444c0d6,
at which time buildx did not yet have a release, so we had to build
from source.
Now that buildx has binary releases on GitHub, we should be able to
consume those binaries instead of building.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
=== RUN TestServicePlugin
plugin_test.go:42: assertion failed: error is not nil: error building basic plugin bin: no required module provides package github.com/docker/docker/testutil/fixtures/plugin/basic: go.mod file not found in current directory or any parent directory; see 'go help modules'
: exit status 1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
INFO: Running integration tests at 05/17/2021 12:54:50...
INFO: DOCKER_HOST at tcp://127.0.0.1:2357
INFO: Integration API tests being run from the host:
INFO: make.ps1 starting at 05/17/2021 12:54:50
powershell.exe : go: cannot find main module, but found vendor.conf in D:\gopath\src\github.com\docker\docker
At D:\gopath\src\github.com\docker\docker@tmp\durable-1ed00396\powershellWrapper.ps1:3 char:1
+ & powershell -NoProfile -NonInteractive -ExecutionPolicy Bypass -Comm ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (go: cannot find...m\docker\docker:String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError
to create a module there, run:
go mod init
INFO: make.ps1 ended at 05/17/2021 12:54:51
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Updates the certificates to account for current versions of Go expecting
SANs to be used instead of the Common Name field:
FAIL: s390x.integration.plugin.authz TestAuthZPluginTLS (0.53s)
[2020-07-26T09:36:58.638Z] authz_plugin_test.go:132: assertion failed:
error is not nil: error during connect: Get "https://localhost:4271/v1.41/version":
x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Certificates were originally added in c000cb6471,
but did not include a script to generate them. Current versions of Go expect
certificates to use SAN instead of Common Name fields, so updating the script
to include those;
x509: certificate relies on legacy Common Name field, use SANs or temporarily
enable Common Name matching with GODEBUG=x509ignoreCN=0
Some fields were updated to be a bit more descriptive (instead of "replaceme"),
and the `-text` option was used to include a human-readable variant of the
content.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tests were no longer valid on Go 1.16; related to https://tip.golang.org/doc/go1.16#path/filepath
> The Match and Glob functions now return an error if the unmatched part of
> the pattern has a syntax error. Previously, the functions returned early on
> a failed match, and thus did not report any later syntax error in the pattern.
Causing the test to fail:
=== RUN TestMatches
fileutils_test.go:388: assertion failed: error is not nil: syntax error in pattern: pattern="a\\" text="a"
--- FAIL: TestMatches (0.00s)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We perform a DCO check before we run all other tests, so we can skip it
as part of the validate step.
Leaving the line in for visibility, and in case we switch from Jenkins
to (e.g.) GitHub actions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Schema1 images can not have a config based cache key
before the layers are pulled. Avoid validation and reuse
manifest digest as a second key.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
This changes CI to skip these platforms by default. The ppc64le and s390x
machines are "pet machines", configuration may be outdated, and these
machines are known to be flaky.
Building and verifying packages for these platforms is being handed
over to the IBM team.
We can still run these platforms for specific pull requests by selecting
the checkboxes.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This unit file was created when we packaged rpms without the
socket activation unit, but that's no longer the case.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The VolumesService did not have information wether or not a volume
was _created_ or if a volume already existed in the driver, and
the existing volume was used.
As a result, multiple "create" events could be generated for the
same volume. For example:
1. Run `docker events` in a shell to start listening for events
2. Create a volume:
docker volume create myvolume
3. Start a container that uses that volume:
docker run -dit -v myvolume:/foo busybox
4. Check the events that were generated:
2021-02-15T18:49:55.874621004+01:00 volume create myvolume (driver=local)
2021-02-15T18:50:11.442759052+01:00 volume create myvolume (driver=local)
2021-02-15T18:50:11.487104176+01:00 container create 45112157c8b1382626bf5e01ef18445a4c680f3846c5e32d01775dddee8ca6d1 (image=busybox, name=gracious_hypatia)
2021-02-15T18:50:11.519288102+01:00 network connect a19f6bb8d44ff84d478670fa4e34c5bf5305f42786294d3d90e790ac74b6d3e0 (container=45112157c8b1382626bf5e01ef18445a4c680f3846c5e32d01775dddee8ca6d1, name=bridge, type=bridge)
2021-02-15T18:50:11.526407799+01:00 volume mount myvolume (container=45112157c8b1382626bf5e01ef18445a4c680f3846c5e32d01775dddee8ca6d1, destination=/foo, driver=local, propagation=, read/write=true)
2021-02-15T18:50:11.864134043+01:00 container start 45112157c8b1382626bf5e01ef18445a4c680f3846c5e32d01775dddee8ca6d1 (image=busybox, name=gracious_hypatia)
5. Notice that a "volume create" event is created twice;
- once when `docker volume create` was ran
- once when `docker run ...` was ran
This patch moves the generation of (most) events to the volume _store_, and only
generates an event if the volume did not yet exist.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/moby/sys/compare/symlink/v0.1.0...mountinfo/v0.4.1
github.com/moby/sys/mountinfo v0.4.1
----------------------------------------------
- Fix PrefixFilter() being too greedy
- TestMountedBy*: add missing pre-checks
- Documentation improvements
github.com/moby/sys/mount v0.2.0
----------------------------------------------
Breaking changes:
- Remove stub-implementations for Windows for `Mount()`, `Unmount()`,
`RecursiveUnmount()`, `MergeTmpfsOptions()`
Fixes and improvements:
- `go.mod`: update github.com/moby/sys/mountinfo to v0.4.0
- use `MNT_*` flags from golang.org/x/sys/unix on freebsd
- add support for OpenBSD in addition to FreeBSD
- fix package overview documentation not showing
- `RecursiveUnmount()`: minor improvements
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This option was originally added in d05aa418b0,
and moved in 8b15839ee8 (after which it temporarily
went to the docker/engine-api repository, and was brought back in this repository
in 91e197d614).
However, it looks like this field was never used; the API always returns the standard
information, and the "--quiet" option for `docker ps` is implemented on the CLI
side, which uses different formatting when setting this option;
2ec468e284/api/client/ps.go (L73-L79)
This patch removes the unused field,
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Welcome to the v1.5.0 release of containerd!
The sixth major release of containerd includes many stability improvements
and code organization changes to make contribution easier and make future
features cleaner to develop. This includes bringing CRI development into the
main containerd repository and switching to Go modules. This release also
brings support for the Node Resource Interface (NRI).
Highlights
--------------------------------------------------------------------------------
*Project Organization*
- Merge containerd/cri codebase into containerd/containerd
- Move to Go modules
- Remove selinux build tag
- Add json log format output option for daemon log
*Snapshots*
- Add configurable overlayfs path
- Separate overlay implementation from plugin
- Native snapshotter configuration and plugin separation
- Devmapper snapshotter configuration and plugin separation
- AUFS snapshotter configuration and plugin separation
- ZFS snapshotter configuration and plugin separation
- Pass custom snapshot labels when creating snapshot
- Add platform check for snapshotter support when unpacking
- Handle loopback mounts
- Support userxattr mount option for overlay in user namespace
- ZFS snapshotter implementation of usage
*Distribution*
- Improve registry response errors
- Improve image pull performance over HTTP 1.1
- Registry configuration package
- Add support for layers compressed with zstd
- Allow arm64 to fallback to arm (v8, v7, v6, v5)
*Runtime*
- Add annotations to containerd task update API
- Add logging binary support when terminal is true
- Runtime support on FreeBSD
*Windows*
- Implement windowsDiff.Compare to allow outputting OCI images
- Optimize WCOW snapshotter to commit writable layers as read-only parent layers
- Optimize LCOW snapshotter use of scratch layers
*CRI*
- Add NRI injection points cri#1552
- Add support for registry host directory configuration
- Update privileged containers to use current capabilities instead of known capabilities
- Add pod annotations to CNI call
- Enable ocicrypt by default
- Support PID NamespaceMode_TARGET
Impactful Client Updates
--------------------------------------------------------------------------------
This release has changes which may affect projects which import containerd.
*Switch to Go modules*
containerd and all containerd sub-repositories are now using Go modules. This
should help make importing easier for handling transitive dependencies. As of
this release, containerd still does not guarantee client library compatibility
for 1.x versions, although best effort is made to minimize impact from changes
to exported Go packages.
*CRI plugin moved to main repository*
With the CRI plugin moving into the main repository, imports under github.com/containerd/cri/
can now be found github.com/containerd/containerd/pkg/cri/.
There are no changes required for end users of CRI.
*Library changes*
oci
The WithAllCapabilities has been removed and replaced with WithAllCurrentCapabilities
and WithAllKnownCapabilities. WithAllKnownCapabilities has similar
functionality to the previous WithAllCapabilities with added support for newer
capabilities. WithAllCurrentCapabilities can be used to give privileged
containers the same set of permissions as the calling process, preventing errors
when privileged containers attempt to get more permissions than given to the
caller.
*Configuration changes*
New registry.config_path for CRI plugin
registry.config_path specifies a directory to look for registry hosts
configuration. When resolving an image name during pull operations, the CRI
plugin will look in the <registry.config_path>/<image hostname>/ directory
for host configuration. An optional hosts.toml file in that directory may be
used to configure which hosts will be used for the pull operation as well
host-specific configurations. Updates under that directory do not require
restarting the containerd daemon.
Enable registry.config_path in the containerd configuration file.
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/etc/containerd/certs.d"
Configure registry hosts, such as /etc/containerd/certs.d/docker.io/hosts.toml
for any image under the docker.io namespace (any image on Docker Hub).
server = "https://registry-1.docker.io"
[host."https://public-mirror.example.com"]
capabilities = ["pull"]
[host."https://docker-mirror.internal"]
capabilities = ["pull", "resolve"]
ca = "docker-mirror.crt"
If no hosts.toml configuration exists in the host directory, it will fallback
to check certificate files based on Docker's certificate file
pattern (".crt" files for CA certificates and ".cert"/".key" files for client
certificates).
*Deprecation of registry.mirrors and registry.configs in CRI plugin*
Mirroring and TLS can now be configured using the new registry.config_path
option. Existing configurations may be migrated to new host directory
configuration. These fields are only deprecated with no planned removal,
however, these configurations cannot be used while registry.config_path is
defined.
*Version 1 schema is deprecated*
Version 2 of the containerd configuration toml is recommended format and the
default. Starting this version, a deprecation warning will be logged when
version 1 is used.
To check version, see the version value in the containerd toml configuration.
version=2
FreeBSD Runtime Support (Experimental)
--------------------------------------------------------------------------------
This release includes changes that allow containerd to run on FreeBSD with a
compatible runtime, such as runj. This
support should be considered experimental and currently there are no official
binary releases for FreeBSD. The runtimes used by containerd are maintained
separately and have their own stability guarantees. The containerd project
strives to be compatible with any runtime which aims to implement containerd's
shim API and OCI runtime specification.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
A temporary directory was created but not removed at the end of the test.
The missing remove directory call is added now.
Signed-off-by: Muhammad Zohaib Aslam <zohaibse011@gmail.com>
The underlying Loggers Close() function can be called with the the
run() goroutine still writing to the driver. This is causing the
fluentd-golang-logger to panic cause it doesn't defensively check
for the closing of the channel before writing to it.
It relies on the docker daemon to keep the contract of not calling Log()
if Close() has already been called.
Contributions by: James Johnston <james.johnston@thumbtack.com>
Nathan Wong <nathanw@thumbtack.com>
Signed-off-by: Anuj Varma <anujvarma@thumbtack.com>
Kernel 5.11 introduced support for rootless overlayfs, but incompatible with SELinux.
On the other hand, fuse-overlayfs is compatible.
Close issue 42333
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Previously, running dockerd-rootless.sh on SELinux-enabled hosts
was failing with "can't open lock file /run/xtables.lock: Permission denied" error.
(issue 41230).
This commit avoids hitting the error by relabeling /run in the RootlessKit child.
The actual /run on the parent is unaffected.
e6fc34b71a/libpod/networking_linux.go (L396-L401)
Tested on Fedora 34
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Make `docker run -p 80:80` functional again on environments with kernel boot parameter `ipv6.disable=1`.
Fix moby/moby issue 42288
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Relates to a82fff6377/docs/packages.md (proxies)
> (..) the first four of these are the standard built-in build-arg options
> available for `docker build`
> (..) The last, `all_proxy`, is a standard var used for socks proxying. Since
> it is not built into `docker build`, if you want to use it, you will need to
> add the following line to the dockerfile:
>
> ARG all_proxy
Given the we support all other commonly known proxy env-vars by default, it makes
sense to add this one as well.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tests would panic;
- in WithRLimits(), because HostConfig was not set;
470ae8422f/daemon/oci_linux.go (L46-L47)
- in daemon.mergeUlimits(), because daemon.configStore was not set;
470ae8422f/daemon/oci_linux.go (L1069)
This panic was not discovered because the current version of runc/libcontainer that we vendor
would not always return false for `apparmor.IsEnabled()` when running docker-in-docker or if
`apparmor_parser` is not found. Starting with v1.0.0-rc93 of libcontainer, this is no longer
the case (changed in bfb4ea1b1b)
This patch;
- changes the tests to initialize Daemon.configStore and Container.HostConfig
- Combines TestExecSetPlatformOpt and TestExecSetPlatformOptPrivileged into a new test
(TestExecSetPlatformOptAppArmor)
- Runs the test both if AppArmor is enabled and if not (in which case it tests
that the container's AppArmor profile is left empty).
- Adds a FIXME comment for a possible bug in execSetPlatformOpts, which currently
prefers custom profiles over "privileged".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 55eda46b22...19ee068f93
brings in updated protobufs, generated with gogo/protobuf v1.3.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Starting `dockerd-rootless.sh` checks that `$HOME` is writeable, but does not
require it to be so.
Make the check more precise, and check that it actually exists and is a
directory.
Signed-off-by: Hugo Osvaldo Barrera <hugo@barrera.io>
Whether or not the command path is in the error message is a an
implementation detail.
For example, on Windows the only reason this ever matched was because it
dumped the entire container config into the error message, but this had
nothing to do with the actual error.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
this refactors the Stop command to fix a few issues and behaviors that
dont seem completely correct:
1. first it fixes a situation where stop could hang forever (#41579)
2. fixes a behavior where if sending the
stop signal failed, then the code directly sends a -9 signal. If that
fails, it returns without waiting for the process to exit or going
through the full docker kill codepath.
3. fixes a behavior where if sending the stop signal failed, then the
code sends a -9 signal. If that succeeds, then we still go through the
same stop waiting process, and may even go through the docker kill path
again, even though we've already sent a -9.
4. fixes a behavior where the code would wait the full 30 seconds after
sending a stop signal, even if we already know the stop signal failed.
fixes#41579
Signed-off-by: Cam <gh@sparr.email>
Before this change, cleanup of the btrfs driver (occuring on each daemon
shutdown) resulted in disabling quotas. It was done with an assumption
that quotas can be enabled or disabled on a subvolume level, which is
not true - enabling or disabling quota is always done on a filesystem
level.
That was leading to disabling quota on btrfs filesystems on each daemon
shutdown.
This change fixes that behavior and removes misleading `subvol` prefix
from functions and methods which set up quota (on a filesystem level).
Fixes: #34593
Fixes: 401c8d1767 ("Add disk quota support for btrfs")
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
These tests fail, possibly due to changes in the kernel. Temporarily skipping
these tests, so that we at least have some coverage on these windows versions
in this repo, and we can look into this specific issue separately.;
=== FAIL: github.com/docker/docker/pkg/archive TestChangesDirsEmpty (0.21s)
changes_test.go:261: Reported changes for identical dirs: [{\dirSymlink C}]
=== FAIL: github.com/docker/docker/pkg/archive TestChangesDirsMutated (0.14s)
changes_test.go:391: unexpected change "C \\dirSymlink" "\\dirnew"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Images for Windows 2022 (SAC) are not yet available, so using insider builds
in the meantime; mcr.microsoft.com/windows/servercore/insider:10.0.20295.1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds a stage to test against the current SAC (Semi Annual Channel),
which allows us to catch possible regressions on upcoming LTS versions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The runc/libcontainer apparmor package on master no longer checks if apparmor_parser
is enabled, or if we are running docker-in-docker.
While those checks are not relevant to runc (as it doesn't load the profile), these
checks _are_ relevant to us (and containerd). So switching to use the containerd
apparmor package, which does include the needed checks.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This patch picks the first commit in containerd that exports the AppArmor package
functions to keep the vendor diff small (there are some updates to that package
after this, but those will be included in other patches).
full diff: fbf1a72de7...55eda46b22
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is the first commit after the containerd transition to go modules. Using this
as an intermediate version to allow us to track what dependency changes are
introduced in the containerd dependency since.
full diff: b9092fae15...fbf1a72de7
There were some fix-ups in the PR after adding go modules that updated dependencies,
which will be aligned in the next commit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is the last commit before containerd switched to using go modules. Using this
as an intermediate version to allow us to more easily track what dependency updates
containerd has.
full diff: 0edc412565...b9092fae15
relevant changes in vendored code:
- Do not hardcode "amd64" on LCOW and Windows-related files
- Optimize Windows and LCOW snapshotters to only create scratch layer on the final snapshot
- Add annotations to task update request api
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Inode numbers are guaranteed to be unique only within a filesystem.
As such there is an edge case where these predicates are true on a
non-btrfs filesystem.
Closes#42271
Signed-off-by: Brett Milford <brettmilford@gmail.com>
This was changed as part of a refactor to use containerd dist code. The
problem is the OCI media types are not compatible with older versions of
Docker.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This test has been flaky for a long time, failing with:
--- FAIL: TestInspect (12.04s)
inspect_test.go:39: timeout hit after 10s: waiting for tasks to enter run state. task failed with error: task: non-zero exit (1)
While looking through logs, noticed tasks were started, entering RUNNING stage,
and then exited, to be started again.
state.transition="STARTING->RUNNING"
...
msg="fatal task error" error="task: non-zero exit (1)"
...
state.transition="RUNNING->FAILED"
Looking for possible reasons, first considering network issues (possibly we ran
out of IP addresses or networking not cleaned up), then I spotted the issue.
The service is started with;
Command: []string{"/bin/top"},
Args: []string{"-u", "root"},
The `-u root` is not an argument for the service, but for `/bin/top`. While the
Ubuntu/Debian/GNU version `top` has a -u/-U option;
docker run --rm ubuntu:20.04 top -h 2>&1 | grep '\-u'
top -hv | -bcEHiOSs1 -d secs -n max -u|U user -p pid(s) -o field -w [cols]
The *busybox* version of top does not:
docker run --rm busybox top --help 2>&1 | grep '\-u'
So running `top -u root` would cause the task to fail;
docker run --rm busybox top -u root
top: invalid option -- u
...
echo $?
1
As a result, the service went into a crash-loop, and because the `poll.WaitOn()`
was running with a short interval, in many cases would _just_ find the RUNNING
state, perform the `service inspect`, and pass, but in other cases, it would not
be that lucky, and continue polling untill we reached the 10 seconds timeout,
and mark the test as failed.
Looking for history of this option (was it previously using a different image?) I
found this was added in 6cd6d8646a, but probably
just missed during review.
Given that the option is only set to have "something" to inspect, I replaced
the `-u root` with `-d 5`, which makes top refresh with a 5 second interval.
Note that there is another test (`TestServiceListWithStatuses) that uses the same
spec, however, that test is skipped based on API version of the test-daemon, and
(to be looked into), when performing that check, no API version is known, causing
the test to (always?) be skipped:
=== RUN TestServiceListWithStatuses
--- SKIP: TestServiceListWithStatuses (0.00s)
list_test.go:34: versions.LessThan(testEnv.DaemonInfo.ServerVersion, "1.41")
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Taking the same approach as was taken in containerd
The new library has a slightly different output;
- keys at the same level are sorted alphabetically
- empty sections not omitted (`proxy_plugins`, `stream_processors`, `timeouts`),
which could possibly be be addressed with an "omitempty" in containerd's struct.
- empty slices are not omitted (`imports`, `required_plugins`)
After sorting the "before" configuration the diff looks like this:
```patch
diff --git a/config-before-sorted.toml b/config-after.toml
index cc771ce7ab..43a727f589 100644
--- a/config-before-sorted.toml
+++ b/config-after.toml
@@ -1,6 +1,8 @@
disabled_plugins = ["cri"]
+imports = []
oom_score = 0
plugin_dir = ""
+required_plugins = []
root = "/var/lib/docker/containerd/daemon"
state = "/var/run/docker/containerd/daemon"
version = 0
@@ -37,6 +39,12 @@ version = 0
shim = "containerd-shim"
shim_debug = true
+[proxy_plugins]
+
+[stream_processors]
+
+[timeouts]
+
[ttrpc]
address = ""
gid = 0
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The github.com/BurntSushi/toml project is no longer maintained,
and containerd is switching to this project instead, so start
moving our code as well.
This patch only changes the binary used during validation (tbh,
we could probably remove this validation step, but leaving that
for now).
I manually verified that the hack/verify/toml still works by adding a commit
that makes the MAINTAINERS file invalid;
diff --git a/MAINTAINERS b/MAINTAINERS
index b739e7e20c..81ababd8de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -23,7 +23,7 @@
# a subsystem, they are responsible for doing so and holding the
# subsystem maintainers accountable. If ownership is unclear, they are the de facto owners.
- people = [
+ people =
"akihirosuda",
"anusha",
"coolljt0725",
Running `hack/verify/toml` was able to detect the broken format;
hack/validate/toml
(27, 4): keys cannot contain , characterThese files are not valid TOML:
- MAINTAINERS
Please reformat the above files as valid TOML
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some tests were using domain names that were intended to be "fake", but are
actually registered domain names (such as domain.com, registry.com, mytest.com).
Even though we were not actually making connections to these domains, it's
better to use domains that are designated for testing/examples in RFC2606:
https://tools.ietf.org/html/rfc2606
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The following was failing previously, because `getUnprivilegedMountFlags()` was not called:
```console
$ sudo mount -t tmpfs -o noexec none /tmp/foo
$ $ docker --context=rootless run -it --rm -v /tmp/foo:/mnt:ro alpine
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:520: container init caused: rootfs_linux.go:60: mounting "/tmp/foo" to rootfs at "/home/suda/.local/share/docker/overlay2/b8e7ea02f6ef51247f7f10c7fb26edbfb308d2af8a2c77915260408ed3b0a8ec/merged/mnt" caused: operation not permitted: unknown.
```
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
overlay2 no longer sets `archive.OverlayWhiteoutFormat` when
running in UserNS, so we can remove the complicated logic in the
archive package.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
When running in userns, returns error (i.e. "use naive, not native")
immediately.
No substantial change to the logic.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Fix issue 41762
Cherry-pick "drivers: btrfs: Allow unprivileged user to delete subvolumes" from containers/storage
831e32b6bd
> In btrfs, subvolume can be deleted by IOC_SNAP_DESTROY ioctl but there
> is one catch: unprivileged IOC_SNAP_DESTROY call is restricted by default.
>
> This is because IOC_SNAP_DESTROY only performs permission checks on
> the top directory(subvolume) and unprivileged user might delete dirs/files
> which cannot be deleted otherwise. This restriction can be relaxed if
> user_subvol_rm_allowed mount option is used.
>
> Although the above ioctl had been the only way to delete a subvolume,
> btrfs now allows deletion of subvolume just like regular directory
> (i.e. rmdir sycall) since kernel 4.18.
>
> So if we fail to cleanup subvolume in subvolDelete(), just fallback to
> system.EnsureRmoveall() to try to cleanup subvolumes again.
> (Note: quota needs privilege, so if quota is enabled we do not fallback)
>
> This fix will allow non-privileged container works with btrfs backend.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
full diff: fa125a3512...b3507428be
- fixed IPv6 iptables rules for enabled firewalld (libnetwork#2609)
- fixes "Docker uses 'iptables' instead of 'ip6tables' for IPv6 NAT rule, crashes"
- Fix regression in docker-proxy
- introduced in "Fix IPv6 Port Forwarding for the Bridge Driver" (libnetwork#2604)
- fixes/addresses: "IPv4 and IPv6 addresses are not bound by default anymore" (libnetwork#2607)
- fixes/addresses "IPv6 is no longer proxied by default anymore" (moby#41858)
- Use hostIP to decide on Portmapper version
- fixes docker-proxy not being stopped correctly
Port mapping of containers now contain separatet mappings for IPv4 and IPv6 addresses, when
listening on "any" IP address. Various tests had to be updated to take multiple mappings into
account.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The normalizing was updated with the output of the "docker port" command
in mind, but we're normalizing the "expected" output, which is passed
without the "->" in front of the mapping, causing some tests to fail;
=== RUN TestDockerSuite/TestPortHostBinding
--- FAIL: TestDockerSuite/TestPortHostBinding (1.21s)
docker_cli_port_test.go:324: assertion failed: error is not nil: |:::9876!=[::]:9876|
=== RUN TestDockerSuite/TestPortList
--- FAIL: TestDockerSuite/TestPortList (0.96s)
docker_cli_port_test.go:25: assertion failed: error is not nil: |:::9876!=[::]:9876|
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`getCurrentOOMScoreAdj()` was broken because `strconv.Atoi()` was called
without trimming "\n".
Fix issue 40068: `rootless docker in kubernetes: "getting the final child's pid from
pipe caused \"EOF\": unknown"
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Since rootlesskit removed vendor folder, building it has to rely on go mod.
Dockerfile in docker-ce-packaging uses GOPROXY=direct, which makes "go mod"
commands use git to fetch modules. "go mod" in Go versions before 1.14.1 are
incompatible with older git versions, including the version of git that ships
with CentOS/RHEL 7 (which have git 1.8), see golang/go#38373
This patch switches rootlesskit install script to set GOPROXY to
https://proxy.golang.org so that git is not required for downloading modules.
Once all our code has upgraded to Go 1.14+, this workaround should be
removed.
Signed-off-by: Tibor Vass <tibor@docker.com>
daemon.getExecConfig() already returns typed errors; by wrapping those errors
we may loose the actual reason for failures. Changing the error-type was
originally added in 2d43d93410, but I think
it was not intentional to ignore already-typed errors. It was later refactored
in a793564b25, which added helper functions
to create these errors, but kept the same behavior.
Also adds error-handling to prevent a panic in situations where (although
unlikely) `daemon.containers.Get()` would not return a container.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes daemon.getExecConfig return a errdefs.Conflict() error if the
container is not running.
This was originally the case, but a refactor of this code changed the typed
error (`derr.ErrorCodeContainerNotRunning`) to a non-typed error;
a793564b25
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Rootlesskit doesn't currently handle IPv6 addresses, causing TestNetworkLoopbackNat
and TestNetworkNat to fail;
Error starting userland proxy:
error while calling PortManager.AddPort(): listen tcp: address :::8080: too many colons in address
This patch:
- Updates `getExternalAddress()` to pick IPv4 address if both IPv6 and IPv4 are found
- Update TestNetworkNat to net.JoinHostPort(), so that square brackets are used for
IPv6 addresses (e.g. `[::]:8080`)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/rootless-containers/rootlesskit/compare/v0.13.1...v0.14.0
v0.14.0 Changes (since v0.13.2)
--------------------------------------
- CLI: improve --help output
- API: support GET /info
- Port API: support specifying IP version explicitly ("tcp4", "tcp6")
- rootlesskit-docker-proxy: support libnetwork >= 20201216 convention
- Allow vendoring with moby/sys/mountinfo@v0.1.3 as well as @v0.4.0
- Remove socat port driver
- socat driver has been deprecated since v0.7.1 (Dec 2019)
- New experimental flag: --ipv6
- Enables IPv6 routing (slirp4netns --enable-ipv6). Unrelated to port driver.
v0.13.2
--------------------------------------
- Fix cleaning up crashed state dir
- Update Go to 1.16
- Misc fixes
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were added in 94d70d8355 for Windows TP4,
but no longer used after 331c8a86d4 removed
support for TP4.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In 20.10 we no longer implicitly push all tags and require a
"--all-tags" flag, so add this to the test when the CLI is >= 20.10
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Tonis mentioned that we can run into issues if there is more error
handling added here. This adds a custom reader implementation which is
like io.MultiReader except it does not cache EOF's.
What got us into trouble in the first place is `io.MultiReader` will
always return EOF once it has received an EOF, however the error
handling that we are going for is to recover from an EOF because the
underlying file is a file which can have more data added to it after
EOF.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This seems to be testing a strange case, specifically that one can set
the `--net` and `--network` in the same command with the same network.
Indeed this used to work with older CLIs but newer ones error out when
validating the request before sending it to the daemon.
Opening this for discussion because:
1. This doesn't seem to be testing anything at all related to the rest
of the test
2. Not really providing any value here.
3. Is testing that a technically invalid option is successful (whether
the option should be valid as it relates to the CLI accepting it is
debatable).
4. Such a case seems fringe and even a bug in whatever is calling the
CLI with such options.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Imagine that in test TestServiceLogsFollow the service
TestServiceLogsFollow would print "log test" message to pipe exactly 3
times before cmd.Process.Kill() would kill the service in the end of
test. This means that goroutine would hang "forever" in
reader.Readline() because it can't read anything from pipe but pipe
write end is still left open by the goroutine.
This is standard behaviour of pipes, one should close the write end
before reading from the read end, else reading would block forever.
This problem does not fire frequently because the service normally
prints "log test" message at least 4 times, but we saw this hang on our
test runs in Virtuozzo.
We can't close the write pipe end before reading in a goroutine because
the goroutine is basicly a thread and closing a file descrptor would
close it for all other threads and "log test" would not be printed at
all.
So I see another way to handle this race, we can just defer pipe close
to the end of the main thread of the test case after killing the
service. This way goroutine's reading would be interrupted and it would
finish eventually.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
With the promotion of the experimental Dockerfile syntax to "stable", the Dockerfile
syntax now includes some options that are supported by BuildKit, but not (yet)
supported in the classic builder.
As a result, parsing a Dockerfile may succeed, but any flag that's known to BuildKit,
but not supported by the classic builder is silently ignored;
$ mkdir buildkit_flags && cd buildkit_flags
$ touch foo.txt
For example, `RUN --mount`:
DOCKER_BUILDKIT=0 docker build --no-cache -f- . <<EOF
FROM busybox
RUN --mount=type=cache,target=/foo echo hello
EOF
Sending build context to Docker daemon 2.095kB
Step 1/2 : FROM busybox
---> 219ee5171f80
Step 2/2 : RUN --mount=type=cache,target=/foo echo hello
---> Running in 022fdb856bc8
hello
Removing intermediate container 022fdb856bc8
---> e9f0988844d1
Successfully built e9f0988844d1
Or `COPY --chmod` (same for `ADD --chmod`):
DOCKER_BUILDKIT=0 docker build --no-cache -f- . <<EOF
FROM busybox
COPY --chmod=0777 /foo.txt /foo.txt
EOF
Sending build context to Docker daemon 2.095kB
Step 1/2 : FROM busybox
---> 219ee5171f80
Step 2/2 : COPY --chmod=0777 /foo.txt /foo.txt
---> 8b7117932a2a
Successfully built 8b7117932a2a
Note that unknown flags still produce and error, for example, the below fails because `--hello` is an unknown flag;
DOCKER_BUILDKIT=0 docker build -<<EOF
FROM busybox
RUN --hello echo hello
EOF
Sending build context to Docker daemon 2.048kB
Error response from daemon: dockerfile parse error line 2: Unknown flag: hello
With this patch applied
----------------------------
With this patch applied, flags that are known in the Dockerfile spec, but are not
supported by the classic builder, produce an error, which includes a link to the
documentation how to enable BuildKit:
DOCKER_BUILDKIT=0 docker build --no-cache -f- . <<EOF
FROM busybox
RUN --mount=type=cache,target=/foo echo hello
EOF
Sending build context to Docker daemon 2.048kB
Step 1/2 : FROM busybox
---> b97242f89c8a
Step 2/2 : RUN --mount=type=cache,target=/foo echo hello
the --mount option requires BuildKit. Refer to https://docs.docker.com/go/buildkit/ to learn how to build images with BuildKit enabled
DOCKER_BUILDKIT=0 docker build --no-cache -f- . <<EOF
FROM busybox
COPY --chmod=0777 /foo.txt /foo.txt
EOF
Sending build context to Docker daemon 2.095kB
Step 1/2 : FROM busybox
---> b97242f89c8a
Step 2/2 : COPY --chmod=0777 /foo.txt /foo.txt
the --chmod option requires BuildKit. Refer to https://docs.docker.com/go/buildkit/ to learn how to build images with BuildKit enabled
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When the multireader hits EOF, we will always get EOF from it, so we
cannot store the multrireader fro later error handling, only for the
decoder.
Thanks @tobiasstadler for pointing this error out.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The "userxattr" option is needed for mounting overlayfs inside a user namespace with kernel >= 5.11.
The "userxattr" option is NOT needed for the initial user namespace (aka "the host").
Also, Ubuntu (since circa 2015) and Debian (since 10) with kernel < 5.11 can mount the overlayfs in a user namespace without the "userxattr" option.
The corresponding kernel commit: 2d2f2d7322ff43e0fe92bf8cccdc0b09449bf2e1
> **ovl: user xattr**
>
> Optionally allow using "user.overlay." namespace instead of "trusted.overlay."
> ...
> Disable redirect_dir and metacopy options, because these would allow privilege escalation through direct manipulation of the
> "user.overlay.redirect" or "user.overlay.metacopy" xattrs.
Fix issue 42055
Related to containerd/containerd PR 5076
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Fixes#41704
The latest released versions of the static binaries (20.10.3) are still unable
to use faccessat2 with musl-1.2.2 even though this was addressed in #41353 and
related issues. The underlying cause seems to be that the build system
here still uses the default version of libseccomp shipped with buster.
An updated version is available in buster backports:
https://packages.debian.org/buster-backports/libseccomp-dev
Signed-off-by: Jeremy Huntwork <jhuntwork@lightcubesolutions.com>
Added an option `awslogs-create-stream` to allow skipping log stream
creation for awslogs log driver. The default value is still true to
keep the behavior be consistent with before.
Signed-off-by: Xia Wu <xwumzn@amazon.com>
full diff: https://github.com/containerd/containerd/compare/v1.4.3...v1.4.4
Release notes:
The fourth patch release for `containerd` 1.4 contains a fix for CVE-2021-21334
along with various other minor issues.
See [GHSA-36xw-fx78-c5r4](https://github.com/containerd/containerd/security/advisories/GHSA-36xw-fx78-c5r4)
for more details related to CVE-2021-21334.
Notable Updates
- Fix container create in CRI to prevent possible environment variable leak between containers
- Update shim server to return grpc NotFound error
- Add bounds on max `oom_score_adj` value for shim's AdjustOOMScore
- Update task manager to use fresh context when calling shim shutdown
- Update Docker resolver to avoid possible concurrent map access panic
- Update shim's log file open flags to avoid containerd hang on syscall open
- Fix incorrect usage calculation
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These syscalls were disabled in #18971
due to them requiring CAP_PTRACE. CAP_PTRACE was blocked by default due
to a ptrace related exploit. This has been patched in the Linux kernel
(version 4.8) and thus `ptrace` has been re-enabled. However, these
associated syscalls seem to have been left behind. This commit brings
them in line with `ptrace`, and re-enables it for kernel > 4.8.
Signed-off-by: clubby789 <jamie@hill-daniel.co.uk>
=== RUN TestUntarParentPathPermissions
archive_unix_test.go:171: assertion failed: error is not nil: chown /tmp/TestUntarParentPathPermissions694189715/foo: operation not permitted
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
This mirrors what the non-rootless version does, and lets `systemd` understand
when the service is fully up and running.
`NotifyAccess=all` is required, since the main process is the wrapper script,
and it's the child process that emits the signal.
Signed-off-by: Hugo Osvaldo Barrera <hugo@barrera.io>
- Using "/go/" redirects for some topics, which allows us to
redirect to new locations if topics are moved around in the
documentation.
- Updated some old URLs to their new location.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 68bb095353...9065b18ba4
- fix seccomp compatibility in 32bit arm
- fixes Unable to build alpine:edge containers for armv7
- fixes Buildx failing to build for arm/v7 platform on arm64 machine
- resolver: avoid error caching on token fetch
- fixes "Error: i/o timeout should not be cached"
- fileop: fix checksum to contain indexes of inputs
- frontend/dockerfile: add RunCommand.FlagsUsed field
- relates to [20.10] Classic builder silently ignores unsupported Dockerfile command flags
- update qemu emulators
- relates to "Impossible to run git clone inside buildx with non x86 architecture"
- Fix reference count issues on typed errors with mount references
- fixes errors on releasing mounts with typed execerror refs
- fixes / addresses invalid mutable ref when using shared cache mounts
- dockerfile/docs: fix frontend image tags
- git: set token only for main remote access
- fixes "Loading repositories with submodules is repeated. Failed to clone submodule from googlesource"
- allow skipping empty layer detection on cache export
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Tibor Vass <tibor@docker.com>
`dockerd-rootless.sh install` is a common typo of `dockerd-rootless-setuptool.sh install`.
Now `dockerd-rootless.sh install` shows human-readable error.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
The `RUN --mount` options have been promoted to the stable channel,
so we can switch from "experimental" to "stable".
Note that the syntax directive should no longer be needed now, but
it's good practice to add a syntax-directive, to allow building on
older versions of docker.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was added in 391441c28b, to fix
upgrades from docker 1.11 to 1.12 with existing containers.
Given that any container after 1.12 should have the correct configuration
already, it should be safe to assume this upgrade logic is no longer needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Points out another symbol that Docker might need. in this case Docker's
mesh network in swarm mode does not route Virtual IPs if it's unset.
From /var/logs/docker.log:
time="2021-02-19T18:15:39+01:00" level=error msg="set up rule failed, [-t mangle -A INPUT -d 10.0.1.2/32 -j MARK --set-mark 257]: (iptables failed: iptables --wait -t mang
le -A INPUT
-d 10.0.1.2/32 -j MARK --set-mark 257: iptables v1.8.7 (legacy): unknown option \"--set-mark\"\nTry `iptables -h' or 'iptables --help' for more information.\n (exit status 2))"
Bug: https://github.com/moby/libnetwork/issues/2227
Bug: https://github.com/docker/for-linux/issues/644
Bug: https://github.com/docker/for-linux/issues/525
Signed-off-by: Piotr Karbowski <piotr.karbowski@protonmail.ch>
Wrap platforms.Only and fallback to our ignore mismatches due to empty
CPU variants. This just cleans things up and makes the logic re-usable
in other places.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The runtime spec expects the FileMode field to only hold file permissions,
however `unix.Stat_t.Mode` contains both file type and mode.
This patch strips file type so that only file mode is included in the Device.
Thanks to Iceber Gu, who noticed the same issue in containerd and runc.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In some cases, in fact many in the wild, an image may have the incorrect
platform on the image config.
This can lead to failures to run an image, particularly when a user
specifies a `--platform`.
Typically what we see in the wild is a manifest list with an an entry
for, as an example, linux/arm64 pointing to an image config that has
linux/amd64 on it.
This change falls back to looking up the manifest list for an image to
see if the manifest list shows the image as the correct one for that
platform.
In order to accomplish this we need to traverse the leases associated
with an image. Each image, if pulled with Docker 20.10, will have the
manifest list stored in the containerd content store with the resource
assigned to a lease keyed on the image ID.
So we look up the lease for the image, then look up the assocated
resources to find the manifest list, then check the manifest list for a
platform match, then ensure that manifest referes to our image config.
This is only used as a fallback when a user specified they want a
particular platform and the image config that we have does not match
that platform.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Commit edb62a3ace fixed a bug in MkdirAllAndChown()
that caused the specified permissions to not be applied correctly. As a result
of that bug, the configured umask would be applied.
When extracting archives, Unpack() used 0777 permissions when creating missing
parent directories for files that were extracted.
Before edb62a3ace, this resulted in actual
permissions of those directories to be 0755 on most configurations (using a
default 022 umask).
Creating these directories should not depend on the host's umask configuration.
This patch changes the permissions to 0755 to match the previous behavior,
and to reflect the original intent of using 0755 as default.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Config resolution was synchronized based on a wrong key as ref
variable is initialized only after in the same function. Using
the right key isn't fully correct either as the synchronized method
changes properties of the puller instance and can't be just skipped.
Added better error handling for the same case as well.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Changes certain words and adds punctuation to the comments of functions in the client package, which end up in the GoDoc documentation. Areas where only periods were needed were ignored to prevent excessive code churn.
Signed-off-by: Levi Harrison <levisamuelharrison@gmail.com>
We have upgraded runc to rc93 and added CI for cgroup 2.
So we can move cgroup v2 out of experimental.
Fix issue 41916
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This currently doesn't make a difference, because load.FrozenImagesLinux()
currently loads all frozen images, not just the specified one, but in case
that is fixed/implemented at some point.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
v0.13.1
- Refactor `ParsePortSpec` to handle IPv6 addresses, and improve validation
v0.13.0
- `rootlesskit --pidns`: fix propagating exit status
- Support cgroup2 evacuation, e.g., `systemd-run -p Delegate=yes --user -t rootlesskit --cgroupns --pidns --evacuate-cgroup2=evac --net=slirp4netns bash`
v0.12.0
- Port forwarding API now supports setting `ChildIP`
- The `vendor` directory is no longer included in this repo. Run `go mod vendor` if you need
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/opencontainers/runc/compare/v1.0.0-rc92...v1.0.0-rc93
release notes: https://github.com/opencontainers/runc/releases/tag/v1.0.0-rc93
Release notes for runc v1.0.0-rc93
-------------------------------------------------
This is the last feature-rich RC release and we are in a feature-freeze until
1.0. 1.0.0~rc94 will be released in a few weeks with minimal bug fixes only,
and 1.0.0 will be released soon afterwards.
- runc's cgroupv2 support is no longer considered experimental. It is now
believed to be fully ready for production deployments. In addition, runc's
cgroup code has been improved:
- The systemd cgroup driver has been improved to be more resilient and
handle more systemd properties correctly.
- We now make use of openat2(2) when possible to improve the security of
cgroup operations (in future runc will be wholesale ported to libpathrs to
get this protection in all codepaths).
- runc's mountinfo parsing code has been reworked significantly, making
container startup times significantly faster and less wasteful in general.
- runc now has special handling for seccomp profiles to avoid making new
syscalls unusable for glibc. This is done by installing a custom prefix to
all seccomp filters which returns -ENOSYS for syscalls that are newer than
any syscall in the profile (meaning they have a larger syscall number).
This should not cause any regressions (because previously users would simply
get -EPERM rather than -ENOSYS, and the rule applied above is the most
conservative rule possible) but please report any regressions you find as a
result of this change -- in particular, programs which have special fallback
code that is only run in the case of -EPERM.
- runc now supports the following new runtime-spec features:
- The umask of a container can now be specified.
- The new Linux 5.9 capabilities (CAP_PERFMON, CAP_BPF, and
CAP_CHECKPOINT_RESTORE) are now supported.
- The "unified" cgroup configuration option, which allows users to explicitly
specify the limits based on the cgroup file names rather than abstracting
them through OCI configuration. This is currently limited in scope to
cgroupv2.
- Various rootless containers improvements:
- runc will no longer cause conflicts if a user specifies a custom device
which conflicts with a user-configured device -- the user device takes
precedence.
- runc no longer panics if /sys/fs/cgroup is missing in rootless mode.
- runc --root is now always treated as local to the current working directory.
- The --no-pivot-root hardening was improved to handle nested mounts properly
(please note that we still strongly recommend that users do not use
--no-pivot-root -- it is still an insecure option).
- A large number of code cleanliness and other various cleanups, including
fairly large changes to our tests and CI to make them all run more
efficiently.
For packagers the following changes have been made which will have impact on
your packaging of runc:
- The "selinux" and "apparmor" buildtags have been removed, and now all runc
builds will have SELinux and AppArmor support enabled. Note that "seccomp"
is still optional (though we very highly recommend you enable it).
- make install DESTDIR= now functions correctly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While the field in the Go struct is named `NanoCPUs`, it has a JSON label to
use `NanoCpus`, which was added in the original pull request (not clear what
the reason was); 846baf1fd3
Some notes:
- Golang processes field names case-insensitive, so when *using* the API,
both cases should work, but when inspecting a container, the field is
returned as `NanoCpus`.
- This only affects Containers.Resources. The `Limits` and `Reservation`
for SwarmKit services and SwarmKit "nodes" do not override the name
for JSON, so have the canonical (`NanoCPUs`) casing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While the field in the Go struct is named `NanoCPUs`, it has a JSON label to
use `NanoCpus`, which was added in the original pull request (not clear what
the reason was); 846baf1fd3
Some notes:
- Golang processes field names case-insensitive, so when *using* the API,
both cases should work, but when inspecting a container, the field is
returned as `NanoCpus`.
- This only affects Containers.Resources. The `Limits` and `Reservation`
for SwarmKit services and SwarmKit "nodes" do not override the name
for JSON, so have the canonical (`NanoCPUs`) casing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the image build from Dockerfile.simple to build docker binary failed
with not find <brtfs/ioctl.h>, we need to install libbtrfs-dev to fix this.
```
Building: bundles/dynbinary-daemon/dockerd-dev
GOOS="" GOARCH="" GOARM=""
.gopath/src/github.com/docker/docker/daemon/graphdriver/btrfs/btrfs.go:8:10: fatal error: btrfs/ioctl.h: No such file or directory
#include <btrfs/ioctl.h>
```
Signed-off-by: Lei Jitang <leijitang@outlook.com>
Otherwise a malformed or empty digest may cause a panic.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit a7d4af84bd)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Various dirs in /var/lib/docker contain data that needs to be mounted
into a container. For this reason, these dirs are set to be owned by the
remapped root user, otherwise there can be permissions issues.
However, this uneccessarily exposes these dirs to an unprivileged user
on the host.
Instead, set the ownership of these dirs to the real root (or rather the
UID/GID of dockerd) with 0701 permissions, which allows the remapped
root to enter the directories but not read/write to them.
The remapped root needs to enter these dirs so the container's rootfs
can be configured... e.g. to mount /etc/resolve.conf.
This prevents an unprivileged user from having read/write access to
these dirs on the host.
The flip side of this is now any user can enter these directories.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit e908cc3901)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The remapped root does not need access to this dir.
Having this owned by the remapped root opens the host up to an
uprivileged user on the host being able to escalate privileges.
While it would not be normal for the remapped UID to be used outside of
the container context, it could happen.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit bfedd27259)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Generally if we ever need to change perms of a dir, between versions,
this ensures the permissions actually change when we think it should
change without having to handle special cases if it already existed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit edb62a3ace)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this change, there is no way to know if container (runtime)
resources have been cleaned up unless you actually remove the container.
This change allows callers of the wait API or the events API to know
that all runtime resources for the container are released (e.g. IP
addresses).
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
These syscalls (some of which have been in Linux for a while but were
missing from the profile) fall into a few buckets:
* close_range(2), epoll_pwait2(2) are just extensions of existing "safe
for everyone" syscalls.
* The mountv2 API syscalls (fs*(2), move_mount(2), open_tree(2)) are
all equivalent to aspects of mount(2) and thus go into the
CAP_SYS_ADMIN category.
* process_madvise(2) is similar to the other process_*(2) syscalls and
thus goes in the CAP_SYS_PTRACE category.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Fix#41803
Also attempt to mknod devices.
Mknodding devices are likely to fail, but still worth trying when
running with a seccomp user notification.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This fixes a panic when an invalid "device cgroup rule" is passed, resulting
in an "index out of range".
This bug was introduced in the original implementation in 1756af6faf,
but was not reproducible when using the CLI, because the same commit also added
client-side validation on the flag before making an API request. The following
example, uses an invalid rule (`c *:* rwm` - two spaces before the permissions);
```console
$ docker run --rm --network=host --device-cgroup-rule='c *:* rwm' busybox
invalid argument "c *:* rwm" for "--device-cgroup-rule" flag: invalid device cgroup format 'c *:* rwm'
```
Doing the same, but using the API results in a daemon panic when starting the container;
Create a container with an invalid device cgroup rule:
```console
curl -v \
--unix-socket /var/run/docker.sock \
"http://localhost/v1.41/containers/create?name=foobar" \
-H "Content-Type: application/json" \
-d '{"Image":"busybox:latest", "HostConfig":{"DeviceCgroupRules": ["c *:* rwm"]}}'
```
Start the container:
```console
curl -v \
--unix-socket /var/run/docker.sock \
-X POST \
"http://localhost/v1.41/containers/foobar/start"
```
Observe the daemon logs:
```
2021-01-22 12:53:03.313806 I | http: panic serving @: runtime error: index out of range [0] with length 0
goroutine 571 [running]:
net/http.(*conn).serve.func1(0xc000cb2d20)
/usr/local/go/src/net/http/server.go:1795 +0x13b
panic(0x2f32380, 0xc000aebfc0)
/usr/local/go/src/runtime/panic.go:679 +0x1b6
github.com/docker/docker/oci.AppendDevicePermissionsFromCgroupRules(0xc000175c00, 0x8, 0x8, 0xc0000bd380, 0x1, 0x4, 0x0, 0x0, 0xc0000e69c0, 0x0, ...)
/go/src/github.com/docker/docker/oci/oci.go:34 +0x64f
```
This patch:
- fixes the panic, allowing the daemon to return an error on container start
- adds a unit-test to validate various permutations
- adds a "todo" to verify the regular expression (and handling) of the "a" (all) value
We should also consider performing this validation when _creating_ the container,
so that an error is produced early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
"docker load" validates parent links by comparing image histories, and the
History struct has a time.Time member "Created". Time.UnmarshalJSON can read
RFC3339 timestamps with offset "+00:00", but t.MarshalJSON writes them with
offset "Z". Equivalent times in these two formats are not equal when compared
with the == operator.
This causes checkValidParent to incorrectly return false when the parent image
history contains times using offset "+00:00". In that case the history copied
to the child image will have been converted into "Z" form when marshaled out.
This patch adds an "Equal" method to History, which compares "Created" times
with t.Equal. This is used instead of reflect.DeepEqual in checkValidParent.
Signed-off-by: Rob Cowsill <42620235+rcowsill@users.noreply.github.com>
Loggers that implement BufSize() (e.g. awslogs) uses the method to
tell Copier about the maximum log line length. However loggerWithCache
and RingBuffer hide the method by wrapping loggers.
As a result, Copier uses its default 16KB limit which breaks log
lines > 16kB even the destinations can handle that.
This change implements BufSize() on loggerWithCache and RingBuffer to
make sure these logger wrappes don't hide the method on the underlying
loggers.
Fixes#41794.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
The following warnings in `docker info` are now discarded,
because there is no action user can actually take.
On cgroup v1:
- "WARNING: No blkio weight support"
- "WARNING: No blkio weight_device support"
On cgroup v2:
- "WARNING: No kernel memory TCP limit support"
- "WARNING: No oom kill disable support"
`docker run` still prints warnings when the missing feature is being attempted to use.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Allow proxying IPv6 traffic to the container's IPv4 interface
if `--ipv6` is disabled and the container does not have a
IPv6 address, when the docker-proxy / `userland-proxy` is enabled
on `dockerd`
Relates to https://github.com/moby/libnetwork/issues/2607
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
When pulling an image by platform, it is possible for the image's
configured platform to not match what was in the manifest list.
The image itself is buggy because either the manifest list is incorrect
or the image config is incorrect. In any case, this is preventing people
from upgrading because many times users do not have control over these
buggy images.
This was not a problem in 19.03 because we did not compare on platform
before. It just assumed if we had the image it was the one we wanted
regardless of platform, which has its own problems.
Example Dockerfile that has this problem:
```Dockerfile
FROM --platform=linux/arm64 k8s.gcr.io/build-image/debian-iptables:buster-v1.3.0
RUN echo hello
```
This fails the first time you try to build after it finishes pulling but
before performing the `RUN` command.
On the second attempt it works because the image is already there and
does not hit the code that errors out on platform mismatch (Actually it
ignores errors if an image is returned at all).
Must be run with the classic builder (DOCKER_BUILDKIT=0).
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fixes a panic when an admin specifies a custom default runtime,
when a plugin is started the shim config is nil.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This parameter was removed by kernel commit 4c145dce260137,
which made its way to kernel v5.3-rc1. Since that commit,
the functionality is built-in (i.e. it is available as long
as CONFIG_XFRM is on).
Make the check conditional.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
These config options are removed by kernel commit f382fb0bcef4,
which made its way into kernel v5.0-rc1.
Make the check conditional.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Kernel commit 2d1c498072de69e (which made its way into kernel v5.8-rc1)
removed CONFIG_MEMCG_SWAP_ENABLED Kconfig option, making swap accounting
always enabled (unless swapaccount=0 boot option is provided).
Make the check conditional.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
CONFIG_NF_NAT_NEEDED was removed in kernel commit 4806e975729f99c7,
which made its way into v5.2-rc1. The functionality is now under
NF_NAT which we already check for.
Make the check for NF_NAT_NEEDED conditional.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
CONFIG_NF_NAT_IPV4 was removed in kernel commit 3bf195ae6037e310,
which made its way into v5.1-rc1. The functionality is now under
NF_NAT which we already check for.
Make the check for NF_NAT_IPV4 conditional.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit f2f5106c92 added this test to verify loading
of images that were built with user-namespaces enabled.
However, because this test spins up a new daemon, not the daemon that's set up by
the test-suite's `TestMain()` (which loads the frozen images).
As a result, the `debian:bullseye` image was pulled from Docker Hub when running
the test;
Calling POST /v1.41/images/load?quiet=1
Applying tar in /go/src/github.com/docker/docker/bundles/test-integration/TestBuildUserNamespaceValidateCapabilitiesAreV2/d4d366b15997b/root/165536.165536/overlay2/3f7f9375197667acaf7bc810b34689c21f8fed9c52c6765c032497092ca023d6/diff" storage-driver=overlay
Applied tar sha256:845f0e5159140e9dbcad00c0326c2a506fbe375aa1c229c43f082867d283149c to 3f7f9375197667acaf7bc810b34689c21f8fed9c52c6765c032497092ca023d6, size: 5922359
Calling POST /v1.41/build?buildargs=null&cachefrom=null&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=&labels=null&memory=0&memswap=0&networkmode=&rm=0&shmsize=0&t=capabilities%3A1.0&target=&ulimits=null&version=
Trying to pull debian from https://registry-1.docker.io v2
Fetching manifest from remote" digest="sha256:f169dbadc9021fc0b08e371d50a772809286a167f62a8b6ae86e4745878d283d" error="<nil>" remote="docker.io/library/debian:bullseye
Pulling ref from V2 registry: debian:bullseye
...
This patch updates `TestBuildUserNamespaceValidateCapabilitiesAreV2` to load the
frozen image. `StartWithBusybox` is also changed to `Start`, because the test
is not using the busybox image, so there's no need to load it.
In a followup, we should probably add some utilities to make this easier to set up
(and to allow passing the list frozen images that we want to load, without having
to "hard-code" the image name to load).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Using `d.Kill()` with rootless mode causes the restarted daemon to not
be able to start containerd (it times out).
Originally this was SIGKILLing the daemon because we were hoping to not
have to manipulate on disk state, but since we need to anyway we can
shut it down normally.
I also tested this to ensure the test fails correctly without the fix
that the test was added to check for.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
When building images in a user-namespaced container, v3 capabilities are
stored including the root UID of the creator of the user-namespace.
This UID does not make sense outside the build environment however. If
the image is run in a non-user-namespaced runtime, or if a user-namespaced
runtime uses a different UID, the capabilities requested by the effective
bit will not be honoured by `execve(2)` due to this mismatch.
Instead, we convert v3 capabilities to v2, dropping the root UID on the
fly.
Signed-off-by: Eric Mountain <eric.mountain@datadoghq.com>
1. Allocate either a IPv4 and/or IPv6 Port Binding (HostIP, HostPort, ContainerIP,
ContainerPort) based on the input and system parameters
2. Update the userland proxy as well as dummy proxy (inside port mapper) to
specifically listen on either the IPv4 or IPv6 network
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
This reverts commit a65c65d801,
which caused the docker service to not be starting, or delayed
starting the service in certain conditions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This switches the hcsshim dependency back to tagged releases, instead of the special
"moby" branch. This makes the dependency align with both BuildKit and containerd,
which use these versions.
The switch to the "moby" branch was done in 2865478487,
to bring in a fix for image import, without having to bring in additional changes;
> We changed to the moby branch for a couple of reasons:
>
> - Allows us to take this important change without needing to also pull in all
> of the other work that has been going on in the repo.
> - moby uses an older set of APIs exposed from hcsshim, based on the HCS v1
> functionality. Going forwards, we have discussed deprecating/removing these
> APIs from the mainline branch in hcsshim, so our thinking was we could keep
> this moby branch around to ensure we don't break compatibility there.
>
> (...) Long term, the best path here is to get moby using containerd as the
> backend on Windows, which should alleviate these issues.
full diff: 9dcb42f100..v0.8.10
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When writing container's `hostconfig.json`, permissions were set to 0644 (world-
readable). While this is not a security concern (as the `/var/lib/docker/containers`
directory has `0700` or `0701` permissions), there is no real need to have these
permissions, as this file is only accessed by the daemon.
Looking at history for file permissions;
- 06b53e3fc7 (first implementation) used `0666` (world-writable)
- cf1a6c08fa refactored the code, and removed explicit permissions
- ea3cbd3274 introduced atomic writes, and brought back the `0666` permissions
- 3ec8fed747 removed world-writable bits, but kept world-readable
This patch updates the permissions to `0600`, matching what's used for `config.v2.json`,
which was updated in ae52cea3ab, but forgot to update
`hostconfig.json`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The free disk space on the Windows RS5 CI nodes appears to be just the
right size that the TestBuildWCOWSandboxSize test can generate 21GB of
layers, and then a 21GB sandbox inside a container, and then runs out of
space while committing the layer.
Helpfully, this failure is distinguishable in the logs from a failure
when the sandbox is too small, so we can do that.
TODO: Revert this if-and-when the Windows-RS5 CI nodes have more free
space.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
This ensures the storage-opts applies to all operations by the graph
drivers, replacing the merging of storage-opts into container storage
config at container-creation time, and hence applying storage-opts to
non-container operations like `COPY` and `ADD` in the builder.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
This applies the 127GB default WCOW Sandbox size to not just `RUN` under
`docker build` (as was previously the case) but to `COPY` and `ADD`
under `docker build` and also to `docker run`.
It also removes an inconsistency that the 127GB size was not applied
when `--platform windows` was not passed to `docker build`, but WCOW was
still used as a platform default, e.g. Docker Desktop for Windows in
Windows Containers mode.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
This test validates that `RUN` and `COPY` both target a read-write
sandbox on Windows that is configured according to the daemon's
`storage-opts` setting.
Sadly, this is a slow test, so we need to bump the timeout to 60 minutes
from the default of 10 minutes.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
relates to https://github.com/docker/for-linux/issues/678
When using the BindTo directive, Docker is permanently stopped by systemd
when containerd is temporarily killed and restarted;
Using `Requires` achieves mostly the same, but defines a weaker dependency;
https://www.freedesktop.org/software/systemd/man/systemd.unit.html#Requires=
> Requires=
>
> .. If this unit gets activated, the units listed will be activated as well.
> If one of the other units fails to activate, and an ordering dependency
> After= on the failing unit is set, this unit will not be started. Besides,
> with or without specifying After=, this unit will be stopped if one of the
> other units is explicitly stopped.
We may want to look into using `Wants=` instead of `Requires=`, because
that allows docker to continue running if containerd is restarted, quoting
the systemd documentation:
> Often, it is a better choice to use Wants= instead of Requires= in order
> to achieve a system that is more robust when dealing with failing services.
Given that docker will likely still fail if the containerd socket is not
present, startup will fail if containerd is not running, but if containerd
is restarted, the docker daemon may be able to try reconnecting.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
dockerd currently sets the oom-score-adjust itself. This functionality
was added when we did not yet run dockerd as a systemd service.
Now that we do, it's better to instead have systemd handle this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The homedir package was only used to print default values for
flags that contained paths inside the user's home-directory in
a slightly nicer way (replace `/users/home` with `~`).
Given that this is not critical, we can replace this with golang's
function, which does not depend on libcontainer.
There's still one use of the homedir package in docker/docker/opts,
which is used by the dnet binary (but only requires the homedir
package when running in rootless mode)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
All distros that are supported by Docker now have at least
kernel version 3.10, so this check should no longer be needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
All distros that are supported by Docker now have at least
kernel version 3.10, so this check should no longer be needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Seeing this `ErrNotImplements` in some of our logs and it's not very
helpful because we don't know what plugin is causing it or even what the
requested interface is.
```
{"message":"legacy plugin: Plugin does not implement the requested driver"}
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Previously, failing to disable IPv6 router advertisement prevented the daemon to
start.
An issue was reported by a user that started docker using `systemd-nspawn "machine"`,
which produced an error;
failed to start daemon: Error initializing network controller:
Error creating default "bridge" network: libnetwork:
Unable to disable IPv6 router advertisement:
open /proc/sys/net/ipv6/conf/docker0/accept_ra: read-only file system
This patch changes the error to a log-message instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The homedir package was only used to print default values for
flags that contained paths inside the user's home-directory in
a slightly nicer way (replace `/users/home` with `~`).
Given that this is not critical, we can replace this with golang's
function, which does not depend on libcontainer.
There's still one use of the homedir package in docker/docker/opts,
which is used by the dnet binary (but only requires the homedir
package when running in rootless mode)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Samuel Karp <skarp@amazon.com>
(cherry picked from commit 9489546c44d94d37337191c263879a7ac075a331)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fix 'failed to get network during CreateEndpoint' during container starting.
Change the error type to `libnetwork.ErrNoSuchNetwork`, so `Start()` in `daemon/cluster/executor/container/controller.go` will recreate the network.
Signed-off-by: Xinfeng Liu <xinfeng.liu@gmail.com>
This function always returned `nil`, so we can remove the error
return, and update other functions that were handling errors.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 2a480d515e updated the DNS library
and updated the error handling.
Due to changes in the library, we now had to check the response itself
to check if the response was truncated (Truncated DNS replies should
be sent to the client so that the client can retry over TCP).
However, 1e02aae252 added an incorrect
`nil` check to fix a panic, which ignored situations where
an error was returned, but no response (for example, if we failed
to connect to the DNS server).
In that situation, the error would be ignored, and further down we
would consider the connection to have been succesfull, but the DNS
server not returning a result.
After a "successful" lookup (but no results), we break the loop,
and don't attempt lookups in other DNS servers.
Versions before 1e02aae252 would produce:
Name To resolve: bbc.co.uk.
[resolver] query bbc.co.uk. (A) from 172.21.0.2:36181, forwarding to udp:192.168.5.1
[resolver] read from DNS server failed, read udp 172.21.0.2:36181->192.168.5.1:53: i/o timeout
[resolver] query bbc.co.uk. (A) from 172.21.0.2:38582, forwarding to udp:8.8.8.8
[resolver] received A record "151.101.0.81" for "bbc.co.uk." from udp:8.8.8.8
[resolver] received A record "151.101.192.81" for "bbc.co.uk." from udp:8.8.8.8
[resolver] received A record "151.101.64.81" for "bbc.co.uk." from udp:8.8.8.8
[resolver] received A record "151.101.128.81" for "bbc.co.uk." from udp:8.8.8.8
Versions after that commit would ignore the error, and stop further lookups:
Name To resolve: bbc.co.uk.
[resolver] query bbc.co.uk. (A) from 172.21.0.2:59870, forwarding to udp:192.168.5.1
[resolver] external DNS udp:192.168.5.1 returned empty response for "bbc.co.uk."
This patch updates the logic to handle the error to log the error (and continue with the next DNS):
- if an error is returned, and no response was received
- if an error is returned, but it was not related to a truncated response
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Tibor Vass <tibor@docker.com>
full diff: https://github.com/moby/ipvs/compare/v1.0.0...v1.0.1
- Fix compatibility issue on older kernels (< 3.18) where the address
family attribute for destination servers do not exist
- Fix the stats attribute check when parsing destination addresses
- NetlinkSocketsTimeout should be a constant
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Under certain conditions it appears that the DNS response and returned
error can be nil. When this happens, checking resp.Truncated results in
a nil panic so we must first check that the response is not nil before
checking if a truncated response was received.
See moby/moby#40715
Signed-off-by: Sam Whited <sam@samwhited.com>
The ipvs package was moved to a separate repo.
The ipvs package is a fairly generic set of helpers for managing IPVS.
The ipvs package is used by docker swarm and kubernetes.
Because we want to merge libnetwork back into the moby/moby codebase
while also not creating more dependencies for other projects on
moby/moby itself, it was decided that the best path for ipvs is to live
on it's own since there are no other ties to libnetwork.
Ref: https://github.com/moby/libnetwork/issues/2522
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
https://github.com/docker/libnetwork/pull/2419 and
https://github.com/docker/libnetwork/pull/2407
attempted to seperate out empty parent and internal for
macvlan and ipvlan networks
However it didnt pass the integration tests in moby
https://github.com/moby/moby/pull/40596 and exposed some
more plumbing that needed to be done to make sure
we separate the two things
If the -o parent is empty we create a dummylink
and if internal is set we dont add a default gateway
and make sure north-south communication cannot take place
(only east-west / container-container can)
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
PartOf deactivates the socket whenever the service get deactivated.
The socket unit however should be active nevertheless, so that the
docker service can be started again through socket activation.
Based on the original patch in upstream moby/moby by Max Harmathy.
Co-authored-by: Max Harmathy <max.harmathy@web.de>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Deleting a network sandbox on Linux implicitly clears OS (ipvs) load
balancer state. Deleting an HNS network on Windows by contrast does not
inherently remove its corresponding VFP load balancers. The method to
remove load balancers belongs to the network and so must be called prior
to or while deleting a network. This commit reverts one line from
ea2fa20859, reintroducing a call to
explicitly remove backend load balancers during network removal.
Signed-off-by: Trapier Marshall <tmarshall@mirantis.com>
Debian Buster is now the current "stable", and will be the default
baseimage for Golang images going forward.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Using dummy interface allows communication beween containers only if
they are running on the same node in swarm.
Signed-off-by: Pavel Matěja <pavel@verotel.cz>
Using dummy interface allows communication beween containers only if
they are running on the same node in swam.
Signed-off-by: Pavel Matěja <pavel@verotel.cz>
Since docker container can be connected to combination of several
internal and external networks change of default gateway of the internal
ones breaks communication via the external ones.
This fixes only ipvlan network type
Signed-off-by: Pavel Matěja <pavel@verotel.cz>
Further improving load balancer performance by expiring
connections to servers with weights set to 0.
Signed-off-by: Andrew Kim <taeyeonkim90@gmail.com>
IPVS module used for swarm load balancer had a performance issue
under a high load situation. conn_reuse_mode=0 sysctl variable can
be set to handle the high load situation by reusing existing
connection entries in the IPVS table.
Under a high load, IPVS module was dropping tcp SYN packets whenever
a port reuse is detected with a connection in TIME_WAIT status forcing
clients to re-initiate tcp connections after request timeout events.
By setting conn_reuse_mode=0, IPVS module avoids special handling of
existing entries in the IPVS connection table.
Along with expire_nodest_conn=1, swarm load balancer can handle
a high load of requests and forward connections to newly joining
backend services.
Signed-off-by: Andrew Kim <taeyeonkim90@gmail.com>
This fix addresses https://docker.atlassian.net/browse/ENGCORE-1115
Expected behaviors upon docker engine restarts:
1. IPTableEnable=true, DOCKER-USER chain present
-- no change to DOCKER-USER chain
2. IPTableEnable=true, DOCKER-USER chain not present
-- DOCKER-USER chain created and inserted top of FORWARD
chain.
3. IPTableEnable=false, DOCKER-USER chain present
-- no change to DOCKER-USER chain
the rational is that DOCKER-USER is populated
and may be used by end-user for purpose other than
filtering docker container traffic. Thus even if
IPTableEnable=false, docker engine does not touch
pre-existing DOCKER-USER chain.
4. IPTableEnable=false, DOCKER-USER chain not present
-- DOCKER-USER chain is not created.
Signed-off-by: Su Wang <su.wang@docker.com>
Issue - "index out of range" panic in drivers/overlay/encryption.go:539
due to a mismatch in indices between curKeys and spis due to
case where updateKeys might bail out due to an error and
not update the spis
Fix - Reconfigure keys when there is a key update failure
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Golang 1.12.12
-------------------------------
full diff: https://github.com/golang/go/compare/go1.12.11...go1.12.12
go1.12.12 (released 2019/10/17) includes fixes to the go command, runtime,
syscall and net packages. See the Go 1.12.12 milestone on our issue tracker for
details.
https://github.com/golang/go/issues?q=milestone%3AGo1.12.12
Golang 1.12.11 (CVE-2019-17596)
-------------------------------
full diff: https://github.com/golang/go/compare/go1.12.10...go1.12.11
go1.12.11 (released 2019/10/17) includes security fixes to the crypto/dsa
package. See the Go 1.12.11 milestone on our issue tracker for details.
https://github.com/golang/go/issues?q=milestone%3AGo1.12.11
[security] Go 1.13.2 and Go 1.12.11 are released
Hi gophers,
We have just released Go 1.13.2 and Go 1.12.11 to address a recently reported
security issue. We recommend that all affected users update to one of these
releases (if you're not sure which, choose Go 1.13.2).
Invalid DSA public keys can cause a panic in dsa.Verify. In particular, using
crypto/x509.Verify on a crafted X.509 certificate chain can lead to a panic,
even if the certificates don't chain to a trusted root. The chain can be
delivered via a crypto/tls connection to a client, or to a server that accepts
and verifies client certificates. net/http clients can be made to crash by an
HTTPS server, while net/http servers that accept client certificates will
recover the panic and are unaffected.
Moreover, an application might crash invoking
crypto/x509.(*CertificateRequest).CheckSignature on an X.509 certificate
request, parsing a golang.org/x/crypto/openpgp Entity, or during a
golang.org/x/crypto/otr conversation. Finally, a golang.org/x/crypto/ssh client
can panic due to a malformed host key, while a server could panic if either
PublicKeyCallback accepts a malformed public key, or if IsUserAuthority accepts
a certificate with a malformed public key.
The issue is CVE-2019-17596 and Go issue golang.org/issue/34960.
Thanks to Daniel Mandragona for discovering and reporting this issue. We'd also
like to thank regilero for a previous disclosure of CVE-2019-16276.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also reduce the allowed port range as the total number of containers
per host is typically less than 1K.
This change helps in scenarios where there are other services on
the same host that uses ephemeral ports in iptables manipulation.
The workflow requires changes in docker engine (
https://github.com/moby/moby/pull/40055) and this change. It
works as follows:
1. user can now specified to docker engine an option
--published-port-range="50000-60000" as cmdline argument or
in daemon.json.
2. docker engine read and pass this info to libnetwork via
config.go:OptionDynamicPortRange.
3. libnetwork uses this range to allocate dynamic port henceforth.
4. --published-port-range can be set either via SIGHUP or
restart docker engine
5. if --published-port-range is not set by user, a OS specific
default range is used for dynamic port allocation.
Linux: 49153-60999, Windows: 60000-65000
6 if --published-port-range is invalid, that is, the range
given is outside of allowed default range, no change takes place.
libnetwork will continue to use old/existing port range for
dynamic port allocation.
Signed-off-by: Su Wang <su.wang@docker.com>
Reverts 141b53c77a (PR #2450)
Fallout from changing the forwarding default policy to deny was greater than anticipated.
Signed-off-by: Euan Harris <euan.harris@docker.com>
Fixed these tests :
1.TestNetworkDBIslands
Addresses : https://github.com/docker/libnetwork/issues/2402
2.TestNetworkDBCRUDMediumCluster
Addresses : https://github.com/docker/libnetwork/issues/2401
By :
1. Importing gotest.tools/poll to use poll.WaitOn
Above function can be used to check a condition at regular intervals
until a timeout is reached
2. Replacing Sleep with poll.WaitOn
2. Adding closeNetworkDBInstances to close remaining DBs
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
The output of "bridge fdb show" command invoked under a network
namespace is unpredicable. Sometime it returns empty, and sometime
non-stop rolling output. This perhaps is a bug in kernel
and/or iproute2 implementation. To work around, display fdb for
each bridge.
Signed-off-by: Su Wang <su.wang@docker.com>
This commit allows a user to specify a Host IP via the
com.docker.network.host_ipv4 label which is used as the
Source IP during SNAT for bridge networks .
The use case is for hosts with multiple interfaces and
this label can dictate which IP will be used as Source IP
for North-South traffic
In the absence of this label, MASQUERADE is used which picks the Source IP
based on Next Hop from the Route Table
Addresses: https://github.com/moby/moby/issues/30053
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
In windows HNS manages IPAM. If the user does not specify a subnet, HNS will choose one
for them. However, in order for the IPAM to show up in the output of "docker inspect",
we need to update the network IPAMv4Config field.
Signed-off-by: Pradip Dhara <pradipd@microsoft.com>
This reverts commit 9f58c475940fb0c0d4b69de0af7787b62a40481f.
This commit is causing TestCreateParallel to be flaky
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
This reverts commit 94af1e5af2.
The reason to revert this is, that TestCreateParallel is
continously failing and breaking the CI
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
This allows overriding the version of Go without making modifications in the
source code, which can be useful to test against multiple versions.
For example:
make GO_VERSION=1.13beta1 build
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit carries forward the work done in
https://github.com/docker/libnetwork/pull/2295
and fixes two things
1. Allows macvlan and ipvlan to be restored properly
after dockerd or the system is restarted
2. Makes sure the refcount for the configOnly network
is not incremented for the above case so this network
can be deleted after all the associated ConfigFrom networks
are deleted
Addresses: https://github.com/docker/libnetwork/issues/1743
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Since docker container can be connected to combination of several
internal and external networks change of default gateway of the internal
ones breaks communication via the external ones.
This fixes only macvlan network type
Signed-off-by: Pavel Matěja <pavel@verotel.cz>
This commit updates the vendored ishidawataru/sctp and adapts its used
types.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
moby/moby commit b27f70d45 wraps the ErrNotFound error returned when
a plugin cannot be found, to include a backtrace. This changes the
type of the error, so contoller.loadIPAMDriver no longer converts it
to a libnetwork plugin.NotFoundError.
This is a similar patch as was merged in 9b114971e5
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Changes included:
- Allow index specification at link creation time
- replace syscall with golang.org/x/sys/unix
- related: Use IFF_MULTI_QUEUE from x/sys/unix to define TUNTAP_MULTI_QUEUE
- related: Use IFLA_* constants from x/sys/unix
- Fix index out of range when no metadata for gretap
- added encapsulation attributes for Iptun and Sittun to support SIT tunnels
- Expose xfrm state's statistics
- Support invert in ip rules
- Support LWTUNNEL_ENCAP_SEG6
- Support setting and retrieving route MTU/AdvMSS
- Fix CalcRtable array parameter bug
- added support for Foo-over-UDP netlink calls
- Support num{tx,rx}queues and udp6zerocsum{tx,rx}
- tuntap: Add multiqueue support
- Retrieve VLAN ID when listing neighbour
- Fix LinkAdd for sit tunnel on 3.10 kernel
- Add support for managing source MACVLANs
- Two functions: one for adding bond slave, one for getting veth peer index
- Eliminate cgo from netlink
- Don't overwrite the XDP file descriptor with flags
- Fix reference to BPF instructions (on Kernel 4.13)
- Add Matchall filter
- Send IFA_CACHEINFO when setting up addresses
- Support IPv6 GRE Tun and Tap
- Add List option to RouteSubscribeWithOptions, AddrSubscribeWithOptions, and LinkSubscribeWithOptions
- Add Fq and Fq_Codel Qdisc support
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
RFC434 states that DNS Servers should be case insensitive
This commit makes sure that all DNS queries will be translated
to lower ASCII characters and all svcRecords will be saved in
lower case to abide by the RFC
Relates to https://github.com/moby/moby/issues/21169
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
This test was producing error messages due to missing endpoints
in the plugin API;
```
=== RUN TestValidRemoteDriver
ERRO[0039] error getting capability for valid-network-driver due to NetworkDriver.GetCapabilities: 404 page not found
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
systemd and udev in their default configuration attempt to set a
persistent MAC address for network interfaces that don't have one
already [systemd-def-link]. We set the address only after creating the
interface, so there is a race between us and udev. There are several
outcomes (that actually occur, this race is very much not a theoretical
one):
* We set the address before udev gets to the networking rules, so udev
sees `/sys/devices/virtual/net/docker0/addr_assign_type = 3`
(NET_ADDR_SET). This means there's no need to assign a different
address and everything is fine.
* udev reads `/sys/devices/virtual/net/docker0/addr_assign_type` before
we set the address, gets `1` (NET_ADDR_RANDOM), and proceeds to
generate and set a persistent address.
Old versions of udev (pre-v242, i.e. without [udev-patch]) would then
fail to generate an address, spit out "Could not generate persistent
MAC address for docker0: No such file or directory" (see [udev-issue],
and everything would be probably fine as well.
Current version of udev (with [udev-patch]) will generate an address
just fine and then race us setting it. As udev does more work than we,
the most probable outcome is that udev will overwrite the address we
set and possibly cause some trouble later on.
On a clean Debian Buster (from Vagrant) VM with systemd/udev 242 from
Debian Experimental, `docker network create net1` up to `net7` resulted
in 3 bridges having a 02:42: address and 4 bridges having a seemingly
random (actually generated from interface name) address. With systemd
241, the result would be all bridges having a 02:42:, but some "Could
not generate persistent MAC address for" messages in the log.
The fix is to revert the MAC address setting fix from 6901ea51dc,
as it is no longer necessary with current netlink [netlink-addr-add],
and set the address atomically when creating the bridge interface, not
after that.
[systemd-def-link]: a166cd3aac/network/99-default.link
[udev-patch]: 6d36464065
[udev-issue]: https://github.com/systemd/systemd/issues/3374
[netlink-addr-add]: 7d9b424492
...
Do note that a similar race happens when creating veth devices as well.
I wasn't able to reproduce getting a wrong (non-02:42:) address,
possibly because the address is set by docker later, maybe only after
the interface is moved to another network namespace (but I'm just
guessing here). Still, different timings result in various error
messages being logged ("link_config: could not get ethtool features for
vethd9c938e" and the like) depending on when the interface disappears
from the primary network namespace. I'm not sure how to fix this and I
don't intend to dig deeper into this.
Signed-off-by: Tomas Janousek <tomi@nomi.cz>
The endpoint count for --config-only networks
was being incremented even when the respective --config-from
inherited network failed to create a network
This was due to a variable shadowing problem with err causing
the deferred function to not execute correctly.
Using the same err variable across the entire function fixes
the issue
Fixes: moby/moby#35101
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Starting from 18.09, there's a per node LB for each overlay
network, this change adds the check to node LB.
This change should not break on older docker versions.
Signed-off-by: Xinfeng Liu <xinfeng.liu@gmail.com>
For overlay, l2bridge, and l2tunnel, if the user does not specify a host port, windows driver will select a random port for them. This matches linux behavior.
For ics and nat networks the windows OS will choose the port.
Signed-off-by: Pradip Dhara <pradipd@microsoft.com>
This reverts commit 7adcd856fe.
Libnetwork should only use the iptables binary. Iptables v1.8 and above
uses the nftables backend. The translations for all the rules used by
libnetwork is supported by the new iptables binary.
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
containerd is now running as a separate service, and should
no longer be started as a managed child-process of dockerd.
The dockerd service already specifies that it should be started
`After` the containerd.service, but there is still a race
condition, where containerd is started, but its socket is not yet
created.
In that situation, `dockerd` detects that the containerd socket
is missing, and will start a new instance of containerd (as a
managed child-process), which causes live-restore to fail.
This patch explicitly sets the `--containerd` daemon option.
If this option is set, `dockerd` will not start a new instance
of containerd.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Minor changes following review of the engine pull request
for this feature;
- Remove the name of the function from the error message
as it's not a debug message.
- Add the valid range to the error message, so that a
user has sufficient information to address the problem.
- Update GoDoc for the function to describe the default
port, and valid port-ranges.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It is possible that the node is not yet present in
the node list map. In this case just print a warning
and return. The next iteration would be fine
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Introduce "com.docker.network.bridge.inhibit_ipv4" option to the bridge
network driver. If set, this option will prevent docker from setting or
modifying Layer-3 (IP) configuration on the bridge interface in any way.
This option should allow connecting containers to pre-existing network
segments (with e.g., pre-existing default gateways) while simultaneously
preserving our ability to communicate with the host and/or configure the
properties of the host-side container virtual network interface (e.g.,
delay/loss/jitter via netem), which can not be done using macvlan.
Signed-off-by: Gabriel Somlo <gsomlo@gmail.com>
Without this the docker.socket would not start by default when starting
the docker.service leading to failures to start.
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Removes the systemd drop-in unit file for socket activation and instead
prefers socket activation by default for both RHEL based and DEBIAN
based distributions.
Socket activation for RHEL based distributions was tested on CentOS 7 and Fedora 28.
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
This PR chnages allow user to configure VxLAN UDP
port number. By default we use 4789 port number. But this commit
will allow user to configure port number during swarm init.
VxLAN port can't be modified after swarm init.
Signed-off-by: selansen <elango.siva@docker.com>
- NXDOMAIN is an authoritive answer, so when receiving an NXDOMAIN, we're done.
From RFC 1035: Name Error - Meaningful only for responses from an authoritative
name server, this code signifies that the domain name referenced in the query
does not exist.
FROM RFC 8020: When an iterative caching DNS resolver receives an NXDOMAIN
response, it SHOULD store it in its cache and then all names and resource
record sets (RRsets) at or below that node SHOULD be considered unreachable.
Subsequent queries for such names SHOULD elicit an NXDOMAIN response.
- REFUSED can be a transitional status: (https://www.ietf.org/rfc/rfc1035.txt)
The name server refuses to perform the specified operation for
policy reasons. For example, a name server may not wish to provide the
information to the particular requester, or a name server may not wish to
perform a particular operation (e.g., zone)
Other errors are now logged as debug-message, which can be useful for
troubleshooting.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Allow DSR to be a configurable option through a generic option to the
overlay driver. On the one hand this approach makes sense insofar as
only overlay networks can currently perform load balancing. On the
other hand, this approach has several issues. First, should we create
another type of swarm scope network, this will prevent it working.
Second, the service core code is separate from the driver code and the
driver code can't influence the core data structures. So the driver
code can't set this option itself. Therefore, implementing in this way
requires some hack code to test for this option in
controller.NewNetwork.
A more correct approach would be to make this a generic option for any
network. Then the driver could ignore, reject or be unaware of the option
depending on the chosen model. This would require changes to:
* libnetwork - naturally
* the docker API - to carry the option
* swarmkit - to propagate the option
* the docker CLI - to support the option
* moby - to translate the API option into a libnetwork option
Given the urgency of requests to address this issue, this approach will
be saved for a future iteration.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Modify the loadbalancing for east-west traffic to use direct routing
rather than NAT and update tasks to use direct service return under
linux. This avoids hiding the source address of the sender and improves
the performance in single-client/single-server tests.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Since SvcStats represents the stats for a `Service`, we don't want
to reuse that struct in the `Destination` (for no other reason than
incompatible nomenclature). So this patch adds a `DstStats` struct
to hold the Destination stats.
When dockerd.exe is not stopped cleanly (such as when Windows is
restarted), the endpoints are not cleaned up. When using a transparent
network, the endpoint IPv4 address is blank. When dockerd.exe starts up
again, libnetwork restores the endpoint, which would not have been
stored on a clean shutdown of dockerd.exe. That fails because the IPv4
address is blank. This change warns instead of failing.
Signed-off-by: John Stephens <johnstep@docker.com>
IPVLAN driver had been retained in experimental for multiple releases
with the requirement to have a proper L3 control-plane (such as BGP) to
go along with it which will make this driver much more useful. But
based on the community feedback,
https://github.com/moby/moby/issues/21735, am proposing to move this
driver out of experimental.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Set the PATH to what appears to be the standard on latest Ubuntu (18.04)
and Debian (9), fixing the following two issues:
1. PATH did not contain /bin (leading to ContainerTop/ps not working
on newer distros, among the other things).
2. $PATH can't be specified in Environment directives in .service files.
While at it, also:
3. Remove the comment about RPM as it looks misleading on deb-based
systems.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
As well as bumping, libkv now requires go.etcd.io/bolt rather
than boltdb/bolt. Hence removed bolt from vendor.conf,
vendored go.etcd.io/bbot @ v1.3.1-etcd.8 and rerun vndr.
Removes the need for the offline installer to install the shim process
and instead installs the shim process as part of the packaging.
May be easier in the future to just package the shim process on it's own
but that'll come after this 18.09 release
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
```
make lint
networkdb/networkdb_test.go:88:2: should replace t.Error(fmt.Sprintf(...)) with t.Errorf(...)
networkdb/networkdb_test.go:136:2: should replace t.Error(fmt.Sprintf(...)) with t.Errorf(...)
make: *** [lint] Error 1
```
Signed-off-by: Lei Gong <lgong@alauda.io>
Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229
(6bf0f408e4)
both the old, and new location are accepted by systemd 229 and up, so using the old location
to make them work for either version of systemd.
StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230
(f0367da7d1)
both the old, and new name are accepted by systemd 230 and up, so using the old name to make
this option work for either version of systemd.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds support for reloading the docker daemon
(SIGHIUP) so that changes in '/etc/docker/daemon.json'
can be loaded at runtime by reloading the service
through systemd ('systemctl reload docker')
Before this change, systemd would output an error
that "reloading" is not supported for the docker
service;
systemctl reload docker
Failed to reload docker.service: Job type reload is not applicable for unit docker.service.
After this change, the docker daemon can be reloaded
through 'systemctl reload docker', which reloads
the configuration;
journalctl -f -u docker.service
May 02 03:49:20 testing systemd[1]: Reloading Docker Application Container Engine.
May 02 03:49:20 testing docker[28496]: time="2016-05-02T03:49:20.143964103-04:00" level=info msg="Got signal to reload configuration, reloading from: /etc/docker/daemon.json"
May 02 03:49:20 testing systemd[1]: Reloaded Docker Application Container Engine.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Change the kill mode to process so that systemd does not kill container
processes when the daemon is shutdown but only the docker daemon
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We need to add delegate yes to docker's service file so that it can
manage the cgroups of the processes that it launches without systemd
interfering with them and moving the processes after it is reloaded.
Delegate=
Turns on delegation of further resource control partitioning to
processes of the unit. For unprivileged services (i.e. those
using the User= setting), this allows processes to create a
subhierarchy beneath its control group path. For privileged
services and scopes, this ensures the processes will have all
control group controllers enabled.
This is the proper fix for issue moby/moby#20152
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Systemd sets a default of 512 tasks, which is far
too low to run many containers.
Note that TasksMax is only supported on systemd 226
and above.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There is a not-insignificant performance overhead for all containers (if
containerd is a child of Docker, which is the current setup) if systemd
sets rlimits on the main Docker daemon process (because the limits
propogate to all children).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
set LimitCORE=infinity to ensure complete core creation,
allows extraction of as much information as possible.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
(cherry picked from commit 51879873897afe298cbb736acef34b5a0b500424)
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
ipamutils has two default address pool. Instead of allowing them to
be accessed directly, adding get functions so that other packages
can use get APIs.
Signed-off-by: selansen <elango.siva@docker.com>
This change brings global default address pool feature into
libnetwork. Idea is to reuse same code flow and functions that were
implemented for local scope default address pool.
Function InitNetworks carries most of the changes. local scope default
address pool init should always happen only once. But Global scope
default address pool can be initialized multiple times.
Signed-off-by: selansen <elango.siva@docker.com>
Old versions of things on CentOS 7 strike again!
infinity is not a thing for TimeoutSec on systemd < 229
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Go 1.10 fixed the problem related to thread and namespaces.
Details:
2595fe7fb6
In few words there is no more the possibility to have a go routine
running on a thread that is another namespace.
In this commit some cleanup is done and the method SetNamespace is
being removed. This will save tons of setns syscall, that were happening
way too frequently possibily to make sure that each operation was being
done in the host namespace.
I suspect that also all the drivers not running in a different
namespace would be able to drop also the lock of the OS Thread but
will address it in a different commit
Removed useless LockOSThreads around
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Change the sandbox IDs for the sandboxes of load-balancing endpoints to
be "lb_XXXXXXXXX" where XXXXXXXXX is the network ID that this sandbox
load balances for. This makes it easier to find these sandboxes in
/var/run/docker/netns and thus makes debugging easier.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Internal directory is designed to contain libraries
that are exclusively used by this project
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Instead of using "sync.Once" to determine whether to initialize a
network sandbox or subnet sandbox, we use a traditional mutex +
initialization boolean. This is because the initialization state isn't
truly a once-and-done condition. Rather, libnetwork destroys network
and subnet sandboxes when the last endpoint leaves them. The use of
sync.Once in this kind of scenario requires, therefore, re-initializing
the Once which is impoissible. So the approach that libnetwork
currently takes is to use a pointer to a Once and redirect that pointer
to a new Once on reset. This leads to nasty race conditions.
In addition to refactoring the locking, this patch merges the functions
joinSandbox(), and joinSubnetSandbox(). This makes the code both cleaner
and it also holds the network and subnet locks through the series of
read-modify-writes avoiding further potential races. This does reduce
the potential parallelism which could be applied should there be many
joins coming in on many different subnets in the same overlay network.
However, this should be an extremely minor performance hit for a very
obscure case.
One important pattern in this commit is that it is crucial to avoid
sending peerDB messages while holding a driver or network lock. The
changes herein defer such (asynchronous) notifications until after
release of such locks. This prevents deadlocks where the peerDB
blocks acquiring said locks while the network method blocks trying
to send to the peerDB's channel.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
'make check' will now fail if the files produced by re-running protoc
differ from those which are checked into the repository.
Signed-off-by: Euan Harris <euan.harris@docker.com>
Outside the build container, run: make protobuf
Inside the build container, run: make protobuf-local
Signed-off-by: Euan Harris <euan.harris@docker.com>
This makes it possible to Ctrl-C tests and builds again. Zombie
processes will also be reaped correctly.
Signed-off-by: Euan Harris <euan.harris@docker.com>
The previous code used string slices to limit the length of certain
fields like endpoint or sandbox IDs. This assumes that these strings
are at least as long as the slice length. Unfortunately, some sandbox
IDs can be smaller than 7 characters. This fix addresses this issue
by systematically converting format string calls that were taking
fixed-slice arguments to use a precision specifier in the string format
itself. From the golang fmt package documentation:
For strings, byte slices and byte arrays, however, precision limits
the length of the input to be formatted (not the size of the output),
truncating if necessary. Normally it is measured in runes, but for
these types when formatted with the %x or %X format it is measured
in bytes.
This nicely fits the desired behavior: it will limit the number of
runes considered for string interpolation to the precision value.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
TestOverlappingRequests checks that pool requests which are supersets or
subsets of existing allocations, and those which overlap with existing
allocations at the beginning or the end.
Multiple allocation is now tested by TestOverlappingRequests, so
TestDoublePoolRelease only needs to test double releasing.
Signed-off-by: Euan Harris <euan.harris@docker.com>
Added some optimizations to reduce the messages in the queue:
1) on join network the node execute a tcp sync with all the nodes that
it is aware part of the specific network. During this time before the
node was redistributing all the entries. This meant that if the network
had 10K entries the queue of the joining node will jump to 10K. The fix
adds a flag on the network that would avoid to insert any entry in the
queue till the sync happens. Note that right now the flag is set in
a best effort way, there is no real check if at least one of the nodes
succeed.
2) limit the number of messages to redistribute coming from a TCP sync.
Introduced a threshold that limit the number of messages that are
propagated, this will disable this optimization in case of heavy load.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
instead of printing the whole option, print the _number_ only,
because that's what the error-message is pointing at;
Before this change:
invalid number for ndots option ndots:foobar
After this change:
invalid number for ndots option: foobar
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Refactor the ostweaks file to allows a more easy reuse
Add a method on the osl.Sandbox interface to allow setting
knobs on the sandbox
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
`ndots:0` is a valid DNS option; previously, `ndots:0` was
ignored, leading to the default (`ndots:0`) also being applied;
Before this change:
docker network create foo
docker run --rm --network foo --dns-opt ndots:0 alpine cat /etc/resolv.conf
nameserver 127.0.0.11
options ndots:0 ndots:0
After this change:
docker network create foo
docker run --rm --network foo --dns-opt ndots:0 alpine cat /etc/resolv.conf
nameserver 127.0.0.11
options ndots:0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "message" argument in assert.Equal expects a format
string; the current string was not that, resulting in an
incorrect message being printed;
--- FAIL: TestDNSOptions (1.28s)
Location: service_common_test.go:92
Error: Not equal: "ndots:5" (expected)
!= "ndots:0" (actual)
Messages: The option must be ndots:5 instead:%!(EXTRA string=ndots:0)
This patch removes the message altogether, because assert.Equal
already prints enough information to catch the error;
--- FAIL: TestDNSOptions (1.28s)
Location: service_common_test.go:92
Error: Not equal: "ndots:5" (expected)
!= "ndots:0" (actual)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add debug and error logs to notify when a load balancing sandbox
is not found. This can occur in normal operation during removal.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Lock the network ID in the controller during an addServiceBinding to
prevent racing with network.delete(). This would cause the binding to
be silently ignored in the system.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
This is the heart of the scalability change for services in libnetwork.
The present routing mesh adds load-balancing rules for a network to
every container connected to the network. This newer approach creates a
load-balancing endpoint per network per node. For every service on a
network, libnetwork assigns the VIP of the service to the endpoint's
interface as an alias. This endpoint must have a unique IP address in
order to route return traffic to it. Traffic destined for a service's
VIP arrives at the load-balancing endpoint on the VIP and from there,
Linux load balances it among backend destinations while SNATing said
traffic to the endpoint's unique IP address.
The net result of this scheme is that each node in a swarm need only
have one set of load balancing state per service instead of one per
container on the node. This scheme is very similar to how services
currently operate on Windows nodes in libnetwork. It (as with Windows
nodes) costs the use of extra IP addresses in a network (one per node)
and an extra network hop in the stack, although, always in the stack
local to the container.
In order to prevent existing deployments from suddenly failing if they
failed to allocate sufficient address space to include per-node
load-balancing endpoint IP addresses, this patch preserves the existing
functionality and activates the new functionality on a per-network
basis depending on whether the network has a load-balancing endpoint.
Eventually, moby should always set this option when creating new
networks and should only omit it for networks created as part of a swarm
that are not marked to use endpoint load balancing.
This patch also normalizes the code to treat "load" and "balancer"
as two separate words from the perspectives of variable/function naming.
This means that the 'b' in "balancer" must be capitalized.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
This was passing extra information and adding confusion about the
purpose of the load balancing structure.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
New load balancing code will require ability to add aliases to
load-balncer sandboxes. So this broadens the OSL interface to allow
adding aliases to any interface, along with the facility to get the
loopback interface's name based on the OS.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
This method returns the name of the interface from the perspective
of the host OS pre-container. This will be required later for
finding matching a sandbox's interface name to an endpoint which
is, in turn, requied for adding an IP alias to a load balancer
endpoint.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Error unwinding only works if the error variable is used consistently
and isn't hidden in the scope of other if statements.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Default gateways truncate the endpoint name to 12 characters. This can
make network endpoints ambiguous especially for load-balancing sandboxes
for networks with lenghty names (such as with our prefixes). Address
this by detecting an overflow in the sanbox name length and instead
opting to name the gateway endpoint "gateway_<id>" which should never
collide.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Change the Delete() method to take optional options and add
NetworkDeleteOptionRemoveLB as one such option. This option allows
explicit removal of an ingress network along with its load-balancing
endpoint if there are no other endpoints in the network. Prior to this,
the libnetwork client would have to manually search for and remove the
ingress load balancing endpoint from an ingress network. This was, of
course, completely hacky.
This commit will require a slight modification in moby to make use of
the option when deleting the ingress network.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Fix related to bug: https://github.com/docker/for-linux/issues/348
We should perform updateToStore(ep) after n.addEndpoint or do update twice,
otherwise response from network plugin will not be written to KV storage.
This results in container creation with broken network config.
Signed-off-by: Siarhei Rasiukevich <raskintech@gmail.com>
Most of the libcontainer imports was just for a single test to marshal a
simple type, meanwhile this caused all kinds of transient imports that
are not really needed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit a07a1ee9ccdf4c5a3a90eea9fd359f10b5156c84)
Signed-off-by: selansen <elango.siva@docker.com>
The use of `Client()` on v2 plugins is being deprecated so that we can
be more flexible on the protocol used for plugins.
This means checking specifically if the plugin implements the
`Client() *plugins.Client` interface for V1 plugins, and for v2 plugins
building a the client manually.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 45824a226b8a220d6f189c2d25fe16f9efc83db9)
Signed-off-by: selansen <elango.siva@docker.com>
agent.pb.go is unchanged, but the files in networkdb and drivers
are slightly different when regenerated using the current versions
of protoc and gogoproto. This is probably because agent.pb.go
was last regenerated quite recently, in February 2018, whereas
networkdb.pb.go and overlay/overlay.pb.go were last changed in 2017,
and windows/overlay/overlay.pb.go was last changed in 2016.
Signed-off-by: Euan Harris <euan.harris@docker.com>
Previous logic was not accounting that each node is
in the node list so the bootstrap nodes won't retry
to reconnect because they will always find themselves
in the node map
Added test that validate the gossip island condition
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
* Update dependencies to match moby master; add new sub-dependencies
as necessary.
* Update moby to latest
* Update gocapability
This moves gocapability beyond the version vendored in moby;
presumably the code which requires this particular version
is not used in moby and is removed by vndr. Moby will need
to be updated as well.
Signed-off-by: Euan Harris <euan.harris@docker.com>
moby/moby commit b27f70d45 wraps the ErrNotFound error returned when
a plugin cannot be found, to include a backtrace. This changes the
type of the error, so contoller.loadDriver no longer converts it to a
libnetwork plugin.NotFoundError. This causes a couple of tests which
inspect the return type to fail; most code only checks whether the
error is non-nil and is not affected by the change in type.
Signed-off-by: Euan Harris <euan.harris@docker.com>
Add a test to confirm that the pool allocator will iterate through all
the pools even if some earlier ones were freed before coming back to
previously allocated pools.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
This commit prevents subnets from being reused at least initially,
instead favoring to cycle through them as we do with addresses within a
subnet.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Commit b64997ea prevented data corruption due to simultaneous
driver.CreateNetwork()/driver.DeleteNetwork() by holding the network
lock through the read/modify part of the operation. However, part of
the DeleteNetwork operation entails sending a message to the peerDB to
tell that goroutine to flush entries on deletion. This can lead to a
deadlock where:
* driver.DeleteNetwork() starts and acquires driver.Lock()
* peerDB receives some other request (e.g. EventNotify) and blocks
on driver.Lock()
* driver.DeleteNetwork() attempts a peerDB flush and blocks waiting
on the synchronous peerDB operation channel
This patch fixes the issue by deferring the peerDB flush operation until
after DeleteNetwork() unlocks driver.Lock(). Commit b64997ea only
modified CreateNetwork() and DeleteNetwork() and the critical section
that driver.Lock() protects in CreateNetwork() does not perform any
peerDB notifications or other locks of driver data structures. So this
solution should be a complete fix for any regressions introduced in
b64997ea.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Make sure that iptables operations on ingress
are serialized.
Before 2 racing routines trying to create the ingress chain
were allowed and one was failing reporting the chain as
already existing.
The lock guarantees that this condition does not happen anymore
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The system should remove cluster service info including networkDB
entries and DNS entries for container endpoints that are not part of a
service as well as those that are part of a service. This used to be
the normal sequence of operations but it moved to
sandbox.DisableService() in an effort to more gracefully handle endpoint
removal from a service (which proved insufficient). Unfortunately
subsequent changes also removed the newly-mandetory call to
sandbox.DisableService() preventing proper cleanup for non-service
container endpoints.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Go 1.7 added the subtest feature which can make table-driven tests much easier to run and debug. Some tests are not using this feature.
Signed-off-by: Yang Li <idealhack@gmail.com>
Use net.splitHostPort() instead of our own logic in func (p *PortBinding)
FromString(s string) error. This means that IPv6 literals, including
IPv4 in IPv6 literals, can now be parsed from the string form of
PortBindings. Zoned addresses do not work - net.splitHostPort() parses
them but net.ParseIP() cannot and returns an error. This is ok because
we do not have a slot to store the zone name in PortBinding anyway.
Signed-off-by: Euan Harris <euan.harris@docker.com>
The function tested by TestUtilGetHostPortionIP is called GetHostPartIP.
Rename the test to match the function being tested.
Signed-off-by: Euan Harris <euan.harris@docker.com>
Releasing a pool which has already been released should fail; this
change increases coverage by a fraction by exercising this path.
Signed-off-by: Euan Harris <euan.harris@docker.com>
This makes it easy to drop into the build container, for instance to
run tests or other Go tools over a subset of the code.
Signed-off-by: Euan Harris <euan.harris@docker.com>
Multiple simultaneous updates here would leave the driver in a very
inconsistent state. The disadvantage to this change is that it requires
holding the driver lock while reprogramming the keys.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
This one is probably not critical. The worst that seems like could
happen would be if 2 deletes occur at the same time (one of which
should be an error):
1. network gets read from the map by delete-1
2. network gets read from the map by delete-2
3. delete-1 releases the network VNI
4. network create arrives at the driver and allocates the now free VNI
5. delete-2 releases the network VNI (error: it's been reallocated!)
6. both networks remove the VNI from the map
Part 6 could also become an issue if there were a simultaneous create
for the network at the same time. This leads to the modification of
the NewNetwork() method which now checks for an existing network before
adding it to the map.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
The overlay network driver is not properly using it's mutexes or
sync.Onces. It made the classic mistake of not holding a lock through
various read-modify-write operations. This can result in inconsistent
state storage leading to more catastrophic issues.
This patch attempts to maintain the previous semantics while holding the
driver lock through operations that are read-modify-write of the
driver's network state.
One example of this race would be if two goroutines tried to invoke
d.network() after the network ID was removed from the table. Both would
try to reinstall it causing the "once" to get reinitialized twice
without any lock protection. This could then lead to the "once" getting
invoked twice on the same network. Furthermore, the changes to one of
these network structures gets effectively discarded. It's also the
case, that because there would be two simultaneous instances of the
network, the various network Lock() invocations would be meaningless for
race prevention.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
This gets filtered for raw iptables calls, but not from calls made
through firewalld. The patch just ensures consistency of operation.
It also adds a warning when xtables contention detected and truncates
the search string slightly as it appears that the suffix will be
changing in the near future.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Since Go 1.7, context is a standard package. Since about Go 1.9 time,
all x/net/context provides is a few aliases to types in context, meaning
"x/net/context" and "context" can be mixed freely.
Some vendored packages still use x/net/context, so vendor entry remains
for now.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This version includes "x/net/context" which is fully compatible with
the standard Go "context" package, so the two can be mixed together.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
`handleNodeEvent` is calling `changeNodeState` which writes to various
maps on the ndb object.
Using a write lock prevents a panic on concurrent read/write access on
these maps.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This patch replaces awk with cut to workaround issues present with
running this script within ucp-dsinfo.
Signed-off-by: Kyle Squizzato <kyle.squizzato@docker.com>
This patch attempts to allow endpoints to complete servicing connections
while being removed from a service. The change adds a flag to the
endpoint.deleteServiceInfoFromCluster() method to indicate whether this
removal should fully remove connectivity through the load balancer
to the endpoint or should just disable directing further connections to
the endpoint. If the flag is 'false', then the load balancer assigns
a weight of 0 to the endpoint but does not remove it as a linux load
balancing destination. It does remove the endpoint as a docker load
balancing endpoint but tracks it in a special map of "disabled-but-not-
destroyed" load balancing endpoints. This allows traffic to continue
flowing, at least under Linux. If the flag is 'true', then the code
removes the endpoint entirely as a load balancing destination.
The sandbox.DisableService() method invokes deleteServiceInfoFromCluster()
with the flag sent to 'false', while the endpoint.sbLeave() method invokes
it with the flag set to 'true' to complete the removal on endpoint
finalization. Renaming the endpoint invokes deleteServiceInfoFromCluster()
with the flag set to 'true' because renaming attempts to completely
remove and then re-add each endpoint service entry.
The controller.rmServiceBinding() method, which carries out the operation,
similarly gets a new flag for whether to fully remove the endpoint. If
the flag is false, it does the job of moving the endpoint from the
load balancing set to the 'disabled' set. It then removes or
de-weights the entry in the OS load balancing table via
network.rmLBBackend(). It removes the service entirely via said method
ONLY IF there are no more live or disabled load balancing endpoints.
Similarly network.addLBBackend() requires slight tweaking to properly
manage the disabled set.
Finally, this change requires propagating the status of disabled
service endpoints via the networkDB. Accordingly, the patch includes
both code to generate and handle service update messages. It also
augments the service structure with a ServiceDisabled boolean to convey
whether an endpoint should ultimately be removed or just disabled.
This, naturally, required a rebuild of the protocol buffer code as well.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
The golang.org/x/sync package was vendored using the
github.com/golang/sync URL, but this is not the canonical
URL.
Because of this, vendoring failed in Moby, as it detects
these to be a duplicate import:
vndr github.com/golang/sync
2018/03/14 11:54:37 Collecting initial packages
2018/03/14 11:55:00 Download dependencies
2018/03/14 11:55:00 Failed to parse config: invalid config format: // FIXME this should be golang.org/x/sync, which is already vendored above
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit contains fixes for duplicate IP with 3 issues addressed:
1) Race condition when datastore is not present in cases like swarmkit
2) Byte Offset calculation depending on where the start of the bit
in the bitsequence is, the offset was adding more bytes to the offset
when the start of the bit is in the middle of one of the instances in
a block
3) Finding the available bit was returning the last bit in the curent instance in
a block if the block is not full and the current bit is after the last
available bit.
Signed-off-by: Abhinandan Prativadi <abhi@docker.com>
Add a simple check and a summary report for the support script.
Report:
==SUMMARY==
Processed 3 networks
IP overlap found: 1
Processed 167 containers
Overlap found:
*** OVERLAP on Network 0ewr5iqraa8zv9l4qskp93wxo ***
2 "192.168.1.138",
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
We should not delete an ingress network just because its endpoint count
drops to 1 (the IP address of the sandbox). This addresses a regression
where the ingress sandbox could be deleted on workers when the last
container leave said sandbox.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Set a limit to the max size of the transient log to avoid
filling up logs in case of issues
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Usually a diagnostic session wants to check the local state
without this flag the network is joined and left every iteration
altering actually the daemon status.
Also if the diagnostic client is used against a live node, the
network leave has a very bad side effect of kicking the node
out of the network killing its internal status.
For the above reason introducing the flag -a to be explicit
so that the current state is always preserved
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
This is new feature that allows user to specify which subnetwork
Docker contrainer should choose from when it creates bridge network.
This libnetwork commit is to address moby PR 36054
Signed-off-by: selansen <elango.siva@docker.com>
Previously, support script dumped the host iptables filter/nat tables,
and each overlay network's network inspect and 'bridge fdb show' and
'brctl showmacs'. Now we collect much more information. Support script
dumps iptables filter/nat/mangle, routes and interfaces from iproute2,
bridge fdb table, & ipvsadm table, for the host and containers/overlay
networks on the host. We also dump a redacted copy of the container
health check status and other debugging information for each container,
in JSON format, and 'docker network inspect -v' for each overlay, if the
client/server support the -v flag.
Signed-off-by: ada mancini <ada@docker.com>
Setting ndots to 0 does not allow to resolve search domains
The default will remain ndots:0 that will directly resolve
services, but if the user specify a different ndots value
just propagate it into the container
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
- the client allows to talk to the diagnostic server and
decode the internal values of the overlay and service discovery
- the tool also allows to remediate in case of orphans entries
- added README
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Avoid waiting for a double notification once a node rejoin, just
put it back to active state. Waiting for a further message does not
really add anything to the safety of the operation, the source of truth
for the node status resided inside memberlist.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The netlink deserialize is fetching information from the link.
This require the go routine to be in the correct namespace to
succeed
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
This PR contains a fix for moby/moby#30321. There was a moby/moby#31142
PR intending to fix the issue by adding a delay between disabling the
service in the cluster and the shutdown of the tasks. However
disabling the service was not deleting the service info in the cluster.
Added a fix to delete service info from cluster and verified using siege
to ensure there is zero downtime on rolling update of a service.
Signed-off-by: abhi <abhi@docker.com>
Swarm mode does not really have anymore a use for the watchMiss.
Peer entries are configured at configuration time.
If the gcthresh denies the insertion the peerAdd will fail.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
In sandbox creation we disable IPV6 config. But this causes problem in live-restore case
where all IPV6 configs are wiped out on running container. Hence extra check has been added
take care of this issue.
Signed-off-by: selansen <elango.siva@docker.com>
Created method to handle the node state change with cleanup operation
associated.
Realign testing client with the new diagnostic interface
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
VETH interface was not cleaned up when DockerD got killed between addEndpoint and updateToStore calls.
I have added logs and made sure calling updateToStore before addEndpoint contains same values.
Hence moving up the call looks safer and VETH gets cleaned up even after DockerD gets killed in the middle.
Signed-off-by: selansen <elango@docker.com>
This is the right way to call for a clean shutdown
Return application/json as content-type when appropriate
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
In case the file descriptor of the netlink socket is closed
the recvfrom is not returning. This may create deadlock conditions.
The current solution is to make sure that all the netlink socket used
have a proper timeout set on them to have the possibility to return
Added test to emulate the watchMiss condition
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
If a node leave, avoid to notify the upper layer
for entries that are already marked for deletion
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
This commit introduces the possibility to enable a debug mode
for the networkDB, this will allow the opening of a tcp port
on localhost that will expose the networkDB api for debugging
purposes.
The API can be discovered using curl localhost:<port>/help
It support json output if passed json as URL query parameter
option and pretty printing if passing json=pretty
All the binaries values are serialized in base64 encoding, this
can be skip passing the unsafe option as url query parameter
A simple go client will follow up
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The previous logic was not properly handling the case of a node
that was failing and oining back in short period of time.
The issue was in the handling of the network messages.
When a node joins it sync with other nodes, these are passing
the whole list of nodes that at best of their knowledge are part
of a network. At this point if the node receives that node A is part
of the network it saves it before having received the notification
that node A is actually alive (coming from memberlist).
If node A failed the source node will receive the notification
while the new joined node won't because memberlist never advertise
node A as available. In this case the new node will never purge
node A from its state but also worse, will accept any table notification
where node A is the owner and so will end up in a out of sync state
with the rest of the cluster.
This commit contains also some code cleanup around the area of node
management
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Update Dockerfile, curl is used for the healthcheck
Add /dump for creating the routine stack trace
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
endpoint_cnt object is created during network create and destroyed when
network is deleted. But the updateToStore function creates an object
when it is not present in the store. endpoint_cnt is a mutable object
and is updated during endpoint create and delete events. If endpoint
create or delete happens after the network is deleted, it can
incorrectly create an endpoint_cnt object in the store and that can
cause problems when the same network is created again later.
The fix is to not create the endpoint_cnt object when endpoint_cnt is
incremented or decremented
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This patch updates all dependencies to match what is
used in moby/moby. Making the dependencies match
what is used in that repository makes sure we test
with the same version as libnetwork is later built
with in moby.
This also gets rid of some temporary forks that were
needed during the migration of Sirupsen/logrus to
sirupsen/logrus.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Solaris support for Docker will likely not reach completion,
so removing these files as they are not in use and not
maintained.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Addresses failure to collect iptables information if lock is held during
data capture. Follows the reccomendation of iptables stderr in this
scenario:
```
Another app is currently holding the xtables lock. Perhaps you want to
use the -w option?
```
Signed-off-by: Trapier Marshall <trapier.marshall@docker.com>
Validate that passing an option into the daemon config
does not corrupt the option set into the container resolv.conf
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
- Create a test to verify that a node that joins
in an async way is not going to extend the life
of a already deleted object
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
When ndots was being explicitely passed in the daemon conf
the configuration landing into the container was corrupted
e.g. options ndots:1 ndots:0
The fix just removes the user option so that is not replicated
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Attachable containers they are tasks with no service associated
their cleanup was not done properly so it was possible to have
a leak of their name resolution if that was the last container
on the network.
Cleanupservicebindings was not able to do the cleanup because there
is no service, while also the notification of the delete arrives
after that the network is already being cleaned
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
In case of merge commit, the sha passed to the codecov tool
is the one of the merged commit intstead of the merge commit
this creates error because the base commit is always different.
Passing it explicitely should fix it
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Without `-n`, iptables will attempt to lookup hostnames for IP
addresses, which can slow down the call dramatically.
Since we don't need this, and generally don't even care about the
output, use the `-n` flag to disable this.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Avoid error logs in case of local peer case, there is no need for deleteNeighbor
Avoid the network leave to readvertise already deleted entries to upper layer
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
In case of IP reuse locally there was a race condition
that was leaving the overlay namespace with wrong configuration
causing connectivity issues.
This commit introduces the use of setMatrix to handle the transient
state and make sure that the proper configuration is maintained
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The comparison was against the wrong constant value.
As described in the comment the check is there to guarantee
to not propagate events realted to stale deleted elements
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Since bit allocation is no longer first available from
the start some verfications are removed/modified to
the change allocation model
Signed-off-by: Abhinandan Prativadi <abhi@docker.com>
Previously the bitseq alloc was allocating the first available bit from the
begining of the sequence. With this commit the bitseq alloc will proceed
from the current allocation. This change will affect the way ipam and vni
allocation is done currently. The ip allocation will be done sequentially
from the previous allocation as opposed to the first available IP.
Signed-off-by: Abhinandan Prativadi <abhi@docker.com>
When execute 'docker swarm init' and 'docker swarm leave -f' on a node
repeatedly, the (*Broadcaster).run goroutine leak.
Signed-off-by: yangchenliang <yangchenliang@huawei.com>
Separate the hostname from the node identifier. All the messages
that are exchanged on the network are containing a nodeName field
that today was hostname-uniqueid. Now being encoded as strings in
the protobuf without any length restriction they plays a role
on the effieciency of protocol itself. If the hostname is very long
the overhead will increase and will degradate the performance of
the database itself that each single cycle by default allows 1400
bytes payload
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Make sure that the network is garbage collected after
the entries. Entries to be deleted requires that the network
is present.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
- Changed the loop per network. Previous implementation was taking a
ReadLock to update the reapTime but now with the residualReapTime
also the bulkSync is using the same ReadLock creating possible
issues in concurrent read and update of the value.
The new logic fetches the list of networks and proceed to the
cleanup network by network locking the database and releasing it
after each network. This should ensure a fair locking avoiding
to keep the database blocked for too much time.
Note: The ticker does not guarantee that the reap logic runs
precisely every reapTimePeriod, actually documentation says that
if the routine is too long will skip ticks. In case of slowdown
of the process itself it is possible that the lifetime of the
deleted entries increases, it still should not be a huge problem
because now the residual reaptime is propagated among all the nodes
a slower node will let the deleted entry being repropagate multiple
times but the state will still remain consistent.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
- Added remainingReapTime field in the table event.
Wihtout it a node that did not have a state for the element
was marking the element for deletion setting the max reapTime.
This was creating the possibility to keep the entry being resync
between nodes forever avoding the purpose of the reap time
itself.
- On broadcast of the table event the node owner was rewritten
with the local node name, this was not correct because the owner
should continue to remain the original one of the message
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The package updated and now shows new warnings that had to be corrected
to let the CI pass
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The CreateNetwork in the bridge driver was not able to properly
handle concurrent operations causing 2 issues:
1) crash from nil pointer exception
2) not proper handling of conflicting configuration
This commit addresses the 2 previous mentioned issues
and adds a test for it.
The test with the original code has a low failure frequency
to confirm the fix I had to add a time.Sleep in the body of the
CreateNetwork so to have a 100% failure
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
`/etc/resolv.conf` is not an essential file in filesystem. (see
http://man7.org/linux/man-pages/man5/resolv.conf.5.html)
> If this file does not exist, only the name server on the local machine
> will be queried
It's baffling to users that containers can start with an empty
`resolv.conf` but cannot without this file.
This PR:
* ignore this error and use default servers for containers in `bridge`
mode networking.
* create an empty resolv.conf in `/var/lib/docker/containers/<id>` in
`host` mode networking.
Signed-off-by: Yuanhong Peng <pengyuanhong@huawei.com>
Prevents an issue where the goroutine may jump to a new OS thread during
execution putting it into a mount/network NS that is unexpected.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 6d8617d8757a759d806a3307ca04d4d588c04aed)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Avoid negative numbers and also set a lower bondary.
500 will mean 400 bytes minimum payload that will allow
at least a couple of gossip message to fit.
There is not theoretical limit becasue the message is made of
strings so there is still the possibility to have cases where
the 400 bytes are not enough to fit a single message, but
in that case we should start thinking why do I need a node
name that is long as an enciclopedia
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
In the peerDelete the updateDB flag was always true
In the peerAdd the updateDB flag was always true except for
the initSandbox case. But now the initSandbox is handled by the
go routing of the peer operations, so we can move that flag
down and remove it from the top level functions
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The peerDbDelete was passing the wrong field to the underlay
Delete operation causing the mac entry to not being deleted
from the bridge on the overlay. This caused connectivity issue
when a container that before was remote was now scheduled
on the local node. The entry was such:
bridge fdb show | grep -i 02:42:0a:01:00:02
02:42:0a:01:00:02 dev vxlan0 master br0
02:42:0a:01:00:02 dev vxlan0 dst 172.31.14.63 link-netnsid 0 self permanent
That was still pointing to a remove node
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Move the sandbox init logic into the go routine that handles
peer operations.
This is to avoid deadlocks in the use of the pMap.Lock for the
network
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Remove the need for the wait group and avoid new
locks
Added utility to print the method name and the caller name
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
This patch improves debugging for the resolver;
- prefix debug messages with `[resolver]` for easier finding in the daemon logs
- use `A` / `AAAA` for query-types in the logs instead of their numeric code
- add debug messages if the external DNS did not return a result
- print sucessful results (t.b.d.)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before when a node was failing, all the nodes would bump the lamport time of all their
entries. This means that if a node flap, there will be a storm of update of all the entries.
This commit on the base of the previous logic guarantees that only the node that joins back
will readvertise its own entries, the other nodes won't need to advertise again.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
join/leave fixes:
- when a node leaves the network will deletes all the other nodes entries but will keep track of its
to make sure that other nodes if they are tcp syncing will be aware of them being deleted. (a node that
did not yet receive the network leave will potentially tcp/sync)
add network reapTime, was not being set locally
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Remove the need for the wait group and avoid new
locks
Added utility to print the method name and the caller name
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Deletion of the dynamic mac is expected to work only if there was active
traffic with that endpoint and a dynamic entry exists. It can also age
out. Hence the mac removal failing is not error. Removing it to make the
debugging easier when parsing the logs.
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
- Diagnose framework that exposes REST API for db interaction
- Dockerfile to build the test image
- Periodic print of stats regarding queue size
- Client and server side for integration with testkit
- Added write-delete-leave-join
- Added test write-delete-wait-leave-join
- Added write-wait-leave-join
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
- Introduce the possibility to specify the max buffer length
in network DB. This will allow to use the whole MTU limit of
the interface
- Add queue stats per network, it can be handy to identify the
node's throughput per network and identify unbalance between
nodes that can point to an MTU missconfiguration
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
A rapid (within networkReapTime 30min) leave/join network
can corrupt the list of nodes per network with multiple copies
of the same nodes.
The fix makes sure that each node is present only once
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
neighbor entries. On an l3 miss try to reprogram the neighbor entry
if the peer is valid. Its a best effort attempt because if the arp
table is still at gc_thresh3 value, addition will fail.
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
Commit ca9a768d80
added a number of debugging messages for node join/leave
events.
This patch checks if a node already was listed,
and otherwise skips the logging to make the logs a bit
less noisy.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
* Correct SetMatrix documentation
The SetMatrix is a generic data structure, so the description
should not be tight to any specific use
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
* Service Discovery reuse name and serviceBindings deletion
- Added logic to handle name reuse from different services
- Moved the deletion from the serviceBindings map at the end
of the rmServiceBindings body to avoid race with new services
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
* Avoid race on network cleanup
Use the locker to avoid the race between the network
deletion and new endpoints being created
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
* CleanupServiceBindings to clean the SD records
Allow the cleanupServicebindings to take care of the service discovery
cleanup. Also avoid to trigger the cleanup for each endpoint from an SD
point of view
LB and SD will be separated in the future
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
* Addressed comments
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
* NetworkDB deleteEntry has to happen
If there is an error locally guarantee that the delete entry
on network DB is still honored
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Flavio has been contributing various useful features in Docker 17.05
and 17.06 releases and also an active maintainer who helps with various
bug fixes and PR reviews
Signed-off-by: Madhu Venugopal <madhu@docker.com>
In accordance with the logic for SD, remove the ipvs rules
only when there is no more endpoints using the IP
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
changed the ipMap to SetMatrix to allow transient states
Compacted the addSvc and deleteSvc into a one single method
Updated the datastructure for backends to allow storing all the information needed
to cleanup properly during the cleanupServiceBindings
Removed the enable/disable Service logic that was racing with sbLeave/sbJoin logic
Add some debug logs to track further race conditions
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The byteoffset calculation was skewed to double include
the offset value calculated. The double calculation
happens if the starting ordinal is part of the head
sequence block. This error in calculation could result
in duplicate but getting allocated eventually propogating
to ipam and vni id allocations
Signed-off-by: Abhinandan Prativadi <abhi@docker.com>
SetMatrix is a simple matrix of sets.
Added tests
This data structure will be used in following commit to handle
transient states where the same key can momentarely be associated
to more than a value
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The feature was not getting properly triggered, move it as
first operation in the configure
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The netlink socket that was used to monitor the L2
miss was never being closed. The watchMiss goroutine
spawned was never returning. This was causing goroutine
leak in case of createNetwork/destroyNetwork
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
On linux systems bump up gc_thresholds so to lower the
probability of running with neighbor table overflow issues
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The channel ch.C is never closed.
Added the listen of the ch.Done() to guarantee
that the goroutine is exiting once the event channel
is closed
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The commit contains fix for the issue reported in
https://github.com/moby/moby/issues/33415 and
https://github.com/docker/libnetwork/issues/1772. With the
feature introduced to support local scope networks in swarm
mode the network configuration to include ipam driver was overriden
in libnetwork. This has been removed with this fix which will allow
ipam-driver option to be used for task allocation
Signed-off-by: Abhinandan Prativadi <abhi@docker.com>
When sandbox is deleting, another SetKey routine could be also in
progress as there's no lock to protect it, when this happens, there
could be a scene that one sandbox is removed, but it's osSbox file
"/var/run/docker/netns/xxxx" left on system and will never be cleaned.
So add a inDelete check for SetKey() to eliminate the race.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
The time to keep a node failed into the failed node list
was originally supposed to be 24h.
If a node leaves explicitly it will be removed from the list of nodes
and put into the leftNodes list. This way the NotifyLeave event won't
insert it into the retry list.
NOTE: if the event is lost instead the behavior will be the same as a failed node.
If a node fails, the NotifyLeave will insert it into the failedNodes
list with a reapTime of 24h. This means that the node will be checked
for 24h before being completely forgot. The current check time is every
1 second and is done by the reconnectNode function.
The failed node list is updated every 2h instead.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Allow users to configure firewall policies in a way that persists
docker operations/restarts. Docker will not delete or modify any
pre-existing rules from the DOCKER-USER filter chain. This allows
the user to create in advance any rules required to further
restrict access from/to the containers.
Fixesdocker/docker#29184Fixesdocker/docker#23987
Related to docker/docker#24848
Signed-off-by: Jacob Wen <jian.w.wen@oracle.com>
- Orchestrator interaction with the network driver is limited
to at most allocation/release of simple resources. For local scope
drivers all what is needed is the retrieval of the driver scope.The
full driver code base does not need to be pulled into the orschestrator.
This PR introduces a dedicated package in each builtin nw
driver for that purpose, as it was done for overlay driver.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- It specifies whether the network driver can
provide containers connectivity across hosts.
- As of now, the data scope of the driver was
being overloaded with this notion.
- The driver scope information is still valid
and it defines whether the data allocation
of the network resources can be done globally
or only locally.
- With the scope network option, user can now
force a network as swarm scoped
regardless of the driver data scope.
- In case the network is configured as swarm scoped,
and the network driver is multihost capable,
a network DB instance will be launched for it.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- They are configuration-only networks which
can be used to supply the configuration
when creating regular networks.
- They do not get allocated and do net get plumbed.
Drivers do not get to know about them.
- They can be removed, once no other network is
using them.
- When user creates a network specifying a
configuration network for the config, no
other network specific configuration field
is are accepted. User can only specify
network operator fields (attachable, internal,...)
- They do not need to have a driver field, that
field gets actually reset upon creation.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Use the string concatenation operator instead of using Sprintf for
simple string concatenation. This is usually easier to read, and allows
the compiler to detect problems with the type or number of operands,
which would be runtime errors with Sprintf.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Memberlist does a full validation of the protocol version (min, current, max)
amoung all the ndoes of the cluster.
The previous code was setting the protocol version to max version.
That made the upgrade incompatible.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
This change cleans up the SetClusterProvider method.
Swarm calls the SetClusterProvider to pass to libnetwork the pointer
of the provider from which libnetwork can fetch all the information to
initialize the internal agent.
The method can be and is called multiple times passing the same value,
with the previous logic that was erroneusly spawning multiple go routines that
were making possiblea race between an agentInit and an agentClose.
The new logic aims to disallow it by checking for the provider passed and
ensuring that if the provider is already present there is nothing to do because
there is already an active go routine that is ready to process cluster events.
Moreover a patch on moby side takes care of clearing up the Cluster Events
dispacthing using only 1 channel to handle all the events types.
This will also guarantee in order event handling because now all the events are
piped into one single channel.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Change in the provider interface to let the provider
return the whole list of managers.
This will allow the netwrok db to have multiple choice
to establish the first adjacencies
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
- Otherwise operation will unnecessarely block
for five seconds.
- This is particularly noticeable on graceful
shutdown of daemon in one node cluster.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- concurrent swarm join and daemon stop seen in
integration tests may cause agentSetup to access
a nil clusterProvider, resulting in a panic
Signed-off-by: Alessandro Boch <aboch@docker.com>
I was reading over the description of the Endpoint and it struck me as a bit odd:
`An Endpoint can belong to *only one* network but may only belong to *one* Sandbox.`
I just wanted to rephrase it so that it's clear that an Endpoint has a one to one relationship with the Sandbox and a Network. If that is not the case, then I'm sorry for proposing the change. I'm only just starting to take a deeper dive into Docker networking.
Signed-off-by: Sebastian Radloff <sradloff23@gmail.com>
During configuration in SWARM mode is now possible to pass an additional
parameter --data-path-addr <ip|interface>.
The information is going to be used to configure which is the interface
that is going to be used for the data path for global scope drivers.
Up to now the only driver really using this extra parameter is the
overlay driver.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
- Currently if the join fails, the gw endpoint becomes
stale and stays connected to the gw network.
- Also fix sbJoin to do the cleanup in case
setupDefaultGW() fails
Signed-off-by: Alessandro Boch <aboch@docker.com>
Flush all the endpoint flows when the external
connectivity is removed.
This will prevent issues where if there is a flow
in conntrack this will have precedence and will
let the packet skip the POSTROUTING chain.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
- The info it provides can be found elsewhere
The logs gets printed too often becasue of
the programming being done in the tasks
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Now that docker has the code to release the ingress
network, have docker do the release on cluster leave
and on graceful daemon shutdown.
This is a cleaner approach in line with the cleanup
triggered by who created the resource and will avoid
races on ingress network removal as revealed by the
docker tests.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Because the failure would not be on creating the osl sandbox
(which is done by somebody else). It would be on the programming
libnetwork does on the osl sandbox. In case of failure just report
the error. External caller will take care of removing the parent sandbox
via the cleanup on the error handling path. Otherwise the osl sandbox
will never be removed.
Signed-off-by: Alessandro Boch <aboch@docker.com>
vndr has been updated and now pulls in license files and readmes.
This just re-runs with the latest version so vendoring is up to date.
Should cut down on changes from real vendor commit updates.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- Do not run the risk of suppressing meaningful messages
for the rest of the cluster, as a many services depend
on it, like the service records and the distributed
load balancers.
Signed-off-by: Alessandro Boch <aboch@docker.com>
I saw a rare race during the first few calls to iptables module
where some of them would reenter initCheck() after the first call
to it already changed iptablesPath, but before the rest of the function
completed (in particular the long execs into testing for availability
of --wait flag and determining iptables version), resulting in
failure of one or more of iptables calls that did not use --wait and
were concurrent.
To fix the problem, this change gathers all one-time initialization into a
single function under a sync.Once instead of using a global variable
as a "done initializing" flag before initialization is done. sync.Once
guarantees all concurrent calls will block until the first one completes.
In addition, it turns out that GetVersion(), called from initCheck(), used
Raw() which called back into initCheck() via raw(), which did not cause a
problem in the earlier implementation but deadlocked when initialization became
strict. This was changed to use a direct call, similar to initialization of
supportsXlock.
Signed-off-by: Max Timchenko <max@maxvt.com>
- Currently when a non-named container with network aliases
is connected to a swarm attachable network, its aliases are
not added to the service records.
This is not in line with what we do when connecting to
a local scope network, or to a kv-store based overlay network.
Signed-off-by: Alessandro Boch <aboch@docker.com>
getNetworksFromStore reads networks and endpoint_cnt from the kvstores.
endpoint_cnt especially is read in a for-loop for each network and that
causes a lot of stress in poorly performing KV-Stores.
This fix eases the load on the kvstore by fetching all the endpoint_cnt
in a single read and the operation is performed on it.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
With the introduction of networkdb, the node discovery events were not
sent to the drivers. This commit generates the node discovery events and
sents it to the drivers interested in it.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- do not error on duplicate service removal
- give some context to service logs,
this would help debugging related issues
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Do not relay on software flags to decide when to create the
virtual service. Instead query the kernel for presence.
So that it cannot happen that a real server creation
fails because the virtual server is missing.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- the function is broken as it does not strip the
zone id from an IPv6 nameserver address, and it
returns the IPv6 address with /32
Signed-off-by: Alessandro Boch <aboch@docker.com>
Fixdocker/docker#27539
After io.Copy(to, from), we should call to.CloseWrite(), not to.CloseRead().
Without this fix, TestTCP4ProxyHalfClose (newly added in this commit) fails as
follows:
--- FAIL: TestTCP4ProxyHalfClose (0.00s)
network_proxy_test.go:135: EOF
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
drvRegistry isnt aware if a plugin is v1 or v2. Plugin-v2 provides a way
for user to disable and remove plugins. But unfortunately, there isnt
any api to advertise the removal to drvRegistry. Hence there is no way
to handle "docker plugin rm" of installed plugin. In order to support
the case of "docker plugin install x" followed by "docker plugin rm x"
followed by reinstalling of plugin x "docker plugin install x",
drvRegistry must allow overriding any existing plugin with the same
name. The protection in plugin infra will prevent willful override of
existing plugin.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
With Plugin-V2, plugins can get activated before remote driver is
Initialized. Those plugins fails to get registered with drvRegistry.
This fix handles that scenario
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This fixes an issue where using a fqdn as hostname
not being added to /etc/hosts.
The etchosts.Build() function was never called
with an IP-address, therefore the fqdn was not
added.
The subsequent updateHostsFile() was not updated
to support fqdn's as hostname, and not adding
the record correctly to /etc/hosts.
This patch implements the functionality in
updateHostsFile()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Right now, items logged by memberlist end up as a complete log line
embedded inside another log line, like the following:
Nov 22 16:34:16 hostname dockerd: time="2016-11-22T16:34:16.802103258-08:00" level=info msg="2016/11/22 16:34:16 [INFO] memberlist: Marking xyz-1d1ec2dfa053 as failed, suspect timeout reached\n"
This has two time and date stamps, and an escaped newline inside the
"msg" field of the outer log message.
To fix this, define a custom logger that only prints the message itself.
Capture this message in logWriter, strip off the log level (added
directly by memberlist), and route to the appropriate logrus method.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
With the introduction of GetIDInRange function in IDM and using it in
ovmanager, the idm.New was modified to start from 1. But that causes
issues when the network is removed which results in releasing the
vxlan-id from IDM. With the offset of 1, the Release call incorrectly
releases a bit which could be in use by another network and this results
in the infamous "error creating vxlan interface: file exists" errors
when another network is created with this freed bit.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- Disable ipv6 on all interface by default at sandbox creation.
Enable IPv6 per interface basis if the interface has an IPv6
address. In case sandbox has an IPv6 interface, also enable
IPv6 on loopback interface.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- iptables pkg functions are coded to discard
the xtables_lock error message about acquiring
the lock, because all the calls are done with
the wait logic. But the error message has
slightly changed between iptables 1.4.x and 1.6.
This lead to false positives causing docker
network create to fil in presence of concurrent calls.
- Fixed message mark to be common among the two main versions.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Fix import name to use original project name 'logrus' instead of 'log'
Removing `f` from `logrus.Debugf` when formatting string is not present.
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
Ping on VIP has been behaving inconsistently depending on if a task
for a service is local or remote.
With this fix, the ICMP echo-request packets to service VIP are replied
to by the NAT rule to self
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.
Also fix import name to use original project name 'logrus' instead of
'log'
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
Without this change, specifying a remote network driver on Windows results in "could not resolve driver X in registry" error.
Signed-off-by: Michal Kostrzewa <kostrzewa.michal@o2.pl>
To make it consistent with windows and linux workers
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Fixed build breaks
Signed-off-by: msabansal <sabansal@microsoft.com>
- This fixes a panic in memberlist.Leave() because called
after memberlist.shutdown = false
It happens because of two interlocking calls to NetworkDB.clusterLeave()
It is easily reproducible with two back-to-back calls
to docker swarm init && docker swarm leave --force
While the first clusterLeave() is waiting for sendNodeEvent(NodeEventTypeLeave)
to timeout (5 sec) a second clusterLeave() is called. The second clusterLeave()
will end up invoking memberlist.Leave() after the previous call already did
the same, therefore after memberlist.shutdown was set false.
- The fix is to have agentClose() acquire the agent instance and reset the
agent pointer right away under lock. Then execute the closing/leave functions
on the agent instance.
Signed-off-by: Alessandro Boch <aboch@docker.com>
This fix tries to address the issue raised in:
https://github.com/docker/docker/issues/26341
where multiple addresses in a bridge may cause `--fixed-cidr` to
not have the correct addresses.
The issue is that `netutils.ElectInterfaceAddresses(bridgeName)`
only returns the first IPv4 address.
This fix changes `ElectInterfaceAddresses()` and `addresses()`
so that all IPv4 addresses are returned. This will allow the
possibility of selectively choose the address needed.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This simplifies how we build it in docker/docker as no vendoring needed,
and this does program not use any logrus features.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Currently there is an instance of controller and service lock being
obtained in different order which causes the AB/BA deadlock. Do not ever
wrap controller lock around service lock.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- and update it to store. Otherwise after an ungraceful shutdown,
at next boot there will be in store two bridge endpoints with
same port-mapping data. When bridge driver will try to restore
the endpoints, there will be conflicts and a container with
restart policy could fail to start.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- When docker is run inside a container, the infrastructure
needed by modprobe is not always available, causing the
xfrm module load to fail even when these modules are already
loaded or builtin in the kernel.
- In case of probe failure, before declaring the failure,
run an API check by attempting the creation of
a NETLINK_XFRM socket.
Signed-off-by: Alessandro Boch <aboch@docker.com>
There is no guarantees that the ep and extEp are using the same driver.
If they are not using the same drivers, the driver for ep will not know
about the networks of extEp and fails the RevokeExternalConnectivity
call.
Signed-off-by: Shayan Pooya <shayan@liveve.org>
Do not add service discovery names to ingress network as this is a
routing only network and no intra-cluster discovery should happen in
that network. This fixes the ambiguity and resolving names between
services which are both publishing ports and also attached to same
another network.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Once the bulksync ack channel is closed remove it from the ack table
right away. There is no reason to keep it in the ack table and later
delete it in the ack waiter. Ack waiter anyways has reference to the
channel on which it is waiting.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
As part of daemon init, network and ipam drivers are passed a
pluginstore object that implements the plugin/getter interface. Use this
interface methods in libnetwork to interact with network plugins. This
interface provides the new and improved pluginv2 functionality and falls
back to pluginv1 (legacy) if necessary.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
When a gossip join failure happens do not return early in the call chain
because a join failure is most likely transient and the retry logic
built in the networkdb is going to retry and succeed. Returning early
makes the initialization of ingress network/sandbox to not happen which
causes a problem even after the gossip join on retry is successful.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Since the node name randomization fix, we need to make sure that we
purge the old node with the same prefix and same IP from the nodes
database if it still present. This causes unnecessary reconnect
attempts.
Also added a change to avoid unnecessary update of local lamport time
and only do it of we are ready to do a push pull on a join. Join should
happen only when the node is bootstrapped or when trying to reconnect
with a failed node.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When plumbing overlay filter rules serialize this to make sure that
multiple sandbox join or leave is not causing erroneous behavior while
moving the RETURN rule in the predefined chains.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
* TestwriteJSON and TestontainerInvalidLeave were never executed due to the typos. Recent govet found them.
* TestWriteJSON was failing due to the comparison between string and []byte. Also, it didn't considered that json.Encode appends LF.
* TestContainerInvalidLeave was faling due to a typo
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
If user provided a non-zero listen address, honor that and bind only to
that address. Right now it is not honored and we always bind to all ip
addresses in the host.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
With port redirect in the ingress path happening before ipvs in the
ingess sandbox, there is a chance of 5-tuple collision in the ipvs
connection table for two entirely different services have different
PublishedPorts but the same TargetPort. To disambiguate the ipvs
connection table, delay the port redirect from PublishedPort to
TargetPort until after the loadbalancing has happened in ipvs. To be
specific, perform the redirect after the packet enters the real backend
container namespace.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently, a reference counting scheme is used to reference count all
individual port configs that need to be plumbed in the ingress to make
sure that in situations where a service with the same set of port
configs is getting added or removed doesn't accidentally remove the port
config plumbing if the add/remove notifications come out of order. This
same reference counting scheme is also used for plumbing the port-based
marking rules. But marking rules should not be plumbed based on that
because marks are always different for different instantiations of the
same service. So fixed the code to plumb port-based mark rules based on
the complete set of port configs, while plumbing pure port rules and
proxies based on a filter set of port configs based on the reference
count.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently if there is any transient gossip failure in any node the
recoevry process depends on other nodes propogating the information
indirectly. In cases if these transient failures affects all the nodes
that this node has in its memberlist then this node will be permenantly
cutoff from the the gossip channel. Added node state management code in
networkdb to address these problems by trying to rejoin the cluster via
the failed nodes when there is a failure. This also necessitates the
need to add new messages called node event messages to differentiate
between node leave and node failure.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When dynamic networks are created and there is a race in creation of the
same network from two different tasks then one of them will fail while
the other will succeed. For service tasks this is not a big problem
because they will be rescheduled again. But for attachment tasks this
can be a problem since they won't get recreated and making the whole
connection fail. Fixed it by serializing network creation for the
network with the same id and trying to see if the id is present after
coming out of wait.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This reverts commit b042dbe312.
The original commit breaks s390x, for example Docker build fails:
* https://github.com/docker/docker/issues/26440
As discussed in the above issue:
Even though char is unsigned by default on s390x, (gcc)go forces the type
of RawSockaddr.Data to be signed.
It makes no practical difference if these fields are signed or unsigned,
it's just an API issue.
The (assumed) reason for the original commit:
For a while RawSockaddr.Data was unsigned during development of the gcc
s390x port (not in an upstream release though). Probably the patch has
been developed in this time frame.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Since docker/docker removed mflag package and libnetwork relies on it
create a copy of mflag package in libnetwork project.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently the endpoint count is being decremented before the driver
cleanup and more importantly before releasing the ip address. This is
racy as it creates a time window where we already have decremented the
endpoint count and so the network can be deleted now. But we haven't
released the IP address yet and the pool is already gone. Although there
is no harm done since the pool is already gone. it generates unnecessary
error message about not able to release the address. Also if the driver
cleanup fails we really should not decrement endpoint count.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When stale delete notifications are received, we still need to make sure
to purge sandbox neighbor cache because these stale deletes are most
typically out of order delete notifications and if an add for the
peermac was received before the delete of the old peermac,vtep pair then
we process that and replace the kernel state but the old neighbor state
in the sandbox cache remains. That needs to be purged when we finally
get the out of order delete notification.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When the libnetwork controller is not in distributed control mode avoid
retaining stale sandboxes when the network cannot be retrieved from
store. This ratining logic is only applicable for an independent k/v
store which manages libnetwork state. In such case the k/v store may be
temporarily unavailable so there is a need to retain the sandbox so that
the resource cleanup happens properly.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This also allows pubslied services to be accessible from containers on bridge
networks on the host
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
Avoid by reinitializing the channel immediately after closing the
channel within a lock. Also change the wait code to cache the channel in
stack be retrieving it from controller and wait on the stack copy of the
channel.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This script gathers some basic information from a system that might
be useful to help troubleshoot problems. If added into an image
including the proper binaries, running looks something like this:
docker run --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /var/run/docker/netns:/var/run/docker/netns \
--privileged --net=host nwsupport /bin/support
Signed-off-by: Daniel Hiltgen <daniel.hiltgen@docker.com>
Avoid the whole store endpoint update logic when running in swarm mode
and the endpoint is part of a global scope network. Currently there is
no store update that is happening for global scope networks in swarm
mode, but this code path will delete the svcRecords database when the
last endpoint on the network is removed which is something that is not
required.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When leaving the entire gossip cluster or when leaving a network
specific gossip cluster, we may not have had a chance to cleanup service
bindings by way of gossip updates due to premature closure of gossip
channel. Make sure to cleanup all service bindings since we are not
participating in the cluster any more.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently the initDone notification is provided immediately after
initializing the cluster. This may be fine for the first manager. But
for all subsequent nodes which join the cluster we need to wait until
the node completes the joining to the gossip cluster inorder to
synchronize the gossip network clock with other nodes. If we don't have
uptodate clock the updates that this node provides to the cluster may be
discarded by the other nodes if they have entries which are yet to be
reaped but have a better clock.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
In cases a node left the cluster and quickly rejoined before the node
entry is expired by other nodes in the cluster, when the node rejoins we
fail to add it to the quick lookup database. Fixed it.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
In networkdb we should ignore delete events for entries which doesn't
exist in the db. This is always true because if the entry did not exist
then the entry has been removed way earlier and got purged after the
reap timer and this notification is very stale.
Also there were duplicate delete notifications being sent to the
clients. One when the actual delete event was received from gossip and
later when the entry was getting reaped. The second notification is
unnecessary and may cause issues with the clients if they are not
coded for idempotency.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When a node leaves the swarm cluster, we should cleanup the ingress
network and sandbox. This makes sure that when the next time the node
joins the swarm it will be able to update the cluster with the right
information.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
The SNAT rules added for LB egress is broader and breaks load balancing
if the service is connected to multiple networks. Make it conditional
based on the subnet to which the network belongs so that the right SNAT
rule gets matched when egressing the corresponding network.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Fixed certain spurious overlay errors which were not errors at all but
showing up everytime service tasks are started in the engine.
Also added a check to make sure a delete is valid by checking the
incoming endpoint id wih the one in peerdb just to make sure if the
delete from gossip is not stale.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Make service loadbalancing to work from within one of the containers of
the service. Currently this only works when the loadbalancer selects the
current container. If another container of the same service is chosen,
the connection times out. This fix adds a SNAT rule to change the source
IP to the containers primary IP so that responses can be routed back to
this container.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Ingress loadbalancer is only required to be plumbed in ingress sandboxes
of nodes which are the only mechanism to get traffix outside the cluster
to tasks. Since the tasks are part of ingress network, these
loadbalancers were getting added in all tasks which are exposing ports
which is totally unnecessary resource usage. This PR avoids that.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If not enough keys are provided to SetKeys, this may cause a panic. This
should not cause problems with the current integration in Docker 1.12.0,
but the panic might happen loading data created by an earlier version,
or data that is corrupted somehow. Add a length check to be defensive.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Sometimes you may get stale backend removal notices from gossip due to
some lingering state. If a stale backend notice is received and it is
already processed in this node ignore it rather than processing it.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
The CopyTo function for joininfo is not copying the driver table entries
which then is missing when the endpoint is re-read for the store cache.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If a remote plugin returns an empty string in response to RequestAddress(),
the internal helper will return nil which will crash libnetwork in several
places.
Treat an empty string as a new error ipamapi.ErrNoIPReturned.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
- We need to compare the node notification IP with
the advertise address otherwise when the advertise
address is different from the local address (this
is for the public address outside of the host
that maps 1-to-1 to the local private address)
the local IP will be acocunted as an ipsec host
and extra states will be programmed for it.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- When creating a non encrypted overlay network,
make sure no encryption related mangle rule from
stale network is on the way.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Because of a bug in the netlink xfrm code, our code will
fail to find and remove the states. While we could wait
for the netlink library fix, there is no longer a need to
convert the parsed IP addresses to the canonical notation
given the previous SPI computation (which worked on that
4 byte address assumption) is now replaced by the fnv hash.
- Also modify driver option that enables ipsec to "encrypted"
Signed-off-by: Alessandro Boch <aboch@docker.com>
With this change, all the auto-detection of the addresses are removed
from libnetwork and the caller takes the responsibilty to have a proper
advertise-addr in various scenarios (including externally facing public
advertise-addr with an internal facing private listen-addr)
Signed-off-by: Madhu Venugopal <madhu@docker.com>
the UDS sock is an unique file and the lifetime of it is until the
docker daemon dies (gracefully). Hence there is no need for it to be
under /var/lib and not mandatory to be configurable either.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
A network is added to the `d.networks` map before it's fully initialized. That
is, it's possible for a network in `d.networks` to exist without having
`bridgeIPv4` populated yet. If multiple networks are spun up close to the same
time, a panic can occur.
Example:
```
panic(0x1a75d20, 0xc82000e090)
/usr/local/go/src/runtime/panic.go:443 +0x4e9
net.networkNumberAndMask(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/local/go/src/net/ip.go:433 +0x42
net.(*IPNet).Contains(0x0, 0xc82084dbd0, 0x4, 0x4, 0xc820010200)
/usr/local/go/src/net/ip.go:457 +0x25
github.com/docker/libnetwork/drivers/bridge.(*networkConfiguration).conflictsWithNetworks(0xc822249360, 0xc822761380, 0x40, 0xc820866a60, 0x4, 0x4, 0x0, 0x0)
/root/rpmbuild/BUILD/docker-engine/vendor/src/github.com/docker/libnetwork/drivers/bridge/bridge.go:334 +0x40b
```
Signed-off-by: Andy Lindeman <alindeman@salesforce.com>
Rather than re-execing docker as the proxy, create a new command docker-proxy
that is much smaller to save memory in the case where there are a lot of
procies being created. Also allows the proxy to be replaced, for example
in Docker for Mac we have a proxy that proxies to osx instead of locally.
This is the vendoring pull for https://github.com/docker/docker/pull/23312
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
When deleting entries or when learning about deleted entries remember
then for a longer time to avoid excessive delete duplicates in the
gossip cluster. Also added code changes to ignore event messages
originated from the source node so that it doesn't get added into the
rebroadcast queue.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
While scaling down, currently we are removing the service record even if
the LB entry for the vip is not fully removed. This causes resolution
issues when scaling down. Fixed it by removing the service record only
if the LB for the vip is going away.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently ovmanager simply logs an error when there is a vni allocation
failure. Instead it should error out and free all the previously
allocated vnis
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
While trying to update loadbalancer state index the service both on id
and portconfig. From libnetwork point of view a service is not just
defined by its id but also the ports it exposes. When a service updates
its port its id remains the same but its portconfigs change which should
be treated as a new service in libnetwork in order to ensure proper
cleanup of old LB state and creation of new LB state.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- If an endpoint is forcibly removed, it should not
matter whether the locator info is present. If
the daemon was started w/o the --cluster-advertise
option (the option is not mandatory), then the
locator would be empty for any endpoint.
Signed-off-by: Alessandro Boch <aboch@docker.com>
If xfrm modules cannot be loaded:
- Create netlink.Handle only for ROUTE socket
- Reject local join on overlay secure network
Signed-off-by: Alessandro Boch <aboch@docker.com>
Currently even outgoing connection requests are matched to inject into
DOCKER-INGRESS chain. This is not correct because it disrupts access to
services outside the host on the same service port. Instead inject only
the locally destined packets towards DOCKER-INGRESS chain.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Due to a slice management logic error the bulk sync for loop can go on
indefinitely and eventually leading to an OOM error. Fixed the logic so
that an infinite loop never occurs. Also changed the bulk sync wait
timeout to use a timer rather than use time.After as time.After is known
to consume a lot of memory when called in a tight loop.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When adding a loadbalancer to a sandbox, the sandbox may have a valid
namespace but it might not have populated all the dependent network
resources yet. In that case do not populate that endpoint's loadbalancer
into that sandbox yet. The loadbalancer will be populated into the
sandbox when it is done populating all the dependent network resources.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When a node goes away purge all the network attachments from the node
and make sure we don't attempt bulk syncing to that node once removed.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Since the datastore interface is common for persistent and
non-persistent objects we need to provide the same kind of sequencing
and atomicity guarantess to non-persistent data operations as we do for
persistent operations. So added sequencing and atomicity checks in the
data cache layer.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When a node goes away we purge all the table entries that we learned
from that node but we don't notify the watchers about it. Made sure we
notify the watchers when this happens.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If we cleaned up a stale network sandbox and an entry for that exists in
vniTbl, then purge it from vniTbl. Otherwise when a new vxlan for that
vni is added to the network, we might destroy the network sandbox
created in the current life.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When broadcasting table event, make sure the broadcast queue is
valid. The network may have been removed while in the process of sending
the broadcast.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If a miss notification arrives on a network's miss go routine currently
it is unconditionally processed. This is unnecessary and can be bad if
there are too many misses. This is especially true for hostmode. Fix
this by filtering out misses that doesn't belong to any of the network's
subnets.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If the IPAM pools are not reserved before resource cleanup happens then
the resource release will not happen correctly.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
The wait in bulkSyncNode was meant for bulkSync initiator. Not for
responder. Fix the incorrect code which was also waiting unnecessarily
on response which it will never get and will eventually time out.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Right now if no vip is provided only when a new loadbalancer is created
we add the service records of the backend ip. But it should happen all
the time. This is to make DNS RR on service name work.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When a goroutine which is adding the service and another which is adding
just a destination interleave the destination which is dependent on the
service may not get added and will result in service working at reduced
scale. The fix is to synchronize this with the service mutex.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Moved ingress port forwarding rules to its own chain
- Flushed the chain during init
- Bound to the swarm ports so no hijacks it.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
In the current implementation, the local peers are being added as remote
peers so gets added to the vxlan neighbor and fdb table. This causes the
local forwarding to get stuck for a few seconds after the bridge mac
table entries for the local peers get aged out. This PR fixes the
problem.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When leaving a cluster the agentInitDone should be re-initialized so tha
when a new cluster is initialized this is usable.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If a new network request is received for a prticular vni, cleanup the
interface with that vni even if it is inside a namespace. This is done
by collecting vni to namespace data during init and later using it to
delete the interface.
Also fixed a long pending issue of the vxlan interface not getting
destroyed even if the sandbox is destroyed. Fixed by first deleting the
vxlan interface first before destroying the sandbox.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Agent initialization wait method is added to make sure callers for
controller methods which depend on agent initialization to be complete
can wait on it.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Also do not log error messages when adding a destination and it already
exists. This can happen because of duplicate gossip notifications.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Ingress load balancer is achieved via a service sandbox which acts as
the proxy to translate incoming node port requests and mapping that to a
service entry. Once the right service is identified, the same internal
loadbalancer implementation is used to load balance to the right backend
instance.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This PR adds support for loadbalancing across a group of endpoints that
share the same service configuration as passed in by
`OptionService`. The loadbalancer is implemented using ipvs with just
round robin scheduling supported for now.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
With docker 1.11 based dnet tests, the dnet_exec is sometimes exiting
with exit code 129 because the process is getting a SIGHUP. Although the
reason or source of the SIGHUP is unknown, it is making the tests flaky
because non-zero exit code. Fixed it by trapping SIGHUP inside the
container so that we can run the test code successfully.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Convert all networkdb core message types from go message types to
protobuf message types. This faciliates future modification of the
message structure without breaking backward compatibility.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Earlier this was guaranteed by ipam driver intialization
which was not creating a global address space if the
global datastore was missing. Now that ipam address spaces
can be initialized with no backing datastore, insert an
explicit check in libnetwork, which should have been there
regardless.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Also restore older behavior where overlap check is not run
when preferred pool is specified. Got broken by recent changes
Signed-off-by: Alessandro Boch <aboch@docker.com>
During sandboxcleanup operation, a dummy object is created if the
kv-store is inaccessible during the daemon bootup. The dummy object is
used for local processing of sandbox/endpoint cleanup operation. But
unfortunately, since the persist flag was not set, the Skip()
functionality kicked-in when sandbox was written back to the store and
global endpoint was skipped to be tracked. During a subsequent cleanup
operation, sandbox was removed and the global endpoint was left stale in
the kv-store.
The fix is to set the persist flag in the dummy object and that handles
the sandbox and endpoint states appropriately and endpoint object is
properly cleaned up when the KVStore becomes available.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Add a notion of service in libnetwork so that a group of endpoints
which form a service can be treated as such so that service level
features can be added on top. Initially as part of this PR the support
to assign a name to the said service is added which results in DNS
queries to the service name to return all the IPs of the backing
endpoints so that DNS RR behavior on the service name can be achieved.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- When creating a network with both IPv4 and IPv6 subnets,
if the allocation of the IPv6 pool fails, the already
reserved IPv4 pool does not get released.
Signed-off-by: Alessandro Boch <aboch@docker.com>
libnetwork agent mode is a mode where libnetwork can act as a local
agent for network and discovery plumbing alone while the state
management is done elsewhere. This completes the support for making
libnetwork and its associated drivers to be completely independent of a
k/v store(if needed) and work purely based on the state information
passed along by some some external controller or manager. This does not
mean that libnetwork support for decentralized state management via a
k/v store is removed.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently overlay driver requires a k/v store to allocate a vxlan id and
add an entry in k/v store for network->vxlanIDs binding. But the overlay
driver should be able to work without a k/v store provided libnetwork
can pass along the vxlanIDs needed for the network, rather than the
driver managing it themselves. Modified the driver to work with vxlanIDs
passed down by libnetwork.
Also made changes in the driver to make use of the gossip layer
available in libnetwork if available.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When a node joins a network it sends out a gossip event before it
updates it's own in-memory state. This can create a race where the node
gets the event back from a remote node before we update in-memory state
and we treat that as latest state. To avoid this race, always generate
the gossip after updating local state.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
With the introduction of a driver generic gossip in libnetwork it is not
necessary for drivers to run their own gossip protocol (like what
overlay driver is doing currently) but instead rely on the gossip
instance run centrally in libnetwork. In order to achieve this, certain
enhancements to driver api are needed. This api aims to provide these
enhancements.
The new api provides a way for drivers to register interest on table
names of their choice by returning a list of table names of interest as
a response to CreateNetwork. By doing that they will get notified if a
CRUD operation happened on the tables of their interest, via the newly
added EventNotify call.
Drivers themselves can add entries to any table during a Join call by
invoking AddTableEntry method any number of times during the Join
call. These entries lifetime is the same as the endpoint itself. As soon
as the container leaves the endpoint, those entries added by driver
during that endpoint's Join call will be automatically removed by
libnetwork. This action may trigger notification of such deletion to all
driver instances in the cluster who have registered interest in that
table's notification.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Fixing an earlier commit which needlessly registered
boltdb in allocator.go while it is needed only for tests.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently datastore has dependencies on various kv backends.
This is undesirable if datastore had to be used as a backend
agnostic store management package with it's cache layer. This
PR aims to achieve that.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Make ipams builtin package work for os x target as ipam
driver developers happen to be using os x as well.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
In general the core IPAM and bitseq implementation has
very little assumptions about the presence of a backing
store. But there are a few places where this assumption exists
and this makes it not useful as a simple in-memory allocator.
This PR removes those false assumptions.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently the libnetwork function `NewNetwork` does not allow
caller to pass a network ID and it is always generated internally.
This is sufficient for engine use. But it doesn't satisfy the needs
of libnetwork being used as an independent library in programs other
than the engine. This enhancement is one of the many needed to
facilitate a generic libnetwork.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Because overlay is a builtin driver and global allocation of overlay
resources is probably going to happen in a different node (a single
node) and the actual plumbing of the network is probably going to happen
in all nodes, it makes sense to split the functionality of allocation
into two different packages. The central component(this package) only
implements the NetworkAllocate/Free apis while the distributed
component(the existing overlay driver) implements the rest of the driver
api. This way we can reduce the memory footprint overall.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- When creating and programming the global overlay chain,
gracefully handle the case where the chain already exists.
Today the driver logs an Error and does not attempt to insert
the return rule if the chain is already present.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Added NetworkAllocate and NetworkFree apis to the list of
driver apis. The intention of the api is to provide a
centralized way of allocating and freeing network resources
for a network which is cross-host.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
With DNS resolution happening within the container namespace
test should not try to ping a DNS name and expect a packet
loss message. It will only show up as a DNS name resolution
failure. Changed the bridge internal test to test for a
well known IP address.
Also rearranged the overlay internal tests so that it gets
to run before the dnet container is removed which was
happening in previous tests.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently ipam/ipamutils has a bunch of dependencies
in osl and netlink which makes the ipam/ipamutils harder
to use independently with other applications. This PR
modularizes ipam/ipamutils into a standalone package
with no OS level dependencies.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently when the default gw changes because of
other network connections happening in the container
the resolver sockets are not flushed. This results
in a subsequent DNS failure for external queries
A sequence of connecting the container to an overlay
network and subsequently to a bridge network without
disconnecting from any network will result in this
behaviour. This was revealed by one of the libnetwork
IT tests.
This is now fixed as part of the commit by flushing
the external query sockets when a default gw change
is detected.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently driver management logic is tightly coupled with
libnetwork package and that makes it very difficult to
modularize it and use it separately. This PR modularizes
the driver management logic by creating a driver registry
package.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Network DB is a network scoped gossip database built
on top of hashicorp/memberlist providing an eventually
consistent state store.
It limits the scope of the gossip and periodic bulk syncing
for table entries to only the nodes which participate in the
network to which the gossip belongs. This designs make the
gossip layer scale better and only consumes resources for the
network state that the node participates in.
Since the complete state for a network is maintained by all nodes
participating in the network, all nodes will eventually converge
to the same state.
NetworkDB also provides facilities for the users of the package to
watch on any table (or all tables) and get notified if there are
state changes of interest that happened anywhere in the cluster when
that state change eventually finds it's way to the watcher's node.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Restoring original behavior where on disconnect
from overlay network (only connected network), it also
disconnects from default gw network.
- On sandbox delete, the leave and delete of each
endpoint is performed, regardless of whether the endpoint
is the gw network endpoint. This endpoint is already
automatically removed in endpoint.sbLeave()
- Also do not let internal network dictate container does
not need external connectivity. Before this fix, if a container
was connected to an overlay and an internal network, it may not
get attached to the default gw network.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- On sandbox delete, the leave and delete of each
endpoint is performed, regardless of whether the endpoint
is the gw network endpoint. This endpoint is already
automatically removed in endpoint.sbLeave() by
sb.clearDefaultGW() when the sandbox is marked for
deletion.
- Also restoring otiginal behavior where on disconnect
from overlay network (only connected network), it also
disconnects from default gw network.
- Also do not let internal network dictate container does
not need external connectivity. Before this fix, if a container
was connected to an overlay and an internal network, it may not
get attached to the default gw network.
- needDefaultGw() takes now into account whether the sandbox
is marked for deletion
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Otherwise a overlay network delete after daemon restart
will hit a nil pointer dereference while releasing the
vxlan id
Signed-off-by: Alessandro Boch <aboch@docker.com>
This moves the initialization of the pre-defined networks to where it's
used instead of in package init.
This reason for this change is having this be populated in `init()`
causes it to always consume cpu, and memory (4.3MB of memory), to
populate even if the package is unused (like for instnace, in a re-exec).
Here is a memory profile of docker/docker just after starting the daemon of the
top 10 largest memory consumers:
Before:
```
flat flat% sum% cum cum%
0 0% 0% 11.89MB 95.96% runtime.goexit
0 0% 0% 6.79MB 54.82% runtime.main
0 0% 0% 5.79MB 46.74% main.init
0 0% 0% 4.79MB 38.67% github.com/docker/docker/api/server/router/network.init
0 0% 0% 4.79MB 38.67% github.com/docker/libnetwork.init
0 0% 0% 4.29MB 34.63% github.com/docker/libnetwork/ipam.init
0 0% 0% 4.29MB 34.63% github.com/docker/libnetwork/ipams/builtin.init
0 0% 0% 4.29MB 34.63% github.com/docker/libnetwork/ipamutils.init
0 0% 0% 4.29MB 34.63% github.com/docker/libnetwork/ipamutils.init.1
4.29MB 34.63% 34.63% 4.29MB 34.63% github.com/docker/libnetwork/ipamutils.initGranularPredefinedNetworks
```
After:
```
flat flat% sum% cum cum%
0 0% 0% 4439.37kB 89.66% runtime.goexit
0 0% 0% 4439.37kB 89.66% runtime.main
0 0% 0% 3882.11kB 78.40% github.com/docker/docker/cli.(*Cli).Run
0 0% 0% 3882.11kB 78.40% main.main
3882.11kB 78.40% 78.40% 3882.11kB 78.40% reflect.callMethod
0 0% 78.40% 3882.11kB 78.40% reflect.methodValueCall
0 0% 78.40% 557.26kB 11.25% github.com/docker/docker/api/server.init
557.26kB 11.25% 89.66% 557.26kB 11.25% html.init
0 0% 89.66% 557.26kB 11.25% html/template.init
0 0% 89.66% 557.26kB 11.25% main.init
```
Now, of course the docker daemon will still need to consume this memory, but
at least now re-execs and such won't have to re-init these variables.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- Concurrent leave/join of one member overlay network can end with the error:
"subnet sandbox join failed for "A.B.C.D/MM": error creating vxlan interface: file exists"
This happens when the join is processed while the leave has already started.
Having the network one member only, the leave resets the once variable for this network subnets
and triggers the sandbox destroy for each subnet's vxlan interface, when the n.joinCnt goes to 0.
But given the destroySandbox() is not atomic, the join thread can trigger the creation of the
vxlan interface in between (given subnet.once was re-initialized) before the leave thread
removes the vxlan interface for this subnet.
- The fix is to not allow interruptions between the re-initialization of the subnet.once var and
consequent vxlan interface removal.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Join & Leave Serf processing happens in a separate goroutine and there
are cases as in https://github.com/docker/libnetwork/issues/985, it can
cause lookup failures when endpoint delete processing happens before
Serf gets a chance to handle the leave processing.
The fix is to avoid such lookups in this goroutine, but handle the
endpoint and network objects directly.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
With the current implementation, a config relaod event causes all the
datastores to reinitialize and that impacts objects with Persist=false
such as none and host network.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- in bridge driver modprobe for br_netfilter only if EnableIPTables==true
- move FirewalldInit() to iptables pakcage Init()
- move modprobe for nf_nat and xt_conntrack in iptables.initCheck()
Signed-off-by: Alessandro Boch <aboch@docker.com>
If we encounter an error setting an interface's IPv4 or IPv6 address,
log the addresses we tried to use using the %v specifier rather than %q.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
Previously hook expected data with a wrong type.
Full netns path is not included with the data
passed with the hook.
Fixes#829
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
During ungraceful shutdown, it is possible that the endpoint_cnt can be
inconsistent with the actual endpoints in a network. This fix will
resolve that inconsistency
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- ... on ungraceful shutdown during network create
- Allow forceful deletion of network
- On network delete, first mark the network for deletion
- On controller creation, first forcely remove any network
that is marked for deletion.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Fix npe in sbJoin error path
- Fail again endpoint Join in case of failure
in programming the external connectivity
- In bridge, look for parent and child container configs
in the generic data
- iptables.Exists() might be called before any other call to
iptables.raw(). We need to call checkInit() then.
Introduced by 1638fbdf27
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Because of the lazy logic in Leave(), the overlay
veth end is not moved from the sandbox to the host
network namspace until the last endpoint leaves.
We cannot rely on this logic to clear the veth pairs,
because on last endpoint leave we have no reference to
the other N-1 veth names.
- The fix is to delete the container veth end on endpoint delete.
This anyways deletes both veth ends, regardless they are in different
namespaces.
Signed-off-by: Alessandro Boch <aboch@docker.com>
This approach allows the user to provide a FQDN as hostname if that is what
they want in their container, or to provide distinct host and domain parts. In
both cases we will correctly extract the first token for /etc/hosts.
Signed-off-by: Tim Hockin <thockin@google.com>
- Fixed exists to attempt a raw exists check only when
"iptables -C ..." execution returns error becasue of "unsupported option"
- Fixed raw exists to not match substring
- Added GetVersion method
Signed-off-by: Alessandro Boch <aboch@docker.com>
Chen Chun has been contributing to various useful features in 1.9 & 1.10
releases and also an active maintainer who helps with bug fixes and PR
reviews
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- Becasue it is the only chain which carries the hairpin mode info
- Also install the skipDNAT rule only if userland-proxy == true
Signed-off-by: Alessandro Boch <aboch@docker.com>
- iptables to provide a native API
- resolver.go to invoke the iptables native API
when programming tables in the container
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Attempt the veth delete only after both ends
are moved into the default network namespace.
Which is after both driver.Leave() and
sandbox.clearNetworkResources() are called.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Move DiscoverNew() and DiscoverDelete() methods into the new interface
- Add DatastoreUpdate notification
- Now this interface can be implemented by any drivers, not only network drivers
Signed-off-by: Alessandro Boch <aboch@docker.com>
When the daemon is ungracefully shutdown, sometimes
when we try to create the overlay sandbox after coming
back up might get created in a different epoch count
which will result in the vxlan interface not properly
cleaned up. Fix this by explicitly cleaning up all the
previous epoch sandboxes.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- With the transition to ipam, bridge ip address stored
in `bridgeInterface` is guaranteed to be the ip of the
OS bridge by the `setupBridgeIPv4` step function.
Therefore the `setupIPTables()` step function no
longer needs to fetch the address from the OS.
Signed-off-by: Alessandro Boch <aboch@docker.com>
By removing the need to clear the default gateway during sbJoin and
sbLeave to account for other bridge network, the default-gw endpoint
will stay with the container, it will also help retain the container
property.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- The pool request code does not behave properly in
case of concurrent requests when client does not
specify a preferred pool. It may dispense the same
predefined pool to different networks.
- The issue is common for local and global
address spaces
Signed-off-by: Alessandro Boch <aboch@docker.com>
Its safe to cache the scope value in network object and can be reused
for cleanup operations. The current implementation assume the presence
of driver during cleanup operation. Since a remote driver may not be
present, we should not fail such cleanup operations. Hence make use of
the scope variable from network object.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Stale sandbox and endpoints are cleaned up during controller init.
Since we reuse the exact same code-path, for sandbox and endpoint
delete, they try to load the plugin and it causes daemon startup
timeouts since the external plugin containers cant be loaded at that
time. Since the cleanup is actually performed for the libnetwork core
states, we can force delete sandbox and endpoint even if the driver is
not loaded.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
A commit today(1/6/2016) in golang.org/x/tools broke
libnetwork builds because it now requires certain
go 1.5 packages while libnetwork still uses go1.4.
Fixed it by manually installing golang.org/x/tools from
release-branch.go1.5 which should work for us
even though it is versioned 1.5. The 1.4 release
branch is too old and doesn't work with latest golint
package.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
The default make target should be `all` by convention but
since it is not the first target it wasn't getting triggered
as the default target. Fixed the makefile to make `all`
the first and default target.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- runs to completion without error
- demonstrates info available when using bridge network driver
Closes#837
Signed-off-by: Gabe Rosenhouse <grosenhouse@pivotal.io>
ChainExists should not treat non-nil output as
error because there is always going to be some
output while dumping iptable rules.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
We check for existence of all filter rules in
overlay driver before creating it. We should
also do this for chain creation, because even though
we cleanup network chains when the last container
stops, there is a possibility of a stale network
chain in case of ungraceful restart.
Also cleaned up stale bridges if any exist due to
ungraceful shutdown of daemon.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently we are cleaning up vxlan interfaces by name
before trying to setup an interface with the same name.
But this doesn't work for properly cleaning up vxlan
interfaces with the same vni, if the interface has a
a different name than the one expected. The fix is to
delete the interface based on vni.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- So that a DHCP based plugin can express it needs
the endpoint MAC address when requested for an IP address.
- In such case libnetwork will allocate one if not already
provided by user
Signed-off-by: Alessandro Boch <aboch@docker.com>
Add support for overlay networking in older kernels.
Following were done to achieve this:
+ Create the vxlan network in host namespace.
+ This may create conflicts with other private
networks so check for conflicts and fail a
join if there is any conflict.
+ Add iptable based filtering to only allow
subnet bridges in the same network to forward
traffic while different network bridges will
not be able to forward b/w each other. Also
block traffic to overlay network originating
from the host itself.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Test random de-allocation of allocated addresses
which is closer to real use case
- Test db reconstruction after read from datastore
Signed-off-by: Alessandro Boch <aboch@docker.com>
If we use peerMap as value, then we copy its mutex on
`pMap = d.peerDb.mp[nid]` and lock entirely different mutexes every
time.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
- Consistently with what it does for IP addresses, libnetwork
will also program the container interface's MAC address with
the value set by network driver in InterfaceInfo.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- On Sandbox deletion, during Leave of each
connected endpoint, avoid the default gw
check, which may create an unnecessary
connection to the default gateway network.
Signed-off-by: Alessandro Boch <aboch@docker.com>
this updates the MAINTAINERS file to the new format,
so that it can be parsed and collected in the docker/opensource
repository.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Sometimes, the vxlan kernel code may generate miss
notifications for vxlan bound packets when serf is
not initliazed. In such cases we should not try
doing a query as it will create a panic. We should
error out which will generate a log message.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Remove from contract predefined errors which are no longer
valid (ex. ErrInvalidIpamService, ErrInvalidIpamConfigService)
- Do not use network driver error for ipam load failure in controller.go
- Bitseq to expose two well-known errors (no more bit available, bit is already set)
- Default ipam to report proper well-known error on RequestAddress()
based on bitseq returned error
- Default ipam errors to comply with types error interface
Signed-off-by: Alessandro Boch <aboch@docker.com>
- On network delete it is better to release the gateway address
and address pool before removing the network from the datastore,
given ipam data is dependent on network data.
Signed-off-by: Alessandro Boch <aboch@docker.com>
The first issue is an ordering problem where sandbox
attached version of endpoint object should be pushed
to the watch database first so that any other create endpoint
which is in progress can make use of it immediately to update
the container hosts file. And only after that the current
container should try to retrieve the service records from the
service data base and upate it's hosts file. With the previous
order there is a small time window, when another endpoint create
will find this endpoint but it doesn't have the sandbox context
while the svc record population from svc db has already happened
so that container will totally miss to populate the service record
of the newly created endpoint.
The second issue is trying to rebuild the /etc/hosts file from scratch
during endpoint join and this may sometimes happen after the service
record add for another endpoint has happened on the container
file. Obviously this rebuilding will wipe out that service record which
was just added. Removed the rebuilding of /etc/hosts file during
endpoint join. The initial population of /etc/hosts file should only
happen during sandbox creation time. In the endpoint join just added
the backward-compatible self ip -> hostname entry as just another
record.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
runc/libcontainer split the `State` struct into platform specific structs
in
fe1cce69b3.
As a result, `NamespacePaths` isn't anymore in a global struct and
libnetwork is not cross-compiling in Docker (specifically on Windows) because
`sandbox_externalkey.go` is using `NamespacePaths`.
This patch splits `sandbox_externalkey.go` into platform specific
files and moves common things to a generic `sandbox_externalkey.go`.
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Compile the dnet tool for Linux (x86, amd64 and arm)
and Windows (x86 and amd64)
- Moved installation of dependencies into `Dockerfile.build`
- Remove `start-services` from Makefile
- That's the responsibility of Docker or build environment
- Removed utils depending on `netlink` from `netutils/utils.go`
Unable to add `make cross` to CircleCI just yet as there are some
issues to solve that are unrelated to this PR
Also fix `.gitignore` which was not updated after changing the build
image name in #667
Signed-off-by: Dave Tucker <dt@docker.com>
- pushReservation fails to correctly detect when
the affected block is the last in the current
sequence. It thinks instead the block is in between
the sequence. Because of this a couple of issues
may happen:
1. The allocation of the last bit causes the creation
of a phantom sequence (length 0) at the end.
(This has no side effects).
2. The allocation of a bit somewhere in the middle of
the bitmask may lead to a completely incorrect
sequence pattern.
Signed-off-by: Alessandro Boch <aboch@docker.com>
There are cases as seen in https://github.com/docker/docker/issues/17984
the sandbox could be stale in endpoint structure, when the actual
sandbox is removed during the cleanup phase. Hence instead of just
validating for sandboxID, make sure if it is actually present in the
sandboxes DB managed by the controller.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- Currently endpoint interface mac address is
not being set in network.go when user specified
the mac address for the container.
- Overlay driver expects to get the user defined mac-address
from InterfaceInfo.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Usage:
./machines up [consul|etcd|zookeeper] [num_nodes]
./machines destroy
Uses Docker Machine to create test environments
We _could_ use these environments to run BATS tests against
This would ensure that all supported backends are working
Signed-off-by: Dave Tucker <dt@docker.com>
- On network delete, bridge interface removal is a best effort
If netlink fails to remove the interface, we must not
restore the network in the bridge network db
Signed-off-by: Alessandro Boch <aboch@docker.com>
It is sufficient to check only if network is available
in store to make the decision of whether to retain the
stale sandbox. If the endpoints are not available then
there is no point in retaining the sandbox anyways. This
fixes some extreme corner cases, where daemon goes down
right in the middle of sandbox cleanup happening.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If the endpoint and the corresponding network is
not persistent then skip adding it into sandbox
store.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
At times, when checkpointed sandbox from store cannot be
cleaned up properly we still retain the sandbox in both
the store and in memory. But this sandbox store may not
contain important configuration information from docker.
So when docker requests a new sandbox, instead of using
it as is, reconcile the sandbox state from store with the
the configuration information provided by docker. To do this
mark the sandbox from store as stub and never reveal it to
external searches. When docker requests a new sandbox, update
the stub sandbox and clear the stub flag.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
There is a race in os sandbox sharing code where two containers which
are sharing the os sandbox try to recreate the os sandbox again which
might result in destroying the os sandbox and recreating it. Since the
os sandbox sharing is happening only for default sandbox, refactored the
code to create os sandbox only once inside a `sync.Once` api so that it
happens exactly once and gets reused by other containers. Also disabled
deleting this os sandbox.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Since we share the host sandbox with many containers we
need to serialize creation of the sandbox. Otherwise
container starts may see the namespace path in inconsistent
state.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
For ungraceful daemon restarts, libnetwork has sandbox cleanup logic to
remove any stale & dangling resources. But, if the store is down during
the daemon restart, then the cleanup logic would not be able to perform
complete cleanup. During such cases, the sandbox has been removed. With
this fix, we retain the sandbox if the store is down and the endpoint
couldnt be cleaned. When the container is later restarted in docker
daemon, we will perform a sandbox cleanup and that will complete the
cleanup round.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Added IT cases for external connectivity check for bridge
and overlay networks, both initially and after a restart.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Reconciling persistent state after configuring driver. If not
the networks will not be initialized properly based on certain
driver config settings like enabling IP tables etc.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Added an IT case for checking proper /etc/hosts
handling in the overlay network. This also to see
if there are any stale entries in the /etc/hosts
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Cleanup the service db for the network when the last
container on the network leaves on the host. This is
because we stop watching the network after the last
container leaves and so if we keep the service db
around it might be kept uptodate with containers
joining and leaving in other hosts. The service
db will populated properly when a container joins
this network at a later point in time.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Currently when a sandbox disconnect from a network
the network's services are not removed from the
sandbox's /etc/hosts file
Signed-off-by: Alessandro Boch <aboch@docker.com>
Overlay driver allows local containers to communicate in overly network
even when the serf is not fully inited. But when the container leaves an
overlay network, it gets stuck waiting on a nil notifyCh, when the serf
is not fully initialized.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Currently the local containers of a global scope
network will get it's service records updated
from both a local update and global update. There
is no way to check if this is a local endpoint when
a remote update comes in via watch because we add
the endpoint to local endpoint list during join, while
the remote update happens during createendpoint.
The right thing to do is update the local endpoint list
and start watching during createndpoint and remove the watch
during delete endpoint. But this might result in the container
getting it's own record in it's /etc/hosts. So added a filtering
logic to filter out self records when updating the container's
/etc/hosts
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
A local endpoint is known to the watch database only
during Join. But the same endpoint can be known to the
watch database as remote endpoint well before the Join
because a CreateEndpoint updates the endpoint to the store.
So on Join when you come to know that this is indeed a
local endpoint remove it from remote endpoint list and add it
to local endpoint list.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Introduced a path level lock to synchronize updates
to /etc/hosts writes. A path level cache is maintained
to only synchronize only at the file level.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- This brings the remote ipam driver in pair with the local one.
As of now remote driver package is assuming a valid address in CIDR
form is always present in a nil error AddressRequestResponse,
which is no longer true as community has requested to remove this
limitation.
- We are ok to remove it until we can provide a null ipam driver
option in future releases.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Currently allocator pulls all the bitmasks from datastore
before processing each public API. This is not needed as
the APIs already selectively pull the interested bitmask
when needed.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Allow to create an endpoint as anonymous.
An anonymous endpoint does not get added
to the network service records.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Do not log unncessary warning messages when you get
maskable error from driver during an endpoint delete.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When we bootup cleanup all dangling local
endpoints since they are not needed anymore.
The only reason it can happen is when the process
went down ungracefully after an endpoint is
created but before join is successfull.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When the daemon has a lot of containers and even when
the daemon tries to give 15 second to stop all containers
it is not enough. So the daemon forces a shut down at the end
of 15 seconds. And hence in a situation with a lot of
containers even gracefully bringing down the daemon will result
in a lot of containers fully not brought down.
In addition to this the daemon force killing itself can happen
in any arbitrary point in time which will result in inconsistent
checkpointed state for the sandbox. This makes the cleanup really
fail when we come back up and in many cases because of this
inability to cleanup properly on restart will result in daemon not
able to restart because we are not able to delete the default network.
This commit ensures that the sandbox state stored in the disk is
never inconsistent so that when we come back up we will always be
able to cleanup the sandbox state.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Bridge driver should return maskable error during Leave
or DeleteEndpoint since this can be an expected sceanrio
when libnetwork tries to leave and delete default bridge
endpoints and bridge driver does not persist with the default
bridge. This is only expected during an ungraceful exit of
the daemon but will cause confusion to the user if it shows
up as failures on a deamon restart after an ungraceful exit.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Added etcd integration test for overlay
- Added etcd integration test for multinode
with mock test driver suitable for circleci
- Added multinode tests for zookeeper
- Made the script smart enough to only start
data stores necessary for the requested suites
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- With the selectively running a suite support
one can do the following to select which suite
of tests to run:
SUITES="simple multi" sudo -E make integration-tests
- Refactored and cleaned up some ununsed code in helpers.bash
- Added discover string parse function to parse discovery
string into provide and address
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently we are trying to ensure that the parent
directory exists as a key. But it is really a directory
and etcd expects it to be a directory. So made the
change to ensure that the parent key is created as
a directory and not as a simple key.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Added a check to see if address space is valid in
addrSpaces map before accessing it. Also fixed some
error strings so that it provides better information
to the user.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently endpoint count is maintained as part of
network object and the endpoint count gets updated
frequently while the rest of network is quite stable.
Because of the frequent updates to endpoint count the
network object is getting marshalled and unmarshalled
ferquently. This is causing a lot of churn and transient
memory usage. Fix this by creating a deparate object of
endpoint count so that only that gets updated.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently there are 3 distinct operations performed by
datastore
- Pushing the data to the store
- Updating the Index of the local object
- Updating the cache (in case of localscope)
Without a lock racing datastore api calls can interleave
in various surprising ways. Best thing is to keep these
3 above operation inseparable. Use a datastore lock to
achieve this.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
currently ipamutils package uses apis which are linux
specific and makes windows compile error out. Separated
the OS specific apis into linux and windows files to
diverge the implementation.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Today we try to increment/decrement endpoint count
only once even if it is a key modified error. In case
of key modified error we should retry it to allow it to
succeed.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Both an empty and nil list of IpamConf object
will trigger auto-allocation for ipv4.
Auto-allocation for ipv6 will still be excluded
in the two cases above.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Currently when container has a restart policy and gets
restarted, docker does not release networking and allocate
it back. But it presents libnetwork with a new sandbox while
all the network resources are locked in the old sandbox. This
commit attempts to move all the network resources from the old
sandbox to the new sandbox when libnetwork is presented with the
new sandbox.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently integration test is a bit flaky because of
variability in the dnet bootup time. Fixed it to wait for
dnet to come up before performing any tests.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Added restart test for default network so that we can test
bridge network persistence. Also added changes to dnet to
delete the default network if it is present.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Since libnetwork is going to provide createNetwork
notifications only once when the network is created
bridge network needs to save it's network state in
persistent store so that it becomes available even
after restart.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Set bridge ipv4 address when bridge is present
- IPv6 changes for bridge
- Convert unit tests to the new model
Signed-off-by: Alessandro Boch <aboch@docker.com>
Now that libkv supports concurrent access to boltdb, there is no point
in depending on timeout mechanism
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Currently when docker exits ungracefully it may leave
dangling sandboxes which may hold onto precious network
resources. Added checkpoint state for sandboxes which
on boot up will be used to clean up the sandboxes and
network resources.
On bootup the remaining dangling state in the checkpoint
are read and cleaned up before accepting any new
network allocation requests.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This adds a new options configuration routine that the engine
can call in order to configure TLS for libnetworks KV store.
Signed-off-by: Daniel Hiltgen <daniel.hiltgen@docker.com>
- libnetwork should reserve only the auxiliary
addresses which belong to the container
addresable pool. And should fail the network
creation if the aux addr does not belong to
the master pool.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Currently every `NewDatastore` creates a brand new
libkv store handle. This change attempts to share
the libkv store handle across various datastore handles
which share the same scope configuration. This enables
libnetwork and drivers to have different datastore handle
based on the same configuration but share the same
underlying libkv store handle.
This is mandatory for boltdb libkv backend because no two
clients can get exclusive access to boltdb file at the same
time. For other backends it just avoids the overhead of having
too many backend client instances
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Renamed netlabel prefixes to accomodate both global
and local store configs.
- Added a `private` marker.
- Skipping the data store configs for remote driver
so that external plugins don't get it as there is
no secure and sane way to coordinate providing
data store access to external plugins.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
There were some unconditional debug logging in serf.
Removed them and made then go through logrus writers
based on what error level the log string contains.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Add a few bridge network integration tests which
specifically deals with multiple bridge networks
and libnetwork restart and persistence
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Remove the need for watching for IPAM data
structures and add multi store support code and
data reorganization to simplify address space
management.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Always on watching of networks and endpoints can
affect scalability of the cluster beyond a few nodes.
Remove pro active watching and watch only the objects
you are interested in.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Add local scope store caching support as
well as do some refactoring to make it datastore
scope aware and manage scope specific config.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently the default ipam implementation ignores the prefered ip if the
request is made on an existing sub-pool. The priority should be other
way around.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- When creating the handle, write it to store if
not present. Currently it is written to store
only on first bit allocation.
Signed-off-by: Alessandro Boch <aboch@docker.com>
With the new Discovery model, join can happen even before serf is
initliazed. It could also happen due to misconfiguration of
--cluster-advertise. The local endpoint join must succeed and later when
the serf initializes and joins the cluster, it will push the local db to
the cluster.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
With the recently introduced docker discovery, the self node discovery
notification can reach the overlay driver after the remote node
discovery notification. In scenarios such as 2 node setup, it seems more
likely. In those scenarios, the serfJoin is not triggered and hence the
neighborship is not formed between the 2 nodes.
The fix is to retain the knowledge of the neighbor and reuse it
immediately after the serfInit is done. Since we do the serfJoin just
once, there is no harm in changing the neighIP to a new value even if it
is not used.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
* integrated hostdiscovery package with the new Docker Discovery
* Integrated hostdiscovery package with libnetwork core
* removed libnetwork_discovery tag
* Introduced driver apis for discovery events
* moved overlay driver to make use of the discovery events
* Using Docker Discovery service.
* Changed integration-tests to make use of the new discovery
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Exposing osl package outside libnetwork is not neccessary and the
InterfaceStatistics anyways belong to the types package.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- DisableBridgeCreation is misleading. change it to DefaultBridge
- Dont fail the init if localstore cannot be initialized
- added a convenience function to get endpoint for a container
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Ideally, both overlay and libnetwork core must be changed to support
kv-store connection retry. But this is a stop-gap measure to unblock the
discovery related PRs.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Replaced it with DisableBridgeCreation and it can be used ONLY in
a special case for docker0 bridge from docker, instead of calling it
from all other case.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This commit adds a basic overlay network
connectivity integration test. By doing this
it adds the basic functions to form a crude
container to run the networking tests. The container
uses a busybox rootfs with network namespace and
/etc/hosts and /etc/resolv.conf generated by
libnetwork.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently ther `service ls` output does not show the
sandbox ID. This adds that to the output so that it can
be used in dnet program.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently on every endpoint Join the /etc/hosts
file is getting overwritten. This blows the already
existing service records. Modify the `updateHostsFile`
function to build the hosts file only on the first
endpoint join and for subsequent joins just update
the existing /etc/hosts file with the additional
network specific service records.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
1. Don't save localscope endpoints to localstore for now.
2. Add common function updateToStore/deleteFromStore to store KVObjects.
3. Merge `getNetworksFromGlobalStore` and `getNetworksFromLocalStore`
4. Add `n.isGlobalScoped` before `n.watchEndpoints` in `addNetwork`
5. Fix integration-tests
6. Fix test failure in drivers/remote/driver_test.go
7. Restore network to store if deleteNework failed
Currently the driver configuration is pushed through a separate
api. This makes driver configuration possible at any arbitrary
time. This unncessarily complicates the driver implementation.
More importantly the driver does not get access to it's
configuration before it can do the handshake with libnetwork.
This make the internal drivers a little bit different to
external plugins which can get their configuration before the handshake
with libnetwork.
This PR attempts to fix that mismatch between internal drivers and
external plugins.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Create a wrapper script to run intergation tests
so that setups and teardowns happen in more
optimal manner
- Add traps to cleanup containers on failure or
user interrupt
- Introduce basic multi-node integration tests
- Removed default network, default driver tests
as they may not be useful in the near future
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Fixes#485
The code previously relied on an uninteded side effect. When the
interface name was set, this causes the interface to come up
prematurely. Once that side effect was removed, routes could
no longer be set.
This change ensures that routes are only set after the interface
is brought up.
Signed-off-by: Tom Denham <tom@tomdee.co.uk>
There are multiple goals of introducing test driver plugin
- Need a driver which can be configured to simulate
different driver behaviors
- For pure libnetwork multi-host integration testing
a test driver configured for global scope can be used
without trying to use a real driver like overlay
which comes with it's own dependencies which can't
be satisfied all enviroments(I am looking at you
circleci)
This PR also makes all test cases that we have so far to be run
in circleci without any skipping needed.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Instead of passing the pointer to &ep.iface the current
code is passing the value. So the source variable is not
getting updated properly.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Enhance dnet to use codegansta/cli as the frontend
- Add `container create/rm` commands only in dnet
- With the above dnet enhancements add more integration tests
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Updated Godeps and added codegangsta/cli into Godeps.
Also cleaned up the unnecessary packages by removing
host_discovery build tag which wasn't getting detected
by godep and was causing all sorts of `godep save` issues.
With this fix committers can do `godep save ./...` freely
to include their new dependencies without any failure.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Add one capability negotiation interaction after plugin handshake, use
this to determine plugin's capability instead of default "global" scope.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Currently the endpoint data model consists of multiple
interfaces per-endpoint. This seems to be an overkill
since there is no real use case for it. Removing it
to remove unnecessary complexity from the code.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- it is supposed to be called after lookupContainerID()
but the latter is not guaranteed to succeed and in
case of connection error will return what was passed
to it.
So in order to be able to operate with both long and short
container ids in case of lookupContainerID() failure,
always search by `partial-container-id`
Signed-off-by: Alessandro Boch <aboch@docker.com>
- So test will not fail because container is already there
Prefer this to re-use the containers as it would contain
states from last run
- A stale consul or dnet container condition will happen
in case the previous integ test run aborted
Signed-off-by: Alessandro Boch <aboch@docker.com>
Currently libnetwork does not have any integration test infra
support to tests libnetwork code end2end purely as a black
box. This initial commit adds the infra support to enable
test cases for this.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
CmdServiceDetach() incorrectly uses containerID where sandboxID is
expected. Thus, procDeleteSandbox() fails to find the corresponding
sandbox and returns the "Resource not found" error.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
There were two pending PRs with package level
changes but no source level conflicts. This got
merged because git cannot detect this.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
fix bug for `docker service ls` error:
"Failed to retrieve backend list for service xxx (json: cannot
unmarshal object into Go value of type []client.sandboxResource)"
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
This way we won't vendor test related functions in docker anymore.
It also moves netns related functions to a new ns package to be able to
call the ns init function in tests. I think this also helps with the
overall package isolation.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Looks like the libkv version vendored in really not in
sync with the git hash value in Godeps.json. The commit
04bd8f67ad
has just updated the Godeps.json without update the source.
Dnet in multi-host testing is broken due to this, while
docker mult-host functionality works because the correct
version of libkv has been vendored in docker/docker.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Bridge driver panics in `CreateNetwork` if called without
a prior `Config` call. This causes issues in dnet which
tries to create network using default driver configuration.
It should be valid to call `CreateNetwork` without a prior
`Config` call in which case we need to assume default driver
config.
Fixed this by properly initializing the driver config pointer.
Also introduced a `configured` bool to make sure that still
`Config` is called exactly once for the instance of the bridge
driver.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Test if error messages from daemon are not empty strings.
Confirmed it fails without af323c7.
--- FAIL: TestEndToEndErrorMessage (0.03s)
api_test.go:2266: Empty response error message.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
- Convinience API which detaches the sandbox from
all endpoints, resets and reapply config options,
setup discovery files, reattach to the endpoints.
No change to the osl sandbox in use.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Currently when libnetwork tests are run inside a container
you cannot interrupt them in the middle by pressing ctrl-c
even though all the tests run in foreground. Fix this by running
tests by wrapping the make invocation inside the container
with a shell scripts which installs the SIGINT handler.
Without the handler the kernel does not deliver signals
to the process with PID 1(which in this case was make itself)
and hence make could never be interrupted. With this fix
we capture SIGINT in the shell script and re-raise it in the
the child process (which is make) and that makes the make
interruptible.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Two issues:
- container resolv.conf getting regenerated even when no dns configs are passed
- updateHosts should be skipped for host networking mode
- incorrect check on dnsOptions
Signed-off-by: Alessandro Boch <aboch@docker.com>
Make sure to always explicitly set namespace for all
kernel bound network operations irrespective of whether
the operation is performed in init namespace or a user
defined namespace. This already happens for user defined
netns. But doesn't happen for initial netns that libnetwork
runs in.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Do not discard errors on ip allocation for gw and bridge
- Release addresses on network delete
- Add some context on top of ipallocator returned error
- Create ip allocator instance at driver creation, not at package init,
otherwise this affects bridge test code where ip db is carried over
test functions
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Maps 1 to 1 with container's networking stack
- It holds container's specific nw options which
before were incorrectly owned by Endpoint.
- Sandbox creation no longer coupled with Endpoint Join,
sandbox and endpoint have now separate lifecycle.
- LeaveAll naturally replaced by Sandbox.Delete
- some pkg and file renaming in order to have clear
mapping between structure name and entity ("sandbox")
- Revisited hosts and resolv.conf handling
- Removed from JoinInfo interface capability of setting hosts and resolv.conf paths
- Changed etchosts.Build() to first write the search domains and then the nameservers
Signed-off-by: Alessandro Boch <aboch@docker.com>
- also provided a new utility to compute the
host part ip address which is resilient to
input passed in different representations.
Signed-off-by: Alessandro Boch <aboch@docker.com>
The current lazy network sandbox initialization code has a race
in that if multiple go routines race to join the network the second
and subsequent go routines might try to use the sandbox before it is
fully initialized. Fix this by blocking the go routines in once.Do
calls and also take of care of rolling back properly in case of
error.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- In DeleteEndpoint(), veth removal is a best effort,
as it could have alreayd been removed by sandbox destroy.
Therefore if veth is not found, cleanup defer function
should not run.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- In case of sandboxAdd() failure, drive.Leave() call
in first executed defer reset err to nil. Secondly
executed defer in charge of resetting ep.container to nil
will not get executed.
Signed-off-by: Alessandro Boch <aboch@docker.com>
for the bridge driver.
Moves two config options, namely EnableIPTables and EnableUserlandProxy
from networks to the driver.
Closes#242
Signed-off-by: Mohammad Banikazemi <MBanikazemi@gmail.com>
14.10 reached EOL recently and hence experimental builds are not built for
that distro any more. Upgrading it to 15.04 means handling the systemd
specific docker daemon configs required for multi-host networking.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
As seen in https://github.com/docker/docker/issues/14738 there is
general instability in the later kernels under race conditions when ioctl
calls are used in parallel with netlink calls for various operations.
(We are yet to narrow down to the exact root-cause on the kernel).
For those older kernels which doesnt support some of the netlink APIs,
we can fallback to using ioctl calls. Hence bringing back the original
code that used netlink (https://github.com/docker/libnetwork/pull/349).
Also, there was an existing bug in bridge creation using netlink which
was setting bridge mac during bridge creation. That operation is not
supported in the netlink library (and doesnt throw an error either).
Included a fix for that condition by setting the bridge mac after
creating the bridge.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
The controller sandboxes hashmap is not being protected by a lock
while deleting it in `LeaveAll` call. This may result in a race
whereby any other read access that happens with the lock held is
also vulnerable to return random sandbox data which could result
in totally unpredictable behavior.
Also as part of the fix check if `s.endpoints` is empty and log an
error in `rmEndpoint` so that we don't bring down the process
for this unexpected error.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.
Quoting MkdirAll documentation:
> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.
This means two things:
1. If a directory to be created already exists, no error is
returned.
2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as MkdirAll need to use for
directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.
The above is a theory, based on quoted documentation and my UNIX
knowledge.
3. In practice, though, current MkdirAll implementation [1] returns
ENOTDIR in most of cases described in #2, with the exception when
there is a race between MkdirAll and someone else creating the
last component of MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.
Because of #1, IsExist check after MkdirAll is not needed.
Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.
[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
- NetworkRange() function on which ipallocatore relies
to compute the subnet limits has a bug in computing the upper limit IP
- in case container subnet is specified (fixedCIDR), bridge driver to
reserve bridge and gateway addresses only if they belong to the container
subnet
- Make ipallocator more robust in using converting the passed network
to a canonical one before using it as a key in its public APIs
Signed-off-by: Alessandro Boch <aboch@docker.com>
Two changes were missing:
- On allocation of bridge ip was not passing canonical subnet
- Canonical subnet has to be passed on ip release
as well, otherwise ipallocator will attempt
ip release from a non registered nw
Signed-off-by: Alessandro Boch <aboch@docker.com>
Set the hairpin mode using the sysfs interface which
looks like it is working all the way to the oldest
of RHEL6.6 kernels.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When you start a container after some other container has already
been started in the same network, the current container will have
an fdb which points to a wrong vtep to reach the already started
container. This makes the network connectivity to not work. The root
cause of the issue is because of golang does variable capture by
reference in closures and so we cannot use the return values from
range iterators directly. It needs to be copied to a locally scoped
variable and then use that copy as a capture variable in the closure.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
libnetwork.New() should return a controller and an error. The
example in README.md ignore the error now, which will let the
codes can not be compiled by Go.
When fixed-cidrv6 is used, the allocation and release must happen from
the appropriate network. Allocation is done properly in createendpoint,
but the DeleteEndpoint wasnt taking care of this case.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
When bit sequence is trying to get key/value from the
data store it should always unmarshall the json data
before using it, as the data is JSON marshalled before
storing it in the data store.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- We must ignore key not found error when querying
datastore for initial state.
- Regression introduced by 04bd8f67ad
Signed-off-by: Alessandro Boch <aboch@docker.com>
For the moment in 1.7.1 since we provide a resolv.conf set api
to the driver honor that so that for host driver we can use the
the host's /etc/resolv.conf file as is rather than putting the
contents through a filtering logic.
It should be noted that the driver side capability to set the
resolv.conf file is most likely going to go away in the future
but this should be fine for 1.7.1
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
In preparation for the new update of vishvananda/netlink package
we need to bringup the host veth interface manually.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Some parts of the bridge driver code needs to use a different kernel
api or use the already existing apis in slightly different ways to
make the bridge driver work in RHEL/Centos 6.6. This PR provides
those fixes.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
The netlink way of creating bridge has problems in older
kernels like the one used on RHEL 6 (which is a supported
one). So trying to use ioctl method to create bridge
so that it works on any version.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- When invoked from docker, endpoint.Statistics() returns
the statistics of the host's interfaces.
Issue is tracked down to ioutil.ReadFile(). For some
reason even if invoked from inside the sandbox netns,
it ends up reading the stats file from the default netns,
when invoked from docker.
If same operation is run from inside a dedicated binary,
it works as expected.
- Replacing it with exec.Command("cat", <file>) solves the issue
Signed-off-by: Alessandro Boch <aboch@docker.com>
In that commit, AtomicPutCreate takes previous = nil to Atomically create keys
that don't exist. We need a create operation that is atomic to prevent races
between multiple libnetworks creating the same object.
Previously, we just created new KVs with an index of 0 and wrote them to the
datastore. Consul accepts this behaviour and interprets index of 0 as
non-existing, but other data backends do no.
- Add Exists() to the KV interface. SetIndex() should also modify a KV so
that it exists.
- Call SetIndex() from within the GetObject() method on DataStore interface.
- This ensures objects have the updated values for exists and index.
- Add SetValue() to the KV interface. This allows implementers to define
their own method to marshall and unmarshall (as bitseq and allocator have).
- Update existing users of the DataStore (endpoint, network, bitseq,
allocator, ov_network) to new interfaces.
- Fix UTs.
There is no need to update the /etc/hosts files
of containers for endpoints which are created/deleted
in a network whose interface list is empty
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently container can join one endpoint when it is started.
More endpoints can be attached at a later point in time. But
when that happens this attachment should only have meaning
only as long as the container is alive. The attachment should
lose it's meaning when the container goes away. Cuurently there
is no way for the container management code to tell libnetwork
to detach the container from all attached endpoints. This PR
provides an additional API `LeaveAll` which adds this
functionality,
To facilitate this and make the sanbox lifecycle consistent
some slight changes have been made to the behavior of sandbox
management code. The sandbox is no longer destroyed when the
last endpoint is detached from the container. Instead the sandbox
ie kept alive and can only be destroyed with a `LeaveAll` call.
This gives better control of sandbox lifecycle by the container
management code and the sandbox doesn't get destroyed from under
the carpet while the container is still using it.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Add a minimal service discover support using service names or
service names qualified with network name. This is achieved
by populating the container's /etc/hosts file record with the
appropriate entries
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently the etchosts package only provides helpers
to completely build an /etc/hosts file from scratch
or update a single hostname's IP address to a different
one. This commit adds the ability to add/delete an arbitrary
number of host record entries to/from the etc hosts file
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Add support for in-tree driver configuration labels
by pushing the labels which has the appropriate
prefix to the corresponding driver.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Add support for ovrouter biinary which can
act as both an independent integration test
tool to test overlay driver without libnetwork
as well as convert overlay driver into a plugin
in the future.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Add support to add/delete neighbor entries to
the sandbox. Both L3 and L2(fdb) neighbor table additions
are supported.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This commit brings in the first implementation of
overlay driver which makes use of vxlan tunneling
protocol to create logical networks across multiple
hosts.
This is very much alpha code and should be used for
demo and testing purposes only.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently we rely on watch to catchup after the init. But there could be
a small time window on which, we might end up in a race condition on
network creates. By reading and populating networks during init, we
avoid any such conditions, especially for default network handling.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- At Handle creation, first check if an instance of the
the respective object is already present in the datastore.
- Handle sequence must be saved only if commit
to datastore is succesfull
- Caller (ipam) needs to manage the retry
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Handle contains sequence and identifier.
This way datastore integration can be done
at bitseq level.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- In order to facilitate usage of datastore
- This makes it slower. Efficiency will be
revisited later after datastore integration
is done.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Initial version
- It allows handling reservation/release of a finite set
of resources through large bitmask.
- It represents the bitmask as a list of equal
consecutive 32 bits long bitmask symbols.
It basically operates on a run-length encoding
of the bitmask without encode/decode processing.
Signed-off-by: Alessandro Boch <aboch@docker.com>
The `iptables.Exists` function is wrong in two ways:
1. The iptables -C call doesn't add `-j DOCKER` and fails to match
2. The long path takes ordering into account in comparison and fails to match
This patch fixes issue 1 by including `-j DOCKER` in the check.
Signed-off-by: Arnaud Porterie <arnaud.porterie@docker.com>
In one of the latest docker UI updates, the flags structure expects to
have a ShortUsage function. Without that any UI event that triggers an
short usage panics.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
1. replaced --net option for service UI with SERVICE.[NETWORK] format
2. Making using of the default network/driver backend support
3. NetworkName and NetworkType from the UI/API can be empty string
and it will be replaced with DefaultNetwork and DefaultDriver
As per the design goals, we wanted to keep libnetwork core free of
handling defaults. Rather, the clients (docker & dnet) must handle the
defaultness of these entities.
Also, since there is no API to get these Default values from the
backend, UI will not handle the default values either. Hence, this falls
under the responsibility of the API layer to handle this specific case.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Since the Config is a read-only entity, Confg() method returned a value
instead of the pointer. In cases the config is nil, we should return an
empty config.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
The configuration format for docker runtime is based on daemon flags and
hence adjusting the libnetwork configuration to accomodate it by moving
the TOML based configuration to the dnet tool.
Also changed the controller configuration via options
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- Currently both network and host bits in the subnet are passed
when requesting an address from ipallocator.
The way ip allocator determines the first available
IP is tainted when caller passes the subnet host bits.
- Verified this patch applied to libnetwork vendored in docker
fixes the issue when starting the daemon.
- Fixes#287
Signed-off-by: Alessandro Boch <aboch@docker.com>
* When userland-proxy is disabled, enable hairpin mode on the host-side of the veth
* When userland-proxy is enabled, fix the iptable rules appropriately
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Currently store makes use of a static isReservedNetwork check to decide
if a network needs to be stored in the distributed store or not. But it
is better if the check is not static, but be determined based on the
capability of the driver that backs the network.
Hence introducing a new capability mechanism to the driver which it can
express its capability during registration. Making use of first such
capability : Scope. This can be expanded in the future for more such cases.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
* Removed network from being marshalled (it is part of the key anyways)
* Reworked the watch function to handle container-id on endpoints
* Included ContainerInfo to be marshalled which needs to be synchronized
* Resolved multiple race issues by introducing data locks
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Currently we craete container mac address completely
randomly. But we probably need to generate based on
IP so that the mac address stays the same for a given
IP.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Unless it is a forbidden error, libnetwork should not fail a forced
delete of a network and endpoint if the driver throws an error.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Calling GC() without ever creating a network namespace (sandbox on
Linux) will hang as the GC loop is not running (and therefore the
channel is not being listened to).
Tested via Docker that this corrects a daemon shutdown error if the
daemon is started and stopped without any containers or networks being
created while the daemon is up.
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
- lock network struct before accessing config in NetworkCreate
- reorganize locks so that we lock only what needed and when needed
- conflict method really belongs to networkConfig not bridgeNetwork
Signed-off-by: Alessandro Boch <aboch@docker.com>
Removed release-specific info from the ROADMAP (better to keep this on the wiki; will not get stale). Made a couple of nips and tucks.
Signed-off-by: Amy Lindburg <amy.lindburg@docker.com>
In prep for the UI/API updates on Docker to integrate the network and
endpoints, this PR removes the experimental tag from dnet and moving it
to docker UI and API and wrap the top-level "network" and "service"
under experimental.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Added support to add a bridge the same way as any other
interface into the namespace. The only difference is linux
does not support creating the bridge in one namespace and
moving it into another namespace. So for a bridge the sandbox
code also does the creation of the bridge inside the sandbox.
Also added an optional argument to interface which can now
select one of the already existing interfaces as it's master.
For this option to succeed the master interface should be of type
bridge.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently sandbox code exposes bare structs
externally to the package. It is untenable
to continue this way and it becomes too
inflexible to use it to store internal state.
Changed all of them to use interfaces.
Also cleaned up a lot of boiler plate code
which needs to set into namespace.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently GenerateIfaceName is defined in bridge.go
and it specifically tries to only generate an interface
name only with `veth` prefix. Make it generic so that it
can accept a prefix and length of random bytes. Also
move it to netutils since it is useful to generate various
kinds of interface names using it.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Right now the namespace paths are cleaned up every
garbage collection period. But if the daemon is restarted
before all the namespace paths of removed containers are
garbage collected they will remain there forever. The fix
is to provide a GC() api so that garbage collection can be
triggered immediately.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Current version of netns used in libnetwork do not have requisite syscall
entry for PowerPC (ppc64le) arch. Consequently docker which uses libnetwork fails
to create any network enabled containers on Power systems.
This patch updates netns to latest commit 5478c060110032f972e86a1f844fdb9a2f008f2c
to add ppc64le syscall entry.
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
In order to vendor-in libnetwork to docker, we need to remove the swarm
dependency even though it is used as library. using this PR, a new build
flag libnetwork_discovery is introduced in order to avoid pulling in the
unused hostdiscovery functionality into docker.
We are working with the Swarm project to see if we can modularize the
discovery package to become independent so that we can include them as a
vendor-in package in docker.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- To also return the configured exposed ports, besides the
port bindings; as now libnetwork/endpoint.go endpoint setters
separate the exposed ports and port binding configs.
Docker daemon will take care of aggregating the two sources
for presentation.
Signed-off-by: Alessandro Boch <aboch@docker.com>
When an endpoint is joined by a container it may
optionally pass a priority to resolve resource
conflicts inside the sandbox when more than one
endpoint provides the same kind of resource. If the
the priority is the same for two endpoints with
conflicting resources then the endpoint network names
are used to resolve the conflict.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Sandbox needs unset gateway methods to cleanup
gateway settings to enable smooth transition
of the sandbox between endpoints.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- The libnetwork test code had some issues in not properly
passing the network options. Fixed it.
- Made controller a global value so that every test uses the
same controller instance.
- Cleaned up endpoint and network objects after every test.
- Extended the endpoint join test case to test the same container
join two different networks using two different endpoints.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Only remove the interfaces owned by the endpoint from
the sandbox when the container leaves the endpoint.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Without this they don't have the desired effect.
The default when creating these types of routes with ip route add is link - the old setting of universe was just wrong.
Signed-off-by: Tom Denham <tom.denham@metaswitch.com>
Instead of sleeping reworked the code to use recurring ticks.
Also cleaned up unnecessary defers.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Loopback interface was s not brought up when wemoved
to clone method of creating namespace. e. Adding it.
Also taking care of PR R comments.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This PR attempts to work around bugs present in kernel
version 3.18-4.0.1 relating to namespace creation
and destruction. This fix attempts to avoid certain
systemmcalls to not get in the kkernel bug path as well
as lazily garbage collecting the name paths when they are removed.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Do not have log output in packages that applications consume because the
output can mess with logs, stdout/stderr of applications and such and
there is nothing that the consumer can do about it other than change the
package that they are using.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Sir Codus Cleanupus vs. system.go
"This code doth offend me so" uttered Sir Codus Cleanupus
"and thus it must be cleft swiftly in twain". With one hefty
stroke of his broadsword he carved the old, unfunctioning
code in half and thus it ceased to be.
Signed-off-by: Dave Tucker <dt@docker.com>
Now that Endpoint interface has the Info method there is no
need to return container data as a return value in the Join
method. Removed the return value and fixed all the callers.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
The networkNamespace will record all interfaces joined into this sandbox.
While RremoveInterface func does't remove the leaved interfaces.
Signed-off-by: junxu <xujun@cmss.chinamobile.com>
In one of the previous commit, we went to the extreme of supporting just
the /{version}/networks. Though that satisfied the requirements for UI
integration, it is not fully consistent with Docker APIs.
Docker API supports both /{version}/resource and /resource and hence we
must add the same support for networks resource.
Also fixed a silly bug in api.go
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- Also unexporting configuration structures in bridge
- Changes in dnet/network.go to set bridge name = network name
Signed-off-by: Alessandro Boch <aboch@docker.com>
In order to support the docker experimental feature build, moving the
service commands under experimental tag. Please refer to :
https://github.com/docker/docker/pull/13338/ for more information
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Though libnetwork api is supposed to handle the sub router, it is given
the entire URL to deal with. But the current api.go assumes the network/
to be in the root path.
We need this patch to make it work seamlessly with docker & dnet UI & API
Signed-off-by: Madhu Venugopal <madhu@docker.com>
It is needed in cases when mapped port is already bound, or another
application bind mapped port. All this will be undetected because we use
iptables and not net.Listen.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Currently the driver api allows the driver to specify the
full interface name for the interface inside the container.
This is not appropriate since the driver does not have the full
view of the sandbox to correcly allocate an unambiguous interface
name. Instead with this PR the driver will be allowed to specify
a prefix for the name and libnetwork and sandbox layers will
disambiguate it with an appropriate suffix.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
In essense, this just involves marshalling structs back and forth to a
remote process, via the plugin client. There are a couple of types
that don't JSONify well, notably `net.IPNet`, so there is some
translation to be done.
To conform to the driverapi interface, we must give the list of
endpoint interfaces to the remote process, and let it puzzle out what
it's supposed to do; including the possibility of returning an error.
The constraints on EndpointInfo are enforced by the remote driver
implementation; namely:
* It can't be nil
* If it's got non-empty Interfaces(), the remote process can't put
more in
In the latter case, or if we fail to add an interface for some
(future) reason, we try to roll the endpoint creation back. Likewise
for join -- if we fail to set the fields of the JoinInfo, we roll the
join back by leaving.
Signed-off-by: Michael Bridgen <mikeb@squaremobius.net>
- Package types to define the interfaces libnetwork errors
may implement, so that caller can categorize them.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Fix POST to collection
- Only resource ID in URI, search by name as query parameter
- Fix URLs, consistency and restrict regex
Signed-off-by: Alessandro Boch <aboch@docker.com>
This is need to decouple types from netutils which has linux
dependencies. This way the client code which needs network types
can just pull in types package which makes client code platform
agnostic.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
The container's /etc/resolv.conf permission was getting setup
as 0600 while it should be 0644 for every user inside the
container to be able to read it. The tempfile that we create
initially to populate the resolvconf content is getting created
with 0600 mode. Changed it to 0644 once it is created since there
is noway to pass mode option to ioutil.Tempfile
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
It may happen that the application (docker) may exit ungracefully
exit without calling leaves on endpoint and may result in stale
namespace files. So if a sandbox is created with the same key
attempt to cleanup the file if it exists before creating the
sandbox.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
There is a panic when two containers joining a host
network leave one after another. The problem was that
in controller.go the sandboxData was not stored as a
pointer reference. So when we got the data from the map
it was the copy of the data and refcnt increment was done
on that. Changed it to hold a reference. Also added a test
case to test it.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Change namespace path to be /var/run/docker/netns since
/var/run/netns is being used by iproute2 and it is mounted
as MS_SHARED which causes some complications during integration.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Refactored the driver api so that is aligns well with the design
of endpoint lifecycle becoming decoupled from the container lifecycle.
Introduced go interfaces to obtain address information during CreateEndpoint.
Go interfaces are also used to get data from driver during join.
This sort of deisgn hides the libnetwork specific type details from drivers.
Another adjustment is to provide a list of interfaces during CreateEndpoint. The
goal of this is many-fold:
* To indicate to the driver that IP address has been assigned by some other
entity (like a user wanting to use their own static IP for an endpoint/container)
and asking the driver to honor this. Driver may reject this configuration
and return an error but it may not try to allocate an IP address and override
the passed one.
* To indicate to the driver that IP address has already been allocated once
for this endpoint by an instance of the same driver in some docker host
in the cluster and this is merely a notification about that endpoint and the
allocated resources.
* In case the list of interfaces is empty the driver is required to allocate and
assign IP addresses for this endpoint.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This commit brings in Remote driver integrated with the newly introduced
Plugin framework as a Docker Package.
The Plugin framework is designed as a Package and has no runtime
dependancy on Docker platform. It stands on its own and is a good
candidate for getting the remote drivers hooked to libnetwork
Signed-off-by: Madhu Venugopal <madhu@docker.com>
If resolvConfPath is unavailable and if the internally generated .hash file
is still present, then updateDNS should not consider the presence of internally
generated .hash. Rather, it must handle it as a case of using this
resolvConfPath for the first time.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
When an update is done to the container resolv.conf file
and it was inheriting host entries, then we should not
re-read the host entries when the container leaves and
re-joins the endpoint.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Defines and implement http handler for "/networks" URLs
- Addresses part of requirements tracked by Issue #5
Signed-off-by: Alessandro Boch <aboch@docker.com>
Updated resolvconf and iptables packages based on upstream
changes which we need for libnetwork rebase. There were
docker engine changes based on this so we need this to
be integrated now.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
The libnetwork test does not need to run inside a namespace
when inside a container. This results in unpredictable behavior
when the sandbox code unlocks the go routine from the OS thread
while the test code still wants it locked in the OS thread. This
will result in unreachable interfaces when the go routine
migrates to a different OS thread.
Fixed by passing a special test flag which is only set to true
when the test is run inside a container.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
If the bridge exists and it exists with a different link local ip address
than fe80::1/64 then we waifl to accept that as a valid configuration without
trying to add the default link local ip address. With this fix we always try
to add the default link local address if it doesn't exist.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
In the present code, each driver package provides a `New()` method
which constructs a driver of its type, which is then registered with
the controller.
However, this is not suitable for the `drivers/remote` package, since
it does not provide a (singleton) driver, but a mechanism for drivers
to be added dynamically. As a result, the implementation is oddly
dual-purpose, and a spurious `"remote"` driver is added to the
controller's list of available drivers.
Instead, it is better to provide the registration callback to each
package and let it register its own driver or drivers. That way, the
singleton driver packages can construct one and register it, and the
remote package can hook the callback up with whatever the dynamic
driver mechanism turns out to be.
NB there are some method signature changes; in particular to
controller.New, which can return an error if the built-in driver
packages fail to initialise.
Signed-off-by: Michael Bridgen <mikeb@squaremobius.net>
- Refactored the Join/Leave code so they are synchronized across multiple go-routines
- Added parallel test coverage to test mult-thread access to Join/Leave
- Updated sandbox code to revert back to caller namespace when removing interfaces
- Changed the netns path to /var/run/netns so the cleanup is simpler on machine
reboot scenario
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This commits brings in a functionality for remote drivers to register
with LibNetwork. The Built-In remote driver is responsible for the
actual "remote" plugin to be made available.
Having such a mechanism makes libnetwork core not dependent on any
external plugin mechanism and also the Libnetwork NB apis are free of
Driver interface.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This is an intial pass at the design docs.
Hopefully, we can merge this and then start accepting PRs to improve it!
Signed-off-by: Dave Tucker <dt@docker.com>
If we can't upload to coveralls, don't fail the build.
Goveralls and Coveralls have been a little flaky and started throwing
http 422 errors, although I still see coverage being reported.
It's best in the interim to ignore these, although this should be
removed in future when the service is more stable
Signed-off-by: Dave Tucker <dt@docker.com>
- Given this will be internal data, make a defensive copy to
protect from client inadvertently modifications.
Signed-off-by: Alessandro Boch <aboch@docker.com>
using a len(net.IP) to check for ipv4 or ipv6 is a bad idea.
And that was exactly done in NetworkOverlaps() function with the
assumption that any ipv4 net.IP will be of 4 bytes. Golang Net package
makes no such assumptions.
This assumption actually broke a particular use-case where the
NetworkOverlaps fails to identify a genuine overlap and that causes
datapath issues.
With this fix, we explicitely check for v4 or v6
Signed-off-by: Madhu Venugopal <madhu@docker.com>
container config.
- Added JoinOption processing for extra /etc/hosts record.
- Added support for updating /etc/hosts entries of other containers.
- Added sandbox support for adding a sandbox without the OS level create.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
After some delibration, we think it is better not to hold onto the
sandbox resources if an explicit call to Leave fails by the Driver.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- libnetwork cares for list of exposed ports, driver cares
for list of port bindings. At endpoint creation:
- list of exposed ports will be passed as libnetwork otion
- list of port mapping will be passed as driver option
Signed-off-by: Alessandro Boch <aboch@docker.com>
* Modified NB API with self referential var-aarg for future proofing the APIs
* Modified Driver API's option parameter to be a Map of interface{}
Signed-off-by: Madhu Venugopal <madhu@docker.com>
ISSUE:
- JoinOption type takes the exported interface Endpoint as parameter.
This does not allows libnetwork to control the setter functions
which will be executed by processOptions(). Client can now craft
any func (e Endpoint), pass it to Endpoint.Join() and have it executed.
Beside the fact this allows the client to shot himself in the foot,
there seem not to be a real need in having the JoinOption take the
Endpoint interface as parameter.
CHANGE:
- Changing the JoinOption signature to take a pointer to the unexported
endpoint structure. So now libnetwork is the only one that can define
the Join() method's options setter functions via the self referenced
JoinOption[...] functions.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Removed sandbox key argument for CreateEndpoint.
- Refactored bridge driver code to remove sandbox key.
- Fixed bridge driver code for gaps in ipv6 behavior
observed during docker integration.
- Updated test code, readme code, README.md according
api change.
- Fixed some sandbox issues while testing docker ipv6
integration.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Basically this is porting docker PR #9381 to libnetwork
- Added a Config.Validate() method where to consolidate
a priori validation of bridge configuration
- Have bridgeInterface store the current v4/v6 default gateways
- Introduced two setupStep functions to set the requested def gateways
Signed-off-by: Alessandro Boch <aboch@docker.com>
This is an experiment by modularizing the client UI handler in libnetwork
while the actual UI hook to the docker chain can come from Docker Project.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- setGatewayIP() => programGateway() becsause it is
causing confusion with setGateway() and setGatewayIPv6()
Signed-off-by: Alessandro Boch <aboch@docker.com>
- To reflect work flow. NewDriver() => ConfigureDriver()
and no NetworkDriver returned.
libnetwork clients would refer to a driver/network type, then
internally controller will retrieve the correspondent driver
instance, but this is not a concern of the clients.
- Remove NetworkDriver interface
- Removed stale blank dependency on bridge in libnetwork_test.go
Signed-off-by: Alessandro Boch <aboch@docker.com>
- This address one of the requirements of Issue #78
- Bridge MTU will be enforced on the veth pair ifaces
for each endpoint being added to the network.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- This addresses one requirement from Issue #79
- Defined EndpointConfiguration struct for bridge driver
which contains the user's preferred mac address for the
sanbox interface
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Store *Interface on endpoint create
- Remove from bridgeEndpoint ip params now available in Interface
- On endpoint delete attempt a removal of veth plugged into bridge
- (tested disabling defer netutils.SetupTestNetNS(t)() in libnetwrok_test)
- Fix bridge to store endpoints per sandbox
- Fix bug in error.go which causes stack overflow
- Start bridge error string w/ lower case as per go convention
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Move SanboxInfo and Interface structures in sandbox package
(changed it to Info as per golint)
- Move UUID to new internal pkg types
- Updated .gitignore to ignore IDE project files
Signed-off-by: Alessandro Boch <aboch@docker.com>
types, except the naked error returns which were just prefixing
strings to previously returned error strings.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
DESCRIPTION:
As part of bringing libnetwork bridge driver features
in parity with docker/daemon/network/driver/bridge
features (Issue #46), this commit addresses the
bridge.RequestPort() API.
Currenlty docker/api/server.go needs an hold of port
allocator in order to reserve a transport port which
will be used by the http server on the host machine,
so that portallocator does not give out that port when
queried by portmapper as part of network driver operations.
ISSUE:
Current implementation in docker is server.go directly
access portmapper and then portallocator from bridge pkg
calling bridge.RequestPort(). This also forces that function
to trigger portmapper initialization (in case bridge init()
was not executed yet), while portmapper life cycle should
only be controlled by bridge network driver.
We cannot mantain this behavior with libnetwrok as this
violates the modularization of networking code which
libnetwork is bringing in.
FIX:
Make portallocator a singleton, now both docker core and
portmapper code can initialize it and get the only one instance
(Change in docker core code will happen when docker code
will migrate to use libnetwork), given it is being used for
host specific needs.
NOTE:
Long term fix is having multiple portallocator instances (so
no more singleton) each capable to be in sync with OS regarding
current port allocation.
When this change comes, no change whould be required on portallocator'
clients side, changes will be confined to portallocator package.
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Added api enhancement to pass driver specific config
- Refactored simple bridge driver code for driver specific config
- Added an undocumented option to add non-default bridges without
manual pre-provisioning to help libnetwork testing
- Reenabled libnetwork test to do api testing
- Updated README.md
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Modified Mac address generation to match current standard
- Moved GenerateRandomName from libcontainer and removed the dependancy.
- Reduced entropy loop to 3 attempts.
Signed-off-by: Brent Salisbury <brent.salisbury@docker.com>
- Update Makefile to generate coverage details when running the tests
- Update CircleCI to use the Makefile
- Add Build and Coverage Badges to README
Closes#20
Signed-off-by: Dave Tucker <dt@docker.com>
- Update: portmapper, portallocator, ipallocator
- Remove stale godep dependencies
- Update pkg/iptables and others godep to latest
- Update bridge code and test after above changes
- Merge with latest changes in libnetwork
The code is updated up to docker/master commit SHA 86d66d6273
Signed-off-by: Alessandro Boch <aboch@docker.com>
- As they provide network translation functionalities,
they should be part of libnetwork
- In driver/bridge/setup_ip_tables.go remove depenency
on docker/daemon/networkdriver
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Move ipallocator package into libnetwork
- Also ported network utility functions and their tests in libnetwork:
docker/daemon/networkdriver/utilg.go => libnetwork/utils.go
docker/daemon/networkdriver/network_test.go => libnetwork/utils_test.go
- Changed drivers/setup_device.go and setup_ipv4.go to reuse functions in
utils.go, instead of redefining internally.
- Modified utils to use vishvananda/netlink instead of libcontainer/netlink
Signed-off-by: Alessandro Boch <aboch@socketplane.io>
- Port and refactor docker/damon/driver ip tables setup function
into libnetwork.
- Taken care of golint guideline for CI to pass
- Ran one more time goimports for CI to pass...
Signed-off-by: Alessandro Boch <aboch@socketplane.io>
# removes docker service that is currently installed on the runner. we
# could use Uninstall-Package but not yet available on Windows runners.
# more info: https://github.com/actions/virtual-environments/blob/d3a5bad25f3b4326c5666bab0011ac7f1beec95e/images/win/scripts/Installers/Install-Docker.ps1#L11
name:Removing current daemon
run:|
if (Get-Service docker -ErrorAction SilentlyContinue) {
$dockerVersion = (docker version -f "{{.Server.Version}}")
# FIXME: when updating golangci-lint, remove the temporary "nolint" in https://github.com/moby/moby/blob/7860686a8df15eea9def9e6189c6f9eca031bb6f/libnetwork/networkdb/cluster.go#L246
ARGGOLANGCI_LINT_VERSION=v1.49.0
RUN --mount=type=cache,target=/root/.cache/go-build \
# set the graph driver as the current graphdriver if not set
DOCKER_GRAPHDRIVER:=$(if$(DOCKER_GRAPHDRIVER),$(DOCKER_GRAPHDRIVER),$(shell docker info 2>&1| grep "Storage Driver"| sed 's/.*: //'))
@@ -66,10 +54,13 @@ DOCKER_ENVS := \
-e DOCKER_TEST_HOST \
-e DOCKER_USERLANDPROXY \
-e DOCKERD_ARGS \
-e DELVE_PORT \
-e GITHUB_ACTIONS \
-e TEST_FORCE_VALIDATE \
-e TEST_INTEGRATION_DIR \
-e TEST_SKIP_INTEGRATION \
-e TEST_SKIP_INTEGRATION_CLI \
-e TESTCOVERAGE \
-e TESTDEBUG \
-e TESTDIRS \
-e TESTFLAGS \
@@ -80,16 +71,11 @@ DOCKER_ENVS := \
-e VALIDATE_REPO \
-e VALIDATE_BRANCH \
-e VALIDATE_ORIGIN_BRANCH \
-e HTTP_PROXY \
-e HTTPS_PROXY \
-e NO_PROXY \
-e http_proxy \
-e https_proxy \
-e no_proxy \
-e VERSION \
-e PLATFORM \
-e DEFAULT_PRODUCT_LICENSE \
-e PRODUCT
-e PRODUCT\
-e PACKAGER_NAME
# note: we _cannot_ add "-e DOCKER_BUILDTAGS" here because even if it's unset in the shell, that would shadow the "ENV DOCKER_BUILDTAGS" set in our Dockerfile, which is very important for our official builds
# to allow `make BIND_DIR=. shell` or `make BIND_DIR= test`
# Note that `BIND_DIR` will already be set to `bundles` if `DOCKER_HOST` is not set (see above BIND_DIR line), in such case this will do nothing since `DOCKER_MOUNT` will already be set.
returnerrdefs.Unavailable(fmt.Errorf("this node is not a swarm manager. Use \"docker swarm init\" or \"docker swarm join\" to connect this node to swarm and try again"))
}elseif!c.manager{
returnerrdefs.Unavailable(fmt.Errorf("this node is not a swarm manager. Worker nodes can't be used to view or modify cluster state. Please run this command on a manager node or promote the current node to a manager"))
// Used to signify that streams are multiplexed and therefore need a StdWriter to encode stdout/stderr messages accordingly.
// TODO @cpuguy83: This shouldn't be needed. It was only added so that http and websocket endpoints can use the same function, and the websocket function was not using a stdwriter prior to this change...
// HOWEVER, the websocket endpoint is using a single stream and SHOULD be encoded with stdout/stderr as is done for HTTP since it is still just a single stream.
// Since such a change is an API change unrelated to the current changeset we'll keep it as is here and change separately.
// Used to signify that streams must be multiplexed by producer as endpoint can't manage multiple streams.
// This is typically set by HTTP endpoint, while websocket can transport raw streams
MuxStreamsbool
}
@@ -38,8 +35,6 @@ type PartialLogMetaData struct {
// LogMessage is datastructure that represents piece of output produced by some
// container. The Line member is a slice of an array whose contents can be
// changed after a log driver's Log() method returns.
// changes to this struct need to be reflect in the reset method in
// The driver specific options used when creating the volume.
//
// Required: true
Optionsmap[string]string`json:"Options"`
// The level at which the volume exists. Either `global` for cluster-wide,
// or `local` for machine level.
//
// Required: true
Scopestring`json:"Scope"`
// Low-level details about the volume, provided by the volume driver.
// Details are returned as a map with key/value pairs:
// `{"key":"value","key2":"value2"}`.
//
// The `Status` field is optional, and is omitted if the volume driver
// does not support this feature.
//
Status map[string]interface{} `json:"Status,omitempty"`
// usage data
UsageData*UsageData`json:"UsageData,omitempty"`
}
// UsageData Usage details about the volume. This information is used by the
// `GET /system/df` endpoint, and omitted in other endpoints.
//
// swagger:model UsageData
typeUsageDatastruct{
// The number of containers referencing this volume. This field
// is set to `-1` if the reference-count is not available.
//
// Required: true
RefCountint64`json:"RefCount"`
// Amount of disk space used by the volume (in bytes). This information
// is only available for volumes created with the `"local"` volume
// driver. For volumes created with other volume drivers, this field
// is set to `-1` ("not available")
//
// Required: true
Sizeint64`json:"Size"`
}
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.