We had a couple of runs where these jobs got stuck and github
actions didn't allow terminating them, so that they were only
terminated after 120 minutes.
These jobs usually complete in 5 minutes, so let's give them
a shorter timeout. 20 minutes should be enough (don't @ me).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit c68c9aed8c)
Signed-off-by: Austin Vazquez <macedonv@amazon.com>
We had a few "runaway jobs" recently, where the job got stuck, and kept
running for 6 hours (in one case even 24 hours, probably due some github
outage). Some of those jobs could not be terminated.
While running these actions on public repositories doesn't cost us, it's
still not desirable to have jobs running for that long (as they can still
hold up the queue).
This patch adds a blanket "2 hours" time-limit to all jobs that didn't
have a limit set. We should look at tweaking those limits to actually
expected duration, but having a default at least is a start.
Also changed the position of some existing timeouts so that we have a
consistent order in which it's set; making it easier to spot locations
where no limit is defined.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6b7e2783d1)
Signed-off-by: Austin Vazquez <macedonv@amazon.com>
The buildkit workflow uses Go to determine the version of Buildkit to run
integration-tests for. It currently uses on the default version that's
installed on the GitHub actions runners (1.21.13 currently), but this fails
if the go.mod/vendor.mod specify a higher version of Go as required version.
If this fails, the BUILDKIT_REF and REPO env-vars are not set / empty,
resulting in the workflow checking out the current (moby) repository instead
of buildkit, which fails.
This patch adds a step to explicitly install the expected version of Go.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 02d4fc3234)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- https://github.com/golang/go/issues?q=milestone%3AGo1.22.7+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.22.6...go1.22.7
These minor releases include 3 security fixes following the security policy:
- go/parser: stack exhaustion in all Parse* functions
Calling any of the Parse functions on Go source code which contains deeply nested literals can cause a panic due to stack exhaustion.
This is CVE-2024-34155 and Go issue https://go.dev/issue/69138.
- encoding/gob: stack exhaustion in Decoder.Decode
Calling Decoder.Decode on a message which contains deeply nested structures can cause a panic due to stack exhaustion.
This is a follow-up to CVE-2022-30635.
Thanks to Md Sakib Anwar of The Ohio State University (anwar.40@osu.edu) for reporting this issue.
This is CVE-2024-34156 and Go issue https://go.dev/issue/69139.
- go/build/constraint: stack exhaustion in Parse
Calling Parse on a "// +build" build tag line with deeply nested expressions can cause a panic due to stack exhaustion.
This is CVE-2024-34158 and Go issue https://go.dev/issue/69141.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.23.1
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit a2e14dd8bd)
Signed-off-by: Austin Vazquez <macedonv@amazon.com>
Provide more context to the steps we're doing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 65cfcc28ab)
Signed-off-by: Cory Snider <csnider@mirantis.com>
On bookworm, AppArmor failed to start inside the container, which can be
seen at startup of the dev-container:
Created symlink /etc/systemd/system/systemd-firstboot.service → /dev/null.
Created symlink /etc/systemd/system/systemd-udevd.service → /dev/null.
Created symlink /etc/systemd/system/multi-user.target.wants/docker-entrypoint.service → /etc/systemd/system/docker-entrypoint.service.
hack/dind-systemd: starting /lib/systemd/systemd --show-status=false --unit=docker-entrypoint.target
systemd 252.17-1~deb12u1 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
Detected virtualization docker.
Detected architecture x86-64.
modprobe@configfs.service: Deactivated successfully.
modprobe@dm_mod.service: Deactivated successfully.
modprobe@drm.service: Deactivated successfully.
modprobe@efi_pstore.service: Deactivated successfully.
modprobe@fuse.service: Deactivated successfully.
modprobe@loop.service: Deactivated successfully.
apparmor.service: Starting requested but asserts failed.
proc-sys-fs-binfmt_misc.automount: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 49 (systemd-binfmt)
+ source /etc/docker-entrypoint-cmd
++ hack/make.sh dynbinary test-integration
When checking "aa-status", an error was printed that the filesystem was
not mounted:
aa-status
apparmor filesystem is not mounted.
apparmor module is loaded.
Checking if "local-fs.target" was loaded, that seemed to be the case;
systemctl status local-fs.target
● local-fs.target - Local File Systems
Loaded: loaded (/lib/systemd/system/local-fs.target; static)
Active: active since Mon 2023-11-27 10:48:38 UTC; 18s ago
Docs: man:systemd.special(7)
However, **on the host**, "/sys/kernel/security" has a mount, which was not
present inside the container:
mount | grep securityfs
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
Interestingly, on `debian:bullseye`, this was not the case either; no
`securityfs` mount was present inside the container, and apparmor actually
failed to start, but succeeded silently:
mount | grep securityfs
systemctl start apparmor
systemctl status apparmor
● apparmor.service - Load AppArmor profiles
Loaded: loaded (/lib/systemd/system/apparmor.service; enabled; vendor preset: enabled)
Active: active (exited) since Mon 2023-11-27 11:59:09 UTC; 44s ago
Docs: man:apparmor(7)
https://gitlab.com/apparmor/apparmor/wikis/home/
Process: 43 ExecStart=/lib/apparmor/apparmor.systemd reload (code=exited, status=0/SUCCESS)
Main PID: 43 (code=exited, status=0/SUCCESS)
CPU: 10ms
Nov 27 11:59:09 9519f89cade1 apparmor.systemd[43]: Not starting AppArmor in container
Same, using the `/etc/init.d/apparmor` script:
/etc/init.d/apparmor start
Starting apparmor (via systemctl): apparmor.service.
echo $?
0
And apparmor was not actually active:
aa-status
apparmor module is loaded.
apparmor filesystem is not mounted.
aa-enabled
Maybe - policy interface not available.
After further investigating, I found that the non-systemd dind script
had a mount for AppArmor, which was added in 31638ab2ad
The systemd variant was missing this mount, which may have gone unnoticed
because `debian:bullseye` was silently ignoring this when starting the
apparmor service.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit cfb8ca520a)
Signed-off-by: Cory Snider <csnider@mirantis.com>
We used DEBIAN_FRONTEND in some places to prevent installation of packages
from being blocked. However, debian bookworm now [includes a fix][1] for
situations like this (it was specifically reported for Docker situations <3),
so we can get rid of these.
Thanks to Tianon for noticing this, and for linking to the Debian ticket!
[1]: https://bugs.debian.org/929417
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit fd40dfaf58)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Also switch yamllint to be installed from debian's packages, which are
currently at v1.29.0.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit e72c4818c4)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Debian Woodworm ships with a newer version of iptables, which caused two
tests to fail:
=== FAIL: amd64.integration-cli TestDockerDaemonSuite/TestDaemonICCLinkExpose (1.18s)
docker_cli_daemon_test.go:841: assertion failed: false (matched bool) != true (true bool): iptables output should have contained "DROP.*all.*ext-bridge6.*ext-bridge6", but was "Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)\n pkts bytes target prot opt in out source destination \n 0 0 DOCKER-USER 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 DOCKER-ISOLATION-STAGE-1 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- * ext-bridge6 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED\n 0 0 DOCKER 0 -- * ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- ext-bridge6 !ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n 0 0 DROP 0 -- ext-bridge6 ext-bridge6 0.0.0.0/0 0.0.0.0/0 \n"
--- FAIL: TestDockerDaemonSuite/TestDaemonICCLinkExpose (1.18s)
=== FAIL: amd64.integration-cli TestDockerDaemonSuite/TestDaemonICCPing (1.19s)
docker_cli_daemon_test.go:803: assertion failed: false (matched bool) != true (true bool): iptables output should have contained "DROP.*all.*ext-bridge5.*ext-bridge5", but was "Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)\n pkts bytes target prot opt in out source destination \n 0 0 DOCKER-USER 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 DOCKER-ISOLATION-STAGE-1 0 -- * * 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- * ext-bridge5 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED\n 0 0 DOCKER 0 -- * ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n 0 0 ACCEPT 0 -- ext-bridge5 !ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n 0 0 DROP 0 -- ext-bridge5 ext-bridge5 0.0.0.0/0 0.0.0.0/0 \n"
--- FAIL: TestDockerDaemonSuite/TestDaemonICCPing (1.19s)
Both the `TestDaemonICCPing`, and `TestDaemonICCLinkExpose` test were introduced
in dd0666e64f. These tests called `iptables` with
the `-n` (`--numeric`) option, which prevents it from doing a reverse-DNS lookup
as an optimization.
However, the `-n` option did not have an effect to the `prot` column before
commit [da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa] (iptables < v1.8.9 or v1.8.8).
Newer versions, such as the iptables version shipping with Debian Woodworm do,
so we need to update the expected output for this version.
This patch removes the `-n` option, to keep the test more portable, also when
run non-containerized, and removes the use of regular expressions to check the
result, as these regular expressions were quite permissive (using `.*` wild-
card matching). Instead, we're getting the
With this change;
make DOCKER_GRAPHDRIVER=vfs TEST_FILTER=TestDaemonICC TEST_IGNORE_CGROUP_CHECK=1 test-integration
...
--- PASS: TestDockerDaemonSuite (139.11s)
--- PASS: TestDockerDaemonSuite/TestDaemonICCLinkExpose (54.62s)
--- PASS: TestDockerDaemonSuite/TestDaemonICCPing (84.48s)
[da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa]: https://git.netfilter.org/iptables/commit/?id=da8ecc62dd765b15df84c3aa6b83dcb7a81d4ffa
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit c3eed9fa3e)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Starting with [6e0ed3d19c54603f0f7d628ea04b550151d8a262], the minimum
allowed size is now 300MB. Given that this is a sparse image, and
the size of the image is irrelevant to the test (we check for
limits defined through project-quotas, not the size of the
device itself), we can raise the size of this image.
[6e0ed3d19c54603f0f7d628ea04b550151d8a262]: https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/commit/?id=6e0ed3d19c54603f0f7d628ea04b550151d8a262
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 9709b7e458)
Signed-off-by: Cory Snider <csnider@mirantis.com>
cross-compiling for arm/v5 was failing;
#56 84.12 /usr/bin/arm-linux-gnueabi-clang -marm -o $WORK/b001/exe/a.out -Wl,--export-dynamic-symbol=_cgo_panic -Wl,--export-dynamic-symbol=_cgo_topofstack -Wl,--export-dynamic-symbol=crosscall2 -Qunused-arguments -Wl,--compress-debug-sections=zlib /tmp/go-link-759578347/go.o /tmp/go-link-759578347/000000.o /tmp/go-link-759578347/000001.o /tmp/go-link-759578347/000002.o /tmp/go-link-759578347/000003.o /tmp/go-link-759578347/000004.o /tmp/go-link-759578347/000005.o /tmp/go-link-759578347/000006.o /tmp/go-link-759578347/000007.o /tmp/go-link-759578347/000008.o /tmp/go-link-759578347/000009.o /tmp/go-link-759578347/000010.o /tmp/go-link-759578347/000011.o /tmp/go-link-759578347/000012.o /tmp/go-link-759578347/000013.o /tmp/go-link-759578347/000014.o /tmp/go-link-759578347/000015.o /tmp/go-link-759578347/000016.o /tmp/go-link-759578347/000017.o /tmp/go-link-759578347/000018.o -O2 -g -O2 -g -O2 -g -lpthread -O2 -g -no-pie -static
#56 84.12 ld.lld: error: undefined symbol: __atomic_load_4
#56 84.12 >>> referenced by gcc_libinit.c
#56 84.12 >>> /tmp/go-link-759578347/000009.o:(_cgo_wait_runtime_init_done)
#56 84.12 >>> referenced by gcc_libinit.c
#56 84.12 >>> /tmp/go-link-759578347/000009.o:(_cgo_wait_runtime_init_done)
#56 84.12 >>> referenced by gcc_libinit.c
#56 84.12 >>> /tmp/go-link-759578347/000009.o:(_cgo_wait_runtime_init_done)
#56 84.12 >>> referenced 2 more times
#56 84.12
#56 84.12 ld.lld: error: undefined symbol: __atomic_store_4
#56 84.12 >>> referenced by gcc_libinit.c
#56 84.12 >>> /tmp/go-link-759578347/000009.o:(_cgo_wait_runtime_init_done)
#56 84.12 >>> referenced by gcc_libinit.c
#56 84.12 >>> /tmp/go-link-759578347/000009.o:(x_cgo_notify_runtime_init_done)
#56 84.12 >>> referenced by gcc_libinit.c
#56 84.12 >>> /tmp/go-link-759578347/000009.o:(x_cgo_set_context_function)
#56 84.12 clang: error: linker command failed with exit code 1 (use -v to see invocation)
From discussion on GitHub;
https://github.com/moby/moby/pull/46982#issuecomment-2206992611
The arm/v5 build failure looks to be due to libatomic not being included
in the link. For reasons probably buried in mailing list archives,
[gcc](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358) and clang don't
bother to implicitly auto-link libatomic. This is not a big deal on many
modern platforms with atomic intrinsics as the compiler generates inline
instruction sequences, avoiding any libcalls into libatomic. ARMv5 is not
one of those platforms: all atomic operations require a libcall.
In theory, adding `CGO_LDFLAGS=-latomic` should fix arm/v5 builds.
While it could be argued that cgo should automatically link against
libatomic in the same way that it automatically links against libpthread,
the Go maintainers would have a valid counter-argument that it should be
the C toolchain's responsibility to link against libatomic automatically,
just like it does with libgcc or compiler-rt.
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 4cd5c2b643)
Signed-off-by: Cory Snider <csnider@mirantis.com>
cross-compiling for arm/v5 fails on go1.22; a fix is included for this
in go1.23 (https://github.com/golang/go/issues/65290), but for go1.22
we can set the correct option manually.
1.189 + go build -mod=vendor -modfile=vendor.mod -o /tmp/bundles/binary-daemon/dockerd -tags 'netgo osusergo static_build journald' -ldflags '-w -X "github.com/docker/docker/dockerversion.Version=dev" -X "github.com/docker/docker/dockerversion.GitCommit=HEAD" -X "github.com/docker/docker/dockerversion.BuildTime=2024-08-29T16:59:57.000000000+00:00" -X "github.com/docker/docker/dockerversion.PlatformName=" -X "github.com/docker/docker/dockerversion.ProductName=" -X "github.com/docker/docker/dockerversion.DefaultProductLicense=" -extldflags -static ' -gcflags= github.com/docker/docker/cmd/dockerd
67.78 # runtime/cgo
67.78 gcc_libinit.c:44:8: error: large atomic operation may incur significant performance penalty; the access size (4 bytes) exceeds the max lock-free size (0 bytes) [-Werror,-Watomic-alignment]
67.78 gcc_libinit.c:47:6: error: large atomic operation may incur significant performance penalty; the access size (4 bytes) exceeds the max lock-free size (0 bytes) [-Werror,-Watomic-alignment]
67.78 gcc_libinit.c:49:10: error: large atomic operation may incur significant performance penalty; the access size (4 bytes) exceeds the max lock-free size (0 bytes) [-Werror,-Watomic-alignment]
67.78 gcc_libinit.c:69:9: error: large atomic operation may incur significant performance penalty; the access size (4 bytes) exceeds the max lock-free size (0 bytes) [-Werror,-Watomic-alignment]
67.78 gcc_libinit.c:71:3: error: large atomic operation may incur significant performance penalty; the access size (4 bytes) exceeds the max lock-free size (0 bytes) [-Werror,-Watomic-alignment]
78.20 + rm -f /go/src/github.com/docker/docker/go.mod
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit e853c093bf)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Looks like the way it picks up #nosec comments changed, causing the
linter error to re-appear;
pkg/archive/archive_linux.go:57:17: G305: File traversal when extracting zip/tar archive (gosec)
Name: filepath.Join(hdr.Name, WhiteoutOpaqueDir),
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d4160d5aa7)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Looks like the way it picks up #nosec comments changed, causing the
linter error to re-appear;
builder/remotecontext/remote.go:48:17: G107: Potential HTTP request made with variable url (gosec)
if resp, err = http.Get(address); err != nil {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 04bf0e3d69)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Add a nil check to handle a case where the image config JSON would
deserialize into a nil map.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 642242a26b)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
go1.21.11 (released 2024-06-04) includes security fixes to the archive/zip
and net/netip packages, as well as bug fixes to the compiler, the go command,
the runtime, and the os package. See the Go 1.21.11 milestone on our issue
tracker for details;
- https://github.com/golang/go/issues?q=milestone%3AGo1.21.11+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.21.10...go1.21.11
From the security announcement;
We have just released Go versions 1.22.4 and 1.21.11, minor point releases.
These minor releases include 2 security fixes following the security policy:
- archive/zip: mishandling of corrupt central directory record
The archive/zip package's handling of certain types of invalid zip files
differed from the behavior of most zip implementations. This misalignment
could be exploited to create an zip file with contents that vary depending
on the implementation reading the file. The archive/zip package now rejects
files containing these errors.
Thanks to Yufan You for reporting this issue.
This is CVE-2024-24789 and Go issue https://go.dev/issue/66869.
- net/netip: unexpected behavior from Is methods for IPv4-mapped IPv6 addresses
The various Is methods (IsPrivate, IsLoopback, etc) did not work as expected
for IPv4-mapped IPv6 addresses, returning false for addresses which would
return true in their traditional IPv4 forms.
Thanks to Enze Wang of Alioth and Jianjun Chen of Zhongguancun Lab
for reporting this issue.
This is CVE-2024-24790 and Go issue https://go.dev/issue/67680.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 91e2c29865)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We currently depend on the containerd platform-parsing to return typed
errdefs errors; the new containerd platforms module does not return such
errors, and documents that errors returned should not be used as sentinel
errors; c1438e911a/errors.go (L21-L30)
Let's type these errors ourselves, so that we don't depend on the error-types
returned by containerd, and consider that eny platform string that results in
an error is an invalid parameter.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit cd1ed46d73)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If a node is promoted right after another node is demoted, there exists
the possibility of a race, by which the newly promoted manager attempts
to connect to the newly demoted manager for its initial Raft membership.
This connection fails, and the whole swarm Node object exits.
At this point, the daemon nodeRunner sees the exit and restarts the
Node.
However, if the address of the no-longer-manager is recorded in the
nodeRunner's config.joinAddr, the Node again attempts to connect to the
no-longer-manager, and crashes again. This repeats. The solution is to
remove the node entirely and rejoin the Swarm as a new node.
This change erases config.joinAddr from the restart of the nodeRunner,
if the node has previously become Ready. The node becoming Ready
indicates that at some point, it did successfully join the cluster, in
some fashion. If it has successfully joined the cluster, then Swarm has
its own persistent record of known manager addresses. If no joinAddr is
provided, then Swarm will choose from its persisted list of managers to
join, and will join a functioning manager.
Signed-off-by: Drew Erny <derny@mirantis.com>
(cherry picked from commit 16e5c41591)
Signed-off-by: Drew Erny <derny@mirantis.com>
- https://github.com/golang/go/issues?q=milestone%3AGo1.21.10+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.21.9...go1.21.10
These minor releases include 2 security fixes following the security policy:
- cmd/go: arbitrary code execution during build on darwin
On Darwin, building a Go module which contains CGO can trigger arbitrary code execution when using the Apple version of ld, due to usage of the -lto_library flag in a "#cgo LDFLAGS" directive.
Thanks to Juho Forsén of Mattermost for reporting this issue.
This is CVE-2024-24787 and Go issue https://go.dev/issue/67119.
- net: malformed DNS message can cause infinite loop
A malformed DNS message in response to a query can cause the Lookup functions to get stuck in an infinite loop.
Thanks to long-name-let-people-remember-you on GitHub for reporting this issue, and to Mateusz Poliwczak for bringing the issue to our attention.
This is CVE-2024-24788 and Go issue https://go.dev/issue/66754.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.22.3
**- Description for the changelog**
```markdown changelog
Update Go runtime to 1.21.10
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 6c97e0e0b5)
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
# Conflicts:
# .github/workflows/.windows.yml
# .github/workflows/buildkit.yml
# .github/workflows/test.yml
# Dockerfile
# Dockerfile.simple
# Dockerfile.windows
# hack/dockerfiles/generate-files.Dockerfile
/usr/sbin/runc is confined with "runc" profile[1] introduced in AppArmor
v4.0.0. This change breaks stopping of containers, because the profile
assigned to containers doesn't accept signals from the "runc" peer.
AppArmor >= v4.0.0 is currently part of Ubuntu Mantic (23.10) and later.
In the case of Docker, this regression is hidden by the fact that
dockerd itself sends SIGKILL to the running container after runc fails
to stop it. It is still a regression, because graceful shutdowns of
containers via "docker stop" are no longer possible, as SIGTERM from
runc is not delivered to them. This can be seen in logs from dockerd
when run with debug logging enabled and also from tracing signals with
killsnoop utility from bcc[2] (in bpfcc-tools package in Debian/Ubuntu):
Test commands:
root@cloudimg:~# docker run -d --name test redis
ba04c137827df8468358c274bc719bf7fc291b1ed9acf4aaa128ccc52816fe46
root@cloudimg:~# docker stop test
Relevant syslog messages (with wrapped long lines):
Apr 23 20:45:26 cloudimg kernel: audit:
type=1400 audit(1713905126.444:253): apparmor="DENIED"
operation="signal" class="signal" profile="docker-default" pid=9289
comm="runc" requested_mask="receive" denied_mask="receive"
signal=kill peer="runc"
Apr 23 20:45:36 cloudimg dockerd[9030]:
time="2024-04-23T20:45:36.447016467Z"
level=warning msg="Container failed to exit within 10s of kill - trying direct SIGKILL"
container=ba04c137827df8468358c274bc719bf7fc291b1ed9acf4aaa128ccc52816fe46
error="context deadline exceeded"
Killsnoop output after "docker stop ...":
root@cloudimg:~# killsnoop-bpfcc
TIME PID COMM SIG TPID RESULT
20:51:00 9631 runc 3 9581 -13
20:51:02 9637 runc 9 9581 -13
20:51:12 9030 dockerd 9 9581 0
This change extends the docker-default profile with rules that allow
receiving signals from processes that run confined with either runc or
crun profile (crun[4] is an alternative OCI runtime that's also confined
in AppArmor >= v4.0.0, see [1]). It is backward compatible because the
peer value is a regular expression (AARE) so the referenced profile
doesn't have to exist for this profile to successfully compile and load.
Note that the runc profile has an attachment to /usr/sbin/runc. This is
the path where the runc package in Debian/Ubuntu puts the binary. When
the docker-ce package is installed from the upstream repository[3], runc
is installed as part of the containerd.io package at /usr/bin/runc.
Therefore it's still running unconfined and has no issues sending
signals to containers.
[1] https://gitlab.com/apparmor/apparmor/-/commit/2594d936
[2] https://github.com/iovisor/bcc/blob/master/tools/killsnoop.py
[3] https://download.docker.com/linux/ubuntu
[4] https://github.com/containers/crun
Signed-off-by: Tomáš Virtus <nechtom@gmail.com>
(cherry picked from commit 5ebe2c0d6b)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This was using `errors.Wrap` when there was no error to wrap, meanwhile
we are supposed to be creating a new error.
Found this while investigating some log corruption issues and
unexpectedly getting a nil reader and a nil error from `getTailReader`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 0a48d26fbc)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This commit makes sure the embedded resolver doesn't try to forward to
upstream servers when a container is only attached to an internal
network.
Co-authored-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
(cherry picked from commit 790c3039d0)
Signed-off-by: Cory Snider <csnider@mirantis.com>
This commit introduces a new integration test suite aimed at testing
networking features like inter-container communication, network
isolation, port mapping, etc... and how they interact with daemon-level
and network-level parameters.
So far, there's pretty much no tests making sure our networks are well
configured: 1. there're a few tests for port mapping, but they don't
cover all use cases ; 2. there're a few tests that check if a specific
iptables rule exist, but that doesn't prevent that specific iptables
rule to be wrong in the first place.
As we're planning to refactor how iptables rules are written, and change
some of them to fix known security issues, we need a way to test all
combinations of parameters. So far, this was done by hand, which is
particularly painful and time consuming. As such, this new test suite is
foundational to upcoming work.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit 409ea700c7)
Signed-off-by: Cory Snider <csnider@mirantis.com>
With both rootless and live restore enabled, there's some race condition
which causes the container to be `Unmount`ed before the refcount is
restored.
This makes sure we don't underflow the refcount (uint64) when
decrementing it.
The root cause of this race condition still needs to be investigated and
fixed, but at least this unflakies the `TestLiveRestore`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 294fc9762e)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
When this was called concurrently from the moby image
exporter there could be a data race where a layer was
written to the refs map when it was already there.
In that case the reference count got mixed up and on
release only one of these layers was actually released.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(cherry picked from commit 37545cc644)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
In de2447c, the creation of the 'lower' file was changed from using
os.Create to using ioutils.AtomicWriteFile, which ignores the system's
umask. This means that even though the requested permission in the
source code was always 0666, it was 0644 on systems with default
umask of 0022 prior to de2447c, so the move to AtomicFile potentially
increased the file's permissions.
This is not a security issue because the parent directory does not
allow writes into the file, but it can confuse security scanners on
Linux-based systems into giving false positives.
Signed-off-by: Jaroslav Jindrak <dzejrou@gmail.com>
(cherry picked from commit cadb124ab6)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The platform comparison was backported from the branch that vendors
containerd 1.7.
In this branch the vendored containerd version is older and doesn't have
the same comparison logic for Windows specific OSVersion.
Require both major and minor components of Windows OSVersion to match.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit b3888ed899)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 9f6c532f59
futex: Add sys_futex_wake()
To complement sys_futex_waitv() add sys_futex_wake(). This syscall
implements what was previously known as FUTEX_WAKE_BITSET except it
uses 'unsigned long' for the bitmask and takes FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d69729e053)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cb8c4312af
futex: Add sys_futex_wait()
To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This
syscall implements what was previously known as FUTEX_WAIT_BITSET
except it uses 'unsigned long' for the value and bitmask arguments,
takes timespec and clockid_t arguments for the absolute timeout and
uses FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 10d344d176)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 0f4b5f9722
futex: Add sys_futex_requeue()
Finish off the 'simple' futex2 syscall group by adding
sys_futex_requeue(). Unlike sys_futex_{wait,wake}() its arguments are
too numerous to fit into a regular syscall. As such, use struct
futex_waitv to pass the 'source' and 'destination' futexes to the
syscall.
This syscall implements what was previously known as FUTEX_CMP_REQUEUE
and uses {val, uaddr, flags} for source and {uaddr, flags} for
destination.
This design explicitly allows requeueing between different types of
futex by having a different flags word per uaddr.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit df57a080b6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: c35559f94e
x86/shstk: Introduce map_shadow_stack syscall
When operating with shadow stacks enabled, the kernel will automatically
allocate shadow stacks for new threads, however in some cases userspace
will need additional shadow stacks. The main example of this is the
ucontext family of functions, which require userspace allocating and
pivoting to userspace managed stacks.
Unlike most other user memory permissions, shadow stacks need to be
provisioned with special data in order to be useful. They need to be setup
with a restore token so that userspace can pivot to them via the RSTORSSP
instruction. But, the security design of shadow stacks is that they
should not be written to except in limited circumstances. This presents a
problem for userspace, as to how userspace can provision this special
data, without allowing for the shadow stack to be generally writable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 8826f402f9)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: 09da082b07
fs: Add fchmodat2()
On the userspace side fchmodat(3) is implemented as a wrapper
function which implements the POSIX-specified interface. This
interface differs from the underlying kernel system call, which does not
have a flags argument. Most implementations require procfs [1][2].
There doesn't appear to be a good userspace workaround for this issue
but the implementation in the kernel is pretty straight-forward.
The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
unlike existing fchmodat.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6f242f1a28)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add this syscall to match the profile in containerd
containerd: a6e52c74fa
libseccomp: 53267af3fb
kernel: cf264e1329
NAME
cachestat - query the page cache statistics of a file.
SYNOPSIS
#include <sys/mman.h>
struct cachestat_range {
__u64 off;
__u64 len;
};
struct cachestat {
__u64 nr_cache;
__u64 nr_dirty;
__u64 nr_writeback;
__u64 nr_evicted;
__u64 nr_recently_evicted;
};
int cachestat(unsigned int fd, struct cachestat_range *cstat_range,
struct cachestat *cstat, unsigned int flags);
DESCRIPTION
cachestat() queries the number of cached pages, number of dirty
pages, number of pages marked for writeback, number of evicted
pages, number of recently evicted pages, in the bytes range given by
`off` and `len`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4d0d5ee10d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This syscall is gated by CAP_SYS_NICE, matching the profile in containerd.
containerd: a6e52c74fa
libseccomp: d83cb7ac25
kernel: c6018b4b25
mm/mempolicy: add set_mempolicy_home_node syscall
This syscall can be used to set a home node for the MPOL_BIND and
MPOL_PREFERRED_MANY memory policy. Users should use this syscall after
setting up a memory policy for the specified range as shown below.
mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
new_nodes->size + 1, 0);
sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size,
home_node, 0);
The syscall allows specifying a home node/preferred node from which
kernel will fulfill memory allocation requests first.
...
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1251982cf7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The compatibility depends on whether `hyperv` or `process` container
isolation is used.
This fixes cache not being used when building images based on older
Windows versions on a newer Windows host.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 91ea04089b)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Only print the tag when the received reference has a tag, if
we can't cast the received tag to a `reference.Tagged` then
skip printing the tag as it's likely a digest.
Fixes panic when trying to install a plugin from a reference
with a digest such as
`vieux/sshfs@sha256:1d3c3e42c12138da5ef7873b97f7f32cf99fb6edde75fa4f0bcf9ed277855811`
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Adds a test case for installing a plugin from a remote in the form
of `plugin-content-trust@sha256:d98f2f8061...`, which is currently
causing the daemon to panic, as we found while running the CLI e2e
tests:
```
docker plugin install registry:5000/plugin-content-trust@sha256:d98f2f806144bf4ba62d4ecaf78fec2f2fe350df5a001f6e3b491c393326aedb
```
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update the containerd binary that's used in CI to align with the version used
for Linux. This was missed in d6abda0710.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The double quotes inside a single quoted string don't need to be
escaped.
Looks like different Powershell versions are treating this differently
and it started failing unexpectedly without any changes on our side.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit ecb217cf69)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Turn subsequent `Close` calls into a no-op and produce a warning with an
optional stack trace (if debug mode is enabled).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 585d74bad1)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
go1.20.12 (released 2023-12-05) includes security fixes to the go command,
and the net/http and path/filepath packages, as well as bug fixes to the
compiler and the go command. See the Go 1.20.12 milestone on our issue
tracker for details.
- https://github.com/golang/go/issues?q=milestone%3AGo1.20.12+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.20.11...go1.20.12
from the security mailing:
[security] Go 1.21.5 and Go 1.20.12 are released
Hello gophers,
We have just released Go versions 1.21.5 and 1.20.12, minor point releases.
These minor releases include 3 security fixes following the security policy:
- net/http: limit chunked data overhead
A malicious HTTP sender can use chunk extensions to cause a receiver
reading from a request or response body to read many more bytes from
the network than are in the body.
A malicious HTTP client can further exploit this to cause a server to
automatically read a large amount of data (up to about 1GiB) when a
handler fails to read the entire body of a request.
Chunk extensions are a little-used HTTP feature which permit including
additional metadata in a request or response body sent using the chunked
encoding. The net/http chunked encoding reader discards this metadata.
A sender can exploit this by inserting a large metadata segment with
each byte transferred. The chunk reader now produces an error if the
ratio of real body to encoded bytes grows too small.
Thanks to Bartek Nowotarski for reporting this issue.
This is CVE-2023-39326 and Go issue https://go.dev/issue/64433.
- cmd/go: go get may unexpectedly fallback to insecure git
Using go get to fetch a module with the ".git" suffix may unexpectedly
fallback to the insecure "git://" protocol if the module is unavailable
via the secure "https://" and "git+ssh://" protocols, even if GOINSECURE
is not set for said module. This only affects users who are not using
the module proxy and are fetching modules directly (i.e. GOPROXY=off).
Thanks to David Leadbeater for reporting this issue.
This is CVE-2023-45285 and Go issue https://go.dev/issue/63845.
- path/filepath: retain trailing \ when cleaning paths like \\?\c:\
Go 1.20.11 and Go 1.21.4 inadvertently changed the definition of the
volume name in Windows paths starting with \\?\, resulting in
filepath.Clean(\\?\c:\) returning \\?\c: rather than \\?\c:\ (among
other effects). The previous behavior has been restored.
This is an update to CVE-2023-45283 and Go issue https://go.dev/issue/64028.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.20.11 (released 2023-11-07) includes security fixes to the path/filepath
package, as well as bug fixes to the linker and the net/http package. See the
Go 1.20.11 milestone on our issue tracker for details:
- https://github.com/golang/go/issues?q=milestone%3AGo1.20.11+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.20.10...go1.20.11
from the security mailing:
[security] Go 1.21.4 and Go 1.20.11 are released
Hello gophers,
We have just released Go versions 1.21.4 and 1.20.11, minor point releases.
These minor releases include 2 security fixes following the security policy:
- path/filepath: recognize `\??\` as a Root Local Device path prefix.
On Windows, a path beginning with `\??\` is a Root Local Device path equivalent
to a path beginning with `\\?\`. Paths with a `\??\` prefix may be used to
access arbitrary locations on the system. For example, the path `\??\c:\x`
is equivalent to the more common path c:\x.
The filepath package did not recognize paths with a `\??\` prefix as special.
Clean could convert a rooted path such as `\a\..\??\b` into
the root local device path `\??\b`. It will now convert this
path into `.\??\b`.
`IsAbs` did not report paths beginning with `\??\` as absolute.
It now does so.
VolumeName now reports the `\??\` prefix as a volume name.
`Join(`\`, `??`, `b`)` could convert a seemingly innocent
sequence of path elements into the root local device path
`\??\b`. It will now convert this to `\.\??\b`.
This is CVE-2023-45283 and https://go.dev/issue/63713.
- path/filepath: recognize device names with trailing spaces and superscripts
The `IsLocal` function did not correctly detect reserved names in some cases:
- reserved names followed by spaces, such as "COM1 ".
- "COM" or "LPT" followed by a superscript 1, 2, or 3.
`IsLocal` now correctly reports these names as non-local.
This is CVE-2023-45284 and https://go.dev/issue/63713.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Attempting to delete the directory while another goroutine is
concurrently executing a CheckpointTo() can fail on Windows due to file
locking. As all callers of CheckpointTo() are required to hold the
container lock, holding the lock while deleting the directory ensures
that there will be no interference.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 18e322bc7c)
Signed-off-by: Cory Snider <csnider@mirantis.com>
When reading logs, timestamps should always be presented in UTC. Unlike
the "json-file" and other logging drivers, the "local" logging driver
was using local time.
Thanks to Roman Valov for reporting this issue, and locating the bug.
Before this change:
echo $TZ
Europe/Amsterdam
docker run -d --log-driver=local nginx:alpine
fc166c6b2c35c871a13247dddd95de94f5796459e2130553eee91cac82766af3
docker logs --timestamps fc166c6b2c35c871a13247dddd95de94f5796459e2130553eee91cac82766af3
2023-12-08T18:16:56.291023422+01:00 /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
2023-12-08T18:16:56.291056463+01:00 /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
2023-12-08T18:16:56.291890130+01:00 /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
...
With this patch:
echo $TZ
Europe/Amsterdam
docker run -d --log-driver=local nginx:alpine
14e780cce4c827ce7861d7bc3ccf28b21f6e460b9bfde5cd39effaa73a42b4d5
docker logs --timestamps 14e780cce4c827ce7861d7bc3ccf28b21f6e460b9bfde5cd39effaa73a42b4d5
2023-12-08T17:18:46.635967625Z /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
2023-12-08T17:18:46.635989792Z /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
2023-12-08T17:18:46.636897417Z /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
...
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit afe281964d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is the eleventh patch release in the 1.1.z release branch of runc.
It primarily fixes a few issues with runc's handling of containers that
are configured to join existing user namespaces, as well as improvements
to cgroupv2 support.
- Fix several issues with userns path handling.
- Support memory.peak and memory.swap.peak in cgroups v2.
Add swapOnlyUsage in MemoryStats. This field reports swap-only usage.
For cgroupv1, Usage and Failcnt are set by subtracting memory usage
from memory+swap usage. For cgroupv2, Usage, Limit, and MaxUsage
are set.
- build(deps): bump github.com/cyphar/filepath-securejoin.
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.11
- full diff: https://github.com/opencontainers/runc/compare/v1.1.10...v1.1.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 5fa4cfcabf)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/opencontainers/runc/compare/v1.1.9...v1.1.10
- release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.10
This is the tenth (and most likely final) patch release in the 1.1.z
release branch of runc. It mainly fixes a few issues in cgroups, and a
umask-related issue in tmpcopyup.
- Add support for `hugetlb.<pagesize>.rsvd` limiting and accounting.
Fixes the issue of postgres failing when hugepage limits are set.
- Fixed permissions of a newly created directories to not depend on the value
of umask in tmpcopyup feature implementation.
- libcontainer: cgroup v1 GetStats now ignores missing `kmem.limit_in_bytes`
(fixes the compatibility with Linux kernel 6.1+).
- Fix a semi-arbitrary cgroup write bug when given a malicious hugetlb
configuration. This issue is not a security issue because it requires a
malicious config.json, which is outside of our threat model.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 15bcc707e6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When the daemon process or the host running it is abruptly terminated,
the layer metadata file can become inconsistent on the file system.
Specifically, `link` and `lower` files may exist but be empty, leading
to overlay mounting errors during layer extraction, such as:
"failed to register layer: error creating overlay mount to <path>:
too many levels of symbolic links."
This commit introduces the use of `AtomicWriteFile` to ensure that the
layer metadata files contain correct data when they exist on the file system.
Signed-off-by: Mike <mike.sul@foundries.io>
(cherry picked from commit de2447c2ab)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/golang/net/compare/v0.13.0...v0.17.0
This fixes the same CVE as go1.21.3 and go1.20.10;
- net/http: rapid stream resets can cause excessive work
A malicious HTTP/2 client which rapidly creates requests and
immediately resets them can cause excessive server resource consumption.
While the total number of requests is bounded to the
http2.Server.MaxConcurrentStreams setting, resetting an in-progress
request allows the attacker to create a new request while the existing
one is still executing.
HTTP/2 servers now bound the number of simultaneously executing
handler goroutines to the stream concurrency limit. New requests
arriving when at the limit (which can only happen after the client
has reset an existing, in-flight request) will be queued until a
handler exits. If the request queue grows too large, the server
will terminate the connection.
This issue is also fixed in golang.org/x/net/http2 v0.17.0,
for users manually configuring HTTP/2.
The default stream concurrency limit is 250 streams (requests)
per HTTP/2 connection. This value may be adjusted using the
golang.org/x/net/http2 package; see the Server.MaxConcurrentStreams
setting and the ConfigureServer function.
This is CVE-2023-39325 and Go issue https://go.dev/issue/63417.
This is also tracked by CVE-2023-44487.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1800dd0876)
Signed-off-by: Cory Snider <csnider@mirantis.com>
go1.20.10 (released 2023-10-10) includes a security fix to the net/http package.
See the Go 1.20.10 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.10+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.9...go1.20.10
From the security mailing:
[security] Go 1.21.3 and Go 1.20.10 are released
Hello gophers,
We have just released Go versions 1.21.3 and 1.20.10, minor point releases.
These minor releases include 1 security fixes following the security policy:
- net/http: rapid stream resets can cause excessive work
A malicious HTTP/2 client which rapidly creates requests and
immediately resets them can cause excessive server resource consumption.
While the total number of requests is bounded to the
http2.Server.MaxConcurrentStreams setting, resetting an in-progress
request allows the attacker to create a new request while the existing
one is still executing.
HTTP/2 servers now bound the number of simultaneously executing
handler goroutines to the stream concurrency limit. New requests
arriving when at the limit (which can only happen after the client
has reset an existing, in-flight request) will be queued until a
handler exits. If the request queue grows too large, the server
will terminate the connection.
This issue is also fixed in golang.org/x/net/http2 v0.17.0,
for users manually configuring HTTP/2.
The default stream concurrency limit is 250 streams (requests)
per HTTP/2 connection. This value may be adjusted using the
golang.org/x/net/http2 package; see the Server.MaxConcurrentStreams
setting and the ConfigureServer function.
This is CVE-2023-39325 and Go issue https://go.dev/issue/63417.
This is also tracked by CVE-2023-44487.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.20.9 (released 2023-10-05) includes one security fixes to the cmd/go package,
as well as bug fixes to the go command and the linker. See the Go 1.20.9
milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.9+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.8...go1.20.9
From the security mailing:
[security] Go 1.21.2 and Go 1.20.9 are released
Hello gophers,
We have just released Go versions 1.21.2 and 1.20.9, minor point releases.
These minor releases include 1 security fixes following the security policy:
- cmd/go: line directives allows arbitrary execution during build
"//line" directives can be used to bypass the restrictions on "//go:cgo_"
directives, allowing blocked linker and compiler flags to be passed during
compliation. This can result in unexpected execution of arbitrary code when
running "go build". The line directive requires the absolute path of the file in
which the directive lives, which makes exploting this issue significantly more
complex.
This is CVE-2023-39323 and Go issue https://go.dev/issue/63211.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While this is not strictly necessary as the default OCI config masks this
path, it is possible that the user disabled path masking, passed their
own list, or is using a forked (or future) daemon version that has a
modified default config/allows changing the default config.
Add some defense-in-depth by also masking out this problematic hardware
device with the AppArmor LSM.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
(cherry picked from commit bddd826d7a)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The ability to read these files may offer a power-based sidechannel
attack against any workloads running on the same kernel.
This was originally [CVE-2020-8694][1], which was fixed in
[949dd0104c496fa7c14991a23c03c62e44637e71][2] by restricting read access
to root. However, since many containers run as root, this is not
sufficient for our use case.
While untrusted code should ideally never be run, we can add some
defense in depth here by masking out the device class by default.
[Other mechanisms][3] to access this hardware exist, but they should not
be accessible to a container due to other safeguards in the
kernel/container stack (e.g. capabilities, perf paranoia).
[1]: https://nvd.nist.gov/vuln/detail/CVE-2020-8694
[2]: 949dd0104c
[3]: https://web.eece.maine.edu/~vweaver/projects/rapl/
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
(cherry picked from commit 83cac3c3e3)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
go1.20.8 (released 2023-09-06) includes two security fixes to the html/template
package, as well as bug fixes to the compiler, the go command, the runtime,
and the crypto/tls, go/types, net/http, and path/filepath packages. See the
Go 1.20.8 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.8+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.7...go1.20.8
From the security mailing:
[security] Go 1.21.1 and Go 1.20.8 are released
Hello gophers,
We have just released Go versions 1.21.1 and 1.20.8, minor point releases.
These minor releases include 4 security fixes following the security policy:
- cmd/go: go.mod toolchain directive allows arbitrary execution
The go.mod toolchain directive, introduced in Go 1.21, could be leveraged to
execute scripts and binaries relative to the root of the module when the "go"
command was executed within the module. This applies to modules downloaded using
the "go" command from the module proxy, as well as modules downloaded directly
using VCS software.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-39320 and Go issue https://go.dev/issue/62198.
- html/template: improper handling of HTML-like comments within script contexts
The html/template package did not properly handle HMTL-like "<!--" and "-->"
comment tokens, nor hashbang "#!" comment tokens, in <script> contexts. This may
cause the template parser to improperly interpret the contents of <script>
contexts, causing actions to be improperly escaped. This could be leveraged to
perform an XSS attack.
Thanks to Takeshi Kaneko (GMO Cybersecurity by Ierae, Inc.) for reporting this
issue.
This is CVE-2023-39318 and Go issue https://go.dev/issue/62196.
- html/template: improper handling of special tags within script contexts
The html/template package did not apply the proper rules for handling occurrences
of "<script", "<!--", and "</script" within JS literals in <script> contexts.
This may cause the template parser to improperly consider script contexts to be
terminated early, causing actions to be improperly escaped. This could be
leveraged to perform an XSS attack.
Thanks to Takeshi Kaneko (GMO Cybersecurity by Ierae, Inc.) for reporting this
issue.
This is CVE-2023-39319 and Go issue https://go.dev/issue/62197.
- crypto/tls: panic when processing post-handshake message on QUIC connections
Processing an incomplete post-handshake message for a QUIC connection caused a panic.
Thanks to Marten Seemann for reporting this issue.
This is CVE-2023-39321 and CVE-2023-39322 and Go issue https://go.dev/issue/62266.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit c41121cc48)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On startup all local volumes were unmounted as a cleanup mechanism for
the non-clean exit of the last engine process.
This caused live-restored volumes that used special volume opt mount
flags to be broken. While the refcount was restored, the _data directory
was just unmounted, so all new containers mounting this volume would
just have the access to the empty _data directory instead of the real
volume.
With this patch, the mountpoint isn't unmounted. Instead, if the volume
is already mounted, just mark it as mounted, so the next time Mount is
called only the ref count is incremented, but no second attempt to mount
it is performed.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 2689484402)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure that the content in the live-restored volume mounted in a new
container is the same as the content in the old container.
This checks if volume's _data directory doesn't get unmounted on
startup.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit aef703fa1b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This documentation moved to a different page, and the Go documentation
moved to the https://go.dev/ domain.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 2aabd64477)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.6.21...v1.6.22
- release notes: https://github.com/containerd/containerd/releases/tag/v1.6.22
---
Notable Updates
- RunC: Update runc binary to v1.1.8
- CRI: Fix `additionalGids`: it should fallback to `imageConfig.User`
when `securityContext.RunAsUser`, `RunAsUsername` are empty
- CRI: Write generated CNI config atomically
- Fix concurrent writes for `UpdateContainerStats`
- Make `checkContainerTimestamps` less strict on Windows
- Port-Forward: Correctly handle known errors
- Resolve `docker.NewResolver` race condition
- SecComp: Always allow `name_to_handle_at`
- Adding support to run hcsshim from local clone
- Pinned image support
- Runtime/V2/RunC: Handle early exits w/o big locks
- CRITool: Move up to CRI-TOOLS v1.27.0
- Fix cpu architecture detection issue on emulated ARM platform
- Task: Don't `close()` io before `cancel()`
- Fix panic when remote differ returns empty result
- Plugins: Notify readiness when registered plugins are ready
- Unwrap io errors in server connection receive error handling
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4d674897f3)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I had a CI run fail to "Upload reports":
Exponential backoff for retry #1. Waiting for 4565 milliseconds before continuing the upload at offset 0
Finished backoff for retry #1, continuing with upload
Total file count: 211 ---- Processed file #160 (75.8%)
...
Total file count: 211 ---- Processed file #164 (77.7%)
Total file count: 211 ---- Processed file #164 (77.7%)
Total file count: 211 ---- Processed file #164 (77.7%)
A 503 status code has been received, will attempt to retry the upload
##### Begin Diagnostic HTTP information #####
Status Code: 503
Status Message: Service Unavailable
Header Information: {
"content-length": "592",
"content-type": "application/json; charset=utf-8",
"date": "Mon, 21 Aug 2023 14:08:10 GMT",
"server": "Kestrel",
"cache-control": "no-store,no-cache",
"pragma": "no-cache",
"strict-transport-security": "max-age=2592000",
"x-tfs-processid": "b2fc902c-011a-48be-858d-c62e9c397cb6",
"activityid": "49a48b53-0411-4ff3-86a7-4528e3f71ba2",
"x-tfs-session": "49a48b53-0411-4ff3-86a7-4528e3f71ba2",
"x-vss-e2eid": "49a48b53-0411-4ff3-86a7-4528e3f71ba2",
"x-vss-senderdeploymentid": "63be6134-28d1-8c82-e969-91f4e88fcdec",
"x-frame-options": "SAMEORIGIN"
}
###### End Diagnostic HTTP information ######
Retry limit has been reached for chunk at offset 0 to https://pipelinesghubeus5.actions.githubusercontent.com/Y2huPMnV2RyiTvKoReSyXTCrcRyxUdSDRZYoZr0ONBvpl5e9Nu/_apis/resources/Containers/8331549?itemPath=integration-reports%2Fubuntu-22.04-systemd%2Fbundles%2Ftest-integration%2FTestInfoRegistryMirrors%2Fd20ac12e48cea%2Fdocker.log
Warning: Aborting upload for /tmp/reports/ubuntu-22.04-systemd/bundles/test-integration/TestInfoRegistryMirrors/d20ac12e48cea/docker.log due to failure
Error: aborting artifact upload
Total file count: 211 ---- Processed file #165 (78.1%)
A 503 status code has been received, will attempt to retry the upload
Exponential backoff for retry #1. Waiting for 5799 milliseconds before continuing the upload at offset 0
As a result, the "Download reports" continued retrying:
...
Total file count: 1004 ---- Processed file #436 (43.4%)
Total file count: 1004 ---- Processed file #436 (43.4%)
Total file count: 1004 ---- Processed file #436 (43.4%)
An error occurred while attempting to download a file
Error: Request timeout: /Y2huPMnV2RyiTvKoReSyXTCrcRyxUdSDRZYoZr0ONBvpl5e9Nu/_apis/resources/Containers/8331549?itemPath=integration-reports%2Fubuntu-20.04%2Fbundles%2Ftest-integration%2FTestCreateWithDuplicateNetworkNames%2Fd47798cc212d1%2Fdocker.log
at ClientRequest.<anonymous> (/home/runner/work/_actions/actions/download-artifact/v3/dist/index.js:3681:26)
at Object.onceWrapper (node:events:627:28)
at ClientRequest.emit (node:events:513:28)
at TLSSocket.emitRequestTimeout (node:_http_client:839:9)
at Object.onceWrapper (node:events:627:28)
at TLSSocket.emit (node:events:525:35)
at TLSSocket.Socket._onTimeout (node:net:550:8)
at listOnTimeout (node:internal/timers:559:17)
at processTimers (node:internal/timers:502:7)
Exponential backoff for retry #1. Waiting for 5305 milliseconds before continuing the download
Total file count: 1004 ---- Processed file #436 (43.4%)
And, it looks like GitHub doesn't allow cancelling the job, possibly
because it is defined with `if: always()`?
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d6f340e784)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use default apt mirrors and also check APT_MIRROR
is set before updating mirrors.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit a1d2132bf6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Currently moby drops ep sets before the entrypoint is executed.
This does mean that with combination of no-new-privileges the
file capabilities stops working with non-root containers.
This is undesired as the usability of such containers is harmed
comparing to running root containers.
This commit therefore sets the effective/permitted set in order
to allow use of file capabilities or libcap(3)/prctl(2) respectively
with combination of no-new-privileges and without respectively.
For no-new-privileges the container will be able to obtain capabilities
that are requested.
Signed-off-by: Luboslav Pivarc <lpivarc@redhat.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
(cherry picked from commit 3aef732e61)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Audit the OCI spec options used for Linux containers to ensure they are
less order-dependent. Ensure they don't assume that any pointer fields
are non-nil and that they don't unintentionally clobber mutations to the
spec applied by other options.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 8a094fe609)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Many of the fields in LinuxResources struct are pointers to scalars for
some reason, presumably to differentiate between set-to-zero and unset
when unmarshaling from JSON, despite zero being outside the acceptable
range for the corresponding kernel tunables. When creating the OCI spec
for a container, the daemon sets the container's OCI spec CPUShares and
BlkioWeight parameters to zero when the corresponding Docker container
configuration values are zero, signifying unset, despite the minimum
acceptable value for CPUShares being two, and BlkioWeight ten. This has
gone unnoticed as runC does not distingiush set-to-zero from unset as it
also uses zero internally to represent unset for those fields. However,
kata-containers v3.2.0-alpha.3 tries to apply the explicit-zero resource
parameters to the container, exactly as instructed, and fails loudly.
The OCI runtime-spec is silent on how the runtime should handle the case
when those parameters are explicitly set to out-of-range values and
kata's behaviour is not unreasonable, so the daemon must therefore be in
the wrong.
Translate unset values in the Docker container's resources HostConfig to
omit the corresponding fields in the container's OCI spec when starting
and updating a container in order to maximize compatibility with
runtimes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit dea870f4ea)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Switch to using t.TempDir() instead of rolling our own.
Clean up mounts leaked by the tests as otherwise the tests fail due to
the leaked mounts because unlike the old cleanup code, t.TempDir()
cleanup does not ignore errors from os.RemoveAll.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 9ff169ccf4)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Go 1.15.7 contained a security fix for CVE-2021-3115, which allowed arbitrary
code to be executed at build time when using cgo on Windows.
This issue was not limited to the go command itself, and could also affect binaries
that use `os.Command`, `os.LookPath`, etc.
From the related blogpost (https://blog.golang.org/path-security):
> Are your own programs affected?
>
> If you use exec.LookPath or exec.Command in your own programs, you only need to
> be concerned if you (or your users) run your program in a directory with untrusted
> contents. If so, then a subprocess could be started using an executable from dot
> instead of from a system directory. (Again, using an executable from dot happens
> always on Windows and only with uncommon PATH settings on Unix.)
>
> If you are concerned, then we’ve published the more restricted variant of os/exec
> as golang.org/x/sys/execabs. You can use it in your program by simply replacing
At time of the go1.15 release, the Go team considered changing the behavior of
`os.LookPath()` and `exec.LookPath()` to be a breaking change, and made the
behavior "opt-in" by providing the `golang.org/x/sys/execabs` package as a
replacement.
However, for the go1.19 release, this changed, and the default behavior of
`os.LookPath()` and `exec.LookPath()` was changed. From the release notes:
https://go.dev/doc/go1.19#os-exec-path
> Command and LookPath no longer allow results from a PATH search to be found
> relative to the current directory. This removes a common source of security
> problems but may also break existing programs that depend on using, say,
> exec.Command("prog") to run a binary named prog (or, on Windows, prog.exe)
> in the current directory. See the os/exec package documentation for information
> about how best to update such programs.
>
> On Windows, Command and LookPath now respect the NoDefaultCurrentDirectoryInExePath
> environment variable, making it possible to disable the default implicit search
> of “.” in PATH lookups on Windows systems.
A result of this change was that registering the daemon as a Windows service
no longer worked when done from within the directory of the binary itself:
C:\> cd "Program Files\Docker\Docker\resources"
C:\Program Files\Docker\Docker\resources> dockerd --register-service
exec: "dockerd": cannot run executable found relative to current directory
Note that using an absolute path would work around the issue:
C:\Program Files\Docker\Docker>resources\dockerd.exe --register-service
This patch changes `registerService()` to use `os.Executable()`, instead of
depending on `os.Args[0]` and `exec.LookPath()` for resolving the absolute
path of the binary.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 3e8fda0a70)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If TEST_INTEGRATION_FAIL_FAST is not set, run the integration-cli tests
even if integration tests failed.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 6841a53d17)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Includes a fix for CVE-2023-29409
go1.20.7 (released 2023-08-01) includes a security fix to the crypto/tls
package, as well as bug fixes to the assembler and the compiler. See the
Go 1.20.7 milestone on our issue tracker for details:
- https://github.com/golang/go/issues?q=milestone%3AGo1.20.7+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.20.6...go1.20.7
From the mailing list announcement:
[security] Go 1.20.7 and Go 1.19.12 are released
Hello gophers,
We have just released Go versions 1.20.7 and 1.19.12, minor point releases.
These minor releases include 1 security fixes following the security policy:
- crypto/tls: restrict RSA keys in certificates to <= 8192 bits
Extremely large RSA keys in certificate chains can cause a client/server
to expend significant CPU time verifying signatures. Limit this by
restricting the size of RSA keys transmitted during handshakes to <=
8192 bits.
Based on a survey of publicly trusted RSA keys, there are currently only
three certificates in circulation with keys larger than this, and all
three appear to be test certificates that are not actively deployed. It
is possible there are larger keys in use in private PKIs, but we target
the web PKI, so causing breakage here in the interests of increasing the
default safety of users of crypto/tls seems reasonable.
Thanks to Mateusz Poliwczak for reporting this issue.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.20.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d5cb7cdeae)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also fixes up some cleanup issues.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1a51898d2e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I noticed this was always being skipped because of race conditions
checking the logs.
This change adds a log scanner which will look through the logs line by
line rather than allocating a big buffer.
Additionally it adds a `poll.Check` which we can use to actually wait
for the desired log entry.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 476e788090)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Allows tests to report their proxy settings for easier troubleshooting
on failures.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 8197752d68)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
1. On failed start tail the daemon logs
2. Exposes generic tailing functions to make test debugging simpler
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 914888cf8b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- go.mod: update dependencies and go version by
- Use Go1.20
- Fix couple of typos
- Added `WithStdout` and `WithStderr` helpers
- Moved `cmdOperators` handling from `RunCmd` to `StartCmd`
- Deprecate `assert.ErrorType`
- Remove outdated Dockerfile
- add godoc links
full diff: https://github.com/gotestyourself/gotest.tools/compare/v3.4.0...v3.5.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit ce053a14aa)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- full diff: https://github.com/containerd/containerd/compare/v1.6.21...v1.6.22
- release notes: https://github.com/containerd/containerd/releases/tag/v1.6.22
---
Notable Updates
- RunC: Update runc binary to v1.1.8
- CRI: Fix `additionalGids`: it should fallback to `imageConfig.User`
when `securityContext.RunAsUser`, `RunAsUsername` are empty
- CRI: Write generated CNI config atomically
- Fix concurrent writes for `UpdateContainerStats`
- Make `checkContainerTimestamps` less strict on Windows
- Port-Forward: Correctly handle known errors
- Resolve `docker.NewResolver` race condition
- SecComp: Always allow `name_to_handle_at`
- Adding support to run hcsshim from local clone
- Pinned image support
- Runtime/V2/RunC: Handle early exits w/o big locks
- CRITool: Move up to CRI-TOOLS v1.27.0
- Fix cpu architecture detection issue on emulated ARM platform
- Task: Don't `close()` io before `cancel()`
- Fix panic when remote differ returns empty result
- Plugins: Notify readiness when registered plugins are ready
- Unwrap io errors in server connection receive error handling
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this change, integration test would fail fast and not execute all
test suites when one suite fails.
Change this behavior into opt-in enabled by TEST_INTEGRATION_FAIL_FAST
variable.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 48cc28e4ef)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Upstart has been EOL for 8 years and isn't used by any distributions we support any more.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
(cherry picked from commit 0d8087fbbc)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Upstart has been EOL for 8 years and isn't used by any distributions we support any more.
Additionally, this removes the "cgroups v1" setup code because it's more reasonable now for us to expect something _else_ to have set up cgroups appropriately (especially cgroups v2).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
(cherry picked from commit ae737656f9)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.8
full diff: https://github.com/opencontainers/runc/compare/v1.1.7...v1.1.9
This is the eighth patch release of the 1.1.z release branch of runc.
The most notable change is the addition of RISC-V support, along with a
few bug fixes.
- Support riscv64.
- init: do not print environment variable value.
- libct: fix a race with systemd removal.
- tests/int: increase num retries for oom tests.
- man/runc: fixes.
- Fix tmpfs mode opts when dir already exists.
- docs/systemd: fix a broken link.
- ci/cirrus: enable some rootless tests on cs9.
- runc delete: call systemd's reset-failed.
- libct/cg/sd/v1: do not update non-frozen cgroup after frozen failed.
- CI: bump Fedora, Vagrant, bats.
- .codespellrc: update for 2.2.5.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit df86d855f5)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Post-f8c0d92a22bad004cb9cbb4db704495527521c42, BUILDKIT_REPO doesn't
really do what it claims to. Instead, don't allow overloading since the
import path for BuildKit is always the same, and make clear the
provenance of values when generating the final variable definitions.
We also better document the script, and follow some best practices for
both POSIX sh and Bash.
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
(cherry picked from commit 4ecc01f3ad)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
gotest.tools has an init() which registers a '-update' flag;
a80f057529/internal/source/update.go (L21-L23)
The quota helper contains a testhelpers file, which is meant for usage
in (integration) tests, but as it's in the same pacakge as production
code, would also trigger the gotest.tools init.
This patch removes the gotest.tools code from this file.
Before this patch:
$ (exec -a libnetwork-setkey "$(which dockerd)" -help)
Usage of libnetwork-setkey:
-exec-root string
docker exec root (default "/run/docker")
-update
update golden values
With this patch applied:
$ (exec -a libnetwork-setkey "$(which dockerd)" -help)
Usage of libnetwork-setkey:
-exec-root string
docker exec root (default "/run/docker")
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1aa17222e7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.20.6 (released 2023-07-11) includes a security fix to the net/http package,
as well as bug fixes to the compiler, cgo, the cover tool, the go command,
the runtime, and the crypto/ecdsa, go/build, go/printer, net/mail, and text/template
packages. See the Go 1.20.6 milestone on our issue tracker for details.
https://github.com/golang/go/issues?q=milestone%3AGo1.20.6+label%3ACherryPickApproved
Full diff: https://github.com/golang/go/compare/go1.20.5...go1.20.6
These minor releases include 1 security fixes following the security policy:
net/http: insufficient sanitization of Host header
The HTTP/1 client did not fully validate the contents of the Host header.
A maliciously crafted Host header could inject additional headers or entire
requests. The HTTP/1 client now refuses to send requests containing an
invalid Request.Host or Request.URL.Host value.
Thanks to Bartek Nowotarski for reporting this issue.
Includes security fixes for [CVE-2023-29406 ][1] and Go issue https://go.dev/issue/60374
[1]: https://github.com/advisories/GHSA-f8f7-69v5-w4vx
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1ead2dd35d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For local communications (npipe://, unix://), the hostname is not used,
but we need valid and meaningful hostname.
The current code used the socket path as hostname, which gets rejected by
go1.20.6 and go1.19.11 because of a security fix for [CVE-2023-29406 ][1],
which was implemented in https://go.dev/issue/60374.
Prior versions go Go would clean the host header, and strip slashes in the
process, but go1.20.6 and go1.19.11 no longer do, and reject the host
header.
Before this patch, tests would fail on go1.20.6:
=== FAIL: pkg/authorization TestAuthZRequestPlugin (15.01s)
time="2023-07-12T12:53:45Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 1s"
time="2023-07-12T12:53:46Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 2s"
time="2023-07-12T12:53:48Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 4s"
time="2023-07-12T12:53:52Z" level=warning msg="Unable to connect to plugin: //tmp/authz2422457390/authz-test-plugin.sock/AuthZPlugin.AuthZReq: Post \"http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq\": http: invalid Host header, retrying in 8s"
authz_unix_test.go:82: Failed to authorize request Post "http://%2F%2Ftmp%2Fauthz2422457390%2Fauthz-test-plugin.sock/AuthZPlugin.AuthZReq": http: invalid Host header
[1]: https://github.com/advisories/GHSA-f8f7-69v5-w4vx
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6b7705d5b2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For local communications (npipe://, unix://), the hostname is not used,
but we need valid and meaningful hostname.
The current code used the client's `addr` as hostname in some cases, which
could contain the path for the unix-socket (`/var/run/docker.sock`), which
gets rejected by go1.20.6 and go1.19.11 because of a security fix for
[CVE-2023-29406 ][1], which was implemented in https://go.dev/issue/60374.
Prior versions go Go would clean the host header, and strip slashes in the
process, but go1.20.6 and go1.19.11 no longer do, and reject the host
header.
This patch introduces a `DummyHost` const, and uses this dummy host for
cases where we don't need an actual hostname.
Before this patch (using go1.20.6):
make GO_VERSION=1.20.6 TEST_FILTER=TestAttach test-integration
=== RUN TestAttachWithTTY
attach_test.go:46: assertion failed: error is not nil: http: invalid Host header
--- FAIL: TestAttachWithTTY (0.11s)
=== RUN TestAttachWithoutTTy
attach_test.go:46: assertion failed: error is not nil: http: invalid Host header
--- FAIL: TestAttachWithoutTTy (0.02s)
FAIL
With this patch applied:
make GO_VERSION=1.20.6 TEST_FILTER=TestAttach test-integration
INFO: Testing against a local daemon
=== RUN TestAttachWithTTY
--- PASS: TestAttachWithTTY (0.12s)
=== RUN TestAttachWithoutTTy
--- PASS: TestAttachWithoutTTy (0.02s)
PASS
[1]: https://github.com/advisories/GHSA-f8f7-69v5-w4vx
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 92975f0c11)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Calling function returned from setupTest (which calls testEnv.Clean) in
a defer block inside a test that spawns parallel subtests caused the
cleanup function to be called before any of the subtest did anything.
Change the defer expressions to use `t.Cleanup` instead to call it only
after all subtests have also finished.
This only changes tests which have parallel subtests.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit f9e2eed55d)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The daemon.lazyInitializeVolume() function only handles restoring Volumes
if a Driver is specified. The Container's MountPoints field may also
contain other kind of mounts (e.g., bind-mounts). Those were ignored, and
don't return an error; 1d9c8619cd/daemon/volumes.go (L243-L252C2)
However, the prepareMountPoints() assumed each MountPoint was a volume,
and logged an informational message about the volume being restored;
1d9c8619cd/daemon/mounts.go (L18-L25)
This would panic if the MountPoint was not a volume;
github.com/docker/docker/daemon.(*Daemon).prepareMountPoints(0xc00054b7b8?, 0xc0007c2500)
/root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/mounts.go:24 +0x1c0
github.com/docker/docker/daemon.(*Daemon).restore.func5(0xc0007c2500, 0x0?)
/root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/daemon.go:552 +0x271
created by github.com/docker/docker/daemon.(*Daemon).restore
/root/rpmbuild/BUILD/src/engine/.gopath/src/github.com/docker/docker/daemon/daemon.go:530 +0x8d8
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x564e9be4c7c0]
This issue was introduced in 647c2a6cdd
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a490248f4d)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Multiple daemons starting/running concurrently can collide with each
other when editing iptables rules. Most integration tests which opt into
parallelism and start daemons work around this problem by starting the
daemon with the --iptables=false option. However, some of the tests
neglect to pass the option when starting or restarting the daemon,
resulting in those tests being flaky.
Audit the integration tests which call t.Parallel() and (*Daemon).Stop()
and add --iptables=false arguments where needed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit cdcb7c28c5)
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestClientWithRequestTimeout has been observed to flake in CI. The
timing in the test is quite tight, only giving the client a 10ms window
to time out, which could potentially be missed if the host is under
load and the goroutine scheduling is unlucky. Give the client a full
five seconds of grace to time out before failing the test.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 9cee34bc94)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Linux 6.2 and up (commit [f1f1f2569901ec5b9d425f2e91c09a0e320768f3][1])
provides a fast path for the number of open files for the process.
From the [Linux docs][2]:
> The number of open files for the process is stored in 'size' member of
> `stat()` output for /proc/<pid>/fd for fast access.
[1]: f1f1f25699
[2]: https://docs.kernel.org/filesystems/proc.html#proc-pid-fd-list-of-symlinks-to-open-files
This patch adds a fast-path for Kernels that support this, and falls back
to the slow path if the Size fields is zero.
Comparing on a Fedora 38 (kernel 6.2.9-300.fc38.x86_64):
Before/After:
go test -bench ^BenchmarkGetTotalUsedFds$ -run ^$ ./pkg/fileutils/
BenchmarkGetTotalUsedFds 57264 18595 ns/op 408 B/op 10 allocs/op
BenchmarkGetTotalUsedFds 370392 3271 ns/op 40 B/op 3 allocs/op
Note that the slow path has 1 more file-descriptor, due to the open
file-handle for /proc/<pid>/fd during the calculation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit ec79d0fc05)
Resolved conflicts:
pkg/fileutils/fileutils_linux.go
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Use File.Readdirnames instead of os.ReadDir, as we're only interested in
the number of files, and results don't have to be sorted.
Before:
BenchmarkGetTotalUsedFds-5 149272 7896 ns/op 945 B/op 20 allocs/op
After:
BenchmarkGetTotalUsedFds-5 153517 7644 ns/op 408 B/op 10 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit eaa9494b71)
Resolved conflicts:
pkg/fileutils/fileutils_linux.go
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Commit 8d56108ffb moved this function from
the generic (no build-tags) fileutils.go to a unix file, adding "freebsd"
to the build-tags.
This likely was a wrong assumption (as other files had freebsd build-tags).
FreeBSD's procfs does not mention `/proc/<pid>/fd` in the manpage, and
we don't test FreeBSD in CI, so let's drop it, and make this a Linux-only
file.
While updating also dropping the import-tag, as we're planning to move
this file internal to the daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 252e94f499)
Resolved conflicts:
pkg/fileutils/fileutils_linux.go
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
CI failed sometimes if no daemon.json was present:
Run sudo rm /etc/docker/daemon.json
sudo rm /etc/docker/daemon.json
sudo service docker restart
docker version
docker info
shell: /usr/bin/bash -e {0}
env:
DESTDIR: ./build
BUILDKIT_REPO: moby/buildkit
BUILDKIT_TEST_DISABLE_FEATURES: cache_backend_azblob,cache_backend_s3,merge_diff
BUILDKIT_REF: 798ad6b0ce9f2fe86dfb2b0277e6770d0b545871
rm: cannot remove '/etc/docker/daemon.json': No such file or directory
Error: Process completed with exit code 1.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 264dbad43a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When live-restoring a container the volume driver needs be notified that
there is an active mount for the volume.
Before this change the count is zero until the container stops and the
uint64 overflows pretty much making it so the volume can never be
removed until another daemon restart.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 647c2a6cdd)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I think this may be missing a sudo (as all other operations do use
sudo to access daemon.json);
Run if [ ! -e /etc/docker/daemon.json ]; then
if [ ! -e /etc/docker/daemon.json ]; then
echo '{}' | tee /etc/docker/daemon.json >/dev/null
fi
DOCKERD_CONFIG=$(jq '.+{"experimental":true,"live-restore":true,"ipv6":true,"fixed-cidr-v6":"2001:db8:1::/64"}' /etc/docker/daemon.json)
sudo tee /etc/docker/daemon.json <<<"$DOCKERD_CONFIG" >/dev/null
sudo service docker restart
shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
env:
GO_VERSION: 1.20.5
GOTESTLIST_VERSION: v0.3.1
TESTSTAT_VERSION: v0.1.3
ITG_CLI_MATRIX_SIZE: 6
DOCKER_EXPERIMENTAL: 1
DOCKER_GRAPHDRIVER: overlay2
tee: /etc/docker/daemon.json: Permission denied
Error: Process completed with exit code 1.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d8bc5828cd)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Added code to correctly retrieve and convert the Topology from the gRPC
Swarm Node.
Signed-off-by: Drew Erny <derny@mirantis.com>
(cherry picked from commit cdb1293eea)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.20.5 (released 2023-06-06) includes four security fixes to the cmd/go and
runtime packages, as well as bug fixes to the compiler, the go command, the
runtime, and the crypto/rsa, net, and os packages. See the Go 1.20.5 milestone
on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.5+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.4...go1.20.5
These minor releases include 3 security fixes following the security policy:
- cmd/go: cgo code injection
The go command may generate unexpected code at build time when using cgo. This
may result in unexpected behavior when running a go program which uses cgo.
This may occur when running an untrusted module which contains directories with
newline characters in their names. Modules which are retrieved using the go command,
i.e. via "go get", are not affected (modules retrieved using GOPATH-mode, i.e.
GO111MODULE=off, may be affected).
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-29402 and Go issue https://go.dev/issue/60167.
- runtime: unexpected behavior of setuid/setgid binaries
The Go runtime didn't act any differently when a binary had the setuid/setgid
bit set. On Unix platforms, if a setuid/setgid binary was executed with standard
I/O file descriptors closed, opening any files could result in unexpected
content being read/written with elevated prilieges. Similarly if a setuid/setgid
program was terminated, either via panic or signal, it could leak the contents
of its registers.
Thanks to Vincent Dehors from Synacktiv for reporting this issue.
This is CVE-2023-29403 and Go issue https://go.dev/issue/60272.
- cmd/go: improper sanitization of LDFLAGS
The go command may execute arbitrary code at build time when using cgo. This may
occur when running "go get" on a malicious module, or when running any other
command which builds untrusted code. This is can by triggered by linker flags,
specified via a "#cgo LDFLAGS" directive.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-29404 and CVE-2023-29405 and Go issues https://go.dev/issue/60305 and https://go.dev/issue/60306.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 98a44bb18e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 24c882c3e0)
Signed-off-by: Cory Snider <csnider@mirantis.com>
go1.20.4 (released 2023-05-02) includes three security fixes to the html/template
package, as well as bug fixes to the compiler, the runtime, and the crypto/subtle,
crypto/tls, net/http, and syscall packages. See the Go 1.20.4 milestone on our
issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.4+label%3ACherryPickApproved
release notes: https://go.dev/doc/devel/release#go1.20.4
full diff: https://github.com/golang/go/compare/go1.20.3...go1.20.4
from the announcement:
> These minor releases include 3 security fixes following the security policy:
>
> - html/template: improper sanitization of CSS values
>
> Angle brackets (`<>`) were not considered dangerous characters when inserted
> into CSS contexts. Templates containing multiple actions separated by a '/'
> character could result in unexpectedly closing the CSS context and allowing
> for injection of unexpected HMTL, if executed with untrusted input.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-24539 and Go issue https://go.dev/issue/59720.
>
> - html/template: improper handling of JavaScript whitespace
>
> Not all valid JavaScript whitespace characters were considered to be
> whitespace. Templates containing whitespace characters outside of the character
> set "\t\n\f\r\u0020\u2028\u2029" in JavaScript contexts that also contain
> actions may not be properly sanitized during execution.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-24540 and Go issue https://go.dev/issue/59721.
>
> - html/template: improper handling of empty HTML attributes
>
> Templates containing actions in unquoted HTML attributes (e.g. "attr={{.}}")
> executed with empty input could result in output that would have unexpected
> results when parsed due to HTML normalization rules. This may allow injection
> of arbitrary attributes into tags.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-29400 and Go issue https://go.dev/issue/59722.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit b7e8868235)
Signed-off-by: Cory Snider <csnider@mirantis.com>
go1.20.3 (released 2023-04-04) includes security fixes to the go/parser,
html/template, mime/multipart, net/http, and net/textproto packages, as well
as bug fixes to the compiler, the linker, the runtime, and the time package.
See the Go 1.20.3 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.3+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.2...go1.20.3
Further details from the announcement on the mailing list:
We have just released Go versions 1.20.3 and 1.19.8, minor point releases.
These minor releases include 4 security fixes following the security policy:
- go/parser: infinite loop in parsing
Calling any of the Parse functions on Go source code which contains `//line`
directives with very large line numbers can cause an infinite loop due to
integer overflow.
Thanks to Philippe Antoine (Catena cyber) for reporting this issue.
This is CVE-2023-24537 and Go issue https://go.dev/issue/59180.
- html/template: backticks not treated as string delimiters
Templates did not properly consider backticks (`) as Javascript string
delimiters, and as such did not escape them as expected. Backticks are
used, since ES6, for JS template literals. If a template contained a Go
template action within a Javascript template literal, the contents of the
action could be used to terminate the literal, injecting arbitrary Javascript
code into the Go template.
As ES6 template literals are rather complex, and themselves can do string
interpolation, we've decided to simply disallow Go template actions from being
used inside of them (e.g. "var a = {{.}}"), since there is no obviously safe
way to allow this behavior. This takes the same approach as
github.com/google/safehtml. Template.Parse will now return an Error when it
encounters templates like this, with a currently unexported ErrorCode with a
value of 12. This ErrorCode will be exported in the next major release.
Users who rely on this behavior can re-enable it using the GODEBUG flag
jstmpllitinterp=1, with the caveat that backticks will now be escaped. This
should be used with caution.
Thanks to Sohom Datta, Manipal Institute of Technology, for reporting this issue.
This is CVE-2023-24538 and Go issue https://go.dev/issue/59234.
- net/http, net/textproto: denial of service from excessive memory allocation
HTTP and MIME header parsing could allocate large amounts of memory, even when
parsing small inputs.
Certain unusual patterns of input data could cause the common function used to
parse HTTP and MIME headers to allocate substantially more memory than
required to hold the parsed headers. An attacker can exploit this behavior to
cause an HTTP server to allocate large amounts of memory from a small request,
potentially leading to memory exhaustion and a denial of service.
Header parsing now correctly allocates only the memory required to hold parsed
headers.
Thanks to Jakob Ackermann (@das7pad) for discovering this issue.
This is CVE-2023-24534 and Go issue https://go.dev/issue/58975.
- net/http, net/textproto, mime/multipart: denial of service from excessive resource consumption
Multipart form parsing can consume large amounts of CPU and memory when
processing form inputs containing very large numbers of parts. This stems from
several causes:
mime/multipart.Reader.ReadForm limits the total memory a parsed multipart form
can consume. ReadForm could undercount the amount of memory consumed, leading
it to accept larger inputs than intended. Limiting total memory does not
account for increased pressure on the garbage collector from large numbers of
small allocations in forms with many parts. ReadForm could allocate a large
number of short-lived buffers, further increasing pressure on the garbage
collector. The combination of these factors can permit an attacker to cause an
program that parses multipart forms to consume large amounts of CPU and
memory, potentially resulting in a denial of service. This affects programs
that use mime/multipart.Reader.ReadForm, as well as form parsing in the
net/http package with the Request methods FormFile, FormValue,
ParseMultipartForm, and PostFormValue.
ReadForm now does a better job of estimating the memory consumption of parsed
forms, and performs many fewer short-lived allocations.
In addition, mime/multipart.Reader now imposes the following limits on the
size of parsed forms:
Forms parsed with ReadForm may contain no more than 1000 parts. This limit may
be adjusted with the environment variable GODEBUG=multipartmaxparts=. Form
parts parsed with NextPart and NextRawPart may contain no more than 10,000
header fields. In addition, forms parsed with ReadForm may contain no more
than 10,000 header fields across all parts. This limit may be adjusted with
the environment variable GODEBUG=multipartmaxheaders=.
Thanks to Jakob Ackermann for discovering this issue.
This is CVE-2023-24536 and Go issue https://go.dev/issue/59153.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit f6cc8e3512)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Includes a security fix for crypto/elliptic (CVE-2023-24532).
> go1.20.2 (released 2023-03-07) includes a security fix to the crypto/elliptic package,
> as well as bug fixes to the compiler, the covdata command, the linker, the runtime, and
> the crypto/ecdh, crypto/rsa, crypto/x509, os, and syscall packages.
> See the Go 1.20.2 milestone on our issue tracker for details.
https://go.dev/doc/devel/release#go1.20.minor
From the announcement:
> We have just released Go versions 1.20.2 and 1.19.7, minor point releases.
>
> These minor releases include 1 security fixes following the security policy:
>
> - crypto/elliptic: incorrect P-256 ScalarMult and ScalarBaseMult results
>
> The ScalarMult and ScalarBaseMult methods of the P256 Curve may return an
> incorrect result if called with some specific unreduced scalars (a scalar larger
> than the order of the curve).
>
> This does not impact usages of crypto/ecdsa or crypto/ecdh.
>
> This is CVE-2023-24532 and Go issue https://go.dev/issue/58647.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 02dec48bab)
Signed-off-by: Cory Snider <csnider@mirantis.com>
We missed a case when parsing extra hosts from the dockerfile
frontend so the build fails.
To handle this case we need to set a dedicated worker label
that contains the host gateway IP so clients like Buildx
can just set the proper host:ip when parsing extra hosts
that contain the special string "host-gateway".
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit 21e50b89c9)
daemon.generateNewName() already reserves the generated name, but its name
did not indicate it did. The daemon.registerName() assumed that the generated
name still had to be reserved, which could mean it would try to reserve the
same name again.
This patch renames daemon.generateNewName to daemon.generateAndReserveName
to make it clearer what it does, and updates registerName() to return early
if it successfully generated (and registered) the container name.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 3ba67ee214)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 90de570cfa passed through the request
context to daemon.ContainerStop(). As a result, cancelling the context would
cancel the "graceful" stop of the container, and would proceed with forcefully
killing the container.
This patch partially reverts the changes from 90de570cfa
and breaks the context to prevent cancelling the context from cancelling the stop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit fc94ed0a86)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The official Python images on Docker Hub switched to debian bookworm,
which is now the current stable version of Debian.
However, the location of the apt repository config file changed, which
causes the Dockerfile build to fail;
Loaded image: emptyfs:latest
Loaded image ID: sha256:0df1207206e5288f4a989a2f13d1f5b3c4e70467702c1d5d21dfc9f002b7bd43
INFO: Building docker-sdk-python3:5.0.3...
tests/Dockerfile:6
--------------------
5 | ARG APT_MIRROR
6 | >>> RUN sed -ri "s/(httpredir|deb).debian.org/${APT_MIRROR:-deb.debian.org}/g" /etc/apt/sources.list \
7 | >>> && sed -ri "s/(security).debian.org/${APT_MIRROR:-security.debian.org}/g" /etc/apt/sources.list
8 |
--------------------
ERROR: failed to solve: process "/bin/sh -c sed -ri \"s/(httpredir|deb).debian.org/${APT_MIRROR:-deb.debian.org}/g\" /etc/apt/sources.list && sed -ri \"s/(security).debian.org/${APT_MIRROR:-security.debian.org}/g\" /etc/apt/sources.list" did not complete successfully: exit code: 2
This needs to be fixed in docker-py, but in the meantime, we can pin to
the bullseye variant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 19d860fa9d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.19.10 (released 2023-06-06) includes four security fixes to the cmd/go and
runtime packages, as well as bug fixes to the compiler, the go command, and the
runtime. See the Go 1.19.10 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.19.10+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.19.9...go1.19.10
These minor releases include 3 security fixes following the security policy:
- cmd/go: cgo code injection
The go command may generate unexpected code at build time when using cgo. This
may result in unexpected behavior when running a go program which uses cgo.
This may occur when running an untrusted module which contains directories with
newline characters in their names. Modules which are retrieved using the go command,
i.e. via "go get", are not affected (modules retrieved using GOPATH-mode, i.e.
GO111MODULE=off, may be affected).
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-29402 and Go issue https://go.dev/issue/60167.
- runtime: unexpected behavior of setuid/setgid binaries
The Go runtime didn't act any differently when a binary had the setuid/setgid
bit set. On Unix platforms, if a setuid/setgid binary was executed with standard
I/O file descriptors closed, opening any files could result in unexpected
content being read/written with elevated prilieges. Similarly if a setuid/setgid
program was terminated, either via panic or signal, it could leak the contents
of its registers.
Thanks to Vincent Dehors from Synacktiv for reporting this issue.
This is CVE-2023-29403 and Go issue https://go.dev/issue/60272.
- cmd/go: improper sanitization of LDFLAGS
The go command may execute arbitrary code at build time when using cgo. This may
occur when running "go get" on a malicious module, or when running any other
command which builds untrusted code. This is can by triggered by linker flags,
specified via a "#cgo LDFLAGS" directive.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2023-29404 and CVE-2023-29405 and Go issues https://go.dev/issue/60305 and https://go.dev/issue/60306.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While the VXLAN interface and the iptables rules to mark outgoing VXLAN
packets for encryption are configured to use the Swarm data path port,
the XFRM policies for actually applying the encryption are hardcoded to
match packets with destination port 4789/udp. Consequently, encrypted
overlay networks do not pass traffic when the Swarm is configured with
any other data path port: encryption is not applied to the outgoing
VXLAN packets and the destination host drops the received cleartext
packets. Use the configured data path port instead of hardcoding port
4789 in the XFRM policies.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 9a692a3802)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Starting with go1.19, the Go runtime on Windows now supports the `netgo` build-
flag to use a native Go DNS resolver. Prior to that version, the build-flag
only had an effect on non-Windows platforms. When using the `netgo` build-flag,
the Windows's host resolver is not used, and as a result, custom entries in
`etc/hosts` are ignored, which is a change in behavior from binaries compiled
with older versions of the Go runtime.
From the go1.19 release notes: https://go.dev/doc/go1.19#net
> Resolver.PreferGo is now implemented on Windows and Plan 9. It previously
> only worked on Unix platforms. Combined with Dialer.Resolver and Resolver.Dial,
> it's now possible to write portable programs and be in control of all DNS name
> lookups when dialing.
>
> The net package now has initial support for the netgo build tag on Windows.
> When used, the package uses the Go DNS client (as used by Resolver.PreferGo)
> instead of asking Windows for DNS results. The upstream DNS server it discovers
> from Windows may not yet be correct with complex system network configurations,
> however.
Our Windows binaries are compiled with the "static" (`make/binary-daemon`)
script, which has the `netgo` option set by default. This patch unsets the
`netgo` option when cross-compiling for Windows.
Co-authored-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
(cherry picked from commit 53d1b12bc0)
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
The error returned by DecodeConfig was changed in
b6d58d749c and caused this to regress.
Allow empty request bodies for this endpoint once again.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 967c7bc5d3)
Signed-off-by: Cory Snider <csnider@mirantis.com>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.7
full diff: https://github.com/opencontainers/runc/compare/v1.1.6...v1.1.7
This is the seventh patch release in the 1.1.z release of runc, and is
the last planned release of the 1.1.z series. It contains a fix for
cgroup device rules with systemd when handling device rules for devices
that don't exist (though for devices whose drivers don't correctly
register themselves in the kernel -- such as the NVIDIA devices -- the
full fix only works with systemd v240+).
- When used with systemd v240+, systemd cgroup drivers no longer skip
DeviceAllow rules if the device does not exist (a regression introduced
in runc 1.1.3). This fix also reverts the workaround added in runc 1.1.5,
removing an extra warning emitted by runc run/start.
- The source code now has a new file, runc.keyring, which contains the keys
used to sign runc releases.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 2d0e899819)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.6
full diff: https://github.com/opencontainers/runc/compare/v1.1.5...v1.1.6
This is the sixth patch release in the 1.1.z series of runc, which fixes
a series of cgroup-related issues.
Note that this release can no longer be built from sources using Go
1.16. Using a latest maintained Go 1.20.x or Go 1.19.x release is
recommended. Go 1.17 can still be used.
- systemd cgroup v1 and v2 drivers were deliberately ignoring UnitExist error
from systemd while trying to create a systemd unit, which in some scenarios
may result in a container not being added to the proper systemd unit and
cgroup.
- systemd cgroup v2 driver was incorrectly translating cpuset range from spec's
resources.cpu.cpus to systemd unit property (AllowedCPUs) in case of more
than 8 CPUs, resulting in the wrong AllowedCPUs setting.
- systemd cgroup v1 driver was prefixing container's cgroup path with the path
of PID 1 cgroup, resulting in inability to place PID 1 in a non-root cgroup.
- runc run/start may return "permission denied" error when starting a rootless
container when the file to be executed does not have executable bit set for
the user, not taking the CAP_DAC_OVERRIDE capability into account. This is
a regression in runc 1.1.4, as well as in Go 1.20 and 1.20.1
- cgroup v1 drivers are now aware of misc controller.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d0efca893b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When running hack/vendor.sh, I noticed this file was added to vendor.
I suspect this should've been part of 0233029d5a,
but the vendor check doesn't appear to be catching this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 3f09316e3b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Treat copying extended attributes from a source filesystem which does
not support extended attributes as a no-op, same as if the file did not
possess the extended attribute. Only fail copying extended attributes if
the source file has the attribute and the destination filesystem does
not support xattrs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 2b6761fd3e)
Signed-off-by: Cory Snider <csnider@mirantis.com>
go1.19.9 (released 2023-05-02) includes three security fixes to the html/template
package, as well as bug fixes to the compiler, the runtime, and the crypto/tls
and syscall packages. See the Go 1.19.9 milestone on our issue tracker for details.
https://github.com/golang/go/issues?q=milestone%3AGo1.19.9+label%3ACherryPickApproved
release notes: https://go.dev/doc/devel/release#go1.19.9
full diff: https://github.com/golang/go/compare/go1.19.8...go1.19.9
from the announcement:
> These minor releases include 3 security fixes following the security policy:
>
>- html/template: improper sanitization of CSS values
>
> Angle brackets (`<>`) were not considered dangerous characters when inserted
> into CSS contexts. Templates containing multiple actions separated by a '/'
> character could result in unexpectedly closing the CSS context and allowing
> for injection of unexpected HMTL, if executed with untrusted input.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-24539 and Go issue https://go.dev/issue/59720.
>
> - html/template: improper handling of JavaScript whitespace
>
> Not all valid JavaScript whitespace characters were considered to be
> whitespace. Templates containing whitespace characters outside of the character
> set "\t\n\f\r\u0020\u2028\u2029" in JavaScript contexts that also contain
> actions may not be properly sanitized during execution.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-24540 and Go issue https://go.dev/issue/59721.
>
> - html/template: improper handling of empty HTML attributes
>
> Templates containing actions in unquoted HTML attributes (e.g. "attr={{.}}")
> executed with empty input could result in output that would have unexpected
> results when parsed due to HTML normalization rules. This may allow injection
> of arbitrary attributes into tags.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-29400 and Go issue https://go.dev/issue/59722.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The image store sends events when a new image is created/tagged, using
it instead of the reference store makes sure we send the "tag" event
when a new image is built using buildx.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
no changes in vendored code, just keeping scanners happy :)
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.5
diff: https://github.com/opencontainers/runc/compare/v1.1.4...v1.1.5
This is the fifth patch release in the 1.1.z series of runc, which fixes
three CVEs found in runc.
* CVE-2023-25809 is a vulnerability involving rootless containers where
(under specific configurations), the container would have write access
to the /sys/fs/cgroup/user.slice/... cgroup hierarchy. No other
hierarchies on the host were affected. This vulnerability was
discovered by Akihiro Suda.
<https://github.com/opencontainers/runc/security/advisories/GHSA-m8cg-xc2p-r3fc>
* CVE-2023-27561 was a regression which effectively re-introduced
CVE-2019-19921. This bug was present from v1.0.0-rc95 to v1.1.4. This
regression was discovered by @Beuc.
<https://github.com/advisories/GHSA-vpvm-3wq2-2wvm>
* CVE-2023-28642 is a variant of CVE-2023-27561 and was fixed by the same
patch. This variant of the above vulnerability was reported by Lei
Wang.
<https://github.com/opencontainers/runc/security/advisories/GHSA-g2j6-57v7-gm8c>
In addition, the following other fixes are included in this release:
* Fix the inability to use `/dev/null` when inside a container.
* Fix changing the ownership of host's `/dev/null` caused by fd redirection
(a regression in 1.1.1).
* Fix rare runc exec/enter unshare error on older kernels, including
CentOS < 7.7.
* nsexec: Check for errors in `write_log()`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a17029ba49)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since cc19eba (backported to v23.0.4), the PreferredPool for docker0 is
set only when the user provides the bip config parameter or when the
default bridge already exist. That means, if a user provides the
fixed-cidr parameter on a fresh install or reboot their computer/server
without bip set, dockerd throw the following error when it starts:
> failed to start daemon: Error initializing network controller: Error
> creating default "bridge" network: failed to parse pool request for
> address space "LocalDefault" pool "" subpool "100.64.0.0/26": Invalid
> Address SubPool
See #45356.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit 2d31697)
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The (*network).ipamRelease function nils out the network's IPAM info
fields, putting the network struct into an inconsistent state. The
network-restore startup code panics if it tries to restore a network
from a struct which has fewer IPAM config entries than IPAM info
entries. Therefore (*network).delete contains a critical section: by
persisting the network to the store after ipamRelease(), the datastore
will contain an inconsistent network until the deletion operation
completes and finishes deleting the network from the datastore. If for
any reason the deletion operation is interrupted between ipamRelease()
and deleteFromStore(), the daemon will crash on startup when it tries to
restore the network.
Updating the datastore after releasing the network's IPAM pools may have
served a purpose in the past, when a global datastore was used for
intra-cluster communication and the IPAM allocator had persistent global
state, but nowadays there is no global datastore and the IPAM allocator
has no persistent state whatsoever. Remove the vestigial datastore
update as it is no longer necessary and only serves to cause problems.
If the network deletion is interrupted before the network is deleted
from the datastore, the deletion will resume during the next daemon
startup, including releasing the IPAM pools.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit c957ad0067)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
GRPC is logging a *lot* of garbage at info level.
This configures the GRPC logger such that it is only giving us logs when
at debug level and also adds a log field indicating where the logs are
coming from.
containerd is still currently spewing these same log messages and needs
a separate update.
Without this change `docker build` is extremely noisy in the daemon
logs.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit c7ccc68b15)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Stopping container on Windows can sometimes take longer than 10s which
caused this test to be flaky.
Increase the timeout to 75s when running this test on Windows.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 74dbb721aa)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The 'Deprecated:' line in NewClient's doc comment was not in a new
paragraph, so GoDoc, linters, and IDEs were unaware that it was
deprecated. The package documentation also continued to reference
NewClient. Update the doc comments to finish documenting that NewClient
is deprecated.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 6b9968e8b1)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, the AWSLogs driver attempted to implement
non-blocking itself. Non-blocking is supposed to
implemented solely by the Docker RingBuffer that
wraps the log driver.
Please see issue and explanation here:
https://github.com/moby/moby/issues/45217
Signed-off-by: Wesley Pettit <wppttt@amazon.com>
(cherry picked from commit c8f8d11ac4)
- Prevent from descriptor leak
- Fixes optlen in getsockopt() for s390x
full diff: 9a39160e90...7ff4192f6f
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 893d28469f)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before:
```console
$ docker-rootless-setuptool.sh install
...
[INFO] Use CLI context "rootless"
Current context is now "rootless"
[INFO] Make sure the following environment variables are set (or add them to ~/.bashrc):
export PATH=/usr/local/bin:$PATH
Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///run/user/1001/docker.sock
```
After:
```console
$ docker-rootless-setuptool.sh install
...
[INFO] Using CLI context "rootless"
Current context is now "rootless"
[INFO] Make sure the following environment variable(s) are set (or add them to ~/.bashrc):
export PATH=/usr/local/bin:$PATH
[INFO] Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///run/user/1001/docker.sock
```
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 4aa2876c75)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
go1.19.8 (released 2023-04-04) includes security fixes to the go/parser,
html/template, mime/multipart, net/http, and net/textproto packages, as well as
bug fixes to the linker, the runtime, and the time package. See the Go 1.19.8
milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.19.8+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.19.7...go1.19.8
Further details from the announcement on the mailing list:
We have just released Go versions 1.20.3 and 1.19.8, minor point releases.
These minor releases include 4 security fixes following the security policy:
- go/parser: infinite loop in parsing
Calling any of the Parse functions on Go source code which contains `//line`
directives with very large line numbers can cause an infinite loop due to
integer overflow.
Thanks to Philippe Antoine (Catena cyber) for reporting this issue.
This is CVE-2023-24537 and Go issue https://go.dev/issue/59180.
- html/template: backticks not treated as string delimiters
Templates did not properly consider backticks (`) as Javascript string
delimiters, and as such did not escape them as expected. Backticks are
used, since ES6, for JS template literals. If a template contained a Go
template action within a Javascript template literal, the contents of the
action could be used to terminate the literal, injecting arbitrary Javascript
code into the Go template.
As ES6 template literals are rather complex, and themselves can do string
interpolation, we've decided to simply disallow Go template actions from being
used inside of them (e.g. "var a = {{.}}"), since there is no obviously safe
way to allow this behavior. This takes the same approach as
github.com/google/safehtml. Template.Parse will now return an Error when it
encounters templates like this, with a currently unexported ErrorCode with a
value of 12. This ErrorCode will be exported in the next major release.
Users who rely on this behavior can re-enable it using the GODEBUG flag
jstmpllitinterp=1, with the caveat that backticks will now be escaped. This
should be used with caution.
Thanks to Sohom Datta, Manipal Institute of Technology, for reporting this issue.
This is CVE-2023-24538 and Go issue https://go.dev/issue/59234.
- net/http, net/textproto: denial of service from excessive memory allocation
HTTP and MIME header parsing could allocate large amounts of memory, even when
parsing small inputs.
Certain unusual patterns of input data could cause the common function used to
parse HTTP and MIME headers to allocate substantially more memory than
required to hold the parsed headers. An attacker can exploit this behavior to
cause an HTTP server to allocate large amounts of memory from a small request,
potentially leading to memory exhaustion and a denial of service.
Header parsing now correctly allocates only the memory required to hold parsed
headers.
Thanks to Jakob Ackermann (@das7pad) for discovering this issue.
This is CVE-2023-24534 and Go issue https://go.dev/issue/58975.
- net/http, net/textproto, mime/multipart: denial of service from excessive resource consumption
Multipart form parsing can consume large amounts of CPU and memory when
processing form inputs containing very large numbers of parts. This stems from
several causes:
mime/multipart.Reader.ReadForm limits the total memory a parsed multipart form
can consume. ReadForm could undercount the amount of memory consumed, leading
it to accept larger inputs than intended. Limiting total memory does not
account for increased pressure on the garbage collector from large numbers of
small allocations in forms with many parts. ReadForm could allocate a large
number of short-lived buffers, further increasing pressure on the garbage
collector. The combination of these factors can permit an attacker to cause an
program that parses multipart forms to consume large amounts of CPU and
memory, potentially resulting in a denial of service. This affects programs
that use mime/multipart.Reader.ReadForm, as well as form parsing in the
net/http package with the Request methods FormFile, FormValue,
ParseMultipartForm, and PostFormValue.
ReadForm now does a better job of estimating the memory consumption of parsed
forms, and performs many fewer short-lived allocations.
In addition, mime/multipart.Reader now imposes the following limits on the
size of parsed forms:
Forms parsed with ReadForm may contain no more than 1000 parts. This limit may
be adjusted with the environment variable GODEBUG=multipartmaxparts=. Form
parts parsed with NextPart and NextRawPart may contain no more than 10,000
header fields. In addition, forms parsed with ReadForm may contain no more
than 10,000 header fields across all parts. This limit may be adjusted with
the environment variable GODEBUG=multipartmaxheaders=.
Thanks to Jakob Ackermann for discovering this issue.
This is CVE-2023-24536 and Go issue https://go.dev/issue/59153.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 59118bff50 made this a direct
dependency (previously it was indirect). That commit was part of an
advisory, and didn't run the vendor validation check because of that.
This patch fixes the vendor.mod to unblock CI in this branch.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The netutils.ElectInterfaceAddresses function is only used in one place
outside of tests: in the daemon, to configure the default bridge
network. The function is also messy to reason about as it references the
shared mutable state of ipamutils.PredefinedLocalScopeDefaultNetworks.
It uses the list of predefined default networks to always return an IPv4
address even if the named interface does not exist or does not have any
IPv4 addresses. This list happens to be the same as the one used to
initialize the address pool of the 'builtin' IPAM driver, though that is
far from obvious. (Start with "./libnetwork".initIPAMDrivers and trace
the dataflow of the addressPool value. Surprise! Global state is being
mutated using the value of other global mutable state.)
The daemon does not need the fallback behaviour of
ElectInterfaceAddresses. In fact, the daemon does not have to configure
an address pool for the network at all! libnetwork will acquire one of
the available address ranges from the network's IPAM driver when the
preferred-pool configuration is unset. It will do so using the same list
of address ranges and the exact same logic
(netutils.FindAvailableNetworks) as ElectInterfaceAddresses. So unless
the daemon needs to force the network to use a specific address range
because the bridge interface already exists, it can leave the details
up to libnetwork.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit cc19eba)
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Notable Updates
- Disable looking up usernames and groupnames on host
- Add support for Windows ArgsEscaped images
- Update hcsshim to v0.9.8
- Fix debug flag in shim
- Add WithReadonlyTempMount to support readonly temporary mounts
- Update ttrpc to fix file descriptor leak
- Update runc binary to v1.1.5
= Update image config to support ArgsEscaped
full diff: https://github.com/containerd/containerd/compare/v1.6.19...v1.6.20
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.5
diff: https://github.com/opencontainers/runc/compare/v1.1.4...v1.1.5
This is the fifth patch release in the 1.1.z series of runc, which fixes
three CVEs found in runc.
* CVE-2023-25809 is a vulnerability involving rootless containers where
(under specific configurations), the container would have write access
to the /sys/fs/cgroup/user.slice/... cgroup hierarchy. No other
hierarchies on the host were affected. This vulnerability was
discovered by Akihiro Suda.
<https://github.com/opencontainers/runc/security/advisories/GHSA-m8cg-xc2p-r3fc>
* CVE-2023-27561 was a regression which effectively re-introduced
CVE-2019-19921. This bug was present from v1.0.0-rc95 to v1.1.4. This
regression was discovered by @Beuc.
<https://github.com/advisories/GHSA-vpvm-3wq2-2wvm>
* CVE-2023-28642 is a variant of CVE-2023-27561 and was fixed by the same
patch. This variant of the above vulnerability was reported by Lei
Wang.
<https://github.com/opencontainers/runc/security/advisories/GHSA-g2j6-57v7-gm8c>
In addition, the following other fixes are included in this release:
* Fix the inability to use `/dev/null` when inside a container.
* Fix changing the ownership of host's `/dev/null` caused by fd redirection
(a regression in 1.1.1).
* Fix rare runc exec/enter unshare error on older kernels, including
CentOS < 7.7.
* nsexec: Check for errors in `write_log()`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 77be7b777c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
FirewallD creates the root INPUT chain with a default-accept policy and
a terminal rule which rejects all packets not accepted by any prior
rule. Any subsequent rules appended to the chain are therefore inert.
The administrator would have to open the VXLAN UDP port to make overlay
networks work at all, which would result in all VXLAN traffic being
accepted and defeating our attempts to enforce encryption on encrypted
overlay networks.
Insert the rule to drop unencrypted VXLAN packets tagged for encrypted
overlay networks at the top of the INPUT chain so that enforcement of
mandatory encryption takes precedence over any accept rules configured
by the administrator. Continue to append the accept rule to the bottom
of the chain so as not to override any administrator-configured drop
rules.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 965eda3b9a)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Use `exec.Command` created by this function instead of obtaining it from
daemon struct. This prevents a race condition where `daemon.Kill` is
called before the goroutine has the chance to call `cmd.Wait`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 88992de283)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
TestDaemonRestartKillContainers test was always executing the last case
(`container created should not be restarted`) because the iterated
variables were not copied correctly.
Capture iterated values by value correctly and rename c to tc.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit fed1c96e10)
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Some newer distros such as RHEL 9 have stopped making the xt_u32 kernel
module available with the kernels they ship. They do ship the xt_bpf
kernel module, which can do everything xt_u32 can and more. Add an
alternative implementation of the iptables match rule which uses xt_bpf
to implement exactly the same logic as the u32 filter using a BPF
program. Try programming the BPF-powered rules as a fallback when
programming the u32-powered rules fails.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 105b9834fb)
Signed-off-by: Cory Snider <csnider@mirantis.com>
The iptables rule clause used to match on the VNI of VXLAN datagrams
looks like line noise to the uninitiated. It doesn't help that the
expression is repeated twice and neither copy has any commentary.
DRY out the rule builder to a common function, and document what the
rule does and how it works.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 44cf27b5fc)
Signed-off-by: Cory Snider <csnider@mirantis.com>
The iptables rules which make encryption mandatory on an encrypted
overlay network are only programmed once there is a second node
participating in the network. This leaves single-node encrypted overlay
networks vulnerable to packet injection. Furthermore, failure to program
the rules is not treated as a fatal error.
Program the iptables rules to make encryption mandatory before creating
the VXLAN link to guarantee that there is no window of time where
incoming cleartext VXLAN packets for the network would be accepted, or
outgoing cleartext packets be transmitted. Only create the VXLAN link if
programming the rules succeeds to ensure that it fails closed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 142f46cac1)
Signed-off-by: Cory Snider <csnider@mirantis.com>
The overlay-network encryption code is woefully under-documented, which
is especially problematic as it operates on under-documented kernel
interfaces. Document what I have puzzled out of the implementation for
the benefit of the next poor soul to touch this code.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit d4fd582fb2)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Volumes created from the image config were not being pruned because the
volume service did not think they were anonymous since the code to
create passes along a generated name instead of letting the volume
service generate it.
This changes the code path to have the volume service generate the name
instead of doing it ahead of time.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 146df5fbd3)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 3246db3755 added handling for removing
cluster volumes, but in some conditions, this resulted in errors not being
returned if the volume was in use;
docker swarm init
docker volume create foo
docker create -v foo:/foo busybox top
docker volume rm foo
This patch changes the logic for ignoring "local" volume errors if swarm
is enabled (and cluster volumes supported).
While working on this fix, I also discovered that Cluster.RemoveVolume()
did not handle the "force" option correctly; while swarm correctly handled
these, the cluster backend performs a lookup of the volume first (to obtain
its ID), which would fail if the volume didn't exist.
Before this patch:
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
=== RUN TestVolumesRemoveSwarmEnabled
=== PAUSE TestVolumesRemoveSwarmEnabled
=== CONT TestVolumesRemoveSwarmEnabled
=== RUN TestVolumesRemoveSwarmEnabled/volume_in_use
volume_test.go:122: assertion failed: error is nil, not errdefs.IsConflict
volume_test.go:123: assertion failed: expected an error, got nil
=== RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force
volume_test.go:143: assertion failed: error is not nil: Error response from daemon: volume no_such_volume not found
--- FAIL: TestVolumesRemoveSwarmEnabled (1.57s)
--- FAIL: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
--- FAIL: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)
FAIL
With this patch:
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
=== RUN TestVolumesRemoveSwarmEnabled
=== PAUSE TestVolumesRemoveSwarmEnabled
=== CONT TestVolumesRemoveSwarmEnabled
=== RUN TestVolumesRemoveSwarmEnabled/volume_in_use
=== RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force
--- PASS: TestVolumesRemoveSwarmEnabled (1.53s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)
PASS
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 058a31e479)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Includes a security fix for crypto/elliptic (CVE-2023-24532).
> go1.19.7 (released 2023-03-07) includes a security fix to the crypto/elliptic
> package, as well as bug fixes to the linker, the runtime, and the crypto/x509
> and syscall packages. See the Go 1.19.7 milestone on our issue tracker for
> details.
https://go.dev/doc/devel/release#go1.19.minor
From the announcement:
> We have just released Go versions 1.20.2 and 1.19.7, minor point releases.
>
> These minor releases include 1 security fixes following the security policy:
>
> - crypto/elliptic: incorrect P-256 ScalarMult and ScalarBaseMult results
>
> The ScalarMult and ScalarBaseMult methods of the P256 Curve may return an
> incorrect result if called with some specific unreduced scalars (a scalar larger
> than the order of the curve).
>
> This does not impact usages of crypto/ecdsa or crypto/ecdh.
>
> This is CVE-2023-24532 and Go issue https://go.dev/issue/58647.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- fix docker service create doesn't work when network and generic-resource are both attached
- Fix removing tasks when a jobs service is removed
- CSI: Allow NodePublishVolume even when plugin does not support staging
full diff: 904c221ac2...80a528a868
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 088aff1620)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- buildinfo: ensure URLs are redacted before written (fixes CVE-2023-26054)
full diff: 4f0ee09c40...70f2ad56d3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
TestRequestReleaseAddressDuplicate gets flagged by go test -race because
the same err variable inside the test is assigned to from multiple
goroutines without synchronization, which obscures whether or not there
are any data races in the code under test.
Trouble is, the test _depends on_ the data race to exit the loop if an
error occurs inside a spawned goroutine. And the test contains a logical
concurrency bug (not flagged by the Go race detector) which can result
in false-positive test failures. Because a release operation is logged
after the IP is released, the other goroutine could reacquire the
address and log that it was reacquired before the release is logged.
Fix up the test so it is no longer subject to data races or
false-positive test failures, i.e. flakes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit b62445871e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The latest version of containerd-shim-runhcs-v1 (v0.10.0-rc.4) pulled in
with the bump to ContainerD v1.7.0-rc.3 had several changes to make it
more robust, which had the side effect of increasing the worst-case
amount of time it takes for a container to exit in the worst case.
Notably, the total timeout for shutting down a task increased from 30
seconds to 60! Increase the timeouts hardcoded in the daemon and
integration tests so that they don't give up too soon.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit d634ae9b60)
Signed-off-by: Cory Snider <csnider@mirantis.com>
"math/rand".Seed
- Migrate to using local RNG instances.
"archive/tar".TypeRegA
- The deprecated constant tar.TypeRegA is the same value as
tar.TypeReg and so is not needed at all.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit dea3f2b417)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Go 1.20 made a change to the behaviour of package "os/exec" which was
not mentioned in the release notes:
2b8f214094
Attempts to execute a directory now return syscall.EISDIR instead of
syscall.EACCESS. Check for EISDIR errors from the runtime and fudge the
returned error message to maintain compatibility with existing versions
of docker/cli when using a version of runc compiled with Go 1.20+.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 713e02e03e)
Signed-off-by: Cory Snider <csnider@mirantis.com>
maxDownloadAttempts maps to the daemon configuration flag
--max-download-attempts int
Set the max download attempts for each pull (default 5)
and the daemon configuration machinery interprets a value of 0 as "apply
the default value" and not a valid user value (config validation/
normalization bugs notwithstanding). The intention is clearly that this
configuration value should be an upper limit on the number of times the
daemon should try to download a particular layer before giving up. So it
is surprising to have the configuration value interpreted as a _retry_
limit. The daemon will make up to N+1 attempts to download a layer! This
also means users cannot disable retries even if they wanted to.
As this is a longstanding bug, not a recent regression, it would not be
appropriate to backport the fix (97921915a8)
in a patch release. Update the test to assert on the buggy behaviour so
it passes again.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This addresses the same CVE as is patched in go1.19.6. From that announcement:
> net/http: avoid quadratic complexity in HPACK decoding
>
> A maliciously crafted HTTP/2 stream could cause excessive CPU consumption
> in the HPACK decoder, sufficient to cause a denial of service from a small
> number of small requests.
>
> This issue is also fixed in golang.org/x/net/http2 v0.7.0, for users manually
> configuring HTTP/2.
>
> This is CVE-2022-41723 and Go issue https://go.dev/issue/57855.
full diff: https://github.com/golang/net/compare/v0.5.0...v0.7.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a36286cf89)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.19.6 (released 2023-02-14) includes security fixes to the crypto/tls,
mime/multipart, net/http, and path/filepath packages, as well as bug fixes to
the go command, the linker, the runtime, and the crypto/x509, net/http, and
time packages. See the Go 1.19.6 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.19.6+label%3ACherryPickApproved
From the announcement on the security mailing:
We have just released Go versions 1.20.1 and 1.19.6, minor point releases.
These minor releases include 4 security fixes following the security policy:
- path/filepath: path traversal in filepath.Clean on Windows
On Windows, the filepath.Clean function could transform an invalid path such
as a/../c:/b into the valid path c:\b. This transformation of a relative (if
invalid) path into an absolute path could enable a directory traversal attack.
The filepath.Clean function will now transform this path into the relative
(but still invalid) path .\c:\b.
This is CVE-2022-41722 and Go issue https://go.dev/issue/57274.
- net/http, mime/multipart: denial of service from excessive resource
consumption
Multipart form parsing with mime/multipart.Reader.ReadForm can consume largely
unlimited amounts of memory and disk files. This also affects form parsing in
the net/http package with the Request methods FormFile, FormValue,
ParseMultipartForm, and PostFormValue.
ReadForm takes a maxMemory parameter, and is documented as storing "up to
maxMemory bytes +10MB (reserved for non-file parts) in memory". File parts
which cannot be stored in memory are stored on disk in temporary files. The
unconfigurable 10MB reserved for non-file parts is excessively large and can
potentially open a denial of service vector on its own. However, ReadForm did
not properly account for all memory consumed by a parsed form, such as map
ntry overhead, part names, and MIME headers, permitting a maliciously crafted
form to consume well over 10MB. In addition, ReadForm contained no limit on
the number of disk files created, permitting a relatively small request body
to create a large number of disk temporary files.
ReadForm now properly accounts for various forms of memory overhead, and
should now stay within its documented limit of 10MB + maxMemory bytes of
memory consumption. Users should still be aware that this limit is high and
may still be hazardous.
ReadForm now creates at most one on-disk temporary file, combining multiple
form parts into a single temporary file. The mime/multipart.File interface
type's documentation states, "If stored on disk, the File's underlying
concrete type will be an *os.File.". This is no longer the case when a form
contains more than one file part, due to this coalescing of parts into a
single file. The previous behavior of using distinct files for each form part
may be reenabled with the environment variable
GODEBUG=multipartfiles=distinct.
Users should be aware that multipart.ReadForm and the http.Request methods
that call it do not limit the amount of disk consumed by temporary files.
Callers can limit the size of form data with http.MaxBytesReader.
This is CVE-2022-41725 and Go issue https://go.dev/issue/58006.
- crypto/tls: large handshake records may cause panics
Both clients and servers may send large TLS handshake records which cause
servers and clients, respectively, to panic when attempting to construct
responses.
This affects all TLS 1.3 clients, TLS 1.2 clients which explicitly enable
session resumption (by setting Config.ClientSessionCache to a non-nil value),
and TLS 1.3 servers which request client certificates (by setting
Config.ClientAuth
> = RequestClientCert).
This is CVE-2022-41724 and Go issue https://go.dev/issue/58001.
- net/http: avoid quadratic complexity in HPACK decoding
A maliciously crafted HTTP/2 stream could cause excessive CPU consumption
in the HPACK decoder, sufficient to cause a denial of service from a small
number of small requests.
This issue is also fixed in golang.org/x/net/http2 v0.7.0, for users manually
configuring HTTP/2.
This is CVE-2022-41723 and Go issue https://go.dev/issue/57855.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 94feb31516)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The per-network statistics counters are loaded and incremented without
any concurrency control. Use atomic integers to prevent data races
without having to add any synchronization.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit d31fa84c7c)
Signed-off-by: Cory Snider <csnider@mirantis.com>
The errors are already returned to the client in the API response, so
logging them to the daemon log is redundant. Log the errors at level
Debug so as not to pollute the end-users' daemon logs with noise.
Refactor the logs to use structured fields. Add the request context to
the log entry so that logrus hooks could annotate the log entries with
contextual information about the API request in the hypothetical future.
Fixes#44997
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit a4e3c67e44)
Signed-off-by: Cory Snider <csnider@mirantis.com>
DNS servers in the loopback address range should always be resolved in
the host network namespace when the servers are configured by reading
from the host's /etc/resolv.conf. The daemon mistakenly conflated the
presence of DNS options (docker run --dns-opt) with user-supplied DNS
servers, treating the list of servers loaded from the host as a user-
supplied list and attempting to resolve in the container's network
namespace. Correct this oversight so that loopback DNS servers are only
resolved in the container's network namespace when the user provides the
DNS server list, irrespective of other DNS configuration.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 046cc9e776)
Signed-off-by: Cory Snider <csnider@mirantis.com>
If the resolver encounters an error before it attempts to forward the
request to external DNS, do not try to log information about the
external connection, because at this point `extConn` is `nil`. This
makes sure `dockerd` won't panic and crash from a nil pointer
dereference when it sees an invalid DNS query.
fixes#44979
Signed-off-by: er0k <er0k@er0k.net>
The function signature has changed since v0.10.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(cherry picked from commit 335907d187)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Adds overrides with specific tests suites in our tests
matrix so we can reduce build time significantly.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit 22776f8fdb)
`hostSupports` doesn't check if the apparmor_parser is available.
It's possible in some environments that the apparmor will be enabled but
the tool to load the profile is not available which will cause the
ensureDefaultAppArmorProfile to fail completely.
This patch checks if the apparmor_parser is available. Otherwise the
function returns early, but still logs a warning to the daemon log.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit ab3fa46502)
IPVLAN networks created on Moby v20.10 do not have the IpvlanFlag
configuration value persisted in the libnetwork database as that config
value did not exist before v23.0.0. Gracefully migrate configurations on
unmarshal to prevent type-assertion panics at daemon start after upgrade.
Fixes#44925
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 91725ddc92)
Signed-off-by: Cory Snider <csnider@mirantis.com>
CI is failing when bind-mounting source from the host into the dev-container;
fatal: detected dubious ownership in repository at '/go/src/github.com/docker/docker'
To add an exception for this directory, call:
git config --global --add safe.directory /go/src/github.com/docker/docker
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 21677816a0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Pid field of an exit event cannot be relied upon to differentiate
exits of the container's task from exits of other container processes,
i.e. execs. The Pid is reported by the runtime and is implementation-
defined so there is no guarantee that a task's pid is distinct from the
pids of any other process in the same container. In particular,
kata-containers reports the pid of the hypervisor for all exit events.
Update the daemon to differentiate container exits from exec exits by
inspecting the event's ProcessID.
The local_windows libcontainerd implementation already sets the
ProcessID to InitProcessName on container exit events. Update the remote
libcontainerd implementation to match. ContainerD guarantees that the
process ID of a task (container init process) is set to the
corresponding container ID, so use that invariant to distinguish task
exits from other process exits.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Notable Updates
- Fix push error propagation
- Fix slice append error with HugepageLimits for Linux
- Update default seccomp profile for PKU and CAP_SYS_NICE
- Fix overlayfs error when upperdirlabel option is set
full diff: https://github.com/containerd/containerd/compare/v1.6.15...v1.6.16
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit c41c8c2f86)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Repro steps:
- Run Docker Desktop
- Run `docker run busybox tail -f /dev/null`
- Run `pkill "Docker Desktop"
Expected:
An error message that indicates that Docker Desktop is shutting down.
Actual:
An error message that looks like this:
```
error waiting for container: invalid character 's' looking for beginning of value
```
here's an example:
https://github.com/docker/for-mac/issues/6575#issuecomment-1324879001
After this change, you get an error message like:
```
error waiting for container: copying response body from Docker: unexpected EOF
```
which is a bit more explicit.
Signed-off-by: Nick Santos <nick.santos@docker.com>
(cherry picked from commit 9900c7a348)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Notable Updates
- Fix push error propagation
- Fix slice append error with HugepageLimits for Linux
- Update default seccomp profile for PKU and CAP_SYS_NICE
- Fix overlayfs error when upperdirlabel option is set
full diff: https://github.com/containerd/containerd/compare/v1.6.15...v1.6.16
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We already prefer ld for cross-building arm64 but that seems
not enough as native arm64 build also has a linker issue with lld
so we need to also prefer ld for native arm64 build.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit d2d6ef431f)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
libnetwork/etchosts/etchosts_test.go:167:54: empty-lines: extra empty line at the end of a block (revive)
libnetwork/osl/route_linux.go:185:74: empty-lines: extra empty line at the start of a block (revive)
libnetwork/osl/sandbox_linux_test.go:323:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/bitseq/sequence.go:412:48: empty-lines: extra empty line at the start of a block (revive)
libnetwork/datastore/datastore_test.go:67:46: empty-lines: extra empty line at the end of a block (revive)
libnetwork/datastore/mock_store.go:34:60: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/firewalld.go:202:44: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/firewalld_test.go:76:36: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/iptables.go:256:67: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/iptables.go:303:128: empty-lines: extra empty line at the start of a block (revive)
libnetwork/networkdb/cluster.go:183:72: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipams/null/null_test.go:44:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/macvlan/macvlan_store.go:45:52: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipam/allocator_test.go:1058:39: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/bridge/port_mapping.go:88:111: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/link.go:26:90: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/setup_ipv6_test.go:17:34: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/setup_ip_tables.go:392:4: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/bridge/bridge.go:804:50: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/ov_serf.go:183:29: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/ov_utils.go:81:64: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/overlay/peerdb.go:172:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:209:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:344:89: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:436:63: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/overlay.go:183:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/encryption.go:69:28: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/overlay/ov_network.go:563:81: empty-lines: extra empty line at the start of a block (revive)
libnetwork/default_gateway.go:32:43: empty-lines: extra empty line at the start of a block (revive)
libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the start of a block (revive)
libnetwork/service_common.go:184:64: empty-lines: extra empty line at the end of a block (revive)
libnetwork/endpoint.go:161:55: empty-lines: extra empty line at the end of a block (revive)
libnetwork/store.go:320:33: empty-lines: extra empty line at the end of a block (revive)
libnetwork/store_linux_test.go:11:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/sandbox.go:571:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/service_common.go:317:246: empty-lines: extra empty line at the start of a block (revive)
libnetwork/endpoint.go:550:17: empty-lines: extra empty line at the end of a block (revive)
libnetwork/sandbox_dns_unix.go:213:106: empty-lines: extra empty line at the start of a block (revive)
libnetwork/controller.go:676:85: empty-lines: extra empty line at the end of a block (revive)
libnetwork/agent.go:876:60: empty-lines: extra empty line at the end of a block (revive)
libnetwork/resolver.go:324:69: empty-lines: extra empty line at the end of a block (revive)
libnetwork/network.go:1153:92: empty-lines: extra empty line at the end of a block (revive)
libnetwork/network.go:1955:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/network.go:2235:9: empty-lines: extra empty line at the start of a block (revive)
libnetwork/libnetwork_internal_test.go:336:26: empty-lines: extra empty line at the start of a block (revive)
libnetwork/resolver_test.go:76:35: empty-lines: extra empty line at the end of a block (revive)
libnetwork/libnetwork_test.go:303:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/libnetwork_test.go:985:46: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipam/allocator_test.go:1263:37: empty-lines: extra empty line at the start of a block (revive)
libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit cd381aea56)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function effectively is a constructor, so rename it to better describe
it's functionality.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 267108e113)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This method was an exported method, but only used as part of ParseConfigOptions,
so inlining it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 09cc2f9d0e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was unclear what the distinction was between these configuration
structs, so merging them to simplify.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 528428919e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was used for testing purposes when libnetwork was in a separate repo, using
the dnet utility, which was removed in 7266a956a8.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 571baffd59)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Libnetwork configuration files were only used as part of integration tests using
the dnet utility, which was removed in 7266a956a8
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 46f4a45769)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This method was only used in a single place; inlining it makes it
easier to see what's done.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 7d574f5ac6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were no longer used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a8a8bd1e42)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Static binaries for dockerd are broken on armhf and armel (32-bit).
It seems to be an issue with GCC as building using clang solves
this issue. Also adds extra instruction to prefer ld for
cross-compiling arm64 in bullseye otherwise it doesn't link.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit f676dab8dc)
Disables user.Lookup() and net.LookupHost() in the init() function on Windows.
Any package that simply imports pkg/chrootarchive will panic on Windows
Nano Server, due to missing netapi32.dll. While docker itself is not
meant to run on Nano Server, binaries that may import this package and
run on Nano server, will fail even if they don't really use any of the
functionality in this package while running on Nano.
Conflicts:
pkg/chrootarchive/archive_unix.go
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
(cherry picked from commit f49c88f1c4)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Current implementation in hack/make.sh overwrites PKG_CONFIG
if not defined and set it to pkg-config. When a build is invoked
using xx in our Dockerfile, it will set PKG_CONFIG to the right
value in go environments depending on the target architecture: 8015613ccc/base/xx-go (L75-L78)
Also needs to install dpkg-dev to use pkg-config when cross-building
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit 71fa3b1337)
Build currently doesn't set the right name for target ARM
architecture through switches in CGO_CFLAGS and CGO_CXXFLAGS
when doing cross-compilation. This was previously fixed in https://github.com/moby/moby/pull/43474
Also removes the toolchain configuration. Following changes for
cross-compilation in https://github.com/moby/moby/pull/44546,
we forgot to remove the toolchain configuration that is
not used anymore as xx already sets correct cc/cxx envs already.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit 945704208a)
Raspberry Pi allows to start system under overlayfs.
Docker is successfully fallbacks to fuse-overlay but not starting
because of the `Error starting daemon: rename /var/lib/docker/runtimes /var/lib/docker/runtimes-old: invalid cross-device link` error
It's happening because `rename` is not supported by overlayfs.
After manually removing directory `runtimes` docker starts and works successfully
Signed-off-by: Illia Antypenko <ilya@antipenko.pp.ua>
(cherry picked from commit d591710f82)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This helps ensure that users are not surprised by unexpected tokens in
the JSON parser, or fallout later in the daemon.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit 8dbc5df952)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a pragmatic but impure choice, in order to better support the
default tools available on Windows Server, and reduce user confusion due
to otherwise inscrutable-to-the-uninitiated errors like the following:
> invalid character 'þ' looking for beginning of value
> invalid character 'ÿ' looking for beginning of value
While meaningful to those who are familiar with and are equipped to
diagnose encoding issues, these characters will be hidden when the file
is edited with a BOM-aware text editor, and further confuse the user.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit d42495033e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
dockerd handles SIGQUIT by dumping all goroutine stacks to standard
error and exiting. In contrast, the Go runtime's default SIGQUIT
behaviour... dumps all goroutine stacks to standard error and exits.
The default SIGQUIT behaviour is implemented directly in the runtime's
signal handler, and so is both more robust to bugs in the Go runtime and
does not perturb the state of the process to anywhere near same degree
as dumping goroutine stacks from a user goroutine. The only notable
difference from a user's perspective is that the process exits with
status 2 instead of 128+SIGQUIT.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 0867d3173c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When userland-proxy is turned off and on again, the iptables nat rule
doing hairpinning isn't properly removed. This fix makes sure this nat
rule is removed whenever the bridge is torn down or hairpinning is
disabled (through setting userland-proxy to true).
Unlike for ip masquerading and ICC, the `programChainRule()` call
setting up the "MASQ LOCAL HOST" rule has to be called unconditionally
because the hairpin parameter isn't restored from the driver store, but
always comes from the driver config.
For the "SKIP DNAT" rule, things are a bit different: this rule is
always deleted by `removeIPChains()` when the bridge driver is
initialized.
Fixes#44721.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit 566a2e4)
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
synchronises some fixes between these API versions for the documentation,
including fixes from:
- 18f85467e7
- 345346d7c6
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 92cbd1c69e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This centralizes more defaults, to be part of the config struct that's
created, instead of interweaving the defaults with other code in various
places.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit b28e66cf4f)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
[RFC 8259] allows for JSON implementations to optionally ignore a BOM
when it helps with interoperability; do so in Moby as Notepad (the only
text editor available out of the box in many versions of Windows Server)
insists on writing UTF-8 with a BOM.
[RFC 8259]: https://tools.ietf.org/html/rfc8259#section-8.1
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit bb19265ba8)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is no longer necessary after the switch to the kernel UAPI.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit aa80c33360)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We only need suitable UAPI headers now. They are available on kernel 4.7
and newer; out of the distributions currently in support that users
might be interested in, only Enterprise Linux 7 has too old a kernel
(3.10).
Users of Enterprise Linux 7 distros can compile using a newer platform,
disable the Btrfs graphdriver as documented in this file, or use newer
kernel headers on their older distro.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit c9d632e485)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While the Cgo in this entire file is quite questionable, that is a task
for another day.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit d3778d65fa)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
By relying on the kernel UAPI (userspace API), we can drop a dependency
and simplify building Moby, while also ensuring that we are using a
stable/supported source of the C types and defines we need.
btrfs-progs mirrors the kernel headers, but the headers it ships with
are not the canonical source and as [we have seen before][44698], could
be subject to changes.
Depending on the canonical headers from the kernel both is more
idiomatic, and ensures we are protected by the kernel's promise to not
break userspace.
[44698]: https://github.com/moby/moby/issues/44698
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit 3208dcabdc)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is actually quite meaningless as we are reporting the libbtrfs
version, but we do not use libbtrfs. We only use the kernel interface to
btrfs instead.
While we could report the version of the kernel headers in play, they're
rather all-or-nothing: they provide the structures and defines we need,
or they don't. As such, drop all version information as the host kernel
version is the only thing that matters.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit 1449c82484)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
cmd.Wait is called twice from different goroutines which can cause the
test to hang completely. Fix by calling Wait only once and sending its
return value over a channel.
In TestLogsFollowGoroutinesWithStdout also added additional closes and
process kills to ensure that we don't leak anything in case test returns
early because of failed test assertion.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit deb4910c5b)
Make it possible to add `-race` to the BUILDFLAGS without making the
build fail with error:
"-buildmode=pie not supported when -race is enabled"
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit bbe6e9e8d1)
Conntrack entries are created for UDP flows even if there's nowhere to
route these packets (ie. no listening socket and no NAT rules to
apply). Moreover, iptables NAT rules are evaluated by netfilter only
when creating a new conntrack entry.
When Docker adds NAT rules, netfilter will ignore them for any packet
matching a pre-existing conntrack entry. In such case, when
dockerd runs with userland proxy enabled, packets got routed to it and
the main symptom will be bad source IP address (as shown by #44688).
If the publishing container is run through Docker Swarm or in
"standalone" Docker but with no userland proxy, affected packets will
be dropped (eg. routed to nowhere).
As such, Docker needs to flush all conntrack entries for published UDP
ports to make sure NAT rules are correctly applied to all packets.
- Fixes#44688
- Fixes#8795
- Fixes#16720
- Fixes#7540
- Fixesmoby/libnetwork#2423
- and probably more.
As a precautionary measure, those conntrack entries are also flushed
when revoking external connectivity to avoid those entries to be reused
when a new sandbox is created (although the kernel should already
prevent such case).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit b37d34307d)
Signed-off-by: Cory Snider <csnider@mirantis.com>
The CreatedAt date was determined from the volume's `_data`
directory (`/var/lib/docker/volumes/<volumename>/_data`).
However, when initializing a volume, this directory is updated,
causing the date to change.
Instead of using the `_data` directory, use its parent directory,
which is not updated afterwards, and should reflect the time that
the volume was created.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 01fd23b625)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Keep the same output dir format in the bake definition
as the one used in make scripts.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit 9bcf5bed05)
this script is not used anymore. containerutility is
built in the Dockerfile.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit 04c90b8cf5)
Better support for cross compilation so we can fully rely
on `--platform` flag of buildx for a seamless integration.
This removes unnecessary extra cross logic in the Dockerfile,
DOCKER_CROSSPLATFORMS and CROSS vars and some hack scripts as well.
Non-sandboxed build invocation is still supported and dev stages
in the Dockerfile have been updated accordingly.
Bake definition and GitHub Actions workflows have been updated
accordingly as well.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
(cherry picked from commit 8086f40123)
iptables -C flag was introduced in v1.4.11, which was released ten
years ago. Thus, there're no more Linux distributions supported by
Docker using this version. As such, this commit removes the old way of
checking if an iptables rule exists (by using substring matching).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit 799cc143c9)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The former was doing some checks and logging warnings, whereas
the latter was doing the same checks but to set some internal variables.
As both are called only once and from the same place, there're now
merged together.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit 205e5278c6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
iptables package has a function `detectIptables()` called to initialize
some local variables. Since v20.10.0, it first looks for iptables bin,
then ip6tables and finally it checks what iptables flags are available
(including -C). It early exits when ip6tables isn't available, and
doesn't execute the last check.
To remove port mappings (eg. when a container stops/dies), Docker
first checks if those NAT rules exist and then deletes them. However, in
the particular case where there's no ip6tables bin available, iptables
`-C` flag is considered unavailable and thus it looks for NAT rules by
using some substring matching. This substring matching then fails
because `iptables -t nat -S POSTROUTING` dumps rules in a slighly format
than what's expected.
For instance, here's what `iptables -t nat -S POSTROUTING` dumps:
```
-A POSTROUTING -s 172.18.0.2/32 -d 172.18.0.2/32 -p tcp -m tcp --dport 9999 -j MASQUERADE
```
And here's what Docker looks for:
```
POSTROUTING -p tcp -s 172.18.0.2 -d 172.18.0.2 --dport 9999 -j MASQUERADE
```
Because of that, those rules are considered non-existant by Docker and
thus never deleted. To fix that, this change reorders the code in
`detectIptables()`.
Fixes#42127.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit af7236f85a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use local variables for chains instead of sharing global variables
- make createNewChain a t.Helper
Signed-off-by: Chee Hau Lim <ch33hau@gmail.com>
(cherry picked from commit a2cea992c2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Simplify the error message so that we don't have to distinguish between static-
and non-static builds. Also update the link to the storage-driver section to
use a "/go/" redirect in the docs, as the anchor link was no longer correct.
Using a "/go/" redirect makes sure the link remains functional if docs is moving
around.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a5ebd28797)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in b86e3bee5a to
work around an issue in os/user.Current(), which SEGFAULTS when compiling
statically with cgo enabled (see golang/go#13470).
We hit similar issues in other parts, and contributed a "osusergo" build-
tag in https://go-review.googlesource.com/c/go/+/330753. The "osusergo"
build tag must be set when compiling static binaries with cgo enabled.
If that build-tag is set, the cgo implementation for user.Current() won't
be used, and a pure-go implementation is used instead;
https://github.com/golang/go/blob/go1.19.4/src/os/user/cgo_lookup_unix.go#L5
With the above in place, we no longer need this workaround, and can remove
the ensureHomeIfIAmStatic() function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 155e39187c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Notable Updates
- Update overlay snapshotter to check for tmpfs when evaluating usage of userxattr
- Update hcsschim to v0.9.6 to fix resource leak on exec
- Make swapping disabled with memory limit in CRI plugin
- Allow clients to remove created tasks with PID 0
- Fix concurrent map iteration and map write in CRI port forwarding
- Check for nil HugepageLimits to avoid panic in CRI plugin
See the changelog for complete list of changes:
https://github.com/containerd/containerd/releases/tag/v1.6.13
full diff: https://github.com/containerd/containerd/compare/v1.6.12...v1.6.13
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
No changes in vendored code, but removes some indirect dependencies.
full diff: b17f02f0a0...0da442b278
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 0007490b21)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(*Daemon).registerLinks() calling the WriteHostConfig() method of its
container argument is a vestigial behaviour. In the distant past,
registerLinks() would persist the container links in an SQLite database
and drop the link config from the container's persisted HostConfig. This
changed in Docker v1.10 (#16032) which migrated away from SQLite and
began using the link config in the container's HostConfig as the
persistent source of truth. registerLinks() no longer mutates the
HostConfig at all so persisting the HostConfig to disk falls outside of
its scope of responsibilities.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 388fe4aea8)
Signed-off-by: Cory Snider <csnider@mirantis.com>
(*Container).CheckpointTo() upserts a snapshot of the container to the
daemon's in-memory ViewDB and also persists the snapshot to disk. It
does not register the live container object with the daemon's container
store, however. The ViewDB and container store are used as the source of
truth for different operations, so having a container registered in one
but not the other can result in inconsistencies. In particular, the List
Containers API uses the ViewDB as its source of truth and the Container
Inspect API uses the container store.
The (*Daemon).setHostConfig() method is called fairly early in the
process of creating a container, long before the container is registered
in the daemon's container store. Due to a rogue CheckpointTo() call
inside setHostConfig(), there is a window of time where a container can
be included in a List Containers API response but "not exist" according
to the Container Inspect API and similar endpoints which operate on a
particular container. Remove the rogue call so that the caller has full
control over when the container is checkpointed and update callers to
checkpoint explicitly. No changes to (*Daemon).create() are needed as it
checkpoints the fully-created container via (*Daemon).Register().
Fixes#44512.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 0141c6db81)
Signed-off-by: Cory Snider <csnider@mirantis.com>
GetContainer() would return (nil, nil) when looking up a container
if the container was inserted into the containersReplica ViewDB but not
the containers Store at the time of the lookup. Callers which reasonably
assume that the returned err == nil implies returned container != nil
would dereference a nil pointer and panic. Change GetContainer() so that
it always returns a container or an error.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 00157a42d3)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Moby is not a Go module; to prevent anyone from mistakenly trying to
convert it to one before we are ready, introduce a check (usable in CI
and locally) for a go.mod file.
This is preferable to trying to .gitignore the file as we can ensure
that a mistakenly created go.mod is surfaced by Git-based tooling and is
less likely to surprise a contributor.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit 25c3421802)
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
To make the local build environment more correct and consistent, we
should never leave an uncommitted go.mod in the tree; however, we need a
go.mod for certain commands to work properly. Use a wrapper script to
create and destroy the go.mod as needed instead of potentially changing
tooling behavior by leaving it.
If a go.mod already exists, this script will warn and call the wrapped
command with GO111MODULE=on.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(cherry picked from commit a449f77774)
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This is a dependency of github.com/fluent/fluent-logger-golang, which
currently does not provide a go.mod, but tests against the latest
versions of its dependencies.
Updating this dependency to the latest version.
Notable changes:
- all: implement omitempty
- fix: JSON encoder may produce invalid utf-8 when provided invalid utf-8 message pack string.
- added Unwrap method to errWrapped plus tests; switched travis to go 1.14
- CopyToJSON: fix bitSize for floats
- Add Reader/Writer constructors with custom buffer
- Add missing bin header functions
- msgp/unsafe: bring code in line with unsafe guidelines
- msgp/msgp: fix ReadMapKeyZC (fix "Fail to decode string encoded as bin type")
full diff: https://github.com/tinylib/msgp/compare/v1.1.0...v1.1.6
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 389dacd6e2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is an (indirect) dependency of github.com/fluent/fluent-logger-golang,
which currently does not provide a go.mod, but tests against the latest
versions of its dependencies.
Updating this dependency to the latest version.
full diff: https://github.com/philhofer/fwd/compare/v1.0.0...v1.1.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 24496fe097)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Both of these were deprecated in 55f675811a,
but the format of the GoDoc comments didn't follow the correct format, which
caused them not being picked up by tools as "deprecated".
This patch updates uses in the codebase to use the alternatives.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 0f7c9cd27e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure we use the same alias everywhere for easier finding,
and to prevent accidentally introducing duplicate imports with
different aliases for the same package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit f6b695d2fb)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
golang.org/x/net contains a fix for CVE-2022-41717, which was addressed
in stdlib in go1.19.4 and go1.18.9;
> net/http: limit canonical header cache by bytes, not entries
>
> An attacker can cause excessive memory growth in a Go server accepting
> HTTP/2 requests.
>
> HTTP/2 server connections contain a cache of HTTP header keys sent by
> the client. While the total number of entries in this cache is capped,
> an attacker sending very large keys can cause the server to allocate
> approximately 64 MiB per open connection.
>
> This issue is also fixed in golang.org/x/net/http2 v0.4.0,
> for users manually configuring HTTP/2.
full diff: https://github.com/golang/net/compare/v0.2.0...v0.4.0
other dependency updates (due to circular dependencies):
- golang.org/x/sys v0.3.0: https://github.com/golang/sys/compare/v0.2.0...v0.3.0
- golang.org/x/text v0.5.0: https://github.com/golang/text/compare/v0.4.0...v0.5.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4bbc37687e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This update:
- removes support for go1.11
- removes the use of "golang.org/x/crypto/ed25519", which is now part of stdlib:
> Beginning with Go 1.13, the functionality of this package was moved to the
> standard library as crypto/ed25519. This package only acts as a compatibility
> wrapper.
Note that this is not the latest release; version v1.1.44 introduced a tools.go
file, which added golang.org/x/tools to the dependency tree (but only used for
"go:generate") see commit:
df84acab71
full diff: https://github.com/miekg/dns/compare/v1.1.27...v1.1.43
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit bbb1b82232)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Welcome to the v1.6.11 release of containerd!
The eleventh patch release for containerd 1.6 contains a various fixes and updates.
Notable Updates
- Add pod UID annotation in CRI plugin
- Fix nil pointer deference for Windows containers in CRI plugin
- Fix lease labels unexpectedly overwriting expiration
- Fix for simultaneous diff creation using the same parent snapshot
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Fix nil pointer deference for Windows containers in CRI plugin
- Fix lease labels unexpectedly overwriting expiration
- Fix for simultaneous diff creation using the same parent snapshot
full diff: https://github.com/containerd/containerd/v1.6.10...v1.6.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit d331bc3b03)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Includes security fixes for net/http (CVE-2022-41717, CVE-2022-41720),
and os (CVE-2022-41720).
These minor releases include 2 security fixes following the security policy:
- os, net/http: avoid escapes from os.DirFS and http.Dir on Windows
The os.DirFS function and http.Dir type provide access to a tree of files
rooted at a given directory. These functions permitted access to Windows
device files under that root. For example, os.DirFS("C:/tmp").Open("COM1")
would open the COM1 device.
Both os.DirFS and http.Dir only provide read-only filesystem access.
In addition, on Windows, an os.DirFS for the directory \(the root of the
current drive) can permit a maliciously crafted path to escape from the
drive and access any path on the system.
The behavior of os.DirFS("") has changed. Previously, an empty root was
treated equivalently to "/", so os.DirFS("").Open("tmp") would open the
path "/tmp". This now returns an error.
This is CVE-2022-41720 and Go issue https://go.dev/issue/56694.
- net/http: limit canonical header cache by bytes, not entries
An attacker can cause excessive memory growth in a Go server accepting
HTTP/2 requests.
HTTP/2 server connections contain a cache of HTTP header keys sent by
the client. While the total number of entries in this cache is capped,
an attacker sending very large keys can cause the server to allocate
approximately 64 MiB per open connection.
This issue is also fixed in golang.org/x/net/http2 vX.Y.Z, for users
manually configuring HTTP/2.
Thanks to Josselin Costanzi for reporting this issue.
This is CVE-2022-41717 and Go issue https://go.dev/issue/56350.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.19.4
And the milestone on the issue tracker:
https://github.com/golang/go/issues?q=milestone%3AGo1.19.4+label%3ACherryPickApproved
Full diff: https://github.com/golang/go/compare/go1.19.3...go1.19.4
The golang.org/x/net fix is in 1e63c2f08a
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 52bc1ad744)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's never set, so we can remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 85fddc0081)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is only used for tests, and the key is not verified anymore, so
instead of creating a key and storing it, we can just use an ad-hoc
one.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 8feeaecb84)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Turned out that the loadOrCreateTrustKey() utility was doing exactly the
same as libtrust.LoadOrCreateTrustKey(), so making it a thin wrapped. I kept
the tests to verify the behavior, but we could remove them as we only need this
for our integration tests.
The storage location for the generated key was changed (again as we only need
this for some integration tests), so we can remove the TrustKeyPath from the
config.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 5cdd6ab7cd)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a subset of 1981706196 on master,
preserving the tests for migrating the key to engine-id.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This addresses a regression introduced in 407e3a4552,
which turned out to be "too strict", as there's old images that use, for example;
docker pull python:3.5.1-alpine
3.5.1-alpine: Pulling from library/python
unsupported media type application/octet-stream
Before 407e3a4552, such mediatypes were accepted;
docker pull python:3.5.1-alpine
3.5.1-alpine: Pulling from library/python
e110a4a17941: Pull complete
30dac23631f0: Pull complete
202fc3980a36: Pull complete
Digest: sha256:f88925c97b9709dd6da0cb2f811726da9d724464e9be17a964c70f067d2aa64a
Status: Downloaded newer image for python:3.5.1-alpine
docker.io/library/python:3.5.1-alpine
This patch copies the additional media-types, using the list of types that
were added in a215e15cb1, which fixed a
similar issue.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a6a539497a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This syncs the seccomp-profile with the latest changes in containerd's
profile, applying the same changes as 17a9324035
Some background from the associated ticket:
> We want to use vsock for guest-host communication on KubeVirt
> (https://github.com/kubevirt/kubevirt). In KubeVirt we run VMs in pods.
>
> However since anyone can just connect from any pod to any VM with the
> default seccomp settings, we cannot limit connection attempts to our
> privileged node-agent.
>
> ### Describe the solution you'd like
> We want to deny the `socket` syscall for the `AF_VSOCK` family by default.
>
> I see in [1] and [2] that AF_VSOCK was actually already blocked for some
> time, but that got reverted since some architectures support the `socketcall`
> syscall which can't be restricted properly. However we are mostly interested
> in `arm64` and `amd64` where limiting `socket` would probably be enough.
>
> ### Additional context
> I know that in theory we could use our own seccomp profiles, but we would want
> to provide security for as many users as possible which use KubeVirt, and there
> it would be very helpful if this protection could be added by being part of the
> DefaultRuntime profile to easily ensure that it is active for all pods [3].
>
> Impact on existing workloads: It is unlikely that this will disturb any existing
> workload, becuase VSOCK is almost exclusively used for host-guest commmunication.
> However if someone would still use it: Privileged pods would still be able to
> use `socket` for `AF_VSOCK`, custom seccomp policies could be applied too.
> Further it was already blocked for quite some time and the blockade got lifted
> due to reasons not related to AF_VSOCK.
>
> The PR in KubeVirt which adds VSOCK support for additional context: [4]
>
> [1]: https://github.com/moby/moby/pull/29076#commitcomment-21831387
> [2]: dcf2632945
> [3]: https://kubernetes.io/docs/tutorials/security/seccomp/#enable-the-use-of-runtimedefault-as-the-default-seccomp-profile-for-all-workloads
> [4]: https://github.com/kubevirt/kubevirt/pull/8546
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 57b229012a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-12-01 14:09:46 +01:00
1542 changed files with 65832 additions and 22130 deletions
# FIXME: when updating golangci-lint, remove the temporary "nolint" in https://github.com/moby/moby/blob/7860686a8df15eea9def9e6189c6f9eca031bb6f/libnetwork/networkdb/cluster.go#L246
ARGGOLANGCI_LINT_VERSION=v1.49.0
ARGGOLANGCI_LINT_VERSION=v1.59.1
RUN --mount=type=cache,target=/root/.cache/go-build \
--mount=type=cache,target=/go/pkg/mod \
GOBIN=/build/ GO111MODULE=on go install "github.com/golangci/golangci-lint/cmd/golangci-lint@${GOLANGCI_LINT_VERSION}"\
@@ -220,52 +235,148 @@ RUN --mount=type=cache,target=/root/.cache/go-build \
&& /build/gotestsum --version
FROMbaseASshfmt
ARGSHFMT_VERSION=v3.0.2
ARGSHFMT_VERSION=v3.6.0
RUN --mount=type=cache,target=/root/.cache/go-build \
--mount=type=cache,target=/go/pkg/mod \
GOBIN=/build/ GO111MODULE=on go install "mvdan.cc/sh/v3/cmd/shfmt@${SHFMT_VERSION}"\
&& /build/shfmt --version
FROMdev-baseASdockercli
ARG DOCKERCLI_CHANNEL
# dockercli
FROMbaseASdockercli-src
WORKDIR/tmp/dockercli
RUN git init . && git remote add origin "https://github.com/docker/cli.git"
flags.StringVar(&conf.InitPath,"init-path","","Path to the docker-init binary")
flags.Int64Var(&conf.CPURealtimePeriod,"cpu-rt-period",0,"Limit the CPU real-time period in microseconds for the parent cgroup for all containers (not supported with cgroups v2)")
flags.Int64Var(&conf.CPURealtimeRuntime,"cpu-rt-runtime",0,"Limit the CPU real-time runtime in microseconds for the parent cgroup for all containers (not supported with cgroups v2)")
flags.StringVar(&conf.SeccompProfile,"seccomp-profile",config.SeccompProfileDefault,`Path to seccomp profile. Use "unconfined" to disable the default seccomp profile`)
flags.StringVar(&conf.SeccompProfile,"seccomp-profile",conf.SeccompProfile,`Path to seccomp profile. Use "unconfined" to disable the default seccomp profile`)
flags.Var(&conf.ShmSize,"default-shm-size","Default shm size for containers")
flags.BoolVar(&conf.NoNewPrivileges,"no-new-privileges",false,"Set no-new-privileges by default for new containers")
flags.StringVar(&conf.IpcMode,"default-ipc-mode",string(config.DefaultIpcMode),`Default mode for containers ipc ("shareable" | "private")`)
flags.StringVar(&conf.IpcMode,"default-ipc-mode",conf.IpcMode,`Default mode for containers ipc ("shareable" | "private")`)
flags.Var(&conf.NetworkConfig.DefaultAddressPools,"default-address-pool","Default address pools for node specific local networks")
// rootless needs to be explicitly specified for running "rootful" dockerd in rootless dockerd (#38702)
// Note that defaultUserlandProxyPath and honorXDG are configured according to the value of rootless.RunningWithRootlessKit, not the value of --rootless.
flags.BoolVar(&conf.Rootless,"rootless",rootless.RunningWithRootlessKit(),"Enable rootless mode; typically used with RootlessKit")
flags.StringVar(&conf.CgroupNamespaceMode,"default-cgroupns-mode",string(defaultCgroupNamespaceMode),`Default mode for containers cgroup namespace ("host" | "private")`)
// Note that conf.BridgeConfig.UserlandProxyPath and honorXDG are configured according to the value of rootless.RunningWithRootlessKit, not the value of --rootless.
flags.BoolVar(&conf.Rootless,"rootless",conf.Rootless,"Enable rootless mode; typically used with RootlessKit")
flags.StringVar(&conf.CgroupNamespaceMode,"default-cgroupns-mode",conf.CgroupNamespaceMode,`Default mode for containers cgroup namespace ("host" | "private")`)
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.