Our resolver is just a forwarder for external DNS so it should act like
it. Unless it's a server failure or refusal, take the response at face
value and forward it along to the client. RFC 8020 is only applicable to
caching recursive name servers and our resolver is neither caching nor
recursive.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 41356227f2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The error returned by DecodeConfig was changed in
b6d58d749c and caused this to regress.
Allow empty request bodies for this endpoint once again.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 967c7bc5d3)
Signed-off-by: Cory Snider <csnider@mirantis.com>
This fixes a bug where, if a user pulls an image with a tag != `latest` and
a specific platform, we return an NotFound error for the wrong (`latest`) tag.
see: https://github.com/moby/moby/issues/45558
This bug was introduced in 779a5b3029
in the changes to `daemon/images/image_pull.go`, when we started returning the error from the call to
`GetImage` after the pull. We do this call, if pulling with a specified platform, to check if the platform
of the pulled image matches the requested platform (for cases with single-arch images).
However, when we call `GetImage` we're not passing the image tag, only name, so `GetImage` assumes `latest`
which breaks when the user has requested a different tag, since there might not be such an image in the store.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
(cherry picked from commit f450ea64e6)
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
When the `ServerAddress` in the `AuthConfig` provided by the client is
empty, default to the default registry (registry-1.docker.io).
This makes the behaviour the same as with the containerd image store
integration disabled.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
(cherry picked from commit 2ad499f93e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that most uses of reexec have been replaced with non-reexec
solutions, most of the reexec.Init() calls peppered throughout the test
suites are unnecessary. Furthermore, most of the reexec.Init() calls in
test code neglects to check the return value to determine whether to
exit, which would result in the reexec'ed subprocesses proceeding to run
the tests, which would reexec another subprocess which would proceed to
run the tests, recursively. (That would explain why every reexec
callback used to unconditionally call os.Exit() instead of returning...)
Remove unneeded reexec.Init() calls from test and example code which no
longer needs it, and fix the reexec.Init() calls which are not inert to
exit after a reexec callback is invoked.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 4e0319c878)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Ports over all the previous image delete logic, such as:
- Introduce `prune` and `force` flags
- Introduce the concept of hard and soft image delete conflics, which represent:
- image referenced in multiple tags (soft conflict)
- image being used by a stopped container (soft conflict)
- image being used by a running container (hard conflict)
- Implement delete logic such as:
- if deleting by reference, and there are other references to the same image, just
delete the passed reference
- if deleting by reference, and there is only 1 reference and the image is being used
by a running container, throw an error if !force, or delete the reference and create
a dangling reference otherwise
- if deleting by imageID, and force is true, remove all tags (otherwise soft conflict)
- if imageID, check if stopped container is using the image (soft conflict), and
delete anyway if force
- if imageID was passed in, check if running container is using the image (hard conflict)
- if `prune` is true, and the image being deleted has dangling parents, remove them
This commit also implements logic to get image parents in c8d by comparing shared layers.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
(cherry picked from commit cad97135b3)
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Our templates no longer contain version-specific rules, so this function
is no longer used. This patch deprecates it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit e3e715666f)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 7008a51449 removed version-conditional
rules from the template, so we no longer need the apparmor_parser Version.
This patch removes the call to `aaparser.GetVersion()`
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit ecaab085db)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 2e19a4d56b removed all other version-
conditional statements from the AppArmor template, but left this one in place.
These conditions were added in 8cf89245f5
to account for old versions of debian/ubuntu (apparmor_parser < 2.9)
that lacked some options;
> This allows us to use the apparmor profile we have in contrib/apparmor/
> and solves the problems where certain functions are not apparent on older
> versions of apparmor_parser on debian/ubuntu.
Those patches were from 2015/2016, and all currently supported distro
versions should now have more current versions than that. Looking at the
oldest supported versions;
Ubuntu 18.04 "Bionic":
apparmor_parser --version
AppArmor parser version 2.12
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2012 Canonical Ltd.
Debian 10 "Buster"
apparmor_parser --version
AppArmor parser version 2.13.2
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2018 Canonical Ltd.
This patch removes the remaining conditionals.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit f445ee1e6c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add `execDuration` field to the event attributes map. This is useful for tracking how long the container ran.
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
(cherry picked from commit 2ad37e1832)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 1261fe69a3 deprecated the VirtualSize
field, but forgot to mention that it's also included in the /system/df
endpoint.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit fdc7a78652)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- forward-port changes from 0ffaa6c785 to api/swagger.yaml (v1.44-dev)
- backports the changes to v1.43;
- Update container OOMKilled flag immediately 57d2d6ef62
- Add no-new-privileges to SecurityOptions returned by /info eb7738221c
- API: deprecate VirtualSize field for /images/json and /images/{id}/json 1261fe69a3
- api/types/container: create type for changes endpoint dbb48e4b29
- builder-next/prune: Handle "until" filter timestamps 54a125f677
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Treat copying extended attributes from a source filesystem which does
not support extended attributes as a no-op, same as if the file did not
possess the extended attribute. Only fail copying extended attributes if
the source file has the attribute and the destination filesystem does
not support xattrs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This special case was added in 3043c26419 as
a sentinel error (`AuthRequiredError`) to check whether authentication
is required (and to prompt the users to authenticate). A later refactor
(946bbee39a) removed the `AuthRequiredError`,
but kept the error-message and logic.
Starting with fcee6056dc, it looks like we
no longer depend on this specific error, so we can return the registry's
error message instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.20.4 (released 2023-05-02) includes three security fixes to the html/template
package, as well as bug fixes to the compiler, the runtime, and the crypto/subtle,
crypto/tls, net/http, and syscall packages. See the Go 1.20.4 milestone on our
issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.4+label%3ACherryPickApproved
release notes: https://go.dev/doc/devel/release#go1.20.4
full diff: https://github.com/golang/go/compare/go1.20.3...go1.20.4
from the announcement:
> These minor releases include 3 security fixes following the security policy:
>
> - html/template: improper sanitization of CSS values
>
> Angle brackets (`<>`) were not considered dangerous characters when inserted
> into CSS contexts. Templates containing multiple actions separated by a '/'
> character could result in unexpectedly closing the CSS context and allowing
> for injection of unexpected HMTL, if executed with untrusted input.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-24539 and Go issue https://go.dev/issue/59720.
>
> - html/template: improper handling of JavaScript whitespace
>
> Not all valid JavaScript whitespace characters were considered to be
> whitespace. Templates containing whitespace characters outside of the character
> set "\t\n\f\r\u0020\u2028\u2029" in JavaScript contexts that also contain
> actions may not be properly sanitized during execution.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-24540 and Go issue https://go.dev/issue/59721.
>
> - html/template: improper handling of empty HTML attributes
>
> Templates containing actions in unquoted HTML attributes (e.g. "attr={{.}}")
> executed with empty input could result in output that would have unexpected
> results when parsed due to HTML normalization rules. This may allow injection
> of arbitrary attributes into tags.
>
> Thanks to Juho Nurminen of Mattermost for reporting this issue.
>
> This is CVE-2023-29400 and Go issue https://go.dev/issue/59722.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
term: remove interrupt handler on termios
On termios platforms, interrupt signals are not generated in raw mode
terminals as the ISIG setting is not enabled. Remove interrupt handler
as it does nothing for raw mode and prevents other uses of INT signal
with this library.
This code seems to go back all the way to moby/moby#214 where signal
handling was improved for monolithic docker repository. Raw mode -ISIG
got reintroduced in moby/moby@3f63b87807, but the INT handler was left
behind.
full diff: abb19827d3...1aeaba8785
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/opencontainers/runtime-spec/releases/tag/v1.1.0-rc.2
Additions
- config-linux: add support for rsvd hugetlb cgroup
- features: add features.md to formalize the runc features JSON
- config-linux: add support for time namespace
Minor fixes and documentation
- config-linux: clarify where device nodes can be created
- runtime: remove When serialized in JSON, the format MUST adhere to the following pattern
- Update CI to Go 1.20
- config: clarify Linux mount options
- config-linux: fix url error
- schema: fix schema for timeOffsets
- schema: remove duplicate keys
full diff: https://github.com/opencontainers/runtime-spec/compare/v1.1.0-rc.1...v1.1.0-rc.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon.CreateImageFromContainer() already constructs a new config by taking
the image config, applying custom options (`docker commit --change ..`) (if
any), and merging those with the containers' configuration, so there is
no need to merge options again.
e22758bfb2/daemon/commit.go (L152-L158)
This patch removes the merge logic from generateCommitImageConfig, and
removes the unused arguments and error-return.
Co-authored-by: Djordje Lukic <djordje.lukic@docker.com>
Co-authored-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The OCI image-spec now also provides ArgsEscaped for backward compatibility
with the option used by Docker.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Split these options to a separate struct, so that we can handle them in isolation.
- Change some tests to use subtests, and improve coverage
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- sandbox, endpoint changed in c71555f030, but
missed updating the stubs.
- add missing stub for Controller.cleanupServiceDiscovery()
- While at it also doing some minor (formatting) changes.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function included a defer to close the net.Conn if an error occurred,
but the calling function (SetExternalKey()) also had a defer to close it
unconditionally.
Rewrite it to use json.NewEncoder(), which accepts a writer, and inline
the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's a no-op on Windows and other non-Linux, non-FreeBSD platforms,
so there's no need to register the re-exec.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Just print the error and os.Exit() instead, which makes it more
explicit that we're exiting, and there's no need to decorate the
error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Split the function into a "backing" function that returns an error, and the
re-exec entrypoint, which handles the error to provide a more idiomatic approach.
This was part of a larger change accross multiple re-exec functions (now removed).
For history's sake; here's the description for that;
The `reexec.Register()` function accepts reexec entrypoints, which are a `func()`
without return (matching a binary's `main()` function). As these functions cannot
return an error, it's the entrypoint's responsibility to handle any error, and to
indicate failures through `os.Exit()`.
I noticed that some of these entrypoint functions had `defer()` statements, but
called `os.Exit()` either explicitly or implicitly (e.g. through `logrus.Fatal()`).
defer statements are not executed if `os.Exit()` is called, which rendered these
statements useless.
While I doubt these were problematic (I expect files to be closed when the process
exists, and `runtime.LockOSThread()` to not have side-effects after exit), it also
didn't seem to "hurt" to call these as was expected by the function.
This patch rewrites some of the entrypoints to split them into a "backing function"
that can return an error (being slightly more iodiomatic Go) and an wrapper function
to act as entrypoint (which can handle the error and exit the executable).
To some extend, I'm wondering if we should change the signatures of the entrypoints
to return an error so that `reexec.Init()` can handle (or return) the errors, so
that logging can be handled more consistently (currently, some some use logrus,
some just print); this would also keep logging out of some packages, as well as
allows us to provide more metadata about the error (which reexec produced the
error for example).
A quick search showed that there's some external consumers of pkg/reexec, so I
kept this for a future discussion / exercise.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Create dangling images for imported images which don't have a name
annotation attached. Previously the content got loaded, but no image
referencing it was created which caused it to be garbage collected
immediately.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.7
full diff: https://github.com/opencontainers/runc/compare/v1.1.6...v1.1.7
This is the seventh patch release in the 1.1.z release of runc, and is
the last planned release of the 1.1.z series. It contains a fix for
cgroup device rules with systemd when handling device rules for devices
that don't exist (though for devices whose drivers don't correctly
register themselves in the kernel -- such as the NVIDIA devices -- the
full fix only works with systemd v240+).
- When used with systemd v240+, systemd cgroup drivers no longer skip
DeviceAllow rules if the device does not exist (a regression introduced
in runc 1.1.3). This fix also reverts the workaround added in runc 1.1.5,
removing an extra warning emitted by runc run/start.
- The source code now has a new file, runc.keyring, which contains the keys
used to sign runc releases.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.7
full diff: https://github.com/opencontainers/runc/compare/v1.1.6...v1.1.7
This is the seventh patch release in the 1.1.z release of runc, and is
the last planned release of the 1.1.z series. It contains a fix for
cgroup device rules with systemd when handling device rules for devices
that don't exist (though for devices whose drivers don't correctly
register themselves in the kernel -- such as the NVIDIA devices -- the
full fix only works with systemd v240+).
- When used with systemd v240+, systemd cgroup drivers no longer skip
DeviceAllow rules if the device does not exist (a regression introduced
in runc 1.1.3). This fix also reverts the workaround added in runc 1.1.5,
removing an extra warning emitted by runc run/start.
- The source code now has a new file, runc.keyring, which contains the keys
used to sign runc releases.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Verify the content to be equal, not "contains"; this output should be
predictable.
- Also verify the content returned by the function to match.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Looks like the intent is to exclude windows (which wouldn't have /etc/resolv.conf
nor systemd), but most tests would run fine elsewhere. This allows running the
tests on macOS for local testing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use t.TempDir() for convenience, and change some t.Fatal's to Errors,
so that all tests can run instead of failing early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The test was assuming that the "source" file was always "/etc/resolv.conf",
but the `Get()` function uses `Path()` to find the location of resolv.conf,
which may be different.
While at it, also changed some `t.Fatalf()` to `t.Errorf()`, and renamed
some variables for clarity.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After my last change, I noticed that the hash is used as a []byte in most
cases (other than tests). This patch updates the type to use a []byte, which
(although unlikely very important) also improves performance:
Compared to the previous version:
benchstat new.txt new2.txt
name old time/op new time/op delta
HashData-10 128ns ± 1% 116ns ± 1% -9.77% (p=0.000 n=20+20)
name old alloc/op new alloc/op delta
HashData-10 208B ± 0% 88B ± 0% -57.69% (p=0.000 n=20+20)
name old allocs/op new allocs/op delta
HashData-10 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.000 n=20+20)
And compared to the original version:
benchstat old.txt new2.txt
name old time/op new time/op delta
HashData-10 201ns ± 1% 116ns ± 1% -42.39% (p=0.000 n=18+20)
name old alloc/op new alloc/op delta
HashData-10 416B ± 0% 88B ± 0% -78.85% (p=0.000 n=20+20)
name old allocs/op new allocs/op delta
HashData-10 6.00 ± 0% 2.00 ± 0% -66.67% (p=0.000 n=20+20)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The code seemed overly complicated, requiring a reader to be constructed,
where in all cases, the data was already available in a variable. This patch
simplifies the utility to not require a reader, which also makes it a bit
more performant:
go install golang.org/x/perf/cmd/benchstat@latest
GO111MODULE=off go test -run='^$' -bench=. -count=20 > old.txt
GO111MODULE=off go test -run='^$' -bench=. -count=20 > new.txt
benchstat old.txt new.txt
name old time/op new time/op delta
HashData-10 201ns ± 1% 128ns ± 1% -36.16% (p=0.000 n=18+20)
name old alloc/op new alloc/op delta
HashData-10 416B ± 0% 208B ± 0% -50.00% (p=0.000 n=20+20)
name old allocs/op new allocs/op delta
HashData-10 6.00 ± 0% 3.00 ± 0% -50.00% (p=0.000 n=20+20)
A small change was made in `Build()`, which previously returned the resolv.conf
data, even if the function failed to write it. In the new variation, `nil` is
consistently returned on failures.
Note that in various places, the hash is not even used, so we may be able to
simplify things more after this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As of Go 1.8, "net/http".Server provides facilities to close all
listeners, making the same facilities in server.Server redundant.
http.Server also improves upon server.Server by additionally providing a
facility to also wait for outstanding requests to complete after closing
all listeners. Leverage those facilities to give in-flight requests up
to five seconds to finish up after all containers have been shut down.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- Use logrus.Fields instead of multiple WithField
- Split one giant debug log into one log per image
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Logging through a dependency-injected interface value was a vestige of
when Trap was in pkg/signal to avoid importing logrus in a reusable
package: cc4da81128.
Now that Trap lives under cmd/dockerd, nobody will be importing this so
we no longer need to worry about minimizing the package's dependencies.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Always calling os.Exit() on clean shutdown may not always be desirable
as deferred functions are not run. Let the cleanup callback decide
whether or not to call os.Exit() itself. Allow the process to exit the
normal way, by returning from func main().
Simplify the trap.Trap implementation. The signal notifications are
buffered in a channel so there is little need to spawn a new goroutine
for each received signal. With all signals being handled in the same
goroutine, there are no longer any concurrency concerns around the
interrupt counter.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The image store sends events when a new image is created/tagged, using
it instead of the reference store makes sure we send the "tag" event
when a new image is built using buildx.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Don't panic when processing containers created under fork containerd
integration (this field was added in the upstream and didn't exist in
fork).
Co-authored-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The fix to ignore SIGPIPE signals was originally added in the Go 1.4
era. signal.Ignore was first added in Go 1.5.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Since cc19eba (backported to v23.0.4), the PreferredPool for docker0 is
set only when the user provides the bip config parameter or when the
default bridge already exist. That means, if a user provides the
fixed-cidr parameter on a fresh install or reboot their computer/server
without bip set, dockerd throw the following error when it starts:
> failed to start daemon: Error initializing network controller: Error
> creating default "bridge" network: failed to parse pool request for
> address space "LocalDefault" pool "" subpool "100.64.0.0/26": Invalid
> Address SubPool
See #45356.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The slice which stores chain ids used for computing shared size was
mistakenly initialized with the length set instead of the capacity.
This caused a panic when iterating over it later and dereferncing nil
pointer from empty items.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The man page for sched_setaffinity(2) states the following about the pid
argument [1]:
> If pid is zero, then the mask of the calling thread is returned.
Thus the additional call to unix.Getpid can be omitted and pid = 0
passed to unix.SchedGetaffinity.
[1] https://man7.org/linux/man-pages/man2/sched_setaffinity.2.html#DESCRIPTION
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Use Linux BPF extensions to locate the offset of the VXLAN header within
the packet so that the same BPF program works with VXLAN packets
received over either IPv4 or IPv6.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Fixes `docker system prune --filter until=<timestamp>`.
`docker system prune` claims to support "until" filter for timestamps,
but it doesn't work because builder "until" filter only supports
duration.
Use the same filter parsing logic and then convert the timestamp to a
relative "keep-duration" supported by buildkit.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Unconditionally checking for RT_GROUP_SCHED is harmful. It is one of
the options that you want inactive unless you know that you want it
active.
Systemd recommends to disable it [1], a rationale for doing so is
provided in
https://bugzilla.redhat.com/show_bug.cgi?id=1229700#c0.
The essence is that you can not simply enable RT_GROUP_SCHED, you also
have to assign budgets manually. If you do not assign budgets, then
your realtime scheduling will be affected.
If check-config.sh keeps recommending to enable this, without further
advice, then users will follow the recommendation and likely run into
issues.
Again, this is one of the options that you want inactive, unless you
know that you want to use it.
Related Gentoo bugs:
- https://bugs.gentoo.org/904264
- https://bugs.gentoo.org/606548
1: 39857544ee/README (L144-L150)
Signed-off-by: Florian Schmaus <flo@geekplace.eu>
This test does not apply when running with snapshotters enabled;
go test -v -run TestGetInspectData .
=== RUN TestGetInspectData
inspect_test.go:27: RWLayer of container inspect-me is unexpectedly nil
--- FAIL: TestGetInspectData (0.00s)
FAIL
FAIL github.com/docker/docker/daemon 0.049s
FAIL
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In versions of Docker before v1.10, this field was calculated from
the image itself and all of its parent images. Images are now stored
self-contained, and no longer use a parent-chain, making this field
an equivalent of the Size field.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In versions of Docker before v1.10, this field was calculated from
the image itself and all of its parent images. Images are now stored
self-contained, and no longer use a parent-chain, making this field
an equivalent of the Size field.
For the containerd integration, the Size should be the sum of the
image's compressed / packaged and unpacked (snapshots) layers.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit removes iptables rules configured for secure overlay
networks when a network is deleted. Prior to this commit, only
CreateNetwork() was taking care of removing stale iptables rules.
If one of the iptables rule can't be removed, the erorr is logged but
it doesn't prevent network deletion.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
There's still some locations refering to AuFS;
- pkg/archive: I suspect most of that code is because the whiteout-files
are modelled after aufs (but possibly some code is only relevant to
images created with AuFS as storage driver; to be looked into).
- contrib/apparmor/template: likely some rules can be removed
- contrib/dockerize-disk.sh: very old contribution, and unlikely used
by anyone, but perhaps could be updated if we want to (or just removed).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update the version manually (we don't have automation for this yet), and
add a comment to vendor.mod to help users remind to update it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In the case of an error when calling snapshotter.Prepare we would return
nil. This change fixes that and returns the error from Prepare all the
time.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
It's not originally supported by image list, but we need it for `prune`
needs it, so `list` gets it for free.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.6
full diff: https://github.com/opencontainers/runc/compare/v1.1.5...v1.1.6
This is the sixth patch release in the 1.1.z series of runc, which fixes
a series of cgroup-related issues.
Note that this release can no longer be built from sources using Go
1.16. Using a latest maintained Go 1.20.x or Go 1.19.x release is
recommended. Go 1.17 can still be used.
- systemd cgroup v1 and v2 drivers were deliberately ignoring UnitExist error
from systemd while trying to create a systemd unit, which in some scenarios
may result in a container not being added to the proper systemd unit and
cgroup.
- systemd cgroup v2 driver was incorrectly translating cpuset range from spec's
resources.cpu.cpus to systemd unit property (AllowedCPUs) in case of more
than 8 CPUs, resulting in the wrong AllowedCPUs setting.
- systemd cgroup v1 driver was prefixing container's cgroup path with the path
of PID 1 cgroup, resulting in inability to place PID 1 in a non-root cgroup.
- runc run/start may return "permission denied" error when starting a rootless
container when the file to be executed does not have executable bit set for
the user, not taking the CAP_DAC_OVERRIDE capability into account. This is
a regression in runc 1.1.4, as well as in Go 1.20 and 1.20.1
- cgroup v1 drivers are now aware of misc controller.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.6
full diff: https://github.com/opencontainers/runc/compare/v1.1.5...v1.1.6
This is the sixth patch release in the 1.1.z series of runc, which fixes
a series of cgroup-related issues.
Note that this release can no longer be built from sources using Go
1.16. Using a latest maintained Go 1.20.x or Go 1.19.x release is
recommended. Go 1.17 can still be used.
- systemd cgroup v1 and v2 drivers were deliberately ignoring UnitExist error
from systemd while trying to create a systemd unit, which in some scenarios
may result in a container not being added to the proper systemd unit and
cgroup.
- systemd cgroup v2 driver was incorrectly translating cpuset range from spec's
resources.cpu.cpus to systemd unit property (AllowedCPUs) in case of more
than 8 CPUs, resulting in the wrong AllowedCPUs setting.
- systemd cgroup v1 driver was prefixing container's cgroup path with the path
of PID 1 cgroup, resulting in inability to place PID 1 in a non-root cgroup.
- runc run/start may return "permission denied" error when starting a rootless
container when the file to be executed does not have executable bit set for
the user, not taking the CAP_DAC_OVERRIDE capability into account. This is
a regression in runc 1.1.4, as well as in Go 1.20 and 1.20.1
- cgroup v1 drivers are now aware of misc controller.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Change return value in function signature and return fatal errors so
they can actually be reported to the caller instead of just being logged
to daemon log.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The `oom-score-adjust` option was added in a894aec8d8,
to prevent the daemon from being OOM-killed before other processes. This
option was mostly added as a "convenience", as running the daemon as a
systemd unit was not yet common.
Having the daemon set its own limits is not best-practice, and something
better handled by the process-manager starting the daemon.
Commit cf7a5be0f2 fixed this option to allow
disabling it, and 2b8e68ef06 removed the default
score adjust.
This patch deprecates the option altogether, recommending users to set these
limits through the process manager used, such as the "OOMScoreAdjust" option
in systemd units.
With this patch:
dockerd --oom-score-adjust=-500 --validate
Flag --oom-score-adjust has been deprecated, and will be removed in the next release.
configuration OK
echo '{"oom-score-adjust":-500}' > /etc/docker/daemon.json
dockerd
INFO[2023-04-12T21:34:51.133389627Z] Starting up
INFO[2023-04-12T21:34:51.135607544Z] containerd not running, starting managed containerd
WARN[2023-04-12T21:34:51.135629086Z] DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" option will be removed in the next release.
docker info
Client:
Context: default
Debug Mode: false
...
DEPRECATED: The "oom-score-adjust" config parameter and the dockerd "--oom-score-adjust" option will be removed in the next release
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The call to an unsecure registry doesn't return an error saying that the
"server gave an HTTP response to an HTTPS client" but a
tls.RecordHeaderError saying that the "first record does not look like a
TLS handshake", this changeset looks for the right error for that case.
This fixes the http fallback when using an insecure registry
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
The (*network).ipamRelease function nils out the network's IPAM info
fields, putting the network struct into an inconsistent state. The
network-restore startup code panics if it tries to restore a network
from a struct which has fewer IPAM config entries than IPAM info
entries. Therefore (*network).delete contains a critical section: by
persisting the network to the store after ipamRelease(), the datastore
will contain an inconsistent network until the deletion operation
completes and finishes deleting the network from the datastore. If for
any reason the deletion operation is interrupted between ipamRelease()
and deleteFromStore(), the daemon will crash on startup when it tries to
restore the network.
Updating the datastore after releasing the network's IPAM pools may have
served a purpose in the past, when a global datastore was used for
intra-cluster communication and the IPAM allocator had persistent global
state, but nowadays there is no global datastore and the IPAM allocator
has no persistent state whatsoever. Remove the vestigial datastore
update as it is no longer necessary and only serves to cause problems.
If the network deletion is interrupted before the network is deleted
from the datastore, the deletion will resume during the next daemon
startup, including releasing the IPAM pools.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The GetRepository method interacts directly with the registry, and does
not depend on the snapshotter, but is used for two purposes;
For the GET /distribution/{name:.*}/json route;
dd3b71d17c/api/server/router/distribution/backend.go (L11-L15)
And to satisfy the "executor.ImageBackend" interface as used by Swarm;
58c027ac8b/daemon/cluster/executor/backend.go (L77)
This patch removes the method from the ImageService interface, and instead
implements it through an composite struct that satisfies both interfaces,
and an ImageBackend() method is added to the daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
remove GetRepository from ImageService
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Prevent from descriptor leak
- Fixes optlen in getsockopt() for s390x
full diff: 9a39160e90...7ff4192f6f
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This enforces the github.com/containerd/containerd/errdefs package to
be aliased as "cerrdefs". Any other alias (or no alias used) results
in a linting failure:
integration/container/pause_test.go:9:2: import "github.com/containerd/containerd/errdefs" imported as "c8derrdefs" but must be "cerrdefs" according to config (importas)
c8derrdefs "github.com/containerd/containerd/errdefs"
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The signatures of functions in containerd's errdefs packages are very
similar to those in our own, and it's easy to accidentally use the wrong
package.
This patch uses a consistent alias for all occurrences of this import.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This const looks to only be there for "convenience", or _possibly_ was created
with future normalization or special handling in mind.
In either case, currently it is just a direct copy (alias) for runtime.GOOS,
and defining our own type for this gives the impression that it's more than
that. It's only used in a single place, and there's no external consumers, so
let's deprecate this const, and use runtime.GOOS instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These consts are only used internally, and never returned to the user.
Un-export to make it clear these are not for external consumption.
While looking at the code, I also noticed that we may be using the wrong
Windows API to collect this information (and found an implementation elsewhere
that does use the correct API). I did not yet update the code, in cases there
are specific reasons.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was not immediately clear why we were not using runtime.GOARCH for
these (with a conversion to other formats, such as x86_64). These docs
are based on comments that were posted when implementing this package;
- https://github.com/moby/moby/pull/13921#issuecomment-130106474
- https://github.com/moby/moby/pull/13921#issuecomment-140270124
Some links were now redirecting to a new location, so updated them to
not depend on the redirect.
While at it, also updated a call to logrus to use structured formatting
(WithError()).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This change makes is possible to run `docker exec -u <UID> ...` when the
containerd integration is activated.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
- update various dependencies
- Windows: Support local drivers internal, l2bridge and nat
full diff: e28e8ba9bc...75e92ce14f
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When running hack/vendor.sh, I noticed this file was added to vendor.
I suspect this should've been part of 0233029d5a,
but the vendor check doesn't appear to be catching this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Linux kernel prior to v3.16 was not supporting netns for vxlan
interfaces. As such, moby/libnetwork#821 introduced a "host mode" to the
overlay driver. The related kernel fix is available for rhel7 users
since v7.2.
This mode could be forced through the use of the env var
_OVERLAY_HOST_MODE. However this env var has never been documented and
is not referenced in any blog post, so there's little chance many people
rely on it. Moreover, this host mode is deemed as an implementation
details by maintainers. As such, we can consider it dead and we can
remove it without a prior deprecation warning.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since 0fa873c, there's no function writing overlay networks to some
datastore. As such, overlay network struct doesn't need to implement
KVObject interface.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since a few commits, subnet's vni don't change during the lifetime of
the subnet struct, so there's no need to lock the network before
accessing it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since the previous commit, data from the local store are never read,
thus proving it was only used for Classic Swarm.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver in Swarm v2 mode doesn't support live-restore, ie.
the daemon won't even start if the node is part of a Swarm cluster and
live-restore is enabled. This feature was only used by Swarm Classic.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
VNI allocations made by the overlay driver were only used by Classic
Swarm. With Swarm v2 mode, the driver ovmanager is responsible of
allocating & releasing them.
Previously, vxlanIdm was initialized when a global store was available
but since 142b522, no global store can be instantiated. As such,
releaseVxlanID actually does actually nothing and iptables rules are
never removed.
The last line of dead code detected by golangci-lint is now gone.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Prior to 0fa873c, the serf-based event loop was started when a global
store was available. Since there's no more global store, this event loop
and all its associated code is dead.
Most dead code detected by golangci-lint in prior commits is now gone.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver was creating a global store whenever
netlabel.GlobalKVClient was specified in its config argument. This
specific label is unused anymore since 142b522 (moby/moby#44875).
It was also creating a local store whenever netlabel.LocalKVClient was
specificed in its config argument. This store is unused since
moby/libnetwork@9e72136 (moby/libnetwork#1636).
Finally, the sync.Once properties are never used and thus can be
deleted.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver was creating a global store whenever
netlabel.GlobalKVClient was specified in its config argument. This
specific label is not used anymore since 142b522 (moby/moby#44875).
golangci-lint now detects dead code. This will be fixed in subsequent
commits.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This command was useful when overlay networks based on external KV store
was developed but is unused nowadays.
As the last reference to OverlayBindInterface and OverlayNeighborIP
netlabels are in the ovrouter cmd, they're removed too.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This only makes the containerd ImageService implementation respect
context cancellation though.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Implement Children method for containerd image store which makes the
`ancestor` filter work for `docker ps`. Checking if image is a children
of other image is implemented by comparing their rootfs diffids because
containerd image store doesn't have a concept of image parentship like
the graphdriver store. The child is expected to have more layers than
the parent and should start with all parent layers.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Before:
```console
$ docker-rootless-setuptool.sh install
...
[INFO] Use CLI context "rootless"
Current context is now "rootless"
[INFO] Make sure the following environment variables are set (or add them to ~/.bashrc):
export PATH=/usr/local/bin:$PATH
Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///run/user/1001/docker.sock
```
After:
```console
$ docker-rootless-setuptool.sh install
...
[INFO] Using CLI context "rootless"
Current context is now "rootless"
[INFO] Make sure the following environment variable(s) are set (or add them to ~/.bashrc):
export PATH=/usr/local/bin:$PATH
[INFO] Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///run/user/1001/docker.sock
```
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Drop support for platforms which only have xt_u32 but not xt_bpf. No
attempt is made to clean up old xt_u32 iptables rules left over from a
previous daemon instance.
Signed-off-by: Cory Snider <csnider@mirantis.com>
go1.20.3 (released 2023-04-04) includes security fixes to the go/parser,
html/template, mime/multipart, net/http, and net/textproto packages, as well
as bug fixes to the compiler, the linker, the runtime, and the time package.
See the Go 1.20.3 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.3+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.2...go1.20.3
Further details from the announcement on the mailing list:
We have just released Go versions 1.20.3 and 1.19.8, minor point releases.
These minor releases include 4 security fixes following the security policy:
- go/parser: infinite loop in parsing
Calling any of the Parse functions on Go source code which contains `//line`
directives with very large line numbers can cause an infinite loop due to
integer overflow.
Thanks to Philippe Antoine (Catena cyber) for reporting this issue.
This is CVE-2023-24537 and Go issue https://go.dev/issue/59180.
- html/template: backticks not treated as string delimiters
Templates did not properly consider backticks (`) as Javascript string
delimiters, and as such did not escape them as expected. Backticks are
used, since ES6, for JS template literals. If a template contained a Go
template action within a Javascript template literal, the contents of the
action could be used to terminate the literal, injecting arbitrary Javascript
code into the Go template.
As ES6 template literals are rather complex, and themselves can do string
interpolation, we've decided to simply disallow Go template actions from being
used inside of them (e.g. "var a = {{.}}"), since there is no obviously safe
way to allow this behavior. This takes the same approach as
github.com/google/safehtml. Template.Parse will now return an Error when it
encounters templates like this, with a currently unexported ErrorCode with a
value of 12. This ErrorCode will be exported in the next major release.
Users who rely on this behavior can re-enable it using the GODEBUG flag
jstmpllitinterp=1, with the caveat that backticks will now be escaped. This
should be used with caution.
Thanks to Sohom Datta, Manipal Institute of Technology, for reporting this issue.
This is CVE-2023-24538 and Go issue https://go.dev/issue/59234.
- net/http, net/textproto: denial of service from excessive memory allocation
HTTP and MIME header parsing could allocate large amounts of memory, even when
parsing small inputs.
Certain unusual patterns of input data could cause the common function used to
parse HTTP and MIME headers to allocate substantially more memory than
required to hold the parsed headers. An attacker can exploit this behavior to
cause an HTTP server to allocate large amounts of memory from a small request,
potentially leading to memory exhaustion and a denial of service.
Header parsing now correctly allocates only the memory required to hold parsed
headers.
Thanks to Jakob Ackermann (@das7pad) for discovering this issue.
This is CVE-2023-24534 and Go issue https://go.dev/issue/58975.
- net/http, net/textproto, mime/multipart: denial of service from excessive resource consumption
Multipart form parsing can consume large amounts of CPU and memory when
processing form inputs containing very large numbers of parts. This stems from
several causes:
mime/multipart.Reader.ReadForm limits the total memory a parsed multipart form
can consume. ReadForm could undercount the amount of memory consumed, leading
it to accept larger inputs than intended. Limiting total memory does not
account for increased pressure on the garbage collector from large numbers of
small allocations in forms with many parts. ReadForm could allocate a large
number of short-lived buffers, further increasing pressure on the garbage
collector. The combination of these factors can permit an attacker to cause an
program that parses multipart forms to consume large amounts of CPU and
memory, potentially resulting in a denial of service. This affects programs
that use mime/multipart.Reader.ReadForm, as well as form parsing in the
net/http package with the Request methods FormFile, FormValue,
ParseMultipartForm, and PostFormValue.
ReadForm now does a better job of estimating the memory consumption of parsed
forms, and performs many fewer short-lived allocations.
In addition, mime/multipart.Reader now imposes the following limits on the
size of parsed forms:
Forms parsed with ReadForm may contain no more than 1000 parts. This limit may
be adjusted with the environment variable GODEBUG=multipartmaxparts=. Form
parts parsed with NextPart and NextRawPart may contain no more than 10,000
header fields. In addition, forms parsed with ReadForm may contain no more
than 10,000 header fields across all parts. This limit may be adjusted with
the environment variable GODEBUG=multipartmaxheaders=.
Thanks to Jakob Ackermann for discovering this issue.
This is CVE-2023-24536 and Go issue https://go.dev/issue/59153.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While we currently do not provide an option to specify the snapshotter to use
for individual containers (we may want to add this option in future), currently
it already is possible to configure the snapshotter in the daemon configuration,
which could (likely) cause issues when changing and restarting the daemon.
This patch updates some code-paths that have the container available to use
the snapshotter that's configured for the container (instead of the default
snapshotter configured).
There are still code-paths to be looked into, and a tracking ticket as well as
some TODO's were added for those.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, the AWSLogs driver attempted to implement
non-blocking itself. Non-blocking is supposed to
implemented solely by the Docker RingBuffer that
wraps the log driver.
Please see issue and explanation here:
https://github.com/moby/moby/issues/45217
Signed-off-by: Wesley Pettit <wppttt@amazon.com>
Prevent the daemon from erroring out if daemon.json contains default
network options for network drivers aside from bridge. Configuring
defaults for the bridge driver previously worked by coincidence because
the unrelated CLI flag '--bridge' exists.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Closing stdin of a container or exec (a.k.a.: task or process) has been
somewhat broken ever since support for ContainerD 1.0 was introduced
back in Docker v17.11: the error returned from the CloseIO() call was
effectively ignored due to it being assigned to a local variable which
shadowed the intended variable. Serendipitously, that oversight
prevented a data race. In my recent refactor of libcontainerd, I
corrected the variable shadowing issue and introduced the aforementioned
data race in the process.
Avoid deadlocking when closing stdin without swallowing errors or
introducing data races by calling CloseIO() synchronously if the process
handle is available, falling back to an asynchronous close-and-log
strategy otherwise. This solution is inelegant and complex, but looks to
be the best that could be done without changing the libcontainerd API.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- use apiClient for api-clients to reduce shadowing (also more "accurate")
- use "ctr" instead of "container"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Attestation manifests have an OCI image media type, which makes them
being listed like they were a separate platform supported by the image.
Don't use `images.Platforms` and walk the manifest list ourselves
looking for all manifests that are an actual image manifest.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
commit 3c69b9f2c5 replaced these functions
and types with github.com/moby/patternmatcher. That commit has shipped with
docker 23.0, and BuildKit v0.11 no longer uses the old functions, so we can
remove these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The 'Deprecated:' line in NewClient's doc comment was not in a new
paragraph, so GoDoc, linters, and IDEs were unaware that it was
deprecated. The package documentation also continued to reference
NewClient. Update the doc comments to finish documenting that NewClient
is deprecated.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Distribution source labels don't store port of the repository. If the
content was obtained from repository 172.17.0.2:5000 then its
corresponding label will have a key "containerd.io/distribution.source.172.17.0.2".
Fix the check in canBeMounted to ignore the :port part of the domain.
This also removes the check which prevented insecure repositories to use
cross-repo mount - the real cause was the mismatch in domain comparison.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Distribution source label can specify multiple repositories - in this
case value is a comma separated list of source repositories.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Previously the labels would be appended for content that was pushed
even if subsequent pushes of other content failed.
Change the behavior to only append the labels if the whole push
operation succeeded.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Handler is called in parallel and modifying a map without
synchronization is a race condition.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is a follow-up of 48ad9e1. This commit removed the function
ElectInterfaceAddresses from utils_linux.go but not their FreeBSD &
Windows counterpart. As these functions are never called, they can be
safely removed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The ExtDNS2 field was added in
aad1632c15
to migrate existing state from < 1.14 to a new type. As it's unlikely
that installations still have state from before 1.14, rename ExtDNS2
back to ExtDNS and drop the migration code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Co-authored-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
- make jobs.Add accept a list of jobs, so that we don't have to
repeatedly lock/unlock the mutex
- rename some variables that collided with imports or types
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This implements `docker push` under containerd image store. When
pushing manifest lists that reference a content which is not present in
the local content store, it will attempt to perform the cross-repo mount
the content if possible.
Considering this scenario:
```bash
$ docker pull docker.io/library/busybox
```
This will download manifest list and only host platform-specific
manifest and blobs.
Note, tagging to a different repository (but still the same registry) and pushing:
```bash
$ docker tag docker.io/library/busybox docker.io/private-repo/mybusybox
$ docker push docker.io/private-repo/mybusybox
```
will result in error, because the neither we nor the target repository
doesn't have the manifests that the busybox manifest list references
(because manifests can't be cross-repo mounted).
If for some reason the manifests and configs for all other platforms
would be present in the content store, but only layer blobs were
missing, then the push would work, because the blobs can be cross-repo
mounted (only if we push to the same registry).
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Push the reference parsing from repo and tag names into the api and pass
a reference object to the ImageService.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.5
diff: https://github.com/opencontainers/runc/compare/v1.1.4...v1.1.5
This is the fifth patch release in the 1.1.z series of runc, which fixes
three CVEs found in runc.
* CVE-2023-25809 is a vulnerability involving rootless containers where
(under specific configurations), the container would have write access
to the /sys/fs/cgroup/user.slice/... cgroup hierarchy. No other
hierarchies on the host were affected. This vulnerability was
discovered by Akihiro Suda.
<https://github.com/opencontainers/runc/security/advisories/GHSA-m8cg-xc2p-r3fc>
* CVE-2023-27561 was a regression which effectively re-introduced
CVE-2019-19921. This bug was present from v1.0.0-rc95 to v1.1.4. This
regression was discovered by @Beuc.
<https://github.com/advisories/GHSA-vpvm-3wq2-2wvm>
* CVE-2023-28642 is a variant of CVE-2023-27561 and was fixed by the same
patch. This variant of the above vulnerability was reported by Lei
Wang.
<https://github.com/opencontainers/runc/security/advisories/GHSA-g2j6-57v7-gm8c>
In addition, the following other fixes are included in this release:
* Fix the inability to use `/dev/null` when inside a container.
* Fix changing the ownership of host's `/dev/null` caused by fd redirection
(a regression in 1.1.1).
* Fix rare runc exec/enter unshare error on older kernels, including
CentOS < 7.7.
* nsexec: Check for errors in `write_log()`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
no changes in vendored code, just keeping scanners happy :)
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.5
diff: https://github.com/opencontainers/runc/compare/v1.1.4...v1.1.5
This is the fifth patch release in the 1.1.z series of runc, which fixes
three CVEs found in runc.
* CVE-2023-25809 is a vulnerability involving rootless containers where
(under specific configurations), the container would have write access
to the /sys/fs/cgroup/user.slice/... cgroup hierarchy. No other
hierarchies on the host were affected. This vulnerability was
discovered by Akihiro Suda.
<https://github.com/opencontainers/runc/security/advisories/GHSA-m8cg-xc2p-r3fc>
* CVE-2023-27561 was a regression which effectively re-introduced
CVE-2019-19921. This bug was present from v1.0.0-rc95 to v1.1.4. This
regression was discovered by @Beuc.
<https://github.com/advisories/GHSA-vpvm-3wq2-2wvm>
* CVE-2023-28642 is a variant of CVE-2023-27561 and was fixed by the same
patch. This variant of the above vulnerability was reported by Lei
Wang.
<https://github.com/opencontainers/runc/security/advisories/GHSA-g2j6-57v7-gm8c>
In addition, the following other fixes are included in this release:
* Fix the inability to use `/dev/null` when inside a container.
* Fix changing the ownership of host's `/dev/null` caused by fd redirection
(a regression in 1.1.1).
* Fix rare runc exec/enter unshare error on older kernels, including
CentOS < 7.7.
* nsexec: Check for errors in `write_log()`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously commit incorrectly used image config digest as an image id
for the new image which isn't consistent with the image target.
This changes it to use manifest digest.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Don't try to parse dangling images name (they have a non-canonical
format - `moby-dangling@sha256:...`) as a reference.
Log a warning if the image is not dangling and its name is not a valid
named reference.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Allow SetMatrix to be used as a value type with a ready-to-use zero
value. SetMatrix values are already non-copyable by virtue of having a
mutex field so there is no harm in allowing non-pointer values to be
used as local variables or struct fields. Any attempts to pass around
by-value copies, e.g. as function arguments, will be flagged by go vet.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The `docker-init` binary is not intended to be a user-facing command, and as such it is more appropriate for it to be found in `/usr/libexec` (or similar) than in `PATH` (see the FHS, especially https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s07.html and https://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html#USRLIBLIBRARIESFORPROGRAMMINGANDPA).
This adjusts the logic for using that configuration option to take this into account and appropriately search for `docker-init` (or the user's configured alternative) in these directories before falling back to the existing `PATH` lookup behavior.
This behavior _used_ to exist for the old `dockerinit` binary (of a similar name and used in a similar way but for an alternative purpose), but that behavior was removed in 4357ed4a73 when that older `dockerinit` was also removed.
Most of this reasoning _also_ applies to `docker-proxy` (and various `containerd-xxx` binaries such as the shims), but this change does not affect those. It would be relatively straightforward to adapt `LookupInitPath` to be a more generic function such as `libexecLookupPath` or similar if we wanted to explore that.
See 14482589df/cli-plugins/manager/manager_unix.go for the related path list in the CLI which loads CLI plugins from a similar set of paths (with a similar rationale - plugin binaries are not typically intended to be run directly by users but rather invoked _via_ the CLI binary).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
This adds a function to the client package which can be used to create a
buildkit client from our moby client.
Example:
```go
package main
import (
"context"
"github.com/moby/moby/client"
bkclient "github.com/moby/buildkit/client"
)
func main() {
c := client.NewWithOpts()
bc, _ := bkclient.New(context.Background(), ""
client.BuildkitClientOpts(c),
)
// ...
}
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
FirewallD creates the root INPUT chain with a default-accept policy and
a terminal rule which rejects all packets not accepted by any prior
rule. Any subsequent rules appended to the chain are therefore inert.
The administrator would have to open the VXLAN UDP port to make overlay
networks work at all, which would result in all VXLAN traffic being
accepted and defeating our attempts to enforce encryption on encrypted
overlay networks.
Insert the rule to drop unencrypted VXLAN packets tagged for encrypted
overlay networks at the top of the INPUT chain so that enforcement of
mandatory encryption takes precedence over any accept rules configured
by the administrator. Continue to append the accept rule to the bottom
of the chain so as not to override any administrator-configured drop
rules.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The User-Agent includes the kernel version, which involves making a syscall
(and parsing the results) on Linux, and reading (plus parsing) the registry
on Windows. These operations are relatively costly, and we should not perform
those on every request that uses the User-Agent.
This patch adds a sync.Once so that we only perform these actions once for
the lifetime of the daemon's process.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
GRPC is logging a *lot* of garbage at info level.
This configures the GRPC logger such that it is only giving us logs when
at debug level and also adds a log field indicating where the logs are
coming from.
containerd is still currently spewing these same log messages and needs
a separate update.
Without this change `docker build` is extremely noisy in the daemon
logs.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Commit 3991faf464 moved search into the registry
package, which also made the `dockerversion` package a dependency for registry,
which brings additional (indirect) dependencies, such as `pkg/parsers/kernel`,
and `golang.org/x/sys/windows/registry`.
Client code, such as used in docker/cli may depend on the `registry` package,
but should not depend on those additional dependencies.
This patch moves setting the userAgent to the API router, and instead of
passing it as a separate argument, includes it into the "headers".
As these headers now not only contain the `X-Meta-...` headers, the variables
were renamed accordingly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use `exec.Command` created by this function instead of obtaining it from
daemon struct. This prevents a race condition where `daemon.Kill` is
called before the goroutine has the chance to call `cmd.Wait`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
TestDaemonRestartKillContainers test was always executing the last case
(`container created should not be restarted`) because the iterated
variables were not copied correctly.
Capture iterated values by value correctly and rename c to tc.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Some newer distros such as RHEL 9 have stopped making the xt_u32 kernel
module available with the kernels they ship. They do ship the xt_bpf
kernel module, which can do everything xt_u32 can and more. Add an
alternative implementation of the iptables match rule which uses xt_bpf
to implement exactly the same logic as the u32 filter using a BPF
program. Try programming the BPF-powered rules as a fallback when
programming the u32-powered rules fails.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The iptables rule clause used to match on the VNI of VXLAN datagrams
looks like line noise to the uninitiated. It doesn't help that the
expression is repeated twice and neither copy has any commentary.
DRY out the rule builder to a common function, and document what the
rule does and how it works.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The iptables rules which make encryption mandatory on an encrypted
overlay network are only programmed once there is a second node
participating in the network. This leaves single-node encrypted overlay
networks vulnerable to packet injection. Furthermore, failure to program
the rules is not treated as a fatal error.
Program the iptables rules to make encryption mandatory before creating
the VXLAN link to guarantee that there is no window of time where
incoming cleartext VXLAN packets for the network would be accepted, or
outgoing cleartext packets be transmitted. Only create the VXLAN link if
programming the rules succeeds to ensure that it fails closed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The overlay-network encryption code is woefully under-documented, which
is especially problematic as it operates on under-documented kernel
interfaces. Document what I have puzzled out of the implementation for
the benefit of the next poor soul to touch this code.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Use the utility introduced in 1bd486666b to
share the same implementation as similar options. The IPCModeContainer const
is left for now, but we need to consider what to do with these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit e7d75c8db7 fixed validation of "host"
mode values, but also introduced a regression for validating "container:"
mode PID-modes.
PID-mode implemented a stricter validation than the other options and, unlike
the other options, did not accept an empty container name/ID. This feature was
originally implemented in fb43ef649b, added some
some integration tests (but no coverage for this case), and the related changes
in the API types did not have unit-tests.
While a later change (d4aec5f0a6) added a test
for the `--pid=container:` (empty name) case, that test was later migrated to
the CLI repository, as it covered parsing the flag (and validating the result).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 1bd486666b refactored this code, but
it looks like I removed some changes in this part of the code when extracting
these changes from a branch I was working on, and the behavior did not match
the function's description (name to be empty if there is no "container:" prefix
Unfortunately, there was no test coverage for this in this repository, so we
didn't catch this.
This patch:
- fixes containerID() to not return a name/ID if no container: prefix is present
- adds test-coverage for TestCgroupSpec
- adds test-coverage for NetworkMode.ConnectedContainer
- updates some test-tables to remove duplicates, defaults, and use similar cases
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make if more explicit which test-cases should be valid, and make it the
first field, because the "valid" field is shared among all test-cases in
the test-table, and making it the first field makes it slightly easier
to distinguish valid from invalid cases.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 6a516acb2e moved the MemInfo type and
ReadMemInfo() function into the pkg/sysinfo package. In an attempt to assist
consumers of these to migrate to the new location, an alias was added.
Unfortunately, the side effect of this alias is that pkg/system now depends
on pkg/sysinfo, which means that consumers of this (such as docker/cli) now
get all (indirect) dependencies of that package as dependency, which includes
many dependencies that should only be needed for the daemon / runtime;
- github.com/cilium/ebpf
- github.com/containerd/cgroups
- github.com/coreos/go-systemd/v22
- github.com/godbus/dbus/v5
- github.com/moby/sys/mountinfo
- github.com/opencontainers/runtime-spec
This patch moves the MemInfo related code to its own package. As the previous move
was not yet part of a release, we're not adding new aliases in pkg/sysinfo.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Funneling the peer operations into an unbuffered channel only serves to
achieve the same result as a mutex, using a lot more boilerplate and
indirection. Get rid of the boilerplate and unnecessary indirection by
using a mutex and calling the operations directly.
Signed-off-by: Cory Snider <csnider@mirantis.com>
idtools.MkdirAllAndChown on Windows does not chown directories, which makes
idtools.MkdirAllAndChown() just an alias for system.MkDirAll().
Also setting the filemode to `0`, as changing filemode is a no-op on Windows as
well; both of these changes should make it more transparent that no chown'ing,
nor changing filemode takes place.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Volumes created from the image config were not being pruned because the
volume service did not think they were anonymous since the code to
create passes along a generated name instead of letting the volume
service generate it.
This changes the code path to have the volume service generate the name
instead of doing it ahead of time.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Commit 3246db3755 added handling for removing
cluster volumes, but in some conditions, this resulted in errors not being
returned if the volume was in use;
docker swarm init
docker volume create foo
docker create -v foo:/foo busybox top
docker volume rm foo
This patch changes the logic for ignoring "local" volume errors if swarm
is enabled (and cluster volumes supported).
While working on this fix, I also discovered that Cluster.RemoveVolume()
did not handle the "force" option correctly; while swarm correctly handled
these, the cluster backend performs a lookup of the volume first (to obtain
its ID), which would fail if the volume didn't exist.
Before this patch:
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
=== RUN TestVolumesRemoveSwarmEnabled
=== PAUSE TestVolumesRemoveSwarmEnabled
=== CONT TestVolumesRemoveSwarmEnabled
=== RUN TestVolumesRemoveSwarmEnabled/volume_in_use
volume_test.go:122: assertion failed: error is nil, not errdefs.IsConflict
volume_test.go:123: assertion failed: expected an error, got nil
=== RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force
volume_test.go:143: assertion failed: error is not nil: Error response from daemon: volume no_such_volume not found
--- FAIL: TestVolumesRemoveSwarmEnabled (1.57s)
--- FAIL: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
--- FAIL: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)
FAIL
With this patch:
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
=== RUN TestVolumesRemoveSwarmEnabled
=== PAUSE TestVolumesRemoveSwarmEnabled
=== CONT TestVolumesRemoveSwarmEnabled
=== RUN TestVolumesRemoveSwarmEnabled/volume_in_use
=== RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force
--- PASS: TestVolumesRemoveSwarmEnabled (1.53s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)
PASS
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
SearchRegistryForImages does not make sense as part of the image
service interface. The implementation just wraps the search API of the
registry service to filter the results client-side. It has nothing to do
with local image storage, and the implementation of search does not need
to change when changing which backend (graph driver vs. containerd
snapshotter) is used for local image storage.
Filtering of the search results is an implementation detail: the
consumer of the results does not care which actor does the filtering so
long as the results are filtered as requested. Move filtering into the
exported API of the registry service to hide the implementation details.
Only one thing---the registry service implementation---would need to
change in order to support server-side filtering of search results if
Docker Hub or other registry servers were to add support for it to their
APIs.
Use a fake registry server in the search unit tests to avoid having to
mock out the registry API client.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Includes a security fix for crypto/elliptic (CVE-2023-24532).
> go1.20.2 (released 2023-03-07) includes a security fix to the crypto/elliptic package,
> as well as bug fixes to the compiler, the covdata command, the linker, the runtime, and
> the crypto/ecdh, crypto/rsa, crypto/x509, os, and syscall packages.
> See the Go 1.20.2 milestone on our issue tracker for details.
https://go.dev/doc/devel/release#go1.20.minor
From the announcement:
> We have just released Go versions 1.20.2 and 1.19.7, minor point releases.
>
> These minor releases include 1 security fixes following the security policy:
>
> - crypto/elliptic: incorrect P-256 ScalarMult and ScalarBaseMult results
>
> The ScalarMult and ScalarBaseMult methods of the P256 Curve may return an
> incorrect result if called with some specific unreduced scalars (a scalar larger
> than the order of the curve).
>
> This does not impact usages of crypto/ecdsa or crypto/ecdh.
>
> This is CVE-2023-24532 and Go issue https://go.dev/issue/58647.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
5008409b5c introduced the usage of
`strings.Cut` to help parse listener addresses.
Part of that also made it error out if no addr is specified after the
protocol spec (e.g. `tcp://`).
Before the change a proto spec without an address just used the default
address for that proto.
e.g. `tcp://` would be `tcp://127.0.0.1:2375`, `unix://` would be
`unix:///var/run/docker.sock`.
Critically, socket activation (`fd://`) never has an address.
This change brings back the old behavior but keeps the usage of
`strings.Cut`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Set `dangling-name-prefix` exporter attribute to `moby-dangling` which
makes it create an containerd image even when user didn't provide any
name for the new image.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
In preparation for containerd v1.7 which migrates off gogo/protobuf
and changes the protobuf Any type to one that's not supported by our
vendored version of typeurl.
This fixes a compile error on usages of `typeurl.UnmarshalAny` when
upgrading to containerd v1.7.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- fix docker service create doesn't work when network and generic-resource are both attached
- Fix removing tasks when a jobs service is removed
- CSI: Allow NodePublishVolume even when plugin does not support staging
full diff: 904c221ac2...80a528a868
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Only use the image exporter in build if we don't use containerd
Without this "docker build" fails with:
Error response from daemon: exporter "image" could not be found
- let buildx know we support containerd snapshotter
- Pass the current snapshotter to the buildkit worker
If buildkit uses a different snapshotter we can't list the images any
more because we can't find the snapshot.
builder/builder-next: make ContainerdWorker a minimal wrapper
Note that this makes "Worker" a public field, so technically one could
overwrite it.
builder-next: reenable runc executor
Currently, without special CNI config the builder would
only create host network containers that is a security issue.
Using runc directly instead of shim is faster as well
as builder doesn’t need anything from shim. The overhead
of setting up network sandbox is much slower of course.
builder/builder-next: simplify options handling
Trying to simplify the logic;
- Use an early return if multiple outputs are provided
- Only construct the list of tags if we're using an image (or moby) exporter
- Combine some logic for snapshotter and non-snapshotter handling
Create a constant for the moby exporter
Pass a context when creating a router
The context has a 10 seconds timeout which should be more than enough to
get the answer from containerd.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Co-authored-by: Tonis Tiigi <tonistiigi@gmail.com>
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Stopping container on Windows can sometimes take longer than 10s which
caused this test to be flaky.
Increase the timeout to 75s when running this test on Windows.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This addresses the previous issue with the containerd store where, after a container is created, we can't deterministically resolve which image variant was used to run it (since we also don't store what platform the image was fetched for).
This is required for things like `docker commit`, and computing the containers layer size later, since we need to resolve the specific image variant.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
When reading through some bug reports, I noticed that the error-message for
unsupported storage drivers is not very informative, as it does not include
the actual storage driver. Some of these errors are used as sentinel errors
internally, so improving the error returned by graphdriver.New() may need
some additional work, but this patch makes a start by including the name
of the graphdriver (if set) in the error-message.
Before this patch:
dockerd --storage-driver=foobar
...
failed to start daemon: error initializing graphdriver: driver not supported
With this patch:
dockerd --storage-driver=foobar
...
failed to start daemon: error initializing graphdriver: driver not supported: foobar
It's worth noting that there may be code "in the wild" that perform string-
matching on this error (e.g. [balena][1]), which is why I included the name as a separate "component"
in the output, to allow matching parts of the error.
[1]: 3d5c77a466/lib/preload.ts (L34-L35)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These variables collided with the "repository" and "store" types declared
in this package. Rename the variables colliding with "repository", and
rename the "store" type to "refStore".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Buildkit deprecated build information in v0.11 and will remove it in v0.12.
It's suggested to use provenance attestations instead.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The authorization.Middleware contains a sync.Mutex field, making it
non-copyable. Remove one of the barriers to allowing deep copies of
config.Config values.
Inject the middleware into Daemon as a constructor argument instead.
Signed-off-by: Cory Snider <csnider@mirantis.com>
`cp` variable is used later to populate the `info.Checkpoint` field
option used by Task creation.
Previous changes mistakenly changed assignment of the `cp` variable to
declaration of a new variable that's scoped only to the if block.
Restore the old assignment behavior.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
It's surprising that the method to begin serving requests is named Wait.
And it is unidiomatic: it is a synchronous call, but it sends its return
value to the channel passed in as an argument instead of just returning
the value. And ultimately it is just a trivial wrapper around serveAPI.
Export the ServeAPI method instead so callers can decide how to call and
synchronize around it.
Call ServeAPI synchronously on the main goroutine in cmd/dockerd. The
goroutine and channel which the Wait() API demanded are superfluous
after all. The notifyReady() call was always concurrent and asynchronous
with respect to serving the API (its implementation spawns a goroutine)
so it makes no difference whether it is called before ServeAPI() or
after `go ServeAPI()`.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Server.cfg field is never referenced by any code in package
"./api/server". "./api/server".Config struct values are used by
DaemonCli code, but only to pass around configuration copied out of the
daemon config within the "./cmd/dockerd" package. Delete the
"./api/server".Config struct definition and refactor the "./cmd/dockerd"
package to pull configuration directly from cli.Config.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Deprecate `<none>:<none>` and `<none>@<none>` magic strings included in
`RepoTags` and `RepoDigests`.
Produce an empty arrays instead and leave the presentation of
untagged/dangling images up to the client.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The netip types can be used as map keys, unlike net.IP and friends,
which is a very useful property to have for this application.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The address spaces are orthogonal. There is no shared state between them
logically so there is no reason for them to share any in-memory data
structures. addrSpace is responsible for allocating subnets and
addresses, while Allocator is responsible for implementing the IPAM API.
Lower all the implementation details of allocation into addrSpace.
There is no longer a need to include the name of the address space in
the map keys for subnets now that each addrSpace holds its own state
independently from other addrSpaces. Remove the AddressSpace field from
the struct used for map keys within an addrSpace so that an addrSpace
does not need to know its own name.
Pool allocations were encoded in a tree structure, using parent
references and reference counters. This structure affords for pools
subdivided an arbitrary number of times to be modeled, in theory. In
practice, the max depth is only two: master pools and sub-pools. The
allocator data model has also been heavily influenced by the
requirements and limitations of Datastore persistence, which are no
longer applicable.
Address allocations are always associated with a master pool. Sub-pools
only serve to restrict the range of addresses within the master pool
which could be allocated from. Model pool allocations within an address
space as a two-level hierarchy of maps.
Signed-off-by: Cory Snider <csnider@mirantis.com>
There is only one call site, in (*Allocator).RequestPool. The logic is
tightly coupled between the caller and callee, and having them separate
makes it harder to reason about.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Only two address spaces are supported: LocalDefault and GlobalDefault.
Support for non-default address spaces in the IPAM Allocator is
vestigial, from a time when IPAM state was stored in a persistent shared
datastore. There is no way to create non-default address spaces through
the IPAM API so there is no need to retain code to support the use of
such address spaces. Drop all pretense that more address spaces can
exist, to the extent that the IPAM API allows.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The two-phase commit dance serves no purpose with the current IPAM
allocator implementation. There are no fallible operations between the
call to aSpace.updatePoolDBOnAdd() and invoking the returned closure.
Allocate the subnet in the address space immediately when called and get
rid of the closure return.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Checking if the image creation failed due to IsAlreadyExists didn't use
the error from ImageService.Create.
Error from ImageService.Create was stored in a separate variable and
later IsAlreadyExists checked the standard `err` variable instead of the
`createErr`.
As there's no need to store the error in a separate variable - just
assign it to err variable and fix the check.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Kubernetes only permits RuntimeClass values which are valid lowercase
RFC 1123 labels, which disallows the period character. This prevents
cri-dockerd from being able to support configuring alternative shimv2
runtimes for a pod as shimv2 runtime names must contain at least one
period character. Add support for configuring named shimv2 runtimes in
daemon.json so that runtime names can be aliased to
Kubernetes-compatible names.
Allow options to be set on shimv2 runtimes in daemon.json.
The names of the new daemon runtime config fields have been selected to
correspond with the equivalent field names in cri-containerd's
configuration so that users can more easily follow documentation from
the runtime vendor written for cri-containerd and apply it to
daemon.json.
Signed-off-by: Cory Snider <csnider@mirantis.com>
PTR queries with domain names unknown to us are not necessarily invalid.
Act like a well-behaved middlebox and fall back to forwarding
externally, same as we do with the other query types.
Signed-off-by: Cory Snider <csnider@mirantis.com>
...for limiting concurrent external DNS requests with
"golang.org/x/sync/semaphore".Weighted. Replace the ad-hoc rate limiter
for when the concurrency limit is hit (which contains a data-race bug)
with "golang.org/x/time/rate".Sometimes.
Immediately retrying with the next server if the concurrency limit has
been hit just further compounds the problem. Wait on the semaphore and
refuse the query if it could not be acquired in a reasonable amount of
time.
Signed-off-by: Cory Snider <csnider@mirantis.com>
It handles figuring out the UDP receive buffer size and setting IO
timeouts, which simplifies our code. It is also more robust to receiving
UDP replies to earlier queries which timed out.
Log failures to perform a client exchange at level error so they are
more visible to operators and administrators.
Signed-off-by: Cory Snider <csnider@mirantis.com>
forwardExtDNS() will now continue with the next external DNS sever if
co.ReadMsg() returns (nil, nil). Previously it would abort resolving the
query and not reply to the container client. The implementation of
ReadMsg() in the currently- vendored version of miekg/dns cannot return
(nil, nil) so the difference is immaterial in practice.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(*dns.Msg).Truncate() is more intelligent and standards-compliant about
truncating DNS response messages than our hand-rolled version. Fix a
silly fencepost error the max TCP message size: the limit is
dns.MaxMsgSize (65535), full stop.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The TC flag in a DNS message indicates that the sender had to
truncate it to fit within the length limit of the transmission channel.
It does NOT indicate that part of the message was lost before reaching
the recipient. Older versions of github.com/miekg/dns conflated the two
cases by returning ErrTruncated from ReadMsg() if the message was parsed
without error but had the TC flag set. The version of miekg/dns
currently vendored no longer returns an error when a well-formed DNS
message is received which has its TC flag set, but there was some
confusion on how to update libnetwork to deal with this behaviour
change. Truncated DNS replies are no longer different from any other
reply message: they are normal replies which do not need any special-
case handling to proxy back to the client.
Signed-off-by: Cory Snider <csnider@mirantis.com>
go1.19.6 (released 2023-02-14) includes security fixes to the crypto/tls,
mime/multipart, net/http, and path/filepath packages, as well as bug fixes to
the go command, the linker, the runtime, and the crypto/x509, net/http, and
time packages. See the Go 1.19.6 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.19.6+label%3ACherryPickApproved
From the announcement on the security mailing:
We have just released Go versions 1.20.1 and 1.19.6, minor point releases.
These minor releases include 4 security fixes following the security policy:
- path/filepath: path traversal in filepath.Clean on Windows
On Windows, the filepath.Clean function could transform an invalid path such
as a/../c:/b into the valid path c:\b. This transformation of a relative (if
invalid) path into an absolute path could enable a directory traversal attack.
The filepath.Clean function will now transform this path into the relative
(but still invalid) path .\c:\b.
This is CVE-2022-41722 and Go issue https://go.dev/issue/57274.
- net/http, mime/multipart: denial of service from excessive resource
consumption
Multipart form parsing with mime/multipart.Reader.ReadForm can consume largely
unlimited amounts of memory and disk files. This also affects form parsing in
the net/http package with the Request methods FormFile, FormValue,
ParseMultipartForm, and PostFormValue.
ReadForm takes a maxMemory parameter, and is documented as storing "up to
maxMemory bytes +10MB (reserved for non-file parts) in memory". File parts
which cannot be stored in memory are stored on disk in temporary files. The
unconfigurable 10MB reserved for non-file parts is excessively large and can
potentially open a denial of service vector on its own. However, ReadForm did
not properly account for all memory consumed by a parsed form, such as map
ntry overhead, part names, and MIME headers, permitting a maliciously crafted
form to consume well over 10MB. In addition, ReadForm contained no limit on
the number of disk files created, permitting a relatively small request body
to create a large number of disk temporary files.
ReadForm now properly accounts for various forms of memory overhead, and
should now stay within its documented limit of 10MB + maxMemory bytes of
memory consumption. Users should still be aware that this limit is high and
may still be hazardous.
ReadForm now creates at most one on-disk temporary file, combining multiple
form parts into a single temporary file. The mime/multipart.File interface
type's documentation states, "If stored on disk, the File's underlying
concrete type will be an *os.File.". This is no longer the case when a form
contains more than one file part, due to this coalescing of parts into a
single file. The previous behavior of using distinct files for each form part
may be reenabled with the environment variable
GODEBUG=multipartfiles=distinct.
Users should be aware that multipart.ReadForm and the http.Request methods
that call it do not limit the amount of disk consumed by temporary files.
Callers can limit the size of form data with http.MaxBytesReader.
This is CVE-2022-41725 and Go issue https://go.dev/issue/58006.
- crypto/tls: large handshake records may cause panics
Both clients and servers may send large TLS handshake records which cause
servers and clients, respectively, to panic when attempting to construct
responses.
This affects all TLS 1.3 clients, TLS 1.2 clients which explicitly enable
session resumption (by setting Config.ClientSessionCache to a non-nil value),
and TLS 1.3 servers which request client certificates (by setting
Config.ClientAuth
> = RequestClientCert).
This is CVE-2022-41724 and Go issue https://go.dev/issue/58001.
- net/http: avoid quadratic complexity in HPACK decoding
A maliciously crafted HTTP/2 stream could cause excessive CPU consumption
in the HPACK decoder, sufficient to cause a denial of service from a small
number of small requests.
This issue is also fixed in golang.org/x/net/http2 v0.7.0, for users manually
configuring HTTP/2.
This is CVE-2022-41723 and Go issue https://go.dev/issue/57855.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
TestRequestReleaseAddressDuplicate gets flagged by go test -race because
the same err variable inside the test is assigned to from multiple
goroutines without synchronization, which obscures whether or not there
are any data races in the code under test.
Trouble is, the test _depends on_ the data race to exit the loop if an
error occurs inside a spawned goroutine. And the test contains a logical
concurrency bug (not flagged by the Go race detector) which can result
in false-positive test failures. Because a release operation is logged
after the IP is released, the other goroutine could reacquire the
address and log that it was reacquired before the release is logged.
Fix up the test so it is no longer subject to data races or
false-positive test failures, i.e. flakes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
If the resolver encounters an error before it attempts to forward the
request to external DNS, do not try to log information about the
external connection, because at this point `extConn` is `nil`. This
makes sure `dockerd` won't panic and crash from a nil pointer
dereference when it sees an invalid DNS query.
fixes#44979
Signed-off-by: er0k <er0k@er0k.net>
(cherry picked from commit 6c2637be11)
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This reverts commit ab3fa46502.
This fix was partial, and is not needed with the proper fix in
containerd.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
The errors are already returned to the client in the API response, so
logging them to the daemon log is redundant. Log the errors at level
Debug so as not to pollute the end-users' daemon logs with noise.
Refactor the logs to use structured fields. Add the request context to
the log entry so that logrus hooks could annotate the log entries with
contextual information about the API request in the hypothetical future.
Fixes#44997
Signed-off-by: Cory Snider <csnider@mirantis.com>
Go 1.20 made a change to the behaviour of package "os/exec" which was
not mentioned in the release notes:
2b8f214094
Attempts to execute a directory now return syscall.EISDIR instead of
syscall.EACCESS. Check for EISDIR errors from the runtime and fudge the
returned error message to maintain compatibility with existing versions
of docker/cli when using a version of runc compiled with Go 1.20+.
Signed-off-by: Cory Snider <csnider@mirantis.com>
maxDownloadAttempts maps to the daemon configuration flag
--max-download-attempts int
Set the max download attempts for each pull (default 5)
and the daemon configuration machinery interprets a value of 0 as "apply
the default value" and not a valid user value (config validation/
normalization bugs notwithstanding). The intention is clearly that this
configuration value should be an upper limit on the number of times the
daemon should try to download a particular layer before giving up. So it
is surprising to have the configuration value interpreted as a _retry_
limit. The daemon will make up to N+1 attempts to download a layer! This
also means users cannot disable retries even if they wanted to.
Fix the fencepost bug so that max attempts really means max attempts,
not max retries. And fix the fencepost bug with the retry-backoff delay
so that the first backoff is 5s, not 10s.
Signed-off-by: Cory Snider <csnider@mirantis.com>
"math/rand".Seed
- Migrate to using local RNG instances.
"archive/tar".TypeRegA
- The deprecated constant tar.TypeRegA is the same value as
tar.TypeReg and so is not needed at all.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This addresses the same CVE as is patched in go1.19.6. From that announcement:
> net/http: avoid quadratic complexity in HPACK decoding
>
> A maliciously crafted HTTP/2 stream could cause excessive CPU consumption
> in the HPACK decoder, sufficient to cause a denial of service from a small
> number of small requests.
>
> This issue is also fixed in golang.org/x/net/http2 v0.7.0, for users manually
> configuring HTTP/2.
>
> This is CVE-2022-41723 and Go issue https://go.dev/issue/57855.
full diff: https://github.com/golang/net/compare/v0.5.0...v0.7.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
DNS servers in the loopback address range should always be resolved in
the host network namespace when the servers are configured by reading
from the host's /etc/resolv.conf. The daemon mistakenly conflated the
presence of DNS options (docker run --dns-opt) with user-supplied DNS
servers, treating the list of servers loaded from the host as a user-
supplied list and attempting to resolve in the container's network
namespace. Correct this oversight so that loopback DNS servers are only
resolved in the container's network namespace when the user provides the
DNS server list, irrespective of other DNS configuration.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The per-network statistics counters are loaded and incremented without
any concurrency control. Use atomic integers to prevent data races
without having to add any synchronization.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The (*bitmap.Handle).String() method can be rather expensive to call. It
is all the more tragic when the expensively-constructed string is
immediately discarded because the log level is not high enough. Let
logrus stringify the arguments to debug logs so they are only
stringified when the log level is high enough.
# Before
ok github.com/docker/docker/libnetwork/ipam 10.159s
# After
ok github.com/docker/docker/libnetwork/ipam 2.484s
Signed-off-by: Cory Snider <csnider@mirantis.com>
These conditions were added in 8cf89245f5
to account for old versions of debian/ubuntu (apparmor_parser < 2.9)
that lacked some options;
> This allows us to use the apparmor profile we have in contrib/apparmor/
> and solves the problems where certain functions are not apparent on older
> versions of apparmor_parser on debian/ubuntu.
Those patches were from 2015/2016, and all currently supported distro
versions should now have more current versions than that. Looking at the
oldest supported versions;
Ubuntu 18.04 "Bionic":
apparmor_parser --version
AppArmor parser version 2.12
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2012 Canonical Ltd.
Debian 10 "Buster"
apparmor_parser --version
AppArmor parser version 2.13.2
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2018 Canonical Ltd.
This patch removes the conditionals.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These conditions were added in 8cf89245f5
to account for old versions of debian/ubuntu (apparmor_parser < 2.8.96)
that lacked some options;
> This allows us to use the apparmor profile we have in contrib/apparmor/
> and solves the problems where certain functions are not apparent on older
> versions of apparmor_parser on debian/ubuntu.
Those patches were from 2015/2016, and all currently supported distro
versions should now have more current versions than that. Looking at the
oldest supported versions;
Ubuntu 18.04 "Bionic":
apparmor_parser --version
AppArmor parser version 2.12
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2012 Canonical Ltd.
Debian 10 "Buster"
apparmor_parser --version
AppArmor parser version 2.13.2
Copyright (C) 1999-2008 Novell Inc.
Copyright 2009-2018 Canonical Ltd.
This patch removes the conditionals.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implements image tagging under containerd image store.
If an image with this tag already exists, and there's no other image
with the same target, change its name. The name will have a special
format `moby-dangling@<digest>` which isn't a valid canonical reference
and doesn't resolve to any repository.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
TagImage is just a wrapper for TagImageWithReference which parses the
repo and tag into a reference. Change TagImageWithReference into
TagImage and move the responsibility of reference parsing to caller.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Adds overrides with specific tests suites in our tests
matrix so we can reduce build time significantly.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
`hostSupports` doesn't check if the apparmor_parser is available.
It's possible in some environments that the apparmor will be enabled but
the tool to load the profile is not available which will cause the
ensureDefaultAppArmorProfile to fail completely.
This patch checks if the apparmor_parser is available. Otherwise the
function returns early, but still logs a warning to the daemon log.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
c8d/daemon: Mount root and fill BaseFS
This fixes things that were broken due to nil BaseFS like `docker cp`
and running a container with workdir override.
This is more of a temporary hack than a real solution.
The correct fix would be to refactor the code to make BaseFS and LayerRW
an implementation detail of the old image store implementation and use
the temporary mounts for the c8d implementation instead.
That requires more work though.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
daemon/images: Don't unset BaseFS
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
commit 043dbc05df temporarily switched to a
fork of BuildKit to workaround a failure in CI. These fixes have been
backported to the v0.11 branch in BuildKit, so we can switch back to upstream.
We can remove this override once we update vendor.mod to BuildKit v0.11.3.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
IPVLAN networks created on Moby v20.10 do not have the IpvlanFlag
configuration value persisted in the libnetwork database as that config
value did not exist before v23.0.0. Gracefully migrate configurations on
unmarshal to prevent type-assertion panics at daemon start after upgrade.
Fixes#44925
Signed-off-by: Cory Snider <csnider@mirantis.com>
CI is failing when bind-mounting source from the host into the dev-container;
fatal: detected dubious ownership in repository at '/go/src/github.com/docker/docker'
To add an exception for this directory, call:
git config --global --add safe.directory /go/src/github.com/docker/docker
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes the `docker save` and `docker load` work with the containerd
image store. The archive is both OCI and Docker compatible.
Saved archive will only contain content which is available locally. In
case the saved image is a multi-platform manifest list, the behavior
depends on the local availability of the content. This is to be
reconsidered when we have the `--platform` option in the CLI.
- If all manifests and their contents, referenced by the manifest list
are present, then the manifest-list is exported directly and the ID
will be the same.
- If only one platform manifest is present, only that manifest is
exported (the image id will change and will be the id of
platform-specific manifest, instead of the full manifest list).
- If multiple, but not all, platform manifests are available, a new
manifest list will be created which will be a subset of the original
manifest list.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Running `ctr` in `make shell` doesn't work by default - one has to
invoke it via `ctr --address /run/docker/containerd/containerd.sock -n
moby` or set the CONTAINERD_{ADDRESS,NAMESPACE} environment variables to
be able to just use `ctr`.
Considering ephemeral nature of these containers it's cumbersome to do
it every time manually - this commit sets the variables to the default
containerd socket address and makes it much easier to use ctr.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The Pid field of an exit event cannot be relied upon to differentiate
exits of the container's task from exits of other container processes,
i.e. execs. The Pid is reported by the runtime and is implementation-
defined so there is no guarantee that a task's pid is distinct from the
pids of any other process in the same container. In particular,
kata-containers reports the pid of the hypervisor for all exit events.
ContainerD guarantees that the process ID of a task is set to the
corresponding container ID, so use that invariant to distinguish task
exits from other process exits.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The ID of the task is known at the time that the FIFOs need to be
created (it's passed into the IO-creator callback, and is also the same
as the container ID) so there is no need to hardcode it to "init". Name
the FIFOs after the task ID to be consistent with the FIFO names of
exec'ed processes. Delete the now-unused InitProcessName constant so it
can never again be used in place of a task/process ID.
Signed-off-by: Cory Snider <csnider@mirantis.com>
ContainerD unconditionally sets the ID of a task to its container's ID.
Emulate this behaviour in the libcontainerd local_windows implementation
so that the daemon can use ProcessID == ContainerID (in libcontainerd
terminology) to identify that an exit event is for the container's task
and not for another process (i.e. an exec) in the same container.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The latest version of containerd-shim-runhcs-v1 (v0.10.0-rc.4) pulled in
with the bump to ContainerD v1.7.0-rc.3 had several changes to make it
more robust, which had the side effect of increasing the worst-case
amount of time it takes for a container to exit in the worst case.
Notably, the total timeout for shutting down a task increased from 30
seconds to 60! Increase the timeouts hardcoded in the daemon and
integration tests so that they don't give up too soon.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Notable Updates
- Fix push error propagation
- Fix slice append error with HugepageLimits for Linux
- Update default seccomp profile for PKU and CAP_SYS_NICE
- Fix overlayfs error when upperdirlabel option is set
full diff: https://github.com/containerd/containerd/compare/v1.6.15...v1.6.16
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The (*bitseq.Handle).Destroy() method deletes the persisted KVObject
from the datastore. This is a no-op on all the bitseq handles in package
ipam as they are not persisted in any datastore.
Signed-off-by: Cory Snider <csnider@mirantis.com>
That method was only referenced by ipam.Allocator, but as it no longer
stores any state persistently there is no possibility for it to load an
inconsistent bit-sequence from Docker 1.9.x.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The DatastoreConfig discovery type is unused. Remove the constant and
any resulting dead code. Today's biggest loser is the IPAM Allocator:
DatastoreConfig was the only type of discovery event it was listening
for, and there was no other place where a non-nil datastore could be
passed into the allocator. Strip out all the dead persistence code from
Allocator, leaving it as purely an in-memory implementation.
There is no more need to check the consistency of the allocator's
bit-sequences as there is no persistent storage for inconsistent bit
sequences to be loaded from.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Per the Interface Segregation Principle, network drivers should not have
to depend on GetPluginGetter methods they do not use. The remote network
driver is the only one which needs a PluginGetter, and it is already
special-cased in Controller so there is no sense warping the interfaces
to achieve a foolish consistency. Replace all other network drivers' Init
functions with Register functions which take a driverapi.Registerer
argument instead of a driverapi.DriverCallback. Add back in Init wrapper
functions for only the drivers which Swarmkit references so that
Swarmkit can continue to build.
Refactor the libnetwork Controller to use the new drvregistry.Networks
and drvregistry.IPAMs driver registries in place of the legacy ones.
Signed-off-by: Cory Snider <csnider@mirantis.com>
There is no benefit to having a single registry for both IPAM drivers
and network drivers. IPAM drivers are registered in a separate namespace
from network drivers, have separate registration methods, separate
accessor methods and do not interact with network drivers within a
DrvRegistry in any way. The only point of commonality is
interface { GetPluginGetter() plugingetter.PluginGetter }
which is only used by the respective remote drivers and therefore should
be outside of the scope of a driver registry.
Create new, separate registry types for network drivers and IPAM
drivers, respectively. These types are "legacy-free". Neither type has
GetPluginGetter methods. The IPAMs registry does not have an
IPAMDefaultAddressSpaces method as that information can be queried
directly from the driver using its GetDefaultAddressSpaces method.
The Networks registry does not have an AddDriver method as that method
is a trivial wrapper around calling one of its arguments with its other
arguments.
Refactor DrvRegistry in terms of the new IPAMs and Networks registries
so that existing code in libnetwork and Swarmkit will continue to work.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The datastore arguments to the IPAM driver Init() functions are always
nil, even in Swarmkit. The only IPAM driver which consumed the
datastores was builtin; all others (null, remote, windowsipam) simply
ignored it. As the signatures of the IPAM driver init functions cannot
be changed without breaking the Swarmkit build, they have to be left
with the same signatures for the time being. Assert that nil datastores
are always passed into the builtin IPAM driver's init function so that
there is no ambiguity the datastores are no longer respected.
Add new Register functions for the IPAM drivers which are free from the
legacy baggage of the Init functions. (The legacy Init functions can be
removed once Swarmkit is migrated to using the Register functions.) As
the remote IPAM driver is the only one which depends on a PluginGetter,
pass it in explicitly as an argument to Register. The other IPAM drivers
should not be forced to depend on a GetPluginGetter() method they do not
use (Interface Segregation Principle).
Signed-off-by: Cory Snider <csnider@mirantis.com>
Without (*Controller).ReloadConfiguration, the only way to configure
datastore scopes would be by passing config.Options to libnetwork.New.
The only options defined which relate to datastore scopes are limited to
configuring the local-scope datastore. Furthermore, the default
datastore config only defines configuration for the local-scope
datastore. The local-scope datastore is therefore the only datastore
scope possible in libnetwork. Start removing code which is only
needed to support multiple datastore scopes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Repro steps:
- Run Docker Desktop
- Run `docker run busybox tail -f /dev/null`
- Run `pkill "Docker Desktop"
Expected:
An error message that indicates that Docker Desktop is shutting down.
Actual:
An error message that looks like this:
```
error waiting for container: invalid character 's' looking for beginning of value
```
here's an example:
https://github.com/docker/for-mac/issues/6575#issuecomment-1324879001
After this change, you get an error message like:
```
error waiting for container: copying response body from Docker: unexpected EOF
```
which is a bit more explicit.
Signed-off-by: Nick Santos <nick.santos@docker.com>
ConfigLocalScopeDefaultNetworks is now dead code, thank goodness! Make
sure it stays dead by deleting the function. Refactor package ipamutils
to simplify things given its newly-reduced (ahem) scope.
Signed-off-by: Cory Snider <csnider@mirantis.com>
ipam.Allocator is not a singleton, but it references mutable singleton
state. Address that deficiency by refactoring it to instead take the
predefined address spaces as constructor arguments. Unfortunately some
work is needed on the Swarmkit side before the mutable singleton state
can be completely eliminated from the IPAMs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The function references global shared, mutable state and is no longer
needed. Deleting it brings us one step closer to getting rid of that
pesky shared state.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The netutils.ElectInterfaceAddresses function is only used in one place
outside of tests: in the daemon, to configure the default bridge
network. The function is also messy to reason about as it references the
shared mutable state of ipamutils.PredefinedLocalScopeDefaultNetworks.
It uses the list of predefined default networks to always return an IPv4
address even if the named interface does not exist or does not have any
IPv4 addresses. This list happens to be the same as the one used to
initialize the address pool of the 'builtin' IPAM driver, though that is
far from obvious. (Start with "./libnetwork".initIPAMDrivers and trace
the dataflow of the addressPool value. Surprise! Global state is being
mutated using the value of other global mutable state.)
The daemon does not need the fallback behaviour of
ElectInterfaceAddresses. In fact, the daemon does not have to configure
an address pool for the network at all! libnetwork will acquire one of
the available address ranges from the network's IPAM driver when the
preferred-pool configuration is unset. It will do so using the same list
of address ranges and the exact same logic
(netutils.FindAvailableNetworks) as ElectInterfaceAddresses. So unless
the daemon needs to force the network to use a specific address range
because the bridge interface already exists, it can leave the details
up to libnetwork.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The pattern of parsing bool was repeated across multiple files and
caused the duplication of the invalidFilter error helper.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
These package-level variables were copied over from the Linux
implementation; drop them for clarity's sake.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
netlink offers the netlink.LinkNotFoundError type, which we can use with
errors.As() to detect a unused link name.
Additionally, early return if GenerateRandomName fails, as reading
random bytes should be a highly reliable operation, and otherwise the
error would be swallowed by the fall-through return.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
GenerateRandomName now uses length to represent the overall length of
the string; this will help future users avoid creating interface names
that are too long for the kernel to accept by mistake. The test coverage
is increased and cleaned up using gotest.tools.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
After the context is cancelled, update the progress for the last time.
This makes sure that the progress also includes finishing updates.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
*bitseq.Handle should not implement sync.Locker. The mutex is a private
implementation detail which external consumers should not be able to
manipulate.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The byte-slice temporary is fully overwritten on each loop iteration so
it can be safely reused to reduce GC pressure.
Signed-off-by: Cory Snider <csnider@mirantis.com>
There is a solid bit-vector datatype hidden inside bitseq.Handle, but
it's obscured by all the intrusive datastore and KVObject nonsense.
It can be used without a datastore, but the datastore baggage limits its
utility inside and outside libnetwork. Extract the datatype goodness
into its own package which depends only on the standard library so it
can be used in more situations.
Signed-off-by: Cory Snider <csnider@mirantis.com>
...in preparation for separating the bit-sequence datatype from the
datastore persistence (KVObject) concerns. The new package's contents
are identical to those of package libnetwork/bitseq to assist in
reviewing the changes made on each side of the split.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We already prefer ld for cross-building arm64 but that seems
not enough as native arm64 build also has a linker issue with lld
so we need to also prefer ld for native arm64 build.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Static binaries for dockerd are broken on armhf and armel (32-bit).
It seems to be an issue with GCC as building using clang solves
this issue. Also adds extra instruction to prefer ld for
cross-compiling arm64 in bullseye otherwise it doesn't link.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Disables user.Lookup() and net.LookupHost() in the init() function on Windows.
Any package that simply imports pkg/chrootarchive will panic on Windows
Nano Server, due to missing netapi32.dll. While docker itself is not
meant to run on Nano Server, binaries that may import this package and
run on Nano server, will fail even if they don't really use any of the
functionality in this package while running on Nano.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Current implementation in hack/make.sh overwrites PKG_CONFIG
if not defined and set it to pkg-config. When a build is invoked
using xx in our Dockerfile, it will set PKG_CONFIG to the right
value in go environments depending on the target architecture: 8015613ccc/base/xx-go (L75-L78)
Also needs to install dpkg-dev to use pkg-config when cross-building
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Build currently doesn't set the right name for target ARM
architecture through switches in CGO_CFLAGS and CGO_CXXFLAGS
when doing cross-compilation. This was previously fixed in https://github.com/moby/moby/pull/43474
Also removes the toolchain configuration. Following changes for
cross-compilation in https://github.com/moby/moby/pull/44546,
we forgot to remove the toolchain configuration that is
not used anymore as xx already sets correct cc/cxx envs already.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
This reverts commit 2d397beb00.
moby#44706 and moby#44805 were both merged, and both refactored the same
file. The combination broke the build, and was not detected in CI as
only the combination of the two, applied to the same parent commit,
caused the failure.
moby#44706 should be carried forward, based on the current master, in
order to resolve this conflict.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Embedded structs are part of the exported surface of a struct type.
Boxing a struct value into an interface value does not erase that;
any code could gain access to the embedded struct value with a simple
type assertion. The mutex is supposed to be a private implementation
detail, but *network implements sync.Locker because the mutex is
embedded. Change the mutex to an unexported field so *network no
longer spuriously implements the sync.Locker interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Embedded structs are part of the exported surface of a struct type.
Boxing a struct value into an interface value does not erase that;
any code could gain access to the embedded struct value with a simple
type assertion. The mutex is supposed to be a private implementation
detail, but *endpoint implements sync.Locker because the mutex is
embedded. Change the mutex to an unexported field so *endpoint no
longer spuriously implements the sync.Locker interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Basically every exported method which takes a libnetwork.Sandbox
argument asserts that the value's concrete type is *sandbox. Passing any
other implementation of the interface is a runtime error! This interface
is a footgun, and clearly not necessary. Export and use the concrete
type instead.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Embedded structs are part of the exported surface of a struct type.
Boxing a struct value into an interface value does not erase that;
any code could gain access to the embedded struct value with a simple
type assertion. The mutex is supposed to be a private implementation
detail, but *sandbox implements sync.Locker because the mutex is
embedded. Change the mutex to an unexported field so *sandbox no
longer spuriously implements the sync.Locker interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Embedded structs are part of the exported surface of a struct type.
Boxing a struct value into an interface value does not erase that;
any code could gain access to the embedded struct value with a simple
type assertion. The mutex is supposed to be a private implementation
detail, but *controller implements sync.Locker because the mutex is
embedded.
c, _ := libnetwork.New()
c.(sync.Locker).Lock()
Change the mutex to an unexported field so *controller no longer
spuriously implements the sync.Locker interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
unshare.Go() is not used as an existing network namespace needs to be
entered, not a new one created. Explicitly lock main() to the initial
thread so as not to depend on the side effects of importing the
internal/unshare package to achieve the same.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- The oldest kernel version currently supported is v3.10. Bridge
parameters can be set through netlink since v3.8 (see
torvalds/linux@25c71c7). As such, we don't need to fallback to sysfs to
set hairpin mode.
- `scanInterfaceStats()` is never called, so no need to keep it alive.
- Document why `default_pvid` is set through sysfs
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
When userland-proxy is turned off and on again, the iptables nat rule
doing hairpinning isn't properly removed. This fix makes sure this nat
rule is removed whenever the bridge is torn down or hairpinning is
disabled (through setting userland-proxy to true).
Unlike for ip masquerading and ICC, the `programChainRule()` call
setting up the "MASQ LOCAL HOST" rule has to be called unconditionally
because the hairpin parameter isn't restored from the driver store, but
always comes from the driver config.
For the "SKIP DNAT" rule, things are a bit different: this rule is
always deleted by `removeIPChains()` when the bridge driver is
initialized.
Fixes#44721.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
If the imported layer archive is uncompressed, it gets compressed with
gzip before writing to the content store.
Archives compressed with gzip and zstd are imported as-is.
Xz and bzip2 are recompressed into gzip.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This helps ensure that users are not surprised by unexpected tokens in
the JSON parser, or fallout later in the daemon.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This is a pragmatic but impure choice, in order to better support the
default tools available on Windows Server, and reduce user confusion due
to otherwise inscrutable-to-the-uninitiated errors like the following:
> invalid character 'þ' looking for beginning of value
> invalid character 'ÿ' looking for beginning of value
While meaningful to those who are familiar with and are equipped to
diagnose encoding issues, these characters will be hidden when the file
is edited with a BOM-aware text editor, and further confuse the user.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
dockerd handles SIGQUIT by dumping all goroutine stacks to standard
error and exiting. In contrast, the Go runtime's default SIGQUIT
behaviour... dumps all goroutine stacks to standard error and exits.
The default SIGQUIT behaviour is implemented directly in the runtime's
signal handler, and so is both more robust to bugs in the Go runtime and
does not perturb the state of the process to anywhere near same degree
as dumping goroutine stacks from a user goroutine. The only notable
difference from a user's perspective is that the process exits with
status 2 instead of 128+SIGQUIT.
Signed-off-by: Cory Snider <csnider@mirantis.com>
synchronises some fixes between these API versions for the documentation,
including fixes from:
- 18f85467e7
- 345346d7c6
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Trying to remove the "docker.io" domain from locations where it's not relevant.
In these cases, this domain was used as a "random" domain for testing or example
purposes.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Trying to remove the "docker.io" domain from locations where it's not relevant.
In these cases, this domain was used as a "random" domain for testing or example
purposes.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
[RFC 8259] allows for JSON implementations to optionally ignore a BOM
when it helps with interoperability; do so in Moby as Notepad (the only
text editor available out of the box in many versions of Windows Server)
insists on writing UTF-8 with a BOM.
[RFC 8259]: https://tools.ietf.org/html/rfc8259#section-8.1
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
We only need suitable UAPI headers now. They are available on kernel 4.7
and newer; out of the distributions currently in support that users
might be interested in, only Enterprise Linux 7 has too old a kernel
(3.10).
Users of Enterprise Linux 7 distros can compile using a newer platform,
disable the Btrfs graphdriver as documented in this file, or use newer
kernel headers on their older distro.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
By relying on the kernel UAPI (userspace API), we can drop a dependency
and simplify building Moby, while also ensuring that we are using a
stable/supported source of the C types and defines we need.
btrfs-progs mirrors the kernel headers, but the headers it ships with
are not the canonical source and as [we have seen before][44698], could
be subject to changes.
Depending on the canonical headers from the kernel both is more
idiomatic, and ensures we are protected by the kernel's promise to not
break userspace.
[44698]: https://github.com/moby/moby/issues/44698
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This is actually quite meaningless as we are reporting the libbtrfs
version, but we do not use libbtrfs. We only use the kernel interface to
btrfs instead.
While we could report the version of the kernel headers in play, they're
rather all-or-nothing: they provide the structures and defines we need,
or they don't. As such, drop all version information as the host kernel
version is the only thing that matters.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
cmd.Wait is called twice from different goroutines which can cause the
test to hang completely. Fix by calling Wait only once and sending its
return value over a channel.
In TestLogsFollowGoroutinesWithStdout also added additional closes and
process kills to ensure that we don't leak anything in case test returns
early because of failed test assertion.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is a squashed version of various PRs (or related code-changes)
to implement image inspect with the containerd-integration;
- add support for image inspect
- introduce GetImageOpts to manage image inspect data in backend
- GetImage to return image tags with details
- list images matching digest to discover all tags
- Add ExposedPorts and Volumes to the image returned
- Refactor resolving/getting images
- Return the image ID on inspect
- consider digest and ignore tag when both are set
- docker run --platform
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make it possible to add `-race` to the BUILDFLAGS without making the
build fail with error:
"-buildmode=pie not supported when -race is enabled"
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Conntrack entries are created for UDP flows even if there's nowhere to
route these packets (ie. no listening socket and no NAT rules to
apply). Moreover, iptables NAT rules are evaluated by netfilter only
when creating a new conntrack entry.
When Docker adds NAT rules, netfilter will ignore them for any packet
matching a pre-existing conntrack entry. In such case, when
dockerd runs with userland proxy enabled, packets got routed to it and
the main symptom will be bad source IP address (as shown by #44688).
If the publishing container is run through Docker Swarm or in
"standalone" Docker but with no userland proxy, affected packets will
be dropped (eg. routed to nowhere).
As such, Docker needs to flush all conntrack entries for published UDP
ports to make sure NAT rules are correctly applied to all packets.
- Fixes#44688
- Fixes#8795
- Fixes#16720
- Fixes#7540
- Fixesmoby/libnetwork#2423
- and probably more.
As a precautionary measure, those conntrack entries are also flushed
when revoking external connectivity to avoid those entries to be reused
when a new sandbox is created (although the kernel should already
prevent such case).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The CreatedAt date was determined from the volume's `_data`
directory (`/var/lib/docker/volumes/<volumename>/_data`).
However, when initializing a volume, this directory is updated,
causing the date to change.
Instead of using the `_data` directory, use its parent directory,
which is not updated afterwards, and should reflect the time that
the volume was created.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The image store's used are an interface, so there's no guarantee
that implementations don't wrap the errors. Make sure to catch
such cases by using errors.Is.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Better support for cross compilation so we can fully rely
on `--platform` flag of buildx for a seamless integration.
This removes unnecessary extra cross logic in the Dockerfile,
DOCKER_CROSSPLATFORMS and CROSS vars and some hack scripts as well.
Non-sandboxed build invocation is still supported and dev stages
in the Dockerfile have been updated accordingly.
Bake definition and GitHub Actions workflows have been updated
accordingly as well.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
This function was suppressing errors coming from ConfigureTransport, with the
assumption that it will only be used with the Default configuration, and only return
errors for invalid / unsupported schemes (e.g., using "npipe" on a Linux environment);
d109e429dd/vendor/github.com/docker/go-connections/sockets/sockets_unix.go (L27-L29)
Those errors won't happen when the default is passed, so this is mostly theoretical.
Let's return the error instead (which should always be nil), just to be on the save
side, and to make sure that any other use of this function will return errors that
occurred, which may also be when parsing proxy environment variables;
d109e429dd/vendor/github.com/docker/go-connections/sockets/sockets.go (L29)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code below is run when restoring all images (which can be "many"),
constructing the "logrus.WithFields" is deliberately not "DRY", as the
logger is only used for error-cases, and we don't want to do allocations
if we don't need it. A "f" type-alias was added to make it ever so slightly
more DRY, but that's just for convenience :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Having this function hides what it's doing, which is just to type-cast
to an image.ID (which is a digest). Using a cast is more transparent,
so deprecating this function in favor of a regular typecast.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Simplify the error message so that we don't have to distinguish between static-
and non-static builds. Also update the link to the storage-driver section to
use a "/go/" redirect in the docs, as the anchor link was no longer correct.
Using a "/go/" redirect makes sure the link remains functional if docs is moving
around.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in b86e3bee5a to
work around an issue in os/user.Current(), which SEGFAULTS when compiling
statically with cgo enabled (see golang/go#13470).
We hit similar issues in other parts, and contributed a "osusergo" build-
tag in https://go-review.googlesource.com/c/go/+/330753. The "osusergo"
build tag must be set when compiling static binaries with cgo enabled.
If that build-tag is set, the cgo implementation for user.Current() won't
be used, and a pure-go implementation is used instead;
https://github.com/golang/go/blob/go1.19.4/src/os/user/cgo_lookup_unix.go#L5
With the above in place, we no longer need this workaround, and can remove
the ensureHomeIfIAmStatic() function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The IPCMode type was added in 497fc8876e, and from
that patch, the intent was to allow `host` (without `:`), `""` (empty, default)
or `container:<container ID>`, but the `Valid()` function seems to be too relaxed
and accepting both `:`, as well as `host:<anything>`. No unit-tests were added
in that patch, and integration-tests only tested for valid values.
Later on, `PidMode`, and `UTSMode` were added in 23feaaa240
and f2e5207fc9, both of which were implemented as
a straight copy of the `IPCMode` implementation, copying the same bug.
Finally, commit d4aec5f0a6 implemented unit-tests
for these types, but testing for the wrong behavior of the implementation.
This patch updates the validation to correctly invalidate `host[:<anything>]`
and empty (`:`) types.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It only had a single implementation, so we may as well remove the added
complexity of defining it as an interface.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After further looking at the code, it appears that the default exit-code
for unknown (other) errors is 128 (set in `defer`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These matches were overwriting the previous "match", so reversing the
order in which they're tried so that we can return early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Rename the variable make it more visible where it's used, as there's were
other "err" variables masking it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
They were just small wrappers arround cli.Args(), and the abstraction
made one wonder if they were doing some "magic" things, but they weren't,
so just inlining the `cli.Args()` makes it more transparent what's executed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Remove `WaitRestart()` as it was no longer used
- Un-export `WaitForInspectResult()` as it was only used internally, and we want
to reduce uses of these utilities.
- Inline `appendDocker()` into `Docker()` as it was the only place it was used,
and the name was incorrect anyway (should've been named `prependXX`).
- Simplify `Args()`, as it was first splitting the slice (into `command` and `args`),
only to join them again into a single slice (in `icmd.Command()`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types were moved to api/types/container in 7ac4232e70,
but the unit-tests for them were not moved. This patch moves the unit-tests back together
with the types.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
iptables -C flag was introduced in v1.4.11, which was released ten
years ago. Thus, there're no more Linux distributions supported by
Docker using this version. As such, this commit removes the old way of
checking if an iptables rule exists (by using substring matching).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The former was doing some checks and logging warnings, whereas
the latter was doing the same checks but to set some internal variables.
As both are called only once and from the same place, there're now
merged together.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
These were only used in a single location, and in a rather bad way;
replace them with strings.Cut() which should be all we need for this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- config collided with import
- cap collided with a built-in
- c collided with the "controller" receiver
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was forked from libcontainer (now runc) in
fb6dd9766e
From the description of this code:
> THIS CODE DOES NOT COMMUNICATE WITH KERNEL VIA RTNETLINK INTERFACE
> IT IS HERE FOR BACKWARDS COMPATIBILITY WITH OLDER LINUX KERNELS
> WHICH SHIP WITH OLDER NOT ENTIRELY FUNCTIONAL VERSION OF NETLINK
That comment was added as part of a refactor in;
4fe2c7a4db
Digging deeper into the code, it describes:
> This is more backward-compatible than netlink.NetworkSetMaster and
> works on RHEL 6.
That comment (and code) moved around a few times;
- moved into the libcontainer pkg: 6158ccad97
- moved within the networkdriver pkg: 4cdcea2047
- moved into the networkdriver pkg: 90494600d3
Ultimately leading to 7a94cdf8ed, which implemented
this:
> create the bridge device with ioctl
>
> On RHEL 6, creation of a bridge device with netlink fails. Use the more
> backward-compatible ioctl instead. This fixes networking on RHEL 6.
So from that information, it looks indeed to support RHEL 6, and Ubuntu 12.04
which are both EOL, and we haven't supported for a long time, so probably time
to remove this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fixes a (theoretical?) panic if ID would be shorter than 12
characters. Also trim the ID _after_ cutting off the suffix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We're looking for a specific prefix, so remove the prefix instead. Also remove
redundant error-wrapping, as `os.Open()` already provides details in the error
returned;
open /no/such/file: no such file or directory
open /etc/os-release: permission denied
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use a single exported implementation, so that we can maintain the
GoDoc string in one place, and use non-exported functions for the
actual implementation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These types and functions are more closely related to the functionality
provided by pkg/systeminfo, and used in conjunction with the other functions
in that package, so moving them there.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use a single exported implementation, so that we can maintain the
GoDoc string in one place, and use non-exported functions for the
actual implementation (which were already in place).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility wasn't very related to all other utilities in pkg/ioutils.
Moving it to longpath to also make it more clear what it does.
It looks like there's only a single (public) external consumer of this
utility, and only used in a test, and it's not 100% clear if it was
intentional to use our package, of if it was a case of "I actually meant
`io/ioutil.MkdirTemp`" so we could consider skipping the alias.
While moving the package, I also renamed `TempDir` to `MkdirTemp`, which
is the signature it matches in "os" from stdlib.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
No changes in vendored code, but removes some indirect dependencies.
full diff: b17f02f0a0...0da442b278
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since b58de39ca7, this option was now only used
to produce a fatal error when starting the daemon. That change is in the 23.0
release, so we can remove it from the master branch.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was added in 677a6b3506, and named
"common", because at the time, the "docker" and "dockerd" (daemon) code
were still in the same repository, and shared this type. Renaming it, now
that's no longer the case.
As there are no external consumers of this type, I'm not adding an alias.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The configDir (and "DOCKER_CONFIG" environment variable) is now only used
for the default location for TLS certificates to secure the daemon API.
It is a leftover from when the "docker" and "dockerd" CLI shared the same
binary, allowing the DOCKER_CONFIG environment variable to set the location
for certificates to be used by both.
This patch merges it into cmd/dockerd, which is where it was used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The cli package is a leftover from when the "docker" and "dockerd" cli
were both maintained in this repository; the only consumer of this is
now the dockerd CLI, so we can move this code (it should not be imported
by anyone).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Moby is not a Go module; to prevent anyone from mistakenly trying to
convert it to one before we are ready, introduce a check (usable in CI
and locally) for a go.mod file.
This is preferable to trying to .gitignore the file as we can ensure
that a mistakenly created go.mod is surfaced by Git-based tooling and is
less likely to surprise a contributor.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
To make the local build environment more correct and consistent, we
should never leave an uncommitted go.mod in the tree; however, we need a
go.mod for certain commands to work properly. Use a wrapper script to
create and destroy the go.mod as needed instead of potentially changing
tooling behavior by leaving it.
If a go.mod already exists, this script will warn and call the wrapped
command with GO111MODULE=on.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(*Daemon).registerLinks() calling the WriteHostConfig() method of its
container argument is a vestigial behaviour. In the distant past,
registerLinks() would persist the container links in an SQLite database
and drop the link config from the container's persisted HostConfig. This
changed in Docker v1.10 (#16032) which migrated away from SQLite and
began using the link config in the container's HostConfig as the
persistent source of truth. registerLinks() no longer mutates the
HostConfig at all so persisting the HostConfig to disk falls outside of
its scope of responsibilities.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(*Container).CheckpointTo() upserts a snapshot of the container to the
daemon's in-memory ViewDB and also persists the snapshot to disk. It
does not register the live container object with the daemon's container
store, however. The ViewDB and container store are used as the source of
truth for different operations, so having a container registered in one
but not the other can result in inconsistencies. In particular, the List
Containers API uses the ViewDB as its source of truth and the Container
Inspect API uses the container store.
The (*Daemon).setHostConfig() method is called fairly early in the
process of creating a container, long before the container is registered
in the daemon's container store. Due to a rogue CheckpointTo() call
inside setHostConfig(), there is a window of time where a container can
be included in a List Containers API response but "not exist" according
to the Container Inspect API and similar endpoints which operate on a
particular container. Remove the rogue call so that the caller has full
control over when the container is checkpointed and update callers to
checkpoint explicitly. No changes to (*Daemon).create() are needed as it
checkpoints the fully-created container via (*Daemon).Register().
Fixes#44512.
Signed-off-by: Cory Snider <csnider@mirantis.com>
GetContainer() would return (nil, nil) when looking up a container
if the container was inserted into the containersReplica ViewDB but not
the containers Store at the time of the lookup. Callers which reasonably
assume that the returned err == nil implies returned container != nil
would dereference a nil pointer and panic. Change GetContainer() so that
it always returns a container or an error.
Signed-off-by: Cory Snider <csnider@mirantis.com>
the non-exported "daemon.createNetwork" already returns nil if there's
an error, so no need to check the error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a dependency of github.com/fluent/fluent-logger-golang, which
currently does not provide a go.mod, but tests against the latest
versions of its dependencies.
Updating this dependency to the latest version.
Notable changes:
- all: implement omitempty
- fix: JSON encoder may produce invalid utf-8 when provided invalid utf-8 message pack string.
- added Unwrap method to errWrapped plus tests; switched travis to go 1.14
- CopyToJSON: fix bitSize for floats
- Add Reader/Writer constructors with custom buffer
- Add missing bin header functions
- msgp/unsafe: bring code in line with unsafe guidelines
- msgp/msgp: fix ReadMapKeyZC (fix "Fail to decode string encoded as bin type")
full diff: https://github.com/tinylib/msgp/compare/v1.1.0...v1.1.6
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is an (indirect) dependency of github.com/fluent/fluent-logger-golang,
which currently does not provide a go.mod, but tests against the latest
versions of its dependencies.
Updating this dependency to the latest version.
full diff: https://github.com/philhofer/fwd/compare/v1.0.0...v1.1.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows differentiating how the detailed data is collected between
the containerd-integration code and the existing implementation.
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The List Images API endpoint has accepted multiple values for the
`since` and `before` filter predicates, but thanks to Go's randomizing
of map iteration order, it would pick an arbitrary image to compare
created timestamps against. In other words, the behaviour was undefined.
Change these filter predicates to have well-defined semantics: the
logical AND of all values for each of the respective predicates. As
timestamps are a totally-ordered relation, this is exactly equivalent to
applying the newest and oldest creation timestamps for the `since` and
`before` predicates, respectively.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Also using `bytes.TrimSuffix()`, which is slightly more readable, and
makes sure we're only stripping the null terminator.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
golang.org/x/net contains a fix for CVE-2022-41717, which was addressed
in stdlib in go1.19.4 and go1.18.9;
> net/http: limit canonical header cache by bytes, not entries
>
> An attacker can cause excessive memory growth in a Go server accepting
> HTTP/2 requests.
>
> HTTP/2 server connections contain a cache of HTTP header keys sent by
> the client. While the total number of entries in this cache is capped,
> an attacker sending very large keys can cause the server to allocate
> approximately 64 MiB per open connection.
>
> This issue is also fixed in golang.org/x/net/http2 v0.4.0,
> for users manually configuring HTTP/2.
full diff: https://github.com/golang/net/compare/v0.2.0...v0.4.0
other dependency updates (due to circular dependencies):
- golang.org/x/sys v0.3.0: https://github.com/golang/sys/compare/v0.2.0...v0.3.0
- golang.org/x/text v0.5.0: https://github.com/golang/text/compare/v0.4.0...v0.5.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Fix nil pointer deference for Windows containers in CRI plugin
- Fix lease labels unexpectedly overwriting expiration
- Fix for simultaneous diff creation using the same parent snapshot
full diff: https://github.com/containerd/containerd/v1.6.10...v1.6.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Includes security fixes for net/http (CVE-2022-41717, CVE-2022-41720),
and os (CVE-2022-41720).
These minor releases include 2 security fixes following the security policy:
- os, net/http: avoid escapes from os.DirFS and http.Dir on Windows
The os.DirFS function and http.Dir type provide access to a tree of files
rooted at a given directory. These functions permitted access to Windows
device files under that root. For example, os.DirFS("C:/tmp").Open("COM1")
would open the COM1 device.
Both os.DirFS and http.Dir only provide read-only filesystem access.
In addition, on Windows, an os.DirFS for the directory \(the root of the
current drive) can permit a maliciously crafted path to escape from the
drive and access any path on the system.
The behavior of os.DirFS("") has changed. Previously, an empty root was
treated equivalently to "/", so os.DirFS("").Open("tmp") would open the
path "/tmp". This now returns an error.
This is CVE-2022-41720 and Go issue https://go.dev/issue/56694.
- net/http: limit canonical header cache by bytes, not entries
An attacker can cause excessive memory growth in a Go server accepting
HTTP/2 requests.
HTTP/2 server connections contain a cache of HTTP header keys sent by
the client. While the total number of entries in this cache is capped,
an attacker sending very large keys can cause the server to allocate
approximately 64 MiB per open connection.
This issue is also fixed in golang.org/x/net/http2 vX.Y.Z, for users
manually configuring HTTP/2.
Thanks to Josselin Costanzi for reporting this issue.
This is CVE-2022-41717 and Go issue https://go.dev/issue/56350.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.19.4
And the milestone on the issue tracker:
https://github.com/golang/go/issues?q=milestone%3AGo1.19.4+label%3ACherryPickApproved
Full diff: https://github.com/golang/go/compare/go1.19.3...go1.19.4
The golang.org/x/net fix is in 1e63c2f08a
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Kevin has been doing a lot of work helping modernize our CI, and helping
with BuildKit-related changes. Being our "resident GitHub actions expert",
I think it make sense to add him to the maintainers list as a curator (for
starters), to grant him permissions on the repository.
Perhaps we should nominate him to become a maintainer (with a focus on his
area of expertise) :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This addresses a regression introduced in 407e3a4552,
which turned out to be "too strict", as there's old images that use, for example;
docker pull python:3.5.1-alpine
3.5.1-alpine: Pulling from library/python
unsupported media type application/octet-stream
Before 407e3a4552, such mediatypes were accepted;
docker pull python:3.5.1-alpine
3.5.1-alpine: Pulling from library/python
e110a4a17941: Pull complete
30dac23631f0: Pull complete
202fc3980a36: Pull complete
Digest: sha256:f88925c97b9709dd6da0cb2f811726da9d724464e9be17a964c70f067d2aa64a
Status: Downloaded newer image for python:3.5.1-alpine
docker.io/library/python:3.5.1-alpine
This patch copies the additional media-types, using the list of types that
were added in a215e15cb1, which fixed a
similar issue.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This syncs the seccomp-profile with the latest changes in containerd's
profile, applying the same changes as 17a9324035
Some background from the associated ticket:
> We want to use vsock for guest-host communication on KubeVirt
> (https://github.com/kubevirt/kubevirt). In KubeVirt we run VMs in pods.
>
> However since anyone can just connect from any pod to any VM with the
> default seccomp settings, we cannot limit connection attempts to our
> privileged node-agent.
>
> ### Describe the solution you'd like
> We want to deny the `socket` syscall for the `AF_VSOCK` family by default.
>
> I see in [1] and [2] that AF_VSOCK was actually already blocked for some
> time, but that got reverted since some architectures support the `socketcall`
> syscall which can't be restricted properly. However we are mostly interested
> in `arm64` and `amd64` where limiting `socket` would probably be enough.
>
> ### Additional context
> I know that in theory we could use our own seccomp profiles, but we would want
> to provide security for as many users as possible which use KubeVirt, and there
> it would be very helpful if this protection could be added by being part of the
> DefaultRuntime profile to easily ensure that it is active for all pods [3].
>
> Impact on existing workloads: It is unlikely that this will disturb any existing
> workload, becuase VSOCK is almost exclusively used for host-guest commmunication.
> However if someone would still use it: Privileged pods would still be able to
> use `socket` for `AF_VSOCK`, custom seccomp policies could be applied too.
> Further it was already blocked for quite some time and the blockade got lifted
> due to reasons not related to AF_VSOCK.
>
> The PR in KubeVirt which adds VSOCK support for additional context: [4]
>
> [1]: https://github.com/moby/moby/pull/29076#commitcomment-21831387
> [2]: dcf2632945
> [3]: https://kubernetes.io/docs/tutorials/security/seccomp/#enable-the-use-of-runtimedefault-as-the-default-seccomp-profile-for-all-workloads
> [4]: https://github.com/kubevirt/kubevirt/pull/8546
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit allows to remove dependency on the mutable version armon/go-radix.
The go-immutable-radix package is better maintained.
It is likely that a bit more memory will be used when using the
immutable version, though discarded nodes are being reused in a pool.
These changes happen when networks are added/removed or nodes come and
go in a cluster, so we are still talking about a relatively low
frequency event.
The major changes compared to the old radix are when modifying (insert
or delete) a tree, and those are pretty self-contained: we replace the
entire immutable tree under a lock.
Signed-off-by: Tibor Vass <teabee89@gmail.com>
This makes the `ImageList` function to add `shared-size=1` to the url
query when user caller sets the SharedSize.
SharedSize support was introduced in API version 1.42. This field was
added to the options struct, but client wasn't adjusted.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
We only need the content here, not the checksum, so simplifying the code by
just using os.ReadFile().
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We only need the content here, not the checksum, so simplifying the code by
just using os.ReadFile().
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some of these options required parsing the resolv.conf file, but the function
could return an error further down; this patch moves the parsing closer to
where their results are used (which may not happen if we're encountering an
error before).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We only need the content here, not the checksum, so simplifying the code by
just using os.ReadFile().
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The existing code was always parsing the host's resolv.conf to read
the nameservers, searchdomain and options, but those options were
only needed if these options were not configured on the sandbox.
This patch reverses the logic to only parse the resolv.conf if
no options are present in the sandbox configuration.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We only need the content here, not the checksum, so simplifying the
code by just using os.ReadFile().
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This one is a "bit" fuzzy, as it may not be _directly_ related to `archive`,
but it's always used _in combination_ with the archive package, so moving it
there.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This patch:
- Deprecates pkg/system.DefaultPathEnv
- Moves the implementation inside oci
- Adds TODOs to align the default in the Builder with the one used elsewhere
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The singleflight function was capturing the context.Context of the first
caller that invoked the `singleflight.Do`. This could cause all
concurrent calls to be cancelled when the first request is cancelled.
singleflight calls were also moved from the ImageService to Daemon, to
avoid having to implement this logic in both graphdriver and containerd
based image services.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This is only used for tests, and the key is not verified anymore, so
instead of creating a key and storing it, we can just use an ad-hoc
one.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Turned out that the loadOrCreateTrustKey() utility was doing exactly the
same as libtrust.LoadOrCreateTrustKey(), so making it a thin wrapped. I kept
the tests to verify the behavior, but we could remove them as we only need this
for our integration tests.
The storage location for the generated key was changed (again as we only need
this for some integration tests), so we can remove the TrustKeyPath from the
config.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The migration code is in the 22.06 branch, and if we don't migrate
the only side-effect is the daemon's ID being regenerated (as a
UUID).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This `err` is special (as described at the top of the function), but due
to its name is easy to overlook, which risks the chance of inadvertently
shadowing it.
This patch renames the variable to reduce the chance of this happening.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Sam is on my team, and we started to do weekly triage sessions to
clean up the backlog. Adding him, so that he can help with doing
triage without my assistance :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The reference.ParseNormalizedNamed() utility already returns a Named
reference, but we're interested in wether the digest has a digest, so
check for that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Current Dockerfile downloads vpnkit for both linux/amd64
and linux/arm64 platforms even if target platform does not
match. This change will download vpnkit only if target
platform matches, otherwise it will just use a dummy scratch
stage.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
no significant changes in vendored code, other than updating build-tags
for go1.17, but removes some dependencies from the module, which can
help with future updates;
full diff: 3f7ff695ad...abb19827d3
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While this replace was needed in swarmkit itself, it looks like
it doesn't cause issues when removed in this repository, so
let's remove it here.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously we had to use a replace rule, as later versions of this
module resulted in a panic. This issue was fixed in:
f30034d788
Which means we can remove the replace rule, and update the dependency.
No new release was tagged yet, so sticking to a "commit" for now.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Raspberry Pi allows to start system under overlayfs.
Docker is successfully fallbacks to fuse-overlay but not starting
because of the `Error starting daemon: rename /var/lib/docker/runtimes /var/lib/docker/runtimes-old: invalid cross-device link` error
It's happening because `rename` is not supported by overlayfs.
After manually removing directory `runtimes` docker starts and works successfully
Signed-off-by: Illia Antypenko <ilya@antipenko.pp.ua>
using TARGETVARIANT in frozen-images stage implies changes in
`download-frozen-image-v2.sh` script to add support for variants
so we are able to build against more platforms.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
The golang.org/x/ projects are now doing tagged releases.
Some notable changes:
- authhandler: Add support for PKCE
- Introduce new AuthenticationError type returned by errWrappingTokenSource.Token
- Add support to set JWT Audience in JWTConfigFromJSON()
- google/internal: Add AWS Session Token to Metadata Requests
- go.mod: update vulnerable net library
- google: add support for "impersonated_service_account" credential type.
- google/externalaccount: add support for workforce pool credentials
full diff: https://github.com/golang/oauth2/compare/2bc19b11175f...v0.1.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The file will still be available in Git history; we should drop it
however as it is misleading and obsolete.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
This is a partial revert of 7ca0cb7ffa, which
switched from os/exec to the golang.org/x/sys/execabs package to mitigate
security issues (mainly on Windows) with lookups resolving to binaries in the
current directory.
from the go1.19 release notes https://go.dev/doc/go1.19#os-exec-path
> ## PATH lookups
>
> Command and LookPath no longer allow results from a PATH search to be found
> relative to the current directory. This removes a common source of security
> problems but may also break existing programs that depend on using, say,
> exec.Command("prog") to run a binary named prog (or, on Windows, prog.exe) in
> the current directory. See the os/exec package documentation for information
> about how best to update such programs.
>
> On Windows, Command and LookPath now respect the NoDefaultCurrentDirectoryInExePath
> environment variable, making it possible to disable the default implicit search
> of “.” in PATH lookups on Windows systems.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a partial revert of 7ca0cb7ffa, which
switched from os/exec to the golang.org/x/sys/execabs package to mitigate
security issues (mainly on Windows) with lookups resolving to binaries in the
current directory.
from the go1.19 release notes https://go.dev/doc/go1.19#os-exec-path
> ## PATH lookups
>
> Command and LookPath no longer allow results from a PATH search to be found
> relative to the current directory. This removes a common source of security
> problems but may also break existing programs that depend on using, say,
> exec.Command("prog") to run a binary named prog (or, on Windows, prog.exe) in
> the current directory. See the os/exec package documentation for information
> about how best to update such programs.
>
> On Windows, Command and LookPath now respect the NoDefaultCurrentDirectoryInExePath
> environment variable, making it possible to disable the default implicit search
> of “.” in PATH lookups on Windows systems.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that this function is only ever called from contexts where a
*testing.T value is available, several improvements can be made to it.
Refactor it to be a test helper function so that callers do not need to
check and fail the test themselves. Leverage t.TempDir() so that the
temporary file is cleaned up automatically. Change it to return a single
config.Option to get better ergonomics at call sites.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestCreateParallel, which was ostensibly added as a regression test for
race conditions inside the bridge driver, contains a race condition. The
getIPv4Data() calls race the network configuration and so will sometimes
see the existing address assignments return IP address ranges which do
not conflict with them. While normally a good thing, the test asserts
that exactly one of the 100 networks is successfully created. Pass the
same IPAM data when attempting to create every network to ensure that
the address ranges conflict.
Signed-off-by: Cory Snider <csnider@mirantis.com>
addresses() would incorrectly return all IP addresses assigned to any
interface in the network namespace if exists() is false. This went
unnoticed as the unit test covering this case tested the method inside a
clean new network namespace, which had no interfaces brought up and
therefore no IP addresses assigned. Modifying
testutils.SetupTestOSContext() to bring up the loopback interface 'lo'
resulted in the loopback addresses 127.0.0.1 and [::1] being assigned to
the loopback interface, causing addresses() to return the loopback
addresses and TestAddressesEmptyInterface to start failing. Fix the
implementation of addresses() so that it only ever returns addresses for
the bridge interface.
Signed-off-by: Cory Snider <csnider@mirantis.com>
portallocator.PortAllocator holds persistent state on port allocations
in the network namespace. It normal operation it is used as a singleton,
which is a problem in unit tests as every test runs in a clean network
namespace. The singleton "default" PortAllocator remembers all the port
allocations made in other tests---in other network namespaces---and
can result in spurious test failures. Refactor the bridge driver so that
tests can construct driver instances with unique PortAllocators.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The global netlink handle ns.NlHandle() is indirectly cached for the
life of the process by the netutils.CheckRouteOverlaps() function. This
caching behaviour is a problem for the libnetwork unit tests as the
global netlink handle changes every time testutils.SetupTestOSContext()
is called, i.e. at the start of nearly every test case. Route overlaps
can be checked for in the wrong network namespace, causing spurious test
failures e.g. when running the same test twice in a row with -count=2.
Stop the netlink handle from being cached by shadowing the package-scope
variable with a function-scoped one.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestParallel has been written in an unusual style which relies on the
testing package's intra-test parallelism feature and lots of global
state to test one thing using three cooperating parallel tests. It is
complicated to reason about and quite brittle. For example, the command
go test -run TestParallel1 ./libnetwork
would deadlock, waiting until the test timeout for TestParallel2 and
TestParallel3 to run. And the test would be skipped if the
'-test.parallel' flag was less than three, either explicitly or
implicitly (default: GOMAXPROCS).
Overhaul TestParallel to address the aforementioned deficiencies and
get rid of mutable global state.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Reusing the same "OS context" (read: network namespace) and
NetworkController across multiple tests risks tests interfering with
each other, or worse: _depending on_ other tests to set up
preconditions. Construct a new controller for every test which needs
one, and run every test which mutates or inspects the host environment
in a fresh OS context.
The only outlier is runParallelTests. It is the only remaining test
implementation which references the "global" package-scoped controller,
so the global controller instance is effectively private to that one
test.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Sharing a single NetworkController instance between multiple tests
makes it possible for tests to interfere with each other. As a first
step towards giving each test its own private controller instance, make
explicit which controller createTestNetwork() creates the test network
on.
Signed-off-by: Cory Snider <csnider@mirantis.com>
There are a handful of libnetwork tests which need to have multiple
concurrent goroutines inside the same test OS context (network
namespace), including some platform-agnostic tests. Provide test
utilities for spawning goroutines inside an OS context and
setting/restoring an existing goroutine's OS context to abstract away
platform differences and so that each test does not need to reinvent the
wheel.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The SetupTestOSContext calls were made conditional in
https://github.com/moby/libnetwork/pull/148 to work around limitations
in runtime.LockOSThread() which existed before Go 1.10. This workaround
is no longer necessary now that runtime.UnlockOSThread() needs to be
called an equal number of times before the goroutine is unlocked from
the OS thread.
Unfortunately some tests break when SetupTestOSContext is not skipped.
(Evidently this code path has not been exercised in a long time.) A
newly-created network namespace is very barebones: it contains a
loopback interface in the down state and little else. Even pinging
localhost does not work inside of a brand new namespace. Set the
loopback interface to up during namespace setup so that tests which
need to use the loopback interface can do so.
Signed-off-by: Cory Snider <csnider@mirantis.com>
If the GenericData option is missing from the map, the bridge driver
would skip configuration. At first glance this seems fine: the driver
defaults are to not configure anything. But it also skips over
initializing the persistent storage, which is configured through other
option keys in the map. Fix this oversight by removing the early return.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Each call to datastore.DefaultScopes() would modify and return the same
map. Consequently, some of the config for every NetworkController in the
process would be mutated each time one is constructed or reconfigured.
This behaviour is unexpected, unintended, and undesirable. Stop
accidentally sharing configuration by changing DefaultScopes() to return
a distinct map on each call.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(*networkNamespace).InvokeFunc() cleaned up the state of the locked
thread by setting its network namespace to the netns of the goroutine
which called InvokeFunc(). If InvokeFunc() was to be called after the
caller had modified its thread's network namespace, InvokeFunc() would
incorrectly "restore" the state of its goroutine thread to the wrong
namespace, violating the invariant that unlocked threads are fungible.
Change the implementation to restore the thread's netns to the netns
that particular thread had before InvokeFunc() modified it.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Both of these were deprecated in 55f675811a,
but the format of the GoDoc comments didn't follow the correct format, which
caused them not being picked up by tools as "deprecated".
This patch updates uses in the codebase to use the alternatives.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
opencontainers/go-digest is a 1:1 copy of the one in distribution. It's no
longer used in distribution itself, so may be removed there at some point.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This update:
- removes support for go1.11
- removes the use of "golang.org/x/crypto/ed25519", which is now part of stdlib:
> Beginning with Go 1.13, the functionality of this package was moved to the
> standard library as crypto/ed25519. This package only acts as a compatibility
> wrapper.
Note that this is not the latest release; version v1.1.44 introduced a tools.go
file, which added golang.org/x/tools to the dependency tree (but only used for
"go:generate") see commit:
df84acab71
full diff: https://github.com/miekg/dns/compare/v1.1.27...v1.1.43
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When running inside a container, testns == origns. Consequently, closing
testns causes the deferred netns.Set(origns) call to fail. Stop closing
the aliased original namespace handle.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestResolvConfHost and TestParallel both depended on a network named
"testhost" already existing in the libnetwork controller. Only TestHost
created that network, so the aforementioned tests would fail unless run
after TestHost. Fix those tests so they can be run in any order.
Signed-off-by: Cory Snider <csnider@mirantis.com>
While this was convenient for our use, it's somewhat unexpected for a function
that writes a file to also create all parent directories; even more because
this function may be executed as root.
This patch makes the package more "safe" to use as a generic package by removing
this functionality, and leaving it up to the caller to create parent directories,
if needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
unix.Kill() does not produce an error for PID 0, -1. As a result, checking
process.Alive() would return "true" for both 0 and -1 on macOS (and previously
on Linux as well).
Let's shortcut these values to consider them "not alive", to prevent someone
trying to kill them.
A basic test was added to check the behavior.
Given that the intent of these functions is to handle single processes, this patch
also prevents 0 and negative values to be used.
From KILL(2): https://man7.org/linux/man-pages/man2/kill.2.html
If pid is positive, then signal sig is sent to the process with
the ID specified by pid.
If pid equals 0, then sig is sent to every process in the process
group of the calling process.
If pid equals -1, then sig is sent to every process for which the
calling process has permission to send signals, except for
process 1 (init), but see below.
If pid is less than -1, then sig is sent to every process in the
process group whose ID is -pid.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Using the implementation from pkg/pidfile for windows, as that implementation
looks to be handling more cases to check if a process is still alive (or to be
considered alive).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the file doesn't exist, the process isn't running, so we should be able
to ignore that.
Also remove an intermediate variable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 48dd89375d...6341884e5f
Pulls in a set of fixes to SwarmKit's nascent Cluster Volumes support
discovered during subsequent development and testing.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Deleting a containerd task whose status is Created fails with a
"precondition failed" error. This is because (aside from Windows)
a process is spawned when the task is created, and deleting the task
while the process is running would leak the process if it was allowed.
libcontainerd and the containerd plugin executor mistakenly try to clean
up from a failed start by deleting the created task, which will always
fail with the aforementined error. Change them to pass the
`WithProcessKill` delete option so the cleanup has a chance to succeed.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Currently an attempt to pull a reference which resolves to an OCI
artifact (Helm chart for example), results in a bit unrelated error
message `invalid rootfs in image configuration`.
This provides a more meaningful error in case a user attempts to
download a media type which isn't image related.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This fallback is used when we filter the manifest list by the user-provided platform and find no matches such that we match the previous Docker behavior (before it supported variant matching). This has been deprecated long enough that I think it's time we finally stop supporting this weird fallback, especially since it makes for buggy behavior like `docker pull --platform linux/arm/v5 alpine:3.16` leading to a `linux/arm/v6` image being pulled (I specified a variant, every manifest list entry specifies a variant, so clearly the only behavior I as a user could reasonably expect is an error that `linux/arm/v5` is not supported, but instead I get an explicitly incompatible image despite doing everything I as a user can to prevent that situation).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
This debug message already includes a full platform string, so this ends up being something like `linux/arm/v7/amd64` in the end result.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
On Windows, syscall.StartProcess and os/exec.Cmd did not properly
check for invalid environment variable values. A malicious
environment variable value could exploit this behavior to set a
value for a different environment variable. For example, the
environment variable string "A=B\x00C=D" set the variables "A=B" and
"C=D".
Thanks to RyotaK (https://twitter.com/ryotkak) for reporting this
issue.
This is CVE-2022-41716 and Go issue https://go.dev/issue/56284.
This Go release also fixes https://github.com/golang/go/issues/56309, a
runtime bug which can cause random memory corruption when a goroutine
exits with runtime.LockOSThread() set. This fix is necessary to unblock
work to replace certain uses of pkg/reexec with unshared OS threads.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The `execCmd()` utility was a basic wrapper around `exec.Command()`. Inlining it
makes the code more transparent.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the example actually do something, and include the output, so that it
shows up in the documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some of these tests are failing (but not enabled in CI), but the current output
doesn't provide any details on the failure, so this patch is just to improve the
test output to allow debugging the actual failure.
Before this, tests would fail like:
make BIND_DIR=. TEST_FILTER=TestPluginInstallImage test-integration
...
=== FAIL: amd64.integration-cli TestDockerPluginSuite/TestPluginInstallImage (15.22s)
docker_cli_plugins_test.go:220: assertion failed: expression is false: strings.Contains(out, `Encountered remote "application/vnd.docker.container.image.v1+json"(image) when fetching`)
--- FAIL: TestDockerPluginSuite/TestPluginInstallImage (15.22s)
With this patch, tests provide more useful output:
make BIND_DIR=. TEST_FILTER=TestPluginInstallImage test-integration
...
=== FAIL: amd64.integration-cli TestDockerPluginSuite/TestPluginInstallImage (1.15s)
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json"
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json" digest="sha256:bee151d3fef5c1f787e7846efe4fa42b25a02db4e7543e54e8c679cf19d78598"
mediatype=application/vnd.docker.plugin.v1+json size=522
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json"
time="2022-10-18T10:21:22Z" level=warning msg="reference for unknown type: application/vnd.docker.plugin.v1+json" digest="sha256:bee151d3fef5c1f787e7846efe4fa42b25a02db4e7543e54e8c679cf19d78598"
mediatype=application/vnd.docker.plugin.v1+json size=522
docker_cli_plugins_test.go:221: assertion failed: string "Error response from daemon: application/vnd.docker.distribution.manifest.v1+prettyjws not supported\n" does not contain "Encountered remote
\"application/vnd.docker.container.image.v1+json\"(image) when fetching"
--- FAIL: TestDockerPluginSuite/TestPluginInstallImage (1.15s)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The ParseLink() function has special handling for legacy formats;
> This is kept because we can actually get a HostConfig with links
> from an already created container and the format is not `foo:bar`
> but `/foo:/c1/bar`
This patch adds a test-case for this format. While updating, also switching
to use gotest.tools assertions.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The new daemon.containerFSView type covers all the use-cases on Linux
with a much more intuitive API, but is not portable to Windows.
Discourage people from using the old and busted functions in new Linux
code by excluding them entirely from Linux builds.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Mounting a container's volumes under its rootfs directory inside the
host mount namespace causes problems with cross-namespace mount
propagation when /var/lib/docker is bind-mounted into the container as a
volume. The mount event propagates into the container's mount namespace,
overmounting the volume, but the propagated unmount events do not fully
reverse the effect. Each archive operation causes the mount table in the
container's mount namespace to grow larger and larger, until the kernel
limiton the number of mounts in a namespace is hit. The only solution to
this issue which is not subject to race conditions or other blocker
caveats is to avoid mounting volumes into the container's rootfs
directory in the host mount namespace in the first place.
Mount the container volumes inside an unshared mount namespace to
prevent any mount events from propagating into any other mount
namespace. Greatly simplify the archiving implementations by also
chrooting into the container rootfs to sidestep the need to resolve
paths in the host.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing archive implementation is not easy to reason about by
reading the source. Prepare to rewrite it by covering more edge cases in
tests. The new test cases were determined by black-box characterizing
the existing behaviour.
Signed-off-by: Cory Snider <csnider@mirantis.com>
It is only applicable to Windows so it does not need to be called from
platform-generic code. Fix locking in the Windows implementation.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Linux implementation needs to diverge significantly from the Windows
one in order to fix platform-specific bugs. Cut the generic
implementation out of daemon/archive.go and paste identical, verbatim
copies of that implementation into daemon/archive_{windows,linux}.go to
make it easier to compare the progression of changes to the respective
implementations through Git history.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This change was introduced early in the development of rootless support,
before all the kinks were worked out and rootlesskit was built. The
author was testing the daemon by inside a user namespace set up by runc,
observed that the unshare(2) syscall was returning EPERM, and assumed
that it was a fundamental limitation of user namespaces. Seeing as the
kernel documentation (of today) disagrees with that assessment and that
unshare demonstrably works inside user namespaces, I can only assume
that the EPERM was due to a quirk of their test environment, such as a
seccomp filter set up by runc blocking the unshare syscall.
https://github.com/moby/moby/pull/20902#issuecomment-236409406
Mount namespaces are necessary to address #38995 and #43390. Revert the
special-casing so those issues can also be fixed for rootless daemons.
This reverts commit dc950567c1.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Unshare the thread's file system attributes and, if applicable, mount
namespace so that the chroot operation does not affect the rest of the
process.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The applyLayer implementation in pkg/chrootarchive has to set the TMPDIR
environment variable so that archive.UnpackLayer() can successfully
create the whiteout-file temp directory. Change UnpackLayer to create
the temporary directory under the destination path so that environment
variables do not need to be touched.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This fix tries to address issues raised in #44346.
The max-concurrent-downloads and max-concurrent-uploads limits are applied for the whole engine and not for each pull/push command.
Signed-off-by: Luis Henrique Mulinari <luis.mulinari@gmail.com>
Aside from unconditionally unlocking the OS thread even if restoring the
thread's network namespace fails, func (*networkNamespace).InvokeFunc()
correctly implements invoking a function inside a network namespace.
This is far from obvious, however. func InitOSContext() does much of the
heavy lifting but in a bizarre fashion: it restores the initial network
namespace before it is changed in the first place, and the cleanup
function it returns does not restore the network namespace at all! The
InvokeFunc() implementation has to restore the network namespace
explicitly by deferring a call to ns.SetNamespace().
func InitOSContext() is a leaky abstraction taped to a footgun. On the
one hand, it defensively resets the current thread's network namespace,
which has the potential to fix up the thread state if other buggy code
had failed to maintain the invariant that an OS thread must be locked to
a goroutine unless it is interchangeable with a "clean" thread as
spawned by the Go runtime. On the other hand, it _facilitates_ writing
buggy code which fails to maintain the aforementioned invariant because
the cleanup function it returns unlocks the thread from the goroutine
unconditionally while neglecting to restore the thread's network
namespace! It is quite scary to need a function which fixes up threads'
network namespaces after the fact as an arbitrary number of goroutines
could have been scheduled onto a "dirty" thread and run non-libnetwork
code before the thread's namespace is fixed up. Any number of
(not-so-)subtle misbehaviours could result if an unfortunate goroutine
is scheduled onto a "dirty" thread. The whole repository has been
audited to ensure that the aforementioned invariant is never violated,
making after-the-fact fixing up of thread network namespaces redundant.
Make InitOSContext() a no-op on Linux and inline the thread-locking into
the function (singular) which previously relied on it to do so.
func ns.SetNamespace() is of similarly dubious utility. It intermixes
capturing the initial network namespace and restoring the thread's
network namespace, which could result in threads getting put into the
wrong network namespace if the wrong thread is the first to call it.
Delete it entirely; functions which need to manipulate a thread's
network namespace are better served by being explicit about capturing
and restoring the thread's namespace.
Rewrite InvokeFunc() to invoke the closure inside a goroutine to enable
a graceful and safe recovery if the thread's network namespace could not
be restored. Avoid any potential race conditions due to changing the
main thread's network namespace by preventing the aforementioned
goroutines from being eligible to be scheduled onto the main thread.
Signed-off-by: Cory Snider <csnider@mirantis.com>
func (*network) watchMiss() correctly locks its goroutine to an OS
thread before changing the thread's network namespace, but neglects to
restore the thread's network namespace before unlocking. Fix this
oversight by unlocking iff the thread's network namespace is
successfully restored.
Prevent the watchMiss goroutine from being locked to the main thread to
avoid the issues which would arise if such a situation was to occur.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The parallel tests were unconditionally unlocking the test case
goroutine from the OS thread, irrespective of whether the thread's
network namespace was successfully restored. This was not a problem in
practice as the unpaired calls to runtime.LockOSThread() peppered
through the test case would have prevented the goroutine from being
unlocked. Unlock the goroutine from the thread iff the thread's network
namespace is successfully restored.
Signed-off-by: Cory Snider <csnider@mirantis.com>
testutils.SetupTestOSContext() sets the calling thread's network
namespace but neglected to restore it on teardown. This was not a
problem in practice as it called runtime.LockOSThread() twice but
runtime.UnlockOSThread() only once, so the tampered threads would be
terminated by the runtime when the test case returned and replaced with
a clean thread. Correct the utility so it restores the thread's network
namespace during teardown and unlocks the goroutine from the thread on
success.
Remove unnecessary runtime.LockOSThread() calls peppering test cases
which leverage testutils.SetupTestOSContext().
Signed-off-by: Cory Snider <csnider@mirantis.com>
cmd.Environ() is new in go1.19, and not needed for this specific case.
Without this, trying to use this package in code that uses go1.18 will fail;
builder/remotecontext/git/gitutils.go:216:23: cmd.Environ undefined (type *exec.Cmd has no field or method Environ)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- On Windows, we don't build and run a local test registry (we're not running
docker-in-docker), so we need to skip this test.
- On rootless, networking doesn't support this (currently)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is accomplished by storing the distribution source in the content
labels. If the distribution source is not found then we check to the
registry to see if the digest exists in the repo, if it does exist then
the puller will use it.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
GitHub uses these parameters to construct a name; removing the ./ prefix
to make them more readable (and add them back where it's used)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Setting cmd.Env overrides the default of passing through the parent
process' environment, which works out fine most of the time, except when
it doesn't. For whatever reason, leaving out all the environment causes
git-for-windows sh.exe subprocesses to enter an infinite loop of
access violations during Cygwin initialization in certain environments
(specifically, our very own dev container image).
Signed-off-by: Cory Snider <csnider@mirantis.com>
While it is undesirable for the system or user git config to be used
when the daemon clones a Git repo, it could break workflows if it was
unconditionally applied to docker/cli as well.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Prevent git commands we run from reading the user or system
configuration, or cloning submodules from the local filesystem.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Keep It Simple! Set the working directory for git commands by...setting
the git process's working directory. Git commands can be run in the
parent process's working directory by passing the empty string.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Make the test more debuggable by logging all git command output and
running each table-driven test case as a subtest.
Signed-off-by: Cory Snider <csnider@mirantis.com>
`system.MkdirAll()` is a special version of os.Mkdir to handle creating directories
using Windows volume paths (`"\\?\Volume{4c1b02c1-d990-11dc-99ae-806e6f6e6963}"`).
This may be important when `MkdirAll` is used, which traverses all parent paths to
create them if missing (ultimately landing on the "volume" path).
The daemon.NewDaemon() function used `system.MkdirAll()` in various places where
a subdirectory within `daemon.Root` was created. This appeared to be mostly out
of convenience (to not have to handle `os.ErrExist` errors). The `daemon.Root`
directory should already be set up in these locations, and should be set up with
correct permissions. Using `system.MkdirAll()` would potentially mask errors if
the root directory is missing, and instead set up parent directories (possibly
with incorrect permissions).
Because of the above, this patch changes `system.MkdirAll` to `os.Mkdir`. As we
are changing these lines, this patch also changes the legacy octal notation
(`0700`) to the now preferred `0o700`.
One location continues to use `system.MkdirAll`, as the temp-directory may be
configured to be outside of `daemon.Root`, but a redundant `os.Stat(realTmp)`
was removed, as `system.MkdirAll` is expected to handle this.
As we are changing these lines, this patch also changes the legacy octal notation
(`0700`) to the now preferred `0o700`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it more transparent that it's unused for Linux,
and we don't pass "root", which has no relation with the
path on Linux.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows us to run CI with the containerd snapshotter enabled, without
patching the daemon.json, or changing how tests set up daemon flags.
A warning log is added during startup, to inform if this variable is set,
as it should only be used for our integration tests.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We were discarding the underlying error, which made it impossible for
callers to detect (e.g.) an os.ErrNotExist.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use t.TempDir() to make sure we're testing from a clean state
- improve checks for errors to have the correct error-type where possible
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
system.MkdirAll is a special version of os.Mkdir to handle creating directories
using Windows volume paths ("\\?\Volume{4c1b02c1-d990-11dc-99ae-806e6f6e6963}").
This may be important when MkdirAll is used, which traverses all parent paths to
create them if missing (ultimately landing on the "volume" path).
Commit 62f648b061 introduced the system.MkdirAll
calls, as a change was made in applyLayer() for Windows to use Windows volume
paths as an alternative for chroot (which is not supported on Windows). Later
iteractions changed this to regular Windows long-paths (`\\?\<path>`) in
230cfc6ed2, and 9b648dfac6.
Such paths are handled by the `os` package.
However, in these tests, the parent path already exists (all paths created are
a direct subdirectory within `tmpDir`). It looks like `MkdirAll` here is used
out of convenience to not have to handle `os.ErrExist` errors. As all these
tests are running in a fresh temporary directory, there should be no need to
handle those, and it's actually desirable to produce an error in that case, as
the directory already existing would be unexpected.
Because of the above, this test changes `system.MkdirAll` to `os.Mkdir`. As we
are changing these lines, this patch also changes the legacy octal notation
(`0700`) to the now preferred `0o700`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Introduced in 3ac6394b80, which makes no mention
of a reason for extracting to the same directory as we created the archive from,
so I assume this was a copy/paste mistake and the path was meant to be "dest",
not "src".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, Docker Hub was excluded when configuring "allow-nondistributable-artifacts".
With the updated policy announced by Microsoft, we can remove this restriction;
https://techcommunity.microsoft.com/t5/containers/announcing-windows-container-base-image-redistribution-rights/ba-p/3645201
There are plans to deprecated support for foreign layers altogether in the OCI,
and we should consider to make this option the default, but as that requires
deprecating the option (and possibly keeping an "opt-out" option), we can look
at that separately.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The implementation of CanAccess() is very rudimentary, and should
not be used for anything other than a basic check (and maybe not
even for that). It's only used in a single location in the daemon,
so move it there, and un-export it to not encourage others to use
it out of context.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type felt really redundant; `pidfile.New()` takes the path of the file to
create as an argument, so this is already known. The only thing the PIDFile
type provided was a `Remove()` method, which was just calling `os.Remove()` on
the path of the file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use bytes.TrimSpace instead of using the strings package, which is
more performant, and allows us to skip the intermediate variable.
Also combined some "if" statements to reduce cyclomatic complexity.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's ok to ignore if the file doesn't exist, or if the file doesn't
have a PID in it, but we should produce an error if the file exists,
but we're unable to read it for other reasons.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The same attribute was generated for each path that was created, but always
the same, so instead of generating it in each iteration, generate it once,
and pass it to our mkdirall() implementation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The regex only matched volume paths without a trailing path-separator. In cases
where a path would be passed with a trailing path-separator, it would depend on
further code in mkdirall to strip the trailing slash, then to perform the regex
again in the next iteration.
While regexes aren't ideal, we're already executing this one, so we may as well
use it to match those situations as well (instead of executing it twice), to
allow us to return early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Ideally, we would construct this lazily, but adding a function and a
sync.Once felt like a bit "too much".
Also updated the GoDoc for some functions to better describe what they do.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The pkg-imports validation prevents reusable library packages from
depending on the whole daemon, accidentally or intentionally. The
allowlist is overly restrictive as it also prevents us from reusing code
in both pkg/ and daemon/ unless that code is also made into a reusable
library package under pkg/. Allow pkg/ packages to import internal/
packages which do not transitively depend on disallowed packages.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Previously we waited for 60 seconds after the service faults to restart
it. However, there isn't much benefit to waiting this long. We expect
15 seconds to be a more reasonable delay.
Co-Authored-by: Kevin Parsons <kevpar@microsoft.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Building off insights from the great work Cory Snider has been doing,
this replaces a reexec with a much lower overhead implementation which
performs the `Chddir` in a new goroutine that is locked to a specific
thread with CLONE_FS unshared.
The thread is thrown away afterwards and the Chdir does effectively the
same thing as what the reexec was being used for.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
golang.org/x/sys/windows now implements this, so we can use that
instead of a local implementation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `IsAnInteractiveSession` was deprecated, and `IsWindowsService` is marked
as the recommended replacement.
For details, see 280f808b4a
> CL 244958 includes isWindowsService function that determines if a
> process is running as a service. The code of the function is based on
> public .Net implementation.
>
> IsAnInteractiveSession function implements similar functionality, but
> is based on an old Stackoverflow post., which is not as authoritative
> as code written by Microsoft for their official product.
>
> This change copies CL 244958 isWindowsService function into svc package
> and makes it public. The intention is that future users will prefer
> IsWindowsService to IsAnInteractiveSession.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go-winio now defines this function, so we can consume that.
Note that there's a difference between the old implementation and the original
one (added in 1cb9e9b44e). The old implementation
had special handling for win32 error codes, which was removed in the go-winio
implementation in 0966e1ad56
As `go-winio.GetFileSystemType()` calls `filepath.VolumeName(path)` internally,
this patch also removes the `string(home[0])`, which is redundant, and could
potentially panic if an empty string would be passed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 955c1f881a (Docker v17.12.0) replaced
detection of support for multiple lowerdirs (as required by overlay2) to not
depend on the kernel version. The `overlay2.override_kernel_check` was still
used to print a warning that older kernel versions may not have full support.
After this, commit e226aea280 (Docker v20.10.0,
backported to v19.03.7) removed uses of the `overlay2.override_kernel_check`
option altogether, but we were still parsing it.
This patch changes the `parseOptions()` function to not parse the option,
printing a deprecation warning instead. We should change this to be an error,
but the `overlay2.override_kernel_check` option was not deprecated in the
documentation, so keeping it around for one more release.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These consts were defined locally, but are now defined in golang.org/x/sys, so
we can use those.
Also added some documentation about how this function works, taking the description
from the GetExitCodeProcess function (processthreadsapi.h) API reference:
https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getexitcodeprocess
> The GetExitCodeProcess function returns a valid error code defined by the
> application only after the thread terminates. Therefore, an application should
> not use `STILL_ACTIVE` (259) as an error code (`STILL_ACTIVE` is a macro for
> `STATUS_PENDING` (minwinbase.h)). If a thread returns `STILL_ACTIVE` (259) as
> an error code, then applications that test for that value could interpret it
> to mean that the thread is still running, and continue to test for the
> completion of the thread after the thread has terminated, which could put
> the application into an infinite loop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Most of the package was using stdlib's errors package, so replacing two calls
to pkg/errors with stdlib. Also fixing capitalization of error strings.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On unix, it's an alias for os.MkdirAll, so remove its use to be
more transparent what's being used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Merge the accessible() function into CanAccess, and check world-
readable permissions first, before checking owner and group.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the IoctlRetInt, IoctlSetInt and IoctlLoopSetStatus64 helper
functions defined in the golang.org/x/sys/unix package instead of
manually wrapping these using a locally defined function.
Inspired by 3cc3d8a560
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tests were effectively doing "subtests", using comments to describe each,
however;
- due to the use of `t.Fatal()` would terminate before completing all "subtests"
- The error returned by the function being tested (`Chtimes`), was not checked,
and the test used "indirect" checks to verify if it worked correctly. Adding
assertions to check if the function didn't produce an error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Removing the "Linux" suffix from one test, which should probably be
rewritten to be run on "unix", to provide test-coverage for those
implementations.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
With t.TempDir(), some of the test-utilities became so small that
it was more transparent to inline them. This also helps separating
concenrs, as we're in the process of thinning out and decoupling
some packages.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It looks like this function was converting the time (`windows.NsecToTimespec()`),
only to convert it back (`windows.TimespecToNsec()`). This became clear when
moving the lines together:
```go
ctimespec := windows.NsecToTimespec(ctime.UnixNano())
c := windows.NsecToFiletime(windows.TimespecToNsec(ctimespec))
```
And looking at the Golang code, it looks like they're indeed the exact reverse:
```go
func TimespecToNsec(ts Timespec) int64 { return int64(ts.Sec)*1e9 + int64(ts.Nsec) }
func NsecToTimespec(nsec int64) (ts Timespec) {
ts.Sec = nsec / 1e9
ts.Nsec = nsec % 1e9
return
}
```
While modifying this code, also renaming the `e` variable to a more common `err`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This more closely matches to how it's used everywhere. Also move the comment
describing "what" ChTimes() does inside its GoDoc.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code caused me some head-scratches, and initially I wondered
if this was a bug, but it looks to be intentional to set nsec, not
sec, as time.Unix() internally divides nsec, and sets sec accordingly;
https://github.com/golang/go/blob/go1.19.2/src/time/time.go#L1364-L1380
// Unix returns the local Time corresponding to the given Unix time,
// sec seconds and nsec nanoseconds since January 1, 1970 UTC.
// It is valid to pass nsec outside the range [0, 999999999].
// Not all sec values have a corresponding time value. One such
// value is 1<<63-1 (the largest int64 value).
func Unix(sec int64, nsec int64) Time {
if nsec < 0 || nsec >= 1e9 {
n := nsec / 1e9
sec += n
nsec -= n * 1e9
if nsec < 0 {
nsec += 1e9
sec--
}
}
return unixTime(sec, int32(nsec))
}
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was moved to a separate file in fe5b34ba88,
but it's unclear why it was moved (as this file is not excluded on Windows).
Moving the code back into the chtimes file, to move it closer to where it's used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a couple of places, and in most places shouldn't be used
as those locations were in unix/linux-only files, so didn't need the wrapper.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal
will be sent to the process when the OS thread on which cmd.Start() was
executed dies. The runtime terminates an OS thread when a goroutine
exits after being wired to the thread with runtime.LockOSThread(). If
other goroutines are allowed to be scheduled onto a thread which called
cmd.Start(), an unrelated goroutine could cause the thread to be
terminated and prematurely signal the command. See
https://github.com/golang/go/issues/27505 for more information.
Prevent started subprocesses with Pdeathsig from getting signaled
prematurely by wiring the starting goroutine to the OS thread until the
subprocess has exited. No other goroutines can be scheduled onto a
locked thread so it will remain alive until unlocked or the daemon
process exits.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This utility was only used in a single place, and had no external consumers.
Move it to where it's used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The pkg/fsutils package was forked in containerd, and later moved to
containerd/continuity/fs. As we're moving more bits to containerd, let's also
use the same implementation to reduce code-duplication and to prevent them from
diverging.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was added in 442b45628e as part of
user-namespaces, and first used in 44e1023a93 to
set up the daemon root, and move the existing content;
44e1023a93/daemon/daemon_experimental.go (L68-L71)
A later iteration no longer _moved_ the existing root directory, and removed the
use of `directory.MoveToSubdir()` e8532023f2
It looks like there's no external consumers of this utility, so we should be
save to remove it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- separate exported function from implementation, to allow for GoDoc to be
maintained in a single location.
- don't use named return variables (no "bare" return, and potentially shadowing
variables)
- reverse the `os.IsNotExist(err) && d != dir` condition, putting the "lighter"
`d != dir` first.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I noticed the comment above this code, but didn't see a corresponding type-cast.
Looking at this file's history, I found that these were removed as part of
2f5f0af3fd, which looks to have overlooked some
deliberate type-casts.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This adds a new filter argument to the volume prune endpoint "all".
When this is not set, or it is a false-y value, then only anonymous
volumes are considered for pruning.
When `all` is set to a truth-y value, you get the old behavior.
This is an API change, but I think one that is what most people would
want.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
From the mailing list:
We have just released Go versions 1.19.2 and 1.18.7, minor point releases.
These minor releases include 3 security fixes following the security policy:
- archive/tar: unbounded memory consumption when reading headers
Reader.Read did not set a limit on the maximum size of file headers.
A maliciously crafted archive could cause Read to allocate unbounded
amounts of memory, potentially causing resource exhaustion or panics.
Reader.Read now limits the maximum size of header blocks to 1 MiB.
Thanks to Adam Korczynski (ADA Logics) and OSS-Fuzz for reporting this issue.
This is CVE-2022-2879 and Go issue https://go.dev/issue/54853.
- net/http/httputil: ReverseProxy should not forward unparseable query parameters
Requests forwarded by ReverseProxy included the raw query parameters from the
inbound request, including unparseable parameters rejected by net/http. This
could permit query parameter smuggling when a Go proxy forwards a parameter
with an unparseable value.
ReverseProxy will now sanitize the query parameters in the forwarded query
when the outbound request's Form field is set after the ReverseProxy.Director
function returns, indicating that the proxy has parsed the query parameters.
Proxies which do not parse query parameters continue to forward the original
query parameters unchanged.
Thanks to Gal Goldstein (Security Researcher, Oxeye) and
Daniel Abeles (Head of Research, Oxeye) for reporting this issue.
This is CVE-2022-2880 and Go issue https://go.dev/issue/54663.
- regexp/syntax: limit memory used by parsing regexps
The parsed regexp representation is linear in the size of the input,
but in some cases the constant factor can be as high as 40,000,
making relatively small regexps consume much larger amounts of memory.
Each regexp being parsed is now limited to a 256 MB memory footprint.
Regular expressions whose representation would use more space than that
are now rejected. Normal use of regular expressions is unaffected.
Thanks to Adam Korczynski (ADA Logics) and OSS-Fuzz for reporting this issue.
This is CVE-2022-41715 and Go issue https://go.dev/issue/55949.
View the release notes for more information: https://go.dev/doc/devel/release#go1.19.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7b153b9e28 updated the main
swagger file, but didn't update the v1.42 version used for the
documentation as it wasn't created yet at the time.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this change, the awslogs collectBatch and processEvent
function documentation still referenced the batchPublishFrequency
constant which was removed in favor of the configurable log stream
forceFlushInterval member.
Signed-off-by: Austin Vazquez <macedonv@amazon.com>
Before this change restarting the daemon in live-restore with running
containers + a restart policy meant that volume refs were not restored.
This specifically happens when the container is still running *and*
there is a restart policy that would make sure the container was running
again on restart.
The bug allows volumes to be removed even though containers are
referencing them. 😱
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This package was moved to a separate repository, using the steps below:
# install filter-repo (https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
brew install git-filter-repo
cd ~/projects
# create a temporary clone of docker
git clone https://github.com/docker/docker.git moby_pubsub_temp
cd moby_pubsub_temp
# for reference
git rev-parse HEAD
# --> 572ca799db
# remove all code, except for pkg/pubsub, license, and notice, and rename pkg/pubsub to /
git filter-repo --path pkg/pubsub/ --path LICENSE --path NOTICE --path-rename pkg/pubsub/:
# remove canonical imports
git revert -s -S 585ff0ebbe6bc25b801a0e0087dd5353099cb72e
# initialize module
go mod init github.com/moby/pubsub
go mod tidy
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Managed containerd processes are executed with SysProcAttr.Pdeathsig set
to syscall.SIGKILL so that the managed containerd is automatically
killed along with the daemon. At least, that is the intention. In
practice, the signal is sent to the process when the creating _OS
thread_ dies! If a goroutine exits while locked to an OS thread, the Go
runtime will terminate the thread. If that thread happens to be the
same thread which the subprocess was started from, the subprocess will
be signaled. Prevent the journald driver from sometimes unintentionally
killing child processes by ensuring that all runtime.LockOSThread()
calls are paired with runtime.UnlockOSThread().
Signed-off-by: Cory Snider <csnider@mirantis.com>
The `docker` CLI currently doesn't handle situations where the current context
(as defined in `~/.docker/config.json`) is invalid or doesn't exist. As loading
(and checking) the context happens during initialization of the CLI, this
prevents `docker context` commands from being used, which makes it complicated
to fix the situation. For example, running `docker context use <correct context>`
would fail, which makes it not possible to update the `~/.docker/config.json`,
unless doing so manually.
For example, given the following `~/.docker/config.json`:
```json
{
"currentContext": "nosuchcontext"
}
```
All of the commands below fail:
```bash
docker context inspect rootless
Current context "nosuchcontext" is not found on the file system, please check your config file at /Users/thajeztah/.docker/config.json
docker context rm --force rootless
Current context "nosuchcontext" is not found on the file system, please check your config file at /Users/thajeztah/.docker/config.json
docker context use default
Current context "nosuchcontext" is not found on the file system, please check your config file at /Users/thajeztah/.docker/config.json
```
While these things should be fixed, this patch updates the script to switch
the context using the `--context` flag; this flag is taken into account when
initializing the CLI, so that having an invalid context configured won't
block `docker context` commands from being executed. Given that all `context`
commands are local operations, "any" context can be used (it doesn't need to
make a connection with the daemon).
With this patch, those commands can now be run (and won't fail for the wrong
reason);
```bash
docker --context=default context inspect -f "{{.Name}}" rootless
rootless
docker --context=default context inspect -f "{{.Name}}" rootless-doesnt-exist
context "rootless-doesnt-exist" does not exist
```
One other issue may also cause things to fail during uninstall; trying to remove
a context that doesn't exist will fail (even with the `-f` / `--force` option
set);
```bash
docker --context=default context rm blablabla
Error: context "blablabla": not found
```
While this is "ok" in most circumstances, it also means that (potentially) the
current context is not reset to "default", so this patch adds an explicit
`docker context use`, as well as unsetting the `DOCKER_HOST` and `DOCKER_CONTEXT`
environment variables.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
runconfig/config_test.go:23:46: empty-lines: extra empty line at the start of a block (revive)
runconfig/config_test.go:75:55: empty-lines: extra empty line at the start of a block (revive)
oci/devices_linux.go:57:34: empty-lines: extra empty line at the start of a block (revive)
oci/devices_linux.go:60:69: empty-lines: extra empty line at the start of a block (revive)
image/fs_test.go:53:38: empty-lines: extra empty line at the end of a block (revive)
image/tarexport/save.go:88:29: empty-lines: extra empty line at the end of a block (revive)
layer/layer_unix_test.go:21:34: empty-lines: extra empty line at the end of a block (revive)
distribution/xfer/download.go:302:9: empty-lines: extra empty line at the end of a block (revive)
distribution/manifest_test.go:154:99: empty-lines: extra empty line at the end of a block (revive)
distribution/manifest_test.go:329:52: empty-lines: extra empty line at the end of a block (revive)
distribution/manifest_test.go:354:59: empty-lines: extra empty line at the end of a block (revive)
registry/config_test.go:323:42: empty-lines: extra empty line at the end of a block (revive)
registry/config_test.go:350:33: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
cmd/dockerd/trap/trap_linux_test.go:29:29: empty-lines: extra empty line at the end of a block (revive)
cmd/dockerd/daemon.go:327:35: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
client/events.go:19:115: empty-lines: extra empty line at the start of a block (revive)
client/events_test.go:60:31: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
api/server/router/build/build_routes.go:239:32: empty-lines: extra empty line at the start of a block (revive)
api/server/middleware/version.go:45:241: empty-lines: extra empty line at the end of a block (revive)
api/server/router/swarm/helpers_test.go:11:44: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
opts/address_pools_test.go:7:39: empty-lines: extra empty line at the end of a block (revive)
opts/opts_test.go:12:42: empty-lines: extra empty line at the end of a block (revive)
opts/opts_test.go:60:49: empty-lines: extra empty line at the end of a block (revive)
opts/opts_test.go:253:37: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/network/filter_test.go:174:19: empty-lines: extra empty line at the end of a block (revive)
daemon/restart.go:17:116: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_linux_test.go:255:41: empty-lines: extra empty line at the end of a block (revive)
daemon/reload_test.go:340:58: empty-lines: extra empty line at the end of a block (revive)
daemon/oci_linux.go:495:101: empty-lines: extra empty line at the end of a block (revive)
daemon/seccomp_linux_test.go:17:36: empty-lines: extra empty line at the start of a block (revive)
daemon/container_operations.go:560:73: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_unix.go:558:76: empty-lines: extra empty line at the end of a block (revive)
daemon/daemon_unix.go:1092:64: empty-lines: extra empty line at the start of a block (revive)
daemon/container_operations.go:587:24: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:807:18: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:813:42: empty-lines: extra empty line at the end of a block (revive)
daemon/network.go:872:72: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/images/image_squash.go:17:71: empty-lines: extra empty line at the start of a block (revive)
daemon/images/store.go:128:27: empty-lines: extra empty line at the end of a block (revive)
daemon/images/image_list.go:154:55: empty-lines: extra empty line at the start of a block (revive)
daemon/images/image_delete.go:135:13: empty-lines: extra empty line at the end of a block (revive)
daemon/images/image_search.go:25:64: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/logger/loggertest/logreader.go:58:43: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/ring_test.go:119:34: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:37:12: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:41:44: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/adapter_test.go:170:9: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/loggerutils/sharedtemp_test.go:152:43: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/loggerutils/sharedtemp.go:124:117: empty-lines: extra empty line at the end of a block (revive)
daemon/logger/syslog/syslog.go:249:87: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/graphdriver/aufs/aufs.go:239:80: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/graphtest/graphbench_unix.go:249:27: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/graphtest/testutil.go:271:30: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/graphtest/graphbench_unix.go:179:32: empty-block: this block is empty, you can remove it (revive)
daemon/graphdriver/zfs/zfs.go:375:48: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/overlay/overlay.go:248:89: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:636:21: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1150:70: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1613:30: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1645:65: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/btrfs/btrfs.go:53:101: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1944:89: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/cluster/convert/service.go:96:34: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:169:44: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:470:30: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/container.go:224:23: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/convert/network.go:109:14: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/convert/service.go:537:27: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:247:19: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:252:41: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:256:12: empty-lines: extra empty line at the end of a block (revive)
daemon/cluster/services.go:289:80: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/executor/container/health_test.go:18:37: empty-lines: extra empty line at the start of a block (revive)
daemon/cluster/executor/container/adapter.go:437:68: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
plugin/v2/settable_test.go:24:29: empty-lines: extra empty line at the end of a block (revive)
plugin/manager_linux.go:96:6: empty-lines: extra empty line at the end of a block (revive)
plugin/backend_linux.go:373:16: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
volume/mounts/parser_test.go:42:39: empty-lines: extra empty line at the end of a block (revive)
volume/mounts/windows_parser.go:129:24: empty-lines: extra empty line at the end of a block (revive)
volume/local/local_test.go:16:35: empty-lines: extra empty line at the end of a block (revive)
volume/local/local_unix.go:145:3: early-return: if c {...} else {... return } can be simplified to if !c { ... return } ... (revive)
volume/service/service_test.go:18:38: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
testutil/fixtures/load/frozen.go:141:99: empty-lines: extra empty line at the end of a block (revive)
testutil/daemon/plugin.go:56:129: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
integration/config/config_test.go:106:31: empty-lines: extra empty line at the end of a block (revive)
integration/secret/secret_test.go:106:31: empty-lines: extra empty line at the end of a block (revive)
integration/network/service_test.go:58:50: empty-lines: extra empty line at the end of a block (revive)
integration/network/service_test.go:401:58: empty-lines: extra empty line at the end of a block (revive)
integration/system/event_test.go:30:38: empty-lines: extra empty line at the end of a block (revive)
integration/plugin/logging/read_test.go:19:41: empty-lines: extra empty line at the end of a block (revive)
integration/service/list_test.go:30:48: empty-lines: extra empty line at the end of a block (revive)
integration/service/create_test.go:400:46: empty-lines: extra empty line at the start of a block (revive)
integration/container/logs_test.go:156:42: empty-lines: extra empty line at the end of a block (revive)
integration/container/daemon_linux_test.go:135:44: empty-lines: extra empty line at the end of a block (revive)
integration/container/restart_test.go:160:62: empty-lines: extra empty line at the end of a block (revive)
integration/container/wait_test.go:181:47: empty-lines: extra empty line at the end of a block (revive)
integration/container/restart_test.go:116:30: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
builder/remotecontext/detect_test.go:64:66: empty-lines: extra empty line at the end of a block (revive)
builder/remotecontext/detect_test.go:78:46: empty-lines: extra empty line at the end of a block (revive)
builder/remotecontext/detect_test.go:91:51: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/internals_test.go:95:38: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/copy.go:86:112: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/dispatchers_test.go:286:39: empty-lines: extra empty line at the start of a block (revive)
builder/dockerfile/builder.go:280:38: empty-lines: extra empty line at the end of a block (revive)
builder/dockerfile/dispatchers.go:66:85: empty-lines: extra empty line at the start of a block (revive)
builder/dockerfile/dispatchers.go:559:85: empty-lines: extra empty line at the start of a block (revive)
builder/builder-next/adapters/localinlinecache/inlinecache.go:26:183: empty-lines: extra empty line at the start of a block (revive)
builder/builder-next/adapters/containerimage/pull.go:441:9: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
integration-cli/docker_cli_pull_test.go:55:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_exec_test.go:46:64: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_service_health_test.go:86:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_images_test.go:128:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_node_test.go:79:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_health_test.go:51:57: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_health_test.go:159:73: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_swarm_unix_test.go:60:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_inspect_test.go:30:33: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_build_test.go:429:71: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_attach_unix_test.go:19:78: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_build_test.go:470:70: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_history_test.go:29:64: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_links_test.go:93:86: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:33:61: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_links_test.go:145:78: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:114:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_attach_test.go:226:153: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_by_digest_test.go:239:71: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_create_test.go:135:49: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:143:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_create_test.go:181:71: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_inspect_test.go:72:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:98:77: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:144:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_rmi_test.go:63:2: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:199:79: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_rmi_test.go:69:2: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_swarm_service_test.go:300:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_prune_unix_test.go:35:25: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_events_unix_test.go:393:60: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_events_unix_test.go:441:71: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_ps_test.go:33:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_ps_test.go:559:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_events_test.go:117:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:547:74: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_api_containers_test.go:1054:84: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:1076:87: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:1232:72: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_api_containers_test.go:1801:21: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_network_unix_test.go:58:95: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_network_unix_test.go:750:75: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_network_unix_test.go:765:76: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_swarm_test.go:617:100: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_swarm_test.go:892:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_daemon_test.go:119:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_daemon_test.go:981:68: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_daemon_test.go:1951:87: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_run_test.go:83:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_run_test.go:357:72: empty-lines: extra empty line at the start of a block (revive)
integration-cli/docker_cli_build_test.go:89:83: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:114:83: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:183:80: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:290:71: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:314:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:331:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:366:76: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:403:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:648:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:708:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:938:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1018:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1097:2: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1182:62: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1244:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1524:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1546:80: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1716:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:1730:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:2162:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:2270:71: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:2288:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3206:65: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3392:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3433:72: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3678:76: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3732:67: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3759:69: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3802:61: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:3898:66: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4107:9: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4791:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4821:73: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:4854:70: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:5341:74: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_cli_build_test.go:5593:81: empty-lines: extra empty line at the end of a block (revive)
integration-cli/docker_api_containers_test.go:2145:11: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also renamed variables that collided with import
api/types/strslice/strslice_test.go:36:41: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/directory/directory.go:9:49: empty-lines: extra empty line at the start of a block (revive)
pkg/pubsub/publisher.go:8:48: empty-lines: extra empty line at the start of a block (revive)
pkg/loopback/attach_loopback.go:96:69: empty-lines: extra empty line at the start of a block (revive)
pkg/devicemapper/devmapper_wrapper.go:136:48: empty-lines: extra empty line at the start of a block (revive)
pkg/devicemapper/devmapper.go:391:35: empty-lines: extra empty line at the end of a block (revive)
pkg/devicemapper/devmapper.go:676:35: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/changes_posix_test.go:15:38: empty-lines: extra empty line at the end of a block (revive)
pkg/devicemapper/devmapper.go:241:51: empty-lines: extra empty line at the start of a block (revive)
pkg/fileutils/fileutils_test.go:17:47: empty-lines: extra empty line at the end of a block (revive)
pkg/fileutils/fileutils_test.go:34:48: empty-lines: extra empty line at the end of a block (revive)
pkg/fileutils/fileutils_test.go:318:32: empty-lines: extra empty line at the end of a block (revive)
pkg/tailfile/tailfile.go:171:6: empty-lines: extra empty line at the end of a block (revive)
pkg/tarsum/fileinfosums_test.go:16:41: empty-lines: extra empty line at the end of a block (revive)
pkg/tarsum/tarsum_test.go:198:42: empty-lines: extra empty line at the start of a block (revive)
pkg/tarsum/tarsum_test.go:294:25: empty-lines: extra empty line at the start of a block (revive)
pkg/tarsum/tarsum_test.go:407:34: empty-lines: extra empty line at the end of a block (revive)
pkg/ioutils/fswriters_test.go:52:45: empty-lines: extra empty line at the end of a block (revive)
pkg/ioutils/writers_test.go:24:39: empty-lines: extra empty line at the end of a block (revive)
pkg/ioutils/bytespipe_test.go:78:26: empty-lines: extra empty line at the end of a block (revive)
pkg/sysinfo/sysinfo_linux_test.go:13:37: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive_linux_test.go:57:64: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/changes.go:248:72: empty-lines: extra empty line at the start of a block (revive)
pkg/archive/changes_posix_test.go:15:38: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/copy.go:248:124: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/diff_test.go:198:44: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive.go:304:12: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive.go:749:37: empty-lines: extra empty line at the end of a block (revive)
pkg/archive/archive.go:812:81: empty-lines: extra empty line at the start of a block (revive)
pkg/archive/copy_unix_test.go:347:34: empty-lines: extra empty line at the end of a block (revive)
pkg/system/path.go:11:39: empty-lines: extra empty line at the end of a block (revive)
pkg/system/meminfo_linux.go:29:21: empty-lines: extra empty line at the end of a block (revive)
pkg/plugins/plugins.go:135:32: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/response.go:71:48: empty-lines: extra empty line at the start of a block (revive)
pkg/authorization/api_test.go:18:51: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/middleware_test.go:23:44: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/middleware_unix_test.go:17:46: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/api_test.go:57:45: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/response.go:83:50: empty-lines: extra empty line at the start of a block (revive)
pkg/authorization/api_test.go:66:47: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/middleware_unix_test.go:45:48: empty-lines: extra empty line at the end of a block (revive)
pkg/authorization/response.go:145:75: empty-lines: extra empty line at the start of a block (revive)
pkg/authorization/middleware_unix_test.go:56:51: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and the ErrExtractPointNotDirectory was
not checked for, or used as a sentinel error.
This error was introduced in c32dde5baa. It was
never used as a sentinel error, but from that commit, it looks like it was added
as a package variable to mirror already existing errors defined at the package
level.
This patch removes the exported variable, and replaces the error with an
errdefs.InvalidParameter(), so that the API also returns the correct (400)
status code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and the ErrVolumeReadonly was not checked
for, or used as a sentinel error.
This error was introduced in c32dde5baa. It was
never used as a sentinel error, but from that commit, it looks like it was added
as a package variable to mirror already existing errors defined at the package
level.
This patch removes the exported variable, and replaces the error with an
errdefs.InvalidParameter(), so that the API also returns the correct (400)
status code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and the ErrRootFSReadOnly was not checked
for, or used as a sentinel error.
This error was introduced in c32dde5baa, originally
named `ErrContainerRootfsReadonly`. It was never used as a sentinel error, but
from that commit, it looks like it was added as a package variable to mirror
the coding style of already existing errors defined at the package level.
This patch removes the exported variable, and replaces the error with an
errdefs.InvalidParameter(), so that the API also returns the correct (400)
status code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was just string-matching error output, and no longer had a
direct connection with ErrContainerRootfsReadonly / ErrVolumeReadonly.
Moving it inline better shows what it's actually checking for.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The getPortMapInfo var was introduced in f198dfd856,
and (from looking at that patch) looks to have been as a quick and dirty workaround
for the `container` argument colliding with the `container` import.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The pkg/signal package was moved to github.com/moby/sys/signal in
28409ca6c7. The DefaultStopSignal const was
deprecated in e53f65a916, and the DumpStacks
function was moved to pkg/stack in ea5c94cdb9,
all of which are included in the 22.x release, so we can safely remove these
from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were moved, and deprecated in f19ef20a44 and
4caf68f4f6, which are part of the 22.x release, so
we can safely remove these from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These functions were moved to github.com/moby/sys/sequential, and the
stubs were added in 509f19f611, which is
part of the 22.x release, so we can safely remove these from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This fixes an inifinite loop in mkdirAs(), used by `MkdirAllAndChown`,
`MkdirAndChown`, and `MkdirAllAndChownNew`, as well as directories being
chown'd multiple times when relative paths are used.
The for loop in this function was incorrectly assuming that;
1. `filepath.Dir()` would always return the parent directory of any given path
2. traversing any given path to ultimately result in "/"
While this is correct for absolute and "cleaned" paths, both assumptions are
incorrect in some variations of "path";
1. for paths with a trailing path-separator ("some/path/"), or dot ("."),
`filepath.Dir()` considers the (implicit) "." to be a location _within_ the
directory, and returns "some/path" as ("parent") directory. This resulted
in the path itself to be included _twice_ in the list of paths to chown.
2. for relative paths ("./some-path", "../some-path"), "traversing" the path
would never end in "/", causing the for loop to run indefinitely:
```go
// walk back to "/" looking for directories which do not exist
// and add them to the paths array for chown after creation
dirPath := path
for {
dirPath = filepath.Dir(dirPath)
if dirPath == "/" {
break
}
if _, err := os.Stat(dirPath); err != nil && os.IsNotExist(err) {
paths = append(paths, dirPath)
}
}
```
A _partial_ mitigation for this would be to use `filepath.Clean()` before using
the path (while `filepath.Dir()` _does_ call `filepath.Clean()`, it only does so
_after_ some processing, so only cleans the result). Doing so would prevent the
double chown from happening, but would not prevent the "final" path to be "."
or ".." (in the relative path case), still causing an infinite loop, or
additional checks for "." / ".." to be needed.
| path | filepath.Dir(path) | filepath.Dir(filepath.Clean(path)) |
|----------------|--------------------|------------------------------------|
| some-path | . | . |
| ./some-path | . | . |
| ../some-path | .. | .. |
| some/path/ | some/path | some |
| ./some/path/ | some/path | some |
| ../some/path/ | ../some/path | ../some |
| some/path/. | some/path | some |
| ./some/path/. | some/path | some |
| ../some/path/. | ../some/path | ../some |
| /some/path/ | /some/path | /some |
| /some/path/. | /some/path | /some |
Instead, this patch adds a `filepath.Abs()` to the function, so make sure that
paths are both cleaned, and not resulting in an infinite loop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
libnetwork/etchosts/etchosts_test.go:167:54: empty-lines: extra empty line at the end of a block (revive)
libnetwork/osl/route_linux.go:185:74: empty-lines: extra empty line at the start of a block (revive)
libnetwork/osl/sandbox_linux_test.go:323:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/bitseq/sequence.go:412:48: empty-lines: extra empty line at the start of a block (revive)
libnetwork/datastore/datastore_test.go:67:46: empty-lines: extra empty line at the end of a block (revive)
libnetwork/datastore/mock_store.go:34:60: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/firewalld.go:202:44: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/firewalld_test.go:76:36: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/iptables.go:256:67: empty-lines: extra empty line at the end of a block (revive)
libnetwork/iptables/iptables.go:303:128: empty-lines: extra empty line at the start of a block (revive)
libnetwork/networkdb/cluster.go:183:72: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipams/null/null_test.go:44:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/macvlan/macvlan_store.go:45:52: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipam/allocator_test.go:1058:39: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/bridge/port_mapping.go:88:111: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/link.go:26:90: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/setup_ipv6_test.go:17:34: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/bridge/setup_ip_tables.go:392:4: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/bridge/bridge.go:804:50: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/ov_serf.go:183:29: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/ov_utils.go:81:64: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/overlay/peerdb.go:172:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:209:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:344:89: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/peerdb.go:436:63: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/overlay.go:183:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/drivers/overlay/encryption.go:69:28: empty-lines: extra empty line at the end of a block (revive)
libnetwork/drivers/overlay/ov_network.go:563:81: empty-lines: extra empty line at the start of a block (revive)
libnetwork/default_gateway.go:32:43: empty-lines: extra empty line at the start of a block (revive)
libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the start of a block (revive)
libnetwork/service_common.go:184:64: empty-lines: extra empty line at the end of a block (revive)
libnetwork/endpoint.go:161:55: empty-lines: extra empty line at the end of a block (revive)
libnetwork/store.go:320:33: empty-lines: extra empty line at the end of a block (revive)
libnetwork/store_linux_test.go:11:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/sandbox.go:571:36: empty-lines: extra empty line at the start of a block (revive)
libnetwork/service_common.go:317:246: empty-lines: extra empty line at the start of a block (revive)
libnetwork/endpoint.go:550:17: empty-lines: extra empty line at the end of a block (revive)
libnetwork/sandbox_dns_unix.go:213:106: empty-lines: extra empty line at the start of a block (revive)
libnetwork/controller.go:676:85: empty-lines: extra empty line at the end of a block (revive)
libnetwork/agent.go:876:60: empty-lines: extra empty line at the end of a block (revive)
libnetwork/resolver.go:324:69: empty-lines: extra empty line at the end of a block (revive)
libnetwork/network.go:1153:92: empty-lines: extra empty line at the end of a block (revive)
libnetwork/network.go:1955:67: empty-lines: extra empty line at the start of a block (revive)
libnetwork/network.go:2235:9: empty-lines: extra empty line at the start of a block (revive)
libnetwork/libnetwork_internal_test.go:336:26: empty-lines: extra empty line at the start of a block (revive)
libnetwork/resolver_test.go:76:35: empty-lines: extra empty line at the end of a block (revive)
libnetwork/libnetwork_test.go:303:38: empty-lines: extra empty line at the end of a block (revive)
libnetwork/libnetwork_test.go:985:46: empty-lines: extra empty line at the end of a block (revive)
libnetwork/ipam/allocator_test.go:1263:37: empty-lines: extra empty line at the start of a block (revive)
libnetwork/errors_test.go:9:40: empty-lines: extra empty line at the end of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was duplicated in two places -- factor it out, add
documentation, and move magic numbers into a constant.
Additionally, use the same permissions (0755) in both code paths, and
ensure that the ID map is used in both code paths.
Co-authored-by: Vasiliy Ulyanov <vulyanov@suse.de>
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
Signed-off-by: Vasiliy Ulyanov <vulyanov@suse.de>
It was unclear what the distinction was between these configuration
structs, so merging them to simplify.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was used for testing purposes when libnetwork was in a separate repo, using
the dnet utility, which was removed in 7266a956a8.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Libnetwork configuration files were only used as part of integration tests using
the dnet utility, which was removed in 7266a956a8
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the "deadcode", "structcheck", and "varcheck" linters, as they are
deprecated:
WARN [runner] The linter 'deadcode' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [runner] The linter 'structcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [runner] The linter 'varcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused.
WARN [linters context] structcheck is disabled because of generics. You can track the evolution of the generics support by following the https://github.com/golangci/golangci-lint/issues/2649.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The TODO comment was in regards to allowing graphdriver plugins to
provide their own ContainerFS implementations. The ContainerFS interface
has been removed from Moby, so there is no longer anything which needs
to be figured out.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Now that the type of Container.BaseFS has been reverted to a string,
values can never implement the extractor or archiver interfaces. Rip out
the dead code to support archiving and unarchiving through those
interfcaes.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The Driver abstraction was needed for Linux Containers on Windows,
support for which has since been removed.
There is no direct equivalent to Lchmod() in the standard library so
continue to use the containerd/continuity version.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- dbus: add Connected methods to check connections status
- dbus: add support for querying unit by PID
- dbus: implement support for cgroup freezer APIs
- journal: remove implicit initialization
- login1: add methods to get session/user properties
- login1: add context-aware ListSessions and ListUsers methods
full diff: https://github.com/github.com/coreos/go-systemd/compare/v22.3.2...v22.4.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Had this stashed as part of some other changes, so thought I'd push
as a PR; with this patch:
=== RUN TestNewClientWithOpsFromEnv/invalid_cert_path
=== RUN TestNewClientWithOpsFromEnv/default_api_version_with_cert_path
=== RUN TestNewClientWithOpsFromEnv/default_api_version_with_cert_path_and_tls_verify
=== RUN TestNewClientWithOpsFromEnv/default_api_version_with_cert_path_and_host
=== RUN TestNewClientWithOpsFromEnv/invalid_docker_host
=== RUN TestNewClientWithOpsFromEnv/invalid_docker_host,_with_good_format
=== RUN TestNewClientWithOpsFromEnv/override_api_version
--- PASS: TestNewClientWithOpsFromEnv (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/default_api_version (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/invalid_cert_path (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/default_api_version_with_cert_path (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/default_api_version_with_cert_path_and_tls_verify (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/default_api_version_with_cert_path_and_host (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/invalid_docker_host (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/invalid_docker_host,_with_good_format (0.00s)
--- PASS: TestNewClientWithOpsFromEnv/override_api_version (0.00s)
PASS
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that we can pass any custom containerd shim to dockerd there is need
for this check. Without this it becomes possible to use wasm shims for
example with images that have "wasi" as the OS.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
These functions were used in 63a7ccdd23, which was
part of Docker v1.5.0 and v1.6.0, but removed in Docker v1.7.0 when the network
stack was replaced with libnetwork in d18919e304.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After discussing in the maintainers meeting, we concluded that Slowloris attacks
are not a real risk other than potentially having some additional goroutines
lingering around, so setting a long timeout to satisfy the linter, and to at
least have "some" timeout.
libnetwork/diagnostic/server.go:96:10: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
srv := &http.Server{
Addr: net.JoinHostPort(ip, strconv.Itoa(port)),
Handler: s,
}
api/server/server.go:60:10: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
srv: &http.Server{
Addr: addr,
},
daemon/metrics_unix.go:34:13: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
if err := http.Serve(l, mux); err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
^
cmd/dockerd/metrics.go:27:13: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
if err := http.Serve(l, mux); err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These interfaces were added in aacddda89d, with
no clear motivation, other than "Also hide ViewDB behind an interface".
This patch removes the interface in favor of using a concrete implementation;
There's currently only one implementation of this interface, and if we would
decide to change to an alternative implementation, we could define relevant
interfaces on the receiver side.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also switching to use arm64, as all amd64 stages have moved to GitHub actions,
so using arm64 allows the same machine to be used for tests after the DCO check
completed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was missing to be removed from Jenkinsfile when we moved
to GHA for unit and integration tests.
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
The 1.16 `io/fs` compatibility code was being built on 1.18 and 1.19.
Drop it completely as 1.16 is long EOL, and additionally drop 1.17 as it
has been EOL for a month and 1.18 is both the minimum Go supported by
the 20.10 branch, as well as a very easy jump from 1.17.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
The OCI image spec is considering to change the Image struct and embedding the
Platform type (see opencontainers/image-spec#959) in the go implementation.
Moby currently uses some struct-literals to propagate the platform fields,
which will break once those changes in the OCI spec are merged.
Ideally (once that change arrives) we would update the code to set the Platform
information as a whole, instead of assigning related fields individually, but
in some cases in the code, image platform information is only partially set
(for example, OSVersion and OSFeatures are not preserved in all cases). This
may be on purpose, so needs to be reviewed.
This patch keeps the current behavior (assigning only specific fields), but
removes the use of struct-literals to make the code compatible with the
upcoming changes in the image-spec module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure we use the same alias everywhere for easier finding,
and to prevent accidentally introducing duplicate imports with
different aliases for the same package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- prefer error over panic where possible
- ContainerChanges is not implemented by snapshotter-based ImageService
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Tagger was introduced in 0296797f0f, as part
of a refactor, but was never used outside of the package itself. The commit
also didn't explain why this was changed into a Type with a constructor, as all
the constructor appears to be used for is to sanitize and validate the tags.
This patch removes the `Tagger` struct and its constructor, and instead just
uses a function to do the same.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
From the mailing list:
We have just released Go versions 1.19.1 and 1.18.6, minor point releases.
These minor releases include 2 security fixes following the security policy:
- net/http: handle server errors after sending GOAWAY
A closing HTTP/2 server connection could hang forever waiting for a clean
shutdown that was preempted by a subsequent fatal error. This failure mode
could be exploited to cause a denial of service.
Thanks to Bahruz Jabiyev, Tommaso Innocenti, Anthony Gavazzi, Steven Sprecher,
and Kaan Onarlioglu for reporting this.
This is CVE-2022-27664 and Go issue https://go.dev/issue/54658.
- net/url: JoinPath does not strip relative path components in all circumstances
JoinPath and URL.JoinPath would not remove `../` path components appended to a
relative path. For example, `JoinPath("https://go.dev", "../go")` returned the
URL `https://go.dev/../go`, despite the JoinPath documentation stating that
`../` path elements are cleaned from the result.
Thanks to q0jt for reporting this issue.
This is CVE-2022-32190 and Go issue https://go.dev/issue/54385.
Release notes:
go1.19.1 (released 2022-09-06) includes security fixes to the net/http and
net/url packages, as well as bug fixes to the compiler, the go command, the pprof
command, the linker, the runtime, and the crypto/tls and crypto/x509 packages.
See the Go 1.19.1 milestone on the issue tracker for details.
https://github.com/golang/go/issues?q=milestone%3AGo1.19.1+label%3ACherryPickApproved
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
From the mailing list:
We have just released Go versions 1.19.1 and 1.18.6, minor point releases.
These minor releases include 2 security fixes following the security policy:
- net/http: handle server errors after sending GOAWAY
A closing HTTP/2 server connection could hang forever waiting for a clean
shutdown that was preempted by a subsequent fatal error. This failure mode
could be exploited to cause a denial of service.
Thanks to Bahruz Jabiyev, Tommaso Innocenti, Anthony Gavazzi, Steven Sprecher,
and Kaan Onarlioglu for reporting this.
This is CVE-2022-27664 and Go issue https://go.dev/issue/54658.
- net/url: JoinPath does not strip relative path components in all circumstances
JoinPath and URL.JoinPath would not remove `../` path components appended to a
relative path. For example, `JoinPath("https://go.dev", "../go")` returned the
URL `https://go.dev/../go`, despite the JoinPath documentation stating that
`../` path elements are cleaned from the result.
Thanks to q0jt for reporting this issue.
This is CVE-2022-32190 and Go issue https://go.dev/issue/54385.
Release notes:
go1.18.6 (released 2022-09-06) includes security fixes to the net/http package,
as well as bug fixes to the compiler, the go command, the pprof command, the
runtime, and the crypto/tls, encoding/xml, and net packages. See the Go 1.18.6
milestone on the issue tracker for details;
https://github.com/golang/go/issues?q=milestone%3AGo1.18.6+label%3ACherryPickApproved
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The wrapper sets the default namespace in the context if none is
provided, this is needed because we are calling these services directly
and not trough GRPC that has an interceptor to set the default namespace
to all calls.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
While the name generator has been frozen for new additions in 624b3cfbe8,
this person has become controversial. Our intent is for this list to be inclusive
and non-controversial.
This patch removes the name from the list.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
1. Commit 1a22418f9f changed permissions to `0700`
on Windows, or more factually, it removed `rw` (`chmod g-rw,o-rw`) and added
executable bits (`chmod u+x`).
2. This was too restrictive, and b7dc9040f0 changed
permissions to only remove the group- and world-writable bits to give read and
execute access to everyone, but setting execute permissions for everyone.
3. However, this also removed the non-permission bits, so 41eb61d5c2
updated the code to preserve those, and keep parity with Linux.
This changes it back to `2.`. I wonder (_think_) _permission_ bits (read, write)
can be portable, except for the _executable_ bit (which is not present on Windows).
The alternative could be to keep the permission bits, and only set the executable
bit (`perm | 0111`) for everyone (equivalent of `chmod +x`), but that likely would
be a breaking change.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The fillGo18FileTypeBits func was added in 1a451d9a7b
to keep the tar headers consistent with headers created with go1.8 and older.
go1.8 and older incorrectly preserved all file-mode bits, including file-type,
instead of stripping those bits and only preserving the _permission_ bits, as
defined in;
- the GNU tar spec: https://www.gnu.org/software/tar/manual/html_node/Standard.html
- and POSIX: http://pubs.opengroup.org/onlinepubs/009695399/basedefs/tar.h.html
We decided at the time to copy the "wrong" behavior to prevent a cache-bust and
to keep the archives identical, however:
- It's not matching the standards, which causes differences between our tar
implementation and the standard tar implementations, as well as implementations
in other languages, such as Python (see docker/compose#883).
- BuildKit does not implement this hack.
- We don't _need_ this extra information (as it's already preserved in the
type header; https://pkg.go.dev/archive/tar#pkg-constants
In short; let's remove this hack.
This reverts commit 1a451d9a7b.
This reverts commit 41eb61d5c2.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
integration-cli/docker_cli_daemon_test.go:545:54: host:port in url should be constructed with net.JoinHostPort and not directly with fmt.Sprintf (nosprintfhostport)
cmdArgs = append(cmdArgs, "--tls=false", "--host", fmt.Sprintf("tcp://%s:%s", l.daemon, l.port))
^
opts/hosts_test.go:35:31: host:port in url should be constructed with net.JoinHostPort and not directly with fmt.Sprintf (nosprintfhostport)
"tcp://:5555": fmt.Sprintf("tcp://%s:5555", DefaultHTTPHost),
^
opts/hosts_test.go:91:30: host:port in url should be constructed with net.JoinHostPort and not directly with fmt.Sprintf (nosprintfhostport)
":5555": fmt.Sprintf("tcp://%s:5555", DefaultHTTPHost),
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Updating test-code only; set ReadHeaderTimeout for some, or suppress the linter
error for others.
contrib/httpserver/server.go:11:12: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
log.Panic(http.ListenAndServe(":80", nil))
^
integration/plugin/logging/cmd/close_on_start/main.go:42:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: mux,
}
integration/plugin/logging/cmd/discard/main.go:17:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: mux,
}
integration/plugin/logging/cmd/dummy/main.go:14:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: http.NewServeMux(),
}
integration/plugin/volumes/cmd/dummy/main.go:14:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: http.NewServeMux(),
}
testutil/fixtures/plugin/basic/basic.go:25:12: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server (gosec)
server := http.Server{
Addr: l.Addr().String(),
Handler: http.NewServeMux(),
}
volume/testutils/testutils.go:170:5: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
go http.Serve(l, mux)
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The linter falsely detects this as using "math/rand":
libnetwork/networkdb/cluster.go:721:14: G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec)
val, err := rand.Int(rand.Reader, big.NewInt(int64(n)))
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that CanonicalTarNameForPath is an alias for filepath.ToSlash, they were
mostly redundant, and only testing Go's stdlib. Coverage for filepath.ToSlash is
provided through TestCanonicalTarName, which does a superset of CanonicalTarNameForPath,
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
filepath.ToSlash is already a no-op on non-Windows platforms, so there's no
need to provide multiple implementations.
We could consider deprecating this function, but it's used in the CLI, and
perhaps it's still useful to have a canonical location to perform this normalization.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Migrating these functions to allow them being shared between moby, docker/cli,
and containerd, and to allow using them without importing all of sys / system,
which (in containerd) also depends on hcsshim and more.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
validate other YAML files, such as the ones used in the documentation,
and GitHub actions workflows, to prevent issues such as;
- 30295c1750
- 8e8d9a3650
With this patch:
hack/validate/yamllint
Congratulations! yamllint config file formatted correctly
Congratulations! YAML files are formatted correctly
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Suppresses warnings like:
LANG=C.UTF-8 yamllint -c hack/validate/yamllint.yaml -f parsable .github/workflows/*.yml
.github/workflows/ci.yml:7:1: [warning] truthy value should be one of [false, true] (truthy)
.github/workflows/windows.yml:7:1: [warning] truthy value should be one of [false, true] (truthy)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before:
10030:81 error line too long (89 > 80 characters) (line-length)
After:
api/swagger.yaml:10030:81: [error] line too long (89 > 80 characters) (line-length)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't make the file hidden, and add .yaml extension, so that editors
pick up the right formatting :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This interface is used as part of an exported function's signature,
so exporting the interface as well for callers to know what the argument
must have implemented.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
filepath.IsAbs() will short-circuit on Linux/Unix, so having a single
implementation should not affect those platforms.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Prevent new health check probes from racing the task deletion. This may
have been a root cause of containers taking so long to stop on Windows.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We have integration tests which assert the invariant that a
GET /containers/{id}/json response lists only IDs of execs which are in
the Running state, according to GET /exec/{id}/json. The invariant could
be violated if those requests were to race the handling of the exec's
task-exit event. The coarse-grained locking of the container ExecStore
when starting an exec task was accidentally synchronizing
(*Daemon).ProcessEvent and (*Daemon).ContainerExecInspect to it just
enough to make it improbable for the integration tests to catch the
invariant violation on execs which exit immediately. Removing the
unnecessary locking made the underlying race condition more likely for
the tests to hit.
Maintain the invariant by deleting the exec from its container's
ExecCommands before clearing its Running flag. Additionally, fix other
potential data races with execs by ensuring that the ExecConfig lock is
held whenever a mutable field is read from or written to.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Modifying the builtin Windows runtime to send the exited event
immediately upon the container's init process exiting, without first
waiting for the Compute System to shut down, perturbed the timings
enough to make TestWaitConditions flaky on that platform. Make
TestWaitConditions timing-independent by having the container wait
for input on STDIN before exiting.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Attempting to delete the directory while another goroutine is
concurrently executing a CheckpointTo() can fail on Windows due to file
locking. As all callers of CheckpointTo() are required to hold the
container lock, holding the lock while deleting the directory ensures
that there will be no interference.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing logic to handle container ID conflicts when attempting to
create a plugin container is not nearly as robust as the implementation
in daemon for user containers. Extract and refine the logic from daemon
and use it in the plugin executor.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The containerd client is very chatty at the best of times. Because the
libcontained API is stateless and references containers and processes by
string ID for every method call, the implementation is essentially
forced to use the containerd client in a way which amplifies the number
of redundant RPCs invoked to perform any operation. The libcontainerd
remote implementation has to reload the containerd container, task
and/or process metadata for nearly every operation. This in turn
amplifies the number of context switches between dockerd and containerd
to perform any container operation or handle a containerd event,
increasing the load on the system which could otherwise be allocated to
workloads.
Overhaul the libcontainerd interface to reduce the impedance mismatch
with the containerd client so that the containerd client can be used
more efficiently. Split the API out into container, task and process
interfaces which the consumer is expected to retain so that
libcontainerd can retain state---especially the analogous containerd
client objects---without having to manage any state-store inside the
libcontainerd client.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The OOMKilled flag on a container's state has historically behaved
rather unintuitively: it is updated on container exit to reflect whether
or not any process within the container has been OOM-killed during the
preceding run of the container. The OOMKilled flag would be set to true
when the container exits if any process within the container---including
execs---was OOM-killed at any time while the container was running,
whether or not the OOM-kill was the cause of the container exiting. The
flag is "sticky," persisting through the next start of the container;
only being cleared once the container exits without any processes having
been OOM-killed that run.
Alter the behavior of the OOMKilled flag such that it signals whether
any process in the container had been OOM-killed since the most recent
start of the container. Set the flag immediately upon any process being
OOM-killed, and clear it when the container transitions to the "running"
state.
There is an ulterior motive for this change. It reduces the amount of
state the libcontainerd client needs to keep track of and clean up on
container exit. It's one less place the client could leak memory if a
container was to be deleted without going through libcontainerd.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The daemon.containerd.Exec call does not access or mutate the
container's ExecCommands store in any way, and locking the exec config
is sufficient to synchronize with the event-processing loop. Locking
the ExecCommands store while starting the exec process only serves to
block unrelated operations on the container for an extended period of
time.
Convert the Store struct's mutex to an unexported field to prevent this
from regressing in the future.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Add an integration test to verify that health checks are killed on
timeout and that the output is captured.
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Just a minor cleanup; I went looking where these options were used, and
these occurrences set then to their default value. Removing these
assignments makes it easier to find where they're actually used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was deprecated in edac92409a, which
was part of 18.09 and up, so should be safe by now to remove this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used in a single location, and only a "convenience" type,
not used to detect a specific error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were deprecated in ee230d8fdd,
which is in the 22.06 branch, so we can safely remove it from
master to have them removed in the release after that.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Terminating the exec process when the context is canceled has been
broken since Docker v17.11 so nobody has been able to depend upon that
behaviour in five years of releases. We are thus free from backwards-
compatibility constraints.
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Treat (storage/graph)Driver as snapshotter
Also moved some layerStore related initialization to the non-c8d case
because otherwise they get treated as a graphdriver plugins.
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since runtimes can now just be containerd shims, we need to check if the
reference is possibly a containerd shim.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Update the profile to make use of CAP_BPF and CAP_PERFMON capabilities. Prior to
kernel 5.8, bpf and perf_event_open required CAP_SYS_ADMIN. This change enables
finer control of the privilege setting, thus allowing us to run certain system
tracing tools with minimal privileges.
Based on the original patch from Henry Wang in the containerd repository.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `-g` / `--graph` options were soft deprecated in favor of `--data-root` in
261ef1fa27 (v17.05.0) and at the time considered
to not be removed. However, with the move towards containerd snapshotters, having
these options around adds additional complexity to handle fallbacks for deprecated
(and hidden) flags, so completing the deprecation.
With this patch:
dockerd --graph=/var/lib/docker --validate
Flag --graph has been deprecated, Use --data-root instead
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: the "graph" config file option is deprecated; use "data-root" instead
mkdir -p /etc/docker
echo '{"graph":"/var/lib/docker"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: the "graph" config file option is deprecated; use "data-root" instead
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used as an intermediate variable to store what's returned
by layerstore.DriverName() / ImageService.StorageDriver()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the function name more generic, as it's no longer used only
for graphdrivers but also for snapshotters.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Makes sure that tests use a config struct that's more representative
to how it's used in the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Makes sure that tests use a config struct that's more representative
to how it's used in the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This centralizes more defaults, to be part of the config struct that's
created, instead of interweaving the defaults with other code in various
places.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the information stored as part of the container for the error-message,
instead of querying the current storage driver from the daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The check was accounting for old containers that did not have a storage-driver
set in their config, and was added in 4908d7f81d
for docker v0.7.0-rc6 - nearly 9 Years ago, so very likely nobody is still
depending on this ;-)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 0cba7740d4, as part of
the LCOW implementation. LCOW support has been removed, so we can remove
this check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Currently only provides the existing "platform" option, but more
options will be added in follow-ups.
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The legacy v1 is not supported by the containerd import
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- return early if we're expecting a system-containerd
- rename `initContainerD` to `initContainerd` ':)
- remove .Config to reduce verbosity
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Change the log-level for messages about starting the managed containerd instance
to be the same as for the main API. And remove a redundant debug-log.
With this patch:
dockerd
INFO[2022-08-11T11:46:32.573299176Z] Starting up
INFO[2022-08-11T11:46:32.574304409Z] containerd not running, starting managed containerd
INFO[2022-08-11T11:46:32.575289181Z] started new containerd process address=/var/run/docker/containerd/containerd.sock module=libcontainerd pid=5370
cmd/dockerd: initContainerD(): clean-up some logs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
the `--log-level` flag overrides whatever is in the containerd configuration file;
f033f6ff85/cmd/containerd/command/main.go (L339-L352)
Given that we set that flag when we start the containerd binary, there is no need
to write it both to the generated config-file and pass it as flag.
This patch also slightly changes the behavior; as both dockerd and containerd use
"info" as default log-level, don't set the log-level if it's the default.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Adding a remote.configFile to store the location instead of re-constructing its
location each time. Also fixing a minor inconsistency in the error formats.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Adding a remote.pidFile to store the location instead of re-constructing its
location each time. Also performing a small refactor to use `strconv.Itoa`
instead of `fmt.Sprintf`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Containerd, like dockerd has a OOMScore configuration option to adjust its own
OOM score. In dockerd, this option was added when default installations were not
yet running the daemon as a systemd unit, which made it more complicated to set
the score, and adding a daemon option was convenient.
A binary adjusting its own score has been frowned upon, as it's more logical to
make that the responsibility of the process manager _starting_ the daemon, which
is what we did for dockerd in 21578530d7.
There have been discussions on deprecating the daemon flag for dockerd, and
similar discussions have been happening for containerd.
This patch changes how we set the OOM score for the containerd child process,
and to have dockerd (supervisor) set the OOM score, as it's acting as process
manager in this case (performing a role similar to systemd otherwise).
With this patch, the score is still adjusted as usual, but not written to the
containerd configuration file;
dockerd --oom-score-adjust=-123
cat /proc/$(pidof containerd)/oom_score_adj
-123
As a follow-up, we may consider to adjust the containerd OOM score based on the
daemon's own score instead of on the `cli.OOMScoreAdjust` configuration so that
we will also adjust the score in situations where dockerd's OOM score was set
through other ways (systemd or manually adjusting the cgroup). A TODO was added
for this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Consider Address() (Config.GRPC.Addres) to be the source of truth for
the location of the containerd socket.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This RWMutex was added in 9c4570a958, and used in
the `remote.Client()` method. Commit dd2e19ebd5
split the code for client and daemon, but did not remove the mutex.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- properly use filepath.Join() for Windows paths
- use getDaemonConfDir() to get the path for storing daemon.json
- return error instead of logrus.Fatal(). The logrus.Fatal() was a left-over
from when Cobra was not used, and appears to have been missed in commit
fb83394714, which did the conversion to Cobra.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were deprecated in 098a44c07f, which is
in the 22.06 branch, and no longer in use since e05f614267
so we can remove them from the master branch.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 6068d1894d...48dd89375d
Finishes off the work to change references to cluster volumes in the API
from using "csi" as the magic word to "cluster". This reflects that the
volumes are "cluster volumes", not "csi volumes".
Notably, there is no change to the plugin definitions being "csinode"
and "csicontroller". This terminology is appropriate with regards to
plugins because it accurates reflects what the plugin is.
Signed-off-by: Drew Erny <derny@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
rename some variables that collided with imports or (upcoming)
changes, e.g. `ctx`, which is commonly used for `context.Context`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
My IDE was complaining about some things;
- fix inconsistent receiver name (i vs s)
- fix some variables that collided with imports
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/containerd/v1.6.6...v1.6.7
Welcome to the v1.6.7 release of containerd!
The seventh patch release for containerd 1.6 contains various fixes,
includes a new version of runc and adds support for ppc64le and riscv64
(requires unreleased runc 1.2) builds.
Notable Updates
- Update runc to v1.1.3
- Seccomp: Allow clock_settime64 with CAP_SYS_TIME
- Fix WWW-Authenticate parsing
- Support RISC-V 64 and ppc64le builds
- Windows: Update hcsshim to v0.9.4 to fix regression with HostProcess stats
- Windows: Fix shim logs going to panic.log file
- Allow ptrace(2) by default for kernels >= 4.8
See the changelog for complete list of changes
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/containerd/containerd/v1.6.6...v1.6.7
Welcome to the v1.6.7 release of containerd!
The seventh patch release for containerd 1.6 contains various fixes,
includes a new version of runc and adds support for ppc64le and riscv64
(requires unreleased runc 1.2) builds.
Notable Updates
- Update runc to v1.1.3
- Seccomp: Allow clock_settime64 with CAP_SYS_TIME
- Fix WWW-Authenticate parsing
- Support RISC-V 64 and ppc64le builds
- Windows: Update hcsshim to v0.9.4 to fix regression with HostProcess stats
- Windows: Fix shim logs going to panic.log file
- Allow ptrace(2) by default for kernels >= 4.8
See the changelog for complete list of changes
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update Go runtime to 1.18.5 to address CVE-2022-32189.
Full diff: https://github.com/golang/go/compare/go1.18.4...go1.18.5
--------------------------------------------------------
From the security announcement:
https://groups.google.com/g/golang-announce/c/YqYYG87xB10
We have just released Go versions 1.18.5 and 1.17.13, minor point
releases.
These minor releases include 1 security fixes following the security
policy:
encoding/gob & math/big: decoding big.Float and big.Rat can panic
Decoding big.Float and big.Rat types can panic if the encoded message is
too short.
This is CVE-2022-32189 and Go issue https://go.dev/issue/53871.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.18.5
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The existing implementation used a `nil` value for the CRI plugin's configuration
to indicate that the plugin had to be disabled. Effectively, the `Plugins` value
was only used as an intermediate step, only to be removed later on, and to instead
add the given plugin to `DisabledPlugins` in the containerd configuration.
This patch removes the intermediate step; as a result we also don't need to mask
the containerd `Plugins` field, which was added to allow serializing the toml.
A code comment was added as well to explain why we're (currently) disabling the
CRI plugin by default, which may help future visitors of the code to determin
if that default is still needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes the `WithRemoteAddr()`, `WithRemoteAddrUser()`, `WithDebugAddress()`,
and `WithMetricsAddress()` options, added in ddae20c032,
but most of them were never used, and `WithRemoteAddr()` no longer in use since
dd2e19ebd5.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some error conditions returned a non-typed error, which would be returned
as a 500 status by the API. This patch;
- Updates such errors to return an errdefs.InvalidParameter type
- Introduces a locally defined `invalidParam{}` type for convenience.
- Updates some error-strings to match Go conventions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7a9cb29fb9 added a new "platform" query-
parameter to the `POST /containers/create` endpoint, but did not update the
swagger file and documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7a9cb29fb9 added a new "platform" query-
parameter to the `POST /containers/create` endpoint, but did not update the
swagger file and documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 7a9cb29fb9 added a new "platform" query-
parameter to the `POST /containers/create` endpoint, but did not update the
swagger file and documentation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Making the api types more focused per API type, and the general
api/types package somewhat smaller.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- TestServiceLogsCompleteness runs service with command to write 6 log
lines but as command exits immediately, service is restarted and 6 more
lines are printed in logs, which confuses the checker.Equals(6) check.
- TestServiceLogsSince runs service with command to write 3 log lines,
and service restart can also affect it's checks.
Let's change from `tail` which exits immediately to `tail -f` which
hangs forever, this way we would not confuse checks with more log lines
when expected.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Contrary to popular belief, the OCI Runtime specification does not
specify the command-line API for runtimes. Looking at containerd's
architecture from the lens of the OCI Runtime spec, the _shim_ is the
OCI Runtime and runC is "just" an implementation detail of the
io.containerd.runc.v2 runtime. When one configures a non-default runtime
in Docker, what they're really doing is instructing Docker to create
containers using the io.containerd.runc.v2 runtime with a configuration
option telling the runtime that the runC binary is at some non-default
path. Consequently, only OCI runtimes which are compatible with the
io.containerd.runc.v2 shim, such as crun, can be used in this manner.
Other OCI runtimes, including kata-containers v2, come with their own
containerd shim and are not compatible with io.containerd.runc.v2.
As Docker has not historically provided a way to select a non-default
runtime which requires its own shim, runtimes such as kata-containers v2
could not be used with Docker.
Allow other containerd shims to be used with Docker; no daemon
configuration required. If the daemon is instructed to create a
container with a runtime name which does not match any of the configured
or stock runtimes, it passes the name along to containerd verbatim. A
user can start a container with the kata-containers runtime, for
example, simply by calling
docker run --runtime io.containerd.kata.v2
Runtime names which containerd would interpret as a path to an arbitrary
binary are disallowed. While handy for development and testing it is not
strictly necessary and would allow anyone with Engine API access to
trivially execute any binary on the host as root, so we have decided it
would be safest for our users if it was not allowed.
It is not yet possible to set an alternative containerd shim as the
default runtime; it can only be configured per-container.
Signed-off-by: Cory Snider <csnider@mirantis.com>
doCopyXattrs() never reached due to copyXattrs boolean being false, as
a result file capabilities not being copied.
moved copyXattr() out of doCopyXattrs()
Signed-off-by: Illo Abdulrahim <abdulrahim.illo@nokia.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Implement --follow entirely correctly for the journald log reader, such
that it exits immediately upon reading back the last log message written
to the journal before the logger was closed. The impossibility of doing
so has been slightly exaggerated.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Fix journald and logfile-powered (jsonfile, local) log readers
incorrectly filtering out messages with timestamps < Since which were
preceded by a message with a timestamp >= Since.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Careful management of the journal read pointer is sufficient to ensure
that no entry is read more than once.
Unit test the journald logger without requiring a running journald by
using the systemd-journal-remote command to write arbitrary entries to
journal files.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Wrap the libsystemd journal reading functionality in a more idiomatic Go
API and refactor the journald logging driver's ReadLogs implementation
to use the wrapper. Rewrite the parts of the ReadLogs implementation in
Go which were previously implemented in C as part of the cgo preamble.
Separating the business logic from the cgo minutiae should hopefully
make the code more accessible to a wider audience of developers for
reviewing the code and contributing improvements.
The structure of the ReadLogs implementation is retained with few
modifications. Any ignored errors were also ignored before the refactor;
the explicit error return values afforded by the sdjournal wrapper makes
this more obvious.
The package github.com/coreos/go-systemd/v22/sdjournal also provides a
more idiomatic Go wrapper around libsystemd. It is unsuitable for our
needs as it does not expose wrappers for the sd_journal_process and
sd_journal_get_fd functions.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Ensure the package can be imported, no matter the build constratints, by
adding an unconstrained doc.go containing a package statement.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This test is skipped on Windows anyway.
Also add a short explanation why emptyfs image was chosen.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Reformat/rename some bits to better align with the old implementation for
easier comparing, and add some TODOs for tracking issues on remaining work.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
In network node change test, the expected behavior is focused on how many nodes
left in networkDB, besides timing issues, things would also go tricky for a
leave-then-join sequence, if the check (counting the nodes) happened before the
first "leave" event, then the testcase actually miss its target and report PASS
without verifying its final result; if the check happened after the 'leave' event,
but before the 'join' event, the test would report FAIL unnecessary;
This code change would check both the db changes and the node count, it would
report PASS only when networkdb has indeed changed and the node count is expected.
Signed-off-by: David Wang <00107082@163.com>
Not all filters are implemented yet, so make sure an error
is returned if a not-yet implemented filter is used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 6cdc4ba6cd in 2016, likely
because at the time we were still building for CentOS 6 and Ubuntu 14.04.
All currently supported distros appear to be on _at least_ 219 now, so it looks
safe to remove this;
```bash
docker run -it --rm centos:7
yum install -y systemd-devel
pkg-config 'libsystemd >= 209' && echo "OK" || echo "KO"
OK
pkg-config --print-provides 'libsystemd'
libsystemd = 219
pkg-config --print-provides 'libsystemd-journal'
libsystemd-journal = 219
```
And on a `debian:buster` (old stable)
```bash
docker run -it --rm debian:buster
apt-get update && apt-get install -y libsystemd-dev pkg-config
pkg-config 'libsystemd >= 209' && echo "OK" || echo "KO"
OK
pkg-config --print-provides 'libsystemd'
libsystemd = 241
pkg-config --print-provides 'libsystemd-journal'
Package libsystemd-journal was not found in the pkg-config search path.
Perhaps you should add the directory containing `libsystemd-journal.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libsystemd-journal' found
```
OpenSUSE leap (I think that's built for s390x)
```bash
docker run -it --rm docker.io/opensuse/leap:15
zypper install -y systemd-devel
pkg-config 'libsystemd >= 209' && echo "OK" || echo "KO"
OK
pkg-config --print-provides 'libsystemd'
libsystemd = 246
pkg-config --print-provides 'libsystemd-journal'
Package libsystemd-journal was not found in the pkg-config search path.
Perhaps you should add the directory containing `libsystemd-journal.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libsystemd-journal' found
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was introduced in 906b979b88, which changed
a `goto` to a `break`, but afaics, the intent was still to break out of the loop.
(linter didn't catch this before because it didn't have the right build-tag set)
daemon/logger/journald/read.go:238:4: SA4011: ineffective break statement. Did you mean to break out of the outer loop? (staticcheck)
break // won't be able to write anything anymore
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before this change there was a race condition between State.Wait reading
the exit code from State and the State being changed instantly after the
change which ended the State.Wait.
Now, each State.Wait has its own channel which is used to transmit the
desired StateStatus at the time the state transitions to the awaited
one. Wait no longer reads the status by itself so there is no race.
The issue caused the `docker run --restart=always ...' to sometimes exit
with 0 exit code, because the process was already restarted by the time
State.Wait got the chance to read the exit code.
Test run
--------
Before:
```
$ go test -count 1 -run TestCorrectStateWaitResultAfterRestart .
--- FAIL: TestCorrectStateWaitResultAfterRestart (0.00s)
state_test.go:198: expected exit code 10, got 0
FAIL
FAIL github.com/docker/docker/container 0.011s
FAIL
```
After:
```
$ go test -count 1 -run TestCorrectStateWaitResultAfterRestart .
ok github.com/docker/docker/container 0.011s
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This caused a race condition where AutoRemove could be restored before
container was considered for restart and made autoremove containers
impossible to restart.
```
$ make DOCKER_GRAPHDRIVER=vfs BIND_DIR=. TEST_FILTER='TestContainerWithAutoRemoveCanBeRestarted' TESTFLAGS='-test.count 1' test-integration
...
=== RUN TestContainerWithAutoRemoveCanBeRestarted
=== RUN TestContainerWithAutoRemoveCanBeRestarted/kill
=== RUN TestContainerWithAutoRemoveCanBeRestarted/stop
--- PASS: TestContainerWithAutoRemoveCanBeRestarted (1.61s)
--- PASS: TestContainerWithAutoRemoveCanBeRestarted/kill (0.70s)
--- PASS: TestContainerWithAutoRemoveCanBeRestarted/stop (0.86s)
PASS
DONE 3 tests in 3.062s
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This splits the ImageService methods to separate files, to closer
match the existing implementation, and to reduce the amount of code
per file, making it easier to read, and to reduce merge conflicts if
new functionality is added.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We use "specs" as alias in most places; rename the alias here accordingly
to prevent confusiong and reduce the risk of introducing duplicate imports.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Cory has actively participated in the project for many months, assisted in several
security advisories, code review, and triage, and (in short) already acted a
maintainer for some time (thank you!).
I nominated Cory as a maintainer per e-mail, and we reached quorum, so opening
this pull request to (should he choose to accept it) be added as a maintainer.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Initial pull/ls works
Build is deactivated if the feature is active
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The correct formatting for machine-readable comments is;
//<some alphanumeric identifier>:<options>[,<option>...][ // comment]
Which basically means:
- MUST NOT have a space before `<identifier>` (e.g. `nolint`)
- Identified MUST be alphanumeric
- MUST be followed by a colon
- MUST be followed by at least one `<option>`
- Optionally additional `<options>` (comma-separated)
- Optionally followed by a comment
Any other format will not be considered a machine-readable comment by `gofmt`,
and thus formatted as a regular comment. Note that this also means that a
`//nolint` (without anything after it) is considered invalid, same for `//#nosec`
(starts with a `#`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
go1.18.4 (released 2022-07-12) includes security fixes to the compress/gzip,
encoding/gob, encoding/xml, go/parser, io/fs, net/http, and path/filepath
packages, as well as bug fixes to the compiler, the go command, the linker,
the runtime, and the runtime/metrics package. See the Go 1.18.4 milestone on the
issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.18.4+label%3ACherryPickApproved
This update addresses:
CVE-2022-1705, CVE-2022-1962, CVE-2022-28131, CVE-2022-30630, CVE-2022-30631,
CVE-2022-30632, CVE-2022-30633, CVE-2022-30635, and CVE-2022-32148.
Full diff: https://github.com/golang/go/compare/go1.18.3...go1.18.4
From the security announcement;
https://groups.google.com/g/golang-announce/c/nqrv9fbR0zE
We have just released Go versions 1.18.4 and 1.17.12, minor point releases. These
minor releases include 9 security fixes following the security policy:
- net/http: improper sanitization of Transfer-Encoding header
The HTTP/1 client accepted some invalid Transfer-Encoding headers as indicating
a "chunked" encoding. This could potentially allow for request smuggling, but
only if combined with an intermediate server that also improperly failed to
reject the header as invalid.
This is CVE-2022-1705 and https://go.dev/issue/53188.
- When `httputil.ReverseProxy.ServeHTTP` was called with a `Request.Header` map
containing a nil value for the X-Forwarded-For header, ReverseProxy would set
the client IP as the value of the X-Forwarded-For header, contrary to its
documentation. In the more usual case where a Director function set the
X-Forwarded-For header value to nil, ReverseProxy would leave the header
unmodified as expected.
This is https://go.dev/issue/53423 and CVE-2022-32148.
Thanks to Christian Mehlmauer for reporting this issue.
- compress/gzip: stack exhaustion in Reader.Read
Calling Reader.Read on an archive containing a large number of concatenated
0-length compressed files can cause a panic due to stack exhaustion.
This is CVE-2022-30631 and Go issue https://go.dev/issue/53168.
- encoding/xml: stack exhaustion in Unmarshal
Calling Unmarshal on a XML document into a Go struct which has a nested field
that uses the any field tag can cause a panic due to stack exhaustion.
This is CVE-2022-30633 and Go issue https://go.dev/issue/53611.
- encoding/xml: stack exhaustion in Decoder.Skip
Calling Decoder.Skip when parsing a deeply nested XML document can cause a
panic due to stack exhaustion. The Go Security team discovered this issue, and
it was independently reported by Juho Nurminen of Mattermost.
This is CVE-2022-28131 and Go issue https://go.dev/issue/53614.
- encoding/gob: stack exhaustion in Decoder.Decode
Calling Decoder.Decode on a message which contains deeply nested structures
can cause a panic due to stack exhaustion.
This is CVE-2022-30635 and Go issue https://go.dev/issue/53615.
- path/filepath: stack exhaustion in Glob
Calling Glob on a path which contains a large number of path separators can
cause a panic due to stack exhaustion.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2022-30632 and Go issue https://go.dev/issue/53416.
- io/fs: stack exhaustion in Glob
Calling Glob on a path which contains a large number of path separators can
cause a panic due to stack exhaustion.
This is CVE-2022-30630 and Go issue https://go.dev/issue/53415.
- go/parser: stack exhaustion in all Parse* functions
Calling any of the Parse functions on Go source code which contains deeply
nested types or declarations can cause a panic due to stack exhaustion.
Thanks to Juho Nurminen of Mattermost for reporting this issue.
This is CVE-2022-1962 and Go issue https://go.dev/issue/53616.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes;
- VolumeCreateBody (alias for CreateOptions)
- VolumeListOKBody (alias for ListResponse)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This removes;
- ContainerCreateCreatedBody (alias for CreateResponse)
- ContainerWaitOKBody (alias for WaitResponse)
- ContainerWaitOKBodyError (alias for WaitExitError)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The 22.06 branch was created, so changes in master/main should now be
targeting the next version of the API (1.43).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add pkey_alloc(2), pkey_free(2) and pkey_mprotect(2) in seccomp default profile.
pkey_alloc(2), pkey_free(2) and pkey_mprotect(2) can only configure
the calling process's own memory, so they are existing "safe for everyone" syscalls.
close issue: #43481
Signed-off-by: zhubojun <bojun.zhu@foxmail.com>
- Update IsErrNotFound() to check for the current type before falling back to
detecting the deprecated type.
- Remove unauthorizedError and notImplementedError types, which were not used.
- IsErrPluginPermissionDenied() was added in 7c36a1af03,
but not used at the time, and still appears to be unused.
- Deprecate IsErrUnauthorized in favor of errdefs.IsUnauthorized()
- Deprecate IsErrNotImplemented in favor of errdefs,IsNotImplemented()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Older versions of Go don't format comments, so committing this as
a separate commit, so that we can already make these changes before
we upgrade to Go 1.19.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was caught by goimports;
goimports -w $(find . -type f -name '*.go'| grep -v "/vendor/")
CI doesn't run on these platforms, so didn't catch it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This uses the newer github issues forms to create bug reports and
feature requests.
It also makes it harder for people to accidentally open a blank issue
since there are required fields that must be filled in.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Inlining the string makes the code more grep'able; renaming the
const to "driverName" to reflect the remaining uses of it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Inlining the string makes the code more grep'able; renaming the
const to "driverName" to reflect the remaining uses of it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was effectively a constructor, but through some indirection; make it a
regular function, which is a bit more idiomatic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was effectively a constructor, but through some indirection; make it a
regular function, which is a bit more idiomatic.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/opencontainers/runc/compare/v1.1.2...v1.1.3
This is the third release of the 1.1.z series of runc, and contains
various minor improvements and bugfixes.
- Our seccomp `-ENOSYS` stub now correctly handles multiplexed syscalls on
s390 and s390x. This solves the issue where syscalls the host kernel did not
support would return `-EPERM` despite the existence of the `-ENOSYS` stub
code (this was due to how s390x does syscall multiplexing).
- Retry on dbus disconnect logic in libcontainer/cgroups/systemd now works as
intended; this fix does not affect runc binary itself but is important for
libcontainer users such as Kubernetes.
- Inability to compile with recent clang due to an issue with duplicate
constants in libseccomp-golang.
- When using systemd cgroup driver, skip adding device paths that don't exist,
to stop systemd from emitting warnings about those paths.
- Socket activation was failing when more than 3 sockets were used.
- Various CI fixes.
- Allow to bind mount `/proc/sys/kernel/ns_last_pid` to inside container.
- runc static binaries are now linked against libseccomp v2.5.4.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
iptables package has a function `detectIptables()` called to initialize
some local variables. Since v20.10.0, it first looks for iptables bin,
then ip6tables and finally it checks what iptables flags are available
(including -C). It early exits when ip6tables isn't available, and
doesn't execute the last check.
To remove port mappings (eg. when a container stops/dies), Docker
first checks if those NAT rules exist and then deletes them. However, in
the particular case where there's no ip6tables bin available, iptables
`-C` flag is considered unavailable and thus it looks for NAT rules by
using some substring matching. This substring matching then fails
because `iptables -t nat -S POSTROUTING` dumps rules in a slighly format
than what's expected.
For instance, here's what `iptables -t nat -S POSTROUTING` dumps:
```
-A POSTROUTING -s 172.18.0.2/32 -d 172.18.0.2/32 -p tcp -m tcp --dport 9999 -j MASQUERADE
```
And here's what Docker looks for:
```
POSTROUTING -p tcp -s 172.18.0.2 -d 172.18.0.2 --dport 9999 -j MASQUERADE
```
Because of that, those rules are considered non-existant by Docker and
thus never deleted. To fix that, this change reorders the code in
`detectIptables()`.
Fixes#42127.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Starting with the 22.06 release, buildx is the default client for
docker build, which uses BuildKit as builder.
This patch changes the default builder version as advertised by
the daemon to "2" (BuildKit), so that pre-22.06 CLIs with BuildKit
support (but no buildx installed) also default to using BuildKit
when interacting with a 22.06 (or up) daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
updates the "logentries" dependency;
- checking error when calling output
- Support Go Modules
full diff: 7a984a84b5...fc06dab2ca
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-03-09 20:42:01 +01:00
3740 changed files with 298680 additions and 126241 deletions
@@ -14,7 +14,7 @@ Moby is an open project guided by strong principles, aiming to be modular, flexi
It is open to the community to help set its direction.
- Modular: the project includes lots of components that have well-defined functions and APIs that work together.
- Batteries included but swappable: Moby includes enough components to build fully featured container system, but its modular architecture ensures that most of the components can be swapped by different implementations.
- Batteries included but swappable: Moby includes enough components to build fully featured container systems, but its modular architecture ensures that most of the components can be swapped by different implementations.
- Usable security: Moby provides secure defaults without compromising usability.
- Developer focused: The APIs are intended to be functional and useful to build powerful tools.
They are not necessarily intended as end user tools but as components aimed at developers.
The Engine API is an HTTP API served by Docker Engine. It is the API the
Docker client uses to communicate with the Engine, so everything the Docker
@@ -55,8 +55,8 @@ info:
the URL is not supported by the daemon, a HTTP `400 Bad Request` error message
is returned.
If you omit the version-prefix, the current version of the API (v1.42) is used.
For example, calling `/info` is the same as calling `/v1.42/info`. Using the
If you omit the version-prefix, the current version of the API (v1.43) is used.
For example, calling `/info` is the same as calling `/v1.43/info`. Using the
API without a version-prefix is deprecated and will be removed in a future release.
Engine releases in the near future should support this version of the API,
@@ -214,12 +214,14 @@ definitions:
- `volume` a docker volume with the given `Name`.
- `tmpfs` a `tmpfs`.
- `npipe` a named pipe from the host into the container.
- `cluster` a Swarm cluster volume
type:"string"
enum:
- "bind"
- "volume"
- "tmpfs"
- "npipe"
- "cluster"
example:"volume"
Name:
description:|
@@ -350,12 +352,14 @@ definitions:
- `volume` Creates a volume with the given name and options (or uses a pre-existing volume with the same name and options). These are **not** removed when the container is removed.
- `tmpfs` Create a tmpfs with the given options. The mount source cannot be specified for tmpfs.
- `npipe` Mounts a named pipe from the host into the container. Must exist prior to creating the container.
- `cluster` a Swarm cluster volume
type:"string"
enum:
- "bind"
- "volume"
- "tmpfs"
- "npipe"
- "cluster"
ReadOnly:
description:"Whether the mount should be read-only."
type:"boolean"
@@ -972,6 +976,13 @@ definitions:
items:
type:"integer"
minimum:0
Annotations:
type:"object"
description:|
Arbitrary non-identifying metadata attached to container and
provided to the runtime when the container is started.
additionalProperties:
type:"string"
# Applicable to UNIX platforms
CapAdd:
@@ -1118,6 +1129,7 @@ definitions:
remapping option is enabled.
ShmSize:
type:"integer"
format:"int64"
description:|
Size of `/dev/shm` in bytes. If omitted, the system uses 64MB.
Amount of disk space used by the build cache (in bytes).
type:"integer"
example:51
CreatedAt:
description:|
Date and time at which the build cache was created in
@@ -2281,6 +2357,7 @@ definitions:
example:"2017-08-09T07:09:37.632105588Z"
UsageCount:
type:"integer"
example:26
ImageID:
type:"object"
@@ -2298,6 +2375,8 @@ definitions:
type:"string"
error:
type:"string"
errorDetail:
$ref:"#/definitions/ErrorDetail"
status:
type:"string"
progress:
@@ -4605,7 +4684,8 @@ definitions:
example:false
OOMKilled:
description:|
Whether this container has been killed because it ran out of memory.
Whether a process within this container has been killed because it ran
out of memory since the container was last started.
type:"boolean"
example:false
Dead:
@@ -5195,7 +5275,8 @@ definitions:
SecurityOptions:
description:|
List of security features that are enabled on the daemon, such as
apparmor, seccomp, SELinux, user-namespaces (userns), and rootless.
apparmor, seccomp, SELinux, user-namespaces (userns), rootless and
no-new-privileges.
Additional configuration options for each security feature may
be present, and are included as a comma-separated list of key/value
@@ -6210,6 +6291,28 @@ paths:
`/?[a-zA-Z0-9][a-zA-Z0-9_.-]+`.
type:"string"
pattern:"^/?[a-zA-Z0-9][a-zA-Z0-9_.-]+$"
- name:"platform"
in:"query"
description:|
Platform in the format `os[/arch[/variant]]` used for image lookup.
When specified, the daemon checks if the requested image is present
in the local image cache with the given OS and Architecture, and
otherwise returns a `404` status.
If the option is not set, the host's native OS and Architecture are
used to look up the image in the image cache. However, if no platform
is passed and the given image does exist in the local image cache,
but its OS or architecture does not match, the container is created
with the available image, and a warning is added to the `Warnings`
field in the response, for example;
WARNING: The requested image's platform (linux/arm64/v8) does not
match the detected host platform (linux/amd64) and no
specific platform was requested
type:"string"
default:""
- name:"body"
in:"body"
description:"Container to create"
@@ -6806,9 +6909,9 @@ paths:
Returns which files in a container's filesystem have been added, deleted,
or modified. The `Kind` of modification can be one of:
- `0`: Modified
- `1`: Added
- `2`: Deleted
- `0`: Modified ("C")
- `1`: Added ("A")
- `2`: Deleted ("D")
operationId:"ContainerChanges"
produces:["application/json"]
responses:
@@ -6817,22 +6920,7 @@ paths:
schema:
type:"array"
items:
type:"object"
x-go-name:"ContainerChangeResponseItem"
title:"ContainerChangeResponseItem"
description:"change item in response to ContainerChanges operation"
required:[Path, Kind]
properties:
Path:
description:"Path to file that has changed"
type:"string"
x-nullable:false
Kind:
description:"Kind of change"
type:"integer"
format:"uint8"
enum:[0,1,2]
x-nullable:false
$ref:"#/definitions/FilesystemChange"
examples:
application/json:
- Path:"/dev"
@@ -8159,7 +8247,7 @@ paths:
Available filters:
- `until=<duration>`: duration relative to daemon's time, during which build cache was not used, in Go's duration format (e.g., '24h')
- `until=<timestamp>` remove cache older than `<timestamp>`. The `<timestamp>` can be Unix timestamps, date formatted timestamps, or Go duration strings (e.g. `10m`, `1h30m`) computed relative to the daemon's local time.
- `id=<id>`
- `parent=<id>`
- `type=<string>`
@@ -8658,6 +8746,10 @@ paths:
IdentityToken:"9cbaf023786cd7..."
204:
description:"No error"
401:
description:"Auth error"
schema:
$ref:"#/definitions/ErrorResponse"
500:
description:"Server error"
schema:
@@ -8719,7 +8811,17 @@ paths:
description:"Max API Version the server supports"
Builder-Version:
type:"string"
description:"Default version of docker image builder"
description:|
Default version of docker image builder
The default on Linux is version "2" (BuildKit), but the daemon
can be configured to recommend version "1" (classic Builder).
Windows does not yet support BuildKit for native Windows images,
and uses "1" (classic builder) as a default.
This value is a recommendation as advertised by the daemon, and
it is up to the client to choose which builder to use.
default:"2"
Docker-Experimental:
type:"boolean"
description:"If the server is running with experimental mode enabled"
@@ -9013,7 +9115,7 @@ paths:
BuildCache:
-
ID:"hw53o5aio51xtltp5xjp8v7fx"
Parent:""
Parents:[]
Type:"regular"
Description:"pulled from docker.io/library/debian@sha256:234cb88d3020898631af0ccbbcca9a66ae7306ecd30c9720690858c1b007d2a0"
- `label` (`label=<key>`, `label=<key>=<value>`, `label!=<key>`, or `label!=<key>=<value>`) Prune volumes with (or without, in case `label!=...` is used) the specified labels.
- `all` (`all=true`) - Consider all (local) volumes for pruning and not just anonymous volumes.
returnnil,errors.New("build tag cannot contain a digest")
}
nameWithTag:=reference.TagNameOnly(ref).String()
if_,exists:=uniqNames[nameWithTag];!exists{
uniqNames[nameWithTag]=struct{}{}
repoAndTags=append(repoAndTags,nameWithTag)
}
}
returnrepoAndTags,nil
}
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.