Stopping the Engine while a container with autoremove set is running may
leave behind dead containers on disk. These containers aren't reclaimed
on next start, appear as "dead" in `docker ps -a` and can't be
inspected or removed by the user.
This bug has existed for a long time but became user-visible with
9f5f4f5a42. Prior to that commit,
containers with no rwlayer weren't added to the in-memory viewdb, so
they weren't visible in `docker ps -a`. However, some dangling files
would still live on disk (e.g. a folder in /var/lib/docker/containers,
mount points, etc).
The underlying issue is that when the daemon stops, it tries to stop all
running containers and then closes the containerd client. This leaves a
small window of time where the Engine might receive 'task stop' events
from containerd, and trigger autoremove. If the containerd client is
closed in parallel, the Engine is unable to complete the removal,
leaving the container in 'dead' state. In such a case, the Engine logs the
following error:
cannot remove container "bcbc98b4f5c2b072eb3c4ca673fa1c222d2a8af00bf58eae0f37085b9724ea46": Canceled: grpc: the client connection is closing: context canceled
Solving the underlying issue would require complex changes to the
shutdown sequence. Moreover, the same issue could also happen if the
daemon crashes while it deletes a container. Thus, add a cleanup step
on daemon startup to remove these dead containers.
Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
When inspecting multi-platform images where some layer blobs were
missing from the content store, the image inspect operation would return
too early, causing some data (like config details or unpacked size) to be
omitted even though it is available.
This ensures that `docker image inspect` returns as much information as
possible.
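As a rough illustration of the "collect what is available" behaviour, the
sketch below records missing blobs instead of returning at the first one.
The `store`, `summary` and `inspect` names are hypothetical and do not
mirror the real image service; only genuinely unexpected errors abort.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("blob not found")

// store is a hypothetical content store mapping a digest to a blob size.
type store map[string]int64

func (s store) size(digest string) (int64, error) {
	n, ok := s[digest]
	if !ok {
		return 0, errNotFound
	}
	return n, nil
}

// summary is a hypothetical, trimmed-down inspect result.
type summary struct {
	ConfigSize int64
	LayersSize int64
	Missing    []string
}

// inspect gathers every size it can find. A missing blob is noted in
// Missing and skipped; any other error is returned to the caller.
func inspect(s store, config string, layers []string) (summary, error) {
	var out summary
	add := func(digest string, dst *int64) error {
		n, err := s.size(digest)
		if errors.Is(err, errNotFound) {
			out.Missing = append(out.Missing, digest)
			return nil
		}
		if err != nil {
			return err
		}
		*dst += n
		return nil
	}
	if err := add(config, &out.ConfigSize); err != nil {
		return out, err
	}
	for _, l := range layers {
		if err := add(l, &out.LayersSize); err != nil {
			return out, err
		}
	}
	return out, nil
}

func main() {
	s := store{"cfg": 10, "layer1": 100}
	out, err := inspect(s, "cfg", []string{"layer1", "layer2"})
	fmt.Printf("%+v %v\n", out, err)
}
```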
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Add integration tests for Windows container functionality focusing on network drivers and container isolation modes.
Signed-off-by: Sopho Merkviladze <smerkviladze@mirantis.com>
Call resolvconf.UserModified() in sandbox.setupDNS() to check if
resolv.conf was manually modified before regenerating it during
container restart for non-host network modes.
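The idea of a user-modification check can be sketched as below. This is
an assumption-laden illustration, not the resolvconf package itself: it
assumes a digest of the generated file is saved at generation time and
compared on restart, and `digest`, `userModified` and `regenerate` are
hypothetical helper names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digest returns the hex-encoded SHA-256 of content.
func digest(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// userModified reports whether current differs from what was last generated,
// given the digest saved at generation time. An empty saved digest means
// nothing was generated yet, so there is nothing to protect.
func userModified(current []byte, savedDigest string) bool {
	if savedDigest == "" {
		return false
	}
	return digest(current) != savedDigest
}

// regenerate returns the content to keep on restart: the user's edited file
// if it was modified, otherwise the freshly generated one.
func regenerate(current, fresh []byte, savedDigest string) []byte {
	if userModified(current, savedDigest) {
		return current
	}
	return fresh
}

func main() {
	generated := []byte("nameserver 127.0.0.11\n")
	saved := digest(generated)

	edited := []byte("nameserver 10.0.0.53\n")
	fresh := []byte("nameserver 127.0.0.11\noptions ndots:0\n")

	fmt.Printf("%q\n", regenerate(edited, fresh, saved))
	fmt.Printf("%q\n", regenerate(generated, fresh, saved))
}
```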
Signed-off-by: zhangguanzhang <zhangguanzhang@qq.com>
Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
Migrated TestAPIImagesDelete from the legacy integration-cli suite
(docker_api_images_test.go) to the new integration test framework under
integration/image/remove_test.go.
This update:
- Fixes ENV instruction syntax to use "ENV FOO=bar"
- Adds error type check using errdefs.IsNotFound for cleaner assertions
- Ensures consistent cleanup handling
Signed-off-by: Aditya Mishra <mishraaditya675@gmail.com>
We've seen various failures recently where GitHub Actions runners are
running out of space. Skip this test for now.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the DNS name still resolves to an IP address, and that address is
assigned to a running container, the ping command will run indefinitely
and the test suite will only time out after 10 minutes.
This is confusing, as it looks like a daemon hang, or a test suite hang,
whereas it's just a test failure. Add '-c1' to ping to make it return
immediately.
Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
The previous commit reverted a faulty change that broke DNS resolution for
non swarm-scoped networks once a node has joined a Swarm cluster.
This commit adds an integration test to verify that we don't break DNS
resolution again.
Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
Add WithAPIVersion and WithAPIVersionFromEnv to make the intent
clearer, and to align with other related options and fields.
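For readers unfamiliar with the option style, here is a minimal sketch of
how two such options could relate. It uses a hypothetical `client` struct
and lowercase option names, not the real client package; the only detail
taken from the Docker client is the DOCKER_API_VERSION environment
variable. An empty value leaves the default untouched.

```go
package main

import (
	"fmt"
	"os"
)

// client is a hypothetical, minimal stand-in for an API client.
type client struct {
	version       string
	manualVersion bool // true once a version was set explicitly
}

// opt configures a client.
type opt func(*client) error

// withAPIVersion pins the API version. An empty value is ignored.
func withAPIVersion(v string) opt {
	return func(c *client) error {
		if v != "" {
			c.version = v
			c.manualVersion = true
		}
		return nil
	}
}

// withAPIVersionFromEnv pins the API version to DOCKER_API_VERSION, if set.
func withAPIVersionFromEnv() opt {
	return func(c *client) error {
		return withAPIVersion(os.Getenv("DOCKER_API_VERSION"))(c)
	}
}

// newClient applies opts in order on top of a placeholder default version.
func newClient(opts ...opt) (*client, error) {
	c := &client{version: "default"}
	for _, o := range opts {
		if err := o(c); err != nil {
			return nil, err
		}
	}
	return c, nil
}

func main() {
	c, err := newClient(withAPIVersionFromEnv(), withAPIVersion("1.47"))
	if err != nil {
		panic(err)
	}
	fmt.Println(c.version, c.manualVersion)
}
```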
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On daemon shutdown, the HTTP server tries to gracefully shut down for 5
seconds. If there's an open API connection to the '/events' endpoint, it
fails to do so as nothing interrupts that connection, thus forcing the
daemon to wait until that timeout is reached.
Add a Close method to the EventsService, and call it during daemon
shutdown. It'll close any events channel, signaling to the '/events'
handler to return and close the connection.
It now takes ~1s (or less) to shut down the daemon when there's an active
'/events' connection, instead of 5s.
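The mechanism can be sketched with a small publish/subscribe type. This
is an illustrative stand-in and not the daemon's events package: the
`events` type and the `handle` function are hypothetical. Closing every
subscriber channel ends the handler's receive loop, which is what lets a
streaming connection return promptly.

```go
package main

import (
	"fmt"
	"sync"
)

// events is a hypothetical, minimal publish/subscribe service.
type events struct {
	mu     sync.Mutex
	subs   map[chan string]struct{}
	closed bool
}

func newEvents() *events {
	return &events{subs: make(map[chan string]struct{})}
}

// Subscribe registers a listener. After Close, it returns an already-closed
// channel so late subscribers return immediately.
func (e *events) Subscribe() <-chan string {
	e.mu.Lock()
	defer e.mu.Unlock()
	ch := make(chan string, 16)
	if e.closed {
		close(ch)
		return ch
	}
	e.subs[ch] = struct{}{}
	return ch
}

// Publish delivers msg to every subscriber, dropping it for any subscriber
// whose buffer is full.
func (e *events) Publish(msg string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	for ch := range e.subs {
		select {
		case ch <- msg:
		default:
		}
	}
}

// Close closes every subscriber channel. It is safe to call more than once.
func (e *events) Close() {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.closed {
		return
	}
	e.closed = true
	for ch := range e.subs {
		close(ch)
		delete(e.subs, ch)
	}
}

// handle mimics a streaming handler: it consumes events until the channel
// is closed and returns how many it saw.
func handle(ch <-chan string) int {
	n := 0
	for range ch {
		n++
	}
	return n
}

func main() {
	ev := newEvents()
	ch := ev.Subscribe()
	done := make(chan int)
	go func() { done <- handle(ch) }()

	ev.Publish("start")
	ev.Close() // unblocks the handler
	fmt.Println("handler returned after", <-done, "event(s)")
}
```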
Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>