InTheForest/moby

mirror of https://github.com/moby/moby.git synced 2026-01-11 18:51:37 +00:00

Author	SHA1	Message	Date
Sebastiaan van Stijn	ed1406cb93	Merge pull request #50236 from corhere/libn/overlay-really-delete-neigh libn/d/overlay: delete FDB entry from AF_BRIDGE	2025-06-24 18:13:54 +02:00
Albin Kerouanton	a41225dcfd	Merge pull request #50091 from corhere/libn/overlay-refactor-checkencryption libnetwork/d/overlay: simplify the encryption logic	2025-06-23 21:32:02 +02:00
Cory Snider	7a12bbe5d3	libn/d/overlay: delete FDB entry from AF_BRIDGE Starting with commit `0d6e7cd983` DeleteNeighbor() needs to be called with the same options as the AddNeighbor() call that created the neighbor entry. The calls in peerdb were modified incorrectly, resulting in the deletes failing and leaking neighbor entries. Fix up the DeleteNeighbor calls so that the FDB entry is deleted from the FDB instead of the neighbor table, and the neighbor is deleted from the neighbor table instead of the FDB. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-06-19 17:24:16 -04:00
Rob Murray	bf002e51a0	Split [Program|Revoke]ExternalConnectivity out of libnet driverapi Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-06-18 17:38:35 +01:00
Rob Murray	b387526fce	Merge pull request #49981 from robmry/drop_inc_rules Drop DOCKER-ISOLATION rules	2025-06-17 20:22:00 +01:00
Rob Murray	ec185e57cf	Test Nftabler params Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-06-17 18:28:20 +01:00
Rob Murray	c66abe486b	nftabler: add mirrored WSL2 loopback0 workaround Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-06-16 15:19:47 +01:00
Rob Murray	d31956b2f7	Add an outline nftabler Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-06-16 15:19:45 +01:00
Rob Murray	1ad9599da7	Drop DOCKER-ISOLATION rules The Inter-Network Communication rules in the iptables chains DOCKER-ISOLATION-STAGE-1 / DOCKER-ISOLATION-STAGE-2 (which are called from filter-FORWARD) currently: - Block access from containers in one bridge network, to ports published to host addresses by containers in other bridge networks, when the userland-proxy is disabled. - But, that access is allowed when the proxy is enabled. - Block access to all ports on container addresses in gateway mode "nat-unprotected" networks. - But, those ports can be accessed from anywhere else, including other hosts. Just not other bridge networks. - Allow access from containers in "nat" bridge networks to published ports on container addresses in "routed" networks. But, to do that, extra INC rules are added for the routed network. The INC rules are no longer needed to block access from containers in one network to unpublished ports on container addresses in other networks. Direct routing to containers in NAT networks is blocked by the "raw-PREROUTING" rules that block access from untrusted interfaces (all interfaces apart from the network's own bridge). Drop these INC rules to resolve the inconsistencies listed above, with this change: - Published ports on host addresses can be accessed from containers in other networks (even without the userland-proxy). - The rules for direct routing between bridge networks are the same as the rules for direct routing from outside the Docker host (allowed for gw modes "routed" and "nat-unprotected", disallowed for "nat"). Fewer rules, so it's simpler, and perhaps slightly faster. Internal networks (with no access to networks outside the host) are also implemented using rules in the DOCKER-ISOLATION chains. This change moves those rules to a new chain, DOCKER-INTERNAL, and drops the DOCKER-ISOLATION chains. Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-06-16 14:54:31 +01:00
Matthieu MOREL	6d737371b8	fix comparison rule from errorlint Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-06-13 08:26:56 +00:00
Sebastiaan van Stijn	13879e7b49	Merge pull request #50082 from mmorel-35/go-critic fix go-critic linter	2025-06-10 13:09:05 +02:00
Paweł Gronowski	52a8109a6b	Merge pull request #50155 from robmry/windows_no_mirrored_plugin Windows: don't try to load "mirrored" network plugin	2025-06-10 09:36:22 +00:00
Rob Murray	55f47f9e34	Windows: don't try to load "mirrored" network plugin Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-06-09 11:38:13 +01:00
Matthieu MOREL	bc9ec5fc02	fix emptyStringTest from go-critic Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-06-07 09:57:59 +02:00
Rob Murray	793dd8385a	Only "prune" Windows networks created by Docker Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-06-06 20:24:04 +01:00
Rob Murray	9663b36b6d	Merge pull request #50054 from robmry/fix_port_mapping Avoid selecting duplicate host ports for mappings to 0.0.0.0 and specific addresses	2025-06-04 16:46:29 +01:00
Sebastiaan van Stijn	fca97dae9d	libnet/d/overlay/overlayutils: prevent uint32 overflow CodeQL was complaining about the conversion to uint32 Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2025-05-30 14:00:14 +02:00
Cory Snider	df6b405796	libnetwork/d/overlay: drop initEncryption function The (driver).Join function does many things to set up overlay networking. One of the first things it does is call (network).joinSandbox, which in turn calls (driver).initSandboxPeerDB. The initSandboxPeerDB function iterates through the peer db to add entries to the VXLAN FDB, neighbor table and IPsec security association database in the kernel for all known peers on the overlay network. One of the last things the (driver).Join function does is call (*driver).initEncryption. The initEncryption function iterates through the peer db to add entries to the IPsec security association database in the kernel for all known peers on the overlay network. But the preceding initSandboxPeerDB call already did that! The initEncryption function is redundant and can safely be removed. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-29 14:13:13 -04:00
Cory Snider	713f887698	libnetwork/d/overlay: drop checkEncryption function In addition to being three functions in a trenchcoat, the checkEncryption function has a very subtle implementation which is difficult to reason about. That is not a good property for security relevant code to have. Replace two of the three calls to checkEncryption with conditional calls to setupEncryption and removeEncryption, lifting the conditional logic which was hidden away in checkEncryption into the call sites to make it easier to reason about the code. Replace the third call with a call to a new initEncryption function. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-29 14:13:13 -04:00
Cory Snider	cb4e7b2f03	libnetwork/d/overlay: make setupEncryption a method The setupEncryption and removeEncryption functions take several parameters, but all call sites pass the same values for all the parameters aside from remoteIP: values taken from fields of the driver struct. Refactor these functions to be methods of the driver struct and drop the redundant parameters. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-29 14:13:13 -04:00
Cory Snider	0d893252ac	libnetwork/d/overlay: checkEncryption: drop isLocal param Since it is not meaningful to add or remove encryption between the local node and itself, the isLocal parameter is redundant. Setting up encryption for all network peers is now invoked by calling checkEncryption(nid, netip.Addr{}, true) Calling checkEncryption with isLocal=true, add=false is now more explicitly a no-op. It always was effectively a no-op, but that was not easy to spot by inspection. In the world with the isLocal flag, calls to checkEncryption where isLocal=true and add=false would have rIP set to d.advertiseAddr. In other words, it was a request to remove encryption parameters between the local peer and itself if peerDB had no remote-peer entries for the network. So either the call would do nothing, or it would remove encryption parameters that aren't used for anything. Now the equivalent call always does nothing. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-29 14:13:13 -04:00
Cory Snider	4b1c1236b9	libnetwork/d/overlay: peerdb: drop isLocal param Drop the isLocal boolean parameters from the peerDB functions. Local peers have vtep == netip.Addr{}. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-29 14:13:13 -04:00
Cory Snider	48e0b24ff7	libnetwork/d/overlay: elide vtep for local peers The VTEP value for a peer in peerDB is only accurate for a remote peer. The VTEP for a local peer would be the driver's advertise address, which is not necessarily constant for the lifetime of the driver instance. The VTEP values persisted in the peerDB entries for local peers could be stale or missing if not kept in sync with the advertise address. And the peerDB could get polluted with duplicate entries for local peers if the advertise address was to change, as entries which differ only by VTEP are considered distinct by SetMatrix. Persisting the advertise address as the VTEP for local peers creates lots of problems that are not easy to solve. Stop persisting the VTEP for local peers in peerDB. Any code that needs to know the VTEP for local peers can look that up from the source of truth: the driver's advertise address. Use the lack of a VTEP in peerDB entries to signify local peers, making the isLocal flag redundant. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-29 14:13:13 -04:00
Cory Snider	a9e2d6d06e	libnetwork/d/overlay: filter local peers explicitly The overlay driver's checkEncryption function configures the IPSec parameters for the VXLAN tunnels to peer nodes. When called with isLocal=true, it configures encryption for all peer nodes with at least one peerDB entry. Since the local peers are also included in the peerDB, it needs to filter those entries out. It does so by filtering out any peer entries whose VTEP address is equal to the current local advertise address. Trouble is, the local advertise address is not necessarily constant. The driver tries to handle this case by calling peerDBUpdateSelf() when the advertise address changes. This function iterates through the peerDB and tries to update the VTEP address for all local peer entries, but it does not actually do anything: it mutates a temporary copy of the entry which is not persisted back into the peerDB. (It used to be functional, but was broken when the peerDB was extended to use SetMatrix.) So there may be cases where local peer entries are not filtered out properly, resulting in spurious encryption parameters being programmed into the kernel. Filter out local peers when walking the peerDB by filtering on whether the entry has the isLocal flag set. Remove the no-op code which attempts to update local entries in the peerDB. No other code takes any interest in the VTEP value for isLocal peer entries. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-29 14:13:13 -04:00
Cory Snider	f144264bae	Merge pull request #50090 from corhere/libn/overlay-netip libnetwork/d/overlay: use netip types more	2025-05-29 14:12:28 -04:00
Paweł Gronowski	0e2cc22d36	Merge pull request #50049 from robmry/nftables_env_var_enable nftables: enable using env var	2025-05-28 12:58:21 +00:00
Paweł Gronowski	e37efd4c2d	Merge pull request #50068 from mmorel-35/github.com/containerd/errdefs refactor: replace uses of errdefs package	2025-05-28 12:57:15 +00:00
Rob Murray	19dc38f79b	Listen on mapped host ports before mapping more ports Because we set SO_REUSEADDR on sockets for host ports, if there are port mappings for INADDR_ANY (the default) as well as for specific host ports - bind() cannot be used to detect clashes. That means, for example, on daemon startup, if the port allocator returns the first port in its ephemeral range for a specific host address, and the next port mapping is for 0.0.0.0 - the same port is returned and both bind() calls succeed. Then, the container fails to start later when listen() spots the problem and it's too late to find another port. So, bind and listen to each set of ports as they're allocated instead of just binding. Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-05-28 11:38:59 +01:00
Rob Murray	21a165de23	Use env-var DOCKER_FIREWALL_BACKEND=nftables to enable nftables Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-05-28 09:57:26 +01:00
Matthieu MOREL	8561016335	libnetwork: replace uses of errdefs package Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-05-28 05:40:38 +00:00
Cory Snider	d188df0039	libn/d/overlay: use netip types more The netip types are really useful for tracking state in the overlay driver as they are hashable, unlike net.IP and friends, making them directly useable as map keys. Converting between netip and net types is fairly trivial, but fewer conversions is more ergonomic. The NetworkDB entries for the overlay peer table encode the IP addresses as strings. We need to parse them to some representation before processing them further. Parse directly into netip types and pass those values around to cut down on the number of conversions needed. The peerDB needs to marshal the keys and entries to structs of hashable values to be able to insert them into the SetMatrix. Use netip.Addr in peerEntry so that peerEntry values can be directly inserted into the SetMatrix without conversions. Use a hashable struct type as the SetMatrix key to avoid having to marshal the whole struct to a string and parse it back out. Use netip.Addr as the map key for the driver's encryption map so the values do not need to be converted to and from strings. Change the encryption configuration methods to take netip types so the peerDB code can pass netip values directly. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-27 13:47:11 -04:00
Cory Snider	0317f773a6	libnetwork/internal/setmatrix: make keys generic Make the SetMatrix key's type generic so that e.g. netip.Addr values can be used as matrix keys. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-27 13:29:41 -04:00
Cory Snider	0d6e7cd983	libnetwork/osl: stop tracking neighbor entries The Namespace keeps some state for each inserted neighbor-table entry which is used to delete the entry (and any related entries) given only the IP and MAC address of the entry to delete. This state is not strictly required as the retained data is a pure function of the parameters passed to AddNeighbor(), and the kernel can inform us whether an attempt to add a neighbor entry would conflict with an existing entry. Get rid of the neighbor state in Namespace. It's just one more piece of state that can cause lots of grief if it falls out of sync with ground truth. Require callers to call DeleteNeighbor() with the same arguments as they had passed to AddNeighbor(). Push the responsibility for detecting attempts to insert conflicting entries into the neighbor table onto the kernel by using (*netlink.Handle).NeighAdd() instead of NeighSet(). Modernize the error messages and logging in DeleteNeighbor() and AddNeighbor(). Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-27 11:46:21 -04:00
Cory Snider	3bdf99d127	libn/osl: drop unused AddNeighbor force parameter func (*Namespace) AddNeighbor is only ever called with the force parameter set to false. Remove the parameter and eliminate dead code. Signed-off-by: Cory Snider <csnider@mirantis.com>	2025-05-27 11:30:30 -04:00
Rob Murray	0facb0cd63	Merge pull request #49969 from robmry/firewaller_wsl2_param Make WSL2Mirrored into a Firewaller param	2025-05-21 15:32:04 +01:00
Sebastiaan van Stijn	2a96d2eb8d	align //go:build versions Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2025-05-16 17:52:43 +02:00
Matthieu MOREL	205ba05feb	fix usestdlibvars Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-05-15 18:24:58 +02:00
Rob Murray	eeba428939	Make WSL2Mirrored a Firewaller param The bridge driver should figure out whether it's running in a mirrored WSL2 setup, and tell the firewaller. So, move the WSL2-deciding code back into the bridge driver and unit test it there. Use TestIptabler to check the rules are constructed properly. Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-05-14 09:38:32 +01:00
Rob Murray	ba0ad9e80f	Unit test the bridge driver in terms of its firewaller Don't inspect iptables rules, because the driver's Firewaller won't always be an iptabler. Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-05-13 18:27:55 +01:00
Rob Murray	daeb080ff1	Test Iptabler params Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-05-12 17:30:40 +01:00
Rob Murray	cb38cc0fdd	Merge pull request #49860 from robmry/firewaller_interface Firewaller interface	2025-05-12 14:18:16 +01:00
Sebastiaan van Stijn	cfdfbfab9b	libnetwork/drivers/remote: inline decodeToMap Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2025-05-08 14:33:09 +02:00
Rob Murray	b0777be89e	Use firewaller.IPVersion instead of iptables.IPVersion for gwmode Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-05-06 10:18:22 +01:00
Rob Murray	3cbb1ae736	Move filter-FORWARD DROP setting to the firewaller Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-05-06 10:18:22 +01:00
Rob Murray	44843d9917	Pass context to more places Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-05-06 10:18:22 +01:00
Rob Murray	a9bf151260	Put Iptabler behind a Firewaller interface. Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-05-06 10:18:22 +01:00
Matthieu MOREL	70139978d3	fix(ST1016): Use consistent method receiver names Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2025-05-01 14:30:52 +02:00
Sebastiaan van Stijn	4ae3705d06	Merge pull request #49895 from thaJeztah/linting_fix_noshadow fix various "no shadow" linting issues (govet)	2025-05-01 13:09:38 +02:00
Rob Murray	44a3453d73	Add daemon option --allow-direct-routing Per-network option com.docker.network.bridge.trusted-host-interfaces accepts a list of interfaces that are allowed to route directly to a container's published ports in a bridge network with nat enabled. This daemon level option disables direct access filtering, enabling direct access to published ports on container addresses in all bridge networks, via all host interfaces. It overlaps with short-term env-var workaround: DOCKER_INSECURE_NO_IPTABLES_RAW=1 - it does not allow packets sent from outside the host to reach ports published only to 127.0.0.1 - it will outlive iptables (the workaround was initially intended for hosts that do not have kernel support for the "raw" iptables table). Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-04-30 20:59:28 +01:00
Rob Murray	c16caabe36	Add TestNetworkConfigurationMarshalling Signed-off-by: Rob Murray <rob.murray@docker.com>	2025-04-30 20:59:28 +01:00


1168 Commits