The only viable way to allocate a port is to bind a socket to it and
listen on it. So, the Windows PortMapper was really a PortAllocator in
disguise. Rename it to OSAllocator and move it to the portallocator
package.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The previous commit changed the OSAllocator to listen after binding a
port, so we can be sure that the port is free. We can now make the
OSAllocator responsible for retrying port allocations when it tries to
find an ephemeral port, or a free port in a range.
Move the retry logic from the 'nat' portmapper to the OSAllocator.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Move the listen syscall to the `OSAllocator` such that when
`RequestPortsInRange` returns, callers are guaranteed that the allocated
port isn't used by another process.
Bind and listen syscalls were previously split because listening before
inserting DNAT rules could cause connections to be accepted by the
kernel, so packets would never be forwarded to the container.
However, splitting them has a drawback: if another process is racing
against the Engine and starts listening on the same port, the conflict
isn't detected until the OSAllocator's callers issue the 'listen'
syscall. This means that callers need to implement their own retry
logic.
To overcome both drawbacks, set a cBPF socket filter on the socket
before it's bound, and let callers call `DetachSocketFilter` to remove
it. Now, callers are guaranteed that the port is free to use, and no
connections will be accepted prematurely.
For TCP / SCTP clients, this means that they'll send the first handshake
packet (e.g. SYN), but the kernel won't reply (e.g. with a SYN-ACK), and
they will retry until DNAT rules are configured or the socket filter is
removed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The userland proxy uses unconnected UDP sockets to receive packets from
anywhere, so enabling SO_REUSEADDR means that multiple sockets can bind
to the same port. This defeats the purpose of the portallocator, which is
supposed to ensure that the port is free and not already in use (either
by us, or by another process). So, do not enable SO_REUSEADDR for UDP
sockets.
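The effect can be reproduced with a small sketch, assuming Linux
semantics for SO_REUSEADDR on UDP sockets. `bindUDP` is a hypothetical
helper written for this example, not part of the portallocator.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"syscall"
)

// bindUDP (hypothetical) binds an unconnected UDP socket on addr, optionally
// enabling SO_REUSEADDR before the bind.
func bindUDP(addr string, reuse bool) (net.PacketConn, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			if !reuse {
				return nil
			}
			var soErr error
			err := c.Control(func(fd uintptr) {
				soErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, syscall.SO_REUSEADDR, 1)
			})
			if err != nil {
				return err
			}
			return soErr
		},
	}
	return lc.ListenPacket(context.Background(), "udp4", addr)
}

// secondBindErr binds an ephemeral UDP port, then tries to bind the same
// address a second time with the same SO_REUSEADDR setting, and returns the
// error from that second bind (nil if it succeeded).
func secondBindErr(reuse bool) error {
	first, err := bindUDP("127.0.0.1:0", reuse)
	if err != nil {
		return err
	}
	defer first.Close()
	second, err := bindUDP(first.LocalAddr().String(), reuse)
	if err != nil {
		return err
	}
	return second.Close()
}

func main() {
	fmt.Println("second bind without SO_REUSEADDR:", secondBindErr(false))
	fmt.Println("second bind with SO_REUSEADDR:", secondBindErr(true))
}
```

Without the option, the second bind fails and the conflict is visible to
the allocator; with it, the bind succeeds on Linux and the conflict goes
unnoticed.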
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This new struct allocates ports from the operating system by creating
sockets and binding them. It's based on the existing bindTCPOrUDP and
bindSCTP functions previously defined in the bridge driver. It tries to
detect conflicts on a best-effort basis, and doesn't guarantee that the
ports it allocates are not in use by other processes.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>