[Intel-wired-lan] [net-next PATCH v3 00/17] Future-proof tunnel offload handlers

Tue Jun 21 10:41:10 UTC 2016

On 21/06/16 09:22, David Miller wrote:
> From: Tom Herbert <tom at herbertland.com>
> Date: Mon, 20 Jun 2016 10:05:01 -0700
>> Generally, this means it needs to at least match by local addresses and port for an unconnected/unbound socket, the source address for an unconnected/bound socket, and the full 4-tuple for a connected socket.
> These lookup keys are all insufficient. At the very least the network namespace must be in the lookup key as well if you want to match "sockets".
But the card doesn't have to be told that; instead, only push a socket to
a device offload if the device is in the same ns as the socket.  Wouldn't
that work?
Anything beyond that - i.e. supporting cross-ns offloads - would require
knowing how packets / addresses get transformed in bridging them from one
ns to another and in general that's quite a wide set of possibilities; it
doesn't seem worthwhile.  Especially since the likely use-case of tunnels
plus containers is that the host does the decapsulation and transparently
gives the container a virtual ethernet device, which keeps the hardware
and the "socket" in the same ns.
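
To make the same-ns rule concrete, here is a rough sketch in Python.  It is
not kernel code: NetNS, Device, TunnelSocket and may_offload() are all
hypothetical names invented for illustration, and the only thing it models
is "push to the device only if socket and device share a namespace".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetNS:
    """Hypothetical stand-in for a network namespace."""
    name: str


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for a network device living in one ns."""
    name: str
    netns: NetNS


@dataclass(frozen=True)
class TunnelSocket:
    """Hypothetical stand-in for a UDP tunnel socket bound in one ns."""
    local_port: int
    netns: NetNS


def may_offload(sock: TunnelSocket, dev: Device) -> bool:
    # Only offload when the socket and the device share a namespace, so
    # the card never needs the namespace in its lookup key.
    return sock.netns == dev.netns


host = NetNS("host")
container = NetNS("container")
eth0 = Device("eth0", host)

host_sock = TunnelSocket(4789, host)            # decap done by the host
container_sock = TunnelSocket(4789, container)  # same port, other ns

print(may_offload(host_sock, eth0))
print(may_offload(container_sock, eth0))
```

With this rule the container's socket is simply never pushed to eth0, so
the device never sees a key it would need the ns to disambiguate.
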
> But anyways, the vastness of the key is why we want to keep "sockets"
> out of network cards, because proper support of "sockets" requires
> access to information the card simply does not and should not have.

I think Tom's talk of "sockets" is a red herring; it's more a question of
"flows".  If we think of our host as a black box, its decisions ("is this
traffic encapsulated?") necessarily depend upon the 5-tuple plus the
(implicit) information that the traffic is being received on a particular
interface.
Netns are another red herring: even without them, what if our host is a
router with NAT, forwarding traffic to another host?  Now you're trying to
match a "socket" on another host (in, perhaps, another IP-address
namespace), but the "flow" is still the same: it's defined in terms of the
addresses on the incoming traffic, not what they might get NATted to by
the time the packets hit an actual socket.

So AFAICT, flow matching up to and including 5-tuple is both necessary and
sufficient for correct UDP tunnel detection in HW.  Sadly most HW
(including our latest here at sfc) thinks it only needs UDP dest port :(
and for such HW, Tom is right that we can't mix it with forwarding, and
have to reserve the port in all ns.
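
A toy illustration of the difference, again in Python and again with
made-up names (FlowKey, FiveTupleMatcher, DportOnlyMatcher); it describes
no real NIC or driver interface.  For simplicity it does exact matching on
the full key, whereas "up to and including 5-tuple" would also allow
wildcarded fields.  The addresses and port 4789 are just example values.

```python
from typing import NamedTuple

UDP = 17  # IP protocol number for UDP


class FlowKey(NamedTuple):
    """5-tuple plus the (implicit) ingress interface."""
    ifindex: int
    proto: int
    saddr: str
    daddr: str
    sport: int
    dport: int


class FiveTupleMatcher:
    """Hypothetical HW that matches the whole flow key."""

    def __init__(self):
        self.flows = set()

    def add(self, key: FlowKey) -> None:
        self.flows.add(key)

    def is_tunnel(self, key: FlowKey) -> bool:
        return key in self.flows


class DportOnlyMatcher:
    """Hypothetical HW that only looks at the UDP destination port."""

    def __init__(self):
        self.ports = set()

    def add(self, dport: int) -> None:
        self.ports.add(dport)

    def is_tunnel(self, key: FlowKey) -> bool:
        return key.proto == UDP and key.dport in self.ports


# Encapsulated traffic terminating on this host.
tunnel = FlowKey(1, UDP, "10.0.0.2", "10.0.0.1", 40000, 4789)
# Plain UDP we merely forward (or NAT) to another host, same dest port.
forwarded = FlowKey(1, UDP, "10.0.0.3", "192.168.1.5", 40001, 4789)

full = FiveTupleMatcher()
full.add(tunnel)
port_only = DportOnlyMatcher()
port_only.add(tunnel.dport)

print(full.is_tunnel(tunnel), full.is_tunnel(forwarded))
print(port_only.is_tunnel(tunnel), port_only.is_tunnel(forwarded))
```

The dest-port-only matcher treats the forwarded flow as encapsulated too,
which is why such HW can't be mixed with forwarding unless the port is
reserved everywhere.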

-Ed