[Intel-wired-lan] [PATCH v2 bpf 1/5] net: ethtool: add xdp properties flag set

Toke Høiland-Jørgensen toke at redhat.com
Wed Feb 10 22:52:39 UTC 2021


Jakub Kicinski <kuba at kernel.org> writes:

> On Wed, 10 Feb 2021 11:53:53 +0100 Toke Høiland-Jørgensen wrote:
>> >> I am a bit confused now. Did you mean validation tests of those XDP
>> >> flags, which I am working on or some other validation tests?
>> >> What should these tests verify? Can you please elaborate a bit more
>> >> on the topic - just a few sentences on how you see it?
>> >
>> > Conformance tests can be written for all features, whether they have 
>> > an explicit capability in the uAPI or not. But for those that do IMO
>> > the tests should be required.
>> >
>> > Let me give you an example. This set adds a bit that says Intel NICs 
>> > can do XDP_TX and XDP_REDIRECT, yet we both know of the Tx queue
>> > shenanigans. So can i40e do XDP_REDIRECT or can it not?
>> >
>> > If we have exhaustive conformance tests we can confidently answer that
>> > question. And the answer may not be "yes" or "no", it may actually be
>> > "we need more options because many implementations fall in between".
>> >
>> > I think readable (IOW not written in some insane DSL) tests can also 
>> > be useful for users who want to check which features their program /
>> > deployment will require.  
>> 
>> While I do agree that that kind of conformance test would be great, I
>> don't think it has to hold up this series (the perfect being the enemy
>> of the good, and all that). We have a real problem today that userspace
>> can't tell if a given driver implements, say, XDP_REDIRECT, and so
>> people try to use it and spend days wondering which black hole their
>> packets disappear into. And for things like container migration we need
>> to be able to predict whether a given host supports a feature *before*
>> we start the migration and try to use it.
>
> Unless you have a strong definition of what XDP_REDIRECT means the flag
> itself is not worth much. We're not talking about normal ethtool feature
> flags which are primarily stack-driven, XDP is implemented mostly by
> the driver, each vendor can do their own thing. Maybe I've seen one
> vendor incompatibility too many at my day job to hope for the best...

I'm totally on board with documenting what a feature means. E.g., for
XDP_REDIRECT, whether it's acceptable to fail the redirect in some
situations even when it's active, or if there should always be a
slow-path fallback.

But I disagree that the flag is worthless without it. People are running
into real issues with trying to run XDP_REDIRECT programs on a driver
that doesn't support it at all, and it's incredibly confusing. The
latest example popped up literally yesterday:

https://lore.kernel.org/xdp-newbies/CAM-scZPPeu44FeCPGO=Qz=03CrhhfB1GdJ8FNEpPqP_G27c6mQ@mail.gmail.com/

>> I view the feature flags as a list of features *implemented* by the
>> driver. Which should be pretty static in a given kernel, but may be
>> different than the features currently *enabled* on a given system (due
>> to, e.g., the TX queue stuff).
>
> Hm, maybe I'm not being clear enough. The way XDP_REDIRECT (your
> example) is implemented across drivers differs in meaningful ways.
> Hence the need for conformance testing. We don't have a golden SW
> standard to fall back on, like we do with HW offloads.

I'm not disagreeing that we need to harmonise what "implementing a
feature" means. Maybe I'm just not sure what you mean by "conformance
testing"? What would that look like, specifically? A script in selftest
that sets up a redirect between two interfaces that we tell people to
run? Or what? How would you catch, say, that issue where if a machine
has more CPUs than the NIC has TXQs things start falling apart?
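
To make the implemented-vs-enabled distinction and the CPUs-vs-TXQs case
concrete, here is a toy model of what such a check might distinguish. It
is only a sketch: it is not kernel code, the names (XdpFeature, Nic,
check_redirect) are hypothetical and not taken from the patch series, and
the "one TXQ per CPU" rule is a simplifying assumption, since drivers
differ on exactly this point.

```python
from dataclasses import dataclass
from enum import Flag, auto


class XdpFeature(Flag):
    """Hypothetical feature bits; not the names used by the patch series."""
    NONE = 0
    TX = auto()
    REDIRECT = auto()


@dataclass
class Nic:
    """Toy NIC: what the driver implements, plus one piece of runtime config."""
    implemented: XdpFeature
    num_txqs: int


def enabled_features(nic: Nic, num_cpus: int) -> XdpFeature:
    """Features usable right now.

    Assumption for illustration only: TX-side features need one TXQ per CPU.
    """
    if nic.num_txqs < num_cpus:
        # Implemented by the driver, but not usable with this configuration.
        return nic.implemented & ~(XdpFeature.TX | XdpFeature.REDIRECT)
    return nic.implemented


def check_redirect(nic: Nic, num_cpus: int) -> str:
    """Classify XDP_REDIRECT support into three outcomes instead of yes/no."""
    if XdpFeature.REDIRECT not in nic.implemented:
        return "unsupported"
    if XdpFeature.REDIRECT not in enabled_features(nic, num_cpus):
        return "implemented-but-disabled"
    return "ok"
```

In this model a static flag alone answers the first question (does the
driver implement it at all), while the second needs knowledge of the
running system, which is where a conformance test would have to go
beyond the flag.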

> Also IDK why those tests are considered such a huge ask. As I said most
> vendors probably already have them, and so I'd guess do good distros.
> So let's work together.

I guess what I'm afraid of is that this will end up delaying or stalling
a fix for a long-standing issue (which is how I see this series, as the
example above shows). Maybe you can alleviate that by expanding a bit on
what you mean?

-Toke


