[Intel-wired-lan] [PATCH bpf-next 0/5] Add support for SKIP_BPF flag for AF_XDP sockets

Samudrala, Sridhar sridhar.samudrala at intel.com
Thu Aug 15 16:25:57 UTC 2019


On 8/15/2019 4:12 AM, Toke Høiland-Jørgensen wrote:
> Sridhar Samudrala <sridhar.samudrala at intel.com> writes:
> 
>> This patch series introduces XDP_SKIP_BPF flag that can be specified
>> during the bind() call of an AF_XDP socket to skip calling the BPF
>> program in the receive path and pass the buffer directly to the socket.
>>
>> When a single AF_XDP socket is associated with a queue and a HW
>> filter is used to redirect the packets and the app is interested in
>> receiving all the packets on that queue, we don't need an additional
>> BPF program to do further filtering or lookup/redirect to a socket.
>>
>> Here are some performance numbers collected on
>>    - 2 socket 28 core Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
>>    - Intel 40Gb Ethernet NIC (i40e)
>>
>> All tests use 2 cores and the results are in Mpps.
>>
>> turbo on (default)
>> ---------------------------------------------	
>>                        no-skip-bpf    skip-bpf
>> ---------------------------------------------	
>> rxdrop zerocopy           21.9         38.5
>> l2fwd  zerocopy           17.0         20.5
>> rxdrop copy               11.1         13.3
>> l2fwd  copy                1.9          2.0
>>
>> no turbo :  echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
>> ---------------------------------------------	
>>                        no-skip-bpf    skip-bpf
>> ---------------------------------------------	
>> rxdrop zerocopy           15.4         29.0
>> l2fwd  zerocopy           11.8         18.2
>> rxdrop copy                8.2         10.5
>> l2fwd  copy                1.7          1.7
>> ---------------------------------------------
> 
> You're getting this performance boost by adding more code in the fast
> path for every XDP program; so what's the performance impact of that for
> cases where we do run an eBPF program?

The no-skip-bpf results are pretty close to what I see before the 
patches are applied. As the umem is cached in the rx_ring for zerocopy, 
the overhead is much smaller compared to the copy scenario, where I am 
currently calling xdp_get_umem_from_qid() per packet.
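
To illustrate what I mean (a rough sketch only, not the actual i40e or 
xsk code; the struct and helper names are made up for the example):

/* Sketch: zerocopy caches the umem on the rx ring at setup time,
 * while the copy path in this series looks it up per packet.
 */
struct rx_ring_sketch {
        struct net_device *netdev;
        u16 qid;
        struct xdp_umem *xsk_umem;      /* cached once for zerocopy */
};

/* zerocopy: umem is already on the ring, no lookup in the hot path */
static struct xdp_umem *zc_umem(struct rx_ring_sketch *ring)
{
        return ring->xsk_umem;
}

/* copy mode (current patches): per-packet lookup by queue id */
static struct xdp_umem *copy_umem(struct rx_ring_sketch *ring)
{
        return xdp_get_umem_from_qid(ring->netdev, ring->qid);
}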

> 
> Also, this is basically a special-casing of a particular deployment
> scenario. Without a way to control RX queue assignment and traffic
> steering, you're basically hard-coding a particular app's takeover of
> the network interface; I'm not sure that is such a good idea...

Yes. This is mainly targeted at applications that create one AF_XDP 
socket per RX queue and can use a HW filter (via ethtool or TC flower) 
to redirect packets to a queue or a group of queues.
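
For reference, here is roughly what that model looks like from the 
application side (a minimal sketch; the XDP_SKIP_BPF define below is 
from this series but its value here is only illustrative, and the UMEM 
registration plus fill/rx ring setup that must precede bind() are 
omitted):

#include <linux/if_xdp.h>
#include <net/if.h>
#include <string.h>
#include <sys/socket.h>

#ifndef XDP_SKIP_BPF
#define XDP_SKIP_BPF (1 << 4)   /* proposed flag; value illustrative */
#endif

static int bind_xsk_skip_bpf(int fd, const char *ifname, __u32 queue_id)
{
        struct sockaddr_xdp sxdp;

        /* fd is an AF_XDP socket; XDP_UMEM_REG, XDP_UMEM_FILL_RING and
         * XDP_RX_RING setsockopts are assumed to have been done already.
         */
        memset(&sxdp, 0, sizeof(sxdp));
        sxdp.sxdp_family = AF_XDP;
        sxdp.sxdp_ifindex = if_nametoindex(ifname);
        sxdp.sxdp_queue_id = queue_id;
        sxdp.sxdp_flags = XDP_ZEROCOPY | XDP_SKIP_BPF;

        return bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
}

The HW steering itself is set up separately, e.g. something like 
"ethtool -N <dev> flow-type udp4 dst-port <port> action <queue>" to pin 
a flow to the queue the socket is bound to.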

> 
> -Toke
> 

