[Intel-wired-lan] iavf null packets and arbitrary memory reads
Nguyen, Anthony L
anthony.l.nguyen at intel.com
Thu Feb 11 02:30:37 UTC 2021
On Wed, 2021-02-10 at 14:56 -0600, JD wrote:
> Hello,
>
> I've encountered a NIC driver bug that leads to null packets being
> transmitted and arbitrary/OOB memory reads by the iavf driver.
>
> I'm unfortunately not sure how the issue starts, but it has been
> happening across many different AMD servers and virtual machines.
>
> Running a tcpdump (tcpdump -i bond0 -nne ether host 00:00:00:00:00:00)
> on bond0 results in these packets being produced at a high rate:
>
> 13:04:14.826298 00:00:00:00:00:00 > 00:00:00:00:00:00, 802.3, length 0: LLC, dsap Null (0x00) Individual, ssap Null (0x00) Command, ctrl 0x0000: Information, send seq 0, rcv seq 0, Flags [Command], length 144
>     0x0000:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>     0x0010:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>     0x0020:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>     0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>     0x0040:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>     0x0050:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>     0x0060:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>     0x0070:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>     0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>
>
> As you can see, they have a dest/src ether of 00:00:00:00:00:00 and
> are completely null. This doesn't happen on every virtual machine,
> some return absolutely nothing.
>
> If I filter the tcpdump command to ignore empty packets (all dots),
> some other interesting items begin to appear:
>
>     0x0500:  0000 0000 0000 0029 0100 071b 0473 656c  .......).....sel
>     0x0510:  696e 7578 7379 7374 656d 5f75 3a6f 626a  inuxsystem_u:obj
>     0x0520:  6563 745f 723a 6269 6e5f 743a 7330 0000  ect_r:bin_t:s0..
> [...]
>     0x0080:  0000 2f75 7372 2f6c 6962 3634 2f70 6572  ../usr/lib64/per
>     0x0090:  6c35 2f76 656e 646f 725f 7065 726c 2f46  l5/vendor_perl/F
>     0x00a0:  696c 652f 5370 6563 2f55 6e69 782e 706d  ile/Spec/Unix.pm
>
> To me, that looks like it's reading data from memory and attempting to
> send from 00:00:00:00:00:00 to 00:00:00:00:00:00.
>
> If I run that same tcpdump on different servers exhibiting the null
> packets, completely different items show up, which also appear to be
> from memory.
>
> Keeping tcpdump running results in the same items from memory being
> repeated indefinitely with no observable variation.
>
> So, it seems like the iavf driver is encountering some bug with memory
> management and ends up transmitting null packets or arbitrary data
> from memory over bond0.
>
> How/why did I notice this behavior? The VMs seem to perform worse
> over the network when this occurs. They usually exhibit small amounts
> of packet loss, or poor SSH responsiveness. Oddly, I have seen this
> bug in the past, and it resulted in dmesg on the parent printing
> Spoofed packet warnings for the i40e driver. Now it does not, yet the
> null packets still occur.
>
> I would like to help in any way I can to resolve this in the iavf/i40e
> driver. I'm happy to provide information from the servers if it's
> needed.
>
> For reference, here is the setup on every single AMD server:
> VM:
> CentOS 7.9
> NIC driver: iavf 4.0.1
> Kernel: 4.19.163
>
> KVM parent:
> CentOS 7.9
> NIC driver: i40e 2.12.6
> Kernel: 4.19.163
> 2x Intel XXV710 for 25GbE SFP28 @ 25Gbps BONDED (Mode 4, LACP)
> Vendor: Supermicro Network Adapter AOC-S25G-i2S
> Firmware version: 7.20 0x800082b3 1.2585.0
> MOBO: Supermicro H11DSU-iN
> CPU: AMD EPYC 7352
>
> And here is the dmesg log (grepped for iavf) from a server that has
> the issue:
> iavf: loading out-of-tree module taints kernel.
> iavf: Intel(R) Ethernet Adaptive Virtual Function Network Driver - version 4.0.1
> iavf 0000:00:06.0: Multiqueue Enabled: Queue pair count = 4
> iavf 0000:00:06.0: MAC address: 52:54:00:7f:bc:39
> iavf 0000:00:06.0: GRO is enabled
> iavf 0000:00:05.0: Multiqueue Enabled: Queue pair count = 4
> iavf 0000:00:05.0: MAC address: 52:54:00:a6:3e:62
> iavf 0000:00:05.0: GRO is enabled
> iavf 0000:00:06.0 eth0: NIC Link is Up Speed is 25 Gbps Full Duplex
> iavf 0000:00:05.0 eth1: NIC Link is Up Speed is 25 Gbps Full Duplex
>
Hi JD,
I will check whether we're aware of this issue or have any information
about it. If not, I'll see if we can work on a reproduction.
Thanks,
Tony
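
For anyone triaging similar captures, the manual steps in the report (separating all-zero frames from the ones that carry leaked-looking data, and checking for zeroed source/destination MACs) could be scripted roughly as below. This is only a sketch and not part of the iavf/i40e tooling: the function names are made up for illustration, and it assumes each hex-dump row sits on a single line in tcpdump's usual offset / hex / ASCII layout (rows re-wrapped by a mail client need to be rejoined first).

```python
import re

# One hex-dump row: optional mail-quote prefix, an offset such as "0x0010:",
# then the hex column and the ASCII column.
ROW = re.compile(r"^[>\s]*0x[0-9a-fA-F]+:\s+(.*)$")


def parse_hexdump(rows):
    """Rebuild the raw bytes from tcpdump-style hex-dump rows."""
    data = bytearray()
    for row in rows:
        match = ROW.match(row)
        if match is None:
            continue  # packet header lines, "[...]" markers, blank lines
        # The hex and ASCII columns are separated by two or more spaces.
        hex_part = re.split(r"\s{2,}", match.group(1).strip(), maxsplit=1)[0]
        if not re.fullmatch(r"[0-9a-fA-F ]+", hex_part):
            continue  # not a data row after all
        data += bytes.fromhex(hex_part.replace(" ", ""))
    return bytes(data)


def is_all_zero(data):
    """True for a non-empty buffer that contains only 0x00 bytes."""
    return len(data) > 0 and not any(data)


def has_zero_macs(frame):
    """True if the destination and source MAC (first 12 bytes) are all zero."""
    return len(frame) >= 12 and frame[:12] == bytes(12)


def printable_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]
```

Feeding it the rows of one captured frame, is_all_zero() flags the "null" packets, while printable_strings() pulls out fragments such as the SELinux context and the Perl module path visible in the report.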