[Intel-wired-lan] [PATCH] igb: Do not call netif_device_detach() when PCIe link goes missing
Brown, Aaron F
aaron.f.brown at intel.com
Fri Feb 2 23:25:37 UTC 2018
> From: netdev-owner at vger.kernel.org
> [mailto:netdev-owner at vger.kernel.org] On Behalf Of Mika Westerberg
> Sent: Tuesday, January 23, 2018 2:29 AM
> To: Kirsher, Jeffrey T <jeffrey.t.kirsher at intel.com>
> Cc: Ferenc Boldog <ferenc.boldog at gmail.com>; Nikolay Bogoychev
> <nheart at gmail.com>; Mika Westerberg
> <mika.westerberg at linux.intel.com>; intel-wired-lan at lists.osuosl.org;
> netdev at vger.kernel.org
> Subject: [PATCH] igb: Do not call netif_device_detach() when PCIe link goes
> missing
>
> When the driver notices that the PCIe link is gone by reading 0xffffffff
> from a register, it clears hw->hw_addr and then calls netif_device_detach().
> This happens when the PCIe device is physically unplugged, for example
> when the user disconnects the Thunderbolt cable.
>
> However, netif_device_detach() prevents netif_unregister() from bringing
> the device down properly, including tearing down MSI-X vectors. This
> triggers the following crash during driver removal:
>
> igb 0000:0b:00.0 enp11s0f0: PCIe link lost, device now detached
> ------------[ cut here ]------------
> kernel BUG at drivers/pci/msi.c:352!
> invalid opcode: 0000 [#1] PREEMPT SMP PTI
> ...
> Call Trace:
> pci_disable_msix+0xc9/0xf0
> igb_reset_interrupt_capability+0x58/0x60 [igb]
> igb_remove+0x90/0x100 [igb]
> pci_device_remove+0x31/0xa0
> device_release_driver_internal+0x152/0x210
> pci_stop_bus_device+0x78/0xa0
> pci_stop_bus_device+0x38/0xa0
> pci_stop_bus_device+0x38/0xa0
> pci_stop_bus_device+0x26/0xa0
> pci_stop_bus_device+0x38/0xa0
> pci_stop_and_remove_bus_device+0x9/0x20
> trim_stale_devices+0xee/0x130
> ? _raw_spin_unlock_irqrestore+0xf/0x30
> trim_stale_devices+0x8f/0x130
> ? _raw_spin_unlock_irqrestore+0xf/0x30
> trim_stale_devices+0xa1/0x130
> ? get_slot_status+0x8b/0xc0
> acpiphp_check_bridge.part.7+0xf9/0x140
> acpiphp_hotplug_notify+0x170/0x1f0
> ...
>
> To prevent the crash, do not call netif_device_detach() in igb_rd32().
> This should be fine because hw->hw_addr is set to NULL, preventing future
> hardware access to the now-missing device.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=198181
> Reported-by: Ferenc Boldog <ferenc.boldog at gmail.com>
> Reported-by: Nikolay Bogoychev <nheart at gmail.com>
> Signed-off-by: Mika Westerberg <mika.westerberg at linux.intel.com>
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
Tested-by: Aaron Brown <aaron.f.brown at intel.com>
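
For readers following the description above, here is a minimal user-space
sketch of the behaviour the patch describes. It is not the driver code and
not the diff itself: struct model_hw, model_rd32() and the netif_detached
flag are hypothetical names invented for illustration, and the check is
simplified to "any all-ones read means the link is gone". The only points
taken from the commit message are that an all-ones read clears hw_addr, that
a NULL hw_addr blocks further hardware access, and that no detach is done in
the read path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct model_hw {
    volatile uint32_t *hw_addr; /* NULL once the device is considered gone */
    bool netif_detached;        /* models the effect of netif_device_detach() */
};

/*
 * Simplified model of a register read helper.
 * An all-ones value is treated as "PCIe link lost". This sketch does not
 * try to distinguish a register that legitimately reads as all ones.
 */
uint32_t model_rd32(struct model_hw *hw, size_t reg_index)
{
    volatile uint32_t *base = hw->hw_addr;
    uint32_t value;

    /* Device already marked as removed: never touch the hardware again. */
    if (base == NULL)
        return UINT32_MAX;

    value = base[reg_index];
    if (value == UINT32_MAX) {
        /* Block all future register access to the missing device. */
        hw->hw_addr = NULL;
        /*
         * Per the patch description, the read path does NOT mark the
         * network interface as detached here, so that the normal
         * teardown can still bring the device down during removal.
         */
    }
    return value;
}
```

In this model, a first all-ones read returns 0xffffffff and clears hw_addr,
every later read returns 0xffffffff without dereferencing the old mapping,
and netif_detached stays false throughout.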