[Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

Hisashi T Fujinaka htodd at twofifty.com
Thu Nov 10 13:48:20 UTC 2016


On Thu, 10 Nov 2016, Corinna Vinschen wrote:

> On Nov  8 11:33, Alexander Duyck wrote:
...
>> The question I would have is what is reading the device when it is in
>> this state.  The watchdog and any other functions that would read the
>> device should be disabled.
>>
>> One possibility could be a race between a call to igb_close and the
>> igb_suspend function.  We have seen some of those pop up recently on
>> ixgbe and it looks like igb has the same bug.  We should probably be
>> using the rtnl_lock to guarantee that netif_device_detach and the call
>> to __igb_close are completed before igb_close could possibly be called
>> by the network stack.
>
> Do you have a pointer to the related ixgbe patch, by any chance?
...
>> The thing is that a suspended device should not be accessed at all.
>> If we are accessing it while it is suspended then that is a bug.  If
>> you could throw a WARN_ON call in igb_rd32 to capture where this is
>> being triggered that might be useful.
>>
>>> - Otherwise assume it's actually a surprise removal.  In theory that
>>>   should somehow trigger a device removal sequence, kind of like
>>>   calling igb_remove, no?
>>
>> Well a read of the MMIO region while suspended is more of a surprise
>> read since there shouldn't be anything going on.  We need to isolate
>> where that read is coming from and fix it.
>
> That would be ideal, but the problem couldn't be reproduced yet apart
> from at a customer's customer site.  It's not clear yet if we can access
> the machine for further testing.

Here's the initial patch for igb I have, but it's on hold awaiting more
changes in ixgbe regarding AER.

-- 
Hisashi T Fujinaka - htodd at twofifty.com
BSEE + BSChem + BAEnglish + MSCS + $2.50 = coffee
-------------- next part --------------
From todd.fujinaka at intel.com Thu Nov 10 05:44:02 2016
Date: Thu, 10 Nov 2016 05:36:13 -0800
From: Todd Fujinaka <todd.fujinaka at intel.com>
To: htodd at twofifty.com
Subject: [PATCH] igb: handle close/suspend race netif_device_detach

Similar to ixgbe, when an interface is part of a namespace it is
possible that igb_close() may be called while __igb_shutdown() is
running ending up in a double free WARN and/or a BUG in
free_msi_irqs().

Extend the rtnl_lock() to protect the call to netif_device_detach() and
igb_clear_interrupt_scheme() in __igb_shutdown() and check for
netif_device_present() to avoid clearing the interrupts second time in
igb_close().

Signed-off-by: Todd Fujinaka <todd.fujinaka at intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 4feca69..ce5add6 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3275,7 +3275,10 @@ static int __igb_close(struct net_device *netdev, bool suspending)
 
 int igb_close(struct net_device *netdev)
 {
-	return __igb_close(netdev, false);
+	if (netif_device_present(netdev))
+		return __igb_close(netdev, false);
+
+	return 0;
 }
 
 /**
@@ -7537,6 +7540,7 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake,
 	int retval = 0;
 #endif
 
+	rtnl_lock();
 	netif_device_detach(netdev);
 
 	if (netif_running(netdev))
@@ -7545,6 +7549,7 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake,
 	igb_ptp_suspend(adapter);
 
 	igb_clear_interrupt_scheme(adapter);
+	rtnl_unlock();
 
 #ifdef CONFIG_PM
 	retval = pci_save_state(pdev);
@@ -7663,16 +7668,17 @@ static int igb_resume(struct device *dev)
 
 	wr32(E1000_WUS, ~0);
 
-	if (netdev->flags & IFF_UP) {
-		rtnl_lock();
+	rtnl_lock();
+
+	if (!err && netif_running(netdev))
 		err = __igb_open(netdev, true);
-		rtnl_unlock();
-		if (err)
-			return err;
-	}
 
-	netif_device_attach(netdev);
-	return 0;
+	if (!err)
+		netif_device_attach(netdev);
+
+	rtnl_unlock();
+
+	return err;
 }
 
 static int igb_runtime_idle(struct device *dev)
-- 
2.7.4


More information about the Intel-wired-lan mailing list