[Intel-wired-lan] [bug report] iavf: Fix deadlock in iavf_reset_task
Tony Nguyen
anthony.l.nguyen at intel.com
Fri Mar 4 18:04:38 UTC 2022
Hi Dan,
On 3/4/2022 3:53 AM, Dan Carpenter wrote:
> Hello Slawomir Laba,
>
> The patch e85ff9c631e1: "iavf: Fix deadlock in iavf_reset_task" from
> Feb 23, 2022, leads to the following Smatch static checker warning:
>
> drivers/net/ethernet/intel/iavf/iavf_main.c:2691 iavf_reset_task()
> error: double unlocked '&adapter->crit_lock' (orig line 2689)
>
> drivers/net/ethernet/intel/iavf/iavf_main.c
> 2613 static void iavf_reset_task(struct work_struct *work)
> 2614 {
> 2615 struct iavf_adapter *adapter = container_of(work,
> 2616 struct iavf_adapter,
> 2617 reset_task);
> 2618 struct virtchnl_vf_resource *vfres = adapter->vf_res;
> 2619 struct net_device *netdev = adapter->netdev;
> 2620 struct iavf_hw *hw = &adapter->hw;
> 2621 struct iavf_mac_filter *f, *ftmp;
> 2622 struct iavf_cloud_filter *cf;
> 2623 u32 reg_val;
> 2624 int i = 0, err;
> 2625 bool running;
> 2626
> 2627 /* When device is being removed it doesn't make sense to run the reset
> 2628 * task, just return in such a case.
> 2629 */
> 2630 if (!mutex_trylock(&adapter->crit_lock)) {
> 2631 if (adapter->state != __IAVF_REMOVE)
> 2632 queue_work(iavf_wq, &adapter->reset_task);
> 2633
> 2634 return;
> 2635 }
> 2636
> 2637 while (!mutex_trylock(&adapter->client_lock))
> 2638 usleep_range(500, 1000);
> 2639 if (CLIENT_ENABLED(adapter)) {
> 2640 adapter->flags &= ~(IAVF_FLAG_CLIENT_NEEDS_OPEN |
> 2641 IAVF_FLAG_CLIENT_NEEDS_CLOSE |
> 2642 IAVF_FLAG_CLIENT_NEEDS_L2_PARAMS |
> 2643 IAVF_FLAG_SERVICE_CLIENT_REQUESTED);
> 2644 cancel_delayed_work_sync(&adapter->client_task);
> 2645 iavf_notify_client_close(&adapter->vsi, true);
> 2646 }
> 2647 iavf_misc_irq_disable(adapter);
> 2648 if (adapter->flags & IAVF_FLAG_RESET_NEEDED) {
> 2649 adapter->flags &= ~IAVF_FLAG_RESET_NEEDED;
> 2650 /* Restart the AQ here. If we have been reset but didn't
> 2651 * detect it, or if the PF had to reinit, our AQ will be hosed.
> 2652 */
> 2653 iavf_shutdown_adminq(hw);
> 2654 iavf_init_adminq(hw);
> 2655 iavf_request_reset(adapter);
> 2656 }
> 2657 adapter->flags |= IAVF_FLAG_RESET_PENDING;
> 2658
> 2659 /* poll until we see the reset actually happen */
> 2660 for (i = 0; i < IAVF_RESET_WAIT_DETECTED_COUNT; i++) {
> 2661 reg_val = rd32(hw, IAVF_VF_ARQLEN1) &
> 2662 IAVF_VF_ARQLEN1_ARQENABLE_MASK;
> 2663 if (!reg_val)
> 2664 break;
> 2665 usleep_range(5000, 10000);
> 2666 }
> 2667 if (i == IAVF_RESET_WAIT_DETECTED_COUNT) {
> 2668 dev_info(&adapter->pdev->dev, "Never saw reset\n");
> 2669 goto continue_reset; /* act like the reset happened */
> 2670 }
> 2671
> 2672 /* wait until the reset is complete and the PF is responding to us */
> 2673 for (i = 0; i < IAVF_RESET_WAIT_COMPLETE_COUNT; i++) {
> 2674 /* sleep first to make sure a minimum wait time is met */
> 2675 msleep(IAVF_RESET_WAIT_MS);
> 2676
> 2677 reg_val = rd32(hw, IAVF_VFGEN_RSTAT) &
> 2678 IAVF_VFGEN_RSTAT_VFR_STATE_MASK;
> 2679 if (reg_val == VIRTCHNL_VFR_VFACTIVE)
> 2680 break;
> 2681 }
> 2682
> 2683 pci_set_master(adapter->pdev);
> 2684 pci_restore_msi_state(adapter->pdev);
> 2685
> 2686 if (i == IAVF_RESET_WAIT_COMPLETE_COUNT) {
> 2687 dev_err(&adapter->pdev->dev, "Reset never finished (%x)\n",
> 2688 reg_val);
> 2689 iavf_disable_vf(adapter);
>
> The proble is that iavf_disable_vf() calls mutex_unlock(&adapter->crit_lock);
>
> 2690 mutex_unlock(&adapter->client_lock);
> --> 2691 mutex_unlock(&adapter->crit_lock);
>
> so calling it again here is a bug.
>
> I feel like I owe you an apology on this one because I asked you to add
> the mutex_unlock() here via the kbuild-bot... It is confusing though.
> Does the unlock really need to be done inside iavf_disable_vf() are
> can we move that to the callers?
Thanks for the report. I talked to Slawomir about this and he is working
up a patch to fix this.
-Tony
>
> 2692 return; /* Do not attempt to reinit. It's dead, Jim. */
> 2693 }
> 2694
> 2695 continue_reset:
> 2696 /* We don't use netif_running() because it may be true prior to
> 2697 * ndo_open() returning, so we can't assume it means all our open
> 2698 * tasks have finished, since we're not holding the rtnl_lock here.
> 2699 */
>
> regards,
> dan carpenter
More information about the Intel-wired-lan
mailing list