mirror_ubuntu-kernels/drivers/net/ethernet/intel
Ying Hsu 004d25060c igb: Fix igb_down hung on surprise removal
In a setup where a Thunderbolt hub connects to Ethernet and a display
through USB Type-C, users may experience a hung task timeout when they
remove the cable between the PC and the Thunderbolt hub.
This is because the igb_down function is called multiple times when
the Thunderbolt hub is unplugged. For example, the igb_io_error_detected
triggers the first call, and the igb_remove triggers the second call.
The second call to igb_down will block at napi_synchronize.
Here's the call trace:
    __schedule+0x3b0/0xddb
    ? __mod_timer+0x164/0x5d3
    schedule+0x44/0xa8
    schedule_timeout+0xb2/0x2a4
    ? run_local_timers+0x4e/0x4e
    msleep+0x31/0x38
    igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4]
    __igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4]
    igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4]
    __dev_close_many+0x95/0xec
    dev_close_many+0x6e/0x103
    unregister_netdevice_many+0x105/0x5b1
    unregister_netdevice_queue+0xc2/0x10d
    unregister_netdev+0x1c/0x23
    igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4]
    pci_device_remove+0x3f/0x9c
    device_release_driver_internal+0xfe/0x1b4
    pci_stop_bus_device+0x5b/0x7f
    pci_stop_bus_device+0x30/0x7f
    pci_stop_bus_device+0x30/0x7f
    pci_stop_and_remove_bus_device+0x12/0x19
    pciehp_unconfigure_device+0x76/0xe9
    pciehp_disable_slot+0x6e/0x131
    pciehp_handle_presence_or_link_change+0x7a/0x3f7
    pciehp_ist+0xbe/0x194
    irq_thread_fn+0x22/0x4d
    ? irq_thread+0x1fd/0x1fd
    irq_thread+0x17b/0x1fd
    ? irq_forced_thread_fn+0x5f/0x5f
    kthread+0x142/0x153
    ? __irq_get_irqchip_state+0x46/0x46
    ? kthread_associate_blkcg+0x71/0x71
    ret_from_fork+0x1f/0x30

In this case, igb_io_error_detected detaches the network interface
and requests a PCIE slot reset, however, the PCIE reset callback is
not being invoked and thus the Ethernet connection breaks down.
As the PCIE error in this case is a non-fatal one, requesting a
slot reset can be avoided.
This patch fixes the task hung issue and preserves Ethernet
connection by ignoring non-fatal PCIE errors.

Signed-off-by: Ying Hsu <yinghsu@chromium.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230620174732.4145155-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-22 19:49:44 -07:00
..
e1000 e1000: Remove unnecessary use of kmap_atomic() 2022-11-02 11:09:13 -07:00
e1000e e1000e: Add @adapter description to kdoc 2023-05-18 09:11:32 -07:00
fm10k fm10k: Remove unnecessary aer.h include 2023-03-08 23:34:39 -08:00
i40e i40e, xsk: fix comment typo 2023-06-22 19:38:46 -07:00
iavf iavf: remove mask from iavf_irq_enable_queues() 2023-06-10 00:09:54 -07:00
ice ice: remove unnecessary check for old MAC == new MAC 2023-06-15 22:54:54 -07:00
igb igb: Fix igb_down hung on surprise removal 2023-06-22 19:49:44 -07:00
igbvf Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-03-24 10:10:20 -07:00
igc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-06-15 22:19:41 -07:00
ixgbe eth: ixgbe: fix the wake condition 2023-06-07 21:53:11 -07:00
ixgbevf drivers: net: turn on XDP features 2023-02-02 20:48:23 -08:00
e100.c e100: Fix possible use after free in e100_xmit_prepare 2022-11-23 08:38:22 -08:00
Kconfig ixgb: Remove ixgb driver 2023-03-19 10:51:07 +00:00
Makefile ixgb: Remove ixgb driver 2023-03-19 10:51:07 +00:00