mirror_ubuntu-kernels/drivers/net/ethernet/intel/ice
Ivan Vecera 5cb1ebdbc4 ice: Fix race condition during interface enslave
Commit 5dbbbd01cb ("ice: Avoid RTNL lock when re-creating
auxiliary device") changed how the aux device is re-created, so
ice_plug_aux_dev() is now called from ice_service_task() context.
This unfortunately opens a race window that can result in a deadlock
when an interface leaves a LAG and immediately enters a LAG again.

Reproducer:
```
#!/bin/bash

ip link add lag0 type bond mode 1 miimon 100
ip link set lag0

for n in {1..10}; do
        echo Cycle: $n
        ip link set ens7f0 master lag0
        sleep 1
        ip link set ens7f0 nomaster
done
```

This results in:
[20976.208697] Workqueue: ice ice_service_task [ice]
[20976.213422] Call Trace:
[20976.215871]  __schedule+0x2d1/0x830
[20976.219364]  schedule+0x35/0xa0
[20976.222510]  schedule_preempt_disabled+0xa/0x10
[20976.227043]  __mutex_lock.isra.7+0x310/0x420
[20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
[20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
[20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
[20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
[20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
[20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
[20976.285691]  auxiliary_bus_probe+0x45/0x70
[20976.289790]  really_probe+0x1f2/0x480
[20976.298509]  driver_probe_device+0x49/0xc0
[20976.302609]  bus_for_each_drv+0x79/0xc0
[20976.306448]  __device_attach+0xdc/0x160
[20976.310286]  bus_probe_device+0x9d/0xb0
[20976.314128]  device_add+0x43c/0x890
[20976.321287]  __auxiliary_device_add+0x43/0x60
[20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
[20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
[20976.342591]  process_one_work+0x1a7/0x360
[20976.350536]  worker_thread+0x30/0x390
[20976.358128]  kthread+0x10a/0x120
[20976.365547]  ret_from_fork+0x1f/0x40
...
[20976.438030] task:ip              state:D stack:    0 pid:213658 ppid:213627 flags:0x00004084
[20976.446469] Call Trace:
[20976.448921]  __schedule+0x2d1/0x830
[20976.452414]  schedule+0x35/0xa0
[20976.455559]  schedule_preempt_disabled+0xa/0x10
[20976.460090]  __mutex_lock.isra.7+0x310/0x420
[20976.464364]  device_del+0x36/0x3c0
[20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
[20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
[20976.477288]  notifier_call_chain+0x47/0x70
[20976.481386]  __netdev_upper_dev_link+0x18b/0x280
[20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
[20976.494475]  do_setlink+0x336/0xf50
[20976.502517]  __rtnl_newlink+0x529/0x8b0
[20976.543441]  rtnl_newlink+0x43/0x60
[20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
[20976.559238]  netlink_rcv_skb+0x4c/0x120
[20976.563079]  netlink_unicast+0x196/0x230
[20976.567005]  netlink_sendmsg+0x204/0x3d0
[20976.570930]  sock_sendmsg+0x4c/0x50
[20976.574423]  ____sys_sendmsg+0x1eb/0x250
[20976.586807]  ___sys_sendmsg+0x7c/0xc0
[20976.606353]  __sys_sendmsg+0x57/0xa0
[20976.609930]  do_syscall_64+0x5b/0x1a0
[20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca

1. Command 'ip link ... set nomaster' causes ice_plug_aux_dev() to be
   called from ice_service_task() context; the aux device is created
   and the associated device->lock is taken.
2. Command 'ip link ... set master...' calls ice's notifier under the
   RTNL lock, and that notifier calls ice_unplug_aux_dev(). That
   function tries to take the aux device->lock, but it is already held
   by ice_plug_aux_dev() from step 1.
3. Later, ice_plug_aux_dev() tries to take the RTNL lock, but it is
   already held by step 2.
4. Deadlock.
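
The lock-ordering cycle in the steps above can be sketched in plain
userspace C as a tiny wait-for graph. This is only an illustration, not
driver code: the actors, the lock structure and every sketch_* name are
hypothetical, and the two "locks" merely stand in for the aux
device->lock and the RTNL lock.

```c
#include <stdbool.h>

/* Hypothetical actors, named only for this sketch. */
enum sketch_actor { NOBODY = -1, SERVICE_TASK = 0, IP_CMD = 1, NUM_ACTORS = 2 };

/* A lock is modelled only by its current owner. */
struct sketch_lock { enum sketch_actor owner; };

/* waits_for[a] is the actor that currently blocks actor a, or NOBODY. */
static enum sketch_actor waits_for[NUM_ACTORS] = { NOBODY, NOBODY };

/* Take the lock if it is free; otherwise record who blocks the caller. */
static bool sketch_acquire(struct sketch_lock *lock, enum sketch_actor who)
{
    if (lock->owner == NOBODY) {
        lock->owner = who;
        return true;
    }
    waits_for[who] = lock->owner;
    return false;
}

/* Two actors are deadlocked when each one waits for the other. */
static bool sketch_deadlocked(enum sketch_actor a, enum sketch_actor b)
{
    return waits_for[a] == b && waits_for[b] == a;
}
```

Replaying the steps: SERVICE_TASK acquires the device lock, IP_CMD
acquires the RTNL stand-in and then fails to get the device lock, and
finally SERVICE_TASK fails to get the RTNL stand-in. Only after that
last call does sketch_deadlocked(SERVICE_TASK, IP_CMD) return true.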

The patch fixes this issue with the following changes:
- Bit ICE_FLAG_PLUG_AUX_DEV is kept set during the ice_plug_aux_dev()
  call in ice_service_task().
- The bit is checked in ice_clear_rdma_cap(), and ice_unplug_aux_dev()
  is called only if it is not set. If it is set (in other words,
  plugging of the aux device was requested and ice_plug_aux_dev() is
  potentially running), the function only clears the bit.
- Once the ice_plug_aux_dev() call (in ice_service_task()) has finished,
  the bit ICE_FLAG_PLUG_AUX_DEV is cleared, but it is also checked
  whether it was already cleared by ice_clear_rdma_cap(). If so, the
  aux device is unplugged.
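
The flag handshake described above can be sketched in userspace C11 as
follows. This is a minimal model of the decision logic only, not the
code from the patch: struct sketch_pf, PLUG_REQUESTED and all sketch_*
functions are hypothetical stand-ins, and atomic_fetch_and() is used
here to imitate an atomic "clear the bit and report its old value"
operation. The service-task side is split into two halves so that a
caller can interleave the notifier side between them.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ICE_FLAG_PLUG_AUX_DEV bit. */
#define PLUG_REQUESTED 1u

struct sketch_pf {
    atomic_uint flags;  /* bit 0: plug requested, possibly in progress */
    bool aux_plugged;   /* models whether the aux device exists */
    int unplug_calls;   /* counts unplug operations for inspection */
};

/* Atomically clear the bit and report whether it was set before. */
static bool sketch_test_and_clear(struct sketch_pf *pf)
{
    return atomic_fetch_and(&pf->flags, ~PLUG_REQUESTED) & PLUG_REQUESTED;
}

static void sketch_plug(struct sketch_pf *pf)
{
    pf->aux_plugged = true;
}

static void sketch_unplug(struct sketch_pf *pf)
{
    pf->aux_plugged = false;
    pf->unplug_calls++;
}

/* Notifier side: unplug directly only if no plug is pending or running;
 * otherwise just clear the bit and leave the unplug to the service task. */
static void sketch_clear_rdma_cap(struct sketch_pf *pf)
{
    if (!sketch_test_and_clear(pf))
        sketch_unplug(pf);
}

/* Service task, first half: plug while the bit stays set. */
static bool sketch_service_begin(struct sketch_pf *pf)
{
    if (!(atomic_load(&pf->flags) & PLUG_REQUESTED))
        return false;
    sketch_plug(pf);
    return true;
}

/* Service task, second half: clear the bit; if the notifier side already
 * cleared it in the meantime, honour that request and unplug now. */
static void sketch_service_end(struct sketch_pf *pf)
{
    if (!sketch_test_and_clear(pf))
        sketch_unplug(pf);
}
```

With the bit set, calling sketch_service_begin(), then
sketch_clear_rdma_cap(), then sketch_service_end() leaves the modelled
device unplugged after exactly one unplug, performed by the service
side. Without the interleaved call, the device stays plugged until a
later sketch_clear_rdma_cap() unplugs it directly.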

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Co-developed-by: Petr Oros <poros@redhat.com>
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Dave Ertman <david.m.ertman@intel.com>
Link: https://lore.kernel.org/r/20220310171641.3863659-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 15:07:46 -08:00
ice_adminq_cmd.h ice: support immediate firmware activation via devlink reload 2021-12-15 08:40:38 -08:00
ice_arfs.c ice: make use of ice_for_each_* macros 2021-10-15 07:39:03 -07:00
ice_arfs.h ice: use static inline for dummy functions 2021-06-07 08:59:01 -07:00
ice_base.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-12-23 16:09:58 -08:00
ice_base.h ice: split ice_ring onto Tx/Rx separate structs 2021-10-15 07:39:02 -07:00
ice_cgu_regs.h ice: ensure the hardware Clock Generation Unit is configured 2021-12-21 09:11:40 -08:00
ice_common.c ice: initialize local variable 'tlv' 2022-02-18 13:28:39 -08:00
ice_common.h ice: Use int for ice_status 2021-12-14 10:19:13 -08:00
ice_controlq.c ice: Cleanup after ice_status removal 2021-12-14 10:19:13 -08:00
ice_controlq.h ice: add support for sideband messages 2021-06-11 07:38:00 -07:00
ice_dcb_lib.c ice: Use int for ice_status 2021-12-14 10:19:13 -08:00
ice_dcb_lib.h ice: Add infrastructure for mqprio support via ndo_setup_tc 2021-10-20 15:57:54 -07:00
ice_dcb_nl.c ice: Fix problems with DSCP QoS implementation 2021-12-07 13:21:01 -08:00
ice_dcb_nl.h ice: use static inline for dummy functions 2021-06-07 08:59:01 -07:00
ice_dcb.c ice: Cleanup after ice_status removal 2021-12-14 10:19:13 -08:00
ice_dcb.h ice: Cleanup after ice_status removal 2021-12-14 10:19:13 -08:00
ice_devids.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-10-22 11:41:16 +01:00
ice_devlink.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-12-31 14:35:40 +00:00
ice_devlink.h net/ice: Remove unused enum 2021-11-30 08:02:12 -08:00
ice_eswitch.c ice: Match on all profiles in slow-path 2022-02-18 13:22:06 -08:00
ice_eswitch.h ice: improve switchdev's slow-path 2022-01-06 10:15:09 -08:00
ice_ethtool_fdir.c ice: Add flow director support for channel mode 2021-12-30 13:16:07 +00:00
ice_ethtool.c ice: Fix curr_link_speed advertised speed 2022-03-08 13:31:09 -08:00
ice_fdir.c ice: Cleanup after ice_status removal 2021-12-14 10:19:13 -08:00
ice_fdir.h ice: Add flow director support for channel mode 2021-12-30 13:16:07 +00:00
ice_flex_pipe.c ice: Optimize a few bitmap operations 2022-01-06 10:15:25 -08:00
ice_flex_pipe.h ice: Cleanup after ice_status removal 2021-12-14 10:19:13 -08:00
ice_flex_type.h ice: refactor PTYPE validating 2021-12-14 08:06:47 -08:00
ice_flow.c ice: Add flow director support for channel mode 2021-12-30 13:16:07 +00:00
ice_flow.h ice: Add flow director support for channel mode 2021-12-30 13:16:07 +00:00
ice_fltr.c ice: improve switchdev's slow-path 2022-01-06 10:15:09 -08:00
ice_fltr.h ice: improve switchdev's slow-path 2022-01-06 10:15:09 -08:00
ice_fw_update.c ice: support immediate firmware activation via devlink reload 2021-12-15 08:40:38 -08:00
ice_fw_update.h ice: support immediate firmware activation via devlink reload 2021-12-15 08:40:38 -08:00
ice_hw_autogen.h ice: support crosstimestamping on E822 devices if supported 2021-12-21 09:11:40 -08:00
ice_idc_int.h ice: Implement iidc operations 2021-05-28 20:11:13 -07:00
ice_idc.c net/ice: Add support for enable_iwarp and enable_roce devlink param 2021-11-22 08:41:56 -08:00
ice_lag.c ice: Fix KASAN error in LAG NETDEV_UNREGISTER handler 2022-02-10 08:47:26 -08:00
ice_lag.h ice: Add initial support framework for LAG 2021-02-08 16:27:01 -08:00
ice_lan_tx_rx.h ice: fix IPIP and SIT TSO offload 2022-02-10 08:47:21 -08:00
ice_lib.c ice: enable parsing IPSEC SPI headers for RSS 2022-02-14 11:22:35 +00:00
ice_lib.h ice: Cleanup after ice_status removal 2021-12-14 10:19:13 -08:00
ice_main.c ice: Fix race condition during interface enslave 2022-03-10 15:07:46 -08:00
ice_nvm.c net: fixup build after bpf header changes 2022-01-04 12:34:19 +00:00
ice_nvm.h ice: support immediate firmware activation via devlink reload 2021-12-15 08:40:38 -08:00
ice_osdep.h
ice_protocol_type.h ice: Match on all profiles in slow-path 2022-02-18 13:22:06 -08:00
ice_ptp_consts.h ice: ensure the hardware Clock Generation Unit is configured 2021-12-21 09:11:40 -08:00
ice_ptp_hw.c ice: exit bypass mode once hardware finishes timestamp calibration 2021-12-21 09:11:40 -08:00
ice_ptp_hw.h ice: exit bypass mode once hardware finishes timestamp calibration 2021-12-21 09:11:40 -08:00
ice_ptp.c ice: check the return of ice_ptp_gettimex64 2022-02-18 13:28:39 -08:00
ice_ptp.h ice: exit bypass mode once hardware finishes timestamp calibration 2021-12-21 09:11:40 -08:00
ice_repr.c ice: improve switchdev's slow-path 2022-01-06 10:15:09 -08:00
ice_repr.h ice: improve switchdev's slow-path 2022-01-06 10:15:09 -08:00
ice_sbq_cmd.h ice: add support for sideband messages 2021-06-11 07:38:00 -07:00
ice_sched.c ice: Remove unnecessary casts 2021-12-14 10:19:14 -08:00
ice_sched.h ice: Cleanup after ice_status removal 2021-12-14 10:19:13 -08:00
ice_sriov.c ice: Remove enum ice_status 2021-12-14 10:19:13 -08:00
ice_sriov.h ice: Use int for ice_status 2021-12-14 10:19:13 -08:00
ice_switch.c ice: Match on all profiles in slow-path 2022-02-18 13:22:06 -08:00
ice_switch.h ice: Cleanup after ice_status removal 2021-12-14 10:19:13 -08:00
ice_tc_lib.c ice: fix setting l4 port flag when adding filter 2022-02-18 13:28:18 -08:00
ice_tc_lib.h ice: VXLAN and Geneve TC support 2021-10-28 11:00:18 -07:00
ice_trace.h ice: split ice_ring onto Tx/Rx separate structs 2021-10-15 07:39:02 -07:00
ice_txrx_lib.c ice: improve switchdev's slow-path 2022-01-06 10:15:09 -08:00
ice_txrx_lib.h ice: propagate xdp_ring onto rx_ring 2021-10-15 07:39:03 -07:00
ice_txrx.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-12-31 14:35:40 +00:00
ice_txrx.h ice: xsk: fix cleaned_count setting 2021-12-17 11:18:21 -08:00
ice_type.h ice: Add flow director support for channel mode 2021-12-30 13:16:07 +00:00
ice_virtchnl_allowlist.c ice: Enable RSS configure for AVF 2021-04-22 09:26:22 -07:00
ice_virtchnl_allowlist.h ice: Allow ignoring opcodes on specific VF 2021-04-22 09:26:22 -07:00
ice_virtchnl_fdir.c ice: Remove excess error variables 2021-12-14 10:19:13 -08:00
ice_virtchnl_fdir.h ice: Check FDIR program status for AVF 2021-03-22 11:32:12 -07:00
ice_virtchnl_pf.c ice: stop disabling VFs due to PF error responses 2022-03-08 13:31:08 -08:00
ice_virtchnl_pf.h ice: stop disabling VFs due to PF error responses 2022-03-08 13:31:08 -08:00
ice_xsk.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-12-31 14:35:40 +00:00
ice_xsk.h ice: split ice_ring onto Tx/Rx separate structs 2021-10-15 07:39:02 -07:00
ice.h ice: Fix race condition during interface enslave 2022-03-10 15:07:46 -08:00
Makefile ice: ndo_setup_tc implementation for PF 2021-10-11 09:03:04 -07:00