linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-09-11 02:23:32 +00:00

Author	SHA1	Message	Date
Jakub Kicinski	89695196f0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Merge in overtime fixes, no conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-23 10:53:49 -07:00
Alexander Lobakin	5a3156932d	ice: don't allow to run ice_send_event_to_aux() in atomic ctx ice_send_event_to_aux() eventually descends to mutex_lock() (-> might_sched()), so it must not be called under non-task context. However, at least two fixes have happened already for the bug splats occurred due to this function being called from atomic context. To make the emergency landings softer, bail out early when executed in non-task context emitting a warn splat only once. This way we trade some events being potentially lost for system stability and avoid any related hangs and crashes. Fixes: `348048e724` ("ice: Implement iidc operations") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-23 10:40:41 -07:00
Alexander Lobakin	32d53c0aa3	ice: fix 'scheduling while atomic' on aux critical err interrupt There's a kernel BUG splat on processing aux critical error interrupts in ice_misc_intr(): [ 2100.917085] BUG: scheduling while atomic: swapper/15/0/0x00010000 ... [ 2101.060770] Call Trace: [ 2101.063229] <IRQ> [ 2101.065252] dump_stack+0x41/0x60 [ 2101.068587] __schedule_bug.cold.100+0x4c/0x58 [ 2101.073060] __schedule+0x6a4/0x830 [ 2101.076570] schedule+0x35/0xa0 [ 2101.079727] schedule_preempt_disabled+0xa/0x10 [ 2101.084284] __mutex_lock.isra.7+0x310/0x420 [ 2101.088580] ? ice_misc_intr+0x201/0x2e0 [ice] [ 2101.093078] ice_send_event_to_aux+0x25/0x70 [ice] [ 2101.097921] ice_misc_intr+0x220/0x2e0 [ice] [ 2101.102232] __handle_irq_event_percpu+0x40/0x180 [ 2101.106965] handle_irq_event_percpu+0x30/0x80 [ 2101.111434] handle_irq_event+0x36/0x53 [ 2101.115292] handle_edge_irq+0x82/0x190 [ 2101.119148] handle_irq+0x1c/0x30 [ 2101.122480] do_IRQ+0x49/0xd0 [ 2101.125465] common_interrupt+0xf/0xf [ 2101.129146] </IRQ> ... As Andrew correctly mentioned previously[0], the following call ladder happens: ice_misc_intr() <- hardirq ice_send_event_to_aux() device_lock() mutex_lock() might_sleep() might_resched() <- oops Add a new PF state bit which indicates that an aux critical error occurred and serve it in ice_service_task() in process context. The new ice_pf::oicr_err_reg is read-write in both hardirq and process contexts, but only 3 bits of non-critical data probably aren't worth explicit synchronizing (and they're even in the same byte [31:24]). [0] https://lore.kernel.org/all/YeSRUVmrdmlUXHDn@lunn.ch Fixes: `348048e724` ("ice: Implement iidc operations") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Michal Kubiak <michal.kubiak@intel.com> Acked-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-23 10:40:40 -07:00
Jakub Kicinski	fad6c1f1a1	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2022-03-16 This series contains updates to gtp and ice driver. Wojciech fixes smatch reported inconsistent indenting for gtp and ice. Yang Yingliang fixes a couple of return value checks for GNSS to IS_PTR instead of null. Jacob adds support for trace events on tx timestamps. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: add trace events for tx timestamps ice: fix return value check in ice_gnss.c ice: Fix inconsistent indenting in ice_switch gtp: Fix inconsistent indenting ==================== Link: https://lore.kernel.org/r/20220316204024.3201500-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-17 16:40:32 -07:00
Jakub Kicinski	e243f39685	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-17 13:56:58 -07:00
Jacob Keller	4c1202189e	ice: add trace events for tx timestamps We've previously run into many issues related to the latency of a Tx timestamp completion with the ice hardware. It can be difficult to determine the root cause of a slow Tx timestamp. To aid in this, introduce new trace events which capture timing data about when the driver reaches certain points while processing a transmit timestamp * ice_tx_tstamp_request: Trace when the stack initiates a new timestamp request. * ice_tx_tstamp_fw_req: Trace when the driver begins a read of the timestamp register in the work thread. * ice_tx_tstamp_fw_done: Trace when the driver finishes reading a timestamp register in the work thread. * ice_tx_tstamp_complete: Trace when the driver submits the skb back to the stack with a completed Tx timestamp. These trace events can be enabled using the standard trace event subsystem exposed by the ice driver. If they are disabled, they become no-ops with no run time cost. The following is a simple GNU AWK script which can highlight one potential way to use the trace events to capture latency data from the trace buffer about how long the driver takes to process a timestamp: ----- BEGIN { PREC=256 } # Detect requests /tx_tstamp_request/ { time=strtonum($4) skb=$7 # Store the time of request for this skb requests[skb] = time printf("skb %s: idx %d at %.6f\n", skb, idx, time) } # Detect completions /tx_tstamp_complete/ { time=strtonum($4) skb=$7 idx=$9 if (skb in requests) { latency = (time - requests[skb]) * 1000 printf("skb %s: %.3f to complete\n", skb, latency) if (latency > 4) { printf(">>> HIGH LATENCY <<<\n") } printf("\n") } else { printf("!!! skb %s (idx %d) at %.6f\n", skb, idx, time) } } ----- Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-16 10:38:15 -07:00
Yang Yingliang	2b1d0a242a	ice: fix return value check in ice_gnss.c kthread_create_worker() and tty_alloc_driver() return ERR_PTR() and never return NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: `43113ff734` ("ice: add TTY for GNSS module for E810T device") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-16 10:38:15 -07:00
Wojciech Drewek	2bcd5b9f35	ice: Fix inconsistent indenting in ice_switch Fix the following warning as reported by smatch: smatch warnings: drivers/net/ethernet/intel/ice/ice_switch.c:5568 ice_find_dummy_packet() warn: inconsistent indenting Fixes: `9a225f81f5` ("ice: Support GTP-U and GTP-C offload in switchdev") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-16 10:26:43 -07:00
Sudheer Mogilappagari	1b4ae7d925	ice: destroy flow director filter mutex after releasing VSIs Currently fdir_fltr_lock is accessed in ice_vsi_release_all() function after it is destroyed. Instead destroy mutex after ice_vsi_release_all. Fixes: `40319796b7` ("ice: Add flow director support for channel mode") Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:36:13 -07:00
Maciej Fijalkowski	f153546913	ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats() It is possible to do NULL pointer dereference in routine that updates Tx ring stats. Currently only stats and bytes are updated when ring pointer is valid, but later on ring is accessed to propagate gathered Tx stats onto VSI stats. Change the existing logic to move to next ring when ring is NULL. Fixes: `e72bba2135` ("ice: split ice_ring onto Tx/Rx separate structs") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:36:13 -07:00
Jacob Keller	5a57ee83d9	ice: remove PF pointer from ice_check_vf_init The ice_check_vf_init function takes both a PF and a VF pointer. Every caller looks up the PF pointer from the VF structure. Some callers only use of the PF pointer is call this function. Move the lookup inside ice_check_vf_init and drop the unnecessary argument. Cleanup the callers to drop the now unnecessary local variables. In particular, replace the local PF pointer with a HW structure pointer in ice_vc_get_vf_res_msg which simplifies a few accesses to the HW structure in that function. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:23:14 -07:00
Jacob Keller	bf93bf791c	ice: introduce ice_virtchnl.c and ice_virtchnl.h Just as we moved the generic virtualization library logic into ice_vf_lib.c, move the virtchnl message handling into ice_virtchnl.c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:23:10 -07:00
Jacob Keller	8cf52bec5c	ice: cleanup long lines in ice_sriov.c Before we move the virtchnl message handling from ice_sriov.c into ice_virtchnl.c, cleanup some long line warnings to avoid checkpatch.pl complaints. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:23:06 -07:00
Jacob Keller	f5f085c01d	ice: introduce ICE_VF_RESET_LOCK flag The ice_reset_vf function performs actions which must be taken only while holding the VF configuration lock. Some flows already acquired the lock, while other flows must acquire it just for the reset function. Add the ICE_VF_RESET_LOCK flag to the function so that it can handle taking and releasing the lock instead at the appropriate scope. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:23:02 -07:00
Jacob Keller	9dbb33da12	ice: introduce ICE_VF_RESET_NOTIFY flag In some cases of resetting a VF, the PF would like to first notify the VF that a reset is impending. This is currently done via ice_vc_notify_vf_reset. A wrapper to ice_reset_vf, ice_vf_reset_vf, is used to call this function first before calling ice_reset_vf. In fact, every single call to ice_vc_notify_vf_reset occurs just prior to a call to ice_vc_reset_vf. Now that ice_reset_vf has flags, replace this separate call with an ICE_VF_RESET_NOTIFY flag. This removes an unnecessary exported function of ice_vc_notify_vf_reset, and also makes there be a single function to reset VFs (ice_reset_vf). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:22:57 -07:00
Jacob Keller	7eb517e434	ice: convert ice_reset_vf to take flags The ice_reset_vf function takes a boolean parameter which indicates whether or not the reset is due to a VFLR event. This is somewhat confusing to read because readers must interpret what "true" and "false" mean when seeing a line of code like "ice_reset_vf(vf, false)". We will want to add another toggle to the ice_reset_vf in a following change. To avoid proliferating many arguments, convert this function to take flags instead. ICE_VF_RESET_VFLR will indicate if this is a VFLR reset. A value of 0 indicates no flags. One could argue that "ice_reset_vf(vf, 0)" is no more readable than "ice_reset_vf(vf, false)".. However, this type of flags interface is somewhat common and using 0 to mean "no flags" makes sense in this context. We could bother to add a define for "ICE_VF_RESET_PLAIN" or something similar, but this can be confusing since its not an actual bit flag. This paves the way to add another flag to the function in a following change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:22:54 -07:00
Jacob Keller	4fe193cc9d	ice: convert ice_reset_vf to standard error codes The ice_reset_vf function returns a boolean value indicating whether or not the VF reset. This is a bit confusing since it means that callers need to know how to interpret the return value when needing to indicate an error. Refactor the function and call sites to report a regular error code. We still report success (i.e. return 0) in cases where the reset is in progress or is disabled. Existing callers don't care because they do not check the return value. We keep the error code anyways instead of a void return because we expect future code which may care about or at least report the error value. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:22:49 -07:00
Jacob Keller	fe99d1c06c	ice: make ice_reset_all_vfs void The ice_reset_all_vfs function returns true if any VFs were reset, and false otherwise. However, no callers check the return value. Drop this return value and make the function void since the callers do not care about this. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:22:44 -07:00
Jacob Keller	dac5728875	ice: drop is_vflr parameter from ice_reset_all_vfs The ice_reset_all_vfs function takes a parameter to handle whether its operating after a VFLR event or not. This is not necessary as every caller always passes true. Simplify the interface by removing the parameter. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:22:39 -07:00
Jacob Keller	16686d7fbb	ice: move reset functionality into ice_vf_lib.c Now that the reset functions do not rely on Single Root specific behavior, move the ice_reset_vf, ice_reset_all_vfs, and ice_vf_rebuild_host_cfg functions and their dependent helper functions out of ice_sriov.c and into ice_vf_lib.c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:22:34 -07:00
Jacob Keller	5de95744ff	ice: fix a long line warning in ice_reset_vf We're about to move ice_reset_vf out of ice_sriov.c and into ice_vf_lib.c One of the dev_err statements has a checkpatch.pl violation due to putting the vf->vf_id on the same line as the dev_err. Fix this style issue first before moving the code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:22:30 -07:00
Jacob Keller	9c6f787897	ice: introduce VF operations structure for reset flows The ice driver currently supports virtualization using Single Root IOV, with code in the ice_sriov.c file. In the future, we plan to also implement support for Scalable IOV, which uses slightly different hardware implementations for some functionality. To eventually allow this, we introduce a new ice_vf_ops structure which will contain the basic operations that are different between the two IOV implementations. This primarily includes logic for how to handle the VF reset registers, as well as what to do before and after rebuilding the VF's VSI. Implement these ops structures and call the ops table instead of directly calling the SR-IOV specific function. This will allow us to easily add the Scalable IOV implementation in the future. Additionally, it helps separate the generalized VF logic from SR-IOV specifics. This change allows us to move the reset logic out of ice_sriov.c and into ice_vf_lib.c without placing any Single Root specific details into the generic file. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:22:25 -07:00
Jacob Keller	f5840e0da6	ice: fix incorrect dev_dbg print mistaking 'i' for vf->vf_id If we fail to clear the malicious VF indication after a VF reset, the dev_dbg message which is printed uses the local variable 'i' when it meant to use vf->vf_id. Fix this. Fixes: `0891c89674` ("ice: warn about potentially malicious VFs") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:22:19 -07:00
Jacob Keller	109aba47ca	ice: introduce ice_vf_lib.c, ice_vf_lib.h, and ice_vf_lib_private.h Introduce the ice_vf_lib.c file along with the ice_vf_lib.h and ice_vf_lib_private.h header files. These files will house the generic VF structures and access functions. Move struct ice_vf and its dependent definitions into this new header file. The ice_vf_lib.c is compiled conditionally on CONFIG_PCI_IOV. Some of its functionality is required by all driver files. However, some of its functionality will only be required by other files also conditionally compiled based on CONFIG_PCI_IOV. Declaring these functions used only in CONFIG_PCI_IOV files in ice_vf_lib.h is verbose. This is because we must provide a fallback implementation for each function in this header since it is included in files which may not be compiled with CONFIG_PCI_IOV. Instead, introduce a new ice_vf_lib_private.h header which verifies that CONFIG_PCI_IOV is enabled. This header is intended to be directly included in .c files which are CONFIG_PCI_IOV only. Add a #error indication that will complain if the file ever gets included by another C file on a kernel with CONFIG_PCI_IOV disabled. Add a comment indicating the nature of the file and why it is useful. This makes it so that we can easily define functions exposed from ice_vf_lib.c into other virtualization files without needing to add fallback implementations for every single function. This begins the path to separate out generic code which will be reused by other virtualization implementations from ice_sriov.h and ice_sriov.c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-15 13:22:13 -07:00
Jacob Keller	1261691dda	ice: use ice_is_vf_trusted helper function The ice_vc_cfg_promiscuous_mode_msg function directly checks ICE_VIRTCHNL_VF_CAP_PRIVILEGE, instead of using the existing helper function ice_is_vf_trusted. Switch this to use the helper function so that all trusted checks are consistent. This aids in any potential future refactor by ensuring consistent code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:59 -07:00
Jacob Keller	2b36944810	ice: log an error message when eswitch fails to configure When ice_eswitch_configure fails, print an error message to make it more obvious why VF initialization did not succeed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:59 -07:00
Jacob Keller	94ab2488d4	ice: cleanup error logging for ice_ena_vfs The ice_ena_vfs function and some of its sub-functions like ice_set_per_vf_res use a "if (<function>) { <print error> ; <exit> }" flow. This flow discards specialized errors reported by the called function. This style is generally not preferred as it makes tracing error sources more difficult. It also means we cannot log the actual error received properly. Refactor several calls in the ice_ena_vfs function that do this to catch the error in the 'ret' variable. Report this in the messages, and then return the more precise error value. Doing this reveals that ice_set_per_vf_res returns -EINVAL or -EIO in places where -ENOSPC makes more sense. Fix these calls up to return the more appropriate value. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:59 -07:00
Jacob Keller	346f7aa3c7	ice: move ice_set_vf_port_vlan near other .ndo ops The ice_set_vf_port_vlan function is located in ice_sriov.c very far away from the other .ndo operations that it is similar to. Move this so that its located near the other .ndo operation definitions. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:59 -07:00
Jacob Keller	a8ea6d86bd	ice: refactor spoofchk control code in ice_sriov.c The API to control the VSI spoof checking for a VF VSI has three functions: enable, disable, and set. The set function takes the VSI and the VF and decides whether to call enable or disable based on the vf->spoofchk field. In some flows, vf->spoofchk is not yet set, such as the function used to control the setting for a VF. (vf->spoofchk is only updated after a success). Simplify this API by refactoring ice_vf_set_spoofchk_cfg to be "ice_vsi_apply_spoofchk" which takes the boolean and allows all callers to avoid having to determine whether to call enable or disable themselves. This matches the expected callers better, and will prevent the need to export more than one function when this code must be called from another file. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:59 -07:00
Jacob Keller	dc36796ead	ice: rename ICE_MAX_VF_COUNT to avoid confusion The ICE_MAX_VF_COUNT field is defined in ice_sriov.h. This count is true for SR-IOV but will not be true for all VF implementations, such as when the ice driver supports Scalable IOV. Rename this definition to clearly indicate ICE_MAX_SRIOV_VFS. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:59 -07:00
Jacob Keller	00a57e2959	ice: remove unused definitions from ice_sriov.h A few more macros exist in ice_sriov.h which are not used anywhere. These can be safely removed. Note that ICE_VIRTCHNL_VF_CAP_L2 capability is set but never checked anywhere in the driver. Thus it is also safe to remove. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:58 -07:00
Jacob Keller	a7e117109a	ice: convert vf->vc_ops to a const pointer The vc_ops structure is used to allow different handlers for virtchnl commands when the driver is in representor mode. The current implementation uses a copy of the ops table in each VF, and modifies this copy dynamically. The usual practice in kernel code is to store the ops table in a constant structure and point to different versions. This has a number of advantages: 1. Reduced memory usage. Each VF merely points to the correct table, so they're able to re-use the same constant lookup table in memory. 2. Consistency. It becomes more difficult to accidentally update or edit only one op call. Instead, the code switches to the correct able by a single pointer write. In general this is atomic, either the pointer is updated or its not. 3. Code Layout. The VF structure can store a pointer to the table without needing to have the full structure definition defined prior to the VF structure definition. This will aid in future refactoring of code by allowing the VF pointer to be kept in ice_vf_lib.h while the virtchnl ops table can be maintained in ice_virtchnl.h There is one major downside in the case of the vc_ops structure. Most of the operations in the table are the same between the two current implementations. This can appear to lead to duplication since each implementation must now fill in the complete table. It could make spotting the differences in the representor mode more challenging. Unfortunately, methods to make this less error prone either add complexity overhead (macros using CPP token concatenation) or don't work on all compilers we support (constant initializer from another constant structure). The cost of maintaining two structures does not out weigh the benefits of the constant table model. While we're making these changes, go ahead and rename the structure and implementations with "virtchnl" instead of "vc_vf_". This will more closely align with the planned file renaming, and avoid similar names when we later introduce a "vf ops" table for separating Scalable IOV and Single Root IOV implementations. Leave the accessor/assignment functions in order to avoid issues with compiling with options disabled. The interface makes it easier to handle when CONFIG_PCI_IOV is disabled in the kernel. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:58 -07:00
Jacob Keller	649c87c6ff	ice: remove circular header dependencies on ice.h Several headers in the ice driver include ice.h even though they are themselves included by that header. The most notable of these is ice_common.h, but several other headers also do this. Such a recursive inclusion is problematic as it forces headers to be included in a strict order, otherwise compilation errors can result. The circular inclusions do not trigger an endless loop due to standard header inclusion guards, however other errors can occur. For example, ice_flow.h defines ice_rss_hash_cfg, which is used by ice_sriov.h as part of the definition of ice_vf_hash_ip_ctx. ice_flow.h includes ice_acl.h, which includes ice_common.h, and which finally includes ice.h. Since ice.h itself includes ice_sriov.h, this creates a circular dependency. The definition in ice_sriov.h requires things from ice_flow.h, but ice_flow.h itself will lead to trying to load ice_sriov.h as part of its process for expanding ice.h. The current code avoids this issue by having an implicit dependency without the include of ice_flow.h. If we were to fix that so that ice_sriov.h explicitly depends on ice_flow.h the following pattern would occur: ice_flow.h -> ice_acl.h -> ice_common.h -> ice.h -> ice_sriov.h At this point, during the expansion of, the header guard for ice_flow.h is already set, so when ice_sriov.h attempts to load the ice_flow.h header it is skipped. Then, we go on to begin including the rest of ice_sriov.h, including structure definitions which depend on ice_rss_hash_cfg. This produces a compiler warning because ice_rss_hash_cfg hasn't yet been included. Remember, we're just at the start of ice_flow.h! If the order of headers is incorrect (ice_flow.h is not implicitly loaded first in all files which include ice_sriov.h) then we get the same failure. Removing this recursive inclusion requires fixing a few cases where some headers depended on the header inclusions from ice.h. In addition, a few other changes are also required. Most notably, ice_hw_to_dev is implemented as a macro in ice_osdep.h, which is the likely reason that ice_common.h includes ice.h at all. This macro implementation requires the full definition of ice_pf in order to properly compile. Fix this by moving it to a function declared in ice_main.c, so that we do not require all files to depend on the layout of the ice_pf structure. Note that this change only fixes circular dependencies, but it does not fully resolve all implicit dependencies where one header may depend on the inclusion of another. I tried to fix as many of the implicit dependencies as I noticed, but fixing them all requires a somewhat tedious analysis of each header and attempting to compile it separately. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:58 -07:00
Jacob Keller	0deb0bf70c	ice: rename ice_virtchnl_pf.c to ice_sriov.c The ice_virtchnl_pf.c and ice_virtchnl_pf.h files are where most of the code for implementing Single Root IOV virtualization resides. This code includes support for bringing up and tearing down VFs, hooks into the kernel SR-IOV netdev operations, and for handling virtchnl messages from VFs. In the future, we plan to support Scalable IOV in addition to Single Root IOV as an alternative virtualization scheme. This implementation will re-use some but not all of the code in ice_virtchnl_pf.c To prepare for this future, we want to refactor and split up the code in ice_virtchnl_pf.c into the following scheme: * ice_vf_lib.[ch] Basic VF structures and accessors. This is where scheme-independent code will reside. * ice_virtchnl.[ch] Virtchnl message handling. This is where the bulk of the logic for processing messages from VFs using the virtchnl messaging scheme will reside. This is separated from ice_vf_lib.c because it is distinct and has a bulk of the processing code. * ice_sriov.[ch] Single Root IOV implementation, including initialization and the routines for interacting with SR-IOV based netdev operations. * (future) ice_siov.[ch] Scalable IOV implementation. As a first step, lets assume that all of the code in ice_virtchnl_pf.[ch] is for Single Root IOV. Rename this file to ice_sriov.c and its header to ice_sriov.h Future changes will further split out the code in these files following the plan outlined here. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:58 -07:00
Jacob Keller	d775155a86	ice: rename ice_sriov.c to ice_vf_mbx.c The ice_sriov.c file primarily contains code which handles the logic for mailbox overflow detection and some other utility functions related to the virtualization mailbox. The bulk of the SR-IOV implementation is actually found in ice_virtchnl_pf.c, and this file isn't strictly SR-IOV specific. In the future, the ice driver will support an additional virtualization scheme known as Scalable IOV, and the code in this file will be used for this alternative implementation. Rename this file (and its associated header) to ice_vf_mbx.c, so that we can later re-use the ice_sriov.c file as the SR-IOV specific file. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-14 17:22:58 -07:00
Marcin Szycik	9a225f81f5	ice: Support GTP-U and GTP-C offload in switchdev Add support for creating filters for GTP-U and GTP-C in switchdev mode. Add support for parsing GTP-specific options (QFI and PDU type) and TEID. By default, a filter for GTP-U will be added. To add a filter for GTP-C, specify enc_dst_port = 2123, e.g.: tc filter add dev $GTP0 ingress prio 1 flower enc_key_id 1337 \ enc_dst_port 2123 action mirred egress redirect dev $VF1_PR Note: GTP-U with outer IPv6 offload is not supported yet. Note: GTP-U with no payload offload is not supported yet. Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-11 08:28:28 -08:00
Michal Swiatkowski	e5dd661b8b	ice: Fix FV offset searching Checking only protocol ids while searching for correct FVs can lead to a situation, when incorrect FV will be added to the list. Incorrect means that FV has correct protocol id but incorrect offset. Call ice_get_sw_fv_list with ice_prot_lkup_ext struct which contains all protocol ids with offsets. With this modification allocating and collecting protocol ids list is not longer needed. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-11 08:28:27 -08:00
Jakub Kicinski	4c7d2e1795	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2022-03-09 This series contains updates to ice driver only. Martyna implements switchdev filtering on inner EtherType field for tunnels. Marcin adds reporting of slowpath statistics for port representors. Jonathan Toppins changes a non-fatal link error message from warning to debug. Maciej removes unnecessary checks in ice_clean_tx_irq(). Amritha adds support for ADQ to match outer destination MAC for tunnels. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: Add support for outer dest MAC for ADQ tunnels ice: avoid XDP checks in ice_clean_tx_irq() ice: change "can't set link" message to dbg level ice: Add slow path offload stats on port representor in switchdev ice: Add support for inner etype in switchdev ==================== Link: https://lore.kernel.org/r/20220309190315.1380414-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-10 20:20:13 -08:00
Jakub Kicinski	1e8a3f0d2a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net net/dsa/dsa2.c commit `afb3cc1a39` ("net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails") commit `e83d565378` ("net: dsa: replay master state events in dsa_tree_{setup,teardown}_master") https://lore.kernel.org/all/20220307101436.7ae87da0@canb.auug.org.au/ drivers/net/ethernet/intel/ice/ice.h commit `97b0129146` ("ice: Fix error with handling of bonding MTU") commit `43113ff734` ("ice: add TTY for GNSS module for E810T device") https://lore.kernel.org/all/20220310112843.3233bcf1@canb.auug.org.au/ drivers/staging/gdm724x/gdm_lte.c commit `fc7f750dc9` ("staging: gdm724x: fix use after free in gdm_lte_rx()") commit `4bcc4249b4` ("staging: Use netif_rx().") https://lore.kernel.org/all/20220308111043.1018a59d@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-10 17:16:56 -08:00
Ivan Vecera	5cb1ebdbc4	ice: Fix race condition during interface enslave Commit `5dbbbd01cb` ("ice: Avoid RTNL lock when re-creating auxiliary device") changes a process of re-creation of aux device so ice_plug_aux_dev() is called from ice_service_task() context. This unfortunately opens a race window that can result in dead-lock when interface has left LAG and immediately enters LAG again. Reproducer: ``` #!/bin/sh ip link add lag0 type bond mode 1 miimon 100 ip link set lag0 for n in {1..10}; do echo Cycle: $n ip link set ens7f0 master lag0 sleep 1 ip link set ens7f0 nomaster done ``` This results in: [20976.208697] Workqueue: ice ice_service_task [ice] [20976.213422] Call Trace: [20976.215871] __schedule+0x2d1/0x830 [20976.219364] schedule+0x35/0xa0 [20976.222510] schedule_preempt_disabled+0xa/0x10 [20976.227043] __mutex_lock.isra.7+0x310/0x420 [20976.235071] enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core] [20976.251215] ib_enum_roce_netdev+0xa4/0xe0 [ib_core] [20976.256192] ib_cache_setup_one+0x33/0xa0 [ib_core] [20976.261079] ib_register_device+0x40d/0x580 [ib_core] [20976.266139] irdma_ib_register_device+0x129/0x250 [irdma] [20976.281409] irdma_probe+0x2c1/0x360 [irdma] [20976.285691] auxiliary_bus_probe+0x45/0x70 [20976.289790] really_probe+0x1f2/0x480 [20976.298509] driver_probe_device+0x49/0xc0 [20976.302609] bus_for_each_drv+0x79/0xc0 [20976.306448] __device_attach+0xdc/0x160 [20976.310286] bus_probe_device+0x9d/0xb0 [20976.314128] device_add+0x43c/0x890 [20976.321287] __auxiliary_device_add+0x43/0x60 [20976.325644] ice_plug_aux_dev+0xb2/0x100 [ice] [20976.330109] ice_service_task+0xd0c/0xed0 [ice] [20976.342591] process_one_work+0x1a7/0x360 [20976.350536] worker_thread+0x30/0x390 [20976.358128] kthread+0x10a/0x120 [20976.365547] ret_from_fork+0x1f/0x40 ... [20976.438030] task:ip state:D stack: 0 pid:213658 ppid:213627 flags:0x00004084 [20976.446469] Call Trace: [20976.448921] __schedule+0x2d1/0x830 [20976.452414] schedule+0x35/0xa0 [20976.455559] schedule_preempt_disabled+0xa/0x10 [20976.460090] __mutex_lock.isra.7+0x310/0x420 [20976.464364] device_del+0x36/0x3c0 [20976.467772] ice_unplug_aux_dev+0x1a/0x40 [ice] [20976.472313] ice_lag_event_handler+0x2a2/0x520 [ice] [20976.477288] notifier_call_chain+0x47/0x70 [20976.481386] __netdev_upper_dev_link+0x18b/0x280 [20976.489845] bond_enslave+0xe05/0x1790 [bonding] [20976.494475] do_setlink+0x336/0xf50 [20976.502517] __rtnl_newlink+0x529/0x8b0 [20976.543441] rtnl_newlink+0x43/0x60 [20976.546934] rtnetlink_rcv_msg+0x2b1/0x360 [20976.559238] netlink_rcv_skb+0x4c/0x120 [20976.563079] netlink_unicast+0x196/0x230 [20976.567005] netlink_sendmsg+0x204/0x3d0 [20976.570930] sock_sendmsg+0x4c/0x50 [20976.574423] ____sys_sendmsg+0x1eb/0x250 [20976.586807] ___sys_sendmsg+0x7c/0xc0 [20976.606353] __sys_sendmsg+0x57/0xa0 [20976.609930] do_syscall_64+0x5b/0x1a0 [20976.613598] entry_SYSCALL_64_after_hwframe+0x65/0xca 1. Command 'ip link ... set nomaster' causes that ice_plug_aux_dev() is called from ice_service_task() context, aux device is created and associated device->lock is taken. 2. Command 'ip link ... set master...' calls ice's notifier under RTNL lock and that notifier calls ice_unplug_aux_dev(). That function tries to take aux device->lock but this is already taken by ice_plug_aux_dev() in step 1 3. Later ice_plug_aux_dev() tries to take RTNL lock but this is already taken in step 2 4. Dead-lock The patch fixes this issue by following changes: - Bit ICE_FLAG_PLUG_AUX_DEV is kept to be set during ice_plug_aux_dev() call in ice_service_task() - The bit is checked in ice_clear_rdma_cap() and only if it is not set then ice_unplug_aux_dev() is called. If it is set (in other words plugging of aux device was requested and ice_plug_aux_dev() is potentially running) then the function only clears the bit - Once ice_plug_aux_dev() call (in ice_service_task) is finished the bit ICE_FLAG_PLUG_AUX_DEV is cleared but it is also checked whether it was already cleared by ice_clear_rdma_cap(). If so then aux device is unplugged. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Co-developed-by: Petr Oros <poros@redhat.com> Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Dave Ertman <david.m.ertman@intel.com> Link: https://lore.kernel.org/r/20220310171641.3863659-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-10 15:07:46 -08:00
Amritha Nambiar	02ddec1986	ice: Add support for outer dest MAC for ADQ tunnels TC flower does not support matching on user specified outer MAC address for tunnels. For ADQ tunnels, the driver adds outer destination MAC address as lower netdev's active unicast MAC address to filter out packets with unrelated MAC address being delivered to ADQ VSIs. Example: - create tunnel device ip l add $VXLAN_DEV type vxlan id $VXLAN_VNI dstport $VXLAN_PORT \ dev $PF - add TC filter (in ADQ mode) $tc filter add dev $VXLAN_DEV protocol ip parent ffff: flower \ dst_ip $INNER_DST_IP ip_proto tcp dst_port $INNER_DST_PORT \ enc_key_id $VXLAN_VNI hw_tc $ADQ_TC Note: Filters with wild-card tunnel ID (when user does not specify tunnel key) are also supported. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-09 10:05:32 -08:00
Maciej Fijalkowski	457a02f03e	ice: avoid XDP checks in ice_clean_tx_irq() Commit `9610bd988d` ("ice: optimize XDP_TX workloads") introduced Tx IRQ cleaning routine dedicated for XDP rings. Currently it is impossible to call ice_clean_tx_irq() against XDP ring, so it is safe to drop ice_ring_is_xdp() calls in there. Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-09 10:05:27 -08:00
Jonathan Toppins	ad24d9ebc4	ice: change "can't set link" message to dbg level In the case where the link is owned by manageability, the firmware is not allowed to set the link state, so an error code is returned. This however is non-fatal and there is nothing the operator can do, so instead of confusing the operator with messages they can do nothing about hide this message behind the debug log level. Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-09 08:02:58 -08:00
Marcin Szycik	c8ff29b587	ice: Add slow path offload stats on port representor in switchdev Implement callbacks to check for stats and fetch port representor stats. Stats are taken from RX/TX ring corresponding to port representor and show the number of bytes/packets that were not offloaded. To see slow path stats run: ifstat -x cpu_hits -a Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-09 08:02:58 -08:00
Martyna Szapar-Mudlaw	34a897758e	ice: Add support for inner etype in switchdev Enable support for adding TC rules that filter on the inner EtherType field of tunneled packet headers. Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@intel.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-09 08:02:58 -08:00
Jedrzej Jagielski	ad35ffa252	ice: Fix curr_link_speed advertised speed Change curr_link_speed advertised speed, due to link_info.link_speed is not equal phy.curr_user_speed_req. Without this patch it is impossible to set advertised speed to same as link_speed. Testing Hints: Try to set advertised speed to 25G only with 25G default link (use ethtool -s 0x80000000) Fixes: `48cb27f2fd` ("ice: Implement handlers for ethtool PHY/link operations") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-08 13:31:09 -08:00
Christophe JAILLET	3d97f1afd8	ice: Don't use GFP_KERNEL in atomic context ice_misc_intr() is an irq handler. It should not sleep. Use GFP_ATOMIC instead of GFP_KERNEL when allocating some memory. Fixes: `348048e724` ("ice: Implement iidc operations") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-08 13:31:09 -08:00
Dave Ertman	97b0129146	ice: Fix error with handling of bonding MTU When a bonded interface is destroyed, .ndo_change_mtu can be called during the tear-down process while the RTNL lock is held. This is a problem since the auxiliary driver linked to the LAN driver needs to be notified of the MTU change, and this requires grabbing a device_lock on the auxiliary_device's dev. Currently this is being attempted in the same execution context as the call to .ndo_change_mtu which is causing a dead-lock. Move the notification of the changed MTU to a separate execution context (watchdog service task) and eliminate the "before" notification. Fixes: `348048e724` ("ice: Implement iidc operations") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Jonathan Toppins <jtoppins@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-08 13:31:08 -08:00
Jacob Keller	79498d5af8	ice: stop disabling VFs due to PF error responses The ice_vc_send_msg_to_vf function has logic to detect "failure" responses being sent to a VF. If a VF is sent more than ICE_DFLT_NUM_INVAL_MSGS_ALLOWED then the VF is marked as disabled. Almost identical logic also existed in the i40e driver. This logic was added to the ice driver in commit `1071a8358a` ("ice: Implement virtchnl commands for AVF support") which itself copied from the i40e implementation in commit `5c3c48ac6b` ("i40e: implement virtual device interface"). Neither commit provides a proper explanation or justification of the check. In fact, later commits to i40e changed the logic to allow bypassing the check in some specific instances. The "logic" for this seems to be that error responses somehow indicate a malicious VF. This is not really true. The PF might be sending an error for any number of reasons such as lack of resources, etc. Additionally, this causes the PF to log an info message for every failed VF response which may confuse users, and can spam the kernel log. This behavior is not documented as part of any requirement for our products and other operating system drivers such as the FreeBSD implementation of our drivers do not include this type of check. In fact, the change from dev_err to dev_info in i40e commit `18b7af57d9` ("i40e: Lower some message levels") explains that these messages typically don't actually indicate a real issue. It is quite likely that a user who hits this in practice will be very confused as the VF will be disabled without an obvious way to recover. We already have robust malicious driver detection logic using actual hardware detection mechanisms that detect and prevent invalid device usage. Remove the logic since its not a documented requirement and the behavior is not intuitive. Fixes: `1071a8358a` ("ice: Implement virtchnl commands for AVF support") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-08 13:31:08 -08:00
Maciej Fijalkowski	13d04d7970	ice: xsk: fix GCC version checking against pragma unroll presence Pragma unroll was introduced around GCC 8, whereas current xsk code in ice that prepares loop_unrolled_for macro that is based on mentioned pragma, compares GCC version against 4, which is wrong and Stephen found this out by compiling kernel with GCC 5.4 [0]. Fix this mistake and check if GCC version is >= 8. [0]: https://lore.kernel.org/netdev/20220307213659.47658125@canb.auug.org.au/ Fixes: `126cdfe100` ("ice: xsk: Improve AF_XDP ZC Tx and use batching API") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20220307231353.56638-1-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-07 22:14:47 -08:00
Jacob Keller	3d5985a185	ice: convert VF storage to hash table with krefs and RCU The ice driver stores VF structures in a simple array which is allocated once at the time of VF creation. The VF structures are then accessed from the array by their VF ID. The ID must be between 0 and the number of allocated VFs. Multiple threads can access this table: * .ndo operations such as .ndo_get_vf_cfg or .ndo_set_vf_trust * interrupts, such as due to messages from the VF using the virtchnl communication * processing such as device reset * commands to add or remove VFs The current implementation does not keep track of when all threads are done operating on a VF and can potentially result in use-after-free issues caused by one thread accessing a VF structure after it has been released when removing VFs. Some of these are prevented with various state flags and checks. In addition, this structure is quite static and does not support a planned future where virtualization can be more dynamic. As we begin to look at supporting Scalable IOV with the ice driver (as opposed to just supporting Single Root IOV), this structure is not sufficient. In the future, VFs will be able to be added and removed individually and dynamically. To allow for this, and to better protect against a whole class of use-after-free bugs, replace the VF storage with a combination of a hash table and krefs to reference track all of the accesses to VFs through the hash table. A hash table still allows efficient look up of the VF given its ID, but also allows adding and removing VFs. It does not require contiguous VF IDs. The use of krefs allows the cleanup of the VF memory to be delayed until after all threads have released their reference (by calling ice_put_vf). To prevent corruption of the hash table, a combination of RCU and the mutex table_lock are used. Addition and removal from the hash table use the RCU-aware hash macros. This allows simple read-only look ups that iterate to locate a single VF can be fast using RCU. Accesses which modify the hash table, or which can't take RCU because they sleep, will hold the mutex lock. By using this design, we have a stronger guarantee that the VF structure can't be released until after all threads are finished operating on it. We also pave the way for the more dynamic Scalable IOV implementation in the future. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 11:57:18 -08:00
Jacob Keller	fb916db1f0	ice: introduce VF accessor functions Before we switch the VF data structure storage mechanism to a hash, introduce new accessor functions to define the new interface. * ice_get_vf_by_id is a function used to obtain a reference to a VF from the table based on its VF ID * ice_has_vfs is used to quickly check if any VFs are configured * ice_get_num_vfs is used to get an exact count of how many VFs are configured We can drop the old ice_validate_vf_id function, since every caller was just going to immediately access the VF table to get a reference anyways. This way we simply use the single ice_get_vf_by_id to both validate the VF ID is within range and that there exists a VF with that ID. This change enables us to more easily convert the codebase to the hash table since most callers now properly use the interface. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 10:54:33 -08:00
Jacob Keller	000773c00f	ice: factor VF variables to separate structure We maintain a number of values for VFs within the ice_pf structure. This includes the VF table, the number of allocated VFs, the maximum number of supported SR-IOV VFs, the number of queue pairs per VF, the number of MSI-X vectors per VF, and a bitmap of the VFs with detected MDD events. We're about to add a few more variables to this list. Clean this up first by extracting these members out into a new ice_vfs structure defined in ice_virtchnl_pf.h Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 10:54:29 -08:00
Jacob Keller	c4c2c7db64	ice: convert ice_for_each_vf to include VF entry iterator The ice_for_each_vf macro is intended to be used to loop over all VFs. The current implementation relies on an iterator that is the index into the VF array in the PF structure. This forces all users to perform a look up themselves. This abstraction forces a lot of duplicate work on callers and leaks the interface implementation to the caller. Replace this with an implementation that includes the VF pointer the primary iterator. This version simplifies callers which just want to iterate over every VF, as they no longer need to perform their own lookup. The "i" iterator value is replaced with a new unsigned int "bkt" parameter, as this will match the necessary interface for replacing the VF array with a hash table. For now, the bkt is the VF ID, but in the future it will simply be the hash bucket index. Document that it should not be treated as a VF ID. This change aims to simplify switching from the array to a hash table. I considered alternative implementations such as an xarray but decided that the hash table was the simplest and most suitable implementation. I also looked at methods to hide the bkt iterator entirely, but I couldn't come up with a feasible solution that worked for hash table iterators. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 08:46:48 -08:00
Jacob Keller	19281e8668	ice: use ice_for_each_vf for iteration during removal When removing VFs, the driver takes a weird approach of assigning pf->num_alloc_vfs to 0 before iterating over the VFs using a temporary variable. This logic has been in the driver for a long time, and seems to have been carried forward from i40e. We want to refactor the way VFs are stored, and iterating over the data structure without the ice_for_each_vf interface impedes this work. The logic relies on implicitly using the num_alloc_vfs as a sort of "safe guard" for accessing VF data. While this sort of guard makes sense for Single Root IOV where all VFs are added at once, the data structures don't work for VFs which can be added and removed dynamically. We also have a separate state flag, ICE_VF_DEINIT_IN_PROGRESS which is a stronger protection against concurrent removal and access. Avoid the custom tmp iteration and replace it with the standard ice_for_each_vf iterator. Delay the assignment of num_alloc_vfs until after this loop finishes. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 08:46:48 -08:00
Jacob Keller	59e1f857e3	ice: remove checks in ice_vc_send_msg_to_vf The ice_vc_send_msg_to_vf function is used by the PF to send a response to a VF. This function has overzealous checks to ensure its not passed a NULL VF pointer and to ensure that the passed in struct ice_vf has a valid vf_id sub-member. These checks have existed since commit `1071a8358a` ("ice: Implement virtchnl commands for AVF support") and function as simple sanity checks. We are planning to refactor the ice driver to use a hash table along with appropriate locks in a future refactor. This change will modify how the ice_validate_vf_id function works. Instead of a simple >= check to ensure the VF ID is between some range, it will check the hash table to see if the specified VF ID is actually in the table. This requires that the function properly lock the table to prevent race conditions. The checks may seem ok at first glance, but they don't really provide much benefit. In order for ice_vc_send_msg_to_vf to have these checks fail, the callers must either (1) pass NULL as the VF, (2) construct an invalid VF pointer manually, or (3) be using a VF pointer which becomes invalid after they obtain it properly using ice_get_vf_by_id. For (1), a cursory glance over callers of ice_vc_send_msg_to_vf can show that in most cases the functions already operate assuming their VF pointer is valid, such as by derferencing vf->pf or other members. They obtain the VF pointer by accessing the VF array using the VF ID, which can never produce a NULL value (since its a simple address operation on the array it will not be NULL. The sole exception for (1) is that ice_vc_process_vf_msg will forward a NULL VF pointer to this function as part of its goto error handler logic. This requires some minor cleanup to simply exit immediately when an invalid VF ID is detected (Rather than use the same error flow as the rest of the function). For (2), it is unexpected for a flow to construct a VF pointer manually instead of accessing the VF array. Defending against this is likely to just hide bad programming. For (3), it is definitely true that VF pointers could become invalid, for example if a thread is processing a VF message while the VF gets removed. However, the correct solution is not to add additional checks like this which do not guarantee to prevent the race. Instead we plan to solve the root of the problem by preventing the possibility entirely. This solution will require the change to a hash table with proper locking and reference counts of the VF structures. When this is done, ice_validate_vf_id will require locking of the hash table. This will be problematic because all of the callers of ice_vc_send_msg_to_vf will already have to take the lock to obtain the VF pointer anyways. With a mutex, this leads to a double lock that could hang the kernel thread. Avoid this by removing the checks which don't provide much value, so that we can safely add the necessary protections properly. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 08:46:48 -08:00
Jacob Keller	44efe75f73	ice: move VFLR acknowledge during ice_free_vfs After removing all VFs, the driver clears the VFLR indication for VFs. This has been in ice since the beginning of SR-IOV support in the ice driver. The implementation was copied from i40e, and the motivation for the VFLR indication clearing is described in the commit `f7414531a0` ("i40e: acknowledge VFLR when disabling SR-IOV") The commit explains that we need to clear the VFLR indication because the virtual function undergoes a VFLR event. If we don't indicate that it is complete it can cause an issue when VFs are re-enabled due to a "phantom" VFLR. The register block read was added under a pci_vfs_assigned check originally. This was done because we added the check after calling pci_disable_sriov. This was later moved to disable SRIOV earlier in the flow so that the VF drivers could be torn down before we removed functionality. Move the VFLR acknowledge into the main loop that tears down VF resources. This avoids using the tmp value for iterating over VFs multiple times. The result will make it easier to refactor the VF array in a future change. It's possible we might want to modify this flow to also stop checking pci_vfs_assigned. However, it seems reasonable to keep this change: we should only clear the VFLR if we actually disabled SR-IOV. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 08:46:48 -08:00
Jacob Keller	294627a67e	ice: move clear_malvf call in ice_free_vfs The ice_mbx_clear_malvf function is used to clear the indication and count of how many times a VF was detected as malicious. During ice_free_vfs, we use this function to ensure that all removed VFs are reset to a clean state. The call currently is done at the end of ice_free_vfs() using a tmp value to iterate over all of the entries in the bitmap. This separate iteration using tmp is problematic for a planned refactor of the VF array data structure. To avoid this, lets move the call slightly higher into the function inside the loop where we teardown all of the VFs. This avoids one use of the tmp value used for iteration. We'll fix the other user in a future change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 08:46:48 -08:00
Jacob Keller	cd0f4f3b2c	ice: pass num_vfs to ice_set_per_vf_res() We are planning to replace the simple array structure tracking VFs with a hash table. This change will also remove the "num_alloc_vfs" variable. Instead, new access functions to use the hash table as the source of truth will be introduced. These will generally be equivalent to existing checks, except during VF initialization. Specifically, ice_set_per_vf_res() cannot use the hash table as it will be operating prior to VF structures being inserted into the hash table. Instead of using pf->num_alloc_vfs, simply pass the num_vfs value in from the caller. Note that a sub-function of ice_set_per_vf_res, ice_determine_res, also implicitly depends on pf->num_alloc_vfs. Replace ice_determine_res with a simpler inline implementation based on rounddown_pow_of_two. Note that we must explicitly check that the argument is non-zero since it does not play well with zero as a value. Instead of using the function and while loop, simply calculate the number of queues we have available by dividing by num_vfs. Check if the desired queues are available. If not, round down to the nearest power of 2 that fits within our available queues. This matches the behavior of ice_determine_res but is easier to follow as simple in-line logic. Remove ice_determine_res entirely. With this change, we no longer depend on the pf->num_alloc_vfs during the initialization phase of VFs. This will allow us to safely remove it in a future planned refactor of the VF data structures. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 08:46:48 -08:00
Jacob Keller	b03d519d34	ice: store VF pointer instead of VF ID The VSI structure contains a vf_id field used to associate a VSI with a VF. This is used mainly for ICE_VSI_VF as well as partially for ICE_VSI_CTRL associated with the VFs. This API was designed with the idea that VFs are stored in a simple array that was expected to be static throughout most of the driver's life. We plan on refactoring VF storage in a few key ways: 1) converting from a simple static array to a hash table 2) using krefs to track VF references obtained from the hash table 3) use RCU to delay release of VF memory until after all references are dropped This is motivated by the goal to ensure that the lifetime of VF structures is accounted for, and prevent various use-after-free bugs. With the existing vsi->vf_id, the reference tracking for VFs would become somewhat convoluted, because each VSI maintains a vf_id field which will then require performing a look up. This means all these flows will require reference tracking and proper usage of rcu_read_lock, etc. We know that the VF VSI will always be backed by a valid VF structure, because the VSI is created during VF initialization and removed before the VF is destroyed. Rely on this and store a reference to the VF in the VSI structure instead of storing a VF ID. This will simplify the usage and avoid the need to perform lookups on the hash table in the future. For ICE_VSI_VF, it is expected that vsi->vf is always non-NULL after ice_vsi_alloc succeeds. Because of this, use WARN_ON when checking if a vsi->vf pointer is valid when dealing with VF VSIs. This will aid in debugging code which violates this assumption and avoid more disastrous panics. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 08:46:47 -08:00
Jacob Keller	df830543d6	ice: refactor unwind cleanup in eswitch mode The code for supporting eswitch mode and port representors on VFs uses an unwind based cleanup flow when handling errors. These flows are used to cleanup and get everything back to the state prior to attempting to switch from legacy to representor mode or back. The unwind iterations make sense, but complicate a plan to refactor the VF array structure. In the future we won't have a clean method of reversing an iteration of the VFs. Instead, we can change the cleanup flow to just iterate over all VF structures and clean up appropriately. First notice that ice_repr_add_for_all_vfs and ice_repr_rem_from_all_vfs have an additional step of re-assigning the VC ops. There is no good reason to do this outside of ice_repr_add and ice_repr_rem. It can simply be done as the last step of these functions. Second, make sure ice_repr_rem is safe to call on a VF which does not have a representor. Check if vf->repr is NULL first and exit early if so. Move ice_repr_rem_from_all_vfs above ice_repr_add_for_all_vfs so that we can call it from the cleanup function. In ice_eswitch.c, replace the unwind iteration with a call to ice_eswitch_release_reprs. This will go through all of the VFs and revert the VF back to the standard model without the eswitch mode. To make this safe, ensure this function checks whether or not the represent or has been moved. Rely on the metadata destination in vf->repr->dst. This must be NULL if the representor has not been moved to eswitch mode. Ensure that we always re-assign this value back to NULL after freeing it, and move the ice_eswitch_release_reprs so that it can be called from the setup function. With these changes, eswitch cleanup no longer uses an unwind flow that is problematic for the planned VF data structure change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-03-03 08:46:47 -08:00
Karol Kolacinski	43113ff734	ice: add TTY for GNSS module for E810T device Add a new ice_gnss.c file for holding the basic GNSS module functions. If the device supports GNSS module, call the new ice_gnss_init and ice_gnss_release functions where appropriate. Implement basic functionality for reading the data from GNSS module using TTY device. Add I2C read AQ command. It is now required for controlling the external physical connectors via external I2C port expander on E810-T adapters. Future changes will introduce write functionality. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Sudhansu Sekhar Mishra <sudhansu.mishra@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-03-03 14:11:00 +00:00
Jakub Kicinski	aaa25a2fa7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net tools/testing/selftests/net/mptcp/mptcp_join.sh `34aa6e3bcc` ("selftests: mptcp: add ip mptcp wrappers") `857898eb4b` ("selftests: mptcp: add missing join check") `6ef84b1517` ("selftests: mptcp: more robust signal race test") https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/ drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c `fb7e76ea3f` ("net/mlx5e: TC, Skip redundant ct clear actions") `c63741b426` ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information") `09bf979232` ("net/mlx5e: TC, Move pedit_headers_action to parse_attr") `84ba8062e3` ("net/mlx5e: Test CT and SAMPLE on flow attr") `efe6f961cd` ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow") `3b49a7edec` ("net/mlx5e: TC, Reject rules with multiple CT actions") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 17:54:25 -08:00
Tom Rix	5950bdc88d	ice: initialize local variable 'tlv' Clang static analysis reports this issues ice_common.c:5008:21: warning: The left expression of the compound assignment is an uninitialized value. The computed value will also be garbage ldo->phy_type_low \|= ((u64)buf << (i * 16)); ~~~~~~~~~~~~~~~~~ ^ When called from ice_cfg_phy_fec() ldo is the uninitialized local variable tlv. So initialize. Fixes: `ea78ce4dab` ("ice: add link lenient and default override support") Signed-off-by: Tom Rix <trix@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-18 13:28:39 -08:00
Tom Rix	ed22d9c8d1	ice: check the return of ice_ptp_gettimex64 Clang static analysis reports this issue time64.h:69:50: warning: The left operand of '+' is a garbage value set_normalized_timespec64(&ts_delta, lhs.tv_sec + rhs.tv_sec, ~~~~~~~~~~ ^ In ice_ptp_adjtime_nonatomic(), the timespec64 variable 'now' is set by ice_ptp_gettimex64(). This function can fail with -EBUSY, so 'now' can have a gargbage value. So check the return. Fixes: `06c16d89d2` ("ice: register 1588 PTP clock device object for E810 devices") Signed-off-by: Tom Rix <trix@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-18 13:28:39 -08:00
Jacob Keller	fadead80fe	ice: fix concurrent reset and removal of VFs Commit `c503e63200` ("ice: Stop processing VF messages during teardown") introduced a driver state flag, ICE_VF_DEINIT_IN_PROGRESS, which is intended to prevent some issues with concurrently handling messages from VFs while tearing down the VFs. This change was motivated by crashes caused while tearing down and bringing up VFs in rapid succession. It turns out that the fix actually introduces issues with the VF driver caused because the PF no longer responds to any messages sent by the VF during its .remove routine. This results in the VF potentially removing its DMA memory before the PF has shut down the device queues. Additionally, the fix doesn't actually resolve concurrency issues within the ice driver. It is possible for a VF to initiate a reset just prior to the ice driver removing VFs. This can result in the remove task concurrently operating while the VF is being reset. This results in similar memory corruption and panics purportedly fixed by that commit. Fix this concurrency at its root by protecting both the reset and removal flows using the existing VF cfg_lock. This ensures that we cannot remove the VF while any outstanding critical tasks such as a virtchnl message or a reset are occurring. This locking change also fixes the root cause originally fixed by commit `c503e63200` ("ice: Stop processing VF messages during teardown"), so we can simply revert it. Note that I kept these two changes together because simply reverting the original commit alone would leave the driver vulnerable to worse race conditions. Fixes: `c503e63200` ("ice: Stop processing VF messages during teardown") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-18 13:28:38 -08:00
Michal Swiatkowski	932645c298	ice: fix setting l4 port flag when adding filter Accidentally filter flag for none encapsulated l4 port field is always set. Even if user wants to add encapsulated l4 port field. Remove this unnecessary flag setting. Fixes: `9e300987d4` ("ice: VXLAN and Geneve TC support") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-18 13:28:18 -08:00
Wojciech Drewek	b70bc066d7	ice: Match on all profiles in slow-path In switchdev mode, slow-path rules need to match all protocols, in order to correctly redirect unfiltered or missed packets to the uplink. To set this up for the virtual function to uplink flow, the rule that redirects packets to the control VSI must have the tunnel type set to ICE_SW_TUN_AND_NON_TUN. As a result of that new tunnel type being set, ice_get_compat_fv_bitmap will select ICE_PROF_ALL. At that point all profiles would be selected for this rule, resulting in the desired behavior. Without this change slow-path would not work with tunnel protocols. Fixes: `8b032a55c1` ("ice: low level support for tunnels") Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-18 13:22:06 -08:00
Jakub Kicinski	6b5567b1b2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-17 11:44:20 -08:00
Dave Ertman	88f62aea1c	ice: Simplify tracking status of RDMA support The status of support for RDMA is currently being tracked with two separate status flags. This is unnecessary with the current state of the driver. Simplify status tracking down to a single flag. Rename the helper function to denote the RDMA specific status and universally use the helper function to test the status bit. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 13:35:12 +00:00
Jesse Brandeburg	86006f9963	ice: enable parsing IPSEC SPI headers for RSS The COMMS package can enable the hardware parser to recognize IPSEC frames with ESP header and SPI identifier. If this package is available and configured for loading in /lib/firmware, then the driver will succeed in enabling this protocol type for RSS. This in turn allows the hardware to hash over the SPI and use it to pick a consistent receive queue for the same secure flow. Without this all traffic is steered to the same queue for multiple traffic threads from the same IP address. For that reason this is marked as a fix, as the driver supports the model, but it wasn't enabled. If the package is not available, adding this type will fail, but the failure is ignored on purpose as it has no negative affect. Fixes: `c90ed40cef` ("ice: Enable writing hardware filtering tables") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 11:22:35 +00:00
Jakub Kicinski	5b91c5cc0e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-10 17:29:56 -08:00
Dave Ertman	5dbbbd01cb	ice: Avoid RTNL lock when re-creating auxiliary device If a call to re-create the auxiliary device happens in a context that has already taken the RTNL lock, then the call flow that recreates auxiliary device can hang if there is another attempt to claim the RTNL lock by the auxiliary driver. To avoid this, any call to re-create auxiliary devices that comes from an source that is holding the RTNL lock (e.g. netdev notifier when interface exits a bond) should execute in a separate thread. To accomplish this, add a flag to the PF that will be evaluated in the service task and dealt with there. Fixes: `f9f5301e7e` ("ice: Register auxiliary device to provide RDMA") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Jonathan Toppins <jtoppins@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-10 08:47:27 -08:00
Dave Ertman	bea1898f65	ice: Fix KASAN error in LAG NETDEV_UNREGISTER handler Currently, the same handler is called for both a NETDEV_BONDING_INFO LAG unlink notification as for a NETDEV_UNREGISTER call. This is causing a problem though, since the netdev_notifier_info passed has a different structure depending on which event is passed. The problem manifests as a call trace from a BUG: KASAN stack-out-of-bounds error. Fix this by creating a handler specific to NETDEV_UNREGISTER that only is passed valid elements in the netdev_notifier_info struct for the NETDEV_UNREGISTER event. Also included is the removal of an unbalanced dev_put on the peer_netdev and related braces. Fixes: `6a8b357278` ("ice: Respond to a NETDEV_UNREGISTER event for LAG") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-10 08:47:26 -08:00
Jesse Brandeburg	46b699c50c	ice: fix IPIP and SIT TSO offload The driver was avoiding offload for IPIP (at least) frames due to parsing the inner header offsets incorrectly when trying to check lengths. This length check works for VXLAN frames but fails on IPIP frames because skb_transport_offset points to the inner header in IPIP frames, which meant the subtraction of transport_header from inner_network_header returns a negative value (-20). With the code before this patch, everything continued to work, but GSO was being used to segment, causing throughputs of 1.5Gb/s per thread. After this patch, throughput is more like 10Gb/s per thread for IPIP traffic. Fixes: `e94d447866` ("ice: Implement filter sync, NDO operations and bump version") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-10 08:47:21 -08:00
Dan Carpenter	21338d5873	ice: fix an error code in ice_cfg_phy_fec() Propagate the error code from ice_get_link_default_override() instead of returning success. Fixes: `ea78ce4dab` ("ice: add link lenient and default override support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-10 07:42:33 -08:00
David S. Miller	adc27288f2	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2022-02-09 This series contains updates to ice driver only. Brett adds support for QinQ. This begins with code refactoring and re-organization of VLAN configuration functions to allow for introduction of VSI VLAN ops to enable setting and calling of respective operations based on device support of single or double VLANs. Implementations are added for outer VLAN support. To support QinQ, the device must be set to double VLAN mode (DVM). In order for this to occur, the DDP package and NVM must also support DVM. Functions to determine compatibility and properly configure the device are added as well as setting the proper bits to advertise and utilize the proper offloads. Support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 is also included to allow for VF to negotiate and utilize this functionality. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-10 11:00:13 +00:00
Jakub Kicinski	1127170d45	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2022-02-09 We've added 126 non-merge commits during the last 16 day(s) which contain a total of 201 files changed, 4049 insertions(+), 2215 deletions(-). The main changes are: 1) Add custom BPF allocator for JITs that pack multiple programs into a huge page to reduce iTLB pressure, from Song Liu. 2) Add __user tagging support in vmlinux BTF and utilize it from BPF verifier when generating loads, from Yonghong Song. 3) Add per-socket fast path check guarding from cgroup/BPF overhead when used by only some sockets, from Pavel Begunkov. 4) Continued libbpf deprecation work of APIs/features and removal of their usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko and various others. 5) Improve BPF instruction set documentation by adding byte swap instructions and cleaning up load/store section, from Christoph Hellwig. 6) Switch BPF preload infra to light skeleton and remove libbpf dependency from it, from Alexei Starovoitov. 7) Fix architecture-agnostic macros in libbpf for accessing syscall arguments from BPF progs for non-x86 architectures, from Ilya Leoshkevich. 8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be of 16-bit field with anonymous zero padding, from Jakub Sitnicki. 9) Add new bpf_copy_from_user_task() helper to read memory from a different task than current. Add ability to create sleepable BPF iterator progs, from Kenny Yu. 10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski. 11) Generate temporary netns names for BPF selftests to avoid naming collisions, from Hangbin Liu. 12) Implement bpf_core_types_are_compat() with limited recursion for in-kernel usage, from Matteo Croce. 13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5 to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor. 14) Misc minor fixes to libbpf and selftests from various folks. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits) selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide libbpf: Fix compilation warning due to mismatched printf format selftests/bpf: Test BPF_KPROBE_SYSCALL macro libbpf: Add BPF_KPROBE_SYSCALL macro libbpf: Fix accessing the first syscall argument on s390 libbpf: Fix accessing the first syscall argument on arm64 libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390 libbpf: Fix accessing syscall arguments on riscv libbpf: Fix riscv register names libbpf: Fix accessing syscall arguments on powerpc selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro libbpf: Add PT_REGS_SYSCALL_REGS macro selftests/bpf: Fix an endianness issue in bpf_syscall_macro test bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE bpf: Fix leftover header->pages in sparc and powerpc code. libbpf: Fix signedness bug in btf_dump_array_data() selftests/bpf: Do not export subtest as standalone test bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures ... ==================== Link: https://lore.kernel.org/r/20220209210050.8425-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-09 18:40:56 -08:00
Brett Creeley	f1da5a0866	ice: Add ability for PF admin to enable VF VLAN pruning VFs by default are able to see all tagged traffic regardless of trust and VLAN filters. Based on legacy devices (i.e. ixgbe, i40e), customers expect VFs to receive all VLAN tagged traffic with a matching destination MAC. Add an ethtool private flag 'vf-vlan-pruning' and set the default to off so VFs will receive all VLAN traffic directed towards them. When the flag is turned on, VF will only be able to receive untagged traffic or traffic with VLAN tags it has created interfaces for. Also, the flag cannot be changed while any VFs are allocated. This was done to simplify the implementation. So, if this flag is needed, then the PF admin must enable it. If the user tries to enable the flag while VFs are active, then print an unsupported message with the vf-vlan-pruning flag included. In case multiple flags were specified, this makes it clear to the user which flag failed. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	cbc8b5645a	ice: Add support for 802.1ad port VLANs VF Currently there is only support for 802.1Q port VLANs on SR-IOV VFs. Add support to also allow 802.1ad port VLANs when double VLAN mode is enabled. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	1babaf77f4	ice: Advertise 802.1ad VLAN filtering and offloads for PF netdev In order for the driver to support 802.1ad VLAN filtering and offloads, it needs to advertise those VLAN features and also support modifying those VLAN features, so make the necessary changes to ice_set_netdev_features(). By default, enable CTAG insertion/stripping and CTAG filtering for both Single and Double VLAN Modes (SVM/DVM). Also, in DVM, enable STAG filtering by default. This is done by setting the feature bits in netdev->features. Also, in DVM, support toggling of STAG insertion/stripping, but don't enable them by default. This is done by setting the feature bits in netdev->hw_features. Since 802.1ad VLAN filtering and offloads are only supported in DVM, make sure they are not enabled by default and that they cannot be enabled during runtime, when the device is in SVM. Add an implementation for the ndo_fix_features() callback. This is needed since the hardware cannot support multiple VLAN ethertypes for VLAN insertion/stripping simultaneously and all supported VLAN filtering must either be enabled or disabled together. Disable inner VLAN stripping by default when DVM is enabled. If a VSI supports stripping the inner VLAN in DVM, then it will have to configure that during runtime. For example if a VF is configured in a port VLAN while DVM is enabled it will be allowed to offload inner VLANs. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	a1ffafb0b4	ice: Support configuring the device to Double VLAN Mode In order to support configuring the device in Double VLAN Mode (DVM), the DDP and FW have to support DVM. If both support DVM, the PF that downloads the package needs to update the default recipes, set the VLAN mode, and update boost TCAM entries. To support updating the default recipes in DVM, add support for updating an existing switch recipe's lkup_idx and mask. This is done by first calling the get recipe AQ (0x0292) with the desired recipe ID. Then, if that is successful update one of the lookup indices (lkup_idx) and its associated mask if the mask is valid otherwise the already existing mask will be used. The VLAN mode of the device has to be configured while the global configuration lock is held while downloading the DDP, specifically after the DDP has been downloaded. If supported, the device will default to DVM. Co-developed-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	cc71de8fa1	ice: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 Add support for the VF driver to be able to request VIRTCHNL_VF_OFFLOAD_VLAN_V2, negotiate its VLAN capabilities via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS, add/delete VLAN filters, and enable/disable VLAN offloads. VFs supporting VIRTCHNL_OFFLOAD_VLAN_V2 will be able to use the following virtchnl opcodes: VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS VIRTCHNL_OP_ADD_VLAN_V2 VIRTCHNL_OP_DEL_VLAN_V2 VIRTCHNL_OP_ENABLE_VLAN_STRIPPING_V2 VIRTCHNL_OP_DISABLE_VLAN_STRIPPING_V2 VIRTCHNL_OP_ENABLE_VLAN_INSERTION_V2 VIRTCHNL_OP_DISABLE_VLAN_INSERTION_V2 Legacy VF drivers may expect the initial VLAN stripping settings to be configured by the PF, so the PF initializes VLAN stripping based on the VIRTCHNL_OP_GET_VF_RESOURCES opcode. However, with VLAN support via VIRTCHNL_VF_OFFLOAD_VLAN_V2, this function is only expected to be used for VFs that only support VIRTCHNL_VF_OFFLOAD_VLAN, which will only be supported when a port VLAN is configured. Update the function based on the new expectations. Also, change the message when the PF can't enable/disable VLAN stripping to a dev_dbg() as this isn't fatal. When a VF isn't in a port VLAN and it only supports VIRTCHNL_VF_OFFLOAD_VLAN when Double VLAN Mode (DVM) is enabled, then the PF needs to reject the VIRTCHNL_VF_OFFLOAD_VLAN capability and configure the VF in software only VLAN mode. To do this add the new function ice_vf_vsi_cfg_legacy_vlan_mode(), which updates the VF's inner and outer ice_vsi_vlan_ops functions and sets up software only VLAN mode. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	0d54d8f7a1	ice: Add hot path support for 802.1Q and 802.1ad VLAN offloads Currently the driver only supports 802.1Q VLAN insertion and stripping. However, once Double VLAN Mode (DVM) is fully supported, then both 802.1Q and 802.1ad VLAN insertion and stripping will be supported. Unfortunately the VSI context parameters only allow for one VLAN ethertype at a time for VLAN offloads so only one or the other VLAN ethertype offload can be supported at once. To support this, multiple changes are needed. Rx path changes: [1] In DVM, the Rx queue context l2tagsel field needs to be cleared so the outermost tag shows up in the l2tag2_2nd field of the Rx flex descriptor. In Single VLAN Mode (SVM), the l2tagsel field should remain 1 to support SVM configurations. [2] Modify the ice_test_staterr() function to take a __le16 instead of the ice_32b_rx_flex_desc union pointer so this function can be used for both rx_desc->wb.status_error0 and rx_desc->wb.status_error1. [3] Add the new inline function ice_get_vlan_tag_from_rx_desc() that checks if there is a VLAN tag in l2tag1 or l2tag2_2nd. [4] In ice_receive_skb(), add a check to see if NETIF_F_HW_VLAN_STAG_RX is enabled in netdev->features. If it is, then this is the VLAN ethertype that needs to be added to the stripping VLAN tag. Since ice_fix_features() prevents CTAG_RX and STAG_RX from being enabled simultaneously, the VLAN ethertype will only ever be 802.1Q or 802.1ad. Tx path changes: [1] In DVM, the VLAN tag needs to be placed in the l2tag2 field of the Tx context descriptor. The new define ICE_TX_FLAGS_HW_OUTER_SINGLE_VLAN was added to the list of tx_flags to handle this case. [2] When the stack requests the VLAN tag to be offloaded on Tx, the driver needs to set either ICE_TX_FLAGS_HW_OUTER_SINGLE_VLAN or ICE_TX_FLAGS_HW_VLAN, so the tag is inserted in l2tag2 or l2tag1 respectively. To determine which location to use, set a bit in the Tx ring flags field during ring allocation that can be used to determine which field to use in the Tx descriptor. In DVM, always use l2tag2, and in SVM, always use l2tag1. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	c31af68a1b	ice: Add outer_vlan_ops and VSI specific VLAN ops implementations Add a new outer_vlan_ops member to the ice_vsi structure as outer VLAN ops are only available when the device is in Double VLAN Mode (DVM). Depending on the VSI type, the requirements for what operations to use/allow differ. By default all VSI's have unsupported inner and outer VSI VLAN ops. This implementation was chosen to prevent unexpected crashes due to null pointer dereferences. Instead, if a VSI calls an unsupported op, it will just return -EOPNOTSUPP. Add implementations to support modifying outer VLAN fields for VSI context. This includes the ability to modify VLAN stripping, insertion, and the port VLAN based on the outer VLAN handling fields of the VSI context. These functions should only ever be used if DVM is enabled because that means the firmware supports the outer VLAN fields in the VSI context. If the device is in DVM, then always use the outer_vlan_ops, else use the vlan_ops since the device is in Single VLAN Mode (SVM). Also, move adding the untagged VLAN 0 filter from ice_vsi_setup() to ice_vsi_vlan_setup() as the latter function is specific to the PF and all other VSI types that need an untagged VLAN 0 filter already do this in their specific flows. Without this change, Flow Director is failing to initialize because it does not implement any VSI VLAN ops. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	7bd527aa17	ice: Adjust naming for inner VLAN operations Current operations act on inner VLAN fields. To support double VLAN, outer VLAN operations and functions will be implemented. Add the "inner" naming to existing VLAN operations to distinguish them from the upcoming outer values and functions. Some spacing adjustments are made to align values. Note that the inner is not talking about a tunneled VLAN, but the second VLAN in the packet. For SVM the driver uses inner or single VLAN filtering and offloads and in Double VLAN Mode the driver uses the inner filtering and offloads for SR-IOV VFs in port VLANs in order to support offloading the guest VLAN while a port VLAN is configured. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	2bfefa2dab	ice: Use the proto argument for VLAN ops Currently the proto argument is unused. This is because the driver only supports 802.1Q VLAN filtering. This policy is enforced via netdev features that the driver sets up when configuring the netdev, so the proto argument won't ever be anything other than 802.1Q. However, this will allow for future iterations of the driver to seemlessly support 802.1ad filtering. Begin using the proto argument and extend the related structures to support its use. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	a19d7f7f01	ice: Refactor vf->port_vlan_info to use ice_vlan The current vf->port_vlan_info variable is a packed u16 that contains the port VLAN ID and QoS/prio value. This is fine, but changes are incoming that allow for an 802.1ad port VLAN. Add flexibility by changing the vf->port_vlan_info member to be an ice_vlan structure. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	fb05ba1257	ice: Introduce ice_vlan struct Add a new struct for VLAN related information. Currently this holds VLAN ID and priority values, but will be expanded to hold TPID value. This reduces the changes necessary if any other values are added in future. Remove the action argument from these calls as it's always ICE_FWD_VSI. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	bc42afa954	ice: Add new VSI VLAN ops Incoming changes to support 802.1Q and/or 802.1ad VLAN filtering and offloads require more flexibility when configuring VLANs. The VSI VLAN interface will allow flexibility for configuring VLANs for all VSI types. Add new files to separate the VSI VLAN ops and move functions to make the code more organized. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	3e0b59714b	ice: Add helper function for adding VLAN 0 There are multiple places where VLAN 0 is being added. Create a function to be called in order to minimize changes as the implementation is expanded to support double VLAN and avoid duplicated code. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Brett Creeley	daf4dd1643	ice: Refactor spoofcheck configuration functions Add functions to configure Tx VLAN antispoof based on iproute configuration and/or VLAN mode and VF driver support. This is needed later so the driver can control when it can be configured. Also, add functions that can be used to enable and disable MAC and VLAN spoofcheck. Move spoofchk configuration during VSI setup into the SR-IOV initialization path and into the post VSI rebuild flow for VF VSIs. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-09 09:24:45 -08:00
Jakub Kicinski	a501ab3f37	Merge branch 'iwl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux Nguyen, Anthony L says: ==================== iwl-next Intel Wired LAN Driver Updates 2022-02-07 Dave adds support for ice driver to provide DSCP QoS mappings to irdma driver. [1] https://lore.kernel.org/netdev/20220202191921.1638-1-shiraz.saleem@intel.com/ * 'iwl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux: ice: add support for DSCP QoS for IDC ==================== Link: https://lore.kernel.org/r/20220207235921.1303522-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-08 16:23:39 -08:00
Dave Ertman	b794eecb2a	ice: add support for DSCP QoS for IDC The ice driver provides QoS information to auxiliary drivers through the exported function ice_get_qos_params. This function doesn't currently support L3 DSCP QoS. Add the necessary defines, structure elements and code to support DSCP QoS through the IIDC functions. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-02-03 15:22:03 -08:00
Alexander Lobakin	45a34ca680	ice: respect metadata on XSK Rx to skb For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match ice_construct_skb(). Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-31 09:46:05 -08:00
Alexander Lobakin	dc44572d19	ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, ice_construct_skb_zc() currently allocates and reserves additional `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-31 09:43:35 -08:00
Alexander Lobakin	ee803dca96	ice: respect metadata in legacy-rx/ice_construct_skb() In "legacy-rx" mode represented by ice_construct_skb(), we can still use XDP (and XDP metadata), but after XDP_PASS the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. Point net_prefetch() to xdp->data_meta instead of data. This won't change anything when the meta is not here, but will save some cache misses otherwise. Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-31 09:43:35 -08:00
Christophe JAILLET	9c3e54a632	ice: Remove useless DMA-32 fallback configuration As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-27 08:58:24 -08:00
Maciej Fijalkowski	59e92bfe4d	ice: xsk: Borrow xdp_tx_active logic from i40e One of the things that commit `5574ff7b7b` ("i40e: optimize AF_XDP Tx completion path") introduced was the @xdp_tx_active field. Its usage from i40e can be adjusted to ice driver and give us positive performance results. If the descriptor that @next_dd points to has been sent by HW (its DD bit is set), then we are sure that at least quarter of the ring is ready to be cleaned. If @xdp_tx_active is 0 which means that related xdp_ring is not used for XDP_{TX, REDIRECT} workloads, then we know how many XSK entries should placed to completion queue, IOW walking through the ring can be skipped. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220125160446.78976-9-maciej.fijalkowski@intel.com	2022-01-27 17:25:33 +01:00
Maciej Fijalkowski	126cdfe100	ice: xsk: Improve AF_XDP ZC Tx and use batching API Apply the logic that was done for regular XDP from commit `9610bd988d` ("ice: optimize XDP_TX workloads") to the ZC side of the driver. On top of that, introduce batching to Tx that is inspired by i40e's implementation with adjustments to the cleaning logic - take into the account NAPI budget in ice_clean_xdp_irq_zc(). Separating the stats structs onto separate cache lines seemed to improve the performance. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/bpf/20220125160446.78976-8-maciej.fijalkowski@intel.com	2022-01-27 17:25:33 +01:00
Maciej Fijalkowski	86e3f78c8d	ice: xsk: Avoid potential dead AF_XDP Tx processing Commit `9610bd988d` ("ice: optimize XDP_TX workloads") introduced @next_dd and @next_rs to ice_tx_ring struct. Currently, their state is not restored in ice_clean_tx_ring(), which was not causing any troubles as the XDP rings are gone after we're done with XDP prog on interface. For upcoming usage of mentioned fields in AF_XDP, this might expose us to a potential dead Tx side. Scenario would look like following (based on xdpsock): - two xdpsock instances are spawned in Tx mode - one of them is killed - XDP prog is kept on interface due to the other xdpsock still running * this means that XDP rings stayed in place - xdpsock is launched again on same queue id that was terminated on - @next_dd and @next_rs setting is bogus, therefore transmit side is broken To protect us from the above, restore the initial @next_rs and @next_dd values when cleaning the Tx ring. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220125160446.78976-7-maciej.fijalkowski@intel.com	2022-01-27 17:25:33 +01:00
Maciej Fijalkowski	3dd411efe1	ice: Make Tx threshold dependent on ring length XDP_TX workloads use a concept of Tx threshold that indicates the interval of setting RS bit on descriptors which in turn tells the HW to generate an interrupt to signal the completion of Tx on HW side. It is currently based on a constant value of 32 which might not work out well for various sizes of ring combined with for example batch size that can be set via SO_BUSY_POLL_BUDGET. Internal tests based on AF_XDP showed that most convenient setup of mentioned threshold is when it is equal to quarter of a ring length. Make use of recently introduced ICE_RING_QUARTER macro and use this value as a substitute for ICE_TX_THRESH. Align also ethtool -G callback so that next_dd/next_rs fields are up to date in terms of the ring size. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220125160446.78976-5-maciej.fijalkowski@intel.com	2022-01-27 17:25:32 +01:00
Maciej Fijalkowski	3876ff525d	ice: xsk: Handle SW XDP ring wrap and bump tail more often Currently, if ice_clean_rx_irq_zc() processed the whole ring and next_to_use != 0, then ice_alloc_rx_buf_zc() would not refill the whole ring even if the XSK buffer pool would have enough free entries (either from fill ring or the internal recycle mechanism) - it is because ring wrap is not handled. Improve the logic in ice_alloc_rx_buf_zc() to address the problem above. Do not clamp the count of buffers that is passed to xsk_buff_alloc_batch() in case when next_to_use + buffer count >= rx_ring->count, but rather split it and have two calls to the mentioned function - one for the part up until the wrap and one for the part after the wrap. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/bpf/20220125160446.78976-4-maciej.fijalkowski@intel.com	2022-01-27 17:25:32 +01:00
Maciej Fijalkowski	296f13ff38	ice: xsk: Force rings to be sized to power of 2 With the upcoming introduction of batching to XSK data path, performance wise it will be the best to have the ring descriptor count to be aligned to power of 2. Check if ring sizes that user is going to attach the XSK socket fulfill the condition above. For Tx side, although check is being done against the Tx queue and in the end the socket will be attached to the XDP queue, it is fine since XDP queues get the ring->count setting from Tx queues. Suggested-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220125160446.78976-3-maciej.fijalkowski@intel.com	2022-01-27 17:25:32 +01:00
Maciej Fijalkowski	a4e186693c	ice: Remove likely for napi_complete_done Remove the likely before napi_complete_done as this is the unlikely case when busy-poll is used. Removing this has a positive performance impact for busy-poll and no negative impact to the regular case. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220125160446.78976-2-maciej.fijalkowski@intel.com	2022-01-27 17:25:32 +01:00
Christophe JAILLET	0dbc416218	ice: Use bitmap_free() to free bitmap kfree() and bitmap_free() are the same. But using the latter is more consistent when freeing memory allocated with bitmap_zalloc(). Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-06 10:15:25 -08:00
Christophe JAILLET	e75ed29db5	ice: Optimize a few bitmap operations When a bitmap is local to a function, it is safe to use the non-atomic __[set\|clear]_bit(). No concurrent accesses can occur. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-06 10:15:25 -08:00
Christophe JAILLET	a5c259b162	ice: Slightly simply ice_find_free_recp_res_idx The 'possible_idx' bitmap is set just after it is zeroed, so we can save the first step. The 'free_idx' bitmap is used only at the end of the function as the result of a bitmap xor operation. So there is no need to explicitly zero it before. So, slightly simply the code and remove 2 useless 'bitmap_zero()' call Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-06 10:15:25 -08:00
Wojciech Drewek	c1e5da5dd4	ice: improve switchdev's slow-path In current switchdev implementation, every VF PR is assigned to individual ring on switchdev ctrl VSI. For slow-path traffic, there is a mapping VF->ring done in software based on src_vsi value (by calling ice_eswitch_get_target_netdev function). With this change, HW solution is introduced which is more efficient. For each VF, src MAC (VF's MAC) filter will be created, which forwards packets to the corresponding switchdev ctrl VSI queue based on src MAC address. This filter has to be removed and then replayed in case of resetting one VF. Keep information about this rule in repr->mac_rule, thanks to that we know which rule has to be removed and replayed for a given VF. In case of CORE/GLOBAL all rules are removed automatically. We have to take care of readding them. This is done by ice_replay_vsi_adv_rule. When driver leaves switchdev mode, remove all advanced rules from switchdev ctrl VSI. This is done by ice_rem_adv_rule_for_vsi. Flag repr->rule_added is needed because in some cases reset might be triggered before VF sends request to add MAC. Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-06 10:15:09 -08:00
Victor Raj	c36a2b9716	ice: replay advanced rules after reset ice_replay_vsi_adv_rule will replay advanced rules for a given VSI. Exit this function when list of rules for given recipe is empty. Do not add rule when given vsi_handle does not match vsi_handle from the rule info. Use ICE_MAX_NUM_RECIPES instead of ICE_SW_LKUP_LAST in order to find advanced rules as well. Signed-off-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-06 09:19:40 -08:00
Jakub Kicinski	7d714ff14d	net: fixup build after bpf header changes Recent bpf-next merge brought in header changes which uncovered includes missing in net-next which were not present in bpf-next. Build problems happen only on less-popular arches like hppa, sparc, alpha etc. I could repro the build problem with ice but not the mlx5 problem Abdul was reporting. mlx5 does look like it should include filter.h, anyway. Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Fixes: `e63a023489` ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next") Link: https://lore.kernel.org/all/7c03768d-d948-c935-a7ab-b1f963ac7eed@linux.vnet.ibm.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-04 12:34:19 +00:00
David S. Miller	e63a023489	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-12-30 The following pull-request contains BPF updates for your net-next tree. We've added 72 non-merge commits during the last 20 day(s) which contain a total of 223 files changed, 3510 insertions(+), 1591 deletions(-). The main changes are: 1) Automatic setrlimit in libbpf when bpf is memcg's in the kernel, from Andrii. 2) Beautify and de-verbose verifier logs, from Christy. 3) Composable verifier types, from Hao. 4) bpf_strncmp helper, from Hou. 5) bpf.h header dependency cleanup, from Jakub. 6) get_func_[arg\|ret\|arg_cnt] helpers, from Jiri. 7) Sleepable local storage, from KP. 8) Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support, from Kumar. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-31 14:35:40 +00:00
Kiran Patil	40319796b7	ice: Add flow director support for channel mode Add support to enable flow-director filter when multiple TCs are configured. Flow director filter can be configured using ethtool (--config-ntuple option). When multiple TCs are configured, each TC is mapped to an unique HW VSI. So VSI corresponding to queue used in filter is identified and flow director context is updated with correct VSI while configuring ntuple filter in HW. Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-30 13:16:07 +00:00
Jakub Kicinski	b6459415b3	net: Don't include filter.h from net/sock.h sock.h is pretty heavily used (5k objects rebuilt on x86 after it's touched). We can drop the include of filter.h from it and add a forward declaration of struct sk_filter instead. This decreases the number of rebuilt objects when bpf.h is touched from ~5k to ~1k. There's a lot of missing includes this was masking. Primarily in networking tho, this time. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org	2021-12-29 08:48:14 -08:00
Alexander Lobakin	5ce6663158	ice: switch to napi_build_skb() napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. ice driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-28 09:42:33 -08:00
Jakub Kicinski	8b3f913322	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net include/net/sock.h commit `8f905c0e73` ("inet: fully convert sk->sk_rx_dst to RCU rules") commit `43f51df417` ("net: move early demux fields close to sk_refcnt") https://lore.kernel.org/all/20211222141641.0caa0ab3@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-23 16:09:58 -08:00
Jesse Brandeburg	0092db5fac	ice: trivial: fix odd indenting Fix an odd indent where some code was left indented, and causes smatch to warn: ice_log_pkg_init() warn: inconsistent indenting While here, for consistency, add a break after the default case. This commit has a Fixes: but we caught this while it was only in net-next. Fixes: `247dd97d71` ("ice: Refactor status flow for DDP load") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Link: https://lore.kernel.org/r/20211221230538.2546315-1-jesse.brandeburg@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-22 14:57:55 -08:00
Jacob Keller	13a64f0b98	ice: support crosstimestamping on E822 devices if supported E822 devices on supported platforms can generate a cross timestamp between the platform ART and the device time. This process allows for very precise measurement of the difference between the PTP hardware clock and the platform time. This is only supported if we know the TSC frequency relative to ART, so we do not enable this unless the boot CPU has a known TSC frequency (as required by convert_art_ns_to_tsc). Because PCIe PTM support is not available on all platforms, introduce CONFIG_ICE_HWTS and make it depend on X86 where we know the support exists. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-21 09:11:40 -08:00
Jacob Keller	a69f1cb62a	ice: exit bypass mode once hardware finishes timestamp calibration Once the E822 device has sent and received one packet, the hardware computes the internal delay of the PHY using a process known as Vernier calibration. This calibration calculates a more accurate offset for the Tx and Rx timestamps. To make use of this offset, we need to exit the bypass mode. This cannot be done until the PHY has completed offset calibration, as indicated by the offset valid bits. To handle this, introduce a kthread work item which will poll the offset valid bits every few milliseconds seeing if it is safe to exit bypass mode. Once we have finished calibrating the offsets, we can program the total Tx and Rx offset registers and turn off the bypass bit. This allows the hardware to include the more precise vernier calibration offset, and improves the timestamp precision. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-21 09:11:40 -08:00
Jacob Keller	b111ab5a11	ice: ensure the hardware Clock Generation Unit is configured The E822 device has a Clock Generation Unit (CGU) responsible for determining the clock frequency that drives the timers. Ensure this function is initialized when bringing up the PTP support, so that the clock has a known frequency. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-21 09:11:40 -08:00
Jacob Keller	3a7496234d	ice: implement basic E822 PTP support Implement support for the basic operations needed to enable the PTP hardware clock on E822 devices. This includes implementations for the various PHY access functions, as well as the ability to start and stop the PHY timers. This is different from the E810 device because the configuration depends on link speed, so we cannot just start the PHYs immediately. We must wait until the link is up to get proper values for the speed based initialization. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-21 09:11:37 -08:00
Jacob Keller	405efa49b5	ice: convert clk_freq capability into time_ref Convert the clk_freq value into the associated time_ref frequency value for E822 devices. This simplifies determining the time reference value for the clock. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-21 09:11:02 -08:00
Jacob Keller	b2ee72565c	ice: introduce ice_ptp_init_phc function When we enable support for E822 devices, there are some additional steps required to initialize the PTP hardware clock. To make this easier to implement as device-specific behavior, refactor the register setups in ice_ptp_init_owner to a new ice_ptp_init_phc function defined in ice_ptp_hw.c This function will have a common section, and an e810 specific sub-implementation. This will enable easily extending the functionality to cover the E822 specific setup required to initialize the hardware clock generation unit. It also makes it clear which steps are E810 specific vs which ones are necessary for all ice devices. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-21 09:11:02 -08:00
Jacob Keller	39b2810642	ice: use 'int err' instead of 'int status' in ice_ptp_hw.c The ice_ptp_hw.c file introduced a bunch of uses of "int status" instead of the more traditional "int err" or "int ret". These are actually traditional Linux error codes (as opposed to the recently removed ice_status enumeration values). We're about to add a bunch of new functions to ice_ptp_hw.c. It's normally preferred in the ice driver to use "int ret" or "int err" when dealing with error code values. Instead of making the new functions use "int status" lets just fix all of ice_ptp_hw.c to use "int err". This will match the new functions and ensures a consistent style across at least the PTP related files. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-21 09:11:02 -08:00
Jacob Keller	e59d75dd41	ice: PTP: move setting of tstamp_config The tstamp_config structure is being set inside of ice_ptp_cfg_timestamp, which is the function used to set Tx and Rx timestamping during initialization. This function is also used in order to set the PHY port timestamping status. However, it makes sense to always set the tstamp_config directly whenever the ice_set_tx_tstamp or ice_set_rx_tstamp functions are called. Move assignment of tstamp_config into the related functions and out of ice_ptp_cfg_timestamp. Now that we assign the timestamp mode in the relevant functions, we no longer modify the config value in ice_set_timestamp_mode. In turn, we no longer want to copy that config value into the PF cached structure. Instead, this is now the source of truth for actual configuration. On success of ice_set_timestamp_mode, copy the real configured mode back to report it out to userspace. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-21 09:11:02 -08:00
Jacob Keller	78267d0c9c	ice: introduce ice_base_incval function A future change will add additional possible increment values for the E822 device support. To handle this, we want to look up the increment value to use instead of hard coding it to the nominal value for E810 devices. Introduce ice_base_incval as a function to get the best nominal increment value to use. For now, it just returns the E810 value, but will be refactored in the future to look up the value based on the device type and configured clock frequency. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-21 09:11:02 -08:00
Karol Kolacinski	4809671015	ice: Fix E810 PTP reset flow The PF reset does not reset PHC and PHY clocks so it's unnecessary to stop them and reinitialize after the reset. Configuring timestamping changes the VSI fields so it needs to be performed after VSIs are initialized, which was not done in case of a reset. Suggested-by: Patrick Talbert <ptalbert@redhat.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Pasi Vaananen <pvaanane@redhat.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-21 09:09:57 -08:00
Maciej Fijalkowski	dcbaf72aa4	ice: xsk: fix cleaned_count setting Currently cleaned_count is initialized to ICE_DESC_UNUSED(rx_ring) and later on during the Rx processing it is incremented per each frame that driver consumed. This can result in excessive buffers requested from xsk pool based on that value. To address this, just drop cleaned_count and pass ICE_DESC_UNUSED(rx_ring) directly as a function argument to ice_alloc_rx_bufs_zc(). Idea is to ask for buffers as many as consumed. Let us also call ice_alloc_rx_bufs_zc unconditionally at the end of ice_clean_rx_irq_zc. This has been changed in that way for corresponding ice_clean_rx_irq, but not here. Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:18:21 -08:00
Maciej Fijalkowski	8bea15ab74	ice: xsk: allow empty Rx descriptors on XSK ZC data path Commit `ac6f733a7b` ("ice: allow empty Rx descriptors") stated that ice HW can produce empty descriptors that are valid and they should be processed. Add this support to xsk ZC path to avoid potential processing problems. Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:16:55 -08:00
Maciej Fijalkowski	8b51a13c37	ice: xsk: do not clear status_error0 for ntu + nb_buffs descriptor The descriptor that ntu is pointing at when we exit ice_alloc_rx_bufs_zc() should not have its corresponding DD bit cleared as descriptor is not allocated in there and it is not valid for HW usage. The allocation routine at the entry will fill the descriptor that ntu points to after it was set to ntu + nb_buffs on previous call. Even the spec says: "The tail pointer should be set to one descriptor beyond the last empty descriptor in host descriptor ring." Therefore, step away from clearing the status_error0 on ntu + nb_buffs descriptor. Fixes: `db804cfc21` ("ice: Use the xsk batched rx allocation interface") Reported-by: Elza Mathew <elza.mathew@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:15:15 -08:00
Alexander Lobakin	0708b6facb	ice: remove dead store on XSK hotpath The 'if (ntu == rx_ring->count)' block in ice_alloc_rx_buffers_zc() was previously residing in the loop, but after introducing the batched interface it is used only to wrap-around the NTU descriptor, thus no more need to assign 'xdp'. Fixes: `db804cfc21` ("ice: Use the xsk batched rx allocation interface") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:12:05 -08:00
Maciej Fijalkowski	617f3e1b58	ice: xsk: allocate separate memory for XDP SW ring Currently, the zero-copy data path is reusing the memory region that was initially allocated for an array of struct ice_rx_buf for its own purposes. This is error prone as it is based on the ice_rx_buf struct always being the same size or bigger than what the zero-copy path needs. There can also be old values present in that array giving rise to errors when the zero-copy path uses it. Fix this by freeing the ice_rx_buf region and allocating a new array for the zero-copy path that has the right length and is initialized to zero. Fixes: `57f7f8b6bc` ("ice: Use xdp_buf instead of rx_buf for xsk zero-copy") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:09:04 -08:00
Maciej Fijalkowski	afe8a3ba85	ice: xsk: return xsk buffers back to pool when cleaning the ring Currently we only NULL the xdp_buff pointer in the internal SW ring but we never give it back to the xsk buffer pool. This means that buffers can be leaked out of the buff pool and never be used again. Add missing xsk_buff_free() call to the routine that is supposed to clean the entries that are left in the ring so that these buffers in the umem can be used by other sockets. Also, only go through the space that is actually left to be cleaned instead of a whole ring. Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:07:04 -08:00
Jakub Kicinski	7cd2802d74	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-16 16:13:19 -08:00
Jesse Brandeburg	9c99d099f7	ice: use modern kernel API for kick The kernel gained a new interface for drivers to use to combine tail bump (doorbell) and BQL updates, attempt to use those new interfaces. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 08:49:25 -08:00
Jesse Brandeburg	21c6e36b1e	ice: tighter control over VSI_DOWN state The driver had comments to the effect of: This flag should be set before calling this function. While reviewing code it was found that there were several violations of this policy, which could introduce hard to find bugs or races. Fix the violations of the "VSI DOWN state must be set before calling ice_down" and make checking the state into code with a WARN_ON. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 08:48:26 -08:00
Jesse Brandeburg	cc14db11c8	ice: use prefetch methods The kernel provides some prefetch mechanisms to speed up commonly cold cache line accesses during receive processing. Since these are software structures it helps to have these strategically placed prefetches. Be careful to call BQL prefetch complete only for non XDP queues. Co-developed-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 08:46:28 -08:00
Jesse Brandeburg	1c96c16858	ice: update to newer kernel API Use the netif_tx_* API from netdevice.h which has simpler parameters. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 08:45:28 -08:00
Jacob Keller	399e27dbbd	ice: support immediate firmware activation via devlink reload The ice hardware contains an embedded chip with firmware which can be updated using devlink flash. The firmware which runs on this chip is referred to as the Embedded Management Processor firmware (EMP firmware). Activating the new firmware image currently requires that the system be rebooted. This is not ideal as rebooting the system can cause unwanted downtime. In practical terms, activating the firmware does not always require a full system reboot. In many cases it is possible to activate the EMP firmware immediately. There are a couple of different scenarios to cover. * The EMP firmware itself can be reloaded by issuing a special update to the device called an Embedded Management Processor reset (EMP reset). This reset causes the device to reset and reload the EMP firmware. * PCI configuration changes are only reloaded after a cold PCIe reset. Unfortunately there is no generic way to trigger this for a PCIe device without a system reboot. When performing a flash update, firmware is capable of responding with some information about the specific update requirements. The driver updates the flash by programming a secondary inactive bank with the contents of the new image, and then issuing a command to request to switch the active bank starting from the next load. The response to the final command for updating the inactive NVM flash bank includes an indication of the minimum reset required to fully update the device. This can be one of the following: * A full power on is required * A cold PCIe reset is required * An EMP reset is required The response to the command to switch flash banks includes an indication of whether or not the firmware will allow an EMP reset request. For most updates, an EMP reset is sufficient to load the new EMP firmware without issues. In some cases, this reset is not sufficient because the PCI configuration space has changed. When this could cause incompatibility with the new EMP image, the firmware is capable of rejecting the EMP reset request. Add logic to ice_fw_update.c to handle the response data flash update AdminQ commands. For the reset level, issue a devlink status notification informing the user of how to complete the update with a simple suggestion like "Activate new firmware by rebooting the system". Cache the status of whether or not firmware will restrict the EMP reset for use in implementing devlink reload. Implement support for devlink reload with the "fw_activate" flag. This allows user space to request the firmware be activated immediately. For the .reload_down handler, we will issue a request for the EMP reset using the appropriate firmware AdminQ command. If we know that the firmware will not allow an EMP reset, simply exit with a suitable netlink extended ACK message indicating that the EMP reset is not available. For the .reload_up handler, simply wait until the driver has finished resetting. Logic to handle processing of an EMP reset already exists in the driver as part of its reset and rebuild flows. Implement support for the devlink reload interface with the "fw_activate" action. This allows userspace to request activation of firmware without a reboot. Note that support for indicating the required reset and EMP reset restriction is not supported on old versions of firmware. The driver can determine if the two features are supported by checking the device capabilities report. I confirmed support has existed since at least version 5.5.2 as reported by the 'fw.mgmt' version. Support to issue the EMP reset request has existed in all version of the EMP firmware for the ice hardware. Check the device capabilities report to determine whether or not the indications are reported by the running firmware. If the reset requirement indication is not supported, always assume a full power on is necessary. If the reset restriction capability is not supported, always assume the EMP reset is available. Users can verify if the EMP reset has activated the firmware by using the devlink info report to check that the 'running' firmware version has updated. For example a user might do the following: # Check current version $ devlink dev info # Update the device $ devlink dev flash pci/0000:af:00.0 file firmware.bin # Confirm stored version updated $ devlink dev info # Reload to activate new firmware $ devlink dev reload pci/0000:af:00.0 action fw_activate # Confirm running version updated $ devlink dev info Finally, this change does not implement basic driver-only reload support. I did look into trying to do this. However, it requires significant refactor of how the ice driver probes and loads everything. The ice driver probe and allocation flows were not designed with such a reload in mind. Refactoring the flow to support this is beyond the scope of this change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 08:40:38 -08:00
Jacob Keller	af18d8866c	ice: reduce time to read Option ROM CIVD data During probe and device reset, the ice driver reads some data from the NVM image as part of ice_init_nvm. Part of this data includes a section of the Option ROM which contains version information. The function ice_get_orom_civd_data is used to locate the '$CIV' data section of the Option ROM. Timing of ice_probe and ice_rebuild indicate that the ice_get_orom_civd_data function takes about 10 seconds to finish executing. The function locates the section by scanning the Option ROM every 512 bytes. This requires a significant number of NVM read accesses, since the Option ROM bank is 500KB. In the worst case it would take about 1000 reads. Worse, all PFs serialize this operation during reload because of acquiring the NVM semaphore. The CIVD section is located at the end of the Option ROM image data. Unfortunately, the driver has no easy method to determine the offset manually. Practical experiments have shown that the data could be at a variety of locations, so simply reversing the scanning order is not sufficient to reduce the overall read time. Instead, copy the entire contents of the Option ROM into memory. This allows reading the data using 4Kb pages instead of 512 bytes at a time. This reduces the total number of firmware commands by a factor of 8. In addition, reading the whole section together at once allows better indication to firmware of when we're "done". Re-write ice_get_orom_civd_data to allocate virtual memory to store the Option ROM data. Copy the entire OptionROM contents at once using ice_read_flash_module. Finally, use this memory copy to scan for the '$CIV' section. This change significantly reduces the time to read the Option ROM CIVD section from ~10 seconds down to ~1 second. This has a significant impact on the total time to complete a driver rebuild or probe. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 08:38:14 -08:00
Jacob Keller	c9f7a483e4	ice: move ice_devlink_flash_update and merge with ice_flash_pldm_image The ice_devlink_flash_update function performs a few upfront checks and then calls ice_flash_pldm_image. Most if these checks make more sense in the context of code within ice_flash_pldm_image. Merge ice_devlink_flash_update and ice_flash_pldm_image into one function, placing it in ice_fw_update.c Since this is still the entry point for devlink, call the function ice_devlink_flash_update instead of ice_flash_pldm_image. This leaves a single function which handles the devlink parameters and then initiates a PLDM update. With this change, the ice_devlink_flash_update function in ice_fw_update.c becomes the main entry point for flash update. It elimintes some unnecessary boiler plate code between the two previous functions. The ultimate motivation for this is that it eases supporting a dry run with the PLDM library in a future change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 08:37:14 -08:00
Jacob Keller	c356eaa824	ice: move and rename ice_check_for_pending_update The ice_devlink_flash_update function performs a few checks and then calls ice_flash_pldm_image. One of these checks is to call ice_check_for_pending_update. This function checks if the device has a pending update, and cancels it if so. This is necessary to allow a new flash update to proceed. We want to refactor the ice code to eliminate ice_devlink_flash_update, moving its checks into ice_flash_pldm_image. To do this, ice_check_for_pending_update will become static, and only called by ice_flash_pldm_image. To make this change easier to review, first just move the function up within the ice_fw_update.c file. While at it, note that the function has a misleading name. Its primary action is to cancel a pending update. Using the verb "check" does not imply this. Rename it to ice_cancel_pending_update. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 08:36:10 -08:00
Jacob Keller	78ad87da99	ice: devlink: add shadow-ram region to snapshot Shadow RAM We have a region for reading the contents of the NVM flash as a snapshot. This region does not allow reading the Shadow RAM, as it always passes the FLASH_ONLY bit to the low level firmware interface. Add a separate shadow-ram region which will allow snapshot of the current contents of the Shadow RAM. This data is built from the NVM contents but is distinct as the device builds up the Shadow RAM during initialization, so being able to snapshot its contents can be useful when attempting to debug flash related issues. Fix the comment description of the nvm-flash region which incorrectly stated that it filled the shadow-ram region, and add a comment explaining that the nvm-flash region does not actually read the Shadow RAM. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 08:34:54 -08:00
Karol Kolacinski	37e738b6fd	ice: Don't put stale timestamps in the skb The driver has to check if it does not accidentally put the timestamp in the SKB before previous timestamp gets overwritten. Timestamp values in the PHY are read only and do not get cleared except at hardware reset or when a new timestamp value is captured. The cached_tstamp field is used to detect the case where a new timestamp has not yet been captured, ensuring that we avoid sending stale timestamp data to the stack. Fixes: `ea9b847cda` ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-14 11:31:47 -08:00
Karol Kolacinski	0013881c11	ice: Use div64_u64 instead of div_u64 in adjfine Change the division in ice_ptp_adjfine from div_u64 to div64_u64. div_u64 is used when the divisor is 32 bit but in this case incval is 64 bit and it caused incorrect calculations and incval adjustments. Fixes: `06c16d89d2` ("ice: register 1588 PTP clock device object for E810 devices") Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-14 11:29:34 -08:00
Tony Nguyen	f8a3bcceb4	ice: Remove unused ICE_FLOW_SEG_HDRS_L2_MASK Remove the unused define ICE_FLOW_SEG_HDRS_L2_MASK. Reported-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Gurucharan G <gurucharanx.g@intel.com>	2021-12-14 10:19:14 -08:00
Dan Carpenter	e53a80835f	ice: Remove unnecessary casts The "bitmap" variable is already an unsigned long so there is no need for this cast. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-14 10:19:14 -08:00
Tony Nguyen	c14846914e	ice: Propagate error codes As all functions now return standard error codes, propagate the values being returned instead of converting them to generic values. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com>	2021-12-14 10:19:14 -08:00
Tony Nguyen	2ccc1c1ccc	ice: Remove excess error variables ice_status previously had a variable to contain these values where other error codes had a variable as well. With ice_status now being an int, there is no need for two variables to hold error values. In cases where this occurs, remove one of the excess variables and use a single one. Some initialization of variables are no longer needed and have been removed. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com>	2021-12-14 10:19:13 -08:00
Tony Nguyen	5518ac2a64	ice: Cleanup after ice_status removal Clean up code after changing ice_status to int. Rearrange to fix reverse Christmas tree and pull lines up where applicable. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com>	2021-12-14 10:19:13 -08:00
Tony Nguyen	d54699e27d	ice: Remove enum ice_status Replace uses of ice_status to, as equivalent as possible, error codes. Remove enum ice_status and its helper conversion function as they are no longer needed. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com>	2021-12-14 10:19:13 -08:00
Tony Nguyen	5e24d5984c	ice: Use int for ice_status To prepare for removal of ice_status, change the variables from ice_status to int. This eases the transition when values are changed to return standard int error codes over enum ice_status. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com>	2021-12-14 10:19:13 -08:00
Tony Nguyen	5f87ec4861	ice: Remove string printing for ice_status Remove the ice_stat_str() function which prints the string representation of the ice_status error code. With upcoming changes moving away from ice_status, there will be no need for this function. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com>	2021-12-14 10:19:13 -08:00
Wojciech Drewek	247dd97d71	ice: Refactor status flow for DDP load Before this change, final state of the DDP pkg load process was dependent on many variables such as: ice_status, pkg version, ice_aq_err. The last one had be stored in hw->pkg_dwnld_status. It was impossible to conclude this state just from ice_status, that's why logging process of DDP pkg load in the caller was a little bit complicated. With this patch new status enum is introduced - ice_ddp_state. It covers all the possible final states of the loading process. What's tricky for ice_ddp_state is that not only ICE_DDP_PKG_SUCCESS(=0) means that load was successful. Actually three states mean that: - ICE_DDP_PKG_SUCCESS - ICE_DDP_PKG_SAME_VERSION_ALREADY_LOADED - ICE_DDP_PKG_COMPATIBLE_ALREADY_LOADED ice_is_init_pkg_successful can tell that information. One ddp_state should not be used outside of ice_init_pkg which is ICE_DDP_PKG_ALREADY_LOADED. It is more generic, it is used in ice_dwnld_cfg_bufs to see if pkg is already loaded. At this point we can't use one of the specific one (SAME_VERSION, COMPATIBLE, NOT_SUPPORTED) because we don't have information on the package currently loaded in HW (we are before calling ice_get_pkg_info). We can get rid of hw->pkg_dwnld_status because we are immediately mapping aq errors to ice_ddp_state in ice_dwnld_cfg_bufs. Other errors like ICE_ERR_NO_MEMORY, ICE_ERR_PARAM are mapped the generic ICE_DDP_PKG_ERR. Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-14 10:19:13 -08:00
Brett Creeley	fabf480bf9	ice: Refactor promiscuous functions Some of the promiscuous mode functions take a boolean to indicate set/clear, which affects readability. Refactor and provide an interface for the promiscuous mode code with explicit set and clear promiscuous mode operations. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-14 10:18:52 -08:00
Jeff Guo	60f44fe4cd	ice: refactor PTYPE validating Since the capability of a PTYPE within a specific package could be negotiated by checking the HW bit map, it means that there's no need to maintain a different PTYPE list for each type of the package when parsing PTYPE. So refactor the PTYPE validating mechanism. Signed-off-by: Jeff Guo <jia.guo@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-14 08:06:47 -08:00
Haiyue Wang	8818b95409	ice: Add package PTYPE enable information Scan the 'Marker Ptype TCAM' section to retrieve the Rx parser PTYPE enable information from the current package. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-14 08:06:47 -08:00
Hangbin Liu	9c9211a3fc	net_tstamp: add new flag HWTSTAMP_FLAG_BONDED_PHC_INDEX Since commit `94dd016ae5` ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to active device") the user could get bond active interface's PHC index directly. But when there is a failover, the bond active interface will change, thus the PHC index is also changed. This may break the user's program if they did not update the PHC timely. This patch adds a new hwtstamp_config flag HWTSTAMP_FLAG_BONDED_PHC_INDEX. When the user wants to get the bond active interface's PHC, they need to add this flag and be aware the PHC index may be changed. With the new flag. All flag checks in current drivers are removed. Only the checking in net_hwtstamp_validate() is kept. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-14 12:28:24 +00:00
Paolo Abeni	c8064e5b4a	bpf: Let bpf_warn_invalid_xdp_action() report more info In non trivial scenarios, the action id alone is not sufficient to identify the program causing the warning. Before the previous patch, the generated stack-trace pointed out at least the involved device driver. Let's additionally include the program name and id, and the relevant device name. If the user needs additional infos, he can fetch them via a kernel probe, leveraging the arguments added here. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/ddb96bb975cbfddb1546cf5da60e77d5100b533c.1638189075.git.pabeni@redhat.com	2021-12-13 22:28:27 +01:00
Jakub Kicinski	3150a73366	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-09 13:23:02 -08:00
Jesse Brandeburg	1a0f25a52e	ice: safer stats processing The driver was zeroing live stats that could be fetched by ndo_get_stats64 at any time. This could result in inconsistent statistics, and the telltale sign was when reading stats frequently from /proc/net/dev, the stats would go backwards. Fix by collecting stats into a local, and delaying when we write to the structure so it's not incremental. Fixes: `fcea6f3da5` ("ice: Add stats and ethtool support") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-08 10:37:02 -08:00
Michal Swiatkowski	de6acd1cdd	ice: fix adding different tunnels Adding filters with the same values inside for VXLAN and Geneve causes HW error, because it looks exactly the same. To choose between different type of tunnels new recipe is needed. Add storing tunnel types in creating recipes function and start checking it in finding function. Change getting open tunnels function to return port on correct tunnel type. This is needed to copy correct port to dummy packet. Block user from adding enc_dst_port via tc flower, because VXLAN and Geneve filters can be created only with destination port which was previously opened. Fixes: `8b032a55c1` ("ice: low level support for tunnels") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-07 13:21:01 -08:00
Michal Swiatkowski	0e32ff0240	ice: fix choosing UDP header type In tunnels packet there can be two UDP headers: - outer which for hw should be mark as ICE_UDP_OF - inner which for hw should be mark as ICE_UDP_ILOS or as ICE_TCP_IL if inner header is of TCP type In none tunnels packet header can be: - UDP, which for hw should be mark as ICE_UDP_ILOS - TCP, which for hw should be mark as ICE_TCP_IL Change incorrect ICE_UDP_OF for none tunnel packets to ICE_UDP_ILOS. ICE_UDP_OF is incorrect for none tunnel packets and setting it leads to error from hw while adding this kind of recipe. In summary, for tunnel outer port type should always be set to ICE_UDP_OF, for none tunnel outer and tunnel inner it should always be set to ICE_UDP_ILOS. Fixes: `9e300987d4` ("ice: VXLAN and Geneve TC support") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-07 13:21:01 -08:00
Jesse Brandeburg	28dc1b86f8	ice: ignore dropped packets during init If the hardware is constantly receiving unicast or broadcast packets during driver load, the device previously counted many GLV_RDPC (VSI dropped packets) events during init. This causes confusing dropped packet statistics during driver load. The dropped packets counter incrementing does stop once the driver finishes loading. Avoid this problem by baselining our statistics at the end of driver open instead of the end of probe. Fixes: `cdedef59de` ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-07 13:21:01 -08:00
Dave Ertman	6d39ea19b0	ice: Fix problems with DSCP QoS implementation The patch that implemented DSCP QoS implementation removed a bandwidth check that was used to check for a specific condition caused by some corner cases. This check should not of been removed. The same patch also added a check for when the DCBx state could be changed in relation to DSCP, but the check was erroneously added nested in a check for CEE mode, which made the check useless. Fix these problems by re-adding the bandwidth check and relocating the DSCP mode check earlier in the function that changes DCBx state in the driver. Fixes: `2a87bd73e5` ("ice: Add DSCP support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-07 13:21:01 -08:00
Paul Greenwalt	2657e16d8c	ice: rearm other interrupt cause register after enabling VFs The other interrupt cause register (OICR), global interrupt 0, is disabled when enabling VFs to prevent handling VFLR. If the OICR is not rearmed then the VF cannot communicate with the PF. Rearm the OICR after enabling VFs. Fixes: `916c7fdf5e` ("ice: Separate VF VSI initialization/creation from reset flow") Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-07 13:21:01 -08:00
Yahui Cao	f23ab04dd6	ice: fix FDIR init missing when reset VF When VF is being reset, ice_reset_vf() will be called and FDIR resource should be released and initialized again. Fixes: `1f7ea1cd6a` ("ice: Enable FDIR Configure for AVF") Signed-off-by: Yahui Cao <yahui.cao@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-07 13:21:00 -08:00
Jakub Kicinski	fc993be36f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-02 11:44:56 -08:00
Shiraz Saleem	244714da8d	net/ice: Remove unused enum Remove ice_devlink_param_id enum as its not used. Suggested-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-11-30 08:02:12 -08:00
Shiraz Saleem	7b62483f64	net/ice: Fix boolean assignment vbool in ice_devlink_enable_roce_get can be assigned to a non-0/1 constant. Fix this assignment of vbool to be 0/1. Fixes: `e523af4ee5` ("net/ice: Add support for enable_iwarp and enable_roce devlink param") Suggested-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-11-30 08:02:12 -08:00
Maciej Fijalkowski	d1ec975f9f	ice: xsk: clear status_error0 for each allocated desc Fix a bug in which the receiving of packets can stop in the zero-copy driver. Ice HW ignores 3 lower bits from QRX_TAIL register, which means that tail is bumped only on intervals of 8. Currently with XSK RX batching in place, ice_alloc_rx_bufs_zc() clears the status_error0 only of the last descriptor that has been allocated/taken from the XSK buffer pool. status_error0 includes DD bit that is looked upon by the ice_clean_rx_irq_zc() to tell if a descriptor can be processed. The bug can be triggered when driver updates the ntu but not the QRX_TAIL, so HW wouldn't have a chance to write to the ready descriptors. Later on driver moves the ntc to the mentioned set of descriptors and interprets them as a ready to be processed, since corresponding DD bits were not cleared nor any writeback has happened that would clear it. This can then lead to ntc == ntu case which means that ring is empty and no further packet processing. Fix the XSK traffic hang that can be observed when l2fwd scenario from xdpsock is used by making sure that status_error0 is cleared for each descriptor that is fed to HW and therefore we are sure that driver will not processed non-valid DD bits. This will also prevent the driver from processing the descriptors that were allocated in favor of the previously processed ones, but writeback didn't happen yet. Fixes: `db804cfc21` ("ice: Use the xsk batched rx allocation interface") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-30 12:09:33 +00:00
Jakub Kicinski	93d5404e89	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/ipa/ipa_main.c `8afc7e471a` ("net: ipa: separate disabling setup from modem stop") `76b5fbcd6b` ("net: ipa: kill ipa_modem_init()") Duplicated include, drop one. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-11-26 13:45:19 -08:00
Shiraz Saleem	e523af4ee5	net/ice: Add support for enable_iwarp and enable_roce devlink param Allow support for 'enable_iwarp' and 'enable_roce' devlink params to turn on/off iWARP or RoCE protocol support for E800 devices. For example, a user can turn on iWARP functionality with, devlink dev param set pci/0000:07:00.0 name enable_iwarp value true cmode runtime This add an iWARP auxiliary rdma device, ice.iwarp.<>, under this PF. A user request to enable both iWARP and RoCE under the same PF is rejected since this device does not support both protocols simultaneously on the same port. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-11-22 08:41:56 -08:00
Marta Plantykow	f65ee535df	ice: avoid bpf_prog refcount underflow Ice driver has the routines for managing XDP resources that are shared between ndo_bpf op and VSI rebuild flow. The latter takes place for example when user changes queue count on an interface via ethtool's set_channels(). There is an issue around the bpf_prog refcounting when VSI is being rebuilt - since ice_prepare_xdp_rings() is called with vsi->xdp_prog as an argument that is used later on by ice_vsi_assign_bpf_prog(), same bpf_prog pointers are swapped with each other. Then it is also interpreted as an 'old_prog' which in turn causes us to call bpf_prog_put on it that will decrement its refcount. Below splat can be interpreted in a way that due to zero refcount of a bpf_prog it is wiped out from the system while kernel still tries to refer to it: [ 481.069429] BUG: unable to handle page fault for address: ffffc9000640f038 [ 481.077390] #PF: supervisor read access in kernel mode [ 481.083335] #PF: error_code(0x0000) - not-present page [ 481.089276] PGD 100000067 P4D 100000067 PUD 1001cb067 PMD 106d2b067 PTE 0 [ 481.097141] Oops: 0000 [#1] PREEMPT SMP PTI [ 481.101980] CPU: 12 PID: 3339 Comm: sudo Tainted: G OE 5.15.0-rc5+ #1 [ 481.110840] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016 [ 481.122021] RIP: 0010:dev_xdp_prog_id+0x25/0x40 [ 481.127265] Code: 80 00 00 00 00 0f 1f 44 00 00 89 f6 48 c1 e6 04 48 01 fe 48 8b 86 98 08 00 00 48 85 c0 74 13 48 8b 50 18 31 c0 48 85 d2 74 07 <48> 8b 42 38 8b 40 20 c3 48 8b 96 90 08 00 00 eb e8 66 2e 0f 1f 84 [ 481.148991] RSP: 0018:ffffc90007b63868 EFLAGS: 00010286 [ 481.155034] RAX: 0000000000000000 RBX: ffff889080824000 RCX: 0000000000000000 [ 481.163278] RDX: ffffc9000640f000 RSI: ffff889080824010 RDI: ffff889080824000 [ 481.171527] RBP: ffff888107af7d00 R08: 0000000000000000 R09: ffff88810db5f6e0 [ 481.179776] R10: 0000000000000000 R11: ffff8890885b9988 R12: ffff88810db5f4bc [ 481.188026] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 481.196276] FS: 00007f5466d5bec0(0000) GS:ffff88903fb00000(0000) knlGS:0000000000000000 [ 481.205633] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 481.212279] CR2: ffffc9000640f038 CR3: 000000014429c006 CR4: 00000000003706e0 [ 481.220530] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 481.228771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 481.237029] Call Trace: [ 481.239856] rtnl_fill_ifinfo+0x768/0x12e0 [ 481.244602] rtnl_dump_ifinfo+0x525/0x650 [ 481.249246] ? __alloc_skb+0xa5/0x280 [ 481.253484] netlink_dump+0x168/0x3c0 [ 481.257725] netlink_recvmsg+0x21e/0x3e0 [ 481.262263] ____sys_recvmsg+0x87/0x170 [ 481.266707] ? __might_fault+0x20/0x30 [ 481.271046] ? _copy_from_user+0x66/0xa0 [ 481.275591] ? iovec_from_user+0xf6/0x1c0 [ 481.280226] ___sys_recvmsg+0x82/0x100 [ 481.284566] ? sock_sendmsg+0x5e/0x60 [ 481.288791] ? __sys_sendto+0xee/0x150 [ 481.293129] __sys_recvmsg+0x56/0xa0 [ 481.297267] do_syscall_64+0x3b/0xc0 [ 481.301395] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 481.307238] RIP: 0033:0x7f5466f39617 [ 481.311373] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb bd 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2f 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 [ 481.342944] RSP: 002b:00007ffedc7f4308 EFLAGS: 00000246 ORIG_RAX: 000000000000002f [ 481.361783] RAX: ffffffffffffffda RBX: 00007ffedc7f5460 RCX: 00007f5466f39617 [ 481.380278] RDX: 0000000000000000 RSI: 00007ffedc7f5360 RDI: 0000000000000003 [ 481.398500] RBP: 00007ffedc7f53f0 R08: 0000000000000000 R09: 000055d556f04d50 [ 481.416463] R10: 0000000000000077 R11: 0000000000000246 R12: 00007ffedc7f5360 [ 481.434131] R13: 00007ffedc7f5350 R14: 00007ffedc7f5344 R15: 0000000000000e98 [ 481.451520] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mxm_wmi mei_me coretemp mei ipmi_si ipmi_msghandler wmi acpi_pad acpi_power_meter ip_tables x_tables autofs4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel ahci crypto_simd cryptd libahci lpc_ich [last unloaded: ice] [ 481.528558] CR2: ffffc9000640f038 [ 481.542041] ---[ end trace d1f24c9ecf5b61c1 ]--- Fix this by only calling ice_vsi_assign_bpf_prog() inside ice_prepare_xdp_rings() when current vsi->xdp_prog pointer is NULL. This way set_channels() flow will not attempt to swap the vsi->xdp_prog pointers with itself. Also, sprinkle around some comments that provide a reasoning about correlation between driver and kernel in terms of bpf_prog refcount. Fixes: `efc2214b60` ("ice: Add support for XDP") Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com> Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-11-22 08:35:36 -08:00
Maciej Fijalkowski	792b208658	ice: fix vsi->txq_map sizing The approach of having XDP queue per CPU regardless of user's setting exposed a hidden bug that could occur in case when Rx queue count differ from Tx queue count. Currently vsi->txq_map's size is equal to the doubled vsi->alloc_txq, which is not correct due to the fact that XDP rings were previously based on the Rx queue count. Below splat can be seen when ethtool -L is used and XDP rings are configured: [ 682.875339] BUG: kernel NULL pointer dereference, address: 000000000000000f [ 682.883403] #PF: supervisor read access in kernel mode [ 682.889345] #PF: error_code(0x0000) - not-present page [ 682.895289] PGD 0 P4D 0 [ 682.898218] Oops: 0000 [#1] PREEMPT SMP PTI [ 682.903055] CPU: 42 PID: 2878 Comm: ethtool Tainted: G OE 5.15.0-rc5+ #1 [ 682.912214] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016 [ 682.923380] RIP: 0010:devres_remove+0x44/0x130 [ 682.928527] Code: 49 89 f4 55 48 89 fd 4c 89 ff 53 48 83 ec 10 e8 92 b9 49 00 48 8b 9d a8 02 00 00 48 8d 8d a0 02 00 00 49 89 c2 48 39 cb 74 0f <4c> 3b 63 10 74 25 48 8b 5b 08 48 39 cb 75 f1 4c 89 ff 4c 89 d6 e8 [ 682.950237] RSP: 0018:ffffc90006a679f0 EFLAGS: 00010002 [ 682.956285] RAX: 0000000000000286 RBX: ffffffffffffffff RCX: ffff88908343a370 [ 682.964538] RDX: 0000000000000001 RSI: ffffffff81690d60 RDI: 0000000000000000 [ 682.972789] RBP: ffff88908343a0d0 R08: 0000000000000000 R09: 0000000000000000 [ 682.981040] R10: 0000000000000286 R11: 3fffffffffffffff R12: ffffffff81690d60 [ 682.989282] R13: ffffffff81690a00 R14: ffff8890819807a8 R15: ffff88908343a36c [ 682.997535] FS: 00007f08c7bfa740(0000) GS:ffff88a03fd00000(0000) knlGS:0000000000000000 [ 683.006910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 683.013557] CR2: 000000000000000f CR3: 0000001080a66003 CR4: 00000000003706e0 [ 683.021819] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 683.030075] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 683.038336] Call Trace: [ 683.041167] devm_kfree+0x33/0x50 [ 683.045004] ice_vsi_free_arrays+0x5e/0xc0 [ice] [ 683.050380] ice_vsi_rebuild+0x4c8/0x750 [ice] [ 683.055543] ice_vsi_recfg_qs+0x9a/0x110 [ice] [ 683.060697] ice_set_channels+0x14f/0x290 [ice] [ 683.065962] ethnl_set_channels+0x333/0x3f0 [ 683.070807] genl_family_rcv_msg_doit+0xea/0x150 [ 683.076152] genl_rcv_msg+0xde/0x1d0 [ 683.080289] ? channels_prepare_data+0x60/0x60 [ 683.085432] ? genl_get_cmd+0xd0/0xd0 [ 683.089667] netlink_rcv_skb+0x50/0xf0 [ 683.094006] genl_rcv+0x24/0x40 [ 683.097638] netlink_unicast+0x239/0x340 [ 683.102177] netlink_sendmsg+0x22e/0x470 [ 683.106717] sock_sendmsg+0x5e/0x60 [ 683.110756] __sys_sendto+0xee/0x150 [ 683.114894] ? handle_mm_fault+0xd0/0x2a0 [ 683.119535] ? do_user_addr_fault+0x1f3/0x690 [ 683.134173] __x64_sys_sendto+0x25/0x30 [ 683.148231] do_syscall_64+0x3b/0xc0 [ 683.161992] entry_SYSCALL_64_after_hwframe+0x44/0xae Fix this by taking into account the value that num_possible_cpus() yields in addition to vsi->alloc_txq instead of doubling the latter. Fixes: `efc2214b60` ("ice: Add support for XDP") Fixes: `22bf877e52` ("ice: introduce XDP_TX fallback path") Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-11-22 08:33:54 -08:00
Hao Chen	7462494408	ethtool: extend ringparam setting/getting API with rx_buf_len Add two new parameters kernel_ringparam and extack for .get_ringparam and .set_ringparam to extend more ring params through netlink. Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-22 12:31:49 +00:00
Brett Creeley	e6ba5273d4	ice: Fix race conditions between virtchnl handling and VF ndo ops The VF can be configured via the PF's ndo ops at the same time the PF is receiving/handling virtchnl messages. This has many issues, with one of them being the ndo op could be actively resetting a VF (i.e. resetting it to the default state and deleting/re-adding the VF's VSI) while a virtchnl message is being handled. The following error was seen because a VF ndo op was used to change a VF's trust setting while the VIRTCHNL_OP_CONFIG_VSI_QUEUES was ongoing: [35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM [35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5 [35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6 Fix this by making sure the virtchnl handling and VF ndo ops that trigger VF resets cannot run concurrently. This is done by adding a struct mutex cfg_lock to each VF structure. For VF ndo ops, the mutex will be locked around the critical operations and VFR. Since the ndo ops will trigger a VFR, the virtchnl thread will use mutex_trylock(). This is done because if any other thread (i.e. VF ndo op) has the mutex, then that means the current VF message being handled is no longer valid, so just ignore it. This issue can be seen using the following commands: for i in {0..50}; do rmmod ice modprobe ice sleep 1 echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs ip link set ens785f1 vf 0 trust on ip link set ens785f0 vf 0 trust on sleep 2 echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs sleep 1 echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs ip link set ens785f1 vf 0 trust on ip link set ens785f0 vf 0 trust on done Fixes: `7c710869d6` ("ice: Add handlers for VF netdevice operations") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-11-03 08:16:32 -07:00
Brett Creeley	b385cca473	ice: Fix not stopping Tx queues for VFs When a VF is removed and/or reset its Tx queues need to be stopped from the PF. This is done by calling the ice_dis_vf_qs() function, which calls ice_vsi_stop_lan_tx_rings(). Currently ice_dis_vf_qs() is protected by the VF state bit ICE_VF_STATE_QS_ENA. Unfortunately, this is causing the Tx queues to not be disabled in some cases and when the VF tries to re-enable/reconfigure its Tx queues over virtchnl the op is failing. This is because a VF can be reset and/or removed before the ICE_VF_STATE_QS_ENA bit is set, but the Tx queues were already configured via ice_vsi_cfg_single_txq() in the VIRTCHNL_OP_CONFIG_VSI_QUEUES op. However, the ICE_VF_STATE_QS_ENA bit is set on a successful VIRTCHNL_OP_ENABLE_QUEUES, which will always happen after the VIRTCHNL_OP_CONFIG_VSI_QUEUES op. This was causing the following error message when loading the ice driver, creating VFs, and modifying VF trust in an endless loop: [35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM [35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5 [35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6 Fix this by always calling ice_dis_vf_qs() and silencing the error message in ice_vsi_stop_tx_ring() since the calling code ignores the return anyway. Also, all other places that call ice_vsi_stop_tx_ring() catch the error, so this doesn't affect those flows since there was no change to the values the function returns. Other solutions were considered (i.e. tracking which VF queues had been "started/configured" in VIRTCHNL_OP_CONFIG_VSI_QUEUES, but it seemed more complicated than it was worth. This solution also brings in the chance for other unexpected conditions due to invalid state bit checks. So, the proposed solution seemed like the best option since there is no harm in failing to stop Tx queues that were never started. This issue can be seen using the following commands: for i in {0..50}; do rmmod ice modprobe ice sleep 1 echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs ip link set ens785f1 vf 0 trust on ip link set ens785f0 vf 0 trust on sleep 2 echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs sleep 1 echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs ip link set ens785f1 vf 0 trust on ip link set ens785f0 vf 0 trust on done Fixes: `77ca27c417` ("ice: add support for virtchnl_queue_select.[tx\|rx]_queues bitmap") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-11-03 08:16:08 -07:00
Sylwester Dziedziuch	ce572a5b88	ice: Fix replacing VF hardware MAC to existing MAC filter VF was not able to change its hardware MAC address in case the new address was already present in the MAC filter list. Change the handling of VF add mac request to not return if requested MAC address is already present on the list and check if its hardware MAC needs to be updated in this case. Fixes: `ed4c068d46` ("ice: Enable ip link show on the PF to display VF unicast MAC(s)") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-11-03 08:16:08 -07:00
Brett Creeley	0299faeaf8	ice: Remove toggling of antispoof for VF trusted promiscuous mode Currently when a trusted VF enables promiscuous mode spoofchk will be disabled. This is wrong and should only be modified from the ndo_set_vf_spoofchk callback. Fix this by removing the call to toggle spoofchk for trusted VFs. Fixes: `01b5e89aab` ("ice: Add VF promiscuous support") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-11-03 08:16:06 -07:00
Brett Creeley	1a8c7778bc	ice: Fix VF true promiscuous mode When a VF requests promiscuous mode and it's trusted and true promiscuous mode is enabled the PF driver attempts to enable unicast and/or multicast promiscuous mode filters based on the request. This is fine, but there are a couple issues with the current code. [1] The define to configure the unicast promiscuous mode mask also includes bits to configure the multicast promiscuous mode mask, which causes multicast to be set/cleared unintentionally. [2] All 4 cases for enable/disable unicast/multicast mode are not handled in the promiscuous mode message handler, which causes unexpected results regarding the current promiscuous mode settings. To fix [1] make sure any promiscuous mask defines include the correct bits for each of the promiscuous modes. To fix [2] make sure that all 4 cases are handled since there are 2 bits (FLAG_VF_UNICAST_PROMISC and FLAG_VF_MULTICAST_PROMISC) that can be either set or cleared. Also, since either unicast and/or multicast promiscuous configuration can fail, introduce two separate error values to handle each of these cases. Fixes: `01b5e89aab` ("ice: Add VF promiscuous support") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-11-03 08:15:27 -07:00
David S. Miller	ebed1cf5b8	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2021-10-29 This series contains updates to ice and iavf drivers and virtchnl header file. Brett removes vlan_promisc argument from a function call for ice driver. In the virtchnl header file he removes an unused, reserved define and converts raw value defines to instead use the BIT macro. Marcin adds syncing of MAC addresses when creating switchdev VFs to remove error messages on link up and stops showing buffer information for port representors to remove duplicated entries being displayed for ice driver. Karen introduces a helper to go from pci_dev to iavf_adapter in the iavf driver. Przemyslaw fixes an issue where iavf was attempting to free IRQs before calling disable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-01 13:05:20 +00:00
Marcin Szycik	bfaaba99e6	ice: Hide bus-info in ethtool for PRs in switchdev mode Disable showing bus-info information for port representors in switchdev mode. This fixes a bug that caused displaying wrong netdev descriptions in lshw tool - one port representor displayed PF branding string, and in turn one PF displayed a "generic" description. The bug occurs when many devices show the same bus-info in ethtool, which was the case in switchdev mode (PF and its port representors displayed the same bus-info). The bug occurs only if a port representor netdev appears before PF netdev in /proc/net/dev. In the examples below: ens6fX is PF ens6fXvY is VF ethX is port representor One irrelevant column was removed from output Before: $ sudo lshw -c net -businfo Bus info Device Description ========================================= pci@0000:02:00.0 eth102 Ethernet Controller E810-XXV for SFP pci@0000:02:00.1 ens6f1 Ethernet Controller E810-XXV for SFP pci@0000:02:01.0 ens6f0v0 Ethernet Adaptive Virtual Function pci@0000:02:01.1 ens6f0v1 Ethernet Adaptive Virtual Function pci@0000:02:01.2 ens6f0v2 Ethernet Adaptive Virtual Function pci@0000:02:00.0 ens6f0 Ethernet interface Notice that eth102 and ens6f0 have the same bus-info and their descriptions are swapped. After: $ sudo lshw -c net -businfo Bus info Device Description ========================================= pci@0000:02:00.0 ens6f0 Ethernet Controller E810-XXV for SFP pci@0000:02:00.1 ens6f1 Ethernet Controller E810-XXV for SFP pci@0000:02:01.0 ens6f0v0 Ethernet Adaptive Virtual Function pci@0000:02:01.1 ens6f0v1 Ethernet Adaptive Virtual Function pci@0000:02:01.2 ens6f0v2 Ethernet Adaptive Virtual Function Fixes: `7aae80cef7` ("ice: add port representor ethtool ops and stats") Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-29 11:43:15 -07:00
Marcin Szycik	c79bb28e19	ice: Clear synchronized addrs when adding VFs in switchdev mode When spawning VFs in switchdev mode, internal filter list of VSIs is cleared, which includes MAC rules. However MAC entries stay on netdev's multicast list, which causes error message when bringing link up after spawning VFs ("Failed to delete MAC filters"). __dev_mc_sync() is called and tries to unsync addresses that were already removed internally when adding VFs. This can be reproduced with: 1) Load ice driver 2) Change PF to switchdev mode 3) Bring PF link up 4) Bring PF link down 5) Create a VF on PF 6) Bring PF link up Added clearing of netdev's multicast (and also unicast) list when spawning VFs in switchdev mode, so the state of internal rule list and netdev's MAC list is consistent. Fixes: `1a1c40df2e` ("ice: set and release switchdev environment") Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-29 10:56:24 -07:00
Brett Creeley	29e71f41e7	ice: Remove boolean vlan_promisc flag from function Currently, the vlan_promisc flag is used exclusively by VF VSI to determine whether or not to toggle VLAN pruning along with trusted/true-promiscuous mode. This is not needed for a couple of reasons. First, trusted/true-promiscuous mode is only supposed to allow all MAC filters within VLANs that a VF has added filters for, so VLAN pruning should not be disabled. Second, the boolean argument makes the function confusing and unintuitive. Remove this flag. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-29 10:48:16 -07:00
Yang Li	3c6f3ae3bb	intel: Simplify bool conversion Fix the following coccicheck warning: ./drivers/net/ethernet/intel/i40e/i40e_xsk.c:229:35-40: WARNING: conversion to bool not needed here ./drivers/net/ethernet/intel/ice/ice_xsk.c:399:35-40: WARNING: conversion to bool not needed here Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-29 09:42:33 -07:00
David S. Miller	704bc986ff	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2021-10-28 This series contains updates to ice driver only. Michal adds support for eswitch drop and redirect filters from and to tunnel devices. From meaning from uplink to VF and to means from VF to uplink. This is accomplished by adding support for indirect TC tunnel notifications and adding appropriate training packets and match fields for UDP tunnel headers. He also adds returning virtchannel responses for blocked operations as returning a response is still needed. Marcin sets netdev min and max MTU values on port representors to allow for MTU changes over default values. Brett adds detecting and reporting of PHY firmware load issues for devices which support this. Nathan Chancellor fixes a clang warning for implicit fallthrough. Wang Hai fixes a return value for failed allocation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-29 12:28:11 +01:00
Wang Hai	c8e51a0122	ice: fix error return code in ice_get_recp_frm_fw() Return error code if devm_kmemdup() fails in ice_get_recp_frm_fw() Fixes: `fd2a6b71e3` ("ice: create advanced switch recipe") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-28 11:00:21 -07:00
Nathan Chancellor	370764e60b	ice: Fix clang -Wimplicit-fallthrough in ice_pull_qvec_from_rc() Clang warns: drivers/net/ethernet/intel/ice/ice_lib.c:1906:2: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough] default: ^ drivers/net/ethernet/intel/ice/ice_lib.c:1906:2: note: insert 'break;' to avoid fall-through default: ^ break; 1 error generated. Clang is a little more pedantic than GCC, which does not warn when falling through to a case that is just break or return. Clang's version is more in line with the kernel's own stance in deprecated.rst, which states that all switch/case blocks must end in either break, fallthrough, continue, goto, or return. Add the missing break to silence the warning. Link: https://github.com/ClangBuiltLinux/linux/issues/1482 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-28 11:00:20 -07:00
Brett Creeley	99d407524c	ice: Add support to print error on PHY FW load failure Some devices have support for loading the PHY FW and in some cases this can fail. When this fails, the FW will set the corresponding bit in the link info structure. Also, the FW will send a link event if the correct link event mask bit is set. Add support for printing an error message when the PHY FW load fails during any link configuration flow and the link event flow. Since ice_check_module_power() is already doing something very similar add a new function ice_check_link_cfg_err() so any failures reported in the link info's link_cfg_err member can be printed in this one function. Also, add the new ICE_FLAG_PHY_FW_LOAD_FAILED bit to the PF's flags so we don't constantly print this error message during link polling if the value never changed. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-28 11:00:20 -07:00
Marcin Szycik	e984c4408f	ice: Add support for changing MTU on PR in switchdev mode This change adds support for changing MTU on port representor in switchdev mode, by setting the min/max MTU values on port representor netdev. Before it was possible to change the MTU only in a limited, default range (68-1500). Signed-off-by: Marcin Szycik <marcin.szycik@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-28 11:00:20 -07:00
Michal Swiatkowski	e492c2e12d	ice: send correct vc status in switchdev Part of virtchannel messages are treated in different way in switchdev mode to block configuring VFs from iavf driver side. This blocking was done by doing nothing and returning success, event without sending response. Not sending response for opcodes that aren't supported in switchdev mode leads to block iavf driver message handling. This happens for example when vlan is configured at VF config time (VLAN module is already loaded). To get rid of it ice driver should answer for each VF message. In switchdev mode: - for adding/deleting VLAN driver should answer success without doing anything to allow creating vlan device on VFs - for enabling/disabling VLAN stripping and promiscuous mode driver should answer not supported, this feature in switchdev can be only set from host side Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-28 11:00:20 -07:00
Michal Swiatkowski	f0a35040ad	ice: support for GRE in eswitch Mostly reuse code from Geneve and VXLAN in TC parsing code. Add new GRE header to match on correct fields. Create new dummy packets with GRE fields. Instead of checking if any encap values are presented in TC flower, check if device is tunnel type or redirect is to tunnel device. This will allow adding all combination of rules. For example filters only with inner fields. Return error in case device isn't tunnel but encap values are presented. gre example: - create tunnel device ip l add $NVGRE_DEV type gretap remote $NVGRE_REM_IP local $VF1_IP \ dev $PF - add tc filter (in switchdev mode) tc filter add dev $NVGRE_DEV protocol ip parent ffff: flower dst_ip \ $NVGRE1_IP action mirred egress redirect dev $VF1_PR Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-28 11:00:20 -07:00
Michal Swiatkowski	8b032a55c1	ice: low level support for tunnels Add definition of UDP tunnel dummy packets. Fill destination port value in filter based on UDP tunnel port. Append tunnel flags to switch filter definition in case of matching the tunnel. Both VXLAN and Geneve are UDP tunnels, so only one new header is needed. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-28 11:00:20 -07:00
Michal Swiatkowski	9e300987d4	ice: VXLAN and Geneve TC support Add definition for VXLAN and Geneve dummy packet. Define VXLAN and Geneve type of fields to match on correct UDP tunnel header. Parse tunnel specific fields from TC tool like outer MACs, outer IPs, outer destination port and VNI. Save values and masks in outer header struct and move header pointer to inner to simplify parsing inner values. There are two cases for redirect action: - from uplink to VF - TC filter is added on tunnel device - from VF to uplink - TC filter is added on PR, for this case check if redirect device is tunnel device VXLAN example: - create tunnel device ip l add $VXLAN_DEV type vxlan id $VXLAN_VNI dstport $VXLAN_PORT \ dev $PF - add TC filter (in switchdev mode) tc filter add dev $VXLAN_DEV protocol ip parent ffff: flower \ enc_dst_ip $VF1_IP enc_key_id $VXLAN_VNI action mirred egress \ redirect dev $VF1_PR Geneve example: - create tunnel device ip l add $GENEVE_DEV type geneve id $GENEVE_VNI dstport $GENEVE_PORT \ remote $GENEVE_IP - add TC filter (in switchdev mode) tc filter add dev $GENEVE_DEV protocol ip parent ffff: flower \ enc_key_id $GENEVE_VNI dst_ip $GENEVE1_IP action mirred egress \ redirect dev $VF1_PR Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-28 11:00:18 -07:00
Michal Swiatkowski	195bb48fcc	ice: support for indirect notification Implement indirect notification mechanism to support offloading TC rules on tunnel devices. Keep indirect block list in netdev priv. Notification will call setting tc cls flower function. For now we can offload only ingress type. Return not supported for other flow block binder. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-28 10:59:17 -07:00
Jakub Kicinski	7df621a3ee	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net include/net/sock.h `7b50ecfcc6` ("net: Rename ->stream_memory_read to ->sock_is_readable") `4c1e34c0db` ("vsock: Enable y2038 safe timeval for timeout") drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c `0daa55d033` ("octeontx2-af: cn10k: debugfs for dumping LMTST map table") `e77bcdd1f6` ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.") Adjacent code addition in both cases, keep both. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-28 10:43:58 -07:00
Yongxin Liu	fd1b5beb17	ice: check whether PTP is initialized in ice_ptp_release() PTP is currently only supported on E810 devices, it is checked in ice_ptp_init(). However, there is no check in ice_ptp_release(). For other E800 series devices, ice_ptp_release() will be wrongly executed. Fix the following calltrace. INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. Workqueue: ice ice_service_task [ice] Call Trace: dump_stack_lvl+0x5b/0x82 dump_stack+0x10/0x12 register_lock_class+0x495/0x4a0 ? find_held_lock+0x3c/0xb0 __lock_acquire+0x71/0x1830 lock_acquire+0x1e6/0x330 ? ice_ptp_release+0x3c/0x1e0 [ice] ? _raw_spin_lock+0x19/0x70 ? ice_ptp_release+0x3c/0x1e0 [ice] _raw_spin_lock+0x38/0x70 ? ice_ptp_release+0x3c/0x1e0 [ice] ice_ptp_release+0x3c/0x1e0 [ice] ice_prepare_for_reset+0xcb/0xe0 [ice] ice_do_reset+0x38/0x110 [ice] ice_service_task+0x138/0xf10 [ice] ? __this_cpu_preempt_check+0x13/0x20 process_one_work+0x26a/0x650 worker_thread+0x3f/0x3b0 ? __kthread_parkme+0x51/0xb0 ? process_one_work+0x650/0x650 kthread+0x161/0x190 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x1f/0x30 Fixes: `4dd0d5c33c` ("ice: add lock around Tx timestamp tracker flush") Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-25 13:44:55 -07:00
Dave Ertman	6a8b357278	ice: Respond to a NETDEV_UNREGISTER event for LAG When the PF is a member of a link aggregate, and the driver is removed, the process will hang unless we respond to the NETDEV_UNREGISTER event that is sent to the event_handler for LAG. Add a case statement for the ice_lag_event_handler to unlink the PF from the link aggregate. Also remove code that was incorrectly applying a dev_hold to peer_netdevs that were associated with the ice driver. Fixes: `df006dd4b1` ("ice: Add initial support framework for LAG") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-25 13:44:37 -07:00
David S. Miller	b89e7f2c31	ice: Nuild fix. M<erge issues... Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-22 14:24:44 +01:00
David S. Miller	bdfa75ad70	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Lots of simnple overlapping additions. With a build fix from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-22 11:41:16 +01:00
Kiran Patil	9fea749856	ice: Add tc-flower filter support for channel Add support to add/delete channel specific filter using tc-flower. For now, only supported action is "skip_sw hw_tc <tc_num>" Filter criteria is specific to channel and it can be combination of L3, L3+L4, L2+L4. Example: MATCH criteria Action --------------------------- src and/or dest IPv4[6]/mask -> Forward to "hw_tc <tc_num>" dest IPv4[6]/mask + dest L4 port -> Forward to "hw_tc <tc_num>" dest MAC + dest L4 port -> Forward to "hw_tc <tc_num>" src IPv4[6]/mask + src L4 port -> Forward to "hw_tc <tc_num>" src MAC + src L4 port -> Forward to "hw_tc <tc_num>" Adding tc-flower filter for channel using "hw_tc" ------------------------------------------------- tc qdisc add dev <ethX> clsact Above two steps are only needed the first time when adding tc-flower filter. tc filter add dev <ethX> protocol ip ingress prio 1 flower \ dst_ip 192.168.0.1/32 ip_proto tcp dst_port 5001 \ skip_sw hw_tc 1 tc filter show dev <ethX> ingress filter protocol ip pref 1 flower chain 0 filter protocol ip pref 1 flower chain 0 handle 0x1 hw_tc 1 eth_type ipv4 ip_proto tcp dst_ip 192.168.0.1 dst_port 5001 skip_sw in_hw in_hw_count 1 Delete specific filter: ------------------------- tc filter del dev <ethx> ingress pref 1 handle 0x1 flower Delete All filters: ------------------ tc filter del dev <ethX> ingress Co-developed-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-20 15:58:11 -07:00
Kiran Patil	fbc7b27af0	ice: enable ndo_setup_tc support for mqprio_qdisc Add support in driver for TC_QDISC_SETUP_MQPRIO. This support enables instantiation of channels in HW using existing MQPRIO infrastructure which is extended to be offloadable. This provides a mechanism to configure dedicated set of queues for each TC. Configuring channels using "tc mqprio": -------------------------------------- tc qdisc add dev <ethX> root mqprio num_tc 3 map 0 1 2 \ queues 4@0 4@4 4@8 hw 1 mode channel Above command configures 3 TCs having 4 queues each. "hw 1 mode channel" implies offload of channel configuration to HW. When driver processes configuration received via "ndo_setup_tc: QDISC_SETUP_MQPRIO", each TC maps to HW VSI with specified queues. User can optionally specify bandwidth min and max rate limit per TC (see example below). If shaper params like min and/or max bandwidth rate limit are specified, driver configures VSI specific rate limiter in HW. Configuring channels and bandwidth shaper parameters using "tc mqprio": ---------------------------------------------------------------- tc qdisc add dev <ethX> root mqprio \ num_tc 4 map 0 1 2 3 queues 4@0 4@4 4@8 4@12 hw 1 mode channel \ shaper bw_rlimit min_rate 1Gbit 2Gbit 3Gbit 4Gbit \ max_rate 4Gbit 5Gbit 6Gbit 7Gbit Command to view configured TCs: ----------------------------- tc qdisc show dev <ethX> Deleting TCs: ------------ tc qdisc del dev <ethX> root mqprio Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-20 15:58:11 -07:00
Kiran Patil	0754d65bd4	ice: Add infrastructure for mqprio support via ndo_setup_tc Add infrastructure required for "ndo_setup_tc:qdisc_mqprio". ice_vsi_setup is modified to configure traffic classes based on mqprio data received from the stack. This includes low-level functions to configure min, max rate-limit parameters in hardware for traffic classes. Each traffic class gets mapped to a hardware channel (VSI) which can be individually configured with different bandwidth parameters. Co-developed-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-20 15:57:54 -07:00
Tony Nguyen	7dcf78b870	ice: Add missing E810 device ids As part of support for E810 XXV devices, some device ids were inadvertently left out. Add those missing ids. Fixes: `195fb97766` ("ice: add additional E810 device id") Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>	2021-10-20 09:07:22 -07:00
Dan Carpenter	8702ed0b0d	ice: fix an error code in ice_ena_vfs() Return the error code if ice_eswitch_configure() fails. Don't return success. Fixes: `1c54c83993` ("ice: enable/disable switchdev when managing VFs") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-19 10:53:28 -07:00
Gustavo A. R. Silva	6f3323536a	ice: use devm_kcalloc() instead of devm_kzalloc() Use 2-factor multiplication argument form devm_kcalloc() instead of devm_kzalloc(). Link: https://github.com/KSPP/linux/issues/162 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-19 10:53:28 -07:00
Cai Huoqing	7c1b694ada	ice: Make use of the helper function devm_add_action_or_reset() The helper function devm_add_action_or_reset() will internally call devm_add_action(), and if devm_add_action() fails then it will execute the action mentioned and return the error code. So use devm_add_action_or_reset() instead of devm_add_action() to simplify the error handling, reduce the code. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-19 10:53:28 -07:00
Wojciech Drewek	3f13f570ff	ice: Refactor PR ethtool ops This patch improves a few things: - it fixes issue where ethtool -i reports that PR supports priv-flags and tests when in fact it does not support them - instead of using the same functions for both PF and PR ethtool ops, this patch introduces separate ops for both cases and internal functions with core logic. - prevent accessing VF VSI while VF is not ready by calling ice_check_vf_ready_for_cfg - all PR specific functions in ethtool.c were moved to one place in file - instead overwriting n_priv_flags in ice_repr_get_drvinfo, priv-flags code was moved from __ice_get_drvinfo to ice_get_drvinfo Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-19 10:53:27 -07:00
Wojciech Drewek	73b483b790	ice: Manage act flags for switchdev offloads Currently it is not possible to set/unset lb_en and lan_en flags for advanced rules during their creation. Both flags are enabled by default. In case of switchdev offloads for egress traffic we need lb_en to be disabled. Because of that, we work around it by updating the rule immediately after its creation. This change allows us to set/unset those flags right away and it gets rid of old workaround as well. Using ice_adv_rule_flags_info structure we can pass info about flags we want to be set for a given advanced rule. Flags are stored in flags_info.act. Values from act would be used only if act_valid was set to true, otherwise default values would be used. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-19 10:53:26 -07:00
Wojciech Drewek	1281b74596	ice: Forbid trusted VFs in switchdev mode Merge issues caused the check for switchdev mode has been inserted in wrong place. It should be in ice_set_vf_trust not in ice_set_vf_mac. Trusted VFs are forbidden in switchdev mode because they should be configured only from the host side. Fixes: `1c54c83993` ("ice: enable/disable switchdev when managing VFs") Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-19 10:53:26 -07:00
Jesse Brandeburg	23be7075b3	ice: fix software generating extra interrupts The driver tried to work around missing completion events that occurred while interrupts are disabled, by triggering a software interrupt whenever we exit polling (but we had to have polled at least once). This was causing a lot of extra interrupts for some workloads like NVMe over TCP, which resulted in regressions in performance. It was also visible when polling didn't prevent interrupts when busy_poll was enabled. Fix the extra interrupts by utilizing our previously unused 3rd ITR (interrupt throttle) index and set it to 20K interrupts per second, and then trigger a software interrupt within that rate limit. While here, slightly refactor the code to avoid an overwrite of a local variable in the case of wb_en = true. Fixes: `b7306b42be` ("ice: manage interrupts during poll exit") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-19 10:45:17 -07:00
Jesse Brandeburg	d16a4f45f3	ice: fix rate limit update after coalesce change If the adaptive settings are changed with ethtool -C ethx adaptive-rx off adaptive-tx off then the interrupt rate limit should be maintained as a user set value, but only if BOTH adaptive settings are off. Fix a bug where the rate limit that was being used in adaptive mode was staying set in the register but was not reported correctly by ethtool -c ethx. Due to long lines include a small refactor of q_vector variable. Fixes: `b8b4772377` ("ice: refactor interrupt moderation writes") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-19 10:45:16 -07:00
Jesse Brandeburg	d8eb7ad5e4	ice: update dim usage and moderation The driver was having trouble with unreliable latency when doing single threaded ping-pong tests. This was root caused to the DIM algorithm landing on a too slow interrupt value, which caused high latency, and it was especially present when queues were being switched frequently by the scheduler as happens on default setups today. In attempting to improve this, we allow the upper rate limit for interrupts to move to rate limit of 4 microseconds as a max, which means that no vector can generate more than 250,000 interrupts per second. The old config was up to 100,000. The driver previously tried to program the rate limit too frequently and if the receive and transmit side were both active on the same vector, the INTRL would be set incorrectly, and this change fixes that issue as a side effect of the redesign. This driver will operate from now on with a slightly changed DIM table with more emphasis towards latency sensitivity by having more table entries with lower latency than with high latency (high being >= 64 microseconds). The driver also resets the DIM algorithm state with a new stats set when there is no work done and the data becomes stale (older than 1 second), for the respective receive or transmit portion of the interrupt. Add a new helper for setting rate limit, which will be used more in a followup patch. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-19 10:45:16 -07:00
Brett Creeley	4ecc863305	ice: Add support for VF rate limiting Implement ndo_set_vf_rate to support setting of min_tx_rate and max_tx_rate; set the appropriate bandwidth in the scheduler for the node representing the specified VF VSI. Co-developed-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-19 09:22:31 -07:00
Maciej Fijalkowski	2faf63b650	ice: make use of ice_for_each_* macros Go through the code base and use ice_for_each_* macros. While at it, introduce ice_for_each_xdp_txq() macro that can be used for looping over xdp_rings array. Commit is not introducing any new functionality. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-15 07:39:03 -07:00
Maciej Fijalkowski	22bf877e52	ice: introduce XDP_TX fallback path Under rare circumstances there might be a situation where a requirement of having XDP Tx queue per CPU could not be fulfilled and some of the Tx resources have to be shared between CPUs. This yields a need for placing accesses to xdp_ring inside a critical section protected by spinlock. These accesses happen to be in the hot path, so let's introduce the static branch that will be triggered from the control plane when driver could not provide Tx queue dedicated for XDP on each CPU. Currently, the design that has been picked is to allow any number of XDP Tx queues that is at least half of a count of CPUs that platform has. For lower number driver will bail out with a response to user that there were not enough Tx resources that would allow configuring XDP. The sharing of rings is signalled via static branch enablement which in turn indicates that lock for xdp_ring accesses needs to be taken in hot path. Approach based on static branch has no impact on performance of a non-fallback path. One thing that is needed to be mentioned is a fact that the static branch will act as a global driver switch, meaning that if one PF got out of Tx resources, then other PFs that ice driver is servicing will suffer. However, given the fact that HW that ice driver is handling has 1024 Tx queues per each PF, this is currently an unlikely scenario. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-15 07:39:03 -07:00
Maciej Fijalkowski	9610bd988d	ice: optimize XDP_TX workloads Optimize Tx descriptor cleaning for XDP. Current approach doesn't really scale and chokes when multiple flows are handled. Introduce two ring fields, @next_dd and @next_rs that will keep track of descriptor that should be looked at when the need for cleaning arise and the descriptor that should have the RS bit set, respectively. Note that at this point the threshold is a constant (32), but it is something that we could make configurable. First thing is to get away from setting RS bit on each descriptor. Let's do this only once NTU is higher than the currently @next_rs value. In such case, grab the tx_desc[next_rs], set the RS bit in descriptor and advance the @next_rs by a 32. Second thing is to clean the Tx ring only when there are less than 32 free entries. For that case, look up the tx_desc[next_dd] for a DD bit. This bit is written back by HW to let the driver know that xmit was successful. It will happen only for those descriptors that had RS bit set. Clean only 32 descriptors and advance the DD bit. Actual cleaning routine is moved from ice_napi_poll() down to the ice_xmit_xdp_ring(). It is safe to do so as XDP ring will not get any SKBs in there that would rely on interrupts for the cleaning. Nice side effect is that for rare case of Tx fallback path (that next patch is going to introduce) we don't have to trigger the SW irq to clean the ring. With those two concepts, ring is kept at being almost full, but it is guaranteed that driver will be able to produce Tx descriptors. This approach seems to work out well even though the Tx descriptors are produced in one-by-one manner. Test was conducted with the ice HW bombarded with packets from HW generator, configured to generate 30 flows. Xdp2 sample yields the following results: <snip> proto 17: 79973066 pkt/s proto 17: 80018911 pkt/s proto 17: 80004654 pkt/s proto 17: 79992395 pkt/s proto 17: 79975162 pkt/s proto 17: 79955054 pkt/s proto 17: 79869168 pkt/s proto 17: 79823947 pkt/s proto 17: 79636971 pkt/s </snip> As that sample reports the Rx'ed frames, let's look at sar output. It says that what we Rx'ed we do actually Tx, no noticeable drops. Average: IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil Average: ens4f1 79842324.00 79842310.40 4678261.17 4678260.38 0.00 0.00 0.00 38.32 with tx_busy staying calm. When compared to a state before: Average: IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil Average: ens4f1 90919711.60 42233822.60 5327326.85 2474638.04 0.00 0.00 0.00 43.64 it can be observed that the amount of txpck/s is almost doubled, meaning that the performance is improved by around 90%. All of this due to the drops in the driver, previously the tx_busy stat was bumped at a 7mpps rate. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-15 07:39:03 -07:00
Maciej Fijalkowski	eb087cd828	ice: propagate xdp_ring onto rx_ring With rings being split, it is now convenient to introduce a pointer to XDP ring within the Rx ring. For XDP_TX workloads this means that xdp_rings array access will be skipped, which was executed per each processed frame. Also, read the XDP prog once per NAPI and if prog is present, set up the local xdp_ring pointer. Reading prog a single time was discussed in [1] with some concern raised by Toke around dispatcher handling and having the need for going through the RCU grace period in the ndo_bpf driver callback, but ice currently is torning down NAPI instances regardless of the prog presence on VSI. Although the pointer to XDP ring introduced to Rx ring makes things a lot slimmer/simpler, I still feel that single prog read per NAPI lifetime is beneficial. Further patch that will introduce the fallback path will also get a profit from that as xdp_ring pointer will be set during the XDP rings setup. [1]: https://lore.kernel.org/bpf/87k0oseo6e.fsf@toke.dk/ Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-15 07:39:03 -07:00
Maciej Fijalkowski	a55e16fa33	ice: do not create xdp_frame on XDP_TX xdp_frame is not needed for XDP_TX data path in ice driver case. For this data path cleaning of sent descriptor will not happen anywhere outside of the driver, which means that carrying the information about the underlying memory model via xdp_frame will not be used. Therefore, this conversion can be simply dropped, which would relieve CPU a bit. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-15 07:39:03 -07:00
Maciej Fijalkowski	0bb4f9ecad	ice: unify xdp_rings accesses There has been a long lasting issue of improper xdp_rings indexing for XDP_TX and XDP_REDIRECT actions. Given that currently rx_ring->q_index is mixed with smp_processor_id(), there could be a situation where Tx descriptors are produced onto XDP Tx ring, but tail is never bumped - for example pin a particular queue id to non-matching IRQ line. Address this problem by ignoring the user ring count setting and always initialize the xdp_rings array to be of num_possible_cpus() size. Then, always use the smp_processor_id() as an index to xdp_rings array. This provides serialization as at given time only a single softirq can run on a particular CPU. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-15 07:39:02 -07:00
Maciej Fijalkowski	e72bba2135	ice: split ice_ring onto Tx/Rx separate structs While it was convenient to have a generic ring structure that served both Tx and Rx sides, next commits are going to introduce several Tx-specific fields, so in order to avoid hurting the Rx side, let's pull out the Tx ring onto new ice_tx_ring and ice_rx_ring structs. Rx ring could be handled by the old ice_ring which would reduce the code churn within this patch, but this would make things asymmetric. Make the union out of the ring container within ice_q_vector so that it is possible to iterate over newly introduced ice_tx_ring. Remove the @size as it's only accessed from control path and it can be calculated pretty easily. Change definitions of ice_update_ring_stats and ice_fetch_u64_stats_per_ring so that they are ring agnostic and can be used for both Rx and Tx rings. Sizes of Rx and Tx ring structs are 256 and 192 bytes, respectively. In Rx ring xdp_rxq_info occupies its own cacheline, so it's the major difference now. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-15 07:39:02 -07:00
Maciej Fijalkowski	dc23715cf3	ice: move ice_container_type onto ice_ring_container Currently ice_container_type is scoped only for ice_ethtool.c. Next commit that will split the ice_ring struct onto Rx/Tx specific ring structs is going to also modify the type of linked list of rings that is within ice_ring_container. Therefore, the functions that are taking the ice_ring_container as an input argument will need to be aware of a ring type that will be looked up. Embed ice_container_type within ice_ring_container and initialize it properly when allocating the q_vectors. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-15 07:39:02 -07:00
Maciej Fijalkowski	e93d1c37a8	ice: remove ring_active from ice_ring This field is dead and driver is not making any use of it. Simply remove it. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-15 07:39:02 -07:00
Jakub Kicinski	e15f5972b8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net tools/testing/selftests/net/ioam6.sh `7b1700e009` ("selftests: net: modify IOAM tests for undef bits") `bf77b1400a` ("selftests: net: Test for the IOAM encapsulation with IPv6") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-14 16:50:14 -07:00
Brett Creeley	b726ddf984	ice: Print the api_patch as part of the fw.mgmt.api Currently when a user uses "devlink dev info", the fw.mgmt.api will be the major.minor numbers as shown below: devlink dev info pci/0000:3b:00.0 pci/0000:3b:00.0: driver ice serial_number 00-01-00-ff-ff-00-00-00 versions: fixed: board.id K91258-000 running: fw.mgmt 6.1.2 fw.mgmt.api 1.7 <--- No patch number included fw.mgmt.build 0xd75e7d06 fw.mgmt.srev 5 fw.undi 1.2992.0 fw.undi.srev 5 fw.psid.api 3.10 fw.bundle_id 0x800085cc fw.app.name ICE OS Default Package fw.app 1.3.27.0 fw.app.bundle_id 0xc0000001 fw.netlist 3.10.2000-3.1e.0 fw.netlist.build 0x2a76e110 stored: fw.mgmt.srev 5 fw.undi 1.2992.0 fw.undi.srev 5 fw.psid.api 3.10 fw.bundle_id 0x800085cc fw.netlist 3.10.2000-3.1e.0 fw.netlist.build 0x2a76e110 There are many features in the driver that depend on the major, minor, and patch version of the FW. Without the patch number in the output for fw.mgmt.api debugging issues related to the FW API version is difficult. Also, using major.minor.patch aligns with the existing firmware version which uses a 3 digit value. Fix this by making the fw.mgmt.api print the major.minor.patch versions. Shown below is the result: devlink dev info pci/0000:3b:00.0 pci/0000:3b:00.0: driver ice serial_number 00-01-00-ff-ff-00-00-00 versions: fixed: board.id K91258-000 running: fw.mgmt 6.1.2 fw.mgmt.api 1.7.9 <--- patch number included fw.mgmt.build 0xd75e7d06 fw.mgmt.srev 5 fw.undi 1.2992.0 fw.undi.srev 5 fw.psid.api 3.10 fw.bundle_id 0x800085cc fw.app.name ICE OS Default Package fw.app 1.3.27.0 fw.app.bundle_id 0xc0000001 fw.netlist 3.10.2000-3.1e.0 fw.netlist.build 0x2a76e110 stored: fw.mgmt.srev 5 fw.undi 1.2992.0 fw.undi.srev 5 fw.psid.api 3.10 fw.bundle_id 0x800085cc fw.netlist 3.10.2000-3.1e.0 fw.netlist.build 0x2a76e110 Fixes: `ff2e5c700e` ("ice: add basic handler for devlink .info_get") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-14 10:14:46 -07:00
Michal Swiatkowski	e4c2efa139	ice: fix getting UDP tunnel entry Correct parameters order in call to ice_tunnel_idx_to_entry function. Entry in sparse port table is correct when the idx is 0. For idx 1 one correct entry should be skipped, for idx 2 two of them should be skipped etc. Change if condition to be true when idx is 0, which means that previous valid entry of this tunnel type were skipped. Fixes: `b20e6c17c4` ("ice: convert to new udp_tunnel infrastructure") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-14 10:14:45 -07:00
Dave Ertman	73e30a62b1	ice: Avoid crash from unnecessary IDA free In the remove path, there is an attempt to free the aux_idx IDA whether it was allocated or not. This can potentially cause a crash when unloading the driver on systems that do not initialize support for RDMA. But, this free cannot be gated by the status bit for RDMA, since it is allocated if the driver detects support for RDMA at probe time, but the driver can enter into a state where RDMA is not supported after the IDA has been allocated at probe time and this would lead to a memory leak. Initialize aux_idx to an invalid value and check for a valid value when unloading to determine if an IDA free is necessary. Fixes: `d25a0fc41c` ("ice: Initialize RDMA support") Reported-by: Jun Miao <jun.miao@windriver.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-14 10:14:45 -07:00
Brett Creeley	ff7e932194	ice: Fix failure to re-add LAN/RDMA Tx queues Currently if the VSI is rebuilt/removed and the RDMA PF driver is active the RDMA Tx queue scheduler node configuration will not be cleaned up. This will cause the rebuild/re-add of the VSI to fail due to the software structures not being correctly cleaned up for the VSI index. Fix this by always calling ice_rm_vsi_rdma_cfg() for all VSI. If there are no RDMA scheduler nodes created, then there is no harm in calling ice_rm_vsi_rdma_cfg(). This change applies to all VSI types, so if RDMA support is added for other VSI types they will also get this change. Fixes: `348048e724` ("ice: Implement iidc operations") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Jerzy Wiktor Jurkowski <jerzy.wiktor.jurkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-14 10:14:00 -07:00
Maciej Machnikowski	325b2064d0	ice: Implement support for SMA and U.FL on E810-T Expose SMA and U.FL connectors as ptp_pins on E810-T based adapters and allow controlling them. E810-T adapters are equipped with: - 2 external bidirectional SMA connectors - 1 internal TX U.FL - 1 internal RX U.FL U.FL connectors share signal lines with the SMA connectors. The TX U.FL1 share the line with the SMA1 and the RX U.FL2 share line with the SMA2. This dependence is controlled by the ice_verify_pin_e810t. Additionally add support for the E810-T-based devices which don't use the SMA/U.FL controller. If the IO expander is not detected don't expose pins and use 2 predefined 1PPS input and output pins. Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-14 07:37:30 -07:00
Maciej Machnikowski	885fe6932a	ice: Add support for SMA control multiplexer E810-T adapters have two external bidirectional SMA connectors and two internal unidirectional U.FL connectors. Multiplexing between U.FL and SMA and SMA direction is controlled using the PCA9575 expander. Add support for the PCA9575 detection and control of the respective pins of the SMA/U.FL multiplexer using the GPIO AQ API. Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-14 07:37:30 -07:00
Maciej Machnikowski	3bb6324b3d	ice: Implement functions for reading and setting GPIO pins Implement ice_aq_get_gpio and ice_aq_set_gpio for reading and changing the state of GPIO pins described in the topology. Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-14 07:37:30 -07:00
Maciej Machnikowski	e00ae1a2aa	ice: Refactor ice_aqc_link_topo_addr Separate link topo parameters in struct ice_aqc_link_topo_addr into new struct ice_aqc_link_topo_params. This keeps input parameters for the get_link_topo command in a separate structure and is required by future commands that operate only on link topo params without the node handle. Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-14 07:37:30 -07:00
Jacob Keller	4d4a223a86	ice: fix locking for Tx timestamp tracking flush Commit `4dd0d5c33c` ("ice: add lock around Tx timestamp tracker flush") added a lock around the Tx timestamp tracker flow which is used to cleanup any left over SKBs and prepare for device removal. This lock is problematic because it is being held around a call to ice_clear_phy_tstamp. The clear function takes a mutex to send a PHY write command to firmware. This could lead to a deadlock if the mutex actually sleeps, and causes the following warning on a kernel with preemption debugging enabled: [ 715.419426] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:573 [ 715.427900] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3100, name: rmmod [ 715.435652] INFO: lockdep is turned off. [ 715.439591] Preemption disabled at: [ 715.439594] [<0000000000000000>] 0x0 [ 715.446678] CPU: 52 PID: 3100 Comm: rmmod Tainted: G W OE 5.15.0-rc4+ #42 bdd7ec3018e725f159ca0d372ce8c2c0e784891c [ 715.458058] Hardware name: Intel Corporation S2600STQ/S2600STQ, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020 [ 715.468483] Call Trace: [ 715.470940] dump_stack_lvl+0x6a/0x9a [ 715.474613] ___might_sleep.cold+0x224/0x26a [ 715.478895] __mutex_lock+0xb3/0x1440 [ 715.482569] ? stack_depot_save+0x378/0x500 [ 715.486763] ? ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.494979] ? kfree+0xc1/0x520 [ 715.498128] ? mutex_lock_io_nested+0x12a0/0x12a0 [ 715.502837] ? kasan_set_free_info+0x20/0x30 [ 715.507110] ? __kasan_slab_free+0x10b/0x140 [ 715.511385] ? slab_free_freelist_hook+0xc7/0x220 [ 715.516092] ? kfree+0xc1/0x520 [ 715.519235] ? ice_deinit_lag+0x16c/0x220 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.527359] ? ice_remove+0x1cf/0x6a0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.535133] ? pci_device_remove+0xab/0x1d0 [ 715.539318] ? __device_release_driver+0x35b/0x690 [ 715.544110] ? driver_detach+0x214/0x2f0 [ 715.548035] ? bus_remove_driver+0x11d/0x2f0 [ 715.552309] ? pci_unregister_driver+0x26/0x250 [ 715.556840] ? ice_module_exit+0xc/0x2f [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.564799] ? __do_sys_delete_module.constprop.0+0x2d8/0x4e0 [ 715.570554] ? do_syscall_64+0x3b/0x90 [ 715.574303] ? entry_SYSCALL_64_after_hwframe+0x44/0xae [ 715.579529] ? start_flush_work+0x542/0x8f0 [ 715.583719] ? ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.591923] ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.599960] ? wait_for_completion_io+0x250/0x250 [ 715.604662] ? lock_acquire+0x196/0x200 [ 715.608504] ? do_raw_spin_trylock+0xa5/0x160 [ 715.612864] ice_sbq_rw_reg+0x1e6/0x2f0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.620813] ? ice_reset+0x130/0x130 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.628497] ? __debug_check_no_obj_freed+0x1e8/0x3c0 [ 715.633550] ? trace_hardirqs_on+0x1c/0x130 [ 715.637748] ice_write_phy_reg_e810+0x70/0xf0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.646220] ? do_raw_spin_trylock+0xa5/0x160 [ 715.650581] ? ice_ptp_release+0x910/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.658797] ? ice_ptp_release+0x255/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.667013] ice_clear_phy_tstamp+0x2c/0x110 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.675403] ice_ptp_release+0x408/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.683440] ice_remove+0x560/0x6a0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.691037] ? _raw_spin_unlock_irqrestore+0x46/0x73 [ 715.696005] pci_device_remove+0xab/0x1d0 [ 715.700018] __device_release_driver+0x35b/0x690 [ 715.704637] driver_detach+0x214/0x2f0 [ 715.708389] bus_remove_driver+0x11d/0x2f0 [ 715.712489] pci_unregister_driver+0x26/0x250 [ 715.716857] ice_module_exit+0xc/0x2f [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.724637] __do_sys_delete_module.constprop.0+0x2d8/0x4e0 [ 715.730210] ? free_module+0x6d0/0x6d0 [ 715.733963] ? task_work_run+0xe1/0x170 [ 715.737803] ? exit_to_user_mode_loop+0x17f/0x1d0 [ 715.742509] ? rcu_read_lock_sched_held+0x12/0x80 [ 715.747215] ? trace_hardirqs_on+0x1c/0x130 [ 715.751401] do_syscall_64+0x3b/0x90 [ 715.754981] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 715.760033] RIP: 0033:0x7f4dfe59000b [ 715.763612] Code: 73 01 c3 48 8b 0d 6d 1e 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 1e 0c 00 f7 d8 64 89 01 48 [ 715.782357] RSP: 002b:00007ffe8c891708 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [ 715.789923] RAX: ffffffffffffffda RBX: 00005558a20468b0 RCX: 00007f4dfe59000b [ 715.797054] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005558a2046918 [ 715.804189] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 715.811319] R10: 00007f4dfe603ac0 R11: 0000000000000206 R12: 00007ffe8c891940 [ 715.818455] R13: 00007ffe8c8920a3 R14: 00005558a20462a0 R15: 00005558a20468b0 Notice that this is the only case where we use the lock in this way. In the cleanup kthread and work kthread the lock is only taken around the bit accesses. This was done intentionally to avoid this kind of issue. The way the lock is used, we only protect ordering of bit sets vs bit clears. The Tx writers in the hot path don't need to be protected against the entire kthread loop. The Tx queues threads only need to ensure that they do not re-use an index that is currently in use. The cleanup loop does not need to block all new set bits, since it will re-queue itself if new timestamps are present. Fix the tracker flow so that it uses the same flow as the standard cleanup thread. In addition, ensure the in_use bitmap actually gets cleared properly. This fixes the warning and also avoids the potential deadlock that might have occurred otherwise. Fixes: `4dd0d5c33c` ("ice: add lock around Tx timestamp tracker flush") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 12:10:39 +01:00
Michal Swiatkowski	7fde6d8b44	ice: ndo_setup_tc implementation for PR Add tc-flower support for VF port representor devices. Implement ndo_setup_tc callback for TC HW offload on VF port representors devices. Implemented both methods: add and delete tc-flower flows. Mark NETIF_F_HW_TC bit in net device's feature set to enable offload TC infrastructure for port representor. Implement TC filters replay function required to restore filters settings while switchdev configuration is rebuilt. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-11 09:03:08 -07:00
Kiran Patil	0d08a441fb	ice: ndo_setup_tc implementation for PF Implement ndo_setup_tc net device callback for TC HW offload on PF device. ndo_setup_tc provides support for HW offloading various TC filters. Add support for configuring the following filter with tc-flower: - default L2 filters (src/dst mac addresses, ethertype, VLAN) - variations of L3, L3+L4, L2+L3+L4 filters using advanced filters (including ipv4 and ipv6 addresses). Allow for adding/removing TC flows when PF device is configured in eswitch switchdev mode. Two types of actions are supported at the moment: FLOW_ACTION_DROP and FLOW_ACTION_REDIRECT. Co-developed-by: Priyalee Kushwaha <priyalee.kushwaha@intel.com> Signed-off-by: Priyalee Kushwaha <priyalee.kushwaha@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-11 09:03:04 -07:00
Michal Swiatkowski	572b820dfa	ice: Allow changing lan_en and lb_en on all kinds of filters There is no way to change default lan_en and lb_en flags while adding new rule. Add function that allows changing these flags on rule determined by rule id and recipe id. Function checks if the rule is presented on regular rules list or advance rules list and call the appropriate function to update rule entry. As rules with ICE_SW_LKUP_DFLT recipe aren't tracked in a list, implement function which updates flags without searching for rules based only on rule id. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-11 08:50:37 -07:00
Victor Raj	8b8ef05b77	ice: cleanup rules info Change ICE_SW_LKUP_LAST to ICE_MAX_NUM_RECIPES as for now there also can be recipes other than the default. Free all structures created for advanced recipes in cleanup function. Write a function to clean allocated structures on advanced rule info. Signed-off-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-11 08:49:55 -07:00
Shivanshu Shukla	8bb98f33de	ice: allow deleting advanced rules To remove advanced rule the same protocols list like in adding should be send to function. Based on this information list of advanced rules is searched to find the correct rule id. Remove advanced rule if it forwards to only one VSI. If it forwards to list of VSI remove only input VSI from this list. Introduce function to remove rule by id. It is used in case rule needs to be removed even if it forwards to the list of VSI. Allow removing all advanced rules from a particular VSI. It is useful in rebuilding VSI path. Co-developed-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Shivanshu Shukla <shivanshu.shukla@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-11 08:48:49 -07:00
Grishma Kotecha	0f94570d0c	ice: allow adding advanced rules Define dummy packet headers to allow adding advanced rules in HW. This header is used as admin queue command parameter for adding a rule. The firmware will extract correct fields and will use them in look ups. Define each supported packets header and offsets to words used in recipe. Supported headers: - MAC + IPv4 + UDP - MAC + VLAN + IPv4 + UDP - MAC + IPv4 + TCP - MAC + VLAN + IPv4 + TCP - MAC + IPv6 + UDP - MAC + VLAN + IPv6 + UDP - MAC + IPv6 + TCP - MAC + VLAN + IPv6 + TCP Add code for creating an advanced rule. Rule needs to match defined dummy packet, if not return error, which means that this type of rule isn't currently supported. The first step in adding advanced rule is searching for an advanced recipe matching this kind of rule. If it doesn't exist new recipe is created. Dummy packet has to be filled with the correct header field value from the rule definition. It will be used to do look up in HW. Support searching for existing advance rule entry. It is used in case of adding the same rule on different VSI. In this case, instead of creating new rule, the existing one should be updated with refreshed VSI list. Add initialization for prof_res_bm_init flag to zero so that the possible resource for fv in the files can be initialized. Co-developed-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Grishma Kotecha <grishma.kotecha@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-11 08:44:29 -07:00
Dan Nowlin	fd2a6b71e3	ice: create advanced switch recipe These changes introduce code for creating advanced recipes for the switch in hardware. There are a couple of recipes already defined in the HW. They apply to matching on basic protocol headers, like MAC, VLAN, MACVLAN, ethertype or direction (promiscuous), etc.. If the user wants to match on other protocol headers (eg. ip address, src/dst port etc.) or different variation of already supported protocols, there is a need to create new, more complex recipe. That new recipe is referred as 'advanced recipe', and the filtering rule created on top of that recipe is called 'advanced rule'. One recipe can have up to 5 words, but the first word is always reserved for match on switch id, so the driver can define up to 4 words for one recipe. To support recipes with more words up to 5 recipes can be chained, so 20 words can be programmed for look up. Input for adding recipe function is a list of protocols to support. Based on this list correct profile is being chosen. Correct profile means that it contains all protocol types from a list. Each profile have up to 48 field vector words and each of this word have protocol id and offset. These two fields need to match with input data for adding recipe function. If the correct profile can't be found the function returns an error. The next step after finding the correct profile is grouping words into groups. One group can have up to 4 words. This is done to simplify sending recipes to HW (because recipe also can have up to 4 words). In case of chaining (so when look up consists of more than 4 words) last recipe will always have results from the previous recipes used as words. A recipe to profile map is used to store information about which profile is associate with this recipe. This map is an array of 64 elements (max number of recipes) and each element is a 256 bits bitmap (max number of profiles) Profile to recipe map is used to store information about which recipe is associate with this profile. This map is an array of 256 elements (max number of profiles) and each element is a 64 bits bitmap (max number of recipes) Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-11 08:38:27 -07:00
Dan Nowlin	450052a414	ice: manage profiles and field vectors Implement functions to manage profiles and field vectors in hardware. In hardware, there are up to 256 profiles and each of these profiles can have 48 field vector words. Each field vector word is described by protocol id and offset in the packet. To add a new recipe all used profiles need to be searched. If the profile contains all required protocol ids and offsets from the recipe it can be used. The driver has to add this profile to recipe association to tell hardware that newly added recipe is going to be associated with this profile. The amount of used profiles depend on the package. To avoid searching across not used profile, max profile id value is calculated at init flow. The profile is considered as unused when all field vector words in the profile are invalid (protocol id 0xff and offset 0x1ff). Profiles are read from the package section ICE_SID_FLD_VEC_SW. Empty field vector words can be used for recipe results. Store all unused field vector words in prof_res_bm. It is a 256 elements array (max number of profiles) each element is a 48 bit bitmap (max number of field vector words). For now, support only non-tunnel profiles type. Co-developed-by: Grishma Kotecha <grishma.kotecha@intel.com> Signed-off-by: Grishma Kotecha <grishma.kotecha@intel.com> Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-11 08:36:25 -07:00
Grishma Kotecha	7715ec3247	ice: implement low level recipes functions Add code to manage recipes and profiles on admin queue layer. Allow the driver to add a new recipe and update an existing one. Get a recipe and get a recipe to profile association is mostly used in update existing recipes code. Only default recipes can be updated. An update is done by reading recipes from HW, changing their params and calling add recipe command. Support following admin queue commands: - ice_aqc_opc_add_recipe (0x0290) - create a recipe with protocol header information and other details that determine how this recipe filter works - ice_aqc_opc_recipe_to_profile (0x0291) - associate a switch recipe to a profile - ice_aqc_opc_get_recipe (0x0292) - get details of an existing recipe - ice_aqc_opc_get_recipe_to_profile (0x0293) - get a recipe associated with profile ID Define ICE_AQC_RES_TYPE_RECIPE resource type to hold a switch recipe. It is needed when a new switch recipe needs to be created. Co-developed-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Grishma Kotecha <grishma.kotecha@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-11 07:41:49 -07:00
Wojciech Drewek	7aae80cef7	ice: add port representor ethtool ops and stats Introduce the following ethtool operations for VF's representor: -get_drvinfo -get_strings -get_ethtool_stats -get_sset_count -get_link In all cases, existing operations were used with minor changes which allow us to detect if ethtool op was called for representor. Only VF VSI stats will be available for representor. Implement ndo_get_stats64 for port representor. This will update VF VSI stats and read them. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:43 -07:00
Grzegorz Nitka	f5396b8a66	ice: switchdev slow path Slow path means allowing packet to go from uplink to representor and from representor to correct VF on Rx site and from VF to representor and to uplink on Tx site. To accomplish this driver, has to set correct Tx descriptor. When packet is sent from representor to VF, destination should be set to VF VSI. When packet is sent from uplink port destination should be uplink to bypass switch infrastructure and send packet outside. On Rx site driver should check source VSI field from Rx descriptor and based on that forward packed to correct netdev. To allow this there is a target netdevs table in control plane VSI struct. Co-developed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:42 -07:00
Grzegorz Nitka	b3be918dcc	ice: rebuild switchdev when resetting all VFs As resetting all VFs behaves mostly like creating new VFs also eswitch infrastructure has to be recreated. The easiest way to do that is to rebuild eswitch after resetting VFs. Implement helper functions to start and stop all representors queues. This is used to disable traffic on port representors. In rebuild path: - NAPI has to be disabled - eswitch environment has to be set up - new port representors have to be created, because the old one had pointer to not existing VFs - new control plane VSI ring should be remapped - NAPI hast to be enabled - rxdid has to be set to FLEX_NIC_2, because this descriptor id support source_vsi, which is needed on control plane VSI queues - port representors queues have to be started Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:42 -07:00
Grzegorz Nitka	1c54c83993	ice: enable/disable switchdev when managing VFs Only way to enable switchdev is to create VFs when the eswitch mode is set to switchdev. Check if correct mode is set and enable switchdev in function which creating VFs. Disable switchdev when user change number of VFs to 0. Changing eswitch mode back to legacy when VFs are created in switchdev mode isn't allowed. As switchdev takes care of managing filter rules, adding new rules on VF is blocked. In case of resetting VF driver has to update pointer in ice_repr struct, because after reset VSI related things can change. Co-developed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:42 -07:00
Grzegorz Nitka	f66756e0ea	ice: introduce new type of VSI for switchdev New type of VSI has to be defined for switchdev control plane VSI. Number of allocated Tx and Rx queue has to be equal to amount of VFs, because each port representor should have one Tx and Rx queue. Also to not increase number of used irqs too much, control plane VSI uses only one q_vector and handle all queues in one irq. To allow handling all queues in one irq , new function to clean msix for eswitch was introduced. This function will schedule napi for each representor instead of scheduling it only for one like in normal clean irq function. Only one additional msix has to be requested. Always try to request it in ice_ena_msix_range function. Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:42 -07:00
Grzegorz Nitka	1a1c40df2e	ice: set and release switchdev environment Switchdev environment has to be set up when user create VFs and eswitch mode is switchdev. Release is done when user delete all VFs. Data path in this implementation is based on control plane VSI. This VSI is used to pass traffic from port representors to corresponding VFs and vice versa. Default TX rule has to be added to forward packet to control plane VSI. This will redirect packets from VFs which don't match other rules to control plane VSI. On RX side default rule is added on uplink VSI to receive all traffic that doesn't match other rules. When setting switchdev environment all other rules from VFs should be removed. Packet to VFs will be forwarded by control plane VSI. As VF without any mac rules can't send any packet because of antispoof mechanism, VSI antispoof should be turned off on each VFs. To send packet from representor to correct VSI, destination VSI field in TX descriptor will have to be filled. Allow that by setting destination override bit in control plane VSI security config. Packet from VFs will be received on control plane VSI. Driver should decide to which netdev forward the packet. Decision is made based on src_vsi field from descriptor. There is a target netdev list in control plane VSI struct which choose netdev based on src_vsi number. Co-developed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:42 -07:00
Michal Swiatkowski	bd676b2929	ice: allow changing lan_en and lb_en on dflt rules There is no way to change default lan_en and lb_en flags while adding new rule. Add function that allows changing these flags on ICE_SW_LKUP_DFLT recipe and any rule id. lan_en allows packet to go outside if rule is matched. Clearing this bit will block packet from sending it outside. lb_en allows packet to be forwarded to other VSI. Clearing this bit will block packet from forwarding it to other VSI. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:42 -07:00
Michal Swiatkowski	ff5411ef88	ice: manage VSI antispoof and destination override Implement functions to make setting VSI security config easier. Main function ice_update_security fills security section field and checks against error in updating VSI. Reset functions are responsible for correct filling config according to user expectations. This helper is needed because destination override is located in this section. Driver has to set this bit to allow strering Tx packet on VSI based on value in Tx descriptors. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:42 -07:00
Michal Swiatkowski	ac19e03ef7	ice: allow process VF opcodes in different ways In switchdev driver shouldn't add MAC, VLAN and promisc filters on iavf demand but should return success to not break normal iavf flow. Achieve that by creating table of functions pointer with default functions used to parse iavf command. While parse iavf command, call correct function from table instead of calling function direct. When port representors are being created change functions in table to new one that behaves correctly for switchdev puprose (ignoring new filters). Change back to default ops when representors are being removed. Co-developed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:42 -07:00
Michal Swiatkowski	37165e3f56	ice: introduce VF port representor Port representor is used to manage VF from host side. To allow it each created representor registers netdevice with random hw address. Also devlink port is created for all representors. Port representor name is created based on switch id or managed by devlink core if devlink port was registered with success. Open and stop ndo ops are implemented to allow managing the VF link state. Link state is tracked in VF struct. Struct ice_netdev_priv is extended by pointer to representor field. This is needed to get correct representor from netdev struct mostly used in ndo calls. Implement helper functions to check if given netdev is netdev of port representor (ice_is_port_repr_netdev) and to get representor from netdev (ice_netdev_to_repr). As driver mostly will create or destroy port representors on all VFs instead of on single one, write functions to add and remove representor for each VF. Representor struct contains pointer to source VSI, which is VSI configured on VF, backpointer to VF, backpointer to netdev, q_vector pointer and metadata_dst which will be used in data path. Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:41 -07:00
Wojciech Drewek	2ae0aa4758	ice: Move devlink port to PF/VF struct Keeping devlink port inside VSI data structure causes some issues. Since VF VSI is released during reset that means that we have to unregister devlink port and register it again every time reset is triggered. With the new changes in devlink API it might cause deadlock issues. After calling devlink_port_register/devlink_port_unregister devlink API is going to lock rtnl_mutex. It's an issue when VF reset is triggered in netlink operation context (like setting VF MAC address or VLAN), because rtnl_lock is already taken by netlink. Another call of rtnl_lock from devlink API results in dead-lock. By moving devlink port to PF/VF we avoid creating/destroying it during reset. Since this patch, devlink ports are created during ice_probe, destroyed during ice_remove for PF and created during ice_repr_add, destroyed during ice_repr_rem for VF. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:41 -07:00
Michal Swiatkowski	3ea9bd5d02	ice: support basic E-Switch mode control Write set and get eswitch mode functions used by devlink ops. Use new pf struct member eswitch_mode to track current eswitch mode in driver. Changing eswitch mode is only allowed when there are no VFs created. Create new file for eswitch related code. Add config flag ICE_SWITCHDEV to allow user to choose if switchdev support should be enabled or disabled. Use case examples: - show current eswitch mode ('legacy' is the default one) [root@localhost]# devlink dev eswitch show pci/0000:03:00.1 pci/0000:03:00.1: mode legacy - move to 'switchdev' mode [root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode switchdev [root@localhost]# devlink dev eswitch show pci/0000:03:00.1 pci/0000:03:00.1: mode switchdev - create 2 VFs [root@localhost]# echo 2 > /sys/class/net/ens4f1/device/sriov_numvfs - unsuccessful attempt to change eswitch mode while VFs are created [root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode legacy devlink answers: Operation not supported - destroy VFs [root@localhost]# echo 0 > /sys/class/net/ens4f1/device/sriov_numvfs - restore 'legacy' mode [root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode legacy [root@localhost]# devlink dev eswitch show pci/0000:03:00.1 pci/0000:03:00.1: mode legacy Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-10-07 10:41:41 -07:00
Jakub Kicinski	a05e4c0af4	ethernet: use eth_hw_addr_set() for dev->addr_len cases Convert all Ethernet drivers from memcpy(... dev->addr_len) to eth_hw_addr_set(): @@ expression dev, np; @@ - memcpy(dev->dev_addr, np, dev->addr_len) + eth_hw_addr_set(dev, np) In theory addr_len may not be ETH_ALEN, but we don't expect non-Ethernet devices to live under this directory, and only the following cases of setting addr_len exist: - cxgb4 for mgmt device, and the drivers which set it to ETH_ALEN: s2io, mlx4, vxge. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-05 13:16:48 +01:00
Jakub Kicinski	f3956ebb3b	ethernet: use eth_hw_addr_set() instead of ether_addr_copy() Convert Ethernet from ether_addr_copy() to eth_hw_addr_set(): @@ expression dev, np; @@ - ether_addr_copy(dev->dev_addr, np) + eth_hw_addr_set(dev, np) Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-02 14:18:25 +01:00
Jakub Kicinski	6b7b0c3091	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== bpf-next 2021-10-02 We've added 85 non-merge commits during the last 15 day(s) which contain a total of 132 files changed, 13779 insertions(+), 6724 deletions(-). The main changes are: 1) Massive update on test_bpf.ko coverage for JITs as preparatory work for an upcoming MIPS eBPF JIT, from Johan Almbladh. 2) Add a batched interface for RX buffer allocation in AF_XDP buffer pool, with driver support for i40e and ice from Magnus Karlsson. 3) Add legacy uprobe support to libbpf to complement recently merged legacy kprobe support, from Andrii Nakryiko. 4) Add bpf_trace_vprintk() as variadic printk helper, from Dave Marchevsky. 5) Support saving the register state in verifier when spilling <8byte bounded scalar to the stack, from Martin Lau. 6) Add libbpf opt-in for stricter BPF program section name handling as part of libbpf 1.0 effort, from Andrii Nakryiko. 7) Add a document to help clarifying BPF licensing, from Alexei Starovoitov. 8) Fix skel_internal.h to propagate errno if the loader indicates an internal error, from Kumar Kartikeya Dwivedi. 9) Fix build warnings with -Wcast-function-type so that the option can later be enabled by default for the kernel, from Kees Cook. 10) Fix libbpf to ignore STT_SECTION symbols in legacy map definitions as it otherwise errors out when encountering them, from Toke Høiland-Jørgensen. 11) Teach libbpf to recognize specialized maps (such as for perf RB) and internally remove BTF type IDs when creating them, from Hengqi Chen. 12) Various fixes and improvements to BPF selftests. ==================== Link: https://lore.kernel.org/r/20211002001327.15169-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-01 19:58:02 -07:00
Len Baker	30cba287eb	ice: Prefer kcalloc over open coded arithmetic As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. In this case this is not actually dynamic sizes: both sides of the multiplication are constant values. However it is best to refactor this anyway, just to keep the open-coded math idiom out of code. So, use the purpose specific kcalloc() function instead of the argument size * count in the kzalloc() function. [1] https://www.kernel.org/doc/html/v5.14/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments Signed-off-by: Len Baker <len.baker@gmx.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-09-28 09:42:04 -07:00
Jeff Guo	b37e4e94c1	ice: Fix macro name for IPv4 fragment flag In IPv4 header, fragment flags indicate whether the packet needs to be fragmented or not. The value 0x20 represents MF (More Fragment); fix the macro name to match this. Signed-off-by: Ting Xu <ting.xu@intel.com> Signed-off-by: Jeff Guo <jia.guo@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-09-28 09:42:04 -07:00
Jacob Keller	0128cc6e92	ice: refactor devlink getter/fallback functions to void After commit `a8f89fa277` ("ice: do not abort devlink info if board identifier can't be found"), the getter/fallback() functions no longer report an error. Convert the interface to a void so that it is no longer possible to add a version field that is fatal. This makes sense, because we should not fail to report other versions just because one of the version pieces could not be found. Finally, clean up the getter functions line wrapping so that none of them take more than 80 columns, as is the usual style for networking files. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-09-28 09:42:04 -07:00
Anirudh Venkataramanan	4fc5fbee5c	ice: Fix link mode handling The messaging for unsupported module detection is different for lenient mode and strict mode. Update the code to print the right messaging for a given link mode. Media topology conflict is not an error in lenient mode, so return an error code only if not in lenient mode. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-09-28 09:42:04 -07:00
Anirudh Venkataramanan	40b247608b	ice: Add feature bitmap, helpers and a check for DSCP DSCP a.k.a L3 QoS is only supported on certain devices. To enforce this, this patch introduces a bitmap of features and helper functions. The feature bitmap is set based on device IDs on driver init. Currently, DSCP is the only feature in this bitmap, but there will be more in the future. In the DCB netlink flow, check if the feature bit is set before exercising DSCP. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-09-28 09:42:04 -07:00
Dave Ertman	2a87bd73e5	ice: Add DSCP support Implement code to handle submission of APP TLV's containing DSCP to TC mapping. The first such mapping received on an interface will cause that PF to switch to L3 DSCP QoS mode, apply the default config for that mode, and apply the received mapping. Only one such mapping will be allowed per DSCP value, and when the last DSCP mapping is deleted, the PF will switch back into L2 VLAN QoS mode, applying the appropriate default QoS settings. L3 DSCP QoS mode will only be allowed in SW DCBx mode, in other words, when the FW LLDP engine is disabled. Commands that break this mutual exclusivity will be blocked. Co-developed-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-09-28 09:42:04 -07:00
Magnus Karlsson	db804cfc21	ice: Use the xsk batched rx allocation interface Use the new xsk batched rx allocation interface for the zero-copy data path. As the array of struct xdp_buff pointers kept by the driver is really a ring that wraps, the allocation routine is modified to detect a wrap and in that case call the allocation function twice. The allocation function cannot deal with wrapped rings, only arrays. As we now know exactly how many buffers we get and that there is no wrapping, the allocation function can be simplified even more as all if-statements in the allocation loop can be removed, improving performance. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210922075613.12186-5-magnus.karlsson@gmail.com	2021-09-28 00:18:35 +02:00
Magnus Karlsson	57f7f8b6bc	ice: Use xdp_buf instead of rx_buf for xsk zero-copy In order to use the new xsk batched buffer allocation interface, a pointer to an array of struct xsk_buff pointers need to be provided so that the function can put the result of the allocation there. In the ice driver, we already have a ring that stores pointers to xdp_buffs. This is only used for the xsk zero-copy driver and is a union with the structure that is used for the regular non zero-copy path. Unfortunately, that structure is larger than the xdp_buffs pointers which mean that there will be a stride (of 20 bytes) between each xdp_buff pointer. And feeding this into the xsk_buff_alloc_batch interface will not work since it assumes a regular array of xdp_buff pointers (each 8 bytes with 0 bytes in-between them on a 64-bit system). To fix this, remove the xdp_buff pointer from the rx_buf union and move it one step higher to the union above which only has pointers to arrays in it. This solves the problem and we can directly feed the SW ring of xdp_buff pointers straight into the allocation function in the next patch when that interface is used. This will improve performance. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210922075613.12186-4-magnus.karlsson@gmail.com	2021-09-28 00:18:34 +02:00
Leon Romanovsky	838cefd5e5	ice: Open devlink when device is ready Move devlink_registration routine to be the last command, when the device is fully initialized. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-09-27 16:31:59 +01:00
Leon Romanovsky	2ff04286a9	ice: Delete always true check of PF pointer PF pointer is always valid when PCI core calls its .shutdown() and .remove() callbacks. There is no need to check it again. Fixes: `837f08fdec` ("ice: Add basic driver framework for Intel(R) E800 Series") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-09-24 14:12:57 +01:00
Leon Romanovsky	db4278c55f	devlink: Make devlink_register to be void devlink_register() can't fail and always returns success, but all drivers are obligated to check returned status anyway. This adds a lot of boilerplate code to handle impossible flow. Make devlink_register() void and simplify the drivers that use that API call. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Simon Horman <simon.horman@corigine.com> Acked-by: Vladimir Oltean <olteanv@gmail.com> # dsa Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-09-22 14:15:12 +01:00
Dave Ertman	bfe8443509	ice: Correctly deal with PFs that do not support RDMA There are two cases where the current PF does not support RDMA functionality. The first is if the NVM loaded on the device is set to not support RDMA (common_caps.rdma is false). The second is if the kernel bonding driver has included the current PF in an active link aggregate. When the driver has determined that this PF does not support RDMA, then auxiliary devices should not be created on the auxiliary bus. Without a device on the auxiliary bus, even if the irdma driver is present, there will be no RDMA activity attempted on this PF. Currently, in the reset flow, an attempt to create auxiliary devices is performed without regard to the ability of the PF. There needs to be a check in ice_aux_plug_dev (as the central point that creates auxiliary devices) to see if the PF is in a state to support the functionality. When disabling and re-enabling RDMA due to the inclusion/removal of the PF in a link aggregate, we also need to set/clear the bit which controls auxiliary device creation so that a reset recovery in a link aggregate situation doesn't try to create auxiliary devices when it shouldn't. Fixes: `f9f5301e7e` ("ice: Register auxiliary device to provide RDMA") Reported-by: Yongxin Liu <yongxin.liu@windriver.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-09-10 09:58:55 +01:00
Jakub Kicinski	29ce8f9701	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net include/linux/netdevice.h net/socket.c `d0efb16294` ("net: don't unconditionally copy_from_user a struct ifreq for socket ioctls") `876f0bf9d0` ("net: socket: simplify dev_ifconf handling") `29c4964822` ("net: socket: rework compat_ifreq_ioctl()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-31 09:06:04 -07:00
Brett Creeley	b357d9717b	ice: Only lock to update netdev dev_addr commit `3ba7f53f8b` ("ice: don't remove netdev->dev_addr from uc sync list") introduced calls to netif_addr_lock_bh() and netif_addr_unlock_bh() in the driver's ndo_set_mac() callback. This is fine since the driver is updated the netdev's dev_addr, but since this is a spinlock, the driver cannot sleep when the lock is held. Unfortunately the functions to add/delete MAC filters depend on a mutex. This was causing a trace with the lock debug kernel config options enabled when changing the mac address via iproute. [ 203.273059] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281 [ 203.273065] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6698, name: ip [ 203.273068] Preemption disabled at: [ 203.273068] [<ffffffffc04aaeab>] ice_set_mac_address+0x8b/0x1c0 [ice] [ 203.273097] CPU: 31 PID: 6698 Comm: ip Tainted: G S W I 5.14.0-rc4 #2 [ 203.273100] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020 [ 203.273102] Call Trace: [ 203.273107] dump_stack_lvl+0x33/0x42 [ 203.273113] ? ice_set_mac_address+0x8b/0x1c0 [ice] [ 203.273124] ___might_sleep.cold.150+0xda/0xea [ 203.273131] mutex_lock+0x1c/0x40 [ 203.273136] ice_remove_mac+0xe3/0x180 [ice] [ 203.273155] ? ice_fltr_add_mac_list+0x20/0x20 [ice] [ 203.273175] ice_fltr_prepare_mac+0x43/0xa0 [ice] [ 203.273194] ice_set_mac_address+0xab/0x1c0 [ice] [ 203.273206] dev_set_mac_address+0xb8/0x120 [ 203.273210] dev_set_mac_address_user+0x2c/0x50 [ 203.273212] do_setlink+0x1dd/0x10e0 [ 203.273217] ? __nla_validate_parse+0x12d/0x1a0 [ 203.273221] __rtnl_newlink+0x530/0x910 [ 203.273224] ? __kmalloc_node_track_caller+0x17f/0x380 [ 203.273230] ? preempt_count_add+0x68/0xa0 [ 203.273236] ? _raw_spin_lock_irqsave+0x1f/0x30 [ 203.273241] ? kmem_cache_alloc_trace+0x4d/0x440 [ 203.273244] rtnl_newlink+0x43/0x60 [ 203.273245] rtnetlink_rcv_msg+0x13a/0x380 [ 203.273248] ? rtnl_calcit.isra.40+0x130/0x130 [ 203.273250] netlink_rcv_skb+0x4e/0x100 [ 203.273256] netlink_unicast+0x1a2/0x280 [ 203.273258] netlink_sendmsg+0x242/0x490 [ 203.273260] sock_sendmsg+0x58/0x60 [ 203.273263] ____sys_sendmsg+0x1ef/0x260 [ 203.273265] ? copy_msghdr_from_user+0x5c/0x90 [ 203.273268] ? ____sys_recvmsg+0xe6/0x170 [ 203.273270] ___sys_sendmsg+0x7c/0xc0 [ 203.273272] ? copy_msghdr_from_user+0x5c/0x90 [ 203.273274] ? ___sys_recvmsg+0x89/0xc0 [ 203.273276] ? __netlink_sendskb+0x50/0x50 [ 203.273278] ? mod_objcg_state+0xee/0x310 [ 203.273282] ? __dentry_kill+0x114/0x170 [ 203.273286] ? get_max_files+0x10/0x10 [ 203.273288] __sys_sendmsg+0x57/0xa0 [ 203.273290] do_syscall_64+0x37/0x80 [ 203.273295] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 203.273296] RIP: 0033:0x7f8edf96e278 [ 203.273298] Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 63 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55 [ 203.273300] RSP: 002b:00007ffcb8bdac08 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 203.273303] RAX: ffffffffffffffda RBX: 000000006115e0ae RCX: 00007f8edf96e278 [ 203.273304] RDX: 0000000000000000 RSI: 00007ffcb8bdac70 RDI: 0000000000000003 [ 203.273305] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007ffcb8bda5b0 [ 203.273306] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 [ 203.273306] R13: 0000555e10092020 R14: 0000000000000000 R15: 0000000000000005 Fix this by only locking when changing the netdev->dev_addr. Also, make sure to restore the old netdev->dev_addr on any failures. Fixes: `3ba7f53f8b` ("ice: don't remove netdev->dev_addr from uc sync list") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-08-27 13:15:55 -07:00
Jacob Keller	9ee313433c	ice: restart periodic outputs around time changes When we enabled auxiliary input/output support for the E810 device, we forgot to add logic to restart the output when we change time. This is important as the periodic output will be incorrect after a time change otherwise. This unfortunately includes the adjust time function, even though it uses an atomic hardware interface. The atomic adjustment can still cause the pin output to stall permanently, so we need to stop and restart it. Introduce wrapper functions to temporarily disable and then re-enable the clock outputs. Fixes: `172db5f91d` ("ice: add support for auxiliary input/output pins") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Sunitha D Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-08-27 13:14:41 -07:00
Jacob Keller	4dd0d5c33c	ice: add lock around Tx timestamp tracker flush The driver didn't take the lock while flushing the Tx tracker, which could cause a race where one thread is trying to read timestamps out while another thread is trying to read the tracker to check the timestamps. Avoid this by ensuring that flushing is locked against read accesses. Fixes: `ea9b847cda` ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-08-27 09:14:49 -07:00
Jacob Keller	1f0cbb3e89	ice: remove dead code for allocating pin_config We have code in the ice driver which allocates the pin_config structure if n_pins is > 0, but we never set n_pins to be greater than zero. There's no reason to keep this code until we actually have pin_config support. Remove this. We can re-add it properly when we implement support for pin_config for E810-T devices. Fixes: `172db5f91d` ("ice: add support for auxiliary input/output pins") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-08-27 09:11:39 -07:00
Jacob Keller	84c5fb8c42	ice: fix Tx queue iteration for Tx timestamp enablement The driver accidentally copied the ice_for_each_rxq iterator when implementing enablement of the ptp_tx bit for the Tx rings. We still load the Tx rings and set the ptp_tx field, but we iterate over the count of the num_rxq. If the number of Tx and Rx queues differ, this could either cause a buffer overrun when accessing the tx_rings list if num_txq is greater than num_rxq, or it could cause us to fail to enable Tx timestamps for some rings. This was not noticed originally as we generally have the same number of Tx and Rx queues. Fixes: `ea9b847cda` ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-08-27 08:38:55 -07:00
Jakub Kicinski	97c78d0af5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/wwan/mhi_wwan_mbim.c - drop the extra arg. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-26 17:57:57 -07:00
Yufeng Mo	f3ccfda193	ethtool: extend coalesce setting uAPI with CQE mode In order to support more coalesce parameters through netlink, add two new parameter kernel_coal and extack for .set_coalesce and .get_coalesce, then some extra info can return to user with the netlink API. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-24 07:38:29 -07:00
Jacob Keller	a8f89fa277	ice: do not abort devlink info if board identifier can't be found The devlink dev info command reports version information about the device and firmware running on the board. This includes the "board.id" field which is supposed to represent an identifier of the board design. The ice driver uses the Product Board Assembly identifier for this. In some cases, the PBA is not present in the NVM. If this happens, devlink dev info will fail with an error. Instead, modify the ice_info_pba function to just exit without filling in the context buffer. This will cause the board.id field to be skipped. Log a dev_dbg message in case someone wants to confirm why board.id is not showing up for them. Fixes: `e961b679fb` ("ice: add board identifier info to devlink .info_get") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20210819223451.245613-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-19 18:20:01 -07:00
Jakub Kicinski	f444fea789	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/ptp/Kconfig: `55c8fca1da` ("ptp_pch: Restore dependency on PCI") `e5f3155267` ("ethernet: fix PTP_1588_CLOCK dependencies") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-19 18:09:18 -07:00
Maciej Machnikowski	5f77351963	ice: Fix perout start time rounding Internal tests found out that the latest code doesn't bring up 1PPS out as expected. As a result of incorrect define used to round the time up the time was round down to the past second boundary. Fix define used for rounding to properly round up to the next Top of second in ice_ptp_cfg_clkout to fix it. Fixes: `172db5f91d` ("ice: add support for auxiliary input/output pins") Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20210813165018.2196013-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-13 17:22:53 -07:00
Jakub Kicinski	f4083a752a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h `9e26680733` ("bnxt_en: Update firmware call to retrieve TX PTP timestamp") `9e518f2580` ("bnxt_en: 1PPS functions to configure TSIO pins") `099fdeda65` ("bnxt_en: Event handler for PPS events") kernel/bpf/helpers.c include/linux/bpf-cgroup.h `a2baf4e8bb` ("bpf: Fix potentially incorrect results with bpf_get_local_storage()") `c7603cfa04` ("bpf: Add ambient BPF runtime context stored in current") drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c `5957cc557d` ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray") `2d0b41a376` ("net/mlx5: Refcount mlx5_irq with integer") MAINTAINERS `7b637cd52f` ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo") `7d901a1e87` ("net: phy: add Maxlinear GPY115/21x/24x driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-13 06:41:22 -07:00
Brett Creeley	3ba7f53f8b	ice: don't remove netdev->dev_addr from uc sync list In some circumstances, such as with bridging, it's possible that the stack will add the device's own MAC address to its unicast address list. If, later, the stack deletes this address, the driver will receive a request to remove this address. The driver stores its current MAC address as part of the VSI MAC filter list instead of separately. So, this causes a problem when the device's MAC address is deleted unexpectedly, which results in traffic failure in some cases. The following configuration steps will reproduce the previously mentioned problem: > ip link set eth0 up > ip link add dev br0 type bridge > ip link set br0 up > ip addr flush dev eth0 > ip link set eth0 master br0 > echo 1 > /sys/class/net/br0/bridge/vlan_filtering > modprobe -r veth > modprobe -r bridge > ip addr add 192.168.1.100/24 dev eth0 The following ping command fails due to the netdev->dev_addr being deleted when removing the bridge module. > ping <link partner> Fix this by making sure to not delete the netdev->dev_addr during MAC address sync. After fixing this issue it was noticed that the netdev_warn() in .set_mac was overly verbose, so make it at netdev_dbg(). Also, there is a possibility of a race condition between .set_mac and .set_rx_mode. Fix this by calling netif_addr_lock_bh() and netif_addr_unlock_bh() on the device's netdev when the netdev->dev_addr is going to be updated in .set_mac. Fixes: `e94d447866` ("ice: Implement filter sync, NDO operations and bump version") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Liang Li <liali@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-08-09 09:59:23 -07:00
Anirudh Venkataramanan	c503e63200	ice: Stop processing VF messages during teardown When VFs are setup and torn down in quick succession, it is possible that a VF is torn down by the PF while the VF's virtchnl requests are still in the PF's mailbox ring. Processing the VF's virtchnl request when the VF itself doesn't exist results in undefined behavior. Fix this by adding a check to stop processing virtchnl requests when VF teardown is in progress. Fixes: `ddf30f7ff8` ("ice: Add handler to configure SR-IOV") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-08-09 09:59:23 -07:00
Anirudh Venkataramanan	50ac747984	ice: Prevent probing virtual functions The userspace utility "driverctl" can be used to change/override the system's default driver choices. This is useful in some situations (buggy driver, old driver missing a device ID, trying a workaround, etc.) where the user needs to load a different driver. However, this is also prone to user error, where a driver is mapped to a device it's not designed to drive. For example, if the ice driver is mapped to driver iavf devices, the ice driver crashes. Add a check to return an error if the ice driver is being used to probe a virtual function. Fixes: `837f08fdec` ("ice: Add basic driver framework for Intel(R) E800 Series") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-08-09 09:59:23 -07:00
Leon Romanovsky	919d13a7e4	devlink: Set device as early as possible All kernel devlink implementations call to devlink_alloc() during initialization routine for specific device which is used later as a parent device for devlink_register(). Such late device assignment causes to the situation which requires us to call to device_register() before setting other parameters, but that call opens devlink to the world and makes accessible for the netlink users. Any attempt to move devlink_register() to be the last call generates the following error due to access to the devlink->dev pointer. [ 8.758862] devlink_nl_param_fill+0x2e8/0xe50 [ 8.760305] devlink_param_notify+0x6d/0x180 [ 8.760435] __devlink_params_register+0x2f1/0x670 [ 8.760558] devlink_params_register+0x1e/0x20 The simple change of API to set devlink device in the devlink_alloc() instead of devlink_register() fixes all this above and ensures that prior to call to devlink_register() everything already set. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-09 10:21:40 +01:00
Arnd Bergmann	a76053707d	dev_ioctl: split out ndo_eth_ioctl Most users of ndo_do_ioctl are ethernet drivers that implement the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP. Separate these from the few drivers that use ndo_do_ioctl to implement SIOCBOND, SIOCBR and SIOCWANDEV commands. This is a purely cosmetic change intended to help readers find their way through the implementation. Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Vivien Didelot <vivien.didelot@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Vladimir Oltean <olteanv@gmail.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: linux-rdma@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 20:11:45 +01:00
David S. Miller	e1289cfb63	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-28 The following pull-request contains BPF updates for your net-next tree. We've added 37 non-merge commits during the last 12 day(s) which contain a total of 56 files changed, 394 insertions(+), 380 deletions(-). The main changes are: 1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney. 2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski. 3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev. 4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria. 5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards. 6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa. 7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi. 8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim. 9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young. 10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer. 11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski. 12) Fix up bpfilter startup log-level to info level, from Gary Lin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-28 15:28:03 -07:00
Christophe JAILLET	b81c191c46	ice: Fix a memory leak in an error handling path in 'ice_pf_dcb_cfg()' If this 'kzalloc()' fails we must free some resources as in all the other error handling paths of this function. Fixes: `348048e724` ("ice: Implement iidc operations") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-25 11:30:50 -07:00
Tony Nguyen	70fa0a0780	ice: remove unnecessary VSI assignment ice_get_vf_vsi() is being called twice for the same VSI. Remove the unnecessary call/assignment. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>	2021-06-25 11:30:50 -07:00
Victor Raj	37c592062b	ice: remove the VSI info from previous agg Remove the VSI info from previous aggregator after moving the VSI to a new aggregator. Signed-off-by: Victor Raj <victor.raj@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-25 11:30:49 -07:00
Maciej Machnikowski	172db5f91d	ice: add support for auxiliary input/output pins The E810 device supports programmable pins for enabling both input and output events related to the PTP hardware clock. This includes both output signals with programmable period, as well as timestamping of events on input pins. Add support for enabling these using the CONFIG_PTP_1588_CLOCK interface. This allows programming the software defined pins to take advantage of the hardware clock features. Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-25 11:30:49 -07:00
Jesse Brandeburg	3089cf6d3c	ice: add tracepoints This patch is modeled after one by Scott Peterson for i40e. Add tracepoints to the driver, via a new file ice_trace.h and some new trace calls added in interesting places in the driver. Add some tracing for DIMLIB to help debug interrupt moderation problems. Performance should not be affected, and this can be very useful for debugging and adding new trace events to paths in the future. Note eBPF programs can attach to these events, as well as perf can count them since we're attaching to the events subsystem in the kernel. Co-developed-by: Ben Shelton <benjamin.h.shelton@intel.com> Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-25 08:32:18 -07:00
Toke Høiland-Jørgensen	49589b23d5	intel: Remove rcu_read_lock() around XDP program invocation The Intel drivers all have rcu_read_lock()/rcu_read_unlock() pairs around XDP program invocations. However, the actual lifetime of the objects referred by the XDP program invocation is longer, all the way through to the call to xdp_do_flush(), making the scope of the rcu_read_lock() too small. This turns out to be harmless because it all happens in a single NAPI poll cycle (and thus under local_bh_disable()), but it makes the rcu_read_lock() misleading. Rather than extend the scope of the rcu_read_lock(), just get rid of it entirely. With the addition of RCU annotations to the XDP_REDIRECT map types that take bh execution into account, lockdep even understands this to be safe, so there's really no reason to keep it around. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> # i40e Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20210624160609.292325-12-toke@redhat.com	2021-06-24 19:44:34 +02:00
Jakub Kicinski	adc2e56ebe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Trivial conflicts in net/can/isotp.c and tools/testing/selftests/net/mptcp/mptcp_connect.sh scaled_ppm_to_ppb() was moved from drivers/ptp/ptp_clock.c to include/linux/ptp_clock_kernel.h in -next so re-apply the fix there. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-06-18 19:47:02 -07:00
Jesse Brandeburg	dda90cb90a	ice: report hash type such as L2/L3/L4 The hardware is reporting the type of the hash used for RSS as a PTYPE field in the receive descriptor. Use this value to set the skb packet hash type by extending the hash type table to cover all 10-bits of possible values (requiring some variables to be changed from u8 to u16), and then use that table to convert to one of the possible values in enum pkt_hash_types. While we're here, remove the unused ptype struct value, which makes table init easier for the zero entries, and use ranged initializer to remove a bunch of code (works with gcc and clang). Without this change, the kernel will recalculate the hash in software, which can consume extra CPU cycles. Co-developed-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-18 08:59:46 -07:00
Colin Ian King	587b839de7	ice: remove redundant continue statement in a for-loop The continue statement in the for-loop is redundant. Re-work the hw_lock check to remove it. Addresses-Coverity: ("Continue has no effect") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-17 09:25:06 -07:00
Lorenzo Bianconi	4d7f75fe80	net: ice: ptp: fix compilation warning if PTP_1588_CLOCK is disabled Fix the following compilation warning if PTP_1588_CLOCK is not enabled drivers/net/ethernet/intel/ice/ice_ptp.h:149:1: error: return type defaults to ‘int’ [-Werror=return-type] ice_ptp_request_ts(struct ice_ptp_tx tx, struct sk_buff skb) Fixes: `ea9b847cda` ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-17 09:24:49 -07:00
Jacob Keller	1e00113413	ice: remove unnecessary NULL checks before ptp_read_system_* The ptp_read_system_prets and ptp_read_system_postts functions already check for the NULL value of the ptp_system_timestamp structure pointer. There is no need to check this manually in the ice driver code. Remove the checks. Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-17 09:24:27 -07:00
Shaokun Zhang	b13ad3e08d	ice: Remove the repeated declaration Function 'ice_is_vsi_valid' is declared twice, remove the repeated declaration. Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-17 09:19:59 -07:00
Paul M Stillwell Jr	c73bf3bd83	ice: remove local variable Remove the local variable since it's only used once. Instead, use it directly. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-17 09:19:59 -07:00
Paul M Stillwell Jr	b6b0501d8d	ice: reduce scope of variables There are some places where the scope of a variable can be reduced so do that. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-17 09:19:59 -07:00
Jacob Keller	0c526d440f	ice: mark PTYPE 2 as reserved The entry for PTYPE 2 in the ice_ptype_lkup table incorrectly states that this is an L2 packet with no payload. According to the datasheet, this PTYPE is actually unused and reserved. Fix the lookup entry to indicate this is an unused entry that is reserved. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-17 09:19:59 -07:00
Jacob Keller	638a0c8c88	ice: fix incorrect payload indicator on PTYPE The entry for PTYPE 90 indicates that the payload is layer 3. This does not match the specification in the datasheet which indicates the packet is a MAC, IPv6, UDP packet, with a payload in layer 4. Fix the lookup table to match the data sheet. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-17 09:19:58 -07:00
Jacob Keller	ea9b847cda	ice: enable transmit timestamps for E810 devices Add support for enabling Tx timestamp requests for outgoing packets on E810 devices. The ice hardware can support multiple outstanding Tx timestamp requests. When sending a descriptor to hardware, a Tx timestamp request is made by setting a request bit, and assigning an index that represents which Tx timestamp index to store the timestamp in. Hardware makes no effort to synchronize the index use, so it is up to software to ensure that Tx timestamp indexes are not re-used before the timestamp is reported back. To do this, introduce a Tx timestamp tracker which will keep track of currently in-use indexes. In the hot path, if a packet has a timestamp request, an index will be requested from the tracker. Unfortunately, this does require a lock as the indexes are shared across all queues on a PHY. There are not enough indexes to reliably assign only 1 to each queue. For the E810 devices, the timestamp indexes are not shared across PHYs, so each port can have its own tracking. Once hardware captures a timestamp, an interrupt is fired. In this interrupt, trigger a new work item that will figure out which timestamp was completed, and report the timestamp back to the stack. This function loops through the Tx timestamp indexes and checks whether there is now a valid timestamp. If so, it clears the PHY timestamp indication in the PHY memory, locks and removes the SKB and bit in the tracker, then reports the timestamp to the stack. It is possible in some cases that a timestamp request will be initiated but never completed. This might occur if the packet is dropped by software or hardware before it reaches the PHY. Add a task to the periodic work function that will check whether a timestamp request is more than a few seconds old. If so, the timestamp index is cleared in the PHY, and the SKB is released. Just as with Rx timestamps, the Tx timestamps are only 40 bits wide, and use the same overall logic for extending to 64 bits of nanoseconds. With this change, E810 devices should be able to perform basic PTP functionality. Future changes will extend the support to cover the E822-based devices. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-11 08:47:41 -07:00
Jacob Keller	77a781155a	ice: enable receive hardware timestamping Add SIOCGHWTSTAMP and SIOCSHWTSTAMP ioctl handlers to respond to requests to enable timestamping support. If the request is for enabling Rx timestamps, set a bit in the Rx descriptors to indicate that receive timestamps should be reported. Hardware captures receive timestamps in the PHY which only captures part of the timer, and reports only 40 bits into the Rx descriptor. The upper 32 bits represent the contents of GLTSYN_TIME_L at the point of packet reception, while the lower 8 bits represent the upper 8 bits of GLTSYN_TIME_0. The networking and PTP stack expect 64 bit timestamps in nanoseconds. To support this, implement some logic to extend the timestamps by using the full PHC time. If the Rx timestamp was captured prior to the PHC time, then the real timestamp is PHC - (lower_32_bits(PHC) - timestamp) If the Rx timestamp was captured after the PHC time, then the real timestamp is PHC + (timestamp - lower_32_bits(PHC)) These calculations are correct as long as neither the PHC timestamp nor the Rx timestamps are more than 2^32-1 nanseconds old. Further, we can detect when the Rx timestamp is before or after the PHC as long as the PHC timestamp is no more than 2^31-1 nanoseconds old. In that case, we calculate the delta between the lower 32 bits of the PHC and the Rx timestamp. If it's larger than 2^31-1 then the Rx timestamp must have been captured in the past. If it's smaller, then the Rx timestamp must have been captured after PHC time. Add an ice_ptp_extend_32b_ts function that relies on a cached copy of the PHC time and implements this algorithm to calculate the proper upper 32bits of the Rx timestamps. Cache the PHC time periodically in all of the Rx rings. This enables each Rx ring to simply call the extension function with a recent copy of the PHC time. By ensuring that the PHC time is kept up to date periodically, we ensure this algorithm doesn't use stale data and produce incorrect results. To cache the time, introduce a kworker and a kwork item to periodically store the Rx time. It might seem like we should use the .do_aux_work interface of the PTP clock. This doesn't work because all PFs must cache this time, but only one PF owns the PTP clock device. Thus, the ice driver will manage its own kthread instead of relying on the PTP do_aux_work handler. With this change, the driver can now report Rx timestamps on all incoming packets. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-11 08:47:41 -07:00
Jacob Keller	67569a7f94	ice: report the PTP clock index in ethtool .get_ts_info Now that the driver registers a PTP clock device that represents the clock hardware, it is important that the clock index is reported via the ethtool .get_ts_info callback. The underlying hardware resource is shared between multiple PF functions. Only one function owns the hardware resources associated with a timer, but multiple functions may be associated with it for the purposes of timestamping. To support this, the owning PF will store the clock index into the driver shared parameters buffer in firmware. Other PFs will look up the clock index by reading the driver shared parameter on demand when requested via the .get_ts_info ethtool function. In this way, all functions which are tied to the same timer are able to report the clock index. Userspace software such as ptp4l performs a look up on the netdev to determine the associated clock, and all commands to control or configure the clock will be handled through the controlling PF. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-11 08:47:41 -07:00
Jacob Keller	06c16d89d2	ice: register 1588 PTP clock device object for E810 devices Add a new ice_ptp.c file for holding the basic PTP clock interface functions. If the device supports PTP, call the new ice_ptp_init and ice_ptp_release functions where appropriate. If the function owns the hardware resource associated with the PTP hardware clock, register with the PTP_1588_CLOCK infrastructure to allocate a new clock object that represents the device hardware clock. Implement basic functionality for reading and setting the clock time, performing clock adjustments, and adjusting the clock frequency. Future changes will introduce functionality for handling related features including Tx and Rx timestamps. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-11 08:47:30 -07:00
Jacob Keller	03cb4473be	ice: add low level PTP clock access functions Add the ice_ptp_hw.c file and some associated definitions to the ice driver folder. This file contains basic low level definitions for functions that interact with the device hardware. For now, only E810-based devices are supported. The ice hardware supports 2 major variants which have different PHYs with different procedures necessary for interacting with the device clock. Because the device captures timestamps in the PHY, each PHY has its own internal timer. The timers are synchronized in hardware by first preparing the source timer and the PHY timer shadow registers, and then issuing a synchronization command. This ensures that both the source timer and PHY timers are programmed simultaneously. The timers themselves are all driven from the same oscillator source. The functions in ice_ptp_hw.c abstract over the differences between how the PHYs in E810 are programmed vs how the PHYs in E822 devices are programmed. This series only implements E810 support, but E822 support will be added in a future change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-11 07:38:00 -07:00
Jacob Keller	7f9ab54d31	ice: add support for set/get of driver-stored firmware parameters Depending on the device configuration, the ice hardware may share the PTP hardware clock timer between multiple PFs. Each PF is informed by firmware during initialization of the PTP timer association. When bringing up PTP, only the PFs which own the timer shall allocate a PTP hardware clock. Other PFs associated with that timer must report the correct PTP clock index in order to allow userspace software the ability to know which ports are connected to the same clock. To support this, the firmware has driver shared parameters. These parameters enable one PF to write the clock index into firmware, and have other PFs read the associated value out. This enables the driver to have only a single PF allocate and control the device timer registers, while other PFs associated with that timer can report the correct clock in the ETHTOOL_GET_TS_INFO report. Add support for the necessary admin queue commands to enable reading and writing of the driver shared parameters. This will be used in a future change to enable sharing the PTP clock index between PF drivers. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-11 07:38:00 -07:00
Jacob Keller	9733cc94c5	ice: process 1588 PTP capabilities during initialization The device firmware reports PTP clock capabilities to each PF during initialization. This includes various information for both the overall device and the individual function, including For functions: * whether this function has timesync enabled * whether this function owns one of the 2 possible clock timers, and which one * which timer the function is associated with * the clock frequency, if the device supports multiple clock frequencies * The GPIO pin association for the timer owned by this PF, if any For the device: * Which PF owns timer 0, if any * Which PF owns timer 1, if any * whether timer 0 is enabled * whether timer 1 is enabled Extract the bits from the capabilities information reported by firmware and store them in the device and function capability structures.o This information will be used in a future change to have the function driver enable PTP hardware clock support. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-11 07:38:00 -07:00
Jacob Keller	8f5ee3c477	ice: add support for sideband messages In order to support certain device features, including enabling the PTP hardware clock, the ice driver needs to control some registers on the device PHY. These registers are accessed by sending sideband messages. For some hardware, these messages must be sent over the device admin queue, while other hardware has a dedicated control queue for the sideband messages. Add the neighbor device message structure for sending a message to the neighboring device. Where supported, initialize the sideband control queue and handle cleanup. Add a wrapper function for sending sideband control queue messages that read or write a neighboring device register. Because some devices send sideband messages over the AdminQ, also increase the length of the admin queue to allow more messages to be queued up. This is important because the sideband messages add additional pressure on the AQ usage. This support will be used in following patches to enable support for CONFIG_1588_PTP_CLOCK. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-11 07:38:00 -07:00
Maciej Fijalkowski	2e84f6b377	ice: parameterize functions responsible for Tx ring management Commit `ae15e0ba1b` ("ice: Change number of XDP Tx queues to match number of Rx queues") tried to address the incorrect setting of XDP queue count that was based on the Tx queue count, whereas in theory we should provide the XDP queue per Rx queue. However, the routines that setup and destroy the set of Tx resources are still based on the vsi->num_txq. Ice supports the asynchronous Tx/Rx queue count, so for a setup where vsi->num_txq > vsi->num_rxq, ice_vsi_stop_tx_rings and ice_vsi_cfg_txqs will be accessing the vsi->xdp_rings out of the bounds. Parameterize two mentioned functions so they get the size of Tx resources array as the input. Fixes: `ae15e0ba1b` ("ice: Change number of XDP Tx queues to match number of Rx queues") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-09 13:15:10 -07:00
Maciej Fijalkowski	ebc5399ea1	ice: add ndo_bpf callback for safe mode netdev ops ice driver requires a programmable pipeline firmware package in order to have a support for advanced features. Otherwise, driver falls back to so called 'safe mode'. For that mode, ndo_bpf callback is not exposed and when user tries to load XDP program, the following happens: $ sudo ./xdp1 enp179s0f1 libbpf: Kernel error message: Underlying driver does not support XDP in native mode link set xdp fd failed which is sort of confusing, as there is a native XDP support, but not in the current mode. Improve the user experience by providing the specific ndo_bpf callback dedicated for safe mode which will make use of extack to explicitly let the user know that the DDP package is missing and that's the reason that the XDP can't be loaded onto interface currently. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Fixes: `efc2214b60` ("ice: Add support for XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-09 13:15:10 -07:00
David S. Miller	b3ef1550a4	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2021-06-07 This series contains updates to virtchnl header file and ice driver. Brett adds capability bits to virtchnl to specify whether a primary or secondary MAC address is being requested and adds the implementation to ice. He also adds storing of VF MAC address so that it will be preserved across reboots of VM and refactors VF queue configuration to remove the expectation that configuration be done all at once. Krzysztof refactors ice_setup_rx_ctx() to remove configuration not related to Rx context into a new function, ice_vsi_cfg_rxq(). Liwei Song extends the wait time for the global config timeout. Salil Mehta refactors code in ice_vsi_set_num_qs() to remove an unnecessary call when the user has requested specific number of Rx or Tx queues. Jesse converts define macros to static inlines for NOP configurations. Jake adds messaging when devlink fails to read device capabilities and when pldmfw cannot find the requested firmware. Adds a wait for reset completion when reporting devlink info and reinitializes NVM during rebuild to ensure values are current. Ani adds detection and reporting of modules exceeding supported power levels and changes an error message to a debug message. Paul fixes a clang warning for deadcode.DeadStores. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-07 13:24:50 -07:00
David S. Miller	126285651b	Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net Bug fixes overlapping feature additions and refactoring, mostly. Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-07 13:01:52 -07:00
Paul M Stillwell Jr	7e94090ae1	ice: fix clang warning regarding deadcode.DeadStores clang generates deadcode.DeadStores warnings when a variable is used to read a value, but then that value isn't used later in the code. Fix this warning. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:02 -07:00
Anirudh Venkataramanan	a69606cde1	ice: downgrade error print to debug print Failing to add or remove LLDP filter doesn't seem to be a fatal error, so downgrade the dev_err message to a dev_dbg message. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:01 -07:00
Anirudh Venkataramanan	c77849f546	ice: Detect and report unsupported module power levels Determine whether an unsupported power configuration is preventing link establishment by storing and checking the link_cfg_err_byte. Print error messages when module power levels are unsupported. Also add a new flag bit to prevent spamming said error messages. Co-developed-by: Jeb Cramer <jeb.j.cramer@intel.com> Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:01 -07:00
Jacob Keller	97a4ec0107	ice: (re)initialize NVM fields when rebuilding After performing a flash update, a device EMP reset may occur. This reset will cause the newly downloaded firmware to be initialized. When this happens, the driver still reports the previous NVM version information. This is because the NVM versions are cached within the hw structure. This can be confusing, as the new firmware is in fact running in this case. Handle this by calling ice_init_nvm when rebuilding the driver state. This will update the flash version information and ensures that the current values are displayed when reporting the NVM versions to the stack. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:01 -07:00
Jacob Keller	1c08052ec4	ice: wait for reset before reporting devlink info Requesting device firmware information while the device is busy cleaning up after a reset can result in an unexpected failure: This occurs because the command is attempting to access the device AdminQ while it is down. Resolve this by having the command wait for a while until the reset is complete. To do this, introduce a reset_wait_queue and associated helper function "ice_wait_for_reset". This helper will use the wait queue to sleep until the driver is done rebuilding. Use of a wait queue is preferred because the potential sleep duration can be several seconds. To ensure that the thread wakes up properly, a new wake_up call is added during all code paths which clear the reset state bits associated with the driver rebuild flow. Using this ensures that tools can request device information without worrying about whether the driver is cleaning up from a reset. Specifically, it is expected that a flash update could result in a device reset, and it is better to delay the response for information until the reset is complete rather than exit with an immediate failure. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:01 -07:00
Jacob Keller	e872b94f9c	ice: add error message when pldmfw_flash_image fails When flashing a new firmware image onto the device, the pldmfw library parses the image contents looking for a matching record. If no record can be found, the function reports an error of -ENOENT. This can produce a very confusing error message and experience for the user: $devlink dev flash pci/0000🆎00.0 file image.bin devlink answers: No such file or directory This is because the ENOENT error code is interpreted as a missing file or directory. The pldmfw library does not have direct access to the extack pointer as it is generic and non-netdevice specific. The only way that ENOENT is returned by the pldmfw library is when no record matches. Catch this specific error and report a suitable extended ack message: $devlink dev flash pci/0000🆎00.0 file image.bin Error: ice: Firmware image has no record matching this device devlink answers: No such file or directory In addition, ensure that we log an error message to the console whenever this function fails. Because our driver specific PLDM operation functions potentially set the extended ACK message, avoid overwriting this with a generic message. This change should result in an improved experience when attempting to flash an image that does not have a compatible record. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:01 -07:00
Jacob Keller	d5f84ae95f	ice: add extack when unable to read device caps When filling out information for the DEVLINK_CMD_INFO_GET, the driver needs to read some device capabilities. Add an extack message to properly inform the caller of the failure, as we do for other failures in this function. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:01 -07:00
Jesse Brandeburg	96cf4f689b	ice: use static inline for dummy functions Trivial: The driver had previously attempted to use #define macros to make functions that have no use in certain configs disappear. Using static inlines instead allows for certain static checkers to process the code better, and results in no functional change. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:01 -07:00
Salil Mehta	b38b7f2bb4	ice: Re-organizes reqstd/avail {R, T}XQ check/code for efficiency If user has explicitly requested the number of {R,T}XQs, then it is unnecessary to get the count of already available {R,T}XQs from the PF avail_{r,t}xqs bitmap. This value will get overridden by user specified value in any case. Re-organize this code for improving the flow, readability and efficiency. This scope of improvement was found during the review of the ICE driver code. Fixes: `87324e747f` ("ice: Implement ethtool ops for channels") Cc: intel-wired-lan@lists.osuosl.org Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:01 -07:00
Liwei Song	fb3612840d	ice: set the value of global config lock timeout longer It may need hold Global Config Lock a longer time when download DDP package file, extend the timeout value to 5000ms to ensure that download can be finished before other AQ command got time to run, this will fix the issue below when probe the device, 5000ms is a test value that work with both Backplane and BreakoutCable NVM image: ice 0000:f4:00.0: VSI 12 failed lan queue config, error ICE_ERR_CFG ice 0000:f4:00.0: Failed to delete VSI 12 in FW - error: ICE_ERR_AQ_TIMEOUT ice 0000:f4:00.0: probe failed due to setup PF switch: -12 ice: probe of 0000:f4:00.0 failed with error -12 Signed-off-by: Liwei Song <liwei.song@windriver.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:01 -07:00
Brett Creeley	7ad15440ac	ice: Refactor VIRTCHNL_OP_CONFIG_VSI_QUEUES handling Currently, when a VF requests queue configuration via VIRTCHNL_OP_CONFIG_VSI_QUEUES the PF driver expects that this message will only be called once and we always assume the queues being configured start from 0. This is incorrect and is causing issues when a VF tries to send this message for multiple queue blocks. Fix this by using the queue_id specified in the virtchnl message and allowing for individual Rx and/or Tx queues to be configured. Also, reduce the duplicated for loops for configuring the queues by moving all the logic into a single for loop. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:59:01 -07:00
Krzysztof Kazimierczak	43c7f9198d	ice: Refactor ice_setup_rx_ctx Move AF_XDP logic and buffer allocation out of ice_setup_rx_ctx() to a new function ice_vsi_cfg_rxq(), so the function actually sets up the Rx context. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Co-developed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>	2021-06-07 08:58:56 -07:00
Brett Creeley	f28cd5ce1a	ice: Save VF's MAC across reboot If a VM reboots and/or VF driver is unloaded, its cached hardware MAC address (hw_lan_addr.addr) is cleared in some cases. If the VF is trusted, then the PF driver allows the VF to clear its old MAC address even if this MAC was configured by a host administrator. If the VF is untrusted, then the PF driver allows the VF to clear its old MAC address only if the host admin did not set it. For the trusted VF case, this is unexpected and will cause issues because the host configured MAC (i.e. via XML) will be cleared on VM reboot. For the untrusted VF case, this is done to be consistent and it will allow the VF to keep the same MAC across VM reboot. Fix this by introducing dev_lan_addr to the VF structure. This will be the VF's MAC address when it's up and running and in most cases will be the same as the hw_lan_addr. However, to address the VM reboot and unload/reload problem, the driver will never allow the hw_lan_addr to be cleared via VIRTCHNL_OP_DEL_ETH_ADDR. When the VF's MAC is changed, the dev_lan_addr and hw_lan_addr will always be updated with the same value. The only ways the VF's MAC can change are the following: - Set the VF's MAC administratively on the host via iproute2. - If the VF is trusted and changes/sets its own MAC. - If the VF is untrusted and the host has not set the MAC via iproute2. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:30:58 -07:00
Brett Creeley	51efbbdf1d	ice: Manage VF's MAC address for both legacy and new cases Currently there is no way for a VF driver to specify if it wants to change it's hardware address. New bits are being added to virtchnl.h in struct virtchnl_ether_addr that allow for the VF to correctly communicate this information. However, legacy VF drivers that don't support the new virtchnl.h bits still need to be supported. Make a best effort attempt at saving the VF's primary/device address in the legacy case and depend on the VIRTCHNL_ETHER_ADDR_PRIMARY type for the new case. Legacy case - If a unicast MAC is being added and the hw_lan_addr.addr is empty, then populate it. This assumes that the address is the VF's hardware address. If a unicast MAC is being added and the hw_lan_addr.addr is not empty, then cache it in the legacy_last_added_umac.addr. If a unicast MAC is being deleted and it matches the hw_lan_addr.addr, then zero the hw_lan_addr.addr. Also, if the legacy_last_added_umac.addr has not expired, copy the legacy_last_added_umac.addr into the hw_lan_addr.addr. This is done because we cannot guarantee the order of VIRTCHNL_OP_ADD_ETH_ADDR and VIRTCHNL_OP_DEL_ETH_ADDR. New case - If a unicast MAC is being added and it's specified as VIRTCHNL_ETHER_ADDR_PRIMARY, then replace the current hw_lan_addr.addr. If a unicast MAC is being deleted and it's type is specified as VIRTCHNL_ETHER_ADDR_PRIMARY, then zero the hw_lan_addr.addr. Untrusted VFs - Only allow above legacy/new changes to their hardware address if the PF has not set it administratively via iproute2. Trusted VFs - Always allow above legacy/new changes to their hardware address even if the PF has administratively set it via iproute2. Also, change the variable dflt_lan_addr to hw_lan_addr to clearly represent the purpose of this variable since it's purpose is to act as a hardware programmed MAC address for the VF. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-07 08:30:58 -07:00
Dave Ertman	f9f83202b7	ice: Allow all LLDP packets from PF to Tx Currently in the ice driver, the check whether to allow a LLDP packet to egress the interface from the PF_VSI is being based on the SKB's priority field. It checks to see if the packets priority is equal to TC_PRIO_CONTROL. Injected LLDP packets do not always meet this condition. SCAPY defaults to a sk_buff->protocol value of ETH_P_ALL (0x0003) and does not set the priority field. There will be other injection methods (even ones used by end users) that will not correctly configure the socket so that SKB fields are correctly populated. Then ethernet header has to have to correct value for the protocol though. Add a check to also allow packets whose ethhdr->h_proto matches ETH_P_LLDP (0x88CC). Fixes: `0c3a6101ff` ("ice: Allow egress control packets from PF_VSI") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-04 07:37:48 -07:00
Paul Greenwalt	5cd349c349	ice: report supported and advertised autoneg using PHY capabilities Ethtool incorrectly reported supported and advertised auto-negotiation settings for a backplane PHY image which did not support auto-negotiation. This can occur when using media or PHY type for reporting ethtool supported and advertised auto-negotiation settings. Remove setting supported and advertised auto-negotiation settings based on PHY type in ice_phy_type_to_ethtool(), and MAC type in ice_get_link_ksettings(). Ethtool supported and advertised auto-negotiation settings should be based on the PHY image using the AQ command get PHY capabilities with media. Add setting supported and advertised auto-negotiation settings based get PHY capabilities with media in ice_get_link_ksettings(). Fixes: `48cb27f2fd` ("ice: Implement handlers for ethtool PHY/link operations") Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-04 07:37:48 -07:00
Haiyue Wang	c7ee6ce1cf	ice: handle the VF VSI rebuild failure VSI rebuild can be failed for LAN queue config, then the VF's VSI will be NULL, the VF reset should be stopped with the VF entering into the disable state. Fixes: `12bb018c53` ("ice: Refactor VF reset") Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-04 07:37:48 -07:00
Brett Creeley	8679f07a99	ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared Some AVF drivers expect the VF_MBX_ATQLEN register to be cleared for any type of VFR/VFLR. Fix this by clearing the VF_MBX_ATQLEN register at the same time as VF_MBX_ARQLEN. Fixes: `82ba01282c` ("ice: clear VF ARQLEN register on reset") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-04 07:37:48 -07:00
Brett Creeley	f0457690af	ice: Fix allowing VF to request more/less queues via virtchnl Commit `12bb018c53` ("ice: Refactor VF reset") caused a regression that removes the ability for a VF to request a different amount of queues via VIRTCHNL_OP_REQUEST_QUEUES. This prevents VF drivers to either increase or decrease the number of queue pairs they are allocated. Fix this by using the variable vf->num_req_qs when determining the vf->num_vf_qs during VF VSI creation. Fixes: `12bb018c53` ("ice: Refactor VF reset") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-04 07:37:48 -07:00
Maciej Fijalkowski	e102db780e	ice: track AF_XDP ZC enabled queues in bitmap Commit `c7a219048e` ("ice: Remove xsk_buff_pool from VSI structure") silently introduced a regression and broke the Tx side of AF_XDP in copy mode. xsk_pool on ice_ring is set only based on the existence of the XDP prog on the VSI which in turn picks ice_clean_tx_irq_zc to be executed. That is not something that should happen for copy mode as it should use the regular data path ice_clean_tx_irq. This results in a following splat when xdpsock is run in txonly or l2fwd scenarios in copy mode: <snip> [ 106.050195] BUG: kernel NULL pointer dereference, address: 0000000000000030 [ 106.057269] #PF: supervisor read access in kernel mode [ 106.062493] #PF: error_code(0x0000) - not-present page [ 106.067709] PGD 0 P4D 0 [ 106.070293] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 106.074721] CPU: 61 PID: 0 Comm: swapper/61 Not tainted 5.12.0-rc2+ #45 [ 106.081436] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [ 106.092027] RIP: 0010:xp_raw_get_dma+0x36/0x50 [ 106.096551] Code: 74 14 48 b8 ff ff ff ff ff ff 00 00 48 21 f0 48 c1 ee 30 48 01 c6 48 8b 87 90 00 00 00 48 89 f2 81 e6 ff 0f 00 00 48 c1 ea 0c <48> 8b 04 d0 48 83 e0 fe 48 01 f0 c3 66 66 2e 0f 1f 84 00 00 00 00 [ 106.115588] RSP: 0018:ffffc9000d694e50 EFLAGS: 00010206 [ 106.120893] RAX: 0000000000000000 RBX: ffff88984b8c8a00 RCX: ffff889852581800 [ 106.128137] RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff88984cd8b800 [ 106.135383] RBP: ffff888123b50001 R08: ffff889896800000 R09: 0000000000000800 [ 106.142628] R10: 0000000000000000 R11: ffffffff826060c0 R12: 00000000000000ff [ 106.149872] R13: 0000000000000000 R14: 0000000000000040 R15: ffff888123b50018 [ 106.157117] FS: 0000000000000000(0000) GS:ffff8897e0f40000(0000) knlGS:0000000000000000 [ 106.165332] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 106.171163] CR2: 0000000000000030 CR3: 000000000560a004 CR4: 00000000007706e0 [ 106.178408] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 106.185653] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 106.192898] PKRU: 55555554 [ 106.195653] Call Trace: [ 106.198143] <IRQ> [ 106.200196] ice_clean_tx_irq_zc+0x183/0x2a0 [ice] [ 106.205087] ice_napi_poll+0x3e/0x590 [ice] [ 106.209356] __napi_poll+0x2a/0x160 [ 106.212911] net_rx_action+0xd6/0x200 [ 106.216634] __do_softirq+0xbf/0x29b [ 106.220274] irq_exit_rcu+0x88/0xc0 [ 106.223819] common_interrupt+0x7b/0xa0 [ 106.227719] </IRQ> [ 106.229857] asm_common_interrupt+0x1e/0x40 </snip> Fix this by introducing the bitmap of queues that are zero-copy enabled, where each bit, corresponding to a queue id that xsk pool is being configured on, will be set/cleared within ice_xsk_pool_{en,dis}able and checked within ice_xsk_pool(). The latter is a function used for deciding which napi poll routine is executed. Idea is being taken from our other drivers such as i40e and ixgbe. Fixes: `c7a219048e` ("ice: Remove xsk_buff_pool from VSI structure") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-03 08:38:37 -07:00
Magnus Karlsson	89d65df024	ice: add correct exception tracing for XDP Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: `efc2214b60` ("ice: Add support for XDP") Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-06-03 08:38:37 -07:00
Dave Ertman	f9f5301e7e	ice: Register auxiliary device to provide RDMA Register ice client auxiliary RDMA device on the auxiliary bus per PCIe device function for the auxiliary driver (irdma) to attach to. It allows to realize a single RDMA driver (irdma) capable of working with multiple netdev drivers over multi-generation Intel HW supporting RDMA. There is no load ordering dependencies between ice and irdma. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-05-28 20:11:13 -07:00
Dave Ertman	348048e724	ice: Implement iidc operations Add implementations for supporting iidc operations for device operation such as allocation of resources and event notifications. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-05-28 20:11:13 -07:00
Dave Ertman	d25a0fc41c	ice: Initialize RDMA support Probe the device's capabilities to see if it supports RDMA. If so, allocate and reserve resources to support its operation; populate structures with initial values. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-05-28 20:11:13 -07:00
Qi Zhang	ddd1f3cfed	ice: Support RSS configure removal for AVF Add the handler for virtchnl message VIRTCHNL_OP_DEL_RSS_CFG to remove an existing RSS configuration with matching hashed fields. Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com> Co-developed-by: Jia Guo <jia.guo@intel.com> Signed-off-by: Jia Guo <jia.guo@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Bo Chen <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-22 09:26:22 -07:00
Qi Zhang	222a8ab016	ice: Enable RSS configure for AVF Currently, RSS hash input is not available to AVF by ethtool, it is set by the PF directly. Add the RSS configure support for AVF through new virtchnl message, and define the capability flag VIRTCHNL_VF_OFFLOAD_ADV_RSS_PF to query this new RSS offload support. Signed-off-by: Jia Guo <jia.guo@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Bo Chen <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-22 09:26:22 -07:00
Brett Creeley	c5afbe99b7	ice: Add helper function to get the VF's VSI Currently, the driver gets the VF's VSI by using a long string of dereferences (i.e. vf->pf->vsi[vf->lan_vsi_idx]). If the method to get the VF's VSI were to change the driver would have to change it in every location. Fix this by adding the helper ice_get_vf_vsi(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-22 09:26:22 -07:00
Colin Ian King	c9b5f681fe	ice: remove redundant assignment to pointer vsi Pointer vsi is being re-assigned a value that is never read, the assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-22 09:26:22 -07:00
Brett Creeley	142da08c4d	ice: Advertise virtchnl UDP segmentation offload capability As the hardware is capable of supporting UDP segmentation offload, add a capability bit to virtchnl.h to communicate this and have the driver advertise its support. Suggested-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-22 09:26:22 -07:00
Michal Swiatkowski	c0dcaa55f9	ice: Allow ignoring opcodes on specific VF Declare bitmap of allowed commands on VF. Initialize default opcodes list that should be always supported. Declare array of supported opcodes for each caps used in virtchnl code. Change allowed bitmap by setting or clearing corresponding bit to allowlist (bit set) or denylist (bit clear). Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-22 09:26:22 -07:00
Vignesh Sridhar	0891c89674	ice: warn about potentially malicious VFs Attempt to detect malicious VFs and, if suspected, log the information but keep going to allow the user to take any desired actions. Potentially malicious VFs are identified by checking if the VFs are transmitting too many messages via the PF-VF mailbox which could cause an overflow of this channel resulting in denial of service. This is done by creating a snapshot or static capture of the mailbox buffer which can be traversed and in which the messages sent by VFs are tracked. Co-developed-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com> Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com> Co-developed-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Co-developed-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-22 09:26:22 -07:00
Jakub Kicinski	8203c7ce4e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/ethernet/stmicro/stmmac/stmmac_main.c - keep the ZC code, drop the code related to reinit net/bridge/netfilter/ebtables.c - fix build after move to net_generic Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-04-17 11:08:07 -07:00
Paul M Stillwell Jr	4c26f69d0c	ice: reduce scope of variable The scope of this variable can be reduced so do that. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:12:18 -07:00
Paul M Stillwell Jr	4fe3622694	ice: remove return variable We were saving the return value from ice_vsi_manage_rss_lut(), but the errors from that function are not critical so change it to return void and remove the code that saved the value. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:12:17 -07:00
Bruce Allan	b370245b4b	ice: suppress false cppcheck issues Silence false errors, warnings and style issues reported by cppcheck. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:12:17 -07:00
Brett Creeley	c931c782d8	ice: Set vsi->vf_id as ICE_INVAL_VFID for non VF VSI types Currently the vsi->vf_id is set only for ICE_VSI_VF and it's left as 0 for all other VSI types. This is confusing and could be problematic since 0 is a valid vf_id. Fix this by always setting non VF VSI types to ICE_INVAL_VFID. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:12:05 -07:00
Jesse Brandeburg	1cdea9a7ea	ice: remove unused struct member The only time you can ever have a rq_last_status is if a firmware event was somehow reporting a status on the receive queue, which are generally firmware initiated events or mailbox messages from a VF. Mostly this struct member was unused. Fix this problem by still printing the value of the field in a debug print, but don't store the value forever in a struct, potentially creating opportunities for callers to use the wrong struct member. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:06 -07:00
Jesse Brandeburg	58623c52b4	ice: use local for consistency Do a minor refactor on ice_vsi_rebuild to use a local variable to store vsi->type. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:06 -07:00
Jesse Brandeburg	80ad6dde61	ice: print name in /proc/iomem The driver previously printed it's PCI address in the name field for the pci resource, which when displayed via /proc/iomem, would print the same thing twice. It's more useful for debugging to see the driver name, as most other modules do. Here's a diff of before and after this change: 99100000-991fffff : 0000:3b:00.1 9a000000-a04fffff : PCI Bus 0000:3b 9a000000-9bffffff : 0000:3b:00.1 - 9a000000-9bffffff : 0000:3b:00.1 + 9a000000-9bffffff : ice 9c000000-9dffffff : 0000:3b:00.0 - 9c000000-9dffffff : 0000:3b:00.0 + 9c000000-9dffffff : ice 9e000000-9effffff : 0000:3b:00.1 9f000000-9fffffff : 0000:3b:00.0 a0000000-a000ffff : 0000:3b:00.1 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:06 -07:00
Scott W Taylor	e9c9692c8a	ice: Reimplement module reads used by ethtool There was an excessive increment of the QSFP page, which is now fixed. Additionally, this new update now reads 8 bytes at a time and will retry each request if the module/bus is busy. Also, prevent reading from upper pages if module does not support those pages. Signed-off-by: Scott W Taylor <scott.w.taylor@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:06 -07:00
Jesse Brandeburg	d59684a07e	ice: refactor ITR data structures Use a dedicated bitfield in order to both increase the amount of checking around the length of ITR writes as well as simplify the checks of dynamic mode. Basically unpack the "high bit means dynamic" logic into bitfields. Also, remove some unused ITR defines. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:06 -07:00
Jesse Brandeburg	b7306b42be	ice: manage interrupts during poll exit The driver would occasionally miss that there were outstanding descriptors to clean when exiting busy/napi poll. This issue has been in the code since the introduction of the ice driver. Attempt to "catch" any remaining work by triggering a software interrupt when exiting napi poll or busy-poll. This will not cause extra interrupts in the case of normal execution. This issue was found when running sfnt-pingpong, with busy poll enabled, and typically with larger I/O sizes like > 8192, the program would occasionally report > 1 second maximums to complete a ping pong. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:05 -07:00
Jacob Keller	cdf1f1f169	ice: replace custom AIM algorithm with kernel's DIM library The ice driver has support for adaptive interrupt moderation, an algorithm for tuning the interrupt rate dynamically. This algorithm is based on various assumptions about ring size, socket buffer size, link speed, SKB overhead, ethernet frame overhead and more. The Linux kernel has support for a dynamic interrupt moderation algorithm known as "dimlib". Replace the custom driver-specific implementation of dynamic interrupt moderation with the kernel's algorithm. The Intel hardware has a different hardware implementation than the originators of the dimlib code had to work with, which requires the driver to use a slightly different set of inputs for the actual moderation values, while getting all the advice from dimlib of better/worse, shift left or right. The change made for this implementation is to use a pair of values for each of the 5 "slots" that the dimlib moderation expects, and the driver will program those pairs when dimlib recommends a slot to use. The currently implementation uses two tables, one for receive and one for transmit, and the pairs of values in each slot set the maximum delay of an interrupt and a maximum number of interrupts per second (both expressed in microseconds). There are two separate kinds of bugs fixed by using DIMLIB, one is UDP single stream send was too slow, and the other is that 8K ping-pong was going to the most aggressive moderation and has much too high latency. The overall result of using DIMLIB is that we meet or exceed our performance expectations set based on the old algorithm. Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:05 -07:00
Jesse Brandeburg	b8b4772377	ice: refactor interrupt moderation writes Introduce several new helpers for writing ITR and GLINT_RATE registers, and refactor the code calling them. This resulted in removal of several duplicate functions and rolled a bunch of simple code back into the calling routines. In particular this removes some code that was doing both a store and a set in a helper function, which seems better done as separate tasks in the caller (and generally takes less lines of code even with a tiny bit of repetition). Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:05 -07:00
Anirudh Venkataramanan	a476d72abe	ice: Add new VSI states to track netdev alloc/registration Add two new VSI states, one to track if a netdev for the VSI has been allocated and the other to track if the netdev has been registered. Call unregister_netdev/free_netdev only when the corresponding state bits are set. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:05 -07:00
Anirudh Venkataramanan	7e408e07b4	ice: Drop leading underscores in enum ice_pf_state Remove the leading underscores in enum ice_pf_state. This is not really communicating anything and is unnecessary. No functional change. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:05 -07:00
Bruce Allan	d41f26b5ef	ice: use kernel definitions for IANA protocol ports and ether-types The well-known IANA protocol port 3260 (iscsi-target 0x0cbc) and the ether-types 0x8906 (ETH_P_FCOE) and 0x8914 (ETH_P_FIP) are already defined in kernel header files. Use those definitions instead of open-coding the same. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-14 17:00:05 -07:00
Colin Ian King	ef963ae427	ice: Fix potential infinite loop when using u8 loop counter A for-loop is using a u8 loop counter that is being compared to a u32 cmp_dcbcfg->numapp to check for the end of the loop. If cmp_dcbcfg->numapp is larger than 255 then the counter j will wrap around to zero and hence an infinite loop occurs. Fix this by making counter j the same type as cmp_dcbcfg->numapp. Addresses-Coverity: ("Infinite loop") Fixes: `aeac8ce864` ("ice: Recognize 860 as iSCSI port in CEE mode") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-13 11:50:38 -07:00
Jakub Kicinski	8859a44ea0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts: MAINTAINERS - keep Chandrasekar drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine include/linux/bpf.h - trivial include/linux/ethtool.h - trivial, fix kdoc while at it include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped net/core/skmsg.c - add the sk = sk // sk = NULL around calls net/tipc/crypto.c - trivial Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-04-09 20:48:35 -07:00
Yongxin Liu	1831da7ea5	ice: fix memory leak of aRFS after resuming from suspend In ice_suspend(), ice_clear_interrupt_scheme() is called, and then irq_free_descs() will be eventually called to free irq and its descriptor. In ice_resume(), ice_init_interrupt_scheme() is called to allocate new irqs. However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap maybe cannot be freed, if the irqs that released in ice_suspend() were reassigned to other devices, which makes irq descriptor's affinity_notify lost. So call ice_free_cpu_rx_rmap() before ice_clear_interrupt_scheme(), which can make sure all irq_glue and cpu_rmap can be correctly released before corresponding irq and descriptor are released. Fix the following memory leak. unreferenced object 0xffff95bd951afc00 (size 512): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff ........p....... 00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff ................ backtrace: [<0000000072e4b914>] __kmalloc+0x336/0x540 [<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0 [<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 unreferenced object 0xffff95bd81b0a2a0 (size 96): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 8............... b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff ................ backtrace: [<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0 [<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0 [<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 Fixes: `769c500dcc` ("ice: Add advanced power mgmt for WoL") Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-08 10:21:37 -07:00
Tony Nguyen	2e20521b80	ice: Remove unnecessary blank line Checkpatch reports the following, fix it. ----------------------------------------- drivers/net/ethernet/intel/ice/ice_main.c ----------------------------------------- CHECK:BRACES: Blank lines aren't necessary before a close brace '}' FILE: drivers/net/ethernet/intel/ice/ice_main.c:455: + +} Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>	2021-04-07 17:09:16 -07:00
Brett Creeley	771015b90b	ice: Remove unnecessary checks in add/kill_vid ndo ops Currently the driver is doing two unnecessary checks. First both ops are checking if the VLAN ID passed in is less than VLAN_N_VID and second both ops are checking to see if a port VLAN is configured on the VSI. The first check is already handled by the 8021q driver so this is an unnecessary check. The second check is unnecessary because the PF VSI is never put into a port VLAN. Remove these checks. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:16 -07:00
Anirudh Venkataramanan	51fe27e179	ice: Remove rx_gro_dropped stat Tracking of the rx_gro_dropped statistic was removed in commit `f73fc40327` ("ice: drop dead code in ice_receive_skb()"). Remove the associated variables and its reporting to ethtool stats. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:16 -07:00
Anirudh Venkataramanan	efc1eddb28	ice: Use local variable instead of pointer derefs Replace multiple instances of vsi->back and pi->phy with equivalent local variables Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:16 -07:00
Anirudh Venkataramanan	dc6aaa139f	ice: Remove unnecessary variable In ice_init_phy_user_cfg, vsi is used only to get to hw. Remove this and just use pi->hw Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:16 -07:00
Jeb Cramer	75751c80d6	ice: Limit forced overrides based on FW version Beyond a specific version of firmware, there is no need to provide override values to the firmware when setting PHY capabilities. In this case, we do not need to indicate whether we're in Strict or Lenient Link Mode. In the case of translating capabilities to the configuration structure, the module compliance enforcement is already correctly set by firmware, so the extra code block is redundant. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:16 -07:00
Anirudh Venkataramanan	0a02944fea	ice: Use default configuration mode for PHY configuration Recent firmware supports a new "get PHY capabilities" mode ICE_AQC_REPORT_DFLT_CFG which makes it unnecessary for the driver to track and apply NVM based default link overrides. If FW AQ API version supports it, use Report Default Configuration. Add check function for Report Default Configuration support and update accordingly. Also change adv_phy_type_[lo\|hi] to advert_phy_type[lo\|hi] for clarity. Co-developed-by: Mateusz Pacuszka <mateuszx.pacuszka@intel.com> Signed-off-by: Mateusz Pacuszka <mateuszx.pacuszka@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:15 -07:00
Anirudh Venkataramanan	178a666daa	ice: Replace some memsets and memcpys with assignment In ice_set_link_ksettings, use assignment instead of memset/memcpy where possible Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:15 -07:00
Anirudh Venkataramanan	450f10e794	ice: Fix error return codes in ice_set_link_ksettings Return more appropriate error codes so that the right error message is communicated to the user by ethtool. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:15 -07:00
Anirudh Venkataramanan	0be39bb4c7	ice: Rename a couple of variables In ice_set_link_ksettings, change 'abilities' to 'phy_caps' and 'p' to 'pi'. This is more consistent with similar usages elsewhere in the driver. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:15 -07:00
Anirudh Venkataramanan	fd3dc1655e	ice: Remove unnecessary checker loop The loop checking for PF VSI doesn't make any sense. The VSI type backing the netdev passed to ice_set_link_ksettings will always be of type ICE_PF_VSI. Remove it. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:15 -07:00
Anirudh Venkataramanan	d348d51771	ice: Ignore EMODE return for opcode 0x0605 When link is owned by manageability, the driver is not allowed to fiddle with link. FW returns ICE_AQ_RC_EMODE if the driver attempts to do so. This patch adds a new function ice_set_link which abstracts the call to ice_aq_set_link_restart_an and provides a clean way to turn on/off link. While making this change, I also spotted that an int variable was being used to hold both an ice_status return code and the Linux errno return code. This pattern more often than not results in the driver inadvertently returning ice_status back to kernel which is a major boo-boo. Clean it up. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:15 -07:00
Anirudh Venkataramanan	d6730a871e	ice: Align macro names to the specification For get PHY abilities AQ, the specification defines "report modes" as "with media", "without media" and "active configuration". For clarity, rename macros to align with the specification. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:15 -07:00
Victor Raj	7fb09a7375	ice: Modify recursive way of adding nodes Remove the recursive way of adding the nodes to the layer in order to reduce the stack usage. Instead the algorithm is modified to use a while loop. The previous code was scanning recursively the nodes horizontally. The total stack consumption will be based on number of nodes present on that layer. In some cases it can consume more stack. Signed-off-by: Victor Raj <victor.raj@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:15 -07:00
Chinh T Cao	3056df93f7	ice: Re-send some AQ commands, as result of EBUSY AQ error Retry sending some AQ commands, as result of EBUSY AQ error. ice_aqc_opc_get_link_topo ice_aqc_opc_lldp_stop ice_aqc_opc_lldp_start ice_aqc_opc_lldp_filter_ctrl This change follows the latest guidelines from HW team. It is better to retry the same AQ command several times, as the result of EBUSY, instead of returning error to the caller right away. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-04-07 17:09:15 -07:00
Tony Nguyen	a07cc1786d	ice: Correct comment block style The following is reported by checkpatch, correct it. ----------------------------------------------- drivers/net/ethernet/intel/ice/ice_adminq_cmd.h ----------------------------------------------- WARNING:NETWORKING_BLOCK_COMMENT_STYLE: networking block comments don't use an empty /* line, use /* Comment... FILE: drivers/net/ethernet/intel/ice/ice_adminq_cmd.h:1428: +/* + * Send to PF command (indirect 0x0801) ID is only used by PF Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>	2021-03-31 14:21:28 -07:00
Bruce Allan	0c3e94c247	ice: cleanup style issues A few style issues reported by checkpatch have snuck into the code; resolve the style issues. COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:28 -07:00
Anirudh Venkataramanan	e97fb1aea9	ice: Consolidate VSI state and flags struct ice_vsi has two fields, state and flags which seem to be serving the same purpose. Consolidate them into one field 'state'. enum ice_state is used to represent state information of the PF. While some of these enum values can be use to represent VSI state, it makes more sense to represent VSI state with its own enum. So derive a new enum ice_vsi_state from ice_vsi_flags and ice_state and use it. Also rename enum ice_state to ice_pf_state for clarity. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:28 -07:00
Brett Creeley	b66a972abb	ice: Refactor ice_set/get_rss into LUT and key specific functions Currently ice_set/get_rss are used to set/get the RSS LUT and/or RSS key. However nearly everywhere these functions are called only the LUT or key are set/get. Also, making this change reduces how many things ice_set/get_rss are doing. Fix this by adding ice_set/get_rss_lut and ice_set/get_rss_key functions. Also, consolidate all calls for setting/getting the RSS LUT and RSS Key to use ice_set/get_rss_lut() and ice_set/get_rss_key(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:28 -07:00
Brett Creeley	e3c53928a3	ice: Refactor get/set RSS LUT to use struct parameter Update ice_aq_get_rss_lut() and ice_aq_set_rss_lut() to take a new structure ice_aq_get_set_rss_params instead of passing individual parameters. This is done for 2 reasons: 1. Reduce the number of parameters passed to the functions. 2. Reduce the amount of change required if the arguments ever need to be updated in the future. Also, reduce duplicate code that was checking for an invalid vsi_handle and lut parameter by moving the checks to the lower level __ice_aq_get_set_rss_lut(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:28 -07:00
Brett Creeley	8134d5ff97	ice: Change ice_vsi_setup_q_map() to not depend on RSS Currently, ice_vsi_setup_q_map() depends on the VSI's rss_size. However, the Rx Queue Mapping section of the VSI context has no dependency on RSS. Instead, limit the maximum number of Rx queues per TC based on the Rx Queue mapping section of the VSI context, which currently allows for up to 256 Rx queues per TC. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:28 -07:00
Qi Zhang	94a936981a	ice: rename ptype bitmap Align all ptype bitmap to follow ice_ptypes_xxx prefix. Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:28 -07:00
Bruce Allan	36ac7911fa	ice: correct memory allocation call Use malloc() instead of calloc() when allocating only a single object as opposed to an array of objects. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:28 -07:00
Anirudh Venkataramanan	805f980bfe	ice: Check for bail out condition early Check for bail out condition before calling ice_aq_sff_eeprom Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:28 -07:00
Bruce Allan	800c1443cb	ice: remove unnecessary duplicated AQ command flag setting Commit `a012dca9f7` ("ice: add ethtool -m support for reading i2c eeprom modules") unnecessarily added the ICE_AQ_FLAG_BUF flag to the descriptor when sending the indirect Read/Write SFF EEPROM AQ command. The flag is already added later in the code flow for all indirect AQ commands, i.e. commands that provide an additional data buffer. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:27 -07:00
Paul Greenwalt	5c57145a49	ice: change link misconfiguration message Change link misconfiguration message since the configuration could be intended by the user. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:27 -07:00
Paul M Stillwell Jr	2ec5638559	ice: handle increasing Tx or Rx ring sizes There is an issue when the Tx or Rx ring size increases using 'ethtool -L ...' where the new rings don't get the correct ITR values because when we rebuild the VSI we don't know that some of the rings may be new. Fix this by looking at the original number of rings and determining if the rings in ice_vsi_rebuild_set_coalesce() were not present in the original rings received in ice_vsi_rebuild_get_coalesce(). Also change the code to return an error if we can't allocate memory for the coalesce data in ice_vsi_rebuild(). Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:27 -07:00
Dan Nowlin	a05983c3d0	ice: Update to use package info from ice segment There are two package versions in the package binary. Today, these two version numbers are the same. However, in the future that may change. Update code to use the package info from the ice segment metadata section, which is the package information that is actually downloaded to the firmware during the download package process. Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:27 -07:00
Anirudh Venkataramanan	1e23f076b2	ice: Delay netdev registration Once a netdev is registered, the corresponding network interface can be immediately used by userspace utilities (like say NetworkManager). This can be problematic if the driver technically isn't fully up yet. Move netdev registration to the end of probe, as by this time the driver data structures and device will be initialized as expected. However, delaying netdev registration causes a failure in the aRFS flow where netdev->reg_state == NETREG_REGISTERED condition is checked. It's not clear why this check was added to begin with, so remove it. Local testing didn't indicate any issues with this change. The state bit check in ice_open was put in as a stop-gap measure to prevent a premature interface up operation. This is no longer needed, so remove it. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:27 -07:00
Benita Bose	634da4c118	ice: Add Support for XPS Enable and configure XPS. The driver code implemented sets up the Transmit Packet Steering Map, which in turn will be used by the kernel in queue selection during Tx. Signed-off-by: Benita Bose <benita.bose@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-31 14:21:27 -07:00
Robert Malz	b7eeb52721	ice: Cleanup fltr list in case of allocation issues When ice_remove_vsi_lkup_fltr is called, by calling ice_add_to_vsi_fltr_list local copy of vsi filter list is created. If any issues during creation of vsi filter list occurs it up for the caller to free already allocated memory. This patch ensures proper memory deallocation in these cases. Fixes: `80d144c9ac` ("ice: Refactor switch rule management structures and functions") Signed-off-by: Robert Malz <robertx.malz@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-29 10:37:49 -07:00
Anirudh Venkataramanan	3176551979	ice: Use port number instead of PF ID for WoL As per the spec, the WoL control word read from the NVM should be interpreted as port numbers, and not PF numbers. So when checking if WoL supported, use the port number instead of the PF ID. Also, ice_is_wol_supported doesn't really need a pointer to the pf struct, but just needs a pointer to the hw instance. Fixes: `769c500dcc` ("ice: Add advanced power mgmt for WoL") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-29 10:37:49 -07:00
Jacek Bułatek	7a91d3f02b	ice: Fix for dereference of NULL pointer Add handling of allocation fault for ice_vsi_list_map_info. Also *fi should not be NULL pointer, it is a reference to raw data field, so remove this variable and use the reference directly. Fixes: `9daf8208dd` ("ice: Add support for switch filter programming") Signed-off-by: Jacek Bułatek <jacekx.bulatek@intel.com> Co-developed-by: Haiyue Wang <haiyue.wang@intel.com> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-29 10:37:49 -07:00
Dave Ertman	741b7b743b	ice: remove DCBNL_DEVRESET bit from PF state The original purpose of the ICE_DCBNL_DEVRESET was to protect the driver during DCBNL device resets. But, the flow for DCBNL device resets now consists of only calls up the stack such as dev_close() and dev_open() that will result in NDO calls to the driver. These will be handled with state changes from the stack. Also, there is a problem of the dev_close and dev_open being blocked by checks for reset in progress also using the ICE_DCBNL_DEVRESET bit. Since the ICE_DCBNL_DEVRESET bit is not necessary for protecting the driver from DCBNL device resets and it is actually blocking changes coming from the DCBNL interface, remove the bit from the PF state and don't block driver function based on DCBNL reset in progress. Fixes: `b94b013eb6` ("ice: Implement DCBNL support") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-29 10:37:19 -07:00
Bruce Allan	59df14f9cc	ice: fix memory allocation call Fix the order of number of array members and member size parameters in a *calloc() call. Fixes: `b3c3890489` ("ice: avoid unnecessary single-member variable-length structs") Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-29 10:11:55 -07:00
Krzysztof Goreczny	e95fc8573e	ice: prevent ice_open and ice_stop during reset There is a possibility of race between ice_open or ice_stop calls performed by OS and reset handling routine both trying to modify VSI resources. Observed scenarios: - reset handler deallocates memory in ice_vsi_free_arrays and ice_open tries to access it in ice_vsi_cfg_txq leading to driver crash - reset handler deallocates memory in ice_vsi_free_arrays and ice_close tries to access it in ice_down leading to driver crash - reset handler clears port scheduler topology and sets port state to ICE_SCHED_PORT_STATE_INIT leading to ice_ena_vsi_txq fail in ice_open To prevent this additional checks in ice_open and ice_stop are introduced to make sure that OS is not allowed to alter VSI config while reset is in progress. Fixes: `cdedef59de` ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Krzysztof Goreczny <krzysztof.goreczny@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-29 10:11:55 -07:00
Chinh T Cao	aeac8ce864	ice: Recognize 860 as iSCSI port in CEE mode iSCSI can use both TCP ports 860 and 3260. However, in our current implementation, the ice_aqc_opc_get_cee_dcb_cfg (0x0A07) AQ command doesn't provide a way to communicate the protocol port number to the AQ's caller. Thus, we assume that 3260 is the iSCSI port number at the AQ's caller layer. Rely on the dcbx-willing mode, desired QoS and remote QoS configuration to determine which port number that iSCSI will use. Fixes: `0ebd3ff13c` ("ice: Add code for DCB initialization part 2/4") Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-29 10:11:55 -07:00
Fabio Pricoco	f88c529ac7	ice: Increase control queue timeout 250 msec timeout is insufficient for some AQ commands. Advice from FW team was to increase the timeout. Increase to 1 second. Fixes: `7ec59eeac8` ("ice: Add support for control queues") Signed-off-by: Fabio Pricoco <fabio.pricoco@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-29 10:11:55 -07:00
Anirudh Venkataramanan	08771bce33	ice: Continue probe on link/PHY errors An incorrect NVM update procedure can result in the driver failing probe. In this case, the recommended resolution method is to update the NVM using the right procedure. However, if the driver fails probe, the user will not be able to update the NVM. So do not fail probe on link/PHY errors. Fixes: `1a3571b593` ("ice: restore PHY settings on media insertion") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-29 10:11:55 -07:00
David S. Miller	241949e488	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-03-24 The following pull-request contains BPF updates for your net-next tree. We've added 37 non-merge commits during the last 15 day(s) which contain a total of 65 files changed, 3200 insertions(+), 738 deletions(-). The main changes are: 1) Static linking of multiple BPF ELF files, from Andrii. 2) Move drop error path to devmap for XDP_REDIRECT, from Lorenzo. 3) Spelling fixes from various folks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 16:30:46 -07:00
David S. Miller	efd13b71a3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 15:31:22 -07:00
Gustavo A. R. Silva	9ded647a51	ice: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of just letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-23 11:34:02 -07:00
Tony Nguyen	ef860480ea	ice: Fix prototype warnings Correct reported warnings for "warning: expecting prototype for ... Prototype was for ... instead" Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>	2021-03-23 11:34:02 -07:00
Qi Zhang	d6218317e2	ice: Check FDIR program status for AVF Enable returning FDIR completion status by checking the ctrl_vsi Rx queue descriptor value. To enable returning FDIR completion status from ctrl_vsi Rx queue, COMP_Queue and COMP_Report of FDIR filter programming descriptor needs to be properly configured. After program request sent to ctrl_vsi Tx queue, ctrl_vsi Rx queue interrupt will be triggered and completion status will be returned. Driver will first issue request in ice_vc_fdir_add_fltr(), then pass FDIR context to the background task in interrupt service routine ice_vc_fdir_irq_handler() and finally deal with them in ice_flush_fdir_ctx(). ice_flush_fdir_ctx() will check the descriptor's value, fdir context, and then send back virtual channel message to VF by calling ice_vc_add_fdir_fltr_post(). An additional timer will be setup in case of hardware interrupt timeout. Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	213528fed2	ice: Add more FDIR filter type for AVF FDIR for AVF can forward - L2TPV3 packets by matching session id. - IPSEC ESP packets by matching security parameter index. - IPSEC AH packets by matching security parameter index. - NAT_T ESP packets by matching security parameter index. - Any PFCP session packets(s field is 1). Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	ef9e4cc589	ice: Add GTPU FDIR filter for AVF Add new FDIR filter type to forward GTPU packets by matching TEID or QFI. The filter is only enabled when COMMS DDP package is downloaded. Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	21606584f1	ice: Add non-IP Layer2 protocol FDIR filter for AVF Add new filter type that allow forward non-IP Ethernet packets base on its ethertype. The filter is only enabled when COMMS DDP package is loaded. Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	346bf25043	ice: Add new actions support for VF FDIR Add two new actions support for VF FDIR: A passthrough action does not specify the destination queue, but just allow the packet go to next pipeline stage, a typical use cases is combined with a software mark (FDID) action. Allow specify a 2^n continuous queues as the destination of a FDIR rule. Packet distribution is based on current RSS configure. Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	0ce332fd62	ice: Add FDIR pattern action parser for VF Add basic FDIR flow list and pattern / action parse functions for VF. Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	1f7ea1cd6a	ice: Enable FDIR Configure for AVF The virtual channel is going to be extended to support FDIR and RSS configure from AVF. New data structures and OP codes will be added, the patch enable the FDIR part. To support above advanced AVF feature, we need to figure out what kind of data structure should be passed from VF to PF to describe an FDIR rule or RSS config rule. The common part of the requirement is we need a data structure to represent the input set selection of a rule's hash key. An input set selection is a group of fields be selected from one or more network protocol layers that could be identified as a specific flow. For example, select dst IP address from an IPv4 header combined with dst port from the TCP header as the input set for an IPv4/TCP flow. The patch adds a new data structure virtchnl_proto_hdrs to abstract a network protocol headers group which is composed of layers of network protocol header(virtchnl_proto_hdr). A protocol header contains a 32 bits mask (field_selector) to describe which fields are selected as input sets, as well as a header type (enum virtchnl_proto_hdr_type). Each bit is mapped to a field in enum virtchnl_proto_hdr_field guided by its header type. +------------+-----------+------------------------------+ \| \| Proto Hdr \| Header Type A \| \| \| +------------------------------+ \| \| \| BIT 31 \| ... \| BIT 1 \| BIT 0 \| \| \|-----------+------------------------------+ \|Proto Hdrs \| Proto Hdr \| Header Type B \| \| \| +------------------------------+ \| \| \| BIT 31 \| ... \| BIT 1 \| BIT 0 \| \| \|-----------+------------------------------+ \| \| Proto Hdr \| Header Type C \| \| \| +------------------------------+ \| \| \| BIT 31 \| ... \| BIT 1 \| BIT 0 \| \| \|-----------+------------------------------+ \| \| .... \| +-------------------------------------------------------+ All fields in enum virtchnl_proto_hdr_fields are grouped with header type and the value of the first field of a header type is always 32 aligned. enum proto_hdr_type { header_type_A = 0; header_type_B = 1; .... } enum proto_hdr_field { /* header type A / header_A_field_0 = 0, header_A_field_1 = 1, header_A_field_2 = 2, header_A_field_3 = 3, / header type B */ header_B_field_0 = 32, // = header_type_B << 5 header_B_field_0 = 33, header_B_field_0 = 34 header_B_field_0 = 35, .... }; So we have: proto_hdr_type = proto_hdr_field / 32 bit offset = proto_hdr_field % 32 To simply the protocol header's operations, couple help macros are added. For example, to select src IP and dst port as input set for an IPv4/UDP flow. we have: struct virtchnl_proto_hdr hdr[2]; VIRTCHNL_SET_PROTO_HDR_TYPE(&hdr[0], IPV4) VIRTCHNL_ADD_PROTO_HDR_FIELD(&hdr[0], IPV4, SRC) VIRTCHNL_SET_PROTO_HDR_TYPE(&hdr[1], UDP) VIRTCHNL_ADD_PROTO_HDR_FIELD(&hdr[1], UDP, DST) The byte array is used to store the protocol header of a training package. The byte array must be network order. The patch added virtual channel support for iAVF FDIR add/validate/delete filter. iAVF FDIR is Flow Director for Intel Adaptive Virtual Function which can direct Ethernet packets to the queues of the Network Interface Card. Add/delete command is adding or deleting one rule for each virtual channel message, while validate command is just verifying if this rule is valid without any other operations. To add or delete one rule, driver needs to config TCAM and Profile, build training packets which contains the input set value, and send the training packets through FDIR Tx queue. In addition, driver needs to manage the software context to avoid adding duplicated rules, deleting non-existent rule, input set conflicts and other invalid cases. NOTE: Supported pattern/actions and their parse functions are not be included in this patch, they will be added in a separate one. Signed-off-by: Jeff Guo <jia.guo@intel.com> Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Simei Su <simei.su@intel.com> Signed-off-by: Beilei Xing <beilei.xing@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	da62c5ff9d	ice: Add support for per VF ctrl VSI enabling We are going to enable FDIR configure for AVF through virtual channel. The first step is to add helper functions to support control VSI setup. A control VSI will be allocated for a VF when AVF creates its first FDIR rule through ice_vf_ctrl_vsi_setup(). The patch will also allocate FDIR rule space for VF's control VSI. If a VF asks for flow director rules, then those should come entirely from the best effort pool and not from the guaranteed pool. The patch allow a VF VSI to have only space in the best effort rules. Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com> Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	7012dfd1af	ice: Enhanced IPv4 and IPv6 flow filter Separate IPv4 and IPv6 ptype bit mask table into 2 tables: with or without L4 protocols. When a flow filter without any l4 type is specified, the ICE_FLOW_SEG_HDR_IPV_OTHER flag can be used to describe if user want to create a IP rule target for all IP packet or just IP packet without l4 header. Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	cbad5db88a	ice: Support to separate GTP-U uplink and downlink To apply different input set for GTP-U packet with or without extend header as well as GTP-U uplink and downlink, we need to add TCAM mask matching capability. This allows comprehending different PTYPE attributes by examining flags from the parser. Using this method, different profiles can be used by examining flag values from the parser. Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	0577313e53	ice: Add more advanced protocol support in flow filter Add more protocol support in flow filter, these include PPPoE, L2TPv3, GTP, PFCP, ESP and AH. Signed-off-by: Ting Xu <ting.xu@intel.com> Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:12 -07:00
Qi Zhang	b199dddbd3	ice: Support non word aligned input set field To support FDIR input set with protocol field like DSCP, TTL, PROT, etc. which is not word aligned, we need to enable field vector masking. Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:11 -07:00
Qi Zhang	390bd14180	ice: Add more basic protocol support for flow filter Add more protocol and field support for flow filter include: ETH, VLAN, ICMP, ARP and TCP flag. Signed-off-by: Kevin Scott <kevin.c.scott@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-22 11:32:11 -07:00
Lorenzo Bianconi	fdc13979f9	bpf, devmap: Move drop error path to devmap for XDP_REDIRECT We want to change the current ndo_xdp_xmit drop semantics because it will allow us to implement better queue overflow handling. This is working towards the larger goal of a XDP TX queue-hook. Move XDP_REDIRECT error path handling from each XDP ethernet driver to devmap code. According to the new APIs, the driver running the ndo_xdp_xmit pointer, will break tx loop whenever the hw reports a tx error and it will just return to devmap caller the number of successfully transmitted frames. It will be devmap responsibility to free dropped frames. Move each XDP ndo_xdp_xmit capable driver to the new APIs: - veth - virtio-net - mvneta - mvpp2 - socionext - amazon ena - bnxt - freescale (dpaa2, dpaa) - xen-frontend - qede - ice - igb - ixgbe - i40e - mlx5 - ti (cpsw, cpsw-new) - tun - sfc Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/bpf/ed670de24f951cfd77590decf0229a0ad7fd12f6.1615201152.git.lorenzo@kernel.org	2021-03-18 16:38:51 +01:00
Alexander Duyck	c8d4725e98	intel: Update drivers to use ethtool_sprintf Update the Intel drivers to make use of ethtool_sprintf. The general idea is to reduce code size and overhead by replacing the repeated pattern of string printf statements and ETH_STRING_LEN counter increments. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-17 11:42:30 -07:00
Magnus Karlsson	bb52073645	ice: optimize for XDP_REDIRECT in xsk path Optimize ice_run_xdp_zc() for the XDP program verdict being XDP_REDIRECT in the xsk zero-copy path. This path is only used when having AF_XDP zero-copy on and in that case most packets will be directed to user space. This provides a little over 100k extra packets in throughput on my server when running l2fwd in xdpsock. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-15 12:04:51 -07:00
Maciej Fijalkowski	89861c485c	ice: move headroom initialization to ice_setup_rx_ctx ice_rx_offset(), that is supposed to initialize the Rx buffer headroom, relies on ICE_RX_FLAGS_RING_BUILD_SKB flag as well as XDP prog presence. Currently, the callsite of mentioned function is placed incorrectly within ice_setup_rx_ring() where Rx ring's build skb flag is not set yet. This causes the XDP_REDIRECT to be partially broken due to inability to create xdp_frame in the headroom space, as the headroom is 0. Fix this by moving ice_rx_offset() to ice_setup_rx_ctx() after the flag setting. Fixes: `f1b1f409bf` ("ice: store the result of ice_rx_offset() onto ice_ring") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-12 07:43:46 -08:00
Magnus Karlsson	ed0907e3bd	ice: fix napi work done reporting in xsk path Fix the wrong napi work done reporting in the xsk path of the ice driver. The code in the main Rx processing loop was written to assume that the buffer allocation code returns true if all allocations where successful and false if not. In contrast with all other Intel NIC xsk drivers, the ice_alloc_rx_bufs_zc() has the inverted logic messing up the work done reporting in the napi loop. This can be fixed either by inverting the return value from ice_alloc_rx_bufs_zc() in the function that uses this in an incorrect way, or by changing the return value of ice_alloc_rx_bufs_zc(). We chose the latter as it makes all the xsk allocation functions for Intel NICs behave in the same way. My guess is that it was this unexpected discrepancy that gave rise to this bug in the first place. Fixes: `5bb0c4b5eb` ("ice, xsk: Move Rx allocation out of while-loop") Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-12 07:43:46 -08:00
Henry Tieman	0393e46ac4	ice: update the number of available RSS queues It was possible to have Rx queues that were not available for use by RSS. This would happen when increasing the number of Rx queues while there was a user defined RSS LUT. Always update the number of available RSS queues when changing the number of Rx queues. Fixes: `87324e747f` ("ice: Implement ethtool ops for channels") Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-22 11:28:57 -08:00
Dave Ertman	0d4907f65d	ice: Fix state bits on LLDP mode switch DCBX_CAP bits were not being adjusted when switching between SW and FW controlled LLDP. Adjust bits to correctly indicate which mode the LLDP logic is in. Fixes: `b94b013eb6` ("ice: Implement DCBNL support") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-22 11:28:57 -08:00
Brett Creeley	a6aa7c8f99	ice: Account for port VLAN in VF max packet size calculation Currently if an AVF driver doesn't account for the possibility of a port VLAN when determining its max packet size then packets at MTU will be dropped. It is not the VF driver's responsibility to account for a port VLAN so fix this. To fix this, do the following: 1. Add a function that determines the max packet size a VF is allowed by using the port's max packet size and whether the VF is in a port VLAN. If a port VLAN is configured then a VF's max packet size will always be the port's max packet size minus VLAN_HLEN. Otherwise it will be the port's max packet size. 2. Use this function to verify the max packet size from the VF. 3. If there is a port VLAN configured then add 4 bytes (VLAN_HLEN) to the VF's max packet size configuration. Also, the VIRTCHNL_OP_GET_VF_RESOURCES message provides the capability to communicate a VF's max packet size. Use the new function for this purpose. Fixes: `1071a8358a` ("ice: Implement virtchnl commands for AVF support") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-22 11:19:15 -08:00
Brett Creeley	37b52be260	ice: Set trusted VF as default VSI when setting allmulti on Currently the PF will only set a trusted VF as the default VSI when it requests FLAG_VF_UNICAST_PROMISC over VIRTCHNL. However, when FLAG_VF_MULTICAST_PROMISC is set it's expected that the trusted VF will see multicast packets that don't have a matching destination MAC in the devices internal switch. Fix this by setting the trusted VF as the default VSI if either FLAG_VF_UNICAST_PROMISC or FLAG_VF_MULTICAST_PROMISC is set. Fixes: `01b5e89aab` ("ice: Add VF promiscuous support") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-22 11:18:39 -08:00
Dave Ertman	7dcf7aa01c	ice: report correct max number of TCs In the driver currently, we are reporting max number of TCs to the DCBNL callback as a kernel define set to 8. This is preventing userspace applications performing DCBx to correctly down map the TCs from requested to actual values. Report the actual max TC value to userspace from the capability struct. Fixes: `b94b013eb6` ("ice: Implement DCBNL support") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-22 10:22:07 -08:00
Maciej Fijalkowski	f1b1f409bf	ice: store the result of ice_rx_offset() onto ice_ring Output of ice_rx_offset() is based on ethtool's priv flag setting, which when changed, causes PF reset (disables napi, frees irqs, loads different Rx mem model, etc.). This means that within napi its result is constant and there is no reason to call it per each processed frame. Add new 'rx_offset' field to ice_ring that is meant to hold the ice_rx_offset() result and use it within ice_clean_rx_irq(). Furthermore, use it within ice_alloc_mapped_page(). Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-12 10:36:57 -08:00
Maciej Fijalkowski	5c57e507f2	ice: skip NULL check against XDP prog in ZC path Whole zero-copy variant of clean Rx IRQ is executed when xsk_pool is attached to rx_ring and it can happen only when XDP program is present on interface. Therefore it is safe to assume that program is always !NULL and there is no need for checking it in ice_run_xdp_zc. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-12 10:28:40 -08:00
Maciej Fijalkowski	43a925e49d	ice: remove redundant checks in ice_change_mtu dev_validate_mtu checks that mtu value specified by user is not less than min mtu and not greater than max allowed mtu. It is being done before calling the ndo_change_mtu exposed by driver, so remove these redundant checks in ice_change_mtu. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-12 10:20:13 -08:00
Maciej Fijalkowski	29b82f2a09	ice: move skb pointer from rx_buf to rx_ring Similar thing has been done in i40e, as there is no real need for having the sk_buff pointer in each rx_buf. Non-eop frames can be simply handled on that pointer moved upwards to rx_ring. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-12 10:18:40 -08:00
Maciej Fijalkowski	59c97d1b51	ice: simplify ice_run_xdp There's no need for 'result' variable, we can directly return the internal status based on action returned by xdp prog. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-12 10:14:30 -08:00
Tony Nguyen	741106f7bd	ice: Improve MSI-X fallback logic Currently if the driver is unable to get all the MSI-X vectors it wants, it falls back to the minimum configuration which equates to a single Tx/Rx traffic queue pair. Instead of using the minimum configuration, if given more vectors than the minimum, utilize those vectors for additional traffic queues after accounting for other interrupts. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>	2021-02-08 16:27:01 -08:00
Mitch Williams	fe6cd89050	ice: Fix trivial error message This message indicates an error on close, not open. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Bruce Allan	7a63dae0fa	ice: remove unnecessary casts Casting a void * rvalue in an assignment is unnecessary in C; remove the casts. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Chinh T Cao	fc2d1165d4	ice: Refactor DCB related variables out of the ice_port_info struct Refactor the DCB related variables out of the ice_port_info_struct. The goal is to make the ice_port_info struct cleaner. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Co-developed-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Jesse Brandeburg	1d9f7ca324	ice: fix writeback enable logic The writeback enable logic was incorrectly implemented (due to misunderstanding what the side effects of the implementation would be during polling). Fix this logic issue, while implementing a new feature allowing the user to control the writeback frequency using the knobs for controlling interrupt throttling that we already have. Basically if you leave adaptive interrupts enabled, the writeback frequency will be varied even if busy_polling or if napi-poll is in use. If the interrupt rates are set to a fixed value by ethtool -C and adaptive is off, the driver will allow the user-set interrupt rate to guide how frequently the hardware will complete descriptors to the driver. Effectively the user will get a control over the hardware efficiency, allowing the choice between immediate interrupts or delayed up to a maximum of the interrupt rate, even when interrupts are disabled during polling. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Co-developed-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Ben Shelton	4f8a14976a	ice: Use PSM clock frequency to calculate RL profiles The core clock frequency is currently hardcoded at 446 MHz for the RL profile calculations. This causes issues since not all devices use that clock frequency. Read the GLGEN_CLKSTAT_SRC register to determine which PSM clock frequency is selected. This ensures that the rate limiter profile calculations will be correct. Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Kiran Patil	b126bd6bcd	ice: create scheduler aggregator node config and move VSIs Create set scheduler aggregator node and move for VSIs into respective scheduler node. Max children per aggregator node is 64. There are two types of aggregator node(s) created. 1. dedicated node for PF and _CTRL VSIs 2. dedicated node(s) for VFs. As part of reset and rebuild, aggregator nodes are recreated and VSIs are moved to respective aggregator node. Having related VSIs in respective tree avoid starvation between PF and VF w.r.t Tx bandwidth. Co-developed-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Co-developed-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Victor Raj <victor.raj@intel.com> Co-developed-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Dave Ertman	df006dd4b1	ice: Add initial support framework for LAG Add the framework and initial implementation for receiving and processing netdev bonding events. This is only the software support and the implementation of the HW offload for bonding support will be coming at a later time. There are some architectural gaps that need to be closed before that happens. Because this is a software only solution that supports in kernel bonding, SR-IOV is not supported with this implementation. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Michal Swiatkowski	c7a219048e	ice: Remove xsk_buff_pool from VSI structure Current implementation of netdev already contains xsk_buff_pools. We no longer have to contain these structures in ice_vsi. Refactor the code to operate on netdev-provided xsk_buff_pools. Move scheduling napi on each queue to a separate function to simplify setup function. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Dave Ertman	34295a3696	ice: implement new LLDP filter command There is an issue with some NVMs where an already existent LLDP filter is blocking the creation of a filter to allow LLDP packets to be redirected to the default VSI for the interface. This is blocking all LLDP functionality based in the kernel when the FW LLDP agent is disabled (e.g. software based DCBx). Implement the new AQ command to allow adding VSI destinations to existent filters on NVM versions that support the new command. The new lldp_fltr_ctrl AQ command supports Rx filters only, so the code flow for adding filters to disable Tx of control frames will remain intact. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Brett Creeley	382e0a6880	ice: log message when trusted VF goes in/out of promisc mode Currently there is no message printed on the host when a VF goes in and out of promiscuous mode. This is causing confusion because this is the expected behavior based on i40e. Fix this. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Bruce Allan	12aae8f1d8	ice: remove dead code The check for a NULL pf pointer is moot since the earlier declaration and assignment of struct device *dev already de-referenced the pointer. Also, the only caller of ice_set_dflt_mib() already ensures pf is not NULL. Cc: Dave Ertman <david.m.ertman@intel.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:44:48 -08:00
Bruce Allan	11404310d5	ice: use flex_array_size where possible Use the flex_array_size() helper with the recently added flexible array members in structures. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:44:42 -08:00
Gustavo A. R. Silva	e94c0df984	ice: Replace one-element array with flexible-array member There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Refactor the code according to the use of a flexible-array member in struct ice_res_tracker, instead of a one-element array and use the struct_size() helper to calculate the size for the allocations. Also, notice that the code below suggests that, currently, two too many bytes are being allocated with devm_kzalloc(), as the total number of entries (pf->irq_tracker->num_entries) for pf->irq_tracker->list[] is _vectors_ and sizeof(pf->irq_tracker) also includes the size of the one-element array _list_ in struct ice_res_tracker. drivers/net/ethernet/intel/ice/ice_main.c:3511: 3511 / populate SW interrupts pool with number of OS granted IRQs. */ 3512 pf->num_avail_sw_msix = (u16)vectors; 3513 pf->irq_tracker->num_entries = (u16)vectors; 3514 pf->irq_tracker->end = pf->irq_tracker->num_entries; With this change, the right amount of dynamic memory is now allocated because, contrary to one-element arrays which occupy at least as much space as a single object of the type, flexible-array members don't occupy such space in the containing structure. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays Built-tested-by: kernel test robot <lkp@intel.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:44:32 -08:00
Jacob Keller	e67fbcfbb4	ice: display stored UNDI firmware version via devlink info Just as we recently added support for other stored firmware flash versions, support display of the stored UNDI Option ROM version via devlink info. To do this, we need to introduce a new ice_get_inactive_orom_ver function. This is a little trickier than with other flash versions. The Option ROM version data was being read from a special "Boot Configuration" block of the NVM Preserved Field Area. This block only contains the active Option ROM version data. It is populated when the device firmware finishes updating the Option ROM. This method is ineffective at reading the stored Option ROM version data. Instead of reading from this section of the flash, replace this version extraction with one which locates the Combo Version information from within the Option ROM binary. This data is stored within the Option ROM at a 512 byte offset, in a simple structured format. The structure uses a simple modulo 256 checksum for integrity verification. Scan through the Option ROM to locate the CIVD data section, and extract the Combo Version. Refactor ice_get_orom_ver_info so that it takes the bank select enumeration parameter. Use this to implement ice_get_inactive_orom_ver. Although all ice devices have a Boot Configuration block in the NVM PFA, not all devices have a valid Option ROM. In this case, the old ice_get_orom_ver_info would "succeed" but report a version of all zeros. The new implementation would fail to locate the $CIV section in the Option ROM and report an error. Thus, we must ensure that ice_init_nvm does not fail if ice_get_orom_ver_info fails. Use the new ice_get_inactive_orom_ver to allow reporting the Option ROM versions for a pending update via devlink info. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:44:16 -08:00
Jacob Keller	e120a9ab45	ice: display stored netlist versions via devlink info Add a function to read the inactive netlist bank for version information. To support this, refactor how we read the netlist version data. Instead of using the firmware AQ interface with a module ID, read from the flash as a flat NVM, using ice_read_flash_module. This change requires a slight adjustment to the offset values used, as reading from the flat NVM includes the type field (which was stripped by firmware previously). Cleanup the macro names and move them to ice_type.h. For clarity in how we calculate the offsets and so that programmers can easily map the offset value to the data sheet, use a wrapper macro to account for the offset adjustments. Use the newly added ice_get_inactive_netlist_ver function to extract the version data from the pending netlist module update. Add the stored variants of "fw.netlist", and "fw.netlist.build" to the info version map array. With this change, we now report the "fw.netlist" and "fw.netlist.build" versions into the stored section of the devlink info report. As with the main NVM module versions, if there is no pending update, we report the currently active values as stored. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:43:37 -08:00
Jacob Keller	2c4fe41d72	ice: display some stored NVM versions via devlink info The devlink info interface supports drivers reporting "stored" versions. These versions indicate the version of an update that has been downloaded to the device, but is not yet active. The code for extracting the NVM version recently changed to enable support for reading from either the active or the inactive bank. Use this to implement ice_get_inactive_nvm_ver, which will read the NVM version data from the inactive section of flash. When reporting the versions via devlink info, first read the device capabilities. Determine if there is a pending flash update, and if so, extract relevant version information from the inactive flash. Store these within the info context structure. When reporting "stored" firmware versions, devlink documentation indicates that we ought to always report a stored value, even if there is no pending update. In this common case, the stored version should match the running version. This means that each stored version should by default fallback to the same value as reported by the running handler. To support this, modify the version structure to have both a "getter" and a "fallback". Modify the control loop so that it will use the "fallback" function if the "getter" function does not report a version. To report versions for which we can read the stored value, use a new "stored()" macro. This macro will insert two entries into the version list. The first entry is the traditional running version. The second is the stored version, implemented with a fallback to the active version. This is a little tricky, but reduces the overall duplication of elements in the entry list, and ensures that running and stored values remain consistent. To avoid some duplication, add a combined() macro that will insert both the running and stored versions into the version entry list. Using this new support, add pending version reporter functions for "fw.psid.api" and "fw.bundle_id". This enables reporting the stored values for some of versions in the NVM module of the flash. Reporting management versions is not implemented by this patch. The active management version is reported to the driver via the AdminQ mailbox during load. Although the version must be in the firmware binary somewhere, accessing this from the inactive firmware is not trivial and has not been implemented in this change. Future changes will introduce support for reading the UNDI Option ROM version and the version associated with the Netlist module. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:41:09 -08:00
Jacob Keller	0ce50c7066	ice: introduce function for reading from flash modules When reading from the flash memory of the device, the ice driver has two interfaces available to it. First, it can use a mediated interface via firmware that allows specifying a module ID. This allows reading from specific modules of the active flash bank. The second interface available is to perform flat reads. This allows complete access to the entire flash. However, using it requires the software to handle calculating module location and interpret pointer addresses. While most data required is accessible through the convenient first interface, certain flash contents are not. This includes the CSS header information associated with the Option ROM and NVM banks, as well as any access to the "inactive" banks used as scratch space for performing flash updates. In order to access all of the relevant flash contents, software must use the flat reads. Rather than forcing all flows to perform flat read calculations, introduce a new abstraction for reading from the flash: ice_read_flash_module. This function provides an abstraction for reading from either the active or inactive flash bank at the requested module. This interface is very similar to the abstraction provided via firmware, but allows access to additional modules, as well as providing a mechanism to request access to both flash banks. At first glance, it might make sense for this abstraction to allow specifying precisely which bank (1st or 2nd) the caller wishes to read. This is simpler to implement but more difficult to use. In practice, most callers only know whether they want the active bank, or the inactive bank. Rather than force callers to determine for themselves which bank to read from, implement ice_read_flash_module in terms of "active" vs "inactive". This significantly simplifies the implementation at the caller level and is a more useful abstraction over the flash contents. Make use of this new interface to refactor reading of the main NVM version information. Instead of using the firmware's mediated ShadowRAM function, use the ice_read_flash_module abstraction. To do this, notice that most reads of the NVM are going to be in 2-byte word chunks. To simplify using ice_read_flash_module for this case, ice_read_nvm_module is introduced. This is a simple wrapper around ice_read_flash_module which takes the correct pointer address for the NVM bank, and forces the 2-byte word format onto the caller. When reading the NVM versions, some fields are read from the Shadow RAM. The Shadow RAM is the first 64KB of flash memory, and is populated during device load. Most fields are copied from a section within the active NVM bank. In order to read this data from both the active and inactive NVM banks, we need to read not from the first 64KB of flash, but instead from the correct offset into the NVM bank. Introduce ice_read_nvm_sr_copy for this purpose. This function wraps around ice_read_nvm_module and has the same interface as the ice_read_sr_word, with the exception of allowing the caller to specify whether to read the active or inactive flash bank. With this change, it is now trivial to refactor ice_get_nvm_ver_info to read using the software mediated ice_read_flash_module interface instead of relying on the firmware mediated interface. This will be used in the following change to implement support for stored versions in the devlink info report. Additionally, the overall ice_read_flash_module interface will be used and extended to support all three major flash banks, and additionally to support reading the flash image security revision information. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:35:41 -08:00
Jacob Keller	1fa95e0120	ice: cache NVM module bank information The ice flash contains two copies of each of the NVM, Option ROM, and Netlist modules. Each bank has a pointer word and a size word. In order to correctly read from the active flash bank, the driver must calculate the offset manually. During NVM initialization, read the Shadow RAM control word and determine which bank is active for each NVM module. Additionally, cache the size and pointer values for use in calculating the correct offset. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:35:41 -08:00
Jacob Keller	74789085d9	ice: introduce context struct for info report The ice driver uses an array of structures which link an info name with a function that formats the associated version data into a string. All existing format functions simply format already captured static data from the driver hw structure. Future changes will introduce format functions for reporting the versions of flash sections stored but not yet applied. This type of version data is not stored as a member of the hw structure. This is because (a) it might not yet exist in the case there is no pending flash update, and (b) even if it does, it might change such as if an update is canceled or replaced by a new update before finalizing. We could simply have each format function gather its own data upon being called. However, in some cases the raw binary version data is a combination of multiple different reported fields. Additionally, the current interface doesn't have a way for the function to indicate that the version doesn't exist. Refactor this function interface to take a new ice_info_ctx structure instead of the buffer pointer and length. This context structure allows for future extensions to pre-gather version data that is stored within the context struct instead of the hw struct. Allocate this context structure initially at the start of ice_devlink_info_get. We use dynamic allocation instead of a local stack variable in order to avoid using too much kernel stack once we extend it with additional data structures. Modify the main loop that drives the info reporting so that the version buffer string is always cleared between each format. Explicitly check that the format function actually filled in a version string of non-zero length. If the string is not provided, simply skip this version without reporting an error. This allows for introducing format functions of versions which may or may not be present, such as the version of a pending update that has not yet been activated. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:35:41 -08:00
Jacob Keller	9af368fa9c	ice: create flash_info structure and separate NVM version The ice_nvm_info structure has become somewhat of a dumping ground for all of the fields related to flash version. It holds the NVM version and EETRACK id, the OptionROM info structure, the flash size, the ShadowRAM size, and more. A future change is going to add the ability to read the NVM version and EETRACK ID from the inactive NVM bank. To make this simpler, it is useful to have these NVM version info fields extracted to their own structure. Rename ice_nvm_info into ice_flash_info, and create a separate ice_nvm_info structure that will contain the eetrack and NVM map version. Move the netlist_ver structure into ice_flash_info and rename it ice_netlist_info for consistency. Modify the static ice_get_orom_ver_info to take the option rom structure as a pointer. This makes it more obvious what portion of the hw struct is being modified. Do the same for ice_get_netlist_ver_info. Introduce a new ice_get_nvm_ver_info function, which will be similar to ice_get_orom_ver_info and ice_get_netlist_ver_info, used to keep the NVM version extraction code co-located. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:35:34 -08:00
Jacob Keller	08e1294daa	ice: report timeout length for erasing during devlink flash When erasing, notify userspace of how long we will potentially take to erase a module. Doing so allows userspace to report the timeout, giving a clear indication of the upper time bound of the operation. Since we're re-using the erase timeout value, make it a macro rather than a magic number. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Shannon Nelson <snelson@pensando.io> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 09:34:24 -08:00
Alexander Lobakin	a79afa78e6	net: use the new dev_page_is_reusable() instead of private versions Now we can remove a bunch of identical functions from the drivers and make them use common dev_page_is_reusable(). All {,un}likely() checks are omitted since it's already present in this helper. Also update some comments near the call sites. Suggested-by: David Rientjes <rientjes@google.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Alexander Lobakin <alobakin@pm.me> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-04 18:20:14 -08:00
Jakub Kicinski	c358f95205	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/can/dev.c `b552766c87` ("can: dev: prevent potential information leak in can_fill_info()") `3e77f70e73` ("can: dev: move driver related infrastructure into separate subdir") `0a042c6ec9` ("can: dev: move netlink related code into seperate file") Code move. drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c `57ac4a31c4` ("net/mlx5e: Correctly handle changing the number of queues when the interface is down") `214baf2287` ("net/mlx5e: Support HTB offload") Adjacent code changes net/switchdev/switchdev.c `20776b465c` ("net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP") `ffb68fc58e` ("net: switchdev: remove the transaction structure from port object notifiers") `bae33f2b5a` ("net: switchdev: remove the transaction structure from port attributes") Transaction parameter gets dropped otherwise keep the fix. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 17:09:31 -08:00
Brett Creeley	f3fe97f643	ice: Fix MSI-X vector fallback logic The current MSI-X enablement logic tries to enable best-case MSI-X vectors and if that fails we only support a bare-minimum set. This includes a single MSI-X for 1 Tx and 1 Rx queue and a single MSI-X for the OICR interrupt. Unfortunately, the driver fails to load when we don't get as many MSI-X as requested for a couple reasons. First, the code to allocate MSI-X in the driver tries to allocate num_online_cpus() MSI-X for LAN traffic without caring about the number of MSI-X actually enabled/requested from the kernel for LAN traffic. So, when calling ice_get_res() for the PF VSI, it returns failure because the number of available vectors is less than requested. Fix this by not allowing the PF VSI to allocation more than pf->num_lan_msix MSI-X vectors and pf->num_lan_msix Rx/Tx queues. Limiting the number of queues is done because we don't want more than 1 Tx/Rx queue per interrupt due to performance conerns. Second, the driver assigns pf->num_lan_msix = 2, to account for LAN traffic and the OICR. However, pf->num_lan_msix is only meant for LAN MSI-X. This is causing a failure when the PF VSI tries to allocate/reserve the minimum pf->num_lan_msix because the OICR MSI-X has already been reserved, so there may not be enough MSI-X vectors left. Fix this by setting pf->num_lan_msix = 1 for the failure case. Then the ICE_MIN_MSIX accounts for the LAN MSI-X and the OICR MSI-X needed for the failure case. Update the related defines used in ice_ena_msix_range() to align with the above behavior and remove the unused RDMA defines because RDMA is currently not supported. Also, remove the now incorrect comment. Fixes: `152b978a1f` ("ice: Rework ice_ena_msix_range") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-01-26 10:44:17 -08:00
Brett Creeley	943b881e35	ice: Don't allow more channels than LAN MSI-X available Currently users could create more channels than LAN MSI-X available. This is happening because there is no check against pf->num_lan_msix when checking the max allowed channels and will cause performance issues if multiple Tx and Rx queues are tied to a single MSI-X. Fix this by not allowing more channels than LAN MSI-X available in pf->num_lan_msix. Fixes: `87324e747f` ("ice: Implement ethtool ops for channels") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-01-26 10:44:17 -08:00
Nick Nunley	13ed5e8a9b	ice: update dev_addr in ice_set_mac_address even if HW filter exists Fix the driver to copy the MAC address configured in ndo_set_mac_address into dev_addr, even if the MAC filter already exists in HW. In some situations (e.g. bonding) the netdev's dev_addr could have been modified outside of the driver, with no change to the HW filter, so the driver cannot assume that they match. Fixes: `757976ab16` ("ice: Fix check for removing/adding mac filters") Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-01-26 10:43:49 -08:00
Nick Nunley	1b0b0b581b	ice: Implement flow for IPv6 next header (extension header) This patch is based on a similar change to i40e by Slawomir Laba: "i40e: Implement flow for IPv6 next header (extension header)". When a packet contains an IPv6 header with next header which is an extension header and not a protocol one, the kernel function skb_transport_header called with such sk_buff will return a pointer to the extension header and not to the TCP one. The above explained call caused a problem with packet processing for skb with encapsulation for tunnel with ICE_TX_CTX_EIPT_IPV6. The extension header was not skipped at all. The ipv6_skip_exthdr function does check if next header of the IPV6 header is an extension header and doesn't modify the l4_proto pointer if it points to a protocol header value so its safe to omit the comparison of exthdr and l4.hdr pointers. The ipv6_skip_exthdr can return value -1. This means that the skipping process failed and there is something wrong with the packet so it will be dropped. Fixes: `a4e82a81f5` ("ice: Add support for tunnel offloads") Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-01-26 10:43:49 -08:00
Henry Tieman	29e2d9eb82	ice: fix FDir IPv6 flexbyte The packet classifier would occasionally misrecognize an IPv6 training packet when the next protocol field was 0. The correct value for unspecified protocol is IPPROTO_NONE. Fixes: `165d80d6ad` ("ice: Support IPv6 Flow Director filters") Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-01-26 10:43:49 -08:00
Jakub Kicinski	2d9116be76	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2021-01-16 1) Extend atomic operations to the BPF instruction set along with x86-64 JIT support, that is, atomic{,64}_{xchg,cmpxchg,fetch_{add,and,or,xor}}, from Brendan Jackman. 2) Add support for using kernel module global variables (__ksym externs in BPF programs) retrieved via module's BTF, from Andrii Nakryiko. 3) Generalize BPF stackmap's buildid retrieval and add support to have buildid stored in mmap2 event for perf, from Jiri Olsa. 4) Various fixes for cross-building BPF sefltests out-of-tree which then will unblock wider automated testing on ARM hardware, from Jean-Philippe Brucker. 5) Allow to retrieve SOL_SOCKET opts from sock_addr progs, from Daniel Borkmann. 6) Clean up driver's XDP buffer init and split into two helpers to init per- descriptor and non-changing fields during processing, from Lorenzo Bianconi. 7) Minor misc improvements to libbpf & bpftool, from Ian Rogers. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits) perf: Add build id data in mmap2 event bpf: Add size arg to build_id_parse function bpf: Move stack_map_get_build_id into lib bpf: Document new atomic instructions bpf: Add tests for new BPF atomic operations bpf: Add bitwise atomic instructions bpf: Pull out a macro for interpreting atomic ALU operations bpf: Add instructions for atomic_[cmp]xchg bpf: Add BPF_FETCH field / create atomic_fetch_add instruction bpf: Move BPF_STX reserved field check into BPF_STX verifier code bpf: Rename BPF_XADD and prepare to encode other atomics in .imm bpf: x86: Factor out a lookup table for some ALU opcodes bpf: x86: Factor out emission of REX byte bpf: x86: Factor out emission of ModR/M for *(reg + off) tools/bpftool: Add -Wall when building BPF programs bpf, libbpf: Avoid unused function warning on bpf_tail_call_static selftests/bpf: Install btf_dump test cases selftests/bpf: Fix installation of urandom_read selftests/bpf: Move generated test files to $(TEST_GEN_FILES) selftests/bpf: Fix out-of-tree build ... ==================== Link: https://lore.kernel.org/r/20210116012922.17823-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-15 17:57:26 -08:00
Eric Dumazet	f73fc40327	ice: drop dead code in ice_receive_skb() napi_gro_receive() can never return GRO_DROP GRO_DROP can only be returned from napi_gro_frags() which is the other NAPI GRO entry point. Followup patch will remove GRO_DROP, because drivers are not supposed to call napi_gro_frags() if prior napi_get_frags() has failed. Note that I have left the gro_dropped variable. I leave to ice maintainers the decision to further remove it from ethtool -S results. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 14:24:25 -08:00
Lorenzo Bianconi	be9df4aff6	net, xdp: Introduce xdp_prepare_buff utility routine Introduce xdp_prepare_buff utility routine to initialize per-descriptor xdp_buff fields (e.g. xdp_buff pointers). Rely on xdp_prepare_buff() in all XDP capable drivers. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/bpf/45f46f12295972a97da8ca01990b3e71501e9d89.1608670965.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-01-08 13:39:24 -08:00
Lorenzo Bianconi	43b5169d83	net, xdp: Introduce xdp_init_buff utility routine Introduce xdp_init_buff utility routine to initialize xdp_buff fields const over NAPI iterations (e.g. frame_sz or rxq pointer). Rely on xdp_init_buff in all XDP capable drivers. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/bpf/7f8329b6da1434dc2b05a77f2e800b29628a8913.1608670965.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-01-08 13:39:24 -08:00
Jakub Kicinski	30bfce1094	net: remove ndo_udp_tunnel_* callbacks All UDP tunnel port management is now routed via udp_tunnel_nic infra directly. Remove the old callbacks. Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 12:53:29 -08:00
Björn Töpel	8d14768a79	ice, xsk: clear the status bits for the next_to_use descriptor On the Rx side, the next_to_use index points to the next item in the HW ring to be refilled/allocated, and next_to_clean points to the next item to potentially be processed. When the HW Rx ring is fully refilled, i.e. no packets has been processed, the next_to_use will be next_to_clean - 1. When the ring is fully processed next_to_clean will be equal to next_to_use. The latter case is where a bug is triggered. If the next_to_use bits are not cleared, and the "fully processed" state is entered, a stale descriptor can be processed. The skb-path correctly clear the status bit for the next_to_use descriptor, but the AF_XDP zero-copy path did not do that. This change adds the status bits clearing of the next_to_use descriptor. Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-16 10:51:07 -08:00
Björn Töpel	5bb0c4b5eb	ice, xsk: Move Rx allocation out of while-loop Instead doing the check for allocation in each loop, move it outside the while loop and do it every NAPI loop. This change boosts the xdpsock rxdrop scenario with 15% more packets-per-second. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/r/20201211085410.59350-1-bjorn.topel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 17:54:42 -08:00
Jakub Kicinski	46d5e62dd3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net xdp_return_frame_bulk() needs to pass a xdp_buff to __xdp_return(). strlcpy got converted to strscpy but here it makes no functional difference, so just keep the right code. Conflicts: net/netfilter/nf_tables_api.c Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-11 22:29:38 -08:00
Björn Töpel	1beb7830d3	ice: avoid premature Rx buffer reuse The page recycle code, incorrectly, relied on that a page fragment could not be freed inside xdp_do_redirect(). This assumption leads to that page fragments that are used by the stack/XDP redirect can be reused and overwritten. To avoid this, store the page count prior invoking xdp_do_redirect(). Fixes: `efc2214b60` ("ice: Add support for XDP") Reported-and-analyzed-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 15:26:58 -08:00
Simon Perron Caissy	5b13886da8	ice: Add space to unknown speed Add space to the end of 'Unknown' string in order to avoid concatenation with 'bps' string when formatting netdev log message. Signed-off-by: Simon Perron Caissy <simon.perron.caissy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Jacob Keller	9228d8b261	ice: join format strings to same line as ice_debug When printing messages with ice_debug, align the printed string to the origin line of the message in order to ease debugging and tracking messages back to their source. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Bruce Allan	34d8461a65	ice: silence static analysis warning sparse warns about cast to/from restricted types which is not an actual problem; silence the warning. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Bruce Allan	32e6deb297	ice: cleanup misleading comment The maximum Admin Queue buffer size and NVM shadow RAM sector size are both 4 Kilobytes. Some comments refer to those as 4Kb which can be confused with 4 Kilobits. Update the comments to use the commonly used KB symbol instead. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Nick Nunley	bcf68ea1e5	ice: Remove vlan_ena from vsi structure vlan_ena was introduced to track whether VLAN filters are enabled on the device, but 1) checking for num_vlan > 1 already gives us this information, and is currently used in this way throughout the code 2) the logic for vlan_ena is broken when multiple VLANs are active Just remove vlan_ena and use num_vlan instead. Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Jeb Cramer	956542cae5	ice: Remove gate to OROM init Remove the gate that prevents the OROM and netlist info from being populated. The NVM now has the appropriate section for software to reference the versioning info. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Jeb Cramer	c21125c997	ice: Enable Support for FW Override (E82X) The driver is able to override the firmware when it comes to supporting a more lenient link mode. This feature was limited to E810 devices. It is now extended to E82X devices. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Paul M Stillwell Jr	f2651a91b9	ice: don't always return an error for Get PHY Abilities AQ command There are times when the driver shouldn't return an error when the Get PHY abilities AQ command (0x0600) returns an error. Instead the driver should log that the error occurred and continue on. This allows the driver to load even though the AQ command failed. The user can then later determine the reason for the failure and correct it. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Bruce Allan	88dcfdb4cd	ice: cleanup stack hog In ice_flow_add_prof_sync(), struct ice_flow_prof_params has recently grown in size hogging stack space when allocated there. Hogging stack space should be avoided. Change allocation to be on the heap when needed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Harikumar Bokkena <harikumarx.bokkena@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:13 -08:00
Jakub Kicinski	a1dd1d8697	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-12-03 The main changes are: 1) Support BTF in kernel modules, from Andrii. 2) Introduce preferred busy-polling, from Björn. 3) bpf_ima_inode_hash() and bpf_bprm_opts_set() helpers, from KP Singh. 4) Memcg-based memory accounting for bpf objects, from Roman. 5) Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks, from Stanislav. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (118 commits) selftests/bpf: Fix invalid use of strncat in test_sockmap libbpf: Use memcpy instead of strncpy to please GCC selftests/bpf: Add fentry/fexit/fmod_ret selftest for kernel module selftests/bpf: Add tp_btf CO-RE reloc test for modules libbpf: Support attachment of BPF tracing programs to kernel modules libbpf: Factor out low-level BPF program loading helper bpf: Allow to specify kernel module BTFs when attaching BPF programs bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF selftests/bpf: Add support for marking sub-tests as skipped selftests/bpf: Add bpf_testmod kernel module for testing libbpf: Add kernel module BTF support for CO-RE relocations libbpf: Refactor CO-RE relocs to not assume a single BTF object libbpf: Add internal helper to load BTF data by FD bpf: Keep module's btf_data_size intact after load bpf: Fix bpf_put_raw_tracepoint()'s use of __module_address() selftests/bpf: Add Userspace tests for TCP_WINDOW_CLAMP bpf: Adds support for setting window clamp samples/bpf: Fix spelling mistake "recieving" -> "receiving" bpf: Fix cold build of test_progs-no_alu32 ... ==================== Link: https://lore.kernel.org/r/20201204021936.85653-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-04 07:48:12 -08:00
Björn Töpel	b02e5a0ebb	xsk: Propagate napi_id to XDP socket Rx path Add napi_id to the xdp_rxq_info structure, and make sure the XDP socket pick up the napi_id in the Rx path. The napi_id is used to find the corresponding NAPI structure for socket busy polling. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/bpf/20201130185205.196029-7-bjorn.topel@gmail.com	2020-12-01 00:09:25 +01:00
Jacob Keller	52cc5f3a16	devlink: move flash end and begin to core devlink When performing a flash update via devlink, device drivers may inform user space of status updates via devlink_flash_update_(begin\|end\|timeout\|status)_notify functions. It is expected that drivers do not send any status notifications unless they send a begin and end message. If a driver sends a status notification without sending the appropriate end notification upon finishing (regardless of success or failure), the current implementation of the devlink userspace program can get stuck endlessly waiting for the end notification that will never come. The current ice driver implementation may send such a status message without the appropriate end notification in rare cases. Fixing the ice driver is relatively simple: we just need to send the begin_notify at the start of the function and always send an end_notify no matter how the function exits. Rather than assuming driver authors will always get this right in the future, lets just fix the API so that it is not possible to get wrong. Make devlink_flash_update_begin_notify and devlink_flash_update_end_notify static, and call them in devlink.c core code. Always send the begin_notify just before calling the driver's flash_update routine. Always send the end_notify just after the routine returns regardless of success or failure. Doing this makes the status notification easier to use from the driver, as it no longer needs to worry about catching failures and cleaning up by calling devlink_flash_update_end_notify. It is now no longer possible to do the wrong thing in this regard. We also save a couple of lines of code in each driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-19 21:41:02 -08:00
Jacob Keller	b44cfd4f5b	devlink: move request_firmware out of driver All drivers which implement the devlink flash update support, with the exception of netdevsim, use either request_firmware or request_firmware_direct to locate the firmware file. Rather than having each driver do this separately as part of its .flash_update implementation, perform the request_firmware within net/core/devlink.c Replace the file_name parameter in the struct devlink_flash_update_params with a pointer to the fw object. Use request_firmware rather than request_firmware_direct. Although most Linux distributions today do not have the fallback mechanism implemented, only about half the drivers used the _direct request, as compared to the generic request_firmware. In the event that a distribution does support the fallback mechanism, the devlink flash update ought to be able to use it to provide the firmware contents. For distributions which do not support the fallback userspace mechanism, there should be essentially no difference between request_firmware and request_firmware_direct. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Shannon Nelson <snelson@pensando.io> Acked-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-19 21:40:57 -08:00
Dan Nowlin	051d2b5cfa	ice: fix adding IP4 IP6 Flow Director rules A subsequent addition of an IP4 or IP6 rule after other rules would overwrite any existing TCAM entries of related L4 protocols(ex: tcp4 or udp6). This was due to the mask including too many TCAM entries. Add new packet type masks with bits properly excluded so rules are not overwritten. Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Brijesh Behera <brijeshx.behera@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Bixuan Cui	ecfb751f1a	ice: Fix pointer cast warnings pointers should be casted to unsigned long to avoid -Wpointer-to-int-cast warnings: drivers/net/ethernet/intel/ice/ice_flow.h:197:33: warning: cast from pointer to integer of different size drivers/net/ethernet/intel/ice/ice_flow.h:198:32: warning: cast to pointer from integer of different size Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Jacob Keller	1e8249cc9d	ice: add additional debug logging for firmware update While debugging a recent failure to update the flash of an ice device, I found it helpful to add additional logging which helped determine the root cause of the problem being a timeout issue. Add some extra dev_dbg() logging messages which can be enabled using the dynamic debug facility, including one for ice_aq_wait_for_event that will use jiffies to capture a rough estimate of how long we waited for the completion of a firmware command. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Brijesh Behera <brijeshx.behera@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Jacob Keller	48d40025b5	ice: refactor devlink_port to be per-VSI Currently, the devlink_port structure is stored within the ice_pf. This made sense because we create a single devlink_port for each PF. This setup does not mesh with the abstractions in the driver very well, and led to a flow where we accidentally call devlink_port_unregister twice during error cleanup. In particular, if devlink_port_register or devlink_port_unregister are called twice, this leads to a kernel panic. This appears to occur during some possible flows while cleaning up from a failure during driver probe. If register_netdev fails, then we will call devlink_port_unregister in ice_cfg_netdev as it cleans up. Later, we again call devlink_port_unregister since we assume that we must cleanup the port that is associated with the PF structure. This occurs because we cleanup the devlink_port for the main PF even though it was not allocated. We allocated the port within a per-VSI function for managing the main netdev, but did not release the port when cleaning up that VSI, the allocation and destruction are not aligned. Instead of attempting to manage the devlink_port as part of the PF structure, manage it as part of the PF VSI. Doing this has advantages, as we can match the de-allocation of the devlink_port with the unregister_netdev associated with the main PF VSI. Moving the port to the VSI is preferable as it paves the way for handling devlink ports allocated for other purposes such as SR-IOV VFs. Since we're changing up how we allocate the devlink_port, also change the indexing. Originally, we indexed the port using the PF id number. This came from an old goal of sharing a devlink for each physical function. Managing devlink instances across multiple function drivers is not workable. Instead, lets set the port number to the logical port number returned by firmware and set the index using the VSI index (sometimes referred to as VSI handle). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Jacob Keller	410d06879c	ice: add the DDP Track ID to devlink info Add "fw.app.bundle_id" to display the DDP Track ID of the active DDP package. This id is similar to "fw.bundle_id" and is a unique identifier for the DDP package that is loaded in the device. Each new DDP has a unique Track ID generated for it, and the ID can be used to identify and track the DDP package. Add documentation for the new devlink info version. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Anirudh Venkataramanan	045afac407	ice: Change ice_info_get_dsn to be void ice_info_get_dsn always returns 0, so just make it void. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Bruce Allan	ac382a0944	ice: remove repeated words A new test in checkpatch detects repeated words; cleanup all pre-existing occurrences of those now. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Co-developed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Andy Shevchenko	4d7ebed6aa	ice: devlink: use %phD to print small buffer Use %phD format to print small buffer as hex string. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
David S. Miller	8b0308fe31	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Rejecting non-native endian BTF overlapped with the addition of support for it. The rest were more simple overlapping changes, except the renesas ravb binding update, which had to follow a file move as well as a YAML conversion. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-05 18:40:01 -07:00
Jacob Keller	be49b1ad29	ice: preserve NVM capabilities in safe mode If the driver initializes in safe mode, it will call ice_set_safe_mode_caps. This results in clearing the capabilities structures, in order to set them up for operating in safe mode, ensuring many features are disabled. This has a side effect of also clearing the capability bits that relate to NVM update. The result is that the device driver will not indicate support for unified update, even if the firmware is capable. Fix this by adding the relevant capability fields to the list of values we preserve. To simplify the code, use a common_cap structure instead of a handful of local variables. To reduce some duplication of the capability name, introduce a couple of macros used to restore the capabilities values from the cached copy. Fixes: `de9b277ee0` ("ice: Add support for unified NVM update flow capability") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Brijesh Behera <brijeshx.behera@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-09-30 08:32:35 -07:00
Jacob Keller	0ec86e8e82	ice: increase maximum wait time for flash write commands The ice driver needs to wait for a firmware response to each command to write a block of data to the scratch area used to update the device firmware. The driver currently waits for up to 1 second for this to be returned. It turns out that firmware might take longer than 1 second to return a completion in some cases. If this happens, the flash update will fail to complete. Fix this by increasing the maximum time that the driver will wait for both writing a block of data, and for activating the new NVM bank. The timeout for an erase command is already several minutes, as the firmware had to erase the entire bank which was already expected to take a minute or more in the worst case. In the case where firmware really won't respond, we will now take longer to fail. However, this ensures that if the firmware is simply slow to respond, the flash update can still complete. This new maximum timeout should not adversely increase the update time, as the implementation for wait_event_interruptible_timeout, and should wake very soon after we get a completion event. It is better for a flash update be slow but still succeed than to fail because we gave up too quickly. Fixes: `d69ea414c9` ("ice: implement device flash update via devlink") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Brijesh Behera <brijeshx.behera@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-09-30 08:32:35 -07:00
Sebastian Andrzej Siewior	0171f4e8d3	net: intel: Remove in_interrupt() warnings in_interrupt() is ill defined and does not provide what the name suggests. The usage especially in driver code is deprecated and a tree wide effort to clean up and consolidate the (ab)usage of in_interrupt() and related checks is happening. In this case the checks cover only parts of the contexts in which these functions cannot be called. They fail to detect preemption or interrupt disabled invocations. As the functions which are invoked from the various places contain already a broad variety of checks (always enabled or debug option dependent) cover all invalid conditions already, there is no point in having inconsistent warnings in those drivers. Just remove them. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:54 -07:00
Jakub Kicinski	b20e6c17c4	ice: convert to new udp_tunnel infrastructure Convert ice to the new infra, use share port tables. Leave a tiny bit more error checking in place than usual, because this driver really does quite a bit of magic. We need to calculate the number of VxLAN and GENEVE entries the firmware has reserved. Thanks to the conversion the driver will no longer sleep in an atomic section. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-28 12:50:12 -07:00
Jakub Kicinski	f049b826a8	ice: remove unused args from ice_get_open_tunnel_port() ice_get_open_tunnel_port() is always passed TNL_ALL as the second parameter. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-28 12:50:12 -07:00
Jacob Keller	50db1bca55	ice: add support for flash update overwrite mask Support the recently added DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK parameter in the ice flash update handler. Convert the overwrite mask bitfield into the appropriate preservation level used by the firmware when updating. Because there is no equivalent preservation level for overwriting only identifiers, this combination is rejected by the driver as not supported with an appropriate extended ACK message. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-25 17:20:57 -07:00
Jacob Keller	bc75c054f0	devlink: convert flash_update to use params structure The devlink core recently gained support for checking whether the driver supports a flash_update parameter, via `supported_flash_update_params`. However, parameters are specified as function arguments. Adding a new parameter still requires modifying the signature of the .flash_update callback in all drivers. Convert the .flash_update function to take a new `struct devlink_flash_update_params` instead. By using this structure, and the `supported_flash_update_params` bit field, a new parameter to flash_update can be added without requiring modification to existing drivers. As before, all parameters except file_name will require driver opt-in. Because file_name is a necessary field to for the flash_update to make sense, no "SUPPORTED" bitflag is provided and it is always considered valid. All future additional parameters will require a new bit in the supported_flash_update_params bitfield. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Bin Luo <luobin9@huawei.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Ido Schimmel <idosch@mellanox.com> Cc: Danielle Ratson <danieller@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-25 17:20:57 -07:00
Jacob Keller	22ec3d232f	devlink: check flash_update parameter support in net core When implementing .flash_update, drivers which do not support per-component update are manually checking the component parameter to verify that it is NULL. Without this check, the driver might accept an update request with a component specified even though it will not honor such a request. Instead of having each driver check this, move the logic into net/core/devlink.c, and use a new `supported_flash_update_params` field in the devlink_ops. Drivers which will support per-component update must now specify this by setting DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT in the supported_flash_update_params in their devlink_ops. This helps ensure that drivers do not forget to check for a NULL component if they do not support per-component update. This also enables a slightly better error message by enabling the core stack to set the netlink bad attribute message to indicate precisely the unsupported attribute in the message. Going forward, any new additional parameter to flash update will require a bit in the supported_flash_update_params bitfield. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Bin Luo <luobin9@huawei.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Ido Schimmel <idosch@mellanox.com> Cc: Danielle Ratson <danieller@mellanox.com> Cc: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-25 17:20:57 -07:00
Jesse Brandeburg	b50f7bca5e	intel-ethernet: clean up W=1 warnings in kdoc This takes care of all of the trivial W=1 fixes in the Intel Ethernet drivers, which allows developers and maintainers to build more of the networking tree with more complete warning checks. There are three classes of kdoc warnings fixed: - cannot understand function prototype: 'x' - Excess function parameter 'x' description in 'y' - Function parameter or member 'x' not described in 'y' All of the changes were trivial comment updates on function headers. Inspired by Lee Jones' series of wireless work to do the same. Compile tested only, and passes simple test of $ git ls-files *.[ch] \| egrep drivers/net/ethernet/intel \| \ xargs scripts/kernel-doc -none Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-25 16:28:59 -07:00
Jacob Keller	f6a07271bb	ice: fix memory leak in ice_vsi_setup During ice_vsi_setup, if ice_cfg_vsi_lan fails, it does not properly release memory associated with the VSI rings. If we had used devres allocations for the rings, this would be ok. However, we use kzalloc and kfree_rcu for these ring structures. Using the correct label to cleanup the rings during ice_vsi_setup highlights an issue in the ice_vsi_clear_rings function: it can leave behind stale ring pointers in the q_vectors structure. When releasing rings, we must also ensure that no q_vector associated with the VSI will point to this ring again. To resolve this, loop over all q_vectors and release their ring mapping. Because we are about to free all rings, no q_vector should remain pointing to any of the rings in this VSI. Fixes: `5513b920a4` ("ice: Update Tx scheduler tree for VSI multi-Tx queue support") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-09-25 07:39:24 -07:00
Jacob Keller	135f4b9e93	ice: fix memory leak if register_netdev_fails The ice_setup_pf_sw function can cause a memory leak if register_netdev fails, due to accidentally failing to free the VSI rings. Fix the memory leak by using ice_vsi_release, ensuring we actually go through the full teardown process. This should be safe even if the netdevice is not registered because we will have set the netdev pointer to NULL, ensuring ice_vsi_release won't call unregister_netdev. An alternative fix would be moving management of the PF VSI netdev into the main VSI setup code. This is complicated and likely requires significant refactor in how we manage VSIs Fixes: `3a858ba392` ("ice: Add support for VSI allocation and deallocation") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-09-25 07:39:24 -07:00
Anirudh Venkataramanan	466e439292	ice: Fix call trace on suspend It appears that the ice_suspend flow is missing a call to pci_save_state and this is triggering the message "State of device not saved by ice_suspend" and a call trace. Fix it. Fixes: `769c500dcc` ("ice: Add advanced power mgmt for WoL") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-09-25 07:39:24 -07:00
Qinglang Miao	ccb5942add	ice: simplify the return expression of ice_finalize_update() Simplify the return expression. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-21 13:47:07 -07:00
Andrew Lunn	d4602a9f47	net: devlink: region: Pass the region ops to the snapshot function Pass the region to be snapshotted to the function performing the snapshot. This allows one function to operate on numerous regions. v4: Add missing kerneldoc for ICE Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-18 18:17:45 -07:00
David S. Miller	150f29f5e6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2020-09-01 The following pull-request contains BPF updates for your net-next tree. There are two small conflicts when pulling, resolve as follows: 1) Merge conflict in tools/lib/bpf/libbpf.c between `88a8212028` ("libbpf: Factor out common ELF operations and improve logging") in bpf-next and `1e891e513e` ("libbpf: Fix map index used in error message") in net-next. Resolve by taking the hunk in bpf-next: [...] scn = elf_sec_by_idx(obj, obj->efile.btf_maps_shndx); data = elf_sec_data(obj, scn); if (!scn \|\| !data) { pr_warn("elf: failed to get %s map definitions for %s\n", MAPS_ELF_SEC, obj->path); return -EINVAL; } [...] 2) Merge conflict in drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c between `9647c57b11` ("xsk: i40e: ice: ixgbe: mlx5: Test for dma_need_sync earlier for better performance") in bpf-next and `e20f0dbf20` ("net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES") in net-next. Resolve the two locations by retaining net_prefetch() and taking xsk_buff_dma_sync_for_cpu() from bpf-next. Should look like: [...] xdp_set_data_meta_invalid(xdp); xsk_buff_dma_sync_for_cpu(xdp, rq->xsk_pool); net_prefetch(xdp->data); [...] We've added 133 non-merge commits during the last 14 day(s) which contain a total of 246 files changed, 13832 insertions(+), 3105 deletions(-). The main changes are: 1) Initial support for sleepable BPF programs along with bpf_copy_from_user() helper for tracing to reliably access user memory, from Alexei Starovoitov. 2) Add BPF infra for writing and parsing TCP header options, from Martin KaFai Lau. 3) bpf_d_path() helper for returning full path for given 'struct path', from Jiri Olsa. 4) AF_XDP support for shared umems between devices and queues, from Magnus Karlsson. 5) Initial prep work for full BPF-to-BPF call support in libbpf, from Andrii Nakryiko. 6) Generalize bpf_sk_storage map & add local storage for inodes, from KP Singh. 7) Implement sockmap/hash updates from BPF context, from Lorenz Bauer. 8) BPF xor verification for scalar types & add BPF link iterator, from Yonghong Song. 9) Use target's prog type for BPF_PROG_TYPE_EXT prog verification, from Udip Pant. 10) Rework BPF tracing samples to use libbpf loader, from Daniel T. Lee. 11) Fix xdpsock sample to really cycle through all buffers, from Weqaar Janjua. 12) Improve type safety for tun/veth XDP frame handling, from Maciej Żenczykowski. 13) Various smaller cleanups and improvements all over the place. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-01 13:22:59 -07:00
Magnus Karlsson	9647c57b11	xsk: i40e: ice: ixgbe: mlx5: Test for dma_need_sync earlier for better performance Test for dma_need_sync earlier to increase performance. xsk_buff_dma_sync_for_cpu() takes an xdp_buff as parameter and from that the xsk_buff_pool reference is dug out. Perf shows that this dereference causes a lot of cache misses. But as the buffer pool is now sent down to the driver at zero-copy initialization time, we might as well use this pointer directly, instead of going via the xsk_buff and we can do so already in xsk_buff_dma_sync_for_cpu() instead of in xp_dma_sync_for_cpu. This gets rid of these cache misses. Throughput increases with 3% for the xdpsock l2fwd sample application on my machine. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-11-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	c4655761d3	xsk: i40e: ice: ixgbe: mlx5: Rename xsk zero-copy driver interfaces Rename the AF_XDP zero-copy driver interface functions to better reflect what they do after the replacement of umems with buffer pools in the previous commit. Mostly it is about replacing the umem name from the function names with xsk_buff and also have them take the a buffer pool pointer instead of a umem. The various ring functions have also been renamed in the process so that they have the same naming convention as the internal functions in xsk_queue.h. This so that it will be clearer what they do and also for consistency. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-3-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	1742b3d528	xsk: i40e: ice: ixgbe: mlx5: Pass buffer pool to driver instead of umem Replace the explicit umem reference passed to the driver in AF_XDP zero-copy mode with the buffer pool instead. This in preparation for extending the functionality of the zero-copy mode so that umems can be shared between queues on the same netdev and also between netdevs. In this commit, only an umem reference has been added to the buffer pool struct. But later commits will add other entities to it. These are going to be entities that are different between different queue ids and netdevs even though the umem is shared between them. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-2-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:03 +02:00
Tariq Toukan	f468f21b7a	net: Take common prefetch code structure into a function Many device drivers use the same prefetch code structure to deal with small L1 cacheline size. Take this code into a function and call it from the drivers. Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-26 15:55:53 -07:00
Linus Torvalds	049eb096da	pci-v5.9-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl8sdUkUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vwH2Q/7Brcm1uyLORSzseGsaXSGMncBs2YB aKbfhyy4BPsDIZRLnzcfRZzgKo3f4jlLH9dJ6nBukbNXCvS/g7oYCXtNKVuB70MD IgBH3OJxLmqsYgDkoQmj1fZBCBhdqMgGbRmeIPLqiIBrWOJkBpGHXKpb0XtyXAas CpD0Tvr0JBeHMluZq6Uay09jBDKexeCFrT5HCoVaRMXT/C/iB5K1oMrUczzITsdi jB9xesDjh32rYtaePKfuL8itbRT7jtqOwQlk7sCtnMNamaOOaYO/s6hL5v/4GxMh rtWa1knOxxA1nOsnEkUEHi0Fj/+9zXDIdb7v6thRDo0ZgWQxl7l3nshvmPcxX421 tpCm3HqmvHzGqSI85Rtr3p4XKm9e+IjgE2EA/J6Y8Q6Grrb0EGJituhO4meL2Ciq 6mxdhu7InxDJ2p3TLGas3fB/1hrCO0Fc0pQoBJx7YgqA1ANyld9DYCkDN6IDoZBI uUjKgkE1dfbW/pGjotjhBsmz3dycZHkurIFdt1iX/Xtt5KKdPAzu9yM2U03iIS2R im1wZ/THiS/YCOlgL/J8+DHTY0ZvXjAdbiSPjTFfwb9XTh8aHVWtFaaZON1jRIjg xMpIY0SxfshpLx631ThZdDTDiOwE8D3B+1n/kMwps6HOLpxOoJZeSGTRCt9wGP40 j58DTtLm5FKpdYc= =moI9 -----END PGP SIGNATURE----- Merge tag 'pci-v5.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Fix pci_cfg_wait queue locking problem (Bjorn Helgaas) - Convert PCIe capability PCIBIOS errors to errno (Bolarinwa Olayemi Saheed) - Align PCIe capability and PCI accessor return values (Bolarinwa Olayemi Saheed) - Fix pci_create_slot() reference count leak (Qiushi Wu) - Announce device after early fixups (Tiezhu Yang) PCI device hotplug: - Make rpadlpar functions static (Wei Yongjun) Driver binding: - Add device even if driver attach failed (Rajat Jain) Virtualization: - xen: Remove redundant initialization of irq (Colin Ian King) IOMMU: - Add pci_pri_supported() to check device or associated PF (Ashok Raj) - Release IVRS table in AMD ACS quirk (Hanjun Guo) - Mark AMD Navi10 GPU rev 0x00 ATS as broken (Kai-Heng Feng) - Treat "external-facing" devices themselves as internal (Rajat Jain) MSI: - Forward MSI-X error code in pci_alloc_irq_vectors_affinity() (Piotr Stankiewicz) Error handling: - Clear PCIe Device Status errors only if OS owns AER (Jonathan Cameron) - Log correctable errors as warning, not error (Matt Jolly) - Use 'pci_channel_state_t' instead of 'enum pci_channel_state' (Luc Van Oostenryck) Peer-to-peer DMA: - Allow P2PDMA on AMD Zen and newer CPUs (Logan Gunthorpe) ASPM: - Add missing newline in sysfs 'policy' (Xiongfeng Wang) Native PCIe controllers: - Convert to devm_platform_ioremap_resource_byname() (Dejin Zheng) - Convert to devm_platform_ioremap_resource() (Dejin Zheng) - Remove duplicate error message from devm_pci_remap_cfg_resource() callers (Dejin Zheng) - Fix runtime PM imbalance on error (Dinghao Liu) - Remove dev_err() when handing an error from platform_get_irq() (Krzysztof Wilczyński) - Use pci_host_bridge.windows list directly instead of splicing in a temporary list for cadence, mvebu, host-common (Rob Herring) - Use pci_host_probe() instead of open-coding all the pieces for altera, brcmstb, iproc, mobiveil, rcar, rockchip, tegra, v3, versatile, xgene, xilinx, xilinx-nwl (Rob Herring) - Default host bridge parent device to the platform device (Rob Herring) - Use pci_is_root_bus() instead of tracking root bus number separately in aardvark, designware (imx6, keystone, designware-host), mobiveil, xilinx-nwl, xilinx, rockchip, rcar (Rob Herring) - Set host bridge bus number in pci_scan_root_bus_bridge() instead of each driver for aardvark, designware-host, host-common, mediatek, rcar, tegra, v3-semi (Rob Herring) - Move DT resource setup into devm_pci_alloc_host_bridge() (Rob Herring) - Set bridge map_irq and swizzle_irq to default functions; drivers that don't support legacy IRQs (iproc) need to undo this (Rob Herring) ARM Versatile PCIe controller driver: - Drop flag PCI_ENABLE_PROC_DOMAINS (Rob Herring) Cadence PCIe controller driver: - Use "dma-ranges" instead of "cdns,no-bar-match-nbits" property (Kishon Vijay Abraham I) - Remove "mem" from reg binding (Kishon Vijay Abraham I) - Fix cdns_pcie_{host\|ep}_setup() error path (Kishon Vijay Abraham I) - Convert all r/w accessors to perform only 32-bit accesses (Kishon Vijay Abraham I) - Add support to start link and verify link status (Kishon Vijay Abraham I) - Allow pci_host_bridge to have custom pci_ops (Kishon Vijay Abraham I) - Add new ops for CPU addr fixup (Kishon Vijay Abraham I) - Fix updating Vendor ID and Subsystem Vendor ID register (Kishon Vijay Abraham I) - Use bridge resources for outbound window setup (Rob Herring) - Remove private bus number and range storage (Rob Herring) Cadence PCIe endpoint driver: - Add MSI-X support (Alan Douglas) HiSilicon PCIe controller driver: - Remove non-ECAM HiSilicon hip05/hip06 driver (Rob Herring) Intel VMD host bridge driver: - Use Shadow MEMBAR registers for QEMU/KVM guests (Jon Derrick) Loongson PCIe controller driver: - Use DECLARE_PCI_FIXUP_EARLY for bridge_class_quirk() (Tiezhu Yang) Marvell Aardvark PCIe controller driver: - Indicate error in 'val' when config read fails (Pali Rohár) - Don't touch PCIe registers if no card connected (Pali Rohár) Marvell MVEBU PCIe controller driver: - Setup BAR0 in order to fix MSI (Shmuel Hazan) Microsoft Hyper-V host bridge driver: - Fix a timing issue which causes kdump to fail occasionally (Wei Hu) - Make some functions static (Wei Yongjun) NVIDIA Tegra PCIe controller driver: - Revert tegra124 raw_violation_fixup (Nicolas Chauvet) - Remove PLL power supplies (Thierry Reding) Qualcomm PCIe controller driver: - Change duplicate PCI reset to phy reset (Abhishek Sahu) - Add missing ipq806x clocks in PCIe driver (Ansuel Smith) - Add missing reset for ipq806x (Ansuel Smith) - Add ext reset (Ansuel Smith) - Use bulk clk API and assert on error (Ansuel Smith) - Add support for tx term offset for rev 2.1.0 (Ansuel Smith) - Define some PARF params needed for ipq8064 SoC (Ansuel Smith) - Add ipq8064 rev2 variant (Ansuel Smith) - Support PCI speed set for ipq806x (Sham Muthayyan) Renesas R-Car PCIe controller driver: - Use devm_pci_alloc_host_bridge() (Rob Herring) - Use struct pci_host_bridge.windows list directly (Rob Herring) - Convert rcar-gen2 to use modern host bridge probe functions (Rob Herring) TI J721E PCIe driver: - Add TI J721E PCIe host and endpoint driver (Kishon Vijay Abraham I) Xilinx Versal CPM PCIe controller driver: - Add Versal CPM Root Port driver and YAML schema (Bharat Kumar Gogada) MicroSemi Switchtec management driver: - Add missing __iomem and __user tags to fix sparse warnings (Logan Gunthorpe) Miscellaneous: - Replace http:// links with https:// (Alexander A. Klimov) - Replace lkml.org, spinics, gmane with lore.kernel.org (Bjorn Helgaas) - Remove unused pci_lost_interrupt() (Heiner Kallweit) - Move PCI_VENDOR_ID_REDHAT definition to pci_ids.h (Huacai Chen) - Fix kerneldoc warnings (Krzysztof Kozlowski)" * tag 'pci-v5.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits) PCI: Fix kerneldoc warnings PCI: xilinx-cpm: Add Versal CPM Root Port driver PCI: xilinx-cpm: Add YAML schemas for Versal CPM Root Port PCI: Set bridge map_irq and swizzle_irq to default functions PCI: Move DT resource setup into devm_pci_alloc_host_bridge() PCI: rcar-gen2: Convert to use modern host bridge probe functions PCI: Remove dev_err() when handing an error from platform_get_irq() MAINTAINERS: Add Kishon Vijay Abraham I for TI J721E SoC PCIe misc: pci_endpoint_test: Add J721E in pci_device_id table PCI: j721e: Add TI J721E PCIe driver PCI: switchtec: Add missing __iomem tag to fix sparse warnings PCI: switchtec: Add missing __iomem and __user tags to fix sparse warnings PCI: rpadlpar: Make functions static PCI/P2PDMA: Allow P2PDMA on AMD Zen and newer CPUs PCI: Release IVRS table in AMD ACS quirk PCI: Announce device after early fixups PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken PCI: Remove unused pci_lost_interrupt() dt-bindings: PCI: Add EP mode dt-bindings for TI's J721E SoC dt-bindings: PCI: Add host mode dt-bindings for TI's J721E SoC ...	2020-08-07 18:48:15 -07:00
David S. Miller	2e7199bd77	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2020-08-04 The following pull-request contains BPF updates for your net-next tree. We've added 73 non-merge commits during the last 9 day(s) which contain a total of 135 files changed, 4603 insertions(+), 1013 deletions(-). The main changes are: 1) Implement bpf_link support for XDP. Also add LINK_DETACH operation for the BPF syscall allowing processes with BPF link FD to force-detach, from Andrii Nakryiko. 2) Add BPF iterator for map elements and to iterate all BPF programs for efficient in-kernel inspection, from Yonghong Song and Alexei Starovoitov. 3) Separate bpf_get_{stack,stackid}() helpers for perf events in BPF to avoid unwinder errors, from Song Liu. 4) Allow cgroup local storage map to be shared between programs on the same cgroup. Also extend BPF selftests with coverage, from YiFei Zhu. 5) Add BPF exception tables to ARM64 JIT in order to be able to JIT BPF_PROBE_MEM load instructions, from Jean-Philippe Brucker. 6) Follow-up fixes on BPF socket lookup in combination with reuseport group handling. Also add related BPF selftests, from Jakub Sitnicki. 7) Allow to use socket storage in BPF_PROG_TYPE_CGROUP_SOCK-typed programs for socket create/release as well as bind functions, from Stanislav Fomichev. 8) Fix an info leak in xsk_getsockopt() when retrieving XDP stats via old struct xdp_statistics, from Peilin Ye. 9) Fix PT_REGS_RC{,_CORE}() macros in libbpf for MIPS arch, from Jerry Crunchtime. 10) Extend BPF kernel test infra with skb->family and skb->{local,remote}_ip{4,6} fields and allow user space to specify skb->dev via ifindex, from Dmitry Yakunin. 11) Fix a bpftool segfault due to missing program type name and make it more robust to prevent them in future gaps, from Quentin Monnet. 12) Consolidate cgroup helper functions across selftests and fix a v6 localhost resolver issue, from John Fastabend. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-03 18:27:40 -07:00
Tony Nguyen	7dbc63f0a5	ice: Misc minor fixes This is a collection of minor fixes including typos, white space, and style. No functional changes. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>	2020-08-01 08:44:04 -07:00
Victor Raj	6a2c2b2c1b	ice: adjust profile ID map locks The profile ID map lock should be held till the caller completes all references of that profile entries. The current code releases the lock right after the match search. This caused a driver issue when the profile map entries were referenced after it was freed in other thread after the lock was released earlier. Signed-off-by: Victor Raj <victor.raj@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:44:04 -07:00
Tony Nguyen	eddbee9b94	ice: update PTYPE lookup table Update the PTYPE lookup table to reflect values that can be set by the hardware. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>	2020-08-01 08:44:04 -07:00
Nick Nunley	68d210a609	ice: Disable VLAN pruning in promiscuous mode Disable VLAN pruning when entering promiscuous mode, and re-enable it when exiting. Without this VLAN-over-bridge topologies created on the device won't be functional unless rx-vlan-filter is explicitly disabled with ethtool. Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:44:04 -07:00
Surabhi Boob	bcc46cb8a0	ice: Graceful error handling in HW table calloc failure In the ice_init_hw_tbls, if the devm_kcalloc for es->written fails, catch that error and bail out gracefully, instead of continuing with a NULL pointer. Fixes: `32d63fa1e9` ("ice: Initialize DDP package structures") Signed-off-by: Surabhi Boob <surabhi.boob@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:44:04 -07:00
Kiran Patil	0a37abfa01	ice: port fix for chk_linearlize This is a port of commit `248de22e63` ("i40e/i40evf: Account for frags split over multiple descriptors in check linearize") As part of testing workloads (read/write) using larger IO size (128K) tx_timeout is observed and whenever it happens, it was due to tx_linearize. Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:44:04 -07:00
Brett Creeley	f34f55557a	ice: Allow 2 queue pairs per VF on SR-IOV initialization Currently VFs are only allowed to get 16, 4, and 1 queue pair by default, which require 17, 5, and 2 MSI-X vectors respectively. This is because each VF needs a MSI-X per data queue and a MSI-X for its other interrupt. The calculation is based on the number of VFs created, MSI-X available, and queue pairs available at the time of VF creation. Unfortunately the values above exclude 2 queue pairs when only 3 MSI-X are available to each VF based on resource constraints. The current calculation would default to 2 MSI-X and 1 queue pair. This is a waste of resources, so fix this by allowing 2 queue pairs per VF when there are between 2 and 5 MSI-X available per VF. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:44:04 -07:00
Vignesh Sridhar	ec1d1d2302	ice: Clear and free XLT entries on reset This fix has been added to address memory leak issues resulting from triggering a sudden driver reset which does not allow us to follow our normal removal flows for SW XLT entries for advanced features. - Adding call to destroy flow profile locks when clearing SW XLT tables. - Extraction sequence entries were not correctly cleared previously which could cause ownership conflicts for repeated reset-replay calls. Fixes: `31ad4e4ee1` ("ice: Allocate flow profile") Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:44:04 -07:00
Jesse Brandeburg	a8fffd7ae9	ice: add useful statistics Display and count some useful hot-path statistics. The usefulness is as follows: - tx_restart: use to determine if the transmit ring size is too small or if the transmit interrupt rate is too low. - rx_gro_dropped: use to count drops from GRO layer, which previously were completely uncounted when occurring. - tx_busy: use to determine when the driver is miscounting number of descriptors needed for an skb. - tx_timeout: as our other drivers, count the number of times we've reset due to timeout because the kernel only prints a warning once per netdev. Several of these were already counted but not displayed. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:44:04 -07:00
Jesse Brandeburg	a4c493fea5	ice: remove page_reuse statistic The page reuse statistic wasn't even being displayed to the user, even though the driver counted it. Don't waste the struct space and hot-path cycles since the driver doesn't display it. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:43:59 -07:00
Vignesh Sridhar	cdedbab92d	ice: Fix RSS profile locks Replacing flow profile locks with RSS profile locks in the function to remove all RSS rules for a given VSI. This is to align the locks used for RSS rule addition to VSI and removal during VSI teardown to avoid a race condition owing to several iterations of the above operations. In function to get RSS rules for given VSI and protocol header replacing the pointer reference of the RSS entry with a copy of hash value to ensure thread safety. Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:22:30 -07:00
Kiran Patil	f07d134d37	ice: fix the vsi_id mask to be 10 bit for set_rss_lut set_rss_lut can fail due to incorrect vsi_id mask. vsi_id is 10 bit but mask was 0x1FF whereas it should be 0x3FF. For vsi_num >= 512, FW set_rss_lut can fail with return code EACCESS (VSI ownership issue) because software was providing incorrect vsi_num (dropping 10th bit due to incorrect mask) for set_rss_lut admin command Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:20:10 -07:00
Nick Nunley	585cdabdfd	ice: rename misleading grst_delay variable The grst_delay variable in ice_check_reset contains the maximum time (in 100 msec units) that the driver will wait for a reset event to transition to the Device Active state. The value is the sum of three separate components: 1) The maximum time it may take for the firmware to process its outstanding command before handling the reset request. 2) The value in RSTCTL.GRSTDEL (the delay firmware inserts between first seeing the driver reset request and the actual hardware assertion). 3) The maximum expected reset processing time in hardware. Referring to this total time as "grst_delay" is misleading and potentially confusing to someone checking the code and cross-referencing the hardware specification. Fix this by renaming the variable to "grst_timeout", which is more descriptive of its actual use. Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:17:40 -07:00
Wei Yongjun	65c72291f7	ice: mark PM functions as __maybe_unused In certain configurations without power management support, the following warnings happen: drivers/net/ethernet/intel/ice/ice_main.c:4214:12: warning: 'ice_resume' defined but not used [-Wunused-function] 4214 \| static int ice_resume(struct device dev) \| ^~~~~~~~~~ drivers/net/ethernet/intel/ice/ice_main.c:4150:12: warning: 'ice_suspend' defined but not used [-Wunused-function] 4150 \| static int ice_suspend(struct device dev) \| ^~~~~~~~~~~ Mark these functions as __maybe_unused to make it clear to the compiler that this is going to happen based on the configuration, which is the standard for these types of functions. Fixes: `769c500dcc` ("ice: Add advanced power mgmt for WoL") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-08-01 08:15:56 -07:00
Tony Nguyen	6221595fc5	ice: fix unused parameter warning Depending on PAGE_SIZE, the following unused parameter warning can be reported: drivers/net/ethernet/intel/ice/ice_txrx.c: In function ‘ice_rx_frame_truesize’: drivers/net/ethernet/intel/ice/ice_txrx.c:513:21: warning: unused parameter ‘size’ [-Wunused-parameter] unsigned int size) The 'size' variable is used only when PAGE_SIZE >= 8192. Add __maybe_unused to remove the warning. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>	2020-07-29 08:38:56 -07:00
Ben Shelton	7dfff9ffe8	ice: disable no longer needed workaround for FW logging For the FW logging info AQ command, we currently set the ICE_AQ_FLAG_RD in order to work around a FW issue. This issue has been fixed so remove the workaround. Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:56 -07:00
Bruce Allan	e923f04d66	ice: reduce scope of variable The scope of the macro local variable 'i' can be reduced. Do so to avoid static analysis tools from complaining. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:55 -07:00
Marcin Szycik	78116e979d	ice: cleanup VSI on probe fail As part of ice_setup_pf_sw() a PF VSI is setup; release the VSI in case of failure. Signed-off-by: Marcin Szycik <marcin.szycik@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:55 -07:00
Brett Creeley	cd1f56f429	ice: Allow all VLANs in safe mode Currently the PF VSI's context parameters are left in a bad state when going into safe mode. This is causing VLAN traffic to not pass. Fix this by configuring the PF VSI to allow all VLAN tagged traffic. Also, remove redundant comment explaining the safe mode flow in ice_probe(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:55 -07:00
Krzysztof Kazimierczak	682dfedcee	ice: need_wakeup flag might not be set for Tx This is a port of i40e commit `705639572e` ("i40e: need_wakeup flag might not be set for Tx"). Quoting the original commit message: "The need_wakeup flag for Tx might not be set for AF_XDP sockets that are only used to send packets. This happens if there is at least one outstanding packet that has not been completed by the hardware and we get that corresponding completion (which will not generate an interrupt since interrupts are disabled in the napi poll loop) between the time we stopped processing the Tx completions and interrupts are enabled again. In this case, the need_wakeup flag will have been cleared at the end of the Tx completion processing as we believe we will get an interrupt from the outstanding completion at a later point in time. But if this completion interrupt occurs before interrupts are enable, we lose it and should at that point really have set the need_wakeup flag since there are no more outstanding completions that can generate an interrupt to continue the processing. When this happens, user space will see a Tx queue need_wakeup of 0 and skip issuing a syscall, which means will never get into the Tx processing again and we have a deadlock." As a result, packet processing stops. This patch introduces a fix for this issue, by always setting the need_wakeup flag at the end of an interrupt processing. This ensures that the deadlock will not happen. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:55 -07:00
Victor Raj	4043818c13	ice: distribute Tx queues evenly Distribute the Tx queues evenly across all queue groups. This will help the queues to get more equal sharing among the queues when all are in use. In the previous algorithm, the next queue group node will be picked up only after the previous one filled with max children. For example: if VSI is configured with 9 queues, the first 8 queues will be assigned to queue group 1 and the 9th queue will be assigned to queue group 2. The 2 queue groups split the bandwidth between them equally (50:50). The first queue group node will share the 50% bandwidth with all of its children (8 queues). And the second queue group node will share the entire 50% bandwidth with its only children. The new algorithm will fix this issue. Signed-off-by: Victor Raj <victor.raj@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:55 -07:00
Tarun Singh	984824a210	ice: Adjust scheduler default BW weight By default the queues are configured in legacy mode. The default BW settings for legacy/advanced modes are different. The existing code was using the advanced mode default value of 1 which was incorrect. This caused the unbalanced BW sharing among siblings. The recommended default value is applied. Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:55 -07:00
Tarun Singh	b3b93d6ce1	ice: Add RL profile bit mask check Mask bits before accessing the profile type field. Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:55 -07:00
Paul M Stillwell Jr	a02016de00	ice: fix overwriting TX/RX descriptor values when rebuilding VSI If a user sets the value of the TX or RX descriptors to some non-default value using 'ethtool -G' then we need to not overwrite the values when we rebuild the VSI. The VSI rebuild could happen as a result of a user setting the number of queues via the 'ethtool -L' command. Fix this by checking to see if the value we have stored is non-zero and if it is then don't change the value. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:55 -07:00
Kiran Patil	ca1fdb885e	ice: return correct error code from ice_aq_sw_rules Return ICE_ERR_DOES_NOT_EXIST return code if admin command error code is ICE_AQ_RC_ENOENT (not exist). ice_aq_sw_rules is used when switch rule is getting added/deleted/updated. In case of delete/update switch rule, admin command can return ICE_AQ_RC_ENOENT error code if such rule does not exist, hence return ICE_ERR_DOES_NOT_EXIST error code from ice_aq_sw_rule, so that caller of this function can decide how to handle ICE_ERR_DOES_NOT_EXIST. Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:55 -07:00
Nick Nunley	a54a0b24f4	ice: restore VF MSI-X state during PCI reset During a PCI FLR the MSI-X Enable flag in the VF PCI MSI-X capability register will be cleared. This can lead to issues when a VF is assigned to a VM because in these cases the VF driver receives no indication of the PF PCI error/reset and additionally it is incapable of restoring the cleared flag in the hypervisor configuration space without fully reinitializing the driver interrupt functionality. Since the VF driver is unable to easily resolve this condition on its own, restore the VF MSI-X flag during the PF PCI reset handling. Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:55 -07:00
Dave Ertman	0ce6c34a8f	ice: fix link event handling timing When the driver experiences a link event (especially link up) there can be multiple events generated. Some of these are link fault and still have a state of DOWN set. The problem happens when the link comes UP during the PF driver handling one of the LINK DOWN events. The status of the link is updated and is now seen as UP, so when the actual LINK UP event comes, the port information has already been updated to be seen as UP, even though none of the UP activities have been completed. After the link information has been updated in the link handler and evaluated for MEDIA PRESENT, if the state of the link has been changed to UP, treat the DOWN event as an UP event since the link is now UP. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:54 -07:00
Dave Ertman	b767ca650f	ice: Fix link broken after GLOBR reset After a GLOBR, the link was broken so that a link up situation was being seen as a link down. The problem was that the rebuild process was updating the port_info link status without doing any of the other things that need to be done when link changes. This was causing the port_info struct to have current "UP" information so that any further UP interrupts were skipped as redundant. The rebuild flow should not be updating the port_info struct link information, so eliminate this and leave it to the link event handling code. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:54 -07:00
Dave Ertman	7d9c9b791f	ice: Implement LFC workaround There is a bug where the LFC settings are not being preserved through a link event. The registers in question are the ones that are touched (and restored) when a set_local_mib AQ command is performed. On a link-up event, make sure that a set_local_mib is being performed. Move the function ice_aq_set_lldp_mib() from the DCB specific ice_dcb.c to ice_common.c so that the driver always has access to this AQ command. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-29 08:38:54 -07:00
Jacob Keller	d69ea414c9	ice: implement device flash update via devlink Use the newly added pldmfw library to implement device flash update for the Intel ice networking device driver. This support uses the devlink flash update interface. The main parts of the flash include the Option ROM, the netlist module, and the main NVM data. The PLDM firmware file contains modules for each of these components. Using the pldmfw library, the provided firmware file will be scanned for the three major components, "fw.undi" for the Option ROM, "fw.mgmt" for the main NVM module containing the primary device firmware, and "fw.netlist" containing the netlist module. The flash is separated into two banks, the active bank containing the running firmware, and the inactive bank which we use for update. Each module is updated in a staged process. First, the inactive bank is erased, preparing the device for update. Second, the contents of the component are copied to the inactive portion of the flash. After all components are updated, the driver signals the device to switch the active bank during the next EMP reset (which would usually occur during the next reboot). Although the firmware AdminQ interface does report an immediate status for each command, the NVM erase and NVM write commands receive status asynchronously. The driver must not continue writing until previous erase and write commands have finished. The real status of the NVM commands is returned over the receive AdminQ. Implement a simple interface that uses a wait queue so that the main update thread can sleep until the completion status is reported by firmware. For erasing the inactive banks, this can take quite a while in practice. To help visualize the process to the devlink application and other applications based on the devlink netlink interface, status is reported via the devlink_flash_update_status_notify. While we do report status after each 4k block when writing, there is no real status we can report during erasing. We simply must wait for the complete module erasure to finish. With this implementation, basic flash update for the ice hardware is supported. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-28 17:07:06 -07:00
Jacob Keller	2ab560a78e	ice: add flags indicating pending update of firmware module After a flash update, the pending status of the update can be determined from the device capabilities. Read the appropriate device capability and store whether there is a pending update awaiting a reboot. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-28 17:07:06 -07:00
Cudzilo, Szymon T	544cd2ac13	ice: Add AdminQ commands for FW update Add structures, identifiers, and helper functions for several AdminQ commands related to performing a firmware update for the ice hardware. These will be used in future code for implementing the devlink .flash_update handler. Signed-off-by: Cudzilo, Szymon T <szymon.t.cudzilo@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-28 17:07:06 -07:00
Jacek Naczyk	de9b277ee0	ice: Add support for unified NVM update flow capability Extends function parsing response from Discover Device Capability AQC to check if the device supports unified NVM update flow. Signed-off-by: Jacek Naczyk <jacek.naczyk@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-28 17:07:06 -07:00
Andrii Nakryiko	e8407fdeb9	bpf, xdp: Remove XDP_QUERY_PROG and XDP_QUERY_PROG_HW XDP commands Now that BPF program/link management is centralized in generic net_device code, kernel code never queries program id from drivers, so XDP_QUERY_PROG/XDP_QUERY_PROG_HW commands are unnecessary. This patch removes all the implementations of those commands in kernel, along the xdp_attachment_query(). This patch was compile-tested on allyesconfig. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722064603.3350758-10-andriin@fb.com	2020-07-25 20:37:02 -07:00
Paul M Stillwell Jr	c2b352262a	ice: add 1G SGMII PHY type There isn't a case for 1G SGMII in ice_get_media_type() so add the handling for it. Also handle the special case where some direct attach cables may report that they support 1G SGMII, but that is erroneous since SGMII is supposed to be a backplane media type (between a MAC and a PHY). If the driver doesn't handle this special case then a user could see the 'Port' in ethtool change from 'Direct attach Copper' to 'Backplane' when they have forced the speed to 1G, but the cable hasn't changed. Lastly, change ice_aq_get_phy_caps() to save the module_type info if the function was called with ICE_AQC_REPORT_TOPO_CAP. This call uses the media information to populate the module_type. If no media is present then the values in module_type will be 0. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:36:14 -07:00
Doug Dziggel	c1eb3b6b68	ice: Report AOC PHY Types as Fiber Report AOC types as fiber instead of unknown. Signed-off-by: Doug Dziggel <douglas.a.dziggel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:34:43 -07:00
Paul Greenwalt	8ea1da593b	ice: add AQC get link topology handle support Add AQC get link topology handle support. This is needed to determine Direct Attach (DA) or backplane media type for PHY types that support either. Get link topology handle cage node type request can be used to determine if a cage is present or not. If a cage is present for PHY types that supports both DA and backplane media type, then the media type is DA, else the media type is backplane. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:33:26 -07:00
Lev Faerman	bdeff9718a	ice: Rename low_power_ctrl Rename the low_power_ctrl field to low_power_ctrl_an to be properly descriptive of it being an autoneg field. Signed-off-by: Lev Faerman <lev.faerman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:31:30 -07:00
Paul Greenwalt	5ee30564c8	ice: update reporting of autoneg capabilities Firmware now reports AN28, AN32, and AN73. Add a helper and check these new values and report PHY autoneg capability. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:29:46 -07:00
Paul Greenwalt	55df52a0bc	ice: add ice_aq_get_phy_caps() debug logs Add debug logs for ice_aq_get_phy_caps(), and format ice_aq_set_phy_cfg() and ice_aq_get_link_info() debug logs to make them more readable. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:27:46 -07:00
Bruce Allan	b4e813dd04	ice: support Total Port Shutdown on devices that support it When the Port Disable bit is set in the Link Default Override Mask TLV PFA module in the NVM, Total Port Shutdown mode is supported and enabled. In this mode, the driver should act as if the link-down-on-close ethtool private flag is always enabled and dis-allow any change to that flag. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:26:09 -07:00
Paul Greenwalt	ea78ce4dab	ice: add link lenient and default override support Adds functions to check for link override firmware support and get the override settings for a port. The previously supported/default link mode was strict mode. In strict mode link is configured based on get PHY capabilities PHY types with media. Lenient mode is now the default link mode. In lenient mode the link is configured based on get PHY capabilities PHY types without media. This allows the user to configure link that the media does not report. Limit the minimum supported link mode to 25G for devices that support 100G, and 1G for devices that support less than 100G. Default override is only supported in lenient mode. If default override is supported and enabled, then default override values are used for configuring speed and FEC. Default override provide persistent link settings in the NVM. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Evan Swanson <evan.swanson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:22:31 -07:00
Paul Greenwalt	1a3571b593	ice: restore PHY settings on media insertion After the transition from no media to media FW will clear the set-phy-cfg data set by the user. Save initial PHY settings and any settings later requested by the user and use that data to restore PHY settings on media insertion. Since PHY configuration is now being stored, replace calls that were calling FW to get the configuration with the saved copy. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:15:28 -07:00
Paul Greenwalt	61cf42e71a	ice: move auto FEC checks into ice_cfg_phy_fec() The call to ice_cfg_phy_fec() requires the caller to perform certain actions before calling it. Instead of imposing these preconditions move the operations into the function and perform them ourselves. Also, fix some style issues in nearby touched code. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:05:49 -07:00
Paul Greenwalt	2ffb60856a	ice: refactor FC functions Create a helper function for configuring requested flow control so that it can be utilized by other functions looking to configure flow control settings. Utilize the existing helper ice_copy_phy_caps_to_cfg() to copy a PHY capability to configuration instead duplicating the code for it. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 15:03:58 -07:00
Akeem G Abodunrin	769c500dcc	ice: Add advanced power mgmt for WoL Add callbacks needed to support advanced power management for Wake on LAN. Also make ice_pf_state_is_nominal function available for all configurations not just CONFIG_PCI_IOV. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 14:59:20 -07:00
Jacob Keller	81aed6475d	ice: split ice_discover_caps into two functions Using the new ice_aq_list_caps and ice_parse_(dev\|func)_caps functions, replace ice_discover_caps with two functions that each take a pointer to the dev_caps and func_caps structures respectively. This makes the side effect of updating the hw->dev_caps and hw->func_caps obvious from reading the implementation of the function. Additionally, it opens the way for enabling reading of device capabilities outside of the initialization flow. By passing in a pointer, another caller will be able to read the capabilities without modifying the HW capabilities structures. As there are no other callers, it is safe to now remove ice_aq_discover_caps and ice_parse_caps. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 14:48:41 -07:00
Jacob Keller	595b13e228	ice: split ice_parse_caps into separate functions The ice_parse_caps function is used to convert the capability block data coming from firmware into a structured format used by other parts of the code. The current implementation directly updates the hw->func_caps and hw->dev_caps structures. It is directly called from within ice_aq_discover_caps. This causes the discover_caps function to have the side effect of modifying the HW capability structures, which is not intuitive. Split this function into ice_parse_dev_caps and ice_parse_func_caps. These functions will take a pointer to the dev_caps and func_caps respectively. Also create an ice_parse_common_caps for sharing the capability logic that is common to device and function. Doing so enables a future refactor to allow reading and parsing capabilities into a local caps structure instead of modifying the members of the HW structure directly. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 14:46:33 -07:00
Jacob Keller	1082b360e3	ice: refactor ice_discover_caps to avoid need to retry The ice_discover_caps function is used to read the device and function capabilities, updating the hardware capabilities structures with relevant data. The exact number of capabilities returned by the hardware is unknown ahead of time. The AdminQ command will report the total number of capabilities in the return buffer. The current implementation involves requesting capabilities once, reading this returned size, and then re-requested with that size. This isn't really necessary. The firmware interface has a maximum size of ICE_AQ_MAX_BUF_LEN. Firmware can never return more than ICE_AQ_MAX_BUF_LEN / sizeof(struct ice_aqc_list_caps_elem) capabilities. Avoid the retry loop by simply allocating a buffer of size ICE_AQ_MAX_BUF_LEN. This is significantly simpler than retrying. The extra allocation isn't a big deal, as it will be released after we finish parsing the capabilities. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-23 14:16:02 -07:00
Danielle Ratson	71ad8d55f8	devlink: Replace devlink_port_attrs_set parameters with a struct Currently, devlink_port_attrs_set accepts a long list of parameters, that most of them are devlink port's attributes. Use the devlink_port_attrs struct to replace the relevant parameters. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-09 13:15:29 -07:00
Luc Van Oostenryck	16d79cd4e2	PCI: Use 'pci_channel_state_t' instead of 'enum pci_channel_state' The method struct pci_error_handlers.error_detected() is defined and documented as taking an 'enum pci_channel_state' for the second argument, but most drivers use 'pci_channel_state_t' instead. This 'pci_channel_state_t' is not a typedef for the enum but a typedef for a bitwise type in order to have better/stricter typechecking. Consolidate everything by using 'pci_channel_state_t' in the method's definition, in the related helpers and in the drivers. Enforce use of 'pci_channel_state_t' by replacing 'enum pci_channel_state' with an anonymous 'enum'. Note: Currently, from a typechecking point of view this patch changes nothing because only the constants defined by the enum are bitwise, not the enum itself (sparse doesn't have the notion of 'bitwise enum'). This may change in some not too far future, hence the patch. [bhelgaas: squash in https://lore.kernel.org/r/20200702162651.49526-3-luc.vanoostenryck@gmail.com https://lore.kernel.org/r/20200702162651.49526-4-luc.vanoostenryck@gmail.com] Link: https://lore.kernel.org/r/20200702162651.49526-2-luc.vanoostenryck@gmail.com Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2020-07-07 17:11:52 -05:00
Bruce Allan	66486d8943	ice: replace single-element array used for C struct hack Convert the pre-C90-extension "C struct hack" method (using a single- element array at the end of a structure for implementing variable-length types) to the preferred use of C99 flexible array member. Additional code cleanups were done near areas affected by this change. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-01 16:35:23 -07:00
Bruce Allan	b3c3890489	ice: avoid unnecessary single-member variable-length structs There are a number of structures that consist of a one-element array as the only struct member. Some of those are unused so remove them. Others are used to index into a buffer/array consisting of a variable number of a different data or structure type. Those are unnecessary since we can use simple pointer arithmetic or index directly into the buffer to access individual elements of the buffer/array. Additional code cleanups were done near areas affected by this change. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-01 16:33:29 -07:00
Jacob Keller	8d7aab3515	ice: implement snapshot for device capabilities Add a new devlink region used for capturing a snapshot of the device capabilities buffer which is reported by the firmware over the AdminQ. This information can useful in debugging driver and firmware interactions. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-07-01 15:41:45 -07:00
David S. Miller	b0f46a9754	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2020-06-25 This series contains updates to i40e driver and removes the individual driver versions from all of the Intel wired LAN drivers. Shiraz moves the client header so that it can easily be shared between the i40e LAN driver and i40iw RDMA driver. Jesse cleans up the unused defines, since they are just dead weight. Alek reduces the unreasonably long wait time for a PF reset after reboot by using jiffies to limit the maximum wait time for the PF reset to succeed. Added additional logging to let the user know when the driver transitions into recovery mode. Adds new device support for our 5 Gbps NICs. Todd adds a check to see if MFS is set after warm reboot and notifies the user when MFS is set to anything lower than the default value. Arkadiusz fixes a possible race condition, where were holding a spin-lock while in atomic context. v2: removed code comments that were no longer applicable in patch 2 of the series. Also removed 'inline' from patch 4 and patch 8 of the series. Also re-arranged code to be able to remove the forward function declarations. Dropped patch 9 of the series, while the author works on cleaning up the commit message. v3: Updated patch 8 description to answer Jakub's questions ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-26 12:22:34 -07:00
Jeff Kirsher	34a2a3b83e	net/intel: remove driver versions from Intel drivers As with other networking drivers, remove the unnecessary driver version from the Intel drivers. The ethtool driver information and module version will then report the kernel version instead. For ixgbe, i40e and ice drivers, the driver passes the driver version to the firmware to confirm that we are up and running. So we now pass the value of UTS_RELEASE to the firmware. This adminq call is required per the HAS document. The Device then sends an indication to the BMC that the PF driver is present. This is done using Host NC Driver Status Indication in NC-SI Get Link command or via the Host Network Controller Driver Status Change AEN. What the BMC may do with this information is implementation-dependent, but this is a standard NC-SI 1.1 command we honor per the HAS. CC: Bruce Allan <bruce.w.allan@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Alek Loktionov <aleksandr.loktionov@intel.com> CC: Kevin Liedtke <kevin.d.liedtke@intel.com> CC: Aaron Rowden <aaron.f.rowden@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Co-developed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com>	2020-06-25 22:25:13 -07:00
Ciara Loftus	b1d95cc239	ice: protect ring accesses with WRITE_ONCE The READ_ONCE macro is used when reading rings prior to accessing the statistics pointer. The corresponding WRITE_ONCE usage when allocating and freeing the rings to ensure protected access was not in place. Introduce this. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-18 22:35:34 -07:00
Lorenzo Bianconi	1b698fa5d8	xdp: Rename convert_to_xdp_frame in xdp_convert_buff_to_frame In order to use standard 'xdp' prefix, rename convert_to_xdp_frame utility routine in xdp_convert_buff_to_frame and replace all the occurrences Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/6344f739be0d1a08ab2b9607584c4d5478c8c083.1590698295.git.lorenzo@kernel.org	2020-06-01 15:02:53 -07:00
Chinh T Cao	b5e19a642b	ice: Ignore EMODE when setting PHY config When setting the PHY cfg (CQ cmd 0x0601), if the firmware responds with an EMODE error, software will ignore the error as it simply means that manageability (ex: BMC) is in control of the link and that the new setting may not be applied. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 04:01:16 -07:00
Henry Tieman	d5329be990	ice: fix aRFS after flow director delete The logic was missing for adding back perfect flows after flow director filter delete. The code now adds perfect flows into the HW tables after filter delete. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:58:12 -07:00
Brett Creeley	a039f6fcba	ice: Use coalesce values from q_vector 0 when increasing q_vectors Currently when a VSI is built (i.e. reset, set channels, etc.) the coalesce settings will be preserved in most cases. However, when the number of q_vectors are increased the settings for the new q_vectors will be set to the driver defaults of AIM on, Rx/Tx ITR 50, and INTRL 0. This is causing issues with how the ethtool layer gets the current coalesce settings since it only uses q_vector 0. So, assume that the user set the coalesce settings globally (i.e. ethtool -C eth0) and use q_vector 0's settings for all of the new q_vectors. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:56:40 -07:00
Paul M Stillwell Jr	1a9c561aa3	ice: fix PCI device serial number to be lowercase values Commit `ceb2f00707` ("ice: Use pci_get_dsn()") changed the code to use a new function to get the Device Serial Number. It also changed the case of the filename for loading a package on a specific NIC from lowercase to uppercase. Change the filename back to lowercase since that is what we specified. Fixes: `ceb2f00707` ("ice: Use pci_get_dsn()") Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:55:07 -07:00
Bruce Allan	ebb462dc21	ice: fix function signature style format Where possible, cuddle multiple lines of function signatures to be consistent throughout the code. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:52:25 -07:00
Brett Creeley	7dcc0fb8f6	ice: Allow VF to request reset as soon as it's initialized A VF driver has the ability to request reset via VIRTCHNL_OP_RESET_VF. This is a required step in VF driver load. Currently, the PF is only allowing a VF to request reset using this method after the VF has already communicated resources via VIRTCHNL_OP_GET_VF_RESOURCES. However, this is incorrect because the VF can request reset before requesting resources. Fix this by allowing the VF to request a reset once it has been initialized. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:50:37 -07:00
Jesse Brandeburg	765dd7a182	ice: Fix inability to set channels when down Currently the driver prevents a user from doing modprobe ice ethtool -L eth0 combined 5 ip link set eth0 up The ethtool command fails, because the driver is checking to see if the interface is down before allowing the get_channels to proceed (even for a set_channels). Remove this check and allow the user to configure the interface before bringing it up, which is a much better usability case. Fixes: `87324e747f` ("ice: Implement ethtool ops for channels") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:49:00 -07:00
Brett Creeley	401ce33b32	ice: Always clear QRXFLXP_CNTXT before writing new value Always clear the previous value in QRXFLXP_CNTXT before writing a new value. This will make it so re-used queues will not accidentally take the previously configured settings. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:47:35 -07:00
Brett Creeley	cf0bf41dd6	ice: Reset VF for all port VLAN changes from host Currently the PF is modifying the VF's port VLAN on the fly when configured via iproute. This is okay for most cases, but if the VF already has guest VLANs configured the PF has to remove all of those filters so only VLAN tagged traffic that matches the port VLAN will pass. Instead of adding functionality to track which guest VLANs have been added, just reset the VF each time port VLAN parameters are modified. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:43:00 -07:00
Chinh T Cao	bff185e240	ice: Update ICE_PHY_TYPE_HIGH_MAX_INDEX value As currently, we are supporting only 5 PHY_SPEEDs for phy_type_high. Thus, we should adjust the value of ICE_PHY_TYPE_HIGH_MAX_INDEX to 5. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:41:29 -07:00
Dan Nowlin	c9a12d6d20	ice: Increase timeout after PFR To allow for resets during package download, increase the timeout period after performing a PFR. The time waited is the global config lock timeout plus the normal PFSWR timeout. Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:39:45 -07:00
Brett Creeley	2bb19d6e07	ice: Fix transmit for all software offloaded VLANs Currently the driver does not recognize when there is an 802.1AD VLAN tag right after the dmac/smac (outermost VLAN tag). If any DCB map is applied and/or DCB is enabled this is causing the hardware to insert a VLAN 0 tag after the 802.1AD VLAN tag that is already in the packet. Fix this by preventing VLAN tag 0 from being added when any VLAN is already present after dmac/smac (software offloaded) or skb (hardware offloaded). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:38:20 -07:00
Paul Greenwalt	c1636a6e8a	ice: support adding 16 unicast/multicast filter on untrusted VF Allow untrusted VF to add 16 unicast/multicast filters. VF uses 1 filter for the default/perm_addr/LAA MAC, 1 for broadcast, and 16 additional unicast/multicast filters. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:36:06 -07:00
Brett Creeley	f109603a4b	ice: allow host to clear administratively set VF MAC Currently a user is not allowed to clear a VF's administratively set MAC on the PF. Fix this by allowing an all zero MAC address via "ip link set ${pf_eth} vf ${vf_id} mac 00:00:00:00:00:00". An example use case for this would be issuing a "virsh shutdown" command on a VM. The call to iproute mentioned above is part of this flow. Without this change the driver incorrectly rejects clearing the VF's administratively set MAC and prints unhelpful log messages. Also, improve the comments surrounding this change. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-31 03:34:15 -07:00
Brett Creeley	3726cce258	ice: Refactor VF VSI release and setup functions Currently when a VF VSI calls ice_vsi_release() and ice_vsi_setup() it subsequently clears/sets the VF cached variables for lan_vsi_idx and lan_vsi_num. This works fine, but can be improved by handling this in the VF specific VSI release and setup functions. Also, when a VF VSI is setup too many parameters are passed that can be derived from the VF. Fix this by only calling VF VSI setup with the bare minimum parameters. Also, add functionality to invalidate a VF's VSI when it's released and/or setup fails. This will make it so a VF VSI cannot be accessed via its cached vsi_idx/vsi_num in these cases. Finally when a VF's VSI is invalidated set the lan_vsi_idx and lan_vsi_num to ICE_NO_VSI to clearly show that there is no valid VSI associated with this VF. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 16:25:19 -07:00
Brett Creeley	12bb018c53	ice: Refactor VF reset Currently VF VSI are being reset twice during a PFR or greater. This is causing reset, specifically resetting all VFs, to take too long. This is causing various issues with VF drivers not being able to gracefully handle the VF reset timeout. Fix this by refactoring how VF reset is handled for the case mentioned previously and for the VFR/VFLR case. The refactor was done by doing the following: 1. Removing the call to ice_vsi_rebuild_by_type for ICE_VSI_VF VSI, which was causing the initial VSI rebuild. 2. Adding functions for pre/post VSI rebuild functions that can be called in both the reset all VFs case and reset individual VF case. 3. Adding VSI rebuild functions that are specific for the reset all VFs case and adding functions that are specific for the reset individual VF case. 4. Calling the pre-rebuild function, then the specific VSI rebuild function based on the reset type, and then calling the post-rebuild function to handle VF resets. This patch series makes some assumptions about how VSI are handling by FW during reset: 1. During a PFR or greater all VSI in FW will be cleared. 2. During a VFR/VFLR the VSI rebuild responsibility is in the hands of the PF software. 3. There is code in the ice_reset_all_vfs() case to amortize operations if possible. This was left intact. 4. PF software should not be replaying VSI based filters that were added other than host configured, PF software configured, or the VF's default/LAA MAC. This is the VF drivers job after it has been reset. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 16:24:02 -07:00
Paul Greenwalt	a58e1d8174	ice: remove VM/VF disable command on CORER/GLOBR reset Remove VM/VF disable AQC (opcode 0x0C31) when resetting all VFs. This is not required for CORER/GLOBR reset. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 16:11:26 -07:00
Brett Creeley	350e822cd5	ice: Add functions to rebuild host VLAN/MAC config for a VF When resetting a VF the VLAN and MAC filter configurations need to be replayed. Add helper functions for this purpose. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 16:10:02 -07:00
Brett Creeley	eb2af3ee94	ice: Add function to set trust mode bit on reset As the title says, use a function to set trust mode bit on reset. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 16:07:21 -07:00
Brett Creeley	a06325a090	ice: Renaming and simplification in VF init path Some function names weren't very clear and some portions of VF creation could be moved into functions for clarity. Fix this by renaming some functions and move pieces of code into clearly name functions. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 16:03:51 -07:00
Brett Creeley	916c7fdf5e	ice: Separate VF VSI initialization/creation from reset flow Currently the same flow is used for VF VSI initialization/creation and VF VSI reset. This makes the initialization/creation flow unnecessarily complicated. Fix this by separating the initialization/creation of the VF VSI from the reset flow. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 16:01:39 -07:00
Brett Creeley	cfcee02b6c	ice: Add helper function for clearing VPGEN_VFRTRIG Create a helper function for clearing VPGEN_VFRTRIG as this needs to be done on reset to notify the VF that we are done resetting it. Also, it needs to be done on SR-IOV initialization/creation in case it was left in a bad state after SR-IOV tear down. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 15:59:18 -07:00
Brett Creeley	02337f1f59	ice: Simplify ice_sriov_configure Add a new function for checking if SR-IOV can be configured based on the PF and/or device's state/capabilities. Also, simplify the flow in ice_sriov_configure(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 15:57:40 -07:00
Brett Creeley	ac3716134a	ice: Refactor ice_ena_vf_mappings to split MSIX and queue mappings Currently ice_ena_vf_mappings() does all of the VF's MSIX and queue mapping in one function. This makes it hard to digest. Fix this by creating a new function for enabling MSIX mappings and one for enabling queue mappings. Also, rename some variables in the functions for clarity. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 15:55:54 -07:00
Tony Nguyen	d3112cd1ab	ice: Declare functions static ice_get_pfa_module_tlv() and ice_read_sr_word() are not being called outside of their file. Declare them as static. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 15:53:21 -07:00
Jacob Keller	c2b313b783	ice: fix kernel BUG if register_netdev fails If register_netdev() fails, the driver will attempt to cleanup the q_vectors and inadvertently trigger a kernel BUG due to a NULL pointer dereference. This occurs because cleaning up q_vectors attempts to call netif_napi_del on napi_structs which were never initialized. Resolve this by releasing the netdev in ice_cfg_netdev and setting vsi->netdev to NULL. This ensures that after ice_cfg_netdev fails the state is rewound to match as if ice_cfg_netdev was never called. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 15:51:28 -07:00
Jacob Keller	bc3a024101	ice: fix potential double free in probe unrolling If ice_init_interrupt_scheme fails, ice_probe will jump to clearing up the interrupts. This can lead to some static analysis tools such as the compiler sanitizers complaining about double free problems. Since ice_init_interrupt_scheme already unrolls internally on failure, there is no need to call ice_clear_interrupt_scheme when it fails. Add a new unroll label and use that instead. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 15:50:01 -07:00
Jacob Keller	072064a43e	ice: cleanup VSI context initialization Remove an unnecessary copy of vsi->info into ctxt->info in ice_vsi_init. This line is essentially a no-op because ice_set_dflt_vsi_ctx performs a memset to clear the info from the context structure. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 15:48:39 -07:00
Anirudh Venkataramanan	9918f2d22f	ice: Poll for reset completion when DDP load fails There are certain cases where the DDP load fails and the FW issues a core reset. For these cases, wait for reset to complete before proceeding with reset of the driver init. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-28 15:40:12 -07:00
Krzysztof Kazimierczak	3f0d97cdfe	ice: Check UMEM FQ size when allocating bufs If a UMEM is present on a queue when an interface/queue pair is being enabled, the driver will try to prepare the Rx buffers in advance to improve performance. However, if fill queue is shorter than HW Rx ring, the driver will report failure after getting the last address from the fill queue. This still lets the driver process the packets correctly during the NAPI poll, but leads to a constant NAPI rescheduling. Not allocating the buffers in advance would result in a potential performance decrease. Commit `d57d76428a` ("xsk: Add API to check for available entries in FQ") provides an API that lets drivers check the number of addresses that the fill queue holds. Notify the user if fill queue is not long enough to prepare all buffers before packet processing starts, and allocate the buffers during the NAPI poll. If the fill queue size is sufficient, prepare Rx buffers in advance. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 18:13:59 -07:00
Anirudh Venkataramanan	13f90b393f	ice: Refactor Rx checksum checks We don't need both rx_status and rx_error parameters, as the latter is a subset of the former. Remove rx_error completely and check the right bit in rx_status. Rename rx_status to rx_status0, and rx_status_err1 to rx_status1. This naming more closely reflects the specification. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 18:00:35 -07:00
Bruce Allan	7e34786a74	ice: avoid undefined behavior When writing the driver's struct ice_tlan_ctx structure, do not write the 8-bit element int_q_state with the associated internal-to-hardware field which is 122-bits, otherwise the helper function ice_write_byte() will use undefined behavior when setting the mask used for that write. This should not cause any functional change and will avoid use of undefined behavior. Also, update a comment to highlight this structure element is not written. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:58:21 -07:00
Marta Plantykow	ae15e0ba1b	ice: Change number of XDP Tx queues to match number of Rx queues In current implementation number of XDP Tx queues is the same as the number of transmit queues, which is not always true. This patch changes this number to match the number of receive queues. XDP programs are running on Rx rings, so what we actually need to provide is the XDP Tx ring per each Rx ring so that the whole XDP ecosystem is functional, e.g. if the result of XDP prog is XDP_TX then you have the need to access the XDP Tx ring. Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:55:56 -07:00
Marta Plantykow	49d358e0e7	ice: Add XDP Tx to VSI ring stats When XDP Tx program is loaded and packets are sent from interface, VSI statistics are not updated. This patch adds packets sent on Tx XDP ring to VSI ring stats. Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:54:16 -07:00
Marta Plantykow	c8f135c6ee	ice: Change number of XDP TxQ to 0 when destroying rings When XDP Tx rings are destroyed the number of XDP Tx queues is not changing. This patch is changing this number to 0. Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:49:56 -07:00
Evan Swanson	b5c7f857e5	ice: Handle critical FW error during admin queue initialization A race condition between FW and SW can occur between admin queue setup and the first command sent. A link event may occur and FW attempts to notify a non-existent queue. FW will set the critical error bit and disable the queue. When this happens retry queue setup. Signed-off-by: Evan Swanson <evan.swanson@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:48:23 -07:00
Brett Creeley	1960827570	ice: Don't allow VLAN stripping change when pvid set Currently, if the PVID is set in the VLAN handling section of the VSI context the driver still allows VLAN stripping to be enabled/disabled. VLAN stripping should only be modifiable when the PVID is not set. Fix this by preventing VLAN stripping modification when PVID is set. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:46:00 -07:00
Brett Creeley	4f1fe43c92	ice: Add more Rx errors to netdev's rx_error counter Currently we are only including illegal_bytes and rx_crc_errors in the PF netdev's rx_error counter. There are many more causes of Rx errors that the device supports and reports via Ethtool. Accumulate all Rx errors in the PF netdev's rx_error counter. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:44:06 -07:00
Surabhi Boob	68d2707837	ice: Fix for memory leaks and modify ICE_FREE_CQ_BUFS Handle memory leaks during control queue initialization and buffer allocation failures. The macro ICE_FREE_CQ_BUFS is modified to re-use for this fix. Signed-off-by: Surabhi Boob <surabhi.boob@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:32:50 -07:00
Surabhi Boob	1aaef2bc4e	ice: Fix memory leak Handle memory leak on filter management initialization failure. Signed-off-by: Surabhi Boob <surabhi.boob@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:11:29 -07:00
Jesse Brandeburg	5df42c8267	ice: fix MAC write command The manage MAC write command was implemented in an overly complex way that actually didn't work, as it wasn't symmetric to the manage MAC read command, and was feeding bytes out of order to the firmware. Fix the implementation by just using a simple array to represent the MAC address when it is being written via firmware command. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:06:44 -07:00
Paul Greenwalt	bf8987df8a	ice: set VF default LAN address Remove is_zero_ether_add() check when setting the VF default LAN address. This check assumed that the address had been delete and zeroed before calling ice_vc_add_mac_addr(). Now the default LAN address will be set to the last unicast MAC address added by the VF. The default LAN address is reported by the PF via ndo_get_vf_config. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:05:02 -07:00
Jesse Brandeburg	f0cbbb9c6e	ice: remove unused macro The driver had an unused define that can be removed. Found by compiler -Werror=unused-macros check. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:03:40 -07:00
Jesse Brandeburg	22bef5e78f	ice: fix signed vs unsigned comparisons Fix the remaining signed vs unsigned issues, which appear when compiling with -Werror=sign-compare. Many of these are because there is an external interface that is passing an int to us (which we can't change) but that we (rightfully) store and compare against as an unsigned in our data structures. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 17:02:47 -07:00
David S. Miller	2b1a7f741a	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2020-05-22 This series contains updates to virtchnl and the ice driver. Geert Uytterhoeven fixes a data structure alignment issue in the virtchnl structures. Henry adds Flow Director support which allows for the redirection on ntuple rules over six patches. Initially Henry adds the initial infrastructure for Flow Director, and then later adds IPv4 and IPv6 support, as well as being able to display the ntuple rules. Bret add Accelerated Receive Flow Steering (aRFS) support which is used to steer receive flows to a specific queue. Fixes a transmit timeout when the VF link transitions from up/down/up because the transmit and receive queue interrupts are not enabled as part of VF's link up. Fixed an issue when the default VF LAN address is changed and after reset the PF will attempt to add the new MAC, which fails because it already exists. This causes the VF to be disabled completely until it is removed and enabled via sysfs. Anirudh (Ani) makes a fix where the ice driver needs to call set_mac_cfg to enable jumbo frames, so ensure it gets called during initialization and after reset. Fix bad register reads during a register dump in ethtool by removing the bad registers. Paul fixes an issue where the receive Malicious Driver Detection (MDD) auto reset message was not being logged because it occurred after the VF reset. Victor adds a check for compatibility between the Dynamic Device Personalization (DDP) package and the NIC firmware to ensure that everything aligns. Jesse fixes a administrative queue string call with the appropriate error reporting variable. Also fixed the loop variables that are comparing or assigning signed against unsigned values. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-23 16:51:26 -07:00
Jesse Brandeburg	c1e0883012	ice: cleanup unsigned loops Fix loop variables that are comparing or assigning signed against unsigned values, mostly by declaring loop counters as unsigned. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 22:27:31 -07:00
Jesse Brandeburg	9d68a79c3b	ice: fix usage of incorrect variable The driver was using rq_last_status where it should have been using sq_last_status. Fix the string to be using the correct error reporting variable. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 22:26:02 -07:00
Anirudh Venkataramanan	1fba4a8a92	ice: Fix bad register reads The "ethtool -d" handler reads registers in the ice_regs_dump_list array and returns read values back to the userspace. The register offsets PFINT0_ITR* are not valid as per the specification and reading these causes a "unable to handle kernel paging request" bug in the driver. Remove these registers from ice_regs_dump_list. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 22:24:29 -07:00
Victor Raj	b827291958	ice: check for compatibility between DDP package and firmware Require the Dynamic Device Personalization (DDP) file to have the same major version number and the same or older minor number than the firmware version major and minor, respectively. Check the OS and NVM package versions before downloading the package. If the OS package version is not compatible with NVM then return an appropriate error. Split the 32-byte segment name into a 28-byte segment name and a 4-byte Track-ID. Older packages will still work with this change because no package has a name that will take up more than 28 bytes; in this case the Track-ID will be 0. Note that the driver will store the segment name as 32-bytes in the ice_hw structure, in order to normalize the length of the various package name strings that it uses. Also add section ID and structure for the segment metadata section. Signed-off-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 22:22:50 -07:00
Brett Creeley	47ebc7b024	ice: Check if unicast MAC exists before setting VF MAC Currently if a unicast MAC is set via ndo_set_vf_mac, the PF driver will set the VF's dflt_lan_addr.addr once some basic checks have passed. The VF is then reset. During reset the PF driver will attempt to program the VF's MAC from the dflt_lan_addr.addr field. This fails when the MAC already exists on the PF's switch. This is causing the VF to be completely disabled until removing/enabling any VFs via sysfs. Fix this by checking if the unicast MAC exists before triggering a VF reset directly in ndo_set_vf_mac. Also, add a check if the unicast MAC is set to the same value as before and return 0 if that is the case. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 22:20:22 -07:00
Brett Creeley	4dc926d3a5	ice: Fix Tx timeout when link is toggled on a VF's interface Currently if the iavf is loaded and a VF link transitions from up to down to up again a Tx timeout will be triggered. This happens because Tx/Rx queue interrupts are only enabled when receiving the VIRTCHNL_OP_CONFIG_MAP_IRQ message, which happens on reset or initial iavf driver load, but not when bringing link up. This is problematic because they are disabled on the VIRTCHNL_OP_DISABLE_QUEUES message, which is part of bringing a VF's link down. However, they are not enabled on the VIRTCHNL_OP_ENABLE_QUEUES message, which is part of bringing a VF's link up. Fix this by re-enabling the VF's Rx and Tx queue interrupts when they were previously configured. This is done by first checking to make sure the previous value in QINT_[R\|T]QCTL.MSIX_INDX is not 0, which is used to represent the OICR in the VF's interrupt space. If the MSIX_INDX is non-zero then enable the interrupt by setting the QINT_[R\|T]CTL.CAUSE_ENA bit to 1. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 22:10:58 -07:00
Paul Greenwalt	7438a3b094	ice: print Rx MDD auto reset message before VF reset Rx MDD auto reset message was not being logged because logging occurred after the VF reset and the VF MDD data was reinitialized. Log the Rx MDD auto reset message before triggering the VF reset. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 22:07:06 -07:00
Anirudh Venkataramanan	4244910568	ice: Call ice_aq_set_mac_cfg As per the specification, the driver needs to call set_mac_cfg (opcode 0x0603) to be able to exercise jumbo frames. Call the function during initialization and the post reset rebuild flow. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 22:05:25 -07:00
Brett Creeley	28bf26724f	ice: Implement aRFS Enable accelerated Receive Flow Steering (aRFS). It is used to steer Rx flows to a specific queue. This functionality is triggered by the network stack through ndo_rx_flow_steer and requires Flow Director (ntuple on) to function. The fltr_info is used to add/remove/update flow rules in the HW, the fltr_state is used to determine what to do with the filter with respect to HW and/or SW, and the flow_id is used in co-ordination with the network stack. The work for aRFS is split into two paths: the ndo_rx_flow_steer operation and the ice_service_task. The former is where the kernel hands us an Rx SKB among other items to setup aRFS and the latter is where the driver adds/updates/removes filter rules from HW and updates filter state. In the Rx path the following things can happen: 1. New aRFS entries are added to the hash table and the state is set to ICE_ARFS_INACTIVE so the filter can be updated in HW by the ice_service_task path. 2. aRFS entries have their Rx Queue updated if we receive a pre-existing flow_id and the filter state is ICE_ARFS_ACTIVE. The state is set to ICE_ARFS_INACTIVE so the filter can be updated in HW by the ice_service_task path. 3. aRFS entries marked as ICE_ARFS_TODEL are deleted In the ice_service_task path the following things can happen: 1. New aRFS entries marked as ICE_ARFS_INACTIVE are added or updated in HW. and their state is updated to ICE_ARFS_ACTIVE. 2. aRFS entries are deleted from HW and their state is updated to ICE_ARFS_TODEL. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 22:02:34 -07:00
Henry Tieman	83af003951	ice: Restore filters following reset Following a reset, Flow Director filters are cleared from the hardware. Rebuild the filters using the software structures containing the filter rules. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 21:46:51 -07:00
Henry Tieman	2c57ffcb19	ice: Enable flex-bytes support Flex-bytes allows for packet matching based on an offset and value. This is supported via the ethtool user-def option. It is specified by providing an offset followed by a 2 byte match value. Offset is measured from the start of the MAC address. The following restrictions apply to flex-bytes. The specified offset must be an even number and be smaller than 0x1fe. Example usage: ethtool -N eth0 flow-type tcp4 src-ip 192.168.0.55 dst-ip 172.16.0.55 \ src-port 12 dst-port 13 user-def 0x10ffff action 32 Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 21:44:48 -07:00
Henry Tieman	165d80d6ad	ice: Support IPv6 Flow Director filters Extend supported filters to allow for IPv6 filters. Supported fields are: src-ip, dst-ip, src-port, and dst-port Supported flow-types are: tcp6, udp6, sctp6, ip6 Example usage: ethtool -N eth0 flow-type tcp6 src-port 12 dst-port 13 \ src-ip fce0::1:34 dst-ip fce0::1:35 action 32 Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 21:42:20 -07:00
Henry Tieman	cac2a27cd9	ice: Support IPv4 Flow Director filters Support the addition and deletion of IPv4 filters. Supported fields are: src-ip, dst-ip, src-port, and dst-port Supported flow-types are: tcp4, udp4, sctp4, ip4 Example usage: ethtool -N eth0 flow-type tcp4 src-ip 192.168.0.55 dst-ip 172.16.0.55 \ src-port 16 dst-port 12 action 32 Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 21:36:27 -07:00
Henry Tieman	4ab956462f	ice: Support displaying ntuple rules Add functionality for ethtool --show-ntuple, allowing for filters to be displayed when set functionality is added. Add statistics related to Flow Director matches and status. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 21:30:23 -07:00
Henry Tieman	148beb6120	ice: Initialize Flow Director resources Flow Director allows for redirection based on ntuple rules. Rules are programmed using the ethtool set-ntuple interface. Supported actions are redirect to queue and drop. Setup the initial framework to process Flow Director filters. Create and allocate resources to manage and program filters to the hardware. Filters are processed via a sideband interface; a control VSI is created to manage communication and process requests through the sideband. Upon allocation of resources, update the hardware tables to accept perfect filters. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-22 21:26:37 -07:00
David S. Miller	a152b85984	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2020-05-23 The following pull-request contains BPF updates for your net-next tree. We've added 50 non-merge commits during the last 8 day(s) which contain a total of 109 files changed, 2776 insertions(+), 2887 deletions(-). The main changes are: 1) Add a new AF_XDP buffer allocation API to the core in order to help lowering the bar for drivers adopting AF_XDP support. i40e, ice, ixgbe as well as mlx5 have been moved over to the new API and also gained a small improvement in performance, from Björn Töpel and Magnus Karlsson. 2) Add getpeername()/getsockname() attach types for BPF sock_addr programs in order to allow for e.g. reverse translation of load-balancer backend to service address/port tuple from a connected peer, from Daniel Borkmann. 3) Improve the BPF verifier is_branch_taken() logic to evaluate pointers being non-NULL, e.g. if after an initial test another non-NULL test on that pointer follows in a given path, then it can be pruned right away, from John Fastabend. 4) Larger rework of BPF sockmap selftests to make output easier to understand and to reduce overall runtime as well as adding new BPF kTLS selftests that run in combination with sockmap, also from John Fastabend. 5) Batch of misc updates to BPF selftests including fixing up test_align to match verifier output again and moving it under test_progs, allowing bpf_iter selftest to compile on machines with older vmlinux.h, and updating config options for lirc and v6 segment routing helpers, from Stanislav Fomichev, Andrii Nakryiko and Alan Maguire. 6) Conversion of BPF tracing samples outdated internal BPF loader to use libbpf API instead, from Daniel T. Lee. 7) Follow-up to BPF kernel test infrastructure in order to fix a flake in the XDP selftests, from Jesper Dangaard Brouer. 8) Minor improvements to libbpf's internal hashmap implementation, from Ian Rogers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-22 18:30:34 -07:00
Tony Nguyen	5757cc7c8b	ice: Rename build_ctob to ice_build_ctob To make the function easier to identify as being part of the ice driver, prepend ice to the function name. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Bruce Allan	c522d1f686	ice: remove unnecessary backslash Self-explanatory. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Bruce Allan	86a2e00d20	ice: remove unnecessary check The variable status cannot be zero due to a prior check of it; remove this check. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Bruce Allan	92ace4824c	ice: remove unnecessary expression that is always true The else conditional expression is always true due to the if conditional expression; remove it and add a comment to make it obvious still. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Lihong Yang	757976ab16	ice: Fix check for removing/adding mac filters In function ice_set_mac_address, we will remove old dev_addr before adding the new MAC. In the removing and adding process of the MAC, there is no need to return error if the check finds the to-be-removed dev_addr does not exist in the MAC filter list or the to-be-added mac already exists, keep going or return success accordingly. Signed-off-by: Lihong Yang <lihong.yang@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Michal Swiatkowski	1b8f15b64a	ice: refactor filter functions Move filter functions to separate file. Add functions that prepare suitable ice_fltr_info struct depending on the filter type and add this struct to earlier created list: - ice_fltr_add_mac_to_list - ice_fltr_add_vlan_to_list - ice_fltr_add_eth_to_list This functions are used in adding and removing filters. Create wrappers for functions mentioned above that alloc list, add suitable ice_fltr_info to it and call add or remove function. - ice_fltr_prepare_mac - ice_fltr_prepare_mac_and_broadcast - ice_fltr_prepare_vlan - ice_fltr_prepare_eth Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Eric Joyner	857a4f0e9f	ice: Fix resource leak on early exit from function Memory allocated in the ice_add_prof_id_vsig() function wasn't being properly freed if an error occurred inside the for-loop in the function. In particular, 'p' wasn't being freed if an error occurred before it was added to the resource list at the end of the for-loop. Signed-off-by: Eric Joyner <eric.joyner@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Jesse Brandeburg	53bb66983f	ice: cleanup vf_id signedness The vf_id variable is dealt with in the code in inconsistent ways of sign usage, preventing compilation with -Werror=sign-compare. Fix this problem in the code by always treating vf_id as unsigned, since there are no valid values of vf_id that are negative. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Karol Kolacinski	88865fc4bb	ice: Fix casting issues Change min() macros to min_t() which has compare type specified and it helps avoid precision loss. In some cases there was precision loss during calls or assignments. Some fields in structs were unnecessarily large and gave multiple warnings. There were also some minor type differences which are now fixed as well as some cases where a simple cast was needed. Callers were were passing data that is a u16 to ice_sched_cfg_node_bw_alloc() but the function was truncating that to a u8. Fix that by changing the function to take a u16. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Lihong Yang	0fee35774d	ice: Provide more meaningful error message When printing the ice status or AQ error codes, instead of printing out the numerical value, provide the description of the error code. This provides more info about the issue than a number. Signed-off-by: Lihong Yang <lihong.yang@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Anirudh Venkataramanan	de75135b5c	ice: Fix probe/open race condition As soon as the driver registers the PF netdev, userspace utilities like NetworkManager try to bring up the associated interface. When this happens, the driver may not have finished initializing fully, resulting in a bunch of errors in the interface up flow. The driver already has a mechanism to indicate if it's not up yet; by setting the __ICE_DOWN bit in pf->state, but this bit gets cleared too early in the current flow. So clear this bit only when the driver is fully up. Also check for the same bit in the ice_open flow, and return -EBUSY if the bit is set. Also in ice_open, replace references of vsi->back with a local variable. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Dave Ertman	46a316500e	ice: only drop link once when setting pauseparams Currently, the ice driver is setting a PHY configuration, which causes a link drop, and then additionally it calls for a nway_reset, which restarts auto-negotiation on the link, which also causes a link drop. These two link events in such close timing is causing the FW to not be able to generate a link interrupt for the driver to respond to. Remove the unnecessary auto-negotiation restart from the set pauseparams flow. Also remove error path that would have performed an ice_down/ice_up as that is also unnecessary. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Dave Ertman	891540024b	ice: Fix check for contiguous TCs The current implementation for contiguous TC check is assuming that the UPs will be mapped to TCs in a linear progressing fashion. This is obviously not always true. Change the check to allow for various UP2TC mapping configurations. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:04 -07:00
Avinash JD	610ed0e93e	ice: Don't reset and rebuild for Tx timeout on PFC enabled queue When there's a Tx timeout for a queue which belongs to a PFC enabled TC, then it's not because the queue is hung but because PFC is in action. In PFC, peer sends a pause frame for a specified period of time when its buffer threshold is exceeded (due to congestion). Netdev on the other hand checks if ACK is received within a specified time for a TX packet, if not, it'll invoke the tx_timeout routine. Signed-off-by: Avinash JD <avinash.dayanand@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:03 -07:00
Brett Creeley	01b5e89aab	ice: Add VF promiscuous support Implement promiscuous support for VF VSIs. Behaviour of promiscuous support is based on VF trust as well as the, introduced, vf-true-promisc flag. A trusted VF with vf-true-promisc disabled will be the default VSI, which means that all traffic without a matching destination MAC address in the device's internal switch will be forwarded to this VF VSI. A trusted VF with vf-true-promisc enabled will go into "true promiscuous mode". This amounts to the VF receiving all ingress and egress traffic that hits the device's internal switch. An untrusted VF will only receive traffic destined for that VF. The vf-true-promisc-support flag cannot be toggled while any VF is in promiscuous mode. This flag should be set prior to loading the iavf driver or spawning VF(s). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:03 -07:00
Tony Nguyen	a4e82a81f5	ice: Add support for tunnel offloads Create a boost TCAM entry for each tunnel port in order to get a tunnel PTYPE. Update netdev feature flags and implement the appropriate logic to get and set values for hardware offloads. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:03 -07:00
Jacob Keller	f45a645fa6	ice: report netlist version in .info_get The flash memory for the ice hardware contains a block of information used for link management called the Netlist module. As this essentially represents another section of firmware, add its version information to the output of the driver's .info_get handler. This includes both a version and the first few bytes of a hash of the module contents. fw.netlist -> the version information extracted from the netlist module fw.netlist.build-> first 4 bytes of the hash of the contents, similar to fw.mgmt.build Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-21 22:10:03 -07:00
Björn Töpel	175fc43067	ice, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL APIs. v4->v5: Fixed "warning: Excess function parameter 'alloc' description in 'ice_alloc_rx_bufs_zc'" and "warning: Excess function parameter 'xdp' description in 'ice_construct_skb_zc'". (Jakub) Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20200520192103.355233-10-bjorn.topel@gmail.com	2020-05-21 17:31:26 -07:00
Magnus Karlsson	a71506a4fd	xsk: Move driver interface to xdp_sock_drv.h Move the AF_XDP zero-copy driver interface to its own include file called xdp_sock_drv.h. This, hopefully, will make it more clear for NIC driver implementors to know what functions to use for zero-copy support. v4->v5: Fix -Wmissing-prototypes by include header file. (Jakub) Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-4-bjorn.topel@gmail.com	2020-05-21 17:31:26 -07:00
Jesper Dangaard Brouer	2a637c5b1a	xdp: For Intel AF_XDP drivers add XDP frame_sz Intel drivers implement native AF_XDP zerocopy in separate C-files, that have its own invocation of bpf_prog_run_xdp(). The setup of xdp_buff is also handled in separately from normal code path. This patch update XDP frame_sz for AF_XDP zerocopy drivers i40e, ice and ixgbe, as the code changes needed are very similar. Introduce a helper function xsk_umem_xdp_frame_sz() for calculating frame size. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Björn Töpel <bjorn.topel@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/158945347511.97035.8536753731329475655.stgit@firesoul	2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer	d4ecdbf7aa	ice: Add XDP frame size to driver This driver uses different memory models depending on PAGE_SIZE at compile time. For PAGE_SIZE 4K it uses page splitting, meaning for normal MTU frame size is 2048 bytes (and headroom 192 bytes). For larger MTUs the driver still use page splitting, by allocating order-1 pages (8192 bytes) for RX frames. For PAGE_SIZE larger than 4K, driver instead advance its rx_buffer->page_offset with the frame size "truesize". For XDP frame size calculations, this mean that in PAGE_SIZE larger than 4K mode the frame_sz change on a per packet basis. For the page split 4K PAGE_SIZE mode, xdp.frame_sz is more constant and can be updated once outside the main NAPI loop. The default setting in the driver uses build_skb(), which provides the necessary headroom and tailroom for XDP-redirect in RX-frame (in both modes). There is one complication, which is legacy-rx mode (configurable via ethtool priv-flags). There are zero headroom in this mode, which is a requirement for XDP-redirect to work. The conversion to xdp_frame (convert_to_xdp_frame) will detect this insufficient space, and xdp_do_redirect() call will fail. This is deemed acceptable, as it allows other XDP actions to still work in legacy-mode. In legacy-mode + larger PAGE_SIZE due to lacking tailroom, we also accept that xdp_adjust_tail shrink doesn't work. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Alexander Duyck <alexander.duyck@gmail.com> Link: https://lore.kernel.org/bpf/158945347002.97035.328088795813704587.stgit@firesoul	2020-05-14 21:21:56 -07:00
Wei Yongjun	f8d530ac29	ice: Fix error return code in ice_add_prof() Fix to return a error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: `31ad4e4ee1` ("ice: Allocate flow profile") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-04-30 20:29:09 -07:00
Linus Torvalds	86f26a77cb	pci-v5.7-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl6GTQMUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vy3PhAAmqpYBRobOsG8QbmKDjoJEFtkqdvD z6+4zf/R+hF11RyXjMDwihIe8d+tkQ4eAaYu6Oh5PrTyanz0G0PgeCrivZeytULk thqQIWzDQMVA5vN/2/Vy8s5s+3HzP8z/MZOFScJ7+xA1MndXptPRTNmFUbjx+GAv x8/pTp0u9AF6m7itX65DxXvwkzjWamt+Ar4Yx2IcuKAU/M5RtfuZO3PpDnqn7/wk JFlkRoYeFB6qNnnkPdeyPHl9dALhuhzgdTyklQEnKVW3nf3xThYDhcEwdh6kBQgl 0dH8lL5LXy7PKGN8RES4wB0Vqndw/HlsCF5O4wkkfItbnbJxGJtS139e5973m0ud sgWvF4yJAT2jCKhIeNz34sePQJMyWALhv0XzZCsJ0YeGHsrV1jrHELkwUT1+eIsT 3UV0iZ6aL06zQJDyKUbbIcQzEQ/wwBC+x9VgsyL54K1quCQZ1N1Nl/dvrb4cRG9m m9EhJK/brDf4c0uFlOmMTSxV1t5J+z6ZSQnh1ShD/o5yBsxqN6q5brDT6LEs+jbM LsIkA18jJOd4OyiDs98YiFKvIfFQbQ0LEBQpJwhF0snvfBFMMbUYN/T/NYneWON/ F0TpkFoP7PXDuq55iNaLdnObfzrpC9kdzUyWvePUvjxIl55bkf+/qtUny+H48t4L dNggvW052d7BHes= =deWu -----END PGP SIGNATURE----- Merge tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Revert sysfs "rescan" renames that broke apps (Kelsey Skunberg) - Add more 32 GT/s link speed decoding and improve the implementation (Yicong Yang) Resource management: - Add support for sizing programmable host bridge apertures and fix a related alpha Nautilus regression (Ivan Kokshaysky) Interrupts: - Add boot interrupt quirk mechanism for Xeon chipsets and document boot interrupts (Sean V Kelley) PCIe native device hotplug: - When possible, disable in-band presence detect and use PDS (Alexandru Gagniuc) - Add DMI table for devices that don't use in-band presence detection but don't advertise that correctly (Stuart Hayes) - Fix hang when powering slots up/down via sysfs (Lukas Wunner) - Fix an MSI interrupt race (Stuart Hayes) Virtualization: - Add ACS quirks for Zhaoxin devices (Raymond Pang) Error handling: - Add Error Disconnect Recover (EDR) support so firmware can report devices disconnected via DPC and we can try to recover (Kuppuswamy Sathyanarayanan) Peer-to-peer DMA: - Add Intel Sky Lake-E Root Ports B, C, D to the whitelist (Andrew Maier) ASPM: - Reduce severity of common clock config message (Chris Packham) - Clear the correct bits when enabling L1 substates, so we don't go to the wrong state (Yicong Yang) Endpoint framework: - Replace EPF linkup ops with notifier call chain and improve locking (Kishon Vijay Abraham I) - Fix concurrent memory allocation in OB address region (Kishon Vijay Abraham I) - Move PF function number assignment to EPC core to support multiple function creation methods (Kishon Vijay Abraham I) - Fix issue with clearing configfs "start" entry (Kunihiko Hayashi) - Fix issue with endpoint MSI-X ignoring BAR Indicator and Table Offset (Kishon Vijay Abraham I) - Add support for testing DMA transfers (Kishon Vijay Abraham I) - Add support for testing > 10 endpoint devices (Kishon Vijay Abraham I) - Add support for tests to clear IRQ (Kishon Vijay Abraham I) - Add common DT schema for endpoint controllers (Kishon Vijay Abraham I) Amlogic Meson PCIe controller driver: - Add DT bindings for AXG PCIe PHY, shared MIPI/PCIe analog PHY (Remi Pommarel) - Add Amlogic AXG PCIe PHY, AXG MIPI/PCIe analog PHY drivers (Remi Pommarel) Cadence PCIe controller driver: - Add Root Complex/Endpoint DT schema for Cadence PCIe (Kishon Vijay Abraham I) Intel VMD host bridge driver: - Add two VMD Device IDs that require bus restriction mode (Sushma Kalakota) Mobiveil PCIe controller driver: - Refactor and modularize mobiveil driver (Hou Zhiqiang) - Add support for Mobiveil GPEX Gen4 host (Hou Zhiqiang) Microsoft Hyper-V host bridge driver: - Add support for Hyper-V PCI protocol version 1.3 and PCI_BUS_RELATIONS2 (Long Li) - Refactor to prepare for virtual PCI on non-x86 architectures (Boqun Feng) - Fix memory leak in hv_pci_probe()'s error path (Dexuan Cui) NVIDIA Tegra PCIe controller driver: - Use pci_parse_request_of_pci_ranges() (Rob Herring) - Add support for endpoint mode and related DT updates (Vidya Sagar) - Reduce -EPROBE_DEFER error message log level (Thierry Reding) Qualcomm PCIe controller driver: - Restrict class fixup to specific Qualcomm devices (Bjorn Andersson) Synopsys DesignWare PCIe controller driver: - Refactor core initialization code for endpoint mode (Vidya Sagar) - Fix endpoint MSI-X to use correct table address (Kishon Vijay Abraham I) TI DRA7xx PCIe controller driver: - Fix MSI IRQ handling (Vignesh Raghavendra) TI Keystone PCIe controller driver: - Allow AM654 endpoint to raise MSI-X interrupt (Kishon Vijay Abraham I) Miscellaneous: - Quirk ASMedia XHCI USB to avoid "PME# from D0" defect (Kai-Heng Feng) - Use ioremap(), not phys_to_virt(), for platform ROM to fix video ROM mapping with CONFIG_HIGHMEM (Mikel Rychliski)" * tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (96 commits) misc: pci_endpoint_test: remove duplicate macro PCI_ENDPOINT_TEST_STATUS PCI: tegra: Print -EPROBE_DEFER error message at debug level misc: pci_endpoint_test: Use full pci-endpoint-test name in request_irq() misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices tools: PCI: Add 'e' to clear IRQ misc: pci_endpoint_test: Add ioctl to clear IRQ misc: pci_endpoint_test: Avoid using module parameter to determine irqtype PCI: keystone: Allow AM654 PCIe Endpoint to raise MSI-X interrupt PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments misc: pci_endpoint_test: Add support to get DMA option from userspace tools: PCI: Add 'd' command line option to support DMA misc: pci_endpoint_test: Use streaming DMA APIs for buffer allocation PCI: endpoint: functions/pci-epf-test: Print throughput information PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data PCI: pciehp: Fix MSI interrupt race PCI: pciehp: Fix indefinite wait on sysfs requests PCI: endpoint: Fix clearing start entry in configfs PCI: tegra: Add support for PCIe endpoint mode in Tegra194 PCI: sysfs: Revert "rescan" file renames ...	2020-04-03 14:25:02 -07:00
Kuppuswamy Sathyanarayanan	894020fdd8	PCI/AER: Rationalize error status register clearing The AER interfaces to clear error status registers were a confusing mess: - pci_cleanup_aer_uncorrect_error_status() cleared non-fatal errors from the Uncorrectable Error Status register. - pci_aer_clear_fatal_status() cleared fatal errors from the Uncorrectable Error Status register. - pci_cleanup_aer_error_status_regs() cleared the Root Error Status register (for Root Ports), the Uncorrectable Error Status register, and the Correctable Error Status register. Rename them to make them consistent: From To ---------------------------------------- ------------------------------- pci_cleanup_aer_uncorrect_error_status() pci_aer_clear_nonfatal_status() pci_aer_clear_fatal_status() pci_aer_clear_fatal_status() pci_cleanup_aer_error_status_regs() pci_aer_clear_status() Since pci_cleanup_aer_error_status_regs() (renamed to pci_aer_clear_status()) is only used within drivers/pci/, move the declaration from <linux/aer.h> to drivers/pci/pci.h. [bhelgaas: commit log, add renames] Link: https://lore.kernel.org/r/d1310a75dc3d28f7e8da4e99c45fbd3e60fe238e.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2020-03-28 13:19:05 -05:00
Jacob Keller	dce730f178	ice: add a devlink region for dumping NVM contents Add a devlink region for exposing the device's Non Volatime Memory flash contents. Support the recently added .snapshot operation, enabling userspace to request a snapshot of the NVM contents via DEVLINK_CMD_REGION_NEW. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 19:39:26 -07:00
Jacob Keller	e961b679fb	ice: add board identifier info to devlink .info_get Export a unique board identifier using "board.id" for devlink's .info_get command. Obtain this by reading the NVM for the PBA identification string. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-21 01:02:19 -07:00
Jacob Keller	ff2e5c700e	ice: add basic handler for devlink .info_get The devlink .info_get callback allows the driver to report detailed version information. The following devlink versions are reported with this initial implementation: "fw.mgmt" -> The version of the firmware that controls PHY, link, etc "fw.mgmt.api" -> API version of interface exposed over the AdminQ "fw.mgmt.build" -> Unique build id of the source for the management fw "fw.undi" -> Version of the Option ROM containing the UEFI driver "fw.psid.api" -> Version of the NVM image format. "fw.bundle_id" -> Unique identifier for the combined flash image. "fw.app.name" -> The name of the active DDP package. "fw.app" -> The version of the active DDP package. With this, devlink dev info can report at least as much information as is reported by ETHTOOL_GDRVINFO. Compare the output from ethtool vs from devlink: $ ethtool -i ens785s0 driver: ice version: 0.8.1-k firmware-version: 0.80 0x80002ec0 1.2581.0 expansion-rom-version: bus-info: 0000:3b:00.0 supports-statistics: yes supports-test: yes supports-eeprom-access: yes supports-register-dump: yes supports-priv-flags: yes $ devlink dev info pci/0000:3b:00.0 pci/0000:3b:00.0: driver ice serial number 00-01-ab-ff-ff-ca-05-68 versions: running: fw.mgmt 2.1.7 fw.mgmt.api 1.5 fw.mgmt.build 0x305d955f fw.undi 1.2581.0 fw.psid.api 0.80 fw.bundle_id 0x80002ec0 fw.app.name ICE OS Default Package fw.app 1.3.1.0 More pieces of information can be displayed, each version is kept separate instead of munged together, and each version has an identifier which comes with associated documentation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-21 01:00:32 -07:00
Jacob Keller	1adf7ead82	ice: enable initial devlink support Begin implementing support for the devlink interface with the ice driver. The pf structure is currently memory managed through devres, via a devm_alloc. To mimic this behavior, after allocating the devlink pointer, use devm_add_action to add a teardown action for releasing the devlink memory on exit. The ice hardware is a multi-function PCIe device. Thus, each physical function will get its own devlink instance. This means that each function will be treated independently, with its own parameters and configuration. This is done because the ice driver loads a separate instance for each function. Due to this, the implementation does not enable devlink to manage device-wide resources or configuration, as each physical function will be treated independently. This is done for simplicity, as managing a devlink instance across multiple driver instances would significantly increase the complexity for minimal gain. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-21 00:55:42 -07:00
Jesse Brandeburg	84a2479822	ice: implement full NVM read from ETHTOOL_GEEPROM The current implementation of .get_eeprom only enables reading from the Shadow RAM portion of the NVM contents. Implement support for reading the entire flash contents instead of only the initial portion contained in the Shadow RAM. A complete dump can take several seconds, but the ETHTOOL_GEEPROM ioctl is capable of reading only a limited portion at a time by specifying the offset and length to read. In order to perform the reads directly, several functions are made non static. Additionally, the unused ice_read_sr_buf_aq and ice_read_sr_buf functions are removed. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-21 00:31:17 -07:00
Jacob Keller	81f07491e2	ice: discover and store size of available flash When reading from the NVM using a flat address, it is useful to know the upper bound on the size of the flash contents. This value is not stored within the NVM. We can determine the size by performing a bisection between upper and lower bounds. It is known that the size cannot exceed 16 MB (offset of 0xFFFFFF). Use a while loop to bisect the upper and lower bounds by reading one byte at a time. On a failed read, lower the maximum bound. On a successful read, increase the lower bound. Save this as the flash_size in the ice_nvm_info structure that contains data related to the NVM. The size will be used in a future patch for implementing full NVM read via ethtool's GEEPROM command. The maximum possible size for the flash is bounded by the size limit for the NVM AdminQ commands. Add a new macro, ICE_AQC_NVM_MAX_OFFSET, which can be used to represent this upper bound. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-21 00:29:10 -07:00
Jacob Keller	d4e874448e	ice: store NVM version info in extracted format The NVM version and Option ROM version information is stored within the struct ice_nvm_ver_info structure. The data for the NVM is stored as a 2byte value with the major and minor versions each using one byte from the field. The Option ROM is stored as a 4byte value that contains a major, build, and patch number. Modify the code to immediately extract the version values and store them in a new struct ice_orom_info. Remove the now unnecessary ice_get_nvm_version function. Update ice_ethtool.c to use the new fields directly from the structured data. This reduces complexity of the code that prints these versions in ice_ethtool.c Update the macro definitions and variable names to use the term "orom" instead of "oem" for the Option ROM version. This helps increase the clarity of the Option ROM version code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-21 00:28:18 -07:00
Jacob Keller	e94509906d	ice: create function to read a section of the NVM and Shadow RAM The NVM contents are read via firmware by using the ice_aq_read_nvm function. This function has a couple of limits: 1) The AdminQ commands can only take buffers sized up to 4Kb. Thus, any larger read must be split into multiple reads. 2) when reading from the Shadow RAM, reads must not cross sector boundaries. The sectors are also 4Kb in size. Implement the ice_read_flat_nvm function to read portions of the NVM by flat offset. That is, to read using offsets from the start of the NVM rather than from a specific module. This function will be able to read both from the NVM and from the Shadow RAM. For simplicity NVM reads will always be broken up to not cross 4Kb page boundaries, even though this is not required unless reading from the Shadow RAM. Use this new function as the implementation of ice_read_sr_word_aq. The ice_read_sr_buf_aq function is not modified here. This is because a following change will remove the only caller of that function in favor of directly using ice_read_flat_nvm. Thus, there is little benefit to changing it now only to remove it momentarily. At the same time, the ice_read_sr_aq function will also be removed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-21 00:24:56 -07:00
Jacob Keller	2efefb56f9	ice: use __le16 types for explicitly Little Endian values The ice_read_sr_aq function returns words in the Little Endian format. Remove the need for __force and typecasting by using a local variable in the ice_read_sr_word_aq function. Additionally clarify explicitly that the ice_read_sr_aq function takes storage for __le16 values instead of using u16. Being explicit about the endianness of this data helps when using tools like sparse to catch endian-related issues. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-20 23:38:40 -07:00
Jacob Keller	dab02de867	ice: fix incorrect size description of ice_get_nvm_version The function comment for ice_get_nvm_version indicated that the ver_hi and ver_lo values were 16 bits. In fact, they are only uint8_t values, meaning that they have a maximum size of 8 bits. Fix the comment to match the correct size. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:11:02 -07:00
Bruce Allan	6dae8aa0ed	ice: use variable name more descriptive than type The variable name 'type' is not very descriptive. Replace instances of those with a variable name that is more descriptive or replace it if not needed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:10:58 -07:00
Anirudh Venkataramanan	dced8ad321	ice: Use EOPNOTSUPP instead of ENOTSUPP Using ENOTSUPP almost always results in some bizarre error message to be printed in userspace. This is likely because ENOTSUPP was defined for the NFS protocol (as per a comment in include/linux/errno.h). Use EOPNOTSUPP instead. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:10:53 -07:00
Tony Nguyen	93ff48589a	ice: Fix format specifier Commit `ed5a3f664c` ("ice: Removing hung_queue variable to use txqueue function parameter") began utilizing the txqueue variable over the hung_queue variable. hung_queue was an int where txqueue is an unsigned int. Update the format specifiers to reflect the new type. Fixes: `ed5a3f664c` ("ice: Removing hung_queue variable to use txqueue function parameter") Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:10:47 -07:00
Bruce Allan	c88ba3fb33	ice: fix use of deprecated strlcpy() checkpatch complains "CHECK:DEPRECATED_API: Deprecated use of 'strlcpy', prefer 'stracpy or strscpy' instead"; use strscpy. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:10:42 -07:00
Lukasz Czapnik	c8a1071df9	ice: Increase mailbox receive queue length to maximum Currently the PF's mailbox receive queue is only 512 entries. This fine, but considering that all VF's mailbox send queues funnel into the PF's single mailbox receive queue, let's increase it to the maximum size. This will help prevent any possible bottleneck/slowdown occurring from the PF's mailbox receive queue being full. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:10:36 -07:00
Brett Creeley	345be791ab	ice: Correct setting VLAN pruning VLAN pruning is not always being set correctly due to a previous change that set Tx antispoof off. ice_vsi_is_vlan_pruning_ena() currently checks for both Tx antispoof and Rx pruning. The expectation for this function is to only check Rx pruning so fix the check. Fixes: `cd6d6b8331` ("ice: Fix VF spoofchk") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:10:32 -07:00
Dave Ertman	35e935617e	ice: renegotiate link after FW DCB on When switching from SW DCB to FW DCB it is necessary to renegotiate DCBx so that the FW agent can have up to date information about the DCB settings of the link partner. Perform an autoneg restart on the link when activating FW DCB. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:10:24 -07:00
Avinash JD	1f454e06d9	ice: Fix corner case when switching from IEEE to CEE While testing DCB for a corner case in which mode is switched from IEEE to CEE and pfc_ena bitmask unchanged then DCBX mode doesn't get updated. This is happening because the function ice_dcb_get_mode() is called in a "no change detected block" instead of "change detected block". Signed-off-by: Avinash JD <avinash.dayanand@intel.com> Signed-off-by: Scott Register <scottx.register@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:10:19 -07:00
Brett Creeley	111820b051	ice: Display Link detected via Ethtool in safe mode Currently the "Link detected" field is not shown when the device goes into safe mode. This is because the safe mode Ethtool ops does not set the get_link function. Fix this by setting the safe mode Ethtool op get_link function. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:10:15 -07:00
Brett Creeley	f844d5212c	ice: Fix removing driver while bare-metal VFs pass traffic Currently, if there are bare-metal VFs passing traffic and the ice driver is removed, there is a possibility of VFs triggering a Tx timeout right before iavf_remove(). This is causing iavf_close() to not be called because there is a check in the beginning of iavf_remove() that bails out early if (adapter->state < IAVF_DOWN_PENDING). This makes it so some resources do not get cleaned up. Specifically, free_irq() is never called for data interrupts, which results in the following line of code to trigger: pci_disable_msix() free_msi_irqs() ... BUG_ON(irq_has_action(entry->irq + i)); ... To prevent the Tx timeout from occurring on the VF during driver unload for ice and the iavf there are a few changes that are needed. [1] Don't disable all active VF Tx/Rx queues prior to calling pci_disable_sriov. [2] Call ice_free_vfs() before disabling the service task. [3] Disable VF resets when the ice driver is being unloaded by setting the pf->state flag __ICE_VF_RESETS_DISABLED. Changing [1] and [2] allow each VF driver's remove flow to successfully send VIRTCHNL requests, which includes queue disable. This prevents unexpected Tx timeouts because the PF driver is no longer forcefully disabling queues. Due to [1] and [2] there is a possibility that the PF driver will get a VFLR or reset request over VIRTCHNL from a VF during PF driver unload. Prevent that by doing [3]. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:10:02 -07:00
Brett Creeley	46c276cebf	ice: Improve clarity of prints and variables Currently when the device runs out of MSI-X interrupts a cryptic and unhelpful message is printed. This will cause confusion when hitting this case. Fix this by clearing up the error message for both SR-IOV and non SR-IOV use cases. Also, make a few minor changes to increase clarity of variables. 1. Change per VF MSI-X and queue pair variables in the PF structure. 2. Use ICE_NONQ_VECS_VF when determining pf->num_msix_per_vf instead of the magic number "1". This vector is reserved for the OICR. All of the resource tracking functions were moved to avoid adding any forward declaration function prototypes. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:09:52 -07:00
Mitch Williams	0ca469fbc3	ice: allow bigger VFs Unlike the XL710 series, 800-series hardware can allocate more than 4 MSI-X vectors per VF. This patch enables that functionality. We dynamically allocate vectors and queues depending on how many VFs are enabled. Allocating the maximum number of VFs replicates XL710 behavior with 4 queues and 4 vectors. But allocating a smaller number of VFs will give you 16 queues and 16 vectors. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-10 13:09:52 -07:00
Jeff Kirsher	f3beaf246f	ice: Cleanup unneeded parenthesis Sergei Shtylyov pointed out that two instances of parenthesis are not needed, so remove them. Suggested-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>	2020-03-10 13:09:51 -07:00
Jacob Keller	ceb2f00707	ice: Use pci_get_dsn() Replace the open-coded implementation for reading the PCIe DSN with pci_get_dsn(). The pci_get_dsn() function will perform two pci_read_config_dword calls to read the lower and upper config dwords. It bitwise ORs them into a u64 value. Instead of using put_unaligned_le32 to convert the value to LE32 format, just use the %016llX printf specifier. This will print the u64 correct, putting the most significant byte of the value first. Since pci_get_dsn() correctly orders the two dwords into a u64, this should produce equivalent results in less code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-05 17:36:24 -08:00
Jakub Kicinski	4a80a18338	ice: let core reject the unsupported coalescing parameters Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver correctly rejects all unsupported parameters. As a side effect of these changes the info message about the bad parameter will no longer be printed. We also always reject the tx_coalesce_usecs_high param, even if the target queue pair does not have a TX queue. Error code changes from EINVAL to EOPNOTSUPP. v2: allow adaptive TX v3: adjust commit message for new error code and member name Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-05 12:12:35 -08:00
David S. Miller	e65ee2fb54	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflict resolution of ice_virtchnl_pf.c based upon work by Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-21 13:39:34 -08:00
Bruce Allan	2fbfa9668b	ice: fix define for E822 backplane device This product's name has changed; update the macro identifier accordingly. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 13:39:33 -08:00
Bruce Allan	e36aeec0f4	ice: add support for E823 devices Add E823 device ids and convert conditional expressions to a more appropriate switch statement. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 13:37:38 -08:00
Bruce Allan	195fb97766	ice: add additional E810 device id Add support for device id 0x159b. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 13:28:58 -08:00
Jesse Brandeburg	af23635a53	ice: add backslash-n to strings There were several strings found without line feeds, fix them by adding a line feed, as is typical. Without this lotsofmessagescanbejumbledtogether. This patch has known checkpatch warnings from long lines for the NL_* messages, because checkpatch doesn't know how to ignore them. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 13:26:45 -08:00
Jacob Keller	7124507291	ice: increase PF reset wait timeout to 300 milliseconds Increase the maximum time that the driver will wait for a PF reset from 200 milliseconds to 300 milliseconds, to account for possibility of a slightly longer than expected PF reset. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 13:14:37 -08:00
Krzysztof Kazimierczak	5fa23e0b23	ice: Support XDP UMEM wake up mechanism Add support for a new AF_XDP feature that has already been introduced in upstreamed Intel NIC drivers. If a user space application signals that it might sleep using the new bind flag XDP_USE_NEED_WAKEUP, the driver will then set this flag if it has no more buffers on the NIC Rx ring and yield to the application. For Tx, it will set the flag if it has no outstanding Tx completion interrupts and return to the application. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 13:12:21 -08:00
Dave Ertman	31c5f7f3f4	ice: SW DCB, report correct max TC value lldpad is using the value reported in the DCB config for max_tc as the max allowed number of TCs, not the current max. ICE driver was reporting it as current maximum TC. Change DCB_NL function to report maximum TC allowed by this device. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 13:09:20 -08:00
Avinash Dayanand	27d9be98ed	ice: Report correct DCB mode Add code to detect if DCB is in IEEE or CEE mode. Without this the code will always report as IEEE mode which is incorrect and confuses the user. Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com> Signed-off-by: Scott Register <scottx.register@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 13:06:58 -08:00
Avinash JD	c8608b5071	ice: Add DCBNL ops required to configure ETS in CEE for SW DCB Couple of DCBNL ops are required for configuring ETS in SW DCB CEE mode. If these functions are not added, it'll break the CEE functionality. Signed-off-by: Avinash JD <avinash.dayanand@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 13:04:41 -08:00
Brett Creeley	36be2baa09	ice: Always clear the QRXFLXP_CNTXT register for VF Rx queues Currently when the PF reduces its number of channels via ethtool and then VFs are created there may be stale data for some of the Rx queues belonging to VFs. This happens when a VF reuses an Rx queue that was previously used by the PF. Specifically, the QRXFLXP_CNTXT register will have incorrect values. Fix this by always clearing the relevant values in the QRXFLXP_CNTXT register for VF queues. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 13:01:51 -08:00
Dan Nowlin	a6892c96fc	ice: Fix for TCAM entry management Order intermediate VSIG list correct in order to correctly match existing VSIG lists. When overriding pre-existing TCAM entries, properly delete the existing entry and remove it from the change/update list. Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 12:59:37 -08:00
Paul Greenwalt	9d5c5a5290	ice: update malicious driver detection event handling Update the PF VFs MDD event message to rate limit once per second and report the total number Rx\|Tx event count. Add support to print pending MDD events that occur during the rate limit. The use of net_ratelimit did not allow for per VF Rx\|Tx granularity. Additional PF MDD log messages are guarded by netif_msg_[rx\|tx]_err(). Since VF RX MDD events disable the queue, add ethtool private flag mdd-auto-reset-vf to configure VF reset to re-enable the queue. Disable anti-spoof detection interrupt to prevent spurious events during a function reset. To avoid race condition do not make PF MDD register reads conditional on global MDD result. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 12:56:34 -08:00
Avinash Dayanand	a29a912d44	ice: Validate config for SW DCB map Validate the inputs for SW DCB config received either via lldptool or pcap file. And don't apply DCB for bad bandwidth inputs. Without this patch, any config having bad inputs will cause the loss of link making PF unusable even after driver reload. Recoverable only via system reboot. Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 12:12:27 -08:00
Brett Creeley	c54d209c78	ice: Wait for VF to be reset/ready before configuration The configuration/command below is failing when the VF in the xml file is already bound to the host iavf driver. pci_0000_af_0_0.xml: <interface type='hostdev' managed='yes'> <source> <address type='pci' domain='0x0000' bus='0xaf' slot='0x0' function='0x0'/> </source> <mac address='00🇩🇪ad:00:11:01'/> </interface> > virsh attach-device domain_name pci_0000_af_0_0.xml error: Failed to attach device from pci_0000_af_0_0.xml error: Cannot set interface MAC/vlanid to 00🇩🇪ad:00:11:01/0 for ifname ens1f1 vf 0: Device or resource busy This is failing because the VF has not been completely removed/reset after being unbound (via the virsh command above) from the host iavf driver and ice_set_vf_mac() checks if the VF is disabled before waiting for the reset to finish. Fix this by waiting for the VF remove/reset process to happen before checking if the VF is disabled. Also, since many functions for VF administration on the PF were more or less calling the same 3 functions (ice_wait_on_vf_reset(), ice_is_vf_disabled(), and ice_check_vf_init()) move these into the helper function ice_check_vf_ready_for_cfg(). Then call this function in any flow that attempts to configure/query a VF from the PF. Lastly, increase the maximum wait time in ice_wait_on_vf_reset() to 800ms, and modify/add the #define(s) that determine the wait time. This was done for robustness because in rare/stress cases VF removal can take a max of ~800ms and previously the wait was a max of ~300ms. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 11:50:41 -08:00
Michal Swiatkowski	8a55c08d3b	ice: Don't tell the OS that link is going down Remove code that tell the OS that link is going down when user change flow control via ethtool. When link is up it isn't certain that link goes down after 0x0605 aq command. If link doesn't go down, OS thinks that link is down, but physical link is up. To reset this state user have to take interface down and up. If link goes down after 0x0605 command, FW send information about that and after that driver tells the OS that the link goes down. So this code in ethtool is unnecessary. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 11:50:41 -08:00
Brett Creeley	840f8ad0aa	ice: Don't reject odd values of usecs set by user Currently if a user sets an odd [tx\|rx]-usecs value through ethtool, the request is denied because the hardware is set to have an ITR granularity of 2us. This caused poor customer experience. Fix this by aligning to a register allowed value, which results in rounding down. Also, print a once per ring container type message to be clear about our intentions. Also, change the ITR_TO_REG define to be the bitwise and of the ITR setting and the ICE_ITR_MASK. This makes the purpose of ITR_TO_REG more obvious. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-19 11:50:41 -08:00
Bruce Allan	fb0c5b05c1	ice: use true/false for bool types Subject says it all. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 18:37:56 -08:00
Bruce Allan	644f40ea0c	ice: add function argument description to function header comment Commit `0290bd291c` ("netdev: pass the stuck queue to the timeout handler") introduced a new argument to the function but missed adding the description of the argument to the function header comment. Add it now. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 18:34:33 -08:00
Bruce Allan	e0708aa8a5	ice: use proper format for function pointer as a function parameter Compiling with gcc-9.2.1 with W=1 points out warnings about the improper function parameter list. Fix it. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 17:54:20 -08:00
Bruce Allan	4e83fc934e	ice: replace "fallthrough" comments with fallthrough reserved word "fallthrough" comments are used in switch case statements to explicitly indicate the code is intended to fall through to the following statement. Different variants of "fallthough" are acceptable, e.g. "fall through", "fallthrough", "Fall-through". The GCC compiler has an optional warning (-Wimplicit-fallthrough[=n]) to warn when such a comment is not present; the default version of which is enabled when compiling the Linux kernel. There have been recent discussions in kernel mailing lists regarding replacing non-standardized "fallthrough" comments with the pseudo-reserved word 'fallthrough' which will be defined as __attribute__ ((fallthrough)) for versions of gcc that support it (i.e. gcc 7 and newer) or as a nop for versions that do not. Replace "fallthrough" comments with fallthrough reserved word. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 17:03:12 -08:00
Bruce Allan	752eee0678	ice: remove unnecessary fallthrough comments Fallthrough comments are used to explicitly indicate the code is intended to flow from one case statement to the next in a switch statement rather than break out of the switch statement. They are only needed when a case has one or more statements to execute before falling through to the next case, not when there is a list of cases for which the same statement(s) should be executed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:56:48 -08:00
Brett Creeley	24e2e2a0b8	ice: Fix virtchnl_queue_select bitmap validation Currently in ice_vc_ena_qs_msg() we are incorrectly validating the virtchnl queue select bitmaps. The virtchnl_queue_select rx_queues and tx_queue bitmap is being compared against ICE_MAX_BASE_QS_PER_VF, but the problem is that these bitmaps can have a value greater than ICE_MAX_BASE_QS_PER_VF. Fix this by comparing the bitmaps against BIT(ICE_MAX_BASE_QS_PER_VF). Also, add the function ice_vc_validate_vqs_bitmaps() that checks to see if both virtchnl_queue_select bitmaps are empty along with checking that the bitmaps only have valid bits set. This function can then be used in both the queue enable and disable flows. Arkady Gilinksky's patch on the intel-wired-lan mailing list ("i40e/iavf: Fix msg interface between VF and PF") made me aware of this issue. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:54:02 -08:00
Brett Creeley	e1fe692680	ice: Fix and refactor Rx queue disable for VFs Currently when a VF driver sends the PF a request to disable Rx queues we will disable them one at a time, even if the VF driver sent us a batch of queues to disable. This is causing issues where the Rx queue disable times out with LFC enabled. This can be improved by detecting when the VF is trying to disable all of its queues. Also remove the variable num_qs_ena from the ice_vf structure as it was only used to see if there were no Rx and no Tx queues active. Instead add a function that checks if both the vf->rxq_ena and vf->txq_ena bitmaps are empty. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:49:44 -08:00
Brett Creeley	2309ae385a	ice: Handle LAN overflow event for VF queues Currently we are not handling LAN overflow events. There can be cases where LAN overflow events occur on VF queues, especially with Link Flow Control (LFC) enabled on the controlling PF. In order to recover from the LAN overflow event caused by a VF we need to determine if the queue belongs to a VF and reset that VF accordingly. The struct ice_aqc_event_lan_overflow returns a copy of the GLDCB_RTCTQ register, which tells us what the queue index is in the global/device space. The global queue index needs to first be converted to a PF space queue index and then it can be used to find if a VF owns it. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:47:20 -08:00
Brett Creeley	39066dc549	ice: Fix implicit queue mapping mode in ice_vsi_get_qs Currently in ice_vsi_get_qs() we set the mapping_mode for Tx and Rx to vsi->[tx\|rx]_mapping_mode, but the problem is vsi->[tx\|rx]_mapping_mode have not been set yet. This was working because ICE_VSI_MAP_CONTIG is defined to 0. Fix this by being explicit with our mapping mode by initializing the Tx and Rx structure's mapping_mode to ICE_VSI_MAP_CONTIG and then setting the vsi->[tx\|rx]_mapping_mode to the [tx\|rx]_qs_cfg.mapping_mode values. Also, only assign the vsi->[tx\|rx]_mapping_mode when the queues are successfully mapped to the VSI. With this change there was no longer a need to initialize the ret variable to 0 so remove that. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:42:36 -08:00
Brett Creeley	13a6233b03	ice: Add support to enable/disable all Rx queues before waiting Currently when we enable/disable all Rx queues we do the following sequence for each Rx queue and then move to the next queue. 1. Enable/Disable the Rx queue via register write. 2. Read the configuration register to determine if the Rx queue was enabled/disabled successfully. In some cases enabling/disabling queue 0 fails because of step 2 above. Fix this by doing step 1 for all of the Rx queues and then step 2 for all of the Rx queues. Also, there are cases where we enable/disable a single queue (i.e. SR-IOV and XDP) so add a new function that does step 1 and 2 above with a read flush in between. This change also required a single Rx queue to be enabled/disabled with and without waiting for the change to propagate through hardware. Fix this by adding a boolean wait flag to the necessary functions. Also, add the keywords "one" and "all" to distinguish between enabling/disabling a single Rx queue and all Rx queues respectively. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:39:55 -08:00
Brett Creeley	72634bc228	ice: Only allow tagged bcast/mcast traffic for VF in port VLAN Currently the VF can see other's broadcast and multicast traffic because it always has a VLAN filter for VLAN 0. Fix this by removing/adding the VF's VLAN 0 filter when a port VLAN is added/removed respectively. This required a few changes. 1. Move where we add VLAN 0 by default for the VF into ice_alloc_vsi_res() because this is when we determine if a port VLAN is present for load and reset. 2. Moved where we kill the old port VLAN filter in ice_set_vf_port_vlan() to the very end of the function because it allows us to save the old port VLAN configuration upon any failure case. 3. During adding/removing of a port VLAN via ice_set_vf_port_vlan() we also need to remove/add the VLAN 0 filter rule respectively. 4. Improve log messages. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:37:21 -08:00
Brett Creeley	61c9ce86a6	ice: Fix Port VLAN priority bits Currently when configuring a port VLAN for a VF we are only shifting the QoS bits by 12. This is incorrect. Fix this by getting rid of the ICE specific VLAN defines and use the kernel VLAN defines instead. Also, don't assign a value to vlanprio until the VLAN ID and QoS parameters have been validated. Also, there are many places we do (le16_to_cpu(vsi->info.pvid) & VLAN_VID_MASK). Instead do (vf->port_vlan_info & VLAN_VID_MASK) because we always save what's stored in vsi->info.pvid to vf->port_vlan_info in the CPU's endianness. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:34:02 -08:00
Brett Creeley	0b6c6a8bb6	ice: Add helper to determine if VF link is up The check for vf->link_up is incorrect because this field is only valid if vf->link_forced is true. Fix this by adding the helper ice_is_vf_link_up() to determine if the VF's link is up. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:29:49 -08:00
Brett Creeley	b093841f9a	ice: Refactor port vlan configuration for the VF Currently ice_vsi_manage_pvid() calls ice_vsi_[set\|kill]_pvid_fill_ctxt() when enabling/disabling a port VLAN on a VSI respectively. These two functions have some duplication so just move their unique pieces inline in ice_vsi_manage_pvid() and then the duplicate code can be reused for both the enabling/disabling paths. Before this patch the info.pvid field was not being written correctly via ice_vsi_kill_pvid_fill_ctxt() so it was being hard coded to 0 in ice_set_vf_port_vlan(). Fix this by setting the info.pvid field to 0 before calling ice_vsi_update() in ice_vsi_manage_pvid(). We currently use vf->port_vlan_id to keep track of the port VLAN ID and QoS, which is a bit misleading. Fix this by renaming it to vf->port_vlan_info. Also change the name of the argument for ice_vsi_manage_pvid() from vid to pvid_info. In ice_vsi_manage_pvid() only save the fields that were modified in the VSI properties structure on success instead of the entire thing. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:27:17 -08:00
Brett Creeley	42f3efef35	ice: Add initial support for QinQ Allow support for S-Tag + C-Tag VLAN traffic by disabling pruning when there are no 0x8100 VLAN interfaces currently created on top of the PF. When an 0x8100 VLAN interface is configured, enable pruning and only support single and double C-Tag VLAN traffic. If all of the 0x8100 interfaces that were created on top of the PF are removed via ethtool -K <iface> rx-vlan-filter off or via ip tools, then disable pruning and allow S-Tag + C-Tag traffic again. Add VLAN 0 filter by default for the PF. This is because a bridge sets the default_pvid to 1, sends the request down to ice_vlan_rx_add_vid(), and we never get the request to add VLAN 0 via the 8021q module which causes all untagged traffic to be dropped. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-15 16:24:33 -08:00
Tony Nguyen	4ee656bba8	ice: Trivial fixes This is a collection of trivial fixes including fixing whitespace, typos, function headers, reverse Christmas tree, etc. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:49:12 -08:00
Ben Shelton	1d8bd99272	ice: Use correct netif error function Use the correct netif_msg_[tx,rx]_error() function to determine whether to print the MDD event type. Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:49:07 -08:00
Anirudh Venkataramanan	3306f79f42	ice: Cleanup ice_vsi_alloc_q_vectors 1. Remove local variable num_q_vectors and use vsi->num_q_vectors instead 2. Remove local variable pf and pass vsi->back to ice_pf_to_dev Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:49:04 -08:00
Anirudh Venkataramanan	19cce2c6f6	ice: Make print statements more compact Formatting strings in print function calls (like dev_info, dev_err, etc.) can exceed 80 columns without making checkpatch unhappy. So remove newlines where applicable and make print statements more compact. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:49:00 -08:00
Anirudh Venkataramanan	9a946843ba	ice: Use ice_pf_to_dev Use ice_pf_to_dev(pf) instead of &pf->pdev->dev Use ice_pf_to_dev(vsi->back) instead of &vsi->back->pdev->dev When a pointer to the pf instance is available, use ice_pf_to_dev instead of ice_hw_to_dev Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:55 -08:00
Tony Nguyen	0a6ea04e3b	ice: Remove possible null dereference Commit `1f45ebe0d8` ("ice: add extra check for null Rx descriptor") moved the call to ice_construct_skb() under a null check as Coverity reported a possible use of null skb. However, the original call was not deleted, do so now. Fixes: `1f45ebe0d8` ("ice: add extra check for null Rx descriptor") Reported-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:51 -08:00
Bruce Allan	cf8fc2a086	ice: update Unit Load Status bitmask to check after reset After a reset the Unit Load Status bits in the GLNVM_ULD register to check for completion should be 0x7FF before continuing. Update the mask to check (minus the three reserved bits that are always set). Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:45 -08:00
Bruce Allan	fbf1e1f698	ice: fix and consolidate logging of NVM/firmware version information Logging the firmware/NVM information during driver load is redundant since that information is also available via ethtool. Move the functionality found in ice_nvm_version_str() directly into ice_get_drvinfo() and remove calling the former and logging that info during driver probe. This also gets rid of a bug in ice_nvm_version_str() where it returns a pointer to a buffer which is free'ed when that function exits. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:41 -08:00
Akeem G Abodunrin	b55e603252	ice: Modify link message logging This patch modifies link message logging to include "Full Duplex" and "Negotiated" for FEC, so as to distinguish it from "Requested" FEC. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:35 -08:00
Anirudh Venkataramanan	a8b72ce03a	ice: Remove CONFIG_PCI_IOV wrap in ice_set_pf_caps Remove unnecessary CONFIG_PCI_IOV wrapping in ice_set_pf_caps. None of the data structures accessed within the block are wrapped with this flag. When CONFIG_PCI_IOV is undefined, pf->num_vfs_supported will be 0 anyway. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:31 -08:00
Brett Creeley	3d9f999080	ice: Remove ice_dev_onetime_setup() ice_dev_onetime_setup contains driver workarounds needed for firmware limitations. These issues have now been resolved in newer NVMs so remove the function. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:26 -08:00
Brett Creeley	168983a8e1	ice: Don't allow same value for Rx tail to be written twice Currently we compare the value we are about to write to the Rx tail register with the previous value of next_to_use. The problem with this is we only write tail on 8 descriptor boundaries, but next_to_use is updated whenever we clean Rx descriptors. Fix this by comparing the value we are about to write to tail with the previously written tail value. This will prevent duplicate Rx tail bumps. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:22 -08:00
Paul Greenwalt	ad9a87bec3	ice: display supported and advertised link modes Display all of the supported and advertised link modes based on the PHY capability with media. Displaying all supported modes is more informative then only displaying the current link mode. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:18 -08:00
Dave Ertman	53977ee474	ice: Fix switch between FW and SW LLDP When switching between FW and SW LLDP mode, the number of configured TLV apps in the driver's DCB configuration is getting out of synch with what lldpad thinks is configured. This is causing a problem when shutting down lldpad. The cleanup is trying to delete TLV apps that are not defined in the kernel. Since the driver is keeping an accurate account of the apps defined, use the drivers number of apps to determine if there is an app to delete. If the number of apps is <= 1, then do not attempt to delete. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:14 -08:00
Dave Ertman	242b5e068b	ice: Fix DCB rebuild after reset The function ice_dcb_rebuild had some logic flaws in it, and also didn't differentiate between FW and SW modes needs. For FW flow, the willing setting was being forced to OFF and left that way. Unwilling in DCB FW mode is not a supported model. Leave the config alone and use the return value from the set command to determine if setting the config was successful. The SW DCB flow does not need to need to register for MIB change events (as they are not used in SW mode). Use !is_sw_lldp checks to only perform FW specific task while in FW mode. Also adding a reapplication of the current DCB config after a link event. Some NVMs are not maintaining their DCB configs across link events. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-02-12 11:48:10 -08:00
Tony Nguyen	18a8d35863	ice: Bump version Bump version to 0.8.2-k Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-25 21:50:23 -08:00
Md Fahad Iqbal Polash	6876fb6404	ice: Implement ethtool get/set rx-flow-hash Provide support to change or retrieve RSS hash options for a flow type. The supported flow-types are: tcp4, tcp6, udp4, udp6, sctp4, sctp6. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-25 21:47:28 -08:00
Md Fahad Iqbal Polash	1c01c8c6c9	ice: Initilialize VF RSS tables Set configuration for hardware RSS tables for VFs. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-25 21:45:19 -08:00
Tony Nguyen	2c61054c5f	ice: Optimize table usage Attempt to optimize TCAM entries and reduce table resource usage by searching for profiles that can be reused. Provide resource cleanup of both hardware and software structures. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-25 21:42:50 -08:00
Tony Nguyen	43dbfc7bb8	ice: Enable writing filtering tables Write the hardware tables based on the populated software structures. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-25 21:38:32 -08:00
Tony Nguyen	451f2c4406	ice: Populate TCAM filter software structures Store the TCAM entry with the profile data and the VSI group in the respective SW structures. This will be subsequently used to write out the tables to hardware. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-25 21:34:36 -08:00
Tony Nguyen	31ad4e4ee1	ice: Allocate flow profile Create an extraction sequence based on the packet header protocols to be programmed and allocate a flow profile for the extraction sequence. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-24 16:06:32 -08:00
Tony Nguyen	c90ed40cef	ice: Enable writing hardware filtering tables Enable the driver to write the filtering hardware tables to allow for changing of RSS rules. Upon loading of DDP package, a minimal configuration should be written to hardware. Introduce and initialize structures for storing configuration and make the top level calls to configure the RSS tables to initial values. A packet segment will be created but nothing is written to hardware yet. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-24 13:18:19 -08:00
Colin Ian King	102d412a3d	ice: remove redundant assignment to variable xmit_done The variable xmit_done is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-17 09:55:34 -08:00
Julio Faracco	ed5a3f664c	ice: Removing hung_queue variable to use txqueue function parameter The scope of function .ndo_tx_timeout was changed to include the hang queue when a TX timeout event occurs. See commit `0290bd291c` ("netdev: pass the stuck queue to the timeout handler") for more details. Now, drivers don't need to identify which queue is stopped. Drivers can simply use the queue index provided by dev_watchdog and execute all actions needed to restore network traffic. This commit do some cleanups into Intel ice driver to remove a redundant loop to find stopped queue. Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-17 09:55:34 -08:00
Jacob Keller	5d9e618cbb	ice: Add device ids for E822 devices Add support for E822 devices Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Krzysztof Kazimierczak	9112539934	ice: Suppress Coverity warnings for xdp_rxq_info_reg Coverity reports some of the calls to xdp_rxq_info_reg() as potential issues, because the driver does not check its return value. However, those calls are wrapped with "if (!xdp_rxq_info_is_reg(&ring->xdp_rxq))" and this check alone is enough to be sure that the function will never fail. All possible states of xdp_rxq_info are: - NEW, - REGISTERED, - UNREGISTERED, - UNUSED. The driver won't mark a queue as UNUSED under no circumstance, so the return value can be ignored safely. Add comments for Coverity right above calls to xdp_rxq_info_reg() to suppress the warnings. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Krzysztof Kazimierczak	65bb559b6c	ice: Add a boundary check in ice_xsk_umem() In ice_xsk_umem(), variable qid which is later used as an array index, is not validated for a possible boundary exceedance. Because of that, a calling function might receive an invalid address, which causes general protection fault when dereferenced. To address this, add a boundary check to see if qid is greater than the size of a UMEM array. Also, don't let user change vsi->num_xsk_umems just by trying to setup a second UMEM if its value is already set up (i.e. UMEM region has already been allocated for this VSI). While at it, make sure that ring->zca.free pointer is always zeroed out if there is no UMEM on a specified ring. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Mitch Williams	1f45ebe0d8	ice: add extra check for null Rx descriptor In the case where the hardware gives us a null Rx descriptor, it is theoretically possible that we could call one of our skb-construction functions with no data pointer, which would cause a panic. In real life, this will never happen - we only get null RX descriptors as the final descriptor in a chain of otherwise-valid descriptors. When this happens, the skb will be extant and we'll just call ice_add_rx_frag(), which can deal with empty data buffers. Unfortunately, Coverity does not have intimate knowledge of our hardware, so we must add a check here. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Bruce Allan	ac614b13fe	ice: suppress checked_return error Coverity reports an error that is not really an error; suppress it. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Tony Nguyen	bda5b7db82	ice: Demote MTU change print to debug Following the changes of commit `12299132b3` ("net: ethernet: intel: Demote MTU change prints to debug"), change the MTU change message to netdev_dbg() Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Brett Creeley	ed4c068d46	ice: Enable ip link show on the PF to display VF unicast MAC(s) Currently when there are SR-IOV VF(s) and the user does "ip link show <pf interface>" the VF unicast MAC addresses all show 00:00:00:00:00:00 if the unicast MAC was set via VIRTCHNL (i.e. not administratively set by the host PF). This is misleading to the host administrator. Fix this by setting the VF's dflt_lan_addr.addr when the VF's unicast MAC address is configured via VIRTCHNL. There are a couple cases where we don't allow the dflt_lan_addr.addr field to be written. First, If the VF's pf_set_mac field is true and the VF is not trusted, then we don't allow the dflt_lan_addr.addr to be modified. Second, if the dflt_lan_addr.addr has already been set (i.e. via VIRTCHNL). Also a small refactor was done to separate the flow for add and delete MAC addresses in order to simplify the logic for error conditions and set/clear the VF's dflt_lan_addr.addr field. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Brett Creeley	26a91525cc	ice: Fix VF link state when it's IFLA_VF_LINK_STATE_AUTO Currently the flow for ice_set_vf_link_state() is not configuring link the same as all other VF link configuration flows. Fix this by only setting the necessary VF members in ice_set_vf_link_state() and then call ice_vc_notify_link_state() to actually configure link for the VF. This made ice_set_pfe_link_forced() unnecessary, so it was deleted. Also, this commonizes the link flows for the VF to all call ice_vc_notify_link_state(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Vignesh Sridhar	f57a683ded	ice: Remove Rx flex descriptor programming Remove Rx flex descriptor metadata and flag programming; per specification these registers cannot be written to as they are read only. Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Michal Swiatkowski	11c25c2f2e	ice: Return error on not supported ethtool -C parameters Check for all unused parameters, if ethtool sent one of them, print info about that and return error. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Michal Swiatkowski	61dc79ced7	ice: Restore interrupt throttle settings after VSI rebuild After each rebuild driver deallocates q_vectors, so the interrupt throttle rate (ITR) settings get lost. Create a function to save and restore ITR for each queue. If a user increases the number of queues, restore all the previous queue settings for each existing queue, and the additional queues will get the default setting. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Michal Swiatkowski	118e0e1002	ice: Set default value for ITR in alloc function When the user sets itr_setting to zero from ethtool -C, the driver changes this value to default in ice_cfg_itr (for example after changing ring param). Remove code that sets default value in ice_cfg_itr and move it to place where the driver allocates q_vectors. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Brett Creeley	005881bcf9	ice: Add ice_for_each_vf() macro Currently we do "for (i = 0; i < pf->num_alloc_vfs; i++)" all over the place. Many other places use macros to contain this repeated for loop, So create the macro ice_for_each_vf(pf, i) that does the same thing. There were a couple places we were using one loop variable and a VF iterator, which were changed to using a local variable within the ice_for_each_vf() macro. Also in ice_alloc_vfs() we were setting pf->num_alloc_vfs after doing "for (i = 0; i < num_alloc_vfs; i++)". Instead assign pf->num_alloc_vfs right after allocating memory for the pf->vf array. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Brett Creeley	fc0f39bcb5	ice: Add code to keep track of current dflt_vsi We can't have more than one default VSI so prevent another VSI from overwriting the current dflt_vsi. This was achieved by adding the following functions: ice_is_dflt_vsi_in_use() - Used to check if the default VSI is already being used. ice_is_vsi_dflt_vsi() - Used to check if VSI passed in is in fact the default VSI. ice_set_dflt_vsi() - Used to set the default VSI via a switch rule ice_clear_dflt_vsi() - Used to clear the default VSI via a switch rule. Also, there was no need to introduce any locking because all mailbox events and synchronization of switch filters for the PF happen in the service task. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Brett Creeley	cd6d6b8331	ice: Fix VF spoofchk There are many things wrong with the function ice_set_vf_spoofchk(). 1. The VSI being modified is the PF VSI, not the VF VSI. 2. We are enabling Rx VLAN pruning instead of Tx VLAN anti-spoof. 3. The spoofchk setting for each VF is not initialized correctly or re-initialized correctly on reset. To fix [1] we need to make sure we are modifying the VF VSI. This is done by using the vf->lan_vsi_idx to index into the PF's VSI array. To fix [2] replace setting Rx VLAN pruning in ice_set_vf_spoofchk() with setting Tx VLAN anti-spoof. To Fix [3] we need to make sure the initial VSI settings match what is done in ice_set_vf_spoofchk() for spoofchk=on. Also make sure this also works for VF reset. This was done by modifying ice_vsi_init() to account for the current spoofchk state of the VF VSI. Because of these changes, Tx VLAN anti-spoof needs to be removed from ice_cfg_vlan_pruning(). This is okay for the VF because this is now controlled from the admin enabling/disabling spoofchk. For the PF, Tx VLAN anti-spoof should not be set. This change requires us to call ice_set_vf_spoofchk() when configuring promiscuous mode for the VF which requires ice_set_vf_spoofchk() to move in order to prevent a forward declaration prototype. Also, add VLAN 0 by default when allocating a VF since the PF is unaware if the guest OS is running the 8021q module. Without this, MDD events will trigger on untagged traffic because spoofcheck is enabled by default. Due to this change, ignore add/delete messages for VLAN 0 from VIRTCHNL since this is added/deleted during VF initialization/teardown respectively and should not be modified. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:33 -08:00
Brett Creeley	a54e3b8cff	ice: Support UDP segmentation offload Based on the work done by Alex Duyck on other Intel drivers, add code to support UDP segmentation offload (USO) for the ice driver. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-01-03 16:08:32 -08:00
David S. Miller	2bbc078f81	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2019-12-27 The following pull-request contains BPF updates for your net-next tree. We've added 127 non-merge commits during the last 17 day(s) which contain a total of 110 files changed, 6901 insertions(+), 2721 deletions(-). There are three merge conflicts. Conflicts and resolution looks as follows: 1) Merge conflict in net/bpf/test_run.c: There was a tree-wide cleanup `c593642c8b` ("treewide: Use sizeof_field() macro") which gets in the way with `b590cb5f80` ("bpf: Switch to offsetofend in BPF_PROG_TEST_RUN"): <<<<<<< HEAD if (!range_is_zero(__skb, offsetof(struct __sk_buff, priority) + sizeof_field(struct __sk_buff, priority), ======= if (!range_is_zero(__skb, offsetofend(struct __sk_buff, priority), >>>>>>> `7c8dce4b16` There are a few occasions that look similar to this. Always take the chunk with offsetofend(). Note that there is one where the fields differ in here: <<<<<<< HEAD if (!range_is_zero(__skb, offsetof(struct __sk_buff, tstamp) + sizeof_field(struct __sk_buff, tstamp), ======= if (!range_is_zero(__skb, offsetofend(struct __sk_buff, gso_segs), >>>>>>> `7c8dce4b16` Just take the one with offsetofend() /and/ gso_segs. Latter is correct due to `850a88cc40` ("bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN"). 2) Merge conflict in arch/riscv/net/bpf_jit_comp.c: (I'm keeping Bjorn in Cc here for a double-check in case I got it wrong.) <<<<<<< HEAD if (is_13b_check(off, insn)) return -1; emit(rv_blt(tcc, RV_REG_ZERO, off >> 1), ctx); ======= emit_branch(BPF_JSLT, RV_REG_T1, RV_REG_ZERO, off, ctx); >>>>>>> `7c8dce4b16` Result should look like: emit_branch(BPF_JSLT, tcc, RV_REG_ZERO, off, ctx); 3) Merge conflict in arch/riscv/include/asm/pgtable.h: <<<<<<< HEAD ======= #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1) #define VMALLOC_END (PAGE_OFFSET - 1) #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) #define BPF_JIT_REGION_SIZE (SZ_128M) #define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) #define BPF_JIT_REGION_END (VMALLOC_END) /* * Roughly size the vmemmap space to be large enough to fit enough * struct pages to map half the virtual address space. Then * position vmemmap directly below the VMALLOC region. / #define VMEMMAP_SHIFT \ (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT) #define VMEMMAP_SIZE BIT(VMEMMAP_SHIFT) #define VMEMMAP_END (VMALLOC_START - 1) #define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE) #define vmemmap ((struct page )VMEMMAP_START) >>>>>>> `7c8dce4b16` Only take the BPF_* defines from there and move them higher up in the same file. Remove the rest from the chunk. The VMALLOC_* etc defines got moved via `01f52e16b8` ("riscv: define vmemmap before pfn_to_page calls"). Result: [...] #define __S101 PAGE_READ_EXEC #define __S110 PAGE_SHARED_EXEC #define __S111 PAGE_SHARED_EXEC #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1) #define VMALLOC_END (PAGE_OFFSET - 1) #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) #define BPF_JIT_REGION_SIZE (SZ_128M) #define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) #define BPF_JIT_REGION_END (VMALLOC_END) /* * Roughly size the vmemmap space to be large enough to fit enough * struct pages to map half the virtual address space. Then * position vmemmap directly below the VMALLOC region. */ #define VMEMMAP_SHIFT \ (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT) #define VMEMMAP_SIZE BIT(VMEMMAP_SHIFT) #define VMEMMAP_END (VMALLOC_START - 1) #define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE) [...] Let me know if there are any other issues. Anyway, the main changes are: 1) Extend bpftool to produce a struct (aka "skeleton") tailored and specific to a provided BPF object file. This provides an alternative, simplified API compared to standard libbpf interaction. Also, add libbpf extern variable resolution for .kconfig section to import Kconfig data, from Andrii Nakryiko. 2) Add BPF dispatcher for XDP which is a mechanism to avoid indirect calls by generating a branch funnel as discussed back in bpfconf'19 at LSF/MM. Also, add various BPF riscv JIT improvements, from Björn Töpel. 3) Extend bpftool to allow matching BPF programs and maps by name, from Paul Chaignon. 4) Support for replacing cgroup BPF programs attached with BPF_F_ALLOW_MULTI flag for allowing updates without service interruption, from Andrey Ignatov. 5) Cleanup and simplification of ring access functions for AF_XDP with a bonus of 0-5% performance improvement, from Magnus Karlsson. 6) Enable BPF JITs for x86-64 and arm64 by default. Also, final version of audit support for BPF, from Daniel Borkmann and latter with Jiri Olsa. 7) Move and extend test_select_reuseport into BPF program tests under BPF selftests, from Jakub Sitnicki. 8) Various BPF sample improvements for xdpsock for customizing parameters to set up and benchmark AF_XDP, from Jay Jayatheerthan. 9) Improve libbpf to provide a ulimit hint on permission denied errors. Also change XDP sample programs to attach in driver mode by default, from Toke Høiland-Jørgensen. 10) Extend BPF test infrastructure to allow changing skb mark from tc BPF programs, from Nikita V. Shirokov. 11) Optimize prologue code sequence in BPF arm32 JIT, from Russell King. 12) Fix xdp_redirect_cpu BPF sample to manually attach to tracepoints after libbpf conversion, from Jesper Dangaard Brouer. 13) Minor misc improvements from various others. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-27 14:20:10 -08:00
David S. Miller	ac80010fc9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Mere overlapping changes in the conflicts here. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-22 15:15:05 -08:00
Magnus Karlsson	f8509aa078	xsk: ixgbe: i40e: ice: mlx5: Xsk_umem_discard_addr to xsk_umem_release_addr Change the name of xsk_umem_discard_addr to xsk_umem_release_addr to better reflect the new naming of the AF_XDP queue manipulation functions. As this functions is used by drivers implementing support for AF_XDP zero-copy, it requires a name change to these drivers. The function xsk_umem_release_addr_rq has also changed name in the same fashion. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1576759171-28550-10-git-send-email-magnus.karlsson@intel.com	2019-12-20 16:00:09 -08:00
Michael S. Tsirkin	0290bd291c	netdev: pass the stuck queue to the timeout handler This allows incrementing the correct timeout statistic without any mess. Down the road, devices can learn to reset just the specific queue. The patch was generated with the following script: use strict; use warnings; our $^I = '.bak'; my @work = ( ["arch/m68k/emu/nfeth.c", "nfeth_tx_timeout"], ["arch/um/drivers/net_kern.c", "uml_net_tx_timeout"], ["arch/um/drivers/vector_kern.c", "vector_net_tx_timeout"], ["arch/xtensa/platforms/iss/network.c", "iss_net_tx_timeout"], ["drivers/char/pcmcia/synclink_cs.c", "hdlcdev_tx_timeout"], ["drivers/infiniband/ulp/ipoib/ipoib_main.c", "ipoib_timeout"], ["drivers/infiniband/ulp/ipoib/ipoib_main.c", "ipoib_timeout"], ["drivers/message/fusion/mptlan.c", "mpt_lan_tx_timeout"], ["drivers/misc/sgi-xp/xpnet.c", "xpnet_dev_tx_timeout"], ["drivers/net/appletalk/cops.c", "cops_timeout"], ["drivers/net/arcnet/arcdevice.h", "arcnet_timeout"], ["drivers/net/arcnet/arcnet.c", "arcnet_timeout"], ["drivers/net/arcnet/com20020.c", "arcnet_timeout"], ["drivers/net/ethernet/3com/3c509.c", "el3_tx_timeout"], ["drivers/net/ethernet/3com/3c515.c", "corkscrew_timeout"], ["drivers/net/ethernet/3com/3c574_cs.c", "el3_tx_timeout"], ["drivers/net/ethernet/3com/3c589_cs.c", "el3_tx_timeout"], ["drivers/net/ethernet/3com/3c59x.c", "vortex_tx_timeout"], ["drivers/net/ethernet/3com/3c59x.c", "vortex_tx_timeout"], ["drivers/net/ethernet/3com/typhoon.c", "typhoon_tx_timeout"], ["drivers/net/ethernet/8390/8390.h", "ei_tx_timeout"], ["drivers/net/ethernet/8390/8390.h", "eip_tx_timeout"], ["drivers/net/ethernet/8390/8390.c", "ei_tx_timeout"], ["drivers/net/ethernet/8390/8390p.c", "eip_tx_timeout"], ["drivers/net/ethernet/8390/ax88796.c", "ax_ei_tx_timeout"], ["drivers/net/ethernet/8390/axnet_cs.c", "axnet_tx_timeout"], ["drivers/net/ethernet/8390/etherh.c", "__ei_tx_timeout"], ["drivers/net/ethernet/8390/hydra.c", "__ei_tx_timeout"], ["drivers/net/ethernet/8390/mac8390.c", "__ei_tx_timeout"], ["drivers/net/ethernet/8390/mcf8390.c", "__ei_tx_timeout"], ["drivers/net/ethernet/8390/lib8390.c", "__ei_tx_timeout"], ["drivers/net/ethernet/8390/ne2k-pci.c", "ei_tx_timeout"], ["drivers/net/ethernet/8390/pcnet_cs.c", "ei_tx_timeout"], ["drivers/net/ethernet/8390/smc-ultra.c", "ei_tx_timeout"], ["drivers/net/ethernet/8390/wd.c", "ei_tx_timeout"], ["drivers/net/ethernet/8390/zorro8390.c", "__ei_tx_timeout"], ["drivers/net/ethernet/adaptec/starfire.c", "tx_timeout"], ["drivers/net/ethernet/agere/et131x.c", "et131x_tx_timeout"], ["drivers/net/ethernet/allwinner/sun4i-emac.c", "emac_timeout"], ["drivers/net/ethernet/alteon/acenic.c", "ace_watchdog"], ["drivers/net/ethernet/amazon/ena/ena_netdev.c", "ena_tx_timeout"], ["drivers/net/ethernet/amd/7990.h", "lance_tx_timeout"], ["drivers/net/ethernet/amd/7990.c", "lance_tx_timeout"], ["drivers/net/ethernet/amd/a2065.c", "lance_tx_timeout"], ["drivers/net/ethernet/amd/am79c961a.c", "am79c961_timeout"], ["drivers/net/ethernet/amd/amd8111e.c", "amd8111e_tx_timeout"], ["drivers/net/ethernet/amd/ariadne.c", "ariadne_tx_timeout"], ["drivers/net/ethernet/amd/atarilance.c", "lance_tx_timeout"], ["drivers/net/ethernet/amd/au1000_eth.c", "au1000_tx_timeout"], ["drivers/net/ethernet/amd/declance.c", "lance_tx_timeout"], ["drivers/net/ethernet/amd/lance.c", "lance_tx_timeout"], ["drivers/net/ethernet/amd/mvme147.c", "lance_tx_timeout"], ["drivers/net/ethernet/amd/ni65.c", "ni65_timeout"], ["drivers/net/ethernet/amd/nmclan_cs.c", "mace_tx_timeout"], ["drivers/net/ethernet/amd/pcnet32.c", "pcnet32_tx_timeout"], ["drivers/net/ethernet/amd/sunlance.c", "lance_tx_timeout"], ["drivers/net/ethernet/amd/xgbe/xgbe-drv.c", "xgbe_tx_timeout"], ["drivers/net/ethernet/apm/xgene-v2/main.c", "xge_timeout"], ["drivers/net/ethernet/apm/xgene/xgene_enet_main.c", "xgene_enet_timeout"], ["drivers/net/ethernet/apple/macmace.c", "mace_tx_timeout"], ["drivers/net/ethernet/atheros/ag71xx.c", "ag71xx_tx_timeout"], ["drivers/net/ethernet/atheros/alx/main.c", "alx_tx_timeout"], ["drivers/net/ethernet/atheros/atl1c/atl1c_main.c", "atl1c_tx_timeout"], ["drivers/net/ethernet/atheros/atl1e/atl1e_main.c", "atl1e_tx_timeout"], ["drivers/net/ethernet/atheros/atlx/atl.c", "atlx_tx_timeout"], ["drivers/net/ethernet/atheros/atlx/atl1.c", "atlx_tx_timeout"], ["drivers/net/ethernet/atheros/atlx/atl2.c", "atl2_tx_timeout"], ["drivers/net/ethernet/broadcom/b44.c", "b44_tx_timeout"], ["drivers/net/ethernet/broadcom/bcmsysport.c", "bcm_sysport_tx_timeout"], ["drivers/net/ethernet/broadcom/bnx2.c", "bnx2_tx_timeout"], ["drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h", "bnx2x_tx_timeout"], ["drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c", "bnx2x_tx_timeout"], ["drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c", "bnx2x_tx_timeout"], ["drivers/net/ethernet/broadcom/bnxt/bnxt.c", "bnxt_tx_timeout"], ["drivers/net/ethernet/broadcom/genet/bcmgenet.c", "bcmgenet_timeout"], ["drivers/net/ethernet/broadcom/sb1250-mac.c", "sbmac_tx_timeout"], ["drivers/net/ethernet/broadcom/tg3.c", "tg3_tx_timeout"], ["drivers/net/ethernet/calxeda/xgmac.c", "xgmac_tx_timeout"], ["drivers/net/ethernet/cavium/liquidio/lio_main.c", "liquidio_tx_timeout"], ["drivers/net/ethernet/cavium/liquidio/lio_vf_main.c", "liquidio_tx_timeout"], ["drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c", "lio_vf_rep_tx_timeout"], ["drivers/net/ethernet/cavium/thunder/nicvf_main.c", "nicvf_tx_timeout"], ["drivers/net/ethernet/cirrus/cs89x0.c", "net_timeout"], ["drivers/net/ethernet/cisco/enic/enic_main.c", "enic_tx_timeout"], ["drivers/net/ethernet/cisco/enic/enic_main.c", "enic_tx_timeout"], ["drivers/net/ethernet/cortina/gemini.c", "gmac_tx_timeout"], ["drivers/net/ethernet/davicom/dm9000.c", "dm9000_timeout"], ["drivers/net/ethernet/dec/tulip/de2104x.c", "de_tx_timeout"], ["drivers/net/ethernet/dec/tulip/tulip_core.c", "tulip_tx_timeout"], ["drivers/net/ethernet/dec/tulip/winbond-840.c", "tx_timeout"], ["drivers/net/ethernet/dlink/dl2k.c", "rio_tx_timeout"], ["drivers/net/ethernet/dlink/sundance.c", "tx_timeout"], ["drivers/net/ethernet/emulex/benet/be_main.c", "be_tx_timeout"], ["drivers/net/ethernet/ethoc.c", "ethoc_tx_timeout"], ["drivers/net/ethernet/faraday/ftgmac100.c", "ftgmac100_tx_timeout"], ["drivers/net/ethernet/fealnx.c", "fealnx_tx_timeout"], ["drivers/net/ethernet/freescale/dpaa/dpaa_eth.c", "dpaa_tx_timeout"], ["drivers/net/ethernet/freescale/fec_main.c", "fec_timeout"], ["drivers/net/ethernet/freescale/fec_mpc52xx.c", "mpc52xx_fec_tx_timeout"], ["drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c", "fs_timeout"], ["drivers/net/ethernet/freescale/gianfar.c", "gfar_timeout"], ["drivers/net/ethernet/freescale/ucc_geth.c", "ucc_geth_timeout"], ["drivers/net/ethernet/fujitsu/fmvj18x_cs.c", "fjn_tx_timeout"], ["drivers/net/ethernet/google/gve/gve_main.c", "gve_tx_timeout"], ["drivers/net/ethernet/hisilicon/hip04_eth.c", "hip04_timeout"], ["drivers/net/ethernet/hisilicon/hix5hd2_gmac.c", "hix5hd2_net_timeout"], ["drivers/net/ethernet/hisilicon/hns/hns_enet.c", "hns_nic_net_timeout"], ["drivers/net/ethernet/hisilicon/hns3/hns3_enet.c", "hns3_nic_net_timeout"], ["drivers/net/ethernet/huawei/hinic/hinic_main.c", "hinic_tx_timeout"], ["drivers/net/ethernet/i825xx/82596.c", "i596_tx_timeout"], ["drivers/net/ethernet/i825xx/ether1.c", "ether1_timeout"], ["drivers/net/ethernet/i825xx/lib82596.c", "i596_tx_timeout"], ["drivers/net/ethernet/i825xx/sun3_82586.c", "sun3_82586_timeout"], ["drivers/net/ethernet/ibm/ehea/ehea_main.c", "ehea_tx_watchdog"], ["drivers/net/ethernet/ibm/emac/core.c", "emac_tx_timeout"], ["drivers/net/ethernet/ibm/emac/core.c", "emac_tx_timeout"], ["drivers/net/ethernet/ibm/ibmvnic.c", "ibmvnic_tx_timeout"], ["drivers/net/ethernet/intel/e100.c", "e100_tx_timeout"], ["drivers/net/ethernet/intel/e1000/e1000_main.c", "e1000_tx_timeout"], ["drivers/net/ethernet/intel/e1000e/netdev.c", "e1000_tx_timeout"], ["drivers/net/ethernet/intel/fm10k/fm10k_netdev.c", "fm10k_tx_timeout"], ["drivers/net/ethernet/intel/i40e/i40e_main.c", "i40e_tx_timeout"], ["drivers/net/ethernet/intel/iavf/iavf_main.c", "iavf_tx_timeout"], ["drivers/net/ethernet/intel/ice/ice_main.c", "ice_tx_timeout"], ["drivers/net/ethernet/intel/ice/ice_main.c", "ice_tx_timeout"], ["drivers/net/ethernet/intel/igb/igb_main.c", "igb_tx_timeout"], ["drivers/net/ethernet/intel/igbvf/netdev.c", "igbvf_tx_timeout"], ["drivers/net/ethernet/intel/ixgb/ixgb_main.c", "ixgb_tx_timeout"], ["drivers/net/ethernet/intel/ixgbe/ixgbe_debugfs.c", "adapter->netdev->netdev_ops->ndo_tx_timeout(adapter->netdev);"], ["drivers/net/ethernet/intel/ixgbe/ixgbe_main.c", "ixgbe_tx_timeout"], ["drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c", "ixgbevf_tx_timeout"], ["drivers/net/ethernet/jme.c", "jme_tx_timeout"], ["drivers/net/ethernet/korina.c", "korina_tx_timeout"], ["drivers/net/ethernet/lantiq_etop.c", "ltq_etop_tx_timeout"], ["drivers/net/ethernet/marvell/mv643xx_eth.c", "mv643xx_eth_tx_timeout"], ["drivers/net/ethernet/marvell/pxa168_eth.c", "pxa168_eth_tx_timeout"], ["drivers/net/ethernet/marvell/skge.c", "skge_tx_timeout"], ["drivers/net/ethernet/marvell/sky2.c", "sky2_tx_timeout"], ["drivers/net/ethernet/marvell/sky2.c", "sky2_tx_timeout"], ["drivers/net/ethernet/mediatek/mtk_eth_soc.c", "mtk_tx_timeout"], ["drivers/net/ethernet/mellanox/mlx4/en_netdev.c", "mlx4_en_tx_timeout"], ["drivers/net/ethernet/mellanox/mlx4/en_netdev.c", "mlx4_en_tx_timeout"], ["drivers/net/ethernet/mellanox/mlx5/core/en_main.c", "mlx5e_tx_timeout"], ["drivers/net/ethernet/micrel/ks8842.c", "ks8842_tx_timeout"], ["drivers/net/ethernet/micrel/ksz884x.c", "netdev_tx_timeout"], ["drivers/net/ethernet/microchip/enc28j60.c", "enc28j60_tx_timeout"], ["drivers/net/ethernet/microchip/encx24j600.c", "encx24j600_tx_timeout"], ["drivers/net/ethernet/natsemi/sonic.h", "sonic_tx_timeout"], ["drivers/net/ethernet/natsemi/sonic.c", "sonic_tx_timeout"], ["drivers/net/ethernet/natsemi/jazzsonic.c", "sonic_tx_timeout"], ["drivers/net/ethernet/natsemi/macsonic.c", "sonic_tx_timeout"], ["drivers/net/ethernet/natsemi/natsemi.c", "ns_tx_timeout"], ["drivers/net/ethernet/natsemi/ns83820.c", "ns83820_tx_timeout"], ["drivers/net/ethernet/natsemi/xtsonic.c", "sonic_tx_timeout"], ["drivers/net/ethernet/neterion/s2io.h", "s2io_tx_watchdog"], ["drivers/net/ethernet/neterion/s2io.c", "s2io_tx_watchdog"], ["drivers/net/ethernet/neterion/vxge/vxge-main.c", "vxge_tx_watchdog"], ["drivers/net/ethernet/netronome/nfp/nfp_net_common.c", "nfp_net_tx_timeout"], ["drivers/net/ethernet/nvidia/forcedeth.c", "nv_tx_timeout"], ["drivers/net/ethernet/nvidia/forcedeth.c", "nv_tx_timeout"], ["drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c", "pch_gbe_tx_timeout"], ["drivers/net/ethernet/packetengines/hamachi.c", "hamachi_tx_timeout"], ["drivers/net/ethernet/packetengines/yellowfin.c", "yellowfin_tx_timeout"], ["drivers/net/ethernet/pensando/ionic/ionic_lif.c", "ionic_tx_timeout"], ["drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c", "netxen_tx_timeout"], ["drivers/net/ethernet/qlogic/qla3xxx.c", "ql3xxx_tx_timeout"], ["drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c", "qlcnic_tx_timeout"], ["drivers/net/ethernet/qualcomm/emac/emac.c", "emac_tx_timeout"], ["drivers/net/ethernet/qualcomm/qca_spi.c", "qcaspi_netdev_tx_timeout"], ["drivers/net/ethernet/qualcomm/qca_uart.c", "qcauart_netdev_tx_timeout"], ["drivers/net/ethernet/rdc/r6040.c", "r6040_tx_timeout"], ["drivers/net/ethernet/realtek/8139cp.c", "cp_tx_timeout"], ["drivers/net/ethernet/realtek/8139too.c", "rtl8139_tx_timeout"], ["drivers/net/ethernet/realtek/atp.c", "tx_timeout"], ["drivers/net/ethernet/realtek/r8169_main.c", "rtl8169_tx_timeout"], ["drivers/net/ethernet/renesas/ravb_main.c", "ravb_tx_timeout"], ["drivers/net/ethernet/renesas/sh_eth.c", "sh_eth_tx_timeout"], ["drivers/net/ethernet/renesas/sh_eth.c", "sh_eth_tx_timeout"], ["drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c", "sxgbe_tx_timeout"], ["drivers/net/ethernet/seeq/ether3.c", "ether3_timeout"], ["drivers/net/ethernet/seeq/sgiseeq.c", "timeout"], ["drivers/net/ethernet/sfc/efx.c", "efx_watchdog"], ["drivers/net/ethernet/sfc/falcon/efx.c", "ef4_watchdog"], ["drivers/net/ethernet/sgi/ioc3-eth.c", "ioc3_timeout"], ["drivers/net/ethernet/sgi/meth.c", "meth_tx_timeout"], ["drivers/net/ethernet/silan/sc92031.c", "sc92031_tx_timeout"], ["drivers/net/ethernet/sis/sis190.c", "sis190_tx_timeout"], ["drivers/net/ethernet/sis/sis900.c", "sis900_tx_timeout"], ["drivers/net/ethernet/smsc/epic100.c", "epic_tx_timeout"], ["drivers/net/ethernet/smsc/smc911x.c", "smc911x_timeout"], ["drivers/net/ethernet/smsc/smc9194.c", "smc_timeout"], ["drivers/net/ethernet/smsc/smc91c92_cs.c", "smc_tx_timeout"], ["drivers/net/ethernet/smsc/smc91x.c", "smc_timeout"], ["drivers/net/ethernet/stmicro/stmmac/stmmac_main.c", "stmmac_tx_timeout"], ["drivers/net/ethernet/sun/cassini.c", "cas_tx_timeout"], ["drivers/net/ethernet/sun/ldmvsw.c", "sunvnet_tx_timeout_common"], ["drivers/net/ethernet/sun/niu.c", "niu_tx_timeout"], ["drivers/net/ethernet/sun/sunbmac.c", "bigmac_tx_timeout"], ["drivers/net/ethernet/sun/sungem.c", "gem_tx_timeout"], ["drivers/net/ethernet/sun/sunhme.c", "happy_meal_tx_timeout"], ["drivers/net/ethernet/sun/sunqe.c", "qe_tx_timeout"], ["drivers/net/ethernet/sun/sunvnet.c", "sunvnet_tx_timeout_common"], ["drivers/net/ethernet/sun/sunvnet_common.c", "sunvnet_tx_timeout_common"], ["drivers/net/ethernet/sun/sunvnet_common.h", "sunvnet_tx_timeout_common"], ["drivers/net/ethernet/synopsys/dwc-xlgmac-net.c", "xlgmac_tx_timeout"], ["drivers/net/ethernet/ti/cpmac.c", "cpmac_tx_timeout"], ["drivers/net/ethernet/ti/cpsw.c", "cpsw_ndo_tx_timeout"], ["drivers/net/ethernet/ti/cpsw_priv.c", "cpsw_ndo_tx_timeout"], ["drivers/net/ethernet/ti/cpsw_priv.h", "cpsw_ndo_tx_timeout"], ["drivers/net/ethernet/ti/davinci_emac.c", "emac_dev_tx_timeout"], ["drivers/net/ethernet/ti/netcp_core.c", "netcp_ndo_tx_timeout"], ["drivers/net/ethernet/ti/tlan.c", "tlan_tx_timeout"], ["drivers/net/ethernet/toshiba/ps3_gelic_net.h", "gelic_net_tx_timeout"], ["drivers/net/ethernet/toshiba/ps3_gelic_net.c", "gelic_net_tx_timeout"], ["drivers/net/ethernet/toshiba/ps3_gelic_wireless.c", "gelic_net_tx_timeout"], ["drivers/net/ethernet/toshiba/spider_net.c", "spider_net_tx_timeout"], ["drivers/net/ethernet/toshiba/tc35815.c", "tc35815_tx_timeout"], ["drivers/net/ethernet/via/via-rhine.c", "rhine_tx_timeout"], ["drivers/net/ethernet/wiznet/w5100.c", "w5100_tx_timeout"], ["drivers/net/ethernet/wiznet/w5300.c", "w5300_tx_timeout"], ["drivers/net/ethernet/xilinx/xilinx_emaclite.c", "xemaclite_tx_timeout"], ["drivers/net/ethernet/xircom/xirc2ps_cs.c", "xirc_tx_timeout"], ["drivers/net/fjes/fjes_main.c", "fjes_tx_retry"], ["drivers/net/slip/slip.c", "sl_tx_timeout"], ["include/linux/usb/usbnet.h", "usbnet_tx_timeout"], ["drivers/net/usb/aqc111.c", "usbnet_tx_timeout"], ["drivers/net/usb/asix_devices.c", "usbnet_tx_timeout"], ["drivers/net/usb/asix_devices.c", "usbnet_tx_timeout"], ["drivers/net/usb/asix_devices.c", "usbnet_tx_timeout"], ["drivers/net/usb/ax88172a.c", "usbnet_tx_timeout"], ["drivers/net/usb/ax88179_178a.c", "usbnet_tx_timeout"], ["drivers/net/usb/catc.c", "catc_tx_timeout"], ["drivers/net/usb/cdc_mbim.c", "usbnet_tx_timeout"], ["drivers/net/usb/cdc_ncm.c", "usbnet_tx_timeout"], ["drivers/net/usb/dm9601.c", "usbnet_tx_timeout"], ["drivers/net/usb/hso.c", "hso_net_tx_timeout"], ["drivers/net/usb/int51x1.c", "usbnet_tx_timeout"], ["drivers/net/usb/ipheth.c", "ipheth_tx_timeout"], ["drivers/net/usb/kaweth.c", "kaweth_tx_timeout"], ["drivers/net/usb/lan78xx.c", "lan78xx_tx_timeout"], ["drivers/net/usb/mcs7830.c", "usbnet_tx_timeout"], ["drivers/net/usb/pegasus.c", "pegasus_tx_timeout"], ["drivers/net/usb/qmi_wwan.c", "usbnet_tx_timeout"], ["drivers/net/usb/r8152.c", "rtl8152_tx_timeout"], ["drivers/net/usb/rndis_host.c", "usbnet_tx_timeout"], ["drivers/net/usb/rtl8150.c", "rtl8150_tx_timeout"], ["drivers/net/usb/sierra_net.c", "usbnet_tx_timeout"], ["drivers/net/usb/smsc75xx.c", "usbnet_tx_timeout"], ["drivers/net/usb/smsc95xx.c", "usbnet_tx_timeout"], ["drivers/net/usb/sr9700.c", "usbnet_tx_timeout"], ["drivers/net/usb/sr9800.c", "usbnet_tx_timeout"], ["drivers/net/usb/usbnet.c", "usbnet_tx_timeout"], ["drivers/net/vmxnet3/vmxnet3_drv.c", "vmxnet3_tx_timeout"], ["drivers/net/wan/cosa.c", "cosa_net_timeout"], ["drivers/net/wan/farsync.c", "fst_tx_timeout"], ["drivers/net/wan/fsl_ucc_hdlc.c", "uhdlc_tx_timeout"], ["drivers/net/wan/lmc/lmc_main.c", "lmc_driver_timeout"], ["drivers/net/wan/x25_asy.c", "x25_asy_timeout"], ["drivers/net/wimax/i2400m/netdev.c", "i2400m_tx_timeout"], ["drivers/net/wireless/intel/ipw2x00/ipw2100.c", "ipw2100_tx_timeout"], ["drivers/net/wireless/intersil/hostap/hostap_main.c", "prism2_tx_timeout"], ["drivers/net/wireless/intersil/hostap/hostap_main.c", "prism2_tx_timeout"], ["drivers/net/wireless/intersil/hostap/hostap_main.c", "prism2_tx_timeout"], ["drivers/net/wireless/intersil/orinoco/main.c", "orinoco_tx_timeout"], ["drivers/net/wireless/intersil/orinoco/orinoco_usb.c", "orinoco_tx_timeout"], ["drivers/net/wireless/intersil/orinoco/orinoco.h", "orinoco_tx_timeout"], ["drivers/net/wireless/intersil/prism54/islpci_dev.c", "islpci_eth_tx_timeout"], ["drivers/net/wireless/intersil/prism54/islpci_eth.c", "islpci_eth_tx_timeout"], ["drivers/net/wireless/intersil/prism54/islpci_eth.h", "islpci_eth_tx_timeout"], ["drivers/net/wireless/marvell/mwifiex/main.c", "mwifiex_tx_timeout"], ["drivers/net/wireless/quantenna/qtnfmac/core.c", "qtnf_netdev_tx_timeout"], ["drivers/net/wireless/quantenna/qtnfmac/core.h", "qtnf_netdev_tx_timeout"], ["drivers/net/wireless/rndis_wlan.c", "usbnet_tx_timeout"], ["drivers/net/wireless/wl3501_cs.c", "wl3501_tx_timeout"], ["drivers/net/wireless/zydas/zd1201.c", "zd1201_tx_timeout"], ["drivers/s390/net/qeth_core.h", "qeth_tx_timeout"], ["drivers/s390/net/qeth_core_main.c", "qeth_tx_timeout"], ["drivers/s390/net/qeth_l2_main.c", "qeth_tx_timeout"], ["drivers/s390/net/qeth_l2_main.c", "qeth_tx_timeout"], ["drivers/s390/net/qeth_l3_main.c", "qeth_tx_timeout"], ["drivers/s390/net/qeth_l3_main.c", "qeth_tx_timeout"], ["drivers/staging/ks7010/ks_wlan_net.c", "ks_wlan_tx_timeout"], ["drivers/staging/qlge/qlge_main.c", "qlge_tx_timeout"], ["drivers/staging/rtl8192e/rtl8192e/rtl_core.c", "_rtl92e_tx_timeout"], ["drivers/staging/rtl8192u/r8192U_core.c", "tx_timeout"], ["drivers/staging/unisys/visornic/visornic_main.c", "visornic_xmit_timeout"], ["drivers/staging/wlan-ng/p80211netdev.c", "p80211knetdev_tx_timeout"], ["drivers/tty/n_gsm.c", "gsm_mux_net_tx_timeout"], ["drivers/tty/synclink.c", "hdlcdev_tx_timeout"], ["drivers/tty/synclink_gt.c", "hdlcdev_tx_timeout"], ["drivers/tty/synclinkmp.c", "hdlcdev_tx_timeout"], ["net/atm/lec.c", "lec_tx_timeout"], ["net/bluetooth/bnep/netdev.c", "bnep_net_timeout"] ); for my $p (@work) { my @pair = @$p; my $file = $pair[0]; my $func = $pair[1]; print STDERR $file , ": ", $func,"\n"; our @ARGV = ($file); while (<ARGV>) { if (m/($func\s$struct\s+net_device\s+\[A-Za-z_]?[A-Za-z-0-9_])($)/) { print STDERR "found $1+$2 in $file\n"; } if (s/($func\s$struct\s+net_device\s+\[A-Za-z_]?[A-Za-z-0-9_])($)/$1, unsigned int txqueue$2/) { print STDERR "$func found in $file\n"; } print; } } where the list of files and functions is simply from: git grep ndo_tx_timeout, with manual addition of headers in the rare cases where the function is from a header, then manually changing the few places which actually call ndo_tx_timeout. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Martin Habets <mhabets@solarflare.com> changes from v9: fixup a forward declaration changes from v9: more leftovers from v3 change changes from v8: fix up a missing direct call to timeout rebased on net-next changes from v7: fixup leftovers from v3 change changes from v6: fix typo in rtl driver changes from v5: add missing files (allow any net device argument name) changes from v4: add a missing driver header changes from v3: change queue # to unsigned Changes from v2: added headers Changes from v1: Fix errors found by kbuild: generalize the pattern a bit, to pick up a couple of instances missed by the previous version. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-12-12 21:38:57 -08:00
Pankaj Bharadiya	c593642c8b	treewide: Use sizeof_field() macro Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except at places where these are defined. Later patches will remove the unused definition of FIELD_SIZEOF(). This patch is generated using following script: EXCLUDE_FILES="include/linux/stddef.h\|include/linux/kernel.h" git grep -l -e "\bFIELD_SIZEOF\b" \| while read file; do if [[ "$file" =~ $EXCLUDE_FILES ]]; then continue fi sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file; done Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: David Miller <davem@davemloft.net> # for net	2019-12-09 10:36:44 -08:00
Kevin Scott	ed960c1d36	ice: Update FW API minor version Update FW API minor version to align to current value advertised by FW in new NVM images. Signed-off-by: Kevin Scott <kevin.c.scott@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:43:46 -08:00
Jacob Keller	1748ce80e0	ice: remove pointless NULL check of port_info The code in ice_sched_cleanup_all checks whether the port info is NULL prior to calling ice_sched_clear_port. However, ice_sched_clear_port already checks whether port info is non-NULL. More importantly, it also checks whether the port structure has been initialized by checking its port_state field as well. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:43:42 -08:00
Henry Tieman	87324e747f	ice: Implement ethtool ops for channels Add code to query and set the number of channels on the primary VSI for a PF. This is accessed from the 'ethtool -l' and 'ethtool -L' commands, respectively. Though the ice driver supports asymmetric queues report an IRQ vector that has both Rx and Tx queues attached and is counted as a 'combined' channel. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:43:26 -08:00
Jesse Brandeburg	730fdea40b	ice: implement VF stats NDO Implement the VF stats gathering via the kernel via ndo_get_vf_stats(). The driver will show per-VF stats in the output of the ip -s link show dev <PF> command. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:15:25 -08:00
Jesse Brandeburg	4c66d227e4	ice: add helpers for virtchnl The virtchannel interface was repeating a lot of strings and wasting storage space in the kernel. There was also inconsistent messages for the same thing. Consolidate all those messages and bit checks into a couple of helper functions. Also, reduce stack space usage by simplifying getting the pointer to the pf using a helper. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Co-developed-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:15:21 -08:00
Brett Creeley	4015d11e4b	ice: Add ice_pf_to_dev(pf) macro We use &pf->dev->pdev all over the code. Add a simple macro to do this for us. When multiple de-references like this are being done add a local struct device variable. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:15:17 -08:00
Tony Nguyen	9efe35d0db	ice: Do not use devm* functions for local uses In situations where we alloc and free memory within the same function do not use the devm_* variants; use regular alloc and free functions. Remove any unused vars if there are no usages after these changes. Also, replace an allocate and copy with kmemdup() and remove an unnecessary memset() to 0 after a kzalloc(). Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:15:12 -08:00
Brett Creeley	1bc7a4ab85	ice: Refactor removal of VLAN promiscuous rules Currently ice_clear_vsi_promisc() detects if the VLAN ID sent is not 0 and sets the recipe_id to ICE_SW_LKUP_PROMISC_VLAN in that case and ICE_SW_LKUP_PROMISC if the VLAN_ID is 0. However this doesn't allow VLAN 0 promiscuous rules to be removed, but they can be added. Fix this by checking if the promisc_mask contains ICE_PROMISC_VLAN_RX or ICE_PROMISC_VLAN_TX. This change was made to match what is being done for ice_set_vsi_promisc(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:15:08 -08:00
Brett Creeley	e25f9152bc	ice: Fix setting coalesce to handle DCB configuration Currently there can be a case where a DCB map is applied and there are more interrupt vectors (vsi->num_q_vectors) than Rx queues (vsi->num_rxq) and Tx queues (vsi->num_txq). If we try to set coalesce settings in this case it will report a false failure. Fix this by checking if vector index is valid with respect to the number of Tx and Rx queues configured. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:15:04 -08:00
Akeem G Abodunrin	1f9639d2fb	ice: Only disable VF state when freeing each VF resources It is wrong to set PF disable state flag for all VFs when freeing VF resources - Instead, we should set VF disable state flag for each VF with its resources being returned to the device. Right now, all VF opcodes, mailbox communication to clear its resources as well fails - since we already indicate that PF is in disable state, with all VFs not active. In addition, we don't need to notify VF that PF is intending to reset it, if it is already in disabled state. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:14:48 -08:00
Jesse Brandeburg	949375de94	ice: fix stack leakage In the case of an invalid virtchannel request the driver would return uninitialized data to the VF from the PF stack which is a bug. Fix by initializing the stack variable earlier in the function before any return paths can be taken. Fixes: `1071a8358a` ("ice: Implement virtchnl commands for AVF support") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:09:31 -08:00
Brett Creeley	2f9ec24198	ice: Don't modify stripping for add/del VLANs on VF Currently when adding/deleting vlans in ice_vc_process_vlan_msg() we are calling ice_vsi_manage_vlan_stripping() to enable/disable when adding and deleting a VLAN respectively. This is wrong because adding/deleting VLANs has nothing to do with configuring VLAN stripping. VLAN stripping is configured through the following VIRTCHNL operations: VIRTCHNL_OP_ENABLE_VLAN_STRIPPING VIRTCHNL_OP_DISABLE_VLAN_STRIPPING Unfortunately we can't just remove this because then stripping will never be configured on VF initialization. Fix this by adding a new function that initializes (disables/enables) VLAN stripping for the VF based on the device supported capabilities. This allows us to remove the call to ice_vsi_manage_vlan_stripping() in ice_vc_process_vlan_msg(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:06:34 -08:00
Brett Creeley	d4bc4e2d6b	ice: Disallow VF VLAN opcodes if VLAN offloads disabled Currently if the host disables VLAN offloads on the VF by not setting the VIRTCHNL_VF_OFFLOAD_VLAN capability bit we will still honor VF VLAN configuration messages over VIRTCHNL. These messages (i.e. enable/disable VLAN stripping and VLAN filtering) should be blocked when the feature is not supported. Fix that by adding a helper function to determine if the VF is allowed to do VLAN operations based on the host's VF configuration. Also, mirror the VF communicated capabilities in the host's VF configuration. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:06:34 -08:00
Bruce Allan	9164f761c9	ice: Correct capabilities reporting of max TCs Firmware always returns 8 as the max number of supported TCs. However on devices with more than 4 ports, the maximum number of TCs per port is limited to 4. Check and, if necessary, correct the reporting of capabilities for devices with more than 4 ports. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:06:34 -08:00
Bruce Allan	eae1bbb2a4	ice: Store number of functions for the device Store the number of functions the device has and use this number when setting safe mode capabilities instead of calculating it. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Co-developed-by: Kevin Scott <kevin.c.scott@intel.com> Signed-off-by: Kevin Scott <kevin.c.scott@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-22 13:06:34 -08:00
David S. Miller	14684b9301	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net One conflict in the BPF samples Makefile, some fixes in 'net' whilst we were converting over to Makefile.target rules in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-09 11:04:37 -08:00
Colin Ian King	615457a226	ice: fix potential infinite loop because loop counter being too small Currently the for-loop counter i is a u8 however it is being checked against a maximum value hw->num_tx_sched_layers which is a u16. Hence there is a potential wrap-around of counter i back to zero if hw->num_tx_sched_layers is greater than 255. Fix this by making i a u16. Addresses-Coverity: ("Infinite loop") Fixes: `b36c598c99` ("ice: Updates to Tx scheduler code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 16:10:51 -08:00
Jacob Keller	fb0254b284	ice: print opcode when printing controlq errors To help aid in debugging, display the command opcode in debug messages that print an error code. This makes it easier to see what command failed if only ICE_DBG_AQ_MSG is enabled. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:03:18 -08:00
Jacob Keller	faa01721ce	ice: use more accurate ICE_DBG mask types ice_debug_cq is passed a mask which is always ICE_DBG_AQ_CMD. Modify this function, removing the mask parameter entirely, and directly use the more appropriate ICE_DBG_AQ_DESC and ICE_DBG_AQ_DESC_BUF. The function is only called from ice_controlq.c, and has no other callers outside of that file. Move it and mark it static to avoid namespace pollution. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:03:15 -08:00
Anirudh Venkataramanan	964674f1dd	ice: Introduce and use ice_vsi_type_str ice_vsi_type_str converts an ice_vsi_type enum value to its string equivalent. This is expected to help easily identify VSI types from module print statements. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:03:12 -08:00
Bruce Allan	87a2e49889	ice: remove unnecessary conditional check There is no reason to do this conditional check before the assignment so simply remove it. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:03:10 -08:00
Brett Creeley	893869d5d0	ice: Update enum ice_flg64_bits to current specification Currently the VLAN ice_flg64_bits are off by 1. Fix this by setting the ICE_FLG_EVLAN_x8100 flag to 14, which also updates ICE_FLG_EVLAN_x9100 to 15 and ICE_FLG_VLAN_x8100 to 16. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:03:06 -08:00
Mitch Williams	88bb432a55	ice: delay less Shorten the delay for SQ responses, but increase the number of loops. Max delay time is unchanged, but some operations complete much more quickly. In the process, add a new define to make the delay count and delay time more explicit. Add comments to make things more explicit. This fixes a problem with VF resets failing on with many VFs. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:03:03 -08:00
Bruce Allan	e000248ec8	ice: use pkg_dwnld_status instead of sq_last_status Since the return value from the Download Package AQ command is stored in hw->pkg_dwnld_status, use that instead of sq_last_status since that may have the return value from some other AQ command leading to unexpected results. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:02:59 -08:00
Brett Creeley	b791cdd5c7	ice: Change max MSI-x vector_id check in cfg_irq_map Currently we check to make sure the vector_id passed down from iavf is less than or equal to pf->hw.func_caps.common_caps.num_msix_vectors. This is incorrect because the vector_id is always 0-based and never greater than or equal to the ICE_MAX_INTR_PER_VF. Fix this by checking to make sure the vector_id is less than the max allowed interrupts per VF (ICE_MAX_INTR_PER_VF). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:02:56 -08:00
Akeem G Abodunrin	ec4f5a436b	ice: Check if VF is disabled for Opcode and other operations This patch adds code to check if PF or VF is disabled before honoring mailbox message to configure VF - If it is disabled, and opcode is for resetting VF, the PF driver simply tell VF that all is set. In addition, if reset is ongoing, and Admin intend to configure VF on the host, we can poll the VF enabling bit to make sure it is ready before continue - If after ~250 milliseconds, VF is not in active state, we can bail out with invalid error. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:02:54 -08:00
Paul Greenwalt	241c8cf052	ice: configure software LLDP in ice_init_pf_dcb Move software LLDP configuration when FW DCBX is disabled to ice_init_pf_dcb, since that is where the FW DCBX state is determined. Remove this software LLDP configuration from ice_vsi_setup and ice_set_priv_flags. Software configuration includes redirecting Rx LLDP packets up the stack, when FW DCBX is not running. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:02:50 -08:00
Usha Ketineni	c0a3665f71	ice: Fix to change Rx/Tx ring descriptor size via ethtool with DCBx This patch fixes the call trace caused by the kernel when the Rx/Tx descriptor size change request is initiated via ethtool when DCB is configured. ice_set_ringparam() should use vsi->num_txq instead of vsi->alloc_txq as it represents the queues that are enabled in the driver when DCB is enabled/disabled. Otherwise, queue index being used can go out of range. For example, when vsi->alloc_txq has 104 queues and with 3 TCS enabled via DCB, each TC gets 34 queues, vsi->num_txq will be 102 and only 102 queues will be enabled. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:02:46 -08:00
Henry Tieman	5f8cc355c4	ice: avoid setting features during reset Certain subsystems behave very badly when called during reset (core dump). This patch returns -EBUSY when reconfiguring some subsystems during reset. With this patch some ethtool functions will not core dump during reset. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:02:43 -08:00
Dave Ertman	b94b013eb6	ice: Implement DCBNL support Implement interface layer for the DCBNL subsystem. These are the functions to support the callbacks defined in the dcbnl_rtnl_ops struct. These callbacks are going to be used to interface with the DCB settings of the device. Implementation of dcb_nl set functions and supporting SW DCB functions. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 12:02:14 -08:00
Usha Ketineni	1ddef455f4	ice: Add NDO callback to set the maximum per-queue bitrate Allow for rate limiting Tx queues. Bitrate is set in Mbps(megabits per second). Mbps max-rate is set for the queue via sysfs: /sys/class/net/<iface>/queues/tx-<queue>/tx_maxrate ex: echo 100 >/sys/class/net/ens7/queues/tx-0/tx_maxrate echo 200 >/sys/class/net/ens7/queues/tx-1/tx_maxrate Note: A value of zero for tx_maxrate means disabled, default is disabled. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Co-developed-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 11:58:49 -08:00
Anirudh Venkataramanan	9d614b6425	ice: Use ice_ena_vsi and ice_dis_vsi in DCB configuration flow DCB configuration flow needs to disable and enable only the PF (main) VSI, so use ice_ena_vsi and ice_dis_vsi. To avoid the use of ifdef to control the staticness of these functions, move them to ice_lib.c. Also replace the allocate and copy of old_cfg to kmemdup() in ice_pf_dcb_cfg(). Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-08 11:58:49 -08:00
Anirudh Venkataramanan	039c60c597	ice: Fix return value when SR-IOV is not supported When the device is not capable of supporting SR-IOV -ENODEV is being returned; -EOPNOTSUPP is more appropriate. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Brett Creeley	ff010eca05	ice: Rename VF function ice_vc_dis_vf to match its behavior ice_vc_dis_vf() tells iavf that it's going to perform a reset and then performs a software reset. This is misleading based on the function name because the VF does not get disabled. So fix this by changing the name to ice_vc_reset_vf(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Krzysztof Kazimierczak	133f4883f9	ice: Get rid of ice_cleanup_header ice_cleanup_hdrs() has been stripped of most of its content, it only serves as a wrapper for eth_skb_pad(). We can get rid of it altogether and simplify the codebase. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Paul Greenwalt	e18ff11818	ice: print PCI link speed and width Print message to inform user of PCI link speed and width. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Paul Greenwalt	5878589dc3	ice: print unsupported module message Print message to inform user if unsupported module is inserted, and extend the topology / configuration detection. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Mitch Williams	395594563b	ice: write register with correct offset The VF_MBX_ARQLEN register array is per-PF, not global, so we should not use the absolute VF ID as an index. Instead, use the per-PF VF ID. This fixes an issue with VFs on PFs other than 0 not seeing reset. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Michal Swiatkowski	eb0ee8abfe	ice: Check for null pointer dereference when setting rings Without this check rebuild vsi can lead to kernel panic. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Michal Swiatkowski	4e56802e0e	ice: save PCI state in probe Save state to correct recovery memory and I/O BARs address after PCI bus reset. Without this after reset kernel can't read device registers. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Dave Ertman	b2883dfe1f	ice: Adjust DCB INIT for SW mode Adjust ice_init_dcb to set the is_sw_lldp boolean in the case where the FW has been detected to be in an untenable state such that the driver should forcibly make sure it is off. This will ensure that the FW is in a known state. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Bruce Allan	c6012ac1c3	ice: fix driver unload flow As part of the driver unload flow, a PF reset is issued which may still cause an interrupt to be generated by the device. Do not clear the interrupt scheme until the reset is complete and there are no pending transactions otherwise a hardware error may occur. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Paul Greenwalt	cfbf13674b	ice: handle DCBx non-contiguous TC request If DCBx request non-contiguous TCs, then the driver will configure default traffic class (TC0). This is done to prevent Tx hang since the driver currently does not support non-contiguous TC. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Md Fahad Iqbal Polash	031f214752	ice: Update Boot Configuration Section read of NVM The Boot Configuration Section Block has been moved to the Preserved Field Area (PFA) of NVM. Update the NVM reads that involves Boot Configuration Section. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Scott W Taylor	a012dca9f7	ice: add ethtool -m support for reading i2c eeprom modules Implement ethtool -m support to read eeprom data from SFP/QSFP modules. Signed-off-by: Scott W Taylor <scott.w.taylor@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-06 16:41:45 -08:00
Maciej Fijalkowski	23b44513c3	ice: allow 3k MTU for XDP At this point ice driver is able to work on order 1 pages that are split onto two 3k buffers. Let's reflect that when user is setting new MTU size and XDP is present on interface. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-04 13:10:08 -08:00
Maciej Fijalkowski	aaf27254fd	ice: add build_skb() support Driver is now prepared for building the skb around the existing Rx buffer, so introduce the ice_build_skb responsible for it. Make use of XDP's data_meta as well. I've observed around 30% less CPU consumption with build_skb Rx path, in comparison to legacy Rx. What stands behind such result is the avoidance of flow_dissector (which we were diving into via eth_get_headlen) and no memcpy calls. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-04 13:09:59 -08:00
Maciej Fijalkowski	59bb080805	ice: introduce frame padding computation logic Take into account the underlying architecture specific settings and based on that calculate the possible padding that can be supplied. Typically, for x86 and standard MTU size we will end up with 192 bytes of headroom. This is the same behavior as our other drivers have and we can dedicate it for XDP purposes. Furthermore, introduce the Rx ring flag for indicating whether build_skb is used on particular. Based on that invoke the routines for padding calculation. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-04 13:09:50 -08:00
Maciej Fijalkowski	7237f5b0db	ice: introduce legacy Rx flag Add an ethtool "legacy-rx" priv flag for toggling the Rx path. This control knob will be mainly used for build_skb usage as well as buffer size/MTU manipulation. In preparation for adding build_skb support in a way that it takes care of how we set the values of max_frame and rx_buf_len fields of struct ice_vsi. Specifically, in this patch mentioned fields are set to values that will allow us to provide headroom and tailroom in-place. This can be mostly broken down onto following: - for legacy-rx "on" ethtool control knob, old behaviour is kept; - for standard 1500 MTU size configure the buffer of size 1536, as network stack is expecting the NET_SKB_PAD to be provided and NET_IP_ALIGN can have a non-zero value (these can be typically equal to 32 and 2, respectively); - for larger MTUs go with max_frame set to 9k and configure the 3k buffer in case when PAGE_SIZE of underlying arch is less than 8k; 3k buffer is implying the need for order 1 page, so that our page recycling scheme can still be applied; With that said, substitute the hardcoded ICE_RXBUF_2048 and PAGE_SIZE values in DMA API that we're making use of with rx_ring->rx_buf_len and ice_rx_pg_size(rx_ring). The latter is an introduced helper for determining the page size based on its order (which was figured out via ice_rx_pg_order). Last but not least, take care of truesize calculation. In the followup patch the headroom/tailroom computation logic will be introduced. This change aligns the buffer and frame configuration with other Intel drivers, most importantly with iavf. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-04 13:09:46 -08:00
Krzysztof Kazimierczak	2d4238f556	ice: Add support for AF_XDP Add zero copy AF_XDP support. This patch adds zero copy support for Tx and Rx; code for zero copy is added to ice_xsk.h and ice_xsk.c. For Tx, implement ndo_xsk_wakeup. As with other drivers, reuse existing XDP Tx queues for this task, since XDP_REDIRECT guarantees mutual exclusion between different NAPI contexts based on CPU ID. In turn, a netdev can XDP_REDIRECT to another netdev with a different NAPI context, since the operation is bound to a specific core and each core has its own hardware ring. For Rx, allocate frames as MEM_TYPE_ZERO_COPY on queues that AF_XDP is enabled. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-04 12:01:55 -08:00
Krzysztof Kazimierczak	0891d6d4b1	ice: Move common functions to ice_txrx_lib.c In preparation of AF XDP, move functions that will be used both by skb and zero-copy paths to a new file called ice_txrx_lib.c. This allows us to avoid using ifdefs to control the staticness of said functions. Move other functions (ice_rx_csum, ice_rx_hash and ice_ptype_to_htype) called only by the moved ones to the new file as well. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-04 11:45:05 -08:00
Maciej Fijalkowski	efc2214b60	ice: Add support for XDP Add support for XDP. Implement ndo_bpf and ndo_xdp_xmit. Upon load of an XDP program, allocate additional Tx rings for dedicated XDP use. The following actions are supported: XDP_TX, XDP_DROP, XDP_REDIRECT, XDP_PASS, and XDP_ABORTED. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-04 10:23:59 -08:00
Maciej Fijalkowski	e75d1b2c37	ice: get rid of per-tc flow in Tx queue configuration routines There's no reason for treating DCB as first class citizen when configuring the Tx queues and going through TCs. Reverse the logic and base the configuration logic on rings, which is the object of interest anyway. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-04 10:03:14 -08:00
Anirudh Venkataramanan	eff380aaff	ice: Introduce ice_base.c Remove a few uses of kernel configuration flags from ice_lib.c by introducing a new source file ice_base.c. Also move corresponding function prototypes from ice_lib.h to ice_base.h and include ice_base.h where required. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-11-04 10:03:14 -08:00
Tony Nguyen	2de1256636	ice: Bump version Bump version to 0.8.1-k Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-12 11:39:25 -07:00
Tony Nguyen	462acf6aca	ice: Enable DDP package download Attempt to request an optional device-specific DDP package file (one with the PCIe Device Serial Number in its name so that different DDP package files can be used on different devices). If the optional package file exists, download it to the device. If not, download the default package file. Log an appropriate message based on whether or not a DDP package file exists and the return code from the attempt to download it to the device. If the download fails and there is not already a package file on the device, go into "Safe Mode" where some features are not supported. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-12 11:37:38 -07:00
Tony Nguyen	32d63fa1e9	ice: Initialize DDP package structures Add functions to initialize, parse, and clean structures representing the DDP package. Upon completion of package download, read and store the DDP package contents to these structures. This configuration is used to identify the default behavior and later used to update the HW table entries. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-12 11:28:40 -07:00
Tony Nguyen	c764881096	ice: Implement Dynamic Device Personalization (DDP) download Add the required defines, structures, and functions to enable downloading a DDP package. Before download, checks are performed to ensure the package is valid and compatible. Note that package download is not yet requested by the driver as further initialization is required to utilize the package. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-12 11:19:16 -07:00
Lukasz Czapnik	870f805e97	ice: Fix FW version formatting in dmesg The FW build id is currently being displayed as an int which doesn't make sense. Instead display FW build id as a hex value. Also add other useful information to the output such as NVM version, API patch info, and FW build hash. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-12 10:37:22 -07:00
Paul M Stillwell Jr	e3710a01a8	ice: send driver version to firmware The driver is required to send a version to the firmware to indicate that the driver is up. If the driver doesn't do this the firmware doesn't behave properly. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-12 10:22:04 -07:00
Anirudh Venkataramanan	5c875c1af8	ice: Rework around device/function capabilities ice_parse_caps is printing capabilities in a different way when compared to the variable names. This makes it difficult to search for the right strings in the debug logs. So this patch updates the print strings to be exactly the same as the fields' name in the structure. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Jesse Brandeburg	dd47e1fd86	ice: change default number of receive descriptors The driver should start out with a reasonable number of descriptors that can prevent drops due to a CPU being in a power management state. Change the default number of descriptors to 2048. The user can always change the value at runtime. Transmit descriptor counts are not modified because they don't need to change due to the speed of the interface, or for power managed CPUs, but the code is simplified to a fixed value for the transmit default. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Anirudh Venkataramanan	8c243700ab	ice: Minor refactor in queue management Remove q_left_tx and q_left_rx from the PF struct as these can be obtained by calling ice_get_avail_txq_count and ice_get_avail_rxq_count respectively. The function ice_determine_q_usage is only setting num_lan_tx and num_lan_rx in the PF structure, and these are later assigned to vsi->alloc_txq and vsi->alloc_rxq respectively. This is an unnecessary indirection, so remove ice_determine_q_usage and just assign values for vsi->alloc_txq and vsi->alloc_rxq in ice_vsi_set_num_qs and use these to set num_lan_tx and num_lan_rx respectively. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Dave Ertman	ea300f41bb	ice: Allow for delayed LLDP MIB change registration Add an additional boolean parameter to the ice_init_dcb function. This boolean controls if the LLDP MIB change events are registered for. Also, add a new function defined ice_cfg_lldp_mib_change. The additional function is necessary to be able to register for LLDP MIB change events after calling ice_init_dcb. The net effect of these two changes is to allow a delayed registration for MIB change events so that the driver is not accepting events before it is ready for them. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Ashish Shah	201beeb715	ice: update Tx context struct Add internal usage flag, bit 91 as described in spec. Update width of internal queue state to 122 also as described in spec. Signed-off-by: Ashish Shah <ashish.n.shah@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Akeem G Abodunrin	dfc6240012	ice: Report VF link status with opcode to get resources This patch changes how and when the driver report link status, instead of waiting till the call to enable queues for VF, we should report link status earlier with opcode to get VF resources - So as to avoid reporting erroneous information, especially when queues have not been configured. In addition, we can also make a call to get and report link status change after when queue is enabled, at least to report netdev or PHY link status. This is in accordance to how link speed is being reported for PF... Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Anirudh Venkataramanan	80739b57b1	ice: Check for DCB capability before initializing DCB Check the ICE_FLAG_DCB_CAPABLE before calling ice_init_pf_dcb. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Lukasz Czapnik	c61d234234	ice: report link down for VF when PF's queues are not enabled This is port of a fix from i40e commit `2ad1274fa3` ("i40e: don't report link up for a VF who hasn't enabled queues") Older VF drivers do not respond well to receiving a link up notification before queues are enabled. This can cause their state machine to think that it is safe to send traffic. This results in a Tx hang on the VF. Record whether the PF has actually enabled queues for the VF. When reporting link status, always report link down if the queues aren't enabled. In this way, the VF driver will never receive a link up notification until after its queues are enabled. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Mitch Williams	29d42f1f3a	ice: Reliably reset VFs When a PFR (or bigger reset) occurs, the device clears the VF_MBX_ARQLEN register for all VFs. But if a VFR is triggered by a VF, the device does NOT clear this register, and the VF driver will never see the reset. When this happens, the VF driver will eventually timeout and attempt recovery, and usually it will be successful. But this makes resets take a long time and there are occasional failures. We cannot just blithely clear this register on every reset; this has been shown to cause synchronization problems when a PFR is triggered with a large number of VFs. Fix this by clearing VF_MBX_ARQLEN when the reset source is not PFR. GlobR will trigger PFR, so this test catches that occurrence as well. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Jesse Brandeburg	9d56b7fd6a	ice: change work limit to a constant The driver has supported a transmit work limit that was configurable from ethtool for a long time, but there are no good use cases for having it be a variable that can be changed at run time. In addition, this variable was noted to be causing performance overhead due to cache misses. Just remove the variable and let the code use a constant so that the functionality is maintained (a limit on the number of transmits that will be cleaned in any one call to the clean routines) without the cache miss. Removes code, removes a variable, removes testing surface. Yay. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Jesse Brandeburg	d27525ec1f	ice: small efficiency fixes Add a small bit of efficiency to the code by adding a prefetch of the port_info structure in order to help avoid a cache miss a little later on in execution. Also add an unlikely statement to a branch which generally will never happen in normal operation. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Jesse Brandeburg	6503b65930	ice: move code closer together This is a simple patch to move the assignment to a local variable closer to the site where the local variable is used. This can help readability and also maybe performance, although the performance enhancement is really dependent upon the compiler. No functional change. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Jesse Brandeburg	2fb0821fd5	ice: clean up arguments There are a couple of functions that don't need two arguments passed in when the second argument already had access to the pointer pointed to by the first. Remove the unnecessary arguments. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Anirudh Venkataramanan	ade78c2ec1	ice: Check root pointer for validity ice_sched_get_tc_node uses pi->root without checking for NULL. Add a check to prevent NULL pointer dereference. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Anirudh Venkataramanan	208ff75135	ice: Add ice_get_main_vsi to get PF/main VSI There are multiple places where we currently use ice_find_vsi_by_type to get the PF (a.k.a. main) VSI. The PF VSI by definition is always the first element in the pf->vsi array (i.e. pf->vsi[0]). So instead add and use a new helper function ice_get_main_vsi, which just returns pf->vsi[0]. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Brett Creeley	34cdcb165b	ice: Update fields in ice_vsi_set_num_qs when reconfiguring Currently when vsi->req_txqs or vsi->req_rxqs are set we don't correctly set the number of vsi->num_q_vectors. Fix this by setting the number of queue vectors based on the max between the vsi->alloc_txqs and vsi->alloc_rxqs. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Brett Creeley	cd186e5151	ice: Only disable VLAN pruning for the VF when all VLANs are removed Currently if the VF adds a VLAN, VLAN pruning will be enabled for that VSI. Also, when a VLAN gets deleted it will disable VLAN pruning even if other VLAN(s) exists for the VF. Fix this by only disabling VLAN pruning on the VF VSI when removing the last VF (i.e. vf->num_vlan == 0). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 17:17:13 -07:00
Michal Swiatkowski	03bba02016	ice: Remove enable DCB when SW LLDP is activated Remove code that enables DCB in initialization when SW LLDP is activated. DCB flag is set or reset before in ice_init_pf_dcb based on number of TCs. So there is not need to overwrite it. Setting DCB without checking number of TCs can cause communication problems with other cards. Host card sends packet with VLAN priority tag, but client card doesn't strip this tag and ping doesn't work. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 17:14:37 -07:00
Dave Ertman	3d57fd10f2	ice: Report stats when VSI is down There is currently a check in get_ndo_stats that returns before updating stats if the VSI is down or there are no Tx or Rx queues. This causes the netdev to report zero stats with the netdev is down. Remove the check so that the behavior of reporting stats is the same as it was in IXGBE. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 17:07:50 -07:00
Mitch Williams	06914ac20a	ice: Always notify FW of VF reset The call to ice_dis_vsi_txq() acts as the notification to the firmware that the VF is being reset. Because of this, we need to make this call every time we reset, regardless of whatever else we do to stop the Tx queues. Without this change, VF resets would fail to complete on interfaces that were up and running. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 17:04:14 -07:00
Dave Ertman	473ca57488	ice: Correctly handle return values for init DCB In the init path for DCB, the call to ice_init_dcb() can return a non-zero value for either an actual error, or due to the FW lldp engine being stopped. We are currently treating all non-zero values only as an indication that the FW LLDP engine is stopped. Check for an actual error in the DCB init flow. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 17:02:23 -07:00
Usha Ketineni	a257f188b7	ice: Limit Max TCs on devices with more than 4 ports This patch limits the max TCs set by the driver to the value provided by the firmware as per the capabilities of the device. Otherwise, hard coding to 8 TC max would fail the device configurations with more than 4 ports. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:35:58 -07:00
Tony Nguyen	6a025730e0	ice: Cleanup defines in ice_type.h Conventionally, if the #defines/other are not needed by other header files being included, #includes are done first followed by #defines and other stuff. Move the #defines before the #includes to follow this convention. Suggested by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:32:30 -07:00
Jesse Brandeburg	2e0ab37c04	ice: print extra message if topology issue The driver needs to inform the user if there is an issue with the topology / configuration of the link. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:27:45 -07:00
Jesse Brandeburg	432609887a	ice: add print of autoneg state to link message Print the state of auto-negotiation when printing the Link up message. Adds new text to the "NIC Link is up" line like Autoneg: <True \| False> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:25:34 -07:00
Bruce Allan	7404e84a23	ice: update driver unloading field for Queue Shutdown AQ command According to recent specification versions, the field in the Queue Shutdown AdminQ command consisting of the "driver unloading" indication is not a 4 byte field (it is byte.bit 16.0). Change it to a byte and remove the unnecessary endian conversion. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:23:35 -07:00
Bruce Allan	18057cb357	ice: add needed PFR during driver unload According to the specification, a PF Reset must be done as part of the driver unload flow. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:18:52 -07:00
Chinh T Cao	d24ef08a9d	ice: Deduce TSA value from the priority value in the CEE mode In CEE mode, the TSA information can be derived from the reported priority value. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:16:36 -07:00
Brett Creeley	567af267fa	ice: Report what the user set for coalesce [tx\|rx]-usecs Currently if the user sets an odd value for [tx\|rx]-usecs we align the value because the hardware only understands ITR values in multiples of 2. This seems misleading because we are essentially telling the user that the ITR value is odd, when in fact we have changed it internally. Fix this by reporting that setting odd ITR values is not allowed. Also, while making changes to ice_set_rc_coalesce() I noticed a bit of code/error duplication. Make the necessary changes to remove the duplication. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:11:10 -07:00
Jeb Cramer	8132e17dfb	ice: Fix resource leak in ice_remove_rule_internal() We don't free s_rule if ice_aq_sw_rules() returns a non-zero status. If it returned a zero status, s_rule would be freed right after, so this implies it should be freed within the scope of the function regardless. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:08:54 -07:00
Anirudh Venkataramanan	03af840650	ice: Fix EMP reset handling ice_reset_subtask needs to handle EMP resets as well, as EMP resets can be triggered by the firmware. This patch adds the logic to do this. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 13:47:12 -07:00
Henry Tieman	ae2bdbb45d	ice: fix adminq calls during remove The order of operations was incorrect in ice_remove(). The code would try to use adminq operations after the adminq was disabled. This caused all adminq calls to fail and possibly timeout waiting. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:54:29 -07:00
Anirudh Venkataramanan	152b978a1f	ice: Rework ice_ena_msix_range The current implementation of ice_ena_msix_range is difficult to read and has subtle issues. This patch reworks the said function for clarity and correctness. More specifically, 1. Add more checks to bail out of 'needed' is greater than 'v_left'. 2. Simplify fallback logic 3. Do not set pf->num_avail_sw_msix in ice_ena_msix_range as it gets overwritten by ice_init_interrupt_scheme. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:52:29 -07:00
Akeem G Abodunrin	cb6a8dc078	ice: Fix VF configuration issues due to reset This patch fixes a critical reset issue that resulting to the server reboot when an Admin changes VF configuration on the host, for example changing VF to Trusted/non_Trusted mode, the PF driver send reset notification to AVF driver while also continue with reset flow. However, AVF driver schedule another reset due to notification, which causes two concurrent reset going on, and trigger lock up in the FW, with AQ call to delete VSI. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:47:57 -07:00
Anirudh Venkataramanan	78b5713ac1	ice: Alloc queue management bitmaps and arrays dynamically The total number of queues available on the device is divided between multiple physical functions (PF) in the firmware and provided to the driver when it gets function capabilities from the firmware. Thus each PF knows how many Tx/Rx queues it has. These queues are then doled out to different VSIs (for LAN traffic, SR-IOV VF traffic, etc.) To track usage of these queues at the PF level, the driver uses two bitmaps avail_txqs and avail_rxqs. At the VSI level (i.e. struct ice_vsi instances) the driver uses two arrays txq_map and rxq_map, to track ownership of VSIs' queues in avail_txqs and avail_rxqs respectively. The aforementioned bitmaps and arrays should be allocated dynamically, because the number of queues supported by a PF is only available once function capabilities have been queried. The current static allocation consumes way more memory than required. This patch removes the DECLARE_BITMAP for avail_txqs and avail_rxqs and instead uses bitmap_zalloc to allocate the bitmaps during init. Similarly txq_map and rxq_map are now allocated in ice_vsi_alloc_arrays. As a result ICE_MAX_TXQS and ICE_MAX_RXQS defines are no longer needed. Also as txq_map and rxq_map are now allocated and freed, some code reordering was required in ice_vsi_rebuild for correct functioning. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:45:54 -07:00
Paul Greenwalt	77ca27c417	ice: add support for virtchnl_queue_select.[tx\|rx]_queues bitmap The VF driver can call VIRTCHNL_OP_[ENABLE\|DISABLE]_QUEUES separately for each queue. Add support for virtchnl_queue_select.[tx\|rx]_queues bitmap which is used to indicate which queues to enable and disable. Add tracing of VF Tx/Rx per queue enable state to avoid enabling enabled queues and disabling disabled queues. Add total queues enabled count and clear ICE_VF_STATE_QS_ENA when count is zero. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Peng Huang <peng.huang@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:37:16 -07:00
Maciej Fijalkowski	d02f734cb7	ice: add support for enabling/disabling single queues Refactor the queue handling functions that are going through queue arrays in a way that the logic done for a single queue is pulled out and it will be called for each ring when traversing ring array. This implies that when disabling Tx rings we won't fill up q_ids, q_teids and q_handles arrays. Drop also 'offset' parameter; the value from vsi's txq_map is stored in ring->reg_idx and that drops the need for mentioned parameter. Introduce the ice_vsi_cfg_txq, ice_vsi_stop_tx_ring and ice_vsi_ctrl_rx_ring that are the functions with pulled out logic. There's several Tx queue meta data (q_id, q_handle, q_teid and other) that need to be set up during Tx queue disablement, so let's as well add a helper structure that wraps it up and a function that will be filling it up. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:33:40 -07:00
Colin Ian King	a1199d679a	ice: fix potential infinite loop The loop counter of a for-loop is a u8 however this is being compared to an int upper bound and this can lead to an infinite loop if the upper bound is greater than 255 since the loop counter will wrap back to zero. Fix this potential issue by making the loop counter an int. Addresses-Coverity: ("Infinite loop") Fixes: `c7aeb4d1b9` ("ice: Disable VFs until reset is completed") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:30:26 -07:00
Jacob Keller	35b4f4372f	ice: fix ice_is_tc_ena ice_is_tc_ena is used to check whether a given traffic class is enabled. Because there are only 8 traffic classes, the function took a u8 bitmap. This causes problems because it is cast to an unsigned long causing a static analysis warning regarding Out-of-bounds read. Fix this by simply updating ice_is_tc_ena to take an unsigned long. Passing a u8 to this function should implicitly convert the value. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:27:10 -07:00
Michal Swiatkowski	9c7dd7566d	ice: add validation in OP_CONFIG_VSI_QUEUES VF message Check num_queue_pairs to avoid access to unallocated field of vsi->tx_rings/vsi->rx_rings. Without this validation we can set vsi->alloc_txq/vsi->alloc_rxq to value smaller than ICE_MAX_BASE_QS_PER_VF and send this command with num_queue_pairs greater than vsi->alloc_txq/vsi->alloc_rxq. This lead to access to unallocated memory. In VF vsi alloc_txq and alloc_rxq should be the same. Get minimum because looks more readable. Also add validation for ring_len param. It should be greater than 32 and be multiple of 32. Incorrect value leads to hang traffic on PF. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:25:14 -07:00
Akeem G Abodunrin	e63a1dbdc7	ice: Don't clog kernel debug log with VF MDD events errors In case of MDD events on VF, don't clog kernel log with unlimited VF MDD events message "VF 0 has had 1018 MDD events since last boot" - limit events log message to 30, based on the observation in some experimentation with sending malicious packet once, and number of events reported before device stopped observing MDD events. Also removed defunct macro "ICE_DFLT_NUM_MDD_EVENTS_ALLOWED" for tracking number of MDD events allowed before disabling the interface... Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:21:28 -07:00
Krzysztof Kazimierczak	4425e0531c	ice: Introduce a local variable for a VSI in the rebuild path When a VSI is accessed inside the ice_for_each_vsi macro in the rebuild path (ice_vsi_rebuild_all() and ice_vsi_replay_all()), it is referred to as pf->vsi[i]. Introduce local variables to improve readability. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:18:06 -07:00
Jesse Brandeburg	dc67039b3d	ice: shorten local and add debug prints Add some verbose debugging for dyndbg to help us when we are having issues with link and/or PHY. While there, shorten some strings used by locals that were causing long line wrapping. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:09:46 -07:00
Anirudh Venkataramanan	f27db2e65e	ice: Sanitize ice_ena_vsi and ice_dis_vsi 1. ndo_open and ndo_stop are implemented by ice_open and ice_stop respectively. When enabling/disabling VSIs, just call ice_open/ice_stop instead of ndo_open/ndo_stop. 2. Rework logic around rtnl_lock/rtnl_unlock 3. In ice_ena_vsi, remove an unnecessary stack variable and return 0 instead of err when __ICE_NEEDS_RESTART is not set. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 23:02:48 -07:00
Victor Raj	2935824873	ice: added sibling head to parse nodes There was a bug in the previous code which never traverses all the children to get the first node of the requested layer. Add a sibling head pointer to point the first node of each layer per TC. This helps traverse easier and quicker and also removes the recursion. Signed-off-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 22:59:00 -07:00
Usha Ketineni	9e7a5d1746	ice: Fix ethtool port and PFC stats for 4x25G cards This patch fixes the issue where port and PFC statistics counters are incrementing at the wrong port with 4x25G cards. Read the GLPRT port registers using lport parameter instead of pf_id to update the statistics otherwise the pf_ids are flipped for ports 2 and 3 when read from the HW register PF_FUNC_RID and this is expected as per hardware specification. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-26 22:54:12 -07:00
Akeem G Abodunrin	8b2c858240	ice: Don't allow VSI to remove unassociated ucast filter If a VSI is not using a unicast filter or did not configure that particular unicast filter, driver should not allow it to be removed by the rogue VSI. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 10:51:46 -07:00
Akeem G Abodunrin	bbb968e8b3	ice: Fix issues updating VSI MAC filters VSI, especially VF could request to add or remove filter for another VSI, driver should really guide such request and disallow it. However, instead of returning error for such malicious request, driver can simply return success. In addition, we are not tracking number of MAC filters configured per VF correctly - and this leads to issue updating VF MAC filters whenever they were removed and re-configured via bringing VF interface down and up. Also, since VF could send request to update multiple MAC filters at once, driver should program those filters individually in the switch, in order to determine which action resulted to error, and communicate accordingly to the VF. So, with this changes, we now track number of filters added right from when VF resources allocation is done, and could properly add filters for both trusted and non_trusted VFs, without MAC filters mis-match issue in the switch... Also refactor code, so that driver can use new function to add or remove MAC filters. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 10:46:53 -07:00
Bruce Allan	5a4a867310	ice: update ethtool stats on-demand Users expect ethtool statistics to be updated on-demand when invoking 'ethtool -S <iface>' instead of providing a snapshot of statistics taken once a second (the frequency of the watchdog task where stats are currently updated). Update stats every time 'ethtool -S <iface>' is run. Also, fix an indentation style issue and an unnecessary local variable initialization in ice_get_ethtool_stats() discovered while investigating the subject issue. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 10:34:27 -07:00
Amruth G.P	3f416961b0	ice: Add input handlers for virtual channel handlers Move the assignment to local variables after validation. Remove unnecessary checks in ice_vc_process_vf_msg() as the respective functions are now performing the checks. Signed-off-by: "Amruth G.P" <amruth.gouda.parameshwarappa@intel.com> Signed-off-by: Nitesh B Venkatesh <nitesh.b.venkatesh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 10:29:53 -07:00
Chinh T Cao	3747f03115	ice: Don't clear auto_fec bit in ice_cfg_phy_fec() The driver should never clear the auto_fec_enable bit. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 10:25:39 -07:00
Chinh T Cao	057911ba9b	ice: Fix flag used for module query When checking the PHY for status, by specification, the driver should be using "topology" mode when querying the module type. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 10:23:11 -07:00
Mitch Williams	90e477379e	ice: silence some bogus error messages In some circumstances, VF devices can be deactivated while a message is in-flight. In that case, a series of scary error message will be printed in the log. Since these are actually harmless, check for this case and suppress them. No harm, no foul. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 10:20:32 -07:00
Dave Ertman	84a118ab58	ice: Rename ethtool private flag for lldp The current flag name of "enable-fw-lldp" is a bit cumbersome. Change priv-flag name to "fw-lldp-agent" with a value of on or off. This is more straight-forward in meaning. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 10:15:15 -07:00
Jacob Keller	f8af5bf5b4	ice: reject VF attempts to enable head writeback The virtchnl interface provides a mechanism for a VF driver to request head writeback support. This feature is deprecated as of AVF 1.0, but older versions of a VF driver may still attempt to request the mode. Since the ice hardware does not support head writeback, we should not accept Tx queue configuration which attempts to enable it. Currently, the driver simply assumes that the headwb_enabled bit will never be set. If a VF driver does request head writeback, the configuration will return successfully, even though head writeback is not enabled. This leaves the VF driver in a non functional state since it is assuming to be operating in head writeback mode. Fix the PF driver to reject any attempt to setup headwb_enabled. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 10:09:45 -07:00
Michal Swiatkowski	42a179c80d	ice: Copy dcbx configuration only if mode is correct In rebuild DCB desired_dcbx_cfg was copy to local_dcbx_cfg, but if DCBX mode is IEEE desired_dcbx_cfg is not initialized by DCBX config from FW. Change logic to copy config value only if mode is set to CEE. If driver copy desired_dcbx_cfg to local_dcbx_cfg in IEEE mode there is problem with globr. System is frozen after two or more globr. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 09:59:51 -07:00
Dave Ertman	64bcaec642	ice: Treat DCBx state NOT_STARTED as valid When a port is not cabled, but DCBx is enabled in the firmware, the status of DCBx will be NOT_STARTED. This is a valid state for FW enabled and should not be treated as a is_fw_lldp true automatically. Add the code to treat NOT_STARTED as another valid state. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 09:55:55 -07:00
Brett Creeley	da4a9e73d8	ice: Don't call synchronize_irq() for VF's from the host Currently we will call synchronize_irq() from the host for VF's. This is not correct, so don't allow it. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 09:49:19 -07:00
Dave Ertman	1b0c3247a0	ice: Account for all states of FW DCBx and LLDP Currently, only the DCBx status is taken into account to determine if FW LLDP is possible. But there are NVM version coming out with DCBx enabled, and FW LLDP disabled. This is causing errors where the driver sees that DCBx is not disabled, and then tries to register for LLDP MIB change events, and fails. Change the logic to detect both DCBx and LLDP states in the FW engine. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 09:44:48 -07:00
Dave Ertman	0c3a6101ff	ice: Allow egress control packets from PF_VSI For control packets (i.e. LLDP packets) to be able to egress from the main VSI, a bit has to be set in the TX_descriptor. This should only be done for the main VSI and only if the FW LLDP agent is disabled. A bit to allow this also has to be set in the VSI context. Add the logic to add the necessary bits in the VSI context for the PF_VSI and the TX_descriptors for control packets egressing the PF_VSI. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-23 09:17:45 -07:00
Brett Creeley	be6f7ef69c	ice: improve print for VF's when adding/deleting MAC filters When we fail to add/delete MAC filters in the VF, the print doesn't distinguish between the two. Fix that by printing whether or not we failed to add/delete the MAC filter respectively. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 14:44:03 -07:00
Pawel Kaminski	cbfe31b5d7	ice: Change type for queue counts These queue variables are being assigned values that are type u16. Change the local variables to match these types. Since these represent queue counts, they should never be negative. Signed-off-by: Pawel Kaminski <pawel.kaminski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 14:42:35 -07:00
Akeem G Abodunrin	c275684b92	ice: Move VF resources definition to SR-IOV specific file In order to use some of the VF resources definition in the SR-IOV specific virtchnl header file, this patch moves applicable code to ice_virtchnl_pf.h file accordingly... and they should have been defined in the destination file originally. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 14:40:46 -07:00
Brett Creeley	11836214d5	ice: Increase size of Mailbox receive queue for many VFs Currently we use the ICE_MBXQ_LEN for both the Mailbox send and receive queues that are used to communicate with VFs. This is fine for the send queue because the PF driver will lock the queue for every single send, but for the Mailbox receive queue every VF is posting to its Mailbox send queue and the hardware is then handing the message to the PF on its Mailbox receive queue. This becomes a problem with many VFs because it seems to overburden the Mailbox receive queue on the PF. Fix this by increasing the Mailbox receive queue for the PF to 512 entries. The number 512 was determined based on the number of VFs supported by the device. We can have a total of 256 VFs so in the worst case this allows the VFs to put 2 messages in the PFs Mailbox receive queue at the same time. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 14:37:15 -07:00
Brett Creeley	60d628ea27	ice: Reduce wait times during VF bringup/reset Currently there are a couple places where the VF is waiting too long when checking the status of registers. This is causing the AVF driver to spin for longer than necessary in the __IAVF_STARTUP state. Sometimes it causes the AVF to go into the __IAVF_COMM_FAILED, which may retrigger the __IAVF_STARTUP state. Try to reduce the chance of this happening by removing unnecessary wait times in VF bringup/resets. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 14:36:00 -07:00
Paul Greenwalt	1337175dec	ice: update GLINT_DYN_CTL and GLINT_VECT2FUNC register access Register access for GLINT_DYN_CTL and GLINT_VECT2FUNC should be within the PF space and not the absolute device space. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 14:34:36 -07:00
Tony Nguyen	e6c45149b8	ice: Do not always bring up PF VSI in ice_ena_vsi() During rebuild ice_ena_vsi() is called to recover the VSI state. This function assumes the PF VSI is always to be enabled, however, it's possible that during reset/rebuild the interface can be brought down. If this occurs, we can attempt to bring up the PF VSI on a downed interface which can lead to various crashes. If the interface is not running, do not bring up the associated VSI. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 14:32:42 -07:00
Mitch Williams	ac6f733a7b	ice: allow empty Rx descriptors In some circumstances, the hardware will hand us a receive descriptor which has no data attached, but is otherwise valid. The receive code was improperly ignoring these descriptors, which result in an infinite loop. To fix this, change the receive code to process all descriptors, regardless of the size of the associated data. Add checks to the memory-handling functions to allow for zero size. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 14:30:37 -07:00
Usha Ketineni	7829570e28	ice: Fix kernel hang with DCB reset in CEE mode This patch fixes the set local MIB AQ call failures in the DCB rebuild path by setting the defaults for the ETS recommended DCB configuration. Also, willing bits for the DCB configuration needs to be set correctly. Resets works fine in IEEE mode as the ETS recommended DCB configuration is populated but not in CEE mode. Without this patch, PFR causes the kernel hang in CEE mode. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 14:29:22 -07:00
Brett Creeley	2ab28bb04c	ice: Set WB_ON_ITR when we don't re-enable interrupts Currently when busy polling is enabled we aren't setting/enabling WB_ON_ITR in the driver. This doesn't break the driver, but it does cause issues. If we don't enable WB_ON_ITR mode we will still get write-backs from hardware during polling when a cache line has been filled, but if a cache line is not filled we will not get the write-back because WB_ON_ITR is not set. Fix this by enabling WB_ON_ITR in the driver when interrupts are disabled. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 14:21:21 -07:00
Paul Greenwalt	f1a4a66d23	ice: fix set pause param autoneg check When ETHTOOL_GLINKSETTINGS is defined get pause param pause->autoneg reports SW configured setting, however when not defined get pause param pause->autoneg reports the link status. Set pause param needs to compare pause->autoneg with the same source as get pause param to block the user from changing autoneg with the set pause param option, or the user may be incorrectly blocked from changing Rx\|Tx pause settings. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 13:55:28 -07:00
Akeem G Abodunrin	d82dd83df2	ice: Restructure VFs initialization flows This patch restructures how VFs are configured, and resources allocated. Instead of freeing resources that were never allocated, and resetting empty VFs that have never been created - the new flow will just allocate resources for number of requested VFs based on the availability. During VFs initialization process, global interrupt is disabled, and rearmed after getting MSIX vectors for VFs. This allows immediate mailbox communications, instead of delaying it till later and VFs. PF communications resulted to using polling instead of actual interrupt. The issue manifested when creating higher number of VFs (128 VFs) per PF. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 12:28:35 -07:00
Brett Creeley	9118fcd525	ice: Assume that more than one Rx queue is rare in ice_napi_poll Currently we divide budget by the number of Rx queues per Rx ring container in ice_napi_poll even if there is only 1. This is an unnecessary divide for the normal case of 1 Rx ring per Rx ring container. Fix this by using an unlikely() call in the case where we actually need to divide. Also, we will always set budget_per_ring even if there are no Rx rings in the Rx ring container so we don't need to initialize it to 0. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 12:28:35 -07:00
Brett Creeley	c1ddf1f5c4	ice: Use the software based tail when checking for hung Tx ring Currently in ice_get_tx_pending we try to read a Tx ring's tail. This is then compared with the software based head (next_to_clean) to determine if we have pending work. This will never work because reading of the Tx ring's tail is no longer supported. Fix this by using the software based tail (next_to_use) to determine if there is pending work. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-08-20 12:28:35 -07:00
Tony Nguyen	3015b8fcb6	ice: Bump version number Update driver version to 0.7.5 Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 13:41:09 -07:00
Akeem G Abodunrin	b67f25d76e	ice: Remove flag to track VF interrupt status As a result of refactoring of VF VSIs interrupts code, there is no need to track its configuration status again with ICE_VF_STATE_CFG_INTR flag - In fact, it is not being checked anywhere in the code right now, so this patch removes the dead code as applicable to the flag. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 13:41:05 -07:00
Brett Creeley	ba880734ba	ice: Remove unnecessary flag ICE_FLAG_MSIX_ENA This flag is not needed and is called every time we re-enable interrupts in the hotpath so remove it. Also remove ice_vsi_req_irq() because it was a wrapper function for ice_vsi_req_irq_msix() whose sole purpose was checking the ICE_FLAG_MSIX_ENA flag. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 13:41:01 -07:00
Akeem G Abodunrin	9921494463	ice: Don't return error for disabling LAN Tx queue that does exist Since Tx rings are being managed by FW/NVM, Tx rings might have not been set up or driver had already wiped them off - In that case, call to disable LAN Tx queue is being returned as not in existence. This patch makes sure we don't return unnecessary error for such scenario. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 13:40:57 -07:00
Brett Creeley	a1e9968593	ice: Remove duplicate code in ice_alloc_rx_bufs Currently if the call to ice_alloc_mapped_page() fails we jump to the no_buf label, possibly call ice_release_rx_desc(), and return true indicating that there is more work to do. In the success case we just fall out of the while loop, possibly call ice_alloc_mapped_page(), and return false saying we exhausted cleaned_count. This flow can be improved by breaking if ice_alloc_mapped_page() fails and then the flow outside of the while loop is the same for the failure and success case. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 13:40:52 -07:00
Brett Creeley	56923ab664	ice: Add stats for Rx drops at the port level Currently we are not reporting dropped counts at the port level to ethtool or netlink. This was found when debugging Rx dropped issues and the total packets sent did not equal the total packets received minus the rx_dropped, which was very confusing. To determine dropped counts at the port level we need to read the PRTRPB_RDPC register. To fix reporting we will store the dropped counts in the PF's rx_discards. This will be reported to netlink by storing it in the PF VSI's rx_missed_errors signaling that the receiver missed the packet. Also, we will report this to ethtool in the rx_dropped.nic field. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 13:40:46 -07:00
Akeem G Abodunrin	66b29e7a88	ice: Update number of VF queue before setting VSI resources In case there is a request from a VF to change its number of queues, and the request was successful, we need to update number of queues configured on the VF before updating corresponding VSI for that VF, especially LAN Tx queue tree and TC update, otherwise, we would continued to use old value of vf->num_vf_qs for allocated Tx/Rx queues... Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 13:40:42 -07:00
Akeem G Abodunrin	d5a4635917	ice: Set up Tx scheduling tree based on alloc VSI Tx queues This patch uses allocated number of Tx queues per VSI to set up its scheduling tree instead of using total number of available Tx queues. Only PF VSIs have total number of allocated Tx queues equal to number of available Tx queues, other VSIs have different number of queues configured. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 13:40:35 -07:00
Brett Creeley	cb7db35641	ice: Only bump Rx tail and release buffers once per napi_poll Currently we bump the Rx tail and release/give buffers to hardware every 16 descriptors. This causes us to bump Rx tail up to 4 times per napi_poll call. Also we are always bumping tail on an odd index and this is a problem because hardware ignores the lower 3 bits in the QRX_TAIL register. This is making it so hardware sees tail bumps only every 8 descriptors. Instead lets only bump Rx tail once per napi_poll if the value aligns with hardware's expectations of the lower 3 bits being cleared. Also only release/give Rx buffers once per napi_poll call. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 13:40:30 -07:00
Akeem G Abodunrin	c7aeb4d1b9	ice: Disable VFs until reset is completed This patch adds code to clear VFs enable status until reset is completed, and Tx/Rx rings are setup. Without this patch, the code flow request Tx queues to be disabled after reset, especially PFR - where VF VSI Tx rings have already been wiped off in the NVM and result to adminq error based on the call to disable Tx LAN queue in ice_reset_all_vfs function call. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 10:23:04 -07:00
Tony Nguyen	6d5999467d	ice: Do not configure port with no media The firmware reports an error when trying to configure a port with no media. Instead of always configuring the port, check for media before attempting to configure it. In the absence of media, turn off link and poll for media to become available before re-enabling link. Move ice_force_phys_link_state() up to avoid forward declaration. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 10:23:04 -07:00
Jacob Keller	5c91ecfda5	ice: separate out control queue lock creation The ice_init_all_ctrlq and ice_shutdown_all_ctrlq functions create and destroy the locks used to protect the send and receive process of each control queue. This is problematic, as the driver may use these functions to shutdown and re-initialize the control queues at run time. For example, it may do this in response to a device reset. If the driver failed to recover from a reset, it might leave the control queues offline. In this case, the locks will no longer be initialized. A later call to ice_sq_send_cmd will then attempt to acquire a lock that has been destroyed. It is incorrect behavior to access a lock that has been destroyed. Indeed, ice_aq_send_cmd already tries to avoid accessing an offline control queue, but the check occurs inside the lock. The root of the problem is that the locks are destroyed at run time. Modify ice_init_all_ctrlq and ice_shutdown_all_ctrlq such that they no longer create or destroy the locks. Introduce new functions, ice_create_all_ctrlq and ice_destroy_all_ctrlq. Call these functions in ice_init_hw and ice_deinit_hw. Now, the control queue locks will remain valid for the life of the driver, and will not be destroyed until the driver unloads. This also allows removing a duplicate check of the sq.count and rq.count values when shutting down the controlqs. The ice_shutdown_ctrlq function already checks this value under the lock. Previously commit `dec64ff10e` ("ice: use [sr]q.count when checking if queue is initialized") needed this check to happen outside the lock, because it prevented duplicate attempts at destroying the locks. The driver may now safely use ice_init_all_ctrlq and ice_shutdown_all_ctrlq while handling reset events, without causing the locks to be invalid. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 10:23:04 -07:00
Brett Creeley	c31a5c25bb	ice: Always set prefena when configuring an Rx queue Currently we are always setting prefena to 0. This is causing the hardware to only fetch descriptors when there are none free in the cache for a received packet instead of prefetching when it has used the last descriptor regardless of incoming packets. Fix this by allowing the hardware to prefetch Rx descriptors. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 10:23:04 -07:00
Tony Nguyen	17bc6d0721	ice: Move vector base setup to PF VSI When interrupt tracking was refactored, during rebuild, the call to ice_vsi_setup_vector_base() was inadvertently removed from the PF VSI instead of being removed from the VF VSI. During reset, the failure to properly setup the vector base generates a call trace. Correct this so that resets/rebuilds properly complete. Fixes: `cbe66bfee6` ("ice: Refactor interrupt tracking") Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 10:23:04 -07:00
Jacob Keller	36517fd397	ice: track hardware stat registers past rollover Currently, ice_stat_update32 and ice_stat_update40 will limit the value of the software statistic to 32 or 40 bits wide, depending on which register is being read. This means that if a driver is running for a long time, the displayed software register values will roll over to zero at 40 bits or 32 bits. This occurs because the functions directly assign the difference between the previous value and current value of the hardware statistic. Instead, add this value to the current software statistic, and then update the previous value. In this way, each time ice_stat_update40 or ice_stat_update32 are called, they will increment the software tracking value by the difference of the hardware register from its last read. The software tracking value will correctly count up until it overflows a u64. The only requirement is that the ice_stat_update functions be called at least once each time the hardware register overflows. While we're fixing ice_stat_update40, modify it to use rd64 instead of two calls to rd32. Additionally, drop the now unnecessary hireg function parameter. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 10:23:04 -07:00
Paul Greenwalt	5a056cd7ea	ice: add lp_advertising flow control support Add support for reporting link partner advertising when ETHTOOL_GLINKSETTINGS defined. Get pause param reports the Tx/Rx pause configured, and then ethtool issues ETHTOOL_GSET ioctl and ice_get_settings_link_up reports the negotiated Tx/Rx pause. Negotiated pause frame report per IEEE 802.3-2005 table 288-3. $ ethtool --show-pause ens6f0 Pause parameters for ens6f0: Autonegotiate: on RX: on TX: on RX negotiated: on TX negotiated: on $ ethtool ens6f0 Settings for ens6f0: Supported ports: [ FIBRE ] Supported link modes: 25000baseCR/Full Supported pause frame use: Symmetric Supports auto-negotiation: Yes Supported FEC modes: None BaseR RS Advertised link modes: 25000baseCR/Full Advertised pause frame use: Symmetric Receive-only Advertised auto-negotiation: Yes Advertised FEC modes: None BaseR RS Link partner advertised link modes: Not reported Link partner advertised pause frame use: Symmetric Link partner advertised auto-negotiation: Yes Link partner advertised FEC modes: Not reported Speed: 25000Mb/s Duplex: Full Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Auto-negotiation: on Supports Wake-on: g Wake-on: g Current message level: 0x00000007 (7) drv probe link Link detected: yes When ETHTOOL_GLINKSETTINGS is not defined, get pause param reports the negotiated Tx/Rx pause. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-07-31 10:23:04 -07:00
Matthew Wilcox (Oracle)	d7840976e3	net: Use skb accessors in network drivers In preparation for unifying the skb_frag and bio_vec, use the fine accessors which already exist and use skb_frag_t instead of struct skb_frag_struct. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-07-22 20:47:56 -07:00
Mauro Carvalho Chehab	fe34c89d25	docs: driver-model: move it to the driver-api book The audience for the Kernel driver-model is clearly Kernel hackers. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> # ice driver changes	2019-07-15 11:03:02 -03:00
Linus Torvalds	f632a8170a	Driver Core and debugfs changes for 5.3-rc1 Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Because of this, there is going to be some merge issues with your tree at the moment, I'll follow up with the expected resolutions to make it easier for you. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups (will cause build warnings with s390 and coresight drivers in your tree) - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for. Other than the merge issues, functionality is working properly in linux-next :) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSgpnQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ykcwgCfS30OR4JmwZydWGJ7zK/cHqk+KjsAnjOxjC1K LpRyb3zX29oChFaZkc5a =XrEZ -----END PGP SIGNATURE----- Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs updates from Greg KH: "Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for" * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits) debugfs: make error message a bit more verbose orangefs: fix build warning from debugfs cleanup patch ubifs: fix build warning after debugfs cleanup patch driver: core: Allow subsystems to continue deferring probe drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT arch_topology: Remove error messages on out-of-memory conditions lib: notifier-error-inject: no need to check return value of debugfs_create functions swiotlb: no need to check return value of debugfs_create functions ceph: no need to check return value of debugfs_create functions sunrpc: no need to check return value of debugfs_create functions ubifs: no need to check return value of debugfs_create functions orangefs: no need to check return value of debugfs_create functions nfsd: no need to check return value of debugfs_create functions lib: 842: no need to check return value of debugfs_create functions debugfs: provide pr_fmt() macro debugfs: log errors when something goes wrong drivers: s390/cio: Fix compilation warning about const qualifiers drivers: Add generic helper to match by of_node driver_find_device: Unify the match function with class_find_device() bus_find_device: Unify the match callback with class_find_device ...	2019-07-12 12:24:03 -07:00
Gustavo A. R. Silva	89f6a3051e	ice: Use struct_size() helper One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; size = sizeof(struct foo) + count * sizeof(struct boo); instance = alloc(size, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: size = struct_size(instance, entry, count); This code was detected with the help of Coccinelle. Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-28 14:54:11 -07:00
Mauro Carvalho Chehab	4489f161b7	docs: driver-model: convert docs to ReST and rename to *.rst Convert the various documents at the driver-model, preparing them to be part of the driver-api book. The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> # ice Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-21 15:47:26 +02:00
Jeff Kirsher	3aea173622	ice: Use LLDP ethertype define ETH_P_LLDP Instead of using a local define for the LLDP ethertype, use the kernel define ETH_P_LLDP. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-05 13:04:29 -07:00
Anirudh Venkataramanan	2f2da36ebf	ice: Trivial cosmetic changes This patch mostly capitalizes abbreviations in code comments. Fixed some typos and removed some unnecessary newlines as well. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-30 10:57:55 -07:00
Anirudh Venkataramanan	072efdf8bf	ice: Recognize higher speeds In ice_print_link_msg, add cases for 50GB and 100GB speeds. This results in the right speed being reported on load, instead of "Unknownbps". When VF link if forced (in ice_set_pfe_link_forced), report max speed 100GB. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-30 10:53:05 -07:00
Jacob Keller	4f70daa081	ice: Use a different ICE_DBG bit for firmware log messages Replace the use of the ICE_DBG_AQ_MSG bit when dumping firmware logging messages with a separate distinct type ICE_DBG_FW_LOG. This is useful so that developers may enable ICE_DBG_FW_LOG and get firmware logging messages, without also dumping AdminQ messages at the same time. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-30 10:51:33 -07:00
Anirudh Venkataramanan	ed14245ab7	ice: Update function header Add some details to the function header for ice_deinit_hw. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-30 10:48:51 -07:00
Anirudh Venkataramanan	49c6e41b0d	ice: Move define for ICE_AQC_DRIVER_UNLOADING The define describing the bits for the struct field should be below the field itself. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-30 10:45:02 -07:00
Anirudh Venkataramanan	62f4dafc18	ice: Align to updated AQ command formats The current specification has updates to the command formats for manage MAC opcodes (opcodes 0x0107 and 0x0108) and get PHY caps (opcode 0x0600). Update the code to reflect this. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-30 10:43:42 -07:00
Anirudh Venkataramanan	91d7a59087	ice: Use continue instead of an else block For style consistency, use continue instead of an else block in ice_pf_dcb_recfg. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-30 10:38:53 -07:00
Preethi Banala	8be92a76c3	ice: Change minimum descriptor count value for Tx/Rx rings Change minimum number of descriptor count from 32 to 64. This is to have a feature parity with previous Intel NIC drivers. Signed-off-by: Preethi Banala <preethi.banala@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-30 10:37:26 -07:00
Dave Ertman	2e0e62285c	ice: Add switch rules to handle LLDP packets Add call to configure dropping egress LLDP packets in ice_vsi_setup and remove the rule in ice_vsi_release. Add calls to add/remove rule to route LLDP packets to default VSI when FW LLDP engine is disabled/enabled and remove rule if applied during ice_vsi_release. In the function ice_add_eth_mac(), there is a line that hard codes the filter info flag to TX. This is incorrect as this flag will be set by the calling function that built the list of filters to add. So remove the hard coded value. This patch also contains a fix to stop treating the DCBx state of "Not Started" as an error state that kicks DCB in SW mode. This will address having non-cabled interfaces automatically go into SW mode with the FW engine running. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-30 10:31:42 -07:00
Bruce Allan	092a33d403	ice: Cleanup ice_update_link_info Do not allocate memory for the Get PHY Abilities command data buffer when it is not necessary, change one local variable to another to reduce the number of de-references, reduce the scope of some local variables, and reorder the code and change exit points to get rid of an unnecessary goto label. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 23:07:22 -07:00
Akeem G Abodunrin	d31530e83e	ice: Use right type for ice_cfg_vsi_lan return ice_cfg_vsi_lan returns a value of type enum ice_status. So use a local of the same type to capture the return value. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 23:05:37 -07:00
Paul Greenwalt	f776b3acb0	ice: Add support for Forward Error Correction (FEC) This patch adds driver support for Forward Error Correction (FEC) and ethtool handlers to set/get FEC params. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 23:01:49 -07:00
Anirudh Venkataramanan	047e52c0e8	ice: Add support for virtchnl_vector_map.[rxq\|txq]_map Add support for virtchnl_vector_map.[rxq\|txq]_map to use bitmap to associate indicated queues with the specified vector. This support is needed since the Windows AVF driver calls VIRTCHNL_OP_CONFIG_IRQ_MAP for each vector and used the bitmap to indicate the associated queues. Updated ice_vc_dis_qs_msg to not subtract one from virtchnl_irq_map_info.num_vectors, and changed the VSI vector index to the vector id. This change supports the Windows AVF driver which maps one vector at a time and sets num_vectors to one. Using vectors_id to index the vector array . Add check for vector_id zero, and return VIRTCHNL_STATUS_ERR_PARAM if vector_id is zero and there are rings associated with that vector. Vector_id zero is for the OICR. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 22:56:28 -07:00
Tony Nguyen	561f437901	ice: Introduce ice_init_mac_fltr and move ice_napi_del Consolidate adding unicast and broadcast MAC filters in a single new function ice_init_mac_fltr. Move ice_napi_del to ice_lib.c Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 22:51:35 -07:00
Brett Creeley	72ecb896e4	ice: Use GLINT_DYN_CTL to disable VF's interrupts Currently in ice_free_vf_res() we are writing to the VFINT_DYN_CTLN register in the PF's function space to disable all VF's interrupts. This is incorrect because this register is only for use in the VF's function space. This becomes obvious when seeing that the valid indices used for the VFINT_DYN_CTLN register is from 0-63, which is the maximum number of interrupts for a VF (not including the OICR interrupt). Fix this by writing to the GLINT_DYN_CTL register for each VF. We can do this because we keep track of each VF's first_vector_idx inside of the PF's function space and the number of interrupts given to each VF. Also in ice_free_vfs() we were disabling Rx/Tx queues after calling pci_disable_sriov(). One part of disabling the Tx queues causes the PF driver to trigger a software interrupt, which causes the VF's napi routine to run. This doesn't currently work because pci_disable_sriov() causes iavf_remove() to be called which disables interrupts. Fix this by disabling Rx/Tx queues prior to pci_disable_sriov(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 22:49:06 -07:00
Brett Creeley	e89e899f3e	ice: Add a helper to trigger software interrupt Add a new function ice_trigger_sw_intr to trigger interrupts. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 03:00:53 -07:00
Md Fahad Iqbal Polash	3a9e32bb06	ice: Configure RSS LUT key only if RSS is enabled Call ice_vsi_cfg_rss_lut_key only if RSS is enabled. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:59:08 -07:00
Dan Nowlin	11fe1b3a38	ice: Add ice_get_fw_log_cfg to init FW logging In order to initialize the current status of the FW logging, this patch adds ice_get_fw_log_cfg. The function retrieves the current setting of the FW logging from HW and updates the ice_hw structure accordingly. Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:57:27 -07:00
Anirudh Venkataramanan	1eb11036a3	ice: Minor cleanup in ice_switch.h Remove duplicate define for ICE_INVAL_Q_HANDLE. Move defines to the top of the file. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:55:34 -07:00
Dave Ertman	91aed40da3	ice: Remove redundant and premature event config In the path for re-enabling FW LLDP engine, there is a call to register for LLDP MIB change events. This call is redundant, in that the call to ice_pf_dcb_cfg will already register the driver for these events. Also, the call as it stands now is too early in the flow before before DCB is configured. Remove the redundant call. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:53:57 -07:00
Mitch Williams	4cc82aaa74	ice: Change message level Change the message level of the MTU change log message from debug to info. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:52:23 -07:00
Mitch Williams	23c0112246	ice: Check all VFs for MDD activity, don't disable Don't use the mdd_detected variable as an exit condition for this loop; the first VF to NOT have an MDD event will cause the loop to terminate. Instead just look at all of the VFs, but don't disable them. This prevents proper release of resources if the VFs are rebooted or the VF driver reloaded. Instead, just log a message and call out repeat offenders. To make it clear what we are doing, use a differently-named variable in the loop. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:50:46 -07:00
Brett Creeley	cbe66bfee6	ice: Refactor interrupt tracking Currently we have two MSI-x (IRQ) trackers, one for OS requested MSI-x entries (sw_irq_tracker) and one for hardware MSI-x vectors (hw_irq_tracker). Generally the sw_irq_tracker has less entries than the hw_irq_tracker because the hw_irq_tracker has entries equal to the max allowed MSI-x per PF and the sw_irq_tracker is mainly the minimum (non SR-IOV portion of the vectors, kernel granted IRQs). All of the non SR-IOV portions of the driver (i.e. LAN queues, RDMA queues, OICR, etc.) take at least one of each type of tracker resource. SR-IOV only grabs entries from the hw_irq_tracker. There are a few issues with this approach that can be seen when doing any kind of device reconfiguration (i.e. ethtool -L, SR-IOV, etc.). One of them being, any time the driver creates an ice_q_vector and associates it to a LAN queue pair it will grab and use one entry from the hw_irq_tracker and one from the sw_irq_tracker. If the indices on these does not match it will cause a Tx timeout, which will cause a reset and then the indices will match up again and traffic will resume. The mismatched indices come from the trackers not being the same size and/or the search_hint in the two trackers not being equal. Another reason for the refactor is the co-existence of features with SR-IOV. If SR-IOV is enabled and the interrupts are taken from the end of the sw_irq_tracker then other features can no longer use this space because the hardware has now given the remaining interrupts to SR-IOV. This patch reworks how we track MSI-x vectors by removing the hw_irq_tracker completely and instead MSI-x resources needed for SR-IOV are determined all at once instead of per VF. This can be done because when creating VFs we know how many are wanted and how many MSI-x vectors each VF needs. This also allows us to start using MSI-x resources from the end of the PF's allowed MSI-x vectors so we are less likely to use entries needed for other features (i.e. RDMA, L2 Offload, etc). This patch also reworks the ice_res_tracker structure by removing the search_hint and adding a new member - "end". Instead of having a search_hint we will always search from 0. The new member, "end", will be used to manipulate the end of the ice_res_tracker (specifically sw_irq_tracker) during runtime based on MSI-x vectors needed by SR-IOV. In the normal case, the end of ice_res_tracker will be equal to the ice_res_tracker's num_entries. The sriov_base_vector member was added to the PF structure. It is used to represent the starting MSI-x index of all the needed MSI-x vectors for all SR-IOV VFs. Depending on how many MSI-x are needed, SR-IOV may have to take resources from the sw_irq_tracker. This is done by setting the sw_irq_tracker->end equal to the pf->sriov_base_vector. When all SR-IOV VFs are removed then the sw_irq_tracker->end is reset back to sw_irq_tracker->num_entries. The sriov_base_vector, along with the VF's number of MSI-x (pf->num_vf_msix), vf_id, and the base MSI-x index on the PF (pf->hw.func_caps.common_cap.msix_vector_first_id), is used to calculate the first HW absolute MSI-x index for each VF, which is used to write to the VPINT_ALLOC[_PCI] and GLINT_VECT2FUNC registers to program the VFs MSI-x PCI configuration bits. Also, the sriov_base_vector is used along with VF's num_vf_msix, vf_id, and q_vector->v_idx to determine the MSI-x register index (used for writing to GLINT_DYN_CTL) within the PF's space. Interrupt changes removed any references to hw_base_vector, hw_oicr_idx, and hw_irq_tracker. Only sw_base_vector, sw_oicr_idx, and sw_irq_tracker variables remain. Change all of these by removing the "sw_" prefix to help avoid confusion with these variables and their use. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:48:49 -07:00
Anirudh Venkataramanan	0e674aeb0b	ice: Add handler for ethtool selftest This patch adds a handler for ethtool selftest. Selftest includes testing link, interrupts, eeprom, registers and packet loopback. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:44:12 -07:00
Brett Creeley	4b6f3ecabf	ice: Don't call ice_cfg_itr() for SR-IOV ice_cfg_itr() sets the ITR granularity and default ITR values for the PF's interrupt vectors. For VF's this will be done in the AVF driver flow. Fix this by not calling ice_cfg_itr() for SR-IOV. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:40:30 -07:00
Brett Creeley	1aec6e1b08	ice: Set minimum default Rx descriptor count to 512 Currently we set the default number of Rx descriptors per queue to the system's page size divided by the number of bytes per descriptor. For 4K page size systems this is resulting in 128 Rx descriptors per queue. This is causing more dropped packets than desired in the default configuration. Fix this by setting the minimum default Rx descriptor count per queue to 512. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:38:50 -07:00
Bruce Allan	e65e9e1566	ice: Resolve static analysis warning Some static analysis tools can complain when doing a bitop assignment using operands of different sizes. Fix that. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:36:58 -07:00
Tony Nguyen	3171948e94	ice: Implement toggling ethtool rx-vlan-filter Implement the toggling of rx-vlan-filter; enable\|disable VLAN pruning based on on\|off, respectively. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:35:06 -07:00
Anirudh Venkataramanan	588d511f89	ice: Remove direct write for GLLAN_RCTL_0 Clear PXE mode AQ call (opcode 0x0110) is now supported in FW. So remove the direct register write to GLLAN_RCTL_0. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:33:21 -07:00
Bruce Allan	95f8e8b931	ice: Fix LINE_SPACING style issue Fix a checkpatch "LINE_SPACING: Please don't use multiple blank lines" issue that has snuck in to the code. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-29 02:31:43 -07:00
Bruce Allan	feee3cb306	ice: Silence semantic parser warnings Recent versions of sparse warn about casting pointers to/from restricted endian types in the Linux driver. Silence those with the compiler attribute __force macro from the Linux kernel to force casts to/from restricted endian types. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Brett Creeley	aa6ccf3f2d	ice: Fix couple of issues in ice_vsi_release Currently the driver is calling ice_napi_del() and then unregister_netdev(). The call to unregister_netdev() will result in a call to ice_stop() and then ice_vsi_close(). This is where we call napi_disable() for all the MSI-X vectors. This flow is reversed so make the changes to ensure napi_disable() happens prior to napi_del(). Before calling napi_del() and free_netdev() make sure unregister_netdev() was called. This is done by making sure the __ICE_DOWN bit is set in the vsi->state for the interested VSI. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Jesse Brandeburg	8d5fce1903	ice: Reorganize ice_vf struct The ice_vf struct can be used hundreds of times in our driver so it pays to use less memory per struct. ice_vf prior to this commit: /* size: 112, cachelines: 2, members: 25 / / sum members: 101, holes: 4, sum holes: 8 / / bit holes: 2, sum bit holes: 11 bits / / padding: 3 / / last cacheline: 48 bytes / ice_vf after this commit: / size: 104, cachelines: 2, members: 25 / / sum members: 100, holes: 3, sum holes: 4 / / bit holes: 1, sum bit holes: 3 bits / / last cacheline: 40 bytes */ Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Jesse Brandeburg	0ab54c5f2f	ice: Use bitfields when possible We can use bit fields to store boolean values and when the bit fields are next to each other, the compiler will combine them (as long as the size holds enough). Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Jesse Brandeburg	65124bbf98	ice: Reorganize tx_buf and ring structs Use more efficient structure ordering by using the pahole tool and a lot of code inspection to get hot cache lines to have packed data (no holes if possible) and adjacent warm data. ice_ring prior to this change: /* size: 192, cachelines: 3, members: 23 / / sum members: 158, holes: 4, sum holes: 12 / / padding: 22 / ice_ring after this change: / size: 192, cachelines: 3, members: 25 / / sum members: 162, holes: 1, sum holes: 1 / / padding: 29 / ice_tx_buf prior to this change: / size: 48, cachelines: 1, members: 7 / / sum members: 38, holes: 2, sum holes: 6 / / padding: 4 / / last cacheline: 48 bytes / ice_tx_buf after this change: / size: 40, cachelines: 1, members: 7 / / sum members: 38, holes: 1, sum holes: 2 / / last cacheline: 40 bytes */ Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Richard Rodriguez	55e062ba77	ice: Format ethtool reported stats Fixes ethtool -S reported stats in ice driver to match format and nomenclature of the ixgbe driver. Signed-off-by: Richard Rodriguez <richard.rodriguez@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Brett Creeley	72f9c20398	ice: Gracefully handle reset failure in ice_alloc_vfs() Currently if ice_reset_all_vfs() fails in ice_alloc_vfs() we fail to free some resources, reset variables, and return an error value. Fix this by adding another unroll case to free the pf->vf array, set the pf->num_alloc_vfs to 0, and return an error code. Without this, if ice_reset_all_vfs() fails in ice_alloc_vfs() we will not be able to do SRIOV without hard rebooting the system because rmmod'ing the driver does not work. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Usha Ketineni	a17a5ff681	ice: Refactor the LLDP MIB change event handling This patch fixes the LLDP MIB change event handling code by removing the workarounds in the current code. Added ice_dcb_need_recfg() to print the DCB configuration changes detected via MIB change event. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Tony Nguyen	9ccb062c14	ice: Advertise supported link modes if none requested User requested link modes affect what is returned as an advertised link mode. If no modes have been requested, we are not advertising any link modes. Advertise what we are capable of supporting if no link modes have been requested. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Dave Ertman	e223eaec67	ice: Fix hang when ethtool disables FW LLDP When disabling and enabling VSIs, there are a couple of flows that recursively acquire the RTNL lock which causes a deadlock. Fix that. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Anirudh Venkataramanan	a84db52569	ice: Call out dev/func caps when printing ice_parse_caps is used to parse both device and function capabilities. Currently, capabilities are printed with a cryptic "HW caps" prefix, which makes it difficult to distinguish whether the capabilities being printed are device or function capabilities. This patch makes a change to add a "func cap" prefix when printing function capabilities, and a "dev cap" prefix when printing device capabilities. This patch also changes some of the capability print strings for consistency. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Anirudh Venkataramanan	f24e35d88b	ice: Remove braces for single statement blocks Fix checkpatch warning "WARNING:BRACES: braces {} are not necessary for single statement blocks" Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Bruce Allan	173e23c0cb	ice: Cleanup an unnecessary variable initialization Commit 3463688e6ced ("ice: Add more validation in ice_vc_cfg_irq_map_msg") added an assignment of vsi making the assignment during declaration unnecessary. Also, cleanup the declaration and assignment of irqmap_info to not use two lines in the variable declaration section. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Anirudh Venkataramanan	31eafa403b	ice: Implement LLDP persistence Implement LLDP persistence across reboots, start and stop of LLDP agent. Add additional parameter to ice_aq_start_lldp and ice_aq_stop_lldp. Also change the ethtool private flag from "disable-fw-lldp" to "enable-fw-lldp". This change will flip the boolean logic of the functionality of the flag (on = enable, off = disable). The change in name and functionality is to differentiate between the pre-persistence and post-persistence states. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Anirudh Venkataramanan	b4603dbf1e	ice: Fix double spacing Fix double spacing in ice_napi_disable_all Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Linus Torvalds	80f232121b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: 1) Support AES128-CCM ciphers in kTLS, from Vakul Garg. 2) Add fib_sync_mem to control the amount of dirty memory we allow to queue up between synchronize RCU calls, from David Ahern. 3) Make flow classifier more lockless, from Vlad Buslov. 4) Add PHY downshift support to aquantia driver, from Heiner Kallweit. 5) Add SKB cache for TCP rx and tx, from Eric Dumazet. This reduces contention on SLAB spinlocks in heavy RPC workloads. 6) Partial GSO offload support in XFRM, from Boris Pismenny. 7) Add fast link down support to ethtool, from Heiner Kallweit. 8) Use siphash for IP ID generator, from Eric Dumazet. 9) Pull nexthops even further out from ipv4/ipv6 routes and FIB entries, from David Ahern. 10) Move skb->xmit_more into a per-cpu variable, from Florian Westphal. 11) Improve eBPF verifier speed and increase maximum program size, from Alexei Starovoitov. 12) Eliminate per-bucket spinlocks in rhashtable, and instead use bit spinlocks. From Neil Brown. 13) Allow tunneling with GUE encap in ipvs, from Jacky Hu. 14) Improve link partner cap detection in generic PHY code, from Heiner Kallweit. 15) Add layer 2 encap support to bpf_skb_adjust_room(), from Alan Maguire. 16) Remove SKB list implementation assumptions in SCTP, your's truly. 17) Various cleanups, optimizations, and simplifications in r8169 driver. From Heiner Kallweit. 18) Add memory accounting on TX and RX path of SCTP, from Xin Long. 19) Switch PHY drivers over to use dynamic featue detection, from Heiner Kallweit. 20) Support flow steering without masking in dpaa2-eth, from Ioana Ciocoi. 21) Implement ndo_get_devlink_port in netdevsim driver, from Jiri Pirko. 22) Increase the strict parsing of current and future netlink attributes, also export such policies to userspace. From Johannes Berg. 23) Allow DSA tag drivers to be modular, from Andrew Lunn. 24) Remove legacy DSA probing support, also from Andrew Lunn. 25) Allow ll_temac driver to be used on non-x86 platforms, from Esben Haabendal. 26) Add a generic tracepoint for TX queue timeouts to ease debugging, from Cong Wang. 27) More indirect call optimizations, from Paolo Abeni" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1763 commits) cxgb4: Fix error path in cxgb4_init_module net: phy: improve pause mode reporting in phy_print_status dt-bindings: net: Fix a typo in the phy-mode list for ethernet bindings net: macb: Change interrupt and napi enable order in open net: ll_temac: Improve error message on error IRQ net/sched: remove block pointer from common offload structure net: ethernet: support of_get_mac_address new ERR_PTR error net: usb: smsc: fix warning reported by kbuild test robot staging: octeon-ethernet: Fix of_get_mac_address ERR_PTR check net: dsa: support of_get_mac_address new ERR_PTR error net: dsa: sja1105: Fix status initialization in sja1105_get_ethtool_stats vrf: sit mtu should not be updated when vrf netdev is the link net: dsa: Fix error cleanup path in dsa_init_module l2tp: Fix possible NULL pointer dereference taprio: add null check on sched_nest to avoid potential null pointer dereference net: mvpp2: cls: fix less than zero check on a u32 variable net_sched: sch_fq: handle non connected flows net_sched: sch_fq: do not assume EDT packets are ordered net: hns3: use devm_kcalloc when allocating desc_cb net: hns3: some cleanup for struct hns3_enet_ring ...	2019-05-07 22:03:58 -07:00
Michal Swiatkowski	64439f8f0b	ice: Disable sniffing VF traffic on PF Delete code that add default Tx rule on PF. With this rule PF can see Tx VF traffic that should go outside. For traffic from VF to another VF default Tx rule on PF doesn't apply because of lower priority than VF mac rule. With this change on PF in promisc mode we can see only Rx traffic that doesn't match any other rule (mac etc.). We can't see Tx traffic from other VSI. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:44:47 -07:00
Jesse Brandeburg	0690527014	ice: Use more efficient structures Move a bunch of members around to make more efficient use of memory, eliminating holes where possible. None of these members are hot path so cache line alignment is not very important here. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:40:36 -07:00
Jesse Brandeburg	0437f1a98a	ice: Use bitfields where possible The driver was converted to not use bool, but it was neglected that the bools should have been converted to bit fields as bit fields in software structures are ok, as long as they use the correct kinds of unsigned types. This avoids wasting lots of storage space to store single bit values. One of the change hunks moves a variable lport out of a group of "combinable" bit fields because all bits of the u8 lport are valid and the variable can be packed in the struct in struct holes. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:38:57 -07:00
Akeem G Abodunrin	d95276ced0	ice: Add function to program ethertype based filter rule on VSIs This patch adds function to program VSI with ethertype based filter rule, so that all flow control frames would be disallowed from being transmitted to the client, in order to prevent malicious VSI, especially VF from sending out PAUSE or PFC frames, and then control other VSIs traffic. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:36:28 -07:00
Tony Nguyen	8f529ff912	ice: Separate if conditions for ice_set_features() Set features can have multiple features turned on\|off in a single call. Grouping these all in an if/else means after one condition is met, other conditions/features will not be evaluated. Break the if/else statements by feature to ensure all features will be handled properly. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:33:44 -07:00
Tony Nguyen	a03499d614	ice: Remove __always_unused attribute The variable netdev is being used in this function; remove the __always_unused attribute from it. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:31:59 -07:00
Bruce Allan	c3a6825e82	ice: Suppress false-positive style issues reported by static analyzer A recent version of cppcheck falsely reports- Variable ip.hdr is assigned a value that is never used. ip is a union so the pointer ip.hdr is actually used when referenced as ip.v4 and ip.v6. Silence these false reports when using cppcheck with the --inline-suppr command-line option. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:30:05 -07:00
Brett Creeley	e40c899a64	ice: Refactor getting/setting coalesce Currently if the driver has an uneven amount of Rx/Tx queues setting the coalesce settings through ethtool will result in an error. This is happening because in the setting coalesce flow we are reporting an error if either Rx or Tx fails. Also, the flow for setting/getting per_q_coalesce and setting/getting coalesce settings for the entire device is different. Fix these issues by adding one function, ice_set_q_coalesce(), and another, ice_get_q_coalesce(), that both getting/setting per_q and entire device coalesce can use. This makes handling the error cases generic between the two flows and simplifies __ice_set_coalesce() and __ice_get_coalesce(). Also, add a header comment to __ice_set_coalesce(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:25:26 -07:00
Brett Creeley	a85a3847fb	ice: Always free/allocate q_vectors Currently when probing/removing the driver we allocate/deallocate each vsi->q_vectors array in ice_vsi_alloc_arrays() and ice_vsi_free_arrays() respectively. However, we don't do this during the reset and VSI rebuild flow. This is inconsistent and unnecessary to have a difference between the two flows. This patch makes the change to always allocate/deallocate the vsi->q_vectors array regardless of the driver flow we are in. Also, update the comment for ice_vsi_free_arrays() to be more descriptive. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:22:55 -07:00
Bruce Allan	207e3721ac	ice: Do not unnecessarily initialize local variable The local variable speed does not need to be initialized and can cause some static analysis tools to complain the initial assigned value is never used. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:21:01 -07:00
Michal Swiatkowski	ba0db585bd	ice: Add more validation in ice_vc_cfg_irq_map_msg Add few checks to validate msg from iavf driver. Test if we have got enough q_vectors allocated in VSI connected with VF. Add masks for itr_indx and msix_indx to avoid writing to reserved fieldi of QINT. Clear q_vector->num_ring_rx/tx, without it we can increment this value every time we send irq map msg from VF. So after second call this value will be incorrect. Decrement num_vectors from msg, because last vector in iavf msg is misc vector (we don't set map for it). Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:18:27 -07:00
Akeem G Abodunrin	bb877b22bc	ice: Don't remove VLAN filters that were never programmed In case of non-trusted VFs, it is possible to program VLAN filter far less than what is requested by the VF originally, thereby makes number of VLAN elements being tracked by VF different from actual VLAN tags. This patch makes sure that we are not attempting to remove VLAN filter that does not exist. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 14:07:34 -07:00
Tony Nguyen	e80e76db6c	ice: Preserve VLAN Rx stripping settings When Tx insertion is set, we are not accounting for the state of Rx stripping. This causes Rx stripping to be enabled any time Tx insertion is changed, even when it's supposed to be disabled. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 13:45:00 -07:00
Michal Swiatkowski	a52db6b260	ice: Fix for allowing too many MDD events on VF Disable VF if any malicious device driver (MDD) event is detected by hardware. Track vf->num_mdd_events for information about VF MDD events. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 13:42:53 -07:00
Jesse Brandeburg	819d899863	ice: Use pf instead of vsi-back Many times in our functions we have a local variable pf, which is equivalent to vsi->back. Just use pf consistently instead of vsi->back where available. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-04 13:06:56 -07:00
Brett Creeley	20ce2a1a2e	ice: Use dev_err when ice_cfg_vsi_lan fails dev_err makes more sense than dev_info when this call fails. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:29:13 -07:00
Brett Creeley	c2a23e0061	ice: Refactor link event flow Currently the link event flow works, but can be much better. Refactor the link event flow to make it cleaner and more clear on what is going on. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:27:11 -07:00
Tony Nguyen	49a6a5d7eb	ice: Add missing PHY type to link settings The PHY type ICE_PHY_TYPE_LOW_25G_AUI_C2C is missing from ice_get_settings_link_up() which is causing a warning message for unrecognized PHY. Add the PHY type to correctly set the settings and avoid the warning message. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:23:42 -07:00
Brett Creeley	b07833a00d	ice: Add reg_idx variable in ice_q_vector structure Every time we want to re-enable interrupts and/or write to a register that requires an interrupt vector's hardware index we do the following: vsi->hw_base_vector + q_vector->v_idx This is a wasteful operation, especially in the hot path. Fix this by adding a u16 reg_idx member to the ice_q_vector structure and make the necessary changes to make this work. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:21:56 -07:00
Md Fahad Iqbal Polash	8d7189d266	ice: Remove runtime change of PFINT_OICR_ENA register Runtime change of PFINT_OICR_ENA register is unnecessary. The handlers should always clear the atomic bit for each task as they start, because it will make sure that any late interrupt will either 1) re-set the bit, or 2) be handled directly in the "already running" task handler. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:19:26 -07:00
Akeem G Abodunrin	5079b853b2	ice: Fix issue when adding more than allowed VLANs This patch fixes issue with non trusted VFs being able to add more than permitted number of VLANs by adding a check in ice_vc_process_vlan_msg. Also don't return an error in this case as the VF does not need to know that it is not trusted. Also rework ice_vsi_kill_vlan to use the right types. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:17:37 -07:00
Brett Creeley	acd1751a39	ice: Remove unnecessary wait when disabling/enabling Rx queues In ice_vsi_ctrl_rx_rings() we are unnecessarily waiting for QRX_CTRL_QENA_REQ and QRX_CTRL_QENA_STAT to be the same value prior to disabling each Rx queue. There is no reason to do this so remove this wait loop as we already have a wait loop after disabling/enabling the Rx queue through the QRX_CTRL register to make sure it gets successfully disabled/enabled. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:15:43 -07:00
Brett Creeley	b9c8bb06b5	ice: Add ability to update rx-usecs-high Currently the driver allows rx-usecs-high values to be set, but when querying the device for rx-usecs-high the value does not stick. This is because it was not yet implemented. Add code to allow the user to change rx-usecs-high and use this to set the q_vector's intrl value. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:13:39 -07:00
Paul Greenwalt	b4b418b3ad	ice: Add 52 byte RSS hash key support Add support to set 52 byte RSS hash key. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:11:47 -07:00
Brett Creeley	0c2561c81f	ice: Use ice_for_each_q_vector macro where possible There are many places in the code where we do the following: for (i = 0; i < vsi->num_q_vectors; i++) Instead use the macro mentioned in the commit title: ice_for_each_q_vector(vsi, i) Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:09:51 -07:00
Maciej Fijalkowski	a92e1bb6ad	ice: Validate ring existence and its q_vector per VSI When stopping Tx rings, we use 'i' as an ring array index for looking up whether the ice_ring exists and have assigned a q_vector. This checks rings only within a given TC and we need to go through every ring in VSI. Use 'q_idx' instead. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:08:00 -07:00
Brett Creeley	1553f4f77a	ice: Reduce scope of variable in ice_vsi_cfg_rxqs Reduce scope of the variable 'err' to inside the for loop instead of using it as a second looping conditional. Also while here, improve the debug message if we fail to configure a Rx queue. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:05:38 -07:00
Bruce Allan	fe7219fa7c	ice: Resolve static analysis reported issue Static analysis points out the default case in the switch statement in ice_get_itr_intrl_gran() is an infeasible condition causing the default case statement to be unreachable. Remove it and since the function no longer returns anything but success, change it to just return void and update the only call to it accordingly. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:03:49 -07:00
Akeem G Abodunrin	85796d6e2f	ice: Return configuration error without queue to disable If there is no queue to disable, return appropriate configuration error earlier without acquiring the lock. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 01:01:40 -07:00
Anirudh Venkataramanan	bb87ee0efb	ice: Create framework for VSI queue context This patch introduces a framework to store queue specific information in VSI queue contexts. Currently VSI queue context (represented by struct ice_q_ctx) only has q_handle as a member. In future patches, this structure will be updated to hold queue specific information. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-02 00:57:44 -07:00
Stanislav Fomichev	c43f1255b8	net: pass net_device argument to the eth_get_headlen Update all users of eth_get_headlen to pass network device, fetch network namespace from it and pass it down to the flow dissector. This commit is a noop until administrator inserts BPF flow dissector program. Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-04-23 18:36:34 +02:00
Brett Creeley	711987bbad	ice: Calculate ITR increment based on direct calculation Currently when calculating how much to increment ITR by inside of ice_update_itr() we do some estimations and intermediate calculations. Instead of doing estimations, just do the calculation directly. This allows for a more accurate value and it makes it easier for the next person to understand and update. Also, remove the dividing the ITR value by 2 when latency driven because the ITR values are already so low for 100Gbps speed. This should help get to the desired ITR value faster. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:48 -07:00
Anirudh Venkataramanan	9c010de7cf	ice: Bump driver version Update driver version to 0.7.4 Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:48 -07:00
Anirudh Venkataramanan	3a257a1404	ice: Add code to control FW LLDP and DCBX This patch adds code to start or stop LLDP and DCBX in firmware through use of ethtool private flags. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:48 -07:00
Anirudh Venkataramanan	b832c2f631	ice: Add code for DCB rebuild This patch introduces a new function ice_dcb_rebuild which reinitializes DCB after a reset. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:48 -07:00
Anirudh Venkataramanan	4b0fdceb81	ice: Add code to get DCB related statistics This patch adds a new function ice_update_dcb_stats to get DCB stats from the hardware and ethtool support for displaying these stats. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan	5f6aa50e4e	ice: Add priority information into VLAN header This patch introduces a new function ice_tx_prepare_vlan_flags_dcb to insert 802.1p priority information into the VLAN header Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan	a629cf0a01	ice: Update rings based on TC information This patch adds a new function ice_vsi_cfg_dcb_rings which updates a VSI's rings based on DCB traffic class information. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan	00cc3f1b3a	ice: Add code to process LLDP MIB change events This patch adds support to process LLDP MIB change notifications sent by the firmware. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan	0deab659a6	ice: Add code for DCB initialization part 4/4 When the firmware doesn't support LLDP or DCBX, the driver should switch to "software LLDP mode". This patch adds support for the same. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan	7b9ffc76bf	ice: Add code for DCB initialization part 3/4 This patch adds a new function ice_pf_dcb_cfg (and related helpers) which applies the DCB configuration obtained from the firmware. As part of this, VSIs/netdevs are updated with traffic class information. This patch requires a bit of a refactor of existing code. 1. For a MIB change event, the associated VSI is closed and brought up again. The gap between closing and opening the VSI can cause a race condition. Fix this by grabbing the rtnl_lock prior to closing the VSI and then only free it after re-opening the VSI during a MIB change event. 2. ice_sched_query_elem is used in ice_sched.c and with this patch, in ice_dcb.c as well. However, ice_dcb.c is not built when CONFIG_DCB is unset. This results in namespace warnings (ice_sched.o: Externally defined symbols with no external references) when CONFIG_DCB is unset. To avoid this move ice_sched_query_elem from ice_sched.c to ice_common.c. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan	0ebd3ff13c	ice: Add code for DCB initialization part 2/4 This patch introduces a new top level function ice_init_dcb (and related lower level helper functions) which continues the DCB init flow. This function uses ice_get_dcb_cfg to get, parse and store the DCB configuration. Once this is done, it sets itself up to be notified by the firmware on LLDP MIB change events. Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan	37b6f6469f	ice: Add code for DCB initialization part 1/4 This patch introduces a skeleton for ice_init_pf_dcb, the top level function for DCB initialization. Subsequent patches will add to this DCB init flow. In this patch, ice_init_pf_dcb checks if DCB is a supported capability. If so, an admin queue call to start the LLDP and DCBx in firmware is issued. If not, an error is reported. Note that we don't fail the driver init if DCB init fails. Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan	802abbb44a	ice: Bump version Bump driver version to 0.7.3 Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan	f9867df6d9	ice: Fix incorrect use of abbreviations Capitalize abbreviations and spell out some that aren't obvious. Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan	94c4441b5a	ice: Fix typos in code comments This patch fixes typos in code comments. Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-04-18 08:38:47 -07:00
Will Deacon	fb24ea52f7	drivers: Remove explicit invocations of mmiowb() mmiowb() is now implied by spin_unlock() on architectures that require it, so there is no reason to call it from driver code. This patch was generated using coccinelle: @mmiowb@ @@ - mmiowb(); and invoked as: $ for d in drivers include/linux/qed sound; do \ spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done NOTE: mmiowb() has only ever guaranteed ordering in conjunction with spin_unlock(). However, pairing each mmiowb() removal in this patch with the corresponding call to spin_unlock() is not at all trivial, so there is a small chance that this change may regress any drivers incorrectly relying on mmiowb() to order MMIO writes between CPUs using lock-free synchronisation. If you've ended up bisecting to this commit, you can reintroduce the mmiowb() calls using wmb() instead, which should restore the old behaviour on all architectures other than some esoteric ia64 systems. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2019-04-08 12:01:02 +01:00
Florian Westphal	6b16f9ee89	net: move skb->xmit_more hint to softnet data There are two reasons for this. First, the xmit_more flag conceptually doesn't fit into the skb, as xmit_more is not a property related to the skb. Its only a hint to the driver that the stack is about to transmit another packet immediately. Second, it was only done this way to not have to pass another argument to ndo_start_xmit(). We can place xmit_more in the softnet data, next to the device recursion. The recursion counter is already written to on each transmit. The "more" indicator is placed right next to it. Drivers can use the netdev_xmit_more() helper instead of skb->xmit_more to check the "more packets coming" hint. skb->xmit_more is retained (but always 0) to not cause build breakage. This change takes care of the simple s/skb->xmit_more/netdev_xmit_more()/ conversions. Remaining drivers are converted in the next patches. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-01 18:35:02 -07:00
Anirudh Venkataramanan	64f4b9437f	ice: Remove "2 BITS" comment Some enums in ice_tx_desc_cmd_bits have a trailing /* 2 BITS */ comment, but the value has just one bit set (ex. ICE_TX_DESC_CMD_L4T_EOFT_SCTP has the value 0x200 (i.e. only bit 9 is set). This is confusing and misleading. So remove the comment. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 15:27:13 -07:00
Brett Creeley	92414f3292	ice: Update comment regarding the ITR_GRAN_S Since the driver now hard codes the ITR granularity to 2 us in the GLINT_CTL register the comment next to ITR_GRAN_S needs to be updated. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 15:22:44 -07:00
Anirudh Venkataramanan	6c2f997af5	ice: Update function header for __ice_vsi_get_qs Remove some redundant text in the function header for __ice_vsi_get_qs Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 15:17:44 -07:00
Anirudh Venkataramanan	ac4667551e	ice: Remove unnecessary braces Single statement if conditions don't need braces. Remove it. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 15:13:39 -07:00
Anirudh Venkataramanan	10c7e4c5fc	ice: Remove unused function prototype Commit `37bb839012` ("ice: Move common functions out of ice_main.c part 7/7") seems to have inadvertently introduced a function prototype for ice_vsi_cfg_tc without a corresponding function implementation. Remove it. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 15:07:06 -07:00
Brett Creeley	203a068ac9	ice: Add missing case in print_link_msg for printing flow control Currently we aren't checking for the ICE_FC_NONE case for the current flow control mode. This is causing "Unknown" to be printed for the current flow control method if flow control is disabled. Fix this by adding the case for ICE_FC_NONE to print "None". Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 15:05:27 -07:00
Brett Creeley	8244dd2d23	ice: Audit hotpath structures with pahole Currently the ice_q_vector structure and ice_ring_container structure are taking up more space than necessary due to cache alignment holes and unnecessary variables respectively. This is not helping the driver's performance. The following fixes were done to improve cache alignment, reduce wasted space, and increase performance. 1. Remove the ice_latency_range enum as it is unused. 2. Remove the latency_range variable in the ice_ring_container structure. 3. Change the size of the itr_idx in the ice_ring_container structure from an int to an u16. This reduced the size of ice_ring_container structure to 32 Bytes so it has no holes or padding. 4. Re-arrange the ice_q_vector structure using pahole to align members as best as possible in regards to 64 Byte cache line size. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 15:03:25 -07:00
Preethi Banala	89f3e4a5b7	ice: Do not bail out when filter already exists If filter already exists, do not go through error path flow but instead continue to process rest of the function. Hence have an appropriate check after adding MAC filters. Signed-off-by: Preethi Banala <preethi.banala@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 14:51:59 -07:00
Akeem G Abodunrin	4e1af7bf22	ice: Fix issue with VF attempt to delete default MAC address This patch fixes issue that occurs when VF is attempting to remove default LAN/MAC address, which is programmed by the administrator. We shouldn't return error for the call by the VF, but continue with the remaining steps to handle MAC opcode. Also update the dev_err message to explicitly say that VF can't change MAC programmed by PF. Also change "mac" to "MAC" for kernel print statements in the same file. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 14:49:59 -07:00
Mitch Williams	a7c9b47bc9	ice: enable VF admin queue interrupts The VPINT_MBX_CTL register array must be programmed to enable VF admin queue interrupts. Without this, VFs never get interrupts on vector 0, and some VF drivers will fail to init. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 14:07:46 -07:00
Anirudh Venkataramanan	64a59d05a4	ice: Fix for adaptive interrupt moderation commit `63f545ed12` ("ice: Add support for adaptive interrupt moderation") was meant to add support for adaptive interrupt moderation but there was an error on my part while formatting the patch, and thus only part of the patch ended up being submitted. This patch rectifies the error by adding the rest of the code. Fixes: `63f545ed12` ("ice: Add support for adaptive interrupt moderation") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 14:03:01 -07:00
Brett Creeley	5995b6d0c6	ice: Implement pci_error_handler ops This patch implements the following pci_error_handler ops: .error_detected = ice_pci_err_detected .slot_reset = ice_pci_err_slot_reset .reset_notify = ice_pci_err_reset_notify .reset_prepare = ice_pci_err_reset_prepare .reset_done = ice_pci_err_reset_done .resume = ice_pci_err_resume Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 13:57:12 -07:00
Brett Creeley	5abac9d7e1	ice: Put __ICE_PREPARED_FOR_RESET check in ice_prepare_for_reset Currently we check if the __ICE_PREPARED_FOR_RESET bit is set prior to calling ice_prepare_for_reset in ice_reset_subtask(), but we aren't checking that bit in ice_do_reset() before calling ice_prepare_for_reset(). This is not consistent and can cause issues if ice_prepare_for_reset() is called prior to ice_do_reset(). Fix this by checking if the __ICE_PREPARED_FOR_RESET bit is set internal to ice_prepare_for_reset(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 13:40:20 -07:00
Mitch Williams	cf6c6e01bf	ice: use virt channel status codes When communicating with the AVF driver, we need to use the status codes from virtchnl.h, not our own ice-specific codes. Without this, when an error occurs, the VF will report nonsensical results. NOTE: this depends on changes made to include/linux/avf/virtchnl.h by commit `bb58fd7eef` ("i40e: Update status codes") Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 13:35:04 -07:00
Jeremiah Kyle	91dab5d53f	ice: Remove unnecessary newlines from log messages Two log messages contained newlines in the middle of the message. This resulted in unexpected driver log output. This patch removes the newlines to restore consistency with the rest of the driver log messages. Signed-off-by: Jeremiah Kyle <jeremiah.kyle@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-26 13:15:28 -07:00
Chinh T Cao	86e81794ac	ice: Create a generic name for the ice_rx_flg64_bits structure This structure is used to define the packet flags. These flags are applicable for both TX and RX packet. Thus, this patch changes its name from ice_rx_flag64_bits to ice_flg64_bits, and its member definition. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 10:40:04 -07:00
Bruce Allan	2bdc97be97	ice: add and use new ice_for_each_traffic_class() macro There are numerous for() loops iterating over each of the max traffic classes. Use a simple iterator macro instead to make the code cleaner. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 10:33:54 -07:00
Preethi Banala	105e5bc23a	ice: change VF VSI tc info along with num_queues Update VF VSI tc info along with vsi->num_txq/num_rxq when VF requests to configure queues. Signed-off-by: Preethi Banala <preethi.banala@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 10:14:02 -07:00
Dave Ertman	2ebd4428d9	ice: Prevent unintended multiple chain resets In the current implementation of ice_reset_subtask, if multiple reset types are set in the pf->state, the most intrusive one is meant to be performed only, but the bits requesting the other types are not being cleared. This would lead to another reset being performed the next time the service task is scheduled. Change the flow of ice_reset_subtask so that all reset request bits in pf->state are cleared, and we still perform the most intrusive of the resets requested. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 10:12:21 -07:00
Maciej Fijalkowski	a65f71fed5	ice: map Rx buffer pages with DMA attributes Provide DMA_ATTR_WEAK_ORDERING and DMA_ATTR_SKIP_CPU_SYNC attributes to the DMA API during the mapping operations on Rx side. With this change the non-x86 platforms will be able to sync only with what is being used (2k buffer) instead of entire page. This should yield a slight performance improvement. Furthermore, DMA unmap may destroy the changes that were made to the buffer by CPU when platform is not a x86 one. DMA_ATTR_SKIP_CPU_SYNC attribute usage fixes this issue. Also add a sync_single_for_device call during the Rx buffer assignment, to make sure that the cache lines are cleared before device attempting to write to the buffer. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 10:10:39 -07:00
Maciej Fijalkowski	712edbbb67	ice: Limit the ice_add_rx_frag to frag addition Refactor ice_fetch_rx_buf and ice_add_rx_frag in a way that we have standalone functions that do either the skb construction or frag addition to previously constructed skb. The skb handling between rx_bufs is spread among various functions. The ice_get_rx_buf will retrieve the skb pointer from rx_buf and if it is a NULL pointer then we do the ice_construct_skb, otherwise we add a frag to the current skb via ice_add_rx_frag. Then, on the ice_put_rx_buf the skb pointer that belongs to rx_buf will be cleared. Moving further, if the current frame is not EOP frame we assign the current skb to the rx_buf that is pointed by updated next_to_clean indicator. What is more during the buffer reuse let's assign each member of ice_rx_buf individually so we avoid the unnecessary copy of skb. Last but not least, this logic split will allow us for better code reuse when adding a support for build_skb. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 09:55:35 -07:00
Maciej Fijalkowski	1d032bc77b	ice: Gather the rx buf clean-up logic for better reuse Pull out the code responsible for page counting and buffer recycling so that it will be possible to clean up the Rx buffers in cases where we won't allocate skb (ex. XDP) Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 09:45:07 -07:00
Maciej Fijalkowski	03c66a1376	ice: Introduce bulk update for page count {get,put}_page are atomic operations which we use for page count handling. The current logic for refcount handling is that we increment it when passing a skb with the data from the first half of page up to netstack and recycle the second half of page. This operation protects us from losing a page since the network stack can decrement the refcount of page from skb. The performance can be gently improved by doing the bulk updates of refcount instead of doing it one by one. During the buffer initialization, maximize the page's refcount and don't allow the refcount to become less than two. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 09:33:13 -07:00
Maciej Fijalkowski	1857ca42a7	ice: Get rid of ice_pull_tail Instead of adding a frag and later when dealing with EOP frame accessing that frag in order to copy the headers onto linear part of skb, we can do this in ice_add_rx_frag in case where the data_len is still 0 and frame won't fit onto the linear part as a whole. Function comment of ice_pull_tail was a bit misleading because of mentioned optimizations that can be performed (drop a frag/maintaining accurate truesize of skb) - it seems that this part of logic was dropped and the comment was not updated to reflect this change. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 09:22:21 -07:00
Maciej Fijalkowski	bbb97808a0	ice: Pull out page reuse checks onto separate function Introduce ice_can_reuse_rx_page which will verify whether the page can be reused and return the boolean result to caller. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 08:00:14 -07:00
Maciej Fijalkowski	6c869cb7a8	ice: Retrieve rx_buf in separate function Introduce ice_get_rx_buf, which will fetch the Rx buffer and do the DMA synchronization. Length of the packet that hardware Rx descriptor contains is now read in ice_clean_rx_irq, so we can feed ice_get_rx_buf with it and resign from rx_desc passed as argument in ice_fetch_rx_buf and ice_add_rx_frag. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 07:58:17 -07:00
Brett Creeley	250c3b3e0a	ice: Enable link events over the ARQ The hardware now supports link events over the admin receive queue (ARQ), so enable HW link events over the ARQ and remove code for link event polling. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 07:54:44 -07:00
Alan Brady	8d051b8b5d	ice: use irq_num var in ice_vsi_req_irq_msix Someone went through the effort of making this a variable so let's use it instead of recalculating it again. Signed-off-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 07:52:51 -07:00
Michal Swiatkowski	840bcd88f8	ice: Restore VLAN switch rule if port VLAN existed before The VLAN rule is lost when VM starts or the AVF driver (iavf.ko) is reloaded. So it is necessary to add this rule again. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 07:41:49 -07:00
Victor Raj	b0153fdd7e	ice: update VSI config dynamically When VSI increases the number of queues dynamically, the scheduler just needs to add the new required nodes rather than re-adjusting with previously allocated number of nodes. Readjusting didn't provide enough parents to add the upper layer nodes also can't place lan and rdma subtrees separately. In decrease case, keep the VSI configuration with max number of queues always. This will leave some extra nodes in the tree but no harm done. Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-25 07:23:20 -07:00
Akeem G Abodunrin	f1ef73f50b	ice: Get VF VSI instances directly via PF This patch changes how we get VF VSIs instances. Instead of relying on mailbox virtual channel message to retrieve VSI, it is more reliable getting it directly via VF object in PF data structure. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:26:29 -07:00
Akeem G Abodunrin	d84b899a94	ice: Don't let VF know that it is untrusted Don't let the VF know it's not trusted when it tries to add more than permitted additional MAC addresses. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:23:27 -07:00
Yashaswini Raghuram Prathivadi Bhayankaram	26069b448e	ice: Set LAN_EN for all directional rules The LAN_EN bit for a switch rule determines if the packet can go out on the wire or not. Set the LAN_EN flag in the switch action for all directional rules. Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:20:05 -07:00
Christopher N Bednarz	b58dafbc6f	ice: Do not set LB_EN for prune switch rules LB_EN for prune switch rules was causing all TX traffic to loopback to the internal switch and dropped. When running bi-directional stress workloads with RDMA the RDPU would hang blocking tx and rx traffic. Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:17 -07:00
Yashaswini Raghuram Prathivadi Bhayankaram	277b3a4547	ice: Enable LAN_EN for the right recipes In VEB mode, enable LAN_EN bit in the action fields for filter rules corresponding to the right recipes. Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:17 -07:00
Akeem G Abodunrin	5eda8afd6b	ice: Add support for PF/VF promiscuous mode Implement support for VF promiscuous mode, MAC/VLAN/MAC_VLAN and PF multicast MAC/VLAN/MAC_VLAN promiscuous mode. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:17 -07:00
Victor Raj	e1ca65a3cc	ice: code cleanup in ice_sched.c This patch does some clean up in the Tx scheduler code: 1. Adjust the stack variable usage 2. Modify the debug prints to display the FW error 3. Add additional debug prints while adding/removing VSIs Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:17 -07:00
Anirudh Venkataramanan	eb86b09491	ice: Remove unused vsi_id field Remove unused vsi_id field from struct ice_sched_vsi_info. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:17 -07:00
Bruce Allan	c8b7abdd7d	ice: fix some function prototype and signature style issues Put the return type on a separate line for function prototypes and signatures that would exceed the 80-character limit if both were on the same line. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:17 -07:00
Kiran Patil	60dcc39ea3	ice: fix the divide by zero issue Static analysis flagged a potential divide by zero error because vsi->num_rxq can become zero in certain condition and it is used as divisor. Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:17 -07:00
Akeem G Abodunrin	5743020d37	ice: Fix issue reconfiguring VF queues When VF requested for queues changes, we need to update LAN Tx queue with correct number of VF queue pairs and re-allocate VF resources based on this new requested number of queues, which is constraint within maximum queue supported per VF. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:17 -07:00
Anirudh Venkataramanan	23d21c3dbb	ice: Remove unused function prototype Commit `7c710869d6` ("ice: Add handlers for VF netdevice operations") seems to have inadvertently introduced a function prototype for ice_set_vf_bw that isn't implemented. Remove it. Fixes: `7c710869d6` ("ice: Add handlers for VF netdevice operations") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:17 -07:00
Bruce Allan	1b5c19c779	ice: fix static analysis warnings cppcheck warns "Identical condition '<var>', second condition is always false". Fix them. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:16 -07:00
Akeem G Abodunrin	7eeac88976	ice: Fix issue reclaiming resources back to the pool after reset This patch fixes issue reclaiming VF resources back to the pool after reset - Since we only allocate HW vector for all VFs and track together with resources allocation for PF with ice_search_res, we need to free VFs resources separately, using first VF vector index to traverse the list. Otherwise tracker starts from the last assigned vectors list and causes maximum supported number of HW vectors, 1024 to be exhausted, depending on the number of VFs enabled, which causes a lot of unwanted issues, and failed to reassign vectors for VFs. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:16 -07:00
Akeem G Abodunrin	cb93a9529d	ice: Enable MAC anti-spoof by default This patch enables MAC anti-spoof by default, with creation of VF VSIs or when the VF VSIs are being re-initialized. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-22 08:19:16 -07:00
Brett Creeley	ad71b256ba	ice: Determine descriptor count and ring size based on PAGE_SIZE Currently we set the default number of Tx and Rx descriptors to 128 by default. For Rx this amounts to a full page (assuming 4K pages) because each Rx descriptor is 32 Bytes, but for Tx it only amounts to a half page because each Tx descriptor is 16 Bytes (assuming 4K pages). Instead of assuming 4K pages, determine the ring size and the number of descriptors for Tx and Rx based on a calculation using the PAGE_SIZE, ICE_MAX_NUM_DESC, and ICE_REQ_DESC_MULTIPLE. This change is being made to improve the performance of the driver when using the default settings. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 17:24:03 -07:00
Akeem G Abodunrin	544f63d307	ice: Reset all VFs with VFLR during SR-IOV init flow During SR-IOV initialization, we allocate and setup VFs with reset, and since we were going to inform Firmware about our intention to do VFLR by disabling LAN TX Queue, then we really have to complete VF reset flow with VFLR using appropriate registers - Otherwise, reset status bit for VF in the Guest OS might returns DEADBEEF. This resolves issue to properly initialize VFs in the Guest OS via PCI passthrough. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 17:02:26 -07:00
Brett Creeley	7a1f711175	ice: Get resources per function ice_get_guar_num_vsi currently calculates the number of VSIs per PF. Rework this into a general function ice_get_num_per_func, that can calculate per PF allocations for not just VSIs but across multiple resource types. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 17:00:54 -07:00
Akeem G Abodunrin	1c44e3bce1	ice: Implement flow to reset VFs with PFR and other resets All VF VSIs need to be reset and rebuild with the main VSIs before replaying all VSIs, so that all existing switch filters, scheduler tree and other configuration could be replayed at once. This fixes issues when doing PFR and CORER reset. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 17:00:09 -07:00
Brett Creeley	70457520ba	ice: configure GLINT_ITR to always have an ITR gran of 2 Instead of hoping that our ITR granularity will be 2 usec program the GLINT_CTL register to make sure the ITR granularity is always 2 usecs. Now that we know what the ITR granularity will be get rid of the check in ice_probe() to verify our previous assumption. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 16:56:10 -07:00
Brett Creeley	80ed404abb	ice: use ice_for_each_vsi macro when possible Replace all instances of: for (i = 0; i < pf->num_alloc_vsi; i++) with the following macro: ice_for_each_vsi(pf, i) This will allow the code to be consistent since there are currently cases of using both. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 16:54:23 -07:00
Chinh T Cao	d8df260af7	ice : Ensure only valid bits are set in ice_aq_set_phy_cfg In the ice_aq_set_phy_cfg AQ command, the 16.4 bit is reserved. This patch will make sure that this bit will never be set to 1. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 16:52:47 -07:00
Brett Creeley	16c3301b55	ice: remove redundant variable and if condition In ice_pf_rxq_wait we are using an unnecessary local variable and also we are checking if the timeout time was reached after the loop. Get rid of the local variable and return 0 right when we get a successful result. This makes it so we can return -ETIMEDOUT if we ever exit the loop because we know the timeout time has been hit. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 16:51:08 -07:00
Bruce Allan	77ed84f49a	ice: avoid multiple unnecessary de-references in probe Add a local variable struct device *dev to avoid unnecessary de-references throughout ice_probe(). Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 16:49:36 -07:00
Akeem G Abodunrin	42b2cc83af	ice: Fix issue with VF reset and multiple VFs support on PFs This patch fixes issues with VF queues being disabled, and VF netdev network carrier being lost after reset. Basically, we need to check if VF is enabled, and queue configured in reset_all_vfs flow, and disable/enable those queues appropriately whenever the function is called after Global/CORER/PFR reset/rebuild/replay. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 16:37:43 -07:00
Michal Swiatkowski	77a7a84d62	ice: Fix broadcast traffic in port VLAN mode Set egress (Rx) pruning enable flag for VF VSI in VSI ctxt to enable prune action. To avoid seeing broadcast packet in different VLAN, pruning enable flag in VSI ctxt should be set. Write new functions (fill VSI ctx) to not repeat send ctxt code. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-03-19 16:32:26 -07:00
Jesse Brandeburg	1fa6e138ad	ice: fix overlong string, update stats output A test started warning on a string truncation. This led to an unfortunate realization that we are likely not accounting for the stats length correctly before this patch, so fix the issue by putting "port." in front of all the PF stats, instead of magically prepending it at runtime. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:02 -08:00
Lukasz Czapnik	40c3c54638	ice: Fix for FC get rx/tx pause params Ethtool reported pause params based on the currently negotiated link settings instead of current PHY config. User was not able to turn off pause params because ethtool was incorrectly reporting parameters as off when link was down even though PHY was configured to support pause frames. Now pause params are taken from PHY config instead of link status. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:02 -08:00
Mitch Williams	f966127a68	ice: use absolute vector ID for VFs When the PF driver sets up the VF MSI-X vector allocation, it needs to use the hardware absolute vector ID, not the per-PF vector ID. Without this change we see (apparent) TX hangs when using VFs on multiple PFs. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:02 -08:00
Victor Raj	f70b9d5f44	ice: check for a leaf node presence Check for a leaf node presence for a given VSI. This check is required before removing a VSI since VSIs can't be removed with enabled queues (with leaf nodes) from the FW scheduler tree unless its a reset. Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Victor Raj	6e9650d533	ice: flush Tx pipe on disable queue timeout Set the flush Tx pipe flag instead of getting an EAGAIN error when FW times out in processing the disable Tx queue command. Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Mitch Williams	82ba01282c	ice: clear VF ARQLEN register on reset On older devices like X710 and X722, the VF's ARQLEN register is cleared on reset, so the VF driver uses that register to detect an unannounced reset. Unfortunately, on devices controlled by ice, this register is NOT cleared on reset. This causes the VF to miss resets, and even on properly-announced resets, the VF driver complains that it didn't see the reset. To fix this, we'll do it in software. When we handle a VF reset (whether triggered by software or VFLR), clear this register after the HW reset is complete. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Mitch Williams	4cf7bc0d27	ice: don't spam VFs with link messages Don't send a link message to the VFs unless link actually changes state. This avoids a small timing hole in some VF drivers that can cause an apparent TX hang if they receive a link status message at the wrong time. Although we have fixed the timing hole in the current VF driver, there are still lots of drivers in the field that have this timing hole. Let's not fall into it if we can avoid it. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Brett Creeley	b751930c6c	ice: only use the VF for ICE_VSI_VF in ice_vsi_release In ice_vsi_release we are always assigning a value to the local VF variable. Change this to only be assigned if the VSI is a VF VSI. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Bruce Allan	32a64994db	ice: fix numeric overflow warning When compiling and analyzing the driver on newer kernels, a static analyzer warns about the following "numeric overflow" issues: "The result of expression: 'budget-1' generates 4-byte type while casting to a bigger size of 8-byte". "The result of expression: '*words-words_read' generates 4-byte type while casting to a bigger size of 8-byte". Fix them both. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Brett Creeley	0e04e8e14b	ice: fix issue where host reboots on unload when iommu=on Currently if the kernel has the intel_iommu=on parameter set, on some platforms removing the driver causes a system reboot. In initialization we associate the control queue interrupts with the pf->hw_oicr_idx and enable the interrupts by setting the CAUSE_ENA bit. The problem comes on teardown because we are not clearing the CAUSE_ENA bit for the control queues, but the vector at pf->hw_oicr_idx (miscellaneous interrupt vector) gets disabled. Fix this by clearing the CAUSE_ENA bit in the appropriate control queue registers on when freeing the miscellaneous interrupt vector. Also, move the call to ice_free_irq_msix_misc() to after ice_deinit_sw() in ice_remove() because ice_deinit_sw() makes an AQ call, but ice_free_irq_msix_misc() disables the miscellaneous vector and it's associated interrupts. Also, create two small helper functions to enable and disable the control queue interrupts respectively. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Jacob Keller	f9264dd687	ice: fix ice_remove_rule_internal vsi_list handling When adding multiple VLANs to the same VSI, the ice_add_vlan code will share the VSI list, so as not to create multiple unnecessary VSI lists. Consider the following flow ice_add_vlan(hw, <VSI 0 VID 7, VSI 0 VID 8, VSI 0 VID 9>) Where we add three VLAN filters for VIDs 7, 8, and 9, all for VSI 0. The ice_add_vlan will create a single vsi_list and share it among all the filters. Later, if we try to remove a VLAN, ice_remove_vlan(hw, <VSI 0 VID 7>) Then the removal code will update the vsi_list and remove VSI 0 from it. But, since the vsi_list is shared, this breaks the list for the other users who reference it. We actually even free the VSI list memory, and may result in segmentation faults. This is due to the way that VLAN rule share VSI lists with reference counts, and is caused because we call ice_rem_update_vsi_list even when the ref_cnt is greater than one. To fix this, handle the case where ref_cnt is greater than one separately. In this case, we need to remove the associated rule without modifying the vsi_list, since it is currently being referenced by another rule. Instead, we just need to decrement the VSI list ref_cnt. The case for handling sharing of VSI lists with multiple VSIs is not currently supported by this code. No such rules will be created today, and this code will require changes if/when such code is added. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Bruce Allan	198a666a45	ice: fix stack hogs from struct ice_vsi_ctx structures struct ice_vsi_ctx has gotten large enough that function local declarations of it on the stack are causing stack hogs. Fix that by allocating the structs on heap. Cleanup some formatting issues in the code around these changes and fix incorrect data type uses of returned functions in a couple places. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Bruce Allan	c6dfd690f1	ice: sizeof(<type>) should be avoided With sizeof(), it is preferable to use the variable of type <type> instead of sizeof(<type>). There are multiple places where a temporary variable is used to hold a 'size' value which is then used for a subsequent alloc/memset. Get rid of the temporary variable by calculating size as part of the alloc/memset statement. Also remove unnecessary type-cast. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Victor Raj	0e8fd74df2	ice: Fix added in VSI supported nodes calc VSI supported nodes are calculated in order to add the VSI parent or intermediate nodes to the scheduler tree. If one of the node in below layers (from VSI layer) has space to add the new VSI or intermediate node above that layer then it's not required to continue the calculation further for below layers. Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Maciej Fijalkowski	5ed5d316d9	ice: Fix the calculation of ICE_MAX_MTU Currently ICE_MAX_MTU subtracts only ETH_HLEN from max frame size and adds ETH_FCS_LEN and VLAN_HLEN, which is not what was intended. The ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN expression should be surrounded with parentheses. Wrap mentioned expression and take into account VLAN double tagging. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Bruce Allan	99be37edeb	ice: Mark extack argument as __always_unused Commit `87b0984ebf` ("net: Add extack argument to ndo_fdb_add()") in net-next added an extended parameter to the .ndo_fdb_add op and changed ice_fdb_add() accordingly. Update the function header and add the __always_unused attribute to the new parameter to avoid -Wunused-parameter warnings. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-25 08:56:01 -08:00
Petr Machata	87b0984ebf	net: Add extack argument to ndo_fdb_add() Drivers may not be able to support certain FDB entries, and an error code is insufficient to give clear hints as to the reasons of rejection. In order to make it possible to communicate the rejection reason, extend ndo_fdb_add() with an extack argument. Adapt the existing implementations of ndo_fdb_add() to take the parameter (and ignore it). Pass the extack parameter when invoking ndo_fdb_add() from rtnl_fdb_add(). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Jacob Keller	d671e3e0da	ice: add const qualifier to mac_addr parameter The function ice_aq_manage_mac_write takes a pointer to a MAC address. The parameter is not marked const, even though the function doesn't need to modify it. This prevents passing a parameter that is already marked const. Update the function prototype to take a const pointer, to allow passing constant pointers to this function. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 12:42:38 -08:00
Anirudh Venkataramanan	aef74145f0	ice: Add support for new PHY types This patch adds code for the detection and operation of several additional PHY types that support higher link speeds. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 12:38:44 -08:00
Anirudh Venkataramanan	cf909e19ac	ice: Offload SCTP checksum This patch adds the ability to offload SCTP checksum calculations to the NIC. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 12:02:27 -08:00
Tony Nguyen	a8939784a1	ice: Allow for software timestamping Use ethtool_op_get_ts_info to provide software timestamping. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 11:56:26 -08:00
Brett Creeley	67fe64d78c	ice: Implement getting and setting ethtool coalesce This patch includes the following ethtool operations: 1. get_coalesce 2. set_coalesce 3. get_per_q_coalesce 4. set_per_q_coalesce Each ITR value (current_itr/target_itr) are stored on a per ice_ring_container basis. This is because each valid ice_ring_container can have 1 or more rings that are tied to the same q_vector ITR index. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 11:50:05 -08:00
Brett Creeley	63f545ed12	ice: Add support for adaptive interrupt moderation Currently the driver does not support adaptive/dynamic interrupt moderation. This patch adds support for this. Also, adaptive/dynamic interrupt moderation is turned on by default upon driver load. In order to support adaptive interrupt moderation, two functions were added, ice_update_itr() and ice_itr_divisor(). These are used to determine the current packet load and to determine a divisor based on link speed respectively. This patch also adds the ICE_ITR_GRAN_S define that is used in the hot-path when setting a new ITR value. The shift is used to pet two birds with one hand, set the ITR value while re-enabling the interrupt. Also, the ICE_ITR_GRAN_S is defined as 1 because the device has a ITR granularity of 2usecs. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 11:29:16 -08:00
Anirudh Venkataramanan	9be1d6f8c3	ice: Move aggregator list into ice_hw instance The aggregator list needs to be preserved for use after a reset. This patch moves it out of the port_info instance and into the ice_hw instance. Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 11:21:13 -08:00
Anirudh Venkataramanan	03f7a98668	ice: Rework queue management code for reuse This patch reworks the queue management code to allow for reuse with the XDP feature (to be added in a future patch). Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 11:11:10 -08:00
Bruce Allan	ab4ab73fc1	ice: Add ethtool private flag to make forcing link down optional Add new infrastructure for implementing ethtool private flags using the existing pf->flags bitmap to store them, and add the link-down-on-close ethtool private flag to optionally bring down the PHY link when the interface is administratively downed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 10:32:59 -08:00
Brett Creeley	b6f934f027	ice: Set physical link up/down when an interface is set up/down When a netdev is set up/down we need to set the phsyical link state accordingly. This patch adds that functionality by calling ice_force_phys_link_state(vsi, link_up) in both the ice_stop() and ice_open() paths. In order to force link, ice_force_phys_link_state(vsi, link_up) will first determine the current phy capabilities. If link has not changed there is nothing to do. If link has changed, previous PHY capabilities are saved and the "Enable Automatic Link Update" and "Link Establishment State Machine (LESM)" enable bits are set. Then the new PHY config is saved. The "Enable Automatic Link Update" will force the FW to execute Setup link and restart auto-negotiation. This should then result in a "Link Status Event (LSE)" which will cause the driver to get the current link status. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 10:27:18 -08:00
Bruce Allan	4c98ab550c	ice: Implement support for normal get_eeprom[_len] ethtool ops Add support for get_eeprom and get_eeprom_len ethtool ops Specification states that PF software accesses NVM (shadow-ram) via AQ commands (e.g. NVM Read, NVM Write) in the range 0x000000-0x00FFFF (64KB), so the get_eeprom_len op should return 64KB. If additional regions of the 16MB NVM must be read, another access method must be used. The ethtool kernel code, by default, will ask for multiple page-size hunks of the NVM not to exceed the value returned by ice_get_eeprom_len(). ice_read_sr_buf() deals with arch page sizes different than 4KB. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 10:20:43 -08:00
Anirudh Venkataramanan	8e151d50a1	ice: Add ethtool set_phys_id handler Add led blinking handler to ethtool. Since led blinking is controlled by FW/HW only ETHTOOL_ID_ACTIVE and ETHTOOL_ID_INACTIVE are really needed. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 10:07:08 -08:00
Md Fahad Iqbal Polash	27a98affa6	ice: Configure RSS LUT and HASH KEY in rebuild path This patch configures the RSS lookup table and hash key post reset. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 10:02:44 -08:00
Anirudh Venkataramanan	1f9c7840e8	ice: Refactor a few Tx scheduler functions The following functions were refactored to call a new common function, ice_aqc_send_sched_elem_cmd(): - ice_aq_add_sched_elems() - ice_aq_delete_sched_elems() - ice_aq_move_sched_elems() - ice_aq_query_sched_elems() - ice_aq_cfg_sched_elems() - ice_aq_suspend_sched_elems() - ice_aq_resume_sched_elems() Signed-off-by: Greg Priest <greg.priest@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 09:54:59 -08:00
Bruce Allan	3d50514717	ice: Fix unused variable build warning Commit `2fd527b72b` ("net: ndo_bridge_setlink: Add extack") added a new parameter "extack" to ice_bridge_setlink but this parameter isn't used by the function. This results in a warning: unused parameter ‘extack’ [-Wunused-parameter]. Fix that by adding an "__always_unused" qualifier. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-01-15 08:44:05 -08:00
Young Xiao	eec903769b	ice: Do not enable NAPI on q_vectors that have no rings If ice driver has q_vectors w/ active NAPI that has no rings, then this will result in a divide by zero error. To correct it I am updating the driver code so that we only support NAPI on q_vectors that have 1 or more rings allocated to them. See commit `13a8cd191a` ("i40e: Do not enable NAPI on q_vectors that have no rings") for detail. Signed-off-by: Young Xiao <YangX92@hotmail.com> Acked-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:10:24 -08:00
Petr Machata	2fd527b72b	net: ndo_bridge_setlink: Add extack Drivers may not be able to implement a VLAN addition or reconfiguration. In those cases it's desirable to explain to the user that it was rejected (and why). To that end, add extack argument to ndo_bridge_setlink. Adapt all users to that change. Following patches will use the new argument in the bridge driver. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:34:21 -08:00
Jesse Brandeburg	0bcd952fee	ethernet/intel: consolidate NAPI and NAPI exit While reviewing code, I noticed that Eric Dumazet recommends that drivers check the return code of napi_complete_done, and use that to decide to enable interrupts or not when exiting poll. One of the Intel drivers was already fixed (ixgbe). Upon looking at the Intel drivers as a whole, we are handling our polling and NAPI exit in a few different ways based on whether we have multiqueue and whether we have Tx cleanup included. Several drivers had the bug of exiting NAPI with return 0, which appears to mess up the accounting in the stack. Consolidate all the NAPI routines to do best known way of exiting and to just mostly look like each other. 1) check return code of napi_complete_done to control interrupt enable 2) return the actual amount of work done. 3) return budget immediately if need NAPI poll again Tested the changes on e1000e with a high interrupt rate set, and it shows about an 8% reduction in the CPU utilization when busy polling because we aren't re-enabling interrupts when we're about to be polled. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-21 10:35:23 -08:00
Bruce Allan	f25dad19ba	ice: Fix possible NULL pointer de-reference A recent update to smatch is causing it to report the error "we previously assumed 'm_entry->vsi_list_info' could be null". Fix that. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Anirudh Venkataramanan	d337f2afb7	ice: Use Tx\|Rx in comments In code comments, use Tx\|Rx instead of tx\|rx Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Anirudh Venkataramanan	df17b7e02f	ice: Cosmetic formatting changes 1. Fix several cases of double spacing 2. Fix typos 3. Capitalize abbreviations Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Bruce Allan	2c5492de87	ice: Cleanup short function signatures Function signatures that do not exceed 80-characters should be on a single line. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Bruce Allan	bc0c6fab8a	ice: Cleanup ice_tx_timeout() Clean up number of formatting issues and a comment that could use clarification. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Dave Ertman	e0c9fd9b77	ice: Fix return value from NAPI poll ice_napi_poll is hard-coded to return zero when it's done. It should instead return the work done (if any work was done). The only time it should return zero is if an interrupt or poll is handled and no work is performed. So change the return value to be the minimum of work done or budget-1. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Bruce Allan	55aa141ed9	ice: Constify global structures that can/should be Indicate these structs should not be modified and take advantage of some compiler optimizations by making these structs const. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Yashaswini Raghuram Prathivadi Bhayankaram	6a7e699369	ice: Do not set LAN_EN for MAC-VLAN filters In the action fields for a MAC-VLAN filter, do not set the LAN_EN flag if the MAC in the MAC-VLAN is unicast MAC. The unicast packets that match should not be forwarded to the wire. Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Jaroslaw Ilgiewicz	5fb597d7c8	ice: Pass the return value of ice_init_def_sw_recp() Added check of return value for ice_init_def_sw_recp(). Now we know if memory was correctly allocated. Signed-off-by: Jaroslaw Ilgiewicz <jaroslaw.ilgiewicz@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:03 -08:00
Bruce Allan	7afdbc903a	ice: Cleanup duplicate control queue code 1. Assigning the register offset and mask values contains duplicate code that can easily be replaced with a macro. 2. Separate functions for freeing send queue and receive queue rings are not needed; replace with a single function that uses a pointer to the struct ice_ctl_q_ring structure as a parameter instead of a pointer to the struct ice_ctl_q_info structure. 3. Initializing register settings for both send queue and receive queue contains duplicate code that can easily be replaced with a helper function. 4. Separate functions for freeing send queue and receive queue buffers are not needed; duplicate code can easily be replaced with a macro. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:03 -08:00
Akeem G Abodunrin	d38b08834f	ice: Do autoneg based on VSI state If VSI state is up, we should do autoneg with link up, otherwise with link down. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:03 -08:00
Md Fahad Iqbal Polash	ef878d6086	ice: Remove ICE_MAX_TXQ_PER_TXQG check when configuring Tx queue This patch removes the condition checking of VSI TX queue number to ICE_MAX_TXQ_PER_TXQG. This is an unnecessary check and causes a driver load error on hosts that have more than 128 cores. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Henry Tieman	47e3e53cea	ice: Destroy scheduler tree in reset path The scheduler tree is is always rebuilt during reset. The existing code adds new scheduler nodes for queues but may not clean up earlier nodes. This patch removed the old scheduler tree during reset before it is rebuilt. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Usha Ketineni	c5a2a4a388	ice: Fix to make VLAN priority tagged traffic to appear on all TCs This patch includes below changes to resolve the issue of ETS bandwidth shaping to work. 1. Allocation of Tx queues is accounted for based on the enabled TC's in ice_vsi_setup_q_map() and enabled the Tx queues on those TC's via ice_vsi_cfg_txqs() 2. Get the mapped netdev TC # for the user priority and set the priority to TC mapping for the VSI. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Brett Creeley	99fc1057b4	ice: Call pci_disable_sriov before stopping queues for VF Previous to this commit the driver was immediately stopping Tx/Rx queues when doing the following "echo 0 > sriov_numvfs" and then it was calling pci_disable_sriov if the VFs are not assigned. This was causing the VIRTCHNL_OP_DISABLE_QUEUES to fail because it was trying to stop the queues for a second time. Fix this by calling pci_disable_sriov before stopping the Tx/Rx queues. This allows the VIRTCHNL_OP_DISABLE_QUEUES to get processed before the driver tries to stop the Rx/Tx queues in ice_free_vfs. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Piotr Raczynski	7b8ff0f9cc	ice: Increase Rx queue disable timeout With much traffic coming into the port, Rx queue disable procedure can take more time until all pending queue requests on PCIe finish. Reuse ICE_Q_WAIT_MAX_RETRY macro and increase the delay itself. Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Lev Faerman	6263e811f4	ice: Fix NVM mask defines Fixes bad masks that would break compilation when evaluated. Signed-off-by: Lev Faerman <lev.faerman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Dave Ertman	d09e2693b6	ice: Avoid nested RTNL locking in ice_dis_vsi ice_dis_vsi() performs an rtnl_lock() if it detects a netdev that is running on the VSI. In cases where the RTNL lock has already been acquired, a deadlock results. Add a boolean to pass to ice_dis_vsi to tell it if the RTNL lock is already held. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Anirudh Venkataramanan	995c90f2de	ice: Calculate guaranteed VSIs per function and use it Currently we are setting the guar_num_vsi to equal to ICE_MAX_VSI which is the device limit of 768. This is incorrect and could have unintended consequences. To fix this use the valid_function's 8-bit bitmap returned from discovering device capabilities to determine the guar_num_vsi per function. guar_num_vsi value is then passed on to pf->num_alloc_vsi. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Anirudh Venkataramanan	10e03a22de	ice: Remove node before releasing VSI Before releasing the VSI, remove the VSI scheduler node. If not, the node is left in the scheduler tree and, on subsequent load, the scheduler tree contains the node so it does not set it in vsi_ctx. This, later, causes the node to not be found in ice_sched_get_free_qparent which leads to a "Failed to set LAN Tx queue context, error: -1". To remove the scheduler node, this patch introduces ice_rm_vsi_lan_cfg and related helpers. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Tony Nguyen	b354e98f49	ice: Check for q_vector when stopping rings There is a gap in time between a VF reset, which sets the q_vector to NULL, and the VF requesting mapping of the q_vectors. If ice_vsi_stop_tx_rings() is called during this time, a NULL pointer dereference is encountered. Add a check in ice_vsi_stop_tx_rings() to ensure the q_vector is set to avoid this situation from occurring. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:25 -08:00
Brett Creeley	807bc98d31	ice: Fix debug print in ice_tx_timeout Currently the debug print in ice_tx_timeout is printing useless and duplicate values. First, head is being assigned to tx_ring->next_to_clean and we are printing both of those values, but naming them HWB and NTC respectively. Also, reading tail always returns 0 so remove that as well. Instead of assigning the SW head (NTC) read to head, use the actual head register and change the debug print to note that this is HW_HEAD. Also reduce the scope of a couple variables. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:25 -08:00
Chinh T Cao	ffe498237b	ice: Change req_speeds to be u16 Since the req_speeds field in struct ice_link_status is a u8, req_speeds & ICE_AQ_LINK_SPEED_40GB always returns 0. This was caught by a coverity scan. Fix this by changing req_speeds to be u16. Reported-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:37:28 -08:00
Brett Creeley	d944b46992	ice: Fix the bytecount sent to netdev_tx_sent_queue Currently if the driver does a TSO offload the bytecount sent to netdev_tx_sent_queue will be incorrect. This is because in ice_tso we overwrite the initial value that we set in ice_tx_map. This creates a mismatch between the Tx and Tx clean flow. In the Tx clean flow we calculate the bytecount (called total_bytes) as we clean the descriptors so the value used in the Tx clean path is correct. Fix this by using += in ice_tso instead of =. This fixes the mismatch in bytecount mentioned above. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Brett Creeley	c585ea42ec	ice: Fix tx_timeout in PF driver Prior to this commit the driver was running into tx_timeouts when a queue was stressed enough. This was happening because the HW tail and SW tail (NTU) were incorrectly out of sync. Consequently this was causing the HW head to collide with the HW tail, which to the hardware means that all descriptors posted for Tx have been processed. Due to the Tx logic used in the driver SW tail and HW tail are allowed to be out of sync. This is done as an optimization because it allows the driver to write HW tail as infrequently as possible, while still updating the SW tail index to keep track. However, there are situations where this results in the tail never getting updated, resulting in Tx timeouts. Tx HW tail write condition: if (netif_xmit_stopped(txring_txq(tx_ring) \|\| !skb->xmit_more) writel(sw_tail, tx_ring->tail); An issue was found in the Tx logic that was causing the afore mentioned condition for updating HW tail to never happen, causing tx_timeouts. In ice_xmit_frame_ring we calculate how many descriptors we need for the Tx transaction based on the skb the kernel hands us. This is then passed into ice_maybe_stop_tx along with some extra padding to determine if we have enough descriptors available for this transaction. If we don't then we return -EBUSY to the stack, otherwise we move on and eventually prepare the Tx descriptors accordingly in ice_tx_map and set next_to_watch. In ice_tx_map we make another call to ice_maybe_stop_tx with a value of MAX_SKB_FRAGS + 4. The key here is that this value is possibly less than the value we sent in the first call to ice_maybe_stop_tx in ice_xmit_frame_ring. Now, if the number of unused descriptors is between MAX_SKB_FRAGS + 4 and the value used in the first call to ice_maybe_stop_tx in ice_xmit_frame_ring then we do not update the HW tail because of the "Tx HW tail write condition" above. This is because in ice_maybe_stop_tx we return success from ice_maybe_stop_tx instead of calling __ice_maybe_stop_tx and subsequently calling netif_stop_subqueue, which sets the __QUEUE_STATE_DEV_XOFF bit. This bit is then checked in the "Tx HW tail write condition" by calling netif_xmit_stopped and subsequently updating HW tail if the afore mentioned bit is set. In ice_clean_tx_irq, if next_to_watch is not NULL, we end up cleaning the descriptors that HW sets the DD bit on and we have the budget. The HW head will eventually run into the HW tail in response to the description in the paragraph above. The next time through ice_xmit_frame_ring we make the initial call to ice_maybe_stop_tx with another skb from the stack. This time we do not have enough descriptors available and we return NETDEV_TX_BUSY to the stack and end up setting next_to_watch to NULL. This is where we are stuck. In ice_clean_tx_irq we never clean anything because next_to_watch is always NULL and in ice_xmit_frame_ring we never update HW tail because we already return NETDEV_TX_BUSY to the stack and eventually we hit a tx_timeout. This issue was fixed by making sure that the second call to ice_maybe_stop_tx in ice_tx_map is passed a value that is >= the value that was used on the initial call to ice_maybe_stop_tx in ice_xmit_frame_ring. This was done by adding the following defines to make the logic more clear and to reduce the chance of mucking this up again: ICE_CACHE_LINE_BYTES 64 ICE_DESCS_PER_CACHE_LINE (ICE_CACHE_LINE_BYTES / \ sizeof(struct ice_tx_desc)) ICE_DESCS_FOR_CTX_DESC 1 ICE_DESCS_FOR_SKB_DATA_PTR 1 The ICE_CACHE_LINE_BYTES being 64 is an assumption being made so we don't have to figure this out on every pass through the Tx path. Instead I added a sanity check in ice_probe to verify cache line size and print a message if it's not 64 Bytes. This will make it easier to file issues if they are seen when the cache line size is not 64 Bytes when reading from the GLPCI_CNF2 register. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Dave Ertman	25525b69bb	ice: Fix napi delete calls for remove In the remove path, the vsi->netdev is being set to NULL before the call to free vectors. This is causing the netif_napi_del call to never be made. Add a call to ice_napi_del to the same location as the calls to unregister_netdev and just prior to them. This will use the reverse flow as the register and netif_napi_add calls. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Anirudh Venkataramanan	31082519c1	ice: Fix typo in error message Print should say "Enabling" instead of "Enaabling" Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Md Fahad Iqbal Polash	58297dd133	ice: Fix flags for port VLAN According to the spec, whenever insert PVID field is set, the VLAN driver insertion mode should be set to 01b which isn't done currently. Fix it. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Anirudh Venkataramanan	9ecd25c268	ice: Remove duplicate addition of VLANs in replay path ice_restore_vlan and active_vlans were originally put in place to reprogram VLAN filters in the replay path. This is now done as part of the much broader VSI rebuild/replay framework. So remove both ice_restore_vlan and active_vlans Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Victor Raj	33e055fcc2	ice: Free VSI contexts during for unload In the unload path, all VSIs are freed. Also free the related VSI contexts to prevent memory leaks. Signed-off-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Akeem G Abodunrin	0f5d4c21a5	ice: Fix dead device link issue with flow control Setting Rx or Tx pause parameter currently results in link loss on the interface, requiring the platform/host to be cold power cycled. Fix it. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Anirudh Venkataramanan	afd9d4ab58	ice: Check for reset in progress during remove The remove path does not currently check to see if a reset is in progress before proceeding. This can cause a resource collision resulting in various types of errors. Check for reset in progress and wait for a reasonable amount of time before allowing the remove to progress. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:46 -08:00
Anirudh Venkataramanan	ce317dd9f8	ice: Set carrier state and start/stop queues in rebuild Set the carrier state post rebuild by querying the link status. Also start/stop queues based on link status. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:46 -08:00
Anirudh Venkataramanan	4f4be03bde	ice: Poll for link status change When the physical link goes up or down, the driver is supposed to receive a link status event (LSE). The driver currently has the code to handle LSEs but there is no firmware support for this feature yet. So this patch adds the ability for the driver to poll for link status changes. The polling itself is done in ice_watchdog_subtask. For namespace cleanliness, this patch also removes code that handles LSE. This code will be reintroduced once the feature is officially supported. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-24 14:30:40 -07:00
Anirudh Venkataramanan	982b121918	ice: Allocate VF interrupts and set queue map Allocate VF interrupts using VPINT_ALLOC_PCI. Multiple interrupts are specified as a range from "first" to "last". Also, according to the spec, the queue mapping for a VF needs to be set in both contig and scatter queue modes. So make this change as well. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-24 14:30:35 -07:00
Anirudh Venkataramanan	f203dca363	ice: Introduce ice_dev_onetime_setup ice_dev_onetime_setup contains a couple of driver workarounds for current firmware limitations. These workarounds are expected to go away once these limitations are fixed in the firmware. On a firmware release that has these issues addressed, these workarounds (while unnecessary) will not break anything. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-24 14:29:55 -07:00
Anirudh Venkataramanan	99189e8b6b	ice: Use capability count returned by the firmware The firmware now returns the capability count in the command buffer. Use it. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-24 14:00:05 -07:00
Anirudh Venkataramanan	ac5a8aef11	ice: Update expected FW version Update to the current firmware major and minor version which are 1 and 3 respectively. Also remove an empty comment line. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-24 13:56:37 -07:00
Anirudh Venkataramanan	633d7449a3	ice: Change device ID define names to align with branding string Basically remove references to C810 and use E810C (from the branding string) instead. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-24 13:53:30 -07:00
Anirudh Venkataramanan	f3aaaaaae2	ice: Make ice_msix_clean_rings static commit `158a08a694` ("ice: remove ndo_poll_controller") removed ice_netpoll and introduced a namespace warning for ice_msix_clean_rings. Fix the namespace warning by making ice_msix_clean_rings static. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-24 13:35:36 -07:00
Anirudh Venkataramanan	5cc6c8b30c	ice: Update version string Update version string to 0.7.2-k Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-03 07:42:30 -07:00
Dave Ertman	124cd54796	ice: Use the right function to enable/disable VSI The ice_ena/dis_vsi should have a single differentiating factor to determine if the netdev_ops call is used or a direct call to ice_vsi_open/close. This is if the netif is running or not. If netif is running, use ndo_open/ndo_close. Else, use ice_vsi_open/ice_vsi_close. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-03 07:42:30 -07:00
Brett Creeley	d2b464a7ff	ice: Add more flexibility on how we assign an ITR index This issue came about when looking at the VF function ice_vc_cfg_irq_map_msg. Currently we are assigning the itr_setting value to the itr_idx received from the AVF driver, which is not correct and is not used for the VF flow anyway. Currently the only way we set the ITR index for both the PF and VF driver is by hard coding ICE_TX_ITR or ICE_RX_ITR for the ITR index on each q_vector. To fix this, add the member itr_idx in struct ice_ring_container. This can then be used to dynamically program the correct ITR index. This change also affected the PF driver so make the necessary changes there as well. Also, removed the itr_setting member in struct ice_ring because it is not being used meaningfully and is going to be removed in a future patch that includes dynamic ITR. On another note, this will be useful moving forward if we decide to split Rx/Tx rings on different q_vectors instead of sharing them as queue pairs. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-03 07:42:30 -07:00
Dave Ertman	072f0c3db9	ice: Fix potential null pointer issues Add checks in the filter handling flow to avoid dereferencing NULL pointers. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-03 07:42:30 -07:00
Brett Creeley	c60cdb13ec	ice: Add code to go from ICE_FWD_TO_VSI_LIST to ICE_FWD_TO_VSI When a switch rule is initially created we set the filter action to ICE_FWD_TO_VSI. The filter action changes to ICE_FWD_TO_VSI_LIST whenever more than one VSI is subscribed to the same switch rule. When the switch rule goes from 2 VSIs in the list to 1 VSI we remove and delete the VSI list rule, but we currently don't update the switch rule in hardware. This is causing switch rules to be lost, so fix that by making a call to ice_update_pkt_fwd_rule() with the necessary changes. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-10-03 07:42:30 -07:00

... 21 22 23 24 25 ...

2338 Commits