mirror_ubuntu-kernels

mirror of https://git.proxmox.com/git/mirror_ubuntu-kernels.git synced 2025-12-25 01:11:44 +00:00

Author	SHA1	Message	Date
Vasundhara Volam	7154917a12	bnxt_en: Refactor bnxt_dl_info_get(). Add a new function bnxt_dl_info_put() to simplify the code, as there are more stored firmware version fields to be added in the next patch. Also, rename fw_ver variable name to ncsi_ver for better naming while copying to devlink info_get cb. v2: Ensure active_pkg_name string is NULL terminated when copied to devlink. Return directly from the last call to bnxt_dl_info_put(). Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1602493854-29283-9-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 14:27:03 -07:00
Vasundhara Volam	4933f6753b	bnxt_en: Add bnxt_hwrm_nvm_get_dev_info() to query NVM info. Add a new bnxt_hwrm_nvm_get_dev_info() to query firmware version information via NVM_GET_DEV_INFO firmware command. Use it to get the running version of the NVM configuration information. This new function will also be used in subsequent patches to get the stored firmware versions. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1602493854-29283-8-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 14:27:03 -07:00
Michael Chan	8eddb3e7ce	bnxt_en: Log unknown link speed appropriately. If the VF virtual link is set to always enabled, the speed may be unknown when the physical link is down. The driver currently logs the link speed as 4294967295 Mbps which is SPEED_UNKNOWN. Modify the link up log message as "speed unknown" which makes more sense. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1602493854-29283-7-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 14:27:03 -07:00
Michael Chan	c966c67c09	bnxt_en: Log event_data1 and event_data2 when handling RESET_NOTIFY event. Log these values that contain useful firmware state information. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1602493854-29283-6-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 14:27:03 -07:00
Michael Chan	03ab8ca1e9	bnxt_en: Simplify bnxt_async_event_process(). event_data1 and event_data2 are used when processing most events. Store these in local variables at the beginning of the function to simplify many of the case statements. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1602493854-29283-5-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 14:27:03 -07:00
Michael Chan	8fb35cd302	bnxt_en: Set driver default message level. Currently, bp->msg_enable has default value of 0. It is more useful to have the commonly used NETIF_MSG_DRV and NETIF_MSG_HW enabled by default. v2: Change the fall back bnxt_reset_task() inside bnxt_rx_ring_reset() to silent mode. With older fw, we would take the fall back path and it would be very noisy. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1602493854-29283-4-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 14:27:03 -07:00
Vasundhara Volam	6896cb35ee	bnxt_en: Enable online self tests for multi-host/NPAR mode. Online self tests are not disruptive and can be run in NPAR mode and in multi-host NIC as well. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1602493854-29283-3-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 14:27:02 -07:00
Vasundhara Volam	cf223bfaf7	bnxt_en: Return -EROFS to user space, if NVM writes are not permitted. If NVRAM resources are locked, NVM writes are not permitted. In such scenarios, firmware returns HWRM_ERR_CODE_RESOURCE_LOCKED error to firmware commands. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1602493854-29283-2-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 14:27:02 -07:00
Marek Vasut	64a632da53	net: fec: Fix phy_device lookup for phy_reset_after_clk_enable() The phy_reset_after_clk_enable() is always called with ndev->phydev, however that pointer may be NULL even though the PHY device instance already exists and is sufficient to perform the PHY reset. This condition happens in fec_open(), where the clock must be enabled first, then the PHY must be reset, and then the PHY IDs can be read out of the PHY. If the PHY still is not bound to the MAC, but there is OF PHY node and a matching PHY device instance already, use the OF PHY node to obtain the PHY device instance, and then use that PHY device instance when triggering the PHY reset. Fixes: `1b0a83ac04` ("net: fec: add phy_reset_after_clk_enable() support") Signed-off-by: Marek Vasut <marex@denx.de> Cc: Christoph Niedermaier <cniedermaier@dh-electronics.com> Cc: David S. Miller <davem@davemloft.net> Cc: NXP Linux Team <linux-imx@nxp.com> Cc: Richard Leitner <richard.leitner@skidata.com> Cc: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 14:16:30 -07:00
Jonathan Lemon	b2b8a92733	mlx4: handle non-napi callers to napi_poll netcons calls napi_poll with a budget of 0 to transmit packets. Handle this by: - skipping RX processing - do not try to recycle TX packets to the RX cache Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 14:02:32 -07:00
Valentin Vidic	3af5f0f5c7	net: korina: fix kfree of rx/tx descriptor array kmalloc returns KSEG0 addresses so convert back from KSEG1 in kfree. Also make sure array is freed when the driver is unloaded from the kernel. Fixes: `ef11291bcd` ("Add support the Korina (IDT RC32434) Ethernet MAC") Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 10:05:48 -07:00
Vladimir Oltean	70edfae15a	net: mscc: ocelot: offload VLAN mangle action to VCAP IS1 The VCAP_IS1_ACT_VID_REPLACE_ENA action, from the VCAP IS1 ingress TCAM, changes the classified VLAN. We are only exposing this ability for switch ports that are under VLAN aware bridges. This is because in standalone ports mode and under a bridge with vlan_filtering=0, the ocelot driver configures the switch to operate as VLAN-unaware, so the classified VLAN is not derived from the 802.1Q header from the packet, but instead is always equal to the port-based VLAN ID of the ingress port. We _can_ still change the classified VLAN for packets when operating in this mode, but the end result will most likely be a drop, since both the ingress and the egress port need to be members of the modified VLAN. And even if we install the new classified VLAN into the VLAN table of the switch, the result would still not be as expected: we wouldn't see, on the output port, the modified VLAN tag, but the original one, even though the classified VLAN was indeed modified. This is because of how the hardware works: on egress, what is pushed to the frame is a "port tag", which gives us the following options: - Tag all frames with port tag (derived from the classified VLAN) - Tag all frames with port tag, except if the classified VLAN is 0 or equal to the native VLAN of the egress port - No port tag Needless to say, in VLAN-unaware mode we are disabling the port tag. Otherwise, the existing VLAN tag would be ignored, and a second VLAN tag (the port tag), holding the classified VLAN, would be pushed (instead of replacing the existing 802.1Q tag). This is definitely not what the user wanted when installing a "vlan modify" action. So it is simply not worth bothering with VLAN modify rules under other configurations except when the ports are fully VLAN-aware. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-11 11:19:04 -07:00
Claudiu Manoil	71b77a7a27	enetc: Migrate to PHYLINK and PCS_LYNX This is a methodical transition of the driver from phylib to phylink, following the guidelines from sfp-phylink.rst. The MAC register configurations based on interface mode were moved from the probing path to the mac_config() hook. MAC enable and disable commands (enabling Rx and Tx paths at MAC level) were also extracted and assigned to their corresponding phylink hooks. As part of the migration to phylink, the serdes configuration from the driver was offloaded to the PCS_LYNX module, introduced in commit `0da4c3d393` ("net: phy: add Lynx PCS module"), the PCS_LYNX module being a mandatory component required to make the enetc driver work with phylink. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Ioana Ciornei <ioana.cionei@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-11 11:04:42 -07:00
Claudiu Manoil	46456ccfd9	enetc: Clean up serdes configuration Decouple internal mdio bus creation from serdes configuration, as a prerequisite to offloading serdes configuration to a different module. Group together mdio bus creation routines, cleanup. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-11 11:04:42 -07:00
Claudiu Manoil	08f90fc9d1	enetc: Clean up MAC and link configuration Decouple level MAC configuration based on phy interface type from general port configuration. Group together MAC and link configuration code. Decouple external mdio bus creation from interface type parsing. No longer return an (unhandled) error code when phy_node not found, use phy_node to indicate whether the port has a phy or not. No longer fall-through when serdes configuration fails for the link modes that require internal link configuration. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-11 11:04:42 -07:00
Maxim Kochetkov	fea9b31e25	dpaa_eth: enable NETIF_MSG_HW by default When packets are received on the error queue, this function under net_ratelimit(): netif_err(priv, hw, net_dev, "Err FD status = 0x%08x\n"); does not get printed. Instead we only see: [ 3658.845592] net_ratelimit: 244 callbacks suppressed [ 3663.969535] net_ratelimit: 230 callbacks suppressed [ 3669.085478] net_ratelimit: 228 callbacks suppressed Enabling NETIF_MSG_HW fixes this issue, and we can see some information about the frame descriptors of packets. Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-10 10:49:56 -07:00
Heiner Kallweit	8d6112f0a0	r8169: factor out handling rtl8169_stats Factor out handling the private packet/byte counters to new functions rtl_get_priv_stats() and rtl_inc_priv_stats(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-10 10:46:07 -07:00
Gustavo A. R. Silva	f6e5ee6a2f	net: thunderx: Use struct_size() helper in kmalloc() Make use of the new struct_size() helper instead of the offsetof() idiom. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-10 10:34:03 -07:00
Naoki Hayama	394039fe2c	net: tlan: Fix typo abitrary Fix comment typo. s/abitrary/arbitrary/ Signed-off-by: Naoki Hayama <naoki.hayama@lineo.co.jp> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 16:30:03 -07:00
Dan Nowlin	051d2b5cfa	ice: fix adding IP4 IP6 Flow Director rules A subsequent addition of an IP4 or IP6 rule after other rules would overwrite any existing TCAM entries of related L4 protocols(ex: tcp4 or udp6). This was due to the mask including too many TCAM entries. Add new packet type masks with bits properly excluded so rules are not overwritten. Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Brijesh Behera <brijeshx.behera@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Bixuan Cui	ecfb751f1a	ice: Fix pointer cast warnings pointers should be casted to unsigned long to avoid -Wpointer-to-int-cast warnings: drivers/net/ethernet/intel/ice/ice_flow.h:197:33: warning: cast from pointer to integer of different size drivers/net/ethernet/intel/ice/ice_flow.h:198:32: warning: cast to pointer from integer of different size Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Jacob Keller	1e8249cc9d	ice: add additional debug logging for firmware update While debugging a recent failure to update the flash of an ice device, I found it helpful to add additional logging which helped determine the root cause of the problem being a timeout issue. Add some extra dev_dbg() logging messages which can be enabled using the dynamic debug facility, including one for ice_aq_wait_for_event that will use jiffies to capture a rough estimate of how long we waited for the completion of a firmware command. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Brijesh Behera <brijeshx.behera@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Jacob Keller	48d40025b5	ice: refactor devlink_port to be per-VSI Currently, the devlink_port structure is stored within the ice_pf. This made sense because we create a single devlink_port for each PF. This setup does not mesh with the abstractions in the driver very well, and led to a flow where we accidentally call devlink_port_unregister twice during error cleanup. In particular, if devlink_port_register or devlink_port_unregister are called twice, this leads to a kernel panic. This appears to occur during some possible flows while cleaning up from a failure during driver probe. If register_netdev fails, then we will call devlink_port_unregister in ice_cfg_netdev as it cleans up. Later, we again call devlink_port_unregister since we assume that we must cleanup the port that is associated with the PF structure. This occurs because we cleanup the devlink_port for the main PF even though it was not allocated. We allocated the port within a per-VSI function for managing the main netdev, but did not release the port when cleaning up that VSI, the allocation and destruction are not aligned. Instead of attempting to manage the devlink_port as part of the PF structure, manage it as part of the PF VSI. Doing this has advantages, as we can match the de-allocation of the devlink_port with the unregister_netdev associated with the main PF VSI. Moving the port to the VSI is preferable as it paves the way for handling devlink ports allocated for other purposes such as SR-IOV VFs. Since we're changing up how we allocate the devlink_port, also change the indexing. Originally, we indexed the port using the PF id number. This came from an old goal of sharing a devlink for each physical function. Managing devlink instances across multiple function drivers is not workable. Instead, lets set the port number to the logical port number returned by firmware and set the index using the VSI index (sometimes referred to as VSI handle). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Jacob Keller	410d06879c	ice: add the DDP Track ID to devlink info Add "fw.app.bundle_id" to display the DDP Track ID of the active DDP package. This id is similar to "fw.bundle_id" and is a unique identifier for the DDP package that is loaded in the device. Each new DDP has a unique Track ID generated for it, and the ID can be used to identify and track the DDP package. Add documentation for the new devlink info version. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Anirudh Venkataramanan	045afac407	ice: Change ice_info_get_dsn to be void ice_info_get_dsn always returns 0, so just make it void. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Bruce Allan	ac382a0944	ice: remove repeated words A new test in checkpatch detects repeated words; cleanup all pre-existing occurrences of those now. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Co-developed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Andy Shevchenko	4d7ebed6aa	ice: devlink: use %phD to print small buffer Use %phD format to print small buffer as hex string. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 13:14:19 -07:00
Moshe Shemesh	bef878e865	net/mlx5: Add support for devlink reload limit no reset Add support for devlink reload action fw_activate with reload limit no_reset which does firmware live patching, updating the firmware image without reset, no downtime and no configuration lose. The driver checks if the firmware is capable of handling the pending firmware changes as a live patch. If it is then it triggers firmware live patching flow. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:53 -07:00
Moshe Shemesh	2d69356752	net/mlx5: Add support for fw live patch event Firmware live patch event notifies the driver that the firmware was just updated using live patch. In such case the driver should not reload or re-initiate entities, part to updating the firmware version and re-initiate the firmware tracer which can be updated by live patch with new strings database to help debugging an issue. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:53 -07:00
Moshe Shemesh	b4f7cbb367	net/mlx5: Add devlink param enable_remote_dev_reset support The enable_remote_dev_reset devlink param flags that the host admin allows resets by other hosts. In case it is cleared mlx5 host PF driver will send NACK on pci sync for firmware update reset request and the command will fail. By default enable_remote_dev_reset parameter is true, so pci sync for firmware update reset is enabled. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:53 -07:00
Moshe Shemesh	5ec697446f	net/mlx5: Add support for devlink reload action fw activate Add support for devlink reload action fw_activate. To activate firmware image the mlx5 driver resets the firmware and reloads it from flash. If a new image was stored on flash it will be loaded. Once this reload command is executed the driver initiates fw sync reset flow, where the firmware synchronizes all PFs on coming reset and driver reload. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:53 -07:00
Moshe Shemesh	7dd6df329d	net/mlx5: Handle sync reset abort event If firmware sends sync_reset_abort to driver the driver should clear the reset requested mode as reset is not expected any more. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:53 -07:00
Moshe Shemesh	eabe8e5e88	net/mlx5: Handle sync reset now event On sync_reset_now event the driver does reload and PCI link toggle to activate firmware upgrade reset. When the firmware sends this event it syncs the event on all PFs, so all PFs will do PCI link toggle at once. To do PCI link toggle, the driver ensures that no other device ID under the same bridge by checking that all the PF functions under the same PCI bridge have same device ID. If no other device it uses PCI bridge link control to turn link down and up. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:52 -07:00
Moshe Shemesh	38b9f903f2	net/mlx5: Handle sync reset request event Once the driver gets sync_reset_request from firmware it prepares for the coming reset and sends acknowledge. After getting this event the driver expects device reset, either it will trigger PCI reset on sync_reset_now event or such PCI reset will be triggered by another PF of the same device. So it moves to reset requested mode and if it gets PCI reset triggered by the other PF it detect the reset and reloads. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:52 -07:00
Moshe Shemesh	e7f4d0bcb8	net/mlx5: Set cap for pci sync for fw update event Set capability to notify the firmware that this host driver is capable of handling pci sync for firmware update events. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:52 -07:00
Moshe Shemesh	3180472f58	net/mlx5: Add functions to set/query MFRL register Add functions to query and set the MFRL reset options supported by firmware. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:52 -07:00
Moshe Shemesh	dc64cc7c63	devlink: Add devlink reload limit option Add reload limit to demand restrictions on reload actions. Reload limits supported: no_reset: No reset allowed, no down time allowed, no link flap and no configuration is lost. By default reload limit is unspecified and so no constraints on reload actions are required. Some combinations of action and limit are invalid. For example, driver can not reinitialize its entities without any downtime. The no_reset reload limit will have usecase in this patchset to implement restricted fw_activate on mlx5. Have the uapi parameter of reload limit ready for future support of multiselection. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:52 -07:00
Moshe Shemesh	ccdf07219d	devlink: Add reload action option to devlink reload command Add devlink reload action to allow the user to request a specific reload action. The action parameter is optional, if not specified then devlink driver re-init action is used (backward compatible). Note that when required to do firmware activation some drivers may need to reload the driver. On the other hand some drivers may need to reset the firmware to reinitialize the driver entities. Therefore, the devlink reload command returns the actions which were actually performed. Reload actions supported are: driver_reinit: driver entities re-initialization, applying devlink-param and devlink-resource values. fw_activate: firmware activate. command examples: $devlink dev reload pci/0000:82:00.0 action driver_reinit reload_actions_performed: driver_reinit $devlink dev reload pci/0000:82:00.0 action fw_activate reload_actions_performed: driver_reinit fw_activate Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 12:06:52 -07:00
Marek Vasut	0da1ccbbef	net: fec: Fix PHY init after phy_reset_after_clk_enable() The phy_reset_after_clk_enable() does a PHY reset, which means the PHY loses its register settings. The fec_enet_mii_probe() starts the PHY and does the necessary calls to configure the PHY via PHY framework, and loads the correct register settings into the PHY. Therefore, fec_enet_mii_probe() should be called only after the PHY has been reset, not before as it is now. Fixes: `1b0a83ac04` ("net: fec: add phy_reset_after_clk_enable() support") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Richard Leitner <richard.leitner@skidata.com> Signed-off-by: Marek Vasut <marex@denx.de> Cc: Christoph Niedermaier <cniedermaier@dh-electronics.com> Cc: David S. Miller <davem@davemloft.net> Cc: NXP Linux Team <linux-imx@nxp.com> Cc: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-09 08:17:11 -07:00
Vladimir Oltean	de997e545d	net: mscc: ocelot: add missing VCAP ES0 and IS1 regmaps for VSC7514 Without these definitions, the driver will crash in: mscc_ocelot_probe -> ocelot_init -> ocelot_vcap_init -> __ocelot_target_read_ix I missed this because I did not have the VSC7514 hardware to test, only the VSC9959 and VSC9953, and the probing part is different. Fixes: `e3aea296d8` ("net: mscc: ocelot: add definitions for VCAP ES0 keys, actions and target") Fixes: `a61e365d7c` ("net: mscc: ocelot: add definitions for VCAP IS1 keys, actions and target") Reported-by: Divya Koppera <Divya.Koppera@microchip.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-08 17:52:19 -07:00
Allen Pais	0eb484ee49	cxgb4: convert tasklets to use new tasklet_setup() API In preparation for unconditionally passing the struct tasklet_struct pointer to all tasklet callbacks, switch to using the new tasklet_setup() and from_tasklet() to pass the tasklet pointer explicitly. Signed-off-by: Romain Perier <romain.perier@gmail.com> Signed-off-by: Allen Pais <apais@linux.microsoft.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-08 16:48:21 -07:00
Jakub Kicinski	9d49aea13f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Small conflict around locking in rxrpc_process_event() - channel_lock moved to bundle in next, while state lock needs _bh() from net. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-08 15:44:50 -07:00
Heiner Kallweit	47dda78671	r8169: consider that PHY reset may still be in progress after applying firmware Some firmware files trigger a PHY soft reset and don't wait for it to be finished. PHY register writes directly after applying the firmware may fail or provide unexpected results therefore. Fix this by waiting for bit BMCR_RESET to be cleared after applying firmware. There's nothing wrong with the referenced change, it's just that the fix will apply cleanly only after this change. Fixes: `89fbd26cca` ("r8169: fix firmware not resetting tp->ocp_base") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-08 12:20:51 -07:00
Igor Russkikh	60db5e408e	net: atlantic: implement media detect feature via phy tunables Mediadetect is another name for the EDPD (energy detect power down). This feature allows device to save extra power when no link is available. PHY goes into the extreme power saving mode and only periodically wakes up and checks for the link. AQC devices has fixed check period of 6 seconds The feature may increase linkup time. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-06 06:16:01 -07:00
Igor Russkikh	e193c3ab83	net: atlantic: implement phy downshift feature PHY downshift allows phy to try renegotiate if link is unstable and can carry higher speed. AQC devices has integrated PHY which is controlled by MAC firmware. Thus, driver defines new ethtool callbacks to implement phy tunables via netdev. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-06 06:16:01 -07:00
Vladimir Oltean	0132649366	net: mscc: ocelot: warn when encoding an out-of-bounds watermark value There is an upper bound to the value that a watermark may hold. That upper bound is not immediately obvious during configuration, and it might be possible to have accidental truncation. Actually this has happened already, add a warning to prevent it from happening again. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-06 06:05:47 -07:00
Vladimir Oltean	601e984f23	net: mscc: ocelot: divide watermark value by 60 when writing to SYS_ATOP Tail dropping is enabled for a port when: 1. A source port consumes more packet buffers than the watermark encoded in SYS:PORT:ATOP_CFG.ATOP. AND 2. Total memory use exceeds the consumption watermark encoded in SYS:PAUSE_CFG:ATOP_TOT_CFG. The unit of these watermarks is a 60 byte memory cell. That unit is programmed properly into ATOP_TOT_CFG, but not into ATOP. Actually when written into ATOP, it would get truncated and wrap around. Fixes: `a556c76adc` ("net: mscc: Add initial Ocelot switch support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-06 06:05:47 -07:00
David S. Miller	8b0308fe31	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Rejecting non-native endian BTF overlapped with the addition of support for it. The rest were more simple overlapping changes, except the renesas ravb binding update, which had to follow a file move as well as a YAML conversion. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-05 18:40:01 -07:00
Vladimir Oltean	2e554a7a5d	net: dsa: propagate switchdev vlan_filtering prepare phase to drivers A driver may refuse to enable VLAN filtering for any reason beyond what the DSA framework cares about, such as: - having tc-flower rules that rely on the switch being VLAN-aware - the particular switch does not support VLAN, even if the driver does (the DSA framework just checks for the presence of the .port_vlan_add and .port_vlan_del pointers) - simply not supporting this configuration to be toggled at runtime Currently, when a driver rejects a configuration it cannot support, it does this from the commit phase, which triggers various warnings in switchdev. So propagate the prepare phase to drivers, to give them the ability to refuse invalid configurations cleanly and avoid the warnings. Since we need to modify all function prototypes and check for the prepare phase from within the drivers, take that opportunity and move the existing driver restrictions within the prepare phase where that is possible and easy. Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: Woojung Huh <woojung.huh@microchip.com> Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com> Cc: Sean Wang <sean.wang@mediatek.com> Cc: Landen Chao <Landen.Chao@mediatek.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Vivien Didelot <vivien.didelot@gmail.com> Cc: Jonathan McDowell <noodles@earth.li> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-05 05:56:48 -07:00
Tom Rix	f4544e5361	net: mvneta: fix double free of txq->buf clang static analysis reports this problem: drivers/net/ethernet/marvell/mvneta.c:3465:2: warning: Attempt to free released memory kfree(txq->buf); ^~~~~~~~~~~~~~~ When mvneta_txq_sw_init() fails to alloc txq->tso_hdrs, it frees without poisoning txq->buf. The error is caught in the mvneta_setup_txqs() caller which handles the error by cleaning up all of the txqs with a call to mvneta_txq_sw_deinit which also frees txq->buf. Since mvneta_txq_sw_deinit is a general cleaner, all of the partial cleaning in mvneta_txq_sw_deinit()'s error handling is not needed. Fixes: `2adb719d74` ("net: mvneta: Implement software TSO") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 15:07:19 -07:00
Michael Chan	8d4bd96b54	bnxt_en: Eliminate unnecessary RX resets. Currently, the driver will schedule RX ring reset when we get a buffer error in the RX completion record. These RX buffer errors can be due to normal out-of-buffer conditions or a permanent error in the RX ring. Because the driver cannot distinguish between these 2 conditions, we assume all these buffer errors require reset. This is very disruptive when it is just a normal out-of-buffer condition. Newer firmware will now monitor the rings for the permanent failure and will send a notification to the driver when it happens. This allows the driver to reset only when such a notification is received. In environments where we have predominently out-of-buffer conditions, we now can avoid these unnecessary resets. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Michael Chan	1b5c8b63d6	bnxt_en: Reduce unnecessary message log during RX errors. There is logic in the RX path to detect unexpected handles in the RX completion. We'll print a warning and schedule a reset. The next expected handle is then set to 0xffff which is guaranteed to not match any valid handle. This will force all remaining packets in the ring to be discarded before the reset. There can be hundreds of these packets remaining in the ring and there is no need to print the warnings for these forced errors. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Michael Chan	8a27d4b9e5	bnxt_en: Add a software counter for RX ring reset. Add a per ring rx_resets counter to count these RX resets. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Michael Chan	8fbf58e17d	bnxt_en: Implement RX ring reset in response to buffer errors. On some older chips, it is necessary to do a reset when we get buffer errors associated with an RX ring. These buffer errors may become frequent if the RX ring underruns under heavy traffic. The current code does a global reset of all reasources when this happens. This works but creates a big disruption of all rings when one RX ring is having problem. This patch implements a localized RX ring reset of just the RX ring having the issue. All other rings including all TX rings will not be affected by this single RX ring reset. Only the older chips prior to the P5 class supports this reset. Because it is not a global reset, packets may still be arriving while we are calling firmware to reset that ring. We need to be sure that we don't post any buffers during this time while the ring is undergoing reset. After firmware completes successfully, the ring will be in the reset state with no buffers and we can start filling it with new buffers and posting them. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Michael Chan	7737d325f8	bnxt_en: Refactor bnxt_init_one_rx_ring(). bnxt_init_one_rx_ring() includes logic to initialize the BDs for one RX ring and to allocate the buffers. Separate the allocation logic into a new bnxt_alloc_one_rx_ring() function. The allocation function will be used later to allocate new buffers for one specified RX ring when we reset that RX ring. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Michael Chan	975bc99a4a	bnxt_en: Refactor bnxt_free_rx_skbs(). bnxt_free_rx_skbs() frees all the allocated buffers and SKBs for every RX ring. Refactor this function by calling a new function bnxt_free_one_rx_ring_skbs() to free these buffers on one specified RX ring at a time. This is preparation work for resetting one RX ring during run-time. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Michael Chan	fc8864e0b6	bnxt_en: Log FW health status info, if reset is aborted. If firmware does not come out of reset, log FW health status info to provide more information on firmware status. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Edwin Peer	87f7ab8d6f	bnxt_en: perform no master recovery during startup The NS3 SoC platforms require assistance from the OP-TEE to recover firmware if a crash occurs while no driver is bound. The CRASHED_NO_MASTER condition is recorded in the firmware status register during the crash to indicate when driver intervension is needed to coordinate a firmware reload. This condition is detected during early driver initialization in order to effect a firmware fastboot on supported platforms when necessary. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Edwin Peer	ba02629ff6	bnxt_en: log firmware status on firmware init failure Firmware now supports device independent discovery of the status register location. This status register can provide more detailed information about firmware errors, especially if problems occur before the HWRM interface is functioning. Attempt to map this register if it is present and report the firmware status on firmware init failures. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Edwin Peer	3e9ec2bb93	bnxt_en: refactor bnxt_alloc_fw_health() The allocator for the firmware health structure conflates allocation and capability checks, limiting the reusability of the code. This patch separates out the capability check and disablement and improves the warning message to better describe the consequences of an allocation failure. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Vasundhara Volam	424174f14e	bnxt_en: Update firmware interface spec to 1.10.1.68. Main changes is to extend hwrm_nvm_get_dev_info_output() for stored firmware versions and a new flag is added to fw_status_reg. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-04 14:41:05 -07:00
Gustavo A. R. Silva	93e6664e66	bnx2x: Use fallthrough pseudo-keyword Replace /* no break */ comments with the new pseudo-keyword macro fallthrough[1]. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-03 17:33:23 -07:00
Gustavo A. R. Silva	401d8ce4ae	net: ksz884x: Use fallthrough pseudo-keyword Replace /* Fallthrough... */ comment with the new pseudo-keyword macro fallthrough[1]. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-03 17:33:23 -07:00
Gustavo A. R. Silva	e55e66e8ae	net: bna: Use fallthrough pseudo-keyword Replace /* !!! fall through !!! */ comments with the new pseudo-keyword macro fallthrough[1]. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-03 17:33:23 -07:00
Christophe JAILLET	790ca79d3e	net: typhoon: Fix a typo Typoon --> Typhoon s/Typoon/Typhoon/ Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-03 17:23:02 -07:00
Randy Dunlap	1f7e877c20	net: hinic: fix DEVLINK build errors Fix many (lots deleted here) build errors in hinic by selecting NET_DEVLINK. ld: drivers/net/ethernet/huawei/hinic/hinic_hw_dev.o: in function `mgmt_watchdog_timeout_event_handler': hinic_hw_dev.c:(.text+0x30a): undefined reference to `devlink_health_report' ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_fw_reporter_dump': hinic_devlink.c:(.text+0x1c): undefined reference to `devlink_fmsg_u32_pair_put' ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_fw_reporter_dump': hinic_devlink.c:(.text+0x126): undefined reference to `devlink_fmsg_binary_pair_put' ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_hw_reporter_dump': hinic_devlink.c:(.text+0x1ba): undefined reference to `devlink_fmsg_string_pair_put' ld: hinic_devlink.c:(.text+0x227): undefined reference to `devlink_fmsg_u8_pair_put' ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_devlink_alloc': hinic_devlink.c:(.text+0xaee): undefined reference to `devlink_alloc' ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_devlink_free': hinic_devlink.c:(.text+0xb04): undefined reference to `devlink_free' ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_devlink_register': hinic_devlink.c:(.text+0xb26): undefined reference to `devlink_register' ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_devlink_unregister': hinic_devlink.c:(.text+0xb46): undefined reference to `devlink_unregister' ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_health_reporters_create': hinic_devlink.c:(.text+0xb75): undefined reference to `devlink_health_reporter_create' ld: hinic_devlink.c:(.text+0xb95): undefined reference to `devlink_health_reporter_create' ld: hinic_devlink.c:(.text+0xbac): undefined reference to `devlink_health_reporter_destroy' ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_health_reporters_destroy': Fixes: `51ba902a16` ("net-next/hinic: Initialize hw interface") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Bin Luo <luobin9@huawei.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Aviad Krawczyk <aviad.krawczyk@huawei.com> Cc: Zhao Chen <zhaochen6@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-03 16:52:19 -07:00
Vineetha G. Jaya Kumaran	388e201d41	net: stmmac: Modify configuration method of EEE timers Ethtool manual stated that the tx-timer is the "the amount of time the device should stay in idle mode prior to asserting its Tx LPI". The previous implementation for "ethtool --set-eee tx-timer" sets the LPI TW timer duration which is not correct. Hence, this patch fixes the "ethtool --set-eee tx-timer" to configure the EEE LPI timer. The LPI TW Timer will be using the defined default value instead of "ethtool --set-eee tx-timer" which follows the EEE LS timer implementation. Changelog V2 Not removing/modifying the eee_timer. EEE LPI timer can be configured through ethtool and also the eee_timer module param. *EEE TW Timer will be configured with default value only, not able to be configured through ethtool or module param. This follows the implementation of the EEE LS Timer. Fixes: `d765955d2a` ("stmmac: add the Energy Efficient Ethernet support") Signed-off-by: Vineetha G. Jaya Kumaran <vineetha.g.jaya.kumaran@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-03 16:40:25 -07:00
Ioana Ciornei	061d631f7d	dpaa2-eth: add support for devlink parser error drop traps Add support for the new group of devlink traps - PARSER_ERROR_DROPS. This consists of registering the array of parser error drops supported, controlling their action through the .trap_group_action_set() callback and reporting an erroneous skb received on the error queue appropriately. DPAA2 devices do not support controlling the action of independent parser error traps, thus the .trap_action_set() callback just returns an EOPNOTSUPP while .trap_group_action_set() actually notifies the hardware what it should do with a frame marked as having a header error. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:31:56 -07:00
Ioana Ciornei	ceeb03ad8e	dpaa2-eth: add basic devlink support Add basic support in dpaa2-eth for devlink. For the moment, just register the device with devlink, add the corresponding devlink port and implement the .info_get() callback. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:31:56 -07:00
Shannon Nelson	9e15410dc7	ionic: add new bad firmware error code If the new firmware image downladed for update is corrupted or is a bad format, the download process will report a status code specifically for that. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:30:01 -07:00
Shannon Nelson	bb9f80f31d	ionic: use lif ident for filter count Use the lif's ident information for the uc and mc filter counts rather than the ionic's version, to be sure we're getting the info that is specific to this lif. While we're thinking about it, add some missing error checking where we get the lif's identity information. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:30:01 -07:00
Shannon Nelson	a21b5d49e7	ionic: refill lif identity after fw_up After we do a fw upgrade and refill the ionic->ident.dev, we also need to update the other identity info. Since the lif identity needs to be updated each time the ionic identity is refreshed, we can pull it into ionic_identify(). The debugfs entry is moved so that it doesn't cause an error message when the data is refreshed after the fw upgrade. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:30:01 -07:00
Shannon Nelson	ba6ab8aca2	ionic: disable all queue napi contexts on timeout Some time ago we short-circuited the queue disables on a timeout error in order to not have to wait on every queue when we already know it will time out. However, this meant that we're not properly stopping all the interrupts and napi contexts. This changes queue disable to always call ionic_qcq_disable() and to give it an argument to know when to not do the adminq request. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:30:01 -07:00
Shannon Nelson	7c737fc43c	ionic: check qcq ptr in ionic_qcq_disable There are a couple of error recovery paths that can come through ionic_qcq_disable() without having set up the qcq, so we need to make sure we have a valid qcq pointer before using it. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:30:01 -07:00
Shannon Nelson	2c580d7783	ionic: clear linkcheck bit on alloc fail Clear our link check requested flag on an allocation error. We end up dropping this link check request, but that should be fine as our watchdog will come back a few seconds later and request it again. Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:30:01 -07:00
Shannon Nelson	52733cff9b	ionic: drain the work queue Check through our work list for additional items. This normally will only have one item, but occasionally may have another job waiting. There really is no need reschedule ourself here. Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:30:01 -07:00
Shannon Nelson	9576a36cc1	ionic: contiguous memory for notifyq The event notification queue is set up a little differently in the NIC and so the notifyq q and cq descriptor structures need to be contiguous, which got missed in an earlier patch that separated out the q and cq descriptor allocations. That patch was aimed at making the big tx and rx descriptor queue allocations easier to manage - the notifyq is much smaller and doesn't need to be split. This patch simply adds an if/else and slightly different code for the notifyq descriptor allocation. Fixes: `ea5a8b09dc` ("ionic: reduce contiguous memory allocation requirement") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:30:01 -07:00
David S. Miller	ab0faf5f04	mlx5-fixes-2020-09-30 -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl93ap4ACgkQSD+KveBX +j5+eAf/dvxx+WyWr5pV3gxd0x7K/wV+F1JFVe99k8yH6kYbpo56U+oRQGP4kvnG 4Ggb/XE7hSahvReRVD5vn4LKk2RQ/GMWEurF/GQDPklaHZyZHtcI3+2C/azEHnf+ vGgbM1xDT0gNZoa+2pA7LBgruJF/k+gRbth6EHrjlcxqiqt2k4d5Hs0m/Xd5R0TC D+Yks3uHAOrTiP2idOWNoWmd5AOmh802wX0w4iyKZ9ZfJGMN3t2AKyZDIhNfPyPf avebIihkr5y5DsYGZE+HjZjK0+vXaKVAGgzDbeLZ2sPVdCJJAFRZptG6mPJ0d0N3 TPwcBZcs5BsJkEQ5XqBA0IwJjksVeA== =421I -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2020-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux From: Saeed Mahameed <saeedm@nvidia.com> ==================== This series introduces some fixes to mlx5 driver. v1->v2: - Patch #1 Don't return while mutex is held. (Dave) v2->v3: - Drop patch #1, will consider a better approach (Jakub) - use cpu_relax() instead of cond_resched() (Jakub) - while(i--) to reveres a loop (Jakub) - Drop old mellanox email sign-off and change the committer email (Jakub) Please pull and let me know if there is any problem. For -stable v4.15 ('net/mlx5e: Fix VLAN cleanup flow') ('net/mlx5e: Fix VLAN create flow') For -stable v4.16 ('net/mlx5: Fix request_irqs error flow') For -stable v5.4 ('net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU') ('net/mlx5: Avoid possible free of command entry while timeout comp handler') For -stable v5.7 ('net/mlx5e: Fix return status when setting unsupported FEC mode') For -stable v5.8 ('net/mlx5e: Fix race condition on nhe->n pointer in neigh update') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 16:14:21 -07:00
Vladimir Oltean	16a7a15f4b	net: mscc: ocelot: offload redirect action to VCAP IS2 Via the OCELOT_MASK_MODE_REDIRECT flag put in the IS2 action vector, it is possible to replace previous forwarding decisions with the port mask installed in this rule. I have studied Table 54 "MASK_MODE and PORT_MASK Combinations" from the VSC7514 documentation and it appears to behave sanely when this rule is installed in either lookup 0 or 1. Namely, a redirect in lookup 1 will overwrite the forwarding decision taken by any entry in lookup 0. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 15:40:30 -07:00
Vladimir Oltean	f854e6f6f4	net: mscc: ocelot: relax ocelot_exclusive_mac_etype_filter_rules() The issue which led to the introduction of this check was that MAC_ETYPE rules, such as filters on dst_mac and src_mac, would only match non-IP frames. There is a knob in VCAP_S2_CFG which forces all IP frames to be treated as non-IP, which is what we're currently doing if the user requested a dst_mac filter, in order to maintain sanity. But that knob is actually per IS2 lookup. And the good thing with exposing the lookups to the user via tc chains is that we're now able to offload MAC_ETYPE keys to one lookup, and IP keys to the other lookup. So let's do that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 15:40:30 -07:00
Vladimir Oltean	226e9cd82a	net: mscc: ocelot: only install TCAM entries into a specific lookup and PAG We were installing TCAM rules with the LOOKUP field as unmasked, meaning that all entries were matching on all lookups. Now that lookups are exposed as individual chains, let's make the LOOKUP explicit when offloading TCAM entries. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 15:40:30 -07:00
Xiaoliang Yang	2f17c050d8	net: mscc: ocelot: offload egress VLAN rewriting to VCAP ES0 VCAP ES0 is an egress VCAP operating on all outgoing frames. This patch added ES0 driver to support vlan push action of tc filter. Usage: tc filter add dev swp1 egress protocol 802.1Q flower indev swp0 skip_sw \ vlan_id 1 vlan_prio 1 action vlan push id 2 priority 2 Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 15:40:30 -07:00
Xiaoliang Yang	75944fda1d	net: mscc: ocelot: offload ingress skbedit and vlan actions to VCAP IS1 VCAP IS1 is a VCAP module which can filter on the most common L2/L3/L4 Ethernet keys, and modify the results of the basic QoS classification and VLAN classification based on those flow keys. There are 3 VCAP IS1 lookups, mapped over chains 10000, 11000 and 12000. Currently the driver is hardcoded to use IS1_ACTION_TYPE_NORMAL half keys. Note that the VLAN_MANGLE has been omitted for now. In hardware, the VCAP_IS1_ACT_VID_REPLACE_ENA field replaces the classified VLAN (metadata associated with the frame) and not the VLAN from the header itself. There are currently some issues which need to be addressed when operating in standalone, or in bridge with vlan_filtering=0 modes, because in those cases the switch ports have VLAN awareness disabled, and changing the classified VLAN to anything other than the pvid causes the packets to be dropped. Another issue is that on egress, we expect port tagging to push the classified VLAN, but port tagging is disabled in the modes mentioned above, so although the classified VLAN is replaced, it is not visible in the packet transmitted by the switch. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 15:40:30 -07:00
Vladimir Oltean	1397a2eb52	net: mscc: ocelot: create TCAM skeleton from tc filter chains For Ocelot switches, there are 2 ingress pipelines for flow offload rules: VCAP IS1 (Ingress Classification) and IS2 (Security Enforcement). IS1 and IS2 support different sets of actions. The pipeline order for a packet on ingress is: Basic classification -> VCAP IS1 -> VCAP IS2 Furthermore, IS1 is looked up 3 times, and IS2 is looked up twice (each TCAM entry can be configured to match only on the first lookup, or only on the second, or on both etc). Because the TCAMs are completely independent in hardware, and because of the fixed pipeline, we actually have very limited options when it comes to offloading complex rules to them while still maintaining the same semantics with the software data path. This patch maps flow offload rules to ingress TCAMs according to a predefined chain index number. There is going to be a script in selftests that clarifies the usage model. There is also an egress TCAM (VCAP ES0, the Egress Rewriter), which is modeled on top of the default chain 0 of the egress qdisc, because it doesn't have multiple lookups. Suggested-by: Allan W. Nielsen <allan.nielsen@microchip.com> Co-developed-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 15:40:30 -07:00
Vladimir Oltean	319e4dd11a	net: mscc: ocelot: introduce conversion helpers between port and netdev Since the mscc_ocelot_switch_lib is common between a pure switchdev and a DSA driver, the procedure of retrieving a net_device for a certain port index differs, as those are registered by their individual front-ends. Up to now that has been dealt with by always passing the port index to the switch library, but now, we're going to need to work with net_device pointers from the tc-flower offload, for things like indev, or mirred. It is not desirable to refactor that, so let's make sure that the flower offload core has the ability to translate between a net_device and a port index properly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 15:40:30 -07:00
Vladimir Oltean	ea9d1f30b1	net: mscc: ocelot: offload multiple tc-flower actions in same rule At this stage, the tc-flower offload of mscc_ocelot can only delegate rules to the VCAP IS2 security enforcement block. These rules have, in hardware, separate bits for policing and for overriding the destination port mask and/or copying to the CPU. So it makes sense that we attempt to expose some more of that low-level complexity instead of simply choosing between a single type of action. Something similar happens with the VCAP IS1 block, where the same action can contain enable bits for VLAN classification and for QoS classification at the same time. So model the action structure after the hardware description, and let the high-level ocelot_flower.c construct an action vector from multiple tc actions. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 15:40:29 -07:00
Vlad Buslov	1253935ad8	net/mlx5e: Fix race condition on nhe->n pointer in neigh update Current neigh update event handler implementation takes reference to neighbour structure, assigns it to nhe->n, tries to schedule workqueue task and releases the reference if task was already enqueued. This results potentially overwriting existing nhe->n pointer with another neighbour instance, which causes double release of the instance (once in neigh update handler that failed to enqueue to workqueue and another one in neigh update workqueue task that processes updated nhe->n pointer instead of original one): [ 3376.512806] ------------[ cut here ]------------ [ 3376.513534] refcount_t: underflow; use-after-free. [ 3376.521213] Modules linked in: act_skbedit act_mirred act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 mlx5_ib mlx5_core mlxfw pci_hyperv_intf ptp pps_core nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp rpcrdma rdma_ucm ib_umad ib_ipoib ib_iser rdma_cm ib_cm iw_cm rfkill ib_uverbs ib_core sunrpc kvm_intel kvm iTCO_wdt iTCO_vendor_support virtio_net irqbypass net_failover crc32_pclmul lpc_ich i2c_i801 failover pcspkr i2c_smbus mfd_core ghash_clmulni_intel sch_fq_codel drm i2c _core ip_tables crc32c_intel serio_raw [last unloaded: mlxfw] [ 3376.529468] CPU: 8 PID: 22756 Comm: kworker/u20:5 Not tainted 5.9.0-rc5+ #6 [ 3376.530399] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 3376.531975] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core] [ 3376.532820] RIP: 0010:refcount_warn_saturate+0xd8/0xe0 [ 3376.533589] Code: ff 48 c7 c7 e0 b8 27 82 c6 05 0b b6 09 01 01 e8 94 93 c1 ff 0f 0b c3 48 c7 c7 88 b8 27 82 c6 05 f7 b5 09 01 01 e8 7e 93 c1 ff <0f> 0b c3 0f 1f 44 00 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 13 [ 3376.536017] RSP: 0018:ffffc90002a97e30 EFLAGS: 00010286 [ 3376.536793] RAX: 0000000000000000 RBX: ffff8882de30d648 RCX: 0000000000000000 [ 3376.537718] RDX: ffff8882f5c28f20 RSI: ffff8882f5c18e40 RDI: ffff8882f5c18e40 [ 3376.538654] RBP: ffff8882cdf56c00 R08: 000000000000c580 R09: 0000000000001a4d [ 3376.539582] R10: 0000000000000731 R11: ffffc90002a97ccd R12: 0000000000000000 [ 3376.540519] R13: ffff8882de30d600 R14: ffff8882de30d640 R15: ffff88821e000900 [ 3376.541444] FS: 0000000000000000(0000) GS:ffff8882f5c00000(0000) knlGS:0000000000000000 [ 3376.542732] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3376.543545] CR2: 0000556e5504b248 CR3: 00000002c6f10005 CR4: 0000000000770ee0 [ 3376.544483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3376.545419] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3376.546344] PKRU: 55555554 [ 3376.546911] Call Trace: [ 3376.547479] mlx5e_rep_neigh_update.cold+0x33/0xe2 [mlx5_core] [ 3376.548299] process_one_work+0x1d8/0x390 [ 3376.548977] worker_thread+0x4d/0x3e0 [ 3376.549631] ? rescuer_thread+0x3e0/0x3e0 [ 3376.550295] kthread+0x118/0x130 [ 3376.550914] ? kthread_create_worker_on_cpu+0x70/0x70 [ 3376.551675] ret_from_fork+0x1f/0x30 [ 3376.552312] ---[ end trace d84e8f46d2a77eec ]--- Fix the bug by moving work_struct to dedicated dynamically-allocated structure. This enabled every event handler to work on its own private neighbour pointer and removes the need for handling the case when task is already enqueued. Fixes: `232c001398` ("net/mlx5e: Add support to neighbour update flow") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:58 -07:00
Aya Levin	d4a16052bc	net/mlx5e: Fix VLAN create flow When interface is attached while in promiscuous mode and with VLAN filtering turned off, both configurations are not respected and VLAN filtering is performed. There are 2 flows which add the any-vid rules during interface attach: VLAN creation table and set rx mode. Each is relaying on the other to add any-vid rules, eventually non of them does. Fix this by adding any-vid rules on VLAN creation regardless of promiscuous mode. Fixes: `9df30601c8` ("net/mlx5e: Restore vlan filter after seamless reset") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:58 -07:00
Aya Levin	8c7353b6f7	net/mlx5e: Fix VLAN cleanup flow Prior to this patch unloading an interface in promiscuous mode with RX VLAN filtering feature turned off - resulted in a warning. This is due to a wrong condition in the VLAN rules cleanup flow, which left the any-vid rules in the VLAN steering table. These rules prevented destroying the flow group and the flow table. The any-vid rules are removed in 2 flows, but none of them remove it in case both promiscuous is set and VLAN filtering is off. Fix the issue by changing the condition of the VLAN table cleanup flow to clean also in case of promiscuous mode. mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 20 wasn't destroyed, refcount > 1 mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 19 wasn't destroyed, refcount > 1 mlx5_core 0000:00:08.0: mlx5_destroy_flow_table:2112:(pid 28729): Flow table 262149 wasn't destroyed, refcount > 1 ... ... ------------[ cut here ]------------ FW pages counter is 11560 after reclaiming all pages WARNING: CPU: 1 PID: 28729 at drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:660 mlx5_reclaim_startup_pages+0x178/0x230 [mlx5_core] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: mlx5_function_teardown+0x2f/0x90 [mlx5_core] mlx5_unload_one+0x71/0x110 [mlx5_core] remove_one+0x44/0x80 [mlx5_core] pci_device_remove+0x3e/0xc0 device_release_driver_internal+0xfb/0x1c0 device_release_driver+0x12/0x20 pci_stop_bus_device+0x68/0x90 pci_stop_and_remove_bus_device+0x12/0x20 hv_eject_device_work+0x6f/0x170 [pci_hyperv] ? __schedule+0x349/0x790 process_one_work+0x206/0x400 worker_thread+0x34/0x3f0 ? process_one_work+0x400/0x400 kthread+0x126/0x140 ? kthread_park+0x90/0x90 ret_from_fork+0x22/0x30 ---[ end trace 6283bde8d26170dc ]--- Fixes: `9df30601c8` ("net/mlx5e: Restore vlan filter after seamless reset") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:58 -07:00
Aya Levin	2608a2f831	net/mlx5e: Fix return status when setting unsupported FEC mode Verify the configured FEC mode is supported by at least a single link mode before applying the command. Otherwise fail the command and return "Operation not supported". Prior to this patch, the command was successful, yet it falsely set all link modes to FEC auto mode - like configuring FEC mode to auto. Auto mode is the default configuration if a link mode doesn't support the configured FEC mode. Fixes: `b5ede32d33` ("net/mlx5e: Add support for FEC modes based on 50G per lane links") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:57 -07:00
Aya Levin	3d093bc236	net/mlx5e: Fix driver's declaration to support GRE offload Declare GRE offload support with respect to the inner protocol. Add a list of supported inner protocols on which the driver can offload checksum and GSO. For other protocols, inform the stack to do the needed operations. There is no noticeable impact on GRE performance. Fixes: `2729984149` ("net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:57 -07:00
Maor Dickman	2b0219898b	net/mlx5e: CT, Fix coverity issue The cited commit introduced the following coverity issue at function mlx5_tc_ct_rule_to_tuple_nat: - Memory - corruptions (OVERRUN) Overrunning array "tuple->ip.src_v6.in6_u.u6_addr32" of 4 4-byte elements at element index 7 (byte offset 31) using index "ip6_offset" (which evaluates to 7). In case of IPv6 destination address rewrite, ip6_offset values are between 4 to 7, which will cause memory overrun of array "tuple->ip.src_v6.in6_u.u6_addr32" to array "tuple->ip.dst_v6.in6_u.u6_addr32". Fixed by writing the value directly to array "tuple->ip.dst_v6.in6_u.u6_addr32" in case ip6_offset values are between 4 to 7. Fixes: `bc562be967` ("net/mlx5e: CT: Save ct entries tuples in hashtables") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:57 -07:00
Aya Levin	c3c9402373	net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU Prior to this fix, in Striding RQ mode the driver was vulnerable when receiving packets in the range (stride size - headroom, stride size]. Where stride size is calculated by mtu+headroom+tailroom aligned to the closest power of 2. Usually, this filtering is performed by the HW, except for a few cases: - Between 2 VFs over the same PF with different MTUs - On bluefield, when the host physical function sets a larger MTU than the ARM has configured on its representor and uplink representor. When the HW filtering is not present, packets that are larger than MTU might be harmful for the RQ's integrity, in the following impacts: 1) Overflow from one WQE to the next, causing a memory corruption that in most cases is unharmful: as the write happens to the headroom of next packet, which will be overwritten by build_skb(). In very rare cases, high stress/load, this is harmful. When the next WQE is not yet reposted and points to existing SKB head. 2) Each oversize packet overflows to the headroom of the next WQE. On the last WQE of the WQ, where addresses wrap-around, the address of the remainder headroom does not belong to the next WQE, but it is out of the memory region range. This results in a HW CQE error that moves the RQ into an error state. Solution: Add a page buffer at the end of each WQE to absorb the leak. Actually the maximal overflow size is headroom but since all memory units must be of the same size, we use page size to comply with UMR WQEs. The increase in memory consumption is of a single page per RQ. Initialize the mkey with all MTTs pointing to a default page. When the channels are activated, UMR WQEs will redirect the RX WQEs to the actual memory from the RQ's pool, while the overflow MTTs remain mapped to the default page. Fixes: `73281b78a3` ("net/mlx5e: Derive Striding RQ size from MTU") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:56 -07:00
Aya Levin	08a762cecc	net/mlx5e: Fix error path for RQ alloc Increase granularity of the error path to avoid unneeded free/release. Fix the cleanup to be symmetric to the order of creation. Fixes: `0ddf543226` ("xdp/mlx5: setup xdp_rxq_info") Fixes: `422d4c401e` ("net/mlx5e: RX, Split WQ objects for different RQ types") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:56 -07:00
Maor Gottlieb	732ebfab7f	net/mlx5: Fix request_irqs error flow Fix error flow handling in request_irqs which try to free irq that we failed to request. It fixes the below trace. WARNING: CPU: 1 PID: 7587 at kernel/irq/manage.c:1684 free_irq+0x4d/0x60 CPU: 1 PID: 7587 Comm: bash Tainted: G W OE 4.15.15-1.el7MELLANOXsmp-x86_64 #1 Hardware name: Advantech SKY-6200/SKY-6200, BIOS F2.00 08/06/2020 RIP: 0010:free_irq+0x4d/0x60 RSP: 0018:ffffc9000ef47af0 EFLAGS: 00010282 RAX: ffff88001476ae00 RBX: 0000000000000655 RCX: 0000000000000000 RDX: ffff88001476ae00 RSI: ffffc9000ef47ab8 RDI: ffff8800398bb478 RBP: ffff88001476a838 R08: ffff88001476ae00 R09: 000000000000156d R10: 0000000000000000 R11: 0000000000000004 R12: ffff88001476a838 R13: 0000000000000006 R14: ffff88001476a888 R15: 00000000ffffffe4 FS: 00007efeadd32740(0000) GS:ffff88047fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc9cc010008 CR3: 00000001a2380004 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: mlx5_irq_table_create+0x38d/0x400 [mlx5_core] ? atomic_notifier_chain_register+0x50/0x60 mlx5_load_one+0x7ee/0x1130 [mlx5_core] init_one+0x4c9/0x650 [mlx5_core] pci_device_probe+0xb8/0x120 driver_probe_device+0x2a1/0x470 ? driver_allows_async_probing+0x30/0x30 bus_for_each_drv+0x54/0x80 __device_attach+0xa3/0x100 pci_bus_add_device+0x4a/0x90 pci_iov_add_virtfn+0x2dc/0x2f0 pci_enable_sriov+0x32e/0x420 mlx5_core_sriov_configure+0x61/0x1b0 [mlx5_core] ? kstrtoll+0x22/0x70 num_vf_store+0x4b/0x70 [mlx5_core] kernfs_fop_write+0x102/0x180 __vfs_write+0x26/0x140 ? rcu_all_qs+0x5/0x80 ? _cond_resched+0x15/0x30 ? __sb_start_write+0x41/0x80 vfs_write+0xad/0x1a0 SyS_write+0x42/0x90 do_syscall_64+0x60/0x110 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: `24163189da` ("net/mlx5: Separate IRQ request/free from EQ life cycle") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:56 -07:00
Saeed Mahameed	b898ce7bcc	net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessible In case of pci is offline reclaim_pages_cmd() will still try to call the FW to release FW pages, cmd_exec() in this case will return a silent success without actually calling the FW. This is wrong and will cause page leaks, what we should do is to detect pci offline or command interface un-available before tying to access the FW and manually release the FW pages in the driver. In this patch we share the code to check for FW command interface availability and we call it in sensitive places e.g. reclaim_pages_cmd(). Alternative fix: 1. Remove MLX5_CMD_OP_MANAGE_PAGES form mlx5_internal_err_ret_value, command success simulation list. 2. Always Release FW pages even if cmd_exec fails in reclaim_pages_cmd(). Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:55 -07:00
Eran Ben Elisha	410bd754cd	net/mlx5: Add retry mechanism to the command entry index allocation It is possible that new command entry index allocation will temporarily fail. The new command holds the semaphore, so it means that a free entry should be ready soon. Add one second retry mechanism before returning an error. Patch "net/mlx5: Avoid possible free of command entry while timeout comp handler" increase the possibility to bump into this temporarily failure as it delays the entry index release for non-callback commands. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:55 -07:00
Eran Ben Elisha	1d5558b1f0	net/mlx5: poll cmd EQ in case of command timeout Once driver detects a command interface command timeout, it warns the user and returns timeout error to the caller. In such case, the entry of the command is not evacuated (because only real event interrupt is allowed to clear command interface entry). If the HW event interrupt of this entry will never arrive, this entry will be left unused forever. Command interface entries are limited and eventually we can end up without the ability to post a new command. In addition, if driver will not consume the EQE of the lost interrupt and rearm the EQ, no new interrupts will arrive for other commands. Add a resiliency mechanism for manually polling the command EQ in case of a command timeout. In case resiliency mechanism will find non-handled EQE, it will consume it, and the command interface will be fully functional again. Once the resiliency flow finished, wait another 5 seconds for the command interface to complete for this command entry. Define mlx5_cmd_eq_recover() to manage the cmd EQ polling resiliency flow. Add an async EQ spinlock to avoid races between resiliency flows and real interrupts that might run simultaneously. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:55 -07:00
Eran Ben Elisha	50b2412b7e	net/mlx5: Avoid possible free of command entry while timeout comp handler Upon command completion timeout, driver simulates a forced command completion. In a rare case where real interrupt for that command arrives simultaneously, it might release the command entry while the forced handler might still access it. Fix that by adding an entry refcount, to track current amount of allowed handlers. Command entry to be released only when this refcount is decremented to zero. Command refcount is always initialized to one. For callback commands, command completion handler is the symmetric flow to decrement it. For non-callback commands, it is wait_func(). Before ringing the doorbell, increment the refcount for the real completion handler. Once the real completion handler is called, it will decrement it. For callback commands, once the delayed work is scheduled, increment the refcount. Upon callback command completion handler, we will try to cancel the timeout callback. In case of success, we need to decrement the callback refcount as it will never run. In addition, gather the entry index free and the entry free into a one flow for all command types release. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:54 -07:00
Eran Ben Elisha	432161ea26	net/mlx5: Fix a race when moving command interface to polling mode As part of driver unload, it destroys the commands EQ (via FW command). As the commands EQ is destroyed, FW will not generate EQEs for any command that driver sends afterwards. Driver should poll for later commands status. Driver commands mode metadata is updated before the commands EQ is actually destroyed. This can lead for double completion handle by the driver (polling and interrupt), if a command is executed and completed by FW after the mode was changed, but before the EQ was destroyed. Fix that by using the mlx5_cmd_allowed_opcode mechanism to guarantee that only DESTROY_EQ command can be executed during this time period. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-10-02 10:59:54 -07:00
Armin Wolf	360f898746	lib8390: Use netif_msg_init to initialize msg_enable bits Use netif_msg_init() to process param settings and use only the proper initialized value of ei_local->msg_level for later processing; Signed-off-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-01 19:08:46 -07:00
Geert Uytterhoeven	a6f51f2efa	ravb: Add support for explicit internal clock delay configuration Some EtherAVB variants support internal clock delay configuration, which can add larger delays than the delays that are typically supported by the PHY (using an "rgmii-id" PHY mode, and/or "[rt]xc-skew-ps" properties). Historically, the EtherAVB driver configured these delays based on the "rgmii-id" PHY mode. This caused issues with PHY drivers that implement PHY internal delays properly[1]. Hence a backwards-compatible workaround was added by masking the PHY mode[2]. Add proper support for explicit configuration of the MAC internal clock delays using the new "[rt]x-internal-delay-ps" properties. Fall back to the old handling if none of these properties is present. [1] Commit `bcf3440c6d` ("net: phy: micrel: add phy-mode support for the KSZ9031 PHY") [2] Commit `9b23203c32` ("ravb: Mask PHY mode to avoid inserting delays twice"). Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-01 12:53:31 -07:00
Geert Uytterhoeven	ce19a9eb53	ravb: Split delay handling in parsing and applying Currently, full delay handling is done in both the probe and resume paths. Split it in two parts, so the resume path doesn't have to redo the parsing part over and over again. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-01 12:53:31 -07:00
Heiner Kallweit	ef9da46dde	r8169: fix data corruption issue on RTL8402 Petr reported that after resume from suspend RTL8402 partially truncates incoming packets, and re-initializing register RxConfig before the actual chip re-initialization sequence is needed to avoid the issue. Reported-by: Petr Tesarik <ptesarik@suse.cz> Proposed-by: Petr Tesarik <ptesarik@suse.cz> Tested-by: Petr Tesarik <ptesarik@suse.cz> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-01 12:37:21 -07:00
Heiner Kallweit	bb13a80062	r8169: fix handling ether_clk Petr reported that system freezes on r8169 driver load on a system using ether_clk. The original change was done under the assumption that the clock isn't needed for basic operations like chip register access. But obviously that was wrong. Therefore effectively revert the original change, and in addition leave the clock active when suspending and WoL is enabled. Chip may not be able to process incoming packets otherwise. Fixes: `9f0b54cd16` ("r8169: move switching optional clock on/off to pll power functions") Reported-by: Petr Tesarik <ptesarik@suse.cz> Tested-by: Petr Tesarik <ptesarik@suse.cz> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-01 12:35:21 -07:00
Gustavo A. R. Silva	ff7ea04ad5	net/mlx5e: Fix potential null pointer dereference Calls to kzalloc() and kvzalloc() should be null-checked in order to avoid any potential failures. In this case, a potential null pointer dereference. Fix this by adding null checks for _parse_attr_ and _flow_ right after allocation. Addresses-Coverity-ID: 1497154 ("Dereference before null check") Fixes: `c620b77215` ("net/mlx5: Refactor tc flow attributes structure") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:31 -07:00
Dan Carpenter	7b2b16ee54	net/mlx5e: Fix a use after free on error in mlx5_tc_ct_shared_counter_get() This code frees "shared_counter" and then dereferences on the next line to get the error code. Fixes: `1edae2335a` ("net/mlx5e: CT: Use the same counter for both directions") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:31 -07:00
Ariel Levkovich	5efbe61788	net/mlx5: Fix dereference on pointer attr after null check When removing a flow from the slow path fdb, a flow attr struct is allocated for the rule removal process. If the allocation fails the code prints a warning message but continues with the removal flow which include dereferencing a pointer which could be null. Fix this by exiting the function in case the attr allocation failed. Fixes: `c620b77215` ("net/mlx5: Refactor tc flow attributes structure") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:31 -07:00
Parav Pandit	7be3412a76	net/mlx5: Use dma device access helper Use the PCI device directly for dma accesses as non PCI device unlikely support IOMMU and dma mappings. Introduce and use helper routine to access DMA device. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Vu Pham <vuhuong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:30 -07:00
Hamdan Igbaria	036e19b90f	net/mlx5: E-Switch, Support flow source for local vport Set flow source as hint for local vport. Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:30 -07:00
Parav Pandit	c7eddc6092	net/mlx5: E-switch, Move devlink eswitch ports closer to eswitch Currently devlink eswitch ports are registered and unregistered by the representor layer. However it is better to register them at eswitch layer so that in future user initiated command port add and delete commands can also register/unregister devlink ports without depending on representor layer. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vu Pham <vuhuong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:29 -07:00
Parav Pandit	38679b5a0d	net/mlx5: E-switch, Use helper function to load unload representor To register and unregister devlink ports when loading/unload representors, refactor the code to helper functions. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vu Pham <vuhuong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:29 -07:00
Parav Pandit	2c40db2f1d	net/mlx5: E-switch, Add helper to check egress ACL need Currently only VF vports need egress ACL table. Add a generic helper to check whether a vport need egress ACL table or not. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vu Pham <vuhuong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:29 -07:00
sunils	7cd7becddd	net/mlx5: E-switch, Use PF num in metadata reg c0 Currently only 256 vports can be supported as only 8 bits are reserved for them and 8 bits are reserved for vhca_ids in metadata reg c0. To support more than 256 vports, replace vhca_id with a unique shorter 4-bit PF number which covers upto 16 PF's. Use remaining 12 bits for vports ranging 1-4095. This will continue to generate unique metadata even if multiple PCI devices have same switch_id. Signed-off-by: sunils <sunils@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Vu Pham <vuhuong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:28 -07:00
Hamdan Igbaria	0172391967	net/mlx5: DR, Add support for rule creation with flow source hint Skip the rule according to flow arrival source, in case of RX and the source is local port skip and in case of TX and the source is uplink skip, we get this info according to the flow source hint we get from upper layers when creating the rule. This is needed because for example in case of FDB table which has a TX and RX tables and we are inserting a rule with an encap action which is only a TX action, in this case rule will fail on RX, so we can rely on the flow source hint and skip RX in such case. Until now we relied on metadata regc_0 that upper layer mapped the port in the regc_0, but the problem is that upper layer did not always use regc_0 for port mapping, so now we added support to flow source hint which upper layers will pass to SW steering when creating a rule. Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:28 -07:00
Yevgeny Kliteynik	e6b69bf379	net/mlx5: DR, Call ste_builder directly with tag pointer Instead of getting the tag in each function, call the builder directly with the tag. This will allow to use the same function for building the tag and the bitmask. Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:28 -07:00
Yevgeny Kliteynik	92b4b88531	net/mlx5: DR, Remove unneeded local variable The misc3 variable is used only once and can be dropped. Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:27 -07:00
Yevgeny Kliteynik	e6422d1da0	net/mlx5: DR, Remove unneeded vlan check from L2 builder When we create a matcher we check that all fields are consumed. There is no need for this specific check. This keeps the STE builder functions simple and clean. Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:27 -07:00
Yevgeny Kliteynik	38a5c59d7e	net/mlx5: DR, Remove unneeded check from source port builder Mask validity for ste builders is checked by mlx5dr_ste_build_pre_check during matcher creation. It already checks the mask value of source_vport, so removing this duplicated check. Also, moving there the check of source_eswitch_owner_vhca_id mask. Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:27 -07:00
Yevgeny Kliteynik	97ffd895fe	net/mlx5: DR, Replace the check for valid STE entry Validity check is done by reading the next lu_type from the STE, this check can be replaced by checking the refcount. This will make the check independent on internal STE structure. Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-09-30 21:26:26 -07:00
Shannon Nelson	0816e0c818	ionic: prevent early watchdog check In one corner case scenario, the driver device lif setup can get delayed such that the ionic_watchdog_cb() timer goes off before the ionic->lif is set, thus causing a NULL pointer panic. We catch the problem by checking for a NULL lif just a little earlier in the callback. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-30 15:11:09 -07:00
Shannon Nelson	df8aeaa826	ionic: stop watchdog timer earlier on remove We need to be better at making sure we don't have a link check watchdog go off while we're shutting things down, so let's stop the timer as soon as we start the remove. Meanwhile, since that was the only thing in ionic_dev_teardown(), simplify and remove that function. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-30 15:11:09 -07:00
Hariprasad Kelam	66a5209b53	octeontx2-pf: Fix synchnorization issue in mbox Mbox implementation in octeontx2 driver has three states alloc, send and reset in mbox response. VF allocate and sends message to PF for processing, PF ACKs them back and reset the mbox memory. In some case we see synchronization issue where after msgs_acked is incremented and before mbox_reset API is called, if current execution is scheduled out and a different thread is scheduled in which checks for msgs_acked. Since the new thread sees msgs_acked == msgs_sent it will try to allocate a new message and to send a new mbox message to PF.Now if mbox_reset is scheduled in, PF will see '0' in msgs_send. This patch fixes the issue by calling mbox_reset before incrementing msgs_acked flag for last processing message and checks for valid message size. Fixes: `d424b6c02` ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-30 15:07:19 -07:00
Hariprasad Kelam	1ea0166da0	octeontx2-pf: Fix the device state on error Currently in otx2_open on failure of nix_lf_start transmit queues are not stopped which are already started in link_event. Since the tx queues are not stopped network stack still try's to send the packets leading to driver crash while access the device resources. Fixes: `50fe6c02e` ("octeontx2-pf: Register and handle link notifications") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-30 15:07:19 -07:00
Geetha sowjanya	89eae5e87b	octeontx2-pf: Fix TCP/UDP checksum offload for IPv6 frames For TCP/UDP checksum offload feature in Octeontx2 expects L3TYPE to be set irrespective of IP header checksum is being offloaded or not. Currently for IPv6 frames L3TYPE is not being set resulting in packet drop with checksum error. This patch fixes this issue. Fixes: `3ca6c4c88` ("octeontx2-pf: Add packet transmission support") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-30 15:07:19 -07:00
Subbaraya Sundeep	e154b5b703	octeontx2-af: Fix enable/disable of default NPC entries Packet replication feature present in Octeontx2 is a hardware linked list of PF and its VF interfaces so that broadcast packets are sent to all interfaces present in the list. It is driver job to add and delete a PF/VF interface to/from the list when the interface is brought up and down. This patch fixes the npc_enadis_default_entries function to handle broadcast replication properly if packet replication feature is present. Fixes: `40df309e41` ("octeontx2-af: Support to enable/disable default MCAM entries") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-30 15:07:19 -07:00
Alexandre Belloni	20c168be68	net: macb: move pdata to private header struct macb_platform_data is only used by macb_pci to register the platform device, move its definition to cadence/macb.h and remove platform_data/macb.h Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-30 14:18:19 -07:00
Jacob Keller	be49b1ad29	ice: preserve NVM capabilities in safe mode If the driver initializes in safe mode, it will call ice_set_safe_mode_caps. This results in clearing the capabilities structures, in order to set them up for operating in safe mode, ensuring many features are disabled. This has a side effect of also clearing the capability bits that relate to NVM update. The result is that the device driver will not indicate support for unified update, even if the firmware is capable. Fix this by adding the relevant capability fields to the list of values we preserve. To simplify the code, use a common_cap structure instead of a handful of local variables. To reduce some duplication of the capability name, introduce a couple of macros used to restore the capabilities values from the cached copy. Fixes: `de9b277ee0` ("ice: Add support for unified NVM update flow capability") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Brijesh Behera <brijeshx.behera@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-09-30 08:32:35 -07:00
Jacob Keller	0ec86e8e82	ice: increase maximum wait time for flash write commands The ice driver needs to wait for a firmware response to each command to write a block of data to the scratch area used to update the device firmware. The driver currently waits for up to 1 second for this to be returned. It turns out that firmware might take longer than 1 second to return a completion in some cases. If this happens, the flash update will fail to complete. Fix this by increasing the maximum time that the driver will wait for both writing a block of data, and for activating the new NVM bank. The timeout for an erase command is already several minutes, as the firmware had to erase the entire bank which was already expected to take a minute or more in the worst case. In the case where firmware really won't respond, we will now take longer to fail. However, this ensures that if the firmware is simply slow to respond, the flash update can still complete. This new maximum timeout should not adversely increase the update time, as the implementation for wait_event_interruptible_timeout, and should wake very soon after we get a completion event. It is better for a flash update be slow but still succeed than to fail because we gave up too quickly. Fixes: `d69ea414c9` ("ice: implement device flash update via devlink") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Brijesh Behera <brijeshx.behera@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-09-30 08:32:35 -07:00
Vladimir Oltean	98642d1aa2	net: mscc: ocelot: look up the filters in flower_stats() and flower_destroy() Currently a new filter is created, containing just enough correct information to be able to call ocelot_vcap_block_find_filter_by_index() on it. This will be limiting us in the future, when we'll have more metadata associated with a filter, which will matter in the stats() and destroy() callbacks, and which we can't make up on the spot. For example, we'll start "offloading" some dummy tc filter entries for the TCAM skeleton, but we won't actually be adding them to the hardware, or to block->rules. So, it makes sense to avoid deleting those rules too. That's the kind of thing which is difficult to determine unless we look up the real filter. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	085f5b9162	net: mscc: ocelot: add a new ocelot_vcap_block_find_filter_by_id function And rename the existing find to ocelot_vcap_block_find_filter_by_index. The index is the position in the TCAM, and the id is the flow cookie given by tc. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	642942637c	net: mscc: ocelot: rename variable 'cnt' in vcap_data_offset_get() The 'cnt' variable is actually used for 2 purposes, to hold the number of sub-words per VCAP entry, and the number of sub-words per VCAP action. In fact, I'm pretty sure these 2 numbers can never be different from one another. By hardware definition, the entry (key) TCAM rows are divided into the same number of sub-words as its associated action RAM rows. But nonetheless, let's at least rename the variables such that observations like this one are easier to make in the future. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	5963083a31	net: mscc: ocelot: rename variable 'count' in vcap_data_offset_get() This gets rid of one of the 2 variables named, very generically, "count". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Xiaoliang Yang	e6ae7c506f	net: mscc: ocelot: calculate vcap offsets correctly for full and quarter entries When calculating the offsets for the current entry within the row and placing them inside struct vcap_data, the function assumes half key entry (2 keys per row). This patch modifies the vcap_data_offset_get() function to calculate a correct data offset when the setting VCAP Type-Group of a key to VCAP_TG_FULL or VCAP_TG_QUARTER. This is needed because, for example, VCAP ES0 only supports full keys. Also rename the 'count' variable to 'num_entries_per_row' to make the function just one tiny bit easier to follow. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	7a155fa3d8	net: mscc: ocelot: parse flower action before key When we'll make the switch to multiple chain offloading, we'll want to know first what VCAP block the rule is offloaded to. This impacts what keys are available. Since the VCAP block is determined by what actions are used, parse the action first. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	d732e9cef0	net: mscc: ocelot: remove unneeded VCAP parameters for IS2 Now that we are deriving these from the constants exposed by the hardware, we can delete the static info we're keeping in the driver. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	2096805497	net: mscc: ocelot: automatically detect VCAP constants The numbers in struct vcap_props are not intuitive to derive, because they are not a straightforward copy-and-paste from the reference manual but instead rely on a fairly detailed level of understanding of the layout of an entry in the TCAM and in the action RAM. For this reason, bugs are very easy to introduce here. Ease the work of hardware porters and read from hardware the constants that were exported for this particular purpose. Note that this implies that struct vcap_props can no longer be const. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	e3aea296d8	net: mscc: ocelot: add definitions for VCAP ES0 keys, actions and target As a preparation step for the offloading to ES0, let's create the infrastructure for talking with this hardware block. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:12 -07:00
Vladimir Oltean	a61e365d7c	net: mscc: ocelot: add definitions for VCAP IS1 keys, actions and target As a preparation step for the offloading to IS1, let's create the infrastructure for talking with this hardware block. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:12 -07:00
Vladimir Oltean	c1c3993edb	net: mscc: ocelot: generalize existing code for VCAP In the Ocelot switches there are 3 TCAMs: VCAP ES0, IS1 and IS2, which have the same configuration interface, but different sets of keys and actions. The driver currently only supports VCAP IS2. In preparation of VCAP IS1 and ES0 support, the existing code must be generalized to work with any VCAP. In that direction, we should move the structures that depend upon VCAP instantiation, like vcap_is2_keys and vcap_is2_actions, out of struct ocelot and into struct vcap_props .keys and .actions, a structure that is replicated 3 times, once per VCAP. We'll pass that structure as an argument to each function that does the key and action packing - only the control logic needs to distinguish between ocelot->vcap[VCAP_IS2] or IS1 or ES0. Another change is to make use of the newly introduced ocelot_target_read and ocelot_target_write API, since the 3 VCAPs have the same registers but put at different addresses. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:12 -07:00
Xiaoliang Yang	ed5672d82c	net: mscc: ocelot: return error if VCAP filter is not found Although it doesn't look like it is possible to hit these conditions from user space, there are 2 separate, but related, issues. First, the ocelot_vcap_block_get_filter_index function, née ocelot_ace_rule_get_index_id prior to the `aae4e500e1` ("net: mscc: ocelot: generalize the "ACE/ACL" names") rename, does not do what the author probably intended. If the desired filter entry is not present in the ACL block, this function returns an index equal to the total number of filters, instead of -1, which is maybe what was intended, judging from the curious initialization with -1, and the "++index" idioms. Either way, none of the callers seems to expect this behavior. Second issue, the callers don't actually check the return value at all. So in case the filter is not found in the rule list, propagate the return code. So update the callers and also take the opportunity to get rid of the odd coding idioms that appear to work but don't. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:12 -07:00
Vladimir Oltean	3c0e37a9e4	net: mscc: ocelot: introduce a new ocelot_target_{read,write} API There are some targets (register blocks) in the Ocelot switch that are instantiated more than once. For example, the VCAP IS1, IS2 and ES0 blocks all share the same register layout for interacting with the cache for the TCAM and the action RAM. For the VCAPs, the procedure for servicing them is actually common. We just need an API specifying which VCAP we are talking to, and we do that via these raw ocelot_target_read and ocelot_target_write accessors. In plain ocelot_read, the target is encoded into the register enum itself: u16 target = reg >> TARGET_OFFSET; For the VCAPs, the registers are currently defined like this: enum ocelot_reg { [...] S2_CORE_UPDATE_CTRL = S2 << TARGET_OFFSET, S2_CORE_MV_CFG, S2_CACHE_ENTRY_DAT, S2_CACHE_MASK_DAT, S2_CACHE_ACTION_DAT, S2_CACHE_CNT_DAT, S2_CACHE_TG_DAT, [...] }; which is precisely what we want to avoid, because we'd have to duplicate the same register map for S1 and for S0, and then figure out how to pass VCAP instance-specific registers to the ocelot_read calls (basically another lookup table that undoes the effect of shifting with TARGET_OFFSET). So for some targets, propose a more raw API, similar to what is currently done with ocelot_port_readl and ocelot_port_writel. Those targets can only be accessed with ocelot_target_{read,write} and not with ocelot_{read,write} after the conversion, which is fine. The VCAP registers are not actually modified to use this new API as of this patch. They will be modified in the next one. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:12 -07:00
Lorenzo Bianconi	879456bedb	net: mvneta: avoid possible cache misses in mvneta_rx_swbm Do not use rx_desc pointers if possible since rx descriptors are stored in uncached memory and dereferencing rx_desc pointers generate extra loads. This patch improves XDP_DROP performance of ~ 110Kpps (700Kpps vs 590Kpps) on Marvell Espressobin Analyzed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:10:07 -07:00
Kevin Brace	2b6b78e082	via-rhine: New device driver maintainer Signed-off-by: Kevin Brace <kevinbrace@bracecomputerlab.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:23:45 -07:00
Kevin Brace	9f5159e89d	via-rhine: Eliminate version information Signed-off-by: Kevin Brace <kevinbrace@bracecomputerlab.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:23:45 -07:00
Kevin Brace	aa15190cf2	via-rhine: VTunknown1 device is really VT8251 South Bridge The VIA Technologies VT8251 South Bridge's integrated Rhine-II Ethernet MAC comes has a PCI revision value of 0x7c. This was verified on ASUS P5V800-VM mainboard. Signed-off-by: Kevin Brace <kevinbrace@bracecomputerlab.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:23:45 -07:00
Kevin Brace	d120c9a81e	via-rhine: Fix for the hardware having a reset failure after resume In rhine_resume() and rhine_suspend(), the code calls netif_running() to see if the network interface is down or not. If it is down (i.e., netif_running() returning false), they will skip any housekeeping work within the function relating to the hardware. This becomes a problem when the hardware resumes from a standby since it is counting on rhine_resume() to map its MMIO and power up rest of the hardware. Not getting its MMIO remapped and rest of the hardware powered up lead to a soft reset failure and hardware disappearance. The solution is to map its MMIO and power up rest of the hardware inside rhine_open() before soft reset is to be performed. This solution was verified on ASUS P5V800-VM mainboard's integrated Rhine-II Ethernet MAC inside VIA Technologies VT8251 South Bridge. Signed-off-by: Kevin Brace <kevinbrace@bracecomputerlab.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:23:45 -07:00
Armin Wolf	2b2706aaae	lib8390: Replace panic() call with BUILD_BUG_ON Replace panic() call in lib8390.c with BUILD_BUG_ON() since checking the size of struct e8390_pkt_hdr should happen at compile-time. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:04:09 -07:00
Sebastian Andrzej Siewior	77afca296f	net: vxge: Remove in_interrupt() conditionals vxge_os_dma_malloc() and vxge_os_dma_malloc_async() are both called from callchains which use GFP_KERNEL allocations unconditionally or have other requirements to be called from fully preemptible task context.. vxge_os_dma_malloc(): 1) __vxge_hw_blockpool_create() <- GFP_KERNEL 2) __vxge_hw_mempool_grow() <- vzalloc() __vxge_hw_blockpool_malloc() vxge_os_dma_malloc_async(): 1 __vxge_hw_mempool_grow() <- vzalloc() __vxge_hw_blockpool_malloc() __vxge_hw_blockpool_blocks_add() 2) vxge_hw_vpath_open() <- vzalloc() __vxge_hw_blockpool_block_allocate() That means neither of these functions needs a conditional allocation mode. Remove the in_interrupt() conditional and use GFP_KERNEL. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:54 -07:00
Sebastian Andrzej Siewior	a1f467463c	net: sun3lance: Remove redundant checks in interrupt handler lance_interrupt() contains two pointless checks: - A check whether the 'dev_id' argument is NULL. 'dev_id' is the pointer which was handed in to request_irq() and the interrupt handler will always be invoked with that pointer as 'dev_id' argument by the core code. - A check for interrupt reentrancy. The core code already guarantees non-reentrancy of interrupt handlers. Remove these check. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:54 -07:00

1 2 3 4 5 ...

35258 Commits