linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-09-06 21:14:18 +00:00

Author	SHA1	Message	Date
Dirk van der Merwe	790d23e7c5	nfp: implement PCI driver shutdown callback Device may be shutdown without the hardware being reinitialized, in which case we want to ensure we cleanup properly. This is especially important for kexec with traffic flowing. The shutdown procedures resembles the remove procedures, so we can reuse those common tasks. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-26 12:08:13 -04:00
David S. Miller	8b44836583	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two easy cases of overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-25 23:52:29 -04:00
Ido Schimmel	f2ad1a522e	net: devlink: Add extack to shared buffer operations Add extack to shared buffer set operations, so that meaningful error messages could be propagated to the user. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-22 22:09:32 -07:00
Pablo Cascón	4ef6cbe80d	nfp: add SR-IOV trusted VF support By default VFs are not trusted. Add ndo_set_vf_trust support to toggle a new per-VF bit. Coupled with FW with this capability allows a trusted VF to change its MAC even after being administratively set by the PF. Also populate the trusted field on ndo_get_vf_config. Add the same ndo to the representors. Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-19 21:00:31 -07:00
John Hurley	7d26c96052	nfp: flower: fix size_t compile warning A recent addition to NFP introduced a function that formats a string with a size_t variable. This is formatted with %ld which is fine on 64-bit architectures but produces a compile warning on 32-bit architectures. Fix this by using the z length modifier. Fixes: a6156a6ab0f9 ("nfp: flower: handle merge hint messages") Signed-off-by: John Hurley <john.hurley@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-19 11:56:40 -07:00
Colin Ian King	d003d772e6	nfp: abm: fix spelling mistake "offseting" -> "offsetting" There are a couple of spelling mistakes in NL_SET_ERR_MSG_MOD error messages. Fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 23:22:26 -07:00
John Hurley	9bad65e515	nfp: flower: fix implicit fallthrough warning The nfp_flower_copy_pre_actions function introduces a case statement with an intentional fallthrough. However, this generates a warning if built with the -Wimplicit-fallthrough flag. Remove the warning by adding a fall through comment. Fixes: `1c6952ca58` ("nfp: flower: generate merge flow rule") Signed-off-by: John Hurley <john.hurley@netronome.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:42:06 -07:00
John Hurley	8af56f40e5	nfp: flower: offload merge flows A merge flow is formed from 2 sub flows. The match fields of the merge are the same as the first sub flow that has formed it, with the actions being a combination of the first and second sub flow. Therefore, a merge flow should replace sub flow 1 when offloaded. Offload valid merge flows by using a new 'flow mod' message type to replace an existing offloaded rule. Track the deletion of sub flows that are linked to a merge flow and revert offloaded merge rules if required. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
John Hurley	aa6ce2ea0c	nfp: flower: support stats update for merge flows With the merging of 2 sub flows, a new 'merge' flow will be created and written to FW. The TC layer is unaware that the merge flow exists and will request stats from the sub flows. Conversely, the FW treats a merge rule the same as any other rule and sends stats updates to the NFP driver. Add links between merge flows and their sub flows. Use these links to pass merge flow stats updates from FW to the underlying sub flows, ensuring TC stats requests are handled correctly. The updating of sub flow stats is done on (the less time critcal) TC stats requests rather than on FW stats update. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
John Hurley	1c6952ca58	nfp: flower: generate merge flow rule When combining 2 sub_flows to a single 'merge flow' (assuming the merge is valid), the merge flow should contain the same match fields as sub_flow 1 with actions derived from a combination of sub_flows 1 and 2. This action list should have all actions from sub_flow 1 with the exception of the output action that triggered the 'implicit recirculation' by sending to an internal port, followed by all actions of sub_flow 2. Any pre-actions in either sub_flow should feature at the start of the action list. Add code to generate a new merge flow and populate the match and actions fields based on the sub_flows. The offloading of the flow is left to future patches. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
John Hurley	107e37bb4f	nfp: flower: validate merge hint flows Two flows can be merged if the second flow (after recirculation) matches on bits that are either matched on or explicitly set by the first flow. This means that if a packet hits flow 1 and recirculates then it is guaranteed to hit flow 2. Add a 'can_merge' function that determines if 2 sub_flows in a merge hint can be validly merged to a single flow. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
John Hurley	dbc2d68edc	nfp: flower: handle merge hint messages If a merge hint is received containing 2 flows that are matched via an implicit recirculation (sending to and matching on an internal port), fw reports that the flows (called sub_flows) may be able to be combined to a single flow. Add infastructure to accept and process merge hint messages. The actual merging of the flows is left as a stub call. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
John Hurley	cf4172d575	nfp: flower: get flows by host context Each flow is given a context ID that the fw uses (along with its cookie) to identity the flow. The flows stats are updated by the fw via this ID which is a reference to a pre-allocated array entry. In preparation for flow merge code, enable the nfp_fl_payload structure to be accessed via this stats context ID. Rather than increasing the memory requirements of the pre-allocated array, add a new rhashtable to associate each active stats context ID with its rule payload. While adding new code to the compile metadata functions, slightly restructure the existing function to allow for cleaner, easier to read error handling. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
John Hurley	45756dfeda	nfp: flower: allow tunnels to output to internal port The neighbour table in the FW only accepts next hop entries if the egress port is an nfp repr. Modify this to allow the next hop to be an internal port. This means that if a packet is to egress to that port, it will recirculate back into the system with the internal port becoming its ingress port. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
John Hurley	f41dd0595d	nfp: flower: support fallback packets from internal ports FW may receive a packet with its ingress port marked as an internal port. If a rule does not exist to match on this port, the packet will be sent to the NFP driver. Modify the flower app to detect packets from such internal ports and convert the ingress port to the correct kernel space netdev. At this point, it is assumed that fallback packets from internal ports are to be sent out said port. Therefore, set the redir_egress bool to true on detection of these ports. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
John Hurley	27f54b5825	nfp: allow fallback packets from non-reprs Currently, it is assumed that fallback packets will be from reprs. Modify this to allow an app to receive non-repr ports from the fallback channel - e.g. from an internal port. If such a packet is received, do not update repr stats. Change the naming function calls so as not to imply it will always be a repr netdev returned. Add the option to set a bool value to redirect a fallback packet out the returned port rather than RXing it. Setting of this bool in subsequent patches allows the handling of packets falling back when they are due to egress an internal port. Signed-off-by: John Hurley <john.hurley@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
John Hurley	4d12ba4278	nfp: flower: allow offloading of matches on 'internal' ports Recent FW modifications allow the offloading of non repr ports. These ports exist internally on the NFP. So if a rule outputs to an 'internal' port, then the packet will recirculate back into the system but will now have this internal port as it's incoming port. These ports are indicated by a specific type field combined with an 8 bit port id. Add private app data to assign additional port ids for use in offloads. Provide functions to lookup or create new ids when a rule attempts to match on an internal netdev - the only internal netdevs currently supported are of type openvswitch. Have a netdev notifier to release port ids on netdev unregister. OvS offloads rules that match on internal ports as TC egress filters. Ensure that such rules are accepted by the driver. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
John Hurley	2f2622f59c	nfp: flower: turn on recirc and merge hint support in firmware Write to a FW symbol to indicate that the driver supports flow merging. If this symbol does not exist then flow merging and recirculation is not supported on the FW. If support is available, add a stub to deal with FW to kernel merge hint messages. Full flow merging requires the firmware to support of flow mods. If it does not, then do not attempt to 'turn on' flow merging. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-15 15:45:36 -07:00
Jakub Kicinski	bcf0cafab4	nfp: split out common control message handling code BPF's control message handler seems like a good base to built on for request-reply control messages. Split it out to allow for reuse. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-12 17:29:15 -07:00
Jakub Kicinski	0a72d8332c	nfp: move vNIC reset before netdev init During probe we clear vNIC configuration in case the device wasn't closed cleanly by previous driver. Move that code before netdev init, so netdev init can already try to apply its config parameters. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-12 17:29:15 -07:00
Jakub Kicinski	dd5b2498d8	nfp: add a mutex lock for the vNIC ctrl BAR Soon we will try to write to the vNIC mailbox without RTNL held. Add a new mutex to protect access to specific parts of the PCI control BAR. Move the mailbox size checking to the mailbox lock() helper, where it can be more effective (happen prior to potential overwrite of other data). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-12 17:29:15 -07:00
Dirk van der Merwe	e64718282c	nfp: opportunistically poll for reconfig result If the reconfig was a quick update, we could have results available from firmware within 200us. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-12 17:29:15 -07:00
David S. Miller	f83f715195	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Minor comment merge conflict in mlx5. Staging driver has a fixup due to the skb->xmit_more changes in 'net-next', but was removed in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-05 14:14:19 -07:00
Jiri Pirko	c25f08ac65	nfp: remove ndo_get_port_parent_id implementation Remove implementation of get_port_parent_id ndo and rely on core calling into devlink for the information directly. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-04 17:42:36 -07:00
Jiri Pirko	1b15c90270	nfp: pass switch ID through devlink_port_attrs_set() Pass the switch ID down the to devlink through devlink_port_attrs_set() so it can be used by devlink_compat_switch_id_get(). Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-04 17:42:36 -07:00
Jiri Pirko	bec5267cde	net: devlink: extend port attrs for switch ID Extend devlink_port_attrs_set() to pass switch ID for ports which are part of switch and store it in port attrs. For other ports, this is NULL. Note that this allows the driver to group devlink ports into one or more switches according to the actual topology. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-04 17:42:36 -07:00
Florian Westphal	6b16f9ee89	net: move skb->xmit_more hint to softnet data There are two reasons for this. First, the xmit_more flag conceptually doesn't fit into the skb, as xmit_more is not a property related to the skb. Its only a hint to the driver that the stack is about to transmit another packet immediately. Second, it was only done this way to not have to pass another argument to ndo_start_xmit(). We can place xmit_more in the softnet data, next to the device recursion. The recursion counter is already written to on each transmit. The "more" indicator is placed right next to it. Drivers can use the netdev_xmit_more() helper instead of skb->xmit_more to check the "more packets coming" hint. skb->xmit_more is retained (but always 0) to not cause build breakage. This change takes care of the simple s/skb->xmit_more/netdev_xmit_more()/ conversions. Remaining drivers are converted in the next patches. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-01 18:35:02 -07:00
Dirk van der Merwe	61f7c6f448	nfp: implement ethtool get module EEPROM Now that the NSP provides the ability to read from the SFF modules' EEPROM, we can use this interface to implement the ethtool callback. If the NSP only provides partial data, we log the event from within the driver but pass a success code to ethtool to prevent it from discarding the partial data. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-01 18:05:13 -07:00
Dirk van der Merwe	593cb18285	nfp: nsp: implement read SFF module EEPROM The NSP now provides the ability to read from the SFF module EEPROM. Note that even if an error occurs, the NSP may still provide some of the data. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-01 18:05:13 -07:00
Pieter Jansen van Vuuren	eff07b42d8	nfp: flower: reduce action list size by coalescing mangle actions With the introduction of flow_action_for_each pedit actions are no longer grouped together, instead pedit actions are broken out per 32 byte word. This results in an inefficient use of the action list that is pushed to hardware where each 32 byte word becomes its own action. Therefore we combine groups of 32 byte word before sending the action list to hardware. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-01 18:05:13 -07:00
Pieter Jansen van Vuuren	42cd5484a2	nfp: flower: remove vlan CFI bit from push vlan action We no longer set CFI when pushing vlan tags, therefore we remove the CFI bit from push vlan. Fixes: `1a1e586f54` ("nfp: add basic action capabilities to flower offloads") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Louis Peens <louis.peens@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-01 18:02:41 -07:00
Pieter Jansen van Vuuren	f7ee799a51	nfp: flower: replace CFI with vlan present Replace vlan CFI bit with a vlan present bit that indicates the presence of a vlan tag. Previously the driver incorrectly assumed that an vlan id of 0 is not matchable, therefore we indicate vlan presence with a vlan present bit. Fixes: `5571e8c9f2` ("nfp: extend flower matching capabilities") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Louis Peens <louis.peens@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-01 18:02:41 -07:00
Jakub Kicinski	c3e1f7fff6	nfp: disable netpoll on representors NFP reprs are software device on top of the PF's vNIC. The comment above __dev_queue_xmit() sayeth: When calling this method, interrupts MUST be enabled. This is because the BH enable code must have IRQs enabled so that it will not deadlock. For netconsole we can't guarantee IRQ state, let's just disable netpoll on representors to be on the safe side. When the initial implementation of NFP reprs was added by the commit `5de73ee467` ("nfp: general representor implementation") .ndo_poll_controller was required for netpoll to be enabled. Fixes: `ac3d9dd034` ("netpoll: make ndo_poll_controller() optional") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-28 17:04:29 -07:00
Jakub Kicinski	c8ba5b91a0	nfp: validate the return code from dev_queue_xmit() dev_queue_xmit() may return error codes as well as netdev_tx_t, and it always consumes the skb. Make sure we always return a correct netdev_tx_t value. Fixes: `eadfa4c3be` ("nfp: add stats and xmit helpers for representors") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-28 17:04:29 -07:00
Jiri Pirko	f1fa719cfd	nfp: do not handle nn->port defined case in nfp_net_get_phys_port_name() If nn->port is defined it means that devlink_port has been registered for this port as well. Devlink core is handling the port name formatting. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-28 12:55:31 -07:00
Jiri Pirko	5dc37bb9b0	net: replace ndo_get_devlink with ndo_get_devlink_port Follow-up patch is going to need a devlink port instance according to a netdev. Devlink port instance should be always available when devlink is used. So change the recently introduced ndo_get_devlink to ndo_get_devlink_port. With that, adjust the wrapper for the only user to get devlink pointer. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-28 12:55:30 -07:00
Jiri Pirko	335bc0dde0	nfp: register devlink port before netdev Change the init/fini flow and register devlink port instance before netdev. Now it is needed for correct behavior of phys_port_name generation, but in general it makes sense to register devlink port first. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-28 12:55:30 -07:00
Jiri Pirko	faaccbe6eb	nfp: move devlink port type set after netdev registration Similar to other driver, move the port type set after netdev registration is done. Along with that, clear the type before unregistration. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-24 14:55:31 -04:00
Moshe Shemesh	bea964107f	net: Add IANA_VXLAN_UDP_PORT definition to vxlan header file Added IANA_VXLAN_UDP_PORT (4789) definition to vxlan header file so it can be used by drivers instead of local definition. Updated drivers which locally defined it as 4789 to use it. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Cc: John Hurley <john.hurley@netronome.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Yunsheng Lin <linyunsheng@huawei.com> Cc: Peng Li <lipeng321@huawei.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-03-22 12:09:31 -07:00
Moshe Shemesh	974eff2b57	net: Move the definition of the default Geneve udp port to public header file Move the definition of the default Geneve udp port from the geneve source to the header file, so we can re-use it from drivers. Modify existing drivers to use it. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Cc: John Hurley <john.hurley@netronome.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-03-22 12:09:31 -07:00
Jakub Kicinski	31f1a0e37c	nfp: remove defines for unused control bits NFP driver ABI contains bits for L2 switching which were never implemented in initially envisioned form. Remove the defines, and open up the possibility of reclaiming the bits for other uses. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-21 14:01:55 -07:00
Dirk van der Merwe	eaab2d2d0f	nfp: fix simple vNIC mailbox length The simple vNIC mailbox length should be 12 decimal and not 0x12. Using a decimal also makes it clear this is a length value and not another field within the simple mailbox defines. Found by code inspection, there are no known firmware configurations where this would cause issues. Fixes: `527d7d1b99` ("nfp: read mailbox address from TLV caps") Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-07 11:00:21 -08:00
Dirk van der Merwe	35697764d7	nfp: nsp: set higher timeout for flash bundle The management firmware now supports being passed a bundle with multiple components to be stored in flash at once. This makes it easier to update all components to a known state with a single user command, however, this also has the potential to increase the time required to perform the update significantly. The management firmware only updates the components out of a bundle which are outdated, however, we need to make sure we can handle the absolute worst case where a CPLD update can take a long time to perform. We set a very conservative total timeout of 900s which already adds a contingency. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-01 11:36:01 -08:00
Jakub Kicinski	345415138d	nfp: nsp: allow the use of DMA buffer Newer versions of NSP can access host memory. Simplest access type requires all data to be in one contiguous area. Since we don't have the guarantee on where callers of the NSP ABI will allocate their buffers we allocate a bounce buffer and copy the data in and out. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-01 11:36:01 -08:00
Jakub Kicinski	66487abe2f	nfp: nsp: move default buffer handling into its own function DMA version of NSP communication is coming, move the code which copies data into the NFP buffer into a separate function. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-01 11:36:00 -08:00
Jakub Kicinski	882cdcb5d3	nfp: nsp: use fractional size of the buffer NSP expresses the buffer size in MB and 4 kB blocks. For small buffers the kB part may make a difference, so count it in. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-01 11:36:00 -08:00
Jakub Kicinski	1e301a1407	nfp: report RJ45 connector in ethtool Add support for reporting twisted pair port type. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-01 11:36:00 -08:00
Jakub Kicinski	03969b9414	nfp: remove ethtool flashing fallback Now that devlink fallback will be called reliably, we can remove the ethtool flashing code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-26 08:49:05 -08:00
Jakub Kicinski	28e8c75413	nfp: add .ndo_get_devlink Support getting devlink instance from a new NDO. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-26 08:49:05 -08:00
Florian Fainelli	d7977107b3	nfp: Remove switchdev.h inclusion This is no longer necessary after `a5084bb71f` ("nfp: Implement ndo_get_port_parent_id()") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-24 17:40:46 -08:00
David S. Miller	70f3522614	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Three conflicts, one of which, for marvell10g.c is non-trivial and requires some follow-up from Heiner or someone else. The issue is that Heiner converted the marvell10g driver over to use the generic c45 code as much as possible. However, in 'net' a bug fix appeared which makes sure that a new local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0 is cleared. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-24 12:06:19 -08:00
Jiong Wang	f036ebd9bf	nfp: bpf: fix ALU32 high bits clearance bug NFP BPF JIT compiler is doing a couple of small optimizations when jitting ALU imm instructions, some of these optimizations could save code-gen, for example: A & -1 = A A \| 0 = A A ^ 0 = A However, for ALU32, high 32-bit of the 64-bit register should still be cleared according to ISA semantics. Fixes: `cd7df56ed3` ("nfp: add BPF to NFP code translator") Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-23 00:07:47 +01:00
Jiong Wang	71c190249f	nfp: bpf: fix code-gen bug on BPF_ALU \| BPF_XOR \| BPF_K The intended optimization should be A ^ 0 = A, not A ^ -1 = A. Fixes: `cd7df56ed3` ("nfp: add BPF to NFP code translator") Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-23 00:07:47 +01:00
Pieter Jansen van Vuuren	0496743b20	nfp: flower: fix masks for tcp and ip flags fields Check mask fields of tcp and ip flags when setting the corresponding mask flag used in hardware. Fixes: `8f2566225a` ("flow_offload: add flow_rule and flow_match") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:28:50 -08:00
Jakub Kicinski	5c5696f3df	nfp: devlink: allow flashing the device via devlink Devlink now allows updating device flash. Implement this callback. Compared to ethtool update we no longer have to release the networking locks - devlink doesn't take them. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:27:38 -08:00
David S. Miller	885e631959	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2019-02-16 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong. 2) test all bpf progs in alu32 mode, from Jiong. 3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin. 4) support for IP encap in lwt bpf progs, from Peter. 5) remove XDP_QUERY_XSK_UMEM dead code, from Jan. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 22:56:34 -08:00
Jakub Kicinski	0ff8409b52	nfp: flower: remove double new line Recent cls_flower offload rewrite added a double new line. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-12 11:38:52 -05:00
Jakub Kicinski	dd27c2e3d0	bpf: offload: add priv field for drivers Currently bpf_offload_dev does not have any priv pointer, forcing the drivers to work backwards from the netdev in program metadata. This is not great given programs are conceptually associated with the offload device, and it means one or two unnecessary deferences. Add a priv pointer to bpf_offload_dev. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-12 17:07:09 +01:00
Jakub Kicinski	1f5cf1036c	nfp: devlink: include vendor/product info in serial number The manufacturing team requests we include vendor and product in the serial number field, as the serial number itself is not unique across manufacturing facilities and products. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-11 20:39:56 -08:00
Jakub Kicinski	05fe4ab75c	nfp: devlink: use the generic manufacture identifier instead of vendor Vendor may sound ambiguous, let's rename the fab string to "board.manufacture" (which was just added as a generic identifier). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-11 20:39:56 -08:00
Gustavo A. R. Silva	af6f12f22b	nfp: flower: cmsg: use struct_size() helper One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void entry[]; }; size = sizeof(struct foo) + count sizeof(void *); instance = alloc(size, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = alloc(struct_size(instance, entry, count), GFP_KERNEL); Notice that, in this case, variable size is not necessary, hence it is removed. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-08 22:57:28 -08:00
Pablo Neira Ayuso	c0bc5d8e2b	nfp: flower: remove unused index from nfp_fl_pedit() Static checker warning complains on uninitialized variable: drivers/net/ethernet/netronome/nfp/flower/action.c:618 nfp_fl_pedit() error: uninitialized symbol 'idx'. Which is actually never used from the functions that take it as parameter. Remove it. Fixes: `7386788175` ("drivers: net: use flow action infrastructure") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-08 11:46:45 -08:00
Florian Fainelli	a5084bb71f	nfp: Implement ndo_get_port_parent_id() NFP only supports SWITCHDEV_ATTR_ID_PORT_PARENT_ID, which makes it a great candidate to be converted to use the ndo_get_port_parent_id() NDO instead of implementing switchdev_port_attr_get(). Since NFP uses switchdev_port_same_parent_id() convert it to use netdev_port_same_parent_id(). Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-06 14:16:12 -08:00
Pablo Neira Ayuso	7386788175	drivers: net: use flow action infrastructure This patch updates drivers to use the new flow action infrastructure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-06 10:38:25 -08:00
Pablo Neira Ayuso	3b1903ef97	flow_offload: add statistics retrieval infrastructure and use it This patch provides the flow_stats structure that acts as container for tc_cls_flower_offload, then we can use to restore the statistics on the existing TC actions. Hence, tcf_exts_stats_update() is not used from drivers anymore. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-06 10:38:25 -08:00
Pablo Neira Ayuso	8f2566225a	flow_offload: add flow_rule and flow_match structures and use them This patch wraps the dissector key and mask - that flower uses to represent the matching side - around the flow_match structure. To avoid a follow up patch that would edit the same LoCs in the drivers, this patch also wraps this new flow match structure around the flow rule object. This new structure will also contain the flow actions in follow up patches. This introduces two new interfaces: bool flow_rule_match_key(rule, dissector_id) that returns true if a given matching key is set on, and: flow_rule_match_XYZ(rule, &match); To fetch the matching side XYZ into the match container structure, to retrieve the key and the mask with one single call. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-06 10:38:25 -08:00
Jakub Kicinski	bff5731d43	net: devlink: report cell size of shared buffers Shared buffer allocation is usually done in cell increments. Drivers will either round up the allocation or refuse the configuration if it's not an exact multiple of cell size. Drivers know exactly the cell size of shared buffer, so help out users by providing this information in dumps. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-03 11:25:34 -08:00
David S. Miller	beb73559bf	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2019-02-01 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) introduce bpf_spin_lock, from Alexei. 2) convert xdp samples to libbpf, from Maciej. 3) skip verifier tests for unsupported program/map types, from Stanislav. 4) powerpc64 JIT support for BTF line info, from Sandipan. 5) assorted fixed, from Valdis, Jesper, Jiong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-01 20:12:18 -08:00
Jiong Wang	ac7a1717a2	nfp: bpf: complete ALU32 logic shift supports The following ALU32 logic shift supports are missing: BPF_ALU \| BPF_LSH \| BPF_X BPF_ALU \| BPF_RSH \| BPF_X BPF_ALU \| BPF_RSH \| BPF_K For BPF_RSH \| BPF_K, it could be implemented using NFP direct shift instruction. For the other BPF_X shifts, NFP indirect shifts sequences need to be used. Separate code-gen hook is assigned to each instruction to make the implementation clear. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-01 18:03:49 -08:00
Jiong Wang	db0a4b3b6b	nfp: bpf: correct the behavior for shifts by zero Shifts by zero do nothing, and should be treated as nops. Even though compiler is not supposed to generate such instructions and manual written assembly is unlikely to have them, but they are legal instructions and have defined behavior. This patch correct existing shifts code-gen to make sure they do nothing when shift amount is zero except when the instruction is ALU32 for which high bits need to be cleared. For shift amount bigger than type size, already, NFP JIT back-end errors out for immediate shift and only low 5 bits will be taken into account for indirect shift which is the same as x86. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-01 18:03:49 -08:00
Jakub Kicinski	7c908f467d	nfp: devlink: report the running and flashed versions Report versions of firmware components using the new NSP command. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-01 15:30:31 -08:00
Jakub Kicinski	b96588400a	nfp: nsp: add support for versions command Retrieve the FW versions with the new command. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-01 15:30:31 -08:00
Jakub Kicinski	937a3e2645	nfp: devlink: report fixed versions Report information about the hardware. RFCv2: - add defines for board IDs which are likely to be reusable for other drivers (Jiri). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-01 15:30:31 -08:00
Jakub Kicinski	4adba00839	nfp: devlink: report driver name and serial number Report the basic info through new devlink info API. RFCv2: - add driver name; - align serial to core changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-01 15:30:30 -08:00
Gustavo A. R. Silva	ee69804714	nfp: use struct_size() in kzalloc() One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; instance = kzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-01 15:12:29 -08:00
David S. Miller	ec7146db15	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2019-01-29 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Teach verifier dead code removal, this also allows for optimizing / removing conditional branches around dead code and to shrink the resulting image. Code store constrained architectures like nfp would have hard time doing this at JIT level, from Jakub. 2) Add JMP32 instructions to BPF ISA in order to allow for optimizing code generation for 32-bit sub-registers. Evaluation shows that this can result in code reduction of ~5-20% compared to 64 bit-only code generation. Also add implementation for most JITs, from Jiong. 3) Add support for __int128 types in BTF which is also needed for vmlinux's BTF conversion to work, from Yonghong. 4) Add a new command to bpftool in order to dump a list of BPF-related parameters from the system or for a specific network device e.g. in terms of available prog/map types or helper functions, from Quentin. 5) Add AF_XDP sock_diag interface for querying sockets from user space which provides information about the RX/TX/fill/completion rings, umem, memory usage etc, from Björn. 6) Add skb context access for skb_shared_info->gso_segs field, from Eric. 7) Add support for testing flow dissector BPF programs by extending existing BPF_PROG_TEST_RUN infrastructure, from Stanislav. 8) Split BPF kselftest's test_verifier into various subgroups of tests in order better deal with merge conflicts in this area, from Jakub. 9) Add support for queue/stack manipulations in bpftool, from Stanislav. 10) Document BTF, from Yonghong. 11) Dump supported ELF section names in libbpf on program load failure, from Taeung. 12) Silence a false positive compiler warning in verifier's BTF handling, from Peter. 13) Fix help string in bpftool's feature probing, from Prashant. 14) Remove duplicate includes in BPF kselftests, from Yue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-28 19:38:33 -08:00
Jiong Wang	461448398a	nfp: bpf: implement jitting of JMP32 This patch implements code-gen for new JMP32 instructions on NFP. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-26 13:33:02 -08:00
Jakub Kicinski	9a06927e77	nfp: bpf: support removing dead code Add a verifier callback to the nfp JIT to remove the instructions the verifier deemed to be dead. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-23 17:35:32 -08:00
Jakub Kicinski	a32014b351	nfp: bpf: support optimizing dead branches Verifier will now optimize out branches to dead code, implement the replace_insn callback to take advantage of that optimization. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-23 17:35:32 -08:00
Jakub Kicinski	e2fc61146a	nfp: bpf: save original program length Instead of passing env->prog->len around, and trying to adjust for optimized out instructions just save the initial number of instructions in struct nfp_prog. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-23 17:35:32 -08:00
Jakub Kicinski	91a87a5823	nfp: bpf: split up the skip flag We fail program loading if jump lands on a skipped instruction. This is for historical reasons, it used to be that we only skipped instructions optimized out based on prior context, and therefore the optimization would be buggy if we jumped directly to such instruction (because the context would be skipped by the jump). There are cases where instructions can be skipped without any context, for example there is no point in generating code for: r0 \|= 0 We will also soon support dropping dead code, so make the skip logic differentiate between "optimized with preceding context" vs other skip types. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-23 17:35:32 -08:00
Jakub Kicinski	e90287f3aa	nfp: bpf: don't use instruction number for jump target Instruction number is meaningless at code gen phase. The target of the instruction is overwritten by nfp_fixup_branches(). The convention is to put the raw offset in target address as a place holder. See cmp_* functions. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-23 17:35:32 -08:00
John Hurley	20cce88650	nfp: flower: enable MAC address sharing for offloadable devs A MAC address is not necessarily a unique identifier for a netdev. Drivers such as Linux bonds, for example, can apply the same MAC address to the upper layer device and all lower layer devices. NFP MAC offload for tunnel decap includes port verification for reprs but also supports the offload of non-repr MAC addresses by assigning 'global' indexes to these. This means that the FW will not verify the incoming port of a packet matching this destination MAC. Modify the MAC offload logic to assign global indexes based on MAC address instead of net device (as it currently does). Use this to allow multiple devices to share the same MAC. In other words, if a repr shares its MAC address with another device then give the offloaded MAC a global index rather than associate it with an ingress port. Track this so that changes can be reverted as MACs stop being shared. Implement this by removing the current list based assignment of global indexes and replacing it with an rhashtable that maps an offloaded MAC address to the number of devices sharing it, distributing global indexes based on this. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 15:23:15 -08:00
John Hurley	13cf71031d	nfp: flower: ensure MAC cleanup on address change It is possible to receive a MAC address change notification without the net device being down (e.g. when an OvS bridge is assigned the same MAC as a port added to it). This means that an offloaded MAC address may not be removed if its device gets a new address. Maintain a record of the offloaded MAC addresses for each repr and netdev assigned a MAC offload index. Use this to delete the (now expired) MAC if a change of address event occurs. Only handle change address events if the device is already up - if not then the netdev up event will handle it. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 15:23:15 -08:00
John Hurley	05d2bee6bd	nfp: flower: add infastructure for non-repr priv data NFP repr netdevs contain private data that can store per port information. In certain cases, the NFP driver offloads information from non-repr ports (e.g. tunnel ports). As the driver does not have control over non-repr netdevs, it cannot add/track private data directly to the netdev struct. Add infastructure to store private information on any non-repr netdev that is offloaded at a given time. This is used in a following patch to track offloaded MAC addresses for non-reprs and enable correct house keeping on address changes. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 15:23:15 -08:00
John Hurley	49402b0b7f	nfp: flower: ensure deletion of old offloaded MACs When a potential tunnel end point goes down then its MAC address should not be matchable on the NFP. Implement a delete message for offloaded MACs and call this on net device down. While at it, remove the actions on register and unregister netdev events. A MAC should only be offloaded if the device is up. Note that the netdev notifier will replay any notifications for UP devices on registration so NFP can still offload ports that exist before the driver is loaded. Similarly, devices need to go down before they can be unregistered so removal of offloaded MACs is only required on down events. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 15:23:15 -08:00
John Hurley	0115dcc314	nfp: flower: remove list infastructure from MAC offload Potential MAC destination addresses for tunnel end-points are offloaded to firmware. This was done by building a list of such MACs and writing to firmware as blocks of addresses. Simplify this code by removing the list format and sending a new message for each offloaded MAC. This is in preparation for delete MAC messages. There will be one delete flag per message so we cannot assume that this applies to all addresses in a list. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 15:23:15 -08:00
John Hurley	41da0b5ef3	nfp: flower: ignore offload of VF and PF repr MAC addresses Currently MAC addresses of all repr netdevs, along with selected non-NFP controlled netdevs, are offloaded to FW as potential tunnel end-points. However, the addresses of VF and PF reprs are meaningless outside of internal communication and it is only those of physical port reprs required. Modify the MAC address offload selection code to ignore VF/PF repr devs. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 15:23:15 -08:00
John Hurley	f3b975778c	nfp: flower: tidy tunnel related private data Recent additions to the flower app private data have grouped the variables of a given feature into a struct and added that struct to the main private data struct. In keeping with this, move all tunnel related private data to their own struct. This has no affect on functionality but improves readability and maintenance of the code. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 15:23:15 -08:00
Pieter Jansen van Vuuren	467322e262	nfp: flower: support multiple memory units for filter offloads Adds support for multiple memory units which are used for filter offloads. Each filter is assigned a stats id, the MSBs of the id are used to determine which memory unit the filter should be offloaded to. The number of available memory units that could be used for filter offload is obtained from HW. A simple round robin technique is used to allocate and distribute the ids across memory units. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 15:23:14 -08:00
Fred Lotter	96439889b4	nfp: flower: increase cmesg reply timeout QA tests report occasional timeouts on REIFY message replies. Profiling of the two cmesg reply types under burst conditions, with a 12-core host under heavy cpu and io load (stress --cpu 12 --io 12), show both PHY MTU change and REIFY replies can exceed the 10ms timeout. The maximum MTU reply wait under burst is 16ms, while the maximum REIFY wait under 40 VF burst is 12ms. Using a 4 VF REIFY burst results in an 8ms maximum wait. A larger VF burst does increase the delay, but not in a linear enough way to justify a scaled REIFY delay. The worse case values between MTU and REIFY appears close enough to justify a common timeout. Pick a conservative 40ms to make a safer future proof common reply timeout. The delay only effects the failure case. Change the REIFY timeout mechanism to use wait_event_timeout() instead of wait_event_interruptible_timeout(), to match the MTU code. In the current implementation, theoretically, a signal could interrupt the REIFY waiting period, with a return code of ERESTARTSYS. However, this is caught under the general timeout error code EIO. I cannot see the benefit of exposing the REIFY waiting period to signals with such a short delay (40ms), while the MTU mechnism does not use the same logic. In the absence of any reply (wakeup() call), both reply types will wake up the task after the timeout period. The REIFY timeout applies to the entire representor group being instantiated (e.g. VFs), while the MTU timeout apples to a single PHY MTU change. Signed-off-by: Fred Lotter <frederik.lotter@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 15:23:14 -08:00
Luis Chamberlain	750afb08ca	cross-tree: phase out dma_zalloc_coherent() We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>	2019-01-08 07:58:37 -05:00
David S. Miller	339bbff2d6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-12-21 The following pull-request contains BPF updates for your net-next tree. There is a merge conflict in test_verifier.c. Result looks as follows: [...] }, { "calls: cross frame pruning", .insns = { [...] .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, .errstr_unpriv = "function calls to other bpf functions are allowed for root only", .result_unpriv = REJECT, .errstr = "!read_ok", .result = REJECT, }, { "jset: functional", .insns = { [...] { "jset: unknown const compare not taken", .insns = { BPF_RAW_INSN(BPF_JMP \| BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32), BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 1, 1), BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0), BPF_EXIT_INSN(), }, .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, .errstr_unpriv = "!read_ok", .result_unpriv = REJECT, .errstr = "!read_ok", .result = REJECT, }, [...] { "jset: range", .insns = { [...] }, .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, .result_unpriv = ACCEPT, .result = ACCEPT, }, The main changes are: 1) Various BTF related improvements in order to get line info working. Meaning, verifier will now annotate the corresponding BPF C code to the error log, from Martin and Yonghong. 2) Implement support for raw BPF tracepoints in modules, from Matt. 3) Add several improvements to verifier state logic, namely speeding up stacksafe check, optimizations for stack state equivalence test and safety checks for liveness analysis, from Alexei. 4) Teach verifier to make use of BPF_JSET instruction, add several test cases to kselftests and remove nfp specific JSET optimization now that verifier has awareness, from Jakub. 5) Improve BPF verifier's slot_type marking logic in order to allow more stack slot sharing, from Jiong. 6) Add sk_msg->size member for context access and add set of fixes and improvements to make sock_map with kTLS usable with openssl based applications, from John. 7) Several cleanups and documentation updates in bpftool as well as auto-mount of tracefs for "bpftool prog tracelog" command, from Quentin. 8) Include sub-program tags from now on in bpf_prog_info in order to have a reliable way for user space to get all tags of the program e.g. needed for kallsyms correlation, from Song. 9) Add BTF annotations for cgroup_local_storage BPF maps and implement bpf fs pretty print support, from Roman. 10) Fix bpftool in order to allow for cross-compilation, from Ivan. 11) Update of bpftool license to GPLv2-only + BSD-2-Clause in order to be compatible with libbfd and allow for Debian packaging, from Jakub. 12) Remove an obsolete prog->aux sanitation in dump and get rid of version check for prog load, from Daniel. 13) Fix a memory leak in libbpf's line info handling, from Prashant. 14) Fix cpumap's frame alignment for build_skb() so that skb_shared_info does not get unaligned, from Jesper. 15) Fix test_progs kselftest to work with older compilers which are less smart in optimizing (and thus throwing build error), from Stanislav. 16) Cleanup and simplify AF_XDP socket teardown, from Björn. 17) Fix sk lookup in BPF kselftest's test_sock_addr with regards to netns_id argument, from Andrey. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 17:31:36 -08:00
David S. Miller	2be09de7d6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Lots of conflicts, by happily all cases of overlapping changes, parallel adds, things of that nature. Thanks to Stephen Rothwell, Saeed Mahameed, and others for their guidance in these resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 11:53:36 -08:00
Jakub Kicinski	4987eaccd2	nfp: bpf: optimize codegen for JSET with a constant The top word of the constant can only have bits set if sign extension set it to all-1, therefore we don't really have to mask the top half of the register. We can just OR it into the result as is. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:29 +01:00
Jakub Kicinski	6e774845b3	nfp: bpf: remove the trivial JSET optimization The verifier will now understand the JSET instruction, so don't mark the dead branch in the JIT as noop. We won't generate any code, anyway. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:28 +01:00
John Hurley	b12c97d45c	nfp: flower: fix cb_ident duplicate in indirect block register Previously the identifier used for indirect block callback registry and for block rule cb registry (when done via indirect blocks) was the pointer to the netdev we were interested in receiving updates on. This worked fine if a single app existed that registered one callback per netdev of interest. However, if multiple cards are in place and, in turn, multiple apps, then each app may register the same callback with the same identifier to both the netdev's indirect block cb list and to a block's cb list. This can lead to EEXIST errors and/or incorrect cb deletions. Prevent this conflict by using the app pointer as the identifier for netdev indirect block cb registry, allowing each app to register a unique callback per netdev. For block cb registry, the same app may register multiple cbs to the same block if using TC shared blocks. Instead of the app, use the pointer to the allocated cb_priv data as the identifier here. This means that there can be a unique block callback for each app/netdev combo. Fixes: `3166dd07a9` ("nfp: flower: offload tunnel decap rules via indirect TC blocks") Reported-by: Edward Cree <ecree@solarflare.com> Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:34:12 -08:00
Jakub Kicinski	036b9e7cae	nfp: abm: allow to opt-out of RED offload FW team asks to be able to not support RED even if NIC is capable of buffering for testing and experimentation. Add an opt-out flag. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:41:42 -08:00
David S. Miller	addb067983	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-12-11 The following pull-request contains BPF updates for your net-next tree. It has three minor merge conflicts, resolutions: 1) tools/testing/selftests/bpf/test_verifier.c Take first chunk with alignment_prevented_execution. 2) net/core/filter.c [...] case bpf_ctx_range_ptr(struct __sk_buff, flow_keys): case bpf_ctx_range(struct __sk_buff, wire_len): return false; [...] 3) include/uapi/linux/bpf.h Take the second chunk for the two cases each. The main changes are: 1) Add support for BPF line info via BTF and extend libbpf as well as bpftool's program dump to annotate output with BPF C code to facilitate debugging and introspection, from Martin. 2) Add support for BPF_ALU \| BPF_ARSH \| BPF_{K,X} in interpreter and all JIT backends, from Jiong. 3) Improve BPF test coverage on archs with no efficient unaligned access by adding an "any alignment" flag to the BPF program load to forcefully disable verifier alignment checks, from David. 4) Add a new bpf_prog_test_run_xattr() API to libbpf which allows for proper use of BPF_PROG_TEST_RUN with data_out, from Lorenz. 5) Extend tc BPF programs to use a new __sk_buff field called wire_len for more accurate accounting of packets going to wire, from Petar. 6) Improve bpftool to allow dumping the trace pipe from it and add several improvements in bash completion and map/prog dump, from Quentin. 7) Optimize arm64 BPF JIT to always emit movn/movk/movk sequence for kernel addresses and add a dedicated BPF JIT backend allocator, from Ard. 8) Add a BPF helper function for IR remotes to report mouse movements, from Sean. 9) Various cleanups in BPF prog dump e.g. to make UAPI bpf_prog_info member naming consistent with existing conventions, from Yonghong and Song. 10) Misc cleanups and improvements in allowing to pass interface name via cmdline for xdp1 BPF example, from Matteo. 11) Fix a potential segfault in BPF sample loader's kprobes handling, from Daniel T. 12) Fix SPDX license in libbpf's README.rst, from Andrey. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-10 18:00:43 -08:00
Pieter Jansen van Vuuren	290974d434	nfp: flower: ensure TCP flags can be placed in IPv6 frame Previously we did not ensure tcp flags have a place to be stored when using IPv6. We correct this by including IPv6 key layer when we match tcp flags and the IPv6 key layer has not been included already. Fixes: `07e1671cfc` ("nfp: flower: refactor shared ip header in match offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-10 17:45:41 -08:00
David S. Miller	4cc1feeb6f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several conflicts, seemingly all over the place. I used Stephen Rothwell's sample resolutions for many of these, if not just to double check my own work, so definitely the credit largely goes to him. The NFP conflict consisted of a bug fix (moving operations past the rhashtable operation) while chaning the initial argument in the function call in the moved code. The net/dsa/master.c conflict had to do with a bug fix intermixing of making dsa_master_set_mtu() static with the fixing of the tagging attribute location. cls_flower had a conflict because the dup reject fix from Or overlapped with the addition of port range classifiction. __set_phy_supported()'s conflict was relatively easy to resolve because Andrew fixed it in both trees, so it was just a matter of taking the net-next copy. Or at least I think it was :-) Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup() intermixed with changes on how the sdif and caller_net are calculated in these code paths in net-next. The remaining BPF conflicts were largely about the addition of the __bpf_md_ptr stuff in 'net' overlapping with adjustments and additions to the relevant data structure where the MD pointer macros are used. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-09 21:43:31 -08:00
Jiong Wang	84708c1386	nfp: bpf: implement jitting of BPF_ALU \| BPF_ARSH \| BPF_* BPF_X support needs indirect shift mode, please see code comments for details. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-12-07 13:30:48 -08:00
Yangtao Li	6f6c74fad8	nfp: convert to DEFINE_SHOW_ATTRIBUTE Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-03 17:33:38 -08:00
Jakub Kicinski	6db3a9dcf0	nfp: report more info when reconfiguration fails FW reconfiguration timeouts are a common indicator of FW trouble. To make debugging easier print requested update and control word when reconfiguration fails. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:45 -08:00
Jakub Kicinski	9571d98775	nfp: add offset to all TLV parsing errors When troubleshooting incorrect FW capabilities it's useful to know where the faulty TLV is located. Add offset to all errors messages. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	51a6588e8c	nfp: add offloads on representors FW/HW can generally support the standard networking offloads on representors without any trouble. Add the ability for FW to advertise which features should be available on representors. Because representors are muxed on top of the vNIC we need to listen on feature changes of their lower devices, and update their features appropriately. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	71844fac1e	nfp: add locking around representor changes Up until now we never needed to keep a networking locks around representors accesses, we only accessed them when device was reconfigured (under nfp pf->lock) or on fast path (under RCU). Now we want to be able to iterate over all representors during notifications, so make sure representor assignment is done under RTNL lock. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	fbf60e377d	nfp: run don't require Qdiscs on representor netdevs Our representors are software devices built on top of the PF vNIC, the queuing should only happen at the vNIC netdevice. Allow representors to run qdisc-less. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	9db8bbcb9b	nfp: run representor TX locklessly Our representors are software devices built on top of the PF vNIC, the only state they have are per-cpu stats, so make the TX run locklessly. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	d7cc825225	nfp: avoid oversized TSO headers with metadata prepend In preparation for TSO over representors make sure the port id prepend will always fit in the frame. The current max header length is 255, which is ample, so assume worst case scenario of 8 byte prepend and save ourselves the conditionals. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	b54ad0eaad	nfp: correct descriptor offsets in presence of metadata The TSO-related offsets in the descriptor should not include the length of the prepended metadata. Adjust them. Note that this could not have caused issues in the past as we don't support TSO with metadata prepend as of this patch. Signed-off-by: Michael Rapson <michael.rapson@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	8b5ddf1e51	nfp: move queue variable init nd_q is only used at the very end of nfp_net_tx(), there is no need to initialize it early. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	de31049a48	nfp: move temporary variables in nfp_net_tx_complete() Move temporary variables in scope of the loop in nfp_net_tx_complete(), and add a temp for txbuf software structure. This saves us 0.2% of CPU. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	9586274967	nfp: copy only the relevant part of the TX descriptor for frags Chained descriptors for fragments need to duplicate all the descriptor fields of the skb head, so we copy the descriptor and then modify the relevant fields. This is wasteful, because the top half of the descriptor will get overwritten entirely while the bottom half is not modified at all. Copy only the bottom half. This saves us 0.3% of CPU in a GSO test. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
John Hurley	b5f0cf0834	nfp: flower: prevent offload if rhashtable insert fails For flow offload adds, if the rhash insert code fails, the flow will still have been offloaded but the reference to it in the driver freed. Re-order the offload setup calls to ensure that a flow will only be written to FW if a kernel reference is held and stored in the rhashtable. Remove this hashtable entry if the offload fails. Fixes: `c01d0efa51` ("nfp: flower: use rhashtable for flow caching") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:24:56 -08:00
John Hurley	1166494891	nfp: flower: release metadata on offload failure Calling nfp_compile_flow_metadata both assigns a stats context and increments a ref counter on (or allocates) a mask id table entry. These are released by the nfp_modify_flow_metadata call on flow deletion, however, if a flow add fails after metadata is set then the flow entry will be deleted but the metadata assignments leaked. Add an error path to the flow add offload function to ensure allocated metadata is released in the event of an offload fail. Fixes: `81f3ddf254` ("nfp: add control message passing capabilities to flower offloads") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:24:56 -08:00
David S. Miller	4afe60a97b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-11-26 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Extend BTF to support function call types and improve the BPF symbol handling with this info for kallsyms and bpftool program dump to make debugging easier, from Martin and Yonghong. 2) Optimize LPM lookups by making longest_prefix_match() handle multiple bytes at a time, from Eric. 3) Adds support for loading and attaching flow dissector BPF progs from bpftool, from Stanislav. 4) Extend the sk_lookup() helper to be supported from XDP, from Nitin. 5) Enable verifier to support narrow context loads with offset > 0 to adapt to LLVM code generation (currently only offset of 0 was supported). Add test cases as well, from Andrey. 6) Simplify passing device functions for offloaded BPF progs by adding callbacks to bpf_prog_offload_ops instead of ndo_bpf. Also convert nfp and netdevsim to make use of them, from Quentin. 7) Add support for sock_ops based BPF programs to send events to the perf ring-buffer through perf_event_output helper, from Sowmini and Daniel. 8) Add read / write support for skb->tstamp from tc BPF and cg BPF programs to allow for supporting rate-limiting in EDT qdiscs like fq from BPF side, from Vlad. 9) Extend libbpf API to support map in map types and add test cases for it as well to BPF kselftests, from Nikita. 10) Account the maximum packet offset accessed by a BPF program in the verifier and use it for optimizing nfp JIT, from Jiong. 11) Fix error handling regarding kprobe_events in BPF sample loader, from Daniel T. 12) Add support for queue and stack map type in bpftool, from David. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-26 13:08:17 -08:00
Jakub Kicinski	340a4864d5	nfp: abm: add support for more threshold actions Original FW only allowed us to perform ECN marking. Newer releases also support plain old drop. Add the ability to configure drop policy. This is particularly useful in combination with GRED, because different bands can have different ECN marking setting. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:46 -08:00
Jakub Kicinski	174ab544e3	nfp: abm: add cls_u32 offload for simple band classification Use offload of very simple u32 filters to direct packets to GRED bands based on the DSCP marking. No u32 hashing is supported, just plain simple filters matching on ToS or Priority with appropriate mask device can support. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:46 -08:00
Jakub Kicinski	6a80240571	nfp: abm: add functions to update DSCP -> virtual queue map Learn how to set the DSCP map. FW uses a packed array which geometry depends on the number of supported priorities and virtual queues. Write code to assemble this map and to communicate the setting to the FW via mailbox. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:46 -08:00
Jakub Kicinski	14780c3429	nfp: abm: calculate PRIO map len and check mailbox size In preparation for PRIO offload calculate how long the prio map for FW will be and make sure the configuration can be performed via the vNIC mailbox. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:46 -08:00
Jakub Kicinski	f3d6372064	nfp: abm: add GRED offload Add support for GRED offload. It behaves much like RED, but can apply different parameters to different bands. GRED operates pretty much exactly like our HW/FW with a single FIFO and different RED state instances. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:46 -08:00
Jakub Kicinski	990b50a53a	nfp: abm: wrap RED parameters in bands Wrap RED parameters and stats into a structure, and a 1-element array. Upcoming GRED offload will add the support for more bands. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:46 -08:00
Jakub Kicinski	184ec856ca	nfp: abm: add up bands for sto/non-sto stats Add up stats for all bands for the extra ethtool statistics. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:45 -08:00
Jakub Kicinski	57f31bbaa9	nfp: abm: switch to extended stats for reading packet/byte counts In PRIO-enabled FW read the statistics from per-band symbol, rather than from the standard per-PCIe-queue counters. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:45 -08:00
Jakub Kicinski	68e9864221	nfp: abm: size threshold table to account for bands Make sure the threshold table is large enough to hold information for all bands. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:45 -08:00
Jakub Kicinski	5720769609	nfp: abm: pass band parameter to functions In preparation for per-band RED offload pass band parameter to functions. For now it will always be 0. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:45 -08:00
Jakub Kicinski	3a44820591	nfp: abm: map per-band symbols In preparation for multi-band RED offload if FW is capable map the extended symbols which will allow us to set per-band parameters and read stats. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 18:53:45 -08:00
Jakub Kicinski	bd3b5d462a	nfp: abm: restructure Qdisc handling In preparation of handling more Qdisc types switch to a different offload strategy. We have now recreated the Qdisc hierarchy in the driver. Every time the hierarchy changes parse it, and update the configuration of the HW accordingly. While at it drop the support of pretending that we can instantiate a single queue on a multi-queue device in HW/FW. MQ is now required, and each queue will have its own instance of RED. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	52db4eaca5	nfp: abm: save RED's parameters Use the new driver Qdisc structure to keep track of parameters of RED Qdiscs. This way as the Qdisc moves around in the hierarchy we will be able to configure the HW appropriately. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	6c5dbda0d4	nfp: abm: reset RED's child based on limit RED qdisc will replace its child Qdisc with a new FIFO queue if it is reconfigured and the limit parameter is not 0. This means that when it's created with limit of 0 it will have no FIFO, and all packets will be dropped. If it's changed and limit is specified it will loose its existing child (implicit graft). Make sure we mark RED Qdisc child as NFP_QDISC_UNTRACKED if its not the expected FIFO. nfp_abm_qdisc_replace() will return 1 if Qdisc already existed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	6b8417b7e6	nfp: abm: build full Qdisc hierarchy based on graft notifications Using graft notifications recreate in the driver the full Qdisc hierarchy. Keep track of how many times each Qdisc is attached to the hierarchy to make sure we don't offload Qdiscs which are attached multiple times (device queues can't be shared). For graft events of Qdiscs we don't know exist make the child as invalid/untracked. Note that MQ Qdisc doesn't send destruction events reliably when device is dismantled, so we need to manually clean out the children otherwise we'd think Qdiscs which are still in use are getting freed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	aee7539c58	nfp: abm: allocate Qdisc child table To keep track of Qdisc hierarchy allocate a table for children for each Qdisc. RED Qdisc can only have one child. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
Jakub Kicinski	1853125889	nfp: abm: remember which Qdisc is root Keep track of which Qdisc is currently root. We need to implement TC_SETUP_ROOT_QDISC handling, and for completeness also clear the root Qdisc pointer when it's freed. TC_SETUP_ROOT_QDISC isn't always sent when device is dismantled. Remembering the root Qdisc will allow us to build the entire hierarchy in following patches. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
Jakub Kicinski	4f5681d088	nfp: abm: track all offload-enabled qdiscs Allocate an object corresponding to any offloaded qdisc we are informed about by the kernel. Not only the qdiscs we have a chance of offloading. The count of created objects will be used to decide whether the ethtool TC offload can be disabled, since otherwise we may miss destroy commands. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
Jakub Kicinski	6666f545e9	nfp: abm: keep track of all RED thresholds Instead of writing the threshold out when Qdisc is configured and not remembering it move to a scheme where we remember all thresholds. When configuration changes parse the offloaded Qdiscs and set thresholds appropriately. This will help future extensions. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
Jakub Kicinski	08990494e5	nfp: abm: rename qdiscs -> red_qdiscs Rename qdiscs member to red_qdiscs. One of following patches will use the name qdiscs for tracking all qdisc types. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
John Hurley	d4b69bad61	nfp: flower: remove unnecessary code in flow lookup Recent changes to NFP mean that stats updates from fw to driver no longer require a flow lookup and (because egdev offload has been removed) the ingress netdev for a lookup is now always known. Remove obsolete code in a flow lookup that matches on host context and that allows for a netdev to be NULL. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:53 -08:00
John Hurley	4f63fde3fc	nfp: flower: remove TC egdev offloads Previously, only tunnel decap rules required egdev registration for offload in NFP. These are now supported via indirect TC block callbacks. Remove the egdev code from NFP. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:53 -08:00
John Hurley	3166dd07a9	nfp: flower: offload tunnel decap rules via indirect TC blocks Previously, TC block tunnel decap rules were only offloaded when a callback was triggered through registration of the rules egress device. This meant that the driver had no access to the ingress netdev and so could not verify it was the same tunnel type that the rule implied. Register tunnel devices for indirect TC block offloads in NFP, giving access to new rules based on the ingress device rather than egress. Use this to verify the netdev type of VXLAN and Geneve based rules and offload the rules to HW if applicable. Tunnel registration is done via a netdev notifier. On notifier registration, this is triggered for already existing netdevs. This means that NFP can register for offloads from devices that exist before it is loaded (filter rules will be replayed from the TC core). Similarly, on notifier unregister, a call is triggered for each currently active netdev. This allows the driver to unregister any indirect block callbacks that may still be active. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:53 -08:00
John Hurley	65b7970edf	nfp: flower: increase scope of netdev checking functions Both the actions and tunnel_conf files contain local functions that check the type of an input netdev. In preparation for re-use with tunnel offload via indirect blocks, move these to static inline functions in a header file. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:53 -08:00
John Hurley	7885b4fc8d	nfp: flower: allow non repr netdev offload Previously the offload functions in NFP assumed that the ingress (or egress) netdev passed to them was an nfp repr. Modify the driver to permit the passing of non repr netdevs as the ingress device for an offload rule candidate. This may include devices such as tunnels. The driver should then base its offload decision on a combination of ingress device and egress port for a rule. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:53 -08:00
Quentin Monnet	16a8cb5cff	bpf: do not pass netdev to translate() and prepare() offload callbacks The kernel functions to prepare verifier and translate for offloaded program retrieve "offload" from "prog", and "netdev" from "offload". Then both "prog" and "netdev" are passed to the callbacks. Simplify this by letting the drivers retrieve the net device themselves from the offload object attached to prog - if they need it at all. There is currently no need to pass the netdev as an argument to those functions. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:54 -08:00
Quentin Monnet	a40a26322a	bpf: pass prog instead of env to bpf_prog_offload_verifier_prep() Function bpf_prog_offload_verifier_prep(), called from the kernel BPF verifier to run a driver-specific callback for preparing for the verification step for offloaded programs, takes a pointer to a struct bpf_verifier_env object. However, no driver callback needs the whole structure at this time: the two drivers supporting this, nfp and netdevsim, only need a pointer to the struct bpf_prog instance held by env. Update the callback accordingly, on kernel side and in these two drivers. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:54 -08:00
Quentin Monnet	eb9119471e	bpf: pass destroy() as a callback and remove its ndo_bpf subcommand As part of the transition from ndo_bpf() to callbacks attached to struct bpf_offload_dev for some of the eBPF offload operations, move the functions related to program destruction to the struct and remove the subcommand that was used to call them through the NDO. Remove function __bpf_offload_ndo(), which is no longer used. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:54 -08:00
Quentin Monnet	b07ade27e9	bpf: pass translate() as a callback and remove its ndo_bpf subcommand As part of the transition from ndo_bpf() to callbacks attached to struct bpf_offload_dev for some of the eBPF offload operations, move the functions related to code translation to the struct and remove the subcommand that was used to call them through the NDO. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:54 -08:00
Quentin Monnet	00db12c3d1	bpf: call verifier_prep from its callback in struct bpf_offload_dev In a way similar to the change previously brought to the verify_insn hook and to the finalize callback, switch to the newly added ops in struct bpf_prog_offload for calling the functions used to prepare driver verifiers. Since the dev_ops pointer in struct bpf_prog_offload is no longer used by any callback, we can now remove it from struct bpf_prog_offload. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:54 -08:00
Quentin Monnet	1385d755cf	bpf: pass a struct with offload callbacks to bpf_offload_dev_create() For passing device functions for offloaded eBPF programs, there used to be no place where to store the pointer without making the non-offloaded programs pay a memory price. As a consequence, three functions were called with ndo_bpf() through specific commands. Now that we have struct bpf_offload_dev, and since none of those operations rely on RTNL, we can turn these three commands into hooks inside the struct bpf_prog_offload_ops, and pass them as part of bpf_offload_dev_create(). This commit effectively passes a pointer to the struct to bpf_offload_dev_create(). We temporarily have two struct bpf_prog_offload_ops instances, one under offdev->ops and one under offload->dev_ops. The next patches will make the transition towards the former, so that offload->dev_ops can be removed, and callbacks relying on ndo_bpf() added to offdev->ops as well. While at it, rename "nfp_bpf_analyzer_ops" as "nfp_bpf_dev_ops" (and similarly for netdevsim). Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:53 -08:00
Quentin Monnet	1da6f57338	nfp: bpf: move nfp_bpf_analyzer_ops from verifier.c to offload.c We are about to add several new callbacks to the struct, all of them defined in offload.c. Move the struct bpf_prog_offload_ops object in that file. As a consequence, nfp_verify_insn() and nfp_finalize() can no longer be static. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:53 -08:00
Jakub Kicinski	560f1ba4d8	nfp: use the new __netdev_tx_sent_queue() BQL optimisation __netdev_tx_sent_queue() was added in commit e59020abf0f ("net: bql: add __netdev_tx_sent_queue()") and allows for better GSO performance. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 19:49:00 -08:00
Jiong Wang	cf599f5031	nfp: bpf: relax prog rejection through max_pkt_offset NFP is refusing to offload programs whenever the MTU is set to a value larger than the max packet bytes that fits in NFP Cluster Target Memory (CTM). However, a eBPF program doesn't always need to access the whole packet data. Verifier has always calculated maximum direct packet access (DPA) offset, and kept it in max_pkt_offset inside prog auxiliar information. This patch relax prog rejection based on max_pkt_offset. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-11-09 09:16:32 +01:00
Jakub Kicinski	6e5a716f42	nfp: abm: refuse RED offload with harddrop set RED Qdisc will now inform the drivers about the state of the harddrop flag. Refuse to offload in case harddrop is set. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:01 -08:00
Jakub Kicinski	cae5f48e32	nfp: abm: don't set negative threshold Turns out the threshold value is used in signed compares in the FW, so we should avoid setting the top bit. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:01 -08:00
Jakub Kicinski	032748acf6	nfp: abm: provide more precise info about offload parameter validation Improve log messages printed when RED can't be offloaded because of Qdisc parameters. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:01 -08:00
Jakub Kicinski	83ec8857a0	nfp: parse vNIC TLV capabilities at alloc time In certain cases initialization logic which follows allocation of the vNIC structure may want to validate the capabilities of that vNIC. This is easy before vNIC is initialized for normal capabilities which are at fixed offsets in control memory, easy to locate and read, but poses a challenge if the capabilities are in form of TLVs. Parse the TLVs early on so other code can just access parsed info, instead of having to do the parsing by itself. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:00 -08:00
Jakub Kicinski	e38f5d11b9	nfp: pass ctrl_bar pointer to nfp_net_alloc Move setting ctrl_bar pointer to the nfp_net_alloc function, to make sure we can parse capabilities early in the following patch. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:00 -08:00
Jakub Kicinski	47330f9bdf	nfp: abm: split qdisc offload code into a separate file The Qdisc offload code is logically separate, and we will soon do significant surgery on it to support more Qdiscs, so move it to a separate file. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:00 -08:00
John Hurley	e963e1097a	nfp: flower: include geneve as supported offload tunnel type Offload of geneve decap rules is supported in NFP. Include geneve in the check for supported types. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 23:00:23 -08:00
John Hurley	83f27d027d	nfp: flower: use geneve and vxlan helpers Make use of the recently added VXLAN and geneve helper functions to determine the type of the netdev from its rtnl_link_ops. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 23:00:23 -08:00
Jakub Kicinski	0c665e2bf4	nfp: flower: use the common netdev notifier Use driver's common notifier for LAG and tunnel configuration. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:22 -08:00
Jakub Kicinski	3e33359040	nfp: register a notifier handler in a central location for the device Code interested in networking events registers its own notifier handlers. Create one device-wide notifier instance. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:22 -08:00
Jakub Kicinski	659bb404eb	nfp: flower: make nfp_fl_lag_changels_event() void nfp_fl_lag_changels_event() never fails, and therefore we would never return NOTIFY_BAD for NETDEV_CHANGELOWERSTATE. Make this clearer by changing nfp_fl_lag_changels_event()'s return type to void. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:22 -08:00
Jakub Kicinski	a558c982a8	nfp: flower: don't try to nack device unregister events Returning an error from a notifier means we want to veto the change. We shouldn't veto NETDEV_UNREGISTER just because we couldn't find the tracking info for given master. I can't seem to find a way to trigger this unless we have some other bug, so it's probably not fix-worthy. While at it move the checking if the netdev really is of interest into the handling functions, like we do for other events. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:22 -08:00
Jakub Kicinski	e50bfdf74d	nfp: flower: remove unnecessary iteration over devices For flower tunnel offloads FW has to be informed about MAC addresses of tunnel devices. We use a netdev notifier to keep track of these addresses. Remove unnecessary loop over netdevices after notifier is registered. The intention of the loop was to catch devices which already existed on the system before nfp driver got loaded, but netdev notifier will replay NETDEV_REGISTER events. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:22 -08:00
Pieter Jansen van Vuuren	4234d62c27	nfp: flower: add ipv6 set flow label and hop limit offload Add ipv6 set flow label and hop limit action offload. Since pedit sets headers per 4 byte word, we need to ensure that setting either version, priority, payload_len or nexthdr does not get offloaded. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:21 -08:00
Pieter Jansen van Vuuren	a3c6b063fe	nfp: flower: add ipv4 set ttl and tos offload Add ipv4 set ttl and tos action offload. Since pedit sets headers per 4 byte word, we need to ensure that setting either version, ihl, protocol, total length or checksum does not get offloaded. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:21 -08:00
David S. Miller	a19c59cc10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-10-21 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Implement two new kind of BPF maps, that is, queue and stack map along with new peek, push and pop operations, from Mauricio. 2) Add support for MSG_PEEK flag when redirecting into an ingress psock sk_msg queue, and add a new helper bpf_msg_push_data() for insert data into the message, from John. 3) Allow for BPF programs of type BPF_PROG_TYPE_CGROUP_SKB to use direct packet access for __skb_buff, from Song. 4) Use more lightweight barriers for walking perf ring buffer for libbpf and perf tool as well. Also, various fixes and improvements from verifier side, from Daniel. 5) Add per-symbol visibility for DSO in libbpf and hide by default global symbols such as netlink related functions, from Andrey. 6) Two improvements to nfp's BPF offload to check vNIC capabilities in case prog is shared with multiple vNICs and to protect against mis-initializing atomic counters, from Jakub. 7) Fix for bpftool to use 4 context mode for the nfp disassembler, also from Jakub. 8) Fix a return value comparison in test_libbpf.sh and add several bpftool improvements in bash completion, documentation of bpf fs restrictions and batch mode summary print, from Quentin. 9) Fix a file resource leak in BPF selftest's load_kallsyms() helper, from Peng. 10) Fix an unused variable warning in map_lookup_and_delete_elem(), from Alexei. 11) Fix bpf_skb_adjust_room() signature in BPF UAPI helper doc, from Nicolas. 12) Add missing executables to .gitignore in BPF selftests, from Anders. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-21 21:11:46 -07:00
David S. Miller	2e2d6f0342	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net net/sched/cls_api.c has overlapping changes to a call to nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL to the 5th argument, and another (from 'net-next') added cb->extack instead of NULL to the 6th argument. net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to code which moved (to mr_table_dump)) in 'net-next'. Thanks to David Ahern for the heads up. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-19 11:03:06 -07:00
Ido Schimmel	5ff4ff4fe8	net: Add netif_is_vxlan() Add the ability to determine whether a netdev is a VxLAN netdev by calling the above mentioned function that checks the netdev's rtnl_link_ops. This will allow modules to identify netdev events involving a VxLAN netdev and act accordingly. For example, drivers capable of VxLAN offload will need to configure the underlying device when a VxLAN netdev is being enslaved to an offloaded bridge. Convert nfp to use the newly introduced helper. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 17:45:07 -07:00
Jakub Kicinski	44b6fed0c1	nfp: bpf: double check vNIC capabilities after object sharing Program translation stage checks that program can be offloaded to the netdev which was passed during the load (bpf_attr->prog_ifindex). After program sharing was introduced, however, the netdev on which program is loaded can theoretically be different, and therefore we should recheck the program size and max stack size at load time. This was found by code inspection, AFAIK today all vNICs have identical caps. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-10-16 15:23:58 -07:00
Jakub Kicinski	527db74b71	nfp: bpf: protect against mis-initializing atomic counters Atomic operations on the NFP are currently always in big endian. The driver keeps track of regions of memory storing atomic values and byte swaps them accordingly. There are corner cases where the map values may be initialized before the driver knows they are used as atomic counters. This can happen either when the datapath is performing the update and the stack contents are unknown or when map is updated before the program which will use it for atomic values is loaded. To avoid situation where user initializes the value to 0 1 2 3 and then after loading a program which uses the word as an atomic counter starts reading 3 2 1 0 - only allow atomic counters to be initialized to endian-neutral values. For updates from the datapath the stack information may not be as precise, so just allow initializing such values to 0. Example code which would break: struct bpf_map_def SEC("maps") rxcnt = { .type = BPF_MAP_TYPE_HASH, .key_size = sizeof(__u32), .value_size = sizeof(__u64), .max_entries = 1, }; int xdp_prog1() { __u64 nonzeroval = 3; __u32 key = 0; __u64 *value; value = bpf_map_lookup_elem(&rxcnt, &key); if (!value) bpf_map_update_elem(&rxcnt, &key, &nonzeroval, BPF_ANY); else __sync_fetch_and_add(value, 1); return XDP_PASS; } $ offload bpftool map dump key: 00 00 00 00 value: 00 00 00 03 00 00 00 00 should be: $ offload bpftool map dump key: 00 00 00 00 value: 03 00 00 00 00 00 00 00 Reported-by: David Beckett <david.beckett@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-10-16 15:23:58 -07:00
Pieter Jansen van Vuuren	140b6abac2	nfp: flower: use offsets provided by pedit instead of index for ipv6 Previously when populating the set ipv6 address action, we incorrectly made use of pedit's key index to determine which 32bit word should be set. We now calculate which word has been selected based on the offset provided by the pedit action. Fixes: `354b82bb32` ("nfp: add set ipv6 source and destination address") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 23:17:25 -07:00
Pieter Jansen van Vuuren	d08c9e5893	nfp: flower: fix multiple keys per pedit action Previously we only allowed a single header key per pedit action to change the header. This used to result in the last header key in the pedit action to overwrite previous headers. We now keep track of them and allow multiple header keys per pedit action. Fixes: `c0b1bd9a8b` ("nfp: add set ipv4 header action flower offload") Fixes: `354b82bb32` ("nfp: add set ipv6 source and destination address") Fixes: `f8b7b0a6b1` ("nfp: add set tcp and udp header action flower offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 23:17:24 -07:00
Pieter Jansen van Vuuren	8913806f16	nfp: flower: fix pedit set actions for multiple partial masks Previously we did not correctly change headers when using multiple pedit actions with partial masks. We now take this into account and no longer just commit the last pedit action. Fixes: `c0b1bd9a8b` ("nfp: add set ipv4 header action flower offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 23:17:24 -07:00
Ryan C Goodfellow	5948185b97	nfp: devlink port split support for 1x100G CXP NIC This commit makes it possible to use devlink to split the 100G CXP Netronome into two 40G interfaces. Currently when you ask for 2 interfaces, the math in src/nfp_devlink.c:nfp_devlink_port_split calculates that you want 5 lanes per port because for some reason eth_port.port_lanes=10 (shouldn't this be 12 for CXP?). What we really want when asking for 2 breakout interfaces is 4 lanes per port. This commit makes that happen by calculating based on 8 lanes if 10 are present. Signed-off-by: Ryan C Goodfellow <rgoodfel@isi.edu> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Greg Weeks <greg.weeks@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:29:55 -07:00
Jakub Kicinski	96de25060d	nfp: replace long license headers with SPDX Replace the repeated license text with SDPX identifiers. While at it bump the Copyright dates for files we touched this year. Signed-off-by: Edwin Peer <edwin.peer@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-11 12:16:21 -07:00
Pieter Jansen van Vuuren	12ecf61529	nfp: flower: use host context count provided by firmware Read the host context count symbols provided by firmware and use it to determine the number of allocated stats ids. Previously it won't be possible to offload more than 2^17 filter even if FW was able to do so. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-10 22:32:44 -07:00
Pieter Jansen van Vuuren	7fade1077c	nfp: flower: use stats array instead of storing stats per flow Make use of an array stats instead of storing stats per flow which would require a hash lookup at critical times. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-10 22:32:44 -07:00
Pieter Jansen van Vuuren	c01d0efa51	nfp: flower: use rhashtable for flow caching Make use of relativistic hash tables for tracking flows instead of fixed sized hash tables. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-10 22:32:44 -07:00
David S. Miller	071a234ad7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-10-08 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) sk_lookup_[tcp\|udp] and sk_release helpers from Joe Stringer which allow BPF programs to perform lookups for sockets in a network namespace. This would allow programs to determine early on in processing whether the stack is expecting to receive the packet, and perform some action (eg drop, forward somewhere) based on this information. 2) per-cpu cgroup local storage from Roman Gushchin. Per-cpu cgroup local storage is very similar to simple cgroup storage except all the data is per-cpu. The main goal of per-cpu variant is to implement super fast counters (e.g. packet counters), which don't require neither lookups, neither atomic operations in a fast path. The example of these hybrid counters is in selftests/bpf/netcnt_prog.c 3) allow HW offload of programs with BPF-to-BPF function calls from Quentin Monnet 4) support more than 64-byte key/value in HW offloaded BPF maps from Jakub Kicinski 5) rename of libbpf interfaces from Andrey Ignatov. libbpf is maturing as a library and should follow good practices in library design and implementation to play well with other libraries. This patch set brings consistent naming convention to global symbols. 6) relicense libbpf as LGPL-2.1 OR BSD-2-Clause from Alexei Starovoitov to let Apache2 projects use libbpf 7) various AF_XDP fixes from Björn and Magnus ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-08 23:42:44 -07:00
Quentin Monnet	7ff0ccde43	nfp: bpf: support pointers to other stack frames for BPF-to-BPF calls Mark instructions that use pointers to areas in the stack outside of the current stack frame, and process them accordingly in mem_op_stack(). This way, we also support BPF-to-BPF calls where the caller passes a pointer to data in its own stack frame to the callee (typically, when the caller passes an address to one of its local variables located in the stack, as an argument). Thanks to Jakub and Jiong for figuring out how to deal with this case, I just had to turn their email discussion into this patch. Suggested-by: Jiong Wang <jiong.wang@netronome.com> Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:13 +02:00
Quentin Monnet	4454962314	nfp: bpf: optimise save/restore for R6~R9 based on register usage When pre-processing the instructions, it is trivial to detect what subprograms are using R6, R7, R8 or R9 as destination registers. If a subprogram uses none of those, then we do not need to jump to the subroutines dedicated to saving and restoring callee-saved registers in its prologue and epilogue. This patch introduces detection of callee-saved registers in subprograms and prevents the JIT from adding calls to those subroutines whenever we can: we save some instructions in the translated program, and some time at runtime on BPF-to-BPF calls and returns. If no subprogram needs to save those registers, we can avoid appending the subroutines at the end of the program. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:13 +02:00
Quentin Monnet	2178f3f0dc	nfp: bpf: fix return address from register-saving subroutine to callee On performing a BPF-to-BPF call, we first jump to a subroutine that pushes callee-saved registers (R6~R9) to the stack, and from there we goes to the start of the callee next. In order to do so, the caller must pass to the subroutine the address of the NFP instruction to jump to at the end of that subroutine. This cannot be reliably implemented when translated the caller, as we do not always know the start offset of the callee yet. This patch implement the required fixup step for passing the start offset in the callee via the register used by the subroutine to hold its return address. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:13 +02:00
Quentin Monnet	bdf4c66faf	nfp: bpf: update fixup function for BPF-to-BPF calls support Relocation for targets of BPF-to-BPF calls are required at the end of translation. Update the nfp_fixup_branches() function in that regard. When checking that the last instruction of each bloc is a branch, we must account for the length of the instructions required to pop the return address from the stack. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:13 +02:00
Quentin Monnet	fb19816541	nfp: bpf: account for additional stack usage when checking stack limit Offloaded programs using BPF-to-BPF calls use the stack to store the return address when calling into a subprogram. Callees also need some space to save eBPF registers R6 to R9. And contrarily to kernel verifier, we align stack frames on 64 bytes (and not 32). Account for all this when checking the stack size limit before JIT-ing the program. This means we have to recompute maximum stack usage for the program, we cannot get the value from the kernel. In addition to adapting the checks on stack usage, move them to the finalize() callback, now that we have it and because such checks are part of the verification step rather than translation. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:13 +02:00
Quentin Monnet	389f263b60	nfp: bpf: add main logics for BPF-to-BPF calls support in nfp driver This is the main patch for the logics of BPF-to-BPF calls in the nfp driver. The functions called on BPF_JUMP \| BPF_CALL and BPF_JUMP \| BPF_EXIT were used to call helpers and exit from the program, respectively; make them usable for calling into, or returning from, a BPF subprogram as well. For all calls, push the return address as well as the callee-saved registers (R6 to R9) to the stack, and pop them upon returning from the calls. In order to limit the overhead in terms of instruction number, this is done through dedicated subroutines. Jumping to the callee actually consists in jumping to the subroutine, that "returns" to the callee: this will require some fixup for passing the address in a later patch. Similarly, returning consists in jumping to the subroutine, which pops registers and then return directly to the caller (but no fixup is needed here). Return to the caller is performed with the RTN instruction newly added to the JIT. For the few steps where we need to know what subprogram an instruction belongs to, the struct nfp_insn_meta is extended with a new subprog_idx field. Note that checks on the available stack size, to take into account the additional requirements associated to BPF-to-BPF calls (storing R6-R9 and return addresses), are added in a later patch. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:13 +02:00
Quentin Monnet	e3b49dc69b	nfp: bpf: account for BPF-to-BPF calls when preparing nfp JIT Similarly to "exit" or "helper call" instructions, BPF-to-BPF calls will require additional processing before translation starts, in order to record and mark jump destinations. We also mark the instructions where each subprogram begins. This will be used in a following commit to determine where to add prologues for subprograms. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:13 +02:00
Quentin Monnet	bcfdfb7c96	nfp: bpf: ignore helper-related checks for BPF calls in nfp verifier The checks related to eBPF helper calls are performed each time the nfp driver meets a BPF_JUMP \| BPF_CALL instruction. However, these checks are not relevant for BPF-to-BPF call (same instruction code, different value in source register), so just skip the checks for such calls. While at it, rename the function that runs those checks to make it clear they apply to _helper_ calls only. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:12 +02:00
Quentin Monnet	c5da54d93e	nfp: bpf: copy eBPF subprograms information from kernel verifier In order to support BPF-to-BPF calls in offloaded programs, the nfp driver must collect information about the distinct subprograms: namely, the number of subprograms composing the complete program and the stack depth of those subprograms. The latter in particular is non-trivial to collect, so we copy those elements from the kernel verifier via the newly added post-verification hook. The struct nfp_prog is extended to store this information. Stack depths are stored in an array of dedicated structs. Subprogram start indexes are not collected. Instead, meta instructions associated to the start of a subprogram will be marked with a flag in a later patch. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:12 +02:00
Quentin Monnet	1a7e62e632	nfp: bpf: rename nfp_prog->stack_depth as nfp_prog->stack_frame_depth In preparation for support for BPF to BPF calls in offloaded programs, rename the "stack_depth" field of the struct nfp_prog as "stack_frame_depth". This is to make it clear that the field refers to the maximum size of the current stack frame (as opposed to the maximum size of the whole stack memory). Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:12 +02:00
Quentin Monnet	c941ce9c28	bpf: add verifier callback to get stack usage info for offloaded progs In preparation for BPF-to-BPF calls in offloaded programs, add a new function attribute to the struct bpf_prog_offload_ops so that drivers supporting eBPF offload can hook at the end of program verification, and potentially extract information collected by the verifier. Implement a minimal callback (returning 0) in the drivers providing the structs, namely netdevsim and nfp. This will be useful in the nfp driver, in later commits, to extract the number of subprograms as well as the stack depth for those subprograms. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-08 10:24:12 +02:00
David S. Miller	9e50727f0e	mlx5-updates-2018-10-03 mlx5 core driver and ethernet netdev updates, please note there is a small devlink releated update to allow extack argument to eswitch operations. From Eli Britstein, 1) devlink: Add extack argument to the eswitch related operations 2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks 3) net/mlx5e: Add extack messages for TC offload failures From Eran Ben Elisha, 4) mlx5e: Add counter for aRFS rule insertion failures From Feras Daoud 5) Fast teardown support for mlx5 device This change introduces the enhanced version of the "Force teardown" that allows SW to perform teardown in a faster way without the need to reclaim all the FW pages. Fast teardown provides the following advantages: 1- Fix a FW race condition that could cause command timeout 2- Avoid moving to polling mode 3- Close the vport to prevent PCI ACK to be sent without been scatter to memory -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJbtU45AAoJEEg/ir3gV/o+/C4H/RHA4KImrb476EdB3VNYMqAN dgXb+bmh6sZP+jHWqQ4c3aVeh6/T8qm4gwiSn2nVTtHEnxtCdIYljzDC1Nswczeg pSjD1eOP7M1LpAOmBb8xdnJcX7yM7r1bTklnp2sN853WShbsDRYgZBHsBwTzx25U ZdzL4QTLuohlG/aLrbGXMntIy45ya2fVQrnK54s18nFlgsdFjEs0mi0xaUKNBC6+ P8CTohHAxuuxmL5b+6MIYLZCdgd8cLNQFdtqbckEVw7SvcRTxfraRlyqJ0YOgTGB TdSWnqZz2JYH29wSFbpFG8qX6GCv8FoiZ+fKzldbolHk442rrktHv3+Y7qQuZVs= =NVks -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2018-10-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2018-10-03 mlx5 core driver and ethernet netdev updates, please note there is a small devlink releated update to allow extack argument to eswitch operations. From Eli Britstein, 1) devlink: Add extack argument to the eswitch related operations 2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks 3) net/mlx5e: Add extack messages for TC offload failures From Eran Ben Elisha, 4) mlx5e: Add counter for aRFS rule insertion failures From Feras Daoud 5) Fast teardown support for mlx5 device This change introduces the enhanced version of the "Force teardown" that allows SW to perform teardown in a faster way without the need to reclaim all the FW pages. Fast teardown provides the following advantages: 1- Fix a FW race condition that could cause command timeout 2- Avoid moving to polling mode 3- Close the vport to prevent PCI ACK to be sent without been scatter to memory ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-04 09:48:37 -07:00
David S. Miller	6f41617bf2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net' overlapped the renaming of a netlink attribute in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-03 21:00:17 -07:00
Eli Britstein	db7ff19e7b	devlink: Add extack for eswitch operations Add extack argument to the eswitch related operations. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-10-03 16:17:58 -07:00
Jakub Kicinski	ff58e2df62	nfp: avoid soft lockups under control message storm When FW floods the driver with control messages try to exit the cmsg processing loop every now and then to avoid soft lockups. Cmsg processing is generally very lightweight so 512 seems like a reasonable budget, which should not be exceeded under normal conditions. Fixes: `77ece8d5f1` ("nfp: add control vNIC datapath") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-02 11:47:58 -07:00
Jakub Kicinski	0c9864c05f	nfp: bpf: allow control message sizing for map ops In current ABI the size of the messages carrying map elements was statically defined to at most 16 words of key and 16 words of value (NFP word is 4 bytes). We should not make this assumption and use the max key and value sizes from the BPF capability instead. To make sure old kernels don't get surprised with larger (or smaller) messages bump the FW ABI version to 3 when key/value size is different than 16 words. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-02 14:39:59 +02:00
Jakub Kicinski	9bbdd41b8a	nfp: allow apps to request larger MTU on control vNIC Some apps may want to have higher MTU on the control vNIC/queue. Allow them to set the requested MTU at init time. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-02 14:39:59 +02:00
Jakub Kicinski	28264eb227	nfp: bpf: parse global BPF ABI version capability Up until now we only had per-vNIC BPF ABI version capabilities, which are slightly awkward to use because bulk of the resources and configuration does not relate to any particular vNIC. Add a new capability for global ABI version and check the per-vNIC version are equal to it. Assume the ABI version 2 if no explicit version capability is present. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-02 14:39:58 +02:00
Jakub Kicinski	97ea8ac360	nfp: warn on experimental TLV types Reserve two TLV types for feature development, and warn in the driver if they ever leak into production. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-01 22:49:30 -07:00
David S. Miller	a06ee256e5	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net Version bump conflict in batman-adv, take what's in net-next. iavf conflict, adjustment of netdev_ops in net-next conflicting with poll controller method removal in net. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-25 10:35:29 -07:00
Eric Dumazet	0825ce7031	nfp: remove ndo_poll_controller As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. nfp uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-23 21:55:25 -07:00
Jakub Kicinski	23d9f5531c	nfp: provide a better warning when ring allocation fails NFP supports fairly enormous ring sizes (up to 256k descriptors). In commit `4662717038` ("nfp: use kvcalloc() to allocate SW buffer descriptor arrays") we have started using kvcalloc() functions to make sure the allocation of software state arrays doesn't hit the MAX_ORDER limit. Unfortunately, we can't use virtual mappings for the DMA region holding HW descriptors. In case this allocation fails instead of the generic (and fairly scary) warning/splat in the logs print a helpful message explaining what happened and suggesting how to fix it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-19 23:07:41 -07:00
David S. Miller	aaf9253025	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-09-12 22:22:42 -07:00
Jakub Kicinski	eca09be82e	nfp: report FW vNIC stats in interface stats Report in standard netdev stats drops and errors as well as RX multicast from the FW vNIC counters. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-12 13:20:49 -07:00
Louis Peens	224de549f0	nfp: flower: reject tunnel encap with ipv6 outer headers for offloading This fixes a bug where ipv6 tunnels would report that it is getting offloaded to hardware but would actually be rejected by hardware. Fixes: `b27d6a95a7` ("nfp: compile flower vxlan tunnel set actions") Signed-off-by: Louis Peens <louis.peens@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-12 13:18:30 -07:00
Pieter Jansen van Vuuren	db191db813	nfp: flower: fix vlan match by checking both vlan id and vlan pcp Previously we only checked if the vlan id field is present when trying to match a vlan tag. The vlan id and vlan pcp field should be treated independently. Fixes: `5571e8c9f2` ("nfp: extend flower matching capabilities") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-12 13:18:30 -07:00
jun qian	6577b0f716	nfp: replace spin_lock_bh with spin_lock in tasklet callback As you are already in a tasklet, it is unnecessary to call spin_lock_bh. Signed-off-by: jun qian <hangdianqj@163.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-07 22:56:02 -07:00
Jakub Kicinski	7848418e28	nfp: separate VXLAN and GRE feature handling VXLAN and GRE FW features have to currently be both advertised for the driver to enable them. Separate the handling. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-05 22:18:11 -07:00
Jakub Kicinski	e84b2f2db2	nfp: validate rtsym accesses fall within the symbol With the accesses to rtsyms now all going via special helpers we can easily make sure the driver is not reading past the end of the symbol. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-05 22:17:07 -07:00
Jakub Kicinski	31e380f38f	nfp: prefix rtsym error messages with symbol name For ease of debug preface all error messages with the name of the symbol which caused them. Use the same message format for existing messages while at it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-05 22:17:07 -07:00
Jakub Kicinski	3c576de30b	nfp: fix readq on absolute RTsyms Return the error and report value through the output param. Fixes: `640917dd81` ("nfp: support access to absolute RTsyms") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-05 22:17:07 -07:00
David S. Miller	36302685f5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-09-04 21:33:03 -07:00
Jakub Kicinski	9ad716b95f	nfp: wait for posted reconfigs when disabling the device To avoid leaking a running timer we need to wait for the posted reconfigs after netdev is unregistered. In common case the process of deinitializing the device will perform synchronous reconfigs which wait for posted requests, but especially with VXLAN ports being actively added and removed there can be a race condition leaving a timer running after adapter structure is freed leading to a crash. Add an explicit flush after deregistering and for a good measure a warning to check if timer is running just before structures are freed. Fixes: `3d780b926a` ("nfp: add async reconfiguration mechanism") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-31 23:01:30 -07:00
Jakub Kicinski	4152e58cb8	nfp: make RTsym users handle absolute symbols correctly Make the RTsym users access the size via the helper, which takes care of special handling of absolute symbols. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	640917dd81	nfp: support access to absolute RTsyms Add support in nfpcore for reading the absolute RTsyms. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	1240989ccc	nfp: convert all RTsym users to use new read/write helpers Convert all users of RTsym to the new set of helpers which handle all targets correctly. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	761969992d	nfp: convert existing RTsym helpers to full target decoding Make nfp_rtsym_{read,write}_le() and nfp_rtsym_map() use the new target resolution helpers to allow accessing in-cache symbols. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	8f6d6052cf	nfp: pass cpp_id to nfp_cpp_map_area() Align nfp_cpp_map_area() with other CPP-level APIs and pass encoded cpp_id/dest rather than target, action, domain tuple. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	3f0e55a2a6	nfp: add RTsym access helpers RTsyms may have special encodings for more complex symbol types. For example symbols which are placed in external memory unit's cache directly, constants or local memory. Add set of helpers which will check for those special encodings and handle them correctly. For now only add direct cache accesses, we don't have a need to access the other ones in foreseeable future. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	c678a9759a	nfp: add basic errors messages to target logic Add error prints to CPP target encoding/decoding logic, otherwise it's quite hard to pin point the reasons why read or write operations fail. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	73eaf3b7b8	nfp: save the MU locality field offset We will soon need the MU locality field offset much more often than just for decoding MIP address. Save it in nfp_cpp for quick access. Note that we can already reuse the target config from nfp_cpp, no need to do the XPB read. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	9bf6cce893	nfp: refactor the per-chip PCIe config Use a switch statement instead of ifs for code dependent on chip version. While at it make sure we fail for unknown chip revisions. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	0377505c54	nfp: add support for NFP5000 Add NFP5000 to supported chips, the chip is backward compatible with NFP4000 and NFP6000, so core PCIe code needs to handle it the same way as 4k and 6k. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	f6e71efdf9	nfp: abm: look up MAC addresses via management FW In multi-host scenarios Management FW may allocate MAC addresses at runtime, we have to use the indirect lookup to find them. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	34243f5909	nfp: add support for indirect HWinfo lookup Management FW can adjust some of the information in the HWinfo table at runtime. In some cases reading the table directly will not yield correct results. Add a NSP command for looking up information. Up until now we weren't making use of any of the values which may get adjusted. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:47 -07:00
Jakub Kicinski	ac86da0546	nfp: interpret extended FW load result codes To enable easier FW distribution NFP can now automatically select between FW stored on the flash and loaded from the kernel. If FW loading policy is set to auto it will compare the versions of FW from the host and from the flash and load the newer one. If FW type doesn't match (e.g. one advanced application vs another) the FW from the host takes precedence, unless one of them is the basic NIC firmware, in which case the non-basic-NIC FW is selected. This automatic selection mechanism requires we inform user what the verdict was. Print a message to the logs explaining the decision and the reason. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:46 -07:00
Jakub Kicinski	2db100002e	nfp: attempt FW load from flash Flash may contain a default NFP application FW. This application can either be put there by the user (with ethtool -f) or shipped with the card. If file system FW is not found, attempt to load this flash stored app FW. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:46 -07:00
Jakub Kicinski	1c0372b67c	nfp: encapsulate NSP command arguments into structs There is already a fair number of arguments to nfp_nsp_command() family of functions. Encapsulate them into structures to make adding new ones easier. No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-28 16:01:46 -07:00
Cong Wang	244cd96adb	net_sched: remove list_head from tc_action After commit `90b73b77d0`, list_head is no longer needed. Now we just need to convert the list iteration to array iteration for drivers. Fixes: `90b73b77d0` ("net: sched: change action API to use array of pointers to actions") Cc: Jiri Pirko <jiri@mellanox.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-21 12:45:44 -07:00
Jakub Kicinski	19997ba7cb	nfp: clean up return types in kdoc comments Remove 'Return:' information from functions which no longer return a value. Also update name and return types of nfp_nffw_info access functions. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-13 19:36:08 -07:00
Pieter Jansen van Vuuren	0a22b17a6b	nfp: flower: add geneve option match offload Introduce a new layer for matching on geneve options. This allows offloading filters configured to match geneve with options. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-07 12:22:15 -07:00
Pieter Jansen van Vuuren	9e7c32fe44	nfp: flower: add geneve option push action offload Introduce new push geneve option action. This allows offloading filters configured to entunnel geneve with options. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-07 12:22:15 -07:00
John Hurley	d7ff7ec573	nfp: flower: allow matching on ipv4 UDP tunnel tos and ttl The addition of FLOW_DISSECTOR_KEY_ENC_IP to TC flower means that the ToS and TTL of the tunnel header can now be matched on. Extend the NFP tunnel match function to include these new fields. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-07 12:22:14 -07:00
John Hurley	2a43747147	nfp: flower: set ip tunnel ttl from encap action The TTL for encapsulating headers in IPv4 UDP tunnels is taken from a route lookup. Modify this to first check if a user has specified a TTL to be used in the TC action. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-07 12:22:14 -07:00
David S. Miller	1ba982806c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-08-07 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add cgroup local storage for BPF programs, which provides a fast accessible memory for storing various per-cgroup data like number of transmitted packets, etc, from Roman. 2) Support bpf_get_socket_cookie() BPF helper in several more program types that have a full socket available, from Andrey. 3) Significantly improve the performance of perf events which are reported from BPF offload. Also convert a couple of BPF AF_XDP samples overto use libbpf, both from Jakub. 4) seg6local LWT provides the End.DT6 action, which allows to decapsulate an outer IPv6 header containing a Segment Routing Header. Adds this action now to the seg6local BPF interface, from Mathieu. 5) Do not mark dst register as unbounded in MOV64 instruction when both src and dst register are the same, from Arthur. 6) Define u_smp_rmb() and u_smp_wmb() to their respective barrier instructions on arm64 for the AF_XDP sample code, from Brian. 7) Convert the tcp_client.py and tcp_server.py BPF selftest scripts over from Python 2 to Python 3, from Jeremy. 8) Enable BTF build flags to the BPF sample code Makefile, from Taeung. 9) Remove an unnecessary rcu_read_lock() in run_lwt_bpf(), from Taehee. 10) Several improvements to the README.rst from the BPF documentation to make it more consistent with RST format, from Tobin. 11) Replace all occurrences of strerror() by calls to strerror_r() in libbpf and fix a FORTIFY_SOURCE build error along with it, from Thomas. 12) Fix a bug in bpftool's get_btf() function to correctly propagate an error via PTR_ERR(), from Yue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-07 11:02:05 -07:00
Jakub Kicinski	0c26159352	nfp: bpf: xdp_adjust_tail support Add support for adjust_tail. There are no FW changes needed but add a FW capability just in case there would be any issue with previously released FW, or we will have to change the ABI in the future. The helper is trivial and shouldn't be used too often so just inline the body of the function. We add the delta to locally maintained packet length register and check for overflow, since add of negative value must overflow if result is positive. Note that if delta of 0 would be allowed in the kernel this trick stops working and we need one more instruction to compare lengths before and after the change. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-04 21:58:12 +02:00
David S. Miller	89b1698c93	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net The BTF conflicts were simple overlapping changes. The virtio_net conflict was an overlap of a fix of statistics counter, happening alongisde a move over to a bonafide statistics structure rather than counting value on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-02 10:55:32 -07:00
Jakub Kicinski	240b74fde3	nfp: fix variable dereferenced before check in nfp_app_ctrl_rx_raw() 'app' is dereferenced before used for the devlink trace point. In case FW is buggy and sends a control message to a VF queue we should make sure app is not NULL. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-31 09:28:19 +02:00
John Hurley	ee614c8710	nfp: flower: fix port metadata conversion bug Function nfp_flower_repr_get_type_and_port expects an enum nfp_repr_type return value but, if the repr type is unknown, returns a value of type enum nfp_flower_cmsg_port_type. This means that if FW encodes the port ID in a way the driver does not understand instead of dropping the frame driver may attribute it to a physical port (uplink) provided the port number is less than physical port count. Fix this and ensure a net_device of NULL is returned if the repr can not be determined. Fixes: `1025351a88` ("nfp: add flower app") Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-28 14:27:32 -07:00
Jakub Kicinski	17082566a9	nfp: bpf: improve map offload info messages FW can put constraints on map element size to maximize resource use and efficiency. When user attempts offload of a map which does not fit into those constraints an informational message is printed to kernel logs to inform user about the reason offload failed. Map offload does not have access to any advanced error reporting like verifier log or extack. There is also currently no way for us to nicely expose the FW capabilities to user space. Given all those constraints we should make sure log messages are as informative as possible. Improve them. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-27 07:14:35 +02:00
Jakub Kicinski	ab01f4ac5f	nfp: bpf: remember maps by ID Record perf maps by map ID, not raw kernel pointer. This helps with debug messages, because printing pointers to logs is frowned upon, and makes debug easier for the users, as map ID is something they should be more familiar with. Note that perf maps are offload neutral, therefore IDs won't be orphaned. While at it use a rate limited print helper for the error message. Reported-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-27 07:14:35 +02:00
Jakub Kicinski	0958762748	nfp: bpf: allow receiving perf events on data queues Control queue is fairly low latency, and requires SKB allocations, which means we can't even reach 0.5Msps with perf events. Allow perf events to be delivered to data queues. This allows us to not only use multiple queues, but also receive and deliver to user space more than 5Msps per queue (Xeon E5-2630 v4 2.20GHz, no retpolines). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-27 07:14:35 +02:00
Jakub Kicinski	20c5420421	nfp: bpf: pass raw data buffer to nfp_bpf_event_output() In preparation for SKB-less perf event handling make nfp_bpf_event_output() take buffer address and length, not SKB as parameters. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-27 07:14:35 +02:00
Jakub Kicinski	79ca38e80c	nfp: allow control message reception on data queues Port id 0xffffffff is reserved for control messages. Allow reception of messages with this id on data queues. Hand off a raw buffer to the higher layer code, without allocating SKB for max efficiency. The RX handle can't modify or keep the buffer, after it returns buffer is handed back over to the NIC RX free buffer list. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-27 07:14:35 +02:00
Jakub Kicinski	78a8b760e4	nfp: move repr handling on RX path Representor packets are received on PF queues with special metadata tag for demux. There is no reason to resolve the representor ID -> netdev after the skb has been allocated. Move the code, this will allow us to handle special FW messages without SKB allocation overhead. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-27 07:14:35 +02:00
Jakub Kicinski	5ea14712d7	nfp: protect from theoretical size overflows on HW descriptor ring Use array_size() and store the size as full size_t to protect from theoretical size overflow when handling HW descriptor rings. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-25 22:17:44 -07:00
Jakub Kicinski	e76c1d3d2a	nfp: restore correct ordering of fields in rx ring structure Commit `7f1c684a89` ("nfp: setup xdp_rxq_info") mixed the cache cold and cache hot data in the nfp_net_rx_ring structure (ignoring the feedback), to try to fit the structure into 2 cache lines after struct xdp_rxq_info was added. Now that we are about to add a new field the structure will grow back to 3 cache lines, so order the members correctly. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-25 22:17:44 -07:00
Jakub Kicinski	4662717038	nfp: use kvcalloc() to allocate SW buffer descriptor arrays Use kvcalloc() instead of tmp variable + kzalloc() when allocating SW buffer information to allow falling back to vmalloc and to protect from theoretical integer overflow. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-25 22:17:44 -07:00
Jakub Kicinski	5b0ced17ed	nfp: don't fail probe on pci_sriov_set_totalvfs() errors On machines with buggy ACPI tables or when SR-IOV is already enabled we may not be able to set the SR-IOV VF limit in sysfs, it's not fatal because the limit is imposed by the driver anyway. Only the sysfs 'sriov_totalvfs' attribute will be too high. Print an error to inform user about the failure but allow probe to continue. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-25 22:17:44 -07:00
David S. Miller	19725496da	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	2018-07-24 19:21:58 -07:00
Jakub Kicinski	07300f774f	nfp: avoid buffer leak when FW communication fails After device is stopped we reset the rings by moving all free buffers to positions [0, cnt - 2], and clear the position cnt - 1 in the ring. We then proceed to clear the read/write pointers. This means that if we try to reset the ring again the code will assume that the next to fill buffer is at position 0 and swap it with cnt - 1. Since we previously cleared position cnt - 1 it will lead to leaking the first buffer and leaving ring in a bad state. This scenario can only happen if FW communication fails, in which case the ring will never be used again, so the fact it's in a bad state will not be noticed. Buffer leak is the only problem. Don't try to move buffers in the ring if the read/write pointers indicate the ring was never used or have already been reset. nfp_net_clear_config_and_disable() is now fully idempotent. Found by code inspection, FW communication failures are very rare, and reconfiguring a live device is not common either, so it's unlikely anyone has ever noticed the leak. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:58:52 -07:00
Jakub Kicinski	042f882556	nfp: bring back support for offloading shared blocks Now that we have offload replay infrastructure added by commit `326367427c` ("net: sched: call reoffload op on block callback reg") and flows are guaranteed to be removed correctly, we can revert commit `951a8ee6de` ("nfp: reject binding to shared blocks"). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:58:52 -07:00
John Hurley	b809ec869b	nfp: flower: ensure dead neighbour entries are not offloaded Previously only the neighbour state was checked to decide if an offloaded entry should be removed. However, there can be situations when the entry is dead but still marked as valid. This can lead to dead entries not being removed from fw tables or even incorrect data being added. Check the entry dead bit before deciding if it should be added to or removed from fw neighbour tables. Fixes: `8e6a9046b6` ("nfp: flower vxlan neighbour offload") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:55:01 -07:00
David S. Miller	eae249b27f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-07-20 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add sharing of BPF objects within one ASIC: this allows for reuse of the same program on multiple ports of a device, and therefore gains better code store utilization. On top of that, this now also enables sharing of maps between programs attached to different ports of a device, from Jakub. 2) Cleanup in libbpf and bpftool's Makefile to reduce unneeded feature detections and unused variable exports, also from Jakub. 3) First batch of RCU annotation fixes in prog array handling, i.e. there are several __rcu markers which are not correct as well as some of the RCU handling, from Roman. 4) Two fixes in BPF sample files related to checking of the prog_cnt upper limit from sample loader, from Dan. 5) Minor cleanup in sockmap to remove a set but not used variable, from Colin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-20 23:58:30 -07:00
David S. Miller	c4c5551df1	Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux All conflicts were trivial overlapping changes, so reasonably easy to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-20 21:17:12 -07:00
Jakub Kicinski	b5faa20d6f	nfp: bpf: allow program sharing within ASIC Allow program sharing between netdevs of the same NFP ASIC. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-18 15:10:34 +02:00
Jakub Kicinski	602144c224	bpf: offload: keep the offload state per-ASIC Create a higher-level entity to represent a device/ASIC to allow programs and maps to be shared between device ports. The extra work is required to make sure we don't destroy BPF objects as soon as the netdev for which they were loaded gets destroyed, as other ports may still be using them. When netdev goes away all of its BPF objects will be moved to other netdevs of the device, and only destroyed when last netdev is unregistered. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-18 15:10:34 +02:00
Jakub Kicinski	9fd7c55591	bpf: offload: aggregate offloads per-device Currently we have two lists of offloaded objects - programs and maps. Netdevice unregister notifier scans those lists to orphan objects associated with device being unregistered. This puts unnecessary (even if negligible) burden on all netdev unregister calls in BPF- -enabled kernel. The lists of objects may potentially get long making the linear scan even more problematic. There haven't been complaints about this mechanisms so far, but it is suboptimal. Instead of relying on notifiers, make the few BPF-capable drivers register explicitly for BPF offloads. The programs and maps will now be collected per-device not on a global list, and only scanned for removal when driver unregisters from BPF offloads. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-18 15:10:34 +02:00
Jakub Kicinski	4612bebfa3	nfp: add .ndo_init() and .ndo_uninit() callbacks BPF code should unregister the offload capabilities from .ndo_uninit(), to make sure the operation is atomic with unlist_netdevice(). Plumb the init/uninit NDOs for vNICs and representors. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-18 15:10:34 +02:00
David S. Miller	2aa4a3378a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-07-15 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Various different arm32 JIT improvements in order to optimize code emission and make the JIT code itself more robust, from Russell. 2) Support simultaneous driver and offloaded XDP in order to allow for advanced use-cases where some work is offloaded to the NIC and some to the host. Also add ability for bpftool to load programs and maps beyond just the cgroup case, from Jakub. 3) Add BPF JIT support in nfp for multiplication as well as division. For the latter in particular, it uses the reciprocal algorithm to emulate it, from Jiong. 4) Add BTF pretty print functionality to bpftool in plain and JSON output format, from Okash. 5) Add build and installation to the BPF helper man page into bpftool, from Quentin. 6) Add a TCP BPF callback for listening sockets which is triggered right after the socket transitions to TCP_LISTEN state, from Andrey. 7) Add a new cgroup tree command to bpftool which iterates over the whole cgroup tree and prints all attached programs, from Roman. 8) Improve xdp_redirect_cpu sample to support parsing of double VLAN tagged packets, from Jesper. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-14 18:47:44 -07:00
Jakub Kicinski	5f4284015e	nfp: add support for simultaneous driver and hw XDP Split handling of offloaded and driver programs completely. Since offloaded programs always come with XDP_FLAGS_HW_MODE set in reality there could be no sharing, anyway, programs would only be installed in driver or in hardware. Splitting the handling allows us to install programs in HW and in driver at the same time. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-13 20:26:35 +02:00
Jakub Kicinski	a25717d2b6	xdp: support simultaneous driver and hw XDP attachment Split the query of HW-attached program from the software one. Introduce new .ndo_bpf command to query HW-attached program. This will allow drivers to install different programs in HW and SW at the same time. Netlink can now also carry multiple programs on dump (in which case mode will be set to XDP_ATTACHED_MULTI and user has to check per-attachment point attributes, IFLA_XDP_PROG_ID will not be present). We reuse IFLA_XDP_PROG_ID skb space for second mode, so rtnl_xdp_size() doesn't need to be updated. Note that the installation side is still not there, since all drivers currently reject installing more than one program at the time. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-13 20:26:35 +02:00
Jakub Kicinski	05296620f6	xdp: factor out common program/flags handling from drivers Basic operations drivers perform during xdp setup and query can be moved to helpers in the core. Encapsulate program and flags into a structure and add helpers. Note that the structure is intended as the "main" program information source in the driver. Most drivers will additionally place the program pointer in their fast path or ring structures. The helpers don't have a huge impact now, but they will decrease the code duplication when programs can be installed in HW and driver at the same time. Encapsulating the basic operations in helpers will hopefully also reduce the number of changes to drivers which adopt them. Helpers could really be static inline, but they depend on definition of struct netdev_bpf which means they'd have to be placed in netdevice.h, an already 4500 line header. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-13 20:26:35 +02:00
Jakub Kicinski	6b86758973	xdp: don't make drivers report attachment mode prog_attached of struct netdev_bpf should have been superseded by simply setting prog_id long time ago, but we kept it around to allow offloading drivers to communicate attachment mode (drv vs hw). Subsequently drivers were also allowed to report back attachment flags (prog_flags), and since nowadays only programs attached will XDP_FLAGS_HW_MODE can get offloaded, we can tell the attachment mode from the flags driver reports. Remove prog_attached member. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-13 20:26:35 +02:00
Arnd Bergmann	51bef926be	nfp: avoid using getnstimeofday64() getnstimeofday64 is deprecated in favor of the ktime_get() family of functions. The direct replacement would be ktime_get_real_ts64(), but I'm picking the basic ktime_get() instead: - using a ktime_t simplifies the code compared to timespec64 - using monotonic time instead of real time avoids issues caused by a concurrent settimeofday() or during a leap second adjustment. Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-12 14:55:39 -07:00
Linus Torvalds	8979319f2d	pci-v4.18-fixes-2 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAltBPacUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vwx+hAAvzTP6o3VOtgNK7lm3nOBuzfykgCv TFhXP2yeDItWDBLDpWX7wjs2657W3Sjrpw6FyVIYvsoMKKRuOYeZ6ChDieG5ZgTj oxav5U6TlDHoF0fNq0LWfv78lP6+++7/6yaer6j9xDksVqE4/zxlFcExxuszhZlC 8ptJ54ORn92RdfRHCDptA4PFReNlQWNw3bpKpGxu8xj0TN/sYPN3ggHfnmiEyGMZ 8/KLNOzGrhoqztaBPyRoG3GhU1hpicT+fWzg11Li8hP1solQDht+S3mnCC1IxqdI JSU7/jhti4nUV56qE7QzYzdsVbTFcItGn0WImklIFCt7htp4Rms0wN+QCmwhtjKM 0WH02gRDip4CcYcPZGeGaeexWJFEScamFK1yxxDV7059KsoQKUN1Sm+R7y9xORYw nQAHPTO/nv02Xo+ADAYzV0aBPD7fEvFaWtdXAuLocVWVj3eiEqp8ftoYmuWym6T3 gHWt9Dod8olj3t9bkR1MWZ121Ar5MsobgJSF30smEx8VMKv+4Xx8eXvq0FCuUQwT s/WLTPqK31Sqjii0xe1wO8g0yexCVaBpXJB/e9PpYf/+ICItG1DHLz0Ygw01QWVK Tv7HTAofdO42kChHjErB6GbzSuxDxSVCPN26bwkZu0eV91uKiLKA7wLUv8+qSXlZ lK98Eypy0Ti7pvA= =IOF/ -----END PGP SIGNATURE----- Merge tag 'pci-v4.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix a use-after-free in the endpoint code (Dan Carpenter) - Stop defaulting CONFIG_PCIE_DW_PLAT_HOST to yes (Geert Uytterhoeven) - Fix an nfp regression caused by a change in how we limit the number of VFs we can enable (Jakub Kicinski) - Fix failure path cleanup issues in the new R-Car gen3 PHY support (Marek Vasut) - Fix leaks of OF nodes in faraday, xilinx-nwl, xilinx (Nicholas Mc Guire) * tag 'pci-v4.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: nfp: stop limiting VFs to 0 PCI/IOV: Reset total_VFs limit after detaching PF driver PCI: faraday: Add missing of_node_put() PCI: xilinx-nwl: Add missing of_node_put() PCI: xilinx: Add missing of_node_put() PCI: endpoint: Use after free in pci_epf_unregister_driver() PCI: controller: dwc: Do not let PCIE_DW_PLAT_HOST default to yes PCI: rcar: Clean up PHY init on failure PCI: rcar: Shut the PHY down in failpath	2018-07-08 10:55:21 -07:00
Jiong Wang	9fb410a89e	nfp: bpf: migrate to advanced reciprocal divide in reciprocal_div.h As we are doing JIT, we would want to use the advanced version of the reciprocal divide (reciprocal_value_adv) to trade performance with host. We could reduce the required ALU instructions from 4 to 2 or 1. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-07 01:45:31 +02:00
Jiong Wang	2a952b03d1	nfp: bpf: support u32 divide using reciprocal_div.h NFP doesn't have integer divide instruction, this patch use reciprocal algorithm (the basic one, reciprocal_div) to emulate it. For each u32 divide, we would need 11 instructions to finish the operation. 7 (for multiplication) + 4 (various ALUs) = 11 Given NFP only supports multiplication no bigger than u32, we'd require divisor and dividend no bigger than that as well. Also eBPF doesn't support signed divide and has enforced this on C language level by failing compilation. However LLVM assembler hasn't enforced this, so it is possible for negative constant to leak in as a BPF_K operand through assembly code, we reject such cases as well. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-07 01:45:31 +02:00
Jiong Wang	d3d23fdb46	nfp: bpf: support u16 and u32 multiplications NFP supports u16 and u32 multiplication. Multiplication is done 8-bits per step, therefore we need 2 steps for u16 and 4 steps for u32. We also need one start instruction to initialize the sequence and one or two instructions to fetch the result depending on either you need the high halve of u32 multiplication. For ALU64, if either operand is beyond u32's value range, we reject it. One thing to note, if the source operand is BPF_K, then we need to check "imm" field directly, and we'd reject it if it is negative. Because for ALU64, "imm" (with s32 type) is expected to be sign extended to s64 which NFP mul doesn't support. For ALU32, it is fine for "imm" be negative though, because the result is 32-bits and here is no difference on the low halve of result for signed/unsigned mul, so we will get correct result. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-07 01:45:31 +02:00
Jiong Wang	33b9431058	nfp: bpf: copy range info for all operands of all ALU operations NFP verifier hook is coping range information of the shift amount for indirect shift operation so optimized shift sequences could be generated. We want to use range info to do more things. For example, to decide whether multiplication and divide are supported on the given range. This patch simply let NFP verifier hook to copy range info for all operands of all ALU operands. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-07 01:45:31 +02:00
Jiong Wang	662c54721d	nfp: bpf: rename umin/umax to umin_src/umax_src The two fields are a copy of umin and umax info of bpf_insn->src_reg generated by verifier. Rename to make their meaning clear. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-07 01:45:31 +02:00
David S. Miller	b68034087a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-07-03 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Various improvements to bpftool and libbpf, that is, bpftool build speed improvements, missing BPF program types added for detection by section name, ability to load programs from '.text' section is made to work again, and better bash completion handling, from Jakub. 2) Improvements to nfp JIT's map read handling which allows for optimizing memcpy from map to packet, from Jiong. 3) New BPF sample is added which demonstrates XDP in combination with bpf_perf_event_output() helper to sample packets on all CPUs, from Toke. 4) Add a new BPF kselftest case for tracking connect(2) BPF hooks infrastructure in combination with TFO, from Andrey. 5) Extend the XDP/BPF xdp_rxq_info sample code with a cmdline option to read payload from packet data in order to use it for benchmarking. Also for '--action XDP_TX' option implement swapping of MAC addresses to avoid drops on some hardware seen during testing, from Jesper. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 08:53:53 +09:00
David S. Miller	5cd3da4ba2	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net Simple overlapping changes in stmmac driver. Adjust skb_gro_flush_final_remcsum function signature to make GRO list changes in net-next, as per Stephen Rothwell's example merge resolution. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 10:29:26 +09:00
David S. Miller	271b955e52	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2018-07-01 The following pull-request contains BPF updates for your net tree. The main changes are: 1) A bpf_fib_lookup() helper fix to change the API before freeze to return an encoding of the FIB lookup result and return the nexthop device index in the params struct (instead of device index as return code that we had before), from David. 2) Various BPF JIT fixes to address syzkaller fallout, that is, do not reject progs when set_memory_*() fails since it could still be RO. Also arm32 JIT was not using bpf_jit_binary_lock_ro() API which was an issue, and a memory leak in s390 JIT found during review, from Daniel. 3) Multiple fixes for sockmap/hash to address most of the syzkaller triggered bugs. Usage with IPv6 was crashing, a GPF in bpf_tcp_close(), a missing sock_map_release() routine to hook up to callbacks, and a fix for an omitted bucket lock in sock_close(), from John. 4) Two bpftool fixes to remove duplicated error message on program load, and another one to close the libbpf object after program load. One additional fix for nfp driver's BPF offload to avoid stopping offload completely if replace of program failed, from Jakub. 5) Couple of BPF selftest fixes that bail out in some of the test scripts if the user does not have the right privileges, from Jeffrin. 6) Fixes in test_bpf for s390 when CONFIG_BPF_JIT_ALWAYS_ON is set where we need to set the flag that some of the test cases are expected to fail, from Kleber. 7) Fix to detangle BPF_LIRC_MODE2 dependency from CONFIG_CGROUP_BPF since it has no relation to it and lirc2 users often have configs without cgroups enabled and thus would not be able to use it, from Sean. 8) Fix a selftest failure in sockmap by removing a useless setrlimit() call that would set a too low limit where at the same time we are already including bpf_rlimit.h that does the job, from Yonghong. 9) Fix BPF selftest config with missing missing NET_SCHED, from Anders. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-01 09:27:44 +09:00
John Hurley	635cf43dbd	nfp: flower: enabled offloading of Team LAG Currently the NFP fw only supports L3/L4 hashing so rejects the offload of filters that output to LAG ports implementing other hash algorithms. Team, however, uses a BPF function for the hash that is not defined. To support Team offload, accept hashes that are defined as 'unknown' (only Team defines such hash types). In this case, use the NFP default of L3/L4 hashing for egress port selection. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-30 21:31:56 +09:00
John Hurley	51a8cefc6e	nfp: flower: offload tos and tunnel flags for ipv4 udp tunnels Extract the tos and the tunnel flags from the tunnel key and offload these action fields. Only the checksum and tunnel key flags are implemented in fw so reject offloads of other flags. The tunnel key flag is always considered set in the fw so enforce that it is set in the rule. Note that the compulsory setting of the tunnel key flag and optional setting of checksum is inline with how tc currently generates ipv4 udp tunnel actions. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-30 21:31:56 +09:00
John Hurley	ed21b637e9	nfp: flower: extract ipv4 udp tunnel ttl from route Previously the ttl for ipv4 udp tunnels was set to the namespace default. Modify this to attempt to extract the ttl from a full route lookup on the tunnel destination. If this is not possible then resort to the default. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-30 21:31:56 +09:00
Pieter Jansen van Vuuren	ed8f2b52b6	nfp: flower: ignore checksum actions when performing pedit actions Hardware will automatically update csum in headers when a set action has been performed. This means we could in the driver ignore the explicit checksum action when performing a set action. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-30 21:31:56 +09:00
Jakub Kicinski	5d4b0b4068	nfp: populate bus-info on representors We used to leave bus-info in ethtool driver info empty for representors in case multi-PCIe-to-single-host cards make the association between PCIe device and NFP many to one. It seems these attempts are futile, we need to link the representors to one PCIe device in sysfs to get consistent naming, plus devlink uses one PCIe as a handle, anyway. The multi-PCIe-to-single-system support won't be clean, if it ever comes. Turns out some user space (RHEL tests) likes to read bus-info so just populate it. While at it remove unnecessary app NULL-check, representors are spawned by an app, so it must exist. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-30 21:31:56 +09:00
Jakub Kicinski	d387b8a19a	nfp: make use of napi_consume_skb() Use napi_consume_skb() in nfp_net_tx_complete() to get bulk free. Pass 0 as budget for ctrl queue completion since it runs out of a tasklet. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-30 21:31:56 +09:00
Jakub Kicinski	670b5274ff	nfp: implement netpoll ndo (thus enabling netconsole) NFP NAPI handling will only complete the TXed packets when called with budget of 0, implement ndo_poll_controller by scheduling NAPI on all TX queues. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-30 21:31:56 +09:00
Jakub Kicinski	18aa5b180f	nfp: fail probe if serial or interface id is missing On some platforms with broken ACPI tables we may not have access to the Serial Number PCIe capability. This capability is crucial for us for switchdev operation as we use serial number as switch ID, and for communication with management FW where interface ID is used. If we can't determine the Serial Number we have to fail device probe. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-30 21:31:56 +09:00
Jakub Kicinski	f055a9dfee	nfp: expose ring stats of inactive rings via ethtool After user changes the ring count statistics for deactivated rings disappear from ethtool -S output. This causes loss of information to the user and means that ethtool stats may not add up to interface stats. Always expose counters from all the rings. Note that we allocate at most num_possible_cpus() rings so number of rings should be reasonable. The alternative of only listing stats for rings which were ever in use could be confusing. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-30 21:31:56 +09:00
Jakub Kicinski	83235822b8	nfp: stop limiting VFs to 0 Before `8d85a7a4f2` ("PCI/IOV: Allow PF drivers to limit total_VFs to 0"), pci_sriov_set_totalvfs(pdev, 0) meant "we can enable TotalVFs virtual functions". After `8d85a7a4f2`, it means "we can't enable any VFs". That broke this scenario where nfp intends to remove any limit on the number of VFs that can be enabled: nfp_pci_probe nfp_pcie_sriov_read_nfd_limit nfp_rtsym_read_le("nfd_vf_cfg_max_vfs", &err) pci_sriov_set_totalvfs(pf->pdev, 0) # if FW didn't expose a limit ... # userspace writes N to sysfs "sriov_numvfs": sriov_numvfs_store pci_sriov_get_totalvfs # now returns 0 return -ERANGE Prior to `8d85a7a4f2`, pci_sriov_get_totalvfs() returned TotalVFs, but it now returns 0. Remove the pci_sriov_set_totalvfs(pdev, 0) calls so we don't limit the number of VFs that can be enabled. Fixes: `8d85a7a4f2` ("PCI/IOV: Allow PF drivers to limit total_VFs to 0") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-06-29 15:09:00 -05:00
Jiong Wang	cc0dff6dc3	nfp: bpf: allow source ptr type be map ptr in memcpy optimization Map read has been supported on NFP, this patch enables optimization for memcpy from map to packet. This patch also fixed one latent bug which will cause copying from unexpected address once memcpy for map pointer enabled. The fixed code path was not exercised before. Reported-by: Mary Pham <mary.pham@netronome.com> Reported-by: David Beckett <david.beckett@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-27 10:57:15 +02:00
Chengguang Xu	2d2595719a	nfp: cast sizeof() to int when comparing with error code sizeof() will return unsigned value so in the error check negative error code will be always larger than sizeof(). Fixes: `a0d8e02c35` ("nfp: add support for reading nffw info") Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-27 15:36:32 +09:00
John Hurley	951a8ee6de	nfp: reject binding to shared blocks TC shared blocks allow multiple qdiscs to be grouped together and filters shared between them. Currently the chains of filters attached to a block are only flushed when the block is removed. If a qdisc is removed from a block but the block still exists, flow del messages are not passed to the callback registered for that qdisc. For the NFP, this presents the possibility of rules still existing in hw when they should be removed. Prevent binding to shared blocks until the kernel can send per qdisc del messages when block unbinds occur. tcf_block_shared() was not used outside of the core until now, so also add an empty implementation for builds with CONFIG_NET_CLS=n. Fixes: `4861738775` ("net: sched: introduce shared filter blocks infrastructure") Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-27 10:46:43 +09:00
Pieter Jansen van Vuuren	a64119415f	nfp: flower: fix mpls ether type detection Previously it was not possible to distinguish between mpls ether types and other ether types. This leads to incorrect classification of offloaded filters that match on mpls ether type. For example the following two filters overlap: # tc filter add dev eth0 parent ffff: \ protocol 0x8847 flower \ action mirred egress redirect dev eth1 # tc filter add dev eth0 parent ffff: \ protocol 0x0800 flower \ action mirred egress redirect dev eth2 The driver now correctly includes the mac_mpls layer where HW stores mpls fields, when it detects an mpls ether type. It also sets the MPLS_Q bit to indicate that the filter should match mpls packets. Fixes: `bb055c198d` ("nfp: add mpls match offloading support") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-27 10:46:43 +09:00
John Hurley	60513bd82c	net: sched: pass extack pointer to block binds and cb registration Pass the extact struct from a tc qdisc add to the block bind function and, in turn, to the setup_tc ndo of binding device via the tc_block_offload struct. Pass this back to any block callback registrations to allow netlink logging of fails in the bind process. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-26 23:21:32 +09:00
Jakub Kicinski	68d676a089	nfp: bpf: don't stop offload if replace failed Stopping offload completely if replace of program failed dates back to days of transparent offload. Back then we wanted to silently fall back to the in-driver processing. Today we mark programs for offload when they are loaded into the kernel, so the transparent offload is no longer a reality. Flags check in the driver will only allow replace of a driver program with another driver program or an offload program with another offload program. When driver program is replaced stopping offload is a no-op, because driver program isn't offloaded. When replacing offloaded program if the offload fails the entire operation will fail all the way back to user space and we should continue using the old program. IOW when replacing a driver program stopping offload is unnecessary and when replacing offloaded program - it's a bug, old program should continue to run. In practice this bug would mean that if offload operation was to fail (either due to FW communication error, kernel OOM or new program being offloaded but for a different netdev) driver would continue reporting that previous XDP program is offloaded but in fact no program will be loaded in hardware. The failure is fairly unlikely (found by inspection, when working on the code) but it's unpleasant. Backport note: even though the bug was introduced in commit `cafa92ac25` ("nfp: bpf: add support for XDP_FLAGS_HW_MODE"), this fix depends on commit `441a33031f` ("net: xdp: don't allow device-bound programs in driver mode"), so this fix is sufficient only in v4.15 or newer. Kernels v4.13.x and v4.14.x do need to stop offload if it was transparent/opportunistic, i.e. if XDP_FLAGS_HW_MODE was not set on running program. Fixes: `cafa92ac25` ("nfp: bpf: add support for XDP_FLAGS_HW_MODE") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-25 11:33:55 +02:00
Jiri Pirko	eba7927b55	nfp: handle cls_flower command default case Currently the default case is not handled, which with future command introductions would introduce a warning. So handle it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-25 16:14:03 +09:00
Linus Torvalds	9215310cf1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Various netfilter fixlets from Pablo and the netfilter team. 2) Fix regression in IPVS caused by lack of PMTU exceptions on local routes in ipv6, from Julian Anastasov. 3) Check pskb_trim_rcsum for failure in DSA, from Zhouyang Jia. 4) Don't crash on poll in TLS, from Daniel Borkmann. 5) Revert SO_REUSE{ADDR,PORT} change, it regresses various things including Avahi mDNS. From Bart Van Assche. 6) Missing of_node_put in qcom/emac driver, from Yue Haibing. 7) We lack checking of the TCP checking in one special case during SYN receive, from Frank van der Linden. 8) Fix module init error paths of mac80211 hwsim, from Johannes Berg. 9) Handle 802.1ad properly in stmmac driver, from Elad Nachman. 10) Must grab HW caps before doing quirk checks in stmmac driver, from Jose Abreu. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits) net: stmmac: Run HWIF Quirks after getting HW caps neighbour: skip NTF_EXT_LEARNED entries during forced gc net: cxgb3: add error handling for sysfs_create_group tls: fix waitall behavior in tls_sw_recvmsg tls: fix use-after-free in tls_push_record l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl() l2tp: reject creation of non-PPP sessions on L2TPv2 tunnels mlxsw: spectrum_switchdev: Fix port_vlan refcounting mlxsw: spectrum_router: Align with new route replace logic mlxsw: spectrum_router: Allow appending to dev-only routes ipv6: Only emit append events for appended routes stmmac: added support for 802.1ad vlan stripping cfg80211: fix rcu in cfg80211_unregister_wdev mac80211: Move up init of TXQs mac80211_hwsim: fix module init error paths cfg80211: initialize sinfo in cfg80211_get_station nl80211: fix some kernel doc tag mistakes hv_netvsc: Fix the variable sizes in ipsecv2 and rsc offload rds: avoid unenecessary cong_update in loop transport l2tp: clean up stale tunnel or session in pppol2tp_connect's error path ...	2018-06-16 07:39:34 +09:00
Kees Cook	42bc47b353	treewide: Use array_size() in vmalloc() The vmalloc() function has no 2-factor argument form, so multiplication factors need to be wrapped in array_size(). This patch replaces cases of: vmalloc(a * b) with: vmalloc(array_size(a, b)) as well as handling cases of: vmalloc(a * b * c) with: vmalloc(array3_size(a, b, c)) This does, however, attempt to ignore constant size factors like: vmalloc(4 * 1024) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( vmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| vmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( vmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| vmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| vmalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| vmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| vmalloc( - sizeof(u8) * COUNT + COUNT , ...) \| vmalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| vmalloc( - sizeof(char) * COUNT + COUNT , ...) \| vmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( vmalloc( - sizeof(TYPE) * (COUNT_ID) + array_size(COUNT_ID, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * COUNT_ID + array_size(COUNT_ID, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * COUNT_CONST + array_size(COUNT_CONST, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(THING) * (COUNT_ID) + array_size(COUNT_ID, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * COUNT_ID + array_size(COUNT_ID, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * COUNT_CONST + array_size(COUNT_CONST, sizeof(THING)) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ vmalloc( - SIZE * COUNT + array_size(COUNT, SIZE) , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( vmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( vmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| vmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| vmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| vmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| vmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| vmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( vmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( vmalloc(C1 * C2 * C3, ...) \| vmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants. @@ expression E1, E2; constant C1, C2; @@ ( vmalloc(C1 * C2, ...) \| vmalloc( - E1 * E2 + array_size(E1, E2) , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-12 16:19:22 -07:00
Kees Cook	778e1cdd81	treewide: kvzalloc() -> kvcalloc() The kvzalloc() function has a 2-factor argument form, kvcalloc(). This patch replaces cases of: kvzalloc(a * b, gfp) with: kvcalloc(a * b, gfp) as well as handling cases of: kvzalloc(a * b * c, gfp) with: kvzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kvcalloc(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kvzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kvzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| kvzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kvzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| kvzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| kvzalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| kvzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| kvzalloc( - sizeof(u8) * COUNT + COUNT , ...) \| kvzalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| kvzalloc( - sizeof(char) * COUNT + COUNT , ...) \| kvzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kvzalloc + kvcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) \| - kvzalloc + kvcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) \| - kvzalloc + kvcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) \| - kvzalloc + kvcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) \| - kvzalloc + kvcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) \| - kvzalloc + kvcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) \| - kvzalloc + kvcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) \| - kvzalloc + kvcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kvzalloc + kvcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kvzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kvzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kvzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kvzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kvzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kvzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kvzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kvzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kvzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kvzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kvzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kvzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kvzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| kvzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kvzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kvzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kvzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kvzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kvzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kvzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kvzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kvzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kvzalloc(C1 * C2 * C3, ...) \| kvzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) \| kvzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) \| kvzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) \| kvzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kvzalloc(sizeof(THING) * C2, ...) \| kvzalloc(sizeof(TYPE) * C2, ...) \| kvzalloc(C1 * C2 * C3, ...) \| kvzalloc(C1 * C2, ...) \| - kvzalloc + kvcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) \| - kvzalloc + kvcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) \| - kvzalloc + kvcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) \| - kvzalloc + kvcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) \| - kvzalloc + kvcalloc ( - (E1) * E2 + E1, E2 , ...) \| - kvzalloc + kvcalloc ( - (E1) * (E2) + E1, E2 , ...) \| - kvzalloc + kvcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-12 16:19:22 -07:00
Pieter Jansen van Vuuren	e62e51af34	nfp: flower: free dst_entry in route table We need to release the refcnt on dst_entry in the route table, otherwise we will leak the route. Fixes: `8e6a9046b6` ("nfp: flower vxlan neighbour offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Louis Peens <louis.peens@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-12 15:18:28 -07:00
Jakub Kicinski	fe06a64e0d	nfp: remove phys_port_name on flower's vNIC .ndo_get_phys_port_name was recently extended to support multi-vNIC FWs. These are firmwares which can have more than one vNIC per PF without associated port (e.g. Adaptive Buffer Management FW), therefore we need a way of distinguishing the vNICs. Unfortunately, it's too late to make flower use the same naming. Flower users may depend on .ndo_get_phys_port_name returning -EOPNOTSUPP, for example the name udev gave the PF vNIC was just the bare PCI device-based name before the change, and will have 'nn0' appended after. To ensure flower's vNIC doesn't have phys_port_name attribute, add a flag to vNIC struct and set it in flower code. New projects will not set the flag adhere to the naming scheme from the start. Fixes: `51c1df83e3` ("nfp: assign vNIC id as phys_port_name of vNICs which are not ports") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-12 15:18:28 -07:00
Jakub Kicinski	29f534c4bb	nfp: include all ring counters in interface stats We are gathering software statistics on per-ring basis. .ndo_get_stats64 handler adds the rings up. Unfortunately we are currently only adding up active rings, which means that if user decreases the number of active rings the statistics from deactivated rings will no longer be counted and total interface statistics may go backwards. Always sum all possible rings, the stats are allocated statically for max number of rings, so we don't have to worry about them being removed. We could add the stats up when user changes the ring count, but it seems unnecessary.. Adding up inactive rings will be very quick since no datapath will be touching them. Fixes: `164d1e9e5d` ("nfp: add support for ethtool .set_channels") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-12 15:18:28 -07:00
Jakub Kicinski	f8d0efb112	nfp: don't pad strings in nfp_cpp_resource_find() to avoid gcc 8 warning Once upon a time nfp_cpp_resource_find() took a name parameter, which could be any user-chosen string. Resources are identified by a CRC32 hash of a 8 byte string, so we had to pad user input with zeros to make sure CRC32 gave the correct result. Since then nfp_cpp_resource_find() was made to operate on allocated resources only (struct nfp_resource). We kzalloc those so there is no need to pad the strings and use memcmp. This avoids a GCC 8 stringop-truncation warning: In function ‘nfp_cpp_resource_find’, inlined from ‘nfp_resource_try_acquire’ at .../nfpcore/nfp_resource.c:153:8, inlined from ‘nfp_resource_acquire’ at .../nfpcore/nfp_resource.c:206:9: .../nfpcore/nfp_resource.c:108:2: warning: strncpy’ output may be truncated copying 8 bytes from a string of length 8 [-Wstringop-truncation] strncpy(name_pad, res->name, sizeof(name_pad)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-12 15:18:28 -07:00
David Ahern	ac0fc8a1bb	devlink: Add extack to reload and port_{un, }split operations Add extack argument to reload, port_split and port_unsplit operations. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-05 12:32:37 -04:00
Jakub Kicinski	2440711e49	nfp: abm: report correct MQ stats Report the stat diff to make sure MQ stats add up to child stats. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:17 -04:00
Jakub Kicinski	674cb229b6	nfp: abm: multi-queue RED offload Add support for MQ offload and setting RED parameters on queue-by-queue basis. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:16 -04:00
Jakub Kicinski	2ef3c253f1	nfp: abm: expose all PF queues Allocate the PF representor as multi-queue to allow setting the configuration per-queue. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:16 -04:00
Jakub Kicinski	0a8b7019bb	nfp: abm: expose the internal stats in ethtool There is a handful of statistics exposing some internal details of the implementation. Expose those via ethtool. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:16 -04:00
Jakub Kicinski	21f31bc029	nfp: allow apps to add extra stats to ports Allow nfp apps to add extra ethtool stats. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:16 -04:00
Jakub Kicinski	cb89cac8e7	nfp: abm: report statistics from RED offload Report basic and extended RED statistics back to TC. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:16 -04:00
Jakub Kicinski	8c8e6406f5	nfp: abm: add simple RED offload Offload simple RED configurations. For now support only DCTCP like scenarios where min and max are the same. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:15 -04:00
Jakub Kicinski	25e0036fcd	nfp: abm: add helpers for configuring queue marking levels Queue levels for simple ECN marking are stored in _abi_nfd_out_q_lvls_X symbol, where X is the PCIe PF id. Find out the location of that symbol and add helpers for modifying it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:15 -04:00
Jakub Kicinski	055ee0d698	nfp: abm: enable advanced queuing on demand ABM NIC FW has a cut-through mode where the PCIe queuing is bypassed, thus working like our standard NIC FWs. Use this mode by default and only enable queuing in switchdev mode where users can configure it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:15 -04:00
Jakub Kicinski	ca14573272	nfp: prefix vNIC phys_port_name with 'n' Some drivers are using a bare number inside phys_port_name as VF id and OpenStack's regexps will pick it up. We can't use a bare number for your vNICs, prefix the names with 'n'. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:15 -04:00
Jakub Kicinski	6fd1cfc0f0	nfp: return -EOPNOTSUPP from .ndo_get_phys_port_name for VFs After recent change we started returning 0 from ndo_get_phys_port_name for VFs. The name parameter for ndo_get_phys_port_name is not initialized by the stack so this can lead to a crash. We should have kept returning -EOPNOTSUPP in the first place. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-29 09:49:15 -04:00
John Hurley	7e24a59311	nfp: flower: compute link aggregation action If the egress device of an offloaded rule is a LAG port, then encode the output port to the NFP with a LAG identifier and the offloaded group ID. A prelag action is also offloaded which must be the first action of the series (although may appear after other pre-actions - e.g. tunnels). This causes the FW to check that it has the necessary information to output to the requested LAG port. If it does not, the packet is sent to the kernel before any other actions are applied to it. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-24 23:10:57 -04:00
John Hurley	2e1cc5226b	nfp: flower: implement host cmsg handler for LAG Adds the control message handler to synchronize offloaded group config with that of the kernel. Such messages are sent from fw to driver and feature the following 3 flags: - Data: an attached cmsg could not be processed - store for retransmission - Xon: FW can accept new messages - retransmit any stored cmsgs - Sync: full sync requested so retransmit all kernel LAG group info Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-24 23:10:57 -04:00
John Hurley	bb9a8d0311	nfp: flower: monitor and offload LAG groups Monitor LAG events via the NETDEV_CHANGEUPPER/NETDEV_CHANGELOWERSTATE notifiers to maintain a list of offloadable groups. Sync these groups with HW via a delayed workqueue to prevent excessive re-configuration. When the workqueue is triggered it may generate multiple control messages for different groups. These messages are linked via a batch ID and flags to indicate a new batch and the end of a batch. Update private data in each repr to track their LAG lower state flags. The state of a repr is used to determine the active netdevs that can be offloaded. For example, in active-backup mode, we only offload the netdev currently active. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-24 23:10:57 -04:00
John Hurley	b945245297	nfp: flower: add per repr private data for LAG offload Add a bitmap to each flower repr to track its state if it is enslaved by a bond. This LAG state may be different to the port state - for example, the port may be up but LAG state may be down due to the selection in an active/backup bond. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-24 23:10:56 -04:00
John Hurley	898bc7d634	nfp: flower: check for/turn on LAG support in firmware Check if the fw contains the _abi_flower_balance_sync_enable symbol. If it does then write a 1 to this indicating that the driver is willing to receive NIC to kernel LAG related control messages. If the write is successful, update the list of extra features supported by the fw and add a stub to accept LAG cmsgs. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-24 23:10:56 -04:00
John Hurley	1945ca7a81	nfp: nfpcore: add rtsym writing function Add an rtsym API function that combines the lookup of a symbol and the writing of a value to it. Values can be written as unsigned 32 or 64 bits. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-24 23:10:56 -04:00
John Hurley	24f132e29c	nfp: add ndo_set_mac_address for representors Adding a netdev to a bond requires that its mac address can be modified. The default eth_mac_addr is sufficient to satisfy this requirement. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-24 23:10:56 -04:00
David S. Miller	90fed9c946	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-05-24 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Björn Töpel cleans up AF_XDP (removes rebind, explicit cache alignment from uapi, etc). 2) David Ahern adds mtu checks to bpf_ipv{4,6}_fib_lookup() helpers. 3) Jesper Dangaard Brouer adds bulking support to ndo_xdp_xmit. 4) Jiong Wang adds support for indirect and arithmetic shifts to NFP 5) Martin KaFai Lau cleans up BTF uapi and makes the btf_header extensible. 6) Mathieu Xhonneux adds an End.BPF action to seg6local with BPF helpers allowing to edit/grow/shrink a SRH and apply on a packet generic SRv6 actions. 7) Sandipan Das adds support for bpf2bpf function calls in ppc64 JIT. 8) Yonghong Song adds BPF_TASK_FD_QUERY command for introspection of tracing events. 9) other misc fixes from Gustavo A. R. Silva, Sirio Balmelli, John Fastabend, and Magnus Karlsson ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-24 22:20:51 -04:00
Jakub Kicinski	51c1df83e3	nfp: assign vNIC id as phys_port_name of vNICs which are not ports When NFP is modelled as a switch we assign phys_port_name to respective port(representor )s: vNIC0 - \| - PF port (pf%d) MAC/PHY (p%d[s%d]) - \|E== In most cases there is only one vNIC for communication with the switch. If there is more than one we need to be able to identify them. Use %d as phys_port_name of the vNICs. We don't have to pass ID to nfp_net_debugfs_vnic_add() separately any more. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:19 -04:00
Jakub Kicinski	290f54db31	nfp: use split in naming of PCIe PF ports PCI PFs can host more than one logical endpoint. In NFP terms this means having more than one vNIC for PCIe PF. The vNICs are usually corresponding 1:1 to Ethernet ports. In core NIC we use the legacy idea of vNIC being the Ethernet port, hence netdevs put pX(sY) in their phys_port_name, like Ethernet ports would. When ASIC ports are fully represented we need to be able to name different PCIe PF ports, too. Use a scheme similar to Ethernet ports - pfXsY, for PCIe PF number X, sub-port Y. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:19 -04:00
Jakub Kicinski	1f70036723	nfp: abm: force Ethternet port up Current control firmware does not cater too well to multi-host applications. There is no way to check which hosts are up or otherwise negotiate what the state of the external port (the Ethernet port) should be. Make sure the link is up when driver loads, and don't take it down when Ethernet port netdev is closed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:19 -04:00
Jakub Kicinski	d05d902eda	nfp: abm: spawn port netdevs To configure buffering points we need full set of netdevs: ASIC user netdev -- \| -- PCIe port MAC port -- \| -- Configuring egrees qdiscs on user netdev configures standard Linux TC software qdiscs, configuring PCIe port qdiscs will provide a way of setting ASIC queuing parameters for PCIe block. MAC port netdev egress qdiscs correspond to ASIC MAC Traffic Manager block. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:19 -04:00
Jakub Kicinski	4afa3af418	nfp: add devlink_eswitch_mode_set callback Our previous apps all assumed to use only one eswitch mode (legacy or switchdev) without the ability to change it. ABM NIC will want to support the switch so plumb devlink_eswitch_mode_set through. The devlink_eswitch_mode_set is expected to spawn representors and potentially devlink ports so it's called under big devlink lock and pf->lock. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:19 -04:00
Jakub Kicinski	634c6b7a85	nfp: add app pointer to port representors nfp_apps can currently associate their structures with vNICs but not representors. Add app priv pointer to representors as well. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	cc54dc2804	nfp: abm: create project-specific vNIC structure ABM NIC requires more complex vNIC handling, allocate per-vNIC structure. Find out RX queue base and PCI PF id. There will be multiple PFs sharing the same MAC port, therefore the MAC address assigned to the vNIC must be looked up in the HWInfo database. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	c4c8f39a57	nfp: abm: add initial active buffer management NIC skeleton Add a very rudimentary active buffer management NIC support. For now it's like a core NIC without SR-IOV support. Next commits will extend its functionality. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	b586c77b3c	nfp: core: allow 4-byte aligned accesses to Memory Units Current code doesn't enforce length requirements on 32bit accesses with action NFP_CPP_ACTION_RW to memory units, but if the access is only aligned to 4 bytes as well we will fall into the explicit access case and error out. Such accesses are correct, allow them by lowering the width earlier. While at it use a switch statement to improve readability. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	a0d163f432	nfp: add shared buffer configuration Allow app FW to advertise its shared buffer pool information. Use the per-PF mailbox to configure them from devlink. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	0c693323a1	nfp: add support for per-PCI PF mailbox When working with devlink-related functionality for locking reasons it's easier to create a new mailbox per-PCI PF device than try to use one of the netdev/vNIC mailboxes. Define new mailbox structure and resolve its symbol during probe. For forward compatibility allow silent truncation of mailbox command data. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	8f6196f63c	nfp: move rtsym helpers to pf code nfp_net_pf_rtsym_read_optional() and nfp_net_pf_map_rtsym() are not really related to networking code. Move them to the PF code and remove the net from their names. They will soon be needed by code outside of nfp_net_main.c anyway. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
David S. Miller	6f6e434aa2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net S390 bpf_jit.S is removed in net-next and had changes in 'net', since that code isn't used any more take the removal. TLS data structures split the TX and RX components in 'net-next', put the new struct members from the bug fix in 'net' into the RX part. The 'net-next' tree had some reworking of how the ERSPAN code works in the GRE tunneling code, overlapping with a one-line headroom calculation fix in 'net'. Overlapping changes in __sock_map_ctx_update_elem(), keep the bits that read the prog members via READ_ONCE() into local variables before using them. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-21 16:01:54 -04:00
Jiri Pirko	5ec1380a21	devlink: extend attrs_set for setting port flavours Devlink ports can have specific flavour according to the purpose of use. This patch extend attrs_set so the driver can say which flavour port has. Initial flavours are: physical, cpu, dsa User can query this to see right away what is the purpose of each port. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-19 16:30:39 -04:00
Jiri Pirko	b9ffcbaf56	devlink: introduce devlink_port_attrs_set Change existing setter for split port information into more generic attrs setter. Alongside with that, allow to set port number and subport number for split ports. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-19 16:30:39 -04:00
Jiong Wang	c217abccaa	nfp: bpf: support arithmetic indirect right shift (BPF_ARSH \| BPF_X) Code logic is similar with arithmetic right shift by constant, and NFP get indirect shift amount through source A operand of PREV_ALU. It is possible to fall back to logic right shift if the MSB is known to be zero from range info, however there is no benefit to do this given logic indirect right shift use the same number and cycle of instruction sequence. Suppose the MSB of regX is the bit we want to replicate to fill in all the vacant positions, and regY contains the shift amount, then we could use single instruction to set up both. [alu, --, regY, OR, regX] -- NOTE: the PREV_ALU result doesn't need to write to any destination register. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-18 21:35:55 +02:00
Jiong Wang	f43d0f17fe	nfp: bpf: support arithmetic right shift by constant (BPF_ARSH \| BPF_K) Code logic is similar with logic right shift except we also need to set PREV_ALU result properly, the MSB of which is the bit that will be replicated to fill in all the vacant positions. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-18 21:35:55 +02:00
Jiong Wang	991f5b3651	nfp: bpf: support logic indirect shifts (BPF_[L\|R]SH \| BPF_X) For indirect shifts, shift amount is not specified as constant, NFP needs to get the shift amount through the low 5 bits of source A operand in PREV_ALU, therefore extra instructions are needed compared with shifts by constants. Because NFP is 32-bit, so we are using register pair for 64-bit shifts and therefore would need different instruction sequences depending on whether shift amount is less than 32 or not. NFP branch-on-bit-test instruction emitter is added by this patch and is used for efficient runtime check on shift amount. We'd think the shift amount is less than 32 if bit 5 is clear and greater or equal than 32 otherwise. Shift amount is greater than or equal to 64 will result in undefined behavior. This patch also use range info to avoid generating unnecessary runtime code if we are certain shift amount is less than 32 or not. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-18 21:35:54 +02:00
Jiri Pirko	3b734ff604	nfp: flower: fix error path during representor creation Don't store repr pointer to reprs array until the representor is successfully created. This avoids message about "representor destruction" even when it was never created. Also it cleans-up the flow. Also, check return value after port alloc. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-17 16:23:29 -04:00
David S. Miller	b9f672af14	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-05-17 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Provide a new BPF helper for doing a FIB and neighbor lookup in the kernel tables from an XDP or tc BPF program. The helper provides a fast-path for forwarding packets. The API supports IPv4, IPv6 and MPLS protocols, but currently IPv4 and IPv6 are implemented in this initial work, from David (Ahern). 2) Just a tiny diff but huge feature enabled for nfp driver by extending the BPF offload beyond a pure host processing offload. Offloaded XDP programs are allowed to set the RX queue index and thus opening the door for defining a fully programmable RSS/n-tuple filter replacement. Once BPF decided on a queue already, the device data-path will skip the conventional RSS processing completely, from Jakub. 3) The original sockmap implementation was array based similar to devmap. However unlike devmap where an ifindex has a 1:1 mapping into the map there are use cases with sockets that need to be referenced using longer keys. Hence, sockhash map is added reusing as much of the sockmap code as possible, from John. 4) Introduce BTF ID. The ID is allocatd through an IDR similar as with BPF maps and progs. It also makes BTF accessible to user space via BPF_BTF_GET_FD_BY_ID and adds exposure of the BTF data through BPF_OBJ_GET_INFO_BY_FD, from Martin. 5) Enable BPF stackmap with build_id also in NMI context. Due to the up_read() of current->mm->mmap_sem build_id cannot be parsed. This work defers the up_read() via a per-cpu irq_work so that at least limited support can be enabled, from Song. 6) Various BPF JIT follow-up cleanups and fixups after the LD_ABS/LD_IND JIT conversion as well as implementation of an optimized 32/64 bit immediate load in the arm64 JIT that allows to reduce the number of emitted instructions; in case of tested real-world programs they were shrinking by three percent, from Daniel. 7) Add ifindex parameter to the libbpf loader in order to enable BPF offload support. Right now only iproute2 can load offloaded BPF and this will also enable libbpf for direct integration into other applications, from David (Beckett). 8) Convert the plain text documentation under Documentation/bpf/ into RST format since this is the appropriate standard the kernel is moving to for all documentation. Also add an overview README.rst, from Jesper. 9) Add __printf verification attribute to the bpf_verifier_vlog() helper. Though it uses va_list we can still allow gcc to check the format string, from Mathieu. 10) Fix a bash reference in the BPF selftest's Makefile. The '\|& ...' is a bash 4.0+ feature which is not guaranteed to be available when calling out to shell, therefore use a more portable variant, from Joe. 11) Fix a 64 bit division in xdp_umem_reg() by using div_u64() instead of relying on the gcc built-in, from Björn. 12) Fix a sock hashmap kmalloc warning reported by syzbot when an overly large key size is used in hashmap then causing overflows in htab->elem_size. Reject bogus attr->key_size early in the sock_hash_alloc(), from Yonghong. 13) Ensure in BPF selftests when urandom_read is being linked that --build-id is always enabled so that test_stacktrace_build_id[_nmi] won't be failing, from Alexei. 14) Add bitsperlong.h as well as errno.h uapi headers into the tools header infrastructure which point to one of the arch specific uapi headers. This was needed in order to fix a build error on some systems for the BPF selftests, from Sirio. 15) Allow for short options to be used in the xdp_monitor BPF sample code. And also a bpf.h tools uapi header sync in order to fix a selftest build failure. Both from Prashant. 16) More formally clarify the meaning of ID in the direct packet access section of the BPF documentation, from Wang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-16 22:47:11 -04:00
David S. Miller	9d6b4bfb59	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2018-05-14 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix nfp to allow zero-length BPF capabilities, meaning the nfp capability parsing loop will otherwise exit early if the last capability is zero length and therefore driver will fail to probe with an error such as: nfp: BPF capabilities left after parsing, parsed:92 total length:100 nfp: invalid BPF capabilities at offset:92 Fix from Jakub. 2) libbpf's bpf_object__open() may return IS_ERR_OR_NULL() and not just an error. Fix libbpf's bpf_prog_load_xattr() to handle that case as well, also from Jakub. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-13 21:07:02 -04:00
David S. Miller	b2d6cee117	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The bpf syscall and selftests conflicts were trivial overlapping changes. The r8169 change involved moving the added mdelay from 'net' into a different function. A TLS close bug fix overlapped with the splitting of the TLS state into separate TX and RX parts. I just expanded the tests in the bug fix from "ctx->conf == X" into "ctx->tx_conf == X && ctx->rx_conf == X". Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 20:53:22 -04:00
Pieter Jansen van Vuuren	df13c59b54	nfp: flower: remove headroom from max MTU calculation Since commit `29a5dcae27` ("nfp: flower: offload phys port MTU change") we take encapsulation headroom into account when calculating the max allowed MTU. This is unnecessary as the max MTU advertised by firmware should have already accounted for encap headroom. Subtracting headroom twice brings the max MTU below what's necessary for some deployments. Fixes: `29a5dcae27` ("nfp: flower: offload phys port MTU change") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-10 15:28:01 -04:00
Jakub Kicinski	26aeb9daa0	nfp: bpf: allow zero-length capabilities Some BPF capabilities carry no value, they simply indicate feature is present. Our capability parsing loop will exit early if last capability is zero-length because it's looking for more than 8 bytes of data (8B is our TLV header length). Allow the last capability to be zero-length. This bug would lead to driver failing to probe with the following error if the last capability FW advertises is zero-length: nfp: BPF capabilities left after parsing, parsed:92 total length:100 nfp: invalid BPF capabilities at offset:92 Note the "parsed" and "length" values are 8 apart. No shipping FW runs into this issue, but we can't guarantee that will remain the case. Fixes: `77a844ee65` ("nfp: bpf: prepare for parsing BPF FW capabilities") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-09 18:17:17 +02:00
Jakub Kicinski	d985888faa	nfp: bpf: support setting the RX queue index BPF has access to all internal FW datapath structures. Including the structure containing RX queue selection. With little coordination with the datapath we can let the offloaded BPF select the RX queue. We just need a way to tell the datapath that queue selection has already been done and it shouldn't overwrite it. Define a bit to tell datapath BPF already selected a queue (QSEL_SET), if the selected queue is not enabled (>= number of enabled queues) datapath will perform normal RSS. BPF queue selection on the NIC can be used to replace standard datapath RSS with fully programmable BPF/XDP RSS. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-09 18:04:37 +02:00
David S. Miller	01adc4851a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Minor conflict, a CHECK was placed into an if() statement in net-next, whilst a newline was added to that CHECK call in 'net'. Thanks to Daniel for the merge resolution. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-07 23:35:08 -04:00
Jakub Kicinski	b4264c96b5	nfp: bpf: rewrite map pointers with NFP TIDs Kernel will now replace map fds with actual pointer before calling the offload prepare. We can identify those pointers and replace them with NFP table IDs instead of loading the table ID in code generated for CALL instruction. This allows us to support having the same CALL being used with different maps. Since we don't want to change the FW ABI we still need to move the TID from R1 to portion of R0 before the jump. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-04 23:41:03 +02:00
Jakub Kicinski	9816dd35ec	nfp: bpf: perf event output helpers support Add support for the perf_event_output family of helpers. The implementation on the NFP will not match the host code exactly. The state of the host map and rings is unknown to the device, hence device can't return errors when rings are not installed. The device simply packs the data into a firmware notification message and sends it over to the host, returning success to the program. There is no notion of a host CPU on the device when packets are being processed. Device will only offload programs which set BPF_F_CURRENT_CPU. Still, if map index doesn't match CPU no error will be returned (see above). Dropped/lost firmware notification messages will not cause "lost events" event on the perf ring, they are only visible via device error counters. Firmware notification messages may also get reordered in respect to the packets which caused their generation. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-04 23:41:03 +02:00
Jakub Kicinski	630a4d3874	nfp: bpf: record offload neutral maps in the driver For asynchronous events originating from the device, like perf event output, we need to be able to make sure that objects being referred to by the FW message are valid on the host. FW events can get queued and reordered. Even if we had a FW message "barrier" we should still protect ourselves from bogus FW output. Add a reverse-mapping hash table and record in it all raw map pointers FW may refer to. Only record neutral maps, i.e. perf event arrays. These are currently the only objects FW can refer to. Use RCU protection on the read side, update side is under RTNL. Since program vs map destruction order is slightly painful for offload simply take an extra reference on all the recorded maps to make sure they don't disappear. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-04 23:41:03 +02:00
David S. Miller	a7b15ab887	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Overlapping changes in selftests Makefile. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-04 09:58:56 -04:00
John Hurley	50a5852a65	nfp: flower: set tunnel ttl value to net default Firmware requires that the ttl value for an encapsulating ipv4 tunnel header be included as an action field. Prior to the support of Geneve tunnel encap (when ttl set was removed completely), ttl value was extracted from the tunnel key. However, tests have shown that this can still produce a ttl of 0. Fix the issue by setting the namespace default value for each new tunnel. Follow up patch for net-next will do a full route lookup. Fixes: `3ca3059dc3` ("nfp: flower: compile Geneve encap actions") Fixes: `b27d6a95a7` ("nfp: compile flower vxlan tunnel set actions") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-01 18:59:57 -04:00
Jakub Kicinski	c55ca688ed	nfp: don't depend on eth_tbl being available For very very old generation of the management FW Ethernet port information table may theoretically not be available. This in turn will cause the nfp_port structures to not be allocated. Make sure we don't crash the kernel when there is no eth_tbl: RIP: 0010:nfp_net_pci_probe+0xf2/0xb40 [nfp] ... Call Trace: nfp_pci_probe+0x6de/0xab0 [nfp] local_pci_probe+0x47/0xa0 work_for_cpu_fn+0x1a/0x30 process_one_work+0x1de/0x3e0 Found while working with broken/development version of management FW. Fixes: `a5950182c0` ("nfp: map mac_stats and vf_cfg BARs") Fixes: `93da7d9660` ("nfp: provide nfp_port to of nfp_net_get_mac_addr()") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-27 11:15:10 -04:00
David S. Miller	79741a38b4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-04-27 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add extensive BPF helper description into include/uapi/linux/bpf.h and a new script bpf_helpers_doc.py which allows for generating a man page out of it. Thus, every helper in BPF now comes with proper function signature, detailed description and return code explanation, from Quentin. 2) Migrate the BPF collect metadata tunnel tests from BPF samples over to the BPF selftests and further extend them with v6 vxlan, geneve and ipip tests, simplify the ipip tests, improve documentation and convert to bpf_ntoh() / bpf_hton() api, from William. 3) Currently, helpers that expect ARG_PTR_TO_MAP_{KEY,VALUE} can only access stack and packet memory. Extend this to allow such helpers to also use map values, which enabled use cases where value from a first lookup can be directly used as a key for a second lookup, from Paul. 4) Add a new helper bpf_skb_get_xfrm_state() for tc BPF programs in order to retrieve XFRM state information containing SPI, peer address and reqid values, from Eyal. 5) Various optimizations in nfp driver's BPF JIT in order to turn ADD and SUB instructions with negative immediate into the opposite operation with a positive immediate such that nfp can better fit small immediates into instructions. Savings in instruction count up to 4% have been observed, from Jakub. 6) Add the BPF prog's gpl_compatible flag to struct bpf_prog_info and add support for dumping this through bpftool, from Jiri. 7) Move the BPF sockmap samples over into BPF selftests instead since sockmap was rather a series of tests than sample anyway and this way this can be run from automated bots, from John. 8) Follow-up fix for bpf_adjust_tail() helper in order to make it work with generic XDP, from Nikita. 9) Some follow-up cleanups to BTF, namely, removing unused defines from BTF uapi header and renaming 'name' struct btf_* members into name_off to make it more clear they are offsets into string section, from Martin. 10) Remove test_sock_addr from TEST_GEN_PROGS in BPF selftests since not run directly but invoked from test_sock_addr.sh, from Yonghong. 11) Remove redundant ret assignment in sample BPF loader, from Wang. 12) Add couple of missing files to BPF selftest's gitignore, from Anders. There are two trivial merge conflicts while pulling: 1) Remove samples/sockmap/Makefile since all sockmap tests have been moved to selftests. 2) Add both hunks from tools/testing/selftests/bpf/.gitignore to the file since git should ignore all of them. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-26 21:19:50 -04:00
John Hurley	c50647d3e8	nfp: flower: ignore duplicate cb requests for same rule If a flower rule has a repr both as ingress and egress port then 2 callbacks may be generated for the same rule request. Add an indicator to each flow as to whether or not it was added from an ingress registered cb. If so then ignore add/del/stat requests to it from an egress cb. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-25 14:07:04 -04:00
John Hurley	54a4a03439	nfp: flower: support offloading multiple rules with same cookie When multiple netdevs are attached to a tc offload block and register for callbacks, a rule added to the block will be propogated to all netdevs. Previously these were detected as duplicates (based on cookie) and rejected. Modify the rule nfp lookup function to optionally include an ingress netdev and a host context along with the cookie value when searching for a rule. When a new rule is passed to the driver, the netdev the rule is to be attached to is considered when searching for dublicates. When a stats update is received from HW, the host context is used alongside the cookie to map to the correct host rule. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-25 14:07:04 -04:00
Jakub Kicinski	dd92a7d1af	nfp: print PCIe link bandwidth on probe To aid debugging of performance issues caused by limited PCIe bandwidth print the PCIe link information on probe. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-25 14:07:04 -04:00
Jakub Kicinski	3e3e9fd8b6	nfp: reset local locks on init NFP locks record the owner when held, for PCIe devices the owner ID will be the PCIe link number. When driver loads it should scan known locks and if they indicate that they are held by local endpoint but the driver doesn't hold them - release them. Locks can be left taken for instance when kernel gets kexec-ed or after a crash. Management FW tries to clean up stale locks too, but it currently depends on PCIe link going down which doesn't always happen. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-25 14:07:04 -04:00
Jakub Kicinski	7bdc97be90	nfp: bpf: optimize comparisons to negative constants Comparison instruction requires a subtraction. If the constant is negative we are more likely to fit it into a NFP instruction directly if we change the sign and use addition. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-25 09:56:10 +02:00
Jakub Kicinski	61dd8f0007	nfp: bpf: tabularize generations of compare operations There are quite a few compare instructions now, use a table to translate BPF instruction code to NFP instruction parameters instead of parameterizing helpers. This saves LOC and makes future extensions easier. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-25 09:56:10 +02:00
Jakub Kicinski	6c59500c2d	nfp: bpf: optimize add/sub of a negative constant NFP instruction set can fit small immediates into the instruction. Negative integers, however, will never fit because they will have highest bit set. If we swap the ALU op between ADD and SUB and negate the constant we have a better chance of fitting small negative integers into the instruction itself and saving one or two cycles. immed[gprB_21, 0xfffffffc] alu[gprA_4, gprA_4, +, gprB_21], gpr_wrboth immed[gprB_21, 0xffffffff] alu[gprA_5, gprA_5, +carry, gprB_21], gpr_wrboth now becomes: alu[gprA_4, gprA_4, -, 4], gpr_wrboth alu[gprA_5, gprA_5, -carry, 0], gpr_wrboth Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-25 09:56:10 +02:00
Jakub Kicinski	9c9e53233c	nfp: bpf: remove double space Whitespace cleanup - remove double space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-25 09:56:10 +02:00
David S. Miller	e0ada51db9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts were simple overlapping changes in microchip driver. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-21 16:32:48 -04:00
Nikita V. Shirokov	5a6a22e378	bpf: make netronome nfp compatible w/ bpf_xdp_adjust_tail w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as well (only "decrease" of pointer's location is going to be supported). changing of this pointer will change packet's size. for nfp driver we will just calculate packet's length unconditionally Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-18 23:34:16 +02:00
Pieter Jansen van Vuuren	cf2cbadc20	nfp: flower: split and limit cmsg skb lists Introduce a second skb list for handling control messages and limit the number of allowed messages. Some control messages are considered more crucial than others, resulting in the need for a second skb list. By splitting the list into a separate high and low priority list we can ensure that messages on the high list get added to the head of the list that gets processed, this however has no functional impact. Previously there was no limit on the number of messages allowed on the queue, this could result in the queue growing boundlessly and eventually the host running out of memory. Fixes: `b985f870a5` ("nfp: process control messages in workqueue in flower app") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-12 21:57:28 -04:00
Pieter Jansen van Vuuren	0b1a989ef5	nfp: flower: move route ack control messages out of the workqueue Previously we processed the route ack control messages in the workqueue, this unnecessarily loads the workqueue. We can deal with these messages sooner as we know we are going to drop them. Fixes: `8e6a9046b6` ("nfp: flower vxlan neighbour offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-12 21:57:28 -04:00
Jakub Kicinski	bc05f9bcd8	nfp: print a message when mutex wait is interrupted When waiting for an NFP mutex is interrupted print a message to make root causing later error messages easier. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-12 21:57:27 -04:00
Jakub Kicinski	5496295aef	nfp: ignore signals when communicating with management FW We currently allow signals to interrupt the wait for management FW commands. Exiting the wait should not cause trouble, the FW will just finish executing the command in the background and new commands will wait for the old one to finish. However, this may not be what users expect (Ctrl-C not actually stopping the command). Moreover some systems routinely request link information with signals pending (Ubuntu 14.04 runs a landscape-sysinfo python tool from MOTD) worrying users with errors like these: nfp 0000:04:00.0: nfp_nsp: Error -512 waiting for code 0x0007 to start nfp 0000:04:00.0: nfp: reading port table failed -512 Make the wait for management FW responses non-interruptible. Fixes: `1a64821c6a` ("nfp: add support for service processor access") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-12 21:57:27 -04:00
Dirk van der Merwe	1489bbd10e	nfp: use full 40 bits of the NSP buffer address The NSP default buffer is a piece of NFP memory where additional command data can be placed. Its format has been copied from host buffer, but the PCIe selection bits do not make sense in this case. If those get masked out from a NFP address - writes to random place in the chip memory may be issued and crash the device. Even in the general NSP buffer case, it doesn't make sense to have the PCIe selection bits there anymore. These are unused at the moment, and when it becomes necessary, the PCIe selection bits should rather be moved to another register to utilise more bits for the buffer address. This has never been an issue because the buffer used to be allocated in memory with less-than-38-bit-long address but that is about to change. Fixes: `1a64821c6a` ("nfp: add support for service processor access") Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-04 11:45:24 -04:00
Jakub Kicinski	0df57e604c	nfp: add a separate counter for packets with CHECKSUM_COMPLETE We are currently counting packets with CHECKSUM_COMPLETE as "hw_rx_csum_ok". This is confusing. Add a new counter. To make sure it fits in the same cacheline move the less used error counter to a different location. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-04 11:36:50 -04:00
David S. Miller	c0b458a946	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c, we had some overlapping changes: 1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE --> MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE 2) In 'net-next' params->log_rq_size is renamed to be params->log_rq_mtu_frames. 3) In 'net-next' params->hard_mtu is added. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-01 19:49:34 -04:00
David S. Miller	d4069fe6fc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-03-31 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add raw BPF tracepoint API in order to have a BPF program type that can access kernel internal arguments of the tracepoints in their raw form similar to kprobes based BPF programs. This infrastructure also adds a new BPF_RAW_TRACEPOINT_OPEN command to BPF syscall which returns an anon-inode backed fd for the tracepoint object that allows for automatic detach of the BPF program resp. unregistering of the tracepoint probe on fd release, from Alexei. 2) Add new BPF cgroup hooks at bind() and connect() entry in order to allow BPF programs to reject, inspect or modify user space passed struct sockaddr, and as well a hook at post bind time once the port has been allocated. They are used in FB's container management engine for implementing policy, replacing fragile LD_PRELOAD wrapper intercepting bind() and connect() calls that only works in limited scenarios like glibc based apps but not for other runtimes in containerized applications, from Andrey. 3) BPF_F_INGRESS flag support has been added to sockmap programs for their redirect helper call bringing it in line with cls_bpf based programs. Support is added for both variants of sockmap programs, meaning for tx ULP hooks as well as recv skb hooks, from John. 4) Various improvements on BPF side for the nfp driver, besides others this work adds BPF map update and delete helper call support from the datapath, JITing of 32 and 64 bit XADD instructions as well as offload support of bpf_get_prandom_u32() call. Initial implementation of nfp packet cache has been tackled that optimizes memory access (see merge commit for further details), from Jakub and Jiong. 5) Removal of struct bpf_verifier_env argument from the print_bpf_insn() API has been done in order to prepare to use print_bpf_insn() soon out of perf tool directly. This makes the print_bpf_insn() API more generic and pushes the env into private data. bpftool is adjusted as well with the print_bpf_insn() argument removal, from Jiri. 6) Couple of cleanups and prep work for the upcoming BTF (BPF Type Format). The latter will reuse the current BPF verifier log as well, thus bpf_verifier_log() is further generalized, from Martin. 7) For bpf_getsockopt() and bpf_setsockopt() helpers, IPv4 IP_TOS read and write support has been added in similar fashion to existing IPv6 IPV6_TCLASS socket option we already have, from Nikita. 8) Fixes in recent sockmap scatterlist API usage, which did not use sg_init_table() for initialization thus triggering a BUG_ON() in scatterlist API when CONFIG_DEBUG_SG was enabled. This adds and uses a small helper sg_init_marker() to properly handle the affected cases, from Prashant. 9) Let the BPF core follow IDR code convention and therefore use the idr_preload() and idr_preload_end() helpers, which would also help idr_alloc_cyclic() under GFP_ATOMIC to better succeed under memory pressure, from Shaohua. 10) Last but not least, a spelling fix in an error message for the BPF cookie UID helper under BPF sample code, from Colin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-31 23:33:04 -04:00
John Hurley	29a5dcae27	nfp: flower: offload phys port MTU change Trigger a port mod message to request an MTU change on the NIC when any physical port representor is assigned a new MTU value. The driver waits 10 msec for an ack that the FW has set the MTU. If no ack is received the request is rejected and an appropriate warning flagged. Rather than maintain an MTU queue per repr, one is maintained per app. Because the MTU ndo is protected by the rtnl lock, there can never be contention here. Portmod messages from the NIC are also protected by rtnl so we first check if the portmod is an ack and, if so, handle outside rtnl and the cmsg work queue. Acks are detected by the marking of a bit in a portmod response. They are then verfied by checking the port number and MTU value expected by the app. If the expected MTU is 0 then no acks are currently expected. Also, ensure that the packet headroom reserved by the flower firmware is considered when accepting an MTU change on any repr. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-30 10:18:55 -04:00
John Hurley	167cebeffa	nfp: modify app MTU setting callbacks Rename the 'change_mtu' app callback to 'check_mtu'. This is called whenever an MTU change is requested on a netdev. It can reject the change but is not responsible for implementing it. Introduce a new 'repr_change_mtu' app callback that is hit when the MTU of a repr is to be changed. This is responsible for performing the MTU change and verifying it. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-30 10:18:54 -04:00
Jakub Kicinski	7c095f5d9b	nfp: bpf: improve wrong FW response warnings When FW responds with a message of wrong size or type make sure the type is checked first and included in the wrong size message. This makes it easier to figure out which FW command failed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:14 -07:00
Jakub Kicinski	df4a37d8b5	nfp: bpf: add support for bpf_get_prandom_u32() NFP has a prng register, which we can read to obtain a u32 worth of pseudo random data. Generate code for it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:14 -07:00
Jakub Kicinski	41aed09cf6	nfp: bpf: add support for atomic add of unknown values Allow atomic add to be used even when the value is not guaranteed to fit into a 16 bit immediate. This requires the value to be pulled as data, and therefore use of a transfer register and a context swap. Track the information about possible lengths of the value, if it's guaranteed to be larger than 16bits don't generate the code for the optimized case at all. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:14 -07:00
Jakub Kicinski	b556ddd9c1	nfp: bpf: expose command delay slots Allow callers to control the delay slots of commands, instead of giving them just a wait/nowait choice. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:14 -07:00
Jakub Kicinski	dcb0c27f3c	nfp: bpf: add basic support for atomic adds Implement atomic add operation for 32 and 64 bit values. Depend on the verifier to ensure alignment. Values have to be kept in big endian and swapped upon read/write. For now only support atomic add of a constant. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jakub Kicinski	bfee64deaa	nfp: bpf: add map deletes from the datapath Support calling map_delete_elem() FW helper from the datapath programs. For JIT checks and code are basically equivalent to map lookups. Similarly to other map helper key must be on the stack. Different pointer types are left for future extension. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jakub Kicinski	44d65a47ae	nfp: bpf: add map updates from the datapath Support calling map_update_elem() from the datapath programs by calling into FW-provided helper. Value pointer is passed in LM pointer #2. Keeping track of old state for arg3 is not necessary, since LM pointer #2 will be always loaded in this case, the trivial optimization for value at the bottom of the stack can't be done here. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jakub Kicinski	289c5b7630	nfp: bpf: add helper for basic map call checks Add a verifier helper for performing the basic state checks before a call to a map helper. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jakub Kicinski	2f46e0c127	nfp: bpf: add helper for validating stack pointers Our implementation has restriction on stack pointers for function calls. Move the common checks into a helper for reuse. The state has to be encapsulated into a structure to support parameters other than BPF_REG_2. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jakub Kicinski	fc4484970e	nfp: bpf: rename map_lookup_stack() to map_call_stack_common() We will reuse most of map call code gen for other map calls. Rename the lookup gen function and use meta->func_id instead of hard-coding lookup. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jiong Wang	87b10ecdce	nfp: bpf: detect packet reads could be cached, enable the optimisation This patch is the front end of this optimisation, it detects and marks those packet reads that could be cached. Then the optimisation "backend" will be activated automatically. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:12 -07:00
Jiong Wang	91ff69e840	nfp: bpf: support unaligned read offset This patch add the support for unaligned read offset, i.e. the read offset to the start of packet cache area is not aligned to REG_WIDTH. In this case, the read area might across maximum three transfer-in registers. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:12 -07:00
Jiong Wang	be75923786	nfp: bpf: read from packet data cache for PTR_TO_PACKET This patch assumes there is a packet data cache, and would try to read packet data from the cache instead of from memory. This patch only implements the optimisation "backend", it doesn't build the packet data cache, so this optimisation is not enabled. This patch has only enabled aligned packet data read, i.e. when the read offset to the start of cache is REG_WIDTH aligned. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:12 -07:00
Pieter Jansen van Vuuren	71ea5343a0	nfp: flower: implement ip fragmentation match offload Implement ip fragmentation match offloading for both IPv4 and IPv6. Allows offloading frag, nofrag, first and nofirstfrag classification. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-26 13:01:09 -04:00
Pieter Jansen van Vuuren	07e1671cfc	nfp: flower: refactor shared ip header in match offload Refactored shared ip header code for IPv4 and IPv6 in match offload. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-26 13:01:09 -04:00
Joe Perches	d3757ba4c1	ethernet: Use octal not symbolic permissions Prefer the direct use of octal for permissions. Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace and some typing. Miscellanea: o Whitespace neatening around these conversions. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-26 12:07:49 -04:00
Jakub Kicinski	e8a4796ee2	nfp: bpf: fix check of program max insn count NFP program allocation length is in bytes and NFP program length is in instructions, fix the comparison of the two. Fixes: `9314c442d7` ("nfp: bpf: move translation prepare to offload.c") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-24 10:41:24 -07:00
Dirk van der Merwe	70271dadee	nfp: advertise firmware for mixed 10G/25G mode The AMDA0099-0001 platform can support the 1x10G + 1x25G mixed mode operation. Recently, firmware has been added for this configuration mode. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-22 15:22:50 -05:00
Jakub Kicinski	f7308991bf	nfp: add Makefiles to all directories To be able to build separate objects we need to provide Kbuild with a Makefile in each directory. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-22 15:22:50 -05:00
Pieter Jansen van Vuuren	ffa61202fe	nfp: flower: implement tcp flag match offload Implement tcp flag match offloading. Current tcp flag match support include FIN, SYN, RST, PSH and URG flags, other flags are unsupported. The PSH and URG flags are only set in the hardware fast path when used in combination with the SYN, RST and PSH flags. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-16 16:24:24 -05:00
Michael Rapson	014f900898	nfp: standardize FW header whitespace The nfp_net_ctrl.h file used spaces for indentation in the past but tabs have crept in. Host driver files use tabs for indentation by default, so let's convert to tabs for consistency across the file and our drivers. Signed-off-by: Michael Rapson <michael.rapson@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-16 16:24:24 -05:00
David S. Miller	437a4db66d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2018-02-09 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Two fixes for BPF sockmap in order to break up circular map references from programs attached to sockmap, and detaching related sockets in case of socket close() event. For the latter we get rid of the smap_state_change() and plug into ULP infrastructure, which will later also be used for additional features anyway such as TX hooks. For the second issue, dependency chain is broken up via map release callback to free parse/verdict programs, all from John. 2) Fix a libbpf relocation issue that was found while implementing XDP support for Suricata project. Issue was that when clang was invoked with default target instead of bpf target, then various other e.g. debugging relevant sections are added to the ELF file that contained relocation entries pointing to non-BPF related sections which libbpf trips over instead of skipping them. Test cases for libbpf are added as well, from Jesper. 3) Various misc fixes for bpftool and one for libbpf: a small addition to libbpf to make sure it recognizes all standard section prefixes. Then, the Makefile in bpftool/Documentation is improved to explicitly check for rst2man being installed on the system as we otherwise risk installing empty man pages; the man page for bpftool-map is corrected and a set of missing bash completions added in order to avoid shipping bpftool where the completions are only partially working, from Quentin. 4) Fix applying the relocation to immediate load instructions in the nfp JIT which were missing a shift, from Jakub. 5) Two fixes for the BPF kernel selftests: handle CONFIG_BPF_JIT_ALWAYS_ON=y gracefully in test_bpf.ko module and mark them as FLAG_EXPECTED_FAIL in this case; and explicitly delete the veth devices in the two tests test_xdp_{meta,redirect}.sh before dismantling the netnses as when selftests are run in batch mode, then workqueue to handle destruction might not have finished yet and thus veth creation in next test under same dev name would fail, from Yonghong. 6) Fix test_kmod.sh to check the test_bpf.ko module path before performing an insmod, and fallback to modprobe. Especially the latter is useful when having a device under test that has the modules installed instead, from Naresh. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-09 14:05:10 -05:00
Jakub Kicinski	1a5e8e3500	nfp: populate MODULE_VERSION DKMS and similar out-of-tree module replacement services use module version to make sure the out-of-tree software is not older than the module shipped with the kernel. We use the kernel version in ethtool -i output, put it into MODULE_VERSION as well. Reported-by: Jan Gutter <jan.gutter@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-08 10:01:27 -05:00
Jakub Kicinski	0d592e52fb	nfp: limit the number of TSO segments Most FWs limit the number of TSO segments a frame can produce to 64. This is for fairness and efficiency (of FW datapath) reasons. If a frame with larger number of segments is submitted the FW will drop it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-08 10:01:27 -05:00
Jakub Kicinski	d692403e5c	nfp: forbid disabling hw-tc-offload on representors while offload active All netdevs which can accept TC offloads must implement .ndo_set_features(). nfp_reprs currently do not do that, which means hw-tc-offload can be turned on and off even when offloads are active. Whether the offloads are active is really a question to nfp_ports, so remove the per-app tc_busy callback indirection thing, and simply count the number of offloaded items in nfp_port structure. Fixes: `8a2768732a` ("nfp: provide infrastructure for offloading flower based TC filters") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-08 10:01:27 -05:00
Jakub Kicinski	0b9de4ca85	nfp: don't advertise hw-tc-offload on non-port netdevs nfp_port is a structure which represents an ASIC port, both PCIe vNIC (on a PF or a VF) or the external MAC port. vNIC netdev (struct nfp_net) and pure representor netdev (struct nfp_repr) both have a pointer to this structure. nfp_reprs always have a port associated. nfp_nets, however, only represent a device port in legacy mode, where they are considered the MAC port. In switchdev mode they are just the CPU's side of the PCIe link. By definition TC offloads only apply to device ports. Don't set the flag on vNICs without a port (i.e. in switchdev mode). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-08 10:01:27 -05:00
Jakub Kicinski	e3ac6c0737	nfp: bpf: require ETH table Upcoming changes will require all netdevs supporting TC offloads to have a full struct nfp_port. Require those for BPF offload. The operation without management FW reporting information about Ethernet ports is something we only support for very old and very basic NIC firmwares anyway. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-08 10:01:27 -05:00
Jakub Kicinski	b7d9923547	nfp: bpf: fix immed relocation for larger offsets Immed relocation is missing a shift which means for larger offsets the lower and higher part of the address would be ORed together. Fixes: `ce4ebfd859` ("nfp: bpf: add helpers for updating immediate instructions") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-02-08 11:59:50 +01:00
Jakub Kicinski	703f578a35	nfp: fix kdoc warnings on nested structures Commit `84ce5b9877` ("scripts: kernel-doc: improve nested logic to handle multiple identifiers") improved the handling of nested structure definitions in scripts/kernel-doc, and changed the expected format of documentation. This causes new warnings to appear on W=1 builds. Only comment changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-06 11:43:58 -05:00
Edwin Peer	1d8ef0c076	nfp: fix TLV offset calculation The data pointer in the config space TLV parser already includes NFP_NET_CFG_TLV_BASE, it should not be added again. Incorrect offset values were only used in printed user output, rendering the bug merely cosmetic. Fixes: `73a0329b05` ("nfp: add TLV capabilities to the BAR") Signed-off-by: Edwin Peer <edwin.peer@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-02 19:08:20 -05:00
Jakub Kicinski	3107fdc8b2	nfp: use tc_cls_can_offload_and_chain0() Make use of tc_cls_can_offload_and_chain0() to set extack msg in case ethtool tc offload flag is not set or chain unsupported. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 21:23:08 -05:00
Wei Yongjun	e58decc9c5	nfp: fix error return code in nfp_pci_probe() Fix to return error code -EINVAL instead of 0 when num_vfs above limit_vfs, as done elsewhere in this function. Fixes: `0dc7862191` ("nfp: handle SR-IOV already enabled when driver is probing") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-23 10:43:28 -05:00
Carl Heymann	e71494ae68	nfp: fix fw dump handling of absolute rtsym size Fix bug that causes _absolute_ rtsym sizes of > 8 bytes (as per symbol table) to result in incorrect space used during a TLV-based debug dump. Detail: The size calculation stage calculates the correct size (size of the rtsym address field == 8), while the dump uses the size in the table to calculate the TLV size to reserve. Symbols with size <= 8 are handled OK due to aligning sizes to 8, but including any absolute symbol with listed size > 8 leads to an ENOSPC error during the dump. Fixes: `da762863ed` ("nfp: fix absolute rtsym handling in debug dump") Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-23 10:12:01 -05:00
Quentin Monnet	52be9a7cde	nfp: bpf: use extack support to improve debugging Use the recently added extack support for eBPF offload in the driver. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-22 16:28:32 -05:00
Quentin Monnet	acc2abbbb1	nfp: bpf: plumb extack into functions related to XDP offload Pass a pointer to an extack object to nfp_app_xdp_offload() in order to prepare for extack usage in the nfp driver. Next step will be to forward this extack pointer to nfp_net_bpf_offload(), once this function is able to use it for printing error messages. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-22 16:28:32 -05:00
Pieter Jansen van Vuuren	01c15e93a7	nfp: flower: prioritize stats updates Previously it was possible to interrupt processing stats updates because they were handled in a work queue. Interrupting the stats updates could lead to a situation where we backup the control message queue. This patch moves the stats update processing out of the work queue to be processed as soon as hardware sends a request. Reported-by: Louis Peens <louis.peens@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-21 18:08:05 -05:00
David S. Miller	ea9722e265	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-01-19 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) bpf array map HW offload, from Jakub. 2) support for bpf_get_next_key() for LPM map, from Yonghong. 3) test_verifier now runs loaded programs, from Alexei. 4) xdp cpumap monitoring, from Jesper. 5) variety of tests, cleanups and small x64 JIT optimization, from Daniel. 6) user space can now retrieve HW JITed program, from Jiong. Note there is a minor conflict between Russell's arm32 JIT fixes and removal of bpf_jit_enable variable by Daniel which should be resolved by keeping Russell's comment and removing that variable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-20 22:03:46 -05:00
Jakub Kicinski	81bd5ded60	nfp: bpf: disable all ctrl vNIC capabilities BPF firmware currently exposes IRQ moderation capability. The driver will make use of it by default, inserting 50 usec delay to every control message exchange. This cuts the number of messages per second we can exchange by almost half. None of the other capabilities make much sense for BPF control vNIC, either. Disable them all. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:19 -05:00
Jakub Kicinski	78a0a65f40	nfp: allow apps to disable ctrl vNIC capabilities Most vNIC capabilities are netdev related. It makes no sense to initialize them and waste FW resources. Some are even counter-productive, like IRQ moderation, which will slow down exchange of control messages. Add to nfp_app a mask of enabled control vNIC capabilities for apps to use. Make flower and BPF enable all capabilities for now. No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	545bfa7a6a	nfp: split reading capabilities out of nfp_net_init() nfp_net_init() is a little long and we are about to add more code to reading capabilties. Move the capability reading, parsing and validating out. Only actual initialization will stay in nfp_net_init(). No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	527d7d1b99	nfp: read mailbox address from TLV caps Allow specifying alternative vNIC mailbox location in TLV caps. This way we can size the mailbox to the needs and not necessarily waste 512B of ctrl memory space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	ce991ab666	nfp: read ME frequency from vNIC ctrl memory PCIe island clock frequency is used when converting coalescing parameters from usecs to NFP timestamps. Most chips don't run at 1200MHz, allow FW to provide us with the real frequency. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	73a0329b05	nfp: add TLV capabilities to the BAR NFP is entirely programmable, including the PCI data interface. Using a fixed control BAR layout certainly makes implementations easier, but require careful considerations when space is allocated. Once BAR area is allocated to one feature nothing else can use it. Allocating space statically also requires it to be sized upfront, which leads to either unnecessary limitation or wastage. We currently have a 32bit capability word defined which tells drivers which application FW features are supported. Most of the bits are exhausted. The same bits are also reused for enabling specific features. Bulk of capabilities don't have a need for an enable bit, however, leading to confusion and wastage. TLVs seems like a better fit for expressing capabilities of applications running on programmable hardware. This patch leaves the front of the BAR as is, and declares a TLV capability start at offset 0x58. Most of the space up to 0x0d90 is already allocated, but the used space can be wrapped with RESERVED TLVs. E.g.: Address Type Length 0x0058 RESERVED 0xe00 /* Wrap basic structures / 0x0e5c FEATURE_A 0x004 0x0e64 FEATURE_B 0x004 0x0e6c RESERVED 0x990 / Wrap qeueue stats */ 0x1800 FEATURE_C 0x100 Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	bf245f1fb0	nfp: improve app not found message When driver app matching loaded FW is not found users are faced with: nfp: failed to find app with ID 0x%02x This message does not properly explain that matching driver code is either not built into the driver or the driver is too old. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	3eb47dfca0	nfp: protect each repr pointer individually with RCU Representors are grouped in sets by type. Currently the whole sets are under RCU protection, but individual representor pointers are not. This causes some inconveniences when representors have to be destroyed, because we have to allocate new sets to remove any representors. Protect the individual pointers with RCU. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	e1740fb6c1	nfp: add nfp_reprs_get_locked() helper The write side of repr tables is always done under pf->lock. Add a helper to dereference repr table pointers under protection of that lock. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	bcc93a23ca	nfp: register devlink after app is created Devlink used to have two global locks: devlink lock and port lock, our lock ordering looked like this: devlink lock -> driver's pf->lock -> devlink port lock After recent changes port lock was replaced with per-instance lock. Unfortunately, new per-instance lock is taken on most operations now. This means we can only grab the pf->lock from the port split/unsplit ops. Lock ordering looks like this: devlink lock -> driver's pf->lock -> devlink instance lock Since we can't take pf->lock from most devlink ops, make sure nfp_apps are prepared to service them as soon as devlink is registered. Locking the pf must be pushed down after nfp_app_init() callback. The init order looks like this: nfp_app_init devlink_register nfp_app_start netdev/port_register As soon as app_init is done nfp_apps must be ready to service devlink-related callbacks. apps can only register their own devlink objects from nfp_app_start. Fixes: `2406e7e546` ("devlink: Add per devlink instance lock") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	e1a2599db5	nfp: release global resources only on the remove path NFP app is currently shut down as soon as all the vNICs are gone. This means we can't depend on the app existing throughout the lifetime of the device. Free the app only from PCI remove path. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	aa3f4b69a7	nfp: core: make scalar CPP helpers fail on short accesses Currently the helpers for accessing 4 or 8 byte values over the CPP bus return the length of IO on success. If the IO was short caller has to deal with error handling. The short IO for 4/8B values is completely impractical. Make the helpers return an error if full access was not possible. Fix the few places which are actually dealing with errors correctly, most call sites already only deal with negative return codes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	ca027a1c45	nfp: bpf: add short busy wait for FW replies Scheduling out and in for every FW message can slow us down unnecessarily. Our experiments show that even under heavy load the FW responds to 99.9% messages within 200 us. Add a short busy wait before entering the wait queue. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-18 22:54:26 +01:00
Jakub Kicinski	7a0ef69395	bpf: offload: allow array map offload The special handling of different map types is left to the driver. Allow offload of array maps by simply adding it to accepted types. For nfp we have to make sure array elements are not deleted. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-18 22:54:25 +01:00
Jiong Wang	eb1d7db927	nfp: bpf: set new jit info fields This patch set those new jit info fields introduced in this patch set. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-18 01:26:15 +01:00
David S. Miller	c02b3741eb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Overlapping changes all over. The mini-qdisc bits were a little bit tricky, however. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-17 00:10:42 -05:00
Quentin Monnet	74801e50d5	nfp: bpf: reject program on instructions unknown to the JIT compiler If an eBPF instruction is unknown to the driver JIT compiler, we can reject the program at verification time. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-17 01:15:06 +01:00
Jakub Kicinski	7dfa4d87cf	nfp: bpf: print map lookup problems into verifier log Use the verifier log to output error messages if map lookup can't be offloaded. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-17 01:15:06 +01:00
Jakub Kicinski	0d9c9f0f40	nfp: use the correct index for link speed table sts variable is holding link speed as well as state. We should be using ls to index into ls_to_ethtool. Fixes: `265aeb511b` ("nfp: add support for .get_link_ksettings()") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-16 14:55:07 -05:00
Jakub Kicinski	1bba4c413a	nfp: bpf: implement bpf map offload Plug in to the stack's map offload callbacks for BPF map offload. Get next call needs some special handling on the FW side, since we can't send a NULL pointer to the FW there is a get first entry FW command. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:31 +01:00
Jakub Kicinski	3dd43c3319	nfp: bpf: add support for reading map memory Map memory needs to use 40 bit addressing. Add handling of such accesses. Since 40 bit addresses are formed by using both 32 bit operands we need to pre-calculate the actual address instead of adding in the offset inside the instruction, like we did in 32 bit mode. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	77a3d3113b	nfp: bpf: add verification and codegen for map lookups Verify our current constraints on the location of the key are met and generate the code for calling map lookup on the datapath. New relocation types have to be added - for helpers and return addresses. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	ce4ebfd859	nfp: bpf: add helpers for updating immediate instructions Immediate loads are used to load the return address of a helper. We need to be able to update those loads for relocations. Immediate loads can be slightly more complex and spread over two instructions in general, but here we only care about simple loads of small (< 65k) constants, so complex cases are not handled. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	9d080d5da9	nfp: bpf: parse function call and map capabilities Parse helper function and supported map FW TLV capabilities. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	ff3d43f756	nfp: bpf: implement helpers for FW map ops Implement calls for FW map communication. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	d48ae231c5	nfp: bpf: add basic control channel communication For map support we will need to send and receive control messages. Add basic support for sending a message to FW, and waiting for a reply. Control messages are tagged with a 16 bit ID. Add a simple ID allocator and make sure we don't allow too many messages in flight, to avoid request <> reply mismatches. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	4da98eea79	nfp: bpf: add map data structure To be able to split code into reasonable chunks we need to add the map data structures already. Later patches will add code piece by piece. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	0a9c1991f2	bpf: rename bpf_dev_offload -> bpf_prog_offload With map offload coming, we need to call program offload structure something less ambiguous. Pure rename, no functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:29 +01:00
David S. Miller	19d28fbd30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net BPF alignment tests got a conflict because the registers are output as Rn_w instead of just Rn in net-next, and in net a fixup for a testcase prohibits logical operations on pointers before using them. Also, we should attempt to patch BPF call args if JIT always on is enabled. Instead, if we fail to JIT the subprogs we should pass an error back up and fail immediately. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-11 22:13:42 -05:00
Jakub Kicinski	fc2336505f	nfp: always unmask aux interrupts at init The link state and exception interrupts may be masked when we probe. The firmware should in theory prevent sending (and automasking) those interrupts if the device is disabled, but if my reading of the FW code is correct there are firmwares out there with race conditions in this area. The interrupt may also be masked if previous driver which used the device was malfunctioning and we didn't load the FW (there is no other good way to comprehensively reset the PF). Note that FW unmasks the data interrupts by itself when vNIC is enabled, such helpful operation is not performed for LSC/EXN interrupts. Always unmask the auxiliary interrupts after request_irq(). On the remove path add missing PCI write flush before free_irq(). Fixes: `4c3523623d` ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-10 15:50:04 -05:00
Quentin Monnet	ff627e3d07	nfp: bpf: reuse verifier log for debug messages Now that `bpf_verifier_log_write()` is exported from the verifier and makes it possible to reuse the verifier log to print messages to the standard output, use this instead of the kernel logs in the nfp driver for printing error messages occurring at verification time. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:36 +01:00
Nic Viljoen	c087aa8bbf	nfp: bpf: add signed jump insns This patch adds signed jump instructions (jsgt, jsge, jslt, jsle) to the nfp jit. As well as adding the additional required raw assembler branch mask to nfp_asm.h Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:36 +01:00
Jakub Kicinski	af93d15ac6	nfp: hand over to BPF offload app at coarser granularity Instead of having an app callback per message type hand off all offload-related handling to apps with one "rest of ndo_bpf" callback. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:36 +01:00
Jakub Kicinski	e84797fe15	nfp: bpf: use a large constant in unresolved branches To make absolute relocated branches (branches which will be completely rewritten with br_set_offset()) distinguishable in user space dumps from normal jumps add a large offset to them. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	44a12ecc1c	nfp: bpf: don't depend on high order allocations for program image The translator pre-allocates a buffer of maximal program size. Due to HW/FW limitations the program buffer can't currently be longer than 128Kb, so we used to kmalloc() it, and then map for DMA directly. Now that the late branch resolution is copying the program image anyway, we can just kvmalloc() the buffer. While at it, after translation reallocate the buffer to save space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	2314fe9ed0	nfp: bpf: relocate jump targets just before the load Don't translate the program assuming it will be loaded at a given address. This will be required for sharing programs between ports of the same NIC, tail calls and subprograms. It will also make the jump targets easier to understand when dumping the program to user space. Translate the program as if it was going to be loaded at address zero. When load happens add the load offset in and set addresses of special branches. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	488feeaf6d	nfp: bpf: add helpers for modifying branch addresses In preparation for better handling of relocations move existing helper for setting branch offset to nfp_asm.c and add two more. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	1549921da3	nfp: bpf: move jump resolution to jit.c Jump target resolution should be in jit.c not offload.c. No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	a0f30c97ac	nfp: bpf: allow disabling TC offloads when XDP active TC BPF offload was added first, so we used to assume that the ethtool TC HW offload flag cannot be touched whenever any BPF program is loaded on the NIC. This unncessarily limits changes to the TC flag when offloaded program is XDP. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	ccbdc596f4	nfp: bpf: don't allow changing MTU above BPF offload limit when active When BPF offload is active we need may need to restrict the MTU changes more than just to the limitation of the kernel XDP datapath. Allow the BPF code to veto a MTU change. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	c4f7730be5	nfp: bpf: round up the size of the stack Kernel enforces the alignment of the bottom of the stack, NFP deals with positive offsets better so we should align the top of the stack. Round the stack size to NFP word size (4B). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	8c6a6d9804	nfp: fix incumbent kdoc warnings We should use % instead of @ for documenting preprocessor defines. Add missing documentation of __NFP_REPR_TYPE_MAX. This gets rid of all remaining kdoc warnings in the driver. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	a9c324be72	nfp: don't try to register XDP rxq structures on control queues Some RX rings are used for control messages, those will not have a netdev pointer in dp. Skip XDP rxq handling on those rings. Fixes: `7f1c684a89` ("nfp: setup xdp_rxq_info") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
David S. Miller	7f0b800048	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-01-07 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add a start of a framework for extending struct xdp_buff without having the overhead of populating every data at runtime. Idea is to have a new per-queue struct xdp_rxq_info that holds read mostly data (currently that is, queue number and a pointer to the corresponding netdev) which is set up during rxqueue config time. When a XDP program is invoked, struct xdp_buff holds a pointer to struct xdp_rxq_info that the BPF program can then walk. The user facing BPF program that uses struct xdp_md for context can use these members directly, and the verifier rewrites context access transparently by walking the xdp_rxq_info and net_device pointers to load the data, from Jesper. 2) Redo the reporting of offload device information to user space such that it works in combination with network namespaces. The latter is reported through a device/inode tuple as similarly done in other subsystems as well (e.g. perf) in order to identify the namespace. For this to work, ns_get_path() has been generalized such that the namespace can be retrieved not only from a specific task (perf case), but also from a callback where we deduce the netns (ns_common) from a netdevice. bpftool support using the new uapi info and extensive test cases for test_offload.py in BPF selftests have been added as well, from Jakub. 3) Add two bpftool improvements: i) properly report the bpftool version such that it corresponds to the version from the kernel source tree. So pick the right linux/version.h from the source tree instead of the installed one. ii) fix bpftool and also bpf_jit_disasm build with bintutils >= 2.9. The reason for the build breakage is that binutils library changed the function signature to select the disassembler. Given this is needed in multiple tools, add a proper feature detection to the tools/build/features infrastructure, from Roman. 4) Implement the BPF syscall command BPF_MAP_GET_NEXT_KEY for the stacktrace map. It is currently unimplemented, but there are use cases where user space needs to walk all stacktrace map entries e.g. for dumping or deleting map entries w/o having to close and recreate the map. Add BPF selftests along with it, from Yonghong. 5) Few follow-up cleanups for the bpftool cgroup code: i) rename the cgroup 'list' command into 'show' as we have it for other subcommands as well, ii) then alias the 'show' command such that 'list' is accepted which is also common practice in iproute2, and iii) remove couple of newlines from error messages using p_err(), from Jakub. 6) Two follow-up cleanups to sockmap code: i) remove the unused bpf_compute_data_end_sk_skb() function and ii) only build the sockmap infrastructure when CONFIG_INET is enabled since it's only aware of TCP sockets at this time, from John. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-07 21:26:31 -05:00
Jesper Dangaard Brouer	7f1c684a89	nfp: setup xdp_rxq_info Driver hook points for xdp_rxq_info: * reg : nfp_net_rx_ring_alloc * unreg: nfp_net_rx_ring_free In struct nfp_net_rx_ring moved member @size into a hole on 64-bit. Thus, the size remaines the same after adding member @xdp_rxq. Cc: oss-drivers@netronome.com Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-05 15:21:21 -08:00
Jakub Kicinski	d0adb51edb	nfp: add basic multicast filtering We currently always pass all multicast traffic through. Only set L2MC when actually needed. Since the driver was not making use of the capability to filter out mcast frames, some FW projects don't implement it any more. Don't warn users if capability is not present (like we do for promisc flag). The lack of L2MC capability is assumed to mean all multicast traffic goes through. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-05 13:46:47 -05:00
Dirk van der Merwe	d2c2928d86	nfp: flower: implement the PORT_REIFY message The PORT_REIFY message indicates whether reprs have been created or when they are about to be destroyed. This is necessary so firmware can know which state the driver is in, e.g. the firmware will not send any control messages related to ports when the reprs are destroyed. This prevents nuisance warning messages printed whenever the firmware sends updates for non-existent reprs. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-03 12:17:49 -05:00
Dirk van der Merwe	0f08479143	nfp: add repr_preclean callback Just before a repr is cleaned up, we give the app a chance to perform some preclean configuration while the reprs pointer is still configured for the app. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-03 12:17:38 -05:00
Dirk van der Merwe	c6d20ab4d7	nfp: flower: obtain repr link state only from firmware Instead of starting up reprs assuming that there is link, only respond to the link state reported by firmware. Furthermore, ensure link is down after repr netdevs are created. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-03 12:17:30 -05:00
Jakub Kicinski	cae1927c0b	bpf: offload: allow netdev to disappear while verifier is running To allow verifier instruction callbacks without any extra locking NETDEV_UNREGISTER notification would wait on a waitqueue for verifier to finish. This design decision was made when rtnl lock was providing all the locking. Use the read/write lock instead and remove the workqueue. Verifier will now call into the offload code, so dev_ops are moved to offload structure. Since verifier calls are all under bpf_prog_is_dev_bound() we no longer need static inline implementations to please builds with CONFIG_NET=n. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-31 16:12:23 +01:00
Jakub Kicinski	4f83435ad7	nfp: bpf: allocate vNIC priv for keeping track of the offloaded program After TC offloads were converted to callbacks we have no choice but keep track of the offloaded filter in the driver. Since this change came a little late in the release cycle there were a number of conflicts and allocation of vNIC priv structure seems to have slipped away in linux-next. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-27 20:37:38 -05:00
David S. Miller	fba961ab29	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Lots of overlapping changes. Also on the net-next side the XDP state management is handled more in the generic layers so undo the 'net' nfp fix which isn't applicable in net-next. Include a necessary change by Jakub Kicinski, with log message: ==================== cls_bpf no longer takes care of offload tracking. Make sure netdevsim performs necessary checks. This fixes a warning caused by TC trying to remove a filter it has not added. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-22 11:16:31 -05:00
Jakub Kicinski	d3f89b98e3	nfp: bpf: keep track of the offloaded program After TC offloads were converted to callbacks we have no choice but keep track of the offloaded filter in the driver. The check for nn->dp.bpf_offload_xdp was a stop gap solution to make sure failed TC offload won't disable XDP, it's no longer necessary. nfp_net_bpf_offload() will return -EBUSY on TC vs XDP conflicts. Fixes: `3f7889c4c7` ("net: sched: cls_bpf: call block callbacks for offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-20 13:08:18 -05:00
Jakub Kicinski	102740bd94	cls_bpf: fix offload assumptions after callback conversion cls_bpf used to take care of tracking what offload state a filter is in, i.e. it would track if offload request succeeded or not. This information would then be used to issue correct requests to the driver, e.g. requests for statistics only on offloaded filters, removing only filters which were offloaded, using add instead of replace if previous filter was not added etc. This tracking of offload state no longer functions with the new callback infrastructure. There could be multiple entities trying to offload the same filter. Throw out all the tracking and corresponding commands and simply pass to the drivers both old and new bpf program. Drivers will have to deal with offload state tracking by themselves. Fixes: `3f7889c4c7` ("net: sched: cls_bpf: call block callbacks for offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-20 13:08:18 -05:00
John Hurley	3ca3059dc3	nfp: flower: compile Geneve encap actions Generate rules for the NFP to encapsulate packets in Geneve tunnels. Move the vxlan action code to generic udp tunnel actions and use core code for both vxlan and Geneve. Only support outputting to well known port 6081. Setting tunnel options is not supported yet. Only attempt to offload if the fw supports Geneve. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-19 14:52:13 -05:00
John Hurley	bedeca15af	nfp: flower: compile Geneve match fields Compile Geneve match fields for offloading to the NFP. The addition of Geneve overflows the 8 bit key_layer field, so apply extended metadata to the match cmsg allowing up to 32 more key_layer fields. Rather than adding new Geneve blocks, move the vxlan code to generic ipv4 udp tunnel structs and use these for both vxlan and Geneve. Matches are only supported when specifically mentioning well known port 6081. Geneve tunnel options are not yet included in the match. Only offload Geneve if the fw supports it - include check for this. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-19 14:52:12 -05:00
John Hurley	739973486f	nfp: flower: read extra feature support from fw Extract the _abi_flower_extra_features symbol from the fw which gives a 64 bit bitmap of new features (on top of the flower base support) that the fw can offload. Store this bitmap in the priv data associated with each app. If the symbol does not exist, set the bitmap to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-19 14:52:12 -05:00
John Hurley	574f1e9ccc	nfp: flower: remove unused tun_mask variable The tunnel dest IP is required for separate offload to the NFP. It is already verified that a dest IP must be present and must be an exact match in the flower rule. Therefore, we can just extract the IP from the generated offload rule and remove the unused mask variable. The function is then no longer required to return the IP separately. Because tun_dst is localised to tunnel matches, move the declaration to the tunnel if branch. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-19 14:52:12 -05:00
David S. Miller	59436c9ee1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2017-12-18 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Allow arbitrary function calls from one BPF function to another BPF function. As of today when writing BPF programs, __always_inline had to be used in the BPF C programs for all functions, unnecessarily causing LLVM to inflate code size. Handle this more naturally with support for BPF to BPF calls such that this __always_inline restriction can be overcome. As a result, it allows for better optimized code and finally enables to introduce core BPF libraries in the future that can be reused out of different projects. x86 and arm64 JIT support was added as well, from Alexei. 2) Add infrastructure for tagging functions as error injectable and allow for BPF to return arbitrary error values when BPF is attached via kprobes on those. This way of injecting errors generically eases testing and debugging without having to recompile or restart the kernel. Tags for opting-in for this facility are added with BPF_ALLOW_ERROR_INJECTION(), from Josef. 3) For BPF offload via nfp JIT, add support for bpf_xdp_adjust_head() helper call for XDP programs. First part of this work adds handling of BPF capabilities included in the firmware, and the later patches add support to the nfp verifier part and JIT as well as some small optimizations, from Jakub. 4) The bpftool now also gets support for basic cgroup BPF operations such as attaching, detaching and listing current BPF programs. As a requirement for the attach part, bpftool can now also load object files through 'bpftool prog load'. This reuses libbpf which we have in the kernel tree as well. bpftool-cgroup man page is added along with it, from Roman. 5) Back then commit `e87c6bc385` ("bpf: permit multiple bpf attachments for a single perf event") added support for attaching multiple BPF programs to a single perf event. Given they are configured through perf's ioctl() interface, the interface has been extended with a PERF_EVENT_IOC_QUERY_BPF command in this work in order to return an array of one or multiple BPF prog ids that are currently attached, from Yonghong. 6) Various minor fixes and cleanups to the bpftool's Makefile as well as a new 'uninstall' and 'doc-uninstall' target for removing bpftool itself or prior installed documentation related to it, from Quentin. 7) Add CONFIG_CGROUP_BPF=y to the BPF kernel selftest config file which is required for the test_dev_cgroup test case to run, from Naresh. 8) Fix reporting of XDP prog_flags for nfp driver, from Jakub. 9) Fix libbpf's exit code from the Makefile when libelf was not found in the system, also from Jakub. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-18 10:51:06 -05:00
Jakub Kicinski	4a29c0db69	nfp: set flags in the correct member of netdev_bpf netdev_bpf.flags is the input member for installing the program. netdev_bpf.prog_flags is the output member for querying. Set the correct one on query. Fixes: `92f0292b35` ("net: xdp: report flags program was installed with on query") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-17 20:41:59 +01:00
Jakub Kicinski	0bce7c9a60	nfp: bpf: correct printk formats for size_t Build bot reported warning about invalid printk formats on 32bit architectures. Use %zu for size_t and %zd ptr diff. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 22:01:05 +01:00
Carl Heymann	28b2d7d04b	nfp: fix XPB register reads in debug dump For XPB registers reads, some island IDs require special handling (e.g. ARM island), which is already taken care of in nfp_xpb_readl(), so use that instead of a straight CPP read. Without this fix all "xpbm:ArmIsldXpbmMap.*" registers are reported as 0xffffffff. It has also been observed to cause a system reboot. With this fix correct values are reported, none of which are 0xffffffff. The values may be read using ethtool debug level 2. # ethtool -W <netdev> 2 # ethtool -w <netdev> data dump.dat Fixes: `0e6c4955e1` ("nfp: dump CPP, XPB and direct ME CSRs") Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 12:48:45 -05:00
Carl Heymann	da762863ed	nfp: fix absolute rtsym handling in debug dump In TLV-based ethtool debug dumps, don't do a CPP read for absolute rtsyms, use the addr field in the symbol table directly as the value. Without this fix rtsym gro_release_ring_0 is 4 bytes of zeros. With this fix the correct value, 0x0000004a 0x00000000 is reported. The values may be read using ethtool debug level 2. # ethtool -W <netdev> 2 # ethtool -w <netdev> data dump.dat Fixes: `e1e798e3fd` ("nfp: dump rtsyms") Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 12:48:45 -05:00
Dirk van der Merwe	7a74156591	nfp: implement firmware flashing Firmware flashing takes around 60s (specified to not take more than 70s). Prevent hogging the RTNL lock in this time and make use of the longer timeout for the NSP command. The timeout is set to 2.5 * 70 seconds. We only allow flashing the firmware from reprs or PF netdevs. VFs do not have an app reference. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 12:26:12 -05:00
Dirk van der Merwe	87a23801e5	nfp: extend NSP infrastructure for configurable timeouts The firmware flashing NSP operation takes longer to execute than the current default timeout. We need a mechanism to set a longer timeout for some commands. This patch adds the infrastructure to this. The default timeout is still 30 seconds. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 12:26:12 -05:00
Jakub Kicinski	8231f84441	nfp: bpf: optimize the adjust_head calls in trivial cases If the program is simple and has only one adjust head call with constant parameters, we can check that the call will always succeed at translation time. We need to track the location of the call and make sure parameters are always the same. We also have to check the parameters against datapath constraints and ETH_HLEN. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 14:18:18 +01:00
Jakub Kicinski	0d49eaf4db	nfp: bpf: add basic support for adjust head call Support bpf_xdp_adjust_head(). We need to check whether the packet offset after adjustment is within datapath's limits. We also check if the frame is at least ETH_HLEN long (similar to the kernel implementation). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 14:18:18 +01:00
Jakub Kicinski	2cb230bded	nfp: bpf: prepare for call support Add skeleton of verifier checks and translation handler for call instructions. Make sure jump target resolution will not treat them as jumps. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 14:18:18 +01:00
Jakub Kicinski	77a844ee65	nfp: bpf: prepare for parsing BPF FW capabilities BPF FW creates a run time symbol called bpf_capabilities which contains TLV-formatted capability information. Allocate app private structure to store parsed capabilities and add a skeleton of parsing logic. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 14:18:18 +01:00
Jakub Kicinski	a351ab565c	nfp: add nfp_cpp_area_size() accessor Allow users outside of core reading area sizes. This was not needed previously because whatever entity created the area would usually know what size it asked for. The nfp_rtsym_map() helper, however, will allocate the area based on the size of an RT-symbol with given name. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 14:18:18 +01:00
Carl Heymann	92a54f4a47	nfp: debug dump - decrease endian conversions Convert the requested dump level parameter to big-endian at the start of nfp_net_dump_calculate_size() and nfp_net_dump_populate_buffer(), then compare and assign it directly where needed in the traversal and prolog code. This decreases the total number of conversions used. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-11 12:08:13 -05:00
John Hurley	197171e5ba	nfp: flower: remove unused defines Delete match field defines that are not supported at this time. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-11 12:08:04 -05:00
John Hurley	a427673e1f	nfp: flower: remove dead code paths Port matching is selected by default on every rule so remove check for it and delete 'else' side of the statement. Remove nfp_flower_meta_one as now it will not feature in the code. Rename nfp_flower_meta_two given that one has been removed. 'Additional metadata' if statement can never be true so remove it as well. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-11 12:07:57 -05:00
John Hurley	de7d954984	nfp: flower: do not assume mac/mpls matches Remove the matching of mac/mpls as a default selection. These are not necessarily set by a TC rule (unlike the port). Previously a mac/mpls field would exist in every match and be masked out if not used. This patch has no impact on functionality but removes unnessary memory assignment in the match cmsg. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-11 12:07:47 -05:00
David S. Miller	51e18a453f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflict was two parallel additions of include files to sch_generic.c, no biggie. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-09 22:09:55 -05:00
Cong Wang	9f8a739e72	act_mirred: get rid of tcfm_ifindex from struct tcf_mirred tcfm_dev always points to the correct netdev and we already hold a refcnt, so no need to use tcfm_ifindex to lookup again. If we would support moving target netdev across netns, using pointer would be better than ifindex. This also fixes dumping obsolete ifindex, now after the target device is gone we just dump 0 as ifindex. Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-06 14:50:13 -05:00
Carl Heymann	60b84a9b38	nfp: dump indirect ME CSRs - The spec defines CSR address ranges for indirect ME CSRs. For Each TLV chunk in the spec, dump a chunk that includes the spec and the data over the defined address range. - Each indirect CSR has 8 contexts. To read one context, first write the context to a specific derived address, read it back, and then read the register value. - For each address, read and dump all 8 contexts in this manner. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:03 -05:00
Carl Heymann	0e6c4955e1	nfp: dump CPP, XPB and direct ME CSRs - The spec defines CSR address ranges for these types. - Dump each TLV chunk in the spec as a chunk that includes the spec and the data over the defined address range. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:02 -05:00
Carl Heymann	e9364d30d5	nfp: dump firmware name Dump FW name as TLV, based on dump specification. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:02 -05:00
Carl Heymann	10144de383	nfp: dump single hwinfo field by key - Add spec TLV for hwinfo field, containing key string as data. - Add dump TLV for hwinfo field, with data being key and value as packed zero-terminated strings. - If specified hwinfo field is not found, dump the spec TLV as -ENOENT error. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:02 -05:00
Carl Heymann	24ff8455af	nfp: dump all hwinfo - Dump hwinfo as separate TLV chunk, in a packed format containing zero-separated key and value strings. - This provides additional debug context, if requested by the dumpspec. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:02 -05:00
Carl Heymann	e1e798e3fd	nfp: dump rtsyms - Support rtsym TLVs. - If specified rtsym is not found, dump the spec TLV as -ENOENT error. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:01 -05:00
Carl Heymann	f3682c7866	nfp: dumpspec TLV traversal - Perform dumpspec traversals for calculating size and populating the dump. - Initially, wrap all spec TLVs in dump error TLVs (changed by later patches in the series). Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:01 -05:00
Carl Heymann	f7852b8e9e	nfp: dump prolog - Use a TLV structure, with the typed chunks aligned to 8-byte sizes. - Dump numeric fields as big-endian. - Prolog contains the dump level. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:01 -05:00
Carl Heymann	8a925303b6	nfp: load debug dump spec Load the TLV-based binary specification of what needs to be included in a dump, from the "_abi_dump_spec" rtsymbol. If the symbol is not defined, then dumps for levels >= 1 are not supported. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:01 -05:00
Carl Heymann	d79e19f564	nfp: debug dump ethtool ops - Skeleton code to perform a binary debug dump via ethtoolops "set_dump", "get_dump_flags" and "get_dump_data", i.e. the ethtool -W/w mechanism. - Skeleton functions for debugdump operations provided. - An integer "dump level" can be specified, this is stored between ethtool invocations. Dump level 0 is still the "arm.diag" resource for backward compatibility. Other dump levels each define a set of state information to include in the dump, driven by a spec from FW. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:01 -05:00
Pieter Jansen van Vuuren	42d779ffc1	nfp: fix port stats for mac representors Previously we swapped the tx_packets, tx_bytes and tx_dropped counters with rx_packets, rx_bytes and rx_dropped counters, respectively. This behaviour is correct and expected for VF representors but it should not be swapped for physical port mac representors. Fixes: `eadfa4c3be` ("nfp: add stats and xmit helpers for representors") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 11:29:36 -05:00
Jakub Kicinski	bd0b2e7fe6	net: xdp: make the stack take care of the tear down Since day one of XDP drivers had to remember to free the program on the remove path. This leads to code duplication and is error prone. Make the stack query the installed programs on unregister and if something is installed, remove the program. Freeing of program attached to XDP generic is moved from free_netdev() as well. Because the remove will now be called before notifiers are invoked, BPF offload state of the program will not get destroyed before uninstall. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-03 00:27:57 +01:00
Jakub Kicinski	92f0292b35	net: xdp: report flags program was installed with on query Some drivers enforce that flags on program replacement and removal must match the flags passed on install. This leaves the possibility open to enable simultaneous loading of XDP programs both to HW and DRV. Allow such drivers to report the flags back to the stack. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-03 00:27:57 +01:00
Jiong Wang	6bc7103c89	nfp: bpf: detect load/store sequences lowered from memory copy This patch add the optimization frontend, but adding a new eBPF IR scan pass "nfp_bpf_opt_ldst_gather". The pass will traverse the IR to recognize the load/store pairs sequences that come from lowering of memory copy builtins. The gathered memory copy information will be kept in the meta info structure of the first load instruction in the sequence and will be consumed by the optimization backend added in the previous patches. NOTE: a sequence with cross memory access doesn't qualify this optimization, i.e. if one load in the sequence will load from place that has been written by previous store. This is because when we turn the sequence into single CPP operation, we are reading all contents at once into NFP transfer registers, then write them out as a whole. This is not identical with what the original load/store sequence is doing. Detecting cross memory access for two random pointers will be difficult, fortunately under XDP/eBPF's restrictied runtime environment, the copy normally happen among map, packet data and stack, they do not overlap with each other. And for cases supported by NFP, cross memory access will only happen on PTR_TO_PACKET. Fortunately for this, there is ID information that we could do accurate memory alias check. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	8c90053858	nfp: bpf: implement memory bulk copy for length bigger than 32-bytes When the gathered copy length is bigger than 32-bytes and within 128-bytes (the maximum length a single CPP Pull/Push request can finish), the strategy of read/write are changeed into: * Read. - use direct reference mode when length is within 32-bytes. - use indirect mode when length is bigger than 32-bytes. * Write. - length <= 8-bytes use write8 (direct_ref). - length <= 32-byte and 4-bytes aligned use write32 (direct_ref). - length <= 32-bytes but not 4-bytes aligned use write8 (indirect_ref). - length > 32-bytes and 4-bytes aligned use write32 (indirect_ref). - length > 32-bytes and not 4-bytes aligned and <= 40-bytes use write32 (direct_ref) to finish the first 32-bytes. use write8 (direct_ref) to finish all remaining hanging part. - length > 32-bytes and not 4-bytes aligned use write32 (indirect_ref) to finish those 4-byte aligned parts. use write8 (direct_ref) to finish all remaining hanging part. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	9879a3814b	nfp: bpf: implement memory bulk copy for length within 32-bytes For NFP, we want to re-group a sequence of load/store pairs lowered from memcpy/memmove into single memory bulk operation which then could be accelerated using NFP CPP bus. This patch extends the existing load/store auxiliary information by adding two new fields: struct bpf_insn paired_st; s16 ldst_gather_len; Both fields are supposed to be carried by the the load instruction at the head of the sequence. "paired_st" is the corresponding store instruction at the head and "ldst_gather_len" is the gathered length. If "ldst_gather_len" is negative, then the sequence is doing memory load/store in descending order, otherwise it is in ascending order. We need this information to detect overlapped memory access. This patch then optimize memory bulk copy when the copy length is within 32-bytes. The strategy of read/write used is: Read. Use read32 (direct_ref), always. * Write. - length <= 8-bytes write8 (direct_ref). - length <= 32-bytes and is 4-byte aligned write32 (direct_ref). - length <= 32-bytes but is not 4-byte aligned write8 (indirect_ref). NOTE: the optimization should not change program semantics. The destination register of the last load instruction should contain the same value before and after this optimization. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	5e4d6d2093	nfp: bpf: factor out is_mbpf_load & is_mbpf_store It is usual that we need to check if one BPF insn is for loading/storeing data from/to memory. Therefore, it makes sense to factor out related code to become common helper functions. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jakub Kicinski	5468a8b929	nfp: bpf: encode indirect commands Add support for emitting commands with field overwrites. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	3239e7bb28	nfp: bpf: correct the encoding for No-Dest immed When immed is used with No-Dest, the emitter should use reg.dst instead of reg.areg for the destination, using the latter will actually encode register zero. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	08859f159e	nfp: bpf: relax source operands check The NFP normally requires the source operands to be difference addressing modes, but we should rule out the very special NN_REG_NONE type. There are instruction that ignores both A/B operands, for example: local_csr_rd For these instructions, we might pass the same operand type, NN_REG_NONE, for both A/B operands. NOTE: in current NFP ISA, it is only possible for instructions with unrestricted operands to take none operands, but in case there is new and similar instructoin in restricted form, they would follow similar rules, so swreg_to_restricted is updated as well. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	29fe46efba	nfp: bpf: don't do ld/shifts combination if shifts are jump destination If any of the shift insns in the ld/shift sequence is jump destination, don't do combination. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	1266f5d655	nfp: bpf: don't do ld/mask combination if mask is jump destination If the mask insn in the ld/mask pair is jump destination, then don't do combination. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	a09d5c52c4	nfp: bpf: flag jump destination to guide insn combine optimizations NFP eBPF offload JIT engine is doing some instruction combine based optimizations which however must not be safe if the combined sequences are across basic block boarders. Currently, there are post checks during fixing jump destinations. If the jump destination is found to be eBPF insn that has been combined into another one, then JIT engine will raise error and abort. This is not optimal. The JIT engine ought to disable the optimization on such cross-bb-border sequences instead of abort. As there is no control flow information in eBPF infrastructure that we can't do basic block based optimizations, this patch extends the existing jump destination record pass to also flag the jump destination, then in instruction combine passes we could skip the optimizations if insns in the sequence are jump targets. Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:19 +01:00
Jiong Wang	5b674140ad	nfp: bpf: record jump destination to simplify jump fixup eBPF insns are internally organized as dual-list inside NFP offload JIT. Random access to an insn needs to be done by either forward or backward traversal along the list. One place we need to do such traversal is at nfp_fixup_branches where one traversal is needed for each jump insn to find the destination. Such traversals could be avoided if jump destinations are collected through a single travesal in a pre-scan pass, and such information could also be useful in other places where jump destination info are needed. This patch adds such jump destination collection in nfp_prog_prepare. Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:19 +01:00
Jiong Wang	854dc87d1a	nfp: bpf: support backward jump This patch adds support for backward jump on NFP. - restrictions on backward jump in various functions have been removed. - nfp_fixup_branches now supports backward jump. There is one thing to note, currently an input eBPF JMP insn may generate several NFP insns, for example, NFP imm move insn A \ NFP compare insn B --> 3 NFP insn jited from eBPF JMP insn M NFP branch insn C / --- NFP insn X --> 1 NFP insn jited from eBPF insn N --- ... therefore, we are doing sanity check to make sure the last jited insn from an eBPF JMP is a NFP branch instruction. Once backward jump is allowed, it is possible an eBPF JMP insn is at the end of the program. This is however causing trouble for the sanity check. Because the sanity check requires the end index of the NFP insns jited from one eBPF insn while only the start index is recorded before this patch that we can only get the end index by: start_index_of_the_next_eBPF_insn - 1 or for the above example: start_index_of_eBPF_insn_N (which is the index of NFP insn X) - 1 nfp_fixup_branches was using nfp_for_each_insn_walk2 to expose next insn to each iteration during the traversal so the last index could be calculated from which. Now, it needs some extra code to handle the last insn. Meanwhile, the use of walk2 is actually unnecessary, we could simply use generic single instruction walk to do this, the next insn could be easily calculated using list_next_entry. So, this patch migrates the jump fixup traversal method to list_for_each_entry, this simplifies the code logic a little bit. The other thing to note is a new state variable "last_bpf_off" is introduced to track the index of the last jited NFP insn. This is necessary because NFP is generating special purposes epilogue sequences, so the index of the last jited NFP insn is not always nfp_prog->prog_len - 1. Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:19 +01:00
Jakub Kicinski	a646c9b2da	nfp: fix old kdoc issues Since commit `3a025e1d1c` ("Add optional check for bad kernel-doc comments") when built with W=1 build will complain about kdoc errors. Fix the kdoc issues we have. kdoc is still confused by defines in nfp_net_ctrl.h but those are not really errors. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:19 +01:00
David S. Miller	e4be7baba8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2017-11-23 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Several BPF offloading fixes, from Jakub. Among others: - Limit offload to cls_bpf and XDP program types only. - Move device validation into the driver and don't make any assumptions about the device in the classifier due to shared blocks semantics. - Don't pass offloaded XDP program into the driver when it should be run in native XDP instead. Offloaded ones are not JITed for the host in such cases. - Don't destroy device offload state when moved to another namespace. - Revert dumping offload info into user space for now, since ifindex alone is not sufficient. This will be redone properly for bpf-next tree. 2) Fix test_verifier to avoid using bpf_probe_write_user() helper in test cases, since it's dumping a warning into kernel log which may confuse users when only running tests. Switch to use bpf_trace_printk() instead, from Yonghong. 3) Several fixes for correcting ARG_CONST_SIZE_OR_ZERO semantics before it becomes uabi, from Gianluca. More specifically: - Add a type ARG_PTR_TO_MEM_OR_NULL that is used only by bpf_csum_diff(), where the argument is either a valid pointer or NULL. The subsequent ARG_CONST_SIZE_OR_ZERO then enforces a valid pointer in case of non-0 size or a valid pointer or NULL in case of size 0. Given that, the semantics for ARG_PTR_TO_MEM in combination with ARG_CONST_SIZE_OR_ZERO are now such that in case of size 0, the pointer must always be valid and cannot be NULL. This fix in semantics allows for bpf_probe_read() to drop the recently added size == 0 check in the helper that would become part of uabi otherwise once released. At the same time we can then fix bpf_probe_read_str() and bpf_perf_event_output() to use ARG_CONST_SIZE_OR_ZERO instead of ARG_CONST_SIZE in order to fix recently reported issues by Arnaldo et al, where LLVM optimizes two boundary checks into a single one for unknown variables where the verifier looses track of the variable bounds and thus rejects valid programs otherwise. 4) A fix for the verifier for the case when it detects comparison of two constants where the branch is guaranteed to not be taken at runtime. Verifier will rightfully prune the exploration of such paths, but we still pass the program to JITs, where they would complain about using reserved fields, etc. Track such dead instructions and sanitize them with mov r0,r0. Rejection is not possible since LLVM may generate them for valid C code and doesn't do as much data flow analysis as verifier. For bpf-next we might implement removal of such dead code and adjust branches instead. Fix from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-24 02:33:01 +09:00
Jakub Kicinski	b48b1f7ac7	nfp: flower: add missing kdoc Commit `0115552eac` ("nfp: remove false positive offloads in flower vxlan") missed adding kdoc for a new parameter of nfp_flower_add_offload(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-21 20:24:37 +09:00
Jakub Kicinski	288b3de55a	bpf: offload: move offload device validation out to the drivers With TC shared block changes we can't depend on correct netdev pointer being available in cls_bpf. Move the device validation to the driver. Core will only make sure that offloaded programs are always attached in the driver (or in HW by the driver). We trust that drivers which implement offload callbacks will perform necessary checks. Moving the checks to the driver is generally a useful thing, in practice the check should be against a switchdev instance, not a netdev, given that most ASICs will probably allow using the same program on many ports. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-21 00:37:35 +01:00
Linus Torvalds	8170024750	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Revert regression inducing change to the IPSEC template resolver, from Steffen Klassert. 2) Peeloffs can cause the wrong sk to be waken up in SCTP, fix from Xin Long. 3) Min packet MTU size is wrong in cpsw driver, from Grygorii Strashko. 4) Fix build failure in netfilter ctnetlink, from Arnd Bergmann. 5) ISDN hisax driver checks pnp_irq() for errors incorrectly, from Arvind Yadav. 6) Fix fealnx driver build failure on MIPS, from Huacai Chen. 7) Fix into leak in SCTP, the scope_id of socket addresses is not always filled in. From Eric W. Biederman. 8) MTU inheritance between physical function and representor fix in nfp driver, from Dirk van der Merwe. 9) Fix memory leak in rsi driver, from Colin Ian King. 10) Fix expiration and generation ID handling of cached ipv4 redirect routes, from Xin Long. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits) net: usb: hso.c: remove unneeded DRIVER_LICENSE #define ibmvnic: fix dma_mapping_error call ipvlan: NULL pointer dereference panic in ipvlan_port_destroy route: also update fnhe_genid when updating a route cache route: update fnhe_expires for redirect when the fnhe exists sctp: set frag_point in sctp_setsockopt_maxseg correctly rsi: fix memory leak on buf and usb_reg_buf net/netlabel: Add list_next_rcu() in rcu_dereference(). nfp: remove false positive offloads in flower vxlan nfp: register flower reprs for egress dev offload nfp: inherit the max_mtu from the PF netdev nfp: fix vlan receive MAC statistics typo nfp: fix flower offload metadata flag usage virto_net: remove empty file 'virtio_net.' net/sctp: Always set scope_id in sctp_inet6_skb_msgname fealnx: Fix building error on MIPS isdn: hisax: Fix pnp_irq's error checking for setup_teles3 isdn: hisax: Fix pnp_irq's error checking for setup_sedlbauer_isapnp isdn: hisax: Fix pnp_irq's error checking for setup_niccy isdn: hisax: Fix pnp_irq's error checking for setup_ix1micro ...	2017-11-17 20:18:37 -08:00
John Hurley	0115552eac	nfp: remove false positive offloads in flower vxlan Pass information to the match offload on whether or not the repr is the ingress or egress dev. Only accept tunnel matches if repr is the egress dev. This means rules such as the following are successfully offloaded: tc .. add dev vxlan0 .. enc_dst_port 4789 .. action redirect dev nfp_p0 While rules such as the following are rejected: tc .. add dev nfp_p0 .. enc_dst_port 4789 .. action redirect dev vxlan0 Also reject non tunnel flows that are offloaded to an egress dev. Non tunnel matches assume that the offload dev is the ingress port and offload a match accordingly. Fixes: `611aec101a` ("nfp: compile flower vxlan tunnel metadata match fields") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-17 14:09:36 +09:00
John Hurley	1a24d4f9c0	nfp: register flower reprs for egress dev offload Register a callback for offloading flows that have a repr as their egress device. The new egdev_register function is added to net-next for the 4.15 release. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-17 14:09:36 +09:00
Dirk van der Merwe	743ba5b47f	nfp: inherit the max_mtu from the PF netdev The PF netdev is used for data transfer for reprs, so reprs inherit the maximum MTU settings of the PF netdev. Fixes: `5de73ee467` ("nfp: general representor implementation") Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-17 14:09:36 +09:00
Pieter Jansen van Vuuren	745eaf9afe	nfp: fix vlan receive MAC statistics typo Correct typo in vlan receive MAC stats. Previously the MAC statistics reported in ethtool for vlan receive contained a typo resulting in ethtool reporting rx_vlan_reveive_ok instead of rx_vlan_received_ok. Fixes: `a5950182c0` ("nfp: map mac_stats and vf_cfg BARs") Fixes: `098ce840c9` ("nfp: report MAC statistics in ethtool") Reported-by: Brendan Galloway <brendan.galloway@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-17 14:09:35 +09:00
Pieter Jansen van Vuuren	6c3ab204f4	nfp: fix flower offload metadata flag usage Hardware has no notion of new or last mask id, instead it makes use of the message type (i.e. add flow or del flow) in combination with a single bit in metadata flags to determine when to add or delete a mask id. Previously we made use of the new or last flags to indicate that a new mask should be allocated or deallocated, respectively. This incorrect behaviour is fixed by making use single bit in metadata flags to indicate mask allocation or deallocation. Fixes: `43f84b72c5` ("nfp: add metadata to each flow offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-17 14:09:35 +09:00
Linus Torvalds	7c225c69f8	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: - a few misc bits - ocfs2 updates - almost all of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (131 commits) memory hotplug: fix comments when adding section mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP mm: simplify nodemask printing mm,oom_reaper: remove pointless kthread_run() error check mm/page_ext.c: check if page_ext is not prepared writeback: remove unused function parameter mm: do not rely on preempt_count in print_vma_addr mm, sparse: do not swamp log with huge vmemmap allocation failures mm/hmm: remove redundant variable align_end mm/list_lru.c: mark expected switch fall-through mm/shmem.c: mark expected switch fall-through mm/page_alloc.c: broken deferred calculation mm: don't warn about allocations which stall for too long fs: fuse: account fuse_inode slab memory as reclaimable mm, page_alloc: fix potential false positive in __zone_watermark_ok mm: mlock: remove lru_add_drain_all() mm, sysctl: make NUMA stats configurable shmem: convert shmem_init_inodecache() to void Unify migrate_pages and move_pages access checks mm, pagevec: rename pagevec drained field ...	2017-11-15 19:42:40 -08:00
Mel Gorman	453f85d43f	mm: remove __GFP_COLD As the page free path makes no distinction between cache hot and cold pages, there is no real useful ordering of pages in the free list that allocation requests can take advantage of. Juding from the users of __GFP_COLD, it is likely that a number of them are the result of copying other sites instead of actually measuring the impact. Remove the __GFP_COLD parameter which simplifies a number of paths in the page allocator. This is potentially controversial but bear in mind that the size of the per-cpu pagelists versus modern cache sizes means that the whole per-cpu list can often fit in the L3 cache. Hence, there is only a potential benefit for microbenchmarks that alloc/free pages in a tight loop. It's even worse when THP is taken into account which has little or no chance of getting a cache-hot page as the per-cpu list is bypassed and the zeroing of multiple pages will thrash the cache anyway. The truncate microbenchmarks are not shown as this patch affects the allocation path and not the free path. A page fault microbenchmark was tested but it showed no sigificant difference which is not surprising given that the __GFP_COLD branches are a miniscule percentage of the fault path. Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-15 18:21:06 -08:00
Manish Kurup	bf068bdd3c	nfp flower action: Modified to use VLAN helper functions Modified netronome nfp flower action to use VLAN helper functions instead of accessing/referencing TC act_vlan private structures directly. Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Manish Kurup <manish.kurup@verizon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-10 15:32:20 +09:00
Dirk van der Merwe	0d08709383	nfp: implement ethtool FEC mode settings Add support in the driver ethtool ops to modify the NFP FEC modes. The FEC modes can be set for vNIC associated with physical ports or for MAC representor netdevs. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:23:27 +09:00
Dirk van der Merwe	b471232e2c	nfp: add helpers for FEC support Implement helpers to determine and modify FEC modes via the NSP. The NSP advertises FEC capabilities on a per port basis and provides support for: * Auto mode selection * Reed Solomon * BaseR * None/Off Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:23:26 +09:00
Dirk van der Merwe	a564d30ec2	nfp: add get/set link settings ndos to representors Since it is now safe to modify link settings for representors, we can attach the get/set link settings ndos to it. The get/set link settings are nfp_port based operations. If a port becomes invalid, the representor will be removed in the same way a vnic would be. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:23:26 +09:00
Dirk van der Merwe	5fa27d59af	nfp: resync repr state when port table sync If the NSP port table has been refreshed, resync the representor state with the new port information. At the moment, this only entails looking for invalid ports and killing off representors associated with them. The repr instance becomes NULL which is safe since the app accessor function for reprs returns NULL when it cannot access a repr. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:23:26 +09:00
Dirk van der Merwe	51ccc37d9d	nfp: refactor nfp_app_reprs_set The criteria that reprs cannot be replaced with another new set of reprs has been removed. This check is not needed since the only use case that could exercise this at the moment, would be to modify the number of SRIOV VFs without first disabling them. This case is explicitly disallowed in any case and subsequent patches in this series need to be able to replace the running set of reprs. All cases where the return code used to be checked for the nfp_app_reprs_set function have been removed. As stated above, it is not possible for the current code to encounter a case where reprs exist and need to be replaced. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:23:26 +09:00
Jakub Kicinski	7717c319d8	nfp: make use of MAC reinit Recent management FW images can perform full reinit of MAC cores without requiring a reboot. When loading the driver check if there are changes pending and if so call NSP MAC reinit. Full application FW reload is still required, and all MACs need to be reinited at the same time (not only the ones which have been reconfigured, and thus potentially causing disruption to unrelated netdevs) therefore for now changing MAC config without reloading the driver still remains future work. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Tested-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:23:26 +09:00
Jakub Kicinski	4e59532541	nfp: don't depend on compiler constant propagation Matthias reports: nfp_eth_set_bit_config() is marked as __always_inline to allow gcc to identify the 'mask' parameter as known to be constant at compile time, which is required to use the FIELD_GET() macro. The forced inlining does the trick for gcc, but for kernel builds with clang it results in undefined symbols: drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.o: In function `__nfp_eth_set_aneg': drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c:(.text+0x787): undefined reference to `__compiletime_assert_492' drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c:(.text+0x7b1): undefined reference to `__compiletime_assert_496' These __compiletime_assert_xyx() calls would have been optimized away if the compiler had seen 'mask' as a constant. Add a macro to extract the mask and shift and pass those to nfp_eth_set_bit_config() separately. Reported-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Tested-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:23:26 +09:00
Jakub Kicinski	c6c580d7bc	nfp: bpf: move to new BPF program offload infrastructure Following steps are taken in the driver to offload an XDP program: XDP_SETUP_PROG: * prepare: - allocate program state; - run verifier (bpf_analyzer()); - run translation; * load: - stop old program if needed; - load program; - enable BPF if not enabled; * clean up: - free program image. With new infrastructure the flow will look like this: BPF_OFFLOAD_VERIFIER_PREP: - allocate program state; BPF_OFFLOAD_TRANSLATE: - run translation; XDP_SETUP_PROG: - stop old program if needed; - load program; - enable BPF if not enabled; BPF_OFFLOAD_DESTROY: - free program image. Take advantage of the new infrastructure. Allocation of driver metadata has to be moved from jit.c to offload.c since it's now done at a different stage. Since there is no separate driver private data for verification step, move temporary nfp_meta pointer into nfp_prog. We will now use user space context offsets. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:19 +09:00
Jakub Kicinski	9314c442d7	nfp: bpf: move translation prepare to offload.c struct nfp_prog is currently only used internally by the translator. This means there is a lot of parameter passing going on, between the translator and different stages of offload. Simplify things by allocating nfp_prog in offload.c already. We will now use kmalloc() to allocate the program area and only DMA map it for the time of loading (instead of allocating DMA coherent memory upfront). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:19 +09:00
Jakub Kicinski	c1c88eae8a	nfp: bpf: move program prepare and free into offload.c Most of offload/translation prepare logic will be moved to offload.c. To help git generate more reasonable diffs move nfp_prog_prepare() and nfp_prog_free() functions there as a first step. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:19 +09:00
Jakub Kicinski	e4a91cd565	nfp: bpf: require seamless reload for program replace Firmware supports live replacement of programs for quite some time now. Remove the software-fallback related logic and depend on the FW for program replace. Seamless reload will become a requirement if maps are present, anyway. Load and start stages have to be split now, since replace only needs a load, start has already been done on add. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:19 +09:00
Jakub Kicinski	9ce7a95632	nfp: bpf: refactor offload logic We currently create a fake cls_bpf offload object when we want to offload XDP. Simplify and clarify the code by moving the TC/XDP specific logic out of common offload code. This is easy now that we don't support legacy TC actions. We only need the bpf program and state of the skip_sw flag. Temporarily set @code to NULL in nfp_net_bpf_offload(), compilers seem to have trouble recognizing it's always initialized. Next patches will eliminate that variable. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:19 +09:00
Jakub Kicinski	5559eedb78	nfp: bpf: remove unnecessary include of nfp_net.h BPF offload's main header does not need to include nfp_net.h. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:19 +09:00
Jakub Kicinski	94508438e8	nfp: bpf: remove the register renumbering leftovers The register renumbering was removed and will not be coming back in its old, naive form, given that it would be fundamentally incompatible with calling functions. Remove the leftovers. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:19 +09:00
Jakub Kicinski	012bb8a8b5	nfp: bpf: drop support for cls_bpf with legacy actions Only support BPF_PROG_TYPE_SCHED_CLS programs in direct action mode. This simplifies preparing the offload since there will now be only one mode of operation for that type of program. We need to know the attachment mode type of cls_bpf programs, because exit codes are interpreted differently for legacy vs DA mode. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:19 +09:00
Jakub Kicinski	f4e63525ee	net: bpf: rename ndo_xdp to ndo_bpf ndo_xdp is a control path callback for setting up XDP in the driver. We can reuse it for other forms of communication between the eBPF stack and the drivers. Rename the callback and associated structures and definitions. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:18 +09:00
David S. Miller	2a171788ba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Files removed in 'net-next' had their license header updated in 'net'. We take the remove from 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-04 09:26:51 +09:00
Linus Torvalds	ead751507d	License cleanup: add SPDX license identifiers to some files Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWfswbQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ykvEwCfXU1MuYFQGgMdDmAZXEc+xFXZvqgAoKEcHDNA 6dVh26uchcEQLN/XqUDt =x306 -----END PGP SIGNATURE----- Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull initial SPDX identifiers from Greg KH: "License cleanup: add SPDX license identifiers to some files Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: License cleanup: add SPDX license identifier to uapi header files with a license License cleanup: add SPDX license identifier to uapi header files with no license License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 10:04:46 -07:00
Jakub Kicinski	18f7619179	nfp: improve defines for constants in ethtool We split rvector stats into two categories - per queue and stats which are added up into one total counter. Improve the defines denoting their number. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 20:27:11 +09:00
Jakub Kicinski	16f50cda06	nfp: use a counter instead of log message for allocation failures Add a counter incremented when allocation of replacement RX page fails. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 20:27:11 +09:00
Jakub Kicinski	790a399171	nfp: switch to dev_alloc_page() Use the dev_alloc_page() networking helper to allocate pages for RX packets. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 20:27:11 +09:00
Jakub Kicinski	43b45245e5	nfp: bpf: fall back to core NIC app if BPF not selected If kernel config does not include BPF just replace the BPF app handler with the handler for basic NIC. The BPF app will now be built only if BPF infrastructure is selected in kernel config. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 20:27:11 +09:00
Jakub Kicinski	2c4197a041	nfp: reorganize the app table The app table is an unordered array right now. We have to search apps by ID. It also makes it harder to fall back to core NIC if advanced functions are not compiled into the kernel (e.g. eBPF). Make the table keyed by app id. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 20:27:11 +09:00
Jakub Kicinski	f449657f83	nfp: bpf: reject TC offload if XDP loaded Recent TC changes dropped the check protecting us from trying to offload a TC program if XDP programs are already loaded. Fixes: `90d97315b3` ("nfp: bpf: Convert ndo_setup_tc offloads to block callbacks") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 20:27:11 +09:00
John Hurley	dc4646a950	nfp: flower: vxlan - ensure no sleep in atomic context Functions called by the netevent notifier must be in atomic context. Change the mutex to spinlock and ensure mem allocations are done with the atomic flag. Also, remove unnecessary locking after notifiers are unregistered. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 20:27:11 +09:00
John Hurley	2df7b2d206	nfp: flower: app should use struct nfp_repr Ensure priv netdev data in flower app is cast to nfp_repr and not nfp_net as in other apps. Fixes: `363fc53b8b` ("nfp: flower: Convert ndo_setup_tc offloads to block callbacks") Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 20:27:11 +09:00
Greg Kroah-Hartman	b24413180f	License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-02 11:10:55 +01:00
Jiong Wang	254ef4d746	nfp: bpf: support [BPF_ALU \| BPF_ALU64] \| BPF_NEG This patch supports BPF_NEG under both BPF_ALU64 and BPF_ALU. LLVM recently starts to generate it. NOTE: BPF_NEG takes single operand which is an register and serve as both input and output. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 16:47:30 +09:00
Jiong Wang	5d42ced195	nfp: bpf: rename ALU_OP_NEG to ALU_OP_NOT The current ALU_OP_NEG is Op encoding 0x4 for NPF ALU instruction. It is actually performing "~B" operation which is bitwise NOT. The using naming ALU_OP_NEG is misleading as NEG is -B which is not the same as ~B. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 16:47:30 +09:00
Jiri Pirko	44ae12a768	net: sched: move the can_offload check from binding phase to rule insertion phase This restores the original behaviour before the block callbacks were introduced. Allow the drivers to do binding of block always, no matter if the NETIF_F_HW_TC feature is on or off. Move the check to the block callback which is called for rule insertion. Reported-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 16:10:39 +09:00
Alexei Starovoitov	638f5b90d4	bpf: reduce verifier memory consumption the verifier got progressively smarter over time and size of its internal state grew as well. Time to reduce the memory consumption. Before: sizeof(struct bpf_verifier_state) = 6520 After: sizeof(struct bpf_verifier_state) = 896 It's done by observing that majority of BPF programs use little to no stack whereas verifier kept all of 512 stack slots ready always. Instead dynamically reallocate struct verifier state when stack access is detected. Runtime difference before vs after is within a noise. The number of processed instructions stays the same. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-01 11:41:18 +09:00
David S. Miller	e1ea2f9856	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several conflicts here. NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to nfp_fl_output() needed some adjustments because the code block is in an else block now. Parallel additions to net/pkt_cls.h and net/sch_generic.h A bug fix in __tcp_retransmit_skb() conflicted with some of the rbtree changes in net-next. The tc action RCU callback fixes in 'net' had some overlap with some of the recent tcf_block reworking. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-30 21:09:24 +09:00
Pablo Cascón	a267eaebfc	nfp: inform the VF driver needs to be restarted after changing the MAC Add message to inform the VF MAC was changed and the need to restart the VF driver for the changes to be effective. Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 18:59:48 +09:00
Kees Cook	3248f77fa3	drivers/net: netronome: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Simon Horman <simon.horman@netronome.com> Cc: oss-drivers@netronome.com Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-27 12:09:15 +09:00
Pieter Jansen van Vuuren	d309ae5c6a	nfp: refuse offloading filters that redirects to upper devices Previously we did not ensure that a netdev is a representative netdev before dereferencing its private data. This can occur when an upper netdev is created on a representative netdev. This patch corrects this by first ensuring that the netdev is a representative netdev before using it. Checking only switchdev_port_same_parent_id is not sufficient to ensure that we can safely use the netdev. Failing to check that the netdev is also a representative netdev would result in incorrect dereferencing. Fixes: `1a1e586f54` ("nfp: add basic action capabilities to flower offloads") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-26 10:07:14 +09:00
Jakub Kicinski	9f16c8abcd	nfp: bpf: optimize mov64 a little Loading 64bit constants require up to 4 load immediates, since we can only load 16 bits at a time. If the 32bit halves of the 64bit constant are the same, however, we can save a cycle by doing a register move instead of two loads of 16 bits. Note that we don't optimize the normal ALU64 load because even though it's a 64 bit load the upper half of the register is a coming from sign extension so we can load it in one cycle anyway. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-24 17:38:37 +09:00
Jakub Kicinski	b14157eeed	nfp: bpf: support stack accesses via non-constant pointers If stack pointer has a different value on different paths but the alignment to words (4B) remains the same, we can set a new LMEM access pointer to the calculated value and access whichever word it's pointing to. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-24 17:38:37 +09:00
Jakub Kicinski	2df03a50f1	nfp: bpf: support accessing the stack beyond 64 bytes To access beyond 64th byte of the stack we need to set a new stack pointer register (LMEM is accessed indirectly through those pointers). Add a function for encoding local CSR access instruction. Use stack pointer number 3. Note that stack pointer registers allow us to index into 32 bytes of LMEM (with shift operations i.e. when operands are restricted). This means if access is crossing 32 byte boundary we must not use offsetting, we have to set the pointer to the exact address and move it with post-increments. We depend on the datapath placing the stack base address in GPR A22 for our use. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-24 17:38:37 +09:00
Jakub Kicinski	d348848063	nfp: bpf: allow stack accesses via modified stack registers As long as the verifier tells us the stack offset exactly we can render the LMEM reads quite easily. Simply make sure that the offset is constant for a given instruction and add it to the instruction's offset. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-24 17:38:37 +09:00
Jakub Kicinski	9a90c83c09	nfp: bpf: optimize the RMW for stack accesses When we are performing unaligned stack accesses in the 32-64B window we have to do a read-modify-write cycle. E.g. for reading 8 bytes from address 17: 0: tmp = stack[16] 1: gprLo = tmp >> 8 2: tmp = stack[20] 3: gprLo \|= tmp << 24 4: tmp = stack[20] 5: gprHi = tmp >> 8 6: tmp = stack[24] 7: gprHi \|= tmp << 24 The load on line 4 is unnecessary, because tmp already contains data from stack[20]. For write we can optimize both loads and writebacks away. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-24 17:38:37 +09:00
Jakub Kicinski	a82b23fb38	nfp: bpf: add stack read support Add simple stack read support, similar to write in every aspect, but data flowing the other way. Note that unlike write which can be done in smaller than word quantities, if registers are loaded with less-than-word of stack contents - the values have to be zero extended. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-24 17:38:37 +09:00
Jakub Kicinski	ee9133a845	nfp: bpf: add stack write support Stack is implemented by the LMEM register file. Unaligned accesses to LMEM are not allowed. Accesses also have to be 4B wide. To support stack we need to make sure offsets of pointers are known at translation time (for now) and perform correct load/mask/shift operations. Since we can access first 64B of LMEM without much effort support only stacks not bigger than 64B. Following commits will extend the possible sizes beyond that. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-24 17:38:37 +09:00
Jakub Kicinski	70c78fc138	nfp: bpf: refactor nfp_bpf_check_ptr() nfp_bpf_check_ptr() mostly looks at the pointer register. Add a temporary variable to shorten the code. While at it make sure we print error messages if translation fails to help users identify the problem (to be carried in ext_ack in due course). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-24 17:38:37 +09:00
Jakub Kicinski	ff42bb9fe3	nfp: bpf: add helper for emitting nops The need to emitting a few nops will become more common soon as we add stack and map support. Add a helper. This allows for code to be shorter but also may be handy for marking the nops with a "reason" to ease applying optimizations. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-24 17:38:37 +09:00
David S. Miller	f8ddadc4db	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net There were quite a few overlapping sets of changes here. Daniel's bug fix for off-by-ones in the new BPF branch instructions, along with the added allowances for "data_end > ptr + x" forms collided with the metadata additions. Along with those three changes came veritifer test cases, which in their final form I tried to group together properly. If I had just trimmed GIT's conflict tags as-is, this would have split up the meta tests unnecessarily. In the socketmap code, a set of preemption disabling changes overlapped with the rename of bpf_compute_data_end() to bpf_compute_data_pointers(). Changes were made to the mv88e6060.c driver set addr method which got removed in net-next. The hyperv transport socket layer had a locking change in 'net' which overlapped with a change of socket state macro usage in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-22 13:39:14 +01:00
Pieter Jansen van Vuuren	62d3f60b4d	nfp: use struct fields for 8 bit-wide access Use direct access struct fields rather than PREP_FIELD() macros to manipulate the jump ID and length, both of which are exactly 8-bits wide. This simplifies the code somewhat. Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-22 03:09:32 +01:00
Jiri Pirko	8d26d5636d	net: sched: avoid ndo_setup_tc calls for TC_SETUP_CLS* All drivers are converted to use block callbacks for TC_SETUP_CLS. So it is now safe to remove the calls to ndo_setup_tc from cls_ Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-21 03:04:08 +01:00
Jiri Pirko	90d97315b3	nfp: bpf: Convert ndo_setup_tc offloads to block callbacks Benefit from the newly introduced block callback infrastructure and convert ndo_setup_tc calls for bpf offloads to block callbacks. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-21 03:04:08 +01:00
Jiri Pirko	363fc53b8b	nfp: flower: Convert ndo_setup_tc offloads to block callbacks Benefit from the newly introduced block callback infrastructure and convert ndo_setup_tc calls for flower offloads to block callbacks. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-21 03:04:08 +01:00
Mark Brown	4c7787ba3a	nfp: Explicitly include linux/bug.h Today's -next build encountered an error due to a missing definition of WARN_ON(), caused by some header reorganization removing an implicit inclusion of linux/bug.h. Fix this with an explicit inclusion. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:31:41 -07:00
Jakub Kicinski	bfddbc8adc	nfp: bpf: support direct packet access in TC Add support for direct packet access in TC, note that because writing the packet will cause the verifier to generate a csum fixup prologue we won't be able to offload packet writes from TC, just yet, only the reads will work. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:28 -07:00
Jakub Kicinski	e663fe3863	nfp: bpf: direct packet access - write This patch adds ability to write packet contents using pre-validated packet pointers (direct packet access). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:28 -07:00
Jakub Kicinski	2ca71441f5	nfp: bpf: add support for direct packet access - read In direct packet access bound checks are already done, we can simply dereference the packet pointer. Verifier/parser logic needs to record pointer type. Note that although verifier does protect us from CTX vs other pointer changes we will also want to differentiate between PACKET vs MAP_VALUE or STACK, so we can add the check already. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:28 -07:00
Jakub Kicinski	0a7939775f	nfp: bpf: separate I/O from checks for legacy data load Move data load into a separate function and separate it from packet length checks of legacy I/O. This makes the code more readable and easier to reuse. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:28 -07:00
Jakub Kicinski	943c57b97c	nfp: bpf: fix context accesses Sizes of fields in struct xdp_md/xdp_buff and some in sk_buff depend on target architecture. Take that into account and use struct xdp_buff, not struct xdp_md. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:28 -07:00
Jakub Kicinski	0f6cf4ddf6	nfp: bpf: support BPF offload only on little endian eBPF is host-endian specific. Translating both BE and LE eBPF to the NFP is feasible, but would require quite a bit of indirection. The fact that I don't have access to any BE hosts that would fit a 25G/40G/100G NIC is also limiting my ability to test big endian. For now restrict the offload to little endian hosts only. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:28 -07:00
Jakub Kicinski	3119d1fd46	nfp: bpf: implement byte swap instruction Implement byte swaps with rotations, shifts and byte loads. Remember to clear upper parts of the 64 bit registers. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:28 -07:00
Jakub Kicinski	c000dfb5e2	nfp: bpf: add mov helper Register move operation is encoded as alu no op. This means that one has to specify number of unused/none parameters to the emit_alu(). Add a helper. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:28 -07:00
Jakub Kicinski	26fa818dc0	nfp: bpf: fix compare instructions Now that we have BPF assemebler support in LLVM 6 we can easily test all compare instructions (LLVM 4 didn't generate most of them from C). Fix the compare to immediates and refactor the order of compare to regs to make sure they both follow the same pattern. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:28 -07:00
Jakub Kicinski	8283737065	nfp: bpf: add missing return in jne_imm optimization We optimize comparisons to immediate 0 as if (reg.lo \| reg.hi). The early return statement was missing, however, which means we would generate two comparisons - optimized one followed by a normal 2x 32 bit compare. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:28 -07:00
Jakub Kicinski	bc8c80a8c9	nfp: bpf: reorder arguments to emit_ld_field_any() ld_field instruction has the following format in NFP assembler: ld_field[dst, 1000, src, <<24] reoder parameters to emit_ld_field_any() to make it closer to the familiar assembler order. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 11:13:27 -07:00
Jakub Kicinski	5f0ca2fb71	nfp: handle page allocation failures page_address() does not handle NULL argument gracefully, make sure we NULL-check the page pointer before passing it to page_address(). Fixes: `ecd63a0217` ("nfp: add XDP support in the driver") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-10 13:18:33 -07:00
Jakub Kicinski	c3d64ad4fe	nfp: fix ethtool stats gather retry The while loop fetching 64 bit ethtool statistics may have to retry multiple times, it shouldn't modify the outside state. Fixes: `4c3523623d` ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-10 13:18:33 -07:00
Jakub Kicinski	2de1be1db2	nfp: bpf: pass dst register to ld_field instruction ld_field instruction is a bit special because the encoding uses two source registers and one of them becomes the output. We do need to pass the dst register to our encoding helpers though, otherwise the "write both banks" flag will not be observed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	2e85d3884f	nfp: bpf: byte swap the instructions Device expects the instructions in little endian. Make sure we byte swap on big endian hosts. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	1c03e03f9b	nfp: bpf: pad code with valid nops We need to append up to 8 nops after last instruction to make sure the CPU will not fetch garbage instructions with invalid ECC if the code store was not initialized. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	fd068ddc88	nfp: bpf: calculate code store ECC In the initial PoC firmware I simply disabled ECC on the instruction store. Do the ECC calculation for generated instructions in the driver. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	18e53b6cb9	nfp: bpf: move to datapath ABI version 2 Datapath ABI version 2 stores the packet information in LMEM instead of NNRs. We also have strict restrictions on which GPRs we can use. Only GPRs 0-23 are reserved for BPF. Adjust the static register locations and "ABI" registers. Note that packet length is packed with other info so we have to extract it into one of the scratch registers, OTOH since LMEM can be used in restricted operands we don't have to extract packet pointer. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	995e101ffa	nfp: bpf: encode extended LM pointer operands Most instructions have special fields which allow switching between base and extended Local Memory pointers. Introduce those to register encoding, we will use the extra LM pointers to access high addresses of the stack. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	9f15d0f438	nfp: bpf: encode LMEM accesses NFP LMEM is a large, indirectly accessed register file. There are two basic indirect access registers. Each access operation may either use offset (up to 8 or 16 words) or perform post decrement/increment. Add encodings of LMEM indexes as instruction operands. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	8afd9c961e	nfp: add more white space to the instruction defines We need to add longer OP_* defines, move the values away. Purely whitespace commit. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	509144e250	nfp: bpf: remove packet marking support Temporarily drop support for skb->mark. We are primarily focusing on XDP offload, and implementing skb->mark on the new datapath has lower priority. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	226e0e94ce	nfp: bpf: remove register rename Remove the register renumbering optimization. To implement calling map and other helpers we need more strict register layout. We can't freely reassign register numbers. This will have the effect of running in 4 context/thread mode, which should be OK since we are moving towards integrating the BPF closer with FW app datapath anyway, and the target datapath itself runs in 4 context mode. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	3cae131933	nfp: bpf: encode all 64bit shifts Add encodings of all 64bit shift operations. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	2a15bb1aba	nfp: bpf: move software reg helpers and cmd table out of translator Move the software reg helpers and some static data to nfp_asm.c. They are related to the previous patch, but move is done in a separate commit for ease of review. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	b3f868df3c	nfp: bpf: use the power of sparse to check we encode registers right Define a new __bitwise type for software representation of registers. This will allow us to catch incorrect parameter types using sparse. Accessors we define also allow us to return correct enum type and therefore ensure all switches handle all register types. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	a52b35c39e	nfp: bpf: lift the single-port limitation Limiting the eBPF offload to a single port was a workaround required for the PoC application FW which has not been released externally. It's not necessary any more. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	3a4b0129bf	nfp: output control messages to trace_devlink_hwmsg() Use standard devlink trace point to allow tracing of control messages. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00

... 10 11 12 13 14 ...

1503 Commits