Commit Graph

1361 Commits

Author SHA1 Message Date
David S. Miller
aa2eaa8c27 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor overlapping changes in the btusb and ixgbe drivers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15 14:17:27 +02:00
Dirk van der Merwe
421bceb270 nfp: read chip model from the PluDevice register
The PluDevice register provides the authoritative chip model/revision.

Since the model number is purely used for reporting purposes, follow
the hardware team convention of subtracting 0x10 from the PluDevice
register to obtain the chip model/revision number.
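
The convention described above is simple arithmetic. A minimal sketch, assuming the raw register value is already available as an integer; the function name and sample values are hypothetical, and the register layout is not given in this log:

```python
# Offset named in the commit message above.
PLU_DEVICE_MODEL_OFFSET = 0x10


def chip_model_from_plu_device(plu_device: int) -> int:
    """Return the model/revision number reported for a raw PluDevice value.

    Hypothetical helper: it only applies the 'subtract 0x10' convention and
    does not model any bit fields of the real register.
    """
    return plu_device - PLU_DEVICE_MODEL_OFFSET
```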

Suggested-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-12 00:01:00 +01:00
Dirk van der Merwe
44798eceea nfp: devlink: set unknown fw_load_policy
If the 'app_fw_from_flash' HWinfo key is invalid, set the
'fw_load_policy' devlink parameter value to unknown.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11 15:10:05 +01:00
Dirk van der Merwe
8fb822ce93 kdoc: fix nfp_fw_load documentation
Fixed the incorrect prefix for the 'nfp_fw_load' function.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10 17:29:27 +01:00
Dirk van der Merwe
0fbee0ec1f nfp: devlink: add 'reset_dev_on_drv_probe' support
Add support for the 'reset_dev_on_drv_probe' devlink parameter. The
reset control policy is controlled by the 'abi_drv_reset' hwinfo key.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10 17:29:27 +01:00
Dirk van der Merwe
ff04788c5b nfp: devlink: add 'fw_load_policy' support
Add support for the 'fw_load_policy' devlink parameter. The FW load
policy is controlled by the 'app_fw_from_flash' hwinfo key.

Remap the values from devlink to the hwinfo key and back.
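
The remapping can be pictured as a pair of lookup tables. This is a sketch only: the string names stand in for the real numeric encodings, which are not shown in this log, and pairing the hwinfo 'Preferred' value with a devlink 'driver' value is an assumption. Falling back to 'unknown' for an invalid key follows the later fw_load_policy commit in this list.

```python
# Hypothetical symbolic values; the real encodings live in kernel headers.
HWINFO_TO_DEVLINK = {
    "disk": "disk",
    "flash": "flash",
    "preferred": "driver",  # assumed pairing, not stated in the log
}
DEVLINK_TO_HWINFO = {dl: hi for hi, dl in HWINFO_TO_DEVLINK.items()}


def hwinfo_to_devlink(value: str) -> str:
    """Map a hwinfo 'app_fw_from_flash' value to a devlink policy value."""
    return HWINFO_TO_DEVLINK.get(value, "unknown")


def devlink_to_hwinfo(value: str) -> str:
    """Map a devlink policy value back to hwinfo; reject unmapped values."""
    if value not in DEVLINK_TO_HWINFO:
        raise ValueError(f"unsupported fw_load_policy value: {value}")
    return DEVLINK_TO_HWINFO[value]
```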

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10 17:29:27 +01:00
Dirk van der Merwe
165c3c9f8c nfp: add devlink param infrastructure
Register devlink parameters for driver use. Subsequent patches will add
support for specific parameters.

In order to support devlink parameters, the management firmware needs to
be able to look up and set hwinfo keys.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10 17:29:27 +01:00
Dirk van der Merwe
f8921d7330 nfp: honor FW reset and loading policies
The firmware reset and loading policies can be controlled with the
combination of three hwinfo keys, 'abi_drv_reset', 'abi_drv_load_ifc'
and 'app_fw_from_flash'.

'app_fw_from_flash' defines which firmware should take precedence,
'Disk', 'Flash' or the 'Preferred' firmware. When 'Preferred'
is selected, the management firmware makes the decision on which
firmware will be loaded by comparing versions of the flash firmware
and the host supplied firmware.

'abi_drv_reset' defines when the driver should reset the firmware when
the driver is probed, either 'Disk' if firmware was found on disk,
'Always' reset or 'Never' reset. Note that the device is always reset
on driver unload if firmware was loaded when the driver was probed.

'abi_drv_load_ifc' defines a list of PF devices allowed to load FW on
the device.

Furthermore, the driver now unloads firmware on removal only when the
firmware was loaded by the driver and only if this particular device was
the only one that could have loaded firmware. This is needed to avoid
firmware being removed while in use on multi-host platforms.
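
The reset and unload rules described above can be summarised as two small predicates. A minimal sketch, assuming the hwinfo keys have already been parsed; the enum and function names are hypothetical and the 'abi_drv_load_ifc' list is reduced to a single boolean:

```python
from enum import Enum


class ResetPolicy(Enum):
    """Values of the 'abi_drv_reset' key as named in the commit message."""
    DISK = "disk"
    ALWAYS = "always"
    NEVER = "never"


def should_reset_on_probe(policy: ResetPolicy, fw_found_on_disk: bool) -> bool:
    """Decide whether the driver resets the firmware when it is probed."""
    if policy is ResetPolicy.ALWAYS:
        return True
    if policy is ResetPolicy.NEVER:
        return False
    # ResetPolicy.DISK: reset only if firmware was found on disk.
    return fw_found_on_disk


def should_unload_on_remove(fw_loaded_by_driver: bool, sole_loader: bool) -> bool:
    """Unload only if this driver loaded the firmware and no other device
    could have loaded it (the multi-host safeguard)."""
    return fw_loaded_by_driver and sole_loader
```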

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10 17:29:27 +01:00
Dirk van der Merwe
e69e9db903 nfp: nsp: add support for hwinfo set operation
Add support for the NSP HWinfo set command. This closely follows the
HWinfo lookup command.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10 17:29:26 +01:00
Dirk van der Merwe
74612cdaf5 nfp: nsp: add support for optional hwinfo lookup
There are cases where we want to read a hwinfo entry from the NFP, and
if it doesn't exist, use a default value instead.

To support this, we must silence warning/error messages when the hwinfo
entry doesn't exist since this is a valid use case. The NSP command
structure provides the ability to silence command errors, in which case
the caller should log any command errors appropriately. Protocol errors
are unaffected by this.
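
The optional lookup pattern can be sketched as follows. All names here are hypothetical stand-ins for the driver's NSP helpers: a command error (for example, a missing key) is swallowed and replaced by a default, with logging left to the caller, while a protocol error still propagates.

```python
class CommandError(Exception):
    """The device processed the command but reported a failure."""


class ProtocolError(Exception):
    """The exchange with the device itself failed; never silenced."""


def hwinfo_lookup_optional(lookup, key, default, log=None):
    """Return lookup(key), or default if the command fails.

    The caller decides how to log the silenced command error.
    """
    try:
        return lookup(key)
    except CommandError as err:
        if log is not None:
            log(f"hwinfo key {key!r} not available, using default: {err}")
        return default


def make_fake_lookup(table):
    """Build a stand-in for the device lookup, for demonstration only."""
    def lookup(key):
        if key not in table:
            raise CommandError(f"no such key: {key}")
        return table[key]
    return lookup
```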

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10 17:29:26 +01:00
Dirk van der Merwe
1da16f0c84 nfp: nsp: add support for fw_loaded command
Add support for the simple command that indicates whether application
firmware is loaded.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10 17:29:26 +01:00
Fred Lotter
28abe57962 nfp: flower: cmsg rtnl locks can timeout reify messages
Flower control message replies are handled in different locations. The truly
high priority replies are handled in the BH (tasklet) context, while the
remaining replies are handled in a predefined Linux work queue. The work
queue handler orders replies into high and low priority groups, and always
starts servicing the high priority replies within the received batch first.

Reply Type:			Rtnl Lock:	Handler:

CMSG_TYPE_PORT_MOD		no		BH tasklet (mtu)
CMSG_TYPE_TUN_NEIGH		no		BH tasklet
CMSG_TYPE_FLOW_STATS		no		BH tasklet
CMSG_TYPE_PORT_REIFY		no		WQ high
CMSG_TYPE_PORT_MOD		yes		WQ high (link/mtu)
CMSG_TYPE_MERGE_HINT		yes		WQ low
CMSG_TYPE_NO_NEIGH		no		WQ low
CMSG_TYPE_ACTIVE_TUNS		no		WQ low
CMSG_TYPE_QOS_STATS		no		WQ low
CMSG_TYPE_LAG_CONFIG		no		WQ low

A subset of control messages can block waiting for an rtnl lock (from both
work queue priority groups). The rtnl lock is heavily contended for by
external processes such as systemd-udevd, systemd-network and libvirtd,
especially during netdev creation, such as when flower VFs and representors
are instantiated.

Kernel netlink instrumentation shows that external processes (such as
systemd-udevd) often use successive rtnl_trylock() sequences, which can cause
a control message blocked in rtnl_lock() to starve for longer periods of time
during rtnl lock contention, i.e. netdev creation.

In the current design a single blocked control message will block the entire
work queue (both priorities), and introduce a latency which is
nondeterministic and dependent on system wide rtnl lock usage.

In some extreme cases, one blocked control message at exactly the wrong time,
just before the maximum number of VFs are instantiated, can block the work
queue for long enough to prevent VF representor REIFY replies from getting
handled in time for the 40ms timeout.

The firmware will deliver the total maximum number of REIFY message replies in
around 300us.

Only REIFY and MTU update messages require replies within a timeout period (of
40ms). The MTU-only updates are already done directly in the BH (tasklet)
handler.

Move the REIFY handler down into the BH (tasklet) in order to resolve timeouts
caused by a blocked work queue waiting on rtnl locks.
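
The routing in the table above, with REIFY moved into the BH handler as this patch describes, can be modelled as a small dispatch function. This is an illustrative sketch: the message type names come from the table, while the function and constant names are hypothetical.

```python
BH_TASKLET = "BH tasklet"
WQ_HIGH = "WQ high"
WQ_LOW = "WQ low"

# Replies always handled in the BH context after this change.
BH_TYPES = {
    "CMSG_TYPE_TUN_NEIGH",
    "CMSG_TYPE_FLOW_STATS",
    "CMSG_TYPE_PORT_REIFY",  # moved down from the work queue by this patch
}

# Replies handled in the work queue, low priority group.
WQ_LOW_TYPES = {
    "CMSG_TYPE_MERGE_HINT",
    "CMSG_TYPE_NO_NEIGH",
    "CMSG_TYPE_ACTIVE_TUNS",
    "CMSG_TYPE_QOS_STATS",
    "CMSG_TYPE_LAG_CONFIG",
}


def handler_for(msg_type: str, mtu_only: bool = False) -> str:
    """Return where a reply of the given type is serviced."""
    if msg_type in BH_TYPES:
        return BH_TASKLET
    if msg_type == "CMSG_TYPE_PORT_MOD":
        # MTU-only updates need no rtnl lock; link changes do.
        return BH_TASKLET if mtu_only else WQ_HIGH
    if msg_type in WQ_LOW_TYPES:
        return WQ_LOW
    raise ValueError(f"unknown control message type: {msg_type}")
```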

Signed-off-by: Fred Lotter <frederik.lotter@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:05:50 +02:00
David S. Miller
1e46c09ec1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add the ability to use unaligned chunks in the AF_XDP umem. Relaxing
   where the chunks can be placed makes it possible to use an
   arbitrary buffer size and place it wherever there is a free
   address in the umem. Helps more seamless DPDK AF_XDP driver
   integration. Support for i40e, ixgbe and mlx5e, from Kevin and
   Maxim.

2) Addition of a wakeup flag for AF_XDP tx and fill rings so the
   application can wake up the kernel for rx/tx processing which
   avoids busy-spinning of the latter, useful when app and driver
   are located on the same core. Support for i40e, ixgbe and mlx5e,
   from Magnus and Maxim.

3) bpftool fixes for printf()-like functions so compiler can actually
   enforce checks, bpftool build system improvements for custom output
   directories, and addition of 'bpftool map freeze' command, from Quentin.

4) Support attaching/detaching XDP programs from 'bpftool net' command,
   from Daniel.

5) Automatic xskmap cleanup when AF_XDP socket is released, and several
   barrier/{read,write}_once fixes in AF_XDP code, from Björn.

6) Relicense of bpf_helpers.h/bpf_endian.h for future libbpf
   inclusion as well as libbpf versioning improvements, from Andrii.

7) Several new BPF kselftests for verifier precision tracking, from Alexei.

8) Several BPF kselftest fixes wrt endianness to run on s390x, from Ilya.

9) And more BPF kselftest improvements all over the place, from Stanislav.

10) Add simple BPF map op cache for nfp driver to batch dumps, from Jakub.

11) AF_XDP socket umem mapping improvements for 32bit archs, from Ivan.

12) Add BPF-to-BPF call and BTF line info support for s390x JIT, from Yauheni.

13) Small optimization in arm64 JIT to spare 1 insn for BPF_MOD, from Jerin.

14) Fix an error check in bpf_tcp_gen_syncookie() helper, from Petar.

15) Various minor fixes and cleanups, from Nathan, Masahiro, Masanari,
    Peter, Wei, Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 16:49:17 +02:00
zhong jiang
47e2527769 nfp: Drop unnecessary continue in nfp_net_pf_alloc_vnics
Continue is not needed at the bottom of a loop.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 14:58:21 +02:00
David S. Miller
765b7590c9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
r8152 conflicts are the NAPI fixes in 'net' overlapping with
some tasklet stuff in net-next

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-02 11:20:17 -07:00
David S. Miller
94880a5b2e Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-08-31

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix 32-bit zero-extension during constant blinding which
   has been causing a regression on ppc64, from Naveen.

2) Fix a latency bug in nfp driver when updating stack index
   register, from Jiong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 17:39:37 -07:00
Jakub Kicinski
f24e29099f nfp: bpf: add simple map op cache
Each get_next and lookup call requires a round trip to the device.
However, the device is capable of giving us a few entries back,
instead of just one.

In this patch we ask for a small yet reasonable number of entries
(4) on every get_next call, and on subsequent get_next/lookup calls
check this little cache for a hit. The cache is only kept for 250us,
and is invalidated on every operation which may modify the map
(e.g. delete or update call). Note that operations may be performed
simultaneously, so we have to keep track of operations in flight.
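
The caching scheme can be illustrated with a toy model. This is a simplified, single-threaded sketch under stated assumptions: the class names, the device interface (`get_next_batch`, `lookup`) and the `begin_modify`/`end_modify` hooks are all hypothetical, and real in-flight tracking across concurrent requests is reduced to a counter.

```python
import time


class FakeDevice:
    """Stand-in for the device; counts round trips."""

    def __init__(self, items):
        self.items = list(items)  # ordered (key, value) pairs
        self.round_trips = 0

    def get_next_batch(self, prev_key, count):
        self.round_trips += 1
        keys = [k for k, _ in self.items]
        start = 0 if prev_key is None else keys.index(prev_key) + 1
        return self.items[start:start + count]

    def lookup(self, key):
        self.round_trips += 1
        return dict(self.items)[key]


class MapOpCache:
    """Short-lived cache of the entries returned by the last get_next."""

    def __init__(self, device, ttl_s=250e-6, batch=4, clock=time.monotonic):
        self._dev = device
        self._ttl_s = ttl_s
        self._batch = batch
        self._clock = clock
        self._entries = []      # cached (key, value) pairs in device order
        self._prev_key = None   # key that the cached batch follows
        self._expires = 0.0
        self._valid = False
        self._modifying = 0     # modifying operations currently in flight

    def _fresh(self):
        return self._valid and self._clock() < self._expires

    def begin_modify(self):
        """Call before an update or delete: drop the cache."""
        self._modifying += 1
        self._valid = False

    def end_modify(self):
        self._modifying -= 1

    def get_next(self, prev_key):
        if self._fresh():
            keys = [self._prev_key] + [k for k, _ in self._entries]
            if prev_key in keys:
                idx = keys.index(prev_key)
                if idx < len(self._entries):
                    return self._entries[idx][0]  # cache hit
        entries = self._dev.get_next_batch(prev_key, self._batch)
        if self._modifying == 0:  # do not cache while a modification is in flight
            self._entries, self._prev_key = entries, prev_key
            self._expires = self._clock() + self._ttl_s
            self._valid = True
        return entries[0][0] if entries else None

    def lookup(self, key):
        if self._fresh():
            for cached_key, value in self._entries:
                if cached_key == key:
                    return value  # cache hit
        return self._dev.lookup(key)
```

With a batch of 4, a dump that walks the map with get_next followed by lookup can be served by roughly one round trip per four entries, as long as the walk stays within the 250us window and no modifying operation intervenes.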

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 00:49:05 +02:00
Jakub Kicinski
bc2796db5a nfp: bpf: rework MTU checking
If control channel MTU is too low to support map operations a warning
will be printed. This is not enough: we want to make sure probe fails
in such a scenario, as this would clearly be a faulty configuration.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 00:49:05 +02:00
John Hurley
e8024cb483 nfp: flower: handle neighbour events on internal ports
Recent code changes to NFP allowed the offload of neighbour entries to FW
when the next hop device was an internal port. This allows for offload of
tunnel encap when the end-point IP address is applied to such a port.

Unfortunately, the neighbour event handler still rejects events that are
not associated with a repr dev and so the firmware neighbour table may get
out of sync for internal ports.

Fix this by allowing internal port neighbour events to be correctly
processed.

Fixes: 45756dfeda ("nfp: flower: allow tunnels to output to internal port")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-28 16:06:49 -07:00
John Hurley
739d7c5752 nfp: flower: prevent ingress block binds on internal ports
Internal port TC offload is implemented through user-space applications
(such as OvS) by adding filters at egress via TC clsact qdiscs. Indirect
block offload support in the NFP driver accepts both ingress qdisc binds
and egress binds if the device is an internal port. However, clsact sends
bind notification for both ingress and egress block binds which can lead
to the driver registering multiple callbacks and receiving multiple
notifications of new filters.

Fix this by rejecting ingress block bind callbacks when the port is
internal and only adding filter callbacks for egress binds.

Fixes: 4d12ba4278 ("nfp: flower: allow offloading of matches on 'internal' ports")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-28 16:06:49 -07:00
David S. Miller
68aaf44595 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor conflict in r8169, bug fix had two versions in net
and net-next, take the net-next hunks.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27 14:23:31 -07:00
Jakub Kicinski
d00ee466a0 nfp: add AMDA0058 boards to firmware list
Add MODULE_FIRMWARE entries for AMDA0058 boards.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26 17:13:35 -07:00
Jiong Wang
86c28b2d69 nfp: bpf: fix latency bug when updating stack index register
NFP is using Local Memory to model stack. LM_addr could be used as base of
a 16 32-bit word region of Local Memory. Then, if the stack offset is
beyond the current region, the local index needs to be updated. The update
needs at least three cycles to take effect, therefore the sequence normally
looks like:

  local_csr_wr[ActLMAddr3, gprB_5]
  nop
  nop
  nop

If the local index switch happens on a narrow load, then the instruction
preparing value to zero high 32-bit of the destination register could be
counted as one cycle, the sequence then could be something like:

  local_csr_wr[ActLMAddr3, gprB_5]
  nop
  nop
  immed[gprB_5, 0]

However, the zero extension optimization can eliminate the zeroing of the
high 32 bits, in which case the IMMED insn above is not available and the
first sequence needs to be generated.
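
The padding rule can be sketched as a small helper that always leaves three instructions after the CSR write, using nops for whatever useful instructions are missing. This is an illustration only; the helper name and the string form of the instructions are hypothetical and not the JIT's actual code.

```python
LM_UPDATE_LATENCY = 3  # instructions needed before the new index takes effect


def lm_index_switch_sequence(csr_write, filler_insns=()):
    """Return csr_write followed by exactly three instructions.

    filler_insns are useful instructions (such as the zero-extension IMMED)
    that may occupy the tail of the latency window; nops fill the rest.
    """
    fillers = list(filler_insns)[:LM_UPDATE_LATENCY]
    nops = ["nop"] * (LM_UPDATE_LATENCY - len(fillers))
    return [csr_write] + nops + fillers
```

When the zero-extension instruction has been optimized away, the caller passes no fillers and gets the three-nop form.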

Fixes: 0b4de1ff19 ("nfp: bpf: eliminate zero extension code-gen")
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-26 23:03:05 +02:00
Masahiro Yamada
cbdf59ad65 treewide: remove dummy Makefiles for single targets
Now that the single target build descends into sub-directories in the
same way as the normal build, these dummy Makefiles are not needed
any more.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-08-21 21:05:21 +09:00
Vlad Buslov
1edfb8ed6c nfp: flower: verify that block cb is not busy before binding
When processing FLOW_BLOCK_BIND command on indirect block, check that flow
block cb is not busy.

Fixes: 0d4fd02e71 ("net: flow_offload: add flow_block_cb_is_busy() and use it")
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-19 18:16:23 -07:00
David S. Miller
446bf64b61 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge conflict of mlx5 resolved using instructions in merge
commit 9566e650bf.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-19 11:54:03 -07:00
Pablo Neira Ayuso
ef01adae0e net: sched: use major priority number as hardware priority
tc transparently maps the software priority number to hardware. Update
it to pass the major priority which is what most drivers expect. Update
drivers too so they do not need to lshift the priority field of the
flow_cls_common_offload object. The stmmac driver is an exception, since
this code assumes the tc software priority is fine, therefore, lshift it
just to be conservative.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18 14:13:23 -07:00
Greg Kroah-Hartman
16e9b481e9 nfp: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Edwin Peer <edwin.peer@netronome.com>
Cc: Yangtao Li <tiny.windzz@gmail.com>
Cc: Simon Horman <simon.horman@netronome.com>
Cc: oss-drivers@netronome.com
Cc: netdev@vger.kernel.org
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-10 15:25:48 -07:00
wenxu
4e481908c5 flow_offload: move tc indirect block to flow offload
Move tc indirect block to flow_offload and rename
it to flow indirect block. The nf_tables can use the
indr block architecture.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08 18:44:30 -07:00
David S. Miller
13dfb3fa49 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Just minor overlapping changes in the conflicts here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 18:44:57 -07:00
John Hurley
2e0bc7f3cb nfp: flower: encode mac indexes with pre-tunnel rule check
When a tunnel packet arrives on the NFP card, its destination MAC is
looked up and MAC index returned for it. This index can help verify the
tunnel by, for example, ensuring that the packet arrived on the expected
port. If the packet is destined for a known MAC that is not connected to a
given physical port then the mac index can have a global value (e.g. when
a series of bonded ports share the same MAC).

If the packet is to be detunneled at a bridge device or internal port like
an Open vSwitch VLAN port, then it should first match a 'pre-tunnel' rule
to direct it to that internal port.

Use the MAC index to indicate if a packet should match a pre-tunnel rule
before decap is allowed. Do this by tracking the number of internal ports
associated with a MAC address and, if the number is >0, set a bit in the
mac_index to forward the packet to the pre-tunnel table before continuing
with decap.
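
The counting scheme can be sketched as follows. The bit position, class name and method names are hypothetical; only the idea of counting internal ports per MAC and flagging the index when the count is above zero comes from the text above.

```python
from collections import Counter

# Hypothetical flag position; the real encoding is not given in this log.
PRE_TUN_CHECK_BIT = 1 << 15


class MacIndexEncoder:
    """Track internal ports per MAC and flag indexes that need a pre-tunnel check."""

    def __init__(self):
        self._internal_ports = Counter()  # MAC address -> internal port count

    def add_internal_port(self, mac: str) -> None:
        self._internal_ports[mac] += 1

    def remove_internal_port(self, mac: str) -> None:
        if self._internal_ports[mac] > 0:
            self._internal_ports[mac] -= 1

    def encode(self, mac: str, base_index: int) -> int:
        """Set the check bit if any internal port uses this MAC."""
        if self._internal_ports[mac] > 0:
            return base_index | PRE_TUN_CHECK_BIT
        return base_index
```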

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 14:24:22 -07:00
John Hurley
09aa811bb7 nfp: flower: remove offloaded MACs when reprs are applied to OvS bridges
MAC addresses along with an identifying index are offloaded to firmware to
allow tunnel decapsulation. If a tunnel packet arrives with a matching
destination MAC address and a verified index, it can continue on the
decapsulation process. This replicates the MAC verifications carried out
in the kernel network stack.

When a netdev is added to a bridge (e.g. OvS) then packets arriving on
that dev are directed through the bridge datapath instead of passing
through the network stack. Therefore, tunnelled packets matching the MAC
of that dev will not be decapped here.

Replicate this behaviour on firmware by removing offloaded MAC addresses
when a MAC representer is added to an OvS bridge. This can prevent any
false positive tunnel decaps.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 14:24:22 -07:00
John Hurley
f12725d98c nfp: flower: offload pre-tunnel rules
Pre-tunnel rules are TC flower and OvS rules that forward a packet to the
tunnel end point where it can then pass through the network stack and be
decapsulated. These are required if the tunnel end point is, say, an OvS
internal port.

Currently, firmware determines that a packet is in a tunnel and decaps it
if it has a known destination IP and MAC address. However, this bypasses
the flower pre-tunnel rule and so does not update the stats. Further to
this it ignores VLANs that may exist outside of the tunnel header.

Offload pre-tunnel rules to the NFP. This embeds the pre-tunnel rule into
the tunnel decap process based on (firmware) mac index and VLAN. This
means that decap can be carried out correctly with VLANs and that stats
can be updated for all kernel rules correctly.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 14:24:22 -07:00
John Hurley
120ffd84a9 nfp: flower: verify pre-tunnel rules
Pre-tunnel rules must direct packets to an internal port based on L2
information. Rules that egress to an internal port are already indicated
by a non-NULL device in its nfp_fl_payload struct. Verify the rest of the
match fields indicate that the rule is a pre-tunnel rule. This requires a
full match on the destination MAC address, an optional VLAN field, and no
specific matches on other lower layer fields (with the exception of L4
proto and flags).

If a rule is identified as a pre-tunnel rule then mark it for offload to
the pre-tunnel table. Similarly, remove it from the pre-tunnel table on
rule deletion. The actual offloading of these commands is left to a
following patch.
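
The verification criteria can be expressed as a predicate over the rule's match masks. This is a sketch under assumptions: the field names and the dictionary representation are hypothetical, and the real driver inspects firmware match structures rather than a dict.

```python
FULL_MAC_MASK = bytes([0xFF] * 6)

# Fields a pre-tunnel rule may match on, per the description above
# (names are hypothetical).
ALLOWED_FIELDS = {"dst_mac", "vlan", "ip_proto", "ip_flags"}


def is_pre_tunnel_rule(match_masks, egress_to_internal_port):
    """Return True if the rule qualifies for the pre-tunnel table.

    match_masks maps a field name to its mask bytes; an all-zero mask
    means the field is wildcarded.
    """
    if not egress_to_internal_port:
        return False
    if match_masks.get("dst_mac") != FULL_MAC_MASK:
        return False  # destination MAC must be fully matched
    return all(
        field in ALLOWED_FIELDS
        for field, mask in match_masks.items()
        if any(mask)  # ignore wildcarded fields
    )
```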

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 14:24:22 -07:00
John Hurley
f5c977eed7 nfp: flower: detect potential pre-tunnel rules
Pre-tunnel rules are used when the tunnel end-point is on an 'internal
port'. These rules are used to direct the tunnelled packets (based on outer
header fields) to the internal port where they can be detunnelled. The
rule must send the packet to ingress the internal port at the TC layer.

Currently FW does not support an action to send to ingress so cannot
offload such rules. However, in preparation for populating the pre-tunnel
table to represent such rules, check for rules that send to the ingress of
an internal port and mark them as such. Further validation of such rules
is left to subsequent patches.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 14:24:21 -07:00
John Hurley
4b10c53d81 nfp: flower: push vlan after tunnel in merge
NFP allows the merging of 2 flows together into a single offloaded flow.
In the kernel datapath the packet must match 1 flow, implement its
actions, recirculate, match the 2nd flow and also implement its actions.
Merging creates a single flow with all actions from the 2 original flows.

Firmware implements a tunnel header push as the packet is about to egress
the card. Therefore, if the first merge rule candidate pushes a tunnel,
then the second rule can only have an egress action for a valid merge to
occur (or else the action ordering will be incorrect). This prevents the
pushing of a tunnel header followed by the pushing of a vlan header.

In order to support this behaviour, firmware allows VLAN information to
be encoded in the tunnel push action. If this is non zero then the fw will
push a VLAN after the tunnel header push meaning that 2 such flows with
these actions can be merged (with action order being maintained).

Support tunnel in VLAN pushes by encoding VLAN information in the tunnel
push action of any merge flow requiring this.
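
The merge rule can be sketched as follows. The action representation (dicts with a "type" key) and the function name are hypothetical; the sketch only shows the idea of folding a VLAN push from the second flow into the tunnel push of the first, and rejecting merges that would break action ordering.

```python
def merge_flows(first_actions, second_actions):
    """Merge two action lists, or return None if the merge is invalid."""
    tunnel_idx = next(
        (i for i, act in enumerate(first_actions) if act["type"] == "push_tunnel"),
        None,
    )
    if tunnel_idx is None:
        # No tunnel push in the first flow: plain concatenation.
        return [dict(a) for a in first_actions + second_actions]

    merged = [dict(a) for a in first_actions]
    for act in second_actions:
        if act["type"] == "push_vlan" and "vlan" not in merged[tunnel_idx]:
            # Encode the VLAN in the tunnel push so firmware pushes it
            # after the tunnel header.
            merged[tunnel_idx]["vlan"] = act["vlan"]
        elif act["type"] == "output":
            merged.append(dict(act))
        else:
            return None  # any other action would be ordered incorrectly
    return merged
```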

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 14:24:21 -07:00
David S. Miller
0a062ba725 mlx5-fixes-2019-07-25
Merge tag 'mlx5-fixes-2019-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2019-07-25

This series introduces some fixes to mlx5 driver.

1) Ariel is addressing an issue with encap flow counter race condition
2) Aya fixes ethtool speed handling
3) Edward fixes modify_cq hw bits alignment
4) Maor fixes RDMA_RX capabilities handling
5) Mark reverses unregister devices order to address an issue with LAG
6) From Tariq,
  - wrong max num channels indication regression
  - TLS counters naming and documentation as suggested by Jakub
  - kTLS, Call WARN_ONCE on netdev mismatch

There is one patch in this series that touches nfp driver to align
TLS statistics names with latest documentation, Jakub is CC'ed.

Please pull and let me know if there is any problem.

For -stable v4.9:
  ('net/mlx5: Use reversed order when unregister devices')

For -stable v4.20
  ('net/mlx5e: Prevent encap flow counter update async to user query')
  ('net/mlx5: Fix modify_cq_in alignment')

For -stable v5.1
  ('net/mlx5e: Fix matching of speed to PRM link modes')

For -stable v5.2
  ('net/mlx5: Add missing RDMA_RX capabilities')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-26 14:26:41 -07:00
Tariq Toukan
4ea52e2508 nfp: tls: rename tls packet counters
Align to the naming convention in TLS documentation.

Fixes: 51a5e56329 ("nfp: tls: add basic statistics")
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-07-25 13:31:00 -07:00
John Hurley
e03e47a3dc nfp: flower: offload MPLS set action
Recent additions to the kernel include a TC action module to manipulate
MPLS headers on packets. Such actions are available to offload via the
flow_offload intermediate representation API.

Modify the NFP driver to allow the offload of MPLS set actions to
firmware. Set actions update the outermost MPLS header. The offload
includes a mask to specify which fields should be set.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-23 13:52:51 -07:00
John Hurley
35b7c70cc3 nfp: flower: offload MPLS pop action
Recent additions to the kernel include a TC action module to manipulate
MPLS headers on packets. Such actions are available to offload via the
flow_offload intermediate representation API.

Modify the NFP driver to allow the offload of MPLS pop actions to
firmware. The act_mpls TC module enforces that the next protocol is
supplied along with the pop action. Passing this to firmware allows it
to properly rebuild the underlying packet after the pop.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-23 13:52:51 -07:00
John Hurley
a6eb1817fb nfp: flower: offload MPLS push action
Recent additions to the kernel include a TC action module to manipulate
MPLS headers on packets. Such actions are available to offload via the
flow_offload intermediate representation API.

Modify the NFP driver to allow the offload of MPLS push actions to
firmware.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-23 13:52:50 -07:00
Matthew Wilcox (Oracle)
d7840976e3 net: Use skb accessors in network drivers
In preparation for unifying the skb_frag and bio_vec, use the fine
accessors which already exist and use skb_frag_t instead of
struct skb_frag_struct.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-22 20:47:56 -07:00
Pablo Neira Ayuso
14bfb13f0e net: flow_offload: add flow_block structure and use it
This object stores the flow block callbacks that are attached to this
block. Update flow_block_cb_lookup() to take this new object.

This patch restores the block sharing feature.

Fixes: da3eeb904f ("net: flow_offload: add list handling functions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-19 21:27:45 -07:00
Pablo Neira Ayuso
0c7294ddae net: flow_offload: remove netns parameter from flow_block_cb_alloc()
No need to annotate the netns on the flow block callback object,
flow_block_cb_is_busy() already checks for used blocks.

Fixes: d63db30c85 ("net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free()")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-19 21:27:45 -07:00
John Hurley
103b7c25f5 nfp: flower: ensure ip protocol is specified for L4 matches
Flower rules on the NFP firmware are able to match on an IP protocol
field. When parsing rules in the driver, unknown IP protocols are only
rejected when further matches are to be carried out on layer 4 fields, as
the firmware will not be able to extract such fields from packets.

L4 protocol dissectors such as FLOW_DISSECTOR_KEY_PORTS are only parsed if
an IP protocol is specified. This leaves a loophole whereby a rule that
attempts to match on transport layer information such as port numbers but
does not explicitly give an IP protocol type can be incorrectly offloaded
(in this case with wildcarded port number matches).

Fix this by rejecting the offload of flows that attempt to match on L4
information, not only when matching on an unknown IP protocol type, but
also when the protocol is wildcarded.

Fixes: 2a04784594 ("nfp: flower: check L4 matches on unknown IP protocols")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12 15:31:55 -07:00
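Illustrative note on the L4 match commit above (not part of the commit): the rule described in the message can be sketched as a small predicate. The struct, the function names and the set of "known" protocols are assumptions made for this example, not the driver's actual data structures:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a parsed flower rule. */
struct rule_sketch {
	bool    ip_proto_specified; /* false means the protocol is wildcarded */
	uint8_t ip_proto;           /* meaningful only if ip_proto_specified */
	bool    matches_l4;         /* rule matches on ports or other L4 fields */
};

/* Protocols this sketch treats as parseable by firmware (an assumption). */
bool proto_known_sketch(uint8_t proto)
{
	return proto == 6 /* TCP */ || proto == 17 /* UDP */;
}

/* Returns true if the rule may be offloaded. */
bool l4_offload_ok_sketch(const struct rule_sketch *r)
{
	if (!r->matches_l4)
		return true;   /* nothing to extract at layer 4 */
	if (!r->ip_proto_specified)
		return false;  /* wildcarded protocol: the case the fix adds */
	return proto_known_sketch(r->ip_proto);
}
```

The second check is the one the fix introduces; before it, only the unknown-protocol case was rejected.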
John Hurley
fd262a6d8a nfp: flower: fix ethernet check on match fields
NFP firmware does not explicitly match on an ethernet type field. Rather,
each rule has a bitmask of match fields that can be used to infer the
ethernet type.

Currently, if a flower rule contains an unknown ethernet type, a check is
carried out for matches on other fields of the packet. If matches on
layer 3 or 4 are found, then the offload is rejected as firmware will not
be able to extract these fields from a packet with an ethernet type it
does not currently understand.

However, if a rule contains an unknown ethernet type without any L3 (or
above) matches then this will effectively be offloaded as a rule with a
wildcarded ethertype. This can lead to misclassifications on the firmware.

Fix this issue by rejecting all flower rules that specify a match on an
unknown ethernet type.

Further ensure correct offloads by moving the 'L3 and above' check to any
rule that does not specify an ethernet type and rejecting rules with
further matches. This means that we can still offload rules with a
wildcarded ethertype if they only match on L2 fields, but it will prevent
the offload of rules which match on further fields that we cannot be sure
the firmware will be able to extract.

Fixes: af9d842c13 ("nfp: extend flower add flow offload")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12 15:31:55 -07:00
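Illustrative note on the ethernet check commit above (not part of the commit): the two rejection rules in the message can be written as one decision function. All names here, and the choice of IPv4/IPv6 as the only "known" types, are assumptions for the sketch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the match fields of a flower rule. */
struct eth_rule_sketch {
	bool     eth_type_specified;  /* false means the ethertype is wildcarded */
	uint16_t eth_type;            /* meaningful only if eth_type_specified */
	bool     matches_l3_or_above; /* any match beyond the L2 header */
};

/* Ethertypes this sketch assumes the firmware understands. */
bool eth_type_known_sketch(uint16_t type)
{
	return type == 0x0800 /* IPv4 */ || type == 0x86DD /* IPv6 */;
}

/* Returns true if the rule may be offloaded. */
bool eth_offload_ok_sketch(const struct eth_rule_sketch *r)
{
	/* A specified ethertype must be one the firmware understands. */
	if (r->eth_type_specified)
		return eth_type_known_sketch(r->eth_type);

	/* A wildcarded ethertype is acceptable only for L2-only rules. */
	return !r->matches_l3_or_above;
}
```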
Pablo Neira Ayuso
f9e30088d2 net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload
And any other existing fields in this structure that refer to tc.
Specifically:

* tc_cls_flower_offload_flow_rule() to flow_cls_offload_flow_rule().
* TC_CLSFLOWER_* to FLOW_CLS_*.
* tc_cls_common_offload to flow_cls_common_offload.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09 14:38:51 -07:00
Pablo Neira Ayuso
0d4fd02e71 net: flow_offload: add flow_block_cb_is_busy() and use it
This patch adds a function to check if flow block callback is already in
use.  Call this new function from flow_block_cb_setup_simple() and from
drivers.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09 14:38:50 -07:00
Pablo Neira Ayuso
955bcb6ea0 drivers: net: use flow block API
This patch updates flow_block_cb_setup_simple() to use the flow block API.
Several drivers are also adjusted to use it.

This patch introduces the per-driver list of flow blocks to account for
blocks that are already in use.

Remove tc_block_offload alias.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09 14:38:50 -07:00
Pablo Neira Ayuso
32f8c4093a net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_*
Rename from TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_* and
remove temporary tcf_block_binder_type alias.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09 14:38:50 -07:00
Pablo Neira Ayuso
9c0e189ec9 net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND
Rename from TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND and remove
temporary tc_block_command alias.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09 14:38:50 -07:00
Pablo Neira Ayuso
4e95bc268b net: flow_offload: add flow_block_cb_setup_simple()
Most drivers do the same thing to set up the flow block callbacks, this
patch adds a helper function to do this.

This preparation patch reduces the number of changes to adapt the
existing drivers to use the flow block callback API.

This new helper function takes a flow block list per-driver, which is
set to NULL until this driver list is used.

This patch also introduces the flow_block_command and
flow_block_binder_type enumerations, which are renamed to use
FLOW_BLOCK_* in follow up patches.

There are three definitions (aliases) in order to reduce the number of
updates in this patch, which go away once drivers are fully adapted to
use this flow block API.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09 14:38:50 -07:00
Jakub Kicinski
5a4cea280c nfp: tls: undo TLS sequence tracking when dropping the frame
If the driver has to drop a TLS frame, it needs to undo the TCP
sequence tracking changes; otherwise the device will receive
segments out of order and drop them.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
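Illustrative note on the sequence tracking commit above (not part of the commit): a minimal sketch of the advance/undo pairing the message describes, assuming the driver keeps one expected TCP sequence number per connection. The struct and function names are hypothetical:

```c
#include <stdint.h>

/* Hypothetical per-connection TX state. */
struct tls_tx_ctx_sketch {
	uint32_t next_seq; /* TCP sequence number expected for the next segment */
};

/* Called when a frame is accepted for transmission. */
void tls_track_advance_sketch(struct tls_tx_ctx_sketch *ctx, uint32_t payload_len)
{
	/* Unsigned wraparound matches the 32-bit TCP sequence space. */
	ctx->next_seq += payload_len;
}

/* Called if that same frame is dropped before reaching the device. */
void tls_track_undo_sketch(struct tls_tx_ctx_sketch *ctx, uint32_t payload_len)
{
	ctx->next_seq -= payload_len;
}
```

If the undo step is skipped, the driver's expectation runs ahead of what the device has actually seen, which is the out-of-order condition the message warns about.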
Jakub Kicinski
c8d3928ea7 nfp: tls: avoid one of the ifdefs for TLS
Move the #ifdef CONFIG_TLS_DEVICE a little so we can eliminate
the other one.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
Jakub Kicinski
c3b6491133 nfp: tls: don't leave key material in freed FW cmsg skbs
Make sure the contents of the skb which carried key material
to the FW are cleared.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
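Illustrative note on the key material commit above (not part of the commit): clearing a buffer just before it is freed is the kind of store a compiler may optimise away, which is why the kernel offers memzero_explicit(). A userspace sketch of the same idea, using a volatile pointer so the writes are not elided; the function name is hypothetical and this does not claim to be what the driver does:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Overwrite len bytes at buf with zeros. Writing through a volatile
 * pointer keeps the compiler from dropping the stores as dead code
 * when the buffer is released right afterwards.
 */
void wipe_sketch(void *buf, size_t len)
{
	volatile uint8_t *p = buf;

	while (len--)
		*p++ = 0;
}
```

A caller would wipe the message buffer first and only then hand it back to the allocator.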
Dirk van der Merwe
b5d9a834f4 net/tls: don't clear TX resync flag on error
Introduce a return code for the tls_dev_resync callback.

When the driver TX resync fails, kernel can retry the resync again
until it succeeds.  This prevents drivers from attempting to offload
TLS packets if the connection is known to be out of sync.

We don't worry about RX resyncs since they will be retried naturally
as more encrypted records get received.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
Jakub Kicinski
427545b304 nfp: tls: count TSO segments separately for the TLS offload
Count the number of successfully submitted TLS segments,
not skbs. This will make it easier to compare the TLS
encryption count against other counters.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
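Illustrative note on the TSO counting commit above (not part of the commit): counting segments rather than skbs means a single large send adds more than one to the counter. A minimal sketch, assuming a packet descriptor that records how many segments the hardware will produce (0 when no segmentation is requested); all names are hypothetical:

```c
#include <stdint.h>

/* Hypothetical view of a packet queued for transmission. */
struct tx_pkt_sketch {
	uint16_t gso_segs; /* 0 when the packet will not be segmented */
};

/* Hypothetical statistics block. */
struct tls_stats_sketch {
	uint64_t tx_tls_segments;
};

/* Account one submitted TLS packet in units of wire segments. */
void count_tls_tx_sketch(struct tls_stats_sketch *stats,
			 const struct tx_pkt_sketch *pkt)
{
	stats->tx_tls_segments += pkt->gso_segs ? pkt->gso_segs : 1;
}
```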
Dirk van der Merwe
f6dfa31509 nfp: ccm: increase message limits
Increase the batch limit to consume small message bursts more
effectively. Practically, the effect on the 'add' messages is not
significant since the mailbox is sized such that the 'add' messages are
still limited to the same order of magnitude that it was originally set
for.

Furthermore, increase the queue size limit to 1024 entries. This further
improves the handling of bursts of small control messages.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
Jakub Kicinski
53601c68b8 nfp: tls: use unique connection ids instead of 4-tuple for TX
Connection 4-tuple reuse is slightly problematic: the TLS socket
and context do not get destroyed until all the associated skbs
have left the system and all references are released. This leaves
a stale connection entry in the device, which prevents the addition
of a new one if the 4-tuple is reused quickly enough.

Instead of using the real 4-tuple as the key, use a unique ID.
Set the protocol to TCP and the port to 0 to ensure no collisions
with real connections.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
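Illustrative note on the unique connection ID commit above (not part of the commit): a sketch of building a lookup key from a counter instead of the 4-tuple. The key layout and every name below are assumptions for the example; the firmware's real key format is not shown in the message:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical connection key with the same shape as a 4-tuple key. */
struct conn_key_sketch {
	uint8_t  l4_proto;
	uint16_t src_port;
	uint16_t dst_port;
	uint8_t  addr_bytes[8]; /* address space reused to carry the ID */
};

/* Fill *key from a monotonically increasing counter. */
void make_unique_key_sketch(struct conn_key_sketch *key, uint64_t *next_id)
{
	uint64_t id = (*next_id)++;

	memset(key, 0, sizeof(*key));
	key->l4_proto = 6; /* TCP */
	key->src_port = 0; /* port 0 is reserved, so no real flow uses it */
	key->dst_port = 0;
	memcpy(key->addr_bytes, &id, sizeof(id));
}
```

Because the ID never repeats, a lingering entry for an old socket cannot block a new connection that happens to reuse the same addresses and ports.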
Jakub Kicinski
ff8869d5ed nfp: tls: move setting ipver_vlan to a helper
Long lines are ugly.  No functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
Jakub Kicinski
0f93242d96 nfp: tls: ignore queue limits for delete commands
We need to do our best not to drop delete commands, otherwise
we will have stale entries in the connection table.  Ignore
the control message queue limits for delete commands.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
Wei Yongjun
31d166642c nfp: tls: fix error return code in nfp_net_tls_add()
Fix to return negative error code -EINVAL from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 1f35a56cf5 ("nfp: tls: add/delete TLS TX connections")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 15:27:33 -07:00
Sebastian Andrzej Siewior
f654e67670 nfp: Use spinlock_t instead of struct spinlock
For spinlocks the type spinlock_t should be used instead of "struct
spinlock".

Use spinlock_t for spinlock's definition.

Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: oss-drivers@netronome.com
Cc: netdev@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05 16:17:44 -07:00
Pieter Jansen van Vuuren
fccac5802d nfp: flower: add GRE encap action support
Add new GRE encapsulation support, which allows offload of filters
using tunnel_key set action in combination with actions that egress
to GRE type ports.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 19:47:36 -07:00
Pieter Jansen van Vuuren
e3a6aba081 nfp: flower: add GRE decap classification support
Extend the existing tunnel matching support to include GRE decap
classification. Specifically matching existing tunnel fields for
NVGRE (GRE with protocol field set to TEB).

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 19:47:36 -07:00
Pieter Jansen van Vuuren
104dce5be9 nfp: flower: rename tunnel related functions in action offload
Previously tunnel related functions in action offload only applied
to UDP tunnels. Rename these functions in preparation for new
tunnel types.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 19:47:36 -07:00
Pieter Jansen van Vuuren
4bf8758a89 nfp: flower: add helper functions for tunnel classification
Add IPv4 address and TTL/TOS helper functions in preparation
for compiling new tunnel types.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 19:47:36 -07:00
Pieter Jansen van Vuuren
986643de53 nfp: flower: refactor tunnel key layer calculation
Refactor the key layer calculation function, in particular the tunnel
key layer calculation by introducing helper functions. This is done
in preparation for supporting GRE tunnel offloads.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 19:47:36 -07:00
David S. Miller
13091aa305 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Honestly all the conflicts were simple overlapping changes,
nothing really interesting to report.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-17 20:20:36 -07:00
Pieter Jansen van Vuuren
bef6e97d57 nfp: flower: extend extack messaging for flower match and actions
Use extack messages in flower offload when compiling match and actions
messages that will configure hardware.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14 19:48:57 -07:00
Pieter Jansen van Vuuren
14179c4b45 nfp: flower: use extack messages in flower offload
Use extack messages in flower offload, specifically focusing on
the extack use in add offload, remove offload and get stats paths.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14 19:48:57 -07:00
Pieter Jansen van Vuuren
2a04784594 nfp: flower: check L4 matches on unknown IP protocols
Matching on fields with a protocol that is unknown to hardware
is not strictly unsupported. Determine if hardware can offload
a filter with an unknown protocol by checking if any L4 fields
are being matched as well.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14 19:48:57 -07:00
Jakub Kicinski
f767fc6655 nfp: print a warning when binding VFs to PF driver
Users sometimes mistakenly try to manually bind the PF driver
to the VFs, print a warning message in that case.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14 19:18:27 -07:00
Jakub Kicinski
605fd1c67e nfp: update the old flash error message
Apparently there are still cards in the wild with a very old
management FW.  Let's make the error message in that case
indicate more clearly that management firmware has to be
updated.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14 19:18:27 -07:00
Jakub Kicinski
9ed431c1d7 nfp: tls: make use of kernel-driven TX resync
When TCP stream gets out of sync (driver stops receiving skbs
with expected TCP sequence numbers) request a TX resync from
the kernel.

We try to distinguish retransmissions from missed transmissions
by comparing the sequence number to the expected one: if it is
further ahead than expected, we probably missed packets.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:27 -07:00
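Illustrative note on the TX resync commit above (not part of the commit): the comparison the message describes has to cope with 32-bit sequence wraparound. A minimal sketch using the usual signed-difference trick; the enum and function names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if seq lies strictly after expected in the 32-bit TCP sequence
 * space. The cast assumes two's complement, as on all common targets.
 */
bool seq_after_sketch(uint32_t seq, uint32_t expected)
{
	return (int32_t)(seq - expected) > 0;
}

enum tx_verdict_sketch {
	TX_IN_ORDER,   /* exactly the segment we expected */
	TX_RETRANSMIT, /* behind the expected point: data sent before */
	TX_MISSED      /* ahead of the expected point: a gap, ask for resync */
};

enum tx_verdict_sketch classify_tx_sketch(uint32_t seq, uint32_t expected)
{
	if (seq == expected)
		return TX_IN_ORDER;
	return seq_after_sketch(seq, expected) ? TX_MISSED : TX_RETRANSMIT;
}
```

For example, with expected at 0xFFFFFFF0 a segment starting at 5 is classified as ahead, even though 5 is numerically smaller.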
Jakub Kicinski
eeb2efaf36 net/tls: generalize the resync callback
Currently only RX direction is ever resynced, however, TX may
also get out of sequence if packets get dropped on the way to
the driver.  Rename the resync callback and add a direction
parameter.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:27 -07:00
Jakub Kicinski
c0a4948e1d nfp: tls: enable TLS RX offload
Set ethtool TLS RX feature based on NIC capabilities, and enable
TLS RX when connections are added for decryption.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:27 -07:00
Dirk van der Merwe
cad228a376 nfp: tls: implement RX TLS resync
Enable kernel-controlled RX resync and propagate TLS connection
RX resync from kernel TLS to firmware.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:27 -07:00
Jakub Kicinski
e2c7114a12 nfp: add async version of mailbox communication
Some control messages must be sent from atomic context.  The mailbox
takes sleeping locks and uses a waitqueue so add a "posted" version
of communication.

Trylock the semaphore and, if that's successful, kick off the device
communication.  The device communication will be completed from
a workqueue, which will also release the semaphore.

If the locks are taken, queue the message and return.  Schedule a
different workqueue to take the semaphore and run the communication.
Note that there are currently no atomic users which would actually
need the return value, so all replies to posted messages are just
freed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:27 -07:00
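Illustrative note on the async mailbox commit above (not part of the commit): the "trylock, else queue" flow can be sketched in userspace with a C11 atomic flag standing in for the semaphore. This is a single-threaded illustration under stated assumptions: the queue itself is unprotected, the second worker uses a trylock where real code would block, and all names are hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 8

struct mbox_sketch {
	atomic_flag busy;     /* stands in for the mailbox semaphore */
	int queue[QUEUE_MAX]; /* messages waiting for the lock */
	size_t queued;
	int in_flight;        /* message currently with the device, 0 if none */
};

enum post_result_sketch { POST_STARTED, POST_QUEUED, POST_DROPPED };

/* Never sleeps: either starts the exchange or parks the message. */
enum post_result_sketch mbox_post_sketch(struct mbox_sketch *m, int msg)
{
	if (!atomic_flag_test_and_set(&m->busy)) {
		m->in_flight = msg; /* lock taken: start device communication */
		return POST_STARTED;
	}
	if (m->queued == QUEUE_MAX)
		return POST_DROPPED;
	m->queue[m->queued++] = msg;
	return POST_QUEUED;
}

/* Completion worker: the reply is discarded and the lock released. */
void mbox_complete_sketch(struct mbox_sketch *m)
{
	m->in_flight = 0;
	atomic_flag_clear(&m->busy);
}

/* Second worker: take the lock and start the oldest queued message. */
bool mbox_run_queued_sketch(struct mbox_sketch *m)
{
	if (m->queued == 0 || atomic_flag_test_and_set(&m->busy))
		return false;
	m->in_flight = m->queue[0];
	for (size_t i = 1; i < m->queued; i++)
		m->queue[i - 1] = m->queue[i];
	m->queued--;
	return true;
}
```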
Jakub Kicinski
d7053e0433 nfp: rename nfp_ccm_mbox_alloc()
We need the name nfp_ccm_mbox_alloc() for allocating the mailbox
communication channel itself.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:26 -07:00
Dirk van der Merwe
5bcb5c7e98 nfp: tls: set skb decrypted flag
Firmware indicates when a packet has been decrypted by reusing the
currently unused BPF flag.  Transfer this information into the skb
and provide a statistic of all decrypted segments.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:26 -07:00
John Hurley
dce5ccccd1 nfp: ensure skb network header is set for packet redirect
Packets received at the NFP driver may be redirected to egress of another
netdev (e.g. in the case of OvS internal ports). On the egress path, some
processes, like TC egress hooks, may expect the network header offset
field in the skb to be correctly set. If this is not the case there is
potential for abnormal behaviour and even the triggering of BUG() calls.

Set the skb network header field before the mac header pull when doing a
packet redirect.

Fixes: 27f54b5825 ("nfp: allow fallback packets from non-reprs")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 20:08:09 -07:00
Jakub Kicinski
51a5e56329 nfp: tls: add basic statistics
Count TX TLS packets: successes, out of order, and dropped due to
missing record info.  Make sure the RX and TX completion statistics
don't share cache lines with TX ones as much as possible.  With TLS
stats they are no longer reasonably aligned.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:40 -07:00
Dirk van der Merwe
1f35a56cf5 nfp: tls: add/delete TLS TX connections
This patch adds the functionality to add and delete TLS connections on
the NFP, received from the kernel TLS callbacks.

Make use of the common control message (CCM) infrastructure to propagate
the kernel state to firmware.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:40 -07:00
Dirk van der Merwe
c3991d397f nfp: tls: add datapath support for TLS TX
Prepend connection handle to each transmitted TLS packet.

For each connection, the driver tracks the next sequence number
expected. If an out of order packet is observed, the driver calls into
the TLS kernel code to reencrypt that particular skb.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:40 -07:00
Jakub Kicinski
5584c0f825 nfp: prepare for more TX metadata prepend
Subsequent patches will add support for more TX metadata fields.
Prepare for this by handling an additional double word - firmware
handle as metadata type 7.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:40 -07:00
Jakub Kicinski
232eeb1f84 nfp: add tls init code
Add FW ABI defines and code for basic init of TLS offload.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:40 -07:00
Jakub Kicinski
d9d2d4c54f nfp: parse crypto opcode TLV
Parse TLV containing a bitmask of supported crypto operations.
The TLV contains a capability bitmask (supported operations)
and enabled bitmask.  Each operation describes the crypto
protocol quite exhaustively (protocol, AEAD, direction).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:39 -07:00
Jakub Kicinski
d3e4dfe060 nfp: add support for sending control messages via mailbox
FW may prefer to handle some communication via a mailbox
or the vNIC may simply not have a control queue (VFs).
Add a way of exchanging ccm-compatible messages via a
mailbox.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:39 -07:00
Jakub Kicinski
a68634893f nfp: parse the mailbox cmsg TLV
Parse the mailbox TLV.  When control message queue is not available
we can fall back to passing the control messages via the vNIC
mailbox.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:39 -07:00
Jakub Kicinski
3ed77bf766 nfp: make bar_lock a semaphore
We will need to release the bar lock from a workqueue
so move from a mutex to a semaphore.  This lock should
not be too hot.  Unfortunately semaphores don't have
lockdep support.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:39 -07:00
Jakub Kicinski
76581af254 nfp: count all failed TX attempts as errors
Currently if we need to modify the head of the skb and allocation
fails we would free the skb and not increment the error counter.
Make sure all errors are counted.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:39 -07:00
Gustavo A. R. Silva
856e6d9f9d nfp: flower: use struct_size() helper
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct nfp_tun_active_tuns {
        ...
        struct route_ip_info {
                __be32 ipv4;
                __be32 egress_port;
                __be32 extra[2];
        } tun_info[];
};

Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.

So, replace the following form:

sizeof(struct nfp_tun_active_tuns) + sizeof(struct route_ip_info) * count

with:

struct_size(payload, tun_info, count)

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-05 16:54:43 -07:00
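Illustrative note on the struct_size() commit above (not part of the commit): the kernel helper computes the size of a structure plus its trailing array and saturates at SIZE_MAX if the arithmetic would overflow. A userspace sketch of that calculation follows. The function name is hypothetical, uint32_t stands in for __be32, and the single header field is a placeholder because the real leading fields are elided in the message:

```c
#include <stddef.h>
#include <stdint.h>

struct route_ip_info_sketch {
	uint32_t ipv4;
	uint32_t egress_port;
	uint32_t extra[2];
};

struct tun_list_sketch {
	uint32_t count; /* placeholder header field, not the real layout */
	struct route_ip_info_sketch tun_info[]; /* flexible array member */
};

/*
 * Return header + elem * count, or SIZE_MAX if that would overflow.
 * An allocator handed SIZE_MAX fails cleanly instead of returning a
 * buffer that is too small.
 */
size_t flex_size_sketch(size_t header, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + elem * count;
}
```

The open-coded form in the message performs the same multiplication and addition but silently wraps on overflow, which is one of the mistakes the helper is meant to rule out.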
Jiong Wang
0b4de1ff19 nfp: bpf: eliminate zero extension code-gen
This patch eliminates zero extension code-gen for instructions including
both alu and load/store. The only exception is for ctx load, because the
offload target doesn't go through the host ctx convert logic, so we do a
customized load and ignore the zext flag set by the verifier.

Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-24 18:58:38 -07:00
Linus Torvalds
2c1212de6f SPDX update for 5.2-rc2, round 1
Here is a series of patches that add SPDX tags to different kernel files,
 based on two different things:
   - SPDX entries are added to a bunch of files that we missed a year ago
     that do not have any license information at all.
 
     These were either missed because the tool saw the MODULE_LICENSE()
     tag, or some EXPORT_SYMBOL tags, and got confused and thought the
     file had a real license, or the files have been added since the last
     big sweep, or they were Makefile/Kconfig files, which we didn't
     touch last time.
 
   - Add GPL-2.0-only or GPL-2.0-or-later tags to files where our scan
     tools can determine the license text in the file itself.  Where this
     happens, the license text is removed, in order to cut down on the
     700+ different ways we have in the kernel today, in a quest to get
     rid of all of these.
 
 These patches have been out for review on the linux-spdx@vger mailing
 list, and while they were created by automatic tools, they were
 hand-verified by a bunch of different people, all of whose names are on
 the patches as reviewers.
 
 The reason for these "large" patches is if we were to continue to
 progress at the current rate of change in the kernel, adding license
 tags to individual files in different subsystems, we would be finished
 in about 10 years at the earliest.
 
 There will be more series of these types of patches coming over the next
 few weeks as the tools and reviewers crunch through the more "odd"
 variants of how to say "GPLv2" that developers have come up with over
 the years, combined with other fun oddities (GPL + a BSD disclaimer?)
 that are being unearthed, with the goal for the whole kernel to be
 cleaned up.
 
 These diffstats are not small, 3840 files are touched, over 10k lines
 removed in just 24 patches.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'spdx-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull SPDX update from Greg KH:
 "Here is a series of patches that add SPDX tags to different kernel
  files, based on two different things:

   - SPDX entries are added to a bunch of files that we missed a year
     ago that do not have any license information at all.

     These were either missed because the tool saw the MODULE_LICENSE()
     tag, or some EXPORT_SYMBOL tags, and got confused and thought the
     file had a real license, or the files have been added since the
     last big sweep, or they were Makefile/Kconfig files, which we
     didn't touch last time.

   - Add GPL-2.0-only or GPL-2.0-or-later tags to files where our scan
     tools can determine the license text in the file itself. Where this
     happens, the license text is removed, in order to cut down on the
     700+ different ways we have in the kernel today, in a quest to get
     rid of all of these.

  These patches have been out for review on the linux-spdx@vger mailing
  list, and while they were created by automatic tools, they were
  hand-verified by a bunch of different people, all of whose names are
  on the patches as reviewers.

  The reason for these "large" patches is if we were to continue to
  progress at the current rate of change in the kernel, adding license
  tags to individual files in different subsystems, we would be finished
  in about 10 years at the earliest.

  There will be more series of these types of patches coming over the
  next few weeks as the tools and reviewers crunch through the more
  "odd" variants of how to say "GPLv2" that developers have come up with
  over the years, combined with other fun oddities (GPL + a BSD
  disclaimer?) that are being unearthed, with the goal for the whole
  kernel to be cleaned up.

  These diffstats are not small, 3840 files are touched, over 10k lines
  removed in just 24 patches"

* tag 'spdx-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (24 commits)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 24
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 23
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 22
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 21
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 20
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 19
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 18
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 17
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 15
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 14
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 12
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 11
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 10
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 9
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 7
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 5
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 4
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 3
  ...
2019-05-21 12:33:38 -07:00
Thomas Gleixner
ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Pieter Jansen van Vuuren
cb07d915bf nfp: flower: add rcu locks when accessing netdev for tunnels
Add rcu locks when accessing netdev when processing route request
and tunnel keep alive messages received from hardware.

Fixes: 8e6a9046b6 ("nfp: flower vxlan neighbour offload")
Fixes: 856f5b1357 ("nfp: flower vxlan neighbour keep-alive")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-14 16:02:42 -07:00
Jakub Kicinski
6c9f054414 nfp: add missing kdoc
Add missing kdoc for app member.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-09 16:41:46 -07:00
Jiong Wang
69e168ebdc nfp: bpf: fix static check error through tightening shift amount adjustment
The NFP shift instruction has a quirk: if the shift direction is left, a
shift amount of 1 to 31 is specified as 32 minus the amount to shift.

There is no need to do this for an indirect shift, which has a shift amount
of 0. If the subtraction is done anyway, shift amount 0 is turned into 32,
which would eventually be encoded the same as 0 because only the low 5 bits
are encoded. However, a shift amount of 32 fails the FIELD_PREP check done
later on the shift mask (0x1f), because 32 is out of the mask range. This
error has been observed when compiling nfp/bpf/jit.c using gcc 8.3 + O3.

The issue was introduced when indirect shift support was added, after which
the incoming shift amount to __emit_shf could be 0. The shift amount
adjustment inside __emit_shf should have been tightened at that time.
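
The rule described above can be sketched in plain C. This is only an
illustration, not the driver's code: encode_shift_amount, SHIFT_MASK and
enum shift_dir are hypothetical names, and only the encoding rule itself
(left shifts of 1 to 31 become 32 minus the amount, 0 is left alone, the
field is 5 bits wide) is taken from the text.

```c
#include <assert.h>
#include <stdint.h>

#define SHIFT_MASK 0x1fu /* 5-bit shift field (hypothetical name) */

enum shift_dir { SHIFT_LEFT, SHIFT_RIGHT };

/* Return the value to place in the 5-bit shift field. */
static uint8_t encode_shift_amount(enum shift_dir dir, uint8_t amount)
{
	/* Callers must pass an amount that fits the field. */
	assert(amount <= SHIFT_MASK);

	/* Adjust only non-zero left shifts: 32 - 0 would not fit in the mask. */
	if (dir == SHIFT_LEFT && amount != 0)
		amount = 32 - amount;

	return amount & SHIFT_MASK;
}
```

With the "amount != 0" guard, the value handed to the mask check is never 32.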

Fixes: 991f5b3651 ("nfp: bpf: support logic indirect shifts (BPF_[L|R]SH | BPF_X)")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: Pablo Cascón <pablo.cascon@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-09 15:54:30 -07:00
Pieter Jansen van Vuuren
1e966763e2 nfp: reintroduce ndo_get_port_parent_id for representor ports
NFP does not register devlink ports for representors (without
the "devlink: expose PF and VF representors as ports" series
there are no port flavours to expose them as).

Commit c25f08ac65 ("nfp: remove ndo_get_port_parent_id implementation")
went too far in removing ndo_get_port_parent_id for representors.
This causes redirection offloads to fail and the switch_id attribute
to be missing.

Reintroduce the ndo_get_port_parent_id callback for representor ports.

Fixes: c25f08ac65 ("nfp: remove ndo_get_port_parent_id implementation")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-08 16:32:36 -07:00
Pieter Jansen van Vuuren
d6787147e1 net/sched: remove block pointer from common offload structure
Based on feedback from Jiri avoid carrying a pointer to the tcf_block
structure in the tc_cls_common_offload structure. Instead store
a flag in driver private data which indicates if offloads apply
to a shared block at block binding time.

Suggested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-07 12:23:40 -07:00
Pieter Jansen van Vuuren
5fb5c395e2 nfp: flower: add qos offload stats request and reply
Add stats request function that sends a stats request message to hw for
a specific police-filter. Process stats reply from hw and update the
stored qos structure.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 21:49:24 -07:00
Pieter Jansen van Vuuren
49cbef1388 nfp: flower: add qos offload install and remove functionality.
Add install and remove offload functionality for qos offloads. We
first check that a police filter can be implemented by the VF rate
limiting feature in hw, then we install the filter via the qos
infrastructure. Finally we implement the mechanism for removing
these types of filters.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 21:49:24 -07:00
Pieter Jansen van Vuuren
b66d035eec nfp: flower: add qos offload framework
Introduce matchall filter offload infrastructure that is needed to
offload qos features like policing. Subsequent patches will make
use of police-filters for ingress rate limiting.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 21:49:24 -07:00
Dirk van der Merwe
790d23e7c5 nfp: implement PCI driver shutdown callback
The device may be shut down without the hardware being reinitialized, in
which case we want to ensure we clean up properly.

This is especially important for kexec with traffic flowing.

The shutdown procedure resembles the remove procedure, so we can reuse
those common tasks.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26 12:08:13 -04:00
David S. Miller
8b44836583 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Two easy cases of overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-25 23:52:29 -04:00
Ido Schimmel
f2ad1a522e net: devlink: Add extack to shared buffer operations
Add extack to shared buffer set operations, so that meaningful error
messages could be propagated to the user.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-22 22:09:32 -07:00
Pablo Cascón
4ef6cbe80d nfp: add SR-IOV trusted VF support
By default VFs are not trusted. Add ndo_set_vf_trust support to toggle
a new per-VF bit. Coupled with FW that has this capability, this allows a
trusted VF to change its MAC even after being administratively set by
the PF. Also populate the trusted field on ndo_get_vf_config. Add the
same ndo to the representors.

Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-19 21:00:31 -07:00
John Hurley
7d26c96052 nfp: flower: fix size_t compile warning
A recent addition to NFP introduced a function that formats a string with
a size_t variable. This is formatted with %ld which is fine on 64-bit
architectures but produces a compile warning on 32-bit architectures.

Fix this by using the z length modifier.
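
A minimal userspace sketch of the same idea, using standard C snprintf
rather than the driver's logging call. The function name format_len and
the "len=" text are made up for illustration. The z length modifier
matches size_t regardless of the width of long.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t portably: %zu is correct on 32-bit and 64-bit targets. */
static int format_len(char *buf, size_t buf_size, size_t len)
{
	return snprintf(buf, buf_size, "len=%zu", len);
}
```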

Fixes: a6156a6ab0f9 ("nfp: flower: handle merge hint messages")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-19 11:56:40 -07:00
Colin Ian King
d003d772e6 nfp: abm: fix spelling mistake "offseting" -> "offsetting"
There are a couple of spelling mistakes in NL_SET_ERR_MSG_MOD error
messages. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:22:26 -07:00
John Hurley
9bad65e515 nfp: flower: fix implicit fallthrough warning
The nfp_flower_copy_pre_actions function introduces a case statement with
an intentional fallthrough. However, this generates a warning if built
with the -Wimplicit-fallthrough flag.

Remove the warning by adding a fall through comment.
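
As an illustration only, the pattern looks like the sketch below. The
enum values and byte counts are hypothetical and unrelated to the real
nfp_flower_copy_pre_actions. The comment before the second case label is
what marks the fallthrough as intentional.

```c
/* Hypothetical action types, for illustration only. */
enum action { ACT_PRE_TUNNEL, ACT_PRE_LAG, ACT_OTHER };

/* Return a made-up number of pre-action bytes for the given action. */
static int pre_action_len(enum action first)
{
	int len = 0;

	switch (first) {
	case ACT_PRE_TUNNEL:
		len += 8;
		/* fall through */
	case ACT_PRE_LAG:
		len += 4;
		break;
	default:
		break;
	}

	return len;
}
```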

Fixes: 1c6952ca58 ("nfp: flower: generate merge flow rule")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16 21:42:06 -07:00
John Hurley
8af56f40e5 nfp: flower: offload merge flows
A merge flow is formed from 2 sub flows. The match fields of the merge are
the same as the first sub flow that has formed it, with the actions being
a combination of the first and second sub flow. Therefore, a merge flow
should replace sub flow 1 when offloaded.

Offload valid merge flows by using a new 'flow mod' message type to
replace an existing offloaded rule. Track the deletion of sub flows that
are linked to a merge flow and revert offloaded merge rules if required.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
John Hurley
aa6ce2ea0c nfp: flower: support stats update for merge flows
With the merging of 2 sub flows, a new 'merge' flow will be created and
written to FW. The TC layer is unaware that the merge flow exists and will
request stats from the sub flows. Conversely, the FW treats a merge rule
the same as any other rule and sends stats updates to the NFP driver.

Add links between merge flows and their sub flows. Use these links to pass
merge flow stats updates from FW to the underlying sub flows, ensuring TC
stats requests are handled correctly. The updating of sub flow stats is
done on (the less time critical) TC stats requests rather than on FW stats
update.
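
A simplified sketch of this deferral, assuming a merge flow with exactly
two linked sub flows. All struct and function names here are
hypothetical and do not match the driver's data structures. FW updates
are only accumulated, and the counts are pushed to the sub flows when TC
asks for stats.

```c
#include <stddef.h>
#include <stdint.h>

struct flow_stats {
	uint64_t pkts;
	uint64_t bytes;
};

/* Hypothetical merge flow: pending FW counts plus links to its sub flows. */
struct merge_flow {
	struct flow_stats pending;       /* accumulated from FW stats updates */
	struct flow_stats *sub_flows[2]; /* stats of the linked sub flows */
};

/* FW stats update path: cheap, only accumulate. */
static void merge_flow_fw_update(struct merge_flow *m, uint64_t pkts,
				 uint64_t bytes)
{
	m->pending.pkts += pkts;
	m->pending.bytes += bytes;
}

/* TC stats request path: push pending counts to every linked sub flow. */
static void merge_flow_flush(struct merge_flow *m)
{
	for (size_t i = 0; i < 2; i++) {
		m->sub_flows[i]->pkts += m->pending.pkts;
		m->sub_flows[i]->bytes += m->pending.bytes;
	}
	m->pending.pkts = 0;
	m->pending.bytes = 0;
}
```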

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
John Hurley
1c6952ca58 nfp: flower: generate merge flow rule
When combining 2 sub_flows to a single 'merge flow' (assuming the merge is
valid), the merge flow should contain the same match fields as sub_flow 1
with actions derived from a combination of sub_flows 1 and 2. This action
list should have all actions from sub_flow 1 with the exception of the
output action that triggered the 'implicit recirculation' by sending to
an internal port, followed by all actions of sub_flow 2. Any pre-actions
in either sub_flow should feature at the start of the action list.

Add code to generate a new merge flow and populate the match and actions
fields based on the sub_flows. The offloading of the flow is left to
future patches.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
John Hurley
107e37bb4f nfp: flower: validate merge hint flows
Two flows can be merged if the second flow (after recirculation) matches
on bits that are either matched on or explicitly set by the first flow.
This means that if a packet hits flow 1 and recirculates then it is
guaranteed to hit flow 2.

Add a 'can_merge' function that determines if 2 sub_flows in a merge hint
can be validly merged to a single flow.
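
The bit-coverage condition can be sketched as follows. This is a
simplified stand-in, not the driver's can_merge: struct sub_flow and its
single 64-bit masks are assumptions, and the sketch checks only which
bits are covered, not their values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified flow: one 64-bit mask per property. */
struct sub_flow {
	uint64_t match_mask; /* bits the flow matches on */
	uint64_t set_mask;   /* bits the flow's actions explicitly set */
};

/* flow2 may only match on bits that flow1 either matched on or set. */
static bool can_merge(const struct sub_flow *flow1,
		      const struct sub_flow *flow2)
{
	uint64_t known = flow1->match_mask | flow1->set_mask;

	return (flow2->match_mask & ~known) == 0;
}
```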

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
John Hurley
dbc2d68edc nfp: flower: handle merge hint messages
If a merge hint is received containing 2 flows that are matched via an
implicit recirculation (sending to and matching on an internal port), fw
reports that the flows (called sub_flows) may be able to be combined to a
single flow.

Add infrastructure to accept and process merge hint messages. The actual
merging of the flows is left as a stub call.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
John Hurley
cf4172d575 nfp: flower: get flows by host context
Each flow is given a context ID that the fw uses (along with its cookie)
to identify the flow. The flow's stats are updated by the fw via this ID
which is a reference to a pre-allocated array entry.

In preparation for flow merge code, enable the nfp_fl_payload structure to
be accessed via this stats context ID. Rather than increasing the memory
requirements of the pre-allocated array, add a new rhashtable to associate
each active stats context ID with its rule payload.

While adding new code to the compile metadata functions, slightly
restructure the existing function to allow for cleaner, easier to read
error handling.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
John Hurley
45756dfeda nfp: flower: allow tunnels to output to internal port
The neighbour table in the FW only accepts next hop entries if the egress
port is an nfp repr. Modify this to allow the next hop to be an internal
port. This means that if a packet is to egress to that port, it will
recirculate back into the system with the internal port becoming its
ingress port.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
John Hurley
f41dd0595d nfp: flower: support fallback packets from internal ports
FW may receive a packet with its ingress port marked as an internal port.
If a rule does not exist to match on this port, the packet will be sent to
the NFP driver. Modify the flower app to detect packets from such internal
ports and convert the ingress port to the correct kernel space netdev.

At this point, it is assumed that fallback packets from internal ports are
to be sent out said port. Therefore, set the redir_egress bool to true on
detection of these ports.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
John Hurley
27f54b5825 nfp: allow fallback packets from non-reprs
Currently, it is assumed that fallback packets will be from reprs. Modify
this to allow an app to receive non-repr ports from the fallback channel -
e.g. from an internal port. If such a packet is received, do not update
repr stats.

Change the naming function calls so as not to imply it will always be a
repr netdev returned. Add the option to set a bool value to redirect a
fallback packet out the returned port rather than RXing it. Setting of
this bool in subsequent patches allows the handling of packets falling
back when they are due to egress an internal port.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
John Hurley
4d12ba4278 nfp: flower: allow offloading of matches on 'internal' ports
Recent FW modifications allow the offloading of non repr ports. These
ports exist internally on the NFP. So if a rule outputs to an 'internal'
port, then the packet will recirculate back into the system but will now
have this internal port as its incoming port. These ports are indicated
by a specific type field combined with an 8 bit port id.

Add private app data to assign additional port ids for use in offloads.
Provide functions to lookup or create new ids when a rule attempts to
match on an internal netdev - the only internal netdevs currently
supported are of type openvswitch. Have a netdev notifier to release
port ids on netdev unregister.

OvS offloads rules that match on internal ports as TC egress filters.
Ensure that such rules are accepted by the driver.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
John Hurley
2f2622f59c nfp: flower: turn on recirc and merge hint support in firmware
Write to a FW symbol to indicate that the driver supports flow merging. If
this symbol does not exist then flow merging and recirculation are not
supported on the FW. If support is available, add a stub to deal with FW
to kernel merge hint messages.

Full flow merging requires the firmware to support flow mods. If it
does not, then do not attempt to 'turn on' flow merging.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15 15:45:36 -07:00
Jakub Kicinski
bcf0cafab4 nfp: split out common control message handling code
BPF's control message handler seems like a good base to build
on for request-reply control messages.  Split it out to allow
for reuse.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:29:15 -07:00
Jakub Kicinski
0a72d8332c nfp: move vNIC reset before netdev init
During probe we clear vNIC configuration in case the device
wasn't closed cleanly by previous driver.  Move that code
before netdev init, so netdev init can already try to apply
its config parameters.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:29:15 -07:00
Jakub Kicinski
dd5b2498d8 nfp: add a mutex lock for the vNIC ctrl BAR
Soon we will try to write to the vNIC mailbox without RTNL held.
Add a new mutex to protect access to specific parts of the PCI
control BAR.

Move the mailbox size checking to the mailbox lock() helper, where
it can be more effective (happen prior to potential overwrite of
other data).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:29:15 -07:00
Dirk van der Merwe
e64718282c nfp: opportunistically poll for reconfig result
If the reconfig was a quick update, we could have results available from
firmware within 200us.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:29:15 -07:00
David S. Miller
f83f715195 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor comment merge conflict in mlx5.

Staging driver has a fixup due to the skb->xmit_more changes
in 'net-next', but was removed in 'net'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-05 14:14:19 -07:00
Jiri Pirko
c25f08ac65 nfp: remove ndo_get_port_parent_id implementation
Remove implementation of get_port_parent_id ndo and rely on core calling
into devlink for the information directly.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 17:42:36 -07:00
Jiri Pirko
1b15c90270 nfp: pass switch ID through devlink_port_attrs_set()
Pass the switch ID down to devlink through devlink_port_attrs_set()
so it can be used by devlink_compat_switch_id_get().

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 17:42:36 -07:00
Jiri Pirko
bec5267cde net: devlink: extend port attrs for switch ID
Extend devlink_port_attrs_set() to pass switch ID for ports which are
part of switch and store it in port attrs. For other ports, this is
NULL.

Note that this allows the driver to group devlink ports into one or more
switches according to the actual topology.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 17:42:36 -07:00
Florian Westphal
6b16f9ee89 net: move skb->xmit_more hint to softnet data
There are two reasons for this.

First, the xmit_more flag conceptually doesn't fit into the skb, as
xmit_more is not a property related to the skb.
It's only a hint to the driver that the stack is about to transmit another
packet immediately.

Second, it was only done this way to not have to pass another argument
to ndo_start_xmit().

We can place xmit_more in the softnet data, next to the device recursion.
The recursion counter is already written to on each transmit. The "more"
indicator is placed right next to it.

Drivers can use the netdev_xmit_more() helper instead of skb->xmit_more
to check the "more packets coming" hint.

skb->xmit_more is retained (but always 0) to not cause build breakage.

This change takes care of the simple s/skb->xmit_more/netdev_xmit_more()/
conversions.  Remaining drivers are converted in the next patches.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-01 18:35:02 -07:00
Dirk van der Merwe
61f7c6f448 nfp: implement ethtool get module EEPROM
Now that the NSP provides the ability to read from the SFF modules'
EEPROM, we can use this interface to implement the ethtool callback.

If the NSP only provides partial data, we log the event from within
the driver but pass a success code to ethtool to prevent it from
discarding the partial data.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-01 18:05:13 -07:00
Dirk van der Merwe
593cb18285 nfp: nsp: implement read SFF module EEPROM
The NSP now provides the ability to read from the SFF module EEPROM.
Note that even if an error occurs, the NSP may still provide some of the
data.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-01 18:05:13 -07:00
Pieter Jansen van Vuuren
eff07b42d8 nfp: flower: reduce action list size by coalescing mangle actions
With the introduction of flow_action_for_each, pedit actions are no
longer grouped together; instead, pedit actions are broken out per
32-bit word. This results in an inefficient use of the action list
that is pushed to hardware, where each 32-bit word becomes its own
action. Therefore we combine groups of 32-bit words before sending
the action list to hardware.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-01 18:05:13 -07:00
Pieter Jansen van Vuuren
42cd5484a2 nfp: flower: remove vlan CFI bit from push vlan action
We no longer set CFI when pushing vlan tags, therefore we remove
the CFI bit from push vlan.

Fixes: 1a1e586f54 ("nfp: add basic action capabilities to flower offloads")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-01 18:02:41 -07:00
Pieter Jansen van Vuuren
f7ee799a51 nfp: flower: replace CFI with vlan present
Replace vlan CFI bit with a vlan present bit that indicates the
presence of a vlan tag. Previously the driver incorrectly assumed
that a vlan id of 0 is not matchable, therefore we indicate vlan
presence with a vlan present bit.
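
A small sketch of why a separate present bit is needed. The bit position
and all names below are assumptions chosen for illustration, not the
firmware ABI. With a dedicated bit, a tagged packet with vlan id 0
produces a different key from an untagged packet.

```c
#include <stdint.h>

#define VLAN_VID_MASK    0x0fffu    /* 12-bit vlan id */
#define VLAN_PRESENT_BIT (1u << 12) /* assumed position of the present bit */

/* Build a 16-bit match key; an untagged packet yields an all-zero key. */
static uint16_t build_vlan_key(int has_tag, uint16_t vid)
{
	if (!has_tag)
		return 0;

	return (uint16_t)(VLAN_PRESENT_BIT | (vid & VLAN_VID_MASK));
}
```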

Fixes: 5571e8c9f2 ("nfp: extend flower matching capabilities")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-01 18:02:41 -07:00
Jakub Kicinski
c3e1f7fff6 nfp: disable netpoll on representors
NFP reprs are software devices on top of the PF's vNIC.
The comment above __dev_queue_xmit() sayeth:

 When calling this method, interrupts MUST be enabled.  This is because
 the BH enable code must have IRQs enabled so that it will not deadlock.

For netconsole we can't guarantee IRQ state, let's just
disable netpoll on representors to be on the safe side.

When the initial implementation of NFP reprs was added by the
commit 5de73ee467 ("nfp: general representor implementation")
.ndo_poll_controller was required for netpoll to be enabled.

Fixes: ac3d9dd034 ("netpoll: make ndo_poll_controller() optional")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-28 17:04:29 -07:00
Jakub Kicinski
c8ba5b91a0 nfp: validate the return code from dev_queue_xmit()
dev_queue_xmit() may return error codes as well as netdev_tx_t,
and it always consumes the skb.  Make sure we always return a
correct netdev_tx_t value.
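
The idea can be sketched in userspace C with stand-in types. Nothing
below is kernel code: enum tx_status, struct tx_stats and repr_xmit are
hypothetical, and queue_ret stands for the value returned by the lower
layer. Because the lower layer has already consumed the buffer, the
caller must not be told to retry, so errors are counted as drops and an
"ok" status is returned.

```c
/* Stand-in for the driver-visible transmit status (hypothetical values). */
enum tx_status { TX_OK = 0x00, TX_BUSY = 0x10 };

struct tx_stats {
	unsigned long packets;
	unsigned long drops;
};

/* Map the lower layer's return code to a valid transmit status. */
static enum tx_status repr_xmit(int queue_ret, struct tx_stats *stats)
{
	if (queue_ret == 0)
		stats->packets++;
	else
		stats->drops++; /* error code or drop: buffer is already gone */

	/* Never propagate queue_ret: it is not a valid transmit status. */
	return TX_OK;
}
```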

Fixes: eadfa4c3be ("nfp: add stats and xmit helpers for representors")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-28 17:04:29 -07:00
Jiri Pirko
f1fa719cfd nfp: do not handle nn->port defined case in nfp_net_get_phys_port_name()
If nn->port is defined it means that devlink_port has been registered
for this port as well. Devlink core is handling the port name
formatting.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-28 12:55:31 -07:00
Jiri Pirko
5dc37bb9b0 net: replace ndo_get_devlink with ndo_get_devlink_port
Follow-up patch is going to need a devlink port instance according to
a netdev. Devlink port instance should be always available when devlink
is used. So change the recently introduced ndo_get_devlink to
ndo_get_devlink_port. With that, adjust the wrapper for the only
user to get devlink pointer.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-28 12:55:30 -07:00
Jiri Pirko
335bc0dde0 nfp: register devlink port before netdev
Change the init/fini flow and register devlink port instance before
netdev. Now it is needed for correct behavior of phys_port_name
generation, but in general it makes sense to register devlink port
first.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-28 12:55:30 -07:00
Jiri Pirko
f6b19b354d net: devlink: select NET_DEVLINK from drivers
Some drivers are becoming more dependent on NET_DEVLINK being selected
in configuration. With upcoming compat functions, the behavior would be
wrong in case devlink was not compiled in. So make the drivers select
NET_DEVLINK and rely on the functions being there, not just stubs.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-24 14:55:31 -04:00
Jiri Pirko
faaccbe6eb nfp: move devlink port type set after netdev registration
Similar to other drivers, move the port type set after netdev registration
is done. Along with that, clear the type before unregistration.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-24 14:55:31 -04:00
Moshe Shemesh
bea964107f net: Add IANA_VXLAN_UDP_PORT definition to vxlan header file
Added IANA_VXLAN_UDP_PORT (4789) definition to vxlan header file so it
can be used by drivers instead of local definition.
Updated drivers which locally defined it as 4789 to use it.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: John Hurley <john.hurley@netronome.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Yunsheng Lin <linyunsheng@huawei.com>
Cc: Peng Li <lipeng321@huawei.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-22 12:09:31 -07:00
Moshe Shemesh
974eff2b57 net: Move the definition of the default Geneve udp port to public header file
Move the definition of the default Geneve udp port from the geneve
source to the header file, so we can re-use it from drivers.
Modify existing drivers to use it.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: John Hurley <john.hurley@netronome.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-22 12:09:31 -07:00
Jakub Kicinski
31f1a0e37c nfp: remove defines for unused control bits
NFP driver ABI contains bits for L2 switching which were never
implemented in initially envisioned form.

Remove the defines, and open up the possibility of
reclaiming the bits for other uses.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21 14:01:55 -07:00
Dirk van der Merwe
eaab2d2d0f nfp: fix simple vNIC mailbox length
The simple vNIC mailbox length should be 12 decimal and not 0x12.
Using a decimal also makes it clear this is a length value and not
another field within the simple mailbox defines.

Found by code inspection, there are no known firmware configurations
where this would cause issues.

Fixes: 527d7d1b99 ("nfp: read mailbox address from TLV caps")
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-07 11:00:21 -08:00
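The typo fixed by eaab2d2d0f is easy to miss in review because 0x12 reads like 12 but is 18. A minimal sketch of the difference follows; the macro and function names are hypothetical and are not the driver's actual defines.

```c
#include <stddef.h>

/* Hypothetical names for illustration; not the driver's actual defines. */
#define MBOX_LEN_HEX_TYPO 0x12 /* hexadecimal literal: 18 bytes */
#define MBOX_LEN_DECIMAL  12   /* decimal literal: the intended 12 bytes */

/* Number of bytes by which the hex literal overstates the mailbox length. */
static size_t mbox_len_overstatement(void)
{
	return MBOX_LEN_HEX_TYPO - MBOX_LEN_DECIMAL;
}
```

With these values the overstatement is 6 bytes, which matches the commit's note that the bug was found by inspection rather than by a failing firmware configuration.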
Dirk van der Merwe
35697764d7 nfp: nsp: set higher timeout for flash bundle
The management firmware now supports being passed a bundle with
multiple components to be stored in flash at once. This makes it
easier to update all components to a known state with a single
user command; however, this also has the potential to increase
the time required to perform the update significantly.

The management firmware only updates the components out of a bundle
which are outdated; however, we need to make sure we can handle
the absolute worst case where a CPLD update can take a long time
to perform.

We set a very conservative total timeout of 900s which already
adds a contingency.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01 11:36:01 -08:00
Jakub Kicinski
345415138d nfp: nsp: allow the use of DMA buffer
Newer versions of NSP can access host memory.  Simplest access
type requires all data to be in one contiguous area.  Since we
don't have the guarantee on where callers of the NSP ABI will
allocate their buffers we allocate a bounce buffer and copy
the data in and out.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01 11:36:01 -08:00
Jakub Kicinski
66487abe2f nfp: nsp: move default buffer handling into its own function
DMA version of NSP communication is coming, move the code which
copies data into the NFP buffer into a separate function.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01 11:36:00 -08:00
Jakub Kicinski
882cdcb5d3 nfp: nsp: use fractional size of the buffer
NSP expresses the buffer size in MB and 4 kB blocks.  For small
buffers the kB part may make a difference, so count it in.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01 11:36:00 -08:00
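The size arithmetic described in 882cdcb5d3 can be sketched as below. This is only an illustration: it assumes "MB" means 2^20 bytes and a block is 4096 bytes, and the function name and the way the two fields are passed in are hypothetical, not the NSP register layout.

```c
#include <stdint.h>

/* Assumed unit sizes; the commit only says "MB and 4 kB blocks". */
#define SKETCH_SZ_1M (1024u * 1024u)
#define SKETCH_SZ_4K (4u * 1024u)

/* Total buffer size in bytes from a whole-MB count plus a 4 kB block count. */
static uint64_t nsp_buf_bytes(uint32_t mb, uint32_t blocks_4k)
{
	return (uint64_t)mb * SKETCH_SZ_1M + (uint64_t)blocks_4k * SKETCH_SZ_4K;
}
```

For a small buffer such as 0 MB plus 3 blocks, ignoring the block part would report 0 bytes instead of 12288, which is the case the commit addresses.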
Jakub Kicinski
1e301a1407 nfp: report RJ45 connector in ethtool
Add support for reporting twisted pair port type.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01 11:36:00 -08:00
Jakub Kicinski
03969b9414 nfp: remove ethtool flashing fallback
Now that devlink fallback will be called reliably, we can remove
the ethtool flashing code.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 08:49:05 -08:00
Jakub Kicinski
28e8c75413 nfp: add .ndo_get_devlink
Support getting devlink instance from a new NDO.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 08:49:05 -08:00
Jakub Kicinski
f4b6bcc700 net: devlink: turn devlink into a built-in
Being able to build devlink as a module causes growing pains.
First all drivers had to add a meta dependency to make sure
they are not built in when devlink is built as a module.  Now
we are struggling to invoke ethtool compat code reliably.

Make devlink code built-in. Users can still choose not to build it at
all, but the dynamically loadable module option is removed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 08:49:05 -08:00
Florian Fainelli
d7977107b3 nfp: Remove switchdev.h inclusion
This is no longer necessary after a5084bb71f ("nfp: Implement
ndo_get_port_parent_id()")

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 17:40:46 -08:00
David S. Miller
70f3522614 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Three conflicts, one of which, for marvell10g.c is non-trivial and
requires some follow-up from Heiner or someone else.

The issue is that Heiner converted the marvell10g driver over to
use the generic c45 code as much as possible.

However, in 'net' a bug fix appeared which makes sure that a new
local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0
is cleared.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:06:19 -08:00
Jiong Wang
f036ebd9bf nfp: bpf: fix ALU32 high bits clearance bug
NFP BPF JIT compiler is doing a couple of small optimizations when jitting
ALU imm instructions, some of these optimizations could save code-gen, for
example:

  A & -1 =  A
  A |  0 =  A
  A ^  0 =  A

However, for ALU32, the high 32 bits of the 64-bit register should still be
cleared according to ISA semantics.

Fixes: cd7df56ed3 ("nfp: add BPF to NFP code translator")
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-23 00:07:47 +01:00
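The ALU32 rule referenced in f036ebd9bf can be written as a small reference model. This is a sketch of the semantics only, not of the NFP code generator; the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Reference model of 32-bit BPF ALU ops with an immediate: operate on the
 * low 32 bits, then zero-extend the result into the 64-bit register.
 */
static uint64_t alu32_and_imm(uint64_t dst, int32_t imm)
{
	return (uint32_t)((uint32_t)dst & (uint32_t)imm);
}

static uint64_t alu32_or_imm(uint64_t dst, int32_t imm)
{
	return (uint32_t)((uint32_t)dst | (uint32_t)imm);
}

static uint64_t alu32_xor_imm(uint64_t dst, int32_t imm)
{
	return (uint32_t)((uint32_t)dst ^ (uint32_t)imm);
}
```

Even when the immediate makes the operation an identity on the low half (A & -1, A | 0, A ^ 0), the upper half of the result is zero, so the instruction cannot simply be dropped in ALU32 mode. The model also shows that A ^ -1 is not an identity, since it inverts the low 32 bits.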
Jiong Wang
71c190249f nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K
The intended optimization should be A ^ 0 = A, not A ^ -1 = A.

Fixes: cd7df56ed3 ("nfp: add BPF to NFP code translator")
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-23 00:07:47 +01:00
Pieter Jansen van Vuuren
0496743b20 nfp: flower: fix masks for tcp and ip flags fields
Check mask fields of tcp and ip flags when setting the corresponding mask
flag used in hardware.

Fixes: 8f2566225a ("flow_offload: add flow_rule and flow_match")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-17 15:28:50 -08:00
Jakub Kicinski
5c5696f3df nfp: devlink: allow flashing the device via devlink
Devlink now allows updating device flash.  Implement this
callback.

Compared to ethtool update we no longer have to release
the networking locks - devlink doesn't take them.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-17 15:27:38 -08:00
David S. Miller
885e631959 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-02-16

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong.

2) test all bpf progs in alu32 mode, from Jiong.

3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin.

4) support for IP encap in lwt bpf progs, from Peter.

5) remove XDP_QUERY_XSK_UMEM dead code, from Jan.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-16 22:56:34 -08:00
Jakub Kicinski
0ff8409b52 nfp: flower: remove double new line
Recent cls_flower offload rewrite added a double new line.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12 11:38:52 -05:00
Jakub Kicinski
dd27c2e3d0 bpf: offload: add priv field for drivers
Currently bpf_offload_dev does not have any priv pointer, forcing
the drivers to work backwards from the netdev in program metadata.
This is not great given programs are conceptually associated with
the offload device, and it means one or two unnecessary dereferences.
Add a priv pointer to bpf_offload_dev.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-12 17:07:09 +01:00
Jakub Kicinski
1f5cf1036c nfp: devlink: include vendor/product info in serial number
The manufacturing team requests we include vendor and product
in the serial number field, as the serial number itself is not
unique across manufacturing facilities and products.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11 20:39:56 -08:00
Jakub Kicinski
05fe4ab75c nfp: devlink: use the generic manufacture identifier instead of vendor
Vendor may sound ambiguous, let's rename the fab string to
"board.manufacture" (which was just added as a generic identifier).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11 20:39:56 -08:00
Gustavo A. R. Silva
af6f12f22b nfp: flower: cmsg: use struct_size() helper
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
    int stuff;
    void *entry[];
};

size = sizeof(struct foo) + count * sizeof(void *);
instance = alloc(size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = alloc(struct_size(instance, entry, count), GFP_KERNEL);

Notice that, in this case, variable size is not necessary, hence
it is removed.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 22:57:28 -08:00
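The idea behind the struct_size() conversion in af6f12f22b can be approximated in userspace as follows. This is a sketch under assumptions: flex_size() and foo_alloc() are hypothetical stand-ins, calloc() replaces the kernel allocator, and saturating to SIZE_MAX on overflow mirrors what the kernel helper is understood to do rather than reproducing its implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct foo {
	int stuff;
	void *entry[];
};

/*
 * Header size plus count elements, saturating to SIZE_MAX on overflow so
 * that the allocation fails instead of being silently undersized.
 */
static size_t flex_size(size_t hdr, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - hdr) / elem)
		return SIZE_MAX;
	return hdr + elem * count;
}

/* Allocate a zeroed struct foo with room for count trailing entries. */
static struct foo *foo_alloc(size_t count)
{
	return calloc(1, flex_size(sizeof(struct foo), sizeof(void *), count));
}
```

The open-coded form `sizeof(struct foo) + count * sizeof(void *)` wraps around for a very large count, while the saturating form makes the allocator reject the request.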
Pablo Neira Ayuso
c0bc5d8e2b nfp: flower: remove unused index from nfp_fl_pedit()
Static checker warning complains on uninitialized variable:

        drivers/net/ethernet/netronome/nfp/flower/action.c:618 nfp_fl_pedit()
        error: uninitialized symbol 'idx'.

Which is actually never used from the functions that take it as
parameter. Remove it.

Fixes: 7386788175 ("drivers: net: use flow action infrastructure")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 11:46:45 -08:00
Florian Fainelli
a5084bb71f nfp: Implement ndo_get_port_parent_id()
NFP only supports SWITCHDEV_ATTR_ID_PORT_PARENT_ID, which makes it a
great candidate to be converted to use the ndo_get_port_parent_id() NDO
instead of implementing switchdev_port_attr_get().

Since NFP uses switchdev_port_same_parent_id() convert it to use
netdev_port_same_parent_id().

Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 14:16:12 -08:00
Pablo Neira Ayuso
7386788175 drivers: net: use flow action infrastructure
This patch updates drivers to use the new flow action infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 10:38:25 -08:00
Pablo Neira Ayuso
3b1903ef97 flow_offload: add statistics retrieval infrastructure and use it
This patch provides the flow_stats structure that acts as a container for
tc_cls_flower_offload, which we can then use to restore the statistics on
the existing TC actions. Hence, tcf_exts_stats_update() is no longer used
by drivers.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 10:38:25 -08:00
Pablo Neira Ayuso
8f2566225a flow_offload: add flow_rule and flow_match structures and use them
This patch wraps the dissector key and mask - that flower uses to
represent the matching side - around the flow_match structure.

To avoid a follow up patch that would edit the same LoCs in the drivers,
this patch also wraps this new flow match structure around the flow rule
object. This new structure will also contain the flow actions in follow
up patches.

This introduces two new interfaces:

	bool flow_rule_match_key(rule, dissector_id)

that returns true if a given matching key is set, and:

	flow_rule_match_XYZ(rule, &match);

To fetch the matching side XYZ into the match container structure, to
retrieve the key and the mask with one single call.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 10:38:25 -08:00
Jakub Kicinski
bff5731d43 net: devlink: report cell size of shared buffers
Shared buffer allocation is usually done in cell increments.
Drivers will either round up the allocation or refuse the
configuration if it's not an exact multiple of cell size.
Drivers know exactly the cell size of shared buffer, so help
out users by providing this information in dumps.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:25:34 -08:00
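The two driver behaviours that bff5731d43 mentions, rounding up to the cell size or refusing inexact sizes, can be sketched as below. The function names are hypothetical, and the sketch assumes a non-zero cell size.

```c
#include <stdint.h>

/* Round a requested size up to the next multiple of cell (cell must be > 0). */
static uint64_t sb_round_up(uint32_t size, uint32_t cell)
{
	return ((uint64_t)size + cell - 1) / cell * cell;
}

/* Return 1 if size is an exact multiple of cell, 0 otherwise. */
static int sb_is_exact(uint32_t size, uint32_t cell)
{
	return size % cell == 0;
}
```

A user who knows the cell size from the dump can request 128 bytes up front instead of discovering that a request for 100 bytes was rounded up to 128 or rejected.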
David S. Miller
beb73559bf Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-02-01

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) introduce bpf_spin_lock, from Alexei.

2) convert xdp samples to libbpf, from Maciej.

3) skip verifier tests for unsupported program/map types, from Stanislav.

4) powerpc64 JIT support for BTF line info, from Sandipan.

5) assorted fixes, from Valdis, Jesper, Jiong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 20:12:18 -08:00
Jiong Wang
ac7a1717a2 nfp: bpf: complete ALU32 logic shift supports
The following ALU32 logic shift supports are missing:

  BPF_ALU | BPF_LSH | BPF_X
  BPF_ALU | BPF_RSH | BPF_X
  BPF_ALU | BPF_RSH | BPF_K

For BPF_RSH | BPF_K, it could be implemented using NFP direct shift
instruction. For the other BPF_X shifts, NFP indirect shifts sequences need
to be used.

Separate code-gen hook is assigned to each instruction to make the
implementation clear.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-01 18:03:49 -08:00
Jiong Wang
db0a4b3b6b nfp: bpf: correct the behavior for shifts by zero
Shifts by zero do nothing, and should be treated as nops.

Even though the compiler is not supposed to generate such instructions and
manually written assembly is unlikely to have them, they are legal
instructions and have defined behavior.

This patch corrects the existing shift code-gen to make sure shifts do nothing
when the shift amount is zero, except when the instruction is ALU32, for which
the high bits need to be cleared.

For shift amounts bigger than the type size, the NFP JIT back-end already
errors out for immediate shifts, and only the low 5 bits are taken into
account for indirect shifts, which is the same as x86.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-01 18:03:49 -08:00
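The shift-by-zero behaviour described in db0a4b3b6b can be modelled as below. This is a reference sketch of the semantics the commit describes, not the NFP instruction sequence; the function names are hypothetical, and the 64-bit helper assumes the amount is below 64.

```c
#include <stdint.h>

/*
 * 32-bit left shift by a register amount: only the low 5 bits of the
 * amount are used, and the result is zero-extended to 64 bits.
 */
static uint64_t alu32_lsh(uint64_t dst, uint32_t amt)
{
	return (uint32_t)((uint32_t)dst << (amt & 31));
}

/* 64-bit left shift; amt must be below 64. A zero amount is a no-op. */
static uint64_t alu64_lsh(uint64_t dst, uint32_t amt)
{
	return dst << amt;
}
```

A zero shift leaves a 64-bit register untouched, but the ALU32 form still clears the upper half, so it cannot be treated as a pure nop.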
Jakub Kicinski
7c908f467d nfp: devlink: report the running and flashed versions
Report versions of firmware components using the new NSP command.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:30:31 -08:00
Jakub Kicinski
b96588400a nfp: nsp: add support for versions command
Retrieve the FW versions with the new command.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:30:31 -08:00
Jakub Kicinski
937a3e2645 nfp: devlink: report fixed versions
Report information about the hardware.

RFCv2:
 - add defines for board IDs which are likely to be reusable for
   other drivers (Jiri).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:30:31 -08:00
Jakub Kicinski
4adba00839 nfp: devlink: report driver name and serial number
Report the basic info through new devlink info API.

RFCv2:
 - add driver name;
 - align serial to core changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:30:30 -08:00
Gustavo A. R. Silva
ee69804714 nfp: use struct_size() in kzalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
    int stuff;
    struct boo entry[];
};

instance = kzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:12:29 -08:00
David S. Miller
ec7146db15 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-01-29

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Teach verifier dead code removal, this also allows for optimizing /
   removing conditional branches around dead code and to shrink the
   resulting image. Code store constrained architectures like nfp would
   have a hard time doing this at JIT level, from Jakub.

2) Add JMP32 instructions to BPF ISA in order to allow for optimizing
   code generation for 32-bit sub-registers. Evaluation shows that this
   can result in code reduction of ~5-20% compared to 64 bit-only code
   generation. Also add implementation for most JITs, from Jiong.

3) Add support for __int128 types in BTF which is also needed for
   vmlinux's BTF conversion to work, from Yonghong.

4) Add a new command to bpftool in order to dump a list of BPF-related
   parameters from the system or for a specific network device e.g. in
   terms of available prog/map types or helper functions, from Quentin.

5) Add AF_XDP sock_diag interface for querying sockets from user
   space which provides information about the RX/TX/fill/completion
   rings, umem, memory usage etc, from Björn.

6) Add skb context access for skb_shared_info->gso_segs field, from Eric.

7) Add support for testing flow dissector BPF programs by extending
   existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.

8) Split BPF kselftest's test_verifier into various subgroups of tests
   in order to better deal with merge conflicts in this area, from Jakub.

9) Add support for queue/stack manipulations in bpftool, from Stanislav.

10) Document BTF, from Yonghong.

11) Dump supported ELF section names in libbpf on program load
    failure, from Taeung.

12) Silence a false positive compiler warning in verifier's BTF
    handling, from Peter.

13) Fix help string in bpftool's feature probing, from Prashant.

14) Remove duplicate includes in BPF kselftests, from Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28 19:38:33 -08:00
Jiong Wang
461448398a nfp: bpf: implement jitting of JMP32
This patch implements code-gen for new JMP32 instructions on NFP.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-26 13:33:02 -08:00
Jakub Kicinski
9a06927e77 nfp: bpf: support removing dead code
Add a verifier callback to the nfp JIT to remove the instructions
the verifier deemed to be dead.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-23 17:35:32 -08:00
Jakub Kicinski
a32014b351 nfp: bpf: support optimizing dead branches
Verifier will now optimize out branches to dead code, implement
the replace_insn callback to take advantage of that optimization.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-23 17:35:32 -08:00
Jakub Kicinski
e2fc61146a nfp: bpf: save original program length
Instead of passing env->prog->len around, and trying to adjust
for optimized out instructions just save the initial number
of instructions in struct nfp_prog.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-23 17:35:32 -08:00
Jakub Kicinski
91a87a5823 nfp: bpf: split up the skip flag
We fail program loading if jump lands on a skipped instruction.
This is for historical reasons, it used to be that we only skipped
instructions optimized out based on prior context, and therefore
the optimization would be buggy if we jumped directly to such
instruction (because the context would be skipped by the jump).

There are cases where instructions can be skipped without any
context, for example there is no point in generating code for:

	 r0 |= 0

We will also soon support dropping dead code, so make the skip
logic differentiate between "optimized with preceding context"
vs other skip types.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-23 17:35:32 -08:00
Jakub Kicinski
e90287f3aa nfp: bpf: don't use instruction number for jump target
Instruction number is meaningless at code gen phase.  The target
of the instruction is overwritten by nfp_fixup_branches().  The
convention is to put the raw offset in target address as a
placeholder.  See cmp_* functions.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-23 17:35:32 -08:00
John Hurley
20cce88650 nfp: flower: enable MAC address sharing for offloadable devs
A MAC address is not necessarily a unique identifier for a netdev. Drivers
such as Linux bonds, for example, can apply the same MAC address to the
upper layer device and all lower layer devices.

NFP MAC offload for tunnel decap includes port verification for reprs but
also supports the offload of non-repr MAC addresses by assigning 'global'
indexes to these. This means that the FW will not verify the incoming port
of a packet matching this destination MAC.

Modify the MAC offload logic to assign global indexes based on MAC address
instead of net device (as it currently does). Use this to allow multiple
devices to share the same MAC. In other words, if a repr shares its MAC
address with another device then give the offloaded MAC a global index
rather than associate it with an ingress port. Track this so that changes
can be reverted as MACs stop being shared.

Implement this by removing the current list based assignment of global
indexes and replacing it with an rhashtable that maps an offloaded MAC
address to the number of devices sharing it, distributing global indexes
based on this.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-16 15:23:15 -08:00
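The sharing logic in 20cce88650 boils down to counting how many devices use each offloaded MAC. The sketch below illustrates only that counting with a small fixed table; the driver uses an rhashtable, and every name here is hypothetical. It also leaves out the non-repr case, which the commit says always receives a global index.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_MACS 8

struct mac_entry {
	uint8_t addr[6];
	unsigned int users; /* 0 means the slot is free */
};

static struct mac_entry mac_table[SKETCH_MAX_MACS];

static struct mac_entry *mac_find(const uint8_t *addr)
{
	for (int i = 0; i < SKETCH_MAX_MACS; i++)
		if (mac_table[i].users && !memcmp(mac_table[i].addr, addr, 6))
			return &mac_table[i];
	return NULL;
}

/* Register one more user of addr; returns the new count, 0 if table is full. */
static unsigned int mac_get(const uint8_t *addr)
{
	struct mac_entry *e = mac_find(addr);

	if (e)
		return ++e->users;
	for (int i = 0; i < SKETCH_MAX_MACS; i++) {
		if (!mac_table[i].users) {
			memcpy(mac_table[i].addr, addr, 6);
			mac_table[i].users = 1;
			return 1;
		}
	}
	return 0;
}

/* Drop one user of addr; returns the remaining count. */
static unsigned int mac_put(const uint8_t *addr)
{
	struct mac_entry *e = mac_find(addr);

	return e ? --e->users : 0;
}

/* A MAC used by more than one device would get a global index. */
static int mac_is_shared(const uint8_t *addr)
{
	struct mac_entry *e = mac_find(addr);

	return e && e->users > 1;
}
```

When the count drops back to one, the MAC is no longer shared, which corresponds to the commit's point about reverting changes as MACs stop being shared.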
John Hurley
13cf71031d nfp: flower: ensure MAC cleanup on address change
It is possible to receive a MAC address change notification without the
net device being down (e.g. when an OvS bridge is assigned the same MAC as
a port added to it). This means that an offloaded MAC address may not be
removed if its device gets a new address.

Maintain a record of the offloaded MAC addresses for each repr and netdev
assigned a MAC offload index. Use this to delete the (now expired) MAC if
a change of address event occurs. Only handle change address events if the
device is already up - if not then the netdev up event will handle it.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-16 15:23:15 -08:00
John Hurley
05d2bee6bd nfp: flower: add infastructure for non-repr priv data
NFP repr netdevs contain private data that can store per port information.
In certain cases, the NFP driver offloads information from non-repr ports
(e.g. tunnel ports). As the driver does not have control over non-repr
netdevs, it cannot add/track private data directly to the netdev struct.

Add infrastructure to store private information on any non-repr netdev that
is offloaded at a given time. This is used in a following patch to track
offloaded MAC addresses for non-reprs and enable correct housekeeping on
address changes.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-16 15:23:15 -08:00
John Hurley
49402b0b7f nfp: flower: ensure deletion of old offloaded MACs
When a potential tunnel end point goes down then its MAC address should
not be matchable on the NFP.

Implement a delete message for offloaded MACs and call this on net device
down. While at it, remove the actions on register and unregister netdev
events. A MAC should only be offloaded if the device is up. Note that the
netdev notifier will replay any notifications for UP devices on
registration so NFP can still offload ports that exist before the driver
is loaded. Similarly, devices need to go down before they can be
unregistered so removal of offloaded MACs is only required on down events.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-16 15:23:15 -08:00
John Hurley
0115dcc314 nfp: flower: remove list infastructure from MAC offload
Potential MAC destination addresses for tunnel end-points are offloaded to
firmware. This was done by building a list of such MACs and writing to
firmware as blocks of addresses.

Simplify this code by removing the list format and sending a new message
for each offloaded MAC.

This is in preparation for delete MAC messages. There will be one delete
flag per message so we cannot assume that this applies to all addresses
in a list.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-16 15:23:15 -08:00
John Hurley
41da0b5ef3 nfp: flower: ignore offload of VF and PF repr MAC addresses
Currently MAC addresses of all repr netdevs, along with selected non-NFP
controlled netdevs, are offloaded to FW as potential tunnel end-points.
However, the addresses of VF and PF reprs are meaningless outside of
internal communication and it is only those of physical port reprs
required.

Modify the MAC address offload selection code to ignore VF/PF repr devs.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-16 15:23:15 -08:00
John Hurley
f3b975778c nfp: flower: tidy tunnel related private data
Recent additions to the flower app private data have grouped the variables
of a given feature into a struct and added that struct to the main private
data struct.

In keeping with this, move all tunnel related private data to their own
struct. This has no effect on functionality but improves readability and
maintenance of the code.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-16 15:23:15 -08:00
Pieter Jansen van Vuuren
467322e262 nfp: flower: support multiple memory units for filter offloads
Adds support for multiple memory units which are used for filter
offloads. Each filter is assigned a stats id, the MSBs of the id are
used to determine which memory unit the filter should be offloaded
to. The number of available memory units that could be used for filter
offload is obtained from HW. A simple round robin technique is used to
allocate and distribute the ids across memory units.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-16 15:23:14 -08:00
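The id scheme described in 467322e262 can be sketched as below. The bit widths, structure, and function names are assumptions for illustration: the commit only states that the MSBs of the stats id select the memory unit and that ids are distributed round robin. The sketch does not handle index exhaustion.

```c
#include <stdint.h>

/* Hypothetical layout: the top 2 bits of a 32-bit id pick the memory unit. */
#define SKETCH_UNIT_BITS  2u
#define SKETCH_UNIT_SHIFT (32u - SKETCH_UNIT_BITS)

struct id_alloc {
	uint32_t num_units;  /* memory units reported by HW, at most 4 here */
	uint32_t next_unit;  /* round-robin cursor */
	uint32_t next_index; /* index used within each unit on this pass */
};

/* Hand out the next stats id, cycling through the memory units. */
static uint32_t stats_id_alloc(struct id_alloc *a)
{
	uint32_t id = (a->next_unit << SKETCH_UNIT_SHIFT) | a->next_index;

	if (++a->next_unit == a->num_units) {
		a->next_unit = 0;
		a->next_index++;
	}
	return id;
}

/* Recover the memory unit from the MSBs of a stats id. */
static uint32_t stats_id_unit(uint32_t id)
{
	return id >> SKETCH_UNIT_SHIFT;
}
```

With two units, successive allocations alternate between unit 0 and unit 1, so filters are spread evenly across the available memory.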
Fred Lotter
96439889b4 nfp: flower: increase cmesg reply timeout
QA tests report occasional timeouts on REIFY message replies. Profiling
of the two cmesg reply types under burst conditions, with a 12-core host
under heavy cpu and io load (stress --cpu 12 --io 12), show both PHY MTU
change and REIFY replies can exceed the 10ms timeout. The maximum MTU
reply wait under burst is 16ms, while the maximum REIFY wait under 40 VF
burst is 12ms. Using a 4 VF REIFY burst results in an 8ms maximum wait.
A larger VF burst does increase the delay, but not in a linear enough
way to justify a scaled REIFY delay. The worst-case values for MTU and
REIFY appear close enough to justify a common timeout. Pick a
conservative 40ms to make a safer, future-proof common reply timeout. The
delay only affects the failure case.

Change the REIFY timeout mechanism to use wait_event_timeout() instead
of wait_event_interruptible_timeout(), to match the MTU code. In the
current implementation, theoretically, a signal could interrupt the
REIFY waiting period, with a return code of ERESTARTSYS. However, this is
caught under the general timeout error code EIO. I cannot see the benefit
of exposing the REIFY waiting period to signals with such a short delay
(40ms), while the MTU mechanism does not use the same logic. In the absence
of any reply (wakeup() call), both reply types will wake up the task after
the timeout period. The REIFY timeout applies to the entire representor
group being instantiated (e.g. VFs), while the MTU timeout applies to a
single PHY MTU change.

Signed-off-by: Fred Lotter <frederik.lotter@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-16 15:23:14 -08:00
Luis Chamberlain
750afb08ca cross-tree: phase out dma_zalloc_coherent()
We already need to zero out memory for dma_alloc_coherent(), as such
using dma_zalloc_coherent() is superfluous. Phase it out.

This change was generated with the following Coccinelle SmPL patch:

@ replace_dma_zalloc_coherent @
expression dev, size, data, handle, flags;
@@

-dma_zalloc_coherent(dev, size, handle, flags)
+dma_alloc_coherent(dev, size, handle, flags)

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: re-ran the script on the latest tree]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-01-08 07:58:37 -05:00
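Note on 750afb08ca: the effect of the SmPL rule above on a single line of source can be approximated with a textual rename. This is a simplification under stated assumptions: Coccinelle matches on parsed C expressions, while the hypothetical `phase_out_zalloc()` helper below only renames whole identifiers with a regular expression.

```python
import re

# \b on both sides so that longer identifiers that merely contain the
# name are left untouched.
_ZALLOC = re.compile(r"\bdma_zalloc_coherent\b")


def phase_out_zalloc(source: str) -> str:
    """Rename dma_zalloc_coherent to dma_alloc_coherent in C source text."""
    return _ZALLOC.sub("dma_alloc_coherent", source)
```

The arguments are left as they are, mirroring the rule, which changes only the function name.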
David S. Miller
339bbff2d6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-12-21

The following pull-request contains BPF updates for your *net-next* tree.

There is a merge conflict in test_verifier.c. Result looks as follows:

        [...]
        },
        {
                "calls: cross frame pruning",
                .insns = {
                [...]
                .prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
                .errstr_unpriv = "function calls to other bpf functions are allowed for root only",
                .result_unpriv = REJECT,
                .errstr = "!read_ok",
                .result = REJECT,
        },
        {
                "jset: functional",
                .insns = {
        [...]
        {
                "jset: unknown const compare not taken",
                .insns = {
                        BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
                                     BPF_FUNC_get_prandom_u32),
                        BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 1, 1),
                        BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
                        BPF_EXIT_INSN(),
                },
                .prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
                .errstr_unpriv = "!read_ok",
                .result_unpriv = REJECT,
                .errstr = "!read_ok",
                .result = REJECT,
        },
        [...]
        {
                "jset: range",
                .insns = {
                [...]
                },
                .prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
                .result_unpriv = ACCEPT,
                .result = ACCEPT,
        },

The main changes are:

1) Various BTF related improvements in order to get line info
   working. Meaning, verifier will now annotate the corresponding
   BPF C code to the error log, from Martin and Yonghong.

2) Implement support for raw BPF tracepoints in modules, from Matt.

3) Add several improvements to verifier state logic, namely speeding
   up stacksafe check, optimizations for stack state equivalence
   test and safety checks for liveness analysis, from Alexei.

4) Teach verifier to make use of BPF_JSET instruction, add several
   test cases to kselftests and remove nfp specific JSET optimization
   now that verifier has awareness, from Jakub.

5) Improve BPF verifier's slot_type marking logic in order to
   allow more stack slot sharing, from Jiong.

6) Add sk_msg->size member for context access and add set of fixes
   and improvements to make sock_map with kTLS usable with openssl
   based applications, from John.

7) Several cleanups and documentation updates in bpftool as well as
   auto-mount of tracefs for "bpftool prog tracelog" command,
   from Quentin.

8) Include sub-program tags from now on in bpf_prog_info in order to
   have a reliable way for user space to get all tags of the program
   e.g. needed for kallsyms correlation, from Song.

9) Add BTF annotations for cgroup_local_storage BPF maps and
   implement bpf fs pretty print support, from Roman.

10) Fix bpftool in order to allow for cross-compilation, from Ivan.

11) Update of bpftool license to GPLv2-only + BSD-2-Clause in order
    to be compatible with libbfd and allow for Debian packaging,
    from Jakub.

12) Remove an obsolete prog->aux sanitation in dump and get rid of
    version check for prog load, from Daniel.

13) Fix a memory leak in libbpf's line info handling, from Prashant.

14) Fix cpumap's frame alignment for build_skb() so that skb_shared_info
    does not get unaligned, from Jesper.

15) Fix test_progs kselftest to work with older compilers which are less
    smart in optimizing (and thus throwing build error), from Stanislav.

16) Cleanup and simplify AF_XDP socket teardown, from Björn.

17) Fix sk lookup in BPF kselftest's test_sock_addr with regards
    to netns_id argument, from Andrey.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 17:31:36 -08:00
David S. Miller
2be09de7d6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of conflicts, but happily all cases of overlapping
changes, parallel adds, things of that nature.

Thanks to Stephen Rothwell, Saeed Mahameed, and others
for their guidance in these resolutions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 11:53:36 -08:00
Jakub Kicinski
4987eaccd2 nfp: bpf: optimize codegen for JSET with a constant
The top word of the constant can only have bits set if sign
extension set it to all-1, therefore we don't really have to
mask the top half of the register.  We can just OR it into
the result as is.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-20 17:28:29 +01:00
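Note on 4987eaccd2: the argument can be checked with plain integer arithmetic. The sketch below is not the JIT code; the function names are made up for illustration. It compares a reference 64-bit JSET (register AND sign-extended 32-bit immediate, non-zero means taken) with a split version that masks only the low word and ORs the high word in unmodified when the immediate is negative.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend_imm32(imm):
    """Sign-extend a 32-bit immediate to a 64-bit value."""
    imm &= MASK32
    if imm & 0x80000000:
        return imm | (MASK64 ^ MASK32)  # top word becomes all-1
    return imm                          # top word stays all-0


def jset_reference(reg, imm):
    """Branch taken if any bit is set in both reg and the extended constant."""
    return (reg & MASK64 & sign_extend_imm32(imm)) != 0


def jset_split(reg, imm):
    """Same test done per 32-bit half, without masking the top half."""
    lo = reg & MASK32
    hi = (reg >> 32) & MASK32
    result = lo & (imm & MASK32)
    if imm & 0x80000000:
        # Top word of the constant is all-1, so (hi & all-1) == hi.
        result |= hi
    return result != 0
```

When the immediate is non-negative the top word of the constant is zero, so the high half of the register cannot contribute and is skipped entirely.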
Jakub Kicinski
6e774845b3 nfp: bpf: remove the trivial JSET optimization
The verifier will now understand the JSET instruction, so don't
mark the dead branch in the JIT as noop.  We won't generate any
code, anyway.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-20 17:28:28 +01:00
John Hurley
b12c97d45c nfp: flower: fix cb_ident duplicate in indirect block register
Previously the identifier used for indirect block callback registry and
for block rule cb registry (when done via indirect blocks) was the pointer
to the netdev we were interested in receiving updates on. This worked fine
if a single app existed that registered one callback per netdev of
interest. However, if multiple cards are in place and, in turn, multiple
apps, then each app may register the same callback with the same
identifier to both the netdev's indirect block cb list and to a block's cb
list. This can lead to EEXIST errors and/or incorrect cb deletions.

Prevent this conflict by using the app pointer as the identifier for
netdev indirect block cb registry, allowing each app to register a unique
callback per netdev. For block cb registry, the same app may register
multiple cbs to the same block if using TC shared blocks. Instead of the
app, use the pointer to the allocated cb_priv data as the identifier here.
This means that there can be a unique block callback for each app/netdev
combo.

Fixes: 3166dd07a9 ("nfp: flower: offload tunnel decap rules via indirect TC blocks")
Reported-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:34:12 -08:00
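Note on b12c97d45c: the collision can be reproduced with a toy registry. This is a sketch under assumptions: the `CbList` class is hypothetical and simply rejects a second registration of the same (callback, identifier) pair, which is the behaviour the commit message describes as the source of EEXIST.

```python
import errno


class CbList:
    """Hypothetical callback list keyed by (callback, identifier)."""

    def __init__(self):
        self._entries = set()

    def register(self, cb, cb_ident):
        # id() models comparing the identifier by pointer value.
        key = (cb, id(cb_ident))
        if key in self._entries:
            return -errno.EEXIST
        self._entries.add(key)
        return 0

    def unregister(self, cb, cb_ident):
        self._entries.discard((cb, id(cb_ident)))

    def __len__(self):
        return len(self._entries)


def setup_indr_block_cb():
    """Placeholder for the callback every app instance registers."""
```

With the netdev as identifier, two apps (one per card) registering `setup_indr_block_cb` for the same netdev collide. With each app's own pointer as identifier, both registrations succeed and can be removed independently.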
Jakub Kicinski
036b9e7cae nfp: abm: allow to opt-out of RED offload
FW team asks to be able to not support RED even if NIC is capable
of buffering for testing and experimentation.  Add an opt-out flag.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:41:42 -08:00
David S. Miller
addb067983 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-12-11

The following pull-request contains BPF updates for your *net-next* tree.

It has three minor merge conflicts, resolutions:

1) tools/testing/selftests/bpf/test_verifier.c

 Take first chunk with alignment_prevented_execution.

2) net/core/filter.c

  [...]
  case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
  case bpf_ctx_range(struct __sk_buff, wire_len):
        return false;
  [...]

3) include/uapi/linux/bpf.h

  Take the second chunk for the two cases each.

The main changes are:

1) Add support for BPF line info via BTF and extend libbpf as well
   as bpftool's program dump to annotate output with BPF C code to
   facilitate debugging and introspection, from Martin.

2) Add support for BPF_ALU | BPF_ARSH | BPF_{K,X} in interpreter
   and all JIT backends, from Jiong.

3) Improve BPF test coverage on archs with no efficient unaligned
   access by adding an "any alignment" flag to the BPF program load
   to forcefully disable verifier alignment checks, from David.

4) Add a new bpf_prog_test_run_xattr() API to libbpf which allows for
   proper use of BPF_PROG_TEST_RUN with data_out, from Lorenz.

5) Extend tc BPF programs to use a new __sk_buff field called wire_len
   for more accurate accounting of packets going to wire, from Petar.

6) Improve bpftool to allow dumping the trace pipe from it and add
   several improvements in bash completion and map/prog dump,
   from Quentin.

7) Optimize arm64 BPF JIT to always emit movn/movk/movk sequence for
   kernel addresses and add a dedicated BPF JIT backend allocator,
   from Ard.

8) Add a BPF helper function for IR remotes to report mouse movements,
   from Sean.

9) Various cleanups in BPF prog dump e.g. to make UAPI bpf_prog_info
   member naming consistent with existing conventions, from Yonghong
   and Song.

10) Misc cleanups and improvements in allowing to pass interface name
    via cmdline for xdp1 BPF example, from Matteo.

11) Fix a potential segfault in BPF sample loader's kprobes handling,
    from Daniel T.

12) Fix SPDX license in libbpf's README.rst, from Andrey.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10 18:00:43 -08:00
Pieter Jansen van Vuuren
290974d434 nfp: flower: ensure TCP flags can be placed in IPv6 frame
Previously we did not ensure tcp flags have a place to be stored
when using IPv6. We correct this by including IPv6 key layer when
we match tcp flags and the IPv6 key layer has not been included
already.

Fixes: 07e1671cfc ("nfp: flower: refactor shared ip header in match offload")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10 17:45:41 -08:00
David S. Miller
4cc1feeb6f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several conflicts, seemingly all over the place.

I used Stephen Rothwell's sample resolutions for many of these, if not
just to double check my own work, so definitely the credit largely
goes to him.

The NFP conflict consisted of a bug fix (moving operations
past the rhashtable operation) while changing the initial
argument in the function call in the moved code.

The net/dsa/master.c conflict had to do with a bug fix intermixing of
making dsa_master_set_mtu() static with the fixing of the tagging
attribute location.

cls_flower had a conflict because the dup reject fix from Or
overlapped with the addition of port range classification.

__set_phy_supported()'s conflict was relatively easy to resolve
because Andrew fixed it in both trees, so it was just a matter
of taking the net-next copy.  Or at least I think it was :-)

Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup()
intermixed with changes on how the sdif and caller_net are calculated
in these code paths in net-next.

The remaining BPF conflicts were largely about the addition of the
__bpf_md_ptr stuff in 'net' overlapping with adjustments and additions
to the relevant data structure where the MD pointer macros are used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09 21:43:31 -08:00
Jiong Wang
84708c1386 nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
BPF_X support needs indirect shift mode, please see code comments for
details.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07 13:30:48 -08:00
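Note on 84708c1386: for reference, the operation being jitted is a 32-bit arithmetic (sign-preserving) right shift. The sketch below models only the arithmetic, not the NFP instruction sequence or its indirect shift mode. Masking the shift amount to 5 bits is an assumption made here to keep the model total; `alu32_arsh` is a made-up name.

```python
def alu32_arsh(dst, shift):
    """32-bit arithmetic shift right: the sign bit is replicated from the top."""
    dst &= 0xFFFFFFFF
    shift &= 31  # assumption: only the low 5 bits of the amount are used
    # Reinterpret the 32-bit pattern as a signed value.
    signed = dst - (1 << 32) if dst & 0x80000000 else dst
    # Python's >> on negative ints is already arithmetic; truncate to 32 bits.
    return (signed >> shift) & 0xFFFFFFFF
```

For example, shifting 0x80000000 right by 4 gives 0xF8000000, while a logical shift would give 0x08000000.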
Yangtao Li
6f6c74fad8 nfp: convert to DEFINE_SHOW_ATTRIBUTE
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-03 17:33:38 -08:00
Jakub Kicinski
6db3a9dcf0 nfp: report more info when reconfiguration fails
FW reconfiguration timeouts are a common indicator of FW trouble.
To make debugging easier print requested update and control word
when reconfiguration fails.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:45 -08:00
Jakub Kicinski
9571d98775 nfp: add offset to all TLV parsing errors
When troubleshooting incorrect FW capabilities it's useful to know
where the faulty TLV is located.  Add offset to all error messages.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:44 -08:00
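Note on 9571d98775: a minimal illustration of why the offset helps. The TLV layout here (little-endian 16-bit type, 16-bit length, then the value) is hypothetical and is not the FW capability format; only the idea of carrying the offset in every error comes from the commit.

```python
import struct

HDR_LEN = 4  # hypothetical header: u16 type + u16 length


class TlvError(ValueError):
    pass


def parse_tlvs(buf):
    """Return a list of (offset, type, value); every error names the offset."""
    tlvs = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < HDR_LEN:
            raise TlvError(f"truncated TLV header at offset {offset}")
        tlv_type, length = struct.unpack_from("<HH", buf, offset)
        if length > len(buf) - offset - HDR_LEN:
            raise TlvError(
                f"TLV type {tlv_type} at offset {offset} "
                f"overruns buffer (length {length})"
            )
        start = offset + HDR_LEN
        tlvs.append((offset, tlv_type, bytes(buf[start:start + length])))
        offset = start + length
    return tlvs
```

With the offset in the message, a bad entry in the middle of a long capability area can be located without dumping and decoding everything before it.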
Jakub Kicinski
51a6588e8c nfp: add offloads on representors
FW/HW can generally support the standard networking offloads
on representors without any trouble.  Add the ability for FW
to advertise which features should be available on representors.

Because representors are muxed on top of the vNIC we need to listen
on feature changes of their lower devices, and update their features
appropriately.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:44 -08:00
Jakub Kicinski
71844fac1e nfp: add locking around representor changes
Up until now we never needed to hold any networking locks around
representor accesses, we only accessed them when device was
reconfigured (under nfp pf->lock) or on fast path (under RCU).
Now we want to be able to iterate over all representors during
notifications, so make sure representor assignment is done
under RTNL lock.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:44 -08:00
Jakub Kicinski
fbf60e377d nfp: run don't require Qdiscs on representor netdevs
Our representors are software devices built on top of the PF
vNIC, the queuing should only happen at the vNIC netdevice.
Allow representors to run qdisc-less.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:44 -08:00
Jakub Kicinski
9db8bbcb9b nfp: run representor TX locklessly
Our representors are software devices built on top of the PF
vNIC, the only state they have are per-cpu stats, so make
the TX run locklessly.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:44 -08:00
Jakub Kicinski
d7cc825225 nfp: avoid oversized TSO headers with metadata prepend
In preparation for TSO over representors make sure the port id
prepend will always fit in the frame.  The current max header
length is 255, which is ample, so assume worst case scenario
of 8 byte prepend and save ourselves the conditionals.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:44 -08:00
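Note on d7cc825225: the check reduces to one comparison. The two constants come from the commit message (255 byte maximum header length, 8 byte worst-case prepend); the function name and the idea of returning a boolean are illustrative assumptions, not the driver's interface.

```python
MAX_TSO_HDR_LEN = 255  # current max header length, per the commit message
MAX_PREPEND_LEN = 8    # worst-case metadata prepend, per the commit message


def tso_header_fits(hdr_len):
    """True if the headers still fit once the worst-case prepend is added.

    Always budgeting for the full 8 bytes avoids a per-packet conditional
    on whether a prepend is actually present.
    """
    return hdr_len + MAX_PREPEND_LEN <= MAX_TSO_HDR_LEN
```

Under these numbers the largest header that is still accepted is 247 bytes.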
Jakub Kicinski
b54ad0eaad nfp: correct descriptor offsets in presence of metadata
The TSO-related offsets in the descriptor should not include
the length of the prepended metadata.  Adjust them.  Note that
this could not have caused issues in the past as we don't
support TSO with metadata prepend as of this patch.

Signed-off-by: Michael Rapson <michael.rapson@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:44 -08:00
Jakub Kicinski
8b5ddf1e51 nfp: move queue variable init
nd_q is only used at the very end of nfp_net_tx(), there is no need
to initialize it early.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:44 -08:00
Jakub Kicinski
de31049a48 nfp: move temporary variables in nfp_net_tx_complete()
Move temporary variables in scope of the loop in nfp_net_tx_complete(),
and add a temp for txbuf software structure.  This saves us 0.2% of CPU.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:44 -08:00
Jakub Kicinski
9586274967 nfp: copy only the relevant part of the TX descriptor for frags
Chained descriptors for fragments need to duplicate all the descriptor
fields of the skb head, so we copy the descriptor and then modify the
relevant fields.  This is wasteful, because the top half of the descriptor
will get overwritten entirely while the bottom half is not modified at all.
Copy only the bottom half.  This saves us 0.3% of CPU in a GSO test.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:30:44 -08:00
John Hurley
b5f0cf0834 nfp: flower: prevent offload if rhashtable insert fails
For flow offload adds, if the rhash insert code fails, the flow will still
have been offloaded but the reference to it in the driver freed.

Re-order the offload setup calls to ensure that a flow will only be written
to FW if a kernel reference is held and stored in the rhashtable. Remove
this hashtable entry if the offload fails.

Fixes: c01d0efa51 ("nfp: flower: use rhashtable for flow caching")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:24:56 -08:00
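Note on b5f0cf0834: the re-ordering can be sketched as follows. A dict stands in for the rhashtable and `FakeFirmware` for the control channel; both, along with the function name, are hypothetical. The point shown is the order: store the driver's reference first, write to FW second, and undo the insert if the write fails.

```python
import errno


class FakeFirmware:
    """Hypothetical FW endpoint that can be told to fail."""

    def __init__(self, fail=False):
        self.fail = fail
        self.offloaded = []

    def offload(self, flow):
        if self.fail:
            return -errno.EIO
        self.offloaded.append(flow)
        return 0


def add_flow(table, firmware, cookie, flow):
    # 1) Store the driver's reference first; if this step fails, nothing
    #    has been written to FW yet.
    if cookie in table:
        return -errno.EEXIST
    table[cookie] = flow
    # 2) Only now write the flow to FW.
    err = firmware.offload(flow)
    if err:
        # 3) Offload failed: remove the table entry again.
        del table[cookie]
        return err
    return 0
```

Either outcome leaves the table and the FW in agreement: a flow is in both or in neither.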
John Hurley
1166494891 nfp: flower: release metadata on offload failure
Calling nfp_compile_flow_metadata both assigns a stats context and
increments a ref counter on (or allocates) a mask id table entry. These
are released by the nfp_modify_flow_metadata call on flow deletion,
however, if a flow add fails after metadata is set then the flow entry
will be deleted but the metadata assignments leaked.

Add an error path to the flow add offload function to ensure allocated
metadata is released in the event of an offload fail.

Fixes: 81f3ddf254 ("nfp: add control message passing capabilities to flower offloads")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30 13:24:56 -08:00
David S. Miller
4afe60a97b Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-11-26

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Extend BTF to support function call types and improve the BPF
   symbol handling with this info for kallsyms and bpftool program
   dump to make debugging easier, from Martin and Yonghong.

2) Optimize LPM lookups by making longest_prefix_match() handle
   multiple bytes at a time, from Eric.

3) Adds support for loading and attaching flow dissector BPF progs
   from bpftool, from Stanislav.

4) Extend the sk_lookup() helper to be supported from XDP, from Nitin.

5) Enable verifier to support narrow context loads with offset > 0
   to adapt to LLVM code generation (currently only offset of 0 was
   supported). Add test cases as well, from Andrey.

6) Simplify passing device functions for offloaded BPF progs by
   adding callbacks to bpf_prog_offload_ops instead of ndo_bpf.
   Also convert nfp and netdevsim to make use of them, from Quentin.

7) Add support for sock_ops based BPF programs to send events to
   the perf ring-buffer through perf_event_output helper, from
   Sowmini and Daniel.

8) Add read / write support for skb->tstamp from tc BPF and cg BPF
   programs to allow for supporting rate-limiting in EDT qdiscs
   like fq from BPF side, from Vlad.

9) Extend libbpf API to support map in map types and add test cases
   for it as well to BPF kselftests, from Nikita.

10) Account the maximum packet offset accessed by a BPF program in
    the verifier and use it for optimizing nfp JIT, from Jiong.

11) Fix error handling regarding kprobe_events in BPF sample loader,
    from Daniel T.

12) Add support for queue and stack map type in bpftool, from David.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-26 13:08:17 -08:00
Jakub Kicinski
340a4864d5 nfp: abm: add support for more threshold actions
Original FW only allowed us to perform ECN marking.  Newer releases
also support plain old drop.  Add the ability to configure drop
policy.  This is particularly useful in combination with GRED,
because different bands can have different ECN marking setting.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
174ab544e3 nfp: abm: add cls_u32 offload for simple band classification
Use offload of very simple u32 filters to direct packets to GRED
bands based on the DSCP marking.  No u32 hashing is supported,
just plain simple filters matching on ToS or Priority with
a mask the device can support.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
6a80240571 nfp: abm: add functions to update DSCP -> virtual queue map
Learn how to set the DSCP map.  FW uses a packed array whose
geometry depends on the number of supported priorities and
virtual queues.  Write code to assemble this map and to communicate
the setting to the FW via mailbox.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
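Note on 6a80240571: to illustrate what "geometry depends on the number of supported priorities and virtual queues" can mean for a packed array, here is a sketch with an invented layout. The field width, the 32-bit word size and the bit ordering are all assumptions; the real layout is defined by the FW and is not reproduced here.

```python
def prio_map_geometry(num_prios, num_bands):
    """Return (bits per entry, entries per 32-bit word, number of words)."""
    bits = max(1, (num_bands - 1).bit_length())  # enough bits to name a band
    per_word = 32 // bits
    words = -(-num_prios // per_word)            # ceiling division
    return bits, per_word, words


def pack_prio_map(prio_to_band, num_bands):
    """Pack a priority -> band list into 32-bit words, lowest prio first."""
    bits, per_word, words = prio_map_geometry(len(prio_to_band), num_bands)
    out = [0] * words
    for prio, band in enumerate(prio_to_band):
        if not 0 <= band < num_bands:
            raise ValueError(f"band {band} out of range for prio {prio}")
        out[prio // per_word] |= band << ((prio % per_word) * bits)
    return out
```

In this invented layout, 64 priorities over 4 bands need 2 bits per entry, so 16 entries fit in a word and the whole map takes 4 words. Knowing the length up front is what allows a size check against the mailbox before anything is written.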
Jakub Kicinski
14780c3429 nfp: abm: calculate PRIO map len and check mailbox size
In preparation for PRIO offload calculate how long the prio map
for FW will be and make sure the configuration can be performed
via the vNIC mailbox.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
f3d6372064 nfp: abm: add GRED offload
Add support for GRED offload.  It behaves much like RED, but
can apply different parameters to different bands.  GRED operates
pretty much exactly like our HW/FW with a single FIFO and different
RED state instances.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
990b50a53a nfp: abm: wrap RED parameters in bands
Wrap RED parameters and stats into a structure, and a 1-element
array.  Upcoming GRED offload will add the support for more bands.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
184ec856ca nfp: abm: add up bands for sto/non-sto stats
Add up stats for all bands for the extra ethtool statistics.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:45 -08:00
Jakub Kicinski
57f31bbaa9 nfp: abm: switch to extended stats for reading packet/byte counts
In PRIO-enabled FW read the statistics from per-band symbol, rather
than from the standard per-PCIe-queue counters.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:45 -08:00
Jakub Kicinski
68e9864221 nfp: abm: size threshold table to account for bands
Make sure the threshold table is large enough to hold information
for all bands.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:45 -08:00
Jakub Kicinski
5720769609 nfp: abm: pass band parameter to functions
In preparation for per-band RED offload pass band parameter to
functions.  For now it will always be 0.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:45 -08:00
Jakub Kicinski
3a44820591 nfp: abm: map per-band symbols
In preparation for multi-band RED offload, if FW is capable, map
the extended symbols which will allow us to set per-band parameters
and read stats.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:45 -08:00
Jakub Kicinski
bd3b5d462a nfp: abm: restructure Qdisc handling
In preparation of handling more Qdisc types switch to a different
offload strategy.  We have now recreated the Qdisc hierarchy in
the driver.  Every time the hierarchy changes parse it, and update
the configuration of the HW accordingly.

While at it drop the support of pretending that we can instantiate
a single queue on a multi-queue device in HW/FW.  MQ is now required,
and each queue will have its own instance of RED.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
52db4eaca5 nfp: abm: save RED's parameters
Use the new driver Qdisc structure to keep track of parameters
of RED Qdiscs.  This way as the Qdisc moves around in the hierarchy
we will be able to configure the HW appropriately.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
6c5dbda0d4 nfp: abm: reset RED's child based on limit
RED qdisc will replace its child Qdisc with a new FIFO queue if
it is reconfigured and the limit parameter is not 0.

This means that when it's created with limit of 0 it will have no FIFO,
and all packets will be dropped.  If it's changed and limit is specified
it will lose its existing child (implicit graft).  Make sure we mark
RED Qdisc child as NFP_QDISC_UNTRACKED if it's not the expected FIFO.

nfp_abm_qdisc_replace() will return 1 if Qdisc already existed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
6b8417b7e6 nfp: abm: build full Qdisc hierarchy based on graft notifications
Using graft notifications recreate in the driver the full Qdisc
hierarchy.  Keep track of how many times each Qdisc is attached
to the hierarchy to make sure we don't offload Qdiscs which are
attached multiple times (device queues can't be shared).  For
graft events of Qdiscs we don't know to exist, mark the child as
invalid/untracked.

Note that MQ Qdisc doesn't send destruction events reliably when
device is dismantled, so we need to manually clean out the
children otherwise we'd think Qdiscs which are still in use
are getting freed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
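Note on 6b8417b7e6: the bookkeeping described above can be sketched with a small tree model. All names here are hypothetical and the real driver structures differ. The model keeps a use count per Qdisc, updates it on every graft, marks slots that receive an unknown Qdisc as untracked, and treats a Qdisc as offloadable only while it is attached exactly once.

```python
UNTRACKED = object()  # sentinel for a child the driver was never told about


class TrackedQdisc:
    def __init__(self, handle, num_children):
        self.handle = handle
        self.children = [None] * num_children
        self.use_cnt = 0


class QdiscTree:
    def __init__(self):
        self.qdiscs = {}

    def add(self, handle, num_children):
        self.qdiscs[handle] = TrackedQdisc(handle, num_children)

    def graft(self, parent_handle, slot, child_handle):
        parent = self.qdiscs[parent_handle]
        old = parent.children[slot]
        if isinstance(old, TrackedQdisc):
            old.use_cnt -= 1  # previous child leaves this slot
        child = self.qdiscs.get(child_handle)
        if child is None:
            parent.children[slot] = UNTRACKED
        else:
            child.use_cnt += 1
            parent.children[slot] = child

    def offloadable(self, handle):
        # A device queue can't be shared, so exactly one attachment is needed.
        return self.qdiscs[handle].use_cnt == 1
```

Grafting the same RED instance under two MQ slots raises its count to 2 and makes it ineligible; replacing one of those slots brings it back to 1.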
Jakub Kicinski
aee7539c58 nfp: abm: allocate Qdisc child table
To keep track of Qdisc hierarchy allocate a table for children
for each Qdisc.  RED Qdisc can only have one child.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
Jakub Kicinski
1853125889 nfp: abm: remember which Qdisc is root
Keep track of which Qdisc is currently root.  We need to implement
TC_SETUP_ROOT_QDISC handling, and for completeness also clear the
root Qdisc pointer when it's freed.  TC_SETUP_ROOT_QDISC isn't always
sent when device is dismantled.

Remembering the root Qdisc will allow us to build the entire hierarchy
in following patches.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
Jakub Kicinski
4f5681d088 nfp: abm: track all offload-enabled qdiscs
Allocate an object corresponding to any offloaded qdisc we are
informed about by the kernel.  Not only the qdiscs we have a
chance of offloading.

The count of created objects will be used to decide whether
the ethtool TC offload can be disabled, since otherwise we may
miss destroy commands.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
Jakub Kicinski
6666f545e9 nfp: abm: keep track of all RED thresholds
Instead of writing the threshold out when Qdisc is configured
and not remembering it, move to a scheme where we remember all
thresholds.  When configuration changes parse the offloaded
Qdiscs and set thresholds appropriately.

This will help future extensions.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
Jakub Kicinski
08990494e5 nfp: abm: rename qdiscs -> red_qdiscs
Rename qdiscs member to red_qdiscs.  One of following patches will
use the name qdiscs for tracking all qdisc types.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
John Hurley
d4b69bad61 nfp: flower: remove unnecessary code in flow lookup
Recent changes to NFP mean that stats updates from fw to driver no longer
require a flow lookup and (because egdev offload has been removed) the
ingress netdev for a lookup is now always known.

Remove obsolete code in a flow lookup that matches on host context and
that allows for a netdev to be NULL.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-11 09:54:53 -08:00
John Hurley
4f63fde3fc nfp: flower: remove TC egdev offloads
Previously, only tunnel decap rules required egdev registration for
offload in NFP. These are now supported via indirect TC block callbacks.

Remove the egdev code from NFP.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-11 09:54:53 -08:00
John Hurley
3166dd07a9 nfp: flower: offload tunnel decap rules via indirect TC blocks
Previously, TC block tunnel decap rules were only offloaded when a
callback was triggered through registration of the rule's egress device.
This meant that the driver had no access to the ingress netdev and so
could not verify it was the same tunnel type that the rule implied.

Register tunnel devices for indirect TC block offloads in NFP, giving
access to new rules based on the ingress device rather than egress. Use
this to verify the netdev type of VXLAN and Geneve based rules and offload
the rules to HW if applicable.

Tunnel registration is done via a netdev notifier. On notifier
registration, this is triggered for already existing netdevs. This means
that NFP can register for offloads from devices that exist before it is
loaded (filter rules will be replayed from the TC core). Similarly, on
notifier unregister, a call is triggered for each currently active netdev.
This allows the driver to unregister any indirect block callbacks that may
still be active.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-11 09:54:53 -08:00
John Hurley
65b7970edf nfp: flower: increase scope of netdev checking functions
Both the actions and tunnel_conf files contain local functions that check
the type of an input netdev. In preparation for re-use with tunnel offload
via indirect blocks, move these to static inline functions in a header
file.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-11 09:54:53 -08:00
John Hurley
7885b4fc8d nfp: flower: allow non repr netdev offload
Previously the offload functions in NFP assumed that the ingress (or
egress) netdev passed to them was an nfp repr.

Modify the driver to permit the passing of non repr netdevs as the ingress
device for an offload rule candidate. This may include devices such as
tunnels. The driver should then base its offload decision on a combination
of ingress device and egress port for a rule.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-11 09:54:53 -08:00
Quentin Monnet
16a8cb5cff bpf: do not pass netdev to translate() and prepare() offload callbacks
The kernel functions to prepare verifier and translate for offloaded
program retrieve "offload" from "prog", and "netdev" from "offload".
Then both "prog" and "netdev" are passed to the callbacks.

Simplify this by letting the drivers retrieve the net device themselves
from the offload object attached to prog - if they need it at all. There
is currently no need to pass the netdev as an argument to those
functions.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:54 -08:00
Quentin Monnet
a40a26322a bpf: pass prog instead of env to bpf_prog_offload_verifier_prep()
Function bpf_prog_offload_verifier_prep(), called from the kernel BPF
verifier to run a driver-specific callback for preparing for the
verification step for offloaded programs, takes a pointer to a struct
bpf_verifier_env object. However, no driver callback needs the whole
structure at this time: the two drivers supporting this, nfp and
netdevsim, only need a pointer to the struct bpf_prog instance held by
env.

Update the callback accordingly, on kernel side and in these two
drivers.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:54 -08:00
Quentin Monnet
eb9119471e bpf: pass destroy() as a callback and remove its ndo_bpf subcommand
As part of the transition from ndo_bpf() to callbacks attached to struct
bpf_offload_dev for some of the eBPF offload operations, move the
functions related to program destruction to the struct and remove the
subcommand that was used to call them through the NDO.

Remove function __bpf_offload_ndo(), which is no longer used.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:54 -08:00
Quentin Monnet
b07ade27e9 bpf: pass translate() as a callback and remove its ndo_bpf subcommand
As part of the transition from ndo_bpf() to callbacks attached to struct
bpf_offload_dev for some of the eBPF offload operations, move the
functions related to code translation to the struct and remove the
subcommand that was used to call them through the NDO.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:54 -08:00
Quentin Monnet
00db12c3d1 bpf: call verifier_prep from its callback in struct bpf_offload_dev
In a way similar to the change previously brought to the verify_insn
hook and to the finalize callback, switch to the newly added ops in
struct bpf_prog_offload for calling the functions used to prepare driver
verifiers.

Since the dev_ops pointer in struct bpf_prog_offload is no longer used
by any callback, we can now remove it from struct bpf_prog_offload.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:54 -08:00
Quentin Monnet
1385d755cf bpf: pass a struct with offload callbacks to bpf_offload_dev_create()
For passing device functions for offloaded eBPF programs, there used to
be no place to store the pointer without making the non-offloaded
programs pay a memory price.

As a consequence, three functions were called with ndo_bpf() through
specific commands. Now that we have struct bpf_offload_dev, and since
none of those operations rely on RTNL, we can turn these three commands
into hooks inside the struct bpf_prog_offload_ops, and pass them as part
of bpf_offload_dev_create().

This commit effectively passes a pointer to the struct to
bpf_offload_dev_create(). We temporarily have two struct
bpf_prog_offload_ops instances, one under offdev->ops and one under
offload->dev_ops. The next patches will make the transition towards the
former, so that offload->dev_ops can be removed, and callbacks relying
on ndo_bpf() added to offdev->ops as well.

While at it, rename "nfp_bpf_analyzer_ops" as "nfp_bpf_dev_ops" (and
similarly for netdevsim).

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:53 -08:00
Quentin Monnet
1da6f57338 nfp: bpf: move nfp_bpf_analyzer_ops from verifier.c to offload.c
We are about to add several new callbacks to the struct, all of them
defined in offload.c. Move the struct bpf_prog_offload_ops object to
that file. As a consequence, nfp_verify_insn() and nfp_finalize() can no
longer be static.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:53 -08:00
Jakub Kicinski
560f1ba4d8 nfp: use the new __netdev_tx_sent_queue() BQL optimisation
__netdev_tx_sent_queue() was added in commit e59020abf0f
("net: bql: add __netdev_tx_sent_queue()") and allows for
better GSO performance.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-09 19:49:00 -08:00
Jiong Wang
cf599f5031 nfp: bpf: relax prog rejection through max_pkt_offset
NFP is refusing to offload programs whenever the MTU is set to a value
larger than the max packet bytes that fits in NFP Cluster Target Memory
(CTM). However, an eBPF program doesn't always need to access the whole
packet data.

The verifier has always calculated the maximum direct packet access (DPA)
offset and kept it in max_pkt_offset inside the prog auxiliary information.
This patch relaxes prog rejection based on max_pkt_offset.
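The relaxed check can be pictured with a minimal sketch. The function and
parameter names are hypothetical, and taking the smaller of max_pkt_offset
and the MTU is an assumption made for illustration, not a copy of the
driver code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: may a program be offloaded, given the interface MTU,
 * the verifier's maximum direct packet access offset, and the number of
 * packet bytes that fit in CTM?
 */
bool pkt_access_fits_ctm(uint32_t mtu, uint32_t max_pkt_offset,
			 uint32_t ctm_max_bytes)
{
	uint32_t needed;

	assert(mtu > 0); /* a zero MTU is not meaningful in this sketch */

	/* The program can never touch more bytes than the packet holds. */
	needed = max_pkt_offset < mtu ? max_pkt_offset : mtu;

	return needed <= ctm_max_bytes;
}
```

With this shape, a jumbo MTU no longer forces rejection as long as the
program only reads the first few headers.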

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09 09:16:32 +01:00
Jakub Kicinski
6e5a716f42 nfp: abm: refuse RED offload with harddrop set
RED Qdisc will now inform the drivers about the state of the harddrop
flag.  Refuse to offload in case harddrop is set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-08 20:48:01 -08:00
Jakub Kicinski
cae5f48e32 nfp: abm: don't set negative threshold
Turns out the threshold value is used in signed compares in the FW,
so we should avoid setting the top bit.
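As a rough illustration of the constraint, a threshold is safe for a
signed 32-bit compare only if its top bit is clear. The helper name below
is hypothetical and the sketch does not reproduce the driver's actual
validation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: true if the threshold keeps bit 31 clear, so that
 * firmware treating it as a signed 32-bit value never sees a negative number.
 */
bool threshold_is_safe(uint32_t threshold)
{
	return threshold <= (uint32_t)INT32_MAX;
}
```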

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-08 20:48:01 -08:00
Jakub Kicinski
032748acf6 nfp: abm: provide more precise info about offload parameter validation
Improve log messages printed when RED can't be offloaded because
of Qdisc parameters.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-08 20:48:01 -08:00
Jakub Kicinski
83ec8857a0 nfp: parse vNIC TLV capabilities at alloc time
In certain cases initialization logic which follows allocation of
the vNIC structure may want to validate the capabilities of that vNIC.
This is easy before vNIC is initialized for normal capabilities which
are at fixed offsets in control memory, easy to locate and read, but
poses a challenge if the capabilities are in form of TLVs.  Parse
the TLVs early on so other code can just access parsed info, instead
of having to do the parsing by itself.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-08 20:48:00 -08:00
Jakub Kicinski
e38f5d11b9 nfp: pass ctrl_bar pointer to nfp_net_alloc
Move setting ctrl_bar pointer to the nfp_net_alloc function,
to make sure we can parse capabilities early in the following
patch.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-08 20:48:00 -08:00
Jakub Kicinski
47330f9bdf nfp: abm: split qdisc offload code into a separate file
The Qdisc offload code is logically separate, and we will soon
do significant surgery on it to support more Qdiscs, so move
it to a separate file.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-08 20:48:00 -08:00
John Hurley
e963e1097a nfp: flower: include geneve as supported offload tunnel type
Offload of geneve decap rules is supported in NFP. Include geneve in the
check for supported types.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 23:00:23 -08:00
John Hurley
83f27d027d nfp: flower: use geneve and vxlan helpers
Make use of the recently added VXLAN and geneve helper functions to
determine the type of the netdev from its rtnl_link_ops.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 23:00:23 -08:00
Jakub Kicinski
0c665e2bf4 nfp: flower: use the common netdev notifier
Use driver's common notifier for LAG and tunnel configuration.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Jakub Kicinski
3e33359040 nfp: register a notifier handler in a central location for the device
Code interested in networking events registers its own notifier
handlers.  Create one device-wide notifier instance.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Jakub Kicinski
659bb404eb nfp: flower: make nfp_fl_lag_changels_event() void
nfp_fl_lag_changels_event() never fails, and therefore we would
never return NOTIFY_BAD for NETDEV_CHANGELOWERSTATE.  Make this
clearer by changing nfp_fl_lag_changels_event()'s return type
to void.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Jakub Kicinski
a558c982a8 nfp: flower: don't try to nack device unregister events
Returning an error from a notifier means we want to veto the change.
We shouldn't veto NETDEV_UNREGISTER just because we couldn't find
the tracking info for given master.

I can't seem to find a way to trigger this unless we have some
other bug, so it's probably not fix-worthy.

While at it move the checking if the netdev really is of interest
into the handling functions, like we do for other events.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Jakub Kicinski
e50bfdf74d nfp: flower: remove unnecessary iteration over devices
For flower tunnel offloads FW has to be informed about MAC addresses
of tunnel devices.  We use a netdev notifier to keep track of these
addresses.

Remove unnecessary loop over netdevices after notifier is registered.
The intention of the loop was to catch devices which already existed
on the system before nfp driver got loaded, but netdev notifier will
replay NETDEV_REGISTER events.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Pieter Jansen van Vuuren
4234d62c27 nfp: flower: add ipv6 set flow label and hop limit offload
Add ipv6 set flow label and hop limit action offload. Since pedit sets
headers per 4 byte word, we need to ensure that setting either version,
priority, payload_len or nexthdr does not get offloaded.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:21 -08:00
Pieter Jansen van Vuuren
a3c6b063fe nfp: flower: add ipv4 set ttl and tos offload
Add ipv4 set ttl and tos action offload. Since pedit sets headers per 4
byte word, we need to ensure that setting either version, ihl, protocol,
total length or checksum does not get offloaded.
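Because an edit covers a whole 4-byte word, the driver has to look at
which bytes of that word are actually touched. A minimal sketch of such a
check follows; the function name and the "touched bytes" bitmask are
assumptions for illustration (pedit itself describes edits with a mask and
value pair), and only the two words that hold tos and ttl are covered:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check for one 4-byte word of the IPv4 header.
 * word_off: byte offset of the word in the header.
 * touched:  bit i set means byte i of the word (in network order) is modified.
 */
bool ipv4_word_edit_ok(unsigned int word_off, unsigned int touched)
{
	assert(word_off % 4 == 0);
	assert(touched < 16u);

	switch (word_off) {
	case 0:	/* version/ihl | tos | total length (2 bytes) */
		return (touched & ~(1u << 1)) == 0;
	case 8:	/* ttl | protocol | checksum (2 bytes) */
		return (touched & ~(1u << 0)) == 0;
	default: /* other header words are outside the scope of this sketch */
		return false;
	}
}
```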

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:21 -08:00
David S. Miller
a19c59cc10 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-10-21

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Implement two new kinds of BPF maps, that is, queue and stack
   map along with new peek, push and pop operations, from Mauricio.

2) Add support for MSG_PEEK flag when redirecting into an ingress
   psock sk_msg queue, and add a new helper bpf_msg_push_data() for
   inserting data into the message, from John.

3) Allow for BPF programs of type BPF_PROG_TYPE_CGROUP_SKB to use
   direct packet access for __sk_buff, from Song.

4) Use more lightweight barriers for walking perf ring buffer for
   libbpf and perf tool as well. Also, various fixes and improvements
   from verifier side, from Daniel.

5) Add per-symbol visibility for DSO in libbpf and hide by default
   global symbols such as netlink related functions, from Andrey.

6) Two improvements to nfp's BPF offload to check vNIC capabilities
   in case prog is shared with multiple vNICs and to protect against
   mis-initializing atomic counters, from Jakub.

7) Fix for bpftool to use 4 context mode for the nfp disassembler,
   also from Jakub.

8) Fix a return value comparison in test_libbpf.sh and add several
   bpftool improvements in bash completion, documentation of bpf fs
   restrictions and batch mode summary print, from Quentin.

9) Fix a file resource leak in BPF selftest's load_kallsyms()
   helper, from Peng.

10) Fix an unused variable warning in map_lookup_and_delete_elem(),
    from Alexei.

11) Fix bpf_skb_adjust_room() signature in BPF UAPI helper doc,
    from Nicolas.

12) Add missing executables to .gitignore in BPF selftests, from Anders.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-21 21:11:46 -07:00
David S. Miller
2e2d6f0342 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.

net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump) in 'net-next'.  Thanks to David
Ahern for the heads up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-19 11:03:06 -07:00
Ido Schimmel
5ff4ff4fe8 net: Add netif_is_vxlan()
Add the ability to determine whether a netdev is a VxLAN netdev by
calling the above mentioned function that checks the netdev's
rtnl_link_ops.

This will allow modules to identify netdev events involving a VxLAN
netdev and act accordingly. For example, drivers capable of VxLAN
offload will need to configure the underlying device when a VxLAN netdev
is being enslaved to an offloaded bridge.

Convert nfp to use the newly introduced helper.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17 17:45:07 -07:00
Jakub Kicinski
44b6fed0c1 nfp: bpf: double check vNIC capabilities after object sharing
Program translation stage checks that program can be offloaded to
the netdev which was passed during the load (bpf_attr->prog_ifindex).
After program sharing was introduced, however, the netdev on which
program is loaded can theoretically be different, and therefore
we should recheck the program size and max stack size at load time.

This was found by code inspection, AFAIK today all vNICs have
identical caps.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-16 15:23:58 -07:00
Jakub Kicinski
527db74b71 nfp: bpf: protect against mis-initializing atomic counters
Atomic operations on the NFP are currently always in big endian.
The driver keeps track of regions of memory storing atomic values
and byte swaps them accordingly.  There are corner cases where
the map values may be initialized before the driver knows they
are used as atomic counters.  This can happen either when the
datapath is performing the update and the stack contents are
unknown or when map is updated before the program which will
use it for atomic values is loaded.

To avoid a situation where the user initializes the value to 0 1 2 3
and then, after loading a program which uses the word as an atomic
counter, starts reading 3 2 1 0 - only allow atomic counters to be
initialized to endian-neutral values.

For updates from the datapath the stack information may not be
as precise, so just allow initializing such values to 0.

Example code which would break:
struct bpf_map_def SEC("maps") rxcnt = {
       .type = BPF_MAP_TYPE_HASH,
       .key_size = sizeof(__u32),
       .value_size = sizeof(__u64),
       .max_entries = 1,
};

int xdp_prog1()
{
	__u64 nonzeroval = 3;
	__u32 key = 0;
	__u64 *value;

	value = bpf_map_lookup_elem(&rxcnt, &key);
	if (!value)
		bpf_map_update_elem(&rxcnt, &key, &nonzeroval, BPF_ANY);
	else
		__sync_fetch_and_add(value, 1);

	return XDP_PASS;
}

$ offload bpftool map dump
key: 00 00 00 00 value: 00 00 00 03 00 00 00 00

should be:

$ offload bpftool map dump
key: 00 00 00 00 value: 03 00 00 00 00 00 00 00
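An "endian-neutral" value is one that reads the same after a byte swap.
A minimal sketch of such a test on a 32-bit word is shown below; the
helper names are hypothetical and the word size is an assumption made for
illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit word. */
uint32_t sketch_byteswap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical check: a word is endian-neutral if swapping its bytes
 * leaves it unchanged, e.g. 0 or 0xffffffff, but not 3.
 */
bool is_endian_neutral32(uint32_t v)
{
	return v == sketch_byteswap32(v);
}
```

The value 3 from the example program fails this test, which is why it
shows up in the wrong byte position in the map dump.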

Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-16 15:23:58 -07:00
Pieter Jansen van Vuuren
140b6abac2 nfp: flower: use offsets provided by pedit instead of index for ipv6
Previously when populating the set ipv6 address action, we incorrectly
made use of pedit's key index to determine which 32bit word should be
set. We now calculate which word has been selected based on the offset
provided by the pedit action.
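Deriving the word from the offset can be sketched as follows. The
constants reflect the standard IPv6 header layout (source address at byte
8, destination address at byte 24); the function and enum names are
hypothetical and this is not the driver's code:

```c
#include <assert.h>

enum {
	SKETCH_IPV6_SADDR_OFF = 8,	/* source address starts at byte 8 */
	SKETCH_IPV6_DADDR_OFF = 24,	/* destination address starts at byte 24 */
	SKETCH_IPV6_ADDR_LEN  = 16,
};

/* Hypothetical helper: map a header byte offset to the index (0..3) of the
 * 32-bit word inside the address starting at addr_base, or -1 if the offset
 * does not fall on a word of that address.
 */
int ipv6_addr_word(unsigned int off, unsigned int addr_base)
{
	assert(addr_base == SKETCH_IPV6_SADDR_OFF ||
	       addr_base == SKETCH_IPV6_DADDR_OFF);

	if (off < addr_base || off >= addr_base + SKETCH_IPV6_ADDR_LEN ||
	    off % 4)
		return -1;

	return (int)((off - addr_base) / 4);
}
```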

Fixes: 354b82bb32 ("nfp: add set ipv6 source and destination address")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 23:17:25 -07:00
Pieter Jansen van Vuuren
d08c9e5893 nfp: flower: fix multiple keys per pedit action
Previously we only allowed a single header key per pedit action to
change the header. This used to result in the last header key in the
pedit action to overwrite previous headers. We now keep track of them
and allow multiple header keys per pedit action.

Fixes: c0b1bd9a8b ("nfp: add set ipv4 header action flower offload")
Fixes: 354b82bb32 ("nfp: add set ipv6 source and destination address")
Fixes: f8b7b0a6b1 ("nfp: add set tcp and udp header action flower offload")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 23:17:24 -07:00
Pieter Jansen van Vuuren
8913806f16 nfp: flower: fix pedit set actions for multiple partial masks
Previously we did not correctly change headers when using multiple
pedit actions with partial masks. We now take this into account and
no longer just commit the last pedit action.

Fixes: c0b1bd9a8b ("nfp: add set ipv4 header action flower offload")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 23:17:24 -07:00
Ryan C Goodfellow
5948185b97 nfp: devlink port split support for 1x100G CXP NIC
This commit makes it possible to use devlink to split the 100G CXP
Netronome into two 40G interfaces. Currently when you ask for 2
interfaces, the math in src/nfp_devlink.c:nfp_devlink_port_split
calculates that you want 5 lanes per port because for some reason
eth_port.port_lanes=10 (shouldn't this be 12 for CXP?). What we really
want when asking for 2 breakout interfaces is 4 lanes per port. This
commit makes that happen by calculating based on 8 lanes if 10 are
present.
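The arithmetic described above can be sketched like this. The function
name is hypothetical, and applying the substitution for every split count
is a simplification of this sketch rather than a statement about the
driver:

```c
#include <assert.h>

/* Hypothetical helper: lanes each port gets after a split.
 * A port reporting 10 lanes is treated as having 8 usable lanes,
 * so a 2-way split yields 4 lanes per port instead of 5.
 */
unsigned int lanes_per_split_port(unsigned int port_lanes, unsigned int count)
{
	assert(count > 0);

	if (port_lanes == 10)
		port_lanes = 8;

	return port_lanes / count;
}
```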

Signed-off-by: Ryan C Goodfellow <rgoodfel@isi.edu>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Greg Weeks <greg.weeks@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:29:55 -07:00
Jakub Kicinski
96de25060d nfp: replace long license headers with SPDX
Replace the repeated license text with SPDX identifiers.
While at it bump the Copyright dates for files we touched
this year.

Signed-off-by: Edwin Peer <edwin.peer@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 12:16:21 -07:00
Pieter Jansen van Vuuren
12ecf61529 nfp: flower: use host context count provided by firmware
Read the host context count symbols provided by firmware and use
it to determine the number of allocated stats ids. Previously it
was not possible to offload more than 2^17 filters even if the FW was
able to do so.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:32:44 -07:00
Pieter Jansen van Vuuren
7fade1077c nfp: flower: use stats array instead of storing stats per flow
Make use of a stats array instead of storing stats per flow, which
would require a hash lookup at critical times.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:32:44 -07:00
Pieter Jansen van Vuuren
c01d0efa51 nfp: flower: use rhashtable for flow caching
Make use of relativistic hash tables for tracking flows instead
of fixed sized hash tables.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:32:44 -07:00
David S. Miller
071a234ad7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-10-08

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) sk_lookup_[tcp|udp] and sk_release helpers from Joe Stringer which allow
BPF programs to perform lookups for sockets in a network namespace. This would
allow programs to determine early on in processing whether the stack is
expecting to receive the packet, and perform some action (eg drop,
forward somewhere) based on this information.

2) per-cpu cgroup local storage from Roman Gushchin.
Per-cpu cgroup local storage is very similar to simple cgroup storage
except all the data is per-cpu. The main goal of per-cpu variant is to
implement super fast counters (e.g. packet counters), which require
neither lookups nor atomic operations in a fast path.
The example of these hybrid counters is in selftests/bpf/netcnt_prog.c

3) allow HW offload of programs with BPF-to-BPF function calls from Quentin Monnet

4) support more than 64-byte key/value in HW offloaded BPF maps from Jakub Kicinski

5) rename of libbpf interfaces from Andrey Ignatov.
libbpf is maturing as a library and should follow good practices in
library design and implementation to play well with other libraries.
This patch set brings consistent naming convention to global symbols.

6) relicense libbpf as LGPL-2.1 OR BSD-2-Clause from Alexei Starovoitov
to let Apache2 projects use libbpf

7) various AF_XDP fixes from Björn and Magnus
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 23:42:44 -07:00
Quentin Monnet
7ff0ccde43 nfp: bpf: support pointers to other stack frames for BPF-to-BPF calls
Mark instructions that use pointers to areas in the stack outside of the
current stack frame, and process them accordingly in mem_op_stack().
This way, we also support BPF-to-BPF calls where the caller passes a
pointer to data in its own stack frame to the callee (typically, when
the caller passes an address to one of its local variables located in
the stack, as an argument).

Thanks to Jakub and Jiong for figuring out how to deal with this case,
I just had to turn their email discussion into this patch.

Suggested-by: Jiong Wang <jiong.wang@netronome.com>
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
4454962314 nfp: bpf: optimise save/restore for R6~R9 based on register usage
When pre-processing the instructions, it is trivial to detect what
subprograms are using R6, R7, R8 or R9 as destination registers. If a
subprogram uses none of those, then we do not need to jump to the
subroutines dedicated to saving and restoring callee-saved registers in
its prologue and epilogue.

This patch introduces detection of callee-saved registers in subprograms
and prevents the JIT from adding calls to those subroutines whenever we
can: we save some instructions in the translated program, and some time
at runtime on BPF-to-BPF calls and returns.

If no subprogram needs to save those registers, we can avoid appending
the subroutines at the end of the program.
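The detection step can be sketched as a scan over the destination
registers of a subprogram's instructions. The function name and the flat
array of register numbers are assumptions for illustration; the driver
works on its own instruction metadata:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: does a subprogram write to any callee-saved
 * register (R6 to R9)? dst_regs holds the destination register number of
 * each instruction in the subprogram.
 */
bool subprog_needs_reg_push(const uint8_t *dst_regs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (dst_regs[i] >= 6 && dst_regs[i] <= 9)
			return true;

	return false;
}
```

A subprogram for which this returns false can skip the jumps to the
save and restore subroutines.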

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
2178f3f0dc nfp: bpf: fix return address from register-saving subroutine to callee
On performing a BPF-to-BPF call, we first jump to a subroutine that
pushes callee-saved registers (R6~R9) to the stack, and from there we
go to the start of the callee. In order to do so, the caller must
pass to the subroutine the address of the NFP instruction to jump to at
the end of that subroutine. This cannot be reliably implemented when
translating the caller, as we do not always know the start offset of the
callee yet.

This patch implements the required fixup step for passing the start
offset of the callee via the register used by the subroutine to hold its
return address.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
bdf4c66faf nfp: bpf: update fixup function for BPF-to-BPF calls support
Relocations for targets of BPF-to-BPF calls are required at the end of
translation. Update the nfp_fixup_branches() function in that regard.

When checking that the last instruction of each block is a branch, we
must account for the length of the instructions required to pop the
return address from the stack.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
fb19816541 nfp: bpf: account for additional stack usage when checking stack limit
Offloaded programs using BPF-to-BPF calls use the stack to store the
return address when calling into a subprogram. Callees also need some
space to save eBPF registers R6 to R9. And contrary to the kernel
verifier, we align stack frames on 64 bytes (and not 32). Account for
all this when checking the stack size limit before JIT-ing the program.
This means we have to recompute the maximum stack usage for the program,
as we cannot get the value from the kernel.
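
A rough Python sketch of such a recomputation follows. Only the 64-byte
alignment and the need for room for R6 to R9 and the return address come
from the description above; the byte sizes (8 per register, 8 for the
return address), the choice to charge them to the callee's frame, and the
call-graph representation are assumptions made for illustration.

```python
FRAME_ALIGN = 64     # alignment stated in the commit message
REG_SIZE = 8         # assumption: bytes per saved eBPF register
CALLEE_SAVED = 4     # R6 to R9
RET_ADDR_SIZE = 8    # assumption: bytes used to store the return address


def round_up(value, align):
    return (value + align - 1) // align * align


def max_stack_usage(depths, calls, idx=0):
    """depths[i]: verifier stack depth of subprogram i (0 is the main program).
    calls[i]: subprograms called from i. Returns the deepest chain in bytes."""
    # Assumption: only non-main frames pay for the return address and R6-R9.
    extra = 0 if idx == 0 else RET_ADDR_SIZE + CALLEE_SAVED * REG_SIZE
    own = round_up(depths[idx] + extra, FRAME_ALIGN)
    deepest = max(
        (max_stack_usage(depths, calls, callee) for callee in calls.get(idx, ())),
        default=0,
    )
    return own + deepest
```

The result would then be compared against the stack size available on
the device. The recursion terminates because BPF programs cannot contain
recursive calls, so the call graph is acyclic.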

In addition to adapting the checks on stack usage, move them to the
finalize() callback, now that we have it and because such checks are
part of the verification step rather than translation.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
389f263b60 nfp: bpf: add main logics for BPF-to-BPF calls support in nfp driver
This is the main patch for the logic of BPF-to-BPF calls in the nfp
driver.

The functions called on BPF_JUMP | BPF_CALL and BPF_JUMP | BPF_EXIT were
used to call helpers and exit from the program, respectively; make them
usable for calling into, or returning from, a BPF subprogram as well.

For all calls, push the return address as well as the callee-saved
registers (R6 to R9) to the stack, and pop them upon returning from the
calls. In order to limit the overhead in terms of instruction count,
this is done through dedicated subroutines. Jumping to the callee
actually consists in jumping to the subroutine, which "returns" to the
callee: this will require some fixup for passing the address in a later
patch. Similarly, returning consists in jumping to the subroutine, which
pops registers and then returns directly to the caller (but no fixup is
needed here).
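
The push/pop behaviour can be modelled with a toy Python sketch. This is
not NFP code: the register dictionary, the stack list and the order in
which items are pushed are assumptions chosen for readability.

```python
CALLEE_SAVED = ("r6", "r7", "r8", "r9")


def call(regs, stack, ret_addr, callee_start):
    """Model the 'push' subroutine: save state, then 'return' into the callee."""
    stack.append(ret_addr)
    stack.append({reg: regs[reg] for reg in CALLEE_SAVED})
    return callee_start   # next program counter


def ret(regs, stack):
    """Model the 'pop' subroutine: restore state, then go back to the caller."""
    regs.update(stack.pop())   # restore R6 to R9
    return stack.pop()         # next program counter: the saved return address
```

Registers outside R6 to R9 (R0, for instance, which carries the return
value) are left untouched by both helpers, so the callee can clobber the
saved registers freely and still hand a result back.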

Return to the caller is performed with the RTN instruction newly added
to the JIT.

For the few steps where we need to know what subprogram an instruction
belongs to, the struct nfp_insn_meta is extended with a new subprog_idx
field.

Note that checks on the available stack size, to take into account the
additional requirements associated with BPF-to-BPF calls (storing R6-R9
and return addresses), are added in a later patch.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
e3b49dc69b nfp: bpf: account for BPF-to-BPF calls when preparing nfp JIT
Similarly to "exit" or "helper call" instructions, BPF-to-BPF calls will
require additional processing before translation starts, in order to
record and mark jump destinations.

We also mark the instructions where each subprogram begins. This will be
used in a following commit to determine where to add prologues for
subprograms.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
bcfdfb7c96 nfp: bpf: ignore helper-related checks for BPF calls in nfp verifier
The checks related to eBPF helper calls are performed each time the nfp
driver meets a BPF_JUMP | BPF_CALL instruction. However, these checks
are not relevant for BPF-to-BPF calls (same instruction code, different
value in source register), so just skip the checks for such calls.
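
For illustration, the distinction can be expressed as below in Python.
The numeric values match the kernel UAPI header linux/bpf.h, where the
instruction class is spelled BPF_JMP; the function names are
hypothetical and do not correspond to the driver's helpers.

```python
BPF_JMP = 0x05           # instruction class
BPF_CALL = 0x80          # call operation within the JMP class
BPF_PSEUDO_CALL = 1      # src_reg value marking a BPF-to-BPF call


def is_call(code):
    return code == (BPF_JMP | BPF_CALL)


def is_pseudo_call(code, src_reg):
    """True for a call into another BPF subprogram."""
    return is_call(code) and src_reg == BPF_PSEUDO_CALL


def needs_helper_checks(code, src_reg):
    """Helper-related checks only make sense for genuine helper calls."""
    return is_call(code) and src_reg != BPF_PSEUDO_CALL
```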

While at it, rename the function that runs those checks to make it clear
they apply to _helper_ calls only.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:12 +02:00
Quentin Monnet
c5da54d93e nfp: bpf: copy eBPF subprograms information from kernel verifier
In order to support BPF-to-BPF calls in offloaded programs, the nfp
driver must collect information about the distinct subprograms: namely,
the number of subprograms composing the complete program and the stack
depth of those subprograms. The latter in particular is non-trivial to
collect, so we copy those elements from the kernel verifier via the
newly added post-verification hook. The struct nfp_prog is extended to
store this information. Stack depths are stored in an array of dedicated
structs.

Subprogram start indexes are not collected. Instead, meta instructions
associated with the start of a subprogram will be marked with a flag in
a later patch.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:12 +02:00
Quentin Monnet
1a7e62e632 nfp: bpf: rename nfp_prog->stack_depth as nfp_prog->stack_frame_depth
In preparation for support for BPF to BPF calls in offloaded programs,
rename the "stack_depth" field of the struct nfp_prog as
"stack_frame_depth". This is to make it clear that the field refers to
the maximum size of the current stack frame (as opposed to the maximum
size of the whole stack memory).

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:12 +02:00
Quentin Monnet
c941ce9c28 bpf: add verifier callback to get stack usage info for offloaded progs
In preparation for BPF-to-BPF calls in offloaded programs, add a new
function attribute to the struct bpf_prog_offload_ops so that drivers
supporting eBPF offload can hook at the end of program verification, and
potentially extract information collected by the verifier.

Implement a minimal callback (returning 0) in the drivers providing the
structs, namely netdevsim and nfp.

This will be useful in the nfp driver, in later commits, to extract the
number of subprograms as well as the stack depth for those subprograms.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:12 +02:00
David S. Miller
9e50727f0e mlx5-updates-2018-10-03
Merge tag 'mlx5-updates-2018-10-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2018-10-03

mlx5 core driver and ethernet netdev updates, please note there is a small
devlink related update to allow extack argument to eswitch operations.

From Eli Britstein,
1) devlink: Add extack argument to the eswitch related operations
2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks
3) net/mlx5e: Add extack messages for TC offload failures

From Eran Ben Elisha,
4) mlx5e: Add counter for aRFS rule insertion failures

From Feras Daoud
5) Fast teardown support for mlx5 device
This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the FW pages.
Fast teardown provides the following advantages:
    1- Fix a FW race condition that could cause command timeout
    2- Avoid moving to polling mode
    3- Close the vport to prevent PCI ACK to be sent without being
    scattered to memory
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 09:48:37 -07:00
David S. Miller
6f41617bf2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03 21:00:17 -07:00
Eli Britstein
db7ff19e7b devlink: Add extack for eswitch operations
Add extack argument to the eswitch related operations.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-03 16:17:58 -07:00
Jakub Kicinski
ff58e2df62 nfp: avoid soft lockups under control message storm
When FW floods the driver with control messages, try to exit the cmsg
processing loop every now and then to avoid soft lockups.  Cmsg
processing is generally very lightweight so 512 seems like a reasonable
budget, which should not be exceeded under normal conditions.
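
The budget idea, sketched in Python under the assumption of a simple
queue and a caller that reschedules the handler when work remains (the
function and parameter names are hypothetical):

```python
from collections import deque

CMSG_BUDGET = 512   # budget mentioned in the commit message


def process_ctrl_msgs(queue, handle, budget=CMSG_BUDGET):
    """Handle at most `budget` messages; return True if more work remains."""
    done = 0
    while queue and done < budget:
        handle(queue.popleft())
        done += 1
    # The caller is expected to reschedule itself when this returns True,
    # which gives other work a chance to run in between.
    return bool(queue)
```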

Fixes: 77ece8d5f1 ("nfp: add control vNIC datapath")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 11:47:58 -07:00
Jakub Kicinski
0c9864c05f nfp: bpf: allow control message sizing for map ops
In the current ABI the size of the messages carrying map elements was
statically defined to at most 16 words of key and 16 words of value
(NFP word is 4 bytes).  We should not make this assumption and use
the max key and value sizes from the BPF capability instead.

To make sure old kernels don't get surprised with larger (or smaller)
messages, bump the FW ABI version to 3 when key/value size is different
from 16 words.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02 14:39:59 +02:00
Jakub Kicinski
9bbdd41b8a nfp: allow apps to request larger MTU on control vNIC
Some apps may want to have higher MTU on the control vNIC/queue.
Allow them to set the requested MTU at init time.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02 14:39:59 +02:00
Jakub Kicinski
28264eb227 nfp: bpf: parse global BPF ABI version capability
Up until now we only had per-vNIC BPF ABI version capabilities,
which are slightly awkward to use because the bulk of the resources
and configuration do not relate to any particular vNIC.  Add
a new capability for the global ABI version and check that the
per-vNIC versions are equal to it.  Assume ABI version 2 if no
explicit version capability is present.
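
A minimal Python sketch of that check. Only the default of 2 and the
equality requirement come from the text above; the function name and
the use of None for a missing capability are assumptions.

```python
DEFAULT_ABI_VERSION = 2   # assumed when no global capability is present


def resolve_abi_version(global_abi, vnic_abis):
    """global_abi: parsed global version or None; vnic_abis: per-vNIC versions."""
    abi = DEFAULT_ABI_VERSION if global_abi is None else global_abi
    mismatched = [v for v in vnic_abis if v != abi]
    if mismatched:
        raise ValueError(f"per-vNIC ABI {mismatched[0]} differs from global ABI {abi}")
    return abi
```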

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02 14:39:58 +02:00
Jakub Kicinski
97ea8ac360 nfp: warn on experimental TLV types
Reserve two TLV types for feature development, and warn in the driver
if they ever leak into production.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:49:30 -07:00
David S. Miller
a06ee256e5 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Version bump conflict in batman-adv, take what's in net-next.

iavf conflict, adjustment of netdev_ops in net-next conflicting
with poll controller method removal in net.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:35:29 -07:00
Eric Dumazet
0825ce7031 nfp: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for an unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

nfp uses NAPI for TX completions, so we had better let the core
networking stack call napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:25 -07:00
Jakub Kicinski
23d9f5531c nfp: provide a better warning when ring allocation fails
NFP supports fairly enormous ring sizes (up to 256k descriptors).
In commit 4662717038 ("nfp: use kvcalloc() to allocate SW buffer
descriptor arrays") we have started using kvcalloc() functions to
make sure the allocation of software state arrays doesn't hit
the MAX_ORDER limit.  Unfortunately, we can't use virtual mappings
for the DMA region holding HW descriptors.  In case this allocation
fails, instead of the generic (and fairly scary) warning/splat in
the logs, print a helpful message explaining what happened and
suggesting how to fix it.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-19 23:07:41 -07:00
David S. Miller
aaf9253025 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-09-12 22:22:42 -07:00
Jakub Kicinski
eca09be82e nfp: report FW vNIC stats in interface stats
Report in standard netdev stats drops and errors as well as
RX multicast from the FW vNIC counters.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12 13:20:49 -07:00
Louis Peens
224de549f0 nfp: flower: reject tunnel encap with ipv6 outer headers for offloading
This fixes a bug where ipv6 tunnels would be reported as
offloaded to hardware but would actually be rejected
by hardware.

Fixes: b27d6a95a7 ("nfp: compile flower vxlan tunnel set actions")
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12 13:18:30 -07:00
Pieter Jansen van Vuuren
db191db813 nfp: flower: fix vlan match by checking both vlan id and vlan pcp
Previously we only checked if the vlan id field is present when trying
to match a vlan tag. The vlan id and vlan pcp fields should be treated
independently.
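
To show what treating the two fields independently means, here is a
Python sketch that builds a key/mask pair over the 16-bit 802.1Q TCI.
The bit layout (PCP in bits 15-13, VLAN ID in bits 11-0) matches the
constants in linux/if_vlan.h; the way the nfp firmware encodes the match
is different and is not reproduced here, and the function name is
hypothetical.

```python
VLAN_VID_MASK = 0x0FFF    # VLAN ID: bits 11-0 of the TCI
VLAN_PRIO_MASK = 0xE000   # PCP: bits 15-13 of the TCI
VLAN_PRIO_SHIFT = 13


def vlan_tci_match(vid=None, pcp=None):
    """Return (key, mask); each field contributes only if it is requested."""
    key = mask = 0
    if vid is not None:
        key |= vid & VLAN_VID_MASK
        mask |= VLAN_VID_MASK
    if pcp is not None:
        key |= (pcp << VLAN_PRIO_SHIFT) & VLAN_PRIO_MASK
        mask |= VLAN_PRIO_MASK
    return key, mask
```

A filter that matches on priority alone still produces a non-zero mask
here, which is the case a check on the vlan id alone would miss.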

Fixes: 5571e8c9f2 ("nfp: extend flower matching capabilities")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12 13:18:30 -07:00
jun qian
6577b0f716 nfp: replace spin_lock_bh with spin_lock in tasklet callback
As you are already in a tasklet, it is unnecessary to call spin_lock_bh.

Signed-off-by: jun qian <hangdianqj@163.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-07 22:56:02 -07:00
Jakub Kicinski
7848418e28 nfp: separate VXLAN and GRE feature handling
VXLAN and GRE FW features currently both have to be advertised
for the driver to enable them.  Separate the handling.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05 22:18:11 -07:00
Jakub Kicinski
e84b2f2db2 nfp: validate rtsym accesses fall within the symbol
With the accesses to rtsyms now all going via special helpers
we can easily make sure the driver is not reading past the
end of the symbol.
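
The bounds check amounts to something like the following Python sketch
(names and the exception type are hypothetical; the symbol name is put
first in the message purely as an illustrative choice):

```python
def check_rtsym_access(sym_name, sym_size, offset, length):
    """Raise if [offset, offset + length) does not fit inside the symbol."""
    if offset < 0 or length < 0 or offset + length > sym_size:
        raise ValueError(
            f"{sym_name}: access of {length} bytes at offset {offset} "
            f"falls outside symbol of size {sym_size}"
        )
```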

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05 22:17:07 -07:00
Jakub Kicinski
31e380f38f nfp: prefix rtsym error messages with symbol name
For ease of debugging, prefix all error messages with the name
of the symbol which caused them.  Use the same message format
for existing messages while at it.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05 22:17:07 -07:00
Jakub Kicinski
3c576de30b nfp: fix readq on absolute RTsyms
Return the error and report value through the output param.

Fixes: 640917dd81 ("nfp: support access to absolute RTsyms")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05 22:17:07 -07:00
David S. Miller
36302685f5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-09-04 21:33:03 -07:00
Jakub Kicinski
9ad716b95f nfp: wait for posted reconfigs when disabling the device
To avoid leaking a running timer we need to wait for the
posted reconfigs after the netdev is unregistered.  In the common
case the process of deinitializing the device will perform
synchronous reconfigs which wait for posted requests, but
especially with VXLAN ports being actively added and removed
there can be a race condition leaving a timer running after
the adapter structure is freed, leading to a crash.

Add an explicit flush after deregistering and, for good
measure, a warning to check if the timer is running just before
structures are freed.

Fixes: 3d780b926a ("nfp: add async reconfiguration mechanism")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31 23:01:30 -07:00
Jakub Kicinski
4152e58cb8 nfp: make RTsym users handle absolute symbols correctly
Make the RTsym users access the size via the helper, which
takes care of special handling of absolute symbols.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
640917dd81 nfp: support access to absolute RTsyms
Add support in nfpcore for reading the absolute RTsyms.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
1240989ccc nfp: convert all RTsym users to use new read/write helpers
Convert all users of RTsym to the new set of helpers which
handle all targets correctly.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
761969992d nfp: convert existing RTsym helpers to full target decoding
Make nfp_rtsym_{read,write}_le() and nfp_rtsym_map() use the new
target resolution helpers to allow accessing in-cache symbols.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
8f6d6052cf nfp: pass cpp_id to nfp_cpp_map_area()
Align nfp_cpp_map_area() with other CPP-level APIs and pass
encoded cpp_id/dest rather than target, action, domain tuple.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
3f0e55a2a6 nfp: add RTsym access helpers
RTsyms may have special encodings for more complex symbol types,
for example symbols which are placed directly in an external memory
unit's cache, constants or local memory.  Add a set of helpers
which will check for those special encodings and handle them
correctly.

For now only add direct cache accesses; we don't have a need to
access the other ones in the foreseeable future.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
c678a9759a nfp: add basic errors messages to target logic
Add error prints to CPP target encoding/decoding logic, otherwise
it's quite hard to pinpoint the reasons why read or write
operations fail.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
73eaf3b7b8 nfp: save the MU locality field offset
We will soon need the MU locality field offset much more
often than just for decoding MIP address.  Save it in nfp_cpp
for quick access.  Note that we can already reuse the target
config from nfp_cpp, no need to do the XPB read.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
9bf6cce893 nfp: refactor the per-chip PCIe config
Use a switch statement instead of ifs for code dependent
on chip version.  While at it make sure we fail for unknown
chip revisions.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
0377505c54 nfp: add support for NFP5000
Add NFP5000 to supported chips, the chip is backward compatible
with NFP4000 and NFP6000, so core PCIe code needs to handle it
the same way as 4k and 6k.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
f6e71efdf9 nfp: abm: look up MAC addresses via management FW
In multi-host scenarios Management FW may allocate MAC addresses
at runtime, so we have to use the indirect lookup to find them.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
34243f5909 nfp: add support for indirect HWinfo lookup
Management FW can adjust some of the information in the HWinfo table
at runtime.  In some cases reading the table directly will not yield
correct results.  Add an NSP command for looking up information.
Up until now we weren't making use of any of the values which may
get adjusted.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:47 -07:00
Jakub Kicinski
ac86da0546 nfp: interpret extended FW load result codes
To enable easier FW distribution NFP can now automatically
select between FW stored on the flash and loaded from the
kernel.

If FW loading policy is set to auto it will compare the
versions of FW from the host and from the flash and load
the newer one.  If FW type doesn't match (e.g. one advanced
application vs another) the FW from the host takes precedence,
unless one of them is the basic NIC firmware, in which case
the non-basic-NIC FW is selected.
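
The selection policy described above can be written down as a small
Python sketch. It models the stated rules only and is not the firmware
or driver implementation; the "nic" label, the tuple representation of
an image, and the tie-break on equal versions are assumptions.

```python
BASIC_NIC = "nic"   # hypothetical label for the basic NIC firmware type


def select_fw(host, flash):
    """host and flash are (app_type, version) tuples; returns "host" or "flash"."""
    host_type, host_ver = host
    flash_type, flash_ver = flash
    if host_type == flash_type:
        # Same application: the newer image wins (a tie goes to the host
        # image, which is an assumption of this sketch).
        return "host" if host_ver >= flash_ver else "flash"
    if host_type == BASIC_NIC:
        # Types differ and the host image is basic NIC: prefer the other one.
        return "flash"
    # Types differ: host takes precedence (also covers flash being basic NIC).
    return "host"
```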

This automatic selection mechanism requires that we inform the user
what the verdict was.  Print a message to the logs explaining
the decision and the reason.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:46 -07:00
Jakub Kicinski
2db100002e nfp: attempt FW load from flash
Flash may contain a default NFP application FW.  This application
can either be put there by the user (with ethtool -f) or shipped
with the card.  If file system FW is not found, attempt to load
this flash stored app FW.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:46 -07:00
Jakub Kicinski
1c0372b67c nfp: encapsulate NSP command arguments into structs
There is already a fair number of arguments to nfp_nsp_command()
family of functions.  Encapsulate them into structures to make
adding new ones easier.  No functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 16:01:46 -07:00
Cong Wang
244cd96adb net_sched: remove list_head from tc_action
After commit 90b73b77d0, list_head is no longer needed.
Now we just need to convert the list iteration to array
iteration for drivers.

Fixes: 90b73b77d0 ("net: sched: change action API to use array of pointers to actions")
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21 12:45:44 -07:00
Jakub Kicinski
19997ba7cb nfp: clean up return types in kdoc comments
Remove 'Return:' information from functions which no longer
return a value.  Also update name and return types of nfp_nffw_info
access functions.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-13 19:36:08 -07:00
Pieter Jansen van Vuuren
0a22b17a6b nfp: flower: add geneve option match offload
Introduce a new layer for matching on geneve options. This allows
offloading filters configured to match geneve with options.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 12:22:15 -07:00
Pieter Jansen van Vuuren
9e7c32fe44 nfp: flower: add geneve option push action offload
Introduce new push geneve option action. This allows offloading
filters configured to entunnel geneve with options.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 12:22:15 -07:00
John Hurley
d7ff7ec573 nfp: flower: allow matching on ipv4 UDP tunnel tos and ttl
The addition of FLOW_DISSECTOR_KEY_ENC_IP to TC flower means that the ToS
and TTL of the tunnel header can now be matched on.

Extend the NFP tunnel match function to include these new fields.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 12:22:14 -07:00
John Hurley
2a43747147 nfp: flower: set ip tunnel ttl from encap action
The TTL for encapsulating headers in IPv4 UDP tunnels is taken from a
route lookup. Modify this to first check if a user has specified a TTL to
be used in the TC action.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 12:22:14 -07:00
David S. Miller
1ba982806c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-08-07

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add cgroup local storage for BPF programs, which provides a fast
   accessible memory for storing various per-cgroup data like number
   of transmitted packets, etc, from Roman.

2) Support bpf_get_socket_cookie() BPF helper in several more program
   types that have a full socket available, from Andrey.

3) Significantly improve the performance of perf events which are
   reported from BPF offload. Also convert a couple of BPF AF_XDP
   samples over to use libbpf, both from Jakub.

4) seg6local LWT provides the End.DT6 action, which allows to
   decapsulate an outer IPv6 header containing a Segment Routing Header.
   Adds this action now to the seg6local BPF interface, from Mathieu.

5) Do not mark dst register as unbounded in MOV64 instruction when
   both src and dst register are the same, from Arthur.

6) Define u_smp_rmb() and u_smp_wmb() to their respective barrier
   instructions on arm64 for the AF_XDP sample code, from Brian.

7) Convert the tcp_client.py and tcp_server.py BPF selftest scripts
   over from Python 2 to Python 3, from Jeremy.

8) Enable BTF build flags to the BPF sample code Makefile, from Taeung.

9) Remove an unnecessary rcu_read_lock() in run_lwt_bpf(), from Taehee.

10) Several improvements to the README.rst from the BPF documentation
    to make it more consistent with RST format, from Tobin.

11) Replace all occurrences of strerror() by calls to strerror_r()
    in libbpf and fix a FORTIFY_SOURCE build error along with it,
    from Thomas.

12) Fix a bug in bpftool's get_btf() function to correctly propagate
    an error via PTR_ERR(), from Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 11:02:05 -07:00
Jakub Kicinski
0c26159352 nfp: bpf: xdp_adjust_tail support
Add support for adjust_tail.  There are no FW changes needed but add
a FW capability just in case there would be any issue with previously
released FW, or we will have to change the ABI in the future.

The helper is trivial and shouldn't be used too often, so just inline
the body of the function.  We add the delta to the locally maintained
packet length register and check for overflow, since adding a negative
value must overflow if the result is positive.  Note that if a delta of
0 were allowed in the kernel this trick would stop working and we would
need one more instruction to compare lengths before and after the change.
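
The overflow trick can be checked with a few lines of Python that
emulate a 32-bit unsigned add with carry-out. The 32-bit register width
and the function name are assumptions; the JIT emits NFP instructions,
not this code.

```python
MASK32 = 0xFFFFFFFF


def adjust_tail(pkt_len, delta):
    """Return (new_len, ok) where ok is the carry-out of a 32-bit unsigned add.
    For a negative delta the carry is set exactly when pkt_len + delta >= 0."""
    total = pkt_len + (delta & MASK32)   # delta as a two's complement u32
    carry = total >> 32                  # 1 if the add overflowed 32 bits
    return total & MASK32, bool(carry)
```

Shrinking a 100-byte packet by 10 gives (90, True), while shrinking a
10-byte packet by 20 gives a carry of 0 and is rejected. A delta of 0
also produces no carry even though the length is unchanged, which is the
limitation noted above.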

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-04 21:58:12 +02:00
David S. Miller
89b1698c93 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
The BTF conflicts were simple overlapping changes.

The virtio_net conflict was an overlap of a fix of statistics counter,
happening alongside a move over to a bona fide statistics structure
rather than counting value on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-02 10:55:32 -07:00
Jakub Kicinski
240b74fde3 nfp: fix variable dereferenced before check in nfp_app_ctrl_rx_raw()
'app' is dereferenced for the devlink trace point before it is checked.
In case FW is buggy and sends a control message to a VF queue,
we should make sure app is not NULL.
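
The ordering issue can be illustrated with a small sketch.  The struct
and function below are hypothetical stand-ins, not the driver's code;
the point is only that the NULL check comes before the first
dereference.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's app structure. */
struct app {
	unsigned int ctrl_msgs;	/* count of control messages handled */
};

/* Returns 0 when handled, -1 when dropped because no app is attached. */
static int ctrl_rx_raw_checked(struct app *app, const void *data, size_t len)
{
	/* Check first: buggy FW may deliver to a queue with no app. */
	if (!app)
		return -1;

	(void)data;
	(void)len;

	/* Stands in for the trace point and the app's handler. */
	app->ctrl_msgs++;
	return 0;
}
```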

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-31 09:28:19 +02:00
John Hurley
ee614c8710 nfp: flower: fix port metadata conversion bug
Function nfp_flower_repr_get_type_and_port expects an enum nfp_repr_type
return value but, if the repr type is unknown, returns a value of type
enum nfp_flower_cmsg_port_type.  This means that if FW encodes the port
ID in a way the driver does not understand, then instead of dropping the
frame the driver may attribute it to a physical port (uplink), provided
the port number is less than the physical port count.

Fix this and ensure a net_device of NULL is returned if the repr cannot
be determined.
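
A simplified model of this class of bug, and of the fix, might look as
follows.  The enums, the port ID encoding (type in bits 31:28, port
number in bits 7:0) and all names are assumptions made for
illustration; they are not the driver's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the two enums. */
enum repr_type { REPR_PHYS_PORT, REPR_VF, REPR_TYPE_INVALID };
enum cmsg_port_type { CMSG_PORT_UNSPEC = 0x0, CMSG_PORT_PHYS = 0x1 };

struct netdev { int id; };

#define PHYS_PORT_COUNT 2
static struct netdev phys_ports[PHYS_PORT_COUNT] = { { 0 }, { 1 } };

/* Assumed encoding: type in bits 31:28, port number in bits 7:0. */
static enum repr_type repr_type_and_port(uint32_t port_id, uint8_t *port)
{
	switch (port_id >> 28) {
	case CMSG_PORT_PHYS:
		*port = port_id & 0xff;
		return REPR_PHYS_PORT;
	default:
		/*
		 * Returning the raw cmsg type here would be the bug:
		 * CMSG_PORT_UNSPEC (0) equals REPR_PHYS_PORT (0).
		 * Return a repr_type sentinel instead.
		 */
		return REPR_TYPE_INVALID;
	}
}

/* Returns NULL when the repr cannot be determined. */
static struct netdev *repr_get(uint32_t port_id)
{
	uint8_t port = 0;

	if (repr_type_and_port(port_id, &port) != REPR_PHYS_PORT)
		return NULL;	/* only physical ports are modelled here */
	if (port >= PHYS_PORT_COUNT)
		return NULL;
	return &phys_ports[port];
}
```

In this model an ID with an unknown type field and a small port number
yields NULL, rather than being attributed to physical port 0 or 1.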

Fixes: 1025351a88 ("nfp: add flower app")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 14:27:32 -07:00
Jakub Kicinski
17082566a9 nfp: bpf: improve map offload info messages
FW can put constraints on map element size to maximize resource
use and efficiency.  When a user attempts to offload a map which
does not fit into those constraints, an informational message is
printed to kernel logs to inform the user about the reason offload
failed.  Map offload does not have access to any advanced error
reporting like verifier log or extack.  There is also currently
no way for us to nicely expose the FW capabilities to user
space.  Given all those constraints we should make sure log
messages are as informative as possible.  Improve them.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00
Jakub Kicinski
ab01f4ac5f nfp: bpf: remember maps by ID
Record perf maps by map ID, not raw kernel pointer.  This helps
with debug messages, because printing pointers to logs is frowned
upon, and makes debug easier for the users, as map ID is something
they should be more familiar with.  Note that perf maps are offload
neutral, therefore IDs won't be orphaned.

While at it use a rate limited print helper for the error message.

Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00
Jakub Kicinski
0958762748 nfp: bpf: allow receiving perf events on data queues
Control queue is fairly low latency, and requires SKB allocations,
which means we can't even reach 0.5Msps with perf events.  Allow
perf events to be delivered to data queues.  This allows us to not
only use multiple queues, but also receive and deliver to user space
more than 5Msps per queue (Xeon E5-2630 v4 2.20GHz, no retpolines).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00
Jakub Kicinski
20c5420421 nfp: bpf: pass raw data buffer to nfp_bpf_event_output()
In preparation for SKB-less perf event handling, make
nfp_bpf_event_output() take a buffer address and length,
not an SKB, as parameters.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00
Jakub Kicinski
79ca38e80c nfp: allow control message reception on data queues
Port id 0xffffffff is reserved for control messages.  Allow reception
of messages with this id on data queues.  Hand off a raw buffer to
the higher layer code, without allocating an SKB, for max efficiency.
The RX handler can't modify or keep the buffer; after it returns, the
buffer is handed back over to the NIC RX free buffer list.
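
The buffer ownership rule can be sketched as below.  Only the reserved
port id comes from the text above; the structures and function names
are hypothetical, and the data path and buffer recycling are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CTRL_PORT_ID 0xffffffffu	/* reserved for control messages */

/* Hypothetical higher-layer state that keeps a private copy. */
struct ctrl_state {
	uint8_t last[64];
	size_t last_len;
	unsigned int seen;
};

/*
 * The handler only borrows buf: it may read it, but anything it wants
 * to keep must be copied out before it returns.
 */
static void ctrl_rx_raw(struct ctrl_state *st, const void *buf, size_t len)
{
	if (len > sizeof(st->last))
		len = sizeof(st->last);
	memcpy(st->last, buf, len);
	st->last_len = len;
	st->seen++;
}

/*
 * Returns true when the frame was consumed as a control message.  In
 * either case the caller still owns buf and can recycle it afterwards.
 */
static bool rx_dispatch(struct ctrl_state *st, uint32_t port_id,
			const void *buf, size_t len)
{
	if (port_id != CTRL_PORT_ID)
		return false;	/* regular data path, not modelled here */

	ctrl_rx_raw(st, buf, len);
	return true;
}
```

Because the handler takes a const pointer and copies what it needs, the
caller is free to reuse the buffer as soon as rx_dispatch() returns.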

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00